Last active February 9, 2021 20:21
Read a file in chunks of about length characters, each extended to the first newline that follows, so every chunk ends on a line boundary. There may be an easier solution...
import io


def read_in_chunks_of_whole_lines(file, length=1024*1024):
    """Read a file in blocks of about length characters, up to the next newline."""
    tail = ""
    with open(file, "r") as f:
        while True:
            chunk = tail + f.read(length)
            tail = ""
            if not chunk:
                return
            # Keep reading smaller parts until the chunk can end on a newline.
            while True:
                part = f.read(128 + length//8)
                idx = part.find("\n")
                if idx >= 0:
                    # Keep the newline in this chunk; carry the rest over.
                    chunk += part[:idx + 1]
                    tail = part[idx + 1:]
                    break
                chunk += part
                if not part:
                    # End of file: stop instead of looping forever.
                    break
            yield io.StringIO(chunk)
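
On the "easier solution" mentioned in the description: one possible simplification, offered as a sketch rather than a drop-in replacement, is to let readline() finish the current line after each read(). The names read_whole_line_chunks and demo are hypothetical and exist only for this illustration. Unlike the snippet above, this variant does not read ahead when a block already ends on a newline.

```python
import io
import os
import tempfile


def read_whole_line_chunks(path, length=1024 * 1024):
    """Yield StringIO blocks of about `length` characters that end on a line boundary."""
    with open(path, "r") as f:
        while True:
            chunk = f.read(length)
            if not chunk:
                return
            # Unless the block already ends on a newline, finish the current line.
            if not chunk.endswith("\n"):
                chunk += f.readline()
            yield io.StringIO(chunk)


def demo():
    """Write a small sample file and return the text of each chunk."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("alpha\nbeta\ngamma\ndelta\n")
        path = tmp.name
    try:
        return [block.getvalue() for block in read_whole_line_chunks(path, length=8)]
    finally:
        os.remove(path)
```

With length=8, demo() splits the four-line sample into two blocks of two whole lines each, and joining the blocks reproduces the original text. A very long line still ends up entirely in one block, so length is a target size rather than a hard limit.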