Jorge Madson JorgeMadson

JorgeMadson / data_loading_utils.py
Last active March 27, 2025 14:05 — forked from iyvinjose/data_loading_utils.py
Read large files line by line without loading the entire file into memory. Supports multi-gigabyte files.
def read_lines_from_file_as_data_chunks(file_name, chunk_size, callback, return_whole_chunk=False):
    """
    Read a file line by line, regardless of its size.

    :param file_name: absolute path of the file to read
    :param chunk_size: size of the data to read at a time
    :param callback: callback function, prototype: def callback(data, eof, file_name)
    :param return_whole_chunk: if True, pass whole chunks to the callback instead of single lines
    :return: None
    """
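The function body is not shown here, so the following is only a minimal sketch of how a reader matching the docstring could work, not the author's implementation. The name `read_lines_in_chunks_sketch` is hypothetical. The sketch also makes three assumptions: `chunk_size` is counted in characters, the file is opened in text mode, and the end of the file is signalled by one final callback with `data=None` and `eof=True`.

```python
def read_lines_in_chunks_sketch(file_name, chunk_size, callback, return_whole_chunk=False):
    """Hypothetical sketch: stream a text file in fixed-size chunks.

    Only one chunk (plus any unfinished line) is held in memory at a time.
    """
    leftover = ""  # partial line carried over from the previous chunk
    with open(file_name, "r") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # nothing left to read
            if return_whole_chunk:
                # Hand the raw chunk to the caller without splitting it.
                callback(chunk, False, file_name)
                continue
            # Prepend the unfinished line from the previous chunk, then split.
            lines = (leftover + chunk).split("\n")
            # The last piece may be an incomplete line; keep it for the next round.
            leftover = lines.pop()
            for line in lines:
                callback(line, False, file_name)
    if leftover:
        # Final line of a file that does not end with a newline.
        callback(leftover, False, file_name)
    # Assumed convention: one last call with eof=True and no data.
    callback(None, True, file_name)
```

A callback for this sketch would typically process `data` when `eof` is False and finalize its work when `eof` is True. Carrying the unfinished tail of each chunk into the next read is what lets lines that straddle a chunk boundary come out whole.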