Skip to content

Instantly share code, notes, and snippets.

@NickCrews
NickCrews / coalesce_parquet.py
Last active January 10, 2024 03:48
Coalesce parquet files
"""coalesce_parquets.py
gist of how to coalesce small row groups into larger row groups.
Solves the problem described in https://issues.apache.org/jira/browse/PARQUET-1115
"""
from __future__ import annotations
from pathlib import Path
from typing import Callable, Iterable, TypeVar