Olesia Slavska ayaksvals

  • Vienna, Austria
ayaksvals / merge_sort_oi_2parquet_GSOC.ipynb
Created September 23, 2024 17:33
Sorting pairs.gz files
ayaksvals / merge_sort_oi_GSOC.ipynb
Created September 23, 2024 16:39
Merge Sort on Pandas (Unix utility)
ayaksvals / pairsToParquet(PolarsVersion)_GSOC.ipynb
Last active September 23, 2024 16:35
Converter from pairs.gz to parquet. Contains Polars, DuckDB and Dask Versions
ayaksvals / FinalWorkReport2024Slavska.md
Last active September 23, 2024 17:54
Google Summer of Code Final Report 2024 Slavska Olesia

Google Summer of Code 2024 Final Report

Contributor: Olesia Slavska

Organisation: Open2C

Mentors: Nezar Abdennur and Anton Goloborodko

ayaksvals / parquet_sort(DuckDB).ipynb
Created September 3, 2024 10:24
read, sort, save .parquet file with DuckDB
ayaksvals / parquet_sort(PolarsVersion2).ipynb
Created August 20, 2024 10:36
Version 2: 4 min for 65_220_653 rows. Scan a .parquet file, sort it, and write the result back out as Parquet.
ayaksvals / csv_sort_parquet(DaskVersion) (3).ipynb
Last active August 19, 2024 16:01
Read CSV, sort, save to Parquet. TO FIX: dtypes
ayaksvals / csv_sort_parquet(DaskVersion) (1).ipynb
Created August 14, 2024 18:02
Read CSV (Dask), sort (1, 2, 10), write to Parquet, read back to test.
ayaksvals / csv_sort_parquet(DaskVersion).ipynb
Created August 14, 2024 15:24
Read csv with pypairix. Sort. Save to Parquet. Read for test
ayaksvals / parquet_sort(PolarsVersion).ipynb
Last active August 19, 2024 17:23
Sort Parquet with Polars so that values order as 1, 2, 10 rather than 1, 10, 2 (really long version)