AbdealiLoKo AbdealiLoKo

  • Corridor Platforms
  • Bengaluru
@AbdealiLoKo
AbdealiLoKo / large-data-in-api.md
Last active March 12, 2025 02:14
large-data-in-api.md

Microbenchmarks to send large data via a REST API

I was using DuckDB to run filters and group-bys on a Parquet file and send the results to a browser, approximately 100 MB of data per response. The API was quite slow, so I was looking for options to make it faster.

Frameworks being used:

  1. Chrome 133.x
  2. Angular 18.x
  3. Python 3.11.x
  4. Flask 3.0.x
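
The preview does not show which options were benchmarked. As an illustration only, a micro-benchmark for this kind of problem could start by measuring how large a JSON payload is and how long it takes to serialize and compress. The sketch below uses only the Python standard library; the row shape, the helper names `make_rows` and `measure`, and the choice of gzip are assumptions, not the gist's actual method.

```python
import gzip
import json
import time


def make_rows(n):
    """Build n synthetic rows (hypothetical shape, stands in for a query result)."""
    return [{"id": i, "group": i % 10, "value": i * 0.5} for i in range(n)]


def measure(rows):
    """Time JSON serialization and gzip compression, and report payload sizes."""
    t0 = time.perf_counter()
    raw = json.dumps(rows).encode("utf-8")
    t1 = time.perf_counter()
    packed = gzip.compress(raw, compresslevel=6)
    t2 = time.perf_counter()
    return {
        "raw_bytes": len(raw),
        "gzip_bytes": len(packed),
        "json_seconds": t1 - t0,
        "gzip_seconds": t2 - t1,
    }


if __name__ == "__main__":
    # Assumed row count; scale it up to approach the payload size of interest.
    print(measure(make_rows(100_000)))
```

Timings vary by machine, so the numbers are only meaningful when compared against each other on the same setup.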
AbdealiLoKo / pyspark-compat.md
Created November 6, 2024 15:51
pyspark-compat.md

PySpark Compatibility

I've seen too many incompatibility issues with specific versions of Python, Spark, and Arrow.

So, documenting which versions have worked for me in the past!

| Spark | Python | Arrow | Comment |
|-------|--------|-------|---------|
| 3.5.x | 3.8.x  | 12.x  |         |
| 3.5.x | 3.9.x  | 12.x  |         |
AbdealiLoKo / prepare-commit-msg
Created April 10, 2024 09:31
Git - prepare-commit-msg hook
#!/usr/bin/env sh
# . "$(dirname -- "$0")/_/husky.sh" # - uncomment if using husky
# Get the current commit message file
COMMIT_MSG_FILE=$1
SOURCE_MSG=$2
# Ref: https://git-scm.com/docs/githooks#_prepare_commit_msg
# SOURCE_MSG is the source where the commit message is taken from
# - EMPTY -> No source commit message present `git commit -a`
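
The hook preview above is cut off, so its actual behaviour is not shown. As a rough illustration of the same two arguments, here is a minimal Python sketch of a `prepare-commit-msg` hook. Per the githooks documentation, the second argument is one of `message`, `template`, `merge`, `squash`, or `commit`, or is absent for a plain `git commit`. The helper names `should_prefill` and `prefill`, the hint text, and the policy of editing only when no source is given are all assumptions made for this example.

```python
#!/usr/bin/env python3
import sys
from pathlib import Path

# Values the githooks documentation lists for the hook's second argument.
GIT_SOURCES = ("message", "template", "merge", "squash", "commit")


def should_prefill(source: str) -> bool:
    """Assumed policy: only touch the message when git supplied no source."""
    return source == ""


def prefill(msg_file: Path, source: str,
            hint: str = "# Hint: explain why the change was made\n") -> bool:
    """Prepend a comment hint to the commit message file; return True if edited."""
    if not should_prefill(source):
        return False
    original = msg_file.read_text(encoding="utf-8")
    msg_file.write_text(hint + original, encoding="utf-8")
    return True


if __name__ == "__main__" and len(sys.argv) > 1:
    # argv[1] is the commit message file; argv[2] (optional) is the source.
    source_arg = sys.argv[2] if len(sys.argv) > 2 else ""
    prefill(Path(sys.argv[1]), source_arg)
```

Because the hint starts with `#`, git's default cleanup removes it from the final message when the commit is made through the editor.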
AbdealiLoKo / 0-demo-jit.ipynb
Last active August 19, 2023 19:51
JIT In Python
AbdealiLoKo / kiit-design-principles-talk.pdf
Last active March 24, 2023 11:28
KIIT Design Principles Talk
AbdealiLoKo / spark-withColumn-vs-select.py
Created January 2, 2023 06:01
Spark - .withColumn() vs .select()
"""
Simple benchmark to check if withColumn() is faster or select() is faster
Conclusion: select() is faster than withColumn() in a for loop as fewer dataframes are created
"""
import datetime
import findspark; findspark.init(); import pyspark
spark = pyspark.sql.SparkSession.builder.getOrCreate()
for ncol in [10, 100, 1000, 2000, 5000]:
AbdealiLoKo / playwright-demo.ipynb
Last active August 19, 2023 19:46
Playwright Demo
AbdealiLoKo / README-sqlalchemy-query-relationsihps.md
Last active June 20, 2022 05:45
Usecase: Speedup SQLAlchemy queries
AbdealiLoKo / better-commits.md
Last active May 18, 2022 07:54
Better commit messages (message creators and linters)

Better commit messages

The 3 common ways for developers to document information about their work are:

  1. Comments
    • When is this written: When the developer wants something to be clearly and immediately visible to all other developers
    • When is this found: As soon as other developers are reading code, they will find these comments
  2. Commit messages
    • When is this written: When the developer wants to explain the work involved in making a change. Why a change was made, explanation of the
    • When is this found: When other developers dig a bit deeper into why or when a change was made, they will find these commit messages
  3. Tech Documentation
AbdealiLoKo / pyjava.py
Created July 21, 2020 19:33
Comparing py-java libraries
# Example:
# PYJAVA_LIB=jpype venv/bin/python pyjava.py
import os
from datetime import datetime
from jpmml_evaluator import _package_classpath
lib = os.environ.get('PYJAVA_LIB')
assert lib is not None, 'Set env var PYJAVA_LIB to py4j/jnius/jpype'