@datavudeja
datavudeja / sqlite_iqr.sql
Created August 26, 2025 16:10 — forked from mathpn/sqlite_iqr.sql
Find and remove outliers per group using the interquartile range with SQLite queries
/*
Find and remove outliers per group using the interquartile range with SQLite queries.
*/
-- create table with fake data
CREATE TEMPORARY TABLE test_table (group_id INTEGER, variable REAL);
INSERT INTO test_table
VALUES
(1, 2),
(1, 4),
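The preview above is cut off after the first rows of fake data. As a rough illustration of the technique named in the description, the following Python sketch runs an IQR outlier query through the standard-library `sqlite3` module. The extra data rows, the `NTILE(4)` quartile approximation, and the 1.5 multiplier are assumptions made for this example and are not taken from the gist. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TEMPORARY TABLE test_table (group_id INTEGER, variable REAL)")

# Hypothetical data: group 1 contains one obvious outlier (100), group 2 has none.
rows = [(1, v) for v in (1, 2, 3, 4, 5, 6, 7, 100)]
rows += [(2, v) for v in (10, 11, 12, 13, 14, 15, 16, 17)]
conn.executemany("INSERT INTO test_table VALUES (?, ?)", rows)

# Approximate Q1 and Q3 per group as the largest value in the first and
# third NTILE(4) bucket, then flag rows outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
OUTLIER_QUERY = """
WITH tiled AS (
    SELECT group_id, variable,
           NTILE(4) OVER (PARTITION BY group_id ORDER BY variable) AS tile
    FROM test_table
),
bounds AS (
    SELECT group_id,
           MAX(CASE WHEN tile = 1 THEN variable END) AS q1,
           MAX(CASE WHEN tile = 3 THEN variable END) AS q3
    FROM tiled
    GROUP BY group_id
)
SELECT t.rowid, t.group_id, t.variable
FROM test_table AS t
JOIN bounds AS b ON b.group_id = t.group_id
WHERE t.variable < b.q1 - 1.5 * (b.q3 - b.q1)
   OR t.variable > b.q3 + 1.5 * (b.q3 - b.q1)
ORDER BY t.group_id, t.variable
"""
outliers = conn.execute(OUTLIER_QUERY).fetchall()

# Remove the flagged rows by rowid.
conn.executemany("DELETE FROM test_table WHERE rowid = ?", [(r[0],) for r in outliers])
remaining = conn.execute("SELECT COUNT(*) FROM test_table").fetchone()[0]
print(outliers, remaining)
```

Bucket maxima are only a coarse stand-in for true quartiles; an interpolated percentile would give slightly different fences on small groups.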
@datavudeja
datavudeja / anti_join.py
Created August 15, 2025 10:55 — forked from sainathadapa/anti_join.py
anti-join-pandas
import pandas as pd
def anti_join(x, y, on):
    """Return rows in x which are not present in y"""
    ans = pd.merge(left=x, right=y, how='left', indicator=True, on=on)
    ans = ans.loc[ans._merge == 'left_only', :].drop(columns='_merge')
    return ans
def anti_join_all_cols(x, y):
@datavudeja
datavudeja / human-in-the-loop-learning.sh
Created August 15, 2025 10:51 — forked from josemreis/human-in-the-loop-learning.sh
A simple human in the loop workflow for document matching using Python, R, and Gnome's gedit
#!/bin/bash
## working dir at file location
cd "$(dirname "$0")"
## train the model
echo "[+] Fitting a glmnet model on labeled data"
Rscript 'learning.R'
echo "[+] Finished fitting the model"
## read in the ambiguous cases
TO_RECODE=$(cat ambiguous.txt)
echo "[+] Recoding ambiguous"
@datavudeja
datavudeja / pandas.ipynb
Created August 15, 2025 10:50 — forked from animi4695/pandas.ipynb
Pandas cheatsheet with examples
"""Summary
"""
import logging
from pathlib import Path
import fire
from datasets import Dataset, load_dataset
from tqdm.auto import tqdm
from transformers import AutoTokenizer
@datavudeja
datavudeja / count_calls.py
Created August 13, 2025 15:40 — forked from fmder/count_calls.py
Count function calls decorator
class countCalls(object):
    """Decorator that keeps track of the number of times a function is called.
    ::
        >>> @countCalls
        ... def foo():
        ...     return "spam"
        ...
        >>> for _ in range(10):
        ...     foo()
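The preview stops partway through the docstring, so the class body is not shown. A minimal sketch of a class-based call counter in the same spirit might look like the following; the name `CountCalls` and the `calls` attribute are hypothetical and chosen for this example only.

```python
import functools


class CountCalls:
    """Wrap a function and count how many times it has been called."""

    def __init__(self, func):
        # Copy __name__, __doc__, etc. from the wrapped function.
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0  # hypothetical attribute name

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)


@CountCalls
def foo():
    return "spam"


for _ in range(10):
    foo()

print(foo.calls)  # 10
```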
@datavudeja
datavudeja / log_variables_in_recursion.py
Created August 13, 2025 12:43 — forked from CodeByAidan/log_variables_in_recursion.py
One of the most painful pieces of code I've written: a decorator that logs variables in a function every time they are set or read. Notice the recursion part 😔
from functools import wraps
from typing import Any, Callable, TypeVar
F = TypeVar("F", bound=Callable[..., Any])
def log_variables(func: F) -> F:
    @wraps(func)
    def wrapper(*args, **kwargs) -> Any:
        if not hasattr(wrapper, "initialized"):
@datavudeja
datavudeja / log_variables_in_function.py
Created August 13, 2025 12:42 — forked from CodeByAidan/log_variables_in_function.py
Decorator that prints the values of the variables in the wrapped function as they change. Pretty nifty.
import sys
from functools import wraps
from typing import Any, Callable, Optional, TypeVar
F = TypeVar("F", bound=Callable[..., Any])
def log_variables(func: F) -> F:
    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        def tracer(frame, event, arg) -> Optional[Callable]:
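The preview ends at the tracer signature. As a hedged sketch of how a `sys.settrace`-based variable logger can work (not necessarily what the gist does), the version below records a `(name, value)` pair whenever a local of the wrapped function appears or changes. The `log` attribute and the `demo` function are hypothetical names introduced for this example.

```python
import sys
from functools import wraps


def log_variables(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        seen = {}

        def tracer(frame, event, arg):
            # Only trace the wrapped function's own frame.
            if frame.f_code is not func.__code__:
                return None
            if event in ("line", "return"):
                for name, value in frame.f_locals.items():
                    if name not in seen or seen[name] != value:
                        seen[name] = value
                        wrapper.log.append((name, value))
                        print(f"{func.__name__}: {name} = {value!r}")
            return tracer

        previous = sys.gettrace()
        sys.settrace(tracer)
        try:
            return func(*args, **kwargs)
        finally:
            # Restore whatever tracer (if any) was active before.
            sys.settrace(previous)

    wrapper.log = []  # hypothetical attribute collecting observed changes
    return wrapper


@log_variables
def demo(n):
    total = 0
    for i in range(n):
        total += i
    return total


result = demo(3)
```

Because "line" events fire before a line executes, each change is reported one step after the assignment that caused it; the "return" event catches the final state.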
@datavudeja
datavudeja / external.py
Created August 13, 2025 12:42 — forked from betatim/external.py
Can we inject arguments from the caller's locals into callees? Yes! Run `python injector.py`
import inspect
import logging
from functools import wraps
def inject_logger(method):
    @wraps(method)
    def wrapper(*args, **kwargs):
        frame = inspect.currentframe()
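The preview is truncated right after the frame lookup. One way the idea in the description could be completed, offered as a sketch under assumptions rather than the gist's actual code: look one frame up with `f_back` and, if the caller has a local named `logger`, pass it to the callee as a keyword argument. The keyword name `logger` and the `work` and `caller` functions are hypothetical.

```python
import inspect
from functools import wraps


def inject_logger(method):
    @wraps(method)
    def wrapper(*args, **kwargs):
        frame = inspect.currentframe()
        try:
            # f_back is the frame that called wrapper, i.e. the caller.
            caller_locals = frame.f_back.f_locals
            if "logger" not in kwargs and "logger" in caller_locals:
                kwargs["logger"] = caller_locals["logger"]
        finally:
            # Drop the frame reference to avoid a reference cycle.
            del frame
        return method(*args, **kwargs)

    return wrapper


@inject_logger
def work(x, logger=None):
    return (x, logger)


def caller():
    logger = "fake-logger"  # stands in for a real logging.Logger
    return work(1)


print(caller())  # (1, 'fake-logger')
```

An explicitly passed `logger=` keyword still wins, so the injection only fills in a missing argument.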