I may be slow to respond.

Saswata Dutta saswata-dutta

willccbb /
Last active March 16, 2025 21:46
GRPO Llama-1B
# See for ongoing developments
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer
saggie /
Created August 22, 2021 06:18
Verify Amazon Cognito JWT in Ktor

(In Ktor: 1.6.2)

  • application.conf

    jwt {
        issuer = ""
        audience = "__SPECIFY_CLIENT_ID_HERE__"
        realm = "ktor sample app"
rahularity /
Last active March 14, 2025 21:28
How To Work With Multiple Github Accounts on your PC

How To Work With Multiple Github Accounts on a single Machine

Suppose I have two GitHub accounts and want to set up my Mac to talk to both of them easily.

NOTE: This logic can be extended to more than two accounts also. :)

The setup can be done in 5 easy steps:


  • Step 1 : Create SSH keys for all accounts
  • Step 2 : Add SSH keys to SSH Agent
snigdhasjg /
Last active April 30, 2021 11:34
Getting started with EMRFS.

Getting started with EMRFS

The EMR File System (EMRFS) is an implementation of HDFS that all Amazon EMR clusters use for reading and writing regular files from Amazon EMR directly to Amazon S3.

How to access a file from S3 using EMRFS

Using Java

Coming from HDFS, EMRFS is easy to adopt. You just need to pass a URI("s3://<bucket-name>") object when getting the filesystem object.

package com.joe;
soxofaan /
Last active April 25, 2024 10:50 — forked from reywood/
How to get a stack trace from a stuck/hanging python script

How to get a stack trace for each thread in a running Python script

Sometimes a Python script will simply hang forever with no indication of what is going wrong. Perhaps it's polling a service that will never return a value that allows the program to move forward.

Here's a way to see where the program is currently stuck, using pyrasite, a tool for injecting code into running Python processes.
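As a rough illustration of what a payload for dumping per-thread stacks might look like, here is a minimal sketch that uses only the standard library. The function name `format_thread_stacks` is hypothetical and is not part of pyrasite. It relies on `sys._current_frames()`, which is a CPython-specific function.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return {thread_id: (thread_name, formatted_stack)} for every live thread.

    Hypothetical helper for illustration; not part of pyrasite.
    """
    # Map thread ids to names so the output is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    stacks = {}
    # sys._current_frames() maps each thread id to its current top frame.
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        stacks[thread_id] = (name, "".join(traceback.format_stack(frame)))
    return stacks


# Print one block per thread, the way an injected payload might report it.
for thread_id, (name, stack) in format_thread_stacks().items():
    print(f"Thread {name} ({thread_id}):")
    print(stack)
```

Run inside the target process, this prints the call stack of every thread, which is usually enough to see which call the program is blocked on.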

Install gdb and pyrasite

Install gdb.

therightstuff / create_db.js
Created September 23, 2019 08:56
@mysql/xdevapi node.js example: (re)create mysql database, test table insert, select and delete
var mysqlx = require('@mysql/xdevapi');
const MIGRATIONS_USER = 'migrationsuser';
var server = {
host : 'localhost',
user : 'intendeduser',
database : 'mydatabase',
password : 'mypassword'
# Luke's config for the Zoomer Shell
# Enable colors and change prompt:
autoload -U colors && colors
PS1="%B%{$fg[red]%}[%{$fg[yellow]%}%n%{$fg[green]%}@%{$fg[blue]%}%M %{$fg[magenta]%}%~%{$fg[red]%}]%{$reset_color%}$%b "
# History in cache directory:
dannguyen /
Last active October 30, 2023 05:49
A gist of AWS Textract sample/demo data for easy reference and preview, in case you're curious how well Amazon does when it comes to pdf-to-csv

AWS Textract -- sample document image and data from the official demo

AWS Textract is now out of closed beta. You can read the features page here, and you can also read about its limits here (e.g. no handwriting). Basically, if you've ever had to deal with the hell of getting structured data out of a PDF (scanned image or not), Textract is aiming for your business:


This short gist contains some of my brief observations about Textract and its demo, as well as direct links to the most relevant and important files, such as the Textract demo sample image and the resulting data files from Textract's API. If you have an AWS account, I h

mwakaba2 /
Last active February 27, 2025 15:50
Updated easy to remember system design numbers for back-of-the-envelope calculations

Updated, easy to remember numbers for back-of-the-envelope calculations in system design interviews

Powers of two table

Power    Approx Value (Bytes)       Bytes
10                 1 thousand        1 KB
16                65 thousand       64 KB
20                  1 million        1 MB
30                  1 billion        1 GB
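The rows above can be checked with a few lines of Python. This is a small sketch, not part of the original gist; `describe_power` and `UNITS` are hypothetical names introduced for illustration. It shows the exact byte count next to the binary-unit label, so the gap between the round decimal figure and the true value is visible.

```python
# Hypothetical helper, not from the gist: exact value and binary-unit label for 2**power.
UNITS = ["B", "KB", "MB", "GB", "TB"]


def describe_power(power):
    """Return (exact byte count, label in binary units) for 2**power."""
    exact = 2 ** power
    # Every 10 powers of two moves up one unit (2**10 = 1 KB, 2**20 = 1 MB, ...).
    unit_index = min(power // 10, len(UNITS) - 1)
    scaled = exact // (1024 ** unit_index)
    return exact, f"{scaled} {UNITS[unit_index]}"


for p in (10, 16, 20, 30):
    exact, label = describe_power(p)
    print(f"2^{p} = {exact:,} bytes = {label}")
```

The exact values (1,024; 65,536; 1,048,576; 1,073,741,824) sit slightly above the rounded "thousand / million / billion" figures, which is why the table calls them approximations.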