
Saswata Dutta saswata-dutta

@saswata-dutta
saswata-dutta / DDB_counters.md
Last active May 8, 2023 10:05
Implementing atomic hierarchical counters at account and user level using AWS DynamoDB

Table:

AccountCustomerQuotaCounters

Keys:

accountId : pk : S

ym_id : sk : S
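
One way both levels could be bumped in a single atomic step is a DynamoDB transaction carrying two conditional `ADD` updates. The sketch below only builds the request for boto3's `transact_write_items` and does not call AWS. Only the table and key names come from the gist; the `ym_id` layout (a year-month prefix joined to either the literal `ACCOUNT` or a customer id), the counter attribute `used`, and the function names are assumptions made for illustration.

```python
TABLE = "AccountCustomerQuotaCounters"  # table name from the gist


def _counter_update(account_id, ym_id, amount, limit):
    """Build one conditional ADD on a counter row (attribute `used` is assumed)."""
    if amount > limit:
        raise ValueError("amount can never fit under the limit")
    return {
        "Update": {
            "TableName": TABLE,
            "Key": {"accountId": {"S": account_id}, "ym_id": {"S": ym_id}},
            "UpdateExpression": "ADD #used :amount",
            # Reject the write if it would push the counter past the limit.
            "ConditionExpression": "attribute_not_exists(#used) OR #used <= :max_before",
            "ExpressionAttributeNames": {"#used": "used"},
            "ExpressionAttributeValues": {
                ":amount": {"N": str(amount)},
                ":max_before": {"N": str(limit - amount)},
            },
        }
    }


def build_increment(account_id, year_month, customer_id, amount,
                    account_limit, customer_limit):
    """Return kwargs for transact_write_items: account row first, customer row second."""
    return {
        "TransactItems": [
            # Hypothetical sort-key layout: "<year-month>#ACCOUNT" and "<year-month>#<customer>".
            _counter_update(account_id, f"{year_month}#ACCOUNT", amount, account_limit),
            _counter_update(account_id, f"{year_month}#{customer_id}", amount, customer_limit),
        ]
    }
```

Under these assumptions the request would be sent with `boto3.client("dynamodb").transact_write_items(**build_increment(...))`. If either condition fails, the whole transaction is cancelled, so neither counter moves.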

@saswata-dutta
saswata-dutta / AWS_schedule_events.md
Last active July 8, 2023 12:35
EventBridge Scheduler POC to perform tasks in the future

https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule-schedule.html

  1. On each job-creation event, create a one-time EventBridge schedule.

    1. Name it after the job ID so that it's easy to delete later.
    2. Set the time to when we expect the overall SLA breach to occur, say now() + 3 days.
    3. Set the flexible time window appropriately (1 to 15 minutes).
  2. Let the EventBridge target be SQS (so that we can buffer and consume at a controlled rate).

  3. The SQS payload can contain the job ID,
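
The steps above can be sketched as a function that builds the arguments for the EventBridge Scheduler `create_schedule` call. This is a minimal sketch that does not contact AWS. The function name, the `job-` name prefix, the `jobId` payload field, and the ARNs in the usage note are assumptions for illustration; `created_at` is assumed to be a timezone-aware datetime.

```python
import json
from datetime import datetime, timedelta, timezone


def build_schedule_request(job_id, created_at, queue_arn, role_arn,
                           sla=timedelta(days=3), flex_minutes=15):
    """Build kwargs for a one-time schedule that fires when the SLA would be breached."""
    # One-time schedules use the at(yyyy-mm-ddThh:mm:ss) expression form.
    fire_at = (created_at + sla).astimezone(timezone.utc)
    return {
        # Naming the schedule after the job id makes it easy to delete later.
        "Name": f"job-{job_id}",
        "ScheduleExpression": f"at({fire_at.strftime('%Y-%m-%dT%H:%M:%S')})",
        "ScheduleExpressionTimezone": "UTC",
        "FlexibleTimeWindow": {
            "Mode": "FLEXIBLE",
            "MaximumWindowInMinutes": flex_minutes,
        },
        "Target": {
            "Arn": queue_arn,      # the SQS queue that buffers the events
            "RoleArn": role_arn,   # role the scheduler assumes to send the message
            "Input": json.dumps({"jobId": job_id}),
        },
    }
```

With real ARNs in place, the result could be passed as `boto3.client("scheduler").create_schedule(**request)`, and the schedule removed once the job finishes with `delete_schedule(Name=...)`.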

import boto3
import json
s3_client = boto3.client('s3')
def lambda_handler(event, context):
    # Extract the S3 bucket and object key from the event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
@saswata-dutta
saswata-dutta / getAvailResponses.java
Last active April 18, 2023 05:04
Timed handling of requests
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
public class RequestHandler {
    private ExecutorService threadPool;
@saswata-dutta
saswata-dutta / aws_lambda_logging.py
Last active April 12, 2023 10:16
AWS Lambda handler to avoid splitting log lines in CloudWatch
import sys
import logging
import traceback
import json
logger = logging.getLogger()
logger.setLevel(logging.INFO)
def process(record):
@saswata-dutta
saswata-dutta / spark_approx_percentile.scala
Created March 26, 2023 06:49
Spark Scala Approx Percentile over group
val a_s = Seq.fill(9)("a" -> 1):+ ("a" -> 10)
// a_s: Seq[(String, Int)] = List((a,1), (a,1), (a,1), (a,1), (a,1), (a,1), (a,1), (a,1), (a,1), (a,10))
val b_s = Seq.fill(9)("b" -> 2):+ ("b" -> 10)
// b_s: Seq[(String, Int)] = List((b,2), (b,2), (b,2), (b,2), (b,2), (b,2), (b,2), (b,2), (b,2), (b,10))
val df = (a_s ++ b_s).toDF("kind", "value")
// df: org.apache.spark.sql.DataFrame = [kind: string, value: int]
df.groupBy("kind").agg(expr("approx_percentile(value, 0.90, 20)").as("x_percentile")).show
@saswata-dutta
saswata-dutta / Verify_Cognito_JWT_in_Ktor.md
Created February 5, 2023 05:58 — forked from saggie/Verify_Cognito_JWT_in_Ktor.md
Verify Amazon Cognito JWT in Ktor

(In Ktor: 1.6.2)

  • application.conf

    ...
    jwt {
        issuer = "https://cognito-idp.ap-northeast-1.amazonaws.com/__SPECIFY_POOL_ID_HERE__"
        audience = "__SPECIFY_CLIENT_ID_HERE__"
        realm = "ktor sample app"
    
@saswata-dutta
saswata-dutta / shardcalc.py
Created November 13, 2022 06:45 — forked from colmmacc/shardcalc.py
Calculate the blast radius of a shuffle shard
import sys
# choose() is the same as computing the number of combinations. Normally this is
# equal to:
#
# factorial(N) / (factorial(m) * factorial(N - m))
#
# but this is very slow to run and requires a deep stack (without tail
# recursion).
#
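
The preview stops before the implementation. As a hedged illustration of the point the comment makes, the following is one common way to compute the number of combinations iteratively, without factorials or recursion. It is a sketch, not necessarily the code in the original gist.

```python
def choose(n, m):
    """Number of ways to pick m items out of n, computed iteratively."""
    if m < 0 or m > n:
        return 0
    # C(n, m) == C(n, n - m); use the smaller side to shorten the loop.
    m = min(m, n - m)
    result = 1
    for i in range(1, m + 1):
        # After step i, result equals C(n - m + i, i), so the division is exact.
        result = result * (n - m + i) // i
    return result
```

For example, `choose(8, 2)` gives the number of distinct two-node shards that can be drawn from eight nodes.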
export AWS_ACCESS_KEY_ID='A???'
export AWS_SECRET_ACCESS_KEY='g???'
spark-shell --packages "org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262"
val test = spark.read.parquet("s3a://bucket/prefix/part-000000.snappy.parquet")
@saswata-dutta
saswata-dutta / get_lat_lon_exif_pil.py
Created September 30, 2022 03:13 — forked from erans/get_lat_lon_exif_pil.py
Get Latitude and Longitude from EXIF using PIL
from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS
def get_exif_data(image):
    """Returns a dictionary from the exif data of a PIL Image item. Also converts the GPS Tags"""
    exif_data = {}
    info = image._getexif()
    if info:
        for tag, value in info.items():
            decoded = TAGS.get(tag, tag)
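
The preview ends before the coordinate conversion. EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere reference (`N`/`S` or `E`/`W`), so getting a decimal value takes one more step. A minimal sketch of that arithmetic, with a hypothetical helper name that is not taken from the gist:

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    value = float(degrees) + float(minutes) / 60.0 + float(seconds) / 3600.0
    # Southern and western hemispheres are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return value
```

For instance, 40° 26' 46" N works out to roughly 40.4461 degrees.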