Vaquar Khan vaquarkhan

:octocat:
while( !(succeed=try())){}
View GitHub Profile
@vaquarkhan
vaquarkhan / getDashboardUrl.js
Created March 24, 2022 05:20 — forked from fideliocc/getDashboardUrl.js
Lambda function hooked to an API Gateway GET endpoint that handles role assumption, QuickSight user registration, and dashboard embed URL resolution for the web app embedding feature
'use strict'
// IMPORTANT: Replace environment variables with your current values
const aws = require('aws-sdk')
aws.config.region = process.env.REGION
const sts = new aws.STS({apiVersion: '2011-06-15'})
module.exports.handler = (event, context, callback) => {
console.log('User email', event.queryStringParameters.email)
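
The forked gist is cut off after the handler's opening lines. For orientation, here is a minimal sketch of the same flow in Python with boto3 (matching the other snippets on this page): assume the embedding role, register the viewer in QuickSight, then resolve the embed URL. The role ARN, account ID, dashboard ID, and region below are placeholders, not values from the original gist.

import boto3

# Placeholder values -- substitute your own.
ACCOUNT_ID = "123456789012"
EMBED_ROLE_ARN = "arn:aws:iam::123456789012:role/QuickSightEmbedRole"
DASHBOARD_ID = "your-dashboard-id"
REGION = "us-east-1"

def get_embed_url(email):
    # 1. Assume the role that is allowed to embed dashboards, scoped to this user.
    creds = boto3.client("sts").assume_role(
        RoleArn=EMBED_ROLE_ARN,
        RoleSessionName=email,
    )["Credentials"]

    quicksight = boto3.client(
        "quicksight",
        region_name=REGION,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    # 2. Register the viewer; ignore the error if they already exist.
    try:
        quicksight.register_user(
            AwsAccountId=ACCOUNT_ID,
            Namespace="default",
            IdentityType="IAM",
            IamArn=EMBED_ROLE_ARN,
            SessionName=email,
            Email=email,
            UserRole="READER",
        )
    except quicksight.exceptions.ResourceExistsException:
        pass

    # 3. Resolve the embed URL using the assumed-role identity.
    response = quicksight.get_dashboard_embed_url(
        AwsAccountId=ACCOUNT_ID,
        DashboardId=DASHBOARD_ID,
        IdentityType="IAM",
        SessionLifetimeInMinutes=60,
    )
    return response["EmbedUrl"]
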
@vaquarkhan
vaquarkhan / access_log_parser.py
Created March 14, 2022 05:32 — forked from wolf0403/access_log_parser.py
Python re for Nginx access_log
# Modified by adding names from https://github.com/richardasaurus/nginx-access-log-parser/blob/master/main.py
import re

pat = (r''
       r'(?P<ip>\d+\.\d+\.\d+\.\d+)\s-\s-\s'                                       # IP address
       r'\[(?P<time>.+)\]\s'                                                       # datetime
       r'"(?P<method>GET|POST)\s(?P<uri>.+)\s(?P<ver>\w+/.+)"\s(?P<status>\d+)\s'  # request line and status
       r'(?P<content_length>\d+)\s"(?P<referrer>.+)"\s'                            # response size and referrer
       r'"(?P<user_agent>.+)"'                                                     # user agent
       )

# Example line in the default "combined" log format:
logline = '203.0.113.7 - - [14/Mar/2022:05:32:10 +0000] "GET /index.html HTTP/1.1" 200 612 "http://example.com/" "Mozilla/5.0"'

r = re.match(pat, logline)
print(r.groupdict())

Databricks Delta Lake - A Friendly Intro

This article introduces Databricks Delta Lake, a storage layer that brings reliability to data lakes and improves their performance with Apache Spark.

First, we'll go through the dry parts: what Apache Spark and data lakes are, and the issues data lakes face. Then we'll look at Delta Lake and how it solves these issues, with a practical, easy-to-apply tutorial.

Introduction to Apache Spark

If you don't know what Spark is: Apache Spark is a unified analytics engine for large-scale data processing, big data, and machine learning. It was originally developed at UC Berkeley in 2009 and is 100% open source, hosted at the vendor-independent Apache Software Foundation.
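
The article excerpt ends here. As a taste of the hands-on part it promises, here is a minimal PySpark sketch (my own illustration, not taken from the article) that writes and reads a Delta table; it assumes the delta-spark pip package is installed and uses a throwaway local path. On Databricks, Delta is preconfigured and the extra session configuration is unnecessary.

from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

# Build a Spark session with the Delta Lake extensions enabled.
builder = (
    SparkSession.builder.appName("delta-friendly-intro")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Write a small DataFrame as a Delta table: an ACID, versioned layer over plain files.
events = spark.createDataFrame([(1, "click"), (2, "view")], ["event_id", "event_type"])
events.write.format("delta").mode("overwrite").save("/tmp/delta/events")

# Read it back -- the transaction log guarantees a consistent snapshot.
spark.read.format("delta").load("/tmp/delta/events").show()
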

@vaquarkhan
vaquarkhan / lambda_function.py
Created March 5, 2022 01:35 — forked from umihico/lambda_function.py
Publish any AWS quicksight dashboards to public with lambda and API gateway
import json
import boto3
"""
Example API Gateway URL. You have to add amazonaws.com (including subdomains) to your QuickSight domain allow-list so the embedded dashboard can be served from it.
https://xxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/xxx-gateway-stage-xxxx/your-lambda-func-name?dashboard_id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxx
"""
def lambda_handler(event, context):
dashboard_id=event["queryStringParameters"]['dashboard_id']
AWSLambdaBasicExecutionPolicy
---
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
@vaquarkhan
vaquarkhan / avro_rw.py
Created February 13, 2022 15:20 — forked from gamame/avro_rw.py
Python Avro Data Read Write
# Import the schema, datafile and io submodules
# from avro (easy_install avro)
from avro import schema, datafile, io
OUTFILE_NAME = 'sample.avro'
SCHEMA_STR = """{
"type": "record",
"name": "sampleAvro",
"namespace": "AVRO",
@vaquarkhan
vaquarkhan / console.py
Created February 7, 2022 05:29 — forked from weavenet/console.py
Python script to assume STS role and generate AWS console URL.
#!/usr/bin/env python
import getpass
import json
import requests
import sys
import urllib
import boto3
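
The forked script is truncated after its imports. The core technique it relies on is the AWS federation endpoint: exchange temporary credentials for a sign-in token, then build a console login URL. A sketch under those assumptions (the role ARN and issuer are placeholders, and urllib.parse replaces the Python 2 style urllib import above):

import json
import urllib.parse

import boto3
import requests

FEDERATION_URL = 'https://signin.aws.amazon.com/federation'

def console_url(role_arn, session_name='console-session'):
    # 1. Get temporary credentials for the role.
    creds = boto3.client('sts').assume_role(
        RoleArn=role_arn, RoleSessionName=session_name
    )['Credentials']

    # 2. Exchange them for a sign-in token at the federation endpoint.
    session = json.dumps({
        'sessionId': creds['AccessKeyId'],
        'sessionKey': creds['SecretAccessKey'],
        'sessionToken': creds['SessionToken'],
    })
    signin_token = requests.get(
        FEDERATION_URL,
        params={'Action': 'getSigninToken', 'Session': session},
    ).json()['SigninToken']

    # 3. Build the console login URL.
    query = urllib.parse.urlencode({
        'Action': 'login',
        'Issuer': 'console.py',
        'Destination': 'https://console.aws.amazon.com/',
        'SigninToken': signin_token,
    })
    return FEDERATION_URL + '?' + query

if __name__ == '__main__':
    print(console_url('arn:aws:iam::123456789012:role/MyRole'))  # placeholder role ARN
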
@vaquarkhan
vaquarkhan / bootstrap.json
Created January 30, 2022 02:03 — forked from TimurFayruzov/bootstrap.json
Setup for running a Flink application on EMR
[
{
"Name": "Ship Flink runtime to cluster",
"Path": "s3://<your_bucket>/flink/ship_flink_runtime.sh"
},
{
"Name": "Ship application to cluster",
"Path": "s3://<your_bucket>/flink/ship_app.sh"
}
]
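
These entries are in the bootstrap-action format the EMR APIs accept. A hedged boto3 sketch of how they could be wired into a cluster launch; the release label, instance sizing, and IAM roles are illustrative placeholders, not part of the original gist.

import boto3

emr = boto3.client('emr', region_name='us-east-1')

response = emr.run_job_flow(
    Name='flink-cluster',
    ReleaseLabel='emr-6.9.0',            # placeholder; pick a release with your Flink version
    Applications=[{'Name': 'Flink'}],
    Instances={
        'InstanceGroups': [
            {'Name': 'master', 'InstanceRole': 'MASTER',
             'InstanceType': 'm5.xlarge', 'InstanceCount': 1},
            {'Name': 'core', 'InstanceRole': 'CORE',
             'InstanceType': 'm5.xlarge', 'InstanceCount': 2},
        ],
        'KeepJobFlowAliveWhenNoSteps': True,
    },
    BootstrapActions=[
        {'Name': 'Ship Flink runtime to cluster',
         'ScriptBootstrapAction': {'Path': 's3://<your_bucket>/flink/ship_flink_runtime.sh'}},
        {'Name': 'Ship application to cluster',
         'ScriptBootstrapAction': {'Path': 's3://<your_bucket>/flink/ship_app.sh'}},
    ],
    JobFlowRole='EMR_EC2_DefaultRole',
    ServiceRole='EMR_DefaultRole',
)
print(response['JobFlowId'])
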
@vaquarkhan
vaquarkhan / BatchJob.java
Created January 18, 2022 05:16 — forked from javier/BatchJob.java
A tale of two streams workshop
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
@vaquarkhan
vaquarkhan / How-to.txt
Created September 20, 2021 23:40 — forked from chicagobuss/How-to.txt
lambda-to-kafka
# First I created a working directory
~$ mkdir s3-kafka
# Then I installed the dependency I needed in that directory with pip
~$ cd s3-kafka
~$ pip install kafka-python -t $(pwd)
# Then I put my code into a file called s3-kafka.py
~$ vi s3-kafka.py
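
The how-to is cut off before the code itself. A minimal sketch of what s3-kafka.py might contain, assuming the Lambda is triggered by S3 object-created events and forwards bucket/key pairs to Kafka with kafka-python; the broker address and topic are placeholders.

import json

from kafka import KafkaProducer

KAFKA_BROKERS = ['broker1.example.com:9092']  # placeholder
TOPIC = 's3-events'                           # placeholder

# Created once per container so connections are reused across invocations.
producer = KafkaProducer(
    bootstrap_servers=KAFKA_BROKERS,
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

def handler(event, context):
    # Forward each S3 object notification to Kafka.
    records = event.get('Records', [])
    for record in records:
        s3 = record['s3']
        producer.send(TOPIC, {
            'bucket': s3['bucket']['name'],
            'key': s3['object']['key'],
        })
    producer.flush()
    return {'sent': len(records)}
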