Vaquar Khan vaquarkhan

@vaquarkhan
vaquarkhan / hudi-glue.scala
Created September 20, 2021 16:30 — forked from igorbasko01/hudi-glue.scala
Hudi GLUE sync
// EMR Applications: Spark, Hive
// spark-shell --packages org.apache.hudi:hudi-spark-bundle_2.11:0.5.1-incubating,org.apache.spark:spark-avro_2.11:2.4.4 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf "spark.sql.hive.convertMetastoreParquet=false"
// EMR configuration:
// {
//   "classification": "hive-site",
//   "properties": {
//     "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
//     "hive.metastore.schema.verification": "false"
//   }
// }
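As a rough illustration, the `hive-site` classification from the comment can be built and serialized in Python. This is a minimal sketch, not part of the gist: the function name `hive_site_for_glue` is hypothetical, and wrapping the classification in a list is an assumption about how the configuration is supplied to the cluster.

```python
import json


def hive_site_for_glue():
    """Return the hive-site classification from the comment as a dict.

    Hypothetical helper; the keys and values are copied from the gist.
    """
    return {
        "classification": "hive-site",
        "properties": {
            # Point the Hive metastore client at the AWS Glue Data Catalog.
            "hive.metastore.client.factory.class": (
                "com.amazonaws.glue.catalog.metastore."
                "AWSGlueDataCatalogHiveClientFactory"
            ),
            # Disable metastore schema version verification.
            "hive.metastore.schema.verification": "false",
        },
    }


# Assumption: the cluster configuration is a list of classification objects.
configurations = [hive_site_for_glue()]
print(json.dumps(configurations, indent=2))
```

Keeping the configuration in one helper makes it easy to check that the factory class name is spelled exactly as in the gist before the JSON is pasted into a cluster definition.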
vaquarkhan / Coursera.md
Created July 17, 2021 06:29
Coursera Specializations and Courses


Implement solutions using Google Kubernetes Engine, or GKE, including building, scheduling, load balancing, and monitoring workloads, as well as providing for discovery of services, managing role-based access control and security, and providing persistent storage to these applications.

vaquarkhan / 1_ecs_note.md
Created June 3, 2021 22:07 — forked from ejlp12/1_ecs_note.md
ECS Best Practices Notes
vaquarkhan / kubectl.md
Last active March 31, 2021 04:16 — forked from pydevops/kubectl.md
k8s kubectl cheat sheet

Kubernetes Cheat Sheet

A cheat sheet for Kubernetes commands.

Kubectl Alias

Linux

alias k=kubectl
vaquarkhan / interviewitems.MD
Created December 25, 2020 08:30 — forked from KWMalik/interviewitems.MD
My answers to over 100 Google interview questions

## Google Interview Questions: Product Marketing Manager

  • Why do you want to join Google? -- Because I want to create tools for others to learn, for free. I didn't have much money growing up, so I didn't have access to the same books, computers, and resources that others had. I want to help ensure that others can learn on a level playing field regardless of their family's wealth or location.
  • What do you know about Google’s product and technology? -- A lot, actually. I am a beta tester for numerous products, and I use most of the Google tools, such as Search, Gmail, Drive, Reader, Calendar, G+, YouTube, Webmaster Tools, Keyword Tools, Analytics, etc.
  • If you were the Product Manager for Google’s AdWords, how would you plan to market it?
  • What would you say during an AdWords or AdSense product seminar?
  • Who are Google’s competitors, and how does Google compete with them? -- Google competes in numerous fields: --- Search: Baidu, Bing, DuckDuckGo
/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
/*
vaquarkhan / cassandra-notes.md
Created July 23, 2020 20:52 — forked from gavinmh/cassandra-notes.md
Cassandra Notes


Introduction

Apache Cassandra is an open source, distributed database management system. Cassandra is designed to handle large amounts of data across many commodity servers. Cassandra uses a query language named CQL.

Cassandra's data model is a partitioned row store; Cassandra combines elements of key-value stores and tabular/columnar databases. Like a relational database, Cassandra stores data in tables, called column families, that have defined columns and associated data types. Each row in a column family is uniquely identified by a key. Each row has multiple columns, each of which has a timestamp, name, and value. Unlike a relational database, each row in a column family does not need to have the same set of columns. At any time, a column may be added to one or more rows. If this explanation is unclear, you might think of column families instead as sets of key-value pairs, in which the values are nested sets of key-value pairs.
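The nested key-value view described above can be sketched in plain Python. This is only an illustration of the data model, not Cassandra's API or storage format: the `Column` type, the `users` column family, the `add_column` helper, and all sample values are hypothetical.

```python
from typing import Dict, NamedTuple


class Column(NamedTuple):
    """A column's value and timestamp; its name is the key in the row dict."""
    value: str
    timestamp: int


Row = Dict[str, Column]          # column name -> column
ColumnFamily = Dict[str, Row]    # row key -> row

# Two rows in one column family; the rows do not share the same columns.
users: ColumnFamily = {
    "alice": {
        "email": Column("alice@example.com", 1),
        "city": Column("Oslo", 1),
    },
    "bob": {
        "email": Column("bob@example.com", 2),
    },
}


def add_column(cf: ColumnFamily, row_key: str, name: str,
               value: str, timestamp: int) -> None:
    """Add (or overwrite) a single column on a single row."""
    cf.setdefault(row_key, {})[name] = Column(value, timestamp)


# A column can be added to one row at any time without touching the others.
add_column(users, "bob", "phone", "555-0100", 3)
```

The outer dictionary plays the role of the column family keyed by row key, and each inner dictionary is a row whose keys are column names, which is why the two rows can carry different sets of columns.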

The following depicts two rows of a column-family fro

vaquarkhan / MNIST_Keras2DML.py
Created July 12, 2020 02:18 — forked from NiloyPurkait/MNIST_Keras2DML.py
An example of using Apache SparkML to train a convolutional neural network in parallel on the MNIST dataset, on IBM Watson Studio. Written for the Medium article: https://medium.com/@niloypurkait/how-to-train-your-neural-networks-in-parallel-with-keras-and-apache-spark-ea8a3f48cae6
################################### Keras2DML: Training a neural network in parallel with SystemML #######################################
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Input, Dense, Conv1D, Conv2D, MaxPooling2D, Dropout,Flatten
from keras import backend as K
from keras.models import Model
import numpy as np
import matplotlib.pyplot as plt
vaquarkhan / DemoApplication.java
Created July 4, 2020 20:59 — forked from MrBW/DemoApplication.java
chaos-monkey-pivotal-test
package com.issue.chaos.monkey.demo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
vaquarkhan / aws_glue_boto3_example.md
Created August 26, 2019 05:50 — forked from ejlp12/aws_glue_boto3_example.md
AWS Glue Create Crawler, Run Crawler and update Table to use "org.apache.hadoop.hive.serde2.OpenCSVSerde"
import boto3

client = boto3.client('glue')

response = client.create_crawler(
    Name='SalesCSVCrawler',
    Role='AWSGlueServiceRoleDefault',
    DatabaseName='sales-cvs',
    Description='Crawler for generated Sales schema',