package com.example.parquet.writing

// Imports for a Parquet writer over Hadoop: ParquetWriter, compression codec, and Hadoop Path/Configuration.
import java.lang.Exception
import java.util
import java.util.{Date, UUID}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetWriter
import org.apache.parquet.hadoop.metadata.CompressionCodecName
From AWS s05 (Singapore):

Datacenter: ap-southeast
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns  Host ID                               Rack
UN  10.252.2.191   80.38 GiB  256     ?     d38f1bf8-f77e-454a-93d6-e0eddecffa06  1a
UN  10.252.12.212  76.63 GiB  256     ?     e3d9201e-799f-4abe-bd53-d22585556bd4  1b
UN  10.252.2.160   76.33 GiB  256     ?     1ee587ca-e929-4057-85a0-94c2e5af5ab5  1a
@johntbush
johntbush / check_s3.py
Last active November 7, 2020 07:25
s3_inventory
import json
import gzip
import pandas as pd
import dateutil.parser

# Load one S3 inventory CSV report into a DataFrame with the standard inventory columns.
def load_df(data_file):
    names = ['Bucket', 'Key', 'Size', 'LastModifiedDate', 'ETag', 'StorageClass', 'IsMultipartUploaded']
    return pd.read_csv(data_file, names=names)

def old_files(df, year):
    # assumed completion: keep objects whose LastModifiedDate falls before the given year
    last_modified = pd.to_datetime(df['LastModifiedDate'])
    return df[last_modified.dt.year < year]
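A minimal usage sketch for the helpers above, assuming the inventory report is a gzip-compressed CSV; the filename and cutoff year below are hypothetical:

with gzip.open('inventory-report.csv.gz', 'rt') as f:  # hypothetical report filename
    inventory = load_df(f)
# objects last modified before 2018, using the assumed old_files completion above
print(old_files(inventory, 2018)[['Bucket', 'Key', 'Size']].head())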
curl -s "http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a?format=json&sort=recordid:desc&filter=audit_created_on:2017-10-31&size=-1" | jq -r '.data[] | .primary_key'
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/40b42240-be7e-11e7-aa09-0e18d10715a6
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/9844e706-be7e-11e7-8909-1284c881d488
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/40fa68f4-be7e-11e7-aa09-0e18d10715a6
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/9886100a-be7e-11e7-8909-1284c881d488
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/98a6beae-be7e-11e7-8909-1284c881d488
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/4160559c-be7e-11e7-aa09-0e18d10715a6
curl -s -XDELETE http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a/4181268c-be7e-11e7-aa09-0e18d10715a6
curl -s
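The same list-then-delete pattern can be scripted end to end. A minimal sketch using Python requests; the host and sheet id are taken from the commands above, while the response shape (beyond the primary_key field the jq filter reads) and error handling are assumptions:

import requests

BASE = "http://sheets.s03.filex.com/2726daf0-7dbf-5dae-bd4d-944d5313944a"

# list the primary keys of records created on 2017-10-31
resp = requests.get(BASE, params={
    "format": "json",
    "sort": "recordid:desc",
    "filter": "audit_created_on:2017-10-31",
    "size": -1,
})
resp.raise_for_status()

# delete each matching record by its primary key
for record in resp.json()["data"]:
    requests.delete(f"{BASE}/{record['primary_key']}").raise_for_status()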
@johntbush
johntbush / cassandra_tables_examples.sql
Last active September 18, 2017 05:32
cassandra tables examples
create table suid (
    suid UUID PRIMARY KEY,
    context TEXT,
    ownerkey TEXT,
    label TEXT,
    environment UUID,
    created TIMESTAMP,
    createdby TEXT,
    modified TIMESTAMP,
    modifiedby TEXT,
@johntbush
johntbush / sample_router_response.json
Created August 29, 2017 22:19
sample_router_response.json
{
  "src_url": "smb://filex.com/comm/filerouter_virt",
  "src_subfolder": "",
  "route_id": 8205,
  "sha256": "62b6ddddef3b34b9840275d7dc898c6949b2a4775b88e0cd0a4559531b2e79f8",
  "dst": "",
  "dst_url": "",
  "subfolder": "",
  "path": "\\\\filex.com\\comm\\filerouter_virt",
  "x12_version": "",
@johntbush
johntbush / ics_schema.sql
Last active August 29, 2017 22:35
ICS Schema
CREATE TABLE [dbo].[Issues]
(
    [seq_num] [int] NOT NULL,
    [create date] [smalldatetime] NULL,
    [customer] [nvarchar](50) NULL,
    [issue_name] [varchar](MAX) NULL,
    [submitter_name] [nvarchar](75) NULL,
    [email_address] [varchar](MAX) NULL,
    [category] [nvarchar](64) NULL,
    [priority] [int] NULL,
@johntbush
johntbush / getDateSource.scala
Created August 29, 2017 16:50
getDataSource
// Builds a HikariCP connection pool for a SQL Server database (via the jTDS driver) described by ConnectionName.
def getDataSource(db: ConnectionName, write: Boolean, userName: String = user, pwd: String = password,
                  isDomainLogon: Boolean = true, sendStringParametersAsUnicode: Option[Boolean] = None): HikariDataSource = {
  val hconfig = new HikariConfig()
  val url = baseUrl + db.sqlDns + "/" + db.databaseName
  hconfig.setPoolName(db.connectionName + "_" + Utils.newUUID)
  hconfig.setMaximumPoolSize(poolSize)
  hconfig.setMinimumIdle(1)
  hconfig.setJdbcUrl(url)
  hconfig.setDriverClassName("net.sourceforge.jtds.jdbc.Driver")
  hconfig.addDataSourceProperty("serverName", db.sqlDns)
@johntbush
johntbush / ratemetrics.py
Last active August 7, 2017 20:27
storm rate metrics logic
if (type = source or type is null) and queue = rate-metrics and msg is not from CDC:
    write log in cassandra
    call rates API to apply patterns
    write doc in elasticsearch
    catch failure:
        if retries not exhausted:
            modify message to set retry flag and update num of retries
            publish back on rate-metrics queue
ack message
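A plain-Python sketch of that retry flow, with hypothetical stand-ins for the external calls (cassandra_log, rates_api, es_index, publish, ack and MAX_RETRIES are assumptions, not real APIs):

# Hypothetical stand-ins for the real clients; the gist only describes the flow, not the APIs.
def cassandra_log(msg): ...
def rates_api(msg): ...
def es_index(msg, patterns): ...
def publish(queue, msg): ...
def ack(msg): ...

MAX_RETRIES = 3  # assumption; the gist does not state a retry limit

def handle(msg, queue_name):
    """Process one rate-metrics message, retrying on failure by republishing."""
    # only source-typed (or untyped) messages on the rate-metrics queue, excluding CDC traffic
    if msg.get("type") in ("source", None) and queue_name == "rate-metrics" and not msg.get("from_cdc"):
        try:
            cassandra_log(msg)             # write log in cassandra
            patterns = rates_api(msg)      # call rates API to apply patterns
            es_index(msg, patterns)        # write doc in elasticsearch
        except Exception:
            retries = msg.get("retries", 0)
            if retries < MAX_RETRIES:      # retries not exhausted
                msg["retry"] = True
                msg["retries"] = retries + 1
                publish("rate-metrics", msg)  # publish back on rate-metrics queue
    ack(msg)                               # always ack the original message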
{
  "type": "update",
  "data": {
    "FbId": "FBLL1000681002182280595",
    // or
    "FbIds": ["FBLL1000681002182280595", "FBLL1000681002182280596"]
  }
}
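Since an update can carry either a single FbId or a list of FbIds, a consumer has to handle both shapes. A small normalization sketch (the fb_ids helper name is an assumption):

import json

def fb_ids(message):
    # accept either the single-id or the multi-id form of an "update" message
    data = message.get("data", {})
    if "FbIds" in data:
        return list(data["FbIds"])
    if "FbId" in data:
        return [data["FbId"]]
    return []

msg = json.loads('{"type": "update", "data": {"FbId": "FBLL1000681002182280595"}}')
print(fb_ids(msg))  # ['FBLL1000681002182280595']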