Peter Brumblay (pbrumblay)

  • Tyr Consulting, LLC
  • Denver, CO
pbrumblay / ftpsupload.ps1
Last active September 23, 2020 16:07
PowerShell FTPS Upload Example
param (
    [string]$file = $(throw "-file is required"),
    [string]$ftphostpath = $(throw "-ftphostpath is required"),
    [string]$username = $(throw "-username is required"),
    [string]$password = $(throw "-password is required")
)
$f = dir $file
$req = [System.Net.FtpWebRequest]::Create("ftp://$ftphostpath/" + $f.Name);
$req.Credentials = New-Object System.Net.NetworkCredential($username, $password);
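For comparison, roughly the same FTPS upload flow with Python's standard-library ftplib; the host, credentials, and file name below are placeholders:

import ftplib

# explicit FTPS (AUTH TLS), the same mode FtpWebRequest uses with EnableSsl
with ftplib.FTP_TLS("ftp.example.com") as ftps:
    ftps.login("username", "password")
    ftps.prot_p()  # switch the data channel to TLS as well
    with open("report.csv", "rb") as fh:
        ftps.storbinary("STOR report.csv", fh)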
pbrumblay / zipdirectory.ps1
Last active December 23, 2015 00:09
PowerShell ZipFile Example (requires .NET 4.5)
[Reflection.Assembly]::LoadWithPartialName('System.IO.Compression.FileSystem');
[System.IO.Compression.ZipFile]::CreateFromDirectory($directoryTozip, "$output.zip");
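A Python equivalent of this one-liner, using only the standard library (paths are placeholders):

import shutil

# writes backup.zip in the current directory from the given source directory
shutil.make_archive("backup", "zip", root_dir="/path/to/directory_to_zip")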
pbrumblay / SplitTableRowsIntoPartitions.java
Last active August 9, 2023 09:34
Apache Beam writing TableRows by partition column using FileIO writeDynamic
package com.fearlesstg.dataflow.pipelines;
import java.util.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.Compression;
import org.apache.beam.sdk.io.FileIO;
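The preview stops at the imports. The idea behind the pipeline (group rows by a partition column, write one compressed newline-delimited JSON file per partition) sketched in plain Python rather than the Beam FileIO.writeDynamic API; the load_date column and file naming are assumptions:

import gzip
import json
from collections import defaultdict

# these dicts stand in for BigQuery TableRows; 'load_date' is an assumed partition column
rows = [
    {'load_date': '2023-08-01', 'id': 1, 'value': 'a'},
    {'load_date': '2023-08-02', 'id': 2, 'value': 'b'},
]

by_partition = defaultdict(list)
for row in rows:
    by_partition[row['load_date']].append(row)

# one compressed newline-delimited JSON file per partition value
for partition, partition_rows in by_partition.items():
    with gzip.open(f'rows-{partition}.json.gz', 'wt') as out:
        for row in partition_rows:
            out.write(json.dumps(row) + '\n')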
pbrumblay / ManualLoad.java
Created May 25, 2018 17:14
Use the low-level com.google.api.services.bigquery.Bigquery client to create a load job that truncates a partition
package com.fearlesstg;
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.*;
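The preview again stops at the imports. A minimal sketch of the same operation with the higher-level google-cloud-bigquery Python client, assuming hypothetical project, dataset, table, and GCS paths:

from google.cloud import bigquery

client = bigquery.Client()

# "$20180525" is a partition decorator: the load targets only that day's partition
table_ref = bigquery.DatasetReference("my-project", "my_dataset").table("my_table$20180525")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,  # truncate just the partition
)

load_job = client.load_table_from_uri("gs://my-bucket/rows.json", table_ref, job_config=job_config)
load_job.result()  # block until the load job finishes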
pbrumblay / ReplaceBQPartition.java
Created May 25, 2018 18:01
Use BigQueryIO.writeTableRows() to replace partitions based on values in TableRow elements
package com.fearlesstg;
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.api.services.bigquery.model.TimePartitioning;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.options.PipelineOptions;
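The pipeline derives a per-partition TableDestination from each TableRow. The core of that mapping, sketched in Python with an assumed 'event_ts' field; writing with WRITE_TRUNCATE against such a decorator replaces only that day's partition:

from datetime import datetime

def partition_destination(row, table='my-project:my_dataset.events'):
    # 'event_ts' is an assumed field name; the decorator targets a single day
    day = datetime.fromisoformat(row['event_ts']).strftime('%Y%m%d')
    return f'{table}${day}'

print(partition_destination({'event_ts': '2018-05-25T18:01:00'}))
# my-project:my_dataset.events$20180525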
pbrumblay / GetTableShardPartition.java
Last active June 4, 2018 13:54
Table destination function which supports large numbers of partitions
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TimePartitioning;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.ValueInSingleWindow;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
//From: https://shinesolutions.com/2017/12/05/fun-with-serializable-functions-and-dynamic-destinations-in-cloud-dataflow/
public class GetTableShardPartition implements SerializableFunction<ValueInSingleWindow<TableRow>, TableDestination> {
pbrumblay / dump_avro_schema.py
Created July 13, 2018 14:28
Python script to extract the schema from an Avro file in Google Cloud Storage
from google.cloud import storage
import sys
from avro.datafile import DataFileReader
from avro.io import DatumReader
import json
client = storage.Client()
bucket_name = sys.argv[1]
blob_name = sys.argv[2]
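The preview cuts off before the schema is read. One way to finish the job with the same libraries, continuing from the variables above; the temp-file approach and the metadata accessor are assumptions, and some avro versions expose get_meta() instead of the meta mapping:

import tempfile

blob = client.bucket(bucket_name).blob(blob_name)
with tempfile.NamedTemporaryFile(suffix='.avro') as tmp:
    blob.download_to_filename(tmp.name)
    reader = DataFileReader(open(tmp.name, 'rb'), DatumReader())
    # the writer schema is stored in the Avro file header metadata
    schema = json.loads(reader.meta['avro.schema'].decode('utf-8'))
    reader.close()
print(json.dumps(schema, indent=2))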
pbrumblay / gcs_custom_hook.py
Created November 5, 2018 22:16
Airflow custom Google Cloud Storage Hook with resumable uploads, partial downloads, and compose (everyone else calls it "concatenating") functionality
from google.cloud import storage
from airflow.hooks.base_hook import BaseHook
from airflow.utils.log.logging_mixin import LoggingMixin
import random
import string
class GCSCustomHook(BaseHook, LoggingMixin):
def __init__(self, storage_conn_id='google_cloud_storage_default'):
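The hook's body is not shown in the preview. The underlying google-cloud-storage calls it would likely wrap, sketched with hypothetical bucket and object names:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")  # hypothetical bucket name

# compose ("concatenate") up to 32 source objects into one destination object
combined = bucket.blob("combined.csv")
combined.compose([bucket.blob("part-1.csv"), bucket.blob("part-2.csv")])

# partial download: fetch only the first kilobyte (the end offset is inclusive)
header_bytes = bucket.blob("combined.csv").download_as_bytes(start=0, end=1023)

# resumable upload: the client uses resumable uploads automatically for large
# files, or a session URL can be created explicitly and handed to another process
session_url = combined.create_resumable_upload_session(content_type="text/csv")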
pbrumblay / terraform_plan_debug.txt
Created January 3, 2020 21:37
Upgrade settings block not working
terraform plan
2020/01/03 14:33:49 [WARN] Log levels other than TRACE are currently unreliable, and are supported only for backward compatibility.
Use TF_LOG=TRACE to see Terraform's internal logs.
----
2020/01/03 14:33:49 [INFO] Terraform version: 0.12.18
2020/01/03 14:33:49 [INFO] Go runtime version: go1.12.13
2020/01/03 14:33:49 [INFO] CLI args: []string{"/usr/local/bin/terraform", "plan"}
2020/01/03 14:33:49 [DEBUG] Attempting to open CLI config file: /Users/peter/.terraformrc
2020/01/03 14:33:49 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2020/01/03 14:33:49 [INFO] CLI command args: []string{"plan"}
pbrumblay / add_row.py
Last active September 29, 2020 04:34
Add row to pandas_ta dataframe and recompute
import pandas as pd
import pandas_ta as ta
from dateutil import parser
# error_bad_lines is deprecated in pandas >= 1.3; on_bad_lines='skip' is the replacement
df = pd.read_csv('AUD_CAD.csv', sep=',',
                 names=['datetime',
                        'bid_open', 'bid_high', 'bid_low', 'bid_close',
                        'ask_open', 'ask_high', 'ask_low', 'ask_close',
                        'mid_open', 'mid_high', 'mid_low', 'mid_close',
                        'volume'],
                 error_bad_lines=False, parse_dates=['datetime'])
df.ta.atr(append=True, high='bid_high', low='bid_low', close='bid_close')
print(df)
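A sketch of the "add a row and recompute" step the title describes, using made-up bar values and the same column names as above:

# append one new bar (values are made up) and recompute the indicator
new_bar = {
    'datetime': parser.parse('2020-09-28 21:00:00'),
    'bid_open': 0.9500, 'bid_high': 0.9520, 'bid_low': 0.9490, 'bid_close': 0.9510,
    'ask_open': 0.9502, 'ask_high': 0.9522, 'ask_low': 0.9492, 'ask_close': 0.9512,
    'mid_open': 0.9501, 'mid_high': 0.9521, 'mid_low': 0.9491, 'mid_close': 0.9511,
    'volume': 120,
}
df = pd.concat([df, pd.DataFrame([new_bar])], ignore_index=True)

# drop the previously appended ATR column, then recompute over the full frame
df = df.drop(columns=[c for c in df.columns if c.startswith('ATR')])
df.ta.atr(append=True, high='bid_high', low='bid_low', close='bid_close')
print(df.tail())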