Peter Brumblay (pbrumblay)

  • Tyr Consulting, LLC
  • Denver, CO
pbrumblay / ftpsupload.ps1
Last active September 23, 2020 16:07
PowerShell FTPS Upload Example
param (
    [string]$file = $(throw "-file is required"),
    [string]$ftphostpath = $(throw "-ftphostpath is required"),
    [string]$username = $(throw "-username is required"),
    [string]$password = $(throw "-password is required")
)
$f = dir $file
$req = [System.Net.FtpWebRequest]::Create("ftp://$ftphostpath/" + $f.Name);
$req.Credentials = New-Object System.Net.NetworkCredential($username, $password);
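For comparison, roughly the same FTPS upload flow with Python's standard-library ftplib; the host, credentials, and file name below are placeholders:

import ftplib

# explicit FTPS (AUTH TLS), the same mode FtpWebRequest uses with EnableSsl
with ftplib.FTP_TLS("ftp.example.com") as ftps:
    ftps.login("username", "password")
    ftps.prot_p()  # switch the data channel to TLS as well
    with open("report.csv", "rb") as fh:
        ftps.storbinary("STOR report.csv", fh)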
pbrumblay / zipdirectory.ps1
Last active December 23, 2015 00:09
PowerShell ZipFile Example (requires .NET 4.5)
[Reflection.Assembly]::LoadWithPartialName('System.IO.Compression.FileSystem');
[System.IO.Compression.ZipFile]::CreateFromDirectory($directoryTozip, "$output.zip");
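A Python equivalent of this one-liner, using only the standard library (paths are placeholders):

import shutil

# writes backup.zip in the current directory from the given source directory
shutil.make_archive("backup", "zip", root_dir="/path/to/directory_to_zip")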
pbrumblay / SplitTableRowsIntoPartitions.java
Last active August 9, 2023 09:34
Apache Beam writing TableRows by partition column using FileIO writeDynamic
package com.fearlesstg.dataflow.pipelines;
import java.util.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.Compression;
import org.apache.beam.sdk.io.FileIO;
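The preview stops at the imports. The idea behind the pipeline (group rows by a partition column, write one compressed newline-delimited JSON file per partition) sketched in plain Python rather than the Beam FileIO.writeDynamic API; the load_date column and file naming are assumptions:

import gzip
import json
from collections import defaultdict

# these dicts stand in for BigQuery TableRows; 'load_date' is an assumed partition column
rows = [
    {'load_date': '2023-08-01', 'id': 1, 'value': 'a'},
    {'load_date': '2023-08-02', 'id': 2, 'value': 'b'},
]

by_partition = defaultdict(list)
for row in rows:
    by_partition[row['load_date']].append(row)

# one compressed newline-delimited JSON file per partition value
for partition, partition_rows in by_partition.items():
    with gzip.open(f'rows-{partition}.json.gz', 'wt') as out:
        for row in partition_rows:
            out.write(json.dumps(row) + '\n')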
pbrumblay / ManualLoad.java
Created May 25, 2018 17:14
Use the low-level com.google.api.services.bigquery.Bigquery client to create a load job that truncates a partition
package com.fearlesstg;
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.*;
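The preview again stops at the imports. A minimal sketch of the same operation with the higher-level google-cloud-bigquery Python client, assuming hypothetical project, dataset, table, and GCS paths:

from google.cloud import bigquery

client = bigquery.Client()

# "$20180525" is a partition decorator: the load targets only that day's partition
table_ref = bigquery.DatasetReference("my-project", "my_dataset").table("my_table$20180525")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,  # truncate just the partition
)

load_job = client.load_table_from_uri("gs://my-bucket/rows.json", table_ref, job_config=job_config)
load_job.result()  # block until the load job finishes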
pbrumblay / ReplaceBQPartition.java
Created May 25, 2018 18:01
Use BigQueryIO.writeTableRows() to replace partitions based on values in TableRow elements
package com.fearlesstg;
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.api.services.bigquery.model.TimePartitioning;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.options.PipelineOptions;
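The pipeline derives a per-partition TableDestination from each TableRow. The core of that mapping, sketched in Python with an assumed 'event_ts' field; writing with WRITE_TRUNCATE against such a decorator replaces only that day's partition:

from datetime import datetime

def partition_destination(row, table='my-project:my_dataset.events'):
    # 'event_ts' is an assumed field name; the decorator targets a single day
    day = datetime.fromisoformat(row['event_ts']).strftime('%Y%m%d')
    return f'{table}${day}'

print(partition_destination({'event_ts': '2018-05-25T18:01:00'}))
# my-project:my_dataset.events$20180525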
pbrumblay / GetTableShardPartition.java
Last active June 4, 2018 13:54
Table destination function which supports large numbers of partitions
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TimePartitioning;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.ValueInSingleWindow;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
//From: https://shinesolutions.com/2017/12/05/fun-with-serializable-functions-and-dynamic-destinations-in-cloud-dataflow/
public class GetTableShardPartition implements SerializableFunction<ValueInSingleWindow<TableRow>, TableDestination> {
pbrumblay / dump_avro_schema.py
Created July 13, 2018 14:28
Python script to extract the schema from an Avro file in Google Cloud Storage
from google.cloud import storage
import sys
from avro.datafile import DataFileReader
from avro.io import DatumReader
import json
client = storage.Client()
bucket_name = sys.argv[1]
blob_name = sys.argv[2]
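The preview cuts off before the schema is read. One way to finish the job with the same libraries, continuing from the variables above; the temp-file approach and the metadata accessor are assumptions, and some avro versions expose get_meta() instead of the meta mapping:

import tempfile

blob = client.bucket(bucket_name).blob(blob_name)
with tempfile.NamedTemporaryFile(suffix='.avro') as tmp:
    blob.download_to_filename(tmp.name)
    reader = DataFileReader(open(tmp.name, 'rb'), DatumReader())
    # the writer schema is stored in the Avro file header metadata
    schema = json.loads(reader.meta['avro.schema'].decode('utf-8'))
    reader.close()
print(json.dumps(schema, indent=2))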
pbrumblay / gcs_custom_hook.py
Created November 5, 2018 22:16
Airflow custom Google Cloud Storage Hook with resumable uploads, partial downloads, and compose (everyone else calls it "concatenating") functionality
from google.cloud import storage
from airflow.hooks.base_hook import BaseHook
from airflow.utils.log.logging_mixin import LoggingMixin
import random
import string
class GCSCustomHook(BaseHook, LoggingMixin):
def __init__(self, storage_conn_id='google_cloud_storage_default'):
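The hook's body is not shown in the preview. The underlying google-cloud-storage calls it would likely wrap, sketched with hypothetical bucket and object names:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")  # hypothetical bucket name

# compose ("concatenate") up to 32 source objects into one destination object
combined = bucket.blob("combined.csv")
combined.compose([bucket.blob("part-1.csv"), bucket.blob("part-2.csv")])

# partial download: fetch only the first kilobyte (the end offset is inclusive)
header_bytes = bucket.blob("combined.csv").download_as_bytes(start=0, end=1023)

# resumable upload: the client uses resumable uploads automatically for large
# files, or a session URL can be created explicitly and handed to another process
session_url = combined.create_resumable_upload_session(content_type="text/csv")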
pbrumblay / terraform_plan_debug.txt
Created January 3, 2020 21:37
Upgrade settings block not working
terraform plan
2020/01/03 14:33:49 [WARN] Log levels other than TRACE are currently unreliable, and are supported only for backward compatibility.
Use TF_LOG=TRACE to see Terraform's internal logs.
----
2020/01/03 14:33:49 [INFO] Terraform version: 0.12.18
2020/01/03 14:33:49 [INFO] Go runtime version: go1.12.13
2020/01/03 14:33:49 [INFO] CLI args: []string{"/usr/local/bin/terraform", "plan"}
2020/01/03 14:33:49 [DEBUG] Attempting to open CLI config file: /Users/peter/.terraformrc
2020/01/03 14:33:49 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2020/01/03 14:33:49 [INFO] CLI command args: []string{"plan"}
pbrumblay / add_row.py
Last active September 29, 2020 04:34
Add row to pandas_ta dataframe and recompute
import pandas as pd
import pandas_ta as ta
from dateutil import parser
# error_bad_lines is deprecated in pandas >= 1.3; on_bad_lines='skip' is the replacement
df = pd.read_csv('AUD_CAD.csv', sep=',',
                 names=['datetime',
                        'bid_open', 'bid_high', 'bid_low', 'bid_close',
                        'ask_open', 'ask_high', 'ask_low', 'ask_close',
                        'mid_open', 'mid_high', 'mid_low', 'mid_close',
                        'volume'],
                 error_bad_lines=False, parse_dates=['datetime'])
df.ta.atr(append=True, high='bid_high', low='bid_low', close='bid_close')
print(df)
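A sketch of the "add a row and recompute" step the title describes, using made-up bar values and the same column names as above:

# append one new bar (values are made up) and recompute the indicator
new_bar = {
    'datetime': parser.parse('2020-09-28 21:00:00'),
    'bid_open': 0.9500, 'bid_high': 0.9520, 'bid_low': 0.9490, 'bid_close': 0.9510,
    'ask_open': 0.9502, 'ask_high': 0.9522, 'ask_low': 0.9492, 'ask_close': 0.9512,
    'mid_open': 0.9501, 'mid_high': 0.9521, 'mid_low': 0.9491, 'mid_close': 0.9511,
    'volume': 120,
}
df = pd.concat([df, pd.DataFrame([new_bar])], ignore_index=True)

# drop the previously appended ATR column, then recompute over the full frame
df = df.drop(columns=[c for c in df.columns if c.startswith('ATR')])
df.ta.atr(append=True, high='bid_high', low='bid_low', close='bid_close')
print(df.tail())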