💭
Can we all just agree it's pronounced "kubernetes"

Jason Thorpe jdthorpe

jdthorpe / README.md
Created December 16, 2016 13:38 — forked from dolphin278/README.md
Pub/sub example for Node.js using MongoDB

Uses capped collection, tailable cursors and streams.

What's here?

  • init.js recreates the capped collection 'queue' in MongoDB.
  • writer.js spams the queue with new messages.
  • worker.js processes all messages saved to the queue.
  • onceWorker.js processes only unprocessed messages, so you can spawn several of them and each message will be processed by exactly one worker.

Steps

jdthorpe / column-data-validator.js
Created June 9, 2017 19:26
Validation of columnar data via AJV
var Ajv = require('ajv');
var ajv = new Ajv(); // options can be passed, e.g. {allErrors: true}
var ajv = new Ajv({"v5":true}); //https://github.com/epoberezkin/ajv/issues/297
ajv.addKeyword('columns', {
type: 'object',
macro: function (schema) {
console.log("IN Schema:\n", JSON.stringify(schema,null,2) ,"\n");
if(!schema.hasOwnProperty("properties"))

Using Ajv (AjvPy) with MongoEngine

Obviously you could use an ajvpy validator to validate documents directly, like this:

# imports
from mongoengine import *
from ajvpy import Ajv

# Instantiate an AJV validator
jdthorpe / MongoEngin with Pandas DataFrames.md
Last active November 9, 2020 05:05
MongoEngine with Pandas DataFrames

MongoEngine Documents with Pandas

This example is meant to demonstrate the separation of concerns between Data Modeling (i.e. storing things) and business logic (i.e. doing things) by creating a Python Package called InventoryDB which implements data models and can be imported into a script that implements some business logic (i.e. actual work!).

DATA MODELING

To minimize the amount of effort involved in storing a pandas DataFrame in a MongoDB Document

jdthorpe / DataFrameField-quirks.md
Last active July 15, 2017 16:05
DataFrameField Quirks

DataFrameField Quirks

The greatest quirk of the DataFrameField class is that a Pandas DataFrame instance is coerced to a Python dictionary via df.to_dict("list") as soon as it is bound to a MongoEngine.Document instance, and a new Pandas DataFrame instance is created from the stored dict when the field is retrieved via dot notation. As a result, retrieving the field from a document returns a different instance from the one that was assigned.

For example:
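A minimal sketch of this round trip, using plain pandas rather than the DataFrameField class itself. The column names and variable names are illustrative assumptions, and the two `pd.DataFrame(stored)` calls stand in for two separate dot-notation accesses:

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({"sku": ["a1", "b2"], "qty": [3, 5]})

# What is kept once the frame is bound to a document:
# a plain dict mapping each column name to a list of values.
stored = df.to_dict("list")

# What comes back on retrieval: a new DataFrame built from the stored dict.
first = pd.DataFrame(stored)
second = pd.DataFrame(stored)

# The retrieved objects are equal in content but are distinct instances.
same_content = first.equals(df)
same_object = first is df

# Changing one retrieved copy leaves the stored dict and other copies untouched.
first.loc[0, "qty"] = 99
```

In this plain-pandas sketch, an in-place change to one retrieved frame is not visible in the stored dict. Whether DataFrameField behaves the same way in every case is not confirmed here.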

jdthorpe / README.md
Last active August 9, 2017 16:39
Returning Multiple Result Sets from an Azure Data Warehouse

Getting data from Azure SQL Data Warehouse into R

TL;DR

This note describes the process of automating the execution of queries against an Azure SQL Data Warehouse using Active Directory Integrated Authentication to produce (possibly multiple) CSV files

Automation with SqlCmd: Which SQLCMD to use?

jdthorpe / README.md
Last active October 30, 2017 16:50
Hierarchical version of dcast

Introduction to Hierarchical Casting with dhcast()

First, the h in dhcast is for hierarchy. The function is useful when you need to (A) cast (go from long to wide data), (B) have the cast observe a given variable hierarchy, and (optionally) aggregate your data (summarize related records in the long dataset) at the same time.
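To make the idea concrete, here is a rough pandas sketch of a cast that respects a hierarchy and aggregates along the way. It is not dhcast() itself (which is an R function), and the data, the `hierarchy` mapping, and the choice of `sum` as the aggregate are all assumptions made for illustration:

```python
import pandas as pd

# Hypothetical long-format data: one row per customer and purchased item.
purchases = pd.DataFrame({
    "customer": ["a", "a", "a", "b", "b"],
    "item": ["apple", "banana", "kale", "apple", "apple"],
    "amount": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Hypothetical hierarchy: each item maps to itself plus all of its ancestors.
hierarchy = {
    "apple": ["apple", "fruit", "produce"],
    "banana": ["banana", "fruit", "produce"],
    "kale": ["kale", "vegetable", "produce"],
}

# Repeat each row once for every hierarchy level the item belongs to.
expanded = purchases.assign(level=purchases["item"].map(hierarchy)).explode("level")

# Cast long -> wide, summing amounts per customer within each level.
wide = expanded.pivot_table(
    index="customer", columns="level", values="amount",
    aggfunc="sum", fill_value=0,
)
```

Each customer ends up with one row and one column per level of the hierarchy, so the `fruit` and `produce` columns hold totals rolled up from the individual items.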

Let's say, for example, that we want to predict some time-varying attribute of a grocery store's customers based on their produce purchases. Furthermore, we want to

jdthorpe / Screen Shot 2018-04-08 at 1.21.46 PM.png
Last active April 9, 2018 06:19
Vim freezes when TSServer fails
jdthorpe / R DataFrame 2 Pandas.md
Last active November 16, 2018 00:46
A little function for generating parameters for pandas' read_csv from an R data.frame

When you've got a data.frame in R and want to write it to CSV and then import it into Python with pandas, lazy code like this will fail to parse your string and date fields:

<some_file.R>
write.csv(my_data_frame, "my_data.csv", row.names=FALSE)
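On the Python side, the problem and the kind of explicit parameters that avoid it can be sketched as follows. The CSV content and column names are made up for illustration, and the `dtype` and `parse_dates` arguments are written by hand here rather than generated by the function this gist describes:

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for "my_data.csv".
csv_text = "zip,signup\n01234,2018-11-15\n98101,2018-11-16\n"

# Lazy read: pandas infers the types, so the zip code becomes an integer
# (losing its leading zero) and the date column is not parsed as a date.
lazy = pd.read_csv(io.StringIO(csv_text))

# Explicit read: declare which columns are strings and which are dates.
explicit = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"zip": str},
    parse_dates=["signup"],
)
```

Passing `dtype` and `parse_dates` explicitly is what a generated parameter set would need to supply for each string and date column in the data.frame.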
jdthorpe / package-lock.json
Created June 16, 2018 18:28
Package-lock.json file for svgdom issue
{
"name": "svg.js-test",
"version": "1.0.0",
"lockfileVersion": 1,
"requires": true,
"dependencies": {
"acorn": {
"version": "5.7.1",
"resolved": "https://registry.npmjs.org/acorn/-/acorn-5.7.1.tgz",
"integrity": "sha512-d+nbxBUGKg7Arpsvbnlq61mc12ek3EY8EQldM3GPAhWJ1UVxC6TDGbIvUMNU6obBX3i1+ptCIzV4vq0gFPEGVQ=="