@mieitza
mieitza / keybase.md
Last active November 27, 2016 20:13

Keybase proof

I hereby claim:

  • I am mieitza on github.
  • I am mieitza (https://keybase.io/mieitza) on keybase.
  • I have a public key ASCpl4A69Tk1VjHXdVtvfnJ_c0zRfD7MQscb5SPPhOf7kAo

To claim this, I am signing this object:

@mieitza
mieitza / pyspark_to_elasticsearch.py
Created January 3, 2018 15:04 — forked from adrianva/pyspark_to_elasticsearch.py
Save RDD and/or DataFrame from Spark to Elasticsearch
# Elastic configs
es_read_conf = {
    "es.nodes": "localhost",
    "es.port": "9200",
    "es.resource": "twitter/tweet"
}
es_write_conf = {
    "es.nodes": "localhost",
    "es.port": "9200",
@mieitza
mieitza / mongodb_2_pandas.py
Created January 3, 2018 15:05 — forked from jmquintana79/mongodb_2_pandas.py
Functions to connect to MongoDB and read its data into a pandas DataFrame
import pandas as pd
from pymongo import MongoClient

# set connection with mongodb
def _connect_mongo(host, port, username, password, db):
    """ A util for making a connection to mongo """
    if username and password:
        mongo_uri = 'mongodb://%s:%s@%s:%s/%s' % (username, password, host, port, db)
        conn = MongoClient(mongo_uri)
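The helper above interpolates the username and password straight into the URI, which can break when either contains characters such as `@`, `:` or `/`. A minimal sketch of one way to guard against that, assuming a hypothetical helper name `build_mongo_uri` (not part of the original gist) and percent-escaping with the standard library's `urllib.parse.quote_plus`:

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port, db, username=None, password=None):
    """Build a mongodb:// URI, percent-escaping credentials when given.

    Hypothetical helper for illustration; not part of the original gist.
    """
    if username and password:
        # Escape reserved characters so '@', ':' or '/' in the
        # credentials cannot be mistaken for URI delimiters.
        return 'mongodb://%s:%s@%s:%s/%s' % (
            quote_plus(username), quote_plus(password), host, port, db)
    # No credentials: plain host/port/database URI.
    return 'mongodb://%s:%s/%s' % (host, port, db)
```

The returned string could then be passed to `MongoClient(...)` in the same way the gist passes `mongo_uri`.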
@mieitza
mieitza / cypher.py
Created January 3, 2018 15:06 — forked from gregroberts/cypher.py
A function for pandas to get results of a cypher query directly into a DataFrame
from pandas.core.api import DataFrame
from pandas import to_datetime

# save me at site-packages\pandas\io\cypher.py
def read_cypher(cypher, con, index_col=None, params = {},parse_dates = None, columns= None):
    '''
    Run a Cypher query against the graph at con, put the results into a df
    Parameters
@mieitza
mieitza / gist:4341f290e0dba9c6c53a5596779f4e94
Created January 3, 2018 15:06 — forked from c0ldlimit/gist:5164171
#python #flask #pandas Using flask to return a csv response from a dataframe
from io import StringIO
from flask import Flask, Response

app = Flask(__name__)

@app.route('/some_dataframe.csv')
def output_dataframe_csv():
    # collect the CSV text in an in-memory buffer
    output = StringIO()
    some_dataframe.to_csv(output)
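The preview stops before the response is built. As a rough sketch of the in-memory buffer step only, the example below uses the standard library's `csv` module in place of a DataFrame (an assumption, made so the snippet runs without pandas or Flask). The function name `rows_to_csv_text` is hypothetical and does not come from the gist.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Render a header and rows to CSV text using an in-memory buffer."""
    output = io.StringIO()
    # Use '\n' line endings so the text is the same on every platform.
    writer = csv.writer(output, lineterminator='\n')
    writer.writerow(header)
    writer.writerows(rows)
    # getvalue() returns everything written to the buffer as one string.
    return output.getvalue()
```

In a Flask view, a string like this is what would typically be handed to `Response(..., mimetype='text/csv')`.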
@mieitza
mieitza / database_report.py
Created January 3, 2018 15:06 — forked from gregroberts/database_report.py
Py2neo: write a report on the contents of a graph database, describing which nodes it contains, what sort of properties they (seem to) have, and which edges come in and out of each
import py2neo
import datetime

# where we write it
f_name = 'DBREPORT_%s.txt' % datetime.datetime.today().strftime('%Y-%m-%d')
# overwrite anything previous (text mode, since a str is written)
with open(f_name, 'w') as f:
    f.write('REPORT COMPILATION STARTED AT %s' % datetime.datetime.now())
@mieitza
mieitza / import_csv_to_mongo
Created January 3, 2018 15:07 — forked from alpoza/import_csv_to_mongo
Store CSV data into mongodb using python pandas
#!/usr/bin/env python
import sys
import pandas as pd
import pymongo
import json

def import_content(filepath):
    mng_client = pymongo.MongoClient('localhost', 27017)
@mieitza
mieitza / import_csv_to_mongo
Created January 3, 2018 15:07 — forked from thangarajan8/import_csv_to_mongo
Store CSV data into mongodb using python pandas
#!/usr/bin/env python
import sys
import pandas as pd
import pymongo
import json

def import_content(filepath):
    mng_client = pymongo.MongoClient('localhost', 27017)
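The `import_csv_to_mongo` previews end before the CSV is read. As a sketch of the conversion step, assuming the goal is one document per CSV row, the example below uses the standard library's `csv.DictReader` rather than pandas so it runs on its own. The name `csv_to_documents` is hypothetical, and all values stay as strings because no type inference is attempted.

```python
import csv
import io


def csv_to_documents(csv_text):
    """Turn CSV text into a list of dicts, one per data row.

    Hypothetical helper for illustration; the header row supplies the keys.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]
```

A list in this shape is what pymongo's `insert_many` accepts on a collection.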
@mieitza
mieitza / pandas-cheatsheet.py
Created January 3, 2018 15:08 — forked from spepechen/pandas-cheatsheet.py
handy 🐼 snippets
#### BASIC ########################################################################################################################
# cleaning str in the header
df.columns = [x.lower().strip() for x in df.columns] # lower case, trim leading and trailing spaces
df.columns = [x.strip().replace(' ', '_') for x in df.columns] # replace whitespaces b/w words with _
# checking NaN in all df
df.isnull().values.any()
# get column-slices