Vincent C Fulco vfulco

  • Weisisheng Corporate Management Consulting (Shanghai) Ltd.
  • Shanghai, China
@vfulco
vfulco / text-cleaning+word2vec-gensim.py
Created November 27, 2015 02:15 — forked from a-paxton/text-cleaning+word2vec-gensim.py
Cleaning Text Data and Creating 'word2vec' Model with Gensim
# preliminaries
from pymongo import MongoClient
from nltk.corpus import stopwords
from string import ascii_lowercase
import pandas as pd
import gensim, os, re, pymongo, itertools, nltk, snowballstemmer
# set the location where we'll save our model
savefolder = '/data'
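As a rough illustration of the cleaning step the title refers to, the sketch below lowercases text, replaces non-letter characters with spaces, and drops stopwords. It is not the gist's own code: `clean_text` and the tiny `STOPWORDS` set are hypothetical stand-ins (the preview above imports NLTK's stopword list), and only the standard library is used.

```python
import re

# Hypothetical, deliberately tiny stopword set; the gist imports NLTK's list instead.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in"}

def clean_text(text):
    """Lowercase, replace non-letter characters with spaces, and drop stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]

# Each document becomes a list of tokens.
sentences = [clean_text(s) for s in [
    "The cat sat in the hat!",
    "Word2vec is a model of words.",
]]
print(sentences)
```

Token lists in this shape (one list of strings per sentence) are what gensim's `Word2Vec` accepts as its training input.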
@vfulco
vfulco / runsearch.py
Created November 26, 2015 08:11 — forked from cpatrick/runsearch.py
Sample Python for running a full-text search using PyMongo
from pymongo import MongoClient  # pymongo.Connection was removed in PyMongo 3.0
if __name__ == '__main__':
    # Connect to mongo
    conn = MongoClient()
    db = conn['canepi']
    # Set the search term
    term = 'foo'
@vfulco
vfulco / incrementalmr.py
Created November 25, 2015 15:18 — forked from ses4j/incrementalmr.py
Periodically-updating pymongo/MongoDB incremental MapReduce example
def incremental_map_reduce(
        map_f,
        reduce_f,
        db,
        source_table_name,
        target_table_name,
        source_queued_date_field_name,
        counter_table_name="IncrementalMRCounters",
        counter_key=None,
        max_datetime=None,
@vfulco
vfulco / prep-contacts-for-ponymailer
Created November 24, 2015 13:46 — forked from leonardreidy/prep-contacts-for-ponymailer
Parse an HTML file with Beautiful Soup, find emails and names, and output them as JSON, ready for ponymailer.rb. Emails are taken from href="mailto:" links and names from inside <strong> tags. The program builds a single list containing both names and emails, then outputs it as JSON for ponymailer to send.
# A simple python script to extract names, and emails from
# a certain online directory
import os, json
from bs4 import BeautifulSoup
#get a list of the files in the current directory
inputfiles = os.listdir(os.getcwd())
def postproc(inputfiles):
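The extraction described above (names from `<strong>` tags, emails from `mailto:` links, JSON out) might look roughly like the sketch below. It swaps in the standard library's `html.parser` for Beautiful Soup so it runs without extra packages. `ContactParser`, `extract_contacts`, the sample HTML, and the name/email pairing format are all assumptions for illustration, not the gist's actual code or ponymailer's required input.

```python
import json
from html.parser import HTMLParser

class ContactParser(HTMLParser):
    """Collect text inside <strong> tags and addresses from mailto: links."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.emails = []
        self._in_strong = False

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self._in_strong = True
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("mailto:"):
                self.emails.append(href[len("mailto:"):])

    def handle_endtag(self, tag):
        if tag == "strong":
            self._in_strong = False

    def handle_data(self, data):
        if self._in_strong and data.strip():
            self.names.append(data.strip())

def extract_contacts(html):
    parser = ContactParser()
    parser.feed(html)
    # Pair names with emails by position; assumes they appear in matching order.
    return [{"name": n, "email": e} for n, e in zip(parser.names, parser.emails)]

# Made-up sample markup standing in for the online directory page.
sample = (
    '<p><strong>Ada Lovelace</strong> <a href="mailto:ada@example.com">mail</a></p>'
    '<p><strong>Alan Turing</strong> <a href="mailto:alan@example.com">mail</a></p>'
)
print(json.dumps(extract_contacts(sample), indent=2))
```

Pairing by position is the fragile part: if an entry in the directory lacks an email or a name, the two lists drift out of step, so a real script would likely walk each entry's container element instead.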
@vfulco
vfulco / flask_gridfs_server.py
Created November 1, 2015 04:41
A simple GridFS server built with Flask
from flask import Flask, request, redirect, url_for, make_response, abort
from werkzeug.utils import secure_filename  # top-level import was removed in Werkzeug 1.0
from pymongo import MongoClient
from bson.objectid import ObjectId
from gridfs import GridFS
from gridfs.errors import NoFile
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}
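An extension whitelist like `ALLOWED_EXTENSIONS` is typically checked before an upload is stored. The sketch below shows one plausible way to do that. `allowed_file` is a hypothetical helper name and is not taken from the gist, whose body is not shown in this preview.

```python
import os

ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}

def allowed_file(filename):
    """Return True if the filename's final extension is in ALLOWED_EXTENSIONS."""
    ext = os.path.splitext(filename)[1]  # e.g. ".PNG", or "" when there is none
    return ext[1:].lower() in ALLOWED_EXTENSIONS

for name in ["photo.PNG", "report.pdf", "archive.tar.gz", "notes"]:
    print(name, allowed_file(name))
```

The check only looks at the name, not the file's contents, so it is a convenience filter rather than a security boundary. Passing the name through `secure_filename` before use is still worthwhile.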
@vfulco
vfulco / pedantically_commented_playbook.yml
Created October 14, 2015 09:46 — forked from marktheunissen/pedantically_commented_playbook.yml
Insanely complete Ansible playbook, showing off all the options
---
# ^^^ YAML documents conventionally begin with the document start marker "---" (optional for a single document)
#
#### Example docblock, I like to put a descriptive comment at the top of my
#### playbooks.
#
# Overview: Playbook to bootstrap a new host for configuration management.
# Applies to: production
# Description:
# Ensures that a host is configured for management with Ansible.
@vfulco
vfulco / nginx.conf
Created October 10, 2015 11:09 — forked from sansmischevia/nginx.conf
nginx http proxy to s3 static websites
##
## This nginx.conf serves as the main config file for webflow reverse proxy
##
## RCS:
## https://gist.github.com/sansmischevia/5617402
##
## Hardening tips:
## http://www.cyberciti.biz/tips/linux-unix-bsd-nginx-webserver-security.html
##
@vfulco
vfulco / google-sheets-json.py
Created October 10, 2015 03:23 — forked from nickjevershed/google-sheets-json.py
Python script to convert Google spreadsheets to a simple JSON file and save it locally. Assumes your data is on the left-most sheet, i.e. the default. The spreadsheet needs to be 'published to the web'.
import simplejson as json
import requests
#your spreadsheet key here. I'm using an example from the Victorian election campaign
key = "1THJ6MgfEk-1egiPFeDuvs4qEi02xTpz4fq9RtO7GijQ"
#google api request urls - I'm doing the first one just to get nice key values (there's probably a better way to do this)
url1 = "https://spreadsheets.google.com/feeds/cells/" + key + "/od6/public/values?alt=json"
@vfulco
vfulco / import_json_appsscript.js
Created October 8, 2015 10:22 — forked from chrislkeller/import_json_appsscript.js
Adds what amounts to an =ImportJSON() function to a Google spreadsheet. To use it, go to Tools --> Script Editor, add the script, and save.
/**
 * Retrieves all the rows in the active spreadsheet that contain data and logs the
 * values for each row.
 * For more information on using the Spreadsheet API, see
 * https://developers.google.com/apps-script/service_spreadsheet
 */
function readRows() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var rows = sheet.getDataRange();
  var numRows = rows.getNumRows();
// Includes functions for exporting active sheet or all sheets as JSON object (also Python object syntax compatible).
// Tweak the makePrettyJSON_ function to customize what kind of JSON to export.
var FORMAT_ONELINE = 'One-line';
var FORMAT_MULTILINE = 'Multi-line';
var FORMAT_PRETTY = 'Pretty';
var LANGUAGE_JS = 'JavaScript';
var LANGUAGE_PYTHON = 'Python';