A personal diary of DataFrame munging over the years.
Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
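The preview does not show which call the diary uses for this conversion. A minimal sketch, assuming `pd.to_numeric` and made-up sample data (the Series names and values here are hypothetical):

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings.
prices = pd.Series(["1", "2", "3.5"])

# Strict conversion: works because every value parses as a number.
numeric_prices = pd.to_numeric(prices)

# A single non-numeric value makes the strict conversion raise ValueError.
dirty = pd.Series(["1", "two", "3"])
try:
    pd.to_numeric(dirty)
    strict_failed = False
except ValueError:
    strict_failed = True

# errors="coerce" replaces unparseable values with NaN instead of raising.
coerced = pd.to_numeric(dirty, errors="coerce")
```

Whether to fail loudly or coerce to NaN depends on how much you trust the column; the strict form matches the "will error" behaviour described above.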
var cheerio = require('cheerio');
var request = require('request');
var fs = require('fs');
var exec = require('child_process').exec;
var twitterAPI = require('node-twitter-api');
var phpsessid = '<get it from your channeli account. Chrome developer console is your friend>';
var consumerKey = '<get it from twitter.js>';
var consumerSecret = '<get it from twitter.js>';
var accessToken = '<get it from twitter.js>';
"""
A deep neural network with or w/o dropout in one file.
License: Do What The Fuck You Want to Public License http://www.wtfpl.net/
"""
import numpy, theano, sys, math
from theano import tensor as T
from theano import shared
from theano.tensor.shared_randomstreams import RandomStreams
#!/bin/bash
# Bash script to install latest version of ffmpeg and its dependencies on Ubuntu 12.04 or 14.04
# Inspired from https://gist.github.com/faleev/3435377
# Remove any existing packages:
sudo apt-get -y remove ffmpeg x264 libav-tools libvpx-dev libx264-dev
# Get the dependencies (Ubuntu Server or headless users):
sudo apt-get update
A lot of these are outright stolen from Edward O'Campo-Gooding's list of questions. I really like his list.
I'm having some trouble paring this down to a manageable list of questions -- I realistically want to know all of these things before starting to work at a company, but it's a lot to ask all at once. My current game plan is to pick 6 before an interview and ask those.
I'd love comments and suggestions about any of these.
I've found questions like "do you have smart people? Can I learn a lot at your company?" to be basically totally useless -- everybody will say "yeah, definitely!" and it's hard to learn anything from them. So I'm trying to make all of these questions pretty concrete -- if a team doesn't have an issue tracker, they don't have an issue tracker.
I'm also mostly not asking about principles, but the way things are -- not "do you think code review is important?", but "Does all code get reviewed?".
Rewriting a PDF through Ghostscript's pdfwrite device can reduce files to ~15% of their original size (2.3M to 345K, in one case) with no obvious degradation of quality.
ghostscript -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
Other options for PDFSETTINGS:
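Ghostscript documents the presets /screen, /ebook, /printer, /prepress and /default for this flag. A minimal Python sketch of assembling the command above with a chosen preset; the helper name `build_gs_command` and its parameters are hypothetical, and the executable defaults to `ghostscript` as written above (on many systems the binary is called `gs`):

```python
import shlex

# Presets Ghostscript documents for -dPDFSETTINGS.
PDF_PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def build_gs_command(input_pdf, output_pdf, preset="printer",
                     executable="ghostscript"):
    """Return the argument list for the compression command shown above.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if preset not in PDF_PRESETS:
        raise ValueError(f"unknown PDFSETTINGS preset: {preset!r}")
    return [
        executable,
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS=/{preset}",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={output_pdf}",
        input_pdf,
    ]


command = build_gs_command("input.pdf", "output.pdf")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, assuming Ghostscript is installed. Keeping the arguments as a list avoids shell-quoting problems with file names that contain spaces.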
import os
import numpy
from pandas import DataFrame
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.cross_validation import KFold
from sklearn.metrics import confusion_matrix, f1_score
NEWLINE = '\n'
var fn = function(arg1, arg2) {
    var str = '<p>aap ' + this.noot + ' ' + arg1 + ' ' + arg2 + '</p>';
    document.body.innerHTML += str;
};
var context = {
    'noot': 'noot'
};
var args = ['mies', 'wim'];
// Calls a function with a given 'this' value and arguments provided individually.
Ideas are cheap. Make a prototype, sketch a CLI session, draw a wireframe. Discuss around concrete examples, not hand-waving abstractions. Don't say you did something, provide a URL that proves it.
Nothing is real until it's being used by a real user. This doesn't mean you make a prototype in the morning and blog about it in the evening. It means you find one person you believe your product will help and try to get them to use it.
<cities>
<state>
<name>Alabama</name>
<city>Abbeville</city>
<number>1</number>
</state>
<state>
<name>Alabama</name>
<city>Adamsville</city>
<number>1</number>