Igor Brigadir igorbrigadir

@richarddmorey
richarddmorey / likert_check.R
Last active March 4, 2017 14:51
R script to check possible means and standard deviations for likert responses
############
# Option 1: Get all solutions
# using brute force
############
## This function creates every possible distribution of responses for
## a likert scale with nlev responses. This is total brute force. There's
## probably a better way.
## Argument:
## v : initially, the total number of responses
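The brute-force idea described in the comments above (enumerate every possible distribution of responses, then see which ones reproduce a reported mean and standard deviation) can be sketched in Python. This is not the R script's implementation: the function names `likert_distributions` and `consistent_distributions` are hypothetical, the sample (n-1) standard deviation is assumed, and matching is done after rounding to a chosen number of digits.

```python
from collections import Counter
from itertools import combinations_with_replacement
from statistics import mean, stdev


def likert_distributions(n, nlev):
    """Yield every count vector (c_1, ..., c_nlev) whose counts sum to n."""
    levels = range(1, nlev + 1)
    # Each multiset of n responses corresponds to exactly one count vector.
    for combo in combinations_with_replacement(levels, n):
        counts = Counter(combo)
        yield tuple(counts.get(level, 0) for level in levels)


def summarize(counts):
    """Return (mean, sample SD) of the responses implied by a count vector."""
    responses = [level for level, c in enumerate(counts, start=1) for _ in range(c)]
    return mean(responses), stdev(responses)


def consistent_distributions(n, nlev, target_mean, target_sd, digits=2):
    """Return the count vectors whose rounded mean and SD match the targets."""
    matches = []
    for counts in likert_distributions(n, nlev):
        m, s = summarize(counts)
        if round(m, digits) == round(target_mean, digits) and \
           round(s, digits) == round(target_sd, digits):
            matches.append(counts)
    return matches
```

The number of distributions grows quickly with `n` and `nlev`, so this sketch is only practical for small samples. It needs `n >= 2`, because the sample standard deviation is undefined for a single response.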
@arjanelfassed
arjanelfassed / NLarmsexports.csv
Created February 23, 2016 15:06
Maandrapportages uitvoer, doorvoer en dual-use goederen 2004-2015
datum,nummer,omschrijving,gebruik,aantal,type,catdt,herkomst,bestemming,waarde,Column 11,Column 12,Column 13,Column 14
2004-05-01T00:00:00Z,2.3632284E7,(Tweede generatie) beeldversterkerbuizen,Nachtzichtsystemen,,Dual-use,D,Nederland,Chili Via België,25740.0,,,,
2004-06-01T00:00:00Z,2.3653915E7,Dimethylamine,Vervaardiging van herbiciden,,Dual-use,D,Nederland,Kroatië,15000.0,,,,
2004-07-01T00:00:00Z,2.3691973E7,Methyldiethanolamine,Tbv de olie-industrie,,Dual-use,D,Onbekend,Zuid-afrika,377834.0,,,,
2004-08-01T00:00:00Z,2.3674858E7,Fosforoxychloride,Vervaardiging van harsen,,Dual-use,D,Onbekend,Turkije,5100.0,,,,
2004-08-01T00:00:00Z,2.3711486E7,(Tweede generatie) beeldversterkerbuizen,Demonstratie,,Dual-use,T,Onbekend,Jordanië,6600.0,,,,
2004-08-01T00:00:00Z,2.3711494E7,(Tweede generatie) beeldversterkerbuizen,Demonstratie,,Dual-use,T,Onbekend,Jordanië,2200.0,,,,
2004-01-14T00:00:00Z,2.3477165E7,Fosfortrichloride,Vervaardiging van fosfieten,,Dual-use,D,Duitsland,Taiwan,525000.0,,,,
2004-01-14T00:00:00Z,2.37292
@eevee
eevee / gist:55426e5856f5825317b1
Last active January 28, 2021 22:51
adblock rules to hide mentions from people who don't follow you

Pop open "filter preferences" in adblock plus, and add the following rules to hide mentions from people who don't follow you (and who you don't follow).

For the interactions/notifications page:

twitter.com##.interaction-page [data-follows-you="false"][data-you-follow="false"]:not(.my-tweet)

For the mentions page:

twitter.com##.mentions-page [data-follows-you="false"][data-you-follow="false"]:not(.my-tweet)
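
Both rules share the same shape: a domain, the `##` element-hiding separator, and a CSS selector. As a rough illustration only, the following Python sketch builds and splits such rules. The helper names `build_rule` and `parse_element_hiding_rule` are hypothetical and are not part of Adblock Plus.

```python
# Attribute filter shared by both rules in the gist above.
NON_MUTUAL = '[data-follows-you="false"][data-you-follow="false"]:not(.my-tweet)'


def build_rule(page_class, domain="twitter.com"):
    """Build an element-hiding rule for the given page container class."""
    return f"{domain}##.{page_class} {NON_MUTUAL}"


def parse_element_hiding_rule(rule):
    """Split 'domain##selector' into its domain list and CSS selector."""
    domains, sep, selector = rule.partition("##")
    if not sep or not selector:
        raise ValueError("not an element-hiding rule: " + rule)
    return {
        "domains": [d for d in domains.split(",") if d],
        "selector": selector,
    }
```

For example, `build_rule("mentions-page")` reproduces the mentions-page rule, and `parse_element_hiding_rule` returns its domain and selector separately.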
rm doit.out ; touch doit.out ; yes | head -200 | awk '{print "echo "NR" `lynx -dump '"'"'https://en.wikipedia.org/wiki/"NR"_(number)'"'"' | wc -l` >> doit.out"}' > ! /tmp/doit.sh ; source /tmp/doit.sh ; sort -n -k 2 < doit.out | head -5
@johnmyleswhite
johnmyleswhite / statistical_maxims.md
Created December 1, 2015 15:25
Statistical Maxims
  • Correlation is not causation (???)
  • No causation without manipulation. (Holland)
  • All models are wrong, some are useful. (Box)
  • Statistics is the science of uncertainty. (arguably Tukey)
  • Statistics is the science of learning from experience, especially experience that arrives a little bit at a time. (Efron)
@kylemcdonald
kylemcdonald / _tsne.pdf
Last active February 22, 2024 22:13
Exploring antonyms with word2vec.
@achabotl
achabotl / doi2bib
Last active June 24, 2016 15:30
doi2bib
#!/bin/bash
# Accept either a full URL or a bare DOI. The [[ ... ]] test needs bash, not plain sh.
if [[ "${1}" == "http"* ]] ; then
    doi="${1}"
else
    doi="http://dx.doi.org/${1}"
fi
# Stopped working around 2015-10-04.
# curl -sLH "Accept: text/bibliography; style=bibtex" "${doi}" | sed 's/^ *//'
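
The same lookup can be sketched in Python with DOI content negotiation. This is a sketch under assumptions, not the script's own fix: it uses the `https://doi.org/` resolver instead of `dx.doi.org` and requests the `application/x-bibtex` media type instead of the `text/bibliography; style=bibtex` header from the commented-out command. Whether a given DOI's registration agency honors that media type is not guaranteed. The function names `doi_url` and `fetch_bibtex` are hypothetical.

```python
from urllib.request import Request, urlopen


def doi_url(doi_or_url):
    """Return a resolvable URL for either a bare DOI or a full URL."""
    if doi_or_url.startswith("http"):
        return doi_or_url
    return "https://doi.org/" + doi_or_url


def fetch_bibtex(doi_or_url, timeout=10):
    """Ask the DOI resolver for a BibTeX record (needs network access)."""
    request = Request(
        doi_url(doi_or_url),
        headers={"Accept": "application/x-bibtex"},  # assumed media type
    )
    # urlopen follows the resolver's redirects, like curl -L.
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```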

version 1.0.3 #Spark Logo + Python Logo

Text Analysis and Entity Resolution

#### Entity resolution is a common, yet difficult problem in data cleaning and integration. This lab will demonstrate how we can use Apache Spark to apply powerful and scalable text analysis techniques and perform entity resolution across two datasets of commercial products.

Entity Resolution, or "[Record linkage][wiki]", is the term used by statisticians, epidemiologists, and historians, among others, to describe the process of joining records from one data source with another that describe the same entity. Other terms with the same meaning include "entity disambiguation/linking", "duplicate detection", "deduplication", "record matching", "(reference) reconciliation", "object identification", "data/information integration", and "conflation".
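
To make the idea of joining records that describe the same entity concrete, here is a minimal sketch in plain Python (not Spark). It scores pairs of product descriptions by Jaccard similarity of their word tokens and keeps the best match above a threshold. The similarity measure, the threshold, and the names `tokenize`, `jaccard`, and `match_records` are assumptions chosen for illustration. They are not necessarily the techniques the lab itself uses.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two strings."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def match_records(left, right, threshold=0.5):
    """For each left record, return its best right match at or above threshold.

    `left` and `right` map record IDs to text descriptions.
    """
    matches = []
    for left_id, left_text in left.items():
        best_id, best_score = None, 0.0
        for right_id, right_text in right.items():
            score = jaccard(left_text, right_text)
            if score > best_score:
                best_id, best_score = right_id, score
        if best_id is not None and best_score >= threshold:
            matches.append((left_id, best_id, best_score))
    return matches
```

This compares every pair of records, so it scales quadratically. A distributed engine such as Spark is useful precisely because real datasets make that comparison expensive.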

Entity Resol

##vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv##
## Download and examine deleted congress tweets ##
## Data Source: politwoops.sunlightfoundation.com ##
## Analysis: Katherine Ognyanova at www.kateto.net ##
## Visualizations: http://kateto.net/politwoops ##
##vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv##
library(RJSONIO)
library(RCurl)
library(plyr)
@dmasad
dmasad / Intro_to_ICEWS.ipynb
Last active February 4, 2017 18:12
Intro to ICEWS in Python