- Knowledge Bases (KBs) are effective tools for Question Answering (QA) but are often too restrictive (due to fixed schema) and too sparse (due to limitations of Information Extraction (IE) systems).
- The paper proposes Key-Value Memory Networks, a neural network architecture based on Memory Networks that can leverage both KBs and raw data for QA.
- The paper also introduces MOVIEQA, a new QA dataset whose questions can be answered using a perfect KB, using Wikipedia pages, or using an imperfect KB obtained with IE techniques, thereby allowing a direct comparison between systems that use any of the three sources.
- Link to the paper.
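The core read operation of a Key-Value Memory Network addresses memories by their keys and returns a softmax-weighted sum of the corresponding values. A minimal NumPy sketch of that single read step (toy random embeddings, not the paper's trained model):

```python
import numpy as np

def kv_memory_read(query, keys, values):
    """One key-value memory read: score each key against the query,
    softmax the scores, and return the weighted sum of the values."""
    scores = keys @ query                  # similarity of query to each key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax over memory slots
    return weights @ values                # weighted sum of value vectors

# Toy example: 3 memory slots with 4-dimensional embeddings.
rng = np.random.default_rng(0)
keys = rng.normal(size=(3, 4))
values = rng.normal(size=(3, 4))
q = rng.normal(size=4)
out = kv_memory_read(q, keys, values)
```

In the paper this step is iterated over several "hops", updating the query between reads; the sketch shows only a single hop.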
```python
#!/usr/bin/env python
"""
Twitter's API doesn't allow you to get replies to a particular tweet. Strange
but true. But you can use Twitter's Search API to search for tweets that are
directed at a particular user, and then search through the results to see if
any are replies to a given tweet. You probably are also interested in the
replies to any replies as well, so the process is recursive. The big caveat
here is that the search API only returns results for the last 7 days. So
```
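The recursive filtering the docstring describes can be sketched independently of the API itself. Here `fetch_mentions` is a hypothetical stand-in for the real Search API call (which would query tweets directed at a user); the recursion logic is the point:

```python
def get_replies(tweet_id, screen_name, fetch_mentions):
    """Recursively collect replies to tweet_id.

    fetch_mentions is a hypothetical stand-in for the Search API call:
    given a screen name, it returns tweets directed at that user as
    dicts with 'id', 'user', and 'in_reply_to_status_id' keys.
    """
    replies = []
    for tweet in fetch_mentions(screen_name):
        if tweet.get("in_reply_to_status_id") == tweet_id:
            replies.append(tweet)
            # Replies to this reply: recurse on the replier's mentions.
            replies.extend(get_replies(tweet["id"], tweet["user"], fetch_mentions))
    return replies
```

With the real API the same caveat applies: only mentions from the last 7 days are searchable, so older branches of the reply tree are simply invisible.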
- Create or find a gist that you own.
- Clone your gist (replace `<hash>` with your gist's hash):

```shell
# with ssh
git clone git@gist.github.com:<hash>.git mygist
# with https
git clone https://gist.github.com/<hash>.git mygist
```
Taught by Brad Knox at the MIT Media Lab in 2014. Course website. Lecture and visiting speaker notes.
- Power to the People: The Role of Humans in Interactive Machine Learning by Knox, Cakmak, Kulesza, Amershi, and Lau
- A Few Useful Things to Know about Machine Learning by Domingos
- Machine Learning that Matters by Wagstaff
- Beyond Concise and Colorful: Learning Intelligible Rules by Pazzani et al.
- [Designing Games with a Purpose](https://www.cs.cmu.edu/~biglou/GWAP_CACM.pdf) by von Ahn and Dabbish
- Human Model Evaluation in Interactive Supervised Learning by Fiebrink et al.
```r
n <- 200
m <- 40
set.seed(1)
x <- runif(n, -1, 1)
library(rafalib)
bigpar(2, 2, mar = c(3, 3, 3, 1))
library(RColorBrewer)
cols <- brewer.pal(11, "Spectral")[as.integer(cut(x, 11))]
plot(x, rep(0, n), ylim = c(-1, 1), yaxt = "n", xlab = "", ylab = "",
     col = cols, pch = 20, main = "underlying data")
```
The MIT License (MIT)

Copyright (c) 2016 Jim Kang

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction,
```r
# Check URLs in a document
## This code will extract URLs from a text document using regex,
## then execute an HTTP HEAD request on each and report whether
## the request failed, whether a redirect occurred, etc. It might
## be useful for cleaning up linkrot.
if (!require("httr")) {
  install.packages("httr", repos = "http://cran.rstudio.com/")
}
```
This is a collection of basic "recipes", many using twurl (the Swiss Army Knife for the Twitter API!) and jq to query the Twitter API and format the results. Also, some scripts to test or automate common actions.
An idea that I proved unable to express in the number of characters on Twitter:
Train two word2vec models on the same corpus with 100 dimensions apiece; one with window size 5, and one with window size 15 (say).
Now you have 2 100-dimensional vector spaces with the same words in each.
That's the same as one 200-dimensional vector space: for each word, you just append its two vectors to each other.

That space contains all the information from both original models: a linear projection onto either block of 100 coordinates recovers the corresponding original space exactly.
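The append-then-project claim is easy to verify mechanically. A NumPy sketch with toy random vectors standing in for the two trained word2vec models:

```python
import numpy as np

# Toy stand-ins for two word2vec models over the same vocabulary:
# each maps a word to a 100-dimensional vector.
rng = np.random.default_rng(0)
vocab = ["cat", "dog", "tree"]
model_w5 = {w: rng.normal(size=100) for w in vocab}   # window size 5
model_w15 = {w: rng.normal(size=100) for w in vocab}  # window size 15

# Appending the two vectors gives one 200-dimensional space...
combined = {w: np.concatenate([model_w5[w], model_w15[w]]) for w in vocab}

# ...and a linear map (here just coordinate projection) recovers
# either original 100-dimensional space exactly.
recovered_w5 = {w: combined[w][:100] for w in vocab}
recovered_w15 = {w: combined[w][100:] for w in vocab}
```

Coordinate slicing is the simplest such linear map; any invertible linear transform of the combined space would preserve the same information.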