- query classification: to tell if a text is a review or opinion
- determine which documents or portions of documents contain review-like or opinionated material
- identify the overall sentiment expressed
- present the sentiment information the system has garnered in some reasonable summary fashion
- aggregate "votes" that may be registered on different scales
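One way to picture the last point, aggregating votes registered on different scales, is to rescale every vote onto a common [0, 1] range before averaging. The sketch below is only an illustration: the function names `normalize` and `aggregate` and the `(score, low, high)` tuple layout are assumptions made here, not something the notes specify, and a plain average is just one of many possible aggregation rules.

```python
def normalize(score, low, high):
    """Map a score from the scale [low, high] onto [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (score - low) / (high - low)


def aggregate(votes):
    """Average votes given as (score, low, high) tuples after rescaling.

    Each vote carries its own scale, e.g. (4, 1, 5) for four stars out of
    five or (1, 0, 1) for a thumbs-up.
    """
    if not votes:
        raise ValueError("no votes to aggregate")
    return sum(normalize(s, lo, hi) for s, lo, hi in votes) / len(votes)


if __name__ == "__main__":
    # Hypothetical votes: 4 of 5 stars, 7 on a 0-10 scale, one thumbs-up.
    print(aggregate([(4, 1, 5), (7, 0, 10), (1, 0, 1)]))
```

Rescaling first keeps a ten-point scale from outweighing a binary one simply because its numbers are larger.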
# launch an ec2 instance with lucid (ubuntu 10.04) e.g. ami-ad36fbc4
# ssh to the machine
################################################################
# install java
# https://ccp.cloudera.com/display/CDHDOC/Java+Development+Kit+Installation
# RELEASE=lucid, which you can find by running lsb_release -c.
################################################################
$ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
dumbo start demo_dumbo.py -hadoop /usr/lib/hadoop -input shares -output video_demos -outputformat text -files hdfs://ec2-xxx-xx-xx-xx.compute-1.amazonaws.com:8020/user/ubuntu/users/part-m-00000
### piece of code in demo_dumbo.py
for line in file('part-m-00000'):
    print line
# ----------------
dumbo start demo_dumbo.py -hadoop /usr/lib/hadoop -input shares -output video_demos -outputformat text -files hdfs://ec2-xxx-xx-xx-xx.compute-1.amazonaws.com:8020/user/ubuntu/users
http://mrtopf.de/blog/en/a-small-introduction-to-python-eggs/
sudo apt-get install python-setuptools
python setup.py bdist_egg
sudo apt-get install gfortran libblas-dev liblapack-dev
cd scipy-0.10.0
python setupegg.py bdist_egg
# wait about 5 minutes on my machine
ssh -o "StrictHostKeyChecking no" user@host
npm install
nodeunit tests/recommender_tests.coffee
http://www.in-ulm.de/~mascheck/various/argmax/
import numpy
import math
# LSH signature generation using random projection
def get_signature(user_vector, rand_proj):
    res = 0
    for p in (rand_proj):
        res = res << 1
        val = numpy.dot(p, user_vector)
        if val >= 0:
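
The snippet above stops partway through the loop. As a hedged illustration of the random-projection idea it names, here is a self-contained sketch in which each random hyperplane contributes one bit, set when the vector falls on its non-negative side. The bit-setting step (`sig |= 1`), the helper names `make_projections`, `lsh_signature` and `hamming`, and the Gaussian sampling of the hyperplanes are assumptions made for this sketch, not the original author's code. A pure-Python dot product stands in for `numpy.dot` so the example has no dependencies.

```python
import random


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def make_projections(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, projections):
    """Pack one bit per hyperplane into an integer signature.

    A bit is 1 when the vector lies on the non-negative side of the
    hyperplane (assumed completion of the truncated snippet).
    """
    sig = 0
    for p in projections:
        sig <<= 1
        if dot(p, vector) >= 0:
            sig |= 1
    return sig


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    planes = make_projections(n_bits=16, dim=3)
    u = [1.0, 2.0, 3.0]
    v = [1.1, 2.0, 2.9]   # hypothetical vector close to u
    w = [-3.0, 0.5, -2.0]  # hypothetical vector far from u
    print(hamming(lsh_signature(u, planes), lsh_signature(v, planes)))
    print(hamming(lsh_signature(u, planes), lsh_signature(w, planes)))
```

Vectors separated by a small angle tend to agree on most bits, so the Hamming distance between signatures can serve as a cheap proxy for angular distance.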