Skip to content

Instantly share code, notes, and snippets.

@danbri
Created August 31, 2011 17:08
Show Gist options
  • Save danbri/1184080 to your computer and use it in GitHub Desktop.
Save danbri/1184080 to your computer and use it in GitHub Desktop.
sh build-reuters.sh
Please select a number to choose the corresponding clustering algorithm
1. kmeans clustering
2. lda clustering
Enter your choice : 1
ok. You chose 1 and we'll use kmeans Clustering
Downloading Reuters-21578
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7959k 100 7959k 0 0 105k 0 0:01:15 0:01:15 --:--:-- 155k
Extracting...
Running on hadoop, using HADOOP_HOME=/Users/danbri/working/hadoop/hadoop-0.20.2
HADOOP_CONF_DIR=/Users/danbri/working/hadoop/hadoop-0.20.2/conf
MAHOUT-JOB: /Users/danbri/Documents/workspace/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
2011-08-31 18:49:08,417 WARN [org.apache.mahout.driver.MahoutDriver] - No org.apache.lucene.benchmark.utils.ExtractReuters.props found on classpath, will use command-line arguments only
Deleting all files in /Users/danbri/Documents/workspace/trunk/examples/bin/mahout-work/reuters-out-tmp
2011-08-31 19:04:11,526 INFO [org.apache.mahout.driver.MahoutDriver] - Program took 903041 ms
MAHOUT_LOCAL is set, running locally
CLASSPATH: [...snip]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/danbri/Documents/workspace/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/danbri/Documents/workspace/trunk/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/danbri/Documents/workspace/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2011-08-31 19:04:18,702 INFO [org.apache.mahout.common.AbstractJob] - Command line arguments: {--charset=UTF-8, --chunkSize=5, --endPhase=2147483647, --fileFilterClass=org.apache.mahout.text.PrefixAdditionFilter, --input=mahout-work/reuters-out, --keyPrefix=, --output=mahout-work/reuters-out-seqdir, --startPhase=0, --tempDir=temp}
Exception in thread "main" java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
at org.apache.hadoop.ipc.Client.call(Client.java:1033)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:208)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:175)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1310)
at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:210)
at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:59)
at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:110)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:85)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
rmr: cannot remove mahout-work/reuters-out-seqdir: No such file or directory.
put: File mahout-work/reuters-out-seqdir does not exist.
tellyclub:bin danbri$
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment