Starting up:
[bw1425n01]s0925570: ./hshell.rb
Clearing previous output:
s0925570@hadoop $ rmr ~/data/output
Deleted hdfs://bw1425n01.inf.ed.ac.uk/user/s0925570/data/output
Running a streaming job:
s0925570@hadoop $ :stream
Input: ~/data/input
Output: ~/data/output
Mapper: ./mapper.py
Reducer: ./reducer.py
To run this streaming job again, use the shortcut:
:stream --- \n:mapper: "./mapper.py "\n:reducer: "./reducer.py "\n:input: ~/data/input\n:output: ~/data/output\n
... job runs ...
s0925570@hadoop $ lsr ~/data/output
-rw-r--r-- 3 s0925570 supergroup 52 2013-10-23 15:00 /user/s0925570/data/output/part-00000
... etc ...
Re-running the same job with the shortcut:
s0925570@hadoop $ :stream --- \n:mapper: "./mapper.py "\n:reducer: "./reducer.py "\n:input: ~/data/input\n:output: ~/data/output\n
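The shortcut argument appears to be a small YAML-style document with the newlines written out as a literal \n: a "---" header followed by one ":key: value" pair per setting. A minimal Python sketch of how such a string could be decoded, purely for illustration. The function name decode_shortcut is hypothetical and this is not necessarily how hshell.rb parses it:

```python
def decode_shortcut(arg):
    """Split an escaped ':key: value' shortcut string into a dict.

    Assumption: settings are separated by a literal backslash-n and each
    setting looks like ':name: value', as in the transcript above.
    """
    settings = {}
    for line in arg.split("\\n"):
        line = line.strip()
        if not line.startswith(":"):
            continue  # skips the '---' header and the empty tail
        key, _, value = line[1:].partition(": ")
        # Drop surrounding quotes and the trailing space inside them.
        settings[key] = value.strip().strip('"').strip()
    return settings


shortcut = (
    r'--- \n:mapper: "./mapper.py "\n:reducer: "./reducer.py "'
    r'\n:input: ~/data/input\n:output: ~/data/output\n'
)
print(decode_shortcut(shortcut))
```

The four decoded settings correspond to the Input, Output, Mapper and Reducer values entered in the first run.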
Inspecting part of the result:
s0925570@hadoop $ tail ~/data/output/part-00000
But 1
ask 1
both 1
desert 2
jack 161
makes 161
up 6
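The output above looks like per-word counts, but mapper.py and reducer.py themselves are not shown here. A minimal sketch of what such a pair might do, assuming a plain tab-separated word count. The names map_lines and reduce_pairs are hypothetical, and the sorted() call only stands in for the sort Hadoop performs between the map and reduce phases:

```python
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit 'word<TAB>1' for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_pairs(lines):
    """Reducer step: sum the counts of consecutive lines sharing a key.

    Assumes the input is already sorted by key, which is what a streaming
    reducer receives on stdin.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Local dry run on in-memory data; sorted() imitates the shuffle/sort step.
sample = ["jack makes up", "jack makes"]
for record in reduce_pairs(sorted(map_lines(sample))):
    print(record)
```

In a real streaming job each function would sit in its own script, reading sys.stdin and printing to stdout. Because the sort is byte-wise, capitalised words such as "But" come before lowercase ones, which matches the ordering in the listing above.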