@alexwoolford
Created August 6, 2015 20:21
mahout recommenditembased --similarityClassname SIMILARITY_LOGLIKELIHOOD -i /etl/recommender/input/mahout_input.tsv -o /etl/recommender/output/ --numRecommendations 1
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/bin/../lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/mahout/mahout-examples-0.9-cdh5.4.4-job.jar
15/08/06 14:00:14 WARN driver.MahoutDriver: No recommenditembased.props found on classpath, will use command-line arguments only
15/08/06 14:00:14 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/etl/recommender/input/mahout_input.tsv], --maxPrefsInItemSimilarity=[500], --maxPrefsPerUser=[10], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[1], --output=[/etl/recommender/output/], --similarityClassname=[SIMILARITY_LOGLIKELIHOOD], --startPhase=[0], --tempDir=[temp]}
15/08/06 14:00:14 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/etl/recommender/input/mahout_input.tsv], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
15/08/06 14:00:14 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/08/06 14:00:14 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
15/08/06 14:00:14 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
15/08/06 14:00:14 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:00:16 INFO input.FileInputFormat: Total input paths to process : 1
15/08/06 14:00:16 INFO mapreduce.JobSubmitter: number of splits:1
15/08/06 14:00:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0246
15/08/06 14:00:16 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0246
15/08/06 14:00:16 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0246/
15/08/06 14:00:16 INFO mapreduce.Job: Running job: job_1438723303243_0246
15/08/06 14:00:26 INFO mapreduce.Job: Job job_1438723303243_0246 running in uber mode : false
15/08/06 14:00:26 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:00:32 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:00:36 INFO mapreduce.Job: map 100% reduce 50%
15/08/06 14:00:37 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:00:37 INFO mapreduce.Job: Job job_1438723303243_0246 completed successfully
15/08/06 14:00:38 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=124
FILE: Number of bytes written=574719
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=9669845
HDFS: Number of bytes written=522
HDFS: Number of read operations=15
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=1
Launched reduce tasks=4
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3770
Total time spent by all reduces in occupied slots (ms)=11315
Total time spent by all map tasks (ms)=3770
Total time spent by all reduce tasks (ms)=11315
Total vcore-seconds taken by all map tasks=3770
Total vcore-seconds taken by all reduce tasks=11315
Total megabyte-seconds taken by all map tasks=3860480
Total megabyte-seconds taken by all reduce tasks=11586560
Map-Reduce Framework
Map input records=726672
Map output records=726672
Map output bytes=1453344
Map output materialized bytes=108
Input split bytes=136
Combine input records=726672
Combine output records=11
Reduce input groups=11
Reduce shuffle bytes=108
Reduce input records=11
Reduce output records=11
Spilled Records=22
Shuffled Maps =4
Failed Shuffles=0
Merged Map outputs=4
GC time elapsed (ms)=199
CPU time spent (ms)=5280
Physical memory (bytes) snapshot=1520218112
Virtual memory (bytes) snapshot=6807212032
Total committed heap usage (bytes)=2639790080
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=9669709
File Output Format Counters
Bytes Written=522
15/08/06 14:00:38 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:00:38 INFO input.FileInputFormat: Total input paths to process : 1
15/08/06 14:00:38 INFO mapreduce.JobSubmitter: number of splits:1
15/08/06 14:00:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0247
15/08/06 14:00:38 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0247
15/08/06 14:00:38 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0247/
15/08/06 14:00:38 INFO mapreduce.Job: Running job: job_1438723303243_0247
15/08/06 14:00:43 INFO mapreduce.Job: Job job_1438723303243_0247 running in uber mode : false
15/08/06 14:00:43 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:00:50 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:00:55 INFO mapreduce.Job: map 100% reduce 25%
15/08/06 14:00:56 INFO mapreduce.Job: map 100% reduce 75%
15/08/06 14:01:00 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:01:01 INFO mapreduce.Job: Job job_1438723303243_0247 completed successfully
15/08/06 14:01:01 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=1156478
FILE: Number of bytes written=2889007
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=9669845
HDFS: Number of bytes written=208307
HDFS: Number of read operations=15
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=1
Launched reduce tasks=4
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=4302
Total time spent by all reduces in occupied slots (ms)=17666
Total time spent by all map tasks (ms)=4302
Total time spent by all reduce tasks (ms)=17666
Total vcore-seconds taken by all map tasks=4302
Total vcore-seconds taken by all reduce tasks=17666
Total megabyte-seconds taken by all map tasks=4405248
Total megabyte-seconds taken by all reduce tasks=18089984
Map-Reduce Framework
Map input records=726672
Map output records=726672
Map output bytes=5274123
Map output materialized bytes=1156462
Input split bytes=136
Combine input records=0
Combine output records=0
Reduce input groups=11434
Reduce shuffle bytes=1156462
Reduce input records=726672
Reduce output records=9230
Spilled Records=1453344
Shuffled Maps =4
Failed Shuffles=0
Merged Map outputs=4
GC time elapsed (ms)=126
CPU time spent (ms)=9190
Physical memory (bytes) snapshot=1520930816
Virtual memory (bytes) snapshot=6847045632
Total committed heap usage (bytes)=2390753280
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=9669709
File Output Format Counters
Bytes Written=208307
org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducer$Counters
USERS=9230
15/08/06 14:01:01 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:01:01 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:01:01 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:01:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0248
15/08/06 14:01:02 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0248
15/08/06 14:01:02 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0248/
15/08/06 14:01:02 INFO mapreduce.Job: Running job: job_1438723303243_0248
15/08/06 14:01:06 INFO mapreduce.Job: Job job_1438723303243_0248 running in uber mode : false
15/08/06 14:01:06 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:01:11 INFO mapreduce.Job: map 75% reduce 0%
15/08/06 14:01:12 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:01:17 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:01:18 INFO mapreduce.Job: Job job_1438723303243_0248 completed successfully
15/08/06 14:01:18 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=46720
FILE: Number of bytes written=1013793
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=208971
HDFS: Number of bytes written=56071
HDFS: Number of read operations=28
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Launched reduce tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=11860
Total time spent by all reduces in occupied slots (ms)=11596
Total time spent by all map tasks (ms)=11860
Total time spent by all reduce tasks (ms)=11596
Total vcore-seconds taken by all map tasks=11860
Total vcore-seconds taken by all reduce tasks=11596
Total megabyte-seconds taken by all map tasks=12144640
Total megabyte-seconds taken by all reduce tasks=11874304
Map-Reduce Framework
Map input records=9230
Map output records=9279
Map output bytes=157655
Map output materialized bytes=47373
Input split bytes=664
Combine input records=9279
Combine output records=13
Reduce input groups=5
Reduce shuffle bytes=47373
Reduce input records=13
Reduce output records=5
Spilled Records=26
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=219
CPU time spent (ms)=6360
Physical memory (bytes) snapshot=2923507712
Virtual memory (bytes) snapshot=10857656320
Total committed heap usage (bytes)=4481613824
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=208307
File Output Format Counters
Bytes Written=56071
15/08/06 14:01:18 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --excludeSelfSimilarity=[true], --input=[temp/preparePreferenceMatrix/ratingMatrix], --maxObservationsPerColumn=[500], --maxObservationsPerRow=[500], --maxSimilaritiesPerRow=[100], --numberOfColumns=[9230], --output=[temp/similarityMatrix], --randomSeed=[-9223372036854775808], --similarityClassname=[SIMILARITY_LOGLIKELIHOOD], --startPhase=[0], --tempDir=[temp], --threshold=[4.9E-324]}
15/08/06 14:01:18 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:01:18 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:01:18 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:01:19 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0249
15/08/06 14:01:19 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0249
15/08/06 14:01:19 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0249/
15/08/06 14:01:19 INFO mapreduce.Job: Running job: job_1438723303243_0249
15/08/06 14:01:23 INFO mapreduce.Job: Job job_1438723303243_0249 running in uber mode : false
15/08/06 14:01:23 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:01:28 INFO mapreduce.Job: map 25% reduce 0%
15/08/06 14:01:29 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:01:34 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:01:34 INFO mapreduce.Job: Job job_1438723303243_0249 completed successfully
15/08/06 14:01:34 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=38031
FILE: Number of bytes written=652174
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=56739
HDFS: Number of bytes written=92318
HDFS: Number of read operations=19
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=4
Launched reduce tasks=1
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=13084
Total time spent by all reduces in occupied slots (ms)=2414
Total time spent by all map tasks (ms)=13084
Total time spent by all reduce tasks (ms)=2414
Total vcore-seconds taken by all map tasks=13084
Total vcore-seconds taken by all reduce tasks=2414
Total megabyte-seconds taken by all map tasks=13398016
Total megabyte-seconds taken by all reduce tasks=2471936
Map-Reduce Framework
Map input records=5
Map output records=4
Map output bytes=92712
Map output materialized bytes=38168
Input split bytes=668
Combine input records=4
Combine output records=4
Reduce input groups=1
Reduce shuffle bytes=38168
Reduce input records=4
Reduce output records=0
Spilled Records=8
Shuffled Maps =4
Failed Shuffles=0
Merged Map outputs=4
GC time elapsed (ms)=209
CPU time spent (ms)=2600
Physical memory (bytes) snapshot=2213433344
Virtual memory (bytes) snapshot=6765625344
Total committed heap usage (bytes)=3253731328
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=56071
File Output Format Counters
Bytes Written=98
15/08/06 14:01:34 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:01:34 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:01:34 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:01:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0250
15/08/06 14:01:35 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0250
15/08/06 14:01:35 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0250/
15/08/06 14:01:35 INFO mapreduce.Job: Running job: job_1438723303243_0250
15/08/06 14:01:39 INFO mapreduce.Job: Job job_1438723303243_0250 running in uber mode : false
15/08/06 14:01:39 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:01:44 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:01:50 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:01:51 INFO mapreduce.Job: Job job_1438723303243_0250 completed successfully
15/08/06 14:01:51 INFO mapreduce.Job: Counters: 52
File System Counters
FILE: Number of bytes read=6339
FILE: Number of bytes written=948844
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=425619
HDFS: Number of bytes written=20492
HDFS: Number of read operations=32
HDFS: Number of large read operations=0
HDFS: Number of write operations=11
Job Counters
Launched map tasks=4
Launched reduce tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=11321
Total time spent by all reduces in occupied slots (ms)=11496
Total time spent by all map tasks (ms)=11321
Total time spent by all reduce tasks (ms)=11496
Total vcore-seconds taken by all map tasks=11321
Total vcore-seconds taken by all reduce tasks=11496
Total megabyte-seconds taken by all map tasks=11592704
Total megabyte-seconds taken by all reduce tasks=11771904
Map-Reduce Framework
Map input records=5
Map output records=726
Map output bytes=14457
Map output materialized bytes=6317
Input split bytes=668
Combine input records=726
Combine output records=724
Reduce input groups=711
Reduce shuffle bytes=6317
Reduce input records=724
Reduce output records=708
Spilled Records=1448
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=276
CPU time spent (ms)=6000
Physical memory (bytes) snapshot=2946424832
Virtual memory (bytes) snapshot=10917265408
Total committed heap usage (bytes)=4875878400
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=56071
File Output Format Counters
Bytes Written=20426
org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
NEGLECTED_OBSERVATIONS=8565
ROWS=5
USED_OBSERVATIONS=714
15/08/06 14:01:51 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:01:51 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:01:51 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:01:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0251
15/08/06 14:01:52 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0251
15/08/06 14:01:52 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0251/
15/08/06 14:01:52 INFO mapreduce.Job: Running job: job_1438723303243_0251
15/08/06 14:01:56 INFO mapreduce.Job: Job job_1438723303243_0251 running in uber mode : false
15/08/06 14:01:56 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:02:01 INFO mapreduce.Job: map 75% reduce 0%
15/08/06 14:02:06 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:02:11 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:02:12 INFO mapreduce.Job: Job job_1438723303243_0251 completed successfully
15/08/06 14:02:12 INFO mapreduce.Job: Counters: 51
File System Counters
FILE: Number of bytes read=268
FILE: Number of bytes written=932888
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=21242
HDFS: Number of bytes written=519
HDFS: Number of read operations=40
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Launched reduce tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=15601
Total time spent by all reduces in occupied slots (ms)=10888
Total time spent by all map tasks (ms)=15601
Total time spent by all reduce tasks (ms)=10888
Total vcore-seconds taken by all map tasks=15601
Total vcore-seconds taken by all reduce tasks=10888
Total megabyte-seconds taken by all map tasks=15975424
Total megabyte-seconds taken by all reduce tasks=11149312
Map-Reduce Framework
Map input records=708
Map output records=714
Map output bytes=14334
Map output materialized bytes=560
Input split bytes=552
Combine input records=714
Combine output records=13
Reduce input groups=5
Reduce shuffle bytes=560
Reduce input records=13
Reduce output records=5
Spilled Records=26
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=178
CPU time spent (ms)=5020
Physical memory (bytes) snapshot=2921918464
Virtual memory (bytes) snapshot=10884321280
Total committed heap usage (bytes)=4377280512
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=20426
File Output Format Counters
Bytes Written=519
org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
COOCCURRENCES=720
PRUNED_COOCCURRENCES=0
15/08/06 14:02:12 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:02:17 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:02:17 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:02:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0252
15/08/06 14:02:18 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0252
15/08/06 14:02:18 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0252/
15/08/06 14:02:18 INFO mapreduce.Job: Running job: job_1438723303243_0252
15/08/06 14:02:27 INFO mapreduce.Job: Job job_1438723303243_0252 running in uber mode : false
15/08/06 14:02:27 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:02:32 INFO mapreduce.Job: map 50% reduce 0%
15/08/06 14:02:33 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:02:38 INFO mapreduce.Job: map 100% reduce 75%
15/08/06 14:02:39 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:02:40 INFO mapreduce.Job: Job job_1438723303243_0252 completed successfully
15/08/06 14:02:40 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=231
FILE: Number of bytes written=922416
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1115
HDFS: Number of bytes written=555
HDFS: Number of read operations=28
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Launched reduce tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=10948
Total time spent by all reduces in occupied slots (ms)=12466
Total time spent by all map tasks (ms)=10948
Total time spent by all reduce tasks (ms)=12466
Total vcore-seconds taken by all map tasks=10948
Total vcore-seconds taken by all reduce tasks=12466
Total megabyte-seconds taken by all map tasks=11210752
Total megabyte-seconds taken by all reduce tasks=12765184
Map-Reduce Framework
Map input records=5
Map output records=9
Map output bytes=171
Map output materialized bytes=429
Input split bytes=596
Combine input records=9
Combine output records=8
Reduce input groups=5
Reduce shuffle bytes=429
Reduce input records=8
Reduce output records=5
Spilled Records=16
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=233
CPU time spent (ms)=4840
Physical memory (bytes) snapshot=2943651840
Virtual memory (bytes) snapshot=10891583488
Total committed heap usage (bytes)=4383047680
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=519
File Output Format Counters
Bytes Written=555
15/08/06 14:02:40 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:02:41 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:02:41 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:02:41 INFO mapreduce.JobSubmitter: number of splits:8
15/08/06 14:02:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0253
15/08/06 14:02:41 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0253
15/08/06 14:02:41 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0253/
15/08/06 14:02:41 INFO mapreduce.Job: Running job: job_1438723303243_0253
15/08/06 14:02:45 INFO mapreduce.Job: Job job_1438723303243_0253 running in uber mode : false
15/08/06 14:02:45 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:02:51 INFO mapreduce.Job: map 25% reduce 0%
15/08/06 14:02:52 INFO mapreduce.Job: map 88% reduce 0%
15/08/06 14:02:55 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:02:57 INFO mapreduce.Job: map 100% reduce 75%
15/08/06 14:02:58 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:02:58 INFO mapreduce.Job: Job job_1438723303243_0253 completed successfully
15/08/06 14:02:58 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=50838
FILE: Number of bytes written=1486028
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=211594
HDFS: Number of bytes written=58931
HDFS: Number of read operations=44
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=8
Launched reduce tasks=4
Data-local map tasks=8
Total time spent by all maps in occupied slots (ms)=32076
Total time spent by all reduces in occupied slots (ms)=11962
Total time spent by all map tasks (ms)=32076
Total time spent by all reduce tasks (ms)=11962
Total vcore-seconds taken by all map tasks=32076
Total vcore-seconds taken by all reduce tasks=11962
Total megabyte-seconds taken by all map tasks=32845824
Total megabyte-seconds taken by all reduce tasks=12249088
Map-Reduce Framework
Map input records=9235
Map output records=9284
Map output bytes=76943
Map output materialized bytes=51906
Input split bytes=2732
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=51906
Reduce input records=9284
Reduce output records=5
Spilled Records=18568
Shuffled Maps =32
Failed Shuffles=0
Merged Map outputs=32
GC time elapsed (ms)=600
CPU time spent (ms)=7950
Physical memory (bytes) snapshot=4911194112
Virtual memory (bytes) snapshot=16301408256
Total committed heap usage (bytes)=7654604800
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=58931
15/08/06 14:02:58 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
15/08/06 14:02:58 INFO Configuration.deprecation: mapred.map.child.java.opts is deprecated. Instead, use mapreduce.map.java.opts
15/08/06 14:02:58 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
15/08/06 14:02:58 INFO Configuration.deprecation: mapred.task.timeout is deprecated. Instead, use mapreduce.task.timeout
15/08/06 14:02:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop01.woolford.io/10.0.1.11:8032
15/08/06 14:02:59 INFO input.FileInputFormat: Total input paths to process : 4
15/08/06 14:02:59 INFO mapreduce.JobSubmitter: number of splits:4
15/08/06 14:02:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1438723303243_0254
15/08/06 14:02:59 INFO impl.YarnClientImpl: Submitted application application_1438723303243_0254
15/08/06 14:02:59 INFO mapreduce.Job: The url to track the job: http://hadoop01.woolford.io:8088/proxy/application_1438723303243_0254/
15/08/06 14:02:59 INFO mapreduce.Job: Running job: job_1438723303243_0254
15/08/06 14:03:03 INFO mapreduce.Job: Job job_1438723303243_0254 running in uber mode : false
15/08/06 14:03:03 INFO mapreduce.Job: map 0% reduce 0%
15/08/06 14:03:08 INFO mapreduce.Job: map 100% reduce 0%
15/08/06 14:03:14 INFO mapreduce.Job: map 100% reduce 75%
15/08/06 14:03:19 INFO mapreduce.Job: map 100% reduce 100%
15/08/06 14:03:20 INFO mapreduce.Job: Job job_1438723303243_0254 completed successfully
15/08/06 14:03:20 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=61142
FILE: Number of bytes written=1043641
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=61603
HDFS: Number of bytes written=19
HDFS: Number of read operations=64
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Launched reduce tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=11053
Total time spent by all reduces in occupied slots (ms)=16858
Total time spent by all map tasks (ms)=11053
Total time spent by all reduce tasks (ms)=16858
Total vcore-seconds taken by all map tasks=11053
Total vcore-seconds taken by all reduce tasks=16858
Total megabyte-seconds taken by all map tasks=11318272
Total megabyte-seconds taken by all reduce tasks=17262592
Map-Reduce Framework
Map input records=5
Map output records=9279
Map output bytes=263393
Map output materialized bytes=60519
Input split bytes=584
Combine input records=0
Combine output records=0
Reduce input groups=9230
Reduce shuffle bytes=60519
Reduce input records=9279
Reduce output records=1
Spilled Records=18558
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=236
CPU time spent (ms)=6280
Physical memory (bytes) snapshot=2920525824
Virtual memory (bytes) snapshot=10899587072
Total committed heap usage (bytes)=4580179968
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=58931
File Output Format Counters
Bytes Written=19
15/08/06 14:03:20 INFO driver.MahoutDriver: Program took 186470 ms (Minutes: 3.1078333333333332)
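
A log like this is easier to review when boiled down to the completed job IDs and the total runtime. The following is a minimal sketch, not part of the original run: the `summarize` helper and the two regular expressions are hypothetical and assume the line formats seen above ("Job job_... completed successfully" and "Program took N ms"). The sample text holds three lines copied from this log; in practice the whole log file would be read instead.

```python
import re

# Three lines copied verbatim from the log above (assumption: the real
# input would be the complete log file, read with open(...).read()).
SAMPLE_LOG = """\
15/08/06 14:00:37 INFO mapreduce.Job: Job job_1438723303243_0246 completed successfully
15/08/06 14:01:01 INFO mapreduce.Job: Job job_1438723303243_0247 completed successfully
15/08/06 14:03:20 INFO driver.MahoutDriver: Program took 186470 ms (Minutes: 3.1078333333333332)
"""

# Hypothetical patterns matching the line formats seen in this log.
JOB_DONE = re.compile(r"Job (job_\d+_\d+) completed successfully")
PROGRAM_TOOK = re.compile(r"Program took (\d+) ms")


def summarize(log_text):
    """Return (completed job ids in log order, total runtime in ms or None)."""
    jobs = JOB_DONE.findall(log_text)
    match = PROGRAM_TOOK.search(log_text)
    total_ms = int(match.group(1)) if match else None
    return jobs, total_ms


jobs, total_ms = summarize(SAMPLE_LOG)
print(jobs, total_ms)
```

Dividing the reported milliseconds by 60000 reproduces the "Minutes" figure that MahoutDriver prints on the final line.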