@zonpantli
Created September 1, 2011 19:02
ukko hadoop trace

masters: ukko049

slaves: ukko049 ukko050

Use hostname ukko049 in config

In core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://ukko049:54312</value>
  <description></description>
</property>

In mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>ukko049:54313</value>
</property>
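One easy sanity check is whether each node actually sees the same fs.default.name. As a sketch (the embedded XML mirrors the config above; in practice you would read the file from your conf directory), the value can be pulled out of core-site.xml with a few lines of Python:

```python
# Sketch: extract a property value from a Hadoop core-site.xml.
import xml.etree.ElementTree as ET

def get_property(xml_text, name):
    """Return the <value> of the named <property>, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

core_site = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ukko049:54312</value>
  </property>
</configuration>"""

print(get_property(core_site, "fs.default.name"))  # hdfs://ukko049:54312
```

Running this on every node quickly shows whether a slave is working from a stale or divergent config.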

With this config, ukko050 runs a DataNode and a TaskTracker, but the DataNode cannot connect to the NameNode:

hadoop-mepihlaj-datanode-ukko050.log

2011-09-01 20:58:05,252 INFO org.apache.hadoop.ipc.RPC: Server at ukko049/86.50.20.50:54312 not available yet, Zzzzz...
2011-09-01 20:58:07,253 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 0 time(s).
2011-09-01 20:58:08,254 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 1 time(s).
2011-09-01 20:58:09,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 2 time(s).
2011-09-01 20:58:10,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 3 time(s).
2011-09-01 20:58:11,256 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 4 time(s).
2011-09-01 20:58:12,257 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 5 time(s).
2011-09-01 20:58:13,257 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 6 time(s).
2011-09-01 20:58:14,258 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 7 time(s).
2011-09-01 20:58:15,259 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 8 time(s).
2011-09-01 20:58:16,260 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 9 time(s).

Use IP 86.50.20.50 in config

As proposed here, I tried using the IP address instead of the hostname in the config.

In core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://86.50.20.50:54312</value>
  <description></description>
</property>

In mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>86.50.20.50:54313</value>
</property>

In this case HDFS works properly and data is replicated to both nodes. A MapReduce job also runs to completion, but all the computation is done on ukko049; every task attempt on ukko050 fails:

hadoop-mepihlaj-tasktracker-ukko050.log

2011-09-01 21:54:49,557 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2011-09-01 21:54:52,561 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201109012153_0001_m_000016_0 task's state:UNASSIGNED
2011-09-01 21:54:52,562 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201109012153_0001_m_000016_0 which needs 1 slots
2011-09-01 21:54:52,562 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201109012153_0001_m_000016_0 which needs 1 slots
2011-09-01 21:54:52,562 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201109012153_0001_m_000016_0:
java.lang.IllegalArgumentException: Wrong FS: hdfs://86.50.20.50:54312/tmp/hadoop-mepihlaj/mapred/system/job_201109012153_0001/jobToken, expected: hdfs://ukko049.hpc.cs.helsinki.fi:54312
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:354)
        at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:521)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:3942)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1060)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1001)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2161)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2125)

stdout

11/09/01 21:54:52 WARN mapred.JobClient: Error reading task outputhttp://ukko050.hpc.cs.helsinki.fi:50062/tasklog?plaintext=true&attemptid=attempt_201109012153_0001_m_000014_0&filter=stdout
11/09/01 21:54:52 WARN mapred.JobClient: Error reading task outputhttp://ukko050.hpc.cs.helsinki.fi:50062/tasklog?plaintext=true&attemptid=attempt_201109012153_0001_m_000014_0&filter=stderr
11/09/01 21:54:52 INFO mapred.JobClient: Task Id : attempt_201109012153_0001_m_000015_0, Status : FAILED
Error initializing attempt_201109012153_0001_m_000015_0:
java.lang.IllegalArgumentException: Wrong FS: hdfs://86.50.20.50:54312/tmp/hadoop-mepihlaj/mapred/system/job_201109012153_0001/jobToken, expected: hdfs://ukko049.hpc.cs.helsinki.fi:54312
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:354)
        at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:521)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:3942)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1060)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1001)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2161)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2125)
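The "Wrong FS" error comes from Hadoop's FileSystem.checkPath, which (roughly) compares the URI scheme and authority of the path against the configured default filesystem as strings, with no DNS resolution, so hdfs://86.50.20.50:54312 and hdfs://ukko049.hpc.cs.helsinki.fi:54312 count as different filesystems even though they point at the same machine. A rough Python sketch of that check (not Hadoop's actual code):

```python
# Rough sketch of the string comparison behind Hadoop's "Wrong FS"
# IllegalArgumentException; not the actual Hadoop implementation.
from urllib.parse import urlparse

def check_path(path, default_fs):
    p, d = urlparse(path), urlparse(default_fs)
    # Scheme and authority (host:port) are compared textually;
    # an IP and a hostname never match, even if they resolve alike.
    if (p.scheme, p.netloc) != (d.scheme, d.netloc):
        raise ValueError(f"Wrong FS: {path}, expected: {default_fs}")

check_path("hdfs://ukko049.hpc.cs.helsinki.fi:54312/tmp/x",
           "hdfs://ukko049.hpc.cs.helsinki.fi:54312")  # passes
try:
    check_path("hdfs://86.50.20.50:54312/tmp/x",
               "hdfs://ukko049.hpc.cs.helsinki.fi:54312")
except ValueError as e:
    print(e)  # Wrong FS: ... expected: ...
```

This is why the job tokens written under the IP-based URI are rejected on ukko050, whose TaskTracker resolved the default filesystem to the canonical hostname.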

Googling also turned up that hostnames and IPs can't be mixed in the config files.

@kbkaran

kbkaran commented Sep 28, 2011

Please make sure HADOOP_HOME is set in the shell on the slave machines. This happens when the Hadoop libraries cannot locate the configuration files.

@aman-ttc

In my case this was caused by an entry in /etc/hosts on the master, e.g.

127.0.0.1 ukko049

which caused the services to bind to 127.0.0.1. I replaced it with

86.50.20.50 ukko049

to fix the issue. The other option, as stated by kbkaran, is to specify the IP explicitly in the XML files.
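When /etc/hosts maps the master's hostname to 127.0.0.1, a daemon that binds to whatever address the hostname resolves to ends up listening only on loopback and is unreachable from the other nodes. A small, self-contained Python illustration of the difference (not Hadoop code; it just binds throwaway sockets locally):

```python
# Illustration: a socket bound to 127.0.0.1 is reachable only via
# loopback, while 0.0.0.0 listens on all interfaces. This mirrors why
# a 127.0.0.1 entry for ukko049 made the NameNode invisible to ukko050.
import socket

def bound_address(host):
    """Bind a listening socket to `host` and report the bound address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))  # port 0: let the OS pick a free port
    addr = s.getsockname()[0]
    s.close()
    return addr

print(bound_address("127.0.0.1"))  # 127.0.0.1 (loopback only)
print(bound_address("0.0.0.0"))    # 0.0.0.0 (all interfaces)
```

With the corrected /etc/hosts entry, ukko049 resolves to the routable 86.50.20.50, so the NameNode and JobTracker bind to an address the slaves can reach.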
