masters
ukko049
slaves
ukko049
ukko050
In core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://ukko049:54312</value>
<description></description>
</property>
In mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>ukko049:54313</value>
</property>
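One thing worth checking with a hostname-based config like this is that ukko049 resolves to the same externally reachable address on every node, and not to 127.0.0.1 on the master. A minimal /etc/hosts sketch that would keep the names consistent (86.50.20.50 for ukko049 comes from the logs; the address for ukko050 is a hypothetical placeholder):

```
# /etc/hosts on every node (sketch; ukko050's address is hypothetical)
86.50.20.50   ukko049.hpc.cs.helsinki.fi   ukko049
86.50.20.51   ukko050.hpc.cs.helsinki.fi   ukko050
```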
With this configuration ukko050 runs the DataNode and TaskTracker, but the DataNode can't connect to the NameNode:
hadoop-mepihlaj-datanode-ukko050.log
2011-09-01 20:58:05,252 INFO org.apache.hadoop.ipc.RPC: Server at ukko049/86.50.20.50:54312 not available yet, Zzzzz...
2011-09-01 20:58:07,253 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 0 time(s).
2011-09-01 20:58:08,254 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 1 time(s).
2011-09-01 20:58:09,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 2 time(s).
2011-09-01 20:58:10,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 3 time(s).
2011-09-01 20:58:11,256 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 4 time(s).
2011-09-01 20:58:12,257 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 5 time(s).
2011-09-01 20:58:13,257 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 6 time(s).
2011-09-01 20:58:14,258 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 7 time(s).
2011-09-01 20:58:15,259 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 8 time(s).
2011-09-01 20:58:16,260 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ukko049/86.50.20.50:54312. Already tried 9 time(s).
As proposed here, I tried using the IP address instead of the hostname in the config files.
In core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://86.50.20.50:54312</value>
<description></description>
</property>
In mapred-site.xml:
<property>
<name>mapred.job.tracker</name>
<value>86.50.20.50:54313</value>
</property>
In this case HDFS works properly and data gets replicated to both nodes. I can also run a MapReduce job, and it completes, but all the computation is done on ukko049; ukko050 fails every task:
hadoop-mepihlaj-tasktracker-ukko050.log
2011-09-01 21:54:49,557 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2011-09-01 21:54:52,561 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201109012153_0001_m_000016_0 task's state:UNASSIGNED
2011-09-01 21:54:52,562 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201109012153_0001_m_000016_0 which needs 1 slots
2011-09-01 21:54:52,562 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201109012153_0001_m_000016_0 which needs 1 slots
2011-09-01 21:54:52,562 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201109012153_0001_m_000016_0:
java.lang.IllegalArgumentException: Wrong FS: hdfs://86.50.20.50:54312/tmp/hadoop-mepihlaj/mapred/system/job_201109012153_0001/jobToken, expected: hdfs://ukko049.hpc.cs.helsinki.fi:54312
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:354)
at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:521)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:3942)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1060)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1001)
at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2161)
at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2125)
stdout
11/09/01 21:54:52 WARN mapred.JobClient: Error reading task outputhttp://ukko050.hpc.cs.helsinki.fi:50062/tasklog?plaintext=true&attemptid=attempt_201109012153_0001_m_000014_0&filter=stdout
11/09/01 21:54:52 WARN mapred.JobClient: Error reading task outputhttp://ukko050.hpc.cs.helsinki.fi:50062/tasklog?plaintext=true&attemptid=attempt_201109012153_0001_m_000014_0&filter=stderr
11/09/01 21:54:52 INFO mapred.JobClient: Task Id : attempt_201109012153_0001_m_000015_0, Status : FAILED
Error initializing attempt_201109012153_0001_m_000015_0:
java.lang.IllegalArgumentException: Wrong FS: hdfs://86.50.20.50:54312/tmp/hadoop-mepihlaj/mapred/system/job_201109012153_0001/jobToken, expected: hdfs://ukko049.hpc.cs.helsinki.fi:54312
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:354)
at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:521)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:3942)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1060)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1001)
at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2161)
at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2125)
Googling also told me that I can't mix hostnames and IPs in the config files.
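The "Wrong FS" message suggests the TaskTracker canonicalizes the NameNode address back to the fully-qualified name ukko049.hpc.cs.helsinki.fi. A sketch of a consistent config using that FQDN in both files on every node (this assumes the FQDN from the error message is what your DNS actually returns):

```xml
<!-- core-site.xml: use the same fully-qualified name everywhere -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://ukko049.hpc.cs.helsinki.fi:54312</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>ukko049.hpc.cs.helsinki.fi:54313</value>
</property>
```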
Please make sure HADOOP_HOME is set in the shell environment on the slave machine. This happens when the Hadoop libraries cannot locate the configuration files.
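In Hadoop 0.20 the usual place to set this is conf/hadoop-env.sh (or the login shell profile) on each node; a sketch, with hypothetical install paths you would adjust to your setup:

```shell
# conf/hadoop-env.sh on every node -- paths below are hypothetical examples
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/mepihlaj/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
```

Since the daemons on the slaves are started over ssh from the master, the variables need to be visible in non-interactive shells too, which is why hadoop-env.sh is safer than .bashrc.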