Hadoop HDFS Error: xxxx could only be replicated to 0 nodes, instead of 1

Sometimes when using Hadoop, either accessing HDFS directly or running a MapReduce job that writes to HDFS, you may get an error of the form: XXXX could only be replicated to 0 nodes, instead of 1

Example (1): Copying a file from the local file system to HDFS:
$myhadoop$ ./currenthadoop/bin/hadoop fs -copyFromLocal ./b.txt /
14/02/03 11:59:48 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /b.txt could only be replicated to 0 nodes, instead of 1
Example (2): Running a MapReduce job:
$myhadoop$ ./currenthadoop/bin/hadoop jar hadoop-examples-1.2.1.jar pi 10 1
 Number of Maps  = 10
 Samples per Map = 1
 14/02/03 12:02:11 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/henryo/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
The root cause of the above problem is that no DataNode is available, i.e. the DataNode process is not running at all.
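You can see this from the NameNode's side with the dfsadmin report (the ./currenthadoop path follows the examples above; adjust it to your install). With no DataNode registered, the report shows zero available nodes (output trimmed):

$ ./currenthadoop/bin/hadoop dfsadmin -report
Datanodes available: 0 (0 total, 0 dead)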
You can also verify it by running the jps command, as below, to make sure all the key processes are running for your HDFS/MR1 or HDFS/MR2 (YARN) version.
Hadoop processes for HDFS/MR1:

$ jps
69269 TaskTracker
69092 DataNode
68993 NameNode
69171 JobTracker

Hadoop processes for HDFS/MR2 (YARN):

$ jps
43624 DataNode
44005 ResourceManager
43529 NameNode
43890 SecondaryNameNode
44105 NodeManager

If you look at the DataNode log you will usually find the reason why the DataNode could not start.
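The DataNode log lives under the Hadoop logs directory; the exact file name depends on the user and hostname running the daemon (the path below is an assumption based on a default tarball install matching the ./currenthadoop layout used above):

$ tail -n 100 ./currenthadoop/logs/hadoop-*-datanode-*.log

In this case the log shows: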

2014-02-03 17:50:37,334 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-datanode.properties, hadoop-metrics2.properties
2014-02-03 17:50:37,947 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /private/tmp/hdfs/datanode: namenode namespaceID = 1867802097; datanode namespaceID = 1895712546
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
Based on the above, the problem is in the directory where the HDFS DataNode stores its data (/tmp/hdfs/datanode): its namespaceID no longer matches the NameNode's, which typically happens when the NameNode is reformatted (hadoop namenode -format) while the DataNode keeps its old data. Other DataNode startup failures point at the same directory for different reasons: the folder does not exist, the contents are unreadable, or the folder is inaccessible or locked.
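You can compare the two namespaceIDs directly: each storage directory keeps a VERSION file under current/. The paths below assume dfs.name.dir and dfs.data.dir point at /tmp/hdfs/namenode and /tmp/hdfs/datanode as in this setup; check your hdfs-site.xml for the actual locations:

$ grep namespaceID /tmp/hdfs/namenode/current/VERSION
namespaceID=1867802097
$ grep namespaceID /tmp/hdfs/datanode/current/VERSION
namespaceID=1895712546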
Solution:
To solve this problem, check that the DataNode storage directory exists, is readable and writable by the Hadoop user, and is not locked by a stale process. For the namespaceID mismatch shown above, the stale DataNode data must be removed or its namespaceID brought back in line with the NameNode's. Once the directory is properly configured, start the DataNode/NameNode again.
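For the namespaceID mismatch, here is a minimal sketch of the wipe-and-restart fix on a single-node test setup, assuming the ./currenthadoop layout and the /tmp/hdfs/datanode storage directory from above (this deletes all blocks stored on the DataNode, so never do it on a cluster holding data you care about):

$ ./currenthadoop/bin/stop-all.sh
$ rm -rf /tmp/hdfs/datanode/*
$ ./currenthadoop/bin/start-all.sh
$ jps                       # DataNode should now appear in the list
$ ./currenthadoop/bin/hadoop fs -copyFromLocal ./b.txt /

Alternatively, you can keep the existing block data and edit the namespaceID in /tmp/hdfs/datanode/current/VERSION to match the NameNode's value, then restart the DataNode.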