Hadoop 2.4.0 release (helpful links)

Kudos to the Hadoop community: the Hadoop 2.4.0 release is now available for everyone to consume. A short, non-exhaustive list of improvements in HDFS and YARN, along with the overall framework, is below:

Hadoop 2.4.0 Highlights:

  • HDFS:
    • Full HTTPS support
    • ACL support in HDFS, which allows easier access to Apache Sentry-managed data by the components that use it (see the example after this list)
    • Native support for rolling upgrades in HDFS
    • HDFS FSImage using protocol-buffers for smoother operational upgrades
  • YARN:
    • ResourceManager HA Automatic Failover
    • YARN Timeline Server (preview) for storing and serving generic application history
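
As a quick illustration of the new ACL support, files and directories can carry extended ACL entries once dfs.namenode.acls.enabled is set to true in hdfs-site.xml. A small sketch (the path and user name below are made up for the example):

$ hdfs dfs -setfacl -m user:alice:rw- /data/report.csv   # grant an extra user read/write access
$ hdfs dfs -getfacl /data/report.csv                     # inspect the ACL entries on the file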

Hadoop 2.4.0 Release Notes:

http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/releasenotes.html

Hadoop 2.4.0 Source download:

http://apache.mirrors.tds.net/hadoop/common/hadoop-2.4.0/hadoop-2.4.0-src.tar.gz

Hadoop 2.4.0 Binary download:

http://apache.mirrors.tds.net/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz


Setting up Pivotal Hadoop (PivotalHD 1.1 Community Edition) Cluster in CentOS 6.5

Download Pivotal HD Package

http://bitcast-a.v1.o1.sjc1.bitgravity.com/greenplum/pivotal-sw/pivotalhd_community_1.1.tar.gz

The package consists of three tarballs:

  • PHD-1.1.0.0-76.tar.gz
  • PCC-2.1.0-460.x86_64.tar.gz
  • PHDTools-1.1.0.0-97.tar.gz

Untar the package above and start with PCC (Pivotal Command Center).

Install Pivotal Command Center:

$ tar -zxvf PCC-2.1.0-460.x86_64.tar.gz
$ PHDCE1.1/PCC-2.1.0-460/install
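
Once the installer finishes, you can confirm that the Command Center service came up before continuing (the same status check is shown again at the end of this walkthrough):

$ sudo service commander status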

Log in using the newly created user gpadmin:
$ su - gpadmin
$ sudo cp /root/.bashrc .
$ sudo cp /root/.bash_profile .
$ sudo cp /root/.bash_logout .
$ sudo cp /root/.cshrc .
$ sudo cp /root/.tcshrc .
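
Because these files are copied with sudo they may end up owned by root; if so, a quick fix (not part of the original Pivotal steps) is to hand ownership back to gpadmin:

$ sudo chown gpadmin:gpadmin ~/.bashrc ~/.bash_profile ~/.bash_logout ~/.cshrc ~/.tcshrc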

Logout and re-login:
$ exit
$ su - gpadmin

Make sure you have an alias set for your host in /etc/hosts:
$ sudo vi /etc/hosts
xx.xx.xx.xx pivotal-master.hadoopbox.com  pivotal-master
$ sudo service network restart
$ ping pivotal-master
$ ping pivotal-master.hadoopbox.com
Now we will use the Pivotal HD package, so let's untar it into the PHD-1.1.0.0-76 folder and then import it:
$ tar -zxvf PHD-1.1.0.0-76.tar.gz
$ icm_client import -s PHD-1.1.0.0-76/

Get cluster specific configuration:
$ icm_client fetch-template -o ~/ClusterConfigDir

Edit cluster configuration based on your domain details:
$ vi ~/ClusterConfigDir/clusterConfig.xml
Replace all occurrences of host.yourdomain.com with your own host name; for some reason a name containing a dot (.) is not accepted.
Also select the services you want to install. At a minimum you need the three base services HDFS, YARN, and ZooKeeper in PivotalHD:

<services>hdfs,yarn,zookeeper</services> <!-- hbase,hive,hawq,gpxf,pig,mahout</services> -->

Create password-less SSH configuration:

$ ssh-keygen -t rsa
$ cd .ssh
$ cat id_rsa.pub >> authorized_keys
$ cat authorized_keys
$ chmod 700 $HOME && chmod 700 ~/.ssh && chmod 600 ~/.ssh/*
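
Before deploying, it is worth verifying that password-less SSH actually works with the hostname configured earlier (a quick check; the FQDN below assumes the /etc/hosts entry from above):

$ ssh pivotal-master.hadoopbox.com hostname   # should print the host name without prompting for a password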

[gpadmin@pivotal-master ~]$ icm_client deploy -c ClusterConfigDir
Please enter the root password for the cluster nodes:
PCC creates a gpadmin user on the newly added cluster nodes (if any). Please enter a non-empty password to be used for the gpadmin user:
Verifying input
Starting install
Running scan hosts
[RESULT] The following hosts do not meet PHD prerequisites: [ pivotal-master.hadoopbox.com ] Details…

Host: pivotal-master.hadoopbox.com
Status: [FAILED]
[ERROR] Please verify supported OS type and version. Supported OS: RHEL6.1, RHEL6.2, RHEL6.3, RHEL6.4, CentOS6.1, CentOS6.2, CentOS6.3, CentOS6.4
[OK] SELinux is disabled
[OK] sshpass installed
[OK] gpadmin user exists
[OK] gpadmin user has sudo privilege
[OK] .ssh directory and authorized_keys have proper permission
[OK] Puppet version 2.7.20 installed
[OK] Ruby version 1.9.3 installed
[OK] Facter rpm version 1.6.17 installed
[OK] Admin node is reachable from host using FQDN and admin hostname.
[OK] umask is set to 0002.
[OK] nc and postgresql-devel packages are installed or available in the yum repo
[OK] iptables: Firewall is not running.
[OK] Time difference between clocks within acceptable threshold
[OK] Host FQDN is configured correctly
[OK] Host has proper java version.
ERROR: Fetching status of the cluster failed
HTTP Error 500: Server Error
Cluster ID: 4

Because I have CentOS 6.5, let's edit the /etc/centos-release file so the Pivotal installer sees CentOS 6.4.
[gpadmin@pivotal-master ~]$ cat /etc/centos-release
CentOS release 6.5 (Final)
[gpadmin@pivotal-master ~]$ sudo mv /etc/centos-release /etc/centos-release-orig
[gpadmin@pivotal-master ~]$ sudo cp /etc/centos-release-orig /etc/centos-release
[gpadmin@pivotal-master ~]$ sudo vi /etc/centos-release

CentOS release 6.4 (Final)   <-- edited so the installer thinks this is CentOS 6.4 even though the machine actually runs CentOS 6.5
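
The same edit can also be made in a single command, if you prefer (a sketch; adjust the string to whatever release the installer supports):

$ sudo sh -c 'echo "CentOS release 6.4 (Final)" > /etc/centos-release'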

[gpadmin@pivotal-master ~]$ icm_client deploy -c ClusterConfigDir
Please enter the root password for the cluster nodes:
PCC creates a gpadmin user on the newly added cluster nodes (if any). Please enter a non-empty password to be used for the gpadmin user:
Verifying input
Starting install
[====================================================================================================] 100%
Results:
pivotal-master… [Success]
Details at /var/log/gphd/gphdmgr/
Cluster ID: 5

$ cat /var/log/gphd/gphdmgr/GPHDClusterInstaller_1392419546.log
Updating Option : TimeOut
Current Value   : 60
TimeOut="180"
pivotal-master : Push Succeeded
pivotal-master : Push Succeeded
pivotal-master : Push Succeeded
pivotal-master : Push Succeeded
pivotal-master : Push Succeeded
pivotal-master : Push Succeeded
[INFO] Deployment ID: 1392419546
[INFO] Private key path : /var/lib/puppet/ssl-icm/private_keys/ssl-icm-1392419546.pem
[INFO] Signed cert path : /var/lib/puppet/ssl-icm/ca/signed/ssl-icm-1392419546.pem
[INFO] CA cert path : /var/lib/puppet/ssl-icm/certs/ca.pem
hostlist: pivotal-master
running: massh /tmp/tmp.jaDiwkIFMH bombed uname -n
sync cmd sudo python ~gpadmin/GPHDNodeInstaller.py --server=pivotal-master.hadoopbox.com --certname=ssl-icm-1392419546 --logfile=/tmp/GPHDNodeInstaller_1392419546.log --sync --username=gpadmin
[INFO] Deploying batch with hosts ['pivotal-master']
writing host list to file /tmp/tmp.43okqQH7Ji
[INFO] All hosts succeeded.

$ icm_client list
Fetching installed clusters
Installed Clusters:
Cluster ID: 5     Cluster Name: pivotal-master     PHD Version: 2.0     Status: installed

$ icm_client start -l pivotal-master
Starting services
Starting cluster
[====================================================================================================] 100%
Results:
pivotal-master… [Success]
Details at /var/log/gphd/gphdmgr/

Check HDFS:
$ hdfs dfs -ls /
Found 4 items
drwxr-xr-x   – mapred hadoop          0 2014-02-14 15:19 /mapred
drwxrwxrwx   – hdfs   hadoop          0 2014-02-14 15:19 /tmp
drwxrwxrwx   – hdfs   hadoop          0 2014-02-14 15:20 /user
drwxr-xr-x   – hdfs   hadoop          0 2014-02-14 15:20 /yarn

Now open a browser at https://your_domain_name:5443/ and log in:
Username/Password: gpadmin/gpadmin

 

Pivotal Command Center Service Status:
$ service commander status
commander (pid  2238) is running…

Hadoop HDFS Error: xxxx could only be replicated to 0 nodes, instead of 1

Sometimes when using Hadoop, either accessing HDFS directly or running a MapReduce job that accesses HDFS, you may get an error such as: XXXX could only be replicated to 0 nodes, instead of 1

Example (1): Copying a file from local file system to HDFS
$myhadoop$ ./currenthadoop/bin/hadoop fs -copyFromLocal ./b.txt /
14/02/03 11:59:48 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /b.txt could only be replicated to 0 nodes, instead of 1
Example (2): Running MapReduce Job:
$myhadoop$ ./currenthadoop/bin/hadoop jar hadoop-examples-1.2.1.jar pi 10 1
 Number of Maps  = 10
 Samples per Map = 1
 14/02/03 12:02:11 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/henryo/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
The root cause of the above problem is that no DataNode is available, i.e. the DataNode process is not running at all.
You can verify this by running the jps command as below and making sure all the key processes are running for your HDFS/MR1 or HDFS/MR2 (YARN) setup.
Hadoop processes for HDFS/MR1:

$ jps
69269 TaskTracker
69092 DataNode
68993 NameNode
69171 JobTracker

Hadoop processes for HDFS/MR2 (YARN):

$ jps
43624 DataNode
44005 ResourceManager
43529 NameNode
43890 SecondaryNameNode
44105 NodeManager

If you look at the DataNode logs you will likely see the reason why the DataNode could not start, for example:

2014-02-03 17:50:37,334 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-datanode.properties, hadoop-metrics2.properties
2014-02-03 17:50:37,947 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /private/tmp/hdfs/datanode: namenode namespaceID = 1867802097; datanode namespaceID = 1895712546
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
Based on the above, the problem is with the folder where the HDFS DataNode stores its data (/tmp/hdfs/datanode): either the folder does not exist, its contents are unreadable, the folder is inaccessible or locked, or (as the namespaceIDs message above shows) it holds data from an older namespace.
Solution:
To solve this problem, check the configuration and accessibility of your HDFS DataNode data directory, and once it is properly configured start the DataNode/NameNode again.
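
For the specific namespaceID mismatch shown in the log above (it typically appears after the NameNode has been re-formatted while the DataNode kept its old storage), a common fix on a test cluster is to stop HDFS, clear the DataNode storage directory, and restart so the DataNode re-registers with the new namespace. This is only a sketch for a Hadoop 1.x layout like the one used in the examples above, and it deletes any block data stored locally; the path comes from the log message and will differ on your system:

$ ./currenthadoop/bin/stop-dfs.sh     # stop NameNode/DataNode
$ rm -rf /tmp/hdfs/datanode/*         # clear the stale DataNode storage (destroys local block data!)
$ ./currenthadoop/bin/start-dfs.sh    # restart; the DataNode picks up the current namespaceID
$ jps                                 # DataNode should now be listed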

Troubleshooting YARN NodeManager – Unable to start NodeManager because mapreduce.shuffle value is invalid

With Hadoop 2.2.x you might find that the NodeManager is not running, and starting the YARN NodeManager fails with the following error message:

2014-01-31 17:13:00,500 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:98)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:218)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:188)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:338)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:386)

 

If you check yarn-site.xml (in etc/hadoop/) you will see the following setting by default:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>

Solution:

To solve this problem you just need to change mapreduce.shuffle to mapreduce_shuffle as shown below:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

Note: With Hadoop 0.23.10 the value mapreduce.shuffle is still correct and works fine, so this change applies to Hadoop 2.2.x.
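
After correcting yarn-site.xml, restart the NodeManager so the new value is picked up. A minimal sketch from the Hadoop 2.2.x home directory (standard sbin/ layout assumed):

$ sbin/yarn-daemon.sh stop nodemanager
$ sbin/yarn-daemon.sh start nodemanager
$ jps    # NodeManager should now stay up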

 

 

YARN Job Problem: Application application_** failed 1 times due to AM Container for XX exited with exitCode: 127

Running the sample Pi job in YARN (Hadoop 0.23.x or 2.2.x) might fail with the following error message:

[Hadoop_Home] $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.10.jar pi -Dmapreduce.clientfactory.class.name=org.apache.hadoop.mapred.YarnClientFactory -libjars share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-0.23.10.jar 16 10000

Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
14/01/31 14:58:10 INFO input.FileInputFormat: Total input paths to process : 16
14/01/31 14:58:10 INFO mapreduce.JobSubmitter: number of splits:16
14/01/31 14:58:10 INFO mapred.ResourceMgrDelegate: Submitted application application_1391206707058_0002 to ResourceManager at /0.0.0.0:8032
14/01/31 14:58:10 INFO mapreduce.Job: The url to track the job: http://Avkashs-MacBook-Pro.local:8088/proxy/application_1391206707058_0002/
14/01/31 14:58:10 INFO mapreduce.Job: Running job: job_1391206707058_0002
14/01/31 14:58:12 INFO mapreduce.Job: Job job_1391206707058_0002 running in uber mode : false
14/01/31 14:58:12 INFO mapreduce.Job: map 0% reduce 0%
14/01/31 14:58:12 INFO mapreduce.Job: Job job_1391206707058_0002 failed with state FAILED due to: Application application_1391206707058_0002 failed 1 times due to AM Container for appattempt_1391206707058_0002_000001 exited with exitCode: 127 due to:
.Failing this attempt.. Failing the application.
14/01/31 14:58:12 INFO mapreduce.Job: Counters: 0
Job Finished in 2.676 seconds
java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/user/avkashchauhan/QuasiMonteCarlo_1391209089737_1265113759/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:738)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1685)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1709)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Root cause:

The problem is that YARN uses a different path for the Java executable than the one on your OS. To troubleshoot, look at the local logs for the failed task, which include stderr and stdout; open "stderr" to see the failure and you will find the following message:

/bin/bash: /bin/java: No such file or directory

The hardcoded path used to look for Java is /bin/java, so if your Java executable is not at /bin/java the YARN job will fail. On OS X, for example, I have Java 1.7 at /usr/bin/java:

$ java -version

java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
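
The actual location of the java binary varies from system to system; a quick way to check it before creating any link (not shown in the original post) is:

$ which java
/usr/bin/java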

Solution:

To solve this problem on OS X I created a symlink from /bin/java to /usr/bin/java:

$ sudo ln -s /usr/bin/java /bin/java
Password: *****

Let's retry the Pi sample:

$bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.10.jar pi -Dmapreduce.clientfactory.class.name=org.apache.hadoop.mapred.YarnClientFactory -libjars share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-0.23.10.jar 16 10000

Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
14/01/31 15:09:55 INFO input.FileInputFormat: Total input paths to process : 16
14/01/31 15:09:55 INFO mapreduce.JobSubmitter: number of splits:16
14/01/31 15:09:56 INFO mapred.ResourceMgrDelegate: Submitted application application_1391206707058_0003 to ResourceManager at /0.0.0.0:8032
14/01/31 15:09:56 INFO mapreduce.Job: The url to track the job: http://Avkashs-MacBook-Pro.local:8088/proxy/application_1391206707058_0003/
14/01/31 15:09:56 INFO mapreduce.Job: Running job: job_1391206707058_0003
14/01/31 15:10:01 INFO mapreduce.Job: Job job_1391206707058_0003 running in uber mode : false
14/01/31 15:10:01 INFO mapreduce.Job: map 0% reduce 0%
14/01/31 15:10:07 INFO mapreduce.Job: map 37% reduce 0%
14/01/31 15:10:12 INFO mapreduce.Job: map 50% reduce 0%
14/01/31 15:10:13 INFO mapreduce.Job: map 75% reduce 0%
14/01/31 15:10:18 INFO mapreduce.Job: map 100% reduce 0%
14/01/31 15:10:18 INFO mapreduce.Job: map 100% reduce 100%
14/01/31 15:10:18 INFO mapreduce.Job: Job job_1391206707058_0003 completed successfully
14/01/31 15:10:18 INFO mapreduce.Job: Counters: 43
File System Counters
FILE: Number of bytes read=358
FILE: Number of bytes written=1088273
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=4358
HDFS: Number of bytes written=215
HDFS: Number of read operations=67
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=16
Launched reduce tasks=1
Rack-local map tasks=16
Total time spent by all maps in occupied slots (ms)=61842
Total time spent by all reduces in occupied slots (ms)=4465
Map-Reduce Framework
Map input records=16
Map output records=32
Map output bytes=288
Map output materialized bytes=448
Input split bytes=2470
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=448
Reduce input records=32
Reduce output records=0
Spilled Records=64
Shuffled Maps =16
Failed Shuffles=0
Merged Map outputs=16
GC time elapsed (ms)=290
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=3422552064
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1888
File Output Format Counters
Bytes Written=97
Job Finished in 23.024 seconds
14/01/31 15:10:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Estimated value of Pi is 3.14127500000000000000

Troubleshooting Hadoop Problems – FSNamesystem: FSNamesystem initialization failed.

Sometimes when the NameNode starts you do not see the NameNode process running, so take a look at the logs to understand what is going on.

This is the message you get when you start namenode:

[exec] Starting namenodes on [localhost]
[exec] localhost: starting namenode, logging to */hadoop-0.23.10/logs/hadoop-avkashchauhan-namenode-Avkashs-MacBook-Pro.local.out

Now you can view the log by opening */hadoop-0.23.10/logs/hadoop-avkashchauhan-namenode-Avkashs-MacBook-Pro.local.log (make sure to use the .log extension, not .out).

In the log you will see the error as below:

STARTUP_MSG: java = 1.7.0_45
************************************************************/
2014-01-31 11:36:16,789 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-01-31 11:36:16,848 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-01-31 11:36:16,848 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2014-01-31 11:36:17,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Missing directory /tmp/hdfs23/namenode
at org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker.addDirsToCheck(NameNodeResourceChecker.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker.<init>(NameNodeResourceChecker.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:348)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:332)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:303)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:346)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:472)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:464)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:765)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:814)
2014-01-31 11:36:17,039 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system…
2014-01-31 11:36:17,039 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2014-01-31 11:36:17,040 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2014-01-31 11:36:17,040 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Missing directory /tmp/hdfs23/namenode
at org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker.addDirsToCheck(NameNodeResourceChecker.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker.<init>(NameNodeResourceChecker.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:348)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:332)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:303)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:346)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:472)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:464)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:765)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:814)
2014-01-31 11:36:17,041 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-01-31 11:36:17,041 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at avkashs-macbook-pro.local/10.0.0.17
************************************************************/

Step-by-step guide

  • Go to your hadoop_install/bin folder
  • Launch command ./hadoop namenode -format
  • Make sure you see the following message showing the format was done correctly:

14/01/31 11:41:48 INFO namenode.NNStorage: Storage directory /tmp/hdfs23/namenode has been successfully formatted.
14/01/31 11:41:48 INFO namenode.FSImage: Saving image file /tmp/hdfs23/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
14/01/31 11:41:48 INFO namenode.FSImage: Image file of size 128 saved in 0 seconds.
14/01/31 11:41:48 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/01/31 11:41:48 INFO util.ExitUtil: Exiting with status 0
14/01/31 11:41:48 INFO namenode.NameNode: SHUTDOWN_MSG:

  • Now restart the namenode again (a restart sketch follows).
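
A minimal restart sketch, assuming the same Hadoop 0.23.10 install used above (the start scripts live under sbin/):

$ sbin/start-dfs.sh
$ jps    # NameNode should now appear in the process list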

Spark Summit 2013 – Mon Dec 2, 2013 Keynotes

The State of Spark, and Where We’re Going Next

Matei Zaharia (CTO, Databricks; Assistant Professor, MIT)


Community Contributions for Spark

  • YARN support (Yahoo!)
  • Columnar compression in Shark (Yahoo!)
  • Fair scheduling (Intel)
  • Metrics reporting (Intel, Quantifind)
  • New RDD operators (Bizo, ClearStory)
  • Scala 2.10 support (Imaginea)
Downloads: pptx slides, pdf slides

Turning Data into Value
Ion Stoica (CEO, Databricks; CTO, Conviva; Co-Director, UC Berkeley AMPLab)


  • Everyone collects but few extract value from data
  • Unification of comp. and prog. models key to
    • » Efficiently analyze data
    • » Make sophisticated, real-time decisions
  • Spark is unique in unifying
    • » batch, interactive, streaming computation models
    • » data-parallel and graph-parallel prog. models
Downloads: pptx slides, pdf slides

Big Data Research in the AMPLab
Mike Franklin (Director, UC Berkeley AMPLab)


  • GraphX: Unifying Graph Parallel & Data Parallel Analytics
  • OLTP and Serving Workloads
  • MDCC: Multi Data Center Consistency
  • HAT: Highly-Available Transactions
  • PBS: Probabilistically Bounded Staleness
  • PLANET: Predictive Latency-Aware Networked Transactions
  • Fast Matrix Manipulation Libraries
  • Cold Storage, Partitioning, Distributed Caching
  • Machine Learning Pipelines, GPUs,
Downloads: pptx slides, pdf slides

Hadoop and Spark Join Forces in Yahoo
Andy Feng (Distinguished Architect, Cloud Services, Yahoo)


YAHOO AT SCALE:

  • 150 PB of data on Yahoo Hadoop clusters
    • Yahoo data scientists need the data for
      • Model building
      • BI analytics
    • Such datasets should be accessed efficiently
      • avoid latency caused by data movement
  • 35,000 servers in Hadoop cluster
    • Science projects need to leverage all these servers for computation

SOLUTION: HADOOP + SPARK

  • science … Spark API & MLlib ease development of ML algorithms
  • speed … Spark reduces latency of model training via in-memory RDD etc
  • scale … YARN brings Hadoop datasets & servers at scientists’ fingertips
Downloads: pdf slides (large file)

Integration of Spark/Shark into the Yahoo! Data and Analytics Platform

Tim Tully (Distinguished Engineer/Architect, Yahoo)

  • Legacy / Current Hadoop Architecture
  • Reflection / Pain Points
  • Why the movement towards Spark / Shark
  • New Hybrid Environment
  • Future Spark/Shark/Hadoop Stack


 

Downloads: pptx slides, pdf slides

 

Spark in the Hadoop Ecosystem
Eric Baldeschwieler (@jeric14)


Data scientists & Developers need an open standard for sharing their Algorithms & functions, an “R” for big data. 

• Spark best current candidate:
•  Open Source – Apache Foundation
•  Expressive (MR, iteration, Graphs, SQL, streaming)
•  Easily extended & embedded (DSLs, Java, Python…)

Spark “on the radar”
•  2008 – Yahoo! Hadoop team collaboration w Berkeley Amp/Rad lab begins
•  2009 – Spark example built for Nexus -> Mesos
•  2011 – "Spark is 2 years ahead of anything at Google" – Conviva seeing good results w/ Spark
•  2012 – Yahoo! working with Spark / Shark
•  Today – Many success stories – Early commercial support

Downloads: ppt slides, pdf slides

Keywords: Apache Spark, Hadoop, YARN, Big Data, Mesos, Databricks, Conviva