Hadoop Adventures with Microsoft HDInsight

 

What is HDInsight?

  • HDInsight is the product name for Microsoft's distribution of Hadoop and for its Hadoop on Azure service. It is Microsoft’s 100% Apache-compatible Hadoop distribution, supported by Microsoft. Available both on Windows Server and as a Windows Azure service, HDInsight lets organizations gain new insights from previously untouched unstructured data while connecting to the most widely used Business Intelligence (BI) tools on the planet.
  • http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx

It is available in two modes:

  • HDInsight as a cloud service: the cloud version running on Windows Azure
  • HDInsight as a local cluster: a downloadable version that runs locally on Windows Server and Desktop

In this article we will see how to use HDInsight on a local machine.

 

Where to get it?

What does the Windows installer bring to your machine?

clip_image001

After the installation completes, you will see that the following applications are installed:

  1. Microsoft HDInsight Community Technology Preview Version 1.0.0.0
  2. Hortonworks Data Platform 1.0.1 Developer Preview Version 1.0.1
  3. Python 2.7.3150, unless you change the installed components
  4. The Java and C++ runtimes, if the machine needs them

clip_image002

 

By default, Hadoop is installed at C:\Hadoop as below:

clip_image003

Once the installer has finished, you will see the following shortcuts set up on your machine:

clip_image004

Here is the list of shortcuts:

  1. Hadoop Command Line
  2. Microsoft HDInsight Dashboard
  3. Hadoop MapReduce Status
  4. Hadoop Name Node Status

 

If you launch the “Hadoop Command Line” shortcut, you will see the list of commands as below:

clip_image005

  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  historyserver        run job history servers as a standalone daemon
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl>  copy file or directories recursively
  archive -archiveName NAME <src>* <dest>  create a hadoop archive
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.
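To make the fs entry from the listing a bit more concrete, here is a minimal Python sketch that only builds the command lines for copying a local file into the DFS; it does not run anything. The helper names (fs_command, upload_steps) and both paths are hypothetical and chosen for illustration. The -mkdir, -put and -ls options are standard Hadoop file system shell options. The sketch assumes the arguments contain no spaces, so no quoting is attempted.

```python
def fs_command(*args):
    """Build a `hadoop fs ...` command line as a single string.

    Assumption: arguments contain no whitespace, so no quoting is done.
    """
    for arg in args:
        if any(ch.isspace() for ch in arg):
            raise ValueError("arguments with whitespace are not supported: %r" % arg)
    return "hadoop fs " + " ".join(args)


def upload_steps(local_path, dfs_dir):
    """Return the commands to create a DFS directory, copy a file in, and list it."""
    return [
        fs_command("-mkdir", dfs_dir),           # create the target directory
        fs_command("-put", local_path, dfs_dir),  # copy the local file into it
        fs_command("-ls", dfs_dir),               # list the directory contents
    ]


if __name__ == "__main__":
    # Both paths are hypothetical examples, not taken from the article.
    for step in upload_steps("C:\\data\\sample.txt", "/user/demo"):
        print(step)
```

Each printed line can then be pasted into the Hadoop Command Line window.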

Try checking the version as below:

c:\Hadoop\hadoop-1.1.0-SNAPSHOT>hadoop version
Hadoop 1.1.0-SNAPSHOT
Subversion on branch -r
Compiled by jenkins on Wed Oct 17 22:28:56 PDT 2012
From source with checksum 80f5614dfb0743b569344f051a07b37d
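If you want to read this output from a script, for example to check which build is installed, something like the following Python sketch could work. The function name parse_hadoop_version and the dictionary keys are my own choices, and the patterns are based only on the four lines printed above, so other Hadoop builds may print something different.

```python
import re

# The output copied from the console session above.
SAMPLE_OUTPUT = """\
Hadoop 1.1.0-SNAPSHOT
Subversion on branch -r
Compiled by jenkins on Wed Oct 17 22:28:56 PDT 2012
From source with checksum 80f5614dfb0743b569344f051a07b37d
"""


def parse_hadoop_version(text):
    """Extract version, build user, build date and checksum from the output."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"^Hadoop (\S+)$", line)
        if match:
            info["version"] = match.group(1)
            continue
        match = re.match(r"^Compiled by (\S+) on (.+)$", line)
        if match:
            info["compiled_by"] = match.group(1)
            info["compiled_on"] = match.group(2)
            continue
        match = re.match(r"^From source with checksum (\S+)$", line)
        if match:
            info["checksum"] = match.group(1)
    return info


if __name__ == "__main__":
    for key, value in sorted(parse_hadoop_version(SAMPLE_OUTPUT).items()):
        print("%s: %s" % (key, value))
```

Lines that match none of the patterns, such as the Subversion line, are simply skipped.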

Now if you launch the “Microsoft HDInsight Dashboard” shortcut, you will see the dashboard running locally as below:

clip_image006

Launching the “Hadoop MapReduce Status” shortcut will give you the following info:

clip_image007

And if you launch the “Hadoop Name Node Status” shortcut, you will see the following:

clip_image008

So, as you can see above, you now have a Hadoop cluster running on your local machine.
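If you would rather confirm this from a script than from the shortcuts, a rough Python sketch like the one below can check whether the status pages are listening. Note the assumption: the ports 50070 (Name Node) and 50030 (MapReduce / JobTracker) are the usual Hadoop 1.x web UI defaults and are not stated anywhere in this article, so your installation may use different ones. The helper name is_port_open is also just illustrative.

```python
import socket

# Assumed Hadoop 1.x default web UI ports; not taken from the article.
DEFAULT_UI_PORTS = {
    "Hadoop Name Node Status": 50070,
    "Hadoop MapReduce Status": 50030,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    for name, port in sorted(DEFAULT_UI_PORTS.items()):
        state = "listening" if is_port_open("localhost", port) else "not reachable"
        print("%s (port %d): %s" % (name, port, state))
```

A successful TCP connection only shows that something is listening on the port; it does not prove the service behind it is healthy.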

Play with it a little more; my next article will cover this in more detail.

Have fun with Hadoop!!
