Understanding the block and file system relationship in HDFS

I am using a 3-node Hadoop cluster running on Windows Azure HDInsight for this test.

In Hadoop we can use the fsck utility to diagnose the health of the HDFS file system, find missing files or blocks, and report on overall block integrity.

Let's run fsck on the root file system:

c:\apps\dist\hadoop-1.1.0-SNAPSHOT>hadoop fsck /

FSCK started by avkash from /10.114.132.17 for path / at Thu Mar 07 05:27:39 GMT 2013
..........Status: HEALTHY
Total size: 552335333 B
Total dirs: 21
Total files: 10
Total blocks (validated): 12 (avg. block size 46027944 B)
Minimally replicated blocks: 12 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 3
FSCK ended at Thu Mar 07 05:27:39 GMT 2013 in 8 milliseconds

The filesystem under path '/' is HEALTHY

 

Now let's list everything under the root (/) to verify the file and directory counts:

 

c:\apps\dist\hadoop-1.1.0-SNAPSHOT>hadoop fs -lsr /

drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /example
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /example/apps
-rw-r--r-- 3 avkash supergroup 4608 2013-03-04 21:16 /example/apps/cat.exe
-rw-r--r-- 3 avkash supergroup 5120 2013-03-04 21:16 /example/apps/wc.exe
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /example/data
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /example/data/gutenberg
-rw-r--r-- 3 avkash supergroup 1395667 2013-03-04 21:16 /example/data/gutenberg/davinci.txt
-rw-r--r-- 3 avkash supergroup 674762 2013-03-04 21:16 /example/data/gutenberg/outlineofscience.txt
-rw-r--r-- 3 avkash supergroup 1573044 2013-03-04 21:16 /example/data/gutenberg/ulysses.txt
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:15 /hdfs
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:15 /hdfs/tmp
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:15 /hdfs/tmp/mapred
drwx------ - avkash supergroup 0 2013-03-04 21:15 /hdfs/tmp/mapred/system
-rw------- 3 avkash supergroup 4 2013-03-04 21:15 /hdfs/tmp/mapred/system/jobtracker.info
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /hive
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /hive/warehouse
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /hive/warehouse/hivesampletable
-rw-r--r-- 3 avkash supergroup 5015508 2013-03-04 21:16 /hive/warehouse/hivesampletable/HiveSampleData.txt
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /tmp
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /tmp/hive-avkash
drwxrwxrwx - SYSTEM supergroup 0 2013-03-04 21:15 /uploads
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /user
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /user/SYSTEM
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /user/SYSTEM/graph
-rw-r--r-- 3 avkash supergroup 80 2013-03-04 21:16 /user/SYSTEM/graph/catepillar_star.edge
drwxr-xr-x - avkash supergroup 0 2013-03-04 21:16 /user/SYSTEM/query
-rw-r--r-- 3 avkash supergroup 12 2013-03-04 21:16 /user/SYSTEM/query/catepillar_star_rwr.query
drwxr-xr-x - avkash supergroup 0 2013-03-05 07:37 /user/avkash
drwxr-xr-x - avkash supergroup 0 2013-03-04 23:00 /user/avkash/.Trash
-rw-r--r-- 3 avkash supergroup 543666528 2013-03-05 07:37 /user/avkash/data_w3c_large.txt

Above we can see a total of 21 directories and 10 files, matching the fsck summary. Now we can dig further and inspect how the 12 blocks are laid out for each file:

c:\apps\dist\hadoop-1.1.0-SNAPSHOT>hadoop fsck / -files -blocks -racks
FSCK started by avkash from /10.114.132.17 for path / at Thu Mar 07 05:35:44 GMT 2013
/

/example

/example/apps

/example/apps/cat.exe 4608 bytes, 1 block(s): OK

0. blk_9084981204553714951_1008 len=4608 repl=3 [/fd0/ud0/10.114.236.28:50010, /fd0/ud2/10.114.236.42:50010, /fd1/ud1/10.114.228.35:50010]

/example/apps/wc.exe 5120 bytes, 1 block(s): OK
0. blk_-7951603158243426483_1009 len=5120 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/example/data

/example/data/gutenberg

/example/data/gutenberg/davinci.txt 1395667 bytes, 1 block(s): OK

0. blk_3859330889089858864_1005 len=1395667 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/example/data/gutenberg/outlineofscience.txt 674762 bytes, 1 block(s): OK

0. blk_-3790696559021810548_1006 len=674762 repl=3 [/fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010, /fd1/ud1/10.114.228.35:50010]

/example/data/gutenberg/ulysses.txt 1573044 bytes, 1 block(s): OK

0. blk_-8671592324971725227_1007 len=1573044 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/hdfs

/hdfs/tmp

/hdfs/tmp/mapred

/hdfs/tmp/mapred/system

/hdfs/tmp/mapred/system/jobtracker.info 4 bytes, 1 block(s): OK

0. blk_5997185491433558819_1003 len=4 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/hive

/hive/warehouse

/hive/warehouse/hivesampletable

/hive/warehouse/hivesampletable/HiveSampleData.txt 5015508 bytes, 1 block(s): OK

0. blk_44873054283747216_1004 len=5015508 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/tmp

/tmp/hive-avkash

/uploads

/user

/user/SYSTEM

/user/SYSTEM/graph

/user/SYSTEM/graph/catepillar_star.edge 80 bytes, 1 block(s): OK

0. blk_-6715685143024983574_1010 len=80 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud2/10.114.236.42:50010, /fd0/ud0/10.114.236.28:50010]

/user/SYSTEM/query

/user/SYSTEM/query/catepillar_star_rwr.query 12 bytes, 1 block(s): OK

0. blk_8102317020509190444_1011 len=12 repl=3 [/fd0/ud0/10.114.236.28:50010, /fd0/ud2/10.114.236.42:50010, /fd1/ud1/10.114.228.35:50010]

/user/avkash

/user/avkash/.Trash

/user/avkash/data_w3c_large.txt 543666528 bytes, 3 block(s): OK

0. blk_2005027737969478969_1012 len=268435456 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud0/10.114.236.28:50010, /fd0/ud2/10.114.236.42:50010]

1. blk_1970119524179712436_1012 len=268435456 repl=3 [/fd1/ud1/10.114.228.35:50010, /fd0/ud0/10.114.236.28:50010, /fd0/ud2/10.114.236.42:50010]

2. blk_6223000007391223944_1012 len=6795616 repl=3 [/fd0/ud0/10.114.236.28:50010, /fd0/ud2/10.114.236.42:50010, /fd1/ud1/10.114.228.35:50010]

Status: HEALTHY
Total size: 552335333 B
Total dirs: 21
Total files: 10
Total blocks (validated): 12 (avg. block size 46027944 B)
Minimally replicated blocks: 12 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 3
FSCK ended at Thu Mar 07 05:35:44 GMT 2013 in 10 milliseconds

The filesystem under path '/' is HEALTHY

Above we can verify how all 12 blocks are distributed: nine of the files occupy one block each (9 blocks), while data_w3c_large.txt (543,666,528 bytes) is split into the remaining 3 blocks, two full 256 MB blocks plus a 6,795,616-byte tail.
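To see the same block-to-file mapping programmatically, the HDFS client API exposes block locations directly. Below is a minimal sketch, assuming the classic Hadoop FileSystem API and the data_w3c_large.txt path from the listing above; it must run on a machine whose classpath has the cluster's configuration files, so treat the setup as illustrative.

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlocks {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath, so this
        // needs to run on a node that is configured to talk to the cluster.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Path taken from the lsr/fsck output above; adjust as needed.
        Path file = new Path("/user/avkash/data_w3c_large.txt");
        FileStatus status = fs.getFileStatus(file);

        // One BlockLocation per block; each lists the datanodes holding a replica.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        System.out.println(file + ": " + blocks.length + " block(s)");
        for (BlockLocation block : blocks) {
            System.out.println("  offset=" + block.getOffset()
                    + " len=" + block.getLength()
                    + " hosts=" + Arrays.toString(block.getHosts()));
        }
    }
}

For the 543,666,528-byte file this should report three blocks, two of 268,435,456 bytes (the 256 MB block size in effect on this cluster) and one of 6,795,616 bytes, which is exactly what fsck printed above.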

 

Keywords: FSCK, Hadoop, HDFS, Blocks, File System, Replication, HDInsight


A Menu based Windows Azure PowerShell script for PaaS and IaaS Operations

When run, the PowerShell menu looks like this:

[Screenshot: PowerShell menu (psmenu10)]

Get script from here:

https://github.com/Avkash/AzurePowershellmenu/blob/master/PowershellMenuPub.ps1

For those who would like to fork the repository and add more functionality, here is the sequence of commands I used:

 

$ ls -l

total 5

-rw-r--r--    1 avkashc  Administ     8350 Feb 12 12:02 PowershellMenuPub.ps1

-rw-r--r--    1 avkashc  Administ       73 Feb  6 23:49 README.md

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

$ git init

Reinitialized existing Git repository in c:/installbox/azureps/toshare/.git/

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

$ git add PowershellMenuPub.ps1

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

$ git commit -m "initial commit"

[master 9157958] initial commit

Committer: unknown <**********************************>

Your name and email address were configured automatically based

on your username and hostname. Please check that they are accurate.

You can suppress this message by setting them explicitly:

 

git config --global user.name "Your Name"

git config --global user.email you@example.com

 

After doing this, you may fix the identity used for this commit with:

 

git commit --amend --reset-author

 

1 file changed, 28 insertions(+), 5 deletions(-)

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

$ git remote add myPS https://github.com/Avkash/AzurePowershellmenu.git

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

$ git push myPS master

Username for 'https://github.com': **********@*****.***

Password for 'https://*********@github.com': #################

Counting objects: 8, done.

Delta compression using up to 4 threads.

Compressing objects: 100% (6/6), done.

Writing objects: 100% (6/6), 1.17 KiB, done.

Total 6 (delta 2), reused 0 (delta 0)

To https://github.com/Avkash/AzurePowershellmenu.git

fa6307a..9157958  master -> master

 

avkashc@AVKASHXPS12 /c/installbox/azureps/toshare (master)

Keywords: PowerShell, Windows Azure, Scripting

Resource Allocation Model in MapReduce 2.0

What was available in previous MapReduce:

  • Each node in the cluster was statically assigned the capability of running a predefined number of Map slots and a predefined number of Reduce slots.
  • The slots could not be shared between Maps and Reduces. This static allocation of slots was not optimal, since slot requirements vary during the MR job life cycle.
  • In general, Map slots are in demand when the job starts, whereas Reduce slots are needed towards the end.

Key drawback in previous MapReduce:

  • In a real cluster, where jobs are submitted at random and each has its own Map/Reduce slot requirements, achieving optimal utilization of the cluster was hard, if not impossible.

What is new in MapReduce 2.0:

  • The resource allocation model in Hadoop 0.23 addresses this deficiency by providing more flexible resource modeling.
  • Resources are requested in the form of containers, where each container has a number of non-static attributes.
  • At the time of writing this blog, the only supported attribute was memory (RAM). However, the model is generic, and the intention is to add more attributes in future releases (e.g. CPU and network bandwidth).
  • In this new resource management model, only a minimum and a maximum for each attribute are defined, and Application Masters (AMs) can request containers with attribute values as multiples of these minimums (see the sketch below).
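To make the "multiples of the minimum" rule concrete, here is a small illustrative sketch of how a scheduler might normalize a container memory request. The minimum and maximum values are hypothetical and not taken from any particular configuration; this is not YARN's actual code, just the arithmetic the model implies.

public class ContainerRequestNormalizer {
    /**
     * Rounds a requested memory value (MB) up to the nearest multiple of the
     * scheduler's minimum allocation, capped at the maximum allocation.
     * Mirrors the min/max-per-attribute model described above.
     */
    static int normalizeMemory(int requestedMb, int minAllocMb, int maxAllocMb) {
        int multiples = (requestedMb + minAllocMb - 1) / minAllocMb; // ceiling division
        int normalized = Math.max(minAllocMb, multiples * minAllocMb);
        return Math.min(normalized, maxAllocMb);
    }

    public static void main(String[] args) {
        // Hypothetical scheduler settings: 1024 MB minimum, 8192 MB maximum.
        System.out.println(normalizeMemory(3000, 1024, 8192)); // 3072
        System.out.println(normalizeMemory(500, 1024, 8192));  // 1024
        System.out.println(normalizeMemory(9000, 1024, 8192)); // 8192 (capped)
    }
}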

Credit: http://www.cloudera.com/blog/2012/02/mapreduce-2-0-in-hadoop-0-23/

Processing already sorted data with Hadoop Map/Reduce jobs without performance overhead

While working with Map/Reduce jobs in Hadoop, it is quite possible that you already have sorted data stored in HDFS. As you may know, sorting happens not only after the map phase inside the map task, but also during the merge phase of the reduce task, so re-sorting data that is already sorted is a big performance overhead. In this situation you may want your Map/Reduce job to skip sorting the data.

 

Note: If you have tried changing map.sort.class to a no-op implementation, you will have found that this does not work either.

 

So the questions are:

  • Is it possible to force Map/Reduce not to sort the data again after the map phase, since it is already sorted?
  • How can you run Map/Reduce jobs in a way that lets you control whether the results come out sorted or unsorted?

So if you do not need the results to be sorted, the Hadoop patch discussed below, which adds a mapred.map.output.sort option, would be a great place to start.

Note: Before using the patch, I would suggest reading the following comments from Robert about it (a configuration sketch follows these comments):

  • Combiners are not compatible with mapred.map.output.sort. Is there a reason why we could not make combiners work with this, so long as they must follow the same assumption that they will not get sorted input? If the algorithms you are thinking about would never get any benefit from a combiner, could you also add the check in the client. I would much rather have the client blow up with an error instead of waiting for my map tasks to launch and then blow up 4+ times before I get the error.
  • In your test you never validate that the output is what you expected it to be. That may be hard as it may not be deterministic because there is no sorting, but it would be nice to have something verify that the code did work as expected. Not just that it did not crash.
  • mapred-default.xml Please add mapred.map.output.sort to mapred-default.xml. Include with it a brief explanation of what it does.
  • There is no documentation or examples. This is a new feature that could be very useful to lots of people, but if they never know it is there it will not be used. Could you include in your patch updates to the documentation about how to use this, and some useful examples, preferably simple. Perhaps an example computing CTR would be nice.
  • Performance. The entire reason for this change is to improve performance, but I have not seen any numbers showing a performance improvement. No numbers at all in fact. It would be great if you could include here some numbers along with the code you used for your benchmark and a description of your setup. I have spent time on different performance teams, and performance improvement efforts from a huge search engine to an OS on a cell phone and the one thing I have learned is that you have to go off of the numbers because well at least for me my intuition is often wrong and what I thought would make it faster slowed it down instead.
  • Trunk. This patch is specific to 0.20/1.0 line. Before this can get merged into the 0.20/1.0 lines we really need an equivalent patch for trunk, and possibly 0.21, 0.22, and 0.23. This is so there are no regressions. It may be a while off after you get the 1.0 patch cleaned up though.
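For completeness, once such a patched build is in place, disabling map-output sorting would presumably come down to setting the mapred.map.output.sort property in the job configuration. The snippet below is a hypothetical sketch based only on the property name mentioned in the comments above; on stock Hadoop the property is simply ignored.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class UnsortedOutputJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Property from the patch discussion above; ignored by stock Hadoop,
        // so this only takes effect on a cluster running the patched build.
        conf.setBoolean("mapred.map.output.sort", false);

        Job job = new Job(conf, "unsorted-output-example");
        // ... set mapper, reducer, input/output paths, etc. as usual ...
        // Remember: per the comments above, combiners are NOT compatible
        // with mapred.map.output.sort=false.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}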

Keywords: Hadoop, Map/Reduce, Job Performance, Hadoop Patch

How to submit multiple Hadoop Map/Reduce jobs from a single command shell to run in parallel

Sometimes you need to run multiple Map/Reduce jobs on the same Hadoop cluster, but opening several Hadoop command shells (terminals) can be troublesome. Note that, depending on your Hadoop cluster size and configuration, only a limited number of Map/Reduce jobs can run in parallel; still, if you need to submit several jobs at once, here is something you can use to accomplish that:

First, take a look at the ToolRunner class defined in the Hadoop util library:

http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/ToolRunner.html

Here are the quick steps:

  • Call the ToolRunner.run(..) method in a loop, keeping your Map/Reduce job setup inside the loop.
  • You must use Job.submit() instead of Job.waitForCompletion(), because:
    • Job.submit() returns immediately after submitting the job, so all the jobs run in parallel.
    • Job.waitForCompletion() blocks until the job finishes, so the jobs would run sequentially.

Here is the code snippet:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class LaunchParallel extends Configured implements Tool {

  public static void main(String[] args) throws Exception {
    for (int i = 0; i < 50; i++) {
      // Each iteration submits one job.
      ToolRunner.run(new LaunchParallel(), args);
    }
  }

  @Override
  public int run(String[] args) throws Exception {
    Job job = new Job(getConf());
    // ... your job details (mapper, reducer, input/output paths) here ...
    job.submit(); // must be job.submit() so the jobs run in parallel
    return 0;
  }
}

Note: If each job needs different arguments, you can put all the argument sets in an array and index it with the loop counter to pass the right arguments to each Map/Reduce job, as sketched below.
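Purely as an illustration (the paths are hypothetical), the main method above could index into such an array so each submitted job gets its own input and output:

public static void main(String[] args) throws Exception {
  // Hypothetical per-job argument sets: one input/output pair per job.
  String[][] jobArgs = {
      { "/input/day1", "/output/day1" },
      { "/input/day2", "/output/day2" },
      { "/input/day3", "/output/day3" },
  };
  for (int i = 0; i < jobArgs.length; i++) {
    // The loop counter selects the argument set for this particular job.
    ToolRunner.run(new LaunchParallel(), jobArgs[i]);
  }
}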

Keywords: Hadoop, Map/Reduce, Parallel Jobs

Listing currently running Hadoop jobs and killing a running job

When you have jobs running in Hadoop, you can use the Map/Reduce web UI to list the currently running jobs. But what if you need to kill a running job because it has started malfunctioning or, in the worst case, is stuck in an infinite loop? I have seen several scenarios where a submitted job got stuck in a problematic state due to a code defect in the Map/Reduce job or in the Hadoop cluster itself. In any such situation, you need to manually kill the job that has already started.

To kill a currently running Hadoop job, you first need the job ID, and then you can kill the job using the following commands:

  • hadoop job -list
  • hadoop job -kill <JobID>

To list the currently running jobs from the Hadoop command shell, use the command below:

On Linux:      $ bin/hadoop job -list
On Windows:    HADOOP_HOME = C:\Apps\Dist
               HADOOP_HOME\bin\hadoop job -list

The above command returns job details as below:

[Linux]

1 jobs currently running
JobId                  State          StartTime       UserName
job_201203293423_0001   1             1334506474312   avkash

[Windows]
c:\apps\dist>hadoop job -list
1 jobs currently running
JobId                  State   StartTime       UserName        Priority        SchedulingInfo
job_201204011859_0002   1       1333307249654   avkash       NORMAL          NA

Once you have the JobID you can use the following command to kill the job:

 
On Linux:      $ bin/hadoop job -kill <JobID>
On Windows:    HADOOP_HOME = C:\Apps\Dist
               HADOOP_HOME\bin\hadoop job -kill <JobID>

[Windows]
c:\apps\dist>hadoop job -kill job_201204011859_0002
Killed job job_201204011859_0002
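If you prefer to do the same thing from code rather than the shell, the old mapred JobClient API can list and kill jobs as well. The following is a rough sketch assuming the Hadoop 1.x mapred API, with the job ID passed on the command line; it connects to whatever JobTracker is configured on the classpath.

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.JobStatus;
import org.apache.hadoop.mapred.RunningJob;

public class KillJob {
    public static void main(String[] args) throws Exception {
        // Connects to the JobTracker configured in mapred-site.xml.
        JobClient client = new JobClient(new JobConf());

        // Equivalent of "hadoop job -list": print all known jobs.
        for (JobStatus status : client.getAllJobs()) {
            System.out.println(status.getJobID() + "\t" + status.getRunState());
        }

        // Equivalent of "hadoop job -kill <JobID>", if an ID was supplied.
        if (args.length > 0) {
            RunningJob job = client.getJob(JobID.forName(args[0]));
            if (job != null) {
                job.killJob();
                System.out.println("Killed job " + args[0]);
            }
        }
    }
}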

How to troubleshoot MapReduce jobs in Hadoop

When writing MapReduce programs you are definitely going to hit problems such as infinite loops, crashes in MapReduce, incomplete jobs, etc. Here are a few things that will help you isolate these problems:

Map/Reduce Log Files:

All MapReduce job activity is logged by default in Hadoop. By default, log files are stored in the logs/ subdirectory of the HADOOP_HOME directory. The log file name format is HADOOP-username-service-hostname.log. The most recent data is in the .log file; older logs have their date appended to them.

Log File Format:

HADOOP-username-service-hostname.log

  • The username in the log filename refers to the user account under which Hadoop was started. On Windows, the Hadoop service may be started under a different user name than the one you use to log on to the machine, so this is not necessarily the same username you are using to run your programs.
  • The service name identifies which of the several Hadoop daemons wrote the log; these are important for debugging the whole Hadoop installation:
    • JobTracker
    • NameNode
    • DataNode
    • SecondaryNameNode
    • TaskTracker

For the Map/Reduce process, the TaskTracker logs provide details about the individual tasks that ran on each node. Any exceptions thrown by your MapReduce program will be logged in the TaskTracker logs.

The userlogs subdirectory:

Inside the HADOOP_HOME\logs folder you will also find a subdirectory named userlogs. In this directory there is a further subdirectory for every MapReduce task running in your Hadoop cluster. Each task records its stdout and stderr to two files in this subdirectory. If you are running a multi-node Hadoop cluster, the logs you find here are not centrally aggregated; to collect the relevant logs you need to check each TaskTracker node's logs/userlogs/ directory and piece together the full log history to understand what went wrong.
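As a hypothetical convenience, here is a small sketch of a helper that walks a single node's userlogs directory and prints where each task attempt's stdout, stderr, and syslog files live. The exact directory layout varies between Hadoop versions, so treat the paths and file names as illustrative.

import java.io.File;

public class CollectTaskLogs {
    public static void main(String[] args) {
        // Adjust to your installation: HADOOP_HOME\logs\userlogs on Windows,
        // $HADOOP_HOME/logs/userlogs on Linux. Defaults to a relative path.
        File userlogs = new File(args.length > 0 ? args[0] : "logs/userlogs");

        File[] attempts = userlogs.listFiles();
        if (attempts == null) {
            System.err.println("No userlogs directory found at " + userlogs);
            return;
        }
        for (File attemptDir : attempts) {
            if (!attemptDir.isDirectory()) continue;
            System.out.println("Task attempt: " + attemptDir.getName());
            // Print whichever of the standard per-task log files exist.
            for (String logName : new String[] { "stdout", "stderr", "syslog" }) {
                File log = new File(attemptDir, logName);
                if (log.exists()) {
                    System.out.println("  " + logName + " (" + log.length() + " bytes): " + log.getPath());
                }
            }
        }
    }
}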