Free ebook: Introducing Microsoft Azure HDInsight

New Free eBook by Microsoft Press:

Microsoft Press is thrilled to share another new free ebook with you: Introducing Microsoft Azure HDInsight, by Avkash Chauhan, Valentine Fontama, Michele Hart, Wee Hyong Tok, and Buck Woody.

Introduction (excerpt)

Microsoft Azure HDInsight is Microsoft’s 100 percent compliant distribution of Apache Hadoop on Microsoft Azure. This means that standard Hadoop concepts and technologies apply, so learning the Hadoop stack helps you learn the HDInsight service. At the time of this writing, HDInsight (version 3.0) uses Hadoop version 2.2 and Hortonworks Data Platform 2.0.

In Introducing Microsoft Azure HDInsight, we cover what big data really means, how you can use it to your advantage in your company or organization, and one of the services you can use to do that quickly—specifically, Microsoft’s HDInsight service. We start with an overview of big data and Hadoop, but we don’t emphasize only concepts in this book—we want you to jump in and get your hands dirty working with HDInsight in a practical way. To help you learn and even implement HDInsight right away, we focus on a specific use case that applies to almost any organization and demonstrate a process that you can follow along with.

We also help you learn more. In the last chapter, we look ahead at the future of HDInsight and give you recommendations for self-learning so that you can dive deeper into important concepts and round out your education on working with big data.

Here are the download links (the ebook excerpt above describes this offering):

Download the PDF (6.37 MB; 130 pages) from http://aka.ms/IntroHDInsight/PDF

Download the EPUB (8.46 MB) from http://aka.ms/IntroHDInsight/EPUB

Download the MOBI (12.8 MB) from http://aka.ms/IntroHDInsight/MOBI

Download the code samples (6.83 KB) from http://aka.ms/IntroHDInsight/CompContent

20 TB Earth Science Dataset from NASA NEX Now Publicly Available on AWS

AWS has been working with the NASA Earth Exchange (NEX) team to make it easier and more efficient for researchers to access and process earth science data. The goal is to make a number of important data sets accessible to a wider audience of full-time researchers, students, and citizen scientists. This important new project is called OpenNEX. Up until now, it has been logistically difficult for researchers to gain easy access to this data due to its dynamic nature and immense size (tens of terabytes). Limitations on download bandwidth, local storage, and on-premises processing power made in-house processing impractical.

Access Dataset: s3://nasanex/NEX-DCP30
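Most S3 clients ask for the bucket name and the key prefix separately rather than the full URI. As a minimal sketch, the URI above can be split with the Python standard library; the helper name `split_s3_uri` is hypothetical, and whether the bucket allows anonymous listing is an assumption to check against the detail page.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key_prefix).

    Hypothetical helper for illustration; it only parses the string
    and does not contact AWS.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the "host" part; the prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_uri("s3://nasanex/NEX-DCP30")
print(bucket, prefix)
```

The two values can then be passed to whichever S3 client you prefer.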

Consult the detail page and the tech note to learn more about the provenance, format, structure, and attribution requirements.

NASA Earth Exchange (NEX):

The NASA Earth Exchange (NEX) Downscaled Climate Projections (NEX-DCP30) dataset comprises downscaled climate scenarios for the conterminous United States that are derived from the General Circulation Model (GCM) runs conducted under the Coupled Model Intercomparison Project Phase 5 (CMIP5) [Taylor et al. 2012] and across the four greenhouse gas emissions scenarios known as Representative Concentration Pathways (RCPs) [Meinshausen et al. 2011] developed for the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR5). The dataset includes downscaled projections from 33 models, as well as ensemble statistics calculated for each RCP from all model runs available. The purpose of these datasets is to provide a set of high-resolution, bias-corrected climate change projections that can be used to evaluate climate change impacts on processes that are sensitive to finer-scale climate gradients and the effects of local topography on climate conditions.

Each of the climate projections includes monthly averaged maximum temperature, minimum temperature, and precipitation for the periods from 1950 through 2005 (Retrospective Run) and from 2006 to 2099 (Prospective Run).

Website: NASA NEX

Summary

  • Short Name: NEX-DCP30
  • Version: 1
  • Format: netCDF4 classic
  • Spatial Coverage: CONUS
  • Temporal Coverage:
    • 1950 – 2005 historical or 2006 – 2099 RCP
  • Data Resolution:
    • Latitude Resolution: 30 arc second
    • Longitude Resolution: 30 arc second
    • Temporal Resolution: monthly
  • Data Size:
    • Total Dataset Size: 17 TB
    • Individual file size: 2 GB
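To make the summary figures concrete, the short sketch below converts the 30 arc-second resolution to degrees and counts the monthly time steps in each period. It assumes both year ranges are inclusive and that every month is present; the variable and function names are illustrative only.

```python
ARCSEC_PER_DEGREE = 3600

# 30 arc seconds expressed in decimal degrees (1/120 of a degree).
cell_degrees = 30 / ARCSEC_PER_DEGREE


def monthly_steps(first_year, last_year):
    """Number of monthly records in an inclusive range of years."""
    return (last_year - first_year + 1) * 12


retrospective_steps = monthly_steps(1950, 2005)  # historical run
prospective_steps = monthly_steps(2006, 2099)    # RCP runs

print(f"cell size: {cell_degrees:.5f} degrees")
print(f"retrospective months: {retrospective_steps}")
print(f"prospective months: {prospective_steps}")
```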

Learn more about NEX – NASA Earth Exchange Downscaled Project

NEX Virtual Workshop: https://nex.nasa.gov/nex/projects/1328/

 

Top 20 Big Data and Analytics Startups with Significant VC Funding

 

Startup              Funding (in millions)
MongoDB              231
Pivotal              210
Mu Sigma             208
Cloudera             141
Opera Solutions      114
HortonWorks          98
DataStax             83.7
Guavas               80.5
GoodData             75.5
ParAccel (Actian)    74
Talend               61.6
Pentaho              60
MapR                 61
CouchBase            56
Platfora             27.5
Datameer             18
Hadapt               16.2
Karmasphere          14.5
DataBricks           14
Quantifind           11.2

 

 

Keywords: Big Data, Data Analytics, Infographic, Hadoop, BI

NFL: QB Rating after week 6

Quarterback Rating: Total Games/Touchdowns/Interceptions [chart: qb6-tds]

Quarterback Statistics: Total Yards/Attempts/Completed [chart: qb-6-yds]

Quarterback Statistics:

Rank Player Team G EPA WPA/G EPA/P SR(%) Att Cmp Cmp% Yds Sk SkYds Int TDs %Deep AYPA
1 18-P.Manning DEN 6 109.4 0.49 0.41 61 240 178 74.2 2179 5 25 2 22 16.7 8.4
2 17-P.Rivers SD 6 62.4 0.26 0.24 56.5 224 163 72.8 1847 10 64 5 14 21 6.7
3 9-T.Romo DAL 6 40.5 0.1 0.16 52.6 218 153 70.2 1691 14 103 3 14 15.1 6.3
4 16-M.Cassel MIN 2 7.7 0.12 0.1 44.2 69 48 69.6 489 4 26 2 3 15.9 5.1
5 2-M.Ryan ATL 5 59.4 0.33 0.24 53.5 218 151 69.3 1649 9 76 3 10 12.4 6.3
6 9-N.Foles PHI 4 34.2 0.22 0.48 54.9 61 41 67.2 542 2 12 0 6 18 8.4
7 9-D.Brees NO 6 74.6 0.3 0.28 51.1 237 157 66.2 1958 15 94 5 14 22.8 6.5
8 6-J.Cutler CHI 6 21.5 0.22 0.09 51.2 218 144 66.1 1628 9 65 6 12 20.6 5.7
9 7-B.Roethlisberger PIT 5 19.7 0.06 0.09 47.9 192 126 65.6 1495 18 115 5 6 21.9 5.5
10 14-A.Dalton CIN 6 18.3 0.14 0.07 49 215 140 65.1 1552 14 80 6 8 18.6 5.2
11 2-T.Pryor OAK 5 13.7 0.04 0.06 45.6 138 89 64.5 1061 21 144 5 5 20.3 4.4
12 8-M.Schaub HST 6 -23.3 -0.08 -0.09 43.6 233 150 64.4 1552 15 105 9 8 17.6 4.2
13 12-A.Rodgers GB 5 45.4 0.24 0.21 51.1 184 118 64.1 1646 14 99 4 10 23.4 6.9
14 9-M.Stafford DET 6 41.7 0.19 0.15 50.9 239 150 62.8 1772 10 71 4 12 17.6 6.1
15 17-R.Tannehill MIA 5 15.2 0.14 0.07 48.6 182 114 62.6 1383 24 148 5 6 15.4 4.9
16 10-J.Locker TEN 4 25.3 0.32 0.18 47.5 111 69 62.2 721 9 57 0 6 24.3 5.5
17 12-A.Luck IND 6 49.5 0.16 0.21 51.1 188 117 62.2 1354 13 88 3 7 23.4 5.6
18 3-R.Wilson SEA 6 35.7 0.16 0.15 49.1 158 97 61.4 1254 17 93 4 8 26.6 5.6
19 1-C.Newton CAR 5 23.2 0.05 0.11 48.6 153 93 60.8 1127 16 123 5 9 22.2 4.6
20 10-R.Griffin WAS 5 9.2 -0.09 0.04 46.3 209 125 59.8 1448 10 92 5 6 19.6 5.2
21 7-G.Smith NYJ 6 -8.5 0 -0.03 42.7 190 113 59.5 1490 21 163 10 7 30.5 4.2
22 8-S.Bradford SL 6 16.5 0.05 0.06 46.6 232 138 59.5 1432 13 89 3 13 16.4 4.9
23 6-B.Hoyer CLV 3 1.2 0.1 0.01 41.7 96 57 59.4 615 6 48 3 5 18.8 4.2
24 3-C.Palmer ARZ 6 -20.2 -0.18 -0.08 44.9 221 131 59.3 1483 13 80 11 7 23.5 3.9
25 7-C.Ponder MIN 3 7.8 -0.07 0.06 44.4 100 59 59 691 10 44 5 2 27 3.8
26 7-C.Henne JAX 5 -4.2 0 -0.03 40 137 80 58.4 904 10 60 4 2 19 4.5
27 8-M.Glennon TB 2 -6.9 -0.22 -0.07 42 86 50 58.1 466 4 36 3 3 14 3.3
28 5-J.Flacco BLT 6 24.8 0.01 0.09 40 235 136 57.9 1702 19 124 8 7 28.5 4.8
29 12-T.Brady NE 6 18.7 0.16 0.07 44 239 136 56.9 1480 15 110 4 8 21.3 4.7
30 3-E.Manuel BUF 5 -13.3 0.05 -0.07 41.6 150 85 56.7 985 13 74 3 5 23.3 4.8
31 11-A.Smith KC 6 3.7 0.04 0.01 42.4 216 122 56.5 1330 16 86 3 7 14.8 4.8
32 3-B.Weeden CLV 4 -6.5 -0.09 -0.03 37.2 153 86 56.2 1005 18 130 5 4 25.5 3.8
33 7-C.Kaepernick SF 6 6.3 0.14 0.03 42.9 161 90 55.9 1221 13 75 5 8 22.4 5.3
34 7-M.Vick PHI 5 12.4 0.18 0.07 45.7 133 72 54.1 1185 12 72 2 5 32.3 7.1
35 10-E.Manning NYG 6 -0.2 -0.08 0 45.4 229 123 53.7 1721 16 104 15 9 27.5 3.8
36 4-R.Fitzpatrick TEN 3 5.1 -0.14 0.05 40.4 78 41 52.6 526 6 27 4 2 19.2 3.8
37 11-B.Gabbert JAX 3 -43.2 -0.33 -0.39 35.7 86 42 48.8 481 12 67 7 1 15.1 1
38 5-J.Freeman TB 3 -7.7 -0.07 -0.07 38.8 94 43 45.7 571 7 47 3 2 29.8 3.9
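
The table appears to be ordered by the Cmp% column. As a quick sanity check, the sketch below recomputes Cmp% for the first three rows, assuming it means completions divided by attempts (the table itself does not define its columns). The helper name `completion_pct` is illustrative.

```python
# (player, attempts, completions, listed Cmp%) copied from the first three rows.
rows = [
    ("18-P.Manning", 240, 178, 74.2),
    ("17-P.Rivers", 224, 163, 72.8),
    ("9-T.Romo", 218, 153, 70.2),
]


def completion_pct(attempts, completions):
    """Completion percentage rounded to one decimal, like the table."""
    return round(100 * completions / attempts, 1)


for player, att, cmp_, listed in rows:
    computed = completion_pct(att, cmp_)
    status = "matches" if computed == listed else "differs"
    print(f"{player}: computed {computed}, listed {listed} ({status})")
```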

Keywords: NFL, Quarterback, Infographics, Data Visualization