Big Data 1B dollars Club – Top 20 Players

Here is a list of the top players in the Big Data world, each with direct or indirect influence over billion-dollar (or larger) Big Data projects (in no particular order):

  1. Microsoft
  2. Google
  3. Amazon
  4. IBM
  5. HP
  6. Oracle
  7. VMware
  8. Teradata
  9. EMC
  10. Facebook
  11. GE
  12. Intel
  13. Cloudera
  14. SAS
  15. 10gen
  16. SAP
  17. Hortonworks
  18. MapR
  19. Palantir
  20. Splunk

The list is based on each company's direct or indirect involvement in Big Data, whether or not it ships a dedicated Big Data product. All of the above companies are involved in Big Data projects worth a billion dollars or more.

One thought on “Big Data 1B dollars Club – Top 20 Players”

  1. Hi Avkash,

    I hope my mail finds you well and in good health.
    I’m asking if you could assist me in solving a problem I’m currently facing with Hadoop, as this is my first time using it.
    I’m using Hadoop on my Ubuntu machine, and I’m trying to write a sequence file using the code below. The program finishes and prints “successfully read 202000 ….”, but in the destination folder I can’t find my output sequence file.

    So I’d be very thankful if you could please advise why this is happening.

    Many Thanks Avkash for your kind assistance and support,

    Hussein

    // Note: LabeledSplitPointBuff is the poster's own Writable class (not shown);
    // it is assumed to expose a public byte[] data field and a RECORDSIZE constant.

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MainUpload_sp extends Configured implements Tool {

        public static void main(String[] args) throws Exception {
            int res = ToolRunner.run(new Configuration(), new MainUpload_sp(), args);
            System.exit(res);
        }

        public int run(String[] args) throws Exception {
            System.out.println("UPLOADING Program");

            if (args.length != 2) {
                System.err.println("usage: hadoop xxx.jar upload_sp <input-file> <output-seqfile>");
                return -1;
            }
            String filename_in = args[0];
            String filename_out = args[1];

            Configuration conf = getConf();

            // Read the raw input file from the local disk.
            FileInputStream fin = new FileInputStream(filename_in);
            BufferedInputStream bfin = new BufferedInputStream(fin, 8192);

            // Request the local filesystem implementation for file:// paths.
            conf.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
            FileSystem fs = FileSystem.get(conf);
            if (fs.getConf() == null) {
                System.err.println("Could not initialize the local filesystem");
                return -1;
            }

            Path path_out = new Path(new java.io.File(filename_out).getPath());
            SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf,
                    path_out,
                    LongWritable.class,
                    LabeledSplitPointBuff.class,
                    SequenceFile.CompressionType.NONE);

            long numRecords = 0;
            try {
                LongWritable key = new LongWritable(0);
                LabeledSplitPointBuff lsp = new LabeledSplitPointBuff();
                while (true) {
                    // Read one fixed-size record; stop on a short or empty read.
                    int read = bfin.read(lsp.data, 0, LabeledSplitPointBuff.RECORDSIZE);
                    System.out.println("read " + read);
                    if (read != LabeledSplitPointBuff.RECORDSIZE) break;
                    writer.append(key, lsp);
                    numRecords += 1;
                }
            } finally {
                writer.close();
                bfin.close();
            }

            System.out.println("successfully read " + numRecords
                    + " records from " + filename_in + " to " + filename_out);

            return 0;
        }
    }

