Getting all categorical values for predictors in H2O POJO and MOJO models

Here is a Java/Scala code snippet that shows how to get the list of column names and the categorical values for each enum/factor predictor from H2O POJO and MOJO models:

Imports:

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;
import java.util.Arrays;

POJO:

// First, set the POJO model class name:
private static String modelClassName = "gbm_prostate_binomial";

// Then use the GenModel class to get the info you are looking for:
hex.genmodel.GenModel rawModel;
rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).getDeclaredConstructor().newInstance();

// Now you can get the results as below:
System.out.println("isSupervised : " + rawModel.isSupervised());
System.out.println("Column Names : " + Arrays.toString(rawModel.getNames()));
System.out.println("Response ID : " + rawModel.getResponseIdx());
System.out.println("Number of columns : " + rawModel.getNumCols());
System.out.println("Response Name : " + rawModel.getResponseName());

// Print the categorical values for each predictor
for (int i = 0; i < rawModel.getNumCols(); i++) {
    String[] domainValues = rawModel.getDomainValues(i);
    System.out.println(Arrays.toString(domainValues));
}
Output Results:
isSupervised : true
Column Names : [ID, AGE, RACE, DPROS, DCAPS, PSA, VOL, GLEASON]
Response ID : 8
Number of columns : 8
null
null
[0, 1, 2]
null
null
null
null
null
Note: A null entry means the predictor is numeric. For each enum/factor predictor, all of its categorical values are listed.
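
To make this output easier to read, each domain can be printed next to its column name. The following is only a minimal sketch and not part of the H2O API: the class DomainReport and its describe method are hypothetical names, and the two arrays are stand-ins for what getNames() and getDomainValues(i) return, so the snippet runs without h2o-genmodel.jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of h2o-genmodel.
final class DomainReport {

    // names[i] is a column name; domains[i] holds its categorical levels,
    // or null when the column is numeric.
    static List<String> describe(String[] names, String[][] domains) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < domains.length; i++) {
            String kind = (domains[i] == null)
                    ? "numeric"
                    : "categorical " + Arrays.toString(domains[i]);
            lines.add(names[i] + " : " + kind);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in data copied from the output shown above.
        String[] names = {"ID", "AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"};
        String[][] domains = {null, null, {"0", "1", "2"}, null, null, null, null, null};
        for (String line : describe(names, domains)) {
            System.out.println(line);
        }
    }
}
```

With a real model, the two arrays could be filled from rawModel.getNames() and from a loop over rawModel.getDomainValues(i).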

MOJO:

// Let's assume you have a MOJO model saved as gbm_prostate_binomial.zip
// You would need to load your model as below:
hex.genmodel.GenModel mojo = MojoModel.load("gbm_prostate_binomial.zip");

// Now you can get the list of predictors as below:
System.out.println("isSupervised : " + mojo.isSupervised());
System.out.println("Column Names : " + Arrays.toString(mojo.getNames()));
System.out.println("Number of columns : " + mojo.getNumCols());
System.out.println("Response ID : " + mojo.getResponseIdx());
System.out.println("Response Name : " + mojo.getResponseName());

// Print the categorical values for each predictor
for (int i = 0; i < mojo.getNumCols(); i++) {
    String[] domainValues = mojo.getDomainValues(i);
    System.out.println(Arrays.toString(domainValues));
}
Output Results:
isSupervised : true
Column Names : [ID, AGE, RACE, DPROS, DCAPS, PSA, VOL, GLEASON]
Response ID : 8
Number of columns : 8
null
null
[0, 1, 2]
null
null
null
null
null
Note: A null entry means the predictor is numeric. For each enum/factor predictor, all of its categorical values are listed.

To get help on using MOJO and POJO models, visit the following:

That’s it, enjoy!!

Getting predictors from H2O POJO and MOJO models in Java and Scala

Here is a Java/Scala code snippet that shows how to get the predictors and response details, including the list of all column names, from H2O POJO and MOJO models:

Imports:

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;
import java.util.Arrays;

POJO:

// First, set the POJO model class name:
private static String modelClassName = "gbm_prostate_binomial";

// Then use the GenModel class to get the info you are looking for:
hex.genmodel.GenModel rawModel;
rawModel = (hex.genmodel.GenModel) Class.forName(modelClassName).getDeclaredConstructor().newInstance();

// Now you can get the results as below:
System.out.println("isSupervised : " + rawModel.isSupervised());
System.out.println("Column Names : " + Arrays.toString(rawModel.getNames()));

MOJO:

// Let's assume you have a MOJO model saved as gbm_prostate_binomial.zip
// You would need to load your model as below:
hex.genmodel.GenModel mojo = MojoModel.load("gbm_prostate_binomial.zip");

// Now you can get the list of predictors as below:
System.out.println("isSupervised : " + mojo.isSupervised());
System.out.println("Column Names : " + Arrays.toString(mojo.getNames()));
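
If only the predictor names are needed, the column list can be trimmed to the number of columns the model reports. This is a rough sketch under one assumption that is not confirmed here: that the first getNumCols() entries of getNames() are the predictors. The class PredictorNames is a hypothetical helper, and the array below (including the RESPONSE entry) is made-up stand-in data so that the snippet runs without h2o-genmodel.jar.

```java
import java.util.Arrays;

// Hypothetical helper, not part of h2o-genmodel.
final class PredictorNames {

    // Returns the first numCols entries of names.
    // Assumption: the model lists its predictors before any response column.
    static String[] predictors(String[] names, int numCols) {
        return Arrays.copyOfRange(names, 0, Math.min(numCols, names.length));
    }

    public static void main(String[] args) {
        // Stand-in for getNames(); "RESPONSE" is a hypothetical response column.
        String[] names = {"ID", "AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON", "RESPONSE"};
        // Stand-in for getNumCols().
        int numCols = 8;
        System.out.println(Arrays.toString(predictors(names, numCols)));
    }
}
```

With a real model, the call would look like predictors(mojo.getNames(), mojo.getNumCols()).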

To get help on using MOJO and POJO models, visit the following:

That’s it, enjoy!!

Scoring H2O model with TIBCO StreamBase

If you are using H2O models with StreamBase for scoring, this is what you have to do:

  1. Get the model as Java code (POJO model).
  2. Get h2o-genmodel.jar (download it from the H2O cluster).
    1. Alternatively, you can use the REST API (works in every H2O version) as below to download h2o-genmodel.jar:
      curl http://localhost:54321/3/h2o-genmodel.jar > h2o-genmodel.jar
  3. Create the project in StreamBase and add the H2O model Java file (POJO) to the project.
  4. Change the H2O operator in StreamBase to use the POJO.
  5. Add h2o-genmodel.jar to the project’s Java Build Path\Libraries.

After that, you can use the H2O model in StreamBase.

That’s it, enjoy!!

Building GBM model in R and exporting POJO and MOJO model

Get the dataset:

Training:

http://h2o-training.s3.amazonaws.com/pums2013/adult_2013_train.csv.gz

Test:

http://h2o-training.s3.amazonaws.com/pums2013/adult_2013_test.csv.gz

Here is the script to build the GBM model and export the POJO and MOJO models:

library(h2o)
h2o.init()

# Importing Dataset
trainfile <- file.path("/Users/avkashchauhan/learn/adult_2013_train.csv.gz")
adult_2013_train <- h2o.importFile(trainfile, destination_frame = "adult_2013_train")
testfile <- file.path("/Users/avkashchauhan/learn/adult_2013_test.csv.gz")
adult_2013_test <- h2o.importFile(testfile, destination_frame = "adult_2013_test")

# Display Dataset
adult_2013_train
adult_2013_test

# Feature Engineering
actual_log_wagp <- h2o.assign(adult_2013_test[, "LOG_WAGP"], key = "actual_log_wagp")

for (j in c("COW", "SCHL", "MAR", "INDP", "RELP", "RAC1P", "SEX", "POBP")) {
 adult_2013_train[[j]] <- as.factor(adult_2013_train[[j]])
 adult_2013_test[[j]] <- as.factor(adult_2013_test[[j]])
}
predset <- c("RELP", "SCHL", "COW", "MAR", "INDP", "RAC1P", "SEX", "POBP", "AGEP", "WKHP", "LOG_CAPGAIN", "LOG_CAPLOSS")

# Building GBM Model:
log_wagp_gbm_grid <- h2o.gbm(x = predset,
 y = "LOG_WAGP",
 training_frame = adult_2013_train,
 model_id = "GBMModel",
 distribution = "gaussian",
 max_depth = 5,
 ntrees = 110,
 validation_frame = adult_2013_test)

log_wagp_gbm_grid

# Prediction 
h2o.predict(log_wagp_gbm_grid, adult_2013_test)

# Download POJO Model:
h2o.download_pojo(log_wagp_gbm_grid, "/Users/avkashchauhan/learn", get_jar = TRUE)

# Download MOJO model:
h2o.download_mojo(log_wagp_gbm_grid, "/Users/avkashchauhan/learn", get_genmodel_jar = TRUE)

You will see GBMModel.java (the POJO model) and GBMModel.zip (the MOJO model) in the directory where you saved these models. The file names follow the model_id set above.

That's it, enjoy!

 

Using RESTful API to get POJO and MOJO models in H2O

 

CURL API for listing models:

http://<hostname>:<port>/3/Models/

CURL API for downloading a specific POJO model:

http://<hostname>:<port>/3/Models.java/model_name

CURL API for downloading a specific MOJO model:

http://<hostname>:<port>/3/Models/model_name/mojo

Here is an example:

curl -X GET "http://localhost:54323/3/Models"
curl -X GET "http://localhost:54323/3/Models.java/deeplearning_model" > NAME_IT

curl -X GET "http://localhost:54323/3/Models.java/deeplearning_model" > dl_model.java
curl -X GET "http://localhost:54323/3/Models/glm_model/mojo" > myglm_mojo.zip
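
When these calls are made from Java instead of curl, it can help to build the URLs in one place. The following is a small sketch covering only the model listing and MOJO endpoints shown above. The class H2oModelUrls and its method names are hypothetical, and the host, port and model name are taken from the example.

```java
import java.net.URI;

// Hypothetical helper that assembles two of the endpoint URLs listed above.
final class H2oModelUrls {
    private final String base;

    H2oModelUrls(String host, int port) {
        this.base = "http://" + host + ":" + port + "/3/Models";
    }

    // URL that lists all models.
    URI listModels() {
        return URI.create(base);
    }

    // URL that returns the MOJO zip for the given model.
    URI mojo(String modelId) {
        return URI.create(base + "/" + modelId + "/mojo");
    }

    public static void main(String[] args) {
        H2oModelUrls urls = new H2oModelUrls("localhost", 54323);
        System.out.println(urls.listModels());
        System.out.println(urls.mojo("glm_model"));
    }
}
```

The resulting URI values can be passed to any HTTP client. The sketch itself does not contact the H2O server.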

That's it, enjoy!!