This sample generates a GBM model with the H2O library in R and then consumes the model from Java for prediction.
Here is the R script that generates the sample model using H2O:
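# Directory where the MOJO zip and h2o-genmodel.jar will be saved
setwd("/tmp/resources/")
library(h2o)
h2o.init()

# Load the iris data set into H2O
df = iris
h2o_df = as.h2o(df)

# Response and predictor columns
y = "Species"
x = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Train the GBM model and print its summary
model = h2o.gbm(y = y, x = x, training_frame = h2o_df)
model

# Download the MOJO together with h2o-genmodel.jar
h2o.download_mojo(model, get_genmodel_jar = TRUE)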
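Here is the Java code that uses the model for prediction. It loads two unzipped copies of the MOJO: unzipped_orig (unchanged) and unzipped_modified (with the edited model.ini described further down):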
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class main {
  // Print a label followed by the comma-separated class probabilities.
  static void printIt(String message, MultinomialModelPrediction p) {
    System.out.println("");
    System.out.println(message);
    for (int i = 0; i < p.classProbabilities.length; i++) {
      if (i > 0) {
        System.out.print(",");
      }
      System.out.print(p.classProbabilities[i]);
    }
    System.out.println("");
  }

  public static void main(String[] args) throws Exception {
    // Original MOJO: column names are exactly as exported from H2O.
    EasyPredictModelWrapper model_orig = new EasyPredictModelWrapper(MojoModel.load("unzipped_orig"));
    {
      // All four features supplied under their original names.
      RowData row = new RowData();
      row.put("Sepal.Length", "1");
      row.put("Sepal.Width", "1");
      row.put("Petal.Length", "1");
      row.put("Petal.Width", "1");
      MultinomialModelPrediction p = model_orig.predictMultinomial(row);
      printIt("All 1s, orig", p);
    }
    {
      // Empty row: every feature is treated as NA.
      RowData row = new RowData();
      MultinomialModelPrediction p = model_orig.predictMultinomial(row);
      printIt("All NAs, orig", p);
    }
    {
      // "sepwid" is not a column of the original model, so Sepal.Width is NA.
      RowData row = new RowData();
      row.put("Sepal.Length", "1");
      row.put("sepwid", "1");
      row.put("Petal.Length", "1");
      row.put("Petal.Width", "1");
      MultinomialModelPrediction p = model_orig.predictMultinomial(row);
      printIt("Sepal width NA, orig", p);
    }

    // -------------------

    // Modified MOJO: Sepal.Width was renamed to sepwid in model.ini.
    EasyPredictModelWrapper model_modified = new EasyPredictModelWrapper(MojoModel.load("unzipped_modified"));
    {
      // All four features supplied, using the renamed column.
      RowData row = new RowData();
      row.put("Sepal.Length", "1");
      row.put("sepwid", "1");
      row.put("Petal.Length", "1");
      row.put("Petal.Width", "1");
      MultinomialModelPrediction p = model_modified.predictMultinomial(row);
      printIt("All 1s (with sepwid instead of Sepal.Width), modified", p);
    }
    {
      // Empty row: every feature is treated as NA.
      RowData row = new RowData();
      MultinomialModelPrediction p = model_modified.predictMultinomial(row);
      printIt("All NAs, modified", p);
    }
    {
      // The old name is no longer a column of the modified model, so sepwid is NA.
      RowData row = new RowData();
      row.put("Sepal.Length", "1");
      row.put("Sepal.Width", "1");
      row.put("Petal.Length", "1");
      row.put("Petal.Width", "1");
      MultinomialModelPrediction p = model_modified.predictMultinomial(row);
      printIt("Sepal width NA (with Sepal.Width instead of sepwid), modified", p);
    }
  }
}
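The program above only prints the raw class probabilities. If you also want the most likely class, a minimal sketch like the following picks the index with the highest probability and maps it to a label. The class name PredictedLabel and its methods are hypothetical helpers, not part of h2o-genmodel, and the label order (setosa, versicolor, virginica) is an assumption about how the iris Species levels are stored in the MOJO domain file. As far as I know, MultinomialModelPrediction also carries the predicted label in its own label field, so treat this purely as an illustration of how the probabilities are read.

```java
final class PredictedLabel {
    // Return the index of the largest probability (the first one wins on ties).
    static int argMax(double[] probabilities) {
        if (probabilities == null || probabilities.length == 0) {
            throw new IllegalArgumentException("probabilities must be non-empty");
        }
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return best;
    }

    // Map the most likely class index to its label.
    static String labelFor(double[] probabilities, String[] labels) {
        if (labels == null || labels.length != probabilities.length) {
            throw new IllegalArgumentException("labels and probabilities must have the same length");
        }
        return labels[argMax(probabilities)];
    }

    public static void main(String[] args) {
        // Assumed label order for the iris Species column.
        String[] labels = {"setosa", "versicolor", "virginica"};
        // Probabilities taken from the "All 1s, orig" case of this sample.
        double[] p = {0.7998234476072545, 0.15127335891610785, 0.04890319347663747};
        System.out.println(labelFor(p, labels));
    }
}
```

In the real program you would pass p.classProbabilities in place of the hard-coded array.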
After the MOJO is downloaded and unzipped, you can see its model.ini as below:
[info]
h2o_version = 3.10.4.8
mojo_version = 1.20
license = Apache License Version 2.0
algo = gbm
algorithm = Gradient Boosting Machine
endianness = LITTLE_ENDIAN
category = Multinomial
uuid = 7712689150025610456
supervised = true
n_features = 4
n_classes = 3
n_columns = 5
n_domains = 1
balance_classes = false
default_threshold = 0.5
prior_class_distrib = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
model_class_distrib = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
timestamp = 2017-05-23T08:19:42.961-07:00
n_trees = 50
n_trees_per_class = 3
distribution = multinomial
init_f = 0.0
offset_column = null

[columns]
Sepal.Length
Sepal.Width
Petal.Length
Petal.Width
Species

[domains]
4: 3 d000.txt
If you decide to modify model.ini by renaming a column (e.g. Sepal.Width to sepwid), the edited file looks like this. The new name must match exactly what the Java code passes to row.put:
[info]
h2o_version = 3.10.4.8
mojo_version = 1.20
license = Apache License Version 2.0
algo = gbm
algorithm = Gradient Boosting Machine
endianness = LITTLE_ENDIAN
category = Multinomial
uuid = 7712689150025610456
supervised = true
n_features = 4
n_classes = 3
n_columns = 5
n_domains = 1
balance_classes = false
default_threshold = 0.5
prior_class_distrib = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
model_class_distrib = [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
timestamp = 2017-05-23T08:19:42.961-07:00
n_trees = 50
n_trees_per_class = 3
distribution = multinomial
init_f = 0.0
offset_column = null

[columns]
Sepal.Length
sepwid
Petal.Length
Petal.Width
Species

[domains]
4: 3 d000.txt
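The rename above was done by hand in a text editor. If you would rather script it, here is a minimal sketch that rewrites a column name only inside the [columns] section and leaves every other line alone. ModelIniColumnRenamer and renameColumn are hypothetical names made up for this example; the sketch works on lines already read into memory and assumes one column name per line, as in the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ModelIniColumnRenamer {
    // Return a copy of the model.ini lines with oldName replaced by newName,
    // but only for entries that sit inside the [columns] section.
    static List<String> renameColumn(List<String> iniLines, String oldName, String newName) {
        List<String> out = new ArrayList<>(iniLines.size());
        boolean inColumns = false;
        for (String line : iniLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
                // A section header such as [info], [columns] or [domains].
                inColumns = trimmed.equals("[columns]");
                out.add(line);
            } else if (inColumns && trimmed.equals(oldName)) {
                out.add(newName);
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened model.ini content, for illustration only.
        List<String> ini = Arrays.asList(
            "[info]",
            "algo = gbm",
            "",
            "[columns]",
            "Sepal.Length",
            "Sepal.Width",
            "Petal.Length",
            "Petal.Width",
            "Species",
            "",
            "[domains]",
            "4: 3 d000.txt");
        for (String line : renameColumn(ini, "Sepal.Width", "sepwid")) {
            System.out.println(line);
        }
    }
}
```

To apply it to a real file you could read the lines with java.nio.file.Files.readAllLines and write the result back with Files.write, ideally on a copy of the unzipped MOJO so the original stays intact.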
Now we can run the compiled Java program to test both models:
$ java -cp .:h2o-genmodel.jar main

All 1s, orig
0.7998234476072545,0.15127335891610785,0.04890319347663747

All NAs, orig
0.009344361534466918,0.9813250958541073,0.009330542611425827

Sepal width NA, orig
0.7704658301004306,0.19829292017147707,0.03124124972809238

All 1s (with sepwid instead of Sepal.Width), modified
0.7998234476072545,0.15127335891610785,0.04890319347663747

All NAs, modified
0.009344361534466918,0.9813250958541073,0.009330542611425827

Sepal width NA (with Sepal.Width instead of sepwid), modified
0.7704658301004306,0.19829292017147707,0.03124124972809238
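The output shows that each case gives the same probabilities for the original and the modified model, so the rename changed only the column name and not the predictions. Instead of comparing the numbers by eye, you could check this in code. The sketch below is one way to do it; PredictionComparison, sameProbabilities and the tolerance value are assumptions made for this example and are not part of h2o-genmodel.

```java
final class PredictionComparison {
    // True when both arrays have the same length and every pair of entries
    // differs by at most the given tolerance.
    static boolean sameProbabilities(double[] a, double[] b, double tolerance) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (Math.abs(a[i] - b[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "All 1s" probabilities from the original and the modified model, copied from the output above.
        double[] orig = {0.7998234476072545, 0.15127335891610785, 0.04890319347663747};
        double[] modified = {0.7998234476072545, 0.15127335891610785, 0.04890319347663747};
        // "Sepal width NA" probabilities, which should differ from the "All 1s" case.
        double[] widthMissing = {0.7704658301004306, 0.19829292017147707, 0.03124124972809238};

        System.out.println(sameProbabilities(orig, modified, 1e-12));
        System.out.println(sameProbabilities(orig, widthMissing, 1e-12));
    }
}
```

In the real program you would pass the classProbabilities arrays of the two predictions instead of hard-coded values.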
That's it, enjoy!