Starting with version 3.10.0.8, H2O includes a partial dependency plot (PDP) feature whose Java backend handles the multi-scoring of the dataset with the model. This makes creating a PDP much faster.
To get a PDP in H2O you need a model and the original data set used to build that model. Here are a few ways to create a PDP:
If you want to generate a PDP on a single column:
response = h2o.predict(model, data.pdp[, column_name])
If you want to score the full data set instead:
response = h2o.predict(model, data.pdp)
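To build the PDP manually from the model and the data set:
model = prostate.gbm
column_name = "AGE"
data.pdp = data.hex
# Candidate values for the column: its unique quantiles from 5% to 100%
bins = unique(h2o.quantile(data.hex[, column_name], probs = seq(0.05, 1, 0.05)))
mean_responses = c()
for (bin in bins) {
    # Set the column to one fixed value in every row
    data.pdp[, column_name] = bin
    # Single-column form from above; use h2o.predict(model, data.pdp) to score the full data set
    response = h2o.predict(model, data.pdp[, column_name])
    # Average the last prediction column over all rows
    mean_response = mean(response[, ncol(response)])
    mean_responses = c(mean_responses, mean_response)
}
pdp_manual = data.frame(AGE = bins, mean_response = mean_responses)
plot(pdp_manual, type = "l")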
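For comparison with the manual loop, here is a minimal sketch of calling the built-in PDP function, h2o.partialPlot, on the same kind of model. It assumes the h2o R package is installed, that a local H2O cluster can be started, and that the prostate data set shipped with the package is the one in use. The names prostate.gbm and data.hex follow the code above; the predictor list and GBM hyperparameters are illustrative assumptions, not the settings of the original model.

```r
library(h2o)
h2o.init()

# Load the prostate data set bundled with the h2o package (assumed data source)
prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")
data.hex <- h2o.uploadFile(path = prostate_path)
data.hex[, "CAPSULE"] <- as.factor(data.hex[, "CAPSULE"])

# Train a small GBM; predictors and hyperparameters are illustrative only
prostate.gbm <- h2o.gbm(x = c("AGE", "RACE", "PSA", "GLEASON"),
                        y = "CAPSULE",
                        training_frame = data.hex,
                        ntrees = 10, max_depth = 5, learn_rate = 0.1)

# Built-in partial dependence for AGE; the model and frame are passed
# positionally, and plot = FALSE returns the table without drawing it
pdp_age <- h2o.partialPlot(prostate.gbm, data.hex, cols = "AGE", plot = FALSE)
print(pdp_age)
```

The returned table can be plotted or compared with pdp_manual. The two are not expected to match point for point, since the built-in function picks its own grid of values for the column.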