When building classification models in H2O, you can see the variable importance table in the Flow UI. It looks like this:
Most users work in a Python or R shell, so it is often useful to pull this variable importance table into that shell. This is what we will do in the next steps.
If we want to plot the variable importance graph we can use the following script:
import matplotlib.pyplot as plt
import numpy as np  # needed for np.arange below
plt.rcdefaults()
fig, ax = plt.subplots()
variables = mymodel._model_json['output']['variable_importances']['variable']  # mymodel is a trained H2O model
y_pos = np.arange(len(variables))
scaled_importance = mymodel._model_json['output']['variable_importances']['scaled_importance']
ax.barh(y_pos, scaled_importance, align='center', color='green', ecolor='black')
ax.set_yticks(y_pos)
ax.set_yticklabels(variables)
ax.invert_yaxis()  # first row of the table ends up at the top of the chart
ax.set_xlabel('Scaled Importance')
ax.set_title('Variable Importance')
plt.show()
Here is what the variable importance graph looks like:
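When a model has many predictors, the bar chart can get crowded. The following is a minimal sketch, assuming the `variables` and `scaled_importance` lists from the script above, of trimming them to the strongest few before plotting. The helper name `top_n` is hypothetical and not part of H2O, and the sample values are made up for illustration.

```python
def top_n(variables, scaled_importance, n=10):
    """Return the n highest-scoring variables and their scores, highest first."""
    # Pair each variable with its score and sort by score, descending.
    pairs = sorted(zip(variables, scaled_importance),
                   key=lambda pair: pair[1], reverse=True)
    top = pairs[:n]
    names = [name for name, _ in top]
    scores = [score for _, score in top]
    return names, scores


# Illustrative values only, not output from a real model.
names, scores = top_n(["age", "income", "zip", "gender"],
                      [1.0, 0.62, 0.05, 0.31], n=3)
print(names)
print(scores)
```

The two returned lists can be passed to `ax.barh` and `ax.set_yticklabels` in place of the full lists.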
If we want to see the variable importance metrics directly from the model in Python, we can do the following:
mymodel._model_json['output']['variable_importances'].as_data_frame()
The results are shown below:
That's it, enjoy!!
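As an alternative to reaching into the private `_model_json` attribute, H2O models also expose a public `varimp()` method. The sketch below assumes that `varimp()` returns a list of tuples in the column order `variable`, `relative_importance`, `scaled_importance`, `percentage`, and that it may return `None` for a model without variable importances. The helper name `varimp_as_dicts` is hypothetical.

```python
def varimp_as_dicts(model):
    """Convert the rows from model.varimp() into dictionaries keyed by column name."""
    # Assumed column order of the variable importance table.
    columns = ("variable", "relative_importance",
               "scaled_importance", "percentage")
    # Treat a missing table (None) as an empty list.
    rows = model.varimp() or []
    return [dict(zip(columns, row)) for row in rows]
```

With a trained model, `varimp_as_dicts(mymodel)` gives one dictionary per variable. If a pandas DataFrame is preferred, `mymodel.varimp(use_pandas=True)` should return one directly.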