For logistic (binary) classification problems, we use the AUC metric to check model performance. Higher is better; as a rule of thumb, a value above 0.8 (80%) is considered good, and a value above 0.9 (90%) means the model is performing very well.
AUC stands for Area Under the Curve. In classification analysis it is used to determine which of the candidate models predicts the classes best. The curve in question is usually the ROC (Receiver Operating Characteristic) curve, in which the true positive rate is plotted against the false positive rate. You can learn more about AUC in this Quora discussion.
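One way to build intuition for the number: the area under the ROC curve equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The sketch below illustrates that reading in plain Python. The function name `pairwise_auc` and the toy labels and scores are made up for illustration; they are not part of H2O or scikit-learn and are not taken from the prostate data.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs that are ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two negatives (0) and two positives (1)
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

toy_auc = pairwise_auc(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 pairs are ordered correctly
```

This double loop is quadratic in the number of rows, so it is only meant for reading, not for large datasets.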
We will also look at the GINI metric, which you can read about on Wikipedia. In this example we will learn how the AUC and GINI model metrics are calculated from the True Positive Rate (TPR) and False Positive Rate (FPR) values for a given test dataset.
Let's build a logistic classification model in H2O using the prostate dataset:
Preparation of H2O environment and dataset:
    ## Importing required libraries
    import h2o
    import sys
    import pandas as pd
    from h2o.estimators.gbm import H2OGradientBoostingEstimator

    ## Starting H2O machine learning cluster
    h2o.init()

    ## Importing dataset
    local_url = "https://raw.githubusercontent.com/h2oai/sparkling-water/master/examples/smalldata/prostate/prostate.csv"
    df = h2o.import_file(local_url)

    ## Defining features and response column
    y = "CAPSULE"
    feature_names = df.col_names
    feature_names.remove(y)

    ## Setting our response column to categorical so the model treats this as a classification problem
    df[y] = df[y].asfactor()
Now we will split the dataset into three sets for training, validation, and testing:
    df_train, df_valid, df_test = df.split_frame(ratios=[0.8,0.1])
    print(df_train.shape)
    print(df_valid.shape)
    print(df_test.shape)
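Note that `split_frame` splits probabilistically rather than exactly, so the three frames only approximately follow the 80/10/10 proportions, and the row counts can change from run to run unless you pass a `seed`. The plain-Python sketch below imitates that idea to show why the sizes vary. `random_split` is a hypothetical helper written for this illustration, not H2O's implementation, and the row count of 380 is an arbitrary stand-in.

```python
import random


def random_split(rows, ratios, seed=None):
    """Hypothetical helper: route each row to a bucket with a uniform random draw.

    `ratios` gives the share of every bucket except the last one,
    which receives whatever is left over.
    """
    if sum(ratios) >= 1.0:
        raise ValueError("ratios must sum to less than 1")

    rng = random.Random(seed)
    buckets = [[] for _ in range(len(ratios) + 1)]
    for row in rows:
        draw = rng.random()
        cumulative = 0.0
        index = len(ratios)  # default: the remainder bucket
        for i, ratio in enumerate(ratios):
            cumulative += ratio
            if draw < cumulative:
                index = i
                break
        buckets[index].append(row)
    return buckets


# Arbitrary stand-in for row indices; sizes are close to, but rarely exactly, 80/10/10
train_rows, valid_rows, test_rows = random_split(range(380), [0.8, 0.1], seed=42)
print(len(train_rows), len(valid_rows), len(test_rows))
```

Fixing the seed makes the split repeatable, which matters when you want to compare AUC values across runs.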
Setting H2O GBM Estimator and building GBM Model:
    prostate_gbm = H2OGradientBoostingEstimator(model_id = "prostate_gbm",
                                                ntrees=500,
                                                learn_rate=0.001,
                                                max_depth=10,
                                                score_each_iteration=True)

    ## Building H2O GBM Model:
    prostate_gbm.train(x = feature_names, y = y, training_frame=df_train, validation_frame=df_valid)

    ## Understand the H2O GBM Model
    prostate_gbm
Generating model performance with training, validation & test datasets:
    train_performance = prostate_gbm.model_performance(df_train)
    valid_performance = prostate_gbm.model_performance(df_valid)
    test_performance = prostate_gbm.model_performance(df_test)
Let’s take a look at the AUC metrics provided by Model performance:
    print(train_performance.auc())
    print(valid_performance.auc())
    print(test_performance.auc())
    ## Called on the model with no arguments, auc() returns the training metric
    print(prostate_gbm.auc())
Let’s take a look at the GINI metrics provided by Model performance:
    print(train_performance.gini())
    print(valid_performance.gini())
    print(test_performance.gini())
    ## Called on the model with no arguments, gini() returns the training metric
    print(prostate_gbm.gini())
Let's generate the predictions using the test dataset:
    predictions = prostate_gbm.predict(df_test)

    ## Here we will get the probability for the 'p1' values from the prediction frame:
    predict_probability = predictions['p1']
Now we will import required scikit-learn libraries to generate AUC manually:
    from sklearn.metrics import roc_curve, auc
    import matplotlib.pyplot as plt
    import random
Let's get the actual response values from the test data frame:
    actual = df_test[y].as_data_frame()
    actual_list = actual['CAPSULE'].tolist()
    print(actual_list)
Now let's get the predicted probabilities from the prediction frame:
    ## predict_probability is the single-column 'p1' frame created from the predictions above
    predictions_temp = predict_probability['p1'].as_data_frame()
    predictions_list = predictions_temp['p1'].tolist()
    print(predictions_list)
Calculating False Positive Rate and True Positive Rate:
Let's calculate the TPR, FPR, and threshold values from the predictions and the actual responses:
– False Positive Rate (fpr)
– True Positive Rate (tpr)
    fpr, tpr, thresholds = roc_curve(actual_list, predictions_list)
    roc_auc = auc(fpr, tpr)
    print(roc_auc)
    print(test_performance.auc())
Note: Above you will see that the AUC value we calculated from the ROC curve is the same as the one given by the model performance for the test dataset.
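To make the TPR/FPR step less of a black box, here is a minimal plain-Python sketch of the same idea: lower the decision threshold one distinct score at a time, record the resulting (FPR, TPR) point, and integrate with the trapezoidal rule. The helpers `roc_points` and `trapezoid_auc` and the toy data are invented for this illustration; scikit-learn's `roc_curve` is more efficient and may drop some intermediate points, so treat this as a reading aid rather than a replacement.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists, one point per distinct threshold, highest first."""
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos

    # Start with a threshold above every score: nothing is predicted positive
    fpr, tpr = [0.0], [0.0]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 0)
        tpr.append(tp / n_pos)  # share of actual positives that were caught
        fpr.append(fp / n_neg)  # share of actual negatives flagged by mistake
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return area


# Hypothetical toy data, not taken from the prostate dataset
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

toy_fpr, toy_tpr = roc_points(toy_labels, toy_scores)
toy_roc_auc = trapezoid_auc(toy_fpr, toy_tpr)
print(toy_fpr)
print(toy_tpr)
print(toy_roc_auc)
```

Passing `actual_list` and `predictions_list` to these helpers should give an area that agrees with `roc_auc` up to floating-point rounding, assuming the labels are the integers 0 and 1.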
Let's plot the ROC curve using matplotlib:
    plt.title('ROC (Receiver Operating Characteristic)')
    plt.plot(fpr, tpr, 'b', label='AUC = %0.4f'% roc_auc)
    plt.legend(loc='lower right')
    ## Diagonal reference line: the ROC curve of a random classifier
    plt.plot([0,1],[0,1],'r--')
    plt.xlim([-0.1,1.2])
    plt.ylim([-0.1,1.2])
    plt.ylabel('True Positive Rate (TPR)')
    plt.xlabel('False Positive Rate (FPR)')
    plt.show()
This is how GINI metric is calculated from AUC:
    GINI = (2 * roc_auc) - 1
    print(GINI)
    print(test_performance.gini())
Note: Above you will see that our calculated GINI value is the same as the one given by the model performance for the test dataset.
That's it, enjoy!!
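The conversion simply rescales AUC from the range 0 to 1 onto the range -1 to 1, so a random classifier (AUC of 0.5) gets a GINI of 0 and a perfect one gets 1. A small sketch of the mapping follows; the function name `gini_from_auc` is made up for this example.

```python
def gini_from_auc(auc_value):
    """Rescale an AUC in [0, 1] to a GINI coefficient in [-1, 1]."""
    if not 0.0 <= auc_value <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    return 2.0 * auc_value - 1.0


# A few reference points on the scale
for example_auc in (0.5, 0.75, 0.9, 1.0):
    print(example_auc, gini_from_auc(example_auc))
```

By this mapping, the rule-of-thumb AUC levels of 0.8 and 0.9 correspond to GINI values of roughly 0.6 and 0.8.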