Changing categorical level names with the set_levels API in H2O

Sometimes you may need to rename the levels of a categorical column in a dataset. This can be done by calling set_levels() on an H2OFrame column, as shown below:

>>> df = h2o.import_file("/Users/avkashchauhan/src/github.com/h2oai/h2o-3/smalldata/iris/iris.csv")
Parse progress: |█████████████████████████████████████████████████████████████████████████████| 100%
>>> df
  C1    C2    C3    C4  C5
----  ----  ----  ----  -----------
 5.1   3.5   1.4   0.2  Iris-setosa
 4.9   3     1.4   0.2  Iris-setosa
 4.7   3.2   1.3   0.2  Iris-setosa
 4.6   3.1   1.5   0.2  Iris-setosa
 5     3.6   1.4   0.2  Iris-setosa
 5.4   3.9   1.7   0.4  Iris-setosa
 4.6   3.4   1.4   0.3  Iris-setosa
 5     3.4   1.5   0.2  Iris-setosa
 4.4   2.9   1.4   0.2  Iris-setosa
 4.9   3.1   1.5   0.1  Iris-setosa

[150 rows x 5 columns]

>>> df['C5'].levels()
[['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']]
>>> df['C5'].set_levels(['Iris_A', 'Iris_B', 'Iris_C'])
C5
------
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A
Iris_A

[150 rows x 1 column]

>>> df['C5'].levels()
[['Iris_A', 'Iris_B', 'Iris_C']]
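
The session above suggests that set_levels() matches new names to old ones by position: the first entry in the new list replaces the first entry returned by levels(), and so on. The plain-Python sketch below illustrates that positional mapping only. rename_levels is a hypothetical helper written for this post, not part of H2O, and it says nothing about how H2O stores categoricals internally.

```python
def rename_levels(values, old_levels, new_levels):
    """Map each value to the new level at the same position as its old level.

    Hypothetical helper for illustration; it is not an H2O function.
    """
    # The new list must line up one-to-one with the existing levels.
    if len(old_levels) != len(new_levels):
        raise ValueError("new_levels must have the same length as old_levels")
    # Pair levels by position: old_levels[i] becomes new_levels[i].
    mapping = dict(zip(old_levels, new_levels))
    # A value that is not in old_levels raises KeyError here.
    return [mapping[v] for v in values]


old_levels = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
new_levels = ['Iris_A', 'Iris_B', 'Iris_C']
column = ['Iris-setosa', 'Iris-virginica', 'Iris-versicolor', 'Iris-setosa']

renamed = rename_levels(column, old_levels, new_levels)
print(renamed)
```

Because the mapping is positional, it is worth checking levels() first so the new names go in the intended order. Depending on your H2O version, set_levels() may return a new frame instead of changing the column in place; if levels() still shows the old names afterwards, assigning the result back with df['C5'] = df['C5'].set_levels([...]) is the safer pattern.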

 

Thank you, enjoy!!
