Creating, adding and managing H2O Frames in Scala

Creating a new H2O Frame:

To create a new Frame in H2O, call the constructor as below:

import water.fvec.Frame
val df = new Frame()

Adding a frame to another H2O Frame:

To add an H2O Frame to another H2O Frame, do the following:
frame1.add(frame2)
When the add() method is called, it mutates the calling frame. It doesn’t create a new Frame, and the Frame keeps the same Key. It is the same object in memory.
After the call, frame1 depends on frame2. Frame “frame1” has the new columns, but they are actually the data from “frame2”. At first glance it looks as if the data has been duplicated, because there are 2 keys in the DKV (H2O’s distributed key-value store), but there has been no memory copy at all. If you delete “frame2” you will run into an error, because “frame1” still refers to its data.
As for managing memory in the H2O DKV in general, there is no automated way of deleting old Frames during program execution; you need to manually call Frame.delete() on the Frames you no longer need.
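To make the "two keys, no copy" behaviour easier to picture, here is a toy model in plain Scala. It is only a sketch: ToyStore and ToyFrame are hypothetical stand-ins written for this post, not H2O classes, and the real DKV is distributed and works differently internally. The sketch just assumes that a frame holds references to column data rather than the data itself.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a key-value store (NOT the H2O DKV API).
final class ToyStore {
  private val data = mutable.Map.empty[String, Array[Double]]
  def put(key: String, values: Array[Double]): Unit = { data(key) = values }
  def get(key: String): Array[Double] =
    data.getOrElse(key, throw new NoSuchElementException(s"missing column data: $key"))
  def remove(key: String): Unit = { data -= key }
}

// Hypothetical stand-in for a frame: it stores only the keys of its columns.
final class ToyFrame(val key: String, store: ToyStore) {
  private val columnKeys = mutable.ArrayBuffer.empty[String]

  def addColumn(name: String, values: Array[Double]): Unit = {
    val colKey = s"$key/$name"
    store.put(colKey, values)
    columnKeys += colKey
  }

  // Mutates this frame: appends the other frame's column keys, copies no data.
  def add(other: ToyFrame): ToyFrame = {
    columnKeys ++= other.columnKeys
    this
  }

  def numCols: Int = columnKeys.size
  def column(i: Int): Array[Double] = store.get(columnKeys(i))

  // Removes this frame's own column data from the store.
  def delete(): Unit = columnKeys.foreach(k => store.remove(k))
}

object SharedColumnsDemo {
  def main(args: Array[String]): Unit = {
    val store = new ToyStore
    val frame1 = new ToyFrame("frame1", store)
    frame1.addColumn("a", Array(1.0, 2.0))
    val frame2 = new ToyFrame("frame2", store)
    frame2.addColumn("b", Array(3.0, 4.0))

    frame1.add(frame2)                               // frame1 now has 2 columns
    println(frame1.column(1) eq frame2.column(0))    // same array, no copy

    frame2.delete()                                  // frame1 still points at the removed data
    println(scala.util.Try(frame1.column(1)).isFailure)
  }
}
```

In this model, deleting frame2 breaks frame1 for the same reason described above: frame1 never owned a copy of the added column.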

Difference between val and var in Scala with a new Frame:

From the Scala point of view, val dataframeNew = new Frame() doesn’t stop you from changing the dataframeNew frame with dataframeNew.add; it does, however, stop you from reassigning dataframeNew to a different instance of a Frame.
Note: If you had var dataframeNew = new Frame(), then dataframeNew could be set to a completely different Frame. The difference comes from how Scala treats val and var: a val is a fixed reference that cannot be reassigned, while a var can be. Neither one makes the object itself immutable.
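The same val/var behaviour can be tried without an H2O cluster. The sketch below uses a small hypothetical class, MiniFrame, as a stand-in for a mutable Frame (it is not water.fvec.Frame); only the Scala semantics of val and var are being shown.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a mutable frame (NOT the H2O Frame class).
final class MiniFrame {
  val columns: ArrayBuffer[String] = ArrayBuffer.empty
  // Mutates this instance and returns it.
  def add(name: String): MiniFrame = { columns += name; this }
}

object ValVsVarDemo {
  def main(args: Array[String]): Unit = {
    val fixed = new MiniFrame
    fixed.add("a")               // allowed: mutates the object `fixed` refers to
    // fixed = new MiniFrame     // would not compile: reassignment to val

    var rebindable = new MiniFrame
    val first = rebindable       // keep a handle on the original instance
    rebindable.add("b")
    rebindable = new MiniFrame   // allowed: the name now refers to a different instance

    println(fixed.columns.toList)
    println(rebindable eq first)
    println(first.columns.toList)
  }
}
```

Note that after the reassignment the original instance still exists as long as something refers to it. With real H2O Frames, reassigning a var likewise does not remove the old Frame from the DKV, so the manual Frame.delete() call mentioned earlier still applies.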
That’s it, enjoy!