In this example we will launch an H2O machine learning cluster using the pysparkling package. You can visit my github and this article to learn more about the code execution explained here.
First, install pysparkling in a Python 2.7 setup as below:
> pip install -U h2o_pysparkling_2.1
Now set SPARK_HOME to your Spark installation before launching the pysparkling shell:
SPARK_HOME=/Users/avkashchauhan/tools/spark-2.1.0-bin-hadoop2.6
Launch pysparkling shell:
~/tools/sw2/sparkling-water-2.1.14 $ bin/pysparkling
Python script to launch the H2O cluster in pysparkling:
## Importing Libraries
from pysparkling import *
import h2o

## Setting H2O Conf Object
h2oConf = H2OConf(sc)
h2oConf

## Setting H2O Conf for different port
h2oConf.set_client_port_base(54300)
h2oConf.set_node_base_port(54300)

## Get H2O Conf Object to see the configuration
h2oConf

## Launching H2O Cluster
hc = H2OContext.getOrCreate(spark, h2oConf)

## Getting H2O Cluster status
h2o.cluster_status()
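If you are not sure whether the base port you want is available, a quick check with the Python standard library can help before you set it on the conf object. This is only a rough sketch: the helper names is_port_free and first_free_port are hypothetical (not part of pysparkling or h2o), and it assumes you are checking the loopback address on the machine where the shell runs, which may differ from the IP the H2O nodes actually bind to.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        # Binding failed, most likely because the port is already in use.
        return False
    finally:
        sock.close()


def first_free_port(start=54300, attempts=20, host="127.0.0.1"):
    """Return the first bindable port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port, host):
            return port
    return None


# Hypothetical usage: pick a candidate base port starting from 54300.
candidate = first_free_port(54300)
print(candidate)
```

The value found this way could then be passed to h2oConf.set_client_port_base and h2oConf.set_node_base_port as in the script above. Note that the check is only a snapshot, so another process could still take the port before the cluster starts.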
Now if you check the Sparkling Water configuration, you will see that H2O is running on the given IP and on port 54300, as configured:
Sparkling Water configuration:
backend cluster mode : internal
workers : None
cloudName : Not set yet, it will be set automatically before starting H2OContext.
flatfile : true
clientBasePort : 54300
nodeBasePort : 54300
cloudTimeout : 60000
h2oNodeLog : INFO
h2oClientLog : WARN
nthreads : -1
drddMulFactor : 10
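If you would rather check the ports in code than by eye, the printed configuration can be turned into a dictionary. The sketch below assumes you have the configuration text above available as a plain string (for example, pasted from the shell); parse_sparkling_conf is a hypothetical helper written for this post, not part of pysparkling.

```python
def parse_sparkling_conf(text):
    """Parse 'key : value' lines of the printed configuration into a dict."""
    settings = {}
    for line in text.splitlines():
        # Split on the first ' : ' only, so values containing colons survive.
        key, sep, value = line.partition(" : ")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Configuration text copied from the output shown above.
printed_conf = """Sparkling Water configuration:
backend cluster mode : internal
workers : None
cloudName : Not set yet, it will be set automatically before starting H2OContext.
flatfile : true
clientBasePort : 54300
nodeBasePort : 54300
cloudTimeout : 60000
h2oNodeLog : INFO
h2oClientLog : WARN
nthreads : -1
drddMulFactor : 10"""

conf = parse_sparkling_conf(printed_conf)

# Both ports should match the value set on the conf object earlier.
ports_ok = (conf.get("clientBasePort") == "54300"
            and conf.get("nodeBasePort") == "54300")
print(ports_ok)
```

All values are kept as strings, so compare against "54300" rather than the integer 54300, or convert with int() first.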
That's it, enjoy!!