Jo-fai Chow – Introduction to Machine Learning with H2O and Python

Description
H2O.ai is focused on bringing AI to businesses through software. Its flagship product is H2O, the leading open source platform that makes it easy for financial services, insurance and healthcare companies to deploy machine learning and predictive analytics to solve complex problems.

This tutorial aims to demonstrate the basic usage of H2O with worked examples in Python.

Abstract
About H2O.ai

H2O.ai is focused on bringing AI to businesses through software. Its flagship product is H2O, the leading open source platform that makes it easy for financial services, insurance and healthcare companies to deploy machine learning and predictive analytics to solve complex problems. More than 8,500 organizations and 75,000 data scientists depend on H2O for critical applications like predictive maintenance and operational intelligence. The company accelerates business transformation for 107 Fortune 500 enterprises, 8 of the world’s 12 largest banks, 7 of the 10 largest auto insurance companies and all 5 major telecommunications providers. Notable customers include Capital One, Progressive Insurance, Transamerica, Comcast, Nielsen Catalina Solutions, Macy’s, Walgreens, Kaiser Permanente, and Aetna.

This tutorial aims to demonstrate the basic usage of H2O with worked examples in Python. Code and data for the worked examples will be provided.

Learning Objectives
By the end of the tutorial, participants will be able to:
-Start and connect to a local H2O cluster from Python.
-Start and connect to H2O clusters in the cloud (e.g. AWS) for straightforward distributed machine learning.
-Import data from Python data frames, local files or the web.
-Perform basic data transformation and exploration.
-Train classification and regression models using H2O machine learning algorithms.
-Evaluate model performance and make predictions.
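
The objectives above can be sketched end to end in a few lines of Python. This is a minimal illustration rather than the tutorial's actual material: the function names `predictor_columns` and `run_classification`, the choice of a gradient boosting model, the 80/20 split and the parameter values are all assumptions made for the example. It assumes the `h2o` Python package and a Java runtime are installed.

```python
def predictor_columns(columns, response):
    """Return every column name except the response column (hypothetical helper)."""
    if response not in columns:
        raise ValueError(f"response column {response!r} not found")
    return [name for name in columns if name != response]


def run_classification(csv_path, response, cluster_url=None):
    """Train and evaluate a classifier on one CSV file (illustrative sketch only)."""
    # Imported lazily so the helper above can be used without h2o installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    if cluster_url is None:
        h2o.init()                     # start or attach to a local cluster
    else:
        h2o.connect(url=cluster_url)   # attach to an already running remote cluster

    # Import a local file or URL; h2o.H2OFrame(df) converts a pandas data frame instead.
    frame = h2o.import_file(csv_path)
    frame.describe()                   # basic exploration: types and summary statistics

    # Treat the response as categorical so H2O trains a classifier, not a regressor.
    frame[response] = frame[response].asfactor()

    # Assumed 80/20 train/test split with a fixed seed for repeatability.
    train, test = frame.split_frame(ratios=[0.8], seed=1234)

    # Gradient boosting is an arbitrary choice here; other H2O estimators follow the same pattern.
    model = H2OGradientBoostingEstimator(ntrees=50, seed=1234)
    model.train(
        x=predictor_columns(frame.columns, response),
        y=response,
        training_frame=train,
    )

    performance = model.model_performance(test_data=test)  # metrics on held-out data
    predictions = model.predict(test)                       # predictions as an H2OFrame
    return model, performance, predictions
```

A call such as `run_classification("data.csv", "label")` would run the whole workflow on a local cluster; the file name and column name are placeholders. Leaving the response column numeric instead of calling `asfactor()` would make the same estimator fit a regression model.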

Agenda
-About H2O.ai
-H2O machine learning platform & algorithms
-H2O + Python API
-Basic Extract, Transform and Load (ETL) procedures
-Worked examples: classification and regression

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.