Speaker: Andreas Muller – Research Engineer at NYU Center for Data Science.
Scikit-learn is a machine learning library in Python that has become a valuable tool for many data science practitioners.
This talk covers some of the more advanced aspects of scikit-learn, such as building complex machine learning pipelines, model evaluation, parameter search, and out-of-core learning.
Apart from metrics for model evaluation, the talk covers how to evaluate model complexity and how to tune parameters with grid search and randomized parameter search, including the trade-offs between them. Additionally, it covers out-of-core text feature processing via feature hashing.
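
As a rough illustration of the topics listed above (not taken from the talk itself), the following sketch tunes one pipeline with both grid search and randomized search, then trains a text classifier batch by batch using feature hashing. The synthetic dataset, the toy text batches, the parameter ranges, and all variable names are assumptions chosen for brevity.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# --- Parameter search over a pipeline (synthetic data, assumed for illustration) ---
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search: tries every listed value, so cost grows with the grid size.
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=3)
grid.fit(X, y)

# Randomized search: samples a fixed budget (n_iter) from a distribution,
# so cost is set by the budget rather than by the number of parameters.
rand = RandomizedSearchCV(
    pipe,
    {"clf__C": loguniform(1e-3, 1e2)},
    n_iter=4,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

n_grid_candidates = len(grid.cv_results_["params"])
n_rand_candidates = len(rand.cv_results_["params"])

# --- Out-of-core text learning via feature hashing (toy batches, assumed) ---
# HashingVectorizer is stateless: it needs no fit and keeps no vocabulary,
# so each batch can be transformed independently as it is read.
vectorizer = HashingVectorizer(n_features=2**10)
text_clf = SGDClassifier(random_state=0)

batches = [
    (["good movie", "great film"], [1, 1]),
    (["bad movie", "awful film"], [0, 0]),
]
for texts, labels in batches:
    # partial_fit updates the model incrementally; classes must be declared
    # up front because a single batch may not contain every label.
    text_clf.partial_fit(vectorizer.transform(texts), labels, classes=[0, 1])

hashed_shape = vectorizer.transform(["a new review"]).shape
```

In this sketch the grid evaluates exactly the four listed values of `C`, while the randomized search draws four values from a log-uniform range. In a real out-of-core setting, the batches would be streamed from disk rather than held in a list.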