Kevin Markham – Machine Learning with Text in scikit-learn – PyCon 2016

Speaker: Kevin Markham

Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we’ll build and evaluate predictive models from real-world text using scikit-learn.
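To make "transforming text into data" concrete, here is a minimal standard-library sketch of a bag-of-words document-term matrix. It is an illustration only, not code from the tutorial: the example sentences and the helper names `tokenize`, `fit_vocabulary`, and `transform` are hypothetical. The tokenization rule (lowercase, keep tokens of two or more word characters) is an assumption chosen to resemble the default behaviour of scikit-learn's `CountVectorizer`, which builds a comparable matrix in sparse form.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text and keep runs of 2+ word characters.
    return re.findall(r"\b\w\w+\b", text.lower())


def fit_vocabulary(documents):
    # Learn the vocabulary: map each distinct token to a column index,
    # with tokens in alphabetical order.
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def transform(documents, vocabulary):
    # Build one row of token counts per document.
    # Tokens that are not in the learned vocabulary are ignored.
    matrix = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        row = [0] * len(vocabulary)
        for tok, n in counts.items():
            if tok in vocabulary:
                row[vocabulary[tok]] = n
        matrix.append(row)
    return matrix


# Made-up training documents, for illustration only.
train_docs = ["call you tonight", "Call me a cab", "please call me... PLEASE!"]

vocabulary = fit_vocabulary(train_docs)
train_matrix = transform(train_docs, vocabulary)

# A new document is encoded with the vocabulary learned from the training data.
new_matrix = transform(["please don't call me"], vocabulary)

print(sorted(vocabulary))
for row in train_matrix:
    print(row)
```

Each row is now a fixed-length numeric vector, which is the form a classifier can learn from. Note that the new document is encoded using only the vocabulary learned from the training documents, so unseen words such as "don" are dropped.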

Slides can be found at: and

  1. Data School says:

    Hello! I’m the speaker in the video. All of the code and data can be
    downloaded from GitHub:

    If you want to go deeper into the material, I also teach an online course
    about scikit-learn: 

