A Gentle Introduction To Machine Learning; SciPy 2013 Presentation

Author: Kyle Kastner, Southwest Research Institute

Track: Machine Learning

This talk will be an introduction to the root concepts of machine learning, starting with simple statistics, then working into parameter estimation, regression, model estimation, and basic classification. These are the underpinnings of many techniques in machine learning, though it is often difficult to find a clear and concise explanation of these basic methods.

Parameter estimation will cover Gaussian parameter estimation of the following types: known variance, unknown mean; known mean, unknown variance; and unknown mean, unknown variance.
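For the unknown-mean, unknown-variance case, a minimal numpy sketch (the data here is hypothetical, not from the talk) shows the maximum-likelihood estimates, which are just the sample mean and sample variance:

```python
import numpy as np

# Hypothetical data: 1000 draws from a Gaussian with unknown mean and variance.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)

# Maximum-likelihood estimates for the unknown-mean, unknown-variance case:
mu_hat = data.mean()        # sample mean
sigma2_hat = data.var()     # biased MLE; pass ddof=1 for the unbiased estimate

print(mu_hat, sigma2_hat)
```

With enough samples both estimates converge to the true parameters (2.0 and 1.5² = 2.25 here).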

Regression will cover linear regression, linear regression using alternate basis functions, Bayesian linear regression, and Bayesian linear regression with model selection.
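Linear regression with an alternate basis can be sketched in a few lines of numpy: build a design matrix whose columns are basis functions of the input, then solve ordinary least squares. This example (my own illustration, using a polynomial basis on synthetic data, not material from the talk) fits a cubic to noisy sine samples:

```python
import numpy as np

# Hypothetical data: noisy samples of one period of a sine curve.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(50)

# Design matrix of polynomial basis functions phi_j(x) = x**j.
degree = 3
Phi = np.vander(x, degree + 1, increasing=True)   # columns: 1, x, x^2, x^3

# Ordinary least squares in the new basis is still linear regression.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```

Swapping `np.vander` for Gaussian bumps or sinusoids changes the basis without changing the fitting machinery, which is the point of the "alternate basis functions" framing.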

Classification will extend the topic of regression, exploring k-means clustering, linear discriminants, logistic regression, and support vector machines, with some discussion of relevance vector machines for “soft” decision making.
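As one concrete instance of the classification methods listed, logistic regression can be written directly as gradient descent on the log-loss. This is a self-contained numpy sketch on made-up two-blob data, not code from the presentation:

```python
import numpy as np

# Toy two-class data (hypothetical): two Gaussian blobs in 2-D.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)),
               rng.normal(+1.0, 0.5, (50, 2))])
y = np.repeat([0, 1], 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Add a bias column, then fit by plain gradient descent on the mean log-loss.
Xb = np.hstack([np.ones((len(X), 1)), X])
w = np.zeros(Xb.shape[1])
for _ in range(2000):
    p = sigmoid(Xb @ w)
    w -= 0.1 * Xb.T @ (p - y) / len(y)   # gradient of the mean log-loss

accuracy = np.mean((sigmoid(Xb @ w) > 0.5) == y)
```

On well-separated blobs like these the classifier should get nearly every training point right; the same loop structure underlies the more elaborate discriminative models mentioned above.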

Starting from simple statistics and working upward, I hope to provide a clear grounding of how basic machine learning works mathematically. Understanding the math behind parameter estimation, regression, and classification will help individuals gain an understanding of the more complicated methods in machine learning. This should help demystify some of the modern approaches to machine learning, leading to better technique selection in real-world applications.


  1. Daren Wilson says:

An excellent brief overview of ML for beginners in that field, of which I am one.
    I’ll be starting Ng’s class through Coursera soon. 

  2. Harvey Summers says:

    This would have been a lot more interesting had we seen the general
    algorithms, code, data, and output. This was a lot of opinion, but very
    little proven fact. 

  3. jj7353praise says:

    “supervised versus unsupervised: all this means is that supervised problems
    typically have a training dataset where you have a set of features and you
    have an output. You use the output to train, and get new data that may or
    may not have labels associated with it” …. thank you sir

  4. panzach says:

    I watched the presentation and the questions but failed to find the
    “discussion of relevance vector machines”

Comments are closed.