Regression Intro – Practical Machine Learning Tutorial with Python p.2

To begin, what is regression in the context of machine learning? The goal is to take continuous data, find the equation that best fits the data, and be able to forecast a specific value. With simple linear regression, you do this by creating a best-fit line.

From here, we can use the equation of that line, with the date on the x-axis, to forecast what the price will be in the future.
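
As a rough illustration of the idea above, here is a minimal sketch of fitting a best-fit line and using its equation to forecast. The day numbers and prices are made-up values, and fit_line is a hypothetical helper name, not something from this tutorial series. It assumes the standard least-squares formulas for the slope and intercept of y = m*x + b.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    m = numerator / denominator
    # Intercept: the line must pass through the point of means.
    b = mean_y - m * mean_x
    return m, b


# Made-up data: day index on the x-axis, price on the y-axis.
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

m, b = fit_line(days, prices)

# Use the equation of the line to forecast the price for day 6.
forecast = m * 6 + b
print(m, b, forecast)  # 2.0 8.0 20.0 (the made-up points lie exactly on a line)
```

Real price data will not sit exactly on a line, so the forecast is only as good as the fit.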

A popular use of regression is predicting stock prices. Price changes continuously over time, so we treat the price history as a continuous dataset and attempt to forecast the next price in the future.

Regression is a form of supervised machine learning, which is where the scientist teaches the machine by showing it features and then showing it what the correct answer is, over and over. Once the machine is taught, the scientist will usually “test” the machine on some unseen data, where the scientist still knows what the correct answer is, but the machine doesn’t. The machine’s answers are compared to the known answers, and the machine’s accuracy can be measured. If the accuracy is high enough, the scientist may consider actually employing the algorithm in the real world.
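
The train-then-test workflow described above can be sketched roughly as follows. This is an illustration under assumptions, not the code used in this series: the data is synthetic, the 80/20 split is an arbitrary choice, fit_line and r_squared are hypothetical helper names, and "accuracy" is taken to mean the coefficient of determination (R squared), which is a common score for regression.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + b."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return m, mean_y - m * mean_x


def r_squared(ys_true, ys_pred):
    """Coefficient of determination: 1.0 means perfect predictions."""
    mean_y = sum(ys_true) / len(ys_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(ys_true, ys_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in ys_true)
    return 1 - ss_res / ss_tot


# Synthetic data (an assumption): a noisy line y = 3x + 5.
rng = random.Random(0)
xs = [float(i) for i in range(100)]
ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]

# Shuffle, then hold out 20% as "unseen" data for testing.
pairs = list(zip(xs, ys))
rng.shuffle(pairs)
split = int(0.8 * len(pairs))
train, test = pairs[:split], pairs[split:]

# "Teach" the machine: it sees both features and correct answers.
m, b = fit_line([x for x, _ in train], [y for _, y in train])

# "Test" the machine: it sees only features; we compare against known answers.
predictions = [m * x + b for x, _ in test]
score = r_squared([y for _, y in test], predictions)
print(len(train), len(test), score)
```

The key design point is that the test rows are never used when fitting, so the score reflects how the line does on data it has not seen.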


  1. manoj kharol says:

    getting error after 3rd line — “ssl.SSLError: [SSL:
    CERTIFICATE_VERIFY_FAILED] certificate verify failed
    (_ssl.c:645)”…..please resolve my error.

  2. AlphaBetaGamma 96 says:

    I get this error, I’m not sure why! Help please! 🙂

    Quandl.Quandl.DatasetNotFound: Dataset not found. Check Quandl code:
    WIKI/GOOGLE for errors

  3. Jitcha Shivang says:

    It shows requirements satisfied when i pip installed quandl, pandas and
    sklearn but i am getting an error when running the file it says no module
    named ‘pandas’

  4. phoneAntics says:

    If anyone’s getting a “No module named ‘Quandl'” error I fixed mine by
    changing all the ‘Quandl’s to lowercase.

  5. Hoenir Bhullar says:

    Hi Harris, thank you for the vids. Why do you use df both as a variable
    name and as a function name for the data frame? I mean df = df[[ ]]. I
    understand Python lets you do that, compared to a more rigid C/C++ program,
    but I’m confused about how to use the function df now… Could you help,
    please? 🙁 Like, why did you type df[[ ]]? Surely you don’t have to specify
    a data frame as being a data frame?

  6. C .R. Anil says:

    on printing df.head() after modifying the data frame i am getting error as :

    AttributeError: ‘list’ object has no attribute ‘head’

  7. Deepak Kumar says:


    Need help installing Quandl...

    Lost one day...

    Error below: Command “/usr/bin/python -u -c “import setuptools,
    ‘open’, open)(__file__).read().replace(‘\r\n’, ‘\n’), __file__, ‘exec’))”
    install --record /tmp/pip-IMzGtg-record/install-record.txt
    --single-version-externally-managed --compile” failed with error code 1 in

  8. Deepak Kumar says:

    Okay, I was finally able to fix this quandl issue on the Windows version; I
    still need to check on Linux.

    If you install using pip and it still doesn’t work, then just download the
    package from Git and save it to your library package folder, where all the
    files are installed. Make sure to check them.

  9. Nonton Anime says:

    tegar@tegar-ThinkPad-E450:~/machine-learning$ python
    Traceback (most recent call last):
    File “”, line 2, in
    import quandl
    File “/usr/local/lib/python2.7/dist-packages/quandl/”, line 11,
    from .model.merged_dataset import MergedDataset
    line 1, in
    from more_itertools import unique_everseen
    ImportError: No module named more_itertools

  10. anwesh mishra says:

    hi! You can try the below code instead of hardcoding all the keys starting
    with Adj.
    keys = []
    for key in df.columns:
        if key.startswith("Adj. "):
            keys.append(key)  # collect every adjusted column name
    print(keys)
    df = df[keys]
    print(df.head(3))

  11. hachimitsuchai says:

    I am using Python 3.5 as well, but the Quandl module isn’t recognized. I
    installed it using pip3 and conda. I checked all the comments below but
    still no go. Python 3.5 / OS X.

  12. combatLaCarie says:

    Figured out Quandl was lowercase, but now: “module ‘quandl’ has no
    attribute ‘get’” :(

  13. hachimitsuchai says:

    @00:5:00 you said with deep learning you can discover the relationships
    between attributes but not with regression. Why? Regression is all about
    assessing the strength between relationships no?

  14. hachimitsuchai says:

    Also I noticed he has sort of an auto complete in his shell. I’m using Atom
    and Spyder on a Mac. Does anyone know of any packages that have this?

  15. Trung Duong Nguyen Trinh says:

    I got this error:
    KeyError: “[‘Adj.Open’ ‘Adj.High’ ‘Adj.Low’ ‘Adj.Close’ ‘Adj.Volume’] not
    in index”
    Anyone has the same one?

  16. 91Georgina says:

    i did what you mentioned but i have this problem
    ImportError: No module named ‘pandas’
    what can i do?

  17. Mauricio Cardona says:

    Hello! I’ve been stuck on this error. Can somebody please help me?
    line 6 in df[‘HL_PCT’] = (df[‘Adj. High’] - df[‘Adj. Close’]) /
    df[‘Adj. Close’] * 100.0
    TypeError: list indices must be integers or slices, not str

    I’m working with Python 3.5.2 and have exactly the same code.

  18. P3gasus says:

    Pandas installed with no errors here, but IDLE still can’t find it when I
    run import pandas. (Using Miniconda, Python 3.5.)
