Okay, I know that real problems usually are neither linear nor simple. However, looking into the linear regression model is a nice way to figure out what’s going on inside regression models in general. This is mandatory knowledge for every data scientist, and it can help you solve real challenges as well.

Linear regression models aim to predict a numeric value (Y) from one or more variables (X). Mathematically, we can define this relationship as Y = f(X), where Y is known as the dependent variable and X as the independent variable.

Regression models belong to the supervised side of machine learning (the other side is unsupervised) because the algorithms learn from examples where the value of the dependent variable is already known, and use the relationship between independent and dependent variables to predict new values.

But… what does “f” mean in Y = f(X)? It is the regression function responsible for predicting Y from X. Since we are talking about a simple linear regression model, pay attention to the next three questions, because they will change how you think about it:

  1. What is the shape of “f” in a linear regression? Linear, sure!
  2. How can we represent a linear relationship? Using a line (you will understand why in a few minutes)
  3. So what’s the function that defines a line? ax + b (just check any math textbook)

That’s it! A simple linear regression model is given by y = ax + b. Since we are trying to predict Y given X, we just need to find the values of “a” and “b”. We can apply the same logic to figure out what’s going on inside other kinds of regression.

And believe me, finding the values of “a” and “b” is the only thing we’re going to do. It’s nice to know that “a” is also known as the alpha coefficient and represents the slope of the line, while “b” is also known as the beta coefficient and represents the point where the line crosses the y axis (in a two-dimensional plane with x and y).
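To make the roles of “a” and “b” concrete, here is a minimal sketch in Python. The function name `predict` and the coefficient values are made up for illustration only; they do not come from any dataset in this post.

```python
def predict(x, a, b):
    """Return the value of the line y = a*x + b at the point x."""
    return a * x + b


# Hypothetical coefficients, chosen only for illustration.
a, b = 2.0, 1.0

# At x = 0 the line sits exactly at b: this is where it crosses the y axis.
intercept = predict(0, a, b)

# Moving one unit along x changes y by exactly a: this is the slope.
rise_per_unit = predict(4, a, b) - predict(3, a, b)

print(intercept)      # 1.0
print(rise_per_unit)  # 2.0
```

Whatever values you pick for “a” and “b”, the same two checks hold: the prediction at x = 0 is “b”, and each extra unit of x adds “a” to the prediction.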

It’s also nice to know that no model predicts perfectly, so there is an error associated with every prediction. Nothing different here. Let’s name it “e” and formally define simple linear regression as y = ax + b + e
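One common way to find “a” and “b” is ordinary least squares, which picks the line that makes the sum of squared errors as small as possible. This post does not name the method it uses, so treat the sketch below as an assumption rather than the author’s exact procedure. The helper name `fit_line` and the data points are invented for the example.

```python
def fit_line(xs, ys):
    """Estimate a (slope) and b (intercept) of y = a*x + b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Made-up observations, only to show the mechanics.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(xs, ys)

# The error "e" for each point is what is left after the line's prediction.
errors = [y - (a * x + b) for x, y in zip(xs, ys)]

print(f"a = {a:.2f}, b = {b:.2f}")  # a = 1.97, b = 1.09
print([round(e, 2) for e in errors])
```

With a least-squares fit that includes “b”, the errors always add up to (numerically) zero: the line overshoots some points and undershoots others by the same total amount.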

Ok guys, let’s find alpha and beta and give this reading a happy ending.

Step-by-Step – Simple linear regression model from scratch
Support material

About the Author: Weslley Moura

Holds a master’s degree in computer engineering, teaches courses related to data analysis, and is a co-founder of the company Pepsoft Sistemas. A professional passionate about the science behind data and its practical applications. In recent years he has dedicated his time to machine learning projects, and he maintains his personal site, Hacking Analytics, with tips and lessons on the subject.
