Understanding Linear Regression Algorithm in Data Modeling: A Step-by-Step Guide with Examples

Created 01/03/2024
Modified 01/03/2024

This blog explains the Linear Regression algorithm, one way to carry out Data Modeling (the fourth step in the CRISP-DM model).

CRISP-DM (Cross Industry Standard Process for Data Mining) provides a structured approach to planning a data mining project. The model is an idealized sequence of the following phases:

Business Understanding

Data Understanding

Data Preparation

Data Modeling

Model Evaluation

Model Deployment

Data Modeling uses machine learning algorithms, in which the machine learns from data, much as humans learn from experience.

Machine learning methods are commonly classified into two categories:

Supervised learning methods: The historical data comes with labels. Regression and classification algorithms fall under this category.

Unsupervised learning methods: No predefined labels are assigned to the historical data. Clustering algorithms fall under this category.

For example, predicting a company's revenue from historical data is a regression problem, while classifying whether a person is likely to default on a loan is a classification problem.

How does regression work?

Let’s consider an example: a company could predict its sales based on the money it puts into advertising.

Previous data of spending on advertising and actual sales

Advertising expenditure (in thousands)	Sales (in lakhs)
20	11
30	23
11	6
14	7
45	44.4

You would like to know what your sales would be if you spend an amount X on advertising.

Domain expertise helps in judging whether predictions are sensible. The company’s advertising team, for instance, can give a rough idea of how a change in advertising expenditure affects sales. But to estimate what amount of sales would be generated, and to check whether a relationship between advertising expenditure and sales exists at all, you can use a regression algorithm to build a model and make predictions.

Let’s plot a graph of Advertising Expenditure versus Sales.

Independent variable: The variable on the X-axis, which is used for prediction (here, Advertising Expenditure).

Dependent variable: The variable on the Y-axis, which we want to predict (here, Sales).

Equation of a straight line: y = mx + c, where m is the slope of the line and c is the intercept.

What is the significance of m and c in the equation of a straight line?

‘m’ is the slope: it gives the change in Y for a one-unit change in X, that is, how strongly Sales responds to Advertising.

‘c’ in the above example is the amount of Sales when no money is spent on Advertising, that is, when X = 0.
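The roles of m and c can be seen in a small sketch. The values of m and c below are hypothetical, picked only for illustration and not fitted to the table; `predict` is a helper name introduced here.

```python
# Hypothetical slope and intercept, chosen only to illustrate their meaning.
m = 1.0   # change in Sales for one extra unit of Advertising
c = 2.0   # Sales when Advertising is zero

def predict(x):
    """Value of the straight line y = m*x + c at advertising spend x."""
    return m * x + c

at_zero = predict(0)                   # equals the intercept c
step = predict(21) - predict(20)       # equals the slope m

print("Prediction at X = 0:", at_zero)
print("Change in prediction for one extra unit of X:", step)
```

Changing m tilts the line, while changing c shifts it up or down without changing its tilt.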

Best fit line: The line that best fits the scatter plot. What does best fit mean, and how do we determine whether a line is the best fit?

Residual: Residuals are used to find the best fit line. Every data point has a residual, which is the difference between the actual value and the predicted value (the value of the point on the line at the same X). Let’s denote it by E (error).

E = Actual – Predicted (for every data point)

The best fit line minimizes the total squared error, i.e. it minimizes e_1² + e_2² + … + e_n².

This sum is also called the Residual Sum of Squares (RSS). So, choose the values of m and c that minimize the RSS.
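As a rough illustration, the residuals and the RSS can be computed for any candidate line on the table's data. The two candidate lines below are hypothetical guesses, not fitted values, and `residuals` and `rss` are helper names introduced for this sketch.

```python
# Data from the table: advertising expenditure (x) and sales (y).
x = [20, 30, 11, 14, 45]
y = [11, 23, 6, 7, 44.4]

def residuals(m, c):
    """Actual minus predicted value for every data point, for y = m*x + c."""
    return [yi - (m * xi + c) for xi, yi in zip(x, y)]

def rss(m, c):
    """Residual Sum of Squares: the sum of the squared residuals."""
    return sum(e ** 2 for e in residuals(m, c))

# Two hypothetical candidate lines, chosen by hand for comparison.
rss_a = rss(1.0, -5.0)
rss_b = rss(1.2, -10.0)

print("RSS of candidate A (m=1.0, c=-5.0):", round(rss_a, 2))
print("RSS of candidate B (m=1.2, c=-10.0):", round(rss_b, 2))
print("Better fit:", "A" if rss_a < rss_b else "B")
```

The candidate with the smaller RSS sits closer to the data points overall, which is what "better fit" means here.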

Let’s write E in terms of m and c.

E = e_i = y_i (actual) – y_i (predicted)

e_i = y_i – m x_i – c

In machine learning, a cost function is defined for a problem and is then minimized or maximized as required. For the regression described above, the cost function is the Residual Sum of Squares.
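Substituting the expression for e_i into the sum of squares gives the cost function explicitly in terms of m and c. Here n is the number of data points, and the notation RSS(m, c) is only a shorthand introduced for this sketch:

```latex
\mathrm{RSS}(m, c) = \sum_{i=1}^{n} e_i^{2} = \sum_{i=1}^{n} \left( y_i - m x_i - c \right)^{2}
```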

How to minimize a cost function?

Differentiate the cost function with respect to m and c and set the derivatives equal to zero.

Gradient descent: Start with some values of ‘m’ and ‘c’, then iteratively move to better values of ‘m’ and ‘c’ that reduce the cost function.
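Both approaches can be sketched on the table's data in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the learning rate, the number of steps, the starting point (0, 0) and the function names are all choices made for this example.

```python
# Data from the table: advertising expenditure (x) and sales (y).
x = [20, 30, 11, 14, 45]
y = [11, 23, 6, 7, 44.4]
n = len(x)

def closed_form():
    """Solve dRSS/dm = 0 and dRSS/dc = 0 directly for m and c."""
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c

def gradient_descent(learning_rate=1e-4, steps=200_000):
    """Start at m = 0, c = 0 and repeatedly step against the gradient of RSS."""
    m, c = 0.0, 0.0
    for _ in range(steps):
        errors = [yi - (m * xi + c) for xi, yi in zip(x, y)]
        # Partial derivatives of RSS = sum(e_i^2) with e_i = y_i - m*x_i - c.
        grad_m = -2 * sum(e * xi for e, xi in zip(errors, x))
        grad_c = -2 * sum(errors)
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c

m_exact, c_exact = closed_form()
m_gd, c_gd = gradient_descent()

print(f"Closed form:      m = {m_exact:.4f}, c = {c_exact:.4f}")
print(f"Gradient descent: m = {m_gd:.4f}, c = {c_gd:.4f}")
```

On this data both methods should arrive at roughly m ≈ 1.15 and c ≈ -9.39. The learning rate has to be small because the X values are large; a noticeably larger rate makes the updates overshoot and diverge.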

SAP Predictive Analytics

Pedro Pascal
