Product details

Note
Reference no. 9-622-100
Published by: Harvard Business Publishing
Originally published in: 2022
Version: 20 July 2022
Revision date: 15-Mar-2024
Length: 18 pages
Data source: Published sources
Topics: Data science

Abstract

This note provides an overview of linear regression for an introductory data science course. It begins with a discussion of correlation and explains why correlation does not necessarily imply causation. The note then describes the method of least squares and how to interpret the r-squared and model coefficient values of a simple linear regression model. Next, the note describes how the interpretation of a model coefficient changes when there are multiple independent variables in the model. Finally, the note explains how to interpret the coefficients on dummy variables in a regression model. The appendix includes R code for implementing all of these topics.
