The Frisch–Waugh–Lovell Theorem

Introduction

In a multiple linear regression, the coefficient on a predictor is usually estimated by fitting a single model that contains all regressors at once. The Frisch–Waugh–Lovell (FWL) theorem states that any ordinary least squares (OLS) coefficient in such a model can equivalently be obtained in two stages: first regress both the dependent variable and the regressor of interest on all the other control variables, then regress the first set of residuals on the second. This residualization procedure yields exactly the same coefficient as the OLS estimate from the full regression.

The Frisch–Waugh–Lovell Theorem Statement

Formally, suppose that the full regression model is \(y=\beta_0+\beta_1x+\beta_2z_1+\dots+\beta_{k+1}z_k+\varepsilon\), where \(x\) is the variable of interest, \(z_1, z_2,\dots,z_k\) are the control variables, and \(\varepsilon\) is the error term. Then the OLS estimate of \(\beta_1\) can be obtained via the following three-step process:

1. Residualize \(y\) on \(z_1, z_2,\dots,z_k\): Regress \(y\) on \(z_1, z_2,\dots,z_k\) (including an intercept) to obtain the residuals \(\tilde{y} = y - \hat{y}(z_1, z_2,\dots,z_k)\).

2. Residualize \(x\) on \(z_1, z_2,\dots,z_k\): Regress \(x\) on \(z_1, z_2,\dots,z_k\) (including an intercept) to obtain the residuals \(\tilde{x} = x - \hat{x}(z_1, z_2,\dots,z_k)\).

3. Estimate the FWL regression: Regress the \(y\)-residuals on the \(x\)-residuals: \( \tilde{y} = \beta_1 \tilde{x} + e \).

The resulting estimate of \(\beta_1\) is numerically identical to the estimated coefficient on \(x\) from the original multiple regression. It therefore measures the relationship between \(x\) and \(y\) after the linear influence of \(z_1,\dots,z_k\) has been removed from both variables.
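
In matrix notation, the three steps collapse into a single formula. As a sketch under notation that is assumed here rather than taken from the statement above, let \(Z\) be the \(n \times (k+1)\) matrix whose columns are a constant and the controls \(z_1,\dots,z_k\) (assumed to have full column rank), and let \(M_Z\) be the matrix that maps any vector to its residuals from a regression on \(Z\):

\[ M_Z = I_n - Z(Z^\top Z)^{-1}Z^\top, \qquad \tilde{y} = M_Z\,y, \qquad \tilde{x} = M_Z\,x. \]

Because \(M_Z\) is symmetric and idempotent (\(M_Z^\top = M_Z\) and \(M_Z M_Z = M_Z\)), the slope from the residual-on-residual regression, which equals the estimated coefficient on \(x\) in the full model, can be written as

\[ \hat{\beta}_1 = \frac{\tilde{x}^\top \tilde{y}}{\tilde{x}^\top \tilde{x}} = \frac{x^\top M_Z\,y}{x^\top M_Z\,x}, \]

provided \(\tilde{x} \neq 0\), that is, provided \(x\) is not an exact linear combination of the columns of \(Z\).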

Illustration of the Frisch–Waugh–Lovell Theorem in R

To illustrate the FWL theorem in R, we’ll create a small data frame with 5 observations:

# Creating the data frame:
df <- data.frame(
  y = c(6.1, 12.8, 14.3, 22.9, 24.0),
  x = c(1, 3, 2, 5, 4),
  z = c(1, 2, 3, 4, 5)
)
# The data frame is:
df
#     y  x z
# 1  6.1 1 1
# 2 12.8 3 2
# 3 14.3 2 3
# 4 22.9 5 4
# 5 24.0 4 5

Then, we’ll estimate the full regression of \(y\) on \(x\) and \(z\):

# Estimating the full regression:
full_mod <- lm(y ~ x + z, data = df)
# Displaying the coefficients:
coef(full_mod)
# (Intercept)           x           z 
#   1.1533333   1.8277778   3.1277778

The output shows that the coefficient on \(x\) in the full model is approximately \(1.83\). This is the estimated change in \(y\) associated with a one-unit increase in \(x\), holding \(z\) fixed. Next, we’ll regress \(y\) and \(x\) separately on \(z\) and save the residuals:

# Regressing y on z and x on z:
y_reg <- lm(y ~ z, data = df)
x_reg <- lm(x ~ z, data = df)
# Saving the residuals:
res_y <- resid(y_reg)
res_x <- resid(x_reg)
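
Before moving on, it can help to check two properties of these residuals. Because each auxiliary regression includes an intercept and \(z\), its residuals should sum to zero and be orthogonal to \(z\) in the sample. The following is a minimal sketch that rebuilds the example data so it runs on its own. The names `res_y_chk`, `res_x_chk`, and `tol` are illustrative and not part of the walkthrough, and the tolerance value is an arbitrary assumption.

```r
# Rebuild the example data so this check runs on its own
df <- data.frame(
  y = c(6.1, 12.8, 14.3, 22.9, 24.0),
  x = c(1, 3, 2, 5, 4),
  z = c(1, 2, 3, 4, 5)
)

# Residualize y and x on z (each regression includes an intercept)
res_y_chk <- resid(lm(y ~ z, data = df))
res_x_chk <- resid(lm(x ~ z, data = df))

# Tolerance for floating-point comparisons (arbitrary illustrative choice)
tol <- 1e-10

# Both residual vectors sum to zero up to floating-point error
abs(sum(res_y_chk)) < tol
abs(sum(res_x_chk)) < tol

# Both residual vectors are orthogonal to the control z
abs(sum(res_y_chk * df$z)) < tol
abs(sum(res_x_chk * df$z)) < tol
```

Since both residual vectors have mean zero, an intercept in a regression of one on the other would be estimated as zero (up to rounding), which is why it can be dropped without changing the slope.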

Then, we’ll regress the residualized \(y\) on the residualized \(x\) with no intercept:

# Regressing the residuals:
fwl_mod <- lm(res_y ~ res_x + 0)   
coef(fwl_mod)
# res_x 
# 1.8277778

The coefficient on the residualized \(x\) is \(1.83\), matching the coefficient on \(x\) from the full regression. This confirms that the FWL procedure filters out the linear influence of the control variable \(z\) from both \(y\) and \(x\). Regressing one set of residuals on the other isolates the part of the relationship between \(x\) and \(y\) that \(z\) cannot linearly explain. In other words, a coefficient in a multiple regression measures the association between \(x\) and \(y\) after the variation attributable to all other predictors has been “partialled out”.
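
The same check extends to several control variables. The sketch below is illustrative only: `fwl_coef` is a hypothetical helper written for this example (it does not come from any package), and the data are simulated with arbitrary coefficients purely to have something to fit.

```r
# Hypothetical helper (not from any package): coefficient on `x_name`
# computed through the FWL residual-on-residual regression.
fwl_coef <- function(data, y_name, x_name, control_names) {
  # Residualize y and x on the controls (intercept included by default)
  res_y <- resid(lm(reformulate(control_names, response = y_name), data = data))
  res_x <- resid(lm(reformulate(control_names, response = x_name), data = data))
  # Regress the y-residuals on the x-residuals without an intercept
  unname(coef(lm(res_y ~ res_x + 0)))
}

# Simulated data, used only for illustration
set.seed(123)
n <- 200
sim <- data.frame(z1 = rnorm(n), z2 = rnorm(n), z3 = rnorm(n))
sim$x <- 0.5 * sim$z1 - 0.3 * sim$z2 + rnorm(n)
sim$y <- 1 + 2 * sim$x + sim$z1 - sim$z2 + 0.5 * sim$z3 + rnorm(n)

# Coefficient on x from the full regression
full_coef <- unname(coef(lm(y ~ x + z1 + z2 + z3, data = sim))["x"])

# Coefficient on x from the FWL procedure
partial_coef <- fwl_coef(sim, "y", "x", c("z1", "z2", "z3"))

# Compare the two estimates (equal up to floating-point error)
all.equal(full_coef, partial_coef)
```

Building the formulas with `reformulate()` keeps the helper independent of how many controls are supplied. The comparison uses `all.equal()` rather than `==` because the two estimates are computed along different numerical paths.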

In summary, the FWL theorem shows that controlling for other variables in a regression is numerically equivalent to first removing their linear influence from both the outcome and the regressor of interest, and then relating what remains of one to what remains of the other.

About the author: Premier Statistics Tutoring provides personalized statistics tutoring and R programming support for university students and professionals, offering guidance across a wide range of statistical analysis and programming topics.