When the assumptions of ordinary least squares (OLS) are violated, especially when the error terms have unequal variance or are correlated, the OLS coefficient estimates can still be unbiased, but they are no longer efficient and the usual OLS standard errors become unreliable. In such cases, Generalized Least Squares (GLS) and Weighted Least Squares (WLS) are better alternatives. Both methods account for heteroscedasticity or autocorrelation in the errors and, when the error structure is specified correctly, give more precise coefficient estimates than standard OLS.
I chose this topic because many learners stop at OLS when studying regression. But in practice, data rarely behaves perfectly. Especially in time series, financial data, and cross-sectional studies, we often see issues like unequal error variances or correlated residuals. Understanding when and how to use GLS or WLS allows you to fix model inefficiencies and get more accurate results. Whether you’re a statistics student, a researcher, or a data analyst, grasping these methods equips you to deal with real-world data more confidently.
What is Generalized Least Squares (GLS)?
GLS is used when the assumption of constant variance of errors (homoscedasticity) or the independence of errors is violated. In such situations, OLS becomes inefficient. GLS adjusts for this by transforming the model in a way that corrects these issues.
When to Use GLS:
- When residuals are correlated (common in time series data)
- When there’s heteroscedasticity, i.e., variance of error terms is not constant
Key Idea:
Instead of minimizing the plain sum of squared residuals, GLS minimizes the quadratic form (y − Xβ)ᵀ Ω⁻¹ (y − Xβ). This is a generalized weighted sum of squared residuals in which the weighting comes from the inverse of the variance-covariance matrix of the errors.
GLS estimator: β̂_GLS = (XᵀΩ⁻¹X)⁻¹ XᵀΩ⁻¹y
Where Ω is the variance-covariance matrix of the error terms.
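As a minimal sketch of the estimator above, the NumPy code below computes β̂_GLS in two ways: directly from the formula, and by running OLS on data "whitened" with the Cholesky factor of Ω. The function names, the simulated data, and the random positive definite Ω are illustrative assumptions, and Ω is treated as known, which is rarely the case in practice.

```python
import numpy as np

def gls_fit(X, y, omega):
    """GLS estimate (X' Ω^-1 X)^-1 X' Ω^-1 y, for a symmetric positive definite Ω."""
    # Solve linear systems rather than forming Ω^-1 explicitly.
    omega_inv_X = np.linalg.solve(omega, X)
    omega_inv_y = np.linalg.solve(omega, y)
    return np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)

def gls_fit_whitened(X, y, omega):
    """Same estimate via OLS on L^-1 X and L^-1 y, where Ω = L L'."""
    L = np.linalg.cholesky(omega)
    X_star = np.linalg.solve(L, X)
    y_star = np.linalg.solve(L, y)
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta

# Simulated data (hypothetical): intercept plus one regressor.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Illustrative Ω: an arbitrary symmetric positive definite matrix.
A = rng.normal(size=(n, n))
omega = A @ A.T / n + np.eye(n)

# Errors drawn with covariance Ω, then the response.
errors = np.linalg.cholesky(omega) @ rng.normal(size=n)
y = X @ np.array([1.0, 2.0]) + errors

beta_direct = gls_fit(X, y, omega)
beta_whitened = gls_fit_whitened(X, y, omega)
print(beta_direct, beta_whitened)
```

The two results agree up to floating-point error, which illustrates the "transforming the model" view of GLS: after whitening, the transformed errors are uncorrelated with equal variance, so OLS applies.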
What is Weighted Least Squares (WLS)?
WLS is a special case of GLS used when the errors are uncorrelated but have unequal variances. Instead of assuming all residuals are equally reliable, WLS gives less weight to observations with higher variance and more weight to those with lower variance.
When to Use WLS:
- When data shows clear signs of heteroscedasticity
- When some data points are more reliable than others
WLS Model:
To fix heteroscedasticity, each observation is weighted using the inverse of its error variance.
β̂_WLS = (XᵀWX)⁻¹ XᵀWy
Where W is a diagonal matrix of weights, usually wᵢ = 1/σᵢ² (the inverse of the i-th error variance), so that W = Ω⁻¹ when Ω is diagonal.
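The WLS formula can be sketched in a few lines of NumPy. The example assumes the error standard deviations are known and proportional to the regressor. That assumption, the simulated data, and the helper name `wls_fit` are all illustrative.

```python
import numpy as np

def wls_fit(X, y, weights):
    """WLS estimate (X' W X)^-1 X' W y with W = diag(weights)."""
    Xw = X * weights[:, None]          # each row of X scaled by w_i, i.e. W X
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])

# Assumed error structure: standard deviation grows with x.
sigma = 0.5 * x
y = X @ np.array([3.0, 1.5]) + rng.normal(scale=sigma)

weights = 1.0 / sigma**2               # inverse error variances
beta_wls = wls_fit(X, y, weights)

# Equivalent view: plain OLS after dividing every row by sigma_i.
beta_scaled, *_ = np.linalg.lstsq(X / sigma[:, None], y / sigma, rcond=None)
print(beta_wls, beta_scaled)
```

Dividing each observation by its error standard deviation makes the rescaled errors homoscedastic, which is why the two estimates coincide.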
Differences Between OLS, WLS, and GLS
Method | Error Variance Assumption | Error Correlation Assumption | When to Use |
---|---|---|---|
OLS | Constant | None | Ideal condition, base method |
WLS | Varies | None | Heteroscedastic data |
GLS | Varies | May be correlated | Heteroscedastic and/or autocorrelated errors |
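The nesting shown in the table can be checked numerically. The sketch below uses simulated data and hypothetical variable names to show that GLS with Ω = I reproduces OLS, and that GLS with a diagonal Ω reproduces WLS with weights 1/σᵢ².

```python
import numpy as np

def gls(X, y, omega):
    """GLS estimate for a given error covariance matrix omega."""
    omega_inv = np.linalg.inv(omega)
    return np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)

rng = np.random.default_rng(2)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, -1.0]) + rng.normal(size=n)
sigma2 = rng.uniform(0.5, 4.0, size=n)   # illustrative error variances

# OLS versus GLS with identity covariance.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_gls_identity = gls(X, y, np.eye(n))

# WLS versus GLS with diagonal covariance.
W = np.diag(1.0 / sigma2)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
beta_gls_diagonal = gls(X, y, np.diag(sigma2))

print(np.allclose(beta_ols, beta_gls_identity))
print(np.allclose(beta_wls, beta_gls_diagonal))
```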
Example Scenario
Suppose you are modelling income vs education level across different regions. In richer regions, data may be more consistent (low variance), while in poorer regions, it may vary more. Using WLS will allow you to assign proper weights to each data point. If you’re working with time series data (like stock prices), where today’s residual depends on yesterday’s, GLS is more suitable.
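For the time series case, one common way to model "today's residual depends on yesterday's" is a first-order autoregressive, AR(1), error process. The sketch below builds the corresponding Ω and applies the GLS formula. The AR(1) form, the value of `rho`, and the simulated trend data are assumptions made for illustration. In real work `rho` is unknown and has to be estimated.

```python
import numpy as np

def ar1_covariance(n, rho, sigma_u=1.0):
    """Covariance of a stationary AR(1) process e_t = rho * e_{t-1} + u_t."""
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma_u**2 / (1.0 - rho**2) * rho**lags

rng = np.random.default_rng(3)
n, rho = 120, 0.7                      # rho is assumed known in this sketch
t = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), t])   # intercept and linear time trend

# Simulate stationary AR(1) errors.
u = rng.normal(size=n)
e = np.empty(n)
e[0] = u[0] / np.sqrt(1.0 - rho**2)
for i in range(1, n):
    e[i] = rho * e[i - 1] + u[i]
y = X @ np.array([10.0, 0.2]) + e

# GLS with the AR(1) covariance matrix.
omega = ar1_covariance(n, rho)
omega_inv_X = np.linalg.solve(omega, X)
beta_gls = np.linalg.solve(X.T @ omega_inv_X, omega_inv_X.T @ y)
print(beta_gls)
```

Here entry (i, j) of Ω is proportional to rho^|i − j|, so the correlation between residuals decays with the time gap between them.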
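Both methods require the error variances/covariances, which are rarely known exactly, so in practice they are usually estimated from the data first (often from the OLS residuals). To detect heteroscedasticity, you can use residual plots, the Breusch-Pagan test, or White’s test. For autocorrelation, the Durbin-Watson test is a common check.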
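To make the Breusch-Pagan idea concrete, here is a hand-rolled sketch of its studentized (Koenker) variant for a model with a single regressor: regress the squared OLS residuals on the regressor and use LM = n·R², which is approximately chi-square with 1 degree of freedom under homoscedasticity. The function name and simulated data are hypothetical, and a statistics library's tested implementation is preferable for real analyses.

```python
import math
import numpy as np

def breusch_pagan_one_regressor(x, y):
    """Return (LM statistic, p-value) for the studentized Breusch-Pagan test."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])

    # Step 1: OLS fit and squared residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_sq = (y - X @ beta) ** 2

    # Step 2: auxiliary regression of squared residuals on the regressors.
    gamma, *_ = np.linalg.lstsq(X, resid_sq, rcond=None)
    ss_res = np.sum((resid_sq - X @ gamma) ** 2)
    ss_tot = np.sum((resid_sq - resid_sq.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot

    # Step 3: LM = n * R^2; chi-square(1) tail probability via erfc.
    lm = max(float(n * r_squared), 0.0)
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1.0, 10.0, size=n)

# Heteroscedastic case: error standard deviation grows with x.
y_het = 3.0 + 1.5 * x + rng.normal(scale=0.5 * x)
# Homoscedastic case: constant error standard deviation.
y_hom = 3.0 + 1.5 * x + rng.normal(scale=2.0, size=n)

lm_het, p_het = breusch_pagan_one_regressor(x, y_het)
lm_hom, p_hom = breusch_pagan_one_regressor(x, y_hom)
print(lm_het, p_het)
print(lm_hom, p_hom)
```

A small p-value suggests the error variance changes with the regressor, which is the situation where WLS or GLS is worth considering.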
Advantages of Using GLS and WLS
- Recovers the efficiency that OLS loses under heteroscedasticity or autocorrelation
- Improves precision of coefficient estimates
- Can lead to better predictive performance
- Helps in correctly estimating standard errors and confidence intervals
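The precision gain can be illustrated with a small Monte Carlo simulation: generate many datasets with heteroscedastic errors, estimate the slope by OLS and by WLS, and compare the spread of the two sets of estimates. The error structure (standard deviation proportional to x), the known weights, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 2000
true_beta = np.array([3.0, 1.5])

# Fixed design and assumed-known error standard deviations.
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
sigma = 0.5 * x
w = 1.0 / sigma**2

# Each row of Y is one simulated dataset.
Y = X @ true_beta + rng.normal(size=(reps, n)) * sigma

# OLS estimates for all datasets at once: shape (2, reps).
ols_betas = np.linalg.solve(X.T @ X, X.T @ Y.T)

# WLS estimates for all datasets at once: shape (2, reps).
Xw = X * w[:, None]
wls_betas = np.linalg.solve(X.T @ Xw, Xw.T @ Y.T)

ols_slopes, wls_slopes = ols_betas[1], wls_betas[1]
print("mean slope  OLS:", ols_slopes.mean(), " WLS:", wls_slopes.mean())
print("slope var   OLS:", ols_slopes.var(), " WLS:", wls_slopes.var())
```

Both estimators center on the true slope, but the WLS estimates vary less across datasets. The gain depends on the weights being at least approximately right.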
Download PDF – GLS and WLS in Regression Analysis
Download Link: [Click here to download the PDF]
This PDF contains:
- Theoretical explanation of both GLS and WLS
- Step-by-step implementation in R and Python
- Sample problems with solution outlines
- Useful formulae and comparison tables
Conclusion
Generalized and Weighted Least Squares are crucial when dealing with real-world data that doesn’t follow the neat assumptions of ordinary least squares. Whether you’re analysing economic data, survey results, or experimental outcomes, knowing how and when to apply GLS or WLS helps keep your estimates efficient and your inference trustworthy. Make use of the PDF for a detailed reference, and don’t just stop at theory: try running these models on your own datasets. That’s where the real learning happens.