Poisson regression is a type of regression used when the dependent variable is a count — for example, the number of times a customer calls support, the number of accidents on a road in a month, or the number of goals in a football match. Unlike linear regression, which assumes a continuous outcome, Poisson regression is suited for modelling discrete data, particularly count-based outcomes. In this article, we’ll understand what Poisson regression is, when to use it, its assumptions, and how to apply it — along with a PDF download for revision notes.
I’ve chosen this topic because count data is extremely common in real-life scenarios, especially in fields like public health, operations, insurance, and risk management. When I first came across Poisson regression during my coursework, I realised that many of us were using linear regression for count outcomes without checking whether it was the right fit. A linear model can predict negative or fractional counts and assumes constant variance, so it can give misleading results and weaken the entire analysis. Knowing when to use Poisson regression and how to interpret its output is an important skill, especially if you’re preparing for exams or working in analytics. This post is a beginner-friendly walkthrough to help you get comfortable with it.
What is Poisson Regression?
Poisson regression is a statistical technique used to model count data, where the values are non-negative integers (0, 1, 2, 3, …). It assumes that the response variable Y follows a Poisson distribution and that the logarithm of its expected value can be modelled as a linear combination of the independent variables.
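Written out, the definition above takes the following form. The notation is an assumption added here for illustration: i indexes observations, x_{i1}, …, x_{ip} are the independent variables, and the β's are the coefficients to be estimated.

```latex
Y_i \mid x_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}
\quad\Longleftrightarrow\quad
\mu_i = \exp\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right)
```

Because the model is linear on the log scale, a one-unit increase in x_{ij} multiplies the expected count by exp(β_j), holding the other variables fixed. This is why coefficients are usually read as rate ratios rather than additive effects.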
When to Use Poisson Regression
Use Poisson regression when:
- Your dependent variable is a count (e.g., number of visits, calls, claims)
- The counts are non-negative integers
- The events happen independently
- The variance is roughly equal to the mean (important assumption)
If the variance is much higher than the mean, it may indicate overdispersion, and in that case, a Negative Binomial Regression is often better.
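A rough first check of the mean-variance condition is to compare the sample variance of the counts with their sample mean. The sketch below uses only the standard library. The `calls` data and the `dispersion_ratio` helper are made up for illustration. It also ignores predictors, so treat it as a quick screen rather than a formal test.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean of the counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero, so the ratio is undefined")
    return variance(counts) / m


# Made-up daily support-call counts, for illustration only.
calls = [2, 3, 1, 4, 2, 3, 2, 5, 1, 3]
ratio = dispersion_ratio(calls)

# A ratio near 1 is consistent with a Poisson outcome;
# a ratio well above 1 hints at overdispersion.
print(f"mean={mean(calls):.2f} variance={variance(calls):.2f} ratio={ratio:.2f}")
```

A ratio far above 1 in your own data would be a reason to try the Negative Binomial alternative mentioned above.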
Key Assumptions of Poisson Regression
- The response variable follows a Poisson distribution
- The logarithm of the expected value is a linear function of the independent variables
- The events are independent of each other
- The mean and variance of the outcome variable are equal, given the values of the independent variables
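To make the log-linear assumption concrete, here is a minimal sketch that fits a Poisson regression with a single predictor by Newton-Raphson on the log-likelihood, using only the standard library. The data and the `fit_poisson` helper are hypothetical. In practice you would more likely use a library GLM routine, for example the Poisson family in statsmodels.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score: gradient of the log-likelihood with respect to (b0, b1)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: the 2x2 negative Hessian
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


# Made-up data: x could be months since launch, y the number of claims.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 1, 2, 3, 3, 5, 8, 10]

b0, b1 = fit_poisson(x, y)
print(f"intercept={b0:.3f} slope={b1:.3f} rate ratio per unit x={math.exp(b1):.3f}")
```

The exponentiated slope is the factor by which the expected count changes for each one-unit increase in x. This sketch has no safeguards such as step halving, so it is meant for understanding the mechanics, not for production use.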
Real-World Examples
Scenario | Poisson Regression Use |
---|---|
Healthcare | Modelling number of patient visits per month |
Insurance | Predicting the number of claims per customer |
Transport | Estimating number of accidents per road segment |
Customer Service | Modelling call centre complaints per day |
Model Evaluation Metrics
While linear regression uses R², in Poisson regression we rely on:
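- Deviance: A goodness-of-fit measure that compares the fitted model with a saturated model that fits every observation exactly; smaller values indicate a better fit
- AIC (Akaike Information Criterion): For comparing models fitted to the same data; lower is better
- Residuals: Pearson or deviance residuals to detect outliers
- Dispersion statistic: The Pearson chi-square divided by the residual degrees of freedom; values well above 1 suggest overdispersion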
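The metrics listed above can be computed directly from the observed counts and the fitted means. The following is a sketch using the standard formulas. The `poisson_fit_stats` helper is hypothetical, and the fitted means in `mu` are made-up numbers standing in for the output of a real model.

```python
import math


def poisson_fit_stats(y, mu, n_params):
    """Return deviance, Pearson chi-square, dispersion and AIC for fitted means mu."""
    # Deviance: 2 * sum[y * log(y / mu) - (y - mu)], with y * log(y / mu) = 0 when y = 0
    deviance = 2.0 * sum(
        (yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi)
        for yi, mi in zip(y, mu)
    )
    # Pearson chi-square: sum of squared Pearson residuals
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    # Poisson log-likelihood; lgamma(y + 1) equals log(y!)
    log_lik = sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu)
    )
    return {
        "deviance": deviance,
        "pearson_chi2": pearson,
        "dispersion": pearson / (len(y) - n_params),
        "aic": 2 * n_params - 2 * log_lik,
    }


# Made-up observed counts and hypothetical fitted means from a two-parameter model.
y = [1, 1, 2, 3, 3, 5, 8, 10]
mu = [1.0, 1.4, 2.0, 2.8, 3.9, 5.4, 7.6, 10.6]

stats = poisson_fit_stats(y, mu, n_params=2)
pearson_residuals = [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Observations with a large absolute Pearson residual are the ones worth inspecting as possible outliers.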
Common Issues and Fixes
- Overdispersion: The variance is greater than the mean, which makes standard errors too small. Use a Quasi-Poisson or Negative Binomial model instead.
- Zero-inflation: The data contain more zeros than a Poisson distribution would predict. Use a Zero-Inflated Poisson (ZIP) model.
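One simple way to screen for zero-inflation is to compare the observed share of zeros with the share a Poisson distribution with the same mean would imply, which is exp(-mean). The sketch below does this with the standard library. The `zero_excess` helper and the `claims` data are made up for illustration, and the check ignores predictors, so it is only a rough indication.

```python
import math


def zero_excess(counts):
    """Return (observed share of zeros, share expected under a Poisson with the same mean)."""
    n = len(counts)
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-sum(counts) / n)  # P(Y = 0) = exp(-lambda)
    return observed, expected


# Made-up claim counts per customer, for illustration only.
claims = [0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 5, 0, 2, 0, 4]
observed, expected = zero_excess(claims)

print(f"observed zeros: {observed:.2%}, expected under Poisson: {expected:.2%}")
```

If the observed share is much larger than the expected share, a ZIP model is worth considering.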
Download PDF – Poisson Regression Notes
Download Link: [Click here to download PDF]
What’s included in the PDF:
- Clear explanation of Poisson regression
- Model formula and assumptions
- Solved example problems
- Differences between Poisson and other models
- Code snippets for R and Python