🔍 What’s the problem?
When a model with a binary dependent variable and group fixed effects is estimated on grouped data, linear (OLS) and logit specifications can produce different results. The apparent reason is that the fixed-effects logit drops groups where the outcome is all zeros or all ones, while the linear specification retains all groups.
📊 How each model handles homogeneous groups
- The linear specification pools within-group slope estimates across all groups, including groups with homogeneous outcomes.
- In a homogeneous group (all zeros or all ones) the outcome has no within-group variation, so its estimated slope is zero by construction.
- As a result, the linear estimate is a weighted average of the zero slopes from homogeneous groups and the generally nonzero slopes from groups with mixed outcomes.
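The weighted-average claim can be written out explicitly. The following is a sketch for the simplest case only: it assumes a single regressor x_ig and group fixed effects, and the notation (S_xx,g, S_xy,g, the weights w_g, and the set M of mixed-outcome groups) is introduced here for illustration rather than taken from the original text. With several regressors the weights become matrices and the scalar rescaling in the last line need not hold coefficient by coefficient.

```latex
% Within-group sums for group g (\bar{x}_g, \bar{y}_g are group means)
S_{xx,g} = \sum_i (x_{ig} - \bar{x}_g)^2, \qquad
S_{xy,g} = \sum_i (x_{ig} - \bar{x}_g)(y_{ig} - \bar{y}_g)

% The fixed-effects OLS slope is a weighted average of group slopes
\hat{\beta}_{\text{all}}
  = \frac{\sum_g S_{xy,g}}{\sum_g S_{xx,g}}
  = \sum_g w_g \, \hat{\beta}_g,
\qquad
\hat{\beta}_g = \frac{S_{xy,g}}{S_{xx,g}}, \quad
w_g = \frac{S_{xx,g}}{\sum_h S_{xx,h}}

% In a homogeneous group y_{ig} = \bar{y}_g for all i, so S_{xy,g} = 0 and \hat{\beta}_g = 0.
% Only the mixed-outcome groups M contribute to the numerator:
\hat{\beta}_{\text{all}}
  = \hat{\beta}_{M} \cdot \frac{\sum_{g \in M} S_{xx,g}}{\sum_g S_{xx,g}},
\qquad
\hat{\beta}_{M} = \frac{\sum_{g \in M} S_{xy,g}}{\sum_{g \in M} S_{xx,g}}
```

Under these assumptions the full-sample slope equals the restricted-sample slope scaled by the share of within-group variation in x that comes from mixed-outcome groups, a factor between zero and one.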
✅ What to compare for a fair test
- The correct apples-to-apples comparison between linear and logit estimates restricts analysis to groups that exhibit within-group variation in the dependent variable (i.e., groups with some zeros and some ones).
🧾 Reporting recommendations for researchers
- Report OLS results using all groups and also report OLS results limited to groups where the dependent variable varies.
- Report logit results alongside the restricted OLS results to show whether differences stem from dropped groups or from functional-form differences.
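The two OLS estimates recommended above can be produced side by side. The sketch below is illustrative only: the helper names `fe_slope` and `mixed_only` are hypothetical, the toy data are made up, and it assumes a single regressor with group fixed effects. It uses only the Python standard library and does not fit the logit; a fixed-effects (conditional) logit would be estimated on the same restricted sample with a statistics package of your choice.

```python
from collections import defaultdict


def fe_slope(rows):
    """Within-group (fixed-effects) OLS slope for a single regressor.

    rows: list of (group, x, y) tuples. Hypothetical helper for illustration.
    """
    by_group = defaultdict(list)
    for g, x, y in rows:
        by_group[g].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in by_group.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        # Demeaning by group is equivalent to including group dummies.
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        sxx += sum((x - x_bar) ** 2 for x, _ in obs)
    return sxy / sxx


def mixed_only(rows):
    """Keep only groups whose outcome takes more than one value."""
    outcomes = defaultdict(set)
    for g, _, y in rows:
        outcomes[g].add(y)
    return [r for r in rows if len(outcomes[r[0]]) > 1]


# Toy data (made up): A and B have mixed outcomes, C is all ones, D is all zeros.
rows = [
    ("A", 0.0, 0), ("A", 1.0, 0), ("A", 2.0, 1), ("A", 3.0, 1),
    ("B", 0.0, 0), ("B", 1.0, 1), ("B", 2.0, 1), ("B", 3.0, 1),
    ("C", 0.0, 1), ("C", 1.0, 1), ("C", 2.0, 1), ("C", 3.0, 1),
    ("D", 0.0, 0), ("D", 1.0, 0), ("D", 2.0, 0), ("D", 3.0, 0),
]

beta_all = fe_slope(rows)                 # every group retained
beta_mixed = fe_slope(mixed_only(rows))   # groups with outcome variation only

print(f"OLS FE slope, all groups:   {beta_all:.3f}")
print(f"OLS FE slope, mixed groups: {beta_mixed:.3f}")
```

In this toy example the homogeneous groups C and D add within-group variation in x to the denominator but nothing to the numerator, so the full-sample slope is smaller in magnitude than the restricted-sample slope. The restricted figure is the one to set against the logit estimate.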
⚠️ Why this matters
- Differences between OLS and logit estimates may reflect which groups are included, not only differences in model form.
- The substantive interpretation of any difference between the full-sample and restricted-sample results depends on assumptions about the excluded homogeneous groups, and those assumptions cannot be verified empirically.






