- At least 316 "factors" have been published; most don't replicate
- Adding more factors increases noise and overfitting risk
- Our six factors all exceed the Harvey et al. (2016) threshold of t > 3.0
- Each factor has independent economic rationale and works across countries
- Disciplined factor selection is as important as factor weighting
#The Factor Zoo Problem
Academic finance has a problem: researchers have published too many "factors."
| Decade | New Factors Published | Total Cumulative |
|---|---|---|
| 1970s | ~15 | ~15 |
| 1980s | ~25 | ~40 |
| 1990s | ~50 | ~90 |
| 2000s | ~80 | ~170 |
| 2010s | ~120 | ~290 |
| 2020s (so far) | ~30 | ~320+ |
With hundreds of factors proposed, some inevitably appear to "work" purely by statistical chance. This is the multiple testing problem — if you test enough variables, some will show spurious significance.
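A small simulation makes the multiple testing problem concrete. The sketch below is illustrative only: it generates 316 "factors" that are pure noise (no real premium by construction) and counts how many clear a t-statistic threshold anyway. The 240-month sample length, the seed, and the function names are assumptions made for the example, not part of our methodology.

```python
import math
import random
import statistics

def t_statistic(returns):
    """t-statistic for the null hypothesis that the mean return is zero."""
    n = len(returns)
    return statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))

def count_spurious(n_factors, n_months, threshold, seed=0):
    """Count pure-noise 'factors' whose |t| exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_factors):
        # Each factor is random noise with a true mean return of zero.
        returns = [rng.gauss(0.0, 1.0) for _ in range(n_months)]
        if abs(t_statistic(returns)) > threshold:
            hits += 1
    return hits

# The same seed means both thresholds are applied to identical simulated data.
loose = count_spurious(316, 240, threshold=2.0)
strict = count_spurious(316, 240, threshold=3.0)
print(f"noise factors passing |t| > 2.0: {loose}")
print(f"noise factors passing |t| > 3.0: {strict}")
```

Because every simulated factor is noise, anything that passes is a false discovery. The stricter cutoff can never pass more of them than the looser one on the same data.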
#Why More Factors Hurt
1. Overfitting
A model with 50 factors can fit past returns far better than a small model, yet it fails on new data. This is called overfitting: the model captures noise (random patterns) along with signal (real patterns).
| Number of Factors | In-Sample R² | Out-of-Sample R² |
|---|---|---|
| 3 | 8% | 6% |
| 6 | 12% | 9% |
| 20 | 25% | 5% |
| 50 | 45% | -2% (negative!) |
More factors → better backtests → worse real-world results.
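The pattern in the table can be reproduced in a toy setting. The sketch below is a simplified illustration, not the analysis behind the table: it simulates returns in which only one factor carries signal, fits ordinary least squares using the first k of 50 candidate factors, and compares in-sample with out-of-sample R². The signal strength (0.3), the 120 observations, and all function names are assumptions chosen for the example, so the printed numbers will not match the table.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(xs, ys):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(xs[0])
    xtx = [[sum(row[i] * row[j] for row in xs) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

def r_squared(xs, ys, beta):
    """1 - SSE/SST; negative when the model predicts worse than the sample mean."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - sum(b * x for b, x in zip(beta, row))) ** 2 for row, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst

def simulate(factor_counts, n_obs=120, max_factors=50, seed=1):
    """Return {k: (in_sample_r2, out_of_sample_r2)} using the first k factors."""
    rng = random.Random(seed)

    def sample():
        # Column 0 is the intercept; only factor 1 carries real signal.
        xs = [[1.0] + [rng.gauss(0, 1) for _ in range(max_factors)] for _ in range(n_obs)]
        ys = [0.3 * row[1] + rng.gauss(0, 1) for row in xs]
        return xs, ys

    train_x, train_y = sample()
    test_x, test_y = sample()
    results = {}
    for k in factor_counts:
        train_k = [row[:k + 1] for row in train_x]
        test_k = [row[:k + 1] for row in test_x]
        beta = ols(train_k, train_y)
        results[k] = (r_squared(train_k, train_y, beta), r_squared(test_k, test_y, beta))
    return results

for k, (r2_in, r2_out) in simulate([3, 6, 20, 50]).items():
    print(f"{k:>2} factors: in-sample R2 = {r2_in:.2f}, out-of-sample R2 = {r2_out:.2f}")
```

Because the smaller models use a subset of the larger model's factors, in-sample R² can only rise as factors are added, even though 49 of the 50 are noise. The out-of-sample figure is where the cost shows up.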
2. Correlation Among Factors
Many published factors are really measuring the same thing with different names:
- "Quality" and "profitability" and "ROE" overlap extensively
- "Value" measured 20 different ways
- "Momentum" across different lookback periods
Adding correlated factors doesn't increase diversification — it just adds complexity without benefit.
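One way to spot this kind of overlap is to compute pairwise correlations between candidate signals and flag the pairs that move together. The sketch below is a minimal illustration; the signal names, the five made-up scores per signal, and the 0.8 cutoff are all hypothetical and do not come from real data.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def redundant_pairs(signals, cutoff):
    """Return name pairs whose absolute correlation exceeds the cutoff."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(signals.items(), 2):
        if abs(pearson(a, b)) > cutoff:
            flagged.append((name_a, name_b))
    return flagged

# Made-up scores for five stocks, purely for illustration.
signals = {
    "roe": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gross_profitability": [2.0, 4.0, 6.0, 8.0, 11.0],
    "momentum_12m": [3.0, 1.0, 4.0, 1.0, 5.0],
}
print(redundant_pairs(signals, cutoff=0.8))
```

In this toy data the two profitability-style signals are flagged as near-duplicates, while momentum is not flagged against either of them.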
3. Transaction Costs
More factors mean more signals, more disagreements between signals, and more trading. In practice, complex models generate excessive turnover that eats returns.
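The arithmetic behind that cost drag is simple. The sketch below uses a common definition of one-way turnover (half the sum of absolute weight changes) and multiplies it by an assumed round-trip trading cost. The example weights, the 12 rebalances per year, and the 0.2% cost are hypothetical inputs, not estimates for our model.

```python
def one_way_turnover(old_weights, new_weights):
    """Half the sum of absolute weight changes between two portfolios."""
    names = set(old_weights) | set(new_weights)
    traded = sum(abs(new_weights.get(n, 0.0) - old_weights.get(n, 0.0)) for n in names)
    return traded / 2.0

def annual_cost_drag(turnover_per_rebalance, rebalances_per_year, round_trip_cost):
    """Approximate yearly return lost to trading, as a fraction of portfolio value."""
    return turnover_per_rebalance * rebalances_per_year * round_trip_cost

# Hypothetical rebalance: half the portfolio rotates into a new holding.
old = {"A": 0.5, "B": 0.5}
new = {"A": 0.25, "B": 0.25, "C": 0.5}
turnover = one_way_turnover(old, new)
drag = annual_cost_drag(turnover, rebalances_per_year=12, round_trip_cost=0.002)
print(f"one-way turnover per rebalance: {turnover:.0%}")
print(f"approximate annual cost drag: {drag:.2%}")
```

The drag scales linearly with turnover, so a model that trades twice as much needs twice the gross premium just to break even on costs.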
4. Reduced Transparency
A six-factor model is explainable. A 50-factor model is a black box. Transparency matters for trust, accountability, and understanding when the model fails.
#Our Selection Criteria
We chose our six factors using strict criteria:
Criterion 1: Statistical Robustness (t > 3.0)
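Following Harvey, Liu & Zhu (2016), we require every factor to have a t-statistic above 3.0, far above the traditional 2.0 threshold. Under a normal approximation, a single test has roughly a 4.6% chance of clearing |t| > 2.0 by luck alone, but only about a 0.27% chance of clearing |t| > 3.0, so the stricter bar sharply reduces false positives.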
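To see what the higher bar buys, the sketch below converts a t-statistic into a two-sided p-value using the normal approximation, then scales it by the number of factors tested. It is a back-of-the-envelope illustration, not the multiple-testing adjustment Harvey, Liu & Zhu actually apply. The function names are made up for the example, and the 316 tests are assumed to be of factors with no true premium.

```python
import math

def two_sided_p(t):
    """Two-sided p-value for a t-statistic under the normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

def expected_false_positives(n_tests, t):
    """Expected number of pure-noise factors that clear |t| by chance."""
    return n_tests * two_sided_p(t)

for threshold in (2.0, 3.0):
    p = two_sided_p(threshold)
    expected = expected_false_positives(316, threshold)
    print(f"|t| > {threshold}: p = {p:.4f}, expected false positives in 316 tests = {expected:.1f}")
```

Under these assumptions, the traditional cutoff lets more than a dozen noise factors through, while the stricter one lets through fewer than one on average.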
Criterion 2: Economic Rationale
Every factor needs a plausible explanation for WHY it works:
| Factor | Economic Rationale |
|---|---|
| Profitability | Profitable companies generate cash and compound wealth |
| Momentum | Investors underreact to new information |
| Value | Investors overpay for glamour and underpay for boring |
| Low Volatility | Leverage constraints create excess demand for risky stocks |
| Investment | Empire-building and overinvestment destroy shareholder value |
| Short Interest | Informed traders identify fundamental problems |
Criterion 3: Global Evidence
Each factor must work across multiple countries and time periods. U.S.-only evidence could be a data artifact.
Criterion 4: Post-Publication Persistence
Factors that stop working after publication were likely data-mined. All six of our factors continue to earn premiums after their discovery papers were published.
Criterion 5: Low Correlation with Other Factors
Each factor should provide independent information. A seventh factor with a 0.8 correlation to profitability would mostly repeat what profitability already tells us, adding noise without new signal.
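A short calculation shows why high correlation limits the benefit. As a simplifying assumption (ours, for illustration), suppose two signals $a$ and $b$ each carry noise with the same variance $\sigma^2$ and correlation $\rho$. The variance of their equal-weighted average is:

$$
\operatorname{Var}\!\left(\frac{a+b}{2}\right)
= \frac{1}{4}\left(\sigma^2 + \sigma^2 + 2\rho\sigma^2\right)
= \frac{1+\rho}{2}\,\sigma^2
$$

With $\rho = 0$, averaging cuts the noise variance in half ($0.5\,\sigma^2$). With $\rho = 0.8$, it falls only to $0.9\,\sigma^2$, a 10% reduction in exchange for a whole extra factor.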
#What We Excluded (and Why)
| Excluded Factor | Reason |
|---|---|
| Size (small cap premium) | Weakest evidence; largely disappears after quality controls |
| Dividend yield | Captured by profitability; introduces sector biases |
| Earnings revisions | Too high frequency; excessive turnover |
| Analyst sentiment | Subject to bias; captured by momentum |
| Sector momentum | Correlated with stock momentum; adds complexity |
| ESG scores | Inconsistent across providers; limited return evidence |
#The Goldilocks Number
Six factors represent a "Goldilocks" balance:
- Too few (1-2): Not enough diversification across signal types
- Just right (4-7): Captures the major independent return drivers
- Too many (20+): Overfitting, noise, excessive complexity
Most sophisticated institutional quant firms (AQR, Dimensional, Robeco) use 4-7 core factors. More than that, and you're likely capturing noise.
#How This Applies to Our Rankings
Our six-factor model is deliberately parsimonious. We'd rather be disciplined about what we include than impressive about how many variables we can process.
Every time we consider adding a factor, we ask:
1. Does it pass the t > 3.0 threshold?
2. Does it have clear economic rationale?
3. Does it add information beyond our existing six?
4. Does it work globally and post-publication?
So far, no seventh factor has met all four criteria.
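The four questions can be written down as a simple screen. The sketch below is a hypothetical illustration, not our production code: the `Candidate` fields, the example candidate with its made-up numbers, and especially the 0.5 correlation cutoff are assumptions. Only the t > 3.0 threshold comes from the criteria above.

```python
from dataclasses import dataclass

T_STAT_THRESHOLD = 3.0   # the Harvey, Liu & Zhu (2016) bar cited above
MAX_CORRELATION = 0.5    # hypothetical cutoff for "adds information"

@dataclass
class Candidate:
    name: str
    t_stat: float
    has_economic_rationale: bool
    max_corr_with_existing: float  # highest absolute correlation with the current six
    works_globally: bool
    persists_post_publication: bool

def failed_criteria(c):
    """Return the criteria a candidate fails; an empty list means it passes all four."""
    failures = []
    if c.t_stat <= T_STAT_THRESHOLD:
        failures.append("t-statistic")
    if not c.has_economic_rationale:
        failures.append("economic rationale")
    if abs(c.max_corr_with_existing) >= MAX_CORRELATION:
        failures.append("independent information")
    if not (c.works_globally and c.persists_post_publication):
        failures.append("global and post-publication evidence")
    return failures

# Made-up candidate: statistically strong, but too close to an existing factor.
candidate = Candidate(
    name="hypothetical_factor",
    t_stat=3.4,
    has_economic_rationale=True,
    max_corr_with_existing=0.8,
    works_globally=True,
    persists_post_publication=True,
)
print(failed_criteria(candidate))
```

Returning the list of failed criteria, rather than a single pass/fail flag, keeps the decision explainable: a rejected factor comes with the reason it was rejected.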
Last updated: February 1, 2026