The “factor zoo” is a well-known phenomenon: hundreds of published anomalies that fail out-of-sample. ADIA Lab researchers point to a subtler and more dangerous problem: the “factor mirage.” It arises not from data-mining but from models that are misspecified, despite having been developed following the econometric canon taught in textbooks.
Models with colliders, that is, variables influenced by both the factor and the returns, are particularly concerning, because they exhibit higher R² and often lower p-values than correctly specified models. The econometric canon favors such misspecified models, mistaking better fit for correctness.
In a factor model with a collider, the value of the return is set before the value of the collider. As a result, the stronger association derived from the collider cannot be monetized, because the collider's value is not known until after the return has been realized. The profits promised by academic papers that rely on such models are a mirage. In practice, that methodological mistake has billion-dollar consequences.
For example, consider two researchers estimating a quality factor. One of the researchers controls for profitability, leverage, and size; the other adds return on equity, a variable influenced by both profitability (the factor) and stock performance (the outcome).
By including a collider, the second researcher creates a spurious link: high quality now correlates with high past returns. In a backtest, the second model appears to be superior. In live trading, the tables are turned: the backtest proves to be a statistical illusion that drains capital. For individual managers, these errors may quietly erode returns; for markets as a whole, they distort capital allocation and create inefficiencies at a global scale.
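The collider effect described above can be illustrated with a small simulation. The following is a minimal sketch, not the researchers' own method: the variable names, the coefficients (a true factor effect of 0.5, unit loadings on the collider), and the noise levels are all assumptions chosen for illustration. It generates a factor that drives returns, builds a collider from both, and compares a regression of returns on the factor alone with one that also controls for the collider.

```python
import numpy as np


def ols(y, X):
    """Regress y on X (with an intercept); return (coefficients, R^2)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - resid.var() / y.var()
    return beta, r2


rng = np.random.default_rng(0)
n = 100_000
true_beta = 0.5  # assumed causal effect of the factor on returns

# Assumed data-generating process: factor -> return, and both -> collider.
factor = rng.standard_normal(n)
ret = true_beta * factor + rng.standard_normal(n)
collider = factor + ret + 0.5 * rng.standard_normal(n)

# Correctly specified model: return regressed on the factor only.
beta_ok, r2_ok = ols(ret, factor)

# Misspecified model: the collider is added as a control.
beta_bad, r2_bad = ols(ret, np.column_stack([factor, collider]))

print(f"factor only:     beta = {beta_ok[1]: .3f}, R^2 = {r2_ok:.3f}")
print(f"with collider:   beta = {beta_bad[1]: .3f}, R^2 = {r2_bad:.3f}")
```

Under these assumptions, the model with the collider reports a much higher R² while its factor coefficient moves far from the true value of 0.5, which is the pattern the text warns about: the better-fitting model is the wrong one. Because the collider is computed from the realized return, that extra fit is unavailable at the time a trade would have to be placed.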
