6.43 What Is a Factor, Really
A factor is just an alpha that explains so much variance you keep re-finding it. Subtract it from returns, then dot your positions with what's left: flat means old exposure, sloped means real edge.
I never understood factors from the textbook definition. The understanding came from trading them, watching the same effect show up in alpha after alpha until I stopped treating each rediscovery as a new edge. A factor is not a fundamental force of markets, not a risk premium handed down from theory, not anything special. A factor is an alpha that explains a large chunk of return variance, important enough that you keep accidentally finding it, so you regress it out of returns and stop paying yourself for the same bet twice. Nothing more.
A factor is just an alpha you got tired of re-finding
Say you research crypto momentum. You build a 24-hour momentum signal, it works, you ship it. Three weeks later you build a "new" signal off funding rates and it also works, and a month after that a volume-breakout thing works too. Run the correlations and the three return streams come out around 0.7 against each other, because all three are momentum wearing different clothes. You did not find three alphas. You found one alpha three times and congratulated yourself twice for free.
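The correlation check is easy to sketch. The snippet below is a toy illustration, not real data: it assumes three hypothetical return streams that share one simulated momentum-like driver plus independent noise, with the noise scale picked so the pairwise correlation lands near the 0.7 in the story. The stream names and the `correlation` helper are illustrative.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)
n_periods = 2000
# Assumption: noise variance of 3/7 against a unit-variance driver gives a
# theoretical pairwise correlation of 1 / (1 + 3/7) = 0.7.
noise_scale = math.sqrt(3 / 7)

# One shared driver; each "signal" is the driver plus its own noise.
driver = [rng.gauss(0, 1) for _ in range(n_periods)]
streams = {
    name: [d + rng.gauss(0, noise_scale) for d in driver]
    for name in ("momentum_24h", "funding_rate", "volume_breakout")
}

# Pairwise correlations between the three simulated return streams.
names = list(streams)
pairwise = {
    (a, b): correlation(streams[a], streams[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for pair, rho in pairwise.items():
    print(pair, round(rho, 2))
```

If the off-diagonal numbers cluster this high, the streams are probably one bet under several names.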
The fix is to name the dominant effect, call it a factor, and remove it from returns before you test anything new. Cross-sectionally regress returns on your momentum feature, keep the residual, and test future signals against that residual instead of raw returns. The residual is returns minus the part momentum already explains. Now a signal only scores if it predicts something momentum did not, which is the only thing worth paying for.
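A minimal sketch of that residualization, assuming a single cross-section of five hypothetical assets and a one-variable least-squares fit with an intercept. The numbers are made up so that the leftover piece is exactly orthogonal to momentum; `ols_fit`, `residualize`, and the two candidate signals are illustrative names, not anything from a specific library.

```python
import math


def ols_fit(y, x):
    """Slope and intercept of the least-squares fit y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residualize(returns, feature):
    """Returns minus the part a cross-sectional fit on `feature` explains."""
    slope, intercept = ols_fit(returns, feature)
    return [r - (intercept + slope * f) for r, f in zip(returns, feature)]


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical cross-section: returns = momentum effect + a smaller leftover piece.
momentum = [-2.0, -1.0, 0.0, 1.0, 2.0]
leftover = [0.001, -0.002, 0.002, -0.002, 0.001]
returns = [0.01 * m + e for m, e in zip(momentum, leftover)]
residual = residualize(returns, momentum)

clone_signal = [2.0 * m + 1.0 for m in momentum]  # momentum in disguise
new_signal = [1.0, -2.0, 2.0, -2.0, 1.0]          # lines up with the leftover piece

for name, sig in (("clone", clone_signal), ("new", new_signal)):
    print(name,
          "vs raw:", round(correlation(sig, returns), 2),
          "vs residual:", round(correlation(sig, residual), 2))
```

Against raw returns the clone looks like the better signal. Against the residual it scores nothing and the new signal takes over, which is the ranking you actually want.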
So the word "factor" is doing far less work than people think. A factor is an alpha you believe in enough, and that explains enough variance, that you are willing to subtract it from the world before looking for the next thing. The promotion from "alpha" to "factor" is a decision about variance and recurrence, not a discovery about the deep structure of markets.
The market is always factor number one
The first factor, in any equity book, is the market. You take an asset's beta to the S&P 500 and remove the index return scaled by that beta, and what is left is the idiosyncratic return.
$$ r^{\text{idio}}_{i,t} = r_{i,t} - \beta_i \, r^{\text{mkt}}_{t} $$
Read it plainly: the asset's own return minus its share of whatever the market did. If a stock has a beta of 1.3 and the S&P returned 1% on the day, you attribute 1.3 percentage points of the stock's move to the market and treat the rest as the stock being itself. From there you keep going, peeling off a size factor, a value factor, a momentum factor, each one another column you regress out, each one another piece of "this is just exposure I already understand" stripped away until the residual is the part that is genuinely the stock.
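The arithmetic in code, as a small sketch. The beta of 1.3 and the 1% market day come from the example above; the stock's own 2% return and the short return history are hypothetical, and estimating beta as covariance over variance is one common choice assumed here rather than anything the text prescribes.

```python
def estimate_beta(asset_returns, market_returns):
    """Slope of asset returns on market returns: cov(asset, mkt) / var(mkt)."""
    n = len(market_returns)
    ma = sum(asset_returns) / n
    mm = sum(market_returns) / n
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset_returns, market_returns))
    var = sum((m - mm) ** 2 for m in market_returns)
    return cov / var


def idiosyncratic_return(asset_return, beta, market_return):
    """r_idio = r - beta * r_mkt, matching the formula above."""
    return asset_return - beta * market_return


# Beta 1.3, market up 1%, and a hypothetical 2% day for the stock.
idio = idiosyncratic_return(0.02, 1.3, 0.01)
print(f"market share: {1.3 * 0.01:.4f}, idiosyncratic: {idio:.4f}")

# Hypothetical history in which the stock is exactly 1.3x the market,
# so the estimate should recover a beta of about 1.3.
market_history = [0.010, -0.020, 0.015, 0.005, -0.010]
asset_history = [1.3 * m for m in market_history]
beta_hat = estimate_beta(asset_history, market_history)
print(f"estimated beta: {beta_hat:.2f}")
```

Peeling off size, value, or momentum is the same subtraction with more columns, usually done as one multi-factor regression rather than one factor at a time.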
This is the same move as the crypto momentum case, scaled up. The market is factor number one because it explains the most variance of anything you will ever measure, so it is the effect most likely to contaminate every other signal you test. You remove it first for the same reason you remove momentum first in crypto: it is the biggest liar in the room about whether you have found something new.
Factors are relative to your horizon, and HFT has them too
People assume factors are a slow, low-frequency, equities-only idea. They are not. A factor is whatever explains a lot of variance at the horizon you trade, so the factors change when the horizon changes. Predicting one minute ahead in an order book, the canonical factor is order book imbalance, the same effect "Order Book Imbalance: The First Microstructure Feature to Test" treats as the first feature anyone should check. At a one-hour horizon the dominant factor is closer to short-term reversion, past return times negative one, because that is what explains most of the variance there.
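For concreteness, here is a sketch of the two horizon-specific factors as features. The imbalance formula (bid size minus ask size, over their sum, at the top of the book) is a common definition assumed here, since the text does not spell one out; the reversion feature is the "past return times negative one" described above. Function names and inputs are hypothetical.

```python
def order_book_imbalance(bid_size, ask_size):
    """Top-of-book imbalance in [-1, 1]; positive means more resting bids.

    Assumed definition: (bid - ask) / (bid + ask).
    """
    total = bid_size + ask_size
    if total == 0:
        return 0.0  # empty book: treat as balanced
    return (bid_size - ask_size) / total


def short_term_reversion(past_return):
    """Reversion factor: the past return with its sign flipped."""
    return -1.0 * past_return


# One-minute horizon: 300 units resting on the bid versus 100 on the ask.
imbalance_example = order_book_imbalance(300, 100)

# One-hour horizon: a +2% move over the prior window becomes a -0.02 exposure.
reversion_example = short_term_reversion(0.02)

print(imbalance_example, reversion_example)
```

Either feature slots into the same regress-and-subtract step as momentum or market beta. Only the horizon of the returns you regress on changes.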
Order book imbalance at HFT horizons is worth a careful caveat, because it is not alpha in the risk-adjusted-return sense. You do not get paid a clean Sharpe for being long imbalance. It is a factor in the variance-and-recurrence sense: it explains a large fraction of one-minute price movement, and it leaks into almost every other microstructure signal you build, so you will keep finding it inside your "new" alphas whether you meant to or not. That is exactly why you factor it out of returns. If you do not, you will confuse factor exposure, which is an effect you already found, for fresh edge. Imbalance is the order-book version of the market beta: too big and too pervasive to leave in the residual.
The test that tells you new alpha from old exposure
Here is the part that made factors click for me, and it is a single line of arithmetic. Take your candidate strategy's position vector, the signed size it wanted to hold over time, and dot it with the specific-factor returns, which are the returns left after the factors are removed.
$$ \text{PnL}^{\text{specific}} = \sum_{t} \text{position}_{t} \cdot r^{\text{specific}}_{t}, \qquad r^{\text{specific}}_{t} = r_{t} - \big(\text{returns explained by factors}\big)_t $$
If your alpha is really just order book imbalance in disguise, the running total of this sum, plotted over time, flatlines, because the thing your positions were tracking has already been subtracted out of the returns you are paying them against. The positions still move, but they move against returns that no longer contain the effect they were exploiting, so the cumulative line goes nowhere. If your alpha carries genuine idiosyncratic edge, the same curve slopes up, because the positions are aligned with the part of returns the factors do not explain. One dot product separates "I found something new" from "I re-found the factor," and it shows you performance strictly in excess of what you already know.
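A toy version of the test, under loud assumptions: one asset, one factor with a known exposure of 1.0, eight periods of made-up returns, and positions that are assumed to be in place before each period's return is realized. The "clone" strategy tracks the factor and the "edge" strategy tracks the leftover piece. All names are illustrative.

```python
from itertools import accumulate


def sign(x):
    """-1, 0, or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def specific_returns(returns, factor_returns, beta):
    """Returns minus the part explained by the factor: r - beta * f."""
    return [r - beta * f for r, f in zip(returns, factor_returns)]


def pnl_curve(positions, returns):
    """Cumulative sum of position_t * return_t."""
    return list(accumulate(p * r for p, r in zip(positions, returns)))


# Hypothetical data: total return = factor return + leftover piece.
factor = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01, 0.01, -0.01]
leftover = [0.002, 0.002, -0.002, -0.002, 0.002, 0.002, -0.002, -0.002]
raw = [f + e for f, e in zip(factor, leftover)]
specific = specific_returns(raw, factor, beta=1.0)

clone_positions = [sign(f) for f in factor]   # the factor in disguise
edge_positions = [sign(e) for e in leftover]  # tracks what the factor misses

clone_raw = pnl_curve(clone_positions, raw)
clone_specific = pnl_curve(clone_positions, specific)
edge_raw = pnl_curve(edge_positions, raw)
edge_specific = pnl_curve(edge_positions, specific)

print("clone raw / specific:", round(clone_raw[-1], 4), round(clone_specific[-1], 4))
print("edge  raw / specific:", round(edge_raw[-1], 4), round(edge_specific[-1], 4))
```

On raw returns the clone looks like the stronger strategy. On specific returns its curve ends where it started, while the edge strategy keeps everything it made.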
Run this before you get excited about any signal. A backtest against raw returns will happily reward you for reloading the market beta or the imbalance factor, and the equity curve looks real because the returns are real; they are just not new. The specific-factor curve is the honest version of the same backtest, and a flat one is the cheapest bad news you will ever buy.
[Figure: Visualizing the factor test]
Where this connects
Removing a factor before forecasting is the same instinct as predicting the residual rather than the gross return, which "Predict Residual Returns, Not Gross" takes as its whole thesis: forecast the part that is not already explained, not the contaminated whole. It also sits underneath cross-sectional work, because ranking a market's normalized signal against its peers, the move in "Cross-Sectional Percentile Rank Within a Universe," is a quiet way of removing the common factor and keeping the relative residual. And fair value, the spine of "Why Fair Value Is the Core of Market Making," is itself a factor at the tick level: the dominant explainer of the next price that you subtract before you go hunting for anything subtler. Same operation everywhere. Name the big effect, remove it, and test against what survives.
The last use is portfolio construction, where factors act as a regularization tool: constraining factor exposures keeps a book from quietly piling into one effect across positions that looked independent. Same idea from the other side. If a factor is the alpha you already own, controlling exposure to it stops you from owning it ten more times by accident.
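One simple way to express that constraint, sketched under assumptions: a single factor, a hypothetical four-position book, and a hard cap on net exposure enforced by shifting positions along the exposure vector (the smallest change in a least-squares sense). A real optimizer would handle many factors and other constraints at once; `cap_factor_exposure` is an illustrative helper, not a library call.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def cap_factor_exposure(positions, exposures, cap):
    """Shift positions along `exposures` until |net exposure| <= cap."""
    exposure = dot(positions, exposures)
    if abs(exposure) <= cap:
        return list(positions)  # already inside the cap, nothing to do
    target = cap if exposure > 0 else -cap
    shift = (exposure - target) / dot(exposures, exposures)
    return [w - shift * b for w, b in zip(positions, exposures)]


# Hypothetical book: three longs and one short that looked independent.
positions = [1.0, 1.0, 1.0, -1.0]
betas = [1.2, 0.8, 1.0, 1.0]  # each position's exposure to the factor

before = dot(positions, betas)
capped = cap_factor_exposure(positions, betas, cap=0.5)
after = dot(capped, betas)

print("net exposure before:", round(before, 3))
print("net exposure after: ", round(after, 3))
print("adjusted positions: ", [round(w, 3) for w in capped])
```

Setting `cap=0.0` gives full neutralization, which is the same subtraction as the residual test applied to positions instead of returns.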
KEY POINTS
- A factor is not fundamental or special. It is an alpha that explains a large share of return variance, important enough that you keep re-finding it, so you regress it out of returns before testing anything new.
- Promotion from "alpha" to "factor" is a decision about variance and recurrence, not a discovery. You subtract the effect from the world so you stop paying yourself twice for the same bet.
- The market is always factor number one. Remove beta times the index return to get the idiosyncratic return, then peel off further factors. It is removed first because it explains the most variance and contaminates everything.
- Factors are relative to horizon. Order book imbalance is the canonical one-minute factor and short-term reversion dominates at one hour. Imbalance is not risk-adjusted alpha at HFT horizons, but it leaks into other signals, so factor it out.
- The test: dot your position vector with the specific-factor returns (returns minus what factors explain). A flat curve means the alpha was known exposure; an upward curve means genuine new edge. It shows performance in excess of what you already know.
- Run the specific-factor backtest before trusting any signal. A raw-returns backtest rewards reloading the factor and the curve looks real, just not new. Factors also regularize portfolio construction by capping exposure to effects you already own.