Understanding Quantitative Finance

A deep dive

What Are Markets?

Markets are strange places. Every day, millions of people simultaneously buy and sell the same stocks, currencies, and derivatives, each convinced they know something the others don't. Prices move. Money changes hands. Fortunes are made and lost, sometimes in milliseconds. Quantitative finance is the attempt to make sense of all of that using math, data, and code. Not to perfectly predict markets, because that is impossible, but to find edges: small, repeatable advantages that, over thousands of trades, add up to something real.

The word "quantitative" just means you are working with numbers rather than gut feel. But that is not the interesting part. The interesting part is that quantitative finance forces you to be precise about your beliefs. Instead of saying "I think this stock will go up," you are forced to say: given historical data, what is the expected return of this position, what is the probability of loss, and how does this trade behave when the market crashes? That discipline of turning intuition into testable, falsifiable claims is what separates quant work from everything else.

Before any math, here is the right mental model. Think of a market as a machine that constantly tries to answer one question: what is this asset worth right now? The price you see is the market's best collective guess at any given moment. It is not always right. It is built from everyone's expectations, fears, risk tolerance, and access to information. When new information arrives, the machine updates. Prices move. Quantitative finance asks whether there are patterns in how that machine updates, and whether there are moments when it is systematically wrong in a predictable direction. If so, can you trade against that mistake before it corrects itself?

The people chasing that question use wildly different tools, work on completely different timescales, and operate on entirely different theories of where the edge comes from. That is what makes this field so deep. There is no single answer to what a quant does, because the word covers ten fundamentally different philosophies about what markets are and how to beat them.

The Landscape

Before getting into each archetype, it helps to have the full landscape in one place. The table below maps out all ten by the most practically important dimensions: what kind of edge they are exploiting, how long they hold a position, how much technology and capital they need, and whether their returns come from true alpha, a structural risk premium, or providing a market service.

Statistical arbitrageurs at firms like DE Shaw and Two Sigma exploit mean reversion in correlated price spreads, holding positions from minutes to days, generating true alpha. Market makers like Citadel Securities and Virtu operate at millisecond to second timescales, collecting the bid-ask spread at massive scale, earning a service premium rather than directional alpha. Latency arbitrageurs at Jump Trading push this further into microseconds, using speed advantages across venues, requiring very high capital just to build the infrastructure.

Trend followers like Man AHL and Renaissance exploit the slow diffusion of information across the investor population, holding from days to months, earning a blend of alpha and risk premium. Volatility traders at firms like Susquehanna trade the gap between implied and realized volatility. Macro quants like Bridgewater and DE Shaw model cross-asset flows and rate differentials at the largest scale.

Factor investors like AQR and Dimensional tilt portfolios toward structural risk premia like value, quality, and momentum. Machine learning traders like Two Sigma and Numerai use very high technology to discover non-human patterns from alternative data. Event-driven traders at Millennium and Point72 price probabilities around mergers, earnings, and regulatory decisions. Finally, execution algorithm designers at Goldman and ITG reduce the cost of trading for large institutions.

The return type matters more than most introductions acknowledge. True alpha means you are generating returns uncorrelated with any known risk factor — you found something the market missed. A risk premium means you are being compensated for bearing a specific kind of risk that other investors want to offload. A service premium means you are being paid to do something the market needs. All three can make money. Only the first is genuinely scarce.

[Interactive strategy explorer: the ten archetypes, browsable by edge source, holding period, return type, and key firms.]

The Strategies

Each archetype has a philosophy, a mathematical core, and a failure mode. Understanding all three is what separates someone who can read about a strategy from someone who can actually run one.

The Statistical Arbitrageur

The stat arb trader is essentially a scientist of market relationships. They do not care what a company makes or what its CEO said in the last earnings call. They care about the statistical relationship between two or more prices, and whether that relationship has temporarily broken down in a way that will correct itself.

The classic version is pairs trading. Take Goldman Sachs and Morgan Stanley. Both are large US investment banks exposed to similar economic forces. Over long periods their stocks tend to move together. The spread they are watching:

Spread(t) = Price_GS(t) − β × Price_MS(t)

Here β is estimated from historical data and represents how much Goldman tends to move for every one unit Morgan Stanley moves. If Goldman drops sharply while Morgan Stanley holds steady, Spread(t) falls well below its historical mean. The trader goes long Goldman, short Morgan Stanley, and waits for convergence. The z-score triggers the trade:

Z(t) = ( Spread(t) − Mean(Spread) ) / StdDev(Spread)

When Z exceeds 2 in either direction, you enter. When it returns toward zero, you exit. The position is market neutral: the only thing you are betting on is the relationship between them, not the market direction.

[Interactive chart: the pair's z-score plotted against ±1σ, ±2σ, and ±3σ bands; no trade is signaled while the spread stays inside ±2σ.]

The Market Maker

The market maker is not trying to predict where prices go. They are trying to make money regardless of where prices go, by being on both sides of every trade and collecting the difference.

Spread = Ask price − Bid price

A market maker simultaneously quotes both. They will buy from you at the bid and sell to you at the ask. On a liquid stock like Apple, the spread might be just one cent. Collect one cent on a hundred million shares per day and you are printing money. The Avellaneda-Stoikov model (2008) gives the optimal quotes given current inventory and volatility:

Reservation price = Fair value − γ σ² q T
Bid* = Reservation price − δ        Ask* = Reservation price + δ

Where γ is risk aversion, σ is the asset's volatility, q is current signed inventory, T is the remaining time horizon, and δ is the half-spread (which the full model derives from γ and the rate at which orders arrive). The key feature is that inventory shifts both quotes in the same direction. If you are holding too much inventory (q > 0), both quotes shade down: your ask becomes more attractive to buyers, your bid less attractive to sellers, and the position drains back toward zero. Citadel Securities handles roughly 25 percent of all US equity volume. Virtu went 1,237 out of 1,238 trading days without a single losing day between 2009 and 2014.
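The inventory logic can be sketched in a few lines. This follows the reservation-price form of the model, with the half-spread taken as a given input rather than derived; all parameter values are hypothetical:

```python
def as_quotes(fair_value, gamma, sigma, q, T, half_spread):
    """Simplified Avellaneda-Stoikov-style quoting: signed inventory q
    shifts a reservation price, and a symmetric half-spread is quoted
    around it. The full model also derives half_spread from gamma and
    the order-arrival intensity; here it is a given input."""
    reservation = fair_value - q * gamma * sigma**2 * T
    return reservation - half_spread, reservation + half_spread

# Long 5 units: both quotes shade down, inviting buyers to lift the ask
bid, ask = as_quotes(fair_value=100.0, gamma=0.1, sigma=2.0, q=5, T=1.0,
                     half_spread=0.05)
```

With a flat book (q = 0) the quotes straddle fair value symmetrically; the shading only appears once inventory builds up.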

The Latency Arbitrageur

The latency arbitrageur exploits the fact that information travels at the speed of light, and exchanges are not at the same location. When a large order hits the NYSE and moves the price of Apple, the same information has not yet reached NASDAQ or BATS. For approximately 150 microseconds, Apple is priced differently on different venues.

Profit = (Price_venue1 − Price_venue2) × Volume − Infrastructure cost

Jump Trading built microwave relay networks between New York and Chicago specifically for this purpose. The microwave signal, traveling through air instead of fiber optic cable, arrives roughly 100 microseconds faster. The infrastructure costs hundreds of millions of dollars. The profit per trade is fractions of a cent, done millions of times per day.

The Trend Follower

The trend follower bets that whatever is moving will keep moving. The underlying thesis is behavioral: markets trend because information spreads slowly and unevenly through the investor population. A simple momentum signal:

Signal(t) = ( Price(t) − Price(t − n) ) / StdDev(t−n  to  t)

You measure how far the price has moved over the last n days, normalized by how volatile it has been over that period. The normalization is critical: a ten dollar move in a stock that moves one dollar a day is a strong signal; the same move in a stock that moves twenty dollars a day is noise. Man AHL trades across over 500 markets simultaneously. The diversification matters: in any given month maybe 40 percent of markets are trending in a way the model catches. The profit from that 40 percent more than covers the churning 60 percent.
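A minimal sketch of that signal (the window length, and the choice to normalize by the volatility of one-day changes, are illustrative assumptions rather than any firm's actual specification):

```python
import numpy as np

def momentum_signal(prices, n=20):
    """Price move over the last n periods, normalized by the volatility
    of one-period changes over that same window."""
    prices = np.asarray(prices, dtype=float)
    move = prices[-1] - prices[-1 - n]
    vol = np.std(np.diff(prices[-1 - n:]), ddof=1)
    return move / vol

# A steady but noisy uptrend produces a clearly positive signal
signal = momentum_signal([100, 102, 101, 103, 102, 104], n=5)
```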

The Volatility Trader

The volatility trader is not trading the direction of prices — they are trading the price of uncertainty itself. The Black-Scholes formula prices a call option as:

C = S0 N(d1) − K e^(−rT) N(d2)

d1 = [ ln(S0/K) + (r + σ²/2) × T ] / (σ √T)
d2 = d1 − σ √T

Every term is observable except σ. You can back out an implied volatility from market prices — it is the σ value that makes the formula match the actual market price. The key insight is that implied volatility tends to be higher than realized volatility most of the time. The market consistently overpays for protection. A volatility trader who systematically sells options and hedges the directional risk is collecting that premium. Susquehanna International Group, founded in 1987, built one of the most sophisticated options trading operations in the world on exactly this analysis.
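Backing out implied volatility is concrete enough to sketch. Because the Black-Scholes call price increases monotonically in σ, a plain bisection recovers it; this sketch builds the standard normal CDF from erf, and the parameter values are illustrative:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S0, K, r, sigma, T):
    """Black-Scholes price of a European call."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(market_price, S0, K, r, T, lo=1e-6, hi=5.0):
    """Bisection: the call price is monotone increasing in sigma."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, r, mid, T) < market_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Round-tripping a known σ through the pricer and the solver is a useful sanity check on both.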

The Macro Quant

The macro quant models entire economies and the capital flows between them. The most accessible entry point is the carry trade:

Carry = r_AUD − r_JPY + Δe

Where Δe is the percentage change in the AUD/JPY exchange rate over the holding period (negative when the Australian dollar depreciates against the yen). If you borrow yen at near-zero cost and invest in Australian dollars at 4.5 percent, you pocket the differential as long as the currency holds. Uncovered interest rate parity — the theory that says this trade should be arbitraged away — empirically fails. Bridgewater ran macro strategies like this while managing over 150 billion dollars at its peak. What makes the macro quant different from a discretionary trader like Soros is systematization: hundreds of smaller versions of that logic run simultaneously across dozens of currency pairs, managed by an algorithm.
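In numbers (all rates here are illustrative assumptions, with Δe taken as the signed percentage return on AUD/JPY, negative when the Aussie depreciates):

```python
# Carry trade P&L sketch. All numbers are illustrative assumptions.
r_aud = 0.045     # yield earned on the AUD deposit
r_jpy = 0.001     # cost of borrowing yen
delta_e = -0.02   # AUD/JPY fell 2% over the holding period

carry = r_aud - r_jpy + delta_e
# The 4.4-point rate differential is cut to ~2.4% by the currency move;
# a depreciation larger than 4.4% would have turned the trade into a loss.
```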

The Factor Investor

Factor investing was born from an empirical puzzle: if markets are efficient, why do cheap stocks, small stocks, and high-quality stocks persistently outperform? Fama and French answered this in 1992 by extending CAPM into a three-factor model:

E[R_i] − R_f = β_market × (E[R_m] − R_f) + β_size × SMB + β_value × HML

Where SMB is the return of small stocks minus large stocks, and HML is the return of cheap stocks minus expensive stocks. AQR Capital, founded by Cliff Asness in 1998, built a firm managing over 140 billion dollars on exactly this idea. The strategy requires no particular stock to do anything — it just requires that the systematic tilt earns its historical premium on average. The risk is crowding: by 2018, enormous institutional money had piled into factor strategies, and a deleveraging event could force them all to sell the same stocks simultaneously.
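Estimating the betas is an ordinary least-squares regression of excess returns on the factor returns. A sketch on synthetic data, where the true loadings are chosen in advance so the regression can be checked:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Fabricated factor returns and a stock with known loadings (1.0, 0.4, 0.6)
mkt, smb, hml = rng.normal(0.0, 0.01, size=(3, T))
excess_ret = 1.0 * mkt + 0.4 * smb + 0.6 * hml + rng.normal(0.0, 0.005, T)

# Regress excess returns on an intercept plus the three factors
X = np.column_stack([np.ones(T), mkt, smb, hml])
coefs, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
alpha, beta_mkt, beta_smb, beta_hml = coefs  # recovers ~0, ~1.0, ~0.4, ~0.6
```

A near-zero intercept is the expected result here, since the synthetic stock was built with no alpha.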

The Machine Learning Trader

The ML trader does not start with a hypothesis and test it. They feed data into a model and let the patterns emerge. What makes this possible now is alternative data: satellite imagery of retail parking lots to estimate foot traffic before earnings, credit card transaction data to see exactly how much consumers are spending at each retailer weeks before official revenue figures, natural language processing on SEC filings and earnings call transcripts.

Two Sigma, founded in 2001 by David Siegel and John Overdeck, treats financial markets as a machine learning problem at every level: signal generation, portfolio construction, execution, and risk management. The deep danger is overfitting. A model trained on ten years of data will find patterns — many real, some statistical artifacts. The academic term is the multiple comparisons problem. In finance it is called p-hacking. Numerai built an entire business model around this problem, aggregating competing models in ways that hopefully cancel out individual overfitting.

The Event-Driven Trader

The event-driven trader hunts for moments when the market is pricing the probability of a specific outcome incorrectly. The clearest example is merger arbitrage. When Company A announces it will acquire Company B at $50 per share, Company B's stock does not immediately trade at $50 — it trades at $47, reflecting deal completion risk. The math:

Expected return = p × Gain(deal closes) + (1 − p) × Loss(deal breaks)

Where p is the estimated probability the deal closes. The quant edge is estimating p more accurately than the market by building models that incorporate regulatory approval probabilities for different deal types, financing risk, shareholder vote likelihood, and how similar deals in similar conditions have fared historically. Millennium Management and Point72 run significant event-driven books on exactly this kind of systematic analysis.

Worked example: with an 80 percent chance the deal closes for a +6% gain and a 20 percent chance it breaks for a −20% loss:

Expected return = 0.80 × (+6%) + 0.20 × (−20%) = +0.80%

The Execution Algorithm Designer

The tenth archetype is the one most people forget to include. Every institutional investor that needs to buy or sell large blocks of stock faces the same problem: the act of trading moves the price against you. The simplest execution algorithm is VWAP, Volume Weighted Average Price:

VWAP = ∑(Price_i × Volume_i) / ∑(Volume_i)

If you participate in 10 percent of volume throughout the day, you will achieve roughly the market's average price for the day. More sophisticated algorithms like Implementation Shortfall, formalized by Almgren and Chriss in 2000, optimize the tradeoff between trading quickly (less risk of the price drifting away from you, but more impact from your own orders) and trading slowly (less impact from your own orders, but more exposure to adverse price drift). Goldman Sachs runs SIGMA X, one of the largest alternative trading systems in the world, built around execution optimization. These strategies are not generating alpha — they are saving institutional clients billions annually in reduced trading costs.
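The VWAP computation itself is a one-liner over the day's trades (the prices and volumes below are invented):

```python
import numpy as np

def vwap(prices, volumes):
    """Volume-weighted average price over a set of trades."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    return (prices * volumes).sum() / volumes.sum()

# Most volume printed at 10.1, so VWAP sits there, not at the simple midpoint
v = vwap([10.0, 10.1, 10.2], [100, 300, 100])  # ≈ 10.1
```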

The Discipline

What ties all ten of these together is a single underlying discipline. You form a hypothesis about how some aspect of the market behaves. You express it mathematically, so it is precise and testable. You test it against historical data honestly, including realistic transaction costs and without looking forward in time. You measure the risk-adjusted return and understand why it should persist. And then you build a system to execute it mechanically, because the moment human judgment re-enters the loop, so do cognitive biases.

The math serves the thinking. It does not replace it. A formula with a wrong assumption is worse than no formula at all, because it gives you false confidence. Every model in this document comes with its assumptions explicitly stated and its failure modes mapped out. That is not pessimism. That is what it means to actually understand a model rather than just being able to write it down.

Probability & Statistics

Every strategy in quantitative finance is ultimately a bet on a probability distribution. Before you can price a derivative, build a factor model, or test a mean-reversion signal, you need a precise language for describing uncertainty. That language is probability theory.

Random Variables

A random variable is a number whose value is determined by the outcome of some random process. Tomorrow's stock price is a random variable. So is the number of trades executed in the next hour, or the profit from a position you have not yet closed. By convention, random variables are written with capital letters: X, Y, Z. Their realized values are written in lowercase: x, y, z.

Probability Distributions

A probability distribution is a complete map of uncertainty — the full range of possible outcomes and how likely each one is. For a discrete variable (like the number of heads in ten coin flips), this is the probability mass function (PMF). For a continuous variable (like a stock return), it is the probability density function (PDF). The PDF does not give you the probability of an exact value — it gives you the probability of landing in a range. The probability that X falls between a and b is the area under the curve from a to b.

The Normal Distribution

The normal distribution X ∼ N(μ, σ²) is the workhorse of quantitative finance. It appears everywhere for three reasons: mathematical convenience, the Central Limit Theorem (the sum of many small independent random variables converges to normal), and the fact that it is fully described by just two parameters — mean μ and variance σ². Log-returns on assets are approximately normally distributed over short intervals, which is why nearly every model in this field starts there.

Expectation, Variance, and Covariance

The expected value of a random variable is the probability-weighted average of all possible outcomes:

E[X] = ∑_i x_i × P(X = x_i)

Variance measures how spread out a distribution is around its mean. Standard deviation is its square root, which restores the original units. In finance, the standard deviation of returns is called volatility.

Var[X] = E[(X − E[X])²]      Std[X] = √Var[X]

Covariance and correlation measure how two variables move together. Correlation normalizes covariance to [−1, +1], making it scale-invariant and directly comparable across different pairs of assets:

Cov[X, Y] = E[(X − E[X])(Y − E[Y])]
Corr[X, Y] = Cov[X, Y] / (Std[X] × Std[Y])  ∈  [−1, +1]
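Each of these quantities has a direct sample analogue in numpy. A minimal sketch, using two invented return series:

```python
import numpy as np

x = np.array([0.010, -0.020, 0.030, 0.000, 0.005])   # hypothetical daily returns, asset X
y = np.array([0.008, -0.015, 0.020, -0.002, 0.004])  # hypothetical daily returns, asset Y

mean_x = x.mean()                    # sample analogue of E[X]
vol_x = x.std(ddof=1)                # sample standard deviation ("volatility")
cov_xy = np.cov(x, y, ddof=1)[0, 1]  # sample covariance
corr_xy = np.corrcoef(x, y)[0, 1]    # normalized to [-1, +1]
```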
Worked example, the binomial distribution: with n = 10 trials and success probability p = 0.50, the mean is np = 5.00, the variance np(1−p) = 2.50, the standard deviation √2.50 ≈ 1.58, and the mode (the most likely count) is 5.

As n grows large, this distribution approaches N(np, np(1−p)) — the Central Limit Theorem in action.

Exercise 1 — Portfolio Volatility

You hold 50% in Stock A (σ = 10%) and 50% in Stock B (σ = 10%). Portfolio variance formula:

Portfolio σ = √(w_A² σ_A² + w_B² σ_B² + 2 w_A w_B Cov[A, B])

At ρ = +1, portfolio σ = 10% (no diversification). At ρ = 0, portfolio σ ≈ 7.07%. At ρ = −1, portfolio σ = 0% (a perfect hedge).
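The three cases can be checked directly with a small helper:

```python
import numpy as np

def portfolio_vol(w_a, w_b, sigma_a, sigma_b, rho):
    """Two-asset portfolio volatility; Cov[A, B] = rho * sigma_a * sigma_b."""
    cov = rho * sigma_a * sigma_b
    var = w_a**2 * sigma_a**2 + w_b**2 * sigma_b**2 + 2 * w_a * w_b * cov
    # Guard: at rho = -1, rounding can leave var a hair below zero
    return np.sqrt(max(var, 0.0))

vol_plus = portfolio_vol(0.5, 0.5, 0.10, 0.10, +1.0)   # 0.10, no diversification
vol_zero = portfolio_vol(0.5, 0.5, 0.10, 0.10, 0.0)    # ≈ 0.0707
vol_minus = portfolio_vol(0.5, 0.5, 0.10, 0.10, -1.0)  # 0.0, perfect hedge
```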

Stochastic Processes

A stochastic process is a sequence of random variables indexed by time. Asset prices are stochastic processes — they are not deterministic functions of time, but evolve randomly according to some underlying distribution. The key challenge is modeling that evolution in a way that is both mathematically tractable and empirically reasonable.

Random Walks and Brownian Motion

The simplest model is the symmetric random walk: X(t+1) = X(t) ± 1 with equal probability. The key insight is that uncertainty compounds as √T, not T. After 4 days your typical (root-mean-square) displacement is twice, not four times, what it is after 1 day. Brownian motion W(t) is the continuous-time limit, satisfying W(0) = 0, independent increments, and W(t) ∼ N(0, t). An increment over a small interval dt satisfies:

dW ∼ N(0, dt)  →  Var[dW] = dt  →  (dW)² = dt

The fact that (dW)² = dt in expectation is the key insight that drives Ito's Lemma. This term does not vanish as dt → 0, which makes stochastic calculus fundamentally different from ordinary calculus.
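The √T compounding of uncertainty is easy to verify by simulation; this sketch uses an arbitrary seed and path count:

```python
import numpy as np

rng = np.random.default_rng(42)
n_paths = 100_000

# Simulate 16 steps of a symmetric +/-1 random walk for many paths
steps = rng.choice([-1.0, 1.0], size=(n_paths, 16))
paths = steps.cumsum(axis=1)

rms_4 = np.sqrt(np.mean(paths[:, 3] ** 2))    # typical displacement after 4 steps
rms_16 = np.sqrt(np.mean(paths[:, 15] ** 2))  # typical displacement after 16 steps
# rms_4 ~ sqrt(4) = 2 and rms_16 ~ sqrt(16) = 4: a 4x horizon, only a 2x displacement
```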

Ito's Lemma

If f is a smooth function of a Brownian motion W and time t, then its differential is:

df = ∂f/∂t · dt  +  ∂f/∂W · dW  +  (1/2) · ∂²f/∂W² · dt

The extra term (1/2)(∂²f/∂W²)dt is the Ito correction. It has no analogue in ordinary calculus and arises precisely because (dW)² = dt. Without it, every financial model built on Brownian motion would be systematically wrong.

Geometric Brownian Motion

The standard model for stock prices is Geometric Brownian Motion. Rather than modeling S(t) directly, we model the percentage change:

dS = μ · S · dt  +  σ · S · dW

Applying Ito's Lemma to f(S) = log(S) gives the crucial result: log-returns are normally distributed with mean (μ − σ²/2) per unit time. The σ²/2 term is the volatility drag — even if the expected arithmetic return is μ, the realized compound return is μ − σ²/2.

d(log S) = (μ − σ²/2) dt  +  σ dW
```python
# Exact discrete simulation of GBM
import numpy as np

def simulate_gbm(S0, mu, sigma, T, n_steps=252, n_paths=10):
    dt = T / n_steps
    Z = np.random.standard_normal((n_steps, n_paths))
    log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
    return S0 * np.exp(np.cumsum(log_returns, axis=0))  # shape: (n_steps, n_paths)
```
Exercise 2 — Volatility Drag

With μ = 10% and σ = 20% annually, the compound return is 10% − 2% = 8% per year. The median path after one year lands near $108.33, not at the arithmetic expectation of $110. The 2% gap is the volatility drag — a permanent mathematical cost of compounding under uncertainty.

Time Series

A time series is a sequence of observations indexed by time. The statistical tools built for time series differ from cross-sectional statistics in one fundamental way: observations are not independent. Understanding these dependencies — and detecting when they break down — is the core of systematic trading.

Stationarity

A time series X(t) is weakly stationary if its mean, variance, and covariance structure are all constant over time. Stock prices are not stationary — a price of 150 today does not revert toward some fixed mean. Log-returns, however, are approximately stationary. Regressing two non-stationary series on each other produces spurious regression: you will find a statistically significant relationship even when none exists. Test for stationarity with the Augmented Dickey-Fuller (ADF) test before any regression involving price levels.

E[X(t)] = μ     Var[X(t)] = σ²     Cov[X(t), X(t+k)] = γ(k)
```python
# Augmented Dickey-Fuller test for stationarity
from statsmodels.tsa.stattools import adfuller

result = adfuller(price_series, autolag='AIC')
adf_stat = result[0]
p_value = result[1]

# H0: series has a unit root (non-stationary). Reject if p_value < 0.05.
print(f"ADF: {adf_stat:.3f}  p-value: {p_value:.4f}")
```

Autocorrelation

The autocorrelation function (ACF) measures the correlation of a series with its own past values:

ACF(k) = Corr[X(t), X(t−k)]

Positive ACF at short lags suggests momentum — recent moves tend to continue. Negative ACF suggests mean reversion — recent moves tend to reverse. Both are tradeable patterns, but they require different strategies.
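Sample autocorrelation is just a lagged correlation. A minimal sketch, with a toy series constructed to be maximally mean-reverting:

```python
import numpy as np

def acf(x, k):
    """Sample autocorrelation at lag k: correlate the series with
    itself shifted k steps back."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[k:], x[:-k])[0, 1]

# A perfectly alternating series is the extreme case of mean reversion
lag1 = acf([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], 1)  # ≈ -1.0
```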

Cointegration and Pairs Trading

Two non-stationary series X(t) and Y(t) are cointegrated if a linear combination is stationary:

Z(t) = Y(t) − β · X(t)    is stationary

The coefficient β is the cointegrating coefficient. If Z(t) drifts far from its mean, it will eventually revert — the two series are held together by an underlying economic force.

```python
# Pairs trading workflow
from statsmodels.tsa.stattools import coint
from numpy.linalg import lstsq

# 1. Test for cointegration
_, p_value, _ = coint(price_A, price_B)  # p < 0.05 ⇒ cointegrated

# 2. Estimate cointegrating coefficient
beta = lstsq(price_A.reshape(-1, 1), price_B, rcond=None)[0][0]

# 3. Compute spread and z-score
spread = price_B - beta * price_A
z_score = (spread - spread.mean()) / spread.std()

# 4. Signal: enter when |z| > 2, exit when |z| < 0.5
long_B = z_score < -2.0
short_B = z_score > 2.0
```

The critical failure mode is a structural break. During the 2020 pandemic, many pairs that had been cointegrated for decades broke apart permanently. When the cointegrating relationship breaks, the spread does not revert — it simply walks away. Any strategy built on cointegration must include stop-loss logic and ongoing monitoring of whether the relationship remains intact.

Linear Algebra

When you move from two assets to hundreds, the two-variable correlation formula becomes unwieldy. Linear algebra provides the compact notation and the computational tools to handle arbitrary numbers of assets simultaneously.

The Covariance Matrix

For N assets, all pairwise variances and covariances are organized into a symmetric N×N matrix Σ. The diagonal entries are the variances; off-diagonal entries are covariances. Portfolio variance is the quadratic form:

Portfolio Variance = wᵀ Σ w
```python
# Portfolio variance via matrix multiplication
import numpy as np

w = np.array([0.5, 0.5])
cov_matrix = np.array([[0.04, 0.02],
                       [0.02, 0.09]])  # σ_A = 20%, σ_B = 30%, Cov = 0.02

port_var = w @ cov_matrix @ w  # = 0.0425
port_std = np.sqrt(port_var)   # ≈ 20.6%
```
Worked example: with equal 50/50 weights, σ_A = 20%, σ_B = 30%, and ρ = 0, portfolio volatility is 18.0%, versus a simple average of 25.0%, a diversification benefit of 7 percentage points. At ρ = +1 there is no diversification; at ρ = −1 the pair can be combined into a perfect hedge.

Principal Component Analysis

PCA decomposes the covariance matrix into its eigenvectors and eigenvalues:

Σ · v = λ · v

Each eigenvector v is a principal component — a portfolio of assets with specific loadings. The corresponding eigenvalue λ is the variance explained by that component. In a portfolio of 500 stocks, the first principal component typically accounts for 30–50% of total variance and corresponds to the broad market factor. PCA reduces a 500×500 covariance matrix to 10–15 risk factors explaining 80%+ of variance — dramatically improving both estimation reliability and computational tractability.

```python
# PCA on a covariance matrix
from numpy.linalg import eigh

eigenvalues, eigenvectors = eigh(cov_matrix)

# Sort descending (largest variance first)
idx = eigenvalues.argsort()[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

explained = eigenvalues / eigenvalues.sum()
print(f"PC1 explains {explained[0]:.1%} of variance")
```

Synthesis: The Four Tools in One Strategy

These four mathematical frameworks are not independent. A pairs trading strategy uses all of them simultaneously. Probability and statistics define the signal: compute the spread, estimate its mean and standard deviation, express entry and exit as z-score thresholds. Stochastic processes model the dynamics: the spread between two cointegrated assets follows an Ornstein-Uhlenbeck process, the mean-reverting analogue of Brownian motion. Time series provide the empirical tests: the ADF test confirms the spread is stationary; the cointegration test confirms the relationship is structural; the ACF at short lags informs the optimal holding period. Linear algebra ensures diversification: PCA of the covariance matrix of all current positions checks that different pairs do not all load on the same first principal component.

Understanding any one of these tools in isolation is not difficult. What makes quantitative finance hard — and valuable — is knowing how they fit together, and which assumptions each one is smuggling in.