Bias vs Variance
Bias and variance are two distinct ways a model can fail. Bias is systematic error: the model is consistently wrong in the same direction, no matter which training set it sees. Variance is sensitivity to the training data: predictions swing depending on which sample the model was trained on, so it looks consistent on seen data but inconsistent on new data.
Intuition First
Imagine you're throwing darts at a target.
- High bias: All your darts land in the same spot — but that spot is far from the bullseye. You're consistently wrong. The systematic error is built into your throwing technique.
- High variance: Your darts are scattered all over the board. You're unpredictable — sometimes close, sometimes way off. You're sensitive to tiny fluctuations in how you throw.
- Low bias, low variance: Darts clustered near the bullseye. This is what you want.
In ML: bias is "consistently wrong"; variance is "inconsistently right."
What's Actually Happening
Bias comes from a model that's too simple to capture the true pattern. It's baked into the model's structure.
Variance comes from a model that's too sensitive — it memorizes the training data, including noise, and fails to generalize.
Both cause bad test-set performance, but for opposite reasons:
- High bias: model can't learn the pattern (underfitting)
- High variance: model learns the noise instead of the pattern (overfitting)
Build the Idea Step-by-Step
Formal Explanation
For a model trained on a dataset D to predict a target y from an input x, the expected squared-error test loss decomposes as:
Expected Test Error = Bias² + Variance + Irreducible Noise
Where:
- Bias² = how far off the average prediction is from the true value
- Variance = how much predictions vary across different training sets
- Irreducible noise = randomness in the data itself — can't be fixed by any model
This decomposition reveals that total error has two controllable components that trade off against each other.
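The decomposition can be estimated empirically: train the same model class on many resampled training sets and look at its predictions at one fixed test point. The spread of those predictions is the variance; the gap between their average and the true value gives the squared bias. A minimal sketch, assuming a sin(x) ground truth and illustrative polynomial degrees (none of this comes from a specific library recipe):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x0 = np.array([[3.0]])    # fixed test input
true_y = np.sin(3.0)      # noiseless target at x0

def estimate(degree, n_sets=200, n_points=30, noise=0.3):
    # Fit the same model class on many independently sampled training sets
    preds = []
    for _ in range(n_sets):
        X = rng.uniform(0, 6, (n_points, 1))
        y = np.sin(X.ravel()) + rng.normal(0, noise, n_points)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X, y)
        preds.append(model.predict(x0)[0])
    preds = np.array(preds)
    bias_sq = (preds.mean() - true_y) ** 2   # average prediction vs truth
    variance = preds.var()                   # spread across training sets
    return bias_sq, variance

for d in (1, 3, 15):
    b, v = estimate(d)
    print(f"degree-{d}: bias^2 = {b:.3f}, variance = {v:.3f}")
```

The degree-1 model shows the largest bias² but small variance; degree-15 flips that pattern, with variance dominating.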
As model complexity increases:
| Complexity | Bias | Variance |
|---|---|---|
| Too low (linear for nonlinear data) | High | Low |
| Just right | Low | Low |
| Too high (overfitting) | Low | High |
Key Properties / Rules
| Signal | Likely Cause |
|---|---|
| Train error high AND test error high | High bias (underfitting) |
| Train error low AND test error high | High variance (overfitting) |
| Train error low AND test error low | Good fit |
| Adding more data doesn't help | High bias (model can't use more data) |
| Adding more data helps | High variance (model was overfitting) |
Why It Matters
The bias-variance tradeoff is the core tension in choosing model complexity:
- A linear model applied to data with a nonlinear pattern → high bias
- A 100-layer network trained on 50 examples → high variance
In practice:
- Regularization, dropout, and early stopping reduce variance
- Deeper networks, more features, and better architectures reduce bias
- More data helps most with high variance
Understanding this tradeoff tells you which direction to push when your model isn't performing well.
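One lever from the list above in action: adding an L2 penalty (Ridge) to an overfit degree-15 polynomial tames its variance without changing the feature set. A sketch, assuming a sin(x) ground truth; the alpha value and the StandardScaler step are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(2)
X_test = np.linspace(0, 6, 200).reshape(-1, 1)
y_true = np.sin(X_test.ravel())  # noiseless targets

def avg_test_mse(model, trials=20, n=30, noise=0.3):
    # Average test MSE over freshly sampled small training sets
    errs = []
    for _ in range(trials):
        X = rng.uniform(0, 6, (n, 1))
        y = np.sin(X.ravel()) + rng.normal(0, noise, n)
        model.fit(X, y)
        errs.append(np.mean((model.predict(X_test) - y_true) ** 2))
    return float(np.mean(errs))

# Same degree-15 features; only the penalty differs.
# Scaling the features keeps the L2 penalty comparable across terms.
overfit = make_pipeline(PolynomialFeatures(15), StandardScaler(), LinearRegression())
ridged = make_pipeline(PolynomialFeatures(15), StandardScaler(), Ridge(alpha=1.0))

print(f"degree-15, no penalty: avg test MSE = {avg_test_mse(overfit):.3f}")
print(f"degree-15, ridge:      avg test MSE = {avg_test_mse(ridged):.3f}")
```

The penalized model has noticeably lower average test error even though both share the same overparameterized feature space.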
Common Pitfalls
- Confusing high bias and high variance from test error alone. Always check training error too — that's the diagnostic. High train error + high test error = bias. Low train error + high test error = variance.
- Thinking more data fixes everything. More data reduces variance but doesn't fix a biased model. If the model structure is wrong, no amount of data helps.
- Assuming a complex model is always better. More complexity reduces bias but increases variance — especially when data is limited.
- Tuning on the test set. This collapses the train/test gap artificially, making variance invisible until production.
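The last pitfall has a simple structural fix: select complexity on a validation set and touch the held-out test set exactly once. A sketch, assuming a sin(x) ground truth and an illustrative range of polynomial degrees:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
X = rng.uniform(0, 6, (120, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.3, 120)

# Hold out the test set FIRST, then carve validation out of the remainder
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

def mse(m, X_, y_):
    return np.mean((m.predict(X_) - y_) ** 2)

# Choose the degree using the validation set only
best_deg, best_val = None, np.inf
for deg in range(1, 12):
    m = make_pipeline(PolynomialFeatures(deg), LinearRegression()).fit(X_train, y_train)
    v = mse(m, X_val, y_val)
    if v < best_val:
        best_deg, best_val = deg, v

# Touch the test set exactly once, with the chosen model
final = make_pipeline(PolynomialFeatures(best_deg), LinearRegression()).fit(X_train, y_train)
print(f"chosen degree = {best_deg}, test MSE = {mse(final, X_test, y_test):.3f}")
```

Because the test set played no role in choosing the degree, its error is an honest estimate of generalization, and the train/test gap stays visible.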
Examples
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# True function: y = sin(x) + noise
np.random.seed(42)
X = np.sort(np.random.rand(30, 1) * 6, axis=0)
y = np.sin(X.ravel()) + np.random.randn(30) * 0.3
X_test = np.linspace(0, 6, 100).reshape(-1, 1)
y_test = np.sin(X_test.ravel())  # noiseless targets for measuring test error

# High bias: degree-1 (linear) — too simple for sin
model_bias = make_pipeline(PolynomialFeatures(1), LinearRegression())
model_bias.fit(X, y)  # high train AND test error — can't capture the curve

# Good fit: degree-3
model_good = make_pipeline(PolynomialFeatures(3), LinearRegression())
model_good.fit(X, y)  # low train and test error

# High variance: degree-15 — memorizes noise
model_var = make_pipeline(PolynomialFeatures(15), LinearRegression())
model_var.fit(X, y)  # low train error, wild oscillations between training points

for name, model in [("degree-1", model_bias), ("degree-3", model_good), ("degree-15", model_var)]:
    train_err = np.mean((model.predict(X) - y) ** 2)
    test_err = np.mean((model.predict(X_test) - y_test) ** 2)
    print(f"{name}: train MSE = {train_err:.4f}, test MSE = {test_err:.4f}")

# degree-1:  train ~0.25 (high bias — even train error is bad)
# degree-3:  train ~0.08 (good fit)
# degree-15: train ~0.03 (overfits — train looks great, test error blows up)
```
Key diagnostic table:
| Model | Train Error | Test Error | Problem |
|---|---|---|---|
| degree-1 | High | High | High bias |
| degree-3 | Low | Low | Ideal |
| degree-15 | Very low | Very high | High variance |