Machine Learning Math
Loss Functions
Loss functions measure how wrong a model's predictions are. Choosing the right one — MSE for regression, cross-entropy for classification — determines what the model actually optimizes for during training.
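Both losses are only a few lines of plain Python. A minimal sketch (function names are illustrative; in practice you would use a library implementation):

```python
import math

def mse(y_true, y_pred):
    # Mean squared error: average squared residual (regression).
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred):
    # Cross-entropy for binary labels: heavily penalizes confident
    # wrong predictions because of the log.
    eps = 1e-12  # clip probabilities to avoid log(0)
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

print(mse([1.0, 2.0], [1.5, 2.5]))               # 0.25
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # ≈ 0.105
```

Note how cross-entropy rewards calibrated confidence: predicting 0.9 for a true positive costs ≈ 0.105, while predicting 0.1 for it would cost ≈ 2.3.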
Bias vs Variance
Bias and variance are two distinct ways a model can fail. Bias is systematic error: the model is consistently wrong in the same direction regardless of which training sample it saw. Variance is sensitivity to the training sample: the model fits the data it has seen closely, but its predictions change substantially when it is retrained on different data.
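One way to see both failure modes numerically is to refit models of different capacity on many resampled noisy datasets and measure how the prediction at one query point behaves. A sketch with NumPy (the sine target, noise level, and polynomial degrees are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)

def true_f(t):
    # The underlying pattern the model should learn.
    return np.sin(2 * np.pi * t)

def predictions_at(degree, x_eval=0.9, trials=200):
    # Refit on a freshly noised sample each trial and record
    # the prediction at a single query point.
    preds = []
    for _ in range(trials):
        y = true_f(x) + rng.normal(0, 0.3, size=x.size)
        coeffs = np.polyfit(x, y, degree)
        preds.append(np.polyval(coeffs, x_eval))
    return np.array(preds)

truth = true_f(0.9)
for degree in (1, 9):
    p = predictions_at(degree)
    print(f"degree {degree}: bias ~ {p.mean() - truth:+.3f}, "
          f"variance ~ {p.var():.3f}")
```

The straight line (degree 1) is biased: its average prediction misses the sine, in the same direction every time. The degree-9 polynomial has low bias but high variance: its prediction swings from trial to trial because it chases the noise in each sample.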
Overfitting and Underfitting
Overfitting happens when a model memorizes training data instead of learning the pattern. Underfitting happens when a model is too simple to capture the pattern at all. Both produce bad models — for opposite reasons.
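The train/test error gap makes both failure modes measurable. A sketch using NumPy polynomial fits (the sine data, noise level, and degrees are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_sample(n):
    # Noisy observations of a sine pattern.
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, 0.25, n)

x_tr, y_tr = noisy_sample(15)    # small training set
x_te, y_te = noisy_sample(200)   # held-out data

errors = {}
for degree in (1, 4, 12):
    coeffs = np.polyfit(x_tr, y_tr, degree)
    train_mse = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, "
          f"test MSE {test_mse:.3f}")
```

Degree 1 underfits: both errors are high because a line cannot bend into a sine. Degree 12 overfits: training error is driven toward zero while test error climbs, the signature gap of memorization.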
Regularization (L1 and L2)
Regularization adds a penalty for model complexity to prevent overfitting. L1 produces sparse models by zeroing out weak weights. L2 shrinks all weights smoothly toward zero. Both discourage the model from relying too heavily on any single feature.
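The two penalties behave differently inside gradient descent: L2 adds a term proportional to the weight itself, while L1 is applied as a soft-threshold step that can land weights exactly at zero. A sketch in NumPy (the data, penalty strength, and learning rate are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 100, 5
X = rng.normal(size=(n, d))
true_w = np.array([2.0, 0.0, 0.0, -1.5, 0.0])  # only two informative features
y = X @ true_w + rng.normal(0, 0.1, n)

def fit(penalty, lam=1.0, lr=0.01, steps=2000):
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        if penalty == "l2":
            # L2: the penalty gradient lam * w shrinks every weight smoothly.
            w -= lr * (grad + lam * w)
        else:
            # L1: gradient step, then soft-threshold; weights whose
            # magnitude falls below the threshold snap to exactly zero.
            w -= lr * grad
            w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

print("L2:", np.round(fit("l2"), 3))
print("L1:", np.round(fit("l1"), 3))
```

With this setup the L1 fit zeroes the three uninformative weights outright (a sparse model), while the L2 fit keeps all five weights nonzero but pulled toward zero.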
Train/Test Split
You can't evaluate a model on data it was trained on — it has already seen the answers. Splitting data into train, validation, and test sets gives you an honest measure of how well the model generalizes to new inputs.
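A minimal shuffled three-way split in plain Python (the 70/15/15 fractions and the helper name are illustrative):

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=42):
    # Shuffle a copy so the caller's ordering is untouched,
    # then slice off the test and validation sets.
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

The validation set is for choosing hyperparameters during development; the test set is touched once, at the end, so its score stays an honest estimate of generalization.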
Evaluation Metrics
Accuracy alone is misleading. Precision, recall, and F1 score give you a more complete picture of model performance, especially when classes are imbalanced or when false positives and false negatives carry different costs.
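The classic imbalanced-class failure takes a few lines to demonstrate. A sketch in plain Python (the 90/10 class split and function name are illustrative):

```python
def precision_recall_f1(y_true, y_pred):
    # Count the confusion-matrix cells for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 10 positives, 90 negatives; the model just predicts "negative" for everything.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                          # 0.9
print(precision_recall_f1(y_true, y_pred))  # (0.0, 0.0, 0.0)
```

The do-nothing classifier scores 90% accuracy yet finds zero positives; recall of 0.0 exposes it immediately, which is exactly why these metrics matter under imbalance.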