istatistik.ai

Key Algorithms: Regression & Neural Networks

Regression Analysis: A Practical Guide

Regression remains the workhorse for predicting continuous outcomes and explaining relationships between variables. Despite the rise of deep learning, linear and generalised linear models provide transparency, speed, and strong baselines. This guide covers assumptions, diagnostics, regularisation, and when to switch to non‑linear approaches.

1) Assumptions & Diagnostics

Check linearity (partial residual plots), independence (judged from the study design, e.g. no unmodelled repeated measures or time ordering), homoscedasticity (residuals vs. fitted values), and normality of errors (QQ plots). Violations are common and not fatal: robust regression, transformations, or heteroscedasticity-consistent standard errors can mitigate them.
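
As a minimal sketch of the last remedy, the snippet below fits ordinary least squares with numpy on simulated heteroscedastic data and compares classical standard errors with HC1 heteroscedasticity-consistent ones. The data, the seed, and all variable names are assumptions made for illustration, and the correlation between fitted values and absolute residuals is only a crude numeric stand-in for a residual vs. fitted plot.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
# Simulated data (an assumption): the error spread grows with x.
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3 * x, size=n)

X = np.column_stack([np.ones(n), x])          # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS coefficients
fitted = X @ beta
resid = y - fitted

XtX_inv = np.linalg.inv(X.T @ X)
k = X.shape[1]

# Classical standard errors assume constant error variance.
sigma2 = resid @ resid / (n - k)
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC1 sandwich estimator: White's estimator with a degrees-of-freedom correction.
meat = X.T @ (X * resid[:, None] ** 2)
cov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
se_hc1 = np.sqrt(np.diag(cov_hc1))

# Crude homoscedasticity check: does |residual| trend with the fitted values?
spread_corr = np.corrcoef(fitted, np.abs(resid))[0, 1]

print("coefficients:", beta)
print("classical SE:", se_classical)
print("HC1 SE:      ", se_hc1)
print("corr(fitted, |resid|):", spread_corr)
```

A clearly positive correlation here hints that the error spread grows with the fitted values, in which case the HC1 standard errors are the safer ones to report.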

2) Feature Selection & Multicollinearity

Use domain knowledge first. Quantify collinearity via variance inflation factors (VIF); a common rule of thumb treats values above roughly 5 to 10 as a warning sign. Penalised regression (ridge/lasso) stabilises estimates under multicollinearity and reduces variance at the cost of some bias. Remember that sparsity from the lasso is a modelling choice, not a truth claim.
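
The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it directly with numpy; the helper name `vif` and the simulated predictors are assumptions made for this example, not part of any library.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (pass X without an intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # nearly a copy of x1
x3 = rng.normal(size=n)                   # unrelated predictor

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # large values for x1 and x2, close to 1 for x3
```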

3) Regularisation

Ridge (L2) shrinks coefficients, lasso (L1) sets some exactly to zero, and elastic‑net blends both. Cross‑validate the penalty parameter and prefer simpler models when performance differences are negligible.
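
Cross-validating the penalty can be sketched for ridge, which has a closed-form solution. The helpers `ridge_fit` and `cv_mse`, the candidate grid, and the simulated data are all assumptions made for illustration. The predictors are simulated on a common scale, so no standardisation step is shown, and lasso and elastic-net are left out because they need an iterative solver.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge on centred data; the intercept is not penalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return intercept, coef

def cv_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of ridge, averaged over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        b0, b = ridge_fit(X[train], y[train], lam)
        pred = b0 + X[test] @ b
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

# Simulated data (an assumption): 30 predictors, only the first 3 matter.
rng = np.random.default_rng(2)
n, p = 80, 30
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:3] = [2.0, -1.0, 0.5]
y = X @ true_coef + rng.normal(scale=1.0, size=n)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)

for lam in lambdas:
    print(f"lambda={lam:<6} CV MSE={scores[lam]:.3f}")
print("selected lambda:", best_lam)
```

If two neighbouring penalties give nearly the same cross-validated error, picking the larger one follows the advice above to prefer the simpler model.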

4) Non‑Linearities & Interactions

Use splines, polynomials (with care, since high degrees behave erratically near the edges of the data), or tree-based models to capture non-linear effects. Interaction terms reveal conditional relationships (e.g., price sensitivity differs by customer segment).
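
The price sensitivity example can be sketched with a single interaction term. Everything below is simulated: the segment indicator `premium`, the coefficients used to generate demand, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
price = rng.uniform(5.0, 15.0, size=n)
premium = rng.integers(0, 2, size=n).astype(float)   # 1 = premium segment

# Simulated demand: the price slope is -2.0 for standard customers
# and -2.0 + 1.5 = -0.5 for premium customers.
demand = 50.0 - 2.0 * price + 1.5 * price * premium + rng.normal(0.0, 1.0, size=n)

# Columns: intercept, price, segment main effect, price x segment interaction.
X = np.column_stack([np.ones(n), price, premium, price * premium])
beta, *_ = np.linalg.lstsq(X, demand, rcond=None)

slope_standard = beta[1]            # price effect when premium == 0
slope_premium = beta[1] + beta[3]   # price effect when premium == 1

print("price slope, standard segment:", slope_standard)
print("price slope, premium segment: ", slope_premium)
```

The coefficient on `price` alone is the slope for the reference segment only; the slope for the other segment is the sum of the main effect and the interaction coefficient.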

5) GLMs & Beyond

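When outcomes are counts or rates, Poisson or negative binomial regression may fit better than a Gaussian linear model; the negative binomial is the usual choice when counts are overdispersed. For outcomes bounded in (0, 1), such as proportions, consider beta regression. Quantile regression estimates conditional quantiles and is robust to outliers in the response, which makes it well suited to service-level guarantees.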
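
To make the Poisson case concrete, here is a minimal sketch of a log-link Poisson GLM fitted by iteratively reweighted least squares (IRLS), followed by a Pearson dispersion check. The function `poisson_irls`, the fixed iteration count, and the simulated data are assumptions for illustration; a production fit would normally come from a statistics library with convergence checks.

```python
import numpy as np

def poisson_irls(X, y, n_iter=25):
    """Fit a log-link Poisson GLM by IRLS. Assumes column 0 of X is the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())            # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta                    # linear predictor
        mu = np.exp(eta)                  # fitted mean under the log link
        z = eta + (y - mu) / mu           # working response
        W = mu                            # working weights
        beta = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * z))
    return beta

# Simulated counts (an assumption): log mean = 0.3 + 0.8 * x.
rng = np.random.default_rng(4)
n = 1000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
y = rng.poisson(np.exp(0.3 + 0.8 * x))

beta_hat = poisson_irls(X, y)

# Pearson dispersion: near 1 is consistent with Poisson variance.
mu_hat = np.exp(X @ beta_hat)
dispersion = np.sum((y - mu_hat) ** 2 / mu_hat) / (n - X.shape[1])

print("coefficients:", beta_hat)
print("Pearson dispersion:", dispersion)
```

A dispersion well above 1 signals overdispersion, which is the situation where a negative binomial model is worth trying.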

FAQ

How do I choose between RMSE and MAE?

MAE is more robust to outliers; RMSE penalises large errors more heavily. Choose based on the business cost of large misses.
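
A small worked example shows the difference. The two prediction sets below are made-up numbers chosen so that the total absolute error is identical, once spread evenly and once concentrated in a single large miss.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.mean(np.abs(diff)))

y_true = [10.0, 12.0, 11.0, 13.0, 12.0]
pred_even = [12.0, 14.0, 13.0, 15.0, 14.0]    # every prediction off by 2
pred_spiky = [10.0, 12.0, 11.0, 13.0, 22.0]   # four exact, one off by 10

print("even :", mae(y_true, pred_even), rmse(y_true, pred_even))
print("spiky:", mae(y_true, pred_spiky), rmse(y_true, pred_spiky))
```

Both sets have an MAE of 2.0, but the RMSE rises from 2.0 to about 4.47 when the error is concentrated in one miss. If that single large miss is far more costly than five small ones, RMSE reflects the business cost better.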

Can I interpret coefficients causally?

Only under strong identification assumptions. For interventions use causal inference tools such as instrumental variables (IVs), difference-in-differences (DID), or randomised controlled trials (RCTs).
