
Dataviz.Shef

Statistical Modeling

Dataviz Team

A statistical model is a mathematical model that describes the relationship between variables. It encodes a set of assumptions about the sample data and usually represents the data-generating process in an idealised form. Statistical modelling is the process of exploring candidate statistical models to find one that best describes the observed data. This process includes (but is not limited to) the initial selection of probability distributions, encapsulating assumptions as parameters, estimating those parameters, sampling, and comparing different models. Once a statistical model has been drafted, it can be used to test hypotheses, make predictions, and compute confidence intervals (ranges constructed so that, under repeated sampling, they would contain the true value of a quantity a stated proportion of the time).
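As a rough illustration of that workflow, the sketch below simulates a sample, estimates a mean, computes a confidence interval, and runs a simple hypothesis test. The data-generating process (a normal distribution with mean 170 and standard deviation 10), the function names, and the use of a large-sample normal approximation are all assumptions made for this example, not part of the learning path itself.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Return (estimate, lower, upper) for the population mean.

    Uses the normal approximation, which is reasonable for large samples.
    """
    n = len(sample)
    estimate = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, estimate - z * standard_error, estimate + z * standard_error


def z_test_p_value(sample, null_mean):
    """Two-sided p-value for H0: population mean == null_mean (normal approximation)."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    z = (mean(sample) - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data-generating process: values ~ Normal(170, 10).
rng = random.Random(42)
sample = [rng.gauss(170.0, 10.0) for _ in range(500)]

estimate, lower, upper = mean_confidence_interval(sample)
p_value = z_test_p_value(sample, null_mean=150.0)

print(f"estimated mean: {estimate:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
print(f"p-value for H0 'mean = 150': {p_value:.4g}")
```

Here the model is the assumption that the data are independent draws from a single distribution with an unknown mean; the interval and the test are only as trustworthy as that assumption.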

Most modelling methods that do not explicitly model the random component implicitly assume a Gaussian distribution. The most common exceptions are growth models, which often assume a lognormal distribution. The random component in a statistical model may come from something genuinely non-deterministic, from deterministic elements that are unknown, or from noise caused by elements that the model does not capture. Statistics is therefore the art of handling the elements of a model that create random noise, and statisticians are agnostic about whether that randomness is ultimately deterministic or not.
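To make the Gaussian versus lognormal distinction concrete, here is a minimal sketch of a growth model with multiplicative lognormal noise. The exponential growth curve, its parameter values (initial size 2.0, growth rate 0.3, noise scale 0.2), and the helper `least_squares` are hypothetical choices for illustration. Taking logarithms turns the multiplicative lognormal noise into additive Gaussian noise, so ordinary least squares, which implicitly assumes Gaussian errors, can be applied on the log scale.

```python
import math
import random


def least_squares(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(0)
times = [t / 10 for t in range(100)]  # 0.0, 0.1, ..., 9.9

# Hypothetical growth process: size = 2.0 * exp(0.3 * t) * noise,
# where the noise is lognormal (its logarithm is Normal(0, 0.2)).
sizes = [2.0 * math.exp(0.3 * t) * rng.lognormvariate(0.0, 0.2) for t in times]

# On the log scale: log(size) = log(2.0) + 0.3 * t + Gaussian noise.
log_sizes = [math.log(s) for s in sizes]
log_intercept, growth_rate = least_squares(times, log_sizes)
initial_size = math.exp(log_intercept)

print(f"estimated growth rate: {growth_rate:.3f}")
print(f"estimated initial size: {initial_size:.3f}")
```

Fitting the untransformed sizes with least squares would instead treat the noise as additive with constant variance, which does not match how this data was generated.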

In this learning path we will introduce you to probability distributions for common variable types, sampling and how to describe a sample from a given distribution, the basics of statistical models, common statistical testing techniques, and more.
