glossary-aez-notes

Mathematics and Statistics Glossary

A

B

C

Cauchy distribution

A continuous probability distribution that is often used as a pathological example: the standard Cauchy distribution is symmetric about zero, but its mean is not well defined. The tails of its density decay as \(1 / x^{2}\).

Consistent estimator

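An estimator of a parameter, \(\theta\), is said to be consistent if it converges in probability to \(\theta\) as the sample size goes to infinity.
An estimator can be unbiased but not consistent, and it can be biased yet consistent. For example, for IID data with finite variance, the first observation \(X_{1}\) is an unbiased but inconsistent estimator of the mean, while the variance estimator with divisor \(n\) is biased but consistent.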

D

Data gravity

A buzzword used to refer to the phenomenon that large collections of data tend to attract attention and additional data.

E

F

G

H

Hamming distance

Wikipedia says: "In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different."
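A minimal sketch of this definition in Python; the function name `hamming_distance` is chosen here for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("karolin", "kathrin"))  # 3
```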

Hat-values

See leverage.

Horseshoe prior

A continuous shrinkage prior in which each parameter has a zero-mean normal distribution whose scale parameter is itself drawn from a half-Cauchy distribution.
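The construction can be sketched by simulation. This is an illustration under stated assumptions rather than a reference implementation: the global scale `tau` is fixed at 1, and the helper names `half_cauchy` and `horseshoe_draw` are hypothetical.

```python
import math
import random


def half_cauchy(rng: random.Random, scale: float = 1.0) -> float:
    """Draw from a half-Cauchy: absolute value of a Cauchy inverse-CDF draw."""
    u = rng.random()
    return abs(scale * math.tan(math.pi * (u - 0.5)))


def horseshoe_draw(rng: random.Random, tau: float = 1.0) -> float:
    """Draw one parameter: local scale ~ half-Cauchy, then normal(0, tau * local)."""
    local_scale = half_cauchy(rng)
    return rng.gauss(0.0, tau * local_scale)


hs_rng = random.Random(0)  # fixed seed so the sketch is repeatable
hs_draws = [horseshoe_draw(hs_rng) for _ in range(10_000)]

# Share of draws very close to zero and share far from zero.
near_zero = sum(abs(d) < 0.1 for d in hs_draws) / len(hs_draws)
far_out = sum(abs(d) > 10.0 for d in hs_draws) / len(hs_draws)
print(near_zero, far_out)
```

The draws pile up near zero while a small share remain very large, which is the shrinkage behaviour the definition describes.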

Hyperparameters

Parameters that specify the prior distribution, as opposed to the parameters of the model for the data.

I

Infinite divisibility (of a distribution)

A random variable is infinitely divisible if for every natural number \(n\) there exist \(n\) IID random variables whose sum has the same distribution.
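As an illustration (a standard fact, not taken from the definition above), the normal distribution is infinitely divisible. For any \(n\), a normal random variable can be written as a sum of \(n\) IID normal pieces, because means and variances of independent normals add:

\[
X \sim N(\mu, \sigma^{2}), \qquad X \overset{d}{=} \sum_{i=1}^{n} Y_{i}, \qquad Y_{1}, \dots, Y_{n} \overset{\text{IID}}{\sim} N\!\left(\frac{\mu}{n}, \frac{\sigma^{2}}{n}\right).
\]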

Inverse probability

An older term used to describe the posterior distribution in Bayesian statistics.

J

K

L

Leverage

In regression, leverage is a measure of how unusual the explanatory values of a particular datum are. Hat-values are one measure of leverage. Leverage is important because points with high leverage have the potential to have a strong effect on estimates.
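A small sketch of hat-values, assuming the special case of simple linear regression with an intercept, where the closed form \(h_{i} = 1/n + (x_{i} - \bar{x})^{2} / \sum_{j} (x_{j} - \bar{x})^{2}\) applies. The function name and the data are hypothetical.

```python
def hat_values(x: list[float]) -> list[float]:
    """Hat-values for simple linear regression with an intercept.

    Uses the closed form h_i = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2.
    """
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]


x = [1.0, 2.0, 3.0, 4.0, 20.0]  # the last predictor value is far from the rest
h = hat_values(x)
print(h)
```

The last point has by far the largest hat-value, and the hat-values sum to 2, the number of regression coefficients.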

Likelihood function

The joint probability (mass or density) of the sample, viewed as a function of the parameters with the sample treated as a fixed quantity¹.
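To make the "function of the parameters" view concrete, here is a sketch for a Bernoulli model. The sample is hypothetical, and the grid search is only for illustration.

```python
def bernoulli_likelihood(p: float, sample: list[int]) -> float:
    """Joint probability of a fixed 0/1 sample, viewed as a function of p."""
    successes = sum(sample)
    failures = len(sample) - successes
    return p ** successes * (1.0 - p) ** failures


sample = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical data: 6 successes in 8 trials

# The sample stays fixed; only the parameter p varies over the grid.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: bernoulli_likelihood(p, sample))
print(best_p)  # 0.75
```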

Lindley's paradox

It is simple to construct a hypothesis test in which the Bayesian and frequentist approaches give different results. Typically, the frequentist test rejects the null hypothesis while the Bayesian analysis favours it. This situation is known as Lindley's paradox.
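A numerical sketch of the disagreement, under assumptions that are not part of the entry above: illustrative binomial counts, a point null \(\theta = 0.5\), a uniform prior on \(\theta\) under the alternative, equal prior odds, and a normal approximation for the frequentist test.

```python
import math

n, x = 98_451, 49_581  # illustrative counts: x successes in n trials

# Frequentist: two-sided test of H0: theta = 0.5 using the normal approximation.
z = (x - n / 2) / math.sqrt(n / 4)
p_value = math.erfc(abs(z) / math.sqrt(2))

# Bayesian: H0: theta = 0.5 versus H1: theta ~ Uniform(0, 1), equal prior odds.
log_binom = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
marginal_h0 = math.exp(log_binom + n * math.log(0.5))
marginal_h1 = 1.0 / (n + 1)  # binomial pmf integrated over a uniform prior
posterior_h0 = marginal_h0 / (marginal_h0 + marginal_h1)

print(f"p-value = {p_value:.4f}, P(H0 | data) = {posterior_h0:.3f}")
```

With these inputs the p-value falls below 0.05, so the frequentist test rejects, while the posterior probability of the null is above 0.9.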

M

N

O

Ordinal

Categorical data with an ordering, for example the data collected on a Likert scale.

Outlier

In regression, an outlier is a point whose response value is inconsistent with what one would expect given the rest of the data. This is not to be confused with an unusual point, which is one for which both the response and predictor values are atypical.

P

Prior distribution

A probability distribution representing the beliefs about the parameters held before seeing the data.
Examples include reference priors and the trilogy of diffuse, weakly informative, and informative priors, which span the spectrum of informativeness.

Prior elicitation

The process of constructing a prior distribution based upon existing knowledge, potentially by interviewing domain experts. There is a large literature attempting to find best practices for eliciting prior distributions. In some settings it is very important; in others it is not worth the large amount of effort required to consult experts.

Prior predictive checking

When carrying out a Bayesian analysis, this is the process of checking that the implications of the prior distribution make sense. As with prior elicitation, there are many methods available for this².
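One simple form of such a check is to simulate data from the prior and inspect whether the simulated values look plausible. The sketch below assumes a hypothetical Beta(2, 2) prior on a success probability and a hypothetical study of 20 trials; it is one possible approach, not a prescribed method.

```python
import random

ppc_rng = random.Random(2)  # fixed seed so the sketch is repeatable
n_trials = 20  # hypothetical study size


def prior_predictive_draw() -> int:
    """Draw a success probability from the prior, then data given that probability."""
    theta = ppc_rng.betavariate(2.0, 2.0)  # hypothetical Beta(2, 2) prior
    return sum(ppc_rng.random() < theta for _ in range(n_trials))


simulated = [prior_predictive_draw() for _ in range(5_000)]

# How often does the prior imply all failures or all successes?
share_extreme = sum(y in (0, n_trials) for y in simulated) / len(simulated)
print(min(simulated), max(simulated), share_extreme)
```

If the simulated counts put substantial weight on outcomes that domain knowledge rules out, the prior may need revisiting.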

Q

R

Reference prior

A type of non-informative prior distribution developed by Berger and Bernardo, which aims to maximise the KL-divergence between the posterior and the prior when averaged over potential data generated by the prior predictive distribution³.

S

Shrinkage prior

A prior distribution which concentrates substantial prior probability at or near a particular value of the parameter, typically zero; this is useful for variable selection.
See the spike-and-slab prior and the horseshoe prior.

Spike-and-slab prior

A shrinkage prior which is a mixture of two distributions: the spike, which has a small variance about zero, and the slab, which has a large variance. For example, a mixture of a point mass at zero and a normal distribution.
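A sketch of drawing from the point-mass example. The mixture weight `p_spike` and the slab standard deviation `slab_sd` are hypothetical values chosen for illustration.

```python
import random


def spike_and_slab_draw(rng: random.Random, p_spike: float = 0.8,
                        slab_sd: float = 5.0) -> float:
    """Draw from a mixture of a point mass at zero and a normal(0, slab_sd) slab."""
    if rng.random() < p_spike:
        return 0.0  # the spike: exactly zero
    return rng.gauss(0.0, slab_sd)  # the slab: a wide normal


ss_rng = random.Random(1)  # fixed seed so the sketch is repeatable
ss_draws = [spike_and_slab_draw(ss_rng) for _ in range(10_000)]
share_zero = sum(d == 0.0 for d in ss_draws) / len(ss_draws)
print(share_zero)
```

Roughly a `p_spike` share of the draws are exactly zero; the rest are spread widely by the slab.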

T

Total order

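Given a set \(X\), a binary relation \(\leq\) on \(X\) is a total order iff it is

- reflexive (meaning \(\forall a \in X\), \(a \leq a\)),
- transitive (meaning \(\forall a, b, c \in X\), if \(a \leq b\) and \(b \leq c\) then \(a \leq c\)),
- anti-symmetric (meaning \(\forall a, b \in X\), if \(a \leq b\) and \(b \leq a\) then \(a = b\)),
- total (meaning \(\forall a, b \in X\), either \(a \leq b\) or \(b \leq a\)).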
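For a finite set, the four properties can be checked by brute force. The helper below is a sketch with a hypothetical name; it is only practical for small sets because it enumerates all triples.

```python
from itertools import product
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_total_order(elements: Iterable[T], leq: Callable[[T, T], bool]) -> bool:
    """Check the four total-order properties by brute force on a finite set."""
    xs = list(elements)
    reflexive = all(leq(a, a) for a in xs)
    transitive = all(
        leq(a, c) for a, b, c in product(xs, repeat=3) if leq(a, b) and leq(b, c)
    )
    antisymmetric = all(
        a == b for a, b in product(xs, repeat=2) if leq(a, b) and leq(b, a)
    )
    total = all(leq(a, b) or leq(b, a) for a, b in product(xs, repeat=2))
    return reflexive and transitive and antisymmetric and total


print(is_total_order(range(5), lambda a, b: a <= b))          # True
print(is_total_order([1, 2, 3, 6], lambda a, b: b % a == 0))  # False
```

The second call uses divisibility, which is reflexive, transitive, and anti-symmetric on this set but not total, since neither 2 nor 3 divides the other.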

U

V

W

WAMBS

A checklist whose name stands for "When to worry and how to Avoid the Misuse of Bayesian Statistics", from https://doi.org/10.1037/met0000065.

X

Y

Z

Footnotes:

¹ Some authors consider it sufficient to specify this probability up to an unknown multiplicative factor.

² See van de Schoot et al. (2021), "Bayesian statistics and modelling", and the links within for further details. https://doi.org/10.1038/s43586-020-00001-2

³ Scricciolo (1999), "Probability matching priors: a review". https://doi.org/10.1007/BF03178943