glossary-aez-notes
Mathematics and Statistics Glossary
A
B
C
Cauchy distribution
- A continuous probability distribution, symmetric about its location parameter (zero for the standard Cauchy), which has no well-defined mean or variance. It is often used as a pathological example. The tails of the density decay as \(1 / x^{2}\).
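As a short sketch of why the mean is undefined, take the standard Cauchy density (location 0 and scale 1, chosen here only for concreteness) and integrate over the positive half-line:

```latex
f(x) = \frac{1}{\pi (1 + x^{2})}, \qquad
\int_{0}^{\infty} x f(x) \, dx
  = \frac{1}{2\pi} \Big[ \log(1 + x^{2}) \Big]_{0}^{\infty}
  = \infty .
```

By symmetry the integral over the negative half-line is \(-\infty\), so the expectation is of the form \(\infty - \infty\) and is not defined.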
Consistent estimator
- An estimator of a parameter, \(\theta\), is said to be consistent if it converges in probability to \(\theta\) as the sample size goes to infinity.
- An estimator can be unbiased but not consistent, and it can be biased yet consistent.
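The two cases can be illustrated with a small simulation. This is only a sketch, assuming standard normal data; the function names are illustrative. The first observation is unbiased for the mean but its spread never shrinks, while the divide-by-n variance estimator is biased but concentrates as the sample grows.

```python
import random
import statistics


def first_observation(sample):
    """Unbiased for the mean, but not consistent: it uses only one draw."""
    return sample[0]


def divide_by_n_variance(sample):
    """Biased for the variance (divides by n), but consistent."""
    return statistics.pvariance(sample)


def spread_of_estimates(estimator, n, replications=500, seed=1):
    """Standard deviation of the estimator over simulated N(0, 1) samples of size n."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(replications)
    ]
    return statistics.stdev(estimates)


for n in (5, 50, 500):
    print(
        n,
        round(spread_of_estimates(first_observation, n), 3),
        round(spread_of_estimates(divide_by_n_variance, n), 3),
    )
```

The second column should stay near 1 for every sample size, whereas the third should shrink towards 0.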
D
Data gravity
- A buzzword used to refer to the phenomenon that large collections of data tend to attract attention and additional data.
E
F
G
H
Hamming distance
- Wikipedia says: "In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different."
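A minimal sketch of the definition in Python; the function name is illustrative, and raising an error for unequal lengths is a design choice made here rather than part of the definition.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("karolin", "kathrin"))  # 3
```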
Hat-values
- See leverage.
horseshoe prior
- A continuous shrinkage prior: a normal distribution centred at zero whose scale parameter is itself drawn from a half-Cauchy distribution.
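The hierarchy can be sketched by sampling from it. This is a simplified illustration using only the standard library: the global scale `tau` is held fixed at 1 (an assumption; in practice it is usually given its own prior), and the function names are hypothetical.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy by folding an inverse-CDF Cauchy draw."""
    return abs(scale * math.tan(math.pi * (rng.random() - 0.5)))


def horseshoe_draw(rng, tau=1.0):
    """Local scale ~ half-Cauchy(0, 1), then coefficient ~ N(0, (local * tau)^2)."""
    local_scale = half_cauchy(rng)
    return rng.gauss(0.0, local_scale * tau)


rng = random.Random(0)
draws = [horseshoe_draw(rng) for _ in range(10_000)]
near_zero = sum(abs(d) < 0.1 for d in draws) / len(draws)
far_out = sum(abs(d) > 10 for d in draws) / len(draws)
print(f"share within 0.1 of zero: {near_zero:.3f}, share beyond 10: {far_out:.3f}")
```

Compared with a standard normal, the draws pile up near zero while still producing occasional very large values, which is the behaviour that makes the prior useful for shrinkage.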
hyperparameters
- The parameters of the prior distribution, as opposed to the parameters of the model for the data.
I
Infinite divisibility (of a distribution)
- A random variable is infinitely divisible if, for every natural number \(n\), there exist \(n\) IID random variables whose sum has the same distribution as the original variable.
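Two standard examples, written out as a sketch: the normal and Poisson distributions are both infinitely divisible, because each can be split into \(n\) IID pieces from the same family.

```latex
X_{1}, \dots, X_{n} \overset{\text{iid}}{\sim} \mathcal{N}\!\left(\frac{\mu}{n}, \frac{\sigma^{2}}{n}\right)
  \;\Longrightarrow\; \sum_{i=1}^{n} X_{i} \sim \mathcal{N}(\mu, \sigma^{2}),
\qquad
Y_{1}, \dots, Y_{n} \overset{\text{iid}}{\sim} \operatorname{Poisson}\!\left(\frac{\lambda}{n}\right)
  \;\Longrightarrow\; \sum_{i=1}^{n} Y_{i} \sim \operatorname{Poisson}(\lambda).
```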
Inverse probability
- An older term used to describe the posterior distribution in Bayesian statistics.
J
K
L
Leverage
- In regression, leverage is a measure of how unusual the explanatory values of a particular datum are. Hat-values are one measure of leverage. Leverage is important because points with high leverage have the potential to have a strong effect on estimates.
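For simple linear regression with an intercept, the hat-values have a closed form, \(h_{i} = 1/n + (x_{i} - \bar{x})^{2} / \sum_{j} (x_{j} - \bar{x})^{2}\). The sketch below uses made-up predictor values to show how a point far from the others gets a hat-value close to 1.

```python
def hat_values(x):
    """Hat-values for simple linear regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


x = [1.0, 2.0, 3.0, 4.0, 20.0]  # illustrative data; the last point is unusual
h = hat_values(x)
print([round(hi, 3) for hi in h])
print(round(sum(h), 3))  # hat-values sum to the number of coefficients (2 here)
```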
Likelihood function
- The joint probability (or probability density) of the sample, viewed as a function of the parameters with the sample treated as a fixed quantity[1].
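As a small illustration, assuming independent Bernoulli observations and a made-up sample, the likelihood below keeps the data fixed and varies only the parameter `p`. Evaluating it on a grid locates the value of `p` under which the observed sample is most probable.

```python
def bernoulli_likelihood(p, sample):
    """Joint probability of a fixed 0/1 sample, viewed as a function of p."""
    successes = sum(sample)
    failures = len(sample) - successes
    return p ** successes * (1 - p) ** failures


sample = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # illustrative data: 7 successes in 10 trials
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_likelihood(p, sample))
print(best_p)  # 0.7, the sample proportion
```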
Lindley's paradox
- It is simple to construct a hypothesis test in which the Bayesian and frequentist approaches reach different conclusions. Typically, the frequentist test rejects the null hypothesis while the Bayesian analysis favours it. When this occurs it is said to be a Lindley paradox.
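One common textbook setting can be sketched numerically. The assumptions here are not from the entry above: the data are \(n\) draws from \(N(\theta, \sigma^{2})\) with known \(\sigma\), the null is \(\theta = 0\), and the alternative places a \(N(0, \tau^{2})\) prior on \(\theta\). Holding the z-statistic fixed at 1.96, the p-value stays just under 0.05 while the Bayes factor in favour of the null grows with \(n\).

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bayes_factor_for_null(z, n, sigma=1.0, tau=1.0):
    """Bayes factor of H0: theta = 0 against H1: theta ~ N(0, tau^2),
    where z = sqrt(n) * sample_mean / sigma."""
    r = n * tau ** 2 / sigma ** 2
    return math.sqrt(1.0 + r) * math.exp(-0.5 * z ** 2 * r / (1.0 + r))


z = 1.96
for n in (10, 1_000, 100_000):
    print(n, round(two_sided_p_value(z), 4), round(bayes_factor_for_null(z, n), 2))
```

The choice of `tau` matters: a more diffuse prior under the alternative makes the Bayes factor favour the null even more strongly.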
M
N
O
Ordinal
- Categorical data with an ordering, for example the data collected on a Likert scale.
Outlier
- In regression, an outlier is a point whose response value is inconsistent with what one might expect given the rest of the data. This is not to be confused with an unusual point, which is one for which both the response and predictor values are atypical.
P
Prior distribution
- Probability distributions representing the beliefs about parameters held before seeing the data.
- Examples include reference priors and the trilogy of diffuse, weakly informative and informative priors, which spans the spectrum of informativeness.
Prior elicitation
- The process of constructing a prior distribution based upon existing knowledge, potentially by interviewing domain experts. There is a large literature attempting to find best practices for eliciting prior distributions. In some settings it is very important; in others it is not worth the large amount of effort required to consult experts.
Prior predictive checking
- When carrying out a Bayesian analysis, this is the process of checking that the implications of the prior distribution make sense. As with prior elicitation, there are many methods available for this[2].
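One simple form of the check is to simulate data from the prior and look at whether the simulated datasets are plausible. The sketch below assumes a binomial model with a Beta prior on the success probability; the model, the prior values and the function name are all illustrative choices rather than a recommended procedure.

```python
import random


def prior_predictive_successes(n_trials, alpha, beta, n_sims=1000, seed=0):
    """Simulate success counts: p ~ Beta(alpha, beta), then y ~ Binomial(n_trials, p)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sims):
        p = rng.betavariate(alpha, beta)
        counts.append(sum(rng.random() < p for _ in range(n_trials)))
    return counts


counts = prior_predictive_successes(n_trials=20, alpha=0.5, beta=0.5)
extreme = sum(c in (0, 20) for c in counts) / len(counts)
print(f"share of simulated datasets with all failures or all successes: {extreme:.2f}")
```

If such extreme datasets are implausible in the application, a large share of them suggests the prior puts too much weight near 0 and 1.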
Q
R
reference prior
- A type of non-informative prior distribution developed by Berger and Bernardo which aims to maximise the KL-divergence between the prior and the posterior when averaged over potential data generated by the prior predictive distribution[3].
S
shrinkage prior
- A prior distribution which concentrates substantial prior probability on a particular value of the parameter, typically at a value of zero; this is useful for variable selection.
- See the spike-and-slab prior and the horseshoe prior.
spike-and-slab prior
- A shrinkage prior which is a mixture of two distributions: one with a small variance about zero, the spike, and another with a large variance, the slab. For example, a mixture of a point mass at zero and a normal distribution.
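The point-mass version of the mixture can be sketched by sampling. The inclusion probability and slab standard deviation below are arbitrary illustrative values, and the function name is hypothetical.

```python
import random


def spike_and_slab_draw(rng, inclusion_prob=0.2, slab_sd=3.0):
    """Exactly zero with probability 1 - inclusion_prob, else a N(0, slab_sd^2) draw."""
    if rng.random() < inclusion_prob:
        return rng.gauss(0.0, slab_sd)
    return 0.0


rng = random.Random(0)
draws = [spike_and_slab_draw(rng) for _ in range(10_000)]
share_zero = sum(d == 0.0 for d in draws) / len(draws)
print(f"share of draws exactly at zero: {share_zero:.2f}")
```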
T
total order
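Given a set \(X\), a binary relation \(\leq\) on \(X\) is a total order iff it is
- reflexive (meaning \(a \leq a\) for all \(a \in X\)),
- transitive (meaning for all \(a, b, c \in X\), if \(a \leq b\) and \(b \leq c\) then \(a \leq c\)),
- anti-symmetric (meaning for all \(a, b \in X\), if \(a \leq b\) and \(b \leq a\) then \(a = b\)),
- total (meaning for all \(a, b \in X\), either \(a \leq b\) or \(b \leq a\)).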
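For a finite set the four properties can be checked by brute force. The sketch below is illustrative only (the function name is hypothetical) and is cubic in the size of the set because of the transitivity check.

```python
from itertools import product


def is_total_order(elements, leq):
    """Check reflexivity, transitivity, anti-symmetry and totality on a finite set."""
    elements = list(elements)
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(
        leq(a, c) or not (leq(a, b) and leq(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    antisymmetric = all(
        a == b or not (leq(a, b) and leq(b, a))
        for a, b in product(elements, repeat=2)
    )
    total = all(leq(a, b) or leq(b, a) for a, b in product(elements, repeat=2))
    return reflexive and transitive and antisymmetric and total


print(is_total_order(range(5), lambda a, b: a <= b))          # True
print(is_total_order([1, 2, 3, 6], lambda a, b: b % a == 0))  # False: 2 and 3 are incomparable
```

Divisibility on {1, 2, 3, 6} satisfies the first three properties, so it is a partial order, but it fails totality.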
U
V
W
WAMBS
- When to worry and how to Avoid the Misuse of Bayesian Statistics, from https://doi.org/10.1037/met0000065.
X
Y
Z
Footnotes:
[1] Some authors consider it sufficient to specify the likelihood up to an unknown multiplicative factor.
[2] See "Bayesian statistics and modelling" (vandeschoot2021bayesian) and the links within for further details. https://doi.org/10.1038/s43586-020-00001-2
[3] "Probability matching priors: a review" (scricciolo1999probability). https://doi.org/10.1007/BF03178943