Parametric Families

Lecture 2

Please sign in on iClicker.

Today’s Learning Goals

By the end of this lecture, we will be able to…

Calculate expectations of a linear combination of random variables.
Match a physical process to a distribution family (Binomial, Geometric, Negative Binomial, Poisson, and Bernoulli).
Calculate probabilities, mean, and variance of a distribution belonging to a distribution family.

And…

Find the probability mass function of a random variable transformation, e.g., \(X^2\).
Distinguish between a family of distributions and a distribution.
Identify whether a specification of parameters (such as mean and variance) is enough/too little/too much to specify a distribution from a family of distributions.

Outline

Properties of Distributions
Random Variable Transformations
Distribution Families
Other Common Discrete Distribution Families

1. Properties of Distributions

We must start getting familiar with the central tendency and uncertainty measures from Lecture 1.
Hence, let us practice computing them with some in-class iClicker questions.

1.1. A Single Probability Mass Function

Suppose \(X\) is a discrete random variable denoting the following: \[X = \text{Number of crabs found at a nest on a Mexican beach.}\]

Probability Mass Function (PMF)

\(X\)	\(P(X = x)\)
0	0.4
1	0.1
2	0.1
3	0.4

We plot it as a bar chart…

iClicker Question

Using the PMF for random variable \(X\), compute \(\mathbb{E}(X)\). Select the correct option:

A. 1

B. 1.5

C. 1.9

D. 6

iClicker Question

Using the PMF for random variable \(X\), compute the variance \(\text{Var}(X)\). Select the correct option:

A. 2.6

B. 1.85

C. 4.1

D. -1.85

iClicker Question

Using the PMF for random variable \(X\), obtain the mode \(\text{Mode}(X)\). Select the correct option:

A. 0

B. 3

C. Both 0 and 3

D. Neither

iClicker Question

Using the PMF for random variable \(X\), obtain the entropy \(H(X)\), computed with the natural logarithm. Select the correct option:

A. -1.19

B. 0.52

C. -0.52

D. 1.19

1.2. Comparing Multiple Probability Mass Functions

Suppose there are four different random variables related to four Mexican beaches:

\[\begin{gather*} U = \text{Number of crabs found at a nest at Acapulco} \\ V = \text{Number of crabs found at a nest at Cabo San Lucas} \\ W = \text{Number of crabs found at a nest at Cancún} \\ Y = \text{Number of crabs found at a nest at Puerto Vallarta.} \end{gather*}\]

Probability Mass Functions (PMFs)

iClicker Question

Answer TRUE or FALSE:

By only looking at the PMFs, \(U\) has higher entropy than \(V\).

A. TRUE

B. FALSE

iClicker Question

Answer TRUE or FALSE:

By only looking at the PMFs, \(U\) has higher variance than \(V\).

A. TRUE

B. FALSE

iClicker Question

Answer TRUE or FALSE:

By only looking at the PMFs, \(W\) has the highest variance amongst the four distributions.

A. TRUE

B. FALSE

iClicker Question

Answer TRUE or FALSE:

By only looking at the PMFs, \(Y\) has the highest entropy amongst the four distributions.

A. TRUE

B. FALSE

2. Random Variable Transformations

A random variable can be turned into other random variables via mathematical transformations.
This characteristic is crucial in data modelling!

2.1. Revisiting Variance

We can compute the variance of a random variable \(X\) in two forms:
1. \(\text{Var}(X) = \mathbb{E}\{[X - \mathbb{E}(X)]^2\}\), or alternatively
2. \(\text{Var}(X) = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\).

Computing the Variance Using Crabs PMF for \(X\)

Method 1

\[\begin{align*} \text{Var}(X) &= \mathbb{E}\{[X - \mathbb{E}(X)]^2\} \\ &= \mathbb{E}[(X - 1.5)^2] \qquad \qquad \text{since } \mathbb{E}(X) = 1.5 \\ &= (-1.5)^2(0.4) + (-0.5)^2(0.1) + (0.5)^2(0.1) + (1.5)^2(0.4) \\ &= 1.85. \end{align*}\]

Now with the other approach…

Method 2

\[\begin{align*} \text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2 \\ &= \mathbb{E}(X^2) - (1.5)^2 \qquad \qquad \text{since } \mathbb{E}(X) = 1.5 \\ &= (0)^2(0.4) + (1)^2(0.1) + (2)^2(0.1) + (3)^2(0.4) - (1.5)^2 \\ &= 1.85. \end{align*}\]
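Both methods can also be checked numerically. The following is a minimal Python sketch, assuming the crab PMF is stored as a dictionary; the helper name `expectation` is introduced here for illustration only and is not part of the lecture material.

```python
# Crab-count PMF from Section 1.1, stored as {value: probability}.
pmf = {0: 0.4, 1: 0.1, 2: 0.1, 3: 0.4}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete PMF given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)

# Method 1: expected squared deviation from the mean.
var_method_1 = expectation(pmf, lambda x: (x - mean) ** 2)

# Method 2: second moment minus the squared mean.
var_method_2 = expectation(pmf, lambda x: x ** 2) - mean ** 2

print(round(mean, 2), round(var_method_1, 2), round(var_method_2, 2))
```

The two results agree up to floating-point rounding, matching the hand computations above.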

2.2. Distribution Mapping

Following up on the crabs PMF, let us focus on \(\mathbb{E}(X^2)\).
More specifically, what does \(X^2\) mean?
It is what we call in Statistics a random variable transformation: a new random variable whose value is obtained by squaring the value of \(X\).
We can rename \(X^2\) as \[Z = X^2.\]

Comparing PMFs
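To compare the PMF of \(X\) with the PMF of \(Z = X^2\), we can map each value of \(X\) through the transformation and add up the probabilities of values that land on the same outcome. The sketch below assumes a dictionary representation of the PMF; the helper `transform_pmf` and the symmetric PMF `pmf_sym` are hypothetical and only serve as illustration.

```python
from collections import defaultdict

# Crab-count PMF from Section 1.1, stored as {value: probability}.
pmf_x = {0: 0.4, 1: 0.1, 2: 0.1, 3: 0.4}

def transform_pmf(pmf, g):
    """PMF of g(X): probabilities of x values sharing the same g(x) are summed."""
    out = defaultdict(float)
    for x, p in pmf.items():
        out[g(x)] += p
    return dict(out)

# Z = X^2 takes the values 0, 1, 4, 9 with the same probabilities as X.
pmf_z = transform_pmf(pmf_x, lambda x: x ** 2)
print(pmf_z)

# Made-up PMF with negative values: -1 and 1 both map to 1, so they merge.
pmf_sym = {-1: 0.25, 0: 0.5, 1: 0.25}
pmf_sym_squared = transform_pmf(pmf_sym, lambda x: x ** 2)
print(pmf_sym_squared)
```

For the crab counts the transformation is one-to-one on the support, so only the values change. In the second, made-up example the probabilities of \(-1\) and \(1\) are combined.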

2.3. Expected Value Properties

Expected values have certain useful properties.
If \(a\) and \(b\) are constants, with \(X\) and \(Y\) as random variables, then we can obtain the expected value of the following expressions as:

\[\begin{gather*} \mathbb{E}(a X) = a \mathbb{E}(X) \\ \mathbb{E}(X + Y) = \mathbb{E}(X) + \mathbb{E}(Y) \\ \mathbb{E}(aX + bY) = a\mathbb{E}(X) + b\mathbb{E}(Y). \end{gather*}\]

Caution

The operator \(\mathbb{E}(\cdot)\) is linear, but it does not pass through products or other nonlinear functions.
For instance, if no further assumptions are made for random variables \(X\) and \(Y\), then in general \[\mathbb{E}(XY) \neq \mathbb{E}(X)\mathbb{E}(Y);\] equality holds when \(X\) and \(Y\) are independent.
Moreover, in general \[\mathbb{E}(X^2) \neq [\mathbb{E}(X)]^2,\] with equality only when \(\text{Var}(X) = 0\).
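The linearity properties and the caution can be illustrated with a small numerical check. The sketch below uses a made-up joint PMF for a dependent pair \((X, Y)\); the probabilities, the constants `a` and `b`, and the helper `expect` are all assumptions chosen for illustration.

```python
import math

# Hypothetical joint PMF for a dependent pair (X, Y): {(x, y): probability}.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def expect(joint, g):
    """E[g(X, Y)] computed by summing over the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

a, b = 2.0, -3.0  # arbitrary constants chosen for illustration
e_x = expect(joint, lambda x, y: x)
e_y = expect(joint, lambda x, y: y)

# Linearity holds even though X and Y are dependent here.
lhs = expect(joint, lambda x, y: a * x + b * y)
rhs = a * e_x + b * e_y
print(math.isclose(lhs, rhs))

# Products behave differently: E(XY) and E(X)E(Y) disagree for this joint PMF.
e_xy = expect(joint, lambda x, y: x * y)
print(e_xy, e_x * e_y)
```

Note that linearity needs no independence assumption, whereas the product rule fails here precisely because \(X\) and \(Y\) are dependent.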

2.4. Variance Properties

If \(a\) and \(b\) are constants, with \(X\) and \(Y\) as independent random variables, then we can obtain the variance of the following expressions as:

\[\begin{gather*} \text{Var}(a X) = a^2 \text{Var}(X) \\ \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) \\ \text{Var}(aX + bY) = a^2 \text{Var}(X) + b^2 \text{Var}(Y). \end{gather*}\]
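As a sanity check of the last identity, we can build the exact PMF of \(aX + bY\) for two independent variables and compare its variance with the formula. This is a sketch under stated assumptions: `pmf_y`, the constants `a` and `b`, and the helper names are made up for illustration, and independence is imposed by multiplying the marginal probabilities.

```python
import math
from itertools import product

pmf_x = {0: 0.4, 1: 0.1, 2: 0.1, 3: 0.4}  # crab PMF from Section 1.1
pmf_y = {0: 0.5, 1: 0.5}                  # hypothetical second variable

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

def linear_combination_pmf(pmf_x, pmf_y, a, b):
    """PMF of aX + bY when X and Y are independent (joint = product of marginals)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        value = a * x + b * y
        out[value] = out.get(value, 0.0) + px * py
    return out

a, b = 2, 3  # arbitrary constants chosen for illustration
direct = variance(linear_combination_pmf(pmf_x, pmf_y, a, b))
by_rule = a ** 2 * variance(pmf_x) + b ** 2 * variance(pmf_y)
print(math.isclose(direct, by_rule))
```

If the joint probabilities were not the product of the marginals, the two quantities would generally differ, which is why the independence assumption matters for the sum rules.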

3. Distribution Families

A huge component of Data Science is to model data as random variables with uncertain outcomes.
For instance:
- The number of ships that arrive at the port of Vancouver on a given day (i.e., a discrete and count random variable).
- A rock class (i.e., a discrete and categorical random variable).

3.1. Bernoulli

Suppose you play a game and win with probability \(p\), where \(0 \leq p \leq 1\).
Let \(X\) be the outcome of this game. It is a binary random variable defined as \[ X = \begin{cases} 1 \; \; \; \; \text{if you win the game (success)},\\ 0 \; \; \; \; \text{otherwise}. \end{cases} \]
The value \(1\) has a probability of \(p\), whereas the value \(0\) has a probability of \(1 - p\).

PMF

A Bernoulli distribution is denoted as: \[X \sim \text{Bernoulli}(p).\]
Its PMF is \[P(X = x \mid p) = p^x (1 - p)^{1 - x} \quad \text{for} \quad x = 0, 1.\]

Mean

\[\begin{align*} \mathbb{E}(X) &= \sum_{x = 0}^1 x \cdot P(X = x \mid p) \\ &= \sum_{x = 0}^1 x \cdot p^x (1 - p)^{1 - x} \\ &= p. \end{align*}\]

Variance

\[\begin{align*} \text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2 \\ &= \mathbb{E}(X^2) - p^2 \qquad \qquad \text{since } \mathbb{E}(X) = p \\ &= \sum_{x = 0}^1 x^2 \cdot P(X = x \mid p) - p^2 \\ &= p(1 - p). \end{align*}\]
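Writing out the two terms of each sum makes the final steps of the mean and variance derivations explicit. The expansion below is a worked sketch that uses only the Bernoulli PMF given above:

\[\begin{align*} \mathbb{E}(X) &= 0 \cdot (1 - p) + 1 \cdot p = p, \\ \mathbb{E}(X^2) &= 0^2 \cdot (1 - p) + 1^2 \cdot p = p, \\ \text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2 = p - p^2 = p(1 - p). \end{align*}\]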

3.2. Binomial

Suppose you play a game and win with probability \(p\), where \(0 \leq p \leq 1\).
Let \(X\) be the number of games you win out of \(n\) independent games.
\(X\) is said to have a Binomial distribution, written as \[X \sim \text{Binomial} (n, p).\]

PMF

A Binomial distribution is characterized by the PMF \[P \left( X = x \mid n, p \right) = {n \choose x} p^x (1 - p)^{n - x} \quad \text{for} \quad x = 0, 1, \dots, n.\]
The binomial coefficient in the above PMF, \[{n \choose x} = \frac{n!}{x!(n - x)!},\] counts the number of ways to choose which \(x\) of the \(n\) games are wins.

Example

Let us derive the probability of winning exactly two games out of five.
That is, we compute \(P(X = 2)\) when \(n = 5\) and \(p = 0.25\): \[\begin{align*} P(X = 2 \mid n = 5, p = 0.25) &= {5 \choose 2} (0.25)^2 (1 - 0.25)^{5 - 2} \\ &= \frac{5!}{2!(5 - 2)!} (0.25)^2 (1 - 0.25)^{5 - 2} \\ &\approx 0.26. \end{align*}\]
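The same probability can be computed in a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is introduced here for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x | n, p) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0  # values outside the support have probability zero
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Probability of winning exactly two games out of five when p = 0.25.
prob = binomial_pmf(2, 5, 0.25)
print(round(prob, 2))  # 0.26

# The PMF sums to one over the support 0, 1, ..., n.
total = sum(binomial_pmf(x, 5, 0.25) for x in range(6))
print(total)
```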

Mean and Variance

\[\mathbb{E}(X) = n p\]

\[\text{Var}(X) = n p (1 - p).\]
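One way to see where these formulas come from is to write \(X\) as a sum of independent Bernoulli outcomes, one per game, and apply the expected value and variance properties for independent random variables. The symbol \(X_i\) (the outcome of game \(i\)) is introduced only for this sketch:

\[\begin{align*} X &= \sum_{i = 1}^n X_i, \qquad X_i \sim \text{Bernoulli}(p) \text{ independently}, \\ \mathbb{E}(X) &= \sum_{i = 1}^n \mathbb{E}(X_i) = n p, \\ \text{Var}(X) &= \sum_{i = 1}^n \text{Var}(X_i) = n p (1 - p). \end{align*}\]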

3.3. Families Versus Distributions

Specifying a value for both \(p\) and \(n\) results in a unique Binomial distribution.

There are, in fact, infinitely many Binomial distributions, one for each valid pair \((n, p)\); taken together, they form the Binomial family.

3.4. Parameters

Since \(p\) and \(n\) fully specify a Binomial distribution, we call them parameters of the Binomial family.
We call the Binomial family a parametric family of distributions.

3.5. Parameterization

The choice of variables used to identify a distribution within a family is called the family’s parameterization.
The most convenient parameterization depends on the information you can most easily obtain.
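As an illustration, a Binomial distribution could be identified by its mean and variance instead of \(n\) and \(p\). Since \(\mathbb{E}(X) = np\) and \(\text{Var}(X) = np(1 - p)\), we have \(p = 1 - \text{Var}(X)/\mathbb{E}(X)\) and \(n = \mathbb{E}(X)/p\). The sketch below is a hypothetical helper (the name `binomial_from_mean_variance` and the tolerance `tol` are assumptions), and it rejects mean and variance pairs that no Binomial distribution can produce.

```python
def binomial_from_mean_variance(mean, variance, tol=1e-9):
    """Recover (n, p) from the mean and variance of a Binomial distribution.

    Uses mean = n p and variance = n p (1 - p), so p = 1 - variance / mean
    and n = mean / p. Raises ValueError when no valid Binomial matches.
    """
    if mean <= 0 or variance < 0 or variance >= mean:
        raise ValueError("no Binomial distribution has this mean and variance")
    p = 1 - variance / mean
    n = mean / p
    n_int = round(n)
    if abs(n - n_int) > tol:
        raise ValueError("the implied number of games n is not an integer")
    return n_int, p

# Mean and variance of the Binomial(5, 0.25) example: 1.25 and 0.9375.
n, p = binomial_from_mean_variance(1.25, 0.9375)
print(n, p)
```

The validity checks show that not every mean and variance pair corresponds to a member of the family: the variance must be smaller than the mean, and the implied \(n\) must be a positive integer.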

4. Other Common Discrete Distribution Families

Aside from the Binomial family of distributions, many other families come up in data modelling.
In practice, real data rarely follow a distribution family exactly, but distribution families still act as useful approximations.

4.1. Geometric

Suppose you play a game repeatedly and win each time with probability \(p\), where \(0 < p \leq 1\).
Let \(X\) be the number of failures before the first success in a sequence of independent games.
\(X\) is said to have a Geometric distribution, written as \[X \sim \text{Geometric} (p).\]

PMF

A Geometric distribution is characterized by the PMF \[P(X = x \mid p) = p (1 - p)^x \quad \text{for} \quad x = 0, 1, \dots\]
Since there is only one parameter, knowing the mean also tells you the variance!
The support is infinite: \(X\) can take any non-negative integer value.

Mean and Variance

\[\mathbb{E}(X) = \frac{1 - p}{p}\]

\[\text{Var}(X) = \frac{1 - p}{p^2}.\]
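These formulas can be checked numerically by summing over the support. Because the support is infinite, the sketch below truncates the sums at an arbitrary cutoff (an assumption that works here because the tail probabilities are negligible); the value \(p = 0.25\) and the function name `geometric_pmf` are chosen for illustration.

```python
import math

def geometric_pmf(x, p):
    """P(X = x | p): x failures before the first success."""
    return p * (1 - p) ** x

p = 0.25
support = range(2000)  # truncation point; the true support is infinite

mean = sum(x * geometric_pmf(x, p) for x in support)
second_moment = sum(x ** 2 * geometric_pmf(x, p) for x in support)
variance = second_moment - mean ** 2

# Compare the truncated sums with the closed-form expressions.
print(math.isclose(mean, (1 - p) / p))
print(math.isclose(variance, (1 - p) / p ** 2))
```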

4.2. Negative Binomial (a.k.a. Pascal)

Suppose you play a game repeatedly and win each time with probability \(p\), where \(0 < p \leq 1\).
Let \(X\) be the number of losses before the \(k\)th win in a sequence of independent games.
\(X\) is said to have a Negative Binomial distribution, written as \[X \sim \text{Negative Binomial} (k, p).\]

PMF

A Negative Binomial distribution is characterized by the PMF \[P(X = x \mid k, p) = {k - 1 + x \choose x} p^k (1 - p)^x \quad \text{for} \quad x = 0, 1, \dots\]
It has two parameters: \(k\) and \(p\).
Setting \(k = 1\) recovers the Geometric family.

Mean and Variance

\[\mathbb{E}(X) = \frac{k(1 - p)}{p}\]

\[\text{Var}(X) = \frac{k(1 - p)}{p^2}.\]
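The PMF, its Geometric special case, and the moment formulas can be verified with a short script. This is a sketch under stated assumptions: the values \(k = 3\) and \(p = 0.25\), the truncation point of the infinite support, and the function name `negative_binomial_pmf` are chosen for illustration.

```python
import math
from math import comb

def negative_binomial_pmf(x, k, p):
    """P(X = x | k, p): x losses before the k-th win."""
    return comb(k - 1 + x, x) * p ** k * (1 - p) ** x

k, p = 3, 0.25
support = range(2000)  # truncation point; the true support is infinite

mean = sum(x * negative_binomial_pmf(x, k, p) for x in support)
second_moment = sum(x ** 2 * negative_binomial_pmf(x, k, p) for x in support)
variance = second_moment - mean ** 2

print(math.isclose(mean, k * (1 - p) / p))
print(math.isclose(variance, k * (1 - p) / p ** 2))

# With k = 1 the PMF coincides with the Geometric PMF p (1 - p)^x.
matches_geometric = all(
    math.isclose(negative_binomial_pmf(x, 1, p), p * (1 - p) ** x)
    for x in range(50)
)
print(matches_geometric)
```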

4.3. Poisson

Suppose customers independently arrive at a store at some constant average rate, and let \(\lambda > 0\) be the average number of arrivals in a pre-specified length of time.
Then, the total number \(X\) of customers arriving within that length of time follows a Poisson distribution: \[X \sim \text{Poisson} (\lambda).\]

PMF

We can find other examples that are indicative of a Poisson process:
- The number of ships arriving at Vancouver port on a given day.
- The number of emails you receive on a given day.
A Poisson distribution is characterized by the PMF \[P(X = x \mid \lambda) = \frac{\lambda^x \exp(-\lambda)}{x!} \quad \text{for} \quad x = 0, 1, \dots\]

Mean and Variance

\[\mathbb{E}(X) = \lambda\]

\[\text{Var}(X) = \lambda.\]

A notable property of this family is that the mean is equal to the variance!
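The equality of mean and variance can be confirmed numerically from the PMF. The sketch below assumes \(\lambda = 3\) and truncates the infinite support at a point where the remaining probability is negligible; both choices, and the function name `poisson_pmf`, are for illustration only.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x | lambda) for X ~ Poisson(lambda)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 3.0
support = range(100)  # truncation point; the true support is infinite

mean = sum(x * poisson_pmf(x, lam) for x in support)
second_moment = sum(x ** 2 * poisson_pmf(x, lam) for x in support)
variance = second_moment - mean ** 2

# Both quantities should be (numerically) equal to lambda.
print(math.isclose(mean, lam), math.isclose(variance, lam))
```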

4.5. Finally, let us check this mindmap…