Joint Probability

Lecture 3

Please, sign in on iClicker

Today’s Learning Goals

By the end of this lecture, we will be able to…

Calculate marginal distributions from a joint distribution of random variables.
Describe the probabilistic consequences of working with independent random variables.

And…

Calculate and describe covariance in multivariate cases (i.e., with more than one random variable).
Calculate and describe two mainstream correlation metrics: Pearson’s correlation and Kendall’s \(\tau_K\).

Outline

Joint Distributions
Independence and Dependence Concepts

1. Joint Distributions

So far, we have only considered one random variable at a time, which has a univariate distribution.
However, we very often work with more than one random variable at once.

Coins come again!

Consider two independent fair coins (i.e., two independent Bernoulli random variables!).
The sample space is: \(\texttt{HH}\), \(\texttt{HT}\), \(\texttt{TH}\), \(\texttt{TT}\), each with a probability \(0.25\).
The joint distribution of this process is the following:

\(X/Y\)	\(\texttt{H}\)	\(\texttt{T}\)
\(\texttt{H}\)	0.25	0.25
\(\texttt{T}\)	0.25	0.25

Random Variable Setup

Let us set the following binary random variables (since each one could only have two outcomes, \(\texttt{H}\) or \(\texttt{T}\)):

\[\begin{gather*} X = \text{First coin's outcome.} \\ Y = \text{Second coin's outcome.} \end{gather*}\]

Computing Probabilities

Each cell of the previous joint distribution is computed as:

\[\begin{align*} P(X = \texttt{H} \cap Y = \texttt{H}) &= P(X = \texttt{H}) \cdot P(Y = \texttt{H}) \qquad \text{independence} \\ &= 0.5 \cdot 0.5 \qquad \text{fair coins} \\ &= 0.25. \end{align*}\]
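
As a small sketch of the calculation above, the whole table can be filled in one step by multiplying the two marginal PMFs. The object names `p_x`, `p_y`, and `joint_coins` are illustrative and not part of the lecture code.

```r
# Marginal PMFs of two fair coins (names are illustrative).
p_x <- c(H = 0.5, T = 0.5) # first coin
p_y <- c(H = 0.5, T = 0.5) # second coin

# Under independence, each cell is the product of the two marginals.
joint_coins <- outer(p_x, p_y)
joint_coins
```

Rows correspond to the first coin and columns to the second one, matching the layout of the table.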

Can we have a univariate setup?

Alternatively, we can define the following random variable:

\[Z = \text{Outcomes obtained when tossing two independent coins.}\]

Outcome	Probability
\(\texttt{HH}\)	0.25
\(\texttt{HT}\)	0.25
\(\texttt{TH}\)	0.25
\(\texttt{TT}\)	0.25

1.1. Example: Length of Stay Versus Gang Demand

Consider a problem that a port in Vancouver faces: gang demand.
When a ship arrives, it requests a certain number of gangs to unload its cargo.
We will work with two random variables: the length of stay of a ship in days (\(\text{LOS}\)) and its gang demand (\(\text{Gangs}\)).

Probability Mass Function (PMF) for Gangs

Gangs	Probability
1	0.2
2	0.4
3	0.3
4	0.1

PMF for Length of Stay

LOS	Probability
1	0.25
2	0.35
3	0.20
4	0.10
5	0.10

Now, we might wonder…

What is the probability that a ship requires 4 gangs AND will stay in port for 5 days?

The information provided by both separate PMFs (\(\text{Gangs}\) and \(\text{LOS}\)) is not sufficient to answer this question.
We would need to use a joint distribution between \(\text{LOS}\) and \(\text{Gangs}\).

Before that, let us define the following:

In a random system/process with more than one random variable, the distribution of a standalone variable is called a marginal distribution.

Now, going back to the joint distribution…

We need a probability for every possible combination of the number of \(\text{Gangs}\) and \(\text{LOS}\).
In this case, \(5 \times 4 = 20\) probabilities (that again add up to 1).

	Gangs = 1	Gangs = 2	Gangs = 3	Gangs = 4
LOS = 1	0.00170	0.04253	0.12471	0.08106
LOS = 2	0.02664	0.16981	0.13598	0.01757
LOS = 3	0.05109	0.11563	0.03203	0.00125
LOS = 4	0.04653	0.04744	0.00593	0.00010
LOS = 5	0.07404	0.02459	0.00135	0.00002
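
The earlier question about 4 gangs and 5 days can now be answered by reading a single cell. The sketch below stores the table as a matrix; how `joint_distribution` was actually built for these slides is not shown, so this construction is an assumption.

```r
# Joint PMF as a matrix: rows are LOS (days), columns are Gangs.
# Assumption: this construction is ours; only the printed table is given.
joint_distribution <- matrix(
  c(
    0.00170, 0.04253, 0.12471, 0.08106,
    0.02664, 0.16981, 0.13598, 0.01757,
    0.05109, 0.11563, 0.03203, 0.00125,
    0.04653, 0.04744, 0.00593, 0.00010,
    0.07404, 0.02459, 0.00135, 0.00002
  ),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste("LOS =", 1:5), paste("Gangs =", 1:4))
)

# P(LOS = 5 and Gangs = 4) is a single cell of the joint distribution.
p_los5_gangs4 <- joint_distribution["LOS = 5", "Gangs = 4"]
p_los5_gangs4

# The 20 probabilities must add up to 1 (up to floating-point error).
total <- sum(joint_distribution)
all.equal(total, 1)
```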

Now, we might wonder…

Could the 20 numbers in the joint distribution be absolutely ANY probabilities between 0 and 1?

No, we have the following restrictions:
- They are restricted by the fact that they will need to add up to 1 (recall the Law of Total Probability!).
- We need the joint distribution to be consistent with the marginal distributions of \(\text{LOS}\) and \(\text{Gangs}\) shown earlier.

1.2. Calculating Marginal Distributions from the Joint Distribution

In the case of discrete random variables, we add up the probabilities of the corresponding standalone outcomes.

	Gangs = 1	Gangs = 2	Gangs = 3	Gangs = 4
LOS = 1	0.00170	0.04253	0.12471	0.08106
LOS = 2	0.02664	0.16981	0.13598	0.01757
LOS = 3	0.05109	0.11563	0.03203	0.00125
LOS = 4	0.04653	0.04744	0.00593	0.00010
LOS = 5	0.07404	0.02459	0.00135	0.00002

Let us start with the marginal distribution of \(\text{LOS}\)…

We can compute \(P(\text{LOS} = 1)\).

There are four ways this could happen:
- \(\text{LOS} = 1\) and \(\text{Gangs} = 1\).
- \(\text{LOS} = 1\) and \(\text{Gangs} = 2\).
- \(\text{LOS} = 1\) and \(\text{Gangs} = 3\).
- \(\text{LOS} = 1\) and \(\text{Gangs} = 4\).

So, to find the marginal probability \(P(\text{LOS} = 1)\)…

\[\begin{align*} P(\text{LOS} = 1) &= P(\text{LOS} = 1 \cap \text{Gangs} = 1) + \\ & \quad \quad P(\text{LOS} = 1 \cap \text{Gangs} = 2) + \\ & \quad \quad \quad P(\text{LOS} = 1 \cap \text{Gangs} = 3) + \\ & \quad \quad \quad \quad P(\text{LOS} = 1 \cap \text{Gangs} = 4) \\ &= 0.00170 + 0.04253 + 0.12471 + 0.08106 \\ &= 0.25. \end{align*}\]

We have \(P(\text{LOS} = 1)\)…

But we would also need \(P(\text{LOS} = 2)\), \(P(\text{LOS} = 3)\), etc.
Thus, we add up each row from our joint distribution.

rowSums(joint_distribution) |>
  kable(col.names = "Probability", align = "c") |>
  kable_styling(font_size = 30) |>
  column_spec(1, bold = TRUE)

	Probability
LOS = 1	0.25
LOS = 2	0.35
LOS = 3	0.20
LOS = 4	0.10
LOS = 5	0.10

Now for \(\text{Gangs}\)!

Likewise, we add up each column from our joint distribution.

colSums(joint_distribution) |>
  kable(col.names = "Probability", align = "c") |>
  kable_styling(font_size = 30) |>
  column_spec(1, bold = TRUE)

	Probability
Gangs = 1	0.2
Gangs = 2	0.4
Gangs = 3	0.3
Gangs = 4	0.1

Note that both marginals computed from the joint distribution are consistent with our initial marginals.
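
This consistency can also be checked programmatically. The sketch below is self-contained, so it rebuilds the joint table as a plain matrix (an assumed construction) and compares its row and column sums against the two marginal PMFs with `all.equal()`, which tolerates floating-point error.

```r
# Joint PMF: rows are LOS = 1..5, columns are Gangs = 1..4.
joint_distribution <- matrix(
  c(
    0.00170, 0.04253, 0.12471, 0.08106,
    0.02664, 0.16981, 0.13598, 0.01757,
    0.05109, 0.11563, 0.03203, 0.00125,
    0.04653, 0.04744, 0.00593, 0.00010,
    0.07404, 0.02459, 0.00135, 0.00002
  ),
  nrow = 5, byrow = TRUE
)

# Initial marginal PMFs from the example.
marginal_los <- c(0.25, 0.35, 0.20, 0.10, 0.10)
marginal_gangs <- c(0.2, 0.4, 0.3, 0.1)

# Row sums should recover the LOS marginal; column sums the Gangs marginal.
los_ok <- isTRUE(all.equal(rowSums(joint_distribution), marginal_los))
gangs_ok <- isTRUE(all.equal(colSums(joint_distribution), marginal_gangs))
c(LOS = los_ok, Gangs = gangs_ok)
```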

iClicker Question

Answer TRUE or FALSE:

We obtain a marginal distribution by summing the rows of a joint distribution; therefore, each row of a joint distribution must sum to 1.

A. TRUE

B. FALSE

2. Independence and Dependence Concepts

A big part of Data Science is about harvesting the relationship between the variables in our datasets.

2.1. Independence

Let \(X\) and \(Y\) be two random variables.
\(X\) and \(Y\) are independent if knowing something about one of them tells us nothing about the other: \[P(X = x \cap Y = y) = P(X = x) \cdot P(Y = y).\]
We would only need the marginals to obtain their joint distribution.

Going back to the two coins!

Recall we had this joint distribution:

\[\begin{gather*} X = \text{First coin's outcome.} \\ Y = \text{Second coin's outcome.} \end{gather*}\]

\(X/Y\)	\(\texttt{H}\)	\(\texttt{T}\)
\(\texttt{H}\)	0.25	0.25
\(\texttt{T}\)	0.25	0.25

Obtaining the Marginals from the Joint

We can see that the two coin flips are independent:

\[\begin{align*} P(X = \texttt{H}) &= P(X = \texttt{H} \cap Y = \texttt{H}) + P(X = \texttt{H} \cap Y = \texttt{T}) \\ &= 0.25 + 0.25 = 0.5 \\ P(X = \texttt{T}) &= P(X = \texttt{T} \cap Y = \texttt{H}) + P(X = \texttt{T} \cap Y = \texttt{T}) \\ &= 0.25 + 0.25 = 0.5 \\ P(Y = \texttt{H}) &= P(X = \texttt{H} \cap Y = \texttt{H}) + P(X = \texttt{T} \cap Y = \texttt{H}) \\ &= 0.25 + 0.25 = 0.5 \\ P(Y = \texttt{T}) &= P(X = \texttt{H} \cap Y = \texttt{T}) + P(X = \texttt{T} \cap Y = \texttt{T}) \\ &= 0.25 + 0.25 = 0.5. \end{align*}\]

Applying the Independence Property via the Marginals

\[\begin{align*} P(X = \texttt{H} \cap Y = \texttt{H}) &= P(X = \texttt{H}) \cdot P(Y = \texttt{H}) \\ &= 0.5 \cdot 0.5 = 0.25 \\ P(X = \texttt{H} \cap Y = \texttt{T}) &= P(X = \texttt{H}) \cdot P(Y = \texttt{T}) \\ &= 0.5 \cdot 0.5 = 0.25 \\ P(X = \texttt{T} \cap Y = \texttt{H}) &= P(X = \texttt{T}) \cdot P(Y = \texttt{H}) \\ &= 0.5 \cdot 0.5 = 0.25 \\ P(X = \texttt{T} \cap Y = \texttt{T}) &= P(X = \texttt{T}) \cdot P(Y = \texttt{T}) \\ &= 0.5 \cdot 0.5 = 0.25. \end{align*}\]

Let us check another two-coin case…

\[\begin{gather*} X = \text{First coin's outcome} \\ Y = \text{Second coin's outcome.} \end{gather*}\]

\(X/Y\)	\(\texttt{H}\)	\(\texttt{T}\)
\(\texttt{H}\)	0.2	0.6
\(\texttt{T}\)	0.05	0.15

Computing the Marginals

\[\begin{align*} P(X = \texttt{H}) &= P(X = \texttt{H} \cap Y = \texttt{H}) + P(X = \texttt{H} \cap Y = \texttt{T}) \\ &= 0.2 + 0.6 \\ &= 0.8. \end{align*}\]

By the Law of Total Probability, we can obtain: \[\begin{align*} P(X = \texttt{T}) &= 1 - P(X = \texttt{H})\\ &= 1 - 0.8 \\ &= 0.2. \end{align*}\]

And likewise for the second coin…

\[\begin{align*} P(Y = \texttt{H}) &= P(X = \texttt{H} \cap Y = \texttt{H}) + P(X = \texttt{T} \cap Y = \texttt{H}) \\ &= 0.2 + 0.05 \\ &= 0.25. \end{align*}\]

By the Law of Total Probability, we can obtain: \[\begin{align*} P(Y = \texttt{T}) &= 1 - P(Y = \texttt{H})\\ &= 1 - 0.25 \\ &= 0.75. \end{align*}\]

Applying the Independence Property via the Marginals

These two coins are also independent!

\[\begin{align*} P(X = \texttt{H} \cap Y = \texttt{H}) &= P(X = \texttt{H}) \cdot P(Y = \texttt{H}) \\ &= 0.8 \cdot 0.25 = 0.2 \\ P(X = \texttt{H} \cap Y = \texttt{T}) &= P(X = \texttt{H}) \cdot P(Y = \texttt{T}) \\ &= 0.8 \cdot 0.75 = 0.6 \\ P(X = \texttt{T} \cap Y = \texttt{H}) &= P(X = \texttt{T}) \cdot P(Y = \texttt{H}) \\ &= 0.2 \cdot 0.25 = 0.05 \\ P(X = \texttt{T} \cap Y = \texttt{T}) &= P(X = \texttt{T}) \cdot P(Y = \texttt{T}) \\ &= 0.2 \cdot 0.75 = 0.15. \end{align*}\]
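
The same check can be automated for any joint table. In the sketch below, `is_independent()` is a hypothetical helper (not part of the lecture code) that rebuilds the table from the product of its marginals and compares it with the original.

```r
# Hypothetical helper: TRUE when every cell equals the product of its marginals.
is_independent <- function(joint) {
  product_of_marginals <- outer(rowSums(joint), colSums(joint))
  isTRUE(all.equal(joint, product_of_marginals, check.attributes = FALSE))
}

# Joint distribution of the two biased coins (rows: X, columns: Y).
biased_coins <- matrix(
  c(
    0.20, 0.60,
    0.05, 0.15
  ),
  nrow = 2, byrow = TRUE,
  dimnames = list(X = c("H", "T"), Y = c("H", "T"))
)

is_independent(biased_coins)
```

Comparing with `all.equal()` rather than `==` avoids false negatives caused by floating-point rounding.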

But there is no independence in this other two-coin case!

\[\begin{gather*} X = \text{First coin's outcome} \\ Y = \text{Second coin's outcome.} \end{gather*}\]

\(X/Y\)	\(\texttt{H}\)	\(\texttt{T}\)
\(\texttt{H}\)	0.5	0
\(\texttt{T}\)	0	0.5
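
To see why independence fails here, it is enough to compare one cell with the product of its marginals:

\[\begin{align*} P(X = \texttt{H}) &= 0.5 + 0 = 0.5 \\ P(Y = \texttt{H}) &= 0.5 + 0 = 0.5 \\ P(X = \texttt{H}) \cdot P(Y = \texttt{H}) &= 0.25 \neq 0.5 = P(X = \texttt{H} \cap Y = \texttt{H}). \end{align*}\]

In this case, knowing the outcome of the first coin tells us the outcome of the second coin with certainty.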

2.2. Measures of Dependence

Let us ask ourselves the following:

What if two random variables are not independent?

Is there some measure of dependence?

2.2.1. Covariance and Pearson’s Correlation

Covariance is one common way of measuring dependence between two numeric random variables.
It captures both the amount and the direction of the linear dependence:

\[\begin{align*} \operatorname{Cov}(X, Y) &= \mathbb{E}[(X-\mu_X)(Y-\mu_Y)] \\ &= \mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y). \end{align*}\]
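
The second equality follows from expanding the product and using the linearity of the expected value, with \(\mu_X = \mathbb{E}(X)\) and \(\mu_Y = \mathbb{E}(Y)\):

\[\begin{align*} \mathbb{E}[(X-\mu_X)(Y-\mu_Y)] &= \mathbb{E}(XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y) \\ &= \mathbb{E}(XY) - \mu_Y \mathbb{E}(X) - \mu_X \mathbb{E}(Y) + \mu_X \mu_Y \\ &= \mathbb{E}(XY) - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y \\ &= \mathbb{E}(XY) - \mathbb{E}(X)\mathbb{E}(Y). \end{align*}\]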

Going back to our cargo ship example!

joint_distribution |>
  kable(align = "cccc") |>
  kable_styling(font_size = 30) |>
  column_spec(1, bold = TRUE)

	Gangs = 1	Gangs = 2	Gangs = 3	Gangs = 4
LOS = 1	0.00170	0.04253	0.12471	0.08106
LOS = 2	0.02664	0.16981	0.13598	0.01757
LOS = 3	0.05109	0.11563	0.03203	0.00125
LOS = 4	0.04653	0.04744	0.00593	0.00010
LOS = 5	0.07404	0.02459	0.00135	0.00002

For a larger \(\text{LOS}\), there are larger probabilities associated with a smaller gang demand.

Coding Up the Marginal PMFs

Marginal_PMF_LOS <- tribble(
  ~n_days, ~p,
  1, 0.25,
  2, 0.35,
  3, 0.2,
  4, 0.1,
  5, 0.1
)
Marginal_PMF_LOS

Marginal_PMF_Gangs <- tribble(
  ~n_gangs, ~p,
  1, 0.2,
  2, 0.4,
  3, 0.3,
  4, 0.1,
)
Marginal_PMF_Gangs

# A tibble: 5 × 2
  n_days     p
   <dbl> <dbl>
1      1  0.25
2      2  0.35
3      3  0.2 
4      4  0.1 
5      5  0.1

# A tibble: 4 × 2
  n_gangs     p
    <dbl> <dbl>
1       1   0.2
2       2   0.4
3       3   0.3
4       4   0.1

Computing the Marginal Expected Values

E_LOS <- sum(Marginal_PMF_LOS$n_days * Marginal_PMF_LOS$p)
E_LOS

E_Gangs <- sum(Marginal_PMF_Gangs$n_gangs * Marginal_PMF_Gangs$p)
E_Gangs

[1] 2.45

[1] 2.3

Hence:

\[\mathbb{E}(\text{LOS}) = 2.45\] \[\mathbb{E}(\text{Gangs}) = 2.3.\]

Melting `joint_distribution` (manually!)

joint_distribution <- data.frame(
  LOS = c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4), rep(5, 4)),
  Gangs = rep(1:4, 5),
  p = c(
    0.00170, 0.04253, 0.12471, 0.08106,
    0.02664, 0.16981, 0.13598, 0.01757,
    0.05109, 0.11563, 0.03203, 0.00125,
    0.04653, 0.04744, 0.00593, 0.00010,
    0.07404, 0.02459, 0.00135, 0.00002
  )
)
joint_distribution

   LOS Gangs       p
1    1     1 0.00170
2    1     2 0.04253
3    1     3 0.12471
4    1     4 0.08106
5    2     1 0.02664
6    2     2 0.16981
7    2     3 0.13598
8    2     4 0.01757
9    3     1 0.05109
10   3     2 0.11563
11   3     3 0.03203
12   3     4 0.00125
13   4     1 0.04653
14   4     2 0.04744
15   4     3 0.00593
16   4     4 0.00010
17   5     1 0.07404
18   5     2 0.02459
19   5     3 0.00135
20   5     4 0.00002

Computing the Crossed Expected Value

E_LOS_Gangs <- sum(joint_distribution$LOS *
  joint_distribution$Gangs *
  joint_distribution$p)
E_LOS_Gangs

[1] 4.89956

Thus:

\[\mathbb{E}(\text{LOS} \cdot \text{Gangs}) = 4.89956.\]

Computing the Covariance

\[\begin{align*} \operatorname{Cov}(\text{LOS}, \text{Gangs}) &= \mathbb{E}(\text{LOS} \cdot \text{Gangs}) - \mathbb{E}(\text{LOS})\mathbb{E}(\text{Gangs}) \\ &= 4.89956 - \left[ (2.45)(2.3) \right] \\ &= -0.73544. \end{align*}\]

Indeed, we can see that the covariance between \(\text{LOS}\) and \(\text{Gangs}\) is negative.
A negative sign indicates that an increase in \(\text{LOS}\) is associated with a decrease in \(\text{Gangs}\).
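
The covariance calculation can be collected into one self-contained sketch. The names `joint_long` and `expect()` are illustrative; `expect()` is a hypothetical helper that takes the expected value of any function of \(\text{LOS}\) and \(\text{Gangs}\) under the joint PMF.

```r
# Joint PMF in long format: one row per (LOS, Gangs) combination.
joint_long <- data.frame(
  LOS = rep(1:5, each = 4),
  Gangs = rep(1:4, times = 5),
  p = c(
    0.00170, 0.04253, 0.12471, 0.08106,
    0.02664, 0.16981, 0.13598, 0.01757,
    0.05109, 0.11563, 0.03203, 0.00125,
    0.04653, 0.04744, 0.00593, 0.00010,
    0.07404, 0.02459, 0.00135, 0.00002
  )
)

# Hypothetical helper: expected value of `values` under the joint PMF.
expect <- function(values) sum(values * joint_long$p)

E_LOS <- expect(joint_long$LOS)
E_Gangs <- expect(joint_long$Gangs)
E_LOS_Gangs <- expect(joint_long$LOS * joint_long$Gangs)

# Cov(LOS, Gangs) = E(LOS * Gangs) - E(LOS) * E(Gangs).
cov_LOS_Gangs <- E_LOS_Gangs - E_LOS * E_Gangs
cov_LOS_Gangs
```

Here the marginal expected values are computed directly from the joint PMF, which gives the same result as using the marginal PMFs.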

Covariance Drawback

This measure depends on the spread of the random variables \(X\) and \(Y\).
For instance, if we multiply \(X\) by 10, then the covariance of \(X\) and \(Y\) increases by a factor of 10 as well: \[\begin{align*} \operatorname{Cov}(10X,Y) &= \mathbb{E}(10XY) - \mathbb{E}(10X) \mathbb{E}(Y) \\ &= 10\mathbb{E}(XY) - 10\mathbb{E}(X)\mathbb{E}(Y) \\ &= 10[\mathbb{E}(XY) - \mathbb{E}(X) \mathbb{E}(Y)] \\ &= 10\operatorname{Cov}(X,Y). \end{align*}\]

Pearson’s Correlation

Pearson’s correlation standardizes the distances according to the standard deviations \(\sigma_X\) and \(\sigma_Y\) of \(X\) and \(Y\), respectively. \[\begin{align*} \operatorname{Corr}(X, Y) &= \mathbb{E} \left[ \left(\frac{X-\mu_X}{\sigma_X}\right) \left(\frac{Y-\mu_Y}{\sigma_Y}\right) \right] \\ &= \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}. \end{align*}\]
It turns out that \(-1 \leq \text{Corr}(X, Y) \leq 1\).
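
As an illustration, the sketch below applies this formula to the cargo ship example, reusing the covariance of \(-0.73544\) obtained earlier. The helper `pmf_var()` is hypothetical and computes a variance from a PMF as \(\mathbb{E}(X^2) - [\mathbb{E}(X)]^2\).

```r
# Marginal PMFs from the cargo ship example.
los <- 1:5
p_los <- c(0.25, 0.35, 0.20, 0.10, 0.10)
gangs <- 1:4
p_gangs <- c(0.2, 0.4, 0.3, 0.1)

# Hypothetical helper: variance of a discrete random variable from its PMF.
pmf_var <- function(x, p) sum(x^2 * p) - sum(x * p)^2

var_LOS <- pmf_var(los, p_los)
var_Gangs <- pmf_var(gangs, p_gangs)

# Covariance taken from the earlier calculation with the joint distribution.
cov_LOS_Gangs <- -0.73544

corr_LOS_Gangs <- cov_LOS_Gangs / sqrt(var_LOS * var_Gangs)
corr_LOS_Gangs
```

The result is roughly \(-0.66\): unlike the covariance, it would not change if \(\text{LOS}\) were measured in hours instead of days.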

Pearson’s Correlation Scale

\(-1\) means a perfect negative linear relationship between \(X\) and \(Y\).
\(0\) means no linear relationship (however, this does not mean independence!).
\(1\) means a perfect positive linear relationship.

iClicker Question

Answer TRUE or FALSE:

Covariance can be negative, but not the variance.

A. TRUE

B. FALSE

iClicker Question

Answer TRUE or FALSE:

Without any further assumptions between random variables \(X\) and \(Y\), covariance is calculated as

\[\operatorname{Cov}(X,Y) = \mathbb{E}(XY) - \left[ \mathbb{E}(X) \mathbb{E}(Y) \right].\]

Computing \(\mathbb{E}(XY)\) requires the joint distribution, but computing \(\mathbb{E}(X) \mathbb{E}(Y)\) only requires the marginals.

A. TRUE

B. FALSE

2.2.2. Kendall’s \(\tau_K\)

Pearson’s correlation measures linear dependence.
This can be a major drawback, since many relationships between real-world variables are not linear.
Hence, there is an alternative measure called Kendall’s \(\tau_K\).

Characteristics of Kendall’s \(\tau_K\)

Kendall’s \(\tau_K\) can measure non-linear dependence, as long as the relationship is monotonic.
It measures concordance between each pair of observations \((x_i, y_i)\) and \((x_j, y_j)\) with \(i \neq j\):

Concordant means \[\begin{gather*} x_i < x_j \quad \text{and} \quad y_i < y_j, \\ \text{or} \\ x_i > x_j \quad \text{and} \quad y_i > y_j; \end{gather*}\] which gets a positive sign.

Characteristics of Kendall’s \(\tau_K\)

Discordant means \[\begin{gather*} x_i < x_j \quad \text{and} \quad y_i > y_j, \\ \text{or} \\ x_i > x_j \quad \text{and} \quad y_i < y_j; \end{gather*}\] which gets a negative sign.

Formal Definition

Kendall’s \(\tau_K\) summarizes the amount of concordance and discordance: it takes the difference between the number of concordant and the number of discordant pairs, and divides it by the total number of pairs.
The formal definition with \(n\) data pairs is

\[\tau_K = \frac{\text{Number of concordant pairs} - \text{Number of discordant pairs}}{{n \choose 2}}.\]

Kendall’s \(\tau_K\) lies between \(-1\) and \(1\), and it measures the strength (and direction) of the dependence.
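
The definition can be sketched directly by counting pairs. The function `kendall_tau()` and the toy vectors below are illustrative and assume there are no ties; with no ties, the result should agree with `cor(x, y, method = "kendall")`.

```r
# Hypothetical helper: Kendall's tau from the pair-counting definition.
kendall_tau <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  pairs <- combn(length(x), 2) # each column holds one pair of indices (i, j)
  # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
  signs <- sign(x[pairs[1, ]] - x[pairs[2, ]]) *
    sign(y[pairs[1, ]] - y[pairs[2, ]])
  (sum(signs > 0) - sum(signs < 0)) / choose(length(x), 2)
}

# Toy data without ties (made up for illustration).
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

tau_manual <- kendall_tau(x, y)
tau_builtin <- cor(x, y, method = "kendall")
c(manual = tau_manual, builtin = tau_builtin)
```

In this toy dataset, 8 of the 10 pairs are concordant and 2 are discordant, so \(\tau_K = (8 - 2)/10 = 0.6\).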

First Example

Consider the two correlation measures: Pearson and Kendall’s \(\tau_K\).
We will use a hypothetical dataset called `non_linear_pairs` with \(n = 21\) where: \[y = x^{1/3}.\]

Coding Up `non_linear_pairs`

non_linear_pairs <- tibble(
  x = seq(from = 0, to = 100, by = 5),
  y = x^(1 / 3)
)
non_linear_pairs

# A tibble: 21 × 2
       x     y
   <dbl> <dbl>
 1     0  0   
 2     5  1.71
 3    10  2.15
 4    15  2.47
 5    20  2.71
 6    25  2.92
 7    30  3.11
 8    35  3.27
 9    40  3.42
10    45  3.56
# ℹ 11 more rows

Plotting `non_linear_pairs`

Computing Correlation Metrics

tribble(
  ~Pearson, ~Kendall,
  round(cor(non_linear_pairs, method = "pearson")[1, 2], 4),
  round(cor(non_linear_pairs, method = "kendall")[1, 2], 4)
) |>
  knitr::kable(align = "cc")

Pearson	Kendall
0.9097	1

Second Example

Consider the two correlation measures: Pearson and Kendall’s \(\tau_K\).
We will use a hypothetical dataset called `parabola_pairs` with \(n = 21\) where: \[y = x^2.\]

Coding Up `parabola_pairs`

parabola_pairs <- tibble(
  x = seq(from = -50, to = 50, by = 5),
  y = x^2
)
parabola_pairs

# A tibble: 21 × 2
       x     y
   <dbl> <dbl>
 1   -50  2500
 2   -45  2025
 3   -40  1600
 4   -35  1225
 5   -30   900
 6   -25   625
 7   -20   400
 8   -15   225
 9   -10   100
10    -5    25
# ℹ 11 more rows

Plotting `parabola_pairs`

Computing Correlation Metrics

tribble(
  ~Pearson, ~Kendall,
  round(cor(parabola_pairs, method = "pearson")[1, 2], 4),
  round(cor(parabola_pairs, method = "kendall")[1, 2], 4)
) |>
  knitr::kable(align = "cc")

Pearson	Kendall
0	0

Patterns like a parabola are not monotonically increasing or decreasing.
Thus, neither Pearson nor Kendall’s \(\tau_K\) will capture the parabola pattern.
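
A quick way to see why Kendall’s \(\tau_K\) is zero here is to count the pairs. The sketch below recreates the parabola data as plain vectors (the names `pairs`, `signs`, and `pair_counts` are illustrative) and classifies each pair as concordant, discordant, or tied.

```r
# Parabola data: x is symmetric around zero, y = x^2.
x <- seq(from = -50, to = 50, by = 5)
y <- x^2

# All pairs of observations; each column holds one pair of indices (i, j).
pairs <- combn(length(x), 2)
signs <- sign(x[pairs[1, ]] - x[pairs[2, ]]) *
  sign(y[pairs[1, ]] - y[pairs[2, ]])

pair_counts <- c(
  concordant = sum(signs > 0),
  discordant = sum(signs < 0),
  tied = sum(signs == 0)
)
pair_counts
```

Because the data are symmetric around \(x = 0\), every concordant pair on one branch of the parabola is matched by a discordant pair on the other, so the numerator of \(\tau_K\) cancels out.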

2.3. Variance of a Sum Involving Two Non-Independent Random Variables

Suppose \(X\) and \(Y\) are not independent random variables.
Then, the variance of their sum picks up a covariance term: \[\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y).\]
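
As a numerical check, the sketch below uses the cargo ship example and computes \(\operatorname{Var}(\text{LOS} + \text{Gangs})\) in two ways: directly from the joint PMF, and through the formula above. The names `joint_long` and `pmf_var()` are illustrative.

```r
# Joint PMF in long format: one row per (LOS, Gangs) combination.
joint_long <- data.frame(
  LOS = rep(1:5, each = 4),
  Gangs = rep(1:4, times = 5),
  p = c(
    0.00170, 0.04253, 0.12471, 0.08106,
    0.02664, 0.16981, 0.13598, 0.01757,
    0.05109, 0.11563, 0.03203, 0.00125,
    0.04653, 0.04744, 0.00593, 0.00010,
    0.07404, 0.02459, 0.00135, 0.00002
  )
)

# Hypothetical helper: variance of `values` under the probabilities `p`.
pmf_var <- function(values, p) sum(values^2 * p) - sum(values * p)^2

# Way 1: variance of the sum, computed directly from the joint PMF.
var_sum_direct <- pmf_var(joint_long$LOS + joint_long$Gangs, joint_long$p)

# Way 2: Var(LOS) + Var(Gangs) + 2 * Cov(LOS, Gangs).
var_LOS <- pmf_var(joint_long$LOS, joint_long$p)
var_Gangs <- pmf_var(joint_long$Gangs, joint_long$p)
cov_LOS_Gangs <- sum(joint_long$LOS * joint_long$Gangs * joint_long$p) -
  sum(joint_long$LOS * joint_long$p) * sum(joint_long$Gangs * joint_long$p)
var_sum_formula <- var_LOS + var_Gangs + 2 * cov_LOS_Gangs

c(direct = var_sum_direct, formula = var_sum_formula)
```

Both values are about \(0.887\), which is smaller than \(\operatorname{Var}(\text{LOS}) + \operatorname{Var}(\text{Gangs}) = 2.3575\) because the covariance is negative.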

If \(X\) and \(Y\) are independent, then…

\[\mathbb{E}(XY) = \mathbb{E}(X) \mathbb{E}(Y).\]

Finally:

\[\begin{align*} \operatorname{Var}(X + Y) &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y) \\ &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2 \left\{ \mathbb{E}(XY) - \left[ \mathbb{E}(X)\mathbb{E}(Y) \right] \right\} \\ &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2 \underbrace{\left\{ \left[ \mathbb{E}(X) \mathbb{E}(Y) \right] - \left[ \mathbb{E}(X)\mathbb{E}(Y) \right] \right\}}_{0} \\ &= \operatorname{Var}(X) + \operatorname{Var}(Y). \end{align*}\]
