Depicting Uncertainty

Lecture 1

Hello and welcome!

DSCI 551 Specifics

High-Level Goals

Provide fundamental concepts in probability, including conditional, joint, and marginal distributions.
Develop a statistical view of data coming from a probability distribution.

Course Essentials

Eight lectures (non-graded), four labs (graded), and two PrairieLearn-based quizzes (graded).
MDS general policies can be found here.

Course content/logistics can be found in the GitHub repo.
There is a handy cheatsheet you can find here.
We will mostly use R in lectures and labs (except for lecture8 on Monte Carlo simulation, which will be delivered in Python and R).

Lecture Overview

You are required to do some reading in advance (except for lecture1).
We have rendered a Quarto website, which you can find here.
iClicker will be used for active learning via in-class activities beginning lecture2.

Lab Overview

Each cohort will be split into two sections:
- L01 and L02 for section 1.
- L03 and L04 for section 2.
Handouts are to be submitted as R Markdown files via Gradescope.
Labs will have auto-graded questions.

Communication

We will use the course’s Slack channel for your corresponding section.
Try to post all your general course inquiries in your corresponding channel.

Addressing your course-related questions there can also be helpful for your classmates!

Today’s Learning Goals

By the end of this lecture, we will be able to…

Identify probability as a proportion that converges to the truth as you collect more data.
Calculate probabilities using the Inclusion-Exclusion Principle, the Law of Total Probability, and probability distributions.
Convert between and interpret odds and probability.

And…

Specify the usefulness of odds over probability.
Be aware that probability has multiple interpretations/philosophies.
Calculate and interpret mean, mode, entropy, variance, and standard deviation, mainly from a distribution.

Outline

Thinking About Probability
Probability Distributions
Measures of Central Tendency and Uncertainty

1. Thinking About Probability

Probability recurs throughout different Data Science-related topics.
In MDS, you will find it in either the Statistics or Machine Learning courses.

Mexican Lotería (photo by irvin Macfarland on Unsplash).

1.1. Defining Probability

Let $A$ be an event of interest. Its probability is defined as \[P(A) = \frac{\text{Number of times event $A$ is observed}}{\text{Total number of events observed}}\]

as the total number of events observed goes to infinity.

The Coin Toss

Frequentist Statistics is the mainstream approach we learn in introductory courses.
Let us illustrate the frequentist paradigm with the typical coin toss example.

System Insights

The coin toss represents our system for which we assume two possible random outcomes:

\[\begin{gather*} H = \{ \text{Getting heads} \} \\ T = \{ \text{Getting tails} \}. \end{gather*}\]

Our system has the following parameters of interest:

\[\begin{gather*} P(H) = \text{Probability of getting heads} \\ P(T) = \text{Probability of getting tails}. \end{gather*}\]

Probabilistic Inquiries

Suppose this coin is unfair, i.e., \[P(H) \neq P(T), \quad \text{so neither probability equals } \frac{1}{2};\]

and we want to estimate these two unknown probabilities!

Now, think about the following questions:

How would you estimate these two unknown probabilities?
What are the characteristics of these two estimated probabilities?
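
One way to approach these questions, in the frequentist spirit, is to toss the coin many times and use the observed proportions as estimates. The sketch below simulates this in Python. The value `true_p_heads = 0.7`, the seed, and the function name `estimate_heads` are assumptions made only for illustration; in a real experiment the true probability is unknown.

```python
import random

def estimate_heads(n_tosses, true_p_heads, seed=551):
    """Estimate P(H) as the proportion of heads in n_tosses simulated tosses."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < true_p_heads for _ in range(n_tosses))
    return heads / n_tosses

true_p_heads = 0.7  # hypothetical value; unknown in a real experiment

for n in (10, 100, 10_000, 1_000_000):
    p_hat_heads = estimate_heads(n, true_p_heads)
    p_hat_tails = 1 - p_hat_heads  # the two estimates always sum to 1
    print(f"n = {n:>9,}: P(H) ~ {p_hat_heads:.4f}, P(T) ~ {p_hat_tails:.4f}")
```

With few tosses the estimates can be far from the true value; as the number of tosses grows, they tend to settle near it.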

1.2. Calculating Probabilities using Laws

Let us start with two fundamental laws that will allow us to exercise our probabilistic reasoning:
- Law of Total Probability.
- Inclusion-Exclusion Principle.

Sample Space ($S$)

It is the collection of all the possible outcomes of a random process or system.

Each one of these outcomes has a probability associated with it.
Note that \[P(S) = 1.\]

Law of Total Probability

Breaks down the sample space $S$ of a random process or system into disjoint parts.
We can obtain specific probabilities based on sample space partitions.

The Mario Kart Example

Name	Probability	Combat Type	Defeats Blue Shells
Banana	0.12	contact	no
Bob-omb	0.05	explosion	no
Coin	0.75	ineffective	no
Horn	0.03	explosion	yes
Shell	0.05	contact	no

Attribution: Images from pngkey.

Now, think about the following questions:

Are there any other items possible? Why or why not?
What is the probability of getting something other than a coin?
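
As a minimal sketch of how the table answers these questions, the Python snippet below stores the item probabilities in a dictionary (the name `item_probs` is an illustrative choice). It checks whether the listed items already account for the whole sample space, then computes the probability of anything other than a coin in two ways.

```python
# Item probabilities from the Mario Kart table.
item_probs = {
    "Banana": 0.12,
    "Bob-omb": 0.05,
    "Coin": 0.75,
    "Horn": 0.03,
    "Shell": 0.05,
}

# If the probabilities sum to 1, the listed items cover the whole sample space.
total = sum(item_probs.values())

# P(not Coin), computed by summing the other disjoint parts ...
p_not_coin_sum = sum(p for item, p in item_probs.items() if item != "Coin")
# ... and via the complement of getting a coin.
p_not_coin_complement = 1 - item_probs["Coin"]

print(round(total, 10), round(p_not_coin_sum, 10), round(p_not_coin_complement, 10))
```

Both routes give the same answer because the five items are disjoint parts of the sample space.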

Inclusion-Exclusion Principle

Let $A$ and $B$ be two events of interest in the sample space $S$: \[P(A \cup B) = P(A) + P(B) - P(A \cap B).\]

Extension to Three Events

Let $A$, $B$, and $C$ be three events of interest in the sample space $S$: \[\begin{align*} P(A \cup B \cup C) &= P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) \\ & \qquad - P(A \cap C) + P(A \cap B \cap C) \end{align*}\]
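
The three-event formula can be checked numerically on a small sample space. The sketch below assumes a fair six-sided die, an example that is not part of the lecture, with three arbitrarily chosen events. Exact fractions are used so that both sides can be compared without rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair die: every outcome has probability 1/6

def prob(event):
    """Probability of an event (a subset of the sample space) under equal likelihood."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # even roll
B = {4, 5, 6}  # roll greater than 3
C = {2, 3, 5}  # prime roll

# Left-hand side: probability of the union, computed directly.
lhs = prob(A | B | C)

# Right-hand side: the inclusion-exclusion expansion.
rhs = (prob(A) + prob(B) + prob(C)
       - prob(A & B) - prob(B & C) - prob(A & C)
       + prob(A & B & C))

print(lhs, rhs)
```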

Now, let us answer the following questions:

Using the table below, what is the probability of getting an item with an explosion combat type (event $E$)?

Name	Probability	Combat Type	Defeats Blue Shells
Banana	0.12	contact	no
Bob-omb	0.05	explosion	no
Coin	0.75	ineffective	no
Horn	0.03	explosion	yes
Shell	0.05	contact	no

Attribution: Images from pngkey.

Mutually Exclusive (or Disjoint) Events

Two events are mutually exclusive (or disjoint) if they cannot happen at the same time in the sample space $S$:

\[ P(A \cup B) = P(A) + P(B) - \underbrace{P(A \cap B)}_{0} = P(A) + P(B). \]

Then…

What is the probability of getting an item that is both an explosion item (event $E$) and defeats blue shells (event $D$)?

Finally…

What is the probability of getting an item that is an explosion item (event $E$) or an item that defeats blue shells (event $D$)?
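
The three questions about events $E$ and $D$ can be worked through with a short Python sketch. The tuple layout and the helper `prob` are illustrative choices, not part of the lecture material. The union is computed both directly from the table and with the Inclusion-Exclusion Principle.

```python
# Rows of the Mario Kart table: (name, probability, combat type, defeats blue shells).
items = [
    ("Banana", 0.12, "contact", False),
    ("Bob-omb", 0.05, "explosion", False),
    ("Coin", 0.75, "ineffective", False),
    ("Horn", 0.03, "explosion", True),
    ("Shell", 0.05, "contact", False),
]

def prob(condition):
    """Sum the probabilities of the items that satisfy the condition."""
    return sum(p for _, p, combat, defeats in items if condition(combat, defeats))

p_E = prob(lambda combat, defeats: combat == "explosion")  # explosion item
p_D = prob(lambda combat, defeats: defeats)                # defeats blue shells
p_E_and_D = prob(lambda combat, defeats: combat == "explosion" and defeats)

# Union computed directly from the table and via inclusion-exclusion.
p_E_or_D_direct = prob(lambda combat, defeats: combat == "explosion" or defeats)
p_E_or_D_formula = p_E + p_D - p_E_and_D

print(f"P(E) = {p_E:.2f}, P(D) = {p_D:.2f}, P(E and D) = {p_E_and_D:.2f}")
print(f"P(E or D) = {p_E_or_D_direct:.2f} (direct), {p_E_or_D_formula:.2f} (formula)")
```

Here every item that defeats blue shells is also an explosion item, so the union is no larger than $E$ itself.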

Independent Events

Two events are independent if the occurrence of one of them does not affect the probability of the other.
Equivalently, the probability of their intersection factorizes as: \[P(A \cap B) = P(A) \cdot P(B).\]
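
As an illustration, the product rule can be used to test whether two events are independent. The sketch below applies it to the Mario Kart events $E$ (explosion item) and $D$ (defeats blue shells), with the probabilities read off the item table; the variable names are illustrative.

```python
import math

p_E = 0.05 + 0.03  # Bob-omb + Horn
p_D = 0.03         # Horn is the only item that defeats blue shells
p_E_and_D = 0.03   # Horn is the only item in both events

# Independence requires P(E and D) to equal P(E) * P(D).
product = p_E * p_D
independent = math.isclose(p_E_and_D, product)

print(f"P(E and D) = {p_E_and_D:.4f}, P(E) * P(D) = {product:.4f}, independent: {independent}")
```

The two quantities differ, so under this table $E$ and $D$ are not independent: knowing that an item defeats blue shells changes the probability that it is an explosion item.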

1.3. Comparing Probabilities

We might be interested in comparing two probabilities.
Suppose an event has a probability $p$ of happening.

The Odds

The odds $o$ are defined as the ratio of this probability to the probability of the event not happening, $1 - p$: \[o = \frac{p}{1 - p}.\]
With some algebraic rearrangements, we can obtain $p$ with the odds: \[p = \frac{o}{o+1}.\]

Example

If you win 80% of the time at solitaire, i.e., $p = 0.8$, then your odds are: \[o = \frac{p}{1 - p} = \frac{0.8}{0.2} = 4.\]
This is sometimes written as 4:1 odds, that is, four wins for every loss.
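
The two conversion formulas translate directly into code. The sketch below is one possible Python version; the function names `prob_to_odds` and `odds_to_prob` are illustrative, and the input checks reflect the assumption that $p = 1$ is excluded because the odds would be undefined.

```python
def prob_to_odds(p):
    """Convert a probability p (0 <= p < 1) to odds o = p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def odds_to_prob(o):
    """Convert odds o >= 0 back to a probability p = o / (o + 1)."""
    if o < 0:
        raise ValueError("odds must be nonnegative")
    return o / (o + 1)

odds = prob_to_odds(0.8)             # the solitaire example
print(round(odds, 6))
print(round(odds_to_prob(odds), 6))  # converting back recovers the probability
```

Results are rounded when printed because floating-point division may not return exactly 4.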

2. Probability Distributions

A probability distribution is the set of all outcomes and their corresponding probabilities.
The outcome itself, which is uncertain, is called a random variable; e.g., \[X = \text{Number of customers standing in line at a bank branch.}\]

Types of Random Variables

In general, random variables are classified as:

Continuous: it can take on an uncountable set of outcomes.
Discrete: it can take on a countable set of outcomes.

Heads-up: A continuous random variable has a probability density function (PDF), whereas a discrete one has a probability mass function (PMF).

Example of a Discrete and Categorical Random Variable

\[Y = \text{Item obtained from the box.}\]

Item	$Y$	Probability
	Banana	0.12
	Bob-omb	0.05
	Coin	0.75
	Horn	0.03
	Shell	0.05

Attribution: Images from pngkey.

Example of a Discrete and Count Random Variable

\[C = \text{Length of ship stay in days.}\]

$C$	Probability
1	0.25
2	0.50
3	0.15
4	0.10
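
A PMF like this one can be stored as a mapping from outcomes to probabilities. The Python sketch below checks that the probabilities sum to 1 and computes the probability of an event from the distribution. It then compares that value with the proportion seen in simulated draws. The event "stay of at least 3 days", the seed, and the sample size are assumptions chosen only for illustration.

```python
import random

# PMF of C = length of ship stay in days, taken from the table.
pmf_C = {1: 0.25, 2: 0.50, 3: 0.15, 4: 0.10}

# A valid PMF has probabilities that sum to 1.
total = sum(pmf_C.values())

# Probability of an event: add up the probabilities of its outcomes.
p_at_least_3 = sum(p for c, p in pmf_C.items() if c >= 3)

# Simulate draws from the distribution and compare with the exact value.
rng = random.Random(551)  # fixed seed for reproducibility
draws = rng.choices(list(pmf_C), weights=list(pmf_C.values()), k=10_000)
prop_at_least_3 = sum(c >= 3 for c in draws) / len(draws)

print(f"sum of PMF = {total:.2f}")
print(f"P(C >= 3) = {p_at_least_3:.2f}, simulated proportion = {prop_at_least_3:.3f}")
```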

3. Measures of Central Tendency and Uncertainty

These measures summarize the information of a probability distribution.
They fall into two groups:
- Central tendency: a “typical” value of a random variable.
- Uncertainty: a measure of how spread out the random variable is.

3.1. Mode and Entropy in Discrete Random Variables

Both measures apply to all classes of discrete random variables.
The mode is a measure of central tendency. It is the outcome having the highest probability.

The Entropy

It is a measure of uncertainty defined as

\[H(Y) = -\displaystyle \sum_y P(Y = y)\log[P(Y = y)].\]

Entropy is always nonnegative.
If its value is equal to zero, then there is no randomness: a single outcome has probability 1.

Example

What is the mode for $Y = \text{Item obtained from the box}$?

Item	$Y$	Probability
	Banana	0.12
	Bob-omb	0.05
	Coin	0.75
	Horn	0.03
	Shell	0.05

Attribution: Images from pngkey.

How About the Entropy?

\[\begin{align*} H(Y) &= -\displaystyle \sum_y P(Y = y)\log[P(Y = y)] \\ &= -[0.12 \log(0.12) + 0.05 \log(0.05) + \\ & \qquad \quad 0.75 \log(0.75) + 0.03 \log(0.03) + 0.05 \log(0.05)] \\ &\approx 0.87 \quad \text{(using the natural logarithm)} \end{align*}\]
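
The mode and the entropy of $Y$ can both be computed from the PMF in a few lines. The Python sketch below uses the natural logarithm, which is the base that reproduces the value 0.87; the dictionary name `pmf_Y` is an illustrative choice.

```python
import math

# PMF of Y = item obtained from the box.
pmf_Y = {
    "Banana": 0.12,
    "Bob-omb": 0.05,
    "Coin": 0.75,
    "Horn": 0.03,
    "Shell": 0.05,
}

# Mode: the outcome with the highest probability.
mode = max(pmf_Y, key=pmf_Y.get)

# Entropy with the natural logarithm; outcomes with probability 0 contribute nothing.
entropy = -sum(p * math.log(p) for p in pmf_Y.values() if p > 0)

print(mode, round(entropy, 2))  # Coin 0.87
```

A different logarithm base, such as base 2, only rescales the entropy; it does not change which distributions are more or less uncertain.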

3.2. Mean and Variance

These measures apply to both discrete and continuous random variables (as long as they are numeric!).

The Mean

It is a measure of central tendency.

If $X$ is discrete, with $P(X = x)$ as a PMF, then \[\mathbb{E}(X) = \displaystyle \sum_x x \cdot P(X = x).\]
If $X$ is continuous, with $f_X(x)$ as a PDF, then \[\mathbb{E}(X) = \displaystyle \int_x x \cdot f_X(x) \text{d}x.\]

The Variance

It is a measure of uncertainty.

\[\text{Var}(X) = \mathbb{E}\{[X - \mathbb{E}(X)]^2\} = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2.\]

Note that it is an expectation (specifically, the expected squared deviation from the mean).
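
As a worked example, the sketch below computes the mean, variance, and standard deviation for the ship stay variable $C$ from its PMF. The variance is computed with both expressions in the formula above to show that they agree. The variable names are illustrative.

```python
import math

# PMF of C = length of ship stay in days.
pmf_C = {1: 0.25, 2: 0.50, 3: 0.15, 4: 0.10}

# Mean: sum of each outcome weighted by its probability.
mean = sum(c * p for c, p in pmf_C.items())

# Variance as the expected squared deviation from the mean.
var_definition = sum((c - mean) ** 2 * p for c, p in pmf_C.items())

# Variance via the shortcut E(X^2) - [E(X)]^2.
mean_of_squares = sum(c ** 2 * p for c, p in pmf_C.items())
var_shortcut = mean_of_squares - mean ** 2

# Standard deviation: square root of the variance, in the same units as C (days).
sd = math.sqrt(var_definition)

print(f"mean = {mean:.2f}, variance = {var_definition:.2f}, sd = {sd:.3f}")
```

The standard deviation is often easier to interpret than the variance because it is expressed in days rather than squared days.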
