Lecture 3

GLMs: Ordinal Logistic Regression

(Please sign in on iClicker)

Today’s Learning Goals

By the end of this lecture, you should be able to:

  • Outline the modelling framework of Ordinal Logistic regression.
  • Explain the concept of proportional odds.
  • Fit and interpret an Ordinal Logistic regression model.
  • Use Ordinal Logistic regression for prediction.
  • Assess the proportional odds assumption via the Brant-Wald test.
  • Contrast Ordinal Logistic regression models under proportional and non-proportional odds assumptions.

Outline

  1. The Need for an Ordinal Model
  2. College Juniors Dataset
  3. Exploratory Data Analysis
  4. Data Modelling Framework
  5. Estimation
  6. Inference
  7. Coefficient Interpretation
  8. Predictions
  9. Model Selection
  10. Non-proportional Odds
  11. Wrapping Up on Categorical Regression Models

1. The Need for an Ordinal Model

  • We will continue with Regression Analysis on categorical responses.


1.1. Reminder

  • Recall that a discrete response is considered categorical if it is of factor-type with more than two categories:
  1. Nominal. The categories do not follow any specific order; for example, the type of dwelling according to the Canadian census: single-detached house, semi-detached house, row house, apartment, and mobile home.
  2. Ordinal. The categories follow a specific order; for example, a Likert scale of survey items: strongly disagree, disagree, neutral, agree, and strongly agree.

1.2. A question to start with …

  • Why not just use Multinomial Logistic regression in the case of an ordinal response?
  • The previous Multinomial Logistic model is suboptimal for ordinal responses.
  • If we do not consider the order of the response levels when necessary, we risk losing valuable information when making inferences or predictions.
  • This valuable information lies in the ordinal scale, which can be set up via cumulative probabilities.

1.3. What to do then?

  • We need an alternative logistic regression that suits ordinal responses.
  • Hence the existence of the Ordinal Logistic regression model.
  • We will expand the regression mindmap to include this new model.
  • Note there is a key characteristic called proportional odds, which is reflected in the data modelling framework.

2. College Juniors Dataset

  • To illustrate the use of this generalized linear model (GLM), let us introduce an adequate dataset.
  • The data frame college_data was obtained from the webpage of the UCLA Statistical Consulting Group.
  • It contains the results of a survey administered to \(n = 400\) college juniors regarding the factors that influence their decision to apply to graduate school.

2.1. Read data in R
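Below is a minimal loading sketch, assuming the data sit in a local CSV file named college_data.csv with columns decision, parent_ed, and GPA (the file name and path are assumptions; adjust as needed).

```r
# A sketch: read the survey data from a local CSV file
# (file name and path are assumptions).
college_data <- read.csv("college_data.csv", stringsAsFactors = FALSE)

str(college_data)  # n = 400 rows: decision, parent_ed, GPA
```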

2.2. A quick look at data!

  • The dataset contains the following variables:
    • decision: how likely the student will apply to graduate school, a discrete and ordinal response with three increasing categories (unlikely, somewhat likely, and very likely).
    • parent_ed: whether at least one parent has a graduate degree, a discrete and binary explanatory variable (Yes and No).
    • GPA: the student’s current GPA, a continuous explanatory variable.

2.3. A quick data wrangling!

  • Variable decision will be our response of interest.
  • Before proceeding with our regression analysis, let us set it up as an ordered factor via function as.ordered(), as shown below.
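A sketch of this wrangling step; note we first set the level order explicitly, since as.ordered() alone would keep the alphabetical ordering of the labels.

```r
# Set the increasing level order explicitly (alphabetical order would be
# wrong for this scale), then coerce to an ordered factor.
college_data$decision <- factor(college_data$decision,
                                levels = c("unlikely", "somewhat likely", "very likely"))
college_data$decision <- as.ordered(college_data$decision)

levels(college_data$decision)
```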

2.4. Main Statistical Inquiries

  • Let us suppose we want to assess the following:
    • Is decision statistically related to parent_ed and GPA?
    • How can we interpret these statistical relationships (if there are any!)?

3. Exploratory Data Analysis (EDA)

  • Since decision is a discrete ordinal response, we have to be careful about the plots to be used in the EDA.
  • To explore the relationship between decision and GPA, we will use side-by-side boxplots and/or violin plots.
  • GPA is on the \(y\)-axis, and all categories of decision are on the \(x\)-axis.
  • The side-by-side violin plots show the mean of GPA by decision category as red points.

Plots!

(Figure: side-by-side boxplots and violin plots of GPA by decision category; red points mark the mean GPA in each category.)
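A hedged ggplot2 sketch of these plots (aesthetic choices such as fill colour and widths are assumptions):

```r
library(ggplot2)

# Side-by-side violin plots with overlaid boxplots; red points mark the
# mean GPA per decision category.
ggplot(college_data, aes(x = decision, y = GPA)) +
  geom_violin(fill = "lightblue") +
  geom_boxplot(width = 0.15) +
  stat_summary(fun = mean, geom = "point", colour = "red", size = 3) +
  labs(x = "Decision to apply to graduate school", y = "GPA")
```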

How can we visually explore the relationship between decision and parent_ed?

  • When both variables are categorical, as with decision and parent_ed, we can use stacked bar charts.
  • Each bar will show the percentages (depicted on the \(y\)-axis) of college students falling in each ordered category of decision, with the categories of parent_ed on the \(x\)-axis.

Plots!

(Figure: stacked bar chart of decision percentages within each parent_ed category.)
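A sketch of the stacked bar chart; position = "fill" rescales each bar so the \(y\)-axis reads as percentages:

```r
# Stacked bar chart of decision percentages within each parent_ed category.
ggplot(college_data, aes(x = parent_ed, fill = decision)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(x = "At least one parent with a graduate degree",
       y = "Percentage of students", fill = "Decision")
```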

In-class Question

  • What can we see descriptively from the previous plots?

4. Data Modelling Framework

  • Let us suppose that a given discrete ordinal response \(Y_i\) (for \(i = 1, \dots, n\)) has categories \(1, 2, \dots, m\) in a training set of size \(n\).

Heads-up: Categories \(1, 2, \dots, m\) imply an ordinal scale here, i.e., \(1 < 2 < \dots < m\).

  • Also, note there is more than one class of Ordinal Logistic regression.
  • We will review the proportional odds model (a cumulative logit model).

When \(m = 3\)

  • We will set up an Ordinal Logistic regression model for this specific dataset with \(m = 3\) levels of \(Y_i\), i.e.,

\[\begin{gather*} 1 = \texttt{unlikely} \\ 2 = \texttt{somewhat likely} \\ 3 = \texttt{very likely} \end{gather*}\]

The Regressors


  • For the \(i\)th observation, we have the continuous \(X_{i, \texttt{GPA}}\) and the dummy variable

\[ X_{i, \texttt{parent_ed}} = \begin{cases} 1 \; \; \; \; \mbox{at least one parent has a graduate degree},\\ 0 \; \; \; \; \mbox{neither of the parents has a graduate degree.} \end{cases} \]

Then…


  • The model will indicate how each one of the regressors affects the cumulative logarithm of the odds in decision for the following \(2\) situations:

\[\begin{gather*} \text{Level } \texttt{somewhat likely} \text{ or any lesser degree versus level } \texttt{very likely} \\ \text{Level } \texttt{unlikely} \text{ versus level } \texttt{somewhat likely} \text{ or any higher degree} \end{gather*}\]

These are our cumulative log-odds!

  • We have the following system of \(2\) equations:

\[\begin{align*} \eta_i^{(\texttt{somewhat likely})} &= \log\left[\frac{P(Y_i \leq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i = \texttt{very likely} \mid X_{i, \texttt{GPA}}, X_{i, \texttt{parent_ed}})}\right] \\ &= \beta_0^{(\texttt{somewhat likely})} - \beta_1 X_{i, \texttt{GPA}} - \beta_2 X_{i, \texttt{parent_ed}} \end{align*}\] \[\begin{align*} \eta_i^{(\texttt{unlikely})} &= \log\left[\frac{P(Y_i = \texttt{unlikely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i \geq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}\right] \\ &= \beta_0^{(\texttt{unlikely})} - \beta_1 X_{i, \texttt{GPA}} - \beta_2 X_{i, \texttt{parent_ed}}. \end{align*}\]

Note the following…

  • The system has \(2\) intercepts but only \(2\) regression coefficients, which are shared across both equations.
  • Check the minus signs on the common \(\beta_1\) and \(\beta_2\); this is a fundamental parameterization of the proportional odds model.

Some algebraic manipulation…

  • To make coefficient interpretation easier, we can re-express the equations of \(\eta_i^{(\texttt{somewhat likely})}\) and \(\eta_i^{(\texttt{unlikely})}\) as follows:

\[\begin{gather*} \frac{P(Y_i = \texttt{very likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i \leq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})} = \\ \exp\big(-\beta_0^{(\texttt{somewhat likely})}\big) \exp\big(\beta_1 X_{i, \texttt{GPA}}\big) \exp\big(\beta_2 X_{i, \texttt{parent_ed}}\big) \end{gather*}\] \[\begin{gather*} \frac{P(Y_i \geq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i = \texttt{unlikely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})} = \\ \exp\big(-\beta_0^{(\texttt{unlikely})}\big) \exp\big(\beta_1 X_{i, \texttt{GPA}}\big) \exp\big(\beta_2 X_{i, \texttt{parent_ed}}\big). \end{gather*}\]

Further Insights

  • Both regression coefficients (\(\beta_1\) and \(\beta_2\)) have a multiplicative effect on the odds that we have on the left-hand side.
  • Moreover, given the proportional odds assumption, the regressor effects are the same for both cumulative odds.


What is the math for the general case with \(m\) response categories and \(k\) regressors?

  • This model will indicate how each one of the \(k\) regressors \(X_{i,1}, \dots, X_{i,k}\) affects the cumulative logarithm of the odds in the ordinal response (with \(m\) levels) for the following \(m - 1\) situations:

\[\begin{gather*} \text{Level } m - 1 \text{ or any lesser degree versus level } m\\ \text{Level } m - 2 \text{ or any lesser degree versus level } m - 1 \text{ or any higher degree}\\ \vdots \\ \text{Level } 2 \text{ or any lesser degree versus level } 3 \text{ or any higher degree}\\ \text{Level } 1 \text{ versus level } 2 \text{ or any higher degree}\\ \end{gather*}\]

Cumulative Probabilities

  • Note that the previous system has \(m - 1\) intercepts but only \(k\) regression coefficients.
  • The previous \(m - 1\) equations can be written in general form for levels \(j = m - 1, \dots, 1\) as follows:

\[\begin{gather*} \eta_i^{(j)} = \log\left[\frac{P(Y_i \leq j \mid X_{i,1}, \ldots, X_{i,k})}{P(Y_i > j \mid X_{i,1}, \ldots, X_{i,k})}\right] = \beta_0^{(j)} - \beta_1 X_{i, 1} - \ldots - \beta_k X_{i, k} \\ \; \; \; \; \; \; \; \; \Rightarrow \; P(Y_i \leq j \mid X_{i,1}, \ldots, X_{i,k}) = \frac{\exp\left(\beta_0^{(j)} - \beta_1 X_{i, 1} - \ldots - \beta_k X_{i, k}\right)}{1 + \exp\left(\beta_0^{(j)} - \beta_1 X_{i, 1} - \ldots - \beta_k X_{i, k}\right)}. \end{gather*}\]

Standalone Probabilities

  • The standalone probability that \(Y_i\) falls in category \(j\) is computed as follows: \[ p_{i,j} = P(Y_i = j \mid X_{i,1}, \ldots, X_{i,k}) = P(Y_i \leq j \mid X_{i,1}, \ldots, X_{i,k}) - P(Y_i \leq j - 1 \mid X_{i,1}, \ldots, X_{i,k}). \]

  • Hence: \[ P(Y_i = 1) = p_{i,1}, \;\;\;\; P(Y_i = 2) = p_{i,2}, \;\; \dots, \;\; P(Y_i = m) = p_{i,m}, \]

  • where: \[ \sum_{j = 1}^m p_{i,j} = p_{i,1} + p_{i,2} + \dots + p_{i,m} = 1. \]

iClicker Question

  • Suppose we have a case with an ordered response of five levels and four continuous regressors.
  • How many link functions, intercepts, and coefficients will our Ordinal Logistic regression model have under a proportional odds assumption?

A. Four link functions, four intercepts, and sixteen regression coefficients.

B. Five link functions, five intercepts, and five regression coefficients.

C. Five link functions, five intercepts, and twenty regression coefficients.

D. Four link functions, four intercepts, and four regression coefficients.


A Graphical Intuition of the Proportional Odds Assumption

  • There is an optional section in the notes that shows mathematically what the proportional odds assumption stands for in the Ordinal Logistic regression model with \(k\) regressors and a training set size of \(n\), using a response of \(m\) levels in general.
  • That said, let us check the graphical intuition with an example of an ordinal response of \(5\) levels and a single continuous regressor \(X_{i , 1}\) (which is unbounded!).

System of Cumulative Logit Equations

\[\begin{gather*} \eta_i^{(4)} = \log\left[\frac{P(Y_i \leq 4 \mid X_{i,1})}{P(Y_i = 5 \mid X_{i,1})}\right] = \beta_0^{(4)} - \beta_1 X_{i, 1} \\ \eta_i^{(3)} = \log\left[\frac{P(Y_i \leq 3 \mid X_{i,1})}{P(Y_i > 3 \mid X_{i,1})}\right] = \beta_0^{(3)} - \beta_1 X_{i, 1} \\ \eta_i^{(2)} = \log\left[\frac{P(Y_i \leq 2 \mid X_{i,1})}{P(Y_i > 2 \mid X_{i,1})}\right] = \beta_0^{(2)} - \beta_1 X_{i, 1} \\ \eta_i^{(1)} = \log\left[\frac{P(Y_i = 1 \mid X_{i,1})}{P(Y_i > 1 \mid X_{i,1})}\right] = \beta_0^{(1)} - \beta_1 X_{i, 1}. \end{gather*}\]

Some algebraic manipulation…

\[\begin{gather*} P(Y_i \leq 4 \mid X_{i,1}) = \frac{\exp\left(\beta_0^{(4)} - \beta_1 X_{i, 1} \right)}{1 + \exp\left(\beta_0^{(4)} - \beta_1 X_{i, 1}\right)} \\ P(Y_i \leq 3 \mid X_{i,1}) = \frac{\exp\left(\beta_0^{(3)} - \beta_1 X_{i, 1} \right)}{1 + \exp\left(\beta_0^{(3)} - \beta_1 X_{i, 1}\right)} \\ P(Y_i \leq 2 \mid X_{i,1}) = \frac{\exp\left(\beta_0^{(2)} - \beta_1 X_{i, 1} \right)}{1 + \exp\left(\beta_0^{(2)} - \beta_1 X_{i, 1}\right)} \\ P(Y_i \leq 1 \mid X_{i,1}) = \frac{\exp\left(\beta_0^{(1)} - \beta_1 X_{i, 1} \right)}{1 + \exp\left(\beta_0^{(1)} - \beta_1 X_{i, 1}\right)}. \\ \end{gather*}\]

Note the following…

  • There are four intercepts \(\left( \beta_0^{(1)}, \beta_0^{(2)}, \beta_0^{(3)}, \beta_0^{(4)} \right)\) and a single coefficient \(\beta_1\).
  • Let us assume we estimate these five parameters to be the following: \[\begin{gather*} \hat{\beta_0}^{(1)} = 4 \quad \quad \hat{\beta_0}^{(2)} = 5.5 \\ \hat{\beta_0}^{(3)} = 6.3 \quad \quad \hat{\beta_0}^{(4)} = 10 \\ \hat{\beta_1} = 12. \end{gather*}\]

The sigmoid functions!

(Figure: the four estimated cumulative-probability sigmoid curves, which are horizontal shifts of one another.)
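A small base-R sketch of these curves using the hypothetical estimates above; under proportional odds, the four sigmoids share the same slope and never cross:

```r
# Cumulative-probability curves P(Y <= j | x) = plogis(b0_j - b1 * x)
# for the four hypothetical intercepts and the common slope above.
x  <- seq(0, 1, length.out = 200)
b0 <- c(4, 5.5, 6.3, 10)   # beta_0^(1), ..., beta_0^(4)
b1 <- 12                   # common slope beta_1

plot(NULL, xlim = range(x), ylim = c(0, 1),
     xlab = expression(X[i1]), ylab = expression(P(Y[i] <= j)))
for (j in 1:4) {
  lines(x, plogis(b0[j] - b1 * x), col = j, lwd = 2)
}
legend("topright", legend = paste0("j = ", 1:4), col = 1:4, lwd = 2)
```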

5. Estimation

  • Estimates are obtained through maximum likelihood, where we also assume a Multinomial joint probability mass function of the \(n\) responses \(Y_i\) via cumulative probabilities in the joint likelihood function.
  • To fit the model with the package MASS, we use polr(). The argument Hess = TRUE is required to compute the Hessian matrix of the log-likelihood function, which is used to obtain the standard errors of the estimates, as sketched below.
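A sketch of the fitting call, using the variable names from our wrangled data frame:

```r
library(MASS)

# Proportional odds model; Hess = TRUE stores the Hessian of the
# log-likelihood so summary() can report standard errors.
ordinal_model <- polr(decision ~ parent_ed + GPA,
                      data = college_data, Hess = TRUE)
```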

6. Inference

  • We also use the Wald statistic \(z_j = \frac{\hat{\beta}_j}{\mbox{se}(\hat{\beta}_j)}\) to test the hypotheses: \[\begin{gather*} H_0: \beta_j = 0\\ H_a: \beta_j \neq 0. \end{gather*}\]

  • Provided the sample size \(n\) is large enough, \(z_j\) has an approximately Standard Normal distribution under \(H_0\).

Code in R!
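Since polr() does not report \(p\)-values directly, a common sketch is to take the Wald \(z_j\) values from summary() and apply the Standard Normal approximation:

```r
# Coefficient table: columns Value, Std. Error, and t value (the Wald z_j).
summary_table <- coef(summary(ordinal_model))

# Two-sided p-values from the Standard Normal approximation.
p_values <- 2 * pnorm(abs(summary_table[, "t value"]), lower.tail = FALSE)
cbind(summary_table, "p value" = round(p_values, 4))
```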

  • Terms unlikely|somewhat likely and somewhat likely|very likely refer to \(\hat{\beta_0}^{(\texttt{unlikely})}\) and \(\hat{\beta_0}^{(\texttt{somewhat likely})}\), respectively.

Using \(\alpha = 0.05\)

  • We see that the coefficients associated with GPA and parent_ed are statistically significant at \(\alpha = 0.05\).
  • Function confint() can provide the 95% confidence intervals for the estimates contained in ordinal_model, as sketched below.
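For instance (95% is confint()'s default level; exponentiating moves the intervals to the odds scale):

```r
# Profile-likelihood confidence intervals for beta_1 and beta_2,
# exponentiated to the (multiplicative) odds scale.
exp(confint(ordinal_model))
```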

7. Coefficient Interpretation

  • The interpretation of the Ordinal Logistic regression coefficients will change since we are modelling cumulative probabilities.
  • Let us interpret the association of decision with GPA and parent_ed.

Interpretations

  • Using the column exp.estimate, along with the model equations on the original scale of the cumulative odds, we interpret the two regression coefficients for each set of odds as follows:
  • For the odds:

\[\frac{P(Y_i = \texttt{very likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i \leq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})} = \\ \exp\big(-\beta_0^{(\texttt{somewhat likely})}\big) \exp\big(\beta_1 X_{i, \texttt{GPA}}\big) \exp\big(\beta_2 X_{i, \texttt{parent_ed}}\big).\]

We have …

\(\beta_1\): “for each one-unit increase in the GPA, the odds that the student is very likely versus somewhat likely or unlikely to apply to graduate school increase by \(\exp \left( \hat{\beta}_1 \right) = 1.82\) times (while holding parent_ed constant).”

\(\beta_2\): “for those respondents whose parents attended graduate school, the odds that the student is very likely versus somewhat likely or unlikely to apply to graduate school increase by \(\exp \left( \hat{\beta}_2 \right) = 2.86\) times (when compared to those respondents whose parents did not attend graduate school, holding GPA constant).”

And…

  • For the odds:

\[\frac{P(Y_i \geq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i = \texttt{unlikely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})} = \\ \exp\big(-\beta_0^{(\texttt{unlikely})}\big) \exp\big(\beta_1 X_{i, \texttt{GPA}}\big) \exp\big(\beta_2 X_{i, \texttt{parent_ed}}\big).\]

We have …

\(\beta_1\): “for each one-unit increase in the GPA, the odds that the student is very likely or somewhat likely versus unlikely to apply to graduate school increase by \(\exp \left( \hat{\beta}_1 \right) = 1.82\) times (while holding parent_ed constant).”

\(\beta_2\): “for those respondents whose parents attended graduate school, the odds that the student is very likely or somewhat likely versus unlikely to apply to graduate school increase by \(\exp \left( \hat{\beta}_2 \right) = 2.86\) times (when compared to those respondents whose parents did not attend graduate school, holding GPA constant).”

8. Predictions

  • Aside from our main statistical inquiries, using the function predict() with the object ordinal_model, we can obtain the estimated probabilities of applying to graduate school (associated with unlikely, somewhat likely, and very likely) for a student with a GPA of 3.5 whose parents attended graduate school, as shown below.
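A sketch of this prediction, assuming parent_ed is coded with levels "No" and "Yes":

```r
# Estimated standalone probabilities for each decision category.
new_student <- data.frame(parent_ed = "Yes", GPA = 3.5)
predict(ordinal_model, newdata = new_student, type = "probs")
```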

Thus…

  • We can see that somewhat likely is the most probable category, with an estimated probability of 0.48.
  • Thus, we could classify the student in this ordinal category.

9. Model Selection

  • To perform model selection, we can use the same techniques from Lecture 2. That said, some R functions might not be fully available for polr() models.
  • Still, these statistical techniques and metrics can be manually coded.

10. Non-proportional Odds

  • Now, we might wonder:

What happens if we do not fulfil the proportional odds assumption in our Ordinal Logistic regression model?


10.1. The Brant-Wald Test

  • The Ordinal Logistic model under the proportional odds assumption is the first step when performing Regression Analysis on an ordinal response.
  • Once this model has been fitted, it is possible to assess whether it fulfils this strong assumption statistically.
  • We can do it via the Brant-Wald test.
  • This tool statistically assesses whether our model globally fulfils this assumption: \[\begin{gather*} H_0: \text{Our Ordinal Logistic regression model globally} \\ \text{fulfils the proportional odds assumption.} \\ H_a: \text{Otherwise}. \end{gather*}\]

Regressor-specific Tests


  • It also performs further hypothesis testing on each regressor.
  • With \(k\) regressors, for \(j = 1, \dots, k\) we have the following hypotheses: \[\begin{gather*} H_0: \text{The } j \text{th regressor in our Ordinal Logistic regression} \\ \text{model fulfils the proportional odds assumption.} \\ H_a: \text{Otherwise}. \end{gather*}\]

10.2. Code in R!

  • Function brant() from package brant implements this tool, which can be used in our polr() object ordinal_model:
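A minimal call (the printed table layout may vary by package version):

```r
library(brant)

# Brant-Wald test on the fitted proportional odds model; the Omnibus row
# is the global test, the remaining rows are regressor-specific tests.
brant(ordinal_model)
```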

Notes on Output

  • The row Omnibus represents the global model, while the other two rows correspond to our two regressors: parent_ed and GPA.
  • With \(\alpha = 0.05\), we fail to reject \(H_0\) in all three tests; hence, there is no evidence against the proportional odds assumption (the column probability delivers the corresponding \(p\)-values).

Heads-up: The Brant-Wald test essentially compares this basic Ordinal Logistic regression model of \(m - 1\) cumulative logit functions (i.e., the ordinal response has \(m\) categories) versus a set of \(m - 1\) correlated Binary Logistic regressions.

10.3. A Non-proportional Odds Model

  • Suppose that our example case does not fulfil the proportional odds assumption according to the Brant-Wald test.
  • It is possible to fit a model under a non-proportional odds assumption:

\[\begin{align*} \eta_i^{(\texttt{somewhat likely})} &= \log\left[\frac{P(Y_i \leq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i = \texttt{very likely} \mid X_{i, \texttt{GPA}}, X_{i, \texttt{parent_ed}})}\right] \\ &= \beta_0^{(\texttt{somewhat likely})} - \beta_1^{(\texttt{somewhat likely})} X_{i, \texttt{GPA}} - \\ & \qquad \beta_2^{(\texttt{somewhat likely})} X_{i, \texttt{parent_ed}} \end{align*}\] \[\begin{align*} \eta_i^{(\texttt{unlikely})} &= \log\left[\frac{P(Y_i = \texttt{unlikely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}{P(Y_i \geq \texttt{somewhat likely} \mid X_{i, \texttt{GPA}},X_{i, \texttt{parent_ed}})}\right] \\ &= \beta_0^{(\texttt{unlikely})} - \beta_1^{(\texttt{unlikely})} X_{i, \texttt{GPA}} - \beta_2^{(\texttt{unlikely})} X_{i, \texttt{parent_ed}}. \end{align*}\]

Generalized Ordinal Logistic Regression Model


  • This class of models can be fit via the function cumulative() from package VGAM, as sketched below.
  • Note that estimation, inference, coefficient interpretation, and predictions are conducted similarly to the proportional odds model.
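A sketch of the call; setting parallel = FALSE in cumulative() frees one coefficient per cumulative logit, yielding the non-proportional odds model:

```r
library(VGAM)

# Generalized (non-proportional odds) Ordinal Logistic regression:
# each of the m - 1 cumulative logits gets its own coefficients.
nonprop_model <- vglm(decision ~ parent_ed + GPA,
                      family = cumulative(parallel = FALSE),
                      data = college_data)
summary(nonprop_model)
```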

In-class Question

  • Why would we have to be cautious when fitting a non-proportional odds model in the context of a response with a considerable number of ordinal levels and regressors?


11. Wrapping Up on Categorical Regression Models

  • Multinomial and Ordinal Logistic regression models address cases where our response is discrete and categorical.
  • These models are another class of GLMs.
  • Multinomial Logistic regression handles nominal responses. It relies heavily on a baseline response level and the natural logarithm of the odds.
  • Ordinal Logistic regression handles ordinal responses. It relies on cumulative odds on an ordinal scale.
  • We have to be careful with coefficient interpretations in these models, especially with the cumulative odds; careful interpretation is essential for communicating the estimated models effectively in inferential matters.