quizdown for Quarto, MDS fork

Interactive quizzes for Quarto documents, powered by quizdown-js. MDS fork of parmsam/quarto-quizdown, extended with new question types and a print review mode.

Install:

quarto add UBC-MDS/quarto-quizdown-mds-ext

Try it — full quiz

All five question types in one quiz. Answer everything, hit ✓✓ to evaluate, then press Cmd/Ctrl+P to see the print review sheet.

---
shuffle_answers: true
---

## Match each data loading approach to its best use case!

- `pd.read_parquet()` at startup :: Small dataset that fits comfortably in memory
    > Fast and simple: load once at startup, reuse throughout the session.
- `ibis` + DuckDB lazy query :: Large file: push filters before loading
    > Only the query result enters memory; the full file never loads. Ideal for multi-GB Parquet.
- In-memory pre-sampled data :: Shinylive / WASM deployment
    > DuckDB file access silently fails in WASM, so embed a small pre-sampled dataset instead.
- `@reactive.calc` with eager load :: Slow computation shared by multiple outputs
    > Load eagerly, then cache the filtered result so downstream outputs don't re-run the query.
- :: Streaming from a REST API
    > REST streaming is a separate pattern; none of the above tools are designed for it.

> Think about **where the data lives**, **how large it is**, and **what the runtime environment supports**.

## Classify each ML task as Regression or Classification!

The chips **Regression** and **Classification** are reusable: drag or click each to as many rows as needed.

> **Regression** predicts a *continuous* value (a number on a scale). **Classification** predicts a *category* (one of a finite set of classes).

- Predict house price :: Regression
    > The output is a dollar amount, a continuous value.
- Detect spam email :: Classification
    > The output is spam / not spam, a binary category.
- Forecast next month's sales revenue :: Regression
    > Revenue is continuous, even though it's rounded to cents.
- Identify the digit in a handwritten image :: Classification
    > The output is one of 10 discrete classes (0–9).
- Estimate a patient's remaining hospital stay in days :: Regression
    > Days is a continuous (or at least ordinal) quantity.
- Diagnose whether a tumour is malignant or benign :: Classification
    > A binary categorical outcome.
- :: Clustering
    > Clustering is **unsupervised**: no target label is predicted.

## Put the EDA steps in order!

> Exploratory Data Analysis follows a natural progression from raw inspection to insight; skipping early steps often leads to missed data quality issues.

1. Load and inspect the raw data (`df.head()`, `.info()`, `.dtypes`)
2. Check for missing values and duplicates
3. Compute summary statistics (`df.describe()`)
4. Visualize distributions of individual variables
5. Explore relationships between variables (correlations, scatter plots)
6. Identify and investigate outliers
7. Document findings and decide on data cleaning steps

# A Shiny app loads a 2 GB Parquet file at startup. It crashes on Posit Connect due to memory limits. What is the right fix?

> Think about *when* data actually enters memory, not just *where* the load call is placed.

1. [x] Switch to `ibis` + DuckDB lazy loading, which loads only the query result
    > Correct. With `ibis`, `.execute()` runs a SQL query and loads only the filtered rows. The full 2 GB file never enters RAM.
1. [ ] Move the `pd.read_parquet()` call inside a `@reactive.calc`
    > This defers the load to first use, but the full file still enters memory, so the crash will still happen.
1. [ ] Convert the Parquet file to CSV, which uses less memory
    > CSV is actually *larger* than Parquet on disk, and loading it still reads the full file into memory.
1. [ ] Increase the worker memory limit on Posit Connect
    > Scaling up is a temporary patch; the correct fix is to never load the full file in the first place.

## Which of the following are valid reasons to choose Parquet over CSV for storing tabular data? Select **all** that apply.

> Parquet is a binary columnar format; CSV is plain text row-by-row. Both store tabular data, but they make very different trade-offs.

- [x] Column-oriented storage enables faster column scans
    > ✓ Parquet reads only the columns requested; CSV must scan every row to extract a single column.
- [x] Data types are stored in the file, so there is no silent type coercion on read
    > ✓ CSV has no type information; pandas must infer types and can silently misread integers as floats.
- [x] Columnar compression makes file sizes much smaller
    > ✓ Similar values in a column compress well together; Parquet files are often 5–10× smaller than equivalent CSVs.
- [ ] Parquet files are human-readable in any text editor
    > ✗ Parquet is a binary format; you need a tool like DuckDB or pandas to inspect it.
- [ ] Parquet is natively supported by Excel
    > ✗ Excel cannot open Parquet files without a plugin.

Question examples

Matching and classification questions

1-to-1 matching

Each right-side chip belongs to exactly one left prompt. Chips are consumed from the pool when placed. Distractors (:: Value) are chips that don’t match any prompt.

```quizdown
## Match X to Y!

- Left prompt :: Right chip
    > Per-pair feedback shown after evaluation.
- Another prompt :: Another chip
    > Feedback for this pair.
- :: Distractor chip
    > Shown when this distractor is placed in a slot.

> Overall hint shown when the student clicks 💡.
```
---
shuffle_answers: true
---

## Match each data loading approach to its best use case!

- `pd.read_parquet()` at startup :: Dataset fits comfortably in memory
    > Fast and simple: load once, reuse throughout the session.
- `ibis` + DuckDB lazy query :: Multi-GB file: push filters before loading
    > Only the query result enters memory; the full file never loads.
- In-memory pre-sampled data :: Shinylive / WASM deployment
    > DuckDB file access silently fails in WASM, so embed a small pre-sampled dataset instead.
- :: Streaming from a REST API
    > None of these tools are designed for live API streaming.

> Think about **where the data lives**, **how large it is**, and **what the runtime supports**.

Multi-match / classification

When the same right-side label appears more than once, it becomes a reusable chip that stays in the pool after being placed — perfect for classify-into-bins tasks.

```quizdown
## Classify each item!

- Item A :: Category 1
- Item B :: Category 2
- Item C :: Category 1
- :: Distractor

> Hint: repeated right-side values become reusable chips.
```
---
shuffle_answers: true
---

## Classify each ML task as Regression or Classification!

> Regression → continuous target. Classification → categorical target.

- Predict house price :: Regression
    > The target (price) is a continuous number.
- Detect spam email :: Classification
    > The target is a category: spam or not spam.
- Forecast stock return :: Regression
    > Returns are continuous values, not categories.
- Identify handwritten digit :: Classification
    > Digits 0–9 are discrete categories.
- Estimate delivery time :: Regression
    > Delivery time is a continuous quantity.
- :: Clustering
    > Clustering is unsupervised; there is no target label to predict.
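The chip rules described for matching and classification questions can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this section (a right-side label used once is consumed when placed, a repeated label becomes reusable, and a `:: Value` line is a distractor). The function name `build_chip_pool` and the status strings are hypothetical and are not part of the extension or of quizdown-js.

```python
from collections import Counter


def build_chip_pool(lines):
    """Map each chip label to 'reusable', 'single-use', or 'distractor'.

    Hypothetical helper: it restates the documented chip rules and is
    not the extension's actual parser.
    """
    pairs = []
    for line in lines:
        # Only top-level "- Left :: Right" items count; headings, hints,
        # and indented feedback lines are skipped.
        if not line.startswith("- ") or "::" not in line:
            continue
        left, right = (part.strip() for part in line[2:].split("::", 1))
        pairs.append((left, right))

    # Count how often each right-side label is paired with a real prompt.
    counts = Counter(right for left, right in pairs if left)

    pool = {}
    for left, right in pairs:
        if not left:
            # An empty left side marks a distractor chip.
            pool.setdefault(right, "distractor")
        else:
            pool[right] = "reusable" if counts[right] > 1 else "single-use"
    return pool


question = [
    "## Classify each item!",
    "",
    "- Item A :: Category 1",
    "- Item B :: Category 2",
    "- Item C :: Category 1",
    "- :: Distractor",
]
print(build_chip_pool(question))
```

Here `Category 1` appears on two rows, so it is the only chip reported as reusable.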

Multiple choice and single choice

Single choice

Exactly one correct answer. Use a numbered list (1.) with [x] on the correct option.

```quizdown
# Question heading

> Optional hint shown when the student clicks 💡.

1. [x] Correct answer
    > Feedback shown after evaluation.
1. [ ] Wrong answer
    > Explain why this is wrong.
1. [ ] Another wrong answer
    > Explain why this is wrong.
```
---
shuffle_answers: true
---

# A Shiny app loads a 2 GB Parquet file at startup and crashes on Posit Connect. What is the right fix?

> Think about *when* data actually enters memory.

1. [x] Switch to `ibis` + DuckDB lazy loading, which loads only the query result
    > Correct. `.execute()` loads only the filtered rows. The full 2 GB file never enters RAM.
1. [ ] Move the `pd.read_parquet()` call inside a `@reactive.calc`
    > The full file still enters memory on first use, so the crash will still happen.
1. [ ] Convert the Parquet to CSV, which uses less memory
    > CSV is larger than Parquet on disk, and still reads fully into memory.
1. [ ] Increase the worker memory limit on Posit Connect
    > Scaling up is a patch; the right fix is to never load the full file.

Multiple choice

One or more correct answers. Use an unordered list (-) with [x] on all correct options.

```quizdown
## Question heading

> Optional hint.

- [x] Correct option
    > Feedback.
- [x] Also correct
    > Feedback.
- [ ] Wrong option
    > Feedback.
```
---
shuffle_answers: true
---

## Which of the following are valid reasons to choose Parquet over CSV?

> Parquet is a binary columnar format; CSV is plain-text row-by-row. Both store tabular data but make very different trade-offs.

- [x] Column-oriented storage enables faster column scans
    > ✓ Parquet reads only the columns requested; CSV must scan every row.
- [x] Data types are stored in the file, so there is no silent coercion on read
    > ✓ CSV has no type info; pandas must infer types and can silently misread values.
- [x] Columnar compression makes file sizes much smaller
    > ✓ Similar values in a column compress well, often 5–10× smaller than equivalent CSVs.
- [ ] Parquet files are human-readable in any text editor
    > ✗ Parquet is binary; you need DuckDB, pandas, or similar to inspect it.
- [ ] Parquet is natively supported by Excel
    > ✗ Excel cannot open Parquet files without a plugin.

Sequence

Students drag items into the correct order. Use a numbered list without checkboxes.

```quizdown
## Put these steps in order!

> Optional hint.

1. First step
2. Second step
3. Third step
4. Fourth step
```
---
shuffle_answers: true
---

## Put the EDA steps in order!

> Exploratory Data Analysis follows a natural progression; skipping early steps often leads to missed data quality issues.

1. Load and inspect the raw data (`df.head()`, `.info()`, `.dtypes`)
2. Check for missing values and duplicates
3. Compute summary statistics (`df.describe()`)
4. Visualize distributions of individual variables
5. Explore relationships between variables (correlations, scatter plots)
6. Identify and investigate outliers
7. Document findings and decide on data cleaning steps
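For a sequence question, the order in which the steps are written is the answer key. The sketch below shows one way that idea could be expressed in Python. It is an assumption-laden illustration: the names `answer_key` and `count_in_place` are hypothetical, and counting items in the right position is just one plausible scoring rule, not necessarily how the extension evaluates a sequence.

```python
import re

# A numbered item without a checkbox, e.g. "2. Second step".
STEP = re.compile(r"^\d+\.\s+(?!\[[ x]\])(.+)$")


def answer_key(lines):
    """Collect step texts in authored order; that order is the answer."""
    return [m.group(1).strip() for m in map(STEP.match, lines) if m]


def count_in_place(key, submitted):
    """Count the items the student placed at the authored position."""
    return sum(1 for want, got in zip(key, submitted) if want == got)


question = [
    "## Put these steps in order!",
    "",
    "> Optional hint.",
    "",
    "1. First step",
    "2. Second step",
    "3. Third step",
    "4. Fourth step",
]
key = answer_key(question)
attempt = ["First step", "Third step", "Second step", "Fourth step"]
print(count_in_place(key, attempt), "of", len(key), "in place")
```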

Embedding quizzes in your pages

Basic usage

Add the filter to your document front matter, then write a quiz inside a ```quizdown code block:

---
filters:
  - quizdown
---

```quizdown
---
shuffle_answers: true
primary_color: "#4b75a8"
---

## Question heading

Answer options here.
```

Quiz inside a “Check your understanding” callout

You can embed a quiz directly inside a Quarto callout — students expand it when they’re ready:

::: {.callout-note collapse="true" title="✏️ Check your understanding"}

```quizdown
## Which of these is a supervised learning task?

1. [ ] Clustering customers by behaviour
1. [x] Predicting whether a loan will default
    > Correct — the model is trained on historical loan outcomes (labels).
1. [ ] Dimensionality reduction with PCA
```

:::

Live example:

---
shuffle_answers: true
---

## Which of these is a supervised learning task?

> Supervised learning requires labelled training data: examples where the correct output is already known.

1. [ ] Clustering customers by purchase behaviour
    > No, clustering is unsupervised. There is no target label.
1. [x] Predicting whether a loan will default
    > Yes, the model is trained on historical loan outcomes (default / no default) as labels.
1. [ ] Reducing 100 features to 2 dimensions with PCA
    > No, PCA is unsupervised dimensionality reduction.
1. [ ] Grouping news articles by topic with LDA
    > No, LDA topic modelling is unsupervised.

Quiz configuration options

Pass YAML front matter inside the quiz block to configure it:

---
primary_color: "#4b75a8"   # accent colour (hex or CSS name)
secondary_color: lightgray  # chip / background colour
text_color: black
shuffle_questions: false    # randomise question order
shuffle_answers: true       # randomise answer/chip order
---
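To make the shape of a configured quiz block concrete, here is a small Python sketch that separates the front matter from the questions. It is a simplified stand-in written for this page, not the extension's parser: `split_front_matter` and `_parse_value` are hypothetical names, and the code only handles flat `key: value` lines like the ones listed above (no nesting, no lists).

```python
def _parse_value(raw):
    """Turn the text after 'key:' into a str or bool (simplified YAML subset)."""
    raw = raw.strip()
    if raw[:1] in ('"', "'"):
        # Quoted string: keep everything up to the matching closing quote,
        # so a "#" inside the quotes is not mistaken for a comment.
        return raw[1:raw.index(raw[0], 1)]
    raw = raw.split(" #", 1)[0].strip()  # drop a trailing comment
    if raw in ("true", "false"):
        return raw == "true"
    return raw


def split_front_matter(block):
    """Split a quiz block into (options dict, question body)."""
    lines = block.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, block.strip()  # no front matter at all
    closing = [i for i in range(1, len(lines)) if lines[i].strip() == "---"]
    if not closing:
        raise ValueError("front matter is missing its closing ---")
    end = closing[0]
    options = {}
    for line in lines[1:end]:
        if ":" in line:
            key, raw = line.split(":", 1)
            options[key.strip()] = _parse_value(raw)
    return options, "\n".join(lines[end + 1:]).strip()


block = """---
primary_color: "#4b75a8"   # accent colour (hex or CSS name)
secondary_color: lightgray  # chip / background colour
shuffle_questions: false    # randomise question order
shuffle_answers: true       # randomise answer/chip order
---

## Question heading
"""
options, body = split_front_matter(block)
print(options)
print(body)
```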

How it works — the syntax

A single ```quizdown block can hold multiple questions. Question type is inferred from the list syntax.

| Question type | List syntax | Correct marker |
|---|---|---|
| Single choice | Ordered list `1.` | `[x]` on one item |
| Multiple choice | Unordered list `-` | `[x]` on one or more items |
| Sequence | Ordered list `1.` | No `[x]`; order is the answer |
| Matching / classification | Unordered list `-` | `Left :: Right` pairs |
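The inference rule in the syntax table can be written out as a short Python sketch. This is a hypothetical re-statement of the table for illustration, not the code the extension or quizdown-js actually runs; the function name `infer_question_type` and the returned labels are made up for this example.

```python
import re

# A top-level list item: "1. text" (ordered) or "- text" (unordered).
ITEM = re.compile(r"^(?P<marker>\d+\.|-)\s+(?P<rest>.*)$")
# A checkbox at the start of the item text: "[x] " or "[ ] ".
CHECKBOX = re.compile(r"^\[( |x)\]\s")


def infer_question_type(lines):
    """Guess the question type from the list syntax of one question."""
    items = []
    for line in lines:
        # Indented lines are per-item feedback; "#" and ">" lines are
        # headings and hints. None of them decide the type.
        if not line.strip() or line.startswith((" ", "\t", "#", ">")):
            continue
        m = ITEM.match(line)
        if m:
            items.append((m["marker"], m["rest"]))
    if not items:
        return "unknown"

    ordered = all(marker != "-" for marker, _ in items)
    has_checkbox = any(CHECKBOX.match(rest) for _, rest in items)
    has_pairs = any("::" in rest for _, rest in items)

    if ordered:
        return "single choice" if has_checkbox else "sequence"
    if has_checkbox:
        return "multiple choice"
    if has_pairs:
        return "matching"
    return "unknown"


print(infer_question_type(["# Q", "1. [x] Right", "1. [ ] Wrong"]))
print(infer_question_type(["## Q", "- Left :: Right", "- :: Distractor"]))
```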

Feedback and hints

## Question heading

Optional overall hint as a blockquote.

> Hint text shown when the student clicks the lightbulb 💡.

- [x] Correct option
    > Per-option feedback — shown after evaluation.
- [ ] Wrong option
    > Explain why this is wrong.

For matching questions:

- Left prompt :: Right chip
    > Per-pair feedback shown after evaluation.
- :: Distractor chip
    > Shown when this distractor is placed in a slot.

Attributions