What you learned, when to use it
When I need to build a web app from Python without writing HTML/JS,
I use Shiny and its reactivity model (input → reactive graph → output),
so I can create interactive dashboards entirely in Python.
Key points:
- `ui` defines the layout
- `server` wires inputs to outputs via the reactive graph

When NOT to use:
Static report or chart that doesn’t need user interaction → Quarto/Altair alone is simpler.
Starting blocks
app-05-core-altair-first_app.py — complete first app: radio input + Altair output

When I’m designing a dashboard and don’t know where to start,
I want a set of layout and communication principles,
so I can make intentional choices about what to show and where.
Key points:
When NOT to use:
Exploratory analysis for yourself → just use a notebook. Dashboards are for communicating to an audience.
Starting blocks
app.py — complete restaurant tipping dashboard: sidebar + cards + value boxes
References
When my app works but looks rough, or I need users to download results,
I use Bootstrap utilities, Shiny themes, value boxes, and CSV export,
so I can ship a polished, usable dashboard.
Key points:
- `shinyswatch.theme.*` for global themes
- `ui.value_box()` for KPI cards
- `@render.download` + `io.StringIO` for CSV export
- `ui.update_*` + `req()` for linked filters that reset gracefully

When NOT to use:
Deep custom branding needs → inline CSS and custom Bootstrap SCSS is more flexible but much more work.
Starting blocks
app-04-full-dashboard.py — full styled dashboard with theme + value boxes
app-07-export-csv.py — CSV download button
app-08-reset-selections.py — reset all filters to defaults

When I first add interactivity to a Python web app,
I use Shiny’s reactive graph (inputs → reactive contexts → outputs),
so I can write plain Python functions and let Shiny handle re-execution automatically.
Key points:
- `input.x()` inside a reactive context registers a dependency on that input
- `@render.*` functions are reactive endpoints — they re-run only when their upstream inputs change

When NOT to use: The reactive graph underlies every Shiny app — it’s not a choice. Understanding it is essential for debugging why outputs don’t update (missing reactive context) or update too often (shared mutable state outside reactive scope).
Starting blocks
app-05-core-altair-first_app.py — minimal reactive graph: one input → one render
References
@reactive.calc — Shared Computation
When I have an expensive filter or aggregation used by multiple outputs,
I use @reactive.calc to cache the result,
so I can avoid re-running the same computation for every output that depends on it.
Key points:
- `@reactive.calc` caches its result and recomputes only when its inputs change; without it, each `@render.*` reruns the full pipeline independently

When NOT to use:
Logic used by only one output → just compute inline inside @render.*. @reactive.calc adds indirection without benefit for single consumers.
Starting blocks
app-05-reactive_calc_reuse.py — @reactive.calc shared across multiple outputs

When I need computation or a write to fire only on user action,
I use @reactive.event, @reactive.effect, and req(),
so I can control exactly when side effects run and guard against partial input.
Key points:
- `@reactive.event(input.btn)` and `ui.input_action_button("btn", "Submit")`: defers any render, calc, or effect to a button click
- `@reactive.effect`: runs a side effect (e.g. write to DB) when dependencies change — returns nothing
- `req(input.x)`: silently stops execution if input is None or empty — guards against partial-input errors

When NOT to use: @reactive.event when the UI should stay live — it freezes updates until the button fires. @reactive.effect for computation that returns a value — use @reactive.calc instead.
Starting blocks
app-03.py — @reactive.event(input.submit) gates a MongoDB write behind a button
.data_view()
When I want users to sort, filter, or select rows and have the rest of the app react,
I use DataGrid with cell_selection + .data_view(),
so I can drive charts and summaries from whatever the user is looking at.
Key points:
- `render.DataGrid(df, selection_mode="rows")` enables row selection
- `.data_view(selected=True)` returns selected rows
- `.data_view()` returns filtered rows

When NOT to use:
Read-only display where users just scroll → plain DataTable is simpler and styled out of the box. No need for .data_view() if nothing reacts to the table.
Starting blocks
app-04-table-linked.py — DataGrid row selection drives linked Altair chart

When users need to export the current view of the data,
I use @render.download with ui.download_button,
so I can give users a file without blocking the UI or writing anything to disk.
Key points:
- `@render.download` wraps a function that yields file bytes — Shiny serves it as a download link
- `tbl.data_view()` inside to export only the currently filtered rows
- `io.StringIO` + `.getvalue().encode()` converts a DataFrame to CSV bytes in memory

When NOT to use:
Very large files → result is buffered in memory before sending.
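The in-memory conversion can be sketched with plain pandas; inside a Shiny app, the resulting bytes would be yielded from a function decorated with `@render.download(filename="data.csv")`:

```python
import io
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Build the CSV entirely in memory: StringIO collects the text,
# .getvalue().encode() turns it into the bytes a download handler yields.
with io.StringIO() as buf:
    df.to_csv(buf, index=False)
    csv_bytes = buf.getvalue().encode()
```

Note that the whole file sits in memory before it is sent, which is why this pattern suits small-to-medium exports only.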
Starting blocks
app-04-table-linked.py — CSV download from DataGrid selection
References
When I need to save user input data or logs from my app across sessions,
I use an external database (MongoDB / Postgres / Airtable / Google Sheets) and @reactive.effect + @reactive.event(input.save) to write on button click,
so I can record submissions without blocking the UI or triggering on every keystroke.
Choosing a provider:
- Simplest setup: Google Sheets (`gspread`) or Airtable (`pyairtable`)
- Keep connection credentials in `.env`, never in code

When NOT to use: Multi-step transactional writes → wrap in a proper DB transaction. Read-only dashboards → no persistence needed.
Starting blocks
MongoDB quick pattern:
`MongoClient(uri)["db"]["col"].insert_one(doc)` / `.find()`
| Provider | Free tier | Client |
|---|---|---|
| MongoDB Atlas | M0, 512 MB | pymongo |
| Neon / Supabase | Postgres, varies | psycopg2 |
| Airtable | 1 000 rows | pyairtable |
| Google Sheets | Unlimited | gspread |
When my dataset is too large to load into memory at app startup,
I use lazy evaluation with ibis + DuckDB over Parquet files,
so filters run as queries against the file and only the result is loaded.
Key points:
- `ibis.duckdb.connect()` → `con.read_parquet(path)` → build ibis expressions → `.execute()` only when rendering
- everything before `.execute()` is lazy — no data is loaded until then

When NOT to use:
Dataset fits in memory easily (<100MB) → pd.read_parquet() at startup is simpler. Shinylive (WASM) → DuckDB file access silently fails; use in-memory sample instead.
Starting blocks
app-02b-taxi-parquet.py — full lazy filtering over NYC taxi Parquet with ibis + DuckDB

When I want to integrate an LLM into my app and need to understand costs and limits,
I need to be aware of tokens, context windows, and the HTTP request cycle,
so I can choose the right architecture, model and provider to avoid nasty surprises in production.
Key points:
- `chatlas` (Python) or `ellmer` (R) to abstract the API

When NOT to use:
If your “AI feature” is just keyword matching or a lookup table → no need for an LLM at all.
Starting blocks
03-first-chat.ipynb — basic Chatlas conversation
05-switch-provider.ipynb — swapping between GitHub Models / Anthropic / OpenAI

When I embed a chat interface in my dashboard using QueryChat,
I use a custom system prompt and welcome message,
so the LLM stays on topic and users know what to ask.
Key points:
- `system_prompt=` to QueryChat to constrain behavior — sent with every request
- `greeting=` for the opening message — rendered locally, not generated by the LLM
- `data_description=` to inject schema context the LLM needs
- `QueryChat(df, "name", system_prompt="You are...", greeting="Hi! Ask me about...")`
When NOT to use:
If the user should have full open-ended access to any topic → a minimal or no system prompt is better.
Starting blocks
app-07e-querychat-instructions.py — system_prompt + data_description
app-07f-two-tabs.py — full two-tab dashboard with querychat
querychat-explore.ipynb — greeting + extra instructions walkthrough

When I want the LLM to fetch live data or call functions during a conversation,
I use tool calling (or MCP for multi-tool setups),
so the LLM can answer questions it couldn’t from training data alone.
Key points:
- `chat.register_tool(fn)` (chatlas)
- Tool calling vs MCP: use tool calling for 1-3 custom functions in your app; use MCP when connecting to an existing ecosystem (GitHub, databases, file systems).
When NOT to use:
Static data the LLM can receive once in the system prompt → just inject it, no tool needed.
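A sketch of a registrable tool: the function body is a stub, and the chatlas calls are shown commented because they need a live API key (the provider choice is an assumption):

```python
def get_weather(city: str) -> str:
    """Return the current weather for a city.

    Stubbed here; a real tool would call a weather API. The function name,
    type hints, and docstring become the tool schema the LLM sees.
    """
    fake = {"Vancouver": "12°C, rain"}
    return fake.get(city, "unknown")

# With chatlas, registering the tool lets the model call it mid-conversation:
# from chatlas import ChatOpenAI          # provider choice is an assumption
# chat = ChatOpenAI()
# chat.register_tool(get_weather)
# chat.chat("What's the weather in Vancouver?")
```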
Starting blocks
chatlas-weather.py — register + call a weather tool
app-weather-core.py — tool calling wired into a Shiny app
querychat-explore.ipynb — querychat tool loop walkthrough

When I need to extract structured fields from unstructured text (documents, reviews, emails),
I use chat.extract_data() with a Pydantic schema,
so I can get typed, validated JSON instead of prose.
Key points:
- `BaseModel` with field descriptions → pass to `chat.extract_data(text, SentimentResult)`

When NOT to use:
Data that’s already structured (CSV, SQL) → no need. Simple yes/no classification with fixed categories → a rule-based approach is cheaper and more reliable.
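A sketch of the schema side; the `extract_data` call is shown commented since it needs a live chat client, and the field names are invented:

```python
from pydantic import BaseModel, Field

class Review(BaseModel):
    # Field descriptions guide the LLM toward the right value for each slot
    product: str = Field(description="Product name mentioned in the review")
    rating: int = Field(description="Star rating from 1 to 5")
    summary: str = Field(description="One-sentence summary of the review")

# With a chatlas chat client, the schema is passed straight in and the
# result comes back typed and validated:
# data = chat.extract_data(review_text, data_model=Review)
```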
Starting blocks
01-simple.py — extract typed fields from text with Pydantic
02-image.py — multimodal: extract structure from an image

When my LLM doesn’t know domain-specific terms or dataset conventions,
I inject relevant context chunks per query,
so the LLM answers with correct domain knowledge without fine-tuning.
Key points:
- Small knowledge bases: plain `.txt` files work; for larger KBs use `llama_index` with `VectorStoreIndex` or ChromaDB
- Prompt shape: `[context] + [user question]`

When NOT to use:
General knowledge questions the LLM already knows well → RAG adds latency for no gain. More than ~50 chunks → consider a proper vector DB (Chroma, Pinecone).
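A minimal per-query injection sketch using scikit-learn's TF-IDF (the two KB chunks are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny knowledge base: one chunk per domain fact
chunks = [
    "The 'mass' column is body mass in grams.",
    "Penguin species include Adelie, Chinstrap, and Gentoo.",
]

vec = TfidfVectorizer()
chunk_matrix = vec.fit_transform(chunks)

def retrieve(question: str) -> str:
    """Return the KB chunk most similar to the question."""
    sims = cosine_similarity(vec.transform([question]), chunk_matrix)[0]
    return chunks[sims.argmax()]

# Per-query injection: [context] + [user question]
question = "What unit is mass measured in?"
context = retrieve(question)
prompt = f"{context}\n\n{question}"
```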
Starting blocks
rag_demo.ipynb — full walkthrough: KB → TF-IDF retrieval → per-query injection

When I want to record what users ask the LLM in my Shiny app,
I use the on_tool_request hook on the QueryChat client,
so I can log queries to a database without blocking the conversation flow.
Key points:
- `client.on_tool_request(fn)` fires once per query (not per tool call)
- use `@reactive.effect` only for Shiny-reactive side effects

When NOT to use:
You only need to log for debugging → print to console instead. You need full conversation history → store the entire message thread, not just the query.
Starting blocks
app-04.py — on_tool_request hook logs queries to MongoDB Atlas

When I want to know if my LLM prompt changes are actually improvements,
I use a repeatable eval suite that scores model responses,
so I can iterate on prompts with evidence instead of vibes.
Key points:
- `inspect_ai` (or similar) to run the same queries across prompt versions and score with an LLM judge or exact match

When NOT to use:
Prototyping — manual spot-checking is faster. Production with unpredictable input distributions → evals on fixed test sets won’t catch all failure modes; combine with logging.
Starting blocks
evals.py — inspect_ai task → solver → scorer pipeline
References
When I need to show geographic patterns in my data,
I use choropleth maps or point maps linked to charts,
so I can let users explore regional variation interactively.
Key points:
- `mark_geoshape()` for choropleths
- `mark_circle()` on a map for point data

When NOT to use:
Data without a clear geographic dimension → a standard bar/scatter is less noisy. Fine-grained street-level routing → use a dedicated mapping library (Folium, Leaflet).
Starting blocks
app-04-map-and-chart.py — choropleth map linked to bar chart
app-05-map-click.py — Altair-native click selection on map

| When I… | I use… | So I can… | Resources |
|---|---|---|---|
| Reactivity | | | |
| have a costly filter used by multiple outputs | @reactive.calc | reuse one computation across dependent outputs, reducing redundancy | app · L03 L04 · docs |
| need a side effect to run only when specific inputs change | @reactive.event | avoid unintended side effects from reactive cascades | L03 · docs |
| need to trigger a reset or action on button click | @reactive.event(input.btn) + @reactive.effect | run DB writes or UI resets exactly when the user asks | L03 · docs |
| Data I/O | | | |
| want my app to react to users sorting/filtering/selecting rows | DataGrid + .data_view() | build logic on what the user sees | app · L06a · docs |
| need to give users a data file (csv/…) | @render.download + download_button | let users take some results out of the app | L06a · docs |
| need to save user input or logs across sessions | external DB + @reactive.effect (with @reactive.event) | store data in a way that survives multiple sessions and server restarts | app-03 app-04 · L09 · docs |
| have a dataset too large to load at startup | Parquet + DuckDB [+ ibis] | keep the app fast without loading everything into memory | app · L09 · ibis DuckDB |
| LLM & AI | | | |
| want to embed a chat interface in my dashboard | QueryChat | let users explore data conversationally, in context | app-07e app-07f · L07d · querychat |
| need to constrain what the LLM answers about | system_prompt + data description | focus the assistant on your data and prevent off-topic responses | L07d · querychat |
| want the LLM to use instruments and tools | chat.register_tool() (chatlas) | extend the LLM with external functionality and data access | L08a · chatlas |
| need structured fields from unstructured text | Pydantic + structured output | extract structured fields to use in code or output in a particular format | L08b · pydantic |
| need the LLM to answer questions about my specific domain | RAG (per-query injection) | get correct answers without fine-tuning | rag_demo · L08c · LlamaIndex |
| want to log what users ask the LLM | on_tool_request hook + external DB | audit trail without blocking the flow | app-04 · L09 · querychat |
| want to measure if prompt changes improve output | inspect_ai evals | compare models and iterate with evidence | L08e · inspect.ai |
| Geospatial | | | |
| need to show geographic patterns | Altair choropleth / point map | explore regional variation interactively | app-04 app-05 · L05a · Altair maps |
DSCI 532: Data Visualization 2 https://github.com/UBC-MDS/DSCI_532_vis-2_book