UBC MDS Capstone Seminar Series 2022

The UBC MDS Capstone Seminar Series is a collection of invited talks held for MDS-V and MDS-CL students, faculty and staff during the DSCI 591 Capstone Project course. Talks will be held from 2-3pm in DMP 310.

Schedule

Date	Speaker	Seminar Title
2022/05/13	Scott Mackie, Amazon	Building language models for voice assistants
2022/05/20	Zaid Haddad, Slalom	Slalom’s MLOps Framework
2022/05/27	Carrie Cheung, NuData Security, A Mastercard Company	Life of a Data Scientist: Beyond the Classroom
2022/06/03	Adina Williams, Meta	TBD
2022/06/10	Christo Kirov, Google	Low-Resource Multilingual NLP at Google
2022/06/17	Capstone Presentations (no seminar series talk this week)
2022/06/24	Maysam Emadi, Microsoft	Recommender Systems in Industry

Speaker Bios

Scott Mackie

Dr. Scott Mackie is a Senior Language Engineer at Amazon Alexa, where he has worked since 2018. Scott graduated with a PhD from UBC Linguistics in 2016.

Zaid Haddad

Zaid is a Data Scientist Leader at Slalom. Prior to Slalom, Zaid was a data scientist at Aurora Cannabis where he developed several end-to-end machine learning models & enterprise data science solutions. Also, he designed, developed & led various machine learning projects in personalized healthcare for start-ups and regulated environments. He has supported several organizations in the development and implementation of data science road maps from conception to deployment & advised local start-ups on machine learning & AI strategy.

Carrie Cheung

Carrie graduated from the UBC MDS program in 2019. Prior to MDS, she double majored in Computer Science and Psychology at UBC, an admittedly odd combination that hinted at an affinity for opportunities which blend the technical with the non-technical. One of these opportunities led her to data science and the rest is history.

Christo Kirov

I’m currently a research software engineer at Google Research, working out of the NYC office. I’m interested in low-resource (both in terms of data and computational power) NLP in multilingual settings. I’m coming from a broad academic background: my undergraduate degree is in Computer Science and Linguistics, and I received my PhD in Cognitive Science from JHU. Afterwards, I worked as a PostDoc teaching NLP at the Georgetown Linguistics Department. Before moving to Google, I spent several years as a PostDoc at the JHU Center for Language and Speech Processing, working under a grant program called LORELEI (Low Resource Languages for Emergent Incidents). A major focus was the development of resources for morphological processing, particularly UniMorph (https://unimorph.github.io/).

Maysam Emadi

Maysam Emadi is a senior data & applied scientist within the News & Feeds team at Microsoft Vancouver. Before starting his career in data science, he obtained his PhD in theoretical particle physics from Simon Fraser University. Maysam’s experience in building recommender systems goes back to his time at Plenty of Fish where he worked on matching algorithms for the popular online dating platform.

Abstracts

Slalom’s MLOps Framework

by Zaid Haddad

This worked on my workstation! MLOps is a framework that aims to deploy & maintain machine learning models in production reliably & efficiently. Often, models are developed in isolated systems. MLOps frameworks help transition models from ad-hoc solutions & isolated systems to ML lifecycle management. This talk will walk you through Slalom’s MLOps framework.

Life of a Data Scientist: Beyond the Classroom

by Carrie Cheung

Capstone can be both an exciting and scary time for MDS students as they approach the end of the program and the prospect of working professionally in the industry. As a graduate of the MDS program, Carrie understands this feeling and will share with students her journey from MDS to working as a Data Scientist at NuData Security.

NuData Security, a Mastercard company, helps businesses combat online fraud every day through a multi-layered approach leveraging behavioral analytics, behavioral biometrics, and device intelligence. Carrie will talk about NuData’s solutions for fighting fraud, her work as a Data Scientist at NuData and what a typical workday for her looks like. She will also touch upon areas where student expectations can often differ significantly from reality in industry, and tips for how students can make the transition from school to industry just a little bit easier.

Low-Resource Multilingual NLP at Google

by Christo Kirov, Google

As more and more users around the world access the Internet, especially on their smartphones, it becomes imperative to enable seamless interaction in their native language (see nextbillionusers.google). Unfortunately, support still lags behind for many languages outside of English. I’ll discuss some of the challenges of building NLP models for these smaller languages for tasks such as language identification, translation, transliteration, and language modeling. In many cases, the development of new datasets (including open source efforts like UniMorph, Universal Dependencies, and Google’s own Dakshina Dataset) plays a critical role in making progress.

Recommender Systems in Industry

by Maysam Emadi

Recommender Systems have become increasingly ubiquitous in recent decades with billions of people interacting with them daily. This talk begins with a brief introduction to RecSys and some of the commonly used methods for building them. The focus will be on aspects that are different from the standard ML problems you are familiar with, particularly on nuances of evaluating RecSys. We then highlight some of the practical challenges of going beyond recommender models and towards deploying large scale recommender systems such as Microsoft Start. We finish with an example of using deep learning to build user representations for content recommendation.