The UBC MDS Capstone Seminar Series is a collection of invited talks held for MDS-V and MDS-CL students, faculty and staff during the DSCI 591 Capstone Project course. Talks will be held from 2-3pm in DMP 310.
|Date|Speaker|Title|
|---|---|---|
|2022/05/13|Scott Mackie, Amazon|Building language models for voice assistants|
|2022/05/20|Zaid Haddad, Slalom|Slalom’s MLOps Framework|
|2022/05/27|Carrie Cheung, NuData Security, A Mastercard Company|Life of a Data Scientist: Beyond the Classroom|
|2022/06/03|Adina Williams, Meta|TBD|
|2022/06/10|Christo Kirov, Google|Low-Resource Multilingual NLP at Google|
|2022/06/17||Capstone Presentations (no seminar series talk this week)|
|2022/06/24|Maysam Emadi, Microsoft|TBD|
Dr. Scott Mackie is a Senior Language Engineer at Amazon Alexa, where he has worked since 2018. Scott graduated with a PhD from UBC Linguistics in 2016.
Zaid is a Data Scientist Leader at Slalom. Prior to Slalom, Zaid was a data scientist at Aurora Cannabis, where he developed several end-to-end machine learning models and enterprise data science solutions. He has also designed, developed, and led various machine learning projects in personalized healthcare for start-ups and regulated environments. He has supported several organizations in developing and implementing data science roadmaps from conception to deployment, and has advised local start-ups on machine learning and AI strategy.
Carrie graduated from the UBC MDS program in 2019. Prior to MDS, she double majored in Computer Science and Psychology at UBC, an admittedly odd combination that hinted at an affinity for opportunities which blend the technical with the non-technical. One of these opportunities led her to data science and the rest is history.
I’m currently a research software engineer at Google Research, working out of the NYC office. I’m interested in low-resource (in terms of both data and computational power) NLP in multilingual settings. I come from a broad academic background: my undergraduate degree is in Computer Science and Linguistics, and I received my PhD in Cognitive Science from JHU. Afterwards, I worked as a postdoc teaching NLP at the Georgetown Linguistics Department. Before moving to Google, I spent several years as a postdoc at the JHU Center for Language and Speech Processing, working under a grant program called LORELEI (Low Resource Languages for Emergent Incidents). A major focus was the development of resources for morphological processing, particularly UniMorph (https://unimorph.github.io/).
Slalom’s MLOps Framework
by Zaid Haddad
This worked on my workstation! MLOps is a framework that aims to deploy and maintain machine learning models in production reliably and efficiently. Often, models are developed in isolated systems. MLOps frameworks help move models from ad-hoc solutions and isolated systems into a managed ML lifecycle. This talk will walk you through Slalom’s MLOps framework.
Life of a Data Scientist: Beyond the Classroom
by Carrie Cheung
Capstone can be both an exciting and scary time for MDS students as they approach the end of the program and the prospect of working professionally in the industry. As a graduate of the MDS program, Carrie understands this feeling and will share with students her journey from MDS to working as a Data Scientist at NuData Security.
NuData Security, a Mastercard company, helps businesses combat online fraud every day through a multi-layered approach leveraging behavioral analytics, behavioral biometrics, and device intelligence. Carrie will talk about NuData’s solutions for fighting fraud, her work as a Data Scientist at NuData, and what a typical workday looks like for her. She will also touch upon areas where student expectations often differ significantly from the reality of industry, and share tips for how students can make the transition from school to industry just a little bit easier.
Low-Resource Multilingual NLP at Google
by Christo Kirov, Google
As more and more users around the world access the Internet, especially on their smartphones, it becomes imperative to enable seamless interaction in their native language (see nextbillionusers.google). Unfortunately, support still lags behind for many languages other than English. I’ll discuss some of the challenges of building NLP models for these smaller languages for tasks such as language identification, translation, transliteration, and language modeling. In many cases, the development of new datasets (including open source efforts like UniMorph, Universal Dependencies, and Google’s own Dakshina Dataset) plays a critical role in making progress.