University of Minnesota
Software Engineering Center

Provenance: What's Happening in your Production Data and ML Systems?

Date of Presentation: Thursday, January 16, 2020
Presented By: 
Description: 
Data is often your company's most valuable asset, yet very few implementations of ETL and machine learning capabilities provide a way to measure their effectiveness (quality) or their performance. Data and machine learning pipelines are built as multi-step software integrations, but when an issue arises, how will you determine what happened? Machine learning models degrade over time, and without the ability to observe them, your models could become ineffective long before anyone notices. In this talk, you will learn strategies for building visibility into data systems using data and ML provenance. Provenance is the concept of tracking the evolution of your data and models as data moves through your system. As a side effect, you will also gain the measurements that typical software systems rely on, such as latency, throughput, load, and error rates, all without having to sift through dozens of logs from different systems in your technology stack.
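
As a rough illustration of the idea described above, the following Python sketch records one provenance event per pipeline step and derives an error rate and a mean latency from those events. It is not taken from the talk: the names ProvenanceEvent, ProvenanceLog, run_pipeline, and fingerprint, along with the two example steps, are hypothetical and chosen only for this example. It assumes each record is JSON-serializable so that it can be hashed.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


def fingerprint(value: Any) -> str:
    """Short, stable hash of a JSON-serializable value (hypothetical helper)."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class ProvenanceEvent:
    record_id: str
    step: str
    input_hash: str
    output_hash: Optional[str]  # None when the step raised an exception
    duration_s: float
    error: Optional[str] = None


@dataclass
class ProvenanceLog:
    events: List[ProvenanceEvent] = field(default_factory=list)

    def run_step(self, record_id: str, step: str,
                 fn: Callable[[Any], Any], value: Any) -> Any:
        """Run one step and append a provenance event describing it."""
        start = time.perf_counter()
        result, error = None, None
        try:
            result = fn(value)
        except Exception as exc:  # the failure is recorded, not re-raised
            error = f"{type(exc).__name__}: {exc}"
        self.events.append(ProvenanceEvent(
            record_id=record_id,
            step=step,
            input_hash=fingerprint(value),
            output_hash=None if error else fingerprint(result),
            duration_s=time.perf_counter() - start,
            error=error,
        ))
        return result

    def lineage(self, record_id: str) -> List[ProvenanceEvent]:
        """All events for one record, in the order the steps ran."""
        return [e for e in self.events if e.record_id == record_id]

    def error_rate(self, step: str) -> float:
        events = [e for e in self.events if e.step == step]
        if not events:
            return 0.0
        return sum(e.error is not None for e in events) / len(events)

    def mean_latency_s(self, step: str) -> float:
        events = [e for e in self.events if e.step == step]
        if not events:
            return 0.0
        return sum(e.duration_s for e in events) / len(events)


def run_pipeline(log: ProvenanceLog, record_id: str, value: Any,
                 steps: List[Tuple[str, Callable[[Any], Any]]]) -> Any:
    """Push one record through every step, stopping at the first failure."""
    for name, fn in steps:
        value = log.run_step(record_id, name, fn, value)
        if log.events[-1].error is not None:
            return None
    return value


# Two illustrative steps: parse a raw record, then convert to cents.
steps = [
    ("parse", lambda raw: {"amount": float(raw["amount"])}),
    ("scale", lambda row: {"amount": row["amount"] * 100}),
]

log = ProvenanceLog()
run_pipeline(log, "r1", {"amount": "1.5"}, steps)   # succeeds in both steps
run_pipeline(log, "r2", {"amount": "oops"}, steps)  # fails in "parse"

for event in log.lineage("r1"):
    print(event.step, event.input_hash, "->", event.output_hash)
print("parse error rate:", log.error_rate("parse"))
```

Because each event stores both an input hash and an output hash, the output of one step can be matched to the input of the next, which is one simple way to reconstruct what happened to a record without reading logs from each system separately. A production design would likely persist events to a durable store and include model and code versions, which this sketch leaves out.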