Ensuring Reliable Evaluation Systems for Your ML
Meritage Ballroom
Demetrios Brinkmann | Founder of MLOps Community
Wed 10:20AM - 11:00AM, September 10th
Standard benchmarks often fall short and can be misleading, and leaderboards can erode trust in model claims because they rarely address specific, real-world needs. In this talk, Demetrios Brinkmann will detail how MLOps engineers and developers can build and continuously update their own evaluation systems to create a strong competitive advantage. He’ll cover how to build a reliable “golden dataset,” optimize data collection and labeling, and choose the right tools to ensure evaluations truly reflect their intended use case.