How can we trust systems built from machine learning components? We need advances in many areas, including machine learning algorithms, software engineering, MLOps, and explanation. This talk describes recent work in two important directions: obtaining calibrated performance estimates and performing run-time monitoring with guarantees.
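The "calibrated performance estimates" thread of the talk builds on conformal prediction (see the "Conformal Guarantees" and "Conformalized Quantile Regression" slides). Here is a minimal split-conformal sketch; the toy data, the least-squares model, and the miscoverage level alpha = 0.1 are illustrative assumptions, not the speaker's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + Gaussian noise
x = rng.uniform(0, 10, size=400)
y = 2 * x + rng.normal(0, 1, size=400)

# Split into a model-fitting set and a calibration set
x_fit, y_fit = x[:200], y[:200]
x_cal, y_cal = x[200:], y[200:]

# "Model": least-squares slope through the origin, fit on the fitting set
slope = np.sum(x_fit * y_fit) / np.sum(x_fit ** 2)

def predict(x_new):
    return slope * x_new

# Conformity scores: absolute residuals on the held-out calibration set
scores = np.abs(y_cal - predict(x_cal))

# Quantile chosen so the interval covers new exchangeable points
# with probability at least 1 - alpha
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Calibrated prediction interval for a new input
x_new = 5.0
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The coverage guarantee is distribution-free but assumes the calibration and test points are exchangeable, which is exactly the assumption the Q&A probes when asking what happens if the data-generating process changes.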
SHOWNOTES:
00:00 Intro
03:30 Outline
05:10 High Reliability Human Organizations
10:15 Designing AI Systems to be HROs
15:00 Part 1: Competence Modeling: Prospective MDP Performance Guarantees
17:00 Summary of the Approach
24:00 Conformal Guarantees
28:00 Conformal Guarantees in h dimensions: Compute "exceedances" for each
29:20 Conformalized Quantile Regression: Scaled SD Trajectory
33:00 Problem 1: Tamarisk Invasions in River Networks
37:09 Example Prospective Intervals and Actual Trajectories
39:00 Tamarisk prediction interval coverage
41:08 MDP 2: Starcraft Battles
42:10 Starcraft prediction interval coverage
45:40 Part 2: Runtime Open Category Detection
47:30 Method: Reject Aliens Using Anomaly Detection
48:50 How to set τ without labeled data?
49:58 Idea: Use Unlabeled Data that Contains Novel Class Examples
51:03 CDFs of Nominal, Mixture, and Alien Anomaly Scores
53:20 Estimating the mixing proportion
53:55 Q3: How good are Recall and FPR in practice? UCI Datasets
55:10 Concluding Remarks
56:10 Acknowledgments
56:45 Start of the Q&A
57:00 Question: CQR - what are the limits of CQR when the data-generating process changes?
59:00 Question: QR method - for a new, unseen data point, can we use the same random forest?
1:00:00 Q: What is missing? What do you want, in order to give this an A instead of a D?
1:02:50 Q: Is it possible to classify the unknown?
1:05:40 Q: Do we need to move to a more generative model?
1:06:45 Q: With multitask learning, will it learn more meaningful representations?
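The open-category detection part of the talk (the "CDFs of Nominal, Mixture, and Alien Anomaly Scores" and "Estimating the mixing proportion" slides) rests on a simple mixture identity over anomaly-score CDFs. A minimal sketch below; the Gaussian score distributions, the mixing proportion, and the 95% recall target are illustrative assumptions, not the talk's actual values:

```python
import numpy as np
from scipy.stats import norm

# Toy anomaly-score model: nominal scores ~ N(0,1), alien scores ~ N(3,1),
# and the unlabeled mixture contains a fraction pi of aliens.
pi = 0.2
grid = np.linspace(-4, 8, 4000)

f_nom = norm.cdf(grid, 0, 1)                          # nominal score CDF
f_mix = (1 - pi) * f_nom + pi * norm.cdf(grid, 3, 1)  # mixture score CDF

# The unobserved alien CDF is recoverable from the other two:
# F_mix = (1 - pi) F_nom + pi F_alien
#   =>  F_alien = (F_mix - (1 - pi) F_nom) / pi
f_alien = (f_mix - (1 - pi) * f_nom) / pi

# Pick the rejection threshold tau so estimated alien recall is ~95%:
# reject any input whose anomaly score exceeds tau.
target_recall = 0.95
tau = grid[np.argmax(f_alien >= 1 - target_recall)]

recall = 1 - norm.cdf(tau, 3, 1)  # aliens correctly rejected
fpr = 1 - norm.cdf(tau, 0, 1)     # nominals falsely rejected
```

In practice the nominal and mixture CDFs would be empirical, and the mixing proportion pi would itself be estimated from the data, which is where the method's guarantees come from.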
Can AI be trustworthy?
[ Link ]
Website: [ Link ]
Twitter: [ Link ]
LinkedIn Page: [ Link ]
LinkedIn Group: [ Link ]
Instagram: [ Link ]
Facebook: [ Link ]