Can We Trust Machine Learning Predictions to Answer Science Questions?

TITLE:

CSRC Colloquium

Can We Trust Machine Learning Predictions to Answer Science Questions?

DATE:

Friday, March 4, 2022

TIME:

3:30 PM

LOCATION:

Virtual Zoom Conference

SPEAKER:

Dr. Diane Oyen, Scientist, Information Sciences, Los Alamos National Laboratory

ABSTRACT:

Scientists in fields as diverse as bioscience, geoscience, and cyber security are successfully applying machine learning models to solve problems of critical importance to science and security. Machine learning models generalize patterns from datasets and can exhibit emergent behaviors that are poorly understood by their creators and users. These models are trained and validated on available datasets (whether from simulations, experiments, or observations), but they must be trusted when deployed on real data to answer scientific puzzles. Questions of robustness, fairness, bias, and trustworthiness in machine learning models have arisen in social contexts (such as the ethics of using machine learning models to determine prison sentences in criminal court cases). Yet science problems present a rich testbed for developing trustworthy machine learning methods and evaluation tools. We are developing methods to evaluate datasets, machine learning models, and the output predictions of these models that go beyond achieving high accuracy on a fixed validation set and ensure that machine learning is answering the science question at hand.

Bio: Diane Oyen is a Scientist in the Information Sciences Group at Los Alamos National Laboratory. She received her B.S. degree in Electrical Engineering from Carnegie Mellon University and her Ph.D. in Computer Science from the University of New Mexico. Diane develops machine learning methods for scientific analysis, with a particular focus on explainable machine learning, transfer learning, and robust machine learning. She uses probabilistic graphical models in machine learning to better understand the dependence among variables in complex systems, and extends the latest machine learning methods, including deep learning, for use in novel applications such as pattern recognition and scientific discovery in ChemCam observations on Mars, accelerating physics simulations, malware characterization, and computer vision for technical images.


HOST:

Rodrigo Navarro Perez
