Robustness to Multiplicity in the Machine Learning Pipeline

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Meyer, Anna P.
Published: ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering; Artificial intelligence
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3219202014
003 UK-CbPIL
020 |a 9798280780651 
035 |a 3219202014 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Meyer, Anna P. 
245 1 |a Robustness to Multiplicity in the Machine Learning Pipeline 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Machine learning (ML) is increasingly used as a tool to replace or aid human decision making in high stakes settings like finance, medicine, and employment. The outcomes of these models can be pivotal in individuals’ lives, determining, for instance, whether someone gets a loan, access to proper medical care, or a job. However, these models’ decisions are often not robust to multiplicity: i.e., there are often multiple models that perform similarly well in aggregate, yet give conflicting predictions for individual samples. This multiplicity can stem from any part of the ML pipeline and affects not only predictions, but also explanations and global model behavior like adherence to fairness goals. In this dissertation, we study when multiplicity occurs, how to measure and control for it, and what its implications are for the fairness of using ML models. Our goal is to be able to understand when ML models’ outputs are reliable, so that model developers, deployers, and decision subjects can interact with models in an informed way. First, we propose dataset multiplicity, i.e., that multiple datasets may be equally appropriate to use as training data, yet yield models whose predictions disagree. We analyze prediction robustness to dataset multiplicity for two common model architectures, decision trees and linear models. The results of these analyses can be used to increase confidence in model predictions if the robustness proof is successful, or to prompt caution in blindly relying on the model outcomes otherwise. Then, we study the stability of explanations under multiplicity, and in particular dataset multiplicity that can be represented as data shift. We show how to improve explanation robustness in this setting, which allows model developers and explanation recipients to be more confident that the provided explanations will remain valid over time. Finally, we perform the first study about non-expert stakeholders’ views towards how multiplicity affects the fairness of ML models and how decisions should be made in the presence of multiplicity. Our results indicate that lay stakeholders have strong feelings about how multiplicity is resolved, but that these opinions are often at odds with what the existing literature recommends. 
653 |a Computer science 
653 |a Computer engineering 
653 |a Artificial intelligence 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3219202014/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3219202014/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
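
A note on the abstract's central notion, for readers unfamiliar with it: "multiplicity" means that several models with nearly identical aggregate accuracy can still assign conflicting predictions to individual samples, and "dataset multiplicity" means that equally appropriate training datasets can induce that disagreement. The sketch below is not code from the dissertation; it is a minimal, assumed scikit-learn setup in which bootstrap resamples stand in for "equally appropriate" training data, and it simply measures how often the two resulting models conflict.

```python
# Illustrative sketch only (not from the dissertation): two equally plausible
# training sets can yield similarly accurate models whose individual
# predictions conflict -- the "multiplicity" described in the abstract.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

rng = np.random.default_rng(0)

def resample(X, y, rng):
    """Bootstrap resample: a stand-in for one 'equally appropriate' dataset."""
    idx = rng.integers(0, len(y), size=len(y))
    return X[idx], y[idx]

# Two plausible training datasets -> two fitted models of the same class.
m1 = LogisticRegression(max_iter=1000).fit(*resample(X_tr, y_tr, rng))
m2 = LogisticRegression(max_iter=1000).fit(*resample(X_tr, y_tr, rng))

p1, p2 = m1.predict(X_te), m2.predict(X_te)
print(f"accuracy (model 1): {accuracy_score(y_te, p1):.3f}")
print(f"accuracy (model 2): {accuracy_score(y_te, p2):.3f}")
# Aggregate accuracy is nearly identical, yet some individuals receive
# conflicting predictions from the two models.
print(f"disagreement rate : {(p1 != p2).mean():.3f}")
```

Even when the two accuracies match to within a fraction of a percent, samples near the decision boundary typically receive conflicting labels; the dissertation goes further by certifying when a given prediction is robust to such multiplicity, which this toy example does not attempt.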