Another look at inference after prediction

Bibliographic Details
Published in: arXiv.org (Dec 6, 2024), p. n/a
Main Author: Gronsbell, Jessica
Other Authors: Gao, Jianhui; Shi, Yaqi; McCaw, Zachary R; Cheng, David
Published:
Cornell University Library, arXiv.org
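Subjects: Statistical methods; Machine learning; Statistical analysis; Statistical inference; Regression models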
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3134986567
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3134986567 
045 0 |b d20241206 
100 1 |a Gronsbell, Jessica 
245 1 |a Another look at inference after prediction 
260 |b Cornell University Library, arXiv.org  |c Dec 6, 2024 
513 |a Working Paper 
520 3 |a Prediction-based (PB) inference is increasingly used in applications where the outcome of interest is difficult to obtain, but its predictors are readily available. Unlike traditional inference, PB inference performs statistical inference using a partially observed outcome and a set of covariates by leveraging a prediction of the outcome generated from a machine learning (ML) model. Motwani and Witten (2023) recently revisited two innovative PB inference approaches for ordinary least squares. They found that the method proposed by Wang et al. (2020) yields a consistent estimator for the association of interest when the ML model perfectly captures the underlying regression function. Conversely, the prediction-powered inference (PPI) method proposed by Angelopoulos et al. (2023) yields valid inference regardless of the model's accuracy. In this paper, we study the statistical efficiency of the PPI estimator. Our analysis reveals that a more efficient estimator, proposed 25 years ago by Chen and Chen (2000), can be obtained by simply adding a weight to the PPI estimator. We also contextualize PB inference with methods from the economics and statistics literature dating back to the 1960s. Our extensive theoretical and numerical analyses indicate that the Chen and Chen (CC) estimator offers a balance between robustness to ML model specification and statistical efficiency, making it the preferred choice for use in practice. 
653 |a Statistical methods 
653 |a Machine learning 
653 |a Statistical analysis 
653 |a Statistical inference 
653 |a Regression models 
700 1 |a Gao, Jianhui 
700 1 |a Shi, Yaqi 
700 1 |a McCaw, Zachary R 
700 1 |a Cheng, David 
773 0 |t arXiv.org  |g (Dec 6, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3134986567/abstract/embedded/MHJ61NGFZP0T4GI4?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2411.19908
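
The abstract in field 520 says that a more efficient estimator can be obtained by adding a weight to the PPI estimator. The sketch below illustrates that idea on a simpler problem than the paper's ordinary least squares setting: estimating a mean from a small labeled sample and a larger unlabeled sample with ML predictions. The function names and the specific weight are assumptions, since this record does not give the formula from Chen and Chen (2000); the weight used here is the variance-minimizing control-variate coefficient, chosen only to show how a weight changes the correction term.

```python
from statistics import fmean, variance


def covariance(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def weighted_pb_mean(y_lab, pred_lab, pred_unlab, weight=1.0):
    """Prediction-based estimate of E[Y] (hypothetical helper).

    y_lab      : observed outcomes on the labeled sample
    pred_lab   : ML predictions on the labeled sample
    pred_unlab : ML predictions on the unlabeled sample
    weight     : 1.0 gives an unweighted, PPI-style estimate;
                 0.0 ignores the predictions and returns the labeled mean.
    """
    return fmean(y_lab) + weight * (fmean(pred_unlab) - fmean(pred_lab))


def control_variate_weight(y_lab, pred_lab, n_unlab):
    """Weight minimizing the variance of weighted_pb_mean (an assumption,
    not necessarily the weight used in the paper).

    Derived from Var = s_y^2/n + w^2 s_f^2 (1/n + 1/N) - 2 w s_yf / n,
    which is minimized at w = (s_yf / s_f^2) * N / (n + N).
    """
    n = len(y_lab)
    slope = covariance(y_lab, pred_lab) / variance(pred_lab)
    return slope * n_unlab / (n + n_unlab)


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    y_lab = [1.0, 2.0, 3.0, 4.0]
    pred_lab = [1.5, 1.5, 3.5, 3.5]
    pred_unlab = [2.0, 3.0, 4.0, 3.0, 2.0, 4.0]

    w = control_variate_weight(y_lab, pred_lab, len(pred_unlab))
    print("unweighted:", weighted_pb_mean(y_lab, pred_lab, pred_unlab))
    print("weighted:  ", weighted_pb_mean(y_lab, pred_lab, pred_unlab, weight=w))
```

When the predictions are only weakly correlated with the outcome, the weight shrinks toward zero and the estimate falls back to the labeled-sample mean, which is the intuition behind weighting the correction rather than applying it in full.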