Comparison of Off-the-Shelf Methods and a Hotelling Multidimensional Approximation for Data Drift Detection
| Published in: | Machine Learning and Knowledge Extraction vol. 7, no. 1 (2025), p. 2 |
|---|---|
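| Main Author: | Navarro-Cerdán, J Ramón |
| Other Authors: | Vicent Ortiz Castelló, David Millán Escrivá |
| Published: | MDPI AG |
| Subjects: | Deviation; Machine learning; Data transmission; Methods; Datasets; Hypothesis testing; Algorithms; Multidimensional methods; Drift; Hypotheses; Statistical tests; Decision making; Process controls |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |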
MARC
| Field | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 3181640152 |
| 003 | | | UK-CbPIL |
| 022 | | | \|a 2504-4990 |
| 024 | 7 | | \|a 10.3390/make7010002 \|2 doi |
| 035 | | | \|a 3181640152 |
| 045 | 2 | | \|b d20250101 \|b d20250331 |
| 100 | 1 | | \|a Navarro-Cerdán, J Ramón |
| 245 | 1 | | \|a Comparison of Off-the-Shelf Methods and a Hotelling Multidimensional Approximation for Data Drift Detection |
| 260 | | | \|b MDPI AG \|c 2025 |
| 513 | | | \|a Journal Article |
| 520 | 3 | | \|a Data drift can significantly impact the outcome of a model. Early detection of data drift is crucial for ensuring user confidence in predictions. It allows the user to check if a particular model needs retraining using updated data to adapt to the evolving process dynamics. This study compares five different statistical tests, namely four unidimensional and a new multidimensional test (MSPC), to identify data drift in both mean and deviation. While some are designed to detect drift in mean only, like our multidimensional proposal, others respond to changes in both mean and deviation. However, our Hotelling multidimensional method can be trained once and then applied in a single stage to any data stream with several attributes, and it can identify the most relevant variables causing a data drift with one execution, thus avoiding the need for a single univariate test for each attribute. Moreover, our method yields the relative importance of each attribute for drift and allows users to increase or decrease the relative weight of each variable regarding drift detection. It also may be capable of detecting drift due to changes in multivariate interactions. This behavior is especially suitable for real-world scenarios, such as industry, finance, or healthcare environments. |
| 653 | | | \|a Deviation |
| 653 | | | \|a Machine learning |
| 653 | | | \|a Data transmission |
| 653 | | | \|a Methods |
| 653 | | | \|a Datasets |
| 653 | | | \|a Hypothesis testing |
| 653 | | | \|a Algorithms |
| 653 | | | \|a Multidimensional methods |
| 653 | | | \|a Drift |
| 653 | | | \|a Hypotheses |
| 653 | | | \|a Statistical tests |
| 653 | | | \|a Decision making |
| 653 | | | \|a Process controls |
| 700 | 1 | | \|a Vicent Ortiz Castelló |
| 700 | 1 | | \|a David Millán Escrivá |
| 773 | 0 | | \|t Machine Learning and Knowledge Extraction \|g vol. 7, no. 1 (2025), p. 2 |
| 786 | 0 | | \|d ProQuest \|t Advanced Technologies & Aerospace Database |
| 856 | 4 | 1 | \|3 Citation/Abstract \|u https://www.proquest.com/docview/3181640152/abstract/embedded/09EF48XIB41FVQI7?source=fedsrch |
| 856 | 4 | 0 | \|3 Full Text + Graphics \|u https://www.proquest.com/docview/3181640152/fulltextwithgraphics/embedded/09EF48XIB41FVQI7?source=fedsrch |
| 856 | 4 | 0 | \|3 Full Text - PDF \|u https://www.proquest.com/docview/3181640152/fulltextPDF/embedded/09EF48XIB41FVQI7?source=fedsrch |
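
The abstract describes a Hotelling-based multivariate test that is trained once on reference data, applied to a stream with several attributes, and reports which attributes contribute most to a detected drift. The following is a minimal sketch of that general idea using a textbook Hotelling T² statistic in NumPy. It is not the authors' MSPC implementation. The function names, the synthetic data, and the empirical 99th-percentile control limit are all assumptions made for illustration, and the per-attribute weighting mentioned in the abstract is not modelled.

```python
import numpy as np


def fit_reference(X_ref):
    """Estimate the mean and inverse covariance from reference (training) data."""
    mean = X_ref.mean(axis=0)
    cov = np.cov(X_ref, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates near-singular covariance
    return mean, cov_inv


def t2_scores(X, mean, cov_inv):
    """Hotelling T^2 (squared Mahalanobis distance) for each row of X."""
    d = X - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)


def t2_contributions(x, mean, cov_inv):
    """Per-attribute terms d_j * (S^-1 d)_j; they sum to T^2 for observation x."""
    d = x - mean
    return d * (cov_inv @ d)


# Synthetic reference data (assumption: three roughly Gaussian attributes).
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(500, 3))
mean, cov_inv = fit_reference(X_ref)

# Assumed control limit: empirical 99th percentile of the reference T^2 scores.
limit = np.quantile(t2_scores(X_ref, mean, cov_inv), 0.99)

# Simulated stream with a mean shift in the attribute at index 1.
X_new = rng.normal(size=(200, 3))
X_new[:, 1] += 3.0

scores = t2_scores(X_new, mean, cov_inv)
drift_rate = float((scores > limit).mean())  # fraction of rows above the limit

# Average contribution per attribute across the stream; the largest one
# points at the attribute most responsible for the drift signal.
contrib = np.mean([t2_contributions(x, mean, cov_inv) for x in X_new], axis=0)
top_attribute = int(np.argmax(contrib))
```

Here a single fitted model scores every row of the stream, so no separate univariate test per attribute is needed. Individual contribution terms can be negative when attributes are correlated, so they are best read as a ranking aid rather than as a strict decomposition of importance.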