Spatial analysis of air pollutant exposure and its association with metabolic diseases using machine learning

Bibliographic details
Published in: BMC Public Health vol. 25 (2025), p. 1
Main author: Liu, Jingjing
Other authors: Liu, Chang; Liu, Zhangdaihong; Zhou, Yibin; Li, Xiaoguang; Yang, Yang
Published: Springer Nature B.V.
Online link: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3175402255
003 UK-CbPIL
022 |a 1471-2458 
024 7 |a 10.1186/s12889-025-22077-9  |2 doi 
035 |a 3175402255 
045 2 |b d20250101  |b d20251231 
084 |a 58491  |2 nlm 
100 1 |a Liu, Jingjing 
245 1 |a Spatial analysis of air pollutant exposure and its association with metabolic diseases using machine learning 
260 |b Springer Nature B.V.  |c 2025 
513 |a Journal Article 
520 3 |a Background: Metabolic diseases (MDs), exemplified by diabetes, hypertension, and dyslipidemia, have become increasingly prevalent with rising living standards, posing significant public health challenges. The MDs are influenced by a complex interplay of genetic factors, lifestyle choices, and socioeconomic conditions. Additionally, environmental pollutants, particularly air pollutants (APs), have attracted increasing attention for their potential role in exacerbating these MDs. However, the impact of APs on the MDs remains unclear. This study introduces a novel machine learning (ML) pipeline, an Algorithm for Spatial Relationships Analysis between Exposome and Metabolic Diseases (ASEMD), to analyze spatial associations between APs and MDs at the prefecture-level city scale in China. Methods: The ASEMD pipeline comprises three main steps: (i) Spatial autocorrelation between APs and MDs is evaluated using Moran’s I statistic and Local Indicators of Spatial Association (LISA) maps. (ii) Dimensionality reduction and spatial similarities identification between APs and MDs clusters using Principal Component Analysis (PCA), k-means clustering, and Jaccard index calculations, further validated through spatial maps. (iii) AP exposure is adjusted by demographic and lifestyle confounders to predict MDs using machine learning models (e.g., eXtreme Gradient Boosting (XGBoost), Random Forest (RF), Decision Tree (DT), LightGBM, and Multi-Layer Perceptron (MLP)). SHAP values are employed to identify key adjusted APs that are linked to MDs. Model performance is evaluated through 10-fold cross-validation using five different metrics. The data utilized include CHARLS (2015) and meteorological data (2013-2015). Results: Significant spatial correlations were found between APs and the prevalence of diabetes, dyslipidemia, and hypertension, with higher prevalence rates observed in alignment with elevated APs concentrations. By adjusting for demographic and lifestyle confounders, APs effectively predicted the risk of developing MDs (AUROC=0.890, 0.877, 0.710 for diabetes, dyslipidemia, and hypertension, respectively). The results showed that \(\mathrm{CO}\), \(\mathrm{PM}_{2.5}\), and \(\mathrm{AQI}\) were strongly correlated with diabetes, whereas \(\mathrm{NO}_{2}\), \(\mathrm{PM}_{2.5}\), and \(\mathrm{PM}_{10}\) were significantly associated with dyslipidemia. For hypertension, \(\mathrm{CO}\), \(\mathrm{O}_{3}\), and \(\mathrm{AQI}\) were mostly correlated. Sensitivity analyses across different regions and different types of APs underscored the robustness of our conclusions. Conclusion: The ASEMD pipeline successfully integrates ML models, epidemiological methods, and spatial analysis techniques, providing a robust framework for understanding the complex interactions between APs and MDs. We also identified specific APs, including \(\mathrm{PM}_{10}\), \(\mathrm{CO}\), and \(\mathrm{SO}_{2}\), as being strongly linked to higher rates of diabetes, dyslipidemia, and hypertension in central and northern cities. Future region-specific public health strategies or interventions, especially in those areas with high pollutant levels, are needed to mitigate air pollution’s impact on metabolic health. 
651 4 |a China 
653 |a Performance evaluation 
653 |a Spatial analysis 
653 |a Epidemiology 
653 |a Demographics 
653 |a Sensitivity analysis 
653 |a Principal components analysis 
653 |a Multilayer perceptrons 
653 |a Questionnaires 
653 |a Cities 
653 |a Genetic factors 
653 |a Hypertension 
653 |a Machine learning 
653 |a Air pollution 
653 |a Metabolic disorders 
653 |a Decision trees 
653 |a Blood pressure 
653 |a Clustering 
653 |a Cadmium 
653 |a Exposure 
653 |a Regions 
653 |a Public health 
653 |a Socioeconomics 
653 |a Dyslipidemia 
653 |a Algorithms 
653 |a Meteorological data 
653 |a Insulin resistance 
653 |a Plasma 
653 |a Diabetes 
653 |a Pollutants 
653 |a Socioeconomic factors 
653 |a Multilayers 
653 |a Self report 
653 |a Disease prevention 
653 |a Fasting 
653 |a Longitudinal studies 
653 |a Osteoporosis 
653 |a Learning algorithms 
653 |a Pollution levels 
653 |a Cluster analysis 
653 |a Cohort analysis 
653 |a Support vector machines 
653 |a Glucose 
653 |a Diabetes mellitus 
653 |a Demography 
653 |a Vector quantization 
653 |a Social 
700 1 |a Liu, Chang 
700 1 |a Liu, Zhangdaihong 
700 1 |a Zhou, Yibin 
700 1 |a Li, Xiaoguang 
700 1 |a Yang, Yang 
773 0 |t BMC Public Health  |g vol. 25 (2025), p. 1 
786 0 |d ProQuest  |t Health & Medical Collection 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3175402255/abstract/embedded/09EF48XIB41FVQI7?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3175402255/fulltext/embedded/09EF48XIB41FVQI7?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3175402255/fulltextPDF/embedded/09EF48XIB41FVQI7?source=fedsrch
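
Note on the abstract (field 520): step (i) of the pipeline relies on Moran's I to measure spatial autocorrelation. The following is a minimal sketch of the global Moran's I statistic, not the authors' ASEMD code. The function name `morans_i`, the binary adjacency matrix, and the four-city toy values are all hypothetical and chosen only for illustration; the record does not say how the study built its spatial weights.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix.

    I = (n / W) * sum_ij w_ij * (x_i - mean) * (x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    squared_dev = sum(d * d for d in dev)
    return (n / total_weight) * cross / squared_dev


# Hypothetical toy data: four cities in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 2.0, 3.0, 4.0]    # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbours always differ

print(morans_i(clustered, adjacency))
print(morans_i(alternating, adjacency))
```

In this toy setup the smooth gradient gives a positive value (neighbouring cities resemble each other) and the alternating pattern gives a negative one. A real analysis would also need a significance test, for example by permutation, which this sketch omits.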