MARC

LEADER 00000nab a2200000uu 4500
001 3267580384
003 UK-CbPIL
022 |a 2198-994X 
024 7 |a 10.1186/s40807-025-00220-9  |2 doi 
035 |a 3267580384 
045 2 |b d20251201  |b d20251231 
100 1 |a Adeleke, Oluwatobi  |u University of Johannesburg, Mechanical Engineering Science, Johannesburg, South Africa (GRID:grid.412988.e) (ISNI:0000 0001 0109 131X) 
245 1 |a Data-driven and explainable AI (XAI) framework for optimizing methane yield in large-scale biogas production 
260 |b Springer Nature B.V.  |c Dec 2025 
513 |a Journal Article 
520 3 |a The shift to sustainable energy systems necessitates scalable techniques for the valorization of organic waste via biogas production. This study presents a comprehensive data-driven framework encompassing statistical analysis, Explainable AI (XAI), clustering, and predictive modeling of methane yield to gain deeper operational insights into large-scale biogas production. Using the operational data of a large-scale biodigester in the Western Cape province of South Africa, including key biochemical and physicochemical variables such as temperature, pH, total solids (TS), volatile solids (VS), moisture content (MC), and FOS/TAC, key insights were derived through correlation mapping, scatter analysis, SHapley Additive exPlanations (SHAP)-based XAI for ranking digestion operational features, principal component analysis (PCA) for addressing multicollinearity, and k-means cluster analysis to identify operational clusters that highlight critical shifts in system stability. Moreover, ensemble learning approaches, namely XGBoost and Random Forest, as well as Support Vector Machine and Artificial Neural Network models, were developed for methane yield prediction. The SHAP-based XAI identified FOS/TAC, VS, and MC as the most influential predictors of methane yield, while PCA explains 74% of the data variance in three principal components (PCs), with PC1 dominated by VS, MC, and temperature as key drivers of methane yield. K-means clustering uncovered three distinct operational clusters, offering actionable guidance for feedstock management and process stabilization. Feedstock regression further established municipal solid waste (MSW) as the optimal input for maximizing methane output, with processed organic waste (POW) serving as an effective co-substrate. 
XGBoost achieved the best performance with an RMSE value of 1.18, followed by Random Forest (RMSE = 1.83), demonstrating the robustness of ensemble models in handling non-linear operational datasets. The research methodology is limited by its reliance on past operational data from a single digester and a lack of direct optimization experiments. However, the research strongly demonstrates the potential of data-driven approaches not only as powerful standalone tools but also as vital complements to experimental investigations. By transforming raw plant data into actionable intelligence, this study offers a scalable methodology for improving energy recovery, enhancing process control, and guiding sustainable development in industrial-scale bioenergy applications. 
651 4 |a South Africa 
653 |a Biogas 
653 |a Organic wastes 
653 |a Yield forecasting 
653 |a Data smoothing 
653 |a Principal components analysis 
653 |a Artificial neural networks 
653 |a Sustainable energy 
653 |a Methane 
653 |a Optimization 
653 |a Energy recovery 
653 |a Municipal solid waste 
653 |a Volatile solids 
653 |a Raw materials 
653 |a Moisture content 
653 |a Renewable energy 
653 |a Machine learning 
653 |a Strategic planning 
653 |a Energy consumption 
653 |a Process control 
653 |a Prediction models 
653 |a Clustering 
653 |a Genetic algorithms 
653 |a Sustainable development 
653 |a Statistical methods 
653 |a Alternative energy sources 
653 |a Ensemble learning 
653 |a Sustainability 
653 |a Statistical analysis 
653 |a Municipal waste management 
653 |a Explainable artificial intelligence 
653 |a Circular economy 
653 |a Food waste 
653 |a Water content 
653 |a Case studies 
653 |a Cluster analysis 
653 |a Artificial intelligence 
653 |a Solid waste management 
653 |a Support vector machines 
653 |a Process controls 
653 |a Variables 
653 |a Waste to energy 
653 |a Energy efficiency 
653 |a Research methods 
653 |a Solid wastes 
653 |a Systems stability 
653 |a Neural networks 
653 |a Vector quantization 
653 |a Economic 
700 1 |a Jen, Tien-Chien  |u University of Johannesburg, Mechanical Engineering Science, Johannesburg, South Africa (GRID:grid.412988.e) (ISNI:0000 0001 0109 131X) 
773 0 |t Sustainable Energy Research  |g vol. 12, no. 1 (Dec 2025), p. 65 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3267580384/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3267580384/fulltext/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3267580384/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch