Enhanced temporal encoding-decoding for survival analysis of multimodal clinical data in smart healthcare
| Published in: | Visual Computing for Industry, Biomedicine, and Art, vol. 8, no. 1 (Dec 2025), p. 28 |
|---|---|
| Published: | Springer Nature B.V. |
| Abstract: | Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and leveraging digital health advancements; however, existing prediction models rely mainly on ensemble classification techniques and show suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection with temporal predictive modeling. For facial feature extraction, the study developed a lightweight face-region convolutional neural network (FRegNet) that detects key facial components, such as the eyes and lips, in clinical patients; it incorporates a residual backbone (Rstem) to enhance feature representation and a facial path-aggregated feature pyramid network for multi-resolution feature fusion. Comparative experiments show that FRegNet outperforms state-of-the-art target detection algorithms, achieving an average precision (AP) of 0.922, an average recall of 0.933, a mean average precision (mAP) of 0.987, and a precision of 0.98, significantly surpassing other mask region-based convolutional neural network (RCNN) variants such as Mask RCNN-ResNeXt (AP of 0.789, mAP of 0.957). Based on the extracted facial features and clinical physiological indicators, the study proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance. The ETED variant that incorporates facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, a precision of 0.850, a recall of 0.895, an F1 score of 0.884, and an area under the curve (AUC) of 0.947, surpassing gradient boosting (accuracy of 0.922 but AUC of only 0.669) and other classifiers across the combined metrics. The results confirm that the multimodal dataset (facial features plus physiological indicators) significantly improves prediction of patients' seven-day survival. Correlation analysis shows that chronic health evaluation and mean arterial pressure are positively correlated with survival, whereas temperature, the Glasgow Coma Scale, and fibrinogen are negatively correlated (see the illustrative sketches after this record). |
| ISSN: | 2524-4442 |
| DOI: | 10.1186/s42492-025-00209-7 |
| Source: | Publicly Available Content Database |
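
The abstract describes the ETED architecture (a temporal encoder-decoder with an adaptive attention mechanism and a gated weighting of the facial and physiological streams) and a correlation analysis of indicators against the seven-day survival label. The two sketches below are illustrative reconstructions only: the layer sizes, module and column names, the use of GRUs with additive attention, PyTorch, and point-biserial correlation are all assumptions, not details taken from the paper.

```python
# Minimal sketch of a gated multimodal temporal encoder-decoder for binary
# seven-day survival prediction, loosely following the abstract's description
# of the ETED model. Every dimension and layer choice here is assumed.
import torch
import torch.nn as nn


class GatedMultimodalFusion(nn.Module):
    """Learns a per-time-step gate that weights facial vs. physiological features."""

    def __init__(self, face_dim: int, phys_dim: int, hidden_dim: int):
        super().__init__()
        self.face_proj = nn.Linear(face_dim, hidden_dim)
        self.phys_proj = nn.Linear(phys_dim, hidden_dim)
        self.gate = nn.Sequential(nn.Linear(face_dim + phys_dim, hidden_dim), nn.Sigmoid())

    def forward(self, face_feat, phys_feat):
        z = self.gate(torch.cat([face_feat, phys_feat], dim=-1))          # (B, T, H)
        return z * self.face_proj(face_feat) + (1 - z) * self.phys_proj(phys_feat)


class TemporalEncoderDecoder(nn.Module):
    """GRU encoder-decoder with additive attention over the encoded sequence."""

    def __init__(self, face_dim=64, phys_dim=16, hidden_dim=128):
        super().__init__()
        self.fusion = GatedMultimodalFusion(face_dim, phys_dim, hidden_dim)
        self.encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.decoder_cell = nn.GRUCell(hidden_dim, hidden_dim)
        self.attn = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim), nn.Tanh(),
                                  nn.Linear(hidden_dim, 1))
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, face_seq, phys_seq):
        fused = self.fusion(face_seq, phys_seq)                 # (B, T, H)
        enc_out, h = self.encoder(fused)                        # (B, T, H), (1, B, H)
        h = h.squeeze(0)                                        # (B, H)
        # One decoding step: attend over encoder states, then update the hidden state.
        scores = self.attn(torch.cat([enc_out, h.unsqueeze(1).expand_as(enc_out)], dim=-1))
        weights = torch.softmax(scores, dim=1)                  # (B, T, 1)
        context = (weights * enc_out).sum(dim=1)                # (B, H)
        h = self.decoder_cell(context, h)
        return torch.sigmoid(self.classifier(h)).squeeze(-1)    # survival probability


# Dummy usage: a batch of 8 patients with 24 time steps of features.
model = TemporalEncoderDecoder()
face = torch.randn(8, 24, 64)   # facial features (e.g., from a detector such as FRegNet)
phys = torch.randn(8, 24, 16)   # physiological indicators
print(model(face, phys).shape)  # torch.Size([8])
```

The sigmoid gate is one plausible reading of the paper's "gated weighting mechanism": it decides, per sample and time step, how much the fused representation draws from the facial stream versus the physiological stream.

For the correlation analysis, one straightforward way to relate a continuous indicator to the binary survival outcome is a point-biserial correlation; the column names and values below are purely hypothetical.

```python
# Hedged sketch: correlating individual indicators with a binary survival label.
import pandas as pd
from scipy.stats import pointbiserialr

df = pd.DataFrame({
    "survived_7d":            [1, 0, 1, 1, 0, 1, 0, 1],                          # hypothetical labels
    "mean_arterial_pressure": [85, 60, 90, 88, 55, 92, 58, 87],                  # mmHg
    "temperature":            [36.8, 39.2, 36.5, 37.0, 39.5, 36.6, 38.9, 36.9],  # deg C
    "fibrinogen":             [2.8, 5.1, 2.5, 3.0, 5.4, 2.6, 4.9, 2.9],          # g/L
})

for col in df.columns.drop("survived_7d"):
    r, p = pointbiserialr(df["survived_7d"], df[col])
    print(f"{col}: r={r:+.2f} (p={p:.3f})")
```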