Int J Med Sci 2026; 23(5):1808-1821. doi:10.7150/ijms.127764
Research Paper
1. Department of Orthopedic Surgery, Tri-Service General Hospital, National Defense Medical University, Taipei, Taiwan.
2. Department of Biotechnology and Laboratory Science in Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
3. Graduate Institute of Linguistics, National Cheng Chi University, Taipei, Taiwan.
4. Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan.
5. Graduate Institute of Technology, Innovation and Intellectual Property Management, National Cheng Chi University, Taipei, Taiwan.
6. School of Public Health, National Defense Medical University, Taipei, Taiwan.
7. Department of Public Health, China Medical University, Taichung, Taiwan.
8. Graduate Institute of Life Sciences, National Defense Medical University, Taipei, Taiwan.
9. Graduate Institute of Medical Sciences, National Defense Medical University, Taipei, Taiwan.
10. Big Data Research Center, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan.
11. Department of Public Health, College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan.
12. Department of Healthcare Administration and Medical Informatics, College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan.
13. Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan.
Background: Human umbilical cord-derived mesenchymal stromal/stem cells (UC-MSCs) are promising for regenerative medicine, but consistent manufacturing quality is critical.
Objective: To develop and interpret machine-learning models, using extreme gradient boosting (XGBoost) with SHapley Additive exPlanations (SHAP), that identify facilitatory and inhibitory factors affecting UC-MSC culture duration and post-processing viability.
Methods: We analyzed data from 203 UC-MSC manufacturing cases. Candidate predictors included neonatal characteristics (e.g., sex, delivery mode), processing timelines, medium composition, cell features, and operator-related factors. Performance was evaluated using accuracy, the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), log loss, and Brier score, with calibration assessed in cross-validation.
Results: For predicting shorter culture duration (defined as an interval of <600 h between UC collection and completion of cryopreservation), the model achieved accuracy = 0.80, AUROC = 0.72, and log loss = 0.55; cross-validation yielded AUROC = 0.68, AUPRC = 0.81, and Brier score = 0.20 with good calibration. For predicting higher cell viability, the model achieved accuracy = 0.71, AUROC = 0.72, and log loss = 0.62; cross-validation yielded AUROC = 0.54, AUPRC = 0.58, and Brier score = 0.26. SHAP analysis indicated that shorter culture duration was most strongly associated with medium composition, processing time, and delivery mode, whereas higher viability was linked to neonatal sex, operator identity, and processing time. In sensitivity analyses, the top-ranked features remained stable across decision-threshold shifts and after removal of operator identity.
Conclusions: An interpretable XGBoost+SHAP pipeline is effective for identifying process-critical drivers of UC-MSC culture duration. Although current predictive performance for cell viability remains limited, the framework functions as a robust diagnostic tool for elucidating qualitative trends. These insights can guide targeted optimization of media selection, timeline control, and standard operating procedures (SOPs), ultimately enhancing manufacturing quality.
Keywords: umbilical cord-derived mesenchymal stem cell, XGBoost algorithm, culture duration, cell viability
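For readers less familiar with the evaluation metrics named in the Methods, the following minimal sketch shows how accuracy, AUROC, AUPRC (computed here as average precision), log loss, and Brier score can be derived from binary labels and predicted probabilities. It is illustrative only: the function names and the toy labels and probabilities are assumptions made for this example and are not the study's data or code. Standard libraries such as scikit-learn provide equivalent routines (`roc_auc_score`, `average_precision_score`, `log_loss`, `brier_score_loss`).

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


# Hypothetical toy data (1 = e.g. "shorter culture duration"); NOT study data.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.55, 0.2]

metrics = {
    "accuracy": accuracy(y_true, y_prob),
    "auroc": auroc(y_true, y_prob),
    "auprc": average_precision(y_true, y_prob),
    "log_loss": log_loss(y_true, y_prob),
    "brier": brier_score(y_true, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUROC and AUPRC depend only on how cases are ranked, whereas log loss and Brier score also penalize poorly calibrated probabilities, which is one reason a model can look acceptable on one family of metrics and weak on the other.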