Int J Med Sci 2026; 23(5):1808-1821. doi:10.7150/ijms.127764

Research Paper

Identifying key bioprocess variables using explainable machine learning to enhance culture efficiency and viability of umbilical cord-derived mesenchymal stem cells

Tse-Pu Huang1, Hsin-Hui Huang2, Bing-Tsiong Li3, Pei-Hung Shen1, Gracy Thomas4, Juin-Yi Han5, Chi-Ming Chu6,7,8,9,10,11,12,13, Kun-Yi Lin1✉

1. Department of Orthopedic Surgery, Tri-Service General Hospital, National Defense Medical University, Taipei, Taiwan.
2. Department of Biotechnology and Laboratory Science in Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
3. Graduate Institute of Linguistics, National Cheng Chi University, Taipei, Taiwan.
4. Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan.
5. Graduate Institute of Technology, Innovation and Intellectual Property Management, National Cheng Chi University, Taipei, Taiwan.
6. School of Public Health, National Defense Medical University, Taipei, Taiwan.
7. Department of Public Health, China Medical University, Taichung, Taiwan.
8. Graduate Institute of Life Sciences, National Defense Medical University, Taipei, Taiwan.
9. Graduate Institute of Medical Sciences, National Defense Medical University, Taipei, Taiwan.
10. Big Data Research Center, College of Medicine, Fu-Jen Catholic University, New Taipei City, Taiwan.
11. Department of Public Health, College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan.
12. Department of Healthcare Administration and Medical Informatics, College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan.
13. Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan.

Citation:
Huang TP, Huang HH, Li BT, Shen PH, Thomas G, Han JY, Chu CM, Lin KY. Identifying key bioprocess variables using explainable machine learning to enhance culture efficiency and viability of umbilical cord-derived mesenchymal stem cells. Int J Med Sci 2026; 23(5):1808-1821. doi:10.7150/ijms.127764. https://www.medsci.org/v23p1808.htm

Abstract

Graphic abstract

Background: Human umbilical cord-derived mesenchymal stromal/stem cells (UC-MSCs) are promising for regenerative medicine, but consistent manufacturing quality is critical.

Objective: To develop and interpret machine-learning models, based on extreme gradient boosting (XGBoost) with SHapley Additive exPlanations (SHAP), that identify facilitatory and inhibitory factors affecting UC-MSC culture duration and post-processing viability.

Methods: We analyzed data from 203 UC-MSC manufacturing cases. Candidate predictors included neonatal characteristics (e.g., sex, delivery mode), processing timelines, medium composition, cell features, and operator-related factors. Performance was evaluated using accuracy, the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), log loss, and Brier score, with calibration assessed in cross-validation.
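For readers less familiar with the evaluation metrics named in the Methods, the following is a minimal sketch of how each can be computed from binary labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, the use of average precision as the AUPRC estimate, and the toy labels and probabilities are all illustrative assumptions.

```python
import math

def accuracy(y_true, p, threshold=0.5):
    # Fraction of cases where the thresholded probability matches the label.
    return sum((pi >= threshold) == bool(yi) for yi, pi in zip(y_true, p)) / len(y_true)

def brier_score(y_true, p):
    # Mean squared difference between predicted probability and outcome.
    return sum((pi - yi) ** 2 for yi, pi in zip(y_true, p)) / len(y_true)

def log_loss(y_true, p, eps=1e-15):
    # Mean negative log-likelihood; probabilities are clipped to avoid log(0).
    total = 0.0
    for yi, pi in zip(y_true, p):
        pi = min(max(pi, eps), 1 - eps)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y_true)

def auroc(y_true, p):
    # Probability that a random positive is scored above a random negative
    # (ties count as one half).
    pos = [pi for yi, pi in zip(y_true, p) if yi == 1]
    neg = [pi for yi, pi in zip(y_true, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, p):
    # AUPRC estimated as average precision; this simple version assumes
    # that no two cases share the same predicted probability.
    ranked = sorted(zip(p, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, yi) in enumerate(ranked, start=1):
        if yi == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy data (hypothetical, not from the study).
y = [1, 0, 1, 1, 0, 0, 1, 0]
prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

metrics = {
    "accuracy": accuracy(y, prob),
    "AUROC": auroc(y, prob),
    "AUPRC": average_precision(y, prob),
    "log loss": log_loss(y, prob),
    "Brier": brier_score(y, prob),
}
```

In a cross-validated setting, the same functions would be applied to the out-of-fold predictions, so that each case is scored by a model that did not see it during training.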

Results: For predicting shorter culture duration (defined as an interval of <600 h between UC collection and completion of cryopreservation), the model achieved accuracy = 0.80, AUROC = 0.72, and log loss = 0.55; cross-validation yielded AUROC = 0.68, AUPRC = 0.81, and Brier score = 0.20, with good calibration. For predicting higher cell viability, the model achieved accuracy = 0.71, AUROC = 0.72, and log loss = 0.62; cross-validation yielded AUROC = 0.54, AUPRC = 0.58, and Brier score = 0.26. SHAP analysis indicated that shorter culture duration was most strongly associated with medium composition, processing time, and delivery mode, whereas higher viability was linked to neonatal sex, operator identity, and processing time. In sensitivity analyses, the top-ranked features remained stable across shifts in the decision threshold and after removal of operator identity.
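The feature attributions reported by SHAP rest on Shapley values, which divide the difference between a prediction and a baseline prediction among the input features. The sketch below computes exact Shapley values by brute force for a three-feature toy function. It is an illustration only: the toy model, its coefficients, the feature names, and the all-zero baseline are hypothetical, and SHAP's tree explainers use a far more efficient tree-specific algorithm with a background dataset rather than this enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions
    over every ordering of the features (feasible only for few features)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the baseline input
        prev = model(current)
        for j in order:
            current[j] = x[j]         # switch feature j to its actual value
            new = model(current)
            phi[j] += new - prev      # marginal contribution of feature j
            prev = new
    return [v / len(orders) for v in phi]

def toy_model(features):
    # Hypothetical score with one interaction term (not the study's model).
    medium, proc_time, delivery = features
    return 2.0 * medium - 1.0 * proc_time + 0.5 * medium * delivery

x = [1.0, 2.0, 1.0]          # hypothetical case
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input

phi = shapley_values(toy_model, x, baseline)
# The attributions sum to toy_model(x) - toy_model(baseline), and the
# interaction term is split equally between the two features involved.
```

Ranking features by the mean absolute value of such attributions across cases is the usual way a global importance ordering is obtained from per-case explanations.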

Conclusions: An interpretable XGBoost+SHAP pipeline is effective for identifying process-critical drivers of UC-MSC culture duration. Although predictive performance for cell viability remains limited, the framework still functions as a diagnostic tool for elucidating qualitative trends. These insights can guide targeted optimization of media selection, timeline control, and standard operating procedures (SOPs), ultimately enhancing manufacturing quality.

Keywords: umbilical cord-derived mesenchymal stem cell, XGBoost algorithm, culture duration, cell viability



This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/). See https://ivyspring.com/terms for full terms and conditions.