Int J Med Sci 2020; 17(3):280-291. doi:10.7150/ijms.37134
Machine Learning in Prediction of Second Primary Cancer and Recurrence in Colorectal Cancer
1. Division of Colorectal Surgery, Department of Surgery, Chung Shan Medical University Hospital, Taiwan
2. Institute of Medicine, Chung Shan Medical University, Taiwan
3. School of Nursing, Chung Shan Medical University, Taiwan
4. Department of Gastroenterology, Jen-Ai Hospital, Taichung, Taiwan
5. Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Faculty of Medicine, Chiang Mai University, Thailand
6. Division of Nephrology, Department of Internal Medicine, Chung Shan Medical University Hospital, Taiwan
7. School of Medicine, Chung Shan Medical University
8. Department of Nutrition, Jen-Ai Hospital, Taichung, Taiwan
Ting WC, Lu YCA, Ho WC, Cheewakriangkrai C, Chang HR, Lin CL. Machine Learning in Prediction of Second Primary Cancer and Recurrence in Colorectal Cancer. Int J Med Sci 2020; 17(3):280-291. doi:10.7150/ijms.37134. Available from http://www.medsci.org/v17p0280.htm
Background: Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. Recurrence of CRC (Re) and onset of a second primary malignancy (SPM) are important indicators in treating CRC, but the onset of an SPM is often difficult to predict. Therefore, we used machine learning to identify risk factors that affect Re and SPM.
Patients and Methods: CRC patients were identified from the cancer registry databases of three medical centers. All patients were classified as Re or no recurrence (NRe) and as SPM or no SPM (NSPM). Two classifiers, A Library for Support Vector Machines (LIBSVM) and Reduced Error Pruning Tree (REPTree), were applied to analyze the relationship between clinical features and the Re and/or SPM category by constructing optimized models.
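The grouping described above can be illustrated with a small labeling step that turns the two binary outcomes into the four combined categories. This is a minimal sketch only: the function name `combined_category`, the field names, and the example patients are hypothetical and do not reflect the authors' registry coding.

```python
def combined_category(recurrence: bool, second_primary: bool) -> str:
    """Map the two binary outcomes onto one of the four combined categories."""
    spm = "SPM" if second_primary else "NSPM"
    re_ = "Re" if recurrence else "NRe"
    return f"{spm}+{re_}"


# Hypothetical patients (made-up values, not registry data).
patients = [
    {"id": 1, "recurrence": True, "second_primary": True},
    {"id": 2, "recurrence": True, "second_primary": False},
    {"id": 3, "recurrence": False, "second_primary": True},
    {"id": 4, "recurrence": False, "second_primary": False},
]

# One class label per patient, usable as the target of a multi-class classifier.
labels = [combined_category(p["recurrence"], p["second_primary"]) for p in patients]
```

Labels of this kind could then serve as the target variable for either classifier, with the clinical features as inputs.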
Results: When Re and SPM were evaluated separately, the accuracy of LIBSVM was 0.878 and that of REPTree was 0.622. When Re and SPM were evaluated in combination, the precision of models for SPM+Re, NSPM+Re, SPM+NRe, and NSPM+NRe was 0.878, 0.662, 0.774, and 0.778, respectively.
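For readers less familiar with the two metrics reported above, the sketch below shows how overall accuracy and per-class precision are conventionally computed from true and predicted labels. The helper names and the toy labels are assumptions for illustration; they are not the study's data or evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all cases whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_per_class(y_true, y_pred):
    """For each predicted class: correct predictions / all predictions of that class."""
    predicted = Counter(y_pred)
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {c: true_pos[c] / n for c, n in predicted.items()}


# Toy labels (made up for illustration only).
y_true = ["SPM+Re", "NSPM+Re", "NSPM+NRe", "NSPM+NRe", "SPM+NRe", "NSPM+Re"]
y_pred = ["SPM+Re", "NSPM+Re", "NSPM+NRe", "NSPM+Re", "SPM+NRe", "NSPM+Re"]

acc = accuracy(y_true, y_pred)
prec = precision_per_class(y_true, y_pred)
```

Accuracy summarizes performance over all cases, whereas precision is reported per category, which is why the combined evaluation lists one value for each of the four groups.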
Conclusions: Machine learning can be used to rank factors affecting tumor Re and SPM. In clinical practice, routine checkups are necessary to ensure early detection of new tumors. The success of prediction and early detection may be enhanced in the future by applying “big data” analysis methods such as machine learning.
Keywords: colorectal cancer, second primary malignancy, machine learning