Background and Aim: Colorectal cancer is one of the most common gastrointestinal cancers and one of the leading causes of cancer death worldwide. A screening program matched to an individual's risk of colorectal cancer can help prevent the disease. Therefore, the purpose of this study was to design a risk-factor-based model for colorectal cancer screening, with the aim of increasing the survival rate and reducing the mortality rate.
Materials and Methods: By reviewing articles and patients' records, 38 risk factors were identified. The content validity ratio (CVR) was used to determine the most clinically important risk factors, and the Spearman correlation coefficient and logistic regression analysis were applied to the collected data for statistical analysis. Four algorithms (J-48, J-RIP, PART, and REP-Tree) were then used for data mining and rule generation. Finally, the most appropriate model was selected by comparing the performance of the algorithms.
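As an illustration of the CVR step, the sketch below uses Lawshe's widely used formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts who rate an item as essential and N is the panel size. The abstract does not state the panel size, the rating scale, or the retention cutoff, so the function name and the example numbers are hypothetical and are not taken from the study.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's content validity ratio for a single item.

    n_essential: number of experts rating the item "essential" (hypothetical input)
    n_experts:   total number of experts on the panel (hypothetical input)
    """
    if n_experts <= 0:
        raise ValueError("n_experts must be positive")
    if not 0 <= n_essential <= n_experts:
        raise ValueError("n_essential must be between 0 and n_experts")
    half = n_experts / 2
    # Ranges from -1 (no expert says essential) to +1 (all experts do).
    return (n_essential - half) / half


# Illustrative panel of 10 experts, 9 of whom rate a risk factor as essential.
example_cvr = content_validity_ratio(9, 10)
```

An item is usually retained when its CVR exceeds a critical value that depends on the panel size; that cutoff is not given here and would have to come from the full paper.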
Results: When the performance of the algorithms was compared, J-48 performed best, with an F-measure of 0.889.
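For readers unfamiliar with the metric, the F-measure is the harmonic mean of precision and recall. The sketch below computes it from confusion-matrix counts. The counts in the example are invented for illustration only and do not reproduce the study's data, and the abstract does not say whether the reported 0.889 is a per-class or a weighted-average value.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives.

    With beta = 1 this is the usual F1 score (harmonic mean of precision and recall).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the paper.
example_f1 = f_measure(tp=8, fp=2, fn=2)
```

Because the F-measure balances precision and recall, it is a reasonable single figure for ranking classifiers in a screening setting, where both missed cases and false alarms carry a cost.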
Conclusion: The performance evaluation showed that the J-48 data mining algorithm could be considered the most appropriate of the four models for colorectal cancer risk prediction.