Among all fatal diseases, cancer remains the most dreaded because of its imperfect diagnostics, often unnoticed development, and the variety of forms it can take. Commonly observed types include breast cancer, leukaemia (blood cancer), lung cancer, and colon cancer. Current cancer screening focuses on two main tasks: detecting the presence of cancerous cells and identifying the stage of cancer development. As in any other medical screening, a cancer diagnosis can be wrongly assigned because of manual errors or misinterpretation of the data.
To avoid these errors, biotech experts, in close collaboration with IT and ML researchers, propose software solutions that support decision-making in cancer screening and treatment. Self-learning models can help here. Some neural network (NN) studies already report over 96% accuracy in cancer prediction based on clinical data. But how does it work?
Identifying Cancer Driver Genes
Medicine already knows about cancer driver genes, and machine learning is widely used to detect patterns in genomic data. Computer-based methods are essential for analyzing data of this kind because the risk of human error is high. One approach to detecting cancer driver genes is based on mutation frequency. More precisely, when a genetic mutation is followed by the uncontrolled growth of a tumor, this indicates that the cancer may be an outcome of that mutation. Researchers use classification methods to label mutated genes as either cancer driver genes or passenger genes. Random forest is considered one of the best algorithms for this purpose, showing over 92% accuracy in research papers. The most promising outcome of this kind of research is the hope of more personalized cancer treatment, since the specific cancer-causing genes can be detected from the DNA sequencing of a particular patient.
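As a rough illustration of the frequency-based idea, the sketch below counts how often each gene is mutated across a small cohort and flags the genes that exceed a cutoff as candidate drivers. The function names, the 0.5 cutoff, and the toy cohort are all assumptions made for this example. Real methods are more involved (for instance, they account for background mutation rates), and this is not the random forest classifier mentioned above.

```python
from collections import Counter


def mutation_frequencies(samples):
    """Return the fraction of samples in which each gene is mutated."""
    counts = Counter()
    for mutated_genes in samples:
        # Count a gene at most once per sample, even if listed twice.
        counts.update(set(mutated_genes))
    total = len(samples)
    return {gene: n / total for gene, n in counts.items()}


def candidate_drivers(samples, min_fraction=0.5):
    """Flag genes mutated in at least `min_fraction` of the samples.

    The fixed cutoff is an illustrative assumption, not a published rule.
    """
    freqs = mutation_frequencies(samples)
    return sorted(g for g, f in freqs.items() if f >= min_fraction)


# Invented toy cohort: each inner list holds the mutated genes of one sample.
samples = [
    ["TP53", "KRAS", "GENE_A"],
    ["TP53", "GENE_B"],
    ["TP53", "KRAS"],
    ["KRAS", "GENE_C"],
]

print(candidate_drivers(samples))  # ['KRAS', 'TP53']
```

Genes that appear in only one sample fall below the cutoff and would be treated as passenger candidates in this simplified view.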
Better Classification of the Manually Labeled Clinical Data
Clinical datasets contain large volumes of data collected in real-world settings, and much of it does not necessarily have predictive power. The first stage of machine learning research therefore always involves identifying the features that are most useful for building predictions. Some researchers aggregate data on several cancer types and build universal screening models. However, feature selection has to be approached separately for each cancer-type dataset, which is why the data selection stage may take a while.
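One simple way to check whether a clinical variable carries predictive signal is to rank variables by the absolute value of their correlation with the diagnosis label. The sketch below does this with a hand-written Pearson correlation. The feature names and values are invented for illustration, and correlation ranking is only one of many possible selection strategies, not necessarily the one used in the studies referred to here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (sd_x * sd_y)


def rank_features(columns, labels):
    """Return (name, |correlation|) pairs, strongest first."""
    scores = [(name, abs(pearson(values, labels)))
              for name, values in columns.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy dataset: 1 = malignant, 0 = benign.
labels = [0, 0, 1, 1, 1, 0]
columns = {
    "room_temperature_c": [21, 22, 21, 22, 21, 22],
    "scanner_id": [1, 1, 1, 1, 1, 1],
    "tumor_size_mm": [5, 7, 20, 25, 30, 6],
}

for name, score in rank_features(columns, labels):
    print(f"{name}: {score:.2f}")
```

In this toy data the tumor size ranks first, the environmental reading ranks well below it, and the constant column scores zero, which mirrors the point that not every recorded variable is worth keeping.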
Recent comparative research has evaluated the most common machine learning algorithms in terms of accuracy, precision, recall, and F-score. As in several other biotech studies, multilayer neural networks outperform other algorithms in cancer prediction. More specifically, multilayer neural networks perform best at analyzing electronic health records to support diagnosis and the assignment of further treatment, reducing the chances of human error and misinterpretation.
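For readers unfamiliar with these four measures, the following minimal sketch computes them for a binary classifier from true and predicted labels. The function name and the toy labels are assumptions for illustration, and the F-score shown is the common F1 variant (the harmonic mean of precision and recall).

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented example: 1 = cancer, 0 = no cancer.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(classification_metrics(y_true, y_pred))
```

In screening, recall matters in particular because a false negative means a missed cancer, which is why accuracy alone is rarely reported on its own.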
Image Processing for Cancer Prediction
Medical imaging helps detect abnormal changes in organ structures, such as the growth of cancerous cells and tumors. Machine learning works on real-world datasets of medical images and follows a four-step procedure.
- The algorithms ingest pre-processed real-world data labeled with a particular type of cancer.
- The data is segmented. Each tumor type has segmentation approaches that suit it best; for example, fuzzy C-means, k-means clustering, and Otsu thresholding work best for brain tumor segmentation.
- The model is trained.
- The output classifies the images as showing benign or malignant tumors.
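To make the segmentation step more concrete, here is a minimal pure-Python sketch of Otsu thresholding, one of the methods named above. It picks the intensity cutoff that maximizes the between-class variance of the two resulting pixel groups. The function names are assumptions for this example, the input is assumed to be a flat list of 8-bit grayscale intensities, and the pixel values are invented. The training and classification steps are not covered.

```python
def otsu_threshold(pixels, levels=256):
    """Return threshold t; pixels <= t are background, > t are foreground."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break     # no foreground pixels left
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_between:
            best_between, best_t = between, t
    return best_t


def segment(pixels, threshold):
    """Binary mask: 1 for pixels brighter than the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Invented intensities: a dark background with a few bright pixels.
pixels = [10, 12, 11, 13, 200, 210, 205, 12, 11, 208]
t = otsu_threshold(pixels)
print(t, segment(pixels, t))
```

On a real scan the same idea is applied to the full two-dimensional image, usually through an imaging library rather than hand-written loops.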
Deep learning algorithms demonstrate precision and accuracy scores of over 97% for breast cancer and leukaemia classification based on image processing. One of the hurdles to further improvement is processing time. Deep learning algorithms take longer to generate outputs, and this may increase the need for computing resources in the future, because the volume of cancer data grows from year to year.
Today’s machine learning research on cancer screening and treatment has already brought about a revolution in cancer research. However, the algorithms for each cancer type are not yet mature enough to let people forget about mortality from malignant tumors. More research should be devoted to revealing how tumor growth works, what the peculiarities of complex forms of cancer are, which methods of clinical trial design are the most innovative, and more.