G.S. Ivanova – Dr.Sc.(Eng.), Professor, Department IU-6, Bauman Moscow State Technical University
A.A. Golovkov – Ph.D.(Eng.), Director, LLC «Institute for Development of Digital Economy»
A.V. Umnov – Developer, LLC «Yandex»
Ya.S. Petrova – Student, IU-6, Bauman Moscow State Technical University
A.P. Borodin – Student, IU-6, Bauman Moscow State Technical University
A.O. Shakhlan – Student, IU-6, Bauman Moscow State Technical University
K.A. Lonshakova – Student, IU-6, Bauman Moscow State Technical University
V.V. Kelenin – Junior Research Scientist, Russian Scientific Center of Roentgenoradiology (Moscow)
This article describes an approach to the automatic diagnosis of breast diseases from X-ray mammography images, based on a convolutional neural network with multiple inputs. An image preprocessing algorithm for building representative training datasets is proposed; it includes filtering, removal of text labels, and brightness enhancement. Several modifications of the neural network are developed and trained to classify images into four diagnoses: dysplasia, a precancerous condition, cancer, and benign tumor. A set of classification quality metrics is proposed, and the diagnostic efficiency of each model is calculated. A comparative analysis identifies how the neural network architecture affects classification quality; the most effective of the networks considered recognizes dysplasia in an image with 72% accuracy.
According to a 2018 report from the International Agency for Research on Cancer, breast cancer is the most commonly diagnosed cancer and the leading cause of cancer mortality among women. Early detection of breast cancer is therefore a problem of current interest.
One of the primary methods for detecting breast diseases is mammography. Recognizing cancer on a mammogram is a complex process that carries a risk of misdiagnosis. Automating image analysis with machine learning methods can reduce the influence of the human factor. The purpose of this work is to investigate whether machine learning methods can be used to classify breast images.
Existing neural network classification methods in this area are reviewed. A neural network architecture that analyzes two mammographic projections was developed to classify images into four diagnoses: dysplasia, a precancerous condition, cancer, and benign tumor. Five modifications of the base model were also proposed in order to find an efficient architecture for recognition.
To increase the representativeness of the training datasets, an image preprocessing algorithm was developed. It consists of a function for detecting non-mammogram images and procedures for removing text labels, compressing the image, and increasing its contrast. The proposed models were trained, and their classification quality was evaluated using the proposed set of performance metrics.
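The compression and contrast steps of such a pipeline can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses only the Python standard library, represents a grayscale image as a list of rows, and substitutes a simple global percentile contrast stretch for whatever contrast method the article actually applies. The function names, the block-averaging compression, and the percentile defaults are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image: rows of 0..255 intensities (assumed format)


def downsample(img: Image, factor: int) -> Image:
    """Compress the image by averaging non-overlapping factor x factor blocks."""
    rows = len(img) // factor
    cols = len(img[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                img[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def stretch_contrast(img: Image, low_pct: float = 2.0, high_pct: float = 98.0) -> Image:
    """Linearly map the [low_pct, high_pct] percentile range onto 0..255."""
    values = sorted(v for row in img for v in row)
    lo = values[int((len(values) - 1) * low_pct / 100)]
    hi = values[int((len(values) - 1) * high_pct / 100)]
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = 255 / (hi - lo)
    # Values outside the percentile range are clipped to 0 or 255.
    return [
        [min(255, max(0, round((v - lo) * scale))) for v in row]
        for row in img
    ]


def preprocess(img: Image, factor: int = 2) -> Image:
    """Hypothetical pipeline order: compress first, then enhance contrast."""
    return stretch_contrast(downsample(img, factor))
```

Compressing before stretching keeps the percentile computation cheap on large mammograms; in practice a library routine for local histogram equalization would probably replace the global stretch shown here.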
Our analysis showed that the best model detects dysplasia with F1 = 0.8, and that the proposed probabilistic generator is more effective for training neural networks for classification in this area. It was experimentally determined that logistic regression is not applicable to this task, as it quickly overfits to the images of a single class. It was also established that analyzing two projections of the breast, rather than one, leads to more accurate classification.
Prospects for further research include supplementing the training dataset with new samples to improve the quality of diagnosing precancerous conditions, cancer, and benign tumors. In addition, visualizing feature maps and heat maps of network layer activations, followed by their analysis, can help refine the model and increase overall classification efficiency.
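For readers unfamiliar with the F1 measure quoted above, the following sketch shows how a per-class F1 score can be computed for a four-class problem of this kind, using F1 = 2TP / (2TP + FP + FN). It assumes plain string labels and one predicted label per image; the class names mirror the four diagnoses, while the function name and the data layout are illustrative assumptions rather than the article's actual evaluation code.

```python
from collections import Counter
from typing import Dict, Sequence

# The four diagnoses named in the article, used here as plain string labels.
CLASSES = ("dysplasia", "precancerous condition", "cancer", "benign tumor")


def per_class_f1(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    classes: Sequence[str] = CLASSES,
) -> Dict[str, float]:
    """Return the one-vs-rest F1 score for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # A class that never appears in labels or predictions scores 0.0 by convention.
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores
```

Reporting the score per class, rather than a single overall accuracy, makes it visible when a model performs well on one diagnosis and poorly on the others.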
- Freddie Bray, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries // CA: A Cancer Journal for Clinicians. 2018. P. 1−31. URL = https://onlinelibrary.wiley.com/doi/full/10.3322/caac.21492 (Accessed: 12.09.2018).
- Sostoyanie onkologicheskoj pomoshhi naseleniyu Rossii v 2017 g / Pod red. A.D. Kaprina, V.V. Starinskogo, G.V. Petrovoj. M.: MNIOI im. P.A. Gerczena – filial FGBU «NMICz radiologii» Minzdrava Rossii. 2018. 236 s. URL = http://www.oncology.ru/service/statistics/condition/2017.pdf (Data obrashheniya: 12.09.2018).
- Olaf Ronneberger, Philipp Fischer, Thomas Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation // arXiv preprint arXiv:1505.04597v1. 2015.
- The DICOM Standard. URL = https://www.dicomstandard.org/current/ (Accessed: 12.09.2017).
- Mediczinskaya informaczionnaya sistema MGERM. URL = http://www.mgerm.ru/ (Data obrashheniya: 10.10.2018).
- Czifrovoj arxiv mediczinskix izobrazhenij PACS. URL = http://www.povidar.ru/products.html (Data obrashheniya: 10.10.2018).
- Sady’kov S.S., Bulanova Yu.A., Zaxarova E.A. Avtomatizirovannaya obrabotka i analiz mammograficheskix snimkov. Vladimir: Izd-vo VlGU. 2014. 208 s.
- Fisher R.A. The Use of Multiple Measurements in Taxonomic Problems // Annals of Eugenics. 1936. V. 7. Part II. P. 179−188.
- Danczaranova L.O. Algoritm komp’yuternogo obucheniya raspoznavaniya rakovy’x opuxolej s pomoshh’yu mammogramm na osnove nejronny’x setej. URL = http://library.eltech.ru/files/vkr/2017/bakalavri/3502/2017VKR350214DANCzARANOVA.pdf (Data obrashheniya: 03.09.2017).
- Mohammed J. Islam, Majid Ahmadi, Maher A. Sid-Ahmed. Computer-Aided Detection and Classification of Masses in Digitized Mammograms // Lecture Notes in Computer Science. V. 6146. URL = http://cybertron.vlsi.uwindsor.ca/presentations/2009/Computer-Aided%20Detection%20and%20Classification%20of%20Masses%20in%20Digitized%20Mammograms.pdf (Accessed: 03.09.2017).
- High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks / Krzysztof J. Geras [et al.] // arXiv preprint arXiv:1703.07047v3. 2018.
- OpenCV packages for Python. URL = https://pypi.org/project/opencv-python/ (Accessed: 10.10.2017).
- A tutorial on CLAHE. URL = https://docs.opencv.org/3.1.0/d5/daf/tutorial_py_histogram_equalization.html (Accessed: 10.10.2017).
- Sergey Ioffe, Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. URL = https://arxiv.org/abs/1502.03167 (Accessed: 20.11.2017).
- Timothy Dozat. Incorporating Nesterov Momentum into Adam / Stanford University // Tech. Rep. 2015. URL = http://cs229.stanford.edu/proj2015/054_report.pdf (Accessed: 17.10.2017).
- Pieter-Tjerk de Boer, et al. A Tutorial on the Cross-Entropy Method // Annals of Operations Research. 2005. V. 134. № 1. P. 19−67.
- Keras: The Python Deep Learning library. URL = https://keras.io/ (Accessed: 20.09.2017).
- Pepe, Margaret S. The statistical evaluation of medical tests for classification and prediction. New York. NY: Oxford. 2003.