DeepBatch: a hybrid deep learning model for an interpretable diagnosis of breast cancer in whole-slide images
Description
CONTEXT: Advances in molecular studies and genetic sequencing have brought significant progress in understanding the behavior, treatment, and prognosis of breast cancer. However, early-stage diagnosis of breast cancer remains essential for successful treatment. Currently, the gold standard for breast cancer diagnosis, treatment, and management is the histological analysis of a suspect tissue section. Histopathology consists of analyzing the characteristics of lesions in tissue sections stained with Hematoxylin and Eosin. Pathologists, however, currently face high workloads, largely because of the fundamental role histological analysis plays in patient treatment. In this context, applications that reduce the time of histological analysis, provide a second opinion, or even flag suspicious regions as a screening tool can assist the pathologist.

OBJECTIVE: We address two main challenges: first, how to identify cancerous regions in Whole Slide Imaging (WSI) using Deep Learning (DL) with an accuracy comparable to pathologists' annotations, considered the gold standard in the literature; and second, how a DL-based model can provide an interpretable diagnosis. The scientific contribution consists in proposing a model based on Convolutional Neural Networks (CNNs) to provide a refined, multi-class segmentation of breast cancer WSIs.

METHODOLOGY: The methodology consists of proposing and developing a model called DeepBatch. DeepBatch is divided into four modules: Preprocessing, ROI Detection, ROI Sampling, and Cell Segmentation. These modules are organized to decode the information learned by the CNNs into predictions interpretable by pathologists. The Preprocessing module removes background and noise from the WSI. In ROI Detection, we use the U-Net convolutional architecture to identify suspicious regions in the WSI at low magnification.
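The four-module cascade named above can be sketched as a chain of plain functions. This is a hypothetical illustration, not the authors' implementation: all names (`preprocess`, `detect_rois`, `sample_rois`, `segment_cells`) and the tile dictionaries are stand-ins, and the two CNN stages are replaced by trivial placeholders where trained U-Net and ResNet50/U-Net models would run.

```python
# Hypothetical sketch of the DeepBatch four-module cascade; the function
# names, tile format, and logic are illustrative stand-ins only.

def preprocess(tiles):
    """Preprocessing: discard background/noise tiles."""
    return [t for t in tiles if not t.get("background", False)]

def detect_rois(tiles):
    """ROI Detection: stand-in for the low-magnification U-Net,
    which would score each tile; here we keep pre-flagged tiles."""
    return [t for t in tiles if t.get("suspicious", False)]

def sample_rois(rois, low_mag=5, high_mag=40):
    """ROI Sampling: map low-magnification coordinates to 40x."""
    scale = high_mag // low_mag
    return [{"x": r["x"] * scale, "y": r["y"] * scale} for r in rois]

def segment_cells(patches):
    """Cell Segmentation: stand-in for the ResNet50/U-Net that would
    produce a per-pixel multi-class mask for each 40x patch."""
    return [{"patch": p, "mask": "per-pixel class map"} for p in patches]

def deepbatch(tiles):
    """Run the full cascade on a list of low-magnification tiles."""
    return segment_cells(sample_rois(detect_rois(preprocess(tiles))))
```

For example, a suspicious tile at (3, 7) on a 5x grid is mapped to (24, 56) at 40x before refined segmentation; background tiles never reach the CNN stages.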
Suspicious areas identified at low magnification are mapped by ROI Sampling to 40× magnification, and Cell Segmentation then segments these high-magnification areas using a ResNet50/U-Net. To validate DeepBatch, we use datasets from different sources that can be used together or separately in each module, depending on the module's objective.

RESULTS: The evaluations performed demonstrate the feasibility of the model. We assessed the impact of four color spaces (RGB, HSV, YCrCb, and LAB) on the multi-class segmentation of breast cancer WSIs, using 205 breast cancer WSIs for training, validation, and testing. For the detection of suspicious regions by ROI Detection, we obtained an IoU of 93.43%, accuracy of 91.27%, sensitivity of 90.77%, specificity of 94.03%, F1-Score of 84.17%, and an AUC of 0.93. For the refined segmentation of the WSI by the Cell Segmentation module, we obtained an IoU of 88.23%, accuracy of 96.10%, sensitivity of 71.83%, specificity of 96.19%, F1-Score of 82.94%, and an AUC of 0.86.

CONCLUSION: As a contribution, DeepBatch provides refined segmentation of breast cancer WSIs using a cascade of CNNs. This segmentation helps the pathologist interpret the diagnosis by accurately presenting the regions considered during inference on the WSI. The results indicate that the model could be used as a second-reading system.

CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
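All of the reported segmentation metrics (IoU, accuracy, sensitivity, specificity, F1-Score) can be derived from per-pixel confusion counts. A minimal sketch, with a function name of my own choosing rather than the authors' evaluation code:

```python
def pixel_metrics(tp, fp, fn, tn):
    """Segmentation metrics from per-pixel confusion counts:
    true positives, false positives, false negatives, true negatives."""
    iou = tp / (tp + fp + fn)                    # intersection over union
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)                 # recall on positive pixels
    specificity = tn / (tn + fp)                 # recall on negative pixels
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"iou": iou, "accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}
```

For instance, 80 true-positive, 10 false-positive, 10 false-negative, and 900 true-negative pixels give an IoU of 0.8 and an accuracy of 0.98, illustrating why accuracy can sit well above IoU when negatives dominate, as in the Cell Segmentation results.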
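The color-space comparison above hinges on a per-pixel conversion step. The RGB→HSV case can be illustrated with the standard library alone; this is only an illustration of the conversion, not the authors' preprocessing code, and YCrCb or LAB conversions would require a library such as OpenCV or scikit-image.

```python
import colorsys

def rgb_tile_to_hsv(tile):
    """Convert a tile of (r, g, b) floats in [0, 1] to (h, s, v) triples,
    with hue also scaled to [0, 1]. Illustrative RGB->HSV step only."""
    return [colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in tile]
```

For example, a pure-red pixel (1.0, 0.0, 0.0) maps to hue 0 with full saturation and value, so stain intensity and chromaticity end up in separate channels, which is the motivation for testing non-RGB spaces.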