MultiVarNet
Predicting Tumour Mutational Status at the Protein Level
Summary
Deep learning research in medical image analysis has demonstrated that molecular information, including tumour mutational status, can be predicted from the cell and tissue morphology captured in standard histology images.
While this capability holds the promise of revolutionising pathology, it is critical to go beyond gene-level mutations and develop methodologies capable of predicting precise variant mutations. Only then will it be possible to support important clinical applications, including specific targeted therapies. To address this need, we developed MultiVarNet, which allows us to decipher complex genomic patterns and facilitates precise predictions of hotspot alterations at the protein level. For the first time, we demonstrate that over 20 mutation variants across major oncogenes can be identified from histology images. This study introduces a novel approach that underscores the importance of incorporating the underlying molecular biology of tumours to enhance algorithm accuracy, moving us towards more personalized and advanced targeted treatment options for patients.
Introduction
With an estimated 9.6 million deaths in 2018, or one in six deaths worldwide according to the WHO, cancer remains a significant global health challenge because of the heterogeneity and intricacy of the disease. There has been significant progress in precision medicine, which aims to provide more effective therapies by targeting specific molecular tumour profiles. Molecular diagnostic tools, such as DNA sequencing, RNA quantification or methylation profiling, can reveal specific molecular characteristics of tumours. These techniques are crucial because they determine access to molecularly guided treatment options (MGTOs), which are specifically designed to target these precisely identified alterations. However, these tests require large tumour samples, have long waiting times, and are costly. For instance, Gondos and colleagues found that, because of these limitations, almost a quarter of patients with newly diagnosed advanced non-small cell lung cancer (NSCLC) in their large study did not receive 'gold-standard' genomic testing for any of the four guideline-recommended therapeutic targets (ALK, BRAF, EGFR, and ROS1 alterations) before first-line treatment. Consequently, there is increasing demand for alternatives to conventional molecular profiling methods that can meet the pressing need for comprehensive testing of molecular alterations.
Meanwhile, a growing body of evidence supports the use of deep learning to analyze hematoxylin and eosin (H&E)-stained histopathology images and infer molecular information, demonstrating state-of-the-art performance in predicting outcomes and relevant biomarkers. These methods have been applied to predict single somatic mutations, copy number variations, molecular subtypes, RNA expression, and prognosis. However, despite their success in identifying overall gene mutation status, the precise prediction of the protein-level consequences of specific mutations within a gene has yet to be explored. This gap is significant, as variations at the protein level often hold greater clinical relevance. These specific alterations, hereafter referred to as variants, dictate access to targeted therapies such as sotorasib and adagrasib. These drugs target the p.G12C variant of KRAS in NSCLC, highlighting the importance of analyzing mutations at the variant level rather than solely at the gene level.
We address the need to predict specific cancer-associated mutations and variants across 11 cancer types, leveraging deep learning analysis of digitized H&E slides. Our approach not only demonstrates the potential to accurately predict variant alterations, often surpassing gene-level mutation accuracy, but also introduces a novel label-engineering paradigm that exploits the unique morphological signatures of these variants. Indeed, current developments in deep learning methodologies primarily focus on enhancing architectures and training processes, often overlooking the significance of available biological information. Our MultiVarNet method illustrates that this information can be leveraged to enhance mutation prediction accuracy, offering a novel direction for advancement in the field of histopathology.