Marie A Brunet, DVM, PhD
Assistant Professor
Medical Genetics Service, Department of Pediatrics
Faculty of Medicine and Health Sciences
University of Sherbrooke, Sherbrooke, Canada
Centre de Recherche du CHUS
Non-canonical translation: MS-based proteomics can rise to the challenge
Abstract
Thousands of functional coding sequences have eluded annotation. These overlooked coding sequences are often small and located at unsuspected genomic loci. Several approaches have been developed over the last decade to identify these non-canonical coding sequences. However, due to experimental limitations and inherent biases, we most likely underestimate the proteomic landscape, and our understanding is limited by the sensitivity and specificity of our methodologies. Because accurate annotation of functional elements holds crucial implications for clinical and fundamental research, there is a dire need for tools allowing an exhaustive evaluation of our proteomes. The astounding success of deep learning on sequence modeling and pattern recognition tasks, combined with high-quality omics data, provides hope for reliable and exhaustive investigation of our proteomes.
Here, we will focus on three current challenges for the detection of non-canonical proteins: (a) generating an exhaustive and accurate protein repository; (b) detecting unique peptides to deconvolute protein inference; and (c) considering protein modification at a proteogenomic scale.
(a) We will present FOMOnet (Fear-Of-Missing-ORFs network), a deep learning model that performs end-to-end transcript segmentation with single-base resolution. Our model vastly outperforms current methodologies and predicts an additional 47,000 coding sequences in the human genome. FOMOnet helps in building exhaustive personalized protein databases.
(b) We will present RTAP (Real-Time Adaptive Proteogenomics), a machine-learning-based pipeline that processes mass spectrometry data in real time. The real-time analysis makes it possible to inform the instrument during acquisition and to favor the identification of unique peptides, optimizing the use of instrument resources to deconvolute protein groups.
(c) Finally, we will present OUI-Discovery (OpenProt using Ionbot), a methodology that enables open modification searches on large proteogenomic databases. Our approach significantly increases the identification rate and the reproducibility of identifications, and fosters robust identification of non-canonical proteins.
Biosketch
Prof. Brunet qualified as a Doctor of Veterinary Medicine and later completed her PhD at the University of Cambridge under a prestigious Gates Cambridge scholarship. She did her postdoc in Biochemistry and Functional Genomics at the University of Sherbrooke before establishing her own research group there in 2021. Her research focuses on non-canonical translation and its impact on our biological systems. With collaborator Prof. Roucou, she develops and manages the OpenProt resource, the first proteogenomics resource endorsing a polycistronic annotation of eukaryotic genomes. She is an executive member of the international Ribo-seq ORF consortium, which aims to foster genomic annotation of non-canonical coding sequences, and she leads the non-canonical Proteome Project (ncPP) HUPO initiative. She uses both fundamental and computational research methods to better explore biological data and understand human diseases. She holds a research chair in Multi-omics and Deep Learning applied to pediatric diseases and an FRQS Junior 1 career grant recognizing her leadership at the intersection of health sciences and artificial intelligence. She is the 2024 CNPN ECR awardee, recognized for her innovative contributions to proteomics.
Date
March 18, 2025
6:00 pm - 8:00 pm
Location
Université de Montréal - Campus MIL (beer and pizza at 6:00 pm, conference at 7:00 pm in room A-4502)