Restricted Research - Award List, Note/Discussion Page

Fiscal Year: 2021

253  University of North Texas  (84549)

Principal Investigator: Guo,Xuan

Total Amount of Contract, Award, or Gift (Annual before 2011): $ 361,302

Exceeds $250,000 (Is it flagged?): Yes

Start and End Dates: - 1/31/23

Restricted Research: YES

Academic Discipline: Computer Science & Engineering

Department, Center, School, or Institute: College of Engineering

Title of Contract, Award, or Gift: A Computational Framework for Protein Identification and Quantification in Metaproteomics Using Data-Independent Acquisition

Name of Granting or Contracting Agency/Entity: National Institutes of Health

Program Title: N/A
CFDA Linked: Medical Library Assistance


1.1.1 (SAM); Extensive efforts to characterize the human microbiome have greatly increased knowledge about the diversity of the microbiome and its composition in health and disease. Dysbiosis in the human microbiota underlies the development of many diseases, such as obesity, diabetes, and inflammatory bowel disease. Metaproteomics based on mass spectrometry (MS) has become widely used in microbiome research for gaining insights into the functional states of microbial communities. Mass spectrometry with data-dependent acquisition (DDA) is the most common method for identifying and quantifying microbial proteins in metaproteomics, but this technique is fundamentally limited in reproducibility and comprehensiveness. Proteomics using data-independent acquisition (DIA) can, in theory, resolve the fundamental problems associated with the DDA method. However, the lack of bioinformatics tools still presents unresolved challenges for DIA, and only a few DIA applications to microbiome or host-microbe interactions have been reported. MS-based metaproteomics is a challenging measurement because of its high complexity, with thousands of species at vastly different abundances. A comprehensive characterization of the functional state of microbial communities requires considering proteins not just from dominant microorganisms but also from low-abundance microorganisms. This proposal addresses the need for identifying and quantifying proteins by providing a set of computational tools that use DIA data to identify and quantify peptides and their variants at the microbial strain level. False peptide identifications are controlled by newly proposed methods for false discovery rate assessment at multiple granularities. Protein inference and quantification are optimized by linear programming models that incorporate information from genome/transcriptome sequencing data and metaproteome sample replicates.
These improvements will increase the number of identified protein variants, especially those from low-abundance microorganisms, which can help accurately characterize the functional composition of microbial communities and reveal functional redundancy.

Discussion: No discussion notes

