NIGMS - National Institute of General Medical Sciences
Proteins and their interaction networks play crucial roles in virtually all cellular processes. As such, knowledge of their functioning is essential for understanding the molecular basis of life. While high-throughput experimental methods are routinely used to characterize proteins and catalog their interactions, the overwhelming amount of genomic variation both within and across organisms, and in healthy and disease states, necessitates the development and use of efficient computational methods. Our overarching goal is to devise algorithms and machine learning methods that yield a predictive understanding of proteins and their specificities, interactions, and networks, and of how these attributes are altered by genetic variation. This is an especially exciting time to develop methods for analyzing protein sequences, as in recent years the field has been transformed by new artificial intelligence technologies, including protein language models that, akin to the progress in natural language processing, learn the “language” of proteins. In the next five years, we will combine these groundbreaking technologies with our years of domain expertise on proteins and their networks to tackle fundamental problems in three important areas. First, we will develop powerful new machine learning methods to predict DNA-binding specificities for broad classes of transcription factor proteins; such methods will newly enable the inference of regulatory interactions mediated by uncharacterized transcription factors or those mutated in disease, significantly advancing upon current work that is focused just on specific transcription factor families. Second, we will develop highly scalable, sequence-based approaches to predict the specific functional effects of variants within proteins; these approaches will be a great aid in interpreting the millions of coding variants observed across human populations and will be a step towards obtaining a mechanistic understanding of disease mutations.
Third, we will develop novel algorithmic and machine learning approaches to predict the targets of kinase proteins and the pathways they regulate; these methods will elucidate kinase signaling networks and help decipher the growing body of complex phosphoproteomics datasets. A final, cross-cutting goal of our research is to rigorously evaluate current protein language models in order to uncover their strengths and limitations, and to develop innovative strategies to improve their capacity to capture the syntax, grammar, and semantics of protein sequences. We will release open source software for all developed methods.
Up to $2.0M
2029-08-31
Dynamic Cognitive Phenotypes for Prediction of Mental Health Outcomes in Serious Mental Illness
NIMH - National Institute of Mental Health — up to $18.3M
COORDINATED FACILITIES REQUIREMENTS FOR FY25 - FACILITIES TO I
NCI - National Cancer Institute — up to $15.1M
Leveraging Artificial Intelligence to Predict Mental Health Risk among Youth Presenting to Rural Primary Care Clinics
NIMH - National Institute of Mental Health — up to $15.0M
Feasibility of Genomic Newborn Screening Through Public Health Laboratories
OD - NIH Office of the Director — up to $14.4M
WOMEN'S HEALTH INITIATIVE (WHI) CLINICAL COORDINATING CENTER - TASK AREA A AND A2
NHLBI - National Heart Lung and Blood Institute — up to $10.2M
Metal Exposures, Omics, and AD/ADRD risk in Diverse US Adults
NIA - National Institute on Aging — up to $10.2M