ACED: Tail-aware Generative Modeling for Inverse Discovery of Molecules

NSF

open

About This Grant

Discovering new molecules with desired properties will address critical technological challenges ranging from energy storage to drug development. The traditional trial-and-error approach of creating and testing molecules is expensive, inefficient, and time-consuming. Likewise, it is too computationally expensive to rely on quantum mechanical calculations alone to adequately screen the vast space of possible molecules for desired properties. In contrast, generative modeling based on machine learning methods has the potential to efficiently navigate chemical space to discover molecules capable of addressing important problems. Generative modeling enables predicting the structure of molecules from the desired properties, a task known as inverse design. However, generative modeling approaches are hindered by data scarcity and struggle to accurately generate molecules when the desired properties lie outside the range of the training data. This issue is known in statistics as tail extrapolation. When trying to generate new molecules, researchers typically look for molecules with exceptional properties, which makes them rare and likely outside the range of the training data. This project will address this critical challenge via extrapolation-aware conditional molecule generation and experimental design methods. It will develop methods that generate novel molecules with desired properties, and these methods will be demonstrated on organic molecules that are useful for reduction-oxidation (redox) flow battery applications for energy storage. This research connects to training and mentorship at the University of Michigan and also promotes education across data science, statistics, computer science, and engineering. Students from Washtenaw Community College will be mentored each summer of the project as part of an eight-week summer research internship.
This project will develop extrapolation-aware conditional generative models. The key idea is to adapt pre-additive noise models, which can provably perform tail extrapolation ("tail-aware") in classical regression tasks, to a variety of conditional generative models. Further, this project will advance extrapolation-aware experimental design for conditional generative modeling by designing efficient continual updates for experimental design in generative models. Continual updates, by design, do not require retraining; rather, they only require updating with the newly acquired data points and are thus much faster. The methodologies developed will be validated on synthetic datasets of organic molecules and real-world datasets for organic molecule discovery for redox flow batteries using state-of-the-art equivariant generative models and large language models. This research focuses on developing novel and generalizable artificial intelligence techniques to accelerate scientific discovery. The proposed extrapolation-aware generative modeling and experimental design approaches are widely applicable to scientific problems involving the design of systems in small data regimes or with exceptional, rare desired properties. Thus, beyond making an impact on chemistry for molecule discovery, these proposed methods are useful for any generative modeling application that requires extrapolation. By developing a rigorous foundation for extrapolation-aware conditional generative modeling and experimental design with generative models, this project aims to make generative modeling far more reliable, enabling trustworthy predictions. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Focus Areas

computer science, machine learning, engineering, chemistry, education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $500K

Deadline

2027-06-30

Complexity
Medium