
ACED: Revolutionizing Instrumental Analysis Using Foundation Models

NSF

open

About This Grant

Spectral analysis decodes the structures and properties of unknown molecules by analyzing how they respond to electromagnetic radiation at different wavelengths. It is crucial for scientific discovery and for practical applications in many fields, including materials manufacturing, drug design, food safety, explosive detection, and non-invasive diagnosis. Spectral analysis offers rapid, sensitive, non-destructive, and cost-effective identification of unknown molecules through their characteristic numerical signals, and in these respects can outperform traditional chemical analysis. However, translating numerical signals into molecular structures is currently resource-intensive and not user-friendly, because it often requires extensive trial and error and specialized training. This project aims to revolutionize spectral analysis by using state-of-the-art artificial intelligence (AI) to make it automatic, accelerated, and accurate. The project will treat spectral signals and molecular structures as two different "languages". Models developed in this project will automatically transform spectral signals and molecular structures into descriptions of molecules in the two languages and enable rapid conversion between the two descriptions using advanced AI-powered language translation tools. The resulting universal toolkit will simplify and streamline spectral analysis in practical scenarios and benefit applications in scientific research, national healthcare, national security, educational activities, and other domains. The primary intellectual contribution of this project is the development of a novel chemistry-informed, multi-modal, powerful, and flexible deep learning framework that realizes automatic, accelerated, and accurate end-to-end spectrum-to-structure translation. Investigators will adapt and leverage foundation models from the frontier of AI, especially pre-trained large language models (LLMs) built on the Transformer architecture.
This project will design an encoder-decoder architecture in which the spectrum encoder converts the input numerical spectral signals (e.g., wavenumber-absorbance pairs from infrared (IR) spectra and chemical shift-intensity pairs from nuclear magnetic resonance (NMR) spectra) into context vectors, and the structure decoder transforms these context vectors into an output molecular fingerprint that encodes the two-dimensional (2D) topological structures and three-dimensional (3D) spatial conformations of the target molecules. The encoder and decoder will be pre-trained on high-quality data sets of molecular spectra from experimental measurements and theoretical calculations, and then fine-tuned to improve performance. The project comprises three fundamental thrusts: (a) developing natural-language representations for both spectral signals and molecular structures that align with the architecture of foundation models, (b) designing multi-modal learning frameworks that use pre-trained foundation models as backbones and inject chemical constraints as domain-specific knowledge for end-to-end spectrum-to-structure translation, and (c) tailoring these multi-modal learning frameworks with chemistry- and data-informed schemes so that they fit the practical instrumental analysis pipeline in real-life scenarios. This project will demonstrate significance across a broad range of disciplines where spectral analysis is essential for identifying or recognizing single molecules or molecular mixtures, including chemistry, biology, medicine, pharmacology, astronomy, security, materials science, food science, and environmental science. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
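To make the idea of a spectral "language" concrete, the following is a minimal sketch of how wavenumber-absorbance pairs from an IR spectrum might be turned into a token sequence that a sequence-to-sequence model could read. The abstract does not specify a tokenization scheme, so everything here is an illustrative assumption: the function name `spectrum_to_tokens`, the token format (`wn<bin>`, `a<level>`, `<ir>`), the 400-4000 cm^-1 range, the bin counts, and the noise threshold are all hypothetical.

```python
from typing import List, Tuple


def spectrum_to_tokens(
    pairs: List[Tuple[float, float]],
    wn_min: float = 400.0,    # assumed lower wavenumber bound (cm^-1)
    wn_max: float = 4000.0,   # assumed upper wavenumber bound (cm^-1)
    wn_bins: int = 360,       # assumed number of wavenumber bins (10 cm^-1 each)
    abs_levels: int = 10,     # assumed number of relative-absorbance levels
    threshold: float = 0.05,  # assumed cutoff: drop points below 5% of the maximum
) -> List[str]:
    """Quantize (wavenumber, absorbance) pairs into discrete tokens.

    Hypothetical scheme: each retained point becomes two tokens, one for
    the wavenumber bin and one for the absorbance level relative to the
    strongest point in the spectrum.
    """
    tokens = ["<ir>"]
    strongest = max((a for _, a in pairs), default=0.0)
    if strongest <= 0.0:
        # Empty or all-zero spectrum: emit only the delimiters.
        return tokens + ["</ir>"]

    bin_width = (wn_max - wn_min) / wn_bins
    for wavenumber, absorbance in sorted(pairs):
        relative = absorbance / strongest
        if relative < threshold or not (wn_min <= wavenumber <= wn_max):
            continue  # skip weak points and points outside the assumed range
        wn_bin = min(int((wavenumber - wn_min) / bin_width), wn_bins - 1)
        level = min(int(relative * abs_levels), abs_levels - 1)
        tokens.append(f"wn{wn_bin}")
        tokens.append(f"a{level}")

    tokens.append("</ir>")
    return tokens


# Made-up example points: a strong band, a medium band, and a weak point.
example = [(2950.0, 0.5), (1715.0, 1.0), (3400.0, 0.02)]
print(spectrum_to_tokens(example))
```

In an encoder-decoder setup like the one described above, a sequence of this kind would be the encoder input, and the decoder would emit tokens of the molecular-structure description. Coarse binning loses resolution, so a real system would need to choose the representation with care; this sketch only illustrates the general shape of the idea.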

Focus Areas

biology, chemistry, education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $500K

Deadline

2027-06-30

Complexity

Medium