EAGER: Development of an Experimental Basis for Large Language Model (LLM) Hypothesis Generation toward Materials Design

NSF

open

About This Grant

PART 1: NON-TECHNICAL SUMMARY

Large Language Models (LLMs) are an advanced artificial intelligence method to generate innovative ideas from the processing of copious volumes of information at speeds for which human engagement is simply infeasible, if not impossible. This research project focuses on developing a framework for extracting valuable information from non-text-based sources to improve, grow and enhance LLMs with a focus on the design of new metallic systems with superior properties and performance. By helping LLMs to ingest and learn from data formats such as graphs, tables, microscopy images and other rich, layered and complex scientific data, this project is using LLMs to uncover relationships in materials that might be overlooked or never discovered by relying solely on textual data from scientific papers, written studies or textbooks. This work supports the realization of broadly democratizing the design process for materials investigations while fostering innovations that could surpass materials performance limits as commonly held today. In this way, this work aligns well with NSF’s mission to promote the progress of science. This project also coincides with the goals of the U.S. Materials Genome Initiative by harnessing the power of materials data and developing a skilled workforce to drive innovation and strengthen U.S. competitiveness in materials science.

PART 2: TECHNICAL SUMMARY

This project aims to establish an effective framework to ingest, inform and leverage multimodal data directly from experiments to advance large language models (LLMs) for hypothesis generation toward the design of new and superior metallic alloys. LLMs have significant potential to produce novel and innovative design hypotheses by integrating textual information across extensive domains of literature at speeds and volumes far beyond human capacity. An even more revolutionary opportunity, however, lies in expanding the variety of formats of information available for LLMs to learn from. Presently, LLMs are restricted largely to data contained in text. This research project is actively expanding this to also include the rich, complex, layered and multimodal data found in materials science experiments including, but not limited to: segmented, annotated, quantified and labeled micrographs from SEM and TEM; x-ray diffraction patterns; graphical constructs; and even images. To achieve this aim, this project is building and training enhanced LLMs for the generation of metallurgical design inquiries through the use of high-quality, high-provenance, context-rich metallurgical dataset standards derived from a variety of sources, including the research team’s own experimental data and datasets obtained from open-source literature, with all the metadata and context required to ensure high confidence in interpretability and repeatability. This work supports the realization of broadly democratizing the design process for materials investigations while fostering innovations that could surpass materials performance limits as commonly held today. In this way, this work aligns well with NSF’s mission to promote the progress of science. This project also coincides with the goals of the U.S. Materials Genome Initiative by harnessing the power of materials data and developing a skilled workforce to drive innovation and strengthen U.S. competitiveness in materials science.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Focus Areas

research

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $300K

Deadline

2027-01-31

Complexity

Medium