CIRC: Planning-C: Accelerating LLM Safety Research with Self-Evolving Evaluation Infrastructure

NSF

About This Grant

Large language models (LLMs) have transformed fields including education, healthcare, media, and national security, enabling powerful AI-driven applications that enhance productivity and decision-making. However, their widespread deployment has raised significant safety concerns about potential misuse, as they can be manipulated to generate harmful or misleading content. As AI safety threats and countermeasures evolve in a dynamic arms race, a critical gap remains: the lack of a standardized evaluation framework to systematically assess the true safety risks of LLMs. This project aims to close that gap by developing an open, community-driven evaluation infrastructure that engages researchers, practitioners, and policymakers. By advancing AI safety research, fostering public awareness, and strengthening workforce training in responsible AI practices, this initiative will support national interests in trustworthy AI, ultimately helping ensure that LLMs benefit society safely.

Building on the research and outreach expertise of the project team, this project focuses on four primary objectives to advance community efforts in LLM safety evaluation:

(i) conducting structured surveys and interviews with experts across computing disciplines to identify critical safety concerns, assess gaps in existing evaluations, and gather insights on infrastructure requirements;

(ii) organizing a workshop to facilitate discussion and advancement of evaluation strategies and to refine the design of a shared evaluation infrastructure;

(iii) developing a prototype evaluation infrastructure based on the gathered insights and iterative feedback, featuring a comprehensive suite that includes a data library, an LLM model zoo, an attack/defense repository, and an evaluator hub, ensuring both usability and scalability;

(iv) hosting an LLM safety challenge to engage the community in testing and improving the evaluation infrastructure.

Collectively, these efforts will establish a foundational framework that evolves with community needs, fostering deeper understanding, comprehensive evaluation, and continuous improvement of LLM safety. The outcomes of this project will advance not only LLM research but also the broader machine learning community. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Focus Areas

machine learning, education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $100K

Deadline

2026-06-30

Complexity
Medium
