NSF
Machine learning (ML) technologies continue to permeate diverse aspects of the world, shaping everything from personalized recommendations to critical decisions in healthcare, finance, and legal systems. The ability of these systems to make accurate and interpretable predictions is therefore of paramount importance. However, contemporary ML models, such as large-scale neural networks, often struggle with robust generalization: they fail to maintain consistent performance on unseen and unexpected (i.e., out-of-distribution) data. Furthermore, as model structures grow more complex and training data grows in volume, it is often challenging to interpret models' decision-making processes. The triple challenges of accuracy, robust generalization, and interpretability constitute significant barriers to the trustworthiness of ML models, raising crucial questions about their wide real-world application. Motivated by the need to address these concerns, this project outlines a concerted plan on three aspects of trustworthy ML (group fairness, robust generalization, and data attribution) via the algorithmic paradigm of post-processing. The outcomes of this project will be integrated into both undergraduate and graduate courses in trustworthy ML to bolster the technical course material, making it available to all students.
The technical aims of the project comprise three key thrusts: (1) Group fairness, which aims to develop a unified framework for understanding and analyzing the trade-offs among statistical parity, equalized odds, and model accuracy in classification tasks, leading to novel algorithmic solutions that achieve optimal trade-offs; (2) Robust generalization, which focuses on developing theories and algorithms to ensure that ML models generalize well across diverse tasks and domains, particularly under distribution shifts; and (3) Data attribution, which seeks to provide an efficient and principled approach to explaining the decision-making process of complex models by attributing a model's predictions to its training data, thereby enhancing interpretability and trustworthiness. All the proposed research will be conducted through the lens of post-processing techniques, which are practical and scalable to large-scale models, including large language models (LLMs). The proposed methods will be tested using the open-source LLM framework LMFlow, ensuring practical application and community accessibility. The team will share all research outcomes through open-source software packages and create new tools for broader access. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
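To make the post-processing paradigm concrete, the sketch below shows one simple instance of fairness post-processing: given a trained classifier's scores and a group label for each example, pick a per-group decision threshold so that every group's positive-prediction rate lands near a common target, which is the statistical-parity criterion mentioned above. This is a minimal illustration under assumed inputs (uniform synthetic scores, two groups, a hypothetical `statistical_parity_thresholds` helper), not the project's actual algorithm.

```python
import numpy as np

def statistical_parity_thresholds(scores, groups, target_rate):
    """For each group, choose the score threshold whose positive-prediction
    rate is closest to target_rate (illustrative post-processing step)."""
    thresholds = {}
    for g in np.unique(groups):
        g_scores = scores[groups == g]
        best_t, best_gap = 0.5, float("inf")
        # Candidate thresholds: the observed scores themselves.
        for t in g_scores:
            rate = np.mean(g_scores >= t)  # fraction predicted positive at t
            gap = abs(rate - target_rate)
            if gap < best_gap:
                best_gap, best_t = gap, t
        thresholds[g] = best_t
    return thresholds

# Synthetic classifier scores and binary group membership.
rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
groups = rng.integers(0, 2, size=1000)

ths = statistical_parity_thresholds(scores, groups, target_rate=0.3)
for g, t in ths.items():
    rate = np.mean(scores[groups == g] >= t)
    print(f"group {g}: threshold {t:.3f}, positive rate {rate:.3f}")
```

Note that equalizing positive rates this way generally costs some accuracy relative to a single global threshold, which is exactly the fairness-accuracy trade-off the first thrust studies.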
Up to $475K
2030-07-31
Research Infrastructure: National Geophysical Facility (NGF): Advancing Earth Science Capabilities through Innovation - EAR Scope
NSF — up to $26.6M
AmLight: The Next Frontier Towards Discovery in the Americas and Africa
NSF — up to $9M
CREST Phase II Center for Complex Materials Design
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Energy Technologies
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Post-Transcriptional Regulation
NSF — up to $7.5M
EPSCoR CREST Phase I: Center for Semiconductors Research
NSF — up to $7.5M