CAREER: Addressing Hallucinations for Trustworthy Large-scale Multi-Modality Learning

NSF

About This Grant

The rapid rise of large-scale multimodal models (LMMs) has driven advances in numerous fields, including social media analysis and healthcare. However, despite their outstanding performance, LMMs can produce misleading outputs because these models do not know what they do not know, raising concerns about their reliability. The inaccurate, irrelevant, or unintelligible outputs produced by LMMs are often called hallucinations. Addressing the hallucination problem in LMMs will be an important research topic over the next five years, as solving it will improve the models' trustworthiness and their impact in many other fields. This project will develop novel hallucination-mitigation approaches to improve the trustworthiness and applicability of LMMs in healthcare, including sample use cases in tobacco advertisement prevention and autism behavior prediction. Outcomes from the research will impact the field by providing the foundational and practical studies needed for future research. It will also train students to conduct and apply research that improves community health and mental health outcomes.

Three primary factors lead to hallucinations in LMMs. First, biased data distributions are widely understood to pose significant challenges for data-driven responsible-AI approaches; the influence of biased distributions on predictions is also a leading cause of hallucinations in LMMs. Second, misalignment between input modalities can cause an LMM to rely overconfidently on one input modality while ignoring the others; as a result, the model may hallucinate predictions based on the dominant modality. Third, training LMMs typically requires large-scale training data, and limited data can lead to hallucinations due to a lack of knowledge and diverse information. The overarching goal of this project is to develop robust and trustworthy LMMs by mitigating these sources of hallucination.
First, new learning approaches will be introduced to mitigate hallucinations caused by imbalanced data. Second, novel shuffling-based learning approaches will be developed to address misalignment across input modalities. Third, to overcome the problem of limited data, new adaptive learning approaches will be developed to improve the performance and trustworthiness of LMMs. The resulting LMMs will then be deployed in practical healthcare applications. The research effort in this project will pave the way for new theoretical and practical approaches to distribution modeling, contrastive learning, and shuffling learning that address hallucinations in LMMs. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
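The abstract names contrastive learning as one tool for aligning input modalities. As a point of reference only, the sketch below shows a standard CLIP-style symmetric contrastive (InfoNCE) objective in NumPy; it is not the project's actual method, and the function names, batch shapes, and temperature value are illustrative assumptions. Matched (image, text) pairs sit on the diagonal of the similarity matrix, and the loss pushes each modality to identify its true partner, which works against over-reliance on a single dominant modality.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Unit-normalize embeddings so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def symmetric_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """CLIP-style InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (B, D) arrays where row i of each array is an
    embedding of the same underlying example (a positive pair).
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    logits = img @ txt.T / temperature      # (B, B) similarity matrix
    labels = np.arange(len(logits))         # diagonal entries are positives

    def xent(lg):
        # Cross-entropy of the softmax over each row against the diagonal.
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned modality pairs drive the loss toward zero, while unrelated embeddings yield a loss near log of the batch size; minimizing it therefore encourages the two modalities to agree rather than letting one dominate.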

Focus Areas

social science

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $305K

Deadline

2030-06-30

Complexity
Medium