Optimizing validity of comparative effectiveness research in Alzheimer's disease and related dementias using large language models
NIA - National Institute on Aging
About This Grant
Because people living with dementia (PLWD) are vulnerable to medication errors, drug-drug interactions, and a range of adverse drug events, prescribing decisions for this population must be informed by solid evidence. Physicians' prescribing decisions often rely on routinely collected data because randomized controlled trials (RCTs) severely underrepresent PLWD. Electronic health records (EHRs) are among the most commonly used sources of real-world data for comparative effectiveness research (CER) because they contain rich clinical data. However, structured EHR data are missing key geriatric factors critical for conducting valid CER among PLWD, such as degree of cognitive impairment, mental and functional status, and behavioral symptoms. Much of this information is embedded in free-text clinical notes and reports, but traditional natural language processing (NLP) requires a labor-intensive data annotation process for each target phenotype, which is not scalable to the large number of study variables needed for confounding adjustment in a non-randomized CER study. Large language models (LLMs) have shown promising potential to extract concepts and phenotypes that were not predefined during training. However, the performance of existing LLMs in predicting ADRD-relevant phenotypes is unknown, and none have been trained on clinical EHR notes linked to external data containing longitudinal geriatric information. Our objective is to build novel LLMs specializing in ADRD-relevant CER, designed to generate ADRD-relevant phenotypes and trained on clinical EHR data integrated with multiple geriatric-information-enriched external datasets. The ground truth for all phenotypes our LLMs aim to predict will come from large-scale annotation available as structured data in the linked external datasets.
Our integrated dataset will cover >850,000 lives (>80,000 PLWD) across two large multi-center EHR networks in Massachusetts spanning 2000-2024. The central hypothesis is that LLMs can scalably generate valid features and consistently reduce missing data on key geriatric factors, enhancing the robustness of causal CER analyses among PLWD. Building on existing general-purpose LLMs, we will develop novel LLMs via instruction tuning, converting the linked structured labels into text instructions and fine-tuning the LLMs through a text-generation framework, and via chain-of-thought techniques, guiding the LLMs to infer results through multiple reasoning steps. In Aim 1, we will continually pre-train and fine-tune novel LLMs to determine eight categories of geriatric-specific phenotypes commonly used in ADRD-relevant CER. In Aim 2, we will assess generalizability by testing performance on eight additional phenotypes not previously targeted and optimize the LLMs accordingly. In Aim 3, we will compare treatment-effect estimation using EHR data alone (mimicking the common research scenario in which linkage to external data is infeasible due to privacy concerns) with versus without the LLM-derived features in six highly relevant empirical drug safety and effectiveness studies among PLWD.
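The instruction-tuning step described above (converting a structured label from a linked external dataset into a text instruction pair, with a chain-of-thought cue) can be illustrated with a minimal sketch. The function, field names, and prompt wording here are hypothetical assumptions for illustration, not the project's actual schema or prompts:

```python
# Hypothetical sketch: turn a structured geriatric label from a linked
# external dataset into an (instruction, response) pair suitable for
# instruction tuning an LLM via a text-generation framework.
# The phenotype name, note text, and prompt template are illustrative only.

def make_instruction_example(note_text: str, phenotype: str, label: str) -> dict:
    """Build one instruction-tuning example from a structured ground-truth label."""
    instruction = (
        "You are reviewing a clinical note for a person living with dementia.\n"
        f"Note: {note_text}\n"
        f"Question: What is the patient's {phenotype}? "
        # Chain-of-thought cue: ask the model to reason in steps before answering.
        "Think step by step, then state the final answer."
    )
    # The structured label from the linked dataset serves as the target response.
    response = f"Final answer: {label}"
    return {"instruction": instruction, "response": response}

example = make_instruction_example(
    note_text="Pt with AD, needs assistance with dressing and bathing.",
    phenotype="functional status",
    label="dependent in ADLs",
)
print(example["response"])  # Final answer: dependent in ADLs
```

In this formulation, the linked structured data provide labels at scale without manual chart annotation, and each new phenotype only requires a new instruction template rather than a new annotation campaign.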
Up to $3.7M
2029-09-14