Collaborative Research: CISE Crosscutting Small: SaTC: Privacy-Preserving Synthetic Data Generation

NSF

open

About This Grant

Artificial Intelligence (AI) has the potential to transform many areas of life – like healthcare, education, and finance – but it needs access to data to learn and improve. A lot of the most useful data in the nation is locked away in places like hospitals, research labs, and private companies, where it cannot easily be shared because of privacy concerns. This slows down the development of AI in these important domains. One promising solution is synthetic data – data that's created by computer programs trained on real data. It looks and behaves like real data but does not contain any personal information. This project aims to develop techniques to let organizations take part in the synthetic data creation process without ever revealing their real data. These techniques use AI and encryption to keep the original data secure while still helping to generate useful synthetic versions. Such technology is especially impactful in domains where real data is currently distributed across organizations, such as data of patients with rare diseases.

The project builds research capacity at the University of Washington Tacoma, an emerging research institution, in partnership with the University of Central Florida. It creates valuable opportunities for students to participate in research, thereby strengthening the future AI and security workforce.

This project advances the state-of-the-art for privacy-preserving data sharing through the development of Secure Multiparty Computation (MPC) protocols to train statistics-based and neural network-based synthetic data generators while keeping the training data encrypted; the development of Fully Homomorphic Encryption (FHE) protocols to train synthetic data generators over encrypted tabular data; and the design of cryptographic protocols for evaluating synthetic data in a privacy-preserving manner so that multiple synthetic data generation techniques can be run and compared against real data without consuming differential privacy budget. The algorithms and cryptographic protocols for the generation of synthetic data will be implemented in open-source MPC and FHE libraries.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
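
To make the MPC idea in the abstract more concrete, here is a minimal sketch of additive secret sharing, one common building block of MPC, used to compute a joint histogram across organizations. This is not the project's protocol. It is a toy, single-process simulation that assumes semi-honest parties and adds no differential privacy noise. The names `share`, `reconstruct`, `secure_histogram`, `hospital_a`, and `hospital_b`, as well as the modulus, are hypothetical choices made for illustration.

```python
import secrets

# Illustrative modulus for share arithmetic (an assumption, not from the abstract).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the underlying value."""
    return sum(shares) % PRIME


def secure_histogram(local_histograms):
    """Sum per-organization histograms without opening any local count.

    Each organization splits every bin count into shares, one per party.
    Parties add the shares they hold, and only the per-bin totals are opened.
    """
    n = len(local_histograms)
    n_bins = len(local_histograms[0])
    # received[p][b] holds the shares that party p was sent for bin b.
    received = [[[] for _ in range(n_bins)] for _ in range(n)]
    for hist in local_histograms:
        for b, count in enumerate(hist):
            for p, s in enumerate(share(count, n)):
                received[p][b].append(s)
    # Each party locally sums its shares; a single partial sum looks random.
    partial = [
        [sum(received[p][b]) % PRIME for b in range(n_bins)] for p in range(n)
    ]
    # Opening combines the partial sums, revealing only the joint totals.
    return [
        reconstruct([partial[p][b] for p in range(n)]) for b in range(n_bins)
    ]


# Hypothetical per-bin counts held by two organizations.
hospital_a = [3, 0, 5]
hospital_b = [1, 2, 4]
joint = secure_histogram([hospital_a, hospital_b])
print(joint)  # [4, 2, 9]
```

A statistics-based generator could then sample synthetic records from the joint counts. In a real deployment the shares would travel over a network between separate parties, and the opened statistics would typically be protected further, for example with differential privacy.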

Focus Areas

education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $125K

Deadline

2027-09-30

Complexity
Medium