
Collaborative Research: OAC Core: Mitigating Artifacts in Scientific Data Compressors with a Learning-Driven Framework

NSF

open

About This Grant

Error-controlled lossy compressors are widely used to manage the large volumes of data produced by scientific applications. Still, they may produce undesired compression artifacts that distort both the raw data and post hoc data analytics. This project aims to close this gap by developing a novel learning-driven framework to mitigate artifacts produced by scientific lossy compressors. The success of this project is expected to significantly improve the integrity and quality of lossy-compressed scientific data, thus facilitating the use of existing lossy-compression frameworks for efficient data storage, transmission, and analytics in scientific applications. This contributes to scientific discoveries in a broad range of domains, including climatology, cosmology, fusion energy science, and X-ray ptychography, as well as multiple aspects of research and education in advanced cyberinfrastructure.

This project addresses the artifact issue by leveraging recent advances in scientific data compression and deep learning. In-depth investigations are conducted to generically characterize the compression artifacts produced by scientific compressors on both raw data and post hoc analysis. This aims to improve the understanding of data quality and establish a benchmark for artifact mitigation. Next, deep learning models are designed to tackle artifact mitigation on both raw data and features of interest, with specifically designed transfer learning to reduce training costs. The quality of the recovered data is improved by fusing model outputs tailored to preserve different features. Finally, the quality of the recovered data is validated through tailored uncertainty quantification, and the performance of the framework is investigated through careful optimization and parallelization. Integration into state-of-the-art error-controlled lossy compressors and incorporation with real-world scientific applications are expected to advance multiple scientific data management tasks, including data storage, I/O, and transmission.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Focus Areas

education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $299K

Deadline

2028-09-30

Complexity
Medium
