
Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration

NSF

closed
Open
Last verified: 2026-06-16

About This Grant

Recent advancements in large machine learning models have demonstrated that increasing the number of parameters enhances computational precision and unlocks capabilities once deemed unattainable. This trend is exemplified by the rapid growth in model sizes: GPT-3 contained 175 billion parameters, while GPT-4 reportedly uses up to 1.8 trillion. This trajectory is expected to continue for the foreseeable future. However, the explosive growth in model size presents two major challenges for computer architecture and systems research: prolonged simulation times, which can extend from several days to weeks for large-scale models, and the infeasibility of deploying workloads on a single compute engine (e.g., a graphics processing unit (GPU)) due to limited on-device memory capacity. To address these challenges, this project proposes the development of scalable simulation techniques and advanced memory management strategies tailored for large-scale machine learning workloads on GPUs. Unlike existing application-agnostic approaches, this research will leverage the distinctive data access patterns and value distributions of modern machine learning models to enable more efficient memory compression and more accurate simulation acceleration. While the primary focus will be on emerging machine learning models, the broader objective is to advance GPU computing to better accommodate any big data workload constrained by memory limitations. This will facilitate faster and broader adoption of GPUs across diverse computing domains, driving continued innovation in computational science. The outcomes of this research will be integrated into both new and existing undergraduate and graduate curricula, as well as K-12 outreach initiatives, fostering a deeper understanding of cutting-edge computing technologies across educational levels.
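To make the on-device memory constraint concrete, the following is a rough back-of-the-envelope sketch in Python. It uses the 175 billion parameter figure quoted above; the 2 bytes per parameter (16-bit weights) and the 80 GB device capacity are illustrative assumptions that do not come from the grant text, and the estimate covers weights only (no activations, gradients, or optimizer state).

```python
import math


def weight_footprint_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption, not from the source).
    """
    return num_params * bytes_per_param / 1e9


def min_devices(footprint_gb: float, device_capacity_gb: float) -> int:
    """Smallest number of devices whose combined memory holds the footprint."""
    return math.ceil(footprint_gb / device_capacity_gb)


GPT3_PARAMS = 175_000_000_000   # parameter count quoted in the abstract
ASSUMED_DEVICE_GB = 80          # hypothetical per-GPU memory capacity

footprint = weight_footprint_gb(GPT3_PARAMS)
print(footprint)                                   # 350.0
print(min_devices(footprint, ASSUMED_DEVICE_GB))   # 5
```

Under these assumptions the weights alone exceed a single device several times over, which is the situation the memory management work targets.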
This project addresses two research questions: how to simulate large-scale machine learning workloads, and how to make better use of GPU local memory when that memory is oversubscribed. While large-scale simulation and memory management have been widely studied, most existing approaches fail to capture the unique architectural characteristics of GPU computing and the specific behaviors of emerging machine learning workloads. Rather than relying on application-agnostic or user-dependent sampling techniques, this research will exploit the distinctive compute and memory access patterns inherent to machine learning models. The first thrust will investigate efficient simulator acceleration methods by leveraging the fact that machine learning models are typically executed with highly optimized library functions. These library functions tend to exhibit similar architectural behavior when their operation types and data sizes are similar. The project will identify representative sample kernels whose performance can be extrapolated to other similar kernels, thereby significantly reducing simulation overhead. By leveraging characteristics of the library functions, the second thrust will explore efficient memory expansion and compression strategies, such as dynamic memory prefetching and eviction policies, to mitigate the effects of memory oversubscription. The second thrust will also develop novel quantization techniques that take advantage of the unique value distributions of weights and gradients within individual tensors. Unlike tensor-oblivious methods, this targeted approach aims to reduce memory footprint more effectively while preserving model accuracy. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
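The representative-kernel idea in the first thrust can be illustrated with a minimal Python sketch. This is not the project's method: the grouping rule (library function name plus operand dimensions rounded up to a power of two), the linear extrapolation by work units, and every name here (`KernelLaunch`, `bucket`, `estimate_total_time`, `fake_simulate`) are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelLaunch:
    name: str     # library function name, e.g. "gemm" (hypothetical labels)
    shape: tuple  # operand dimensions
    work: int     # work units (e.g. FLOPs) used to scale the estimate


def bucket(shape):
    """Round each dimension up to a power of two so similar sizes share a group."""
    return tuple(1 << (d - 1).bit_length() for d in shape)


def estimate_total_time(trace, simulate):
    """Simulate one representative per group and extrapolate to the rest.

    `simulate` is any callable returning a time for one kernel; in practice it
    would be an expensive detailed simulator. Extrapolation assumes time scales
    linearly with `work` inside a group, which is a simplifying assumption.
    """
    groups = defaultdict(list)
    for k in trace:
        groups[(k.name, bucket(k.shape))].append(k)

    total, simulated = 0.0, 0
    for kernels in groups.values():
        rep = kernels[0]                   # representative sample kernel
        rate = simulate(rep) / rep.work    # time per unit of work
        simulated += 1
        total += sum(rate * k.work for k in kernels)
    return total, simulated


def fake_simulate(k):
    """Stand-in for a detailed simulator: pretend 1 ns per work unit."""
    return k.work * 1e-9


trace = [
    KernelLaunch("gemm", (1024, 1024, 1024), 2 * 1024**3),
    KernelLaunch("gemm", (1000, 1000, 1000), 2 * 1000**3),
    KernelLaunch("layernorm", (4096,), 4096),
]
total, simulated = estimate_total_time(trace, fake_simulate)
print(simulated)  # 2
```

Here three kernel launches require only two detailed simulations because the two `gemm` calls fall into the same group; the savings grow with the number of repeated, similarly sized library calls in a trace.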

Grant Summary

Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration is an NSF grant providing up to $270K for universities, nonprofits, and small businesses. Applications are due 2028-09-30 (open).

Focus Areas

machine learning, education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $270K

Deadline

2028-09-30

Complexity
Medium
  1. Confirm your organization is eligible for Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration from NSF, checking organization type, location, and any population or project requirements.
  2. Gather the required documents and information, including your organization details, project plan, and budget figures.
  3. Draft your application narrative and budget addressing the funder's priorities and review criteria. FindGrants can draft each section for you to review and edit.
  4. Review every section against the requirements checklist, then export a submission-ready application pack and submit it to NSF before the deadline.
This record is a past award, contract, or funder profile — useful for research, but not an open grant application. Check the original source for current opportunities from this funder.

Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration: Frequently Asked Questions

Who is eligible for the Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration?

Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration is offered by NSF and is generally open to universities, nonprofits, and small businesses. It is open to organizations nationwide unless the funder specifies otherwise. Review the specific eligibility terms before applying, since funders set their own requirements around organization type, location, and the population or project being served.

How much funding does the Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration provide?

Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration provides up to $270K per award from NSF. Actual award sizes depend on the scope of your project, available program funds, and the number of applicants, so build a budget that reflects realistic, allowable costs rather than the maximum figure.

When is the Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration deadline?

Applications for Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration are due 2028-09-30 (open). Because deadlines can change, verify the date with the funder, NSF, and give yourself enough time to prepare a complete, competitive application before the close date.

How do you apply for the Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration?

To apply for Collaborative Research: SHF: Small: Building scalable GPU simulation and efficient GPU memory management for large machine learning acceleration, confirm your eligibility, gather the required documents, and prepare a narrative and budget that address the funder's priorities. FindGrants guides you step by step and can draft each section, then exports a submission-ready application pack for this grant from NSF.