NSF
Cloud and network providers must allocate and manage their compute and network resources efficiently and fairly to ensure that stringent service level objectives are met. Examples include managing Wide Area Network (WAN) bandwidth to meet traffic needs and managing clusters of Graphics Processing Units (GPUs) for running Artificial Intelligence (AI) training jobs. Managing production networks and systems is a black art today, dominated by domain-specific heuristics that require constant fine-tuning as requirements evolve and that fall short on performance in ways that are not easy to analyze. This project is developing BONSAI (Beyond-Optimization Network System Allocation Intelligence), a principled framework based on Machine Learning (ML) that can enable high decision quality for a wide range of network resource allocation problems in complex network and cloud environments that are hard to model precisely, and where inaccurate predictions about future traffic patterns or workloads are the norm. The project is developing custom neural architectures, inspired by network optimization models, for multi-criteria resource allocation problems. The neural architectures will be designed to remain resilient in scenarios beyond their training data (e.g., flash crowds, hardware failures) through careful alignment with the optimization models they aim to enhance and through a framework that is aware of, and reacts to, a wide variety of input transformations. The project is developing techniques by which neural models can learn from and adapt to non-differentiable real-world sources of feedback, and novel alignment approaches that will enable real-time validation of the solution suggested by the neural architecture. The project is demonstrating these ideas in important and challenging domains such as traffic engineering in Wide Area Networks and scheduling distributed AI training jobs in shared GPU clusters.
The team consists of researchers with complementary expertise in networking, machine learning, and optimization, and results will be disseminated to researchers in these communities. The team will also collaborate with, and disseminate results to, industry and network operators for real-world validation and application of the research. The research will benefit the networking and IT industry by taking a major step towards principled resource allocation for complex real-world tasks. The project will involve Ph.D., master’s, and undergraduate students, and will lead to the creation of a new graduate class on ML-Driven Computer Networking. Project results will be made publicly available at https://purdue-isl.github.io/projects_pages/Self-correcting-ML and will remain available throughout the project period and for at least three years after. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Up to $1.1M
2029-06-30