CAREER: Towards Trustworthy Analytics

NSF

open

Tools that create visualizations of data are increasingly important for discovery and decision-making in a range of domains, from science and engineering to commerce. Data analysts use these tools to rapidly slice and dice their data, often inspecting a large number of visualizations in the process. Though useful for exploration, these visualizations can also expose random data fluctuations, which could be mistaken for real patterns. If analysts are not careful in interpreting these apparent patterns, they could inadvertently make false discoveries or take incorrect decisions. The goal of this research is to reduce the risk from spurious patterns arising in interactive data analyses. The project comprises three stages: (1) developing techniques for capturing analyst beliefs, expectations, and intentions as they conduct visual analysis; (2) using this data to develop algorithms that forecast the reliability of emerging visualizations; and (3) evaluating strategies for communicating the risk of false patterns. The resulting techniques will be validated and incorporated in tools for detecting RNA modifications from noisy sequencing data, in collaboration with bioinformatics researchers. The expected impact of this project is to aid analysts in assessing the reliability of insights, while guarding against visualizations that seem convincing but that are likely to be misleading. This in turn could broaden the adoption of visual analytics tools, increase the confidence in conclusions, and potentially reduce the incidence of false discovery. As part of this research, the team will develop interactive educational materials for training students in reliable data-driven inference. These learning modules will be disseminated in a format that allows customization by data science instructors for inclusion into existing curricula. Lastly, the project will provide opportunities for graduate research training and incorporate K-12 outreach activities that introduce young learners to data science.

The project comprises three main activities: (1) Prototyping techniques to incrementally elicit analysts' belief and prior knowledge as they make sense of data. The elicited knowledge will then be used to distinguish between a gamut of intentions: from planned analyses with substantive hypotheses, to purely exploratory actions with minimal expectations. (2) The project will next develop a model to predict the reliability of apparent patterns and insights unearthed at different points in the analysis cycle. To build this model, the research team will use a variety of features, including the specificity of analyst intents, the degree to which their expectations are borne out in the data, as well as their behavior and interactions with visualizations. The elicitation techniques and the insight reliability model will then be refined in a series of visual analysis studies and through crowdsourced experiments, in which participants' declared priors and discoveries are used to improve the accuracy of the model in forecasting spurious patterns. Lastly, (3) the project will identify and characterize strategies for communicating the risk of spurious insights to analysts in real time. In particular, the team will evaluate techniques for directly visualizing risk indicators, as well as indirect methods whereby the visual encodings of the data will be adjusted depending on how risky it is predicted to be. The developed interventions will be evaluated both in experiments and in a bioinformatics application, to assess whether they reduce the rate of false discovery. The expected results include new methods for eliciting analyst beliefs, techniques to forecast and communicate the trustworthiness of insights, and instructional materials for teaching robust data analytic practices. The products will be disseminated in publications, and in the form of open-source software and learning modules.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Focus Areas

engineering, education

Eligibility

university, nonprofit, small business

How to Apply

Funding Range

Up to $223K

Deadline

2027-04-30
