MS303 - Datasets for Science: Large and Small

Organized by: J. Actor (Sandia National Laboratories, United States) and A. Robertson (Sandia National Laboratories, United States)
Keywords: data curation, data generation, data science, representation learning, scientific machine learning
Data-driven methods have achieved tremendous improvements on a range of tasks, including but not limited to problems in computational mechanics. Many of these improvements require high-quality scientific data, and acquiring such data is often costly, whether the data is experimental, historical, or simulation-driven. This minisymposium will convene world-class researchers and provide a forum to discuss challenges and opportunities in data curation, statistical machine learning, and scientific machine learning, with a specific emphasis on data diversity, data generation, multi-use datasets, generative learning, and feature representations. Because data-driven methods are increasingly interdisciplinary, this session will draw upon expertise in machine learning, statistics, scientific computing, and specific domain applications in mechanics and materials modeling, highlighting practical achievements alongside theoretical results.