Battery Imaging, Battery Insights? The Challenges of the Low-Data Regime

  • Docherty, Ronan (Imperial College London)
  • Vamvakeros, Antonis (Imperial College London)
  • Cooper, Samuel (Imperial College London)


Modern machine learning has been defined by scale: millions of parameters, billion-image datasets, an internet of text. Battery imaging promises the ability to link manufacturing parameters to device performance, but faces several challenges: a complex material system; a diverse range of chemistries, device formats and degradation mechanisms; difficult imaging (including large field-of-view requirements); and a poor culture of data sharing. In this work we discuss our solutions to some of these challenges, focussing both on adapting existing foundation models to work inside this low-data regime, and on expanding datasets and benchmarks to escape it. For datasets, this involves leveraging LLMs for automated micrograph detection and captioning, as well as the Battery Imaging Library (https://www.batteryimaginglibrary.com/), the first open multi-modal, multi-length-scale imaging library with full access to raw data. Micrograph segmentation (assigning a class to every pixel in an image) is necessary for downstream microscopy analysis, but standard deep learning models require tens or hundreds of densely labelled examples to perform well. As well as boosting the power of zero-shot, weakly supervised segmenters with upsampled vision transformer features and adapting their architecture to better handle homogeneous microscopy images, we have worked on the first systematic benchmark dataset for materials segmentation, again emphasising variety of materials, instruments and length scales. We show that this benchmark can be used to optimise various parts of the segmentation workflow, and verify common 'folk wisdom' in the field.