BatteryPass-12K and Technical Language Processing (TLP) Framework for Battery Predictions
We introduce BatteryPass-12K, the first digital battery passport (DBP) dataset, and a novel technical language processing (TLP) framework (Fig. 1) for battery predictions. A DBP is an electronic record of the features and history of a battery [1]. The TLP framework combines the capabilities of artificial intelligence (AI) agents, large language models (LLMs), and optimized hard and soft prompts. Our contributions are (1) a novel task, DBP (or digital product passport, DPP) classification with the binary labels conformant and nonconformant; (2) the first (synthetic) dataset for this task, with 12,000 balanced samples generated from real pilot samples using ChatGPT5.1 Thinking; and (3) the TLP framework for battery predictions. This work is timely in view of the EU battery regulation for 2027, which aims to protect the environment and ensure traceability and transparency along the entire battery value chain. Furthermore, accurately estimating battery state at extreme temperatures is one of the challenges in the battery domain, and the TLP framework aims to address it. The electronic record of a battery's features over its entire life cycle, provided by the DBP or the battery management system (BMS), is useful within the framework for data-driven solutions. The framework involves a battery-agnostic model context protocol (MCP) AI agent that can connect to external tools holding relevant information and that provides soft prompts (continuous feature vectors learned by prompt tuning) combined with optimized hard prompts (plain-text inputs enhanced with gradient-based optimization). The combination is then supplied as input to a capable multimodal LLM for the relevant TLP task predictions, as done in this work for the conformance task. Benchmarking results are provided in Table 1; they show that the task is fairly challenging, although GPT-5.2 Thinking is promising and achieved the best scores.
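
To make the combination of soft and hard prompts concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a toy embedding table, a hypothetical embedding width `EMBED_DIM`, whitespace tokenization, and placeholder (untrained) soft-prompt vectors; the helper names `embed_tokens` and `build_input` are illustrative only. It shows the general idea of prepending continuous soft-prompt vectors to the embedded tokens of a plain-text hard prompt before the sequence is passed to a model.

```python
import random

EMBED_DIM = 4  # toy embedding width (assumption, not from the abstract)


def embed_tokens(tokens, table, dim=EMBED_DIM, seed=0):
    """Look up, or lazily create, a toy embedding vector for each token."""
    rng = random.Random(seed)
    vectors = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        vectors.append(table[tok])
    return vectors


def build_input(soft_prompt, hard_prompt_text, table):
    """Prepend soft-prompt vectors to the embedded hard-prompt tokens."""
    hard_vectors = embed_tokens(hard_prompt_text.lower().split(), table)
    return soft_prompt + hard_vectors


# Placeholder soft prompt: 3 "virtual tokens". In prompt tuning these vectors
# would be learned by gradient descent; here they are zeros for illustration.
soft_prompt = [[0.0] * EMBED_DIM for _ in range(3)]

# Hypothetical hard prompt for the binary conformance task.
hard_prompt = "Is this battery passport conformant or nonconformant?"

table = {}
sequence = build_input(soft_prompt, hard_prompt, table)
print(len(sequence), len(sequence[0]))  # sequence length, embedding width
```

In this sketch the three soft-prompt vectors occupy the first positions of the input sequence and the seven whitespace-separated hard-prompt tokens follow, so a downstream model would see ten vectors of width `EMBED_DIM`.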
