Knowledge-Guided Conditional Generation of Double Perovskites via Language Model Feedback

  • Lee, Inhyo (Korea Advanced Institute of Science and Tech)
  • Lee, Junhyeong (Korea Advanced Institute of Science and Tech)
  • Park, Jongwon (Korea Advanced Institute of Science and Tech)
  • Lim, KyungTae (Korea Advanced Institute of Science and Tech)
  • Ryu, Seunghwa (Korea Advanced Institute of Science and Tech)


Generative artificial intelligence has emerged as a transformative tool for accelerating materials discovery; however, its practical application remains constrained by a lack of physical consistency. This limitation is particularly critical in materials science, where chemical and structural feasibility are fundamental requirements. To address this challenge, we present a multi-agent, text-gradient-driven framework that systematically integrates two complementary feedback loops: LLM self-evaluation and domain-knowledge-informed gradients. Unlike conventional black-box generation, our architecture uses domain-specific text gradients to iteratively steer the generative process toward thermodynamically stable regions of chemical space. We validated this framework through the discovery of diverse double perovskites (DPs), verifying the thermodynamic stability of the generated DPs via density functional theory (DFT) calculations. The framework achieved a 98% compositional validity rate, with up to ~54% of candidates identified as (meta)stable. This represents a significant improvement over both the LLM-only baseline (~43%) and prior GAN-based approaches (~27%). Furthermore, we provide a critical analysis of the inherent limitations of machine-learning (ML) surrogates within the discovery framework. Our findings demonstrate that these surrogates not only require additional training data but also degrade significantly in out-of-distribution (OOD) regimes, ultimately misleading the discovery process. Overall, our domain-knowledge-informed multi-agent system provides a data-efficient and reliable paradigm for discovery, establishing a generalizable framework for agentic materials design.
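
As a rough illustration of what a domain-knowledge-informed feedback signal might look like, the Python sketch below checks charge neutrality for an A2BB'X6 double perovskite composition and returns the result as textual feedback that a generator could act on. The oxidation-state table, the function names, and the choice of charge neutrality as the rule are all assumptions made for illustration; the abstract does not specify which rules or agent implementations the framework uses.

```python
from typing import Dict, List

# Hypothetical lookup table: one common oxidation state per element,
# chosen for illustration only (real elements can adopt several states).
OXIDATION_STATES: Dict[str, int] = {
    "Cs": 1, "Rb": 1, "K": 1,
    "Ag": 1, "Na": 1,
    "Bi": 3, "In": 3, "Sb": 3,
    "Pb": 2, "Sn": 2,
    "Cl": -1, "Br": -1, "I": -1,
}


def net_charge(a: str, b: str, b_prime: str, x: str) -> int:
    """Net formal charge of A2BB'X6 under the fixed states above."""
    q = OXIDATION_STATES
    return 2 * q[a] + q[b] + q[b_prime] + 6 * q[x]


def domain_feedback(a: str, b: str, b_prime: str, x: str) -> List[str]:
    """Return textual feedback; an empty list means the rule is satisfied."""
    feedback: List[str] = []
    charge = net_charge(a, b, b_prime, x)
    if charge != 0:
        feedback.append(
            f"{a}2{b}{b_prime}{x}6 has net formal charge {charge:+d}; "
            "adjust the B-site cations so the charges sum to zero."
        )
    return feedback


if __name__ == "__main__":
    # Cs2AgBiBr6 is charge-neutral under the table above: no feedback.
    print(domain_feedback("Cs", "Ag", "Bi", "Br"))
    # Cs2AgPbBr6 is not: one feedback message is returned.
    print(domain_feedback("Cs", "Ag", "Pb", "Br"))
```

In a loop of the kind the abstract describes, messages like these would presumably be passed back to the generating model alongside its own self-evaluation, with DFT reserved for candidates that pass such inexpensive checks.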