Optimizing AI-generated motion pictograms through AI-based evaluation and regeneration

  • Okatani, Natsumi (Toyo University)
  • Shioya, Ryuji (Toyo University)
  • Nakabayashi, Yasushi (Toyo University)

Pictograms are indispensable for intuitive communication across language barriers in public spaces and in disaster prevention [1, 2]. However, static symbols struggle to convey complex movements or dynamic concepts. Animated "Motion Pictograms" enhance danger avoidance and universal design, but high production costs and technical barriers hinder their rapid deployment. This study proposes an AI-driven method that automatically generates motion pictograms from static images. By introducing a "Self-Correction Loop" for autonomous evaluation and refinement, we establish an optimization process for high-quality, consistent motion pictograms. The method has three phases: generation, evaluation, and improvement. First, a multimodal LLM (e.g., Gemini 1.5 Flash [3], GPT-4o [4]) analyzes the static image and constructs an optimized TI2V (Text-conditioned Image-to-Video) prompt for a video generation AI (e.g., Luma AI). Next, the LLM, prompted with expert personas in HCI and semiotics, evaluates the generated video on "Semantic Fidelity" (Clarity, Consistency) and "Expressive Quality" (Object Consistency, Smoothness), prioritizing the former. Finally, the system identifies failures, such as disappearing objects, and automatically generates corrective "regeneration prompts" that are fed back into the generation phase. Experiments confirmed that the correction loop improved comprehension and quality, even when initial outputs lacked motion or contained noise. However, we also identified a bias: AI self-evaluation is more lenient than objective human assessment. Future work will refine the evaluation algorithms using design theories so that output quality is sufficient for real-world social implementation.
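
The three phases (generation, evaluation, improvement) can be read as a simple control loop. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the names `Evaluation`, `self_correction_loop`, `semantic_weight`, `threshold`, and `max_rounds` are hypothetical, the 0.7 weight and 0.8 threshold are arbitrary placeholders for "prioritizing Semantic Fidelity", and the `fake_*` functions stand in for calls to a multimodal LLM and a video generation service, which are not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Evaluation:
    # Scores in [0, 1] for the four criteria named in the abstract.
    clarity: float
    consistency: float
    object_consistency: float
    smoothness: float
    failures: List[str]  # detected failures, e.g. ["disappearing object"]

    def overall(self, semantic_weight: float = 0.7) -> float:
        # Assumption: a weighted mean in which Semantic Fidelity counts more
        # than Expressive Quality. The abstract gives no concrete weights.
        semantic = (self.clarity + self.consistency) / 2
        expressive = (self.object_consistency + self.smoothness) / 2
        return semantic_weight * semantic + (1 - semantic_weight) * expressive


def self_correction_loop(
    initial_prompt: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str], Evaluation],
    revise: Callable[[str, Evaluation], str],
    threshold: float = 0.8,   # hypothetical acceptance score
    max_rounds: int = 3,      # hypothetical regeneration budget
) -> Tuple[str, Evaluation, int]:
    """Generate, evaluate, and regenerate until the output is accepted."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    prompt = initial_prompt
    best: Optional[Tuple[str, Evaluation]] = None
    for round_no in range(1, max_rounds + 1):
        video = generate(prompt)              # phase 1: generation
        result = evaluate(video)              # phase 2: evaluation
        if best is None or result.overall() > best[1].overall():
            best = (video, result)
        if result.overall() >= threshold and not result.failures:
            return video, result, round_no    # accepted
        prompt = revise(prompt, result)       # phase 3: regeneration prompt
    # Budget exhausted: fall back to the best-scoring attempt.
    return best[0], best[1], max_rounds


# Stand-in functions so the sketch runs without any external service.
def fake_generate(prompt: str) -> str:
    return f"video<{prompt}>"


def fake_evaluate(video: str) -> Evaluation:
    if "keep every object visible" in video:
        return Evaluation(0.9, 0.9, 0.9, 0.8, [])
    return Evaluation(0.6, 0.7, 0.3, 0.5, ["disappearing object"])


def fake_revise(prompt: str, result: Evaluation) -> str:
    return prompt + "; keep every object visible"


video, result, rounds = self_correction_loop(
    "person running to exit", fake_generate, fake_evaluate, fake_revise
)
```

With these stand-ins the first attempt is rejected because an object disappears, and the revised prompt is accepted on the second round. Because the abstract reports that AI self-evaluation is more lenient than human assessment, any acceptance threshold of this kind would need calibration against human ratings.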