Enhancing 3D Spatial Understanding in Large Language Models: An Approach Using DeepSDF for Mechanical Inference
Large Language Models (LLMs) offer significant potential in engineering, particularly for identifying design rationales from 3D shapes and verbalizing expert intuition. However, their application is currently hindered by three main challenges: (1) LLMs lack inherent 3D spatial recognition capabilities and struggle with positional relationships; (2) datasets pairing 3D shapes with structural or fluid analysis results are scarce; and (3) the probabilistic outputs of standard LLMs are often inconsistent with physical laws. To address these issues, this study proposes an approach to enhance 3D spatial recognition in LLMs. Its distinguishing feature is the use of DeepSDF to convert 3D shapes into low-dimensional latent vectors, which are input directly into the LLM. By leveraging the continuous nature of Signed Distance Functions (SDFs), this approach provides a comprehensive representation of geometry, capturing 3D features without the information loss associated with discrete methods. Using the DrivAerNet++ dataset, we implemented a framework that integrates these latent vectors into an LLM (Llama 3.2-3B). This study focuses on evaluating the fundamental capability of the LLM to capture the underlying relationships between geometric features and physical characteristics. We explored different methods for encoding the latent vectors to improve the model's understanding. Preliminary results indicate that the proposed approach enables the LLM to recognize geometric-mechanical consistencies, demonstrating its potential as a tool for physics-informed design support.
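To make the two ingredients of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) what a signed distance function returns, using the analytic SDF of a sphere, and (b) one possible way a shape latent vector could be mapped to a short sequence of embedding-sized "soft tokens" for an LLM. The function names, the toy dimensions, and the random projection weights (standing in for weights that would be learned) are all assumptions made for illustration. The abstract does not specify which encoding the study used.

```python
import math
import random


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere surface.

    Negative inside the shape, zero on the surface, positive outside.
    The value varies continuously with the query point.
    """
    return math.dist(point, center) - radius


def make_projection(latent_dim, model_dim, num_tokens, seed=0):
    """Hypothetical projection matrix of shape (num_tokens * model_dim, latent_dim).

    Random Gaussian weights are a placeholder for a trained linear layer.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(latent_dim)
    return [
        [rng.gauss(0.0, scale) for _ in range(latent_dim)]
        for _ in range(num_tokens * model_dim)
    ]


def latent_to_soft_tokens(latent, weights, model_dim):
    """Map one latent vector to a list of `model_dim`-sized vectors.

    Each output vector plays the role of one token embedding that could be
    prepended to the embedded text prompt.
    """
    flat = [sum(w * z for w, z in zip(row, latent)) for row in weights]
    return [flat[i:i + model_dim] for i in range(0, len(flat), model_dim)]


# Toy usage: an 8-dimensional latent mapped to 4 soft tokens of width 16.
# Real latent and model widths would be far larger (assumed, not from the source).
latent = [0.1 * i for i in range(8)]
weights = make_projection(latent_dim=8, model_dim=16, num_tokens=4)
soft_tokens = latent_to_soft_tokens(latent, weights, model_dim=16)
print(len(soft_tokens), len(soft_tokens[0]))
```

In a real pipeline the latent would come from a trained DeepSDF decoder rather than being written by hand, and the projection would be trained jointly with, or as an adapter to, the language model. The sketch only illustrates the shapes involved.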
