Bridging the Semantic Gap in Open RAN: A Hybrid Retrieval-Augmented Generation Framework for Dual-Domain 5G Engineering
The rapid evolution of the Open Radio Access Network (O-RAN) architecture and the increasing complexity of 3GPP specifications create substantial barriers for researchers involved in the practical implementation of 5G standards. Existing benchmarks primarily test theoretical knowledge of normative documents, leaving a critical semantic gap between abstract normative requirements and their concrete realization in open-source platforms such as the srsRAN C++ codebase. This work addresses the challenge by proposing a Hybrid Retrieval-Augmented Generation (RAG) framework, building on the methodology proposed by Lewis et al., which unifies these heterogeneous knowledge domains through a hierarchical indexing strategy known as Parent-Child Chunking. A central component of the architecture is a Semantic Query Router, which uses a zero-shot classifier to identify user intent and dynamically activate the most relevant knowledge index. This mechanism mitigates semantic interference and prevents context poisoning, both common failure modes in naive ensemble RAG approaches. The system is built on the Llama 3 large language model and uses the Chroma vector database to index both the normative and the implementation domains. Experimental evaluation demonstrates that the proposed framework achieves an overall accuracy of 76.7%, significantly outperforming base language models and standard RAG configurations. Notably, the system reaches 78.5% accuracy on technical implementation tasks. Performance analysis further reveals that semantic routing reduces average generation latency to 3.47 seconds, as the routing filter narrows the input context supplied to the language model.
These results confirm that strictly separating knowledge domains based on query intent provides a robust foundation for developing efficient AI assistants capable of synthesizing complex 5G standards with low-level software implementation.
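
The two mechanisms named in the abstract, Parent-Child Chunking and intent-based routing between a normative and an implementation index, can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on loud assumptions: a hypothetical keyword cue set stands in for the zero-shot classifier, a word-overlap score stands in for vector search in Chroma, the sample texts are invented toy sentences rather than 3GPP or srsRAN content, and every name (`ParentChildIndex`, `route`, `answer_context`, `CODE_CUES`) is illustrative only.

```python
import re
from dataclasses import dataclass


@dataclass
class Child:
    text: str       # small chunk used for matching
    parent_id: int  # index of the larger chunk returned as context


def tokenize(text):
    """Lowercase word set; a toy stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


class ParentChildIndex:
    """Match the query against small child chunks, return the whole parent."""

    def __init__(self, parents, child_size=8):
        self.parents = list(parents)
        self.children = []
        for pid, parent in enumerate(self.parents):
            words = parent.split()
            for i in range(0, len(words), child_size):
                chunk = " ".join(words[i:i + child_size])
                self.children.append(Child(chunk, pid))

    def retrieve(self, query):
        q = tokenize(query)
        # Assumption: word overlap replaces vector similarity search.
        best = max(self.children, key=lambda c: len(q & tokenize(c.text)))
        return self.parents[best.parent_id]


# Assumption: hypothetical cue words replace the zero-shot intent classifier.
CODE_CUES = {"class", "function", "srsran", "implemented", "code", "source"}


def route(query):
    """Pick exactly one domain so the other index never reaches the prompt."""
    return "implementation" if tokenize(query) & CODE_CUES else "normative"


def answer_context(query, indexes):
    """Route first, then retrieve only from the selected index."""
    domain = route(query)
    return domain, indexes[domain].retrieve(query)


# Invented toy corpora, one per knowledge domain.
INDEXES = {
    "normative": ParentChildIndex([
        "Toy normative text: the scheduler shall allocate resources to each "
        "user according to the configured policy.",
        "Toy normative text: the timer shall be stopped when the procedure "
        "completes.",
    ]),
    "implementation": ParentChildIndex([
        "Toy implementation note: the scheduler class loops over users and "
        "calls an allocate function for each one.",
        "Toy implementation note: the logger module writes messages to a "
        "file and rotates it daily.",
    ]),
}

if __name__ == "__main__":
    for question in ("Which function does the scheduler class call?",
                     "What shall the scheduler allocate?"):
        print(answer_context(question, INDEXES))
```

Because `route` selects a single index before retrieval, text from the other domain is never placed in the prompt, which is the property the abstract credits with avoiding context poisoning. Returning the parent chunk rather than the matched child keeps matching precise while still giving the language model the surrounding passage.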
