GPTMB 2025, The Second International Conference on Generative Pre-trained Transformer Models and Beyond
Retrieval Performance in RAG Systems: A Component-Level Evaluation Framework
Authors:
Alexander Kreß
Alexander Lawall
Thomas Zöller
Keywords: retrieval-augmented generation; evaluation framework; synthetic datasets; component-level analysis
Abstract:
Retrieval-Augmented Generation (RAG) systems help improve the factuality of Large Language Model (LLM) outputs, yet their multi-component architecture makes them difficult to evaluate. This paper introduces plot-RAG (pRAG), a novel evaluation framework that visualizes component-level performance in RAG systems and provides granular insights into the retrieval and re-ranking stages without requiring resource-intensive LLM-based evaluation. The effectiveness of pRAG is demonstrated by analyzing a real-world question-answering system for technical documentation. A methodology for generating and validating synthetic evaluation datasets is also presented, showing that such datasets can match or exceed manually prepared datasets for RAG assessment. The experiments confirm that the retrieval component is the most critical performance bottleneck in RAG systems, and a formula is provided for determining the optimal retrieval size from response time requirements. These contributions enable more efficient and targeted evaluation of RAG systems, particularly in specialized domains where creating ground truth data typically requires substantial expert involvement.
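To make the idea of component-level evaluation without an LLM judge concrete, the following is a minimal sketch that scores a retriever and a re-ranker separately on a small labeled question set. It is not the pRAG implementation and does not reproduce the paper's retrieval-size formula. The metric choices (hit rate at k, mean reciprocal rank), the function names, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List


def hit_rate_at_k(ranked: Dict[str, List[str]], gold: Dict[str, str], k: int) -> float:
    """Fraction of questions whose gold chunk appears among the top-k results."""
    hits = sum(1 for q, g in gold.items() if g in ranked.get(q, [])[:k])
    return hits / len(gold)


def mean_reciprocal_rank(ranked: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Average of 1/rank of the gold chunk; a missing gold chunk contributes 0."""
    total = 0.0
    for q, g in gold.items():
        results = ranked.get(q, [])
        if g in results:
            total += 1.0 / (results.index(g) + 1)
    return total / len(gold)


# Hypothetical toy data: one gold chunk id per question (illustrative only).
gold = {"q1": "c3", "q2": "c7", "q3": "c1"}

# Ranked chunk ids returned by the retriever for each question.
retriever_out = {
    "q1": ["c5", "c3", "c9", "c2"],
    "q2": ["c4", "c8", "c6", "c7"],
    "q3": ["c2", "c6", "c9", "c4"],  # the gold chunk was never retrieved
}

# The same candidates after re-ranking: order changes, membership does not.
reranker_out = {
    "q1": ["c3", "c5", "c2", "c9"],
    "q2": ["c7", "c4", "c8", "c6"],
    "q3": ["c6", "c2", "c4", "c9"],
}

retriever_hit2 = hit_rate_at_k(retriever_out, gold, 2)
reranker_hit2 = hit_rate_at_k(reranker_out, gold, 2)
retriever_hit4 = hit_rate_at_k(retriever_out, gold, 4)
reranker_hit4 = hit_rate_at_k(reranker_out, gold, 4)
retriever_mrr = mean_reciprocal_rank(retriever_out, gold)
reranker_mrr = mean_reciprocal_rank(reranker_out, gold)

print(f"hit@2  retriever={retriever_hit2:.2f}  re-ranker={reranker_hit2:.2f}")
print(f"hit@4  retriever={retriever_hit4:.2f}  re-ranker={reranker_hit4:.2f}")
print(f"MRR    retriever={retriever_mrr:.2f}  re-ranker={reranker_mrr:.2f}")
```

In this toy setup the re-ranker improves the ordering, but hit rate at the full retrieval size is identical for both stages, because a re-ranker can only reorder what the retriever returned. This is one simple way to see why a retrieval miss caps the performance of every downstream component.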
Pages: 3 to 8
Copyright: Copyright (c) IARIA, 2025
Publication date: July 6, 2025
Published in: conference
ISBN: 978-1-68558-287-6
Location: Venice, Italy
Dates: from July 6, 2025 to July 10, 2025