Retrieval Performance in RAG Systems: A Component-Level Evaluation Framework

Kreß, Alexander; Lawall, Alexander; Zöller, Thomas

GPTMB 2025, The Second International Conference on Generative Pre-trained Transformer Models and Beyond

Authors:
Alexander Kreß
Alexander Lawall
Thomas Zöller

Keywords: retrieval-augmented generation; evaluation framework; synthetic datasets; component-level analysis

Abstract:
Retrieval-Augmented Generation (RAG) systems are relevant for improving factuality in Large Language Model (LLM) outputs, yet their evaluation remains challenging due to their multi-component architecture. This paper introduces plot-RAG (pRAG), a novel evaluation framework that visualizes component-level performance in RAG systems, providing granular insights into retrieval and re-ranking processes without requiring resource-intensive LLM-based evaluation. The effectiveness of pRAG is demonstrated by analyzing a real-world technical documentation question-answering system. Additionally, a methodology for generating and validating synthetic evaluation datasets is presented, showing that such datasets can match or exceed manually prepared datasets for RAG assessment. The experiments confirm that the retrieval component represents the most critical performance bottleneck in RAG systems, and a formula is provided to determine the optimal retrieval size based on response time requirements. These contributions enable a more efficient and targeted evaluation of RAG systems, particularly in specialized domains where the creation of ground truth data typically requires substantial expert involvement.

Pages: 3 to 8

Copyright: Copyright (c) IARIA, 2025

Publication date: July 6, 2025

Published in: conference

ISBN: 978-1-68558-287-6

Location: Venice, Italy

Dates: from July 6, 2025 to July 10, 2025