Modular retrieval-augmented generation for business knowledge management: Key components and evaluation strategies

School of Science | Master's thesis

Language

en

Pages

95

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a foundational paradigm for enhancing Large Language Models (LLMs) by integrating external knowledge retrieval mechanisms. To address security concerns, companies aim to deploy RAG systems in local environments. However, implementing RAG architectures locally presents challenges, including tight computational constraints, retrieval noise, and architectural complexity. These challenges significantly limit the scalability of RAG systems and create bottlenecks in the selection of optimal models, retrieval strategies, and pipeline configurations. This thesis evaluates and compares different combinations of RAG pipelines, models, and retrieval strategies to determine how configurations should be adapted to resource constraints and dataset quality. The study applies a comprehensive methodology to evaluate the efficiency of the combined components of the RAG system, using Faithfulness, Answer Relevance, Answer Similarity, and Context Relevance as key metrics. In addition, a novel Aggregated Quality Score (AQS) is established to benchmark the overall efficiency of the various configurations. Experiments are conducted on a technical document corpus with structured content and on a conversational dataset with unstructured information. The findings indicate that the nature of the dataset directly influences retrieval and generation performance. Smaller models are highly sensitive to pipeline configuration, and specific combinations significantly enhance their performance. In contrast, larger LLMs show lower variability and benefit primarily from incremental optimizations through pipeline tuning. Thus, retrievers, pipelines, and models must be adapted to specific resource constraints and dataset characteristics for optimal efficiency.
By analyzing the interactions between key components, this thesis contributes to the development of adaptive RAG architectures, offering insights into resource-efficient, context-aware implementations and paving the way for further research.
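
The abstract names four metrics and an Aggregated Quality Score (AQS) but does not state how the AQS is computed. As a purely illustrative sketch, the Python snippet below assumes each metric is scored in [0, 1] and combines them with a weighted mean (equal weights by default). The function names, the metric keys, and the weighting scheme are hypothetical and are not taken from the thesis.

```python
# Illustrative only: the thesis's actual AQS formula is not given in the abstract.
# Assumption: each metric lies in [0, 1] and the aggregate is a weighted mean.

METRICS = (
    "faithfulness",
    "answer_relevance",
    "answer_similarity",
    "context_relevance",
)


def aggregated_quality_score(scores, weights=None):
    """Return a weighted mean of the four metric scores (hypothetical AQS)."""
    if weights is None:
        # Equal weighting is an assumption, not the thesis's definition.
        weights = {name: 1.0 for name in METRICS}
    missing = [name for name in METRICS if name not in scores]
    if missing:
        raise KeyError(f"missing metric scores: {missing}")
    total_weight = sum(weights[name] for name in METRICS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * scores[name] for name in METRICS)
    return weighted_sum / total_weight


def rank_configurations(results):
    """Sort configuration names by their aggregate score, best first."""
    scored = [(name, aggregated_quality_score(s)) for name, s in results.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Under these assumptions, passing a dictionary that maps each pipeline configuration name to its four metric scores to `rank_configurations` yields the configurations ordered from highest to lowest aggregate score, which is one simple way to compare model, retriever, and pipeline combinations on a single scale.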

Supervisor

Laaksonen, Jorma

Thesis advisor

Zubowicz, Grégory
