Conversational interfaces for data analysis: Evaluating modular agent architectures

School of Science | Master's thesis

Language

en

Pages

77

Abstract

This thesis investigates the performance, usability, and interpretability of different large language model (LLM)-driven agent architectures for translating natural language questions into SQL (NL2SQL) and performing further data analysis on the obtained results, with a focus on real-world datasets from the telecommunications domain. Three core architectures were implemented and compared: a Simple-Chain baseline, a ReAct-style agent with variants (including retrieval and Python tool integration), and a modular multi-agent system coordinated by a Supervisor agent. The system was built using LangChain and LangGraph for orchestrating tool use and memory, and deployed via FastAPI with WebSocket-based streaming to enable real-time interaction. Two experiments were conducted: (1) a benchmark evaluation using ten handcrafted NL2SQL tasks with expert-designed reference trajectories, and (2) a user study in which telecommunications professionals evaluated the agents’ output quality and responsiveness. Results show that the RAG-augmented ReAct agent consistently outperformed all others in correctness, judge alignment, and reasoning efficiency, validating the benefit of retrieval-enhanced reasoning. The Supervisor-based multi-agent system produced high-quality trajectories but introduced latency and communication overhead. The user study revealed that agents equipped with Python-based data visualizations significantly enhanced the user experience. Users preferred the ReAct agent for its faster responses and streamlined tool usage, though multi-agent designs were recognized as promising given further optimization. The findings suggest that retrieval and tool integration substantially improve LLM-based NL2SQL systems, and that agent architecture design should balance flexibility with interpretability and execution efficiency for practical deployment.
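
To make the Simple-Chain baseline concrete, the following is a minimal sketch of one way such a pipeline could look: read the schema, build a prompt, ask a model for SQL, and execute it. It is not the thesis implementation. The `calls` table, the prompt wording, the function names, and the `fake_llm` stand-in (used in place of a real LLM call so the example runs offline) are all assumptions made for illustration.

```python
import sqlite3
from typing import Callable, List, Tuple


def get_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements of all tables in the database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema: str, question: str) -> str:
    """Assemble a single-shot NL2SQL prompt (wording is hypothetical)."""
    return (
        "Given this SQLite schema:\n"
        f"{schema}\n"
        "Write one SELECT statement that answers the question.\n"
        f"Question: {question}\nSQL:"
    )


def simple_chain(
    conn: sqlite3.Connection, question: str, llm: Callable[[str], str]
) -> Tuple[str, List[tuple]]:
    """One pass with no tool loop: prompt -> SQL -> execute -> rows."""
    prompt = build_prompt(get_schema(conn), question)
    sql = llm(prompt).strip().rstrip(";")
    # Guardrail (an assumption of this sketch): allow read-only queries only.
    if not sql.lower().startswith("select"):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return sql, conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns a fixed query."""
    return (
        "SELECT region, SUM(dropped) FROM calls "
        "GROUP BY region ORDER BY region;"
    )


def demo() -> Tuple[str, List[tuple]]:
    """Run the chain against a tiny in-memory table of invented data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (region TEXT, dropped INTEGER)")
    conn.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [("north", 3), ("south", 5), ("north", 2)],
    )
    return simple_chain(conn, "How many dropped calls per region?", fake_llm)


if __name__ == "__main__":
    generated_sql, result_rows = demo()
    print(generated_sql)
    print(result_rows)
```

A ReAct-style agent would differ from this sketch by wrapping the model call in a loop in which the model can inspect query results or errors and decide on a next action, instead of issuing a single query.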

Supervisor

Juvela, Lauri
