Effectiveness and Optimization of Large Language Models in Natural Language Processing for MongoDB Data Retrieval

School of Science | Master's thesis

Date

2024-08-19

Department

Major/Subject

Data Science

Mcode

SCI3115

Degree programme

Master's Programme in ICT Innovation

Language

en

Pages

75+18

Series

Abstract

Large Language Models (LLMs) have attracted widespread attention for their ability to mimic language understanding and generate compelling text. In structured output tasks, however, their excessive creativity often produces unreliable completions, which makes the quality of their outputs difficult to evaluate. This thesis focuses on refining the application of LLMs to data retrieval from MongoDB databases, with the aim of improving user experience and streamlining query generation from natural language. It proposes dynamic, data-agnostic prompt-engineering techniques tailored to maximize accuracy in this context. Extensive tests across various LLM architectures evaluate how well the models interpret domain-specific language and retrieve the desired information. The thesis also introduces a novel obfuscation technique that conceals prompts while preserving their underlying semantics. This method is particularly relevant for companies with stringent security requirements, as it offers a practical way to protect sensitive information in automated query systems. Overall, the work provides novel strategies for optimizing LLM inference in specialized applications and lays the groundwork for future advances in Natural Language Processing for query generation, obfuscation, and evaluation.

Description

Supervisor

Laaksonen, Jorma

Thesis advisor

Liverani, Michele

Keywords

large language model, inference, user experience, obfuscation, evaluation