Effectiveness and Optimization of Large Language Models in Natural Language Processing for MongoDB Data Retrieval
School of Science
Master's thesis
Authors
Date
2024-08-19
Department
Major/Subject
Data Science
Mcode
SCI3115
Degree programme
Master's Programme in ICT Innovation
Language
en
Pages
75+18
Series
Abstract
Large Language Models (LLMs) have captured widespread attention due to their impressive ability to mimic language understanding and generate compelling text. However, in structured-output tasks their tendency toward creative generation often produces unreliable completions, and the quality of those outputs is difficult to evaluate. This thesis investigates how to improve the application of LLMs to data retrieval from MongoDB databases, with the aim of improving user experience and streamlining query generation from natural language. It proposes dynamic, data-agnostic prompt-engineering techniques tailored to maximize accuracy in this context. Extensive testing across various LLM architectures evaluates their proficiency in interpreting domain-specific language and retrieving the desired information correctly. Furthermore, a novel obfuscation technique is introduced that conceals prompts while preserving their underlying semantics. This method is particularly relevant for companies with stringent security requirements, offering a practical way to protect sensitive information within automated query systems. Overall, this work provides novel strategies for optimizing LLM inference in specialized applications, laying the groundwork for future advancements in Natural Language Processing for query generation, obfuscation, and evaluation.
Supervisor
Laaksonen, Jorma
Thesis advisor
Liverani, Michele
Keywords
large language model, inference, user experience, obfuscation, evaluation