Addressing language ambiguities for object identification in robotics

School of Electrical Engineering | Master's thesis

Language

en

Pages

53

Abstract

This thesis addresses the problem of ambiguity in natural language commands for object identification in robotic environments. Ambiguities often arise from underspecified or context-dependent user instructions and continue to pose a challenge in Human-Robot Interaction. To tackle this issue, the thesis proposes a disambiguation framework that combines the capabilities of Large Language Models with the strengths of a structured graph-based approach. It also proposes a new taxonomy of ambiguity types that builds upon previous work. The proposed framework consists of a multi-stage pipeline divided into three distinct modules. The first module parses a user’s natural language query and transforms it into a structured graph representation using a Large Language Model. The second module searches the environment, which is represented as a 3D Scene Graph, for objects that match this graph structure, while simultaneously identifying the type of ambiguity according to the proposed taxonomy. The last module generates appropriate disambiguation feedback based on predefined templates and contextual information. The framework was evaluated on object identification tasks involving ambiguous queries. Results show that GPT-3.5 frequently failed to generate valid node structures, with a failure rate of 41%. In contrast, GPT-4o achieved high recall rates (above 90%), which is crucial for ambiguity resolution. These findings demonstrate the effectiveness of the proposed framework in object identification when it is paired with a capable Large Language Model.
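
The second and third modules described in the abstract can be illustrated with a very small sketch. This is not the thesis implementation: the `SceneObject` class, the `find_candidates` and `feedback` functions, the flat list standing in for a 3D Scene Graph, and the three outcomes (no match, unique match, multiple matches) are all hypothetical simplifications and do not reproduce the proposed taxonomy. The first module (LLM parsing) is omitted; the query is assumed to arrive already structured as a label plus attributes.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """One node of a toy scene graph (hypothetical structure)."""
    node_id: str
    label: str
    attributes: dict = field(default_factory=dict)


def find_candidates(query_label, query_attributes, scene):
    """Return scene objects whose label and attributes all match the query."""
    return [
        obj for obj in scene
        if obj.label == query_label
        and all(obj.attributes.get(k) == v for k, v in query_attributes.items())
    ]


def feedback(query_label, candidates):
    """Fill a fixed template depending on how many candidates matched."""
    if not candidates:
        return f"I could not find any {query_label}."
    if len(candidates) == 1:
        return f"Found {query_label}: {candidates[0].node_id}."
    # Several objects match: ask about an attribute that could tell them apart.
    # Only "color" is considered here, purely for illustration.
    colors = sorted({c.attributes.get("color", "unknown") for c in candidates})
    return (
        f"I see {len(candidates)} objects matching '{query_label}'. "
        f"Which color: {', '.join(colors)}?"
    )
```

Under these assumptions, an underspecified query such as "the cup" in a scene with a red and a blue cup yields two candidates and a clarification question, while "the red cup" resolves to a single node.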

Supervisor

Kyrki, Ville

Thesis advisor

Mihaylova, Tsvetomila
Verdoja, Francesco
