Adaptive prompt elicitation for text-to-image generation

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Hegemann, Lena
dc.contributor.author: Wen, Xinyi
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.school: School of Science [en]
dc.contributor.supervisor: Oulasvirta, Antti
dc.date.accessioned: 2025-10-21T17:02:24Z
dc.date.available: 2025-10-21T17:02:24Z
dc.date.issued: 2025-09-29
dc.description.abstract: Text-to-image generative models have rapidly advanced in producing high-quality and diverse visual content. However, aligning generated outputs with user intent remains challenging, particularly when users have a specific vision but struggle to articulate it precisely in natural language, leading to tedious trial-and-error refinement. This thesis presents Adaptive Prompt Elicitation, a principled method designed to bridge the gap between user intent and effective prompts for text-to-image generation. Building on an information-theoretic foundation, the approach contributes two key innovations: (1) a prompt elicitation framework that interactively elicits user intent through informative queries and automatically generates optimized prompts; (2) an efficient query selection strategy that maximizes expected alignment gain based on Bayesian experimental design. We evaluate the method on two established benchmarks for complex text-to-image generation (IDEA-Bench and DesignBench), comparing it against automatic prompt optimization, in-context query generation, and non-optimized baselines. Results demonstrate that Adaptive Prompt Elicitation achieves superior effectiveness and efficiency in aligning generated images with target intent while maintaining consistent performance across diverse scenarios. Overall, this work contributes a systematic methodology for interactive prompt optimization, advancing the usability, interpretability, and reliability of text-to-image generation systems. [en]
dc.format.extent: 62
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/140253
dc.identifier.urn: URN:NBN:fi:aalto-202510218421
dc.language.iso: en [en]
dc.programme: Master's Programme in Computer, Communication and Information Sciences [en]
dc.programme.major: Machine Learning, Data Science and Artificial Intelligence [en]
dc.subject.keyword: text-to-image generation [en]
dc.subject.keyword: human-AI alignment [en]
dc.subject.keyword: Bayesian experimental design [en]
dc.subject.keyword: preference elicitation [en]
dc.subject.keyword: prompt optimization [en]
dc.subject.keyword: interactive machine learning [en]
dc.title: Adaptive prompt elicitation for text-to-image generation [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]
dc.type.ontasot: Diplomityö [fi]
local.aalto.electroniconly: yes
local.aalto.openaccess: no
