Infoscience

Thesis

Design of multimodal dialogue-based systems

Multimodal dialogue systems integrate advanced (often spoken) language technologies into human-computer interaction. Designing such complex systems requires extensive human expertise and systematic design guidelines that take into account the limitations of the underlying technologies. This thesis therefore aims to reduce the time and effort needed to build such systems by creating application-independent techniques, tools and algorithms that automate the design process and make it accessible to non-expert application developers. The thesis proposes an interactive system prototyping methodology which, together with its software implementation, allows rapid building of multimodal dialogue-based information-seeking systems. When designed with our methodology, even partially implemented system prototypes can immediately be tested with users through Wizard of Oz simulations (which are integrated into the methodology) that reveal user behavior and modality-use models. Involving users in early development phases increases the chances that the targeted system will be well accepted by end users. With respect to dialogue system design, we propose a two-layered dialogue model as a variant of the standard frame-based approach. The two layers of the proposed dialogue model correspond to local and global dialogue strategies. One of the important findings of our research is that the two-layered dialogue model is easily extensible to multimodal systems. The methodology is illustrated in full detail through the design and implementation of the Archivus system, a multimodal (mouse, pen, touchscreen, keyboard and voice) interface that allows users to access and search a database of recorded and annotated meetings (the Smart Meeting Room application).
The final part of the thesis is dedicated to an overall qualitative evaluation of the Archivus system (user performance, user satisfaction, and analysis of the problems encountered) and to a quantitative evaluation of all the implemented dialogue strategies. Our methodology is intended (1) for designers of multimodal systems who want to quickly develop a multimodal system in their application domain, (2) for researchers who want to better understand human-machine multimodal interaction by experimenting with working prototypes, (3) for researchers who want to test new modalities within the context of a complete application, and (4) for researchers interested in new approaches to specific issues related to multimodal systems (e.g. the multimodal fusion problem).
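
To make the two-layered idea from the abstract more concrete, the following minimal Python sketch separates a local strategy (handling one turn for a single slot of a frame) from a global strategy (deciding what the dialogue does next across the whole frame). It is only an illustration under stated assumptions, not the implementation used in the thesis: the names `Slot`, `Frame`, `local_strategy` and `global_strategy`, the example slots, and the reprompt behavior are all hypothetical, since the abstract does not specify how the two layers are realized.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Slot:
    """One attribute of the frame, e.g. a search constraint (hypothetical)."""
    name: str
    prompt: str
    value: Optional[str] = None
    attempts: int = 0


@dataclass
class Frame:
    """A frame is a set of named slots to be filled during the dialogue."""
    slots: Dict[str, Slot] = field(default_factory=dict)

    def next_open_slot(self) -> Optional[Slot]:
        # Slots are visited in insertion order; the first unfilled one wins.
        for slot in self.slots.values():
            if slot.value is None:
                return slot
        return None


def local_strategy(slot: Slot, user_input: Optional[str]) -> str:
    """Local layer (assumed): handle a single turn for one slot."""
    slot.attempts += 1
    text = (user_input or "").strip()
    if text:
        slot.value = text
        return f"OK, {slot.name} = {slot.value}."
    if slot.attempts >= 2:
        # Assumed behavior: a more explicit reprompt after repeated silence.
        return f"Sorry, I still need the {slot.name}. {slot.prompt}"
    return slot.prompt


def global_strategy(frame: Frame) -> str:
    """Global layer (assumed): choose the next move across all slots."""
    slot = frame.next_open_slot()
    if slot is None:
        constraints = ", ".join(
            f"{s.name}={s.value}" for s in frame.slots.values()
        )
        return f"Searching with {constraints}"
    return slot.prompt


if __name__ == "__main__":
    frame = Frame({
        "speaker": Slot("speaker", "Who spoke?"),
        "topic": Slot("topic", "Which topic?"),
    })
    print(global_strategy(frame))
    print(local_strategy(frame.slots["speaker"], "Alice"))
    print(global_strategy(frame))
    print(local_strategy(frame.slots["topic"], "budget"))
    print(global_strategy(frame))
```

In this sketch the input reaching `local_strategy` is plain text, so in principle it could come from any modality (typed, spoken, or selected with a pointer) once converted to a slot value; how the thesis actually extends the model to multiple modalities is not described in the abstract.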
