SALS-SIG Research Seminar

UI on the Fly: Generating a multimodal user interface with functional unification grammar


Speaker:

David Reitter

MEDIA LAB EUROPE
Date: Monday, 15th December 2003
Time: 11am-12:30pm
Location: ICS Seminar Room (357), Building E7B, Macquarie University

Abstract:

It's a common phenomenon that conference calls are more stressful than face-to-face meetings. One reason is that human communication relies on multiple channels, among them natural language, eye gaze, and body posture. Humans can easily coordinate content presented in different modes. If this multimodality is constrained, humans encounter difficulties, for example when they need to use a computer interface that offers only graphical metaphors or only voice.

Multimodal human-computer interfaces explore parallel multimodal communication. While today's user interfaces make different input and output methods available (mouse and keyboard, screen and speakers), our interfaces go beyond that. They ensure cross-channel coordination for both input and output, so the communication channels can be used in parallel. These interfaces convey not just redundant, but also complementary information. For example, they can augment a graphical user interface (GUI) with helpful audio commentary. In mobile situations, screen-based output may be simplified, or eliminated entirely, in response to a specific use situation, e.g. when driving. Similarly, the system can adapt to the needs of hard-of-hearing or visually impaired users.

In FASiL, we address the adaptivity of the user interface with a dynamic generator. Multimodal Functional Unification Grammar (MUG) is a unification-based formalism that provides the means to dynamically generate content that is coordinated across several communication modes, which currently include natural language and a GUI. The interface can adapt the content presented in each mode to the user's preferences and usage situation. An objective function defines the trade-off between predicted cognitive complexity of the output and its utility. This way, the system can select from among several possible output forms generated by the grammar.
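
As a rough illustration of the selection step described above, the following Python sketch ranks candidate outputs by a simple objective. The abstract does not state the form of the objective function, so the linear form (utility minus weighted complexity), the names Candidate, score and select_output, and all numeric values are assumptions made for illustration only; they are not taken from MUG or FASiL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible output form (hypothetical representation)."""
    description: str
    utility: float      # assumed estimate of how useful the output is
    complexity: float   # assumed estimate of predicted cognitive complexity


def score(candidate: Candidate, complexity_weight: float) -> float:
    # Assumed linear trade-off: reward utility, penalise complexity.
    return candidate.utility - complexity_weight * candidate.complexity


def select_output(candidates, complexity_weight: float) -> Candidate:
    # Pick the candidate with the highest objective value.
    return max(candidates, key=lambda c: score(c, complexity_weight))


# Made-up candidates standing in for alternatives produced by a grammar.
candidates = [
    Candidate("full screen list plus spoken summary", utility=0.9, complexity=0.8),
    Candidate("spoken summary only", utility=0.6, complexity=0.3),
]

# A higher weight models a situation where attention is scarce, e.g. driving.
at_desk = select_output(candidates, complexity_weight=0.5)
driving = select_output(candidates, complexity_weight=2.0)
print(at_desk.description)
print(driving.description)
```

In this sketch the usage situation enters only through the complexity weight: a higher weight favours simpler output, which is one plausible way to model the adaptation to situations such as driving.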

I will present an application of the formalism to the domain of mobile personal information management, where the interface is part of a larger system that also contains speech recognition and synthesis, a dialogue manager, and summarization and categorization components.

Parking: Visitors requiring a parking pass are asked to contact us at least one working day before the seminar.

Enquiries: sals@ics.mq.edu.au

Last modified: 28th November 2003