Issues and pitfalls in applying the Metamorph Box

There are a number of issues and possible pitfalls in evaluating agent characteristics and in determining the effectiveness of the agent as distinct from the effectiveness of the application. Leaving aside technical issues in realizing the digital dossiers, which have been discussed elsewhere [Perspectives], and user-related issues such as demographic variables, gender and personality traits, as mentioned in [Ruttkay 2004], we briefly indicate the design issues and potential pitfalls in developing the experimental testbed for the Metamorph Box.

One issue that we should mention is that the complexity of the domain of cultural heritage will make it (even more) difficult to distinguish between the effectiveness of the application (the digital dossier) and the usefulness of the agent (providing guidance in using the dossier). And, when it comes to testing task-related issues, we must face the fact that only a limited number of experts have sufficient knowledge to use the dossier effectively, for example in preparing an installation of the work of art in a museum or exhibition gallery. So, to test the Metamorph Box with a sufficient number of users, we should restrict ourselves to explorative, or otherwise limited, tasks.

As concerns the design of the agent character, it is not entirely clear what design rules we should apply to make an effective and focused choice between the personality characteristics represented by the Metamorph Box. Any combination of appearance, gestures and speech characteristics will likely involve multiple factors in the PEFiC model. Given the complexity of digital dossiers, we should moreover identify parts of the dossier that are related to specific subtasks, such as viewing the video-recorded interview to find particular items of information. For these parts we can control the guidance offered by the agent, which allows for well-focused experimentation with personality characteristics related to the task at hand.

As it stands, we believe that the PEFiC model provides all the necessary ingredients for experimentally evaluating ECA systems, provided that we develop a small consistent set of design parameters for digital dossiers allowing for controlled experimentation in the given task domain.

An experimental framework for testing the Metamorph Box

We proceed from the assumption that our agents function primarily as guidance in complex interaction tasks with some information system, although conversational agents can take on different functions as well, such as a conversational partner in language learning or a demonstrator in learning a skill. In interaction tasks in complex environments we can distinguish between two paradigms of interaction, namely pure navigation and guided tours or presentations that are based on some narrative structure. Both paradigms can be augmented using embodied conversational agents, which in the case of navigation might merely give directions and, in the case of guided tours, may explain what is going on, possibly offering the user a choice of continuations. In cooperation with the Faculty of Social Sciences, we have submitted a research funding proposal to undertake such evaluation studies, based on the PEFiC theory outlined previously. In this proposal [VUBIS], we focused in particular on the relation between the type of agent (character) and the material presented (context), to determine how this relation affects the valence of the user towards the agent, resulting in the degree of distance or involvement experienced.

validation scenario(s)

  • navigation -- pure interactivity
  • guided tours -- using some narrative structure
  • agent-mediated -- navigation and guided tours

In this section we introduce the notion of a digital dossier, which is to be understood as a collection of information that may be viewed from different perspectives and that essentially contains, apart from textual information, rich media information in the form of video-recorded interviews, audio and images.

This notion of digital dossiers is deployed in a project done in cooperation with the Dutch Cultural Heritage Institute (ICN) in the context of the International Network for the Conservation of Contemporary Art (INCCA). Briefly, the idea is to develop digital dossiers for individual artworks, allowing professionals to deal with the information involved in an integrated, highly interactive fashion. The following project assignment may serve as a characterization of the notion of digital dossier:

digital dossier

Create a VR that realizes a digital dossier for a work of a particular artist. A digital dossier represents the information that is available for a particular work of art, or a collection of works, of a particular artist. The digital dossier should be multimedia-enhanced, that is, include photographs, audio and other multimedia material in a compelling manner.

Note that dossier is an existing English word, which according to Webster's New World Dictionary has the following meaning:

  • dossier (dos-si-er) [Fr < dos (back); so named because labeled on the back] a collection of documents concerning a particular person or matter

It is closely related to the notion of archive, but there is a different focus:

  • archive [...] 1) a place where public records are kept ... 2) the records, material itself ...
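To make the notion more concrete, the following is a minimal sketch in Python of a dossier as a collection of items that can be viewed from different perspectives. It is not the structure used in the actual project: the class names, field names and the perspective labels ("conservation", "biography") are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """One piece of material in the dossier (all names are illustrative)."""
    title: str
    media: str               # "text", "video", "audio" or "image"
    perspectives: frozenset  # perspectives from which the item is visible


@dataclass
class DigitalDossier:
    """A collection of items about one artwork, viewable per perspective."""
    artist: str
    artwork: str
    items: list = field(default_factory=list)

    def add(self, title, media, perspectives):
        self.items.append(Item(title, media, frozenset(perspectives)))

    def view(self, perspective):
        """Return the titles of all items visible from one perspective."""
        return [i.title for i in self.items if perspective in i.perspectives]


# Hypothetical content; the perspective labels are placeholders.
dossier = DigitalDossier(artist="example artist", artwork="example work")
dossier.add("Interview with the artist", "video", {"conservation", "biography"})
dossier.add("Installation photographs", "image", {"conservation"})
dossier.add("Catalogue text", "text", {"biography"})

print(dossier.view("conservation"))
# ['Interview with the artist', 'Installation photographs']
```

Tagging each item with the perspectives it belongs to, rather than storing one copy per perspective, keeps a single item (such as the interview) shared between views.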

Although there are many possible ways to realize a digital dossier, we have in one case study, centered around the Dutch artist Marinus Boezem, chosen to deploy a spatial metaphor for giving access to the information, namely the artist's atelier. The artworks, as well as the information about the artworks, are present in the atelier, and in addition there is a projector in one corner of the atelier with which a video recording of an interview with the artist may be shown. The artist himself, in the form of a humanoid avatar, is available as an agent, waiting in the corner of the atelier to answer questions about his work. An initial application of an artist dossier has been developed in the multimedia casus practicum at the VU, as shown in the screenshots in figure ...

In the actual evaluation studies, we must, in general, investigate what role embodied agents may play in presenting such digital dossiers to the user, and how the agents should appear, that is, what services are offered and what the actual presence of the character must be. More specifically, to validate the PEFiC theory, we may vary the personality characteristics of the agent, possibly along the perspective of interest for a particular user. However, there are a number of pitfalls in such evaluation studies, such as the structured variation of personality characteristics over the 64 different types distinguished in the PEFiC model and, as is the case in many agent applications, determining the effectiveness of the agent's contribution in relation to the information provided by the application. We will discuss these issues in more detail after giving a brief description of the platform on which we will realize the experimental framework.
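The difficulty of varying personality characteristics in a structured way over 64 types can be illustrated with a small Python sketch. It assumes, purely for illustration, that the 64 types arise from six binary factors (2^6 = 64); the text does not spell out this factorization, and the factor names `f1` to `f6` are hypothetical placeholders. The sketch enumerates all types, then picks a pair of conditions that differ in a single factor and distributes participants evenly over that pair.

```python
from itertools import product

# Hypothetical factor names; only the total of 64 types comes from the text.
FACTORS = ["f1", "f2", "f3", "f4", "f5", "f6"]


def all_types():
    """Enumerate every combination of six binary factors (2**6 = 64)."""
    return [dict(zip(FACTORS, levels))
            for levels in product((0, 1), repeat=len(FACTORS))]


def contrast_pair(base, factor):
    """Two conditions that differ only in `factor`; the rest follows `base`."""
    low = dict(base, **{factor: 0})
    high = dict(base, **{factor: 1})
    return low, high


def assign(participants, conditions):
    """Round-robin assignment, so group sizes differ by at most one."""
    groups = {i: [] for i in range(len(conditions))}
    for n, person in enumerate(participants):
        groups[n % len(conditions)].append(person)
    return groups


types = all_types()
print(len(types))  # 64

pair = contrast_pair(types[0], "f3")
groups = assign([f"p{k}" for k in range(10)], pair)
print({i: len(g) for i, g in groups.items()})  # {0: 5, 1: 5}
```

Under these assumptions a full crossing of all 64 types would spread a small pool of participants very thinly, whereas a single-factor contrast keeps the groups at a workable size.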

The DLP/STEP agent platform

In a number of papers [see refs ECA Perspectives], including the paper ECA Perspectives presented at the Dagstuhl Seminar on Evaluating Embodied Conversational Agents, we have described the DLP+X3D platform and the STEP scripting language for animating humanoid characters, as well as a number of applications that have served as target demonstrators in developing the platform.

The design of the DLP/STEP platform was motivated, among other things, by the requirements listed below.

scripting behavior

  • convenience -- for non-professional authors
  • compositional semantics -- combining operations
  • re-definability -- for high-level specification of actions
  • parametrization -- for the adaptation of actions
  • interaction -- with a (virtual) environment

These requirements make DLP/STEP especially suitable for the development of reusable libraries of gestures that may be adapted to different personal styles with respect to expressiveness and mood.
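What compositional semantics and parametrization could mean for a gesture library can be sketched as follows. This is a Python illustration and not STEP syntax: the combinator names `seq` and `par`, the `Move` primitive, the body-part names and the `wave` gesture are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Primitive action: move one body part over a duration in seconds."""
    part: str
    target: str
    duration: float


@dataclass(frozen=True)
class Seq:
    actions: tuple  # performed one after another


@dataclass(frozen=True)
class Par:
    actions: tuple  # performed at the same time


def seq(*actions):
    return Seq(actions)


def par(*actions):
    return Par(actions)


def duration(action):
    """Total duration: sum for a sequence, maximum for parallel actions."""
    if isinstance(action, Move):
        return action.duration
    if isinstance(action, Seq):
        return sum(duration(a) for a in action.actions)
    return max((duration(a) for a in action.actions), default=0.0)


def scale(action, tempo):
    """Parametrization: the same gesture performed at a different tempo."""
    if isinstance(action, Move):
        return Move(action.part, action.target, action.duration / tempo)
    return type(action)(tuple(scale(a, tempo) for a in action.actions))


def wave(side="r", beats=2):
    """A reusable gesture, defined in terms of smaller actions."""
    swing = seq(Move(f"{side}_elbow", "left", 0.25),
                Move(f"{side}_elbow", "right", 0.25))
    return seq(Move(f"{side}_shoulder", "up", 0.5),
               *[swing] * beats,
               Move(f"{side}_shoulder", "down", 0.5))


greeting = par(wave("r"), Move("head", "nod", 1.0))
print(duration(greeting))            # 2.0
print(duration(scale(greeting, 2)))  # 1.0
```

In this sketch a more or less expressive style is obtained by passing different parameters (side, number of beats, tempo) to the same gesture definition, rather than by writing a new gesture for each style.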

As indicated in the Dagstuhl paper, a number of improvements to the DLP/STEP platform are necessary for dealing with streaming media and high-performance graphics. We will not discuss these improvements any further here.

Issues and pitfalls in evaluating agent characteristics

As already indicated, there are a number of issues and possible pitfalls in evaluating agent characteristics and in determining the effectiveness of the agent as distinct from the effectiveness of the application.

A number of these issues have been identified in the above-mentioned Dagstuhl seminar on evaluating embodied conversational agents. For our application, or rather experimentation platform, we need to factor out appropriate clusters of character traits and adapt the agent guiding the user through a digital dossier, so that the user's involvement with the application can be related to the actual characteristics of the agent.




