Inertial motion capture and live performance (with a focus on dance)

Introduction

We must expect great innovations to transform the entire technique of the arts, thereby affecting artistic invention itself and perhaps even bringing about an amazing change in our very notion of art. (Valéry in Benjamin, 1935, n.p.)

3D motion capture is a way of translating physical performance into the language of virtual performance. It is a medium that plots motion, usually human motion, converting it into data that can be represented visually and spatially in limitless projected forms. Major motions of the body, such as the movement of the limbs, head and torso, are tracked using sensors or markers attached at strategic points on the body. An inertial suit typically has around seventeen sensors. With more sophisticated systems, finger movement and facial expressions can also be tracked. Due to its expense and complexity, motion capture is not often used in live performance situations but, as it becomes easier to use in real-time and closer to tapping into the huge legacy of 3D animation capabilities, much of the creative potential that it has promised in the past can finally be delivered, and more.

One persistent problem with linear (non-interactive) media and, to a lesser extent, games media is the inflexibility of timing; a sequence, once begun, ticks along with the exactness and repeatability of a metronome and can either dictate the pace of the live performance or operate at a pace independent of it. Real-time performance animation using 3D motion capture offers new possibilities for hybrid performances in which projected bodies, scenery and effects can react to the twists and turns of human interaction. The inertial motion capture system uses strategically placed sensors at points around the body to measure changes in orientation. Orientation data from these sensors is sent wirelessly to a receiver attached to a computer. These orientation values, in combination with an actor file (a file that contains the performer's body measurements), are used to create a 3D representation of a performance in digital form.
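
As a rough sketch of this pipeline (the joint names, segment lengths and orientation values below are purely illustrative, not taken from any particular mocap system), per-sensor orientations can be combined with the segment lengths stored in an actor file through simple forward kinematics to produce joint positions for the virtual character:

```python
import numpy as np

# Hypothetical sketch: turning per-sensor orientations plus an "actor file"
# (segment lengths) into 3D joint positions via forward kinematics.
# Names and data are illustrative, not from any particular mocap SDK.

ACTOR_FILE = {                      # parent joint and segment length in metres
    "hips":      ("root", 0.0),
    "spine":     ("hips", 0.45),
    "head":      ("spine", 0.25),
    "upper_arm": ("spine", 0.30),
    "forearm":   ("upper_arm", 0.28),
}

def rot_z(degrees):
    """Rotation about one axis, standing in for a sensor's orientation reading."""
    r = np.radians(degrees)
    return np.array([[np.cos(r), -np.sin(r), 0],
                     [np.sin(r),  np.cos(r), 0],
                     [0,          0,         1]])

def solve_pose(orientations, root_position=np.zeros(3)):
    """Walk the skeleton from the root, accumulating each segment's rotation."""
    positions, world_rot = {"root": root_position}, {"root": np.eye(3)}
    for joint, (parent, length) in ACTOR_FILE.items():
        world_rot[joint] = world_rot[parent] @ orientations[joint]
        offset = world_rot[joint] @ np.array([0.0, length, 0.0])  # bone along local Y
        positions[joint] = positions[parent] + offset
    return positions

# One frame of (made-up) sensor orientations streamed from the suit:
frame = {j: rot_z(a) for j, a in
         [("hips", 0), ("spine", 5), ("head", 0), ("upper_arm", 40), ("forearm", 20)]}
for joint, pos in solve_pose(frame).items():
    print(joint, np.round(pos, 3))
```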

Inertial motion capture has some inherent traits that differentiate it from the more predominant optical motion capture. Inertial systems use active sensors that measure orientation, whereas optical motion capture uses an array of cameras to track points in space; inertial data is therefore rotational whereas optical data is translational. Optical systems use visible light as a means of detection, providing the level of accuracy needed for feature films. However, these systems require dedicated studio space, and are subject to line-of-sight (occlusion), contrast and reflectivity problems. With inertial motion capture the data has no global point of reference and this can lead to a range of translational errors.

For instance, if a person wearing a mocap suit walked in a circle, finishing at the starting point, a virtual character linked to these movements may end up a short distance away from where it started. This makes it difficult to relate a spot onstage to an equivalent position onscreen. Nevertheless, the suit can be used in a wide variety of environments with minimal set-up time, making it a good choice for live performance situations where real-time visuals are needed.
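
This drift can be pictured as a dead-reckoning problem: because position is inferred from the motion itself rather than measured against a global reference, a small systematic error at each step accumulates. The toy calculation below (with made-up numbers) shows how a one per cent over-estimate of the turn taken at each step leaves the virtual character offset after a lap that the performer completed exactly:

```python
import math

# Toy dead-reckoning sketch (illustrative numbers only): inertial data gives the
# character's position by integrating its own motion rather than by measuring
# against a fixed reference, so a small systematic error accumulates over a walk.

STEPS = 40
STRIDE = 0.7                       # metres per step
TURN = 2 * math.pi / STEPS         # turn per step needed to walk a closed circle

def walk(turn_per_step):
    """Integrate the walk step by step and return the final (x, y) position."""
    x = y = heading = 0.0
    for _ in range(STEPS):
        x += STRIDE * math.cos(heading)
        y += STRIDE * math.sin(heading)
        heading += turn_per_step
    return x, y

true_end = walk(TURN)             # the performer: back at the start (to within rounding)
drifted_end = walk(TURN * 1.01)   # the virtual character: each turn over-read by 1%
print("performer offset from start:  %.3f m" % math.hypot(*true_end))
print("character offset from start:  %.3f m" % math.hypot(*drifted_end))
```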

Digital performance

Dance has been at the forefront of digital performance since the eighties. Mark Coniglio and Dawn Stoppiello, who went on to form Troika Ranch, used flexible rods to provide digital information on a dancer's movements back in 1989, eventually developing ‘Isadora’, software that is still commonly used for creating real-time visual effects not only by dance artists, but also by theatre artists and VJs. Through the nineties dance companies such as Electronic Dance Theatre, Riverbed, AlienNation Company and Troika Ranch experimented with forms of motion capture.

However, it was a time when real-time use of motion capture for stage was a very expensive, unpredictable and technically difficult undertaking so the environments used were either not real-time or not 3D. For real-time visuals, video–based motion tracking has been and still is much more commonly used in dance than 3D motion capture. Programs like EyesWeb, vvvv and EyeCon are used to track the outline or a number of key points of a dancer's movements, usually from a single video feed. Effects can then be applied dynamically to these key points to create real-time visuals. This technique has been used to spectacular effect by companies such as Palindrome, Chunky Move and Jambird.

A very early performance using 3D motion capture took place when Riverbed members Paul Kaiser and Shelley Eshkar collaborated with choreographer Merce Cunningham to create Biped (1999). This was a landmark performance in which motion-captured sequences were used to drive abstracted 'hand-drawn' models of dancers projected onto scrim at the front of the stage. Ghostcatching (1999), a digital dance installation, was a collaboration between Riverbed and dancer/choreographer Bill T. Jones. In a blackened room, ghostly hand-drawn figures move about in projected three-dimensional space tracing lines, gradually filling the space and obscuring the figures that are creating them. The digital opera Monsters of Grace (1998), composed by Philip Glass and directed by Robert Wilson, also used pre-rendered 3D motion capture as the basis for visuals. This blend of technology and dance, according to Birringer (1999, pp. 361-381), was more art than performance, in that it

... emphasised technical execution and precision, drawing lines in space and filtering out all psychological and emotional connotations.

A theatrical production, David Saltz’s version of The Tempest (2000), was arguably the first production to use full axis 3D motion capture in real-time onstage. Ariel was played both as a live actor and a virtual actor. The actor was fitted with electromagnetic trackers, essentially controlling the movement of the virtual Ariel. This allowed for a consistency of motion and manner between the two forms. Saltz saw the technology as heralding a new era of mediatised performance concluding that ‘an age of interactive, live media is upon us’ (Saltz, 2001, p. 127).

The Advanced Computing Center for the Arts and Design (ACCAD) at Ohio State University has been a hub for interdisciplinary productions using 3D motion capture. Landing Place (2004), choreographed by Bebe Miller and animated by Vita Berezina-Blackburn, was a collaboration between artists, performers and technicians located in different parts of the United States. In the production, prerecorded motion capture sequences were used to drive movement in a range of nonanthropomorphic scenes, which were projected as visuals during the dance (Berezina-Blackburn & Miller, 2005).

In 2005 Paul Kaiser, Shelley Eshkar and Michael Girard, along with composer Curtis Bahn, worked with Trisha Brown and Bill T. Jones to create the dance production Motion-e (David, 2005). This was carried out as a broad multidisciplinary collaboration through the Herberger College of Fine Arts and the Fulton School of Engineering. Unlike the earlier productions of the late nineties, the stage was set up as an optical motion capture area so that the dance onstage could drive real-time effects and sound.

Other productions that used motion capture include Point A to B (2007) by the United Kingdom-based Urban Freeflow, performance visuals based on the urban sport of parkour (a way of creatively and acrobatically getting from one point to another in an urban environment), and Loops (2001), a continuous open-source digital artwork based on the movement of Merce Cunningham's hands.

The productions after 2000 were carried out at a time when the fervour that had surrounded the use of the latest interactive technology in live performance was waning. In many respects some of the promise of what visual technology offered in the nineties as a tool in live performance fell victim to overexpectation (Daniels, 2002). This was often because the hype surrounding new technology inflated the reality, and 'people often do not distinguish between what it might be able to do and what it truly can do' (Grady, 2003, p. 167). deLahunta (2002, p. 114) stressed the need to engage the audience more actively and to emphasise performativity over visual display:

Since the mid to late 80s (with precedents established earlier), some dancers and choreographers have been exploring various interactive computer systems, but their works tend to integrate these systems into presentations in essentially proscenium-like settings and not engage in open and participatory models allowing the audience/user/viewer to cross the border between performance space and spectating space.

Earlier productions that used 3D motion capture were able to represent the subtleties of movement in visually captivating and groundbreaking ways, but were generally unable to use real-time rendering. The control of the timing and communication was not in the hands of the performer. The importance of variation in performance is a recurring theme when digital media is fused with live performance. Mark Coniglio (2004, p. 6) recognised the limitations of recorded media and, in his productions, provided dancers onstage with control so that the performance was not constrained by the inflexibility of the projected imagery. Consequently, the performance became more fluid and spontaneous, a dynamic he calls 'the chaos of the organic' (ibid.).

Digital media is wonderful because it can be endlessly duplicated and/or presented without fear of the tiniest change or degradation, but it is this very quality (the media's 'deadness') that is antithetical to the fluid and ever changing nature of live performance.

Time is an ingredient in communication, and small changes in timing can affect the way gestures and mannerisms are interpreted. The interplay of timing and body language is the essence of performativity. Schechner regards performativity as underlying theatricality: gestures, movements and sounds, 'if not universally understood, come close to conveying the same feelings everywhere' (Schechner, 1994, p. 43).

Paralanguage (nonverbal language) arguably transcends cultural differences more readily than spoken or written language. In Schechner’s interpretation, performativity in physical communication is transcultural, unlike verbal or written communication, which is more localised. This indicates that paralanguage is more fundamental to communication than speech. A real-time animation environment is a way in which paralanguage can be completely isolated from the body creating it, even placed in an entirely different visual context, opening up intriguing possibilities for theatrical communication between performer and audience.

Performance projects

Due to its short history there is little information available that deals with 3D motion capture as a live performance medium. The few well-documented examples of its use tend to focus on the performance outcomes rather than technical processes. Rather than primarily focusing on performative and aesthetic outcomes, this study explores the tools and techniques used in real-time animation to provide the groundwork for its use in live performance. The aim is to give a general overview of what is achievable and some indication of how well it works, so that the choreographer, designer, director or new media artist can make informed creative decisions. The four practice-led investigations that form the basis of the study and are discussed in this paper are:

  • A Brush with the Real World – a real-time virtual artist.
  • Chasing Shadows – a stage interaction between an actor and his shadow.
  • Private Eyes – an instant movie experiment.
  • Motionics – an exploration of performance animation and dance.

The projects focus on performative examples that exploit the unique attributes of real-time motion capture. These attributes are:

  • The mediated performance can be viewed as the live performance is happening (in real-time).
  • It uses full axis 3D data, opening it up to the vast realm of 3D techniques including effects, behaviours and physics.
  • It can simulate the nuances and spontaneity of paralanguage.
  • It can represent a performance in many artistically distinct ways onscreen.
  • The created visual environment is viewable from any angle.

Many of the techniques in these projects are well known to those who work with animation. Applying these techniques using motion capture to create real-time animation is less well known. There are a number of well-documented examples of real-time animated characters being used in a public forum; these are typically real-time applications by animation production houses rather than hybrid performances by live performance troupes.

This difference in emphasis is important in determining how the subject matter is approached and how the technology is used. In live performance the 'liveness' is primary and animation is an augmentation, whereas in animation the emphasis is on the animated outcome; liveness is an attribute that helps the workflow, but the performer is typically disconnected from the final product.

Working in the studio on A Brush with the Real World. Image: John Haag

Chasing Shadows and A Brush with the Real World are theatrically based installations, using animation techniques that are also applicable to dance. Both, in different ways, explore how a projected story can be created live, including how scene changes can occur, methods for using more than one character, and ways in which a character can interact with objects in the scene. Motionics is not a single project but a collection of trials using basic environments that together provide an overview of real-time 3D and its application to dance.

Private Eyes explores the concept of the 'instant movie'. Many aspects of the 3D animation software environment are metaphorically modelled on film production; this is reflected in the use of terms such as actors, lighting, cameras and effects. Private Eyes looks at how a filmic approach adapts to a real-time animation/performance scenario.

A Brush with the Real World is designed to look at ways in which dramatic interaction can take place between a screen character and an audience. The virtual artist, Rupert, can communicate physically and verbally, and most importantly, through his paintbrush. A gestural interface is used to operate Rupert's paintbrush and a palette of colours. This allows the performer to change the colour and flow of the paint using gestures of their non-painting arm rather than through a device such as a joystick or keyboard, giving them more immediate control.
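
A minimal sketch of how such a gestural interface might be wired up, assuming hypothetical joint measurements and thresholds rather than Rupert's actual rig: the height of the non-painting hand selects a colour from the palette, while the extension of that arm scales the paint flow.

```python
# Minimal sketch of a gestural paint interface (hypothetical joint names and
# thresholds, not Rupert's actual rig): the non-painting arm selects colour and
# scales paint flow, so no keyboard or joystick is needed.

PALETTE = ["red", "orange", "yellow", "green", "blue", "violet"]

def paint_controls(left_hand_height, left_arm_extension):
    """Map the non-painting arm's pose to (colour, flow).

    left_hand_height   -- hand height above the hips, 0.0 (at hips) to 1.0 (overhead)
    left_arm_extension -- how straight the arm is, 0.0 (bent) to 1.0 (fully extended)
    """
    height = min(max(left_hand_height, 0.0), 1.0)
    index = min(int(height * len(PALETTE)), len(PALETTE) - 1)
    flow = min(max(left_arm_extension, 0.0), 1.0)        # 0 = no paint, 1 = full flow
    return PALETTE[index], flow

# Example frame: hand held shoulder-high, arm three-quarters extended.
print(paint_controls(left_hand_height=0.6, left_arm_extension=0.75))
# -> ('green', 0.75)
```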

Rupert is also able to speak and has a range of facial expressions and eye movements that extend his ability to communicate in a live exchange with an audience. A Brush with the Real World begins as a typical non-interactive animation. From this base the level of interaction evolves, beginning with simple gestures, developing into an artistic exchange using his paintbrush and eventually leading to verbal dialogue. A Brush with the Real World uses improvisational techniques to actively engage the viewer. Improvisation also brings out the unique strengths of using a real-time medium. Rupert and his brush were also used in the studio for exploring ways in which 3D environments could be used with dance. These include ways of affecting video through motion, using certain positions onstage to trigger events onscreen, or using paint flow as a means of dynamically representing the movement of the dancer over time.

Motionics uses simple constructed environments as well as some generic assets from the application MotionBuilder to explore a range of software features including virtual cameras, triggering, video, particle effects, scene changes and relation constraints, determining ways in which each can be used to connect the motions of the dancer and the mood of the dance to the projected visuals. The outcomes give an indication of how well each element performs technically and how each might be used creatively. Although the sequences are not intended for performance, each was given a distinctive name for easy reference.

In 3D environments a camera is typically used as a graphic metaphor for the viewing position. The viewing position can be changed in real-time by toggling between cameras in a scene or by constraining the motion of a camera to a drawn path. As in movies, each camera position can establish a different relationship between the dancer and accompanying imagery. Using a first-person view, the audience is transposed into the eyes of the dancer as s(he) moves through the virtual scene, so it becomes a mutual journey between audience and dancer. A virtual camera can be used in ways that are very difficult to achieve by other means; the camera becomes part of the choreography as the perspective can pan, zoom and follow complex paths around the dancer's motion represented onscreen. Toggling between multiple cameras, each of which can have unique visual elements, further enhances this flexibility.
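
The sketch below illustrates both devices in generic terms (it is not MotionBuilder's actual API): toggling between preset cameras, including a first-person camera borrowed from the dancer's head position, and a camera constrained to a circular path that orbits the dancer.

```python
import math

# Generic sketch of two real-time camera devices (not MotionBuilder's actual API):
# toggling between preset cameras, and constraining a camera to a circular path
# so the viewpoint continuously orbits the dancer.

CAMERAS = {
    "front":        {"position": (0.0, 1.6, 6.0)},
    "first_person": {"position": None},          # filled in from the dancer's head
    "overhead":     {"position": (0.0, 8.0, 0.0)},
}

def toggle_camera(name, dancer_head):
    """Return the viewing position for the named camera."""
    cam = dict(CAMERAS[name])
    if cam["position"] is None:                  # first-person: borrow the head position
        cam["position"] = dancer_head
    return cam

def orbit_camera(t, centre, radius=4.0, period=12.0, height=1.8):
    """A camera constrained to a circular path around `centre`, one lap per `period` seconds."""
    angle = 2 * math.pi * (t % period) / period
    cx, _, cz = centre
    return (cx + radius * math.cos(angle), height, cz + radius * math.sin(angle))

head = (0.2, 1.7, 0.0)
print(toggle_camera("first_person", head))
print(orbit_camera(t=3.0, centre=(0.0, 0.0, 0.0)))
```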

In one test, a moving camera constrained to a path was used to watch a virtual dancer from continuously changing perspectives. The motion data generated by a dancer was used to drive a number of virtual dancers at the same time. A similar effect was obtained by using virtual mirror surfaces to create multiple perspectives of the same character.
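
A small sketch of the multiple-dancer idea (illustrative only, with made-up joint data): one stream of joint positions drives several copies, each with its own offset, while a mirror copy flips positions across a virtual mirror plane.

```python
# Small sketch (illustrative only) of one data stream driving several virtual
# dancers: each copy applies an offset, and a "mirror" copy flips positions
# across the x = 0 plane so the same motion appears from opposite perspectives.

def offset_copy(joint_positions, dx=0.0, dz=0.0):
    """A second dancer driven by the same data, displaced across the stage."""
    return {j: (x + dx, y, z + dz) for j, (x, y, z) in joint_positions.items()}

def mirror_copy(joint_positions):
    """A reflected dancer, as if seen in a virtual mirror surface at x = 0."""
    return {j: (-x, y, z) for j, (x, y, z) in joint_positions.items()}

frame = {"hips": (0.4, 1.0, 0.0), "head": (0.45, 1.7, 0.1)}   # one frame of live data
dancers = [frame, offset_copy(frame, dx=2.0), mirror_copy(frame)]
for i, d in enumerate(dancers):
    print("dancer", i, "head at", d["head"])
```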

Computer graphic effects can add atmosphere or spectacle to a real-time environment, but the range of effects that can be generated in real-time with the software available is limited. The effects used in Motionics were limited to particle effects that could be easily created within the software. Particle effects are created by the emission of particles that each conform to a consistent appearance and behaviour. In large numbers, particle effects can be used to simulate smoke, cascading water or fireworks. The particles can be animated to create the appearance of large flocks of birds or schools of fish. Particle effects can generally be used in real-time without problems; however, if they are used extensively, dropped frames or inaccurate representation onscreen may occur. In the study particle emitters were constrained to the motions of the character's skeleton to create abstracted liminal bodies. Particle effects were also used to create motion trails, some directly following the movement of the dancer and others using less direct motion relationships to create a more eclectic effect. Images were applied to particles and these could be dynamically constrained by the dancer's motions.
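
The following sketch gives a generic picture of how emitters constrained to the skeleton produce motion trails (it is not the software's own particle system): each emitter is locked to a joint, spawns a few short-lived particles at that joint's position every frame, and the surviving particles trace the joint's recent path.

```python
import random

# Rough sketch of particle motion trails (generic, not the software's own particle
# system): emitters are locked to skeleton joints, spawn short-lived particles at
# the joint's current position each frame, and the surviving particles form a trail.

class Particle:
    def __init__(self, position, lifetime=30):
        self.position = list(position)
        self.lifetime = lifetime          # frames left to live

class JointEmitter:
    def __init__(self, joint_name, rate=5):
        self.joint_name = joint_name
        self.rate = rate                  # particles spawned per frame
        self.particles = []

    def update(self, joint_positions):
        x, y, z = joint_positions[self.joint_name]
        for _ in range(self.rate):        # spawn near the joint, with a little jitter
            jx, jy, jz = (random.uniform(-0.02, 0.02) for _ in range(3))
            self.particles.append(Particle((x + jx, y + jy, z + jz)))
        for p in self.particles:          # age particles; drift them slightly upward
            p.lifetime -= 1
            p.position[1] += 0.01
        self.particles = [p for p in self.particles if p.lifetime > 0]

# Two frames of (made-up) wrist positions produce the start of a trail.
emitter = JointEmitter("right_wrist")
for wrist in [(0.5, 1.2, 0.0), (0.55, 1.25, 0.05)]:
    emitter.update({"right_wrist": wrist})
print(len(emitter.particles), "particles in the trail")
```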

Some of the 3D environments used in Motionics. Image: John Haag

Video can also be incorporated into a scene, and was tried as a background element as well as a projection onto a moving character. By applying it to the surface mapping of the character, reflective and transparent effects were easily created, and by using the virtual body as a projection template it became a moving, changing screen. Relation constraints offer many ways of linking the attributes of one element in a scene to the attributes of another.

For instance, the rotation of a cube can be linked to the colour of a sphere. In Motionics relation constraints were used to trigger a shower of sparks when the character moved through certain points in the scene, and also to change the colours of particles using hand gestures. Relation constraints were also used in several ways to link the dancer's movements to the scenery or objects in the scene. One scene used a wall of eyes that were locked onto the actor; wherever the actor moved, the eyes followed. Another used a projected backdrop that exaggerated the travel of the dancer.
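
A hedged sketch of the relation-constraint idea in plain code (illustrative only, not the software's node graph): one element's attribute is computed from another's, and a simple position test on the character fires the spark effect when a trigger zone is entered.

```python
# Generic sketch of relation constraints (illustrative only, not the software's
# node graph): one element's attribute is driven by another's, and a trigger zone
# in the scene fires an effect when the character passes through it.

def cube_rotation_to_sphere_colour(cube_rotation_deg):
    """Link the cube's rotation (0-360 degrees) to the sphere's colour as an RGB triple."""
    hue = (cube_rotation_deg % 360) / 360.0
    return (hue, 1.0 - hue, 0.5)          # simple, arbitrary mapping

SPARK_ZONE = {"centre": (2.0, 0.0), "radius": 0.5}   # a spot on the virtual floor

def check_spark_trigger(character_xz):
    """Fire a shower of sparks when the character enters the trigger zone."""
    cx, cz = SPARK_ZONE["centre"]
    x, z = character_xz
    inside = (x - cx) ** 2 + (z - cz) ** 2 <= SPARK_ZONE["radius"] ** 2
    return "emit sparks" if inside else "idle"

print(cube_rotation_to_sphere_colour(90))     # -> (0.25, 0.75, 0.5)
print(check_spark_trigger((2.1, 0.2)))        # inside the zone -> "emit sparks"
```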

Chasing Shadows is an interplay between live action, real-time animation and movie clips that creates the dramatic illusion that the actor's shadow takes on a life of its own. To make the transitions as smooth as possible, the changeovers from real-time shadow to movie clip shadow took place either out of view or at a set place in the virtual scene. Blending movie clips with real-time motion and live action was a challenge, as the position of the actor's double onscreen had no absolute point of reference and, over time, was subject to drift. To reduce this problem the actor onstage had a set location to return to. Changeovers between clips and real-time performance animation happened either at this point or when the shadow was out of view. Being able to toggle between live and recorded action has broader artistic applications in dance. Using this technique, the dancer, as performance animator, can act separately from the visuals at artistically appropriate times.
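
The changeover logic can be pictured as a simple source switch (an illustrative sketch with hypothetical names, not the actual show control used in the production): the shadow is driven either by the live stream or by a clip, and a swap is only accepted when the shadow is out of view or the actor is standing on the agreed stage mark.

```python
# Illustrative sketch of the changeover logic between live motion capture and a
# pre-recorded clip (hypothetical names): the source is only swapped when the
# shadow is out of view or the actor is standing on the agreed stage mark.

HOME_MARK = (0.0, 0.0)        # the actor's set location on stage (x, z)
TOLERANCE = 0.3               # how close counts as "on the mark", in metres

class ShadowSource:
    def __init__(self):
        self.mode = "live"    # "live" or "clip"

    def request_swap(self, actor_xz, shadow_visible):
        """Swap sources only when the join will not be noticed."""
        on_mark = (abs(actor_xz[0] - HOME_MARK[0]) < TOLERANCE and
                   abs(actor_xz[1] - HOME_MARK[1]) < TOLERANCE)
        if on_mark or not shadow_visible:
            self.mode = "clip" if self.mode == "live" else "live"
            return True       # changeover accepted
        return False          # keep the current source; try again later

shadow = ShadowSource()
print(shadow.request_swap(actor_xz=(1.5, 0.4), shadow_visible=True))   # False: mid-scene
print(shadow.request_swap(actor_xz=(0.1, 0.1), shadow_visible=True))   # True: on the mark
print(shadow.mode)                                                     # now driven by the clip
```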

Private Eyes started as an offshoot from Chasing Shadows. It is a foggy night in the inner city and a lone misty figure stands under lamplight on a street corner waiting for a bus. He slowly becomes aware of and gradually preoccupied with eyes watching him. The headlights of a passing bus or dark figures passing in the foreground occasionally punctuate the scene. Private Eyes uses particle effects, background video, scene changes, camera switching and mood lighting in real time to test the extent to which an animated movie can be performed ‘live’. Scenes with multiple characters were created using a blend of movie clips and real-time motion capture. In this study there were many variables at play and so it was approached reflexively to get a broad overview of whether all of these factors could work together without degrading the quality of the real-time screen rendering.

Problems and limitations encountered
  • Foot slide: The foot slides on the floor with each step.
  • Foot drop: The foot drops a few centimetres to the floor with each step.
  • Physical limitations: Highly vigorous, acrobatic or contorted actions may not record accurately.
  • Actor file: The accuracy of the data relies on accurate actor measurements. Tweaks can sometimes lead to unforeseen inaccuracies.
  • Frame dropping: The hardware or software cannot handle the rate of information and drops frames.
  • Relative positioning: The data only tracks relative positioning. Depending on the accuracy of the calibrations, the location of the virtual character in the scene shifts with movement.
  • Vertical tracking: Actions such as jumping and climbing stairs rely heavily on prediction and are difficult to record accurately.
  • Apparel: The suit, while wireless to the receiver, has significant amounts of hardware attached. This limits the range of costuming that can be worn with it in a live performance situation.
  • Complexity: With more complex environments, lag and frame dropping may occur.
  • Electromagnetic interference: Strong electromagnetic interference can affect the data.

Technical evaluation

In this paper inertial motion capture is viewed simply in terms of whether or not it is suitable for live performance. Its attributes are set out in the chart below, while the chart above shows some of inertial motion capture's limitations as well as some specific recurring problems that are likely to affect real-time performance.

Attributes of inertial motion capture
  • Latency: Operated in real-time with few frames dropped (there was detectable lag with more complex 3D environments).
  • Range: Can operate over distances of up to 200 metres.
  • Signal interference: Is not affected by line-of-sight problems.
  • Venue: Can be used in a large range of live environments.
  • Ease of use: It is highly portable and quick to set up.
  • Hardware requirements: It can run in real-time on standard computer equipment, including a laptop with an above-average graphics card.
  • Control: One person can operate the entire performance animation process, including the acting.
  • Updates: The software is regularly improved and each new version improves the real-time output.
  • Complexity: Elaborate 3D environments, multiple characters and complex effects can be used.

This evaluation of inertial motion capture was conducted over several years using the first commercially available inertial suit. To put the problems and limitations listed above in perspective, it should be noted that since this study began there have been significant advances in the technology that have reduced or eliminated many of them. At Siggraph 2008, all of the systems presented, optical, inertial and flexible tape, were able to apply a broad range of vigorous movements to complex 3D scenes in real-time, reliably and with few discernible glitches.

Conclusion

Using inertial motion capture equipment as a tool in research has provided insight, not just into the techniques and performance outcomes investigated, but into a much broader range of capabilities that became clear while using the software. Synaesthetic connections, gestural interfacing, the use of CG and particle effects as real-time visuals, and real-time voice animation are all areas that can be used in diverse ways in a real-time 3D environment. The artistic outcomes that can extend from these elements provide a glimpse into the future; a vision of what will be possible once real-time rendering is able to handle current 3D content creation capabilities.

The primary study, A Brush with the Real World, was performed as an improvisational window installation at the Judith Wright Centre for Contemporary Art in Brisbane in July 2009. The scene was displayed on two large screens, one facing the street for interactions with passing street traffic, and the other set up in the foyer for more in-depth dialogue with a more stationary audience. Cameras and microphones were set up so the performer could see and hear those watching. A whole range of active elements was included in the virtual scene. Rupert could turn lights on and off, change appearance and pick up objects. His pants smoked if they went near fire. Letters on a poster could be moved about. He had a range of moods and spoke with articulate mouth movements. He could paint with a range of colours and could dance to create colourful 3D trails.

All these elements helped to make it a complex multilevel improvisation in which people were generally more engrossed in communicating than passively watching. While experimental, A Brush with the Real World underlined that improvisation works well with animation, whether the form of improvisation is physical, visual or verbal. The installation showed that a gestural interface, in which movements of the body control actions in the scene, enhanced the improvisation provided there was an intuitive connection between gesture and action.

Chasing Shadows is successful as an experiment in blending real-time performance animation and movie clips in the same performance scenario. The theatricality relies on the accurate real-time rendering of the shadow and the invisibility of the technology. There was insufficient time to automate the position of the transitions between live and recorded data, so these changeovers relied on the intuition of the performer. By using a handheld switch to activate changes, the actor had total control over the action, so the entire skit could be handled onstage.

Private Eyes demonstrates that compound scenes, with atmospheric lighting and effects, multiple characters and scene changes could be used in real-time using inertial motion capture connected to a non-specialised computer. From a theatrical perspective, this opens up many ways in which projected visuals can relate to the performer onstage; from moments of reminiscence to visualised emotions, thoughts and schemes; from flights of fancy to extensions of the alter ego.

Dance, with its emphasis on the controlled motion of the body, is able to connect in complex ways with the visuals. Specific parts of the body can affect individual properties of virtual artifacts in the projected image; a twist of the hand, the speed of the arm, the height of the head above the floor can each control an event or property onscreen. Even events onstage, such as lighting changes, can be controlled through the movement of the dancer. The performance space itself can also act as an interface and any point onstage can trigger an action onscreen.

In Motionics most techniques were explored in isolation. Generally, the techniques chosen were distinct from those that could be achieved through other media. Virtual cameras allowed for quick changes of viewing position and background, and enabled both first-person and third-person perspectives. The camera could also be animated to move along paths in ways that are impossible using video. Responsive relationships between motion and screen were explored, such as scenery that responds to movement onstage and actions triggered by location or bodily gesture.

The use of video and effects also proved to be a fruitful area for creating interesting visuals. Video was projected onto a virtual character, onto background objects and onto moving objects. In-scene video could be started, stopped and swapped by the actions of the dancer. Particle effects were used as motion trails, to create interpretive effects for atmospheric elements such as mist, or to create liminal bodies.

While the inertial motion capture worked in real-time over a whole range of live performance scenarios, the animation produced was not always robust or stable. Sometimes it would work for weeks with few problems, but this might be followed by weeks in which annoying and difficult-to-solve problems crept in. For a performance situation such as a dance or theatrical production, in which real-time motion capture is used night after night, this lack of robustness would be a concern. Another significant limitation is the obtrusive electronics worn by the dancer. While this hardware is much more compact and less restrictive than the electromagnetic and exoskeletal suits of the past, its appearance makes a visual statement that can only be obscured using thoughtful costuming.

The findings from the creative practice provide the basis for further exploration and application of performance animation as a part of live performance. While the inertial suit was a breakthrough technology at the commencement of the study, in the past year or so most of the problems experienced that relate to motion capture have been minimised or solved. Each project provided technical affirmation and creative direction that, in total, demonstrate that motion capture has a promising future in live performance and at public events.

In the not-too-distant future, highly portable lightweight systems are likely to be developed that are easy to use, unrestrictive, location tolerant and highly accurate. The most recent inertial cubes are wireless, intelligent and about the size of a piece of chocolate. Based on current trends, systems will continue to reduce in cost as the technology matures and demand moves outside niche markets into more ubiquitous uses, paving the way for more widespread use in live performance. So what directions will performance take in its relationship with real-time animation technologies? That is for performance to explore and for audiences to judge.

References

  • Benjamin, W. (1935). The Work of Art in the Age of Mechanical Reproduction. In H. Arendt (Ed.), Illuminations (pp. 216 – 253). New York.
  • Berezina-Blackburn, V., & Miller, B. (2005). LANDING/PLACE: animation in dance performance. ACCAD, Ohio State University. Retrieved May 7, 2008, from http://accad.osu.edu/~vberezin/bm/landing.html
  • Birringer, J. (1999). Contemporary Performance/Technology. Theatre Journal, 51(4), 361 – 381.
  • Coniglio, M. (2004). The importance of being interactive. In G. Carver & C. Beardon (Eds.), New Visions in Performance: The Impact of Digital Technologies (pp. 5 – 12). Lisse, Netherlands: Swets & Zeitlinger B.V.
  • Daniels, D. (2002). Strategies of Interactivity (T. Morrison, Trans.). Retrieved June 5, 2005, from http://www.medienkunstnetz.de/source-text/65/
  • David, M. (2005). Real-Time Motion-E Capture Makes Dance A Digital Art. Retrieved May 6, 2008, from http://electronicdesign.com/Articles/Index.cfm?AD=1&AD=1&ArticleID=10248
  • Grady, S. (2003). Virtual Reality: Simulating and Enhancing the World of Computers, New Edition. New York: Facts on File Inc.
  • Saltz, D. Z. (2001). Live Media: Interactive Technology and Theatre. Theatre Topics, 11(2), 107 – 130.
  • Schechner, R. (2002). Performance Studies: An Introduction. London and New York: Routledge.