Introduction
LightGuide: Projected Visualizations for Hand Movement Guidance is a paper by Rajinder Sodhi, Hrvoje Benko, and Andrew D. Wilson. Sodhi is a graduate student at the University of Illinois pursuing his PhD in computer science under the advisement of David Forsyth and Brian Bailey. Benko received his PhD from Columbia University and now works in the Natural Interaction Research group at Microsoft Research, focusing on human-computer interaction. Wilson also works for Microsoft Research as a principal researcher, prior to which he obtained his PhD from MIT.
Summary
LightGuide is a proof-of-concept implementation of a system that takes a novel approach to gesture guidance using a projector and a depth camera (Kinect). The authors were motivated by the current how-to paradigm’s lack of valuable feedback in the form of physical interaction. Typically, when learning a gesture such as a yoga pose, an instructor provides feedback by correcting errors through physical touch. When an instructor isn’t present, users rely on videos, diagrams, and textual descriptions. With the rising dependence on do-it-yourself materials found online, this poses a challenge, one that the authors take on with LightGuide.
LightGuide uses a depth camera and a projector mounted to a fixed position on the ceiling to project gesture cues onto the hand of a user. The depth camera and projector are calibrated precisely so that the user’s hand can be mapped to 3D world coordinates and accurately projected upon. The authors devised three types of cues: follow spot, 3D arrow, and 3D pathlet. The follow spot consists of a white circle and a black arrow centered on the user’s hand indicating z-axis movement, as well as positive (blue) and negative (red) coloring indicating xy-axis movement. The 3D arrow is self-explanatory, and the 3D pathlet consists of a small path segment with a red dot indicating the user’s current position along the path.
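To make a cue land on the hand, each tracked 3D hand position has to be converted into projector pixel coordinates using the results of the camera–projector calibration. The sketch below illustrates that mapping in Python; the extrinsic transform, intrinsic matrix, and function name are hypothetical stand-ins for whatever the actual calibration produces, not values from the paper.

```python
import numpy as np

# Hypothetical calibration outputs (illustrative only, not values from the paper):
# a rigid transform from depth-camera coordinates into the projector's frame,
# and a 3x3 projector intrinsic matrix.
T_depth_to_proj = np.eye(4)
K_proj = np.array([[1400.0,    0.0, 640.0],
                   [   0.0, 1400.0, 360.0],
                   [   0.0,    0.0,   1.0]])

def hand_to_projector_pixel(hand_xyz):
    """Map a tracked 3D hand position (depth-camera frame, meters) to a projector pixel."""
    p = np.append(hand_xyz, 1.0)             # homogeneous coordinates
    p_proj = T_depth_to_proj @ p              # express the point in the projector frame
    u, v, w = K_proj @ p_proj[:3]             # perspective projection
    return u / w, v / w                       # pixel at which to center the cue

# Example: where to draw the follow spot for a hand roughly 1.5 m below the rig.
print(hand_to_projector_pixel(np.array([0.1, -0.2, 1.5])))
```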
The authors divided the cues into two categories based on whether the cue moves at a steady rate that the user follows or whether the cue advances based on the user’s speed. This categorization, along with the control cases, resulted in the following six testing scenarios: follow spot, 3D follow arrow, 3D self-guided arrow, 3D pathlet (self-guided), video projected onto the user’s hand, and video played on a screen.
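The two categories differ only in how the cue advances along the target path. Below is a minimal sketch of that distinction, assuming the gesture is stored as a list of 3D waypoints; the function name, follow rate, and reach radius are made-up defaults for illustration, not parameters reported in the paper.

```python
import numpy as np

def next_waypoint(path, i, hand_xyz, dt, mode="self_guided",
                  follow_rate=2.0, reach_radius=0.03):
    """Advance along `path` (an Nx3 array of 3D targets) and return the new index.

    'follow' cues progress at a steady rate (waypoints per second) the user is
    expected to keep up with; 'self_guided' cues advance only once the hand is
    within `reach_radius` meters of the current target.
    """
    if mode == "follow":
        i = i + follow_rate * dt                                   # steady progression
    elif np.linalg.norm(hand_xyz - path[int(i)]) < reach_radius:   # user reached target
        i = i + 1
    return min(i, len(path) - 1)
```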
The results of the experimentation revealed that, while the video cues led to much faster movement on the part of the user, the cues devised by the authors resulted in 85% more accurate performance. The most accurate cue was the follow spot, followed by the arrows, the pathlet, the projected video, and lastly, the video screen.
Related Works
- CounterIntelligence: Augmented Reality Kitchen by Leonardo Bonanni, Chia-Hsun Lee & Ted Selker
- Development of Head-Mounted Projection Displays for Distributed, Collaborative, Augmented Reality Applications by Jannick P. Rolland, Frank Biocca, Felix Hamza-Lup, Yanggang Ha & Ricardo Martins
- The Studierstube Augmented Reality Project by Dieter Schmalstieg, Anton Fuhrmann, Gerd Hesina, Zsolt Szalavári, L. Miguel Encarnação, Michael Gervautz & Werner Purgathofer
- MirageTable: Freehand Interaction on a Projected Augmented Reality Tabletop by Hrvoje Benko, Ricardo Jota & Andrew D. Wilson
- Efficient Model-based 3D Tracking of Hand Articulations using Kinect by Iason Oikonomidis, Nikolaos Kyriazis & Antonis A. Argyros
- Human Detection Using Depth Information by Kinect by Lu Xia, Chia-Chih Chen & J. K. Aggarwal
- Image Guidance of Breast Cancer Surgery Using 3-D Ultrasound Images and Augmented Reality Visualization by Yoshinobu Sato et al.
- A Head-Mounted Display System for Augmented Reality Image Guidance: Towards Clinical Evaluation for iMRI-guided Neurosurgery by F. Sauer et al.
- Exploring MARS: developing indoor and outdoor user interfaces to a mobile augmented reality system by Tobias Höllerer et al.
- A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment by Steven Feiner et al.
While the system as a whole is somewhat novel, the underlying ideas and individual components have been demonstrated in various other studies. Firstly, the idea of using projectors to augment reality has been in circulation for some time. CounterIntelligence: Augmented Reality Kitchen, written in 2005, explores the idea of using projectors and various sensors to augment a kitchen environment with the goal of improving user speed and safety. Development of Head-Mounted Projection Displays for Distributed, Collaborative, Augmented Reality Applications, a paper written in 2006, and The Studierstube Augmented Reality Project, a paper written in 2002, both discuss the idea of using head-mounted projectors to facilitate an AR experience. MirageTable: Freehand Interaction on a Projected Augmented Reality Tabletop (2012) is another study that chose to capitalize on the proven success of projectors in AR systems, but it did so much more effectively than LightGuide and with a much more novel approach. LightGuide built on the success of CounterIntelligence by incorporating user tracking and projecting onto non-static surfaces, and it improved upon the ideas set forth in Development of Head-Mounted Projection Displays for Distributed, Collaborative, Augmented Reality Applications by enabling the system to work without forcing the user to wear any special equipment.
Using Microsoft’s Kinect as a depth camera to track users is another recycled idea. MirageTable, Efficient Model-based 3D Tracking of Hand Articulations using Kinect, and Human Detection Using Depth Information by Kinect are all studies that have used Kinect to track user movement within a system. While MirageTable’s use of Kinect was somewhat basic in comparison to LightGuide (MirageTable only tracks shutter glasses within a very limited space), Efficient Model-based 3D Tracking of Hand Articulations using Kinect and Human Detection Using Depth Information by Kinect both utilize Kinect in a much more novel way than LightGuide does. Hand articulation tracking and human detection are significantly more advanced applications of Kinect’s capabilities, and LightGuide’s implementation would have greatly benefited from exploring these capabilities and incorporating them into the system to create a more robust application.
The application of AR to guidance systems has been explored to a great extent within the fields of CHI and medicine. Exploring MARS: developing indoor and outdoor user interfaces to a mobile augmented reality system is a paper written in 1999 that discusses the application of AR to allow users to guide each other through environments. A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment, a paper written even earlier (1997), takes this idea further by imagining a 3D AR system capable of guiding a user through a complex urban environment such as a university campus. Image Guidance of Breast Cancer Surgery Using 3-D Ultrasound Images and Augmented Reality Visualization and A Head-Mounted Display System for Augmented Reality Image Guidance: Towards Clinical Evaluation for iMRI-guided Neurosurgery are studies that use head-mounted video displays to allow doctors to visualize tissue within a patient and provide guidance to assist in surgery. These works serve to demonstrate not only that AR guidance is anything but novel, but also that a guidance system aimed at coordinating a user’s movements with an exact pre-programmed path is trivial compared to other applications explored in previous studies.
Evaluation
The authors evaluated the success of LightGuide using a quantitative comparative evaluation and a qualitative user-feedback discussion of the pros and cons of their approaches. The depth camera tracked the movement of the user’s hand through world coordinates, and this movement path was compared with the intended path used to formulate the cues. In the case of the video-based cues, scaling wasn’t easily interpreted by users and the results of the analysis were skewed. The authors corrected for this using an iterative closest point (ICP) algorithm, which compares the shape of the user’s path to the intended path without taking scale into account. Even after correction, the follow spot still resulted in 85% more accurate movements than the video screen.
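The scale-invariant correction can be illustrated with a small sketch: normalize both paths before scoring them, then measure shape error with a closest-point association. This shows only the nearest-neighbor step of ICP, not the full iterative alignment, and the function names are my own; it is meant to convey how a path’s shape can be compared while discarding scale.

```python
import numpy as np

def normalize(path):
    """Center an Nx3 path and scale it to unit size so overall scale is ignored."""
    p = np.asarray(path, dtype=float)
    p = p - p.mean(axis=0)
    return p / np.linalg.norm(p)

def shape_error(user_path, target_path):
    """Mean distance from each normalized user point to its nearest target point."""
    u, t = normalize(user_path), normalize(target_path)
    dists = np.linalg.norm(u[:, None, :] - t[None, :, :], axis=2)  # pairwise distances
    return dists.min(axis=1).mean()
```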
The qualitative study was based on user interviews following the completion of their trials. Users stated that they found the follow spot cue easily understandable and close to second nature, but felt that it didn’t provide enough information. The 3D pathlet was viewed favorably because of its feedforward: users liked knowing what was ahead. The majority of users stated that they preferred the 3D self-guided arrow over all other visualization cues; they liked setting their own pace, and arrows serve as a familiar directional cue. Note that the 3D arrow cue was the only self-explanatory approach devised by the authors.
I found the evaluation performed in this study to be excellent. The authors addressed not only the factual success and accuracy of their system, but also took user experience into account, which is ultimately more important in a system aimed at guiding, and therefore teaching, a user.
Discussion
I enjoyed this article but found that it only scratched the surface of what’s possible with this type of technology. The authors limited their study to hand translation, ignoring other movements such as rotation as well as other body parts. While the authors succeeded in demonstrating the potential power of projector-based gesture cues, LightGuide ultimately raised more questions than it answered.