Research Projects

Here is a brief description of several research projects that I have been working on.

Improving User Interfaces for Robot Teleoperation

The FXPAL robotics research group has recently explored technologies for improving the usability of mobile telepresence robots. We evaluated a prototype head-tracked stereoscopic (HTS) teleoperation interface for a remote collaboration task. The results of this study indicate that using an HTS system reduces task errors and improves the perceived collaboration success and viewing experience.


We also developed a new focus-plus-context viewing technique for mobile robot teleoperation. This allows us to use wide-angle camera images that provide rich contextual visual awareness of the robot's surroundings while preserving a distortion-free region in the middle of the camera view.
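
As a rough illustration of the focus-plus-context idea, the sketch below maps a pixel's distance from the image center in a wide source image to its distance from the center of a smaller display. It is a generic piecewise-linear mapping and not necessarily the technique used in this project; the function name and all parameters are hypothetical.

```python
def focus_context_radius(r, focus_radius, source_radius, display_radius):
    """Map a distance from the center of the wide-angle source image
    to a distance from the center of the display.

    Hypothetical mapping for illustration only:
    - inside the focus region the mapping is the identity (no distortion);
    - outside it, the remaining source radius is compressed linearly
      into the remaining display radius.
    """
    if not 0 <= r <= source_radius:
        raise ValueError("r must lie inside the source image")
    if r <= focus_radius:
        return r
    scale = (display_radius - focus_radius) / (source_radius - focus_radius)
    return focus_radius + (r - focus_radius) * scale


# A source image with radius 800 px shown on a display with radius 400 px,
# keeping the central 200 px undistorted.
inner = focus_context_radius(100, 200, 800, 400)   # inside the focus region
edge = focus_context_radius(800, 200, 800, 400)    # outer edge of the context
```

With these assumed values, points in the central region keep their position, while the outer ring of the source image is squeezed into the remaining border of the display.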

To this, we added a semi-automatic robot control method that allows operators to navigate the telepresence robot by pointing and clicking directly on the camera image feed. This through-the-screen interaction paradigm has the advantage of decoupling operators from the robot control loop, freeing them for tasks other than driving the robot.

As a result of this work, we will present two papers at the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).

GestureSeg: Using Crowd Labeling for Reliable Motion Gesture Segmentation

Most current mobile and wearable devices are equipped with inertial measurement units (IMUs) that allow the detection of motion gestures, which can be used for interactive applications. A difficult problem to solve, however, is how to separate ambient motion from an actual motion gesture input. We explore the use of motion gesture data labeled with gesture execution phases for training supervised learning classifiers for gesture segmentation. We define gesture execution phases as the start, middle and end of each gesture. We believe that using gesture execution phase data can significantly improve the accuracy of gesture segmentation algorithms. Since labeling motion gesture data with gesture execution phase information is labor-intensive, we used crowd workers to perform the labeling.

Using this labeled data set, we trained SVM-based classifiers to segment motion gestures from ambient movement of the device. Our main results show that training gesture segmentation classifiers with phase-labeled data substantially increases the accuracy of gesture segmentation: we achieved a gesture segmentation accuracy of 0.89 for simulated online segmentation using a sliding window approach.
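
The sliding window approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the GestureSeg implementation: the features, window parameters and synthetic data are assumptions, and a simple variance threshold (`is_gesture`, hypothetical) stands in for the trained SVM classifier.

```python
from statistics import mean, pvariance


def window_features(window):
    """Mean and variance of the acceleration magnitude in one window."""
    return mean(window), pvariance(window)


def is_gesture(features, var_threshold=0.5):
    """Stand-in classifier (hypothetical): a plain variance threshold.
    A trained model such as an SVM would take its place."""
    _, variance = features
    return variance > var_threshold


def segment(samples, window_size=4, step=2, classify=is_gesture):
    """Slide a window over the samples and merge positive windows
    into (start, end) sample index ranges, with end exclusive."""
    segments = []
    for start in range(0, len(samples) - window_size + 1, step):
        end = start + window_size
        if classify(window_features(samples[start:end])):
            if segments and start <= segments[-1][1]:
                # Overlaps or touches the previous segment: extend it.
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments


# Synthetic acceleration magnitudes: rest, a burst of motion, rest.
samples = [1.0] * 8 + [1.0, 3.0, 0.2, 2.8, 0.1, 3.1] + [1.0] * 8
print(segment(samples))  # [(6, 16)]
```

Because windows overlap, the detected range extends slightly beyond the burst of motion on both sides; a smaller step or window would tighten the boundaries at the cost of more classifier calls.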

A full paper about GestureSeg will be presented at EICS 2016.

ThermoTouch: Rethinking Thermal Haptic Displays

We are currently developing ThermoTouch, a novel thermal haptic output device that provides a grid of thermal pixels. Unlike previous devices, which mainly use Peltier elements for thermal output, ThermoTouch uses liquid cooling and electro-resistive heating to output thermal feedback at arbitrary grid locations, which could provide faster temperature switching times and a higher temperature dynamic range. Furthermore, the PCB-based design allows us to incorporate capacitive touch sensing directly on a thermal pixel.


ThermoTouch was presented as late-breaking work at CHI 2016.


AirAuth is a prototype authentication system that is intended to improve the usability of authentication by replacing password entry with mid-air gestures. AirAuth uses an Intel short-range depth camera to track the user's fingertip locations and the location of the hand center during gesture entry. Under controlled conditions we obtained a high EER-based authentication accuracy using just a few enrollment gestures and DTW matching for the gesture input. Being touchless, AirAuth is resistant to smudge attacks. We also evaluated AirAuth's resistance to shoulder surfing (visual forgery) via a camera-based study. We presented AirAuth as a work in progress at CHI 2014, and a full paper is to appear at MobileHCI 2014.
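
To illustrate the DTW matching step, here is a minimal sketch of dynamic time warping between two tracked point sequences, followed by a threshold test against enrolled templates. It is a textbook DTW with Euclidean local cost, not AirAuth's actual code; the `authenticate` helper, the 2D points and the threshold value are assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of points
    (tuples of equal dimension), using Euclidean local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def authenticate(attempt, templates, threshold):
    """Hypothetical decision rule: accept if the attempt is within
    `threshold` of the closest enrolled template."""
    return min(dtw_distance(attempt, t) for t in templates) <= threshold


enrolled = [[(0, 0), (1, 1), (2, 2), (3, 3)]]
slower = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2), (3, 3)]  # same path, different timing
other = [(0, 0), (1, -1), (2, -2), (3, -3)]                 # different path

print(authenticate(slower, enrolled, threshold=1.0))  # True
print(authenticate(other, enrolled, threshold=1.0))   # False
```

The example shows why DTW suits gesture input: the same gesture performed at a different speed still aligns with zero cost, while a different trajectory does not.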


The expressiveness of touch input can be increased by detecting additional finger pose information at the point of touch, such as finger rotation and tilt. Our PointPose prototype performs finger pose estimation at the location of touch using a short-range depth sensor viewing the touch screen of a mobile device. Our approach does not require complex external tracking hardware, and external computation is unnecessary because the finger pose extraction algorithm runs directly on the mobile device. This makes PointPose ideal for prototyping and developing novel mobile user interfaces that use finger pose estimation.


Mouse-based interaction on large, high-resolution displays can be problematic. When the screen is viewed from a comfortable distance, an unscaled mouse cursor becomes so small that it is hard to locate, and the default tracking speed of regular mice makes it tedious to manipulate content on the screen. At FXPAL, we are exploring full-body gestural interfaces as an alternative to mouse-based interaction on large displays. One advantage of gesture-based interaction is that gestures can be simple to perform and can cover larger spatial distances, so smaller control-display gains can be used. Gestures can also be intuitive, for instance when the UI follows Natural User Interface (NUI) principles, where interactive objects expose their functionality during interaction. Finally, we feel that gestural interfaces will promote movement and activity at otherwise sedentary workplaces, improving users' health and well-being.

VPoint Prototype

The VPoint prototype aims to explore the use of a large display for collaborative content presentation and manipulation. It uses gesture-based input tracked by a Kinect sensor, and is directly integrated with the Windows 7 desktop.


What if mobile phones were equipped with depth imaging cameras? PalmSpace envisions the use of such cameras to facilitate interaction with 3D content using hand gestures. We developed a technique that maps the pose of the user's palm directly to 3D object rotation. Our user study shows that the users could manipulate the 3D objects significantly faster than with a standard virtual trackball on the touch screen.
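
One plausible way to map palm pose to object rotation is sketched below. It is an assumption made for illustration and not necessarily the mapping used in PalmSpace: the palm normal vector, the coordinate convention (z pointing from the palm toward the camera), the `gain` parameter and the function names are all hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def palm_to_rotation(normal, gain=1.0):
    """Map a palm normal (nx, ny, nz) to a 3x3 rotation matrix.

    The tilt of the palm about the x and y axes is read from the normal,
    scaled by `gain`, and applied to the object as a rotation about x
    followed by a rotation about y.
    """
    nx, ny, nz = normal
    angle_x = gain * math.atan2(ny, nz)  # tilt forward/backward
    angle_y = gain * math.atan2(nx, nz)  # tilt left/right
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    cy, sy = math.cos(angle_y), math.sin(angle_y)
    rot_x = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    rot_y = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    return matmul(rot_y, rot_x)


flat = palm_to_rotation((0, 0, 1))             # flat palm: no rotation
tilted = palm_to_rotation((0, 1, 1), gain=2.0)  # 45 degree tilt, doubled to 90
```

A gain above 1 lets a comfortable range of hand tilt cover a larger range of object rotation, which is one reason a direct pose mapping can be faster than repeated trackball drags.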


Protractor3D is a tilt-invariant, data-driven recognizer for 3D motion gestures. It works on data that can be obtained, for example, from the 3D accelerometers in smartphones.