A team of researchers, including one of Indian origin, has developed a tool that allows users to execute operations on a smartphone by combining gaze control and simple hand gestures.
Smartphones have grown to accommodate the bigger screens and higher processing power that more demanding activities require, but they frequently need a second hand or voice commands to operate.
The new tool, called EyeMU, developed by researchers at Carnegie Mellon University in the US, shows how gaze estimation using a phone’s user-facing camera can be paired with motion gestures to enable a rapid interaction technique on handheld phones.
“Current phones only respond when we ask them for things, whether by speech, taps or button clicks,” said Andy Kong, a senior majoring in computer science at the University.
“Imagine how much more useful it would be if we could predict what the user wanted by analysing gaze or other biometrics,” Kong added.
The first step was a programme that used a laptop’s built-in camera to track the user’s eyes, which in turn moved the cursor around the screen.
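The idea behind that first prototype can be sketched in a few lines: a gaze estimate, normalised to the range 0 to 1 in each axis, is mapped to a pixel position for the cursor. This is a hypothetical illustration, not the team’s code; the `gaze_to_cursor` helper and the default screen size are assumptions.

```python
def gaze_to_cursor(gaze_x, gaze_y, screen_w=1920, screen_h=1080):
    """Map a normalised gaze estimate (0..1 per axis) to pixel coordinates.

    In the prototype the estimate would come from eye tracking on the
    laptop's built-in camera; here it is just a pair of floats.
    """
    # Clamp so a noisy estimate never places the cursor off-screen.
    x = min(max(gaze_x, 0.0), 1.0)
    y = min(max(gaze_y, 0.0), 1.0)
    return round(x * (screen_w - 1)), round(y * (screen_h - 1))

print(gaze_to_cursor(0.5, 0.5))   # roughly the centre of the screen
```

A real system would also smooth the estimate over several frames, since raw per-frame gaze positions jitter too much to drive a cursor directly.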
“We asked the question, ‘Is there a more natural mechanism to use to interact with the phone?’ And the precursor for a lot of what we do is to look at something,” said Karan Ahuja, a doctoral student in human-computer interaction.
Kong and Ahuja advanced that early prototype by using Google’s Face Mesh tool to study the gaze patterns of users looking at different areas of the screen and render the mapping data.
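Face Mesh reports hundreds of facial landmarks per frame, including points around the eyes. One simplified gaze feature, shown below purely for illustration, is where the iris centre sits between the two eye corners; the function names are hypothetical, and EyeMU’s actual predictor is learned from the mapping data rather than hand-coded like this.

```python
def horizontal_gaze_ratio(eye_outer, eye_inner, iris_center):
    """Position of the iris between the eye corners: 0.0 (outer) to 1.0 (inner).

    The three (x, y) points would come from a face-landmark model such as
    Google's Face Mesh; here they are plain tuples for illustration.
    """
    span = eye_inner[0] - eye_outer[0]
    if span == 0:
        raise ValueError("degenerate eye landmarks")
    return (iris_center[0] - eye_outer[0]) / span

def screen_column(ratio, columns=3):
    """Quantise the gaze ratio into one of `columns` screen columns --
    a crude stand-in for the learned gaze predictor."""
    return min(int(max(ratio, 0.0) * columns), columns - 1)
```

For example, an iris sitting exactly midway between the corners yields a ratio of 0.5 and lands in the middle column of a three-column split.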
Next, the team developed a gaze predictor that uses the smartphone’s front-facing camera to lock in what the viewer is looking at and register it as the target.
They made the tool more useful by combining the gaze predictor with the smartphone’s built-in motion sensors to enable gesture commands.
For example, a user could look at a notification long enough to secure it as a target and flick the phone to the left to dismiss it or to the right to respond to the notification.
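That dwell-then-flick interaction can be sketched as a small state machine. The sketch below is an assumption-laden illustration, not EyeMU’s implementation: the dwell time, the accelerometer threshold, and the two-gesture vocabulary are all made up for the example.

```python
DWELL_SECONDS = 0.8       # assumed dwell time needed to lock a target
FLICK_THRESHOLD = 6.0     # assumed lateral-acceleration threshold, m/s^2

class GazeFlickController:
    """Lock a gaze target after a dwell, then act on a horizontal flick."""

    def __init__(self):
        self.target = None
        self.dwell_start = None

    def on_gaze(self, item, timestamp):
        """Called each frame with the item the gaze predictor reports.
        Returns True once the target is locked."""
        if item != self.target:
            # Gaze moved to a new item: restart the dwell timer.
            self.target, self.dwell_start = item, timestamp
        return self.locked(timestamp)

    def locked(self, timestamp):
        return (self.target is not None
                and timestamp - self.dwell_start >= DWELL_SECONDS)

    def on_motion(self, accel_x, timestamp):
        """Map a lateral flick on a locked target to an action, else None."""
        if not self.locked(timestamp):
            return None
        if accel_x <= -FLICK_THRESHOLD:
            return ("dismiss", self.target)   # flick left
        if accel_x >= FLICK_THRESHOLD:
            return ("respond", self.target)   # flick right
        return None
```

The key property, matching the researchers’ description, is that motion alone does nothing: a flick only fires after the gaze dwell has locked a target, so ordinary phone movement is not misread as a command.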
“The real innovation in this project is the addition of a second modality, such as flicking the phone left or right, combined with gaze prediction. That’s what makes it powerful. It seems so obvious in retrospect, but it’s a clever idea that makes EyeMU much more intuitive,” said Chris Harrison, an associate professor at the varsity.