Over the last few years we’ve seen a rise in alternative interfaces: Minority Report-style technologies meant to demonstrate a future without physical controls. The Xbox Kinect, the Leap Motion, and Intel’s Perceptual Computing are a few examples.

This rise is as much a product of improving technologies as it is a response to the impact touch-based interfaces have had over the last few years. Touch has not only reinvigorated the search for novel interfaces; it has made mobile computing itself possible. And herein lies the difference between the two: gestural interfaces are being developed primarily as a response to enabling technologies, while touch was an existing technology applied to a specific need, namely efficient interaction on mobile devices. If gestural computing is going to become more than a novelty, it’s going to need to solve a real problem.

Is there a need for gestural interfaces to solve?

While there are numerous demos of gesture-based interfaces, from accessing Gmail to an entire line of Kinect games, the jury is still out on whether these interfaces provide any real efficiency gains. The fact of the matter is, in the settings where gestural interfaces perform well (at a desk, on the couch, or in front of the TV) the existing interface, be it a keyboard, game controller, or TV remote, is far more efficient (with the possible exception of Fruit Ninja).

Is there a need that existing interfaces, gestural or otherwise, don’t solve?

When walking down the street or driving, using a screen can be difficult (even illegal). Despite the difficulty, these situations often call for interacting with a device: looking for a place to eat, getting directions, and so on. There have been a few attempts to improve this experience; for example, TypeNWalk uses your phone’s camera to display what’s in front of you as you compose an email. But like TypeNWalk, gestural interfaces are not an elegant solution when users need to watch where they’re going.

There’s a much more efficient interface for eyes-free computing: speech recognition.

Current speech-based products like Apple’s Siri, while temperamental and somewhat limited in functionality, show the promise of an eyes-free interaction paradigm.

Unlike gestural interfaces, audio-based interfaces lend themselves to mobile interactions and are intuitive to novice users. We naturally know how to interrogate a voice-based service, while the gestures required to navigate a UI are far more artificial.

Gestural interfaces are sexy, but are they really an improvement on existing tools? I’m not convinced. If I had to bet on the next major interface, my money would be on speech recognition and voice-based interfaces.

Is the next shift in computer interaction really gestural? Probably not.