Apple recently demonstrated the power of voice recognition using Siri on the
iPhone 4S. But the technology has been around for a while now, and you can
easily get it for your PC, Mac or smartphone. Hitesh Raj Bhagat and Karan Bajaj
explain how.
Apple's Siri demonstration at the launch of the iPhone 4S and its subsequent
marketing videos have raised a lot of interest in voice and speech recognition.
You've probably used some variation of the technology when you used voice
dialling on your phone or spoke to an automated IVR system. For personal use,
the reasons for low adoption include poor accuracy, a training period before you
can actually start using it and a limited number of real-world applications.
Over the years, the technology has evolved and
now there are several mobile apps and computer programs available to help you
talk to a machine.
A BRIEF HISTORY
Speech recognition was first introduced to personal computing around 10 years
back - around the time Windows 98 was introduced. However, you may be surprised
to know that research on this technology started way back in 1936.
WHAT IS VOICE RECOGNITION?
Voice recognition and speech recognition are two different terms. Voice
recognition relates to identifying an individual voice - along the same lines as
a biometric scanner. Speech recognition, on the other hand, relates to
identifying spoken words in the correct sense and then translating them into
HOW IT WORKS
Both speech and voice recognition work on the principle of translating 'analog'
spoken words into 'digital' signals that a machine can understand. As simple as
this may sound, it requires a lot of back-end processing, all the while
compensating for differences in dialect, volume levels, tempo and pronunciation.
The analog speech signals, once converted, are sent back to the device in
digital format, and the device in turn executes a command. Because speaking
out a line takes mere seconds, translation, conversion and execution need to be
done on the fly - thus the need for a fast data connection to transfer the data
to and fro.
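To make the analog-to-digital step concrete, here is a minimal Python sketch of how a continuous signal could be sampled and turned into numbers. It is only an illustration under stated assumptions: the 440 Hz tone stands in for a voice, and the function names, the 8,000 samples-per-second rate and the 16-bit depth are choices made for this example, not details of Siri or any other product mentioned here.

```python
import math


def digitise(signal, duration_s, sample_rate_hz=8000, bit_depth=16):
    """Sample a continuous signal and quantise each sample to a signed integer.

    `signal` is any function of time (in seconds) returning a value that is
    nominally between -1.0 and 1.0. All parameter names are illustrative.
    """
    max_level = 2 ** (bit_depth - 1) - 1  # largest positive value, 32767 for 16-bit
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # time of the n-th sample
        amplitude = max(-1.0, min(1.0, signal(t)))  # clip anything too loud
        samples.append(round(amplitude * max_level))
    return samples


def tone(t):
    # A 440 Hz sine wave used as a stand-in for an analog voice signal.
    return math.sin(2 * math.pi * 440 * t)


samples = digitise(tone, duration_s=0.01)
print(len(samples), samples[:5])
```

A real recogniser would go on to analyse these numbers to work out which words were spoken; the sketch stops at the conversion itself.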
WHAT CAN YOU USE IT FOR?
Speech to text and controlling a machine using your voice are the obvious uses.
But the technology also holds promise for those with disabilities. Applications
like DriveSafe.ly for your phone can read out text messages and emails for you -
helpful for the visually impaired. Various apps also allow you to search the web
or type out messages by speaking - helpful for those with limited motor control.
WHAT COMES NEXT?
The biggest challenge that any speech recognition system faces today is
deciphering the various dialects and accents that people may have. Plus, in
natural speech, we often tend to use a lot of slang, which automated systems
find hard to understand. The first step would be to build a system that looks
beyond any of these current issues. A possible application, then, would be a
universal, real-time voice translator, often seen in sci-fi movies - simply
speak and a device will be able to instantly speak out the same in any language
with 100% accuracy. Going forward, there are also going to be major developments
in speech understanding - true artificial intelligence, when a machine can truly
grasp the context of what you're saying and talk back, rather than just
recognising the words.
Siri is the personal assistant that Apple has introduced on the new iPhone 4S.
The app is deeply integrated with the operating system and responds to your
natural speaking voice. It can be used to make calls, write SMS, set reminders
or answer questions with real-time results from the internet. The app adapts to
your preferences and style of speaking, and takes interactivity to an all-new
level.
As of now, Siri only supports English, German and French. Plus, it will only be
available for iPhone 4S users.
ON A PC
Windows Speech Recognition Free
Windows 7 comes with its own speech recognition system (under Ease of Access).
It allows you to control your PC using a set of voice commands and also offers
dictation of text.
Tazti $39.99 (price for 2-PC license)
Tazti lets you control iTunes as well as various browser functions such as
search and navigation via voice commands. It also includes a dictation feature.
ON A MAC
Dragon Dictate (www.nuance.com) $199.99
Dragon Dictate not only lets you input text by speaking, but also controls
various functions like launching of programs and general navigation.
OS X has a set of voice commands accessible from the Speech section that
allows various applications to be controlled. It also has a built in
IT'S NOT MAGIC, SO KEEP IN MIND THAT...
Voice recognition requires use of a microphone, be it on a computer or a handheld device. High levels of ambient sound will affect the accuracy of recognition.
Do not skip the initial setup process to set up the microphone, speakers and volume. This makes sure that your hardware is optimised for accuracy.
Some of the smarter software solutions will adapt to your style of speaking. The more you use it, the better and more accurate it will get.
Speech to text will never be 100% accurate out of the box. Most software will give you about a 60% accuracy rate that improves to around 80% over time.
A fast data connection (3G or ...) is required for voice recognition to work quickly and reliably. This is because processing is done server side.
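For readers who want to check an accuracy figure on their own dictation, the Python sketch below shows one common way to score it: count how many word-level corrections (replacements, insertions and deletions) are needed to turn the recognised text into what was actually said. This is an assumption about how accuracy could be measured, not the method behind the percentages quoted above, and the function name and sample sentences are made up for the example.

```python
def word_accuracy(reference, recognised):
    """Return the fraction of reference words recognised correctly.

    Uses word-level edit distance: each substituted, missing or extra word
    counts as one error. The result is floored at 0.0.
    """
    ref = reference.lower().split()
    hyp = recognised.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j recognised words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word was dropped
                            curr[j - 1] + 1,      # extra word was inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr

    errors = prev[-1]
    return max(0.0, (len(ref) - errors) / len(ref))


spoken = "call mum on her mobile"
heard = "call mum in her mobile"
print(f"{word_accuracy(spoken, heard):.0%}")
```

In this example one word out of five is wrong, so the score is four correct words out of five.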
SMARTPHONE APPS YOU CAN SPEAK TO
Search: Free; Search the internet by just speaking out your queries. Available
for iOS, Android & BlackBerry
Voice Actions: Free; This app lets you
control your Android phone using voice. You can get directions, make calls or
Edwin: Free; Edwin for Android responds
to your queries in real time. It can make calls, SMS, tweet and do a host of
Speaktoit Assistant: Free; This Android
app has a virtual character that responds verbally to questions and notifies you
Vlingo: Free; Vlingo is the closest
alternative to Apple's Siri in terms of recognition and features. It works
across platforms and offers search as well as command execution.