

 

 
SPEECH RECOGNITION
Command And Control
Through years of intense research, quality voice recognition software is now available for a wide variety
of applications for the home, office or mobile user.
By Atanu Roy
Once considered the realm of science
fiction writers, computers that process spoken commands and translate human dictation are
now a reality. Speaking to your PC with voice commands is no longer the fodder of movie
dreamers. After decades of trying, three companies (Dragon Systems, Lernout & Hauspie
and IBM) are selling remarkably good software that lets you speak into a microphone to
dictate documents and control your PC, whether you are a home,
office or mobile user.
Only a year ago, the software was still a little rough. You'd
dictate, "Two turntables and a microphone." It might come out, "Two torn
labels and an ice cream cone." But now, demos routinely have someone speaking faster
than a news-radio DJ. The software captures it easily while rarely getting a word wrong.
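Garbles like that can be put into numbers. A common yardstick is the word error rate: the fewest word substitutions, insertions and deletions needed to turn what was said into what was transcribed, divided by the number of words spoken. The sketch below is illustrative only. The function names are hypothetical and nothing here is drawn from any vendor's product.

```python
def word_edit_distance(reference, hypothesis):
    """Fewest word substitutions, insertions and deletions
    needed to turn the reference into the hypothesis."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # previous[j]: distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def word_error_rate(reference, hypothesis):
    """Word edit distance divided by the number of words actually spoken."""
    return word_edit_distance(reference, hypothesis) / len(reference.split())


spoken = "Two turntables and a microphone"
recognised = "Two torn labels and an ice cream cone"
print(word_edit_distance(spoken, recognised))  # 6 word-level errors
print(word_error_rate(spoken, recognised))     # 1.2, i.e. more errors than words spoken
```

Because inserted words count as errors, the rate can exceed 100 percent, as it does here: five words spoken, six mistakes made.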
The Apple Pie
The major speech companies, which include IBM,
Dragon and Lernout & Hauspie, are locked in a heated battle to dominate the PC space.
But since Dragon abandoned the market, none has expressed serious interest in the Mac.
Apple itself may enter the speech recognition market.
While it already has PlainTalk, which allows voice navigation through menus, the company
has promised dynamic new features when it releases the full version of its new OS X
server operating system (earlier called Rhapsody) next month, and rumours swirl that stronger
voice recognition will be one of them. In its initial launch, Apple Computer has sold
800,000 iMacs.
Macspeech, a four-person company that includes former
developers of PowerSecretary, intends to produce a speech recognition program for the Mac
platform. The firm is looking to license an engine from a major player and gain funding
from venture capitalists. Whoever gets there first wins.
Even in India, vendors have started making noise about
speech recognition. Being first may not guarantee an early lead, though.
IBM's OS/2 Warp operating system, which the company launched three years
ago, was speech-enabled but found little acceptance among PC users. Of course, that was
due more to Windows bulldozing the market than to the 85 percent accuracy of IBM's speech recognition
capability. Now HCL Infosystems Ltd is aggressively pushing the products of Dragon Systems,
its new principal.
Last fortnight, the HCL Group flagship company introduced
"the country's first speech recognition-enabled computer", a PC in its Busybee 2000
range powered by a 350 MHz Pentium II. Priced at Rs 50,500, the new PC is bundled with
Dragon Point & Speak speech recognition software.
A. Mohan Rao, chief executive, Frontline Division, HCL Insys,
points out that the company has specially customised the package for Indian users, so
it supports the Indian-English accent and vocabulary. "Its vocabulary is
adapted to regional names, locations and proper nouns, and would recognise typical Indian
names such as Krishna, Lucknow, Chennai, Gram Panchayat, Coimbatore, Meenakshi and so
on," Rao claims.
Continuous Speech Engines
The speech recognition packages let you speak into a microphone to dictate
documents and control your PC.
Dragon Systems, USA
Indian Distributor: HCL Infosystems Ltd, New Delhi
Dragon NaturallySpeaking Standard Edition
This edition includes all of the major features that made Dragon NaturallySpeaking the
best-selling continuous speech product. It includes Dragon's BestMatch technology for superior
accuracy and Natural Language Commands with Select-and-Say editing. Price: $109
Dragon NaturallySpeaking Professional Edition
Professional Edition adds advanced macro support for total control of forms and the ability
for users to add and customise vocabularies. Price: $695
Dragon Point & Speak
Simply click the mouse in the text window of virtually any Windows application,
including MS Word, Corel WordPerfect, e-mail packages and chat room software.
Price: Rs 6,500
Dragon Dictate Power
The most advanced discrete speech system available. Dictate inside of all
applications, 60,000 word vocabulary, high quality VXI microphone. Available in 6
languages. Price: $695
Dragon NaturallySpeaking Developer Suite
Lets software developers create their own speech-aware applications. The Suite
includes the Dragon NaturallySpeaking SDK, with easy-to-use ActiveX components and COM
interfaces. Price: $695
IBM Corp, USA
Indian Distributor: Tata IBM Ltd., Bangalore
IBM ViaVoice 98 Executive
IBM's most powerful continuous speech software, it is an ideal PC enhancement for
writers, executives and experienced PC users. Price: $149
IBM ViaVoice 98 Office
Designed to help maximise your efficiency as a small office/home user, faculty member, or
advanced student. It has the largest active vocabulary. Price: $89.95
IBM ViaVoice 98 Home
Ideal for the entire family to create letters, holiday gift lists, organisation
newsletters, reports, diaries and logs. It even incorporates electronic mail. Price: $49.95
Lernout & Hauspie, USA
VoiceXpress Professional 2.0
With the features of L&H Voice Xpress Standard and Advanced such as dictation
to virtually all Windows applications, L&H Voice Xpress Professional adds L&H's
technology for the full suite of Microsoft Office applications. Price: $165
VoiceXpress Advanced
Voice power for Microsoft Word. Includes two plug-in vocabularies. Dictation to
virtually all Windows applications. Ideal for frequent users of Microsoft Word. Dictate,
Format and Edit: voice recognition for writing reports, letters, proposals, and resumes.
Price: $79.99
VoiceXpress Standard
Turn Talk into Text! For virtually every Windows 98/95 or Windows NT application.
Ideal for casual PC users. Price: $49.99
It's difficult to understand, however, why HCL Insys
bundled the speech recognition package with a Pentium II when the Pentium III is available.
Hardware has probably been one of the biggest barriers to speech recognition. Intel
announced as long as a year ago that its new generation of processors, the Katmai, which is a
follow-on to Intel's MMX Technology, would be speech-ready. Among the most important new
instructions in Katmai is a new memory-streaming architecture. Katmai New Instructions
will enhance the performance of speech recognition applications, which continue to grow in
use and popularity. Katmai New Instructions will speed up the front-end audio processing,
and depending on the exact code used by the speech software developers, will increase the
throughput of the search algorithms involved in pattern matching. For continuous speech
recognition, the end result will be a reduced error rate and/or a shortened response time.
This will be an exciting, enhanced capability as speech recognition is integrated in a
growing number of business and consumer applications.
The Atmosphere is Vocal
From this point, some experts say, it should be only three or
four years before speaking becomes an everyday way to interact with computers or
computerised devices. At first, that sounds pretty cool. But for a whole generation of
professionals who have built their working lives around using a keyboard, mouse and icons
on a screen, things are going to get scary.
You are about to become your father. He never liked the
keyboard. He didn't grow up using one. When computers came along and he had to make
friends with typing, he always looked awkward, probably pecking with two fingers and
making tonnes of mistakes. If he really had to think, he used a pen and paper.
It's your turn. Speaking is certainly more natural than
tapping buttons lined up on a rectangle of plastic, so the transition might be easier.
Then again, while you might learn to speak to your computer, you might never learn to think that
way. "The creative process is changed," says Joel Gould of Dragon. "You
have to learn to think with your mouth."
At some point, you will be able to make yourself look old by
admitting you know how to type. Keyboards aren't likely to go away. They'll stick around
as backup ways to put in information. But there will be no incentive for anyone to learn
touch-typing. It will become the equivalent of knowing how to use a slide rule or tune in
a TV station using rabbit ears.
You might be able to hide your keyboarding skills, but at
some point, in a crowded room, you will see a mistake on a document and call it a typo.
The office won't be the same. Gotten used to acres of
Dilbert-style cubicles? "If voice becomes the interface to the standard computer, it
would definitely impact cube life," says Gregg Armstrong, vice president at Starfish
Software. The office will become a yak-fest. Human voices pouring over cubicle walls would
be distracting. She figures cubes will become more enclosed and podlike. "We'll all
be outfitted like telemarketing people!"
Speech Rules for the New World
New rules will have to be imposed for meetings. The young
pups in the office will bring in their hand-held computers that have voice input. They
will want to use them to take notes. "You could not have everyone talking into their
computers," says David Nahamoo of IBM Research.
You'll get mad at your kids' teachers. Your parents probably
couldn't understand how they could let you use calculators in school. They decried the
erosion of basic math skills. Well, you're going to do the same over spelling. If
computers almost perfectly capture speech, why should kids dwell on spelling?
That might not be as terrible as it
sounds. In math, Nahamoo notes, calculators let teachers shift from focussing on
computation to emphasising problem-solving, a higher-order kind of math. Speech software
might let teachers dwell less on mechanics and more on creativity.
You'll never get over people talking to their microwave
ovens. After putting speech technology on powerful desktop computers, the next step will
be to put speech into many devices that use a computer chip. The VCR, cell phones, ATMs,
thermostats, copiers: eventually all could be voice-guided. In the kitchen, it won't be
such a strange thing to tell your microwave oven, "Cook this for 3 minutes," Nahamoo says.
Computers also need to be taught to be cautious. Imagine that,
the morning after a bitter altercation with you, your subordinate enters the
office and shouts in a full-throated, deep baritone, "All hard disks, reformat!"
Most probably, computers will talk back. "Down the road
will be conversational technology," says Bob Kutnick of Lernout & Hauspie. The
computer would not only recognise but would understand what you're saying and either ask
questions or reply. It's not science fiction. It's research and development. By 2009,
predicts speech pioneer Ray Kurzweil, automated personalities will take care of routine
business transactions like flight reservations. By 2019, he says, computers should be so
good at conversation, a person could have a full and rewarding relationship with the
latest product from Dell or Gateway.
Speech Accessories
Automated creation and editing of documents entails the help
of recorders, flash cards, microphones and cables
RF Wireless Headset
A single channel, full duplex system, with a transmit range greater than 20 feet.
The headset is designed for PC voice recognition, computer telephony and Internet
telephony. Price $395
VR 3345
Dynamic noise canceling electret lightweight headset with a latching mute switch. Price
$69.95
Norcom VoicePort Executive
Has Norcom 2500 speech recognition machine, SRC-1 speech recognition coupler, and
the IBM ViaVoice98 Executive software. Price $499
Norcom Model 2500 & SRC-1
It allows you to dictate notes or reports, wherever you are, for automatic
transcription by your computer. Price $368
Norcom Transcriber 2510SR
Plays recorded dictation to a PC using continuous speech recognition software for
automatic transcription. Price $499
Digital Voice Recorder D1000
Comes packaged with a 4MB Intel Flash Memory Miniature Card that gives you up to
33 min of dictation. Price $315
Olympus Desktop Integration Kit
Add-on for the Olympus D1000 digital recorder. It is used to transfer the
recorded dictation from the recorder to your PC. Price $270
Olympus 8MB Intel Flash Memory Miniature Card
The 8MB miniature card allows 60 min. of standard record time. Price $140
Philips SpeechMike
Sit back and input your voice for all PC speech applications. Price $99.99
Philips SpeechMike Pro
Combines a professional dictation microphone, loudspeaker and a trackball mouse
into a single housing. Price $199.99
Wireless Microphone System
Includes body pack type wireless transmitter. Designed for medical, legal and
business dictation, speech recognition or command and control applications. Price $249
Unidirectional Electret Microphone HW505
Features a unique light-weight, over-the-ears wire-frame design with a
high-quality stainless steel microphone boom. Price $49
Unidirectional Dynamic Microphone HW501
Includes a dynamic microphone element for use with non-powered sound cards. Price
$49
Microphone With Single Earphone Speaker VR250B
Cushioned over-the-head design with twin earpads for wearer comfort, with
attached 10 foot cable terminated in a 3.5mm stereo phone plug. Price $59
Electret Condenser Microphone VR116L
Condenser microphone for speech recognition programs with audio confirmation or
computer telephony applications. Price $85
Sony Mini Disc Recorder MZ-R50
Create text simply by speaking naturally into the palm-sized Sony MiniDisc (model
MZ-R50) recorder, even if you are miles away from a PC. Price $349
VTR Recorder
Advanced compression technology allows minutes of dictation to be downloaded in
seconds directly to the serial port of a PC. Once downloaded, files can be E-mailed,
transcribed, transferred or archived in compressed or WAV format. Price $199
VXI Parrott 10-3
Made for straight speech recognition with sound card "microphone in"
and "speaker out" connections by two standard 3.5mm plugs. Price $74
VXI Parrott ST
For those who want to use their headset with voice recognition software and hear
with stereo sound. A TRANSLATOR for universal compatibility to all PC, laptop and palmtop
soundcards is also included. Price $88
VXI Parrott QD-10/20/35
A headset that mounts over-the-head. Price $102
VXI Parrott 60V-10/20/35
Computer and telephony headset systems with the VXI PARROTT SWITCH 60V, provide
universal compatibility to all systems. Price $175
VXI Portable Parrott
Uses advanced technology to maximise the accuracy of your dictation. Price $83
VXI Parrott Connector Cord
Makes VXI Parrott headsets compatible with any PC, laptop or palmtop sound card,
preventing incompatibility, the leading cause of voice recognition failure. Price $26
VXI Parrott Switch 60V
The only switch box that allows the user to hear the customer while encoding data
into the computer. Price $99