Computers Today

May 1-15, 1999                                                                           PC USER 


SPEECH RECOGNITION
Command And Control

Through years of intense research, quality voice recognition software is now available for a wide variety of applications for the home, office or mobile user.

By Atanu Roy

Once considered the realm of science fiction writers, computers that process spoken commands and translate human dictation are now a reality. Speaking to your PC with voice commands is no longer the fodder of movie dreamers. After decades of trying, three companies (Dragon Systems, Lernout & Hauspie and IBM) are selling remarkably good software that lets you speak into a microphone to dictate documents and control your PC for a wide variety of applications for the home, office or mobile user.

Only a year ago, the software was still a little rough. You'd dictate, "Two turntables and a microphone." It might come out, "Two torn labels and an ice cream cone." But now, demos routinely have someone speaking faster than a news-radio DJ. The software captures it easily while rarely getting a word wrong.
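How badly a dictation goes wrong can be put into a number. A common measure is the word error rate: the fewest word substitutions, insertions and deletions needed to turn what the software heard back into what was said, divided by the number of words actually spoken. The sketch below applies it to the example above. The function names are illustrative only and are not taken from any of the products discussed here.

```python
def word_edit_distance(reference, hypothesis):
    """Fewest word-level edits (substitute, insert, delete) between two sentences."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edits needed to match the reference words seen so far
    # against the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word was dropped
                            curr[j - 1] + 1,      # extra word was inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance divided by the number of words actually spoken."""
    spoken_words = len(reference.split())
    if spoken_words == 0:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(reference, hypothesis) / spoken_words


spoken = "Two turntables and a microphone"
heard = "Two torn labels and an ice cream cone"
print(word_edit_distance(spoken, heard))  # 6
print(word_error_rate(spoken, heard))     # 1.2
```

Only "two" and "and" survive, so six edits are needed against five spoken words. The rate can exceed 100 percent because the software produced more words than were said.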

The Apple Pie

The major speech companies, which include IBM, Dragon and Lernout & Hauspie, are locked in a heated battle to dominate the PC space. But since Dragon abandoned the market, none has expressed serious interest in the Mac.

Apple itself may enter the speech recognition market. While it already has PlainTalk, which allows voice navigation through menus, the company has promised dynamic new features when it releases the full version of its new Rhapsody server operating system (earlier called OS X) next month, and rumours swirl that stronger voice recognition will be one of them. In its initial launch, Apple Computer has sold 800,000 iMacs.

Macspeech, a four-person company that includes former developers of PowerSecretary, intends to produce a speech recognition program for the Mac platform. The firm is looking to license an engine from a major player and gain funding from venture capitalists. Whoever gets there first wins!

Even in India, vendors have started making noise about speech recognition. Being first may not guarantee an early lead, though. IBM's OS/2 Warp operating system, which the company launched three years ago, was speech-enabled but found little acceptance among PC users. Of course, that had more to do with Windows bulldozing the market than with the 85 percent accuracy of IBM's speech recognition. Now HCL Infosystems Ltd is aggressively pushing the products of Dragon Systems, its new principal.

Last fortnight, the HCL Group flagship company introduced "the country's first speech recognition-enabled computer", a PC in its Busybee 2000 range powered by a 350 MHz Pentium II. Priced at Rs 50,500, the new PC is bundled with Dragon Point & Speak speech recognition software.

A. Mohan Rao, chief executive, Frontline Division, HCL Insys, points out that the company has specially customised the package for Indian users, so that it supports the Indian-English accent and vocabulary. "Its vocabulary is adapted to regional names, locations and proper nouns, and would recognise typical Indian names such as Krishna, Lucknow, Chennai, Gram Panchayat, Coimbatore, Meenakshi and so on," Rao claims.

Continuous Speech Engines
The speech recognition packages let you speak into a microphone to dictate documents and control your PC

Dragon Systems, USA

Indian Distributor: HCL Infosystems Ltd, New Delhi
Dragon NaturallySpeaking Standard Edition
This edition includes all of the major features that made Dragon NaturallySpeaking the best-selling continuous speech product. It includes Dragon's BestMatch technology for superior accuracy and Natural Language Commands with Select-and-Say editing. Price: $109

Dragon NaturallySpeaking Professional Edition
Professional Edition adds advanced macro support for total control of forms, and the ability for users to add and customise vocabularies. Price: $695

Dragon Point & Speak
Simply click the mouse in the text window of virtually any Windows application, including MS Word, Corel WordPerfect, e-mail packages and chat room software, and dictate. Price: Rs 6,500

Dragon Dictate Power
The most advanced discrete speech system available. Dictate inside all applications, 60,000-word vocabulary, high-quality VXI microphone. Available in 6 languages. Price: $695

Dragon NaturallySpeaking Developer Suite
Lets software developers create their own speech-aware applications. The Suite includes the Dragon NaturallySpeaking SDK, with easy-to-use ActiveX components and COM interfaces. Price: $695

IBM Corp, USA

Indian Distributor: Tata IBM Ltd., Bangalore

IBM ViaVoice 98 Executive
IBM's most powerful continuous speech software, it is an ideal PC enhancement for writers, executives and experienced PC users. Price: $149

IBM ViaVoice 98 Office
Designed to help maximise your efficiency as a small office/home user, faculty member, or advanced student. It has the largest active vocabulary. Price: $89.95

IBM ViaVoice 98 Home
Ideal for the entire family to create letters, holiday gift lists, organisation newsletters, reports, diaries and logs. It even incorporates electronic mail. Price: $49.95

Lernout & Hauspie, USA

VoiceXpress Professional 2.0
With the features of L&H Voice Xpress Standard and Advanced such as dictation to virtually all Windows applications, L&H Voice Xpress Professional adds L&H's technology for the full suite of Microsoft Office applications. Price: $165

VoiceXpress Advanced
Voice power for Microsoft Word. Includes two plug-in vocabularies. Dictation to virtually all Windows applications. Ideal for frequent users of Microsoft Word. Dictate, Format and Edit: voice recognition for writing reports, letters, proposals, and resumes. Price: $79.99

VoiceXpress Standard
Turn Talk into Text! For virtually every Windows 98/95 or Windows NT application. Ideal for casual PC Users. Price: $49.99

It's difficult to understand, however, why HCL Insys bundled the speech recognition package on a Pentium II when the Pentium III is available. Hardware has probably been one of the biggest barriers in speech recognition. Intel announced a year ago that its new generation of processors, the Katmai, which is a follow-on to Intel's MMX Technology, would be speech-ready. Among the most important additions in Katmai is a new memory-streaming architecture. Katmai New Instructions will enhance the performance of speech recognition applications, which continue to grow in use and popularity. They will speed up the front-end audio processing and, depending on the exact code used by the speech software developers, will increase the throughput of the search algorithms involved in pattern matching. For continuous speech recognition, the end result will be a reduced error rate and/or a shortened response time. This will be an exciting, enhanced capability as speech recognition is integrated in a growing number of business and consumer applications.
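To give a feel for what "front-end audio processing" involves, here is a minimal sketch of one typical first step: chopping the incoming audio into short overlapping frames and measuring the energy in each. This is an illustration under assumed parameters (an 8 kHz sampling rate, 160-sample frames, a synthetic test signal), not the method used by any product named in this article. The inner multiply-and-add loop is the kind of repetitive floating-point work that instructions operating on several values at once are meant to accelerate.

```python
import math


def frame_energies(samples, frame_size=160, hop=80):
    """Average energy of each overlapping frame of audio samples."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Multiply-accumulate over the frame: the same operation is
        # repeated on every sample, which suits parallel instructions.
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


# Synthetic test signal (an assumption for illustration):
# 0.1 s of silence followed by 0.1 s of a 440 Hz tone, sampled at 8 kHz.
rate = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]

energies = frame_energies(silence + tone)
for index, energy in enumerate(energies):
    label = "speech-like" if energy > 0.1 else "quiet"
    print(index, round(energy, 3), label)
```

A recogniser would go on to extract richer features from each frame and then search for the best-matching words, but even this simple pass shows why faster arithmetic shortens response time: the work grows with every sample of audio.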

The Atmosphere is Vocal

From this point, some experts say, it should be only three or four years before speaking becomes an everyday way to interact with computers or computerised devices. At first, that sounds pretty cool. But for a whole generation of professionals who have built their working lives around using a keyboard, mouse and icons on a screen, things are going to get scary.

You are about to become your father. He never liked the keyboard. He didn't grow up using one. When computers came along and he had to make friends with typing, he always looked awkward, probably pecking with two fingers and making tonnes of mistakes. If he really had to think, he used a pen and paper.

It's your turn. Speaking is certainly more natural than tapping buttons lined up on a rectangle of plastic, so the transition might be easier. Then, while you might learn to speak to your computer, you might never learn to think that way. "The creative process is changed," says Joel Gould of Dragon. "You have to learn to think with your mouth."

At some point, you will be able to make yourself look old by admitting you know how to type. Keyboards aren't likely to go away. They'll stick around as backup ways to put in information. But there will be no incentive for anyone to learn touch-typing. It will become the equivalent of knowing how to use a slide rule or tune in a TV station using rabbit ears.

You might be able to hide your keyboarding skills, but at some point, in a crowded room, you will see a mistake on a document and call it a typo.

The office won't be the same. Gotten used to acres of Dilbert-style cubicles? "If voice becomes the interface to the standard computer, it would definitely impact cube life," says Gregg Armstrong, vice president at Starfish Software. The office will become a yak-fest. Human voices pouring over cubicle walls would be distracting. Armstrong figures cubes will become more enclosed and podlike. "We'll all be outfitted like telemarketing people!"

Speech Rules for the New World

New rules will have to be imposed for meetings. The young pups in the office will bring in their hand-held computers that have voice input. They will want to use them to take notes. "You could not have everyone talking into their computers," says David Nahamoo of IBM Research.

You'll get mad at your kids' teachers. Your parents probably couldn't understand how they could let you use calculators in school. They decried the erosion of basic math skills. Well, you're going to do the same over spelling. If computers almost perfectly capture speech, why should kids dwell on spelling?

That might not be as terrible as it sounds. In math, Nahamoo notes, calculators let teachers shift from focussing on computation to emphasising problem-solving, a higher-order kind of math. Speech software might let teachers dwell less on mechanics and more on creativity.

You'll never get over people talking to their microwave ovens. After putting speech technology on powerful desktop computers, the next step will be to put speech into many devices that use a computer chip. VCRs, cell phones, ATMs, thermostats, copiers: eventually all could be voice-guided. In the kitchen, it won't be such a strange thing to tell your microwave oven, "Cook this for 3 minutes," Nahamoo says.

Computers also need to be taught to be cautious. Imagine that, the morning after a bitter altercation with you, your subordinate enters the office and shouts in a full-throated, deep baritone, "All hard disks, reformat!"

Most probably, computers will talk back. "Down the road will be conversational technology," says Bob Kutnick of Lernout & Hauspie. The computer would not only recognise but would understand what you're saying and either ask questions or reply. It's not science fiction. It's research and development. By 2009, predicts speech pioneer Ray Kurzweil, automated personalities will take care of routine business transactions like flight reservations. By 2019, he says, computers should be so good at conversation, a person could have a full and rewarding relationship with the latest product from Dell or Gateway.

Speech Accessories
Automated creation and editing of documents entails the help of recorders, flash cards, microphones and cables

RF Wireless Headset
A single channel, full duplex system, with a transmit range greater than 20 feet. The headset is designed for PC voice recognition, computer telephony and Internet telephony. Price $395

VR 3345
Dynamic noise canceling electret lightweight headset with a latching mute switch. Price $69.95

Norcom VoicePort Executive
Has Norcom 2500 speech recognition machine, SRC-1 speech recognition coupler, and the IBM ViaVoice98 Executive software. Price $499

Norcom Model 2500 & SRC-1
It allows you to dictate notes or reports, wherever you are, for automatic transcription by your computer. Price $368

Norcom Transcriber 2510SR
Plays recorded dictation to a PC using continuous speech recognition software for automatic transcription. Price $499

Digital Voice Recorder D1000
Comes packaged with a 4MB Intel Flash Memory Miniature Card that gives you up to 33 min of dictation. Price $315

Olympus Desktop Integration Kit
Add-on for the Olympus D1000 digital recorder. It is used to transfer the recorded dictation from the recorder to your PC. Price $270

Olympus 8MB Intel Flash Memory Miniature Card
The 8MB miniature card allows 60 min. of standard record time. Price $140

Philips SpeechMike
Sit back and input your voice for all PC speech applications. Price $99.99

Philips SpeechMike Pro
Combines a professional dictation microphone, loudspeaker and a trackball mouse into a single housing. Price $199.99

Wireless Microphone System
Includes body pack type wireless transmitter. Designed for medical, legal and business dictation, speech recognition or command and control applications. Price $249

Unidirectional Electret Microphone HW505
Features a unique light-weight, over-the-ears wire-frame design with a high-quality stainless steel microphone boom. Price $49

Unidirectional Dynamic Microphone HW501
Includes a dynamic microphone element for use with non-powered sound cards. Price $49

Microphone With Single Earphone Speaker VR250B
Cushioned over-the-head design with twin earpads for wearer comfort, with attached 10 foot cable terminated in a 3.5mm stereo phone plug. Price $59

Electret Condenser Microphone VR116L
Condenser microphone for speech recognition programs with audio confirmation or computer telephony applications. Price $85

Sony Mini Disc Recorder MZ-R50
Create text simply by speaking naturally into the palm-sized Sony MiniDisc (model MZ-R50) recorder, even if you are miles away from a PC. Price $349

VTR Recorder
Advanced compression technology allows minutes of dictation to be downloaded in seconds directly to the serial port of a PC. Once downloaded, files can be E-mailed, transcribed, transferred or archived in compressed or WAV format. Price $199

VXI Parrott 10-3
Made for straight speech recognition with sound card "microphone in" and "speaker out" connections by two standard 3.5mm plugs. Price $74

VXI Parrott ST
For those who want to use their headset with voice recognition software and hear with stereo sound. A TRANSLATOR for universal compatibility to all PC, laptop and palmtop soundcards is also included. Price $88

VXI Parrott QD-10/20/35
A headset that mounts over-the-head. Price $102

VXI Parrott 60V-10/20/35
Computer and telephony headset systems with the VXI PARROTT SWITCH 60V provide universal compatibility to all systems. Price $175

VXI Portable Parrott
Uses advanced technology to maximise the accuracy of your dictation. Price $83

VXI Parrott Connector Cord
Makes VXI Parrott headsets compatible with any PC, laptop or palmtop sound card, preventing incompatibility, the leading cause of voice recognition failure. Price $26

VXI Parrott Switch 60V
The only switch box that allows the user to hear the customer while encoding data into the computer. Price $99

 


© Living Media India Ltd
