New York Times
- Circuits - Section D
Be careful what you say around the house. Your appliances may be listening.
Voice control, long the stuff of science fiction and computer lab experiments, is popping up in more and more mundane household devices like clock radios, MP3 players, television remotes, telephones and light switches. You no longer have to push buttons or twist dials to listen to music or brew coffee: you simply tell your appliances what you want, and through built-in microphones and microprocessors they understand and obey your commands.
These low-end voice controls are not designed for space travel but rather to make everyday devices easier to use. A voice-activated television remote, for example, can spare you from having to remember hundreds of channel numbers. And even relatively simple voice-controlled devices like light switches can be a boon to people with physical disabilities or with poorly placed wall switches in their basements.
And yes, voice control is also kind of fun.
"Sometimes voice control doesn't really start from a need -- it starts as a feature," said William Meisel, the publisher of Speech Recognition Update, a monthly newsletter. "Manufacturers say, 'This is a feature that will make us look high-tech and distinguish us from the other guys without costing us too much.'"
It is a feature that could find its way into many more living rooms and kitchens. Tod Mozer, chief executive of Sensory, a company based in Santa Clara, Calif., that makes specialized speech recognition chips for appliances, said that more than 15 million such devices had been sold worldwide. If you include cellphones with voice-dialing, the estimate rises to 100 million.
After seeing ads for a number of voice-activated appliances, I started to fantasize about never having to lift a finger around the house. I wanted to give orders to my appliances. They would listen and obey. I would be king of the house. The clock radio, the television set, the lamps -- all would be my trusty servants.
Speech recognition existed at Bell Laboratories in the 1950's, but it did not appear commercially feasible until 1967, when A.J. Viterbi, a professor of engineering at the University of California at Los Angeles, introduced an algorithm that helped digital signal processors match voice patterns to data stored in a computer's memory.
In the 1970's, speech recognition appeared in systems built for the phone company and the Defense Department, but it was not until the 1980's that voice-controlled devices began to enter the home, at first in the form of toys. For example, the Julie doll, released in 1987 by Worlds of Wonder, turned her head when her name was spoken.
Other voice-activated appliances were sold in the 1980's and 90's, but until recently the digital signal processors remained expensive, about $20 a unit. Today a general-purpose chip like the RSC-364 from Sensory costs as little as $1. The prices have fallen so far so fast that some manufacturers can't resist adding voice activation as a gee-whiz component.
I sampled six such devices, all recently released. VOS Systems offers a voice-operated dimmer switch for lamps for $35 and a voice-activated module for appliances for $30 that can be used with any AC device. KashNGold's InVoca line includes a voice-activated clock radio for $100 and a television remote control for $100. Then there is the Gigaset 4215 voice-controlled wireless phone from Siemens ($180) and a $239 voice-controlled MP3 player, the MXP 100 Sport from e.Digital.
I tried the VOS appliance module first. After glancing at the manual, I plugged a lamp into the device and the module into a wall outlet. My wife came into the room just as I said, "Lights." The lamp turned on! Buoyed by that success, I hooked up the television remote and the lamp dimmer switch in the living room and the clock radio in the bedroom. The appliance module was dispatched to a boombox in the kitchen.
As it turns out, these devices have to be taught to respond to commands, and the procedure is slightly different for each appliance. That typically involves saying a keyword three or four times until the device is satisfied that it can pick it out in a noisy room. For example, training the television remote required punching in each channel number, then repeating the keyword I chose for that channel. I also programmed macros, or single commands that trigger a sequence of responses. For example, the phrase "Play tape" turned on the television, tuned in Channel 3, turned on the VCR and pressed the Play button.
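A macro of this kind can be pictured as a lookup table that maps one spoken phrase to an ordered list of actions. The sketch below is only an illustration: the phrase table, the action labels and the `expand` function are assumptions made for the example, not the remote's actual command set or firmware.

```python
# Illustrative sketch only: the phrase table and action labels are assumptions,
# not the real command set of any remote described in the article.
MACROS = {
    "play tape": [
        "tv: power on",
        "tv: channel 3",
        "vcr: power on",
        "vcr: play",
    ],
    "tv power": ["tv: power toggle"],
}


def expand(phrase, macros=MACROS):
    """Return the ordered actions for a recognized phrase, or [] if unknown."""
    # Normalize case and spacing so "Play  tape" and "play tape" match.
    key = " ".join(phrase.lower().split())
    return list(macros.get(key, []))


if __name__ == "__main__":
    for action in expand("Play tape"):
        print(action)
```

An unrecognized phrase expands to an empty list, so in this sketch the device simply does nothing rather than guessing.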
The training process can be pretty humbling. First of all you are talking to a household appliance. Second, you are saying the same words over and over, hoping to get your point across. It's embarrassing when you say something important and somebody doesn't understand. It's even more embarrassing when that somebody is your toaster.
I am not the first person to be taken aback by the training required for some voice-controlled devices. "That's where a lot of people get into trouble," said John Lockyer, a senior technical adviser at Smarthome, a home-automation and smart-appliance retailer based in Irvine, Calif. "They expect it to be like 'Star Trek,' where voice recognition recognizes all voices, all languages, and it knows what you want it to do. But there can be a great deal of setup time."
Even worse, some of the devices talk back. A synthesized female voice in the television remote kept criticizing my delivery. I would utter a command like "TV power," and she would reply, "Too soft."
After completing the training with all of the devices, which took about three hours, we had a peaceable kingdom. I would say, "Radio on," and the clock radio would turn on.
When I said "Sports," it would tune in a sports talk station I had programmed. "TV power" turned on the television, and when I said "Discovery" the cable box clicked over to the Discovery Channel.
Then the poltergeist struck. It started with the living room lamp, which would turn on and off seemingly at will. The clock radio soon started doing the same thing: we would come home to an empty house and find that the radio had turned itself on and tuned in to an oldies station.
Then the sassy television remote started making programming decisions. Something about the voice of Bernie Mac, star of the Fox sitcom of the same name, kept making the remote switch channels. Once a movie commercial set off the remote to turn up the volume. That triggered the remote again. The set grew even louder. I managed to jump on the remote and turn it off before the speakers exploded.
Right around that time, my wife went to visit her sister 3,000 miles away.
To find out how to control my appliances a little better, I called Mr. Mozer at Sensory. The solution turned out to be voice spotting, a feature that Sensory includes on its chips that involves using a keyword to get a particular device's attention before uttering a command.
"We had one customer who did a voice-activated fireplace," he said. "You don't want your fireplace to accidentally go on, so there we used a gateway word. You had to say 'Superfireplace' or something like that first and then 'Turn on'."
Peace was restored in our house after I retrained the appliances by using voice spotting. Instead of saying "Sports" to my radio, I would first say, "Radio." Pause. An L.E.D. on the radio turned from orange to green, showing that it was ready to accept my command "Get sports." The radio might mistakenly hear its keyword, but it would rarely follow that up by also mistakenly hearing a command.
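The gateway-word idea amounts to a small two-state machine: idle until the keyword is heard, then briefly ready for one command. The sketch below is a minimal illustration under stated assumptions. The class name `GatedListener`, the three-second listening window and the action labels are all hypothetical, and nothing here reflects how Sensory's chips are actually programmed.

```python
class GatedListener:
    """Ignore commands unless the gateway keyword was heard shortly before."""

    def __init__(self, keyword, commands, window=3.0):
        self.keyword = keyword
        self.commands = commands   # spoken phrase -> action label (illustrative)
        self.window = window       # seconds the device stays ready (assumed value)
        self.armed_until = None    # None means idle, like the orange light

    def hear(self, phrase, now):
        """Handle one recognized phrase at time `now` (seconds).

        Returns an action label, or None if the phrase is ignored.
        """
        phrase = phrase.lower().strip()
        if phrase == self.keyword:
            # Keyword heard: become ready, like the light turning green.
            self.armed_until = now + self.window
            return None
        armed = self.armed_until is not None and now <= self.armed_until
        # Any other phrase, or a timeout, sends the device back to idle.
        self.armed_until = None
        if armed:
            return self.commands.get(phrase)
        return None


if __name__ == "__main__":
    radio = GatedListener("radio", {"get sports": "tune to sports station"})
    print(radio.hear("get sports", 0.0))   # ignored: keyword not heard yet
    radio.hear("radio", 1.0)
    print(radio.hear("get sports", 2.0))   # accepted within the window
```

The design choice is that a stray sound has to be misheard twice in a row, first as the keyword and then as a command, before anything happens. If the two mistakes were independent, the chance of both would be the product of the individual chances, which is why accidental triggers become rare.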
After that problem was fixed, I started to appreciate the convenience of using voice interfaces. It's a lot easier to say "TV ... HBO" than to remember and punch in a two- or three-digit channel number for every channel. It's also nice to be able to control the television while eating.
Similarly, the Siemens Gigaset 4215 cordless telephone lets me call people simply by pressing a button and saying a name into the mouthpiece. This is helpful for people like me who can never remember phone numbers. The Gigaset 4215 can dial only 20 people by voice command, though; if another person in the house trains the phone to his or her voice, the two of you get only 10 numbers each.
Voice dialing is being built into cellphones, too. Voice Signal Technologies, for example, makes software for cellphone manufacturers that converts text information without any user training. It can recognize thousands of names in contact lists downloaded from your PC to your cellphone, allowing you to dial by saying "Pete Jenkins's home" or "Maria Gonzales's cellphone."
The benefits of voice activation are most pronounced in devices that usually require a lot of button-pushing. For example, e.Digital's MXP 100 MP3 player lets you play a song by simply uttering the title and then saying "Play." It does this without any training by matching what you say to titles on its song lists. I kept trying to trip it up with song titles like "Mandolin Wind," but it almost always found the correct song.
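Matching an utterance against a fixed song list can be loosely illustrated with text. The sketch below uses fuzzy string matching from Python's standard `difflib` module as a stand-in. A real player compares sounds, not spellings, so this is an analogy and not the MXP 100's method. The function name `find_title`, the 0.6 similarity cutoff and the sample song list are assumptions for the example.

```python
import difflib


def find_title(heard, titles, cutoff=0.6):
    """Return the title closest to the (possibly misheard) text, or None."""
    # Compare in lowercase, but return the title as it appears in the list.
    by_lower = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(
        heard.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    # Illustrative song list, not taken from the device.
    songs = ["Mandolin Wind", "Maggie May", "Reason to Believe"]
    print(find_title("mandolin wine", songs))
```

Because the search is limited to titles already on the list, a slightly garbled input can still land on the right song, while something unlike every title returns nothing.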
People in the industry anticipate some problems as more voice-controlled devices hit the market. The most obvious one is that such devices require their human master to remember dozens of commands for each device.
"There aren't standards there in terms of the actual language that you're using to control these devices," said Dr. Judith A. Markowitz, a speech-processing industry analyst in Chicago. "You make up a command, and at the moment it is the most logical command for that action. A day later you have no idea what that command is."
Mr. Mozer said that a solution might be to have one device control all the others.
"Rather than having to learn a new interface and read a new manual every time you buy a consumer electronic product, you can use this device as your interface," he said. "So you tell it to set the microwave to high. You tell it to record 'Gilligan's Island' on TV. And it has the intelligence to come back and say, 'Hey, do you want "Gilligan's Island" with this episode or this other episode because they're both playing today.'"
That technology may be closer than people think. Home automation products like the HAL2000 voice control system from Home Automated Living use a Windows-based PC and a home networking system to control appliances. HAL2000 recognizes the owner's voice commands and even uses the PC's modem to accept voice commands over the phone.
So not only will appliances listen to everything you say, they will even take your phone calls.