Tiny board puts accurate speech recognition into consumer designs

SENSORY DEBUTS ULTRA-COMPACT VOICE RECOGNITION MODULE

Low-Cost VR Stamp Facilitates Quick System Design

SANTA CLARA, Calif.—Sensory, Inc., the world leader in embedded speech technologies, releases the VR Stamp module, providing easy integration of voice recognition (VR) into consumer, industrial, automotive and medical electronics. The heart of the VR Stamp module is the RSC-4128 integrated circuit, the latest member of Sensory's award-winning RSC Family of mixed signal processors that provide speech recognition, synthesis and system control on a single chip. All of the additional components required for a functional speech recognition system are squeezed into the VR Stamp's ultra-compact footprint. With minor hardware interfacing additions, any electronic product can recognize and speak with the VR Stamp.

“The VR Stamp makes it quick and easy for a developer to incorporate voice recognition and speech synthesis into products such as set top boxes, medical instrumentation, industrial controls, and much more,” notes Bill Teasley, Sensory's vice president of engineering. “This new product virtually eliminates system design obstacles, making it practical to add the exciting new option of speech input and output to any human interface. Imagination is the only limitation.”

Complete Speech Recognition System on Board

The VR Stamp module includes a fully functional system based on Sensory's RSC-4128 mixed signal processor, a powerful 8-bit microcontroller inside a voice recognition system on a chip (including 16-bit ADC, DAC, digital filtering, RAM, ROM, output amplification, timers, comparators and more). In addition, flash memory, serial EEPROM, main clock and real time clock crystals, along with power noise management components, are all densely packed into its standard 40-pin DIP footprint. The VR Stamp can act as a speech recognition slave, or be the primary host controller of the end product along with providing the speech recognition features. Sensory's FluentChip software is included and provides high-accuracy speech recognition, speaker verification, speech compression and output, music synthesis, as well as diagnostic and utility programs. The VR Stamp modules sell for under $30 in volume, and can handle multi-level menus of speech recognition command sets and speech synthesis prompts.
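To make the "multi-level menus" idea concrete, here is a minimal sketch of how an application might walk a two-level command tree. Everything in it is an assumption for illustration: the command words, the action names, and the `recognize` callback (a stand-in for whatever recognition call the application actually uses) are hypothetical and are not taken from Sensory's FluentChip API.

```python
from typing import Callable, Dict, List, Optional, Union

# Hypothetical menu: each level maps a spoken command either to a deeper
# level (a nested dict) or to an action name (a string) at a leaf.
Menu = Dict[str, Union[str, "Menu"]]

MENU: Menu = {
    "lights": {"on": "LIGHTS_ON", "off": "LIGHTS_OFF"},
    "volume": {"up": "VOLUME_UP", "down": "VOLUME_DOWN"},
}


def run_menu(recognize: Callable[[List[str]], Optional[str]],
             menu: Menu = MENU) -> Optional[str]:
    """Walk the menu one spoken command per level.

    `recognize` receives the words that are active at the current level and
    returns the word it heard, or None if nothing matched.
    """
    level: Union[str, Menu] = menu
    while isinstance(level, dict):
        word = recognize(sorted(level))
        if word not in level:
            return None  # no match: give up (a real product might re-prompt)
        level = level[word]
    return level  # the action name found at the leaf


def scripted(words: List[str]) -> Callable[[List[str]], Optional[str]]:
    """Stand-in recognizer that replays a fixed list of 'heard' words."""
    pending = iter(words)

    def recognize(active: List[str]) -> Optional[str]:
        heard = next(pending, None)
        return heard if heard in active else None

    return recognize


if __name__ == "__main__":
    print(run_menu(scripted(["lights", "off"])))
```

Restricting the active vocabulary to the words valid at the current level keeps each recognition step small, which is the usual reason for structuring commands as a menu rather than as one flat list.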

Development Tools Cut Time to Market Dramatically

Not only does the VR Stamp significantly simplify hardware system design, but the VR Stamp Toolkit also makes the development of speech command sets, speech synthesis prompts, music, speech I/O application design and end product circuit design a snap. The VR Stamp Toolkit includes Quick T2SI – Lite, a special edition of Sensory's highly-acclaimed Quick T2SI tool that allows speaker-independent vocabulary set development using simple text input to create the desired commands. Also included is Quick Synthesis, which will compress digital recordings of speech prompts in seconds and supports easy scoring of MIDI-like music; a C-compiler for efficient programming; an Integrated Development Environment for easy project management; and a VR Stamp Programming Board, which connects to a PC via USB for downloading executable code to the VR Stamp module. Sample programs and circuit designs familiarize the developer with possible applications for Sensory's technologies and ensure a quick product development cycle. International languages are supported by the Quick T2SI – Lite tool, making goods manufactured with the VR Stamp accessible around the world. The complete VR Stamp Toolkit, with Quick T2SI – Lite, C-Compiler, Quick Synthesis, two VR Stamps, a VR Stamp Programming Board, and more, retails for $495.

Nestled between the giant IC companies in Silicon Valley is Sensory, Inc., a firm known for its low-cost embedded speech technologies. In addition to the new VR Stamp module discussed in the press statement (on the left), Sensory offers a plethora of chip and software-only approaches to speech recognition, speech and music synthesis, and speaker verification.

In fact, a Sensory users list reads like a Who's Who of the consumer electronics industry. The list includes the likes of Avon, JVC, Kenwood, Matsushita, MGA, Hasbro, Mitsubishi, Fisher-Price, Radica, Sega, Sharper Image, Mattel, Sony, Toshiba, Uniden, and even test-and-measurement giant Tektronix.

What's In A Name?

I find it intriguing that Sensory named its new voice recognition product similarly to the popular PIC-based BASIC Stamp mini-modules from Parallax. Regardless of the moniker, the VR Stamp is quite small, so the name is appropriate. Nonetheless, I can't help but wonder how Parallax's attorneys might feel about the name!

Packaged in its diminutive 40-pin DIP, the VR Stamp is replete with a low-noise audio channel, 1-Mbit of flash, and two dozen digital I/O lines. As you'd expect, there are also pins for connection to a mike and loudspeaker, as well as pins for power and ground. There's also an RS-232 serial interface.


Speaking of power, this is a low-dissipation adjunct. As such, it should find a home in many OEM designs and products where current draw is a design consideration. Operating from a single-ended supply of 2.7 V to 3.6 V, it sips just 26 mA (at 3 V), and a power-down Sleep mode drops demand to less than 20 µA.
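For a feel of what those numbers mean in a battery-powered product, here is a rough back-of-the-envelope sketch. The 26 mA active current, the 20 µA Sleep-mode ceiling, and the 3 V supply come from the figures above; the duty cycle and the battery capacity are purely hypothetical values chosen for illustration, and the estimate ignores the rest of the circuit, regulator losses, and battery self-discharge.

```python
ACTIVE_MA = 26.0   # active current at 3 V (figure quoted in the article)
SLEEP_UA = 20.0    # upper bound on Sleep-mode current (figure quoted in the article)
SUPPLY_V = 3.0     # supply voltage at which the active figure is quoted

# Power at 3 V: milliamps x volts = milliwatts, microamps x volts = microwatts.
ACTIVE_POWER_MW = ACTIVE_MA * SUPPLY_V   # 78 mW while recognizing or speaking
SLEEP_POWER_UW = SLEEP_UA * SUPPLY_V     # at most 60 uW while asleep


def average_current_ma(active_fraction: float) -> float:
    """Time-weighted average current for a module that is active part-time."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    sleep_ma = SLEEP_UA / 1000.0
    return ACTIVE_MA * active_fraction + sleep_ma * (1.0 - active_fraction)


def battery_life_hours(capacity_mah: float, active_fraction: float) -> float:
    """Idealized runtime: capacity divided by average current."""
    return capacity_mah / average_current_ma(active_fraction)


if __name__ == "__main__":
    # Hypothetical scenario: 1000 mAh battery, module awake 1% of the time.
    print(f"active power: {ACTIVE_POWER_MW:.0f} mW")
    print(f"average current: {average_current_ma(0.01):.4f} mA")
    print(f"runtime: {battery_life_hours(1000.0, 0.01):.0f} hours")
```

The point of the exercise is that the active current dominates even at a low duty cycle, so how often the module wakes matters far more than the Sleep-mode figure.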

The press release also notes that the company's VR Stamp Toolkit's development tool supports speaker-independent vocabulary development. It's worth pointing out that the little VR Stamp can also be used in either a speaker-independent or speaker-dependent mode. That makes it quite flexible.

Rapid Deployment

As the press statement exhaustively states, the VR Stamp Toolkit contains just about everything necessary to program a VR Stamp, but Sensory assumes Toolkit users are experienced programmers who understand assembly and/or C coding. You're also expected to understand embedded systems development methods, relocatable object code, and similar concepts.

Once developed, your speech recognition application is loaded into the VR Stamp module using the VR Stamp Programmer Board connected to your PC's USB (Universal Serial Bus) port. Once programmed, a VR Stamp can then be plugged into your target product's DIP socket.

To get you up-and-running, Sensory's development kit includes samples and demos of the company's speech technologies. Quick Synthesis 4, for example, compresses recordings of speech, which supports low data-rate synthesis.
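Why the data rate matters becomes obvious with a little arithmetic on the module's 1 Mbit of flash. The sketch below is only illustrative: the bit rates are hypothetical round numbers, not Quick Synthesis specifications, the megabit is assumed to be the binary 1,048,576 bits, and the calculation pretends the whole flash holds speech when in practice it also holds the application code.

```python
FLASH_BITS = 1 * 1024 * 1024  # 1 Mbit, assuming the binary (2**20) megabit


def seconds_of_audio(bits_available: int, bitrate_bps: float) -> float:
    """How many seconds of speech fit in a given number of bits."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate_bps must be positive")
    return bits_available / bitrate_bps


# Assumed reference point: uncompressed 8 kHz, 8-bit samples = 64,000 bit/s.
UNCOMPRESSED_BPS = 8000 * 8

# Hypothetical compressed rates, chosen only to show the trend.
EXAMPLE_BITRATES_BPS = (UNCOMPRESSED_BPS, 16000, 8000, 4000)

if __name__ == "__main__":
    for bitrate in EXAMPLE_BITRATES_BPS:
        secs = seconds_of_audio(FLASH_BITS, bitrate)
        print(f"{bitrate:>6} bit/s -> {secs:6.1f} s of speech")
```

Under these assumptions, uncompressed audio would fill the flash in roughly 16 seconds, so every halving of the bit rate doubles the amount of prompt material a design can carry.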

You also get a C compiler and debugger. Development is done using the Phyton MCC-SE IDE. It includes a USB dongle for installing the C compiler. A time-limited license for the compiler is included with the toolkit. These tools run under Windows NT, Windows 2000 and Windows XP.

The VR Stamp toolkit costs a bit less than $500. Add another $100 for the programmer board. A serial EEPROM module adds about $40 more to the tab.

Multi-Faceted Software

The company’s statement also mentions its FluentChip software. It’s included with every VR Stamp. FluentChip not only provides speech recognition and speaker verification, but it also supports speech compression and music synthesis. It also helps with diagnostics, and the package includes utilities.

Significantly, just about all of the technologies in the company's existing FluentChip library are available for the VR Stamp. The only application that can't be used on the VR Stamp is Sensory's Record & Play, and that's due to the limited storage capacity on the VR Stamp.

But FluentChip is capable of running HMM (Hidden Markov Model) and neural network speaker-independent and speaker-dependent technologies. It also handles speaker verification, speech and music synthesis, and sound effects. All of this code runs on Sensory's RSC-4x microcontrollers.

Dedicated Speech Silicon

The RSC-4xs are billed by Sensory as next-generation speech processors that are especially tailored for cost-sensitive applications. Equipped with on-chip 8-bit microcontrollers, the RSC-4xs are the epitome of dedicated devices, integrating many speech-optimized digital and analog processing blocks onto one die.

RSC-4x devices include microphone pre-amps with gain control, twin-DMA (direct memory access) units, vector accelerators, hardware multipliers, three timers, ROM, and 4.8-kbytes of RAM. Multiple ROM options are also available for these devices.


The upshot of all of this special integration is accuracy. According to Sensory, its RSC-4xs and speech algorithms reach a recognition accuracy of greater than 97% for speaker-independent recognition, and greater than 99% for speaker-dependent recognition. That’s pretty darned good, especially when you consider that a VR Stamp will add less than $30 to the BOM of your next design.
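It's worth seeing how those per-command figures compound when a user has to speak several commands in a row, as in a multi-level menu. The sketch below treats Sensory's quoted 97% and 99% as floors and assumes each recognition is statistically independent of the others; that independence is my simplifying assumption, not a claim from Sensory, so the results are rough lower bounds rather than measured performance.

```python
def sequence_success(per_command_accuracy: float, commands: int) -> float:
    """Probability that every command in a sequence is recognized correctly,
    assuming each recognition is independent of the others."""
    if not 0.0 <= per_command_accuracy <= 1.0:
        raise ValueError("per_command_accuracy must be between 0 and 1")
    if commands < 0:
        raise ValueError("commands must be non-negative")
    return per_command_accuracy ** commands


SPEAKER_INDEPENDENT = 0.97  # quoted as "greater than 97%", used as a floor
SPEAKER_DEPENDENT = 0.99    # quoted as "greater than 99%", used as a floor

if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        si = sequence_success(SPEAKER_INDEPENDENT, n)
        sd = sequence_success(SPEAKER_DEPENDENT, n)
        print(f"{n} command(s): SI >= {si:.3f}, SD >= {sd:.3f}")
```

With these floors, a three-command sequence would succeed at least about 91% of the time in speaker-independent mode and about 97% of the time in speaker-dependent mode, which is a reasonable argument for keeping voice menus shallow.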

Just like Parallax’s BASIC Stamps, Sensory’s mini-module “stamp” looks like it has the potential for widespread adoption. Learn more by clicking here for a VR Stamp datasheet (in Adobe Acrobat .PDF format).

Click here to access a library of useful speech recognition documents from Sensory's Web site.

For further details contact Sensory, Inc., 1991 Russell Avenue, Santa Clara, Calif. 95054-2035. Phone: (408) 327-9000. Fax: (408) 727-4748.

Sensory, 408-327-9000, www.sensoryinc.com
