Alango Technologies - Making Digital Sound Better. Technologies. Voice Enhancement. Voice Activity Detection

About

Alango’s Voice Activity Detection (VAD) technology reliably detects human speech in an acoustic signal. The technology is based on a proprietary, high-resolution spectral noise estimation algorithm operating in real time. The sensitivity level of VAD is adjustable, which helps minimize false-positive detections in the presence of background noise and other non-speech sounds.

VAD requires less than 2 MIPS of processing, so a device that implements it consumes very little power in standby mode. VAD runs as the first stage of the signal processing path and ensures that heavier signal processing tasks (e.g., acoustic echo cancellation, beamforming, noise suppression, and speech recognition) can remain “asleep” until voice is detected. When VAD detects voice activity, the system “wakes up” and full signal processing begins.
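The wake-on-voice arrangement described above can be sketched as follows. This is a minimal illustration, not Alango's implementation: the detector is a simple energy threshold over a tracked noise floor standing in for the proprietary spectral algorithm, and the names `EnergyVad`, `WakeOnVoicePipeline`, `threshold_db`, and `hangover_frames` are all hypothetical. The sketch also assumes the first frame contains only background noise.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnergyVad:
    """Toy stand-in detector (NOT Alango's algorithm).

    Flags a frame as voice when its energy exceeds a tracked
    noise floor by more than threshold_db.
    """
    threshold_db: float = 6.0            # hypothetical sensitivity setting
    floor_alpha: float = 0.95            # smoothing factor for the noise floor
    noise_floor: Optional[float] = None  # set from the first frame

    def is_voice(self, frame):
        energy = sum(s * s for s in frame) / len(frame) + 1e-12
        if self.noise_floor is None:
            # Assumption: the very first frame is background noise.
            self.noise_floor = energy
            return False
        margin_db = 10.0 * math.log10(energy / self.noise_floor)
        active = margin_db > self.threshold_db
        if not active:
            # Track the floor only on frames judged to be noise.
            self.noise_floor = (self.floor_alpha * self.noise_floor
                                + (1.0 - self.floor_alpha) * energy)
        return active


class WakeOnVoicePipeline:
    """Runs the expensive stage only while voice is (or was recently) present."""

    def __init__(self, vad, heavy_stage, hangover_frames=20):
        self.vad = vad
        self.heavy_stage = heavy_stage          # e.g. AEC, beamforming, ASR
        self.hangover_frames = hangover_frames  # hypothetical: frames to stay awake
        self._awake_for = 0

    def process(self, frame):
        if self.vad.is_voice(frame):
            self._awake_for = self.hangover_frames
        elif self._awake_for > 0:
            self._awake_for -= 1
        if self._awake_for > 0:
            return self.heavy_stage(frame)  # system is "awake"
        return None                         # heavy processing stays "asleep"
```

The hangover counter is a design choice of this sketch rather than something the source specifies: it keeps the heavy stage running briefly after the last detected voice frame so that short pauses between words do not put the system back to sleep.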

Demonstration Video

Watch this video on Alango's YouTube channel

Applications

In portable, voice-enabled devices, VAD enables always-on voice detection at low standby power by allowing MIPS-heavy signal processing to “sleep” until voice is detected.

In smart speaker applications, VAD makes battery-powered operation practical. Common always-on smart speakers consume roughly 3 watts on average during standby. Implementing VAD in a smart speaker can significantly reduce standby power, which opens the way for battery-powered always-on devices.
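To see why standby power matters for battery operation, a rough calculation helps. The sketch below uses the roughly 3 W standby figure quoted above; the 20 Wh battery capacity and the 0.05 W gated standby power are hypothetical values chosen only for illustration and are not Alango measurements.

```python
def standby_hours(battery_wh, standby_watts):
    """Idealized standby time: battery energy divided by average power draw."""
    if standby_watts <= 0:
        raise ValueError("standby_watts must be positive")
    return battery_wh / standby_watts


BATTERY_WH = 20.0        # hypothetical battery capacity
ALWAYS_ON_WATTS = 3.0    # typical always-on standby figure quoted in the text
GATED_WATTS = 0.05       # hypothetical standby power with VAD gating

always_on = standby_hours(BATTERY_WH, ALWAYS_ON_WATTS)
gated = standby_hours(BATTERY_WH, GATED_WATTS)
print(f"always-on standby: {always_on:.1f} h")
print(f"VAD-gated standby: {gated:.1f} h")
```

Under these assumed numbers, the always-on device lasts under seven hours on standby, while the gated device lasts hundreds of hours. Real figures depend on the hardware and on how often voice wakes the system.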

In security system applications, VAD can be used to trigger the recording or transmission of audio/video feeds when voice is detected in a monitored area.

In headset and hearable applications, VAD enables always-on voice detection at low standby power by allowing MIPS-heavy signal processing to “sleep” until voice is detected.

Additionally, in hearable applications (such as smart earbuds), when used with an in-ear bone-conduction (motion) sensor, VAD can specifically detect the user’s own voice activity, distinguishing it from other signals picked up by the motion sensor, such as jaw movements (e.g., chewing), walking, running, or tapping on the device.

In smart appliance applications, VAD can be used to enable an always-on standby mode that is always ready for voice commands yet consumes little power, reducing the user's energy costs over time.

In automotive applications, VAD can be used to wake up a car when voice is detected, allowing the system to proceed to biometric signal processing. Used this way, VAD allows always-on operation while ensuring that power-heavy processing runs only when voice is detected.

Integration

VAD can be used standalone, or integrated with the Alango Voice Enhancement Package (VEP) or Voice Communication Package (VCP) for a complete preprocessing solution.

Technical Information

PERFORMANCE

Small footprint and low-MIPS implementation (e.g., < 2 MCPS on ARM M4)
All sampling rates are supported (e.g., 32/24/16/8/4/2 kHz)
Can be applied to acoustic microphone and low-bandwidth vibration sensor signals
Flexibly configurable settings, including sensitivity to harmonicity and SNR
Low-latency reaction (typically < 50 ms)
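The figures above translate into concrete sample and cycle budgets. The following sketch works them out; the helper names are hypothetical, and the 10 ms frame length is an assumption made for illustration, since the source does not state the frame size VAD uses.

```python
def samples_in_window(sample_rate_hz, window_ms):
    """Number of samples that arrive within a time window."""
    return sample_rate_hz * window_ms // 1000


def cycles_per_frame(mcps, frame_ms):
    """Processor cycles available per frame at a given MCPS budget.

    MCPS * 1e6 cycles/s * (frame_ms / 1000) s = mcps * 1000 * frame_ms.
    """
    return mcps * 1000 * frame_ms


LATENCY_MS = 50   # typical reaction time quoted in the list above
FRAME_MS = 10     # hypothetical frame length, not stated in the source

for rate_hz in (32000, 24000, 16000, 8000, 4000, 2000):
    n = samples_in_window(rate_hz, LATENCY_MS)
    print(f"{rate_hz // 1000:>2} kHz: {n} samples within {LATENCY_MS} ms")

print(f"cycle budget per {FRAME_MS} ms frame at 2 MCPS: "
      f"{cycles_per_frame(2, FRAME_MS)}")
```

At 16 kHz, for example, a 50 ms reaction time corresponds to 800 samples, and a 2 MCPS budget leaves 20,000 cycles for each assumed 10 ms frame.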

AVAILABILITY

VAD is available on the following platforms:

ARM cores (all types)
CEVA TeakLite-III, TeakLite-4
Synopsys ARC cores
Cadence (Tensilica) HiFi 2, HiFi 3
Qualcomm 512X
Porting to other platforms can be performed quickly.

Please contact Alango technical support for specific information.
