How does AEC work?

Q-SYS Level 1 Training : Telephony Deployment

Video Transcript

How does AEC work? 6m 57s
Welcome back. In order to understand how the AEC component works,
let’s take a look at what happens to an audio signal that starts from the Far-End and is sent to the Near-End,
and what AEC does to prevent its echoes from making it back to the Far-End.
Here’s a diagram of this audio signal’s round trip journey.
You’d never know it from the outside, but the Acoustic Echo Canceler puts the audio signal
through a lot of sub-systems, including the Adaptive Filter and Adaptive Algorithm, Double-Talk Detection,
Non-Linear Processing, Noise Reduction, and Comfort Noise. Let’s start with the Adaptive Filter.
AEC’s goal is to eliminate any trace of the Far-End talker’s voice from the Near End microphone feed –
including all direct and indirect paths from the loudspeaker to the microphone.
In order to delete this noise, the AEC component needs to be able to predict what that noise is going to sound like.
If we broadcast a sharp, impulsive noise from the loudspeaker like a loud click or a gunshot, we could then
record the signal that comes into the microphone, and obtain a recording that looks something like this.
Now, this first peak here represents the direct path from the loudspeaker to the microphone, and all the subsequent spikes
represent the various reflections around the room – you'll notice that the longer a reflection takes to reach the microphone,
the more attenuated it has become. This image is known as the room impulse response,
and it's a predictive map of what will happen to any noise that comes out of the loudspeaker.
This room impulse response is used to create a Finite Impulse Response – or FIR – Filter,
here in the Adaptive Filter part of the AEC system.
When a signal comes from the Far-End, it is fed both to the Near End loudspeaker and to the Adaptive Filter.
The FIR Filter is applied to the incoming signal to create its prediction of what that signal should
sound like when it is received by the microphone.
Then this noise is digitally subtracted from the Near End microphone signal – the result should be silence.
The magic part is that the subtraction operation won’t affect any additional noise in the microphone signal,
such as the Near End talker’s voice, letting the Far End talker have a crystal clear conversation
without hearing his own echoes.
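
The predict-and-subtract step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Q-SYS implementation: the `room` taps, the sample values, and the helper name `fir_filter` are all made up for the example, and the filter is assumed to match the room perfectly.

```python
def fir_filter(signal, taps):
    """Apply FIR taps to a signal (direct convolution, same length as input)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

# Hypothetical room impulse response: a direct path, then two weaker reflections.
room = [0.0, 0.8, 0.0, 0.3, 0.0, 0.1]

far_end = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0, 0.5, 0.0]
near_talker = [0.0, 0.0, 0.2, -0.1, 0.0, 0.3, 0.0, 0.0]

# What the microphone picks up: the far end's echo plus the near-end talker.
echo = fir_filter(far_end, room)
mic = [e + s for e, s in zip(echo, near_talker)]

# The filter predicts the echo from the far-end signal (assumed perfectly converged).
predicted_echo = fir_filter(far_end, room)

# Subtracting the prediction leaves only the near-end talker.
sent_to_far_end = [m - p for m, p in zip(mic, predicted_echo)]
```

Because the subtraction removes only what the filter predicts, the near-end talker's samples pass through unchanged.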
No, I can hear you loud and clear. So how’ve you guys been?
However, there is a fundamental problem with this model: the room impulse response is constantly changing.
Whenever a door opens, or someone sits down, or a butterfly flaps its wings, the surfaces in the Near End room change.
That means the acoustic paths from the loudspeaker to the microphone have changed, and so has the room impulse response.
Now it’s not really a good idea to constantly broadcast big loud sounds to keep up with these changes.
Instead the Adaptive Algorithm is used to constantly update the Filter, by monitoring the result of the subtraction operation
and then adjusting the Filter until the result is as close to zero, or silence, as possible.
This Adaptive Algorithm is always at work, trying to keep the filter converged with the dynamic room impulse response.
However, it can only do its job when the Far End is talking and the Near End is silent.
This is the only time when the microphone signal, after the subtraction operation, would equal zero.
If the Far End is silent then there’s nothing to measure, and if the Near End is talking then there’s extra audio
in the microphone so the result won’t be zero.
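
The transcript does not say which adaptive algorithm Q-SYS uses. As an illustration only, the sketch below uses normalized least mean squares (NLMS), a common textbook choice, to nudge the filter taps toward whatever makes the subtraction result smallest. The function name, step size, and the simulated `room` are assumptions, and the Near End is kept silent so the microphone contains only echo.

```python
import random

def nlms_converge(far_end, mic, num_taps, step=0.5, eps=1e-8):
    """Adapt FIR weights so the predicted echo cancels the mic signal."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        predicted = sum(w * h for w, h in zip(weights, history))
        error = d - predicted  # result of the subtraction operation
        power = sum(h * h for h in history) + eps
        # Move each weight in the direction that shrinks the error.
        weights = [w + step * error * h / power
                   for w, h in zip(weights, history)]
        errors.append(error)
    return weights, errors

random.seed(0)
room = [0.0, 0.8, 0.0, 0.3, 0.0, 0.1]  # hypothetical room response
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]

# Near End is silent, so the microphone hears only the echo.
mic = [
    sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(far_end))
]

weights, errors = nlms_converge(far_end, mic, num_taps=len(room))
```

With the Far End talking and the Near End silent, `weights` ends up close to `room` and the error shrinks toward zero, without ever needing a test click from the loudspeaker.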
This is a job for the Double-Talk-Detector, or DTD. The DTD listens to both the Far End and Near End microphones
and determines whether someone is speaking at each end.
If the Far End is speaking and the Near End is not, then it allows the Adaptive Algorithm to do its job of converging
the Adaptive Filter to the room impulse response. In any other situation, the DTD will prevent the Adaptive Algorithm from working.
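
How the Q-SYS detector decides is not described here. One classic textbook approach is the Geigel detector, which assumes the echo always arrives noticeably quieter than the loudest recent far-end sample, so a loud microphone sample must contain near-end speech. The sketch below applies that idea; the function name `may_adapt` and both thresholds are assumptions chosen for illustration.

```python
def may_adapt(far_recent, mic_sample, activity_floor=0.01, geigel_threshold=0.5):
    """Return True only when the Far End is talking and the Near End is not.

    far_recent: non-empty list of the most recent far-end samples.
    mic_sample: the current near-end microphone sample.
    """
    far_peak = max(abs(x) for x in far_recent)
    far_end_talking = far_peak > activity_floor
    # Geigel test: a mic sample this loud cannot be echo alone.
    near_end_talking = abs(mic_sample) >= geigel_threshold * far_peak
    return far_end_talking and not near_end_talking

# Far End talking, mic holds only a quiet echo: adaptation is allowed.
only_far_end = may_adapt([0.8, -0.6, 0.7], 0.2)

# Far End talking, but the mic is loud: double-talk, so adaptation is frozen.
double_talk = may_adapt([0.8, -0.6, 0.7], 0.6)

# Far End silent: nothing to measure, so adaptation is frozen.
far_silent = may_adapt([0.0, 0.001, 0.0], 0.0)
```

The returned flag would gate the weight update in the adaptive algorithm, leaving the filter untouched whenever the flag is False.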
Once all of these filters and algorithms have been applied to the signal, it still has several processes to go through before it
makes it back to the Far End talker. First it goes through a Non-Linear Processor, or NLP.
Because of the difficulty in completely converging the FIR filter with the room impulse response, there is bound
to be some residual echo left in the microphone signal at this point.
The Non-Linear Processor continuously analyzes the audio to determine if it is composed primarily of
near-end speech, or of residual far-end echoes. It pinpoints the sections that are made up of only echoes and attenuates those sections.
The remaining echoes will be effectively inaudible over the desired near-end speech.
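
A very rough way to picture this stage: work frame by frame, and when the leftover signal is tiny compared with the echo the filter predicted, treat the frame as echo-only and turn it down. The sketch below does exactly that. The decision rule, the `ratio` and `attenuation` values, and the name `nlp_suppress` are all assumptions; a real Non-Linear Processor is considerably more sophisticated.

```python
def nlp_suppress(residual_frames, echo_estimate_frames, ratio=0.1, attenuation=0.05):
    """Attenuate frames that appear to hold only residual echo."""
    out = []
    for res, est in zip(residual_frames, echo_estimate_frames):
        res_energy = sum(s * s for s in res)
        est_energy = sum(s * s for s in est)
        # Far End active and the leftover is small relative to the predicted
        # echo: assume the frame contains residual echo, not near-end speech.
        echo_only = est_energy > 0 and res_energy < ratio * est_energy
        gain = attenuation if echo_only else 1.0
        out.append([gain * s for s in res])
    return out

residual = [[0.01, -0.01], [0.5, -0.4]]   # frame 1: faint leftover; frame 2: speech
echo_est = [[0.5, -0.5], [0.5, -0.5]]     # predicted echo for the same frames

cleaned = nlp_suppress(residual, echo_est)
```

In this example the first frame is attenuated and the second, which carries near-end speech, passes through untouched.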
Next in the processing path is Noise Reduction, or NR. Noise Reduction attempts to remove ambient room noise
by listening for steady sustained noise in the signal and reducing it.
This is so the Far End talker hears your voice, and not your air conditioning hum,
or the lawn mowers outside the window, or the invading alien army.
You can adjust the amount of Noise Reduction in your AEC’s control panel, and you can also enable or disable it with this button.
Finally, the Comfort Noise block is a special feature of the Q-SYS AEC system. After going through Non-Linear Processing
and Noise-Reduction, the Far End should hear the Near End talker loud and clear with everything else being quiet.
Too quiet. If the Near End talker stops speaking, the line might go silent and give the impression that the
telephone line has been disconnected.
Basically it’s a byproduct of the AEC doing its job too well. It actually sounds very strange...
When there’s complete silence... In between voices, right?
So the Comfort Noise can be added, which is an artificial low-pass noise signal that makes it sound like there’s
still a connection when nobody is talking.
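
As a sketch of what an artificial low-pass noise signal could look like, the code below runs random noise through a simple one-pole low-pass filter and mixes it into the outgoing audio. The `level`, `smoothing`, and `seed` parameters and both function names are assumptions for illustration, not Q-SYS settings.

```python
import random

def comfort_noise(num_samples, level=0.01, smoothing=0.9, seed=0):
    """Generate quiet low-pass filtered noise."""
    rng = random.Random(seed)
    noise = []
    state = 0.0
    for _ in range(num_samples):
        white = rng.uniform(-1.0, 1.0)
        # One-pole low-pass: each output leans heavily on the previous one.
        state = smoothing * state + (1.0 - smoothing) * white
        noise.append(level * state)
    return noise

def add_comfort_noise(signal, level=0.01):
    """Mix comfort noise into the signal sent to the Far End."""
    noise = comfort_noise(len(signal), level=level)
    return [s + n for s, n in zip(signal, noise)]

silent_line = [0.0] * 1000
with_noise = add_comfort_noise(silent_line)
```

The line is no longer dead silent, but the added noise never exceeds the chosen `level`.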
You can adjust the level of the Comfort Noise added in the control panel as well. The only other features in the
control panel are a master bypass to turn off your AEC, and the Echo Return Loss Enhancement meter, which
shows you how much, in decibels, the Far-End’s echoes have been attenuated in the return signal.
The nominal level for this meter will vary depending on the distance between your loudspeakers and
your microphones, but it should still give you a good idea of how effectively your AEC component is operating.
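
Echo Return Loss Enhancement is conventionally defined as the ratio, in decibels, of the echo power before cancellation to the residual echo power after it. The sketch below computes that ratio from two sample lists; the function name and the sample values are made up, and it assumes both lists were captured while only the Far End was talking.

```python
import math

def erle_db(mic_before, mic_after):
    """Echo Return Loss Enhancement in dB: echo power before vs. after AEC."""
    power_before = sum(s * s for s in mic_before) / len(mic_before)
    power_after = sum(s * s for s in mic_after) / len(mic_after)
    if power_after == 0.0:
        return math.inf  # echo removed completely
    return 10.0 * math.log10(power_before / power_after)

before = [1.0, -1.0, 1.0, -1.0]   # echo as picked up by the microphone
after = [0.1, -0.1, 0.1, -0.1]    # what remains after cancellation

erle = erle_db(before, after)
```

Here the echo amplitude dropped by a factor of ten, which corresponds to roughly 20 dB on the meter.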
So that’s what happens inside the magic box – which fortunately you’ll never need to worry about.
All you have to do is make sure it’s connected properly and then forget about it. Unlike a lot of products out there,
the Q-SYS echo cancelation is included as part of the Designer software – there is no additional hardware to set up
and no additional fees. It’s simply part of the Q-SYS system. Now, in the next section we’ll look at how to set it up in conjunction
with the Softphone component to create a teleconferencing system, so feel free to move on whenever you’re ready.

Lesson Description

Explore the many processes that AEC uses to silence the echoes of the far-end caller’s voice.

Tips and Definitions

Room Impulse Response: A diagram of what happens to a sharp, impulsive noise after traveling through the room.

Adaptive Filter: This process applies the Finite Impulse Response (FIR) Filter, created from the Room Impulse Response, to the Far-End signal to predict what it will sound like after traveling through the room.

Adaptive Algorithm: This process analyzes the result of the subtraction operation and adjusts the Adaptive Filter accordingly.

Double-Talk Detector (DTD): This detector only lets the Adaptive Algorithm do its job when the Far-End is speaking and the Near-End is not.

Non-Linear Processor (NLP): This process analyzes the audio for any remaining residual Far End echoes and attenuates the appropriate sections.

Noise Reduction (NR): This process reduces steady background noises in the Near-End.

Comfort Noise (CN): This process adds a soft, artificial low-pass noise signal to prevent the line from sounding like a disconnection.