How does AEC work?

Video Transcript

0:08

Welcome back. In order to understand how the AEC component works,

0:11

let’s take a look at what happens to an audio signal that starts from the Far-End and is sent to the Near-End,

0:17

and what AEC does to prevent its echoes from making it back to the Far-End.

0:22

Here’s a diagram of this audio signal’s round trip journey.

0:26

You’d never know it from the outside, but the Acoustic Echo Canceler puts the audio signal

0:30

through a lot of sub-systems, including the Adaptive Filter and Adaptive Algorithm, Double-Talk Detection,

0:36

Non-Linear Processing, Noise Reduction, and Comfort Noise. Let’s start with the Adaptive Filter.

0:43

AEC’s goal is to eliminate any trace of the Far-End talker’s voice from the Near End microphone feed –

0:49

including all direct and indirect paths from the loudspeaker to the microphone.

0:54

In order to delete this noise, the AEC component needs to be able to predict what that noise is going to sound like.

0:59

If we broadcast a sharp, impulsive noise from the loudspeaker like a loud click or a gunshot, we could then

1:07

record the signal that comes into the microphone, and obtain a recording that looks something like this.

1:12

Now, this first peak here represents the direct path from the loudspeaker to the microphone, and all the subsequent spikes

1:18

represent the various reflections around the room – you'll notice that the longer it takes to get to the microphone

1:24

the more attenuated it's become. This image is known as the room impulse response,

1:30

and it's a predictive map of what will happen to any noise that comes out of the loudspeaker.

1:34

This room impulse response is used to create a Finite Impulse Response – or FIR – Filter,

1:41

here in the Adaptive Filter part of the AEC system.

1:44

When a signal comes from the Far-End, it is fed both to the Near End loudspeaker and to the Adaptive Filter.

1:50

The FIR Filter is applied to the incoming signal to create its prediction of what that signal should

1:55

sound like when it is received by the microphone.

1:58

Then this noise is digitally subtracted from the Near End microphone signal – the result should be silence.

2:03

The magic part is that the subtraction operation won’t affect any additional noise in the microphone signal,

2:09

such as the Near End talker’s voice, letting the Far End talker have a crystal clear conversation

2:14

without hearing his own echoes.
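
The predict-and-subtract step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the impulse response values, the sample data, and the names `fir_filter`, `room_ir`, `far_end`, and `near_talker` are all made up for the example, and the filter is assumed to match the room perfectly.

```python
def fir_filter(signal, taps):
    """Apply an FIR filter: each output sample is a weighted sum of
    the current and previous input samples."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out

# Hypothetical room impulse response: a direct path followed by
# weaker, later reflections.
room_ir = [0.0, 0.6, 0.0, 0.3, 0.0, 0.1]

# Made-up sample data for the Far End signal and the Near End talker.
far_end = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0, 0.5, 0.0]
near_talker = [0.0, 0.0, 0.0, 0.2, -0.2, 0.1, 0.0, 0.0]

# What the microphone picks up: the room's echo plus the Near End talker.
echo = fir_filter(far_end, room_ir)
mic = [e + v for e, v in zip(echo, near_talker)]

# The AEC's prediction of the echo (assumed here to be a perfect match).
predicted_echo = fir_filter(far_end, room_ir)

# Subtracting the prediction leaves only the Near End talker.
sent_to_far_end = [m - p for m, p in zip(mic, predicted_echo)]
```

In this idealized case `sent_to_far_end` matches `near_talker` to within floating-point rounding, which is the behavior the transcript describes: the echo is removed and the Near End voice is left untouched.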

2:16

No, I can hear you loud and clear. So how’ve you guys been?

2:19

However, there is a fundamental problem with this model: the room impulse response is constantly changing.

2:26

Whenever a door opens, or someone sits down, or if a butterfly flaps its wings, the surfaces in the Near End room have changed.

2:35

Which means that the acoustic paths from the loudspeaker to the microphone have changed, so the room impulse response has changed.

2:41

Now it’s not really a good idea to constantly broadcast big loud sounds to keep up with these changes.

2:48

Instead the Adaptive Algorithm is used to constantly update the Filter, by monitoring the result of the subtraction operation

2:56

and then adjusting the Filter until the result is as close to zero, or silence, as possible.

3:02

This Adaptive Algorithm is always at work, trying to keep the filter converged with the dynamic room impulse response.
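
The transcript does not say which adaptive algorithm is used, so the sketch below substitutes a common textbook choice, normalized least mean squares (NLMS), purely as an illustration. The function name `nlms_identify`, the step size `mu`, and the synthetic room are assumptions made for the example.

```python
import random

def echo_of(far_end, room_ir):
    """Simulate the echo the microphone picks up from the loudspeaker."""
    return [
        sum(room_ir[k] * far_end[n - k] for k in range(len(room_ir)) if n - k >= 0)
        for n in range(len(far_end))
    ]

def nlms_identify(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Adjust the FIR taps sample by sample so the subtraction result
    is driven toward zero (NLMS update, assumed for illustration)."""
    taps = [0.0] * num_taps
    history = [0.0] * num_taps  # recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        predicted = sum(w * h for w, h in zip(taps, history))
        error = d - predicted  # result of the subtraction operation
        power = sum(h * h for h in history) + eps
        step = mu * error / power
        taps = [w + step * h for w, h in zip(taps, history)]
        errors.append(error)
    return taps, errors

random.seed(0)
room_ir = [0.0, 0.6, 0.0, 0.3, 0.0, 0.1]  # hypothetical room
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = echo_of(far_end, room_ir)  # Far End talking, Near End silent

taps, errors = nlms_identify(far_end, mic, num_taps=len(room_ir))
```

With only the Far End active, `taps` ends up very close to `room_ir` and the last entries of `errors` are close to zero. No test click was needed, because the ordinary Far End audio drives the adaptation.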

3:10

However, it can only do its job when the Far End is talking and the Near End is silent.

3:15

This is the only time when the microphone signal, after the subtraction operation, would equal zero.

3:20

If the Far End is silent then there’s nothing to measure, and if the Near End is talking then there’s extra audio

3:26

in the microphone so the result won’t be zero.

3:29

This is a job for the Double-Talk-Detector, or DTD. The DTD listens to both the Far End and Near End microphones

3:36

and determines if someone is speaking.

3:38

If the Far End is speaking and the Near End is not, then it allows the Adaptive Algorithm to do its job of converging

3:44

the Adaptive Filter to the room impulse response. In any other situation, the DTD will prevent the Adaptive Algorithm from working.
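
The gating rule above can be sketched as a single decision function. The transcript does not describe how the DTD detects speech, so this example assumes a simple peak comparison in the style of the Geigel detector. The thresholds and the name `adaptation_allowed` are hypothetical.

```python
def adaptation_allowed(far_block, mic_block,
                       far_active_level=0.01, geigel_threshold=0.5):
    """Return True only when the Far End is talking and the Near End is not.

    Assumes the echo reaches the microphone at least 6 dB quieter than the
    loudspeaker feed, so a microphone peak above half the far-end peak is
    treated as Near End speech.
    """
    far_peak = max(abs(s) for s in far_block)
    mic_peak = max(abs(s) for s in mic_block)
    far_end_talking = far_peak > far_active_level
    near_end_talking = mic_peak > geigel_threshold * far_peak
    return far_end_talking and not near_end_talking

far_talking = [0.5, -0.4, 0.3]
far_silent = [0.0, 0.0, 0.0]
echo_only_mic = [0.1, -0.08, 0.06]   # quiet echo of the Far End
double_talk_mic = [0.4, -0.3, 0.2]   # echo plus a Near End talker

far_only = adaptation_allowed(far_talking, echo_only_mic)
double_talk = adaptation_allowed(far_talking, double_talk_mic)
near_only = adaptation_allowed(far_silent, double_talk_mic)
both_silent = adaptation_allowed(far_silent, far_silent)
```

Only `far_only` is `True`. In the other three cases the adaptive algorithm is held off, which matches the behavior described above.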

3:53

Once all of these filters and algorithms have been applied to the signal, it still has several processes to go through before it

4:00

makes it back to the Far End talker. First it goes through a Non-Linear Processor, or NLP.

4:06

Because of the difficulty in completely converging the FIR filter with the room impulse response, there is bound

4:12

to be some residual echo left in the microphone signal at this point.

4:16

The Non-Linear Processor constantly analyzes the audio at every instant, to determine if it is composed primarily of the

4:23

near-end speech, or of residual far-end echoes. It pinpoints the areas that are made up of only echoes and attenuates those sections.

4:33

The remaining echoes will be effectively inaudible over the desired near-end speech.
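
As a rough illustration of the idea, the sketch below attenuates a block of audio only when it looks like residual echo. The decision rule, the thresholds, and the name `nlp_block` are assumptions for the example and not a description of how the Q-SYS Non-Linear Processor is actually built.

```python
def rms(block):
    """Root-mean-square level of a block of samples."""
    return (sum(s * s for s in block) / len(block)) ** 0.5

def nlp_block(residual, mic, far_end,
              echo_only_ratio=0.25, far_active_level=0.01, suppress_gain=0.1):
    """Attenuate the post-subtraction block if it appears to be echo only.

    Heuristic (assumed): if the Far End is active and the subtraction removed
    most of the microphone energy, whatever is left is treated as residual echo.
    """
    far_active = rms(far_end) > far_active_level
    mostly_echo = rms(residual) < echo_only_ratio * rms(mic)
    if far_active and mostly_echo:
        return [suppress_gain * s for s in residual]
    return list(residual)

far = [0.5, -0.5, 0.5, -0.5]

# Echo-only block: the subtraction removed almost everything.
echo_only = nlp_block([0.01, -0.01, 0.01, -0.01], [0.2, -0.2, 0.2, -0.2], far)

# Near End speech: most of the microphone energy survives the subtraction.
speech = nlp_block([0.3, -0.3, 0.3, -0.3], [0.35, -0.35, 0.35, -0.35], far)
```

The echo-only block is scaled down by `suppress_gain`, while the block containing Near End speech passes through unchanged.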

4:39

Next in the processing path is Noise Reduction, or NR. Noise Reduction attempts to remove ambient room noise

4:46

by listening for steady sustained noise in the signal and reducing it.

4:51

This is so the Far End talker hears your voice, and not your air conditioning hum,

4:57

or the lawn mowers outside the window, or the invading alien army.
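
One simple way to picture "listening for steady sustained noise" is a noise-floor tracker that turns the signal down when its level sits at the floor. The sketch below is a broadband simplification made up for illustration. Practical noise reducers usually work per frequency band, and the name `reduce_steady_noise` and its parameters are hypothetical.

```python
def reduce_steady_noise(blocks, rise=1.05, min_gain=0.1):
    """Scale each block down according to how close it is to the noise floor.

    The floor drops immediately to any quieter level and creeps upward slowly,
    so it follows steady noise but not short bursts of speech.
    """
    floor = None
    out = []
    for block in blocks:
        level = (sum(s * s for s in block) / len(block)) ** 0.5
        if floor is None or level < floor:
            floor = level
        else:
            floor = min(level, floor * rise)
        gain = min_gain if level == 0.0 else max(min_gain, 1.0 - floor / level)
        out.append([gain * s for s in block])
    return out

hum = [0.02, -0.02, 0.02, -0.02]      # steady background noise
talker = [0.5, -0.5, 0.5, -0.5]       # much louder Near End speech
cleaned = reduce_steady_noise([hum, hum, hum, talker])
```

The hum blocks come out at the minimum gain, while the speech block keeps most of its level because it sits far above the tracked floor.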

5:07

You can adjust the amount of Noise Reduction in your AEC’s control panel, and you can also enable or disable it with this button.

5:14

Finally, the Comfort Noise block is a special feature of the Q-SYS AEC system. After going through Non-Linear Processing

5:20

and Noise-Reduction, the Far End should hear the Near End talker loud and clear with everything else being quiet.

5:27

Too quiet. If the Near End talker stops speaking, the line might go silent and give the impression that the

5:34

telephone line has been disconnected.

5:36

Basically it’s a byproduct of the AEC doing its job too well. It actually sounds very strange...

5:41

When there’s complete silence...In between voices, right?

5:47

So the Comfort Noise can be added, which is an artificial low-pass noise signal that makes it sound like there’s

5:51

still a connection when nobody is talking.
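
A low-pass noise signal of this kind can be approximated by smoothing white noise with a one-pole filter. The level, the smoothing factor, and the name `comfort_noise` below are assumptions for the example. The transcript does not specify how Q-SYS generates its Comfort Noise.

```python
import random

def comfort_noise(num_samples, level=0.001, smoothing=0.9, seed=0):
    """Generate quiet low-pass noise by smoothing white noise."""
    rng = random.Random(seed)
    state = 0.0
    out = []
    for _ in range(num_samples):
        white = rng.uniform(-1.0, 1.0)
        # One-pole low-pass: mostly the previous output, plus a little new noise.
        state = smoothing * state + (1.0 - smoothing) * white
        out.append(level * state)
    return out

# A completely silent line after Non-Linear Processing and Noise Reduction.
silent_line = [0.0] * 480
noise = comfort_noise(len(silent_line))

# Mixing the noise in keeps the line from sounding disconnected.
line_out = [s + n for s, n in zip(silent_line, noise)]
```

The `level` argument plays the role of the Comfort Noise level control: the generated samples never exceed it in magnitude.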

5:55

You can adjust the level of the Comfort Noise added in the control panel as well. The only other features in the

6:00

control panel are a master bypass to turn off your AEC, and the Echo Return Loss Enhancement meter, which

6:06

shows you how much, in decibels, the Far-End’s echoes have been attenuated in the return signal.

6:12

The nominal level for this meter will vary depending on the distance between your loudspeakers and

6:17

your microphones, but it should still give you a good idea of how effectively your AEC component is operating.
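
Echo Return Loss Enhancement is conventionally defined as the ratio, in decibels, of the echo power at the microphone to the echo power left after cancellation. The sketch below computes that standard definition. How the Q-SYS meter itself is calculated is not stated in the transcript, and the name `erle_db` is made up for the example.

```python
import math

def erle_db(mic, residual):
    """Echo Return Loss Enhancement in dB: microphone echo power
    divided by the power remaining after echo cancellation."""
    mic_power = sum(s * s for s in mic) / len(mic)
    residual_power = sum(s * s for s in residual) / len(residual)
    if residual_power == 0.0:
        return math.inf  # nothing left of the echo
    return 10.0 * math.log10(mic_power / residual_power)

# Echo at amplitude 0.1 before cancellation and 0.001 afterwards:
# a 100x drop in amplitude is a 10,000x drop in power, or about 40 dB.
mic_echo = [0.1, -0.1, 0.1, -0.1]
after_aec = [0.001, -0.001, 0.001, -0.001]
reading = erle_db(mic_echo, after_aec)
```

A higher reading means more of the Far End's echo has been removed from the return signal.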

6:22

So that’s what happens inside the magic box – which fortunately you’ll never need to worry about.

6:28

All you have to do is make sure it’s connected properly and then forget about it. Unlike a lot of products out there,

6:34

the Q-SYS echo cancellation is included as part of the Designer software – there is no additional hardware to set up

6:40

and no additional fees. It’s simply part of the Q-SYS system. Now, in the next section we’ll look at how to set it up in conjunction

6:48

with the Softphone component to create a teleconferencing system, so feel free to move on whenever you’re ready.