Beamforming
This page presents R&D activities on microphone arrays and beamforming. Most of the theoretical aspects in section II are well known and can easily be found in the literature [1] [2]. They are reported here for information and for physical interpretation. Section III presents simulations of arrays with different geometries and microphones. Section IV reports the implementation of a microphone array with low cost electret microphones, preamplifiers and a USB data acquisition card. Different algorithms are tested on recordings of speech in a noisy environment. This page is updated regularly.

Dr. Stephane D. 26 February 2008.

I. Microphone arrays - Generalities

Microphone Arrays

A microphone array is a set of microphones spatially distributed at known positions, acting as a “spatial sampler”. The output signals of the array microphones can be “combined” (beamforming) so that the array achieves better directionality than a single microphone. This property makes microphone arrays interesting in telephony: they can improve speech intelligibility by reducing reverberation, improve the transmit quality in a noisy environment, or improve full duplex hands-free operation by reducing the loudspeaker feedback. The performance of an array (directivity, beamwidth, signal to ambient noise ratio) depends strongly on its geometry, on the number and type of microphones (omnidirectional, cardioid…) and on the beamforming technique. The operating frequency range without spatial aliasing is linked to the inter-microphone distance and to the geometry (this will be discussed in the simulations section).
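As a rough illustration of the link between inter-microphone distance and spatial aliasing, the sketch below applies the commonly used half-wavelength rule for a uniformly spaced line array (spacing no larger than half the shortest wavelength). The rule, the function name and the speed of sound value are assumptions introduced here for illustration; other geometries behave differently.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def max_unaliased_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Highest frequency (Hz) for which spacing_m <= wavelength / 2."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)


# Hypothetical spacings for a uniform line array.
for d in (0.02, 0.05, 0.10):
    f_max = max_unaliased_frequency(d)
    print(f"spacing {d * 100:.0f} cm: no spatial aliasing up to about {f_max:.0f} Hz")
```

Halving the spacing doubles the usable bandwidth, at the cost of a smaller aperture (and hence lower directivity at low frequencies) for a given number of microphones.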
Beamforming consists of combining the microphone output signals. They are convolved with optimal weighting filters (gain, delay) and summed to form a “beam” in a direction of specific interest. This beam makes the array a highly directive microphone. The direction of interest will be called the “look direction”. It can be, for example, the direction of an acoustic source in a noisy and/or reverberant environment. The array can be used to track a moving acoustic source or to point to the source of highest energy in real time.

Adaptive beamforming

In this case the optimal weighting that points and forms a beam in the look direction is computed in real time, as the signal is captured and stored in a buffer. Adaptive algorithms can account for the real noise environment and are better suited to tracking a moving acoustic source or to eliminating noise that suddenly arrives from a specific direction.
Figure I: Example of beamformer – Array with N microphones

d(w)=

Notations:
{ }: column vector
< >: row vector
wH: w conjugate transpose
d’: d transpose
wHd: Hermitian product
k’.r: dot product (real)

In all sections, the dependence of matrices, vectors and scalars upon the frequency is omitted for clarity. Hence, vectors d, n, w stand for d(w), n(w), w(w). MATRICES (uppercase) and vectors (lower case) are in bold fonts, scalars (lower case) are in normal fonts.

Assuming a point source, i.e. an acoustic monopole, the transfer function d0(w) is such that:
(BI.1)
where Ri is the Euclidean distance between the source and the ith microphone position in a coordinate system Oxyz, k the wavenumber k=w/c, and Q the strength of the source (TF --> Q=1 N/m). This source model is referred to as a “near field source”. The formula corresponds to a solution of the acoustic wave equation in free field.

Assuming a far-field source, d0 can be represented by the effect of a plane wave of amplitude A:

d0’ = A < e^(ik.r1) e^(ik.r2) … e^(ik.rN) > (TF --> A=1 N/m^2) (BI.2)

k being the wave vector pointing to the origin of the coordinate system Oxyz (O = origin of the plane wave phase), and rN the vector of microphone N coordinates in the same coordinate system.

Note: An ideal plane wave cannot be radiated by a source of finite size in free field, but it is a practical model for describing the pressure field of an acoustic monopole that is far enough from the system under study for its wave-fronts to be nearly straight.

II. Beam Forming – Theoretical aspects

A few beamforming techniques are presented in this section. The list is far from exhaustive. All of these techniques are based on simple optimization methods and basic linear algebra. Let’s consider the beamformer in figure I. In the approaches presented below we will assume that the ambient noise n is not correlated with the sound source. This means that the algorithm will not “deal” with the reflections of the sound source signal by walls, for example, although it can help in reducing the perception of reverberation. In the same way, the system noise hi at each microphone is assumed uncorrelated with the signal and with the system noise at the other microphones. For all channels, we will assume hi to be a white noise with the same standard deviation s (constant with frequency), so that the system noise coherence matrix is defined by:

Ghh(w) = E(|h|^2) = s^2 I (BI.3)
Gnn(w) = Gnn = E(|n|^2) (BI.4)

II.1 Delay and Sum

In
the frequency domain, the optimal weight vector wopt is simply a multiple of d0: wopt = a d0 (a scalar). For a gain of 1 in the look direction, i.e. wHopt d0 = 1, the optimal weight vector is:

wopt = (1/N) d0

Why “delay”? wopt can be seen as a “phase shifter”. Rewriting k = (w/c) n, where n is a unit vector pointing to O and c is the speed of sound, the dot products n.ri are constant. Hence the phase terms Qi = (w/c) n.ri vary linearly with w, which corresponds to a pure delay in the time domain (the delay being the slope of the phase vs. frequency curve).

Inter-microphone delays can be found by moving the plane wave phase origin O to the position of one of the microphones in the array. If the reference is the first microphone position r1:

d0’ = <1 e^(i(w/c)n.(r2-r1)) … e^(i(w/c)n.(rN-r1))> (BI.6)

i.e. the vector of (BI.2), with A=1, divided by e^(i(w/c)n.r1).

“Inter-microphone” delays ti are given by: ti = n.(ri - r1)/c for a plane wave travelling at c m/s.

Note: A pure delay (linear phase) will appear only if the array is in free field or lying on an infinite rigid plane. If the array is several cm above a rigid plane, or around a diffracting shape, there is no reason for the phase to remain linear vs. frequency. In this case, only delaying and summing the signals may prove inefficient and may even induce beam “mis-pointing” in some circumstances! These aspects will be discussed in the simulations section.
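The delay and sum relations above can be checked numerically with a short sketch. It assumes a far-field plane wave with A = 1, a free-field array and a 2-D geometry; the function names, the 4-microphone line array, the look direction and the speed of sound are hypothetical choices made for illustration only.

```python
import cmath
import math

C_SOUND = 343.0  # speed of sound in m/s (assumed value)


def propagation_direction(azimuth_deg):
    """Unit vector n pointing from a far source at azimuth_deg towards O (2-D)."""
    a = math.radians(azimuth_deg)
    return (-math.cos(a), -math.sin(a))


def steering_vector(mics, n, freq_hz, c=C_SOUND):
    """Far-field d0 for A = 1: d0_i = exp(i (w/c) n.r_i)."""
    w = 2.0 * math.pi * freq_hz
    return [cmath.exp(1j * (w / c) * (n[0] * x + n[1] * y)) for x, y in mics]


def delay_and_sum_weights(d0):
    """wopt = (1/N) d0, which gives wH d0 = 1."""
    return [d / len(d0) for d in d0]


def response(weights, d):
    """Array response wH d to a plane wave with transfer vector d."""
    return sum(wi.conjugate() * di for wi, di in zip(weights, d))


def inter_mic_delays(mics, n, c=C_SOUND):
    """ti = n.(ri - r1)/c in seconds, relative to the first microphone."""
    x1, y1 = mics[0]
    return [(n[0] * (x - x1) + n[1] * (y - y1)) / c for x, y in mics]


# Hypothetical 4-microphone line array along x with 5 cm spacing.
mics = [(0.05 * i, 0.0) for i in range(4)]
n_look = propagation_direction(60.0)
d0 = steering_vector(mics, n_look, 1000.0)
w_ds = delay_and_sum_weights(d0)
delays = inter_mic_delays(mics, n_look)
d_other = steering_vector(mics, propagation_direction(150.0), 1000.0)

print("gain in look direction:", abs(response(w_ds, d0)))
print("gain at 150 degrees:", abs(response(w_ds, d_other)))
print("delays (microseconds):", [round(t * 1e6, 1) for t in delays])
```

The delays come out negative here because the microphones at larger x are closer to the source than the reference microphone, so the wave reaches them earlier.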
Super-directive approaches differ from the delay and sum technique. The determination of the optimal weighting filters relies on the definition of a noise field. Super-directive approaches produce optimal weights whose amplitude and phase vary with frequency. They achieve strong directivity by modifying the gain and group delay of the individual microphone signals, and may degrade the signal to “system noise” ratio. In fixed beamforming the noise coherence matrix can be built either independently of the environment where the array sits, by considering a model of the noise field (spherical isotropic noise for example), or by measuring the noise characteristics in the environment. The construction of the noise matrix Gnn influences the shape of the beam patterns. The optimization process consists of minimizing the array output noise power (excluding the unwanted part of the signals, i.e. reflections, as mentioned previously) while maintaining the gain in the look direction.

MVDR beamformer

One solution is to constrain the gain in the look direction to unity; this is called the minimum-variance distortionless response (MVDR) beamformer. That is,

minimise wH Gnn w subject to wH d0 = 1 (BI.7)

The optimum weight vector wopt is:

wopt = Gnn^-1 d0 / (d0H Gnn^-1 d0)

At low frequencies, when the acoustic wavelength is much larger than the array, Gnn is ill-conditioned. Since the array samples only a small part of the wavelength, only small variations in phase and amplitude occur and the microphones “see” redundant information. This makes the matrix Gnn almost singular. A small positive number m added to the diagonal of Gnn regularises the matrix and makes its inversion stable:

wopt = (Gnn + m I)^-1 d0 / (d0H (Gnn + m I)^-1 d0)
The introduction of this positive number on the matrix diagonal trades off low frequency directivity for improved White Noise Gain (see definition below). Adding a positive number to the diagonal is similar to adding a quadratic constraint to the optimisation problem. Let’s assume that wopt is such that the system noise power is constrained to a certain value a (>0):

wH Ghh w ≤ a (BI.10)

For m=a/s2, the regularised solution more closely models a real system with maximum system noise. This suboptimal weighting gives a better SNR at the expense of array directivity. Note that, in the same way, this kind of regularisation makes the beamformer more “resistant” to microphone mismatch, particularly at low frequencies (phase and amplitude deviations are frequent for low cost microphones).
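The effect of the regularisation number m can be illustrated with a small numerical sketch. It assumes the standard closed-form MVDR solution with diagonal loading, wopt = (Gnn + m I)^-1 d0 / (d0H (Gnn + m I)^-1 d0), a spherical isotropic noise model for Gnn, and the usual definition WNG = |wH d0|^2 / (wH w). The array geometry, frequency, value of m and all function names are hypothetical, and the small linear solver is only there to keep the example free of external libraries.

```python
import cmath
import math

C_SOUND = 343.0  # speed of sound in m/s (assumed value)


def solve(a, b):
    """Solve a x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def isotropic_coherence(xs, freq_hz, c=C_SOUND):
    """Spherical isotropic noise model: Gnn_ij = sin(k d_ij) / (k d_ij)."""
    k = 2.0 * math.pi * freq_hz / c
    sinc = lambda u: 1.0 if u == 0.0 else math.sin(u) / u
    return [[sinc(k * abs(xi - xj)) for xj in xs] for xi in xs]


def mvdr_weights(gnn, d0, mu=0.0):
    """(Gnn + mu I)^-1 d0, normalised for unit gain in the look direction."""
    n = len(d0)
    loaded = [[gnn[i][j] + (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    x = solve(loaded, d0)
    denom = sum(di.conjugate() * xi for di, xi in zip(d0, x))
    return [xi / denom for xi in x]


def look_gain(w, d0):
    return sum(wi.conjugate() * di for wi, di in zip(w, d0))


def white_noise_gain(w, d0):
    return abs(look_gain(w, d0)) ** 2 / sum(abs(wi) ** 2 for wi in w)


# Hypothetical end-fire line array: 4 microphones, 5 cm spacing, 500 Hz.
xs = [0.05 * i for i in range(4)]
freq = 500.0
k = 2.0 * math.pi * freq / C_SOUND
d0 = [cmath.exp(-1j * k * x) for x in xs]  # source on the +x axis, n = (-1, 0, 0)
gnn = isotropic_coherence(xs, freq)

w_plain = mvdr_weights(gnn, d0, mu=0.0)
w_loaded = mvdr_weights(gnn, d0, mu=0.01)
print("WNG without loading:", white_noise_gain(w_plain, d0))
print("WNG with mu = 0.01 :", white_noise_gain(w_loaded, d0))
```

Increasing m moves the weights towards the delay and sum solution: the White Noise Gain rises towards N while the directivity obtained from the noise model decreases.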
The White Noise Gain (WNG) measures the improvement in signal to system noise ratio brought by the array compared to a single sensor: WNG = |wH d0|^2 / (wH w). Thus, if WNG >1 the array gives less noise than a single sensor. A delay and sum beamformer is one where the WNG is optimised by making Gvv=I (this is effectively the same as maximising the gain (G) by making wHw=I). With w=d0/N, WNG = N. The delay and sum beamformer improves the signal to system noise ratio by a factor of N compared to a single microphone.

LCMV-LS Beamformer

The Least-Squares - Linearly Constrained Minimum Variance beamformer is a generalization of the MVDR algorithm. A set of linear constraints is added to the optimisation problem (BI.7). Such constraints can be used to impose:
-a null in a given direction (that of the loudspeaker in a telephone or conference unit, for example)
-constant beamwidth over the microphone array operating frequency range, as beampatterns depend on frequency
-symmetrical beams for particular geometries or look directions.
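Assuming a set of M linear constraints, indexed by i (i={1,2,…M}, M<N), the problem becomes:

minimise wH Gnn w (BI.12)
subject to:
CH w = g (BI.13)
where C is a rectangular (N x M) matrix whose columns are the constraint vectors (BI.14), and g is the vector of the M constrained responses.

The optimal weight vector wopt under these conditions is given by:

wopt = Gnn^-1 C (CH Gnn^-1 C)^-1 g (BI.16)

If C=d0 and g=1 we are back to the MVDR.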
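As an illustration of the first kind of constraint (a null in a given direction), the sketch below uses the standard closed-form LCMV solution wopt = Gnn^-1 C (CH Gnn^-1 C)^-1 g with two constraints: unit gain in the look direction and zero gain towards an interferer. The constraint convention CH w = g, the diagonally loaded isotropic noise model, the geometry, the directions and all function names are assumptions chosen for the example.

```python
import cmath
import math

C_SOUND = 343.0  # speed of sound in m/s (assumed value)


def solve(a, b):
    """Solve a x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def steering_vector(xs, azimuth_deg, freq_hz, c=C_SOUND):
    """Far-field d0 (A = 1) for a line array on the x axis, source at azimuth_deg."""
    k = 2.0 * math.pi * freq_hz / c
    nx = -math.cos(math.radians(azimuth_deg))  # x component of n, pointing to O
    return [cmath.exp(1j * k * nx * x) for x in xs]


def loaded_isotropic_gnn(xs, freq_hz, mu, c=C_SOUND):
    """Isotropic coherence sin(k d)/(k d) plus mu on the diagonal."""
    k = 2.0 * math.pi * freq_hz / c
    sinc = lambda u: 1.0 if u == 0.0 else math.sin(u) / u
    return [[sinc(k * abs(xi - xj)) + (mu if i == j else 0.0)
             for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]


def lcmv_weights(gnn, constraints, g):
    """wopt = Gnn^-1 C (CH Gnn^-1 C)^-1 g; constraints holds the columns of C."""
    y = [solve(gnn, c) for c in constraints]  # columns of Gnn^-1 C
    m = len(constraints)
    small = [[sum(ck.conjugate() * yk for ck, yk in zip(constraints[i], y[j]))
              for j in range(m)] for i in range(m)]  # CH Gnn^-1 C (M x M)
    lam = solve(small, g)
    return [sum(y[j][i] * lam[j] for j in range(m)) for i in range(len(constraints[0]))]


def response(w, d):
    return sum(wi.conjugate() * di for wi, di in zip(w, d))


# Hypothetical case: 4 microphones, 5 cm spacing, 1 kHz, look at 90 degrees, null at 30 degrees.
xs = [0.05 * i for i in range(4)]
d_look = steering_vector(xs, 90.0, 1000.0)
d_null = steering_vector(xs, 30.0, 1000.0)
gnn = loaded_isotropic_gnn(xs, 1000.0, mu=0.01)
w_lcmv = lcmv_weights(gnn, [d_look, d_null], [1.0, 0.0])

print("gain in look direction:", abs(response(w_lcmv, d_look)))
print("gain in null direction:", abs(response(w_lcmv, d_null)))
```

Each added constraint uses up one degree of freedom, which is why the number of constraints M has to stay below the number of microphones N.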
II.3 Array Characteristics

Independently of the beamforming technique reported above, the following parameters are required for characterizing the performance of the array.

Signal to Noise Ratio

Ability of the array to improve or degrade the SNR (here the noise refers to the system noise h). It can be compared to the SNR of a single microphone.
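A minimal sketch of this comparison, assuming the system noise model of (BI.3) (same variance s^2 on every channel, uncorrelated between channels): the system noise power at the array output is then s^2 wH w, so for a given gain in the look direction the SNR change relative to a single microphone only depends on the weights. The function names and the 8-microphone example are hypothetical.

```python
import math


def system_noise_gain(weights):
    """wH Ghh w / s^2, which reduces to wH w when Ghh = s^2 I."""
    return sum(abs(wi) ** 2 for wi in weights)


def snr_improvement_db(weights, look_gain=1.0):
    """SNR of the array output relative to one microphone, in dB.

    look_gain is the array gain wH d0 in the look direction (1 for a
    distortionless beamformer).
    """
    return 10.0 * math.log10(abs(look_gain) ** 2 / system_noise_gain(weights))


# Delay and sum weights for a broadside look direction (d0 is a vector of ones).
n_mics = 8
w_ds = [1.0 / n_mics] * n_mics
print(f"SNR improvement with {n_mics} microphones: {snr_improvement_db(w_ds):.2f} dB")
```

A negative value would indicate that the weighting degrades the SNR with respect to a single microphone, which can happen with strongly super-directive weights.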