
This page presents R&D activities on microphone arrays and beamforming. Most of the theoretical aspects in section II are well known and can easily be found in the literature [1] [2]. They are reported here for reference and for physical interpretation purposes. Section III presents many simulations of arrays with different geometries and microphones.

Section IV reports the implementation of a microphone array with low-cost electret microphones, preamplifiers and a USB data acquisition card. Different algorithms are tested on recordings of speech in a noisy environment.

This page is updated regularly. Dr. Stephane D., 26 February 2008.

 

 

I.  Microphone Arrays – Generalities


Microphone Arrays

A microphone array is a set of microphones spatially distributed at known positions, acting as a “spatial sampler”. The microphones’ output signals can be “combined” (beamforming) so that the array achieves better directionality than a single microphone. This property makes microphone arrays attractive in telephony: to improve speech intelligibility by reducing reverberation, to improve transmit quality in a noisy environment, or to improve full-duplex hands-free operation by reducing loudspeaker feedback.

Array performance figures such as directivity, beamwidth and signal-to-ambient-noise ratio depend strongly on the array geometry, on the number and type of microphones (omnidirectional, cardioid…) and on the beamforming technique. The operating frequency range free of spatial aliasing is linked to the inter-microphone distance and geometry (this will be discussed in the simulations section).


Beamforming

Beamforming consists of combining the microphone output signals: they are convolved with optimal weighting filters (gain, delay) and summed to form a “beam” in a direction of specific interest. This beam makes the array behave as a highly directive microphone.

The direction of interest will be called the “look direction”. It can be, for example, the direction of an acoustic source in a noisy and/or reverberant environment.

The array can be used to track a moving acoustic source or to point at the source of highest energy in real time.

Fixed beamforming:
In fixed beamforming the optimal weights are pre-determined and stored for processing the microphone output signals. They are data-independent, based on a model (or a measurement) of the ambient noise and of the source in the environment of the array: office, teleconference room… Fixed beamforming can be used within adaptive algorithms such as the generalized sidelobe canceller (GSC) along with active noise control, or it can be given some adaptive behaviour if coupled to a localization algorithm that steers the beam (pre-computed filters) in the look direction. With some precautions, the latter approach was successfully implemented in teleconferencing and proved very robust in quiet meeting rooms and offices.

Adaptive beamforming:
In this case the optimal weighting that points and forms a beam in the look direction is computed in real time, as the signal is captured and buffered. The algorithms can account for the actual noise environment and are better suited to tracking a moving acoustic source or to suppressing noise arriving suddenly from a specific direction.

Adaptive algorithms such as the GSC will be presented and tested in a separate section, as will localization algorithms.


We assume the following beamformer with an array of N microphones:


 

 

Figure I: Example of beamformer – array with N microphones

d(w) = total pressure at each microphone (transfer function)
d0(w) = pressure at each microphone induced by the source in the “look direction” (transfer function)
n(w) = pressure at each microphone due to the ambient noise (transfer function)
h(w) = system noise for each channel (microphone + preamp + quantization)
w(w) = beamforming weighting filters
 

Notations:

 

{ } : column vector      < > : row vector      w^H : conjugate transpose of w      d′ : transpose of d      w^H d : Hermitian product      k′.r : dot product (real)

 

Note: All quantities are defined in the frequency domain unless otherwise stated (w is the circular frequency in rad/s).
In all sections the dependence of matrices, vectors and scalars upon frequency is omitted for clarity. Hence the vectors d, n, w stand for d(w), n(w), w(w).
We will call d0 the “look direction” instead of the “vector of transfer functions between an acoustic source in the look direction and each microphone”, and any vector d different from d0 a “direction” (direction of a source of interest).

 

Matrices (uppercase) and vectors (lowercase) are in bold font; scalars (lowercase) are in normal font.

 

 

Assuming a point source, i.e. an acoustic monopole, the transfer function vector d0(w) is such that:

d0 = Q < e^(ikR1)/(4πR1)    e^(ikR2)/(4πR2)  ………  e^(ikRN)/(4πRN) >            (BI.1)

where Ri is the Euclidean distance between the source and the i-th microphone position in a coordinate system Oxyz, k the wavenumber k = w/c, and Q the strength of the source (TF → Q = 1 N/m). This source model is referred to as a “near-field source”. The formula corresponds to a solution of the acoustic wave equation in free field.

Assuming a far-field source, d0 can be represented by the effect of a plane wave of amplitude A:

d0 = A < e^(ik·r1)    e^(ik·r2)  ………  e^(ik·rN) >        (TF → A = 1 N/m²)            (BI.2)

 

 

k being the wave vector pointing to the origin of the coordinate system Oxyz (O = origin of the plane-wave phase), and rN the coordinate vector of microphone N in the same Oxyz coordinate system.

Note: A plane wave is an idealization that no physical source can generate in free field, but it is a practical model for the pressure field of an acoustic monopole located far (nearly straight wave-fronts) from the system under study.
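The two source models (BI.1) and (BI.2) translate directly into code. The sketch below (Python/NumPy; the function and variable names are ours, for illustration only) builds the near-field and far-field vectors d0 for a small linear array:

```python
import numpy as np

def near_field_steering(mic_pos, src_pos, w, c=343.0, Q=1.0):
    """Monopole ("near-field") model of eq. BI.1:
    d0_i = Q * exp(i k R_i) / (4 pi R_i), with R_i the source-to-mic distance."""
    k = w / c                                          # wavenumber k = w/c
    R = np.linalg.norm(mic_pos - src_pos, axis=1)
    return Q * np.exp(1j * k * R) / (4.0 * np.pi * R)

def far_field_steering(mic_pos, n, w, c=343.0, A=1.0):
    """Plane-wave ("far-field") model of eq. BI.2: d0_i = A * exp(i k.r_i),
    with k the wave vector along the unit vector n pointing to the origin."""
    kvec = (w / c) * (n / np.linalg.norm(n))
    return A * np.exp(1j * mic_pos @ kvec)

# 4-microphone uniform linear array along x, 5 cm spacing, at 1 kHz
mics = np.column_stack([0.05 * np.arange(4), np.zeros(4), np.zeros(4)])
w = 2.0 * np.pi * 1000.0
d0_far = far_field_steering(mics, np.array([1.0, 0.0, 0.0]), w)
d0_near = near_field_steering(mics, np.array([1.0, 0.0, 0.0]), w)
```

As expected, the far-field components all have unit modulus with linearly varying phase, while the near-field components decay in amplitude with the source-to-microphone distance.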


II.  Beamforming – Theoretical aspects


A few beamforming techniques are presented in this section; the list is far from exhaustive. All of these techniques rely on simple optimization methods and basic linear algebra. Let us consider the beamformer of Figure I. In the approaches presented below we assume that the ambient noise n is uncorrelated with the sound source. This means that the algorithms will not “deal with” reflections of the source signal by walls, for example, although beamforming can help in reducing the perception of reverberation.
Likewise, the system noise hi at each microphone is uncorrelated with the signal and with the system noise at the other microphones.

Under these assumptions the cross-covariances vanish: E(d0 n^H) = 0 and E(d0 h^H) = 0.

For all channels we assume hi to be white noise with the same standard deviation σ (constant with frequency), so that the system-noise covariance matrix is:

Ghh(w) = E(h h^H) = σ²I            (BI.3)

The “ambient” noise covariance matrix is defined as:

Gnn(w) = Gnn = E(n n^H)            (BI.4)

 
System noise will be added through a quadratic constraint condition in the optimization process. 

II.1   Delay and Sum  

In this section the signal received at the microphones, d0 = < e^(ik·r1)    e^(ik·r2)  ………  e^(ik·rN) > assuming a far-field source, is simply delayed and summed.

In the frequency domain the optimal weight vector wopt is simply a multiple of d0:    wopt = a d0 (a a scalar).

For a gain of 1 in the look direction, i.e. wopt^H d0 = 1, the optimal weight vector is:

wopt = (1/N) d0            (BI.5)

Why “delay”?

wopt can be seen as a “phase shifter”. Rewriting k = (w/c) n, where n is a unit vector pointing to O and c is the speed of sound, the dot products n·ri are constant, hence the phase terms Φi = (w/c) n·ri vary linearly with w, which corresponds to a pure delay in the time domain (the delay being the slope of the phase-vs-frequency curve).

The inter-microphone delays can be found by moving the plane-wave phase origin O to the position of one of the microphones of the array. If the reference is the first microphone position r1:

d0 = e^(i(w/c) n·r1) < 1    e^(i(w/c) n·(r2−r1))  ………  e^(i(w/c) n·(rN−r1)) >            (BI.6)

The “inter-microphone” delays ti are given by ti = n·(ri − r1)/c for a plane wave travelling at c m/s.

Note: A pure delay (linear phase) appears only if the array is in free field or lying on an infinite rigid plane.

If the array is several cm above a rigid plane, or mounted around a diffracting shape, there is no reason why the phase should remain linear with frequency. In such cases simply delaying and summing the signals may prove inefficient, and may even induce beam “mis-pointing” in some circumstances!

These aspects will be discussed in the simulations section.
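As a quick free-field illustration of (BI.5), the following sketch (the names are ours, for illustration only) computes the delay-and-sum beam pattern of an 8-microphone linear array at 2 kHz:

```python
import numpy as np

c = 343.0                        # speed of sound (m/s)
f = 2000.0                       # analysis frequency (Hz)
N = 8
mics = 0.05 * np.arange(N)       # linear array on the x axis, 5 cm spacing

def steering(theta):
    """Far-field direction d for a plane wave arriving from angle theta
    (measured from the array axis)."""
    return np.exp(1j * (2.0 * np.pi * f / c) * np.cos(theta) * mics)

d0 = steering(np.pi / 2)         # broadside look direction
w_opt = d0 / N                   # delay-and-sum weights (eq. BI.5)

# beam pattern |w^H d(theta)| over all arrival angles
thetas = np.linspace(0.0, np.pi, 361)
pattern = np.abs(np.array([np.conj(w_opt) @ steering(t) for t in thetas]))
```

By construction the response equals 1 in the look direction and never exceeds it elsewhere; plotting `pattern` against `thetas` shows the mainlobe and sidelobes discussed in the simulations section.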


II.2   Super-directive approaches

 

Super-directive approaches differ from the delay-and-sum technique: the determination of the optimal weighting filters relies on the definition of a noise field. Super-directive approaches produce optimal weights whose amplitude and phase vary with frequency. They achieve strong directivity by modifying the gain and group delay of the individual microphone signals, and may degrade the signal-to-“system noise” ratio.

In fixed beamforming the noise covariance matrix can either be built independently of the environment where the array sits, by considering a model of the noise field (spherically isotropic noise for example), or by measuring the noise characteristics in the environment. The construction of the noise matrix Gnn influences the shape of the beam patterns.

The optimization process consists of minimizing the array output noise power (excluding unwanted parts of the signal, i.e. reflections, as mentioned previously) while maintaining the gain in the look direction.

 
MVDR beamformer

One solution is to impose unity gain in the look direction; this is the minimum-variance distortionless response (MVDR) beamformer. That is,

min over w of  w^H Gnn w    subject to    w^H d0 = 1            (BI.7)

The optimal weight vector wopt is:

wopt = Gnn^-1 d0 / (d0^H Gnn^-1 d0)            (BI.8)

At low frequencies, when the acoustic wavelength is much larger than the array, Gnn is ill-conditioned: since the array samples a small portion of the wavelength, only small variations in phase and amplitude occur and the microphones “see” redundant information. This makes the matrix Gnn almost singular. A small positive number μ added to its diagonal regularises the matrix and makes its inversion stable.

 

wopt = (Gnn + μI)^-1 d0 / (d0^H (Gnn + μI)^-1 d0)            (BI.9)

The introduction of this positive number on the matrix diagonal trades off low-frequency directivity for improved White Noise Gain (see definition below).

 

 

Adding a positive number to the diagonal is equivalent to adding a quadratic constraint to the optimisation problem. Let us assume that wopt is such that the system-noise output power is constrained to a certain value a (> 0):

wopt^H Ghh wopt ≤ a            (BI.10)

For μ = a/σ², equation (BI.9) more closely models a real system with a maximum system-noise level. The sub-optimal weighting allows a better SNR at the expense of array directivity.
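A minimal numerical sketch of the regularised MVDR weights (BI.8)/(BI.9) is given below; the helper name `mvdr_weights` and the toy covariance are ours. It illustrates that the distortionless constraint w^H d0 = 1 holds whatever the loading μ:

```python
import numpy as np

def mvdr_weights(Gnn, d0, mu=0.0):
    """Regularised MVDR weights (eqs. BI.8 / BI.9):
    w = (Gnn + mu*I)^-1 d0 / (d0^H (Gnn + mu*I)^-1 d0)."""
    Gi_d0 = np.linalg.solve(Gnn + mu * np.eye(len(d0)), d0)
    return Gi_d0 / (np.conj(d0) @ Gi_d0)

# Toy example: random Hermitian positive-definite noise covariance
rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Gnn = A @ A.conj().T + 1e-3 * np.eye(N)
d0 = np.exp(1j * 2.0 * np.pi * rng.random(N))   # arbitrary unit-modulus direction

w = mvdr_weights(Gnn, d0, mu=0.01)
# distortionless response: w^H d0 = 1 for any mu >= 0
```

With μ = 0 the weights minimise the output noise power among all weight vectors satisfying the unity-gain constraint; in particular they never do worse than delay-and-sum for the same Gnn.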

 


Note that, in the same way, this kind of regularisation makes the beamformer more “resistant” to microphone mismatch, particularly at low frequencies (phase and amplitude deviations are frequent for low-cost microphones). If we assume a signal with no interference and only system noise, then Ghh = I. This provides an estimate of the discrimination of the array against system noise. The White Noise Gain of the array is defined as:

WNG(w) = |w^H d0|² / (w^H w)            (BI.11)

Thus, if WNG > 1 the array outputs less noise than a single sensor. The delay-and-sum beamformer is the one that maximises the WNG: it is the solution of the MVDR problem (BI.7) when the noise covariance matrix is the identity. With w = d0/N, WNG = N.

The delay-and-sum beamformer improves the system-noise level N times compared to a single microphone.
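The WNG = N property of the delay-and-sum beamformer can be checked numerically (illustrative sketch; the names are ours):

```python
import numpy as np

def white_noise_gain(w, d0):
    """White Noise Gain (eq. BI.11): |w^H d0|^2 / (w^H w)."""
    return np.abs(np.conj(w) @ d0) ** 2 / np.real(np.conj(w) @ w)

N = 8
d0 = np.exp(1j * np.linspace(0.0, 3.0, N))   # any unit-modulus far-field direction
wng = white_noise_gain(d0 / N, d0)           # delay and sum: WNG = N
```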

 

LCMV-LS Beamformer

The Least-Squares Linearly Constrained Minimum Variance beamformer is a generalization of the MVDR algorithm. A set of linear constraints is added to optimisation problem (BI.7). Such constraints can be used to impose:


- a null in a given direction, e.g. that of the loudspeaker in a telephone or conference unit;
- a constant beamwidth over the operating frequency range of the array, since beam patterns depend on frequency;
- symmetrical beams for particular geometries or look directions.

Assuming a set of M (M < N) linear constraints indexed by i = 1, 2, … M:

w^H di = gi            (BI.12)

where the di are directions “of interest”, the constrained optimisation problem becomes:

min over w of  w^H Gnn w    subject to    C^H w = g            (BI.13)

where C is the rectangular N×M matrix whose columns are the constraint directions:

C = [ d1  d2  …  dM ]            (BI.14)

and g is the vector of constraint gains:

g = { g1  g2  …  gM }            (BI.15)

The optimal weight vector wopt under these conditions is given by:

wopt = Gnn^-1 C (C^H Gnn^-1 C)^-1 g            (BI.16)

If C = d0 and g = 1, we are back to the MVDR beamformer.
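A sketch of the LCMV solution (BI.16), here with two constraints: unit gain in the look direction and a forced null towards a hypothetical interferer (the loudspeaker of a conference unit, say). All names are illustrative:

```python
import numpy as np

def lcmv_weights(Gnn, C, g):
    """LCMV weights (eq. BI.16): w = Gnn^-1 C (C^H Gnn^-1 C)^-1 g,
    solving: min w^H Gnn w  subject to  C^H w = g."""
    GiC = np.linalg.solve(Gnn, C)                       # Gnn^-1 C
    return GiC @ np.linalg.solve(C.conj().T @ GiC, g)

# 6-mic linear array, 4 cm spacing, analysed at 1.5 kHz
N = 6
mics = 0.04 * np.arange(N)
k = 2.0 * np.pi * 1500.0 / 343.0                        # wavenumber w/c
d0 = np.exp(1j * k * np.cos(np.pi / 2) * mics)          # look direction (broadside)
d1 = np.exp(1j * k * np.cos(np.pi / 4) * mics)          # interferer at 45 degrees

C = np.column_stack([d0, d1])
g = np.array([1.0, 0.0])                                # gain 1 at d0, null at d1
w = lcmv_weights(np.eye(N), C, g)                       # white ambient-noise model
```

The resulting weights satisfy both constraints exactly: unity response in the look direction and a perfect null towards the interferer, at the cost of two of the array's N degrees of freedom.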

II.3   Array Characteristics  

Independently of the beamforming techniques reported above, the following parameters are needed to characterize the performance of an array.

Signal-to-Noise Ratio

The SNR quantifies the ability of the array to improve or degrade the signal-to-system-noise ratio (here the noise refers to the system noise h), compared to that of a single microphone. SNR = White Noise Gain when E(h h^H) = I:

SNR(w) = |w^H d0|² / (w^H Ghh w)            (BI.17)

Array Gain

The array gain measures the improvement in SNR (here the noise refers to the “ambient” or “interference” noise n, not the system noise!) of the array over a single microphone.
Assuming a source of strength A > 0 in the look direction and a noise field of strength B > 0, the microphone outputs are x(w) = A d(w) for the signal alone and xn(w) = B n(w) for the noise. The SNR at microphone i is given by:

SNRmic(w) = A² |di|² / (B² [Gnn]ii)            (BI.18)

The signal-to-noise ratio of the array output is:

SNRarray(w) = A² |w^H d|² / (B² w^H Gnn w)            (BI.19)

The array gain is then defined by:

G(w) = SNRarray(w) / SNRmic(w) = |w^H d|² [Gnn]ii / (|di|² w^H Gnn w)            (BI.20)
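Equations (BI.18)–(BI.20) can be checked on the simple case of a delay-and-sum beamformer in a spatially white ambient noise field (Gnn = I), where the array gain reduces to N (sketch with illustrative names; A and B cancel in the ratio):

```python
import numpy as np

def array_gain(w, d, Gnn, i=0):
    """Array gain (eq. BI.20): SNR of the array output (eq. BI.19) over the
    SNR at reference microphone i (eq. BI.18)."""
    snr_array = np.abs(np.conj(w) @ d) ** 2 / np.real(np.conj(w) @ Gnn @ w)
    snr_mic = np.abs(d[i]) ** 2 / np.real(Gnn[i, i])
    return snr_array / snr_mic

# Delay-and-sum weights against spatially white ambient noise (Gnn = I):
N = 8
d0 = np.exp(1j * np.linspace(0.0, 4.0, N))   # any unit-modulus far-field direction
gain = array_gain(d0 / N, d0, np.eye(N))     # reduces to N in this case
```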




Array Directivity

By analogy (actually by reciprocity) with the directivity of a radiating acoustic source, the directivity factor Q is given in Beranek [3] as the ratio of the power radiated by an acoustic monopole producing the same on-axis pressure to the power radiated by the acoustic source (here the array).
With 0 ≤ θ ≤ π the elevation angle and 0 ≤ φ ≤ 2π the azimuth angle:

Q(w) = |wopt^H d0|² / [ (1/4π) ∫0..2π ∫0..π |wopt^H d(θ,φ)|² sinθ dθ dφ ]            (BI.21)


Since we assume an average power |wopt^H d0|² = 1 in the look direction for the array, for comparison purposes the monopole average power is 1 in all directions. Hence:

Q(w) = 4π / [ ∫0..2π ∫0..π |wopt^H d(θ,φ)|² sinθ dθ dφ ]            (BI.22)

Since a finite set of directions will be considered in the coming simulations, Q can easily be discretized into:

Q(w) ≈ 4π / [ Σj Σl |wopt^H d(θj,φl)|² sinθj Δθ Δφ ]            (BI.23)

The Directivity Index DI is defined by: DI(w) = 10 log10(Q(w))            (BI.24)

Q can be integrated over the frequency band of interest [f1, f2] to obtain the average directivity factor.
The directivity index characterizes how well the microphone array suppresses noise coming from all directions (0 ≤ φ ≤ 2π, 0 ≤ θ ≤ π).
The directivity index can be illustrated with a perfect cardioid microphone. For purely “graphical” purposes, the gain = 1 (0 dB) is referenced at +30 dB.


Omnidirectional Microphone:  DI= 0 dB

    
Ideal Cardioid Microphone: DI=4.8 dB

Figure II: Directivity Index illustration
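The two directivity indices of Figure II can be recovered by discretizing (BI.23) as in the sketch below (midpoint quadrature rule; the names are ours). The ideal cardioid gain (1 + cosθ)/2 gives Q = 3, i.e. DI ≈ 4.8 dB:

```python
import numpy as np

def directivity_index(gain_fn, n_theta=400, n_phi=400):
    """Discretised directivity factor (eq. BI.23) and index DI (eq. BI.24):
    Q = 4*pi / sum_jl |gain(theta_j, phi_l)|^2 sin(theta_j) dtheta dphi."""
    th = (np.arange(n_theta) + 0.5) * np.pi / n_theta        # midpoint rule
    ph = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    T, P = np.meshgrid(th, ph, indexing="ij")
    dS = (np.pi / n_theta) * (2.0 * np.pi / n_phi)
    Q = 4.0 * np.pi / np.sum(np.abs(gain_fn(T, P)) ** 2 * np.sin(T) * dS)
    return 10.0 * np.log10(Q)                                # eq. BI.24

di_omni = directivity_index(lambda t, p: np.ones_like(t))          # 0 dB
di_card = directivity_index(lambda t, p: (1.0 + np.cos(t)) / 2.0)  # ~4.8 dB
```

For an array, the same function applies with gain_fn(θ,φ) = wopt^H d(θ,φ) evaluated on the chosen grid of directions.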

 


 

 

 

Microphone Array - Beamforming III:  Simulations >

 

 

 

Bibliography:


[1] Michael Brandstein, Darren Ward (Eds.): “Microphone Arrays: Signal Processing Techniques and Applications”, Springer-Verlag, 2001.

[2] S. Unnikrishna Pillai: “Array Signal Processing”, Springer-Verlag, 1988.

[3] Leo L. Beranek: “Acoustics”, The Acoustical Society of America, 1993 edition.