DSP SYSTEM DESIGN
DSP System Design Complexity Reduced IIR Filter Implementation for Practical Applications by
Artur Krukowski University of Westminster and
University of Westminster
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 0-306-48708-X
Print ISBN: 1-4020-7558-8
©2004 Springer Science + Business Media, Inc. Print ©2003 Kluwer Academic Publishers Dordrecht All rights reserved No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher Created in the United States of America
Visit Springer's eBookstore at: http://www.ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Contents
Contributing Authors
Preface
Symbols and Abbreviations
Polyphase IIR Filters
Frequency Transformations
Filter Implementation
VHDL Filter Implementation
Appendix
References
Index
Contributing Authors
Artur Krukowski has been with the University of Westminster since 1993, first as an Academic Researcher, then from 1999 as a Post Doctoral Researcher in Advanced DSP Systems in the Applied DSP and VLSI Research Group, and since 2001 as a permanent member of the research staff. His areas of interest include Multi-rate Digital Signal Processing for Telecommunication Systems, digital filter design and efficient low-level implementation, integrated circuit design, Digital Audio Broadcasting, Teleconferencing and Internet Technologies for Teaching.

The second author has been with the University of Westminster (formerly the Polytechnic of Central London) since 1984. He is currently Professor of Applied DSP and VLSI Systems, leading the Applied DSP and VLSI Research Group at the University of Westminster. His research and teaching activities include digital and analog signal processing, silicon circuit and system design, digital filter design and implementation, and A/D and D/A sigma-delta converters. He is currently working on efficiently implementable, low-power DSP algorithms/architectures and Sigma-Delta modulator structures for use in the communications and biomedical industries.
Preface
This work presents an investigation of a special type of IIR polyphase filter structure combined with frequency transformation techniques used for fast, multi-rate filtering, and their application to custom fixed-point implementation. Despite a lot of work being done on these subjects, there are still many unanswered questions. While detailed coverage of all these questions in a single text is impossible, an honest effort has been made in this research monograph to address the exact analysis of polyphase IIR structures and the issues associated with their efficient implementation.

A detailed theoretical analysis of the polyphase IIR structure is presented for two and three coefficients in the two-path arrangement. This is then generalized for arbitrary filter order and any number of paths. The use of polyphase IIR structures in decimation and interpolation is presented, and performance is assessed in terms of the number of calculations required for a given filter specification and the simplicity of implementation. Specimen decimation filter designs for use in Sigma-Delta lowpass and bandpass A/D converters are presented which seem to outperform traditional approaches.

A new exact multi-point frequency transformation approach for arbitrary frequency choice is suggested and evaluated. This frequency transformation is applied to the example of a multi-band filter based on the polyphase IIR structure. Such filters substantially improve upon the standard techniques in terms of band-to-band oscillations, overall filter order, passband ripples and calculation burden for a given filter specification.

A new "bit-flipping" algorithm has been developed to aid filter design where the coefficient wordlength is constrained. Also, the standard Downhill Simplex Method (floating-point) was modified to operate with a constrained coefficient wordlength. The performance of both these advances is evaluated on a number of examples of polyphase filters.

Novel decimation and interpolation structures are proposed which can be implemented very efficiently. These allow an arbitrary-order IIR anti-aliasing filter to operate at the lower rate of the decimator/interpolator. Similar structures for polyphase IIR decimators/interpolators are discussed too.

A new approach to digital filter design and implementation is suggested which speeds up silicon implementation of designs developed in Matlab. A Matlab program has been developed which takes the Simulink block description and converts it into a VHDL description. This in turn can be compiled, simulated, synthesized and fabricated without the need to go through the design process twice, first the algorithmic/structural design and then the implementation. The approach was tested on an example 14-bit two-path two-coefficient polyphase filter. The structural Simulink design was converted into VHDL and compared bit-to-bit.

This research monograph resulted from a doctoral study completed by the first author at the University of Westminster, London, UK, while working under the supervision of the second author.

Artur Krukowski
Symbols and Abbreviations
Delta sequence, equal to 1 for k=0 and zero otherwise
n(k): Noise
u(k): Step function
e(k): Error signal
Update weight coefficient
Unit delay operator
H(z): Transfer function of the discrete-time filter
A(z): Transfer function of an allpass filter
Z transfer function of the bandpass filter
Z transfer function of the notch filter
Normalized frequency
FFT: Fast Fourier Transform
FST: Fourier Summation Transform
DFT: Discrete Fourier Transform
dB: Decibel
SNR: Signal-to-Noise Ratio
rms: Root Mean-Square
exp: Exponential function e^x
FIR: Finite Impulse Response
IIR: Infinite Impulse Response
LPF: LowPass Filter
HPF: HighPass Filter
BPF: BandPass Filter
LMS: Least-Mean-Square
PZP: Pole-Zero Pattern
VLSI: Very Large Scale Integration
VHSIC: Very High Speed Integrated Circuit
VHDL: VHSIC Hardware Description Language
HTML: Hyper Text Mark-up Language
M: Lowpass oversampling ratio
R: Bandpass oversampling ratio
L: Lowpass interpolation ratio
MSB: Most Significant Bit
LSB: Least Significant Bit
A/D: Analogue-to-Digital
ADC: Analogue-to-Digital Converter
DAC: Digital-to-Analogue Converter
SRD: Sample Rate Decreaser
SRI: Sample Rate Increaser
MIP: Minimum Phase filter
MAP: Maximum Phase filter
LPH: Linear Phase
MOS: Metal Oxide Semiconductor technology
COMB: Filter described by Nth-order difference equation
MAVR: Moving AVerage Filter
SLINK: Sink function defined for discrete signals
PCM: Pulse Code Modulation
PDM: Pulse Density Modulation
AM: Amplitude Modulation
FM: Frequency Modulation
AGDP: Arbitrary Group Delay Filter
Sampling frequency
LLFT: Lowpass-to-Lowpass Frequency Transformation
AA: Anti-Aliasing
Sigma-Delta
PC: Personal Computer
TB: Transition Bandwidth
ALU: Arithmetic-Logic Unit
CDSM: Constrained Downhill Simplex Method
CSDC: Canonic Signed Digit Code
NBC: Natural Binary Code
FSD: Fractional Sample Delayer
IEE: Institute of Electrical Engineers
IEEE: Institute of Electronic and Electrical Engineers
Inverse Z-transform
ZII: Zero-Insert Interpolation
Chapter 1 POLYPHASE IIR FILTERS Design and Applications
1. OVERVIEW OF POLYPHASE IIR FILTERS
The idea of a polyphase structure can be easily derived from an FIR filter if all coefficients are substituted by appropriate allpass subfilters with constant unity gain and frequency dependent phase shifts leading to the structure as presented in Figure 1-1(a).
Each allpass has a different, carefully designed phase response. This is why the structure is called polyphase. The frequency selective characteristics of the filters are due to the phase shifts between consecutive branches (Figure 1-1 (b)). This book concentrates on the special class of polyphase structure in which the AllPass Filters (APF) are made of one-coefficient allpass sections (2), where N is the number of branches in the structure. It is worth noting that their impulse responses are sequences of samples with non-zero values only at sample indices that are multiples of N. The step response changes its value every N samples. The impulse and step responses, and the average time delay, indicate that such allpass filters may be very useful for decimation by a factor of N. The general transfer function of the polyphase structure presented in Figure 1-1 is [2]:
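The displayed equations (1) and (2) did not survive in this copy. A form consistent with the surrounding description (N delayed branches, each a cascade of one-coefficient allpass sections in the N-th power of z) is sketched here; the symbols $\alpha_{k,i}$ and $K_k$ are assumed notation, not necessarily the authors' own.

```latex
% (1) N-path polyphase structure: branch k is delayed by k samples
H(z) = \frac{1}{N} \sum_{k=0}^{N-1} z^{-k} A_k(z^N)

% (2) branch k as a cascade of K_k one-coefficient allpass sections
A_k(z^N) = \prod_{i=1}^{K_k} \frac{\alpha_{k,i} + z^{-N}}{1 + \alpha_{k,i} z^{-N}}
```

With this section form each factor is a rational function of $z^{-N}$ only, which is why its impulse response is non-zero only at multiples of N.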
Each one-coefficient allpass section has N poles, located at the N-th roots of the negated coefficient, and N zeros at the reciprocal locations. To ensure stability of the overall polyphase structure, the absolute value of every coefficient must be less than unity. Substituting (2) into (1), the overall transfer function of the polyphase structure becomes:
The frequency response is obtained by evaluating (3) on the unit circle [3]:
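As a minimal numerical sketch of evaluating such a structure on the unit circle, the following assumes the one-coefficient section form (alpha + z^-N) / (1 + alpha z^-N) and a branch delay of k samples in path k. The function names and the example coefficients are illustrative only.

```python
import cmath
import math


def allpass(alpha, z, n_paths):
    """One-coefficient allpass section (alpha + z^-N) / (1 + alpha * z^-N).

    The section form is an assumption about the convention used in the text.
    """
    zn = z ** (-n_paths)
    return (alpha + zn) / (1 + alpha * zn)


def polyphase_response(paths, omega):
    """Evaluate H(e^{j*omega}) of an N-path polyphase structure.

    paths[k] lists the coefficients of the allpass cascade in path k;
    path k is additionally delayed by k samples.
    """
    n_paths = len(paths)
    z = cmath.exp(1j * omega)
    total = 0
    for k, coeffs in enumerate(paths):
        branch = z ** (-k)
        for alpha in coeffs:
            branch *= allpass(alpha, z, n_paths)
        total += branch
    return total / n_paths


# Hypothetical two-path example: unity gain at DC, a zero at Nyquist.
example = [[0.125], [0.5625]]
dc_gain = abs(polyphase_response(example, 0.0))
nyquist_gain = abs(polyphase_response(example, math.pi))
```

Because every section has unit magnitude, the overall response is set purely by how the branch phases add, which is what the following paragraphs describe.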
The phase response of the allpass filter defined by (5) (see Figure A.3 and Figure A.4 in Appendix A) is a monotonic function of normalized frequency and ranges from zero at DC to an N-multiple of $-\pi$ at Nyquist. By proper choice of coefficients, the phase shifts of the branches of the N-path structure can be designed to match in some frequency regions and to be mutually displaced in others. Such a situation for N=2 is shown in Figure 1-2. The phase characteristics of both paths overlap (in-phase) at low frequencies and are displaced by $\pi$ (out of phase) at high frequencies.
The two-path polyphase lowpass filter is formed as a sum of two parallel allpass filters with phase shifts carefully designed to add constructively in the passband and destructively in the stopband. It results in the magnitude response function of the half band filter presented in Figure 1-2(c). The proposed two-path polyphase filter structure offers very desirable properties in comparison to most published decimation and interpolation filters used in Sigma-Delta data converter implementations.
1.1 Two-Path Structure
The general transfer function of the halfband lowpass filter presented in Figure 1-2 can be derived from (3) by substituting N=2:
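The displayed equation (6) is missing from this copy. Assuming one-coefficient sections of the form $(\alpha + z^{-2})/(1 + \alpha z^{-2})$, setting N=2 would give the following; $\alpha_{0,i}$, $\alpha_{1,i}$, $K_0$ and $K_1$ are assumed symbols for the coefficients and section counts of the upper and lower branches.

```latex
H(z) = \frac{1}{2} \left[
    \prod_{i=1}^{K_0} \frac{\alpha_{0,i} + z^{-2}}{1 + \alpha_{0,i} z^{-2}}
    + z^{-1} \prod_{i=1}^{K_1} \frac{\alpha_{1,i} + z^{-2}}{1 + \alpha_{1,i} z^{-2}}
\right]
```

Each section contributes two poles and the branch delay one more, so the order in this notation is $2(K_0 + K_1) + 1$.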
This equation describes a stable system provided that the absolute value of every coefficient is less than unity. The order of the filter is one more than twice the total number of cascaded allpass sections in the upper and lower branches. The frequency response can be calculated by evaluating (6) on the unit circle [3]:
Phase responses of both branches are:
Very important for the quality of decimation is the magnitude response of the filter. Its shape is the cosine-like function derived from (7) and given by:
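The cosine shape follows directly from adding two unit-magnitude phasors. As a short worked step, writing $\varphi_0(\omega)$ for the phase of the upper branch and $\varphi_1(\omega)$ for the phase of the lower branch including its delay (assumed symbols):

```latex
H(e^{j\omega})
  = \tfrac{1}{2}\left( e^{j\varphi_0(\omega)} + e^{j\varphi_1(\omega)} \right)
  = e^{j\frac{\varphi_0(\omega) + \varphi_1(\omega)}{2}}
    \cos\!\left( \frac{\varphi_0(\omega) - \varphi_1(\omega)}{2} \right)

\left| H(e^{j\omega}) \right|
  = \left| \cos\!\left( \frac{\varphi_0(\omega) - \varphi_1(\omega)}{2} \right) \right|
```

The magnitude is unity where the branch phases coincide and zero where they differ by an odd multiple of $\pi$.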
Example magnitude response for the seventh-order (three-coefficient) LPF is presented in Figure 1-3.
Deep minima of the magnitude response for normalized frequencies above half-Nyquist are due to all zeros being placed in the left-hand z-plane, close to or on the unit circle. Considering the properties of the allpass sub-filter phase response, given by (A.6a)-(A.6c) in Appendix A, the lowpass filter magnitude response has the following properties [3]:
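The list of properties (10) is not reproduced in this copy. One symmetry that does hold for any half-sum of two allpass branches of this kind is power complementarity about half-Nyquist; whether it is stated in exactly this form in (10) is an assumption. The sketch below checks it numerically for hypothetical coefficients, assuming the section form (a + z^-2) / (1 + a z^-2).

```python
import cmath
import math


def two_path_lpf(upper, lower, omega):
    """H(e^{j*omega}) = 0.5 * (A0(z^2) + z^-1 * A1(z^2)) for coefficient lists."""
    z2 = cmath.exp(-2j * omega)          # z^-2 on the unit circle
    a0 = 1
    for a in upper:
        a0 *= (a + z2) / (1 + a * z2)
    a1 = cmath.exp(-1j * omega)          # delay in the lower branch
    for a in lower:
        a1 *= (a + z2) / (1 + a * z2)
    return 0.5 * (a0 + a1)


def power_sum(upper, lower, omega):
    """|H(omega)|^2 + |H(pi - omega)|^2, expected to equal 1."""
    direct = abs(two_path_lpf(upper, lower, omega)) ** 2
    mirrored = abs(two_path_lpf(upper, lower, math.pi - omega)) ** 2
    return direct + mirrored


# Hypothetical coefficients; any values with magnitude below one should do.
sums = [power_sum([0.125], [0.5625], w) for w in (0.1, 0.7, 1.3, 2.9)]
```

This mirror relation is what ties a small stopband ripple to an even smaller passband ripple.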
Specifications for a half-Nyquist LPF with a given transition band are:
The two quantities in (11) are the peak ripples in the passband and in the stopband respectively. Both should be as small as possible, preferably close to zero. This requirement is satisfied exactly only at DC and at Nyquist, due to properties (a) and (b) in (10). Symmetry property (d) in (10) means that the passband frequency at which the magnitude response is minimum has a mirror-image stopband frequency, reflected about half-Nyquist, at which the magnitude response is maximum. For the least square optimal equiripple magnitude response in both passband and stopband, the gain must reach its passband minimum at the passband cutoff frequency and its stopband maximum at the stopband cutoff frequency. The cutoff frequencies are related by the equation:
Using (11) we obtain the relation between passband and stopband ripples:
For any practical filter the stopband ripple is much smaller than unity, so (13) approximates to:
In practical cases, it is possible to concentrate only on achieving the stopband specification, because a reasonable stopband performance guarantees, in view of (14), an even better passband performance. If, for example, the stopband ripple corresponds to 60dB attenuation, the resulting passband ripple is several orders of magnitude smaller and the minimum passband gain is very close to unity.
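To put a number on this, the sketch below assumes the power-complementary relation between the passband minimum and the stopband maximum (the squared gains at mirrored frequencies sum to one). This is a property of the two-allpass structure; that it matches the lost equations (13) and (14) exactly is an assumption.

```python
import math


def passband_ripple(stopband_attenuation_db):
    """Peak passband ripple implied by a given stopband attenuation in dB.

    Assumes (1 - delta_p)^2 + delta_s^2 = 1, where delta_s is the stopband
    ripple and delta_p the passband ripple.
    """
    delta_s = 10 ** (-stopband_attenuation_db / 20)
    min_passband_gain = math.sqrt(1 - delta_s ** 2)
    return 1 - min_passband_gain


ripple_60db = passband_ripple(60)     # stopband ripple of 1e-3
approx_60db = 0.5 * (1e-3) ** 2       # small-ripple approximation
```

For 60dB attenuation the exact and approximate values both come out at about half a millionth, which is why the design effort can concentrate on the stopband.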
Phase response of the two-path LPF can be derived from (7):
The last factor in (15) represents phase jumps of $\pi$ due to sign changes of the cosine function in (10), which occur at the frequencies of the filter zeros. Those phase jumps can be seen in the filter stopband in Figure 1-3. The linear-phase requirement is very important in the design of digital filters. In processing broad-band signals a lot of effort is invested in keeping the filter phase as linear as possible, to prevent the harmonics of the useful signal from being delayed by different amounts of time. The group delay function is a good measure of phase linearity. It can be derived from its definition (16):
Substituting (8) and (15) into (16) gives:
Simplifying:
Equations (17) and (18) contain the sum of delta functions located at frequencies at which phase jumps occur. As they only occur in the filter stopband, they do not cause any problems for the signal frequencies in the filter passband. The group delay function for the example seventh-order lowpass filter is presented in Figure 1-5.
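Away from the stopband phase jumps, the group delay of the two-path filter is the average of the two branch group delays. The sketch below compares a closed form built on that observation with a numerical phase derivative. It assumes sections of the form (a + z^-2) / (1 + a z^-2); the closed form is derived here and is not copied from (17) or (18), and the coefficients are hypothetical.

```python
import cmath
import math


def two_path_lpf(upper, lower, omega):
    """H(e^{j*omega}) = 0.5 * (A0(z^2) + z^-1 * A1(z^2))."""
    z2 = cmath.exp(-2j * omega)
    a0 = 1
    for a in upper:
        a0 *= (a + z2) / (1 + a * z2)
    a1 = cmath.exp(-1j * omega)
    for a in lower:
        a1 *= (a + z2) / (1 + a * z2)
    return 0.5 * (a0 + a1)


def group_delay_closed_form(upper, lower, omega):
    """Half the sum of all section delays plus the single branch delay."""
    total = 1.0                                   # z^-1 in the lower branch
    for a in list(upper) + list(lower):
        total += 2 * (1 - a * a) / (1 + 2 * a * math.cos(2 * omega) + a * a)
    return total / 2


def group_delay_numeric(upper, lower, omega, step=1e-5):
    """Negative phase slope from a central difference (passband only)."""
    ahead = two_path_lpf(upper, lower, omega + step)
    behind = two_path_lpf(upper, lower, omega - step)
    return -cmath.phase(ahead * behind.conjugate()) / (2 * step)


upper, lower = [0.125], [0.5625]
passband_closed = group_delay_closed_form(upper, lower, 0.3)
passband_numeric = group_delay_numeric(upper, lower, 0.3)
peak = group_delay_closed_form(upper, lower, math.pi / 2)
```

For positive coefficients each term is largest where cos(2*omega) = -1, that is at half-Nyquist, and grows as a coefficient approaches one.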
The function is even symmetric with respect to DC and half-Nyquist and positive for all frequencies. It can be seen from the group delay that the phase response is linear only close to DC and close to Nyquist. The group delay peaks at half-Nyquist, where the phase response has its inflection point. It can be seen from (18) that the larger the values and the number of coefficients, the larger the group delay peak and the more non-linear the phase response becomes.

The design algorithm for the floating-point version of this class of two-path polyphase structure was developed by Harris and Constantinides in [1], [4]-[5] and is based on the analogy to analog elliptic filters. The design for more than two paths and for fixed-point coefficients will be described later in Chapter 1 and in Chapter 3 respectively. The structure described above offers significant five-to-one savings in multiplications in comparison to the equivalent elliptic filter: a fifth-order two-path structure requires only two coefficients, as opposed to the direct implementation requiring ten coefficients. As for any filter design of a fixed order, the transition width can be traded for the stopband attenuation, or it can be reduced for a fixed attenuation by increasing the number of filter coefficients. An approximate relationship has been published in [1], [4] for the equiripple filter:

Factor A is the attenuation in dB, TB is the normalized transition bandwidth and N is the number of allpass segments (including the delay).

1.1.1 One-coefficient two-path polyphase IIR lowpass filter
One-coefficient two-path LPF has a structure with one allpass in the upper branch and a delayer in the lower one as shown in Figure 1-6.
The transfer function of this simple structure is:
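The displayed equation (20) is missing here. For one allpass section in the upper branch and a pure delay in the lower one, and assuming the section form $(\alpha + z^{-2})/(1 + \alpha z^{-2})$ with $\alpha$ as an assumed symbol for the coefficient, the transfer function and its factored numerator would read:

```latex
H(z) = \frac{1}{2} \left[ \frac{\alpha + z^{-2}}{1 + \alpha z^{-2}} + z^{-1} \right]
     = \frac{1}{2} \,
       \frac{(1 + z^{-1}) \left( \alpha + (1 - \alpha) z^{-1} + \alpha z^{-2} \right)}
            {1 + \alpha z^{-2}}
```

The factor $(1 + z^{-1})$ is the zero at Nyquist; the second-order factor carries the remaining pair of zeros.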
It can be seen that the poles of the LPF are located identically to the poles of the allpass filter. There is a zero at Nyquist due to the delay in the lower branch, which forces the frequency response to zero at Nyquist. The other two zeros form a conjugate pair located on the unit circle. Their exact location can be determined using the standard assumption:
Comparing the second-order term from the numerator of (20) with the transfer function of a second-order FIR filter determines the zero locations in terms of the coefficient value:
Solving for the zero locations:
Equivalently, in terms of the magnitude and phase of the roots:
If conjugate zeros are located near the Nyquist frequency they can produce enhanced attenuation in the stopband. The attenuation and transition band in terms of the coefficient value for this filter are presented in Figure 1-7 and Figure 1-8 respectively.
Clearly the requirements for large stopband attenuation and a narrow transition band are mutually exclusive. For attenuations above 60dB, the transition band is forced to be correspondingly wide (see Figure 1-8). The disadvantage of a one-coefficient filter is that, in order to achieve high attenuation, the coefficient must be very accurate. For example, an 11-bit coefficient representation is required to achieve 96dB attenuation (over 16 bits to achieve 120dB).

Equations (20) and (22) show that all the zeros of the one-coefficient filter are on the unit circle, which requires the coefficient to be at least 1/3 (and below unity for stability). If the coefficient value reaches 1/3, the filter becomes the third-order integrator, with both conjugate zeros reaching the point z=-1. Decreasing the coefficient value below 1/3 makes the zeros split in two, sliding along the left-hand real axis as seen in Figure 1-9.
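The behavior around the value 1/3 can be checked with a few lines of code. The sketch assumes the section form (alpha + z^-2) / (1 + alpha z^-2), for which the numerator factors into the zero at z=-1 and the quadratic alpha*z^2 + (1 - alpha)*z + alpha. The function name is illustrative.

```python
import cmath


def one_coefficient_zeros(alpha):
    """Roots of alpha*z^2 + (1 - alpha)*z + alpha.

    These are the two zeros left after removing the fixed zero at z = -1,
    under the assumed allpass section form.
    """
    root = cmath.sqrt((1 - alpha) ** 2 - 4 * alpha ** 2)
    return (
        (-(1 - alpha) + root) / (2 * alpha),
        (-(1 - alpha) - root) / (2 * alpha),
    )


on_circle = one_coefficient_zeros(0.5)       # conjugate pair on the unit circle
at_nyquist = one_coefficient_zeros(1 / 3)    # both zeros meet at z = -1
split_pair = one_coefficient_zeros(0.25)     # reciprocal pair on the real axis
```

The product of the two roots is always one, so once they leave the unit circle they form a reciprocal pair on the negative real axis.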
1.1.2 Two-coefficient two-path polyphase IIR lowpass filter
Two-coefficient structure is presented in Figure 1-10. Filters A0(z-2) and A1(z-2) are one-coefficient second-order allpass subfilters.
The transfer function of such a polyphase LPF is:
This transfer function can be rearranged into a product form showing pole-zero locations. It is easy to find the roots of the overall transfer function as they are the same as the roots of the allpass subfilters composing the filter, except for an additional pole at the origin (due to the delayer in the lower branch). An equivalent product form of (25) is the following:
It is well known that the roots of any fourth-order symmetrical polynomial can be found analytically [6]. The roots of the function:
Using (29) to find the roots of equation (27) gives:
This result gives two conjugate pairs of zeros. In order to find the maximum stopband attenuation for a given transition band, the zeros of the filter should be on the unit circle. This requires the corresponding term of (30) to be real and negative; further, the argument under the square root of (30) must be positive.
After rearranging we get:
Considering that coefficient values are positive we get:
Rearranging in terms of one of the coefficients gives:
When both sides of (34) are equal, the two conjugate pairs of zeros on the unit circle coincide, forming double zeros. When the coefficient values are changed within the allowed range of (0...1), the placement of the filter zeros changes dramatically, as can be seen in Figure 1-11 [7]-[8]. The shaded area represents coefficients for which all filter zeros are on the unit circle. The grid points inside the area represent the cases of four-bit coefficient pairs. The bold ones have no more than two bits set in the entire coefficient in either Unsigned (0,1) or Signed Binary Code (-1,0,1). When the upper-branch coefficient is zero, the filter reduces to the single-coefficient case (third-order transfer function).
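A small numerical sketch can test whether a given coefficient pair keeps all zeros on the unit circle. It assumes the section form (a + z^-2) / (1 + a z^-2), with a0 in the undelayed branch and a1 in the delayed one. After removing the zero at z=-1, the symmetric quartic reduces, through the substitution w = z + 1/z, to a0*w^2 + (a1 - a0)*w + (1 - a0)*(1 - a1) = 0. This reduction is derived here rather than copied from (27)-(30).

```python
import cmath
import math


def two_coefficient_zeros(a0, a1):
    """Four zeros of the two-coefficient filter besides the one at z = -1."""
    disc = (a1 - a0) ** 2 - 4 * a0 * (1 - a0) * (1 - a1)
    zeros = []
    for sign in (1, -1):
        w = (-(a1 - a0) + sign * cmath.sqrt(disc)) / (2 * a0)
        r = cmath.sqrt(w * w - 4)        # solve z^2 - w*z + 1 = 0
        zeros += [(w + r) / 2, (w - r) / 2]
    return zeros


def all_on_unit_circle(a0, a1, tol=1e-9):
    """True when every one of the four zeros has unit magnitude."""
    return all(abs(abs(z) - 1) < tol for z in two_coefficient_zeros(a0, a1))


# The pair (1/8, 9/16) from the text: the discriminant vanishes (double zeros).
double_zero_case = all_on_unit_circle(1 / 8, 9 / 16)

# Pair for which all five zeros coincide at z = -1 under this convention.
focal = (1 - 2 / math.sqrt(5), 5 - 2 * math.sqrt(5))
focal_zeros = two_coefficient_zeros(*focal)
```

Under these assumptions the pair with all zeros at Nyquist comes out at roughly (0.1056, 0.5279), and the pair (1/8, 9/16) sits exactly on the double-zero boundary.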
The area is bounded on one side by (34) and on the other by the condition requiring the argument under the square root to be less than or equal to zero, forcing all four zeros to be complex:

The trajectories of equations (33) and (35) have a single contact point (they touch, but do not cross each other). For the coefficient values at this point, all five zeros of the transfer function are located at Nyquist. The contact point can be thought of as the "focal" point of the filter. Moving away from this point into any of the four areas, C0, C1, C2 or C3, results in different filter behaviors.

Choosing coefficients from area C0 (the shaded one) results in two conjugate pairs of zeros moving from Nyquist along the unit circle towards the half-Nyquist point, which they reach when both coefficients are equal to one. The area C0 is bounded by two curves of coefficients for which the zeros on the unit circle form single conjugate pairs for (35) or double conjugate pairs for (34) (all the remaining zeros are at Nyquist).

When coefficients are chosen from area C1, two pairs of zeros move away from Nyquist: a pair of conjugate zeros sliding on the unit circle and a pair of reciprocal zeros on the negative real axis. This area is bounded by the condition (35) of single zeros on the unit circle with the remaining ones at Nyquist, and by the general requirement of limiting coefficient values to the range (0...1).

For coefficients chosen from area C2, the zeros of the transfer function stay on the negative real axis and split into two pairs of reciprocal zeros and a single one staying at Nyquist. This area is bounded by the condition of a single reciprocal pair of zeros on the negative real axis for (35) (all remaining ones stay at Nyquist) and by the condition requiring two reciprocal pairs of zeros to stay on the negative real axis for (34) (a single zero stays at Nyquist).

If coefficients are chosen from area C3, four zeros of the transfer function are arranged into reciprocal conjugate zeros with only a single zero remaining at Nyquist. The area is bounded by the condition (34) requiring two pairs of conjugate zeros to stay on the unit circle, and by the general requirement of limiting coefficient values to the range (0...1).

There is also a fifth area, C4, that is far away from the focal point. It represents the coefficients for which four zeros of the transfer function form two conjugate pairs on the unit circle, but on the right-hand side of the imaginary axis.

An interesting example of the double zero is the case marked in Figure 1-11. Its coefficients (1/8, 9/16) are only four bits long and have only three bits set in total, making it an attractive case for implementation. Establishing the bounded area of coefficients for which all zeros are on the unit circle is important not only for understanding the behavior of the filter; it can also be used by constrained filter design algorithms (as described in Chapter 3), whose principle of operation is a structured search within the established boundaries.

1.1.3 Two-path lowpass filters with more than two coefficients
For a filter with three or more coefficients, the equations describing its pole-zero locations become too complex to be solved analytically in the same manner as for the two-coefficient filter. Such an analysis has to be done numerically. An algorithm has been developed to aid such analysis: it returns the coefficients of the polyphase structure for given zero or pole locations of the filter, and it works for any number of filter coefficients. For the case when the pole locations are known, finding the filter coefficient values is trivial. They are simply equal to the radii of the filter poles raised to the N-th power, where N is the number of paths. Depending on the number of paths, the zeros of the polyphase structure can take any of a number of permissible symmetrical dispositions. Six example zero patterns are shown for the two-path two-coefficient filter case in Figure 1-12. The zeros of the polyphase LPF can take any combination of these dispositions.
Determining the coefficient values from the zero locations is especially useful for specifying the bounded area of filter coefficients for which the zeros are on the unit circle. This information can be used as the starting point by floating-point filter design algorithms such as Powell or Simulated Annealing [9], as well as by any constrained filter design algorithm such as bit-flipping or Constrained Downhill Simplex (described in Chapter 3), in which the design is based on a search within the bounded space. Analytic methods of finding the area of coefficient values for which the polyphase filter zeros are on the unit circle, thus allowing one to achieve the maximum attenuation for the given filter order, proved impractical for more than three coefficients, because the equation manipulations involved are very difficult to carry out analytically. Alternatively, the required range of coefficients can be found by a direct numerical method. Let us now have a look at the general transfer function of the N-path polyphase structure:
It can be noticed from (36) that the filter denominator has non-zero coefficients only at powers of z that are multiples of N, up to NK, with K being the total number of filter coefficients. For every path the bracketed expression in the numerator is a product of first-order terms in the N-th power of z, which when multiplied together also results in non-zero coefficients only at multiples of N (just like in the case of the filter denominator). The delay term of each path shifts the exponents of those products by a different integer value for every path. As a result, when the transfer functions of all path subfilters are added together, only one sub-filter contributes a non-zero coefficient at each power of z.

So what does this tell us? One should pay careful attention to the way the product for each path is created. Each product is built from K one-coefficient blocks, one for each filter coefficient, coming either from the numerator or from the denominator of the corresponding allpass section. This means that if the locations of the filter zeros are known, the procedure for finding the filter coefficients becomes very simple:
1. Create the filter numerator from its zeros by convolving together the first-order terms of all the zeros of the transfer function.
2. Take every N-th coefficient of the numerator polynomial and use them to create a new polynomial in terms of the N-th power of z.
3. Calculate the roots of the new polynomial.
4. If any root magnitude is greater than unity, take its reciprocal.
5. The root values obtained are then equal to the polyphase filter coefficients.

To show this method in practice, it will first be used to calculate the coefficients of the polyphase structure for which all zeros are located at Nyquist. Such a filter is very suitable as the initial point of optimization algorithms in polyphase filter design. Results from trial runs of the algorithm are summarized in Table 1-1 for the two-path polyphase filter with different numbers of coefficients.
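The five steps can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the authors' program: the allpass section is taken as (alpha + z^-N) / (1 + alpha z^-N), so the roots of the sub-sampled polynomial in z^-N come out as the negated coefficients (or their reciprocals) and the sign is flipped at the end. The Durand-Kerner root finder is a stand-in chosen to keep the example free of external libraries, and it assumes distinct roots.

```python
def numerator_from_zeros(zeros):
    """Step 1: convolve the terms (1 - z_k * z^-1); ascending powers of z^-1."""
    poly = [1 + 0j]
    for zk in zeros:
        nxt = [0j] * (len(poly) + 1)
        for i, c in enumerate(poly):
            nxt[i] += c
            nxt[i + 1] -= c * zk
        poly = nxt
    return [c.real for c in poly]


def poly_roots(asc, iterations=200):
    """Step 3: Durand-Kerner iteration; asc[m] multiplies x^m."""
    n = len(asc) - 1
    monic = [c / asc[-1] for c in asc]
    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        for i in range(n):
            value = sum(c * roots[i] ** m for m, c in enumerate(monic))
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots


def coefficients_from_zeros(zeros, n_paths):
    """Steps 2, 4 and 5: sub-sample, reflect roots inside the circle, negate."""
    sub = numerator_from_zeros(zeros)[0::n_paths]   # every N-th coefficient
    alphas = []
    for x in poly_roots(sub):
        if abs(x) > 1:
            x = 1 / x
        alphas.append(-x.real)      # sign flip from the assumed section form
    return sorted(alphas)


# All zeros at Nyquist for the two-path filter with one and two coefficients.
one_coeff = coefficients_from_zeros([-1] * 3, 2)
two_coeff = coefficients_from_zeros([-1] * 5, 2)
```

With all zeros at Nyquist this yields 1/3 for the one-coefficient filter and roughly (0.1056, 0.5279) for the two-coefficient one. Roots that had to be reflected belong to the delayed branch in this convention.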
The same method can be used to specify the bounded area of the two-path, three-coefficient polyphase filter for which all zeros lie on the unit circle, thus allowing achievement of the maximum possible stopband attenuation. Such an approach is especially useful for the case of constrained coefficients for which the equiripple stopband can be only approximated.
The method for finding filter coefficients from known (valid) zero locations can be used to analyze the polyphase structure and identify the space of coefficients where the desired filter solutions reside. This may not be very important for floating-point designs, considering the method suggested by Constantinides and Harris in [4], but it is very useful for algorithms searching in the space of constrained coefficients. In such a case the optimum solution in terms of maximum attenuation will no longer be identical to the floating-point one.
The visualization of zero placement for the two-path, three-coefficient polyphase LPF becomes more difficult than for the two-coefficient case, as shown in Figure 1-13. It can be noticed that the coefficient space corresponding to zeros on the unit circle is only a small part of the total volume of possible coefficient combinations. This volume is bounded by three curves defining the cases of all zeros, single, double and triple, being either on the unit circle or at Nyquist. Those curves converge to the point where all zeros reside at Nyquist. The surface spanned by the curves of single and double zeros on the unit circle represents the case where one conjugate pair of zeros migrates from Nyquist towards the single zeros, in the end converging into double zeros. In the same way, the surface spread between the curves of double and triple zeros on the unit circle describes the case where a conjugate pair of zeros moves from Nyquist towards the double zeros, creating triple zeros on the unit circle. The third surface, bounded by the curves of single and triple zeros, describes filter cases having a double conjugate pair of zeros moving from Nyquist towards the other pair of single zeros, converging into triple zeros.

It is also important to know what happens when coefficients are chosen from outside the described volume. If so, some or all of the zeros will no longer reside on the unit circle. This is very often the case when filter coefficients are constrained to a short wordlength. The curves of equiripple stopband attenuation for the one-, two- and three-coefficient cases, indicated in Figure 1-13, are very close to the boundaries of the volume. As a result, when the coefficients are constrained, they are very likely to fall out of the volume, resulting in zeros off the unit circle. It is characteristic that when zeros fall off the unit circle, they do so in pairs. For each such pair, one zero falls outside and one inside the unit circle (to satisfy the further constraint of keeping the transfer function real), both being at the same angle (frequency) and forming a conjugate quartet.

The shaded triangular shapes in Figure 1-13 represent cases for which the filter has a constant transition band (not an optimum case in terms of the stopband attenuation). Thick lines represent optimum filter design cases in terms of equiripple passband and stopband ripples for different transition bands. These were obtained from the standard polyphase LPF floating-point filter design algorithm [1], [4]-[5]. It is easy to notice that the coefficients of the top-branch allpass filter can be exchanged without affecting the behavior of the filter. Therefore there should be two volumes, of which only one is shown in Figure 1-13. This is because, first, both of them have exactly the same shape and cause the same filter behavior, so showing both volumes would not convey any extra information. Secondly, such visualization allows the user to zoom into one of the volumes, which otherwise would be very small and barely visible inside the full cube.
Using the analogy to the three-coefficient case, it is easy to predict how the filter will behave for four coefficients and what the volume of coefficients ensuring the placement of all the zeros on the unit circle would look like, even if it is difficult to picture it on paper. In the four-dimensional space there will be a single point of all zeros at Nyquist and four curves of single/double/triple/quadruple zeros on the unit circle, going from the overall convergence point towards the four convergence points of the three-coefficient filter (cases of one/three/five/seven zeros at Nyquist only).
1.2 N-Path structure versus the classical 2-path one
The two-path structures are applicable to a wide range of halfband-type filters, including lowpass, highpass, bandpass and Hilbert ones. Extending the structure to more paths gives more flexibility in specifying filter bandwidths while preserving such advantages of the polyphase structure as high attenuation and small passband ripples achievable with a small number of coefficients and, as will be shown later, low sensitivity to constraining the coefficient values, as well as ease of implementation. The N-path polyphase structure, as described in the previous sections, is based on the N-tap FIR filter in which the coefficients are substituted by cascades of single-coefficient allpass filters, as described in Appendix A. Looking at Figure A.3 and Figure A.4 in Appendix A, it can be noticed that such allpass filters have a phase response that is independent of the coefficient value at 2N frequencies around the unit circle, i.e. for the allpass section:
The consequence of this is that the complex frequency response (magnitude and phase) of the polyphase LPF will also have fixed values, irrespective of the coefficient values, at the frequencies specified by (37) (since all the allpass sections are of the same order in all the paths). The values of the magnitude response at these characteristic frequencies can be calculated from the general transfer function of the N-path polyphase lowpass filter (3) by setting all coefficient values to unity. It can be observed that in such a case the polyphase lowpass filter becomes the Moving AVeRage (MAVR) filter:
$$H_{MAVR}(z) = \frac{1}{N} \sum_{n=0}^{N-1} z^{-n}$$
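The magnitude response at the characteristic frequencies is then:
$$\left| H_{MAVR}(\nu_k) \right| = \frac{1}{N} \left| \frac{\sin(k\pi/2)}{\sin\left(k\pi/(2N)\right)} \right|, \quad \nu_k = \frac{k}{2N}, \quad k = 1, \ldots, 2N-1$$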
Results for the number of paths ranging from two to eight are presented in Table 1-2. The results show that the MAVR filter imparts its characteristic on the shape of the lowpass filter magnitude response. The advantage is that it creates the cutoff frequency at the normalized frequency 1/(2N) and places zeros on the unit circle at even multiples of that frequency; on the other hand, it creates spikes in the stopband at odd multiples of it. These spikes are very high, with the first one at around 9-13dB below the passband level and the following ones a few dB smaller. The amplitude of the neighboring spikes does not differ substantially and until recently this has limited the application of such lowpass filters. The cutoff frequency of the polyphase lowpass filter is the same as for the MAVR filter of the same order. This also implies that performing transformations on the MAVR filter itself can change the magnitude response of the whole polyphase filter. For example, this can be a way of reducing spikes in the stopband of polyphase lowpass filters having more than two paths, as in the example of the four-path (N=4) lowpass polyphase filter designed with eight coefficients in Figure 1-14.
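The behaviour at the characteristic frequencies can be checked numerically. The following is a minimal sketch, assuming a single-coefficient allpass section of the form (alpha + z^-N)/(1 + alpha z^-N) and frequencies normalized to the sampling rate; the function names and the test coefficients are illustrative only and are not taken from the text.

```python
import cmath
import math

def allpass(alpha, n_paths, nu):
    """Single-coefficient allpass (alpha + z^-N) / (1 + alpha z^-N) at z = exp(j 2 pi nu)."""
    z_n = cmath.exp(-2j * math.pi * nu * n_paths)   # z^-N
    return (alpha + z_n) / (1 + alpha * z_n)

def mavr_db(n_paths, k):
    """Level of the N-tap moving average at nu_k = k / (2N), in dB."""
    nu = k / (2 * n_paths)
    h = sum(cmath.exp(-2j * math.pi * nu * m) for m in range(n_paths)) / n_paths
    return 20 * math.log10(abs(h))

# At nu_k the allpass equals (-1)^k whatever the (illustrative) coefficient is.
for alpha in (0.1, 0.5, 0.9):
    for k in range(8):
        assert abs(allpass(alpha, 4, k / 8) - (-1) ** k) < 1e-9

# Level at the cutoff (k = 1) and at the first stopband spike (k = 3).
levels = {n: (mavr_db(n, 1), mavr_db(n, 3)) for n in range(3, 9)}
for n, (cut, spike) in levels.items():
    print(f"N={n}: k=1 {cut:6.2f} dB, k=3 {spike:6.2f} dB")
```

Under these assumptions the k = 3 level falls from about -9.5 dB for three paths to about -13 dB for eight paths, which is the range quoted above for the first spike.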
The reasons for spikes in the magnitude response can be explained from a pole-zero plot. All filter poles are strictly inside the unit circle, positioned at four frequencies equispaced around the unit circle. The ones closest to DC determine the position of the edge of the filter passband, while the more distant ones create high spikes in the stopband. Zeros are no longer only on the unit circle. Now, almost half of them are close to poles to compensate for the influence of those further away. There are N-1 zeros placed on the unit circle at frequencies which are even multiples of the cutoff frequency irrespective of the coefficient values, just like for the case of a two-path filter. The rest of them (less than half) are placed on the unit circle. The conclusion is that the more paths are used, the less performance the filter can achieve for the same number of coefficients, both in terms of the achievable stopband attenuation for the given transition band and in terms of the high spikes in the stopband magnitude response, which for almost all applications have to be removed.
1.3 Compensating for peaks in the stopband
Decreasing the spikes in the stopband of the multi-path polyphase lowpass filter magnitude response is not a trivial task. The main idea is not to damage the highly precise passband of the filter (its ripples and the edge of the transition band) and to do the correction as efficiently as the polyphase structure does its filtering operation, thus preserving the implementation efficiency of the polyphase structure. Three methods are proposed here that minimize the effect of the stopband peaks:
1. Use a polyphase halfband compensation filter.
2. Design the non-halfband filter as a cascade of multirate subfilters.
3. Apply the frequency transformation to adjust the filter cutoff frequency.
All these methods incorporate polyphase allpass-based IIR structures and therefore do not degrade the performance of the prototype filter in terms of its passband and stopband ripples and do not require much more computation, thus preserving the implementation efficiency of the polyphase LPF.
1.3.1 Correction with halfband polyphase lowpass filters
The multi-path polyphase LPFs have their stopband peaks at certain fixed frequency locations given in (37), dependent only on the number of paths. These locations are independent of the number of coefficients as well as of their values. As the frequencies at which the spikes occur are well known, it is possible to design a compensation filter which has its zeros at these frequencies. The obvious choice would be a notch filter. The problem with such a choice is that the filter, in order to achieve similar ripple performance to the polyphase structure, would require many more coefficients (usually with floating-point precision) than the LPF itself, and the implementation efficiency of the whole filter in terms of the number of calculations would degrade. The solution is to use a similar polyphase IIR structure to compensate for the spikes. For between three and five paths such a compensation filter can be designed as a halfband polyphase LPF, as in Figure 1-15, Figure 1-16 and Figure 1-17 respectively. For a larger number of paths the stopband spikes would appear in the passband of the compensator and hence would not be removed.
The example three-path polyphase LPF was designed for a stopband attenuation of A=50dB and transition width of TB=0.05. It can be seen that for three paths the spike is only at Nyquist and it can be easily compensated with a halfband filter, which has a zero at Nyquist. The obvious question is why not use unity coefficients for the compensator, as it always has the zero at Nyquist. The reason is that the compensator has two tasks to accomplish: first, to decrease the spike in the stopband of the multi-path LPF and, second, to keep the passband ripples within the allowed limits. Therefore the compensator always has to be designed for the same passband width (transition band) as the original filter, TB, and for passband ripples, Assuming that then:
In the example design the allowed passband ripples were The two-coefficient corrector designed for the attenuation of and transition bandwidth (2 coefficients) was enough to keep the passband ripples within the required limits, at the same time removing the spike and increasing the stopband attenuation of the overall filter. For the four- and five-path cases the spikes occur in the middle of the stopband, and therefore the halfband filter must be designed so that it has one of its zeros placed at the frequency of the spike. The proposed method places the first zero of the corrector at the frequency of the spike and then the next ones between the first one and Nyquist in such a way as to achieve equiripple stopband attenuation for the corrector. The number of coefficients (equivalent to half the number of zeros excluding the one at Nyquist) is chosen to keep the passband ripples within the limits, To the first approximation the number of coefficients, is the same as for the halfband filter corrector designed for the same passband ripples, and the transition band of The zeros of the polyphase LPF are on the cosine frequency scale (analogous to the logarithmic scale) and therefore if the first zero is at frequency positions of other zeros of the polyphase halfband LPF, can be calculated from:
The coefficients can then be calculated from the positions of the zeros using the algorithm described in section 1.1.1.3. Alternatively, an optimization routine can be applied to adjust the transition band of the equiripple polyphase correction halfband LPF in order to place its first zero at the frequency of the spike. The result of correcting the four-path LPF is shown in Figure 1-16. The specifications for the prototype filter were similar to the previous example, namely the attenuation A=50dB and transition width of TB=0.05. The corrector was designed to achieve the total passband ripples less than It was designed for the attenuation and transition band which it achieved with two coefficients, setting the passband ripples of the overall filter at and its minimum stopband attenuation at
The five-path LPF was designed in a similar way to the four- and three-path ones. The five-path case has two spikes: one at Nyquist and the other one at (Figure 1-17). In some ways it is similar to both the three- and four-path cases. As before, the first zero of the corrector was chosen to be at the frequency of the spike, with all the other ones placed between the first one and Nyquist to achieve an equiripple magnitude response of the corrector in its stopband. The corrector was designed to achieve total passband ripples less than It was designed for the attenuation of and transition band of which it achieved with four coefficients, setting passband ripples at and the minimum stopband attenuation at The result is shown in Figure 1-17. For more than five paths the spikes in the stopband of the magnitude response appear at frequencies and lower, which makes it impossible for the halfband filter to correct them as it has its 3dB cutoff at Therefore some of the spikes are not correctable. The way to correct them would be to use filters with more than two paths, but they themselves would require correction as well.
In such a case the order of the overall filter would increase considerably and become impractical for implementation, hence a different approach is suggested that can also be used for the number of paths between three and five (see above). The method is based on the cascaded structure of halfband filters transformed with frequency transformation, as shown below.
1.3.2 Cascading polyphase halfband LPF transformed by $z^{-1} \rightarrow z^{-k}$
The idea of cascaded polyphase lowpass filters achieving a spike-free stopband and having a baseband equal to an integer fraction of the Nyquist frequency is based on converting a prototype lowpass filter into a multiband filter using a frequency transformation. It replaces each delayer of the original filter transfer function with $z^{-k}$, where k is an integer equal to the number of required filter replicas around the unit circle, as in Figure 1-18.
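The effect of replacing every delayer by a k-fold delay can be illustrated with a short numerical check. This is a sketch under stated assumptions: a two-path halfband prototype built from single-coefficient allpass sections, with coefficient values A0 and A1 that are hypothetical and not taken from any design in the text.

```python
import cmath
import math

def halfband(nu, a0, a1, k=1):
    """Two-path polyphase halfband LPF with every z^-1 replaced by z^-k."""
    z1 = cmath.exp(-2j * math.pi * nu * k)      # z^-k
    z2 = z1 * z1                                # z^-2k
    top = (a0 + z2) / (1 + a0 * z2)             # allpass of the first path
    bottom = z1 * (a1 + z2) / (1 + a1 * z2)     # delay plus allpass of the second path
    return 0.5 * (top + bottom)

A0, A1 = 0.125, 0.5625    # hypothetical coefficients, for illustration only
K = 4                     # number of replicas around the unit circle

# Passband replicas sit at multiples of 1/K, zeros half-way between them.
passband_centres = [abs(halfband(m / K, A0, A1, K)) for m in range(K)]
stopband_centres = [abs(halfband((m + 0.5) / K, A0, A1, K)) for m in range(K)]

# The transformed response is the prototype response on a compressed frequency axis.
same_shape = abs(halfband(0.03, A0, A1, K) - halfband(0.12, A0, A1, 1)) < 1e-12
```

With K = 4 the prototype passband around DC reappears around 0.25, 0.5 and 0.75 of the sampling frequency, which is why the other stages of a cascade have to place their stopbands over the unwanted replicas.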
In this example a four-coefficient polyphase IIR half-band LPF was transformed by and Cascading a number of such filters, each transformed to have a different number of replicas, allows the design of a variety of lowpass filters having integer-fractional passband widths. The minimum stopband attenuation is equal to the minimum one of all the cascaded filters, while the total passband ripples are the sum of the ripples of the basic filters. The power factor in the frequency transformation has to be properly chosen for each basic filter so that there is at least one filter which has its stopband at the frequencies where the overall filter should have its stopband, i.e. to avoid other passbands. A good example is a class of filters for which the cutoff frequency m being an integer and Such a class of filters can be designed as a cascade of prototype halfband polyphase IIR filters, each successive one transformed by where and i=1..m. All halfband prototype filters except the last one must be designed in a way to avoid overlapping their passbands. Therefore their transition bandwidths before transformation have to be specified as The last-stage filter determines the overall transition bandwidth, TB, and is designed for the transition bandwidth of,
An example of such a class of filters is a lowpass filter achieving passband ripples of up to its cutoff frequency of having transition bandwidth of TB=0.05 and stopband attenuation of A=53dB as shown in Figure 1-19. It was designed using three cascaded LPFs having transition bandwidths of (one coefficient), (two coefficients) and (three coefficients) with one more coefficient per stage required for the lowpass-to-lowpass frequency transformation.
1.3.3 Cutoff frequency adjusted using frequency transformation
A different approach to the design of polyphase filters with cutoff frequencies different from where N>2 is the number of paths and the basic allpass filter order, and not suffering from high spikes in the stopband, is to use the Lowpass-to-Lowpass Frequency Transformation (LLFT) (42) to move the cutoff of the prototype halfband polyphase LPF, to the required new one,
where and In practice the transformation is performed by substituting every delayer of the original filter with the allpass filter (42). As the halfband polyphase LPF has equal ripples both in its passband and stopband (it does not have any spikes in its stopband), the new transformed filter will also have an equiripple magnitude response both in its passband and in its stopband. Applying the first-order frequency transformation obviously does not increase the order of the overall filter, as the first-order delayer is replaced with a first-order allpass filter. However, the implementation of each allpass filter of the polyphase LPF will now have two coefficients per second-order allpass section, instead of one as previously.
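The substitution of every delayer by a first-order allpass filter can be sketched numerically. The sketch below assumes the widely used Constantinides form of the lowpass-to-lowpass mapping, z^-1 -> (z^-1 - alpha)/(1 - alpha z^-1); the sign convention of (42) in the text may differ, and the function names and the target cutoff of 0.125 are illustrative.

```python
import cmath
import math

def llft_alpha(nu_old, nu_new):
    """Coefficient of the first-order lowpass-to-lowpass mapping (Constantinides form)."""
    return math.sin(math.pi * (nu_old - nu_new)) / math.sin(math.pi * (nu_old + nu_new))

def mapped_delay(nu, alpha):
    """Value that replaces z^-1 at the target frequency nu."""
    z1 = cmath.exp(-2j * math.pi * nu)
    return (z1 - alpha) / (1 - alpha * z1)

def prototype_frequency(nu, alpha):
    """Prototype frequency whose feature lands on the target frequency nu."""
    return -cmath.phase(mapped_delay(nu, alpha)) / (2 * math.pi)

# Move the halfband cutoff from 0.25 to an illustrative new cutoff of 0.125.
alpha = llft_alpha(0.25, 0.125)
g = mapped_delay(0.125, alpha)
```

Because the mapping is an allpass function, g stays on the unit circle, so the transformed filter takes exactly the values of the prototype response, only at shifted frequencies.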
It is clear that if the prototype polyphase filter was designed to have constrained coefficients, the coefficients of the resulting frequency transformed filter will have to be re-optimized to have comparable (preferably the same) wordlength as the prototype filter. Each of the transformed allpass blocks can be now implemented using the standard allpass Numerator-Denominator (N-D) structure as in Figure 1-20 which has two sets of delayers and calculations concentrated at the output. The LLFT is a single-point exact transformation, which means that only one feature of the prototype filter is going to be accurately mapped from the old frequency, to the new frequency,
Movement of old features of the prototype lowpass filter from their old frequency locations to the new ones can be easily calculated from the phase of the mapping filter (42), treated as the transformation function,
The shape of the transformation function is shown in Figure 1-21. Notice that the shape of the phase describes the unstable mapping function. It has to be strictly unstable, i.e. to have all its poles outside the unit circle, in order to transform the stable prototype filter into the stable target one (see Chapter 2).
The problem arising when using a one-point frequency transformation to adjust the cutoff frequency of the halfband polyphase lowpass filter is that the new cutoff is no longer in the center of the transition band as the transformation function becomes non-linear and non-symmetric against as soon as If the new cutoff frequency is smaller than it will be closer to the left bandedge, otherwise it will be closer to the right one. Therefore one must decide whether only the width of the transition band and the position of its edges are important, or also the placement of the cutoff frequency. In the first case the adjustment needs to be made to both edge frequencies of the transition band; in the second case the adjustment will be made to the cutoff frequency and the most distant edge of the transition band (the other edge will simply be closer to the cutoff than was required). It is important to notice that the frequency transformation preserves the ripple structure both in the filter passband and in its stopband, but the frequency locations of their features are shifted according to the movement of the cutoff frequency. Only DC and Nyquist features remain in the same place.
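The loss of symmetry around the cutoff can be seen from a small calculation. This sketch again assumes the Constantinides form of the first-order mapping, for which the frequency warping has the closed form tan(pi nu_target) = ((1 - alpha)/(1 + alpha)) tan(pi nu_prototype); the prototype band edges 0.20 and 0.30 and the new cutoff 0.125 are illustrative values.

```python
import math

def to_target(nu_proto, alpha):
    """Target frequency reached by a prototype feature under the first-order mapping."""
    scale = (1 - alpha) / (1 + alpha)
    return math.atan(scale * math.tan(math.pi * nu_proto)) / math.pi

nu_c = 0.125   # illustrative new cutoff, below the halfband cutoff of 0.25
alpha = math.sin(math.pi * (0.25 - nu_c)) / math.sin(math.pi * (0.25 + nu_c))

# Prototype transition band edges placed symmetrically around 0.25.
left, centre, right = (to_target(nu, alpha) for nu in (0.20, 0.25, 0.30))
print(f"left {left:.4f}, cutoff {centre:.4f}, right {right:.4f}")
```

For a new cutoff below 0.25 the mapped cutoff ends up nearer to the left band edge than to the right one, in line with the statement above.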
The example polyphase LPF was designed for the cutoff frequency the minimum stopband attenuation and the transition bandwidth of TB=0.05 (case (a) in Figure 1-22). The prototype polyphase filter required five coefficients and achieved A=105.5dB of attenuation for the transition width of The coefficients of the second-order sections transformed with LLFT given by (42) into (43) are given in Table 1-3. It can be noticed that for such a frequency transformation specification the left edge of the transition band is much closer to the cutoff frequency than the right one. For some applications such an increase of the passband, even if unwanted, may be advantageous. This, however, happens only for otherwise the stopband is closer to the cutoff frequency.
If position of both edges of the transition band is more important than position of the cutoff frequency then the following modified algorithm can be used. As the location of the cutoff frequency is then not important and the maximum allowed width of the transition band is known, the coefficient of the LLFT can be calculated from (42) and the transition bandwidth from:
Frequency is the left edge of the required transition band if the cutoff frequency is smaller than or the right edge otherwise. For the case when only the positions of the edges of the transition band are important and the filter cutoff frequency can be at any frequency between these edges, the required parameter of the frequency transformation and the transition band of the prototype filter are too difficult to determine analytically and require an iterative approach as follows:
1. Specify the new cutoff frequency and the new transition band ( is the original filter cut-off frequency).
2. Calculate coefficient from (42).
3. Inverse transform edges of the target filter transition band (indexes T- and T+ indicate left and right edges of the transition band respectively) into from (46).
4. Modify the target cutoff frequency:
5. If the modification is greater than the allowed frequency error then go to step 2.
6. Calculate the required transition band of the prototype lowpass filter:
An example polyphase LPF was designed to illustrate the method described above. It was designed for a set of specifications similar to the previous example. This time the edges of the transition band were chosen to be at The resulting magnitude response of the designed filter is shown in blue (b) in Figure 1-22. The polyphase prototype filter required the transition width of and only four coefficients, in contrast to the previous example where it required five coefficients. The method described here is the most flexible as it designs filters with arbitrary cutoff frequencies and transition bands. It does not require designing a number of lowpass filters (for different transition bands). The stopband ripples achieved are equiripple; there is no problem with spikes in the magnitude response in the stopband. The price to pay for it is very small: it only requires a slightly more complicated allpass filter structure to be implemented. Therefore this method should, in most cases of fractional-band lowpass filters, be the preferred one.
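A design driven by the transition band edges can be sketched as a one-dimensional search. The code below is not the iteration described in the steps above, whose update formulas are not reproduced here; it swaps in a plain bisection on the transformation coefficient, assuming the Constantinides form of the mapping, and the edge frequencies 0.10 and 0.15 are hypothetical.

```python
import math

def to_prototype(nu_target, alpha):
    """Prototype frequency that the first-order mapping moves to nu_target."""
    scale = (1 + alpha) / (1 - alpha)
    return math.atan(scale * math.tan(math.pi * nu_target)) / math.pi

def design_from_edges(nu_left, nu_right, tol=1e-12):
    """Bisection for alpha so that both target edges map symmetrically around 0.25."""
    lo, hi = -0.999999, 0.999999
    while hi - lo > tol:
        alpha = 0.5 * (lo + hi)
        centre = 0.5 * (to_prototype(nu_left, alpha) + to_prototype(nu_right, alpha))
        if centre < 0.25:
            lo = alpha      # mapped edges sit too low: a larger alpha is needed
        else:
            hi = alpha
    alpha = 0.5 * (lo + hi)
    tb_proto = to_prototype(nu_right, alpha) - to_prototype(nu_left, alpha)
    return alpha, tb_proto

# Hypothetical target transition band from 0.10 to 0.15 (TB = 0.05).
alpha, tb_proto = design_from_edges(0.10, 0.15)
print(f"alpha = {alpha:.5f}, prototype transition width = {tb_proto:.5f}")
```

In this illustrative case the prototype halfband filter may use a transition band wider than the 0.05 of the target, which is consistent with the observation above that fewer coefficients can be sufficient.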
1.4 Converting general IIR filter into parallel structure
Parallel realization of filtering tasks is gaining importance as the parallel processing capability of DSP engines is enhanced and advanced. However, relatively little methodology has been developed for converting direct-form IIR transfer functions to parallel equivalents. Most work (e.g. [54]-[56], [81]) approaches the task from another direction, composing the overall IIR filter as a combination of elementary IIR filters. Here a decomposition method is presented which allows any arbitrary complex IIR filter transfer function to be converted into a sum of variable-order IIR sections or first-order IIR or allpass sections [57]. Such a transformation allows parallel processing of every section in each path by a separate processing element and hence greatly increases the filter computation speed. For the general case, both real and complex filters are decomposed into parallel complex IIR filters, and real filters can also be decomposed into a set of real IIR sections. Even though the method depends on the root-finding algorithm, the decomposition algorithm is able to compensate for any loss of its accuracy during the successive iterative calculation. A general IIR filter transfer function H(z) is considered:
It can be shown that H(z) can be decomposed with no magnitude or phase distortion into an M-path parallel structure in Figure 1-23, described by (50) in which stands for the IIR filter order in the branch.
There are two ways of calculating the coefficients of the decomposed structures: the direct one and the iterative one. We refer to the latter as the “successive-separation” algorithm, since one IIR section is calculated for each successive path and the remainder of the original transfer function is then subjected to the next iteration. If the original filter numerator order (as a function of $z^{-1}$) is higher than its denominator order, we equalise the numerator and denominator orders by extracting an FIR part from the filter.
One way of dealing with the FIR part of the numerator is to cascade it with the parallel combination of IIR sections, as shown in Figure 1-24(a).
First, the FIR transfer function is calculated from the division of the filter numerator B(z) by its denominator A(z). The coefficients of the FIR section can be calculated from the inverse Z-transform of the transfer function of the filter (51); in other words, the taps of the FIR section are equal to the first coefficients of the impulse response of the filter:
The term denotes the inverse Z-transformation. Then the new IIR filter numerator B’(z) coefficients can be calculated as the remainder of the division of B(z) by A(z) of original filter transfer function (49).
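The division of B(z) by A(z) can be sketched as a power-series (long) division in z^-1. This is a minimal illustration under assumptions: coefficient lists are ordered from z^0 upwards, the number of extracted taps is taken as the difference between the numerator and denominator orders, and the example polynomials are hypothetical.

```python
def split_fir(b, a, n_taps):
    """Split B(z)/A(z) into n_taps FIR taps plus a remainder numerator.

    b and a hold the coefficients of z^0, z^-1, ... and a[0] must be non-zero.
    Returns (q, r) such that B(z) = Q(z) A(z) + z^-n_taps R(z).
    """
    work = list(b) + [0.0] * max(0, n_taps + len(a) - 1 - len(b))
    q = []
    for n in range(n_taps):
        tap = work[n] / a[0]            # next impulse-response sample
        q.append(tap)
        for i, ai in enumerate(a):      # subtract tap * z^-n * A(z)
            work[n + i] -= tap * ai
    return q, work[n_taps:]

def convolve(x, y):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

b = [1.0, 0.5, 0.25, 0.125, 0.0625]     # hypothetical fourth-order numerator
a = [1.0, -0.9, 0.2]                    # hypothetical second-order denominator
q, r = split_fir(b, a, n_taps=len(b) - len(a))

# Rebuild B(z) from Q(z) A(z) + z^-n_taps R(z) to confirm the split.
rebuilt = convolve(q, a)
rebuilt += [0.0] * (len(b) - len(rebuilt))
for i, ri in enumerate(r):
    rebuilt[len(q) + i] += ri
```

The taps in q coincide with the first samples of the impulse response, and with this choice of n_taps the remainder r has the same order as the denominator.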
Another way of incorporating the FIR part of the filter numerator is to place it into an extra parallel path, as shown in Figure 1-24(b). The FIR section can be extracted from the numerator as follows:
1. Calculate the roots of the filter numerator, for
2. Form the FIR filter transfer function (53) by recombining roots:
3. Form the new numerator, B’(z), by recombining the remaining roots of the original filter numerator as in (54):
1.4.1 IIR-to-IIR decomposition
For IIR-to-IIR decomposition coefficients of the structure can be calculated directly by comparing transfer functions of the general form of an IIR filter with the one of the decomposed structure, (49), as in Figure 1-23.
After rearranging, a set of linear equations is obtained, allowing a non-iterative way of calculating the coefficients of the decomposed structure in terms of the roots of the original filter, for i=1,..,N.
The direct method of (56) is very accurate. MATLAB simulations show the absolute error for orders up to 20 as shown in Figure 1-25. This is a value very close to the internal MATLAB eps value. Higher filter orders are subject to higher errors and this is for two reasons. The linear equation solution algorithm gives bigger errors for higher orders. But a more important source of error is the way the matrix is created. If one looks carefully, the number of calculations required to create this matrix is enormous for high order filters. It is equivalent to the total calculation time, shown in Figure 1-26.
The very long calculation time, increasing fast with filter order, and the fast decrease of accuracy are both due to the very large number of calculations required to create and solve the set of linear equations. This, together with the exponential increase of calculation time with filter order, limits the practical usefulness of the method to filter orders of up to 24. The following so-called “successive-separation” method dramatically decreases the calculation time, with less accuracy for small-order filters but better accuracy for high-order ones. Let us consider again the general IIR filter transfer function (48) with a gain factor, in front. As calculations are performed iteratively M times, the right-hand side of (57) is the outcome of separating one IIR section from the filter. In the next iteration the gain factor changes to and the constant, at now becomes M-2. This process continues until the iteration is reached.
Both and coefficients can easily be found by calculating the roots of the original filter denominator. Simplifying the bracketed expression of (57) into a single fraction and comparing numerators of both sides gives (58).
These equations are used for calculating the numerator coefficients for both the separated section and the remaining part of the transfer function from the previous iteration. This method is capable of decomposing a complex IIR transfer function into a parallel form of complex IIR sections, as well as real-to-real decomposition. For the real case it is required that the denominators of the IIR sections have real coefficients, a condition which will subsequently lead to real numerator coefficients. This property in turn limits the number of paths the original filter can be decomposed to, which equals half the number of the original IIR filter complex roots plus the number of its real roots. The “successive-separation” method is very insensitive to the accuracy of the root finding algorithm. Error introduced during the calculation of any section during the decomposition is being compensated by the remaining part of the filter and is being taken care of in the successive iterations. In the presented method the choice of K is arbitrary. More careful selection of K and a proper composition of the roots into each section may increase the accuracy of the overall decomposition.
1.4.2 First-order allpass decomposition
The direct IIR-to-Allpass decomposition can be performed in a similar way to the direct decomposition into IIR sections, by comparing the transfer function of the general IIR filter, this time with the structure of Figure 1-27(b):
After rearranging we get the set of linear equations which, as before, can be given in the following matrix form:
The special case is the decomposition into first-order IIR sections as shown in Figure 1-27(a). Certainly if the filter roots are not real, complex IIR sections will result. In such a case the matrix of equation (58) can be easily simplified. Assuming boundary conditions: and we get a closed-form set of linear equations:
where X is the vector of unknowns.
The coefficients where i = 1...N-1, are easily obtained by de-convolving (dividing the original denominator by the term It is also possible to perform exact IIR to first-order allpass decomposition, as shown in Figure 1-27(b). For this case we start from:
After cross-multiplication and algebraic manipulation of the polynomial coefficients we obtain another set of linear equations for calculating the gain factor preceding each first-order allpass section and the numerator coefficients of the remaining part of the original transfer function. Assuming boundary conditions and these linear equations with vector of unknowns, X, can be expressed as:
There is an easy way of converting the IIR structure of Figure 1-27(a) directly into the allpass form of Figure 1-27(b) that is fast and does not suffer from loss of precision. It is based on transforming a first-order IIR section into a sum of a constant and a scaled first-order allpass section:
In this case we simply compute (64) for all the IIR sections of the structure in Figure 1-27(b). At the end, all the constants are summed, leading to:
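One way to write such a conversion is sketched below. It is an illustration under assumptions rather than a reproduction of (64): the first-order section is taken as c/(1 - p z^-1), the allpass section as (z^-1 - conj(p))/(1 - p z^-1), and the (gain, pole) pairs are hypothetical.

```python
import cmath
import math

def to_allpass_form(c, p):
    """Rewrite c / (1 - p z^-1) as k0 + k1 * (z^-1 - conj(p)) / (1 - p z^-1)."""
    k0 = c / (1 - abs(p) ** 2)
    k1 = p * k0
    return k0, k1

# Hypothetical (gain, pole) pairs of two first-order sections forming a conjugate pair.
sections = [(0.3 + 0.1j, 0.5 + 0.4j), (0.3 - 0.1j, 0.5 - 0.4j)]
converted = [to_allpass_form(c, p) + (p,) for c, p in sections]

# All the constants merge into a single constant path.
constant = sum(k0 for k0, _, _ in converted)

def error_at(nu):
    """Difference between the original sum of sections and the allpass form."""
    z1 = cmath.exp(-2j * math.pi * nu)
    direct = sum(c / (1 - p * z1) for c, p in sections)
    rebuilt = constant + sum(k1 * (z1 - p.conjugate()) / (1 - p * z1)
                             for _, k1, p in converted)
    return abs(direct - rebuilt)
```

The identity follows from matching the coefficients of z^0 and z^-1 on both sides, so no root finding or equation solving is involved beyond the poles already known from the first-order sections.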
1.4.3 Example 1 converting real prototype into complex sections
First let us consider the decomposition of a multiband, asymmetric real IIR filter into the sum of complex IIR sections of equal orders. The original filter for this example was created from a low-order elliptic filter converted into an FIR of 128 coefficients and then reduced to a IIR one using Balanced Model Truncation (BMT) [58]. Decomposition algorithms were implemented in MATLAB. Results are summarised in Table 1-4, showing the absolute peak error of the coefficient calculation and of the magnitude and phase responses. The reconstruction error was calculated as the difference between the transfer function of the original filter and that of the recomposed one. The magnitude and phase errors are the peak absolute differences between the responses of the original filter and the decomposed one. Figure 1-28 shows the magnitude response of a sample filter and its associated magnitude and phase errors following decomposition into 32 branches. Note that the error is concentrated in the filter’s stopband and its transition band.
Owing to its inherent symmetry, the structure performs an equal number of calculations in every path, permitting optimised processor use. This method is also applicable to real filters, as demonstrated in Example 2. In such cases denominators were carefully chosen for the filters in each path so that they had real coefficients. It is only required to put conjugate pairs of poles into the same filter, optionally with some real ones (if there are any available). The same algorithm as mentioned above was then applied, which returned complex-valued coefficients for every branch filter. It was found that decomposition errors (reconstruction, magnitude and phase responses) are only mildly dependent on the number of paths.
1.4.4 Example 2 converting a real prototype to real sections
This example demonstrated the decomposition of the real prototype filter from previous example into equal-order sections, but this time forced to be real ones. This was achieved by cancelling all imaginary parts of calculated branch filter numerator coefficients. Peak error values resulting from the performed calculations are summarised in Table 1-5 and the full-band error profiles shown in Figure 1-29.
For the real decomposition case it is essential to pair (recombine) conjugate roots of the prototype filter into each section of the decomposed structure in order to achieve small errors of the overall decomposition. Otherwise imaginary coefficient parts in the calculated sections become too big to be disregarded.
1.4.5 Summary
The parallel structure with several sections in each path, which we have targeted to test our decomposition approach, has a number of advantages for high-speed applications. It requires the same number of multiplications (2N+1) and summations 2N, as well as memory locations 2N, as the direct IIR filter implementation. The memory requirement for storing old samples is less for the decomposed structure, (2N-M) in comparison to 2N for the direct IIR implementation. The real advantage of the parallel structure comes from the fact that it naturally lends itself to performing all the calculations in parallel, which is ideal for a multiprocessor environment. For M paths the filtering will be performed M times faster. The error introduced through our decomposition, as can be seen in Table 1-4 and Table 1-5, is very small. For filter orders up to 128, the peak decomposition errors were less than However, it should be noted that the error performance is highly dependent on the filter specification and the number of paths, and can be minimised by careful grouping of poles in each path. The calculation speed is determined by the speed of the algorithm solving the linear equations and the number of subfilters into which the original filter is decomposed. For the M-path decomposition the linear equations are solved (M-1) times, and as every new subfilter is calculated the size of the matrices involved in each calculation becomes smaller. The direct “Gauss’ elimination method with the selection of basic elements” and the “iterative Gauss-Seidel” one [59] were used in the experiments. For the first one the calculation time required to solve an set of linear equations is about 7 times longer than for an one. For iterative methods like the second one the calculation time depends not only on the order of the set of linear equations, but also on the number of iterations (dependent on the required accuracy). The accuracy certainly also depends on the filter being decomposed, namely on the location of its poles, which can heavily influence the accuracy of calculations. For the decomposition of the transfer function of a real filter prototype it is important to properly pair the conjugate roots of the filter denominator (in other words, recombine them into second-order components) in order to eliminate the imaginary part from the calculation within the algorithm. This has been pointed out in the second example. The presented method is similar to the well-known Partial Fraction Expansion (PFE) [81]. The difference is that the PFE calculates sections with numerator order one less than their denominator order, whereas the method presented in this chapter calculates sections having the numerator order equal to the denominator order.
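For comparison, the standard partial fraction expansion mentioned above can be sketched in a few lines. This is the textbook PFE, not the method of this chapter; it assumes distinct non-zero poles and a numerator order lower than the denominator order, and the pole set and numerator are hypothetical.

```python
import cmath
import math

def polyval_zinv(coeffs, z1):
    """Evaluate c0 + c1 z^-1 + c2 z^-2 + ... at a given value of z^-1."""
    return sum(c * z1 ** n for n, c in enumerate(coeffs))

def pfe_residues(b, poles):
    """Residues r_i with B(z) / prod(1 - p_i z^-1) = sum of r_i / (1 - p_i z^-1)."""
    residues = []
    for i, p in enumerate(poles):
        denom = 1.0
        for j, q in enumerate(poles):
            if j != i:
                denom *= 1 - q / p
        residues.append(polyval_zinv(b, 1 / p) / denom)
    return residues

poles = [0.5 + 0.4j, 0.5 - 0.4j, -0.3]     # hypothetical pole set
b = [1.0, 0.2, 0.1]                        # hypothetical numerator, order 2 < 3
res = pfe_residues(b, poles)

def mismatch(nu):
    """Difference between the direct form and the sum of first-order sections."""
    z1 = cmath.exp(-2j * math.pi * nu)
    direct = polyval_zinv(b, z1)
    for p in poles:
        direct /= 1 - p * z1
    return abs(direct - sum(r / (1 - p * z1) for r, p in zip(res, poles)))
```

Each PFE section here has a constant numerator over a first-order denominator, i.e. a numerator order one less than the denominator order, which is the difference from the equal-order sections noted above.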
2. MULTIRATE FILTERS IN DATA CONVERTERS
Before physical signals can be processed digitally they have to be converted from analog to digital form by means of an analog-to-digital (A/D) converter. Unfortunately this conversion introduces distortion. The stringent requirements that the modern market imposes on A/D conversion became achievable only with the advent of fast and complex integrated circuits. There is a growing demand for A/D converters of more than 13 bits, which cannot be designed solely in analog because the component matching required for a large number of bits is extremely difficult to achieve and sustain. One of the many alternatives that allows very high bit lengths to be achieved is presented here: “A/D conversion via the intermediate stage of 1-bit coding, at a very high sampling rate” [11]. One-bit coding is achieved by a Sigma-Delta modulator stage, which is followed by a decimator in which the sampling rate is decreased and the magnitude resolution is increased. This architecture primarily covers audio band applications and is especially attractive for VLSI implementation. In this section we first look at the requirements to be met by the Nyquist and the oversampling A/D converters in digital signal processing applications and at the arguments in favor of the intermediate stage of 1-bit coding. Then the typical Sigma-Delta modulator is presented. The quantization noise shaping performed by Sigma-Delta modulators and the effect of the noise on the resolution of the A/D converter are also presented.
2.1 Nyquist-rate converters versus oversampling ones
The function of an A/D converter is to convert a continuous-time analog input signal x(t) into a sequence of digital codes y(k). Generally, A/D converters can be divided into two groups according to the sampling rate: the so-called Nyquist-rate converters and the oversampled ones (oversampling PCM and oversampling Sigma-Delta converters) [11]. Block diagrams of the Nyquist converter and of both types of oversampled converter are given in Figure 1-30.
The conventional Nyquist-rate converters, called Nyquist Rate Pulse Code Modulation (PCM) converters, sample the analog signal at its Nyquist frequency or a little above. Examples are flash converters, sub-ranging/pipelined converters and successive approximation ADCs [12]. All converters working at the Nyquist rate suffer from the need to band-limit the incoming analog signal in accordance with the Nyquist criterion. For the Nyquist converter the sampling frequency is required to be:
$$f_s \ge f_N = 2 f_{max} = 2B$$
where $B$ is the input signal bandwidth, $f_{max}$ is the input signal maximum frequency and $f_N$ is the input signal Nyquist frequency.
Therefore Nyquist converters must be preceded by a continuous-time analog Anti-Aliasing (AA) filter, which band-limits the input signal to the bandwidth B, that is, to half the sampling rate. As can be seen from Figure 1-30(a), the AA filter is the first analog processing element of the conversion chain. It is required to have simultaneously a very sharp transition band, small passband ripples and high stopband attenuation. Achieving and maintaining such stringent requirements for an analog AA filter is very difficult, if not impossible, even for moderate rates and a middle range of resolution (13-16 bits). The second block of the Nyquist converter is the sampler. It is followed by the amplitude quantizer, which limits the number of different sample values so that each sample can be expressed by a finite binary code y(k). Most Nyquist A/D converters perform the conversion in a single sampling interval and at full precision. The amplitude quantization is based on comparing the sampled analog input with a set of reference voltages, which are usually generated inside the Nyquist converter. The resolution is determined by the number of reference levels that can be resolved. For high-resolution Nyquist A/D converters, establishing the reference voltages is difficult. For example, a standard 16-bit A/D converter, very common in audio applications, requires $2^{16} = 65536$ different reference levels. For a 2V converter input range (the maximum output level V fixed to ±1V), the spacing between levels is as small as $2\,\mathrm{V}/2^{16} \approx 30.5\,\mu\mathrm{V}$. The tolerance of VLSI technology exceeds this value many times. There are techniques (like laser trimming or self-calibration) that can be employed to extend the resolution of the Nyquist converter beyond the component tolerances, but these approaches result in additional fabrication complexity and increased circuit area [12]. Analysis of the quantizer, a non-linear element, is very difficult, and many models and approaches have been developed over the years.
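The reference-level arithmetic for the 16-bit example can be checked with a few lines of Python. This is only a sketch: the function name `reference_levels` is hypothetical, and an ideal uniform quantizer spanning the full 2V input range is assumed.

```python
def reference_levels(bits: int, full_range_volts: float):
    """Return (number of levels, spacing in volts) of an ideal b-bit uniform quantizer."""
    levels = 2 ** bits
    spacing = full_range_volts / levels
    return levels, spacing

# 16-bit converter with a 2 V input range (output fixed to +/-1 V)
levels, spacing = reference_levels(16, 2.0)
print(levels, "levels, spacing of", spacing * 1e6, "microvolts")
```

The spacing of about 30.5 microvolts is what the component tolerances of the fabrication process would have to resolve.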
The simplest and most commonly used is the linear additive noise model. It models the quantization error as additive noise with the power (variance) given in [13]. If N is made large, then the quantization noise power can be approximated by $\sigma_e^2 = \Delta^2/12$, where $\Delta$ is the quantizer step size. The quantization noise power for the Nyquist converter is assumed constant and white, uniformly distributed over the range of frequencies from -B to B (see Figure 1-31). A good measure of the performance of the A/D converter is the Signal to quantization Noise Ratio (SNR), which for the input power $\sigma_x^2$ is [12],[14]:
$$SNR = 10\log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right)\ \mathrm{[dB]}$$
The way to solve the resolution problems of the conventional PCM A/D converter is to oversample well above the Nyquist requirement, in connection with an averaging lowpass filter and a downsampler (decimator) as in Figure 1-30(b). The price paid for the increased resolution is the complex and costly decimator. Oversampled converters sample at a much higher rate than required by the Nyquist criterion:
$$f_s = M f_N = 2MB$$
The integer ratio $M = f_s/f_N$ is called the oversampling ratio (the downsampling ratio from the decimator point of view). Its value ranges from several tens to several thousands, depending upon the application [15]. Both converters require the analog input signal x(t) to be first filtered by the analog anti-aliasing filter. The aim of this filter is to truncate the bandwidth of x(t) to within half the sampling rate, thus preventing aliasing distortion. The filter transition band is required to be very sharp for Nyquist converters and can be much wider for the oversampling ones. This is because the filter must not affect the input signal within the baseband bandwidth of B. As the complexity of the filter is a strong function of the ratio of the transition bandwidth to the width of the passband, oversampling converters require considerably simpler anti-aliasing filters than Nyquist ones with similar performance characteristics. When analyzing the quantization noise power of the oversampled A/D converter, the same additive noise model can again be used for the quantizer. The total noise power added to the input by the quantizer is then $\sigma_e^2 = \Delta^2/12$, which is the same constant as before. If this noise power is constant, white and uniformly distributed over the whole range of frequencies from -MB to MB, then the effect of oversampling the conventional converter is to spread the constant, uniformly distributed quantization noise power over a wider range of frequencies. This effectively increases the resolution of the converter, as it reduces the inband noise power. The quantization noise Power Spectral Density (PSD) of the oversampled converter is shown in Figure 1-31. As the noise is now spread over a range M times wider than for the conventional Nyquist converter, the SNR should be measured over the range -B to B, where only the fraction 1/M of the noise power remains. The SNR becomes [14]:
$$SNR = 10\log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2/M}\right) = 10\log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right) + 10\log_{10}(M)\ \mathrm{[dB]}$$
where $\sigma_x^2$ is the input signal power and $\sigma_e^2$ is the total quantization noise power.
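As a quick numerical check of the spreading argument, the sketch below evaluates the SNR improvement obtained by oversampling alone (no noise shaping). It assumes the white-noise model used in the text, so that only 1/M of the noise power stays in the baseband, and the usual figure of about 6.02 dB per bit; the function names are hypothetical.

```python
import math

def oversampling_gain_db(M: int) -> float:
    """SNR gain when white quantization noise is spread over an M times wider band."""
    return 10.0 * math.log10(M)

def extra_bits(M: int) -> float:
    """Equivalent resolution gain, at 20*log10(2) dB (about 6.02 dB) per bit."""
    return oversampling_gain_db(M) / (20.0 * math.log10(2.0))

for M in (4, 64, 256):
    print(M, round(oversampling_gain_db(M), 2), "dB,", round(extra_bits(M), 2), "bits")
```

Each quadrupling of the sampling rate buys only one extra bit, which is why plain oversampling is combined with noise shaping in the converters discussed next.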
A variation of the oversampled converter is the Sigma-Delta ADC. The name Sigma-Delta ADC is often used to describe the class of oversampled converters that employ a single-bit quantizer with a noise shaping loop, coupled with a decimator as in Figure 1-30(c). Such an ADC is also known as a Pulse Density Modulation (PDM) ADC, since at every instant the relative density of the single-bit pulses tracks the input amplitude. In Sigma-Delta ADCs the amplitude resolution, sacrificed by the crude quantization, is traded for temporal resolution. This gives a way to overcome the limitations of VLSI processing technology by removing the stringent requirements from the analog circuits. The problem does not disappear through oversampling and noise shaping, though. It is only shifted to the digital domain, namely to the decimation filter. The advantage is that in the digital domain the problem can be solved much more easily to the required degree of accuracy, but at the expense of speed, area and power consumption [11]. As shown in Figure 1-25 the main difference between the oversampled PCM ADC and the PDM ADC is the insertion of the loop filter H(z) and a one-bit quantizer with a negative feedback loop to shape the large quantization noise.
The low-resolution quantization is first provided by a Sigma-Delta modulator operating at the high rate. Then the decimator decreases the sample rate but increases the resolution in amplitude. The aim of the modulator is to produce a one-bit signal whose average tracks the analog input signal. The concept of a first-order Sigma-Delta modulator [15]-[16] is explained in Figure 1-32. It consists of a differential node, an integrator and a threshold (one-bit A/D converter) in the forward path, and a clipper (one-bit D/A converter) in the feedback loop. The modulator output, y(k), is a one-bit digital signal which can be converted by digital methods alone into a b-bit PCM signal by means of a decimating filter [11]. The input to the integrator is the difference between the input signal x(t) and the output value y(k) converted back to an analog signal q(t). Provided that the D/A converter is perfect, this difference x(t)-q(t) at the integrator input is equal to the negative quantization error. This error is summed in the integrator and then quantized again by the threshold. Its output is:
$$y(k) = \frac{\Delta}{2}\,\mathrm{sgn}\big(u(k)\big)$$
Here $\Delta$ is the quantizer step size and u(k) is the input to the quantizer. In each clock cycle the value of the output y(k) of the modulator is either $+\Delta/2$ or $-\Delta/2$. The larger the quantization error, the longer the integrator output u(t) keeps the same sign and the longer the output pulse is. When the sinusoidal input to the modulator is close to plus full scale, the output is positive during most clock cycles. A similar statement holds true for the case when the sinusoid is close to minus full scale. For inputs near zero the modulator output varies rapidly between the $+\Delta/2$ and $-\Delta/2$ values, with a mean value approximately equal to zero. The consequence of the negative feedback loop surrounding the integrator and the threshold is that the local average of the analog quantized signal q(t), and therefore the local average of the modulator output y(k), tracks the analog input signal x(t). The amplitude resolution of the whole A/D converter increases when more samples are included in the local averaging process. The bandwidth is decreased at the same time. Consequently, the resolution of an oversampled A/D converter is a function of the oversampling ratio M. This means that the resolution of oversampling A/D converters can be improved without increasing the number of levels of the threshold [15]. In many applications, like digital audio, the bandwidth of the signal is small compared to the typical speed of operations that can be performed in modern VLSI circuits. In these cases, oversampling A/D converters offer the possibility of exchanging this excess in speed for higher resolution in magnitude. Oversampling structures do not depend on component matching as much as Nyquist converters do in order to achieve high resolution. This is an important advantage for monolithic VLSI implementation [11].
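The tracking behaviour described above can be illustrated with a minimal discrete-time sketch of a first-order modulator. It is not the circuit of Figure 1-32: quantizer levels of ±1, a zero initial integrator state and a constant input are assumptions made here, and the function name is hypothetical.

```python
def first_order_sigma_delta(samples):
    """One-bit output stream of a first-order modulator with levels +/-1."""
    u = 0.0        # integrator state
    y_prev = 0.0   # output of the one-bit D/A converter in the feedback loop
    out = []
    for x in samples:
        u += x - y_prev                  # integrate the difference x - q
        y = 1.0 if u >= 0.0 else -1.0    # threshold (one-bit quantizer)
        out.append(y)
        y_prev = y
    return out

dc = 0.3                                 # constant input, well inside full scale
bits = first_order_sigma_delta([dc] * 4096)
mean = sum(bits) / len(bits)
print("input", dc, "mean of one-bit output", round(mean, 4))
```

Although every output sample is either +1 or -1, the average over many samples approaches the input value, and the approximation improves as more samples are averaged.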
2.2 Quantization noise in lowpass Sigma-Delta modulators
Although the quantization error is large in the oversampled A/D converter because a low-resolution quantizer is used, the Sigma-Delta modulator reduces the quantization noise within the signal baseband. The reduction is achieved through the use of feedback and sampling at a rate higher than the Nyquist frequency. Two general approaches can be used to achieve this goal: prediction and noise shaping [15]. Predictive modulators spectrally shape the quantization noise as well as the signal, so they are not so useful for precise A/D converters. Noise shaping modulators, known as Sigma-Delta modulators, simply shape the noise spectrum without significantly altering the signal spectrum. They do not reduce the magnitude of the quantization noise, but instead shape the power density spectrum of this error, moving the energy towards high frequencies. Provided that the analog input to the quantizer is oversampled, the high-frequency quantization noise can be eliminated with a digital lowpass filter without affecting the signal band. Noise shaping modulators are especially well suited to signal acquisition applications because they encode the signal itself and their performance is insensitive to the slope of the signal. The shape of the modulator noise power density function must be found in order to design a proper digital lowpass filter for the decimation process. It is derived using the discrete model of the modulator shown in Figure 1-33.
The discrete model was obtained by replacing the elements of the analog model presented in Figure 1-32 with their discrete equivalents. A simple digital integrator with the transfer function $z^{-1}/(1-z^{-1})$ replaced the analog integrator. In order to simplify the analysis of the modulator, the linear model of the quantizer was assumed, which allowed the clipper to be ignored. Also, the quantizer levels were assigned the values of ±1. Under these assumptions the following difference equation is easily found:
$$y(k) = x(k-1) + e(k) - e(k-1) \qquad (71)$$
The equation (71) has an obvious equivalent in the z-domain:
$$Y(z) = z^{-1}X(z) + (1 - z^{-1})\,E(z) \qquad (73)$$
Equation (73) shows that the input x(k) simply goes straight through the modulator, delayed by only one sample. The threshold error e(k) would be seen at the quantizer output if there were no feedback loop. Due to the feedback, the first-order difference e(k)-e(k-1) appears at the modulator output instead. This difference operation attenuates the quantization noise at low frequencies, thus shaping the noise with the function $(1 - z^{-1})$. The noise shaping function is the inverse of the transfer function of the filter in the forward path of the modulator. In the choice of the filter, care must be taken to ensure the stability of the whole circuit. For the model being analyzed the shaping function originates from the integrator. Modulators with more than one integrator in the forward path, such as the second-order system (two integrators cascaded), perform a higher-order difference operation on the error produced by the quantizer and thus deliver stronger attenuation at low frequencies. In general, the power of the function $(1 - z^{-1})$ is equal to the number of integrators in the forward path (the modulator order).
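To see how strongly the difference operation suppresses low-frequency noise, the following sketch evaluates the magnitude of the L-th power of (1 - z^-1) on the unit circle. The closed form (2 sin(pi f/f_s))^L used in the comment is the standard identity for this expression; the function name and the chosen frequencies are illustrative assumptions.

```python
import cmath
import math

def ntf_magnitude(f_norm: float, order: int) -> float:
    """|(1 - z^-1)^L| at z = exp(j*2*pi*f_norm), with f_norm = f / f_s."""
    z_inv = cmath.exp(-2j * math.pi * f_norm)
    return abs((1 - z_inv) ** order)   # equals (2*sin(pi*f_norm))**order

baseband_edge = 1.0 / 512.0            # f = f_s / 512, i.e. M = 256
for order in (1, 2, 3):
    low = ntf_magnitude(baseband_edge, order)
    high = ntf_magnitude(0.5, order)
    print(order, "gain at band edge:", low, "gain at f_s/2:", high)
```

The gain at the baseband edge falls rapidly with the order, while the gain at half the sampling frequency grows as 2^L, which is the energy the decimation filter has to remove.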
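Bennett [13] proved that the error generated by a quantizer with quantization levels equally spaced by $\Delta$ has the power $\sigma_e^2 = \Delta^2/12$ uniformly distributed in the frequency range $\pm f_s/2$.
For a Sigma-Delta modulator with L integrators in the forward path, the noise shaping function is an L-th order difference described by:
$$H_N(z) = (1 - z^{-1})^L$$
Hence the spectral distribution $N(z)$ of the quantization noise e(nT) is [15]:
$$N(z) = E(z)\,(1 - z^{-1})^L$$
Evaluating the last equation on the unit circle, $z = e^{j2\pi f/f_s}$, we obtain the expression:
$$|N(f)|^2 = \frac{\Delta^2}{12 f_s}\left[2\sin\left(\frac{\pi f}{f_s}\right)\right]^{2L} \qquad (77)$$
Equation (77) describes the theoretical noise power density function for an L-th order Sigma-Delta modulator, shown in Figure 1-34 for some simple example modulators.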
It is interesting to compare the dynamic range (DR) of the ideal Sigma-Delta modulator with that of the Nyquist-rate uniform quantizer. The inband quantization noise is calculated by integrating the noise power density (77) over the baseband:
$$\sigma_n^2 \approx \frac{\Delta^2}{12}\,\frac{\pi^{2L}}{(2L+1)\,M^{2L+1}} \qquad (78)$$
Factor $M = f_s/(2B)$ is the oversampling ratio. The useful SNR, or dynamic range, of an A/D converter with sinusoidal inputs is defined as the ratio of the output power at the frequency of the input sinusoid for a full-scale input to the output signal power for a small input for which the SNR is unity (0dB). For an ideal Sigma-Delta modulator the dynamic range follows from equations (71) and (78):
$$DR = \frac{3}{2}\,\frac{2L+1}{\pi^{2L}}\,M^{2L+1}$$
The dynamic range of a Nyquist-rate uniform quantizer with b bits is $DR = 3\cdot 2^{2b-1}$ [15]-[16]. Comparing both relations gives the maximum bit resolution of an L-th order Sigma-Delta modulator for a given oversampling ratio. Results are shown in Figure 1-35.
It is clear that the first-order Sigma-Delta modulator can cater practically for only up to 12-14 bit resolution A/D converters, whereas the second and the third-order modulators have the potential of 20-bit or even higher resolution. The term practically means the oversampling ratio less or equal (higher ones are seldom used). First-order modulators are therefore not attractive for high-resolution applications, unless very low signal bandwidth is required. A further disadvantage of first-order systems is that the quantization noise from the first-order modulator is highly correlated with the input signal [17]. Their main merit is absolute stability over the full analog input range. There are some techniques which increase the resolution of Sigma-Delta modulators, like multi-bit A/D and D/A converters in the loop and an appropriate cascade of first-order modulators. Oversampling Sigma-Delta modulators offer the potential of about 15-dB reduction in baseband quantization noise for every doubling of the sampling frequency without a need for tight component matching [15].
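The comparison between modulator order, oversampling ratio and achievable resolution can be sketched numerically. The formulas below are the standard linear-model results for an ideal L-th order modulator with a full-scale sinusoidal input, DR = (3/2)(2L+1)M^(2L+1)/pi^(2L), and for a b-bit Nyquist-rate quantizer, DR = 3·2^(2b-1); they are assumed here rather than taken verbatim from the text, and all function names are hypothetical.

```python
import math

def dynamic_range_db(order: int, M: int) -> float:
    """Dynamic range of an ideal order-L modulator (linear white-noise model)."""
    dr = 1.5 * (2 * order + 1) / math.pi ** (2 * order) * M ** (2 * order + 1)
    return 10.0 * math.log10(dr)

def equivalent_bits(dr_db: float) -> float:
    """Invert DR = 3 * 2**(2b - 1), i.e. DR_dB = 6.02*b + 1.76."""
    return (dr_db - 10.0 * math.log10(1.5)) / (20.0 * math.log10(2.0))

def gain_per_octave_db(order: int) -> float:
    """Noise reduction for each doubling of M: 10*log10(2**(2L+1))."""
    return 10.0 * math.log10(2 ** (2 * order + 1))

for order in (1, 2, 3):
    dr = dynamic_range_db(order, 256)
    print(order, round(dr, 1), "dB,", round(equivalent_bits(dr), 1), "bits,",
          round(gain_per_octave_db(order), 1), "dB per octave")
```

Under these assumptions the per-octave figure is about 9 dB for a first-order loop, 15 dB for a second-order loop and 21 dB for a third-order loop, so the 15-dB value quoted above corresponds to the second-order case.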
2.3 Decimation filter for lowpass Sigma-Delta A/D converters
This section details a design technique for high-fidelity multistage decimation filters based on the polyphase and decimator structures presented in [4] and [5], catering for power-of-two sample-rate decreases. The technique is well suited for Sigma-Delta Analog-to-Digital Converter (ADC) applications in excess of 16-bit resolution. The resulting filter coefficients are constrained to the required bit length using a “bit flipping algorithm” [18]. The technique is presented through an example of a cascaded decimation filter designed for a 20-bit resolution Sigma-Delta ADC, and compares favorably with other approximation methods [19]-[21], which incorporate a MAVR decimation stage followed by FIR compensation. The coefficients and frequency responses of the cascaded filter are also reported.
2.3.1 Introduction
With the advent of high-speed and high-precision integrated instrumentation, control and Digital Signal Processing (DSP), the demand for higher-resolution ADCs is on the increase. The need to achieve resolutions in excess of 16 bits monolithically has popularized and re-emphasized the scope offered by Sigma-Delta modulators, in conjunction with digital decimation filters, in overcoming the limitations inherent in the conventional analog techniques. The quantization noise generated by the Sigma-Delta modulator is pushed (shaped) out of the baseband and concentrated at high frequencies. It is then filtered out by the decimation filter, which has extremely stringent amplitude (and sometimes phase/group delay) response attributes. A decimation filter for a 20-bit resolution ADC employing a third-order modulator with an oversampling ratio of M=256 must have overall passband ripples less than and noise power level below -131.2dB. Furthermore, for efficient physical realization, the filter coefficients of at least the first few stages of the decimator (operating at the highest sampling rates) should be constrained by design to have as few bits as possible. The cascaded polyphase two-path halfband IIR filter lends itself to meeting the above requirements with the minimum of implementation complexity (only one coefficient per second-order stage), as well as exhibiting low sensitivity to coefficient value variations. This attribute makes it possible to home in on short coefficient wordlengths. A PC-based design environment developed in [18] has successfully been used in the design of efficiently implementable binary constrained coefficient Sigma-Delta ADC decimation filters for varying bit resolution requirements. The following sections present details of the bit-flipping design algorithm and its associated parameters through the aforementioned decimation filter example, providing comparative results for truncated, rounded and bit-flipped coefficient filters. For clarity, the basics of noise power calculation for Sigma-Delta modulators and the concept of multistage decimation are also presented.
2.3.2 Design of the multistage cascaded decimation filter
The first important decision in the design of a Sigma-Delta based ADC is the choice of the modulator. The quantization noise power generated by the two-level quantizer is shaped by the modulator loop filter to exhibit a transfer function which is the inverse of the loop filter (Figure 1-34). In order to estimate the required modulator order, the baseband quantization noise power as in (78), with the error generated by a quantizer with its levels equal to ±1, is compared to the maximum allowed ADC inband noise power, $2^{-2b}/12$ (where b is the required ADC resolution). This gives the relation between the modulator order, L, and the required oversampling ratio, M, for the given bit resolution, b [15], [18].
The function $\lceil x \rceil$ in (80) stands for the smallest integer greater than or equal to x. If (80) is used for the 20-bit ADC example, a third-order modulator with M=256 is estimated. Having determined the modulator order and the oversampling ratio, the decimation filter can be designed. Since two-path halfband lowpass filters are being used, the decimation filter should be designed as a cascade of $\log_2 M$ stages, each decimating by two, as shown in Figure 1-36.
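The order estimate can be reproduced with a short search. This is a sketch under stated assumptions, not the closed-form relation (80): the baseband noise uses the standard linear-model expression for an ideal order-L loop with quantizer levels of ±1, and the noise budget is taken as 2^(-2b)/12, a choice made here because it reproduces the -131.2dB level quoted for b=20. All function names are hypothetical.

```python
import math

def inband_noise_power(order: int, M: int) -> float:
    """Baseband noise of an ideal order-L modulator with quantizer levels +/-1."""
    sigma_e2 = 2.0 ** 2 / 12.0   # step of 2 between the two levels
    return sigma_e2 * math.pi ** (2 * order) / ((2 * order + 1) * M ** (2 * order + 1))

def allowed_noise_power(bits: int) -> float:
    """Assumed noise budget of a b-bit converter: (2**-b)**2 / 12."""
    return 2.0 ** (-2 * bits) / 12.0

def minimum_order(bits: int, M: int, max_order: int = 8) -> int:
    """Smallest modulator order whose baseband noise fits within the budget."""
    for order in range(1, max_order + 1):
        if inband_noise_power(order, M) <= allowed_noise_power(bits):
            return order
    raise ValueError("no order up to max_order meets the noise budget")

M = 256
order = minimum_order(20, M)
stages = M.bit_length() - 1      # log2(M) decimate-by-two stages for a power-of-two M
print("order:", order, "stages:", stages,
      "budget:", round(10 * math.log10(allowed_noise_power(20)), 1), "dB")
```

For b=20 and M=256 the search returns a third-order modulator followed by eight decimate-by-two stages, in line with the example discussed in the text.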
Allpass filters in both branches of the structure have an equal number of coefficients in order to optimize the use of processors that perform the calculations in both branches of the filter. The choice of the N-D allpass filter structure was made considering that, for input signals of any frequency limited in amplitude to unity, the internal results of the multiplications and summations do not exceed two (see Chapter 3 for more details). Both the modulator and the filter must satisfy the overall ADC requirements. The output quantization noise power is the sum of the noise introduced by the modulator into the signal baseband and the high-frequency noise aliased during decimation. The total baseband passband ripple, assuming that no distortion is introduced by the modulator, is the sum of the passband ripples of the lowpass filters of all the stages and of the noise spectrum aliased into the signal band. For the multistage decimation filter, decreasing the sampling rate by two at each stage, both the output quantization noise power and the decimation filter passband ripples can be calculated at the end of each stage using the “equivalent lowpass transfer function”:
For a small number of decimation filter stages, the only significant noise aliasing into the baseband originates from the modulator noise spectrum, at frequencies where only one lowpass filter has its stopband replica. Then the filter passband ripples and the noise power at the end of the stage can be computed from:
The stage-to-stage aliasing in the multi-stage filter cascade is presented in Figure 1-37 for the case of the three-stage decimation filter. The contribution of each stage filter towards the overall transfer function is shown.
The conclusion is that it is possible to design and optimize each filter stage to its required specification independently from the successive stages. In the design of the overall filter one can assign equal noise power alias contribution into the baseband for stages following the current one and allow the passband magnitude response ripples to increase by the same amounts from stage to stage, including the current stage. As the sampling rate is being decreased by two at each stage the required transition bandwidth, TB is smaller at each new stage down to zero for the final one. The transition band of the last stage lowpass filter determines an overall filter bandwidth.
As can be seen from (19), for the given design, T and N are fixed and hence the attenuation is also fixed. In most cases the attenuation is much larger than required. This means that the requirements for the lowpass filter attenuation of the next decimation stages may be relaxed and can be satisfied with fewer coefficients. The halfband filters in each stage are designed to have the maximum allowed transition band TB in accordance with (84). This approach maximizes the stopband attenuation, which reduces the inband noise power to a level below that specified. As the design constraint was to produce binary constrained coefficients capable of satisfying the given decimator specification, a “bit-flipping” algorithm has been developed [15]. This algorithm is explained in detail in Chapter 3. In most cases the bit constraining approach results in reduced stopband attenuation in comparison to the floating-point case. This also makes it possible to relax the requirements for all following stages and subsequently to decrease filter complexity, which is important for physical implementation [22].
2.3.3 Application of the “bit-flipping algorithm”
The “bit-flipping” algorithm has been designed for optimum coefficient bit constraining. It finds w-bit long coefficients that steer the filter response close to the floating-point equiripple case without the necessity of checking all the $2^{wN}$ possibilities (where N is the number of filter coefficients). It should be noted that, for very short bit-lengths, there may be no satisfactory optimization result, in which case the number of coefficients must be increased and the algorithm launched again. The algorithm starts with the floating-point coefficients delivered by the elliptic approximation [5]. The main philosophy of the approach is a structured exhaustive search of the possible bit patterns that improve the filter frequency response, starting from the least significant bits of the fixed-point coefficients. The optimization process starts with the first stage filter in the decimator and proceeds sequentially until the last stage is reached. If, at the end of any one stage’s optimization, the performance of that filter is better than required, this fact is used to advantage by relaxing the specification of all the following stages, hence opening the possibility of reduced implementation complexity. The “bit-flipping” approach delivers more efficient filters for a given wordlength than crude truncation or rounding of the elliptic filter coefficients. A comparison of all three is presented in Figure 1-38 for a filter specification having transition band TB=0.125 (normalized frequency), number of coefficients N=7 and coefficient wordlength w=10 bits. As can be seen in the figure, the bit-flipping algorithm is superior to the standard truncated and rounded cases, resulting in the highest stopband attenuation and the smallest passband ripples outside the required transition band.
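The idea of searching near the rounded coefficients can be sketched as follows. This is not the algorithm of Chapter 3: a simple greedy search that moves one coefficient by one least significant bit at a time is substituted for the structured bit-pattern search, the cost is taken to be the peak stopband magnitude only, and the two-path halfband structure with sections (a + z^-2)/(1 + a z^-2) as well as the two floating-point coefficients are illustrative assumptions.

```python
import cmath
import math

def allpass_branch(coeffs, w):
    """Product of sections (a + z^-2) / (1 + a*z^-2) evaluated at z = exp(jw)."""
    z2 = cmath.exp(-2j * w)
    out = 1.0 + 0j
    for a in coeffs:
        out *= (a + z2) / (1 + a * z2)
    return out

def peak_stopband(c0, c1, tb=0.125, points=200):
    """Peak |H| over the stopband of H(z) = [A0(z^2) + z^-1 * A1(z^2)] / 2."""
    start = 2 * math.pi * (0.25 + tb / 2)   # stopband edge for f_s = 1
    peak = 0.0
    for i in range(points):
        w = start + (math.pi - start) * i / (points - 1)
        h = 0.5 * (allpass_branch(c0, w) + cmath.exp(-1j * w) * allpass_branch(c1, w))
        peak = max(peak, abs(h))
    return peak

def lsb_search(c0, c1, wordlength):
    """Greedy search over w-bit coefficients, starting from the rounded values."""
    scale = 2 ** wordlength
    q = [round(a * scale) for a in list(c0) + list(c1)]
    n0 = len(c0)

    def cost(v):
        return peak_stopband([x / scale for x in v[:n0]], [x / scale for x in v[n0:]])

    best = cost(q)
    improved = True
    while improved:                          # stop when no single-LSB move helps
        improved = False
        for i in range(len(q)):
            for step in (-1, 1):
                trial = list(q)
                trial[i] += step
                if 0 < trial[i] < scale:     # keep every coefficient inside (0, 1)
                    c = cost(trial)
                    if c < best:
                        q, best, improved = trial, c, True
    return [x / scale for x in q], best

float_c0, float_c1 = [0.14], [0.59]          # illustrative values, not from the text
w_bits = 6
rounded_cost = peak_stopband([round(a * 2 ** w_bits) / 2 ** w_bits for a in float_c0],
                             [round(a * 2 ** w_bits) / 2 ** w_bits for a in float_c1])
coeffs, searched_cost = lsb_search(float_c0, float_c1, w_bits)
print("rounded:", 20 * math.log10(rounded_cost), "dB; searched:",
      20 * math.log10(searched_cost), "dB; coefficients:", coeffs)
```

Because the search only accepts moves that lower the cost, its result is never worse than plain rounding for the chosen cost function; a practical design would also constrain the passband ripple.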
2.3.4 A 20-bit decimator example
An example eight-stage polyphase IIR cascaded decimation filter was designed for the 20-bit Analog-to-Digital converter. Magnitude responses of the halfband filters with their associated bit-flipped optimized coefficients for each stage of the decimation are shown in Figure 1-39 and Figure 1-40.
Equivalent lowpass decimation filter magnitude response as given by the floating point COMDISCO SPW (now ALTA) [23] simulation is given in Figure 1-41. Note that, in order to achieve the appropriate attenuation when using short coefficient wordlength (down to four bits), a cascade of two lowpass filters is employed in each stage.
The presented multistage decimation filter performs within the 20-bit fidelity specifications for up to 90% of the input signal bandwidth. Furthermore, the quantization noise power has a margin of 7.6dB below the required value as well as having an overall passband ripple, which is below For audio applications where passband ripple requirements are not very stringent, the last two stages of a decimation filter can be simplified to have two or four coefficients only. The filters were designed with an even number of coefficients for symmetry reasons. One must be aware that, due to the allpass IIR filters used in the two-path structure, the overall filter has non-linear phase, and a phase corrector should be applied in applications demanding low phase distortion (linear phase).
2.3.5 Summary
This section has presented results obtainable from the bit-flipping ADC decimation filter coefficient constraining and optimization algorithm and demonstrated its superiority over rounding and truncation of the floating-point ideal elliptic design. The full description of the bit-constrained optimization algorithm can be found in Chapter 3. The presented method reaches beyond 16 bits, unlike MAVR-based ones [19]-[21]. Although only results from the design of a 20-bit resolution decimation filter example have been presented, higher resolutions can also be achieved. Application of the method to 24-bit lowpass decimation has been published in [2], [24]-[25]. The technique has been extended to cater for multi-path decimation and interpolation filter design incorporating the polyphase structure.
2.4 Decimation filter for bandpass Sigma-Delta A/D converters
The traditional lowpass Sigma-Delta based Analog-to-Digital (A/D) conversion principle has recently been extended to bandpass for direct IF conversion [26]. Such a converter offers high Signal-to-Noise Ratios (SNR) for signals that are narrow-band relative to the sampling frequency, at significantly lower oversampling ratios than the conventional lowpass converters. In this section a sixth-order modulator sampled at 1.82MHz, coupled with a multistage polyphase decimation filter, is reported for the conversion of bandpass signals centered at 455kHz with 14kHz bandwidth [27]. The decimation filter uses a small number of short-wordlength filter coefficients. Simulations demonstrated that this setup realizes 124.3dB SNR with less than passband ripples for a half/full-scale composite 20-sinewave input signal, giving the potential of up to 20-bit performance. Such a converter is able to directly demodulate narrow-band AM signals.
2.4.1 Introduction
Modulation is a technique employed in A/D conversion which makes use of oversampling and digital signal processing in order to achieve a high level of accuracy. Modulators are designed such that the noise is shaped away from the band of interest, thus retaining the original signal in the noisefree band. Utilizing an appropriate digital decimation filter can then filter this noise. Such a decimation filter is composed of a high quality bandpass filter and a sampling rate converter, which brings the sampling frequency down to the required one for a given application. For example for audio applications the sampling rate is decreased down to the audio signal Nyquist frequency. In other applications, like in the front-end of a radio receiver, a bandpass converter [26], [28] is used to perform the direct conversion to digital at either intermediate- or radio frequency. Filters used in the decimator must perform both proper reduction of the out-of-band quantization noise and prevent excess aliasing introduced during sampling rate decreasing. They must also be very efficient computationally as the filtering is usually performed at a high rate. Additionally, for precision conversion of wide-band signals they must also have very small passband ripples, less than half the quantization step for the given resolution. Here polyphase structures come in very handy. They can achieve lowpass filtering with very small passband ripples, very high stopband attenuation for a very small computation burden. Their application for audio band lowpass conversion has already been reported in several publications [5]. Such structures are also attractive and viable for bandpass decimation filters. 2.4.2
Bandpass Sigma-Delta modulator design
The typical way of designing a bandpass modulator is to use an existing high-quality lowpass modulator and perform a frequency transformation on the feedback loop filter transfer function. If in the prototype modulator the loop filter is a cascade of three first-order integrators (L=3), after transformation we get a modulator as in Figure 1-42.
The modulator uses a cascade of three second-order notches located at half Nyquist (a quarter of the sampling frequency). The equation describing the behavior of such a modulator is given in (85).
Function E(z) is the quantization error, assuming an ideal 1-bit quantizer.
Even though the resulting noise shaping transfer function, and therefore the noise power spectrum (86), depend on the modulator coefficient values, for high oversampling ratios the noise transfer function can be approximated as in (87). The theoretical shape of the quantization noise for the three-loop modulator used in our converter design is given in (86), which assumes a linear additive noise model for the quantizer, and is presented in Figure 1-43 for a half-scale signal composed of twenty orthogonal sine waves.
Integrating (86) over the signal bandwidth, where R is the bandpass oversampling ratio, gives the in-band noise power received from the modulator (88).
For each octave increase of the oversampling ratio the in-band noise power of the modulator decreases by 6n+3 dB, where n is the number of filter notches [26]. The theoretical in-band noise levels for different numbers of loops and oversampling ratios are given in Figure 1-44.
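The 6n+3 dB rule can be checked with a few lines of arithmetic. The sketch below assumes that the in-band noise power scales as R^-(2n+1), which is the usual result of the linear additive-noise model; this scaling and the function names are assumptions made for illustration, not a restatement of (88).

```python
import math

def rule_of_thumb_db(notches, octaves=1.0):
    """In-band noise reduction (dB) predicted by the 6n+3 dB-per-octave rule."""
    return (6 * notches + 3) * octaves

def power_law_db(notches, r_old, r_new):
    """Reduction (dB) if the in-band noise power scales as R**-(2n+1) (assumed model)."""
    return 10 * (2 * notches + 1) * math.log10(r_new / r_old)

# Three notches (sixth-order bandpass modulator), oversampling ratio doubled once.
print(rule_of_thumb_db(3))
print(round(power_law_db(3, 32, 64), 2))
```

Under this assumed model the two figures agree to within a small fraction of a dB, which is where the rounded 6n+3 form comes from.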
In order to achieve the required 20-bit conversion resolution for a 14kHz bandwidth while minimizing the computation burden, a sixth-order bandpass (BP) modulator was chosen. To maintain an SNR of at least 124.3dB, a sampling frequency of 1.82MHz was chosen (R=64). Employing a half-scale input signal, the theoretical modulator SNR was found to be 124.8dB. The decimation filter had to be designed so that the noise aliased into the signal band did not increase the noise level by more than 0.5dB, with a transition band of up to 1.4kHz and very small passband ripples. The modulator itself exhibits a 75mdB roll-off for the composite tone input and a linear approximation of the quantizer, which can be compensated after decimation at the signal Nyquist rate with an appropriate compensation filter or by careful design of the modulator coefficients.
2.4.3
The design of the cascaded bandpass decimator
In a bandpass A/D converter the modulator is followed by a digital filter, which converts the high-speed bit stream into a multiple-bit output at the Nyquist rate. For bandpass modulation, one needs to perform narrowband filtering on a high-speed bit stream. One can modulate the band of interest down to DC by multiplying the modulator output by a complex exponential at the signal center frequency, which splits it into real and imaginary channels, each followed by a lowpass decimation filter as in Figure 1-45 [28].
For a center frequency equal to a quarter of the sampling frequency, the sine and cosine sequences have very simple structures: each term is either zero or ±1. Such multiplications can be achieved by simple Boolean operations. The decimation can then be done with lowpass filters, identical for the real and imaginary channels, combined with the sample rate decreaser. In the case presented here, a two-path polyphase structure is used to perform the lowpass filtering [2], [5]. The decimation filter can be designed in exactly the same way as in section 1.2.3.3 for the lowpass ADC, since the shape of the noise power density spectrum is the same as for the lowpass modulator from which the bandpass modulator was constructed. One should only bear in mind that, because of the demodulation, the decimation filter now has to be designed for the lowpass oversampling ratio of M=R/2. For oversampling ratios equal to powers of two, decimation filters can be designed as a cascade of two-times decimation stages using two-path halfband polyphase filters as in Figure 1-46.
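The simplicity of the quarter-rate demodulation can be illustrated with a short sketch. It assumes a carrier at exactly a quarter of the sampling frequency and a modulator bit stream coded as +1/-1; the function names are hypothetical and the sign convention (multiplication by cos minus j sin) is one common choice.

```python
import math

def quarter_rate_carriers(n_samples):
    """cos and sin of (pi/2)*n: every term is 0, +1 or -1."""
    cos_seq = [round(math.cos(math.pi * n / 2)) for n in range(n_samples)]
    sin_seq = [round(math.sin(math.pi * n / 2)) for n in range(n_samples)]
    return cos_seq, sin_seq

def demodulate(bits):
    """Mix a +1/-1 bit stream down to DC; returns (real, imaginary) channels."""
    cos_seq, sin_seq = quarter_rate_carriers(len(bits))
    real = [b * c for b, c in zip(bits, cos_seq)]
    imag = [-b * s for b, s in zip(bits, sin_seq)]
    return real, imag

print(quarter_rate_carriers(8))
print(demodulate([1, 1, -1, 1]))
```

Because every carrier term is 0 or ±1, each product is either zero, the input bit, or its negation, so no multiplier is needed in either channel.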
The example design employs seven such stages, decreasing the sample rate by M=128. The cascaded decimation filter comprises double-lowpass filter sections with two 4-bit wide (unsigned) coefficients (0.125, 0.5625) in the first five stages, four 8-bit long (0.04296875, 0.17187500, 0.39453125, 0.74609375) in the sixth one and six 10-bit long (0.0810546875, 0.2763671875, 0.4990234375, 0.689453125, 0.833984375, 0.947265625) in the seventh one. The filter coefficients were again optimized by the specially developed "bit-flipping algorithm" [29]. The out-of-band noise was attenuated by 135.8dB in stages one to five, 146dB in stage six and 123.8dB in stage seven. As a result the total quantization noise aliased into the signal baseband was at a level of -133.8dB. The first five stages can easily be implemented with hardwired shift-and-add operations, as they require only six such operations per stage. Also, the first stage works with a 1-bit data stream, which simplifies matters somewhat. In contrast, if the multiplications in stages six and seven were performed through shifts and adds, they would require 51 operations for stage six and 96 for stage seven, employing 28-bit convergent-round arithmetic. These calculations need to be performed by a specially designed ALU.
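The quoted coefficients can be examined with a small sketch of a two-path halfband polyphase lowpass filter. It assumes the usual form H(z) = 0.5[A0(z^2) + z^-1 A1(z^2)] with allpass sections A(z^2) = (a + z^-2)/(1 + a z^-2); the assignment of the two coefficients to the two paths and all function names are assumptions for illustration.

```python
import cmath
import math

def allpass(alpha, z):
    """Allpass section in z^-2: (alpha + z^-2) / (1 + alpha * z^-2)."""
    z2 = z ** -2
    return (alpha + z2) / (1 + alpha * z2)

def halfband_response(a0, a1, nu):
    """Complex response at normalized frequency nu = f/fs (assumed path assignment)."""
    z = cmath.exp(2j * math.pi * nu)
    return 0.5 * (allpass(a0, z) + (1 / z) * allpass(a1, z))

def fits_wordlength(coeff, bits):
    """True if coeff is an exact multiple of 2**-bits (unsigned fractional bits)."""
    return (coeff * 2 ** bits).is_integer()

for nu in (0.0, 0.25, 0.5):
    print(nu, abs(halfband_response(0.125, 0.5625, nu)))
print(fits_wordlength(0.5625, 4), fits_wordlength(0.74609375, 8))
```

In this form the response is unity at DC, has a zero at Nyquist and passes through the half-power point at a quarter of the sampling frequency, whatever the coefficient values; the coefficients only shape the transition between these points.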
The first five stages of decimation can be integrated into a bandpass filter working at the high rate, with the sampling rate decreaser by 32 incorporated after the demodulator. This idea is shown in Figure 1-47. The prototype polyphase lowpass filter (0.13349539733, 0.57730761228), having an attenuation of A=65dB, was converted into the bandpass filter through the lowpass-to-bandpass frequency transformation [30]. The DC feature of the lowpass filter was shifted to half-Nyquist and the edge of its passband was mapped to create the bandwidth of 0.0078125·fs. The resulting tenth-order IIR bandpass filter transfer function had an even-symmetric denominator and an odd-symmetric numerator. This simplified the computations to only four floating-point multiplications for both bandpass filters, but their coefficients had to be at least 22 bits long to achieve the correct magnitude response. The last two stages of decimation were exactly the same as in the previous structure. The overall performance of both structures was exactly the same.
2.4.4
Decimation Filter Performance Evaluation
The overall decimation filter performance is summarized in Figure 1-48, which shows the overall decimation filter passband ripples. As an impulse test signal does not provide enough energy for sufficient illumination of the transfer function, a twenty-tone test signal had to be designed.
It comprises twenty equally weighted (each having an amplitude of 0.05) and approximately equally spaced (roughly spanning the whole of the baseband), mutually orthogonal sine waves. This input signal was designed as suggested in [31] for coherent spectral analysis purposes. The 20-tone signal was passed through the modulator and then applied to the designed decimation filter. The magnitude response of the output signal from the decimation filter is shown in Figure 1-49. The noise shaping of the in-band quantization noise can be noticed in the frequency range where the multitone signal exists.
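One way to obtain mutually orthogonal tones is to place each of them on a distinct integer bin of the analysis record, which is the basis of coherent spectral analysis. The sketch below does this for twenty tones of amplitude 0.05; the record length and the bin numbers are hypothetical and are not the values used in the reported design.

```python
import math

def multitone(bins, n_samples, amplitude=0.05):
    """Sum of equal-amplitude sine waves placed on integer FFT bins."""
    return [sum(amplitude * math.sin(2 * math.pi * k * n / n_samples) for k in bins)
            for n in range(n_samples)]

def inner_product(k1, k2, n_samples):
    """Inner product of two unit-amplitude tones over one record."""
    return sum(math.sin(2 * math.pi * k1 * n / n_samples) *
               math.sin(2 * math.pi * k2 * n / n_samples)
               for n in range(n_samples))

N = 1024                          # hypothetical record length
BINS = list(range(5, 45, 2))      # twenty hypothetical, roughly equally spaced bins
signal = multitone(BINS, N)
print(len(BINS), max(abs(s) for s in signal))
print(inner_product(5, 7, N), inner_product(5, 5, N))
```

Tones on different bins give a zero inner product over the record, so each tone can be measured in the spectrum without leakage from the others.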
The magnitude response of the bandpass filter output signal is shown in Figure 1-50. Note that the filter was designed to perform exactly like the first five filter stages from Figure 1-46, and as a result the difference between the overall noise and ripple performance of the two structures is negligible.
The passband ripples obtained were within the required limit, with SNR=124.3dB, i.e. a quantization noise level below 124dB (assuming a half full-scale input). For the structure in Figure 1-46 all calculations in the first five stages can be done by hardwired shift/add arithmetic units. The last decimation stages would require a specialized type of processor similar to the one used in [2].
2.4.5
Conclusions
In this section a sixth-order bandpass A/D converter with the potential for conversion of bandpass (AM) signals centered at 455kHz with 14kHz bandwidth at up to 20 bits of resolution has been presented. The bandpass modulator is sampled at 1.82MHz and employs a loop filter transformed from a lowpass prototype through a frequency transformation. This modulator, coupled with a seven-stage polyphase decimation filter, delivered SNR=124.3dB with very small decimation filter passband ripples for a half full-scale composite input, indicating a potential of up to 20-bit performance. The use of the polyphase structures in the decimation process minimizes the hardware complexity by using a small number of short-wordlength filter coefficients. The resultant converter directly accomplishes demodulation of narrow-band AM signals. The performance achieved by the design and reported in this section is very difficult, if not impossible, to achieve by other means.
2.5
Polyphase IIR interpolation filter for lowpass DAC
Oversampling and incorporating Sigma-Delta modulators improve the accuracy of D/A converters (DAC) just as they do for A/D converters. The reason for using modulators for D/A conversion is easily understood by analyzing the conditions for a 16-bit DAC. If the converter uses the typical 3V reference voltage, then the voltage corresponding to the permissible half-LSB error is about 23µV (3V divided by 2^17). This approximately equals the voltage generated by about a dozen electrons stored in a 0.1pF capacitor [32] and is comparable to the thermal noise present at the input of a MOS operational amplifier. The direct design of such a converter would require very expensive trimming and/or calibration procedures. The oversampling technique overcomes the problem of analog accuracy by trading digital complexity and speed for lower sensitivity to analog non-idealities. As fast and dense digital circuit realizations are available thanks to state-of-the-art technologies, such a tradeoff is very desirable.
The general structure of the oversampled DAC is shown in Figure 1-51. The input to the converter, x(k), is a multi-bit digital stream of long words arriving at the input data rate. In most cases this rate is close to the Nyquist rate of the signal. The signal is first processed by the interpolation filter, which raises the data rate of the incoming signal by a factor of L, where L is the Sampling Rate Increase (SRI) factor, and then suppresses the spectral replicas located at multiples of the old data rate. This is done by the high-quality lowpass filter following the SRI block.
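The half-LSB argument made at the start of this section for the 16-bit DAC can be reproduced numerically. The sketch below assumes that the LSB equals the reference voltage divided by 2^16 and uses the elementary charge to convert the resulting voltage on a 0.1pF capacitor into a number of electrons; the function names are hypothetical.

```python
ELECTRON_CHARGE = 1.602176634e-19   # coulombs

def half_lsb_voltage(v_ref, bits):
    """Half of one LSB for a converter spanning v_ref with 2**bits levels (assumed)."""
    return v_ref / 2 ** (bits + 1)

def electrons_on_capacitor(voltage, capacitance):
    """Number of electrons whose charge produces `voltage` on `capacitance`."""
    return voltage * capacitance / ELECTRON_CHARGE

v = half_lsb_voltage(3.0, 16)
print(v)                                   # volts
print(electrons_on_capacitor(v, 0.1e-12))  # electrons
```

The result is a voltage of a few tens of microvolts, corresponding to a charge of the order of a dozen electrons, which is why direct analog realization at this accuracy is so demanding.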
At the output of the interpolation filter the data will have a wordlength which is the same as, or slightly smaller than, that of the input (Figure 1-44) [32]. This signal then enters the standard Sigma-Delta modulator, which drastically reduces the wordlength, typically down to a single bit. The modulator can be exactly the same as for oversampled A/D converters. The quantization noise power introduced by such drastic truncation is shaped so that most of it lies outside the baseband. Next the truncated signal is converted into an analog signal by an internal one-bit DAC, the first analog stage of the circuit. Since its realization is conceptually simple, it can be made linear. The analog DAC output contains a linear replica of the digital input signal x(k) and a large amount of quantization noise. As most of this noise lies outside the band of interest, it can be almost completely suppressed by an analog lowpass filter following the D/A converter. In general the performance of the overall converter, for ideal operation, is determined by the noise shaping loop of the modulator. The requirements for both the digital and analog lowpass filters depend on the shape of the loop response, in and out of the baseband, as they set the minimum stopband loss needed to suppress the digital signal replicas (interpolation lowpass filter) and the quantization noise (analog lowpass filter). The interpolation filter has a slightly different purpose from the decimation filter used in the A/D converter. It only has to increase the sampling rate and make sure that the power of the signal replicas, originating from zero insertion or zero-order hold by the SRI, is attenuated to the level required by the given b-bit resolution of the converter. Additionally it has to make sure that the signal is not distorted in its baseband, i.e. the passband ripples of the interpolation filter should be less than half of the LSB, with the limit depending on whether the converter is bipolar or unipolar. In contrast to the A/D converter application, the interpolator does not filter out the quantization noise from the modulator. This task, in the discussed D/A converter, is given to the analog filter. If the interpolation filter is used in a multirate system in which the result of the interpolation is subsequently decimated back to the original input rate, then the interpolation lowpass filter has to satisfy a more stringent condition (89).
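The half-LSB requirement can be put into numbers with a short sketch. It assumes a full scale normalized to the range -1..+1 for the bipolar converter and 0..1 for the unipolar one, so that half an LSB equals 2^-b and 2^-(b+1) respectively; these normalizations and the function names are assumptions and may differ from the exact limits used in the original expressions.

```python
import math

def half_lsb(bits, bipolar=True):
    """Half of one LSB relative to an assumed normalized full scale."""
    full_range = 2.0 if bipolar else 1.0
    return full_range / 2 ** bits / 2

def required_attenuation_db(bits, bipolar=True):
    """Attenuation needed to push a full-scale replica below half an LSB."""
    return -20 * math.log10(half_lsb(bits, bipolar))

for b in (16, 20):
    print(b, half_lsb(b), round(required_attenuation_db(b), 2))
```

Under these assumptions each additional bit of resolution costs about 6dB of extra replica attenuation.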
Symbol M=L is the decimation factor. This condition makes sure that, following the decimation, the ripples originating from the passband and the stopband of the interpolation filter will be less than those required for the given b-bit accuracy. If (89) is satisfied then the subsequent decimation can be performed without the anti-aliasing lowpass filter. Note that equation (89) assumes a flat stopband characteristic. The polyphase-based LPF structure, discussed earlier in this chapter, can accomplish the lowpass filtering needed in the interpolation process as effectively as in the previously described decimation process. Decimation and interpolation are closely related, and it will be shown how this relation can be used to design the interpolation filter as a number of two-times interpolation stages, in a similar way to the decimation filter. In such a case, at each successive stage the zero insertion is followed by lowpass filtering with the polyphase halfband filter. Thanks to its simple structure and small number of short fixed-point coefficients, filtering can be performed faster than with other types of LPF.
The equivalent magnitude response of the interpolation lowpass filter has the same shape as that of the multistage decimator and is presented in Figure 1-52 for the three-stage interpolation. Assuming that the LPFs in all interpolation stages have the same stopband ripples and the same passband ripples, equation (89) can be simplified to (90).
Considering that the passband ripples of the polyphase structure are much smaller than its stopband ripples, as in (14), they will not significantly influence the overall passband ripples after the decimation. Additionally, stopband ripples aliased into the baseband from regions of the overall interpolation filter stopband in which more than one stage filter has its stopband are negligible in comparison to those aliased from other regions. Under these assumptions (90) simplifies to:
Then the stopband attenuation of each stage LPF should be less than:
The SNR considered as a ratio of the total power of the signal replicas to the power of the signal in its baseband can be calculated from Figure 1-52:
The symbols in this expression denote the stopband attenuation of the stage filter, the power of the signal in the baseband and the power of the signal replicas in the stopband of the interpolation filter. The equation is based on the assumption that only those parts of the interpolator stopband where the attenuation is the smallest are significant. The stage filter stopband attenuation that keeps the total power of the signal replicas in the interpolator stopband below the required level can be found from:
The choice of the transition band for each subsequent interpolation stage differs from the case of the multi-stage decimation filter. The input signal to the interpolator is a full-band signal. After two-times SRI the input signal is squeezed into the lower half of the new baseband, with a mirror replica above it up to Nyquist. The transition band of the first stage should therefore theoretically be zero. This is similar to the case of the last stage filter in the multi-stage decimation (refer to Section 1.2.3.2). In practice the transition band of the first filter is chosen very small; just how small depends on the application. One must take care with its choice, as it limits the input signal bandwidth of the overall interpolator. At each subsequent stage, as the sampling frequency increases, the transition band is relaxed more and more and the complexity of the lowpass filter decreases. The requirements for the transition band for each interpolation stage, m, can be specified as follows:
The standard implementation of the interpolator requires the lowpass filtering to be performed at the high sampling rate (after SRI). The standard zero-insertion interpolation, as in Figure 1-53(a), offers a way of simplifying the structure of the polyphase interpolation filter. It can be noticed from Appendix A that an allpass filter that is a function of $z^{-2}$ responds to its input only every other sample. This means that output samples at even sampling intervals are independent of the samples at odd sampling intervals: they are responses to only even and only odd input samples respectively. The inserted zeros therefore have no influence on the output, and the SRI block can be moved past the allpass filters (similar to Curtis' decimator structure [33]), as in Figure 1-53(b) for a single LPF interpolator, in order to use the allpass filters in the structure most effectively.
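The equivalence of the two arrangements can be verified numerically. The sketch below assumes a two-path structure 0.5[A0(z^2) + z^-1 A1(z^2)] with allpass sections of the form (a + z^-d)/(1 + a z^-d); it compares zero insertion followed by high-rate filtering with low-rate filtering followed by interleaving of the two path outputs. The function names, the 0.5 scaling and the coefficient values are assumptions for illustration.

```python
def allpass_filter(alpha, x, delay=1):
    """y[n] = alpha*x[n] + x[n-d] - alpha*y[n-d], an allpass in z^-d."""
    y = []
    for n, xn in enumerate(x):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y.append(alpha * xn + x_d - alpha * y_d)
    return y

def interpolate_direct(x, a0, a1):
    """Zero insertion first, then both allpass paths run at the high rate."""
    u = []
    for xn in x:
        u.extend([xn, 0.0])
    top = allpass_filter(a0, u, delay=2)
    bottom = [0.0] + allpass_filter(a1, u, delay=2)[:-1]   # extra z^-1 in second path
    return [0.5 * (t + b) for t, b in zip(top, bottom)]

def interpolate_commutated(x, a0, a1):
    """Both allpass paths run at the low rate; outputs are interleaved afterwards."""
    top = allpass_filter(a0, x, delay=1)
    bottom = allpass_filter(a1, x, delay=1)
    y = []
    for t, b in zip(top, bottom):
        y.extend([0.5 * t, 0.5 * b])
    return y

x = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
print(interpolate_direct(x, 0.125, 0.5625))
print(interpolate_commutated(x, 0.125, 0.5625))
```

Both functions return the same sequence, but the second performs half as many filter updates because it never processes the inserted zeros.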
The example shows the performance of the polyphase structure used in the interpolation filter for a bipolar b=20-bit Digital-to-Analog (D/A) converter with an L=256 times oversampling modulator. The design process is similar to the decimation filter design and the required passband ripples are:
As the interpolation filter is used for a D/A application and there is no subsequent decimation involved, the minimum required stopband attenuation of each filter stage for a 20-bit D/A converter is calculated according to (95). In order to prevent the signal replicas from affecting the baseband frequencies, these replicas must be attenuated below half of the LSB. The resulting minimum stopband attenuation was chosen as the requirement for the filter design. The transition band of the first interpolation stage is chosen arbitrarily so that the baseband is limited to 95% of the original signal bandwidth. The cascaded interpolator requires eight two-times interpolation stages. In order to simplify the requirements for the lowpass filters, each interpolation stage uses a cascade of two identical polyphase lowpass filters, each designed for half of the required minimum attenuation.
The designed interpolation filter exceeds the required stopband attenuation and keeps its passband ripples within the required limit. The ratio of the total power of the out-of-band replicas to the signal power is SNR=151dB. These parameters clearly demonstrate that the designed interpolator exhibits 20 bits of accuracy. The overall interpolator magnitude response is shown in Figure 1-54(a) and its baseband ripples in Figure 1-54(b). All filter coefficients were constrained to fixed-point values with the bit-flipping algorithm detailed in Chapter 3. The first two stages of interpolation require a much more accurate representation of the filter coefficients due to the small transition bandwidth and high stopband attenuation demands. The first stage, which has the most stringent requirements, required five 12-bit long coefficients. The next stage, for which the transition width was more relaxed, needed only two 11-bit long coefficients. The remaining stages, operating at higher rates, were designed to have only two coefficients or one. It is worth noting that the filters in stages three to eight require only 6 shift-and-add operations each. The whole interpolation filter requires as few as 104 shift-and-add operations.
The interpolator structure presented here has by default a non-linear phase, as it is designed using IIR filters. Its group delay consequently shows peak-to-peak ripples in the passband. The phase non-linearity of the interpolator can be a problem for some applications. However, for applications like audio it will not be a problem, as the human ear is insensitive to phase, except for the difference of phases of signals arriving at the two ears, which creates the sense of the direction of sound (Haas effect) [80]. For other applications the interpolation filter will have to be coupled with additional filters performing the compensation for phase linearity. A method of compensating the phase non-linearity of the IIR polyphase structure is presented in the next section. For phase compensation it incorporates the same type of allpass filters as used for constructing the lowpass filter.
The suggested structure of the interpolation filter, using a cascade of two-times interpolation stages with two-path polyphase lowpass filters, is novel [16] and can be applied to a number of applications, from D/A converters to multirate filtering requiring both up and down sampling. In such cases both the decimation and interpolation filters can be designed using the same polyphase structure. Performing the calculations in a number of stages operating at different rates allows optimized use of the arithmetic unit (processor). Additionally, using the structure in Figure 1-53(b) avoids any redundant calculations (due to zero insertion) by the interpolator. The performance shown before for both types of filters is very difficult, if not impossible, to achieve with other types of filters, especially when implementation issues are crucial. They require very little computation due to the small number of multiplications, done in fixed-point arithmetic. In many cases such multiplications can easily be done using a few shift-and-add operations. Such an implementation makes the polyphase decimation and interpolation filters very competitive with other types of filters having similar specifications.
2.6
Application of multi-path polyphase IIR filters for cascaded decimators and interpolators
The cascade of two-path two-times decimation filters, as shown in previous sections, permits very good decimation and interpolation filters to be achieved, but is applicable only to conversion ratios equal to powers of two. Applying structures with more than two paths extends the choice of possible conversion ratios to any integer number. The numbers of paths of the filters in the cascade must then equal the prime factors of the oversampling ratio. For example, if the oversampling ratio is M=60 then there will be two two-path two-times conversion filters, one three-path filter and one five-path filter. In general one can also choose to use one six-path conversion stage and one ten-path conversion stage, or any other combination. However, in such a case the transition bandwidths required for each of the filters would be more stringent, and more coefficients would be required (for the whole decimation/interpolation filter) than if the smallest numbers of paths were used. This implies that increasing the number of paths, and the number of coefficients accordingly, would not achieve the same stopband attenuation; the attenuation would be smaller. One may recall that polyphase filters with more than two paths exhibit spikes in their stopband, and the obvious question arises whether these would affect the performance of the whole cascaded decimation/interpolation filter. However, these spikes lie at frequencies which, at the sampling rate conversion, alias only into the transition band. Secondly, if such a filter is not the last one in the cascade, the spikes are canceled by zeros on the unit circle of at least one of the next stages of the decimator/interpolator. This is shown in Figure 1-55 for decimation by M=12, assuming filters with the same attenuation of A=50dB.
It can be seen that the stopband spike is canceled by the zero of the third decimation stage. The idea is analogous for the cascaded IIR polyphase interpolation filters. To confirm the theory, the equivalent lowpass magnitude response of the designed decimation filter is also presented; it matches the shape of the theoretical one.
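The choice of the number of paths per stage described in this section amounts to a prime factorization of the conversion ratio. A minimal sketch, with a hypothetical function name, is given below.

```python
def stage_paths(ratio):
    """Prime factors of the conversion ratio, one factor per cascade stage."""
    factors, p = [], 2
    while p * p <= ratio:
        while ratio % p == 0:
            factors.append(p)
            ratio //= p
        p += 1
    if ratio > 1:
        factors.append(ratio)
    return factors

print(stage_paths(60))    # two two-path stages, one three-path, one five-path
print(stage_paths(12))    # the M=12 case of Figure 1-55
```

The order in which the stages are placed in the cascade is a separate design decision that this sketch does not address.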
3.
ALMOST LINEAR-PHASE FILTER DESIGN
For systems that cater for wide-band signals, it is important to preserve the time differences between signal spectrum components before and after filtering. This requirement is met if the phase response of the filter is linear without a constant offset, so that its group delay function, the negative derivative of the phase with respect to frequency, is constant.
A flat group delay function is all that is required to assure phase response linearity for a polyphase lowpass filter. This is because its phase response equals zero at DC and is continuous across the whole baseband. The phase corrector can be designed in two ways: either directly, by designing the corrector phase response to be opposite in shape to that of the filter, or by making the group delay of the filter flat through an appropriate design of the corrector coefficients. The first method can be applied if a typical IIR corrector is designed, as the shape of its phase can then be specified. The latter method uses a correction filter, or a set of them, whose phase responses are only approximately opposite in shape to the phase response of the original filter. It can be noticed that the phase response of an allpass subfilter with a negative coefficient can be used to correct the phase of the polyphase lowpass filter for frequencies below half-Nyquist. The quality of correction is certainly better for smaller frequency bands. A cascade of allpass sections of different orders can be used for better correction. Phase correction is normally required only in the signal band. The smaller the bandwidth, the easier the correction and the simpler the implementation: fewer coefficients are required for the same compensation, and the required coefficient wordlength is shorter.
3.1
Phase compensation with allpass sections
An important problem in cascaded decimation filters incorporating the polyphase structure is their non-linear phase. The phase non-linearity is small for the first stages of decimation (high oversampling ratios) and grows quickly with each succeeding stage. Linear phase is not an important issue for applications dealing with single sine signals, as often happens in testing and measurement. However, for applications requiring high accuracy and dealing with bandpass signals, the decimator needs to have its phase linearity compensated. Allpass sections having negative coefficients are very attractive for correcting the group delay (phase linearity) of polyphase lowpass filters, especially within frequency bands which are power-of-two divisions of the Nyquist frequency, i.e. 0.25, 0.125 etc. The phase response shape of an allpass section of the kind used to build the polyphase filter (polyphase decimator) but having a negative coefficient is opposite to that of a section with a positive coefficient. It can be seen from (17) that the total group delay of the polyphase structure is equal to the average of its allpass components and follows the same bell-like shape, peaking at half-Nyquist for a second-order structure, quarter-Nyquist for a fourth-order structure, etc. Here only the second-order case will be considered.
Let us first consider the simplest corrector, a single second-order allpass section. The best correction is achieved when the group delay of the compensated filter at DC is equal to its group delay at the cutoff frequency. This means (skipping the constant term of 1/2):
The symbols in this expression include the corrector coefficients. The effect of this correction can be seen in Figure 1-56 for the two-coefficient (0.125, 0.5625) two-path LPF. Even a one-coefficient corrector gives a significant, 6.5 times decrease of the group delay peak-to-peak error in the signal band, as shown in Figure 1-56 and Table 1-7.
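The single-section correction can be sketched numerically. The code below assumes allpass sections of the form (a + z^-2)/(1 + a z^-2), whose group delay is 2(1 - a^2)/(1 + 2a cos 2w + a^2), takes the filter group delay as the average of its two sections plus the constant 1/2, and finds by bisection the negative corrector coefficient that equalizes the total group delay at DC and at the cutoff. The cutoff of 0.125 (normalized to the sampling frequency) is an assumption, as are all function names.

```python
import math

def allpass_group_delay(alpha, nu):
    """Group delay in samples of (alpha + z^-2)/(1 + alpha z^-2) at nu = f/fs."""
    return 2 * (1 - alpha ** 2) / (1 + 2 * alpha * math.cos(4 * math.pi * nu) + alpha ** 2)

def lpf_group_delay(a0, a1, nu):
    """Two-path halfband LPF: average of both sections plus 1/2 for the path delay."""
    return 0.5 * (allpass_group_delay(a0, nu) + allpass_group_delay(a1, nu)) + 0.5

def ripple(delay_fn, nu_c, points=2001):
    """Peak-to-peak variation of delay_fn over 0..nu_c."""
    values = [delay_fn(nu_c * k / (points - 1)) for k in range(points)]
    return max(values) - min(values)

def single_section_corrector(a0, a1, nu_c, tol=1e-12):
    """Bisection for the coefficient giving equal total delay at DC and at nu_c."""
    def mismatch(c):
        def total(nu):
            return lpf_group_delay(a0, a1, nu) + allpass_group_delay(c, nu)
        return total(0.0) - total(nu_c)
    lo, hi = -0.999, 0.0            # mismatch(lo) > 0, mismatch(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A0, A1, NU_C = 0.125, 0.5625, 0.125
c = single_section_corrector(A0, A1, NU_C)
before = ripple(lambda nu: lpf_group_delay(A0, A1, nu), NU_C)
after = ripple(lambda nu: lpf_group_delay(A0, A1, nu) + allpass_group_delay(c, nu), NU_C)
print(c, before, after, before / after)
```

With the assumed cutoff, the uncompensated peak-to-peak error evaluates to about 0.43 samples and the single negative coefficient reduces it several times, which is of the same order as the improvement quoted above.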
Cascading further, higher-order allpass corrector sections improves the phase linearity. Although consecutive orders can be chosen, it is preferable to use even-order sections. Such a choice assures the symmetry of compensation and makes the sections easier to integrate into the decimation filter, where they could be placed after the sample rate decreaser. With the section orders chosen in this way, the full K-section compensator transfer function is:
Higher-order compensator coefficients can be calculated by minimizing the sum of the squared differences between the resulting group delay and its average value, as in Figure 1-57, i.e.:
The Downhill-Simplex optimization method was used to minimize (101) [59]. It was observed that for more than three corrector sections the result was very dependent on the starting point of the optimization. Therefore approximate values had to be calculated before the optimization. Notice that the absolute values of consecutive section coefficients approximately follow a geometric series. It was therefore enough to find the coefficients of the first two sections to specify a good starting point.
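A plausible written-out form of this least-squares criterion is sketched below. The symbols are assumptions introduced here: $\tau_g(\nu_i)$ is the group delay of the compensated filter at the i-th of N frequency points in the passband, $c_1,\dots,c_K$ are the compensator coefficients and $\bar{\tau}_g$ is the average group delay over those points. The exact notation and weighting of the original expression may differ.

```latex
E(c_1,\dots,c_K) \;=\; \sum_{i=1}^{N} \left[ \tau_g(\nu_i) - \bar{\tau}_g \right]^2,
\qquad
\bar{\tau}_g \;=\; \frac{1}{N} \sum_{i=1}^{N} \tau_g(\nu_i)
```

A perfectly flat group delay over the passband gives E = 0, so minimizing E drives the compensated filter towards linear phase in that band.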
Approximation of the second section coefficient is explained graphically in Figure 1-58. The coefficient is calculated to satisfy the equation:
The symbol in this equation denotes the group delay of the filter compensated with a single allpass section. The initial values of the compensator coefficients applied to the optimization can be calculated as:
An example of the filter group delay compensated with four allpass sections is shown in Figure 1-59. The original filter had two coefficients (0.125 and 0.5625) and its group delay was compensated over its passband. The group delay peak-to-peak error decreased from 0.43 to 1.3e-4, over 3200 times! Compensation results for the same filter with the number of sections ranging from one to four and cutoff frequencies between 0.03125 and 0.25 are summarized in Table 1-7.
An important issue for the implementation of the polyphase cascaded decimation filter is high-speed calculation, especially in its first stages (operating at the highest frequencies). It is therefore required that the compensator coefficients are also constrained to a small bit length. The first stages, where the group delay is not large anyway, can be compensated with a single short-wordlength coefficient. Such a multiplication can be implemented as a single shift-and-add operation. The last stages would require more sections and longer coefficients, but then there is much more time for the calculations, as the sample rate is much lower than in the first decimation stages.
Floating-point coefficients were constrained to four, eight and sixteen bits using a "bit-flipping" algorithm. One must consider that, if the required coefficient wordlength is too small, the effectiveness of the compensation will be limited and compensation is sometimes not feasible. Alternative solutions must then be sought. The results of limiting the floating-point coefficients to four, eight and sixteen bits are shown in Table 1-8 and Table 1-9.
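The effect of a short wordlength on a set of compensator coefficients can be illustrated by simple rounding. The sketch below is not the bit-flipping algorithm; it merely rounds each coefficient to the nearest multiple of 2^-bits, and the three "ideal" coefficient values are hypothetical numbers chosen to decay roughly geometrically.

```python
def quantize(coeff, bits):
    """Round to the nearest multiple of 2**-bits (fractional bits, sign kept)."""
    scale = 2 ** bits
    return round(coeff * scale) / scale

ideal = [-0.09, 0.0081, -0.00073]     # hypothetical, roughly geometric series
for bits in (4, 8, 16):
    print(bits, [quantize(c, bits) for c in ideal])
```

With four bits only the first coefficient survives, so a three-section compensator collapses to a single-section one; with sixteen bits all three sections remain distinct.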
Factor K used in the tables is the ratio between the baseband group delay peak-to-peak error before optimization and the one obtained after optimization and can be calculated from equation (87).
Comparing the results of the constrained-coefficient compensator with the floating-point ones reveals that the latter give much better performance. This shows that compensator coefficients must be designed and implemented using a long arithmetic wordlength. In many cases even eight bits were too few, whereas with a wordlength of 16 bits the compensation was very effective. Sometimes, by using three or four compensator sections, the group delay ripples can be decreased by a factor of a few thousand (see results for small passbands in Table 1-8). The effectiveness of the compensation decreases quickly as the coefficient wordlength is reduced, because the coefficient values in consecutive allpass sections follow a geometric series that quickly converges to zero. For small wordlengths, therefore, some of the coefficient values are too small to be represented with the given number of bits. The performance of the compensation is the smallest for passbands reaching (short transition bandwidths). The performance decreases very little when the wordlength is decreased. Although the group delay ripples are decreased only two to three times, one must remember that the bell-like shape of the polyphase halfband lowpass filter is peaking at and therefore even such small compensator performance still gives a considerable decrease of the absolute group delay ripples. The performance of the constrained compensation is much better for smaller passbands, just as for the floating-point versions. The performance for wordlengths of 4 or 8 bits is much worse than for the 16-bit or floating-point cases. This decrease is much clearer for three or four coefficients than for one or two. The coefficients of the first sections can simply be represented (approximated) with a small number of bits much more easily than the successive ones: as the latter are much smaller, there is not enough dynamic range to represent them properly.
Therefore the three- or four-coefficient compensators usually degenerate to two- or one-coefficient cases (compare the results for passbands 0.0625 and 0.03125 in Table 1-8 and Table 1-9). Looking at the results of the constrained compensators, it can be noticed that the coefficient -0.0625 appears many times as the result of 4-bit designs. It can be included in the first stages of decimation to do primary compensation at the higher frequencies. The finer compensation can be done later, after all filtering operations. This is very attractive for multistage polyphase decimation filters requiring minimum group delay distortion. For such filters a small compensation could be performed at each of the first stages, giving an approximately ten-fold decrease of the group delay ripples at the cost of a single shift-and-add operation and two delayers. This is what is required to implement the compensator having the coefficient -0.0625. At every subsequent stage, as the arithmetic wordlength and the sampling interval increase, the compensator can have more bits and more coefficients, giving progressively better performance. At the last stage, when the passband reaches v=0.25, the compensation can be performed using a full-blown FIR/IIR filter. The requirements for the last stage will be much smaller than they would be if no prior compensation had been performed.

In conclusion, the compensation method presented in this chapter is a good method for compensating the group delay ripples of the polyphase lowpass filters. It can be used in two different ways. One is to do fast but small compensation during the filtering (small coefficient wordlength), which does not involve much calculation or much time. The other is to do slower but more effective compensation (larger wordlength and more coefficients), which requires more calculation and more time. The choice depends on the application. Either method is competitive with a full-blown FIR/IIR compensator in terms of the amount of calculation and time required to achieve the same compensation performance. The results clearly demonstrate that this type of group delay compensation can be used effectively at a low computational and hardware cost. As shown above, it can also be easily included in each stage of the polyphase multistage decimation filter.
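The cost quoted for the -0.0625 compensator can be illustrated in integer arithmetic. The sketch assumes a first-order allpass section of the form (a + z^-1)/(1 + a z^-1), which is not confirmed by the text here; with a = -2^-4 the only multiplication becomes a 4-bit right shift, and the two state variables play the role of the two delayers. The function names are hypothetical.

```python
def allpass_shift_add(samples, shift=4):
    """Assumed first-order allpass with a = -2**-shift, integers only."""
    x1 = 0  # delayed input  x[n-1]
    y1 = 0  # delayed output y[n-1]
    out = []
    for x in samples:
        # y[n] = a * (x[n] - y[n-1]) + x[n-1]; with a = -2**-shift the product
        # is an arithmetic right shift followed by a subtraction.
        y = x1 - ((x - y1) >> shift)
        out.append(y)
        x1, y1 = x, y
    return out

def allpass_float(samples, a=-0.0625):
    """Floating-point reference for the same difference equation."""
    x1 = y1 = 0.0
    out = []
    for x in samples:
        y = a * (x - y1) + x1
        out.append(y)
        x1, y1 = x, y
    return out

impulse = [1 << 15] + [0] * 15
fixed = allpass_shift_add(impulse)
reference = allpass_float(impulse)
print(fixed[:4])
print(max(abs(f - r) for f, r in zip(fixed, reference)))
```

The fixed-point output stays within a couple of least significant bits of the floating-point reference, because the truncation error of each shift is fed back with a gain of only 1/16.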
3.2 Approximating the bulk delay phase response
An alternative solution to linear-phase polyphase filtering is to rearrange the standard structure from Figure 1-2 by replacing the allpass filter in the lower branch with a bulk delayer (the simplest form of an allpass), as in Figure 1-60.
The lowpass polyphase structure works on the principle of the allpass filters in both branches being in phase at low frequencies and distant by at the frequencies close to Nyquist. By putting a bulk delayer, (K equal to the order of the allpass filter in the upper path), into the lower branch, the top allpass branch will have to be designed to follow the linear phase response of the delayer to achieve the lowpass characteristics. In order to achieve equal phase characteristics of both branch filters at low frequencies (required for lowpass filtering) the order of the delay is one less than the order of the allpass filter in the top branch. The idea as such is not new: it was suggested and used by Curtis and employed in design routines recently published by Lawson [34] and Lu [35]. The current design methods are based on the standard idea of composing two identical IIR (non-linear phase) filters to achieve an approximately linear-phase characteristic [34], or apply iterative quadratic programming methods [35]. Such an approach does not allow much flexibility, limiting the number of degrees of freedom to half of what would be available when a standard IIR filter is used; thus the resulting filter has larger stopband ripples and larger group delay ripples than the structure is really capable of achieving. Our design routine uses the Matlab weighted least-squares optimization routine to approximate the phase of the delayer in the filter passband and in the filter stopband. The transition band was not controlled at all. The important factor was the choice of the weighting function. Choosing constant weights for the passband and stopband led to stopband ripples decreasing monotonically with frequency while passband ripples were monotonically increasing. Therefore an iterative method was applied, which changed the weighting function at every iteration according to the shape of the envelope of the passband/stopband group delay ripples.
Because a general IIR filter is used in the top branch of the structure, which obviously does not have to be symmetric against it is important to monitor the group delay ripples both in the filter passband and in its stopband. The group delay ripples in the passband are responsible for achieving small passband magnitude ripples, and those in the stopband for high stopband attenuation. The design routine can be described as follows:
1. Specify the required complex magnitude response shape to be equal to the response of the delayer in the filter passband and equal to the response of the delayer in the filter stopband. Also specify the frequency grid in a logarithmic scale to be denser close to the transition band.
2. Choose the weights of the optimization routine equal to unity at all frequencies, W(v) = 1.
3. Perform a weighted least-squares fit to the frequency response data.
4. Calculate the group delay of the resulting filter.
5. Calculate the maxima of the group delay function and interpolate the function through these points.
6. Update the weights:
7. Normalize the weights to their maximum value.
8. If the iteration number is less than the limit, proceed to point 3, otherwise deliver the answer vector.
It was found during experiments that a maximum of four iterations was required to bring the result within 1% of the result obtainable if the iterations were continued further. In order to measure the performance of the method it was compared to the similar approaches suggested by Lu and Lawson. Example filters were designed according to the specifications given in their papers [34],[35]. Comparative results showing the stopband attenuation and the ripples of the magnitude and group delay responses for the given passband and stopband cut-off frequencies, and respectively, are given in Table 1-10. Plots of the designed filters are shown in Figure 1-61 and Figure 1-62.
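The control flow of the routine (fit, measure the error, reweight, normalize, repeat) can be sketched on a toy problem. Everything specific below is an assumption made for illustration: the weight update is a generic multiplicative rule using the pointwise absolute error rather than the interpolated group delay envelope, since the update equation of step 6 is not legible here, and a straight-line fit to a parabola stands in for the phase fit of the filter.

```python
def weighted_line_fit(x, y, w):
    """Closed-form weighted least-squares fit of y ~ slope * x + intercept."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    intercept = (swy - slope * swx) / sw
    return slope, intercept

def max_abs_error(x, y, slope, intercept):
    return max(abs(yi - (slope * xi + intercept)) for xi, yi in zip(x, y))

def reweighted_fit(x, y, iterations=4):
    w = [1.0] * len(x)                                   # step 2: unity weights
    for _ in range(iterations):                          # step 8: fixed iteration limit
        slope, intercept = weighted_line_fit(x, y, w)    # step 3: weighted fit
        err = [abs(yi - (slope * xi + intercept))        # stands in for steps 4-5
               for xi, yi in zip(x, y)]
        w = [wi * ei for wi, ei in zip(w, err)]          # step 6: assumed update rule
        peak = max(w)
        w = [wi / peak for wi in w]                      # step 7: normalize to maximum
    return slope, intercept

x = [k / 20 for k in range(21)]
y = [v * v for v in x]
first = max_abs_error(x, y, *weighted_line_fit(x, y, [1.0] * len(x)))
last = max_abs_error(x, y, *reweighted_fit(x, y))
print(first, last)
```

In this toy case the largest error shrinks over the four passes as the weights concentrate where the error envelope is highest, which is the equalizing effect the iterative weighting is meant to achieve.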
It can be clearly noticed that the method presented in this section compares favorably with both Lu’s and Lawson’s. In both cases the filter order was chosen to be one less than in the competing designs, yet this still led to much better passband and stopband ripples as well as group delay ripples. Two values are given for the group delay ripples: the first is the ripple within 96% of the bandwidth and the second is for the full bandwidth. It can be noticed that the stopband ripples are not equal. The purpose was to make a compromise between achieving maximum attenuation and minimum group delay ripples in the passband. However, it is not possible to achieve both equiripple group delay and equiripple stopband ripples [36].
It was noticed that tightening the requirements for the group delay ripples led to decreased stopband performance. Decreasing the group delay ripples by a few percent caused a decrease of a few dB in stopband attenuation. The design method was tested on a number of examples with different cutoff frequencies and transition bandwidths. The best performance for a given filter order and transition band specification is achievable for the case of the halfband filter. In such a case the design takes advantage of the symmetric allpass filter response, which is easier to achieve. The example design uses filter specifications similar to the ones from Table 1-10, with both bandedges shifted in order to set the cutoff frequency at Performance of same filters forced to be symmetric is shown in Table 1-11.
It can be seen that the stopband attenuation does not increase much when the filter is symmetric against improving only by 1.5dB for Lu’s modified specification and not at all for Lawson’s modified filter. However, there is a difference in the group delay ripples, which were halved for both example filters.
4. POLYPHASE FFT ECHO CANCELLATION
In this section an example modification of the subband adaptive polyphase FFT echo cancellation system [79] is presented, in which the standard FIR filter banks are replaced with the polyphase IIR structure. It is demonstrated that such an alternative approach results in a much more computationally efficient implementation, combined with more accurate channel detection and an improvement in the adaptation speed.
4.1 Introduction
Adaptive signal processing applications such as adaptive equalization or adaptive wideband active noise and echo cancellation involve filters with hundreds of taps, required for accurate representation of the channel impulse response. The computational burden associated with such long adaptive filters and their implementation complexity is very high. In addition, adaptive filters with many taps may also suffer from long convergence time, especially when the reference signal has a large dynamic range. It is well known that subband adaptive techniques are well suited for high-order adaptive FIR filters, with a reduction in the number of calculations by approximately the number of subbands, whereby both the number of filter coefficients and the weight update rate can be decimated in each subband. Additionally, faster convergence is possible as the spectral dynamic range can be greatly reduced in each subband [79] and [80]. A number of subband techniques have been developed in the past that use a set of bandpass filters, block transforms [81] or hybrids [82], which introduce path delays dependent on the complexity of the subband filters employed. The architecture proposed in [79] avoids signal path delay while retaining the computational efficiency and convergence speed of the subband processing. The architecture of the delay-less subband acoustic echo cancellation employing LMS in each subband is shown in Figure 1-63 [79].
In this structure x(k) is interpreted as the far-end signal. The signal d(k) is interpreted as the signal received by the microphone, containing some echoes after passing through the channel, and e(k) is the error (the de-echoed return signal as defined in [79]). Both x(k) and e(k) are decomposed into 16 bands and decimated by 16. The LMS algorithm calculates 16 individual filters, one for each subband, which are then recomposed in the frequency domain to obtain the coefficients of the high-order Adaptive FIR Filter. As in the original design [79], the weight update is computed every 128 samples (at a rate 128 times lower than the input rate). The computational requirements can be separated into four parts: subband filtering, LMS, composition of the wideband filter, and signal convolution by the wideband filter. Naylor and Constantinides first proposed replacing the FIR filter bank with polyphase IIR structures for subband echo cancellation [83]. The work reported here presents a further improvement of the computational efficiency of the subband filtering stage of the polyphase FFT subband echo cancellation system. Consider the 16-band polyphase FFT adaptive system with the Adaptive FIR Filter having 1024 taps, in which each subband filter is based on the 128-coefficient prototype FIR filter. The number of Multiply/Add/Accumulate (MAC) operations required for the structure with the FIR filter bank was estimated to be 1088 per input sample. In order to lower the number of calculations per input sample, Morgan suggested updating the weights of the high-order Adaptive FIR filter every 128 samples [79]. He argued that the output of the adaptive filter could not change faster than the length of its impulse response. Applying polyphase IIR filter structures to perform the subband filtering makes it possible to reduce the number of calculations and to improve the convergence and accuracy of adaptation, and it allows efficient implementation.
4.2 Polyphase IIR Filter Bank
The two-path polyphase IIR structures given in [83] and [85] can be modified to perform both the lowpass and the highpass filtering operation simultaneously, as in Figure 1-64(a).
The output of the adder returns the lowpass filtered signal and the output of the subtractor returns the highpass filtered signal. It should be noted that the lowpass and highpass filtering actions are complementary, i.e. they result in perfect reconstruction, giving zero reconstruction error. Adding a Sample Rate Decreaser (SRD) by two yields a two-band subband filter. This basic building block can be modified further by shifting the sample rate decreaser to the input [25]. This modification halves both the number of calculations per input sample and the storage requirements.
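The complementary behavior of the adder and subtractor outputs can be checked numerically. The sketch assumes second-order allpass sections of the form (alpha + z^-2)/(1 + alpha z^-2) and reuses the two coefficients 0.125 and 0.5625 quoted earlier for a two-coefficient filter; the assignment of the coefficients to the branches and all function names are assumptions. Frequency v is normalized so that 0.5 corresponds to Nyquist.

```python
import cmath
import math

def allpass2(alpha, z):
    """Assumed allpass section (alpha + z^-2) / (1 + alpha * z^-2)."""
    z2 = z ** -2
    return (alpha + z2) / (1 + alpha * z2)

def two_path(v, a0=0.125, a1=0.5625):
    """Return (lowpass, highpass) responses of the two-path structure at v."""
    z = cmath.exp(2j * math.pi * v)
    upper = allpass2(a0, z)        # branch without delay
    lower = allpass2(a1, z) / z    # branch with the extra z^-1
    return 0.5 * (upper + lower), 0.5 * (upper - lower)

for v in (0.0, 0.125, 0.25, 0.375, 0.5):
    lp, hp = two_path(v)
    print(v, abs(lp), abs(hp), abs(lp) ** 2 + abs(hp) ** 2)
```

Under these assumptions the squared magnitudes of the two outputs sum to one at every frequency, and both responses cross at about -3dB at v=0.25.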
The polyphase IIR structure incorporates allpass sub-filters, as given by (106), and has a possible structure as given in Figure 1-65.
Because of the small number of calculations required per filter order and its very high performance, such a structure is very attractive for filtering that requires high speed of operation and high levels of integration. The 16-channel subband filtering and 16-times decimation were achieved by incorporating the polyphase IIR filtering block from Figure 1-65 in the structure shown in Figure 1-66, where LPF stands for lowpass filter and the number indicates the number of coefficients. This four-stage structure splits the signal frequency response into equal-size bands, followed by decimation by two. Each of the resulting signals undergoes a similar operation at the next stage. All filters were designed for the same 70dB of attenuation in order to achieve an appropriate separation from the neighboring bands. The transition bandwidths were different, as they had to cater for the decrease of the sampling frequency at which the filters were operating. The required transition bandwidth, TB, and the resulting filter coefficients for each stage of the filter bank are given in Table 1-12.
Each output from the filter bank is applied to its own LMS block, one of 16 (Figure 1-67), each operating on a small fraction of the overall input frequency response, thus achieving fast operation. The channel approximation error is split into 16 frequency bands in the same way as the input signal. This way the phase non-linearity of the polyphase IIR filter bank does not cause errors, as both the input signal and the error are subject to the same group delays. Each LMS block returns a 64-tap FIR filter that decreases the error of channel approximation to an acceptable level in its frequency band. The output of each LMS block is then applied to an N-point FFT, where N=64.
Note that the output bands of the filter bank were not in an increasing order. Additionally the sample rate decrease of the output of the highpass filter causes a flip of the frequency response. Therefore channel re-ordering and re-flipping is necessary before doing the IFFT operation (Figure 1-68).
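The ordering and flipping of the outputs can be derived for an idealized case. The sketch assumes a four-stage binary tree in which every stage splits its input into a lowpass and a highpass half followed by decimation by two, with ideal filters; the function name and the path encoding (0 for lowpass, 1 for highpass, first stage first) are hypothetical. Under these assumptions the branch paths follow a Gray-code order in frequency.

```python
from itertools import product

def band_position(path):
    """Return (frequency index, flipped) for a tree path of 0/1 branch choices."""
    flipped = 0
    index = 0
    for branch in path:
        # When the spectrum at this node is mirrored, the lowpass branch
        # picks up the upper half of the band in true frequency.
        half = branch ^ flipped
        index = 2 * index + half
        # Highpass filtering followed by decimation mirrors the spectrum again.
        flipped ^= branch
    return index, bool(flipped)

for path in product((0, 1), repeat=4):
    index, flipped = band_position(path)
    print(path, "-> band", index, "flipped" if flipped else "upright")
```

In this idealized model every odd-numbered band comes out mirrored, which is why both re-ordering and re-flipping are needed before the spectra are recombined.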
The bank re-ordering block is responsible for arranging the subbands in increasing order of frequency. The SL and SH (2) operators are the frequency selectors, returning the lower half or the upper half of the FFT output respectively. This was necessary as some outputs of the 16-channel subband decomposition had their frequency responses flipped.
The bank re-ordering creates the positive part of the frequency response of the approximated channel filter. Flipping and conjugating the positive part of the frequency response inherently calculated the negative part of the frequency response. This means that the output of the IFFT returns the real FIR filter‚ which accurately approximates both the magnitude response of the channel and its bulk delay.
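The reason the inverse transform returns a real filter can be shown with a small example. The sketch builds the full spectrum from its positive-frequency half by mirroring and conjugating, then applies a direct inverse DFT; the spectrum values and function names are arbitrary illustrations, not data from the echo canceller.

```python
import cmath
import math

def full_spectrum(positive_half):
    """Extend bins 0..N/2 to N bins using H[N-k] = conj(H[k])."""
    mirrored = [h.conjugate() for h in reversed(positive_half[1:-1])]
    return list(positive_half) + mirrored

def idft(spectrum):
    """Direct inverse DFT, adequate for a short illustrative spectrum."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)) / n
            for m in range(n)]

# Arbitrary positive-frequency half for N = 8; bins 0 and N/2 are real.
half = [1.0, 0.5 - 0.2j, -0.3 + 0.1j, 0.2j, 0.25]
taps = idft(full_spectrum(half))
print([round(t.real, 6) for t in taps])
print(max(abs(t.imag) for t in taps))
```

The imaginary parts vanish to within rounding error, so the result can be used directly as the coefficients of a real FIR filter.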
4.3 Comparison of FIR and Polyphase IIR filter banks
The performance of channel identification when using the polyphase IIR filter bank was compared to Morgan’s FIR approach [79], both in theory and in simulation. The analysis included a comparison of the frequency responses of both subband decomposition methods in terms of passband and stopband ripples and channel overlap. Morgan suggested using the ‘fir1’ Matlab routine for designing the prototype FIR filter for the subband filtering, which resulted in a filter as shown in Figure 1-69 [79].
The frequency axis was normalized to the input sampling rate. The band filters achieve -6dB at the crossover point to the next band, while -3dB was required for zero reconstruction error at this point. The FIR prototype had 50dB of attenuation at the center of the next band and 70dB, required for the aliasing noise floor to be below -116dB, at the second band. The proposed polyphase structure achieves -3dB at the Fs/2 point by definition. It reached 45dB of attenuation at the center of the next band and 70dB before the end of the next band, giving a good separation from all the bands except the two adjacent ones. The overall reconstruction error achieved for the polyphase filter bank was below 1e-13 dB, close to the arithmetic accuracy of the simulation platform (Figure 1-70).
One of the main advantages of the polyphase approach is the low number of multiplications required per input sample. The FIR approach as suggested by Morgan requires 32-tap filters and 17 bands, giving an overall MAC requirement of 1088 for subband filtering before the 16-times sample rate decrease. In comparison, the use of the multi-stage multi-rate polyphase IIR structure allowed a decrease in the number of MACs (excluding the trivial subtractions) per input sample to 24 if implemented as in Figure 1-65(a), and only 14 when using the structure in Figure 1-65(b). Half of the calculations for this structure are performed at the odd sample intervals and the rest at the even sample intervals. Practical tests were carried out to compare the channel approximation when using the FIR and the polyphase based approaches. The first one was designed in accordance with the one reported by Morgan [79]. The second one used the polyphase IIR filter bank as described here. The input test signal was speech sampled at 8kHz with 5% additive white noise. The channel was a 50th-order least-squares FIR bandpass filter positioned from 0.0625 to 0.375 on the normalized frequency axis, with 50dB of stopband attenuation and an additional bulk delay of 400 samples.
The performance comparison between the polyphase approach and the standard one employing the FIR filter bank is shown in Figure 1-71. Both approaches accurately detected the 400-sample bulk delay, proving the viability of the method for echo cancellation. Their adaptation speed was similar. The estimated least squared error of channel approximation for the polyphase approach was 6.2dB in the passband and 9.4dB in full range. This shows an improvement in comparison to the FIR approach, which gave 7.5dB error in the passband and 13.5dB in full range. The superior results for the polyphase IIR filter bank can be attributed to its steeper transition bands, perfect reconstruction (zero error), good channel separation and very flat passband response within each band. For an input signal rate of 8kHz the response time to changes in the channel is 0.064 seconds. The adaptation time for the given channel and input signal was measured to be below 0.2 seconds. The channel approximation error fell below 10% in approximately 0.5 seconds. The times were estimated assuming that all calculations were completed within one sample period.
4.4 Summary
In this section the application of polyphase IIR filters to the subband filtering of the polyphase FFT adaptive echo cancellation architecture was presented. The results of the system incorporating the polyphase filter bank were compared to the standard FIR approach as reported in [79]. The alternative multi-stage multi-rate polyphase IIR approach to the design of the subband filter bank gives an almost ten-fold decrease in the number of MACs required. This can easily be translated either into an increased number of bands, for higher fidelity at the same computational cost as that of the FIR approach, or into a low-power subband adaptive echo canceller. Additionally, the polyphase IIR filter structure used here is not very sensitive to coefficient quantization [85], which makes a fast fixed-point implementation of the echo cancellation algorithm an attractive option. Applying the polyphase IIR filters in the filter bank demonstrated more accurate channel detection than was possible with the FIR version suggested by Morgan [79], with much reduced computational complexity. They could also be applicable in echo cancellation applications that require dealing with delay paths in excess of 64ms.
Chapter 2 FREQUENCY TRANSFORMATIONS High-Order Mappings for Digital Signal Processing
1. AN OVERVIEW
The idea of the frequency transformation, where each delay of an existing FIR or IIR lowpass filter transfer function is replaced by the same allpass filter, is a simple one and allows a lot of flexibility in manipulating the original filter to fit the required specification. Although the resulting designs are considerably more expensive in terms of dimensionality than the original prototype, the ease of use (in fixed or variable applications) is a big advantage and has ensured that such mappings are frequently used for IIR filter designs. The general idea of the frequency transformation is to take an existing filter and produce some other filter replica from it in the frequency domain. Up to now the definitive mapping equations are those put forward by Constantinides [30] and since adopted as the “industry standard”. These well-known equations are geared to map lowpass to bandpass and several other highly stylized combinations. They are the culmination of preceding work [37]-[39], which departs from the earliest transformation work by Broome [40], where a simple modulation approach (suffering from severe aliasing) was used. Recent work [41], [42] has strengthened the utility of both of these methods. The basic form of mapping in common use is:
Here is a prototype filter acted upon by, in the general case, an Nth-order complex allpass mapping filter, as described by (2), thus forming a target filter, The choice of an allpass to provide the frequency mapping is necessary to provide the frequency translation of the prototype filter frequency response to the target one by changing the frequency position of the features from the original filter without affecting the overall shape of the filter response.
is the Rotation Factor. The N degrees of freedom provided by the choice of filter coefficients are usually under-used by the restrictive set of “flat-top” classical mappings, like lowpass-to-bandpass, requiring only second-order mapping filters. In general, for the mapping filter, any N transfer function features can be migrated to (almost) any two other frequency locations. The additional requirement for the mapping filter is to keep the poles of its transfer function strictly outside the unit circle - since is substituted for z in the original prototype transfer function - in order to transform a stable prototype filter into a stable target one. The “Rotation Factor”, S, specifies the frequency shift for the target filter. For the case of the real frequency transformation, equation (2) is only allowed to have conjugate pairs of poles and zeros, and only single ones on the real axis; this means:
Furthermore, the selection of the sign of the outside factor S in equation (3) is limited, for the case of the real transformation, to the two choices S=+1 and S=-1. This, as was first pointed out by Constantinides, determines whether the original feature at zero frequency can be moved (“DC mobility”), for the leading minus sign, or whether the Nyquist frequency can be migrated (“Nyquist mobility”), arising when the leading sign is positive. For the case of the complex transformation the factor S is allowed to take any complex value that would satisfy the condition of This shows that for complex transformation both Nyquist and DC mobility can be achieved simultaneously. If the chosen rotation factor is then the mapping filter modifies both the frequency scale of the prototype filter and the values of its magnitude response by changing the radii of the prototype filter pole-zero pairs. For example using a trivial mapping of results in (4).
The frequency does not change in this example, but each filter coefficient is scaled by while the filter frequency response is rotated in frequency by the argument of S. In other words, the factor S causes a windowing effect on the prototype filter with a windowing function resembling a moving average filter:
This function has N zeros equispaced around the unit circle at intervals, except the DC (v=0) and radius The example for different values of S and N=8 is shown in Figure 2-1. Note that in order to change the frequency scale of the filter without affecting the height of its frequency response, the scaling factor should be chosen to be
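The rotation described above can be verified numerically for a rotation factor of unit magnitude. The sketch assumes that the trivial mapping replaces every z^-1 by S z^-1, so that the k-th coefficient is multiplied by S^k; a short FIR prototype with arbitrary coefficients is used purely for illustration and the function names are hypothetical.

```python
import cmath
import math

def freq_response(coeffs, omega):
    """Evaluate sum_k c_k * exp(-j * omega * k)."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs))

def rotate(coeffs, s):
    """Assumed trivial mapping z^-1 -> s * z^-1: scale the k-th coefficient by s**k."""
    return [c * s ** k for k, c in enumerate(coeffs)]

prototype = [0.25, 0.5, 0.25]      # arbitrary lowpass prototype
theta = math.pi / 2                # argument of S
target = rotate(prototype, cmath.exp(1j * theta))

for omega in (0.0, math.pi / 2, math.pi):
    print(omega, abs(freq_response(target, omega)),
          abs(freq_response(prototype, omega - theta)))
```

With |S|=1 the target response at any frequency equals the prototype response at that frequency shifted by the argument of S, so the shape is preserved and only its position changes.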
The mapping equation of (1) has an intuitive graphical interpretation, as shown in Figure 2-2. In the example shown the mapping filter converts a real lowpass filter into a real multiband one. It can be noticed that the phase of the mapping filter acts as a mapping function. The characteristic points of this function are its zero crossings and the discontinuities caused by the phase crossing the boundary. The first one indicates the frequencies where the DC feature of the Prototype Filter is mapped and the other one where the Nyquist of the original filter is placed.
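The role of the allpass phase as a mapping function can be illustrated with the simplest case. The sketch uses the classical first-order real lowpass-to-lowpass mapping, not the higher-order multiband mapping of the figure, and assumes the substitution z^-1 -> (z^-1 - alpha)/(1 - alpha z^-1); frequencies are normalized so that 0.5 is Nyquist and the function names are hypothetical.

```python
import cmath
import math

def lp_to_lp_alpha(v_old, v_new):
    """Classical first-order coefficient moving a feature from v_old to v_new."""
    return math.sin(math.pi * (v_old - v_new)) / math.sin(math.pi * (v_old + v_new))

def mapping_filter(alpha, z):
    """First-order allpass substituted for z^-1 in the prototype."""
    zi = 1 / z
    return (zi - alpha) / (1 - alpha * zi)

def prototype_frequency(alpha, v_target):
    """Prototype frequency that appears at v_target in the target filter."""
    z = cmath.exp(2j * math.pi * v_target)
    return -cmath.phase(mapping_filter(alpha, z)) / (2 * math.pi)

alpha = lp_to_lp_alpha(0.25, 0.125)   # move the halfband edge to 0.125
for v in (0.0, 0.0625, 0.125, 0.25):
    print(v, prototype_frequency(alpha, v))
```

The negated, scaled phase of the mapping filter gives, for each target frequency, the prototype frequency whose response appears there: DC stays at DC and the prototype feature at 0.25 lands at 0.125.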
Though the enhanced design flexibility that frequency transformations offer is readily evident, there has been little work reported in this area. Standard transformations under-use the freedom given in (1) by limiting themselves to simple first-order mapping filters performing lowpass and highpass transformations, and second-order mapping filters for bandpass and bandstop targets. Although Mullis and colleagues have given one very useful multiband solution to the general mapping problem [45]-[48], it seems that scant application experience of that method has been related in the IIR design literature. Recent work reported in [42] showed how N arbitrary features of the prototype can be mapped by employing an allpass mapping filter easily defined by solving a set of N complex linear equations, which gives real mapping filter coefficients. In [43] a different approach to the design of the mapping filter was suggested. Typical approaches are based on mapping a selected feature to its new location, which gives certain mapping filter coefficients. Using such an approach, designers can only hope that the filter behavior between the specified features will be correct. This is true when the allpass mapping filter order equals the number of replicas. A better way to design a mapping filter is to concentrate explicitly on designing its phase. If this is done through deployment of poles and zeros then such a design becomes easier and surer. Avoiding additional replicas is not an easy task and may not work for all design cases, as discussed in [43], which also comments on the dependence of the target filter stability on the behavior of the prototype filter and the mapping filter.
1.1 Selecting the Features
Choosing the appropriate frequency transformation for achieving the required effect, and the correct features of the prototype filter, is very important and needs careful consideration. It is not possible to use a first-order transformation for controlling more than one feature, as the mapping filter will not give enough flexibility. Nor is it good to use a high-order transformation just to change the cutoff frequency of a lowpass filter, as this unnecessarily increases the filter order, not to mention the additional replicas of the original filter that may be created in undesired places.
In order to illustrate the second-order real transformation, it was applied three times to the same elliptic halfband prototype lowpass filter in order to make it into a bandpass filter, each time selecting two different features for the transformation (Figure 2-2). This filter was designed in Matlab to have ripples of 0.1dB in the passband and -30dB in its stopband. The idea was to convert the prototype filter into a bandpass one with the passband ranging from 0.125 to 0.375 on the normalized frequency scale. In each of the three cases different features of the prototype filter were selected. In the first case the selected features were the left and the right bandedges of the lowpass filter passband, in the second case they were the left bandedge and the DC, and in the third case they were the DC and the right edge of the filter passband, as shown in the figure. The results of all three approaches are different, as shown in Figure 2-3. For each of them only the selected features were positioned precisely where they were required. In the first case the DC is moved towards the left passband edge, just like all the other features close to the left edge, which are squeezed there. In the second case the right passband edge was pushed well away from the expected target, as the precise position of DC was required. In the third case the left passband edge was pulled towards the correctly positioned DC feature.
The conclusion is that if the DC feature may end up anywhere in the passband, the edges of the passband should be selected for the transformation. For most cases requiring the positioning of passbands or stopbands, the positions of the edges of the original filter need to be well chosen so that the edges of the target filter end up in the correct places.
1.2
Designing the mapping filter
It is frequently required to have greater control of the filter behavior, at frequencies other than the (almost) arbitrary pair under direct control, than that delivered through the standard first- and second-order transformations [45]-[48]. A typical example is the need to convert a lowpass filter to a bandpass one while retaining the capability of precise placement of the upper and lower bandedge frequencies, along with a couple of specified intervening frequency features.
Certainly such enhanced transfer function control can only be achieved at a cost in complexity, as each pole-zero pair of the prototype filter is replaced with N such pairs for the mapping filter, as given by (2). Nevertheless, practical goals such as rapid re-design in tunable filtering scenarios are among the good reasons why practitioners might wish to absorb this cost, and they have provided the motivation for a noteworthy body of earlier work [42],[43]. This gives four distinct opportunities for influencing the overall filtering operation:
a) Choice of the structure and order for the prototype filter
b) Selection of coefficients for the prototype filter
c) Choice of the structure and order N for the mapping filter
d) Selection of the coefficients of the mapping filter in (2)
Real-time design in items (b) and (d), in particular, gives a nice way of achieving nested variability. There is, moreover, scope for driving these changes (including even dynamic change of N) in an adaptive IIR arrangement. This adds greatly to the appeal of the whole approach, and has motivated our development of a general matrix solution equation for the N mapping filter coefficients in (2) in terms of arbitrary frequency migration specifications. The coefficients of the mapping filter can be calculated by solving a set of N linear equations created from N migration pairs:
Where the phase factors
for
are given by:
For real frequency transformations equation (6) can be simplified to:
The designer needs only (in principle) to specify the N migration pairs, solve either (6) or (8), depending on the type of transformation, for the mapping filter coefficients, and then map with (1). The only difficulty lies in selecting allowable mapping point pairs at the outset of the procedure. For the best results the features to map from the prototype filter should be sorted in either increasing or decreasing order of their values. In most practical cases there is no need to use transformations of order higher than two. The flexibility of movement of the filter features is often lost as the filter dimensionality increases, especially for large prototype filters: when a prototype IIR filter is transformed by a mapping filter, the order of the resulting target filter is the product of the two orders. The next sections cover the first- and second-order mapping cases and the most commonly used high-order transformation, the multiband one, for both real and complex situations. At the end the general case is presented.
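The procedure of specifying N migration pairs and solving N linear equations can be sketched numerically. The block below is a minimal illustration for the real case only, under an assumed parametrisation of the allpass mapping filter (mirrored numerator, leading denominator coefficient fixed to one, mapping filter substituted for z^-1). The function names `allpass_mapping` and `evaluate`, the sign conventions and the form of the phase factor are assumptions of this sketch and are not claimed to match (6) or (8) term by term.

```python
import numpy as np

def allpass_mapping(w_old, w_new, S=1):
    """Solve N linear equations for a real N-th order allpass mapping filter.

    Assumed (hypothetical) parametrisation, with alpha[0] = 1:
        A(z) = S * sum_i alpha[i] z^-(N-i) / sum_i alpha[i] z^-i
    A(z) replaces z^-1, so each migration pair must satisfy
        A(exp(j*w_new[k])) = exp(-j*w_old[k]).
    """
    w_old = np.asarray(w_old, dtype=float)
    w_new = np.asarray(w_new, dtype=float)
    N = len(w_new)
    phi = (N * w_new - w_old) / 2.0                 # one phase factor per pair
    arg = phi[:, None] - np.outer(w_new, np.arange(N + 1))
    rows = np.sin(arg) if S == 1 else np.cos(arg)   # S = +1 or S = -1
    alpha = np.linalg.solve(rows[:, 1:], -rows[:, 0])
    den = np.concatenate(([1.0], alpha))            # ascending powers of z^-1
    num = S * den[::-1]                             # mirrored numerator
    return num, den

def evaluate(num, den, w):
    """Evaluate num/den (ascending powers of z^-1) on the unit circle."""
    zi = np.exp(-1j * np.asarray(w, dtype=float))
    return np.polyval(num[::-1], zi) / np.polyval(den[::-1], zi)

# Second-order example: the cutoff of a half-band prototype (pi/2) is
# mapped to two band edges, 0.2*pi and 0.5*pi (lowpass-to-bandpass, S = -1).
wc = 0.5 * np.pi
w_old = np.array([-wc, wc])
w_new = np.array([0.2, 0.5]) * np.pi
num, den = allpass_mapping(w_old, w_new, S=-1)
print("denominator:", den)
print("poles inside unit circle:", np.all(np.abs(np.roots(den)) < 1))
```

Only the selected pairs are guaranteed to land exactly; the stability of the resulting mapping filter still has to be checked, as done in the last line.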
1.3
First-order transformations
We can think of (2) as relating old and new z-domain pairs of features. For the first-order mapping case equation (2) reduces to:
For the general case of the complex transformation the selection of two distinct migrations is allowed. In order to express (9) in the frequency domain we need to evaluate it on the unit circle. Factor S in (9) specifies the distance around the unit circle between the two points. With these relations the parameters of the mapping filter can be calculated.
For the first-order mapping filter to perform a real transformation, S and the remaining parameters of the mapping filter must be real. This means that for the case of a real transformation only one mapping pair can be specified. The rotation factor S can then only take the values S=(+1) or S=(-1). Selecting S=(+1) defines a lowpass-to-lowpass mapping, while choosing S=(-1) creates a mirror image of the original transfer function, i.e. a lowpass-to-highpass mapping, according to the property of the Z-transform:
The possible choices of first-order transformations, both for real and complex cases, are shown below including standard Constantinides cases. For all the examples the prototype was designed as a third-order elliptic filter having 0.1dB ripples in the passband and 30dB stopband attenuation. The transfer function of this filter is given by:
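A filter meeting the same specification can be obtained numerically. The sketch below assumes SciPy's `scipy.signal.ellip` and a half-band cutoff at half the Nyquist frequency; it is not claimed to reproduce the exact coefficients of (12).

```python
import numpy as np
from scipy import signal

# Third-order elliptic lowpass: 0.1 dB passband ripple, 30 dB stopband
# attenuation, cutoff at 0.5 (SciPy normalises the Nyquist frequency to 1).
b, a = signal.ellip(3, 0.1, 30, 0.5)

# Frequency response on a grid from 0 up to (but excluding) pi rad/sample.
w, h = signal.freqz(b, a, worN=2048)
gain_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))

print("numerator  :", b)
print("denominator:", a)
print("gain at the passband edge [dB]:", gain_db[1024])
```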
1.3.1
Complex frequency shift (complex rotation)
This is the simplest first-order transformation which is also the only one that performs exact mapping of all the features of the prototype filter frequency response into their prescribed new locations. Its purpose is to rotate the whole response of the prototype lowpass filter by the distance specified by the selection of the feature from the prototype filter and corresponding one from the target filter. The mapping filter is given by:
With
defined as:
Where is the frequency location of the selected feature in the prototype filter and is the position of the corresponding feature in the target filter. The example of rotating by is shown in Figure 2-5.
Target filter coefficients can be calculated from the prototype filter using:
A special case of the complex rotation in common use is the mirror transformation, obtained by shifting by half the sampling frequency (a rotation of the whole response by π), when the target filter is a mirror image of the prototype filter against the half-Nyquist frequency.
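The complex rotation can be illustrated directly on the filter coefficients. The sketch below assumes coefficient arrays in ascending powers of z^-1 and a rotation angle `w0` in radians per sample; the helper names `rotate` and `response` and the example filter are hypothetical. Multiplying the n-th coefficient by exp(j*w0*n) moves every feature of the response by the same amount, and a rotation by π reduces to alternating the coefficient signs (the mirror case).

```python
import numpy as np

def rotate(b, a, w0):
    """Shift the whole frequency response by w0: H_t(e^{jw}) = H(e^{j(w - w0)})."""
    b = np.asarray(b, dtype=complex)
    a = np.asarray(a, dtype=complex)
    return b * np.exp(1j * w0 * np.arange(len(b))), a * np.exp(1j * w0 * np.arange(len(a)))

def response(b, a, w):
    """Evaluate B/A (ascending powers of z^-1) at the frequencies w."""
    zi = np.exp(-1j * np.asarray(w, dtype=float))
    return np.polyval(np.asarray(b)[::-1], zi) / np.polyval(np.asarray(a)[::-1], zi)

# Arbitrary stable second-order filter, used only for illustration.
b = np.array([0.2, 0.3, 0.2])
a = np.array([1.0, -0.5, 0.2])

w = np.linspace(-np.pi, np.pi, 50)
w0 = 0.3 * np.pi
bt, at = rotate(b, a, w0)
print(np.allclose(response(bt, at, w), response(b, a, w - w0)))

# Mirror case: a rotation by pi alternates the signs of the coefficients.
bm, am = rotate(b, a, np.pi)
print(np.round(bm.real, 12), np.round(am.real, 12))
```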
The example design is shown in Figure 2-6 for a quarter-band lowpass prototype filter. Such a transformation can be used to quickly convert a lowpass filter into its highpass mirror complement. One possible use is in the design of complementary (matched) filters for quadrature mirror filter banks. Another special case of the complex rotation, often used in practical applications, is the Hilbert transformation. In this case the rotation factor is or for the inverse Hilbert transformation.
The typical use of the Hilbert transformation is for Single Sideband Modulation (SSM) and demodulation, and for extracting the envelope of oscillating signals in measurement applications.
1.3.2
Real lowpass-to-lowpass
This transformation converts the prototype filter so that the DC and Nyquist features stay locked in their places while one selected feature of the prototype filter frequency response is mapped into a new location. The mapping filter is derived from (9) by using:
The example use of this mapping for moving the cutoff of the prototype halfband filter of (12) from to is shown in Figure 2-7.
Note that freezing the DC and Nyquist locations and moving one feature to a new location causes stretching and contracting of the rest of the filter frequency response. Calculating the mapping function allows those effects to be assessed and shows where the other features are mapped. By evaluating (12) on the unit circle the mapping functions of (17) are obtained.
Some cases of (17) are shown in Figure 2-9. It can be noticed that when the selected feature moves towards the Nyquist frequency, the features above it get squeezed while the ones below get stretched. The effect is opposite when the selected feature moves towards the DC.
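The stretching and squeezing can be checked numerically. The sketch below uses the standard textbook form of the real lowpass-to-lowpass mapping, z^-1 -> (z^-1 - alpha)/(1 - alpha z^-1); the sign convention and the names `lp2lp_alpha` and `mapping_function` are assumptions of this illustration and may differ from the notation of (17).

```python
import numpy as np

def lp2lp_alpha(w_old, w_new):
    """Coefficient moving the feature at w_old (prototype) to w_new (target)."""
    return np.sin((w_old - w_new) / 2) / np.sin((w_old + w_new) / 2)

def mapping_function(alpha, w_new):
    """Prototype frequency seen at each target frequency w_new in (0, pi)."""
    zi = np.exp(-1j * np.asarray(w_new, dtype=float))
    return -np.angle((zi - alpha) / (1 - alpha * zi))

# Move the cutoff of a half-band prototype from 0.5*pi down to 0.25*pi.
alpha = lp2lp_alpha(0.5 * np.pi, 0.25 * np.pi)
w = np.linspace(0.01, 0.99, 99) * np.pi
theta = mapping_function(alpha, w)

print("alpha =", alpha)
print("selected feature lands at", mapping_function(alpha, 0.25 * np.pi) / np.pi, "x pi")
# Slope above one near DC: prototype features below the cutoff are squeezed.
print("slope near DC      :", (theta[1] - theta[0]) / (w[1] - w[0]))
print("slope near Nyquist :", (theta[-1] - theta[-2]) / (w[-1] - w[-2]))
```

In this example the selected feature moves towards DC, so the features below it are squeezed and those above it are stretched, which is the opposite of the case of moving towards the Nyquist frequency.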
1.3.3
Real lowpass-to-highpass
This transformation is analogous to the real lowpass-to-lowpass one, the only difference being that the DC and Nyquist features replace each other. This in effect converts a lowpass filter into a highpass one and vice versa. The mapping filter is derived from (9) by using:
The example use of this mapping for moving the cutoff of the prototype halfband filter of (12) from to at the same time changing its lowpass character into a highpass, is shown in Figure 2-8.
The mapping function
calculated from (12), is given by:
The relation of (19) is very similar to the one of (18) for the real lowpass-to-lowpass transformation, shown in Figure 2-9. The difference is that the mapping function is shifted by half of the Nyquist frequency, creating a phase discontinuity at DC (a clear indication that the Nyquist feature is mapped to DC). On reaching the Nyquist frequency the mapping function goes to zero, indicating that the DC feature is mapped to this frequency.
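The whole lowpass-to-highpass conversion, including the substitution of the mapping filter into the prototype, can be sketched as follows. The closed-form coefficient is the standard textbook (Constantinides) expression with S = -1; the helper names `substitute` and `lp2hp_allpass`, and the use of `scipy.signal.ellip` for the prototype, are assumptions of this illustration rather than the notation of (18) and (19).

```python
import numpy as np
from scipy import signal

def substitute(b, a, num, den):
    """Replace z^-1 in B(z^-1)/A(z^-1) by num(z^-1)/den(z^-1).

    All arrays hold real coefficients in ascending powers of z^-1.
    """
    M = max(len(b), len(a)) - 1
    def compose(c):
        c = np.pad(np.asarray(c, dtype=float), (0, M + 1 - len(c)))
        out = np.zeros(M * (len(num) - 1) + 1)
        for k, ck in enumerate(c):
            term = np.array([ck])
            for _ in range(k):          # num^k
                term = np.convolve(term, num)
            for _ in range(M - k):      # den^(M-k) clears the fractions
                term = np.convolve(term, den)
            out += term
        return out
    bt, at = compose(b), compose(a)
    return bt / at[0], at / at[0]

def lp2hp_allpass(w_old, w_new):
    """First-order real mapping with S = -1: z^-1 -> -(z^-1 + alpha)/(1 + alpha z^-1)."""
    alpha = -np.cos((w_old + w_new) / 2) / np.cos((w_old - w_new) / 2)
    return -np.array([alpha, 1.0]), np.array([1.0, alpha])

b, a = signal.ellip(3, 0.1, 30, 0.5)            # half-band lowpass prototype
num, den = lp2hp_allpass(0.5 * np.pi, 0.75 * np.pi)
bt, at = substitute(b, a, num, den)             # highpass with cutoff at 0.75*pi

_, h = signal.freqz(bt, at, worN=[0.75 * np.pi, np.pi])
print("gain at the new cutoff [dB]:", 20 * np.log10(abs(h[0])))
print("gain at Nyquist            :", abs(h[1]))
```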
1.3.4
Complex lowpass-to-bandpass
This transformation is derived as a cascade of the real lowpass-to-lowpass mapping and the complex frequency rotation. It performs exact mapping of one selected feature of the prototype filter frequency response into two new locations in the target filter, creating a passband between them. Both Nyquist and DC features can be moved with the rest of the frequency response. The mapping filter is derived from (9) by using:
Frequency location of the selected feature in the prototype filter
Position of the feature originally at in the target filter
Position of the feature originally at in the target filter
The mapping function is given by:
The example in Figure 2-12 shows the use of such a transformation for converting a real half-band lowpass filter into a complex bandpass one with band edges at and
The shape of the mapping function is shown in Figure 2-13 for different values of and the DC shift as in the example from Figure 2-12. The features mapped in the example are marked on the plots.
1.3.5
Complex lowpass-to-bandstop
This first-order transformation performs exact mapping of one selected feature of the prototype filter frequency response into two new locations in the target filter creating a stopband between them. Both Nyquist and DC features can be moved with the rest of the frequency response. The mapping filter is derived from (9) by using:
Frequency location of the selected feature in the prototype filter
Position of the feature originally at in the target filter
Position of the feature originally at in the target filter
The mapping function is given by:
Example in Figure 2-14 shows converting a real half-band lowpass filter into a complex bandstop with bandedges at and
The shape of the mapping function is shown in Figure 2-13 for different values of and the Nyquist shift as in the example from Figure 2-12. The features mapped in the example are marked on the plots.
1.3.6
Complex bandpass-to-bandpass
This first-order transformation performs exact mapping of two selected features of the prototype complex bandpass filter into two new locations. Both Nyquist and DC features can be moved with the rest of the frequency response. The mapping filter is derived from (9) by using:
Frequency locations of selected features in the prototype filter
Position of the feature originally at in the target filter
Position of the feature originally at in the target filter
The mapping function is given by:
The example in Figure 2-16 shows the conversion of a complex bandpass filter with bandedges originally at and into a new complex bandpass filter with bandedges at and
The shape of the mapping function is shown in Figure 2-17 for different values of and the Nyquist shift as in the example from Figure 2-16. The features mapped in the example are marked on the plots.
1.4
Second-order transformations
For the case of the second-order mapping old and new z-domain images, and are related by:
D(z) is the denominator polynomial. The numerator of the filter is a mirrored and conjugated version of the denominator polynomial. The phase response can then be calculated using the method from [86], which for the case of the real transformation leads to the mapping function given below:
The second-order mapping function allows two distinct migrations to be specified independently. They are then used to solve the two simultaneous equations which arise from equation (26). For the case of a real transformation the result is:
Where:
The upper (+) sign in (14) is applicable for “Nyquist mobility”, while the lower one for the “DC mobility”. Explicit unit-circle form is formed by replacing “old” location, with and appending suitable subscripts. Likewise it is done for “new” unit-circle locations, Here is the normalised frequency variable. This yields the more complete relations: (30) for DC mobility and (31) for Nyquist one.
In the well-known Constantinides formulas “old” features are usually cast as bandedges whose “new” images are also bandedges. For instance, selection of (where is the edge of the passband of a prototype lowpass filter), along with corresponding images and will deliver a bandpass resulting filter with its passband within the positive-frequency band. However, as Constantinides pointed out in [30], the DC frequency gain of the prototype does not map to but rather is warped to a location, which requires calculation of an additional equation. It is easy to modify the standard Constantinides result. For instance, to explicitly control the movement of the DC feature and one bandedge the parameters and are selected for use in (30) and (31). This results in a “DC plus upper bandedge” controlled alternative to Constantinides’ lowpass-to-bandpass equation. However, explicit control is always limited to only two features as long as a second-order mapping filter is employed.
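The warping of the DC feature can be made concrete with the standard textbook form of the Constantinides lowpass-to-bandpass mapping. The sketch below is an illustration under that assumed form; the names `lp2bp_constantinides` and `evaluate` and the example band edges are hypothetical, and the notation may differ from (30) and (31).

```python
import numpy as np

def lp2bp_constantinides(wc, w1, w2):
    """Second-order real mapping sending the prototype cutoff wc to edges w1 < w2.

    z^-1 -> -(z^-2 - beta z^-1 + gamma) / (gamma z^-2 - beta z^-1 + 1)
    """
    alpha = np.cos((w2 + w1) / 2) / np.cos((w2 - w1) / 2)
    k = np.tan(wc / 2) / np.tan((w2 - w1) / 2)
    beta, gamma = 2 * alpha * k / (k + 1), (k - 1) / (k + 1)
    num = -np.array([gamma, -beta, 1.0])    # ascending powers of z^-1
    den = np.array([1.0, -beta, gamma])
    return num, den, alpha

def evaluate(num, den, w):
    """Evaluate num/den (ascending powers of z^-1) on the unit circle."""
    zi = np.exp(-1j * np.asarray(w, dtype=float))
    return np.polyval(num[::-1], zi) / np.polyval(den[::-1], zi)

wc, w1, w2 = 0.5 * np.pi, 0.2 * np.pi, 0.5 * np.pi
num, den, alpha = lp2bp_constantinides(wc, w1, w2)

w_dc = np.arccos(alpha)     # target frequency receiving the prototype DC feature
w_mid = (w1 + w2) / 2       # arithmetic centre of the passband

print("band edges map to +/- wc:", np.angle(evaluate(num, den, [w1, w2])) / np.pi)
print("DC feature lands at %.4f*pi, band centre is %.4f*pi" % (w_dc / np.pi, w_mid / np.pi))
```

Only the two band edges are placed exactly; the location of the DC feature follows from them and is not a free choice in this form of the mapping.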
It is worthwhile reflecting upon whether any two combinations will be legitimate when using (22). The intention might be to map a stable prototype to a stable purely through use of (1), taking to have poles strictly outside the unit circle. Such a condition is easily seen to be sufficient to guarantee such stability inheritance. Therefore certain restrictions in mapping pairs must be observed. Note that a minimum-phase numerator (all zeros inside the unit circle) in (5) requires:
Then (30) and (31) begin to reveal the interplay of allowable “old” and “new” locations if these are specified in unit-circle forms (as is most often of interest to the filter designer). Continuing in this way and demanding DC movement to frequency the following relation is obtained:
Finally:
This finishes the discussion on allowable frequency specification combinations. A list of general real and complex choices of frequency transformations covered by second-order mapping functions is given below.
1.4.1
Real lowpass-to-bandpass
This transformation performs exact mapping of one selected feature of the prototype filter frequency response, namely the cutoff frequency, into two new locations, and in the target filter creating a passband between them. The DC feature moves with the rest of the frequency response, while the Nyquist one stays fixed. The mapping filter is derived from (26) using:
Frequency location of the selected feature in the prototype filter
Position of the feature originally at in the target filter
Position of the feature originally at in the target filter
The example in Figure 2-18 shows the use of such a transformation for converting a real half-band lowpass filter into a real bandpass one with band edges at and The shape of the mapping function corresponding to this example is presented in Figure 2-19.
1.4.2
Real lowpass-to-bandstop
This transformation performs exact mapping of one selected feature of the prototype frequency response, namely the cutoff frequency, into two new locations, and being the passband edges of the target filter. The Nyquist feature moves with the rest of the frequency response, while the DC one stays fixed.
The mapping filter is derived from (26) using:
Frequency location of the selected feature in the prototype filter
Position of the feature originally at in the target filter
Position of the feature originally at in the target filter
The example in Figure 2-20 shows the use of such a transformation for converting a real half-band lowpass filter into a real bandstop one with band edges at and The shape of the mapping function corresponding to this example is presented in Figure 2-21.
1.4.3
Real frequency shift
This transformation performs exact mapping of one selected feature of the prototype filter frequency response, namely the cutoff frequency, into a new location in the target filter, performing an operation of frequency shift. It is very similar to the real lowpass-to-bandpass transformation, likewise moving the Nyquist feature with the rest of the frequency response and keeping the DC feature fixed. The only difference is that any feature of the prototype filter can be selected. It is then mapped to the exact position in the target filter. The mapping filter is derived from (25) using (33).
S = -1
Frequency location of the selected feature in the prototype filter.
Position of the feature originally at in the target filter.
The example in Figure 2-20 shows the use of such a transformation for shifting the cutoff of the real half-band lowpass filter to a new frequency location converting it into a real bandstop filter.
The mapping function shape for the above example is given in Figure 2-21. Note that the real shift transformation is not linear. It shifts features of the prototype filter differently, depending on their distance from the selected feature, which is the only one shifted correctly. Features between and will be moved by smaller amounts while the other ones will be moved by bigger amounts. This is an unavoidable side effect of this transformation, needed to preserve the real character of the filter.
2.
M-BAND TRANSFORMATION
This transformation performs an exact mapping of one selected feature of the prototype filter frequency response into a number of new locations in the target filter. Its most common use is to convert a real lowpass filter with predefined passband and stopband ripples into a multiband filter with arbitrary band edges. The order of the mapping filter must be even, which corresponds to an even number of band edges in the target filter. The complex allpass mapping filter is derived from (6) and given by:
Where:
For the case of the real transformation the mapping filter coefficients can be calculated from a much simpler set of N linear equations:
In the fourth-order example shown in Figure 2-24 the cutoff frequency of the prototype filter, originally at was mapped to four new frequencies. Rotation factor S specifies whether the DC should be mapped to itself (S=1), similar to the lowpass-to-lowpass transformation (solid line), or replaced with the Nyquist (S=-1) as for the case of the lowpass-to-highpass mapping (dashed line).
The shape of the mapping function corresponding to the example from Figure 2-24 is given in Figure 2-25 with mapping points marked. The flexibility offered by this transformation is presented in Figure 2-26 for a number of designs with one bandedge fixed and other ones varying.
The direct calculation of the multiband transformation parameters by solving a simple set of linear equations gives a great advantage for applications requiring adaptive band tuning. The list of applications includes digital equalizers, noise, echo and interference cancellation, and others.
2.1.1
Rate-up transformation
The rate-up mapping is a special case of the M-band transformation, where the target filter is a symmetric multi-replica version of the prototype filter. The mapping filter takes a trivial form of (41).
The example of this transformation for the case of M=3, creating three replicas of the original lowpass filter around the unit circle, is shown in Figure 2-27. The cutoff frequency of the original filter, and the bandedges of the target filter, (where was mapped) are shown on the plots of magnitude responses.
Rate-up mapping is a linear mapping that affects the distances between all features on the frequency axis in the same way, which becomes even clearer when looking at the mapping function plot corresponding to the example (Figure 2-28).
The most common use of this transformation is for shifting the position of the digital filter from one place in the system chain to the other one operating at a different rate, assuming integer ratio of both frequencies. This is especially useful for polyphase IIR filters as shown later in Chapter 3.
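On the coefficient level the rate-up mapping amounts to replacing z^-1 by z^-M, that is, to inserting M-1 zeros between consecutive coefficients. A minimal sketch follows; the helper names `rate_up` and `response` and the example filter are hypothetical.

```python
import numpy as np

def rate_up(c, M):
    """Insert M-1 zeros between coefficients: C(z) -> C(z^M)."""
    c = np.asarray(c, dtype=float)
    out = np.zeros((len(c) - 1) * M + 1)
    out[::M] = c
    return out

def response(b, a, w):
    """Evaluate B/A (ascending powers of z^-1) at the frequencies w."""
    zi = np.exp(-1j * np.asarray(w, dtype=float))
    return np.polyval(b[::-1], zi) / np.polyval(a[::-1], zi)

# Arbitrary stable second-order filter, used only for illustration.
b = np.array([0.2, 0.3, 0.2])
a = np.array([1.0, -0.5, 0.2])

M = 3
bu, au = rate_up(b, M), rate_up(a, M)
w = np.linspace(0.0, np.pi, 64)

# The target response is the prototype response on an M times compressed axis,
# which produces M replicas around the unit circle.
print(np.allclose(response(bu, au, w), response(b, a, M * w)))
```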
3.
GENERAL N-POINT (N-IMAGE) MAPPING
Usage of a “transformation filter” like is often inadequate. It is frequently required to have greater control, at other frequencies besides the (almost) arbitrary pair under direct control, than that delivered as a consequence of the warping attending this simple transformation. A typical example is the need to transform a lowpass filter to a bandpass one while retaining the capability of precise placement of the upper and lower bandedge frequencies, along with a couple of specified intervening frequency features. This requires modification and extension to higher-order versions of (1) and (5). In particular, a mapping filter would replace in (1). Of course such enhanced transfer function control can only be achieved at a cost in complexity, as can be seen clearly from the escalated dimensionality of PZP characteristic of such designs. Nevertheless, practical goals such as rapid re-design in tuneable filtering scenarios are among the good reasons why practitioners might wish to absorb this cost, and they have provided motivation for earlier work [49]-[50]. The approach is indeed flexible. Equation (42) gives the opportunity to control up to N independent features of the prototype filter spectrum. Standard transformations under-use the flexibility offered by frequency transformations by limiting them to simple first-order mapping filters performing lowpass and highpass transformations, or second-order “mappers” for bandpass and bandstop targets. Clearly there is a need to extend beyond the traditional second-order mappings, which can control redeployment of only two transfer function features.
The D(z) term is the denominator polynomial. The numerator of the transfer function is a mirror image of the denominator polynomial. The filter is assumed to be stable, that is, it has all its poles inside the unit circle and hence all its zeros outside the unit circle at reciprocal locations. The phase response can be expressed as:
The
term is the phase response of
that is:
The mapping function
can then be expressed as:
Equation (42) gives four possibilities of influencing the overall operation:
a) Choice of the structure and order for the prototype filter
b) Selection of coefficients for the prototype filter
c) Choice of the structure and order N for the mapping filter
d) Selection of the coefficients of the mapping filter in (42)
The real-time design of items (b) and (d), in particular, gives a nice way of achieving nested variability. There is, moreover, scope for driving these changes (including even dynamic change of N) in an adaptive IIR arrangement. This has motivated the development of a general matrix solution equation for the mapping filter coefficients in (42) in terms of (almost) arbitrary frequency migration specifications. Considering (42) as a mapping:
From all the N mappings
a set of N linear equations is created:
The designer need only (in principle) specify the N migration pairs, solve (47) for the mapping filter coefficients, and then map with (1). The only difficulty lies in selecting allowable mapping point pairs at the outset of the procedure. It has been established as a rule of thumb that, in order to avoid bad conditioning of the matrix of equations, the migrations should be arranged in increasing order of frequencies, that is:
4.
PHASE RESPONSE OF THE MAPPING FILTER
All-important is the behaviour of an allpass mapping filter in its phase response. This is frequently forgotten by designers who focus on moving only selected points of the frequency response of the original filter to their new locations without considering what happens in between the selected features. If the mapping filter phase is not monotonic then inevitably “breakup” of the prototype filter transfer function will take place and an “N-for-N” multi-banding will result. Enforcement of the monotonic change of the mapping filter phase response is therefore a goal inseparable from the “less-than-N-band” mapping task. It is this realization which has motivated the prime breakthrough in our work: a viewpoint of constrained design of mapping filters, as opposed to rather haphazardly spawning them implicitly through (closed-form or linear equation) specification of isolated points on the phase characteristic. The new method reported enforces monotonic change of the mapping filter phase by design; the shape of the allpass filter phase response between specific points (responsible for delivering the mapping of chosen features) is controlled. This requires use of an allpass filter to map M, (M
Down-sampling by an integer factor M>1 on a sequence x(n) consists of keeping every M-th sample of x(n) and removing the M-1 samples in between. This principle will be used later in converting classical decimation filter structures into those more efficient in terms of the number of operations performed per output sample. In the frequency domain, the input-output relation of a factor-of-M down-sampler is given by:
Here and denote the Fourier Summation Transforms (FST) of x(n) and y(n) respectively. is the sum of M frequency scaled and shifted images of with adjacent images of Analysis in the frequency domain indicates a possible problem of overlapping images of (“aliasing”), which can be faced during down-sampling, as explained in Figure 3-26 for down-sampling by M=2. Notice that the original shape of is lost when x(n) is down-sampled, although it is still periodic. Unless is band-limited to the normalised frequency range of there will be an overlap of adjacent terms, causing aliasing. To prevent any aliasing that may be caused by the down-sampling process, x(n) is passed through a lowpass filter approximating the ideal frequency response of (7) before it is down-sampled, as indicated in Figure 3-27.
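The sum-of-images relation can be verified numerically in its finite-length (DFT) form. The sketch below assumes a random test sequence whose length L is a multiple of M; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 2, 64                      # down-sampling factor and signal length (L divisible by M)
x = rng.standard_normal(L)
y = x[::M]                        # keep every M-th sample

X = np.fft.fft(x)
Y = np.fft.fft(y)

# DFT form of the relation: the spectrum of y is the average of M shifted
# images of the spectrum of x, each image being a block of K = L/M bins.
K = L // M
Y_images = sum(X[m * K:(m + 1) * K] for m in range(M)) / M

print(np.allclose(Y, Y_images))
```

Because the images are simply added, any energy of x(n) above the new Nyquist frequency falls on top of the baseband content, which is the aliasing described above.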
Such a combination of an anti-aliasing filter and a down-sampler is called a decimator or a decimation filter [72]. Subsequent sections present efficient ways of implementing such decimation filters using ordinary FIR filters, general IIR ones and special class of polyphase filters from Chapter 1.
3.1
FIR decimation filter
FIR filters are based on the idea of a Finite Impulse Response (FIR): only up to N old samples of the incoming signal need to be stored, and the valid output is calculated from the latest N input samples alone. This makes them very easy to implement. The only difficulty is that, in order to achieve effective filtering, such filters must have a large number of coefficients, which implies a large number of multiplications. In the case of multirate signal processing, like decimation, where samples arrive at a very high rate, performing a large number of calculations at the signal rate is a demanding task, requiring a very fast calculation unit (and/or fast processor clock) and a lot of buffer memory. This means a large integrated circuit area and large power consumption. Thus, there is a need for methods that decrease the workload on the processor. The structure of the decimation filter offers a way of decreasing FIR filter complexity. It is possible to reduce the clock rate of calculations by moving the sample rate decreaser in front of the anti-aliasing filter. By additionally feeding the filters with different samples in a circular fashion, it avoids calculating the output samples which would later be removed anyway. Let us now consider the standard structure of an FIR decimation filter as in Figure 3-27. Samples arrive at the rate at which they are processed, and then every M-th one is kept while the other ones are removed, as in (8):
Considering that only one sample per M samples is kept (i.e. at indexes k = 0, M, 2M...), N*(M-1) useless computations are performed. Upon closer inspection it can be noticed that every coefficient of the filter is multiplied by samples spaced by M sampling intervals. It is then possible to spread both input samples and filter coefficients between branches, taking proper care that if coefficients are placed starting from the top-path, input samples should be inserted starting from the bottom one (Figure 3-28).
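The branch structure can be sketched as follows, assuming NumPy and a direct-form reference computed by filtering at the high rate and discarding samples. The function name `polyphase_decimate` is hypothetical, and the sketch ignores implementation details such as fixed-point arithmetic and commutator timing.

```python
import numpy as np

def polyphase_decimate(x, h, M):
    """FIR decimation by M computed at the low rate in M parallel branches."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n_out = (len(x) + len(h) - 1 + M - 1) // M
    y = np.zeros(n_out)
    for k in range(M):
        hk = h[k::M]                      # branch k holds coefficients k, k+M, k+2M, ...
        if len(hk) == 0:
            continue
        # Branch k receives the samples x[m*M - k]; the first one (m = 0) lies
        # before the start of the signal for k > 0 and is taken as zero.
        xk = x[::M] if k == 0 else np.concatenate(([0.0], x[M - k::M]))
        branch = np.convolve(xk, hk)
        n = min(n_out, len(branch))
        y[:n] += branch[:n]
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(50)
h = rng.standard_normal(11)
M = 3

reference = np.convolve(x, h)[::M]        # filter at the high rate, then discard
print(np.allclose(polyphase_decimate(x, h, M), reference))
```

Coefficients are distributed over the branches in increasing order while the input samples reach them in decreasing order of delay, which corresponds to the opposite ordering of coefficients and input samples noted above.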
A special case is when the filter preventing aliasing of the signal replica into the baseband is a halfband filter with the number of coefficients equal M·N. Such a filter can be combined with a two times sampling rate decreaser forming a structure as in Figure 3-29. Although the classical implementation would also take advantage of having every second filter coefficient (except the middle one) equal to zero, such a structure allows decreasing the memory size and halving the number of calculations per output sample.
Interpolation filters can benefit from a reduced calculation burden in a very similar way to the decimation structure, on condition that Zero Insertion Interpolation (ZII) is performed during the sampling rate increase. Consider the standard structure as in Figure 3-30.
It is easy to notice that, for every output sample, only every M-th multiplier is receiving a nonzero sample. Therefore it is possible to move the anti-replica (image) filter in front of the sample rate increaser (multiplexer) as in Figure 3-31.
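The equivalent low-rate interpolation structure can be sketched in the same way: each branch filters the input at the low rate and the branch outputs are interleaved by the output multiplexer. The function name `polyphase_interpolate` is hypothetical, and the reference is the direct zero-insertion structure.

```python
import numpy as np

def polyphase_interpolate(x, h, M):
    """Zero-insertion interpolation by M computed in M low-rate branches."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    y = np.zeros(len(x) * M + len(h) - 1)
    for k in range(M):
        hk = h[k::M]                      # branch k holds coefficients k, k+M, k+2M, ...
        if len(hk) == 0:
            continue
        branch = np.convolve(x, hk)       # runs at the input (low) rate
        y[k:k + len(branch) * M:M] = branch   # multiplexer: output phase k
    return y

rng = np.random.default_rng(2)
x = rng.standard_normal(20)
h = rng.standard_normal(11)
M = 3

upsampled = np.zeros(len(x) * M)
upsampled[::M] = x                        # insert M-1 zeros after every sample
reference = np.convolve(upsampled, h)     # filter at the high rate
print(np.allclose(polyphase_interpolate(x, h, M), reference))
```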
Decimation and interpolation structures above could be used to do more than just reduce the workload per output sample. By combining both the decimation and interpolation filters, the designer gets two types of parallel structures depending on which operation is done first, decimation or interpolation.
3.1.1
Interpolation - Decimation arrangement
The arrangement shown in Figure 3-32(a) allows a relaxation of the requirement for the filter doing the required processing, H(z), by designing it for the lower passband (wider transition band) or the use of filters in which the band of interest is in the lower frequencies.
This structure can be further simplified if we notice that:
a) The anti-aliasing filter does not have to be implemented when converting back to the original (low) rate, as the signal bandwidth is already band-limited to the correct cut-off frequency FS/(2M).
b) The main processing filter can be moved behind the sample rate decreaser using the same idea as when moving the anti-replica filter for the case of the interpolation filter.
c) Sample rate increaser and sample rate decreaser can be substituted with wires.
Care has to be taken to make sure that the total number of coefficients of the anti-replica filter and the main processing one, is divisible by M. This will make sure that the correct samples reach the summation block. If is not divisible by M then one of the filters must have delayers added to achieve synchronisation. Also, if the first coefficient of the first filter is put in the first path with the others put into successive paths, then the other filter must have its coefficients arranged starting from the last path, putting each succeeding coefficient into each previous path. This leads to the structure shown in Figure 3-32(b). This structure is indirectly working at the high rate, even if all its paths are doing processing at an M-times lower rate. Such a structure is well suited to parallel (multiprocessor) systems, with the additional advantage that, if used in an adaptive system, the adaptation time will be M-times lower. A practical example incorporating such a multirate filter arrangement, with a Lagrangian fractional delay filter doing the required processing, is presented in Section 3.2 [10], [64].
3.1.2
Decimation - Interpolation arrangement
The decimation-interpolation structure in Figure 3-33 decreases the signal sampling rate, performs the calculations at the lower rate in several paths and then converts the result back to the original high output sampling rate.
This idea has long been in use and requires an anti-aliasing filter before the sample rate decrease. This has the disadvantage of limiting the signal bandwidth to that supported by the lower sampling rate, which sometimes might not be acceptable. The structure can nevertheless be beneficial in parallel processing systems, where the anti-aliasing and anti-replica filters can both be processed in parallel. Only the main processing filter requires sequential execution, unless its structure allows multiprocessing. An advantage is that the main processing filter is not limited to FIR only; IIR filters can also be incorporated.
3.2
FIR decimation and interpolation structures for wideband, flat group delay filters [10]
The ideas presented above for implementing FIR decimation and interpolation filters at a lower rate can be shown in an example incorporating both. They can be used together with a Lagrangian fractional delay filter to achieve a wide-band filter with variable, arbitrary delay and high quality (high resolution) both in amplitude and phase linearity. Additionally this structure allows adaptive change of the delay at every sampling instant. The arbitrary valued delay implementation is based around the use of the Lagrangian interpolation filter [65], a delay-line and sampling rate conversion [66]. The technique involves producing a two times oversampled input signal which is then halfband limited and passed through a fractional delay filter. The resulting signal is finally decimated by two back down to the baseband frequency. The basic scheme is as shown in Figure 3-34.
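As a rough illustration of the fractional delay element, the sketch below computes the taps of a textbook Lagrange interpolation FIR filter. The closed-form product used here is the standard Lagrange formula; whether the filter of [65] uses exactly this form is an assumption, and the function name `lagrange_fd` is purely illustrative.

```python
def lagrange_fd(delay, order):
    """Taps of an FIR Lagrange interpolator approximating `delay` samples.

    Standard formula: h[k] = prod over m != k of (delay - m) / (k - m),
    for k = 0 .. order. Accuracy is best when `delay` is near order / 2.
    """
    taps = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        taps.append(h)
    return taps


# Hypothetical example values: a third-order filter delaying by 1.3 samples.
example_taps = lagrange_fd(1.3, 3)
print(example_taps)
```

Two properties are easy to check: the taps sum to one (unity gain at DC), and an integer delay collapses the filter to a single unit tap, so the fractional part of the delay is the only source of approximation error.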
The simplest way to obtain the over-sampled input signal is to resample it using ZII on alternate samples. It must be ensured that the input signal does not contain frequencies within the transition band of the decimation filter used. Over-sampling the input signal by a factor of two replicates the spectrum within the higher-rate sampling scheme’s normalised bandwidth. Thus, it is enough to convolve the over-sampled input signal with a fractional delay filter impulse response that only meets the desired specification up to half the Nyquist frequency. In several experiments it has been found that a 35-tap fractional delay filter of the type proposed in [65] provides 20-bit magnitude resolution (i.e. passband ripples) with delay resolution amounting to one of a sample. To maintain the flat group delay characteristic over the majority of the base-bandwidth after decimation, a linear-phase lowpass band-limiting filter was required. To achieve the necessary 20-bit resolution, a 511-tap halfband FIR filter, weighted by a 120dB Dolph-Chebyshev window, can be used; it was used in the example presented later in this section. This resulted in a baseband width of 0.481 of the normalised frequency with approximately passband ripples, shown in Figure 3-35.
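The structure of a halfband lowpass filter can be seen in a small sketch. The code below builds a windowed ideal halfband response; to stay within the standard library it uses a Hamming window rather than the Dolph-Chebyshev window used in the text, and a 31-tap length rather than 511, so it illustrates the coefficient pattern only and not the quoted 20-bit performance. The function name `halfband_taps` is an assumption.

```python
import math


def halfband_taps(num_taps):
    """Hamming-windowed ideal halfband lowpass (num_taps of the form 4*k + 3)."""
    centre = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - centre
        # Ideal halfband impulse response: sin(pi*m/2) / (pi*m), with 0.5 at m = 0.
        ideal = 0.5 if m == 0 else math.sin(math.pi * m / 2) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


taps = halfband_taps(31)
# Every second tap away from the centre is (numerically) zero.
print([round(t, 6) for t in taps[11:20]])
```

Because every second coefficient away from the centre vanishes, one of the two polyphase branches of such a filter reduces to a scaled delay, which is part of what makes halfband filters attractive in the two-branch arrangement.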
The performance of the overall structure is dependent on the quality of the lowpass filter: the higher the stopband attenuation, the higher the group delay resolution. However, this is countered by an increase in the transition bandwidth for the same number of filter taps. Since the interpolation process performs zero insertion between input samples, at each higher-rate sampling instant only half of the FIR coefficients are required to generate a valid output. At the first sampling instant input samples are processed by even-indexed coefficients and at the next one by odd-indexed coefficients of the cascaded FIR filters [66]. This fact permits the decomposition of the filters into parallel odd and even coefficient branches, which operate at the input sampling rate. To keep the delay variable in real time, the lowpass and arbitrary delay branch filters are not combined, but are cascaded, with the even part of one filter being used in the same branch as the odd part of the other. This form of connection removes the need for a unit-delay that would be required if the odd parts of both filters were combined together. As the Arbitrary Group Delay Filter (AGDF) is in effect working at twice the input sampling rate, the required delay must be scaled by a factor of two. The overall scheme also removes the need for a Sampling Rate Increaser (SRI) at the input and a Sampling Rate Decreaser (SRD) at the output. The output signal is formed by summing the output signals from the branches and applying a scaling to restore the signal power. This results in the structure presented in Figure 3-36.
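The even/odd pairing described above can be checked numerically. The sketch below is an illustration under stated assumptions rather than a model of Figure 3-36: it uses short integer coefficient sets as stand-ins for the lowpass and fractional delay filters, omits the power-restoring scaling, and assumes the decimator keeps the odd output phase. All function names are hypothetical.

```python
def conv(a, b):
    """Full linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    """Element-wise sum, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def high_rate_reference(x, lowpass, delay_fir):
    """Zero-insert by two, filter at the high rate, keep the odd phase."""
    up = [0] * (2 * len(x))
    up[0::2] = x
    return conv(conv(up, lowpass), delay_fir)[1::2][:len(x)]


def two_branch(x, lowpass, delay_fir):
    """Same result using only input-rate convolutions in two branches."""
    branch_a = conv(lowpass[0::2], delay_fir[1::2])  # even lowpass, odd delay
    branch_b = conv(lowpass[1::2], delay_fir[0::2])  # odd lowpass, even delay
    return conv(x, add(branch_a, branch_b))[:len(x)]


x = [1, 2, 3, 4, 5, 6]
lowpass = [1, 2, 3, 2, 1]      # placeholder coefficients
delay_fir = [1, -2, 4, 3]      # placeholder coefficients
print(high_rate_reference(x, lowpass, delay_fir) == two_branch(x, lowpass, delay_fir))
```

Pairing the even part of one filter with the odd part of the other selects exactly the terms whose combined delay is odd at the high rate, which is why no extra unit delay is needed between the branches. In the sketch the two branch filters are pre-combined for brevity; keeping them cascaded, as the text does, gives the same output while leaving the delay filter free to change at every sample.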
By placing the lowpass filter before the variable delay element the system can be adaptively changed at every input sample instant. If the filters were in reversed order and the delay requirement was to be changed, a valid output would only become available after the mid-array delay of the lowpass filter had been passed. The AGDF is composed of an integer delay line and a fractional delay filter, which is designed to have constant group delay and unity gain to a specified tolerance within the bandwidth of the lowpass filter. The integer delay is implemented using a shift register. If the newest sample is placed at the end of the register, then the record of samples fed to the fractional delay filter is taken starting at a position counted from the end of it that is set by k, the required integer group delay at the current sampling instant. Specifying the maximum allowable integer delay makes the total length of the register equal to this number plus the number of taps of the branch fractional delay filter.
The motivation for devising this improved structure was to increase the signal bandwidth, while maintaining real-time variability of the delay, beyond bandwidths achievable using least integral square or unmodified Lagrangian interpolation methods [67]. It was very difficult to cross the limit of approximately 0.45 bandwidth using these existing techniques, owing to the unavoidable escalation of error right at the Nyquist frequency [67]. Moreover, the new structure produces much smaller magnitude response ripples than seen in alternative approaches. As only FIR filters are used both for lowpass filtering and fractional group delay tuning, the structure is very useful for adaptive fractional delay filtering where the linearity of the group delay is of paramount importance. Although the structure uses interpolation and decimation, all calculations are performed at the original input sampling frequency. The multi-branch (multirate) approach decreases the overall group delay resulting from using a long FIR lowpass filter by the number of branches of the structure. It also allows the filtering to be speeded up by the use of parallel processing, which is especially useful when very long filters are used for high-bandwidth and high-resolution applications. This idea was subsequently extended for use in an efficient fractional sample delayer for high-precision digital beam steering at a baseband sampling frequency [64].
3.3
IIR decimation in Denominator-Numerator form
Implementing the decimation filter based on the general form of an IIR filter is more complicated than for the FIR case, as it contains a feedback loop. The loop requires all the calculated samples for the proper operation of the whole filter, even if some of the output samples are going to be thrown away during the sampling rate decrease. The Denominator-Numerator (D-N) arrangement shown in Figure 3-37 meets this requirement. The example structure implements a third-order filter with transfer function:
The switch at the input makes sure that odd samples go into one branch and even ones into the other. Additionally it has to make sure that samples in the lower branch are older than those in the top one. The samples are fed back through the coefficients of the feedback loop and a single delay, except for coefficients at even-order delayers in the top branch. This is because samples coming from the lower branch are already delayed by one sampling period of the original sampling rate. The idea is for the filter to operate at a lower rate, but to perform as if it were working at a higher rate.
The advantage over the standard implementation, with the switch at the output, is that all blocks in the structure operate at the output rate. The example structure performs two-times decimation. By changing the switch to a higher-order one and making interconnections between the paths similar to those in the structure proposed for two-times decimation, it is possible to achieve decimation by any integer factor. The D-N structure by definition may suffer from large internal sample values. The feedback loop in the IIR filter usually has the effect of accumulating the sample values (because of poles), while the numerator attenuates them (due to zeros). Both effects combined result in the required, stable output signal. Therefore the best result, from the point of view of internal sample values, is obtained when the filter is implemented in the Numerator-Denominator (N-D) form.
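The behaviour expected of a D-N decimator can be sketched as follows. This is a behavioural illustration only, not the wiring of Figure 3-37: at each low-rate tick it accepts a pair of input samples (older first), advances the feedback recursion once per sample, and evaluates the numerator only for the sample that is kept. The coefficient values and function names are assumptions chosen for the example.

```python
def _advance(w, a, sample):
    """One step of the feedback (denominator) recursion; w[0] is the newest state."""
    newest = sample - sum(a[k] * w[k - 1] for k in range(1, len(a)))
    return [newest] + w[:-1]


def dn_filter(b, a, x):
    """Reference: denominator-then-numerator filtering at the full rate (a[0] == 1)."""
    w = [0.0] * max(len(a), len(b))
    y = []
    for sample in x:
        w = _advance(w, a, sample)
        y.append(sum(b[k] * w[k] for k in range(len(b))))
    return y


def dn_decimate_by_two(b, a, x):
    """One output per pair of inputs; the numerator runs once per pair."""
    w = [0.0] * max(len(a), len(b))
    y = []
    for i in range(0, len(x) - 1, 2):
        for sample in (x[i], x[i + 1]):      # older sample, then newer one
            w = _advance(w, a, sample)        # feedback needs every sample
        y.append(sum(b[k] * w[k] for k in range(len(b))))
    return y


b = [0.1, 0.3, 0.3, 0.1]          # hypothetical third-order numerator
a = [1.0, -0.5, 0.25, -0.1]       # hypothetical third-order denominator
x = [float(n % 5) for n in range(20)]
print(dn_decimate_by_two(b, a, x))
```

The output matches full-rate filtering followed by discarding every other sample, while the numerator multiplications are performed only at the output rate.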
3.4
IIR decimation in Numerator-Denominator form
The idea of the N-D structure is that if the numerator (FIR) part precedes the denominator (feedback) part, it has to provide all the high-rate samples to the latter. This means that no sample may be lost at this stage if the denominator is to return the correct results. Therefore the first structure to consider is the one shown in Figure 3-33. The numerator part operates here at the high rate and all the samples are calculated; no sample rate decrease is done at this point. The denominator part is exactly the same as in the D-N structure. The output of the decimation can be taken from either branch of the denominator part. Notice that the delayers store only input and output samples of the overall filter. As the filter is stable and the input signal bounded, it follows that the output signal is also bounded. Therefore this structure does not have problems with excessively large sample values to store in the memory.
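The difference in stored values between the two orderings can be illustrated numerically. The sketch below compares a plain N-D (numerator first) and D-N (denominator first) realisation of the same first-order filter at a single rate; the filter, with a pole at 0.99 and a zero at DC, is a hypothetical example chosen to exaggerate the effect, and the function names are assumptions.

```python
def nd_form(b, a, x):
    """Numerator first: the delayers hold only input and output samples."""
    x_hist = [0.0] * len(b)
    y_hist = [0.0] * (len(a) - 1)
    y = []
    for sample in x:
        x_hist = [sample] + x_hist[:-1]
        out = (sum(bk * xk for bk, xk in zip(b, x_hist))
               - sum(ak * yk for ak, yk in zip(a[1:], y_hist)))
        y_hist = [out] + y_hist[:-1]
        y.append(out)
    peak_stored = max(abs(v) for v in x + y)
    return y, peak_stored


def dn_form(b, a, x):
    """Denominator first: the delayers hold the internal recursion signal."""
    w = [0.0] * max(len(a), len(b))
    y = []
    peak_stored = 0.0
    for sample in x:
        w = [sample - sum(ak * wk for ak, wk in zip(a[1:], w))] + w[:-1]
        peak_stored = max(peak_stored, abs(w[0]))
        y.append(sum(bk * wk for bk, wk in zip(b, w)))
    return y, peak_stored


b = [1.0, -1.0]
a = [1.0, -0.99]
step = [1.0] * 200
y_nd, peak_nd = nd_form(b, a, step)
y_dn, peak_dn = dn_form(b, a, step)
print(peak_nd, peak_dn)
```

Both forms produce the same output, but for this input the D-N delayers must hold values many times larger than any input or output sample, whereas the N-D delayers never exceed the input and output range.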
It is possible to force the numerator part to operate at the output rate, but because it has to provide all the high-rate samples to the denominator, this requires doubling of the numerator hardware as shown in Figure 3-38. One filter replica calculates the odd samples and the other one the even samples of the numerator. As the switch provides the odd samples at the same time as the even samples, this fact can be used to omit some of the delayers. The delay between the result of one of the numerator filters and the input to the denominator was taken out of the numerator when it was noticed that each of its branches contains at least one delay element. This modification saves three memory locations for this filter.
Sharing delayers between both of its replicas minimised the number of memory locations for the numerator part. Unfortunately there is no way to avoid doubling the numerator coefficients. In terms of the number of multiplications the structure from Figure 3-39 is similar to the one from Figure 3-38. The difference is that the second one does all the calculations at a lower rate (the output rate) than the first one. It also allows easier implementation in a multi-processor (parallel) arrangement.
3.5
Summary
The issues of efficient implementation of FIR and IIR filters in multirate systems were addressed in the section above. Changes to their structures were suggested that shift filters originally working at the high rate to the other side of the sample rate decreaser or increaser, allowing them to operate at the lower rate in the implemented system. Although the modified structures may show an increased number of multiplications and additions compared to the original structure, the number of computations per sample period, calculated for the same reference rate, is lower. In the worst case it is equal to the number of calculations of the prototype implementation. The presented structural modifications may be used to advantage for balancing the workload between multiple processing units, allowing each of them to operate at a single clock rate. This also allows the clock rate of the processing units to be lowered, often leading to reduced power consumption and a reduction in other effects due to the high operating rate of the integrated circuit.
Chapter 4
VHDL FILTER IMPLEMENTATION
Automatic Code Generation Techniques
1.
BASICS OF VHDL
VHDL stands for Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (HDL). It is a language for describing digital electronic systems. It was born out of the United States Government’s VHSIC program in 1980 and was adopted as a standard for describing the structure and function of Integrated Circuits (IC). Soon afterwards it was developed further and adopted as a standard by the Institute of Electrical and Electronic Engineers (IEEE) in the US (IEEE-1076-1987) and in other countries [76], [77]. VHDL continues to evolve. Although new standards have been prepared (VHDL-93), most commercial VHDL tools use the 1076-1987 version of VHDL, thus making it the most compatible when using different compilation tools. VHDL enables the user to:
- Describe the design structure; specify how it is decomposed into sub-designs, and how these sub-designs are interconnected.
- Specify the function of designs using a familiar, C-like programming language form.
- Simulate the design before sending it for fabrication, so that designers have a chance to rapidly compare alternative results and test for correctness without the delay and expense of multiple prototyping.
VHDL is a C-like general-purpose programming language with extensions to model both concurrent and sequential flows of execution, and it allows delayed assignment of values. To a first approximation VHDL can be considered to be a combination of two languages: one describing the structure of the integrated circuit and its interconnections (structural description) and the other one describing its behaviour using algorithmic constructs (behavioural description). VHDL allows three styles of programming:
- Structural
- Register Transfer Level
- Behavioural
The first one, structural, is the most commonly used as it allows the user to describe the structure of the IC very precisely. In very many cases this gives better performance than compiler-optimised structures, especially for high-speed, fixed-point applications like polyphase structures. The behavioural style permits the designer to test concepts quickly: the designer can specify the high-level function of the design without taking much care over how it will be done structurally. This can be very attractive for the quick design of low and medium-speed and low-volume applications, where designer expertise is not available. A word of warning is appropriate here. Designs synthesised from behavioural descriptions will often end up using a lot more resources than actually necessary, even after optimisation. However, the success of VHDL for designing integrated circuits is indisputable. Unfortunately there is a lack of tools linking VHDL tools with high-level digital filter design/simulation tools such as Matlab and Simulink, which operate on levels higher than the structure. At the moment the designer who has designed and tested a design theoretically using high-level tools is required to spend the same or more time on designing the structure and the architecture for this theoretical design, simulating it, testing it and fabricating it.
This involves a dangerous break in the integrity of the design flow, giving chances for inconsistencies to creep in. An automated high-integrity link between theoretical design and implementation is essential and can be achieved with VHDL via a conversion tool. A very attractive high-level design/simulation tool is provided by MathWorks™ and is called Simulink. It is a very flexible design tool, which allows testing of a high-level structural description of the design and makes quick changes and corrections possible. The circuit description structure is very similar to the way the design could be implemented later. Therefore a mapping tool allowing conversion of such a structure into VHDL code would save the designer’s time, which otherwise has to be spent in rewriting the same structure in VHDL and probably making mistakes that will need debugging. This idea is the basis of the work described later. Primarily the work has concentrated on the analysis of the Simulink structure and its similarity with the VHDL description. The structural style of programming has been chosen for the first version of the program, as this allows direct mapping of Simulink structures into ones described in VHDL. As Simulink is a high-level description tool and allows such operations as unconstrained arithmetic operations, the behavioural style will be included in the next version of the conversion program, which is still under development.
2.
CONVERTING FROM MATLAB TO VHDL
The biggest problem that the designer very often faces is how to pass from the algorithmic design to its physical implementation. The first tool the designer uses when developing a new idea is a high-level design and simulation tool. One of the most commonly used high-level tools is Matlab with Simulink. It allows the designer to put together a behavioural or structural simulation very easily, quickly checking the algorithm or making the necessary adjustments to it. Working directly with any low-level implementation tool from the start is simply not practical, as every small change in the algorithm may sometimes require substantial redesign of the implementation. Therefore an automatic link from the high-level algorithmic design, such as a Simulink model, to some implementation description, such as a target netlist or VHDL, would lead to great savings of effort and time in the design cycle. The task of the conversion tool is as follows:
- Analyse the Simulink model and identify:
  - Common and different blocks
  - Connections (signal lines) and ports for multilevel models
  - Block parameters
- Generate a VHDL equivalent:
  - Identify entities available in the standard component library
  - Create architectures for each block from the bottom up
  - Create configuration files for every entity, linking in standard libraries
Matlab has been used by a number of developers for a long time and has proven to be an invaluable tool for DSP applications. Therefore this software was chosen for the high-level design part of the whole system. In the first instance the Simulink part of Matlab has been chosen to be the input to the conversion tool. The fact that Simulink makes it possible to create both behavioural and structural designs (where the latter is the closest to the physical implementation) justifies its choice. The description of a typical Simulink block is similar to the netlist of the physical implementation. However, it is easily noticed that there is a set of blocks in Simulink which have to be treated as the basic ones. These are compiled “s-functions”, the contents of which are not available. Therefore, their behaviour has to be carefully analysed in order to create their equivalent VHDL descriptions, to be later included in the library of standard Simulink entities/architectures. The VHDL description has been chosen for the output of the conversion tool, as it is the highest-level technology-independent description of the design to be realised. There are other tools available, both for UNIX and PC, for compiling VHDL into a netlist, which is then ported into the silicon fabrication arena or FPGAs. Such tools include Peak VHDL/FPGA from Accolade Design Automation Inc. [78], and Galileo and Renoir from Mentor [79].
2.1
Basics of Simulink
Simulink, as is true for most high-level simulation software, does not allow testing of certain behaviour patterns that a real target design can exhibit, most of which are available in the VHDL simulator. The most reliable simulation can only be performed after porting the compiled VHDL into the implementation software. In particular, Simulink does not:
- Support fixed-point arithmetic in the general sense.
- Use data types compatible with bit logic (floating-point bit simulation).
- Define propagation delay in its blocks, necessary for implementation.
- Support reusable symbols (same symbols may have different contents).
In the structural simulation using bit logic arithmetic it is possible to force Simulink to assign only 0s and 1s, even though they are represented with floating-point variables/signals. Fixed-point arithmetic can be implemented structurally in Simulink using gates. This also simplifies setting the propagation delay, as this could be included in the VHDL description of each gate. However, this is not possible in the Simulink model. Summarising, the structural fixed-point design can be quite easily converted into VHDL directly, without much additional intelligence required from the conversion program. The model description of the Simulink block (MDL-file) is very similar to the representation of the common structure. It contains the parameters of the simulation, the description of each block with its parameters, and the block connections. The problem is that Simulink does not use reusable symbols. This means that if there are several blocks or symbols of the same name, they are all fully duplicated down to the most basic element. This makes the analysis of common blocks much more difficult, as these blocks may have slight differences and then qualify as two different ones, even if they have the same name. Therefore, the designer must obey the rule that all blocks having the same symbol must also have the same contents. They may only have different parameters.
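The idea of building fixed-point arithmetic structurally from gates, with bits carried as floating-point 0.0 and 1.0 values as discussed earlier in this section, can be sketched in a few lines. This is only an analogy for what a gate-level Simulink model does; the gate functions, the bit ordering (least significant bit first) and all names are assumptions of the example.

```python
def and_gate(a, b):
    return a * b


def or_gate(a, b):
    return a + b - a * b


def xor_gate(a, b):
    return a + b - 2.0 * a * b


def full_adder(a, b, carry_in):
    """One-bit full adder built only from the gates above."""
    partial = xor_gate(a, b)
    total = xor_gate(partial, carry_in)
    carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in))
    return total, carry_out


def ripple_add(a_bits, b_bits):
    """Add two equal-width bit vectors (LSB first); returns (sum_bits, carry)."""
    carry = 0.0
    result = []
    for a, b in zip(a_bits, b_bits):
        bit, carry = full_adder(a, b, carry)
        result.append(bit)
    return result, carry


def to_bits(value, width):
    return [float((value >> i) & 1) for i in range(width)]


def from_bits(bits):
    return sum(int(bit) << i for i, bit in enumerate(bits))


sum_bits, carry = ripple_add(to_bits(5, 4), to_bits(3, 4))
print(from_bits(sum_bits), carry)
```

Because every signal stays at exactly 0.0 or 1.0, such a model maps one-to-one onto a gate netlist, which is what makes the structural style straightforward to convert.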
2.2
Analysis of the Simulink MDL description
As was pointed out earlier, the description of the Simulink model closely resembles the Matlab structure. Describing the model with the structure simplifies the conversion process, as the interdependence of blocks can be indicated by their position in the tree of blocks. Therefore the conversion of the MDL-file into the Matlab structure was the first task to be done by the conversion utility developed. The main problems faced at this stage were:
- The structure did not allow the same field names at the same level, which was allowed in the MDL-file. All the blocks and lines (connection signals) had to be renamed consecutively as a remedy to this problem.
- There are no commas to separate parameters and values in the MDL-file, as required by the structure syntax. They had to be included appropriately.
- There is an inconsistency in the description of text constants. In Matlab they are indicated by a single quote, in the MDL-file by the double quote. Therefore double quotes were replaced by single quotes wherever a text constant was found.
- Simulink does not require ports to have their width always defined. This created confusion in specifying the number of input/output signals in the entity definition. The safest solution was to make a rule of explicitly defining the width of the ports in the Simulink model wherever it was possible. Even so there were cases when the data type had to be derived indirectly from the block to which the port was connected.
- The number of input and output ports was not defined consistently. For some Simulink blocks they were clearly given by the parameters “Inputs” and “Outputs”. For other ones there was only one parameter “Ports”, containing a five-element vector with the number of input ports in the first element and output ports in the second element. There were also several blocks for which there was no description of the number of ports at all. In such a case, whether the block had an input or output port had to be derived from the connection description (“Line”).
The main keyword to look for in the MDL-file is “System”. This indicates the beginning of the description of the blocks and their connections within one block. It is then followed by a number of “Block” sections describing components of the design and “Line” sections, each equivalent to a single wire connector (one source can connect to multiple destinations). A “Block” can have another “System” section, which means it contains a lower-level circuit description. Sometimes such blocks also have some mask parameters. This indicates that a symbol has been created for such a block. In this case “MaskType” describes the common symbol name (which could be used for the entity name later), “MaskPromptString” contains descriptions of the symbol parameters, “MaskInitialization” has their names and “MaskValueString” their values. If no “System” is found, it means that the block is a basic component of the Simulink library and its description should later be copied from the library of basic VHDL blocks. The “Line” statement contains the names of one source block and one or more destination blocks, together with their port numbers. Where there are multiple destination ports, each of them is described by its own “Branch” statement. There are also other block parameters like “Decimation” and “SamplingTime”, which are useful for multirate systems.
The problem with using the MDL description is that it has been changing from one Matlab release to another. The conversion tool that was developed initially for version 5.3 did not work with version 6.0. The same happened later with version 6.1 and the following ones. In order to allow users to create their own converters from the Simulink block diagram into other forms (C, ADA and others), MathWorks has introduced the concept of the Target Language Compiler.
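The first step described in this section, turning the nested MDL text into a tree in which repeated sections are renamed consecutively, can be sketched as below. The fragment parsed here is a heavily simplified, hypothetical imitation of an MDL-file, and the parser ignores many features of real files; it is not the conversion utility described in the text, and the name `parse_mdl` is an assumption.

```python
def parse_mdl(text):
    """Parse simplified 'Name { ... }' sections into nested dictionaries.

    Repeated section names at one level (Block, Line, ...) are renamed
    Block1, Block2, ... because dictionary keys, like structure fields,
    must be unique at the same level.
    """
    root = {}
    stack = [root]
    counters = [{}]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):
            name = line[:-1].strip()
            counts = counters[-1]
            counts[name] = counts.get(name, 0) + 1
            child = {}
            stack[-1][f"{name}{counts[name]}"] = child
            stack.append(child)
            counters.append({})
        elif line == "}":
            stack.pop()
            counters.pop()
        else:
            key, _, value = line.partition(" ")
            stack[-1][key] = value.strip().strip('"')
    return root


SAMPLE = """
System {
  Name "top"
  Block {
    BlockType Inport
    Name "In1"
  }
  Block {
    BlockType Outport
    Name "Out1"
  }
  Line {
    SrcBlock "In1"
    SrcPort 1
    DstBlock "Out1"
    DstPort 1
  }
}
"""

model = parse_mdl(SAMPLE)
print(sorted(model["System1"].keys()))
```

With the tree in this form, the hierarchy of blocks is given directly by the nesting, and a block without a nested "System" entry can be treated as a basic library component.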
2.3
Target Language Compiler
The Target Language Compiler (TLC) is a feature within Real-Time Workshop, introduced from Matlab version 6.0, which allows the user to customize the code generated by Real-Time Workshop (RTW). The TLC includes:
- A complete set of TLC files corresponding to all Simulink blocks
- TLC files for model-wide information specifying header and parameters
The TLC files are ASCII text files that control the way the code is generated by the Real-Time Workshop. The designer may change the way code is generated for a particular block by editing a TLC file. The Target Language Compiler provided with Matlab contains a complete set of working TLC files for generating ANSI C code. The TLC files can be changed as required. The TLC is an open environment giving full flexibility in customizing and adjusting the code generated by Real-Time Workshop to suit users’ needs and applications.
The Target Language Compiler enables the user to customize the C code generated from any Simulink model and to generate inlined code for their own Simulink blocks. By modifying the TLC code, the user can produce platform-specific code, or may even include their own algorithmic changes to increase performance, change the code size, or make the code compatible with existing methods that the user has been using and wishes to maintain. The top-level diagram in Figure 4-1 shows how the Target Language Compiler fits in with the Real-Time Workshop code generation process. The blocks drawn with dashed lines are not necessary for converting the Simulink description into VHDL. They are only required if a stand-alone executable Simulink model is to be generated. The TLC was designed for the sole purpose of converting the model description file, “model.rtw”, or other similar files into target-specific code. Being an integral part of the Real-Time Workshop, the Target Language Compiler transforms an intermediate form of the Simulink system, called “model.rtw”, into custom language code (in our case the VHDL code). The “model.rtw” file contains a “compiled” version of the model, including the execution semantics of the block diagram, in a high-level language.
TLC-generated code can take advantage of the capabilities, and fit within the limitations, of specific target processor architectures. After reading the “model.rtw” file, the Target Language Compiler generates the new code according to the target files, which specify how each block should be coded, and the model-wide files, which specify the overall code style. The TLC behaves similarly to a text processor, using the target files and the “model.rtw” file to generate the VHDL code. In the general case, in order to create a target-specific executable application, the Real-Time Workshop also requires a template makefile that specifies the appropriate compiler and compiler options required for the build process. The model makefile is created from the template makefile by performing token expansion specific to a given model. A target-specific version of the generic “rt_main” file (or “grt_main”) must also be modified to conform to the target’s specific requirements, such as interrupt service routines. A complete description of the template makefiles and “rt_main” is included in the Real-Time Workshop documentation. The Target Language Compiler resembles other high-level programming languages, borrowing ideas from HTML, Perl, and MATLAB. It has mark-up syntax similar to HTML, the power and flexibility of Perl and other scripting languages, and the data handling power of MATLAB. The TLC can generate code from any Simulink model, including linear, nonlinear, continuous, discrete, or hybrid. All Simulink blocks are automatically converted to code, with the exception of MATLAB function blocks and S-function blocks that invoke M-files. The Target Language Compiler uses block target files to transform each block in the “model.rtw” file and a model-wide target file for global customization of the code. It is possible to write a target file for a custom C MEX S-function to inline the S-function (see Matlab documentation), thus improving performance by eliminating function calls to the S-function itself and the memory overhead of the S-function’s simStruct.
Target files can also be written for M-files, allowing them to be incorporated into VHDL-convertible Simulink systems. If the user needs to customize the output of Real-Time Workshop, it will be necessary to instruct the Target Language Compiler how to:
- Change the code generated for a particular Simulink block
- Inline S-functions
- Modify the way code is generated in a global sense
- Generate code in a language other than C
In order to produce customized output using the TLC, the user needs to understand how blocks perform their functions, what data types are being manipulated, the structure of the “model.rtw” file, and how to modify target files to produce the desired output. Please refer to the Matlab documentation for the target language directives, built-in functions and their associated constructs. The TLC directives and constructs need to be used to modify existing target files or create new ones, depending on particular needs. See TLC Files for more information about target files.
Inlining S-Functions
The TLC provides a great deal of freedom for altering, optimizing, and enhancing the generated code. One of the most important TLC features is that it allows the inlining of S-functions, which may be written to add custom user algorithms, device drivers, and custom blocks to a Simulink model. In order to create an S-function, C code needs to be written following a well-defined API. By default, the compiler generates non-inlined code for S-functions, which invokes them using this same API. This interface incurs a large amount of overhead due to the presence of a large data structure called the SimStruct for each instance of each S-function block in the model. In addition, extra run-time overhead is involved whenever functions within an S-function are called. This overhead can be eliminated by using TLC to inline the S-function, by creating a TLC file named “sfunction_name.tlc” that generates source code for the S-function as if it were a built-in block. Inlining an S-function improves the efficiency and reduces the memory usage of the generated code. In principle, the TLC can be used to convert the “model.rtw” file into any form of output (in our case VHDL) by replacing the supplied TLC files for each block it uses. Likewise, some or all of the shipping system-wide TLC files can be replaced. This is not generally recommended by MathWorks, although it is supported. In order to maintain such customizations, custom TLC files may need to be updated with each release of the Real-Time Workshop, as MathWorks continues to modify code generation by adding features and improving its efficiency, and very likely by altering the contents of the “model.rtw” file. There is no guarantee that such changes will be backwards compatible. However, changes to TLC files are less likely to cause problems when converting Simulink blocks into VHDL than changes to the MDL model description. Moreover, inlined TLC files that users prepare are generally backwards compatible, provided that they invoke only documented TLC library and built-in functions.
Code Generation Process
Real-Time Workshop invokes the TLC after a Simulink model is compiled into an intermediate form of “model.rtw” that is suitable for generating code. To generate code appropriately, the TLC uses its library of functions to transform two classes of target files: system target files and block target files. System target files are used to specify the overall structure of the generated code, tailoring for specific target environments. Block target files are used to implement the functionality of Simulink blocks, including user-defined S-function blocks. You can create block target files for C MEX, Fortran, and M-file S-functions to fully inline block functionality into the body of the generated code.
2.4 Automated conversion from Simulink to VHDL
In order to simplify the first version of the conversion program, it was designed with some constraints placed on the original Simulink model. The model was required to:
• Operate on bit signals or vectors of bits
• Have only one sampling rate throughout the design
• Be composed of gates, constants, ports and buses only
This allowed the structural VHDL description to be generated relatively easily. The next versions of this toolbox will allow different variable types and generate structural or behavioural VHDL wherever applicable.

The conversion requires two passes. In the first pass the program looks through the whole design, identifying the common blocks of the model, each of which is described in a separate VHDL file. It distinguishes the sub-blocks of the model from the basic Simulink blocks. It also gathers information about the ports of each block and their types. This information is needed for creating the “component” statements in the VHDL file. In the second pass the algorithm looks recursively through the whole hierarchy of the model, from the top level down to the bottom one, creating the structural description of each block found in the first pass. For each of them it finds the list of “blocks” and the list of “lines”. The former are used to generate block instantiation and configuration commands and the latter to define the internal signals. The entity definition is created from the information found in the first pass of the conversion.
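The two passes described above can be illustrated with a minimal Python sketch. It is not the conversion program itself: the dictionary layout of the toy model, the block and signal names, and the simplified VHDL skeleton (port maps are omitted, so the output is not synthesizable) are all assumptions made for illustration.

```python
# Toy hierarchical model. The layout and all names are hypothetical.
MODEL = {
    "name": "top",
    "ports": {"a": "in", "b": "in", "y": "out"},
    "blocks": [
        {"name": "u1", "type": "half",            # sub-block with its own file
         "sub": {
             "name": "half",
             "ports": {"p": "in", "q": "in", "r": "out"},
             "blocks": [{"name": "g1", "type": "and2"}],   # basic gate
             "lines": ["n1"],
         }},
        {"name": "g2", "type": "or2"},             # basic gate
    ],
    "lines": ["s1", "s2"],
}


def first_pass(system, found=None):
    """Pass 1: collect every distinct subsystem together with its ports."""
    found = {} if found is None else found
    found[system["name"]] = system["ports"]
    for blk in system["blocks"]:
        if "sub" in blk:                           # sub-block, not a basic block
            first_pass(blk["sub"], found)
    return found


def entity_text(name, ports):
    """Entity declaration built from the port list gathered in pass 1."""
    decl = ";\n".join(f"    {p} : {d} bit" for p, d in ports.items())
    return f"entity {name} is\n  port (\n{decl}\n  );\nend {name};"


def second_pass(system, components, files=None):
    """Pass 2: walk the hierarchy top-down, one structural text per subsystem."""
    files = {} if files is None else files
    if system["name"] in files:                    # already described
        return files
    signals = "".join(f"  signal {s} : bit;\n" for s in system["lines"])
    instances = "".join(
        f"  {b['name']} : {b['type']};  -- port map omitted\n"
        for b in system["blocks"])
    arch = (f"architecture structural of {system['name']} is\n"
            f"{signals}begin\n{instances}end structural;")
    entity = entity_text(system["name"], components[system["name"]])
    files[system["name"]] = entity + "\n" + arch
    for blk in system["blocks"]:
        if "sub" in blk:
            second_pass(blk["sub"], components, files)
    return files


components = first_pass(MODEL)
vhdl_files = second_pass(MODEL, components)
```

In this sketch the “lines” of each subsystem become internal signal declarations and the “blocks” become instantiations, mirroring the roles the two lists play in the description above.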
2.5 Fixed-point polyphase half-band filter example
The idea of converting the Simulink design into VHDL has been tested on the example of the two-path two-coefficient polyphase filter [25]. The design was first captured using standard floating-point Simulink blocks. In order to bring it closer to the implementation, the results of additions were rounded towards zero to 14 bits (Table 4-1), the results of subtractions were truncated to 14 bits (Table 4-2) and the results of multiplications were truncated to 18 bits (Table 4-3).
The wordlength was increased locally at the multiplications in order to avoid an unnecessary loss of precision before the subsequent addition. All data was represented in two’s complement arithmetic with a sign bit and 2 integer bits, which gives enough guard bits for the internal calculations, 14 bits altogether. This rounding scheme eliminated limit cycles while keeping the DC offset low. The floating-point version of the filter has been compared with the architectural one designed from standard gates (Figure 4-1).
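As an illustration of the data format described above, the following Python sketch encodes and decodes values in a 14-bit two’s complement word. It assumes that a sign bit plus 2 integer bits leave 11 fractional bits; the function names are hypothetical and quantization here is by plain truncation.

```python
import math

WORD = 14            # total wordlength, from the text
FRAC = WORD - 3      # assumption: 1 sign bit + 2 integer bits -> 11 fractional bits
MASK = (1 << WORD) - 1


def to_code(x):
    """Truncate a real value and return its 14-bit two's complement bit pattern."""
    raw = math.floor(x * (1 << FRAC))
    if not -(1 << (WORD - 1)) <= raw < (1 << (WORD - 1)):
        raise ValueError("value outside the representable range")
    return raw & MASK


def from_code(code):
    """Convert a 14-bit two's complement bit pattern back to a real value."""
    if code >= 1 << (WORD - 1):      # sign bit set
        code -= 1 << WORD
    return code / (1 << FRAC)
```

Under this assumption the representable range runs from -4 up to 4 - 2^-11 in steps of 2^-11, and both filter coefficients (0.125 and 0.5625) are represented exactly.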
The simulation used a two-phase non-overlapping clock, as required by the delayers, which were built from two D-type flip-flops per bit per unit delay. The flip-flops were triggered on the rising edge of the clock. The data was read at the rising edge of Clock1 and was available at the output at the rising edge of Clock2.
The comparative simulation allows the design to be tested both with an impulse and with the signal generated by the modulator. The results of the fixed-point behavioural and the fixed-point structural versions of the design matched bit for bit. The fixed-point structural system has been designed to run from an external clock signal so that the filter can be synchronised with the input data in the ultimate physical implementation. The only blocks requiring the clock are the delayers; the rest are combinational blocks, for which the result is available a certain time after a change of the input. This time is called the propagation time. The maximum propagation time depends on the propagation time of the gates and on the maximum number of dependent gates the signal has to pass through.
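The dependence of the maximum propagation time on the longest chain of dependent gates can be sketched as a longest-path search. The netlist below (a one-bit full adder) and the uniform 2 ns gate delay are assumptions chosen for illustration; they are not taken from the filter design.

```python
from functools import lru_cache

GATE_DELAY_NS = 2.0   # assumption: every gate has the same delay

# Hypothetical netlist: gate name -> names of the gates or inputs driving it.
NETLIST = {
    "x1": ["a", "b"],       # a XOR b
    "x2": ["x1", "cin"],    # sum output
    "a1": ["a", "b"],       # a AND b
    "a2": ["x1", "cin"],    # (a XOR b) AND cin
    "o1": ["a1", "a2"],     # carry output
}


def max_propagation_ns(netlist, gate_delay_ns=GATE_DELAY_NS):
    """Gate delay times the largest number of gates on any input-to-output path."""

    @lru_cache(maxsize=None)
    def depth(node):
        if node not in netlist:          # primary input: no gate delay
            return 0
        return 1 + max(depth(src) for src in netlist[node])

    return gate_delay_ns * max(depth(gate) for gate in netlist)
```

For this netlist the longest path passes through three gates (x1, a2, o1), so the function returns 6 ns.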
Figure 4-2 shows the inside of the fixed-point polyphase lowpass filter and Figure 4-3 describes the allpass structure used for both the UpperBranch and the LowerBranch blocks (the only difference being the multiplication factor). The floating-point design is similar to the fixed-point one. It differs in not having clock signals, since Simulink controls the simulation timing itself.
The 14-bit wide delayers in Figure 4-4 have been designed using two D-type flip-flops in the Master-Slave arrangement for each bit. The Mux and Demux simply convert the single-bit lines into a vector of bits and back again. They were used for the purpose of the simulation only and were not required for the implementation. The structure of the 14-bit adder with truncation is shown in Figure 4-5. The second input is negated before being added to the first input. As two’s complement arithmetic is used, negation is achieved by inverting all the bits at the negated delayer output, Q!, and adding one using a ladder of two-bit adders with carry, shown in Figure 4-6. Assuming all the gates to have the same propagation delay, the time required to add two numbers was estimated to be
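The negation step described above (invert all bits, then add one through a carry ladder) can be sketched in Python as follows. The function names are hypothetical and the loop only imitates the bit-by-bit carry propagation; it is not a model of the actual gate-level circuit in Figure 4-6.

```python
WORD = 14
MASK = (1 << WORD) - 1


def negate(code):
    """Two's complement negation: invert every bit, then ripple-add one."""
    inverted = code ^ MASK
    result, carry = 0, 1                 # the initial carry is the added one
    for i in range(WORD):
        bit = (inverted >> i) & 1
        result |= (bit ^ carry) << i     # sum bit of a half adder
        carry &= bit                     # carry into the next stage
    return result


def subtract(a, b):
    """a - b on 14-bit patterns, computed as a + (-b) with wrap-around."""
    return (a + negate(b)) & MASK


def to_signed(code):
    """Interpret a 14-bit pattern as a signed integer."""
    return code - (1 << WORD) if code >> (WORD - 1) else code
```

For example, `to_signed(subtract(5, 7))` evaluates to -2.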
The multiplication by 0.125 (Figure 4-7), required in the UpperBranch, effectively means shifting the data three bits towards the Least Significant Bit (LSB). In order to handle negative numbers in two’s complement arithmetic, the Most Significant Bit (MSB) has been propagated into the next three bits (sign extension). The output is given in 18 bits without any loss of precision. In fact, 17 bits are enough to provide full accuracy. However, an 18-bit word was chosen for consistency with the other multiplier, which multiplies by 0.5625.
The multiplication by 0.5625 is more complicated (Figure 4-8), as this requires adding together two shifted versions of the input, by one bit (0.5 factor) and by four bits (0.0625 factor). The result is available after a maximum time of For such a case 18 bits are required to provide the output at the full accuracy.
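Both multipliers reduce to shifts and, for 0.5625, one addition. The sketch below works on signed Python integers, where the `>>` operator is an arithmetic shift and therefore performs the sign extension automatically. It assumes a 14-bit input with 11 fractional bits and an 18-bit output with 15 fractional bits (four extra LSBs); the function names are hypothetical.

```python
IN_FRAC = 11     # assumption: fractional bits of the 14-bit input
OUT_FRAC = 15    # assumption: fractional bits of the 18-bit product


def mul_0125(x14):
    """x * 0.125: align to the 18-bit format, then shift right by three bits."""
    wide = x14 << (OUT_FRAC - IN_FRAC)
    return wide >> 3


def mul_05625(x14):
    """x * 0.5625 = x * 0.5 + x * 0.0625: sum of two shifted copies."""
    wide = x14 << (OUT_FRAC - IN_FRAC)
    return (wide >> 1) + (wide >> 4)


x = round(-1.5 * 2 ** IN_FRAC)          # -1.5 in the 14-bit format
product = mul_05625(x) / 2 ** OUT_FRAC  # back to a real value
```

With four extra fractional bits none of the shifted-out bits are lost, so both products are exact, and the largest product magnitude (9 times the input code) still fits in an 18-bit signed word.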
The result of the multiplication by 0.125 or 0.5625 is being added to the delayed samples of the input by the structure shown in Figure 4-10. Incorporated in the multiplier structure is a one-bit no-carry adder used to add an additional carry bit originated from the selected rounding scheme. Its implementation is shown in Figure 4-11.
The result of adding a 14-bit input to the 18-bit one is then constrained back to 14 bits using a round-to-zero scheme implemented with OR and AND gates. The four-input OR gate detects whether any of the discarded bits is set to 1. If MSB=0 (positive number), the output of the OR gate is disregarded, so the data is simply truncated. If MSB=1 (negative number), the output of the OR gate is added to the result, rounding it up towards zero. The maximum propagation time of the block was estimated to be
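The round-to-zero logic described above can be mimicked in Python on bit patterns. This is a sketch of the scheme as stated in the text (OR of the four discarded bits, gated by the sign bit with an AND, then added as a carry); the function name is hypothetical and overflow of the 14-bit result is not considered.

```python
def round_to_zero_18_to_14(x18):
    """Reduce a signed 18-bit value to 14 bits, rounding towards zero."""
    code = x18 & 0x3FFFF                 # 18-bit two's complement pattern
    msb = (code >> 17) & 1               # sign bit
    any_dropped = 1 if code & 0xF else 0 # four-input OR of the discarded bits
    kept = code >> 4                     # plain truncation to 14 bits
    out = (kept + (msb & any_dropped)) & 0x3FFF   # AND gate, then add as carry
    return out - 0x4000 if out & 0x2000 else out  # back to a signed integer
```

For example, -37 (that is, -2.3125 in units of the 14-bit LSB) becomes -2, whereas plain truncation would give -3; positive values such as 37 are truncated to 2.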
The 14-bit addition with truncation of the result has been implemented as in Figure 4-12. No loss of precision happens here as the format of the data is such that it takes care of the possible carry bit. The maximum propagation time was estimated to be The final and the simplest block of the polyphase two-path IIR structure is a divider by two implemented as a one-bit right shifter (Figure 4-9). It is required at the output of the filter to scale the overall transfer function to unity at the DC. The result has been subsequently truncated to 14 bits by disregarding the LSB of the input data.
The conversion of the Simulink model description into VHDL was achieved using the Target Language Compiler approach. This avoided most of the problems associated with the previously used MDL-to-VHDL program [87]. The basic blocks, such as D-type flip-flops with reset, standard logic gates and the two-phase clock generator (added to the custom library of Simulink blocks), could now be converted using their associated TLC files instead of being coded manually in VHDL. A top-level Simulink block converted to VHDL served as the test-bench file. It was used to compare the results of the VHDL simulation with the output from the fixed-point Simulink model. The complete output of the Simulink run was stored in a file comprising all bits of the input and the output. This file was read sample by sample and compared with the output of the VHDL simulation at each clock cycle. The compilation and simulation of the VHDL code, and subsequently its synthesis, were done with PeakFPGA from Accolade Design Automation Inc. [78], provided by the company for the purpose of evaluation. An example screen shot of the simulator running the designed VHDL code is shown in Figure 4-13.
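The sample-by-sample comparison performed by the test bench can be sketched outside VHDL as well. The Python function below is a hypothetical illustration: it assumes each sample is available as a string of bits, which is not necessarily the file format used by the authors.

```python
def first_mismatch(reference, simulated):
    """Index of the first sample that differs, or None if all samples match."""
    for index, (ref, sim) in enumerate(zip(reference, simulated)):
        if ref != sim:
            return index
    if len(reference) != len(simulated):   # one run is shorter than the other
        return min(len(reference), len(simulated))
    return None


simulink_out = ["00000000000000", "00001001000000", "11111100000000"]
vhdl_out = ["00000000000000", "00001001000000", "11111100000001"]
mismatch = first_mismatch(simulink_out, vhdl_out)
```

Here `mismatch` is 2, the index of the first clock cycle at which the two runs disagree.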
The test bench may be converted from Simulink, but it is better to create a new one that compares the results from Simulink with the results of the VHDL simulation, exactly as was done for the two-coefficient example design. The VHDL simulation differs from the high-level Simulink one in the following ways:
• Unassigned states must be avoided by properly resetting the design before it starts operating. This has been achieved by setting a CLR_L to zero for the first of the simulation, before the first rising edge of i.e. first reading of the data from the input.
• The propagation time through the blocks of the design plays an important role in the design. This parameter is not considered in Simulink at all.
The VHDL simulation allowed the maximum speed of operation of the design to be assessed; the corresponding clock period was approximately four times the maximum settling time of the combinational logic. For the simulation provided here the clock frequency has been set to 2.5MHz, assuming a 2ns propagation time for the logic gates.
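The timing figures quoted above can be related by simple arithmetic. The sketch below assumes that “four times the maximum settling time” refers to the clock period, so that a quarter of the period is available for the combinational logic to settle; the function name and this interpretation are assumptions.

```python
def clock_budget(clock_hz, gate_delay_s, settle_fraction=0.25):
    """Clock period, settling budget, and number of gate delays that fit in it."""
    period = 1.0 / clock_hz
    settle = period * settle_fraction      # assumed share of the period
    return period, settle, settle / gate_delay_s


period, settle, gate_delays = clock_budget(2.5e6, 2e-9)
```

For a 2.5MHz clock and 2ns gates this gives a 400ns period, a 100ns settling budget and room for about 50 gate delays in the longest combinational path.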
The simulation in Simulink required only that and were non-overlapping clock signals. The VHDL simulation showed that the best performance (highest speed of operation) was achieved when the overlapping time was a quarter of the clock period. The VHDL code designed for the example single-stage polyphase filter has been synthesised both in PeakFPGA version 4.25 and in Galileo for Xilinx from Mentor [79]. The first one returned the same result for all design families, including Actel, Altera, Lattice, Lucent and QuickLogic EDIF devices. It turned out that only 252 flip-flops and 790 two-input gates were required to implement the design (excluding the clock generator).
The results of the synthesis by Galileo, for Xilinx devices only, were different for each design component. These results are presented in Table 5.1. Galileo calculated that only 168 flip-flops were needed for the delayers. The difference was in the number of gates required, between 185 and 1018 depending on the technology. Galileo, in contrast to PeakFPGA, also gave the estimated input-to-output delay, between 86ns and 245ns depending on the technology used. The maximum clock frequency of the filter may therefore range from 1.1MHz for Xilinx-5200 and 1.4MHz for Xilinx-3000 up to 2.9MHz for Xilinx-3100. It depends on the propagation delay of the combinational logic, which is at most 4.5ns for Xilinx-5200, 3ns for Xilinx-3000 and 1.5ns for Xilinx-3100. The propagation delay for Xilinx-9500XL is 4-6ns. The sequential delay of a flip-flop is at most 6ns, and all the flip-flops work in parallel. Therefore the preferable technology for implementing the filter could be Xilinx-3100A, giving the best speed of operation at low cost and optimum use of the FPGA. Assuming technology with a transistor size of the gate consisting of four transistors and each flip-flop consisting of eight, the estimated total size of the components of the design would be approximately 0.2mm by 0.2mm, plus a few percent for the connections.
3. SUMMARY
The specimen filter that has been designed could comfortably be used for the first four stages of the decimation filter described in [25]. Even considering that the design has to be repeated eight times, the total required silicon area of 0.5mm by 0.5mm is very small. Putting the hardwired filters together would avoid the need for fast processors, giving more space for the analogue part of the A/D converter, hopefully the whole modulator. The small size is a big advantage, as it would free up silicon real estate for the implementation of other functions. The example design of the polyphase filter and its subsequent conversion into VHDL proved that this approach would be a very attractive way of designing test chips very quickly. It took three days to get from the Simulink model to its final synthesised version. The next stage of the research work would be either to compile the design to a custom layout and put it onto silicon, or to commit it to a standard FPGA.

The current version of the program performs only a direct mapping of structures from Simulink to VHDL and does not work for multiplexed architectures. In order to perform such a conversion, the program requires an algorithm that analyses behavioural or structural descriptions to find common operators and converts them into a multiplexed structure with added control circuitry. This is the aim of the on-going work.
Appendix A
SINGLE COEFFICIENT ALLPASS SECTION
A basic single-coefficient allpass section is described here. It is the basic building block of the polyphase recursive IIR filter structures described in this book. The commonly used structures of such a filter are shown in Figure A-1:
Structures (a) and (d) require less area on the integrated circuit than the other two, because they need fewer mathematical operations. One multiplier, two adders and a small memory are all they need. Structure (d) is very useful for cascading allpass sections, as the delayers can be shared by successive sections. The transfer function of this basic building block is given by:
The basic allpass section impulse response:
The phase response:
The group delay response:
The step response obtained by convolution in time domain:
Total energy:
Impulse response centre of gravity:
Average time delay:
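The quantities listed above can be checked numerically. The following Python sketch assumes the common first-order form of a single-coefficient allpass section, A(z) = (alpha + z^-1) / (1 + alpha z^-1), which may differ from the exact form used in the book (for example in the order of the delay). The function names and the coefficient value are chosen for illustration only.

```python
def allpass_impulse(alpha, n_samples):
    """Impulse response of y[n] = alpha*x[n] + x[n-1] - alpha*y[n-1]."""
    h, x_prev, y_prev = [], 0.0, 0.0
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        y = alpha * x + x_prev - alpha * y_prev
        h.append(y)
        x_prev, y_prev = x, y
    return h


def total_energy(h):
    """Sum of the squared impulse response samples."""
    return sum(v * v for v in h)


def centre_of_gravity(h):
    """Energy-weighted mean time index of the impulse response."""
    return sum(n * v * v for n, v in enumerate(h)) / total_energy(h)


def step_response(h):
    """Running sum of the impulse response (convolution with a unit step)."""
    out, acc = [], 0.0
    for v in h:
        acc += v
        out.append(acc)
    return out


h = allpass_impulse(0.125, 200)
```

For this assumed form the first two samples are alpha and 1 - alpha^2, the total energy is 1 (as expected for an allpass), the centre of gravity of the impulse response lies at one sample, and the step response settles at 1.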
References
1. harris, f., “On the design and performance of efficient and novel filter structures using recursive allpass filters”, IEEE 3rd International Symposium on Signal Processing and its Applications (ISSPA’92), Volume: 1, Page(s): 1-5, Gold Coast, Queensland, Australia, 16-21 August 1992. 2. Kale, I, R. C. S. Morling, A. Krukowski and D. A. Devine, “A high fidelity decimation filter for Sigma-Delta A/D converters”, IEE Second International Conference on Advanced A-D and D-A Conversion Techniques and their Applications (ADDA’94), No: 393, Page(s): 30-35, Cambridge, United Kingdom, 6-8 July 1994. 3. Krukowski, A., I. Kale, K. Hejn and G. D. Cain, “A bit-flipping approach to multistage two-path decimation filter design”, Second International Symposium on DSP for Communications Systems (SPRI’94), Adelaide, Australia, 26-29 April 1994. 4. harris, f., M. d’Oreye de Lantremange and A. G. Constantinides, “Digital signal processing with efficient polyphase recursive all-pass filters”, IEEE International Conference on Signal Processing, Florence, Italy, 4-6 September 1991. 5. Valenzuela, R. A. and A. G. Constantinides, “Digital signal processing schemes for efficient interpolation and decimation”, IEE Proceedings, Volume: 130, Part: G, No: 6, Page(s): 225-235, December 1983. 6. Korn, T. M. and G. A. Korn, Mathematical handbook for scientists and engineers: Definitions, theorems and formulas for reference and review, Dover Publications; ISBN: 0486411478, February 2000. 7. Kale, I., A. Krukowski and N. P. Murphy, “On achieving micro-dB ripple polyphase filters with binary scaled coefficients”, Second International Symposium on DSP for Communications Systems (SPRI’94), Adelaide, Australia, 26-29 April 1994. 8. Hejn, K. and A. 
Krukowski, “Insight into a digital sensor for sigma-delta modulator investigation”, IEEE Instrumentation and Measurement Technology Conference (IMTC’94), Proceedings: Advanced Technologies in Instrumentation and Measurement, Volume: 2, Page(s): 660-663, Hammamatsu, Shizuoka, Japan, 10-12 May 1994. 9. Kale, I., N. P. Murphy and M. V. Patel, “On establishing the bounds for binary scaled coefficients of fifth and seventh order polyphase half-band filters”, IEEE International Symposium on Circuits and Systems (ISCAS’94), Volume: 2, Page(s): 473-476, London, United Kingdom, 30 May - 2 June 1994.
10. Murphy, N. P., A. Krukowski and I. Kale, “Implementation of wideband integer and fractional delay element”, Electronics Letters, Volume: 30, No: 20, Page(s): 1658-1659, 29 September 1994. 11. Kale, I. and R. C. S. Morling, “High resolution data conversion via sigma-delta modulators and polyphase filters: a review”, Proceedings Measurement - Journal of the IMEKO, Elsevier Science Publisher, Volume: 19, No: 3/4, Page(s): 159-168, 1996. 12. Sheingold, H. D., Analog-digital conversion handbook, Edition, Analog Devices Inc., Prentice-Hall, New York (USA), ISBN: 0130328480, July 1997. 13. Bennett, W. R., “Spectra of quantized signals”, Bell Systems Technical Journal, Volume: 27, Page(s): 46-472, July 1948. 14. Aziz, P. M., H. V. Sorensen and J. Van Der Spiegel, “An overview of sigma-delta converters: How 1-bit ADC achieves more than 16-bit resolution”, IEEE Signal Processing Magazine, Page(s): 61-84, January 1996. 15. Boser, B. E., “Design and implementation of oversampled Analog-to-Digital converters”, PhD Thesis (Contract: 88-DJ-112), Stanford University, California, USA, October 1988. 16. Hejn, K., N. P. Murphy and I. Kale, “Measurement and enhancement of multistage sigmadelta modulators”, IEEE Instrumentation and Measurement Technology Conference (IMTC’92), Page(s): 545 - 551, New York, USA, 12-14 May 1992. 17. Chu, S. and C. S. Burrus “Multirate filter designs using comb filters”, IEEE Transactions on Circuits and Systems, Volume: 31, No: 11, Page(s): 913-924, November 1984. 18. Krukowski, A., “Decimation filter design for oversampled A/D converters”, MSc Project Report in DSP Systems, University of Westminster, London, United Kingdom, 1993. 19. Dijkstra, E., O. Nys, C. Piguet and M. Degrauwe “On the use of modulo arithmetic COMB filters in sigma-delta modulators”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’88), Volume: 4, Page(s): 2001-2004, New York, USA, 11-14 April 1988. 20. Park, S. and W. 
Chen, “Multi-stage IIR decimation filter design technique for high resolution sigma-delta A/D converters”, IEEE Instrumentation and Measurement Technology Conference (IMTC’92), Page(s): 561-566, New York, USA, 12-14 May 1992. 21. Dijkstra, E., L. Cardoletti, O. Nys, C. Piguet and M. Degrauwe “Wave digital decimation filters in oversampled A/D converters”, IEEE International Symposium on Circuits and Systems (ISCAS’88), Volume: 3, Page(s): 2327-2330, Espoo, Finland, 7-9 June 1988. 22. Kale, I., R. C. S. Morling, A. Krukowski and D. A. Devine, “Architectural design simulation and silicon implementation of a very high fidelity decimation filter for SigmaDelta data converters”, IEEE Instrumentation and Measurement Technology Conference (IMTC’94), Proceedings: Advanced Technologies in Instrumentation and Measurement, Volume: 2, Page(s): 878-881, Hammamatsu, Shizuoka, Japan, 10-12 May 1994. 23. Krukowski A., R. C. S. Morling and I. Kale, “Quantization effects in the polyphase N-path IIR structure”, IEEE Instrumentation and Measurement Technology Conference (IMTC’2001), Volume: 2, Page(s): 1382-1385, Budapest, Hungary, 21-23 May 2001. 24. Kale, I., R. C. S. Morling and A. Krukowski, “The design, simulation and silicon implementation of a very high fidelity 24-bit potential decimation filter for sigma-delta A/D converters”, Fourth Cost#229 Workshop on Adaptive Methods and Emergent Techniques for Signal Processing and Communications, Page(s): 155-161, Ljubljana, Slovenia, 5-7 April 1994. 25. Kale, I., R. C. S. Morling and A. Krukowski, “A high-fidelity decimator chip for the measurement of sigma-delta modulator performance”, IEEE Transactions on Instrumentation and Measurement, Volume: 44, No: 5, October 1995.
26. Jantzi, S., R. Schreier and W. M. Snelgrove, “Bandpass sigma-delta analog-to-digital conversion”, IEEE Transactions on Circuits and Systems, Volume: 38, No: 11, Page(s): 1406-1409, November 1991. 27. Krukowski, A. and I. Kale, “Constrained coefficient variable cut-off polyphase decimation filters for band-pass Sigma-Delta data conversion”, IMEKO Workshop on ADC Modeling, Page(s): 85-90, Smolenice Castle, Slovak Republic, 7-9 May 1996. 28. Schreier R. and W. M. Snelgrove, “Decimation for bandpass sigma-delta analogue-todigital conversion”, IEEE International Symposium on Circuits and Systems (ISCAS’90), Volume: 3, Page(s): 1801-1804, New Orleans, USA, 1-3 May 1990. 29. Krukowski, A., I. Kale, K. Hejn and R. C. S. Morling, “A design technique for polyphase decimators with binary constrained coefficients for high resolution A/D converters”, IEEE International Symposium on Circuits and Systems (ISCAS’94), Volume: 2, Page(s): 533-536, London, United Kingdom, 30 May - 2 June 1994. 30. Constantinides, A. G., “Spectral transformations for digital filters”, IEE Proceedings, Volume: 117, No: 8, Page(s): 1585-1590, August 1970. 31. Mahoney, M., DSP-based testing of analogue and mixed-signal circuits, Wiley - IEEE Computer Society Press, ISBN: 0-8186-0785-8, April 1987. 32. Krukowski A. and I. Kale, “The design of arbitrary-band multi-path polyphase IIR filters”, IEEE International Symposium on Circuits and Systems (ISCAS’2001), Volume: 2, Page(s): 741-744, Sydney, Australia, 6-9 May 2001. 33. Curtis, T. E. and A. B. Webb, “High performance signal acquisition systems for sonar applications”, IEE International Conference on Analogue to Digital and Digital to Analogue Conversion, IEE Conference Publication No: 343, Page(s): 87-94, Swansea, United Kingdom, 17-19 September 1991. 34. 
Lawson, S., “On design techniques for approximately linear phase recursive digital filters”, IEEE International Symposium on Circuits and Systems (ISCAS’97), Volume: 4, Page(s): 2212-2215, 9-12 June 1997. 35. Lu, W. S., “Design of stable IIR digital filters with equiripple passbands and peakconstrained least squares stopbands”, IEEE International Symposium on Circuits and Systems (ISCAS’97), Volume: 4, Page(s): 2192-2195, 9-12 June 1997. 36. Lawson, S. S., “Direct approach to design of PCAS filters with combined gain and phase specification”, IEEE Proceedings on Vision, Image and Signal Processing, Volume: 141, No: 3, Page(s): 161-167, June 1994. 37. Constantinides, A. G., “Frequency transformations for digital filters”, Electronics Letters, Volume: 3, No: 11, Page(s): 487-489, November 1967. 38. Constantinides, A. G., “Design of bandpass digital filters”, Proceedings of IEEE, Volume: 1, No: 1, Page(s): 1129-1231, June 1969. 39. Constantinides, A. G., “Frequency transformations for digital filters”, Electronics Letters, Volume: 4, No: 7, Page(s): 115-116, April 1968. 40. Broome, P., “A frequency transformation of numerical filters”, Proceedings of IEEE, Volume: 52, Page(s): 326-327, February 1966. 41. Hazra, S. N. and S. C. Dutta Roy, “A simple modification of Broome’s transformation for linear-phase FIR filters”, Proceedings of IEEE, Volume: 74, No: 1, Page(s): 227-228, January 1986. 42. Cain, G. D., A. Krukowski and I. Kale, “High order transformations for flexible IIR filter design”, VII European Signal Processing Conference (EUSIPCO’94), Volume: 3, Page(s): 1582-1585, Edinburgh, Scotland, 13-16 September 1994.
43. Krukowski, A., G. D. Cain and I. Kale, “Custom designed high-order frequency transformations for IIR filters”, IEEE 38th Midwest Symposium on Circuits and Systems (MWSCAS’95), Volume: 1, Page(s): 588-591, Rio de Janeiro, Brazil, 13-16 August 1995. 44. Nowrouzian, B. and A. G. Constantinides, “Prototype reference transfer function parameters in the discrete-time frequency transformations”, IEEE 33rd Midwest Symposium on Circuits and Systems (MWCAS’90), Volume: 2, Page(s): 1078-1082, Calgary, Canada, 12-14 August 1990. 45. Franchitti, J. C., “Allpass filter interpolation and frequency transformation problems”, MSc Thesis, Electrical and Computer Engineering Department, University of Colorado, 1985. 46. Feyh, G., J. C. Franchitti and C. T. Mullis, “All-pass filter interpolation and frequency transformation problem”, 20th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, USA, Page(s): 164-168, 10-12 November 1986. 47. Mullis, C. T. and R. A. Roberts, Digital Signal Processing, Section 6.7, Addison-Wesley Publication Company, ISBN: 0201163500, 1 February 1987. 48. Feyh, G., W. B. Jones and C. T. Mullis, “An extension of the Schur algorithm for frequency transformations”, Linear Circuits, Systems and Signal Processing: Theory and Application, Editors: C.I. Byrnes, C.F. Martin and R.E. Saeks, New York: North Holland, ISBN: 0444704957, October 1988. 49. Schuessler, H. W., “Implementation of variable digital filters”, First European Signal Processing Conference (EUSIPCO’80), Signal Processing: Theories and Applications, Page(s): 123-129, Lausanne, Switzerland, September 1980. 50. Jarske, P., S. K. Mitra and Y. Neuvo, “Signal processor implementation of variable digital filters”, IEEE Transactions Instrumentation and Measurement, Volume: 37, No: 3, Page(s): 363-367, September 1988. 51. Krukowski, A., I. Kale and R.C.S. 
Morling, “The design of polyphase-based IIR multiband filters”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’97), Volume: 3, Page(s): 2213-2216, Munich, 21-24 April 1997. 52. Saghizadeh, P. and A. N. Wilson, Jr., “A genetic approach to the design of M-channel uniform-band perfect-reconstruction linear-phase FIR filter banks”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’95), Volume: 2, Page(s): 1300-1303, Detroit, USA, 8-12 May 1995. 53. Friedlander, B. and B. Porat, “The modified Yule-Walker method of ARMA spectral estimation”, IEEE Transactions on Aerospace Electronic Systems, Volume: 20, No: 2, Page(s): 158-173, March 1984. 54. Mitra, S. K. and K. Hirano, “Digital allpass networks”, IEEE Transactions on Circuits and Systems, Volume: 21, No: 5, Page(s): 688-700, September 1974. 55. Saramaki, T., “On the design of digital filters as a sum of two allpass filters”, IEEE Transactions on Circuits and Systems, Volume: 32, No: 11, November 1985. 56. Saramaki, T., Tian-Hu Yu and S. K. Mitra, “Very low sensitivity realization of IIR digital filters using a cascade of complex all-pass structures”, IEEE Transactions on Circuits and Systems, Volume: 34, No: 8, Page(s): 876-886, August 1987. 57. Krukowski, A., I. Kale and G. D. Cain, “Decomposition of IIR transfer functions into parallel, arbitrary-order IIR subfilters”, Nordic Signal Processing Symposium (NORSIG’96), Espoo, Finland, 24-27 September 1996. 58. Kale, I., J. Gryka, G.D. Cain and B. Beliczynski, “FIR filter order reduction: balanced model truncation and Hankel-norm optimal approximation”, Proceedings IEE on Vision, Image and Signal Processing, Volume: 141, No: 3, Page(s): 168-174, June 1994. 59. Numerical Recipes in C++, 2nd Edition, Cambridge University Press, Cambridge, MA 02238 (USA), ISBN: 0-521-75033-4, 2002.
60. Nelder, J. A. and R. Mead, “A simplex method for function minimization”, Computer Journal, Volume: 7, Page(s): 308-313, 1965. 61. Samueli, H., “An improved search algorithm for the design of multiplierless FIR filters with powers-of-two coefficients”, IEEE Transactions on Circuits and Systems, Volume: 36, No: 7, Page(s): 1044-1047, July 1989. 62. Hwang, A., Computer Arithmetic, Principles, Architecture and Design, John Wiley & Sons, New York (USA), ASIN: 0471034967, January 1979. 63. Psoloinis, P. C., “VHDL to silicon implementation of a high resolution decimation filter”, BEng Honors Project Report, University of Westminster, London, United Kingdom, 1995. 64. Murphy, N. P., A. Krukowski and A. Tarczynski, “An efficient fractional sample delayor for digital beam steering”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’97), Volume: 3, Page(s): 2245-2248, Munich, Germany, 21-24 April 1997. 65. Liu, G. S. and C. H. Wei, “A new variable fractional sample delay filter with nonlinear interpolation”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Volume: 39, No: 2, Page(s): 123-126, February 1992. 66. Elliot, Douglas F., Handbook of Digital Signal Processing - Engineering Applications, Chapter 3, Academic Press, New York (USA), ISBN: 0122370759, November 1997. 67. Cain, G. D., N. P. Murphy and A. Tarczynski, “Evaluation of several variable FIR fractional-sample delay filters”, IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP’94), Volume: 3, Page(s): 621-624, Adelaide, April 1994. 68. Skolnik, M. I., Introduction to Radar Systems, McGraw-Hill Companies, Edition, ISBN: 0072909803, August 2000. 69. Monzingo, R. A. and T. W. Miller, Introduction to Adaptive Arrays, Wiley John & Sons Inc., New York (USA), ISBN:0471057444 , September 1980. 70. 
Kellermann, W., “A self-steering digital microphone array”, IEEE International Conference oon Acoustic, Speech and Signal Processing (ICASSP’91), Volume: 5, Page(s): 3581-3584, Toronto, Canada, 14-17 April 1991. 71. Krukowski A., R. C. S. Morling and I. Kale, “Quantization effects in the polyphase N-path IIR structure”, IEEE Transactions on Instrumentation and Measurement, Volume: 51, No: 5, Page(s): 1271-1278, December 2002. 72. Vaidyanathan, P. P., Multirate Systems and Filter Banks, Prentice Hall PTR (USA), ISBN: 0136057187, 1st Edition, 21 September 1992. 73. Murphy, N. P., A. Tarczynski and T. I. Laakso, “Sampling-rate conversion using a wideband tunable fractional delay element”, Nordic Signal Processing Symposium (NORSIG’96), Page(s): 423-426, Espoo, Finland, 24-27 September 1996. 74. Välimäki, V., Fractional Delay Waveguide Modeling of Acoustic Tubes, Report No: 34, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal processing. Espoo, Finland, July 1994. 75. Farrow, C. W., “A continuously variable digital delay element”, IEEE International Symposium on Circuits and Systems (ISCAS’88), Volume: 3, Page(s): 2641-2645, Espoo, Finland, 7-9 June 1988. 76. Ashenden, P. J., The Designer’s Guide to VHDL, Morgan Kaufmann Publishers, San Francisco (USA), ISBN: 1-55860-270-4, 1995. 77. Holmes, C., VHDL Language Course, Rutherford Appleton Laboratory, Microelectronics Support Centre, Chilton, Didcot, USA, 23-25 May 1995. 78. Krukowski A. and I. Kale, “Constraint two-path polyphase IIR filter design using downhill simplex algorithm”, IEEE International Symposium on Circuits and Systems (ISCAS’2001), Volume: 2, Page(s): 749-752, Sydney, Australia, 6-9 May 2001.
79. Morgan D and C. Thi, “A delayless subband adaptive filter architecture”, IEEE Transactions on Signal Processing, Volume: 43, No: 8, 1995. 80. Chen J.D., H. Bes, J. Vandewalle et al, “A zero-delay FFT-based subband acoustic echo canceller for teleconferencing and hands-free telephone systems”, IEEE Transactions on Circuits and Systems II, Volume: 43, No: 10, Page(s): 713(7), 1996. 81. Sondhi, M. M. and W. Kellerman, “Adaptive echo cancellation for speech signals”, Advances in Speech Signal Processing, S. Furui and M. M. Sondhi, Eds, NY: M. Dekker, Ch. 1, 1992. 82. Gerald J., N. L. Esteves and M. M. Silva, “A new IIR echo canceller structure”, IEEE Transactions on Circuits and Systems II, Volume: 42, No: 12, Page(s): 818-821, 1995. 83. Naylor P. A., O. and A. G. Constantinides, “Subband adaptive filtering for acoustic echo control using allpass polyphase IIR filter banks”, IEEE Transactions on Signal Processing, Volume: 6, No: 2, 1998. 85. Vaidyanathan P. P., “Multirate digital filters, filter banks, polyphase networks, and applications: A tutorial”, IEEE Proceedings, Volume: 78, No: 1, 1990. 86. Valimaki, V., Discrete-Time Modeling of Acoustic Tubes Fractional Delay Filters, PhD Thesis, Helsinki University of Technology, Finland, ISBN: 951-22-2880-7, TKK Offset, December 1995. 87. Krukowski, A. and I. Kale, “Simulink/Matlab-to-VHDL Route for Full-Custom/FPGA Rapid Prototyping of DSP Algorithms”, MATLAB DSP Conference 1999, Dipoli Conference Centre, Espoo, Finland, November 16-17, 1999.
Index

Additive Noise Model 48
AGDF See Arbitrary Group Delay Filter
Allpass Section
    average time delay 236
    centre of gravity 236
    description 233–38
    group delay response 238
    impulse response 234
    phase response 236
    phase response properties 5
    step response 235
    structures 234
    total energy 236
ALU See Arithmetic-Logic Unit
Amoeba See Downhill Simplex Method
Arbitrary Group Delay Filter xii, 204
Arithmetic-Logic Unit 161
Balanced Model Truncation 41
Bandpass Oversampling Ratio 66
Bit-Flipping 16, 56, 60, 69, 78, 167, 171, 173
BMT See Balanced Model Truncation
Canonical Signed-Digit Code 174
CDS See Downhill Simplex Method
Compensation Filter 23, 67
Constantinides 8, 18, 96, 105, 114, 147
Constrained Downhill Simplex 16
CSDC See Canonical Signed-Digit Code
DC Mobility 107, 124
Difference-Multiply-Accumulate 179
Digital Audio 51
Direct Decomposition Method 36
DMAC See Difference-Multiply-Accumulate
Dolph-Chebyshev Window 203
Downhill Simplex Method
    basic moves 170
    constrained 170
    overview 169
DPRAM See Dual-Port RAM
Dual-Port RAM 179
Dynamic Range 54, 55, 89, 95, 189
Elliptic Filter 8, 41, 60, 114
Equivalent Lowpass Magnitude Response 81
Fourier Summation Transforms 197
Frequency Transformation
    lowpass-to-lowpass 28
    mapping function 30
FST See Fourier Summation Transforms
Harris 8, 18
Hilbert Transformation 115
Hybrid Amoeba See Downhill Simplex Method
Interpolation Filter 63, 72, 73, 74, 76, 201
Inverse Hilbert Transformation 116
Lagrangian Interpolation Filter 203
Least Significant Bit 163, 226
LMS 95, 96, 98
LSB 163
MAVR 21, See Moving Average Filter
Morgan 100, 102, 104, 243, 244
Moving Average Filter 20
Mullis 96, 109, 147, 242
Multiband Filter 27, 129, 154, 155, 157
Multiplexer 200
Multistage Decimation 56
Natural Binary Code 174
NBC See Natural Binary Code
Noise Shaping Function 191
NSF See Noise Shaping Function
Nyquist Converter 45, 46, 47, 49
Nyquist Mobility 107, 124
Oversampling Ratio 48, 51, 54, 55, 56, 57, 68, 80
Partial Fraction Expansion 44
PCM See Pulse Code Modulation
PDM See Pulse Density Modulation
PeakFPGA 229
Perfect Reconstruction 97, 104, 148
PFE See Partial Fraction Expansion
Phase Linearity 6, 79, 82, 83, 202
Polyphase IIR Filter
    group delay 7
    N-path structure 20
    passband and stopband ripples 5
    phase response 2
    pole-zero plots for 2 coefficients 14
    pole-zero plots for 3 coefficients 19
    structure 1
    two-path halfband 3
Power Spectral Density 49, 191
Predictive Modulators 51
PSD See Power Spectral Density
Pulse Code Modulation 46
Pulse Density Modulation 49
Quadrature Mirror Filter Banks 115
Quantization Noise 48, 53
Quantization Schemes 185
    convergent rounding 187
    rounding 186
    rounding to infinity 186
    rounding to zero 186
    truncating 186
Real-Time Workshop 216, 218, 219
Reconstruction Error 41, 97, 101
Rotation Factor 106, 113, 130
RTW See Real-Time Workshop
Sample Rate Decreaser 97, 176
Sample Rate Increaser 200
Sigma-Delta Modulator 3, 45, 49, 63, 239, 240, 241
Signed Binary Code 13
Simplex 169, 170, 171, 243, 244
Simulink 214
    MDL 215
SRD See Sample Rate Decreaser
Stability-Forced Transformation 140, 141
Subband Adaptive Echo Cancellation 94
Successive-Separation Method 34, 37
Target Language Compiler 216
    code generation 219
    diagram 217
    inlining S-functions 219
TLC See Target Language Compiler
Transformation Function 29, 30
Unsigned Binary Code 13
VHDL 211
VLSI 45, 47, 49, 51
Yulewalk 158, 159
Zero Insertion Interpolation 200
ZII See Zero Insertion Interpolation
See Sigma-Delta Modulator