Genetic Algorithm Wavelet Design for Signal Classification

Eric Jones, Member, IEEE, Paul Runkle, Member, IEEE, Nilanjan Dasgupta, Student Member, IEEE, Luise Couchman, and Lawrence Carin, Fellow, IEEE

Abstract: Biorthogonal wavelets are applied to parse multiaspect transient scattering data in the context of signal classification. A language-based genetic algorithm is used to design wavelet filters that enhance classification performance. The biorthogonal wavelets are implemented via the lifting procedure, and the optimization is carried out using a classification-based cost function. Example results are presented for target classification using measured scattering data.

Index Terms: Genetic algorithms, wavelets, classification.
1 INTRODUCTION
WAVELET design is a problem that has attracted significant attention over the last decade. Tewfik et al. [1] designed orthogonal wavelets by computing bounds on cost functions, the latter based either on minimizing the error between the original and approximate signal representations or on maximizing the norm of the projection of the signal onto the wavelet space. Oslick et al. [2] have developed a paradigm for the design of general biorthogonal wavelets, with this construct amenable to a cost function. These and other wavelet design paradigms are based on frequency-domain filter characteristics. As an alternative, Sweldens [3] has developed a general scheme for designing biorthogonal wavelets, implemented directly in the time domain. This formalism, termed "lifting," yields a simple technique for insertion into a general cost function. For example, Claypoole et al. [4] have developed several techniques for adaptive wavelet design based on lifting. In [4], an efficient design solution was based on a linearly constrained least-squares minimization of the signal-representation error at each wavelet level. While the wavelet design techniques in [1], [4] have clear utility in the context of compression, there are other applications for which alternative cost functions are desirable.

In this paper, we are interested in signal classification based on wavelet-based feature parsing. While error minimization of the wavelet representation may be a salutary goal, it does not address the ultimate objective of improved classification performance. The optimal choice of wavelets for signal classification depends on the details of the signal classes and on the classifier. In many cases, the classifier is too complicated to allow a direct solution for the optimal wavelet representation, suggesting the use of a genetic algorithm (GA) [5] for cost-function optimization.
E. Jones, P. Runkle, N. Dasgupta, and L. Carin are with the Department of Electrical and Computer Engineering, Duke University, Box 90291, Durham, NC 27708-0291. E-mail: [email protected].
L. Couchman is with the Naval Research Laboratory, Physical Acoustics, Code 7130, Washington, DC 20375-5000. E-mail: [email protected].
Manuscript received 12 Apr. 2000; revised 30 Nov. 2000; accepted 9 Feb. 2001. Recommended for acceptance by A. Kundu.
This approach is pursued in this paper, and the classification performance of the GA-designed wavelets is compared to that of classical (Cohen et al. [6]) wavelets, as well as to wavelets designed by minimizing the error in the wavelet representation [4]. It is important to note that the cost function employed in [4] allowed a direct solution, without the need for a GA. However, that cost function does not permit one to address the ultimate goal of improved classification. The cost function introduced here is based explicitly on classifier performance, which does not permit a direct solution for the optimal wavelets. Therefore, we have employed this classification-based cost function within a GA.
2 LIFTING-BASED WAVELET DESIGN
The discrete wavelet transform (DWT) is computed efficiently via a recursive multirate filterbank [7]. At each scale $j$, the filterbank decomposes the signal into low-pass and high-pass components through convolution (and subsequent decimation) with FIR filters $h$ and $g$, respectively. The DWT representation is composed of scaling coefficients, $c^0_k$, representing coarse or low-pass signal information at scale $j = 0$, and wavelet coefficients, $d^j_k$, representing signal detail at scales $j = 1, \ldots, J$. Formally,

$c^j_k = \sum_m h(2k - m)\, c^{j+1}_m, \qquad d^j_k = \sum_m g(2k - m)\, c^{j+1}_m,$   (1)

where the original discrete-time signal is given by $c^J_k$, of length $2^J$ samples. At the $j$th level, both $c^j_k$ and $d^j_k$ are composed of $2^j$ samples, forming a tree-like relationship between the coefficients at successive scales. Signal reconstruction may be effected through application of the inverse DWT [7].

Although standard wavelet families are well suited to the analysis of general signals, it is also possible to design wavelet transforms that are adapted to the signals of interest. Rather than use an orthogonal wavelet basis, we choose biorthogonal wavelets, which allow more flexibility in the system design. Among the advantages of a biorthogonal system is that the filters $h$ and $g$ in (1) need not be of the same length, so the parameters may be repartitioned to meet specific design constraints.

In this paper, we exploit an alternative architecture to the multirate analysis filterbank, known as the lifting scheme [8]. It has been shown that any set of wavelet and scaling filters, including those associated with a biorthogonal basis, may be decomposed in terms of the lifting structure, which is a lattice-type realization of the multirate filterbank [8]. At each scale, the lifting scheme is implemented in the following manner. The signal under analysis, $c^{j+1}_k$, is split into its even and odd components, $c^{j+1}_{2k}$ and $c^{j+1}_{2k+1}$, analogous to the decimation operation in the standard multirate filterbank. An FIR filter, $p$, of order $N_p$ is used to predict the odd components as a linear combination of the even. The wavelet coefficients may be identified as the "detail" in the higher-resolution data that is not predicted by the even component of the analysis signal:

$d^j_k = c^{j+1}_{2k+1} - \sum_m p(m)\, c^{j+1}_{2(m + k - n_o)},$   (2)

where $n_o = (N_p - 2)/2$ is a temporal shift that properly aligns the wavelet coefficients according to the prediction-filter order. From (1) and (2), $g$ is expressed in terms of the coefficients of $p$:

$g(2k) = -p(k), \qquad g(2k + 1) = \delta(k - n_o),$   (3)

with $k = 0, \ldots, N_p - 1$. From the necessary conditions imposed on the high-pass filter [8], we have $\sum_k p(k) = 1$, leaving $N_p - 1$ degrees of freedom for the prediction filter.
Following the prediction step is an "update," which augments the even component of the analysis signal with a combination of the detail to obtain a coarse approximation of the original signal. The update operation utilizes an FIR filter, $u$, of order $N_u$:

$c^j_k = c^{j+1}_{2k} + \sum_m u(m)\, d^j_{m + k - n_1},$   (4)

where $n_1 = (N_p + N_u + 2)/2$ provides the proper temporal shift to align the coarse coefficients in accord with the prediction- and update-filter orders. From (1) and (4), $h$ is a combination of the prediction and update filters:

$h(2k) = \delta(k - n_1) - \sum_m p(m)\, u(k - m), \qquad h(2k + 1) = u(k).$   (5)

A necessary condition on $h$, to satisfy the wavelet multiresolution property, is a partitioning of unity [8]:

$\sum_k h(2k) = \sum_k h(2k + 1).$   (6)

This yields the necessary condition $\sum_k u(k) = 1/2$. The number of degrees of freedom available for the adaptive multirate filter design at each scale is $N_{df} = N_p + N_u - 2$.

Within the lifting paradigm, there are several possible design strategies over the remaining $N_{df}$ parameters governing the biorthogonal basis. One such strategy is to suppress all polynomials of order lower than $N_p - 1$ at the output of $g$, and to pass all polynomials of order $N_u - 1$ through the filter $h$. This approach maximizes the smoothness of the coarse coefficients while enabling the detail coefficients to represent high-pass information; such a design procedure yields the symmetric biorthogonal Cohen, Daubechies, and Feauveau (CDF) wavelets [6]. In the lifting framework, these constraints are satisfied through the solution of linear equations over the space of $p$ and $u$, while the constraints imposed directly on $g$ and $h$ are not as straightforward [1]. In lieu of applying all $N_{df}$ degrees of freedom to smoothness constraints, one can use some of the available parameters to match the biorthogonal wavelet to the signal(s) of interest. For example, in [4], some of the available parameters are employed to enforce smoothness constraints, while the remainder are used to match (in a least-square-error sense) the biorthogonal wavelet to signals of interest.
3 WAVELET FEATURES
3.1 Features
The principal focus of this paper involves lifting-based wavelet construction via a GA design procedure, the latter employing a cost function linked directly to the classification problem. As demonstrated in Section 4, the GA procedure is applicable to general wavelet features and to a general classifier. Therefore, for demonstration of the basic principles here, we employ simple wavelet features. In particular, our features are based on moments of the normalized wavelet coefficients. These features characterize the envelope shape for a particular level (scale) of wavelet coefficients. Similar feature sets have been used in wavelet-based texture classification systems [9], which represent the temporal/spatial structure of the wavelet coefficients at each scale $j$. This approach provides a compact representation of the wavelet coefficients and is relatively robust to uncertainty in the training data. Also, while the wavelet transform itself is not shift-invariant, the shape of the envelope remains relatively constant for arbitrary shifts. Prior to feature extraction, the detail coefficients at scale $j$, $d^j_k$, are squared and normalized to form

$w_j(k) = \frac{(d^j_k)^2}{z_j}, \qquad z_j = \sum_k (d^j_k)^2.$   (7)
The signal $w_j(k)$ (denoted $w_j$) is defined to satisfy the necessary conditions of a probability mass function. The moments about the mean of $w_j$ are given by

$m_{rj} = \sum_k (k - m_{1j})^r\, w_j(k), \qquad m_{1j} = \sum_k k\, w_j(k).$   (8)

The feature set used in this paper is composed of the variance (breadth), skewness (asymmetry), and kurtosis (peakedness) of $w_j$. These parameters are derived from $m_{rj}$, $r = 2, 3, 4$, $j = 1, \ldots, J$ (we typically use features from only $L < J$ wavelet levels).
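As an illustration, a minimal sketch of the feature computation in (7) and (8) follows. The function name is hypothetical, and the normalization of the third and fourth moments into conventional skewness and kurtosis values is an assumption; the text states only that the features are derived from $m_{rj}$, $r = 2, 3, 4$.

import numpy as np

def wavelet_moment_features(detail_levels):
    # detail_levels: list of detail-coefficient arrays, one per wavelet level.
    features = []
    for d in detail_levels:
        d = np.asarray(d, dtype=float)
        w = d**2 / np.sum(d**2)            # (7): squared, normalized envelope
        k = np.arange(len(w))
        m1 = np.sum(k * w)                 # mean location of the envelope
        m2 = np.sum((k - m1)**2 * w)       # variance (breadth)
        m3 = np.sum((k - m1)**3 * w)       # third central moment
        m4 = np.sum((k - m1)**4 * w)       # fourth central moment
        # Breadth, skewness, kurtosis per level (normalized forms assumed here).
        features.extend([m2, m3 / m2**1.5, m4 / m2**2])
    return np.asarray(features)            # length 3L feature vector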
3.2 Statistical Model for Features
If we consider $L$ wavelet levels, the moment-based features discussed above yield a $3L$-dimensional feature vector, $\mathbf{v}$. The statistical distribution of these feature vectors is characterized via vector quantization (VQ) trained using a K-means algorithm [10], with the features mapped onto integers corresponding to the nearest-neighbor codebook element: $k = Q(\mathbf{v})$, $k \in \{1, \ldots, K\}$, for a $K$-element codebook. In the work presented here, we are interested in transient scattering from a general target, with such scattering typically a strong function of the target-sensor orientation [11]. Therefore, we define target states [11], with each state characteristic of a set of contiguous target-sensor orientations for which the scattering is relatively stationary [11]. A VQ-based classifier is designed for each state, with this construct motivated by previous research in which these states are employed in a hidden Markov model (HMM) [11]. Given target states $S_m$, $m = 1, \ldots, M$, we first define a VQ codebook using feature vectors originating from all states. Given this codebook, the feature vectors from state $S_m$ are used to define a state-dependent probability mass function for the codebook elements, $p(Q(\mathbf{v}) = k \mid S_m)$, which we abbreviate as $p(\mathbf{v} \mid S_m)$. VQ was selected over other statistical feature models (such as Gaussian mixtures [12]) due to the algorithm's simplicity and the fact that our training data (see Section 5) were relatively limited. Prior to application of the K-means algorithm, the features are normalized such that a Euclidean-distance metric applied across the heterogeneous feature dimensions is appropriate.
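The construction can be summarized in a short sketch: a shared K-means codebook trained on normalized features pooled from all states, per-state codeword histograms, and the maximum-likelihood decision used later in Section 4.1. The function names, the light smoothing of empty histogram bins, and the use of SciPy's classic VQ routines are assumptions of the sketch.

import numpy as np
from scipy.cluster.vq import kmeans2, vq

def train_state_models(features_by_state, codebook_size=30):
    # Stack features from all states and z-score them so that Euclidean
    # distance is meaningful across the heterogeneous feature dimensions.
    all_feats = np.vstack(features_by_state)
    mean, std = all_feats.mean(axis=0), all_feats.std(axis=0) + 1e-12
    normalize = lambda x: (x - mean) / std

    # Shared K-means codebook trained on feature vectors from all states.
    codebook, _ = kmeans2(normalize(all_feats), codebook_size, minit='points')

    # State-dependent PMF over codebook elements, p(Q(v) = k | S_m).
    pmfs = []
    for feats in features_by_state:
        codes, _ = vq(normalize(feats), codebook)
        counts = np.bincount(codes, minlength=codebook_size).astype(float)
        counts += 1e-3                      # light smoothing for unseen codewords (an assumption)
        pmfs.append(counts / counts.sum())
    return codebook, pmfs, (mean, std)

def classify(v, codebook, pmfs, norm_stats):
    # Maximum-likelihood decision: pick the state whose PMF assigns the
    # highest probability to the codeword of v.
    mean, std = norm_stats
    k, _ = vq(((np.asarray(v, dtype=float) - mean) / std)[None, :], codebook)
    return int(np.argmax([pmf[k[0]] for pmf in pmfs]))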
4 GENETIC ALGORITHM IMPLEMENTATION
Genetic algorithms (GAs) constitute an optimization technique based on the "survival of the fittest" paradigm found in nature. The fundamentals of traditional GAs are well covered in [5]; in addition, [13], [14] cover a GA variant, called genetic programming, that is relevant to the methods used here. Genetic algorithms work with an abstract representation of a design called a chromosome. Here, we choose a tree structure that is compatible with language-based optimization [13], [14], [15], [16]. A classifier language describes the architecture of the classifier. The dictionary of words, or lexicon, for the language defines the components, subcomponents, and numerical parameters necessary to build a wavelet-based classifier. The language's grammar defines how these pieces are connected together. Newly generated classifier designs must be grammatically correct in order to be valid systems.
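As an illustration of such a tree chromosome, a minimal typed node structure might look as follows. The class name, fields, and traversal helper are hypothetical, introduced only so that the crossover sketch in Section 4.2 has something concrete to operate on.

from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Node:
    # gtype is the grammatical type used to constrain crossover (Section 4.2);
    # symbol is the terminal word, or the numeric value for NUMBER nodes.
    gtype: str                      # e.g. "IDENTIFIER", "STAGE", "NUMBER"
    symbol: Union[str, float]       # e.g. "state_identifier", "lifter_stage", 0.37
    children: List["Node"] = field(default_factory=list)

    def nodes(self):
        # Yield this node and all descendants (used to enumerate crossover sites).
        yield self
        for c in self.children:
            yield from c.nodes()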
4.1 Classifier Language
The problem at hand assumes that we have $M$ states $S_m$, each state defined by an associated ensemble of transient scattered waveforms. Each state is representative of a set of target-sensor orientations over which the scattering physics is stationary. Feature vector $\mathbf{v}$ is classified as associated with state $S_m$ if $p(\mathbf{v} \mid S_m) > p(\mathbf{v} \mid S_k)\ \forall\, k \neq m$ (maximum-likelihood discrimination). Our goal is to design distinct wavelet filters for each state $S_m$, such that the likelihood of classifying a scattered waveform with the correct state is maximized. The cost function employed in the GA
Fig. 1. Chromosome for a two-state classifier. Gray tags indicate the grammatical type of each node. Information for the coefficient arrays for the p and u filters has been omitted to conserve space.
discussed below maximizes the minimum probability of correct classification along the confusion-matrix diagonal [15]. While the approach generalizes easily to $M$ states, the following grammar (see Fig. 1) defines how the individual components of a two-state classifier fit within the GA design. A classifier is made up of two state_identifiers and a maximum-likelihood processor, max. Each state_identifier has two subcomponents, a feature_extractor and a statistical model. The feature_extractor also has two parts, a lifter_transform and the moment_features block. The statistical model used is vector_quantization with 30 codebook elements.

ST → classifier IDENTIFIERS LIKE_PROCR
IDENTIFIERS → list_join IDENTIFIER IDENTIFIER
LIKE_PROCR → max
IDENTIFIER → state_identifier EXTRACTOR STAT_MODEL
EXTRACTOR → feature_extractor LIFTER FEATURES
LIFTER → lifter_transform LIFTER_STAGES
FEATURES → moment_features
STAT_MODEL → DISCRETE
DISCRETE → vector_quantization CODE_COUNT
CODE_COUNT → 30.   (9)
Note that the list_join operator simply groups a set of objects together. Here, it groups two state_identifier objects; later, it is used to combine numbers into a list of filter coefficients. In (9), which is written in Backus-Naur form [16], the → symbol indicates transformation or substitution: symbols on the left of the arrow can be transformed into the symbols on the right-hand side. The | symbol is the "or" operator; it indicates that the left-hand-side symbol can be transformed into (replaced by) any of the rules on the right-hand side. Uppercase symbols in the rules are nonterminal symbols, and lowercase bold symbols are terminal symbols. The derived structure can only contain terminal symbols. Nonterminal symbols are used to define intermediate steps in the process of generating a valid sentence. Whenever a nonterminal is transformed into a rule that itself contains a nonterminal(s), one of the rules for that symbol is applied. All grammars have a start symbol that indicates which transformation in the grammar to begin with when generating a sentence. The start symbol here is ST.

The grammar in (9) is very rigid in that it does not allow for variations in the system, i.e., there is only a single choice for the likelihood processor (max), the statistical model (vector_quantization), and all the other components in the system. However, the LIFTER_STAGES symbol has not yet been defined. The GA allows variation in the number of lifter levels L (three or four) and the length of the p and u filters in each stage (four or five), and allows the p and u filter coefficients to have any value in the range (-1, 1).¹

LIFTER_STAGES → list_join STAGE STAGE2
STAGE2 → list_join STAGE STAGE3
STAGE3 → list_join STAGE STAGE4 | STAGE
STAGE4 → STAGE
STAGE → lifter_stage P U
P → COEFFICIENTS
U → COEFFICIENTS
COEFFICIENTS → list_join NUMBER COEFFICIENTS2
COEFFICIENTS2 → list_join NUMBER COEFFICIENTS3
COEFFICIENTS3 → list_join NUMBER COEFFICIENTS4
COEFFICIENTS4 → list_join NUMBER COEFFICIENTS5 | NUMBER
COEFFICIENTS5 → NUMBER
NUMBER → float(-1, 1).   (10)
Combining the grammars in (9) and (10) yields the complete grammar for the classifier language. Fig. 1 shows one of the many classifier chromosomes generated from the grammar. O'Neill and Ryan [17] have considered a related approach, employing string chromosomes rather than the tree chromosomes used here.

1. While the GA can choose any numerical value for the p and u coefficients, the values are always postprocessed to enforce $\sum_k p(k) = 1$ and $\sum_k u(k) = 1/2$. This is done by subtracting the appropriate DC offset from the filter coefficients.
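A sketch of that postprocessing step is given below; the function name is hypothetical, but the operation is the one described in the footnote: a constant (DC) offset is subtracted from each filter so that the required sums hold.

import numpy as np

def normalize_lifting_filters(p, u):
    # Enforce sum(p) = 1 and sum(u) = 1/2 by removing a DC offset from each filter.
    p = np.asarray(p, dtype=float)
    u = np.asarray(u, dtype=float)
    p = p - (p.sum() - 1.0) / len(p)     # shift so the p taps sum to 1
    u = u - (u.sum() - 0.5) / len(u)     # shift so the u taps sum to 1/2
    return p, u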
Fig. 2. Representative signals from the four states of a particular target.
4.2 Crossover for Tree Chromosomes
Breeding two chromosomes is done using the crossover operator. In crossover, a node is selected from the tree of each of two parent chromosomes. These nodes, along with their complete subtrees, are then swapped between the parents to form two new child chromosomes. In order for the children to be valid classifier systems, i.e., grammatically correct trees, crossover can only occur between nodes that have the same grammatical type. There is a vast literature on such crossovers; the reader is referred to [18], [19], [20].

Every node within a tree is a possible crossover site. This can be undesirable for several reasons. First, a large percentage of the nodes in a tree are leaf nodes (half, in full binary trees). As a result, a high percentage of the crossover operations occur at leaf nodes and exchange only a single node between the parents. Disallowing crossover between grammatical types that occur only at leaf nodes (such as NUMBER) forces crossovers to occur at higher-level nodes and increases the average amount of information exchanged between parents. Second, there are portions of the tree chromosome that are identical in all trees. For instance, because of the grammar definition in (9), all STAT_MODEL subtrees are identical in every state_identifier. Crossover at a STAT_MODEL-typed node generates offspring that are clones of their parents. Removing such nodes from the list of possible crossover sites increases the chance that offspring have different designs than their parents. As detailed in Section 5, a small percentage of the numerical parameters are mutated each generation, using Gaussian mutation with a standard deviation equal to a selected percentage of the parameter's value [21].
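The following sketch illustrates type-constrained subtree crossover on the hypothetical Node structure sketched in Section 4. The excluded-type list and the strategy of retrying until a matching type is found are assumptions; the paper specifies only that crossover is restricted to nodes of the same grammatical type and that leaf-only and invariant types are removed from the candidate sites.

import copy
import random

def crossover(parent_a, parent_b, excluded_types=("NUMBER", "STAT_MODEL")):
    # Copy the parents so the originals survive into the next generation unchanged.
    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    sites_a = [n for n in a.nodes() if n.gtype not in excluded_types]
    sites_b = [n for n in b.nodes() if n.gtype not in excluded_types]

    random.shuffle(sites_a)
    for node_a in sites_a:
        matches = [n for n in sites_b if n.gtype == node_a.gtype]
        if matches:
            node_b = random.choice(matches)
            # Swap the contents of the two nodes, and hence their whole subtrees.
            node_a.symbol, node_b.symbol = node_b.symbol, node_a.symbol
            node_a.children, node_b.children = node_b.children, node_a.children
            break
    return a, b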
5 EXAMPLE RESULTS
5.1 Preliminaries
We consider time-domain acoustic scattering from a submerged elastic target. The details of the measurement and of the target are found in [22]. As discussed above, the backscattered signal from such a target is a strong function of the target-sensor orientation. However, one can define states $S_m$ over which the transient scattered signal is stationary [11] as a function of aspect. In the work presented here, we consider $M = 4$ states. Each scattered waveform is parsed via a biorthogonal wavelet transform, and three moments are computed for each wavelet level (scale), as discussed in Section 3.1. The likelihood $p(\mathbf{v} \mid S_k)$ is quantified here via K-means vector quantization (VQ) [10] and, as indicated in (9), we consider a 30-element codebook. The codebook for each statistical model is generated using five noisy realizations of all four states, giving a total of approximately 1,000 training vectors. A distinct biorthogonal wavelet is designed for each of the $M = 4$ states. Our goal is to design four wavelet filters, matched to the corresponding target states, that maximize classification performance.
Each GA chromosome is analyzed as follows:

1. A classifier system is built based on the blueprint found in the chromosome.
2. The four state-dependent statistical models are trained to recognize their respective states, using five realizations of the noisy data set (as mentioned above, constituting approximately 1,000 training vectors).
3. The classifier is tested using a new set of 15 noise realizations.
4. The chromosome fitness is calculated. The chromosome fitness is a quantity that characterizes the quality of the individual.
In this context, the performance of the classifier is characterized with a confusion matrix. The confusion matrix can be reduced to a fitness value in a variety of ways. For example, one can use the average classification rate. Alternatively, one can choose the worst classification rate as the fitness value, effectively forcing all states to have an acceptable classification rate. The chosen fitness function combines the two approaches: $\mathrm{Fitness} = \min(C) + a \cdot \mathrm{mean}(C)$. Here, $C$ is the diagonal of the confusion matrix and $a$ is a weighting factor. A value of $a = 0.2$ has proven to yield results that balance minimum performance standards with high overall correct classification rates.
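A sketch of this fitness computation follows. Whether the confusion matrix is stored as counts or as rates is not stated, so the row normalization here is an assumption (it leaves a matrix of rates unchanged when each row already sums to one).

import numpy as np

def fitness_from_confusion(confusion, a=0.2):
    # Fitness = min(C) + a * mean(C), with C the diagonal of the confusion matrix.
    confusion = np.asarray(confusion, dtype=float)
    rates = confusion / confusion.sum(axis=1, keepdims=True)  # per-state classification rates
    C = np.diag(rates)                                        # correct-classification rate per state
    return C.min() + a * C.mean()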
5.2 Genetic Algorithm Parameters
A steady-state GA [5] with a single population of 150 individuals was evolved for 30 generations using a crossover rate of 80 percent and a replacement rate of 90 percent. A small percentage (3 percent) of the numerical parameters were mutated each generation using Gaussian mutation with a standard deviation equal to 10 percent of the parameter's value. Mutation was not used to alter the tree topology. The GA parameters were chosen based on success in past applications [23], and the population size was chosen based on available computational resources. The GA was run on a cluster of workstations that includes 32 Pentium III class processors running at 500 MHz, with a single optimization run requiring approximately eight hours. While the population size is small for this application, it has proven capable of producing good results.
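A sketch of the numerical mutation step, under the stated settings (3 percent mutation rate, standard deviation equal to 10 percent of the parameter's value), is given below. The clipping back into the grammar's (-1, 1) range and the function name are assumptions of the sketch.

import numpy as np

def mutate_numbers(values, rate=0.03, rel_sigma=0.10, rng=None):
    # Gaussian mutation of the numeric genes only; tree topology is untouched.
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float).copy()
    mask = rng.random(values.shape) < rate
    values[mask] += rng.normal(0.0, np.abs(values[mask]) * rel_sigma)
    return np.clip(values, -1.0, 1.0)   # keep within the grammar's (-1, 1) range (an assumption)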
5.3 Classification Performance
Each of the four states was characterized by approximately 50 backscattered waveforms, with the waveforms corresponding to a 1-degree angular sampling rate (variable target-sensor orientation). As discussed above, the GA employs noisy data for design, with white Gaussian noise (WGN) added to the noise-free transient scattered waveforms. Example scattered waveforms from each of the four states are shown in Fig. 2.
Fig. 3. Performance of the classifiers, shown as the worst single-state classification rate vs. the noise level. The results "energy 1" and "energy 2" correspond to designs as in [4], using one and two parameters of p, respectively, to fit the data.
The different characteristics of these waveforms underscore the variability of the scattered waveforms with aspect. Moreover, we note that the scattered-field energy is also state dependent, making it difficult to define a composite signal-to-noise ratio (SNR) for all states. In particular, while the zero-mean WGN has a fixed standard deviation, the scattered signal strength is state dependent. Consequently, all results below are quantified in terms of the noise standard deviation, rather than SNR. The noise standard deviation employed in the GA design was $\sigma = 0.0185$. As a reference, the average SNRs with $\sigma = 0.0185$ for states 1 through 4 are 20.04 dB, 17.05 dB, 21.17 dB, and 24.65 dB, respectively, where SNR is calculated as $\mathrm{SNR} = \sum_{i=0}^{n} y(i)^2 / \sigma^2$, with $y(i)$ the samples of a waveform. Using noisy data within the GA cost function implicitly forces the biorthogonal wavelets to be robust to noise, without explicitly enforcing smoothness constraints [4].

The performance of the GA-designed wavelets is compared to that of three alternative wavelet designs. Each of these alternatives uses four levels ($L = 4$) of detail coefficients, p and u filters of length four, and VQ state-dependent statistical models employing a 30-element codebook. In the first design, no attempt is made to adapt the wavelet to the data, and the identifiers for all states use the CDF biorthogonal wavelet [6]. The other two sets of biorthogonal wavelets employ the design procedure developed by Claypoole et al. [4]. This method also employs lifting, but the design criterion is minimization of the error in the prediction filter p. Here, the procedure in [4] is applied to the scattered waveforms in each state, from which state-dependent wavelets are derived. Moreover, as employed in the GA and as in [4], distinct wavelets are designed at each wavelet level (scale). Following the work in [4], we considered wavelets designed when one and two parameters of p are dedicated to matching the data, while the remaining p and all of the u parameters are dedicated to smoothness constraints. The classification-based cost function is too complicated to be solved as in [4], necessitating the GA.

In Fig. 3, we plot the worst-state classification rate (from the four states considered), with state-dependent VQ classifiers based on CDF wavelets, the two classes of wavelets designed using the technique in [4], and the GA-designed wavelets. In all cases, the state-dependent VQ classifiers employed feature vectors composed of the aforementioned wavelet
moments. Results are plotted as a function of the added-noise standard deviation $\sigma$. Note that the GA-designed wavelets were designed for $\sigma = 0.0185$, while their performance is tested over a relatively wide range of noise standard deviations. The wavelets were designed by the respective methods discussed above. After deriving the wavelet filters, the results in Fig. 3 were computed based on a subsequent training of the state-dependent VQ classifiers using 16 noise realizations and testing on 16 distinct noise realizations. The error bars indicate the standard deviation of these computations. The GA clearly outperforms the other three designs. We also considered applying the method of [4], which is based on minimizing the error in the wavelet representation, with more components of p and u dedicated to signal matching rather than smoothness. For these cases, we saw no improvement in classification performance. As might be expected, this indicates that the goal of designing wavelets to improve classification is best realized if the design cost function is directly linked to classification, as it is for the GA.

It is also of interest to examine the spectral characteristics of the GA-derived wavelets vis-à-vis the more traditional wavelet designs. In Fig. 4, we plot the spectral characteristics of the detail and coarse filters (h and g from Section 2) for the CDF wavelet and for the GA-designed wavelet for a particular state. The GA designs distinct wavelet filters for each wavelet level (scale), and here the comparison is given only for the first level. The GA-designed wavelet is based on the goal of achieving overall improved classification and, therefore, it is difficult to explain the detailed distinction between the CDF and GA-designed wavelets. Nevertheless, the significantly improved classification performance in Fig. 3 is apparently accrued by the design of wavelets that are markedly different from traditional (CDF) wavelets.
6 CONCLUSION
We have employed language-based genetic algorithms (GAs) for the design of biorthogonal wavelets within the context of a lifting paradigm [3], demonstrating that the GA-designed wavelets, which are designed with a classification-based fitness function, significantly outperform wavelets based on more traditional design constructs. These results underscore the importance of
Fig. 4. Coarse and detail filter responses for the first level of a typical signal-adapted wavelet, compared to those of a CDF wavelet.
explicitly imposing the desired objective in the wavelet design. While this is expected, in practice complicated cost functions are generally difficult to optimize. We have presented a genetic design construct that makes such wavelet design relatively straightforward. Such a procedure could be applied to other wavelet cost functions of interest, such as entropy minimization.
REFERENCES

[1] A.H. Tewfik, D. Sinha, and P. Jorgensen, "On the Optimal Choice of a Wavelet for Signal Representation," IEEE Trans. Information Theory, vol. 38, pp. 747-765, Mar. 1992.
[2] M. Oslick, I.R. Linscott, S. Maslakovic, and J.D. Twicken, "A General Approach to the Generation of Biorthogonal Bases of Compactly-Supported Wavelets," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP), pp. 1537-1540, 1998.
[3] W. Sweldens, "The Lifting Scheme: A Custom-Design Construction of Biorthogonal Wavelets," J. Applied and Computational Harmonic Analysis, vol. 3, pp. 186-200, 1996.
[4] R.L. Claypoole, R.G. Baraniuk, and R.D. Nowak, "Adaptive Wavelet Transforms via Lifting," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP), 1998.
[5] D.E. Goldberg, Genetic Algorithms. New York: Addison-Wesley, 1989.
[6] A. Cohen, I. Daubechies, and J. Feauveau, "Biorthogonal Bases of Compactly Supported Wavelets," Comm. Pure and Applied Math., vol. 45, pp. 485-560, 1992.
[7] S.G. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1998.
[8] W. Sweldens, "The Lifting Scheme: A Custom-Design Construction of Biorthogonal Wavelets," J. Applied and Computational Harmonic Analysis, vol. 3, pp. 186-200, 1996.
[9] J. Chen and A. Kundu, "Rotation and Gray Scale Transform Invariant Texture Identification Using Wavelet Decomposition and Hidden Markov Model," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 2, pp. 208-214, Feb. 1994.
[10] Y. Linde, A. Buzo, and R.M. Gray, "An Algorithm for Vector Quantizer Design," IEEE Trans. Comm., vol. 28, pp. 84-95, Jan. 1980.
[11] P.R. Runkle, P.K. Bharadwaj, L. Couchman, and L. Carin, "Hidden Markov Models for Multiaspect Target Classification," IEEE Trans. Signal Processing, vol. 47, pp. 2035-2040, July 1999.
[12] S.E. Levinson, "Continuously Variable Duration Hidden Markov Models for Automatic Speech Recognition," Computer Speech and Language, vol. 1, pp. 29-45, Mar. 1986.
[13] J. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, Mass.: MIT Press, 1992.
[14] J. Koza, Genetic Programming II: Automatic Discovery of Reusable Programs. Cambridge, Mass.: MIT Press, 1994.
[15] D.H. Kil and F.B. Shin, Pattern Recognition and Prediction with Applications to Signal Characterization. Woodbury, N.Y.: Am. Inst. of Physics, 1996.
[16] P. Naur, "Revised Report on the Algorithmic Language ALGOL 60," Comm. ACM, vol. 6, no. 1, pp. 1-17, 1963.
[17] M. O'Neill and C. Ryan, "Under the Hood of Grammatical Evolution," Proc. Genetic and Evolutionary Computation Conf., W. Banzhaf, J. Daida, A.E. Eiben, M.H. Garzon, V. Honavar, M. Jakiela, and R.E. Smith, eds., pp. 1143-1148, 1999.
[18] R. Poli and W.B. Langdon, "Schema Theory for Genetic Programming with One-Point Crossover and Point Mutation," Evolutionary Computation, vol. 6, no. 3, pp. 231-252, 1998.
[19] P.J. Angeline, "An Investigation into the Sensitivity of Genetic Programming to the Frequency of Leaf Selection During Subtree Crossover," Proc. First Ann. Conf. Genetic Programming, J.R. Koza, D.E. Goldberg, D.B. Fogel, and R.L. Riolo, eds., pp. 21-29, July 1996.
[20] P.J. Angeline, "Subtree Crossover: Building Block Engine or Macromutation?" Proc. Second Ann. Conf. Genetic Programming, J.R. Koza, K. Deb, M. Dorigo, D.B. Fogel, M. Garzon, H. Iba, and R.L. Riolo, eds., pp. 9-17, July 1997.
[21] T. Bäck, F. Hoffmeister, and H. Schwefel, "A Survey of Evolution Strategies," Proc. Fourth Int'l Conf. Genetic Algorithms, pp. 2-9, July 1991.
[22] P. Runkle, L. Carin, L. Couchman, J.A. Bucaro, and T.J. Yoder, "Multiaspect Identification of Submerged Elastic Targets via Wave-Based Matching Pursuits and Hidden Markov Models," J. Acoustical Soc. Am., pp. 605-616, Aug. 1999.
[23] E.A. Jones, "Genetic Design of Antennas and Electronic Circuits," PhD dissertation, Duke Univ., 1999.