APPROXIMATE KALMAN FILTERING
APPROXIMATIONS AND DECOMPOSITIONS Editor-in-Chief: CHARLES K. CHUI
Vol. 1: Wavelets: An Elementary Treatment in Theory and Applications Tom H. Koornwinder, ed. Vol. 2: Approximate Kalman Filtering Guanrong Chen, ed.
Series in Approximations and Decompositions - Vol. 2
APPROXIMATE KALMAN FILTERING edited by
Guanrong Chen University of Houston
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 9128 USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661 UK office: 73 Lynton Mead, Totteridge, London N20 8DH
Library of Congress Cataloging-in-Publication Data Approximate Kalman filtering / edited by Guanrong Chen. p. cm. — (Series in approximations and decompositions ; vol. 2) Includes index. ISBN 981-02-1359-X 1. Kalman filtering. 2. Approximation theory. I. Chen, Guanrong. II. Series. QA402.3.A67 1994 003'.76'0115-dc20 93-23176 CIP
Copyright © 1993 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher. For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 27 Congress Street, Salem, MA 01970, USA.
Printed in Singapore by Utopia Press.
Approximations and Decompositions

During the past decade, Approximation Theory has reached out to encompass the approximation-theoretic and computational aspects of several exciting areas in applied mathematics such as wavelets, fractals, neural networks, and computer-aided geometric design, as well as modern mathematical developments in science and technology. The objective of this book series is to capture this exciting development in the form of monographs, lecture notes, reprint volumes, textbooks, edited review volumes, and conference proceedings. Approximate Kalman Filtering, the second volume of this series, represents one of the engineering aspects of Approximation Theory. This is an important subject devoted to the study of efficient algorithms for solving many real-world problems where the classical Kalman filter does not directly apply. The series editor would like to congratulate Professor Guanrong Chen for his excellent job in editing this volume and is grateful to the authors for their fine contributions.
World Scientific Series in APPROXIMATIONS AND DECOMPOSITIONS Editor-in-Chief: CHARLES K. CHUI Texas A&M University, College Station, Texas
Preface

Kalman Filtering: from "exact" to "approximate" filters

As it has for the last three decades, the term Kalman filter today evokes favorable responses and applause from engineers, scientists, and mathematicians, researchers and practitioners alike. The history of the development of the Kalman filter, or more precisely, the Kalman filtering algorithm, has been fairly long. The fundamental concept of least-squares for signal estimation was introduced by Gauss at the age of eighteen in 1795, first published by Legendre in his book Nouvelles methodes pour la determination des orbites des cometes in 1806, and later also appeared in Gauss' book Theoria Motus Corporum Coelestium in 1809. No significant improvement was achieved in the following hundred years, not until 1912, when R. A. Fisher published the celebrated maximum likelihood method, which had been anticipated, but unfortunately also rejected, by Gauss himself much earlier. A little later, Kolmogorov in 1941 and Wiener in 1942 independently developed the fundamental theory of linear minimum mean-square estimation. All of this, together with strong motivation from astronomical studies and the stimulus of computational mathematics, provided the necessary background for the subsequent development of the Kalman filtering algorithm, a milestone of modern systems theory and technology.

The Kalman filter, mainly attributed to R. E. Kalman (1960), may be considered in very general terms as an efficient computational algorithm for the discrete-time linear least-squares estimation method of Gauss, Kolmogorov, and Wiener; it was extended to the continuous-time setting by Kalman himself, and to greater generality by Bucy, about a year later. To briefly describe the discrete-time Kalman filtering algorithm, we consider a stochastic state-space system of the form
x_{k+1} = A_k x_k + ξ_k,
v_k = C_k x_k + η_k,     k = 0, 1, ⋯,
where {x_k} is the sequence of state vectors of the system, with an initial state vector x_0, {v_k} the sequence of measurement (or observation) data, and {ξ_k} and {η_k}
two noise sequences, and {A_k} and {C_k} two sequences of time-varying system and measurement matrices. In this linear state-space system, the first and second equations are usually called the dynamic and measurement equations, respectively. The problem is to calculate, for each k = 0, 1, ⋯, the optimal estimate x̂_k of the unknown state vector x_k of the system, using the available measurement data {v_1, v_2, ⋯, v_k}, under the criterion that the estimation error covariance

Cov(x_k − x̂_k) = min

over all possible linear and unbiased estimators which use the aforementioned available measurement data, where linearity is in terms of the data, and unbiasedness is in the sense that E{x̂_k} = E{x_k}. It turns out that optimal solutions exist under, for example, the following additional conditions:
(1) the initial state vector is Gaussian, with mean E{x_0} and covariance Cov(x_0) both given; and
(2) the two noise sequences {ξ_k} and {η_k} are stationary, mutually independent Gaussian, and mutually independent of x_0, with known covariances Cov(ξ_k) = Q_k and Cov(η_k) = R_k, respectively.
In addition, the optimal solutions x̂_k can be calculated recursively by
x̂_0 = E{x_0},
x̂_k = A_{k−1} x̂_{k−1} + G_k (v_k − C_k A_{k−1} x̂_{k−1}),     k = 1, 2, ⋯,

where the G_k are the Kalman gains, successively calculated by

P_{0,0} = Cov(x_0),
P_{k,k−1} = A_{k−1} P_{k−1,k−1} A_{k−1}^T + Q_{k−1},
G_k = P_{k,k−1} C_k^T (C_k P_{k,k−1} C_k^T + R_k)^{−1},
P_{k,k} = (I − G_k C_k) P_{k,k−1},     k = 1, 2, ⋯,

in which P_{k,k−1} is the prediction error covariance Cov(x_k − A_{k−1} x̂_{k−1}). The entire set of these formulas comprises what we call the Kalman filter, or the Kalman filtering algorithm. The algorithm can be derived via several different methods. For more information about this recursive estimation algorithm, for example its detailed derivations, statistical and geometrical interpretations, relations with other estimation techniques, and broad range of applications, the reader is referred to the texts listed in the section entitled Further Reading at the end of the book.

A few remarkable features of the above recursive computational scheme can easily be observed. First, starting with the initial estimate x̂_0 = E{x_0},
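To make the recursion concrete, here is a minimal sketch of one step of the algorithm (an illustration for this preface, not code from any of the articles; the variable names simply mirror A_k, C_k, Q_k, R_k above):

```python
import numpy as np

def kalman_step(x_prev, P_prev, v_k, A, C, Q, R):
    """One step of the Kalman recursion: predict with the dynamic
    equation, then correct with the current measurement v_k."""
    # Prediction error covariance P_{k,k-1}
    P_pred = A @ P_prev @ A.T + Q
    # Kalman gain G_k (the single matrix inversion of the scheme)
    G = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    # State estimate update: only the previous estimate and v_k are needed
    x_pred = A @ x_prev
    x_new = x_pred + G @ (v_k - C @ x_pred)
    # Estimation error covariance P_{k,k}
    P_new = (np.eye(len(x_prev)) - G @ C) @ P_pred
    return x_new, P_new
```

Starting from x̂_0 = E{x_0} and P_{0,0} = Cov(x_0), the function is called once per incoming measurement v_k, which is exactly the step-by-step structure discussed below.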
each optimal estimate x̂_k, obtained in the subsequent calculations, requires only the previous (one-step-behind) estimate and the single current datum v_k. The essential advantage of this simple step-by-step structure is that there is no need to keep all the old state estimates and measurement data for each updated state estimate, which saves a great deal of computer memory and CPU time, especially in real-time (on-line) applications to very large scale (high-dimensional) systems with intensive measurement data. Second, all the recursive formulas of the algorithm are straightforward and linear, consisting of only matrix multiplications and additions, and a single matrix inversion in the calculation of the Kalman gain. This special structure makes the scheme feasible for parallel implementation using advanced computer architectures, such as systolic arrays, to further speed up the computation. Moreover, the Kalman filtering algorithm is optimal over all possible linear estimators under the aforementioned conditions, and gives unbiased estimates of the unknown state vectors, although this characteristic is not unique to the Kalman filter.

The Kalman filter is ideal in an ideal world. This is merely to say that the Kalman filter is "perfect" if the real world is ideal: offering linear mathematical models for describing physical phenomena, providing accurate initial conditions for the model established, guaranteeing that the exact models and their accurate parameters are not disturbed or changed throughout the process, and producing Gaussian white noise (if there should be any) with complete information about its means and variances, etc. Unfortunately, nothing is ideal in the real world, and this often makes the ideal Kalman filter impractical. As a result, various modified versions of the standard Kalman filter, called approximate Kalman filters, become undoubtedly necessary.
To be more specific, recall that the standard (ideal) Kalman filter requires the following conditions: the dynamic and measurement equations of the system both have to be linear; all the system parameters (matrices) must be given exactly and fixed without any perturbation (uncertainty); the mean E{x_0} and covariance Cov(x_0) of the Gaussian initial state vector need to be specified; and the two noise sequences, {ξ_k} and {η_k}, are both stationary, mutually independent, Gaussian, and mutually independent of x_0, with known covariances Cov(ξ_k) = Q_k and Cov(η_k) = R_k, respectively. If any of these conditions is not fulfilled, the Kalman filtering algorithm is not efficient: it will not give optimal, often not even satisfactory, estimation results. In most applications, the following questions are raised:
(1) "What if the state-space system is nonlinear?"
(2) "What if the initial conditions are unknown, or only partially given?"
(3) "What if the noise statistics (means and/or covariances) are unknown, or only partially given, or changing?"
(4) "What if the noise sequences are not Gaussian?"
(5) "What if the system parameters (matrices) involve uncertainties?"
(6) "What if ⋯ ?"
The objective of this book is to help answer at least some of these questions. It is appropriate to remark, however, that in a tutorial volume of very modest size we neither intend (nor would ever be able) to cover too many interesting topics, nor can the reader expect to gain very deep insight into the issues that we have chosen to discuss. The motivation for the authors of the chapters to present their overview and commentary articles in this book is basically to promote more effort and endeavor devoted to the stimulating and promising research direction of approximate Kalman filtering theories and their real-time applications. We would like to mention that a good complementary volume is the collection of research papers on the standard (ideal) Kalman filter and its applications, entitled Kalman Filtering: Theory and Application, edited by H. W. Sorenson and published by IEEE Press in 1985.

The first topic in this tutorial volume is the extended Kalman filter. As has been widely experienced, a mathematical model describing a physical system can rarely be linear, so the standard Kalman filtering algorithm cannot be applied directly to yield optimal estimates. In the first article, Bullock and Moorman give an introduction to an extension of the linear Kalman filtering theory for estimating the unknown states of a nonlinear system, possibly forced by additive white noise, using measurement data which are the values of certain nonlinear functions of the state vectors corrupted by additive Gaussian white noise. In the second part of their article, some possible choices for linearization are discussed, leading to the standard and ideal extended Kalman filters, and some modified extended Kalman filters are also described. Then, in the third part of the article, Moorman and Bullock study how to use the a priori state estimate sequence to perform the linearization, and analyze the bias that occurs in this modified extended Kalman filter.
The second issue we are concerned with in this book is the initialization of the Kalman filter. If the initial conditions, namely the mean E{x_0} and covariance Cov(x_0) of the initial state vector, are unknown or only partially given, the Kalman filtering algorithm cannot even start operating. Catlin offers an introduction to the classical Fisher estimation scheme, used to initialize the Kalman filter when prior information is completely unknown, and then extends it to the more general case where the measurements may be ill-conditioned, making no assumption on the invertibility of any matrix involved in the estimation process. In the joint article of Gomez and Maravall, several successful approaches to initializing the Kalman filter with incompletely specified initial conditions, which work well even for nonstationary time series, are reviewed. In particular, they describe a simple solution, based on a trivial extension of the Kalman filter, to the problem of optimal estimation, forecasting, and interpolation for a general class of linear systems.

Next, adaptive Kalman filters are discussed. Basically, adaptive Kalman filters are those modified Kalman filters that can adapt either to unknown noise statistics or to changing system parameters (or changing noise statistics). Under the assumption that all the noises are Gaussian, although with unknown statistics, several efficient methods for providing adaptation capability to the Kalman filter are reviewed in the article of Moghaddamjoo and Kirlin. The adaptation of the filters to unknown deterministic inputs is also discussed. A technique enabling the Kalman filter to adapt to changing noise statistics is then described in some detail, yielding a stable and robust estimation process even in certain irregular environments. In Wojcik's article, problems in the design and application of adaptive Kalman filters for on-line estimation of signal and noise statistics, including noise reduction and the best possible signal restoration, are discussed. An overview of practical issues in solving these problems is given for both stationary and nonstationary linear systems, and several adaptive filtering schemes are compared in terms of speed and efficiency.

Almost all the noise sequences in practical systems are non-Gaussian, but many of them can be approximated well by a finite sum of Gaussian noises, called a Gaussian sum for short, with different means and covariances. For this Gaussian sum case, which is non-Gaussian overall, fairly rigorous mathematical analysis can be carried out to yield optimal or suboptimal Kalman filtering algorithms. In the article of Wu and Chen, this topic is investigated in some detail. A historical overview, describing several representative approaches, is first given. Then, under different assumptions on the Gaussian sums of the dynamic and/or measurement noise sequences, several optimal (modified and generalized) Kalman filtering schemes obtained under the standard minimum mean-square error (MMSE) estimation criterion are presented.

To handle modeling errors and uncertainties, set-valued models are often preferred. This issue is considered by Morrell and Stirling for systems with Gaussian noise, and also by Hong for the case where the noise sequences are non-Gaussian.
The set-valued Kalman filter is reviewed in the article of Morrell and Stirling. The set-valued Kalman filtering theory is developed under the Gaussian noise assumption, and generalizes the standard point-valued Kalman filtering algorithm to the case where the noise sequences can have Gaussian densities defined by means and covariances lying in a prescribed convex set of Gaussian density functions. The non-Gaussian case with set models is studied in Hong's article, where the only information needed is the sets, with confidence values, from which the modeling and measurement errors and the initial conditions are obtained. This approach can be considered a generalization of some existing approaches based on the commonly used "unknown-but-bounded" assumption on the noise.

Robust stability of the Kalman filter under parametric and noise uncertainties is analyzed by Chen and Peng. A realistic sufficient condition under which the Kalman filter works satisfactorily, with robust stability, in the presence of both parametric and noise uncertainties is derived. This gives the Kalman filter user a guideline for guaranteeing the stability of the estimation process under imperfect modeling conditions.

The final article of the book, written by Kerr, is devoted to an investigation of
several numerical approximation issues in the implementation of the Kalman filter. As has been observed in practice, incorrect results at the output of a Kalman filter simulation or hardware implementation can be blamed on faulty approximations applied in the implementation, faulty coding/computer programming, or incorrect theoretical details, etc. A thorough discussion of how these problems are handled is presented in the article. The author provides his own unique approach to handling these problems. The techniques espoused therein are considered to be universal, are independent of the constructs of particular computer languages, and have been used in cross-checking Kalman filter implementations in several diverse commercial and military applications.

If the survey articles presented in this book can serve as an overview of the state-of-the-art research on Kalman filters that are not "exact" but only "approximate" under irregular environments; if the reader can benefit from some new ideas, techniques, and insights in the individual articles of the book; and if new research in this challenging, yet promising, area can be further stimulated and motivated, then the goal of the present authors, who have contributed their best efforts to make the publication of this book possible, will be achieved.

In the preparation of this book, the editor has received enthusiastic assistance from several individuals. First, the editor would like to express his gratitude to Charles Chui, the series editor, for his continuous encouragement and support. Second, he is very grateful to Margaret Chui for her assistance in the editorial work. In addition, the editor would like to thank William Bell, George Siouris, Harold Sorenson, and John Woods for their interest in this project. He would also like to thank his wife, Helen Qiyun Xian, for her understanding and support.
Finally, the editor would like to acknowledge the financial support of the President's Research and Scholarship Fund and the Institute of Space Systems Operations grants at the University of Houston, and of the McDonnell Douglas Space Systems Company research grants.
Houston, Spring 1993
Guanrong Chen
Contents

Preface ... vii

I. Extended Kalman Filtering for Nonlinear Systems

Extended Kalman Filters 1: Continuous and Discrete Linearizations
T. E. Bullock and M. J. Moorman ... 3

Extended Kalman Filters 2: Standard, Modified and Ideal
T. E. Bullock and M. J. Moorman ... 9

Extended Kalman Filters 3: A Mathematical Analysis of Bias
M. J. Moorman and T. E. Bullock ... 15

II. Initialization of Kalman Filtering

Fisher Initialization in the Presence of Ill-Conditioned Measurements
D. Catlin ... 23

Initializing the Kalman Filter with Incompletely Specified Initial Conditions
V. Gomez and A. Maravall ... 39

III. Adaptive Kalman Filtering in Irregular Environments

Robust Adaptive Kalman Filtering
A. R. Moghaddamjoo and R. L. Kirlin ... 65

On-line Estimation of Signal and Noise Parameters and the Adaptive Kalman Filtering
P. J. Wojcik ... 87

Suboptimal Kalman Filtering for Linear Systems with Non-Gaussian Noise
H. Wu and G. Chen ... 113

IV. Set-valued and Distributed Kalman Filtering

Set-valued Kalman Filtering
D. Morrell and W. C. Stirling ... 139

Distributed Filtering Using Set Models for Systems with Non-Gaussian Noise
L. Hong ... 161

V. Stability Analysis and Numerical Approximation of Kalman Filtering

Robust Stability Analysis of Kalman Filter under Parametric and Noise Uncertainties
B. S. Chen and S. C. Peng ... 179

Numerical Approximations and Other Structural Issues in Practical Implementations of Kalman Filtering
T. H. Kerr ... 193

Further Reading ... 221

Notation ... 223

Subject Index ... 225
Extended Kalman Filters 1: Continuous and Discrete Linearizations T. E. Bullock and M. J. Moorman
Abstract. The use of a linearizing trajectory to apply the linear Kalman Filter equations to nonlinear estimation problems is introduced. Some possible choices for this linearizing trajectory are examined leading to the standard and ideal Extended Kalman Filters. Some other useful modifications to the Extended Kalman Filter are also presented and discussed.
§1 Introduction

This article will introduce the extension of linear Kalman Filter theory to the problem of estimating the trajectory of a nonlinear dynamical system, possibly forced by additive white noise, from measurements of nonlinear functions of the system states corrupted by additive white Gaussian noise. A nominal trajectory is used to linearize the nonlinear functions and obtain perturbation equations to which the standard linear method may be applied. The equations for the linearized filter are presented and the choice of the linearizing trajectory is discussed. This leads to the definition of the standard and ideal extended Kalman filters. Some suggested modifications to the standard filter are discussed and their impact on estimator performance is addressed.

§2 Derivation of the continuous time linearized equations

Consider the nonlinear continuous system with additive white noise disturbance

(d/dt) x(t) = a(x(t), t) + ξ(t),     (1)

Approximate Kalman Filtering, Guanrong Chen (ed.), pp. 3-8. Copyright ©1993 by World Scientific Publishing Co. Inc. All rights of reproduction in any form reserved. ISBN 981-02-1359-X
where

E{ξ(t)} = 0,
E{ξ(t) ξ^T(τ)} = Q(t) δ(t − τ),

with initial condition x_0 = x(t_0), a vector-valued random variable with known mean E{x_0} and covariance Cov(x_0, x_0). Discrete nonlinear measurements are taken at times t_k:

v_k = c(x(t_k), t_k) + η_k,     (2)

where

E{η_k} = 0,
E{η_k η_l^T} = R_k δ_{kl}.

In order to apply linear Kalman filter theory we can linearize the above equations along some "nominal" or "reference" trajectory, x̄(t), which satisfies the dynamical equation

(d/dt) x̄(t) = a(x̄(t), t)     (3)

with initial condition x̄(t_0) = E{x_0}. Although x̄(t) is a known deterministic trajectory here, this will be changed later as we investigate various candidates for x̄(t). In the case of the standard Extended Kalman Filter, x̄(t) is in fact a stochastic process and does not satisfy (3). Subtracting (3) from (1) gives

(d/dt) x(t) − (d/dt) x̄(t) = a(x(t), t) − a(x̄(t), t) + ξ(t).     (4)

Define the deviation of the actual trajectory from the nominal as

x̃(t) = x(t) − x̄(t).     (5)

Expanding a(x(t), t) to first order around the nominal trajectory leads to

a(x(t), t) ≈ a(x̄(t), t) + A(t) x̃(t),     (6)

where A(t) is defined as the Jacobian of a evaluated along the nominal trajectory,

A(t) = ∂a(x, t)/∂x |_{x = x̄(t)}.
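As a concrete illustration of the linearization (6), the matrix A(t) can be computed numerically; the following sketch (not code from the article) forms a finite-difference Jacobian, and the pendulum-like dynamics a below are a made-up example:

```python
import numpy as np

def jacobian(a, x_bar, t, eps=1e-6):
    """Finite-difference Jacobian A(t) = da/dx evaluated at the
    nominal state x_bar, used to linearize the dynamics."""
    n = len(x_bar)
    f0 = a(x_bar, t)
    A = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (a(x_bar + dx, t) - f0) / eps
    return A

# Hypothetical nonlinear dynamics (illustrative only): a pendulum-like system
def a(x, t):
    return np.array([x[1], -np.sin(x[0])])
```

At the nominal point x̄ = (0, 0) this yields the familiar linearized pendulum matrix, with ∂a_1/∂x_2 = 1 and ∂a_2/∂x_1 = −cos(0) = −1.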
V. Gomez and A. Maravall
If S in (8) is singular, de Jong leaves the diffuse log-likelihood undefined. In order to define the limiting expressions of Theorem 3 when S is singular, we have to consider model (9) with an R matrix that is not of full column rank. Let K be a selector matrix formed by zeros and ones such that K S K^T has rank equal to rank(R), and replace model (9) by

v = R K^T δ_1 + ε,     (10)

where ε ~ N(c, σ²C), with C nonsingular, and δ_1 is the vector formed by choosing those components in δ corresponding to the selected columns of R K^T. This amounts to making the assumption that the other components in δ cannot be estimated from the data without further information and are assigned the value zero with probability one. The next theorem generalizes the results of Theorem 2 to the case of a possibly singular S matrix.

Theorem 4. Suppose model (10) holds, with the convention that if R is of full column rank, then the matrix K is the identity matrix and δ_1 = δ. Then, with the notation and assumptions of Theorem 3, letting C → ∞, we have

λ(v) + h ln|C| → −(1/2){ln|σ²Σ| + ln|K R^T Σ^{−1} R K^T| + (v − R δ̂)^T Σ^{−1} (v − R δ̂)/σ²},

where

δ̂ = (R^T Σ^{−1} R)^− R^T Σ^{−1} v,     Mse(δ̂) = σ² (R^T Σ^{−1} R)^−,

and

δ̂ = (R^T Σ^{−1} R)^{−1} R^T Σ^{−1} (v − S β̂).

Minimizing this diffuse log-likelihood with respect to β yields an estimator β̂ which minimizes

(v − S β)^T P^T Σ^{−1} P (v − S β),     where     P = I − R (R^T Σ^{−1} R)^{−1} R^T Σ^{−1}.

It can be shown that the estimators δ̂ and β̂ obtained in this way can be obtained in a single stage as the GLS estimator γ̂ = (δ̂^T, β̂^T)^T of model (15). Thus, the EKF or the DKF can be used to compute the GLS estimator.

The optimum Q, R and G can be obtained as follows:

1) Obtain P^- C^T by solving equation (20) for ℓ = 1, ⋯, n, where n is the dimension of the state vector.
P^- C^T = (D^T D)^{−1} D^T [ Γ_1 + C A G_s Γ_0
                             Γ_2 + C A G_s Γ_1 + C A² G_s Γ_0
                             ⋮
                             Γ_n + C A G_s Γ_{n−1} + ⋯ + C A^n G_s Γ_0 ],     (23)

where

D^T = [A^T C^T, ⋯, (A^T)^n C^T].

2) Calculate R̂ by using Γ_0 and P^- C^T:

R̂ = Γ_0 − C [P^- C^T].     (24)
3) Denote by P_0^- and P_0 the error covariance matrices associated with the optimum gain G. Then, under the steady-state condition,

P_0^- = A (I − G C) P_0^- A^T + Q.     (25)

It can be shown that

P^- = A [(I − G_s C) P^- − P^- C^T G_s^T + G_s (C P^- C^T + R) G_s^T] A^T + Q.     (26)
Subtracting (26) from (25) and using equations (8) and (9) yields

A (P_0 − P_k) A^T = A [A (P_0 − P_k) A^T − G C P_0^- + G_s C P^- + P^- C^T G_s^T − G_s (C P^- C^T + R) G_s^T] A^T.     (27)

Let δP^- = A (P_0 − P_k) A^T. The optimum gain G is then given by

G = (P^- C^T + δP^- C^T)(Γ_0 + C δP^- C^T)^{−1}.     (28)

Substituting (28) into (27) yields

δP^- = A [δP^- − (P^- C^T + δP^- C^T)(Γ_0 + C δP^- C^T)^{−1} (C P^- + C δP^-) + G_s C P^- + P^- C^T G_s^T − G_s Γ_0 G_s^T] A^T.     (29)

The optimum Kalman gain G can then be obtained by solving equation (29) for δP^-, using P^- C^T from (23), and then in one step from (28). In practice, if a batch of observations v_j can be stored, the above calculations may be repeated to improve the estimates of the desired parameters [2]. The residual sequence ν_j will become
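As an illustration of how the residual autocorrelations enter these formulas, the following sketch (not code from the article; the batch estimator and array shapes are assumptions) computes sample autocorrelations Γ̂_i from stored residuals and forms R̂ as in (24), taking P^- C^T as already obtained from (23):

```python
import numpy as np

def residual_autocorr(nu, i):
    """Sample autocorrelation Gamma_i of a batch of residuals
    nu (shape N x m), averaged over the available lag pairs."""
    N = nu.shape[0]
    return sum(np.outer(nu[j], nu[j - i]) for j in range(i, N)) / (N - i)

def estimate_R(nu, C, P_pred_CT):
    """Equation (24): R_hat = Gamma_0 - C [P^- C^T]."""
    Gamma0 = residual_autocorr(nu, 0)
    return Gamma0 - C @ P_pred_CT
```

With each iteration over a stored batch, the residuals whiten and these sample estimates of Γ_0, Γ_1, ⋯ (and hence of R and G) improve, as described in the text.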
A. Moghaddamjoo and R. Kirlin
more and more white with each iteration, resulting in better and better estimates of the autocorrelations Γ_0, Γ_1, ⋯, and, therefore, of R and G. The estimate of Q can be made with

A P_0 A^T + Q = G Γ_0 (C^T)^+,     (30)

Q̂ = G Γ_0 (C^T)^+ − A (I − G C) G Γ_0 (C^T)^+ A^T,     (31)

where the superscript "+" denotes the pseudoinverse. The preceding algorithm can be used on-line by computing

Γ̂_i = (1/N) Σ_{j=1}^{N} ν_j ν_{j−i}^T

for on-line estimation of the autocorrelations. The efficiency of the correlation methods can be improved by using higher-order correlations. The correlation technique is good for stationary time-invariant systems where calculation time is not critical and running the filter with a suboptimal Kalman gain for a long record of data is not dangerous. However, this method cannot be used for nonstationary systems or for systems which involve unknown deterministic inputs for a short period of time. Using the stochastic approximation method in conjunction with the correlation method gives a slight improvement in the estimates without requiring much extra computation and without requiring further storage of past data.

2.4 Covariance-matching techniques

With this method the residuals are made consistent with their theoretical covariances. For example, consider the residual sequence ν_j, which has the theoretical covariance C(A P_{j−1} A^T + Q) C^T + R. If the actual covariance of ν_j has variance elements much larger than their theoretical values obtained from the Kalman filter, then the estimate of Q should be increased. This has the effect of bringing the actual covariance of ν_j closer to its theoretical value. The actual covariance of ν_j is approximated by its sample covariance, viz., M^{−1} Σ_{i=1}^{M} ν_{j−i} ν_{j−i}^T, where M is chosen empirically to give some statistical smoothing. An equation for Q is obtained by setting

C (A P_{j−1} A^T + Q) C^T + R = Ê{ν_j ν_j^T},     (33)

or

C Q C^T = Ê{ν_j ν_j^T} − C A P_{j−1} A^T C^T − R.     (34)

Equation (34) does not give a unique solution for Q if C is of rank less than n. However, if the number of unknowns in Q is restricted, a unique solution can be obtained. Notice, however, that equation (34) is only approximate, since P_{j−1} does not represent the actual error covariance when the true values of Q and R are unknown. Because of this approximation, the convergence of the covariance-matching technique is doubtful.
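A minimal numeric sketch of the covariance-matching idea follows (illustrative only; the moving window, the restriction Q = qI, and the trace-matching step are assumptions made here precisely because (34) generally has no unique solution):

```python
import numpy as np

def match_Q(residuals, A, C, P_prev, R):
    """Solve C Q C^T = sample_cov - C A P_{j-1} A^T C^T - R
    for a scalar q with Q = q I, by matching traces."""
    M = len(residuals)
    # Sample covariance of the last M residuals (the moving window)
    sample_cov = sum(np.outer(r, r) for r in residuals) / M
    lhs = sample_cov - C @ A @ P_prev @ A.T @ C.T - R
    # With Q = q I, C Q C^T = q C C^T; match the traces of both sides
    q = np.trace(lhs) / np.trace(C @ C.T)
    return max(q, 0.0)  # a covariance cannot be negative
```

If the observed residuals are larger than their theoretical covariance predicts, q (and hence Q) is increased, which is exactly the matching behavior described above.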
Robust Adaptive Kalman Filtering
2.5 Special filters

Some of the important special filters in the above-mentioned categories are summarized here.

a) Jazwinski adaptive filter [6]: This filter applies the correlation method in a special way to introduce feedback from the residuals, in order to compensate for modeling error and hence prevent divergence of the Kalman filter. The method leads to the estimator

diag Q̂_{k,N} = { 0,              if every element of ê ≤ 0,
                 diag Q_{k,N},   otherwise,                     (35)

subject to the rule that, if (q̂_jj)_{k,N} < 0, then (q̂_jj)_{k,N} is set to 0, where N is the number of points used in the estimation process,

diag Q_{k,N} = (L^T L)^{−1} L^T ê,     (36)

and

ê^T = [ν²_{k+1} − E{ν²_{k+1} | Q = 0}, ⋯, ν²_{k+N} − E{ν²_{k+N} | Q = 0}].     (37)

In equations (36) and (37), we have

L = [a_{ℓj}]_{(N×n)},     a_{ℓj} = [(C_{k+ℓ} Φ_{k+ℓ,k+1})_j]²,     (38)

where Φ_{ij} is the transition matrix (from step j to step i) and

E{ν²_{k+ℓ} | Q = 0} = C_{k+ℓ} Φ_{k+ℓ,k} P_k Φ^T_{k+ℓ,k} C^T_{k+ℓ} + R_{k+ℓ}.     (39)

One of the important limitations of this approach is that it responds to the measurement noise η_k; if R_{k+ℓ} ≫ C_{k+ℓ} Q C^T_{k+ℓ}, the best performance in terms of the absolute size of the residuals will not be realized. Knowledge of R_k is also essential in this algorithm.

b) Belanger adaptive procedure for estimation of noise statistics [1]: In this procedure a correlation method is also applied to the filter residuals. The more general time-varying case can be handled by this method. In this case it is shown that, if the covariance matrices are linear in a set of parameters, then the correlation function of the filter's residuals is also a linear function of this set of parameters. This fact is used to perform a weighted least-squares fit to a sequence of measured correlation products.
In this formulation it is assumed that Q and R are linear functions of J components of a vector α, i.e.,

Q = Σ_{i=1}^{J} Q_i α_i,     R = Σ_{i=1}^{J} R_i α_i.

If the components of Γ_i are positive for i > 0, then proper correction requires an increase in those components of G which correspond to the positive components of Γ_i. This can be achieved by an increase in the corresponding components of Q; similarly, negative components in Γ_i, i > 0, will be corrected by a decrease in those components. In other words, if Γ_i is positive for i > 0, Q should be increased and, as a result, Γ_i will be decreased (Γ_i approaches zero). On the other hand, if Γ_i is negative for i > 0, Q should be decreased and, as a result, Γ_i will be increased (Γ_i approaches zero). Thus, the effects of Γ_i on Q are direct, and for Γ_k = [0], k > 0, it is necessary to keep Q unchanged. The relationship between Q and Γ_i cannot be derived analytically, but several different empirical relationships may be proposed; for example,

Q_{k+1} = Q_k exp[Λ (Γ_1 + Γ_2 + ⋯ + Γ_l) Γ_0^{−1} T^T],     (88)

where the matrix Q is (n × n), the matrices Γ_i are (m × m), and Λ and T are coefficient matrices which can be found experimentally, with dimensions (n × m). The number of autocorrelations used depends on: 1) the maximum number of data points necessary to yield acceptable estimates of the Γ_i, and 2) the maximum number of permissible data points in the moving window. In this procedure, following our findings in [7], we suggest that small-sample-size estimates of the correlations (such as rank correlations) be used.

Although the stability formulation of this method is, due to its complexity, not readily accessible, it can be easily understood by the following arguments. In this method, Q is the only unknown parameter which controls the Kalman gain, because R and u are independently estimated and are not associated with any feedback loop in the filtering process. Let us assume that, due to some disturbances (i.e., unknown sudden changes in R and/or u), the Kalman gain becomes less than its optimum value. The resultant residual sequence will then have positive autocorrelation, by equation (87). Detection of the positive autocorrelation demands an increase in Q, which will, in turn, increase the Kalman gain (G changes in the direction of its optimum value). On the other hand, if the Kalman gain is larger than its optimum value, the residual sequence will develop a negative autocorrelation, which requires a decrease in Q. Reduction of Q will then decrease the Kalman gain (G again changes in the direction of its optimum value). This correction continues until G reaches its optimum value, about which it oscillates. Therefore, the deviation of G from its optimum value, due to any disturbances, not only is controlled by Q, but also will be reduced in time. This behavior is what we refer to as a negative feedback, which has a stabilizing role in the overall performance of the algorithm.
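The feedback rule (88) can be sketched in scalar form as follows (an illustration; the scalar coefficient lam stands in for the coefficient matrices Λ and T, which must be tuned experimentally):

```python
import numpy as np

def adapt_Q(Q, gammas, gamma0, lam=0.1):
    """Scalar analogue of (88): inflate Q when the residual
    autocorrelations Gamma_1, ..., Gamma_l are positive, deflate
    it when they are negative, and leave Q unchanged when they vanish."""
    return Q * np.exp(lam * sum(gammas) / gamma0)
```

Positive autocorrelation in the residuals thus raises Q and, through the gain, drives the autocorrelation back toward zero; this is the negative feedback described above.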
The overall algorithm is examined in [10] by simulating the double integrator system. The results verify the superiority of this method over the conventional adaptive filtering algorithm in [13] operating under the same conditions.
A. Moghaddamjoo and R. Kirlin
4.4 Discussion

The conventional method, which utilizes state estimates to estimate both the unknown inputs and the noise covariances, is potentially unstable and produces suboptimal state estimates in both the transient and steady-state conditions. Another drawback of that procedure is its sensitivity to outliers in the measurement errors arising from a possibly invalid Gaussian noise assumption. This problem may be resolved, at considerable computational cost, by the robust adaptive method. The instability problem arises from the dependence of the estimates of the unknown parameters on the state estimates and from a positive feedback in the process of estimating Q. The difficulty has been overcome in [10] by a curve-fitting procedure for estimation of the input u_k and the measurement noise covariance R, and by using stochastic approximation for estimation of the process noise covariance Q.

§5 Conclusion

In this work several different approaches to adaptive Kalman filtering are reviewed. Adaptation is assumed to be with respect to unknown noise statistics and/or unknown deterministic inputs. In the case where only the noise statistics are unknown, four approaches (Bayesian, ML, correlation, and covariance matching) are reviewed. These methods are based on the assumption that the noise is stationary. Some of them (Bayesian and ML) are computationally too involved to be used in real-time processing. The others (correlation and covariance matching) must be run suboptimally for a long period of time until optimum estimates of the unknown parameters can be obtained.

In the case where only the deterministic inputs are unknown, two approaches are reviewed. In the first, the deterministic input term is assumed to be zero and the process noise covariance is increased; in this case the filter operates suboptimally. In the second, the inputs are estimated in the same way as the states of the system, using the method of state augmentation.
This method is optimum only for linear systems when the input state model is completely known. The augmentation method is computationally more costly. In the case where both the noise statistics and the deterministic inputs are unknown, three approaches, all sequentially adaptive, are studied. The first, referred to as the linear method, has a potential for instability and can be applied only for a short period of time. The second, a robust version of the linear method, also has a potential for instability, but it can be used over a longer period of time. These two approaches are both suboptimal. The third, referred to as the optimal sequential adaptive algorithm, is stable mainly because of the way it estimates the process noise covariance. This method yields optimum estimates of the unknown inputs and the measurement noise covariance. In this approach the estimates of the unknown parameters (except the process noise covariance) are calculated independently and optimally. This method can be used for a long period of time without any degradation.
Robust Adaptive Kalman Filtering

References
1. Belanger, P. R., Estimation of noise covariance matrices for a linear time-varying stochastic process, Automatica 10 (1974), 267-275.
2. Carew, B. and P. R. Belanger, Identification of optimum filter steady-state gain for systems with unknown noise covariances, IEEE Trans. on Auto. Contr. 18 (1973), 582-587.
3. Chang, C. B. and K. P. Dunn, Kalman filter compensation for a special class of systems, IEEE Trans. on Aero. Electr. Sys. 13 (1977), 700-706.
4. David, H. A., Order Statistics, John Wiley & Sons, New York, 1981.
5. Hudson, D. J., Fitting segmented curves whose joint points have to be estimated, J. of Amer. Statist. Assoc. 61 (1966), 1097-1129.
6. Jazwinski, A. H., Adaptive filtering, Automatica 5 (1969), 475-485.
7. Kirlin, R. L. and A. Moghaddamjoo, Robust adaptive Kalman filtering for systems with unknown step input and non-Gaussian measurement errors, IEEE Trans. on Acous. Spee. Sign. Proc. 34 (1986), 252-263.
8. Mehra, R. K., On the identification of variances and adaptive Kalman filtering, IEEE Trans. on Auto. Contr. 15 (1970), 175-184.
9. Mehra, R. K., Approaches to adaptive filtering, IEEE Trans. on Auto. Contr. 17 (1972), 693-698.
10. Moghaddamjoo, A. and R. L. Kirlin, Robust adaptive Kalman filtering with unknown inputs, IEEE Trans. on Acous. Spee. Sign. Proc. 37 (1989), 1166-1175.
11. Montgomery, D. C. and E. A. Peck, Introduction to Linear Regression Analysis, John Wiley & Sons, New York, 1982.
12. Mosteller, F. and J. W. Tukey, Data Analysis and Regression, Addison-Wesley, Reading, MA, 1977.
13. Myers, K. A. and B. D. Tapley, Adaptive sequential estimation with unknown noise statistics, IEEE Trans. on Auto. Contr. 21 (1976), 520-523.

A. Reza Moghaddamjoo
Department of Electrical Engineering and Computer Science
University of Wisconsin-Milwaukee
P.O. Box 784
Milwaukee, WI 53201
[email protected] R. Lynn Kirlin Department of Electrical Engineering University of Victoria Victoria, B. C. Canada V8W 2Y2
[email protected] On-line Estimation of Signal and N o i s e P a r a m e t e r s and the Adaptive Kalman Filtering
Piotr J. Wojcik
Abstract. The paper discusses problems in design and application of on-line Kalman filtering techniques for noise reduction and the best possible signal restoration, and gives an overview of practical approaches to solving these problems. The discussion is limited to linear systems with known structures and includes both stationary and nonstationary cases. First, the formulation of the optimum Kalman filtering problem is given. This includes all a priori information about the signal and noise which is necessary for the design process of an optimum Kalman filter. However, such full information about a system model is usually unavailable. In this situation the adaptive Kalman filtering process can be used to solve the problem of incomplete information. As is known, the adaptive Kalman filtering process consists of the following functional components: estimating unknown parameters of the signal model, updating estimates within the Kalman filter structure, and Kalman filtering itself. The paper gives an overview of unknown parameter estimation techniques as well as methodologies for on-line updating parameters within the Kalman filter structure. These methodologies and algorithms are assessed through computer simulations. The assessment is based on the overall performance of the adaptive filtering process. The performance of the filter is measured directly through an improvement coefficient, which indicates how many times the adaptive Kalman filter reduces the observation error. Such a measure enables a designer to determine whether the application of the adaptive Kalman filter will reduce observation errors. Finally, the paper discusses rates at which the adaptive Kalman filter can follow changes of the system characteristics, and proposes modifications to the adaptation scheme, which increase the filter applicability to nonstationary signals.
§1 Introduction

In control and data acquisition systems, when signals from sensors are processed, the problem of measurement noise always arises. In the past, simple analog filters, with characteristics chosen by the designer on the basis of a priori information

Approximate Kalman Filtering, Guanrong Chen (ed.), pp. 87-111. Copyright ©1993 by World Scientific Publishing Co. Inc. All rights of reproduction in any form reserved. ISBN 981-02-1359-X
about the signal, were used. It was unreasonable to utilize an expensive computer for each sensor, or even a group of sensors, to perform high-quality digital filtering; but now, with microprocessors commonly available, inexpensive and powerful optimum digital filtering can be applied to a variety of sensor signals.

Sensor signals are usually stochastic in nature, so statistical approaches to filtering should be applied to these kinds of signals. Such approaches define the best filter as the one whose output, on average, is closest to the correct or useful signal. Two theories of optimal linear filtering have been developed: the Wiener-Kolmogorov theory for coping with stationary signals and the Kalman filter theory [8] for nonstationary signals.

The problem of the design of the Kalman filter for a particular application can be solved in many ways, depending on the a priori information about the processed signal and noise:
(i) When all parameters of the pure signal and noise are known, the optimum Kalman filter can be obtained [1,6,5,3] relatively easily.
(ii) When the parameters of the signal and noise are not known exactly, but the uncertainty ranges of these parameters are relatively small, the low-sensitivity Kalman filter [12] can be designed.
(iii) When the parameters of the pure signal and measurement noise are unknown, but the structure of the system generating the signal and noise is known, several adaptive Kalman filtering algorithms [15,16,2] can be used.

This paper describes adaptive Kalman filter algorithms which can be applied to signals in the presence of either white or coloured measurement noise. All parameters of the signal and noise are assumed to be unknown. The adaptive Kalman filter algorithms have been tested through computer simulations, and their results are presented and discussed.
§2 Kalman filtering problem and optimum solution in the presence of white noise

2.1 Observed signal model

Consider a stationary multivariable linear discrete system (also called a shaping filter) described by the following state space equations (Figure 1):
Figure 1. Model of the observed signal.
x_{k+1} = A x_k + B ξ_k,    (1)

v_k = C x_k + η_k,    (2)

where x_k is the n x 1 state vector; A the n x n state transition matrix; B the n x q input matrix; v_k the p x 1 measurement vector; and C the p x n output matrix. The sequences ξ (q x 1) and η (p x 1) are uncorrelated Gaussian white noise sequences with means and covariances as follows:

E{ξ_k} = 0,    E{ξ_i ξ_j^T} = Q δ_{ij},
E{η_k} = 0,    E{η_i η_j^T} = R δ_{ij},
E{ξ_i η_j^T} = 0,    for all i, j,
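A minimal simulation of the shaping-filter model (1)-(2) can be sketched as follows. The numerical values of A, B, C, Q, and R are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative model matrices (hypothetical values, not from the text).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # n x n state transition matrix
B = np.array([[0.0], [1.0]])             # n x q input matrix
C = np.array([[1.0, 0.0]])               # p x n output matrix
Q = np.array([[0.01]])                   # process noise covariance (q x q)
R = np.array([[0.1]])                    # measurement noise covariance (p x p)

x = np.zeros(2)
measurements = []
for k in range(100):
    xi = rng.multivariate_normal(np.zeros(1), Q)    # xi_k ~ N(0, Q)
    eta = rng.multivariate_normal(np.zeros(1), R)   # eta_k ~ N(0, R)
    measurements.append(C @ x + eta)                # eq. (2)
    x = A @ x + B @ xi                              # eq. (1)

v_all = np.array(measurements)
print(v_all.shape)  # (100, 1)
```

Sequences generated this way serve as the noisy observations that the Kalman filter below is designed to process.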
where E{·} denotes the expectation and δ_{ij} the Kronecker delta.

Γ_z(j) = C[A(I - GC)]^{j-1} A (M C^T - G Γ_z(0)),    j > 0,    (50)
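The whiteness property discussed in the surrounding text can be checked numerically: with the optimum gain G = M C^T [Γ_z(0)]^{-1}, the factor (M C^T - G Γ_z(0)) vanishes, so Γ_z(j) = 0 for j > 0. A sketch, assuming the standard innovation covariance Γ_z(0) = C M C^T + R and illustrative values of M, C, R, and A:

```python
import numpy as np

# Illustrative values (assumptions, not from the paper).
M = np.array([[2.0, 0.3], [0.3, 1.0]])   # prediction-error covariance
C = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
A = np.array([[1.0, 0.1], [0.0, 1.0]])

gamma0 = C @ M @ C.T + R                 # Gamma_z(0), innovation covariance
G_opt = M @ C.T @ np.linalg.inv(gamma0)  # optimum Kalman gain

# Gamma_z(1) from the j = 1 case of the lagged-correlation relation:
gamma1 = C @ A @ (M @ C.T - G_opt @ gamma0)
assert np.allclose(gamma1, 0.0)          # optimum filter -> white innovations
```

A nonzero Γ_z(1) would indicate a suboptimal gain, which is exactly what the recursive scheme below exploits.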
where M is the covariance of the prediction error in estimating the state and G is the gain matrix of the Kalman filter; neither matrix has to be optimum. Notice that the optimum choice of the Kalman gain matrix according to (7) makes Γ_z(j) vanish for all j ≠ 0, i.e., the innovation sequence for the optimum filter is white. Rewriting (50) explicitly and making appropriate manipulations [13,9], the following recursive relationship for the Kalman gain G can be derived:

                            [ Γ_{z,i-1}(1) ]
G_i = G_{i-1} + Φ_{i-1}^+   [ Γ_{z,i-1}(2) ]   [Γ_{z,i-1}(0)]^{-1},    (51)
                            [     ...      ]
                            [ Γ_{z,i-1}(n) ]

where Φ_i^+ is the pseudoinverse of the matrix Φ_i, defined as

        [ CA                      ]
Φ_i =   [ CA(I - G_i C)A          ]    (52)
        [          ...            ]
        [ C[A(I - G_i C)]^{n-1} A ]
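The gain recursion (51) with the stacked matrix from (52) might be implemented as follows. This is a sketch under the stated definitions, not the authors' code; the `corr` list of innovation correlation matrices is assumed to come from the previous adaptation interval.

```python
import numpy as np

def gain_update(A, C, G, corr, n):
    """One pass of the recursive Kalman-gain update, eqs. (51)-(52).

    corr : list [Gamma(0), Gamma(1), ..., Gamma(n)] of p x p innovation
           correlation matrices estimated in the preceding interval.
    """
    I = np.eye(A.shape[0])
    F = A @ (I - G @ C)
    # Build Phi by stacking C [A(I - GC)]^(j-1) A for j = 1..n, eq. (52).
    rows, Fj = [], I
    for _ in range(n):
        rows.append(C @ Fj @ A)
        Fj = Fj @ F
    Phi = np.vstack(rows)
    # Stack Gamma(1)..Gamma(n) and apply the correction of eq. (51).
    stacked = np.vstack(corr[1:n + 1])
    return G + np.linalg.pinv(Phi) @ stacked @ np.linalg.inv(corr[0])

# If the innovations are already white (Gamma(j) = 0 for j >= 1),
# the correction term vanishes and the gain is left unchanged.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
G0 = np.array([[0.5], [0.1]])
white = [np.array([[1.0]]), np.zeros((1, 1)), np.zeros((1, 1))]
assert np.allclose(gain_update(A, C, G0, white, 2), G0)
```

The fixed-point behavior shown in the usage example is the mechanism behind the convergence claim: nonzero lagged correlations push G toward the gain that whitens the innovations.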
The obtained equation (51) gives the recursive algorithm for the estimation of the Kalman gain matrix G_i for the time interval T_i on the basis of the innovation sequence correlation function (Γ_{z,i-1}) estimated in the preceding time interval T_{i-1}. It can be proved [9] that the above sequence of Kalman gains converges to G_opt.

5.3 Adaptive Kalman filtering
The adaptive Kalman filter utilizes this algorithm to estimate the gain of the Kalman filter over a certain period of time (the adaptation interval), processes the noisy signal with the gain estimated during the previous adaptation interval, and simultaneously estimates the gain for the next consecutive adaptation interval according to the adaptation scheme illustrated in Figure 3. The adaptive Kalman filter gain G_i is computed for each consecutive adaptation step according to the recursive algorithm (51). The adaptive Kalman filtering process of an observed signal in the presence of a coloured noise can be summarized as follows:
(i) Filter the signal using equations (3), (4), and (46) during the current adaptation interval T_i with the Kalman gain G_i which has been estimated during the adaptation step T_{i-1}.
(ii) Estimate simultaneously (within T_i) the correlation function Γ_{z,i} of the innovation sequence z