Lecture Notes in Control and Information Sciences Edited by A.V. Balakrishnan and M.Thoma
10 Jan M. Maciejowski
The Modelling of Systems with Small Observation Sets
Springer-Verlag Berlin Heidelberg New York 1978
Series Editors
A. V. Balakrishnan • M. Thoma

Advisory Board
A. G. J. MacFarlane • H. Kwakernaak • Ya. Z. Tsypkin

Author
Dr. Jan Marian Maciejowski
Maudslay Research Fellow, Pembroke College, Cambridge
also with the Control and Management Systems Group, Cambridge University Engineering Department, Mill Lane, Cambridge CB2 1RX, England
ISBN 3-540-09004-5 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-09004-5 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.
© by Springer-Verlag Berlin Heidelberg 1978
Printed in Germany
SUMMARY
The p r o b l e m systems,
when
is i n t r o d u c e d defined
of a s s e s s i n g
only
of a system,
of a v a i l a b l e algorithm un d e r
A general "information more no
of models,
criteria
information gain
including
and its c o m p u t a t i o n gain
for the The
language about
of m o d e l l i n g ,
to the p r o b l e m
of s y s t e m
account
of the size of the set
A model
is d e f i n e d
observation
to be an set of a s y s t e m
to find
that
in the s e n s e the m o d e l
with
gain.
nonlinear
criterion
dynamical
with
gain
program.
i n s i g n i f i c a n t as the o b s e r v a t i o n
that
models,
is d e m o n s t r a t e d . requires
The c h o i c e
the m o d e l l e r ' s
It is s h o w n
class
The use of i n f o r m a t i o n
of rival m o d e l s
of i n f o r m a t i o n
for a w i d e
stochastic
is s t r a i g h t f o r w a r d .
is a s s o c i a t e d
with
It is p r o v e d
c a n exist,
in general,
its
consistency
is d i s c u s s e d .
algorithm"
as a c o m p u t e r
the system.
of a model,
and its
is a s u i t a b l e
assessment
calculation
be e x p r e s s e d
solution
is proposed,
modelling
Information
information
a characterisation
of the q u a l i t y
it is not possible,
the h i g h e s t
accounts
of a l g o r i t h m i c
the o u t p u t
criterion
conventional
that
is
restrictions.
gain",
"universal
identification
for
observations.
specified
are available,
to a t h e o r y w h i c h
taking
for c o m p u t i n g
of
of
a partial
while
models
f r o m a set of o b s e r v a t i o n s
on to d e v e l o p
constitutes
identification,
System
The c o n c e p t s
are d r a w n
interpreting
of o b s e r v a t i o n s
and discussed.
that b e h a v i o u r .
which
sets
as the p r o g r e s s i o n
the b e h a v i o u r
theory
small
and
this
that
of p r o g r a m m i n g
a priori choice
sets b e c o m e
the m o d e l
beliefs
becomes
large.
A detailed
IV
investigation of
shows
"the s m a l l e s t
program.
t h a t it is p o s s i b l e
language"
A priori
knowledge
t h e r e f o r e be c o n s i d e r e d required
to r u n a p a r t i c u l a r
assumed
about a system can
to be d e f i n e d by the s m a l l e s t
language
to run the m o d e l .
Finally, which
required
to s p e a k p r e c i s e l y
the e f f e c t on m o d e l
system observations
t h a t a "safe"
are c o d e d
c o d i n g exists,
a s s e s s m e n t as w o u l d
a s s e s s m e n t of the m a n n e r is e x a m i n e d .
which often
It is f o u n d
leads to the
the use of m o s t o t h e r c o d i n g s .
in
same
ACKNOWLEDGEMENTS
The idea of examining modelling in the light of algorithmic information theory is due to Professor A.G.J. MacFarlane. His constant encouragement and enthusiasm, as well as detailed criticism, has been an essential ingredient of this work.

I have also benefited from discussions with many members of the Control and Management Systems Group, of whom Dr. F.P. Kelly, Dr. M.B. Beck and Dr. S.R. Watson deserve special mention. The quotation from Newton in the last chapter was pointed out to me by Dr. A.T. Fuller.

Financial support for this research came from the Science Research Council, and in the final stages from Pembroke College.

Roberta Hill has produced her usual excellent standard of typing, but special thanks are due to her for struggling so successfully through chapter 5.

My wife has asked me not to write one of those embarrassing acknowledgements, saying how impossible this research would have been without her constant encouragement and support; consequently I shall leave this to the reader's imagination.
CONTENTS
1. Introduction
2. Survey of Related Work
3. A Characterisation of Modelling
4. Incorporation of A Priori Knowledge
5. Fragments of Programming Languages
6. h-Comparability
7. Table Look-Up Codings
8. Discussion and Conclusion
References
Appendices:
A  Formal Semantics of Programming Languages
B  Syntax of the Algol W-Support of the Gas-Furnace Models
C  Table Look-Ups for the Gas-Furnace Models
Diagrams
1. INTRODUCTION

1.1 Motivation
The areas in which the scientific method has been demonstrably and spectacularly successful are characterised by the possibility of performing experiments, or making observations, more or less freely whenever these are deemed desirable. The result of this has been that explicit consideration of the size of the set of observations from which a model is hypothesised, and to which a model is fitted, has been neglected. Any doubts which arise about the model can be resolved by further experimentation and observation. This pleasant property increasingly disappears as one enters the domains of complex industrial processes, environmental control systems, management systems, and socio-economic systems. The work described here aims to clarify the relationship between the smallness of the available observation sets for such systems and the degree of usefulness of the models obtained for them.
Until recently, the class of models which could be used in scientific investigations was restricted by a very practical consideration. The behaviour of the model had to be understood, and that understanding could only be obtained from the theory of the model. The model was constrained to be sufficiently simple for theoretical investigation to be possible.

The availability of the computer has changed this situation radically. It is now possible to investigate the behaviour of a model by simulation, with hardly any theoretical understanding of it. Consequently this constraint on the complexity of useful models has been removed, or at least greatly relaxed. It is now possible to postulate a complicated model structure, to observe its simulated behaviour, and to adjust the details of the model until its simulated behaviour resembles the behaviour of the system being
investigated. When
is such
understanding be used the
of h o w
some
light
investigate
say how
to how
models
good
the
an i s o l a t e d
Why should
the details ability
model
a simulation
above not be u s e f u l
system behaviour,
indicate
the q u a l i t y
any can it
in this
A further
with
thesis aim
is
of rival m o d e l
connected
in
system
of the thesis
to d i s t i n g u i s h
assessment,
between
the a b i l i t y
to
is.
model
or r e l i a b l e ?
observed
When
of the same
Most
is i n t i m a t e l y
it give
reported
on t h e s e q u e s t i o n s .
how r i v a l m o d e l s
that
does
the s y s t e m w i l l b e h a v e
of the w o r k
concerned with
b u t it is clear
When
really works?
s h o u l d be assessed.
ostensibly
competing
guide
The p u r p o s e
is to t h r o w
behaviour
useful?
the s y s t e m
as a r e l i a b l e
future?
is to
a model
of the type d e s c r i b e d If it r e p r o d u c e s
is that not s u f f i c i e n t
of the m o d e l ?
In fact,
the
evidence
is it not
to
clear
that
the b e t t e r
the b e t t e r
the
the m o d e l ?
is the p o s s i b i l i t y complexity checked
against
the
time.
clear
the only
is no m o r e value.
agrees w i t h model
of some v a r i a b l e
no o t h e r
that v a l u e s
the v a l u e
in some
taken,
model,
then
It n e v e r
observations, prediction
also
confidence
amounts
assessment
say,
w o u l d be
than
confidence
increases
little
(which does
in
value
but
predictions, measurements of the
very quickly. after
doubt not
in the
to say
the p r e d i c t i o n s
of course,
have
is
any o t h e r
If further
in the m o d e l
third
it is
of c o n f i d e n c e
are b e t t e r
agree w i t h
correct
then
It is now p o s s i b l e
guesses.
one w o u l d
at some
It
is taken w h i c h
of the model,
to certainty,
it.
The p r e d i c t e d sense)
by the m o d e l
than m e r e
that
time of the v a r i a b l e
is nil.
increases.
and these
about
of the two o b s e r v a t i o n s ,
the p r e d i c t i o n
sense,
its
at two d i f f e r e n t
of the v a r i a b l e
with
reasonable
predicted
since
Suppose
information
if a third m e a s u r e m e n t
immediately
reason
and it is b e i n g
example.
(in an i n t u i t i v e
However,
The b a s i c
the model,
simple
of the m o d e l
likely
behaviour,
set of data.
are t a k e n
on the b a s i s
that
is no.
unconstrained,
following
to p r e d i c t
the p r e d i c t i o n
that
imply
only
ten
the next
that it
be). The
model
answer
If a linear v a r i a t i o n
proposed,
of the o b s e r v e d
"overfitting"
and that w e have
is d e s i r e d
would
of
a small
two m e a s u r e m e n t s
are
Our
is r e l a t i v e l y
Consider
times,
reproduction
confidence
clearly
which
depends
one is w i l l i n g
on the d i f f e r e n c e
to ascribe between
to this
the n u m b e r
of o b s e r v a t i o n s observations
required
the a v a i l a b l e then we have situation
of a r b i t r a r y
number
it "explains"
no
to c o n s t r u c t
that
by s a y i n g
then w e
if the n u m b e r
about
it fit the o b s e r v a t i o n s , i s
of o b s e r v a t i o n s ,
the model, This
that
have been m a d e
of
If all of
in its p r e d i c t i o n s .
also be d e s c r i b e d decisions
the model.
are used
confidence
to m a k e
and the n u m b e r
to c o n s t r u c t
observations
can
in o r d e r
which
the m o d e l ,
the same
have no c o n f i d e n c e
as~the
in the
model. This
point was made succinctly by Poincaré, when he dismissed Jeans' classical explanation of the ultraviolet catastrophe and the specific heat of solids (1):

"It is obvious that by giving suitable dimensions to the communicating tubes between his reservoirs and giving suitable values to the leaks, Jeans can account for any experimental results whatever. But this is not the role of physical theories. They should not introduce as many arbitrary constants as there are phenomena to be explained; they should establish connections between different experimental facts, and above all they should allow predictions to be made."
On the o t h e r hand, reproduces If o n l y increase
a slight
have
been
of p h e n o m e n a "
r e q u i r e d for m o d e l the
complexity
accuracy
behaviour
increase
in accuracy,
constants" "number
the o b s e r v e d
the
is c l e a r l y
in c o m p l e x i t y
then
in some
added
sense
to it than
which
assessment
of a m o d e l
with which
and its
significant.
results fewer
in a large "arbitrary
the a d d i t i o n a l
it now explains. is some
the m o d e l
What
"trade-off"
accuracy.
is
between
A prerequisite
for this a wide
is a m e a s u r e
class
appears
casting
of m o d e l s
in such
of fit of m o d e l
behaviour
to the o b s e r v e d
is the
as a c o m p o n e n t is thus
a suitable
of m o d e l
achieved
assessment
qrthodox would
be
f r o m a small
approach
a form,
in
that behaviour
The r e q u i r e d
model
class,
ment problem
as a s t a t i s t i c a l has
of the
complexity
in
indeed been
follow
of m o d e l s
some
statistical
to f o r m u l a t e
decision
the a s s e s s -
problem.
investigated,
such
of m o d e l
assessment
then be p o s s i b l e
type e n c o u n t e r e d
We do not
the
and to p o s t u l a t e
It m a y
of a p p r o a c h
to the p r o b l e m
to e x a m i n e
framework.
(5).
introduced
complexity.
by a s s e s s i n g
to
manner.
A more
models
is a p p l i c a b l e
innovation
trade-off
chosen
of models.
which
A major
this w o r k poorness
of c o m p l e x i t y
even
in c o n t r o l
an a p p r o a c h
This
type
for d y n a m i c a l
studies for the
(2)(3)(4) following
reasons. Any m e t h o d w i l l be
arrived
appropriate
(such as l i n e a r
Such
compared
investigated market,
for a n a r r o w
(statistical)
corrupted
a method will
are b e i n g
only
difference-equation
set in a p a r t i c u l a r "observations
at from s t a t i s t i c a l
by w h i t e ,
n o t be u s e f u l - for e x a m p l e ,
is the b e h a v i o u r
it may be d e s i r e d
Forrester's
"Industrial
class
of m o d e l s
models,
for e x a m p l e ) ,
environment Gaussian,
(such
firms
models
being in some
a model based
techniques
noise").
different
if the s y s t e m
to c o m p a r e
as
additive
if two very
of c o m p e t i n g
Dynamics"
considerations
on
(6) w i t h
a model
which
uses
market's
game
theory
firms'
elements. usually
simulation
When
the p r o b a b i l i t y Furthermore, economic
difficult
when
under
conditions.
few o b s e r v a t i o n s
and there
is little
the
statistical
specification
may
i t s e l f be very u n c e r t a i n .
by n o t a s s u m i n g conclusions These fruitful
it to be known;
considerations
to i n v e s t i g a t e
by a p a i n s t a k i n g
three
of r e l e v a n t are
about
it,
environment little
is lost
misleading
indicate
that
by m a k i n g
the g e n e r a l
and d i f f i c u l t
it may be m o r e of m o d e l s
of complex,
as few a s s u m p t i o n s situation,
analysis
rather
as
than
of each m o d e l
as it arises.
Overview
We
case
in fact,
the a s s e s s m e n t
systems
and e x a m i n i n g
structure,
knowledge
these,
may be avoided.
understood
possible
behaviours
of a s y s t e m
of the s y s t e m ' s In this
(8).
and s o c i o -
stationariness
a priori
of
When modelling
processes. available,
it is
and i m p o r t a n t
to assume
Finally,when
nonlinear
variables
environmental
it may not be a p p r o p r i a t e
1.2
and the
the e v o l u t i o n
of r e l e v a n t
interesting
transient
contain
also d y n a m i c a l ,
to d e s c r i b e
investigating the m o s t
often
are
distributions
systems,
occur
models
such m o d e l s
extremely
poorly
actions
responses.
Realistic
often
(7) to e x p l a i n
develop
of A p p r o a c h
and Results.
a characterisation
"components":
the s y s t e m
of m o d e l l i n g
to be m o d e l l e d ,
which
has
a model
of
this system, The
and a c r i t e r i o n
system
pair of sets
of q u a l i t y
to be m o d e l l e d
of o b s e r v a t i o n s are
and accuracy,
observation
Each
each
therefore
discrete-time that this
of d a t a detail
does n o t
time,
of this
become
of such
reflects
evident
a system
to
the r e a l i t i e s
be d e f i n e d
in m o r e
which
implies
compute
a reversed
time
obtained.
exercise
interest,
ordering.
functions
defined
These
It only
are u s e l e s s
in a n e w s i t u a t i o n exercise),
as a r e f e r e n c e ,
with
will
of a p a r t i c u l a r
subsets
to a d m i t
be of m u c h
of the m o d e l l i n g
to m o d e l s
which
is b r o a d e n o u g h
system may behave
serve
the o u t p u t
a lack of any
of the m o d e l l i n g
Any r e s t r i c t i o n
onto
not n o r m a l l y
observations
the goal
by s p e c i f y i n g
The
is any a l g o r i t h m w h i c h maps
or even
h o w the
type w i l l
the success
It m e r e l y
interpretation
algorithms
on the p a r t i c u l a r
(presumably
finite.
it w i l l
the m o d e l s
definition
would
such as those w h o s e
for d e d u c i n g
resolution
a set of d i s c r e t e - s t a t e ,
of the o b s e r v a t i o n s
which
allows
limited
to be r a t i o n a l .
to be
However,
system
This
of
and output.
is a s s u m e d
A system will
of the
observations.
direction
by a
1.3.
subsets
algorithms
like
category.
in sec.
input
is a s s u m e d
constrain
collection.
A model certain
looks
to be d e f i n e d
obtained with
measurements.
be of the same
also
always
set of o b s e r v a t i o n s
system
is t a ke n
of its
Since m e a s u r e m e n t s
of the model.
but models
respect
to w h i c h
be assessed. type
of the o b s e r v a t i o n s
is a c c o m p l i s h e d lie
in the
domain
of the a l g o r i t h m ,
observations
are
deterministic
successive
successive
outputs,
the W i e n e r
- Kolmogorov
blocks
of i n p u t
elements
to be the c o r r e s p o n d i n g
For example, n e e d o n l y map
and w h i c h
images.
difference
blocks
whereas
of the o u t p u t
of input
stochastic
or K a l m a n
and p a s t o u t p u t
equation
models
observations
predicting
types m u s t map
observations
to
models
of
successive
to s u c c e s s i v e
outputs. The
term
program".
Thus
the o u t p u t specified
"algorithm" we
think
observations, subsets
may be
interpreted
of m o d e l s and these
as p r o g r a m s programs
of the o b s e r v a t i o n s
task.
This
it w e r e
not
for the p o w e r
of C h u r c h ' s
states
that
any p r o c e d u r e
which
notion
of an " a l g o r i t h m "
equivalent hence
viewpoint
the m o d e l
some p r o g r a m m i n g taken
to be the
the n u m b e r the p r o g r a m which
have
is w r i t t e n
shortness
is a m e a s u r e
the o u t p u t
criterion
program
in
of q u a l i t y
is
as m e a s u r e d
with which
the o b s e r v a t i o n s
The
length
of a r b i t r a r y
to the p r o g r a m m i n g Furthermore,
observations were
and
program.
of the n u m b e r
to c o m p u t e
(9), w h i c h
of a l g o r i t h m s ,
in the program.
the model.
if
in any one of the
of that p r o g r a m ,
(relative
in this
the i n t u i t i v e
as a c o m p u t e r
the
them
arbitrary,
Thesis
theory
as a c o m p u t e r
of c h a r a c t e r s
in c o n s t r u c t i n g
of the
lanaguage,
been m~e
be e x c e s s i v e l y
satisfies
for c o m p u t i n g
may use the
to help
can be e x p r e s s e d
formalisations
can be e x p r e s s e d When
would
as " c o m p u t e r
originally
of
decisions
language)
a model
exactly
by
is r e q u i r e d
(to the a c c u r a c y made).
In order to do this, the model must generate internally those terms which would conventionally be thought of as "fitting errors". Since the programming language has a finite number of terminals, the length of the model increases when these terms increase. The criterion of quality thus incorporates a particular trade-off between complexity and approximation.
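The trade-off just described can be illustrated by a minimal sketch. Everything in it is an assumption made for illustration only: the approximate model (output equals twice the input), the function names, and the use of the number of decimal characters needed to write the "fitting errors" as a crude stand-in for the extra program length. It is not the measure defined in the text.

```python
def approximate_model(u):
    """Hypothetical approximate model: each output is twice the input."""
    return [2 * x for x in u]


def build_exact_model(u, y):
    """Return a model that reproduces y exactly, plus its stored corrections.

    The corrections play the role of the "fitting errors" that the exact
    model has to generate internally.
    """
    corrections = [obs - pred for obs, pred in zip(y, approximate_model(u))]

    def exact_model(inputs):
        # approximate behaviour plus stored corrections gives the exact outputs
        return [p + c for p, c in zip(approximate_model(inputs), corrections)]

    return exact_model, corrections


def correction_cost(corrections):
    """Crude length proxy (an assumption): characters needed to write the corrections."""
    return sum(len(str(c)) for c in corrections)


u = [1, 2, 3, 4]
well_fitted = [2, 4, 7, 8]        # close to the approximate model
poorly_fitted = [15, -40, 120, 7]  # far from the approximate model

model_a, corr_a = build_exact_model(u, well_fitted)
model_b, corr_b = build_exact_model(u, poorly_fitted)
```

Both models reproduce their observations exactly, but the one whose approximate part fits poorly needs a longer list of corrections, which is the sense in which larger fitting errors lengthen the model.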
The above characterisation of modelling is explained in more detail in Chapter 3. Support for it is given in section 2.2. The essence of this support is that the length of the shortest program required to compute a sequence displays properties analogous to the properties of the entropy associated with a probability space. In particular, a long sequence, which requires a maximally long program to compute it, passes every effective test for randomness (asymptotically, with probability 1). This suggests that the amount by which it is possible to "compress" the program (model) required to compute a set of observations (system) represents the amount of information which it has been possible to extract from the observations. If the only model which has been found is one that merely reads out the observations from a look-up table, then no "compression" has been achieved, and such a model conveys no information about the observations.

A consequence of our characterisation is that no algorithm can exist for finding the best model (according to the above criterion of quality) of an arbitrary system.
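The idea of "compression" as extracted information can be made tangible with a rough sketch. The shortest program for a sequence cannot be computed in general, so the sketch below substitutes a general-purpose compressor (zlib) as a stand-in; this substitution, the function names and the two example sequences are all assumptions for illustration, and the resulting number is only a crude proxy for the quantity discussed in the text.

```python
import random
import zlib


def compressed_length(observations: bytes) -> int:
    """Length in bytes of the zlib-compressed observations (a proxy, not a shortest program)."""
    return len(zlib.compress(observations, 9))


def information_gain_proxy(observations: bytes) -> int:
    """Bytes saved relative to storing the observations verbatim, as a look-up table would."""
    return len(observations) - compressed_length(observations)


# A highly regular observation sequence: the pattern 0..9 repeated.
regular = bytes(i % 10 for i in range(2000))

# An irregular sequence drawn from a seeded pseudo-random generator.
rng = random.Random(0)
irregular = bytes(rng.randrange(256) for _ in range(2000))

gain_regular = information_gain_proxy(regular)
gain_irregular = information_gain_proxy(irregular)
```

Under these assumptions the regular sequence yields a large saving, while the irregular one yields essentially none, mirroring the remark that a pure table look-up conveys no information about the observations.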
The choice of programming language to be used, for assessing the quality of a model, can be viewed as the specification of "what is to be taken for granted". It should therefore be made in the light of the modeller's a priori knowledge about the system, and of the purposes of the modelling exercise. In Chapter 4 this connection is examined more closely. It is shown that, if the observation sets are large enough, then the results of model assessment are independent of the choice of programming language. This can be interpreted to mean that the modeller's a priori beliefs become less significant as the set of observations available to him grows.

Nevertheless, the assessment of models of small observation sets is dependent on the modeller's specification of his a priori beliefs. Consequently such an assessment cannot be taken to be definitive. However, this is mitigated by the fact that the modeller does not need to choose between mutually exclusive sets of a priori beliefs: he can stipulate programming languages which imply a greater or smaller state of knowledge.

Several different models, even when written in the same language, will rarely use exactly the same features of that language. It is therefore questionable whether a comparison of their lengths gives a measure of their complexity relative to the same set of assumptions. Chapters 5 and 6 resolve this difficulty. Chapter 5 develops a formal equivalent of "a program makes use of such-and-such facilities of a language". A prerequisite for this is a formal method of defining the semantics of programming languages. One such method is outlined in Appendix A. Chapter 6 then uses the concepts developed in Chapter 5 to specify some conditions under which models may be meaningfully compared. It is demonstrated that model assessment is not much affected if these conditions are not met exactly.
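The remark made earlier in this section, that for large observation sets the assessment becomes independent of the programming language, is usually expressed in algorithmic information theory by the invariance theorem. The statement below is the standard form of that theorem for universal languages; the notation $K_L(x)$ is assumed here for illustration, and whether Chapter 4 states the result in exactly this form is not confirmed by the present text.

```latex
% K_L(x): length of the shortest program in language L that computes x
% (assumed notation; L_1 and L_2 are taken to be universal languages)
\[
  \bigl| K_{L_1}(x) - K_{L_2}(x) \bigr| \;\le\; c_{L_1 L_2}
  \qquad \text{for all } x ,
\]
% where the constant c_{L_1 L_2} depends on the two languages but not on x.
% Consequently, along any family of observation sets with K_{L_2}(x) \to \infty,
\[
  \frac{K_{L_1}(x)}{K_{L_2}(x)} \;\longrightarrow\; 1 .
\]
```

For a small observation set the constant may be comparable with the program lengths themselves, which is one way of seeing why the choice of language matters most in that case.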
The details of the complexity/approximation trade-off, which is inherent in our proposed method of model assessment, depend on the precise manner in which the observations are coded in the programming language. It is convenient to separate this aspect of the selection of a suitable programming language from those aspects considered in Chapter 4; consequently the coding of observations is discussed in Chapter 7. A distinguished minimal coding is shown to exist, and it is argued that this is a natural coding to use for model assessment.

The modelling of one particular system (Box and Jenkins' gas-furnace data (10)) is used as an example throughout. The rival models considered for this system are very simple and in no way represent the range of possibilities discussed in sec. 1.1. Nevertheless, the considerations raised there apply even to these simple models, as will be seen in Chapter 3. It will become apparent that the assessment method proposed in this thesis is immediately applicable to a much larger class of models.
1.3 System Identification, Realisation
Modern notion
developments
of a d y n a m i c a l
experimental with
data
of systems
system
(ii),
the i n f e r e n c e
not y e t o b s e r v e d
conditions,
behaviour,
known
under
theory
emphasise
as an a b s t r a c t
(12),
of s y s t e m
and M o d e l l i n @
(13).
summary
Modelling
behaviour
under
is c o n c e r n e d
by w h i c h
is a c h i e v e d
is the p o s t u l a t i o n
the
system,
which
and
the s e l e c t i o n ,
from t h e s e
candidate
is p r e f e r r e d
on the basis
of some
criterion.
its h e a v y
emphasis
that
modern
discussing
as
However,
observations, upon
Consequently,
a more
than
if a s y s t e m
then as little
we adopt
structures,
and
one
the
of one The
on
to adopt,
less u s e f u l
when
view
of c o m p o n e n t s " . and the
by r e f e r e n c e
abstract
modelling
for
observations,
is to be m o d e l l e d ,
is to be g a u g e d
it, b e f o r e
these
natural
the o l d e r
structures
this
following
to the
structure
has begun,
success
should
be
as possible.
definition:
(1.3.1)
A system observations, U=
with
"an i n t e r c o n n e c t i o n
of the m o d e l l i n g
Definition
with
is t h e r e f o r e
modelling,
of a s y s t e m
(i)
compatible
v i e w of a system,
observations,
imposed
are
but
of p a s t
The m e t h o d
of a b s t r a c t
of
specified
from o b s e r v a t i o n s
conditions.
the
S is d e f i n e d S=
(u I , u 2
to be an o r d e r e d
(U, Y)
, where:
, .
,uM)
and Y=
(Yl
p a i r of
' Y2
'
,YN )
are the input and output observation sets respectively;
$u_i = (u^i_1, u^i_2, \ldots, u^i_{\ell_i})$ and $y_i = (y^i_1, y^i_2, \ldots, y^i_{m_i})$ are ordered sets of observations carried out at time $t_i$, where $t_1, t_2, \ldots, t_N$ is the natural time ordering;
$u^i_j \in \{\text{rationals}\} \cup \{b\}$, where $b$ (blank) denotes a missing observation; similarly for $y^i_j$; and

(ii) with the convention that $(b, b, \ldots, b) = b$: if $u_i = b$ then $\ell_i = 0$; if $y_i = b$ then $m_i = 0$; if $u_i \neq b$ then $u^i_{\ell_i} \neq b$; if $y_i \neq b$ then $y^i_{m_i} \neq b$; and $y_N \neq b$.

Conditions (ii) serve only to ensure that adding on a set of blanks (missing observations) does not create a new system.
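As a minimal sketch of how Definition (1.3.1) might be represented, the code below stores each observation as an exact rational, uses `None` for the blank $b$, and applies the conventions of conditions (ii) as a normalisation step. The names `normalise_block` and `make_system`, and the choice of the empty tuple to stand for an all-blank block, are assumptions introduced for illustration; only the conditions as stated above are enforced.

```python
from fractions import Fraction
from typing import Optional, Sequence, Tuple

Blank = None  # stands for the missing-observation symbol b
Block = Tuple[Optional[Fraction], ...]  # one u_i or y_i


def normalise_block(block: Sequence[Optional[Fraction]]) -> Block:
    """Drop trailing blanks, so that (b, ..., b) collapses to the empty block."""
    items = list(block)
    while items and items[-1] is Blank:
        items.pop()
    return tuple(items)


def make_system(U, Y):
    """Return the ordered pair (U, Y) in the normal form of conditions (ii)."""
    inputs = tuple(normalise_block(u) for u in U)
    outputs = [normalise_block(y) for y in Y]
    while outputs and outputs[-1] == ():  # the last output block must not be blank
        outputs.pop()
    return inputs, tuple(outputs)


# Padding with blanks does not create a new system.
padded = make_system([(Fraction(1, 2), None)], [(Fraction(3),), (None, None)])
plain = make_system([(Fraction(1, 2),)], [(Fraction(3),)])

# An empty input set corresponds to a system of the form (b, Y).
source_only = make_system([], [(Fraction(1),)])
```

In this representation two descriptions that differ only by appended blanks normalise to the same pair, which is the purpose the text gives for conditions (ii).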
For concreteness, we have specified that $u_i$, $y_i$ refer to observations made at time $t_i$, since we are interested primarily in dynamical models. However, this interpretation is not essential. Also, each $u_i$, $y_i$ could be a multidimensional finite array of observations, rather than a one-dimensional array, without affecting later results.

The input observation set is allowed to be empty, in order to admit devices such as noise generators and oscillators, as systems of the form $(b, Y)$. It has been argued that when stating the general problem of system identification, it should not be necessary to distinguish between input and output (14). The two should be lumped together as a "system behaviour", and the task of system identification should
include
the
it seems the
two
separation
essential cases
shown
and
internal
structures
procedure inputs
must
have
sets.
The
f r o m the sets
lead
form
Our
especially field
difference
a system
define
of
cc. c e ~ n ~ d r
however,
interaction have
is s o m e
cbservaLions concise
referred we prefer the
with
with
to above
o f its
the
set of observations
of observation
themselves and
systems
(b, Y). seem odd,
theory.
by
by
In t h i s
a set
of
examining
equations. process.
We We
of a system
hehaviour.
"laws"
- such
"explain"
The as t h e
this
set of equations as
a "system".
reason
the o b s e r v a t i o n
a system
reverse
this
are
assume
because
eD~TircFme~.t - 'I o t h e r w o r d s ,
- which
to regard
control
the e x i s t e n c e
set of
are
for
the
observations
at f i r s t
of these
the
its
that
a
identification
unless
pair
input
its b e h a v i o u r
solutions
of
with
to define
properties
aware
may
any
It is
t h a t U # b)
familiar
and investigate
we
a system,
"system"
equations,
its
Note
different
between
of both
the
between
labelled
very
But
as an o r d e r e d
distinguishes
i t is c o n v e n t i o n a l
the
point.
distinguished.
of
to t h o s e
t h a t ?,e are
the
same model
(b, U) ( p r o v i d i n g
definition
boxes
-
observations.
U a n d Y, w h i c h
o f the
The black
consider
However,
of distinguishing
to h a v e
are
ordering
"output".
can be expected
to the
defined
output
i.
and an earthing
and outputs
that we
and
a means
in Fig.
"sink"
generator
"input"
to have
"source"
signal
of
goal
of because
of modelling
set of equations
interaction. as a " m o d e l " ,
Hence and
The definition of "system" which is proposed above is much cruder than the definitions usually encountered. It is worth stating in full one such definition - that of Kalman, Falb and Arbib (11):

Definition (1.3.2)
A dynamical system (input/output sense) is a composite mathematical concept defined as follows:
(a) (i) There is a given time set $T$, a set of input values $U$, a set of acceptable input functions $\Omega = \{\omega : T \to U\}$, a set of output values $Y$, and a set of output functions $\Gamma = \{\gamma : T \to Y\}$.
(ii) (Direction of time). $T$ is an ordered subset of the reals.
(iii) The input space $\Omega$ satisfies the following conditions:
(1) (Nontriviality). $\Omega$ is nonempty.
(2) (Concatenation of inputs). An input segment $\omega(t_1, t_2)$ is $\omega \in \Omega$ restricted to $(t_1, t_2) \cap T$. If $\omega, \omega' \in \Omega$ and $t_1 < t_2 < t_3$, there is an $\omega'' \in \Omega$ such that $\omega''(t_1, t_2) = \omega(t_1, t_2)$ and $\omega''(t_2, t_3) = \omega'(t_2, t_3)$.
(b) There is given a set $A$ indexing a family of functions $F = \{f_\alpha : T \times \Omega \to Y,\ \alpha \in A\}$; each member of $F$ is written explicitly as $f_\alpha(t, \omega) = y(t)$, which is the output resulting at time $t$ from the input $\omega$ under the experiment $\alpha$. Each $f_\alpha$ is called an input/output function and has the following properties:
(i) (Direction of time). There is a map $\iota : A \to T$ such that $f_\alpha(t, \omega)$ is defined for all $t > \iota(\alpha)$.
(ii) ~(~
Let T,teT
(Causality). ,t) =~
and T O.
models are allowed to approximate system behaviour rather than reproduce it exactly. Definitions (3.3.1) and (3.3.6), however, require models to compute the observed system behaviour exactly. This does not mean that the class of models which we can treat is any smaller than the class of models which are usually of interest. It merely means that whereas a conventional model may reproduce a system behaviour approximately, the corresponding model in our formalism has the additional task of generating the "corrections" which must be applied to the approximate behaviour, in order to produce the exact system behaviour. Fig. 2 shows the correspondence between a type of conventional model commonly encountered in control studies, and a model which satisfies definitions (3.3.1) and (3.3.6).

It will be recalled that theorem (2.2.12) suggests that "random" is equivalent to "can be computed only by using a table look-up". If a model is considered to be a summary of knowledge about a system, then those computations of the model which have to be performed by using a table look-up correspond to those aspects of the system behaviour which are not understood, and cannot be predicted - in fact, those that appear to be random. The role of these computations may be very different from that shown in fig. 2. For example, if they are "corrections", they need not be additive. But more generally, the terms computed by table look-up need not play the role of "corrections". They may, for instance, be parameters, which would conventionally be viewed as "randomly varying".
3.4 Criterion of Quality

The third component of our characterisation of modelling is a criterion of quality of a model.

Let F represent a computing facility, together with a programming language. Let c be an injective function from the integers to the set of strings of terminals in the programming language, which is used to represent the integers in programs. (c therefore is included in the definition of the programming language F. The definition of programming languages is reviewed in Appendix A; more details of c are given in Chapter 7). Let S be an integer system as defined in section 3.2, with input and output observations $u_j^i$, $y_j^i$.

Definition (3.4.1)
The trivial F m o d e l of S is the s h o r t e s t p r o g r a m w h i c h
73 4
is a concrete
(F,E)-model of S, such that each c(y~)
appears in it, where the minimisation all possible sets E
(defined by def.
of length ranges over (3.3.1)).
It is assumed that the length of a program is measured by the number of terminals
appearing in it.
The trivial model of a system is one which computes the output observation table look-up.
set by simply reading
it out from a
It is a model which the modeller has
available right at the beginning of the modelling
exercise,
before he has found any structure or pattern in the system behaviour. For any system S, let the sets Ci,D i be those defined by def.
(3.3.1) ~
One can think of the length of a concrete (F,E)-model of S as the "perceived complexity", relative to F, of the set $(C_1,\ldots,C_m)$, conditional on the set $((1,D_1),\ldots,(m,D_m))$. The greatest lower bound of this "perceived complexity", taken over all concrete (F,E)-models of S, is just the conditional Kolmogorov complexity $K_F((C_1,\ldots,C_m) \mid ((1,D_1),\ldots,(m,D_m)))$. (Although Kolmogorov complexity was developed for binary sequences and binary programs, it can be readily generalised to sequences and programs containing any finite number of symbols). An approximate upper bound for this Kolmogorov complexity is the length of the trivial model of S.

The length of the trivial F model of S is the "perceived complexity" of $(C_1,\ldots,C_m)$ before any structure has been discovered in the system behaviour. If a shorter model of S is found, then its "perceived complexity" will be reduced.
Recalling the analogy between complexity and entropy, it is appealing to measure the "perceived quantity of information" in $((1,D_1),\ldots,(m,D_m))$ about $(C_1,\ldots,C_m)$ as the difference between these two "perceived complexities". Since Kolmogorov complexity is not effectively computable, the only upper bound on this "perceived quantity of information" which is available, in general, is the length of the trivial model. Thus the length of the trivial model is a measure of the amount of information potentially to be conveyed by the modelling exercise.
Definition (3.4.2)

Let p be a concrete (F,E)-model of S, and let t be the trivial F model of S. Then the information gain I(p) of p is the difference

$$I(p) = \ell(t) - \ell(p) \qquad (3.5)$$

where $\ell(\cdot)$ denotes the length of a program.

In section 1.1 a simple example was presented, which suggested that the confidence which one has in a model depends on the difference between the number of observations which the model explains and the number of observations required to construct the model. The information gain is a measure of this difference. If the information gain is zero, then all of the output observations have been used to construct the model; the trivial model is, of course, the prime example of such a model. If the information gain is
close to its upper bound $\ell(t)$, then the model is simple enough to be constructed from only a small part of the output observation set, and the remainder of the output observation set is "explained" by the model. We are claiming, then, that the confidence in a model of a system increases as its information gain increases.

The information gain may be negative, of course. This indicates that the model contains more arbitrary "parameters" than is justified by the amount of information contained in the observation set.

We assume, of course, that all our knowledge about the system is contained in the observation set, and in the definition of the programming language (Chapter 4 deals with the latter aspect). This implies that the confidence we may have in any model of a system is bounded by the size of the trivial model of that system. This accords well with the intuitive notion that if we have only a few observations of a system, we can never have much confidence in some model of it which
may be p o s t u l a t e d . We e m b o d y of w h i c h
Axiom
our c l a i m
a c h o i c e m a y be m a d e
following
between
axiom,
competing
on the basis models.
(3.4.3)
If S is a system, El- m o d e l s
an
and El and E2
of S and Ez- m o d e l s
and q are models, being
in the
(F,E2)
has the h i g h e r model of S.
with p being
-model
of S an
of S, then
information
gain
are sets
such
that
are of interest,
(F,EI)
-model
and p
of S and q
the one of p and q w h i c h
is to be c h o s e n
as the b e t t e r
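The information gain of definition (3.4.2) and the choice rule of axiom (3.4.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: programs are represented as plain strings, the length of a program is taken to be its number of non-blank characters, and the three toy "programs" below are hypothetical and belong to no real language.

```python
import re


def length(program: str) -> int:
    """Assumed length measure: the number of non-blank characters."""
    return len(re.sub(r"\s+", "", program))


def information_gain(model: str, trivial: str) -> int:
    """I(p) = l(t) - l(p), where t is the trivial model."""
    return length(trivial) - length(model)


def better_model(p: str, q: str, trivial: str) -> str:
    """Choice rule: prefer the model with the higher information gain."""
    if information_gain(p, trivial) >= information_gain(q, trivial):
        return p
    return q


# Hypothetical toy programs which all produce twelve ones.
TRIVIAL = "1,1,1,1,1,1,1,1,1,1,1,1"             # pure table look-up
SHORT = "REPEAT 12: 1"                           # exploits the regularity
BLOATED = "IF I<7 THEN 1 ELSE 1+0*I*I*I*I*I"     # longer than the table itself

gain_short = information_gain(SHORT, TRIVIAL)      # positive
gain_bloated = information_gain(BLOATED, TRIVIAL)  # negative
chosen = better_model(SHORT, BLOATED, TRIVIAL)
```

Since both gains are measured against the same trivial model, comparing information gains amounts to comparing program lengths. The negative gain of the third program illustrates a model which contains more arbitrary material than the observations justify.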
This axiom implies that good models are small models. Good models will therefore tend to use the same (short) computational algorithm for as many computations as possible, since the specification of every new algorithm increases the size of the model. Thus the above axiom provides a specific link between the widely-held belief that simplicity (as measured by smallness) is desirable in scientific hypotheses, and the almost universal conviction that the more an observed regularity has been repeated, the more likely it is to recur.

The following theorem is a crucial feature of our characterisation of modelling. As before, $\ell(p)$ denotes the length of program p, measured by the number of terminal characters which appear in it.
Theorem (3.4.4)

There is, in general, no effective procedure for finding an (F,E)-model p of a system S, such that, for any other (F,E)-model q of S, $\ell(p) \le \ell(q)$.
Proof.

Suppose that such an effective procedure exists. Consider the case $E = \{E_1\} = \{(\emptyset, Y)\}$ (where S=(U,Y)). Then we have an effective procedure for finding the shortest program which computes Y, using only the set {1}.

Now suppose that the programming language F has only two terminal characters. Then there exists an effective procedure for finding the shortest binary program which computes Y, using {1}. But Y can be associated uniquely with a binary sequence: Y is a system, and can therefore be associated with its index in some fixed enumeration of systems. This index can be associated with a binary sequence by the bijection introduced in section 2.2.1. Since each of the above steps is effective, there exists an effective procedure for finding the shortest binary program which computes the binary sequence associated with Y, and hence there exists an effective procedure for finding its length, namely the Kolmogorov complexity $K_F(Y \mid 1)$.

Suppose F is optimal. Then, by Church's Thesis, $K(Y \mid 1)$ is partial recursive, which contradicts theorem (2.2.5). This proves the theorem.

Theorem (3.4.4) does not rely on F having only two terminals or on E having the form indicated in the proof. These assumptions are made in order to derive a contradiction to theorem (2.2.5). However, as mentioned earlier, this theorem can be generalised to the case where the sequences considered have an arbitrary finite number of symbols, and to cover the uncomputability of conditional complexity. On the other hand, the theorem does rely on F being optimal. A sufficiently restricted programming language may allow the shortest model to be found, if it exists. However, most systems will not possess any models in such a language (the simplest example is a programming language which always computes the same thing, whatever program it may be given).

Theorem (3.4.4) implies that there is no algorithm for finding the model of a system which has the highest information gain. So, according to our axiom, there is no algorithm for finding the best model of a system. Consequently the modelling exercise cannot proceed according to some "universal modelling algorithm", but must involve a process of nonalgorithmic (creative?) postulation of hypotheses, followed by the assessment of these hypotheses, according to our
axiom.

Note that the result of Meyer (28), which is mentioned in section (2.2.4), implies that if a model is found, there is still no algorithmic procedure for finding the shortest implementation of that particular model.

3.5 Compatibility with Conventional Criteria

Most models can be expected to contain at least one table look-up, since it is most unlikely that every aspect of data which has not been artificially generated can be explained. The size of such a table look-up, measured by the number of characters required to program it, can be regarded as a very general measure of prediction error. The bigger the table look-up, the more of the data have not been explained by the model. It will usually be possible to explain more features of the observed system behaviour - that is, to reduce the size of any table look-ups - only by using more elaborate algorithms in the rest of the model - that is, by increasing the size of the rest of the model. Thus the use of smallness as a criterion of quality of a model leads to a trade-off between the complexity of the model, as measured by the size of that part of the program which would conventionally be regarded as the model (cf fig. 2), and the degree of approximation provided by that part of the model to the observed data. The use of this criterion therefore provides a safeguard against "overfitting" the model to the data.

If the output observation set is large enough, relative to the size of that part of the model which is not a table look-up, then the size of the table look-up(s) will dominate the size of the model. In this case, if two such models are being compared, the use of the proposed quality criterion leads to the selection of the model with the smaller table look-up(s). This corresponds to the conventional preference for small fitting errors, if the number of observations is large enough for the danger of overfitting to be dismissed.

The definition of the programming language used determines the details of the trade-off implicit in the use of the smallness criterion. The manner in which table look-up elements are coded constitutes a part of this definition, and is considered in Chapter 7.

A serious reservation must be made about the use of the proposed criterion. If sufficient a priori knowledge about the system is available, then this may indicate that a model which is not the smallest should be preferred. A typical example is the situation in conventional system identification where a priori knowledge indicates that a particular parametric model structure is appropriate. It may happen that a more elaborate model, when written as a program, has a smaller overall size; nevertheless, the a priori knowledge will prevent that model being chosen as better.
Another example is parameter estimation of a linear dynamical process whose output is corrupted by noise. In this case a straightforward minimisation of the equation error usually leads to biased estimates (Eykhoff, (44)). So if two models are being compared whose table look-ups contain the equation errors, it is possible that the larger one will be preferred on probabilistic grounds. Once again, a priori knowledge (about the noise) is required if the smallness criterion is to be overridden. Furthermore, the smallness criterion could still be used to decide between the larger of these two models and a third model belonging to a different class.
As indicated in section 1.1, the proposed criterion is intended for use in situations where little a priori information is available, or in situations where it is too difficult to use such a priori knowledge for model assessment.
The smallness-of-model criterion leads to the same choice of model as do statistical considerations, for a very important class of system behaviours and models of them. If the system behaviour is a stationary random process with rational spectral density function, then it is known how to predict, at any time, its future behaviour, so as to minimise the mean-square prediction error. The method, due essentially to Wiener and Kolmogorov, is to make the prediction for any future time a suitable linear function of past observations of the behaviour (45), (46). If the observation intervals are equally spaced, and a prediction is being made at each instant of the system behaviour at the next observation instant, then the prediction errors are equal to the random, uncorrelated disturbances which are imagined to be acting on the system.

Suppose it is desired to build a concrete (F,E)-model of the system which will give useful one-step-ahead predictions. Any E can be chosen which allows the model to use previous observations to compute predictions (cf. example (3.3.4)). The model will have to generate terms corresponding to prediction errors by means of a table look-up. If the programming language used codes table look-up terms in such a way that length of code is nondecreasing with the magnitude of the term (cf. Chapter 7), then, for a sufficiently long sequence of observations, the model with smallest (in magnitude) prediction errors will be the smallest (in length) model.
But it is known that, for the system under consideration, the smallest mean square prediction error is obtained by the use of the Wiener-Kolmogorov theory. Furthermore, Sherman (47) has shown that, if the process is Gaussian, then the same linear predictor is obtained if the expectation of any even nondecreasing function of the prediction error is minimised.

So, under these conditions, for a long enough sequence of observations, the "expected best model", judged according to axiom (3.4.3), is the Wiener-Kolmogorov model. The terms appearing in the table look-up of this model constitute a random, uncorrelated sequence. Theorem (2.2.12) therefore suggests that these terms could not be generated by any more efficient algorithm than a table look-up.
3.6 Prediction

If the best model that has been found up to some time is a concrete (F,E)-model p, and it is desired to find the system behaviour under some new (possibly not yet observed) conditions, which can be represented by a block of "virtual observations", $D_{m+1}$, that is, the observations which would be observed if the new conditions obtained, then the model p, and the computer F, can be used to find the "prediction" $F(p, m+1, D_{m+1})$. This provides a means of computing the values of a possible input/output function of the system on elements of its domain which have not been previously observed. According to our axiom, these values are the best "predictions" available to us. We have put the word "prediction" in quotes, because the value $F(p, m+1, D_{m+1})$ need not represent a future value (for example, if the model runs backwards through the observation interval).

It is possible, of course, that the value $F(p, m+1, D_{m+1})$ is not defined. In this case, it may not be possible to use p for predicting system behaviour under the conditions $D_{m+1}$. However, for some models, the value $F(p, m+1, D_{m+1})$ may be undefined simply because p includes the generation of certain parameters by means of a table look-up, and the table does not contain an element which is to be used for the (m+1)th computation. In this case, prediction is still possible if
such an element should be supplied to the model. The problem is, what value should that element take? Our solution is to propose a second axiom, which may be thought of as an extension of the previous one.

Axiom for Prediction

If elements are to be supplied to table look-ups of a model, in order to allow that model to compute a prediction, then the best prediction will be obtained if these elements are chosen so as to minimise the resulting increase in size of the model.

The use of this axiom must be qualified, of course, by the value of the information gain. We can supply an element to a trivial model, thus enabling it to predict; but we have no confidence in that prediction, whatever the element may be.

A rough justification of the axiom runs as follows. The basic assumption of scientific prediction is that any regularities which have been observed in a sequence of observations, will continue to be present in the behaviour of the system during the prediction interval. Hence we should choose the elements in a table look-up to be such that previously detected regularities are present for the prediction. But, if the model is such that it requires a large amount of code to appear in a table look-up, in order to compute an output which is consistent with such regularities, then we can certainly obtain a better model by using the "fixed" part of the model (i.e. the part that is common to all the computations) to compute the regularities.
This is true because for a sufficiently large set of observations, the size of the model will be governed by the average length of code appearing as elements of the table look-up. Thus the axiom is reasonable if it is assumed that it is applied to the best available model.

The above argument can be illustrated by the following example. Suppose a system is defined by the observations:

S=(U,Y)=($\emptyset$, (552,553,546,551,549,544,547,554,557,551)).

If the programming language and computing facility F is taken to be Algol W, as implemented on the IBM 370/165 installation at Cambridge, and $E_i = (\emptyset, y_i)$, i=1,...,10, (so that $E_3 = (\emptyset, 546)$, for example), then a trivial (F,E)-model of S is:

    BEGIN INTEGER I,J; INTEGER ARRAY Y(1::10);
    FOR J:=1 UNTIL 10 DO READ (Y(J));
    READ (I); WRITE (Y(I));
    END.
    552,553,546,551,549,544,547,554,557,551,

When presented with an integer i ($1 \le i \le 10$), this program computes $y_i$ by looking it up in the array Y.

We know that this model is useless for prediction, because it is a trivial model. Nevertheless, we can make it compute a "prediction". We must first supply it with a new entry in its table look-up. To do this, we replace the integer 10 in line 2 by the integer 11, and add a new number at the end of the program. When presented with the integer 11, this program will output the new number. What should this number be? According to our Axiom for Prediction, it should be one of the integers 0,...,9. The prediction of $y_{11}$ will then be that integer.

Clearly this prediction is a very bad one. A prediction nearer 550 would "obviously" be better. But why can we see this? Because we have detected a regularity in the system behaviour - namely, that the behaviour remains close to 550. In other words, we can see that a prediction obtained by not obeying the Axiom for Prediction will be better than one obtained by obeying it. But in that case we can use our knowledge of this regularity to build a better model. One way of doing this is to observe that the mean of the behaviour tends to remain close to 550, and therefore to build a model which is of the conventional "mean plus random error" type:

    BEGIN INTEGER I,J; INTEGER ARRAY E(1::10);
    FOR J:=1 UNTIL 10 DO READ (E(J));
    READ (I); WRITE (550+E(I));
    END.
    2,3,-4,1,-1,-6,-3,4,7,1,

This model computes the system behaviour Y by computing the observed regularity (550), and correcting it by a term obtained from a table look-up. Admittedly, this model is only slightly better than the trivial model (its information gain is 12 terminals), but it would rapidly become decisively superior if more observations became available, which remained close to 550.

In this case, if we apply our Axiom for Prediction, we obtain as the predicted next output an integer lying between 550 and 559. This is quite reasonable.

Clearly, several similar models can be built, and each of them will portray a particular range of "best" predictions. For the above data, there is a considerable range for these predictions. It may be possible to reduce this range, for example by estimating the probability distribution of the table look-up terms.
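The arithmetic of the example in section 3.6 can be checked with a short Python sketch. It is a sketch under assumptions, not the author's procedure: the two Algol W programs are simply held as strings, the length of a program is taken to be its number of non-blank characters, and the code length of a table entry is taken to be the number of characters in its decimal representation. The helper names are hypothetical.

```python
import re

# The two Algol W programs of the example, held as plain strings.
TRIVIAL = """BEGIN INTEGER I,J; INTEGER ARRAY Y(1::10);
FOR J:=1 UNTIL 10 DO READ (Y(J));
READ (I); WRITE (Y(I));
END.
552,553,546,551,549,544,547,554,557,551,"""

MEAN = """BEGIN INTEGER I,J; INTEGER ARRAY E(1::10);
FOR J:=1 UNTIL 10 DO READ (E(J));
READ (I); WRITE (550+E(I));
END.
2,3,-4,1,-1,-6,-3,4,7,1,"""


def length(program: str) -> int:
    """Assumed length measure: the number of non-blank characters."""
    return len(re.sub(r"\s+", "", program))


def cheapest_entries(candidates):
    """Axiom for Prediction, sketched: keep the entries with the shortest code."""
    candidates = list(candidates)
    shortest = min(len(str(c)) for c in candidates)
    return [c for c in candidates if len(str(c)) == shortest]


# Information gain of the "mean plus random error" model.
gain = length(TRIVIAL) - length(MEAN)

# New table entries which increase the size of a model the least.
entries = cheapest_entries(range(-999, 1000))

# The trivial model outputs the entry itself; the mean model adds 550 to it.
trivial_predictions = entries
mean_predictions = [550 + e for e in entries]
```

Under these assumptions the gain is 12 characters, the trivial model "predicts" one of 0,...,9 and the mean model predicts a value from 550 to 559. A different coding of table look-up entries (cf. Chapter 7) would change the set of cheapest entries, and with it the range of predictions.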
3.7 An Example

3.7.1 Introduction

In this section an example of a modelling exercise will be presented, in terms of the above characterisation. The data which was used consists of 296 pairs of input-output observations, and is the gas furnace data considered by Box and Jenkins (45) (which is given by Box and Jenkins as Series J). The input observations are of gas flow rate into a furnace, and the output observations are of the concentration of carbon dioxide in the outlet gases. The observations were made at equal intervals of nine seconds.

Box and Jenkins obtain a model for these observations, which consists of a deterministic transfer function relating the input flow rate of the gas to the output concentration of carbon dioxide, and a model of the noise process which disturbs the deterministic relationship. The model they obtain is:
$$\hat{y}_t = -\,\frac{0.53 + 0.37B + 0.51B^2}{1 - 0.57B - 0.01B^2}\, u_{t-3} \qquad (3.6)$$

$$n_t = \frac{1}{1 - 0.53B + 0.63B^2}\, w_t \qquad (3.7)$$

$$y_t = \hat{y}_t + n_t \qquad (3.8)$$
Here $u_t$ and $y_t$ represent the input and output variables, respectively, after removal of their mean values, at sampling instant t. $\hat{y}_t$ represents the estimate of $y_t$ generated by the transfer function of eqn. (3.6), and $n_t$ is the error between $y_t$ and $\hat{y}_t$. Using conventional system identification terminology, $n_t$ is an "output error" ("error in variables" in the terminology of econometrics (Johnston (48))). $n_t$ represents a stochastic disturbance acting on the output of the process at time t, and $w_t$ is a "discrete white-noise" process (i.e. a zero-mean, serially uncorrelated random sequence), which is considered to cause the disturbance $n_t$ according to relationship (3.7). B is the backward shift operator, defined by $Bx_t = x_{t-1}$. The model can be given a conventional diagrammatic representation, as in fig. 3, where $\bar{u}$, $\bar{y}$ denote the mean values of the input and output variables, respectively.
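Since $Bx_t = x_{t-1}$, the deterministic transfer function (3.6) can be multiplied out into a recursion, $\hat{y}_t = 0.57\hat{y}_{t-1} + 0.01\hat{y}_{t-2} - 0.53u_{t-3} - 0.37u_{t-4} - 0.51u_{t-5}$. The following Python sketch evaluates this recursion. The function name and the choice of zero initial conditions are assumptions made for illustration; they are not taken from Box and Jenkins or from the text.

```python
def simulate_transfer_function(u):
    """Evaluate the recursion obtained by multiplying out equation (3.6).

    u is a sequence of mean-removed inputs; values before the start of the
    record are assumed to be zero (an illustrative choice of initial conditions).
    """
    def at(seq, k):
        # Value of seq at index k, or zero before the start of the record.
        return seq[k] if k >= 0 else 0.0

    y_hat = []
    for t in range(len(u)):
        value = (0.57 * at(y_hat, t - 1) + 0.01 * at(y_hat, t - 2)
                 - 0.53 * at(u, t - 3) - 0.37 * at(u, t - 4) - 0.51 * at(u, t - 5))
        y_hat.append(value)
    return y_hat


# Response to a unit impulse: nothing appears until the three-sample delay has passed.
impulse_response = simulate_transfer_function([1.0] + [0.0] * 7)

# Response to a unit step settles at the steady-state gain -(0.53+0.37+0.51)/(1-0.57-0.01).
step_response = simulate_transfer_function([1.0] * 200)
steady_state_gain = -(0.53 + 0.37 + 0.51) / (1 - 0.57 - 0.01)
```

The negative steady-state gain reflects the minus sign in (3.6): an increase in gas flow rate lowers the predicted carbon dioxide concentration.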
3.7.2 The System

In terms of definition (1.3.1), the system which we are considering is S=(U,Y), where

$U = (u_1, \ldots, u_{296})$, $Y = (y_1, \ldots, y_{296})$,

$\ell_i = m_i = 1$ for $i = 1, \ldots, 296$, and the observations $\{u_i, y_i\}$ are listed in Appendix C. As in the example of section 3.6, we shall take the programming language F to be Algol W, as implemented on the IBM 370/165 installation at Cambridge.

3.7.3 Model I - The Trivial Model

We must define the sets A,B,C,D,E, which occur in definition 3.3.1. For the trivial model, we can take these to be:

$A = \{A_i : i = 1, \ldots, 296\}$, $A_i = \emptyset$
$B = \{B_i : i = 1, \ldots, 296\}$, $B_i = \emptyset$
$C = \{C_i : i = 1, \ldots, 296\}$, $C_i = y_i$
$D = \{D_i : i = 1, \ldots, 296\}$, $D_i = (A_i, B_i) = \emptyset$
$E = \{E_i : i = 1, \ldots, 296\}$, $E_i = (D_i, C_i) = (\emptyset, y_i)$

A concrete (F,E)-model, which is a trivial model of the system S, is:

    BEGIN INTEGER I,J; REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I); WRITE (Y(I));
    END.
    53.8 53.6 53.5 ... 57.0
89 The last line of the trivial m o d e l is the table look-up, which c o n t a i n s the o u t p u t o b s e r v a t i o n s . can be r e p r e s e n t e d d i a g r a m m a t i c a l l y ,
3.7.4
Model
The t r i v i a l m o d e l
as in fig.
4(a).
II - The Mean
Probably the first nontrivial model to be hypothesised for many systems is that the system behaviour has a constant mean value. This model is of the type which reproduces regularities only in the output observations, and does not exploit any information in the input observations. Consequently, the sets A, B, C, D, E may be taken to be the same as for the trivial model. The mean value of the output observations is 53.5. The following is a (F,E)-model of S which makes use of this fact:

    BEGIN
    INTEGER I,J;
    REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I);
    WRITE (53.5 + Y(I));
    END.
    .3 .1 0 0 -.1 ... 3.8 3.5

The table look-up of this model is listed in the column headed y~ in Appendix C. Fig. 4(b) shows a diagrammatic representation of this model. The dashed line represents the boundary of the model.
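The relationship between Models I and II can be sketched in Python rather than Algol W. This is only an illustration: the three-element table is the start of the record quoted above (not the full 296 observations), and the names `trivial_model`, `mean_model` and `RESIDUALS` are hypothetical.

```python
# Illustrative sample only: the first output observations quoted in the text.
Y = [53.8, 53.6, 53.5]
MEAN = 53.5  # mean of the output observations, as stated in the text


def trivial_model(i, table=Y):
    """Model I: return y_i straight from the table look-up (1-based index)."""
    return table[i - 1]


# Model II keeps only the deviations from the mean in its table look-up.
RESIDUALS = [round(y - MEAN, 1) for y in Y]


def mean_model(i, table=RESIDUALS):
    """Model II: the constant mean plus the tabulated deviation."""
    return round(MEAN + table[i - 1], 1)
```

Both models reproduce the observations exactly; they differ only in what has to be stored in the table look-up.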
3.7.5 Model III - Deterministic Transfer Function

We now assume that the transfer function of equation (3.6) has been hypothesised as the relationship between the input and output of the system. We have to choose new sets A. However, we make the restriction that the model may not assume knowledge of past output observations (other than the initial conditions), but may assume knowledge of all the past and present input information.

$A = \{A_i : i = 1, \ldots, 296\}$, $A_i = (u_1, u_2, \ldots, u_i)$
E: f o r i~6
, A i = @ for il,(i.e.
is an a s y m p t o t i c system.
We wish to consider models of the $S^j$'s which differ only in their table look-ups. To capture the idea of a table look-up without restricting it unduly, we shall consider models to be pairs $(m,T)$. Each element of the pair $(m,T)$ is regarded as a part of a program, and the pair as the complete program. This can be formalised quite easily, if required: take a pairing function $\tau$ (cf. proof of theorem (3.2.1)), and change definition (3.3.6), so that a concrete (F,E)-model becomes an ordered pair of integers $(m,T)$, such that $F(\tau(m,T), i, D_i) = C_i$. These integers can be associated with programs, as before.

$m$ will be considered to be the part which is common to models of all the $S^j$'s, while $T^j$ will be regarded as a table look-up, which may be different for each $S^j$. When a translation of the program $(m,T)$ from one language to another is considered, it will be assumed that $T$ (or at least its length) remains unchanged. On the other hand, the translation of $m$ will be assumed to be different from $m$. In this way a distinction is drawn between $T$ and $m$, which corresponds to some aspects of the distinction between table look-up and other types of program.

In the following definition a particular programming language is assumed, and $m$ and $T^j$ are fragments of programs in this language. The definition is based on definition (3.3.1), and the notations of definition (4.2.1) are generalised in an obvious manner.

Definition
(4.2.3)

Let $A^j = \{A_k^j\}$ be a set of ordered subsets of $(u_1, u_2, \ldots, u_j)$, let $B^j = \{B_l^j\}$ be a set of ordered subsets of $(y_1, y_2, \ldots, y_j)$, and let $C^j = \{C_i^j\}$ be a complete set of $m_j$ disjoint ordered subsets of $(y_1, y_2, \ldots, y_j)$. Let $D^j$ be a set of ordered pairs $D_i^j = (A_k^j, B_l^j)$ $(i = 1, 2, \ldots, m_j)$, and let $E^j$ be a set of ordered pairs $E_i^j = (D_i^j, C_i^j)$ $(i = 1, 2, \ldots, m_j)$. Finally, let $\mathcal{E}$ be the sequence $\mathcal{E} = (E^1, E^2, \ldots)$, and $\mathcal{T}$ be the sequence $\mathcal{T} = (T^1, T^2, \ldots)$.

Then the pair $(m, \mathcal{T})$ is an asymptotic $\mathcal{E}$-model of the asymptotic system $\mathcal{S} = (S^1, S^2, \ldots)$ if and only if $(m, T^j)$ is an $E^j$-model of $S^j$, for every $j = 1, 2, \ldots$

The following definitions distinguish between two possible asymptotic behaviours of rival models. $I(m,T^j)$ denotes the information gain of the model $(m,T^j)$, and $E(m,T^j)$ denotes the information explained by model $(m,T^j)$, namely the ratio of $I(m,T^j)$ to the size of the trivial model of $S^j$. $(m_1, \mathcal{T}_1)$ and $(m_2, \mathcal{T}_2)$ denote asymptotic models of some asymptotic system $\mathcal{S}$, with $\mathcal{T}_1 = (T_1^1, T_1^2, \ldots)$ and $\mathcal{T}_2 = (T_2^1, T_2^2, \ldots)$. We use $\liminf_{j\to\infty} x_j$ to denote $\lim_{j\to\infty} \inf_{k>j} x_k$, and similarly for $\limsup$.

Definition
(4.2.4)

$(m_1, \mathcal{T}_1)$ is asymptotically weakly better than $(m_2, \mathcal{T}_2)$ (denoted by $(m_1, \mathcal{T}_1) >_w (m_2, \mathcal{T}_2)$) if and only if

$$\liminf_{j\to\infty} \{I(m_1,T_1^j) - I(m_2,T_2^j)\} = +\infty \qquad (4.2)$$

Definition (4.2.5)

$(m_1, \mathcal{T}_1)$ is asymptotically strongly better than $(m_2, \mathcal{T}_2)$ (denoted by $(m_1, \mathcal{T}_1) >_s (m_2, \mathcal{T}_2)$) if and only if

$$\liminf_{j\to\infty} \{E(m_1,T_1^j) - E(m_2,T_2^j)\} > 0 \qquad (4.3)$$
The ideas behind these definitions are the following. Let $t_j$ denote the trivial model of $S^j$, and $|t_j|$ denote its size. We henceforth make the natural assumption that

$$\lim_{j\to\infty} |t_j| = +\infty \qquad (4.4)$$

If $(m_1, \mathcal{T}_1)$ is asymptotically weakly better than $(m_2, \mathcal{T}_2)$ then the "amount of information" extracted from $S^j$ by $(m_1, T_1^j)$ is eventually greater than that extracted by $(m_2, T_2^j)$, and the difference between them is eventually increasing. But their "rates of information extraction", as measured by the information explained, may be converging towards each other. For example, if $|t_j| = kj$, $I(m_1,T_1^j) = p j^{1/2}$, $I(m_2,T_2^j) = q j^{1/2}$, with $p > q$, then $I(m_1,T_1^j) - I(m_2,T_2^j) = (p-q) j^{1/2} \to \infty$, while $E(m_1,T_1^j) - E(m_2,T_2^j) = \frac{p-q}{k} j^{-1/2} \to 0$.

If $(m_1, \mathcal{T}_1)$ is asymptotically strongly better than $(m_2, \mathcal{T}_2)$ then the "rate of information extraction" by $(m_1, \mathcal{T}_1)$ is eventually greater than that by $(m_2, \mathcal{T}_2)$. The "weak/strong" terminology is justified by the following theorem.
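Theorem (4.2.6)

$(m_1, \mathcal{T}_1) >_s (m_2, \mathcal{T}_2) \Rightarrow (m_1, \mathcal{T}_1) >_w (m_2, \mathcal{T}_2)$.

Proof

Suppose $\liminf_{j\to\infty} \{I(m_1,T_1^j) - I(m_2,T_2^j)\} < +\infty$. Then there exists $M > 0$ such that, for every $k$, there is an $i > k$ such that $I(m_1,T_1^i) - I(m_2,T_2^i) < M$. By (4.4), it follows that for every $\epsilon > 0$ and every $k$ there is an $i > k$ such that $E(m_1,T_1^i) - E(m_2,T_2^i) < \epsilon$, so that $\liminf_{j\to\infty} \{E(m_1,T_1^j) - E(m_2,T_2^j)\} \le 0$. Hence

$$\liminf_{j\to\infty} \{E(m_1,T_1^j) - E(m_2,T_2^j)\} > 0 \Rightarrow \liminf_{j\to\infty} \{I(m_1,T_1^j) - I(m_2,T_2^j)\} = +\infty.$$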
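The example given before theorem (4.2.6), with $|t_j| = kj$ and information gains proportional to $j^{1/2}$, can be checked numerically. The sketch below is illustrative only: the values of $p$, $q$ and $k$ are arbitrary assumptions, and `gains` is a hypothetical helper name.

```python
import math


def gains(j, p=3.0, q=1.0, k=2.0):
    """Return (difference in information gain, difference in information explained)."""
    size = k * j                 # |t_j| = k j, size of the trivial model
    i1 = p * math.sqrt(j)        # I(m1, T1^j) = p j^(1/2)
    i2 = q * math.sqrt(j)        # I(m2, T2^j) = q j^(1/2)
    return i1 - i2, (i1 - i2) / size


# The gain difference grows without bound while the explained difference shrinks.
for j in (100, 10_000, 1_000_000):
    print(j, gains(j))
```

With these assumed constants the first model is weakly better but not strongly better than the second, which is the situation the definitions are designed to separate.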
We now consider the effect of writing models in different languages on their asymptotic performance. For a precise discussion of what it means for a program to be written in some particular language, see chapter 5.
Let $(m_1, \mathcal{T}_1)$ and $(m_2, \mathcal{T}_2)$ be asymptotic models of $\mathcal{S}$ written in a programming language $\lambda$. Let $\pi$ be a programming language, such that programs $(p_1, T_1^j)$, $(p_2, T_2^j)$, $j = 1, 2, \ldots$, can be written in $\pi$, and such that these programs compute the same partial recursive functions as the programs $(m_1, T_1^j)$, $(m_2, T_2^j)$, $j = 1, 2, \ldots$, respectively. Using the notation of definition (3.3.6) we can write

$$\pi(\tau(p_1, T_1^j), \cdot, \cdot) = \lambda(\tau(m_1, T_1^j), \cdot, \cdot) \qquad (4.5)$$

where $\tau$ is an appropriate pairing function, and similarly for $p_2$, $m_2$. Consequently $(p_1, \mathcal{T}_1)$ and $(p_2, \mathcal{T}_2)$ are asymptotic models of $\mathcal{S}$ written in $\pi$.

Let $|p|$ denote the size of a program $p$; let $t_j^{\lambda}$ be the trivial model of $S^j$ written in $\lambda$, and let $t_j^{\pi}$ be the trivial model of $S^j$ written in $\pi$. We assume that

$$|t_j^{\pi}| = |t_j^{\lambda}| + k \qquad (4.6)$$

Theorem (4.2.7)

With the notations and assumptions as stated above,

(a) $(m_1, \mathcal{T}_1) >_w (m_2, \mathcal{T}_2) \Rightarrow (p_1, \mathcal{T}_1) >_w (p_2, \mathcal{T}_2)$
(ml ' < ) > s (mr '::= [ < i d e n t i f i e r > < l e t t e r >
then the Algol-supports of the two models may be equivalent. This syntax, which is part of the syntax of the Algol-support of the first model, allows the identifier sin to be used (since the productions <letter>::=i|n|s must also be part of the syntax), and furthermore it allows it to be used as a procedure identifier. Now, suppose that the Algol-support of the first model is a fragment of the Algol-support of the second model (a possibility which we wish to allow, intuitively). Then, according to definition (5.4.1) (iv), the sin call must have the same effect in both languages. So, the Algol-support of the first model must contain sin as a standard procedure. Clearly this contradicts the intended meaning of "Algol-support".

Consequently, we insist that standard procedure identifiers be regarded as terminals.

If it is now stipulated that only l-comparable models should be compared for model assessment, then we have the formal equivalent of the intuitive idea, that models should be compared only if they use the same facilities of a language. One reason for making this stipulation has already been referred to in section 6.1. It may be felt to be an "unfair" comparison if the models are not l-comparable.

An obvious example of this would be a comparison of a difference-equation model with a differential-equation model. If the differential-equation model were allowed to call a standard procedure for integration, would it be reasonable to compare the "number of arbitrary elements" embodied in it with the number embodied in the difference-equation model? The difference-equation model requires fewer a priori assumptions (if its l-support is a fragment of the l-support of the differential-equation model, where l is the language in which both models are written).

There are, however, two ways of making models l-comparable. Rather than adding an explicitly declared integration procedure to the differential-equation model, it is possible to add a "dummy" call of the standard procedure for integration to the difference-equation model. It is the flexibility offered by this possibility, of "padding" models with redundant statements, that reduces the significance of any insistence on l-comparability. This will be demonstrated in section 6.3.

If l-comparability is required, there still remains the choice of a suitable l-support for the models which are to be compared. Returning to the above example, there is still a decision to be made - should both models call the standard procedure for integration, or neither? This decision, of course, is very significant for model assessment. But this is the decision discussed in chapter 4. In other words, it will be governed by the a priori assumptions that the modeller wishes to make.
6.3 Example: AlgolW-Comparable Gas Furnace Models

This section investigates how the assessment of the six models of the gas-furnace data (cf. chapter 3) is altered, if they are required to be AlgolW-comparable.

6.3.1 Standard Procedures

The definition of Algol W is assumed to be a formalised version of the specification given in (50). The six models use three standard procedures, namely the input/output procedures READ, READON and WRITE. In accordance with the discussion of section 6.2, we consider the syntax specification of (50) to be augmented by the productions:

<simple statement>::=<standard procedure statement>
<standard procedure statement>::=<standard procedure identifier>(<actual parameter list>)
<standard procedure identifier>::=READ|READON|WRITE

The abstract syntax, translator and interpreting automaton are considered to be modified accordingly.

6.3.2 AlgolW-Comparable Models

In this example the models are modified so as to be AlgolW-comparable by inserting redundant statements and expressions, rather than by avoiding certain constructions.

Referring to section 3.7, and comparing the models in order, we notice first that the syntax of the AlgolW-support of model II contains the productions

<simple t expression>::=<simple t expression>+<t term>|<t term>

whereas the syntax of the AlgolW-support of model I contains only the production

<simple t expression>::=<t term>

(For the significance of "t" see (50) or the introduction to Appendix B). This discrepancy can be removed by changing the WRITE statement of model I to:

WRITE(Y(I)+0);
Model III requires several productions which are not needed for models I or II. These are:
< l e t t e r > : : = NIU ::=. <simple
t expression>::=<simple
: : = < t t e r m > * < t
t expression>-
factor>
::= ::=<simple
t expression>
<simple t e x p r e s s i o n > ::=< <statement>::=
<simple s t a t e m e n t > : : = < b l o c k > l < : : = < t
t a s s i g n m e n t statement> left part>
: : = : = : : = < i f
clause><simple
statement>ELSE
<statement> : : = IF < l o g i c a l e x p r e s s i o n >
THEN
Most but not all of these are needed for model IV, but model IV itself needs two productions which are not needed by models I, II, or III:

<letter>::=E|V|W|Z
<actual parameter list>::=<actual parameter>|<actual parameter list>,<actual parameter>

The only new productions required by models V and VI are <letter>::=A, and <letter>::=W, respectively, but these can easily be removed by using different identifiers.

We give below the six models, modified so as to be AlgolW-comparable. The concrete syntax of their common AlgolW-support is given in Appendix B.

I The Trivial Model
    BEGIN
    INTEGER I,J,N,V,W,Z;
    REAL ARRAY E,U,Y(1::296);
    BEGIN
    FOR J:=1 UNTIL 296 DO READON (Y(J-0));
    READ (I);
V:=O; IF I,<SlOS2:e2>,<s2os2:e3>,...} The c h a r a c t e r i s t i c
set of B is not known in this case, so this definition cannot be completed. But suppose that B were the object:

$B = \{<s_1 : e_4>, <s_2 : e_5>\}$

then we would have

$A = \{<s_1 : e_1>, <s_1 \circ s_2 : e_2>, <s_2 \circ s_2 : e_3>, <s_1 \circ s_3 \circ s_2 : e_4>, <s_2 \circ s_3 \circ s_2 : e_5>\}$.

We now introduce the $\mu$-function, which is used to perform operations on objects. The $\mu$-function takes two arguments, the first of which is an object A, and the second is a pair $<\kappa : B>$, where $\kappa$ is a composite selector, and B is an object. The range of $\mu$ is the set of all objects. The value $\mu(A; <\kappa : B>)$ is an object which is obtained from A by replacing $\kappa(A)$ by B in such a way that $\kappa(\mu(A; <\kappa : B>)) = B$. This is most clearly shown by examples (taken from Lucas et al (68)):

Let A = [tree diagram of the object A, with simple selectors $s_1$, $s_2$ and elementary objects $e_2$, $e_3$, $e_4$; not recoverable from the source]

Then
(i) $\mu(A; <s_3 : B>)$ = [tree diagram]
(ii) $\mu(A; <s_1 \circ s_2 : B>)$ = [tree diagram]
(iii) $\mu(A; <s_1 \circ s_1 \circ s_1 \circ s_2 : B>)$ = [tree diagram]

In particular, if $B = \Omega$, we obtain:
(i) $\mu(A; <s_3 : \Omega>) = A$
(ii) $\mu(A; <s_1 \circ s_2 : \Omega>)$ = [tree diagram]
(iii) $\mu(A; <s_1 \circ s_1 \circ s_1 \circ s_2 : \Omega>)$ = [tree diagram]
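Ollongren (49) gives conditions under which interchanging the arguments of the $\mu$-function leaves the value unchanged.

(ii) $\mu_0(<\kappa_1 : B_1>, \ldots, <\kappa_n : B_n>) = \mu(\Omega; <\kappa_1 : B_1>, \ldots, <\kappa_n : B_n>)$

Thus $\mu_0$ is a function which "creates" objects.

Example

$\mu_0(<s_1 : e_1>, <s_1 \circ s_2 : B>, <s_2 \circ s_2 : e_3>)$ = [tree diagram: $s_1$ selects $e_1$; $s_2$ selects an object in which $s_1$ selects B and $s_2$ selects $e_3$]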
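The behaviour of the two functions can be sketched in Python, under some assumptions that are not part of the text: objects are modelled as nested dictionaries, the null object as the empty dictionary, elementary objects as strings, and a composite selector as a tuple written in composition order (so that the rightmost simple selector is applied first). The names `OMEGA`, `select`, `mu` and `mu0` are hypothetical.

```python
import copy

OMEGA = {}  # assumed representation of the null object


def select(obj, selector):
    """Apply a composite selector such as ("s1", "s2"), meaning s1 o s2."""
    for simple in reversed(selector):      # rightmost simple selector first
        obj = obj.get(simple, OMEGA) if isinstance(obj, dict) else OMEGA
    return obj


def mu(obj, selector, value):
    """Return a new object equal to obj except that selector(obj) is value."""
    if not selector:
        return copy.deepcopy(value)
    *outer, first = selector
    result = dict(obj) if isinstance(obj, dict) else {}
    child = mu(result.get(first, OMEGA), tuple(outer), value)
    if child == OMEGA:
        result.pop(first, None)            # inserting the null object deletes the branch
    else:
        result[first] = child
    return result


def mu0(*pairs):
    """Create an object from (selector, value) pairs, starting from the null object."""
    obj = OMEGA
    for selector, value in pairs:
        obj = mu(obj, selector, value)
    return obj


A = mu0((("s1",), "e1"), (("s1", "s2"), "e2"), (("s2", "s2"), "e3"))
```

Here `A` has the same shape as the example above with an elementary object in place of B, and `select(mu(A, k, B), k)` returns `B` for any selector `k`, which is the defining property of the function.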
A.4 Concrete Syntax

The concrete syntax of a programming language can be defined by using the Backus-Naur form of writing production rules. This is a shorthand method of defining a grammar.

Suppose that there exists a finite non-empty set $\Sigma$ of terminals. Typical elements of $\Sigma$ are: b, 2, *, begin, and so on. Let $\Sigma^*$ denote the set of all finite strings of elements of $\Sigma$. Also, suppose N is a finite non-empty set of non-terminals such that $N \cap \Sigma = \emptyset$ and $N^*$ is the set of all finite strings of elements of N. Let $V = \Sigma \cup N$, and $V^* = (\Sigma \cup N)^*$. Let $V^+ = V^* - \{\Lambda\}$, where $\Lambda$ is the empty string. Then the set of production rules is the set

$\Pi = \{(\alpha, \beta) : \alpha \in V^* \times N \times V^* \ \& \ \beta \in V^+\}$.

Each pair $(\alpha, \beta) \in \Pi$ is written as $\alpha \to \beta$. In Backus-Naur form, the set of production rules $\{\alpha \to \beta_1, \alpha \to \beta_2, \ldots, \alpha \to \beta_n\}$ is denoted by the single expression $<\alpha> ::= \beta_1 | \beta_2 | \ldots | \beta_n$. The brackets < > are used to denote non-terminals. Backus-Naur notation can be used to express production rules $(\alpha, \beta)$, providing that $\alpha \in N$. Such production rules are called context-free.

A grammar G is a 4-tuple $G = (N, \Sigma, P, S)$, where P is a finite non-empty subset of $\Pi$, and $S \in N$ is the start symbol. If each production of a grammar is context-free then the grammar is said to be context-free. A context-free grammar can be conveniently defined by a finite set of expressions in Backus-Naur form.

If there exist $\delta_1, \delta_2 \in V^*$ and $\alpha \to \beta \in P$ such that $\gamma_1 = \delta_1 \alpha \delta_2$ and $\gamma_2 = \delta_1 \beta \delta_2$ then $\gamma_1 \Rightarrow \gamma_2$. If $\gamma_i \in V^*$ and $\gamma_{i-1} \Rightarrow \gamma_i$ $(i = 1, 2, \ldots, n)$, then $\gamma_0 \Rightarrow^* \gamma_n$ ($\gamma_n$ is derived from $\gamma_0$). The grammar G is said to generate the language

$L(G) = \{x : S \Rightarrow^* x \ \& \ x \in \Sigma^*\}$.

Two grammars are equivalent if they generate the same language.

The Vienna Method defines two grammars for each language. One is the concrete syntax, the other the abstract syntax. Lucas et al (68) explain the distinction most clearly:

"An abstract syntax is one which only specifies the expressions of the language as to the structures significant for their subsequent interpretation and not as to how they are to be expressed for the purpose of communication either to oneself or to others. A concrete syntax specifies the expressions of the language as a set of character strings".
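The derivation relation and the language generated by a context-free grammar, as defined earlier in this section, can be sketched as follows. The toy grammar, the tuple representation of strings and the function names are assumptions made for illustration; the length bound relies on the fact that every right-hand side is non-empty, so a context-free derivation never shortens a string.

```python
def derive_steps(gamma, productions):
    """Yield every string obtained from gamma by one application of a production."""
    for alpha, beta in productions:
        for pos, symbol in enumerate(gamma):
            if symbol == alpha:
                # gamma = d1 alpha d2  is rewritten as  d1 beta d2
                yield gamma[:pos] + beta + gamma[pos + 1:]


def language(start, productions, terminals, max_len):
    """Collect the terminal strings of length <= max_len derivable from start."""
    seen, frontier, words = {start}, [start], set()
    while frontier:
        gamma = frontier.pop()
        for nxt in derive_steps(gamma, productions):
            if len(nxt) > max_len or nxt in seen:
                continue
            seen.add(nxt)
            if all(s in terminals for s in nxt):
                words.add("".join(nxt))
            else:
                frontier.append(nxt)
    return words


# Hypothetical grammar:  <S> ::= a<S>b | ab
PRODUCTIONS = [("<S>", ("a", "<S>", "b")), ("<S>", ("a", "b"))]
WORDS = language(("<S>",), PRODUCTIONS, {"a", "b"}, max_len=6)
```

Two grammars could then be compared, up to the chosen length bound, by comparing the sets returned by `language`.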
The c o n c r e t e syntax of LML is d e f i n e d as follows: <program>
::= ,
,
,
, .
< r a t i o n a l > : : = +J
-JO
::= J
::= J
:
O 1 2 3 4 5 6 7 8 9 . , + -
Rationals are required to be signed so that the terms in the table look-up will be coded in a size-capturing manner (cf. chapter 7). This requirement is extended to the other terms solely for simplicity of definition.

An example of a valid string in LML is:

2,1,+.6,-3,0,-1.41,-5.2.

This program can be parsed (using the syntax definition) to give the object (not shown in full):

[tree diagram of the parsed concrete program, with simple selectors $s_1, s_2, \ldots$]

A.5 Abstract Syntax

We introduce the following notational conventions: if an object x satisfies a predicate P, we write is-P(x). The set of objects which satisfy P is denoted $\widehat{\text{is-P}}$. That is, $\widehat{\text{is-P}} = \{x : \text{is-P}(x)\}$. The set $\widehat{\text{is-P}}$ is defined by an expression of the form

$\text{is-P} = (<\text{s-P}_1 : \text{is-P}_1>, <\text{s-P}_2 : \text{is-P}_2>, \ldots, <\text{s-P}_n : \text{is-P}_n>)$

which indicates that for every $x \in \widehat{\text{is-P}}$,

$x = \mu_0(<\text{s-P}_1 : x_1>, <\text{s-P}_2 : x_2>, \ldots, <\text{s-P}_n : x_n>)$,

where $x_1 \in \widehat{\text{is-P}_1}$, $x_2 \in \widehat{\text{is-P}_2}$, ..., $x_n \in \widehat{\text{is-P}_n}$. If $\text{is-P} = (<\text{s-P}_1 : \text{is-P}_1>)$ then we write $\text{is-P} = \text{is-P}_1$. A predicate may also be defined by using the disjunction operator $\vee$, e.g.: $\text{is-P} = \text{is-P}_1 \vee \text{is-P}_2$, which denotes that $x \in \widehat{\text{is-P}}$ only if $x \in \widehat{\text{is-P}_1} \vee x \in \widehat{\text{is-P}_2}$. It is assumed that certain predicates are satisfied by subsets of the elementary objects.

Using this notation, the abstract syntax of LML is defined as follows:

$\text{is-program} = (<\text{s-n} : \text{is-integer}>, <\text{s-m} : \text{is-integer}>, <\text{s-rational-list} : \text{is-rational-list}>)$
$\text{is-rational-list} = (<\text{s-head} : \text{is-rational}>, <\text{s-tail} : \text{is-rational-list} \vee \text{is-}\Omega>)$

It is assumed that $\widehat{\text{is-}\Omega} = \{\Omega\}$, and that the predicates is-integer and is-rational are satisfied by (countably) infinite sets of elementary objects. Every LML "abstract program" satisfies the predicate is-program. For example, the abstract program corresponding to the concrete LML program introduced in section A.4 is the object

    P = s-n: 2
        s-m: 1
        s-rational-list:
            s-head: +.6
            s-tail:
                s-head: -3
                s-tail:
                    s-head: 0
                    s-tail:
                        s-head: -1.41
                        s-tail:
                            s-head: -5.2
How this object is obtained from the concrete program is specified by the translator, which is defined in the next section. Note that writing "+.6" for the object $\text{s-head} \circ \text{s-rational-list}(P)$ does not attach any meaning to that object. "+.6" is merely a mnemonic label for the object, although it is chosen for its mnemonic value.

Most discussions of the Vienna Method point out that the abstract syntax can be viewed as defining a language, the concrete syntax of it being considered to be arbitrary, so that alternative concrete syntaxes need not be regarded as separate languages but as its realisations. If we were to adopt this point of view, then the abstract syntax introduced above would not be satisfactory for our purposes, since it assumes an infinite set of "terminals" (and is therefore not a "grammar" as defined in section A.4).

It is essential for us to have a size measure of programs, and a useful measure of the size of a program is the length of the string over the terminals which constitutes that program. As discussed in chapter 3, Blum (15) has introduced a pair of axioms which must be satisfied by any measure of the size of "machines" (in our case, programs). These are:

(1) there exists at most a finite number of programs of any given size,
(2) there exists an effective procedure for deciding, for any y, which programs are of size y.

Clearly, the first axiom is not satisfied if infinite sets of terminals are allowed.
Furthermore, we want programs to describe effective procedures for computing functions. Defining a language with infinitely many terminals would correspond to defining a Turing machine with infinitely many tape symbols. This would be a fundamental change in the notion of "computability".

To overcome these objections, it would be possible to define an abstract syntax for LML which specified a finite set of terminals. The object at each terminal node of an abstract program would then satisfy one of the predicates is-digit or is-sign, say, and these would be defined by

$\text{is-digit} = \text{is-0} \vee \text{is-1} \vee \ldots \vee \text{is-9}$
$\text{is-sign} = \text{is-+} \vee \text{is--}$

and $\widehat{\text{is-0}} = \{0\}$, ..., $\widehat{\text{is-9}} = \{9\}$, $\widehat{\text{is-+}} = \{+\}$, $\widehat{\text{is--}} = \{-\}$.

In this case the assembly of the digits into integers and rationals would have to be performed by the interpreting automaton, rather than by the translator.

A.6 The Translator
The translator is a function which maps the set of parsed concrete programs in a language into the set of abstract programs. To distinguish between concrete and abstract objects we introduce the conventions: is-<program>(x) means that x is a parsed concrete program, namely an object such as that shown in section A.4. More precisely, for LML we have, for some positive integer k:

$\text{is-<program>} = (<s_1 : \text{is-<integer>}>, <s_2 : \text{is-,}>, \ldots, <s_{2k-1} : \text{is-<rational>}>, <s_{2k} : \text{is-.}>)$

The predicates is-<integer>, is-, etc. are obtained from the concrete syntax in exactly the same way. Obviously, we have $\widehat{\text{is-,}} = \{,\}$, $\widehat{\text{is-0}} = \{0\}$, etc.

In the following definition, the statement if...then...else... is a statement in the metalanguage. It is assumed that is-<program>(p) and is-<rational>($x_i$). The LML translator, trans-program, is defined as:

$\text{trans-program}(p) = \mu_0(<\text{s-n} : \text{trans-integer}(s_1(p))>, <\text{s-m} : \text{trans-integer}(s_3(p))>, <\text{s-rational-list} : \text{makelist}(s_5(p), s_7(p), \ldots, s_{2k-1}(p))>)$

where

$\text{makelist}(x_1, x_2, \ldots, x_n) = \mu_0(<\text{s-head} : \text{trans-rational}(x_1)>, <\text{s-tail} : \text{if } x_2 = \Omega \ \& \ldots \& \ x_n = \Omega \text{ then } \Omega \text{ else makelist}(x_2, \ldots, x_n)>)$

and the functions

$\text{trans-rational} : \widehat{\text{is-<rational>}} \to \widehat{\text{is-rational}}$
$\text{trans-integer} : \widehat{\text{is-<integer>}} \to \widehat{\text{is-integer}}$

are not further defined. For our purposes these two functions are best thought of as the usual mappings onto the rational numbers. (In an actual implementation, it may be more useful to consider them as mappings into bit-patterns. In this case the sets $\widehat{\text{is-rational}}$ and $\widehat{\text{is-integer}}$ would be finite sets, due to the fixed word-length of practical computers).

Note that $\text{trans-program}(p) \in \widehat{\text{is-program}}$, and $\text{makelist}(x_1, \ldots, x_n) \in \widehat{\text{is-rational-list}}$.
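The translator can be sketched in Python for the example string of section A.4. This is a simplified illustration, not the definition above: it parses the character string directly instead of working on a parsed concrete object, it represents abstract objects as dictionaries and the null object as `None`, and it takes trans-integer and trans-rational to be `int` and `fractions.Fraction`. All function names are hypothetical.

```python
from fractions import Fraction

OMEGA = None  # assumed representation of the null object


def parse(text):
    """Split a concrete LML string into its comma-separated terms."""
    if not text.endswith("."):
        raise ValueError("an LML program must end with '.'")
    return text[:-1].split(",")


def makelist(terms):
    """Build the s-head / s-tail list; the tail of the last element is the null object."""
    if not terms:
        return OMEGA
    head, *tail = terms
    return {"s-head": Fraction(head), "s-tail": makelist(tail)}


def trans_program(text):
    """Map a concrete LML string to a dictionary shaped like an abstract program."""
    n, m, *rationals = parse(text)
    return {"s-n": int(n), "s-m": int(m), "s-rational-list": makelist(rationals)}


P = trans_program("2,1,+.6,-3,0,-1.41,-5.2.")
```

The resulting `P` has the same structure as the abstract program shown in section A.5, with exact rationals in place of the mnemonic labels.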
A.7 The Interpreting Automaton

Following Ollongren (49), we define an interpreting automaton to be a 5-tuple $(O, \text{is-state}, \xi_0, \Lambda, F)$, where O is the set of tree-structured objects already introduced, and is-state is a predicate over O. Objects satisfying is-state are states of the automaton. $\xi_0 \in \widehat{\text{is-state}}$ is the initial state of the automaton, and F is a set of final states. $\Lambda$ is the state transition function; however, its range is not $\widehat{\text{is-state}}$, but the power set of $\widehat{\text{is-state}}$. $\Lambda(\xi)$ is thus a set of states in general, although in our definition of LML, $\Lambda(\xi)$ will always be a single state.
A.7.1 The State

The state of the interpreting automaton is structured. The structure depends on the language to be defined, and for the definition of LML can be rather simple. A language with block structure, procedures, variable identifiers of various types, conditional and goto statements, and so on, will need a rather more complicated set of states.

For the LML interpreting automaton, or LML machine, we define

$\text{is-state} = (<\text{s-c} : \text{is-c}>, <\text{s-dn} : \text{is-dn}>, <\text{s-counter} : \text{is-integer}>)$.

is-dn is a predicate satisfied by a denotation directory, and is defined by

$\text{is-dn} = (<\text{s-data} : \text{is-data}>, <\text{s-y} : \text{is-rational}>, <\text{s-parno} : \text{is-integer} \vee \text{is-}\Omega>)$

where

$\text{is-data} = (<\text{s-i} : \text{is-integer}>, <\text{s-list} : \text{is-rational-list}>)$.

The data for a program, namely the sequence $i, y_{i-1}, y_{i-2}, \ldots$ appears in the initial state as the object

    s-i: i
    s-list:
        s-head: y_{i-1}
        s-tail:
            s-head: y_{i-2}
            ...

We do not specify how this is achieved. Similarly, the result of the computation, $y_i$, is the object $\text{s-y} \circ \text{s-dn}(\xi_F)$, where $\xi_F \in F$, and we do not specify how it is output. The number m+n, which is required for the correct interpretation of the program, is stored in $\text{s-parno} \circ \text{s-dn}(\xi)$. (The term "denotation directory" is taken over from (49) and (68). For LML this directory is simpler than in (49) and (68), but it serves essentially the same purpose, namely storing intermediate and final results).
The most complex part of the state is the control, which is an object satisfying the predicate

$\text{is-c} = (<\text{s-in} : \text{is-in}>, <\text{s-al} : \text{is-obj-list}>, <\text{s-ri} : \text{is-dum} \vee \text{is-}\Omega>, <r : \text{is-c}>) \vee \text{is-}\Omega$,

where the following abbreviations have been used: c: control, in: instruction, al: argument list, obj: object, ri: return information, dum: dummy.

In this definition, $\widehat{\text{is-in}}$ is a subset of the elementary objects, called the set of instructions, and $\widehat{\text{is-dum}}$ is a subset of the elementary objects called the set of dummy names. r is a simple selector, different from s-in, s-al or s-ri. is-obj-list is a predicate satisfied by lists of objects, which we do not define further; Ollongren (49) gives an extensive discussion of lists.
An example of a control is the object:

[tree diagram: the instruction $in_1$ with argument list a, whose r-component is the instruction $in_2$ with argument x and return information a]

This particular control may have the following effect: the instruction $in_2$ is performed, with x as its argument. The result of carrying out $in_2$ is assigned to the dummy name a. $in_2$ is then deleted, so that the control part of the next state is

[tree diagram: the instruction $in_1$ with argument $in_2(x)$]

$in_1$ is now carried out, with $in_2(x)$ as its argument. In this case, instruction $in_2$ is said to be contracting. On the other hand, it may be that carrying out $in_2$ requires first carrying out some other instruction $in_4$ on x, and then an instruction $in_3$ on $in_4(x)$. In this case $in_2$ is said to be expanding, and carrying it out results in the next state having the control:

[tree diagram: $in_1$ with argument a; below it $in_3$ with argument b and return information a; below it $in_4$ with argument x and return information b]

If both $in_4$ and $in_3$ are contracting, the controls of the next two states will be:

[tree diagrams: first $in_1$ with argument a above $in_3$ with argument $in_4(x)$ and return information a; then $in_1$ with argument $in_3(in_4(x)) = in_2(x)$]

If an instruction is expanding, then performing it leaves all components of the state unchanged except the control itself. However, if it is contracting, then its effect may be to change any of the components of the state (in our case, $\text{s-counter}(\xi)$ and $\text{s-dn}(\xi)$, as well as $\text{s-c}(\xi)$).

We need some definitions for later use.
The set of control selectors of a control C is the set of composite selectors $\Gamma(C) = \{\kappa : \kappa = r \circ r \circ \ldots \circ r \ \& \ \kappa(C) \neq \Omega\}$, if $C \neq \Omega$, and is I (the identity selector) if $C = \Omega$.

The terminal control selector of a non-empty control C is the composite selector $\tau(C) = \{\tau : \tau \in \Gamma(C) \ \& \ r \circ \tau \notin \Gamma(C)\}$.

If $\kappa = r^n$ is a control selector of a non-empty control, where $r^n = r \circ r \circ \ldots \circ r$ (n compositions), and $n \ge 1$, then the mth preceding control selector of $\kappa$ is $\text{prec}^m(\kappa) = \{\kappa' : r^m \circ \kappa' = \kappa\}$ $(0 \le m \le n)$.

If $\kappa$ is a control selector of a non-empty control C, then

$\text{prec-arg}(C, \kappa) = \{\text{s-al} \circ \text{prec}^i(\kappa) : i \ge 1 \ \& \ \text{s-al} \circ \text{prec}^i(\kappa)(C) = \text{s-ri} \circ \kappa(C) \neq \Omega\}$

is the set of composite selectors which select those arguments of instructions associated with preceding control selectors of $\kappa$ which are equal to the dummy name associated with $\kappa$.

If $\kappa$ is a control selector of a non-empty control C then the derived return information associated with $\kappa$ is the set $ri(C, \kappa) = \text{prec-arg}(C, \kappa)$. (This definition is included because these two sets differ in (49), but coincide for the relatively simple LML machine).

The initial state of the LML machine is

$\xi_0 = \mu_0(<\text{s-c} : \mu_0(<\text{s-in} : \text{int-prog}>, <\text{s-al} : p>)>, <\text{s-counter} : 1>, <\text{s-dn} : \mu_0(<\text{s-data} : d>, <\text{s-y} : 0>)>)$.

Here, p is the LML program, which satisfies the predicate is-program introduced in section A.5. The object d satisfies the predicate is-data defined earlier in this section, and int-prog is an instruction which will be defined later.

The set of final states of the LML machine is the set $F = \{\xi : \text{is-state}(\xi) \ \& \ \text{s-c}(\xi) = \Omega\}$. A sequence $\xi_0, \xi_1, \ldots$, where $\xi_{i+1} \in \Lambda(\xi_i)$, is a computation of the LML machine. If, for some i, $\xi_i \in F$ then the computation terminates. (Every LML computation terminates).

A.7.2 The State Transition Function

With every instruction $in \in \widehat{\text{is-in}}$ is associated an interpreting function $\phi_{in}$. Let C be a non-empty control of a state $\xi$, and $\kappa$ a control selector of C. Let $\text{s-in} \circ \kappa(C) = in$, and let $ARG = \text{s-al} \circ \kappa(C)$ be the list of arguments of in.
Then ~in(ARG,$,K)
= i f PI(ARG,~)
then
gl
else if P2(ARG,~)
then
g2
then
gm'
es___~e 1 . else if Pm(ARG,~) where PI,P2,...,Pm
are p r e d i c a t e s
(m~l),
and gj has one of
two forms: (i)
For the case of c o n t r a c t i n g
control,
gj=~(~(~ (~;) ;{:TEri(C,K) }) ; <s-counter: where
eJo ande3 are objects.
deletes
the i n s t r u c t i o n
in,its
is-integer>,<s-dn:E~(ARG)>)
In this e x p r e s s i o n argument
the innermost
list and its return
207
information,
the m i d d l e ~ r e t u r n s the o b j e c t
p r e c e d i n g control
selectors,
EJ(ARG) o
to
and the o u t e r m o s t ~ alters
c o m p o n e n t s of the state o t h e r than the control. (ii)
For the case of expanding control,

gj = μ(ξ; <κ∘s-c: μ(ε^j(ARG); <s-ri: s-ri∘κ∘s-c(ξ)>)>),

where ε^j(ARG) satisfies the predicate is-c. In this case the inner μ associates the return information of κ(C) with the new control ε^j(ARG), and the outer μ replaces the control κ(C) with the new object thus created.

The Vienna Method uses a system of instruction schemata to define interpreting functions rather more concisely and in a more readable fashion than the above expressions. However, we shall not describe this feature, since it is feasible to define LML in the above manner.

It is now possible to define the state transition function:

Λ(ξ) = {q: q = φ_in(ARG,ξ,κ) & κ ∈ T(s-c(ξ)) & in = s-in∘κ∘s-c(ξ) & ARG = s-al∘κ∘s-c(ξ)}.

From this definition it is apparent that the state transition is determined by always carrying out the instruction associated with the terminal control of the state, namely the instruction occurring at the "deepest" level of the control (cf. examples in section A.7.1). In general (although not for LML), T(s-c(ξ)) will be a set containing more than one control selector. Hence our earlier remark that Λ(ξ) will in general be a set of states, rather than a single state. In such a case, it does not matter which of the terminal instructions is performed first.
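The transition rule described above can be illustrated with a small sketch. The following Python fragment is not part of the Vienna Method; it is a minimal model, under stated assumptions, of an automaton that always executes an instruction at a terminal node of the control tree. The names `Control`, `terminal_paths`, `node_at`, `step` and `run` are hypothetical, and so is the convention that a contracting instruction hands its value back by appending it to its parent's argument list (a simplification of the return-information mechanism).

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Control:
    """A node of a control tree: an instruction, its arguments, sub-controls."""
    instr: str
    args: list
    children: List["Control"] = field(default_factory=list)


def terminal_paths(control, path=()):
    """Yield the path (tuple of child indices) to every terminal node."""
    if not control.children:
        yield path
    for k, child in enumerate(control.children):
        yield from terminal_paths(child, path + (k,))


def node_at(control, path):
    """Follow a path of child indices down from the root."""
    for k in path:
        control = control.children[k]
    return control


def step(state, functions):
    """Return all successor states, one per terminal instruction."""
    control, data = state
    successors = []
    if control is None:                      # empty control: a final state
        return successors
    for path in terminal_paths(control):
        new_control = copy.deepcopy(control)
        new_data = dict(data)
        node = node_at(new_control, path)
        result = functions[node.instr](node.args, new_data)
        parent = node_at(new_control, path[:-1]) if path else None
        if isinstance(result, Control):      # expanding control (assumed form)
            if parent is None:
                new_control = result
            else:
                parent.children[path[-1]] = result
        elif parent is None:                 # contracting at the root
            new_control = None
        else:                                # contracting: delete, return value
            del parent.children[path[-1]]
            if result is not None:
                parent.args = parent.args + [result]
        successors.append((new_control, new_data))
    return successors


def run(state, functions):
    """Repeatedly take the first successor until the control is empty."""
    while state[0] is not None:
        state = step(state, functions)[0]
    return state


def store_y(args, data):
    """Contracting instruction that alters the rest of the state."""
    data["y"] = args[0]


# Hypothetical instruction set: "start" expands, the others contract.
functions = {
    "start": lambda a, d: Control(
        "updatey", [], [Control("sum", [a[2]], [Control("product", a[:2])])]),
    "product": lambda a, d: a[0] * a[1],
    "sum": lambda a, d: a[0] + a[1],
    "updatey": store_y,
}

final_control, final_data = run((Control("start", [2, 3, 4]), {}), functions)
```

In this sketch `final_data` holds the value of (2*3)+4 under the key `"y"`, and `final_control` is empty. Calling `step` on a control with two terminal nodes returns two successor states, which mirrors the remark that the state transition function yields a set of states in general.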
It is the specification of the interpreting functions of an interpreting automaton which assigns meaning to an abstract program.

A.7.3 Interpreting Functions for LML

We now complete the definition of LML by defining a set of interpreting functions for it. The instructions to be defined are as follows:

Instruction      Type          Domain
int-prog         expanding     is-program
int-mn           expanding     (is-integer)²
set-mn           contracting   is-integer
int-prog-list    expanding     is-rational-list
updatey          contracting   is-rational
product          contracting   (is-rational)²
sum              contracting   (is-rational)²

We assume that the binary arithmetic operators + and * are available. The remarks at the end of section A.6 apply to these.

(1)
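int-prog

φ_int-prog(p,ξ_0,I) = μ(ξ_0; <s-c: ε(p)>)

where ε(p) is the control tree:

[Tree diagram: a control with instruction (s-in) int-prog-list and argument list (s-al) s-rational-list(p), to which is attached a control with instruction int-mn and argument list (s-m(p), s-n(p)).]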
(2)
int-mn

φ_int-mn((x,y),ξ,κ) = μ(ξ; <κ∘s-c: ε(x,y)>)

where ε(x,y) is the control tree:

[Tree diagram: a control with instruction set-mn, to which is attached a control with instruction sum and argument list (x,y).]

(3)
set-mn

φ_set-mn(x,ξ,κ) = μ(δ(ξ; ...); <s-dn: μ(s-dn(ξ); <s-parno: x>)>)

Note: set-mn puts the value m+n into s-parno∘s-dn(ξ).

(4)
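int-prog-list

φ_int-prog-list(x,ξ,κ) = if s-counter(ξ) < s-parno∘s-dn(ξ)+2
                          then μ(ξ; ...)
                          else μ(ξ; ...)

where ε1(x) is the control tree:

[Tree diagram: a control with instruction int-prog-list and argument list s-tail(x), with attached controls containing the instruction product, with argument list (s-head(x), s-head∘s-list∘s-data∘s-dn(ξ)), and a further argument list (..., s-y∘s-dn(ξ)).]

and ε2(x) is the control tree:

[Tree diagram: a control with instruction updatey, to which is attached a control with instruction sum and argument list (s-y∘s-dn(ξ), s-head∘(s-tail)^i(x)), where i = s-i∘s-data∘s-dn(ξ).]

(5)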
updatey

φ_updatey(x,ξ,κ) = μ(δ(ξ; ...); <s-dn: μ(s-dn(ξ); <s-y: x>, <s-list∘s-data: s-tail∘s-list∘s-data∘s-dn(ξ)>)>, <s-counter: s-counter(ξ)+1>)

Note: updatey puts an intermediate value into s-y∘s-dn(ξ), brings the next data item to the top of s-list∘s-data∘s-dn(ξ), and increases s-counter(ξ) by 1.

(6)
product

φ_product((x,y),ξ,κ) = μ(δ(ξ; ...); ...) where τ ∈ ri(s-c(ξ),κ)

(7)
sum

φ_sum((x,y),ξ,κ) = μ(δ(ξ; ...); ...) where τ ∈ ri(s-c(ξ),κ)
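The effect of updatey on the state, as summarised in its note, can be sketched with a non-destructive update operator. The Python below is only an illustration under assumptions: states are modelled as nested dictionaries keyed by selector names, the helper `mu` is a hypothetical stand-in for the mutation operator, and the deletion of the instruction from the control is ignored.

```python
def mu(obj, changes):
    """Return a copy of `obj` with the listed components replaced."""
    return {**obj, **changes}


def updatey(x, state):
    """State changes made by updatey, ignoring the deletion from the control."""
    dn = state["s-dn"]
    data = dn["s-data"]
    # Bring the next data item to the top of the list (s-tail of s-list).
    new_data = mu(data, {"s-list": data["s-list"][1:]})
    # Put the intermediate value x into s-y.
    new_dn = mu(dn, {"s-y": x, "s-data": new_data})
    # Increase s-counter by 1.
    return mu(state, {"s-dn": new_dn,
                      "s-counter": state["s-counter"] + 1})


# Hypothetical example state with two data items.
before = {
    "s-counter": 1,
    "s-dn": {"s-y": 0, "s-parno": 2,
             "s-data": {"s-list": [5, 7], "s-i": 0}},
}
after = updatey(3, before)
```

Because `mu` builds new dictionaries rather than altering its argument, `before` is left unchanged, which matches the view of a state transition as a function from states to states.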
In order to clarify the above definitions, some of the steps in an LML computation are shown below. To save space, only the control and those parts of the state which have just changed are shown.
[State diagrams: the initial state, whose control holds int-prog; the next state, whose control holds int-prog-list and int-mn with argument list (m,n); ξ2, with int-prog-list, set-mn, and sum with argument list (m,n); ξ3, with int-prog-list and set-mn with argument m+n; ξ4, with int-prog-list, and m+n in s-parno of s-dn.]

If m+n > 0 then

[State diagrams: ξ5, with int-prog-list, updatey, and product with argument list (a1, y_{i-1}); ξ6, with updatey and sum with argument list (a1*y_{i-1}, 0); ξ7, with int-prog-list and updatey with argument (a1*y_{i-1}); the following state, with int-prog-list, s-counter = 2 and a1*y_{i-1} in s-y.]

A sequence like ξ5, ξ6, ξ7, ξ8 is now repeated until s-counter(ξ_i) = m+n+2, whereupon we get

[State diagrams: ξ_{i+1}, with s-counter = m+n+2 and a control holding updatey and sum with argument list (..., d_i); ξ_{i+2}, with a control holding updatey; ξ_{i+3}, with s-counter = m+n+3 and y_i in s-y.]
ξ_{i+3} ∈ F, so the computation has terminated. Its result is available in s-y∘s-dn(ξ_{i+3}).

It should be remarked that the above definitions of the LML instructions are not quite complete. There are restrictions on an LML program which cannot be expressed in the earlier specifications of concrete and abstract LML. These restrictions are that the number of parameters and the number of data items be compatible with the values of m and n, and that the value of i should not exceed N, the length of the look-up table. These conditions can be expressed in the definitions of the LML instructions. This is simply done by entering an error state if any of these are violated. In languages like Algol, this technique can also be used to specify the context-sensitive restrictions, which cannot be expressed in the context-free grammar (see (49)).
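The idea of checking such restrictions at execution time, and entering an error state when one is violated, can be sketched as follows. This Python fragment is an assumption-laden illustration, not the LML definition: the function `run_lml`, the exception `LMLError` (standing in for the error state), the split of the rational list into m+n parameters followed by a look-up table, and the zero-based index i are all hypothetical choices made for the example.

```python
from fractions import Fraction


class LMLError(Exception):
    """Plays the role of the error state of the interpreting automaton."""


def run_lml(m, n, rationals, data, i):
    """Evaluate a linear model, checking the restrictions before computing.

    Assumed layout: the first m+n entries of `rationals` are parameters,
    the remaining entries form the look-up table indexed by i.
    """
    parno = m + n
    if len(rationals) < parno:
        raise LMLError("too few parameters for the given m and n")
    if len(data) < parno:
        raise LMLError("too few data items for the given m and n")
    params, table = rationals[:parno], rationals[parno:]
    if not 0 <= i < len(table):
        raise LMLError("i exceeds the length of the look-up table")
    y = Fraction(0)
    for a, d in zip(params, data):
        # One pass of product, sum and updatey in the trace above.
        y = y + Fraction(a) * Fraction(d)
    # Final sum with the table entry selected by i.
    return y + Fraction(table[i])


result = run_lml(1, 1, [Fraction(1, 2), 2, 10, 20], [4, 3], 1)
```

With these inputs the parameters are 1/2 and 2, the data items are 4 and 3, and the selected table entry is 20, so `result` equals 28. Supplying fewer parameters than m+n raises `LMLError` instead of producing a value.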
A.8 Summary

The Vienna method of defining programming languages has been described. This method includes the formal definition of the semantics of a language, and is sufficiently powerful to be used for the definition of practical programming languages. It has been used here for the definition of the simple and special-purpose Linear Model Language. This has been done both to illustrate the method, and in order to make familiar a rather broader notion of "programming language" than is usual.

The Vienna Method of language definition is used in chapter 5 to formalise the notion of a "fragment" of a language.
APPENDIX B

Syntax of the AlgolW-Support of the Gas-Furnace Models

This appendix contains the concrete syntax of the AlgolW-support of the five models of section 6.3.2. It is based on the AlgolW syntax specification given in (50). The numbers in brackets to the right of subheadings indicate the relevant sections of (50), in order to facilitate comparison. Standard procedure statements are new non-terminals which do not appear in (50) (cf. sec. 6.3.1). The symbol "t" may be replaced by either "real" or "integer", in accordance with the rules specified in sections 1.1, 1.5, 1.5.3, and 1.6.2 of (50).

1. Identifiers (1.2)

::=
::=
::=
<standard procedure identifier>::=READ|READON|WRITE
::=E|I|J|N|U|V|W|Y|Z
::=0|1|2|3|4|5|6|7|8|9
(Note: each of these appears in

::=|,
(1.3.1)
::=
::=.|
.
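The restricted alphabet of section 1 can be made concrete with a small recogniser. The Python sketch below rests on an assumption that is not visible in this copy of the grammar: that an identifier is a letter followed by any number of letters or digits. The names `LETTERS`, `STANDARD_PROCEDURES`, `IDENTIFIER` and `classify` are hypothetical.

```python
import re

# Letters and standard procedure identifiers as listed in section 1.
LETTERS = "EIJNUVWYZ"
STANDARD_PROCEDURES = {"READ", "READON", "WRITE"}

# Assumed rule: a letter followed by letters or digits.
IDENTIFIER = re.compile(f"[{LETTERS}][{LETTERS}0-9]*")


def classify(token):
    """Return the syntactic class of a token, or None if it fits neither."""
    if token in STANDARD_PROCEDURES:
        return "standard procedure identifier"
    if IDENTIFIER.fullmatch(token):
        return "identifier"
    return None
```

Under this assumption `classify("Y1")` gives "identifier", `classify("READON")` gives "standard procedure identifier", and `classify("X")` gives None because X is not among the permitted letters.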
::=|
(1.4)
<declaration>::=<simple variable declaration>|

3.1 Simple Variable Declarations (1.4.1)

<simple variable declaration>::=<simple type>
<simple type>::=INTEGER|REAL

3.2 Array Declarations (1.4.2)

::=<simple type>ARRAY()
::=
::=::
::=
::=

4. Expressions (1.5)

::=<simple t expression>

4.1 Variables (1.5.1)

<simple t variable>::=|
::=<simple t variable>
::=(<subscript list>)
<subscript list>::=<subscript>
<subscript>::=
4.2 Arithmetic Expressions (1.5.3)

<simple t expression>::=|<simple t expression>+|<simple t expression>-
::=|*
::=
::=|

4.3 Logical Expressions (1.5.4)

::=
::=<simple t expression> <simple t expression>
::=<

5. Statements (1.6)

<program>::=. (Note: we do not provide a specification of the syntax of ).
<statement>::=<simple statement>||
<simple statement>::=||<standard procedure statement>

5.1 Blocks (1.6.1)

::=<statement>END
::=|<statement>;
::=BEGIN|<declaration>

5.2 Assignment Statements (1.6.2)

::=
::=:=

Procedure Statements (cf. 1.6.3 and 1.6.8)

<standard procedure statement>::=<standard procedure identifier>()
<... list>::=<subscript>

(1.6.5)

::=<... clause><simple statement>ELSE<statement>
::=

5.5 Iterative

::=<statement>
::=FOR:=<... value>UNTIL