The Modelling of Systems with Small Observation Sets


Author: J.M. Maciejowski


Lecture Notes in Control and Information Sciences Edited by A.V. Balakrishnan and M.Thoma

10 Jan M. Maciejowski

The Modelling of Systems with Small Observation Sets

Springer-Verlag Berlin Heidelberg New York 1978

Series Editors
A. V. Balakrishnan • M. Thoma

Advisory Board
A. G. J. MacFarlane • H. Kwakernaak • Ya. Z. Tsypkin

Author
Dr. Jan Marian Maciejowski
Maudslay Research Fellow, Pembroke College, Cambridge
also with the Control and Management Systems Group, Cambridge University Engineering Department
Mill Lane, Cambridge CB2 1RX, England

ISBN 3-540-09004-5 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-09004-5 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

© by Springer-Verlag Berlin Heidelberg 1978
Printed in Germany

SUMMARY

The problem of assessing models of systems, when only small sets of observations are available, is introduced and discussed. System identification is defined as the progression from a set of observations of the behaviour of a system to a theory which accounts for that behaviour. The concepts of algorithmic information theory are drawn on to develop a characterisation of modelling which constitutes a partial solution to the problem of system identification, while taking account of the size of the set of available observations.

A model is defined to be an algorithm for computing the output observation set of a system, expressed as a computer program. A criterion of the quality of a model, "information gain", is proposed, and its consistency with more conventional criteria is discussed. Information gain is a suitable criterion for a wide class of models, including nonlinear dynamical stochastic models, and its computation is straightforward. The use of information gain for the assessment of rival models is demonstrated. It is proved that no "universal modelling algorithm" can exist, in the sense that it is not possible, in general, to find the model with the highest information gain.

The calculation of information gain requires the a priori choice of a programming language. This choice is associated with the modeller's beliefs about the system. It is shown that this a priori choice becomes insignificant as the observation sets become large.

A detailed investigation of programming languages shows that it is possible to speak precisely of "the smallest language" required to run a particular program. A priori knowledge assumed about a system can therefore be considered to be defined by the smallest language required to run the model.

Finally, the effect on model assessment of the manner in which system observations are coded is examined. It is found that a "safe" coding exists, which often leads to the same assessment as would the use of most other codings.

ACKNOWLEDGEMENTS

The idea of examining modelling in the light of algorithmic information theory is due to Professor A.G.J. MacFarlane. His constant encouragement and enthusiasm, as well as detailed criticism, has been an essential ingredient of this work.

I have also benefited from discussions with many members of the Control and Management Systems Group, of whom Dr. F.P. Kelly, Dr. M.B. Beck and Dr. S.R. Watson deserve special mention. The quotation from Newton in the last chapter was pointed out to me by Dr. A.T. Fuller.

Financial support for this research came from the Science Research Council, and in the final stages from Pembroke College.

Roberta Hill has produced her usual excellent standard of typing, but special thanks are due to her for struggling so successfully through chapter 5.

My wife has asked me not to write one of those embarrassing acknowledgements, saying how impossible this research would have been without her constant encouragement and support; consequently I shall leave this to the reader's imagination.

CONTENTS

1. Introduction
2. Survey of Related Work
3. A Characterisation of Modelling
4. Incorporation of A Priori Knowledge
5. Fragments of Programming Languages
6. h-Comparability
7. Table Look-Up Codings
8. Discussion and Conclusion

References

Appendices:
A  Formal Semantics of Programming Languages
B  Syntax of the Algol W-Support of the Gas-Furnace Models
C  Table Look-Ups for the Gas-Furnace Models

Diagrams

1. INTRODUCTION

1.1 Motivation

The areas in which the scientific method has been demonstrably and spectacularly successful are characterised by the possibility of performing experiments, or making observations, more or less freely whenever these are deemed desirable. The result of this has been that explicit consideration of the size of the set of observations from which a model is hypothesised, and to which a model is fitted, has been neglected. Any doubts which arise about the model can be resolved by further experimentation and observation.

This pleasant property increasingly disappears as one enters the domains of complex industrial processes, environmental control systems, management systems, and socio-economic systems. The work described here aims to clarify the relationship between the smallness of the available observation sets for such systems and the degree of usefulness of the models obtained for them.

Until recently, the class of models which could be used in scientific investigations was restricted by a very practical consideration. The behaviour of the model had to be understood, and that understanding could only be obtained from the theory of the model. The model was constrained to be sufficiently simple for theoretical investigation to be possible.

The

availability of the computer has changed this situation radically. It is now possible to investigate the behaviour of a model by simulation, with hardly any theoretical understanding of it. Consequently this constraint on the complexity of useful models has been removed, or at least greatly relaxed: it is possible to postulate a model of complicated structure, to observe its simulated behaviour, and to adjust the details of the model until its simulated behaviour resembles the behaviour of the system being

investigated. When

is such

understanding be used the

of h o w

some

light

investigate

say how

to how

models

good

the

an i s o l a t e d

Why should

the details ability

model

a simulation

above not be u s e f u l

system behaviour,

indicate

the q u a l i t y

any can it

in this

A further

with

thesis aim

is

of rival m o d e l

connected

in

system

of the thesis

to d i s t i n g u i s h

assessment,

between

the a b i l i t y

to

is.

model

or r e l i a b l e ?

observed

When

of the same

Most

is i n t i m a t e l y

it give

reported

on t h e s e q u e s t i o n s .

how r i v a l m o d e l s

that

does

the s y s t e m w i l l b e h a v e

of the w o r k

concerned with

b u t it is clear

When

really works?

s h o u l d be assessed.

ostensibly

competing

guide

The p u r p o s e

is to t h r o w

behaviour

useful?

the s y s t e m

as a r e l i a b l e

future?

is to

a model

of the type d e s c r i b e d If it r e p r o d u c e s

is that not s u f f i c i e n t

of the m o d e l ?

In fact,

the

evidence

is it not

to

clear

that

the b e t t e r

the b e t t e r

the

the m o d e l ?

is the p o s s i b i l i t y complexity checked

against

the

time.

clear

the only

is no m o r e value.

agrees w i t h model

of some v a r i a b l e

no o t h e r

that v a l u e s

the v a l u e

in some

taken,

model,

then

It n e v e r

observations, prediction

also

confidence

amounts

assessment

say,

w o u l d be

than

confidence

increases

little

(which does

in

value

but

predictions, measurements of the

very quickly. after

doubt not

in the

to say

the p r e d i c t i o n s

of course,

have

is

any o t h e r

If further

in the m o d e l

third

it is

of c o n f i d e n c e

are b e t t e r

agree w i t h

correct

then


guesses.

one w o u l d

at some

It

is taken w h i c h

of the model,

to certainty,

it.

The p r e d i c t e d sense)

by the m o d e l

than m e r e

that

time of the v a r i a b l e

is nil.

increases.

and these

about

of the two o b s e r v a t i o n s ,

the p r e d i c t i o n

sense,

its

at two d i f f e r e n t

of the v a r i a b l e

with

reasonable

predicted

since

Suppose

information

if a third m e a s u r e m e n t

immediately

reason

and it is b e i n g

example.

(in an i n t u i t i v e

However,

The b a s i c

the model,

simple

of the m o d e l

likely

behaviour,

set of data.

are t a k e n

on the b a s i s

that

is no.

unconstrained,

following

to p r e d i c t


that

imply

only

ten

the next

that it

be). The

model

answer

If a linear v a r i a t i o n

proposed,

of the o b s e r v e d

"overfitting"

and that w e have

is d e s i r e d

would

of

a small

two m e a s u r e m e n t s

are

Our

is r e l a t i v e l y

Consider

times,

reproduction

confidence

clearly

which

depends

one is w i l l i n g

on the d i f f e r e n c e

to ascribe between

to this

the n u m b e r

of o b s e r v a t i o n s observations

required

the a v a i l a b l e then we have situation

of a r b i t r a r y

number

it "explains"

no

to c o n s t r u c t

that

by s a y i n g

then w e

if the n u m b e r

about

it fit the o b s e r v a t i o n s , i s

of o b s e r v a t i o n s ,

the model, This

that

have been m a d e

of

If all of

in its p r e d i c t i o n s .

also be d e s c r i b e d decisions

the model.

are used

confidence

to m a k e

and the n u m b e r

to c o n s t r u c t

observations

can

in o r d e r

which

the m o d e l ,

the same

have no c o n f i d e n c e

as~the

in the

model.

This point was made succinctly by Poincaré, when he dismissed Jeans' classical explanation of the ultraviolet catastrophe and the specific heat of solids (1):

"It is obvious that by giving suitable dimensions to the communicating tubes between his reservoirs and giving suitable values to the leaks, Jeans can account for any experimental results whatever. But this is not the role of physical theories. They should not introduce as many arbitrary constants as there are phenomena to be explained; they should establish connections between different experimental facts, and above all they should allow predictions to be made."

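Poincaré's objection to models with as many arbitrary constants as phenomena can be illustrated numerically. The following is a minimal sketch and not part of the original text: the data values, the held-out measurement and the function names are all invented for illustration. A straight line (two constants) is compared with a polynomial passing through every observation (six constants).

```python
# Illustrative sketch with invented data: a model with as many arbitrary
# constants as observations reproduces them exactly, yet predicts badly.

def fit_line(ts, ys):
    """Least-squares straight line: two adjustable constants."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys)) / sum(
        (t - t_mean) ** 2 for t in ts
    )
    intercept = y_mean - slope * t_mean
    return lambda t: intercept + slope * t


def fit_interpolant(ts, ys):
    """Lagrange polynomial through every point: len(ts) adjustable constants."""
    def predict(t):
        total = 0.0
        for i, (ti, yi) in enumerate(zip(ts, ys)):
            weight = 1.0
            for j, tj in enumerate(ts):
                if j != i:
                    weight *= (t - tj) / (ti - tj)
            total += yi * weight
        return total
    return predict


# Six hypothetical measurements, roughly linear in time, plus one held back.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
values = [1.0, 3.1, 4.9, 7.2, 8.8, 11.1]
held_out_time, held_out_value = 6.0, 12.9

line = fit_line(times, values)
interpolant = fit_interpolant(times, values)

worst_fit_line = max(abs(line(t) - y) for t, y in zip(times, values))
worst_fit_interp = max(abs(interpolant(t) - y) for t, y in zip(times, values))
error_line = abs(line(held_out_time) - held_out_value)
error_interp = abs(interpolant(held_out_time) - held_out_value)

print(f"worst fitting error: line={worst_fit_line:.3f}, interpolant={worst_fit_interp:.3f}")
print(f"prediction error:    line={error_line:.3f}, interpolant={error_interp:.3f}")
```

The interpolant has no fitting error at all, because every observation was used up in fixing its constants, and its prediction of the seventh measurement is far worse than that of the line, which leaves four observations "unspent".
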
On the other hand, the accuracy with which the model reproduces the observed behaviour is clearly significant. If only a slight increase in complexity results in a large increase in accuracy, then in some sense fewer "arbitrary constants" have been added to it than the additional "number of phenomena" which it now explains. What is required for model assessment is some "trade-off" between the complexity of a model and its accuracy.

A prerequisite

for this a wide

is a m e a s u r e

class

appears

casting

of m o d e l s

in such

of fit of m o d e l

behaviour

to the o b s e r v e d

is the

as a c o m p o n e n t is thus

a suitable

of m o d e l

achieved

assessment

qrthodox would

be

f r o m a small

approach

a form,

in

that behaviour

The r e q u i r e d

model

class,

ment problem

as a s t a t i s t i c a l has

of the

complexity

in

indeed been

follow

of m o d e l s

some

statistical

to f o r m u l a t e

decision

the a s s e s s -

problem.

investigated,

such

of m o d e l

assessment

then be p o s s i b l e

type e n c o u n t e r e d

We do not

the

and to p o s t u l a t e

It m a y

of a p p r o a c h

to the p r o b l e m

to e x a m i n e

framework.

(5).

introduced

complexity.

by a s s e s s i n g

to

manner.

A more

models

is a p p l i c a b l e

innovation

trade-off

chosen

of models.

which

A major

this w o r k poorness

of c o m p l e x i t y

even

in c o n t r o l

an a p p r o a c h

This

type

for d y n a m i c a l

studies for the

(2)(3)(4) following

reasons. Any m e t h o d w i l l be

arrived

appropriate

(such as l i n e a r

Such

compared

investigated market,

for a n a r r o w

(statistical)

corrupted

a method will

are b e i n g

only

difference-equation

set in a p a r t i c u l a r "observations

at from s t a t i s t i c a l

by w h i t e ,

n o t be u s e f u l - for e x a m p l e ,

is the b e h a v i o u r

it may be d e s i r e d

Forrester's

"Industrial

class

of m o d e l s

models,

for e x a m p l e ) ,

environment Gaussian,

(such

firms

models

being in some

a model based

techniques

noise").

different

if the s y s t e m

to c o m p a r e

as

additive

if two very

of c o m p e t i n g

Dynamics"

considerations

on

(6) w i t h

a model

which

uses

market's

game

theory

firms'

elements. usually

simulation

When

the p r o b a b i l i t y Furthermore, economic

difficult

when

under

conditions.

few o b s e r v a t i o n s

and there

is little

the

statistical

specification

may

i t s e l f be very u n c e r t a i n .

by n o t a s s u m i n g conclusions These fruitful

it to be known;

considerations

to i n v e s t i g a t e

by a p a i n s t a k i n g

three

of r e l e v a n t are

about

it,

environment little

is lost

misleading

indicate

that

by m a k i n g

the g e n e r a l

and d i f f i c u l t

it may be m o r e of m o d e l s

of complex,

as few a s s u m p t i o n s situation,

analysis

rather

as

than

of each m o d e l

as it arises.

Overview

We

case

in fact,

the a s s e s s m e n t

systems

and e x a m i n i n g

structure,

knowledge

these,

may be avoided.

understood

possible

behaviours

of a s y s t e m

of the s y s t e m ' s In this

(8).

and s o c i o -

stationariness

a priori

of

When modelling

processes. available,

it is

and i m p o r t a n t

to assume

Finally,when

nonlinear

variables

environmental

it may not be a p p r o p r i a t e

1.2

and the

the e v o l u t i o n

of r e l e v a n t

interesting

transient

contain

also d y n a m i c a l ,

to d e s c r i b e

investigating the m o s t

often

are

distributions

systems,

occur

models

such m o d e l s

extremely

poorly

actions

responses.

Realistic

often

(7) to e x p l a i n

develop

of A p p r o a c h

and Results.

a characterisation

"components":

the s y s t e m

of m o d e l l i n g

to be m o d e l l e d ,

which

has

a model

of

this system, The

and a c r i t e r i o n

system

pair of sets

of q u a l i t y

to be m o d e l l e d

of o b s e r v a t i o n s are

and accuracy,

observation

Each

each

therefore

discrete-time that this

of d a t a detail

does n o t

time,

of this

become

of such

reflects

evident

a system

to

the r e a l i t i e s

be d e f i n e d

in m o r e

which

implies

compute

a reversed

time

obtained.

exercise

interest,

ordering.

functions

defined

These

It only

are u s e l e s s

in a n e w s i t u a t i o n exercise),

as a r e f e r e n c e ,

with

will

of a p a r t i c u l a r

subsets

to a d m i t

be of m u c h

of the m o d e l l i n g

to m o d e l s

which

is b r o a d e n o u g h

system may behave

serve

the o u t p u t

a lack of any


Any r e s t r i c t i o n

onto

not n o r m a l l y

observations

the goal

by s p e c i f y i n g

The

is any a l g o r i t h m w h i c h maps

or even

h o w the

type w i l l

the success

It m e r e l y

interpretation

algorithms

on the p a r t i c u l a r

(presumably

finite.

it w i l l

the m o d e l s

definition

would

such as those w h o s e

for d e d u c i n g

resolution

a set of d i s c r e t e - s t a t e ,

of the o b s e r v a t i o n s

which

allows

limited

to be r a t i o n a l .

to be

However,

system

This

of

and output.

is a s s u m e d

A system will

of the

observations.

direction

by a

1.3.

subsets

algorithms

like

category.

in sec.

input

is a s s u m e d

constrain

collection.

A model certain

looks

to be d e f i n e d

obtained with

measurements.

be of the same

also

always

set of o b s e r v a t i o n s

system

is t a ke n

of its

Since m e a s u r e m e n t s

of the model.

but models

respect

to w h i c h

be assessed. type


is a c c o m p l i s h e d lie

in the

domain

of the a l g o r i t h m ,

observations

are

deterministic

successive

successive

outputs,

the W i e n e r

- Kolmogorov

blocks

of i n p u t

elements

to be the c o r r e s p o n d i n g

For example, n e e d o n l y map

and w h i c h

images.

difference

blocks

whereas

of the o u t p u t

of input

stochastic

or K a l m a n

and p a s t o u t p u t

equation

models

observations

predicting

types m u s t map

observations

to

models

of

successive

to s u c c e s s i v e

outputs. The

term

program".

Thus

the o u t p u t specified

"algorithm" we

think

observations, subsets

may be

interpreted

of m o d e l s and these

as p r o g r a m s programs


task.

This

it w e r e

not

for the p o w e r

of C h u r c h ' s

states

that

any p r o c e d u r e

which

notion

of an " a l g o r i t h m "

equivalent hence

viewpoint

the m o d e l

some p r o g r a m m i n g taken

to be the

the n u m b e r the p r o g r a m which

have

is w r i t t e n

shortness

is a m e a s u r e

the o u t p u t

criterion

program

in

of q u a l i t y

is

as m e a s u r e d

with which

the o b s e r v a t i o n s

The

length

of a r b i t r a r y

to the p r o g r a m m i n g Furthermore,

observations were

and

program.

of the n u m b e r

to c o m p u t e

(9), w h i c h

of a l g o r i t h m s ,

in the program.

the model.

if

in any one of the

of that p r o g r a m ,

(relative

in this

the i n t u i t i v e


the

them

arbitrary,

Thesis

theory


of c h a r a c t e r s

in c o n s t r u c t i n g

of the

lanaguage,

been m~e

be e x c e s s i v e l y

satisfies


may use the

to help

can be e x p r e s s e d

formalisations

can be e x p r e s s e d When

would

as " c o m p u t e r

originally

of

decisions

language)

a model

exactly

by

is r e q u i r e d

(to the a c c u r a c y made).

In order to do this, the model must generate internally those terms which would conventionally be thought of as "fitting errors". Since the programming language has a finite number of terminals, the length of the model increases when these terms increase. The criterion of quality thus incorporates a particular trade-off between complexity and approximation.
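The way in which fitting errors lengthen a model can be made concrete with a small experiment. This is only a sketch under stated assumptions: Python stands in for the programming language, the observations are invented integers, and the helper names `model_program` and `run` are hypothetical. Each generated program consists of a rule plus a table of the corrections needed to reproduce the observations exactly, and its length is counted in characters.

```python
import contextlib
import io


def model_program(rule: str, observations: list) -> str:
    """Build Python source that prints `observations` exactly.

    `rule` is an expression in the time index `t`. Whatever the rule fails
    to explain is stored verbatim as a table of corrections `e`.
    """
    residuals = [y - eval(rule, {"t": t}) for t, y in enumerate(observations)]
    n = len(observations)
    return f"e={residuals}\nprint([{rule}+e[t] for t in range({n})])"


def run(source: str) -> str:
    """Execute a generated program and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# Invented observations: a linear trend with small disturbances, and the
# same trend with disturbances a thousand times larger.
noise = [0, 1, 0, -1, 0, 0, 1, 0, 0, -1, 0, 0]
observations = [3 * t + 7 + d for t, d in enumerate(noise)]
rougher = [3 * t + 7 + 1000 * d for t, d in enumerate(noise)]

no_rule = model_program("0", observations)            # corrections carry everything
linear_rule = model_program("3*t+7", observations)    # rule explains the trend
linear_rule_rougher = model_program("3*t+7", rougher)

for name, source in [("no rule", no_rule), ("linear rule", linear_rule)]:
    print(f"{name}: {len(source)} characters")
```

Both programs print the same observations, but the one containing the linear rule is shorter because its corrections are small; applied to the rougher data the same rule needs larger corrections and the program grows again.
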

The above characterisation of modelling is explained in more detail in Chapter 3. Support for it is given in section 2.2. The essence of this support is that the length of the shortest program required to compute a sequence displays properties analogous to the properties of the entropy associated with a probability space. In particular, a long sequence which requires a maximally long program to compute it passes every effective test for randomness (asymptotically, with probability 1). This suggests that the amount by which it is possible to "compress" the program (model) required to compute a set of observations (system) represents the amount of information which it has been possible to extract from the observations. If the only model which has been found is one that merely reads out the observations from a look-up table, then no "compression" has been achieved, and such a model conveys no information about the observations.

A consequence of our characterisation is that no algorithm can exist for finding the best model (according to the above criterion of quality) of an arbitrary system.
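The quantities referred to in the preceding paragraphs can be written down in the standard notation of algorithmic information theory. The block below is a hedged sketch, not the author's own formulation: the symbols $K_L$, $c_L$ and $c_{L,L'}$ are the usual textbook ones, and the "compression" of a model $p$ is an illustrative quantity introduced here, not the definition of information gain given in Chapter 3.

```latex
% Length of the shortest program, written in language L, that outputs y:
\[
  K_L(y) \;=\; \min\{\, |p| \;:\; p \text{ is a program in } L \text{ that outputs } y \,\}
\]
% A look-up table always works, so the shortest program is never much longer
% than the observations themselves (c_L depends on L but not on y):
\[
  K_L(y) \;\le\; |y| + c_L
\]
% Illustrative measure of what a particular model p has achieved:
\[
  \text{compression}(p, y) \;=\; \bigl(|y| + c_L\bigr) - |p|
\]
% Changing the language changes program lengths by at most a constant,
% which becomes negligible as the observation set grows:
\[
  \bigl| K_L(y) - K_{L'}(y) \bigr| \;\le\; c_{L,L'}
\]
```

The function $K_L$ is known not to be computable, which is consistent with the statement above that no algorithm can find the best model of an arbitrary system.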

The choice of programming language to be used, for assessing the quality of a model, can be viewed as the specification of "what is to be taken for granted". It should therefore be made in the light of the modeller's a priori knowledge about the system, and of the purposes of the modelling exercise. In Chapter 4 this connection is examined more closely. It is shown that, if the observation sets are large enough, then the results of model assessment are independent of the choice of programming language. This can be interpreted to mean that the modeller's a priori beliefs become less significant as the set of observations available to him grows.

Nevertheless, the assessment of models of small observation sets is dependent on the modeller's specification of his a priori beliefs. Consequently such an assessment cannot be taken to be definitive. However, this is mitigated by the fact that the modeller does not need to choose between mutually exclusive sets of a priori beliefs: he can stipulate programming languages which imply a greater or smaller state of knowledge.

Several different models, even when written in the same language, will rarely use exactly the same features of that language. It is therefore questionable whether a comparison of their lengths gives a measure of their complexity relative to the same set of assumptions. Chapters 5 and 6 resolve this difficulty. Chapter 5 develops a formal equivalent of "a program makes use of such-and-such facilities of a language". A prerequisite for this is a formal method of defining the semantics of programming languages. One such method is outlined in Appendix A. Chapter 6 then uses the concepts developed in Chapter 5 to specify some conditions under which models may be meaningfully compared. It is demonstrated that model assessment is not much affected if these conditions are not met exactly.

The details of the complexity/approximation trade-off, which is inherent in our proposed method of model assessment, depend on the precise manner in which the observations are coded in the programming language. It is convenient to separate this aspect of the selection of a suitable programming language from those aspects considered in Chapter 4; consequently the coding of observations is discussed in Chapter 7. A distinguished minimal coding is shown to exist, and it is argued that this is a natural coding to use for model assessment.

The modelling of one particular system (Box and Jenkins' gas-furnace data (10)) is used as an example throughout. The rival models considered for this system are very simple and in no way represent the range of possibilities discussed in sec. 1.1. Nevertheless, the considerations raised there apply even to these simple models, as will be seen in Chapter 3. It will become apparent that the assessment method proposed in this thesis is immediately applicable to a much larger class of models.

1.3 System Identification, Realisation and Modelling

Modern developments of systems theory emphasise the notion of a dynamical system as an abstract summary of experimental data (11), (12), (13). Modelling is concerned with the inference of system behaviour under not yet observed conditions, from observations of past behaviour under known conditions. The method by which this is achieved is the postulation of abstract structures which are compatible with the observations, and the selection, from these candidate structures, of one which is preferred on the basis of some criterion. The modern view of a system, with its heavy emphasis on abstract structures, is therefore a more natural one to adopt, when discussing modelling, than the older view of a system as "an interconnection of components".

However, if a system is to be modelled, and the success of the modelling is to be gauged by reference to the observations, then as little structure as possible should be imposed upon it, before modelling has begun. Consequently, we adopt the following definition:

Definition (1.3.1)

A system $S$ is defined to be an ordered pair of sets of observations, $S = (U, Y)$, where:

(i) $U = (u_1, u_2, \ldots, u_M)$ and $Y = (y_1, y_2, \ldots, y_N)$ are the input and output observation sets respectively; $u_i = (u^i_1, u^i_2, \ldots, u^i_{l_i})$ and $y_i = (y^i_1, y^i_2, \ldots, y^i_{m_i})$ are ordered sets of observations carried out at time $t_i$, where $t_1, t_2, \ldots, t_N$ is the natural time ordering; $u^i_j \in \{\text{rationals}\} \cup \{b\}$, where $b$ (blank) denotes a missing observation; similarly for $y^i_j$;

and

(ii) with the convention that $(b, b, \ldots, b) = b$: if $u_i = b$ then $l_i = 0$; if $y_i = b$ then $m_i = 0$; if $u_i \neq b$ then $u^i_{l_i} \neq b$; if $y_i \neq b$ then $y^i_{m_i} \neq b$; and $y_N \neq b$.

Conditions (ii) serve only to ensure that adding on a set of blanks (missing observations) does not create a new system.
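Definition (1.3.1) translates fairly directly into a data structure. The sketch below is an illustration only: it assumes rationals are held as `fractions.Fraction` values and the blank $b$ as `None`, and the names `strip_trailing_blanks` and `canonical_system` are hypothetical. It shows how conditions (ii) make two descriptions that differ only by appended blanks denote the same system.

```python
from fractions import Fraction

BLANK = None  # plays the role of the blank symbol b


def strip_trailing_blanks(observations):
    """Return the observations as a tuple without blanks at the end.

    A set consisting only of blanks collapses to the empty tuple, which
    stands for b itself (the convention (b, b, ..., b) = b).
    """
    trimmed = list(observations)
    while trimmed and trimmed[-1] is BLANK:
        trimmed.pop()
    return tuple(trimmed)


def canonical_system(inputs, outputs):
    """Bring a pair of observation sets into the form fixed by conditions (ii)."""
    U = tuple(strip_trailing_blanks(u) for u in inputs)
    Y = [strip_trailing_blanks(y) for y in outputs]
    while Y and Y[-1] == ():  # enforce y_N != b
        Y.pop()
    return U, tuple(Y)


half, two, three = Fraction(1, 2), Fraction(2), Fraction(3)

plain = canonical_system(
    inputs=[(half, three), (two,)],
    outputs=[(three,), (half, BLANK, two)],
)
padded = canonical_system(
    inputs=[(half, three, BLANK), (two, BLANK, BLANK)],
    outputs=[(three, BLANK), (half, BLANK, two), (BLANK, BLANK)],
)
source_only = canonical_system(inputs=[], outputs=[(three,)])  # a system (b, Y)

print(plain == padded)
```

A blank in the middle of an observation set, such as the second output above, is kept: it records a genuinely missing measurement rather than padding.
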

For concreteness, we have specified that $u_i$, $y_i$ refer to observations made at time $t_i$, since we are interested primarily in dynamical models. However, this interpretation is not essential. Also, each $u_i$, $y_i$ could be a multidimensional finite array of observations, rather than a one-dimensional array, without affecting later results.

The input observation set is allowed to be empty, in order to admit devices such as noise generators and oscillators, as systems of the form $(b, Y)$. It has been argued that when stating the general problem of system identification, it should not be necessary to distinguish between input and output (14). The two should be lumped together as a "system behaviour", and the task of system identification should

~4

include

the

it seems the

two

separation

essential cases

shown

and

internal

structures

procedure inputs

must

have

sets.

The

f r o m the sets

lead

form

Our

especially field

difference

a system

define

of

cc. c e ~ n ~ d r

however,

interaction have

is s o m e

cbservaLions concise

referred we prefer the

with

with

to above

o f its

the

set of observations

of observation

themselves and

systems

(b, Y). seem odd,

theory.

by

by

In t h i s

a set

of

examining

equations. process.

We We

of a system

hehaviour.

"laws"

- such

"explain"

The as t h e

this

set of equations as

a "system".

reason

the o b s e r v a t i o n

a system

reverse

this

are

assume

because

eD~TircFme~.t - 'I o t h e r w o r d s ,

- which

to regard

control

the e x i s t e n c e

set of

are

for

the

observations

at f i r s t

of these

the

its

that

a

identification

unless

pair

input

its b e h a v i o u r

solutions

of

with

to define

properties

aware

may

any

It is

t h a t U # b)

familiar

and investigate

we

a system,

"system"

equations,

its

Note

different

between

of both

the

between

labelled

very

But

as an o r d e r e d

distinguishes

i t is c o n v e n t i o n a l

the

point.

distinguished.

of

to t h o s e

t h a t ?,e are

the

same model

(b, U) ( p r o v i d i n g

definition

boxes

-

observations.

U a n d Y, w h i c h

o f the

The black

consider

However,

of distinguishing

to h a v e

are

ordering

"output".

can be expected

to the

defined

output

i.

and an earthing

and outputs

that we

and

a means

in Fig.

"sink"

generator

"input"

to have

"source"

signal

of

goal

of because

of modelling

set of equations

interaction. as a " m o d e l " ,

Hence and

The definition of "system" which is proposed above is much cruder than the definitions usually encountered. It is worth stating in full one such definition - that of Kalman, Falb and Arbib (11):

Definition (1.3.2)

A dynamical system (input/output sense) is a composite mathematical concept defined as follows:

(a) (i) There is a given time set $T$, a set of input values $U$, a set of acceptable input functions $\Omega = \{\omega : T \to U\}$, a set of output values $Y$, and a set of output functions $\Gamma = \{\gamma : T \to Y\}$.

(ii) (Direction of time). $T$ is an ordered subset of the reals.

(iii) The input space $\Omega$ satisfies the following conditions:

(1) (Nontriviality). $\Omega$ is nonempty.

(2) (Concatenation of inputs). An input segment $\omega(t_1,t_2)$ is $\omega \in \Omega$ restricted to $(t_1,t_2) \cap T$. If $\omega, \omega' \in \Omega$ and $t_1 < t_2 < t_3$, there is an $\omega'' \in \Omega$ such that $\omega''(t_1,t_2) = \omega(t_1,t_2)$ and $\omega''(t_2,t_3) = \omega'(t_2,t_3)$.

(b) There is given a set $A$ indexing a family of functions $F = \{f_\alpha : T \times \Omega \to Y, \alpha \in A\}$; each member of $F$ is written explicitly as $f_\alpha(t,\omega) = y(t)$, which is the output resulting at time $t$ from the input $\omega$ under the experiment $\alpha$. Each $f_\alpha$ is called an input/output function and has the following properties:

(i) (Direction of time). There is a map $\iota : A \to T$ such that $f_\alpha(t,\omega)$ is defined for all $t \ge \iota(\alpha)$.


(ii) ~(~

Let T,teT

(Causality). ,t) =~

and T O.

models are allowed to approximate system behaviour rather than reproduce it exactly. Definitions (3.3.1) and (3.3.6), however, require models to compute the observed system behaviour exactly. This does not mean that the class of models which we can treat is any smaller than the class of models which are usually of interest. It merely means that whereas a conventional model may reproduce a system behaviour approximately, the corresponding model in our formalism has the additional task of generating the "corrections" which must be applied to the approximate behaviour, in order to produce the exact system behaviour. Fig. 2 shows the correspondence between a type of conventional model commonly encountered in control studies, and a model which satisfies definitions (3.3.1) and (3.3.6).

It will be recalled that theorem (2.2.12) suggests that "random" is equivalent to "can be computed only by using a table look-up". If a model is considered to be a summary of knowledge about a system, then those computations of the model which have to be performed by using a table look-up correspond to those aspects of the system behaviour which are not understood, and cannot be predicted - in fact, those that appear to be random. The role of these computations may be very different from that shown in fig. 2. For example, if they are "corrections", they need not be additive. But more generally, the terms computed by table look-up need not play the role of "corrections". They may, for instance, be parameters, which would conventionally be viewed as "randomly varying".
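The correspondence between a conventional approximate model and a model which reproduces the observations exactly can be sketched in a few lines. The fragment below is only an illustration and not part of the formalism: the observations, the straight-line "conventional" model, and the names approximate_model, corrections and exact_model are all assumptions made for the example, and Python is used in place of a language such as Algol W purely for convenience.

```python
# Observed output sequence (illustrative numbers, not taken from the text).
observed = [12, 15, 19, 22, 24, 29]


def approximate_model(i):
    """A 'conventional' model: a straight line which only approximates the data."""
    return 12 + 3 * i


# Table look-up of corrections: the part of the behaviour the line does not explain.
corrections = [y - approximate_model(i) for i, y in enumerate(observed)]


def exact_model(i):
    """Exact reproduction: the approximation plus the looked-up correction."""
    return approximate_model(i) + corrections[i]


reproduced = [exact_model(i) for i in range(len(observed))]
print(corrections)
print(reproduced == observed)
```

Here the corrections happen to be additive; as remarked above, this is only one of the roles which the table look-up terms may play.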

3.4 Criterion of Quality

The third component of our characterisation of modelling is a criterion of quality of a model.

Let F represent a computing facility, together with a programming language. Let c be an injective function from the integers to the set of strings of terminals in the programming language, which is used to represent the integers in programs. (c therefore is included in the definition of the programming language F. The definition of programming languages is reviewed in Appendix A; more details of c are given in Chapter 7). Let S be an integer system as defined in section 3.2, with input and output observations $u_j^i$, $y_j^i$.

Definition (3.4.1)

The trivial F model of S is the shortest program which is a concrete (F,E)-model of S, such that each $c(y_j^i)$ appears in it, where the minimisation of length ranges over all possible sets E (defined by def. (3.3.1)). It is assumed that the length of a program is measured by the number of terminals appearing in it.

The trivial model of a system is one which computes the output observation set by simply reading it out from a table look-up. It is a model which the modeller has available right at the beginning of the modelling exercise, before he has found any structure or pattern in the system behaviour.

For any system S, let the sets $C_i$, $D_i$ be those defined by def. (3.3.1). One can think of the length of a concrete (F,E)-model of S as the "perceived complexity", relative to F, of the set $(C_1,\ldots,C_m)$, conditional on the set $((1,D_1),\ldots,(m,D_m))$. The greatest lower bound of this "perceived complexity", taken over all concrete (F,E)-models of S, is just the conditional Kolmogorov complexity $K_F((C_1,\ldots,C_m) \mid ((1,D_1),\ldots,(m,D_m)))$. (Although Kolmogorov complexity was developed for binary sequences and binary programs, it can be readily generalised to sequences and programs containing any finite number of symbols). An approximate upper bound for this Kolmogorov complexity is the length of the trivial model of S.

The length of the trivial F model of S is the "perceived complexity" of $(C_1,\ldots,C_m)$ before any structure has been discovered in the system behaviour. If a shorter model of S is found, then its "perceived complexity" will be reduced. Recalling the analogy between complexity and entropy, it is appealing to measure the "perceived quantity of information" in $((1,D_1),\ldots,(m,D_m))$ about $(C_1,\ldots,C_m)$ as the difference between these two "perceived complexities". Since Kolmogorov complexity is not effectively computable, the only upper bound on this "perceived quantity of information" which is available, in general, is the length of the trivial model. Thus the length of the trivial model is a measure of the amount of information potentially to be conveyed by the modelling exercise.

Definition (3.4.2)

Let p be a concrete (F,E)-model of S, and let t be the trivial F model of S. Then the information gain I(p) of p is the difference

$$I(p) = \ell(t) - \ell(p) \qquad (3.5)$$

where $\ell(\cdot)$ denotes the length of a program.

In section 1.1 a simple example was presented, which suggested that the confidence which one has in a model depends on the difference between the number of observations which the model explains and the number of observations required to construct the model. The information gain is a measure of this difference. If the information gain is zero, then all of the output observations have been used to construct the model; the trivial model is, of course, the prime example of such a model.
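The arithmetic of the information gain can be made concrete with a small sketch. It rests on assumptions which should be stated plainly: the two program texts are hypothetical Algol-like strings invented for the illustration, and the length of a program is approximated by its number of non-blank characters, which stands in for the count of terminals. Python is used only as a convenient notation.

```python
def length(program: str) -> int:
    """Length of a program, approximated by its number of non-blank characters."""
    return sum(1 for ch in program if not ch.isspace())


def information_gain(model: str, trivial: str) -> int:
    """I(p) = l(t) - l(p): length of the trivial model minus length of the model."""
    return length(trivial) - length(model)


# Hypothetical trivial model: every output is read out of a table look-up.
trivial = "READ(I); WRITE(Y(I)); 907,903,905,906,"

# Hypothetical shorter model: a constant plus small looked-up corrections.
model = "READ(I); WRITE(900+E(I)); 7,3,5,6,"

print(length(trivial), length(model))
print(information_gain(model, trivial))
print(information_gain(trivial, trivial))
```

The constant costs four extra characters in the program text, but each table entry becomes two characters shorter, so the saving grows with the number of observations.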

If the i n f o r m a t i o n gain is


close to its u p p e r enough

bound

£(t),

to be c o n s t r u c t e d

observation

set,

by the model.

"parameters"than

that

the m o d e l

is j u s t i f i e d

course,

implies

(Chapter that

of a s y s t e m

contains

sets

that system.

This

that if w e have we can n e v e r

only

the

latter

accords

size

confidence

of the p r o g r a m m i n g

aspect).

This

model

the i n t u i t i v e

of

notion

of a system,

in any m o d e l

claim

in some m o d e l

of the t r i v i a l

well w i t h

of

is c o n t a i n e d

we m a y have

a few o b s e r v a t i o n s

have m u c h

We assume,

the s y s t e m

confidence

by the

in a m o d e l

increases.

about

about

set.

and in the d e f i n i t i o n

the p o s s i b l e

of course.

arbitrary

the c o n f i d e n c e

gain

4 deals with

is b o u n d e d

more

observation

that

all our k n o w l e d g e

in the o b s e r v a t i o n

set is " e x p l a i n e d "

by the amount of i n f o r m a t i o n

as its i n f o r m a t i o n

that

language

then,

of this

of the o u t p u t

gain m a y be n e g a t i v e ,

in the o u t p u t

We are c l a i m i n g ,

is simple

a small p a r t

and the r e m a i n d e r

contained

increases

the m o d e l

from o n l y

The i n f o r m a t i o n

This i n d i c a t e s

reality

then

then

of it w h i c h

may be postulated. We embody our claim in the following axiom, on the basis of which a choice may be made between competing models.

Axiom (3.4.3)

If S is a system, and $E_1$ and $E_2$ are sets such that $E_1$-models of S and $E_2$-models of S are of interest, and p and q are models, with p being an $(F,E_1)$-model of S and q being an $(F,E_2)$-model of S, then the one of p and q which has the higher information gain is to be chosen as the better model of S.
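The rule stated in the axiom amounts to a comparison of program lengths against the length of the trivial model. The sketch below is a hedged illustration only: the candidate names and all the lengths are invented numbers, and extending the comparison from two models to several is an assumption made for convenience.

```python
def information_gain(model_length: int, trivial_length: int) -> int:
    """Information gain of a model, given its length and that of the trivial model."""
    return trivial_length - model_length


def better_model(candidates: dict, trivial_length: int) -> str:
    """Return the name of the candidate with the highest information gain."""
    return max(
        candidates,
        key=lambda name: information_gain(candidates[name], trivial_length),
    )


# Hypothetical lengths, in terminals, of the trivial model and three competitors.
trivial_length = 120
candidates = {"p": 95, "q": 104, "r": 131}

gains = {name: information_gain(n, trivial_length) for name, n in candidates.items()}
best = better_model(candidates, trivial_length)
print(gains)
print(best)
```

The candidate r is longer than the trivial model, so its information gain is negative and it can never be preferred to the trivial model itself.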

This axiom implies that good models are small models. Good models will therefore tend to use the same (short) computational algorithm for as many computations as possible, since the specification of every new algorithm increases the size of the model. Thus the above axiom provides a specific link between the widely-held belief that simplicity (as measured by smallness) is desirable in scientific hypotheses, and the almost universal conviction that the more an observed regularity has been repeated, the more likely it is to recur.

The following theorem is a crucial feature of our characterisation of modelling. As before, $\ell(p)$ denotes the length of program p, measured by the number of terminal characters which appear in it.

Theorem (3.4.4)

There is, in general, no effective procedure for finding an (F,E)-model p of a system S, such that, for any other (F,E)-model q of S, $\ell(p) \le \ell(q)$.

Proof. Suppose that such an effective procedure exists. Consider the case $E = \{E_1\} = \{(\emptyset, Y)\}$ (where $S=(U,Y)$). Then we have an effective procedure for finding the shortest program which computes Y, using only the set {1}. Now suppose that the programming language F has only two terminal characters. Then there exists an effective procedure for finding the shortest binary program which computes Y, using {1}. But Y can be associated uniquely with a binary sequence: Y is a system, and can therefore be associated with its index in some fixed enumeration of systems. This index can be associated with a binary sequence by the bijection introduced in section 2.2.1. Since each of the above steps is effective, there exists an effective procedure for finding the shortest binary program which computes the binary sequence associated with Y, and hence there exists an effective procedure for finding its length, namely the Kolmogorov complexity $K_F(Y \mid 1)$. Suppose F is optimal. Then, by Church's Thesis, $K(Y \mid 1)$ is partial recursive, which contradicts theorem (2.2.5). This proves the theorem.

Theorem (3.4.4) does not rely on F having only two terminals or on E having the form indicated in the proof. These assumptions are made in order to derive a contradiction to theorem (2.2.5). However, as mentioned earlier, this theorem can be generalised to the case where the sequences considered have an arbitrary finite number of symbols, and to cover the uncomputability of conditional complexity. On the other hand, the theorem does rely on F being optimal. A sufficiently restricted programming language may allow the shortest model to be found, if it exists. However, most systems will not possess any models in such a language (the simplest example is a programming language which always computes the same thing, whatever program it may be given).

Theorem (3.4.4) implies that there is no algorithm for finding the model of a system which has the highest information gain. So, according to our axiom, there is no algorithm for finding the best model of a system. Consequently the modelling exercise cannot proceed according to some "universal modelling algorithm", but must involve a process of nonalgorithmic (creative?) postulation of hypotheses, followed by the assessment of these hypotheses, according to our axiom. Note

that the

in s e c t i o n is still

(2.2.4),

Most models

of data w h i c h

up,

it is m o s t

can be explained.

error.

by the m u m b e r

It w i l l

of c h a r a c t e r s

a table

not b e e n e x p l a i n e d usually

be possible

system behaviour

any table

look-ups

algorithms

- that

in the m o d e l

in the rest

a trade-off

between

which would

conventionally

and the d e g r e e

the

a table required

look-

of the data as

to p r o g r a m

it,

of p r e d i c t i o n features

of

by the model. to e x p l a i n m o r e is,

to r e d u c e

of the

the size of

- only by u s i n g m o r e - that Thus

of q u a l i t y

complexity

the use of s m a l l n e s s of a m o d e l

leads

to

of the m o d e l

as the m o d e l

provided

elaborate

is, by i n c r e a s i n g

of that p a r t

be r e g a r d e d

of a p p r o x i m a t i o n

table

look-up,

is, the more

of the m o d e l

as a c r i t e r i o n

aspect

measure

the size of the rest of the model. of the p r o g r a m

artificially

at least one

that e v e r y

size of such

such

the s h o r t e s t

Criteria.

has not been

as a very g e n e r a l

The b i g g e r

the d a t a have

observed

The

there

model.

to c o n t a i n

unlikely

is m e n t i o n e d

is found,

for finding

with Conventional

can be e x p e c t e d

can be r e g a r d e d

(28), w h i c h

if a m o d e l

procedure

generated

measured

that

of that p a r t i c u l a r

Compatibility

since

of M e y e r

implies

no a l g o r i t h m i c

implementation

3.5

result

(cf fig. 2),

by that p a r t of the


model

to the o b s e r v e d

therefore model

provides

data.

The use of this

a safeguard

against

observation

set is large

criterion

"overfitting"

the

to the data. If the o u t p u t

to the size look-up,

of that part

then

the

of the m o d e l

size of the

the size of the model. are b e i n g quality table

compared,

leads

to the

selection This

p r e f e r e n c e for small is large

In this

which

is not

look-up(s)

case,

fitting

enough

the

of smaller

conventional

if the n u m b e r

for the d a n g e r

dominate

criterion

to the

errors,

a table

will

of the m o d e l w i t h

corresponds

relative

if two such m o d e l s

the use of the p r o p o s e d

look-up(s).

ations

table

enough,

of o b s e r v -

of o v e r f i t t i n g

to be

d~missed. The d e f i n i t i o n mines

the d e t a i l s

the s m a l l n e s s

of the p r o g r a m m i n g

of the

trade-off

criterion.

are

definition,

is c o n s i d e r e d

A serious proposed about

the s y s t e m

A typical

is not

situation

a program, a priori

the

has

smallest

of this where

that

of the

language

a priori this m a y

the use of the knowledge

indicate

that

a

should

be p r e f e r r e d .

in c o n v e n t i o n a l

system

identification

a particular

will

about

knowledge

a more

look-

available

~

a smaller

knowledge

then

part

table

7.

be m a d e

If s u f f i c i e n t

model with

It m a y h a p p e n

must

used d e t e r -

in the use of

in w h i c h

constitutes

in C h a p t e r

is a v a i l a b l e

example

parametric

which

reservation

criterion.

model w h i c h

is the

coded,

implicit

The m a n n e r

up e l e m e n t s

language

elaborate overall

prevent

indicates

structure model,

size;

that

is a p p r o p r i a t e .

when written

nevertheless,

that m o d e l

a

being

as

the

chosen

as

better. Another example is parameter estimation of a linear dynamical process whose output is corrupted by noise. In this case a straightforward minimisation of the equation error usually leads to biased estimates (Eykhoff, (44)). So if two models are being compared whose table look-ups contain the equation errors, it is possible that the larger one will be preferred on probabilistic grounds. Once again, a priori knowledge (about the noise) is required if the smallness criterion is to be overridden. Furthermore, the smallness criterion could still be used to decide between the larger of these two models and a third model belonging to a different class.

As indicated in section 1.1, the proposed criterion is intended for use in situations where little a priori information is available, or in situations where it is too difficult to use such a priori knowledge for model assessment.

The smallness-of-model criterion leads to the same choice of model as do statistical considerations, for a very important class of system behaviours and models of them. If the system behaviour is a stationary random process with rational spectral density function, then it is known how to predict, at any time, its future behaviour, so as to minimise the mean-square prediction error. The method, due essentially to Wiener and Kolmogorov, is to make the prediction for any future time a suitable linear function of past observations of the behaviour (45), (46). If the observation intervals are equally spaced, and a prediction is being made at each instant of the system behaviour at the next observation instant, then the prediction errors are equal to the random, uncorrelated disturbances which are imagined to be acting on the system.

Suppose it is desired to build a concrete (F,E)-model of the system which will give useful one-step-ahead predictions. Any E can be chosen which allows the model to use previous observations to compute predictions (cf. example (3.3.4)). The model will have to generate terms corresponding to prediction errors by means of a table look-up. If the programming language used codes table look-up terms in such a way that length of code is nondecreasing with the magnitude of the term (cf. Chapter 7), then, for a sufficiently long sequence of observations, the model with smallest (in magnitude) prediction errors will be the smallest (in length) model.

But it is known that, for the system under consideration, the smallest mean square prediction error is obtained by the use of the Wiener-Kolmogorov theory. Furthermore, Sherman (47) has shown that, if the process is Gaussian, then the same linear predictor is obtained if the expectation of any even nondecreasing function of the prediction error is minimised. So, under these conditions, for a long enough sequence of observations, the "expected best model", judged according to axiom (3.4.3), is the Wiener-Kolmogorov model. The terms appearing in the table look-up of this model constitute a random, uncorrelated sequence. Theorem (2.2.12) therefore suggests that these terms could not be generated by any more efficient algorithm than a table look-up.
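The link between small prediction errors and a short table look-up can be illustrated numerically. The sketch below makes several assumptions which are not in the text: the nine observations are invented, the one-step-ahead predictor is the integer rule (k * previous value) // 10 for a candidate coefficient k in tenths, and the code length of a table entry is taken to be its number of decimal characters (including any minus sign) plus one separator. With so short a sequence the two criteria need not always agree; the argument above requires a sufficiently long sequence.

```python
# Invented observations of a slowly decaying behaviour with small disturbances.
y = [100, 82, 64, 54, 43, 32, 26, 17, 15]


def residuals(k):
    """One-step-ahead prediction errors for the predictor (k * previous) // 10."""
    return [y[t] - (k * y[t - 1]) // 10 for t in range(1, len(y))]


def table_length(errors):
    """Characters needed to write the errors as a comma-separated table look-up."""
    return sum(len(str(e)) + 1 for e in errors)


def sum_of_squares(errors):
    """Conventional fitting criterion, for comparison with the length criterion."""
    return sum(e * e for e in errors)


candidates = [5, 8, 9]  # arbitrary candidate coefficients, in tenths
shortest = min(candidates, key=lambda k: table_length(residuals(k)))
least_squares = min(candidates, key=lambda k: sum_of_squares(residuals(k)))

for k in candidates:
    print(k, residuals(k), table_length(residuals(k)), sum_of_squares(residuals(k)))
print(shortest, least_squares)
```

For these three candidates the coefficient with the smallest squared error is also the one whose table of errors is shortest to write down.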

3.6 Prediction

If the best model that has been found up to some time is a concrete (F,E)-model p, and it is desired to find the system behaviour under some new conditions, (possibly not yet observed) which can be represented by a block of "virtual observations", $D_{m+1}$, that is, the observations which would be observed if the new conditions obtained, then the model p, and the computer F, can be used to find the "prediction" $F(p, m+1, D_{m+1})$. This provides a means of computing the values of a possible input/output function of the system on elements of its domain which have not been previously observed. According to our axiom, these values are the best "predictions" available to us. We have put the word "prediction" in quotes, because the value $F(p, m+1, D_{m+1})$ need not represent a future value (for example, if the model runs backwards through the observation interval).

It is possible, of course, that the value $F(p, m+1, D_{m+1})$ is not defined. In this case, it may not be possible to use p for predicting system behaviour under the conditions $D_{m+1}$. However, for some models, the value $F(p, m+1, D_{m+1})$ may be undefined simply because p includes the generation of certain parameters by means of a table look-up, and the table does not contain an element which is to be used for the (m+1)th computation. In this case, prediction is still possible if

such an element is supplied to the model. The problem is, what value should that element take? Our solution is to propose a second axiom, which may be thought of as an extension of the previous one.

Axiom for Prediction

If elements are to be supplied to table look-ups of a model, in order to allow that model to compute a prediction, then the best prediction will be obtained if these elements are chosen so as to minimise the resulting increase in

size of the model. The use of this the v a l u e

of the i n f o r m a t i o n

to a t r i v i a l no c o n f i d e n c e A rough The b a s i c

axiom must

model,

thus

gain.

enabling

in that p r e d i c t i o n , justification

assumption which

observations,

will

of the s y s t e m

during

should

the e l e m e n t s

that p r e v i o u s l y

have b e e n continue


observed

requires

a large

of code

amout

to c o m p u t e

such r e g u l a r i t i e s , by using

the

"fixed"

part

is that

can c e r t a i n l y

of

H e n c e we

look-up

is such

to appear

any

in the b e h a v i o u r

to be such

are p r e s e n t

if the m o d e l

of the m o d e l

b u t we have

as follows.

interval.

an o u t p u t w h i c h

then we

an e l e m e n t

in a s e q u e n c e

table

by

the e l e m e n t m a y be.

prediction

regularities

But,

supply

runs

to be p r e s e n t

prediction.

in o r d e r

whatever

detected

for the

of course,

it to predict;

of the a x i o m

computed

up,

We can

of s c i e n t i f i c

regularities,

choose

be q u a l i f i e d ,

in the that

in a table

look-

is c o n s i s t e n t obtain (i.e.

it

with

a better model

the p a r t

that

is common to all the computations) to compute the regularities. This is true because for a sufficiently large set of observations, the size of the model will be governed by the average length of code appearing as elements of the table look-up. Thus the axiom is reasonable if it is assumed that it is applied to the best available model.

The above argument can be illustrated by the following example. Suppose a system is defined by the observations:

$S = (U,Y) = (\emptyset, (552,553,546,551,549,544,547,554,557,551))$.

If the programming language and computing facility F is taken to be Algol W, as implemented on the IBM 370/165 installation at Cambridge, and $E_i = (\emptyset, y_i)$, $i = 1,\ldots,10$, (so that $E_3 = (\emptyset, 546)$, for example), then a trivial (F,E)-model of S is:

    BEGIN INTEGER I,J; INTEGER ARRAY Y(1::10);
    FOR J:=1 UNTIL 10 DO READ (Y(J));
    READ (I);
    WRITE (Y(I));
    END.
    552,553,546,551,549,544,547,554,557,551,

When presented with an integer i ($1 \le i \le 10$), this program computes $y_i$ by looking it up in the array Y. We know that this model is useless for prediction, because it is a trivial model. Nevertheless, we can make it compute a "prediction". We must first supply it with a new entry in its table look-up. To do this, we replace the integer 10 in line 2 by the integer 11, and add a new number


at the end of the program.

When presented

ii, this p r o g r a m w i l l

the new number.

this n u m b e r

be?

output

According

to our A x i o m

should be one of the i n t e g e r s Yll will

then be that

Clearly nearer

can see that

it.

But why

regularity

will

doing this

close

of this

to 550,

INTEGER

READ WRITE

of

In o t h e r w o r d s ,

by not o b e y i n g

Because

to 550.

But

to b u i l d that

"mean plus INTEGER

UNITL

that

a

the b e h a v i o u r

case w e

can use

O n e w a y of

of the b e h a v i o u r

to b u i l d

random ARRAY

iO DO READ

the A x i o m

detected

a b e t t e r model. the m e a n

we

by o b e y i n g

we have

in that

and t h e r e f o r e

I,J;

F O R J:=l

it

A prediction

than one o b t a i n e d

see this?

is of the c o n v e n t i o n a l BEGIN

should

The p r e d i c t i o n

be better.

obtained

is to o b s e r v e

close

integer

for P r e d i c t i o n ,

in the s y s t e m b e h a v i o u r - n a m e l y ,

our k n o w l e d g e

What

is a v e r y bad one.

be b e t t e r

can we

tends to r e m a i n

remains

"obviously"

a prediction

for P r e d i c t i o n

the

integer.

this p r e d i c t i o n

550 w o u l d

0,...,9.

with

a model

error"

which

type:

E(I::IO);

(E(J));

(I) ; (550+E(I)) ;

    END.
    2,3,-4,1,-1,-6,-3,4,7,1,

This model computes the system behaviour Y by computing the observed regularity (550), and correcting it by a term obtained from a table look-up. Admittedly, this model is only slightly better than the trivial model (its information gain is 12 terminals), but it would rapidly become decisively superior if more observations became available, which remained close

to 550. In this

obtain

case,

if we

as the p r e d i c t e d

500 and

559.

This

Clearly,

apply next

time,

several

our A x i o m

output

an i n t e g e r


similar

models

each of t h e m there

is a c o n s i d e r a b l e

It may be p o s s i b l e

to r e d u c e

estimating

the p r o b a b i l i t y

table

terms.

3.7

An E x a m p l e

3.7.1

will above

data,

In this

section

portray

a particular

example

w h i c h was

can be built,

range,

distribution

an e x a m p l e

will

model~ng

and

of"best"

for

predictions.

for e x a m p l e of the

by

look-up

be p r e s e n t e d ,

exercise

of 296 p a i r s

which

in terms

gas

of the

and J e n k i n s

flow

as Series

observations.

rate

The

furnace

(45).

into

The J), The

a furnace,

are of the c o n c e n t r a t i o n

gases.

observations

and

of c a r b o n were made

at

seconds.

obtain

a model

of a d e t e r m i n i s t i c flow rate

of the gas

and J e n k i n s

of i n p u t - o u t p u t

of nine

and J e n k i n s

consists

the i n p u t

by Box

observations

intervals

by Box

are of gas

in the o u t l e t

Box

is the m o d e l l i n g

considered

observations

dioxide

which

used

(which is g i v e n

the o u t p u t

equal

reasonable.

characterisation.

consists input

lying b e t w e e n

is q u i t e

range

we

Introduction

The data,

this

for P r e d i c t i o n ,

for these

transfer

to the o u t p u t

observations,

function

concentration

relating of carbon


dioxide,

and a m o d e l

deterministic

of the n o i s e

relationship.

process

The m o d e l

which

disturbs

they o b t a i n

the

is:

2

^ Yt

0.53+0 =

37B+O "

--

51B "

u t --3

. . . . . . . .

(3.6)

2

I-0.57B-O.OIB nt

1

=

wt. . . . . . . . .

(3.7)

. . . . . . . . . . . . . . . . .

(3.8)

2

I-O.53B+O.63B Yt

=

Yt+nt

Here $u_t$ and $y_t$ represent the input and output variables, respectively, after removal of their mean values, at sampling instant $t$. $\hat{y}_t$ represents the estimate of $y_t$ generated by the transfer function of eqn. (3.6), and $n_t$ is the error between $y_t$ and $\hat{y}_t$. In the conventional terminology of econometrics $n_t$ is an "output error" ("error in variables" (Johnston (48))). Using conventional system identification terminology, $n_t$ represents a stochastic disturbance acting on the output of the process at time $t$, and $w_t$ is a "discrete white-noise" process (i.e. a zero-mean, serially uncorrelated random sequence), which is considered to cause the disturbance $n_t$ according to relationship (3.7). $B$ is the backward shift operator, defined by $Bx_t = x_{t-1}$. The model can be given a diagrammatic representation, as in fig. 3, where $\bar{u},\bar{y}$ denote the mean values of the input and output variables, respectively.
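To make the role of the backward shift operator concrete, the following Python sketch rewrites (3.6) and (3.7) as difference equations and evaluates them recursively. It is only an illustration: the function names, the use of plain lists, and the assumption that all values before the start of the record are zero are not taken from the text, and the coefficients are simply those read from (3.6) and (3.7).

```python
def past(seq, k):
    """Value of seq at index k, taken as zero before the record starts
    (an assumption made for this sketch)."""
    return seq[k] if k >= 0 else 0.0


def transfer_output(u):
    """Deterministic part (3.6) as a difference equation:
    yhat[t] = 0.57*yhat[t-1] + 0.01*yhat[t-2]
              - (0.53*u[t-3] + 0.37*u[t-4] + 0.51*u[t-5])."""
    yhat = []
    for t in range(len(u)):
        value = (0.57 * past(yhat, t - 1) + 0.01 * past(yhat, t - 2)
                 - (0.53 * past(u, t - 3) + 0.37 * past(u, t - 4)
                    + 0.51 * past(u, t - 5)))
        yhat.append(value)
    return yhat


def noise(w, a1=0.53, a2=-0.63):
    """Disturbance (3.7) as n[t] = a1*n[t-1] + a2*n[t-2] + w[t];
    the default coefficients are those read from (3.7)."""
    n = []
    for t in range(len(w)):
        n.append(a1 * past(n, t - 1) + a2 * past(n, t - 2) + w[t])
    return n


def model_output(u, w):
    """Equation (3.8): y[t] = yhat[t] + n[t]."""
    return [a + b for a, b in zip(transfer_output(u), noise(w))]


# Response to a unit impulse in the (mean-removed) input, with no noise.
impulse = [1.0] + [0.0] * 7
print(model_output(impulse, [0.0] * 8))
```

The three-sample delay in $u_{t-3}$ shows up as three leading zeros in the impulse response.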

3.7.2 The System

In terms of definition (1.3.1), the system which we are considering is $S=(U,Y)$ where $U=(u_1,\ldots,u_{296})$, $Y=(y_1,\ldots,y_{296})$, $l_i=m_i=1$, $i=1,\ldots,296$, and the observations $\{u_i,y_i\}$ are listed in Appendix C. As in the example of section 3.6, we shall take the programming language F to be Algol W, as implemented on the IBM 370/165 installation at Cambridge.

3.7.3 Model I - The Trivial Model

We must define the sets A,B,C,D,E, which occur in definition 3.3.1. For the trivial model, we can take these

to be:

$A = \{A_i : i=1,\ldots,296\}$, $A_i = \emptyset$
$B = \{B_i : i=1,\ldots,296\}$, $B_i = \emptyset$
$C = \{C_i : i=1,\ldots,296\}$, $C_i = y_i$
$D = \{D_i : i=1,\ldots,296\}$, $D_i = (A_i,B_i) = \emptyset$
$E = \{E_i : i=1,\ldots,296\}$, $E_i = (D_i,C_i) = (\emptyset,y_i)$

A concrete (F,E)-model of the system S, which is a trivial model, is:

    BEGIN INTEGER I,J; REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I); WRITE (Y(I));
    END.
    53.8 53.6 53.5 ... 57.0

The last line of the trivial model is the table look-up, which contains the output observations. The trivial model can be represented diagrammatically, as in fig. 4(a).

3.7.4 Model II - The Mean

Probably the first nontrivial model to be hypothesised for many systems is that the system behaviour has a constant mean value. This model is of the type which reproduces regularities only in the output observations, and does not exploit any information in the input observations. Consequently, the sets A,B,C,D,E may be taken to be the same as for the trivial model. The mean value of the output observations is 53.5. The following is a (F,E)-model of S which makes use of this fact:

    BEGIN INTEGER I,J; REAL ARRAY Y(1::296);
    FOR J:=1 UNTIL 296 DO READON (Y(J));
    READ (I); WRITE (53.5 + Y(I));
    END.
    .3 .1 0 0 -.1 ... 3.8 3.5

The table look-up of this model is listed in the column headed $y_t$ in Appendix C. Fig. 4(b) shows a diagrammatic representation of this model. The dashed line represents the boundary of the model.
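The relationship between Models I and II can be sketched in a few lines of Python. This is an illustration only, not a translation of the Algol W programs: the function names are hypothetical, and only the first three table entries quoted in the two listings above are used.

```python
def trivial_model(table, i):
    """Model I: pure table look-up of the i-th output observation (1-based)."""
    return table[i - 1]


def mean_model(mean, deviation_table, i):
    """Model II: a constant mean plus a table look-up of deviations from it."""
    return mean + deviation_table[i - 1]


# First three entries of the two table look-ups quoted in the text.
outputs = [53.8, 53.6, 53.5]
deviations = [0.3, 0.1, 0.0]

for i in range(1, 4):
    print(i, trivial_model(outputs, i), round(mean_model(53.5, deviations, i), 1))
```

Both models reproduce the same observations; they differ only in how much of the description is carried by the program text and how much by the table.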

3.7.5 Model III - Deterministic Transfer Function

We now assume that the transfer function of equation

(3.6) the

has been

input

and output

restriction output may

hypothesised

that

of

assume

the

between we make

knowledge

initial

the past

to c h o o s e

{Ai:i=1,...,296}

However,

not

sets

the

of past

conditions),

and present

new

Ai = A =

relationship

system.

than

of all

We have

the

may

(other

knowledge

information.

the

the model

observations

assume

as

A,

but

input

...,

(u ,u ,. 1 2 "''ui)

E: f o r i~6

, A i = @ for il,(i.e.

is an asymptotic system.

We wish to consider models of the $S^j$'s which differ only in their table look-ups. To capture the idea of a table look-up without restricting it unduly, we shall consider models to be pairs $(m,T)$. Each element of the pair is a part of a program, and the pair $(m,T)$ is regarded as the complete program. This can be formalised quite easily, if required: take a pairing function $\tau$ (cf. proof of theorem (3.2.1)), and change definition (3.3.6), so that a concrete (F,E)-model becomes an ordered pair of integers $(m,T)$, such that $F(\tau(m,T),i,D_i)=C_i$. These integers can be associated with programs, as before. $m$ will be considered to be the part which is common to models of all the $S^j$'s, while $T^j$ will be regarded as a table look-up, which may be different for each $S^j$.

When a translation of the program $(m,T)$ from one language to another is considered, it will be assumed that $T$ (or at least its length) remains unchanged. On the other hand, the translation of $m$ will be assumed to be different from $m$. In this way a distinction is drawn between $T$ and $m$, which corresponds to some aspects of the distinction between table look-up and other types of program.

In the following definition a particular programming language is assumed, and $m$ and $T^j$ are fragments of programs in this language. The definition is based on definition (3.3.1), and the notations of definition (4.2.1) are generalised in an obvious manner.

Definition (4.2.3)

Let $A^j=\{A^j_i\}$ be a set of ordered subsets of $(U_1 U_2 \ldots U_j)$, let $B^j=\{B^j_i\}$ be a set of ordered subsets of $(Y_1 Y_2 \ldots Y_j)$, and let $C^j=\{C^j_i\}$ be a complete set of $m_j$ disjoint ordered subsets of $(Y_1 Y_2 \ldots Y_j)$. Let $D^j$ be a set of ordered pairs $D^j_i=(A^j_k,B^j_l)$ $(i=1,2,\ldots,m_j)$, and let $E^j$ be a set of ordered pairs $E^j_i=(D^j_i,C^j_i)$ $(i=1,2,\ldots,m_j)$. Finally, let $\mathcal{E}$ be the sequence $\mathcal{E}=(E^1,E^2,\ldots)$, and $\mathcal{T}$ be the sequence $\mathcal{T}=(T^1,T^2,\ldots)$. Then the pair $(m,\mathcal{T})$ is an asymptotic $\mathcal{E}$-model of the asymptotic system $\mathcal{S}=(S^1,S^2,\ldots)$ if and only if $(m,T^j)$ is an $E^j$-model of $S^j$, for every $j=1,2,\ldots$

The following definitions distinguish between two possible asymptotic behaviours of rival models. $I(m,T^j)$ denotes the information gain of the model $(m,T^j)$, and $E(m,T^j)$ denotes the information explained by model $(m,T^j)$, namely the ratio of $I(m,T^j)$ to the size of the trivial model of $S^j$. $(m_1,\mathcal{T}_1)$ and $(m_2,\mathcal{T}_2)$ denote asymptotic models of some asymptotic system $\mathcal{S}$, with $\mathcal{T}_1=(T^1_1,T^2_1,\ldots)$ and $\mathcal{T}_2=(T^1_2,T^2_2,\ldots)$. We use $\liminf_{j\to\infty} x_j$ to denote $\lim_{j\to\infty}\inf_{k>j} x_k$, and similarly for $\limsup$.

Definition (4.2.4)

$(m_1,\mathcal{T}_1)$ is asymptotically weakly better than $(m_2,\mathcal{T}_2)$ (denoted by $(m_1,\mathcal{T}_1) >_w (m_2,\mathcal{T}_2)$) if and only if

$$\liminf_{j\to\infty}\,\{I(m_1,T_1^j)-I(m_2,T_2^j)\} = +\infty \qquad (4.2)$$

Definition (4.2.5)

$(m_1,\mathcal{T}_1)$ is asymptotically strongly better than $(m_2,\mathcal{T}_2)$ (denoted by $(m_1,\mathcal{T}_1) >_s (m_2,\mathcal{T}_2)$) if and only if

$$\liminf_{j\to\infty}\,\{E(m_1,T_1^j)-E(m_2,T_2^j)\} > 0 \qquad (4.3)$$

The ideas behind these definitions are the following. Let $t_j$ denote the trivial model of $S^j$, and $|t_j|$ denote its size. We henceforth make the natural assumption that

$$\lim_{j\to\infty}|t_j| = +\infty \qquad (4.4)$$

If $(m_1,\mathcal{T}_1)$ is asymptotically weakly better than $(m_2,\mathcal{T}_2)$ then the "amount of information" extracted from $S^j$ by $(m_1,T_1^j)$ is eventually greater than that extracted by $(m_2,T_2^j)$, and the difference between them is eventually increasing. But their "rates of information extraction", as measured by the information explained, may be converging towards each other. For example, if $|t_j|=kj$, $I(m_1,T_1^j)=pj^{1/2}$, $I(m_2,T_2^j)=qj^{1/2}$, with $p>q$, then $I(m_1,T_1^j)-I(m_2,T_2^j)=(p-q)j^{1/2}\to\infty$, while $E(m_1,T_1^j)-E(m_2,T_2^j)=\frac{p-q}{k}\,j^{-1/2}\to 0$.

If $(m_1,\mathcal{T}_1)$ is asymptotically strongly better than $(m_2,\mathcal{T}_2)$ then the "rate of information extraction" by $(m_1,\mathcal{T}_1)$ is eventually greater than that by $(m_2,\mathcal{T}_2)$. The "weak/strong" terminology is justified by the following theorem.

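Theorem (4.2.6)

$(m_1,\mathcal{T}_1) >_s (m_2,\mathcal{T}_2) \Rightarrow (m_1,\mathcal{T}_1) >_w (m_2,\mathcal{T}_2)$.

Proof

Suppose $\liminf_{j\to\infty}\{I(m_1,T_1^j)-I(m_2,T_2^j)\}<+\infty$. Then $\exists M$ such that $\forall k$, $\exists i>k$ such that $I(m_1,T_1^i)-I(m_2,T_2^i)<M$. Since $E$ is the ratio of $I$ to $|t_i|$, and $|t_i|\to+\infty$ by (4.4), it follows that $\forall\varepsilon>0$, $\forall k$, $\exists i>k$ such that $E(m_1,T_1^i)-E(m_2,T_2^i)<\varepsilon$, so that $\liminf_{j\to\infty}\{E(m_1,T_1^j)-E(m_2,T_2^j)\}\le 0$. Hence

$$\liminf_{j\to\infty}\{E(m_1,T_1^j)-E(m_2,T_2^j)\}>0 \;\Rightarrow\; \liminf_{j\to\infty}\{I(m_1,T_1^j)-I(m_2,T_2^j)\}=+\infty .$$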
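The example given before the theorem, with $|t_j|=kj$ and information gains $pj^{1/2}$ and $qj^{1/2}$, can be checked numerically. The Python sketch below is only an illustration; the values chosen for $p$, $q$ and $k$ are arbitrary assumptions, and the function names are hypothetical.

```python
import math


def info_gap(j, p=3.0, q=1.0):
    """Difference in information gain, (p - q) * j**0.5."""
    return (p - q) * math.sqrt(j)


def explained_gap(j, p=3.0, q=1.0, k=2.0):
    """Difference in information explained: the gain difference divided
    by the size k*j of the trivial model."""
    return info_gap(j, p, q) / (k * j)


# The first gap grows without bound while the second shrinks towards zero,
# so the first model is weakly but not strongly better than the second.
for j in (1, 100, 10_000, 1_000_000):
    print(j, info_gap(j), explained_gap(j))
```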

We now consider the effect of writing models in different languages on their asymptotic performance. For a precise discussion of what it means for a program to be written in some particular language, see chapter 5.

Let

(m , ~ ) I

(m

,~)

be asymptotic

models of J w r i t t e n

language

~.

a programming

programs

(p ,T~), (p ,T3), j=l,2,...,

2

and

l

in a programming

Z

Let

~ be 2

functions

such

can be written

that

in ~,

2

and such that these programs recursive

language,

compute

as the programs

the same partial (m 'TJ)' 1 (m2'TJ)' 2

j=l,2

1

respectively.

Using

the

notation

of

definition

(3.3.6)

we can write (T (PI'T~) ,' ,') = ~ (T (ml 'T3~) '''' ) where T is an a p p r o p r i a t e

pairing

for P2,m2.

(p , ~ )

Consequently

.......

function,

and

(p , ~ )

]

models

of#written

Let

IPl denote

It J=Jt 1÷k Theorem

2

similarly

are asymptotic

2

in z. the size of a program p;

trivial model of S j written model of S 3 written

and

(4.5)

in ~.

let t~ be the 3

in ~, and let t~ be the trivial 3 we assume that

..................

146)

(4.2.7)

With the notations

and assumptions

as stated above,

(a)

(ml , ~ ) >w(m2 , < ) ~

(pl, < ) >w (p2 , < )

(b)

(ml ' < ) > s (mr '::= [ < i d e n t i f i e r > < l e t t e r >

then the Algol-supports of the two models may be equivalent. This syntax, which is part of the syntax of the Algol-support of the first model, allows the identifier sin to be used (since the productions <letter>::=i|n|s must also be part of the syntax), and furthermore it allows it to be used as a procedure identifier. Now, suppose that the Algol-support of the first model is a fragment of the Algol-support of the second model (a possibility which we wish to allow, intuitively). Then, according to definition (5.4.1) (iv), the sin call must have the same effect in both languages. So, the Algol-support of the first model must contain sin as a standard procedure. Clearly this contradicts the intended meaning of "Algol-support". Consequently, we insist that standard procedure identifiers be regarded as terminals.

If it is now stipulated that only $\lambda$-comparable models should be compared for model assessment, then we have the formal equivalent of the intuitive idea, that models should be compared only if they use the same facilities of a language. One reason for making this stipulation has already been referred to in section 6.1. It may be felt to be an "unfair" comparison if the models are not $\lambda$-comparable. An obvious example of this would be a comparison of a difference-equation model with a differential-equation model. If the differential-equation model were allowed to call a standard procedure for integration, would it be reasonable to compare the "number of arbitrary elements" embodied in it with the number embodied in the difference-equation model? The difference-equation model requires fewer a priori assumptions (if its $\lambda$-support is a fragment of the $\lambda$-support of the differential-equation model, where $\lambda$ is the language in which both models are written).

There are, however, two ways of making models $\lambda$-comparable. Rather than adding an explicitly declared integration procedure to the differential-equation model, it is possible to add a "dummy" call of the standard procedure for integration to the difference-equation model. It is the flexibility offered by this possibility, of "padding" models with redundant statements, that reduces the significance of any insistence on $\lambda$-comparability. This will be demonstrated in section 6.3.

If $\lambda$-comparability is required, there still remains the choice of a suitable $\lambda$-support for the models which are to be compared. Returning to the above example, there is still a decision to be made: should both models call the standard procedure for integration, or neither? This decision, of course, is very significant for model assessment. But this is the decision discussed in chapter 4. In other words, it will be governed by the a priori assumptions that the modeller wishes to make.

6.3 Example: Algol W-Comparable Gas Furnace Models

This section investigates how the assessment of the six models of the gas-furnace data (cf. chapter 3) is altered, if they are required to be AlgolW-comparable.

6.3.1 Standard Procedures

The definition of Algol W is assumed to be a formalised version of the specification given in (50). The six models use three standard procedures, namely the input/output procedures READ, READON and WRITE. In accordance with the discussion of section 6.3, we consider the syntax specification of (50) to be augmented by the productions:

<simple statement>::=<standard procedure statement>
<standard procedure statement>::=<standard procedure identifier>(<actual parameter list>)
<standard procedure identifier>::=READ|READON|WRITE

The abstract syntax, translator and interpreting automaton are considered to be modified accordingly.

6.3.2 AlgolW-Comparable Models

In this example the models are modified so as to be AlgolW-comparable by inserting redundant statements and expressions, rather than by avoiding certain constructions.

Referring to section 3.7, and comparing the models in order, we notice first that the syntax of the AlgolW-support of model II contains the productions

<simple t expression>::=<simple t expression>+<t term>|<t term>

whereas the syntax of the AlgolW-support of model I contains only the production

<simple t expression>::=<t term>

(For the significance of "t" see (50) or the introduction to Appendix B). This discrepancy can be removed by changing the WRITE statement of model I to:

WRITE(Y(I)+0);

Model III requires several productions which are not needed for models I or II. These are:

< l e t t e r > : : = NIU ::=. <simple

t expression>::=<simple

: : = < t t e r m > * < t

t expression>-

factor>

::= ::=<simple

t expression>

<simple t e x p r e s s i o n > ::=< <statement>::=

<simple s t a t e m e n t > : : = < b l o c k > l < : : = < t

t a s s i g n m e n t statement> left part>

: : = : = : : = < i f

clause><simple

statement>ELSE

<statement> : : = IF < l o g i c a l e x p r e s s i o n >

THEN

Most but not all of these are needed for model IV, but model IV itself needs two productions which are not needed by models I, II, or III:

<letter>::=E|V|W|Z
<actual parameter list>::=<actual parameter>|<actual parameter list>,<actual parameter>

The only new productions required by models V and VI are <letter>::=A, and <letter>::=W, respectively, but these can easily be removed by using different identifiers.

We give below the six models, modified so as to be AlgolW-comparable. The concrete syntax of their common AlgolW-support is given in Appendix B.

I The Trivial Model

    BEGIN INTEGER I,J,N,V,W,Z; REAL ARRAY E,U,Y(1::296);
    BEGIN FOR J:=1 UNTIL 296 DO READON (Y(J-0));
    READ (I);

V:=O; IF I,<SlOS2:e2>,<s2os2:e3>,...} The c h a r a c t e r i s t i c

set of B is not known in this case, so this definition cannot be completed. But suppose that B were the object:

$B=\{\langle s_1:e_4\rangle,\langle s_2:e_5\rangle\}$

then we would have

$A=\{\langle s_1:e_1\rangle,\langle s_1\circ s_2:e_2\rangle,\langle s_2\circ s_2:e_3\rangle,\langle s_1\circ s_3\circ s_2:e_4\rangle,\langle s_2\circ s_3\circ s_2:e_5\rangle\}$.

We now introduce the $\mu$-function, which is used to perform operations on objects. The $\mu$-function takes two arguments, the first of which is an object $A$, and the second is a pair $\langle\kappa:B\rangle$, where $\kappa$ is a composite selector, and $B$ is an object. The range of $\mu$ is the set of all objects. The value $\mu(A;\langle\kappa:B\rangle)$ is an object which is obtained from $A$ by replacing $\kappa(A)$ by $B$ in such a way that $\kappa(\mu(A;\langle\kappa:B\rangle))=B$. This is most clearly shown by examples (taken from Lucas et al (68)): Let A =

[Tree diagram of the object A.]

Then

(i) $\mu(A;\langle s_3:B\rangle)=$ [tree diagram]

(ii) $\mu(A;\langle s_1\circ s_2:B\rangle)=$ [tree diagram]

(iii) $\mu(A;\langle s_1\circ s_1\circ s_1\circ s_2:B\rangle)=$ [tree diagram]

In particular, if $B=\Omega$, we obtain:

(i) $\mu(A;\langle s_3:\Omega\rangle)=A$

(ii) $\mu(A;\langle s_1\circ s_2:\Omega\rangle)=$ [tree diagram]

(iii) $\mu(A;\langle s_1\circ s_1\circ s_1\circ s_2:\Omega\rangle)=$ [tree diagram]

Ollongren (49) gives conditions under which interchanging the arguments of the $\mu$-function leaves the value unchanged.

(ii) $\mu_0(\langle\kappa_1:B_1\rangle,\ldots,\langle\kappa_n:B_n\rangle)=\mu(\Omega;\langle\kappa_1:B_1\rangle,\ldots,\langle\kappa_n:B_n\rangle)$

Thus $\mu_0$ is a function which "creates" objects.

Example

$\mu_0(\langle s_1:e_1\rangle,\langle s_1\circ s_2:B\rangle,\langle s_2\circ s_2:e_3\rangle)=$ [tree diagram]
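The behaviour of the $\mu$-function can be sketched in Python by representing objects as nested dictionaries keyed by simple selectors. This is an informal illustration, not the definition used by Lucas et al or Ollongren: the representation, the use of `None` for the null object $\Omega$, the convention that a composite selector is written as a tuple and applied right to left, and the treatment of multi-argument $\mu$ as successive single replacements are all assumptions of the sketch.

```python
OMEGA = None  # stands for the null object


def select(obj, kappa):
    """Apply the composite selector kappa = (s_a, ..., s_z), read as
    s_a o ... o s_z, i.e. the right-most selector is applied first."""
    for s in reversed(kappa):
        if not isinstance(obj, dict) or s not in obj:
            return OMEGA
        obj = obj[s]
    return obj


def mu(obj, kappa, new):
    """Return a copy of obj in which the component selected by kappa is new."""
    if not kappa:
        return new
    *rest, first = kappa              # 'first' is the selector applied first
    base = dict(obj) if isinstance(obj, dict) else {}
    child = mu(base.get(first, OMEGA), tuple(rest), new)
    if child is OMEGA:
        base.pop(first, None)         # a null component is simply absent
    else:
        base[first] = child
    return base if base else OMEGA


def mu0(*pairs):
    """Create an object from the null object by successive replacements."""
    obj = OMEGA
    for kappa, component in pairs:
        obj = mu(obj, kappa, component)
    return obj


# The mu0 example from the text, with elementary objects written as strings.
A = mu0((("s1",), "e1"), (("s1", "s2"), "B"), (("s2", "s2"), "e3"))
print(A)
```

With this representation, replacing a component by `OMEGA` deletes the corresponding branch, and replacing a component that does not exist by `OMEGA` leaves the object unchanged.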

A.4 Concrete Syntax.

The concrete syntax of a programming language can be defined by using the Backus-Naur form of writing production rules. This is a shorthand method of defining a grammar.

Suppose that there exists a finite non-empty set $\Sigma$ of terminals. Typical elements of $\Sigma$ are: b, 2, *, begin, and so on. Let $\Sigma^*$ denote the set of all finite strings of elements of $\Sigma$. Also, suppose $N$ is a finite non-empty set of non-terminals such that $N\cap\Sigma=\emptyset$ and $N^*$ is the set of all finite strings of elements of $N$. Let $V=\Sigma\cup N$, and $V^*=(\Sigma\cup N)^*$. Let $V^+=V^*-\{\Lambda\}$, where $\Lambda$ is the empty string. Then the set of production rules is the set

$\Pi=\{(\alpha,\beta) : \alpha\in V^*\times N\times V^* \;\&\; \beta\in V^+\}$.

Each pair $(\alpha,\beta)\in\Pi$ is written as $\alpha\to\beta$. In Backus-Naur form, the set of production rules $\{\alpha\to\beta_1,\alpha\to\beta_2,\ldots,\alpha\to\beta_n\}$ is denoted by the single expression $\langle\alpha\rangle ::= \beta_1|\beta_2|\ldots|\beta_n$. The brackets < > are used to denote non-terminals. Backus-Naur notation can be used to express production rules $(\alpha,\beta)$, providing that $\alpha\in N$. Such production rules are called context-free.

A grammar $G$ is a 4-tuple $G=(N,\Sigma,P,S)$, where $P$ is a finite non-empty subset of $\Pi$, and $S\in N$ is the start symbol. If each production of a grammar is context-free then the grammar is said to be context-free. A context-free grammar can be conveniently defined by a finite set of expressions in Backus-Naur form.
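The correspondence between one Backus-Naur expression and a set of production pairs can be sketched as follows. This is a minimal Python illustration with hypothetical function names; it assumes that the symbol "|" never occurs as a terminal on the right-hand side, and the rule used in the example is a toy rule rather than one taken from the text.

```python
def expand_bnf(rule):
    """Split '<alpha>::=beta1|beta2|...' into the production pairs
    (alpha, beta1), (alpha, beta2), ..."""
    left, right = rule.split("::=", 1)
    alpha = left.strip()
    return [(alpha, beta.strip()) for beta in right.split("|")]


def is_context_free(productions, nonterminals):
    """Every left-hand side must be a single non-terminal."""
    return all(alpha in nonterminals for alpha, _ in productions)


# A toy rule, used only for illustration.
productions = expand_bnf("<sign>::=+|-")
print(productions)
print(is_context_free(productions, {"<sign>"}))
```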

If there exist $\delta_1,\delta_2\in V^*$ and $\alpha\to\beta\in P$ such that $\gamma_1=\delta_1\alpha\delta_2$ and $\gamma_2=\delta_1\beta\delta_2$ then $\gamma_1\Rightarrow\gamma_2$. If $\gamma_i\in V^*$ and $\gamma_{i-1}\Rightarrow\gamma_i$ $(i=1,2,\ldots,n)$, then $\gamma_0\Rightarrow^*\gamma_n$ ($\gamma_n$ is derived from $\gamma_0$). The grammar $G$ is said to generate the language

$L(G)=\{x : S\Rightarrow^* x \;\&\; x\in\Sigma^*\}$.

Two grammars are equivalent if they generate the same language.

The Vienna Method defines two grammars for each language. One is the concrete syntax, the other the abstract syntax. Lucas et al (68) explain the distinction most clearly:

"An abstract syntax is one which only specifies the expressions of the language as to the structures significant for their subsequent interpretation and not as to how they are to be expressed for the purpose of communication either to oneself or to others. A concrete syntax specifies the expressions of the language as a set of character strings".

The concrete syntax of LML is defined as follows: <program>

::= ,

,

,

, .

< r a t i o n a l > : : = +J

-JO

::= J

::= J

:

O 1 2 3 4 5 6 7 8 9 . , + -

are r e q u i r e d to be signed so that the

...,

terms in the table look-up will be coded in a size-capturing manner (cf. chapter 7). This requirement is extended to the other terms solely for simplicity of definition.

An example of a valid string in LML is:

2,1,+.6,-3,0,-1.41,-5.2.

This program can be parsed (using the syntax definition) to give the object (not shown in full):

[Tree diagram of the parsed concrete program.]

A.5 Abstract Syntax

We introduce the following notational conventions: if an object $x$ satisfies a predicate $P$, we write is-$P(x)$. The set of objects which satisfy $P$ is denoted $\widehat{\text{is-}P}$. That is, $\widehat{\text{is-}P}=\{x : \text{is-}P(x)\}$. The set $\widehat{\text{is-}P}$ is defined by an expression of the form

is-$P$ = (<s-$P_1$:is-$P_1$>,<s-$P_2$:is-$P_2$>,...,<s-$P_n$:is-$P_n$>)

which indicates that for every $x\in\widehat{\text{is-}P}$,

$x=\mu_0(\langle \text{s-}P_1:x_1\rangle,\langle \text{s-}P_2:x_2\rangle,\ldots,\langle \text{s-}P_n:x_n\rangle)$,

where $x_1\in\widehat{\text{is-}P_1}$, $x_2\in\widehat{\text{is-}P_2}$, ..., $x_n\in\widehat{\text{is-}P_n}$. If is-$P$ = (<s-$P_1$:is-$P_1$>) then we write is-$P$ = is-$P_1$. A predicate may also be defined by using the disjunction operator $\vee$, e.g.:

is-$P$ = is-$P_1$ $\vee$ is-$P_2$,

which denotes that $x\in\widehat{\text{is-}P}$ only if $x\in\widehat{\text{is-}P_1}$ or $x\in\widehat{\text{is-}P_2}$.

It is assumed that certain predicates are satisfied by subsets of the elementary objects.

Using this notation, the abstract syntax of LML is defined as follows:

is-program = (<s-n:is-integer>,<s-m:is-integer>,<s-rational-list:is-rational-list>)
is-rational-list = (<s-head:is-rational>,<s-tail:is-rational-list $\vee$ is-$\Omega$>)

It is assumed that $\widehat{\text{is-}\Omega}=\{\Omega\}$, and that the predicates is-integer and is-rational are satisfied by (countably) infinite sets of elementary objects. Every LML "abstract program" satisfies the predicate is-program. For example, the abstract program corresponding to the concrete LML program introduced in section A.4 is the object

[Tree diagram: s-n selects 2, s-m selects 1, and s-rational-list selects a list whose successive s-head components are +.6, -3, 0, -1.41 and -5.2.]

198

How

this

object

specified next

by

i9 o b t a i n e d

the

section.

from

translator, Note

which

is m e r e l y

object,

it is c h o s e n

Most the

discussions

abstract

defining

syntax

syntax, not be

If w e w e r e

to a d o p t

would

assumes not

an i n f i n i t e

to have

measure of

terminals

earlier which

must

(in o u r (i)

(2)

case,

of

view,

set

of

has

over

there

of t h e

size

3 is

are

allowed.

of the

string

over

"machines"

programs

is n o t

are:

a finite

an e f f e c t i v e

terminals

length

and the

a size measure

there

axiom

of p r o g r a m s ,

of axioms

size,

first

I t is e s s e n t i a l

pair

These

it

therefore

a useful

of any given

the

A.4).

since

discussed

at m o s t

any y, w h i c h

abstract

As

programs).

exists

as its realisations

(and is

the

that

languages.

above

a program.

introduced

exists

out

for our purposes,

in s e c t i o n

by

viewed

concrete

"terminals"

constitute~

value.

point

separate

to

for that

mnemonic

the

in t h e

any m e a n i n g

Method

then

is

the o b j e c t

label

can be

to b e

satisfactory

satisfied

for Clearly,

this

attach

alternative

in c h a p t e r

(15)

be

that

a maasure

which

Blum

a language

program

defined

for

arbitrary

the Vienna

as d e f i n e d

introduced

not

to h a v e

considered

not be

a "grar~mar"

for us

of

and

of it n e e d

syntax

of

an

be

"+.6"

(p) d o e s

that object.".6"

concrete

will

that writing

s-head o s-rational-list

although

the

number

procedure

of programs

for d e c i d i n g ,

are o f s i z e y.

satisfied

if i n f i n i t e

sets

199

Furthermore, procedures

we w a n t p r o g r a m s


to d e s c r i b e e f f e c t i v e

functions.

Defining a language with infinitely many terminals would correspond to defining a Turing machine with infinitely many tape symbols. This would be a fundamental change in the notion of "computability".

To overcome these objections, it would be possible to define an abstract syntax for LML which specified a finite set of terminals. The object at each terminal node of an abstract program would then satisfy one of the predicates is-digit or is-sign, say, and these would be defined by

is-digit = is-0 $\vee$ is-1 $\vee$ ... $\vee$ is-9
is-sign = is-+ $\vee$ is--,

and $\widehat{\text{is-0}}=\{0\}$, ..., $\widehat{\text{is-9}}=\{9\}$, $\widehat{\text{is-+}}=\{+\}$, $\widehat{\text{is--}}=\{-\}$. In this case the assembly of the digits into integers and rationals would have to be performed by the interpreting automaton, rather than by the translator.

A.6 The Translator

The translator is a function which maps the set of parsed concrete programs in a language into the set of abstract programs. To distinguish between concrete and abstract objects we introduce the conventions: is-<program>(x) means that x is a parsed concrete program, namely an object such as that shown in section A.4. More precisely, for LML we have, for some positive integer k:

is-<program> = (<$s_1$:is-<integer>>,<$s_2$:is-,>,...,<$s_{2k-1}$:is-<rational>>,<$s_{2k}$:is-.>)

The predicates is-<integer>, is-,, etc. are obtained from the concrete syntax in exactly the same way. Obviously, we have $\widehat{\text{is-,}}=\{,\}$, $\widehat{\text{is-0}}=\{0\}$, etc.

In the following definition, the statement if...then...else... is a statement in the metalanguage. It is assumed that is-<program>(p) and is-<rational>($x_i$). The LML translator, trans-program, is defined as:

trans-program(p) = $\mu_0$(<s-n:trans-integer($s_1$(p))>, <s-m:trans-integer($s_3$(p))>, <s-rational-list:makelist($s_5$(p),$s_7$(p),...,$s_{2k-1}$(p))>)

where

makelist($x_1,x_2,\ldots,x_n$) = $\mu_0$(<s-head:trans-rational($x_1$)>, <s-tail: if $x_2=\Omega\;\&\ldots\&\;x_n=\Omega$ then $\Omega$ else makelist($x_2,\ldots,x_n$)>)

and the functions

trans-rational: $\widehat{\text{is-<rational>}} \to \widehat{\text{is-rational}}$
trans-integer: $\widehat{\text{is-<integer>}} \to \widehat{\text{is-integer}}$

are not further defined. For our purposes these two functions are best thought of as the usual mappings onto the rational numbers. (In an actual implementation, it may be more useful to consider them as mappings into bit-patterns. In this case the sets $\widehat{\text{is-rational}}$ and $\widehat{\text{is-integer}}$ would be finite sets, due to the fixed word-length of practical computers).

Note that trans-program(p) $\in\widehat{\text{is-program}}$, and makelist($x_1,\ldots,x_n$) $\in\widehat{\text{is-rational-list}}$.
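The translator defined above can be mimicked in Python for the example string of section A.4. The sketch below is an approximation made for illustration: it splits the character string directly on commas instead of working on a parsed object with selectors $s_1, s_2, \ldots$, it represents abstract objects as nested dictionaries with `None` standing for $\Omega$, and it uses `fractions.Fraction` as one possible realisation of trans-rational. The function names follow the text, but the implementation details are assumptions.

```python
from decimal import Decimal
from fractions import Fraction

OMEGA = None  # stands for the null object


def makelist(items):
    """Nested head/tail object for a non-empty sequence of rational strings."""
    head, *rest = items
    return {"s-head": Fraction(Decimal(head)),
            "s-tail": makelist(rest) if rest else OMEGA}


def trans_program(text):
    """Map a concrete LML string to a nested-dictionary abstract program."""
    if not text.endswith("."):
        raise ValueError("an LML program must end with '.'")
    parts = text[:-1].split(",")      # plays the role of s1(p), s3(p), s5(p), ...
    if len(parts) < 3:
        raise ValueError("expected two integers and at least one rational")
    n, m, *rationals = parts
    return {"s-n": int(n), "s-m": int(m),
            "s-rational-list": makelist(rationals)}


program = trans_program("2,1,+.6,-3,0,-1.41,-5.2.")
print(program["s-n"], program["s-m"])

# Walk the head/tail list and print its elements in order.
node = program["s-rational-list"]
while node is not OMEGA:
    print(node["s-head"])
    node = node["s-tail"]
```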

A.7 The Interpreting Automaton

Following Ollongren (49), we define an interpreting automaton to be a 5-tuple $(O,\text{is-state},\xi_0,\Lambda,F)$, where $O$ is the set of tree-structured objects already introduced, and is-state is a predicate over $O$. Objects satisfying is-state are states of the automaton. $\xi_0\in\widehat{\text{is-state}}$ is the initial state of the automaton, and $F$ is a set of final states. $\Lambda$ is the state transition function; however, its range is not $\widehat{\text{is-state}}$, but the power set of $\widehat{\text{is-state}}$. $\Lambda(\xi)$ is thus a set of states in general, although in our definition of LML, $\Lambda(\xi)$ will always be a single state.

A.7.1 The State

The state of the interpreting automaton is structured. The structure depends on the language to be defined, and for the definition of LML can be rather simple. A language with block structure, variable identifiers of various types, procedures, conditional and goto statements, and so on, will need a rather more complicated set of states.

For the LML interpreting automaton, or LML machine, we define

is-state = (<s-c:is-c>,<s-dn:is-dn>,<s-counter:is-integer>).

is-dn is a predicate satisfied by a denotation directory, and is defined by

is-dn = (<s-data:is-data>,<s-y:is-rational>,<s-parno:is-integer $\vee$ is-$\Omega$>)

where

is-data = (<s-i:is-integer>,<s-list:is-rational-list>).

The data for a program, namely the sequence $i,y_{i-1},y_{i-2},\ldots$ appears in the initial state as the object

[Tree diagram: s-i selects $i$, and s-list selects a list whose successive s-head components are $y_{i-1}, y_{i-2}, \ldots$]

We do not specify how this is achieved. Similarly, the result of the computation, $y_i$, is the object s-y$\circ$s-dn($\xi_F$), where $\xi_F\in F$, and we do not specify how it is output. The number m+n, which is required for the correct interpretation of the program, is stored in s-parno$\circ$s-dn($\xi$). (The term "denotation directory" is taken over from (49) and (68). For LML this directory is simpler than in (49) and (68), but it serves essentially the same purpose, namely storing intermediate and final results).

The most complex part of the state is the control, which is an object satisfying the predicate

is-c = (<s-in:is-in>,<s-al:is-obj-list>,<s-ri:is-dum $\vee$ is-$\Omega$>,<r:is-c $\vee$ is-$\Omega$>)

where the following abbreviations have been used:

c: control, in: instruction, al: argument list, obj: object, ri: return information, dum: dummy.

In this definition, $\widehat{\text{is-in}}$ is a subset of the elementary objects, called the set of instructions, and $\widehat{\text{is-dum}}$ is a subset of the elementary objects called the set of dummy names. r is a simple selector, different from s-in, s-al or s-ri. is-obj-list is a predicate satisfied by lists of objects, which we do not define further; Ollongren (49) gives an extensive discussion of lists.

An example of a control is the object:

  s-in: in₁, s-al: a
  r:  s-in: in₂, s-al: x, s-ri: a

This particular control may have the following effect: the instruction in₂ is performed, with x as its argument. The result of carrying out in₂ is assigned to the dummy name a. in₂ is then deleted, so that the control part of the next state is

  s-in: in₁, s-al: in₂(x)

in₁ is now carried out, with in₂(x) as its argument. In this case, in₂ is said to be contracting. On the other hand, it may be that carrying out in₂ requires first carrying out some other instruction in₄ on x, and then an instruction in₃ on in₄(x). In this case in₂ is said to be expanding, and carrying it out results in the next state having the control:

  s-in: in₁, s-al: a
  r:  s-in: in₃, s-al: b, s-ri: a
      r:  s-in: in₄, s-al: x, s-ri: b

If both in₄ and in₃ are contracting, the controls of the next two states will be:

  s-in: in₁, s-al: a
  r:  s-in: in₃, s-al: in₄(x), s-ri: a

and

  s-in: in₁, s-al: in₃(in₄(x)) = in₂(x)

If an instruction is expanding, then performing it leaves all components of the state unchanged except the control itself. However, if it is contracting, then its effect may be to change any of the components of the state (in our case, s-counter(ξ) and s-dn(ξ), as well as s-c(ξ)).

We need some definitions for later use.
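The difference between contracting and expanding instructions can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not part of the Vienna Method: a control is modelled as a list of levels (outermost first, terminal level last), which is adequate for the linear controls that occur in the examples, and the names `Level`, `contract` and `expand` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

class Level(NamedTuple):
    instr: str                 # s-in component: the instruction
    args: Tuple                # s-al component: the argument list
    ri: Optional[str] = None   # s-ri component: a dummy name, or None for Ω

def contract(control, value):
    """Delete the terminal level and hand `value` to every preceding
    argument that equals the terminal level's dummy name."""
    *rest, terminal = control
    if terminal.ri is None:
        return rest
    return [lvl._replace(args=tuple(value if a == terminal.ri else a
                                    for a in lvl.args))
            for lvl in rest]

def expand(control, replacement):
    """Replace the terminal level by new levels; the first new level
    inherits the return information of the level it replaces."""
    *rest, terminal = control
    top, *deeper = replacement
    return rest + [top._replace(ri=terminal.ri)] + deeper

# The example control: in1 waits for the value named by the dummy "a".
control = [Level("in1", ("a",)), Level("in2", ("x",), "a")]
contracted = contract(control, "in2(x)")
expanded = expand(control, [Level("in3", ("b",)), Level("in4", ("x",), "b")])
```

Applying `contract` twice to `expanded` (first with the value of in₄, then with the value of in₃) leaves in₁ with the single argument in₃(in₄(x)), mirroring the sequence of controls shown above.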

The set of control selectors of a control C is the set of composite selectors

  K(C) = {κ: κ = r∘r∘...∘r & κ(C) ≠ Ω},  if C ≠ Ω,

and is I (the identity selector) if C = Ω.

The terminal control selector of a control C is the composite selector

  T(C) = {τ: τ ∈ K(C) & r∘τ ∉ K(C)}.

If κ = rⁿ is a control selector, where rⁿ = r∘r∘...∘r (n compositions) and n ≥ 1, then the mth preceding control selector of κ is

  prec_m(κ) = {κ′: r^m∘κ′ = κ}   (0 ≤ m ≤ n).

If κ is a control selector of a non-empty control C, then

  prec-arg(C,κ) = {s-al∘prec_i(κ): i ≥ 1 & s-al∘prec_i(κ)(C) = s-ri∘κ(C) ≠ Ω}

is the set of composite selectors which select those arguments of instructions associated with preceding control selectors of κ which are equal to the dummy name associated with κ.

If κ is a control selector of a non-empty control C, then the derived return information associated with κ is the set ri(C,κ) = prec-arg(C,κ). (This definition is included because these two sets differ in (49), but coincide for the relatively simple LML machine.)

The initial state of the LML machine is

  ξ₀ = μ₀(<s-c: μ₀(<s-in: int-prog>, <s-al: p>)>, <s-counter: 1>, <s-dn: μ₀(<s-data: d>, <s-y: 0>)>).

Here, p is the LML program, which satisfies the predicate is-program introduced in section A.5. The object d satisfies is-data, defined earlier in this section, and int-prog is an instruction which will be defined later.

The set of final states of the LML machine is the set

  F = {ξ: is-state(ξ) & s-c(ξ) = Ω}.

A sequence ξ₀, ξ₁, ..., where ξᵢ₊₁ ∈ Λ(ξᵢ), is a computation of the LML machine. If, for some i, ξᵢ ∈ F then the computation terminates. (Every LML computation terminates.)

A.7.2 The State Transition Function

With every instruction in ∈ is-in is associated an interpreting function φ_in. Let C be a non-empty control of a state ξ, and κ a control selector of C. Let s-in∘κ(C) = in, and let ARG = s-al∘κ(C) be the list of arguments of in.

Then

  φ_in(ARG, ξ, κ) = if P₁(ARG,ξ) then g₁
                    else if P₂(ARG,ξ) then g₂
                    ...
                    else if P_m(ARG,ξ) then g_m,

where P₁, P₂, ..., P_m are predicates (m ≥ 1), and g_j has one of two forms:

(i) For the case of contracting control,

  g_j = μ(μ(μ(ξ; <κ∘s-c: Ω>); {<τ∘s-c: e₀ʲ(ARG)>: τ ∈ ri(C,κ)}); <s-counter: is-integer>, <s-dn: e₁ʲ(ARG)>)

where e₀ʲ and e₁ʲ are objects. In this expression the innermost μ deletes the instruction in, its argument list and its return information, the middle μ returns the object e₀ʲ(ARG) to preceding control selectors, and the outermost μ alters components of the state other than the control.

(ii) For the case of expanding control,

  g_j = μ(ξ; <κ∘s-c: μ(eʲ(ARG); <s-ri: s-ri∘κ∘s-c(ξ)>)>),

where eʲ(ARG) satisfies is-c. In this case the inner μ associates the return information of κ(C) with the new control eʲ(ARG), and the outer μ replaces the control κ(C) with the new object thus created.

The Vienna Method uses a system of instruction schemata to define interpreting functions rather more concisely and in a more readable fashion than the above expressions. However, we shall not describe this feature, since it is feasible to define LML in the above manner.

It is now possible to define the state transition function:

  Λ(ξ) = {η: η = φ_in(ARG, ξ, κ) & κ = T(s-c(ξ)) & in = s-in∘κ∘s-c(ξ) & ARG = s-al∘κ∘s-c(ξ)}.

From this definition it is apparent that the state transition is determined by always carrying out the instruction associated with the terminal control of the state, namely the instruction occurring at the "deepest" level of the control (cf. examples in section A.7.1). In general (although not for LML), T(s-c(ξ)) will be a set containing more than one control selector. Hence our earlier remark that Λ(ξ) will in general be a set of states, rather than a single state. In such a case, it does not matter which of the terminal instructions is performed first.
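The state transition function can be illustrated by a small Python sketch. It rests on several assumptions that are not in the text: the state is a dictionary, the control is a list of (instruction, arguments, dummy name) triples with the terminal level last, the empty list stands for the empty control Ω, and the contracting instruction `store` is a hypothetical stand-in for an instruction that alters a component of the state other than the control.

```python
def returned(control, k, value):
    """Control after deleting level k and passing `value` to every preceding
    argument equal to the dummy name of level k."""
    _, _, ri = control[k]
    return [(instr, tuple(value if a == ri else a for a in args), r)
            for (instr, args, r) in control[:k]]

def phi_sum(args, state, k):
    # Contracting: returns x + y and leaves the rest of the state unchanged.
    x, y = args
    return {**state, "c": returned(state["c"], k, x + y)}

def phi_store(args, state, k):
    # Contracting (hypothetical): writes its argument into the component "y".
    (x,) = args
    return {**state, "c": returned(state["c"], k, None), "y": x}

INTERPRETERS = {"sum": phi_sum, "store": phi_store}

def transition(state):
    """Carry out the instruction at the terminal (deepest) level."""
    k = len(state["c"]) - 1
    instr, args, _ = state["c"][k]
    return INTERPRETERS[instr](args, state, k)

def computation(state):
    """The sequence of states up to the first state with an empty control."""
    trace = [state]
    while state["c"]:
        state = transition(state)
        trace.append(state)
    return trace

initial = {"c": [("store", ("v",), None), ("sum", (2, 3), "v")], "y": 0}
trace = computation(initial)
```

Because the control here is a single chain, each state has exactly one successor, which corresponds to the LML case in which Λ(ξ) contains a single state.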

It is the specification of the interpreting functions of an interpreting automaton which assigns meaning to an abstract program.

A.7.3 Interpreting Functions for LML

We now complete the definition of LML by defining a set of interpreting functions for it. The instructions to be defined are as follows:

  Instruction      Type          Domain
  int-prog         expanding     is-program
  int-mn           expanding     (is-integer)²
  set-mn           contracting   is-integer
  int-prog-list    expanding     is-rational-list
  updatey          contracting   is-rational
  product          contracting   (is-rational)²
  sum              contracting   (is-rational)²

We assume that the binary arithmetic operators + and * are available. The remarks at the end of section A.6 apply to these.

In the controls listed below, each line is one level of the control, and each level is the r-component of the level above it.

(1) int-prog

  φ_int-prog(p, ξ₀, I) = μ(ξ₀; <s-c: e(p)>)

where e(p) =

  s-in: int-prog-list, s-al: s-rational-list(p)
  r:  s-in: int-mn, s-al: (s-m(p), s-n(p))

(2) int-mn

  φ_int-mn((x,y), ξ, κ) = μ(ξ; <κ∘s-c: e(x,y)>)

where e(x,y) =

  s-in: set-mn, s-al: v
  r:  s-in: sum, s-al: (x,y), s-ri: v

(3) set-mn

  φ_set-mn(x, ξ, κ) = μ(μ(ξ; <κ∘s-c: Ω>); <s-dn: μ(s-dn(ξ); <s-parno: x>)>)

Note: set-mn puts the value m+n into s-parno∘s-dn(ξ).

(4) int-prog-list

  φ_int-prog-list(x, ξ, κ) = if s-counter(ξ) < s-parno∘s-dn(ξ)+2
                             then μ(ξ; <κ∘s-c: e₁(x)>)
                             else μ(ξ; <κ∘s-c: e₂(x)>)

where e₁(x) =

  s-in: int-prog-list, s-al: s-tail(x)
  r:  s-in: updatey, s-al: v
      r:  s-in: sum, s-al: (u, s-y∘s-dn(ξ)), s-ri: v
          r:  s-in: product, s-al: (s-head(x), s-head∘s-list∘s-data∘s-dn(ξ)), s-ri: u

and e₂(x) =

  s-in: updatey, s-al: v
  r:  s-in: sum, s-al: (s-y∘s-dn(ξ), s-head∘(s-tail)ⁱ(x)), s-ri: v

where i = s-i∘s-data∘s-dn(ξ).

(5) updatey

  φ_updatey(x, ξ, κ) = μ(μ(ξ; <κ∘s-c: Ω>); <s-dn: μ(s-dn(ξ); <s-y: x>, <s-list∘s-data: s-tail∘s-list∘s-data∘s-dn(ξ)>)>, <s-counter: s-counter(ξ)+1>)

Note: updatey puts an intermediate value into s-y∘s-dn(ξ), brings the next data item to the top of s-list∘s-data∘s-dn(ξ), and increases s-counter(ξ) by 1.

(6) product

  φ_product((x,y), ξ, κ) = μ(μ(ξ; <κ∘s-c: Ω>); <τ∘s-c: x*y>),  where τ ∈ ri(s-c(ξ),κ).

(7) sum

  φ_sum((x,y), ξ, κ) = μ(μ(ξ; <κ∘s-c: Ω>); <τ∘s-c: x+y>),  where τ ∈ ri(s-c(ξ),κ).
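Stripped of the control machinery, the arithmetic carried out by the instructions int-prog-list, product, sum and updatey can be sketched in Python as follows. This is an informal reading of the definitions, not a restatement of them: the function name `lml_output` and its parameters are hypothetical, the coefficients are assumed to be paired one-to-one with the items of the data list, and `residual` stands for the item selected by s-head∘(s-tail)ⁱ in the final sum. Python's `Fraction` is used because the instructions are defined on rationals.

```python
from fractions import Fraction

def lml_output(coefficients, data, residual):
    """Accumulate coefficient * data item into y, then add the residual.

    Each loop pass corresponds to one product / sum / updatey round:
    product forms a * z, sum adds it to the current s-y, and updatey
    stores the result and moves on to the next data item.
    """
    y = Fraction(0)                       # s-y starts at 0 in the initial state
    for a, z in zip(coefficients, data):  # assumption: equal-length lists
        y = y + Fraction(a) * Fraction(z)
    return y + Fraction(residual)         # the final sum before the last updatey

result = lml_output([Fraction(1, 2), 2], [4, 3], Fraction(1, 4))
```

With the illustrative values above the rounds give 1/2 * 4 = 2 and 2 * 3 = 6, and adding the residual 1/4 yields 33/4.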

In order to clarify the above definitions, some of the steps in an LML computation are shown below. To save space, only the control and those parts of the state which have just changed are shown. Each control is listed from its outermost level to its terminal level.

ξ₀:  control: int-prog (p), where p has the components s-m = m, s-n = n and s-rational-list = (a₁, a₂, ...);
     s-counter = 1; s-y = 0; s-data with components s-i and s-list = (yᵢ₋₁, ...)

ξ₁:  control: int-prog-list (a₁, ...); int-mn (m,n)

ξ₂:  control: int-prog-list (a₁, ...); set-mn (v); sum (m,n), s-ri: v

ξ₃:  control: int-prog-list (a₁, ...); set-mn (m+n)

ξ₄:  control: int-prog-list (a₁, ...);
     s-parno = m+n

if m+n > 0 then

ξ₅:  control: int-prog-list (a₂, ...); updatey (v); sum (u, 0), s-ri: v; product (a₁, yᵢ₋₁), s-ri: u

ξ₆:  control: int-prog-list (a₂, ...); updatey (v); sum (a₁*yᵢ₋₁, 0), s-ri: v

ξ₇:  control: int-prog-list (a₂, ...); updatey (a₁*yᵢ₋₁)

ξ₈:  control: int-prog-list (a₂, ...);
     s-counter = 2; s-y = a₁*yᵢ₋₁; s-list = (yᵢ₋₂, ...)

A sequence like ξ₅, ξ₆, ξ₇, ξ₈ is now repeated until s-counter(ξᵢ) = m+n+2, whereupon we get

ξᵢ₊₁:  control: updatey (v); sum (ŷ, dᵢ), s-ri: v;
       s-counter = m+n+2

ξᵢ₊₂:  control: updatey (ŷ+dᵢ)

ξᵢ₊₃:  control: Ω;
       s-counter = m+n+3; s-y = yᵢ; s-parno = m+n

ξᵢ₊₃ ∈ F, so the computation has terminated. Its result is available in s-y∘s-dn(ξᵢ₊₃).

It should be remarked that the above definitions of LML are not quite complete. There are restrictions on an LML program which cannot be expressed in the earlier specifications of concrete and abstract syntax. These restrictions are that the number of parameters be compatible with the values of m and n, and that the value of i should not exceed N, the number of data items. These restrictions can be expressed in the definitions of the LML instructions. This is simply done by entering an error state if any of these conditions are violated. In languages like Algol, this technique can also be used to specify context-sensitive restrictions, which cannot be expressed in the context-free grammars (see (49)).
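The error-state technique mentioned above can be sketched as follows. This is a minimal illustration, assuming a dictionary-valued state with a hypothetical "error" component. Since the exact relation between the number of parameters and the values of m and n is not restated here, the expected count is passed in as the hypothetical parameter `expected_count` rather than computed.

```python
def check_restrictions(parameters, expected_count, i, N):
    """Return a state whose "error" component records the first violated
    restriction, or None if both restrictions hold."""
    if len(parameters) != expected_count:
        return {"error": "number of parameters incompatible with m and n"}
    if i > N:
        return {"error": "i exceeds the number of data items N"}
    return {"error": None}

ok = check_restrictions([1, 2, 3], expected_count=3, i=2, N=5)
bad_count = check_restrictions([1, 2], expected_count=3, i=2, N=5)
bad_index = check_restrictions([1, 2, 3], expected_count=3, i=6, N=5)
```

An interpreting function such as that of int-prog could perform a check of this kind before building its control, and replace the whole state by the error state when a restriction is violated.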

A.8 Summary

The Vienna method of defining programming languages has been described. This method includes the formal definition of the semantics of a language, and is sufficiently powerful to be used for the definition of practical programming languages. It has been used here for the definition of the simple and special-purpose Linear Model Language. This has been done both to illustrate the method, and in order to make familiar a rather broader notion of "programming language" than is usual.

The Vienna Method of language definition is used in chapter 5 to formalise the notion of a "fragment" of a language.

APPENDIX B

Syntax of the AlgolW-Support of the Gas-Furnace Models

This appendix contains the concrete syntax of the AlgolW-support of the five models of section 6.3.2. It is based on the AlgolW syntax specification given in (50). The numbers in brackets to the right of subheadings indicate the relevant sections of (50), in order to facilitate comparison. Standard procedure statements are new non-terminals which do not appear in (50) (cf. sec. 6.3.1). The symbol "t" may be replaced by either "real" or "integer", in accordance with the rules specified in sections 1.1, 1.5, 1.5.3 and 1.6.2 of (50).

1. Identifiers (1.2)

::=

::=
::=
<standard procedure identifier> ::= READ | READON | WRITE
::= E | I | J | N | U | V | W | Y | Z
::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

(Note:each of these appears in

:: = l,

(1.3.1)

::=

::=. I

.


::=l

(1.4)

<declaration> ::= <simple variable declaration> |

3.1 Simple Variable Declarations (1.4.1)

<simple variable declaration> ::= <simple type>

<simple type> ::= INTEGER | REAL

3.2 Array Declarations (1.4.2)

::=<simple

type>ARRAY

() ::= ::=:: ::= ::= 4.

Expressions

(1.5)

::=<simple 4.1 Variables

t expression>

(1.5.1)

<simple t variable>::= I ::=<simple

t variable>

::=(<subscript

<subscript list>::=<subscript> <subscript>::=

list>)

4.2 Arithmetic Expressions (1.5.3)

<simple t expression> ::= | <simple t expression> + | <simple t expression> -
::= | *
::=
::= |

4.3 Logical Expressions (1.5.4)

::= ::=<simple t expression>

<simple t expression> ::= < 5.

Statements

(1.6)

<program>::=.

(Note we do not provide a

specification of the syntax of ). <statement>::=<simple

statement> I

I <simple statement>

::=l I <standard procedure statement>

5.1 Blocks (1.6.1)

::= <statement> END
::= | <statement> ;
::= BEGIN | <declaration>

5.2 Assignment Statements (1.6.2)

::=


::=:=

Procedure

<standard procedure

Statements

(cf. 1.6.3 and 1.6.8)

statement>::=<standard

procedure

()

list>::=<subscript> (1.6.5) clause><simple

statement>

ELSE<statement> ::= 5.5

Iterative

::=<statement>

FOR:=

value>UNTIL
