JOURNAL OF SEMANTICS
AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE
MANAGING EDITOR: PETER BOSCH (IBM Scientific Centre, Heidelberg and University of Osnabrück)
ASSOCIATE EDITORS: NICHOLAS ASHER (University of Texas, Austin), ROB VAN DER SANDT (University of Nijmegen)
REVIEW EDITOR: ANKE LÜDELING (University of Tübingen)
ASSISTANT EDITOR: ANKE LÜDELING (University of Tübingen)
EDITORIAL BOARD:
MANFRED BIERWISCH (MPG and Humboldt University Berlin)
BRANIMIR BOGURAEV (Apple Computer Inc)
MARIO BORILLO (University of Toulouse)
KEITH BROWN (University of Essex)
GENNARO CHIERCHIA (University of Milan)
In this case we have the same abductive inference graph as shown before in (28), and we get the same abductive variants and the same costs associated with them. But now the cost-minimal variant B is inconsistent with cg'. From this fact it follows that there is no pragmatically licensed update for S with regard to cg'. In other words, S becomes pragmatically anomalous with regard to cg'.
(31) [diagram not recoverable from the source]
Now look at the Hobbs-Stickel account. It gives A as the minimal explanation (cf. diagram (31)). This leads to the postulation of cg' ∪ {A} as update. Consequently, there is an important difference between the Hobbs-Stickel account and the present one. On the Hobbs-Stickel view
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Since we have assumed that there are no blocking alternatives, the condition (22a) becomes vacuous and the set p(S) is the set of cost-minimal variants, given in (29c). Since the expression B is consistent with cg, a pragmatically licensed update exists (satisfying the Quality conditions (15a, b)). It is given in (29d).

The Hobbs-Stickel account is looking for minimal explanations; that is, it selects the cost-minimal variants from the set of the consistent abductive variants. This contrasts with the former view, which first selects the cost-minimal variants from the set of all abductive variants and then checks them with regard to consistency. However, in the present case this mak... [gap in the source] ...α′β′, where α′ and β′ are the parameters for any other apple part (e.g. for the pulp). Suppose that, as is rather plausible, this condition is satisfied; then the I-principle selects the red peel-interpretation and blocks the red pulp-interpretation. Consequently, we get (32b) as a conversational implicature, but not (32c). Note that the non-existence of the implicature (32c) doesn't forbid a discourse such as (34) but rather licenses it.

(34) This apple is red. But not only its peel is red. Its pulp also is red.

In the case of (35a) analogous considerations give (35b) but not (35c) as a conversational implicature.
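The contrast between the two selection regimes described above can be made concrete with a minimal sketch. It assumes that no blocking alternatives are in play, so that condition (22a) is vacuous. The function names, the three variants A, B, C and their numeric costs are hypothetical; only the ordering (B cheapest, A next) is taken from the discussion of cg and cg'.

```python
from typing import Callable, Dict, List


def cost_minimal(costs: Dict[str, float]) -> List[str]:
    """Return the variants whose cost equals the minimum (empty input gives [])."""
    if not costs:
        return []
    best = min(costs.values())
    return sorted(v for v, c in costs.items() if c == best)


def licensed_updates(costs: Dict[str, float],
                     consistent: Callable[[str], bool]) -> List[str]:
    """Present account: minimise over ALL variants, then check consistency."""
    return [v for v in cost_minimal(costs) if consistent(v)]


def minimal_explanations(costs: Dict[str, float],
                         consistent: Callable[[str], bool]) -> List[str]:
    """Hobbs-Stickel style: keep the consistent variants, then minimise."""
    return cost_minimal({v: c for v, c in costs.items() if consistent(v)})


# Hypothetical costs; only the ordering B < A < C matters for the illustration.
COSTS = {"A": 3.0, "B": 2.0, "C": 4.0}


def consistent_with_cg(variant: str) -> bool:
    """Assumed: every variant is consistent with cg."""
    return True


def consistent_with_cg_prime(variant: str) -> bool:
    """Assumed: cg' differs from cg only in being inconsistent with B."""
    return variant != "B"
```

With respect to cg both regimes return B. With respect to cg' the first regime returns no variant at all (pragmatic anomaly), whereas the second falls back on A.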
Red Peel-Variant

TRACTOR(d) ← PART(d, x)^{αγ} ∧ TYRES(x)^{α(1−γ)} ∧ etc^{1−α}
TYRES(x) ← P-STATE(x, u)^{β} ∧ etc^{1−β}
Pumped up Tyres-Variant, total costs: $2 − αγ − αβ(1−γ) ≈ $2 − αβ
(35) (a) The apple is sweet
     (b) Its pulp is sweet
     (c) Its peel is sweet

It should be added that the present account evaluates utterances such as (36) as pragmatically anomalous (assuming the former axioms and weights).

(36) ?This apple is red, but its peel is not (perhaps its pulp is)

This qualifies implicatures like (32b) and (35b) as non-cancellable (under normal circumstances, neglecting the possibility of genetic engineering).

Finally consider the contrast between (37a) and (37b):

(37) (a) ?The tractor is pumped up
     (b) The tyres of the tractor are pumped up
     (c) ?The coachwork of the tractor is pumped up
     (d) TRACTOR(d) ∧ PART(d, x) ∧ PRESSURE(x, u) ∧ u = pumped up
     (e) TRACTOR(d) ∧ PART(d, x) ∧ TYRES(x) ∧ P-STATE(x, u) ∧ u = pumped up
     (f) TRACTOR(d) ∧ PART(d, x) ∧ MOTOR(x) ∧ P-STATE(x, u) ∧ u = pumped up

The present account predicts (37a) as pragmatically anomalous. This prediction results from the fact that those parts of tractors that may be pumped up (the tyres) are only marginally diagnostic for identifying tractors, and therefore the corresponding interpretation (37b) can be blocked by specifications that refer to more salient parts, for example as shown in (37c). However, the latter specifications suffer from sort conflicts and therefore violate the condition (15).

To make the argument explicit, let us start with (37d) as underspecified representation of (37a), and let us compare the cost for calculating the two enrichments (37e) and (37f) (related to (37b) and (37c)). The diagram (38) presents the corresponding abductive inference graph that is relevant for abducing the pumped up tyres-interpretation (37e).

(38) TRACTOR(d)^{$1} ∧ PART(d, x)^{$0} ∧ P-STATE(x, u)^{$0} ∧ u = up^{$1}
Note that this graph has practically the same structure as that given in (33). In the present case the factoring operation unifies the part and pressure-state slots arising from the predicate complex of the utterance (37a) with those that emerge while conceptually decomposing the subject term the tractor. Next let us consider an abductive inference that corresponds to an enrichment referring to parts more salient than the tyres, say the motor of the tractor, as given in (39).
(39) TRACTOR(d)^{$1} ∧ PART(d, x)^{$0} ∧ P-STATE(x, u)^{$0} ∧ u = up^{$1}
     TRACTOR(d) ← PART(d, x)^{α′γ} ∧ MOTOR(x)^{α′(1−γ)} ∧ etc^{1−α′}
     ?Pumped up Motor-Variant, total costs: $2 − α′γ
In this case only the part slot of the initial representation (first line of (39)) can be unified with a corresponding slot arising from the conceptual decomposition of the subject term. The pressure-state slot, on the other hand, cannot be used in factoring because the composition of the concept of a tractor's motor doesn't involve a pressure-state slot in the intended sense.

The cost calculations for the two enrichments (37e) and (37f) are as given in (38) and (39). It is obvious that the pumped up motor-variant wins over the pumped up tyres-variant when the condition α′γ > αβ holds, i.e. α′/α > β/γ. Here α may be interpreted as the salience of the tyre parts of the tractor, α′ as the salience of the motor part of the tractor, β as the salience of the p(ressure)-state slot for the tyres of the tractor, and γ as the salience of the part slot for tractors. Let us suppose that the condition α′/α > β/γ is satisfied, which is quite plausible if we assume that the saliences for the different slots are approximately the same. But the saliences of the various fillers of a slot may vary considerably; for example, it appears that the salience of the motor as part of the tractor is much higher than the salience of the tyres. Then the I-principle selects the pumped up motor-variant and blocks the pumped up tyres-interpretation.

Intuitively, the winning variant (pumped up motor) suffers from sort conflicts. I refrain from expressing this formally by a corresponding axiom. The existence of this sort conflict leads to a violation of the Quality 1 condition. Consequently, under plausible context conditions, there is no pragmatically licensed update for an utterance of (37a), and it comes out as pragmatically anomalous.

Let us consider an utterance like (40):

(40) The bicycle is pumped up
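The comparison between the tyres variant and a rival part variant can be replayed numerically. The sketch below uses the approximate totals stated in the text ($2 − αβ for the tyres variant, $2 − α′γ for the rival variant); the function names and all numeric saliences are hypothetical and serve only to show how the condition α′γ > αβ flips between the tractor and the bicycle case.

```python
def tyres_variant_cost(alpha: float, beta: float) -> float:
    """Approximate total cost ($) of the pumped-up-tyres enrichment: 2 - alpha*beta."""
    return 2.0 - alpha * beta


def rival_variant_cost(alpha_prime: float, gamma: float) -> float:
    """Total cost ($) of an enrichment via a rival part: 2 - alpha'*gamma."""
    return 2.0 - alpha_prime * gamma


def tyres_reading_survives(alpha: float, alpha_prime: float,
                           beta: float, gamma: float) -> bool:
    """True when the tyres variant is strictly cheaper than its rival.

    This is the negation of the blocking condition alpha'*gamma > alpha*beta
    (ties are counted as not surviving, an arbitrary choice of this sketch).
    """
    return tyres_variant_cost(alpha, beta) < rival_variant_cost(alpha_prime, gamma)


# Hypothetical saliences: equal slot saliences, different part saliences.
SLOTS = {"beta": 0.5, "gamma": 0.5}
TRACTOR = {"alpha": 0.1, "alpha_prime": 0.8}   # tyres marginal, motor salient
BICYCLE = {"alpha": 0.8, "alpha_prime": 0.2}   # tyres among the most salient parts

tractor_ok = tyres_reading_survives(**TRACTOR, **SLOTS)
bicycle_ok = tyres_reading_survives(**BICYCLE, **SLOTS)
```

Under these assumed values the tyres reading is blocked for the tractor and survives for the bicycle, which mirrors the contrast between (37a) and (40).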
This utterance does not have the highly marked status of (37a). The present account explains this by making the plausible assumption that the tyres of bicycles are among the most salient parts of bicycles. Consequently, in this case the pumped up tyres-interpretation comes out as a cost-minimal one, and it doesn't suffer from sort conflict.

Needless to say, the present considerations regarding the amounts of the parameters have to be supported by careful empirical studies. However, as a first step considerations of this kind may be valuable. They may demonstrate at least which kinds of influence are conceivable, and this again may be tested empirically.
4.3 Abduction and systematic polysemy
In this section I will demonstrate how the ideas put forward in section 4.1 may provide a mechanism for generating the range of the conceptually salient senses of institute-type words, a mechanism that solves the restriction problem of polysemy. Adopting the radical underspecification view (section 2.2), I will show how the extended mechanism of conversational implicature is capable of giving a principled account.

The general idea that leads us to underspecified representations in the case of institute-type words is as follows. Suppose there are certain entities which can be understood as conceptual frames or schemata and can be classified according to the variety of institute-types (government, school, parliament, etc.). Suppose further that these entities can be considered under different perspectives. These perspectives are assumed to provide more concrete realizations of the rather abstract concept of a certain institute-type e, perhaps realized as building, process, or institution proper. However, the particular perspective adopted and, consequently, the concrete realization of the intended institute-type remains semantically open. In a first approximation, the semantic representation of institute-type nominals may look like (41a, b).

(41) (a) λx ∃e[SCHOOL(e) ∧ REALIZE(e, x)]
     (b) λx ∃e[GOVERNMENT(e) ∧ REALIZE(e, x)]
Note that the specification of x as building, process, or institution proper has not been specified in the lexicon. That means that the variety of different interpretations has not been treated by stipulating semantic ambiguities. Note furthermore that the different restrictions on interpretative variants, for example for school and government, are no longer treated semantically. As a consequence, the restriction problem of polysemy has to be analysed pragmatically.
In the previous subsection we used an axiom of the form q ← p1^{w1} ∧ p2^{w2} to abduce, for instance, the existence of peel parts of an assumed apple from the existence of the apple. In a similar vein, we now use axioms of this form in order to abstract, for instance, the existence of a building and/or an institution realization from the existence of an entity of type school or government. Weighted abduction rules that provide the corresponding decompositions are presented in (42a, b) for the case of school:

(42) (a) SCHOOL(e) ← REALIZE(e, x)^{αγ} ∧ BUILD(x)^{α(1−γ)} ∧ etc^{1−α}
     (b) SCHOOL(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}
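One way to read the exponents in (42) is sketched below. It assumes cost passing in the style of weighted abduction: the $1 assumption cost of the head is distributed over the antecedents in proportion to their weights, and an antecedent that factors with a $0 literal of the input costs nothing. The function names, the $1 cost assigned to the predicate complex, and the numeric values are assumptions of this sketch, not part of the source.

```python
import math
from typing import Dict, Iterable


def school_rule(salience: float, gamma: float, realization: str) -> Dict[str, float]:
    """Weights of a rule shaped like (42): REALIZE, a realization sort, and etc."""
    return {
        "REALIZE(e, x)": salience * gamma,
        f"{realization}(x)": salience * (1.0 - gamma),
        "etc": 1.0 - salience,
    }


def total_cost(rule: Dict[str, float], factored: Iterable[str],
               head_cost: float = 1.0, predicate_cost: float = 1.0) -> float:
    """Cost of assuming every antecedent that is NOT factored away.

    `factored` lists antecedents unified with $0 literals of the input.
    `predicate_cost` stands for the remaining $1 conjunct of the input
    (an assumption of this sketch).
    """
    factored = set(factored)
    assumed = sum(w * head_cost for lit, w in rule.items() if lit not in factored)
    return predicate_cost + assumed


# Hypothetical parameter values.
ALPHA, ALPHA_PRIME, GAMMA = 0.5, 0.5, 0.1

building = school_rule(ALPHA, GAMMA, "BUILD")
institution = school_rule(ALPHA_PRIME, GAMMA, "INSTIT")

# Building variant: both REALIZE and BUILD unify with the input.
building_total = total_cost(building, ["REALIZE(e, x)", "BUILD(x)"])
# Institution variant: only REALIZE unifies; INSTIT must be assumed.
institution_total = total_cost(institution, ["REALIZE(e, x)"])
```

On this reading the weights of each rule sum to 1, the building variant totals $2 − α, and the institution variant totals $2 − α′γ.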
(43) (a) The school has a flat roof
     (b) The school building has a flat roof

Moreover, I will demonstrate why the utterance of (44a) appears as a pragmatic anomaly (under normal circumstances) and, consequently, why the interpretation (44b) is suppressed as a conversational implicature of (44a).

(44) (a) ?The government has a flat roof
     (b) The government building has a flat roof

To simplify, the underspecified semantic representations of sentences (43a) and (44a) are as indicated in (45a, b).
Analogously to the case discussed before, the parameters may be interpreted as follows: α as the salience of the building realization of a school, α′ as the salience of the institution realization, and γ as the salience of the realization slot (γ < 1). It is important to note that the assumptions about the weights in (42) are correct only when the condition on the left side of ← is conceptually necessary for the conditions on the right side. In the case of (42a) that means that every institution of type school must be realized in a (single) building. This certainly is highly plausible in the case of school. In other cases, for instance for government, the corresponding supposition is plausible only with a certain (very small) probability. In these cases we have to introduce an extra factor δ (δ < 1) into the exponent of the corresponding rule (as we will do subsequently in a simple example).

Let me now demonstrate by an example (i) how the abductive machinery can be used to generate the possible interpretations as conversational implicatures, and (ii) how the mechanism excludes the impossible interpretations as cases of pragmatic anomaly. More concretely, I will illustrate how the content of (43b) may be construed as a conversational implicature of (43a).
(45) (a) ∃e[SCHOOL(e) ∧ REALIZE(e, x) ∧ BUILDING(x) ∧ HAS_A_FLAT_ROOF(x)]
     (b) ∃e[GOV(e) ∧ REALIZE(e, x) ∧ BUILDING(x) ∧ HAS_A_FLAT_ROOF(x)]
(46) SCHOOL(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ …^{$1}
     SCHOOL(e) ← REALIZE(e, x)^{αγ} ∧ BUILD(x)^{α(1−γ)} ∧ etc^{1−α}
     (Consistent) Building-Variant, total costs: $2 − α
By using the alternative rule (42b) for decomposing the subject term, the inference graph (47) results.

(47) SCHOOL(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ …^{$1}
     SCHOOL(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}
     [remainder of the diagram and the beginning of the following sentence are missing in the source]

[...] γ is satisfied, and the (inconsistent) institution variant is suppressed in this case. It is plausible to assume that α and α′ are
In both cases, the first two conjuncts result from the lexical inputs of the institute-type nominals, and the remaining ones correspond to the predicate complex. The expression BUILDING(x) is due to the assumed sort restriction provided by the predicate complex, and it is singled out as an important representational element in the present analysis. The diagram (46) shows the part of the abductive inference graph that is relevant for abducing the building-interpretation (43b) starting with (43a) (in its pre-analysed form (45a)). Note that there is no real abduction in this very crude and simplifying analysis. The graph shows a 'conceptual decomposition' of the subject term and a factoring operation that unifies the occurrence of BUILD(x) resulting from this decomposition with its occurrence resulting from the predicate complex. This effects a saving in assumption costs (by an amount of α).
of comparable amount, since the building and institution reading of school can be seen as realizing concepts of both the basic level of buildings and that of institutions. Consequently, the condition α/α′ > γ (with γ ≪ 1) may be assumed to hold and the I-principle selects the building variant. We can conclude that (43b) comes out as a conversational implicature of (43a).

Now I want to argue that (44a) comes out as pragmatically anomalous. As before, we have to contrast two abductive inference graphs. They are shown in (48) and (49).

(48) GOV(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ …^{$1}
     GOV(e) ← REALIZE(e, x)^{δαγ} ∧ BUILD(x)^{δα(1−γ)} ∧ etc^{δ(1−α)}
     (Consistent) Building-Variant, total costs: $2 − δα

(49) GOV(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ …^{$1}
     GOV(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}
     (Inconsistent) Institution-Variant, total costs: $2 − α′γ

The parameter δ (δ ≪ 1) corresponds to the probability of assuming that a government is realized in a single building. Now, the condition δα/α′ > γ becomes relevant and the I-principle selects the (consistent) building variant when the condition is satisfied. In the other case, when the converse condition δα/α′ < γ is satisfied, the I-principle selects the (inconsistent) institution variant. Of course, the latter possibility is actually realized. This follows from the assumption that it is very implausible to assume that a government is realized in a single building (δ ≪ 1). Furthermore, government buildings are certainly not basic-level buildings. Consequently α ≪ α′. Both factors make it highly plausible to assume that δα/α′ < γ. Therefore, the inconsistent institution variant wins over the consistent building variant. The inconsistency of the selected variant leads to a violation of the Quality 1 condition. Consequently, under plausible context conditions (44a) comes out as pragmatically anomalous.¹⁸

Again, it should be stressed that these considerations regarding the amounts of parameters are rather provisional and should be supported by careful empirical studies. Nevertheless, the present view sheds some light
on how the restriction problem of polysemy may be solved by considering the probabilistic nature of conceptual knowledge.
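The school/government contrast of this section reduces to comparing two totals, $2 − δα for the (consistent) building variant and $2 − α′γ for the (inconsistent) institution variant, with δ = 1 in the school case. The sketch below replays that comparison. The function names and all parameter values are hypothetical; they are chosen only so that α and α′ are comparable for school while δ ≪ 1 and α ≪ α′ for government.

```python
def building_cost(alpha: float, delta: float = 1.0) -> float:
    """Total cost ($) of the consistent building variant: 2 - delta*alpha."""
    return 2.0 - delta * alpha


def institution_cost(alpha_prime: float, gamma: float) -> float:
    """Total cost ($) of the inconsistent institution variant: 2 - alpha'*gamma."""
    return 2.0 - alpha_prime * gamma


def flat_roof_verdict(alpha: float, alpha_prime: float,
                      gamma: float, delta: float = 1.0) -> str:
    """Apply the condition delta*alpha/alpha' > gamma.

    If the building variant is strictly cheaper it is selected and, being
    consistent, yields the building reading as an implicature. Otherwise the
    inconsistent institution variant is selected and the utterance comes out
    as anomalous. Ties are not discussed in the source; this sketch treats
    them as anomalous.
    """
    if building_cost(alpha, delta) < institution_cost(alpha_prime, gamma):
        return "building reading implicated"
    return "pragmatically anomalous"


# Hypothetical values for the two nominals.
school = flat_roof_verdict(alpha=0.5, alpha_prime=0.5, gamma=0.1)
government = flat_roof_verdict(alpha=0.1, alpha_prime=0.9, gamma=0.1, delta=0.05)
```

With these assumed values the school sentence receives the building reading, whereas the government sentence is classified as anomalous.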
5 CONCLUSION
One aim of this paper was to collect some general problems that have a prima facie claim on the attention of linguists interested in Lexical Semantics. These problems had to do with the utterance of words within concrete conceptual and contextual settings and went beyond the aspects of meaning typically investigated by a contrastive analysis of lexemes within the Katz-Fodor tradition. Three groups of problems were considered: (i) pragmatic compositionality, (ii) blocking, and (iii) pragmatic anomaly. The problems came to the fore in connection with the pragmatics of adjectives and the phenomenon of systematic polysemy. The same points can be made with regard to word formation in general (e.g. Aronoff 1976; Bauer 1983) and the interpretation of compounds in particular (e.g. Meyer 1993; Wu 1990). Moreover, the investigation of kinds of polysemy other than those found with institute-type words may be helpful in order to see the ubiquity of these problems (cf., for instance, Lakoff's (1987) study on English prepositions and Sweetser's (1990) investigation of English perception verbs). Furthermore, Fabricius-Hansen's (1993) research on how the interpretation of noun-noun compounds is affected by a genitive attribute may raise the same problems in a more complex area.

The second aim of this paper was to sketch a new approach called Lexical Pragmatics that deals with these problems in an explanatory way and tries to give a systematic account of the phenomena under discussion. The paradigm is based on two simple principles: (i) an adequate representation of lexical items has to be given in a semantically underspecified format, and (ii) their actual interpretation has to be derived by a pragmatic strengthening mechanism. The basic pragmatic mechanism rests on conditions of updating the common ground and allows one to give a precise explication of notions such as generalized conversational implicature and pragmatic anomaly.

The fruitfulness of the basic account was initially established by its application to a variety of recalcitrant phenomena, among which its precise treatment of Atlas & Levinson's Q- and I-principles and the formalization of the balance between informativeness and efficiency in natural language processing (Horn's division of pragmatic labour) deserve particular mention. The basic mechanism was subsequently extended by an abductive reasoning system that is guided by subjective probability. The extended mechanism turned out to be capable of giving a principled account of lexical blocking, the pragmatics of adjectives, and certain types of systematic polysemy.
Acknowledgements
I am grateful to Manfred Bierwisch, Paul David Doherty, Bart Geurts, Gerhard Jäger, Thomas Jüngling, Annette Leßmöllmann, Chris Piñón, and Rob van der Sandt for useful comments on earlier versions of this paper. Special thanks go to an anonymous referee of the JS. I do not, however, intend to imply by this that they endorse my approach. In particular, Thomas and Manfred don't believe a word of it.

REINHARD BLUTNER
Humboldt University, Berlin
Jägerstrasse 10-11
10117 Berlin
Germany
e-mail: blutner@german.hu-berlin.de

Received: 18.10.97
Final version received: 05.04.98
I find it important to apply the ideas to other, possibly more complex and more realistic examples than those considered here. Moreover, methods are needed that allow one to measure the values of the probabilistic parameters that control and organize conceptual knowledge.

Seen from a moderately distant viewpoint, the standard accounts of Lexical Semantics may appear as an incoherent research field which is at odds with itself. As an endeavour that has to access Grammar, Semantics, and aspects of utterance interpretation at the same time, it multiplies the diversion of these disciplines. Overstretched by the task of theory formation, it either combines theoretical rigour with descriptive poverty, or, more predominantly, it leads to linguistic anecdotalism, collecting pretty and curious observations without theoretic control. I sense that it is the inadequacy or the lack of a genuine pragmatic component that has led to this situation. In so far as Lexical Pragmatics tries to take pragmatics seriously (especially the conception of conversational implicature) and in so far as it is explicit about this component, it may substantiate a division of labour between grammatical and pragmatic aspects of the lexicon. This may broaden the way for overcoming the unfortunate situation just mentioned.

Perhaps most details concerning the main ideas of the present account in concrete terms may prove false in the future. This may concern, first of all, the Economy principles and their interaction. In order really to justify the details of these principles we need more empirical evidence and studies. But it is also crucial to discover the reasons that explain why the principles are just as they are. This brings us to a reductionist programme as is currently pursued in the domain of Integrative Connectionism (e.g. Smolensky 1995).
A first attempt at achieving a full reduction of Speaker's economy (I-principle) and Hearer's economy (Q-principle) to connectionist principles is currently under way.
NOTES
1 I have to thank an anonymous referee of the JS for this example.

2 An anonymous referee notes that for him beef sounds equally good as cow in (7). Consequently, the relevant difference in acceptability between (6b) and (7) is in cow (which sounds fine) more than in beef. This fact is sufficient to illustrate the phenomenon of deblocking, which is the relevant one in the present context.

3 These problems concern (i) the restrictiveness of the coercion mechanism, (ii) the apparent inflation of shifting operations, (iii) the stipulation of an additional checking mechanism that diminishes the use of monotonic processing, and (iv) problems with the analysis of co-predication in case of logical polysemy. For details and further criticism, see Copestake & Briscoe (1995), Fodor & Lepore (to appear), and Blutner (to appear).

4 In general, cost factors relate to the (estimated) cost of accessing the different interpretations. In section 4.1 an explicit account is provided which brings our system as close as possible to a Bayesian network and takes costs as negative logarithms of certain conditional probabilities.

5 Sperber & Wilson's (1986) extreme position of reducing the maxims to just one, the maxim of relevance, isn't relevant in the present context. As argued by Levinson (1989), Sperber & Wilson try the 'impossibility of reducing countervailing principles to one mega-principle'. They concentrate on the phenomena of classic particularized Relevance implicatures illustrated by Grice, and they fail to account for the whole range of generalized conversational implicatures, the implicatures that are most important for lexical pragmatics.

6 See, for example, Bierwisch (1983) and Jackendoff (1983) for similar distinctions. The important point of this distinction correlates with Grice's proposal in his William James Lectures to make a distinction within the 'total signification' of a linguistic utterance between what a communicator has said (in a certain favoured, and maybe to some degree artificial, sense of said), and what a communicator has meant beyond it (what she has implicated, indicated, suggested).

7 Mental model and conceptual representation are more psychologically coloured terms; information state is the favoured term used in formal semantics.

8 See, for example, McEliece (1977).

9 In section 4 a more refined cost-function is developed which sometimes allows the selection of state descriptions that are not minimally surprising. However, these state descriptions can be characterized as the 'better interpretations' because they are more unifying and, in some sense, more relevant than less surprising ones. This formulation brings us closer to the idea of Atlas & Levinson (1981) where the I-principle is intended as inference to 'the best interpretation' (with 'best interpretation' informally understood as interpretation which prefers coreferential readings of entities, making use of stereotypical relations between referents or events). However, it should be added that the way in which Atlas & Levinson (1981) try to formalize their Principle of Informativeness seems rather misleading.

10 Here K is the epistemic operator indexed to H and S, respectively. The epistemic logic I assume is Hintikka's (1962). As discussed by Zeevat (1997), this condition on common grounds is only a necessary one. Developing a more refined definition of common ground, Zeevat formulates also an update operation for common grounds. His conception, however, ignores the effects of conversational implicatures which also influence the common ground. The present account seeks to grasp these effects in a first and rather sketchy manner. Furthermore, it should be noticed that I use the notion of information state sometimes as referring to a set of possible worlds and sometimes as a representational structure (state description or disjunction of state descriptions). It should be added that in the present context we read K as know for sure (this doesn't presuppose the complement proposition). However, there is a problem with this formulation that at least should be mentioned. In the present formulation, the common ground cannot include propositions that some or all of the participants know to be false. However, there are kinds of conversations where this formulation is unsatisfying. For instance, consider a Christmas-time conversation where the proposition that there is a Santa Claus may be common ground, even if some or all of the participants know that there isn't any Santa Claus.

11 Identifying state descriptions with sets of possible worlds and p(α) with a family of sets of possible worlds, we can write this condition in the following way: ∪p(α) ⊃ cg[α].

12 Beside the epistemic operator K we need its dual P =def ¬K¬. Hintikka reads K_a φ as a knows that φ. It is important to note that Hintikka is using the verb know in a technical sense without the usual factive presupposition. In this vein, ¬K_a φ can be read as what a knows is not that p, and P_a φ (=def ¬K_a ¬φ) can be read as for all a knows, it is possible that p.

13 The presented arguments showing the advantages of the present approach over the traditional approach based on Horn-scales are due to Rob van der Sandt. I thank him for allowing me to include his considerations in this article.

14 An anonymous referee notes that at least for the pair persuade to not-dissuade from the equivalence is spurious, since dissuade presupposes that the person previously intended the complement. The referee's example: If John was undecided about whether to vote for Clinton, Mary could persuade John not to vote for Clinton, but she couldn't dissuade him from voting for him.

15 Only under very special conditions is it possible to construct representational pendants to the non-representational elements, for example when we develop intuitions about our cognitive system or about our mental activity. In this sense, the representation of salience, relevance, and so on is possible. Usually, these representations are comparative in character and are not quantitatively scaled.

16 With regard to the first part of the dictum, seeing non-cancellability as a necessary condition of entailment (or seeing cancellability as a sufficient condition of conversational implicature), I agree with Hirschberg (1991) and assume that it is right, at least if it is possible to discriminate cancellation from suspension (the calling into question of an asserted proposition), from contextual disambiguation, and from certain forms of speaking loosely (for careful discussion, cf. Hirschberg 1991: 28 ff.).

17 As assumption cost of the latter units I have stipulated $1. You may see this stipulation as fixing the $-unit.

18 An anonymous referee has suggested more minimal contrasts in order to bring out the relevant difference:
?The government has a flat roof vs. The Ministry of Justice has a flat roof
?The university has a flat roof vs. The college of engineering has a flat roof
REFERENCES

Alshawi, H. & Crouch, R. (1992), 'Monotonic semantic interpretation', in Proceedings of ACL, Delaware, 32-9.
Aronoff, M. (1976), Word Formation in Generative Grammar, MIT Press, Cambridge, MA.
Atlas, J. & Levinson, S. (1981), 'It-clefts, informativeness and logical form', in P. Cole (ed.), Radical Pragmatics, Academic Press, New York, 1-61.
Bauer, L. (1983), English Word-Formation, Cambridge University Press, Cambridge.
Bierwisch, M. (1983), 'Semantische und konzeptuelle Repräsentation lexikalischer Einheiten', in W. Motsch & R. Ruzicka (eds), Untersuchungen zur Semantik, Akademie Verlag, Berlin, 61-99.
Bierwisch, M. (1989), 'The semantics of gradation', in M. Bierwisch & E. Lang (eds), Dimensional Adjectives, Springer Verlag, Berlin etc., 71-261.
Blutner, R. (to appear), 'Lexical semantics and pragmatics', in Linguistische Berichte.
Caramazza, A. & Grober, E. (1977), 'Polysemy and the structure of the subjective lexicon', in C. Rameh (ed.), Georgetown University Roundtable on Language and Linguistics, Georgetown University Press, Washington, DC.
Carnap, R. (1947), Meaning and Necessity, University of Chicago Press, Chicago & London.
Charniak, E. & Shimony, E. S. (1990), 'Probabilistic semantics for cost based abduction', Technical Report CS-90-02, Computer Science Department, Brown University.
Clark, E. V. (1990), 'On the pragmatics of contrast', Journal of Child Language, 17, 417-31.
Copestake, A. & Briscoe, T. (1995), 'Semi-productive polysemy and sense extension', Journal of Semantics, 12, 15-67.
Cruse, D. A. (1986), Lexical Semantics, Cambridge University Press, Cambridge.
Dalrymple, M., Kanazawa, M., Mchombo, S. & Peters, S. (1994), 'What do reciprocals mean?', in M. Harvey & L. Santelmann (eds), Proceedings of the Fourth Semantics and Linguistic Theory Conference: SALT IV, Cornell University, Ithaca.
Deane, P. D. (1988), 'Polysemy and cognition', Lingua, 75, 325-361.
van Deemter, K. & Peters, S. (1996), Semantic Ambiguity and Underspecification, CSLI Publications, Stanford, CA.
Fabricius-Hansen, C. (1993), 'Nominalphrasen mit Kompositum als Kern', Beiträge zur Geschichte der deutschen Sprache und Literatur, 115, 193-243.
Fodor, J. A. & Lepore, E. (to appear), 'The emptiness of the lexicon: critical reflections on J. Pustejovsky's The Generative Lexicon'.
Fodor, J. A. & Pylyshyn, Z. W. (1988), 'Connectionism and cognitive architecture: a critical analysis', Cognition, 28, 3-71.
Gazdar, G. (1979), Pragmatics, Academic Press, New York.
Grice, P. (1968), Logic and Conversation, text of Grice's William James lectures at Harvard University, published in Grice (1989).
Grice, P. (1989), Studies in the Way of Words, Harvard University Press, Cambridge, MA.
Hintikka, J. (1962), Knowledge and Belief: An Introduction to the Logic of the Two Notions, Cornell University Press, Ithaca & New York.
Hirschberg, J. (1991), A Theory of Scalar Implicature, Garland Publishing, Inc., New York & London.
Hobbs, J. R., Stickel, M. E., Appelt, D. E., & Martin, P. (1993), 'Interpretation as abduction', Artificial Intelligence, 63, 69-142.
Horn, L. R. (1984), 'Toward a new taxonomy for pragmatic inference: Q-based and R-based implicatures', in D. Schiffrin (ed.), Meaning, Form, and Use in Context, Georgetown University Press, Washington, DC, 11-42.
Householder, F. W. (1971), Linguistic Speculations, Cambridge University Press, London & New York.
Jackendoff, R. (1983), Semantics and Cognition, MIT Press, Cambridge, MA.
Kamp, H. (1975), 'Two theories about adjectives', in E. L. Keenan (ed.), Formal Semantics for Natural Language, Cambridge University Press, Cambridge, 123-55.
Karttunen, L. (1972), 'Possibly and Must', in J. P. Kimball (ed.), Syntax and Semantics 1, Seminar Press, New York, 1-20.
Karttunen, L. & Peters, S. (1979), 'Conventional implicature', in C.-K. Oh & D. A. Dinneen (eds), Syntax and Semantics 11: Presupposition, Academic Press, New York, 1-56.
Keenan, E. L. (1974), 'The functional principle: generalizing the notion of Subject of', in Papers from the Tenth Regional Meeting of the Chicago Linguistic Society, Chicago, IL, 298-310.
Keil, F. C. (1979), Semantics and Conceptual Development, Harvard University Press, Cambridge, MA.
Kiparsky, P. (1982), 'Word-formation and the lexicon', in F. Ingeman (ed.), Proceedings of the 1982 Mid-America Linguistic Conference.
Lahav, R. (1989), 'Against compositionality: the case of adjectives', Philosophical Studies, 55, 111-29.
Lahav, R. (1993), 'The combinatorial connectionist debate and the pragmatics of adjectives', Pragmatics and Cognition, 1, 71-88.
Lakoff, G. (1987), Women, Fire, and Dangerous Things: What Categories Reveal About the Mind, University of Chicago Press, Chicago.
Lang, E. (1989), 'The semantics of dimensional designation of spatial objects', in M. Bierwisch & E. Lang (eds), Dimensional Adjectives, Springer-Verlag, Berlin, 71-261.
Langendoen, D. T. (1978), 'The logic of reciprocity', Linguistic Inquiry, 9, 177-97.
Lehrer, A. (1968), 'Semantic cuisine', Journal of Linguistics, 5, 39-56.
Lehrer, A. (1970), 'Static and dynamic elements in semantics: hot, warm, cool, cold', Papers in Linguistics, 3, 49-74.
Lehrer, A. (1978), 'Structures of the lexicon and transfer of meaning', Lingua, 45, 95-123.
Levinson, S. (1983), Pragmatics, Cambridge University Press, Cambridge.
Levinson, S. (1987), 'Pragmatics and the grammar of anaphora', Journal of Linguistics, 23, 379-434.
Levinson, S. (1989), 'Relevance', Journal of Linguistics, 25, 455-72.
McCawley, J. D. (1978), 'Conversational implicature and the lexicon', in P. Cole (ed.), Syntax and Semantics 9: Pragmatics, Academic Press, New York, 245-59.
McCawley, J. D. (1993), Everything that Linguists have Always Wanted to Know about Logic but were Ashamed to Ask, 2nd edn, University of Chicago Press, Chicago.
McEliece, R. (1977), Theory of Information and Coding, Addison-Wesley, Reading, MA.
Matsumoto, Y. (1995), 'The conversational condition on Horn scales', Linguistics and Philosophy, 18, 21-60.
Meyer, R. (1993), Compound Comprehension in Isolation and in Context, Max Niemeyer Verlag, Tübingen.
Montague, R. (1970), 'Universal Grammar', Theoria, 36, 373-98.
Nunberg, G. (1979), 'The non-uniqueness of semantic solutions: Polysemy', Linguistics and Philosophy, 3, 143-84.
Nunberg, G. (1995), 'Transfers of meaning', Journal of Semantics, 12, 109-132.
Nunberg, G. & Zaenen, A. (1992), 'Systematic polysemy in lexicology and lexicography', in K. Varantola, H. Tommola, T. Salmi-Tolonen, J. Schopp (eds), Euralex II, Tampere, Finland.
Partee, B. (1984), 'Compositionality', in F. Landman & F. Veltman (eds), Varieties of Formal Semantics, Foris, Dordrecht, 281-311.
Pustejovsky, J. (1989), 'Type coercion and selection', paper presented at WCCFL VIII, April 1989, Vancouver, BC.
Pustejovsky, J. (1991), 'The generative lexicon', Computational Linguistics, 17, 4, 409-41.
Pustejovsky, J. (1993), 'Type coercion and lexical selection', in J. Pustejovsky (ed.), Semantics and the Lexicon, Kluwer, Dordrecht, 73-96.
Pustejovsky, J. (1995), The Generative Lexicon, MIT Press, Cambridge, MA.
Pustejovsky, J. & Boguraev, B. (1993), 'Lexical knowledge representation and natural language processing', Artificial Intelligence, 63, 193-223.
Quine, W. V. O. (1960), Word and Object, MIT Press, Cambridge, MA.
Rumelhart, D. E., McClelland, J. L. and the PDP Research Group (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, MIT Press/Bradford Books, Cambridge, MA.
Sadock, J. M. (1978), 'On testing for conversational implicature', in P. Cole (ed.), Syntax and Semantics, Vol. 9: Pragmatics, Academic Press, New York, 281-97.
Smolensky, P. (1995), 'Constituent structure and explanation in an integrated connectionist/symbolic cognitive architecture', in C. Macdonald & G. Macdonald (eds), Connectionism: Debates on Psychological Explanation, Vol. 2, Basil Blackwell, Oxford, 221-90.
Sommers, F. (1959), 'The ordinary language tree', Mind, 68, 160-85.
Sommers, F. (1963), 'Types and ontology', Philosophical Review, 72, 327-63.
Sperber, D. & Wilson, D. (1986), Relevance, Blackwell, Oxford.
Stickel, M. E. (1989), 'Rationale and methods for abductive reasoning in natural language interpretation', in R. Studer (ed.), Natural Language and Logic, Springer-Verlag, Berlin, 233-52.
Sweetser, E. E. (1990), From Etymology to Pragmatics, Cambridge University Press, Cambridge.
Thomason, R. H. (1990), 'Accommodation, meaning, and implicature: interdisciplinary foundations for pragmatics', in P. R. Cohen, J. Morgan, & M. E. Pollack (eds), Intentions in Communication, MIT Press, Cambridge, MA.
Wu, D. (1990), 'Probabilistic unification based integration of syntactic and semantic preferences for nominal compounds', Proceedings of the 13th International Conference on Computational Linguistics (COLING 90), Helsinki, 413-18.
Zeevat, H. (1997), 'The common ground as a dialogue parameter', in A. Benz & G. Jäger (eds), MunDial'97: Proceedings of the Munich Workshop on Formal Semantics and Pragmatics of Dialogue, Munich.
© Oxford University Press 1998

Lexical Rules as Hypotheses Generators
ANATOLI STRIGIN
Humboldt University, Berlin
Abstract
Developments in computational linguistics lead to the conception of sense extension rules in the lexicon as a theory of regular polysemy. Lexical rules are defined only on such semantic information as is in the lexicon, with the desired effect of restricting the amount of semantic information in the lexical representation of ambiguous items. The paper presents some examples which indicate difficulties for this approach, argues for pragmatically based rules which use conceptual information, and proposes a programmatic partial formalization of this approach in the framework of abductive interpretation.

1 AN INFORMAL DESCRIPTION OF SENSE EXTENSION
Intuitively, it suffices to characterize sense extension as one kind of regularity in the interpretation of polysemous words. Rules are usually invoked to describe regularities. The discussion of sense extension rules within a formal framework began somewhere in the 1960s. McCawley (1968), discussing the semantics of lexical items in the lexicon, suggested that probably all languages have implicational relationships among their lexical items, whereby the existence of one lexical item implies the existence of another lexical item, which then need not be listed in the lexicon.

His example of such an implicational relation is the use of words for temperature ranges (warm, cool) to also represent the temperature sensation produced by wearing an article of clothing. To quote McCawley again:

... the English sentence
(1) This coat is warm.
is ambiguous between the meaning that the coat has a relatively high temperature and the meaning that it makes the wearer feel warm ... I propose then that English has two lexical items warm, of which only one appears in the lexicon, the other being predictable on the basis of a principle that for each lexical item which is an adjective denoting a temperature range there is a lexical item identical to it save for the fact that it is restricted to articles of clothing and means 'producing the sensation corresponding to the temperature range denoted by the original adjective'.
Note that although a rule is a mapping between lexical entries in the lexicon, nothing is said about the nature of the semantic information in the entries. At a second glance the generalization is not quite correct, and McCawley himself notes this in the postscript to the paper reprinted in McCawley (1973). In (2) the restriction to clothing is violated.
(2) The fire is warm.
On the other hand, in the case of articles of clothing other bodily sensations can seemingly be predicated of them, e.g. in (3).

(3) The sweater is itchy.

The domain of the rule could probably be broader than originally suggested, but to find this out a more subtle analysis is required, which was never conducted. Later Green (1974), in her critique of the implicational rule approach, noted that the domain of the rule suggested by McCawley should at the same time probably be more restricted, because the words hot and cold are not used to refer to articles of clothing producing these sensations, i.e. we do not usually say This jacket is cold. The question of how to delimit the domain of a rule is evidently not trivial. As an additional touch one might note with Green that it is not clear why only cool, hot, and warm have extensions to colours, as in hot colours, warm colours, and only warm, cool, and cold may refer to personality characteristics. To quote Green:

we say that someone has a warm personality, and that he is warm to the people, but not that he has a hot personality, or that he is hot to the people.

Confronted with these irregularities, Green thinks that the implicational rule approach to the lexicon is too strong, but that (to quote Green again):

it would be possible to think of rules that imply the possibility of specific kinds of 'derived' uses (implicational possibility rules) as defining the notion of 'related lexical entry'. These would bind together as one lexical item lexical entries which have semantic relationships related by these rules. This way, the non-existence of a usage would not have to be seen as an exception to the rule, which has to be learned in addition to the rule. Rather, it may be seen simply as the existence of a gap in the lexicon; if a word or usage should be added to the lexicon to fill that gap, it would be seen as an addition to the lexicon, not as a change in a rule of grammar.

Her statement can be interpreted as indicating a different conception of rules. Here the rules are not mappings between lexical entries, but are used on demand to fix all possible semantic interpretations of a lexical item, where evidently an item is something different from McCawley's item, for which entry is used.
The rules as conceived by Green would probably leave it to the speaker to decide on some regular basis whether there should be two related lexical entries in a lexical item, i.e. whether a rule should be applied. The question arises whether the basis is sufficiently determined by the lexicon data or needs some knowledge from outside the lexicon. In the latter case the rule can be said to use conceptual knowledge. If pragmatics is conceived as an interface between purely linguistic and general conceptual knowledge, the rules suggested by Green may be taken to belong to pragmatics.

Though the example of McCawley would probably not be considered a case of a sense extension rule now, the idea caught on. The line of thought leading to the pragmatic understanding of the kind of regularities under the above interpretation of Green's quotation is also elaborated in Nunberg (1979, 1995). The line of thought based on the suggestion of McCawley is most prominently expounded in Copestake & Briscoe (1995) and Briscoe, Copestake, & Lascarides (1995). To isolate what I consider to be the essential difference between them, consider the basic problem to be solved in the conceptual analysis of the phenomenon. To provide a model of sense extension it is necessary to find sufficiently general rule domain definitions and to account for the existence of exceptions in the domains. To define a domain, the source class of objects should be defined as well as the basic transfer relation between that class and the class resulting from the rule application. It sometimes seems that exceptions are an artefact of an imprecise domain description. The two conceptions differ in their predictions as to whether the domain description can be made precise. The pragmatic understanding would imply that this is a matter of conceptual knowledge, dependent on many totally non-linguistic factors and complex conceptual processes, hence difficult. The lexicalist semantic understanding would imply that lexical entries contain only a limited amount of regimented semantic information; hence the domains and transfer relations can be more easily described in terms of the scheme of a lexical entry. This position seems to be more attractive, provided we know how to make the abstract relations of the scheme more specific. But we do not expect a high degree of precision.

In his pragmatic approach Nunberg assumes that a speaker of some particular language like English has at her disposal a set of basic sense shifting transfer functions reflecting principles of the organisation of conceptual knowledge (rather like the rules of Green). They can be combined in different ways to give rule-like sense extensions. Nunberg (1995) and Nunberg & Zaenen (1992) are cautious regarding their relation to the lexicon. There it is maintained that some of the processes could be viewed as lexical. In such cases of lexical transfer the conceptual relation defining the transfer function is explicitly coded in the relevant lexical entries, so the transfer is licensed on the information in the lexical entry. This accounts for the language-specific character of sense extension. In other cases conceptual information is needed that is not represented in the lexicon. The only hope to set some limit on the amount of such information is to claim that it should be in some sense relevant for the speaker. Nunberg (1995) uses the notions of salience and noteworthiness. The transfer function should be salient, and the property contributed by the new predicate should be noteworthy in the context.

Briscoe et al. (1995) assume that the rules are present in the lexicon and have the status of defaults. That means they should be applied whenever their application is allowed. Predicted but unattested cases have to do either with the involved default being overridden or with its graded quality. The rule domains are more easily specifiable because lexical entries contain only a few types of semantic relations which define their qualia structure (Pustejovsky 1995), i.e. some essential properties of the corresponding objects. The relations can be made conceptually more precise, but this additional information is not accessible to the rules.

A number of examples should now help to bring the difference between the two conceptions into focus. One lexical predicate transfer in English can be described by 'if an animal has hide such that it is common to process it to be used by people in the English linguistic community, then the word denoting this animal can be used to denote the processed hide'. Another regularity is to use an animal's name for the meat of this animal, cf. (4).

(4) a. We had rabbit for dinner yesterday
    b. She wears rabbit

It could seem that the sense extensions which let words denoting animals denote their edible substance or their hide cover indeed only the animals the indicated parts of which are regularly used in the indicated way in English-speaking communities. The restrictions to the community and to regular use of the parts/products in the community seem necessary because names of animals which do not answer the descriptions do not allow the sense extension to proceed quite as easily, cf. an example of Copestake & Briscoe (1995) in (5).

(5) Badger hams are a delicacy in China while mole is eaten in many parts of Africa

Under the pragmatic account we might base the transfer on our factual knowledge about the use of fur or meat of specific animals, then generalize it if this use is sufficiently important, to obtain a sense extension rule which takes an animal to some related stuff. The ease of transfer in the core set of
cases could then be accounted for if the necessary relations were coded in the lexical entries for words like rabbit. In cases like (5) we would need more general inferences via the conceptual knowledge. For moles we could then assume that they are used for food. This description is compatible with Nunberg's treatment.

If sense extensions are treated as lexical rules, another explanation of (5) must be sought, because factual knowledge is not available in the lexicon. Copestake & Briscoe (1995) formulate the rule in terms of an abstract qualia relation origin, and its domain description refers to animals. The rule generates a lexical entry for the comestible substance for any word denoting an animal. The actual attested use of the rule by English speakers to derive the meat sense of mole determines the acceptability degree judgement for (5). Since mole is not used in the corresponding way in the English community, applications of the rule are very rare. This statistic of the rule is registered as the relative frequency of the derived sense in the lexical entry for mole (Copestake & Briscoe 1995; Briscoe & Copestake 1996). Using rules to derive senses with low frequency leads to the deterioration of their acceptability compared to regular cases.

The sense extension from animals to their meat is barred from its usual application to the name of an animal by the existence of a word which is reserved to denote the edible parts of this animal specifically. Thus pork is the usual name of pig meat, and not pig. This part of the phenomenon is known as blocking. An important characteristic of blocking is deblocking. Deblocking happens when the name with the blocked reading is nevertheless sometimes used in this reading instead of the specialized word to denote the entity in question. The examples are of mixed quality, but could make the point, cf. (6, 7), both from Copestake & Briscoe (1995), the latter coming from Terry Pratchett's Guards, Guards, where the use of the blocked version characterizes one character of the novel, called Throat, because of the pejorative associations with the use of the word pig.

(6) ?? Sam ate pig (instead of pork)
(7) 'Hot sausages, two for a dollar, made of genuine pig, why not buy one for the lady?' 'Don't you mean pork sir?' said Carrot warily, eyeing the glistening tubes. 'Manner of speaking, manner of speaking,' said Throat quickly. 'Certainly your actual pig products. Genuine pig.'

A pragmatic explanation would appeal to conversational principles. Briscoe et al. (1995) develop a theory of default interaction to account for blocking and intend to treat deblocking as blocking by a lexical exception, in principle. This account does not easily generalize to the sense extension rules, and is supplanted by one based on a statistical explication of Gricean maxims in later papers. This one leaves deblocking to pragmatics.
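The interplay of a default sense extension rule and blocking can be made concrete in a small sketch. The Python fragment below is an illustration only and not the formalization of Copestake & Briscoe (1995) or Briscoe et al. (1995): the `Entry` class, the toy `LEXICON`, and the function `meat_sense` are hypothetical names introduced here, and qualia structure is reduced to a single assumed `source` field. The rule derives a meat entry for any animal name and reports a listed specialized word, if there is one, as a blocker. It deliberately does not suppress the derived entry, so that deblocked uses such as (7) remain representable.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    form: str                     # the word form
    sort: str                     # coarse semantic sort, e.g. "animal" or "meat"
    source: Optional[str] = None  # assumed stand-in for an origin relation


# Toy lexicon (an assumption of this sketch, not data from the paper).
LEXICON = [
    Entry("rabbit", "animal"),
    Entry("pig", "animal"),
    Entry("pork", "meat", source="pig"),
]


def meat_sense(form: str, lexicon=LEXICON) -> Tuple[Optional[Entry], Optional[Entry]]:
    """Apply the default 'animal -> meat' extension to `form`.

    Returns (derived_entry, blocker). The rule domain is restricted to
    animal names; `blocker` is a listed word already reserved for the
    meat of that animal, or None if the derived sense is unblocked.
    """
    is_animal = any(e.form == form and e.sort == "animal" for e in lexicon)
    if not is_animal:
        return None, None  # outside the rule domain
    derived = Entry(form, "meat", source=form)
    blocker = next(
        (e for e in lexicon if e.sort == "meat" and e.source == form and e.form != form),
        None,
    )
    return derived, blocker
```

On this toy lexicon, `meat_sense("rabbit")` yields a derived meat entry with no blocker, whereas `meat_sense("pig")` yields the derived entry together with the blocker pork. How strongly a blocker degrades the derived sense is left open here, since the text attributes that question either to default interaction or to pragmatics.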
The discussion of sense extension in Nunberg (1979), in Ruhl (1989), and in Copestake & Briscoe (1995) is based mostly on English examples. Some comparison with other languages could throw additional light on the sense extension rules and, I believe, would indicate some difficulties for the lexicon conception of the rules. In Russian the meat of a mammal denoted by a noun is usually referred to by a morphologically regularly related mass term derived from the noun via the suffix -ina. The suffix is of very general application, but sometimes the derivation is blocked, e.g. korova (cow) in this sense is blocked in favour of the word gov'ad-ina. Although the word contains the suffix, the stem is not the name of any animal. Now, if you wanted to convey the idea that the edible parts of a mammal were consumed as a whole, and not in portions, you could use the sense extension device as in English. And it is the only way to use the animal names for mammals in this extended sense, since the uses which call the holistic consumption into question have a very strange ring to them, e.g. (8).

(8) a. My eli korovu
       We were eating the cow
    b. My eli gov'adinu iz serebr'anyx tarelok
       We were eating the veal from silver plates
    c. ?? My eli korovu iz serebr'anyx tarelok
       We were eating the cow from silver plates

It seems that in Russian we have two different sense extension rules for one domain. The conventionalized sense extension is not blocked for korova. It is not clear how to distinguish the result of the two rules in the lexicon. In contrast to this, the English-like 'sense extension to meat' rule is freely applicable to names of fish, where the corresponding morphological derivatives are very rare, available mostly for expensive big fish like salmon. Thus it is perfectly OK to have (9).

(9) My eli sudaka (iz serebr'anyx tarelok)
    We ate pike (from silver plates)

For names of bigger fowl the derivatives often exist, but the difference in meaning between the morphologically derived forms and the forms in the extended sense is barely perceptible. The derivation does not apply to smaller hunted birds like snipe, or to the word ptica (fowl), and sense extension rules function here as in the case of fish. On the assumption that derivation is a lexical rule, the interaction of the two rules is difficult to state. The necessary domain definitions seem also to be very difficult in the
lexicon. There does not seem to be a problem of principle for the pragmatic account, since the applicability of the transfer function can be derived on any cultural basis whatsoever. At a first approximation, the dichotomy could run along the lines of how much edible substance is obtained from the animal or whether or not it is eaten as a whole as a rule, leaving ptica (fowl) out for obvious reasons. The generalization can provide estimates for a nonce word or an unknown name of an animal. The probabilistic solution in the lexicon is completely non-predictive in these cases.2

Apresjan (1973) is a compendium of sense extension rules in Russian. It includes, among other regularities, the following two: the sense extension rule from fruit to plants bearing these fruit,3 in case the fruit is used for food (10), and from plants to a kind of food product made from them, as in (11).

(10) a. abrikos vs. abrikos
        apricot
     b. jablon'a vs. jabloko
        apple tree vs. apple
(11) gorcica, xren
     mustard, horseradish

The first makes an exception for apples (10b) and does not easily apply to exotic fruit in Russian. Thus, it is very strange to say There stood a banana/mango/coconut/orange at the corner to refer to the corresponding plants. The second applies only to plants which can reach the kitchen table ground up, so that the form identifying the plant is not preserved. A third regularity, not listed in Apresjan's work, but very common, is to refer to spots made by the juice of some berries with the name of the berry, e.g. in (12).

(12) U teb'a klubnika na shtanax ne otstiralas'
     At you strawberry on trousers not washed-away-reflexive
     The strawberry juice stain on your trousers did not wash away

It is unclear whether three sense extension rules in the sense of lexical mapping are adequate here. If the first one is regarded as such, either there is no origin specification in the lexical entries of exotic fruit, because there is no word blocking the application of the rule similar to (10b), or the probabilistic solution must be adopted. The absence is unnatural; the probabilistic solution simply registers the exceptions. But the upshot of this is that people who do not know the names for the plants will still use the sense extension innovatively. This does not seem to happen; they rather use things like banana tree/bush. On the pragmatic approach one explanation could simply state that the domain of the rule is limited because its results for exotic fruit are culturally not noteworthy.
The second rule would either require a very detailed specification both in lexical entries and in the rule domain of what is usually done to the plant in the domain to exclude cases of calling an apple pie an apple, or must be assimilated to some more general rule with consecutive specification of the result. Carrots cut in pieces are still called carrot, and it could be maintained that it is actually a case of a very general sense extension rule called grinding which allows the shift to the substance obtained from some object. That ground mustard is used in the kitchen only prepared in a specific way could be claimed to be world knowledge. An obvious difficulty with this treatment is the granularity of the rules and their precision.4 We could claim that we have a lexical rule which relates syntactic features and introduces this general relation, and our world knowledge tells us how it is specified in the context, e.g. how the object is ground and what form grinding turns the thing into depending on the context. The only requirement for making the rules more precise is that of compatibility with the derived syntactic properties. The difficulty is in the requirement to specify a context: mustard seeds can be ground, but not processed to be used, i.e. mixed with vinegar, etc. If we license the rule in the general form for the lexical entry mustard, we still have to qualify its result. This is a patently pragmatic option, so should we first license such general rules in the lexicon to produce excessively general lexical entries and then move them to pragmatics for qualifications? The third rule, if considered as originating in the lexicon, presupposes such a considerable amount of world knowledge in the lexicon as to make its principled structuring by qualia relations very implausible.

However, the conception of lexical rules as defaults in the lexicon has the merit of a rigorous formalization in Copestake & Briscoe (1995). The pragmatic alternative along the lines of Nunberg has some generality which can be made more precise. The origin of the phenomenon, it might be claimed, is in the way we name things. Some names are just reserved labels, some names use relations between things, invoking the concept of one thing and shifting to a related concept, and sense extension is a way to produce this kind of shift. Before we name things, we probably decide whether the thing is worth being named at all, since the device of description is always available. This noteworthiness is a prerequisite of sense extension, cf. Nunberg (1995). But if noteworthiness or nameworthiness (nameworthiness is probably a more suggestive term) is computed at all, its criteria are far from being clear. The nameworthiness concerns the relation underlying the regularity, and the result. It is usually motivated by the high relevance both of the processes conceptualized by the relation and of the results of these processes to some human sphere of life, and is context dependent. The nameworthy relations
are used by transfer functions to give an interpretation of a lexical item in a context. The central problem of the pragmatic account now is a description of transfer functions. This is also the aim of the paper. It attempts to give such a description in a general framework which views semantic interpretation as hypothetical reasoning, lexical interpretation being a part of this activity.

2 LEXICAL INTERPRETATION AS ABDUCTION
If lexical interpretation is hypothetical activity, lexical rules as transfer functions could be based on the reasoning mechanisms underlying this activity, too. Lexical rules determine the hypotheses space of this activity. Since a hypothesis may be adopted, but need not be, such rules will be rather like the 'implicational possibility rules' of Green. Hypothetical reasoning can be considered a case of abduction, following Peirce, e.g. Peirce (1992). In its simplest form abduction is a mode of reasoning on the basis of a rule (which establishes a connection between a case and a result) and a possible result of the rule, to the case (possible reason for the result). The notion 'reason' is deliberately very broad and intuitively vague here, since it must be based on the formal properties of the underlying logic. The technical terms used in the literature are explanation for the reason in question and evidence or observation for the observed result of the rule. If the rule is stated as an implication, abduction is just following modus ponens backwards.

(13) if B is observed, and A ⇒ B is given,
     abduce A, a possible reason for B

All possible reasons are called hypotheses or abducibles. The use of abduction to model interpretation of texts has already been reported by Charniak & McDermott (1985). Hobbs, Stickel, Appelt, & Martin (1993) took it a step further. To find an interpretation of a sentence they explain its logical representation by abducing the best possible explanation for its components, where best is having the least cost. In the case of lexical interpretation they represent ambiguity by providing all possible interpretations and disambiguating the result by searching for the hypothesis which provides the cheapest explanation. Consider their example in (14). The knowledge of the interpretation of the word bank in English is taken to consist of the postulated predicate bank(X), which is implied by two concepts corresponding to the notions of river bank, bank_river(X), and of financial institution, bank_finance(X).

(14) { bank_river(X) ⇒ bank(X)
       bank_finance(X) ⇒ bank(X) }
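As a rough illustration of how the schema in (13) applies to the rules in (14), the following Python sketch runs modus ponens backwards over propositional stand-ins for the two implications and then picks the cheapest hypotheses. It is not the system of Hobbs et al. (1993): the names `RULES`, `abduce`, and `cheapest`, and the two numeric cost constants, are assumptions introduced here. The only idea taken over is that a hypothesis already in use is cheaper than a newly assumed one.

```python
# Rules as (antecedent, consequent) pairs; propositional stand-ins for (14)
# with a discourse referent already plugged in for X.
RULES = [
    ("bank_river", "bank"),
    ("bank_finance", "bank"),
]

# Assumed cost values, chosen only to make re-use cheaper than novelty.
NEW_HYPOTHESIS_COST = 1.0
REUSED_HYPOTHESIS_COST = 0.0


def abduce(observation, rules=RULES):
    """Return every antecedent A such that a rule A => observation exists."""
    return [antecedent for antecedent, consequent in rules
            if consequent == observation]


def cheapest(observation, context=frozenset(), rules=RULES):
    """Return the abducibles for `observation` with minimal cost.

    `context` holds hypotheses that are already in use in the discourse;
    re-using one of them is cheaper than assuming a new one.
    """
    candidates = abduce(observation, rules)
    if not candidates:
        return []

    def cost(hypothesis):
        if hypothesis in context:
            return REUSED_HYPOTHESIS_COST
        return NEW_HYPOTHESIS_COST

    best = min(cost(h) for h in candidates)
    return [h for h in candidates if cost(h) == best]
```

With an empty context both readings are equally cheap, so `cheapest("bank")` returns both. If `bank_river` is already in the context, only that reading is returned.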
The antecedents may be taken as possible hypotheses, and if some discourse referent $d$ is plugged in for the variable, we can explain the observation $\mathit{bank}(d)$ in two ways by abducing to one of the readings following (13) and (14). If something implying $\mathit{bank}_{river}(d)$ occurs in the text, $\mathit{bank}_{river}(d)$ can be explained in its turn, and since this amounts to using a hypothesis which has already been put to use, this explanation is cheaper than assuming a new hypothesis. So the reading $\mathit{bank}_{river}(X)$ is, in a sense, primed in the context. I will start with this example as the basic idea underlying lexical interpretation, but elaborate it in terms of abductive systems as defined by David Poole (Poole 1988). Talking about AI treatments of diagnosis, Raymond Reiter (Reiter 1987) notes that:

Many non-monotonic inferences are abductive by nature, which is to say they provide plausible explanations for some states of affairs . . . The problem, of course, is that not just any explanation will do; it must, in some sense, be a 'best' explanation . . . But if there is a best theory, there must be poor ones; so diagnostic reasoning really consists of two problems: (a) What is the space of possible theories that account for the given evidence? (b) What are the best theories in this space?

Poole systems will be used to define the space of all interpretations of a lexical item. Under the approach of Hobbs et al. (1993) the hypothesis space is not limited, and the problem of choice is not separated from the problem of characterizing the hypothesis space. Poole systems (Makinson 1994 introduced the term) or abductive frameworks are a formalization of hypothetical reasoning as theory formation based on a first-order language $L$. Let $\Gamma$ be a set of sentences in $L$ which are considered to be facts in the sense that we accept them as true and inviolate for the span of the abductive task at hand, and let $\varphi$ be a sentence expressing another kind of fact, an observation which we want to explain. Finding an explanation is our abductive task. We also have a set of hypotheses $\Pi$ at our disposal, which we can use to explain $\varphi$, as long as they do not lead to contradiction. For technical reasons (Poole 1987) we need a set $P$ of ground instances of the formulas of $\Pi$ which would together with $\Gamma$ imply $\varphi$. An abductive task assumes a very simple version of theory building: if a set of ground instances $P$ is shown not to contradict $\Gamma$, we simply compute the consequences of $\Gamma$ and $P$ together, which is a theory in the formal meaning of the word. If $\varphi$ is in the theory, we may say that $P$ explains $\varphi$. The sentences in $P$ are called abducible sentences or abducibles.
To represent generalizations with potential exceptions we allow the hypotheses in $\Pi$ to be open first-order formulas. Free variables can be used as placeholders for constants. Given all substitution instances of a hypothesis, those substitution instances which are explicitly contradicted may not be used; the others may. The rules in $\Pi$ are thus sentence generators. We may now introduce the necessary terminology. An abductive framework is a pair $(\Gamma, \Pi)$ of sets of possibly open formulae. Let $P$ be a set of ground instances of formulas from $\Pi$.

Def. 1: A scenario of an abductive framework $(\Gamma, \Pi)$ is a set $P$ of ground instances of elements of $\Pi$ such that $\Gamma \cup P$ is consistent.

Def. 2: If $\varphi$ is a sentence, an explanation of $\varphi$ from $(\Gamma, \Pi)$ is a scenario $P$ of $(\Gamma, \Pi)$ which together with $\Gamma$ implies $\varphi$, i.e. a set $P$ of ground instances of $\Pi$ is an explanation of $\varphi$ iff
(i) $P \cup \Gamma \models \varphi$
(ii) $P \cup \Gamma$ is consistent

It is possible to use provability, $\vdash$, instead of the modelling relation, $\models$, due to the equivalence of the two notions for first-order languages.⁵ A theory is explicated as an extension of the abductive framework $(\Gamma, \Pi)$, defined as follows.

Def. 3: An extension of $(\Gamma, \Pi)$ is the set of logical consequences of the union of $\Gamma$ and some scenario $P$ of $(\Gamma, \Pi)$ that is maximal with respect to set inclusion.

Another name for an extension is maxiconsistent set (Makinson 1994). There is a connection between explainability and theory building, expressed in the following theorem proved in Poole (1988).

Theorem: There is an explanation of $\varphi$ from $(\Gamma, \Pi)$ iff $\varphi$ is in some extension of $(\Gamma, \Pi)$.

The theorem says that $\varphi$ can be explained iff it follows from a consistent theory based on some maximal set of hypotheses. Consider a simple example of an abductive task. Suppose we observe shoes-are-wet. The facts at our disposal are the implications of the data base in (15).

(15)
\[
\Gamma = \left\{ \begin{array}{l}
\textit{rained-last-night} \Rightarrow \textit{grass-is-wet} \\
\textit{sprinkler-was-on} \Rightarrow \textit{grass-is-wet} \\
\textit{grass-is-wet} \Rightarrow \textit{shoes-are-wet}
\end{array} \right\}
\]

The hypothesis set $\Pi$ may contain all the antecedents of the implications. It is often preferable to have only the basic hypotheses, in the sense that they are not further explainable from other hypotheses, i.e. $\Pi = \{\textit{rained-last-night}, \textit{sprinkler-was-on}\}$. We can choose either $P_1 = \{\textit{rained-last-night}\}$ or $P_2 = \{\textit{sprinkler-was-on}\}$ as the explanation of shoes-are-wet. The maximal scenario is their union. The extension is built on this scenario. The maximal scenario is also an explanation, but it is too presumptive. We would like to assume not all compatible cases at once, but one at a time, i.e. we should use explanations which are minimal in terms of set inclusion.

We want to use knowledge in the form of hypotheses only if there is evidence for them. We do not want constantly to hypothesize that it rained last night, or that we must pay for cigarettes, but do this only if our shoes are wet or if there is a cigarette shortage and the vending machine refuses to budge. Since neither $\varphi$ nor $\Pi$ is syntactically marked as observation or hypotheses, respectively, they are simply earmarked as such when they are viewed as terms in the relation of explainability. The point is that they cannot be simultaneously treated as something else in the same task. And explainability is only one family of relations involved in non-monotonic inferential activity. Another family of relations is prediction by default, i.e. prediction either of what is a convention, or of what is an accepted tendency. Extensions (or an explainability relation) can be used to model default reasoning, too. By default reasoning is usually meant a kind of hypothetical reasoning where the hypotheses are used not to explain observed things, but to predict what may be. In modelling prediction by explainability we must define criteria of what is predicted. There are different possibilities. Usually what is predicted by default is defined as something which is in every extension of an abductive framework, i.e. what is explained by every theory. Default rules have the property which is called conditioning in Poole (1991), i.e. they are used whenever their preconditions are met. Hypotheses are used when there is evidence for them. Formally there is no difference between the two.
Moreover, whatever is a hypothesis in one abductive task might be a default in another. This conceptual difference can be taken care of by keeping defaults and hypotheses separate, so the set of defaults will be denoted by $\Delta$; defaults can, however, be used in explanation alongside hypotheses. The modified definitions are as follows.

Def. 4: A scenario of an abductive framework $(\Gamma, \Delta, \Pi)$ is the union of a set $P$ of ground instances of elements of $\Pi$ and a set $D$ of ground instances of $\Delta$ such that $\Gamma \cup P \cup D$ is consistent.

Def. 5: If $\varphi$ is a sentence, an explanation of $\varphi$ from $(\Gamma, \Delta, \Pi)$ is a scenario $A = P \cup D$ of $(\Gamma, \Delta, \Pi)$ which together with $\Gamma$ implies $\varphi$, i.e. $A = P \cup D$ is an explanation of $\varphi$ iff
(i) $P \cup D \cup \Gamma \models \varphi$
(ii) $P \cup D \cup \Gamma$ is consistent

It is essential to understand that explanation is not prediction by default. This point is often overlooked.⁶
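The notions of scenario, explanation, minimal explanation, and extension can be made concrete for the wet-shoes example with a small propositional sketch. It makes several simplifying assumptions that are not part of the definitions: facts are restricted to Horn implications, consequences are computed by forward chaining, inconsistency is represented by an explicit list of forbidden atom sets (`nogoods`, a hypothetical device), and defaults are not distinguished from hypotheses.

```python
from itertools import combinations


def closure(rules, atoms):
    """Forward-chain Horn rules (body, head) from a set of assumed atoms."""
    derived = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def scenarios(rules, hypotheses, nogoods=()):
    """All subsets of the hypotheses whose consequences contain no nogood."""
    hyps = sorted(hypotheses)
    found = []
    for k in range(len(hyps) + 1):
        for combo in combinations(hyps, k):
            derived = closure(rules, combo)
            if not any(ng <= derived for ng in nogoods):
                found.append(frozenset(combo))
    return found


def explanations(rules, hypotheses, observation, nogoods=()):
    """Scenarios which, together with the facts, imply the observation."""
    return [p for p in scenarios(rules, hypotheses, nogoods)
            if observation in closure(rules, p)]


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    return [p for p in sets if not any(q < p for q in sets)]


def extensions(rules, hypotheses, nogoods=()):
    """Consequence sets of the scenarios that are maximal under inclusion."""
    scen = scenarios(rules, hypotheses, nogoods)
    maximal = [p for p in scen if not any(p < q for q in scen)]
    return [closure(rules, p) for p in maximal]


# The data base (15) and the basic hypotheses.
RULES = [
    (frozenset({"rained-last-night"}), "grass-is-wet"),
    (frozenset({"sprinkler-was-on"}), "grass-is-wet"),
    (frozenset({"grass-is-wet"}), "shoes-are-wet"),
]
HYPOTHESES = {"rained-last-night", "sprinkler-was-on"}

all_expl = explanations(RULES, HYPOTHESES, "shoes-are-wet")
min_expl = minimal(all_expl)
```

Here `all_expl` contains both single hypotheses and their union, while `min_expl` keeps only the two singletons, which corresponds to assuming one compatible case at a time.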
Poole systems may use constraints on inference which serve as a kind of inference control mechanism. The next definition gives the form of Poole systems with constraints.

Def. 6: A scenario of an abductive framework with constraints $(\Gamma, \Pi, \Delta, C)$ is the union of a set $P$ of ground instances of elements of $\Pi$ and a set $D$ of ground instances of $\Delta$ such that $\Gamma \cup P \cup D$ is consistent. If
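\[
\Gamma = \left\{ \mathit{ct}(C) \,\&\, \mathit{animaltofur}(X, Y, C) \,\&\, \mathit{aspof}(X, Y, C) \Rightarrow \mathit{shift}(X, Y) \right\}
\]
\[
C = \left\{ \begin{array}{l}
\mathit{ct}(C) \,\&\, \mathit{animaltomeat}(X, Y, C) \Rightarrow \mathit{count}(X) \,\&\, \mathit{animal}(X) \,\&\, \neg(\mathit{count}(Y)) \,\&\, \mathit{edible}(Y) \,\&\, \mathit{consistsof}(X, Y) \\
\mathit{ct}(C) \,\&\, \mathit{animaltofur}(X, Y, C) \Rightarrow \mathit{count}(X) \,\&\, \mathit{animal}(X) \,\&\, \neg(\mathit{count}(Y)) \,\&\, \neg(\mathit{edible}(Y)) \,\&\, \mathit{partof}(X, Y)
\end{array} \right\}
\]

To see how the rule works, assume that the relevant parts of the lexical entry for rabbit are as in (25).

(25)
\[
\Pi = \{ \mathit{rabbit}(X) \}
\]
\[
\Gamma = \left\{ \begin{array}{l}
\mathit{rabbit}(X) \Rightarrow \mathit{sfrabbit}(X) \\
\mathit{shift}(X, Y) \Rightarrow \mathit{sfrabbit}(Y)
\end{array} \right\}
\]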
The context is fixed to the lexical interpretation of rabbit, i.e. ct(rabbit) holds. The semantic form sfrabbit(X) of rabbit can be explained via rabbit(X), via animaltofur(Z, X, rabbit), or via animaltomeat(W, X, rabbit). The predicates consistsof(X, Y) and partof(X, Y) provide the necessary entailments via aspof(X, Y, rabbit), and constrain the rules simultaneously. Consider now the examples from section 1. The case of mole can be explained if consistsof is not nameworthy for moles, presumably because mole stuff is not considered to be edible. It can be assumed to be so. This exceptional hypothesis decreases the acceptability judgements. The case of (11) can be seen as a small-scale generalization. The use of abduction allows generalization on the basis of a small number of sufficiently similar cases. A new predicate is introduced which is abductively explained by the cases in question. There is no need for straightforward semantic criteria for this relation, which are sometimes required under the lexical rule approach, cf. Briscoe & Copestake (1996). This is especially useful in the case of (12): the generalization exploits a property of easily squashable juicy plants which is nameworthy because it is very salient in a small, but very important set of contexts. And lastly, exotic plants are not nameworthy origin properties of exotic fruit, presumably because they are exotic. In the case of animal-to-meat sense extension in Russian, the domain of the morphological derivation rule can be easily restricted to animals which are sold as meat, in portions. The interaction between the two rules, the morphologically marked derivation rule and the sense extension, is the subject of the next section.
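The way constraints filter the available readings in the rabbit and mole cases can be sketched as follows. This is a rough illustration and not the paper's formal system: the names `CONSTRAINTS`, `admissible`, and `readings` are hypothetical, the relations consistsof and partof are left out for brevity, and a hypothesis is taken to be blocked only when one of its constraint literals is explicitly contradicted by what is known.

```python
# Constraint literals for each shift hypothesis: (predicate, argument role, required value).
# The binary relations consistsof/partof are omitted in this simplified sketch.
CONSTRAINTS = {
    "animaltomeat": [("count", "X", True), ("animal", "X", True),
                     ("count", "Y", False), ("edible", "Y", True)],
    "animaltofur": [("count", "X", True), ("animal", "X", True),
                    ("count", "Y", False), ("edible", "Y", False)],
}


def admissible(hypothesis, x, y, known):
    """A hypothesis instance is usable unless a constraint is explicitly contradicted.

    known maps (predicate, individual) to True/False; missing entries are unknown
    and therefore do not block the hypothesis.
    """
    binding = {"X": x, "Y": y}
    for pred, role, required in CONSTRAINTS[hypothesis]:
        value = known.get((pred, binding[role]))
        if value is not None and value != required:
            return False
    return True


def readings(animal, aspects, known):
    """List the ways the semantic form of an animal noun can be explained."""
    result = ["animal"]  # the primary reading, via the noun's own predicate
    for hypothesis, aspect in aspects.items():
        if admissible(hypothesis, animal, aspect, known):
            result.append(hypothesis)
    return result


rabbit = readings("rabbit",
                  {"animaltomeat": "rabbit-stuff", "animaltofur": "rabbit-fur"},
                  known={})
mole = readings("mole",
                {"animaltomeat": "mole-stuff", "animaltofur": "mole-fur"},
                known={("edible", "mole-stuff"): False})
```

Under these assumptions all three readings are available for rabbit, whereas the meat reading of mole is blocked by the explicit knowledge that mole stuff is not edible.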
5 BLOCKING AND DEBLOCKING
Presumably these resources must be assessed from the viewpoint of the interlocutors, i.e. (a) from the point of view of the listener: could the speaker intend, with the help of this device, to name something which does not have a specific name with the required properties? and (b) from the point of view of the speaker: can the listener plausibly find the hypothesis? These are pragmatic constraints, since their justification lies in the fact that there is a primary interpretation of a word, and the laws of successful communication are known to the interlocutors. On the one hand, the listener's constraint would give one part of an account of blocking and deblocking. In (6), the use violates the assumption that there is no appropriate specific name, so the reference shift seems unmotivated. In (7) this shift is justified, since an additional characteristic can be hypothesized, distinguishing the two words. The mechanism is in each case based on something like Gricean maxims yielding discourse implicatures (Copestake & Briscoe 1995), probably because more specific explanations are more informative. For this explication to go through, the explanation via a name should be relatively more specific according to the specificity of explanations criterion. The definition of specificity (7) will be shown below to be applicable in this case. On the other hand, there should be a general mechanism to compute the plausibility of some interpretation relative to others which reflects the effort. Computing the plausibility of an interpretation can use different degrees of assumption, e.g. how much inference using world knowledge this computation involves. The extreme case is probably the interpretation of unknown words. If, e.g., the context of interpretation strongly suggests the meat interpretation of an unknown word, then the constraints of the rule (23) can be used as a default characterisation of the aspect relation associated with the hypothesis animaltomeat.
Thus, if somebody does not know what a badger is, s/he can still infer that it is an animal, and that the edible part of its stuff is meant, though it is impossible to say which part it is; delicacy, ham, and eaten bias the explanation towards the edible parts. The bias could be accounted for using something like the coherence measure of Ng & Mooney (1990).

To return to the account of explanation interaction in terms of Gricean maxims: one strategy of the listener is to assume that the speaker is as informative as necessary and possible. In other words, s/he is trying to be as specific as possible in the sense of Def. 6 of the specificity of solutions. The solutions in the case of the name vs. extended sense possibilities are solutions to the choice of words. Given that this is known to the speaker, s/he would normally act this way. Acting contrary to it should be justifiable, i.e. explainable in its own right. In (6), the use violates the assumption that there is no specific name with appropriate properties, and there is no evident justification. In (7) this use is justified, since an additional characteristic of the object referred to can be computed on the basis of the contextual information: pig is intended to refer to generally inedible parts of a pig, which are in the sausages. So the use of pig has a somewhat different explanation than in the normal case. This use is justified from the point of view of Throat. The listener is invited to drop the constraint on edibility or to expand this notion: an invitation not understood by the customers of Throat, as a rule, to their disadvantage. This account employs the definition of specificity. To show that Def. 6 is applicable in this case, we must compare two solutions to the choice problem. If the solution with the name turns out to be more specific, the definition (7) is applicable. But since the problem of word choice is not quite the problem of explanation choice, we should compare the resources of explanation in general, i.e. explanation schemata and not their instances.

Let (26) be the lexical entry for pork, and (27) the lexical entry for pig.

(26)
\[
\Pi = \{ \mathit{pigmeat}(X) \} \qquad \Gamma = \{ \mathit{pigmeat}(X) \Rightarrow \mathit{sfpork}(X) \}
\]

(27)
\[
\Pi = \{ \mathit{pig}(X) \} \qquad
\Gamma = \left\{ \begin{array}{l}
\mathit{pig}(X) \Rightarrow \mathit{sfpig}(X) \\
\mathit{shift}(X, Y) \Rightarrow \mathit{sfpig}(Y)
\end{array} \right\}
\]

Suppose our relevant conceptual knowledge is represented in (28), i.e. pigmeat is the meat a pig consists of, and it is a nameworthy aspect of pig that it consists of meat (meat abbreviates edible stuff here).
(28)
\[
\Pi = \{ \mathit{usemeat}(X, Y),\ \mathit{pigtomeat}(X, Y) \}
\]

(29)
\[
\Gamma = \left\{ \begin{array}{l}
\mathit{pigmeat}(Y) \,\&\, \mathit{usemeat}(X, Y) \Rightarrow \mathit{consistsof}(X, Y) \,\&\, \mathit{pig}(X) \,\&\, \mathit{meat}(Y) \\
\mathit{pig}(X) \,\&\, \mathit{pigtomeat}(X, Y) \Rightarrow \mathit{consistsof}(X, Y) \,\&\, \mathit{meat}(Y) \\
\mathit{pig}(X) \,\&\, \mathit{consistsof}(X, Y) \,\&\, \mathit{meat}(Y) \Rightarrow \mathit{aspof}(X, Y, \mathit{choice}) \\
\mathit{pigmeat}(Y) \,\&\, \mathit{consistsof}(X, Y) \,\&\, \mathit{pig}(X) \Rightarrow \mathit{aspof}(X, Y, \mathit{choice})
\end{array} \right\}
\]

Since the abductive task is that of choosing words, the context is fixed, e.g. to the constant choice by ct(choice). The rule (23) is also available in ct(choice) as (30).

(30)
\[
\Delta = \{ \mathit{animaltomeat}(X, Y, \mathit{choice}) \}
\]
\[
\Gamma = \{ \mathit{animaltomeat}(X, Y, \mathit{choice}) \,\&\, \mathit{aspof}(X, Y, \mathit{choice}) \,\&\, \mathit{ct}(\mathit{choice}) \Rightarrow \mathit{shift}(X, Y) \}
\]
\[
C = \{ \mathit{ct}(\mathit{choice}) \,\&\, \mathit{animaltomeat}(X, Y, \mathit{choice}) \Rightarrow \mathit{count}(X) \,\&\, \mathit{animal}(X) \,\&\, \neg(\mathit{count}(Y)) \,\&\, \mathit{edible}(Y) \,\&\, \mathit{consistsof}(X, Y) \}
\]

We have to show that the solution (31) is less general than the solution (32).

(31) $\langle \mathit{pigmeat}(X),\ \mathit{sfpork}(u) \rangle$

(32) $\langle \{ \mathit{pigtomeat}(X, Y),\ \mathit{pig}(X),\ \mathit{animaltomeat}(X, Y, \mathit{choice}) \},\ \mathit{sfpig}(u) \rangle$

To check this, assume that the discourse referent $u$ is provided in the choice context choice. Consider the case where the choice context contains only the grammatical information that $u$ is a discourse referent, discoursereferent(u), and we know nothing about its identity. In this case both the solution (31) and the solution (32) are applicable. But if we modify this contingent fact and assume that we already know that pig(u), (31) is no longer applicable, since pigs are not pork. Solution (32) is still applicable, because if $X$ is instantiated to $u$, the hypothesis pig(u) suffices to explain sfpig(u). Thus, the account of blocking/deblocking sketched above can use Def. 6 with open formulas.
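The comparison of the pork and pig solutions across choice contexts can be illustrated with a deliberately crude sketch. It is not the specificity definition itself: solutions are reduced to the single ground hypothesis that matters in the argument, incompatibility between a hypothesis and a contextual fact is listed by hand, and all names (`applicable`, `more_specific`, `INCOMPATIBLE`, and so on) are hypothetical.

```python
def applicable(solution, context, incompatible):
    """A solution applies in a context unless one of its hypotheses clashes with a known fact."""
    return not any(frozenset({hyp, fact}) in incompatible
                   for hyp in solution for fact in context)


def more_specific(a, b, contexts, incompatible):
    """a is more specific than b if b applies wherever a does, and somewhere a does not."""
    a_ok = [applicable(a, c, incompatible) for c in contexts]
    b_ok = [applicable(b, c, incompatible) for c in contexts]
    covers = all(b_applies for a_applies, b_applies in zip(a_ok, b_ok) if a_applies)
    strict = any(b_applies and not a_applies for a_applies, b_applies in zip(a_ok, b_ok))
    return covers and strict


# Assumption: pigs are not pork, so the two hypotheses cannot hold of the same referent.
INCOMPATIBLE = {frozenset({"pigmeat(u)", "pig(u)"})}

SOLUTION_31 = {"pigmeat(u)"}   # explains sfpork(u)
SOLUTION_32 = {"pig(u)"}       # suffices to explain sfpig(u) once X is instantiated to u

CONTEXTS = [
    {"discoursereferent(u)"},            # nothing known about the identity of u
    {"discoursereferent(u)", "pig(u)"},  # u is already known to be a pig
]

pork_is_more_specific = more_specific(SOLUTION_31, SOLUTION_32, CONTEXTS, INCOMPATIBLE)
pig_is_more_specific = more_specific(SOLUTION_32, SOLUTION_31, CONTEXTS, INCOMPATIBLE)
```

In this toy setting the pork solution comes out as the more specific one, because it stops being applicable as soon as the referent is known to be a pig.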
Consider now the example of this sense extension in Russian. A number of people have suggested that morphologically marked semantics-changing derivation rules do not really differ from sense extension rules. They can thus be compared for specificity. The restriction of the sense extension rule to the holistic meaning is easily explainable if assumptions made to satisfy Gricean maxims can be conventionalized. Since the derivation rule is restricted to animals sold in portions, solutions using it would have been more specific than those using sense extension. The sense extension rule must acquire an additional condition to be used nevertheless. The contrary of the restriction is the simplest conventionalizable addition, in a sense. For small animals the additional restriction is often not distinguishable from the holistic version; hence there is no violation of the maxims if both rules are used. This account of rule deblocking refers to pragmatic inferences which may be ad hoc, for single words, or conventionalized for a class of words. But it is not applicable in a straightforward way to syntactic rules because of that. Specificity ranking can be defined on different kinds of rules. Inasmuch as such ranking is used in syntax, the similarity is very interesting and suggestive, and may be a manifestation of some deeper information processing property of human intelligence, but the solutions to the problems of interpretation should not be automatically transferred to syntax, contrary to what I think is the position of Briscoe et al. (1995). In particular, the morphological phenomenon in which different tense-formation rules (e.g. dreamed/dreamt) coexist is not deblocking in the sense explicated here, since deblocking involves pragmatic inferences extending some attributes of a concept, as the discussion of (7) indicated, and is not available in the syntax. Another feature of the account is that it presupposes contextual variation of concepts. Principles of such conceptual modification in a context are postulated by psychologists, but are not very clear (Barsalou 1992 reports on such effects).

6 A COMPARISON OF SOME RULE PROPERTIES UNDER THE TWO APPROACHES

A short comparison of the two approaches is now in order. Though the paper is largely programmatic, and merely sketches a formalization of the notion of a transfer function of Nunberg (1979) in the context of an abductive theory of natural language interpretation, there are two points clearly relevant to the proposal which should be discussed, however briefly, i.e. contextual and language-particular licensing of the transfer. I will not be able to provide a theoretical contribution to these problems and have to restrict myself to some remarks.

Since any aspect of a concept is in principle available as a source of a hypothesis space extension for rule generation, but not all potential extensions are observed in every language, and if they are, then not in all contexts, such extensions must be licensed by some contextually relevant and language-relevant factors. As far as language-particular preferences for sense extensions are concerned, the position of Nunberg seems to be adequate: cultural salience can lead to a sense extension rule. Unfortunately, no experiment in rule generation is possible here, since sense extension rules are learned when learning a language, and not created anew. Once the rule is highly conventionalized, it can be placed in the lexicon in the sense of being more readily accessible. The objection of Lascarides & Copestake (1998) that a non-trivial interface is required between the sort of formalism necessary to implement open-ended inference of the kind proposed here and the syntactic representation is based on the assumption that some reasoning takes place in the lexicon, and it is better to make it easier. Programmatically, Poole systems and the context theory can achieve exactly this, limiting the depth of abduction.

Contextual dependence of a rule is another important matter. The use of context is limited under the lexicon rule approach to overriding the default information. Arguably, there is a better use, for which the abductive pragmatic approach is better suited. Consider the case of juice stains in (12). It could be assumed that all the aspects of a concept have logically equal chances either to be an interpretation for the sform or not to be, taken in isolation, but acquire different preferences depending on the interpretation context. The rating of a hypothesis in this context should depend on its salience there. If some interpretation hypotheses provide minimal explanations of an observation in the interpretation context, they should become more salient. Thus, in the context of washing, the juice stain interpretation of strawberry should be very salient. Then the corresponding rule will have a high probability of being chosen. The question is how to measure salience to achieve the kind of reasoning described. Intuitively, the interpretation becomes more likely because it explains part of the context. If you wash your trousers, this might be related to the stain, though other explanations are possible. Suppose we consider the salience of the stain hypothesis to be proportionate to a probability estimate of washing the trousers as a result of a stain on them. The new conditional probability of the hypothesis might be taken to reflect its chances of entering into an explanation of the word strawberry (its interpretation). But if its salience is only proportionate to the probability, it is actually a likelihood function (Edwards 1992). Thus, there are two roads to compute the saliences of the hypotheses in the context: either to treat them as probabilities directly, or to take their negative logarithms (log-likelihood) and treat them as cost assignments. Probabilistic abduction proposed in Poole (1993) is a way to compute the conditional probabilities of hypotheses in a given context. Cost-based abduction with probabilistic cost semantics proposed by Charniak &
Shimony (1994) could implement the log-likelihood-based version, although a flexible reassignment of costs in a context is needed.

Lexicon rules can easily accommodate all kinds of syntactic and morphological effects accompanying sense extension. It is then possible to claim basic similarity of sense extensions and derivation. This is not a problem for the pragmatic rules, either, since rules can occur in lexical entries of affixes. Blocking and deblocking can be modelled under both approaches, but the interpretation of deblocked items can be naturally handled under the pragmatic rules approach, whereas Briscoe & Copestake (1996), where a frequency-of-occurrence based account of blocking is proposed, reserve the problem of extra implicatures of the use of the blocked form and the generation of this form itself to the interface of pragmatics. Briscoe & Copestake (1996) also propose the use of statistical data to grade the rules relative to each lexical entry in the domain and to use the statistical information to guide the rule application. While this is of great potential interest for computational linguistics, this approach does not cover the contextual dependence of the rules discussed above. Another recent attempt to integrate probabilities, pragmatics, and a lexicon which is screened off from pragmatics is Copestake & Lascarides (1997). But the approach proposed there still offers no possibility to assign different probabilities to senses in different contexts.⁸ The dependence on frequency of observed readings can be reflected under the cost-assignment extension of abduction indicated above, too, so that rule probability for each lexical entry can, in principle, be registered on the aspof predicate of the corresponding aspect of the concept. However, this discussion shows that an abductive treatment along the lines of this paper is still more like a research programme, the work of Hobbs and his associates notwithstanding.

ANATOLI STRIGIN
Humboldt University, Berlin
Jägerstrasse 10-11
10117 Berlin
Germany
e-mail: [email protected]

Received: 28.08.97
Final version received: 18.06.98
NOTES

1 Nunberg & Zaenen (1992) appeal to Gricean maxims to justify their opinion that a specific description is to be preferred to a vague one 'where no ulterior motives intrude'. They do not specify how these motives are figured out and taken into account by the listener.

2 Though cultural salience is a vague notion, it is fairly clear in prototypical cases. If these cases serve as a basis of a generalization, the pragmatic approach allows for predictions about new nouns, i.e. names of unknown birds, new loans, etc. are going to be treated in one way or another depending on how well they fit the generalization.

3 The example is based on the hypothesis that this is the direction of the rule application. Some evidence for the hypothesis are forms like lemon-tree in English or Apfelbaum (apple-tree) in German. This is also the direction adopted in Copestake & Briscoe (1995); Apresjan assumes the reverse.

4 The term is used by Nunberg & Zaenen (1992) to describe specifications of a general relation in a context depending on the world knowledge.

5 Note that the hypotheses in $\Pi$ are not statements about the real world, but assumptions about what can be a possible description of the world.

6 A reviewer noted that the default logic in Poole systems is insufficient. Indeed the relation of sceptical default consequence in Poole systems with constraints is equivalent to the prerequisite-free default logic of Reiter, cf. Reiter (1980), Dix (1992). However, the computed relation is here not that of sceptical default consequence, but of explanation. The objection does not apply. Pros and cons of abductive explanation vs. default logic should be evaluated in each application case.

7 See Chaffin (1992) for the polysemy of the part of relation.

8 In an attempt to provide an interpretation for the notoriously difficult domain of compounds Copestake & Lascarides (1997) suggest that interpreting a compound should be controlled by the associated probabilities. They register a restricted number of relations interpreting a compound in the lexicon as interpretation schema. These are ordered by a specificity hierarchy. If no specific interpretation is possible, the most general relation is considered and treated as an anaphor to be resolved pragmatically from the context. The choice between several possible compatible readings is guided by the principle that words are assigned the most probable sense that produces a well-defined discourse update. The distinctive feature of the proposal is the method of computing probabilities. The probability of a sense of a word depends on the frequency of this sense in the corpus used to compute the frequency distributions. The interpretation schemata defining the senses of the compounds are assigned weights which reflect their productivity. For a potential compound which does not occur in the corpus the probabilities of the senses will be proportional to the productivity rankings. If a compound is observed in the corpus, the probabilities of its senses are computed via their frequency in the corpus, with a residual probability distributed between those senses which are not observed, again in proportion to the productivity rankings. Thus, there is no way to take into account the influences of the context on the probability of a reading. Hence if two readings are compatible, it is invariably the more frequent that will be chosen. The closest we can get to reflecting the contextual influence is to index probabilities by the type of corpus used to compute them.
REFERENCES

Apresjan, J. (1973), 'Regular polysemy', Linguistics, 142, 5-32.
Barsalou, L. W. (1992), 'Frames, concepts and conceptual fields', in A. Lehrer & E. F. Kittay (eds), Frames, Fields, and Contrasts, Lawrence Erlbaum, Hillsdale, NJ, 21-74.
Briscoe, T. & Copestake, A. (1996), 'Controlling the application of lexical rules', Proceedings of the ACL SIGLEX Workshop on Breadth and Depth of Semantic Lexicons, Santa Cruz, CA.
Briscoe, T., Copestake, A., & Lascarides, A. (1995), 'Blocking', in P. Saint-Dizier & E. Viegas (eds), Computational Lexical Semantics, Cambridge University Press, Cambridge.
Chaffin, R. (1992), 'The concept of a semantic relation', in A. Lehrer & E. F. Kittay (eds), Frames, Fields, and Contrasts, Lawrence Erlbaum, Hillsdale, NJ, 253-88.
Charniak, E. & McDermott, D. (1985), Introduction to Artificial Intelligence, Addison-Wesley, Reading, MA.
Charniak, E. & Shimony, S. E. (1994), 'Cost-based abduction and MAP explanation', Artificial Intelligence, 66, 345-74.
Copestake, A. & Briscoe, T. (1995), 'Semi-productive polysemy and sense extension', Journal of Semantics, 12, 15-67.
Copestake, A. & Lascarides, A. (1997), 'Integrating symbolic and statistical representations: the lexicon pragmatics interface', Proceedings of the Association for Computational Linguistics 1997, Madrid.
Dix, J. (1992), 'Default theories of Poole type and a method for constructing cumulative versions of default logic', Proceedings of 10th ECAI, Wiley & Sons, New York, 289-93.
Edwards, A. W. F. (1992), Likelihood, Johns Hopkins University Press, Baltimore, MD.
Green, G. M. (1974), Semantics and Syntactic Regularity, Indiana University Press, Bloomington, IN.
Hayes, P. J. (1980), 'The logic of frames', in D. Metzing (ed.), Frame Conceptions and Text Understanding, Walter de Gruyter, Berlin.
Hobbs, J., Stickel, M., Appelt, D., & Martin, P. (1993), 'Interpretation as abduction', Artificial Intelligence, 63, 69-142.
Lascarides, A. & Copestake, A. (1998), 'Pragmatics and word meaning', MS.
Levesque, H. J. (1986), 'Knowledge representation and reasoning', Annual Review of Computer Science, 1, 225-87.
McCarthy, J. (1993), 'Notes on formalizing context', Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence.
McCawley, J. D. (1968), 'The role of semantics in a grammar', in E. Bach & R. T. Harms (eds), Universals in Linguistic Theory, Holt, Rinehart, & Winston, New York, 124-69.
McCawley, J. D. (1973), Grammar and Meaning, Taishukan Publishing Company, Tokyo.
Makinson, D. (1994), 'General patterns in nonmonotonic reasoning', in D. M. Gabbay, C. Hogger, J. Robinson, & D. Nute (eds), Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Clarendon Press, Oxford.
Ng, H. T. & Mooney, R. J. (1990), 'On the role of coherence in abductive explanation', Proceedings of the Conference of the American Association of Artificial Intelligence, 337-42.
Nunberg, G. (1979), 'The non-uniqueness of semantic solutions: polysemy', Linguistics and Philosophy, 3, 2, 143-84.
Nunberg, G. (1995), 'Transfers of meaning', Journal of Semantics, 12, 109-32.
Nunberg, G. & Zaenen, A. (1992), 'Systematic polysemy in lexicology and lexicography', in K. Hannu Tommola, T. Salmi-Tolonen, & J. Schopp (eds), EURALEX 1992 Proceedings, Tampere, Finland, 387-96.
Peirce, C. S. (1992), Reasoning and the Logic of Things, Harvard University Press, Cambridge, MA, edited by Kenneth Laine Ketner.
Poole, D. (1985), 'On the comparison of theories: preferring the most specific explanation', Proceedings of the Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA, 144-7.
Poole, D. (1987), 'Variables in hypotheses', Proceedings of the Tenth International Joint Conference on Artificial Intelligence, Milan, Italy, 905-8.
Poole, D. (1988), 'A logical framework for default reasoning', Artificial Intelligence, 36, 27-47.
Poole, D. (1991), 'The effect of knowledge on belief: conditioning, specificity and the lottery paradox in default reasoning', Artificial Intelligence, 49, 281-307.
Poole, D. (1993), 'Probabilistic Horn abduction and Bayesian networks', Artificial Intelligence, 64, 81-129.
Pustejovsky, J. (1995), The Generative Lexicon, MIT Press, Cambridge, MA.
Reiter, R. (1980), 'A logic for default reasoning', Artificial Intelligence, 13, 81-132.
Reiter, R. (1987), 'Nonmonotonic reasoning', Annual Review of Computer Science, 2, 147-86.
Ruhl, C. (1989), On Monosemy: A Study in Linguistic Semantics, State University of New York Press, Albany, NY.
Russell, S. & Norvig, P. (1995), Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, NJ.