Convex Functions, Partial Orderings, and Statistical Applications (Mathematics in Science and Engineering)


Author: Josip E. Pecaric | Frank Proschan and Y. L. Tong


Convex Functions, Partial Orderings, and Statistical Applications

This is volume 187 in MATHEMATICS IN SCIENCE AND ENGINEERING Edited by William F. Ames, Georgia Institute of Technology A list of recent titles in this series appears at the end of this volume.

CONVEX FUNCTIONS, PARTIAL ORDERINGS, AND STATISTICAL APPLICATIONS

Josip E. Pečarić
FACULTY OF TECHNOLOGY, UNIVERSITY OF ZAGREB, ZAGREB, CROATIA

Frank Proschan
DEPARTMENT OF STATISTICS, FLORIDA STATE UNIVERSITY, TALLAHASSEE, FLORIDA

Y. L. Tong
SCHOOL OF MATHEMATICS, GEORGIA INSTITUTE OF TECHNOLOGY, ATLANTA, GEORGIA

ACADEMIC PRESS, INC.

Harcourt Brace Jovanovich, Publishers

Boston San Diego New York London Sydney Tokyo Toronto

This book is printed on acid-free paper. Copyright © 1992 by Academic Press, Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

ACADEMIC PRESS, INC. 1250 Sixth Avenue, San Diego, CA 92101 United Kingdom Edition published by ACADEMIC PRESS LIMITED 24-28 Oval Road, London NW1 7DX

Library of Congress Cataloging-in-Publication Data
Pečarić, J. E. Convex functions, partial orderings, and statistical applications / Josip E. Pečarić, Frank Proschan, Y. L. Tong. p. cm. (Mathematics in science and engineering; v. 187) Includes bibliographical references and index. ISBN 0-12-549250-2 (acid-free paper) 1. Convex functions. 2. Inequalities (Mathematics) I. Proschan, Frank, 1921- . II. Tong, Y. L. (Yung Liang), 1935- . III. Title. IV. Series. QA331.5.P43 1992 515'.223-dc20 91-34153 CIP
Printed in the United States of America 92 93 94 95 9 8 7 6 5 4 3 2 1

To the memory of my parents
-J. E. P.

To Edna, Michael, and Virginia
-F. P.

To my teachers
-Y. L. T.


Contents

Preface
Notation and Numbering System

Chapter 1 Convex Functions
1.1 One-Variable Convex Functions
1.2 Convex Functions on a Normed Linear Space
1.3 Convex Functions of Higher Order
1.4 Functions Convex with Respect to an ECT System of Functions
1.5 Inequalities Involving Derivatives and Differences

Chapter 2 Jensen's and Jensen-Steffensen's Inequalities
2.1 Jensen's Inequality
2.2 Jensen-Steffensen's Inequality
2.3 Companion Inequalities of Jensen's and Jensen-Steffensen's Inequalities
2.4 Higher-Order Jensen-Type Inequalities

Chapter 3 Reversals, Refinements, and Converses of Jensen's and Jensen-Steffensen's Inequalities
3.1 Reversals of Jensen's and Jensen-Steffensen's Inequalities
3.2 Some Refinements of Jensen's and Jensen-Steffensen's Inequalities
3.3 Converses of Jensen's Inequality

Chapter 4 Applications of Jensen's Inequality to Means and Hölder's Inequalities
4.1 Inequalities for Means
4.2 Hölder's and Minkowski's Inequalities
4.3 Dresher's Inequality
4.4 Beckenbach's Inequality
4.5 Aczél's and Related Inequalities
4.6 Further Generalizations of Hölder's and Minkowski's Inequalities
4.7 Some Inequalities for Complex Functionals and Norms

Chapter 5 Hermite-Hadamard's and Jensen-Petrović's Inequalities
5.1 Hermite-Hadamard's Inequality
5.2 Jensen-Petrović's Inequalities
5.2.1 Inequalities for Starshaped Functions
5.2.2 Inequalities for Convex Functions
5.2.3 Combination Convexity Inequalities
5.2.4 Inequalities for Sums of Order p

Chapter 6 Popoviciu's, Burkill's, and Steffensen's Inequalities
6.1 Inequalities of Popoviciu and Burkill
6.2 Steffensen's Inequality

Chapter 7 Čebyšev-Grüss', Favard's, Berwald's, Gauss-Winckler's, and Related Inequalities
7.1 Čebyšev-Grüss Inequality
7.2 Favard's, Berwald's, Gauss-Winckler's, and Related Inequalities

Chapter 8 Hardy's, Hilbert's, Opial's, Young's, Nanson's, and Related Inequalities
8.1 Hardy's, Hilbert's, Opial's, and Related Inequalities
8.2 Young's Inequality
8.3 Nanson's Inequality

Chapter 9 General Linear Inequalities for Convex Sequences and Functions
9.1 Inequalities for m-Convex Sequences and Functions
9.2 Some Generalizations and Refinements

Chapter 10 Orderings and Convexity-Preserving Transformations
10.1 Orderings of Convexity: Generalizations and Related Results
10.2 Various Results

Chapter 11 Convex Functions and Geometric Inequalities
11.1 Old and New Results via Majorization Theory
11.1.1 A Partial Ordering of Triangles
11.1.2 Schur-Convex and Schur-Concave Functionals
11.2 Concavity via Hyperbolic Forms

Chapter 12 Convexity, Majorization, and Schur-Convexity
12.1 Majorization and Convex Functions
12.2 Schur-Convex Functions
12.3 Multivariate Majorization and Convex Functions

Chapter 13 Convexity and Log-Concavity Related Moment and Probability Inequalities
13.1 Jensen's Inequality
13.2 Moment Inequalities for Univariate Random Variables
13.3 Dimension-Related Inequalities for Exchangeable Random Variables
13.3.1 Exchangeable Random Variables and de Finetti's Theorem
13.3.2 Inequalities
13.4 Brunn-Minkowski Inequality
13.5 A Class of Log-Concave Probability Measures
13.6 Some Properties of Log-Concave Density Functions
13.7 Some Statistical Applications

Chapter 14 Muirhead's Theorem and Related Inequalities
14.1 Muirhead's Theorem and Generalizations
14.2 Moment Inequalities
14.3 Additional Inequalities for Exchangeable Random Variables
14.4 Inequalities for a Class of Positively Dependent Random Variables
14.5 Applications to Special Families of Random Variables and Distributions
14.5.1 Exchangeable Random Variables
14.5.2 Distributions with the Semigroup Property
14.5.3 The Multivariate Normal Distribution

Chapter 15 Arrangement Ordering
15.1 Definitions and Basic Properties
15.2 Preservation Properties of Arrangement Increasing Functions
15.3 Arrangement Increasing Property of Overlapping Sums

Chapter 16 Applications of Arrangement Ordering
16.1 Moment and Geometric Inequalities
16.2 Arrangement Increasing Probabilities for AI Families of Densities
16.3 Applications to Rank Order Problems
16.4 Monotonicity in the Selection of Populations

Chapter 17 Multivariate Arrangement Increasing Functions
17.1 Definition and Basic Properties of Multivariate Arrangement Increasing Functions
17.2 Preservation and Closure Properties of Multivariate Arrangement Increasing Functions
17.3 Applications to Measures of Agreement Among s Judges

References
Author Index
Subject Index

Preface

Since the publication of two papers in 1905 and 1906 by J. L. W. V. Jensen, the celebrated Danish engineer and mathematician, the theory of convex functions has developed rapidly. This can be attributed to several causes: first, a great many areas in modern analysis directly or indirectly involve the application of convex functions; second, convex functions are closely related to the theory of inequalities, and many important inequalities are consequences of the applications of convex functions. For example, the arithmetic-geometric mean (AG) inequality, the general inequality between means of orders $r$ and $s$, and Hölder's and Minkowski's inequalities are all consequences of Jensen's inequality for convex functions. As a result, the topic of convex functions has been treated extensively in the classical book by Hardy, Littlewood, and Pólya (Inequalities, 1934 and 1952, pp. 82-125, Cambridge University Press) and in other volumes, such as Beckenbach and Bellman (Inequalities, 1961 and 1965, Springer-Verlag) and Mitrinović (Analytic Inequalities, 1970, Springer-Verlag). An earlier book devoted solely to inequalities for convex functions was written by Pečarić in 1987 (Convex Functions: Inequalities, Beograd (in Serbo-Croatian)).

In the present volume we provide a more comprehensive treatment of this topic, together with some related partial orderings and selected applications in probability and statistics. We emphasize the special role played by inequalities in the theory of convex functions, and note that the definition of a convex function is itself an inequality. Furthermore, the definition of the class of convex functions can be extended or modified to yield other classes of functions. For example, the class of Schur-convex functions is closely related to the theory of majorization as treated comprehensively by Marshall and Olkin (Inequalities: Theory of Majorization and Its Applications, 1979, Academic Press).


This book is divided into 17 chapters. The first chapter provides some definitions and fundamental results for various classes of convex functions. General inequalities for convex functions are treated in Chapters 2, 3, 5, 6, 7, 9, 10, and 12; some classical applications of these inequalities are given in Chapters 4, 8, and 11. In particular, Chapters 2-4 contain results on various versions of Jensen's inequality and its reversals and refinements, while some of the following chapters contain classical results on moment-related inequalities. Chapters 13-17 concern corresponding results on inequalities for convex functions and via partial orderings in probability and statistics. In particular, the last three chapters deal with arrangement ordering. As an unexpected bonus, we find that majorization and other notions of partial orderings may be viewed as special cases of arrangement ordering. The results given in this volume include classical results as well as many new results that have appeared recently in the literature.

Proschan's research was partially supported by U. S. AFOSR grant Nos. 88-0040 and 91-0048; Tong's research was supported in part by NSF grants DMS-8801327 and DMS-9001721.

Needless to say, we are indebted to the earlier books and the extensive literature in this area. In particular, we wish to acknowledge the inspiration of the works of Hardy-Littlewood-Pólya, Beckenbach-Bellman, Mitrinović, Marshall-Olkin, and a few others. A reviewer read an earlier version of the manuscript and made several key suggestions; however, we are solely responsible for errors. We thank Professor William Ames for his encouragement to submit this work to Academic Press, and Ms. Annette Rohrs for her skillful typing. Finally, we thank Mr. Charles Glaser and his staff members at Academic Press for their wonderful cooperation and the neat appearance of this volume.

J. E. P.
F. P.
Y. L. T.

Notation and Numbering System

(a) Unless specified otherwise, all vectors and matrices are in boldface type and all vectors are row vectors.
(b) Unless specified otherwise, the following notation is used in this book:
(1) $\mathbb{R} = \{x : -\infty < x$ [...]

[...] $> 0$ for $y > x$ ($x, y \in I$). A function $f: I \to \mathbb{R}$ is said to be convex on $I$ with respect to $g$ ($g$-convex on $I$) if

$$g(x_2, x_3) f(x_1) + g(x_3, x_1) f(x_2) + g(x_1, x_2) f(x_3) \ge 0 \tag{1.20}$$

holds for all $x_1, x_2, x_3 \in I$, $x_1 < x_2 < x_3$.

1.22. Remarks.

(a) Of particular interest are the $g$-convex functions satisfying $g(x, y) + g(y, x) = 0$.
(b) For related properties of $g$-convex functions, see Vasić and Kečkić (1971a) and Lacković (1979a, 1979b). □

Let $K(b)$ be the class of all functions $f: \mathbb{R} \to \mathbb{R}$ which are continuous and nonnegative on the segment $I = [0, b]$ and such that $f(0) = 0$. Note that the mean function $F$ of the function $f \in K(b)$, defined by

$$F(x) = \frac{1}{x} \int_0^x f(t)\,dt \quad (0 < x \le b), \qquad F(0) = 0, \tag{1.21}$$

belongs to the class $K(b)$. Now let $K_1(b)$ denote the class of functions $f \in K(b)$ convex on $I$, and let $K_2(b)$ be the class of functions $f \in K(b)$ convex in mean on $I$, i.e., the class of functions for which $F \in K_1(b)$. Let $K_3(b)$ denote the class of functions $f$ which are starshaped with respect to the origin on the segment $I$, i.e., the class of functions $f$ with the property that for all $x \in I$ and all $t$ ($0 \le t \le 1$) the following inequality holds:

$$f(tx) \le t f(x). \tag{1.22}$$

1.23. Definition. $f$ is said to belong to the class $K_4(b)$ if it is superadditive on $I$, i.e., if

$$f(x + y) \ge f(x) + f(y) \tag{1.23}$$

holds for all $x$, $y$, and $x + y$ in $I$.

If the reverse inequality in (1.23) is valid, then $f$ is said to be subadditive. If $F$ belongs to the class $K_3(b)$, we say that $f$ is starshaped in mean, i.e., that it belongs to $K_5(b)$; and if $F$ belongs to $K_4(b)$, we say that $f$ is superadditive in mean, i.e., that it belongs to $K_6(b)$. Similarly, we define functions subadditive in mean. Bruckner and Ostrow (1962) proved that the following inclusions hold:

$$K_1(b) \subset K_2(b) \subset K_3(b) \subset K_4(b) \subset K_5(b) \subset K_6(b). \tag{1.24}$$

Beckenbach (1969) gave examples showing that each inclusion is proper.
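The defining inequalities (1.22) and (1.23) and the mean function (1.21) can be probed numerically. The following is a minimal sketch, not a proof: it only tests the inequalities on a finite grid, and the helper names (`mean_function`, `is_starshaped`, `is_superadditive`), the grid sizes, and the tolerance are illustrative assumptions that do not come from the text.

```python
import math

def mean_function(f, steps=400):
    """Return F from (1.21): F(x) = (1/x) * integral_0^x f(t) dt, F(0) = 0.
    The integral is approximated by the midpoint rule (an assumption of this sketch)."""
    def F(x):
        if x == 0:
            return 0.0
        h = x / steps
        return sum(f((i + 0.5) * h) for i in range(steps)) * h / x
    return F

def is_starshaped(f, b, grid=20, tol=1e-9):
    """Test f(t*x) <= t*f(x), inequality (1.22), for x in [0, b] and t in [0, 1] on a grid."""
    fractions = [i / grid for i in range(grid + 1)]
    return all(f(t * b * s) <= t * f(b * s) + tol
               for s in fractions for t in fractions)

def is_superadditive(f, b, grid=20, tol=1e-9):
    """Test f(x + y) >= f(x) + f(y), inequality (1.23), for grid points with x + y <= b."""
    return all(f(b * (i + j) / grid) >= f(b * i / grid) + f(b * j / grid) - tol
               for i in range(grid + 1) for j in range(grid + 1 - i))

square = lambda x: x * x      # convex, nonnegative, square(0) = 0, so it lies in K_1(1)
F = mean_function(square)     # approximately x**2 / 3

print(is_starshaped(square, 1.0), is_superadditive(square, 1.0))
print(is_starshaped(F, 1.0), is_superadditive(F, 1.0))
print(is_starshaped(math.sqrt, 1.0))   # the concave function sqrt fails (1.22)
```

A grid test of this kind can refute membership in a class (as for `math.sqrt`) but can only suggest it.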


1.24. Remarks. (a) (1.24) is the well-known "partial ordering of convexity." There exist many generalizations and related results. Some of them will be given in this volume.
(b) Interesting results concern functions which have the given property "in mean." In later chapters we shall give results for sequences and functions monotonic in mean (such as Čebyšev's inequality). □

1.2. Convex Functions on a Normed Linear Space

The definition of a convex function has a very natural generalization to real-valued functions defined on an arbitrary real linear space $L$. Here we merely require that the domain $U$ of $f$ be convex. This assures that for $x, y \in U$ and $\alpha \in [0, 1]$, $f$ can always be defined at $\alpha x + (1 - \alpha) y$. We then define $f$ to be convex on $U \subseteq L$ if

$$f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y). \tag{1.25}$$

In the following we give some results in Roberts and Varberg (1973) for the case in which $L$ is a normed linear space. Note that for $L = \mathbb{R}^n$ we have a convex function of several variables.

1.25. Remark. Let $U$ be an open convex set in $L$, $x_0 \in U$, $y \in L$, and define the function $g(t) = f(x_0 + ty)$ where $t \in (a, b)$ such that $x_0 + ty \in U$ for all $t$. The function $f: U \to \mathbb{R}$ is said to be convex if $g(t)$ is a convex function on $(a, b)$ (Roberts and Varberg, 1973, p. 91). □

1.26. Theorem. (a) Let $f$ be convex on an open set $U \subseteq L$. If $f$ is bounded from above in a neighborhood of a point in $U$, then $f$ is locally Lipschitz in $U$, hence Lipschitz on any compact subset of $U$, and $f$ is continuous on $U$.
(b) If $f$ is convex on the open set $U \subseteq \mathbb{R}^n$, then $f$ is Lipschitz on every compact subset of $U$ and continuous on $U$.

Proof. See Roberts and Varberg (1973, pp. 93-94). □

1.27. Theorem. (a) Assume that $f$ is defined on the open convex set $U \subseteq L$. If $f$ is convex on $U$ and Fréchet differentiable at $x_0$, then, for $x \in U$,

$$f(x) - f(x_0) \ge f'(x_0)(x - x_0) \tag{1.26}$$

holds. If $f$ is differentiable throughout $U$, then $f$ is convex iff (1.26) holds for all $x, x_0 \in U$. Furthermore, $f$ is strictly convex iff the inequality is strict.
(b) Let $f: U \to \mathbb{R}$ be continuous and Fréchet differentiable on the open set $U \subseteq L$. Then $f$ is (strictly) convex iff $f'$ is (strictly) increasing on $U$.
(c) Let $f$ be continuously differentiable and suppose that the second derivative exists throughout an open set $U \subseteq L$. Then $f$ is convex on $U$ iff $f''(x)$ is nonnegative definite for every $x \in U$. Furthermore, if $f''(x)$ is positive definite on $U$, then $f$ is strictly convex.

Proof.


0

1.28. Remarks. (a) We shall say that $f'$ is increasing on $U$ if for $x, y \in U$ we have

$$(f'(x) - f'(y))(x - y) \ge 0,$$

and that $f'$ is strictly increasing on $U$ if this inequality is strict for all $x \ne y$.
(b) A function $f: U \to M$ ($U \subseteq L$; $L$, $M$ are normed linear vector spaces) is Fréchet differentiable at $x_0$ ($x_0 \in U$) if there exists a linear transformation $T: L \to M$ such that

$$\lim_{x \to x_0} \frac{f(x) - f(x_0) - T(x - x_0)}{\|x - x_0\|} = 0,$$

which is equivalent to

$$f(x) = f(x_0) + T(x - x_0) + o(\|x - x_0\|)$$

as $x \to x_0$. The linear transformation $T$ is called the Fréchet derivative and is denoted by $f'(x_0)$.
(c) A similar derivative of the Fréchet derivative is called the second Fréchet derivative. This derivative is a symmetric bilinear transformation defined on $L \times L$, i.e., $f''_{h,k}(x) = f''_{k,h}(x)$ ($h, k \in L$). Note that if $f: U \to \mathbb{R}$ is continuously differentiable on the open convex set $U \subseteq L$ and $f''(x)$ exists throughout $U$, then for any $x, x_0 \in U$ there is an $s \in (0, 1)$ such that

$$f(x) = f(x_0) + f'(x_0)(h) + \tfrac{1}{2} f''(x_0 + sh)(h, h), \tag{1.27}$$

where $h = x - x_0$.


(d) A symmetric bilinear transformation $B(h, k)$ defined on $L \times L$ is positive (nonnegative) definite if for every $h \in L$ ($h \ne 0$), we have $B(h, h) > 0$ ($B(h, h) \ge 0$).
(e) The following definition is also valid: A (continuous real-valued) function $f$ is operator convex on $(\lambda, \nu)$ if $f(\alpha a + \beta b) \le \alpha f(a) + \beta f(b)$ for positive reals $\alpha$, $\beta$ such that $\alpha + \beta = 1$ and operators $a$, $b$ with their spectra in $(\lambda, \nu)$. (See Davis, 1957, for a brief survey of operator functions and Ando, 1978, for further comments on classes of operator functions.) □

Now, by letting $L = \mathbb{R}^n$ we can consider convex functions of several variables. In this case $f'(x_0)$ is the gradient vector of the function $f$ at the point $x_0$, i.e.,

$$f'(x_0) = \left( \frac{\partial f}{\partial x_1}(x_0), \dots, \frac{\partial f}{\partial x_n}(x_0) \right),$$

and the following theorem is valid:

1.29. Theorem. (a) Suppose $f$ is defined on the open convex set $U \subseteq \mathbb{R}^n$. If $f$ is convex on $U$ and the gradient vector $f'(x_0)$ exists, then for $x \in U$,

$$f(x) - f(x_0) \ge (f'(x_0), x - x_0), \tag{1.28}$$

where $(a, b)$ is the inner product of vectors $a$, $b$ (i.e., $(a, b) = \sum_{i=1}^n a_i b_i$). If $f$ is (strictly) convex and $f'(x)$ exists throughout $U$, then $f'$ is (strictly) increasing on $U$. Conversely, if the partial derivatives of $f$ exist and are continuous throughout $U$ and if $f'$ is (strictly) increasing, then $f$ is (strictly) convex.

(b) Let $f$ have continuous second partial derivatives $\partial^2 f / \partial x_j \partial x_i = f_{ij}$ throughout an open convex set $U \subseteq \mathbb{R}^n$. Then $f$ is convex on $U$ iff the Hessian matrix $H = \|f_{ij}(x)\|$ is nonnegative definite for each $x \in U$. Moreover, if $H$ is positive definite on $U$, then $f$ is strictly convex.

Proof.


D
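The sufficient condition in Theorem 1.29(b) (a positive definite Hessian implies strict convexity) can be illustrated for a quadratic function, whose Hessian is constant. The sketch below is restricted to the positive definite case and tests it through the leading principal minors; the function names `det`, `leading_minors`, and `is_positive_definite` and the sample matrices are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row with a nonzero entry in column c to use as pivot.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result          # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def leading_minors(matrix):
    """Determinants of the upper-left k x k blocks, k = 1, ..., n."""
    return [det([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]

def is_positive_definite(matrix):
    """A real symmetric matrix is positive definite iff all leading minors are positive."""
    return all(d > 0 for d in leading_minors(matrix))

hessian = [[2, 1], [1, 2]]       # Hessian of f(x, y) = x^2 + x*y + y^2
indefinite = [[1, 2], [2, 1]]    # Hessian of x^2/2 + 2*x*y + y^2/2

print(leading_minors(hessian), is_positive_definite(hessian))
print(leading_minors(indefinite), is_positive_definite(indefinite))
```

For the first matrix the minors are 2 and 3, so $x^2 + xy + y^2$ is strictly convex on $\mathbb{R}^2$; the second matrix has a negative second minor and is not positive definite.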

1.30. Remark. A real symmetric matrix $A = \|a_{ij}\|$ ($i, j = 1, 2, \dots, n$) is said to be positive (nonnegative) definite if the quadratic form $Q(x) = \sum_{i,j=1}^n a_{ij} x_i x_j$ is positive (nonnegative) for all $x = (x_1, \dots, x_n) \ne (0, \dots, 0)$. It is known that $A$ is a positive (nonnegative) definite matrix iff all determinants

$$\det \|a_{ij}\|, \qquad i, j = 1, 2, \dots, k; \quad k = 1, \dots, n,$$

are positive (nonnegative) (see, e.g., Beckenbach and Bellman, 1961, 1965, pp. 57-58). □

The following theorem can be found in Roberts and Varberg (1973, p. 108):

1.31. Theorem. (a) The function $f$ is convex on an open convex set $U$ of a normed linear space $L$ iff $f$ has support at each point of $U$, i.e., if there is an affine function $A: L \to \mathbb{R}$ such that $A(x_0) = f(x_0)$ and $A(x) \le f(x)$ for every $x \in U$.

(b) The function $f$ is convex on an open convex set $U \subseteq \mathbb{R}^n$ iff for every point $x_0 \in U$ there exists a point $l \in \mathbb{R}^n$ such that

$$f(x) - f(x_0) \ge (l, x - x_0). \tag{1.29}$$

1.32. Remark. Let $f'(x_0, v)$ denote the Gateaux derivative (the directional derivative) of $f$, i.e.,

$$f'(x_0, v) = \lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t}.$$

The function $f$ is Gateaux differentiable at $x_0$ if $f'(x_0, v)$ exists for every $v \in L$. Note that $A(v) = f'(x_0, v)$ is a linear function of $v$. Furthermore, if $f$ has a Fréchet derivative at $x_0$, then $f$ is Gateaux differentiable at $x_0$ with $f'(x_0, v) = f'(x_0)(v)$. Gateaux differentiability of $f$ at $x_0$ does not, however, guarantee the existence of $f'(x_0)$ without additional conditions on $f$. □

The following result is given in Roberts and Varberg (1973, p. 107):

1.33. Theorem. Let $f$ be convex on an open set $U \subseteq L$. Then for all $x, x_0 \in U$,

$$f(x) - f(x_0) \ge f'_+(x_0, x - x_0) \tag{1.30}$$

holds. If $f$ is strictly convex, then the inequality is strict.

Note that (1.30) implies

$$f(x_0) - f(x) \ge f'_+(x, x_0 - x) = -f'_-(x, x - x_0).$$

Thus by (1.29) and (1.30) we have

$$f'_-(x, x - x_0) \ge f'_+(x_0, x - x_0). \tag{1.31}$$


1.34. Remark. In the case of convex functions of several variables (multivariate convex functions) we have

where $f'_{1+}, \dots, f'_{n+}$ are the right-hand partial derivatives of $f$. In this case (1.30) and (1.31) become (1.32) and (1.33). □

In the following we observe another definition of convex functions and the definition of J-convex functions for functions of several variables.

1.35. Definition. Let $U \subseteq \mathbb{R}^n$. The function $f: U \to \mathbb{R}$ is said to be convex if its epigraph

$$\operatorname{epi} f = \{(x, \lambda) : (x, \lambda) \in \mathbb{R}^n \times \mathbb{R};\ x \in U,\ f(x) \le \lambda\} \tag{1.34}$$

is a convex set in $\mathbb{R}^n \times \mathbb{R}$.

It is known that Definition 1.35 is equivalent to (1.1) for all $x, y \in \mathbb{R}^n$ and all $\alpha \in [0, 1]$. This definition has some useful applications. If we replace the interval $[a, b]$ by an open convex set $U \subseteq L$, we can extend Definition 1.8 for J-convex functions immediately to functions $f: U \to \mathbb{R}$ (Roberts and Varberg, 1973, pp. 215-16):

1.36. Theorem. Let $f$ be J-convex on an open set $U$ in a normed linear space $L$. If $f$ is bounded above in a neighborhood of a single point $x_0 \in U$, then $f$ is continuous and hence convex on $U$.

Wright-convex functions have an interesting and important generalization for functions of several variables (Brunk, 1964). Let $\mathbb{R}^n$ denote the $n$-dimensional vector lattice of points $x = (x_1, \dots, x_n)$, $x_i$ real for $i = 1, \dots, n$, with the partial ordering

$$x \le y \quad \text{iff} \quad x_i \le y_i \ \text{for } i = 1, \dots, n.$$


1.37. Definition. A real-valued function $f$ on an $n$-dimensional rectangle $I \subset \mathbb{R}^n$ is said to have increasing increments if

$$f(a + h) - f(a) \le f(b + h) - f(b) \tag{1.35}$$

whenever $a \in I$, $b + h \in I$, $0 \le h \in \mathbb{R}^n$, $a \le b$.

1.38. Remark. Brunk (1964) gave the following results for functions with increasing increments $f: I \to \mathbb{R}$:
(i) $f$ is not necessarily continuous on $I$.
(ii) If $f(x)$ is a continuous function for $b \le x \le a + b$, where $0 \le a \in \mathbb{R}^n$, then $\varphi(t) = f(ta + b)$, $0 \le t \le 1$, is a convex function on $[0, 1]$.
(iii) If the partial derivatives $f'_i(x)$ exist for $x \in I$, then $f$ has increasing increments iff each of these partial derivatives is increasing in each argument; in other words, iff the gradient $f'(x) = (f'_1(x), \dots, f'_n(x))$ is increasing on $I$.
(iv) The second partials, if they exist, are then nonnegative.
(v) If $f$ is continuous and has increasing increments on $I$, it can be approximated uniformly on $I$ by polynomials having increasing increments and therefore nonnegative second partial derivatives. □

1.3. Convex Functions of Higher Order

Let $f$ be a real-valued function defined on $[a, b]$. A $k$th order divided difference of $f$ at distinct points $x_0, \dots, x_k$ in $[a, b]$ may be defined recursively by $[x_i]f = f(x_i)$ ($i = 0, \dots, k$) and

$$[x_0, \dots, x_k]f = \frac{[x_1, \dots, x_k]f - [x_0, \dots, x_{k-1}]f}{x_k - x_0}. \tag{1.36}$$

The value $[x_0, \dots, x_k]f$ is independent of the order of the points $x_0, \dots, x_k$. This definition may be extended to include the case in which some or all of the points coincide by assuming that $x_0 \le \dots \le x_k$ and letting

$$[\underbrace{x, \dots, x}_{j+1 \text{ times}}]f = \frac{f^{(j)}(x)}{j!},$$

provided that $f^{(j)}(x)$ exists. If $f \in \Pi_n$ (the class of polynomials of degree at most $n$), then $[x_0, \dots, x_n]f$ is equal to the coefficient of $x^n$ and hence is zero if $\deg f < n$. We can easily show that (1.36) is equivalent to

$$[x_0, \dots, x_k]f = \sum_{j=0}^k \frac{f(x_j)}{w'(x_j)}, \quad \text{where } w(x) = \prod_{j=0}^k (x - x_j). \tag{1.37}$$
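The equivalence of the recursive definition (1.36) and the explicit sum (1.37) can be checked on a small example. The following is a minimal sketch for distinct points using exact rational arithmetic; the function names and the sample points are illustrative assumptions.

```python
from fractions import Fraction

def divided_difference(f, xs):
    """Recursive definition (1.36) for a list xs of distinct points."""
    if len(xs) == 1:
        return f(xs[0])
    upper = divided_difference(f, xs[1:])
    lower = divided_difference(f, xs[:-1])
    return (upper - lower) / (xs[-1] - xs[0])

def divided_difference_sum(f, xs):
    """Explicit form (1.37): sum over j of f(x_j) / w'(x_j),
    where w'(x_j) is the product of (x_j - x_i) over i != j."""
    total = Fraction(0)
    for j, xj in enumerate(xs):
        w_prime = Fraction(1)
        for i, xi in enumerate(xs):
            if i != j:
                w_prime *= (xj - xi)
        total += f(xj) / w_prime
    return total

points = [Fraction(p) for p in (0, 1, 3, 4)]
cube = lambda x: x ** 3
square = lambda x: x ** 2

print(divided_difference(cube, points), divided_difference_sum(cube, points))
print(divided_difference(square, points))          # degree < 3, so the result is 0
print(divided_difference(cube, points[::-1]))      # independent of the order of the points
```

For the cubic the third-order divided difference equals the leading coefficient, in line with the remark on polynomials in $\Pi_n$.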


The following result on B-splines and divided differences can be found in Curry and Schoenberg (1966): For fixed $x \in [a, b]$, let $M(x, y) = n (y - x)_+^{n-1}$ be defined as $n (y - x)^{n-1}$ if $y \ge x$ and zero otherwise. Let $a \le x_0 \le \dots \le x_n \le b$ with $x_0 \ne x_n$, and

$$M_n(x) = [x_0, \dots, x_n]\, M(x, \cdot) \tag{1.38}$$

(the $n$th divided difference of $M(x, y)$ with respect to $y$ at $x_0, \dots, x_n$). $M_n(x)$ is commonly referred to as a B-spline and has the following properties:

$$M_n(x) > 0 \ \text{in } (x_0, x_n) \quad \text{and} \quad M_n(x) = 0 \ \text{outside } (x_0, x_n);$$

if $f$ has a continuous $n$th derivative in $(a, b)$, then

$$[x_0, \dots, x_n]f = \frac{1}{n!} \int_a^b M_n(x) f^{(n)}(x)\,dx; \tag{1.39}$$

moreover,

$$\int_a^b M_n(x)\,dx = 1 \quad \text{and} \quad \int_a^b x M_n(x)\,dx = \frac{1}{n+1} \sum_{i=0}^n x_i. \tag{1.40}$$
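The two identities in (1.40) can be checked numerically for a concrete knot sequence. The sketch below assumes distinct knots and $n \ge 2$, builds $M_n$ as the divided difference of the truncated power $n (y - x)_+^{n-1}$ in $y$, and integrates with the midpoint rule; the helper names, the knots, and the step count are illustrative assumptions.

```python
def divided_difference(values, xs):
    """Divided difference of tabulated values at distinct points xs (table form of (1.36))."""
    table = list(values)
    n = len(xs)
    for order in range(1, n):
        table = [(table[i + 1] - table[i]) / (xs[i + order] - xs[i])
                 for i in range(n - order)]
    return table[0]

def b_spline(x, knots):
    """M_n(x): n-th divided difference in y of M(x, y) = n * (y - x)_+^(n-1) at the knots.
    Assumes n = len(knots) - 1 >= 2 and distinct knots."""
    n = len(knots) - 1
    values = [n * max(y - x, 0.0) ** (n - 1) for y in knots]
    return divided_difference(values, knots)

def integrate(g, a, b, steps=4000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

knots = [0.0, 1.0, 2.0, 4.0]           # x_0 < x_1 < x_2 < x_3, so n = 3
total_mass = integrate(lambda x: b_spline(x, knots), 0.0, 4.0)
first_moment = integrate(lambda x: x * b_spline(x, knots), 0.0, 4.0)

print(total_mass)                      # close to 1
print(first_moment, sum(knots) / len(knots))   # both close to 1.75
print(b_spline(1.5, knots) > 0, b_spline(5.0, knots))
```

The first moment agrees with the arithmetic mean of the knots, $\frac{1}{n+1}\sum x_i$, up to the quadrature error.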

Another integral representation of divided differences is useful (Isaacson and Keller, 1966):

$$[x, x + h_1, \dots, x + h_n]f = \int_0^1 \int_0^{t_1} \cdots \int_0^{t_{n-1}} f^{(n)}\bigl(t_n (h_n - h_{n-1}) + \dots + t_2 (h_2 - h_1) + t_1 h_1 + x\bigr)\,dt_n \cdots dt_1; \tag{1.41}$$

and as a final remark, we note that

$$[x_0, \dots, x_n]f = \frac{\det A}{\det B}, \tag{1.42}$$

where

$$B = \|x_j^{n-i}\|, \quad 0 \le i, j \le n, \qquad A = \|a_{ij}\|, \quad 0 \le i, j \le n, \qquad a_{ij} = \begin{cases} x_j^{n-i} & \text{for } i \ne 0, \\ f(x_j) & \text{for } i = 0. \end{cases}$$

1.39. Definition. A function $f: [a, b] \to \mathbb{R}$ is said to be $n$-convex, $n \ge 0$, on $[a, b]$ iff for all choices of $(n + 1)$ distinct points in $[a, b]$,

$$[x_0, \dots, x_n]f \ge 0. \tag{1.43}$$

If this inequality is reversed, then $f$ is said to be $n$-concave on $[a, b]$. If the inequality is strict, then $f$ is said to be a strictly $n$-convex ($n$-concave) function.

1.40. Remarks. (a) If $n = 2$, then Definition 1.39 is equivalent to inequality (1.5). Thus 2-convex functions are just convex functions. Furthermore, 1-convex functions are increasing functions and 0-convex functions are nonnegative functions.
(b) It is known that for $n \ge 1$, $f$ is both $n$-convex and $n$-concave iff $f$ is a polynomial of degree at most $n - 1$. □

1.41. Theorem. If $f$ is an $n$-convex function on $[a, b]$ for $n \ge 2$, then
(i) the function $f^{(k)}$ exists and is $(n - k)$-convex for $1 \le k \le n - 2$;
(ii) [...]

(c) $M(x \mid x^0, \dots, x^n) > 0$ for $x \in \operatorname{interior}\,[x^0, \dots, x^n]$ and $M(x \mid x^0, \dots, x^n) = 0$ for $x \notin [x^0, \dots, x^n]$. Thus $\operatorname{supp} M(\cdot \mid x^0, \dots, x^n) = [x^0, \dots, x^n]$.
(d) $\int_{\mathbb{R}^k} M(x \mid x^0, \dots, x^n)\,dx = 1$ (see, e.g., Dahmen and Micchelli, 1983, and Micchelli, 1979, 1980).

(e) $\int_{\mathbb{R}^k} (x_i)^r M(x \mid x^0, \dots, x^n)\,dx = m_r$, where $r \in \mathbb{Z}_+$ and

$$m_r = \binom{n + r}{r}^{-1} \sum_{i_0 + \dots + i_n = r} t_0^{i_0} \cdots t_n^{i_n} \tag{1.58}$$

($(i_0, \dots, i_n) \subset (0, 1, \dots, r)$) is the $r$th generalized symmetric mean of $t_0, \dots, t_n$, $t_i \in \mathbb{R}$ for all $i$, and $x_i$ denotes the $i$th component of $x \in \mathbb{R}^k$.

Note that the sum in (1.58) involves $\binom{n + r}{r}$ terms. (These means were studied extensively in Neuman, 1986.) This result is given in Neuman and Pečarić (1989).

Similar to (1.51), we can give the $(n, m)$th finite difference

$$\Delta_{h,k}^{n,m} f(x, y) = \Delta_h^n\bigl(\Delta_k^m f(x, y)\bigr) = \Delta_k^m\bigl(\Delta_h^n f(x, y)\bigr) = \sum_{i=0}^n \sum_{j=0}^m (-1)^{n + m - i - j} \binom{n}{i} \binom{m}{j} f(x + ih, y + jk). \tag{1.59}$$
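The two forms of the $(n, m)$th finite difference in (1.59), the explicit double sum and the iterated one-variable differences, can be compared on a polynomial. This is a minimal sketch in exact arithmetic; the function names and the test function $x^2 y^3$ are illustrative assumptions, and the one-variable difference is taken to be the usual forward difference.

```python
from fractions import Fraction
from math import comb

def double_difference(f, x, y, h, k, n, m):
    """Explicit double sum on the right-hand side of (1.59)."""
    return sum((-1) ** (n + m - i - j) * comb(n, i) * comb(m, j)
               * f(x + i * h, y + j * k)
               for i in range(n + 1) for j in range(m + 1))

def difference(g, t, step, order):
    """Iterated forward difference of a one-variable function g at t."""
    if order == 0:
        return g(t)
    return (difference(g, t + step, step, order - 1)
            - difference(g, t, step, order - 1))

def iterated_double_difference(f, x, y, h, k, n, m):
    """Apply the m-th difference in y first, then the n-th difference in x."""
    inner = lambda s: difference(lambda t: f(s, t), y, k, m)
    return difference(inner, x, h, n)

f = lambda x, y: x ** 2 * y ** 3
x0, y0 = Fraction(1, 3), Fraction(-2)
h, k = Fraction(1, 2), Fraction(2)

explicit = double_difference(f, x0, y0, h, k, 2, 3)
iterated = iterated_double_difference(f, x0, y0, h, k, 2, 3)
print(explicit, iterated)   # both equal 2! * 3! * h^2 * k^3 = 24
```

For $f(x, y) = x^2 y^3$ with $n = 2$, $m = 3$ the difference does not depend on $(x, y)$, since the second difference of $x^2$ is $2h^2$ and the third difference of $y^3$ is $6k^3$.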

1.53. Remark. The following identities are valid (Popoviciu, 1944, pp. 48, 65):

$$\Delta_h^n f(x) = n!\, h^n\, [x, x + h, x + 2h, \dots, x + nh]f, \tag{1.60}$$

$$\Delta_{h,k}^{n,m} f(x, y) = n!\, m!\, h^n k^m \begin{bmatrix} x, x + h, \dots, x + nh \\ y, y + k, \dots, y + mk \end{bmatrix} f. \tag{1.61}$$
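Identity (1.60) can be verified directly for a polynomial with exact rational arithmetic. The sketch assumes that $\Delta_h^n$ is the usual forward difference $\sum_{i=0}^n (-1)^{n-i} \binom{n}{i} f(x + ih)$ and that the divided difference is the recursive one for distinct points; function names and sample values are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def forward_difference(f, x, h, n):
    """n-th forward difference of f at x with step h (assumed form of Delta_h^n)."""
    return sum((-1) ** (n - i) * comb(n, i) * f(x + i * h) for i in range(n + 1))

def divided_difference(f, xs):
    """Recursive divided difference at distinct points xs."""
    if len(xs) == 1:
        return f(xs[0])
    return ((divided_difference(f, xs[1:]) - divided_difference(f, xs[:-1]))
            / (xs[-1] - xs[0]))

f = lambda t: t ** 4 - 2 * t
x, h, n = Fraction(1, 2), Fraction(1, 3), 3
nodes = [x + i * h for i in range(n + 1)]        # x, x + h, ..., x + n*h

lhs = forward_difference(f, x, h, n)
rhs = factorial(n) * h ** n * divided_difference(f, nodes)
print(lhs, rhs)
```

Both sides agree exactly, because the forward difference on equally spaced nodes is a rescaled divided difference.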

□

1.54. Definition. A function $f: I \times I \to \mathbb{R}$ is said to be Jensen $(n, m)$-convex or J-convex of order $(n, m)$ if

$$\Delta_{h,k}^{n,m} f(x, y) \ge 0 \tag{1.62}$$

holds for every $h > 0$, $k > 0$, $x \in I$, $y \in I$.

Now, let the sequence $\{a_{nm}\}_{n,m \in \mathbb{N}}$ be given. Then the $(n, m)$th finite difference is given by

$$\Delta_1^0 a_{ij} = \Delta_2^0 a_{ij} = a_{ij}, \quad [\dots] \tag{1.63}$$

1.55. Definition. The sequence $\{a_{ij}\}_{i,j \in \mathbb{N}}$ is said to be convex of order $(n, m)$ if $\Delta^{n,m} a_{ij} \ge 0$ ($n, m \ge 0$; $i, j = 1, 2, \dots$).


Similarly, we can consider functions defined on $m + 1$ points $x_0 < x_1$ [...]. Letting $I = \{x_0, x_1, \dots, x_m\}$, we have [...]

[...] $> 0\}$. Then $\mathbb{R}^N = \mathbb{R}^N_+ \cup (-\mathbb{R}^N_+) \cup \{0\}$. For $x, y \in \mathbb{R}^N$, we write $x > y$ if $x - y \in \mathbb{R}^N_+$, and $x < y$ if $y > x$. Therefore for all points $x$ and $y$ in $\mathbb{R}^N$ we have $x = y$, or $x < y$, or $x > y$. If $x > y$ and $\lambda \in \mathbb{R}$ is positive, then $\lambda x > \lambda y$. We shall write

$$\operatorname{sgn} x = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}$$

Since the points $x_1, \dots, x_n$ are colinear, they can be represented in the form

$$x_i = x_1 + \lambda_i h, \quad i = 1, \dots, n. \tag{1.67}$$

The divided difference $[x_1, \dots, x_n]f$ of $f(x)$ at the points $x_1, \dots, x_n$ can be defined recursively by (Nörlund, 1924; Popoviciu, 1934a; and Ger, 1972):

$$[x_1]f = f(x_1), \qquad [x_1, \dots, x_n]f = \frac{[x_2, \dots, x_n]f - [x_1, \dots, x_{n-1}]f}{\lambda_n - \lambda_1}, \quad n \ge 2. \tag{1.68}$$

If in (1.67) we use any other colinear point instead of Xl, then the

divided difference in (1.68) will remain unchanged.

1.58. Definition. Let $D \subset \mathbb{R}^N$ be an open convex set. The function $f: D \to \mathbb{R}$ is said to be $n$-convex if for every system of colinear points $x_1, \dots, x_{n+1} \in D$ we have

$$[x_1, \dots, x_{n+1}]f \ge 0. \tag{1.69}$$

The finite differences defined in (1.46) and (1.47) are for points in $D$. Note that if $x_1 \in \mathbb{R}^N$, $d \in \mathbb{R}^N_+$, and the points $x_2, \dots, x_n$ are given by

$$x_i = x_1 + (i - 1) d, \quad i = 2, \dots, n, \tag{1.70}$$

then (Kuczma, 1985, p. 375):

$$[x_1, \dots, x_n]f = \frac{\Delta_d^{n-1} f(x_1)}{(n - 1)!\, |d|^{n-1}}. \tag{1.71}$$

1.59. Definition. Let $D \subset \mathbb{R}^N$ be an open convex set. The function $f: D \to \mathbb{R}$ is said to be Jensen $n$-convex if

$$\Delta_h^n f(x) \ge 0 \tag{1.72}$$

holds for all $x \in D$ and all $h \in \mathbb{R}^N_+$ such that $x + nh \in D$.

1.4. Functions Convex with Respect to an ECT System of Functions

Another generalization of $n$-convex functions is the class of functions convex with respect to an ECT (extended complete Tchebycheff) system of functions. It is known (see Karlin and Studden, 1966, pp. 375-466) that an ECT system of functions has the following property: Let $u_0, u_1, \dots, u_n$ be of class $C^n[a, b]$. Then $\{u_i\}_0^n$ is an ECT system on $[a, b]$ iff for $k = 0, 1, \dots, n$ we have $W(u_0, \dots, u_k) > 0$ on $[a, b]$, where $W(u_0, \dots, u_k)$ denotes the Wronskian of the functions $u_0, u_1, \dots, u_k$, i.e.,

$$W(u_0, \dots, u_k)(t) = \begin{vmatrix} u_0(t) & u_0'(t) & \cdots & u_0^{(k)}(t) \\ \vdots & \vdots & & \vdots \\ u_k(t) & u_k'(t) & \cdots & u_k^{(k)}(t) \end{vmatrix}. \tag{1.73}$$

[...] (1.74)

1.60. Definition. A function $f$ defined on the open interval $(a, b)$ is said to be convex with respect to $\{u_i\}_0^n$ if

[...] (1.75)

holds for all choices of $\{t_i\}_0^{n+1}$ satisfying $a < t_0 < t_1 < \dots < t_{n+1} < b$. In symbols we write

$$f \in C(u_0, \dots, u_n). \tag{1.76}$$

1.61. Examples. (a) If $u_0(t) = 1$, then $C(u_0)$ is the set of all increasing functions on $(a, b)$.
(b) Let $u_i(t) = t^i$, $i = 0, 1, \dots, n$. Then $C(1, t, t^2, \dots, t^n)$ is the set of all $n$-convex functions on $(a, b)$. In this case we have (1.42), i.e.,

$$[x_0, \dots, x_n]f = U\begin{pmatrix} 1, t, \dots, t^{n-1}, f \\ x_0, x_1, \dots, x_{n-1}, x_n \end{pmatrix} \Big/ U\begin{pmatrix} 1, t, \dots, t^n \\ x_0, x_1, \dots, x_n \end{pmatrix}$$

and

$$U\begin{pmatrix} 1, t, \dots, t^n \\ x_0, x_1, \dots, x_n \end{pmatrix} = \prod_{i > j} (x_i - x_j) > 0 \quad (x_0 < \dots < x_n)$$

is the well-known Vandermonde determinant.
(c) If [...] $> 0$ for all $t \in (a, b)$, then $C(1, u_1(t))$ comprises the set of all functions which are convex with respect to $u_1(t)$ on $(a, b)$ (Karlin and Studden, 1966, p. 376). □

Now, let functions $w_k \in C^{n-k}[a, b]$ ($k = 0, 1, \dots, n$) be strictly positive. We can form the system

$$\begin{aligned} u_0(t) &= w_0(t), \\ u_1(t) &= w_0(t) \int_a^t w_1(s_1)\,ds_1, \\ u_2(t) &= w_0(t) \int_a^t w_1(s_1) \int_a^{s_1} w_2(s_2)\,ds_2\,ds_1, \\ &\ \,\vdots \\ u_n(t) &= w_0(t) \int_a^t w_1(s_1) \int_a^{s_1} w_2(s_2) \cdots \int_a^{s_{n-1}} w_n(s_n)\,ds_n \cdots ds_1. \end{aligned} \tag{1.79}$$
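The iterated-integral construction (1.79) can be carried out exactly when the weights $w_k$ are polynomials. The sketch below represents a polynomial by its list of coefficients (lowest degree first) and builds $u_0, \dots, u_n$ from the innermost integral outward; the helper names and the sample weights are illustrative assumptions, and polynomial weights are only a convenient special case.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, t):
    """Value of the polynomial p at t."""
    return sum(c * t ** i for i, c in enumerate(p))

def integrate_from(p, a):
    """Coefficients of the polynomial s -> integral from a to s of p(r) dr."""
    anti = [Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(p)]
    anti[0] = -poly_eval(anti, a)      # shift so that the integral vanishes at s = a
    return anti

def ect_system(weights, a):
    """u_0, ..., u_n as in (1.79) for polynomial weights w_0, ..., w_n."""
    system = []
    for k in range(len(weights)):
        inner = [Fraction(1)]
        # Innermost integral first: w_k, then w_{k-1}, ..., down to w_1.
        for w in reversed(weights[1:k + 1]):
            inner = integrate_from(poly_mul(w, inner), a)
        system.append(poly_mul(weights[0], inner))
    return system

# Constant weights w_k = 1 and a = 0 give the monomials t^k / k!.
monomials = ect_system([[1], [1], [1], [1]], Fraction(0))
print(monomials[3])

# Weights w_0 = 1, w_1 = t, w_2 = 1 on [1, 2] (all strictly positive there).
system = ect_system([[1], [0, 1], [1]], Fraction(1))
print(system[1])     # coefficients of (t^2 - 1) / 2
```

With $w_k \equiv 1$ and $a = 0$ the system reduces to $u_k(t) = t^k / k!$, which is the polynomial case of Example 1.61(b) up to normalization.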

It is known (Karlin and Studden, 1966, pp. 379-80) that if $\{u_i\}_0^n$ are in the class $C^n[a, b]$ satisfying the initial conditions

$$u_k^{(p)}(a) = 0, \quad p = 0, 1, \dots, k - 1; \quad k = 1, 2, \dots, n, \tag{1.80}$$

then the following conditions are equivalent:
(i) the system $\{u_i\}_0^n$ has a representation of the form (1.79),
(ii) $\{u_i\}_0^n$ is an ECT system on $[a, b]$.

These special classes of ECT systems are of particular interest. Note that we can find the functions $w_i$ by using $u_i$, i.e., the following relations are valid (Karlin and Studden, 1966, p. 380):

$$w_0 = u_0, \qquad w_1 = \frac{W(u_0, u_1)}{u_0^2}, \qquad w_k = \frac{W(u_0, \dots, u_k)\, W(u_0, \dots, u_{k-2})}{[W(u_0, \dots, u_{k-1})]^2}, \quad k = 2, \dots, n. \tag{1.81}$$

Thus in the following we shall consider an ECT system $\{u_i\}_0^n$ defined on $[a, b]$ of the explicit form (1.79). Let $D_j$, $j = 0, 1, \dots, n$, denote the first order differential operator

$$(D_j f)(t) = \frac{d}{dt} \left( \frac{f(t)}{w_j(t)} \right); \tag{1.82}$$

then a direct calculation from (1.79) yields

$$[\dots], \quad j = 0, 1, \dots, n - 1; \qquad [\dots], \quad j = 0, 1, \dots, n. \tag{1.83}$$

The following theorem is proved in Karlin and Studden (1966, pp. 462-66):

1.62. Theorem. Let $f \in C(u_0, u_1, \dots, u_n)$ where $\{u_i\}_0^n$ is an ECT system on $(a, b)$.
(a) If $f, D^0 f, D^1 f, \dots, D^r f$ exist and are continuous where $r < n - 2$ ($r = -1, 0, 1, 2, \dots$) (the condition for $r = -1$ corresponds to $f$ being continuous), then $D^{r+1} f(x)$ exists and is continuous.

(b) If $f, D^0 f, \dots, D^{n-2} f$ are continuous on $(a, b)$, then $D^{n-2} f(x) / w_{n-1}(x)$ has a right derivative $D_R^{n-1} f$ which is right continuous and $D_R^{n-1} f(x) / w_n(x)$ is increasing on $(a, b)$.

The following two theorems are also given in Karlin and Studden (1966, pp. 382-83, 395, and 387-88).

1.63. Theorem. f(t)

10

Karlin and Studden

The function

= fn(t, x)

J

J

sf

x

x

x

= {wo(t) Wl(Sl)

wz(sz) ...

Wn(Sn) ds; ... dS1 for

°

a s: t:5x

(1.84)

is contained in quo, Ul' ... , un)' For n = 0, fo(t, x) = wo(t) for x:5 t s: b and equals zero otherwise. (Note that fn(t, a) = un(t).)

More generally, we have

$$\varphi_n(t, x) \in \bigcap_{i=0}^{n} C(u_0, \ldots, u_i),$$

and we also have $u_j \in \bigcap_{i=0}^{n} C(u_0, \ldots, u_i)$ for each $j = 0, 1, \ldots, n$.

1.4. Functions Convex with Respect to an ECT System of Functions

1.64. Theorem. For $n \ge 1$ and $a < c < b$, every $f \in C(u_0, u_1, \ldots, u_n)$ admits a representation of the form

$$f(t) = \int_c^t \varphi_n(t, x)\, d\mu(x) + \sum_{i=0}^{n} \frac{D^{i-1} f(c)}{w_i(c)}\, \varphi_i(t, c), \qquad t \in [c, b), \tag{1.85}$$

where $\varphi_i$ is defined in (1.84) and $d\mu(x) = d\big(D_R^{n-1} f(x)/w_n(x)\big)$ is a nonnegative measure. If $\mu(a+) > -\infty$ then the value $c$ can be replaced by $a+$. The resulting representation is then valid on the full interval $(a, b)$.

For t E (a, c), c < b, we have the following representation: f(t)

= (-1t+ 1

c

J

n Di-1f(c) I n ( t , x ) d p ( x ) + ~ ( - 1 t wi(c) - i ];(t,c) (1.86)

t

where t:s,x

for

t>x, (1.87)

is valid for each $k = 0, 1, \ldots, n$. In the case in which $\mu(b-) < \infty$, the value $c$ can be replaced by $b-$ and the representation holds throughout $(a, b)$.

Pecaric, Tudor, Crstici, and Savic (1987) noted that a result similar to Theorem 1.64 can be given by a generalization of Taylor's formula. In their result $w_i$ $(i = 0, 1, \ldots, n)$ are in the class $C^{n-i}[a, b]$, either positive or negative on $[a, b]$, and the first-order differential operator is defined as above. Then the following theorem is valid:

1.65. Theorem. Let $f: [a, b] \to \mathbb{R}$ be a real function such that $D^n f(x) = (D_n D_{n-1} \cdots D_0 f)(x)$ is continuous on $[a, b]$. Then

$$f(t) = \sum_{i=0}^{n} a_i W_i(t, c) + R_n(t) \quad \text{for all } t \in [a, b], \tag{1.88}$$

where $c \in [a, b]$, $a_i = D^{i-1} f(c)/w_i(c)$ $(i = 0, 1, \ldots, n;\ D^{-1} f \equiv f)$,

$$W_k(t, x) = w_0(t) \int_x^t w_1(s_1) \int_x^{s_1} w_2(s_2) \cdots \int_x^{s_{k-1}} w_k(s_k)\, ds_k \cdots ds_1 \tag{1.89}$$

for $k = 1, \ldots, n$ and $W_0(t, x) = w_0(t)$; and

$$R_n(t) = \int_c^t W_n(t, x)\, D^n f(x)\, dx = D^n f(u) \int_c^t W_n(t, x)\, dx, \tag{1.91}$$

where $u \in [\min(c, t), \max(c, t)]$. Furthermore, we also have

. R-l a·=hm--=---.:.x-->cWi(x, c)'

(1. 92)

I

(1. 93)

c

c

for k = 0, 1, ... , i; and a. =_l-

I

Wi( C)

Ri-1(y) lim Y-->C fY

.

(1.94)

,

Wi-l(Y,s) ds c

(1. 95)

(k

= 0, 1, ... , i-I).

Proof.

Applying integration by parts to

we obtain

f f + t

R

n

= -Wn(t, c)_D_n---"lf---,-(c--,-) wn(c)

c

Dn-y(x) (dWn(t, x)) dx wn(x) dx

t

=-

Dn-y(c) Wn(t, c) wn(c)

c

since

Wn-1(t, x)D

n-l

f(x) dx,


Continuing this process we will finally obtain (1.88) with coefficients a, given by (1.89). On the other hand, (1.92) can be expressed in the form a,

= lim - x - - - - - : . . . . ~ - - ' - - - - - - - " - ~ - - x->c J Wl(SI) ... Wi(Si) ds, ... dSI

11 C

C

thus using L'Hospital's rule yields . - - - - - - - . : : : DoRi-1(X) = 11m - - - - ' - - - - - : . . : ~ - - - - I x->c w1(x) wz(sz) . . . Wi(Si) ds, ... ds z

a,

J

T

C

C

Continuing this process, we obtain (1.93) and finally a, = lim Di-1R;_I(X)/Wi(x)= Di-1f(c)/wi(c).

x->c

Note, also, that (1.94) can ,be written in the form a.=_l-lim I w{c) Y->C JY I

C

Ri-1(y)/wo(Y)

Y

JWl(Sl)

JSl

Si~2

wz(sz)...

J Wi-l(Si-l) dSi- 1 ... dSI ds

S

(1. 96)

then, using L'Hospital's rule, we have a. =_1- lim

DoRi-1(y)

I w{c) Y->C Y Y Si-2 I J w1(Y)J wz<sz) ... J Wi-l(Si-l) ds.s., ... ds z ds C

(1.97)

S

Continuing this process, we obtain (1.94) and, finally, for k = i -1,

=_l-lim Di-ZRi_1(Y)/Wi_l(Y) Wi(C) Y->C Y- C


We note in passing that (a) for $w_0(t) \equiv 1$ and $w_k(t) = k$ $(k = 1, \ldots, n)$ we obtain Taylor's formula (and a result from Savic, 1977); (b) Theorem 1.65 is a generalization of some results in Karlin and Studden (1966, pp. 387-89, 454-56); (c) if the points $t_j$ in Definition 1.60 are given by $t_j = t_0 + jh$ for $j = 0, \ldots, n$, where $h > 0$, $a < t_0$, and $t_0 + nh < b$, then we obtain generalized mid-point convex functions with respect to $\{u_i\}_0^n$. For a treatment of such functions, see Lapidot (1981).

1.5. Inequalities Involving Derivatives and Differences

First we give some results from Farwig and Zwick (1986). We shall use the notation

$$W_\infty^n[a, b] = \{f : f^{(n-1)} \text{ is absolutely continuous and } f^{(n)} \in L_\infty[a, b]\},$$

where the norm in $L_\infty$ is given by

$$\|f\|_\infty = \operatorname{ess\,sup}\{|f(x)| : x \in [a, b]\},$$

and $\Pi_n$ denotes the set of polynomials of degree at most $n$.

1.66. Lemma. Assume that $h \in W_\infty^n[a, b]$.

(a) If (i) $h$ is $n$-convex on $[a, b]$ and (ii) there exist points $a \le \xi_0 \le \cdots \le \xi_{n-1} \le b$ such that $[\xi_k, \ldots, \xi_{n-1}]h \ge 0$ for $k = 0, \ldots, n - 1$, then $h^{(k)}(b) \ge 0$ for $k = 0, \ldots, n - 1$.

(b) If for some $0 \le m \le n - 1$ we have $h^{(m)}(b) = 0$ and $\xi_{n-m-1} < b$, then $h \in \Pi_{m-1}$ on $[\xi_0, b]$.

Proof. Since $h$ is $n$-convex, $[x_0, \ldots, x_{n-1}]h$ is an increasing function of each of the variables $x_i \in [a, b]$ $(i = 0, \ldots, n - 1)$. Thus

$$0 \le [\xi_0, \ldots, \xi_{n-1}]h \le [x_0, \ldots, x_{n-1}]h \le [\underbrace{b, \ldots, b}_{n \text{ times}}]h = \frac{h^{(n-1)}(b)}{(n-1)!} \tag{1.98}$$

whenever $\xi_i \le x_i \le b$ holds for each $0 \le i \le n - 1$. Thus $h^{(n-1)}(b) \ge 0$. Since $[x_0, \ldots, x_{n-1}]h \ge 0$ for such points, $[x_1, \ldots, x_{n-1}]h$ is an increasing function of $x_1, \ldots, x_{n-1}$ provided that $\xi_i \le x_i \le b$ $(i = 1, \ldots, n - 1)$. In particular, for such points, $[x_1, \ldots, x_{n-1}]h \ge 0$. By repeating the same argument $n - 1$ times, the inequalities $h^{(k)}(b) \ge 0$ $(k = 0, \ldots, n - 1)$ are proved.

Now assume that $h^{(m)}(b) = 0$ for some $0 \le m \le n - 1$. Then

for all $x_{n-m-1} \ge \xi_{n-m-1}$. Thus, if $\xi_{n-m-1} < b$, then $h$ is a polynomial of degree $\le m - 1$ on $[\xi_{n-m-1}, b] \subset [\xi_0, b]$. In particular, $h^{(n-1)}(b) = 0$; hence the same reasoning for $n - 1$ shows that $h \in \Pi_{m-1}$ on $[\xi_0, b]$. □

As a first application of Lemma 1.66, Farwig and Zwick (1986) proved the following theorem which generalizes Theorem 1 in Schoenberg (1982), a slight variation of which appears later as Corollary 1.69.

1.67. Theorem. Assume that $f, g \in W_\infty^n[a, b]$.

(a) If

$$g(b) \le f(b), \qquad g^{(k)}(a) \ge f^{(k)}(a) \quad (k = 0, \ldots, n - 2), \tag{1.99}$$

and

$$[x_0, \ldots, x_n]g \le [x_0, \ldots, x_n]f \quad \text{for all } a \le x_0 \le \cdots \le x_n \le b, \text{ where } x_0 \ne x_n, \tag{1.100}$$

then $g^{(k)}(b) \le f^{(k)}(b)$ $(k = 1, \ldots, n - 1)$.

(b) If $g^{(m)}(b) = f^{(m)}(b)$ for some $1 \le m \le n - 1$, then $g \equiv f$.

Proof. Let $h = f - g$, $\xi_{n-1} = b$ and $\xi_0 = \cdots = \xi_{n-2} = a$. Then from (1.99) we have

$$(b - a)^k\, [\underbrace{a, \ldots, a}_{k \text{ times}}, b]h = h(b) - \sum_{j=0}^{k-1} \frac{h^{(j)}(a)}{j!} (b - a)^j \ge 0$$

for $k = 0, \ldots, n - 1$; hence Lemma 1.66 yields (a). Now assume that $h^{(m)}(b) = 0$ for some $1 \le m \le n - 1$. Then by Lemma 1.66, $h \in \Pi_{m-1}$ on $[a, b]$ and thus

$$0 = (b - a)^m\, [\underbrace{a, \ldots, a}_{m \text{ times}}, b]h = h(b) - \sum_{j=0}^{m-1} \frac{h^{(j)}(a)}{j!} (b - a)^j.$$

Since by (1.99) $h(b) \ge 0$ and $h^{(j)}(a) \le 0$ for each $j$ in the sum, we obtain $h^{(j)}(a) = 0$ $(j = 0, \ldots, m - 1)$, and hence $h \equiv 0$. □

We note that if $f, g \in C^n[a, b]$, then (1.100) is equivalent to $g^{(n)}(x) \le f^{(n)}(x)$ for all $x \in [a, b]$.


1.68. Corollary.

If, in addition to the conditions of Theorem 1.67, g dt - eX> -

U (x)

o

= u(O) = 0 and

Then u' (0)

u"(x)

= 2 ( eX> -

n-2 X2k)

2: -,

k=O k.

is a positive increasing function for x > O. Thus from Theorem 1.71 with n = 2 and c = 0 we obtain (1.103). Now let n (k - 2)X2k IX z Xz v(x) = 2e k (k ) - 3x e' dt. k ~ O ! 2 -1 0

2:

Then v(O) = v'(O)

= v"(O) = 0 and

'i x k=O k. 3

v"'(x)

= 4x3(eX> _

2

: )

is a positive increasing function for x > O. Thus from Theorem 1.72 with n = 3 and c = 0 we obtain (1.104). 0 Note that as a special case (1.103) and (1.104) yield X

0

1

2

2

X2

e' dt - - (eX - 1 + x ) < - e' 2x 2

for x> 0,

1 2 2 IX 2 2 s 2 0 O. 3x 9 X

o

The following theorem is due to Zmorovic (1956):

1.74. Theorem. If the second derivative $f''$ of a real-valued function $f$ is continuous on $[a, b]$, then

$$\int_a^b (f''(x))^2\, dx \ge \frac{12}{(b - a)^3} \left( f(a) - 2 f\left(\frac{a + b}{2}\right) + f(b) \right)^2. \tag{1.105}$$
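A quick exact check of Theorem 1.74 for polynomial $f$ is sketched below in Python. It reads (1.105) as $\int_a^b (f'')^2\,dx \ge \frac{12}{(b-a)^3}\big(f(a) - 2f(\tfrac{a+b}{2}) + f(b)\big)^2$; that reading, the coefficient-list representation of polynomials, and all helper names are assumptions made for illustration.

```python
from fractions import Fraction


def poly_eval(p, x):
    return sum(c * x ** k for k, c in enumerate(p))


def derivative(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def definite_integral(p, a, b):
    """Exact integral of a polynomial over [a, b]."""
    return sum(Fraction(c) / (k + 1) * (b ** (k + 1) - a ** (k + 1))
               for k, c in enumerate(p))


def zmorovic_sides(p, a, b):
    """Return (left side, right side) of the inequality for polynomial p."""
    f2 = derivative(derivative(p))
    lhs = definite_integral(poly_mul(f2, f2), a, b)
    second_diff = poly_eval(p, a) - 2 * poly_eval(p, (a + b) / 2) + poly_eval(p, b)
    rhs = Fraction(12) / (b - a) ** 3 * second_diff ** 2
    return lhs, rhs
```

For $f(x) = x^4$ on $[0, 1]$ the two sides are $144/5$ and $147/16$; for $f(x) = x^2$ they are $4$ and $3$.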


A weighted version of (1.105) is given in Djordjevic and Milovanovic (1976). Further generalizations are given in Djordjevic, Milovanovic, and Pecaric (1979), where some related results are also given. For example, the following result is proved:

1.75. Theorem. Let $f \in C^n[a - h, a + h]$ and $f^{(k)}(a - h) = (-1)^k f^{(k)}(a + h)$ $(k = 1, \ldots, n - 1)$. Then

$$\|f^{(n)}\|_r \ge \frac{(n - 1)!}{2h^n} \left( \frac{nr - 1}{r - 1} \right)^{(r-1)/r} |f(a + h) - f(a - h)|,$$

where $\|g\|_r = \left( \frac{1}{b - a} \int_a^b |g|^r\, dt \right)^{1/r}$ with $g$ defined on $[a, b]$. (For a function defined on $[a - h, a + h]$, we replace $a$ and $b$ by $a - h$ and $a + h$, respectively.)

Let us adopt the notation

$$\|h\|_{g,p} = \left( \int_a^b g(t)\, |h(t)|^p\, dt \right)^{1/p}.$$

Lupas (1980) showed that instead of finite differences we can consider divided differences. 1.76. Theorem.

If fECn[a,b]

and a=xO<xI ["(x).

(1.113)

A generalization of this result is given in Milovanovic and Stankovic (1974). They also gave a similar result which yields the following special case: If $f''$ is an increasing function on $(a, b)$, then

$$f'(x) + \tfrac{1}{2} f''(x) \le f(x + 1) - f(x) \le f'(x) + \tfrac{1}{2} f''(x + 1)$$

holds for every $x \in (a, b - 1)$.

Let $\{a_n\}_{n=1}^{\infty}$ be a convex sequence such that (Zygmund, 1959, p. 93; Dawson, 1969, p. 438)

-2 (n

+ 1)

~ Sa;

° for

~

° an ~

n = 1, 2, ....

~1

(1.114)

for all n. Then


A generalization of this result for p, q-convex sequences is given in Kocic and Milovanovic (1989): Let { c n } be ~ a positive, convex, and increasing sequence with Co = 1 such that c; = D(n a ) as n ~00, where a > 2 is given. Then from a result of Essen we have n

lim sup (logn)-l n----. ce

2: t:,?cn/c n -::; (a -

k=2

2)Ko ,

where K o = sup (1-x)2(log(l/x))-1=0.407, O<x such that for all x, yES and A E [0, 1] we have

°

$$f(\lambda x + (1 - \lambda) y) \le \max(f(x), f(y)) - s\lambda(1 - \lambda)\|x - y\|^2. \tag{1.116}$$

If in (1.115) $r = 0$, then $f(x)$ is convex; and if in (1.116) $s = 0$, then $f(x)$ is quasiconvex. If these inequalities hold for every $x, y \in S$ and $\lambda = 1/2$, then $f(x)$ is said to be strongly $J$-convex, strongly $J$-quasiconvex, $J$-convex, and $J$-quasiconvex, respectively.

A method for solving the minimization problem

$$\min_{x \in S} f(x)$$

is relaxational if there is a sequence $\{x_k\}$ such that $x_k \in S$ and $f(x_{k+1}) \le f(x_k)$ for $k = 0, 1, \ldots$. In the following we discuss some inequalities which are useful for proving the convergence and estimating the rate of convergence for relaxational methods. Let $f(x^*) = \min_{x \in S} f(x)$, $L(y) = \{x \in S : f(x) \le f(y)\}$, and $f(x)$ be strongly convex on $S$. Then

$$\|x - x^*\|^2 \le \frac{2}{r}\,(f(x) - f(x^*)) \quad \text{for every } x \in S. \tag{1.117}$$

If $f \in C^1(S)$, then

$$\|x - x^*\| \le \frac{1}{r}\,\|\nabla f(x)\| \quad \text{for every } x \in S, \tag{1.118}$$

$$0 \le f(x) - f(x^*) \le \frac{1}{r}\,\|\nabla f(x)\|^2 \quad \text{for every } x \in S, \tag{1.119}$$

$$\|x - y\| \le \frac{2}{r}\,\|\nabla f(y)\| \quad \text{for every } y \in S \text{ and } x \in L(y). \tag{1.120}$$

Furthermore, if $f \in C^{1,1}(S)$ with Lipschitz constant $L$, then

$$\|\nabla f(x)\| \le \sqrt{2 + 12\,\frac{L^2}{r^2}}\;\|\nabla f(y)\| \quad \text{for every } y \in S \text{ and } x \in L(y). \tag{1.121}$$
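The bounds (1.117)-(1.120) can be spot-checked on a one-dimensional example. The Python sketch below is illustrative only. It assumes that strong convexity with constant $r$ means $f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda) f(y) - r\lambda(1-\lambda)\|x-y\|^2$ (the form consistent with Theorem 1.82 of this section), takes the hypothetical sample function $f(x) = x^2 + x^4$ on $S = \mathbb{R}$ with $r = 1$ and $x^* = 0$, and uses exact rational arithmetic on a grid.

```python
from fractions import Fraction

R = Fraction(1)        # assumed strong-convexity constant of the sample f
X_STAR = Fraction(0)   # minimizer of the sample f


def f(x):
    return x ** 2 + x ** 4


def df(x):
    return 2 * x + 4 * x ** 3


def strong_convexity_gap(x, y, lam, r=R):
    """Nonnegative gap means the assumed strong-convexity inequality holds."""
    mix = lam * x + (1 - lam) * y
    return lam * f(x) + (1 - lam) * f(y) - r * lam * (1 - lam) * (x - y) ** 2 - f(mix)


def check_bounds(x, y, r=R):
    """Truth values of (1.117)-(1.120) at the points x, y."""
    excess = f(x) - f(X_STAR)
    results = {
        "1.117": (x - X_STAR) ** 2 <= 2 * excess / r,
        "1.118": abs(x - X_STAR) <= abs(df(x)) / r,
        "1.119": 0 <= excess <= df(x) ** 2 / r,
    }
    if f(x) <= f(y):  # x lies in the level set L(y)
        results["1.120"] = abs(x - y) <= 2 * abs(df(y)) / r
    return results


GRID = [Fraction(k, 4) for k in range(-8, 9)]
LAMBDAS = [Fraction(j, 10) for j in range(11)]
```

A grid check is of course not a proof; it only confirms that the stated constants are consistent on this example.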

The proof of inequalities in (1.117)-(1.121) can be found in Karmanov (1977, pp. 41-43). For strongly quasiconvex functions we have

$$\|x - x^*\|^2 \le \frac{4}{s}\,(f(x) - f(x^*)) \quad \text{for every } x \in S. \tag{1.122}$$

If $f \in C^{1,1}(S)$ and $S$ is an open convex set, then

$$0 \le f(x) - f(x^*) \le \frac{L}{2s^2}\,\|\nabla f(x)\|^2 \quad \text{for every } x \in S. \tag{1.123}$$

For the proof of inequalities in (1.122) and (1.123), see Korablev (1978). Some additional results for strongly quasiconvex functions are given in Jovanovic (1989):

$$\|x - y\| \le \frac{1}{s}\,\|\nabla f(y)\| \quad \text{for every } y \in S \text{ and } x \in L(y). \tag{1.124}$$

Furthermore, if $f \in C^{1,1}(S)$, then

$$\|\nabla f(x)\| \le \sqrt{2 + 2\,\frac{L^2}{s^2}}\;\|\nabla f(y)\| \quad \text{for every } y \in S \text{ and } x \in L(y) \tag{1.125}$$

and

$$0 \le f(x) - f(x^*) \le \frac{L + \sqrt{2L^2 + 2s^2}}{2s^2}\,\|\nabla f(x)\|^2 \quad \text{for every } x \in S. \tag{1.126}$$

Jovanovic also noted that in (1.121) $\sqrt{2 + 12(L^2/r^2)}$ can be replaced by $\sqrt{2 + 8(L^2/r^2)}$ for strongly convex functions.

If the constant $r$ in Definition 1.81 is negative, then we have weakly convex and weakly quasiconvex functions, respectively. In general, if $r$ is real, then $f$ is an $r$-convex ($r$-quasiconvex) function. The following results are valid:

1.82. Theorem (Rockafellar, 1976). $f(x)$ is $r$-convex iff the function $f(x) - r\|x\|^2$ is convex.

1.83. Theorem (Vial, 1983). A differentiable function $f(x)$ is $r$-convex iff for every $x, y \in C$ we have

$$f(y) - f(x) \ge \langle \nabla f(x), y - x \rangle + r\|x - y\|^2.$$

1.84. Theorem (Vial, 1983). Let $f$ be a twice continuously differentiable function on a convex set $C$ with nonempty interior. Then $f$ is $r$-convex, $r \in \mathbb{R}$, iff

$$\langle \nabla^2 f(x) y, y \rangle \ge 2r\|y\|^2$$

holds for every $x \in C$ and $y \in \mathbb{R}^k$.
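Theorems 1.82-1.84 give three equivalent tests for $r$-convexity, which can be compared on a simple weakly convex example. The Python sketch below is an illustration under assumptions: the sample function $f(x) = x^4 - x^2$ with $r = -1$ is hypothetical, the first-order test takes the gradient at the base point $x$, and convexity of $f - r\|x\|^2$ is only probed through midpoint convexity on a grid.

```python
from fractions import Fraction

R = Fraction(-1)  # the sample f below is r-convex with r = -1 (weakly convex)


def f(x):
    return x ** 4 - x ** 2


def df(x):
    return 4 * x ** 3 - 2 * x


def d2f(x):
    return 12 * x ** 2 - 2


def shifted(x, r=R):
    """g(x) = f(x) - r*x**2, the function Theorem 1.82 asks to be convex."""
    return f(x) - r * x ** 2


def midpoint_convex(g, x, y):
    return g((x + y) / 2) <= (g(x) + g(y)) / 2


def gradient_inequality(x, y, r=R):
    """First-order test of Theorem 1.83 (gradient evaluated at x, an assumption)."""
    return f(y) - f(x) >= df(x) * (y - x) + r * (x - y) ** 2


def hessian_inequality(x, r=R):
    """Second-order test of Theorem 1.84 in one dimension."""
    return d2f(x) >= 2 * r


GRID = [Fraction(k, 4) for k in range(-8, 9)]
```

With $r = 1$ the second-order test fails at $x = 0$, so the same function is not strongly convex.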

Chapter 2. Jensen's and Jensen-Steffensen's Inequalities

2.1. Jensen's Inequality

In this section we treat Jensen's inequality in the form of a convex combination of the values of a convex function. Its applications in probability and statistics concerning the expectation of a convex function of a random variable deserve separate attention, and will be treated in Section 13.1. Jensen's inequality for convex functions is one of the most important inequalities in mathematics and statistics. Many other inequalities can be obtained from it. Thus this inequality and many of its useful consequences are discussed here.

2.1. Theorem (Jensen's Inequality). If $I$ is an interval in $\mathbb{R}$ and $f: I \to \mathbb{R}$ is convex, $x = (x_1, \ldots, x_n) \in I^n$ $(n \ge 2)$, $p = (p_1, \ldots, p_n)$ is a positive $n$-tuple (i.e., $p_i > 0$), and $P_k = \sum_{i=1}^{k} p_i$ $(k = 1, \ldots, n)$, then

$$f\left( \frac{1}{P_n} \sum_{i=1}^{n} p_i x_i \right) \le \frac{1}{P_n} \sum_{i=1}^{n} p_i f(x_i). \tag{2.1}$$

If $f$ is strictly convex, then (2.1) is strict unless $x_1 = \cdots = x_n$.
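Inequality (2.1) is easy to test numerically. The short Python sketch below is illustrative (the function names are not from the source): it computes the difference between the right and left sides of (2.1) for given points and positive weights, which should be nonnegative for convex $f$ and zero when all points coincide.

```python
import math


def weighted_mean(values, weights):
    """(1/P_n) * sum p_i v_i with P_n = sum p_i."""
    total = sum(weights)
    return sum(p * v for p, v in zip(weights, values)) / total


def jensen_gap(f, xs, ps):
    """Right side minus left side of (2.1); nonnegative when f is convex."""
    mean_of_f = weighted_mean([f(x) for x in xs], ps)
    f_of_mean = f(weighted_mean(xs, ps))
    return mean_of_f - f_of_mean
```

For $f(x) = x^2$, points $(1, 2, 3)$ and weights $(1, 1, 2)$ the gap is $5.75 - 5.0625 = 0.6875$.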

Proof. The proof of (2.1) is by induction. The case $n = 2$ follows from the definition of convex functions. Suppose that the result is valid for all $k$, $2 \le k \le n - 1$. Then

$$\begin{aligned}
f\left( \frac{1}{P_n} \sum_{i=1}^{n} p_i x_i \right) &= f\left( \frac{p_n}{P_n}\, x_n + \frac{P_{n-1}}{P_n} \cdot \frac{1}{P_{n-1}} \sum_{i=1}^{n-1} p_i x_i \right) \\
&\le \frac{p_n}{P_n}\, f(x_n) + \frac{P_{n-1}}{P_n}\, f\left( \frac{1}{P_{n-1}} \sum_{i=1}^{n-1} p_i x_i \right) \\
&\le \frac{1}{P_n} \sum_{i=1}^{n} p_i f(x_i)
\end{aligned}$$

holds by the result for $n = 2$ and the induction hypothesis. The proof for the strict inequality when $f$ is strictly convex is easy and is omitted. □

2.2. Remarks. (a) The condition that $p$ is a positive $n$-tuple can be replaced by: "$p$ is a nonnegative $n$-tuple and $P_n > 0$." Similar remarks can be made for other results in this chapter.

(b) Jensen's inequality in (2.1) can be used as an alternative definition of convexity.

(c) Jensen's original papers (1905 and 1906) are related to $J$-convex functions (functions which satisfy (1.16)). However, inequality (1.16) appeared much earlier under different assumptions. For example, as Jensen (1905) mentioned in the Appendix of his paper, Hölder (1889) proved inequality (1.16) by assuming that $f$ is twice differentiable on $[a, b]$ and that $f''(x) \ge 0$ on the interval. If $f$ is twice differentiable, then $f''(x) \ge 0$ for $x \in [a, b]$ is equivalent to $f$ being convex on $[a, b]$. Later Henderson (1895/96) proved (2.1) under the same assumptions imposed by Hölder (1889). As far back as 1875, however, a special case of (2.1) in the form

$$f\left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) \le \frac{1}{n} \sum_{i=1}^{n} f(x_i) \tag{2.2}$$

(i.e., $p_1 = \cdots = p_n$) was proved by Grolous (1875) by an application of the centroid method. To our knowledge, (2.2) is the first inequality for convex functions in mathematical literature (see Mitrinovic and Vasic, 1975). Grolous (1875) also introduced the assumption that $f''(x) > 0$, but it can be seen from the text that it suffices to assume that $f$ is a convex function in the geometric sense.

(d) By replacing the interval $I$ with a convex set $U$ from a real vector space $M$, and $x_i$ $(i = 1, \ldots, n)$ with points in $U$, Theorem 2.1 remains valid for convex functions defined on a real vector space. □


Integral inequalities analogous to (2.1) can be proved by using Theorem 2.1 or by a direct argument. For example, the following integral analogue of Theorem 2.1 is given in Beesack and Pecaric (1984):

2.3. Theorem. (a) Let $\nu$ be a nonnegative measure on a $\sigma$-algebra of subsets of a set $D$ and let $q, f$ be $\nu$-measurable functions on $D$ such that $q(x) \ge 0$ and $-\infty \le a \le f(x) \le b \le \infty$ for all $x \in D$ and $\int_D q\, d\nu = 1$. If $\varphi$ is continuous and convex on $(a, b)$ and if $q\varphi(f) \in L_\nu(D)$, then either $qf^+$ or $qf^-$ (or both) is in $L_\nu(D)$, so that $\int_D qf\, d\nu = \int_D qf^+\, d\nu + \int_D qf^-\, d\nu$ is well defined (possibly $\infty$ or $-\infty$); moreover, we have

$$\varphi\left( \int_D qf\, d\nu \right) \le \int_D q\varphi(f)\, d\nu. \tag{2.3}$$

In the case $\int_D qf\, d\nu = \infty$ we have $b = \sup_{x \in D} f(x) = \infty$, and the left-hand side of (2.3) means $\varphi(\infty) = \lim_{x \to \infty} \varphi(x)$; while if $\int_D qf\, d\nu = -\infty$, then $a = \inf_{x \in D} f(x) = -\infty$ and the left-hand side of (2.3) is $\varphi(-\infty)$. Finally, if both integrals in (2.3) are known to exist (finite or infinite), then (2.3) still holds. Moreover, if $\int_D qf\, d\nu$ is finite, then $\int_D q\varphi(f)\, d\nu$ does exist, either finite or $\infty$.

(b) If $\varphi$ is continuous and concave on $(a, b)$, then all of the above statements hold true with the direction of inequality in (2.3) reversed. In that case the finiteness of $\int_D qf\, d\nu$ implies the existence of $\int_D q\varphi(f)\, d\nu$, either finite or $-\infty$.
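The integral form (2.3) can be illustrated with ordinary Lebesgue measure on $D = [0, 1]$. The following Python sketch is a numerical illustration only: the density $q(x) = 2x$, the function $f(x) = x$, the midpoint quadrature rule, and all names are assumptions chosen for the example. With $\varphi = \exp$ (convex) the left side is $e^{2/3}$ and the right side is $2$; with $\varphi = \sqrt{\cdot}$ (concave) the inequality reverses.

```python
import math


def midpoint_integral(g, lo, hi, n=2000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def integral_jensen_sides(phi, f, q, lo=0.0, hi=1.0):
    """Return (phi(int q f), int q phi(f)) for a weight q with integral 1."""
    mean = midpoint_integral(lambda x: q(x) * f(x), lo, hi)
    lhs = phi(mean)
    rhs = midpoint_integral(lambda x: q(x) * phi(f(x)), lo, hi)
    return lhs, rhs


def q(x):
    return 2.0 * x  # sample density on [0, 1] (assumption)


def f(x):
    return x        # sample integrand (assumption)
```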

Proof. (a) If either $\varphi$ or $f$ is constant for almost all $x$, then (2.3) holds trivially with equality; thus we assume that this is not the case. The proof of all but the last two sentences in (a) is essentially given in Hardy, Littlewood, and Pólya (1934, 1952, Theorem 202) for ordinary Lebesgue integrals on the line, and that proof still applies in our case. To prove these last sentences we assume that both integrals

$$r = \int_D qf\, d\nu, \qquad s = \int_D q\varphi(f)\, d\nu$$

exist, either finite or infinite. (The case in which $s$ is finite has already been discussed.) If $r$ is finite, then we proceed as in Hardy, Littlewood, and Pólya (1934, 1952), and the convexity of $\varphi$ implies

$$\varphi(f(x)) \ge \varphi(r) + \lambda [f(x) - r], \qquad x \in D, \tag{2.4}$$


where $\lambda$ is any real number between the right-hand and left-hand derivatives $\varphi'_+(r)$, $\varphi'_-(r)$. From (2.4) we see that

$$[\varphi(f(x))]^- = \min\{0, \varphi(f(x))\} \ge \min\{0, \varphi(r) + \lambda [f(x) - r]\},$$

so that $\int_D q[\varphi(f)]^-\, d\nu$ is finite. Hence $\int_D q\varphi(f)\, d\nu$ exists but may have the value $\infty$. Multiplying (2.4) by $q(x)$ and integrating over $D$, we obtain (2.3) (and we do not need the existence of $s$). If $\varphi$ is concave, then the reverse inequality in (2.4) holds, and a similar argument using $[\varphi(f)]^+ = \max\{0, \varphi(f)\}$ shows that the finiteness of $r$ implies the existence of $s$, either finite or $-\infty$.

Next, let us consider the case $r = \infty$. Here we have $b = \infty$, $\int_D qf^+\, d\nu = \infty$, but $\int_D qf^-\, d\nu$ is finite. Let $f_n(x) = \min\{n, f(x)\}$. Then $f_n(x) \to f(x)$ as $n \to \infty$ for $x \in D$. Also $0 \le f_n^+(x) \le n$ and $0 \ge f_n^-(x) \ge f^-(x)$ hold. Thus both $\int_D qf_n^+\, d\nu$ and $\int_D qf_n^-\, d\nu$ are finite. Consequently, $\int_D qf_n\, d\nu$ is finite, and we may apply (2.3) to $f_n$ to obtain

$$\int_D q\varphi(f_n)\, d\nu \ge \varphi\left( \int_D qf_n\, d\nu \right). \tag{2.3'}$$

Now $\int_D qf_n\, d\nu \to \int_D qf\, d\nu = \infty$ as $n \to \infty$. Since $\varphi(u)$ is continuous and convex on $(a, \infty)$, it is also monotonic for large $u$. Moreover, $f_n(x) \le f_{n+1}(x)$ for $x \in D$, so that the right-hand side of (2.3') approaches $\varphi(r) = \varphi(\infty)$ as $n \to \infty$. To obtain (2.3) it only remains to prove that

$$\int_D q\varphi(f_n)\, d\nu \to s = \int_D q\varphi(f)\, d\nu \quad \text{as } n \to \infty. \tag{2.5}$$

We may assume that $\varphi(u)$ is monotonic on $(u_0, \infty)$. Let

$$D_1 = \{x : x \in D,\ f(x) \le u_0\}, \qquad D_2 = \{x : x \in D,\ f(x) > u_0\}.$$

Then $f_{n+1}(x) \ge f_n(x) > u_0$ for $x \in D_2$ and $n \ge n_0$, where $n_0 > u_0$. Thus $\{q(x)\varphi(f_n(x))\}_{n_0}^{\infty}$ is a monotonic sequence for each $x \in D_2$. By the monotone convergence theorem it follows that

$$\int_{D_2} q\varphi(f_n)\, d\nu \to \int_{D_2} q\varphi(f)\, d\nu \quad \text{as } n \to \infty,$$

where the limiting integral may have the value $\infty$ or $-\infty$. For $x \in D_1$ we have $a \le f(x) \le u_0$, so $f_n(x) = f(x)$ for all $n \ge n_0$; hence $\int_{D_1} q\varphi(f_n)\, d\nu = \int_{D_1} q\varphi(f)\, d\nu$, where this integral may also have the value $\infty$ or $-\infty$. However, since $s = \int_D q\varphi(f)\, d\nu$ exists by assumption, it follows that if both $\int_{D_1} q\varphi(f)\, d\nu$ and $\int_{D_2} q\varphi(f)\, d\nu$ are infinite, then they are both $\infty$ or both $-\infty$, and (2.5) follows. We note that it is possible to prove that $\int_{D_1} q\varphi(f)\, d\nu$ is either finite or $\infty$.

Finally, the case $r = -\infty$ follows in much the same way. In this case, $a = -\infty$, $\int_D qf^-\, d\nu = -\infty$, and $\int_D qf^+\, d\nu$ is finite. We apply (2.3) to the function $f_n = \max\{-n, f\}$ for which $0 \ge f_n^-(x) \ge -n$, $0 \le f_n^+(x) \le f^+(x)$, and $f_{n+1}(x) \le f_n(x)$ for all $x \in D$. The proof proceeds as before, and we again obtain (2.3) for $f$.

(b) This follows by applying (a) to the convex function $\varphi_1 = -\varphi$. □

Now, let $E$ be a nonempty set and $L$ be a linear class of real-valued functions $f: E \to \mathbb{R}$ having the properties:

L1: $f, g \in L \Rightarrow (af + bg) \in L$ for all $a, b \in \mathbb{R}$;

L2: $1 \in L$, i.e., if $f(t) = 1$ for $t \in E$, then $f \in L$.

We also consider isotonic positive linear functionals $A: L \to \mathbb{R}$. That is, we assume that

A1: $A(af + bg) = aA(f) + bA(g)$ for $f, g \in L$, $a, b \in \mathbb{R}$;

A2: $f \in L$, $f(t) \ge 0$ on $E \Rightarrow A(f) \ge 0$ ($A$ is isotonic).

If

A3: $A(1) = 1$

also holds, then we say that $A(f)$ is a linear mean defined on $L$. Jessen (1931) gave the following generalization of Jensen's inequality for convex functions (see also Popoviciu, 1944, p. 33):

2.4. Theorem. Let $L$ satisfy properties L1, L2 on a nonempty set $E$, and assume that $\varphi$ is a continuous convex function on an interval $I \subset \mathbb{R}$. If $A$ is a linear positive functional with $A(1) = 1$, then for all $g \in L$ such that $\varphi(g) \in L$ we have $A(g) \in I$ and

$$\varphi(A(g)) \le A(\varphi(g)). \tag{2.6}$$

Proof (Rasa, 1988a). Let $I = [a, b]$. From $a \le g(t) \le b$ for all $t \in E$ we obtain $a \le A(g) \le b$. For arbitrary but fixed $\varepsilon > 0$ there exist real numbers $u, v \in \mathbb{R}$ such that for $p = u p_0 + v p_1$ ($p_i(t) = t^i$ for $i = 0, 1$) we have (i) $p \le \varphi$ and (ii) $p(A(g)) \ge \varphi(A(g)) - \varepsilon$. (If a < A(g) (A(g)).) Now (i) implies $p \circ g \le \varphi \circ g$; hence

$$A(\varphi \circ g) \ge A(p \circ g) = A(u \cdot 1 + v \cdot g) = u + vA(g) = p(A(g)) \ge \varphi(A(g)) - \varepsilon.$$

Since $\varepsilon$ is arbitrary, the proof is complete. □

Note that, as Rasa (1988) pointed out, if $\varphi$ is not continuous on $I$, then (2.6) may fail to hold. For example, let

$$L = \{g: [0, 1] \to \mathbb{R} \text{ such that } \lim_{x \to 1} g(x) \text{ is finite}\},$$

$A: L \to \mathbb{R}$, $A(g) = \lim_{x \to 1} g(x)$, and $\varphi: [0, 1] \to \mathbb{R}$ such that $\varphi(x) = 0$ for $x \in [0, 1)$ and $\varphi(1) = 1$. Then $\varphi(A(p_1)) = 1 > 0 = A(\varphi \circ p_1)$.

Additional generalizations of Jessen's inequality are given in McShane (1937). The following Theorems 2.5-2.6 and 2.8-2.10 can be found in his paper. Let $\mathbb{R}^n$ be the $n$-dimensional Euclidean space, and a point in $\mathbb{R}^n$ will be denoted by $z = (z_1, \ldots, z_n)$. Linear functions $\sum a_i z_i$ on $\mathbb{R}^n$ will be denoted by $\ell(z)$. If $f(x)$ is an $n$-tuple of functions $(f_1(x), \ldots, f_n(x))$ of $L$, we denote by $A(f)$ the $n$-tuple $(A(f_1), \ldots, A(f_n))$. From A1 we obtain

A1': $A(\ell(f)) = \ell(A(f))$ for every function $\ell(z)$ linear on $\mathbb{R}^n$.

The first result of McShane (1937) concerns the geometric formulation of Jensen's inequality:

2.5. Theorem. Let L1, L2, and A1-A3 be satisfied. Let $K$ be a closed convex point set in $\mathbb{R}^n$, and $f_1(x), \ldots, f_n(x)$ be functions of the class $L$ such that $f = (f_1, \ldots, f_n)$ is in $K$ for all $x \in E$. Then $A(f)$ is in $K$.

Proof. Let $\ell(z) + c = 0$ be a hyperplane in $\mathbb{R}^n$ such that $K$ is entirely to one side of the hyperplane, say, $\ell(z) + c \ge 0$ for $z$ in $K$. Then $\ell(f) + c \ge 0$ for all $x$, and $0 \le A(\ell(f) + c) = A(\ell(f)) + A(c) = \ell(A(f)) + c$, so that $A(f)$ lies on the same side of the hyperplane as $K$ does. That is, no hyperplane separates $A(f)$ from $K$. This is possible only if $A(f)$ is in $K$. □

2.6. Theorem. Let L1, L2 and A1-A3 be satisfied. Let $K$ be a closed convex point set in $\mathbb{R}^n$ and $\varphi(z)$ be continuous and convex on $K$. Let $f_1(x), \ldots, f_n(x)$ be functions of the class $L$ such that $f(x) = (f_1(x), \ldots, f_n(x))$ is in $K$ for all $x \in E$ and $\varphi(f(x))$ is in the class $L$. Then $\varphi(A(f))$ is defined and

$$\varphi(A(f)) \le A(\varphi(f)). \tag{2.7}$$

Proof. Denote by $K_1$ the epigraph of the function $\varphi$:

$$K_1 = \operatorname{epi} \varphi = \{(z, a) \in \mathbb{R}^n \times \mathbb{R} : z \in K,\ a \ge \varphi(z)\}.$$

Then the set $K_1$ is closed and convex and the point $(f(x), \varphi(f(x)))$ is in $K_1$ for all $x \in E$. Hence, by Theorem 2.5, the point $(A(f), A(\varphi(f)))$ is in $K_1$; that is, $A(f)$ is in $K$ and $A(\varphi(f)) \ge \varphi(A(f))$. □

2.7. Remarks. The following are some examples from McShane's (1937) paper. In the first example the conditions on the convergence or integrability are too obvious to be formally stated.

(a) The range of $x$ is $(1, \ldots, m)$ or $(1, 2, \ldots)$, so that $f(x)$ is a (finite or infinite) sequence $\{a_1, a_2, \ldots\}$, and $A(f) = \sum c_i a_i / \sum c_i$, where $c_i \ge 0$ and $0 < \sum c_i < \infty$.

(b) $E$ is the interval $(0, 1)$, $L$ is the class of all bounded functions on $E$, $A(f)$ is the Banach integral of $f$ over $(0, 1)$.

(c) $E$ is the set of all real numbers, $L$ the class of all uniformly almost periodic functions, $A(f)$ is the mean value of $f$.

(d) More generally, $E$ is any group, $L$ is the class of all functions almost periodic on $E$, and $A(f)$ is von Neumann's (1934) mean value of $f$. □
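Example (a) of Remarks 2.7 is the easiest linear mean to experiment with. The Python sketch below is illustrative only (names are hypothetical): it builds $A(f) = \sum c_i a_i / \sum c_i$ for a finite sequence, which can then be used to check properties A1-A3 and Jessen's inequality (2.6) for a convex $\varphi$ in exact arithmetic.

```python
from fractions import Fraction


def make_linear_mean(c):
    """Return A with A(f) = sum(c_i * f_i) / sum(c_i); requires c_i >= 0, sum > 0."""
    total = sum(c)
    if total <= 0 or any(ci < 0 for ci in c):
        raise ValueError("weights must be nonnegative with positive sum")

    def A(f):
        return sum(ci * fi for ci, fi in zip(c, f)) / total

    return A


def combine(a, f, b, g):
    """Pointwise linear combination a*f + b*g of two finite sequences."""
    return [a * fi + b * gi for fi, gi in zip(f, g)]


def compose(phi, f):
    """Pointwise composition phi(f)."""
    return [phi(fi) for fi in f]
```

With weights $c = (1, 0, 2, 3)$ and $f = (1, 4, -2, 5)$ one gets $A(f) = 2$ and $A(f^2) = 14$, so (2.6) reads $4 \le 14$ for $\varphi(x) = x^2$.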

In Theorems 2.5 and 2.6 we have not mentioned conditions for strict inequality. To investigate this problem it is convenient to define negligible sets. A set $S \subset E$ is negligible (with respect to $L$ and $A$) if there exists a function $f$ in the class $L$ such that

$$f(x) \ge 0 \text{ on } E, \qquad f(x) > 0 \text{ on } S, \qquad \text{and} \qquad A(f) = 0.$$

It follows readily that every subset of a negligible set is negligible and so is every set which is the sum of a finite number of negligible sets.

2.8. Theorem. In Theorem 2.5, A(f) is a boundary point of K only if all points f(x) not in a negligible set of x belong to the intersection of K with one of its hyperplanes of support.

Proof. If $A(f)$ is a boundary point of $K$, then passing through it there exists a hyperplane of support $\pi: \ell(z) + c = 0$ of $K$; say $\ell(z) + c \ge 0$ for $z$ in $K$. Let $S$ be the set of $x$ such that $f(x)$ is not in $\pi \cap K$; then $\ell(f(x)) + c > 0$ on $S$. Since $\ell(f(x)) + c \ge 0$ for all $x$ and $A(\ell(f(x)) + c) = \ell(A(f)) + c = 0$, it follows that $S$ is negligible. □

The following result is also valid:

2.9. Theorem. If the set $K$ is strictly convex and $\varphi$ is strictly convex on $K$, then in Theorem 2.6 equality holds only if $f_i(x) = A(f_i) = \text{const.}$ $(i = 1, \ldots, n)$ except possibly on a negligible set.

We are unable to state whether the condition "$f_i = A(f_i)$ except possibly on a negligible set" is sufficient as well as necessary for the equality to hold in Theorem 2.9. However, by adding a condition concerning $L$ and $A$ we can establish such a result even when $K$ is not strictly convex. Thus we restrict our attention to systems of $L$, $A$ such that the following condition holds:

A4: If $S$ is any negligible subset of $E$, and $f(x)$ is any (real) function which vanishes on $E - S$, then $f(x)$ is in the class $L$ and $A(f) = 0$.

Note that in Remarks 2.7(a) a set $S$ of integers is negligible if $\sum_{i \in S} c_i = 0$, and in Remarks 2.7(c) and (d) only the empty set is negligible. An immediate consequence of conditions L1, L2, A1-A4 is that if $f(x)$ is any function in the class $L$ and $g(x) = f(x)$ except on a negligible set, then $g(x)$ is in $L$ and $A(f) = A(g)$. The following theorem is proved in McShane (1937):

2.10. Theorem. If Condition A4 is satisfied, then in inequality (2.7) equality holds iff the following condition is satisfied: For all $x$ not in a negligible set $S$, the point $(f_1(x), \ldots, f_n(x))$ belongs to a convex subset $K'$ of $K$ on which $\varphi(z)$ is linear. In particular, if $\varphi(z)$ is strictly convex, then the equality holds iff $f_i(x) = \text{const.}$ except on a negligible set.

2.11. Remarks. (a) (McShane, 1937). It is possible to extend Theorems 2.5, 2.6, 2.8, and 2.9 to functions $f(x)$ assuming values in a Banach space $\mathfrak{B}$. Suppose that L1, L2, and A1-A3 are satisfied, and that $\mathcal{L}$ is a class of functions $f(x)$ defined on $E$ and assuming values in $\mathfrak{B}$. We shall assume that for every linear function $\ell(z)$ on $\mathfrak{B}$ and every $f(x)$ in $\mathcal{L}$ the function $\ell(f(x))$ is in $L$. Further, we shall assume that there is a linear mean $M$ defined on $\mathcal{L}$ such that for every linear function $\ell(z)$ on $\mathfrak{B}$ and every $f$ in $\mathcal{L}$ we have $\ell(M(f)) = M(\ell(f))$. We then find that Theorems 2.5, 2.6, 2.8, and 2.9 can be extended with only one change. The only properties of convex sets needed there are these: through each boundary point of a convex set there exists a hyperplane of support, and every point that does not belong to a convex set can be separated from it by a hyperplane. These properties have been established for convex bodies (closed convex sets having interior points). Hence our theorems can be immediately extended provided that we replace the words "convex set" by "convex body."
k=!

k ~ l

k ~ l

and Theorem 2.32 follows because

$$c \sum_{k=1}^{n} p_k f'_+(x_k) - \sum_{k=1}^{n} p_k f'_+(x_k)\, x_k = 0.$$

We begin the proof of Theorem 2.33 by establishing (2.36), and then the proof is similar to that of (2.34). Let

$$c = \frac{\int_E (f'_+ \circ X)\, X\, d\mu}{\int_E (f'_+ \circ X)\, d\mu};$$

then $c \in (a, b)$, and by (2.37)

$$f(c) \ge (f \circ X)(t) + (c - X(t))\,(f'_+ \circ X)(t)$$

holds for every $t \in E$. Integrating with respect to $\mu$ and observing that $c \int_E (f'_+ \circ X)\, d\mu - \int_E (f'_+ \circ X)\, X\, d\mu = 0$, (2.36) follows.

To complete the proof of Theorem 2.33 we consider the case of equality in (2.36). If $X$ is constant $\mu$-a.e., then (2.36) is obviously an equality. We now show that if we require $f$ to be strictly convex, then for equality in (2.36) to hold it is also necessary that $X$ be constant $\mu$-a.e. Since $f$ is increasing and strictly convex on $(a, b)$ we have $f'_+ > 0$ and $Z: E \to \mathbb{R}$ defined by

$$\int_E f \circ X\, d\mu = (f \circ X)(t) + (f'_+ \circ X)(t)\,(Z(t) - X(t)) \tag{2.39}$$

is measurable on $E$. From (2.35) and (2.36), and the continuity and strict monotonicity of $f$, there exists a unique $x_0 \in (a, b)$ such that $f(x_0) = \int_E f \circ X\, d\mu$. We rewrite (2.39) as

$$Z(t) - x_0 = \frac{f(x_0) - \big( (f \circ X)(t) + (x_0 - X(t))\,(f'_+ \circ X)(t) \big)}{(f'_+ \circ X)(t)}. \tag{2.40}$$

From (2.38) and the fact that $f'_+ > 0$, it follows that for all $t \in E$, $X(t) \ne x_0$ iff $Z(t) > x_0$, while $X(t) = x_0$ iff $Z(t) = x_0$. Integrating both sides of (2.39) with respect to $\mu$ and dividing by $\int_E f'_+ \circ X\, d\mu$, we have

$$c = \frac{\int_E (f'_+ \circ X)\, Z\, d\mu}{\int_E (f'_+ \circ X)\, d\mu}, \tag{2.41}$$

where c is as defined above. If X is not constant w.r.t, /l a.e., then E1={tEE:X(t)=I=xo}={tEE,Z(t»xo} has positive measure. Since E\E 1 = {t E E: Z(t) = xo}, from (2.41) it follows that c > xo; thus f Ef X du = f(xo) 0 such that al + a2 = 1 and ad(xl) + ad(x 2) = y. An easy calculation shows that

$$\bar{x} = \frac{a_1 f'(x_1)\, x_1 + a_2 f'(x_2)\, x_2}{a_1 f'(x_1) + a_2 f'(x_2)}$$

($f'(x_1) = m_1$ and $f'(x_2) = m_2$). Thus

Pecaric (1985a) noted that by using the same proof the following generalization of Slater's (1981) Theorem 1 can be given:

2.3. Companion Inequalities

2.34. Theorem. If $f$ is convex on $(a, b)$, and if for $x_1, \ldots, x_n \in (a, b)$, $p_1, \ldots, p_n \ge 0$, $p_1 + \cdots + p_n > 0$, we have

$$\sum_{i=1}^{n} p_i f'_+(x_i) \ne 0, \tag{2.42}$$

then (2.34) holds.

Now we observe a companion inequality to Jensen-Steffensen's inequality (Pecaric, 1985a):

2.35. Theorem. Assume that $f$ is convex on $(a, b)$ and $a < x_1 \le \cdots \le x_n < b$. If (2.42) holds and

$$0 \le P_k \le P_n \quad (1 \le k \le n - 1), \qquad P_n > 0, \tag{2.43}$$

then (2.34) holds.
Proof.

For arbitrary but fixed x, y

E

(a, b) we have (2.37), i.e.,

f(y) - f(x) 2:: (y

- x ) f ~ ( x ) ,

(2.44)

Y ) f ~ ( x ) .

(2.45)

or equivalently,

f(x) - f(y)

~ (x

-

Therefore d i = f(c) - f(xi) - f ~ ( X i ) ( -Xi) C 2::0 (i

= 1, ... ,n),

where

Assume that c (2.44) we obtain

E

[Xb xk+d (k

E

(1, ... , n - 1)). If Xi

~ X i + l ~ c,

then by

f(Xi+l) - f(xi) 2:: (Xi+l - X J f ~ ( X i=) (c - x i ) f ~ ( X i -) (c - X i + l ) f ~ ( x J 2:: (c - x i ) f ~ ( X i -) (c - X i + I ) f ~ ( X i + I ) ' namely,

which is d i 2:: d i+1• Similarly, if c

~ X i ~ X i + tbhen

from (2.45) we have

f(Xi+l) - f(Xi) ~ (Xi+l - X J f ~ ( X i + = l ) (c - x i ) f ~ ( X i + l- ) (c - x i + l ) f ~ ( X i + l ) ~ (c

- x i ) f ~ ( X i -) (c - X i + l ) f ~ ( X i + l ) '

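i.e., $\Delta_i \le \Delta_{i+1}$. Therefore we have

$$\sum_{i=1}^{n} p_i \Delta_i = \sum_{i=1}^{k} p_i \Delta_i + \sum_{i=k+1}^{n} p_i \Delta_i = \Delta_k P_k + \sum_{i=1}^{k-1} P_i (\Delta_i - \Delta_{i+1}) + \Delta_{k+1} \bar{P}_{k+1} + \sum_{i=k+2}^{n} \bar{P}_i (\Delta_i - \Delta_{i-1}) \ge 0$$

(with $P_i = p_1 + \cdots + p_i$ and $\bar{P}_i = P_n - P_{i-1}$),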

i.e.,

$$f(c) P_n - \sum_{i=1}^{n} p_i f(x_i) = \sum_{i=1}^{n} p_i \Delta_i \ge 0,$$

which is the inequality in (2.34). If $c \in (a, x_1)$, then $\Delta_i$ is nonnegative and increasing for $i = 1, \dots, n$. Thus we have

$$\sum_{i=1}^{n} p_i \Delta_i = \Delta_1 P_n + \sum_{i=2}^{n} (P_n - P_{i-1}) (\Delta_i - \Delta_{i-1}) \ge 0,$$

i.e., (2.34) is again valid. Similarly we can prove (2.34) if $c \in (x_n, b)$. □
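The companion inequality proved above can be checked numerically. The sketch below assumes that the point $c$ in (2.34) is the weighted mean $c = \sum p_i x_i f'(x_i) / \sum p_i f'(x_i)$, which is the choice that makes $\sum p_i f'(x_i)(c - x_i)$ vanish in the proof; the function name `slater_point` and the sample data are illustrative only. It compares the Jensen lower bound, the weighted average of $f$, and the value $f(c)$ for $f = \exp$.

```python
import math

def slater_point(xs, ps, df):
    """Weighted point c = sum(p*x*f'(x)) / sum(p*f'(x)) (assumed form of c in (2.34))."""
    den = sum(p * df(x) for x, p in zip(xs, ps))
    if den == 0:
        raise ValueError("condition (2.42) fails: sum p_i f'(x_i) = 0")
    return sum(p * x * df(x) for x, p in zip(xs, ps)) / den

xs = [0.2, 0.9, 1.5, 2.4]      # illustrative points
ps = [1.0, 2.0, 0.5, 1.5]      # illustrative nonnegative weights
P_n = sum(ps)
f = df = math.exp              # exp is convex and equals its own derivative

c = slater_point(xs, ps, df)
jensen_side = f(sum(p * x for x, p in zip(xs, ps)) / P_n)   # f(weighted mean)
average = sum(p * f(x) for x, p in zip(xs, ps)) / P_n       # (1/P_n) sum p_i f(x_i)
slater_side = f(c)                                          # f(c)

print(jensen_side, average, slater_side)
```

For this data the three printed numbers should appear in increasing order: Jensen's inequality bounds the average from below, and the companion inequality bounds it from above.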

2.36. Remarks. (a) If

$$0 \le \sum_{i=1}^{k} p_i f'_+(x_i) \le \sum_{i=1}^{n} p_i f'_+(x_i) \quad (1 \le k \le n - 1),$$

then $A \in [x_1, x_n]$. For an increasing function $f$ and nonnegative $p_i$, we have Slater's result.

(b) Using similar proofs we can give integral analogs of Theorems 2.34 and 2.35. □

A multidimensional generalization of Slater's inequality is given in Pecaric (1985b). A further generalization is given in Pecaric and Andrica (1987): namely, a generalization of Jensen's inequality for linear normalized isotonic functionals is the well-known Jessen's inequality (see Theorem 2.4). Pecaric and Andrica (1987) gave a similar generalization of such results. Let $X$ be a real vector space and $D$ be a nonempty convex subset of $X$. We consider the real vector space $\mathcal{F}(D, \mathbb{R})$ whose elements are the real


functions defined on $D$. Let $X^*$ be the algebraic dual space of $X$, i.e., the real space of all linear functionals $x^* : X \to \mathbb{R}$. Let us denote by $M(D, \mathbb{R})$ the set of linear isotonic functionals $A$ for $D$ with the property $A(1) = 1$. For a fixed element $y \in D$ we define the abstract subdifferential of $f$ at $y$ by

$$\partial f(y) = \{a^* \in X^* : f(x) \ge f(y) + a^*(y, x - y) \text{ for every } x \in D\}. \tag{2.46}$$

Note that if $\partial f(y) \ne \emptyset$ for every $y \in D$, then $f$ is convex on $D$. The following result represents an abstract form of Jensen's inequality, i.e., it is a generalization of Theorem 2.17.

2.37. Theorem. Let $f : D \to \mathbb{R}$ and $A \in M(D, \mathbb{R})$. Let $y$ be an element of $D$ such that there exists an $a^* \in \partial f(y)$ with the property $A(a^*) \ge a^*(y, y)$. Then

$$A(f) \ge f(y). \tag{2.47}$$

Proof. Since $a^* \in \partial f(y)$, by (2.46) we have

$$f(x) \ge f(y) + a^*(y, x - y) \quad \text{for every } x \in D;$$

thus

$$f(x) \ge f(y) + a^*(y, x) - a^*(y, y)$$

holds for fixed $y$. Applying the functional $A$, we obtain

$$A(f) \ge f(y) + A(a^*(y, \cdot)) - a^*(y, y) \ge f(y). \qquad \Box$$

An abstract form of Slater's inequality is:

2.38. Theorem. Let $f$ be a convex function on $D$ such that $\partial f(y) \ne \emptyset$ for every $y \in D$, and let the functional $A$ be in $M(D, \mathbb{R})$. If there exists a point $x \in D$ with the property that for every $y \in D$ we have $a^*(y, \cdot) \in \partial f(y)$ such that $A(a^*(\cdot, x)) \ge A(a^*(\cdot, \cdot))$, then

$$f(x) \ge A(f). \tag{2.48}$$

Proof. For the fixed point $x$ and for every point $y \in D$ we have

$$f(x) \ge f(y) + a^*(y, x - y)$$

or

$$f(x) \ge f(y) + a^*(y, x) - a^*(y, y).$$

We apply the functional $A$ with respect to $y$ and obtain

$$f(x) \ge A(f) + A(a^*(\cdot, x)) - A(a^*(\cdot, \cdot)) \ge A(f). \qquad \Box$$

Theorems 2.39 and 2.41, which are consequences of Theorem 2.38, are also given in Pecaric and Andrica (1987):

2.39. Theorem. Let $X$ be a normed space and $D \subset X$ be an open convex set. Let the discrete functional $A \in M(D, \mathbb{R})$ be given by

$$A(f) = \sum_{k=1}^{n} p_k f(x_k), \quad x_k \in D, \quad k = 1, \dots, n.$$

If there exists a point $a \in D$ with the property

$$\sum_{k=1}^{n} p_k a^*(x_k, a - x_k) \ge 0,$$

then $f(a) \ge A(f)$.

2.40. Remark. Note that $D$ is an open convex set, so $\partial f(y) \ne \emptyset$ for every $y \in D$ (see Roberts and Varberg, 1973, p. 117, or Remark 1.32). □

2.41. Theorem. Let $p_j : X \to \mathbb{R}$ be the projection maps defined by $p_j(x_1, \dots, x_m) = x_j$, $j = 1, \dots, m$. Let $f$ be a convex function on the open convex set $D \subset X$. Then (2.49) holds, where $A \in M(D, \mathbb{R})$, $(A(p_1 h_1)/A(h_1), \dots, A(p_m h_m)/A(h_m)) \in D$, $h = (h_1, \dots, h_m) \in \partial f$, and $A(h_i) \ne 0$, $i = 1, \dots, m$.

We note that (a) the special case of the inequality in (2.49) for $m = 1$ appears in Gavrea (1984); (b) for the discrete functional we obtain the multidimensional generalization of the Slater inequality from Pecaric (1985b).

2.4. Higher-Order Jensen-Type Inequalities

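First we state the well-known Levinson inequality (1964) (see also Mitrinovic, 1970, p. 363):

2.42. Theorem. If the third derivative of $f$ exists and is nonnegative on $(0, 2a)$, then for $0 < x_k < a$, $p_k > 0$ $(1 \le k \le n)$ and $P_k = \sum_{i=1}^{k} p_i$ $(2 \le k \le n)$ we have

$$\frac{1}{P_n} \sum_{k=1}^{n} p_k f(x_k) - f\left(\frac{1}{P_n} \sum_{k=1}^{n} p_k x_k\right) \le \frac{1}{P_n} \sum_{k=1}^{n} p_k f(2a - x_k) - f\left(\frac{1}{P_n} \sum_{k=1}^{n} p_k (2a - x_k)\right). \tag{2.50}$$

If $f''' > 0$, then equality holds iff $x_1 = \cdots = x_n$.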
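As a rough numerical illustration of Levinson's inequality, the following sketch compares the Jensen gap at points $x_k \in (0, a]$ with the Jensen gap at the reflected points $2a - x_k$. The helper names `jensen_gap` and `levinson_holds`, the choice $a = 1/2$, and the random test data are assumptions made for this example; with $f = \log$ the comparison is the classical inequality of Fan.

```python
import math
import random

def jensen_gap(f, xs, ps):
    """(1/P_n) * sum p_k f(x_k) - f((1/P_n) * sum p_k x_k)."""
    P_n = sum(ps)
    mean = sum(p * x for x, p in zip(xs, ps)) / P_n
    return sum(p * f(x) for x, p in zip(xs, ps)) / P_n - f(mean)

def levinson_holds(f, xs, ps, a, tol=1e-12):
    """Compare the Jensen gap at the points x_k with the gap at 2a - x_k."""
    reflected = [2 * a - x for x in xs]
    return jensen_gap(f, xs, ps) <= jensen_gap(f, reflected, ps) + tol

random.seed(0)
a = 0.5
checks = []
for _ in range(1000):
    n = random.randint(2, 6)
    xs = [random.uniform(0.01, a) for _ in range(n)]
    ps = [random.uniform(0.1, 2.0) for _ in range(n)]
    checks.append(levinson_holds(math.log, xs, ps, a))          # f''' = 2/x^3 > 0
    checks.append(levinson_holds(lambda t: t ** 3, xs, ps, a))  # f''' = 6 > 0
print(all(checks))
```

A small tolerance is used because both sides are computed in floating point; the check is a sanity test on random data, not a proof.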

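Popoviciu (1965) pointed out that the natural condition on $f$ for the validity of (2.50) is that $f$ be 3-convex. Bullen (1973) proved the following generalization of Theorem 2.42.

2.43. Theorem. (a) Let $f$ be a real-valued 3-convex function on $[a, b]$ and $x_k, y_k$ $(1 \le k \le n)$ be $2n$ points on $[a, b]$ such that

$$\max(x_1, \dots, x_n) \le \min(y_1, \dots, y_n), \qquad x_1 + y_1 = \cdots = x_n + y_n. \tag{2.51}$$

Then, for $p_k > 0$ $(1 \le k \le n)$,

$$\frac{1}{P_n} \sum_{k=1}^{n} p_k f(x_k) - f\left(\frac{1}{P_n} \sum_{k=1}^{n} p_k x_k\right) \le \frac{1}{P_n} \sum_{k=1}^{n} p_k f(y_k) - f\left(\frac{1}{P_n} \sum_{k=1}^{n} p_k y_k\right). \tag{2.52}$$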

(b) If for a continuous function $f$ (2.52) holds for all $n$, all $2n$ (distinct) points satisfying (2.51), and all $p_k > 0$ $(1 \le k \le n)$, then $f$ is 3-convex.

In Pecaric (1980c) it is shown that the condition (2.51) can be weakened, i.e., the following result holds:

2.44. Theorem. Let $f$ be a 3-convex function on $[a, b]$, $p_i > 0$ $(1 \le i \le n)$, and let $x_k, y_k$ $(1 \le k \le n)$ be points in $[a, b]$ such that

$$x_1 + y_1 = \cdots = x_n + y_n = 2c, \tag{2.53}$$



and

$$x_i + x_{n-i+1} \le 2c, \qquad \frac{p_i x_i + p_{n-i+1} x_{n-i+1}}{p_i + p_{n-i+1}} \le c \quad \text{for } 1 \le i \le n. \tag{2.54}$$

Then (2.52) is valid.

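2.45. Remarks. (a) Let $p_i = 1$ $(1 \le i \le n)$. Then Theorem 2.44 yields

$$\frac{1}{n} \sum_{i=1}^{n} f(x_i) - f\left(\frac{1}{n} \sum_{i=1}^{n} x_i\right) \le \frac{1}{n} \sum_{i=1}^{n} f(y_i) - f\left(\frac{1}{n} \sum_{i=1}^{n} y_i\right), \tag{2.55}$$

provided that (2.53) and

$$x_i + x_{n-i+1} \le 2c \quad \text{for } 1 \le i \le n \tag{2.56}$$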

are valid. This result was obtained by Lawrence and Segalman (1972) for 3-convex functions on $(0, 2c)$ where $x_1 \le \cdots \le x_n$. Note also that inequality (2.55) is a simple consequence of Theorem 2.15; i.e., (2.55) is valid if (2.56) holds and $f : (0, 2c) \to \mathbb{R}$ is such that the function $f(x) - f(2c - x)$ is a Wright-concave function. A shorter proof of Lawrence and Segalman's result for a wider class of functions was obtained by Pecaric (1989-1990); the functions Pecaric considers satisfy $\Delta_k^2 \Delta_h f(x) \ge 0$ for all $x \in (0, 2a)$, $h > 0$, and $k > 0$ such that $x + h + 2k < 2a$.

(b) Levinson's inequality is a generalization of an inequality of Fan (see Beckenbach and Bellman, 1961, 1965, p. 5; Mitrinovic, 1970, p. 363; and Hardy, Littlewood, and Pólya, 1934, 1952, pp. 281-82).

(c) For some other generalizations of Levinson's inequality, see Gavrea and Ivan (1981), Gavrea (1985), and Gavrea and Gurzau (1987).

(d) Note that from Theorems 2.42 and 2.43 it follows that if $f : [0, 2a] \to \mathbb{R}$ is 3-convex, then the function $f(2a - x) - f(x)$ is convex. By using this fact we can obtain many similar inequalities which are related to Levinson's result (Vasic and Janic, 1970a; Vasic and Lackovic, 1976; Vasic and Stankovic, 1973) directly from the known inequalities for convex functions. □

The following two results are generalizations of results in Vasic and Stankovic (1973) and Vasic and Janic (1970a). These results cannot be obtained from the method in Remark 2.45(d) (see Pecaric and Janic, 1986), and are Petrovic-type inequalities. For more results of this type, see Section 5.2.



Let f be a 3-convex function on [0, 2a]. If Xi' Pi (1 -s i :s n) are positive real numbers and Xo is a nonnegative number such that Xo, f.Z=OPkXk E [0, a],

2.46. Theorem.

f

(Xi- XO)( PkXk - Xi) > k=l

°

i = 1, ... , n,

for

(2.57)

and n

XO+LPkxk:S2a and k=O

xi:Sa

for

i=I, ... ,n;

(2.58)

then the following inequality holds:

~ p J ( X i- ) ~ P J ( 2 -Xi) a ~ A ( r ( ~ P i X- f(2a i) - ~PiXi)) + B(f(xo) - f(2a - xo»,

(2.59)

where

A= (i p.x, - Xof Pi)/(f p.x, - xo), ,=0

, ~ o

, ~ o

For Xo = 0, from Theorem 2.46 we obtain: Let f be defined as in Theorem 2.46 and Xi' Pi (1 :s i:s n) be positive real numbers. If either

2.47. Corollary.

n

Xi:s

L PkXk :s a

for

i = 1, ... , n

(2.60)

k ~ l

or n

Xi:s a :s

L PkXk :s 2a k=l

for

i

= 1, ...

(2.61)

,n

holds, then

~ lp d(Xk) -

ktl pd(2a - Xk)

~ f ( ~ lPkXk) -

- (1-

f(2a -

~ lPkXk)

~ / k) ([(0) -

f(2a».

(2.62)

Vasic and Stankovic (1973) proved the following result:

2.48. Theorem. If $f : [0, 2a] \to \mathbb{R}$ is a 3-convex function, then the function $(f(x) - f(2a - x))/(x - a)$ is convex on $[0, a]$. Thus for $x_i \in [0, a]$ $(i = 1, \dots, n)$ and $P_n = 1$, Jensen's inequality yields

$$\sum_{i=1}^{n} p_i \frac{f(x_i) - f(2a - x_i)}{x_i - a} \ge \left( f\left(\sum_{i=1}^{n} p_i x_i\right) - f\left(2a - \sum_{i=1}^{n} p_i x_i\right) \right) \Big/ \left( \sum_{i=1}^{n} p_i x_i - a \right). \tag{2.63}$$

Another result similar to Levinson's inequality is given in Pecaric (1982c):

2.49. Theorem. (a) Let $f$ be a real-valued 3-convex function on $[a, b]$ and $x_k, y_k$ $(1 \le k \le n)$ be $2n$ points on $[a, b]$ such that

$$y_1 - x_1 = y_2 - x_2 = \cdots = y_n - x_n > 0, \tag{2.64}$$

and let $p_k > 0$ $(1 \le k \le n)$. Then (2.52) holds. If $f$ is strictly 3-convex, then equality in (2.52) holds iff $x_1 = \cdots = x_n$.

(b) If for a continuous function $f$ (strict) inequality in (2.52) holds for all $n$, all $2n$ (distinct) points satisfying (2.64), and all $p_k > 0$ $(1 \le k \le n)$, then $f$ is (strictly) 3-convex.

2.50. Remark. Pecaric (1982c) noted that Theorem 2.49 is equivalent to the fact that the function $f(x + s) - f(x)$ $(0 < s < b - a)$ is convex on $[a, b - s]$. □

A generalization of Theorem 2.49 is given in Zwick (1984):

2.51. Theorem. Let $f$ be $(n + 2)$-convex on $(a, b)$. Then the function

$$G(x) = [x, x + h_1, \dots, x + h_n] f$$

(see (1.36) for definition) is a convex function of $x$ for all $x$ and all $h_1, \dots, h_n$ such that $x + h_i \in (a, b)$ $(i = 1, \dots, n)$. Therefore for $p_i > 0$ and $x_i \in I$ $(i = 1, \dots, m)$ ($I$ is the domain of $G$) Jensen's inequality yields

$$\frac{1}{P_m} \sum_{i=1}^{m} p_i [x_i, x_i + h_1, \dots, x_i + h_n] f \ge [\bar{x}, \bar{x} + h_1, \dots, \bar{x} + h_n] f, \tag{2.65}$$

where $\bar{x} = (1/P_m) \sum_{i=1}^{m} p_i x_i$.



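Proof. Let $w_j = p_j / P_m$ $(j = 1, \dots, m)$. Since $f^{(n)}$ is a convex function, Jensen's inequality yields

$$f^{(n)}\left(\sum_{i=2}^{n} t_i (h_i - h_{i-1}) + t_1 h_1 + \sum_{j=1}^{m} w_j x_j\right) \le \sum_{j=1}^{m} w_j f^{(n)}\left(\sum_{i=2}^{n} t_i (h_i - h_{i-1}) + t_1 h_1 + x_j\right).$$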

By using the integral representation of divided differences in (1.41) we obtain (2.65). □

An additional generalization is given in Farwig and Zwick (1985):

2.52. Theorem. Let $f$ be defined as in Theorem 2.51. Then

$$G(x) = [x_0, \dots, x_n] f \tag{2.66}$$

is a convex function of the vector $x = (x_0, \dots, x_n)$. Consequently, (2.67) holds for all $a_i \ge 0$ such that $\sum_i a_i = 1$, which generalizes the inequality in (2.65).

2.53. Remark. We can easily show that Theorem 2.52 is also a generalization of Theorem 2.48, i.e., the function given in Theorem 2.48 is convex on $[0, 2a]$. The same result can also be proved by using the idea in Vasic and Stankovic (1973). □

The next theorem concerns the notions of majorization and Schur-convex functions. Let $x = (x_0, \dots, x_n)$ and $y = (y_0, \dots, y_n)$ denote two real $(n + 1)$-tuples. $x$ is said to majorize $y$ (in symbols, $x \succ y$) if

$$\sum_{i=0}^{k} x_{[i]} \ge \sum_{i=0}^{k} y_{[i]} \quad \text{for } k = 0, 1, \dots, n - 1, \qquad \sum_{i=0}^{n} x_i = \sum_{i=0}^{n} y_i,$$

where $x_{[0]} \ge x_{[1]} \ge \cdots \ge x_{[n]}$ and $y_{[0]} \ge y_{[1]} \ge \cdots \ge y_{[n]}$ are the decreasingly ordered components of $x$ and $y$. A function $g : \mathbb{R}^{n+1} \to \mathbb{R}$ is said to be Schur-convex if $x \succ y$ implies $g(x) \ge g(y)$. It is clear that all Schur-convex functions are permutation symmetric. The concepts of majorization and Schur-convex functions deal with the diversity of the components of a vector and related inequalities and will be treated comprehensively in Chapter 12. Since the divided difference is a permutation symmetric function, the following theorem (Pecaric and Zwick, 1989) follows from Theorem 2.52 and a result on majorization inequalities:

2.54. Theorem. Let $f$ be an $(n + 2)$-convex function on $(a, b)$. If $x, y \in (a, b)^{n+1}$ and $x \succ y$, then

$$[x_0, \dots, x_n] f \ge [y_0, \dots, y_n] f; \tag{2.68}$$

i.e., the function $G$ defined in (2.66) is Schur-convex.
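Theorem 2.54 can be illustrated with a small numerical experiment. The sketch below uses the recursive definition of divided differences for pairwise distinct nodes and the standard definition of majorization (dominating ordered partial sums together with equal totals, which is assumed here). The helper names and the sample vectors are illustrative. The function $f = \exp$ has nonnegative derivatives of every order, so it is $(n+2)$-convex for each $n$.

```python
import math

def divided_difference(f, nodes):
    """[x_0, ..., x_n]f for pairwise distinct nodes (recursive definition)."""
    if len(nodes) == 1:
        return f(nodes[0])
    upper = divided_difference(f, nodes[1:])
    lower = divided_difference(f, nodes[:-1])
    return (upper - lower) / (nodes[-1] - nodes[0])

def majorizes(x, y, tol=1e-12):
    """True if x majorizes y: ordered partial sums dominate and totals agree."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    if len(xs) != len(ys) or abs(sum(xs) - sum(ys)) > tol:
        return False
    sx = sy = 0.0
    for u, v in zip(xs[:-1], ys[:-1]):
        sx += u
        sy += v
        if sx < sy - tol:
            return False
    return True

x = [0.0, 1.0, 3.0]      # more spread out
y = [0.5, 1.3, 2.2]      # same total, less spread out
G_x = divided_difference(math.exp, x)
G_y = divided_difference(math.exp, y)
print(majorizes(x, y), G_x >= G_y)
```

Here `x` majorizes `y`, and the divided difference of `exp` at `x` is the larger of the two values, in line with (2.68).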

2.55. Example. The complete permutation symmetric functions

$$C_k(x) = \sum_{0 \le i_1 \le \cdots \le i_k \le n} x_{i_1} \cdots x_{i_k}, \quad k = 1, \dots, n, \qquad C_0(x) \equiv 1,$$

can be generated by taking divided differences of monomials:

$$C_k(x) = [x_0, \dots, x_n] t^{n+k}, \quad k \ge 0.$$

Since $t^{n+k}$ is $(n + 2)$-convex for $t \ge 0$ and $k \ge 0$, the functions $C_k$ are Schur-convex provided that $x_i \ge 0$ $(i = 0, \dots, n)$. □
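The identity $C_k(x) = [x_0, \dots, x_n] t^{n+k}$ in Example 2.55 can be verified exactly on a small case. The sketch below is only a check under illustrative choices: the nodes $1, 2, 3$ (so $n = 2$), exact rational arithmetic via `Fraction`, and the helper names `divided_difference` and `complete_symmetric`.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod

def divided_difference(f, nodes):
    """[x_0, ..., x_n]f for pairwise distinct nodes (recursive definition)."""
    if len(nodes) == 1:
        return f(nodes[0])
    upper = divided_difference(f, nodes[1:])
    lower = divided_difference(f, nodes[:-1])
    return (upper - lower) / (nodes[-1] - nodes[0])

def complete_symmetric(k, xs):
    """C_k(x): sum of all degree-k monomials in the entries of xs (C_0 = 1)."""
    return sum(prod(c) for c in combinations_with_replacement(xs, k))

xs = [Fraction(1), Fraction(2), Fraction(3)]   # x_0, x_1, x_2, so n = 2
n = len(xs) - 1
results = []
for k in range(4):
    direct = complete_symmetric(k, xs)
    via_dd = divided_difference(lambda t, e=n + k: t ** e, xs)
    results.append((k, direct, via_dd))
    print(k, direct, via_dd)
```

Both columns should agree for every $k$; for instance, for $k = 2$ the six monomials $1, 2, 3, 4, 6, 9$ sum to $25$, which is also the second-order divided difference of $t^4$ at $1, 2, 3$.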

2.56. Example. Let $f(t) = 1/(1 - at)$. Then

$$\phi(x) = [x_0, \dots, x_n] f = a^n \Big/ \prod_{i=0}^{n} (1 - a x_i).$$

Since $f^{(n+2)}(t) = (n + 2)!\, a^{n+2}/(1 - at)^{n+3}$, it follows that $\phi$ is Schur-convex provided that $n$ is even and $a x_i < 1$ $(i = 0, \dots, n)$, or $n$ is odd and either $a x_i > 1$ $(i = 0, \dots, n)$ or $a x_i < 1$ $(i = 0, \dots, n)$ with $a > 0$. □

The following result (Farwig and Zwick, 1985) is a consequence of Theorem 2.54.

2.57. Theorem. Let $f^{(n)}$ be convex in $(a, b)$, $a \le x_0 \le \cdots \le x_n \le b$, and define $z = (1/(n + 1)) \sum_{i=0}^{n} x_i$. Then

$$\frac{f^{(n)}(z)}{n!} = \underbrace{[z, \dots, z]}_{n+1 \text{ times}} f \le [x_0, \dots, x_n] f. \tag{2.69}$$

If $x_0 \ne x_n$, then equality in (2.69) holds iff $f \in \Pi_{n+1}$, where $\Pi_{n+1}$ denotes the class of polynomials of degree at most $n + 1$.


2.58. Example. If $f \in C^{n-1}[a, b]$, then we can show that

$$(b - a)^n \underbrace{[a, \dots, a}_{n \text{ times}}, b] f = f(b) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!} (b - a)^k,$$

which is the error term associated with the Taylor series expansion of $f$ at $a$. If $f^{(n)}$ is convex, then it follows from (2.69) that $f^{(n)}(z) \le f^{(n)}(\xi)$, where $\xi$ is the so-called "intermediate point" in Taylor's theorem and $z = (na + b)/(n + 1)$. Normally $\xi$ may lie anywhere in $(a, b)$, but in this special case a further restriction on its location is possible: If f(n)(x) 0 (i = 1, ... , n).

The following theorem (Mitrinovic and Pecaric, 1987a) gives a result on Jensen's inequality via the monotonicity property of certain functions:

3.22. Theorem. Let $f$ be a convex function defined on a convex set $U \subset M$ ($M$ is an arbitrary real linear space). Let the function $g$ be defined by

$$g(x) = \sum_{i=1}^{n} \frac{1}{q_i} f\left(q_i x A_i + (r - x) \sum_{k=1}^{n} A_k\right),$$

where $q_i > 0$ $(i = 1, \dots, n)$ with $\sum_{k=1}^{n} (1/q_k) = 1$, $r \in \mathbb{R}$, and $q_i x A_i + (r - x) \sum_{k=1}^{n} A_k \in U$ $(i = 1, \dots, n)$ for all $x$ in an interval $I$ $(I \subseteq \mathbb{R})$. If $|x| \le |y|$ $(xy > 0,\ y \in I)$, then

$$g(x) \le g(y). \tag{3.25}$$

Proof. Let $s_i^j \in [0, 1]$ $(i, j = 1, \dots, n)$ with $\sum_{i=1}^{n} s_i^j = 1$ $(j = 1, \dots, n)$. By (3.13) we have (3.26). Using the substitutions

$$a_i = q_i y A_i + (r - y) \sum_{k=1}^{n} A_k, \qquad s_i^j = \frac{1}{q_i}\left(1 - \frac{x}{y}\right) \ (i \ne j), \qquad s_i^i = \frac{1}{q_i}\left(1 - \frac{x}{y}\right) + \frac{x}{y},$$

we obtain (3.25) from (3.26). □
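The monotonicity asserted in Theorem 3.22 is easy to observe numerically in the scalar case. The following sketch assumes $M = \mathbb{R}$, $f = \exp$, and hypothetical values of $q_i$, $A_i$, and $r$ chosen only so that $\sum 1/q_i = 1$; it evaluates $g$ on a grid of positive $x$ and checks that the values do not decrease as $|x|$ grows.

```python
import math

def g(x, f, q, A, r):
    """g(x) = sum_i (1/q_i) f(q_i x A_i + (r - x) sum_k A_k), scalar case."""
    total = sum(A)
    return sum(f(qi * x * Ai + (r - x) * total) / qi for qi, Ai in zip(q, A))

q = [2.0, 3.0, 6.0]          # 1/2 + 1/3 + 1/6 = 1
A = [1.0, -0.5, 2.0]         # illustrative scalars A_i
r = 1.0

grid = [0.1 * j for j in range(1, 21)]               # 0 < x <= 2
values = [g(x, math.exp, q, A, r) for x in grid]
nondecreasing = all(u <= v + 1e-12 for u, v in zip(values, values[1:]))
print(nondecreasing)
```

The reason is visible from the formula: the arguments of $f$ have the fixed weighted mean $r \sum A_k$, and their spread around that mean scales with $x$, so a convex $f$ can only produce a larger weighted average as $|x|$ increases.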

3.2. Refinements of Inequalities

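3.23. Remarks. (a) The function $g$ in Theorem 3.22 is also convex. Indeed, for $\lambda \in [0, 1]$ and $\bar{\lambda} = 1 - \lambda$ we have

$$\lambda g(x) + \bar{\lambda} g(y) = \sum_{i=1}^{n} \frac{1}{q_i} \left( \lambda f\left(q_i x A_i + (r - x) \sum_{k=1}^{n} A_k\right) + \bar{\lambda} f\left(q_i y A_i + (r - y) \sum_{k=1}^{n} A_k\right) \right) \ge g(\lambda x + \bar{\lambda} y).$$

(b) Using the substitutions $1/q_i \to w_i$, $q_i A_i \to x_i$, $r = 1$, we may conclude that (3.25) is also valid if

$$g(x) = \sum_{i=1}^{n} w_i f\left(x x_i + (1 - x) \sum_{k=1}^{n} w_k x_k\right),$$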

where $w_i > 0$ $(i = 1, \dots, n)$ are such that $\sum_{k=1}^{n} w_k = 1$ and $x x_i + (1 - x) \sum_{k=1}^{n} w_k x_k \in U$ $(i = 1, \dots, n)$ for all $x$ in an interval $I$ $(I \subseteq \mathbb{R})$. □

Note that for one-variable convex functions the result in Remark 3.23(b) can be obtained by using Fuchs' generalization of a theorem on majorization. Moreover, in this case we can obtain similar results for Jensen-Steffensen's inequality.

The above results give refinements of related discrete inequalities. Next we give some similar results for linear isotonic functionals. Let $E$ be a nonempty set, $\mathcal{A}$ be an algebra of subsets of $E$, and $L$ be a linear class of real-valued functions $g : E \to \mathbb{R}$ having the properties

L1: $f, g \in L \Rightarrow (af + bg) \in L$ for all $a, b \in \mathbb{R}$; (3.27)

L2: $1 \in L$, that is, if $f(t) = 1$ for $t \in E$, then $f \in L$; (3.28)

L3: $f \in L$, $E_1 \in \mathcal{A} \Rightarrow f C_{E_1} \in L$, (3.29)

where $C_{E_1}$ is the indicator function of $E_1$ (i.e., $C_{E_1}(t) = 1$ for $t \in E_1$, and $0$ for $t \in E \setminus E_1$). It follows from L2, L3 that $C_{E_1} \in L$ for $E_1 \in \mathcal{A}$. We also consider isotonic linear functionals $A : L \to \mathbb{R}$ by assuming:

A1: $A(af + bg) = aA(f) + bA(g)$ for $f, g \in L$, $a, b \in \mathbb{R}$; (3.30)

A2: $f \in L$, $f(t) \ge 0$ on $E \Rightarrow A(f) \ge 0$ ($A$ is isotonic). (3.31)

Furthermore, we make use of the fact that if $L$ also satisfies L3, then for every $E_1 \in \mathcal{A}$ such that $A(C_{E_1}) > 0$, the functional $A_1$ defined for all $g \in L$ by $A_1(g) = A(g C_{E_1})/A(C_{E_1})$ is an isotonic linear functional with $A_1(1) = 1$. We observe that

$$A(g) = A(g C_{E_1}) + A(g C_{E \setminus E_1}). \tag{3.32}$$

3. Reversals, Refinements, and Converses

3.24. Theorem. Let L satisfy properties L1, L2, and L3 on a nonempty set E, and assume that
0, then for every gEL such that 1> (g) E L we have

3.26. Theorem.

$$F[A(\phi(g)), \phi(A(g))] \ge \inf_{x \in I} F[A(\phi(g_{E_1, x})), \phi(A(g_{E_1, x}))], \tag{3.36}$$

where $g_{E_1, x}(t) = g(t) C_{E_1}(t) + x C_{E \setminus E_1}(t)$.

Proof. For brevity, let $E_2 = E \setminus E_1$ and assume $A(C_{E_2}) > 0$. Clearly we have

$$A(g) = A(g C_{E_1}) + A(g C_{E_2})$$

and

$$\phi(A(g)) = \phi(A(g C_{E_1}) + a z),$$

where

$$a = A(C_{E_2}), \qquad z = A(g C_{E_2})/A(C_{E_2}).$$

In addition,

$$A(\phi(g)) = A(\phi(g) C_{E_1} + \phi(g) C_{E_2}) = A(\phi(g) C_{E_1}) + A(\phi(g) C_{E_2}) \ge A(\phi(g) C_{E_1}) + a \phi(z)$$

holds from the remark concerning (3.31) with $E_1$ replaced by $E_2$. Now, $z \in I$ holds because if $I = [\alpha, \beta]$, then $\alpha \le g(t) \le \beta$ for $t \in E$ since $\phi(g)$ is in $L$ (hence it is defined). Thus $\alpha C_{E_2}(t) \le g(t) C_{E_2}(t) \le \beta C_{E_2}(t)$ for all $t \in E$, whence $\alpha A(C_{E_2}) \le A(g C_{E_2}) \le \beta A(C_{E_2})$; thus we have $\alpha \le z \le \beta$. A simple modification of the argument shows that $z \in I$ if either $\alpha = -\infty$ or $\beta = \infty$. It then follows from the above inequality and the increasing property of $F(\cdot, y)$ that

$$F[A(\phi(g)), \phi(A(g))] \ge F[A(\phi(g) C_{E_1}) + a \phi(z), \phi(A(g C_{E_1}) + a z)] \ge \inf_{x \in I} F[A(\phi(g) C_{E_1}) + a \phi(x), \phi(A(g C_{E_1}) + a x)] = \inf_{x \in I} F[A(\phi(g_{E_1, x})), \phi(A(g_{E_1, x}))],$$

since

$$A(g_{E_1, x}) = A(g C_{E_1}) + x A(C_{E_2}) = A(g C_{E_1}) + a x,$$

$$\phi(g_{E_1, x})(t) = \phi(g(t) C_{E_1}(t) + x C_{E_2}(t)) = \phi(g(t)) C_{E_1}(t) + \phi(x) C_{E_2}(t). \qquad \Box$$

3.27. Remark. Clearly, there are many variations and generalizations of Theorem 3.14 which have essentially the same proof. For example, if $F(\cdot, y)$ is decreasing for each $y \in J$, then in place of (3.36) we have

$$F[A(\phi(g)), \phi(A(g))] \le \sup_{x \in I} F[A(\phi(g_{E_1, x})), \phi(A(g_{E_1, x}))]. \tag{3.36'}$$

This also follows from (3.36) simply by replacing $F$ with $F_1 = -F$. □

For another (more extensive) generalization, suppose $E_i \in \mathcal{A}$ for $1 \le i \le n$ with $E_i \cap E_j = \emptyset$ $(i \ne j)$ and $E = \bigcup_{i=1}^{n} E_i$. By letting $a_i = A(C_{E_i})$, $z_i = A(g C_{E_i})/A(C_{E_i})$ (where we assume all $a_i > 0$), we find under the hypotheses of Theorem 3.14 that if $x = (x_1, \dots, x_n)$, then

$$F[A(\phi(g)), \phi(A(g))] \ge \inf_{x \in I^n} F\left[\sum_{i=1}^{n} a_i \phi(x_i), \phi\left(\sum_{i=1}^{n} a_i x_i\right)\right]. \tag{3.37}$$

If we let

$$g_{E_1, \dots, E_n, x}(t) = \sum_{i=1}^{n} x_i C_{E_i}(t),$$

then the right-hand side of (3.16) can be written as

$$\inf_{x \in I^n} F[A(\phi(g_{E_1, \dots, E_n, x})), \phi(A(g_{E_1, \dots, E_n, x}))].$$

Note, however, that this value is independent of the function $g$ and it provides a lower bound for the left-hand side of (3.37) which is valid for all admissible $g \in L$. Similarly, if $F(\cdot, y)$ is decreasing, then instead of (3.37) we have

$$F[A(\phi(g)), \phi(A(g))] \le \sup_{x \in I^n} F\left[\sum_{i=1}^{n} a_i \phi(x_i), \phi\left(\sum_{i=1}^{n} a_i x_i\right)\right]. \tag{3.37'}$$

There are other variations which follow immediately from (3.36) and (3.36'). For example, if 1:5 m < n, and if we let Xm = (Xm+l , .•. , xn ) and m

GEt .....En.Xm(t) = 2: g(t)Ce,(t) + ;=1

n

2:

j=m+l

XjCE/t),


then under the hypotheses of Theorem 3.26 we can obtain

F[A(Ij>(g», Ij>(A(g»]:s;

sup F[A(Ij>(gEt .....En.xJ), Ij>(A(gEt .....En.xJ)].

xmEln-m

(3.38) Finally, we observe that in (3.36), (3.37), or (3.38) the lower bounds on the right-hand sides depend on the subsets E, c E (with E, E .sIl), and a possible sharper bound (hence a better result) might be obtained by allowing the sets E, to vary. For example, by (3.36) we have

F[A( Ij>(g», Ij>(A(g))]

2:

sup {inf F[A( lj>(gEt.X», Ij>(A(gEt. X»]}, (3.39) EtEstlt XEI

3.28. Remark. Inequality (3.36) is a generalization of the well-known inequality of Beckenbach (1966) (see also Mitrinovic, 1970, p. 52, and Bullen, Mitrinovic, and Vasic, 1987, pp. 156-57); i.e., for a discrete D functional A we get Beckenbach's inequality. Theorem 3.26 is related to Jensen's inequality. A generalization of Jensen's inequality for convex functions of several variables was given by McShane (1937) (see Theorem 2.10). The following theorem is a similar generalization of Theorem 2.6.

3.29. Theorem. Let Ij> be a continuous convex function on a closed, convex set U c IR n. Let L satisfy properties L1, L2, L3 on a nonempty set E, and let A: L I~R be an isotonic linear functional with A(l) = 1. Let L={G=(gl, ,gn):giEL for l:s;i:S;n}, and define A : L ~ l R by n A(G) = (A(gl), , A(gn»' Then A is a linear operator on the linear class L. Let J be an interval such that Ij> (U) c J and F: J2 ~IR be an increasing function of its first variable. If E 1 E.sIl such that A(CE\EJ > 0 then, for every GEL such that Ij>( G) E L, we have F[A(Ij>(G», Ij>(A(G))] 2: inf F[A(Ij>(G Et.x», Ij>(A(G E t . x » ], XEU

Proof. The proof is similar to that of Theorem 3.26, thus we merely outline the differences. Let E 2 = E \ E 1 and a=A(C E,). Then Ij>(A(G» = Ij>(A(GCEJ + az),

z = A(GCE2)/A(CEz)'

A(Ij>(G» =A(Ij>(G)CEJ +A(Ij>(G)C E,) 2:A(Ij>(G)C EJ + alj>(z),



because Al(g) = A(gC EJ/A (C EJ is an isotonic linear functional on L with A 1(1) = 1. Consequently, McShane's inequality applies to the operator AI: L ~IR n defined by A 1(G) = (A 1(gl), ... ,A1(gn» = A(GCEz)/A(CEJ Moreover, by McShane's result, we have z = A 1(G) E U. The result of the proof remains unchanged except for obvious notational modifications. 0 Finally, we give a refinement of Jensen's inequality that is similar to (3.24). We shall use the notation kn = fk,n(x1' ... , xn) =

G)

-1

fG

15il 0 for all x E I, or (if) @(x) > 0 for m < x < M with either @(m)= 0 , @'(m)# 0, or @ ( M )= 0, @ ' ( M )# 0, or (ii) @(x) < 0 for all x E I , or (ii') @(x) < 0 for rn < x < M with precisely one of @(m)= 0, @ ( M )= 0. Then for all g E L such that @(g)E L (so that m 5 g(t) 5 M for all t E E ) ,

A(@(g)5 ) A@(A(g))

(3.45)

holds for some d > 1 in cases (i), (i') or d E (0, 1) in cases (ii), (ii'). More precisely, a value of A (depending only on m, M , @) for (3.45) may be determined as follows: Define p = ( @ ( M ) - @ ( m ) ) / ( M- m). If p = 0, let x = 1 be the unique solution of the equation @'(x) = 0 (m 1 holds if @(x) < 0 on (m, M ) and 0 < A < 1 if @(x) > 0 on (m, M).

Proof. (a) As in MitrinoviC and VasiC (1975) and Beesack (1983), we consider the points B(m, @(m)),C(M, @ ( M ) ) on the convex curve y = @(x). The equation of the chord BC is y

+ p(x

= @(m)

- m ) = h(x).

By Theorem 3.37, we obtain A ( @ ( g ) ) % h ( A ( g ) )If. we consider the family of convex curves with equations y = d@(x) (d > 0), we show as in MitrinoviC and VasiC (1975) that there is a unique d > 0 which satisfies the conditions stated in the lemma, such that the curve will be tangent to the chord BC. Thus we have



By elimination of

A we obtain

+

G(x) = P@(x) - @ r ( ~ ) ( ~Y) = 0.

The solution of the above equation is the abscissa (call it xo) of the contact point, where xo E (m, M ) . First, we have G ’ ( x )= -(px + Y)@”(x) < 0, i.e., the graph of G can cut the x-axis in at most one point of (m, M ) . Furthermore,

G(m)G(M)= @ ( m ) @ ( M ) ( @ ’ ( m-)P ) ( @ r ( M )- P> holds and, by the mean value theorem and the fact that @ r is an increasing function (since $”(x) > 0), we have G(m)G(M)5 0. Thus the assertion is true. Consequently, h ( y )IA@(y) holds for all y E I , and taking y = A ( g ) yields

A(@k)) Ih ( A ( g ) )5 A@(A(t?)h which establishes (3.45). (b) This follows from (a) by applying to the convex function

= -@.

0 3.40. Remarks.

(a) It is clear that the last inequality in the proof of Theorem 3.39 constitutes a refinement of (3.45). (b) Equality in (3.45) is valid iff the point of contact of the segment BC and the curve y = A@@) coincides with A ( g ) . In the case of a discrete positive functional with n

AV) = C p i f ( x i ) (pn i=l

= 1,pi

’01,

equality holds iff there exist two subsequences ( x i , , . . . , xJ and (xikc,,. . . , x i , ) of the sequence ( x l , . . . , x,) such that every element of the first subsequence is equal to m, every element of the second subsequence is equal to M , and

+

xo= MB m ( 1 -

01,

k

@(xo)= @(M)B

+ + ( m > ( l - B),

where xo is a unique solution of the equation G(x)

=0

6=

C pi,,

r=l

on (m, M ) .

0

Similarly, the following theorem can be obtained:

3.41. Theorem. (a) Let $L$, $A$, and $g$ be as in Theorem 3.39, and let $\phi$ be a differentiable function on $I = [m, M]$ such that $\phi'(x)$ exists and is strictly increasing on $I$. Then we have

$$A(\phi(g)) \le \lambda + \phi(A(g)) \tag{3.46}$$

for some $\lambda$ satisfying $0 < \lambda < (M - m)\{\mu - \phi'(m)\}$, where $\mu = (\phi(M) - \phi(m))/(M - m)$. More precisely, $\lambda$ may be determined as follows: Let $x = \bar{x}$ be the unique solution of the equation $\phi'(x) = \mu$ $(m < \bar{x} < M)$; then

$$\lambda = \phi(m) - \phi(\bar{x}) + \mu(\bar{x} - m)$$

satisfies (3.46).

(b) Let all the hypotheses of (a) be satisfied except that $\phi'(x)$ is strictly decreasing in $I$. Then

$$\phi(A(g)) \le \lambda + A(\phi(g)),$$

where $0 < \lambda < (M - m)\{\phi'(m) - \mu\}$ with $\mu$ defined as in (a). In fact, we have $\lambda = \phi(\bar{x}) - \phi(m) - \mu(\bar{x} - m)$ with $\bar{x}$ given in (a).

A further generalization of the above results is given in Pecaric and Beesack (1987a):

3.42. Theorem. Let $J$ be an interval such that $J \supset \phi(I)$. If $F(u, v)$ is a real-valued function defined on $J \times J$ and increasing in $u$, then

(= max

F[O@(m) + (1 - O ) @ ( M ) ,#(Om + (1 - O ) M ) ] ) . (3.47)

ec[m,Ml

Furthermore, the right-hand side of (3.47) is an increasing function of M and a decreasing function of m.

Proof. By (3.43) and the increasing property of F ( - ,y ) , we have



Thus the first part of (3.46) follows. As in Remark 3.38(a) we have, for m % x and m < M‘ c: M ,

{ ( M - x)Cp(m)+ ( x - m ) @ ( M ) } / ( M- m ) 2 { ( M ’ - x ) # ( m ) + (x - m)Cp(M‘)}/(M’ - m). Hence, by the increasing property of F ( . , y ) ,

d ( x ; m , M , C p ) ? d ( x ; m , M ’ , Cp),

msx,

m<M’<M.

(3.48)

By (3.48) and the fact [m,M ’ ] c [m,M I , it follows that max d ( x ; m, M, Cp)

2

xclm,Ml

max d ( x ; m, M’, Cp) 2 max d ( x ; m, M ’ , Cp). xe(rn,M ‘ ]

xc[m,MI

Similarly, we can show that max d ( x ; m, M , Cp) xc[m,M]

5

max d ( x ; m’, M , Cp)

for m’ 5 m < M .

x e [ m ’ ,M ]

Finally, the identity in (3.47) follows immediately from the change of variable 8 = ( M - x ) / ( M - .m>, so that x = Om (1 - 8 ) M with 05851. 0

+

In a similar fashion (simply by replacing F by -F in the above theorem) we can prove 3.42’. Theorem. Under the same hypotheses as in Theorem 3.37 except that F si decreasing in its first variable, we have

F[A(Cp(g)), $ w g ) ) l L

min

0; m, M , $1

x~[m,Ml

(= min F[eCp(m)+ (1

-

8elO.ll

8 ) @ ( M ) ,@(Om+ (1- e ) M ) ] ) . (3.47’)

Furthermore, the right-hand side of (3.47’) is a decreasing function of M and an increasing function of m. 3.43. Remark. Theorem 3.42 is a generalization of a result of Knopp (1935, Satz 1). For an equivalent form of Knopp’s result see Popoviciu (1944, p. 34). We now show that Theorems 3.39 and 3.41 are simple consequences of Theorems 3.42 and 3.42’. We first establish Theorem 3.38 by applying Theorems 3.42 and 3.42’. For case (i) we apply Theorem 3.42 and for case (ii) we apply Theorem 3.42’, both with F ( x , y ) = x / y , and


J = (0, 00). We proceed only with case (i) since the proof for case (ii) is similar. In case (i) the inequality in (3.47) becomes

where

f

( x ) -f(x; m, M ,

@I= { ( M - x ) @ ( m +) ( x - m ) @ ( M ) ) / ( M- m ) @ ( x ) .

Clearly we have f’(x) = G ( x ) / @ ( x ) * where G ( x )= p @ ( x )- ($(m) + p(x - m ) ) @ ’ ( x ) )The . equation G ( x )= 0 (see proof of Theorem 3.38 has exactly one solution. Thus G ( x )= 0 holds for a unique x = Z(m, M). Since @ is convex and positive, it follows that f ( x ) L 1, with equality for x = m and x = M . Hence the maximum value on the right-hand side of (3.49) is attained at x = i. Next we prove Theorem 3.39 by applying Theorems 3.42 and 3.42‘. In Theorem 3.42 take F ( x , y ) = x - y. Then (3.46) becomes

where

Y ( x )= Y ( x ;m, M , @) = { ( M - x ) @ ( m )

+ (x -m)@(M))/(M - m ) - @(XI. Then Y ’ ( x )= p - @ ’ ( x ) is strictly decreasing on Z with Y ’ ( i )= 0 for a unique Z E (m, M ) . Consequently, Y ( x ) has its maximum value at x=i.

0
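The additive converse of Jensen's inequality derived above, with the difference $A(\phi(g)) - \phi(A(g))$ bounded by the maximum of the chord minus $\phi$, can be sanity-checked for a discrete functional. The sketch below assumes $\phi = \exp$ on $[m, M] = [0, 2]$, a functional $A(h) = \sum p_i h(x_i)$ with $\sum p_i = 1$, and the constant $\lambda = \phi(m) - \phi(\bar{x}) + \mu(\bar{x} - m)$ where $\phi'(\bar{x}) = \mu$; the helper name `converse_constant` and the random data are illustrative.

```python
import math
import random

def converse_constant(phi, dphi_inverse, m, M):
    """lambda = phi(m) - phi(xbar) + mu*(xbar - m), where phi'(xbar) = mu."""
    mu = (phi(M) - phi(m)) / (M - m)      # slope of the chord through the endpoints
    xbar = dphi_inverse(mu)               # point where the tangent is parallel to the chord
    return phi(m) - phi(xbar) + mu * (xbar - m)

m, M = 0.0, 2.0
lam = converse_constant(math.exp, math.log, m, M)   # for phi = exp, (phi')^{-1} = log

random.seed(1)
worst_gap = 0.0
for _ in range(1000):
    n = random.randint(2, 8)
    xs = [random.uniform(m, M) for _ in range(n)]
    ws = [random.random() + 0.01 for _ in range(n)]
    total = sum(ws)
    ps = [w / total for w in ws]          # normalized weights, so A(1) = 1
    mean = sum(p * x for p, x in zip(ps, xs))
    gap = sum(p * math.exp(x) for p, x in zip(ps, xs)) - math.exp(mean)
    worst_gap = max(worst_gap, gap)
print(worst_gap, lam)
```

The largest Jensen gap found by the random search should stay below $\lambda$. The bound is approached when the mass sits at the endpoints $m$ and $M$ with mean $\bar{x}$, so random interior points typically stay well below it.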

A similar converse of Theorem 3.26 is given by PeEariC and Beesack (1987b):

3.44.

Theorem. Let all the conditions of Theorem 3.26 be satisfied except that for a compact interval Z = [m,MI, m 5 g ( t ) IM holds for all t E E. Zf u = A ( C E , E ,> ) 0, then



Proof. As in the proof of Theorem 3.26 we denote £2 = £\£1' Now let d(t) = (M - g(t»/(M - m), so that g(t) = md(t) + M(l- d(t», and let (3 = A(dCEJ Then ep(A(g» = ep(A(gCEJ + A[(md + M(l- d»CEzD = ep(A(gCEJ + m{3 + M( a - (3». Also, by the convexity of ep on I we have

A(ep(g»

= A[ep(g)CEt + ep(g)CE,) = A[ ep(g)CE1+ ep(md + M(l - d»CEzl :5A[ep(g)C E1+ {dep(m) + (1- d)ep(M)}CEz] = A {(ep(g)CEt + ep(m)A(dCEz) + ep(M)A«l-d)C Ez) = A(ep(g)CEJ + (3ep(m) + (a - (3)ep(M).

Since 0:5d(t):51, we have O:5{3=A(dCEz):5A(CEz)=a. The result in (3.50) now follows from this fact, the increasing property of F(·, y), and 0 the equality for ep(A(g» and the inequality for A(ep(g». 3.44'. Theorem. If F(·, y) is decreasing on J for every y EJ and all the other conditions of Theorem 3.44 are satisfied, then

F[A(ep(g», ep(A(g))]

~

inf F[A(ep(g)CEJ

+ 8ep(m)

OS(J::s:a

+ (a - 8)ep(M), ep(A(gCE1) + Bm + (a - 8)M],

(3.50')

where a=A(CE\EJ Theorem 3.44' follows by applying (3.50) to the function FJ. = - F. We note that in the special case E 1 = 0, Theorem 3.44 reduces to Theorem 3.42. If we also note that a = a Et and denote the right-hand side of (3.50) by H(E 1 ) , we obtain the best (least) upper bound for F[A( ep(g», ep(A(g))] given by

F[A(ep(g», ep(A(g»]:5 inf H(E 1 ) E1Ed,

under the hypothesis of Theorem 3.44, where slJ. 1 = {E 1 E sIJ.:A(CE\EJ = aEt>O}. 3.45. Remarks. (a) Mitrinovic and Vasic (1975) conjectured that .It. in Theorem 3.39 (Theorem 3.41) for discrete functionals are increasing


functions (decreasing functions) of m. A geometric proof of this conjecture is given in Vasic and Pecaric (1979b). Theorem 3.42 and its proof provide an analytical proof for a more general result. (b) It is obvious that Theorem 3.37 is an important result in the theory of convex functions. This inequality has several generalizations. For example, we can similarly give its generalization for g-convex functions (for the discrete case, see Vasic and Pecaric, 1979a). Furthermore, Vasic and Pecaric (1979b) proved their result without the assumption that 4J is a convex function, i. e., they showed that we need only assume either (4J(x) - 4J(m»/(x - m) or (4J(M) - 4J(x»/(M -x) is an increasing function of x on (m, M] and [m, M), respectively. Furthermore, with the same assumption, Theorem 3.42 is proved for discrete functionals (see Pecaric and Crstici, 1986). Several similar results are also given in those two papers. (c) The following interpolation of Theorem 3.41 for the discrete case is given in Pecaric and Mesihovic (1989): Let P; > 0 (i = 1, ... , n), (i = 1, ... , n) andf"(x) >0 on [m, M]. Then P; = 1, m ~ x ; ~ M

~pJ(x;) - · f ( P~ iX;) ~ 1 J 1 c ~ ~WeI) ~A, where In

= {I, ... , n}, A is defined as in Theorem 3.41, and W(I) = Pd(M) + (1- Pj)f(m) - f(PjM + (1- Pj)m).

(d) The following converse of McShane's inequality can be found in Andrica and Drimbe (1988): n

A(f) - f(A(Pl)' ... ,A(Pn» = K,

2:

(A(pD - A

2(pd),

k=l

where A is an isotonic linear functional, Pk: ~ ~ ~ is the k-projection defined by Pk(Xl,'" ,xn ) = xi , k = 1, ... ,n, f E C 2(E), and K, = max IIH(f)(x)1l where IIH(f)(x)lI is the Frobenius norm of Hessian D matrix H(f)(x).


Chapter 4

Applications of Jensen's Inequality to Means and Hölder's Inequalities

4.1. Inequalities for Means

We shall again consider positive (isotonic) linear functionals as in Section 3.2. Let $I = (a, b)$, $-\infty \le a < b \le \infty$, and let $\psi, \chi : I \to \mathbb{R}$ be continuous and strictly monotonic. Suppose that $L$ and $A$ satisfy the conditions L1, L2 and A1, A2 stated in (3.27)-(3.31) with $A(1) = 1$, on a base set $E$, and that $\psi(g), \chi(g) \in L$ for some $g \in L$. We define the generalized mean with respect to the operator $A$ and $\psi$ by

$$M_\psi(g, A) = \psi^{-1}\{A(\psi(g))\}, \quad g \in L. \tag{4.1}$$

Observe that if $\alpha \le \psi(g(x)) \le \beta$ for $x \in E$, then by the isotonic character of $A$, we have $\alpha \le A(\psi(g)) \le \beta$, so that $M_\psi$ is well defined by (4.1). We also note that the above assumptions imply that $g(x) \in I$ for $x \in E$. In the rest of this chapter we assume that $g \in L$ satisfies the above conditions, so that the theorems hold only for such $g$.

4.1. Theorem. Under the above hypotheses we have

$$M_\psi(g, A) \le M_\chi(g, A), \tag{4.2}$$

provided either $\chi$ is increasing and $\phi = \chi \circ \psi^{-1}$ is convex, or $\chi$ is decreasing and $\phi$ is concave.

Proof. For $g \in L$ we have both $\psi(g) \in L$ and $\chi(g) \in L$ by assumption. Hence $\phi(\psi(g)) = \chi(g) \in L$ for $g \in L$; thus if $\phi$ is convex, then it follows from Theorem 2.4 that

$$\phi(A(\psi(g))) \le A(\chi(g)).$$

Consequently if $\chi$ is increasing, then $\chi^{-1}$ is also increasing, and we obtain

$$M_\psi(g, A) = \chi^{-1}\{\phi(A(\psi(g)))\} \le \chi^{-1}\{A(\chi(g))\} = M_\chi(g, A),$$

which is (4.2). If $\phi$ is concave, then $-\phi$ is convex, and we obtain the first inequality above with the direction reversed; since now $\chi^{-1}$ is decreasing with $\chi$, we again obtain (4.2). □

4.2. Remark. Theorem 4.1 is a generalization to functionals of the general mean value inequality found in Hardy, Littlewood, and Pólya (1934, 1952, p. 75, Theorem 92). □

As an application of Theorem 4.1, we consider the generalization of the classical means $M^{[r]}(g, A)$ for isotonic functionals $A$, defined for $r \in \mathbb{R}$ by

$$M^{[r]}(g, A) = \begin{cases} (A(g^r))^{1/r} & \text{for } r \ne 0, \\ \exp(A(\log g)) & \text{for } r = 0, \end{cases} \tag{4.3}$$

where $g(x) > 0$ for $x \in E$, $g^r \in L$ for $r \in \mathbb{R}$, and $\log g \in L$. From Theorem 4.1 it follows as a special case that

$$M^{[r]}(g, A) \le M^{[s]}(g, A) \quad \text{if} \quad -\infty < r \le s < \infty. \tag{4.4}$$
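As a numerical illustration of (4.3) and (4.4), the following Python sketch evaluates the means $M^{[r]}(g, A)$ for a discrete functional. It assumes $A(g) = \sum_i p_i g_i$ with weights summing to 1 (so that $A(1) = 1$); the function name `power_mean` and the sample data are illustrative choices and do not come from the text.

```python
import math

def power_mean(values, weights, r):
    """Discrete analogue of M^[r](g, A) in (4.3).

    Assumption: A(g) = sum(w_i * g_i) with positive weights summing to 1.
    """
    if r == 0:
        # r = 0 branch of (4.3): exp(A(log g)), the weighted geometric mean
        return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))
    # r != 0 branch of (4.3): (A(g^r))^(1/r)
    return sum(w * v ** r for v, w in zip(values, weights)) ** (1.0 / r)

g = [0.5, 1.0, 2.0, 4.0]          # sample positive values of g (illustrative)
p = [0.1, 0.2, 0.3, 0.4]          # weights with sum 1, so A(1) = 1
orders = [-3, -1, 0, 0.5, 1, 2, 5]
means = [power_mean(g, p, r) for r in orders]

# (4.4): the means should be nondecreasing in the order r
for r, value in zip(orders, means):
    print(f"r = {r:>4}: M = {value:.6f}")
```

The printed values should increase from the top row to the bottom row, in line with (4.4).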

4.3. Theorem. Let $L$, $A$, $\psi$, and $\chi$ be as in Theorem 4.1, but with $I = [m, M]$, $-\infty < m < M < \infty$. Then for every $g \in L$ such that $m \le g(t) \le M$ for $t \in E$, we have

$$(\psi(M) - \psi(m)) A(\chi(g)) - (\chi(M) - \chi(m)) A(\psi(g)) \le \psi(M)\chi(m) - \chi(M)\psi(m), \tag{4.5}$$

provided that $\phi = \chi \circ \psi^{-1}$ is convex. The inequality in (4.5) is reversed when $\phi$ is concave.

Proof. If $\psi$ is increasing on $I$, we have

$$m_1 = \psi(m) \le \psi(g(t)) \le \psi(M) = M_1 \quad \text{for every } t \in E.$$

Thus by Theorem 3.37 with $m$, $M$ replaced by $m_1$, $M_1$, we have

$$A(\phi(\psi(g))) \le \{(\psi(M) - A(\psi(g)))\chi(m) + (A(\psi(g)) - \psi(m))\chi(M)\} \times [\psi(M) - \psi(m)]^{-1},$$

which reduces to (4.5). If $\psi$ is decreasing on $I$, we have $M_1 \le \psi(g(t)) \le m_1$ for every $t \in E$ and, with an obvious modification of the proof, the result follows as before. □

In the case of classical means, (4.5) gives Goldman's inequality for isotonic functionals (see, for example, Bullen, Mitrinovic, and Vasic, 1987, p. 203):

$$(M^r - m^r)\{M^{[s]}(g; A)\}^s - (M^s - m^s)\{M^{[r]}(g; A)\}^r \le M^r m^s - M^s m^r \tag{4.6}$$

for $0 < r < s$, or $r < 0 < s$, and the inequality is reversed for $r < s < 0$. Similarly, for $r = 0$ we obtain

$$\log\frac{M}{m}\,\{M^{[s]}(g; A)\}^s - (M^s - m^s)\log\{M^{[0]}(g; A)\} \le m^s \log M - M^s \log m \qquad (0 < s). \tag{4.7}$$
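The inequality (4.6) can be checked numerically in the case $0 < r < s$. The sketch below assumes the discrete functional $A(g) = \sum_i p_i g_i$ with weights summing to 1 and uses $\{M^{[r]}(g; A)\}^r = A(g^r)$; the helper name `goldman_gap` and the data are hypothetical.

```python
def goldman_gap(values, weights, r, s, m, M):
    """Right-hand side minus left-hand side of (4.6).

    Assumptions: A(g) = sum(w_i * g_i), weights sum to 1, m <= g_i <= M,
    and 0 < r < s. A nonnegative result means (4.6) holds for this data.
    """
    a_r = sum(w * v ** r for v, w in zip(values, weights))  # {M^[r]}^r = A(g^r)
    a_s = sum(w * v ** s for v, w in zip(values, weights))  # {M^[s]}^s = A(g^s)
    lhs = (M ** r - m ** r) * a_s - (M ** s - m ** s) * a_r
    rhs = M ** r * m ** s - M ** s * m ** r
    return rhs - lhs

g = [1.0, 2.0, 3.0]       # illustrative values with m = 1, M = 3
p = [0.2, 0.5, 0.3]
print(goldman_gap(g, p, r=1, s=2, m=1.0, M=3.0))      # positive gap for this data
print(goldman_gap([1.0, 3.0], [0.4, 0.6], 1, 2, 1.0, 3.0))  # gap vanishes at the endpoints
```

When $g$ takes only the values $m$ and $M$, both sides of (4.6) coincide, which is why the second call returns zero up to rounding.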

4.4. Theorem. Let $L$, $A$, $\psi$, $\chi$ be as in Theorem 4.1 and $g \in L$ with $m \le \psi(g(t)) \le M$ for $t \in E$. Assume $I = (a, b) = \mathbb{R}_+ = (0, \infty)$, $\chi(\mathbb{R}_+) = \mathbb{R}_+$, and let $\phi = \chi \circ \psi^{-1}$.

(a) Suppose $\phi''(x) \ge 0$ with equality for at most isolated points of $[m, M]$, and let $\lambda > 1$ be determined as in Theorem 3.39(a). Then (4.8)

holds if $\chi$ is increasing and supermultiplicative on $\mathbb{R}_+$, while the inequality is reversed if $\chi$ is decreasing and supermultiplicative. Moreover, (4.9)

holds if $\chi$ is increasing and submultiplicative, with the inequality reversed if $\chi$ is decreasing and submultiplicative.

(b) Suppose $\phi''(x) \le 0$ with equality for at most isolated points of $[m, M]$, and let $\lambda$ $(0 < \lambda < 1)$ be determined by Theorem 3.39(b). Then

$$M_\chi(g; A) \ge \chi^{-1}(\lambda) M_\psi(g; A) \tag{4.8'}$$

holds if $\chi$ is increasing and submultiplicative, with the inequality reversed when $\chi$ is decreasing and submultiplicative. Moreover, (4.9')

holds if $\chi$ is increasing and submultiplicative, while the inequality is reversed if $\chi$ is decreasing and supermultiplicative.
Proof. The proof follows from Theorem 3.39 in precisely the same manner as in Beesack (1983, Theorem 3). We illustrate by proving the reverse inequality in (4.8') when $\chi$ is decreasing and supermultiplicative. The inequality to be proved is

which, since $\chi$ is decreasing, is equivalent to

Since $\chi$ is supermultiplicative, this holds provided

$$A(\phi(\psi(g))) \le \lambda\,\phi(A(\psi(g))),$$

because $g \in L$, $\psi(g) \in L$, and $\phi(\psi(g)) = \chi(g) \in L$ hold by the assumptions preceding (4.1). Therefore the last inequality follows from Theorem 3.39(a) with $g$ replaced by $\psi(g)$. □

4.5. Remark. Theorem 4.4 is a generalization to isotonic functions of the result in Beesack (1983, Theorem 3). As Beesack noted, the previous factors X-\A), {X- 1(r 1)} -1 are equal only when X is multiplicative. 0 For the generalized classical means M[r1(g, A) defined in (4.3), Theorem 4.4 gives the following results: Let 0 < X 1:5 g(x) :5 X 2 < 00 for all x E E, and for r, S =f= 0 let Il = ( X ~- XD(X; - x;rl, XsX r _ xrX s }(lISl-(lIrl B ( / )lIr I 2 1 2 { r,s = W s [1 - (s/r)](X; - XD

Then we have (Beesack, 1983, p. 334)

$$M^{[s]}(g, A) \le B_{r,s} M^{[r]}(g, A) \qquad (r < s;\ r, s \ne 0). \tag{4.10}$$

For the cases involving r = 0 or s = 0, let

= ( X ~- XD/log(X2/X1) , B, = (1l,/(et))1I'X 1 1 e x p ( X ~Il,)/ Il,

for t =f= O. Then M[S](g,A):5BsM[OJ(g,A) if O<s,

(4.11)

M [ U ) ( g , A ) ~ B ; 1 M [ r ) (ifg , r 0 be determined as in Theorem 3.41(a). If X is superadditive on 1= IR+, then (4.13) when X is increasing, while the inequality is reversed when X is decreasing.

(b) If $\phi'(x)$ exists and is strictly decreasing on $[m, M]$, $\lambda$ is determined as in Theorem 3.41(b), and $\chi$ is superadditive, then (4.14) holds when $\chi$ is increasing, with the inequality reversed when $\chi$ is decreasing.

4.7. Remark. As in Beesack (1983, p. 336), we obtain the following results for the generalized classical means: M[sl(g, A)::; C r. s + M[rl(g, A) for

provided that 0 < Xl::; g(x) -s X 2 for x tl

C

=

E

X ~ X ; X

S -

r

s

2= 1,

(4.15)

E, where C r,s is defined by

XD- 1,

( X ~- XD(X; -

= { X; r,s

0 =1= r < s,

(W)s/(S-r)}lIS.

+r- -s ~

'

and MIsl(g, A):o; Cs + M[OI(g, A) for

s

2=

1,

(4.16)

where C, is defined by tl

Cs

=

{

=

( X ~-

XD/log(X 2/X 1) ,

tl[ log (tl)- - 1]}lIS.

X~ I O g X2 - X~ l o g Xl + log(X 2/X1)

s

As noted by Beesack (1983), the constants $C_{r,s}$, $C_s$ in (4.15), (4.16) are not, in general, the best possible. □

4.8. Remark. Similarly, we can give a generalization of Theorem 5 of Beesack (1983). The above results are given in Beesack and Pecaric (1985a). A further generalization of Theorems 4.4 and 4.6 is given in Pecaric and Beesack (1987a). □


4.9. Theorem. Let $g$, $L$, $A$, $\psi$, $\chi$ be as in Theorem 4.4, where $I$ is an arbitrary interval. If $F(u, v)$ is a real-valued function defined on $[m, M]^2$ and is increasing in $u$, then

$$F(M_\psi(g, A), M_\chi(g, A)) \le \max_{\theta \in [0, 1]} F\bigl[\psi^{-1}(\theta\psi(m) + (1 - \theta)\psi(M)),\ \chi^{-1}(\theta\chi(m) + (1 - \theta)\chi(M))\bigr] \tag{4.17}$$

holds provided that $\psi$ is increasing, $\psi \circ \chi^{-1}$ is convex, and $F(u, v)$ is a real-valued function defined on $I \times I$ and is increasing in $u$.

Proof. Suppose that $\chi$ is increasing on $I$. Let $F_1(x, y) = F(\psi^{-1}(x), \chi^{-1}(y))$, $\phi_1(x) = \psi(\chi^{-1}(x))$, $g_1 = \chi \circ g$, $m_1 = \chi(m)$, and $M_1 = \chi(M)$. Then the conclusion follows from Theorem 4.1 when applied to $F_1$, $\phi_1$, $g_1$. If $\chi$ is decreasing on $I$, we need only define $m_1 = \chi(M)$ and $M_1 = \chi(m)$. Then (4.4) implies

$$F(M_\psi(g, A), M_\chi(g, A)) \le \max_{\theta \in [0, 1]} F\bigl(\psi^{-1}(\theta\psi(M) + (1 - \theta)\psi(m)),\ \chi^{-1}(\theta\chi(M) + (1 - \theta)\chi(m))\bigr),$$

which is equivalent to (4.17). □

4.10. Remarks. (a) In the special case $F(x, y) = x - y$, $\chi(x) = x$, and $A(g) = \int_0^1 g\,dt$, (4.17) yields an inequality in Knopp (1935, Satz 2). (b) Theorem 4.9 is a generalization of a result of Beck (1969), who considered quasi-arithmetic mean values $\phi^{-1}\bigl(\sum_{i=1}^n p_i \phi(x_i)\bigr)$ (see also Bullen, Mitrinovic, and Vasic (1987, pp. 256-258)). (c) Similarly, we can apply Theorem 3.24 to prove generalizations of index-set type inequalities (see Bullen, Mitrinovic, and Vasic, 1987, pp. 94-98, 103-105, 173-176, and 234-242). □

4.2. Hölder's and Minkowski's Inequalities

We begin this section with a lemma for non-normalized isotonic linear functionals which includes the corresponding versions of Jensen's inequality and Theorems 3.37,3.39, and 3.41. 4.11. Lemma. Let L satisfy conditions L1, L2, and A satisfy conditions AI, A2 (defined in (3.27)-(3.31» on a base set E. Suppose that k E L with


k 2= 0 on E and A(k) > 0, and that
1 and $w, f, g \ge 0$ on $E$ with $wf^p$, $wg^p$, $w(f + g)^p \in L$, then

$$A^{1/p}(w(f + g)^p) \le A^{1/p}(wf^p) + A^{1/p}(wg^p). \tag{4.24}$$

If 0, $A(wg^p) > 0$, then the reverse inequality in (4.24) holds.

Proof. This is an immediate consequence of Theorem 4.12. □

4.14. Theorem. Let $L$ and $A$ satisfy conditions L1, L2 and A1, A2 on a base set $E$. Let $p > 1$, $q = p/(p - 1)$, and $w, f, g \ge 0$ on $E$ with $wf^p$, $wg^q$, $wfg \in L$. If $0 < m \le f(x) g^{-q/p}(x) \le M$ for $x \in E$, then

$$(M - m) A(wf^p) + (mM^p - Mm^p) A(wg^q) \le (M^p - m^p) A(wfg). \tag{4.25}$$

If $p < 0$, then (4.25) also holds provided either $A(wf^p) > 0$ or $A(wg^q) > 0$; if $0 < p < 1$, then the reverse inequality in (4.25) holds provided either $A(wf^p) > 0$ or $A(wg^q) > 0$.

Proof. First we note that if $p > 0$, then we have

$$0 \le m^p w g^q \le w f^p \le M^p w g^q \quad \text{on } E,$$

and the inequalities are reversed if $p < 0$. In particular, this shows that $A(wf^p)$, $A(wg^q)$ are either both zero or both positive for all $p$. If $A(wg^q) > 0$, then, since $\phi(x) = x^p$ is convex for either $p > 1$ or $p < 0$, (4.25) follows from (4.19) by the substitutions in (4.23). If $p > 1$ and $A(wg^q) = 0$, then (4.25) also holds because it reduces to the case $0 \le (M^p - m^p) A(wfg)$. If $p < 0$ and either $A(wf^p) > 0$ or $A(wg^q) > 0$, then $A(wg^q) > 0$; thus (4.25) holds. (Note that (4.25) does not hold if $A(wf^p) = A(wg^q) = 0$

unless A (wfg) =0, which need not be the case.) If 0 1 and p-1 + q-1 = 1, then the left-hand side of (4.25) is bounded below by {p(M - m)A(wr)} IIp{q(mMP - MmP)A(wgq)}lIq, and this reduces to the right-hand side of (4.26), with a factor of (MP - m'']. This shows that the inequality in (4.25) is sharper than that in 0 (4.26). 4.18. Theorem. Let L, A, p, w, f, g be as in Theorem 4.14, and let 0< m < F(x) 5, M and 05, G(x) 5, M for x E E, where F = f(f + g)-qlp,

$G = g(f + g)^{-q/p}$. If $p > 1$, then

$$A^{1/p}(w(f + g)^p) \ge K(p, q, m, M)\{A^{1/p}(wf^p) + A^{1/p}(wg^p)\} \tag{4.28}$$

holds, where $K(p, q, m, M)$ is the constant on the right-hand side of (4.26). If 0 0 for $p < 0$.

Proof. This follows immediately from Theorem 4.16. □

4.19. Remarks. (a) By writing $s = ar + bt$, where $a + b = 1$, Hölder's inequality yields the following Liapunov inequality for isotonic functionals:

$$A(g^s) \le A(g^r)^{(t - s)/(t - r)}\, A(g^t)^{(s - r)/(t - r)}. \tag{4.29}$$

If we define the means of order $r$ by

$$M^{[r]}(g, A) = A(g^r)^{1/r} \quad \text{for } r \ne 0$$

and $A(1) = 1$, then (4.29) can be written in the form

$$M^{[s]}(g, A) \le M^{[r]}(g, A)^{(r/s)((t - s)/(t - r))}\, M^{[t]}(g, A)^{(t/s)((s - r)/(t - r))} \quad \text{for } 0 < r < s < t. \tag{4.30}$$
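To make (4.30) concrete, the following sketch computes both sides for a discrete functional, together with the weighted arithmetic bound that follows from the arithmetic-geometric means inequality. It assumes $A(g) = \sum_i p_i g_i$ with weights summing to 1; the names `mean_of_order` and `liapunov_bounds` are illustrative.

```python
def mean_of_order(values, weights, r):
    """M^[r](g, A) = A(g^r)^(1/r) for r != 0, assuming A(g) = sum(w_i * g_i)."""
    return sum(w * v ** r for v, w in zip(values, weights)) ** (1.0 / r)

def liapunov_bounds(values, weights, r, s, t):
    """Return (M^[s], geometric bound from (4.30), arithmetic bound), 0 < r < s < t."""
    a = (r / s) * (t - s) / (t - r)   # exponent of M^[r] in (4.30)
    b = (t / s) * (s - r) / (t - r)   # exponent of M^[t] in (4.30); a + b = 1
    m_r = mean_of_order(values, weights, r)
    m_s = mean_of_order(values, weights, s)
    m_t = mean_of_order(values, weights, t)
    geometric = m_r ** a * m_t ** b   # right-hand side of (4.30)
    arithmetic = a * m_r + b * m_t    # weighted AM-GM upper bound
    return m_s, geometric, arithmetic

g = [0.5, 1.0, 2.0, 4.0]              # illustrative data
p = [0.1, 0.2, 0.3, 0.4]
m_s, geometric, arithmetic = liapunov_bounds(g, p, r=1, s=2, t=4)
print(m_s, geometric, arithmetic)     # expected ordering: m_s <= geometric <= arithmetic
```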

Hence by the arithmetic-geometric means inequality we have

$$M^{[s]}(g, A) \le \frac{r(t - s)}{s(t - r)}\, M^{[r]}(g, A) + \frac{t(s - r)}{s(t - r)}\, M^{[t]}(g, A),$$

or equivalently, M[t](g, A) - M[rl(g, A) 0 and A1(CE,} > 0 hold. Now gqgl = fg and gqgf = JP; thus

It is easy to verify that (4.38), with A and g replaced by Al and gl, respectively, reduces to

(4.41) where

gdt)

= gl(t)CE,(t) + {A(JPCE)/A(fgCE)}1I(p-IlCE2(t).

Hence by 1/(p -1) = q -1 = q/p, we have

gqgE , = g{fCEI + [gA(JPCE,/A(fgCE)]q/PC E,} = fgE " g q g ~ =, JPC E, + [gA(JPC E,/A(fgCE,WCE2 = It, and so (4.40) follows from (4.41).

0

4.24. Remark. (a) Beckenbach's inequality (1966) is the special case of Theorem 4.23 corresponding to $E = \{1, 2, \dots, n\}$, $E_1 = \{1, 2, \dots, m\}$ (where $1 \le m < n$), $L = \mathbb{R}^n$, the vector space of all real $n$-vectors $a = (a_1, \dots, a_n)$, and $A(a) = \sum_{i=1}^n a_i$. Let $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$ be two $n$-tuples of positive real numbers, and $p$, $q$ be real numbers such that $p^{-1} + q^{-1} = 1$ $(p > 1)$. If $0 < m < n$, then

$$\Bigl(\sum_{i=1}^n a_i^p\Bigr)^{1/p} \Bigl(\sum_{i=1}^n a_i b_i\Bigr)^{-1} \ge \Bigl(\sum_{i=1}^n \tilde{a}_i^p\Bigr)^{1/p} \Bigl(\sum_{i=1}^n \tilde{a}_i b_i\Bigr)^{-1}, \tag{4.42}$$

where

$$\tilde{a}_i = \begin{cases} a_i & \text{for } 1 \le i \le m, \\ \Bigl(b_i \sum_{j=1}^m a_j^p \Big/ \sum_{j=1}^m a_j b_j\Bigr)^{q/p} & \text{for } m + 1 \le i \le n, \end{cases}$$

and the equality in (4.42) holds iff $\tilde{a}_i = a_i$ for all $i$. The inequality in (4.42) is reversed if $p < 1$ and $p \ne 0$. Furthermore, for $m = 1$ (4.42) reduces to Hölder's inequality. (b) Some related results are also given in Pecaric and Beesack (1987b). (c) In the same fashion we can give generalizations of Theorem 4.23 and Corollaries 3, 4 in Bullen, Mitrinovic, and Vasic (1987, pp. 258-260). □

4.5. Aczél's and Related Inequalities

We noted that from Jensen's inequality we can easily obtain Hölder's inequality. Similarly, from Theorem 3.1 (the reverse Jensen's inequality) we can obtain Aczél's (1956) (Mitrinovic, 1970, pp. 57-58) and Popoviciu's (1959b) inequalities, and their proofs can be found in Vasic and Pecaric (1982b). The ideas can be used to prove more general results. First, we note that the following generalization of Theorem 4.22 is valid:

4.25. Lemma.

Let E, L, A, t/J be defined as in Lemma 4.11 and assume that pEL with p ~ Oon E and O < A ( p ) ut/J(a) - A(pt/J(g)) . u -A(p) u -A(p)

(4.43)

By letting

$$p = u, \qquad q = -A(p), \qquad b = A(pg)/A(p),$$

in Theorem 3.1 for $n = 2$, i.e., from

$$\phi\Bigl(\frac{pa + qb}{p + q}\Bigr) \ge \frac{p\phi(a) + q\phi(b)}{p + q} \quad \text{for} \quad q < 0,\ p + q > 0,\ \frac{pa + qb}{p + q} \in I,$$

we have

$$\phi\Bigl(\frac{ua - A(pg)}{u - A(p)}\Bigr) \ge \frac{u\phi(a) - A(p)\,\phi(A(pg)/A(p))}{u - A(p)}.$$

By applying Lemma 4.11 we have (4.43). □

4.26. Theorem (Aczél's inequality for isotonic functionals). Let $L$ satisfy L1, L2, and $A$ satisfy A1, A2 on a base set $E$. If $f^2, g^2, fg \in L$ and $g_0^2 - A(g^2) > 0$ (or $f_0^2 - A(f^2) > 0$), where $g_0$, $f_0$ are real numbers, then

$$(f_0 g_0 - A(fg))^2 \ge (f_0^2 - A(f^2))(g_0^2 - A(g^2)). \tag{4.44}$$
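A small numerical check of (4.44) may help fix the roles of $f_0$, $g_0$ and the functional. The sketch below assumes the discrete functional $A(h) = \sum_i h_i$ (the classical setting of Aczél's inequality); the function name `aczel_sides` and the numbers are illustrative only.

```python
def aczel_sides(f0, g0, f, g):
    """Return (left, right) sides of (4.44) for A(h) = sum(h_i).

    Assumption: at least one of f0^2 - A(f^2), g0^2 - A(g^2) is positive,
    as required by Theorem 4.26.
    """
    a_ff = sum(x * x for x in f)               # A(f^2)
    a_gg = sum(y * y for y in g)               # A(g^2)
    a_fg = sum(x * y for x, y in zip(f, g))    # A(fg)
    if f0 * f0 - a_ff <= 0 and g0 * g0 - a_gg <= 0:
        raise ValueError("hypothesis of Theorem 4.26 is not satisfied")
    left = (f0 * g0 - a_fg) ** 2
    right = (f0 * f0 - a_ff) * (g0 * g0 - a_gg)
    return left, right

left, right = aczel_sides(5.0, 6.0, [1.0, 2.0], [2.0, 1.0])
print(left, right)   # 676.0 620.0, so left >= right as (4.44) states
```

Equality occurs when $(f_0, f_1, \dots)$ and $(g_0, g_1, \dots)$ are proportional; for instance, `aczel_sides(2.0, 4.0, [1.0], [2.0])` returns two equal values.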

4.27. Theorem (Popoviciu's inequality for isotonic functionals). Let $A$ and $L$ be as in Theorem 4.26. If $p > 1$, $q > 1$, $p^{-1} + q^{-1} = 1$, $f, g \ge 0$ on $E$, $f^p, g^q, fg \in L$, and $f_0$, $g_0$ are positive real numbers such that

$$f_0^p - A(f^p) > 0 \quad \text{and} \quad g_0^q - A(g^q) > 0, \tag{4.45}$$

then

$$(f_0^p - A(f^p))^{1/p} (g_0^q - A(g^q))^{1/q} \le f_0 g_0 - A(fg). \tag{4.46}$$

In the case 0 0 (or $p < 0$ and $A(f^p) > 0$), the reverse inequality in (4.46) holds.

Proof. By applying the substitutions

$\phi(x) = x^p$ $(p > 1)$,

in (4.43), we obtain

$$(f_0 g_0 - A(fg))^p \ge (f_0^p - A(f^p))(g_0^q - A(g^q))^{p - 1} \tag{4.47}$$

if the first condition in (4.45) is satisfied. If the second condition is also satisfied, then from Hölder's inequality we have $A(fg) \le A(f^p)^{1/p} A(g^q)^{1/q} < f_0 g_0$, i.e., $f_0 g_0 - A(fg) > 0$. Thus from (4.47) we obtain (4.46). □

4.28. Remark. Note that for $p = 2$, we obtain Aczél's inequality from (4.47). Thus (4.47) is a generalization of Aczél's and Popoviciu's inequalities. □

4.29. Theorem (Bellman's inequality for isotonic functionals). Let $A$ and $L$ be as in Lemma 4.25. Let $f, g \ge 0$ on $E$ with $f^p, g^p, (f + g)^p \in L$, and $f_0$, $g_0$ be positive real numbers satisfying

$$f_0^p - A(f^p) > 0 \quad \text{and} \quad g_0^p - A(g^p) > 0. \tag{4.48}$$

If $p > 1$, then

$$\bigl((f_0^p - A(f^p))^{1/p} + (g_0^p - A(g^p))^{1/p}\bigr)^p \le (f_0 + g_0)^p - A((f + g)^p). \tag{4.49}$$

If 0 0, then the inequality in (4.49) is reversed.
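The following sketch evaluates both sides of (4.49) for $p > 1$. It assumes the discrete functional $A(h) = \sum_i h_i$ and reads condition (4.48) as $f_0^p - A(f^p) > 0$ and $g_0^p - A(g^p) > 0$, which is what the proof uses; the function name `bellman_sides` and the data are hypothetical.

```python
def bellman_sides(f0, g0, f, g, p):
    """Return (left, right) sides of (4.49) for A(h) = sum(h_i) and p > 1.

    Assumption: (4.48) means f0^p - A(f^p) > 0 and g0^p - A(g^p) > 0.
    """
    a_f = sum(x ** p for x in f)                        # A(f^p)
    a_g = sum(y ** p for y in g)                        # A(g^p)
    a_sum = sum((x + y) ** p for x, y in zip(f, g))     # A((f + g)^p)
    if f0 ** p - a_f <= 0 or g0 ** p - a_g <= 0:
        raise ValueError("condition (4.48) is not satisfied")
    left = ((f0 ** p - a_f) ** (1.0 / p) + (g0 ** p - a_g) ** (1.0 / p)) ** p
    right = (f0 + g0) ** p - a_sum
    return left, right

left, right = bellman_sides(4.0, 3.0, [1.0, 2.0], [1.0, 1.0], p=3)
print(left, right)   # left is about 304.4, right is 308.0
```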

Proof. We give the proof for $p > 1$ only; the proof for the other case is similar. Using (4.48) and the Minkowski inequality we have

$$(f_0 + g_0)^p > \bigl(A(f^p)^{1/p} + A(g^p)^{1/p}\bigr)^p \ge A((f + g)^p).$$

From the discrete Minkowski inequality for $n = 2$, i.e.,

$$\bigl((a_1 + b_1)^p + (a_2 + b_2)^p\bigr)^{1/p} \le (a_1^p + a_2^p)^{1/p} + (b_1^p + b_2^p)^{1/p},$$

the substitutions

$$a_1 = (f_0^p - A(f^p))^{1/p}, \quad a_2 = A(f^p)^{1/p}, \quad b_1 = (g_0^p - A(g^p))^{1/p}, \quad b_2 = A(g^p)^{1/p},$$

and the Minkowski inequality, we have

$$\bigl((f_0^p - A(f^p))^{1/p} + (g_0^p - A(g^p))^{1/p}\bigr)^p \le (f_0 + g_0)^p - \bigl(A(f^p)^{1/p} + A(g^p)^{1/p}\bigr)^p \le (f_0 + g_0)^p - A((f + g)^p). \ □$$

4.30. Remark. The last inequality in the above proof is an interpolation of (4.49), and it is a generalization of a result in Mitrinovic and Pecaric (1988a). Similarly we can prove Theorem 4.27 by using only Hölder's inequality, i.e., we can obtain a similar interpolation of Popoviciu's inequality. Of course, we can also obtain other results similar to those that can be obtained from Hölder's inequality. □

4.6. Further Generalizations of Hölder's and Minkowski's Inequalities

Let us further consider the generalized means defined in Section 4.1, and assume that $I = (a, b)$, $-\infty \le a < b < \infty$, $\psi_1, \dots, \psi_n: I \to \mathbb{R}$ are continuous and strictly monotonic, $\chi: I \to \mathbb{R}$ is continuous and increasing; $L$ and $A$ satisfy the conditions L1, L2 and A1, A2 defined in (3.27)-(3.31), with $A(1) = 1$ on a base set $E$; $g_1, \dots, g_n: E \to \mathbb{R}$ and $f: I_1 \times \cdots \times I_n \to \mathbb{R}$ are real-valued functions such that $g_1(E) \subset I_1, \dots, g_n(E) \subset I_n$, and $\psi_1(g_1), \dots, \psi_n(g_n), \chi(f(g_1, \dots, g_n)) \in L$.

We consider the inequality of the form

$$M_\chi(f(g_1, \dots, g_n), A) \le f(M_{\psi_1}(g_1, A), \dots, M_{\psi_n}(g_n, A)), \tag{4.50}$$

and observe

4.31. Theorem. A necessary and sufficient condition for (4.50) to hold is that the function

$$H(s_1, \dots, s_n) = \chi\bigl(f(\psi_1^{-1}(s_1), \dots, \psi_n^{-1}(s_n))\bigr)$$

is concave. If $H$ is convex, then the inequality in (4.50) is reversed.

Proof. McShane's inequality (see Theorem 2.6 and Remark 2.11(b)) for the function $H$ becomes

$$\chi\bigl(f(\psi_1^{-1}(A(f_1)), \dots, \psi_n^{-1}(A(f_n)))\bigr) \ge A\bigl(\chi(f(\psi_1^{-1}(f_1), \dots, \psi_n^{-1}(f_n)))\bigr). \tag{4.51}$$

Thus if we let $f_i = \psi_i(g_i)$ $(i = 1, \dots, n)$, then (4.51) becomes

$$\chi\bigl(f(M_{\psi_1}(g_1, A), \dots, M_{\psi_n}(g_n, A))\bigr) \ge A\bigl(\chi(f(g_1, \dots, g_n))\bigr), \tag{4.52}$$

which is equivalent to (4.50). □

4.32. Remark. For the special case of discrete functionals with $n = 2$, this result is given in Beck (1970) (see also Bullen, Mitrinovic, and Vasic (1987, pp. 246-255)). □

The following two corollaries can be proved as in Beck (1970):

4.33. Corollary. Assume that $f(x, y) = x + y$, and let $H(s, t) = \chi(\psi_1^{-1}(s) + \psi_2^{-1}(t))$, $E = \psi_1'/\psi_1''$, $F = \psi_2'/\psi_2''$, $G = \chi'/\chi''$, and all of $\psi_1'$, $\psi_2'$, $\chi'$, $\chi''$, $\psi_1''$, $\psi_2''$ be positive. Then

$$M_\chi(g_1 + g_2, A) \le M_{\psi_1}(g_1, A) + M_{\psi_2}(g_2, A)$$

holds iff $G(x + y) \ge E(x) + F(y)$.

4.34. Corollary. Assume that $f(x, y) = xy$, and let $H(s, t) = \chi(\psi_1^{-1}(s)\psi_2^{-1}(t))$, $A(x) = \psi_1'(x)/(\psi_1'(x) + x\psi_1''(x))$, $B(x) = \psi_2'(x)/(\psi_2'(x) + x\psi_2''(x))$, $C(x) = \chi'(x)/(\chi'(x) + x\chi''(x))$, and $\psi_1'$, $\psi_2'$, $\chi'$, $\chi''$, $\psi_1''$, $\psi_2''$ be positive. Then

$$M_\chi(g_1 g_2, A) \le M_{\psi_1}(g_1, A) M_{\psi_2}(g_2, A)$$

holds iff $C(xy) \ge A(x) + B(y)$.
Several other results related to Hölder's and Minkowski's inequalities will be given in other parts of this book. We complete this section by observing the following general result in Bourbaki (1952, pp. 9-14) (see Mitrinovic, 1970, pp. 355-56, and Kuczma, 1985, pp. 203-4):

Let $P$ be the set of all mappings of a set $S$ into the nonnegative reals. Let $M$ be a mapping of $P$ into the nonnegative real numbers satisfying
(i) $M(0) = 0$, $M(\lambda f) = \lambda M(f)$, where $\lambda > 0$ and $f \in P$;
(ii) $f(x) \le g(x)$ for all $x \in S$ and $f, g \in P$ implies $M(f) \le M(g)$; and
(iii) $M(f + g) \le M(f) + M(g)$ for all $f, g \in P$.

Let $h(t_1, \dots, t_n)$ be a real-valued function of $n$ real variables $t_1, \dots, t_n$ which is defined and continuous for $t_i \ge 0$ $(i = 1, \dots, n)$. Let $h$ have the following properties:
(iv) inequalities $t_i > 0$ $(i = 1, \dots, n)$ imply that $h(t_1, \dots, t_n) > 0$;
(v) if $\lambda > 0$, then $h(\lambda t_1, \dots, \lambda t_n) = \lambda h(t_1, \dots, t_n)$; and
(vi) the set of all points $(t_1, \dots, t_n)$ in $E_n$ satisfying $t_i \ge 0$ $(i = 1, \dots, n)$ and $h(t_1, \dots, t_n) \ge 1$ is convex.

Then (a) $f_1, \dots, f_n \in P$ implies

$$M(h(f_1, \dots, f_n)) \le h(M(f_1), \dots, M(f_n)). \tag{4.53}$$

(b) O

4.45. Theorem. 0 for $t \in E$ and $\phi: [0, \infty) \to \mathbb{R}$ be strictly convex, with $\phi(uv) \le \phi(u)\phi(v)$ for $u, v > 0$. Let $\psi(t) = \phi(t)/t$ for $t > 0$ and $\psi(0+) = 0$, $\psi(\infty) = \infty$. If $\chi(t) = 1/\psi^{-1}(t)$, then

$$\phi(|A(f)|) \le \psi(A(\chi(p)))\, A(p\,\phi(|f|))$$

holds provided that $f \in L$, $|f| \in L$, $\chi(p) \in L$, $\chi(p)\phi(\chi(p)|f|) \in L$, $p\,\phi(|f|) \in L$, and $A(\chi(p)) > 0$.

Note that the inequality in (4.63) is a special case of this result. In the same fashion we can prove converse inequalities similar to those in Vasic, Janic, and Keckic (1971), Kocic and Maksimovic (1973), and Pecaric and Janic (1988) by using the other parts of Lemma 4.11. Similar results for norms can be obtained by using the same arguments. First, we observe the following theorem (Pecaric and Dragomir, 1989a):

4.46. Theorem. If $f$ is an increasing convex function for $x \ge 0$, then

(4.70)

where $p_i \ge 0$, $P_n = \sum_{i=1}^n p_i$, $x_i \in V$ $(i = 1, \dots, n)$, and $V$ is an arbitrary normed vector space with norm $\|\cdot\|$.
As a special case of this inequality we have

$$\Bigl\|\sum_{i=1}^n x_i\Bigr\|^r \le \Bigl(\sum_{i=1}^n p_i^{1/(1 - r)}\Bigr)^{r - 1} \sum_{i=1}^n p_i \|x_i\|^r \quad \text{for } r > 1. \tag{4.71}$$

Similarly, by the reverse of Jensen's inequality, the reverse of the inequality in (4.70) holds under the same conditions on $f$ for the following values of the $p_i$'s:

$$p_1 > 0, \quad p_i \le 0 \ (i = 2, \dots, n), \quad \text{and} \quad P_n > 0. \tag{4.72}$$

Also note that if

qi:S 0 for i = 2, ... ,n, and q1:s

( ~ z l q i l l l ( l - ' »1-', ) (4.73)

then (4.74) As a consequence of (4.71) and (4.74) we then have

Ilxl + x211' II xllI' II xzll' - , , - - - ~ - = . : . c . . . . ~_ _ + __ u+v

u

v

for

uv(u+v»O,

II x1 + xzll':s /./X111' + Ilxzll' for uv(u + v) 2 or 0, and all continuous convex funcb-a p+q

$$0 < y \le \frac{b - a}{p + q} \min\{p, q\}. \tag{5.15}$$

5.12. Remarks. (a) Observe that (5.14) may be regarded as a refinement of the defining inequality for convex functions. (b) For $p = q = 1$ and $y = (b - a)/2$, (5.14) is the Hermite-Hadamard inequality. We now show that under the same conditions the Hermite-Hadamard inequality yields the following refinement of (5.14):

$$f\Bigl(\frac{pa + qb}{p + q}\Bigr) \le \frac{1}{2y}\int_{A - y}^{A + y} f(t)\,dt \le \frac{1}{2}\{f(A - y) + f(A + y)\} \le \frac{pf(a) + qf(b)}{p + q}. \tag{5.16}$$
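The chain (5.16) is easy to test numerically. The sketch below assumes $A = (pa + qb)/(p + q)$, as the first term of (5.16) suggests, enforces condition (5.15) on $y$, and approximates the integral mean with a composite midpoint rule; the function name `refinement_terms` and the sample function are illustrative, not part of the text.

```python
import math

def refinement_terms(f, a, b, p, q, y, steps=2000):
    """Return the four terms of (5.16), left to right, for a convex f on [a, b].

    Assumptions: A = (p*a + q*b)/(p + q) and 0 < y <= (b - a)/(p + q) * min(p, q),
    i.e. condition (5.15). The integral mean is a midpoint-rule approximation.
    """
    A = (p * a + q * b) / (p + q)
    limit = (b - a) / (p + q) * min(p, q)
    if not 0 < y <= limit:
        raise ValueError("y violates condition (5.15)")
    h = 2 * y / steps
    # (1/(2y)) * integral of f over [A - y, A + y], midpoint rule
    integral_mean = sum(f(A - y + (k + 0.5) * h) for k in range(steps)) * h / (2 * y)
    endpoint_mean = 0.5 * (f(A - y) + f(A + y))
    chord_value = (p * f(a) + q * f(b)) / (p + q)
    return [f(A), integral_mean, endpoint_mean, chord_value]

terms = refinement_terms(math.exp, a=0.0, b=2.0, p=1.0, q=3.0, y=0.4)
print(terms)   # the four values should appear in nondecreasing order
```

For this choice, $A = 1.5$ and the admissible range in (5.15) is $0 < y \le 0.5$; larger values of $y$ raise a `ValueError` because $[A - y, A + y]$ would leave $[a, b]$.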

First, observe that if $0 < y \le [(b - a)/(p + q)] \min\{p, q\}$, then by considering two cases ($0 < p \le q$ and $0 < q < p$), we can easily verify that $a \le A - y < A + y \le b$, so that $f$ is defined on $[A - y, A + y]$. By the Hermite-Hadamard inequality in (5.1) with $a$, $b$ replaced by $A - y$, $A + y$, we obtain

$$f(A) \le \frac{1}{2y}\int_{A - y}^{A + y} f(t)\,dt \le \frac{1}{2}[f(A - y) + f(A + y)]. \tag{5.17}$$

By the definition of convexity, we have for $a \le x_1 < x_2 < x_3 \le b$

$$f(x_2) \le \frac{x_3 - x_2}{x_3 - x_1} f(x_1) + \frac{x_2 - x_1}{x_3 - x_1} f(x_3).$$

Hence, taking $x_1 = a$ and $x_3 = b$, we obtain

$$f(A - y) \le \frac{b - (A - y)}{b - a} f(a) + \frac{A - y - a}{b - a} f(b), \tag{5.18}$$

$$f(A + y) \le \frac{b - (A + y)}{b - a} f(a) + \frac{A + y - a}{b - a} f(b). \tag{5.19}$$

From (5.17)-(5.19) we then have

$$f(A) \le \frac{1}{2y}\int_{A - y}^{A + y} f(t)\,dt \le \frac{1}{2}\{f(A - y) + f(A + y)\} \le \frac{b - A}{b - a} f(a) + \frac{A - a}{b - a} f(b) = \frac{pf(a) + qf(b)}{p + q},$$

proving (5.16).

(c) Note also that (5.14) can be proved from the integral analogues of Jensen's and Lah-Ribaric's inequalities, i.e., from the inequalities

$$f\Bigl(\int_a^b p(x) g(x)\,dx \Big/ \int_a^b p(x)\,dx\Bigr) \le \int_a^b p(x) f(g(x))\,dx \Big/ \int_a^b p(x)\,dx \le \frac{M - \bar{g}}{M - m} f(m) + \frac{\bar{g} - m}{M - m} f(M),$$

where $m \le g(x) \le M$ for all $x \in [a, b]$ and $\bar{g} = \bigl(\int_a^b p(x) g(x)\,dx\bigr) / \bigl(\int_a^b p(x)\,dx\bigr)$. Indeed, if we make the substitutions $p(x) = 1$, $g(x) = x$, $a = A - y$, $b = A + y$, $m = a$, and $M = b$, we obtain the result that (5.14) is valid if the following condition holds:

$$a \le A - y \le b \quad \text{and} \quad a \le A + y \le b. \tag{5.20}$$

We can then easily show that conditions (5.15) and (5.20) are equivalent. Therefore, condition (5.15) is sufficient. Now we show that it is also necessary. Let p >q and assume that (5.15) is not valid, i.e., y> (q(b - a»/(p + q). Since the function f(x) = a - x for x < a and f(x) = 0 for x2:a is convex, (5.14) becomes: 0 ~ ( a - A - y ) / 2 y ~a 0 con, tradiction. Therefore we must have y ~ q(b - a)/(p + q). Similarly, for the case p 0) for which A(g) = (pm

+ qM)/(p + q).

(5.23)

Then f(pm + qM) :SA(f(g» :spf(m) + qf(M). p+q p+q

(5.24)

Proof. Observe first that since m :s A(g) -s M, there always exist p ;::: 0, q ;::: 0, (p + q) > satisfying (5.23). The first inequality in (5.24) is just (2.6), while the second of (5.24) is (3.43). 0

°

5.14. Theorem. Suppose that $L$ satisfies conditions L1-L3 defined in (3.27)-(3.29) on a nonempty set $E$ and that $f$ is a continuous convex function on an interval $I$, while $g, h \in L$ with $f(g), f(h) \in L$. Let $A$, $B$ be isotonic linear functionals on $L$ for which $A(1) = B(1) = 1$. If $A(h) = B(g)$, $E_1 \in \mathcal{A}$ satisfies $A(C_{E_1}) > 0$ and $A(C_{E_2}) > 0$ where $E_2 = E \setminus E_1$, and if

$$A(hC_{E_1})/A(C_{E_1}) \le g(t) \le A(hC_{E_2})/A(C_{E_2}) \quad \text{for all } t \in E, \tag{5.25}$$

then

$$f(A(h)) \le B(f(g)) \le A(f(h)). \tag{5.26}$$

Proof. By Jessen's inequality we have

$$f\Bigl(\frac{A(hC_{E_i})}{A(C_{E_i})}\Bigr) \le \frac{A(f(h)C_{E_i})}{A(C_{E_i})}, \qquad i = 1, 2. \tag{5.27}$$

Applying Theorem 5.13 ((5.24)) to the isotonic linear functional $B$ and the function $g$ with $m = A(hC_{E_1})/A(C_{E_1})$, $M = A(hC_{E_2})/A(C_{E_2})$, for $p = A(C_{E_1})$, $q = A(C_{E_2})$, we have $p + q = A(C_E) = A(1) = 1$ and

$$B(g) = A(h) = A(hC_{E_1}) + A(hC_{E_2}) = \frac{pm + qM}{p + q}.$$

Hence, by (5.24) we obtain

$$f(A(h)) = f(B(g)) \le B(f(g)) \le A(C_{E_1}) f\Bigl(\frac{A(hC_{E_1})}{A(C_{E_1})}\Bigr) + A(C_{E_2}) f\Bigl(\frac{A(hC_{E_2})}{A(C_{E_2})}\Bigr) \le A(f(h)C_{E_1}) + A(f(h)C_{E_2}) \quad \text{(by (5.27))} \quad = A(f(h)),$$

proving (5.26). □

Note that again the inequality (5.26) is a refinement of Jessen's inequality and is also a generalization of an inequality given in Vasic, Lackovic, and Maksimovic (1980). Wang and Wang (1982) proved the following generalization of Theorem 5.11:

5.15. Theorem. Let $f: [a, b] \to \mathbb{R}$ be a convex function, $x_i \in [a, b]$, and $p_i > 0$ $(i = 0, 1, \dots, n)$. Then the following inequalities are valid:

n-l

+

2: xi1 -

n

tj + 1)t1 ••• tj

j=1

IT dt,

+ X n t 1 t 2 ••• tn )

i=1

(5.28)

where

$$(\alpha_i + \beta_i)/2 = \sum_{k=i}^n p_k \Big/ \sum_{k=i-1}^n p_k \quad \text{for } i = 1, \dots, n \tag{5.29}$$

and

$$0 \le \alpha_i < \beta_i \le 1 \quad \text{for } i = 1, \dots, n. \tag{5.30}$$

Proof. (Pecaric, 1989a). By the integral and discrete versions of Jensen's inequality, we have

f(~PiXi/itPi)

f{n (f3j - aj)-l f

fJ,

n

=

)=1

f ... f fJ,

n

:s)] (f3j - a'j}-l

a1

fJ,

n

-s

I ~(f3) -

aJ-1

n

/

n

~Pi-

f(xo(1- (1)

an

fJn

f ---f eXt

= ~pJ(Xi)

fJn

(f(xo)(l - (1)

(1'"

o

5.16. Remarks. (a) In Pecaric (1989a) the condition in (5.30) is weakened and a generalization to Theorem 5.15 for convex functions of several variables is given. Another extension of Theorem 5.15 is given in Hu (1986). (b) Generalizations of Hermite-Hadamard's inequalities are also given in Lackovic and Stankovic (1973) and Lackovic (1969). Note that the results in Lackovic and Stankovic (1973) are simple consequences of the well-known results for support line of convex functions. For example, by integration and Theorem 1.6 we can obtain the following generalization


of the first inequality in (5.1): b

f(c)

b

+ f ~ ( c ) ( xf p(x) dX) /

(f p(x) dX) -

a

cf~(C)

a b

b

~ ( fp (x)f(X)dx)/(JP(X)dx) a

where

for

a 1. Then for the convex function -log(1 +x), (5.50) becomes a generalization of the well-known Bernoulli inequality n

n

II (1 + Xi)"'::; 1 + L p.x, , i=l

i ~ l

and (5.52) gives the reverse inequality.

□

5.2.3. Combination Convexity Inequalities

Hwang and Yang (1985) gave some generalizations of results of Beckenbach (1969). First they proved:

5.48. Theorem. If $f \in K(b)$ ($K(b)$ is the class of real-valued functions that are continuous and nonnegative on $[0, b]$ with $f(0) = 0$) is convex on $[0, a]$ and starshaped on $[0, b]$, where $a \le b$, then for all real numbers $x_i$ in $[0, b]$ $(i = 1, \dots, n)$ and weights $\lambda_i > 0$ such that $\sum_{i=1}^n \lambda_i \le a/b$, we have

(5.65)

Furthermore, the constant $a/b$ is the best possible. By letting $\lambda_i = a w_i / b$ we obtain Beckenbach's (1969) inequality.

5.49. Theorem. If $f \in K(c)$ is starshaped on $[0, b]$ and superadditive on $[0, c]$, where $b \le c$, then for every $x$ in $[0, c]$ and every $\lambda \in [0, b/c]$ we have $f(\lambda x) \le (c/b)\lambda f(x)$.

5.50. Theorem. If $f \in K(c)$ is convex on $[0, a]$, starshaped on $[0, b]$, and superadditive on $[0, c]$, where $a \le b \le c$, then for all real numbers $x_i \in [0, b]$ $(i = 1, \dots, n)$ satisfying $\sum_{i=1}^n x_i = c_0 \le c$, we have $f((\lambda/n)c_0) \le (\lambda/n) f(c_0)$, where $0 \le \lambda \le a/b$. This theorem is also equivalent to Beckenbach's inequality.

5.51. Theorem. If $f \in K(c)$ is convex on $[0, a]$, starshaped on $[0, b]$, and superadditive on $[0, c]$, where $a \le b \le c$, then for all real numbers $x_i \in [0, c]$ $(i = 1, \dots, n)$ satisfying $\sum_{i=1}^n x_i \in [0, c]$ and all weights $\lambda_i > 0$ such that $\sum_{i=1}^n \lambda_i \le a/c$, we have

(5.66)

Proof. Denoting A= ~ 7 = A1 i we have Axi:S: (alc)x i E [0, a] for i = 1•...• n; and from Jensen's inequality we have n )... (C b ) -; f(Axi) = 2: -; f -b A- Xi . 1=1 1=1 C

f ( 2: AiXi :s: 2: n

1=1

)

n)... I\.

I\.

Since (blc)Xi E [0, b], (clb)A < 1, and f starshaped on [0. b], we obtain the first inequality in (5.66) from the fact that f c I b)A(b I c )x;) :s: (clb)Af«blc)x;). To prove the second inequality in (5.66). note thatfis increasing on [0, c] which implies

«

and

-bc

i

1=1

i

c AJ(!!-Xi) :S:-b f(i Xi) c 1=1 1=1

A i : S : ~ 1=1 b f (Xi)' i

Also. from f«blc)Xi):S:f(b) for i=l, ...• n, we have (clb) ~ 7 = AJ«blc)x;):s: 1 (alb )f(b); and from ~ 7 = 1AiXi:S: ~ 7 = 1AiC:S: a. we 1 :s: f(a). Since f is starshaped on [0, b]. we have obtain f ( ~ 7 = AiX;) f(a) =f«alb)b):S: (alb)f(b). 0 Note that when b = c. Theorem 5.48 is a special case of Theorem 5.51.

5.2.4. Inequalities for Sums of Order p

Let us denote and

where $a$ and $p$ are positive $n$-tuples such that $p_i \ge 1$ $(i = 1, \dots, n)$. The well-known inequality for sums of order $p$ states that (5.67) for $s > r > 0$ (see Hardy, Littlewood, and Pólya, 1934, 1952, pp. 28-30). But this inequality is also valid for $r < s < 0$ and $s < 0 < r$ (see Vasic and Pecaric, 1979a). Furthermore, Vasic and Pecaric also proved that (5.68) for s > r > 0 and a ~> 0; and that

a ~-

vs(a,p)::svr(a,p)

a2 - ... for

> 0 or r < s < 0 and

a ~

s>r>O,

or

r<s (a, p)

=

-1 ( ~ / i < / > ( a ; ) ) ,

(az) - ... - (a n

».

Here $\phi(x)$ is a continuous and strictly monotonic function that is positive for all positive $x$ and tends to $\infty$ either as $x \to 0$ or as $x \to \infty$. We shall also assume that the components of $a$ are all positive and that $p_i \ge 1$ $(i = 1, \dots, n)$. The following two results are simple consequences of Theorem 5.26 and Corollary 5.27 (Vasic and Pecaric, 1979a, and Hardy, Littlewood, and Pólya, 1934, 1952, pp. 84-85).

5.52. Theorem. If $\psi$ and $\chi$ are continuous, positive, and strictly monotonic, then $v_\psi$ and $v_\chi$ ($v_\phi$ is either $v_\phi(a)$ or $v_\phi(a, p)$) are comparable whenever (i) $\psi$ and $\chi$ vary in the opposite directions, or (ii) $\psi$ and $\chi$ vary in the same direction and $\chi/\psi$ is monotonic. In case (i) we have (5.70) if $\psi$ decreases and $\chi$ increases. In case (ii) (5.70) holds if $\chi/\psi$ decreases.

5.53. Theorem. If $\psi$ and $\chi$ are continuous, positive, and strictly monotonic, then $D_\psi$ and $D_\chi$ are comparable under the conditions in case (ii) of Theorem 5.52, and the inequality

(5.71)

holds if $\chi/\psi$ is decreasing.

Jensen (1906) used (5.67) in the proof of the following theorem (see also Mitrinovic, 1970, p. 52 and p. 78):

5.54. Theorem. Let $a_{ij}$ $(i = 1, \dots, n;\ j = 1, \dots, m)$ be positive real numbers and $r_1, \dots, r_m$ be positive real numbers satisfying $r_1^{-1} + \cdots + r_m^{-1} \ge 1$. Then

Similarly, by using generalizations of (5.67) we can give some further generalizations of Theorem 5.54 (see Vasic and Pecaric, 1979a). The following result of Mulholland (1950) presents a generalization of Minkowski's inequality:

5.55. Theorem. Let the function $f$ be increasing and convex for $x \ge 0$ and $f(0) = 0$. Furthermore, let the function $F$ defined by $F(t) = \log f(e^t)$ be convex for all real $t$. Then

$$v_f(a + b) \le v_f(a) + v_f(b) \tag{5.72}$$

holds for all $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n)$ such that $a_i \ge 0$ and $b_i \ge 0$ $(i = 1, \dots, n)$.

Note that Milovanovic and Milovanovic (1978) proved

$$v_f(a + b, p) \le v_f(a, p) + v_f(b, p) \tag{5.73}$$

under the same conditions, and another inequality for sums was given by Klamkin and Newman (1975):

$$\Bigl((r + 1) \sum_{i=1}^n a_i^r\Bigr)^{1/(r+1)} \ge \Bigl((s + 1) \sum_{i=1}^n a_i^s\Bigr)^{1/(s+1)}, \tag{5.74}$$

where $0 = a_0 \le a_1 \le \cdots \le a_n$, $a_i - a_{i-1} \le 1$ $(i = 1, \dots, n)$, $r \ge 1$, and $s + 1 \ge 2(r + 1)$.
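The Klamkin-Newman inequality (5.74) can be explored with a few lines of Python. The sketch below checks the hypotheses stated after (5.74) and returns both sides; the function name `klamkin_newman_sides` and the test sequences are illustrative assumptions.

```python
def klamkin_newman_sides(a, r, s):
    """Return (left, right) sides of (5.74) for the sequence a_1 <= ... <= a_n.

    Assumptions checked here: a_0 = 0, 0 <= a_i - a_{i-1} <= 1, r >= 1,
    and s + 1 >= 2 (r + 1).
    """
    previous = 0.0                      # a_0 = 0
    for value in a:
        if not 0 <= value - previous <= 1:
            raise ValueError("increments must lie in [0, 1]")
        previous = value
    if r < 1 or s + 1 < 2 * (r + 1):
        raise ValueError("need r >= 1 and s + 1 >= 2 (r + 1)")
    left = ((r + 1) * sum(x ** r for x in a)) ** (1.0 / (r + 1))
    right = ((s + 1) * sum(x ** s for x in a)) ** (1.0 / (s + 1))
    return left, right

left, right = klamkin_newman_sides([0.5, 1.2, 2.0, 2.9], r=1, s=3)
print(left, right)                      # left >= right

# With unit increments and r = 1, s = 3 the two sides agree up to rounding,
# since (1 + 2 + ... + n)^2 = 1^3 + 2^3 + ... + n^3.
print(klamkin_newman_sides([1.0, 2.0, 3.0, 4.0], r=1, s=3))
```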

A weighted version of (5.74) is given in Meir (1981). A refinement of one of Meir's results is given in Milovanovic and Milovanovic (1986), and an improvement can be found in Pecaric (1989a). In the following we state a generalization of Meir's result given in I. Milovanovic (1980) and Milovanovic and Milovanovic (1986).

5.56. Theorem. Let $f(x)$ and $g(x)$ be differentiable functions on $[0, \infty)$ satisfying $f(0) = f'(0) = g(0) = g'(0) = 0$. Suppose that $f'(x)$ and $g'(x)$ are convex on $[0, \infty)$, and denote $h(x) = g(f(x))$. Then for any given $0 = a_0 \le a_1 \le \cdots \le a_n$ and $0 \le p_0 \le p_1 \le \cdots \le p_n$ satisfying

$$a_i - a_{i-1} \le \tfrac{1}{2}(p_i + p_{i-1}) \quad \text{for } i = 1, \dots, n, \tag{5.75}$$

we have

$$h^{-1}\Bigl(\sum_{i=1}^n p_i h'(a_i)\Bigr) \le f^{-1}\Bigl(\sum_{i=1}^n p_i f'(a_i)\Bigr). \tag{5.76}$$

If instead of (5.75) the following condition

$$a_i - a_{i-1} \le p_i \quad \text{for } i = 1, \dots, n \tag{5.77}$$

is satisfied, then we have

$$h^{-1}\Bigl(\sum_{i=1}^{n-1} \frac{p_i + p_{i+1}}{2} h'(a_i)\Bigr) \le f^{-1}\Bigl(\sum_{i=1}^{n-1} \frac{p_i + p_{i+1}}{2} f'(a_i)\Bigr). \tag{5.78}$$

Klamkin and Newman (1975) gave a similar integral inequality which was generalized by Pecaric (1985d):

5.57. Theorem. Let $g: [a, b] \to \mathbb{R}$ be a nonnegative differentiable function with $g(a) = 0$, let $f: [0, \infty) \to [0, \infty)$ and $w: [0, \infty) \to [0, \infty)$ be differentiable increasing functions with $f(0) = w(0) = 0$, and let $p: [a, b] \to \mathbb{R}$ be a nonnegative integrable function.

(a) Let $0 \le g'(x) \le p(x)$ for all $x \in [a, b]$. If $w$ is a convex function, i.e., $w'$ is an increasing function, then

$$h^{-1}\Bigl(\int_a^b p(x) h'(g(x))\,dx\Bigr) \le f^{-1}\Bigl(\int_a^b p(x) f'(g(x))\,dx\Bigr), \tag{5.79}$$

where $h(x) = w(f(x))$. If $w$ is concave (i.e., if $w'$ is decreasing), then the reverse of the inequality in (5.79) holds.

(b) If $g'(x) \ge p(x)$ for all $x \in [a, b]$, then the reverse of the results in (a) is valid. Furthermore, the equality in (5.79) holds iff $g(x) = \int_a^x p(t)\,dt$.

Proof. (a) Since $f'(g(x)) g'(x) \le p(x) f'(g(x))$, we have

$$f(g(x)) \le \int_a^x f'(g(t)) p(t)\,dt.$$

We also have $h'(g(x)) = w'(f(g(x))) f'(g(x))$. Then, using the fact that $w'$ is an increasing function, we obtain

$$p(x) h'(g(x)) \le p(x) f'(g(x))\, w'\Bigl(\int_a^x p(t) f'(g(t))\,dt\Bigr) \tag{5.80}$$

and

$$\int_a^b p(x) h'(g(x))\,dx \le w\Bigl(\int_a^b p(x) f'(g(x))\,dx\Bigr). \tag{5.81}$$

We then obtain the inequality in (5.79) from (5.81). If $w$ is concave, then the reverse of the inequalities in (5.80) and (5.81) are valid; thus the reverse of the inequality in (5.79) holds. The proof of (b) is similar. □

For $f(x) = x^{n+1}$ and $w(x) = x^{(m+1)/(n+1)}$ ($x \in [0, \infty)$), it follows from Theorem 5.57 that

5.58. Corollary. Let $g\colon [a, b] \to \mathbb{R}$ be a nonnegative differentiable function such that $g(a) = 0$ and $0 \le g'(x) \le p(x)$ for all $x \in [a, b]$, where $p\colon [a, b] \to \mathbb{R}$ is a nonnegative integrable function. If $m \ge n \ge 0$, then

$$\Bigl((m + 1)\int_a^b p(x) (g(x))^m\,dx\Bigr)^{1/(m+1)} \le \Bigl((n + 1)\int_a^b p(x) (g(x))^n\,dx\Bigr)^{1/(n+1)}. \tag{5.82}$$

If $g'(x) \ge p(x)$ for all $x \in [a, b]$, then the reverse of the inequality in (5.82) holds.

For $p(x) = M$ for all $x \in [a, b]$, we obtain a result of Klamkin and Newman (1975).
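To make (5.82) concrete, the sketch below compares both sides numerically for $g(x) = \sin x$ on $[0, 1]$ with $p \equiv 1$ (so that $g(0) = 0$ and $0 \le g' \le p$), and for the equality case $g(x) = x$. The midpoint-rule helper `integrate` and all names are assumptions introduced only for this illustration.

```python
import math


def integrate(func, lo, hi, steps=20000):
    """Composite midpoint rule; a plain numerical helper, not taken from the source."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def corollary_sides(g, p, a, b, m, n):
    """Return (left, right) of (5.82) for the given g, p on [a, b] and exponents m >= n >= 0."""
    left = ((m + 1) * integrate(lambda x: p(x) * g(x) ** m, a, b)) ** (1.0 / (m + 1))
    right = ((n + 1) * integrate(lambda x: p(x) * g(x) ** n, a, b)) ** (1.0 / (n + 1))
    return left, right


def one(x):
    return 1.0


# g = sin satisfies g(0) = 0 and 0 <= g'(x) = cos(x) <= 1 = p(x) on [0, 1].
left, right = corollary_sides(math.sin, one, 0.0, 1.0, m=3, n=1)

# Equality case of Theorem 5.57: g(x) equals the integral of p from a to x, i.e. g(x) = x.
eq_left, eq_right = corollary_sides(lambda x: x, one, 0.0, 1.0, m=3, n=1)
```

With these choices `left` should be the smaller value, and `eq_left`, `eq_right` should both be close to 1.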


5.59. Remark. The following result similar to (5.48) is also valid (Delange, 1947): Suppose that f(x) and g(x) are real-valued and continuous functions on [a, b]. Letf"be continuous on [a, b] withf":50 and let g/ be continuous on the same interval. Furthermore, let g(a)=f(a), g(b)=f(b), and g ( x ) ~ f ( x f) or a<x 0 we have

Proof (Pecaric, 1986a). The inequality in (6.2) can be easily proved by applying Theorem 1.7; i.e., for (6.2) to hold for every continuous convex


6. Popoviciu's, Burkill's, and Steffensen's Inequalities

function on $[a, b]$ (and hence for every convex function on $[a, b]$), it suffices to show that it holds for the functions

$$f_1(x) = \alpha x + \beta \quad \text{for } a \le x \le b$$

and

$$f_2(x) = |x - c| \quad \text{for } a \le x \le b,$$

where $c \in [a, b]$ is arbitrary but fixed. Obviously $f_1$ satisfies (6.2); thus we need only show that $f_2$ also satisfies (6.2). For real numbers $x, y, z, c$ and positive real numbers $p$, $q$, and $r$ the inequality

$$(p + q + r)\Bigl|\frac{px + qy + rz}{p + q + r} - c\Bigr| - (q + r)\Bigl|\frac{qy + rz}{q + r} - c\Bigr| - (r + p)\Bigl|\frac{rz + px}{r + p} - c\Bigr| - (p + q)\Bigl|\frac{px + qy}{p + q} - c\Bigr| + p|x - c| + q|y - c| + r|z - c| \ge 0$$

is equivalent to

$$|p(x - c) + q(y - c) + r(z - c)| - |q(y - c) + r(z - c)| - |r(z - c) + p(x - c)| - |p(x - c) + q(y - c)| + |p(x - c)| + |q(y - c)| + |r(z - c)| \ge 0,$$

which is just Hlawka's inequality (Mitrinovic, 1970, p. 171)

$$|u + v + w| - |u + v| - |v + w| - |w + u| + |u| + |v| + |w| \ge 0,$$

where $u$, $v$, $w$ are arbitrary real vectors in a pre-Hilbert (unitary) space, with $u = p(x - c)$, $v = q(y - c)$, and $w = r(z - c)$. □
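Hlawka's inequality, on which the proof rests, is easy to probe numerically. The following sketch, written under the assumption of the Euclidean norm on $\mathbb{R}^3$ and with hypothetical helper names, evaluates the left-hand side for randomly drawn vectors; it is a sanity check rather than a proof.

```python
import math
import random


def norm(v):
    """Euclidean norm of a tuple of reals."""
    return math.sqrt(sum(c * c for c in v))


def add(*vectors):
    """Componentwise sum of vectors of equal length."""
    return tuple(sum(components) for components in zip(*vectors))


def hlawka_gap(u, v, w):
    """Left-hand side of Hlawka's inequality; it should be nonnegative."""
    return (norm(add(u, v, w)) - norm(add(u, v)) - norm(add(v, w)) - norm(add(w, u))
            + norm(u) + norm(v) + norm(w))


rng = random.Random(0)  # fixed seed so the run is reproducible


def random_vector(dim=3):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(dim))


gaps = [hlawka_gap(random_vector(), random_vector(), random_vector()) for _ in range(1000)]
min_gap = min(gaps)

# A degenerate case with equality: u = -v, w = 0.
zero_gap = hlawka_gap((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
```

The one-dimensional case $u = p(x - c)$, $v = q(y - c)$, $w = r(z - c)$ used in the proof corresponds to calling `hlawka_gap` with tuples of length one.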

6.3. Remarks. (a) Theorem 6.2 was proved by Burkill (1974) under the assumption that $f$ is twice differentiable. In Vasic and Stankovic (1976) and Baston (1976) this assumption was removed, and both proofs use Fuchs's generalization of a majorization theorem. Another proof of Theorem 6.2 can be found in Lupas (1982). We note that the differentiability condition in Burkill's result can be directly eliminated by using the fact that it is possible to approximate uniformly a continuous convex function by convex polynomials. Using the fact that the convexity of $f$ on $[a, b]$ implies its continuity on $(a, b)$ and $f(a) \ge f(a+)$, $f(b) \ge f(b-)$, we can easily prove the validity of (6.2) for an arbitrary convex function. (b) The previous proof of Theorem 6.2 is similar to Popoviciu's proof of inequality (6.1). □

6.1. Inequalities of Popoviciu and Burkill


As in Vasic and Stankovic (1976), let us denote by $(C_{n,k})$ the inequality

where $1 \le i_1 < \cdots < i_k \le n$, $p_i > 0$, $x_i \in [a, b]$ ($i = 1, \dots, n$), and $f\colon [a, b] \to \mathbb{R}$ is a convex function. (In fact, $(C_{3,2})$ is just inequality (6.2).) Then the implication

$$(C_{3,2}) \Rightarrow (C_{n,k}) \quad \text{for } 2 \le k \le n - 1, \; n \ge 3$$

holds, and it was proved in Vasic and Stankovic (1976) (see also Pecaric, 1986a).

6.4. Remarks. (a) In the previous discussion, we require that $p_i > 0$ ($i = 1, \dots, n$) for defining $(C_{n,k})$. This condition can be replaced by a weaker one, namely, that the real numbers $p_i$ ($i = 1, \dots, n$) are nonnegative such that

$$p_{i_1} + \cdots + p_{i_k} > 0 \quad \text{for } 1 \le i_1 < \cdots < i_k \le n.$$

This is equivalent to the condition that no more than $k - 1$ of the $p_i$'s can be zero. In fact, without loss of generality we can assume that if the function $f$ is continuous and convex, then the inequality for $(C_{n,k})$ follows under weaker conditions as we let $p_i \to 0+$. (b) Vasic and Stankovic (1976) also proved the reverse implication $(C_{n,k}) \Rightarrow (C_{3,2})$ using Popoviciu's inequality. This implication can be proved as follows: let $p_n = 0$ ($p_n \to 0$) in $(C_{n,k})$, then $p_{n-1} = 0$. Continuing this process to $p_4 = 0$, we finally obtain $(C_{3,2})$ (Pecaric, 1986a). □

A similar result is:

6.5. Theorem. Let the function $f\colon (0, a] \to \mathbb{R}$ be such that $f(x)/x$ is convex of order $m - 1$, and let $x_i \in (0, a]$ ($i = 1, \dots, m$), $\sum_{i=1}^{m} x_i \in (0, a]$. Then the inequality

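$$f\Bigl(\sum_{i=1}^{m} x_i\Bigr) - \sum_{\binom{m}{m-1}} f\Bigl(\sum_{j=1}^{m-1} x_{i_j}\Bigr) + \sum_{\binom{m}{m-2}} f\Bigl(\sum_{j=1}^{m-2} x_{i_j}\Bigr) - \cdots + (-1)^{m-1} \sum_{i=1}^{m} f(x_i) \ge 0 \tag{6.3}$$

holds, where $\sum_{\binom{m}{k}}$ denotes summation over all $1 \le i_1 < \cdots < i_k \le m$.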

Proof. We give the proof for $m = 2$ and $m = 3$ only. For $m = 2$, (6.3) becomes $f(x_1 + x_2) \ge f(x_1) + f(x_2)$, which is obviously true (see Theorem 5.26). For $m = 3$, (6.3) becomes

$$f(x + y + z) - f(x + y) - f(y + z) - f(z + x) + f(x) + f(y) + f(z) \ge 0. \tag{6.4}$$

By the substitutions

$$f(x) \to f(x)/x, \quad x_1 = x + y + z, \quad x_2 = x + y, \quad \text{and} \quad x_3 = x \quad \text{for } x, y, z > 0$$

and the inequality in (1.5) we have

$$\frac{f(x + y + z)}{z(y + z)(x + y + z)} - \frac{f(x + y)}{zy(x + y)} + \frac{f(x)}{xy(y + z)} \ge 0,$$

i.e.,

$$\frac{xy}{(y + z)(x + y + z)} f(x + y + z) - \frac{x}{x + y} f(x + y) + \frac{z}{y + z} f(x) \ge 0.$$

If $y$ and $z$ are interchanged, we have

$$\frac{xz}{(y + z)(x + y + z)} f(x + y + z) - \frac{x}{x + z} f(x + z) + \frac{y}{y + z} f(x) \ge 0.$$

Thus by adding these inequalities we have

$$\frac{x}{x + y + z} f(x + y + z) - \frac{x}{x + y} f(x + y) - \frac{x}{x + z} f(x + z) + f(x) \ge 0.$$

Similarly, we have

$$\frac{y}{x + y + z} f(x + y + z) - \frac{y}{x + y} f(x + y) - \frac{y}{y + z} f(y + z) + f(y) \ge 0,$$

$$\frac{z}{x + y + z} f(x + y + z) - \frac{z}{y + z} f(y + z) - \frac{z}{x + z} f(x + z) + f(z) \ge 0.$$

By adding the last three inequalities we then obtain (6.4). □
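A quick worked check of the case $m = 3$: for $f(x) = x^3$ the quotient $f(x)/x = x^2$ is convex, and expanding the left-hand side of (6.4) gives exactly $6xyz$, which is nonnegative for positive arguments. For $f(x) = x^2$ (where $f(x)/x = x$ is linear) the same expression vanishes identically. The sketch below, with illustrative names that are not from the source, evaluates the expression directly.

```python
def popoviciu_m3(f, x, y, z):
    """Left-hand side of (6.4) for a function f and positive numbers x, y, z."""
    return (f(x + y + z) - f(x + y) - f(y + z) - f(z + x)
            + f(x) + f(y) + f(z))


def cube(t):
    return t ** 3


def square(t):
    return t ** 2


# For f(t) = t**3 the expression equals 6*x*y*z.
x, y, z = 0.5, 1.0, 2.0
cube_value = popoviciu_m3(cube, x, y, z)
expected = 6.0 * x * y * z

# For f(t) = t**2 the expression is identically zero.
square_value = popoviciu_m3(square, x, y, z)
```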



6.6. Remark. Popoviciu (1946) proved the inequality in (6.3) for the case in which $f\colon [0, a] \to \mathbb{R}$ is convex, $f(0) = 0$, and $f^{(m-1)}$ exists. Vasic (1968c) proved the same inequality for an arbitrary 3-convex function $f$ without the existence of $f^{(2)}$ and then used this result to prove a more general result. A different proof of Vasic's result was given in Pecaric (1980e). Keckic (1970) showed that the differentiability condition in Popoviciu's result can be removed; he gave a proof for $m = 4$ only and concluded that the same procedure can be extended to any $m$. Lackovic (1975) noted that his proof remains correct if $f$ is continuous on $[0, a)$, $f(0) = 0$, and $f(x)/x$ is an $(n - 1)$-convex function on $[0, a)$. In fact, if $f(0) = 0$ then clearly (6.3) becomes an equality for some $x_i = 0$. Thus if $f$ is defined on $[0, a]$ and $f(0) = 0$, then the result can be extended immediately. However, the proof we give here remains correct even if $f$ is not defined at $x = 0$. Thus our result is more general than the result obtained in Lackovic (1975) (see Pecaric, 1986a). □

A generalization of (6.3) that is similar to $(C_{n,k})$ was given by Keckic (1970). Additional generalizations of results of this type can be found in Vasic and Adamovic (1969), Keckic (1969), and Pecaric (1986a). In the following we state the results due to Pecaric. Let $D$ be a commutative additive semigroup, and let the nonempty set $E \subset D$ satisfy the condition

$$a_i \in E \text{ for } i = 1, \dots, n \quad \text{and} \quad \sum_{i=1}^{n} a_i \in E \;\Rightarrow\; \sum_{j=1}^{m} a_{i_j} \in E.$$

Further, let $G$ be a commutative additive group which is totally ordered (i.e., $G$ possesses a total ordering relation $\le$ satisfying the condition: $a, b, c \in G$ and $a < b \Rightarrow a + c < b + c$). We start with the following simple result which will be used later.

6.7. Theorem. For any given function $f\colon E \to G$ and $n = 2, 3, \dots$ let $(C_n)$ denote the condition

$$f\Bigl(\sum_{i=1}^{n} a_i\Bigr) \le \sum_{i=1}^{n} f(a_i), \quad n \ge 2, \quad a_i \in E \text{ for } i = 1, \dots, n, \quad \text{and} \quad \sum_{i=1}^{n} a_i \in E.$$

Then (a) the implication $(C_2) \Rightarrow (C_n)$ is valid; (b) for $n > 2$ the implication $(C_n) \Rightarrow (C_2)$ is valid if the neutral element $0$ of $D$ exists, $0 \in E$, and $f(0) = 0$.

Proof. (a) can be easily proved by induction, and the proof of (b) is obvious. □

6.8. Remark. A function satisfying the condition $(C_2)$ is usually called subadditive. □

The following result is a minor modification of a theorem in Vasic and Adamovic (1969):

6.9. Theorem. For any given function $f\colon E \to G$ and $2 \le k < n$ let $(C_{n,k})$ denote the condition

where $a_i \in E$ for $1 \le i \le n$, $\sum_{i=1}^{n} a_i \in E$. Then (a) the implication $(C_{3,2}) \Rightarrow (C_{n,k})$ is valid. (b) For $(n, k) \ne (3, 2)$ the implication $(C_{n,k}) \Rightarrow (C_{3,2})$ is valid if $D$ has the neutral element $0$, $0 \in E$, and $f(0) = 0$.

Proof. The proof is similar to the proof in Vasic and Adamovic (1969). □

6.10. Remark. Under the assumption $0 \in E$, we can obtain the result of Vasic and Adamovic (1969) by applying Theorem 6.2 to the function $f(x) - f(0)$. □

6.11. Remark. A function $f$ satisfying condition $(C_{3,2})$ is said to be H-positive (Hlawka-positive; see Burkill, 1974 and Baston, 1976). □

The preceding theorems are generalized by the following theorem which simultaneously generalizes the analogous result in Keckic (1969).


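6.12. Theorem. For any given function $f\colon E \to G$ and $3 \le m \le k + 1 \le n$ let $(C_{m,n,k})$ denote the condition

$$\sum_{\binom{n}{k}} f\Bigl(\sum_{j=1}^{k} a_{i_j}\Bigr) \le \binom{n - m + 1}{k - m + 2} \sum_{\binom{n}{m-2}} f\Bigl(\sum_{j=1}^{m-2} a_{i_j}\Bigr) - \binom{k - m + 2}{1}\binom{n - m + 2}{k - m + 3} \sum_{\binom{n}{m-3}} f\Bigl(\sum_{j=1}^{m-3} a_{i_j}\Bigr) + \cdots + (-1)^{m-1}\binom{k - 2}{m - 3}\binom{n - 2}{k - 1} \sum_{i=1}^{n} f(a_i) + \binom{n - m + 1}{k - m + 1} f\Bigl(\sum_{i=1}^{n} a_i\Bigr)$$

for $a_i \in E$ ($1 \le i \le n$), $\sum_{i=1}^{n} a_i \in E$, where, for all $k < n$, $\sum_{\binom{n}{k}}$ denotes summation over all $1 \le i_1 < \cdots < i_k \le n$.

Then (a) the implication $(C_{m,m,m-1}) \Rightarrow (C_{m,n,k})$ is valid; (b) the implication $(C_{m,n,k}) \Rightarrow (C_{m,m,m-1})$ is valid if $D$ has the neutral element $0$, $0 \in E$, and $f(0) = 0$.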

6.13. Remarks. (a) Under the assumption $0 \in E$, we can obtain the result of Keckic (1969) by applying Theorem 6.5 to the function $f(x) - f(0)$. (b) We shall say that a function $f$ is superadditive of $n$th order if it satisfies the condition $(C_{n,n,n-1})$. For $n = 2$, this property reduces to ordinary superadditivity, and, for $n = 3$, to H-positivity. □

For a given function $f\colon E \to G$ ($E$ and $G$ are as defined above) and for a given sequence $\{a_i\}_{i \in \mathbb{N}}$ (defined on the set $\mathbb{N}$ of all natural numbers or on some sufficiently large subset $M$ of $\mathbb{N}$), we define

$$\Phi(I) = \Phi(I, a, f) = f\Bigl(\sum_{i \in I} a_i\Bigr) - \sum_{i \in I} f(a_i) \quad \text{for } a_i \in E, \; i \in I, \text{ and } \sum_{i \in I} a_i \in E$$

to be a function on the power set of $M$. Then we have the following

6.14. Theorem. (a) Let $I$ and $J$ be disjoint nonempty sets of natural numbers, and let $a_i \in E$ ($i \in I \cup J$), and $\sum_{i \in I \cup J} a_i \in E$. If the function $f$ satisfies condition $(C_2)$, then the inequality

$$\Phi(I \cup J) \le \Phi(I) + \Phi(J)$$

holds; i.e., the function $\Phi$, in a limited sense, is subadditive.

(b) Let $I$, $J$, and $K$ be disjoint nonempty sets of natural numbers, and let $a_i \in E$ ($i \in I \cup J \cup K$) and $\sum_{i \in I \cup J \cup K} a_i \in E$. If the function $f$ satisfies condition $(C_{3,2})$, then the inequality

$$\Phi(I \cup J \cup K) - \Phi(J \cup K) - \Phi(K \cup I) - \Phi(I \cup J) + \Phi(I) + \Phi(J) + \Phi(K) \ge 0$$

holds; i.e., the function $\Phi$ is H-positive, also in a limited sense.

(c) More generally, let $a_i \in E$ ($i \in \bigcup_{k=1}^{n} I_k = A$), $\sum_{i \in A} a_i \in E$. If the function $f$ satisfies the condition $(C_{n,n,n-1})$, then

$$\sum_{\binom{n}{n-1}} \Phi\Bigl(\bigcup_{j=1}^{n-1} I_{i_j}\Bigr) \le \sum_{\binom{n}{n-2}} \Phi\Bigl(\bigcup_{j=1}^{n-2} I_{i_j}\Bigr) - \sum_{\binom{n}{n-3}} \Phi\Bigl(\bigcup_{j=1}^{n-3} I_{i_j}\Bigr) + \cdots + (-1)^{n-1} \sum_{i=1}^{n} \Phi(I_i) + \Phi\Bigl(\bigcup_{i=1}^{n} I_i\Bigr),$$

which means that the function $\Phi$ is superadditive of $n$th order, again in a limited sense.

6.15. Remark. In the following we shall confine our attention to the subadditive case because properties of positive or superadditive functions of index set can be preserved by passing to the limit. □

Proof of Theorem 6.14. Letting $a_1 = \sum_{i \in I} a_i$ and $a_2 = \sum_{i \in J} a_i$ in $f(a_1 + a_2) \le f(a_1) + f(a_2)$, we obtain

$$\Phi(I \cup J) = f\Bigl(\sum_{i \in I \cup J} a_i\Bigr) - \sum_{i \in I \cup J} f(a_i) \le f\Bigl(\sum_{i \in I} a_i\Bigr) - \sum_{i \in I} f(a_i) + f\Bigl(\sum_{i \in J} a_i\Bigr) - \sum_{i \in J} f(a_i) = \Phi(I) + \Phi(J).$$

This establishes the result in (a). The proof of (b) and (c) is similar. □

6.16. Corollary. Let $I_k = \{1, \dots, k\}$ ($k = 1, \dots, n$), $a_i \in E$ ($i = 1, \dots, n$), and $\sum_{i=1}^{n} a_i \in E$. If the function $f$ satisfies condition $(C_2)$, then the following inequalities hold:

$$\Phi(I_n) \le \Phi(I_{n-1}) \le \cdots \le \Phi(I_2) \le 0,$$

$$\Phi(I_n) \le \min_{1 \le i < j \le n} \bigl(f(a_i + a_j) - f(a_i) - f(a_j)\bigr) \le 0. \tag{6.5}$$

Proof. For $I = I_{n-1}$ and $J = \{n\}$, (6.1) yields $\Phi(I_n) \le \Phi(I_{n-1})$; and by (6.4) we have

$$\Phi(I_n) \le \Phi(I_2) = f(a_1 + a_2) - f(a_1) - f(a_2).$$

Since the points $a_1$ and $a_2$ can be replaced by arbitrary $a_i$ and $a_j$, (6.5) follows. □

6.17. Remarks. (a) Corollary 6.16 is an improvement of Theorem 6.7(a). (b) Let $L$ be a real linear space and $U$ be a convex set in $L$. A function $f\colon U \to \mathbb{R}$ is said to be convex if for all $x_1, x_2 \in U$ and $\alpha \in (0, 1)$

$$f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2) \tag{6.6}$$

(see Definition 1.1). We show that the well-known Jensen's inequality for convex functions can be obtained easily from (6.6) using Theorem 6.7(a). For this purpose we introduce a semigroup structure on $L \times \mathbb{R}_+$ ($= D$). The operation "+" is defined by

$$X + Y = (x, p) + (y, q) = \Bigl(\frac{px + qy}{p + q}, \; p + q\Bigr) \quad \text{for } X, Y \in D. \tag{6.7}$$

We can easily show that this operation is commutative and associative and that

$$X_1, \dots, X_n \in E \;\Rightarrow\; X_1 + \cdots + X_n \in E,$$

where $E = U \times \mathbb{R}_+$. We further define the function $g\colon E \to \mathbb{R}$ by

$$g(X) = p f(x) \quad \text{for } X = (x, p) \in E. \tag{6.8}$$

The condition $(C_n)$ in Theorem 6.7 becomes

$$\Bigl(\sum_{i=1}^{n} p_i\Bigr) f\Bigl(\frac{\sum_{i=1}^{n} p_i x_i}{\sum_{i=1}^{n} p_i}\Bigr) \le \sum_{i=1}^{n} p_i f(x_i), \tag{6.9}$$

which is Jensen's inequality for the convex function $f\colon U \to \mathbb{R}$. We note that condition $(C_2)$ is equivalent to inequality (6.6) defining the convexity of $f$. Similarly, using Theorem 6.14(a) we can obtain Theorem 3.14 and Corollary 3.15(a) when the $p_i$'s are positive. □

Note that Vasic and Stankovic's implication $(C_{3,2}) \Rightarrow (C_{n,k})$ for convex functions follows directly from Theorem 6.9(a) applied to the case when $D = \mathbb{R} \times \mathbb{R}_+$, $E = [a, b] \times \mathbb{R}_+$, where the operator "+" is defined in (6.7) and the function $g\colon E \to \mathbb{R}$ in (6.8). Also note that inequality (6.3) is identical to the condition $(C_{m,m,m-1})$. Thus, by Theorem 6.12, condition $(C_{m,n,k})$ is also valid. An interesting application of Theorem 6.14(a) is that the set function considered in Theorem 6.12 is not only subadditive, as shown in Theorem 3.14, but also H-positive (Pecaric, 1986a).
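The semigroup construction of Remark 6.17(b) can be sketched in a few lines. The code below takes $L = U = \mathbb{R}$ and $f(t) = t^2$ as an illustrative assumption, represents an element $X = (x, p)$ as a Python tuple, and folds the operation (6.7) over a list of weighted points; the names `combine` and `make_g` are hypothetical.

```python
from functools import reduce


def combine(X, Y):
    """Operation (6.7): (x, p) + (y, q) = ((p*x + q*y) / (p + q), p + q), with p, q > 0."""
    (x, p), (y, q) = X, Y
    return ((p * x + q * y) / (p + q), p + q)


def make_g(f):
    """Build g(X) = p * f(x) for X = (x, p), as in (6.8)."""
    return lambda X: X[1] * f(X[0])


def square(t):
    return t * t


g = make_g(square)

# Hypothetical weighted points (x_i, p_i) with positive weights.
points = [(-1.0, 0.5), (0.0, 1.5), (2.0, 1.0), (3.0, 2.0)]

# X_1 + ... + X_n is (weighted mean, total weight).
total = reduce(combine, points)

# Subadditivity of g over the semigroup is exactly Jensen's inequality for f.
jensen_left = g(total)
jensen_right = sum(g(X) for X in points)

# Associativity check on three of the points.
A, B, C = points[0], points[1], points[2]
assoc_left = combine(combine(A, B), C)
assoc_right = combine(A, combine(B, C))
```

Folding with `reduce` relies on the associativity noted in the remark; `jensen_left` should not exceed `jensen_right` for a convex `f`.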

6.18. Remark. Let $a$ be a given positive $n$-tuple, $k$ an integer such that $1 \le k \le n$, and denote by $a_1^{(k)}, \dots, a_K^{(k)}$ the $K = \binom{n}{k}$ $k$-tuples formed from the elements of $a$. If $s, t \in \mathbb{R}$, then the mixed mean of order $s$ and $t$ of the positive $n$-tuple $a$, when taking $k$ at a time, is

$$M(s, t; k, a) = M_K^{[s]}\bigl(M_k^{[t]}(a_1^{(k)}), \dots, M_k^{[t]}(a_K^{(k)})\bigr),$$

where $M_n^{[u]}$ denotes the power mean of order $u$ of a given $n$-tuple. Ozeki (1973) considered the special cases $A_k = M(1, 0; k, a)$, $B_k = M(0, 1; k, a)$, and $C_k = M(1, -1; k, a)$. An inequality for $A_k$ was given by Kober (1958) (see also Mitrinovic, 1970, p. 380). Note that the following consequences of (6.1) are generalizations of related results of Ozeki (1973) and Kober (1958):

$$(n - k) A_1 + (k - 1) A_n \ge (n - 1) A_k,$$

$$(n - k) C_1 + (k - 1) C_n \ge (n - 1) C_k.$$

By using Vasic-Stankovic's inequality (the condition $(C_{n,k})$), we can easily give generalizations of these results for arbitrary $s$ and $t$ and arbitrary weights. □

6.2. Steffensen's Inequality

The following result was given by Steffensen (1918) (see also Mitrinovic, 1970, p. 107, and Beckenbach and Bellman, 1961, 1965, p. 48):

6.19. Theorem. Assume that two integrable functions $f$ and $g$ are defined on the interval $[a, b]$ such that $f$ never increases and that $0 \le g(t) \le 1$ on $[a, b]$. Then

$$\int_{b-\lambda}^{b} f(t)\,dt \le \int_a^b f(t) g(t)\,dt \le \int_a^{a+\lambda} f(t)\,dt, \tag{6.10}$$

where

$$\lambda = \int_a^b g(t)\,dt. \tag{6.11}$$
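A small numerical illustration of (6.10): with $f(t) = e^{-t}$ and $g(t) = t$ on $[0, 1]$ one has $\lambda = 1/2$, and the three integrals can be computed in closed form as $e^{-1/2} - e^{-1}$, $1 - 2/e$, and $1 - e^{-1/2}$. The sketch below reproduces these values with a simple midpoint rule; the helper names are assumptions made only for this example.

```python
import math


def integrate(func, lo, hi, steps=20000):
    """Composite midpoint rule; a plain numerical helper, not taken from the source."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def steffensen_terms(f, g, a, b):
    """Return (lower, middle, upper, lam) for the two-sided inequality (6.10)."""
    lam = integrate(g, a, b)                              # lambda from (6.11)
    lower = integrate(f, b - lam, b)                      # integral of f over [b - lambda, b]
    middle = integrate(lambda t: f(t) * g(t), a, b)       # integral of f*g over [a, b]
    upper = integrate(f, a, a + lam)                      # integral of f over [a, a + lambda]
    return lower, middle, upper, lam


def f(t):
    return math.exp(-t)   # never increases


def g(t):
    return t              # satisfies 0 <= g(t) <= 1 on [0, 1]


lower, middle, upper, lam = steffensen_terms(f, g, 0.0, 1.0)
```

The values should satisfy `lower <= middle <= upper`, roughly 0.2387, 0.2642, and 0.3935.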

Proof. (6.10) follows directly from the following identities (Mitrinovic, 1969, and Mitrinovic, 1970, p. 117):

$$\int_a^{a+\lambda} f(t)\,dt - \int_a^b f(t) g(t)\,dt = \int_a^{a+\lambda} (f(t) - f(a + \lambda))(1 - g(t))\,dt + \int_{a+\lambda}^{b} (f(a + \lambda) - f(t)) g(t)\,dt, \tag{6.12}$$

$$\int_a^b f(t) g(t)\,dt - \int_{b-\lambda}^{b} f(t)\,dt = \int_a^{b-\lambda} (f(t) - f(b - \lambda)) g(t)\,dt + \int_{b-\lambda}^{b} (f(b - \lambda) - f(t))(1 - g(t))\,dt. \tag{6.13}$$

□

6.20. Remarks. (a) Steffensen (1919) derived the Jensen-Steffensen Inequality (Theorem 2.19) using the second inequality in (6.10) (see also Pecaric, 1979b). Bullen (1970a) derived Steffensen's inequality, Theorem 6.19, using Jensen-Steffensen's Inequality ((2.19)). Therefore these inequalities are equivalent.

(b) The first and second inequalities in (6.10) are equivalent. This can be proved by using the substitution $g(t) \to 1 - g(t)$ (see Mitrinovic, 1969, and Mitrinovic, 1970, p. 108). (c) Hayashi (1919) generalized inequality (6.10) slightly by imposing the condition $0 \le g(t) \le A$, where $A$ is any positive real number (instead of $0 \le g(t) \le 1$). However, his result can be easily obtained from (6.10) by using the substitution $g \to g/A$. (d) Note that Steffensen's inequality cannot be found in Hardy, Littlewood, and Polya (1934, 1952), and his paper (1918) was not reviewed in Jahrbuch über die Fortschritte der Mathematik. However, the paper was cited by G. Szego in his review (see Hayashi, 1919, 1920). (e) Davies (see Mitrinovic, 1970, p. 108) proved Steffensen's inequality by showing that the function

$$S(x) = \int_a^{a+G(x)} f(t)\,dt - \int_a^x f(t) g(t)\,dt,$$

where $G(x) = \int_a^x g(t)\,dt$, is increasing. Vasic and Pecaric (1984) used that result in the proof of Theorem 3.20 to show that monotonicity of $S(x)$ actually follows from Steffensen's inequality. □

In his short note (which contains no references) Apery (1951, 1953) proved a different version of Steffensen's inequality. His result states (see also Mitrinovic, 1970, p. 116 and Riekstyn's, 1986, p. 23):

6.21. Theorem. Let $f$ be a decreasing function on $(0, \infty)$, and $g$ be a measurable function on $[0, \infty)$ such that $0 \le g(x) \le A$ ($A$ is a positive real number). Then

$$\int_0^{\infty} f(x) g(x)\,dx \le A \int_0^{\lambda} f(x)\,dx,$$

where

$$\lambda = \frac{1}{A} \int_0^{\infty} g(x)\,dx.$$

6.22. Remark. Apery proved Theorem 6.21 by using an identity similar to (6.12). In fact, Mitrinovic (1969) proved Theorem 6.19 by using Apery's idea. □


Applying integration by parts, (6.12) becomes

$$\int_a^{a+\lambda} f(t)\,dt - \int_a^b f(t) g(t)\,dt = -\int_a^{a+\lambda} \Bigl(\int_a^x (1 - g(t))\,dt\Bigr) df(x) - \int_{a+\lambda}^{b} \Bigl(\int_x^b g(t)\,dt\Bigr) df(x), \tag{6.14}$$

where $\lambda$ is defined in (6.11). Thus it is obvious that the condition $0 \le g(t) \le 1$ (for every $t \in [a, b]$) can be replaced by the weaker condition

$$\int_a^x g(t)\,dt \le x - a \text{ for every } x \in [a, a + \lambda] \quad \text{and} \quad \int_x^b g(t)\,dt \ge 0 \text{ for every } x \in [a + \lambda, b]. \tag{6.15}$$

However, the conditions in (6.15) are also necessary. Indeed, for $f(t) = 1$ ($t \le x$) and $f(t) = 0$ ($t > x$), for every $x \in [a, b]$, we obtain (6.15) from the second inequality in (6.10). On the other hand, from (6.15) we have

$$\int_x^b g(t)\,dt = \int_a^b g(t)\,dt - \int_a^x g(t)\,dt \ge \lambda - (x - a) \ge 0 \quad \text{for every } x \in [a, a + \lambda] \tag{6.16}$$

and

$$\int_a^x g(t)\,dt = \int_a^b g(t)\,dt - \int_x^b g(t)\,dt \le \lambda \le x - a \quad \text{for every } x \in (a + \lambda, b]. \tag{6.17}$$

Combining (6.15), (6.16), and (6.17) we obtain (6.15) $\Rightarrow$ (6.18), where

$$\int_a^x g(t)\,dt \le x - a \quad \text{and} \quad \int_x^b g(t)\,dt \ge 0 \quad \text{for every } x \in [a, b]. \tag{6.18}$$

Moreover, it is clear that (6.18) $\Rightarrow$ (6.15). Thus we conclude that (6.15) and (6.18) are equivalent. Consequently, we have proved the following theorem (Milovanovic and Pecaric, 1979):


6.23. Theorem. Assume that $f$ and $g$ are integrable functions on $[a, b]$, and let $\lambda$ be defined in (6.11). Then the second inequality in (6.10) holds for every decreasing function $f$ iff (6.18) holds.

Similarly we can prove:

6.24. Theorem. Let $f$ and $g$ be integrable functions on $[a, b]$. Then the first inequality in (6.10) holds for every decreasing function $f$ iff

$$\int_x^b g(t)\,dt \le b - x \quad \text{and} \quad \int_a^x g(t)\,dt \ge 0 \quad \text{for every } x \in [a, b], \tag{6.19}$$

where $\lambda$ is defined in (6.11).

An immediate consequence of Theorems 6.23 and 6.24 is that (Vasic and Pecaric, 1981a):

6.25. Theorem. Let $f$ and $g$ be integrable functions on $[a, b]$. Then the inequalities in (6.10) hold for every decreasing function $f$ iff

$$0 \le \int_x^b g(t)\,dt \le b - x \quad \text{and} \quad 0 \le \int_a^x g(t)\,dt \le x - a \quad \text{for every } x \in [a, b]. \tag{6.20}$$

Some converse results are considered in Pecaric (1982d).

6.26. Theorem. Let $f\colon I \to \mathbb{R}$, $g\colon [a, b] \to \mathbb{R}$ ($[a, b] \subset I$, where $I$ is an interval in $\mathbb{R}$) be integrable functions, $a + \lambda \in I$, where $\lambda$ is given by (6.11). Then

$$\int_a^{a+\lambda} f(t)\,dt \le \int_a^b f(t) g(t)\,dt \tag{6.21}$$

holds for every decreasing function $f$ iff

$$\int_a^x g(t)\,dt \ge x - a \text{ for } x \in [a, a + \lambda] \quad \text{and} \quad \int_x^b g(t)\,dt \le 0 \text{ for } x \in (a + \lambda, b] \tag{6.22}$$

and $0 \le \lambda \le b - a$; or

$$\int_a^x g(t)\,dt \ge x - a \quad \text{for } x \in [a, b]; \tag{6.23}$$

or

$$\int_x^b g(t)\,dt \le 0 \quad \text{for } x \in [a, b]. \tag{6.24}$$

Proof. For $f(t) = 1$ ($t \le x$) and $f(t) = 0$ ($t > x$) for all $x \in I$ we claim that, from (6.21), (6.22) or (6.23) or (6.24) must be satisfied. To show the other direction, (i) if $0 \le \lambda \le b - a$, we can obtain (6.21) from (6.18); (ii) if $\lambda > b - a$, then

$$\int_a^{a+\lambda} f(t)\,dt - \int_a^b f(t) g(t)\,dt = \int_a^b f(t)(1 - g(t))\,dt + \int_b^{a+\lambda} f(t)\,dt$$

$$= \int_a^b (f(t) - f(b))(1 - g(t))\,dt + \int_b^{a+\lambda} (a + \lambda - x)\,df(x)$$

$$= -\int_a^b \Bigl(\int_a^x (1 - g(t))\,dt\Bigr) df(x) + \int_b^{a+\lambda} (a + \lambda - x)\,df(x) \le 0;$$

and (iii) if $\lambda < 0$, then

$$\int_a^{a+\lambda} f(t)\,dt - \int_a^b f(t) g(t)\,dt = -\int_{a+\lambda}^{a} f(t)\,dt - \int_a^b f(t) g(t)\,dt$$

$$= \int_a^b g(t)(f(a) - f(t))\,dt + \int_{a+\lambda}^{a} (x - a - \lambda)\,df(x)$$

$$= -\int_a^b \Bigl(\int_x^b g(t)\,dt\Bigr) df(x) + \int_{a+\lambda}^{a} (x - a - \lambda)\,df(x) \le 0.$$

□

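Similarly we can prove

6.27. Theorem. Let $f\colon I \to \mathbb{R}$, $g\colon [a, b] \to \mathbb{R}$ ($[a, b] \subset I$) be integrable functions, $b - \lambda \in I$. Then

$$\int_a^b f(t) g(t)\,dt \le \int_{b-\lambda}^{b} f(t)\,dt \tag{6.25}$$

holds for every decreasing function $f$ iff

$$\int_a^x g(t)\,dt \le 0 \text{ for } x \in [a, b - \lambda] \quad \text{and} \quad \int_x^b g(t)\,dt \ge b - x \text{ for } x \in (b - \lambda, b]$$

and $0 \le \lambda \le b - a$; or

$$\int_x^b g(t)\,dt \ge b - x \quad \text{for } x \in [a, b];$$

or

$$\int_a^x g(t)\,dt \le 0 \quad \text{for } x \in [a, b].$$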

6.28. Theorem. Let $g\colon [a, b] \to \mathbb{R}$ be an integrable function such that there exists a $c \in [a, b]$ satisfying $g(x) \ge 1$ for $x \in [a, c]$ and $g(x) \le 0$ for $x \in (c, b]$. Then (6.21) holds for every decreasing function $f\colon I \to \mathbb{R}$ provided that $[a, b] \subset I$ and $a + \lambda \in I$.

Proof. Let $0 \le \lambda \le b - a$. If $c \le a + \lambda$, then clearly we have

$$\int_a^x g(t)\,dt \ge x - a \text{ for } x \in [a, c] \quad \text{and} \quad \int_x^b g(t)\,dt \le 0 \text{ for } x \in [a + \lambda, b].$$

Suppose that for some $x_1 \in (c, a + \lambda)$ we have $\int_a^{x_1} g(t)\,dt < x_1 - a$. Since $\int_{x_1}^{b} g(t)\,dt \le 0$, we have $\int_a^b g(t)\,dt < x_1 - a$, i.e., $a + \lambda < x_1$, which is clearly a contradiction. Similarly in the case $c > a + \lambda$ we can prove that (6.22) also holds. Now let $\lambda > b - a$. Then obviously we have $\int_a^x g(t)\,dt \ge x - a$ for $x \in [a, c]$, and for $x \in (c, b]$ we have

$$\int_a^x g(t)\,dt = \int_a^b g(t)\,dt - \int_x^b g(t)\,dt \ge \int_a^b g(t)\,dt \ge b - a \ge x - a;$$

i.e., condition (6.23) holds. Similarly, in the case $\lambda < 0$ we can prove that (6.24) holds. Thus from Theorem 6.26 we obtain Theorem 6.28. □

From Theorem 6.27 we can also prove

6.29. Theorem. Let $g\colon [a, b] \to \mathbb{R}$ be an integrable function such that there exists a $c \in [a, b]$ satisfying $g(x) \le 0$ for $x \in [a, c]$ and $g(x) \ge 1$ for $x \in (c, b]$. Then (6.25) holds for every decreasing function $f\colon I \to \mathbb{R}$ provided that $[a, b] \subset I$ and $(b - \lambda) \in I$.

6.30. Theorem. Let $g\colon [a, b] \to \mathbb{R}$ be an integrable function such that $g(x) \ge 1$ (or $g(x) \le 0$) for every $x \in [a, b]$. Then for every decreasing function $f\colon I \to \mathbb{R}$ the reverse inequalities in (6.10) hold provided that $a + \lambda, b - \lambda \in I$.

Proof. This is a simple consequence of Theorems 6.28 and 6.29. □

6.31. Remark. As noted in Remark 6.20(a), Steffensen (1919) derived the Jensen-Steffensen Inequality using the second inequality in (6.10). In the following we show that Theorem 3.3 (the reverse of the Jensen-Steffensen Inequality) can be obtained from Theorem 6.28; namely, letting $x_1 \ge \cdots \ge x_n$ and

$$g(t) = g_k = P_k / P_n \quad \text{for } t \in (x_{k+1}, x_k) \text{ and } 1 \le k \le n - 1,$$

where $P_k = \sum_{i=1}^{k} p_i$, we show that

Since $f$ can be approximated uniformly on $[a, b]$ by polynomials with nonnegative second derivative, without loss of generality we may assume that $f'(x)$ exists and is increasing, i.e., $-f'(x)$ is a decreasing function. Then, from (6.21), we obtain (3.2). □

By letting $g(t) = \lambda G(t) / \int_a^b G(t)\,dt$, where $\lambda > 0$ and $\int_a^b G(t)\,dt > 0$, we obtain that (Mitrinovic and Pecaric, 1988c):

$$\frac{1}{\lambda} \int_{b-\lambda}^{b} f(t)\,dt \le \frac{\int_a^b f(t) G(t)\,dt}{\int_a^b G(t)\,dt} \le \frac{1}{\lambda} \int_a^{a+\lambda} f(t)\,dt \tag{6.26}$$

holds iff

$$0 \le \lambda \int_x^b G(t)\,dt \le (b - x) \int_a^b G(t)\,dt \quad \text{and} \quad 0 \le \lambda \int_a^x G(t)\,dt \le (x - a) \int_a^b G(t)\,dt \tag{6.27}$$

hold for every $x \in [a, b]$; and that the second inequality in (6.26) is valid iff

$$\lambda \int_a^x G(t)\,dt \le (x - a) \int_a^b G(t)\,dt \quad \text{and} \quad \int_x^b G(t)\,dt \ge 0 \tag{6.28}$$

hold for every $x \in [a, b]$. Of course, using these results we can give extensions of the results related to Bellman's generalization of Steffensen's inequality (Pecaric, 1982b, 1984f). The following theorem (Godunova, Levin, and Cebaevskaja, 1967) is also a consequence of the result stated above:

6.32. Theorem. Let $f(x)$ be a nonnegative decreasing function on $[a, b]$, and $\varphi(u)$ be an increasing convex function for $u \ge 0$ with $\varphi(0) = 0$. If $g(x)$ is a nonnegative increasing function on $[a, b]$ such that there exists a nonnegative function $g_1(x)$ defined by the equation

$$g_1(x)\,\varphi(g(x)/g_1(x)) = 1, \tag{6.29}$$

and that $\int_a^b g_1(t)\,dt \le 1$, then the following inequality is valid:

$$\varphi\Bigl(\int_a^b f(t) g(t)\,dt \Big/ \int_a^b g(t)\,dt\Bigr) \le \frac{1}{\lambda} \int_a^{a+\lambda} \varphi(f(t))\,dt, \tag{6.30}$$

where $\lambda = \varphi\bigl(\int_a^b g(t)\,dt\bigr)$.

For $\varphi(u) = u^p$ ($p > 1$), Theorem 6.32 becomes

6.33. Theorem. Let $f(x)$ be a nonnegative decreasing function on $[a, b]$, $f \in L_p(a, b)$, and let $g(t)$ be nonnegative and increasing on $[a, b]$ such that $\int_a^b g^q(t)\,dt \le 1$, where $p > 1$ and $q = p/(p - 1)$. Then

$$\Bigl(\int_a^b f(t) g(t)\,dt\Bigr)^p \le \int_a^{a+\lambda} f^p(t)\,dt \tag{6.31}$$

holds, where $\lambda = \bigl(\int_a^b g(t)\,dt\bigr)^p$.

Proof of Theorem 6.32. Using Jensen's inequality for convex functions and the second inequality in (6.26) we have

$$\varphi\Bigl(\int_a^b f(t) g(t)\,dt \Big/ \int_a^b g(t)\,dt\Bigr) \le \int_a^b g(t)\,\varphi(f(t))\,dt \Big/ \int_a^b g(t)\,dt \le \frac{1}{\lambda} \int_a^{a+\lambda} \varphi(f(t))\,dt \tag{6.32}$$

provided that

$$\varphi\Bigl(\int_a^b g(t)\,dt\Bigr) \int_a^x g(t)\,dt \le (x - a) \int_a^b g(t)\,dt \quad \text{and} \quad \int_x^b g(t)\,dt \ge 0 \tag{6.33}$$

hold for every $x \in [a, b]$. The second condition in (6.33) is obviously satisfied. On the other hand, the increasing convex function $\varphi$ with $\varphi(0) = 0$ is starshaped, thus $\varphi(\alpha x) \le \alpha \varphi(x)$ holds for $0 < \alpha \le 1$. Consequently by (6.29) and Jensen's inequality we have

$$\varphi\Bigl(\int_a^b g(t)\,dt\Bigr) \int_a^x g(t)\,dt = \varphi\Bigl(\int_a^b g_1(t)\,dt \cdot \Bigl(\int_a^b g(t)\,dt \Big/ \int_a^b g_1(t)\,dt\Bigr)\Bigr) \int_a^x g(t)\,dt$$

$$\le \Bigl(\int_a^b g_1(t)\,dt\Bigr)\, \varphi\Bigl(\int_a^b g_1(t)\,(g(t)/g_1(t))\,dt \Big/ \int_a^b g_1(t)\,dt\Bigr) \int_a^x g(t)\,dt$$

$$\le \int_a^b g_1(t)\,\varphi(g(t)/g_1(t))\,dt \int_a^x g(t)\,dt = \int_a^b 1\,dt \int_a^x g(t)\,dt = (b - a) \int_a^x g(t)\,dt.$$

Since $g$ is an increasing function, we have (see, for example, Mitrinovic, 1970, p. 9)

$$\frac{1}{b - a} \int_a^b g(t)\,dt \ge \frac{1}{x - a} \int_a^x g(t)\,dt,$$

i.e.,

$$(b - a) \int_a^x g(t)\,dt \le (x - a) \int_a^b g(t)\,dt.$$

Hence the first condition in (6.33) is also satisfied. □

We note that (a) (6.32) is an interpolation inequality of (6.30). Also note that the condition in (6.29) can be replaced by (6.29') or, more generally, by

$$\int_a^b g_1(x)\,\varphi(g(x)/g_1(x))\,dx \le b - a; \tag{6.29''}$$

and (b) the proof given above can be found in Mitrinovic and Pecaric (1988b). The same is true for Remarks 6.34(a).

Inequality (6.31) was given by Bellman (1959), and has been stated in Beckenbach and Bellman (1961, 1965, p. 41) and Mitrinovic (1970, p. 111). However, this result is incorrect as stated, as noted by Godunova, Levin, and Cebaevskaja (1967). The results given above represent a more general and correct version of that in Bellman. Another corrected version of Bellman's inequality can be found in Bergh (1973):

6.34. Theorem. Let $f$ and $g$ be positive functions on $(0, \infty)$, $f$ decreasing and $g$ measurable. If for some $p \ge 1$, $f \in L^p + L^{\infty}$ and $g \in L^q \cap L^1$ such that $\|g\|_{L^q} = 1$ and $\|g\|_{L^1} = t$, where $1/p + 1/q = 1$, then

$$\int_0^{\infty} f(x) g(x)\,dx \le 2^{1/q} \Bigl(\int_0^{t^p} (f(x))^p\,dx\Bigr)^{1/p} \tag{6.34}$$

holds, and the constant $2^{1/q}$ is the best possible.

A similar inequality for $0 < p \le 1$ can be found in Godunova and Levin (1968). In fact, that result is a consequence of a more general result given below: Let $K(x, t)$ ($x \in X$, $t \in [0, a]$) be a given nonnegative kernel. We say that the function $\varphi\colon [0, a] \to \mathbb{R}$ belongs to the class $U(K)$ if it can be expressed in the form $\varphi(t) = \int_X K(x, t)\,d\sigma(x)$, where $\sigma(x)$ is an increasing function on $[0, a]$ with $\int_X d\sigma(x) = 1$. Let $f$ be a positive, increasing, and strictly convex function such that $f'(x)/f''(x)$ is concave, and let

$$h(t) = \int_0^t dh(s), \quad g(t) = \int_0^t dg(s), \quad h(0) = g(0) = 0, \quad h(a) = 1,$$

where $g(t)$ and $h(t)$ are increasing functions on $[0, a]$. Then

$$\int_0^a \varphi(t)\,dg(t) \le f^{-1}\Bigl(\int_0^a f(\varphi(t))\,dh(t)\Bigr)$$

holds for every function $\varphi$ in the class $U(K)$ iff

$$\int_0^a K(x, t)\,dg(t) \le f^{-1}\Bigl(\int_0^a f(K(x, t))\,dh(t)\Bigr) \tag{6.35}$$

holds for every $x \in X$.

The following generalization of Steffensen's inequality is given in Pecaric (1982f):

6.35. Theorem. Let $h$ be a positive integrable function on $[a, b]$ and $f$ be an integrable function such that $f(x)/h(x)$ is increasing on $[a, b]$. If $g$ is a real-valued integrable function such that $0 \le g(x) \le 1$ for every $x \in [a, b]$, then (6.21) holds, i.e.,

$$\int_a^b f(x) g(x)\,dx \ge \int_a^{a+\lambda} f(x)\,dx, \tag{6.36}$$

where $\lambda$ is the solution of the equation

$$\int_a^{a+\lambda} h(x)\,dx = \int_a^b h(t) g(t)\,dt.$$

If $f(x)/h(x)$ is a decreasing function, then the reverse of the inequality in (6.36) holds.

Proof.

$$\int_a^{a+\lambda} f(t)\,dt - \int_a^b f(t) g(t)\,dt = \int_a^{a+\lambda} (1 - g(t)) f(t)\,dt - \int_{a+\lambda}^{b} f(t) g(t)\,dt$$

$$\le \frac{f(a + \lambda)}{h(a + \lambda)} \int_a^{a+\lambda} h(t)(1 - g(t))\,dt - \int_{a+\lambda}^{b} f(t) g(t)\,dt$$

$$= \frac{f(a + \lambda)}{h(a + \lambda)} \Bigl(\int_a^b h(t) g(t)\,dt - \int_a^{a+\lambda} h(t) g(t)\,dt\Bigr) - \int_{a+\lambda}^{b} f(t) g(t)\,dt$$

$$= \int_{a+\lambda}^{b} g(t) h(t) \Bigl(\frac{f(a + \lambda)}{h(a + \lambda)} - \frac{f(t)}{h(t)}\Bigr) dt \le 0,$$

and the proof is complete. □

Applying Theorem 1.43(a) we obtain, from Theorem 6.34:

6.36. Theorem. Let $g$ be an integrable function such that $0 \le g(x) \le 1$ for every $x \in [a, b]$. (a) If the function $f$ is convex of order $n$ with t 0, then it includes an atom at 0 and, to facilitate the arithmetic, we introduce the notation $x_+ = \max\{x, 0\}$. Also $x_+^k$ denotes $(x_+)^k$, except that $0^0$ will be interpreted as being 1. Thus the indicator function of $[t, \infty)$ is $(x - t)_+^0$. The above formula for $f \in M_0$ may be written as

$$f(x) = \int_0^1 (x - t)_+^0\,d\nu(t).$$

The following class of functions which we consider is larger than that containing such $f(x)$. Let $M_k$ denote the class of functions $f$ with the representation

$$f(x) = \int_0^1 (x - t)_+^k\,d\nu(t) \quad \text{for } x \in [0, 1]$$

for some $\nu$ which is a nonnegative regular Borel measure. Note that $k$ need not be an integer, although the case of great interest is when it is an integer. In particular, $M_1$ is the class of increasing convex functions with a value zero at 0. More generally, if $f \in C^{(n+1)}(0, 1)$ with $f^{(i)}(0) = 0$ for $i = 0, \dots, n - 1$, $f^{(n)} \ge 0$, and $f^{(n+1)} \ge 0$ on $[0, 1]$, then $f \in M_n$.


6.38. Theorem. Let $\lambda$ be a regular Borel measure such that $\int_0^1 |d\lambda(x)| < \infty$, and let $dx$ denote Lebesgue measure. Then $\int_0^1 f(x)\,d\lambda(x) \ge \int_0^a f(x)\,dx$ holds for all $f \in M_0$ iff

$$\int_t^1 d\lambda(x) \ge 0 \quad \text{for every } t \in [0, 1]$$

and

$$a \le \min_{0 \le t \le 1} \Bigl\{t + \int_t^1 d\lambda(x)\Bigr\}.$$

Therefore

$$a = \min_{0 \le t \le 1} \Bigl\{t + \int_t^1 d\lambda(x)\Bigr\}$$

is the best possible choice.

6.39. Theorem. Let $\lambda$ be a (signed) regular Borel measure such that $\int_0^1 |d\lambda(x)| < \infty$. Then $\int_0^1 f(x)\,d\lambda(x) \ge \int_0^a f(x)\,dx$ holds for all $f \in M_k$ iff

$$\int_0^1 (x - t)_+^k\,d\lambda(x) \ge 0 \quad \text{for every } t \in [0, 1]$$

and (6.38). Therefore the best possible choice is that $a$ equals the right-hand side of (6.38).

6.40. Remark. Some generalizations of Steffensen's inequality for functions of several variables are given in Pecaric (1980d), Fink (1982b), and 0 Pecaric and Janie (1989). Pecaric (1989b) proved the following theorem:

6.41. Theorem. Let $G: [a, b] \to \mathbb{R}$ be an increasing and differentiable function and $f: I \to \mathbb{R}$ be a decreasing function (I is an interval in $\mathbb{R}$ such that $a, b, G(a), G(b) \in I$).

(a) If $G(x) \ge x$, then

\[ \int_a^b f(x) G'(x)\,dx \ge \int_{G(a)}^{G(b)} f(x)\,dx. \tag{6.39} \]

(b) If $G(x) \le x$, then the reverse of the inequality in (6.39) is valid.

Proof. By letting $G(x) = z$ we have

\[ \int_a^b f(x) G'(x)\,dx = \int_a^b f(x)\,dG(x) = \int_{G(a)}^{G(b)} f(G^{-1}(z))\,dz. \]

If $G(z) \ge z$, then $G^{-1}(z) \le z$ and $f(G^{-1}(z)) \ge f(z)$. Thus we have

\[ \int_{G(a)}^{G(b)} f(G^{-1}(z))\,dz \ge \int_{G(a)}^{G(b)} f(z)\,dz; \]

i.e., (6.39) holds. If $G(z) \le z$ then, of course, we obtain the reverse of the inequality. □
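As a numerical illustration of Theorem 6.41(a), the following sketch compares $\int_a^b f(x) G'(x)\,dx$ with $\int_{G(a)}^{G(b)} f(x)\,dx$. The choices $f(x) = e^{-x}$, $G(x) = 2x$, and [a, b] = [0, 1] are assumptions made only for this example (they are not taken from the text), and the helper `midpoint_integral` is a hypothetical name for a plain midpoint rule.

```python
import math

def midpoint_integral(func, lo, hi, n=20000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

# Illustrative data (assumed for this sketch, not from the text):
a, b = 0.0, 1.0
f = lambda x: math.exp(-x)   # decreasing on I = [0, 2]
G = lambda x: 2.0 * x        # increasing, and G(x) >= x on [0, 1]
dG = lambda x: 2.0           # derivative of G

lhs = midpoint_integral(lambda x: f(x) * dG(x), a, b)   # integral of f * G' over [a, b]
rhs = midpoint_integral(f, G(a), G(b))                  # integral of f over [G(a), G(b)]
print(lhs, rhs, lhs >= rhs)
```

Here both sides have closed forms, $2(1 - e^{-1})$ and $1 - e^{-2}$, so the numerical values can be checked by hand.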

6.42. Remarks. (a) For a = 0 and G(0) = 0, we obtain a result in Ostrowski (1970, pp. 83, 161, 263) from Theorem 6.41(b).

(b) For $a = 0$, $b \to \infty$, and $G(b) \to \infty$, we obtain the result in Volkov (1969). This result is a generalization of the following inequality of Gauss (see, for example, Mitrinovic, 1970, p. 300): Let f be nonincreasing for x > 0; then, for any $\lambda > 0$,

\[ \lambda^2 \int_\lambda^\infty f(x)\,dx \le \frac{4}{9} \int_0^\infty x^2 f(x)\,dx. \]

Indeed, this inequality follows if we also let $G(x) = 4x^3/(27\lambda^2) + \lambda$ for $\lambda > 0$.

(c) Theorem 6.19 is, in fact, a consequence of Theorem 6.41. To see this, let $G(x) = a + \int_a^x g(t)\,dt$ in Theorem 6.41, where g satisfies the conditions of Theorem 6.19; then $G(x) \le x$ holds and we obtain the second inequality in (6.10). To obtain the first inequality we let $G(x) = b - \int_x^b g(t)\,dt$ in Theorem 6.41.

(d) For another generalization of Gauss' inequality, see Petschke (1989). □
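The substitution in Remark 6.42(b) can be checked numerically. The sketch below assumes the usual form of Gauss' inequality, $\lambda^2 \int_\lambda^\infty f \le \tfrac{4}{9} \int_0^\infty x^2 f$, takes $f(x) = e^{-x}$ as an illustrative nonincreasing function (so both integrals have closed forms), and verifies on a grid that $G(x) = 4x^3/(27\lambda^2) + \lambda$ stays above x. The function names are hypothetical.

```python
import math

def G(x, lam):
    """The function G(x) = 4 x^3 / (27 lam^2) + lam from the remark."""
    return 4.0 * x**3 / (27.0 * lam**2) + lam

def gauss_sides(lam):
    """Both sides of Gauss' inequality for the assumed example f(x) = exp(-x)."""
    left = lam**2 * math.exp(-lam)      # lam^2 * integral of exp(-x) over [lam, inf)
    right = (4.0 / 9.0) * 2.0           # (4/9) * integral of x^2 exp(-x) over [0, inf)
    return left, right

lam = 2.0
grid = [0.01 * k for k in range(2001)]          # sample points in [0, 20]
min_gap = min(G(x, lam) - x for x in grid)      # should be >= 0; zero at x = 3*lam/2
left, right = gauss_sides(lam)
print(min_gap, left, right)
```

The gap $G(x) - x$ vanishes only at $x = 3\lambda/2$, which is why the constant cannot be improved by this particular choice of G.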


Chapter 7

Cebysev-Grüss', Favard's, Berwald's, Gauss-Winckler's, and Related Inequalities

7.1. Cebysev-Grüss Inequality

A classic result due to Cebysev (1882, 1883) is stated in the following theorem.

7.1. Theorem. Let $f, g: [a, b] \to \mathbb{R}$ and $p: [a, b] \to \mathbb{R}_+$ be integrable functions. If f and g are monotonic in the same direction, then

\[ \int_a^b p(x)\,dx \int_a^b p(x) f(x) g(x)\,dx \ge \int_a^b p(x) f(x)\,dx \int_a^b p(x) g(x)\,dx, \tag{7.1} \]

provided that the integrals exist. If f and g are monotonic in opposite directions, then the reverse of the inequality in (7.1) is valid. In both cases, equality in (7.1) holds iff either g or f is constant almost everywhere.

A discrete analogue is given by

7.2. Theorem. Let a and b be two n-tuples of real numbers monotonic in the same direction, and p be a positive n-tuple. Then

\[ \sum_{i=1}^n p_i \sum_{i=1}^n p_i a_i b_i \ge \sum_{i=1}^n p_i a_i \sum_{i=1}^n p_i b_i. \tag{7.2} \]

If a and b are monotonic in opposite directions, then the reverse of the inequality in (7.2) holds. In either case equality holds iff either $a_1 = \dots = a_n$ or $b_1 = \dots = b_n$.

Cebysev's inequality can be generalized for m (> 2) functions (or n-tuples).
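A small numerical sketch of the discrete Cebysev inequality may help fix ideas. It assumes the standard form $\sum p_i \sum p_i a_i b_i \ge \sum p_i a_i \sum p_i b_i$ and also evaluates the pairwise-difference expression $\sum_{i<j} p_i p_j (a_i - a_j)(b_i - b_j)$, which equals the same gap. The function names and the sample tuples are hypothetical.

```python
def cebysev_gap(a, b, p):
    """Return sum(p) * sum(p*a*b) - sum(p*a) * sum(p*b)."""
    sp = sum(p)
    spab = sum(pi * ai * bi for pi, ai, bi in zip(p, a, b))
    spa = sum(pi * ai for pi, ai in zip(p, a))
    spb = sum(pi * bi for pi, bi in zip(p, b))
    return sp * spab - spa * spb

def pairwise_form(a, b, p):
    """Return the sum over i < j of p_i p_j (a_i - a_j)(b_i - b_j)."""
    n = len(a)
    return sum(p[i] * p[j] * (a[i] - a[j]) * (b[i] - b[j])
               for i in range(n) for j in range(i + 1, n))

# Illustrative integer data (assumed for this sketch):
a = [1, 2, 4, 7]          # increasing
b = [0, 3, 3, 10]         # increasing
p = [2, 1, 1, 3]          # positive weights

gap_same = cebysev_gap(a, b, p)                        # same direction
gap_opposite = cebysev_gap(a, list(reversed(b)), p)    # opposite directions
gap_constant = cebysev_gap(a, [5, 5, 5, 5], p)         # one tuple constant
print(gap_same, gap_opposite, gap_constant)
```

The pairwise form makes the sign of the gap evident: every term is nonnegative when the two tuples are ordered in the same direction.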


7.3. Theorem. Let $f_1, \dots, f_m$ (m > 2) be nonnegative real-valued functions and p a positive function on [a, b]. If $f_1, \dots, f_m$ are monotonic in the same direction, then

\[ \Bigl( \int_a^b p(x)\,dx \Bigr)^{m-1} \int_a^b p(x) \prod_{i=1}^m f_i(x)\,dx \ge \prod_{i=1}^m \Bigl( \int_a^b p(x) f_i(x)\,dx \Bigr). \tag{7.3} \]

If $f_1, \dots, f_m$ are positive on [a, b], then the equality in (7.3) holds iff at least m - 1 of the functions $f_1, \dots, f_m$ are constant almost everywhere.

7.4. Theorem. Let $a_j$ (j = 1, ..., m; m > 2) be nonnegative n-tuples which are monotonic in the same direction, and p be a positive n-tuple. Then

\[ \Bigl( \sum_{i=1}^n p_i \Bigr)^{m-1} \sum_{i=1}^n p_i \prod_{j=1}^m a_{ji} \ge \prod_{j=1}^m \Bigl( \sum_{i=1}^n p_i a_{ji} \Bigr). \tag{7.4} \]

If all n-tuples are positive, then the equality in (7.4) holds iff at least m - 1 n-tuples among $a_1, \dots, a_m$ have identical components.

7.5. Remarks. (a) The history of Cebysev's inequality and the question concerning its priority are considered in the expository paper by Mitrinovic and Vasic (1974). It was noted that for the special case in which $p_i = a_i$ (i = 1, ..., n), inequality (7.2) was obtained earlier by Laplace (1749-1827), and that inequality (7.1) with $p(x) \equiv 1$ was obtained by Winckler (1866).

(b) Cebysev's papers were published in 1882 and 1883, and inequality (7.3) with $p(x) \equiv 1$ was considered by Andreief (1883a).

(c) There exist several results which show that Cebysev inequalities are valid under weaker conditions:

(i) The condition that the functions be monotonic can be replaced by the condition that they be similarly ordered. The same is valid for sequences. In this case Theorem 7.1 is a simple consequence of the following identity:

\[ \int_a^b p(x)\,dx \int_a^b p(x) f(x) g(x)\,dx - \int_a^b p(x) f(x)\,dx \int_a^b p(x) g(x)\,dx = \frac{1}{2} \int_a^b \int_a^b p(x) p(y) (f(x) - f(y))(g(x) - g(y))\,dx\,dy. \tag{7.5} \]

a

Note that the functions $f: I \to \mathbb{R}$ and $g: I \to \mathbb{R}$ are said to be similarly ordered if

\[ (f(x) - f(y))(g(x) - g(y)) \ge 0 \quad \text{for every } x, y \in I \]

holds, and they are said to be oppositely ordered if the reverse inequality holds. A similar definition applies to sequences. Of course, the generalization of the identity in (7.5) for functions of several variables is also valid (see Berljard, Nazarov, and Svidskii, 1967, or Mitrinovic and Vasic, 1974). Thus similar generalizations of Theorems 7.1-7.4 are valid (for similarly ordered functions and sequences). The first such result is given in Hardy, Littlewood, and Pólya (1934, 1952, p. 168).

(ii) The condition that the functions be monotonic can be replaced by the condition that they be monotonic in mean. Such a result for two functions (sequences) was given by Biernacki (1951), and for m ($\ge 2$) functions (sequences) by Burkill and Mirsky (1975). A simple proof and some interpolations were given in Vasic and Pecaric (1982d), and additional generalizations for functions with increasing increments were given in Pecaric (1984g) (see Theorems 2.30, 3.12, 3.17 and Remark 3.19(b)).

(iii) Steffensen (1920) noted that Theorem 7.1 is valid when f is an increasing function on [a, b] and g satisfies the condition

\[ \frac{1}{P(x)} \int_a^x p(t) g(t)\,dt \le \frac{1}{P(b)} \int_a^b p(t) g(t)\,dt, \quad \text{where } P(x) = \int_a^x p(t)\,dt. \tag{7.6} \]

Pecaric (1980b) (see also Vasic, Stankovic, and Pecaric, 1985b) noted that instead of p(t) > 0, we need only that P(x) > 0 and P(b) > 0. Steffensen and Pecaric also gave corresponding discrete analogues. Steffensen's result contains a result of Biernacki (1951), and Pecaric's result is a generalization of Popoviciu (1959b) (see also (iv)).

(iv) The condition that p(t) > 0 ($p_i > 0$) can be replaced by

\[ 0 \le P(x) \le P(b) \quad \text{for } a \le x \le b. \tag{7.7} \]

The same condition also appears in the Jensen-Steffensen inequality; thus Cebysev's inequality and the Jensen-Steffensen inequality are related. If the functions have increasing increments, then both (7.1) and (7.2) follow from the Jensen-Steffensen inequality. For the general case, a proof for (7.2) can be found in Popoviciu (1959b).



(v) The conditions that the functions (and sequences) be positive in Theorems 7.3 and 7.4 were weakened by Ahlswede and Daykin (1979). That is, if $p(x) \equiv 1$ and $p_i \equiv 1$, then Theorems 7.3 and 7.4 are valid for increasing functions $f_j$ and increasing sequences $a_j$ which satisfy

\[ f_j(a) + \frac{1}{b - a} \int_a^b f_j(x)\,dx \ge 0 \quad \text{for } j = 1, \dots, m \tag{7.8} \]

and

\[ a_{j1} + \frac{1}{n - 1} \sum_{i=2}^n a_{ji} \ge 0 \quad \text{for } j = 1, \dots, m. \tag{7.9} \]

(vi) Another modification of the conditions for Cebysev's inequality was given by Levin and Steckin (see, for example, Karlin and Studden, 1966, pp. 414-415). The following result is a generalization of their result and can be proved similarly: Let $v: [0, 1] \to \mathbb{R}$ be an increasing function such that $v(x) = -v(1 - x)$, and let $f: [0, 1] \to \mathbb{R}$ be an integrable function with respect to v such that the following two conditions hold:

\[ f(x) \text{ is increasing for } x \in [0, \tfrac{1}{2}] \quad \text{and} \quad f(x) = f(1 - x) \text{ for } x \in [0, 1]. \]

Then for every continuous convex function $\phi$ we have

\[ \int_0^1 dv(x) \int_0^1 f(x) \phi(x)\,dv(x) \le \int_0^1 f(x)\,dv(x) \int_0^1 \phi(x)\,dv(x). \]

As a special case we have the inequality of Levin and Steckin:

\[ \int_0^1 f(x) \phi(x)\,dx \le \int_0^1 f(x)\,dx \int_0^1 \phi(x)\,dx. \]

For a more general result, see Clausing (1980).

(d) For other similar results, see the noted paper by Mitrinovic and Vasic (1974) and Section 8.1.

(e) Theorems 7.1-7.4 have been generalized for monotonic functions of several variables. The case $p(x) \equiv 1$ was generalized by Vietoris (1974). Pecaric, Janic, and Beesack (1982) showed that an analogous generalization is valid for the case when $p(x) = p_1(x_1) \cdots p_k(x_k)$, and for similarly defined sequences of weights. In fact, they gave a related result for Stieltjes' integral. Additional generalizations are given in Pecaric and Mesihovic (1988), where a similar generalization of (b) is also given. Some applications in number theory are given in Rutkowski (1989). □

In the rest of this chapter we shall use the following notation

A simple consequence of (7.10) is the following result: Let p be a nonnegative function. If

\[ |f_i(x_j)|_n \, |g_i(x_j)|_n \ge 0 \quad \text{for all } x_1, \dots, x_n \in [a, b], \]

then (7.11) holds. Simple consequences of (7.11) include the inequalities of Cebysev, Cauchy, and other inequalities. Of course, Gram's inequality

\[ \Bigl| \int_a^b p(x) f_i(x) f_j(x)\,dx \Bigr|_n \ge 0 \tag{7.12} \]

is also an obvious consequence of (7.11). Identity (7.10) for Stieltjes' integral is given in Chokhate (1929). A generalization of (7.10) for functions of several variables is also valid (the

case p(x) = 1 is given in Ogura, 1920). Results analogous to (7.11) and (7.12) are also valid in this case. An interpretation of (7.12) for the case p(x) = 1 is given in Ogura (1920), and using the idea in his proof we can obtain the weighted version of his result: Let $\phi_1(x), \phi_2(x), \dots, \phi_n(x), \dots$ be a system of normalized orthogonal functions with respect to p, i.e.,

\[ \int_a^b p(x) \phi_i(x) \phi_j(x)\,dx = \delta_{ij}. \]

If we denote by $a_i(f_j)$ the following

\[ a_i(f_j) = \int_a^b p(x) f_j(x) \phi_i(x)\,dx \quad \text{for } j = 1, \dots, p,\ i = 1, 2, \dots, \]

then we have 2

h

where $\sum$ denotes the summation with respect to $m_1, \dots, m_p$ satisfying $1 \le m_1 < m_2 < \dots < m_p \le k$, $k \ge p$ being any fixed positive integer. Of course, similar results can be formulated for Stieltjes' integral. Discrete versions of (7.10) are also valid. For example, we have

If Ne let f i= x i , x ; = y i , g $ = z j , g$ = u j , then, after simple algebraic manipulations, we obtain the following identity in Seitz (1936/37):

I

i=l j=1

;=I j = 1

As an immediate consequence of the above identity we can formulate the following (Seitz, 1936/37): If, for all positive integers i, j , r, s such that


l l i < j s n and l l r < s < r n , we have 2 0 and

IYi


;: : 1


:;120,

then the inequality

holds. This inequality is also a generalization of Cebysev's and Cauchy's inequalities (see, for example, Mitrinovic and Vasic, 1974). Finally, we give an interesting consequence of (7.11) for generalized convex functions. Let $f: (a, b) \to \mathbb{R}$ be a convex function with respect to an ECT-system of functions $\{u_i\}_{i=0}^n$ (see (7.4)) (in symbols, we write $f \in C(u_0, u_1, \dots, u_n)$). The following result is a simple consequence of (7.11): Let $f \in C(u_0, u_1, \dots, u_n)$ and $g \in C(v_0, v_1, \dots, v_n)$, and let p be a nonnegative integrable function. Then

2

a

l a

0.

(7.13)

a

If $f, g: (a, b) \to \mathbb{R}$ are two (n + 1)-convex functions, i.e., if $f, g \in C(1, t, \dots, t^n)$, then (7.13) becomes (for p(x) = 1)



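For n = 1, we obtain that for all convex functions f and g the following inequality is valid:

\[ \int_a^b f(x) g(x)\,dx - \frac{1}{b - a} \int_a^b f(x)\,dx \int_a^b g(x)\,dx \ge \frac{12}{(b - a)^3} \int_a^b \Bigl( x - \frac{a + b}{2} \Bigr) f(x)\,dx \int_a^b \Bigl( x - \frac{a + b}{2} \Bigr) g(x)\,dx. \tag{7.14} \]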
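The inequality for two convex functions can be tried out numerically. The sketch below assumes it in the form $\int_a^b fg - \frac{1}{b-a}\int_a^b f \int_a^b g \ge \frac{12}{(b-a)^3} \int_a^b (x - \frac{a+b}{2}) f \int_a^b (x - \frac{a+b}{2}) g$, which is the product case of (7.15); the test functions $x^2$ and $e^x$ on [0, 1] and all function names are assumptions made for illustration.

```python
import math

def midpoint_integral(func, lo, hi, n=20000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def convex_pair_sides(f, g, a, b):
    """Return (lhs, rhs) of the assumed inequality for convex f and g on [a, b]."""
    length = b - a
    center = 0.5 * (a + b)
    lhs = (midpoint_integral(lambda x: f(x) * g(x), a, b)
           - midpoint_integral(f, a, b) * midpoint_integral(g, a, b) / length)
    rhs = (12.0 / length**3
           * midpoint_integral(lambda x: (x - center) * f(x), a, b)
           * midpoint_integral(lambda x: (x - center) * g(x), a, b))
    return lhs, rhs

# Two convex functions on [0, 1] (illustrative choice).
lhs, rhs = convex_pair_sides(lambda x: x * x, math.exp, 0.0, 1.0)
# Linear functions give equality: both sides equal 1/12 on [0, 1].
lhs_lin, rhs_lin = convex_pair_sides(lambda x: x, lambda x: x, 0.0, 1.0)
print(lhs, rhs, lhs_lin, rhs_lin)
```

The linear case shows that the constant 12 cannot be increased, since both sides then coincide.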

7.6. Remarks. (a) Inequality (7.14) was proved by Lupas (1972a). Moreover, in the case when $\int_a^b (x - (a + b)/2) g(x)\,dx = 0$, this result was proved by Atkinson (1971) with the additional condition that the second derivatives of f and g exist.

(b) In Pecaric (1989d), inequality (7.14) is generalized for a (2,2)-convex function $f: I^2 \to \mathbb{R}$ (I = [a, b]):

\[ \int_a^b f(x, x)\,dx - \frac{1}{b - a} \int_a^b \int_a^b f(x, y)\,dx\,dy \ge \frac{12}{(b - a)^3} \int_a^b \int_a^b \Bigl( x - \frac{a + b}{2} \Bigr) \Bigl( y - \frac{a + b}{2} \Bigr) f(x, y)\,dx\,dy. \tag{7.15} \]

(c) Some other results related to inequality (7.14) are given in Lupas (1972a), Vasic and Lackovic (1979), and Pecaric (1983d). □

Fan (1953) considered the inequality

\[ \int_a^b \int_a^b K(x, y) f(x) g(y)\,dx\,dy \le B \int_a^b f(x) g(x)\,dx \tag{7.16} \]

for nonnegative and decreasing functions f and g; there exist many generalizations of this result (see Pecaric, 1980d, Pecaric and Crstici, 1981b, and Pecaric, 1984d, 1987). For example, the following theorem holds:

7.7. Theorem. Let $p: I^2 \to \mathbb{R}$ and $q: I \to \mathbb{R}$ (I = [a, b]) be two integrable functions. Then for every (1,1)-convex function $f: I^2 \to \mathbb{R}$ the inequality

\[ K(f, p, q) = \int_a^b q(x) f(x, x)\,dx - \int_a^b \int_a^b p(x, y) f(x, y)\,dx\,dy \ge 0 \tag{7.17} \]

holds iff for every $x, y \in [a, b]$ we have

\[ P(a, y) = Q(y), \quad P(x, a) = Q(x), \quad P(x, y) \le Q(\max(x, y)), \tag{7.18} \]

where $Q(x) = \int_x^b q(t)\,dt$ and $P(x, y) = \int_x^b \int_y^b p(s, t)\,dt\,ds$.

Interpolation inequalities of Cebysev's inequality are given in Vasic and Djordjevic (1973), Vasic and Pecaric (1981b, 1982c), and Mesihovic (1985). In the following we give the result in Pecaric (1987).

7.8. Theorem. Let $p_{ij}$ (i, j = 1, ..., n) be nonnegative real numbers such that $p_{ij} = p_{ji}$ (i, j = 1, ..., n). If (7.19) holds, then (7.20) holds, where

\[ C_n(a, p) = \sum_{i=1}^n \sum_{j=1}^n p_{ij} a_{jj} - \sum_{i=1}^n \sum_{j=1}^n p_{ij} a_{ij}. \]

7.9. Remark. Theorems 7.3 and 7.4 follow from Theorems 7.1 and 7.2 by induction. Note that by applying Theorem 6.7 we can obtain many results that are usually proved by induction. The same is true for Cebysev's inequality. Indeed, the set of all positive and decreasing (or increasing) functions X defined on [a, b], with the multiplication of functions as an interior operation, is a semigroup D. Let the set $E \subset D$ be nonempty and have the property that $X_1, X_2 \in E$ implies $X_1 + X_2 \in E$, and let us define the function $f: E \to \mathbb{R}_+$ by

\[ f(X) = \int_a^b X(t)\,d\mu(t) \Big/ \int_a^b d\mu(t), \]

$\mu$ being a nonnegative measure such that $\int_a^b d\mu(t) \ne 0$ and such that the integral in the numerator exists for every $X \in E$. For the operation "+" in $\mathbb{R}_+$, let us take the ordinary multiplication of positive numbers; then this operation has neutral element 1. Let us denote

\[ T_n = \prod_{i=1}^n \Bigl[ \int_a^b X_i\,d\mu \Big/ \int_a^b d\mu \Bigr] \Big/ \Bigl[ \int_a^b X_1 \cdots X_n\,d\mu \Big/ \int_a^b d\mu \Bigr]. \]



Since the function X(t) = 1 (a ~t E, we have (by Theorem 6.7) T2

~ b)

is a neutral element and belongs to -s 1, and Corollary 6.15 yields

~T;

~1

(7.21) and b

b

b

b

~ l s ~ ~ s (f n Xi du f x, dll) / (f du f XiX dll) ~ 1.

t;

(7.22)

j

a

a

a

a

o

G. Grüss (1935) proved the following converse of Cebysev's inequality (conjectured by H. Grüss; see the footnote in G. Grüss, 1935):

7.10. Theorem. Let f, g be two integrable functions defined on I = [a, b] and satisfying

\[ c_1 \le f(x) \le c_2, \quad d_1 \le g(x) \le d_2 \tag{7.23} \]

for every $x \in (a, b)$, where $c_1, c_2, d_1, d_2$ are fixed real numbers. Then

\[ |T(f, g)| \le \tfrac{1}{4} (c_2 - c_1)(d_2 - d_1), \tag{7.24} \]

where

\[ T(f, g) = \frac{1}{b - a} \int_a^b f(x) g(x)\,dx - \frac{1}{(b - a)^2} \int_a^b f(x)\,dx \int_a^b g(x)\,dx, \]

and the constant $\tfrac{1}{4}$ in (7.24) is the best possible.
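The functional T(f, g) and the Grüss bound are easy to explore numerically. The sketch below assumes the bound in the form $|T(f, g)| \le \tfrac14 (c_2 - c_1)(d_2 - d_1)$ and uses two illustrative pairs on [0, 1]: a smooth pair, and a step function taking the values -1 and +1 on the two halves of the interval, for which the bound is attained. All names and sample functions are hypothetical.

```python
def midpoint_integral(func, lo, hi, n=20000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def cebysev_functional(f, g, a, b):
    """T(f, g): mean of f*g minus the product of the means of f and g on [a, b]."""
    length = b - a
    mean_fg = midpoint_integral(lambda x: f(x) * g(x), a, b) / length
    mean_f = midpoint_integral(f, a, b) / length
    mean_g = midpoint_integral(g, a, b) / length
    return mean_fg - mean_f * mean_g

a, b = 0.0, 1.0

# Smooth pair with 0 <= f <= 1 and 0 <= g <= 1, so the bound is 1/4.
T_smooth = cebysev_functional(lambda x: x, lambda x: x * x, a, b)
bound_smooth = 0.25 * (1.0 - 0.0) * (1.0 - 0.0)

# Step function with values in [-1, 1]; the bound 1/4 * 2 * 2 = 1 is attained.
step = lambda x: 1.0 if x >= 0.5 else -1.0
T_step = cebysev_functional(step, step, a, b)
bound_step = 0.25 * 2.0 * 2.0
print(T_smooth, bound_smooth, T_step, bound_step)
```

The step-function pair illustrates why the constant 1/4 cannot be improved without restricting the class of functions.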

A discrete analogue of (7.24) is given in Biernacki, Pidek, and Ryll-Nardzewski (1950):

7.11. Theorem. Let a and b be two n-tuples such that $c_1 \le a_i \le c_2$ and $d_1 \le b_i \le d_2$ for $1 \le i \le n$. Then (7.25) holds, where [x] denotes the largest integer $\le x$.

The following result is related to Grüss' inequality:

7.12. Theorem. Let f, g be absolutely continuous on I = [a, b], and let f', g' be bounded on I. Then

\[ |T(f, g)| \le \frac{1}{12} (b - a)^2 \sup_{x \in I} |f'(x)| \cdot \sup_{x \in I} |g'(x)|, \tag{7.26} \]

where equality in (7.26) holds iff f' and g' are constant. Moreover, if f', g' are continuous, then

\[ T(f, g) = \frac{1}{12} (b - a)^2 f'(\xi) g'(\eta) \quad \text{for some } \xi, \eta \in I. \tag{7.27} \]

7.13. Remarks. (a) Inequality (7.26) was proved by Cebysev (1882), and identity (7.27) can be found in Ostrowski (1970). (b) Several generalizations of Theorems 7.10-7.12 are given in Pecaric (1984d, 1987). For example, the following result is valid: (i)

Let a = (ai, ... , an) and b = (b l , . . . , bn) be real n-tuples monotonic in the same direction, and let p = (PI, .. _, Pn) be a real n-tuple such that k

0:::; Pk

:::;

P;

for

k

= 1, ... , n -

1

and

Pk

= L. Pi'

(7.28)

i=I

If lilakl

~ m and lilbkl ak+1 - akJ then

~r

T(a, b; p)

for k = 1, ... , n -1 where Sa; = ~ mrT(e, e;

p)

~ 0,

(7.29)

where

and e = (0,1, ... ,n -1). If a and b are monotonic in opposite directions, then

T(a, b; p):::; mrT(e, e; p):::; 0,

(7.30)

where e= (n -1, ... ,1,0). (ii) Let a and b be real n-tuples monotonic in the same direction such that lilakl ~ m, lilbkl ~ r for k = 1, ... , n - 1, and let p be a real n-tuple such that either 0< P; :::; Pk

for

k: = 1, ... , n - 1

(7.31)



or

0:5 P; :5 Fk

for

k = 2, ... ,n

and

h

= P; - Pk -

I •

(7.32) Then the reverse of the inequalities in (7.29) is valid. If a and b are monotonic in opposite directions, then the reverse of the inequalities in (7.30) holds. l c and I ~ b k :l 5 d (iii) Let a and b be two real n-tuples such that I ~ a k :5 for k = 1, ... , n - 1, and let p be a real n-tuple such that (7.28) is satisfied. Then IT(a, b; p)l:5 cdT(e, e; p) holds. If (7.31) or (7.32) holds instead, then IT(a,b;p)l:5cdT(e,e;p). These results are generalizations of results in Lupas (1981) and Mitrinovic (1970, p. 341). In fact, for m = r = 0 (7.28) is a generalization of Cebysev's inequality, as noted in Remark 7.5(c.iv). Pecaric also proved some more general results. For a (l,l)-convex sequence {aiJ ((1, 1)concave sequence {aij}) he obtained the inequality

The continuous analogue of (7.33) is that (iv) Let F be a (l,l)-convex function, and let f and g be two real-valued functions such that F(f(x), g(y)) is integrable over [z and a :5f(x):5 A, b :5g(x):5 B for all x E [a, b], where a, A, b, B are fixed real numbers. Then b

b

b

I b ~ a f F ( f ( X ) , g ( X ) ) d x - ( b ~ af ) F(f(X),g(y))dXdyj zf a

a

a

1 :5 4: (F(A, B) - F(A, b) - F(a, B)

+ F(a, b)).

The following Pecaric's result generalizes (7.27): Assume that p, q satisfy the conditions in Theorem 7.7, and let K(f, p, q) be defined as in (7.17). We have (1) If f: P ~ ~ has continuous partial derivatives t., fz, and fZI on P, then

Ki], p, q)

= fZI(;, 1/)K((x - a)(y - a), p, q) for all

;,1/ E [a, b];



and (2) if f, g : P ~IR have continuous partial derivatives f1' fZ1' gl' gz, and gZl with gZl 0 on P, then

*"

f Z l ~ ~1J ~ K(g, p, q) gZl ,1J

K(f, p, q) =

for all

fz,

~ , 1 JE [a, b].

(c) Similar converses of Theorems 7.3 and 7.4 are given in Pecaric (1980g) and Roghi (1971-72). Pecaric's result states: Let

(J p(t) dt) m-1 JDh(t) dt -/1 (J jj(t)p(t) dt). b

D m(f1, ... ,fm; p)

=

b

a

b

a

a

If jj(x) are monotonic functions on [a, b] for j = 1, ... ,m, and if p is a nonnegative function on [a, b], then IDm(f1,'" ,fm;P)1 ::5(m _1)m-m/(m-1) b

x

(J p(t) dtf jU (ljj(b)1 + Ijj(b) - jj(a)l). a

Furthermore, the constant is the best possible. (d) In the proof of Theorem 7.10 given in Mitrinovic (1970, pp. 70-71), the following inequality is given:

\[ T(f, g)^2 \le T(f, f)\,T(g, g). \tag{7.34} \]

Note that a more general result is also valid: Let

\[ [f, g] = \Bigl| \int_a^b p(x) f_i(x) g_j(x)\,dx \Bigr|_n \]

be the determinant given in the identity (7.10). Then $[f, g]^2 \le [f, f][g, g]$. The result is due to Davis (see Beckenbach and Bellman, 1961, 1965, p. 61), where the inequality with $p(x) \equiv 1$ is given, and was first proved by Everitt (1957). □

Ostrowski (1970) gave the following result:

7.14. Theorem. Let f be a bounded measurable function on I such that $c_1 \le f(x) \le c_2$ for $x \in I$, and assume that g'(x) exists and is bounded on I. Then

\[ |T(f, g)| \le \frac{1}{8} (b - a)(c_2 - c_1) \sup_{x \in I} |g'(x)|. \tag{7.35} \]

Furthermore, the constant $\tfrac{1}{8}$ in (7.35) is the best possible.



The following two theorems are proved in Lupas (1973) and are refinements of the results in Ostrowski (1970). For notational convenience let us define

\[ \|f\|_2 = \Bigl( \frac{1}{b - a} \int_a^b |f(x)|^2\,dx \Bigr)^{1/2}. \]

7.15. Theorem. Let f, g be locally absolutely continuous on I = (a, b) with $f', g' \in L_2(I)$. Then

\[ |T(f, g)| \le \frac{(b - a)^2}{\pi^2} \|f'\|_2 \cdot \|g'\|_2. \tag{7.36} \]

Furthermore, the constant $1/\pi^2$ in (7.36) is the best possible.

7.16. Theorem. Let f be locally absolutely continuous on I with $f' \in L_2(I)$ and let g be bounded and measurable on I with $d_1 \le g(x) \le d_2$ on I. Then

\[ |T(f, g)| \le \frac{1}{2\pi} (d_2 - d_1) \|f'\|_2. \tag{7.37} \]

7.17. Remark. The weighted versions of (7.36) and (7.37) are given in Milovanovic and Milovanovic (1979). □

Of course, the Grüss inequality can be improved if we consider smaller classes of functions. A function f defined on (a, b) is said to be monotonic of order p if it is a convex (concave) function of order 1, ..., p. If f is monotonic of order p in (a, b) for every p = 1, 2, ..., it is said to be absolutely monotonic in (a, b). If f is absolutely monotonic in (a, b), then it has derivatives of all orders and

\[ f^{(k)}(x) \ge 0 \quad \text{or} \quad f^{(k)}(x) \le 0 \quad \text{for all } k = 1, 2, \dots \text{ and } x \in (a, b). \]

A function f is said to be completely monotonic on (a, b) if f(-x) is absolutely monotonic in (-b, -a) or, equivalently, if it satisfies

\[ f'(x) \le 0, \quad f''(x) \ge 0, \quad f'''(x) \le 0, \quad \dots \]

or

\[ f'(x) \ge 0, \quad f''(x) \le 0, \quad f'''(x) \ge 0, \quad \dots \]


for x in (a, b). Grüss (1935) proved that if f and g are absolutely monotonic functions, and if (7.23) holds, then

\[ |T(f, g)| \le \frac{4}{45} (c_2 - c_1)(d_2 - d_1), \tag{7.38} \]

and the constant $\tfrac{4}{45}$ is the best possible. Pecaric (1983e) noted that (7.38) is also valid if f and g are completely monotonic functions. If f is absolutely monotonic and g is completely monotonic, then (7.39) holds, where the constant is also the best possible. Pecaric also gave a result for three functions, and a related result was given by Hardy (1936) (see also Mitrinovic, 1970, p. 72). Landau (1935) proved that (7.38) holds if f and g are monotonic of order 4. Needless to say, this is a nontrivial improvement of Grüss' result. For functions of order k = 2, 3, Landau proved that

\[ |T(f, g)| \le \frac{1}{9} (c_2 - c_1)(d_2 - d_1) \quad \text{for } k = 2, \tag{7.40} \]

\[ |T(f, g)| \le \frac{9}{100} (c_2 - c_1)(d_2 - d_1) \quad \text{for } k = 3. \tag{7.41} \]

Note that using inequality (7.34) we can obtain a series of results when f and g belong to different classes of functions. For example, if f is monotonic of order 2 and g is monotonic of order 3, then (7.34), (7.40), and (7.41) give

\[ |T(f, g)| \le \frac{1}{10} (c_2 - c_1)(d_2 - d_1). \tag{7.42} \]

Some other bounds for |T(f, g)| can be obtained by using the Berwald inequality (see Section 7.2). As a special case, we have

\[ \frac{1}{b - a} \int_a^b (f(x))^2\,dx \le \frac{4}{3} \Bigl( \frac{1}{b - a} \int_a^b f(x)\,dx \Bigr)^2, \]

where f is a positive and concave function on (a, b). This inequality is equivalent to the inequalities

\[ T(f, f) \le \frac{1}{3} \Bigl( \frac{1}{b - a} \int_a^b f(x)\,dx \Bigr)^2 \quad \text{and} \quad T(f, f) \le \frac{1}{4} \cdot \frac{1}{b - a} \int_a^b (f(x))^2\,dx. \]

Thus, using (7.34) we obtain

\[ |T(f, g)| \le \frac{1}{4} \|f\|_2 \|g\|_2 \le \frac{1}{3} \Bigl( \frac{1}{b - a} \int_a^b f(x)\,dx \Bigr) \Bigl( \frac{1}{b - a} \int_a^b g(x)\,dx \Bigr), \tag{7.43} \]

where f and g are positive and concave functions on (a, b). This is an interpolation inequality for an inequality given in Mitrinovic (1970, p. 73) (see also Grüss, 1935, and Franck and Pick, 1915). Other bounds can be obtained from theorems which are proved in Franck and Pick (1915) and Blaschke and Pick (1916). For other related results, see Fempl (1965) and Knopp (1935). Grüss' inequality provides bounds for the difference in T(f, g). An analogous result for the ratio

\[ R(f, g) = \Bigl( \int_0^1 f(x)\,dx \int_0^1 g(x)\,dx \Bigr) \Big/ \Bigl( \int_0^1 f(x) g(x)\,dx \Bigr) \]

was obtained by Karamata (1948). He proved that if integrable functions on [0, 1] and if 0 ... $\phi: [a, b] \to \mathbb{R}$ be convex and $f: [0, 1] \to \mathbb{R}$ be continuous, increasing, and convex such that $a \le f(x) \le b$ for $x \in [0, 1]$. Then

\[ \int_0^1 \phi(f(x))\,dx \le \frac{b + a - 2c}{b - a} \phi(a) + \frac{2(c - a)}{(b - a)^2} \int_a^b \phi(x)\,dx, \quad c = \int_0^1 f(x)\,dx. \tag{7.53} \]

If $\phi$ is strictly convex, then the equality in (7.53) holds iff

\[ f(x) = a + (b - a) \frac{x - \lambda + |x - \lambda|}{2(1 - \lambda)}, \quad \text{where } \lambda = \frac{b + a - 2c}{b - a}. \tag{7.54} \]
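Inequality (7.53) and its extremal function (7.54) can be checked on a simple case. The sketch below reads (7.53) with the sign $\le$, takes $\phi(y) = y^2$ and $f(x) = x^2$ on [0, 1] with a = 0, b = 1 as illustrative assumptions, and then evaluates both sides for the piecewise linear function of (7.54) with the same mean c. All function names are hypothetical.

```python
def midpoint_integral(func, lo, hi, n=20000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def sides_7_53(phi, f, a, b):
    """Return (lhs, rhs) of (7.53) for f: [0, 1] -> [a, b] and convex phi."""
    c = midpoint_integral(f, 0.0, 1.0)
    lhs = midpoint_integral(lambda x: phi(f(x)), 0.0, 1.0)
    rhs = ((b + a - 2.0 * c) / (b - a) * phi(a)
           + 2.0 * (c - a) / (b - a) ** 2 * midpoint_integral(phi, a, b))
    return lhs, rhs

def extremal(x, a, b, lam):
    """The function in (7.54): equal to a on [0, lam], then linear up to b at x = 1."""
    return a + (b - a) * (x - lam + abs(x - lam)) / (2.0 * (1.0 - lam))

phi = lambda y: y * y
a, b = 0.0, 1.0

lhs, rhs = sides_7_53(phi, lambda x: x * x, a, b)   # expected 1/5 and 2/9

c = 1.0 / 3.0                                       # mean of x^2 on [0, 1]
lam = (b + a - 2.0 * c) / (b - a)                   # lambda from (7.54)
lhs_ext, rhs_ext = sides_7_53(phi, lambda x: extremal(x, a, b, lam), a, b)
print(lhs, rhs, lhs_ext, rhs_ext)
```

For the extremal function both sides agree (here each equals 2/9), which matches the stated equality condition.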


7.24. Theorem. Let ¢: [a, b] ~ ~ be continuous and convex, and let f: [0, 1] ~ ~ be convex of order 1, ... , n + 1 such that a :s; f(x) :s; b for x E [0, 1]. Then 1

1

J¢(f(x)) dx:s; J ¢ ( ~ ( A x)), dx o

0

tor

J'

ja

+b

U - 1)a

+b

- - l, po+ F(x) = 0 or -00, and F- 1 be convex. Let {ad and {bk} be two positive real sequences such that for every n e: 1, w(n) = { W r ) } k ~ isl a positive n-tuple satisfying n

2:

w ~ n )= 1

and

k=l

2:

w ~ n ) b n :C 5

for

k e: 1.

n=k

If F',. is a quasi-arithmetic mean (for definition, see Bullen, Mitrinovic, and Vasic, 1987, p. 215), then 00

2:

n=l

00

bnF',.({ad,w(n»):5C

2: an;

(8.7)

n ~ l

if in addition 1 n C = lim bk> n-+ oo n k=l

2:

then the constant in (8.7) is the best possible.

8.4. Remark. For F(x) = !ogx, w ~ n =) s.ln, and b; = 1/(k + 1), we have 1

2: - n=l n + 1 00

Since e- 1 < (n!)lInl(n

(

n

n!

IT ak k=l

) lin

k=l

where G; ( {ak}) = Va 1 • • • an' Of course, many other examples can be found in Godunova (1965) and Bullen, Mitrinovic, and Vasic (1987, pp. 272-73). Godunova also gave many integral analogues of the results. 0


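In her proof Godunova used the well-known inequality for quasi-arithmetic means. Since this result is a simple consequence of Jensen's inequality, by using Jensen's inequality we can obtain more general results. This fact is noted in Vasic and Pecaric (1982c), where the following results are given:

8.5. Theorem. Let $f: I \to \mathbb{R}$ be a convex function, $x_i \in I$ (i = 1, 2, ...), $\{c_n\}$ be a positive sequence, and for every $n \ge 1$ let $q^n = (q_1^n, \dots, q_n^n)$ be a positive n-tuple such that $\sum_{k=1}^n q_k^n = 1$ ($n \ge 1$). If

\[ \sum_{n \ge k} c_n q_k^n \le d_k \quad \text{for } k \ge 1, \tag{8.8} \]

then

\[ \sum_{n=1}^\infty c_n f\Bigl( \sum_{k=1}^n q_k^n x_k \Bigr) \le \sum_{k=1}^\infty d_k f(x_k). \tag{8.9} \]

If f is a concave function and the reverse of the inequality in (8.8) holds, then the reverse of the inequality in (8.9) holds.

Proof. By Jensen's inequality we have

\[ f\Bigl( \sum_{k=1}^n q_k^n x_k \Bigr) \le \sum_{k=1}^n q_k^n f(x_k). \]

Thus

\[ \sum_{n=1}^\infty c_n f\Bigl( \sum_{k=1}^n q_k^n x_k \Bigr) \le \sum_{n=1}^\infty c_n \sum_{k=1}^n q_k^n f(x_k) = \sum_{k=1}^\infty f(x_k) \sum_{n \ge k} c_n q_k^n \le \sum_{k=1}^\infty d_k f(x_k). \quad □ \]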
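The mechanism of Theorem 8.5 (Jensen's inequality followed by an interchange of summation) can be seen on a finite example. The sketch below truncates all sums at N terms, takes $c_n = 1$ and uniform weights $q_k^n = 1/n$, and chooses $d_k$ so that (8.8) holds with equality; these choices, the sample points, and the function name are assumptions made for illustration only.

```python
import math

def weighted_jensen_sides(f, x, c, q, d):
    """Return (lhs, rhs) of the finite form of (8.9).

    lhs = sum_n c[n] * f(sum_k q[n][k] * x[k]),  rhs = sum_k d[k] * f(x[k]).
    Indices are 0-based, so q[n] holds the n + 1 weights of the (n + 1)-th tuple.
    """
    n_terms = len(x)
    lhs = sum(c[n] * f(sum(q[n][k] * x[k] for k in range(n + 1)))
              for n in range(n_terms))
    rhs = sum(d[k] * f(x[k]) for k in range(n_terms))
    return lhs, rhs

N = 6
x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]                    # positive sample points
c = [1.0] * N                                         # positive sequence c_n
q = [[1.0 / (n + 1)] * (n + 1) for n in range(N)]     # each row sums to 1
# d_k chosen so that (8.8) holds with equality for the truncated sums.
d = [sum(c[n] * q[n][k] for n in range(k, N)) for k in range(N)]

lhs_convex, rhs_convex = weighted_jensen_sides(lambda t: t * t, x, c, q, d)
lhs_concave, rhs_concave = weighted_jensen_sides(math.sqrt, x, c, q, d)
print(lhs_convex, rhs_convex, lhs_concave, rhs_concave)
```

With equality in (8.8), the convex case gives lhs at most rhs and the concave case gives the reverse, as the theorem states.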

8.6. Theorem. If in Theorem 8.5 we replace I by a convex set U in $\mathbb{R}^m$ and the points $x_i$ (i = 1, 2, ...) by points in U, then the conclusion remains valid.

Proof. The proof is similar to the one-dimensional case. □

Let us consider the two quasi-arithmetic means (see Bullen, Mitrinovic, and Vasic, 1987, pp. 215-82) given by

\[ K_n(\{a_k\}, p) = K^{-1}\Bigl( \sum_{k=1}^n p_k K(a_k) \Bigr), \quad L_n(\{b_k\}, p) = L^{-1}\Bigl( \sum_{k=1}^n p_k L(b_k) \Bigr). \]

Then the following result holds:

8.7. Corollary. Let $f: [k_1, k_2] \times [l_1, l_2] \to \mathbb{R}_+$ be a real-valued function, $\{a_k\}$, $\{b_k\}$, and $\{c_k\}$ be positive sequences, and assume that, for every $n \ge 1$, $q^n = (q_1^n, \dots, q_n^n)$ is a positive n-tuple such that $\sum_{k=1}^n q_k^n = 1$.

(a) If $H(s, t) = f(K^{-1}(s), L^{-1}(t))$ is a convex function and (8.8) holds, then

\[ \sum_{n=1}^\infty c_n f(K_n(\{a_k\}, q^n), L_n(\{b_k\}, q^n)) \le \sum_{n=1}^\infty d_n f(a_n, b_n). \tag{8.10} \]

(b) If H(s, t) is concave and the reverse of the inequality in (8.8) holds, then the inequality in (8.10) is reversed.

Proof. This follows from Theorem 8.6 by letting m = 2, f(s, t) = H(s, t), $s_i = K(a_i)$, and $t_i = L(b_i)$ for i = 1, 2, .... □

8.8. Remark. In the previous result we assume that all sums are finite. Of course, we can use other generalizations of Jensen's inequality (e.g., a generalization of Theorem 8.1) and related inequalities to obtain similar results. For example, in Vasic and Pecaric (1982c) the Jensen-Petrovic inequality is used, and Imoru (1977) contains a generalized Hardy's inequality. □

Similarly, Godunova (1967b) proved the following result:

8.9. Theorem. Let K(t) 2: 0 be defined on ~ = {t = (tl , ... , tn): 0 < t, < 00, i = 1, ... , n} with f v, K(t) d ~= 1, and let V" and Vy be defined similarly. Let (u) be a nonnegative convex function for u 2: 0 and f be such that f(y) 2: 0 for y E Vy , f $ 0, and (f(x))/(x) ... x n) is integrable


on


v,..

Then

By using this result Godunova obtained many general inequalities, which include (i) Hardy's and Knopp's inequalities (Hardy, Littlewood, and Pólya, 1934, 1952, p. 250):

\[ \int_0^\infty \exp\Bigl( \frac{1}{x} \int_0^x \log f(t)\,dt \Bigr) dx < e \int_0^\infty f(x)\,dx, \]

which follows from Theorem 8.9 by letting n = 1, $\phi(u) = e^u$, and

\[ K(t) = \begin{cases} 1 & \text{for } 0 \le t \le 1, \\ 0 & \text{for } t > 1; \end{cases} \]
(ii) Hilbert's inequality:

by letting n = 1, (jJ(u) = u", and K(t) = [sin(.n/p)/.n][r and (iii) Hardy-Littlewood-P6Iya inequality:

1(1 m ~ ~ ~ ~

y} dy

f

dx
1,

0 0 0

by letting n

= 1,

(jJ(u)

= u", and K(t) = (p

-1)/(p2t llp max{l, t}).

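In the following we give some results from Mitrinovic and Pecaric (1988c). We say that a function $u: [a, b] \to \mathbb{R}$ belongs to the class U(v, K) if it admits the representation

\[ u(x) = \int_a^b K(x, t) v(t)\,dt, \tag{8.12} \]

where v is a continuous function and K is an arbitrary nonnegative kernel such that v(x) > 0 implies u(x) > 0 for every $x \in [a, b]$. We also assume that all integrals under consideration exist and are finite. First we prove the following theorem:

8.10. Theorem. Let $u_i \in U(v_i, K)$ (i = 1, 2), where $v_2(t) > 0$ for every $t \in [a, b]$, $r(t) \ge 0$ for every $t \in [a, b]$, and $\phi(u)$ is convex and increasing for $u \ge 0$. Then

\[ \int_a^b r(x) \phi\Bigl( \Bigl| \frac{u_1(x)}{u_2(x)} \Bigr| \Bigr) dx \le \int_a^b s(x) \phi\Bigl( \Bigl| \frac{v_1(x)}{v_2(x)} \Bigr| \Bigr) dx \tag{8.13} \]

holds, where

\[ s(x) = v_2(x) \int_a^b \frac{r(t) K(t, x)}{u_2(t)}\,dt. \tag{8.14} \]

Proof. Since $\phi$ is increasing, and by Jensen's inequality for the convex function $\phi$, we have

\begin{align*}
\int_a^b r(x) \phi\Bigl( \Bigl| \frac{u_1(x)}{u_2(x)} \Bigr| \Bigr) dx
&= \int_a^b r(x) \phi\Bigl( \Bigl| \frac{1}{u_2(x)} \int_a^b K(x, t) v_2(t) \frac{v_1(t)}{v_2(t)}\,dt \Bigr| \Bigr) dx \\
&\le \int_a^b r(x) \phi\Bigl( \int_a^b \frac{K(x, t) v_2(t)}{u_2(x)} \Bigl| \frac{v_1(t)}{v_2(t)} \Bigr|\,dt \Bigr) dx \\
&\le \int_a^b \phi\Bigl( \Bigl| \frac{v_1(t)}{v_2(t)} \Bigr| \Bigr) v_2(t) \Bigl( \int_a^b \frac{r(x) K(x, t)}{u_2(x)}\,dx \Bigr) dt \\
&= \int_a^b s(t) \phi\Bigl( \Bigl| \frac{v_1(t)}{v_2(t)} \Bigr| \Bigr) dt. \quad □
\end{align*}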

8.11. Remark. If for $\alpha > 0$ we let

\[ K(x, t) = \begin{cases} (x - t)^{\alpha - 1}/\Gamma(\alpha) & \text{for } t \le x, \\ 0 & \text{for } t > x, \end{cases} \tag{8.15} \]

then v is the derivative of order $\alpha$ of u in the sense of Riemann-Liouville, and from Theorem 8.10 we obtain Theorem 8.3 from Godunova and Levin (1969) (see also Rozanova, 1976a). □



Note that Theorem 8.10 can be generalized for convex functions of several variables. For example, the following result is valid:

8.12. Theorem. Let $u_i \in U(v_i, K)$ (i = 1, 2, 3), where $v_2(x) > 0$ and $r(x) \ge 0$ for every $x \in [a, b]$, and $\phi(u, v)$ is convex and increasing for $u, v \ge 0$. Then

\[ \int_a^b r(x) \phi\Bigl( \Bigl| \frac{u_1(x)}{u_2(x)} \Bigr|, \Bigl| \frac{u_3(x)}{u_2(x)} \Bigr| \Bigr) dx \le \int_a^b s(x) \phi\Bigl( \Bigl| \frac{v_1(x)}{v_2(x)} \Bigr|, \Bigl| \frac{v_3(x)}{v_2(x)} \Bigr| \Bigr) dx, \tag{8.16} \]

where s(x) is given in (8.14).

Let $U_1(v, K)$ denote the class of all functions $u \in U(v, K)$ such that K(x, t) = 0 for t > x. Note that if $u \in U_1(v, K)$, then we have

\[ u(x) = \int_a^x K(x, t) v(t)\,dt. \tag{8.12'} \]

8.13. Theorem. Let $u_i \in U_1(v_i, K)$ (i = 1, 2), $v_2(x) > 0$, and $r(x) \ge 0$ for every $x \in [a, b]$. Further, let $\phi(u)$ and f(u) be convex and increasing for $u \ge 0$ and f(0) = 0. If f is a differentiable function and max K(x, t) = M, then

\[ M \int_a^b v_2(x) \phi\Bigl( \Bigl| \frac{v_1(x)}{v_2(x)} \Bigr| \Bigr) f'\Bigl( u_2(x) \phi\Bigl( \Bigl| \frac{u_1(x)}{u_2(x)} \Bigr| \Bigr) \Bigr) dx \le f\Bigl( M \int_a^b v_2(t) \phi\Bigl( \Bigl| \frac{v_1(t)}{v_2(t)} \Bigr| \Bigr) dt \Bigr). \tag{8.17} \]

Proof. Since f' is an increasing function, by using (8.12') for the function Ul and the well-known inequality for the absolute value of a function, we have

b

~M

x

)4>( Ivz(x) v1(x) 1)f'(uz(X)4>(J K(x, t)vz(t) Iv1(t) I dt)) dx. uz(x) vz(t)

J vz(x a

a


Now using Jensen's inequality for the convex function ep and the condition K(x, t) :s: M, we have x

b

V z C X ) e p ( I ~ ~ i ; ~ l ) fK(x, ' ( f t ) V z C t ) e p ( I ~ : i dt) : ~ 1 )dx

I:s:M f a

a x

b

:s: f

M V 2 ( X ) e p ( I ~ ~ i ; ~ I ) fM' V( fz C t ) e p ( I ~ : ~ : dt) ~ I ) dx

a

a b

= f(M f V2(t)ep( I

~ : ~ : ~1) dt.

0

a

8.14. Remark. Note that for Riemann-Liouville's derivative we have $M = (b - a)^{\alpha - 1}/\Gamma(\alpha)$. For $\alpha = 1$, we have M = 1, and we obtain Theorem 1 in Rozanova (1972a). Therefore, Theorem 8.13 is a further generalization of Theorem 2 in Godunova and Levin (1967). □

In the following we give another generalization of Theorem 2 in Godunova and Levin (1967). 8.15. Theorem. Let ep: [0, (0) ~ ~ be a differentiable function such that for q> 1 the function ep(x1/q ) is convex and ep(O) = O. Let u E U1(v, K) where ( f ~(K(x, t)Y' dt) lIP :S: M, P -1 + = 1. Then

«'

b

b

f lu(x)1

1- qep'(!u(x)l)

Iv(xW dx:s:

1/

~ q'(lu(x)l) Iv(xWdx a b

:5 f M 1-q(z(X»lIq-lfj>'(M(z(X»lIq)Z'(X) dx a b

q) q) = :q f fj>'(M(Z(X»lI d(M(z(x»lI a

o a

8.16. Remark. Similar results can be given for the class of functions $U_2(v, K)$ in which K(x, t) = 0 for t < x. If $u \in U_2(v, K)$, then we have

\[ u(x) = \int_x^b K(x, t) v(t)\,dt. \tag{8.12''} \]

□

8.17. Corollary. Let $\phi$, $q$, and $p$ be defined as in Theorem 8.15. If $u^{(n-1)} \in AC[a,b]$ and either $u^{(k)}(a) = 0$ for $k = 0, 1, \ldots, n-1$ or $u^{(k)}(b) = 0$ for $k = 0, 1, \ldots, n-1$, then

\[
\int_a^b |u(x)|^{1-q}\,\phi'(|u(x)|)\,|u^{(n)}(x)|^q\,dx \le \frac{q}{M^q}\,\phi\!\left(M\left(\int_a^b |u^{(n)}(t)|^q\,dt\right)^{1/q}\right),
\]

where $M = (b-a)^{n-1/q} / \left((n-1)!\,(np-p+1)^{1/p}\right)$.

For some related results, see Rozanova (1972a, 1976a, 1976b). The following theorem is due to Rozanova (1972b).

8.18. Theorem. Let $y(x)$ be an absolutely continuous function on $[0, a]$, $y(0) = 0$; $r(x)$ be increasing, and $r(0) = 0$. Let $\phi(w)$ and $F(w)$ be convex and increasing functions for $w > 0$, $F(0) = 0$, $Q(w)$ be a convex and increasing function, and $\psi(w)$ be an increasing function with $\psi(0) = 0$. If

\[
F'(z(x))\,z'(x)\,\psi\!\left(\frac{1}{z'(x)}\right) \le \frac{F(z(a))}{z(a)}\,\psi'\!\left(\frac{x}{z(a)}\right), \tag{8.19}
\]

where $z(x) = \int_0^x r'(t)\,\phi(|y'(t)|/r'(t))\,dt$, then

\[
\int_0^a F'\!\left(r(x)\,\phi\!\left(\frac{|y(x)|}{r(x)}\right)\right) G\!\left(r'(x)\,\phi\!\left(\frac{|y'(x)|}{r'(x)}\right)\right) dx \le H\!\left(\int_0^a r'(x)\,\phi\!\left(\frac{|y'(x)|}{r'(x)}\right) dx\right), \tag{8.20}
\]

where $G(w) = wQ(\psi(1/w))$, $H(w) = F(w)Q(\psi(a/w))$. Further, the equality in (8.20) holds iff $y(x) = Ax$, $r(x) = Bx$, and $\psi(w) = CF(aw)$, where $A$, $B$, $C$ and $a$ are constants.

8.19. Remark. If in (8.20) we let $y(x) = f(x)$, $f'(x) > 0$, $f(0) = 0$, $f(a) = b$, $\phi(w) = w$, $F(w) = \psi(w) = w^2$, and $Q(w) = \sqrt{1+w}$, we obtain the following result (see Rozanova, 1972b): Let $a$ and $b$ be given positive real numbers, and let $f$ be a real-valued function such that $f(0) = 0$, $f(a) = b$, $f(x) \ge 0$, and $f(x)/f'(x) \le x$ on the interval $[0, a]$. Then

\[
2\int_0^a f(x)\left(1 + (f'(x))^2\right)^{1/2} dx \le b\,(a^2 + b^2)^{1/2}, \tag{8.21}
\]

and equality holds iff $f(x) = (b/a)x$. Inequality (8.21) is given in Pólya (1947) with a stronger condition that $f''(x) \ge 0$ instead of $f(x)/f'(x) \le x$. □

Rozanova (1972b) also gave some other examples which give generalizations of Opial's inequality.
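Inequality (8.21) can be sanity-checked numerically. The following is a minimal sketch, not part of the original text: the test function $f(x) = b(x/a)^2$ (for which $f(x)/f'(x) = x/2 \le x$), the values $a = 2$, $b = 3$, and the helper names `surface_side` and `cone_side` are all illustrative assumptions.

```python
import math

def surface_side(f, df, a, n=20000):
    """Midpoint-rule value of 2 * integral_0^a f(x) * sqrt(1 + f'(x)**2) dx."""
    h = a / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += f(x) * math.sqrt(1.0 + df(x) ** 2)
    return 2.0 * h * total

def cone_side(a, b):
    """Right-hand side b * sqrt(a**2 + b**2) of (8.21)."""
    return b * math.hypot(a, b)

a, b = 2.0, 3.0  # hypothetical sample values
# Assumed test function f(x) = b * (x / a)**2, which satisfies f / f' = x / 2 <= x.
lhs_convex = surface_side(lambda x: b * (x / a) ** 2,
                          lambda x: 2.0 * b * x / a ** 2, a)
# The extremal case f(x) = (b / a) * x should give equality.
lhs_linear = surface_side(lambda x: b * x / a, lambda x: b / a, a)
rhs = cone_side(a, b)
```

In this sketch the quadratic profile stays strictly below the bound, while the linear profile reproduces the right-hand side up to rounding, in line with the equality condition stated above.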

8.2. Young's Inequality

The following result is known as Young's inequality:

8.20. Theorem. Let $f$ be a real-valued, continuous, and strictly increasing function on an interval $I = [0, c]$ ($c > 0$) such that $f(0) = 0$, and let $g = f^{-1}$. Then

\[
ab \le \int_0^a f(x)\,dx + \int_0^b g(y)\,dy \quad \text{for all } a \in [0, c],\ b \in [0, f(c)], \tag{8.22}
\]

and equality holds iff $b = f(a)$.

This inequality was proved by Young (1912) with the additional condition that $f$ be differentiable, and another proof is given in McShane (1947, pp. 131-32). Proofs of Theorem 8.20 under the present conditions are given in Diaz and Metcalf (1970), Bullen (1970b), and Nieto (1974). For a geometric interpretation of the result in Theorem 8.20, consider Figures 8.1 and 8.2 given below. The area of the curvilinear triangle OAP is given by $\int_0^a f(x)\,dx$, and the area of the curvilinear triangle ORB is given by $\int_0^b g(x)\,dx$. Thus the inequality in (8.22) is justified.

Figure 8.1. (Axes $x$, $y$; labeled points $A(a, 0)$, $B(0, b)$, $C(x, 0)$.)

Figure 8.2.

On the other hand, Young's inequality is related to an integral representation of convex functions (see Theorem 1.2(a)). Namely, let $f: [0, \infty) \to [0, \infty)$ be a continuous and increasing function such that $f(0) = 0$ and $f(x) \to \infty$ as $x \to \infty$. Then $f^{-1}$ exists and has the same properties as $f$. Further, if we let

\[
F(x) = \int_0^x f(s)\,ds \quad \text{and} \quad F^*(y) = \int_0^y f^{-1}(t)\,dt,
\]

then $F$ and $F^*$ are both convex functions on $[0, \infty)$. Thus the following results are valid (Roberts and Varberg, 1973, pp. 29-30):

(i) $xy \le F(x) + F^*(y)$ for all $x \ge 0$ and $y \ge 0$ (Young's inequality),
(ii) $xy = F(x) + F^*(y)$ iff $y = f(x) = F'(x)$,
(iii) $(F^*)' = (F')^{-1}$,
(iv) $(F^*)^* = F$, and
(v) $F^*(y) = \sup_{x \ge 0}\,(xy - F(x))$.

Note that property (v) is used for defining a conjugate function: If $f: I \to \mathbb{R}$ is a convex function defined on an interval $I$, then $f^*: I^* \to \mathbb{R}$ denotes the conjugate function given by $f^*(y) = \sup_{x \in I}\,(xy - f(x))$ with domain $I^* = \{y \in \mathbb{R} : f^*(y) < \infty\}$. Some properties of conjugate functions are given in Roberts and Varberg (1973, pp. 28-36). In a similar fashion we can define a conjugate function corresponding to a convex function $f$ of several variables (see Pšeničnyi, 1980, p. 64):

\[
f^*(y) = \sup_x\,(\langle x, y\rangle - f(x)).
\]

This function is also convex with the property $f(x) = (f^*)^*(x)$. In this case Young's inequality is also valid, i.e., we have

\[
f(x) + f^*(y) \ge \langle x, y\rangle.
\]
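Properties (i), (ii), and (v) can be illustrated on a concrete pair. The sketch below is an illustration only: it assumes $f(s) = s^2$, so that $F(x) = x^3/3$ and $F^*(y) = \tfrac{2}{3}y^{3/2}$, and it approximates the supremum in (v) by a plain grid search; the function names are hypothetical.

```python
def F(x):
    """F(x) = integral_0^x s**2 ds for the assumed choice f(s) = s**2."""
    return x ** 3 / 3.0

def F_star_closed(y):
    """F*(y) = integral_0^y sqrt(t) dt = (2/3) * y**1.5, using f^{-1}(t) = sqrt(t)."""
    return 2.0 * y ** 1.5 / 3.0

def F_star_sup(y, x_max=10.0, n=100000):
    """Grid approximation of sup_{0 <= x <= x_max} (x*y - F(x)), as in property (v)."""
    step = x_max / n
    return max(i * step * y - F(i * step) for i in range(n + 1))

def young_gap(x, y):
    """F(x) + F*(y) - x*y, which property (i) says is nonnegative."""
    return F(x) + F_star_closed(y) - x * y
```

For example, `young_gap(1.5, 2.25)` vanishes up to rounding because $2.25 = f(1.5)$, matching (ii), while `F_star_sup(4.0)` agrees with `F_star_closed(4.0)` to within the grid resolution.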

A function $M$ is called an N-function (Krasnosel'skii and Rutickii, 1958 and 1961) if it admits the representation $M(u) = \int_0^{|u|} p(t)\,dt$ where the function $p$ is continuous from the right for $t \ge 0$, increasing, positive for $t > 0$, and such that $p(0) = 0$, $\lim_{t \to \infty} p(t) = \infty$. The class of such functions $p$ will be denoted by $P$. Let $p \in P$. For a function $q$ defined by $q(s) = \sup_{p(t) \le s} t$ for $s \ge 0$, we say that it is the right-inverse function of $p$. It can be easily verified that $q$ has the same properties as $p$. We now introduce a concept of complementary N-functions: Let $M$ be an N-function and $p \in P$ be its right-derivative. Then the N-function $N$ of the form $N(v) = \int_0^{|v|} q(s)\,ds$, where $q$ is the right-inverse function of $p$, is called the complementary N-function to the function $M$. $M$ and $N$ are mutually complementary N-functions. For these functions Young's inequality is also valid, i.e., the following results are valid (Krasnosel'skii and Rutickii, 1958 and 1961):

(i) If $M$ and $N$ are mutually complementary N-functions, then for every $u, v \in \mathbb{R}$ Young's inequality is valid, i.e.,

\[
uv \le M(u) + N(v). \tag{8.23}
\]

(ii) Equality in (8.23) holds iff either

\[
v = p(u) \quad \text{or} \quad u = q(v) \quad \text{for } u, v \ge 0. \tag{8.24}
\]

Consequently we have

\[
u\,p(u) = M(u) + N(p(u)) \quad \text{and} \quad v\,q(v) = M(q(v)) + N(v) \quad \text{for } u, v \ge 0. \tag{8.25}
\]

(iii) If for a given N-function $M$ the inequality in (8.23) holds for all $u, v \ge 0$, then the function $N$ is the complementary N-function of the function $M$.

Now let $f$ be an increasing function on an interval $I$, let $\alpha = \inf\{f(x) : x \in I\}$, $\beta = \sup\{f(x) : x \in I\}$, and let $J = (\alpha, \beta)$ (or $[\alpha, \beta)$, $(\alpha, \beta]$, or $[\alpha, \beta]$ if the values of $\alpha$, $\beta$ are attainable). A function $g$ with domain $J$ is called a pseudo-inverse of $f$ if for each $y \in J$ we have

\[
x_L(y) \equiv \sup\{x : f(x) < y\} \le g(y) \le \inf\{x : f(x) > y\} \equiv x_R(y).
\]

Cunningham and Grossman (1971) proved an extension of Young's inequality for the case $a > 0$. For general $a$ the following result holds: Let $f$ be an increasing function on an interval $I$ containing the points $x = 0$ and $x = a$ (where $a > 0$ or $a < 0$). Let $g$ be a pseudo-inverse of $f$ with domain $J$. If $f(0) = 0$ and $b \in J$, then (8.22) holds, and equality holds iff $f(a-) \le b \le f(a+)$.
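Returning to the complementary N-functions of (8.23)-(8.25), a concrete pair makes the equality case visible. The sketch below is illustrative and not taken from the text: it assumes $p(t) = e^t - 1$, whose inverse is $q(s) = \ln(1+s)$, which gives $M(u) = e^u - 1 - u$ and $N(v) = (1+v)\ln(1+v) - v$ for $u, v \ge 0$.

```python
import math

def p(t):
    """Assumed right-derivative p(t) = exp(t) - 1 (continuous, increasing, p(0) = 0)."""
    return math.expm1(t)

def q(s):
    """Inverse of p on [0, inf): q(s) = log(1 + s)."""
    return math.log1p(s)

def M(u):
    """M(u) = integral_0^u p(t) dt = exp(u) - 1 - u, for u >= 0."""
    return math.expm1(u) - u

def N(v):
    """N(v) = integral_0^v q(s) ds = (1 + v) * log(1 + v) - v, for v >= 0."""
    return (1.0 + v) * math.log1p(v) - v

def gap(u, v):
    """M(u) + N(v) - u*v, nonnegative by (8.23) and zero when v = p(u)."""
    return M(u) + N(v) - u * v
```

Evaluating `gap(u, p(u))` for a few nonnegative `u` returns values at rounding level, which is the first identity in (8.25) for this pair.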


Boas and Marcus (1974b) proved the following theorem, which is equivalent to the above result:

8.21. Theorem. If $f$ is an increasing function on an interval $I$ containing the points $x = c$ and $x = d$, and $g$ is a pseudo-inverse of $f$, then

\[
c f(c) + \int_c^d f(x)\,dx \ge dt + \int_t^{f(c)} g(y)\,dy, \tag{8.26}
\]

\[
d f(d) + \int_{f(d)}^t g(y)\,dy \ge ct + \int_c^d f(x)\,dx \tag{8.27}
\]

for all $t$ in the domain of $g$. Furthermore, equality in (8.26) holds iff $t$ is between $f(d-)$ and $f(d+)$, and equality in (8.27) holds iff $t$ is between $f(c-)$ and $f(c+)$. If $f$ is decreasing instead of increasing, then the inequalities in (8.26) and (8.27) are reversed. In that case the conditions for the equalities to hold remain unchanged.

Proof. Suppose $f$ is increasing. Let $f(c) = A$, and define $F$ by $F(x) \equiv f(x + c) - A$. Then $F$ is increasing in $I_1 = \{x : x + c \in I\}$ with $F(0) = 0$. Furthermore, letting $J$ denote the domain of $g$, the function $G$ defined by $G(y) = g(y + A) - c$ for $y \in J_1 = \{y : y + A \in J\}$ is seen to be a pseudo-inverse of $F$. Thus, for arbitrary $d \in I$ and $t \in J$ we have $(d - c) \in I_1$ and $(t - A) \in J_1$. By the result of Cunningham and Grossman (1971), it follows that

\[
(d - c)(t - A) \le \int_0^{d-c} F(x)\,dx + \int_0^{t-A} G(y)\,dy,
\]

which implies

\[
dt - cA \le \int_c^d f(x)\,dx + \int_A^t g(y)\,dy.
\]

Since $A = f(c)$, this is equivalent to (8.26). Moreover, the equality holds iff $F((d-c)-) \le t - A \le F((d-c)+)$ holds; that is, iff $f(d-) \le t \le f(d+)$ holds. The inequality in (8.27) is equivalent to that in (8.26), because interchanging $c$ and $d$ in (8.26) yields (8.27), and vice versa. Finally, the proof for the case in which $f$ is decreasing follows in the same fashion by a similar application of a result of Cunningham and Grossman (1971). □

8.22. Remark. A result which is analogous to Theorem 8.21 can be found in Milicevic (1975). □

Merkle (1974) noted that the following converse of Young's inequality is valid:

8.23. Theorem. Let $f$ be a continuous and strictly increasing function on an interval $I$ which contains $x = 0$ such that $f(0) = 0$, and let $g = f^{-1}$. Then

\[
\int_0^a f(x)\,dx + \int_0^b g(y)\,dy \le \max\{a f(a),\, b g(b)\} \tag{8.28}
\]

holds for every $a \in I$ and $b \in f(I)$.
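Young's inequality (8.22) and the converse (8.28) together sandwich the sum of the two integrals. The sketch below illustrates this for one assumed choice, $f(x) = x^3$ with $g(y) = y^{1/3}$, where both integrals have closed forms; the upper bound is read here as $\max\{a f(a),\, b g(b)\}$ with $g = f^{-1}$, and the function name is hypothetical.

```python
def young_sandwich(a, b):
    """Return (a*b, integral sum, max bound) for the assumed pair f(x) = x**3, g(y) = y**(1/3).

    integral_0^a x**3 dx = a**4 / 4 and integral_0^b y**(1/3) dy = (3/4) * b**(4/3),
    valid for a, b >= 0.
    """
    lower = a * b
    middle = a ** 4 / 4.0 + 0.75 * b ** (4.0 / 3.0)
    upper = max(a * a ** 3, b * b ** (1.0 / 3.0))
    return lower, middle, upper
```

At $b = f(a)$, for instance `young_sandwich(2.0, 8.0)`, all three quantities coincide up to rounding, which is the equality case of (8.22).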

8.24. Remark. Similar results can be obtained for other forms of Young's inequality. For example, from (8.25) we obtain

\[
M(u) + N(v) \le u\,p(u) \quad \text{for } p(u) \ge v
\]

and

\[
M(u) + N(v) \le v\,q(v) \quad \text{for } p(u) \le v, \text{ i.e., } u \le q(v).
\]

Thus for every $u, v \ge 0$ we have

\[
M(u) + N(v) \le \max\{u\,p(u),\, v\,q(v)\}. \qquad \square
\]

The following result is given in Beesack, Mitrinovic, and Vasic (1980):

8.25. Theorem. Let $f$ be continuous and strictly increasing on an interval $I$ containing $x = 0$ such that $f(0) = 0$. Let $g$ be a continuous function with domain $J = f(I)$ such that (8.22) holds for $a \in I$ and $b \in J$, where equality holds for $b = f(a)$. Then $g = f^{-1}$.

Proof. For an arbitrary but fixed $a \in I$ let

\[
\phi(b) = \int_0^a f(x)\,dx + \int_0^b g(y)\,dy - ab.
\]

Then $\phi(f(a)) = 0$, $\phi(b) \ge 0$ for all $b \in J$, and $\phi'(b) = g(b) - a$ exists for all $b \in J$. It follows that $\phi'(f(a)) = 0$; thus $g(f(a)) = a$ for all $a \in I$. Suppose that $g(b) \ne f^{-1}(b)$ for some $b \in J$. Let $a = f^{-1}(b)$ so that $b = f(a)$; then $a = g(f(a)) = g(b) \ne f^{-1}(f(a)) = a$, a contradiction. Hence $g(b) = f^{-1}(b)$ for all $b \in J$. □

8.26. Corollary. Let $f$ satisfy the conditions of Theorem 8.23, and $g$ be a continuous function with domain $J = f(I)$ such that $g(y) \le f^{-1}(y)$ for all $y \in J$. If (8.22) holds, then $g = f^{-1}$.

8.27. Remark. In Takahashi (1932) it is assumed that $g$ is continuous and strictly increasing with $g(0) = 0$ and $g^{-1}(x) \ge f(x)$. This result is somewhat weaker than Corollary 8.26. Similarly, a result of Bullen (1970b) is weaker than Corollary 8.26. A similar result for N-functions is given in result (iii) of Krasnosel'skii and Rutickii (1958, 1961). □

8.28. Theorem. Let $T: P \to P$ be an operator satisfying the following conditions:

\[
xy \le T(p)(x) + T(q)(y) \quad \text{for } p, q \in P,\ x \ge 0, \text{ and } y \ge 0; \tag{8.29}
\]

\[
xy = T(p)(x) + T(q)(y) \quad \text{if } y = p(x), \tag{8.30}
\]

where $p \in P$ and $q$ is its right-inverse function. Then

\[
T(p)(x) = \int_0^x p(t)\,dt \quad \text{for } p \in P \text{ and } x \ge 0. \tag{8.31}
\]

8.29. Remark. Theorem 8.28 is proved in Lackovic (1974b). It is a minor generalization of a result in Hsu (1972). □

There exist generalizations of Young's inequality which involve several functions. The following is given in Beesack, Mitrinovic, and Vasic (1980):

8.30. Theorem. Let $f$ be continuous and strictly increasing on an interval $I$ containing the points $x = a$ and $x = b$, and let $g$ be an increasing function on $I$. Then

\[
t\,g(b) - f(a)\,g(a) \le \int_a^b f(x)\,dg(x) + \int_{f(a)}^t g(f^{-1}(y))\,dy \quad \text{for } t \in f(I), \tag{8.32}
\]

and equality holds iff either $t = f(b)$ or $g$ is a constant between $b$ and $f^{-1}(t)$.

Cooper (1927), Takahashi (1932), and Oppenheim (1927) also give generalizations of Young's inequality which involve several functions. The following result is due to Oppenheim:

8.31. Theorem. Let $f_1, \ldots, f_n$ be continuous, nonnegative, and strictly increasing on $I = [0, \infty)$. If at least one of them takes the value zero at $x = 0$, then

\[
\prod_{k=1}^n f_k(t_k) \le \sum_{k=1}^n \int_0^{t_k} \Bigl(\prod_{j \ne k} f_j(x)\Bigr)\,df_k(x) \tag{8.33}
\]

for $t_k \in I$ ($k = 1, \ldots, n$). Moreover, the equality holds iff $t_1 = \cdots = t_n$.

Proof. Without loss of generality we may assume that $0 \le t_1 \le \cdots \le t_n$. Define the functions $F_k$ ($1 \le k \le n$) by $F_k(x) = f_k(x)$ for $0 \le x \le t_k$ and $F_k(x) = f_k(t_k)$ for $x \ge t_k$. Then

\[
\sum_{k=1}^n \int_0^{t_k} \Bigl(\prod_{j \ne k} F_j(x)\Bigr)\,dF_k(x)
= \sum_{k=1}^n \int_0^{t_n} \Bigl(\prod_{j \ne k} F_j(x)\Bigr)\,dF_k(x)
= \int_0^{t_n} d\Bigl(\prod_{j=1}^n F_j(x)\Bigr)
= \prod_{j=1}^n F_j(t_n) = \prod_{j=1}^n f_j(t_j).
\]

Since $0 \le F_j(x) \le f_j(x)$ and $0 \le \Delta F_k(x) \le \Delta f_k(x)$ hold for all $x$, $j$, $k$ (and all differences), we also have

\[
\int_0^{t_k} \Bigl(\prod_{j \ne k} F_j(x)\Bigr)\,dF_k(x) \le \int_0^{t_k} \Bigl(\prod_{j \ne k} f_j(x)\Bigr)\,df_k(x) \quad \text{for } 1 \le k \le n, \tag{8.34}
\]

and the inequality (8.33) follows. Furthermore, if $t_1 = \cdots = t_n$, then clearly the equality in (8.34) holds, hence also that in (8.33). On the other hand, if $0 \le t_k < t_n$ for some $k$, then

\[
\prod_{j \ne n} F_j(x) = F_k(x) \prod_{j \ne n, k} F_j(x) < f_k(x) \prod_{j \ne n, k} f_j(x) = \prod_{j \ne n} f_j(x)
\]

holds for $t_k < x \le t_n$; thus strict inequality must hold in (8.34) for $k = n$. Consequently, strict inequality also holds in (8.33). □

8.3. Nanson's Inequality

The following result is due to Nanson (1904):

8.32. Theorem. If the real sequence $\{a_k\}_{k=1}^{2n+1}$ is convex, then

\[
\frac{a_1 + a_3 + \cdots + a_{2n+1}}{n+1} \ge \frac{a_2 + a_4 + \cdots + a_{2n}}{n} \tag{8.35}
\]

with equality iff $\{a_k\}$ is an arithmetic sequence.

Proof. Since $\{a_k\}$ is convex, we have

\[
a_k - 2a_{k+1} + a_{k+2} \ge 0 \quad \text{for } k = 1, 2, \ldots, 2n-1. \tag{8.36}
\]

By virtue of this fact, we conclude that

\[
k(n-k+1)(a_{2k-1} - 2a_{2k} + a_{2k+1}) \ge 0 \quad \text{for } k = 1, \ldots, n
\]

and

\[
k(n-k)(a_{2k} - 2a_{2k+1} + a_{2k+2}) \ge 0.
\]

By adding these inequalities we obtain (8.35). Furthermore, equality in (8.35) holds iff equality in (8.36) holds for every $k$, which occurs iff $\{a_k\}$ is an arithmetic sequence. □
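Nanson's inequality (8.35) is easy to test on explicit sequences. The following sketch uses exact rational arithmetic; the helper names `is_convex` and `nanson_gap` and the sample sequences are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def is_convex(a):
    """Check a_k - 2*a_{k+1} + a_{k+2} >= 0 for all consecutive triples, as in (8.36)."""
    return all(a[k] - 2 * a[k + 1] + a[k + 2] >= 0 for k in range(len(a) - 2))

def nanson_gap(a):
    """Mean of a_1, a_3, ..., a_{2n+1} minus mean of a_2, ..., a_{2n} (integer entries).

    The list a holds a_1, ..., a_{2n+1}, so a[0::2] are the odd-indexed terms.
    """
    if len(a) < 3 or len(a) % 2 == 0:
        raise ValueError("expected an odd number (at least 3) of terms")
    n = (len(a) - 1) // 2
    odd_mean = Fraction(sum(a[0::2]), n + 1)
    even_mean = Fraction(sum(a[1::2]), n)
    return odd_mean - even_mean

squares = [k * k for k in range(1, 8)]         # a_k = k**2, n = 3, convex
arithmetic = [3 + 2 * k for k in range(7)]     # arithmetic sequence, equality case
```

For the squares the gap is positive, and for the arithmetic sequence it is exactly zero, matching the equality condition of Theorem 8.32.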

8.33. Remarks. (a) Another proof of (8.35) is given in Adamovic and Pecaric (1989).

(b) Steinig (1981) noted that the following extension and interpolation of (8.35) is equivalent to (8.35):

\[
\sum_{k=1}^{2n+1} (-1)^{k+1} a_k \ge \frac{1}{n+1} \sum_{k=0}^{n} a_{2k+1} \ge \frac{1}{2n+1} \sum_{k=1}^{2n+1} a_k \ge \frac{1}{n} \sum_{k=1}^{n} a_{2k}.
\]

(c) Let the sequence $\{a_k\}$ satisfy the conditions

\[
m \le \Delta^2 a_n \le M \quad \text{for } n \ge 1. \tag{8.37}
\]

Then the sequences $\{c_n\}_{n \ge 1}$ and $\{d_n\}_{n \ge 1}$ given by $c_n = a_n - m(n^2/2)$ and $d_n = M(n^2/2) - a_n$ are convex. Thus the following results are valid (see Andrica, Rasa, and Toader, 1984):

\[
\frac{2n+1}{6}\,m \le \frac{1}{n+1} \sum_{k=0}^{n} a_{2k+1} - \frac{1}{n} \sum_{k=1}^{n} a_{2k} \le \frac{2n+1}{6}\,M
\]

and

\[
\frac{n(2n+1)}{6}\,m \le \sum_{k=1}^{2n+1} (-1)^{k+1} a_k - \frac{1}{n+1} \sum_{k=0}^{n} a_{2k+1} \le \frac{n(2n+1)}{6}\,M.
\]

(d) Stankovic (1976) gave the following generalization of (8.35):

\[
(n - 2p)(a_1 + a_3 + \cdots + a_{2n+1}) + (2p - n - 1)(a_2 + a_4 + \cdots + a_{2n}) + 2p(a_1 + a_{2n+1}) - p(a_2 + a_{2n}) \ge 0 \quad \text{for all } p \ge 0.
\]

For $p = 0$, it reduces to Nanson's inequality. Note that this inequality yields the special case

\[
n(a_1 + a_3 + \cdots + a_{2n+1}) - (n+1)(a_2 + a_4 + \cdots + a_{2n}) \ge 2p(a_3 + \cdots + a_{2n-1}) - p(a_2 + 2a_4 + \cdots + 2a_{2n-2} + a_{2n}),
\]

which is weaker than Nanson's inequality. This is so because it follows from

\[
2(a_3 + \cdots + a_{2n-1}) \le a_2 + 2a_4 + \cdots + 2a_{2n-2} + a_{2n},
\]

which in turn follows by adding the inequalities $2a_{2k+1} \le a_{2k} + a_{2k+2}$ for $k = 1, \ldots, n-1$.

(e) Adamovic and Pecaric (1989) proved the following generalization of Nanson's inequality:

\[
\frac{1}{n - 2m + 2} \sum_{k=m}^{n-m+1} a_{2k} \le \frac{1}{n+1} \sum_{k=1}^{n+1} a_{2k-1} \quad \text{for } m \in \mathbb{N} \text{ and } 2m \le n.
\]

However, this inequality is also weaker than (8.35) due to the inequality

\[
\frac{1}{n - 2m + 2} \sum_{k=m}^{n-m+1} a_{2k} \le \frac{1}{n} \sum_{k=1}^{n} a_{2k}, \tag{8.38}
\]

which will be proved as a consequence of Theorem 8.34 below.

(f) Lackovic (1975) proved that the inequality

is valid for every convex sequence $\{a_k\}$ iff the sequence $\{p_k\}$ is given by $p_k = \text{constant}$ ($k = 1, \ldots, 2n+1$). □


Adamovic and Pecaric (1989) proved:

8.34. Theorem. Let $\{a_k\}_1^n$ be a convex sequence and $I_n = \{1, \ldots, n\}$, and let $I$, $J$, and $M$ be nonempty subsets of $I_n$ such that $I$ and $J$ are non-overlapping. Let $I$, $J$, and $M$ have cardinal numbers $\alpha$, $\beta$, and $\gamma$, respectively, and denote

\[
u = \sum_{i \in I} i, \quad v = \sum_{i \in J} i, \quad w = \sum_{i \in M} i.
\]

If ...

... a sequence is said to be $p, q$-convex (... $> 0$) if $L_{pq}(a_n) \ge 0$ for $n \ge 1$, where

\[
L_{pq}(a_n) = a_{n+2} - (p+q)\,a_{n+1} + pq\,a_n.
\]

8.38. Remark. In Milovanovic, Pecaric, and Toader (1985) it is shown that in the theory of $p, q$-convex sequences an important role is played by the sequence $\{w_n\}$ given by

\[
w_n = \begin{cases} \dfrac{p^n - q^n}{p - q} & \text{for } p \ne q, \\[2mm] n\,p^{n-1} & \text{for } p = q. \end{cases}
\]

For example, the following result is proved: The sequence $\{a_n\}$ satisfies the relation

\[
L_{pq}(a_n) = 0, \quad n = 1, 2, \ldots \tag{8.44}
\]

iff

\[
a_n = u\,w_n + v\,w_{n+1}, \tag{8.45}
\]

where $u$ and $v$ are arbitrary real numbers. □

Milovanovic, Pecaric, and Toader (1986) also proved the following generalization of Nanson's inequality:

8.39. Theorem. If the real sequence $\{a_k\}_1^{2n+1}$ is $p, q$-convex, then

\[
\frac{(pq)^n a_1 + (pq)^{n-1} a_3 + \cdots + a_{2n+1}}{\cdots} \ge \frac{(pq)^{n-1} a_2 + (pq)^{n-2} a_4 + \cdots + a_{2n}}{\cdots} \tag{8.46}
\]

and equality holds iff $\{a_k\}$ satisfies (8.44).


8.40. Remarks. (a) For $s_n = w_1 + \cdots + w_n$ we have $L_{pq}(s_n) = 1$; thus if the real sequence $\{a_k\}$ satisfies

\[
m \le L_{pq}(a_n) \le M \quad \text{for } n = 1, 2, \ldots,
\]

then the sequences $\{b_n\}$ and $\{c_n\}$ given by

\[
b_n = a_n - m\,s_n, \quad c_n = M\,s_n - a_n
\]

are $p, q$-convex. A generalization of the first inequality in Remark 8.33(c) can be given by using this fact (see Milovanovic, Pecaric, and Toader, 1986).

(b) As shown in Mitrinovic (1970, pp. 205-6), Nanson's inequality for $a_k = x^{k-1}$ gives an inequality of J. W. Wilson. However, the results given in this section can also be applied to yield results in Mitrinovic (1970, p. 198, (3.24)) and in Mitrinovic (1965, p. 139, 2.3.1.4 and 2.3.1.5). All of these results are further generalized in Adamovic and Pecaric (1989), where the following result is proved: Let $p$ and $q$ be real numbers such that $p > q$.

(i) If either (1) q >

°

°

or (2) p > > q and p + q < 0, then

p

q

a + -1

p +q p- q

-P- -q> - - for O 0 > q and p + q > 0, then the reverse inequality in (8.47) holds. 0 Besides the generalizations of Wilson's inequality and other related results, this result represents an improvement of a result of D. Z. Djokovic (Mitrinovic, 1965, pp. 162-63, 2.3.2.8, or Mitrinovic, 1970, p. 276, 3.6.26). It also improves the inequality in Mitrinovic (1970, p. 279, 3.6.31). Furthermore, some other examples are given in Adamovic and Pecaric (1989).


Chapter 9

General Linear Inequalities for Convex Sequences and Functions

9.1. Inequalities for m-Convex Sequences and Functions

The following theorem is given in Pecaric (1981e):

9.1. Theorem. Let $p = (p_1, \ldots, p_n)$ be a real $n$-tuple, where $n > m$. Then the inequality

\[
\sum_{i=1}^n p_i a_i \ge 0 \tag{9.1}
\]

holds for every $m$-convex sequence $\{a_i\}_1^n$ iff

\[
\sum_{i=1}^n (i-1)^{(k)} p_i = 0 \quad \text{for } k = 0, 1, \ldots, m-1 \tag{9.2}
\]

and

\[
\sum_{i=k}^n (i-k+m-1)^{(m-1)} p_i \ge 0 \quad \text{for } k = m+1, \ldots, n, \tag{9.3}
\]

where $j^{(k)} = j(j-1)\cdots(j-k+1)$, $j^{(0)} = 1$.
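As an illustration of how conditions of this type are checked in practice, the sketch below evaluates them for the weights behind Nanson's inequality (8.35) in the convex case $m = 2$. It rests on assumptions that should be stated plainly: (9.2) is read with the falling factorial $(i-1)^{(k)}$, the sum in (9.3) is read as running over $i = k, \ldots, n$, and the helper names and the choice of weights ($p_i = 3$ for odd $i$, $p_i = -4$ for even $i$, seven terms) are hypothetical.

```python
def falling(j, k):
    """Falling factorial j^(k) = j * (j - 1) * ... * (j - k + 1), with j^(0) = 1."""
    result = 1
    for r in range(k):
        result *= j - r
    return result

def check_conditions(p, m):
    """Return the left-hand sides of the moment conditions and the tail conditions.

    moments[k] = sum_i (i-1)^(k) * p_i for k = 0..m-1 (expected to vanish);
    tails lists sum_{i>=k} (i-k+m-1)^(m-1) * p_i for k = m+1..n (expected >= 0).
    Indices i are 1-based, so p_i is stored in p[i - 1].
    """
    n = len(p)
    moments = [sum(falling(i - 1, k) * p[i - 1] for i in range(1, n + 1))
               for k in range(m)]
    tails = [sum(falling(i - k + m - 1, m - 1) * p[i - 1] for i in range(k, n + 1))
             for k in range(m + 1, n + 1)]
    return moments, tails

half = 3  # Nanson's n; the sequence has 2 * half + 1 = 7 terms
# Weights of n * (odd sum) - (n + 1) * (even sum), which is (8.35) cleared of fractions.
p = [half if i % 2 == 1 else -(half + 1) for i in range(1, 2 * half + 2)]
moments, tails = check_conditions(p, m=2)
```

Under these readings both moment sums vanish and every tail sum is nonnegative, consistent with the fact that (8.35) holds for all convex sequences.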

Proof. The sequences a, = (i - 1)k and a, = -(i - 1)k (1 - 0 for y > -1.

= 0, 1, ...

,n)

with

I:7=o ( ~ ) i a i > 0 for y
... $\lim_{n \to \infty} \|f_n - f\| = 0$. Let us assume that $D \subset \mathbb{R}$, and let $S(D)$ be one of the normed subspaces of the space of all real functions defined on $D$, where the norm of a function $f \in S(D)$ is denoted by $\|f\|_1$. We consider operators $A$ of the following form $A: C[a,b] \to S(D)$, and say that $A$ is continuous if from the condition $\|f_n - f\| \to 0$ (as $n \to \infty$) it follows that $\|Af_n - Af\|_1 \to 0$ as $n \to \infty$. Also, we write $Af \ge 0$ if $g(t) = Af \ge 0$ holds for every $t \in D$, where $f$ is a given function in the space $C[a,b]$. The set of all functions which are convex of order $n$ and continuous on $[a,b]$ (continuous from the right at $a$ and continuous from the left at $b$) will be denoted by $K_n[a,b]$. Clearly we have $K_n[a,b] \subset C[a,b]$. We shall consider the classes $K_n[a,b]$ for $n \ge 2$, and define

\[
e_i(t) = t^i \quad \text{for } a \le t \le b \tag{9.23}
\]

for $i = 0, 1, 2, \ldots$. Furthermore, for $t, c \in [a,b]$ we define the function $w_{n+1}(t, c)$ by

\[
w_{n+1}(t, c) = \left(\frac{t - c + |t - c|}{2}\right)^n = (t - c)_+^n. \tag{9.24}
\]

9.17. Theorem. Assume that $A: C[a,b] \to S(D)$ is a linear and continuous operator. Then

\[
f \in K_n[a,b] \Rightarrow Af \ge 0 \tag{9.25}
\]

holds for every function $f$ iff

\[
Ae_i = 0 \quad \text{for } i = 0, 1, \ldots, n-1 \tag{9.26}
\]

and

\[
Aw_n(t, c) \ge 0 \quad \text{for every } c \in [a,b]. \tag{9.27}
\]

Proof.

(a) If: Let us denote

\[
F_m(x) = P_n(x) + \sum_{j=1}^m c_j w_n(x, x_j), \tag{9.28}
\]

where $P_n(x) \in \Pi_{n-1}$ (the set of all polynomials of degree at most $n-1$), and assume that $f \in K_n[a,b]$. Then by Theorem (1.44) there exists a sequence $\{F_m(x)\}$ of the form (9.28) with $c_j \ge 0$ ($j = 1, \ldots, m$) such that the functions $w_n(t, c)$ are of the form (9.24) and

\[
\lim_{m \to \infty} \|F_m(x) - f(x)\| = 0. \tag{9.29}
\]

By virtue of (9.28) and the linearity of the operator $A$, it follows that

\[
AF_m(x) = AP_n(x) + \sum_{j=1}^m c_j Aw_n(x, x_j) = \sum_{j=0}^{n-1} a_j Ae_j(x) + \sum_{j=1}^m c_j Aw_n(x, x_j). \tag{9.30}
\]

Since (9.26) and (9.27) are valid where the $e_j$'s and $w_n$'s are given in (9.23) and (9.24), respectively, by virtue of (9.30) we obtain

\[
AF_m(x) = \sum_{j=1}^m c_j Aw_n(x, x_j) \ge 0 \quad \text{for every } m = 1, 2, \ldots. \tag{9.31}
\]

By using the continuity property of the operator $A$ in (9.29) and (9.31), we find that

\[
Af = A\Bigl(\lim_{m \to \infty} F_m(x)\Bigr) = \lim_{m \to \infty} AF_m(x) \ge 0,
\]

which implies (9.25).

(b) Only if: Suppose that the implication in (9.25) is valid for an arbitrary function $f \in K_n[a,b]$. By a direct verification it follows that the functions $e_j$ and $-e_j$ ($j = 0, 1, \ldots, n-1$) defined in (9.23) are in the class $K_n[a,b]$. Thus by (9.25) we have $Ae_j \ge 0$ and $A(-e_j) \ge 0$, and (9.26) is satisfied. In the same fashion we conclude that $w_n(t, c) \in K_n[a,b]$. Using (9.25) one more time, we obtain (9.27). □

We say that the operator $A: C[a,b] \times C[a,b] \to S(D)$ is bilinear if the operator $Bu = A(u, v)$ is linear with respect to $u$ for every function $v \in C[a,b]$ and if the operator $Cv = A(u, v)$ is linear with respect to $v$ for every function $u \in C[a,b]$. From Theorem 9.17 we can obtain the following theorem.

9.18. Theorem. Let the operator $A: C[a,b] \times C[a,b] \to S(D)$ be bilinear and continuous. Then for every pair of functions $(f, g)$, the implication

\[
f \in K_n[a,b],\ g \in K_n[a,b] \Rightarrow A(f, g) \ge 0
\]

is valid iff (i) $A(e_i, e_j) = 0$ for $0 \le i, j \le n-1$, (ii) $A(e_i, w_n(t, c)) = A(w_n(t, c), e_j) = 0$ for every $c \in [a,b]$ and every $i, j = 0, 1, \ldots, n-1$, and (iii) $A(w_n(t, c_1), w_n(t, c_2)) \ge 0$ for every $(c_1, c_2) \in [a,b] \times [a,b]$.

9.19. Remarks. (a) For $n = 2$, instead of $w_2(t, c) = (t - c)_+$, we can let $w_2(t, c) = o_c(t) = |t - c|$ (see Vasic and Lackovic, 1978).

(b) Vasic and Lackovic's papers were published in 1978, and earlier Bojanic and Roulier (1974) proved the following general result: Let $A: C[a,b] \to X$ be a continuous linear operator. Then $A(f) \in P$ holds for every $f \in K_n[a,b]$ ($n \ge 2$) iff we have (i) $A(p) = 0$ for every $p \in \Pi_{n-1}$ (the set of all polynomials of degree at most $n-1$), and (ii) $A(w_n(t, c)) \in P$ for every $c \in (a, b)$, where $X$ and $P$ are defined as in Remark 9.13.

(c) Theorems 2.24 and 2.26 are simple consequences of Theorem 9.17.

(d) Vasic and Lackovic (1978, 1979) give some majorization-type theorems which follow by replacing $A$ with $A - B$ in Theorem 9.17.

(e) In Kocic and Lackovic (1986) and Kocic (1984) the reverse of the implication in (9.25) is considered. For that result we need the concept of one-sided strong local maximum (OSLM) of real-valued functions: A function $\phi \in C(I)$ has a OSLM at the point $x_0 \in I$ (an interval) if there exists an $h > 0$ such that for every $x \in (x_0 - h, x_0 + h) \subseteq I$ we have $\phi(x) \le \phi(x_0)$, and $\phi(x) < \phi(x_0)$ holds at least in one of the intervals $(x_0 - h, x_0)$ or $(x_0, x_0 + h)$. We denote by $\tilde{C}(I)$ the set of all functions in $C(I)$ that have a OSLM in at least one point in $I$. Now we can give a result of Kocic and Lackovic (1986): Let $\{A_\lambda\}$ be a family of linear operators such that $A_\lambda: C(I) \to S(D)$ and $e_i(t) = t^i$ ($i = 0, 1$). If (i) $A_\lambda e_0 = 0$ for every $\lambda \in \Lambda$, where $\Lambda$ denotes the index set which is at least countable, (ii) $A_\lambda e_1 = 0$ for $\lambda \in \Lambda$, and (iii) for every $\phi \in \tilde{C}(I)$ there exists at least one $\lambda_0 \in \Lambda$ and $y_0 \in D$ such that $A_{\lambda_0}(\phi, y_0) < 0$, then

\[
A_\lambda f \ge 0 \text{ for every } \lambda \in \Lambda \ \Rightarrow\ f \in K(I) \quad \text{for every } f \in C(I),
\]

where $K(I)$ is the set of all convex functions on $I$. As a special case, they gave a linear criterion of convexity: Let $\{A_\lambda\}$ be a family of continuous linear operators $A_\lambda: C(I) \to S(D)$ where $I$ is a finite interval. If the previous conditions (i)-(iii) and (iv) $A_\lambda o_c \ge 0$ for every $c \in I$, $\lambda \in \Lambda$ are satisfied, then

\[
A_\lambda f \ge 0 \text{ for } \lambda \in \Lambda \ \Leftrightarrow\ f \in K(I)
\]

holds. The function $o_c$ in (iv) can be replaced by $w_2(t, c)$.

(f) A result similar to Theorem 9.17 for starshaped sequences is given in Milovanovic, Stojanovic, and Kocic (1986). □

The following theorem is proved in Brunk (1964):

9.20. Theorem. Let $I$ be an interval in $\mathbb{R}^k$; $X(t) = (X_1(t), \ldots, X_k(t))$ be a vector of functions where the $X_i$'s ($1 \le i \le k$) are increasing and continuous from the right on $[a, b)$. Let $H$ be continuous from the left and of bounded variation on $[a, b)$, with $H(a) = 0$. If $H(b) = 0$, $\int_{[a,b)} H(u)\,dX(u) = 0$, and

\[
\int_{[a,t)} H(u)\,dX(u) \ge 0 \quad \text{for } [a,t] \subset [a,b],
\]

then

\[
\int_{[a,b)} f(X(t))\,dH(t) \ge 0 \tag{9.32}
\]

holds for every continuous function $f: I \to \mathbb{R}$ with increasing increments, where $\int H\,dX = \left(\int H\,dX_1, \ldots, \int H\,dX_k\right)$. The conclusion also holds when $[a,t]$ is replaced by $[a,t)$.

Now let $d\mu$ denote a signed measure on $(a, b)$ such that $\int_a^b |d\mu| < \infty$. Such a measure possesses a decomposition $d\mu = d\mu_1 - d\mu_2$ where $d\mu_1$ and $d\mu_2$ are finite nonnegative measures on $(a, b)$. We restrict our attention to the measures $d\mu$ such that for each $\phi \in C(u_0, u_1, \ldots, u_n)$ the integral $\int_a^b \phi\,d\mu$ is well-defined, with infinite values permitted. Specifically, if $\phi^+(t) = \max\{\phi(t), 0\}$ and $\phi^-(t) = \phi^+(t) - \phi(t)$, then we can write

\[
\int_a^b \phi\,d\mu = \int_a^b \phi^+\,d\mu_1 + \int_a^b \phi^-\,d\mu_2 - \left(\int_a^b \phi^-\,d\mu_1 + \int_a^b \phi^+\,d\mu_2\right).
\]

That $\int_a^b \phi\,d\mu$ is well-defined means that at least one of the sums

\[
\int_a^b \phi^+\,d\mu_1 + \int_a^b \phi^-\,d\mu_2 \quad \text{and} \quad \int_a^b \phi^-\,d\mu_1 + \int_a^b \phi^+\,d\mu_2
\]

is finite. The dual cone of $C(u_0, u_1, \ldots, u_n)$, denoted by $C^*(u_0, u_1, \ldots, u_n)$, is the set of signed measures $d\mu$ on $(a, b)$ which obey the above integrability requirements and satisfy

\[
\int_a^b \phi(t)\,\mu(dt) \ge 0 \quad \text{for all } \phi \in C(u_0, u_1, \ldots, u_n)
\]

(see Karlin and Studden, 1966, p. 405). In the following we state some characterizations of the cone $C^*(u_0, u_1, \ldots, u_n)$ as given in Karlin and Studden (1966, pp. 405-10).

9.21. Theorem. A signed measure $d\mu$ is contained in the dual cone $C^*(u_0, u_1, \ldots, u_n)$ iff

\[
\int_a^b u_j\,d\mu = 0 \quad \text{for } j = 0, 1, \ldots, n \tag{9.33}
\]

and

\[
\int_a^b \Phi_n(t, x)\,d\mu(t) \ge 0, \quad a < x < b,
\]

where $\Phi_n$ is given by (1.84).

9.22. Theorem. A signed measure $d\mu$ is contained in the dual of the cone $\bigcap_{j=k}^n C(u_0, u_1, \ldots, u_j)$, $k \le n-1$, iff

(i) $\int_a^b u_j\,d\mu = 0$ for $j = 0, 1, \ldots, k$;

(ii) $\int_a^b u_j\,d\mu \ge 0$ for $j = k+1, k+2, \ldots, n$; and

(iii) $\int_a^b \Phi_n(t, x)\,d\mu(t) \ge 0$ for $a < x < b$.

For the next result we need the following definition:

9.23. Definition. A signed measure $d\mu$ is said to have $k$ sign changes on $(a, b)$ if there exists a subdivision of $(a, b)$ into disjoint consecutive intervals $J_0, \ldots, J_k$ such that $d\mu$ is of alternating sign and non-null on $J_0, \ldots, J_k$. (In the case that $d\mu = f(t)\,dt$ for some continuous function $f$, the number of sign changes of $d\mu$ is equivalent to the number of ordinary sign changes of the function $f$.)

9.24. Theorem. (a) If $d\mu$ satisfies the orthogonality relations (9.33), then $d\mu$ exhibits at least $n + 1$ sign changes. (b) Let $d\mu$ satisfy (9.33). If $d\mu$ possesses exactly $n + 1$ sign changes on $(a, b)$, is a nonnegative measure, and is non-null on some interval extending to the endpoint $b$, then $d\mu \in C^*(u_0, u_1, \ldots, u_n)$.

Proof. (a) Suppose that $d\mu$ possesses $p \le n$ sign changes. Then there exists a subdivision $J_0, \ldots, J_p$ such that $d\mu$ is non-null and alternates in sign on $J_0, \ldots, J_p$. Let $t_i = \sup\{t : t \in J_i\}$ for $i = 0, \ldots, p-1$, and define

\[
u(t) = \cdots
\]

Then the polynomial $u(t)$ satisfies $u(t)\,d\mu(t) \ge 0$. Furthermore, since the support of $d\mu$ cannot be confined to the finite set $\{t_0, \ldots, t_{p-1}\}$, it follows that $\int_a^b u(t)\,d\mu(t) > 0$. However, this inequality is incompatible with the orthogonality properties assumed for $d\mu$. Thus we conclude that $d\mu$ must possess at least $n + 1$ sign changes.

(b) Let $J_0, \ldots, J_{n+1}$ be the subdivision of $(a, b)$ associated with $d\mu$ obeying the precepts of Definition 9.23. (Note that $d\mu$ is a nonnegative measure on $J_{n+1}$.) Define $t_0, \ldots, t_n$ by $t_i = \sup\{t : t \in J_i\}$ for $i = 0, 1, \ldots, n$, and let

\[
\theta(t) = \cdots \tag{9.35}
\]

where $\phi \in C(u_0, u_1, \ldots, u_n)$. Expanding the determinant in (9.35), we see that $\theta(t)$ can be written in the form

\[
\cdots
\]

With the aid of the orthogonality requirements satisfied by $d\mu$, we obtain

\[
\int_a^b \theta(t)\,d\mu(t) = U\begin{pmatrix} u_0, \ldots, u_n \\ t_0, \ldots, t_n \end{pmatrix} \int_a^b \phi(t)\,d\mu(t).
\]

However, it follows from (9.35) that $\theta(t)\,d\mu(t)$ is a nonnegative measure throughout $(a, b)$, so that $\int_a^b \phi(t)\,d\mu \ge 0$. Consequently we have $d\mu \in C^*(u_0, u_1, \ldots, u_n)$. □

9.25. Remarks. (a) Theorems 9.21, 9.22, and 9.24 contain many linear inequalities for convex functions, as was shown in Karlin and Studden (1966, pp. 410-31). This is a consequence of the fact that every linear continuous functional has an integral representation (see, for example, Kolmogorov and Fomin, 1972, pp. 347-48).

(b) In Kocic (1982b) the implication $f \in C(u_0, u_1) \Rightarrow Af \ge 0$ is considered where $A$ is a linear continuous operator defined as in Theorem 9.17.

(c) Note that Theorem 9.14 is used in Kovacec (1984) for a generalization of some classic inequalities for a rearrangement of vectors.

(d) In Karlin and Studden (1966, p. 411), the Steffensen inequalities are also obtained as a consequence of Theorem 9.21. □

In the following we give some refinements and converses of the previous results. First, we note that a simple consequence of Abel's identity in (9.5) is the well-known Abel inequality (see Mitrinovic, 1970, pp. 32-33):

9.26. Theorem. Let $\{a_k\}_1^n$ be a sequence of real numbers satisfying $a_1 \ge \cdots \ge a_n \ge 0$, and let $W_k = w_1 + \cdots + w_k$ ($k = 1, \ldots, n$). If

\[
m = \min_{1 \le k \le n} W_k \quad \text{and} \quad M = \max_{1 \le k \le n} W_k,
\]

then

\[
m a_1 \le \sum_{i=1}^n w_i a_i \le M a_1. \tag{9.36}
\]
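Abel's inequality (9.36) can be checked directly from its ingredients. In the sketch below, $W_k$ is taken to be the $k$-th partial sum of the weights $w_i$ that appear in (9.36); the function name `abel_bounds` and the sample data are illustrative assumptions.

```python
from itertools import accumulate

def abel_bounds(w, a):
    """Return (m * a_1, sum of w_i * a_i, M * a_1) for a_1 >= ... >= a_n >= 0.

    m and M are the minimum and maximum of the partial sums W_k = w_1 + ... + w_k.
    """
    if len(w) != len(a) or not a:
        raise ValueError("w and a must be nonempty and of equal length")
    if a[-1] < 0 or any(a[i] < a[i + 1] for i in range(len(a) - 1)):
        raise ValueError("a must be decreasing and nonnegative")
    partial = list(accumulate(w))
    total = sum(wi * ai for wi, ai in zip(w, a))
    return min(partial) * a[0], total, max(partial) * a[0]
```

For example, `abel_bounds([3, -5, 4, -1, 2], [5, 4, 4, 2, 1])` has partial sums 3, -2, 2, 1, 3 and weighted sum 11, so the returned triple brackets 11 between -10 and 15.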

Bromwich (1908, 1955) gives the following generalization of (9.36):

9.27. Theorem. Given a real sequence $\{a_k\}_1^n$ and an integer $\nu$ ($1 \le \nu \le n$), define

\[
A_k = \sum_{i=1}^k a_i \quad \text{for } k = 1, \ldots, n,
\]

\[
H_\nu = \max_{1 \le k \le \nu-1} A_k, \quad H'_\nu = \max_{\nu \le k \le n} A_k,
\]

and

\[
h_\nu = \min_{1 \le k \le \nu-1} A_k, \quad h'_\nu = \min_{\nu \le k \le n} A_k.
\]

If $\{v_k\}_1^n$ is a decreasing sequence of positive real numbers, then

\[
h_\nu (v_1 - v_\nu) + h'_\nu v_\nu \le \sum_{i=1}^n a_i v_i \le H_\nu (v_1 - v_\nu) + H'_\nu v_\nu. \tag{9.37}
\]

9.28. Remark. Bromwich (1908, 1955) also gave an integral analogue of the previous result with many applications. □

Since the previous results (especially Abel's inequality) are related to the second integral mean value theorem, we give the following theorem and discuss some related results: Let $f$, $g$ be real-valued functions which are defined and bounded on a compact interval $I = [a, b]$, and let $f$ be Stieltjes integrable with respect to $g$ on $I$, written as $f \in L(g)$ on $I$. Further, let $V(f, [a,b])$ denote the total variation of $f$ on $I$ and $v_f$ the total variation function of $f$, i.e., $v_f(a) = 0$ and $v_f(x) = V(f, [a, x])$ for $x \in [a, b]$.

9.29. Theorem (Second integral mean value theorem). Let $f$ be increasing (decreasing), $g$ be continuous, and $h \in L(g)$ on $[a, b]$. Then there exists a $z \in [a, b]$ such that

\[
\int_a^b f(s)h(s)\,dg(s) = f(a) \int_a^z h(s)\,dg(s) + f(b) \int_z^b h(s)\,dg(s). \tag{9.38}
\]

If instead $f$ is nonnegative and increasing (decreasing), then there exists a $z_1 \in [a, b]$ ($z_2 \in [a, b]$) such that

\[
\int_a^b f(s)h(s)\,dg(s) = f(b) \int_{z_1}^b h(s)\,dg(s) \quad \left(\int_a^b f(s)h(s)\,dg(s) = f(a) \int_a^{z_2} h(s)\,dg(s)\right). \tag{9.39}
\]

Proof. Denote $G(x) = \int_a^x h(s)\,dg(s)$ for $x \in [a, b]$. Then $G$ is continuous on $[a, b]$ and $f \in L(G)$. Integrating by parts yields

\[
\int_a^b f(s)\,dG(s) = f(b)G(b) - \int_a^b G(s)\,df(s).
\]

By the first mean value theorem there exists a $z \in [a, b]$ such that

\[
\int_a^b G(s)\,df(s) = G(z) \int_a^b df(s) = G(z)(f(b) - f(a)).
\]

Combining, we have (9.38). □

Note that if we redefine $f(a) = 0$ (or $f(b) = 0$), then the result in (9.39) follows from (9.38). As an immediate consequence of the second mean value theorem we obtain (Karamata, 1949, p. 264 and Boas, 1970a):

9.30. Theorem. Let $f$ be nonnegative and monotonic on $[a, b]$, and assume that $h \in L(g)$ and $fh \in L(g)$ on $[a, b]$.

(a) If $f$ is increasing, then

\[
f(b) \inf\left\{\int_t^b h(s)\,dg(s) : a \le t \le b\right\} \le \int_a^b f(s)h(s)\,dg(s) \le f(b) \sup\left\{\int_t^b h(s)\,dg(s) : a \le t \le b\right\}. \tag{9.40}
\]

(b) If $f$ is decreasing, then

\[
f(a) \inf\left\{\int_a^t h(s)\,dg(s) : a \le t \le b\right\} \le \int_a^b f(s)h(s)\,dg(s) \le f(a) \sup\left\{\int_a^t h(s)\,dg(s) : a \le t \le b\right\}. \tag{9.41}
\]

Moreover, if $g$ is continuous at $b$ (at $a$), then we may replace $f(b)$ by $f(b-)$ in (9.40) ($f(a)$ by $f(a+)$ in (9.41)).

9.31. Remark. In the same fashion we can obtain analogs similar to the results of Mitrinovic (1970, pp. 301-2, 3.7.35 and 3.7.36). □


The following result is a modification of a result in Karamata (1949, pp. 77-79) (see also Marik, 1949):

9.32. Theorem. Let $f$ be a function of bounded variation on $[a, b] = I$, and let $g$, $h$ be bounded functions such that $h \in L(g)$ and $fh \in L(g)$ on $I$. Then

\[
\left|\int_a^b f(s)h(s)\,dg(s)\right| \le \bigl(|f(b)| + V(f, I)\bigr) \sup_{a \le x \le b} \left|\int_a^x h(s)\,dg(s)\right|, \tag{9.42}
\]

\[
\left|\int_a^b f(s)h(s)\,dg(s)\right| \le \bigl(|f(a)| + V(f, I)\bigr) \sup_{a \le x \le b} \left|\int_x^b h(s)\,dg(s)\right|. \tag{9.43}
\]

Beesack (1975) gave the following generalization of a result of Darst and Pollard (1970):

9.33. Theorem. Let $f$ be of bounded variation on $[a, b] = I$ and $h$, $g$ be bounded functions such that $h \in L(g)$ and $fh \in L(g)$ on $I$. Let $m = \inf\{f(x) : a \le x \le b\}$, then

\[
\int_a^b f(s)h(s)\,dg(s) \le m \int_a^b h(s)\,dg(s) + V(f, I) \sup_{a \le u \le v \le b} \int_u^v h(s)\,dg(s), \tag{9.44}
\]

\[
\int_a^b f(s)h(s)\,dg(s) \ge m \int_a^b h(s)\,dg(s) + V(f, I) \inf_{a \le u \le v \le b} \int_u^v h(s)\,dg(s). \tag{9.45}
\]

An interesting related result is given by Marik (1974) (we state the result in the form given by Beesack):

9.34. Theorem. Let the conditions of Theorem 9.32 be satisfied. Then

$$\left|\int_a^b f(s)h(s)\,dg(s)\right| \le \frac{1}{2}\bigl(V(f, I) + |f(a)| + |f(b)|\bigr)\sup_{u, v \in I}\left|\int_u^v h(s)\,dg(s)\right|. \tag{9.46}$$

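Proof. Let $H_0(x) = \int_a^x h(s)\,dg(s)$, $u = \inf\{H_0(x): x \in I\}$, $v = \sup\{H_0(x): x \in I\}$, $c = \frac{1}{2}(u + v)$, and $H = H_0 - c$. (Note that in general the values of $u$, $v$ need not be attained; the same is true for $\sup_{u, v \in I}\left|\int_u^v h(s)\,dg(s)\right| = \|h\|_g$.) Then

$$\begin{aligned}
H(x) &= \frac{1}{2}\int_a^x h(s)\,dg(s) - \frac{1}{2}\inf_{t \in I}\int_a^t h(s)\,dg(s) + \frac{1}{2}\int_a^x h(s)\,dg(s) - \frac{1}{2}\sup_{t \in I}\int_a^t h(s)\,dg(s) \\
&= \frac{1}{2}\sup_{t \in I}\left(\int_t^x h(s)\,dg(s)\right) + \frac{1}{2}\inf_{t \in I}\left(\int_t^x h(s)\,dg(s)\right) \le \frac{1}{2}\sup_{t \in I}\left(\int_t^x h(s)\,dg(s)\right) \le \frac{1}{2}\|h\|_g,
\end{aligned}$$

where the first inequality holds because the infimum is at most $0$ (take $t = x$). In the same way $H(x) \ge \frac{1}{2}\inf_{t \in I}\int_t^x h(s)\,dg(s) \ge -\frac{1}{2}\|h\|_g$, so that $|H(x)| \le \frac{1}{2}\|h\|_g$ on $I$. Thus we have

$$\int_a^b f(s)h(s)\,dg(s) = -\int_a^b H(s)\,df(s) + H(b)f(b) - H(a)f(a),$$

$$\left|\int_a^b f(s)h(s)\,dg(s)\right| \le \int_a^b |H(s)|\,dV_f(s) + \frac{1}{2}\|h\|_g\bigl(|f(b)| + |f(a)|\bigr),$$

and (9.46) follows. □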

The following result is a simple modification of an inequality in Karamata (1949, p. 79):

9.35. Theorem. Let $h, g, hf, fg \in L(\lambda)$, and denote $G(x) = \int_a^x g(s)\,d\lambda(s)$, $H(x) = \int_a^x h(s)\,d\lambda(s)$ ($H(x) > 0$ for all $x \in (a, b]$). If $f$ is a nonnegative, decreasing function on $I$ such that $\int_a^b h(s)f(s)\,d\lambda(s) > 0$, then

$$\inf_{a < x \le b}\frac{G(x)}{H(x)} \le \frac{\int_a^b g(s)f(s)\,d\lambda(s)}{\int_a^b h(s)f(s)\,d\lambda(s)} \le \sup_{a < x \le b}\frac{G(x)}{H(x)}. \tag{9.47}$$

Proof. Denote the terms on the left-hand side and right-hand side in (9.47) by $u$ and $v$, respectively. Then

$$uH(x) \le G(x) \le vH(x) \quad \text{for } x \in [a, b]$$

and

$$\begin{aligned}
\int_a^b f(s)g(s)\,d\lambda(s) &= f(b)G(b) + \int_a^b G(x)\,d(-f(x)) \\
&\le v\left(f(b)H(b) + \int_a^b H(x)\,d(-f(x))\right) \\
&= v\int_a^b f(s)h(s)\,d\lambda(s).
\end{aligned}$$

Since $\int_a^b f(s)h(s)\,d\lambda(s) > 0$, the right-hand inequality in (9.47) follows. □

Karamata gives a discrete analogue of (9.47), and an extension of that result is given by Simeunovic (see Mitrinovic, 1970, p. 223). In the following we give an integral analogue of Simeunovic's result (see also Pecaric and Savic, 1984): If the conditions of Theorem 9.35 are satisfied, $h(s) > 0$ for all $s \in I$, and $\lambda$ is increasing, then

$$\inf_{a \le x \le b}\frac{g(x)}{h(x)} \le \inf_{a < x \le b}\frac{G(x)}{H(x)} \le \frac{\int_a^b g(s)f(s)\,d\lambda(s)}{\int_a^b h(s)f(s)\,d\lambda(s)} \le \sup_{a < x \le b}\frac{G(x)}{H(x)} \le \sup_{a \le x \le b}\frac{g(x)}{h(x)}. \tag{9.48}$$

We also give the following generalization of Abel's inequality:

9.36. Theorem. Let $x_{ij}$ ($i = 1, \ldots, n$, $j = 1, \ldots, m$) be real numbers, and let $a = \{a_{ij}\}$ ($i = 1, \ldots, n$, $j = 1, \ldots, m$) be a nonnegative, decreasing $(1, 1)$-convex sequence of real numbers. Then

$$a_{11}\min X_{ij} \le \sum_{i=1}^n\sum_{j=1}^m x_{ij}a_{ij} \le a_{11}\max X_{ij} \tag{9.49}$$

holds, where $X_{ij} = \sum_{r=1}^i\sum_{s=1}^j x_{rs}$.

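Proof.

$$\begin{aligned}
\sum_{i=1}^n\sum_{j=1}^m x_{ij}a_{ij} &= a_{nm}X_{nm} - \sum_{r=1}^{n-1} X_{rm}\,\Delta_1 a_{rm} - \sum_{s=1}^{m-1} X_{ns}\,\Delta_2 a_{ns} + \sum_{r=1}^{n-1}\sum_{s=1}^{m-1} X_{rs}\,\Delta_{1,1} a_{rs} \\
&\le \max X_{ij}\left(a_{nm} - \sum_{r=1}^{n-1}\Delta_1 a_{rm} - \sum_{s=1}^{m-1}\Delta_2 a_{ns} + \sum_{r=1}^{n-1}\sum_{s=1}^{m-1}\Delta_{1,1} a_{rs}\right) \\
&= a_{11}\max X_{ij}.
\end{aligned}$$

This establishes the second inequality in (9.49). The first inequality can be proved similarly. □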
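The two-dimensional Abel inequality (9.49) can be spot-checked numerically. The sketch below is only an illustration, not part of the original text: it assumes the test family $a_{ij} = u_iv_j$ with $u$, $v$ nonnegative and decreasing (such an array is nonnegative, decreasing in each index, and $(1,1)$-convex because $\Delta_{1,1}a_{ij} = \Delta u_i\,\Delta v_j \ge 0$), and the helper names are hypothetical. Integer data keep the arithmetic exact.

```python
import random


def prefix_sums_2d(x):
    """Return X with X[i][j] = sum of x[r][s] over r <= i, s <= j (0-based)."""
    n, m = len(x), len(x[0])
    X = [[0] * m for _ in range(n)]
    for i in range(n):
        running = 0  # partial sum of row i up to column j
        for j in range(m):
            running += x[i][j]
            X[i][j] = running + (X[i - 1][j] if i > 0 else 0)
    return X


def abel_bounds(x, a):
    """Return (a_11 * min X_ij, sum of x_ij * a_ij, a_11 * max X_ij)."""
    partial = [value for row in prefix_sums_2d(x) for value in row]
    total = sum(x[i][j] * a[i][j]
                for i in range(len(x)) for j in range(len(x[0])))
    return a[0][0] * min(partial), total, a[0][0] * max(partial)


def random_instance(rng, n, m):
    """Random integer x and a product array a_ij = u_i * v_j (assumed test family)."""
    u = sorted((rng.randint(0, 9) for _ in range(n)), reverse=True)
    v = sorted((rng.randint(0, 9) for _ in range(m)), reverse=True)
    a = [[u[i] * v[j] for j in range(m)] for i in range(n)]
    x = [[rng.randint(-5, 5) for _ in range(m)] for _ in range(n)]
    return x, a


rng = random.Random(0)
all_ok = all(lo <= mid <= hi
             for lo, mid, hi in (abel_bounds(*random_instance(rng, 4, 3))
                                 for _ in range(200)))
print(all_ok)
```

A passing check on random instances is of course no proof; it only confirms that the reconstruction of (9.49) is consistent on the assumed family.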

9.37. Remark. Integral analogues of this and of the next two theorems can be proved similarly. □

In the next result (proved in Pecaric, 1979a), we shall adopt the notation in Theorem 9.10.

9.38. Theorem. Let $x_{i_1 \cdots i_m}$ ($1 \le i_k \le n$, $1 \le k \le m$) be real numbers.

(a) For all nonnegative and decreasing $n$-tuples $a_j$ ($1 \le j \le m$), we have

$$a_{11}\cdots a_{m1}\min X_{s_1 \cdots s_m} \le F(a_1, \ldots, a_m) \le a_{11}\cdots a_{m1}\max X_{s_1 \cdots s_m}. \tag{9.50}$$

(b) If $a_j$ ($1 \le j \le m$) are monotonic $n$-tuples, then

$$|F(a_1, \ldots, a_m)| \le \max|X_{s_1 \cdots s_m}|\prod_{j=1}^m\bigl(|a_{jn}| + |a_{jn} - a_{j1}|\bigr). \tag{9.51}$$

The following result is given in Pecaric, Mesihovic, and Milovanovic (1989):

9.39. Theorem. Let $x_{ij}$ ($i = 1, \ldots, N$, $j = 1, \ldots, M$) and $F(a)$ be defined as in Theorem 9.12(a), and let $e_{n,m} = (i - 1)^{(n)}(j - 1)^{(m)}$.

(a) If $a_{ij}$ ($i = 1, \ldots, N$; $j = 1, \ldots, M$) are real numbers such that $\Delta_{n,m}a_{ij} \ge a$ ($i = 1, \ldots, N - n$, $j = 1, \ldots, M - m$), then

$$F(a) \ge \frac{a}{n!\,m!}F(e_{n,m}). \tag{9.52}$$

(b) If $|\Delta_{n,m}a_{ij}| \le A$ ($i = 1, \ldots, N - n$; $j = 1, \ldots, M - m$), then

$$|F(a)| \le \frac{A}{n!\,m!}F(e_{n,m}). \tag{9.53}$$

Chapter 10 Orderings and Convexity-Preserving Transformations

10.1. Orderings of Convexity: Generalizations and Related Results Partial orderings of notions of convexity and related preservation properties play an important role in the theory of inequalities. In this section we discuss some useful results on this topic. For notational convenience, we shall express a sequence { a , } ~(defined for n = 0, 1, 2, . . .) simply as {a,}. We first observe the following result due to Ozeki (1968): Let { a , } ; be an increasing sequence, and let the sequences { B , } ; and {C,}; be defined by B, = (lln) C;='=,iaj and C, = (lln) Cy=n bi (bo = 1). Then

for n = 2 , 3 , . . . and C , r C , / 2 .

Of course, the main results of Ozeki (1965, 1967, 1968, 1969, 1970, 1971, 1972) are for the sequence $\{A_n\}_1^\infty$, where $A_n = (1/n)\sum_{i=1}^n a_i$. Ozeki (1972) proved that if the sequence $\{a_n\}_1^\infty$ is $k$-convex, then $\{A_n\}_1^\infty$ is also $k$-convex. His proof depends on the following identity (see also Mitrinovic, Lackovic, and Stankovic, 1979):

$$(n + k)\Delta^k A_n = (n - 1)\Delta^k A_{n-1} + \Delta^k a_n \quad \text{for } n = 2, 3, \ldots.$$
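The identity relating $\Delta^k A_n$, $\Delta^k A_{n-1}$, and $\Delta^k a_n$ can be checked with exact rational arithmetic. The following is a minimal sketch, assuming that $\Delta$ denotes the forward difference $\Delta a_n = a_{n+1} - a_n$ and that sequences are indexed from $n = 1$; the function names are illustrative only.

```python
from fractions import Fraction


def forward_difference(seq, k):
    """k-th forward difference of a list: (delta seq)[i] = seq[i + 1] - seq[i]."""
    for _ in range(k):
        seq = [seq[i + 1] - seq[i] for i in range(len(seq) - 1)]
    return seq


def arithmetic_means(a):
    """Return [A_1, A_2, ...] with A_n = (a_1 + ... + a_n) / n."""
    means, running = [], Fraction(0)
    for n, value in enumerate(a, start=1):
        running += value
        means.append(running / n)
    return means


def ozeki_identity_holds(a, k):
    """Check (n + k) D^k A_n == (n - 1) D^k A_{n-1} + D^k a_n for n >= 2."""
    dA = forward_difference(arithmetic_means(a), k)  # dA[n - 1] is D^k A_n
    da = forward_difference(list(a), k)              # da[n - 1] is D^k a_n
    return all((n + k) * dA[n - 1] == (n - 1) * dA[n - 2] + da[n - 1]
               for n in range(2, len(dA) + 1))


sample = [3, -1, 4, 1, -5, 9, 2, 6]
print([ozeki_identity_holds(sample, k) for k in (1, 2, 3)])

# A 3-convex example: a_n = n^3 has constant third difference 6,
# and its means A_n = n (n + 1)^2 / 4 have constant third difference 3/2.
cubes = [n ** 3 for n in range(1, 11)]
print(forward_difference(arithmetic_means(cubes), 3))
```

The second example illustrates the preservation statement: the third differences of the means stay nonnegative when those of the original sequence are.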

From this identity we can obtain the following inequality for k-convex sequences:


For the case $k = 2$, Ozeki first considered this problem in 1965 and gave the following list of implications:

where the inequalities hold for every $n \in \mathbb{N}$ (the set of all positive integers). It should be pointed out that the implication $\Delta^2\log a_n \ge 0 \Rightarrow \Delta^2 a_n \ge 0$ was proved earlier by Montel (1928).

On the other hand, let $A_n(a, p) = (1/P_n)\sum_{i=1}^n p_ia_i$, where $p = (p_1, \ldots, p_n)$ is a real $n$-tuple and $P_n = \sum_{i=1}^n p_i$. Then the following identity is valid:

$$A_{k+1}(a, p) - A_k(a, p) = \frac{p_{k+1}}{P_kP_{k+1}}\sum_{i=1}^k p_i(a_{k+1} - a_i), \tag{10.1}$$

and it follows that if the sequence $\{a_n\}$ is increasing, then for arbitrary weights $p_i > 0$ ($i \in \mathbb{N}$) the sequence $\{A_n(a, p)\}$ is also increasing. This, of course, is a weighted version of Ozeki's result for $k = 1$.
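Identity (10.1) and the monotonicity it implies are easy to verify exactly. The sketch below is illustrative only; it assumes positive integer weights so that `Fraction` arithmetic is exact, and the names `weighted_means` and `rhs_10_1` are hypothetical.

```python
from fractions import Fraction


def weighted_means(a, p):
    """Return [A_1(a, p), ..., A_n(a, p)] with A_k = (p_1 a_1 + ... + p_k a_k) / P_k."""
    means, numerator, denominator = [], Fraction(0), Fraction(0)
    for a_i, p_i in zip(a, p):
        numerator += Fraction(p_i) * a_i
        denominator += p_i
        means.append(numerator / denominator)
    return means


def rhs_10_1(a, p, k):
    """Right-hand side of (10.1) for 1 <= k < len(a).

    Lists are 0-based, so a[k] is a_{k+1} and p[k] is p_{k+1}.
    """
    P_k = sum(p[:k])
    P_k1 = P_k + p[k]
    return Fraction(p[k], P_k * P_k1) * sum(p[i] * (a[k] - a[i]) for i in range(k))


a = [1, 2, 2, 5, 7]   # an increasing sequence
p = [3, 1, 4, 1, 5]   # positive integer weights
A = weighted_means(a, p)
identity_ok = all(A[k] - A[k - 1] == rhs_10_1(a, p, k) for k in range(1, len(a)))
increasing_ok = all(A[k] >= A[k - 1] for k in range(1, len(a)))
print(identity_ok, increasing_ok)
```

Since every term $p_i(a_{k+1} - a_i)$ on the right of (10.1) is nonnegative for increasing $a$ and positive weights, the second flag simply restates the conclusion drawn in the text.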

10.1. Remarks. (a) A related result is the following (see Pecaric, 1980b): Let $p$ be such that $0 < P_k < P_n$ ($k = 1, \ldots, n - 1$), and let $a$ be an increasing $n$-tuple. Then

(10.2)

Integral analogues of these results are also valid. Note that an integral analogue of (10.2) is given by Lovera (1957) and Ozeki (1965) when the weights are positive. The special case in which the weights are 1 was considered by Mott (1963) (see also Mitrinovic, 1970, p. 9).

(b) Note that the monotonicity property of the arithmetic mean stated above can be used to prove the monotonicity property of an arbitrary quasi-arithmetic mean (Bullen, Mitrinovic, and Vasic, 1987, p. 215), i.e., we have

where $M: (u, v) \to \mathbb{R}$ ($-\infty \le u < v \le \infty$) is a continuous and strictly monotonic function, $u \le a_i \le v$, and $p_i \ge 0$ ($i = 1, \ldots, n$). □


We also note that the generalization of Ozeki's result for $k > 1$ is not possible for arbitrary weights. To see this, let us consider the sequence $\{A_n\}$ defined by (10.3), where $a = \{a_n\}$ is a real sequence and $p = \{p_n\}$ is a positive sequence, and observe that:

10.2. Theorem. If $a$ is a $k$-convex sequence, then the sequence defined in (10.3) is $k$-convex iff the sequence $p$ is given by

$$p_n = p_0\binom{u + n - 1}{n}, \tag{10.4}$$

where $p_0$ and $u$ are positive real numbers.

10.3. Remark. For $k = 2$, Theorem 10.2 is proved in Vasic, Kecic, Lackovic, and Mitrovic (1972). For general $k$, an attempt to prove this theorem was made by Lackovic and Simic (1974), and a proof can be found in Toader (1988c). □

A result of Toader (1983), which is a modification of the result of Bruckner and Ostrow (1962) (see (1.24)) for sequences, deals with an ordering of convexity. In the following we give some further generalizations of that result.

Let $K$ be the class of convex sequences $a = \{a_n\}$, and let $S^*$ be the class of starshaped sequences, i.e., sequences $a$ such that the sequence $\{(a_{n+1} - a_0)/(n + 1)\}$ is increasing for $n \ge 0$. Let $S$ be the class of superadditive sequences, satisfying $a_{n+m} - a_n - a_m + a_0 \ge 0$ for every $n, m > 0$, and let $W$ be the class of weak-superadditive sequences, i.e., for every $n$ we have $a_{n+1} - a_n - a_1 + a_0 \ge 0$. We say that the sequence $a = \{a_n\}$ has the property "P" in the $u$-mean if the sequence $A^u = \{A_n^u\}$ given in (10.3) and (10.4) has the property "P." We denote by $M^uK$, $M^uS^*$, $M^uS$, and $M^uW$ the sets of sequences which are convex, starshaped, superadditive, and weak-superadditive in $u$-mean, respectively. The following result is given in Toader (1983): If $0 < v < u$, then we have

$$K \subset M^uK \subset M^vK \subset S^* \subset M^uS^* \subset M^vS^*,$$

$$S^* \subset S \subset W, \qquad M^uS^* \subset M^uS \subset M^uW, \qquad M^vS^* \subset M^vS \subset M^vW, \qquad W \subset M^uW \subset M^vW.$$

Some more general results can also be found in Toader (1986a).
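The chain $K \subset S^* \subset S \subset W$ can be illustrated on finite segments of sequences. The sketch below is an illustration under stated assumptions: sequences are finite integer lists indexed from $0$, the predicates test the defining inequalities only on the available indices, and the two example sequences are chosen here (they are not from the text) to show that the first two inclusions are strict.

```python
from fractions import Fraction


def is_convex(a):
    """Second differences are nonnegative."""
    return all(a[n + 2] - 2 * a[n + 1] + a[n] >= 0 for n in range(len(a) - 2))


def is_starshaped(a):
    """(a_{n+1} - a_0) / (n + 1) is increasing in n >= 0."""
    ratios = [Fraction(a[n + 1] - a[0], n + 1) for n in range(len(a) - 1)]
    return all(ratios[n + 1] >= ratios[n] for n in range(len(ratios) - 1))


def is_superadditive(a):
    """a_{n+m} - a_n - a_m + a_0 >= 0 for all n, m > 0 within range."""
    size = len(a)
    return all(a[n + m] - a[n] - a[m] + a[0] >= 0
               for n in range(1, size) for m in range(1, size - n))


def is_weak_superadditive(a):
    """a_{n+1} - a_n - a_1 + a_0 >= 0 for all n within range."""
    return all(a[n + 1] - a[n] - a[1] + a[0] >= 0 for n in range(len(a) - 1))


def classify(a):
    return (is_convex(a), is_starshaped(a), is_superadditive(a), is_weak_superadditive(a))


convex_seq = [n * n - 3 * n + 1 for n in range(8)]
star_not_convex = [0, 2, 12, 21, 40]
super_not_star = [0, 1, 3, 4, 6]

for seq in (convex_seq, star_not_convex, super_not_star):
    print(seq, classify(seq))
```

Each tuple reads (convex, starshaped, superadditive, weak-superadditive); a `True` entry is always followed by `True` entries, in line with the inclusions stated above.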



Mocanu (1982) considered the weighted mean

$$F_g(x) = \frac{1}{g(x)}\int_0^x g'(t)f(t)\,dt, \tag{10.5}$$

where $g$ is a real-valued function such that $g'$ exists. Toader (1986b) proved the following results: If the transformation (10.5) preserves the convexity (or the starshapedness, or the superadditivity), then the function $g$ is of the form

$$g(x) = kx^u, \quad u > 0, \quad k \ne 0. \tag{10.6}$$

Denoting by $F_u$ the function in (10.5) with $g$ given in (10.6), let $M^uK(b)$, $M^uS^*(b)$, and $M^uS(b)$ denote the classes of functions $f \in C(b)$ with the property that the corresponding functions $F_u$ belong to $K(b)$, $S^*(b)$, and $S(b)$, respectively, where $C(b)$, $K(b)$, $S^*(b)$, and $S(b)$ are the classes of continuous, convex, starshaped, and superadditive functions, respectively. It follows that if $0 < v < u$, then the following implications are true:

$$K(b) \subset M^uK(b) \subset M^vK(b) \subset S^*(b) \subset M^uS^*(b) \subset M^vS^*(b),$$

$$S^*(b) \subset S(b), \qquad M^uS^*(b) \subset M^uS(b), \qquad M^vS^*(b) \subset M^vS(b).$$

Note that (10.6) was also obtained by Lackovic (1975). Some similar results are given in Toader (1986a, 1988b). Lackovic (1975) also proved the following result: Let the function $f$ be defined and continuous on $[0, b]$ such that $f(0) = 0$, and consider the following conditions:

(i) $f$ is $m$-convex on $[0, b]$,
(ii) $f$ is $m$-convex in mean, i.e., the function $F(x) = (1/x)\int_0^x f(t)\,dt$ ($0 < x \le b$), where $F(0) = 0$, is $m$-convex on $[0, b]$,
(iii) the function $f(x)/x$ is convex of order $m - 1$ on $(0, b]$,
(iv) $f$ is superadditive of order $m$ on $[0, b]$ (see Remark 6.13(b)),
(v) $f$ is superadditive of order $m$ in mean, i.e., $F$ is superadditive of order $m$ on $[0, b]$.

Then the following implications are valid: (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (v).

Partial orderings for sequences that are convex of higher order were also considered by Toader (1985c). He considered the ordering of order three, and provided the following definitions: The sequence $\{a_n\}$ is said to be:

(i) starshaped of order three if the sequence $\{(a_{n+1} - a_0)/(n + 1)\}$ is convex of order two;
(ii) superadditive of order three if

for every $m, n, p \ge 0$;
(iii) 2-starshaped of order three if it satisfies the relation

$$\frac{a_{n+3} - a_0}{n + 3} \ge \frac{a_{n+2} - a_1}{n + 1} \quad \text{for } n \ge 0.$$

Let us denote by $K_3$, $S_3^*$, $S_3$, and $S_3^{2*}$ the classes of convex, starshaped, superadditive, and 2-starshaped of order three sequences, respectively. Further, we denote by $M^uK_3$, $M^uS_3^*$, $M^uS_3$, and $M^uS_3^{2*}$ the classes of sequences $\{a_n\}$ with the property that $\{A_n^u\}$ given in (10.3) and (10.4) is in $K_3$, $S_3^*$, $S_3$, and $S_3^{2*}$, respectively. If $0 < u < v$, then the following implications are true:

For results of higher order, see Toader (1986a). Finally, we give some generalizations of Theorem 10.2. Let $(p_{n,i})$ ($i = 0, 1, \ldots, n$; $n = 0, 1, 2, \ldots$) be a triangular matrix of real numbers, and let $A(a) = \{A_n(a)\}$ be the sequence defined as

$$A_n(a) = \sum_{k=0}^n p_{n,n-k}\,a_k \quad \text{for } n = 0, 1, 2, \ldots. \tag{10.7}$$

Ozeki (1967) obtained necessary conditions for a triangular matrix $(p_{n,i})$ to possess the following property: The sequence $\{A_n(a)\}$ defined in (10.7) is convex for every convex sequence $\{a_n\}$ (see also Mitrinovic, Lackovic, and Stankovic, 1979; Lupas, 1979; and Kotkowski and Waszak, 1978). The following is a generalization of Ozeki's result (see Lupas, 1979).


10.4. Theorem. Let $A_n(a)$ be defined as in (10.7). Then the implication

$$\Delta^r a_n \ge 0 \Rightarrow \Delta^r A_n(a) \ge 0 \tag{10.8}$$

is valid for every sequence $\{a_n\}$ iff

$$\Delta^r X_n(j + 1, j) = 0 \quad \text{for } j = 0, 1, \ldots, r - 1; \quad n = 0, 1, 2, \ldots$$

and

$$\Delta^r X_n(r, i + r) \ge 0 \quad \text{for } i = 0, 1, \ldots, n; \quad n = 0, 1, 2, \ldots, \tag{10.9}$$

where

$$X_n(m, k) = \sum_{j=0}^{n-k}\binom{n - k + m - 1 - j}{m - 1}p_{n,j} \quad \text{for } n \ \ldots$$

... $0$ for all $n$. Similarly, we have $p_na_n = t_0t_1\cdots t_{n-1}(t_n - 1)$,

so that from $p_na_n > 0$ we have $t_n > 1$. Since by assumption $a_{n-1}a_{n+1} \ge a_n^2$ holds, we have

$$t_{n+1} \ge 1 + \frac{q_nt_{n-1}(t_n - 1)^2}{t_n(t_{n-1} - 1)}. \tag{10.21}$$

Using (10.15), the definition of the sequence $\{r_n\}$, and (10.21), we then have

$$A_{n-1}A_{n+1} - A_n^2 = C(t_{n+1} - t_nQ_n) \ge C\left(\frac{q_nt_{n-1}(t_n - 1)^2}{t_n(t_{n-1} - 1)} + 1 - t_nQ_n\right) = C_1f(q_n, Q_n, t_{n-1}, t_n),$$

where $C$ and $C_1$ satisfy

$$C \ge 0 \quad \text{and} \quad C_1 = \frac{C}{t_n(t_{n-1} - 1)} > 0 \quad \text{for } n > 1,$$

and

$$f(t_n) = (q_nt_{n-1} - Q_nt_{n-1} + Q_n)t_n^2 + (t_{n-1} - 1 - 2q_nt_{n-1})t_n + q_nt_{n-1} \tag{10.22}$$

is a function of $t_n$ only when $q_n$, $Q_n$, $t_{n-1}$ are kept fixed. From (10.18) we have $q_nt_{n-1} - Q_nt_{n-1} + Q_n > 0$. Clearly the discriminant form of $f(t_n)$ is

$$D = (t_{n-1} - 1)\bigl(t_{n-1}(1 - 4q_n(1 - Q_n)) - 1\bigr).$$

If $D \le 0$, then it is immediate that $A_{n-1}A_{n+1} - A_n^2 \ge 0$; thus we consider only the case $D > 0$. Since $t_{n-1} > 1$, we see that $t_{n-1}(1 - 4q_n(1 - Q_n)) > 1$ and $1 - 4q_n(1 - Q_n) > 0$. Thus the condition $D > 0$ implies

$$t_{n-1} > \frac{1}{1 - 4q_n(1 - Q_n)}. \tag{10.23}$$

To show that $A_{n-1}A_{n+1} - A_n^2 \ge 0$ holds for all $n$, we proceed by induction. For $n = 1$, it follows by a direct verification that

$$A_0A_2 - A_1^2 = \frac{(p_0a_0)^2}{r_0r_2}\left(-Q_1t^2 + (1 - 2Q_1)t + 1 - Q_1 + \frac{p_2a_2}{p_0a_0}\right),$$

where $t = (p_1a_1)/(p_0a_0) > 0$. Since $C = (p_0a_0)^2/(r_0r_2) > 0$, we have

$$\begin{aligned}
A_0A_2 - A_1^2 &= C\left((q_1 - Q_1)t^2 + (1 - 2Q_1)t + (1 - Q_1) - q_1t^2 + \frac{p_2a_2}{p_0a_0}\right) \\
&= C\left((q_1 - Q_1)t^2 + (1 - 2Q_1)t + 1 - Q_1 + \frac{p_2(a_0a_2 - a_1^2)}{p_0a_0^2}\right) \\
&\ge C\bigl((q_1 - Q_1)t^2 + (1 - 2Q_1)t + 1 - Q_1\bigr) \equiv Cf_1(t)
\end{aligned}$$

(note that the last step is given incorrectly in Ozeki, 1967). The discriminant form of the polynomial $f_1(t)$ is

$$D_0 = (1 - 2Q_1)^2 - 4(1 - Q_1)(q_1 - Q_1) = 1 - 4q_1(1 - Q_1).$$

Applying (10.16) and (10.19) for $n = 1$, we find that $D_0 \le 0$, i.e., $f_1(t) \ge 0$. Thus we have $A_0A_2 - A_1^2 \ge 0$. Now let us assume that (10.24) holds for some $n \ge 1$, where $C \ge 0$ was defined earlier. By a direct verification we have

$$f(Q_{n-1}t_{n-1}) = t_{n-1}\bigl((q_n - Q_n)Q_{n-1}^2t_{n-1}^2 + (Q_{n-1}Q_n + 1 - 2q_n)Q_{n-1}t_{n-1} + (q_n - Q_{n-1})\bigr) = t_{n-1}F(t_{n-1}),$$

and the discriminant form of the quadratic polynomial $F(t)$ is

$$D_1 = (1 - Q_{n-1}Q_n)^2 - 4q_n(1 - Q_n)(1 - Q_{n-1}).$$

From (10.19) we have $D_1 \le 0$, i.e.,

$$f(Q_{n-1}t_{n-1}) \ge 0. \tag{10.25}$$

Thus the assumption $D > 0$ implies that the equation $f(t) = 0$ (where $f$ is defined in (10.22)) has two distinct real roots, $\alpha$ and $\beta$, say. Moreover, we have

$$Q_{n-1}t_{n-1} - \frac{\alpha + \beta}{2} = \frac{1}{2\bigl((q_n - Q_n)t_{n-1} + Q_n\bigr)}\bigl(2Q_{n-1}(q_n - Q_n)t_{n-1}^2 + (2Q_nQ_{n-1} + 1 - 2q_n)t_{n-1} - 1\bigr) = C_2f_2(t_{n-1}),$$

where $C_2 > 0$. For the quadratic trinomial $f_2(t)$ we have

$$f_2(0) = -1, \tag{10.26}$$

and by a direct calculation we find that

$$f_2\left(\frac{1}{1 - 4q_n(1 - Q_n)}\right) = C_3(2Q_n - 1)\bigl(Q_{n-1}(2Q_n - 1) - 1 + 4q_n(1 - Q_n)\bigr), \tag{10.27}$$

where, from (10.17), $2Q_n > 1$ and $C_3 > 0$. From (10.19) we have

$$4q_n(1 - Q_n) \ge \frac{(1 - Q_{n-1}Q_n)^2}{1 - Q_{n-1}}.$$

Thus from (10.27) we obtain

$$f_2\left(\frac{1}{1 - 4q_n(1 - Q_n)}\right) \ge C_4(Q_{n-1})^2(1 - Q_n)^2 \ge 0, \tag{10.28}$$

where $C_4 > 0$. Combining (10.23), (10.25), (10.26), and (10.28), we then have (10.29), and from (10.24) it follows that $t_n \ge t_{n-1}Q_{n-1}$. Consequently, from (10.25) and (10.29) we obtain $f(t_n) \ge 0$, and hence $A_{n-1}A_{n+1} - A_n^2 \ge 0$.

Note that in Theorem 10.9 if we take r., = 1 and Po = PI = ... = p.; = 1 for all n = 0, 1, ... , then the assumptions (10.26)-(10.29) are satisfied and, from the log-convexity property of the sequence {an} the logconvexity property of the sequence {An} follows, where An is given by An = (lin) ~ 7 ~ a., 1



Ozeki (1967) also obtained the following result: If a positive sequence {an} is log-convex, then the sequence {an} defined by for

n = 0, 1, ...

(10.30)

is also log-convex. Together with the sequence {an} let us consider another sequence {Sn} given by (10.31) It is clear that {an} is log-convex iff {Sn} is log-convex, and Ozeki's

(1967) proof is based on this idea.

10.2. Various Results

Let $\{a_n\}$ and $\{b_n\}$ be two given sequences. A result which is more general than (10.31) was given by Davenport and Pólya (1949):

10.10. Theorem. Let the sequence $\{w_n\}$ be defined by

(10.32)

where $\{a_n\}$ and $\{b_n\}$ are positive and log-convex. Then the sequence $\{w_n\}$ is also positive and log-convex.

Let the sequence $\{c_n\}$ be defined by

$$c_n = \sum_{k=0}^n a_kb_{n-k}, \tag{10.33}$$

which is the convolution of $\{a_n\}$ and $\{b_n\}$. Then (10.33) gives the coefficients of the expansion

$$\left(\sum_{n=0}^\infty a_nx^n\right)\left(\sum_{n=0}^\infty b_nx^n\right) = \sum_{n=0}^\infty c_nx^n,$$

and the following result is valid (Kalusa, 1928; Karamata, 1933; Davenport and Pólya, 1949; Jurkat, 1954; Lorentz, 1954; and Menon, 1969):

10.11. Theorem. If $\{a_n\}$ and $\{b_n\}$ are positive and logarithmically concave (written as log-concave) sequences, then their convolution $\{c_n\}$ defined in (10.33) is also positive and log-concave.
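Theorem 10.11 can be illustrated on small examples. In the sketch below the two log-concave sequences are arbitrary choices made here for illustration, and only the first `min(len(a), len(b))` convolution terms are computed, so every computed $c_n$ uses genuine terms of both sequences. The last lines use $h_n = 1/(n + 1)$, a log-convex sequence, to show that log-convexity is not inherited by the convolution.

```python
from fractions import Fraction


def convolve(a, b):
    """c_n = sum_{k=0}^{n} a_k b_{n-k} for n < min(len(a), len(b))."""
    terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(terms)]


def is_log_concave(s):
    """s_i^2 >= s_{i-1} s_{i+1} at every interior index."""
    return all(s[i] ** 2 >= s[i - 1] * s[i + 1] for i in range(1, len(s) - 1))


def is_log_convex(s):
    """s_i^2 <= s_{i-1} s_{i+1} at every interior index."""
    return all(s[i] ** 2 <= s[i - 1] * s[i + 1] for i in range(1, len(s) - 1))


a = [1, 5, 10, 10, 5, 1]   # binomial coefficients, log-concave
b = [1, 2, 3, 4, 5, 6]     # log-concave
c = convolve(a, b)
print(c, is_log_concave(c))

h = [Fraction(1, n + 1) for n in range(6)]   # log-convex
d = convolve(h, h)
print(is_log_convex(h), is_log_convex(d))
```

For the log-convex example one finds $d_0 = d_1 = 1$ and $d_2 = 11/12$, so $d_0d_2 < d_1^2$ and the convolution fails to be log-convex already at the first interior index.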

Note that if $\{a_n\}$ and $\{b_n\}$ are positive and log-convex, then their convolution $\{c_n\}$ need not be log-convex. The following result is proved in Vinogradov (1975): Let $\{a_n\}$ and $\{b_n\}$ be two nonnegative log-concave sequences such that $a_0 - \lambda a_1 \ge 0$ and $b_0 - \lambda b_1 \ge 0$, where $\lambda \ge 0$. Let the sequence $\{c_n\}$ be defined by

$$c_n = \sum_{i=0}^n a_ib_{n-i} - \lambda\sum_{i=0}^{n-1} a_{i+1}b_{n-i} \quad \text{for } n \ge 1.$$

Then $\{c_n\}$ is also log-concave.

Ozeki (1969) also considered a sequence related to the convolution sequence $\{c_n\}$; namely, for given $\{a_n\}$ and $\{b_n\}$, let $\{\bar c_n\}$ be given by

$$\bar c_n = \frac{1}{n + 1}\sum_{k=0}^n a_kb_{n-k}. \tag{10.34}$$

Then the following theorems are valid:

10.12. Theorem. Let $\{a_n\}$ and $\{b_n\}$ be positive and convex sequences. If $a_1 \ge a_0$ and $b_1 \ge b_0$, then the sequence $\{\bar c_n\}$ defined in (10.34) is convex. In other words, if (10.35) holds for $i = 1, 2, \ldots$, then $\Delta^2\bar c_{i-1} \ge 0$ for $i = 1, 2, \ldots$.

The following theorem concerns log-convex sequences.

10.13. Theorem. If $\{a_n\}$ and $\{b_n\}$ satisfy the conditions

$$a_{i-1}a_{i+1} \ge a_i^2, \qquad b_{i-1}b_{i+1} \ge b_i^2 \quad \text{for } i = 1, 2, \ldots,$$

then the sequence $\{\bar c_n\}$ defined in (10.34) also satisfies the condition

$$\bar c_{i-1}\bar c_{i+1} \ge \bar c_i^2 \quad \text{for } i = 1, 2, \ldots. \tag{10.36}$$

It is easy to prove that if $\{a_n\}$ and $\{b_n\}$ satisfy the conditions of Theorem 10.12, then they are increasing. This fact follows from (10.35) by induction. The proof of Theorem 10.12, as given in Ozeki (1969), is very similar to that of Theorem 10.9. By a direct calculation we can verify that for the sequence $\{\bar c_n\}$ defined in (10.34) the equality

$$\Delta^2\bigl((n + 1)\bar c_n\bigr) = (n + 1)\Delta^2\bar c_n + 2\Delta\bar c_{n+1}$$

holds. This implies that if $\{a_n\}$ and $\{b_n\}$ satisfy the conditions of Theorem 10.12, then not only is $\{\bar c_n\}$ convex, but also their convolution sequence $\{c_n\}$ defined in (10.33) is convex. Some related problems were studied in Jurkat (1954). The proof of Theorem 10.13 given in Ozeki (1969) is also similar to that of Theorem 10.9. Note that the sequence $\{\bar c_n\}$ defined in (10.34) can be written in the form $\bar c_n = (1/(n + 1))c_n$, where $c_n$ is given by (10.33). Upon applying this identity, the inequality in (10.36) becomes (see Mitrinovic, Lackovic, and Stankovic, 1979)

$$\frac{(i + 1)^2}{i(i + 2)}c_{i-1}c_{i+1} \ge c_i^2 \quad \text{for } i = 1, 2, \ldots,$$

which is a weaker result than log-convexity. Ozeki (1970) also gave some theorems related to coefficients of functions given in the form of a power series. In particular, the following result was proved.

10.14. Theorem.

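For a given real sequence $\{p_n\}_1^\infty$ let the sequence $\{q_n\}_1^\infty$ be defined by the following equality

$$1 + \sum_{k=1}^\infty q_kx^k = \left(1 - \sum_{k=1}^\infty p_kx^k\right)^{-1}; \tag{10.37}$$

that is, let

$$q_n = p_n + \sum_{k=1}^{n-1} p_kq_{n-k} \quad \text{for } n = 2, 3, \ldots. \tag{10.38}$$

If $\{p_n\}_1^\infty$ is positive and log-convex, then

$$q_n > 0 \quad \text{and} \quad q_nq_{n+2} \ge q_{n+1}^2 \quad \text{for } n = 1, 2, \ldots. \tag{10.39}$$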

The proof of this theorem, given in Ozeki (1970), is based on the following theorem, which can be found in the same paper. We note that the fact qn > 0 was already proved in Karamata (1933). Some similar (but more general) results were obtained by Jurkat (1954).
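The recursion (10.38) is straightforward to run, and the conclusion (10.39) can be observed on an example. The sketch below assumes $q_1 = p_1$ (which is what (10.37) gives at order $x$) and uses the illustrative choice $p_n = 1/(n + 1)$, a positive log-convex sequence; the function name is hypothetical.

```python
from fractions import Fraction


def inverse_series(p, terms):
    """Return [q_1, ..., q_terms] from recursion (10.38); p[k - 1] holds p_k."""
    q = []
    for n in range(1, terms + 1):
        # q_n = p_n + sum_{k=1}^{n-1} p_k q_{n-k}; q[j] holds q_{j+1}
        q.append(p[n - 1] + sum(p[k - 1] * q[n - k - 1] for k in range(1, n)))
    return q


terms = 12
p = [Fraction(1, n + 1) for n in range(1, terms + 1)]   # p_n = 1 / (n + 1)
q = inverse_series(p, terms)

positive = all(value > 0 for value in q)
log_convex = all(q[n] * q[n + 2] >= q[n + 1] ** 2 for n in range(terms - 2))
print(q[:3], positive, log_convex)
```

The first values are $q_1 = 1/2$, $q_2 = 7/12$, $q_3 = 17/24$, and indeed $q_1q_3 = 51/144 \ge 49/144 = q_2^2$.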


10.15. Theorem. Let $D_n$ ($n = 1, 2, \ldots$) denote the determinant of the $n \times n$ matrix $(a_{ij})$, where

$$a_{ij} = -1 \quad \text{for } j = i - 2, \qquad \text{and} \qquad a_{ij} = 0 \quad \text{for } j < i - 2.$$

If

and

then $(-1)^nD_n \ge 0$ ($n = 1, 2, \ldots$).

The proof of this theorem by induction is given in Ozeki (1970). A result that is similar to Theorem 10.14 is proved in Ozeki (1971):

10.16. Theorem.

Let {a;}, {b;}, and {c.} be sequences of real numbers such that the following conditions are satisfied:

rr

n i=O

(x

~

+ aZi+l) = j-::O

rr n

(n +i 1) b.x

n+l-i

b o = 1,

,

(10.40)

n

(x

+ azi )

i ~ O

If a, > 0 and aiai+Z - af+l

=

L cjx

n

-

1

Co

,

(10.41)

= 1.

i ~ O

2= 0

(i

= 1, 2, ... ),

then b,

2= c,

(i

= 1, ...

, n).

The proof of Theorem 10.16 given in Ozeki involves a partial ordering defined below: Let $\sum_{k=0}^\infty a_kx^k$ and $\sum_{k=0}^\infty b_kx^k$ be two power series. They are said to be partially ordered, denoted by $\sum_{k=0}^\infty a_kx^k \gg \sum_{k=0}^\infty b_kx^k$, if $a_k \ge b_k$ holds for $k = 0, 1, 2, \ldots$. Ozeki (1968) proved the following results concerning this partial ordering:

10.17. Theorem. Assume that $f(x) = \sum_{k=0}^\infty p_kx^k$. If the sequence $\{p_k\}$ is positive and log-convex, then

$$\frac{f^{(k-1)}(x)\,f^{(k+1)}(x)}{(k - 1)!\,(k + 1)!} \gg \left(\frac{f^{(k)}(x)}{k!}\right)^2 \quad \text{for } k = 1, 2, \ldots, \tag{10.42}$$

where the derivatives $f^{(m)}$ are taken in the usual sense.

The proof of this theorem, as stated in Ozeki (1968), follows immediately by calculating the coefficient of $x^n$ on the left-hand and the right-hand sides of (10.42).


Ozeki (1971) also studied some properties of the partial ordering for polynomials (see also Mitrinovic, Lackovic, and Stankovic, 1979).

Let $(E, \mathcal{A}, \mu)$ be a probability space. For $p \in \mathbb{R}$ let $D_{p,\mu}$ be the set of all functions $f: E \to \mathbb{R}$ such that (i) $f$ is measurable and nonnegative (or positive if $p \ge 0$), and (ii) $0 < \int f^p\,d\mu < \infty$ for $p \ne 0$ and $0 < \int \log f\,d\mu < \infty$ for $p = 0$. For $f \in D_{p,\mu}$ let us define the $L^p$-mean of $f$ with respect to $\mu$ by

$$\left(\int f^p\,d\mu\right)^{1/p} \ \text{for } p \ne 0, \qquad \exp\left(\int \log f\,d\mu\right) \ \text{for } p = 0.$$

If $\lambda$ denotes another probability measure on $(E, \mathcal{A})$, and $q \le p$, then we denote the quotient of the respective means by

(10.43)

for positive functions whose domains are the convex cone $D_{q,\lambda} \cap D_{p,\mu}$. Clausing (1983) proved that:

10.18. Theorem. If $q \le 1 \le p$, then $\phi_{q,p,\lambda,\mu}$ is quasiconvex in the whole domain $D_{q,p,\lambda,\mu}$.

This result is a generalization of a result in Clausing (1982) which concerns the special case q = -1 and p = 1. Using that result Clausing (1982) obtained results in Kantorovich (1948) and Wilkins (1955). The following result can be found in Peetre and Persson (1988). Special cases of this result are treated in Pecaric and Beesack (1986), Capocelli and Tanja (1985), Daroczy (1964), Dresher (1953), Persson (1987), and Sharma and Autar (1973a, 1973b).

10.19. Theorem. Let $F: \mathbb{R}_+ \to \mathbb{R}_+$ be an increasing function, and let $g: D \to \mathbb{R}_+$ ($D$ is an additive Abelian semigroup) be superadditive.

(a) If $F$ is convex and $f: D \to \mathbb{R}_+$ is subadditive, $F(0) = 0$, then the function $H(x) = g(x)F(f(x)/g(x))$ is subadditive.
(b) If $F$ is concave and $f: D \to \mathbb{R}_+$ is superadditive, then $H(x)$ is superadditive.

Let us assume that the function $f$ is defined and continuous on $[0, 1]$. Bernstein's polynomial $B_n(x, f)$ of order $n = 0, 1, \ldots$ of the function $f$ is defined by

$$B_n(x, f) = \sum_{k=0}^n\binom{n}{k}x^k(1 - x)^{n-k}f\left(\frac{k}{n}\right).$$

It is a well known fact that the sequence $\{B_n(x, f)\}$ converges to $f(x)$ uniformly as $n \to \infty$ under reasonable conditions on $f$. A systematic study of Bernstein's polynomials of convex functions was first made by Popoviciu (1961) (see also Popoviciu, 1944, pp. 43-44). For example, it is known that Bernstein's polynomials of continuous $m$-convex functions are also $m$-convex functions, and similar results are valid for functions of several variables. Moreover, the following result is valid:

10.20. Theorem. A continuous function $f: [0, 1] \to \mathbb{R}$ is convex iff for every $n = 0, 1, \ldots$

$$B_n(x, f) \ge B_{n+1}(x, f) \quad \text{for every } x \in [0, 1]. \tag{10.44}$$

10.21. Remark. Inequality (10.44) was proved by Temple (1954), and the same result was also proved by Arama (1960). In fact, Arama proved the following result: For an arbitrary continuous function $f$ and $\xi_1, \xi_2, \xi_3 \in [0, 1]$, we have

Theorem 10.20 was also proved by Moldovan (1962) under a condition weaker than $f \in C^2[0, 1]$ (which was given by Kosmak, 1960). □

The following theorem is related to Theorem 10.20:

10.22. Theorem. A continuous function $f: [0, 1] \to \mathbb{R}$ is convex iff for every $n = 0, 1, \ldots$

$$f(x) \le B_n(x, f) \quad \text{for every } x \in [0, 1]. \tag{10.45}$$
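The behaviour described by Theorems 10.20 and 10.22 (for a convex $f$ the Bernstein polynomials lie above $f$ and decrease in $n$) can be observed exactly with rational arithmetic. The sketch below is an illustration only: it assumes $n \ge 1$, evaluates on a finite grid, and uses the convex test function $|t - 1/3|$, chosen here because it is not differentiable.

```python
from fractions import Fraction
from math import comb


def bernstein(f, n, x):
    """B_n(x, f) for n >= 1; exact when x and the values of f are Fractions."""
    return sum(comb(n, k) * x ** k * (1 - x) ** (n - k) * f(Fraction(k, n))
               for k in range(n + 1))


def convex_f(t):
    """A convex function on [0, 1] that is not differentiable at 1/3."""
    return abs(t - Fraction(1, 3))


grid = [Fraction(i, 20) for i in range(21)]
monotone_ok = all(
    convex_f(x) <= bernstein(convex_f, n + 1, x) <= bernstein(convex_f, n, x)
    for n in range(1, 9) for x in grid
)
print(monotone_ok)

# Closed form for comparison: B_n(x, t^2) = x^2 + x (1 - x) / n.
print(bernstein(lambda t: t * t, 4, Fraction(1, 2)))
```

The closed form for $f(t) = t^2$ makes both properties visible at once, since the excess $x(1 - x)/n$ is nonnegative and decreasing in $n$.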

10.23. Remarks. (a) Inequality (10.45) was proved independently by Arama (1960) and Pólya and Schoenberg (1958), and Theorem 10.22 was also obtained by Moldovan (1962).

(b) Inequalities (10.44) and (10.45) imply that the sequence $\{B_n(x, f)\}$ is decreasing and converges to $f$ from above, and Theorems 10.20 and 10.22 assert that these properties are valid only for convex functions.

(c) The convexity property of the sequence $\{B_n(x, f)\}$ was considered by Popoviciu (1961) and Arama and Ripianu (1961) (see also Mitrinovic, Lackovic, and Stankovic, 1979).

(d) Let $S$ denote the class of all star-shaped functions on $[0, 1]$. Lupas (1972a) showed that

$$f \in S \Rightarrow B_n(x, f) \in S.$$

(e) Theorems 5.2 and 5.3 are similar to Theorems 10.20 and 10.22. □

The following theorem was proved by Artin (1931) (see also Marshall and Olkin, 1979, p. 452):

10.24. Theorem. Let $U$ be an open convex subset of $\mathbb{R}^n$, and let $\phi: U \times (a, b) \to [0, \infty)$ satisfy (i) $\phi(x, z)$ is Borel-measurable in $z$ for each fixed $x$, and (ii) $\log\phi(x, z)$ is convex in $x$ for every fixed $z$. If $\nu$ is a measure on the Borel subsets of $(a, b)$ such that $\phi(x, \cdot)$ is $\nu$-integrable for every $x \in U$, then

$$\psi(x) \equiv \int_a^b \phi(x, z)\,d\nu(z) \tag{10.46}$$

is a log-convex function on $U$.

An important example is that $\phi(x, z) = (\varphi(z))^x$, where $\varphi(z)$ is a positive function, and in this case the function

$$A(x) = \int_a^b (\varphi(z))^x\,d\nu(z) \tag{10.47}$$

is log-convex. This is a special case of a result given in Remark 4.19(d).

10.25. Remark. Theorem 10.24 is equivalent to Hölder's inequality (see Marshall and Olkin, 1979, p. 461). □

Suppose that for every $x \in \mathbb{R}^n$ and every convex subset $A \subset \mathbb{R}^m$ there exists an integral of the form $\int_A f(x, y)\,dy \equiv I(x, A)$. Then the following theorems are valid (Tomilenko, 1976):

10.26. Theorem. Let $f(x, y)$ be a function of $(n + m)$ variables, where $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. If $f$ is a finite log-concave function on $\mathbb{R}^{n+m}$ and $A$ and $B$ are convex subsets in $\mathbb{R}^m$, then

$$I(\lambda x_1 + (1 - \lambda)x_2, \lambda A + (1 - \lambda)B) \ge I(x_1, A)^\lambda\,I(x_2, B)^{1-\lambda}, \tag{10.48}$$

where $x_1, x_2 \in \mathbb{R}^n$ and $0 < \lambda < 1$.

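10.27. Theorem. If $f(x, y)$ is a finite log-concave function of $(n + m)$ variables, where $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, then

$$I(x, y) = \int_{-\infty}^y f(x, t)\,dt \tag{10.49}$$

is log-concave on $\mathbb{R}^{n+m}$.

10.28. Remark. Theorem 10.26 is a generalization of a result of Prekopa (1973). For Prekopa's result and its applications in statistics, see Section 13.5. □

In Bodin and Zalgaller (1968) (see also Mitrinovic, 1970, p. 309) the following result is given: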
10.28. Remark. Theorem 10.26 is a generalization of a result of Prekopa (1973). For Prekopa's result and its applications in statistics, see Section 13.5. 0 In Bodin and Zalgaller (1968) (see also Mitrinovic, 1970, p. 309) the following result is given: '

10.29. Theorem. by

Let A be a negative semidefinite quadratic form given all' a22::; 0 and

alla22 - ai2 ~ o.

If D(m) is a parallelogram in the xy-plane, whose center is m, then the function p defined by p(m) =

JJ eA(x.

y

)

dx dy

D(m)

is a log-concave function of m. Abdel-Hameed and Proschan (1976) considered the transformation g(A) =

J rp(A, x)f(x) duix],

(10.50)

where f and rp are nonnegative Borel-measurable functions of nonnegative arguments, !l denotes the Lebesgue measure on [0, (0) or the

296

10. Orderings and Convexity-Preserving Transfonnations

counting measure on {O, 1, ... }, and the integral is assumed to exist. They showed that various geometric properties possessed by fare inherited by g under appropriate assumptions on cp. To describe their results we first observe some definitions and basic properties. The first definition concerns four geometric properties of a function.

10.30. Definition. Let $f: [0, \infty) \to [0, \infty)$. We say that

(a) $f$ is star-shaped if $f(\alpha x) \le \alpha f(x)$ for every $x \ge 0$ and $0 \le \alpha \le 1$,
(b) $f$ is superadditive if $f(x + y) \ge f(x) + f(y)$ for every $x \ge 0$, $y \ge 0$,
(c) $f$ is root-increasing if $f^{1/x}(x)$ is increasing in $x > 0$,
(d) $f$ is supermultiplicative if $f(x + y) \ge f(x)f(y)$ for every $x \ge 0$, $y \ge 0$.

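10.31. Theorem. The following elementary relationships among the geometric properties are valid:

$$\begin{array}{ccc}
f \text{ star-shaped} & \Rightarrow & f \text{ superadditive} \\
\Updownarrow & & \Updownarrow \\
e^{f(x)} \text{ root-increasing} & \Rightarrow & e^{f(x)} \text{ supermultiplicative.}
\end{array}$$

Proof. Easy. □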

10.32. Definition. Dual geometric properties may be defined by reversing the direction of the inequality in Definition 10.30(a), (b), and (d), and by replacing "increasing" by "decreasing" in (c). The dual geometric properties are called, respectively: (a') antistar-shaped, (b') subadditive, (c') root-decreasing, and (d') submultiplicative.

The relationships among (a'), (b'), (c'), and (d') in Definition 10.32 are analogous to those among (a), (b), (c), and (d) in Definition 10.30. We next define completely monotonic functions:

10.33. Definition. A nonnegative function $f$ of a nonnegative argument is completely monotonic if it has derivatives of all orders and $(-1)^kf^{(k)}(x) \ge 0$ for $x \ge 0$ and $k = 1, 2, \ldots$.

We shall need to define two more notions in order to state and prove the main results.


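10.34. Definition. Let $\phi(\lambda, x)$ be defined on $\Lambda \times X$, where $\Lambda$ and $X$ are ordered sets. Then $\phi(\lambda, x)$ is said to be totally positive of order 2 (TP$_2$) if $\phi(\lambda, x) \ge 0$ for $\lambda \in \Lambda$, $x \in X$, and

$$\begin{vmatrix} \phi(\lambda_1, x_1) & \phi(\lambda_1, x_2) \\ \phi(\lambda_2, x_1) & \phi(\lambda_2, x_2) \end{vmatrix} \ge 0$$

for all $\lambda_1 < \lambda_2$ in $\Lambda$ and $x_1 < x_2$ in $X$.

Totally positive functions of order 2 and of higher orders play an important role in analysis, statistics, inventory theory, reliability theory, and many other theoretical and applied fields.

10.35. Definition. A function $\phi(\lambda, x)$ is said to obey the semigroup property if

$$\phi(\lambda_1 + \lambda_2, x) = \int \phi(\lambda_1, x - y)\,\phi(\lambda_2, y)\,d\mu(y),$$

where $\mu$ denotes the Lebesgue measure on $[0, \infty)$ or the counting measure on $\{0, 1, 2, \ldots\}$, $\lambda \in [0, \infty)$ or, alternatively, $\lambda \in \{0, 1, 2, \ldots\}$, and $x \in [0, \infty)$.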

10.36. Theorem. (a) Let $\phi(\lambda, x)$ be TP$_2$, and assume that

$$\int x\,\phi(\lambda, x)\,d\mu(x) = a\lambda \quad \text{for all } \lambda > 0 \tag{10.51}$$

holds for some $a > 0$. Then $f$ is star-shaped implies that $g$ is star-shaped.

(b) Let $\phi(\lambda, x)$ satisfy the semigroup property, and $\int \phi(\lambda, x)\,d\mu(x) \equiv 1$. Then $f$ is superadditive implies that $g$ is superadditive.

(c) Let $\phi(\lambda, x)$ satisfy the semigroup property. Then $f$ is supermultiplicative implies that $g$ is supermultiplicative.

(d) $\phi(\lambda, x)$ satisfies the semigroup property iff $f$ is exponential implies that $g$ is exponential.

(e) Let $\phi(\lambda, x)$ be TP$_2$ and satisfy the semigroup property. Then

(e-1) $f$ is root-increasing and $\mu$ is the Lebesgue measure imply that $g$ is root-increasing.

(e-2) $f$ is root-increasing, $\mu$ is the counting measure, and

$$\lim_{\xi \to \xi_0}\int \phi(\lambda, x)\,\xi^x\,d\mu(x) \equiv 0 \quad \text{for all } \lambda > 0, \ \text{for some } \xi_0 \in [-\infty, \infty),$$

imply that $g$ is root-increasing.

(f) Let $\phi(\lambda, x)$ satisfy the semigroup property. Then $f$ is completely monotonic implies that $g$ is completely monotonic.
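Part (b) of Theorem 10.36 can be illustrated with the Poisson kernel and the counting measure. The following is a numerical sketch under stated assumptions: the infinite sum in (10.50) is truncated at `terms` summands, floating-point tolerances replace exact equalities, and the test function $f(x) = x^2$ (superadditive on the nonnegative integers) is chosen here because its transform has the closed form $g(\lambda) = \lambda^2 + \lambda$.

```python
from math import exp


def poisson_kernel(lam, x):
    """phi(lam, x) = exp(-lam) * lam**x / x!, built up iteratively."""
    weight = exp(-lam)
    for k in range(1, x + 1):
        weight *= lam / k
    return weight


def poisson_transform(f, lam, terms=200):
    """Truncated g(lam) = sum over x of phi(lam, x) * f(x) (counting measure)."""
    weight, total = exp(-lam), 0.0
    for x in range(terms):
        total += weight * f(x)
        weight *= lam / (x + 1)
    return total


def f(x):
    return x * x   # superadditive on {0, 1, 2, ...}


grid = [0.5 * i for i in range(9)]   # lambda values 0.0, 0.5, ..., 4.0

# Semigroup property: phi(l1 + l2, x) = sum_{y <= x} phi(l1, x - y) phi(l2, y).
semigroup_ok = all(
    abs(poisson_kernel(l1 + l2, x)
        - sum(poisson_kernel(l1, x - y) * poisson_kernel(l2, y) for y in range(x + 1))) < 1e-12
    for l1 in grid for l2 in grid for x in range(8)
)

# Superadditivity of g on the grid, up to rounding error.
superadditive_ok = all(
    poisson_transform(f, l1 + l2) >= poisson_transform(f, l1) + poisson_transform(f, l2) - 1e-9
    for l1 in grid for l2 in grid
)
print(semigroup_ok, superadditive_ok, poisson_transform(f, 2.0))
```

With the closed form, $g(\lambda_1 + \lambda_2) - g(\lambda_1) - g(\lambda_2) = 2\lambda_1\lambda_2 \ge 0$, which is what the numerical check observes.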

10.37. Remark. Examples of kernels $\phi(\lambda, x)$ satisfying (e-2) are:

(i) the Poisson kernel $\phi(\lambda, x) = e^{-\lambda}(\lambda^x/x!)$, $\lambda \ge 0$ and $x = 0, 1, \ldots$; and
(ii) the binomial coefficients $\binom{\lambda}{x}$, $\lambda, x = 0, 1, \ldots$. □

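Proof of Theorem 10.36. (a) For each $c > 0$,

$$g(\lambda) - c\lambda = \int \phi(\lambda, x)\left[f(x) - \frac{c}{a}x\right]d\mu(x).$$

Since $f$ is star-shaped, $f(x) - (c/a)x$ changes sign at most once in $x \ge 0$, and if once, from $-$ to $+$. By the variation diminishing property of the TP$_2$ function $\phi(\lambda, x)$ (Karlin, 1968, p. 21), it follows that $g(\lambda) - c\lambda$ changes sign at most once, and if once, from $-$ to $+$. Hence $g$ must be star-shaped.

(b) Write

$$\begin{aligned}
g(\lambda_1 + \lambda_2) &= \int \phi(\lambda_1 + \lambda_2, x)f(x)\,d\mu(x) \\
&= \iint \phi(\lambda_1, x - y)\,\phi(\lambda_2, y)\,f(x)\,d\mu(y)\,d\mu(x) && \text{[by the semigroup property]} \\
&\ge \iint \phi(\lambda_1, x - y)\,\phi(\lambda_2, y)\,[f(x - y) + f(y)]\,d\mu(y)\,d\mu(x) && \text{[since } f \text{ is superadditive]} \\
&= g(\lambda_1) + g(\lambda_2) && \left[\text{since } \int \phi(\lambda, z)\,d\mu(z) = 1\right].
\end{aligned}$$

Thus $g$ is superadditive.

(c) Using (10.50), the semigroup property, and the supermultiplicativity of $f$, we obtain

$$g(\lambda_1 + \lambda_2) = \iint \phi(\lambda_1, x - y)\,\phi(\lambda_2, y)\,f(x)\,d\mu(y)\,d\mu(x) \ge \iint \phi(\lambda_1, x - y)\,\phi(\lambda_2, y)\,f(x - y)f(y)\,d\mu(y)\,d\mu(x) = g(\lambda_1)g(\lambda_2).$$

Thus $g$ is supermultiplicative.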

(d) Let $\phi(\lambda, x)$ satisfy the semigroup property and $f$ be exponential. Then by the same kind of argument as in (c), we obtain $g(\lambda_1 + \lambda_2) = g(\lambda_1)g(\lambda_2)$ for all $\lambda_1 \ge 0$ and $\lambda_2 \ge 0$. Since $\phi$ and $f$ are measurable, it follows from Tonelli's theorem that $g$ is measurable (Royden, 1968, p. 270), and thus $g(\lambda)$ must be exponential, as pointed out by Breiman (1968, p. 305).

Suppose now that (10.50) maps exponential functions into exponential functions. Take $f(x) = e^{-sx}$, $s \ge 0$. For each fixed $\lambda$, consider the measure $\nu_\lambda$ defined for every Borel set $A$ explicitly by the relation

$$\nu_\lambda(A) = \int_A \phi(\lambda, x)\,d\mu(x).$$

Define the Laplace transform of $\nu_\lambda$ by

$$g_\lambda(s) = \int e^{-sx}\,d\nu_\lambda(x).$$

Then, by the well-known property of the Laplace transform, for every $\lambda_1 \ge 0$ and $\lambda_2 \ge 0$, we have that $g_{\lambda_1}(s)g_{\lambda_2}(s)$ is the Laplace transform of the convolution measure $\nu_{\lambda_1} * \nu_{\lambda_2}$; i.e.,

$$g_{\lambda_1}(s)g_{\lambda_2}(s) = \int e^{-sx}\,d(\nu_{\lambda_1} * \nu_{\lambda_2})(x).$$

Since $g$ is exponential in $\lambda$ by assumption, we have $g_{\lambda_1}(s)g_{\lambda_2}(s) = g_{\lambda_1 + \lambda_2}(s)$; i.e.,

$$\int e^{-sx}\,d(\nu_{\lambda_1} * \nu_{\lambda_2})(x) = \int e^{-sx}\,d\nu_{\lambda_1 + \lambda_2}(x).$$

By the uniqueness of the Laplace transform (see Theorem 1a of Feller, 1971, p. 432), it follows that

$$\nu_{\lambda_1} * \nu_{\lambda_2} = \nu_{\lambda_1 + \lambda_2},$$

i.e., $\phi$ satisfies the semigroup property.

(e-1) We can assume that for each $\lambda$, $\phi(\lambda, x)$ is strictly positive on a set of positive Lebesgue measure; otherwise $g(\lambda)$ would be zero and we would have nothing to prove. Since $f$ is root-increasing, it follows that for each fixed $\alpha$, $0 \le \alpha < \infty$, $f(x) - \alpha^x$ changes sign at most once in $x \ge 0$, and if once, from $-$ to $+$.


Also, by the previous result (d). f 0 (k = 1, ... , m); let b, be positive n-tuples and Zk (k = 1, ... , m)



be real n-tuples. Then the function

is convex and Sex) ~ f ( I : k ~ akAn(xZk> l bd) holds, where An(xzk> b k ) is the weighted arithmetic mean of the n-tuple XZk with weight vector b k . Let a and p be two positive n-tuples, and let f be a continuous and strictly monotonic function that is positive for all positive x and tends to 00 either as x ~0 or as x ~00. We write (10.52) If Pi ~ 1 (i = 1, ... ,n), we write In(a, p) instead. Of course, in the case P; = 1 we have a quasiarithmetic mean, i.e., In(a, p) == fn(a, p).

10.41. Theorem. Let f have continuous second derivatives and be strictly monotonic and convex.

r.r:

(a) If It!' is a convex function, i.e., if is decreasing, then In(a, p) is a convex function of a. (b) A necessary and sufficient condition that lea, p) is a convex function of a is that the function f is of the form (ax + b > 0, c> 1), or

eax + b •

na

10.42. Remark. The result in Theorem 10.41(a) is proved in Vasic and Pecaric (1979a), and for $p_i \equiv 1$ we obtain a result in Hardy, Littlewood, and Pólya (1934, 1952, pp. 85-88), where the result in Theorem 10.41(b) is also proved. □

In the following we give some similar results with symmetric functions and means. Let a be a positive n-tuple and k an integer satisfying $1 \le k \le n$. Then the kth elementary symmetric function of a is defined by

$$e_n^{[k]}(a) = \sum_{S(\lambda)} a_1^{\lambda_1} \cdots a_n^{\lambda_n} \quad \Big( = \binom{n}{k}\, p_n^{[k]}(a) \Big), \tag{10.53}$$

where $S(\lambda) = \{\lambda : \lambda_i = 0 \text{ or } 1 \text{ and } \sum_{i=1}^n \lambda_i = k\}$, $e_n^{[0]}(a) = p_n^{[0]}(a) = 1$, and $p_n^{[k]}(a)$ is the kth symmetric mean of a. It is important to note that the symmetric functions and means can be generated as follows: (10.54)

or equivalently, (10.55)

10.43. Theorem. (a) If a is a positive n-tuple, then the sequence { p ~ k l } k : is ~ log-concave. (b) If a is a positive and log-convex sequence, then P ~ " . ! . 2 P ~ k l ~ ( p ~ k l l ) 2 for

1:s k :s n - 2.

(10.56)

(c) If r and v are integers such that 1:s r s; n - 1, O:s v:s k - 1, then ( ~ve!;-vl

?

~ ( ~ v e ! ; - v - 1 ) v(e!;-v+l». ~

(10.57)

10.44. Remark. Theorem 10.43(a) is an old result (see Bullen, Mitrinovic, and Vasic, 1987, pp. 285-88), (b) is proved in Ozeki (1972) as a consequence of Theorem 10.9, and (c) can be found in Mitrinovic (1967) (see also Bullen, Mitrinovic, and Vasic, 1987, pp. 295-96). □

Let a again be a positive n-tuple, k be an integer such that $1 \le k \le n$, and $\mathcal{N}^* = \{0, 1, 2, \ldots\}$. Then the kth complete symmetric function of a is defined by

$$c_n^{[k]}(a) = \sum_{S(\lambda)} a_1^{\lambda_1} \cdots a_n^{\lambda_n} \quad \Big( = \binom{n + k - 1}{k}\, q_n^{[k]}(a) \Big), \tag{10.58}$$

where $S(\lambda) = \{\lambda : \lambda_i \in \mathcal{N}^*, \sum_{i=1}^n \lambda_i = k\}$, $c_n^{[0]}(a) = q_n^{[0]}(a) = 1$, and $q_n^{[k]}(a)$ is the kth complete symmetric mean of a. The complete symmetric functions can be generated similarly, i.e., (10.59)

10.45. Theorem. (a) If a is a positive n-tuple, then the sequence $\{q_n^{[k]}\}$ is log-convex. (b) If the sequence $\{a_i\}$ is log-convex, then (10.60)
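The definitions (10.53) and (10.58), together with the log-concavity claim of Theorem 10.43(a) and the log-convexity claim of Theorem 10.45(a), can be checked on a small example. The following is only a sketch under stated assumptions: the helper names `elementary` and `complete` and the sample tuple are hypothetical, and the recurrences are the standard coefficient expansions of $\prod_i (1 + a_i t)$ and $\prod_i (1 - a_i t)^{-1}$, which are assumed here to be the generating relations the text refers to. Exact rational arithmetic is used so that no rounding enters the comparison.

```python
from fractions import Fraction
from math import comb

def elementary(a):
    """Return [e_0, ..., e_n], the elementary symmetric functions of a."""
    e = [1] + [0] * len(a)
    for ai in a:
        # Multiply sum_k e_k t^k by (1 + ai*t); iterate downward so that
        # each ai appears at most once in every monomial.
        for k in range(len(a), 0, -1):
            e[k] += ai * e[k - 1]
    return e

def complete(a, kmax):
    """Return [c_0, ..., c_kmax], the complete symmetric functions of a."""
    c = [1] + [0] * kmax
    for ai in a:
        # Multiply the series by 1/(1 - ai*t); iterate upward so that
        # ai may be repeated in a monomial.
        for k in range(1, kmax + 1):
            c[k] += ai * c[k - 1]
    return c

a = [1, 2, 3, 5]          # sample positive n-tuple (an assumption)
n = len(a)
e = elementary(a)
c = complete(a, 6)

# Symmetric means: p_k = e_k / C(n, k) and q_k = c_k / C(n + k - 1, k).
p = [Fraction(e[k], comb(n, k)) for k in range(n + 1)]
q = [Fraction(c[k], comb(n + k - 1, k)) for k in range(len(c))]

log_concave_p = all(p[k - 1] * p[k + 1] <= p[k] ** 2 for k in range(1, n))
log_convex_q = all(q[k - 1] * q[k + 1] >= q[k] ** 2
                   for k in range(1, len(q) - 1))
print(e, c[:3], log_concave_p, log_convex_q)
```

A single tuple is of course not a proof; the point is only to make the normalizing binomial factors in (10.53) and (10.58) concrete.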

10.46. Remarks. (a) In Hardy, Littlewood, and Pólya (1934, 1952, p. 164) it is noted that Theorem 10.45(a) is due to Schur. This inequality has been cited in many papers (see Mitrinovic, Lackovic, and Stankovic, 1979, and the references given therein). Theorem 10.45(b) was proved by Ozeki (1967) as a consequence of Theorem 10.10.

(b) For some further generalizations of symmetric means and of Theorems 10.43(a) and 10.45(a), see Bullen, Mitrinovic, and Vasic (1987, pp. 317-30). □

Ozeki (1973) proved the following result:

10.47. Theorem. Let $A_k$, $B_k$, $C_k$ ($1 \le k \le n$) be defined as in Remark 6.18. Then the following inequalities are valid:

$$A_{r-1} + A_{r+1} \ge 2A_r \quad (n \ge r + 1), \tag{10.61}$$

(10.62)

$$C_{r-1} + C_{r+1} \ge 2C_r \quad (n \ge r + 1). \tag{10.63}$$

Note that Theorem 10.47 is a simple consequence of Theorem 3.30. McLeod (1958/59) and Bullen and Marcus (1961) proved the following theorem as a generalization of an inequality of Marcus and Lopes (1957):

10.48. Theorem.

Let r and s ($1 \le r \le s \le n$) be integers.

is a concave function of a.

10.49. Remarks. (a) McLeod (1958/59) conjectured that $C(a) = (c_n^{[r]}(a) / c_n^{[r-p]}(a))^{1/p}$ ($1 \le p \le r$) is a convex function of a, and proved his own conjecture for p = r. In Baston (1978) this conjecture was proved for p = r - 1.

(b) A similar result for the well-known Dresher Inequality was proved by Godunova (1967a) (see Section 4.3). She proved this result by using Theorem 10.39 and the following result: Let A and B be sets in two linear spaces $L_A$ and $L_B$, and let x = S(a) and y = T(b) be real-valued functions which transform A and B into $R_x$ and $R_y$, respectively. Assume that x = S(a) is concave (convex) on A and y = T(b) is convex (concave) on B; and that M(x, y) is a concave (convex) function of two real variables, defined on $R_x \times R_y$, and is increasing in x when y is kept fixed. Then M(S(a), T(b)) is concave (convex) on $A \times B$. If S(a) = a or T(b) = b, then the monotonicity condition for M(x, y) in x or y is not necessary.

(c) In the same paper Godunova (1967a) proved the following result: If $f(x_1, \ldots, x_{n-1})$ is a concave (convex) function of n - 1 variables for $x_1 \ge 0, \ldots, x_{n-1} \ge 0$, then, for $x_n > 0$,

is a concave (convex) function of $(x_1, \ldots, x_n)$.

(d) Of course, Theorem 10.48 and Dresher's inequality are related to the well-known Minkowski Inequality. For many other related results, see Bullen, Mitrinovic, and Vasic (1987, especially pp. 267-268, 309, 323-324, 330), and Pales (1982). □

Let $L_r(x, y) = (x^{r+1} + y^{r+1}) / (x^r + y^r)$; then the following result is valid (Alzer, 1988).

10.50. Theorem. Let x and y be two positive real numbers such that $x \ne y$; then the function $L(r) = L_r(x, y)$ is strictly convex in $\mathbb{R}_-$, strictly concave in $\mathbb{R}_+$, strictly log-convex in $(-\infty, -1/2]$, and strictly log-concave in $[-1/2, +\infty)$.

Proof. From

$$L''(r) = (xy)^r (\log(x/y))^2 (y - x)(x^r - y^r) / (x^r + y^r)^3,$$

we have $L''(r) > 0$ for r < 0 and $L''(r) < 0$ for r > 0. Similarly, let $f(r) = \log L_r(x, y)$. Then

$$f''(r) = (xy)^r (\log(x/y))^2 (y - x)(x^{2r+1} - y^{2r+1}) / ((x^r + y^r)(x^{r+1} + y^{r+1}))^2$$

holds. Thus $f''(r) > 0$ for r < -1/2, and $f''(r) < 0$ for r > -1/2. □

For arbitrary but fixed x, y ($x \ne y$) let us define

$$F_r(x, y) = \frac{r}{r + 1} \cdot \frac{x^{r+1} - y^{r+1}}{x^r - y^r} \quad \text{for } r \ne 0, -1,$$

$$F_0(x, y) = (x - y) / (\log x - \log y), \qquad F_{-1}(x, y) = xy(\log x - \log y) / (x - y).$$
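The sign pattern asserted in Theorem 10.50 can be illustrated numerically. The sketch below is not part of the original argument: the function names and the sample pair (x, y) = (2, 1) are assumptions chosen for illustration. It compares the closed form of $L''(r)$ from the proof with a central finite difference and also evaluates the second difference of $\log L_r$.

```python
import math

def lehmer(r, x, y):
    """L_r(x, y) = (x^(r+1) + y^(r+1)) / (x^r + y^r)."""
    return (x ** (r + 1) + y ** (r + 1)) / (x ** r + y ** r)

def second_derivative_closed(r, x, y):
    """Closed form of L''(r) quoted in the proof of Theorem 10.50."""
    num = (x * y) ** r * math.log(x / y) ** 2 * (y - x) * (x ** r - y ** r)
    return num / (x ** r + y ** r) ** 3

def second_difference(func, r, h=1e-3):
    """Central finite-difference approximation of func''(r)."""
    return (func(r + h) - 2.0 * func(r) + func(r - h)) / h ** 2

x, y = 2.0, 1.0  # any positive pair with x != y (an assumption)

def L(r):
    return lehmer(r, x, y)

def logL(r):
    return math.log(lehmer(r, x, y))

for r in (-2.0, -1.0, -0.75, -0.25, 1.0, 2.0):
    print(r,
          second_difference(L, r),            # numerical L''(r)
          second_derivative_closed(r, x, y),  # closed form L''(r)
          second_difference(logL, r))         # numerical (log L)''(r)
```

In the printed table the second column changes sign at r = 0 and the last column changes sign at r = -1/2, which is the behavior the theorem describes.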


(Note that $L_r(x, y) = F_r(x^2, y^2) / F_r(x, y)$.) Then the following result holds (Alzer, 1989b):

10.51. Theorem. The function $F(r) = F_r(x, y)$ ($x \ne y$) is strictly log-convex in $(-\infty, -1/2]$ and strictly log-concave in $[-1/2, +\infty)$.

A similar result is also given in Alzer (1989b): Let

Then we have

10.52. Theorem. Let x, y, u, and v be positive real numbers such that max(x, y)/min(x, y) > max(u, v)/min(u, v). Then the function $S_{1/r}(x, y) / S_{1/r}(u, v)$ is strictly convex for $r \in \mathbb{R}_+$.

10.53. Remark. There exist many other results which can be included in this section. See, for example, Lupas and Muller (1967, 1970), Ibragimov and Gadziev (1970), Wood (1971), and Ross (1978, 1980) (see also Riekstyn's, 1986, pp. 22, 25-26). □


Chapter 11 Convex Functions and Geometric Inequalities

In this chapter we give some results concerning convex functions and geometric inequalities, mainly from Mitrinovic, Pecaric, Tanasescu, and Volenec (1988) and Tanasescu (1989). Note that the recent book of Mitrinovic, Pecaric, and Volenec (1988) contains a chapter with the same title, and many useful applications of convex functions to geometric inequalities are also given there. The theory of majorization is closely related to inequalities based on convex functions; it was treated comprehensively by Marshall and Olkin (1979), and will be reviewed in Chapter 12. Some recent related results concerning majorization and geometric inequalities can be found in Mitrinovic, Pecaric, Tanasescu, and Volenec (1988), and results in Tanasescu (1990) concern the concavity property of some hyperbolic forms. Those results can provide an answer to some geometric problems and their generalizations, and will be treated in this chapter.

11.1. Old and New Results via Majorization Theory

11.1.1. A Partial Ordering of Triangles

Let $\Delta(s)$ denote the set of all metrically distinct triangles with given semi-perimeter s > 0 (degeneracy allowed). An irredundant abstract representation of $\Delta(s)$ is immediate:

$$\Delta(s) = \{T = (x, y, z) : s \ge x \ge y \ge z \ge 0,\ x + y + z = 2s\}.$$

Of course, a vector T = (x, y, z) uniquely determines a triangle with (ordered) sides $x \ge y \ge z$; thus for convenience we shall use "T = (x, y, z)" to mean a triangle, where $x \ge y \ge z$ are the lengths of the sides of the triangle T.

11.1. Definition. The set of triangles ~ ( s ) is a partially ordered set (denoted by p.o. set) with the partially ordering relation T=(x,y,z)- _ oCP ) ! 2=0 ox oy s=(x+y+z)/2

(~O)

(11.15)

on I R ~ ,where the partial derivatives are taken by treating s as a constant, and we let s = (x + y + z)/2 only afterward (if necessary) to determine the sign in (11.15).


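Another simplification arises if the function f can be written in the form

$$f(x, y, z) = g(u, v, w), \quad \text{where } u = s - x,\ v = s - y, \text{ and } w = s - z.$$

Since

$$2\frac{\partial f}{\partial x} = -\frac{\partial g}{\partial u} + \frac{\partial g}{\partial v} + \frac{\partial g}{\partial w}, \qquad 2\frac{\partial f}{\partial y} = \frac{\partial g}{\partial u} - \frac{\partial g}{\partial v} + \frac{\partial g}{\partial w},$$

and $u - v = y - x$, we find

$$(x - y)\Big(\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y}\Big) = (u - v)\Big(\frac{\partial g}{\partial u} - \frac{\partial g}{\partial v}\Big).$$

Thus f is Schur-convex (Schur-concave) on $\mathbb{R}^3_+$ iff g is Schur-convex (Schur-concave).

In the following we provide some examples for the purpose of illustration. Since the function $g(u, v, w) = \sqrt{u} + \sqrt{v} + \sqrt{w}$ is Schur-concave, we have, for $\sum \sqrt{s - x}$ defined similarly,

$$\sqrt{s} \le \sum \sqrt{s - x} \le \sqrt{3s} \quad \text{on } \Delta_a(s) \text{ or } \{T_1, T_0\}, \tag{11.16}$$

$$\sqrt{s} \le \sum \sqrt{s - x} \le \big(\sqrt{2} - 1 + 2\sqrt{\sqrt{2} - 1}\big)\sqrt{s} \quad \text{on } \Delta_o(s) \text{ or } \{T_1, T_3\}. \tag{11.17'}$$

Furthermore, the function $g(u, v, w) = u^{-1} + v^{-1} + w^{-1}$ is Schur-convex. Thus

$$\sum \frac{1}{s - x} \ge \frac{9}{s} \quad \text{on } \Delta_a(s) \setminus \{T_1\} \text{ or } \{T_0\}, \tag{11.17}$$

(11.18)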
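The outer bounds $\sqrt{s} \le \sum\sqrt{s - x} \le \sqrt{3s}$ and $\sum 1/(s - x) \ge 9/s$ can be spot-checked by random sampling. The following sketch is illustrative only: the sampler `random_triangle` is a hypothetical helper, and the check runs over all triangles with semi-perimeter s rather than over the specific subclasses named above. It relies on the substitution u = s - x, v = s - y, w = s - z used in the text, under which u, v, w are nonnegative and sum to s.

```python
import math
import random

def random_triangle(s, rng):
    """Sample sides (x, y, z) of a triangle with semi-perimeter s.

    Draws u = s - x, v = s - y, w = s - z as the three pieces of [0, s]
    cut at two uniform points, so u, v, w >= 0 and u + v + w = s.
    """
    cuts = sorted(rng.uniform(0.0, s) for _ in range(2))
    u, v, w = cuts[0], cuts[1] - cuts[0], s - cuts[1]
    return s - u, s - v, s - w

rng = random.Random(0)   # fixed seed so the run is repeatable
s = 1.0
tol = 1e-12
ok_sqrt = True
ok_inv = True
for _ in range(10_000):
    x, y, z = random_triangle(s, rng)
    root_sum = sum(math.sqrt(max(s - t, 0.0)) for t in (x, y, z))
    if not (math.sqrt(s) - tol <= root_sum <= math.sqrt(3 * s) + tol):
        ok_sqrt = False
    if min(s - x, s - y, s - z) > 1e-9:        # skip near-degenerate cases
        inv_sum = sum(1.0 / (s - t) for t in (x, y, z))
        if inv_sum < 9.0 / s - 1e-9:
            ok_inv = False

# Equality cases: the equilateral triangle attains sqrt(3s) and 9/s.
side = 2.0 * s / 3.0
equilateral_root_sum = 3.0 * math.sqrt(s - side)
print(ok_sqrt, ok_inv, equilateral_root_sum, math.sqrt(3 * s))
```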

Note that in many examples $f(T_0)$ lies surprisingly close to $f(T_3)$. However, we should be cautioned not to conclude that this is always the case, as shown in the prior example. We conclude this section by considering a classic problem of finding

$$\max_{T \in \Delta(s)} f(T), \quad \text{where } f(T) = \min\{|x - y|, |y - z|, |z - x|\}.$$

Although this is often encountered as an extreme value problem in real numbers or in the class of (R, r)-triangles, or R-triangles, it is most suitably formulated as a maximization problem in $\Delta(s)$. The function f vanishes on $L_1 \cup L_2 \cup L_3 \cup L_4$, and it is neither Schur-convex nor Schur-concave on $\Delta(s)$. However, on every level line $L_T$, first it is increasing, then decreasing, always having its maximum value at y = 2s/3, attainable only when x - y = y - z. Indeed, on $\Delta(s)$ f can be simplified as $f(T) = \min\{x - y, y - z\}$ and, on each $L_T$, x - y is Schur-concave, while y - z and x - z are each Schur-convex. Thus Max f = (x - z)/2 on each level line, and x - z is Schur-convex on the whole set $\Delta(s)$. It then follows that

$$\operatorname{Max} \min\{|x - y|, |y - z|, |z - x|\} = s/3, \tag{11.19}$$

attained uniquely at T = (s, 2s/3, s/3) (an old but beautiful result). Moreover, for $T \in \Delta_a(s)$ we have Max f(T) = s/6, attained uniquely at T = (s/6)(5, 4, 3), a nice right triangle.

11.2. Concavity via Hyperbolic Forms

11.2. Definition. A real n-ary form is a homogeneous polynomial in n indeterminates with real coefficients. A real form P of degree m is said to be hyperbolic with respect to some $a \in \mathbb{R}^n$ (or a-hyperbolic) if the equation in $t \in \mathbb{R}$

$$P(ta + x) = 0 \tag{11.20}$$

has precisely m real roots for each given real vector $x \in \mathbb{R}^n$.

Let us denote these roots by $t_1(a, x) \le \cdots \le t_m(a, x)$ (for fixed a they are all continuous in x). Let $\mathcal{H}_P$ be the set of vectors $a \in \mathbb{R}^n$ such that P is a-hyperbolic. Note that for every $a \in \mathcal{H}_P$ the functions $H_a(x) = \max_{1 \le k \le m} t_k(a, x)$ and $h_a(x) = \min_{1 \le k \le m} t_k(a, x)$ are continuous and positively homogeneous on $\mathbb{R}^n$ with $H_a(-x) = -h_a(x)$. Also note that the open cone $C(P, a) = \{b \in \mathbb{R}^n : H_a(b) < 0\}$ is the desired neighborhood of a in $\mathcal{H}_P$ that is an open subset of $\mathbb{R}^n$. Moreover, this cone is convex. (More results on hyperbolic forms can be found in Garding, 1959, and Tanasescu, 1990.)

Throughout this section it will be assumed that P is a hyperbolic form of degree $m \ge 2$ and $LP = \emptyset$, where LP is the set of all common zeros of the partial derivatives of P of order m - 1. Furthermore, let us assume that the cone $C = C(P, a) \subseteq \mathcal{H}_P$ satisfies $a \in \mathbb{R}^n_{++}$ and P(a) > 0 (hence P > 0 on C). Now let K be a convex cone included in the nonnegative orthant $\mathbb{R}^n_+$. For every s > 0 let $K^s$ denote the set of all $x^s = \{x_i^s\}_{i=1}^n$ such that $x = \{x_i\}_{i=1}^n \in K$. Of course, under the prior assumptions we have $K^s \subseteq \mathbb{R}^n_+$. Take p = 1/s and define on K the positively homogeneous function $F_p$ by

(11.21)

Clearly, we have tacitly assumed that $K^{1/p} \subseteq C = C(P, a)$. In the following theorem $P_i$ and $P_{ij}$ denote partial derivatives of the first and second order, respectively.

11.3. Theorem. Let $K \subseteq \mathbb{R}^n_+$ be a convex cone such that $K^{1/p} \subseteq C$ for some real number $p \ge 1$.

(a) If $P_i \ge 0$ on $K^{1/p} \cap \{x_i > 0\}$ for every i = 1, ..., n, then the function $F_p$ defined in (11.21) is strongly concave on K.

(b) If $P_{ij} \ge 0$ on $K^{1/p} \cap \{x_i > 0, x_j > 0\}$ for $1 \le i < j \le n$ and p < m, then the same conclusion as in (a) holds.

11.4. Theorem. If $F_p$ is strongly concave on some convex cone $K \subset \mathbb{R}^n_+$, then it is strongly concave on cl(K) (the closure of K) except for the trivial case $F_p(x) = F_p(y) = F_p(x + y) = 0$ for x, y ∈ cl(K).

The following result is a consequence of Theorems 11.3 and 11.4.

11.5. Theorem. If $T_1 = (x_1, x_2, x_3)$ and $T_2 = (y_1, y_2, y_3)$ are two triangles with areas $A_1$ and $A_2$, respectively, then for each $p \ge 1$ there exists another triangle with sides $(x_i^p + y_i^p)^{1/p}$, i = 1, 2, 3, and with area A, such that:

(a) If $1 \le p \le 4$, then (11.22)

and equality holds iff (i) the triangles are similar or (ii) they are both degenerate, with the longest side in the same position (case (ii) occurs only for p = 1);

(b) if both triangles are nonobtuse, then the inequality in (11.22) holds for all $p \ge 1$, and equality holds iff the triangles are similar.
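The construction in Theorem 11.5 is easy to experiment with. Because the display (11.22) is not reproduced above, the sketch below assumes it has the form usually attributed to Oppenheim, $A^{p/2} \ge A_1^{p/2} + A_2^{p/2}$; that form, the helper names, and the sampling scheme are all assumptions of this illustration rather than statements taken from the text. Only the exponents p = 1, 2, 4 are tried.

```python
import math
import random

def heron_area(a, b, c):
    """Area of the triangle with sides a, b, c (0 if degenerate)."""
    s = (a + b + c) / 2.0
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def random_triangle(rng):
    """Sides of a random nondegenerate triangle, built from positive gaps."""
    u, v, w = (rng.uniform(0.1, 1.0) for _ in range(3))
    return (v + w, u + w, u + v)

def combined_area(t1, t2, p):
    """Area of the triangle with sides (x_i^p + y_i^p)^(1/p)."""
    sides = [(x ** p + y ** p) ** (1.0 / p) for x, y in zip(t1, t2)]
    return heron_area(*sides)

rng = random.Random(1)   # fixed seed so the run is repeatable
holds = True
for _ in range(5_000):
    t1, t2 = random_triangle(rng), random_triangle(rng)
    for p in (1, 2, 4):
        lhs = combined_area(t1, t2, p) ** (p / 2.0)
        rhs = heron_area(*t1) ** (p / 2.0) + heron_area(*t2) ** (p / 2.0)
        # Small relative slack guards against floating-point rounding.
        if lhs < rhs * (1.0 - 1e-9):
            holds = False
print(holds)
```

For p = 2 the assumed inequality reads $A \ge A_1 + A_2$ for the triangle with sides $\sqrt{x_i^2 + y_i^2}$, which is the most familiar special case.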

11.6. Remarks. (a) Theorem 11.5 was first stated as a conjecture by Oppenheim (1971), and later proved by him (Oppenheim, 1974) for p = 1, 2, 4 only (which is part of (a)). Carroll (1982) gave a complete proof for both parts of the theorem.

(b) In fact, the inequality (11.22) is a consequence of Jensen's inequality for n = 2. Using the weighted Jensen's inequality, we can obtain known results for n triangles (see Mitrinovic and Pecaric, 1988a).

(c) If a hyperbolic form P in Theorems 11.3 and 11.4 is also symmetric, then the function $F_p(x)$ defined in (11.21) is also Schur-concave on K. This is a generalization of a related result in Mitrinovic and Pecaric (1988a). (Note that the results in their paper are given in Mitrinovic, Pecaric, and Volenec, 1988.) □

Theorem 11.5 may be nicely generalized in an obvious manner by letting (for $n \ge 2$)

k = 1, ..., n,

$P = T_1 \cdots T_n$ and $Q = T_0 P$.

Note that the forms P, Q are positive in $\mathbb{R}^n_+$ whenever the $T_k$'s are. Thus it seems natural to define

$$C = \{x \in \mathbb{R}^n_+ : P(x) > 0\} = C(P, 1_n) \cap \mathbb{R}^n_+ = C(Q, 1_n) \cap \mathbb{R}^n_+,$$

where $1_n = (1, \ldots, 1)$. This follows from the obvious fact that P, Q are $1_n$-hyperbolic and $T_k(1_n) > 0$ for k = 0, 1, ..., n. Theorem 11.3(a) assures that $F_p$, $G_p$ are strongly concave on $K_p = C^p$ for every $p \ge 1$, where

Consequently, the following theorem is valid.

11.7. Theorem. (a) If $1 \le p \le n$, then

$$F_p(x + y) \ge F_p(x) + F_p(y) \quad \text{for every } x, y \in \mathrm{cl}(K_p), \tag{11.24}$$

and equality holds iff either x, y are proportional, or p = 1, $F_1(x) = F_1(y) = F_1(x + y) = 0$, and the largest components of x, y are similarly placed.

(b) If $1 \le p \le n + 1$, then

$$G_p(x + y) \ge G_p(x) + G_p(y) \quad \text{for every } x, y \in \mathrm{cl}(K_p), \tag{11.25}$$

and equality holds under the same conditions as in (a).

For n = 4, each point $x = (x_1, x_2, x_3, x_4)$ can be considered as representing an inscribable quadrilateral of sides $x_1, \ldots, x_4$ with area $A(x_1, x_2, x_3, x_4) = 2^{-2}(P(x))^{1/2}$, where $P = T_1 T_2 T_3 T_4$ and $x \in C = C(P, 1_4)$. Thus Theorem 11.7(a) yields a perfect analog of Theorem 11.5. This is precisely what the third theorem in Carroll (1982) says (also see Theorem 0 in Mitrinovic and Pecaric, 1988a). Theorems 11.3 and 11.4 give many other nongeometric applications. For example, Tanasescu (1989) used them to prove Minkowski's inequality, Bellman's inequality (the discrete case of Theorem 4.29), and a generalization of Theorem 10.48. A geometric implication of Theorem 10.48 is:

11.8. Theorem. If $T_1 = (x_1, x_2, x_3)$ and $T_2 = (y_1, y_2, y_3)$ are two arbitrary triangles, degenerate or not, with radii $R_1, r_1$ and $R_2, r_2$, respectively, then for every $p \ge 1$ there exists a triangle T with sides $z_i = (x_i^p + y_i^p)^{1/p}$ for i = 1, 2, 3, and radii R, r such that

(11.26)

Furthermore, equality holds iff the triangles $T_1$, $T_2$ (even if degenerate) are similar.

11.9. Remark. Note that Remarks 11.6(b), (c) are also valid for Theorems 11.7 and 11.8. □


Chapter 12 Convexity, Majorization, and Schur-Convexity

In this chapter we describe some results on convexity, majorization, and Schur-convexity (Schur-concavity). The notion of majorization arose as a measure of the diversity of the components of an n-dimensional vector (an n-tuple) and is closely related to convexity. It is formally discussed in Hardy, Littlewood, and Pólya (1934, 1952) and is treated most comprehensively by Marshall and Olkin (1979). In this chapter we shall restrict our attention to results in majorization and Schur-convexity theory that directly involve convex functions. Most of the results are already given in Marshall and Olkin (1979); thus to avoid duplication of effort we refer to their book for most of the proofs.

12.1. Majorization and Convex Functions

For fixed $n \ge 2$ let

$$x = (x_1, \ldots, x_n), \qquad y = (y_1, \ldots, y_n) \tag{12.1}$$

denote two n-tuples. Let

$$x_{[1]} \ge x_{[2]} \ge \cdots \ge x_{[n]}, \qquad y_{[1]} \ge y_{[2]} \ge \cdots \ge y_{[n]}, \tag{12.2}$$

$$x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}, \qquad y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)} \tag{12.3}$$

be their ordered components.

12.1. Definition. y is said to majorize x (or x is said to be majorized by y), in symbols $y \succ x$, if

$$\sum_{i=1}^m x_{[i]} \le \sum_{i=1}^m y_{[i]} \quad \text{holds for } m = 1, 2, \ldots, n - 1 \tag{12.4}$$

and

$$\sum_{i=1}^n x_i = \sum_{i=1}^n y_i. \tag{12.5}$$

Note that (12.4) is equivalent to

$$\sum_{i=n-m+1}^n x_{(i)} \le \sum_{i=n-m+1}^n y_{(i)} \quad \text{holds for } m = 1, 2, \ldots, n - 1. \tag{12.6}$$

This definition provides a partial ordering; namely, $y \succ x$ implies that, for a fixed sum, the $y_i$'s are more diverse than the $x_i$'s. To illustrate this point, we see that (i) $y \succ \bar{\mathbf{y}}$ always holds for $\bar{\mathbf{y}} = (\bar{y}, \ldots, \bar{y})$, where $\bar{y} = (1/n) \sum_{i=1}^n y_i$, and (ii) $(\sum_{i=1}^n y_i, 0, \ldots, 0) \succ y$ holds for all y such that $y_i \ge 0$ (i = 1, ..., n). This notion is closely related to convex functions as shown in the following theorem.

12.2. Theorem. Let I be an interval in $\mathbb{R}$, and let x, y be two n-tuples such that $x_i, y_i \in I$ (i = 1, ..., n). Then

$$\sum_{i=1}^n \phi(x_i) \le \sum_{i=1}^n \phi(y_i)$$

holds for every continuous convex function $\phi : I \to \mathbb{R}$ iff $y \succ x$ holds.
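Definition 12.1 translates directly into a short test on sorted partial sums. The sketch below is illustrative: the function name `majorizes`, the tolerance, and the sample vectors are assumptions. It checks the two examples (i) and (ii) mentioned above and then confirms the "if" direction of Theorem 12.2 for a few convex functions.

```python
import math

def majorizes(y, x, tol=1e-12):
    """Return True if y majorizes x in the sense of (12.4) and (12.5)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xs = sorted(x, reverse=True)   # x_[1] >= ... >= x_[n]
    ys = sorted(y, reverse=True)   # y_[1] >= ... >= y_[n]
    px = py = 0.0
    for i in range(len(xs) - 1):
        px += xs[i]
        py += ys[i]
        if px > py + tol:          # partial-sum condition (12.4) fails
            return False
    return abs(sum(xs) - sum(ys)) <= tol   # equal totals, (12.5)

y = [5.0, 3.0, 1.0]
y_bar = [sum(y) / len(y)] * len(y)         # example (i): (ybar, ..., ybar)
y_top = [sum(y)] + [0.0] * (len(y) - 1)    # example (ii): (sum y_i, 0, ..., 0)

print(majorizes(y, y_bar), majorizes(y_top, y), majorizes(y_bar, y))

# One direction of Theorem 12.2: for convex phi the sums are ordered
# the same way as the majorization chain y_top > y > y_bar.
convex_functions = (lambda t: t * t, math.exp, abs)
sums_ordered = all(
    sum(map(phi, y_bar)) <= sum(map(phi, y)) <= sum(map(phi, y_top))
    for phi in convex_functions
)
print(sums_ordered)
```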

12.3. Remark. Theorem 12.2 is well-known as the majorization theorem, and a convenient reference for its proof is Marshall and Olkin (1979, p. 11). It is due to Hardy, Littlewood, and Pólya (1934, 1952, p. 75), and can also be found in Karamata (1932); for a discussion concerning the matter of priority see Mitrinovic (1970, p. 169). □

The partial ordering of majorization defined in Definition 12.1 assumes that the sums of the components of x and y are the same. When this condition is replaced by a weaker one, we have the notions of weak submajorization and weak supermajorization (see, e.g., Marshall and Olkin, 1979, p. 10):

12.4. Definition. (a) y is said to weakly submajorize x, in symbols $y \succ_w x$, if (12.4) and

$$\sum_{i=1}^n x_i \le \sum_{i=1}^n y_i \tag{12.7}$$

hold.

(b) y is said to weakly supermajorize x, in symbols $y \succ^w x$, if

$$\sum_{i=1}^m x_{(i)} \ge \sum_{i=1}^m y_{(i)} \quad \text{for } m = 1, 2, \ldots, n - 1 \tag{12.8}$$

and

$$\sum_{i=1}^n x_i \ge \sum_{i=1}^n y_i \tag{12.9}$$

hold.

It is known that (see, e.g., Marshall and Olkin, 1979, p. 11, p. 22):

12.5. Fact. (a) If $y \succ x$ holds, then there exist a finite number of n-tuples $z_1, \ldots, z_N$ such that

$$y = z_1 \succ z_2 \succ \cdots \succ z_{N-1} \succ z_N = x, \tag{12.10}$$

and that for all j, $z_j$ and $z_{j+1}$ differ in two coordinates only.

(b) $y \succ x$ holds iff there exists a doubly stochastic matrix $Q = (q_{ij})$ (i.e., $q_{ij} \ge 0$ and $\sum_i q_{ij} = 1$ for all j, $\sum_j q_{ij} = 1$ for all i) such that x = yQ.
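The "if" direction of Fact 12.5(b) can be illustrated by building a doubly stochastic matrix and applying it to a vector. In the sketch below, which is an illustration under stated assumptions rather than a method from the text, Q is formed as a random convex combination of permutation matrices; the helper names and the sample vector are hypothetical.

```python
import itertools
import random

def majorizes(y, x, tol=1e-9):
    """Check (12.4) and (12.5): ordered partial sums and equal totals."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs[:-1], ys[:-1]):
        px, py = px + a, py + b
        if px > py + tol:
            return False
    return abs(sum(xs) - sum(ys)) <= tol

def random_doubly_stochastic(n, rng):
    """Random convex combination of all n x n permutation matrices."""
    perms = list(itertools.permutations(range(n)))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    q = [[0.0] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            q[i][j] += w / total       # entry in row i, column perm[i]
    return q

def row_times_matrix(y, q):
    """Return the row vector x = yQ."""
    n = len(y)
    return [sum(y[i] * q[i][j] for i in range(n)) for j in range(n)]

rng = random.Random(2)   # fixed seed so the run is repeatable
y = [4.0, 2.0, 1.0, 0.0]
q = random_doubly_stochastic(len(y), rng)
x = row_times_matrix(y, q)

row_sums = [sum(row) for row in q]
col_sums = [sum(q[i][j] for i in range(len(y))) for j in range(len(y))]
print(x, majorizes(y, x))
```

Averaging with Q pulls the components of y toward each other while preserving their sum, which is exactly the loss of diversity that majorization measures.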

12.6. Fact. (a) $y \succ_w x$ holds iff there exists an n-tuple z such that $y \succ z$ and $z \ge x$ (i.e., $z_i \ge x_i$ for i = 1, ..., n) hold. (b) $y \succ^w x$ holds iff there exists an n-tuple z such that $z \succ x$ and $z \ge y$.

In view of Fact 12.5(a) we can prove Theorem 12.2 simply by proving it holds for n = 2. Similarly, by combining Theorem 12.2 and Facts 12.5(a) and 12.6 we can obtain (see, e.g., Marshall and Olkin, 1979, p. 10):

12.7. Theorem. Let I be an interval in $\mathbb{R}$, and let x, y be two n-tuples such that $x_i, y_i \in I$ (i = 1, ..., n). Then (a)

$$\sum_{i=1}^n \phi(x_i) \le \sum_{i=1}^n \phi(y_i) \tag{12.11}$$

holds for every continuous increasing convex function $\phi$ iff $y \succ_w x$ holds; and (b) the inequality in (12.11) holds for every continuous decreasing convex function $\phi$ iff $y \succ^w x$ holds.
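The two weak orderings of Definition 12.4 and the "if" directions of Theorem 12.7 can be sketched in the same style. The function names, tolerance, and sample vectors below are assumptions made for illustration; the loops run over m = 1, ..., n so that the final step covers the total-sum conditions (12.7) and (12.9).

```python
import math

def weakly_submajorizes(y, x, tol=1e-12):
    """y >_w x: sums of the m largest entries of x never exceed those of y."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs, ys):           # m = 1, ..., n: (12.4) and (12.7)
        px, py = px + a, py + b
        if px > py + tol:
            return False
    return True

def weakly_supermajorizes(y, x, tol=1e-12):
    """y >^w x: sums of the m smallest entries of x are at least those of y."""
    xs, ys = sorted(x), sorted(y)
    px = py = 0.0
    for a, b in zip(xs, ys):           # m = 1, ..., n: (12.8) and (12.9)
        px, py = px + a, py + b
        if px < py - tol:
            return False
    return True

x = [2.0, 2.0, 1.0]
y_sub = [4.0, 1.0, 1.0]      # total larger than that of x
y_sup = [3.0, 1.0, 0.5]      # total smaller than that of x

print(weakly_submajorizes(y_sub, x), weakly_supermajorizes(y_sup, x))

def inc_convex(t):
    return math.exp(t)       # increasing and convex

def dec_convex(t):
    return math.exp(-t)      # decreasing and convex

part_a = sum(map(inc_convex, x)) <= sum(map(inc_convex, y_sub))
part_b = sum(map(dec_convex, x)) <= sum(map(dec_convex, y_sup))
print(part_a, part_b)
```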

12.8. Remark. Theorem 12.7 is due to Weyl (1949) and Tomic (1949). Weyl obtained the result by assuming $x_i > 0$ and $y_i > 0$ (i = 1, ..., n), whereas Tomic did not assume this condition. Earlier Polya (1947) proved Theorem 12.7 by using Theorem 12.2 for the special case $I = \mathbb{R}$, and a similar proof for an arbitrary interval is given in Mirsky (1959). In fact, Mirsky proved a more general result which includes Theorem 12.7 as a special case. □

In the following we discuss related results for Wright-convex functions and for functions with increasing increments. In Theorems 12.9, 12.11, and 12.12 we assume that x, y are n-tuples such that $x_i, y_i \in I$ (i = 1, ..., n) for some interval $I \subseteq \mathbb{R}$.

12.9. Theorem. For k = 2, ..., n let $c_k = \sum_{i=1}^{k-1} (x_i - y_i)$, and assume that $x_k + c_k \in I$ for all k.

(a) If

$$y_k \le x_{k+1} \quad \text{for } k = 1, \ldots, n - 1, \tag{12.12}$$

$$\sum_{i=1}^k x_i \ge \sum_{i=1}^k y_i \quad \text{for } k = 1, \ldots, n - 1, \tag{12.13}$$

and

$$\sum_{i=1}^n x_i = \sum_{i=1}^n y_i, \tag{12.14}$$

then (12.11) holds for every Wright-convex function $\phi : I \to \mathbb{R}$. Furthermore, (12.11) holds for every Wright-convex function $\phi$ if the reverse of the inequalities in (12.12) and (12.13) hold.

(b) If (12.12) and (12.14) hold and the reverse of the inequality in (12.13) holds, then the reverse of the inequality in (12.11) holds for every Wright-convex function $\phi : I \to \mathbb{R}$. Furthermore, the same is true if (12.13) and (12.14) hold and the reverse of the inequality in (12.12) holds.

Proof. The proof follows by changing $x_{2k-1}$ to $x_k$ and $x_{2k}$ to $y_k$ (k = 1, ..., n) in Theorem 5.46 and the fact that $z \ge \max_{1 \le i \le n} \{x_i, y_i\}$, where z is the value of $x_{2m+1}$ in (5.64). □

12.10. Remark. The following result is a special case of Theorem 12.9: Assume that $a_1 \ge a_3 \ge \cdots \ge a_{2n+1}$, $a_{2k} \ge 0$ (k = 1, ..., n), $a_1, a_{2k+1} \in I$, $a_k + a_{k+1} \in I$ (k = 1, ..., 2n). Then we have

$$\phi(a_1 + a_2) + \cdots + \phi(a_{2n-1} + a_{2n}) + \phi(a_{2n+1}) \ge \phi(a_1) + \phi(a_2 + a_3) + \cdots + \phi(a_{2n} + a_{2n+1})$$

for all Wright-convex functions $\phi : I \to \mathbb{R}$. □
The following theorem is a weighted version of Theorem 12.2. It can be regarded as a generalization of the majorization theorem in Theorem 12.2 and is given in Fuchs (1947).

12.11. Theorem. Let x, y be two decreasing n-tuples, and let $p = (p_1, \ldots, p_n)$ be a real n-tuple such that

$$\sum_{i=1}^k p_i x_i \le \sum_{i=1}^k p_i y_i \quad \text{for } k = 1, \ldots, n - 1; \tag{12.15}$$

$$\sum_{i=1}^n p_i x_i = \sum_{i=1}^n p_i y_i. \tag{12.16}$$

Then for every continuous convex function $\phi : I \to \mathbb{R}$, we have

$$\sum_{i=1}^n p_i \phi(x_i) \le \sum_{i=1}^n p_i \phi(y_i). \tag{12.17}$$

Similarly, a weighted version of Theorem 12.9 is given in Bullen, Vasic, and Stankovic (1973):

12.12. Theorem. Let x, y be two decreasing n-tuples, and p be a real n-tuple. If

$$\sum_{i=1}^k p_i x_i \le \sum_{i=1}^k p_i y_i \quad \text{for } k = 1, \ldots, n - 1, n \tag{12.18}$$

holds, then (12.17) holds for every continuous increasing convex function $\phi : I \to \mathbb{R}$. If x, y are increasing n-tuples and the reverse inequality in (12.18) holds, then (12.17) holds for every decreasing convex function $\phi : I \to \mathbb{R}$.

Proof of Theorems 12.11 and 12.12. Without loss of generality, assume that $x_i \ne y_i$, and define $d_i = (\phi(x_i) - \phi(y_i)) / (x_i - y_i)$ (i = 1, ..., n). Then from (12.7) we have $d_i \ge d_{i+1}$ for $i \le n - 1$, and the proof follows from

$$\sum_{i=1}^n p_i \phi(x_i) - \sum_{i=1}^n p_i \phi(y_i) = \sum_{i=1}^n p_i (x_i - y_i) d_i = (X_n - Y_n) d_n + \sum_{k=1}^{n-1} (X_k - Y_k)(d_k - d_{k+1}),$$

where $X_k = \sum_{i=1}^k p_i x_i$ and $Y_k = \sum_{i=1}^k p_i y_i$ denote the weighted partial sums. □

12.13. Remarks. (a) In Mitrinovic (1970, p. 165), it is stated that (12.15) and (12.16) hold iff (12.17) holds. Imoru (1974) presented what he claimed was a proof for this statement, but Cheng (1977) and Pecaric (1980a) independently showed, by a counterexample, that (12.15) and (12.16) are sufficient, but not necessary, for (12.17) to hold. Furthermore, they showed that (12.15) and (12.16) become necessary (and (12.18) is necessary in Theorem 12.12) when the components of p are all nonnegative.

(b) Theorem 12.11 can be used to prove a number of inequalities for convex functions, including Petrovic's inequality (see Remark 5.38(e)) and the Jensen-Steffensen Inequality and its reverse inequality (see Pecaric, 1981c). □

The definition of majorization stated in Definition 12.1 involves the comparison of the diversity of the components of two n-tuples. In the following we state a similar definition for integrable functions. Let x(t), y(t) be real-valued functions defined on an interval [a, b] such that $\int_a^s x(t)\,dt$, $\int_a^s y(t)\,dt$ both exist for all $s \in [a, b]$.

12.14. Definition. y(t) is said to majorize x(t), in symbols $y(t) \succ x(t)$, for $t \in [a, b]$, if they are decreasing in $t \in [a, b]$ and

$$\int_a^s x(t)\,dt \le \int_a^s y(t)\,dt \quad \text{for } s \in [a, b], \tag{12.19}$$

and equality in (12.19) holds for s = b.

An integral analog of Theorem 12.2 is the following: 12.15. Theorem. y(t) > x(t) for t [a,b] and b

f

[a, b] iff they are decreasing in

E

b

is nonnegative decreasing (concave) in each variable separately, and property:

a continuous increasing and G; satisfy F;(t) > G;(t) (increasing), is convex satisfies the following

(u;

+ h,

Uj

+ k) - (u; + h,

Uj) -

(u;,

Uj

+ k) + (u;,

Uj)

((u;

+ h,

Uj

+ k) - (u; + h,

Uj) -

(u;,

Uj

+ k) + (u;,

Uj):5

for all i =1= j, 0:5 U; :5 U; + h :51, 0:5 Uj:5 integrals are finite,

0)

+ k:5 1, then providing the

=

co

J4>(t)(G

Uj

~0

1(t),

... , Gn(t)) dt s;

( ~ )J4>(t)(F 1(t), ... , F;.(t)) dt.

o

0

(12.34)

As a consequence of Theorem 12.19, Boland and Proschan (1986) obtained the following results:

12.20. Corollary. Assume that co

co

Jt dF(t) < 00,

J t dG(t) < 00.

o

o

Then F(t) > G(t) for t E [0,00) holds iff for all nonnegative increasing continuous convex (concave) functions 4> and nonnegative decreasing



(increasing) functions , 00

00

J (t)( G(t» dt s;

J (t)(F(t»

(2=)

o

dt,

(12.35)

0

provided the integrals are finite.

Proof. The "only if' part follows immediately from Theorem 12.19. To prove the "if" part, assume now that (12.35) holds. Letting (u) = u and x(t) = I[x, (i.e., the indicator function of the interval [x, (0», it follows that f; G(t) dt 2= f; F(t) dt for all x 2= 0 since x is nonnegative and increasing. Taking (t) == 1, it also follows that f ~t dF(t) = f ~ t dG(t), and hence F(t) > G(t) for t E [0, (0). D 00)

12.21. Corollary.

F(t) > G(t) for t E [0, (0) holds iff 00

00

J(t) dG(t) J(t) dF(t)

(12.36)

2=

o

0

holds for all convex functions , provided the integrals are finite.

Proof. The "if" part of the result is immediate. Now suppose F(t) > G(t) for t E [0, (0). It suffices to prove (12.36) for the case where has derivative 1jJ and (0) = O. Then, by Theorem 12.19, 00

00

J (t) dG(t) = J 1jJ(t)G(t) dt o

0 00

=

00

J[1jJ(t) - 1jJ(O)]G(t) dt + 1jJ(0) Jt dG(t) o

0

00

2=

00

J[1jJ(t) - 1jJ(0) ]F(t) dt + 1jJ(0) Jt dF(t) o

0

00

= J (t) dF(t).

D

o

12.22. Remark. Another approach to proving the necessity of (12.36) in Corollary 12.21 is as follows (see Bhattacharjee, 1981): Suppose



pet) > G(t) for t E [0, (0). Let ZG and ZF be the random variables with [5;;' t dG(t)]-l fO G(t) dt and respective distribution functions [f;;' t dF(t)t 1 S ~Pet) dt (these are the equilibrium distributions of G and F respectively). Then ZG >st ZF (ZG is stochastically larger than ZF) and hence E( 'lJ!(ZG» 2 E('lJ!(ZF» for all increasing functions 'lJ!. But 00

00

00

f 'lJ!(t)G(t) dt = [f t dG(t) ]E( 'lJ!(ZG» 2 [f t dF(t)] E('lJ!(ZF» o

0

f

0

00

=

'lJ!(t)P(t) dt;

o

hence the statement follows. □

12.2. Schur-Convex Functions

The following notion of Schur-convexity and Schur-concavity, due to Schur (1923), generalizes the definition of convex functions and concave functions via the notion of majorization.

12.23. Definition. A real-valued function f defined on a set $A \subset \mathbb{R}^n$ is said to be Schur-convex (Schur-concave) on A if

$$y \succ x \text{ on } A \implies f(y) \ge (\le)\ f(x). \tag{12.37}$$

If equality in (12.37) holds only when x is a permutation of y, then f is said to be strictly Schur-convex (Schur-concave) on A.

We note the trivial fact that f is a Schur-concave function on A iff -f is a Schur-convex function on A; thus results for Schur-convex functions immediately apply to Schur-concave functions, and vice versa, under obvious modifications. Furthermore, we note in the following theorem that all Schur-convex functions are permutation invariant.

12.24. Theorem. Assume that $A \subset \mathbb{R}^n$ is permutation invariant, that is, $x \in A$ implies that $Px \in A$ for every $n \times n$ permutation matrix P. If f is a Schur-convex function in A, then f(x) = f(Px) for all P and all $x \in A$.

Proof. For an arbitrary but fixed permutation matrix P, we have

$$x \succ Px \quad \text{and} \quad Px \succ x \quad \text{for all } x \in A;$$

thus we have $f(x) \ge f(Px)$ and $f(Px) \ge f(x)$. This implies that f(x) = f(Px) for all $x \in A$ and all P. □

A useful condition for verifying the Schur-convexity property of f is known as the Schur-Ostrowski condition, and is stated in the following (see, e.g., Marshall and Olkin, 1979, p. 57):

12.25. Theorem. Let $f : A \to \mathbb{R}$ be a real-valued function where $A \subset \mathbb{R}^n$ is permutation-invariant, and assume that the first partial derivatives of f exist in A. Then f is Schur-convex in A iff

$$(x_i - x_j)\Big(\frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_j}\Big) \ge 0 \quad \text{for all } x \in A \tag{12.38}$$

holds for all $1 \le i \ne j \le n$.
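Condition (12.38) lends itself to a quick numerical check. The sketch below is illustrative only: it estimates the partial derivatives by central differences (an assumption of this example, not a technique from the text) and evaluates the left side of (12.38) at random points for two symmetric functions on the positive orthant, the sum of squares and the negated product. The function names and the sampling range are hypothetical.

```python
import random

def partial(f, x, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative of f at x."""
    up = list(x)
    up[i] += h
    dn = list(x)
    dn[i] -= h
    return (f(up) - f(dn)) / (2.0 * h)

def schur_ostrowski_min(f, points):
    """Smallest value of (x_i - x_j)(f_i - f_j) over the sample points."""
    worst = float("inf")
    for x in points:
        grads = [partial(f, x, i) for i in range(len(x))]
        for i in range(len(x)):
            for j in range(len(x)):
                if i != j:
                    value = (x[i] - x[j]) * (grads[i] - grads[j])
                    worst = min(worst, value)
    return worst

rng = random.Random(3)   # fixed seed so the run is repeatable
points = [[rng.uniform(0.5, 2.0) for _ in range(3)] for _ in range(200)]

def sum_squares(x):
    return sum(t * t for t in x)       # symmetric and convex

def neg_product(x):
    return -(x[0] * x[1] * x[2])       # product is Schur-concave for x > 0

min_sq = schur_ostrowski_min(sum_squares, points)
min_np = schur_ostrowski_min(neg_product, points)
print(min_sq >= -1e-6, min_np >= -1e-6)
```

For the sum of squares the expression equals $2(x_i - x_j)^2$, and for the negated product it equals $x_1 x_2 x_3 (x_i - x_j)^2 / (x_i x_j)$, so both minima are nonnegative up to rounding.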

The next theorem is a restatement of Theorem 12.2 and illustrates how one-dimensional convex functions and n-dimensional Schur-convex functions are related (see, e.g., Marshall and Olkin, 1979, p. 64).

12.26. Theorem. Let $I$ be an interval in $\mathbb{R}$. Let $x, y$ be two n-tuples such that $x_i, y_i \in I$ $(i = 1, \ldots, n)$ and define $f: I^n \to \mathbb{R}$ to be a real-valued function such that $f(x) = \sum_{i=1}^n \phi(x_i)$ for some continuous function $\phi: I \to \mathbb{R}$. Then $f$ is a Schur-convex function on $I^n$ iff $\phi$ is a convex function on $I$.

A corresponding result can be given for functions of the form $f(x) = \prod_{i=1}^n \phi(x_i)$ when $\phi$ is a log-concave function (see Fact 13.24), and it has important statistical applications when the random variables are i.i.d. with a common density function $\phi(x)$. To see how n-dimensional convex functions and n-dimensional Schur-convex functions are related, we observe the following result (Marshall and Olkin, 1979, pp. 68-69).

12.27. Theorem. Let $f: A \to \mathbb{R}$ where $A \subset \mathbb{R}^n$ is permutation invariant.

(a) If $f$ is permutation invariant and convex in $A$, then it is Schur-convex in $A$.

(b) More generally, if $f$ is permutation invariant and convex in each pair of arguments (where the remaining arguments are held fixed) in $A$, then $f$ is Schur-convex in $A$.
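The relation in Theorem 12.26 can be tried out on a small example. The sketch below assumes the usual definition of majorization (equal totals, and dominating partial sums of the decreasing rearrangements); the helper names `majorizes` and `separable` are hypothetical and introduced only for this illustration.

```python
import math
from itertools import accumulate


def majorizes(y, x, tol=1e-12):
    """True if y majorizes x: equal sums, and every partial sum of the
    decreasing rearrangement of y dominates that of x."""
    if len(x) != len(y):
        return False
    px = list(accumulate(sorted(x, reverse=True)))
    py = list(accumulate(sorted(y, reverse=True)))
    if abs(px[-1] - py[-1]) > tol:
        return False
    return all(b >= a - tol for a, b in zip(px, py))


def separable(phi):
    """Build f(x) = sum_i phi(x_i) from a one-dimensional function phi."""
    return lambda x: sum(phi(t) for t in x)


y = [4.0, 1.0, 1.0]
x = [2.0, 2.0, 2.0]

f_convex = separable(lambda t: t * t)  # phi convex  -> f Schur-convex
f_concave = separable(math.sqrt)       # phi concave -> f Schur-concave

convex_values = (f_convex(y), f_convex(x))     # f(y) >= f(x) expected
concave_values = (f_concave(y), f_concave(x))  # f(y) <= f(x) expected
```

Here $y \succ x$, so the convex choice $\phi(t) = t^2$ gives $f(y) = 18 \ge 12 = f(x)$, while the concave choice $\phi(t) = \sqrt{t}$ reverses the inequality.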

By combining Schur-convexity and monotonicity, useful results can be obtained via weak majorization. For example, a restatement of Theorem 12.7 is (Marshall and Olkin, 1979, p. 59):

12.28. Theorem. Assume that $f$ is defined as in Theorem 12.26. If $\phi$ is an increasing (decreasing) convex function on $I$, then $f$ is an increasing (decreasing) Schur-convex function on $I^n$.

Similarly, we observe an analog of Theorem 12.27 for monotonic convex functions defined in $\mathbb{R}^n$ (Marshall and Olkin, 1979, p. 68):

12.29. Theorem. Let $f$ be defined as in Theorem 12.27 where $A \subset \mathbb{R}^n$ is permutation invariant. If $f$ is permutation invariant, increasing (decreasing), and convex in $A$, then $f$ is increasing (decreasing) and Schur-convex in $A$.

Marshall and Olkin (1979, p. 61) contains a detailed study of closure properties of compositions of Schur-convex and Schur-concave functions of the form

$$\psi(x) = h(\phi_1(x), \ldots, \phi_k(x)), \qquad x \in \mathbb{R}^n.$$

For example, they illustrate that (i) if each of $\phi_i$ $(i = 1, \ldots, k)$ is Schur-convex and if $h: \mathbb{R}^k \to \mathbb{R}$ is increasing (decreasing), then $\psi: \mathbb{R}^n \to \mathbb{R}$ is Schur-convex (Schur-concave); and (ii) if each of $\phi_i$ $(i = 1, \ldots, k)$ is Schur-concave and if $h$ is increasing (decreasing), then $\psi$ is Schur-concave (Schur-convex). Combining those results with Theorem 12.27, we immediately obtain useful results when each of $\phi_i$ $(i = 1, \ldots, k)$ is a permutation-invariant convex function in $\mathbb{R}^n$. If each of $\phi_i$ is also monotonic, then similar results in their Table 1 apply. For details, see Marshall and Olkin (1979, p. 61).

The notions of Schur-convexity and Schur-concavity are useful in obtaining inequalities via majorization and weak majorization, and there exist numerous results in this area. In the following we describe two convolution theorems (Theorems 12.30 and 12.32) due to Marshall and Olkin (1974) and Proschan and Sethuraman (1977a), respectively (see also Marshall and Olkin, 1979, pp. 100-101).

12.30. Theorem. If $\phi$ and $f$ are two Schur-concave functions defined on $\mathbb{R}^n$, then the function $\psi$ defined on $\mathbb{R}^n$ by

$$\psi(a) = \int_{\mathbb{R}^n} \phi(a - x) f(x)\,dx$$

is Schur-concave (whenever the integral exists).

An application of Theorem 12.30 yields the following Schur-concavity property of distribution functions when the density functions are Schur-concave. This result is due to Marshall and Olkin (1974, 1979, p. 101).

12.31. Corollary. Let $f(x): \mathbb{R}^n \to [0, \infty)$ be a probability density function of a random variable $X = (X_1, \ldots, X_n)$ such that the probability measure is absolutely continuous with respect to Lebesgue measure. If $f$ is a Schur-concave function of $x$ for $x \in \mathbb{R}^n$, then the distribution function of $X$

$$F(a) = P\Big[\bigcap_{i=1}^n \{X_i \le a_i\}\Big], \qquad a \in \mathbb{R}^n,$$

is a Schur-concave function of $a$.

Proof. For $i = 1, \ldots, n$, let

$$I_{(-\infty, a_i]}(x_i) = \begin{cases} 1 & \text{if } a_i - x_i \ge 0, \\ 0 & \text{otherwise} \end{cases}$$

be the indicator function of the set $\{x_i : x_i \le a_i\}$. The indicator function $\phi(x)$ of the set

$$A = \{x : x \in \mathbb{R}^n,\ x_i \ge 0 \text{ for } i = 1, \ldots, n\}$$

is a Schur-concave function of $x \in \mathbb{R}^n$ and $\phi(a - x) = \prod_{i=1}^n I_{(-\infty, a_i]}(x_i)$, so $F(a)$ can be expressed as

$$F(a) = \int_{\mathbb{R}^n} \phi(a - x) f(x)\,dx.$$

The conclusion then follows from Theorem 12.30. □
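Corollary 12.31 can be illustrated with i.i.d. standard normal components, whose joint density is a decreasing function of the Schur-convex function $\sum x_i^2$ and is therefore Schur-concave. The sketch below is an illustration chosen here, not an example from the source; the function names and the vectors $a$, $b$ are assumptions made for the demonstration.

```python
import math


def std_normal_cdf(t):
    """Standard normal distribution function, expressed through math.erf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def joint_cdf_iid_normal(a):
    """F(a) = P[X_1 <= a_1, ..., X_n <= a_n] for i.i.d. N(0, 1) components."""
    p = 1.0
    for t in a:
        p *= std_normal_cdf(t)
    return p


a = (0.5, 0.5)  # the "less diverse" vector
b = (1.0, 0.0)  # b majorizes a: same total, larger spread

F_a = joint_cdf_iid_normal(a)
F_b = joint_cdf_iid_normal(b)
```

Since $b \succ a$, Schur-concavity of $F$ predicts $F(a) \ge F(b)$, which is what the two computed values show.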

12.32. Theorem. Suppose that either (i) $\mathcal{X} = \mathbb{R}$, $\Theta \subset \mathbb{R}$ is an interval and $\mu$ is Lebesgue measure or (ii) $\mathcal{X}$ is the set of all integers, $\Theta$ is an interval or an interval of integers, and $\mu$ is the counting measure. If $g: \Theta \times \mathcal{X} \to [0, \infty)$ is totally positive of order two (TP2) and satisfies the semigroup property

$$g(\theta_1 + \theta_2, x) = \int_{\mathcal{X}} g(\theta_1, t)\, g(\theta_2, x - t)\,d\nu(t)$$

for some measure $\nu$ on $\mathcal{X}$, and $\phi(x)$ is Schur-convex for $x \in \mathbb{R}^n$, then the function $\psi$ defined for $\theta = (\theta_1, \ldots, \theta_n) \in \Theta \times \cdots \times \Theta$ by

$$\psi(\theta) = \int \prod_{i=1}^n g(\theta_i, x_i)\, \phi(x) \prod_{i=1}^n d\mu(x_i)$$

is a Schur-convex function of $\theta$.

Theorems 12.30 and 12.32 can be applied to obtain useful integral inequalities and probability inequalities. Some of the recent results derived from those two theorems using Schur-concavity can be found in the review article by Tong (1988).

12.3. Multivariate Majorization and Convex Functions

As indicated in Definition 12.1, the notion of majorization concerns a partial ordering of the diversity of the components of two n-tuples $x$ and $y$ in $\mathbb{R}^n$. A natural problem of interest is the extension of this notion from n-tuples (vectors) to $k \times n$ matrices. For example, let

$$X = (x_1, x_2, \ldots, x_k)' = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{k1} & x_{k2} & \cdots & x_{kn} \end{pmatrix}, \qquad Y = (y_1, y_2, \ldots, y_k)' = \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1n} \\ y_{21} & y_{22} & \cdots & y_{2n} \\ \vdots & \vdots & & \vdots \\ y_{k1} & y_{k2} & \cdots & y_{kn} \end{pmatrix} \tag{12.39}$$

be two $k \times n$ real matrices, where $x_1, \ldots, x_k$; $y_1, \ldots, y_k$ are the corresponding row vectors. If

$$x_i \prec y_i \quad \text{for } i = 1, \ldots, k \tag{12.40}$$

where (12.41) then intuitively speaking the components of $X$ are less diverse than those of $Y$. The question of interest is, of course, how to find a useful definition so that meaningful results can be obtained. This involves a multivariate extension of Definition 12.1. To state a definition of multivariate majorization given in Marshall and Olkin (1979, Chapter 15), we first observe the definition of a T-transform matrix (Marshall and Olkin, 1979, p. 21):

12.33. Definition. An $n \times n$ matrix is said to be a T-transform matrix if it is of the form

$$T = \alpha I + (1 - \alpha) I_{rs} \tag{12.42}$$

for some $\alpha \in [0, 1]$, where $I$ is the $n \times n$ identity matrix and $I_{rs}$ is obtained by interchanging the $r$th and $s$th columns of $I$ for some $r \ne s$.
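Definition 12.33 is easy to make concrete. The following sketch builds a T-transform matrix as a list of rows and applies it to a row vector; the function names, the zero-based indices `r`, `s`, and the sample values are illustrative assumptions, not part of the source.

```python
def t_transform(n, r, s, alpha):
    """Return T = alpha * I + (1 - alpha) * I_rs as a list of rows.

    I_rs is the identity matrix with columns r and s interchanged
    (r and s are zero-based here, an assumption of this sketch).
    """
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] += alpha
        # Row i of I_rs has its single 1 in column j.
        j = s if i == r else r if i == s else i
        T[i][j] += 1.0 - alpha
    return T


def row_times_matrix(y, T):
    """Row vector y times matrix T."""
    n = len(y)
    return [sum(y[i] * T[i][j] for i in range(n)) for j in range(n)]


T = t_transform(3, 0, 1, 0.75)
y = [4.0, 1.0, 1.0]
x = row_times_matrix(y, T)  # averages the first two coordinates of y

row_sums = [sum(row) for row in T]
col_sums = [sum(T[i][j] for i in range(3)) for j in range(3)]
```

With $\alpha = 0.75$ the vector $y = (4, 1, 1)$ is mapped to $x = (3.25, 1.75, 1)$: the total is unchanged, the two affected coordinates move toward each other, and all row and column sums of $T$ equal 1.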

12.34. Definition. Let $X, Y$ be two $k \times n$ real matrices for $k \ge 2$, $n \ge 2$.

(a) $Y$ is said to chain majorize $X$ (in symbols, $Y \succ^c X$) if there exists a finite number of T-transform matrices $T_1, \ldots, T_N$ such that $X = Y \prod_{i=1}^N T_i$.

(b) $Y$ is said to majorize $X$ in a multivariate sense ($Y \succ^m X$) if there exists an $n \times n$ doubly stochastic matrix $Q$ such that $X = YQ$.

(c) $Y$ is said to row-wise majorize $X$ ($Y \succ^r X$) if $y_i \succ x_i$ holds for $i = 1, 2, \ldots, k$.

The implications of partial orderings defined in Definition 12.34 are given in the following theorem:

12.35. Theorem. Let $X, Y$ be two $k \times n$ real matrices. Then

$$Y \succ^c X \overset{(i)}{\Longrightarrow} Y \succ^m X \overset{(ii)}{\Longrightarrow} Y \succ^r X. \tag{12.43}$$

Furthermore, the implications (i) and (ii) are strict.

Proof. (i) follows from the fact that the product of a finite number of T-transform matrices is doubly stochastic; (ii) follows from Fact 12.5(b). To show that (i) is strict, by Fact 12.5(b) it suffices to find a matrix $Q$ that is doubly stochastic but is not the product of a finite number of T-transform matrices; such a $3 \times 3$ matrix can be found in Marshall and Olkin (1979, p. 431). The proof for the strict implication (ii) follows immediately from Fact 12.5(b) and Definition 12.34. □

Rinott (1973) considered the notions of multivariate majorization given in Definition 12.34 and derived some useful results in probability and statistics. He also provided a characterization of chain majorization and row-wise majorization. His result is stated in the following theorem, and a convenient reference for the proof is Marshall and Olkin (1979, pp. 434-435).

12.36. Theorem. Let $f(X) = f(x_1, \ldots, x_k): \mathbb{R}^{k \times n} \to \mathbb{R}$ be a differentiable function. Then

(a) $f(Y) \ge f(X)$ for all $Y \succ^c X$ holds iff (i) $f(X) = f(XP)$ holds for all $n \times n$ permutation matrices $P$ and (ii) $\sum_{j=1}^k (x_{jr} - x_{js})[f_{(jr)}(X) - f_{(js)}(X)] \ge 0$ holds for all $r, s = 1, \ldots, n$, where $f_{(jr)}(X) = \frac{\partial}{\partial u_{jr}} f(U)\big|_{U = X}$.

(b) $f(Y) \ge f(X)$ for all $Y \succ^r X$ holds iff for every fixed $j = 1, \ldots, k$, $(x_{jr} - x_{js})[f_{(jr)}(X) - f_{(js)}(X)] \ge 0$ holds when the values of the other columns of $X$ are held fixed.

12.37. Remark. Note that the statement of Theorem 12.36(b) is just that $x_j, y_j$ satisfy Theorem 12.25 for each $j = 1, \ldots, k$. □

Marshall and Olkin (1979, p. 435) proved a result for majorization in a multivariate sense. Their result is related to symmetric convex functions:

12.38. Theorem. Let $X, Y$ be two $k \times n$ matrices as defined in (12.39). If $Y \succ^m X$, then $f(Y) \ge f(X)$ holds for all $f(X) = f(x_1, \ldots, x_k): \mathbb{R}^{k \times n} \to \mathbb{R}$ which are symmetric and convex in the sense that (i) $f(X) = f(XP)$ for all $n \times n$ permutation matrices $P$ and (ii)

$$f(\alpha U + (1 - \alpha) V) \le \alpha f(U) + (1 - \alpha) f(V)$$

for all $\alpha \in [0, 1]$ and $k \times n$ matrices $U$ and $V$.
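A small exact computation illustrates Theorem 12.38. In the sketch below the function $f$ is taken to be the sum of squared entries, which is invariant under column permutations and convex; this choice of $f$, the matrices `Y` and `Q`, and the helper names are assumptions made only for illustration.

```python
from fractions import Fraction as Fr


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def sum_of_squared_entries(M):
    """f(M) = sum of m_ij^2: symmetric under column permutations and convex."""
    return sum(v * v for row in M for v in row)


# A 2 x 3 matrix Y and a 3 x 3 doubly stochastic matrix Q (rows and columns sum to 1).
Y = [[Fr(1), Fr(2), Fr(3)],
     [Fr(0), Fr(4), Fr(2)]]
half = Fr(1, 2)
Q = [[half, half, Fr(0)],
     [half, Fr(0), half],
     [Fr(0), half, half]]

X = matmul(Y, Q)  # Y majorizes X in the multivariate sense, by Definition 12.34(b)
f_Y = sum_of_squared_entries(Y)
f_X = sum_of_squared_entries(X)
```

Exact arithmetic gives $f(Y) = 34$ and $f(X) = 53/2$, consistent with $f(Y) \ge f(X)$.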

Chapter 13 Convexity and Log-Concavity Related Moment and Probability Inequalities

In this chapter we discuss some moment and probability inequalities that arise from the applications of convexity, Jensen-type inequalities, and log-concavity. We shall assume without explicit statement that expectations and other integrals mentioned in theorems, corollaries, etc., exist. If an expectation or integral does not exist, the corresponding statement is to be considered vacuous. The convexity-related results are given in Sections 13.1-13.3, and results that arise from the log-concavity property of probability density functions are discussed in Sections 13.4-13.7.

13.1. Jensen's Inequality

Jensen's inequality for one-variable functions and for functions of several variables, and its reversals, refinements, and converses have been treated extensively earlier in this book. In this chapter we first present a stochastic version of Jensen's inequality. It is a probabilistic analog of the previous results and has important applications in probability and statistics. For $n \ge 1$ let $X = (X_1, \ldots, X_n)$ be an n-dimensional random variable. Let $F(x) = P[X \le x]$ be the distribution function of $X$, and let $\mu = (\mu_1, \ldots, \mu_n)$ denote the mean vector of $X$.

13.1. Theorem. For $n \ge 1$ let $\phi: A \to \mathbb{R}$ be a continuous convex function where $A \subset \mathbb{R}^n$ is an open convex set such that $P[X \in A] = 1$. Then $\mu \in A$ and

$$E\phi(X) = \int_A \phi(x)\,dF(x) \ge \phi(\mu). \tag{13.1}$$

Proof. The fact that $\mu \in A$ is easy to verify. To show that (13.1) holds we use the proof in Marshall and Olkin (1979, p. 454). For $z \in A$ and $i = 1, \ldots, n$, let

$$\phi_{(i)}(z) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\,[\phi(z_1, \ldots, z_{i-1}, z_i + \varepsilon, z_{i+1}, \ldots, z_n) - \phi(z_1, \ldots, z_n)],$$

and denote

$$\nabla_+ \phi(x) = (\phi_{(1)}(x), \ldots, \phi_{(n)}(x)), \qquad x \in A.$$

Then $\nabla_+ \phi(x)$ is Borel-measurable and, more importantly,

$$\phi(x) - \phi(z) \ge [\nabla_+ \phi(z)](x - z)', \qquad x, z \in A$$

holds (Marshall and Olkin, 1979, p. 451). Choosing $z = \mu$, we have

$$\phi(X) - \phi(\mu) \ge [\nabla_+ \phi(\mu)](X - \mu)' \quad \text{a.s.} \tag{13.2}$$

The proof follows by taking expectations on both sides of (13.2). □
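The inequality (13.1) can be checked exactly for a small discrete distribution. The sketch below uses rational arithmetic; the three-point distribution on $\mathbb{R}^2$ and the convex function $\phi(x) = x_1^2 + x_1 x_2 + x_2^2$ are illustrative assumptions, not taken from the source.

```python
from fractions import Fraction as Fr


def phi(x):
    """A convex function on R^2: a positive definite quadratic form."""
    return x[0] * x[0] + x[0] * x[1] + x[1] * x[1]


# Support points of X and their probabilities (an arbitrary example).
support = [(Fr(0), Fr(0)), (Fr(2), Fr(0)), (Fr(0), Fr(4))]
probs = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]

# Mean vector mu = E X, computed coordinate by coordinate.
mu = tuple(sum(p * x[i] for p, x in zip(probs, support)) for i in range(2))

expected_phi = sum(p * phi(x) for p, x in zip(probs, support))  # E phi(X)
phi_of_mean = phi(mu)                                           # phi(E X)
```

For this example $\mu = (1/2, 1)$, $E\phi(X) = 5$ and $\phi(\mu) = 7/4$, in agreement with (13.1).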

Note that if $\phi$ is differentiable, then $\nabla_+ \phi(x)$ is just the gradient vector of $\phi$ at $x$. Furthermore, note that if $A$ is an interval in $\mathbb{R}$ and the distribution of $X$ is discrete, this result reduces to the result treated in Theorem 2.1. For $n = 1$, a more general result is known (Chow and Teicher, 1978, pp. 102-103):

13.2. Theorem. Let $\phi: \mathbb{R} \to \mathbb{R}$ be a continuous convex function. Let $c \in \mathbb{R}$ and $r: \mathbb{R} \to \mathbb{R}$ be Borel measurable. If $r(x)$ and

$$\delta(x) \equiv (x - EX) - (r(x) - Er(X))$$

are both increasing for $x \in \mathbb{R}$, then

$$E\phi((X - EX) + c) \ge E\phi((r(X) - Er(X)) + c). \tag{13.3}$$

Proof. If $P[\delta(X) = 0] = 1$, then (13.3) is immediate. Otherwise, by the convexity of $\phi$ we have

$$\phi((X - EX) + c) - \phi((r(X) - Er(X)) + c) \ge \delta(X)\,\nabla_+ \phi((r(X) - Er(X)) + c) \quad \text{a.s.,} \tag{13.4}$$

where

and (by the convexity of $\phi$) $\nabla_+ \phi(z)$ is increasing in $z$. By the monotonicity of [...] $0$ for all $x \in \mathbb{R}^n$, then (13.19) is equivalent to

$$\log f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha \log f(x_1) + (1 - \alpha) \log f(x_2). \tag{13.20}$$

The main theorem of Prekopa (1971) states:

13.20. Theorem. Let $\mathcal{P}$ be defined as in (13.17). If $f$ is log-concave, then (13.18) holds for all $B_1, B_2 \subset \mathbb{R}^n$ and all $\alpha \in [0, 1]$.

An immediate generalization of Theorem 13.20 is:

13.21. Corollary. Let $X$ be an n-dimensional random vector with probability density function $f(x)$, and let $B_1, \ldots, B_k$ $(k \ge 2)$ be subsets of $\mathbb{R}^n$. If $f(x)$ is a log-concave function of $x$, then (13.21) holds for all $\alpha_i$ such that $\alpha_i > 0$ and $\sum_{i=1}^k \alpha_i = 1$.

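Proof.

$$\begin{aligned}
P\Big[X \in \sum_{i=1}^k \alpha_i B_i\Big] &= P\Big[X \in \alpha_1 B_1 + (1 - \alpha_1) \sum_{i=2}^k \frac{\alpha_i}{1 - \alpha_1} B_i\Big] \\
&\ge \{P[X \in B_1]\}^{\alpha_1} \Big\{P\Big[X \in \sum_{i=2}^k \frac{\alpha_i}{1 - \alpha_1} B_i\Big]\Big\}^{1 - \alpha_1} \\
&\ge \{P[X \in B_1]\}^{\alpha_1} \{P[X \in B_2]\}^{\alpha_2} \Big\{P\Big[X \in \sum_{i=3}^k \frac{\alpha_i}{1 - \alpha_1 - \alpha_2} B_i\Big]\Big\}^{1 - \alpha_1 - \alpha_2} \\
&\ge \cdots \ge \prod_{i=1}^k \{P[X \in B_i]\}^{\alpha_i}. \qquad \square
\end{aligned}$$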
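The product form of the bound obtained at the end of this proof can be tried on a one-dimensional case. The sketch below uses the standard normal distribution, whose density is log-concave, with intervals as the sets $B_i$, so that the weighted Minkowski combination is again an interval. The intervals, the weights, and the function names are illustrative assumptions.

```python
import math


def std_normal_cdf(t):
    """Standard normal distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def prob_interval(lo, hi):
    """P[lo <= X <= hi] for X ~ N(0, 1)."""
    return std_normal_cdf(hi) - std_normal_cdf(lo)


def minkowski_combination(intervals, weights):
    """sum_i w_i * [a_i, b_i] = [sum_i w_i a_i, sum_i w_i b_i] for w_i > 0."""
    lo = sum(w * a for w, (a, _) in zip(weights, intervals))
    hi = sum(w * b for w, (_, b) in zip(weights, intervals))
    return lo, hi


intervals = [(0.0, 1.0), (2.0, 4.0), (-1.0, 0.5)]
weights = [0.5, 0.3, 0.2]  # positive, summing to 1

combo = minkowski_combination(intervals, weights)
lhs = prob_interval(*combo)  # probability of the combined set
rhs = math.prod(prob_interval(a, b) ** w
                for (a, b), w in zip(intervals, weights))  # weighted geometric mean
```

The combined set here is approximately $[0.4, 1.8]$, and its probability exceeds the weighted geometric mean of the three individual probabilities.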

Prekopa's original result is given when $B_1, B_2$ are convex sets only; his original proof depends on an application of the Brunn-Minkowski inequality and is quite lengthy. His result was generalized to a larger class of probability measures for convex sets by Borell (1975). Borell considered measures with the property that (13.22) holds for all convex sets $B_1, B_2 \subset \mathbb{R}^n$, all $\alpha \in [0, 1]$, and for some $s \in [-\infty, 1/n)$. When $s = 0$, then by continuity (13.22) reduces to (13.18). When $s = -\infty$, then the right-hand side of (13.22) is just $\min\{P(B_1), P(B_2)\}$. Borell (1975) first showed that

13.22. Theorem. If the probability measure $\mathcal{P}$ satisfies (13.22) for all $B_1, B_2 \subset \mathbb{R}^n$, all $\alpha \in [0, 1]$, and for some $s \in [-\infty, 1/n)$, then it is absolutely continuous w.r.t. Lebesgue measure.

As a consequence of Theorem 13.22, the class of measures satisfying (13.22) for some $s \in [-\infty, 1/n)$ must be of the form (13.17) for some $f$. Borell's (1975) main theorem concerns a characterization of the class of probability measures satisfying (13.22). His original result concerns convex sets only. In the following theorem the convexity condition on $B_1, B_2$ is removed:

13.23. Theorem. Let $f: \mathbb{R}^n \to [0, \infty)$ be a probability density function, and let $\mathcal{P}$ be the probability measure defined in (13.17). Then the following statements are equivalent:

(a) $\mathcal{P}$ satisfies (13.22) for all sets $B_1, B_2 \in \mathcal{B}$, all $\alpha \in [0, 1]$ such that $\alpha B_1 + (1 - \alpha) B_2 \in \mathcal{B}$, and for some $s \in [-\infty, 1/n)$.

(b) There exists a Borel-measurable function $g: \mathbb{R}^n \to \mathbb{R}$ such that $f(x) = g(x)$ almost everywhere and (i) if $s \in [-\infty, 0)$, then $(g(x))^{s/(1 - ns)}$ is convex, (ii) if $s = 0$, then $\log g(x)$ is concave, (iii) if $s \in (0, 1/n)$, then $(g(x))^{s/(1 - ns)}$ is concave.

Sketch of Proof. The original proof of Borell (1975) is lengthy and is for convex sets only. In the following we adopt the proof given by Rinott (1976). Rinott's proof involves replacing integrals of $f$ over sets in $\mathbb{R}^n$ by a certain measure $\nu$ of epigraphs in $\mathbb{R}^{n+1}$. Thus his proof is valid only for $s \in [-\infty, 1/(n + 1))$. Let $B^*$ be any set in $\mathbb{R}^{n+1}$. Consider a measure $\nu$ given by

=(

dv(B*)

for s # 0, s E [-m, l / ( n

+ l)),

for s = 0.

(13.23)

Then it can be shown that, for $B_1^*, B_2^* \subset \mathbb{R}^{n+1}$,

$$\nu(\alpha B_1^* + (1 - \alpha) B_2^*) \ge \begin{cases} [\alpha(\nu(B_1^*))^s + (1 - \alpha)(\nu(B_2^*))^s]^{1/s} & \text{for } s \ne 0,\ s \in [-\infty, 1/(n + 1)), \\ (\nu(B_1^*))^{\alpha} (\nu(B_2^*))^{1 - \alpha} & \text{for } s = 0. \end{cases} \tag{13.24}$$

The inequality in (13.24) can be obtained by first showing it for rectangular sets, then extending it to any Borel-measurable sets by using the argument in Borell (1975).

(b) ⇒ (a): For arbitrary but fixed $B \subset \mathbb{R}^n$ and $g: \mathbb{R}^n \to \mathbb{R}$, let us define the epigraph in $\mathbb{R}^{n+1}$:

$$A(B, g) = \{(x, \lambda) : x \in B,\ \lambda \in \mathbb{R},\ g(x) \le \lambda\}.$$
Let us first consider the case $s = 0$ and assume that $f$ satisfies (13.19). Denoting $B^* = A(B, -\log f) \subset \mathbb{R}^{n+1}$, we can easily verify that for the measure $\nu$ defined in (13.23) we have

$$P(B) = \nu(B^*) \tag{13.25}$$

for all $B \subset \mathbb{R}^n$ and

$$[\alpha B_1 + (1 - \alpha) B_2]^* \supseteq \alpha B_1^* + (1 - \alpha) B_2^* \tag{13.26}$$

for all $B_1, B_2 \subseteq \mathbb{R}^n$. Thus (13.24), (13.25), and (13.26) together imply

$$P[\alpha B_1 + (1 - \alpha) B_2] = \nu([\alpha B_1 + (1 - \alpha) B_2]^*) \ge \nu[\alpha B_1^* + (1 - \alpha) B_2^*] \ge (\nu(B_1^*))^{\alpha} (\nu(B_2^*))^{1 - \alpha} = (P(B_1))^{\alpha} (P(B_2))^{1 - \alpha},$$

which completes the proof for $s = 0$. For $s < 0$ we can repeat the above arguments with $B^* = A(B, f^{s/(1 - ns)})$, and for $s \in (0, 1/(n + 1))$ we can define $B^* = H(B, f^{s/(1 - ns)})$ where the hypograph $H(B, g)$ is given by

$$H(B, g) = \{(x, \lambda) : x \in B,\ \lambda \in \mathbb{R}, \text{ and } 0 \le \lambda \le g(x)\}.$$

(a) ⇒ (b): First consider the case $s = 0$ and assume that (13.18) holds. Let $S_k(x) = \{y : |y - x| \le 1/k\}$ and define

$$f_k(x) = \Big[\int_{S_k(x)} f(y)\,dy\Big] \Big/ \int_{S_k(x)} dy.$$

Then (13.18) implies

$$f_k(\alpha x_1 + (1 - \alpha) x_2) \ge (f_k(x_1))^{\alpha} (f_k(x_2))^{1 - \alpha} \quad \text{for } \alpha \in [0, 1].$$

Consequently the function $g(x) = \liminf_{k \to \infty} f_k(x)$ satisfies

$$g(\alpha x_1 + (1 - \alpha) x_2) \ge (g(x_1))^{\alpha} (g(x_2))^{1 - \alpha} \quad \text{for } \alpha \in [0, 1].$$

By the differentiation of the integral argument, we have $f = g$ almost everywhere. Thus the proof is complete for $s = 0$. For $s = -\infty$, a similar argument yields $g(\alpha x_1 + (1 - \alpha) x_2) \ge \min\{g(x_1), g(x_2)\}$. For $s \in (-\infty, 0)$, we can consider the sets $C_1, C_2 \subset \mathbb{R}^{n+1}$ defined by $C_i = B_i \times (c_i, \infty)$, where the $B_i$'s are spheres in $\mathbb{R}^n$ and $c_i > 0$ $(i = 1, 2)$. If $B_1, B_2$ are chosen to satisfy

(f fI dx,/ f fI dX;) = (Ct/C2r, 8,

1

8

2

1

then $C_1, C_2$ satisfy (13.24) for some $s \in (-\infty, 0)$. Suppose $f^{s/(1 - ns)}$ is not convex. We choose $C_1, C_2$ in this fashion to satisfy $C_i \subset A(B_i, f^{s/(1 - ns)})$ and [...] Thus we have

$$P[\alpha B_1 + (1 - \alpha) B_2] < \nu(\alpha C_1 + (1 - \alpha) C_2) = [\alpha(\nu(C_1))^s + (1 - \alpha)(\nu(C_2))^s]^{1/s} < [\alpha(P(B_1))^s + (1 - \alpha)(P(B_2))^s]^{1/s},$$

which is a contradiction. For $s \in (0, 1/(n + 1))$ the proof follows similarly by choosing $C_i = B_i \times (0, c_i)$ $(i = 1, 2)$. □

13.6. Some Properties of Log-Concave Density Functions

Log-concave density functions which satisfy (13.19) play an important role in statistics and probability. In the following we observe some known facts concerning this class of densities.

13.24. Fact. Let $X_1, \ldots, X_n$ be i.i.d. univariate random variables with a common density function $h(x)$. If $h(x)$ is a log-concave function of $x$ for $x \in \mathbb{R}$, then the joint density function of $(X_1, \ldots, X_n)$ is a log-concave function of $x$ for $x \in \mathbb{R}^n$.

13.25. Fact. If $f(x) = g(T(x))$ where $g: \mathbb{R} \to [0, \infty)$ is decreasing and $T(x)$ is a convex function of $x$ for $x \in \mathbb{R}^n$, then $f$ is a log-concave function of $x$ for $x \in \mathbb{R}^n$.

The following theorem, due to Prekopa (1971) and Brascamp and Lieb (1975), shows that the integral of a log-concave function is log-concave:

13.26. Theorem. Let $f(x, y): \mathbb{R}^{n+m} \to [0, \infty)$ be a log-concave function of $(x, y)$ for $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. Then the function $g: \mathbb{R}^n \to [0, \infty)$ given by

$$g(x) = \int_{\mathbb{R}^m} f(x, y)\,dy \tag{13.27}$$

is log-concave.

Proof. We adopt the proof in Brascamp and Lieb (1975). First note that it suffices to prove the theorem for $m = n = 1$ because the general case follows by Fubini's theorem and induction. Let $x_1, x_2$ be two points in $\mathbb{R}$ such that $g(x_1) g(x_2) \ne 0$. For convenience we may assume that

$$\sup_y f(x_1, y) = \sup_y f(x_2, y);$$

because otherwise we can replace $f(x, y)$ by $e^{bx} f(x, y)$ for suitably chosen $b$ and the problem remains unchanged. For each fixed $\lambda > 0$, denote

$$C_1(\lambda) = \{(x, y) : f(x, y) \ge \lambda\} \subset \mathbb{R}^2, \qquad C_2(x, \lambda) = \{y : f(x, y) \ge \lambda\} \subset \mathbb{R}.$$

Then, by log-concavity of $f$, $C_1(\lambda)$ is convex and $C_2(x, \lambda)$ is an interval. (For the convexity of $C_1(\lambda)$, see Fact 13.28.) Letting $v(x, \lambda) = \int_{C_2(x, \lambda)} dy$ be the Lebesgue measure of the set $C_2(x, \lambda)$, we have, by Theorem 13.18,

$$v(\alpha x_1 + (1 - \alpha) x_2, \lambda) \ge \alpha v(x_1, \lambda) + (1 - \alpha) v(x_2, \lambda)$$

for all $\alpha \in [0, 1]$. Since $g(x)$ can be expressed as $g(x) = \int_0^\infty v(x, \lambda)\,d\lambda$, we have

$$g(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha g(x_1) + (1 - \alpha) g(x_2) \ge (g(x_1))^{\alpha} (g(x_2))^{1 - \alpha}$$

for all $\alpha \in [0, 1]$, where the second inequality follows from the arithmetic mean-geometric mean inequality. □

A simple application of Theorem 13.26 is (Brascamp and Lieb, 1975; see also Barlow and Proschan, 1981, p. 104):

13.27. Corollary. The convolution of two log-concave density functions in $\mathbb{R}^n$ is log-concave.

Proof. Let $f_1, f_2$ be log-concave density functions. Then $f_1(x - y) f_2(y)$ is jointly log-concave in $(x, y) \in \mathbb{R}^{2n}$. Thus by Theorem 13.26

$$g(x) = \int_{\mathbb{R}^n} f_1(x - y) f_2(y)\,dy$$

is log-concave. □
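Corollary 13.27 can be illustrated with two uniform densities on $[0, 1]$, each log-concave as the indicator of a convex set. Their convolution is the triangular density on $[0, 2]$, a standard closed form. The sketch below compares a midpoint-rule approximation of the convolution integral with that closed form and then tests midpoint log-concavity on a grid; the grid, step count, and tolerances are assumptions of this illustration.

```python
import math


def uniform_density(x):
    """Density of U[0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def triangular_density(x):
    """Closed-form convolution of two U[0, 1] densities."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def numeric_convolution(x, steps=2000):
    """Midpoint-rule approximation of the integral of f1(x - y) * f2(y) over y in [0, 1]."""
    h = 1.0 / steps
    return h * sum(uniform_density(x - (k + 0.5) * h) for k in range(steps))


def midpoint_log_concave(f, grid, tol=1e-12):
    """Test log f((u + v) / 2) >= (log f(u) + log f(v)) / 2 on all grid pairs."""
    return all(
        math.log(f((u + v) / 2)) >= (math.log(f(u)) + math.log(f(v))) / 2 - tol
        for u in grid for v in grid
    )


grid = [k / 20 for k in range(1, 40)]  # interior points of (0, 2)
max_error = max(abs(numeric_convolution(x) - triangular_density(x)) for x in grid)
is_log_concave = midpoint_log_concave(triangular_density, grid)
```

The numerical convolution agrees with the triangular density to within the discretization error, and the midpoint test passes on the whole grid.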

A density function $f$ is said to be unimodal if the set

$$D_\lambda = \{x : x \in \mathbb{R}^n,\ f(x) \ge \lambda\} \tag{13.28}$$

is a convex set in $\mathbb{R}^n$ for all $\lambda > 0$. The following facts show how log-concavity and unimodality are related.

13.28. Fact. If $f: \mathbb{R}^n \to [0, \infty)$ is a probability density function, then log-concavity of $f$ implies unimodality of $f$.

Proof. Let $x_1, x_2 \in \mathbb{R}^n$ be in $D_\lambda$. Then for every $\alpha \in [0, 1]$, we have, by (13.20),

$$\log f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha \log f(x_1) + (1 - \alpha) \log f(x_2) \ge \alpha \log \lambda + (1 - \alpha) \log \lambda \ge \log \lambda. \tag{13.29}$$

Thus $\alpha x_1 + (1 - \alpha) x_2$ is also in $D_\lambda$. □

A function $f$ is said to be Schur-concave if $y \succ x$ implies $f(x) \ge f(y)$ for all $x, y \in \mathbb{R}^n$ (see Definition 12.23). It is known that all Schur-concave functions are permutation invariant (see Theorem 12.24). Furthermore, it is known that

13.29. Fact. If $f: \mathbb{R}^n \to [0, \infty)$ is a permutation-invariant and log-concave function of $x \in \mathbb{R}^n$, then it is a Schur-concave function of $x \in \mathbb{R}^n$.

Proof. Assume $y \succ x$ and, without loss of generality, it may be assumed that $x, y$ are of the form

I and XI + Xz= YI + Yz· Let y* = (». YI, X3' ... , x n ) . where Yz<X2 ~ X satisfy the conditions in Theorem 14.2, then (14.3) where I:! is defined as in Theorem 14.1. By choosing i = 1, ... , n,

for each permutation (Jrl, ... ,

Jr n )

Xi

= tny,

for

of (1, ... ,n), and

(14.3) implies (14.1) as a special case.

0

As applications of Theorem 14.2, Marshall and Proschan (1965) gave the following results:

14.4. Theorem. If the joint distribution of $X = (X_1, \ldots, X_n)$ is permutation invariant, then

$$\frac{1}{n} E \max\{0, X_1, \ldots, X_n\} \text{ is decreasing in } n \text{ for } n = 1, 2, \ldots. \tag{14.4}$$

Proof. Immediate by letting $\phi(x_1, \ldots, x_n) = \max\{0, x_1, \ldots, x_n\}$, $a = ((n - 1)^{-1}, \ldots, (n - 1)^{-1}, 0)$ and $b = (n^{-1}, \ldots, n^{-1})$ in Theorem 14.2. □
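The monotonicity in (14.4) can be computed exactly in the i.i.d. case, which is a special case of permutation invariance. The sketch below uses a three-point distribution and the identity $P[\max \le v] = F(v)^n$; the particular probability mass function and the function name are illustrative assumptions.

```python
from fractions import Fraction as Fr


def expected_max_with_zero(pmf, n):
    """E max{0, X_1, ..., X_n} for i.i.d. X_i with finite pmf {value: prob}."""
    # Distribution of max(0, X): negative values are clipped to 0.
    clipped = {}
    for v, p in pmf.items():
        c = max(0, v)
        clipped[c] = clipped.get(c, Fr(0)) + p
    total = Fr(0)
    cdf_prev = Fr(0)
    cdf = Fr(0)
    for v in sorted(clipped):
        cdf += clipped[v]
        # P[max = v] = F(v)^n - F(v-)^n for i.i.d. components.
        total += v * (cdf ** n - cdf_prev ** n)
        cdf_prev = cdf
    return total


pmf = {-1: Fr(1, 2), 1: Fr(1, 4), 3: Fr(1, 4)}
normalized = [expected_max_with_zero(pmf, n) / n for n in range(1, 7)]
```

For this distribution the normalized values start at $1$ and $13/16$ and keep decreasing, as (14.4) asserts.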

14.5. Theorem. Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with a common distribution function $F(x)$, and let $\phi: \mathbb{R} \to \mathbb{R}$ be continuous and convex. Then (14.5) is decreasing in $n = 1, 2, \ldots$, where $F^{(n)}$ is the $n$th convolution of $F$.

Proof. This follows from Theorem 14.2 by letting $a = ((n - 1)^{-1}, \ldots, (n - 1)^{-1}, 0)$, $b = (n^{-1}, \ldots, n^{-1})$, and $X_1, \ldots, X_n$ be i.i.d. random variables with distribution function $F$. □

A different generalization of Theorem 14.1, given in Proschan and Sethuraman (1977b), concerns the multiplication of log-convex functions. Their result states:

14.6. Theorem. Let $\pi = (\pi_1, \ldots, \pi_n)$ be a permutation of $(1, 2, \ldots, n)$ and let $\sum!$ denote the summation over all such $n!$ permutations. Let $a, b$ be two n-tuples, $\psi(i, z)$ be a log-convex function in $z$ for each $i \in \{1, 2, \ldots, n\}$, and define (14.6). Then $g(a) \ge g(b)$ holds for all such $\psi$ iff $a \succ b$.

For a continuous analog of Muirhead's theorem, Ryff (1967) gave the following result:

14.7. Theorem. Let $a(s), b(s)$ be two bounded measurable functions defined on $[0, 1]$, and let $y(t)$ be a positive function defined on $[0, 1]$ such that $(y(t))^p \in L_1$ for all $p \in (-\infty, \infty)$. Let

$$g(y, a) = \int_0^1 \log\Big\{\int_0^1 [y(t)]^{a(s)}\,dt\Big\}\,ds. \tag{14.7}$$

Then

$$g(y, a) \ge g(y, b) \iff a(s) \succ b(s) \quad \text{for } s \in [0, 1]. \tag{14.8}$$

A generalization of Theorem 14.7 along the direction of Theorem 14.6 can be found in Proschan and Sethuraman (1977b):

14.8. Theorem. Let $\psi(t, z)$ defined on $[0, 1] \times (-\infty, \infty)$ be a log-convex function of $z$ for each fixed $t$. Also let $\sup_{|z| \le k} \psi(t, z)$ belong to $L_1$ for each $k < \infty$. For any bounded measurable function $a(s)$ on $[0, 1]$, define

$$g_\psi(a) = \int_0^1 \log\Big\{\int_0^1 \psi(t, a(s))\,dt\Big\}\,ds. \tag{14.9}$$

Then

$$g_\psi(a) \ge g_\psi(b) \tag{14.10}$$

holds for all such $\psi$ iff $a(s) \succ b(s)$.

14.9. Remark. Theorem 14.1 follows from Theorem 14.6 by choosing $\psi(i, z) = y_i^z$ for $i = 1, \ldots, n$. Similarly, Theorem 14.7 follows from Theorem 14.8 by choosing $\psi(t, z) = [y(t)]^z$. □

14.10. Remark. Examples of log-convex functions $\psi(t, x)$ that arise in probability, statistics, and analysis are as follows:

(i) Laplace transforms. Let $f \in L_1[0, \infty)$ and $f(z) \ge 0$ for all $z$. Then the Laplace transform $f^*(x) = \int_0^\infty f(z) e^{-xz}\,dz$ is log-convex wherever finite.

(ii) Examples of log-convex functions arising in analysis are presented in Mitrinovic (1970, pp. 18-20).

(iii) Moments of distributions. Let $f$ be a probability density on $[0, \infty)$. Then the moment $\mu_\alpha = \int_0^\infty z^\alpha f(z)\,dz$ is log-convex wherever finite (see, e.g., Theorem 13.8).

(iv) Additional examples of log-convex functions arising in stochastic processes are given in Keilson (1971). □
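Examples (i) and (iii) can be checked numerically for the exponential density $f(z) = e^{-z}$ on $[0, \infty)$, for which the standard closed forms are $\mu_\alpha = \Gamma(\alpha + 1)$ and $f^*(x) = 1/(1 + x)$ for $x > -1$. The sketch below tests midpoint log-convexity on a grid; the choice of density, the grid, and the helper names are assumptions of this illustration.

```python
import math


def log_moment_exponential(alpha):
    """log of the alpha-th moment of the Exp(1) distribution: log Gamma(alpha + 1)."""
    return math.lgamma(alpha + 1.0)


def log_laplace_exponential(x):
    """log of the Laplace transform of the Exp(1) density: -log(1 + x), x > -1."""
    return -math.log1p(x)


def midpoint_log_convex(log_h, grid, tol=1e-12):
    """Test log h((u + v) / 2) <= (log h(u) + log h(v)) / 2 on all grid pairs."""
    return all(
        log_h((u + v) / 2) <= (log_h(u) + log_h(v)) / 2 + tol
        for u in grid for v in grid
    )


grid = [0.25 * k for k in range(21)]  # points in [0, 5]

moments_ok = midpoint_log_convex(log_moment_exponential, grid)
laplace_ok = midpoint_log_convex(log_laplace_exponential, grid)
# For contrast: h(x) = 1 + x is log-concave, so the same test should fail.
contrast_ok = midpoint_log_convex(math.log1p, grid)
```

Both the moment function and the Laplace transform pass the midpoint test, while the log-concave contrast function does not.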

14.2. Moment Inequalities

Applying Theorem 14.1 to permutation-invariant nonnegative random variables, we obtain the following moment inequality (Proschan and Sethuraman, 1977a; Tong, 1977):

14.11. Theorem. Let $(Z_1, \ldots, Z_N)$ have a permutation-invariant density function such that $P[\bigcap_{j=1}^N \{Z_j \ge 0\}] = 1$, and let $a = (a_1, \ldots, a_N)$ and $b = (b_1, \ldots, b_N)$ be two N-tuples. If $a \succ b$, then

$$E \prod_{j=1}^N Z_j^{a_j} \ge E \prod_{j=1}^N Z_j^{b_j}. \tag{14.11}$$

Proof. By Theorem 14.1, for every $\omega$ in the sample space the inequality

$$\sum! \prod_{j=1}^N Z_j^{a_{\pi_j}}(\omega) \ge \sum! \prod_{j=1}^N Z_j^{b_{\pi_j}}(\omega) \tag{14.12}$$

holds, where $\sum!$ denotes the summation taken over all permutations of $(1, \ldots, N)$. The conclusion follows by taking expectations on both sides of (14.12) and using the permutation invariance property of the distribution function. □

A special case of Theorem 14.11 is

14.12. Corollary. For $\beta \in [0, \infty)$ let $\mu_\beta$ denote the $\beta$th moment of the random variable $Z$ ($\mu_0 \equiv 1$). If $Z \ge 0$ a.s. and $a \succ b$, then

$$\prod_{j=1}^N \mu_{a_j} \ge \prod_{j=1}^N \mu_{b_j}. \tag{14.13}$$

First Proof. Immediate from Theorem 14.11 by letting $Z, Z_1, Z_2, \ldots, Z_N$ be i.i.d. random variables. □

For the special case in which $a_j \ge 0$ and $b_j \ge 0$ for all $j$, a different proof is possible:

Second Proof. Without loss of generality, we may assume that $a_1 < b_1 \le b_2 < a_2$, $a_1 + a_2 = b_1 + b_2$ and $a_i = b_i$ $(i = 3, \ldots, N)$. The conclusion then follows immediately from Theorem 13.8. □

14.13. Remark. Corollary 14.12 asserts that the function $\psi(a) = \prod_{j=1}^N \mu_{a_j}$ is a Schur-convex function of $a$. This result, in fact, is equivalent to Theorem 13.8 and is closely related to a known classic result in Hardy, Littlewood, and Pólya (1934, 1952, p. 72). It will be used in the rest of the chapter to derive probability inequalities for a class of positively dependent random variables. □
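Both the pointwise inequality (14.12) and the moment inequality (14.13) can be verified exactly on small examples. The sketch below evaluates the permutation sums for one fixed nonnegative vector and then compares moment products for the uniform distribution on $[0, 1]$, whose $\beta$th moment is $1/(\beta + 1)$. The vectors, the choice of distribution, and the function names are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import permutations


def muirhead_sum(y, a):
    """Sum over all permutations pi of prod_j y[pi_j] ** a[j]."""
    total = 0
    for pi in permutations(range(len(y))):
        term = 1
        for j, idx in enumerate(pi):
            term *= y[idx] ** a[j]
        total += term
    return total


def uniform_moment(beta):
    """beta-th moment of U[0, 1] for a nonnegative integer beta."""
    return Fr(1, beta + 1)


def moment_product(a):
    """prod_j mu_{a_j} for the U[0, 1] distribution."""
    p = Fr(1)
    for aj in a:
        p *= uniform_moment(aj)
    return p


a = (2, 1, 0)
b = (1, 1, 1)  # a majorizes b
y = (1, 2, 3)  # one fixed realization of nonnegative values

sum_a = muirhead_sum(y, a)
sum_b = muirhead_sum(y, b)
prod_a = moment_product(a)
prod_b = moment_product(b)
```

Here the permutation sums are $48 \ge 36$ and the moment products are $1/6 \ge 1/8$, both in the direction predicted for $a \succ b$.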

14.3. Additional Inequalities for Exchangeable Random Variables

Applying Corollary 14.12 to exchangeable random variables (defined in Definition 13.11), we have the following result (Tong, 1977):

14.14. Theorem. Let $X_1, \ldots, X_n$ be exchangeable random variables and let $a = (a_1, \ldots, a_N)$ and $b = (b_1, \ldots, b_N)$ be vectors of nonnegative integers such that $\sum_{j=1}^N a_j = \sum_{j=1}^N b_j = n$. Let $B \subset \mathbb{R}$ be an arbitrary but fixed Borel-measurable set, and denote

$$\gamma(k) = P\Big[\bigcap_{i=1}^k \{X_i \in B\}\Big], \qquad k = 1, \ldots, n,$$

as defined in (13.15), where $\gamma(0) \equiv 1$. If $a \succ b$, then

$$\prod_{j=1}^N \gamma(a_j) \ge \prod_{j=1}^N \gamma(b_j). \tag{14.14}$$

Proof. Following the line of argument given in the proof of Theorem 13.16, from Corollary 14.12 we have

N

= Il ErQj(V) j ~ 1

Jl N

~

Jl N

Erbj(V)

=

rt b

E[ E

IB(g(V;, v)) I V

=

v]

N

=

n y(bJ

j=1

where r( v) = E {IB(g( VI, v)) I V

= v} is the

conditional expectation.

o

Note that Theorem 14.14 implies Theorem 13.16, but the converse is false. To see this, consider the inequality

$$\gamma(n - 1)\gamma(1) \ge \gamma(n - 2)\gamma(2), \qquad n \ge 3,$$

which follows from $(n - 1, 1) \succ (n - 2, 2)$ and Theorem 14.14. But Theorem 13.16 fails to apply.

A special result from Theorem 14.14 and $(n, 0, \ldots, 0) \succ (1, 1, \ldots, 1)$ is that

$$P\Big[\bigcap_{i=1}^n \{X_i \in B\}\Big] \ge \prod_{i=1}^n P[X_i \in B]. \tag{14.15}$$
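Inequalities (14.14) and (14.15) can be seen at work in a simple exchangeable model. In the sketch below the $X_i$ are conditionally i.i.d. Bernoulli($p$) given a random $p$, with $B = \{1\}$, so that $\gamma(k) = E[p^k]$; the two-point mixing distribution and the function names are assumptions made for the illustration.

```python
from fractions import Fraction as Fr

# Mixing distribution of the success probability p: value -> weight (an arbitrary choice).
MIXING = {Fr(1, 4): Fr(1, 2), Fr(3, 4): Fr(1, 2)}


def gamma_k(k):
    """gamma(k) = P[X_1 = 1, ..., X_k = 1] = E[p^k] for the mixture model."""
    return sum(w * p ** k for p, w in MIXING.items())


def gamma_product(a):
    """prod_j gamma(a_j)."""
    out = Fr(1)
    for aj in a:
        out *= gamma_k(aj)
    return out


lhs_14 = gamma_product((3, 1))  # a = (3, 1) majorizes b = (2, 2)
rhs_14 = gamma_product((2, 2))
joint_prob = gamma_k(4)         # P[all four X_i in B]
independent_bound = gamma_k(1) ** 4
```

With these weights $\gamma(1) = 1/2$, $\gamma(2) = 5/16$, $\gamma(3) = 7/32$ and $\gamma(4) = 41/256$, so that $\gamma(3)\gamma(1) = 7/64 \ge 25/256 = \gamma(2)^2$ and $\gamma(4) \ge \gamma(1)^4 = 1/16$.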

Since the right-hand side of (14.15) corresponds to the case of independent $X_1, \ldots, X_n$, this result can be restated as:

14.15. Fact. Let $X_1, \ldots, X_n$ be exchangeable random variables and let $Y_1, \ldots, Y_n$ be i.i.d. random variables such that $X_i, Y_i$ have a common marginal distribution. Then

$$P\Big[\bigcap_{i=1}^n \{X_i \in B\}\Big] \ge P\Big[\bigcap_{i=1}^n \{Y_i \in B\}\Big] \tag{14.16}$$

holds for all Borel-measurable sets $B \subset \mathbb{R}$.

A question of interest is whether the inequality in (14.16) also holds when the $Y_i$'s are less positively dependent than the $X_i$'s in a certain fashion. This leads to the problem of a partial ordering of positive dependence of exchangeable random variables, a natural extension from the comparison between positively dependent random variables and i.i.d. random variables to the comparison between two sets of exchangeable random variables. Rinott and Pollak (1980) and Shaked and Tong (1985), among others, studied this problem and obtained results under reasonable assumptions. A special result they obtained is for exchangeable normal variables (Rinott and Pollak, 1980, for $n = 2$; Shaked and Tong, 1985, for general $n$):

14.16. Fact. Let $X_1, \ldots, X_n$ be exchangeable normal variables with a common mean $\mu$, a common variance $\sigma^2$, and a common correlation coefficient $\rho_2$. Let $Y_1, \ldots, Y_n$ be exchangeable normal variables with a common mean $\mu$, a common variance $\sigma^2$, and a common correlation coefficient $\rho_1$. If $0 \le \rho_1 < \rho_2$, then $E\prod_{i=1}^n \phi(X_i) \ge E\prod_{i=1}^n \phi(Y_i)$ holds for all Borel-measurable functions $\phi: \mathbb{R} \to [0, \infty)$ such that the expectations exist. Consequently (14.16) holds for all Borel-measurable sets $B$.

14.17. Remark. For $n = 2$, $E\prod_{i=1}^2 \phi(X_i) \ge E\prod_{i=1}^2 \phi(Y_i)$ holds iff

$$\mathrm{Corr}(\phi(X_1), \phi(X_2)) \ge \mathrm{Corr}(\phi(Y_1), \phi(Y_2)) \tag{14.17}$$

holds, which is the motivation given by Rinott and Pollak (1980) for studying this problem. Furthermore, if k", then (14.20) holds for all Borel-measurable functions : IR ~IR such that the expectations exist. Proof,

The proof is as for Theorem 14.18 since 1k> 1k*.

D

In certain applications the special case $k^* = (1, \ldots, 1)$ is of great interest. In the following corollary we show that if the vector $k$ contains only even integers, then again the condition that $\phi \ge 0$ can be removed.

14.20. Corollary. Let $\{U_i\}_1^n$, $\{V_i\}_1^n$, $W$, and $g$ satisfy the conditions in Theorem 14.18. Let $k, k^*$ be two n-tuples such that $k^* = (1, \ldots, 1)$ and the components of $k$ are nonnegative even integers such that $\sum_{i=1}^n k_i = n$. Then (14.20) holds for all Borel-measurable functions $\phi: \mathbb{R} \to \mathbb{R}$.

Proof. For every fixed $W = w$ the function $r$ defined in (14.22) satisfies (again from Corollary 14.12)

$$\prod_{j=1}^r E r^{k_j}(V_j, w) \ge [E r^2(V_1, w)]^{n/2} \ge [E r(V_1, w)]^n.$$

The conclusion then follows after unconditioning. □

As a special consequence, we observe

14.21. Corollary. Let $\{U_i\}_1^n$, $\{V_i\}_1^n$, $W$, and $g$ satisfy the conditions in Theorem 14.18, and let $\xi(k)$ be the random vector defined in (14.19). Let $k = (n, 0, \ldots, 0)$, $k^* = (1, \ldots, 1)$, and let $n$ be an even positive integer. Then (14.16) holds for all Borel-measurable functions $\phi: \mathbb{R} \to \mathbb{R}$.

In certain applications to be discussed in Section 14.5 we restrict our attention to a family of random variables such that $\xi(k)$ is obtained by choosing $k = (s, 1, \ldots, 1, 0, \ldots, 0)$ in (14.18). For notational convenience we shall denote this random vector by $\xi(s)$. The following corollary shows how the components of $\xi(s)$ depend on $s$. Its proof follows immediately from Theorem 14.18 and is omitted.

14.22. Corollary. Let $\{U_i\}_1^n$, $\{V_i\}_1^n$, $W$, and $g$ satisfy the conditions in Theorem 14.18. For given $s \ge 1$, let $\xi(s) = (\xi_1, \ldots, \xi_n)$ be the random vector obtained according to (14.18) by choosing

$$k_1 = s, \qquad k_2 = \cdots = k_{n-s+1} = 1, \qquad k_{n-s+2} = \cdots = k_n = 0. \tag{14.23}$$

Let $X = \xi(s + 1)$ and $Y = \xi(s)$. Then (14.20) holds for all such $\phi \ge 0$, for all $n$, and all $s < n$.

14.5. Applications to Special Families of Random Variables and Distributions

In this section we state some applications of the main results in Section 14.4 for obtaining inequalities via this partial ordering of positive dependence for several families of random variables and distributions (see Tong, 1989).

14.5.1. Exchangeable Random Variables

Consider the random variables defined for $i = 1, \ldots, n$ by

$$X_i = g(U_i, V_1, W), \qquad Y_i = g(U_i, V_i, W).$$

Then $X_1, \ldots, X_n$ are exchangeable and $Y_1, \ldots, Y_n$ are exchangeable. But for $k = (n, 0, \ldots, 0)$ and $k^* = (1, \ldots, 1)$, we have $X \overset{d}{=} \xi(k)$ and $Y \overset{d}{=} \xi(k^*)$. Thus a partial ordering of positive dependence can be obtained by applying Theorem 14.18 and the related results given in Section 14.4. By applying this result to the exchangeable normal, t, chi-square, gamma, F, and exponential variables, we obtain many useful inequalities as special cases. The multivariate normal variables will be treated separately in this section. Exchangeable exponential variables can be obtained by taking $g(u, v, w) = \min(u, v, w)$, as considered previously by Marshall and Olkin (1967), and have an important application in reliability theory.

14.5.2. Distributions with the Semigroup Property

Let $\{f_\theta(x): \theta \in \Omega\}$ denote a family of density functions, and assume that $\Omega$ is an interval of real numbers or an interval of integers. It is said to possess the semigroup property (see, e.g., Proschan and Sethuraman, 1977a) if $\theta', \theta'' \in \Omega$ implies $\theta' + \theta'' \in \Omega$ and the convolution $f_{\theta'}(x) * f_{\theta''}(x) = f_{\theta' + \theta''}(x)$.

14.23. Application. Let $X_{\theta,1}, \dots, X_{\theta,n}$ denote i.i.d. random variables with density $f_\theta(x)$, and for fixed $\theta_0$ and $\theta_0 - \theta \in \Omega$ let $X_{\theta_0 - \theta}$ denote another independent random variable with density $f_{\theta_0 - \theta}(x)$. Next define an n-dimensional random vector $X(\theta) = (X_1, \dots, X_n)$ such that $X_i = X_{\theta,i} + X_{\theta_0 - \theta}$ for $i = 1, \dots, n$. If $\{f_\theta(x): \theta \in \Omega\}$ possesses the semigroup property and if $\theta_1, \theta_2 \in \Omega$, $\theta_1 \neq \theta_2$ implies $|\theta_1 - \theta_2| \in \Omega$, then

(a) $E_\theta \prod_{i=1}^n \phi(X_i)$ is a decreasing function of $\theta$ for $\theta < \theta_0$ for all Borel-measurable functions $\phi \geq 0$ (provided that the expectation exists);
(b) $E_\theta \prod_{i=1}^n \phi(X_i)$ is a decreasing function of $\theta$ for $\theta < \theta_0$ for all even positive integers $n$ and all Borel-measurable functions $\phi$;
(c) $P_\theta[X_1 \in B, \dots, X_n \in B]$ is a decreasing function of $\theta$ for $\theta < \theta_0$ for all Borel-measurable sets $B \subset \mathbb{R}$.

Proof. For every fixed $\theta_1, \theta_2 \in \Omega$ such that $\theta_1 < \theta_2 < \theta_0$, define $U_i = X_{\theta_1, i}$, $V_i = X_{\theta_2 - \theta_1, i}$ ($i = 1, \dots, n$) and $W = X_{\theta_0 - \theta_2}$. The conclusions follow from Theorem 14.18. □

Note that Application 14.23 applies to the binomial, gamma, Poisson distributions, the Poisson process, and several other distributions.
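Part (c) of Application 14.23 can be checked numerically for the Poisson family, which has the semigroup property. The sketch below is only an illustration under stated assumptions: the choices $\theta_0 = 3$, $n = 3$, and $B = \{0, 1, 2\}$ are arbitrary, and the function names are hypothetical. It conditions on the shared component $W \sim$ Poisson($\theta_0 - \theta$) and evaluates the probability exactly.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P[N = k] for N ~ Poisson(mean)."""
    return exp(-mean) * mean**k / factorial(k)

def poisson_cdf(k, mean):
    """P[N <= k]; zero when k < 0."""
    return sum(poisson_pmf(j, mean) for j in range(k + 1)) if k >= 0 else 0.0

def prob_all_at_most(c, theta, theta0, n):
    """P[X_1 <= c, ..., X_n <= c] with X_i = A_i + W, where the A_i are
    i.i.d. Poisson(theta) and W ~ Poisson(theta0 - theta) is shared by all i."""
    return sum(
        poisson_pmf(w, theta0 - theta) * poisson_cdf(c - w, theta) ** n
        for w in range(c + 1)
    )

thetas = [0.5, 1.0, 1.5, 2.0, 2.5]          # all below theta0 = 3
probs = [prob_all_at_most(c=2, theta=t, theta0=3.0, n=3) for t in thetas]
for t, p in zip(thetas, probs):
    print(f"theta = {t:.1f}  P = {p:.4f}")
```

As $\theta$ grows, less of each $X_i$ comes from the shared component, so the printed probabilities should not increase.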

14.5.3. The Multivariate Normal Distribution

Applying Theorem 14.18 we now show how the positive dependence of a multivariate normal variable with a common marginal distribution can be partially ordered via the correlation coefficients.

14.24. Application. Let $0 \leq \rho_1 < \rho_2 \leq 1$, let $k$ and $k^*$ be two n-tuples of nonnegative integers given as in (14.18), and let $R = R(k) = (\rho_{ij})$ be a correlation matrix such that for $i \neq j$,

$$\rho_{ij} = \begin{cases} \rho_2 & \text{if } 1 \leq i, j \leq k_1,\ k_1 + 1 \leq i, j \leq k_1 + k_2,\ \dots,\ \sum_{m=1}^{r-1} k_m + 1 \leq i, j \leq n; \\ \rho_1 & \text{otherwise.} \end{cases}$$

[That is, the random variables $X_1, \dots, X_n$ are partitioned into $r$ groups of sizes $k_1, \dots, k_r$, respectively; the correlation coefficients of the variables within the same group are $\rho_2$, and the correlation coefficients between groups are $\rho_1$.] Let $X \sim N_n(\boldsymbol{\mu}, \sigma^2 R(k))$ and $Y \sim N_n(\boldsymbol{\mu}, \sigma^2 R(k^*))$, where $\boldsymbol{\mu} = (\mu, \dots, \mu)$. (a) If $k \succ k^*$, then (14.20) holds for all such
$k^*$ and the components of $k$, $k^*$ are even integers, then (14.20) holds for all such Borel-measurable functions

$(3, 1, 0, 0) \succ (2, 2, 0, 0) \succ (1, 1, 1, 1)$. □

A special consequence of Application 14.24 is the result given in Fact 14.16 for exchangeable normal variables. For other related applications in the multivariate normal distribution, see Tong (1990, Chapter 7).
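The block structure of the correlation matrix $R(k)$ in Application 14.24 can be made concrete with a short sketch. It assumes, as in the examples above, that the entries of $k$ are group sizes summing to $n$ with zeros standing for empty groups; the function name and the values of $\rho_1$, $\rho_2$ are illustrative only.

```python
def group_correlation_matrix(k, rho1, rho2):
    """Build R(k): rho2 within a group, rho1 between groups, 1 on the diagonal.
    k lists the group sizes; zero entries are empty groups."""
    n = sum(k)
    # label[i] = index of the group that variable i belongs to
    label = [g for g, size in enumerate(k) for _ in range(size)]
    return [
        [1.0 if i == j else (rho2 if label[i] == label[j] else rho1)
         for j in range(n)]
        for i in range(n)
    ]

for row in group_correlation_matrix((2, 2, 0, 0), rho1=0.2, rho2=0.6):
    print(row)
```

With $k = (4, 0, 0, 0)$ every off-diagonal entry equals $\rho_2$, and with $k = (1, 1, 1, 1)$ every off-diagonal entry equals $\rho_1$, which matches the two extremes of the majorization chain.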

Chapter 15 Arrangement Ordering

In this chapter we present the theory of a relatively new partial ordering of vectors (of length n, unless otherwise indicated) based on the n! permutations of their elements. We consider functions that increase as the arrangement of vector elements becomes more ordered. As an unexpected bonus, we find that we can obtain as special cases majorization and Schur functions, totally positive functions of order two, and positive set functions. Our discussion is largely based on Hollander, Proschan, and Sethuraman (1977). Applications of this arrangement ordering theory in probability, statistics, reliability, and rearrangement inequality mathematics are presented in Chapters 16 and 17.

15.1. Definitions and Basic Properties

Let $S$ be the group of permutations of $(1, 2, \dots, n)$. A member of $S$ is denoted by $\pi = (\pi_1, \dots, \pi_n)$. The product operation is the composition of $\pi$ and $\pi' \in S$; i.e.,

$$\pi \circ \pi'(i) = \pi(\pi'(i)), \quad i = 1, \dots, n, \qquad (15.1)$$

where $\pi'(i) = \pi'_i$. Thus $S$ is a noncommutative group. The identity element is $e = (1, \dots, n)$.
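The product (15.1) is easy to sketch in code. The representation below (a tuple holding the values 1, ..., n) and the function name are illustrative choices, not part of the text.

```python
def compose(pi, pi_prime):
    """Return pi o pi' as in (15.1): (pi o pi')(i) = pi(pi'(i)).
    Permutations are tuples of the values 1..n, so pi(i) = pi[i - 1]."""
    return tuple(pi[pi_prime[i] - 1] for i in range(len(pi)))

identity = (1, 2, 3)
pi = (2, 1, 3)        # swaps 1 and 2
pi_prime = (1, 3, 2)  # swaps 2 and 3

print(compose(pi, pi_prime))
print(compose(pi_prime, pi))
```

The two products differ, which illustrates the noncommutativity of $S$ for $n = 3$.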

15.1. Definitions. (a) Let $\pi$ and $\pi'$ be two members of $S$ such that $\pi'$ contains exactly one inversion of a pair of coordinates that occur in the natural order in $\pi$; e.g.,

$$\pi = (\pi_1, \dots, \pi_i, \dots, \pi_j, \dots, \pi_n), \qquad (15.2)$$

$$\pi' = (\pi_1, \dots, \pi_j, \dots, \pi_i, \dots, \pi_n), \qquad (15.3)$$

where $i < j$ and $\pi_i < \pi_j$. We say that $\pi'$ is a simple transposition of $\pi$; in symbols, $\pi \overset{t}{>} \pi'$.

(b) Let $\pi$ and $\pi'$ be two elements in $S$ such that there exists a finite number of elements $\pi^0, \pi^1, \dots, \pi^k$ in $S$ satisfying $\pi = \pi^0 \overset{t}{>} \pi^1 \overset{t}{>} \dots \overset{t}{>} \pi^k = \pi'$; i.e., $\pi'$ is obtained from $\pi$ by a finite number of simple transpositions. We say that $\pi$ is better arranged than $\pi'$; in symbols, $\pi \overset{a}{\geq} \pi'$. Note that the elements of $S$ are partially ordered by arrangement.

15.2. Definition. A function $f$ from $S$ into $\mathbb{R}$ is increasing in arrangement or arrangement increasing (AI) if $\pi \overset{a}{\geq} \pi'$ implies that $f(\pi) \geq f(\pi')$ for $\pi, \pi'$ in $S$. In earlier papers on the subject, the term "decreasing in transposition" was used rather than "arrangement increasing."

15.3. Remark. From Definitions 15.1 and 15.2, it is clear that the simplest way to obtain a result for AI functions is to consider a pair of permutations $\pi \overset{t}{>} \pi'$ that by definition differ only in two coordinates. Thus we will see that proofs throughout are generally brief and simple. In the same vein, we will see that generalization and strengthening of certain well known rearrangement inequalities may be achieved by arguments shorter and simpler than those used in the original proofs. See, e.g., results in Section 15.2. □

15.4. Examples. The following functions are AI functions from $S$ into $\mathbb{R}$:

(a) $f_1(\pi) = -(\pi_1 + \dots + \pi_k)$ for $1 \leq k \leq n$.
(b) $f_2(\pi) = \sum_{i=1}^n a_i \pi_i$, where $a_1 \leq \dots \leq a_n$.
(c) $f_3(\pi) = \prod_{i=1}^n g(\lambda_i, \pi_i)$, where $\lambda_1 \leq \dots \leq \lambda_n$, and $g(\lambda, i)$ is totally positive of order 2 (TP$_2$) for $-\infty < \lambda < \infty$, $i = 1, \dots, n$.
(d) $f_4(\pi) = \sum_{i=1}^n g(\lambda_i, \pi_i)$, where $\lambda_1 \leq \dots \leq \lambda_n$, and $g(\lambda, i)$ is a nonnegative set function; i.e., $-\infty < \lambda_1 < \lambda_2 < \infty$, $1 \leq i_1 < i_2 \leq n$ implies that $g(\lambda_1, i_1) - g(\lambda_1, i_2) - g(\lambda_2, i_1) + g(\lambda_2, i_2) \geq 0$.
(e) $f_5(\pi) = 1$ if $f(\pi) \geq a$ and $f_5(\pi) = 0$ if $f(\pi) < a$, where $f$ is an AI function on $S$. □
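Remark 15.3 suggests a direct way to test the AI property on a small symmetric group: it suffices to compare each permutation with its simple transpositions. The following brute-force sketch does this for Examples 15.4(a) and 15.4(b) with $n = 4$; the weights `a`, the helper names, and the counterexample `not_ai` are illustrative assumptions.

```python
from itertools import permutations

def simple_transpositions(pi):
    """Yield every pi' obtained from pi by inverting one pair (i, j),
    i < j, whose entries are in natural order (pi_i < pi_j)."""
    for i in range(len(pi)):
        for j in range(i + 1, len(pi)):
            if pi[i] < pi[j]:
                swapped = list(pi)
                swapped[i], swapped[j] = swapped[j], swapped[i]
                yield tuple(swapped)

def is_arrangement_increasing(f, n):
    """Check f(pi) >= f(pi') for every simple transposition on S_n.
    By Definition 15.1(b) this is enough for the full ordering."""
    return all(
        f(pi) >= f(pi_prime)
        for pi in permutations(range(1, n + 1))
        for pi_prime in simple_transpositions(pi)
    )

a = (0.5, 1.0, 2.0, 3.5)                             # increasing weights
f1 = lambda pi: -sum(pi[:2])                         # Example 15.4(a), k = 2
f2 = lambda pi: sum(ai * p for ai, p in zip(a, pi))  # Example 15.4(b)
not_ai = lambda pi: -f2(pi)                          # sign flipped

print(is_arrangement_increasing(f1, 4))
print(is_arrangement_increasing(f2, 4))
print(is_arrangement_increasing(not_ai, 4))
```

For $f_2$ the check reduces to $(a_i - a_j)(\pi_i - \pi_j) \geq 0$ whenever $i < j$ and $\pi_i < \pi_j$, which is why the sign-flipped function fails.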

Thus far we have considered functions of one vector argument. Next we consider functions of two vector arguments. Let $g(\lambda, x)$ be a function from $\mathbb{R}^n \times \mathbb{R}^n$ into $\mathbb{R}$. Let $\lambda \circ \pi$ denote $(\lambda_{\pi_1}, \dots, \lambda_{\pi_n})$, where $\pi$ is a permutation in $S$. We say that $g(\lambda, x)$ is permutation-invariant if

$$g(\lambda \circ \pi, x \circ \pi) = g(\lambda, x) \qquad (15.4)$$

for all $\pi \in S$; i.e., applying a common permutation to both vector arguments $\lambda$ and $x$ leaves the function $g$ unchanged.

15.5. Definition. Let $\Lambda$, $M$ be subsets of $\mathbb{R}$. We say that $g(\lambda, x)$ is arrangement increasing (AI) in $(\lambda, x)$ (or in $\lambda$ and $x$) on $\Lambda^n \times M^n$ if

(i) $g(\lambda, x)$ is permutation-invariant, and
(ii) $\lambda \in \Lambda^n$, $x \in M^n$, $\lambda_1 \leq \dots \leq \lambda_n$, $x_1 \leq \dots \leq x_n$, $\pi \overset{a}{\geq} \pi'$ implies that $g(\lambda, x \circ \pi) \geq g(\lambda, x \circ \pi')$.

Note that condition (ii) above may be replaced by the equivalent condition:

(ii') Define $f_{\lambda,x}(\pi) = g(\lambda, x \circ \pi)$, where $\lambda_1 \leq \dots \leq \lambda_n$ and $x_1 \leq \dots \leq x_n$. Then $f_{\lambda,x}(\pi)$ is AI on $S$.

In the statistical context, we may think of $\lambda$ as a parameter vector, $x$ as an outcome vector, and $g(\lambda, x)$ as the corresponding density. In a probabilistic context, we may view $g(\lambda, x)$ as the probability that a Markov chain makes a transition from permutation state $\lambda$ to permutation state $x$ in a step.

15.6. Examples. The following functions are AI functions on $\mathbb{R}^n \times \mathbb{R}^n$.

(a) $g_6(\lambda, x) = h(\lambda - x)$, where $h$ is a Schur-concave function on $\mathbb{R}^n$; $g_{6'}(\lambda, x) = h(\lambda + x)$, where $h$ is a Schur-convex function on $\mathbb{R}^n$.

(b) $g_7(\lambda, x) = \prod_{i=1}^n \phi(\lambda_i, x_i)$, where $\phi(\lambda, x)$ is TP$_2$ in $-\infty < \lambda < \infty$, $-\infty < x < \infty$. Conversely, if an AI function $g(\lambda, x)$ is of the form $\prod_{i=1}^n \phi(\lambda_i, x_i)$ with $\phi \geq 0$, then $\phi$ must be TP$_2$.

(c) $g_8(\lambda, x) = \sum_{i=1}^n \phi(\lambda_i, x_i)$, where $\phi$ is a nonnegative set function. □

We can also define AI functions on $\mathbb{R}^n$. A function $h$ defined on $M^n$ is said to be arrangement increasing (AI) on $M^n$ if for every $x \in M^n$ with $x_1 \leq \dots \leq x_n$ and for every pair $\pi, \pi' \in S$ satisfying $\pi \overset{a}{\geq} \pi'$, we have

$$h(x \circ \pi) \geq h(x \circ \pi'). \qquad (15.5)$$

Note that the corresponding function defined by

$$f_x(\pi) = h(x \circ \pi) \qquad (15.6)$$

is AI on $S$.

It is clear from the definitions above that AI is essentially a property of functions on $S$. In most situations we can put $\Lambda = \mathbb{R} = M$, though in some cases, like Corollary 15.16, and in some applications, the functions are defined only on $\Lambda^n \times M^n$, where $\Lambda$ and $M$ are proper subsets of $\mathbb{R}$. Thus it is more convenient for many theoretical and practical applications to formulate the AI property for functions on $\mathbb{R}^n \times \mathbb{R}^n$ and on $\mathbb{R}^n$. We summarize the relationships among the various domains in the following lemma. From now on we set $\Lambda = M = \mathbb{R}$, unless essential generality is to be gained by doing otherwise.

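15.7. Lemma. Let $g(\lambda, x)$ be a permutation-invariant function on $\mathbb{R}^n \times \mathbb{R}^n$. Define

(i) $g^*(x, \lambda) = g(\lambda, x)$ for $\lambda, x \in \mathbb{R}^n$,
(ii) $h_\lambda(x) = g(\lambda, x)$ for $x \in \mathbb{R}^n$, $\lambda_1 \leq \dots \leq \lambda_n$,
(iii) $f_{\lambda,x}(\pi) = g(\lambda, x \circ \pi)$ for $\lambda_1 \leq \dots \leq \lambda_n$, $x_1 \leq \dots \leq x_n$, and $\pi \in S$.

Then the following statements are equivalent:

(a) $g$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$.
(b) $g^*$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$.
(c) $h_\lambda$ is AI on $\mathbb{R}^n$ for each $\lambda$ such that $\lambda_1 \leq \dots \leq \lambda_n$.
(d) $f_{\lambda,x}$ is AI on $S$ for each $\lambda$ and $x$ such that $\lambda_1 \leq \dots \leq \lambda_n$ and $x_1 \leq \dots \leq x_n$.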

The equivalences follow immediately from the definitions of the various types of AI functions. The next lemma shows that the concept of an AI function yields as special cases such well known and useful concepts as (a) Schur-concave and Schur-convex functions, (b) total positivity of order 2, and (c) nonnegative set functions.

15.8. Lemma.

(a) Let $g(\lambda, x) = h(\lambda - x)$. Then $g$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$ iff $h$ is Schur-concave on $\mathbb{R}^n$.
(b) Let $g(\lambda, x) = h(\lambda + x)$. Then $g$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$ iff $h$ is Schur-convex on $\mathbb{R}^n$.
(c) Let $g(\lambda, x) = \prod_{i=1}^n h(\lambda_i, x_i)$. Then $g$ is AI iff $h$ is TP$_2$ in $\lambda$ and $x$.
(d) Let $g(\lambda, x) = \sum_{i=1}^n h(\lambda_i, x_i)$. Then $g$ is AI iff $h$ is a nonnegative set function.

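Proof. We give the proof of (a) only. The rest are proved similarly. Let $\lambda_1 \leq \lambda_2$ and $x_1 \leq x_2 \leq \dots \leq x_n$. Now

$$g(\lambda; x_1, x_2, \dots) = h(\lambda_1 - x_1, \lambda_2 - x_2, \dots), \qquad g(\lambda; x_2, x_1, \dots) = h(\lambda_1 - x_2, \lambda_2 - x_1, \dots),$$

and $(\lambda_1 - x_2, \lambda_2 - x_1)$ majorizes $(\lambda_1 - x_1, \lambda_2 - x_2)$. This shows that $g$ is AI iff $h$ is Schur-concave. □

15.2. Preservation Properties of Arrangement Increasing Functions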

In this section we show that the AI property is preserved under a number of basic mathematical and statistical operations.

15.9. Lemma. Let $g(\lambda, x)$ be AI on $\mathbb{R}^n \times \mathbb{R}^n$. Let $f$ and $h$ be permutation-invariant and nonnegative functions on $\mathbb{R}^n$. Then $k(\lambda, x) = f(\lambda) g(\lambda, x) h(x)$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$.

Proof. The conclusion follows immediately from the definition of an AI function. □

The AI property is preserved under mixtures:

15.10. Theorem. Let $f_\sigma$ be AI on $S$ and integrable with respect to a nonnegative measure $\mu$. Then $f(\pi) = \int f_\sigma(\pi)\, d\mu(\sigma)$ is AI.

The proof is obvious and hence omitted. A similar preservation property under mixtures holds for AI functions $g(\lambda, x)$ on $\mathbb{R}^n \times \mathbb{R}^n$ and AI functions $h(x)$ on $\mathbb{R}^n$. More interesting and useful is the fact that the AI property is preserved under composition:

15.11. Theorem. Let $g_i$ be AI on $\mathbb{R}^n \times \mathbb{R}^n$, $i = 1, 2$. Let $g(x, z) = \int \cdots \int g_1(x, y) g_2(y, z)\, d\sigma(y_1, \dots, y_n)$ be well defined, where $\int_A d\sigma(y) = \int_A d\sigma(y \circ \pi)$ for each permutation $\pi \in S$ and Borel set $A$ in $\mathbb{R}^n$. Then $g(x, z)$ is AI on $\mathbb{R}^n \times \mathbb{R}^n$.

Proof. $g(x, z)$ is obviously permutation-invariant. To complete the proof it will suffice to show that $g(x, z) - g(x, z') \geq 0$ for $x_1 \leq \dots \leq x_n$, $z_1 < z_2$, $z_1' = z_2$, $z_2' = z_1$, and $z_i' = z_i$ for $i = 3, \dots, n$. Write

$$g(x, z) - g(x, z') = \int \cdots \int [g_1(x; y_1, y_2, \dots) g_2(y_1, y_2, \dots; z_1, z_2, \dots) - g_1(x; y_1, y_2, \dots) g_2(y_1, y_2, \dots; z_2, z_1, \dots)]\, d\sigma(y), \qquad (15.7)$$

where the "..." indicates natural ordering of the omitted arguments. Breaking up the region of integration into the two regions $y_1 < y_2$ and $y_1 \geq y_2$, and making a change of variable in the second region yields:

g(x, z) - g(x, z')

=

J... J[gI(x; YI , Yz, ... )gZ(YI, Yz, ... ; Zl, z«, ... ) y,

)] da(y)

J.. -J [gl(X; y)gz(y; z) - gl(X, y)gz(y; Zz, Zl, ... ) y,

where $\lambda_j > 0$ for $j = 0, 1, \dots, n$; $\lambda = \sum_j \lambda_j$; $x_j > 0$ for $j = 1, \dots, n$.

(j) Multivariate Pareto distribution.

where $x_j > \lambda_j > 0$ for $j = 1, \dots, n$; $a > 0$.

(k) Multivariate normal distribution with a common variance and a common correlation coefficient.

$$g_{11}(\lambda, x) = (2\pi)^{-n/2} |\Sigma|^{-1/2} e^{-(1/2)(x - \lambda)\Sigma^{-1}(x - \lambda)'},$$

where $(x - \lambda)'$ is the transpose of $(x - \lambda)$ and $\Sigma$ is the positive definite covariance matrix with elements $\sigma^2$ along the main diagonal and elements $\rho\sigma^2$ elsewhere, $\rho > -1/(n - 1)$.

To verify that $g_1$, $g_2$, $g_4$, $g_5$, and $g_8$ are AI, note that $\lambda^x$ is TP$_2$, and from Lemma 15.8(c), the product $g(\lambda, x) = \prod \lambda_i^{x_i}$ of TP$_2$ functions is AI. The additional factors that appear are functions of $\sum x_i$ and thus are permutation-invariant. Hence by Lemma 15.9, the desired conclusion follows.

To verify that $g_3$, $g_6$, and $g_7$ are AI, we use a similar argument. We note that the functions $\binom{\lambda}{x}$ and $\Gamma(\lambda + x)$ are TP$_2$. The remainder of the argument is as just above.

To verify that $g_9$ is AI, we first note that $g_9$ is the joint density of $(X_j/\lambda_j)/(X_0/\lambda_0)$, $j = 1, \dots, n$, where $X_j$ has a $\chi^2$-distribution with $2\lambda_j$ degrees of freedom, $j = 0, 1, \dots, n$. For fixed outcome $X_0 = x_0$, say, the conditional density of $(X_j/\lambda_j)/(X_0/\lambda_0)$ is TP$_2$ in $\lambda_j$, $x_j$. Thus the corresponding joint density of $(X_1/\lambda_1)/(X_0/\lambda_0), \dots, (X_n/\lambda_n)/(X_0/\lambda_0)$ is AI. By unconditioning on $X_0$ and using the fact that the AI property is preserved under mixtures (Theorem 15.10), we conclude that $g_9$ is AI.

Note that $g_{10}$ is AI since $(\sum \lambda_j^{-1} x_j - n + 1)^{-(a + n)}$ is AI. □
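For the equicorrelated normal density $g_{11}$ in item (k), the AI property amounts to the quadratic form $(x - \lambda)\Sigma^{-1}(x - \lambda)'$ not decreasing when two coordinates of a sorted $x$ are interchanged against a sorted $\lambda$. The sketch below checks this numerically on random inputs. It assumes $\rho < 1$ and uses the standard closed-form inverse of $\Sigma = \sigma^2[(1 - \rho)I + \rho J]$, with $J$ the all-ones matrix; the parameter values and names are illustrative.

```python
import random

def quad_form(x, lam, sigma2, rho):
    """(x - lam) Sigma^{-1} (x - lam)' for Sigma = sigma2 * ((1 - rho) I + rho J),
    via Sigma^{-1} = [I - rho / (1 + (n - 1) rho) J] / (sigma2 * (1 - rho))."""
    n = len(x)
    d = [xi - li for xi, li in zip(x, lam)]
    sum_sq = sum(di * di for di in d)
    sq_sum = sum(d) ** 2
    return (sum_sq - rho / (1 + (n - 1) * rho) * sq_sum) / (sigma2 * (1 - rho))

random.seed(0)
n, sigma2, rho = 4, 2.0, 0.3
violations = 0
for _ in range(1000):
    lam = sorted(random.uniform(-5, 5) for _ in range(n))
    x = sorted(random.uniform(-5, 5) for _ in range(n))
    x_star = [x[1], x[0]] + x[2:]      # first two coordinates swapped
    # A small tolerance absorbs floating-point rounding.
    if quad_form(x, lam, sigma2, rho) > quad_form(x_star, lam, sigma2, rho) + 1e-9:
        violations += 1
print("violations:", violations)
```

The term involving the sum of the differences is unchanged by the swap, so the comparison reduces to $2(x_2 - x_1)(\lambda_2 - \lambda_1) \geq 0$.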

15.19. Remark. Eaton (1967) and Marshall and Olkin (1974) show that $g_{11}$ is AI. (This can be verified directly from the definition of AI by showing that $(x - \lambda)\Sigma^{-1}(x - \lambda)' \leq (x^* - \lambda)\Sigma^{-1}(x^* - \lambda)'$, where $x_1 \leq x_2 \leq \dots \leq x_n$, $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$ and $x^* = (x_2, x_1, x_3, \dots, x_n)$.) □

15.3. Arrangement Increasing Property of Overlapping Sums

In order to state the main result of this section, we need notation as follows: For $k = 2, 3, \dots, n$, let

$$\mathcal{I}_k = \{I: I \text{ is a subset of size } k \text{ from } \{1, \dots, n\}\}, \qquad (15.11)$$

$$\mathcal{I}_{k,i} = \{I \in \mathcal{I}_k: i \in I\}.$$

15.20. Theorem. Let $X = \{X_1, \dots, X_n\}$, $\{X_I, I \in \mathcal{I}_k\}$, $k = 2, \dots, n$, be independent collections of random variables. Let $X$ have an AI density function. Let the random variables in $\{X_I, I \in \mathcal{I}_k\}$ be i.i.d. and have a common log-concave density $g_k$, $k = 2, \dots, n$. For $i = 1, \dots, n$ let

$$Z_i = X_i + \sum_{k \geq 2} \sum_{I \in \mathcal{I}_{k,i}} X_I. \qquad (15.12)$$

Then $Z = (Z_1, \dots, Z_n)$ has an AI density function.
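The index sets in (15.11) and the sums in (15.12) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the exponential draws are just one convenient log-concave choice; the code does not check the hypothesis that $X$ itself has an AI density.

```python
from itertools import combinations
import random

def index_sets(n, k):
    """I_k: all size-k subsets of {1, ..., n}, as sorted tuples."""
    return list(combinations(range(1, n + 1), k))

def index_sets_containing(n, k, i):
    """I_{k,i}: the members of I_k that contain i."""
    return [I for I in index_sets(n, k) if i in I]

def overlapping_sums(x, shared):
    """Z_i = X_i + sum over k >= 2 and I in I_{k,i} of X_I, as in (15.12).
    `shared` maps each subset I (a tuple) to its value X_I."""
    n = len(x)
    return [
        x[i - 1] + sum(v for I, v in shared.items() if i in I)
        for i in range(1, n + 1)
    ]

n = 3
random.seed(1)
x = [random.expovariate(1.0) for _ in range(n)]
# One shared variable X_I per subset I of size 2, ..., n.
shared = {I: random.expovariate(1.0)
          for k in range(2, n + 1) for I in index_sets(n, k)}
print(index_sets_containing(n, 2, 1))
print(overlapping_sums(x, shared))
```

Each index $i$ belongs to $\binom{n-1}{k-1}$ sets of size $k$, so every $Z_i$ shares many summands with the other coordinates.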

15.21. Remark. Note that the summands in the $Z_1, \dots, Z_n$ overlap considerably. For example, $X_{12}$ appears in the expressions for $Z_1$ and $Z_2$, $X_{123}$ appears in the expressions for $Z_1$, $Z_2$, and $Z_3$, etc. Thus, the inheritance of the AI property of $Z$ from that of $X$ is complicated by the overlapping of the $X_I$'s. □

15.22. Remark. Theorem 15.20 takes on added interest if we note that the tempting conjecture that "the convolution of AI functions is AI" is false. An even more tempting conjecture that "the convolution of an AI density and a permutation invariant density is AI" is also false. These facts point up the need for the log-concavity of the $g_k$ density in the statement of Theorem 15.20. □

15.23. Remark. Random variables $Z_1, \dots, Z_n$ of the type specified in (15.12) arise routinely in shock models, inventory problems, biometric models, and elsewhere in multivariate statistics, where the occurrence of an event simultaneously affects two or more random variables of interest. A classical example is the multivariate exponential of Marshall and Olkin (1967), where a shock of type $(i_1, i_2, \dots, i_k)$ results in the simultaneous failure of components $i_1, \dots, i_k$. In Example 15.25 we give illustrative examples, from reliability theory in particular, in which the main theorem would apply. □

To prove the main result, we shall find it helpful to have available the following lemma:

15.24. Lemma. For some $k \geq 2$, let $\{X_I, I \in \mathcal{I}_k\}$ be i.i.d. random variables with a common log-concave density function $g$ (with respect to the counting measure on a lattice or the Lebesgue measure). Let

$$W_i = \sum_{I \in \mathcal{I}_{k,i}} X_I, \quad i = 1, \dots, n. \qquad (15.13)$$

Let $f(w_1, \dots, w_n)$ be the density function of $W = (W_1, \dots, W_n)$. Then $f$ is a Schur-concave function.

The proof is rather lengthy; we refer the reader to Hollander, Proschan, and Sethuraman (1981).

Note that the result extends the class of Schur-concave functions considerably since it allows for mutually dependent random variables. We now present:

Proof of Theorem 15.20. Let

$$X_i^{(k)} = \sum_{I \in \mathcal{I}_{k,i}} X_I, \quad i = 1, \dots, n, \quad k = 2, \dots, n.$$

Let $X^{(k)} = (X_1^{(k)}, \dots, X_n^{(k)})$, $k = 2, \dots, n$, and $X = (X_1, \dots, X_n)$. Then

$$Z = X + X^{(2)} + \dots + X^{(n)}.$$

From Lemma 15.24, the density functions of $X^{(2)}, \dots, X^{(n)}$ are all Schur-concave. Hence the density of $X^{(2)} + \dots + X^{(n)}$ is Schur-concave by Corollary 15.12. Finally, the density of $Z = X + (X^{(2)} + \dots + X^{(n)})$ is AI by Lemma 15.8 and another application of Corollary 15.12. □

15.25. Example. The density functions of the following distributions are AI densities of overlapping sums:

(a) Multivariate Exponential of Marshall-Olkin. Marshall and Olkin (1967) introduce the widely used multivariate exponential in which a shock of type $I$ causes the simultaneous failure of components in $I$, where $I$ is a subset of $\{1, \dots, n\}$. The shocks of type $I$ are governed by a Poisson process with rate $\lambda_I$. The $2^n - 1$ such Poisson processes are assumed mutually independent.

Numbers of replacements, amounts of damage cumulated, and down times for components. Assume failed components are immediately replaced. Then the numbers $Z_1^{(R)}, \dots, Z_n^{(R)}$ of replacements in a fixed interval of time have a joint multivariate Poisson distribution. (See Teicher, 1954, and Dwass and Teicher, 1957.) If $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$; $\lambda_{ij} = \lambda^{(2)}$ for all pairs $i < j$; $\lambda_{ijk} = \lambda^{(3)}$ for all triplets $i < j$

$u_i > 0, v_i > 0: i = 1, \dots, n\}$; $h_5(u, v) = \prod (u_i - v_i)^+$, where $a^+ = \max\{0, a\}$; $h_6(u, v) = -\max |u_i - v_i|$; $h_7(u, v) = -\sum |u_i - v_i|$. □

16.4. Example. Judicious selection of $h^1$ and $h^2$ from, say, Example 16.3, can yield some useful AI geometric probability functions:

(a) The rectangular probability

$$P[a_i \leq X_i \leq b_i: i = 1, \dots, n] = P[X \in [a, b]]$$

is an AI function of $(a, b)$, as can be seen by using $h^1(a, x) = h_1(a, x)$ and $h^2(x, b) = h_1(x, b)$. (See Boland, 1985, for more rectangular probabilities of this type.)

(b) $P[\sum_{i=1}^n a_i X_i \geq c_1, \sum_{i=1}^n b_i X_i \geq c_2]$ is AI in $a$ and $b$, as can be seen by letting $h^1(a, x) = h_2(a, x)$ and $h^2(x, b) = h_2(x, b)$. For example, if $X = (X_1, X_2)$ has a permutation invariant density function, then

$$P[A] = P[4X_1 + X_2 \geq 4,\ 2X_1 + 4X_2 \geq 2] \leq P[X_1 + 4X_2 \geq 4,\ 2X_1 + 4X_2 \geq 2] = P[B].$$

(See Fig. 16.1)

(c) Of course, any combination of two of the types of functions in Example 16.3 gives us a probability function which is AI in $(a, b)$, such as

$$P\Big[\sum_{i=1}^n a_i X_i \leq c_1 \text{ and } \sum_{i=1}^n (X_i - b_i)^2 \leq c_2\Big]$$

or

$$P\Big[X_i \geq a_i \text{ for all } i = 1, \dots, n \text{ and } \sum_{i=1}^n |X_i - b_i| \leq c_2\Big]. \qquad \square$$
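The two-dimensional comparison in Example 16.4(b), $P[A] \leq P[B]$, can be checked exactly for a small discrete distribution. The sketch assumes $X_1, X_2$ are i.i.d. uniform on a symmetric grid, which is one arbitrary permutation-invariant choice made for illustration; the function names are hypothetical.

```python
from itertools import product

# X_1, X_2 i.i.d. uniform on a symmetric grid: an illustrative
# permutation-invariant distribution (an assumption, not from the text).
grid = [k * 0.5 for k in range(-4, 5)]          # -2.0, -1.5, ..., 2.0

def in_A(x1, x2):
    """Event A: coefficient vectors (4, 1) and (2, 4), oppositely ordered."""
    return 4 * x1 + x2 >= 4 and 2 * x1 + 4 * x2 >= 2

def in_B(x1, x2):
    """Event B: coefficient vectors (1, 4) and (2, 4), similarly ordered."""
    return x1 + 4 * x2 >= 4 and 2 * x1 + 4 * x2 >= 2

count_A = sum(in_A(x1, x2) for x1, x2 in product(grid, repeat=2))
count_B = sum(in_B(x1, x2) for x1, x2 in product(grid, repeat=2))
total = len(grid) ** 2
print(f"P(A) = {count_A}/{total},  P(B) = {count_B}/{total}")
```

Pairing each grid point with its mirror image $(x_2, x_1)$ shows why the count for $B$ can never fall below the count for $A$ under any exchangeable distribution.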

Figure 16.1. Regions generated by AI functions.

Now let $X$ be a random vector with density function $f(x)$ that is permutation invariant, let $h^i$ be an AI function for $i = 1, 2$, and let $(a, b), (a', b') \in \mathbb{R}^n \times \mathbb{R}^n$ satisfy $(a, b) \overset{a}{\leq} (a', b')$. We use the notation $Y = (Y_1, Y_2) = (h^1(a, X), h^2(X, b))$ and $Y' = (Y_1', Y_2') = (h^1(a', X), h^2(X, b'))$. Since the distribution of $X$ is permutation invariant, $Y$ and $Y'$ have the same marginals. In other words, the distributions of the marginals of $Y$ are unaffected by a permutation of the components of $a$ or $b$, although their joint distribution may be altered.

16.5. Remark. Corollary 16.2 states that $(Y_1', Y_2')$ is in a sense more positively quadrant dependent than $(Y_1, Y_2)$. See Lehmann (1966), Barlow and Proschan (1981, Chapter 5, Section 4), and Tong (1990, Chapter 5) for concepts of dependence. □

We might say that the bivariate vector $(U_1', U_2')$ is more positively dependent than $(U_1, U_2)$ if the two vectors have the same marginal distributions and

$$\mathrm{Cov}(\phi_1(U_1'), \phi_2(U_2')) \geq \mathrm{Cov}(\phi_1(U_1), \phi_2(U_2)),$$

or equivalently,

$$E(\phi_1(U_1')\phi_2(U_2')) \geq E(\phi_1(U_1)\phi_2(U_2)),$$

for every pair $\phi_1, \phi_2$ of increasing functions. (See Rinott and Pollak, 1980; Shaked and Tong, 1985; and Tong, 1989, for a slightly different concept of more positive dependence.) Then Corollary 16.1 implies that $(Y_1', Y_2')$ is more positively dependent than $(Y_1, Y_2)$, since for any increasing $\phi_1$ and $\phi_2$,

$$\mathrm{Cov}(\phi_1(Y_1'), \phi_2(Y_2')) - \mathrm{Cov}(\phi_1(Y_1), \phi_2(Y_2)) = E(\phi_1(Y_1')\phi_2(Y_2')) - E(\phi_1(Y_1)\phi_2(Y_2)).$$

16.6. Remark. Corollary 16.1 yields moment inequalities when the distribution of $X$ is permutation invariant and $X$ takes values in $[0, \infty)^n$. We use the notation of the previous remark except that now we assume that $(a, b), (a', b') \in [0, \infty)^n \times [0, \infty)^n$ and $h^1$ and $h^2$ take nonnegative values. Let $\phi_i(y) = y^{m_i}$ for $y \geq 0$ and $m_i$ be a positive integer, $i = 1, 2$. Then Corollary 16.1 implies that when $(a, b) \overset{a}{\leq} (a', b')$,

$$E((Y_1')^{m_1}(Y_2')^{m_2}) \geq E(Y_1^{m_1} Y_2^{m_2}) \quad \text{for all } m_1, m_2 \geq 1. \qquad \square$$

16.2. Arrangement Increasing Probabilities for AI Families of Densities

Many families of multivariate densities $f_\lambda(x)$ possess the property that the function $\phi(\lambda, x) = f_\lambda(x)$ is arrangement increasing in the parameter $\lambda$ and the outcome $x$. The multinomial

$$\phi_1(\lambda, x) = N! \prod_{i=1}^n \frac{\lambda_i^{x_i}}{x_i!}$$

for 0

0 for all $i$ and $\lambda$, then $P_\lambda[\sum_{i=1}^n X_i/a_i \leq 1]$ is AI in $(\lambda, a)$, where $a \in (0, \infty)^n$.

(c) $P_\lambda[\sum_{i=1}^n (X_i - a_i)^2 \leq c]$ is AI in $(\lambda, a)$. Hence for a given $\lambda$, the probability that $X$ lies in a sphere of radius $\sqrt{c}$ with center $a = (a_1, \dots, a_n)$ increases as the order of the coordinates of $a$ becomes more similar to the order of coordinates in $\lambda = (\lambda_1, \dots, \lambda_n)$. Similarly,

are both AI in $(\lambda, a)$.

(d) If $X \in [0, \infty)^n$ with probability 1 for all $\lambda$, then $P_\lambda[\sum_{i=1}^n (X_i/a_i)^2 \leq c]$ is AI in $(\lambda, a)$, where $a \in (0, \infty)^n$.

(e) $P_\lambda[\prod_{i=1}^n (X_i - a_i)^+ \geq c]$ is AI in $(\lambda, a)$. The boundary of this region in the two-dimensional case is a hyperbola, illustrated in Fig. 16.2. □
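The sphere probability in part (c) can be evaluated exactly for a simple discrete family. The sketch assumes independent Poisson coordinates with means $\lambda_i$, whose joint density is a product of TP$_2$ factors $\lambda_i^{x_i}$ times permutation-invariant terms; the values of $\lambda$, $a$, $c$ and the truncation point are illustrative assumptions.

```python
from math import exp, factorial
from itertools import product

def poisson_pmf(k, mean):
    """P[N = k] for N ~ Poisson(mean)."""
    return exp(-mean) * mean**k / factorial(k)

def sphere_probability(lam, a, c, cutoff=40):
    """P[sum_i (X_i - a_i)^2 <= c] for independent X_i ~ Poisson(lam_i).
    Outcomes above `cutoff` are ignored; they lie outside the sphere here."""
    total = 0.0
    for x in product(range(cutoff + 1), repeat=len(lam)):
        if sum((xi - ai) ** 2 for xi, ai in zip(x, a)) <= c:
            p = 1.0
            for xi, li in zip(x, lam):
                p *= poisson_pmf(xi, li)
            total += p
    return total

lam = (1.0, 4.0)
p_similar = sphere_probability(lam, a=(1, 4), c=2)   # a ordered like lam
p_opposite = sphere_probability(lam, a=(4, 1), c=2)  # a ordered against lam
print(p_similar, p_opposite)
```

The center ordered like $\lambda$ should capture more probability than the center with its coordinates reversed.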

Figure 16.2. Illustration for Example 16.8(e).

16.9. Example. Suppose that $\{f_\lambda(x): \lambda \in \Lambda\}$ is an AI family of densities, where each $f_\lambda(x)$ has support in $[0, \infty)^n$. Now $\prod_{i=1}^n x_i^{m_i}$ is an AI function of $m$ and $x$. Hence an application of Theorem 16.3 yields that $E_\lambda \prod_{i=1}^n X_i^{m_i} = \mu'_{m_1, \dots, m_n}$ is an AI function of $\lambda$ and $m$. Similarly, $e^{xt'} = e^{\sum x_i t_i}$ is also an AI function of $x$ and $t$. Thus the multivariate moment generating function

$$E_\lambda e^{\sum X_i t_i}$$

is an AI function in $(\lambda, t)$. □

16.3. Applications to Rank Order Problems

This section is largely based on Hollander, Proschan, and Sethuraman (1977). Given a set of real numbers $\{x_1, \dots, x_n\}$, let $r_i$ denote the rank of $x_i$; i.e., $r_i = 1 + \sum_{j \neq i} I(x_i, x_j)$, where

$$I(c, d) = \begin{cases} 1 & \text{if } c > d \\ \tfrac{1}{2} & \text{if } c = d \\ 0 & \text{if } c < d. \end{cases}$$

If there are tied x's this definition yields the average of the corresponding ranks. Let $r = (r_1, \dots, r_n)$, the vector of ranks, or the rank order. Similarly, for random variables $X_1, \dots, X_n$, let $R_i$ denote the rank of $X_i$, and $R = (R_1, \dots, R_n)$.
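The rank definition above, including its treatment of ties, translates directly into code. The function names below are illustrative.

```python
def indicator(c, d):
    """I(c, d) = 1 if c > d, 1/2 if c = d, 0 if c < d."""
    if c > d:
        return 1.0
    return 0.5 if c == d else 0.0

def ranks(x):
    """r_i = 1 + sum over j != i of I(x_i, x_j); ties get average ranks."""
    return [
        1 + sum(indicator(xi, xj) for j, xj in enumerate(x) if j != i)
        for i, xi in enumerate(x)
    ]

print(ranks([3.2, 1.5, 3.2, 0.7]))
```

The two tied values 3.2 would occupy ranks 3 and 4, so each receives the average rank 3.5, while 1.5 and 0.7 receive ranks 2 and 1.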

16.10. Theorem. Let $X_1, \dots, X_n$ have joint density $\phi(\lambda, x)$, an AI function on $\mathbb{R}^n \times \mathbb{R}^n$ with vector parameter $\lambda$. Let $g(\lambda, r) = P_\lambda[R = r]$ for $r \in \mathbb{R}^n$ denote the probability of rank order $r$. Then $g(\lambda, r)$ is an AI function on $\mathbb{R}^n \times \mathbb{R}^n$.

Proof. We may write $g(\lambda, r)$ as

$$g(\lambda, r) = \int \phi(\lambda, x) J(x, r)\, d\sigma(x_1, \dots, x_n), \qquad (16.2)$$

where $\sigma$ is a permutation-invariant measure and where $J(x, r) = 1$ if $x_i$ has rank $r_i$, $i = 1, \dots, n$, and $= 0$ otherwise. Since $\phi(\lambda, x)$ is AI by hypothesis and $J(x, r)$ is AI by construction, it follows that the composition $g(\lambda, r)$ given in (16.2) is AI by Theorem 15.11. □

Thus if a set of random variables has an AI density, then the corresponding rank order has an AI frequency function.

16.11. Corollary. Let $f$ be an AI function on $\mathbb{R}^n$. Let $R$ be the rank-order of vector $X$, where $X$ has the AI density

$\pi'$, then the distribution of $f(R)$ when $X$ has parameter $\lambda \circ \pi$ is stochastically larger than the distribution of $f(R)$ when $X$ has parameter $\lambda \circ \pi'$. □

16.13. Remark. Note that Theorem 16.10 and Corollary 16.11 do not require that the AI density of $X$ be absolutely continuous. The theory easily covers ties; we simply use average ranks and thus do not require that $r$ be restricted to the set $S$. Thus subsequent applications discussed in this section also apply to multivariate discrete AI densities such as $g_1$, $g_2$, and several others in Example 15.18. □

16.14. Application (The trend problem). Let $X_i$ have TP$_2$ density $f(\lambda_i, x)$ and let $\lambda_1 \leq \dots \leq \lambda_n$. Then Theorem 1 of Savage (1957) states essentially that $g(\lambda, r) = P_\lambda[R = r]$ is an AI function.

Note that Savage's result follows from the application of Theorem 16.10. As a further application, put $U(r) = -\sum_{i=1}^m r_i$, where $1 \leq m \leq n$, and note that $U(r)$ is AI on $\mathbb{R}^n$. From Corollary 16.11 it follows that if $\lambda_1 \leq \dots \leq \lambda_n$ and $\pi \overset{a}{\geq} \pi'$, then the distribution of $U(R)$ under $\lambda \circ \pi$ is stochastically larger than the distribution of $U(R)$ under $\lambda \circ \pi'$. Restricting $\lambda_1 = \dots = \lambda_m = 1$ and $\lambda_{m+1} = \dots = \lambda_n = \lambda > 1$ in the above, we obtain a stochastic comparison result for the Wilcoxon statistic in the two-sample problem if the experimenter mistakenly counts observations from the second distribution as arising from the first distribution. These ideas are generalized and summarized in the following theorem.

16.15. Theorem. Let the random vector X have a density

y ~ - ; " l . (We call such an operation of obtaining (Xli,yZ,"" i) from (Xlj,rl- l, ... , i-I) a basic rearrangement. )

17.2. Example.

$(7, 5, 3, 1), (2, 6, 4, 8), (6, 0, 9, 3) \overset{a}{=} (1, 3, 5, 7), (8, 4, 6, 2), (3, 9, 0, 6)$
$\overset{a}{\leq} (1, 3, 5, 7), (2, 4, 6, 8), (3, 9, 0, 6)$
$\overset{a}{\leq} (1, 3, 5, 7), (2, 4, 6, 8), (3, 0, 9, 6)$
$\overset{a}{\leq} (1, 3, 5, 7), (2, 4, 6, 8), (0, 3, 9, 6)$
$\overset{a}{\leq} (1, 3, 5, 7), (2, 4, 6, 8), (0, 3, 6, 9)$.
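The chain in Example 17.2 can be reproduced step by step. The sketch assumes that a basic rearrangement, for a fixed pair of coordinates, swaps those two coordinates in every vector where they are out of order; this reading is taken from the permanent proof later in the chapter, and the function name and 0-based indexing are illustrative.

```python
def basic_rearrangement(vectors, l, m):
    """For coordinates l < m (0-based), swap the l-th and m-th entries of
    every vector whose l-th entry exceeds its m-th entry."""
    result = []
    for v in vectors:
        v = list(v)
        if v[l] > v[m]:
            v[l], v[m] = v[m], v[l]
        result.append(tuple(v))
    return result

# Start after the common reversal of all three vectors in Example 17.2.
state = [(1, 3, 5, 7), (8, 4, 6, 2), (3, 9, 0, 6)]
for l, m in [(0, 3), (1, 2), (0, 1), (2, 3)]:
    state = basic_rearrangement(state, l, m)
    print(state)
```

Each printed line corresponds to one step of the chain, ending with all three vectors in increasing order.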

□

17.3. Remarks. (a) It should be clear that $\overset{a}{\leq}$ is a partial ordering on $(\mathbb{R}^n)^s$ and that if $(x_1, \dots, x_s) \overset{a}{\leq} (z_1, \dots, z_s)$, then the components of the vectors $x_1, \dots, x_s$ are relatively less similarly arranged than the components of the vectors $z_1, \dots, z_s$. Of course if $(x_1, \dots, x_s) \overset{a}{=} (z_1, \dots, z_s)$, then the relative arrangement of the components in the vectors $x_1, \dots, x_s$ is equivalent to that of the components in the vectors $z_1, \dots, z_s$. For any $(x_1, \dots, x_s) \in (\mathbb{R}^n)^s$, it follows that

$$(x_1, \dots, x_s) \overset{a}{\leq} (x_{1\uparrow}, \dots, x_{s\uparrow}) \overset{a}{=} (x_{1\downarrow}, \dots, x_{s\downarrow}).$$

(b) For the case $s = 2$, it is clear that for any pair of vectors $x_1$ and $x_2$, we have:

$$(x_{1\uparrow}, x_{2\downarrow}) \overset{a}{\leq} (x_1, x_2) \overset{a}{\leq} (x_{1\uparrow}, x_{2\uparrow}).$$

This yields the well-known rearrangement inequality of Hardy, Littlewood, and Pólya (1934, 1952, p. 261). □

17.4. Definition. Let $D_i \subset \mathbb{R}^n$ for $i = 1, 2, \dots, s$ and let $D = D_1 \times \dots \times D_s \subset (\mathbb{R}^n)^s$. Normally we consider sets $D$ satisfying:

$$(x_1, \dots, x_s) \in D \Rightarrow (x_1 \circ \pi, \dots, x_s \circ \pi) \in D \qquad (17.2)$$

for any permutation $\pi$ of $\{1, 2, \dots, n\}$. Then a function $f: D \to \mathbb{R}$ is said to be multivariate arrangement increasing (MAI) if:

$$(x_1, \dots, x_s) \overset{a}{\leq} (z_1, \dots, z_s) \Rightarrow f(x_1, \dots, x_s) \leq f(z_1, \dots, z_s).$$

Alternatively, $f$ is said to be multivariate arrangement decreasing (MAD) if $-f$ is MAI. Note that MAI functions are permutation invariant in the sense that for any permutation $\pi$,

$$f(x_1, \dots, x_s) = f(x_1 \circ \pi, \dots, x_s \circ \pi).$$

We recognize that MAI functions of two vector arguments coincide with AI functions defined in Section 15.1. Thus Definition 17.4 represents a generalization of Definition 15.2. The next proposition is useful in obtaining many examples of MAI functions, as well as in relating several basic classes of functions. We first present two definitions. Consider the lattice $\mathbb{R}^s$ with componentwise ordering. For $x, y \in \mathbb{R}^s$, let $x \vee y = (\max(x_1, y_1), \dots, \max(x_s, y_s))$ and $x \wedge y = (\min(x_1, y_1), \dots, \min(x_s, y_s))$.

17.5. Definition. A real valued function $f$ satisfying

$$f(x \vee y) + f(x \wedge y) \geq f(x) + f(y) \qquad (17.3)$$

is called L-superadditive (or lattice superadditive). See Marshall and Olkin (1979, Sec. 6.D) for results concerning L-superadditive and L-subadditive functions.

17.6. Definition. A real valued function $f$ satisfying

$$f(x \vee y) f(x \wedge y) \geq f(x) f(y) \qquad (17.4)$$

is called multivariate totally positive of order 2 (MTP$_2$).
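Conditions (17.3) and (17.4) can be tested by brute force on a finite grid. The sketch below uses the sum of pairwise products as an L-superadditive function and its exponential as an MTP$_2$ function; these test functions, the grid, and the helper names are illustrative choices, and a grid check is of course not a proof on all of $\mathbb{R}^s$.

```python
from itertools import product
from math import exp

def join(x, y):
    """Componentwise maximum, x v y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def meet(x, y):
    """Componentwise minimum, x ^ y."""
    return tuple(min(a, b) for a, b in zip(x, y))

def is_l_superadditive(f, points, tol=1e-12):
    """Check (17.3) on every pair of grid points."""
    return all(
        f(join(x, y)) + f(meet(x, y)) >= f(x) + f(y) - tol
        for x, y in product(points, repeat=2)
    )

def is_mtp2(f, points, tol=1e-12):
    """Check (17.4) on every pair of grid points."""
    return all(
        f(join(x, y)) * f(meet(x, y)) >= f(x) * f(y) - tol
        for x, y in product(points, repeat=2)
    )

points = list(product([0, 1, 2], repeat=3))        # a small grid in R^3
pair_sum = lambda x: x[0] * x[1] + x[1] * x[2] + x[0] * x[2]
print(is_l_superadditive(pair_sum, points))
print(is_mtp2(lambda x: exp(pair_sum(x)), points))
print(is_l_superadditive(lambda x: -pair_sum(x), points))
```

A positive function is MTP$_2$ exactly when its logarithm is L-superadditive, which is why the first two checks agree and the sign-flipped function fails.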

In the following proposition, let $E = \mathbb{R}^n$ or $\mathbb{R}^n_+$.

17.7. Theorem.
(a) Let $f(x_1, \dots, x_s) = g(x_1 + x_2 + \dots + x_s)$. Then $f$ is MAI on $D = E^s$ iff $g$ is Schur-convex on $E$.
(b) Let $f(x_1, \dots, x_s) = \sum_{i=1}^n g(x_{1i}, x_{2i}, \dots, x_{si})$. Then $f$ is MAI on $D = E^s$ iff $g$ is L-superadditive on $E$.
(c) Let $f(x_1, \dots, x_s) = \prod_{i=1}^n g(x_{1i}, x_{2i}, \dots, x_{si})$, where $g > 0$. Then $f$ is MAI on $D = E^s$ iff $g$ is MTP$_2$.

Proof. (a) Let g be Schur convex on E. From the definition of O.

2: max(xl(i]" i=1

... , Xs[i))

0

17.12. Example. The permanent of an $n \times n$ matrix with positive elements is a MAD function of its columns and a MAD function of its rows.

Proof. Let $P(a_1, \dots, a_n)$ be the permanent of the $n \times n$ matrix with kth column $= a_k$. Then

$$P(a_1, \dots, a_n) = \sum_{\pi \in S_n} a_{\pi(1)1} \cdots a_{\pi(n)n}.$$

To show that the permanent is MAD, we need only show that $P(a_1, \dots, a_n) \geq P(a_1', \dots, a_n')$, where $(a_1', \dots, a_n')$ is obtained from $(a_1, \dots, a_n)$ by interchanging the $\ell$ and $m$ coordinates of each vector $a_k$ such that $a_{k\ell} > a_{km}$ (here $\ell$, $m$ are arbitrary but fixed and $\ell < m$). Without loss of generality, we assume that $a_{k\ell} \leq a_{km} \Leftrightarrow k \leq r$ for some $r$, $1 \leq r \leq n$. Let us define $S'$ to be the set of permutations $\pi$ on $\{1, \dots, n\}$ such that $\pi(\ell) < \pi(m)$. Now for any permutation $\pi$, define $\pi^*$ by $\pi^*(i) = \pi(i)$ for $i \neq \ell, m$ and $\pi^*(\ell) = \pi(m)$, $\pi^*(m) = \pi(\ell)$. It is clear that if $\pi$ is such that either $\max\{\pi(\ell), \pi(m)\} \leq r$ or $\min\{\pi(\ell), \pi(m)\} > r$, then

$$a_{\pi(1)1} \cdots a_{\pi(n)n} + a_{\pi^*(1)1} \cdots a_{\pi^*(n)n} = a'_{\pi(1)1} \cdots a'_{\pi(n)n} + a'_{\pi^*(1)1} \cdots a'_{\pi^*(n)n}.$$

On the other hand, if $\pi$ does not fall into either of these categories, it is easy to see that

$$a_{\pi(1)1} \cdots a_{\pi(n)n} + a_{\pi^*(1)1} \cdots a_{\pi^*(n)n} \geq a'_{\pi(1)1} \cdots a'_{\pi(n)n} + a'_{\pi^*(1)1} \cdots a'_{\pi^*(n)n}.$$

Hence

$$P(a_1, \dots, a_n) = \sum_{\pi \in S'} [a_{\pi(1)1} \cdots a_{\pi(n)n} + a_{\pi^*(1)1} \cdots a_{\pi^*(n)n}] \geq \sum_{\pi \in S'} [a'_{\pi(1)1} \cdots a'_{\pi(n)n} + a'_{\pi^*(1)1} \cdots a'_{\pi^*(n)n}] = P(a_1', \dots, a_n').$$

For a probabilistic interpretation of this result, suppose $n$ balls are to be thrown (independently) into $n$ urns. Let $p_k = (p_{k1}, p_{k2}, \dots, p_{kn})$ be the probability distribution of the kth ball for $k = 1, \dots, n$. That is, $p_{ki}$ = probability that ball $k$ lands in urn $i$. Then $P(p_1, p_2, \dots, p_n)$ is the probability that the $n$ balls end up in $n$ different urns. This probability function is MAD in the vectors $p_1, \dots, p_n$, and in particular,

$$P(p_1, \dots, p_n) \geq P(p_{1\uparrow}, \dots, p_{n\uparrow}), \quad \text{where } p_{k\uparrow} = (p_{k(1)}, \dots, p_{k(n)}) \text{ for each } k = 1, \dots, n. \qquad \square$$

17.2. Preservation and Closure Properties of Multivariate Arrangement Increasing Functions

The class of multivariate arrangement increasing (decreasing) functions is clearly closed under addition and under formation of mixtures (with respect to positive measures). The product of positive MAI functions is

cp is an increasing function on cp(h Xkm . We need only show that f(xt> . . . ,xs ) $f(xi, ... ,x;). Without loss of generality, we may assume that the indices k such that Xu $ Xkm are the indices k = 1, 2, ... , r, where r < s. For any vector W E ~ n , let us define w* to be the vector obtained from w by interchanging its e and m coordinates. By breaking up the region of integration into the 3 regions Ze < z.; , Ze = Zm' and Ze > Zm, we obtain

Proof.

f(xi, ... ,x;) - f(x l

- J Zf
