Group Theory: Classes, Representation and Connections, and Applications


Author: Danellis C.W. (ed.)


MATHEMATICS RESEARCH DEVELOPMENTS SERIES

GROUP THEORY: CLASSES, REPRESENTATION AND CONNECTIONS, AND APPLICATIONS

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

MATHEMATICS RESEARCH DEVELOPMENTS SERIES

Boundary Properties and Applications of the Differentiated Poisson Integral for Different Domains
Sergo Topuria
2009. ISBN: 978-1-60692-704-5

Quasi-Invariant and Pseudo-Differentiable Measures in Banach Spaces
Sergey Ludkovsky
2009. ISBN: 978-1-60692-734-2

Operator Splittings and their Applications
Istvan Farago and Agnes Havasiy
2009. ISBN: 978-1-60741-776-7

Measure of Non-Compactness for Integral Operators in Weighted Lebesgue Spaces
Alexander Meskhi
2009. ISBN: 978-1-60692-886-8

Mathematics and Mathematical Logic: New Research
Peter Milosav and Irene Ercegovaca (Editors)
2009. ISBN: 978-1-60692-862-2

Role of Nonlinear Dynamics in Endocrine Feedback
Chinmoy K. Bose
2009. ISBN: 978-1-60741-948-8

Geometric Properties and Problems of Thick Knots
Yuanan Diao and Claus Ernst
2009. ISBN: 978-1-60741-070-6

Lie Groups: New Research
Altos B. Canterra (Editor)
2009. ISBN: 978-1-60692-389-4

Lie Groups: New Research
Altos B. Canterra (Editor)
2009. ISBN: 978-1-61668-164-7 (Online Book)

Emerging Topics on Differential Geometry and Graph Theory
Lucas Bernard and Francois Roux (Editors)
2009. ISBN: 978-1-60741-011-9

Weighted Norm Inequalities for Integral Transforms with Product Kernels
Vakhtang Kokilashvili, Alexander Meskhi and Lars-Erik Persson
2009. ISBN: 978-1-60741-591-6

Group Theory: Classes, Representation and Connections, and Applications
Charles W. Danellis (Editor)
2010. ISBN: 978-1-60876-175-3


GROUP THEORY: CLASSES, REPRESENTATION AND CONNECTIONS, AND APPLICATIONS

CHARLES W. DANELLIS EDITOR

Nova Science Publishers, Inc. New York

Copyright © 2010 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical, photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175; Web Site: http://www.novapublishers.com

NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material.

Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal, medical or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Library of Congress Cataloging-in-Publication Data

Group theory : classes, representation and connections, and applications / [edited by] Charles W. Danellis.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-61324-968-0 (eBook)
1. Group theory--Juvenile literature. I. Danellis, Charles W.
QA174.2.G756 2009
512'.2--dc22
2009034517

Published by Nova Science Publishers, Inc., New York

CONTENTS

Preface

Chapter 1. Application of Symmetry Analysis to Description of Ordered Structures in Crystals
Wiesława Sikora and Lucjan Pytlik

Chapter 2. Higher Algebraic K-Theory of G-Representations for the Actions of Finite and Algebraic Groups G
Aderemi Kuku

Chapter 3. Liberal Nationalism, Citizenship and Integration
Sune Lægaard

Chapter 4. The Consideration of Rape as Torture and as Genocide: Some Implications for Group Theory
Daniela De Vito

Chapter 5. Group Work Is Not One, But a Great Many Processes – Understanding Group Work Dynamics
Eva Hammar Chiriac

Chapter 6. The Continuous Shearlet Transform in Higher Dimensions: Variations of a Theme
Stephan Dahlke, Gabriele Steidl and Gerd Teschke

Chapter 7. Exceptional Groups, Symmetric Spaces and Applications
Sergio L. Cacciatori and B. L. Cerchiai

Chapter 8. A Survey of Some Results in the Lie Group Analysis
Phillip Yam

Chapter 9. On Graph Groupoids: Graph Groupoids and Corresponding Representations
Ilwoo Cho

Chapter 10. The Group Aspect in the Physical Interpretation of General Relativity Theory
Salvatore Antoci and Dierck Ekkehard Liebscher

Chapter 11. Long Time Behaviour of the Wiener Process on a Path Group
Rémi Léandre

Index

PREFACE

Group theory studies the algebraic structures known as groups. The concept of a group is central to abstract algebra: other well-known algebraic structures, such as rings, fields, and vector spaces, can all be seen as groups endowed with additional operations and axioms. Groups recur throughout mathematics, and the methods of group theory have strongly influenced many parts of algebra. Linear algebraic groups and Lie groups are two branches of group theory that have experienced tremendous advances and have become subject areas in their own right. Various physical systems, such as crystals and the hydrogen atom, can be modelled by symmetry groups. Thus group theory and the closely related representation theory have many applications in physics and chemistry. This book gathers recent research from around the globe in the study of group theory and highlights such topics as: application of symmetry analysis to the description of ordered structures in crystals, a survey of Lie group analysis, graph groupoids and representations, and others.

Chapter 1 – A typical situation in which symmetry analysis can be applied to the description of the ordering process in crystalline solids includes the following elements: a) a well-defined high-symmetry phase, with a known symmetry space group, b) physical changes lowering the crystal symmetry to a subgroup of the original space group, c) a local physical quantity (of scalar, vector, or tensor type), responsible for the deviation from the high-symmetry phase of the crystal (the ordering quantity), and d) a wave vector k, attributed to the ordering process, which defines the relations between the values of the ordering quantity in neighboring unit cells of the crystal. The calculated basis vectors (BV-s) can be used to construct the ordering mode in the crystal by taking a linear combination of the BV-s attributed to one or more irreducible representations (IR-s). The mixing coefficients are used as components of the order parameters for a given phase transition. The final form of the solution is determined by the mathematical conditions imposed on each single-site part of the solution. For some quantities reality of the solution is the only condition, while for others additional conditions (tensor symmetry, zero trace) have to be taken into account. After imposing the conditions, the number of free parameters in the solution can be determined. The IR-s generating the solution also determine the symmetry reduction, and thus the symmetry group of the new structure and the relations between the sets of equivalent positions in the initial and final symmetry groups. The type of symmetry lowering also determines the accompanying changes (deformations) compatible with a given phase transition. The chapter includes examples of such symmetry-adapted descriptions for the most frequently encountered types of physical quantities.


Chapter 2 – Here is a brief outline of the contents of this article. As already remarked, the theme of the article is to present computations of higher K-theory of various examples of equivariant exact categories given in chapter I for the actions of finite and algebraic groups. In chapter II, the authors introduce higher algebraic K-theory of ordinary as well as equivariant exact categories, with copious examples for the actions of finite and algebraic groups. In section 2 of chapter II, the authors discuss induction techniques for higher K-theory by realizing equivariant higher K-theory as Mackey functors, yielding some explicit results on higher K-theory of group rings. Chapter III is devoted to presenting explicit computations of higher K-theory (including profinite higher K-theory) of the various examples of equivariant exact categories encountered in chapters I and II for the actions of finite and algebraic groups. Time and space prevented us from discussing actions of other groups, e.g. profinite groups and compact Lie groups, as well as group actions on objects of other categories, e.g. Waldhausen and symmetric monoidal categories. However, interested readers can consult [39] for more information.

Chapter 3 – Liberal nationalists such as Will Kymlicka and David Miller have endorsed the "revaluation of citizenship" currently expressed in more stringent naturalization requirements across western states. Kymlicka and Miller claim that such measures are "nation-building" policies and only make sense as attempts at "cultural integration" of immigrants. The chapter discusses liberal nationalism as a view that assigns normative significance to one sort of group membership, nationality, for the purpose of regulating access to another kind of group, namely the political community of citizens. The chapter discusses the ways in which the recent "revaluated" naturalization requirements might be related to the aims of liberal nationalism. The paper raises some doubts about this possibility as well, however, since "revaluated" naturalization requirements are exclusionary in a way that might be counterproductive from a liberal nationalist point of view.

Chapter 4 – This chapter determines and assesses some of the theoretical implications for rape that emerge once it is placed within the international crimes of torture and genocide. Specifically, the differences between rape as a form of torture, with its emphasis on the individual, and rape as genocide, which focuses on violations committed against the group, will form the basis of this theoretical analysis. The question therefore becomes, "does the dynamic of rape alter when it is subsumed within these complex and contrasting international crimes?" Furthermore, it will be argued that any study of group theory, as it relates to rape within the context of international law, must appreciate the relationship between rape as it affects the individual and rape as it affects the group.

Chapter 5 – This chapter aims to explain The periodic system for the understanding of group processes and to present illustrative applications of the model. A core question is whether the model can contribute valuable information and whether it is a practical tool for describing and interpreting what happens in groups during work. Earlier research has shown promising results indicating that this kind of tool can supply a better understanding of interactional dynamics in groups, not only from a scientific perspective but also from the users' applied perspective.

Chapter 6 – This note is concerned with the generalization of the continuous shearlet transform to higher dimensions. Quite recently, a first approach has been derived in [4]. The authors present an alternative version which deviates from [4] mainly by a different


generalization of the shear component. It turns out that the resulting integral transform is again associated with a square-integrable group representation.

Chapter 7 – In this article the authors provide a detailed description of a technique to obtain a simple parametrization for different exceptional Lie groups, such as G_2, F_4 and E_6, based on their fibration structure. For the compact case, the authors construct a realization which is a generalization of the Euler angles for SU(2), while for the non-compact version G_{2(2)}/SO(4) the authors compute the Iwasawa decomposition. This allows us to obtain an explicit expression for the Haar measure not only on the group manifold, but also on the cosets G_2/SO(4), G_2/SU(3), F_4/Spin(9), E_6/F_4 and G_{2(2)}/SO(4) that the authors used to find the concrete realization of the general element of the group. Moreover, as a by-product, in the simplest case of G_2/SO(4), the authors have been able to compute an Einstein metric and the vielbein. The relevance of these results in physics is discussed.

Chapter 8 – Lie group analysis as a mathematical discipline was born in the 1870s out of the brilliant work of the nineteenth-century mathematician Sophus Lie. Working together with a fellow student, Felix Klein, in Berlin during the year 1869-70, Lie conceived the notion of studying mathematical systems from the perspective of the transformation groups that leave the systems invariant. In his famous Erlanger program, Klein subsequently pursued the role of finite groups in the study of regular bodies and the theory of algebraic equations, while Lie developed his notion of continuous transformation groups and their role in the theory of differential equations. Lie's work was a tour de force of the 19th century, and today the theory of continuous groups is a fundamental tool in such diverse areas as analysis, differential geometry, number theory, differential equations, quantum mechanics, high energy physics and gauge theory. Lie's achievements are striking because he showed that many of the ad hoc methods of integration for ordinary differential equations that were in use before his time were actually direct consequences of his theory. Furthermore, Lie gave a classification of ordinary differential equations in terms of their symmetry groups, thereby identifying the full sets of equations that could be integrated or reduced to lower order. Lie's classification, in particular, showed that all second-order equations that are integrable by his methods are reducible to one of exactly four distinct canonical forms simply by a suitable change of variables. It follows that, by subjecting these four canonical equations to suitable changes of variables alone, one obtains all the known equations that are integrable by the old methods, as well as infinitely many more integrable equations that are not yet known.

Chapter 9 – The authors consider countable directed graphs and their corresponding graph groupoids, together with their canonical representations. The study of representations of graph groupoids is based on the observation of the canonical representations of categorical groupoids. The substructures of a fixed graph groupoid are considered, along with the corresponding sub-representations. In particular, the authors observe the representations and the corresponding von Neumann algebras of (i) the subgroupoids induced by the towers of full subgraphs, (ii) the quotient groupoids induced by the full-subgraph inclusion, and (iii) graph fractaloids, which are the graph groupoids with the fractal property.

Chapter 10 – When, at the end of the year 1915, both Einstein and Hilbert arrived at what were named the field equations of general relativity, both of them thought that their fundamental achievement entailed, inter alia, the realisation of a theory of gravitation whose underlying group was the group of general coordinate transformations. This group theoretical property was believed by Einstein to be a relevant one from a physical standpoint, because the


general coordinates made it possible to introduce reference frames not limited to the inertial reference frames that can be associated with the Minkowski coordinate systems, whose transformation group was perceived to be restricted to the Poincaré group. Two years later, however, Kretschmann published a paper in which the physical relevance of the group theoretical achievement in the general relativity of 1915 was denied. For Kretschmann, since any theory, whatever its physical content, can be rewritten in a generally covariant form, the group of general coordinate transformations is physically irrelevant. This is not the case, however, for the group of the infinitesimal motions that bring the metric field into itself, namely, for the Killing group. This group is physically characteristic of any given spacetime theory, since it accounts for the local invariance properties of the considered manifold, i.e., for its "relativity postulate". In Kretschmann's view, the so-called restricted relativity of 1905 is the one with the relativity postulate of largest content, because the associated Killing group coincides with the infinitesimal Poincaré group, while for the most general metric manifold of general relativity the associated Killing group happens to contain only the identity, hence the content of its relativity postulate is nil. Of course, solutions to the field equations of general relativity whose relativity postulate has a content that is intermediate between the two above-mentioned extremes exist too. They are the ones found and investigated until now by the relativists, since the a priori assumption of some nontrivial Killing invariance group generally eases

Chapter 11 – The authors show that the law of the Wiener process on a path group tends to the Haar distribution on a path group

In: Group Theory Editor: Charles W. Danellis, pp. 1-39

ISBN 978-1-60876-175-3 © 2010 Nova Science Publishers, Inc.

Chapter 1

APPLICATION OF SYMMETRY ANALYSIS TO DESCRIPTION OF ORDERED STRUCTURES IN CRYSTALS

Wiesława Sikora and Lucjan Pytlik
Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, Kraków, Poland

ABSTRACT

A typical situation in which symmetry analysis can be applied to the description of the ordering process in crystalline solids includes the following elements: a) a well-defined high-symmetry phase, with a known symmetry space group, b) physical changes lowering the crystal symmetry to a subgroup of the original space group, c) a local physical quantity (of scalar, vector, or tensor type), responsible for the deviation from the high-symmetry phase of the crystal (the ordering quantity), and d) a wave vector k, attributed to the ordering process, which defines the relations between the values of the ordering quantity in neighboring unit cells of the crystal. The symmetry analysis method is based on the decomposition of the respective full representation (permutational, structural, magnetic, quadrupolar etc.) of the crystal space group, calculated for a given wave vector k and a given set of equivalent positions, into its irreducible representations (IR-s). Such a decomposition takes place when the basis (coordinate system) used for the description of the original function space is transformed to the special symmetry-adapted basis. The decomposition is equivalent to splitting the original function space into subspaces attributed to individual IR-s. The new coordinates, called basis vectors (BV-s), can be divided into subsets that are attributed to individual IR-s and transform within the respective subspaces. Each BV is defined for all considered sites of the unit cell, but it can be split into single-site parts, with the number of elements dependent on the nature of the ordering quantity. The calculated BV-s can be used to construct the ordering mode in the crystal by taking a linear combination of the BV-s attributed to one or more IR-s. The mixing coefficients are used as components of the order parameters for a given phase transition. The final form of the solution is determined by the mathematical conditions imposed on each single-site part of the solution. For some quantities reality of the solution is the only condition, while for others additional conditions (tensor symmetry, zero trace) have to be taken into account. After imposing the conditions, the number of free parameters in the solution can be determined. The IR-s generating the solution also determine the symmetry reduction, and thus the symmetry group of the new structure and the relations between the sets of equivalent positions in the initial and final symmetry groups. The type of symmetry lowering also determines the accompanying changes (deformations) compatible with a given phase transition. The present paper includes examples of such symmetry-adapted descriptions for the most frequently encountered types of physical quantities.

INTRODUCTION

The theory of group representations was first applied many years ago to simplify the description of complex many-body physical systems. E. Wigner [1], G.J. Lubarski [2] and A.P. Cracknell [3] introduced sets of basis vectors of the irreducible representations of molecular symmetry groups as "symmetric coordinates" in calculations of molecular vibrations. Lubarski also discussed the role of irreducible representations of the crystal symmetry group in crystallographic second-order phase transitions. In the description of magnetic ordering in crystals, symmetry analysis based on the theory of groups and representations was first introduced by E.F. Bertaut [4, 5]. He obtained the symmetry-adapted ordering modes from representation analysis, by calculating the basis vectors of irreducible representations. This line of analysis was later developed by many other theoreticians, such as Izyumov [6, 7], and is known in the literature as the symmetry analysis method. By now the algorithms and procedures of symmetry analysis have been developed to the level of routine calculations, and several dedicated applications are available on the Internet [8, 9, 10, 11, 12].

The symmetry analysis method is able to predict all possible channels of structural transformation from a well-known, high-symmetry parent crystal structure to structures with reduced symmetry groups, each identical with one of the subgroups of the initial symmetry group. These symmetry transformations can be described by a small number of parameters, independently specifying the multiplicity and locations of occupied positions as well as other types of physical quantities affected by the crystal transformation. In situations where the microscopic picture is very complicated, as has often been the case in investigations in recent years, the symmetry analysis method proves to be very helpful. In its present form it has been successfully applied to the description of structural (both displacive and order-disorder) and magnetic phase transitions, and also to the ordering of quadrupoles and clusters, and it has led to many interesting results. The necessary conditions and general theoretical assumptions were summarized in [13], with special attention to the constraints imposed on the solutions by symmetry and to the remaining "internal degrees of freedom" specific to each field of application. This chapter develops the presentation of the symmetry analysis method and of the possibilities of applying it to the ordering of some physical properties in crystals.


PHYSICAL CHARACTERISTICS OF THE PROBLEM

A typical situation in which the symmetry analysis is applied to the description of the ordering process in solids includes the following features:

• there is a well-defined high-symmetry (high-temperature) phase, with a well-known symmetry space group and with well-known sets of equivalent positions (Wyckoff positions) in the unit cell;

• under some circumstances (e.g. a change of temperature or external pressure) some changes take place, leading to lower crystal symmetry, with preservation of the group-subgroup relation (in most cases a continuous or semi-continuous phase transition);

• a local physical quantity exists, with known rules of transformation under the action of the operations belonging to the crystal symmetry group, defined on one or more symmetry-equivalent sets of sites in the unit cell, which becomes non-zero in the low-symmetry phase, thus representing a deviation from the high-symmetry state of the crystal;

• there is a wave vector (given in the reciprocal lattice) responsible for the ordering process, usually known from experiment, which describes the propagation direction and wavelength of the ordering waves, thus defining the relations between the values of the ordering quantity in neighboring unit cells of the crystal.

The ordering wave vector k may be commensurate, leading to unit cell multiplication in the low-symmetry phase, or incommensurate, leading to an incommensurately modulated phase. The set of symmetry-equivalent k vectors is denoted {k_L} and is called the k-vector star. Each individual k vector in the set is called an arm of the star. The set of symmetry operations leaving the k vector unchanged is called the k-vector symmetry group G(k).
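The notions of the k-vector star and of G(k) can be illustrated with a small sketch. It is only a toy model under stated assumptions: a two-dimensional square lattice with point group 4mm, k vectors given in reduced (reciprocal-lattice) coordinates, and two k vectors treated as equivalent when they differ by a reciprocal-lattice vector. The function names are hypothetical and are not taken from any of the programs cited in the chapter.

```python
from fractions import Fraction as F

def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )

def c4v_operations():
    """The eight 2x2 integer matrices of the square-lattice point group 4mm."""
    rot = ((0, -1), (1, 0))   # rotation by 90 degrees
    mir = ((1, 0), (0, -1))   # mirror y -> -y
    ops, g = [], ((1, 0), (0, 1))
    for _ in range(4):
        ops.append(g)            # pure rotation
        ops.append(mul(g, mir))  # rotation combined with the mirror
        g = mul(rot, g)
    return ops

def apply(op, k):
    """Act with a point operation on a k vector (reduced coordinates)."""
    return tuple(sum(F(op[i][j]) * k[j] for j in range(2)) for i in range(2))

def reduce_k(k):
    """Fold k into [0, 1) per coordinate, i.e. modulo the reciprocal lattice."""
    return tuple(x % 1 for x in k)

def star_and_little_group(k, ops):
    """Return the arms of the star of k and the operations that keep k fixed."""
    k0 = reduce_k(k)
    star, little = [], []
    for op in ops:
        image = reduce_k(apply(op, k))
        if image == k0:
            little.append(op)
        if image not in star:
            star.append(image)
    return star, little

ops = c4v_operations()
star, little = star_and_little_group((F(1, 2), F(0)), ops)
print(len(star), "arms,", len(little), "operations in G(k)")
```

In this toy setting the number of arms multiplied by the order of G(k) always equals the order of the point group, which gives a quick consistency check.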

In such circumstances the global (whole-crystal) physical quantity representing the non-zero deviation from the high-symmetry state is usually written as:

(1)

where $\vec r_1, \vec r_2, \ldots, \vec r_n$ denote the positions of the individual atomic sites in the unit cell belonging to one set of equivalent positions, $\hat u(\vec r_i)$ denotes the local ordering quantity at the i-th site, p, q, s are integers and $\vec a_1, \vec a_2, \vec a_3$ denote the lattice vectors of the high-symmetry crystal phase. The block in square brackets is attributed to one unit cell, labelled by p, q, s. The characteristics of $\hat u(\vec r_i)$ depend on its physical nature: it may be a scalar, vector or tensor quantity, with the respective number of elements.

SYMMETRY-ADAPTED DESCRIPTION OF THE ORDERING MODE

The symmetry analysis method is derived from the theory of groups and representations. It is based on the decomposition of the full (permutational, quadrupolar, structural or magnetic) representation of the crystal space group, calculated for a given star of wave vectors {k_L} and a given set of equivalent positions, into its irreducible representations (IR-s). Such a decomposition takes place when the basis (coordinate system) used for the description of the function space is transformed to the special symmetry-adapted basis. It represents the splitting of the whole function space into the direct sum of subspaces attributed to individual IR-s. Which irreducible representations actually appear in the sum, and how many times, depends on the chosen set of equivalent positions and on the type of property under discussion. This choice determines the accessible space of degrees of freedom for the phase transition under discussion.

The symmetry analysis also allows calculation of the so-called basis vectors (BV-s) attributed to each individual IR, i.e. vectors which, under the action of symmetry operations, transform according to the respective IR. The number of BV-s attributed to a given IR is determined by the number of arms in the k-vector star and the dimension of the irreducible representation of G(k). From here on each BV carries three indices L, ν, λ, which denote respectively the arm of the k-vector star, the IR number and the sequential number of the BV, running from 1 up to the product of the dimension of the IR and its multiplicity. Each BV is defined for all considered sites of the crystal but can be split into local (atomic-site) parts, with the number of elements dependent on the nature of the ordering quantity. Because of the translational symmetry of crystals the BV-s closely resemble Bloch (Wannier) waves or Fourier components, since their local parts fulfill the following relation:

(2)

The above-mentioned BV-s can be used for the construction of the global vector describing the ordering quantity in the whole crystal:

(3)
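One common way to write a Bloch-type translation relation and a linear combination of basis vectors is sketched below. The notation is assumed for illustration only: $\Psi^{k_L,\nu}_{\lambda}$ for a basis vector, $S$ for the global ordering vector, $\vec t$ for a lattice translation, and a phase factor in which the product $\vec k_L \cdot \vec t$ is understood to include the factor $2\pi$ when $\vec k_L$ is given in reciprocal-lattice units. The sign convention of the exponent is likewise an assumption.

```latex
% Assumed notation: \Psi = basis vector, S = global ordering vector,
% \vec t = lattice translation built from integers p, q, s.
\begin{align}
  \Psi^{k_L,\nu}_{\lambda}(\vec r_i + \vec t)
    &= e^{\,i\,\vec k_L \cdot \vec t}\;\Psi^{k_L,\nu}_{\lambda}(\vec r_i),
    \qquad \vec t = p\,\vec a_1 + q\,\vec a_2 + s\,\vec a_3, \\
  S &= \sum_{L,\nu,\lambda} C^{k_L,\nu}_{\lambda}\,\Psi^{k_L,\nu}_{\lambda}.
\end{align}
```

With such a relation the values of a basis vector in one unit cell fix its values in every other cell, which is why the later steps of the analysis can be carried out within a single cell.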



Such a construction of the global description as a linear combination of BV-s with coefficients $C^{k_L,\nu}_{\lambda}$ has its local counterpart for the $\hat u$ components describing a given atomic site:

(4)

Using the translational symmetry one can limit the problem to the elementary cell. This significantly simplifies the procedure of determining which IR-s appear in the decomposition of the function space under consideration [6]. First, n⋅u-dimensional vectors are defined (n is the number of equivalent positions in the elementary cell and u is the dimension of the property under discussion), in which all components are equal to 0 except one, which is equal to 1. The set of n⋅u such vectors, with the value 1 at different positions, forms the basis of the reducible representation Â of the G(k) symmetry group, which includes all relevant degrees of freedom of the given parent structure. Using the general formula known from the theory of group representations, this reducible representation Â decomposes into a set of irreducible representations of the given G(k) symmetry group. Only the IR-s that appear in the decomposition are allowed for the structure realized after the phase transition; they are called active representations.

(5)
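The general decomposition formula of representation theory expresses the multiplicity of an irreducible representation through characters: the multiplicity equals the group average of the product of the character of the reducible representation and the complex-conjugated character of the irreducible one. The sketch below applies it to a deliberately simple, assumed example that is not taken from the chapter: the point group C3v acting on three equivalent sites at the corners of a triangle, first for a scalar (permutational) property and then for a polar vector on each site. All characters are real here, so complex conjugation is omitted.

```python
# Character table of C3v, one entry per conjugacy class:
# identity (1 element), threefold rotations (2 elements), mirrors (3 elements).
CLASS_SIZES = [1, 2, 3]
IRREP_CHARACTERS = {
    "A1": [1, 1, 1],
    "A2": [1, 1, -1],
    "E": [2, -1, 0],
}

def multiplicities(rep_characters):
    """Multiplicity of each irreducible representation in a reducible one.

    rep_characters lists the characters of the reducible representation,
    one value per conjugacy class, in the same order as CLASS_SIZES.
    """
    order = sum(CLASS_SIZES)
    result = {}
    for name, chi in IRREP_CHARACTERS.items():
        total = sum(
            size * a * b
            for size, a, b in zip(CLASS_SIZES, rep_characters, chi)
        )
        if total % order != 0:
            raise ValueError("input is not a character of this group")
        result[name] = total // order
    return result

# Permutational representation: the character is the number of sites
# left in place by the operations of each class.
permutation = [3, 0, 1]

# Polar vector on each site: permutation character times the character
# of the three-dimensional vector representation (3, 0, 1).
vector_on_sites = [p * v for p, v in zip(permutation, [3, 0, 1])]

print(multiplicities(permutation))
print(multiplicities(vector_on_sites))
```

The dimensions provide a check: the multiplicities weighted by the dimensions of the irreducible representations must add up to n⋅u, here 3 for the scalar case and 9 for the vector case.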

nν determines how many times the ν-th representation occurs in the decomposition, for a given Wyckoff set of positions and given type of investigated property, and indicates the multiplicity of IR in Â. From the general rules of theory of representations one may construct the corresponding representation of full space group related to the {kL} star and given active representation of G(k). As can be seen in order to construct a symmetry-adapted description of the ordering in the low-symmetry phase one has to calculate the BV-s of IR-s of the high-symmetry space group and then, knowing the BV-s, one has to adjust (fit) the linear combination coefficients in order to describe the experimental results as close as possible. In many investigated physical problems the translational properties of new structure are described by one k vector from given star and description by the symmetry analysis method is limited to IR-s of G(k), which have small dimension. The main advantage of the procedure is also the fact that usually the number of IR-s that has to be included in the final combination is minimal; in most cases one IR is enough. In the case of displacement type structural phase transitions it is proved that one IR may be active (the “soft mode” concept). Participation in magnetic phase transition more than one IR-s is able, when the energy of crystal states belonging to different representations of crystal space group are very close. Such situation takes place when anisotropy interactions are negligible and only exchange Hamiltonian may be taken into account. Then the symmetry of such Hamiltonian is higher than symmetry of sites (described



by the crystallographic space group). Irreducible representations of such an “exchange” group, restricted to the elements of the corresponding G(k) group, are as a rule reducible. The decomposition of one exchange representation into IR-s of the considered G(k) indicates the set of IR-s which belong to the so-called “exchange multiplet” and may appear together in the creation of the new structure. Another situation in which more than one IR may participate arises when one of the IR-s (called associated) leads to a higher symmetry of the “model” structure than the other one (called relevant). A detailed discussion of this problem is given in [6]. The coefficients Cνλ form good order parameters of the phase transition, which can be used for the construction of invariants of the structure symmetry group. It is useful to mention here that second-order invariants, as follows from the theory of representations, may be constructed only from functions belonging to the same representation. In the Landau theory of phase transitions such invariants are connected with the transition temperature. This is the basis for discussing possible connections between different types of transitions (for example, the possibility of magnetic and structural transitions occurring at the same time). The symmetry analysis also provides a means to find the symmetry group of the new structure, using the Cνλ and the given IR τν active in the phase transition. The calculated basis vectors transform under the action of elements of the parent symmetry group by the matrices of the respective IR τν. Because the coefficients Cνλ of the linear combination are the components of the analyzed property in the frame of the basis vectors, they transform according to the inverse matrices of τν. The symmetry elements which leave the set of Cνλ components invariant belong to the symmetry group of the new structure.
Symmetry considerations can also indicate the relations between the old sets of equivalent positions (in the parent group) and the new sets of equivalent positions (in the final subgroup). The group-subgroup relations and the relations between the corresponding sets of positions are also given, among much other useful crystallographic information, on the “Bilbao Crystallographic Server” [9]. There is one more problem that complicates the actual procedure, namely the physical conditions imposed on the local ordering quantities or their algebraic representations. Most physical quantities must be represented by real numbers only, which becomes a problem when the obtained BV-s are complex. In this way an extra constraint is imposed on the set of Cνλ coefficients. If {kL} contains -k, the problem is easily solved by including BV-s calculated for the inverse wave vector, with a proper phase shift contained in the complex value of the respective coefficient of the combination. The reality condition is one rather general requirement, but there are more specific ones, depending on the nature of the ordering quantity and its algebraic representation. These detailed conditions will be discussed in the following sections, devoted respectively to scalar, vector and tensor quantities.
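The decomposition of the reducible representation Â into IR-s with multiplicities nν, described earlier in this section, can be illustrated by a minimal sketch. It uses the standard character formula of representation theory, n_ν = (1/|G|) Σ_g χ_Â(g) χ_ν(g)*. The group (the point group C3v), its character table and the three-site permutation representation are toy assumptions chosen for illustration; they are not a G(k) taken from the text, and the function name is hypothetical.

```python
from fractions import Fraction

# Conjugacy classes of the toy group C3v: (label, number of elements).
CLASSES = [("E", 1), ("2C3", 2), ("3sigma_v", 3)]

# Character table of C3v, one value per class, in the order above.
IRREPS = {
    "A1": [1, 1, 1],
    "A2": [1, 1, -1],
    "E": [2, -1, 0],
}

def multiplicities(chi_reducible):
    """n_nu = (1/|G|) * sum over classes of size * chi(g) * chi_nu(g)."""
    order = sum(size for _, size in CLASSES)
    result = {}
    for name, chi_nu in IRREPS.items():
        # All characters of C3v are real, so no complex conjugation is needed.
        total = sum(size * a * b
                    for (_, size), a, b in zip(CLASSES, chi_reducible, chi_nu))
        result[name] = Fraction(total, order)
    return result

# Scalar property (u = 1) on n = 3 equivalent sites: the character of the
# permutation representation counts the sites left fixed by each class.
chi_perm = [3, 0, 1]
n = multiplicities(chi_perm)
print(n)
```

In this toy case only A1 and E occur, each once, so only these two IR-s could be active for a scalar quantity on the three sites.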

CONDITIONS IMPOSED ON PHYSICAL SOLUTIONS AND THE NUMBER OF FREE PARAMETERS

As mentioned before, the general requirement on the constructed physical solution is that it be restricted to real values. The situation is different for two classes of problems. The first class of solutions covers all wave vectors whose components take the values Q=0 or Q=½ (in reciprocal lattice units). In such cases the wave vectors k and -k



are equivalent in the crystallographic sense, i.e. they are equal modulo a reciprocal lattice vector. Therefore there is no reason to use the BV-s calculated for both k and –k in the construction of the model structure. In this case an extra advantage is that the value of kt is a multiple of π, so that exp(ikt) is always real. Unfortunately this does not apply to the components of the BV-s, which can still be complex. From the experimental point of view this class comprises the ferromagnetic and antiferromagnetic structures with a doubled unit cell, i.e. structures in which the magnetic moments in neighboring cells can differ only by their sign. For the second class of solutions, i.e. for all wave vectors different from the ones mentioned above, the solution must be combined from the BV-s calculated for both wave vectors k and –k, as they are not equivalent. That class comprises both the commensurate structures with bigger unit cells (k=1/3, 1/4, etc.) and the incommensurately modulated structures. As mentioned above, the only mathematical condition imposed on the respective linear combinations of BV-s is that the result should be real, i.e. the imaginary part of the linear combination should be equal to zero for every component of the resulting vector. Below a short algebraic analysis of this condition is provided, separately for the two classes of wave vectors. The notation for the BV-s and the coefficients of the linear combination will be as follows:

(6) The symbol Ψkνλ above denotes the unit-cell part of the λ-th BV, calculated for the IR τν and a given wave vector k; it is in fact a vector consisting of p⋅n elements, where p is the number of components describing the quantity under consideration (1 for a scalar, 3 for a vector and 9 for a 3x3 tensor) and n is the number of sites in the symmetry-equivalent set in the unit cell. The vector elements are ordered by ion number and by component of the considered quantity. It has been mentioned before that for the first class of solutions the magnetic structure vector can be expressed as a linear combination of BV-s calculated for positive k only. In that case the condition imposed on the solution, which can be expressed as Im {S} = 0, takes the following form:

(7) where i and α denote the ion number and the α-th component of the physical quantity on that ion, while u and w denote the real and imaginary parts of the respective BV. Thus p⋅n equations are obtained, one for each ion and each component of the physical quantity related to it. The number of unknown (real) variables depends on the number of IR-s taken into account, their multiplicities and their respective dimensions. In general:



(8) where nν is the multiplicity of the irreducible representation τν in the linear combination, and dν denotes the dimension of τν. In most cases the system of equations appears to be highly overdetermined. However, many of the obtained equations are linearly dependent, and in order to determine the actual status of the system these equations have to be eliminated, as shown below. The second class of solutions, i.e. wave vectors with components different from 0 and ½, mostly comprises what are usually called modulated structures, and includes both commensurate and incommensurate structures (e.g. with k sweeping a continuous range of values). In such cases the solution has to be built as a linear combination of BV-s calculated for both wave vectors k and –k, in order to form a real solution in the form of a “standing wave” with a given amplitude and phase. If the described physical quantity is to be real for every cell in the crystal, then the only possibility is to make the expressions multiplying the cos and sin terms both equal to zero. This has to be true for all components of the physical quantity vector. Thus two homogeneous systems of equations are obtained for the unknown coefficients, each consisting of p⋅n equations. This gives 2pn equations in total, and the systems of equations take the form:

(9) where i=1,…,n runs over the investigated ions in the symmetry-equivalent set of sites, and α=1,…,p runs over the components of the physical quantity on each ion. Now the number of unknown variables is doubled, as the coefficients for both the k and –k BV-s have to be determined. At first sight the system of equations again seems highly overdetermined. However, the equations have been generated for atomic sites that are symmetry related, so many of them may be linearly dependent. Again the number of unknown coefficients depends only on the number of IR-s taken into the model and their respective multiplicities and dimensions:

(10) As mentioned before, the obtained homogeneous system of equations is definitely redundant, as some of the equations are linearly dependent and should be eliminated. First it should be noticed that some of the equations contain only zero terms and have no relevance at



all. In the second stage equivalent equations are eliminated by normalizing all the equations (the greatest coefficient is made equal to one) and removing equations that are identical (i.e. contain the same coefficients). In most cases the set of equations obtained in this way no longer contains linearly dependent equations. For the remaining cases a third stage of processing is applied. It consists of eliminating the redundant equations by applying a Gram-Schmidt procedure (known from vector algebra textbooks). What is left after that procedure is the minimal system of equations, containing Ne equations. A well-known theorem of linear algebra then allows a direct determination of the number of free parameters (the number of unknown variables that can be chosen at will). The number of free parameters obtained was found to be exactly half of the number of unknown variables, i.e. equal to the number of complex coefficients in the linear combination of BV-s.
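The counting of free parameters described above can be sketched as follows. Note the swap: instead of the three-stage elimination ending in a Gram-Schmidt procedure, this sketch computes the number Ne of independent equations as the rank of the coefficient matrix by exact Gaussian elimination, and obtains the number of free parameters as the number of unknowns minus Ne. The coefficient matrix is an invented toy example, not one generated from real BV-s, and the function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def free_parameters(rows, n_unknowns):
    """Unknowns that can be chosen at will in the homogeneous system rows * c = 0."""
    return n_unknowns - rank(rows)

# Toy homogeneous system in 4 real unknowns (assumed for illustration only).
equations = [
    [1, 0, -1, 0],
    [2, 0, -2, 0],   # a multiple of the first equation
    [0, 1, 0, 1],
    [0, 0, 0, 0],    # contains only zero terms
]
n_e = rank(equations)
n_free = free_parameters(equations, 4)
print(n_e, n_free)
```

In this toy system two independent equations remain for four real unknowns, so two parameters are free, which is half of the number of unknowns.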

(11)

"MODY" PROGRAM - A PRACTICAL IMPLEMENTATION OF SYMMETRY ANALYSIS

As mentioned above, the first part of the task is to calculate the basis vectors (BV-s) for a given space group of the high-symmetry phase. There are well-elaborated algorithms for calculating the BV-s, using e.g. the projection operator technique, with a special version using the stabilizer [7]. Several applications offered on the Internet [8, 9, 10, 11, 12] are dedicated to the same task but differ in some aspects, as they have been tailored to different fields of application in solid state physics. One of the applications has been written by the authors of the present paper and can be downloaded from its homepage [8]. The program, which offers a full graphical user interface, first collects all the necessary data from the user. In addition to the high-symmetry space group the user has to specify the set of symmetry-equivalent atomic positions by choosing the first atom position (the rest are generated by the symmetry operations). Once this is known the user is asked to specify the ordering wave vector and the exact arm of the wave vector star. The last thing the user has to choose is the type of ordering quantity. At present the program offers four options: scalars, polar vectors, axial vectors and 3x3 tensors. After the input data have been completed, the calculation of the ordering modes may be carried out, and the user is then offered the list of output results. The results include the symmetry characteristics used in the calculations, namely the symmetry elements of the group, its irreducible representations, the splitting of atoms into independent G(k) orbits and the transformation matrices for tensor quantities. The principal result of the calculations contains the basis vectors attributed to individual atomic sites of a given set of symmetry-equivalent positions in the unit cell. By default the BV elements are listed for the "zero" cell, i.e. without the extra translation vector tp,q,s. In order to calculate the BV components in the neighboring cells the user has to specify the translation vector tp,q,s for



eq. (1) in the input data. The construction of the final ordering modes is done by verifying the constraints imposed on the final result (the described ordering quantity) and then solving for or fitting the coefficients of the linear combinations of BV-s. This part of the procedure, together with examples dedicated to different physical quantities, will be presented in the following sections.

SCALAR ORDER PARAMETERS: SITE OCCUPATION PROBABILITY

The simplest case of symmetry analysis is encountered for physical quantities represented by scalars. Good examples of such quantities are the charge density and the occupation probability of local ion sites in a given crystal structure. An example of such calculations for the hydrogen distribution on interstitial sites in an intermetallic compound can be found in [14,15]. An essential physical assumption about the parent phase is that in the high-symmetry phase the occupation probability P is the same on all allowed symmetry-equivalent interstitial sites. The actual value depends on the hydrogen concentration and the number of symmetry-equivalent sites. The assumption has two possible physical explanations:

• hydrogen atoms are localized on the set of equivalent positions in a given space group with a completely statistical (random) distribution,
• hydrogen atoms execute random jumps within the sets of equivalent positions of a given space group, thus giving equal values of the time-averaged occupation probability.

The calculated values, describing the ordering of H atoms that leads to a lowering of the symmetry, always denote the change ΔP of the site occupation probability P from the equilibrium values mentioned above. Each subset of symmetry-equivalent sites, called an orbit in the given subgroup, should be occupied with the same probability P'. If P' = 1 the subgroup orbit is fully occupied, while P' = 0 means that the subgroup orbit is empty after the ordering, while the condition 0

$= q(x + y) - q(x) - q(y)$ for $x, y \in V$. If $\{e_1, \ldots, e_n\}$ is a basis of $V$, the form $q$ is represented by the symmetric matrix $A = \big((e_i, e_j)\big)_{i,j=1,\ldots,n} \in GL_n(F)$. Then the orthogonal group of $q$ is defined by $O_q(F) = \{x \in GL_n(F) \mid xAx^t - A = 0\}$, while the special orthogonal group of $q$ is defined by $SO_q(F) = \{x \in GL_n(F) \mid xAx^t - A = 0,\ \det x - 1 = 0\}$.

3.1.3. Note that an algebraic group is said to be connected (resp. irreducible) if it is connected (resp. irreducible) as a variety. The connected component of G containing the identity is called its 1-component and denoted by $G^0$. $G^0$ has finite index in G. For example $GL_n$, $SL_n$, $SO_q$, $Sp_{2n}(F)$, $G_a$, $G_m$ are all connected. The 1-component of $O_q$ is $SO_q$.

3.1.4. An algebraic group G is said to be unipotent if every x in G is unipotent. If we regard G as a matrix group, this means that $x - 1_n$ is nilpotent, i.e. the only eigenvalue of x is 1. G is said to be solvable if G is solvable as an abstract group. In particular, unipotent groups are solvable. The radical r(G) (resp. unipotent radical $r_u(G)$) of G is the unique maximal closed connected solvable (resp. unipotent) normal subgroup of G. We have $r_u(G) \subset r(G)$ and $r_u(G) = r(G)_u$. G is said to be reductive (resp. semi-simple) if $r_u(G) = \{e\}$ (resp. $r(G) = \{e\}$).

3.1.5. (a) We also have the concept of quotients. If H is a closed subgroup of G, there is an essentially unique quotient F-variety G/H, which is an algebraic group if H is normal. As a variety G/H is not necessarily affine (see [64]). We next introduce the notion of homogeneous space. (b) Let G be an algebraic group and X an algebraic variety. An action of G on X is a morphism $G \times X \to X$, $(g, x) \mapsto gx$, such that

Higher Algebraic K – Theory of G – Representations…


(a) $g(hx) = (gh)x$ for all $g, h \in G$, $x \in X$; (b) $ex = x$ for all $x \in X$. X equipped with a G-action is called a G-variety or G-space. If X is a scheme, call X a G-scheme. (c) A homogeneous space for G is a G-space X on which G acts transitively, i.e. there exists only one G-orbit. In this case all the isotropy groups (stabilizers) $G_x$, $x \in X$, are conjugate in G. If we fix a point $x_0 \in X$ and put $H = G_{x_0} \le G$, we have a bijection $X \to G/H$, $gx_0 \mapsto gH$. Thus, if H is a closed subgroup of G, a quotient of G by H is a pair $(X, x_0)$, $x_0 \in X$, with isotropy group $H = G_{x_0}$, such that the following universal property holds. For any pair $(Y, y_0)$ consisting of a homogeneous space Y for G and $y_0 \in Y$ such that $G_{y_0} \supset H$, there exists a unique G-morphism $\varphi : X \to Y$ such that $\varphi(x_0) = y_0$. Note that $X \cong G/H$ is usually a quasi-projective variety.
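The statements in 3.1.5 (c) about transitive actions can be checked on a small finite analogue. The sketch below is only an illustration for an abstract finite group, not for algebraic groups or varieties: it takes the symmetric group S3 acting on the set {0, 1, 2} and verifies that the action is transitive, that the map g·x0 → gH onto the left cosets of the stabilizer H is a well-defined bijection, and that the stabilizers of different points are conjugate. All names in the code are hypothetical.

```python
from itertools import permutations

# S3 as tuples g with g[i] = image of i.
G = list(permutations(range(3)))

def compose(g, h):
    """(g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def stabilizer(x):
    return frozenset(g for g in G if g[x] == x)

x0 = 0
H = stabilizer(x0)

# Transitivity: the orbit of x0 is the whole set.
orbit = {g[x0] for g in G}

def coset(g):
    """The left coset gH."""
    return frozenset(compose(g, h) for h in H)

# The map g.x0 -> gH: collect, for every point, the cosets assigned to it.
point_to_coset = {}
for g in G:
    point_to_coset.setdefault(g[x0], set()).add(coset(g))

# Stabilizers are conjugate: G_{g x0} = g H g^{-1}.
conjugate_ok = all(
    stabilizer(g[x0]) == frozenset(compose(compose(g, h), inverse(g)) for h in H)
    for g in G
)
print(sorted(orbit), len(point_to_coset), conjugate_ok)
```

Each point receives exactly one coset and distinct points receive distinct cosets, which is the finite counterpart of the bijection X → G/H.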

3.1.6. (a) An algebraic group T over F is called an F-torus if, over some algebraic field extension F´ of F, $T \times_F F' \cong \prod G_m$ is a (finite) product of multiplicative groups $G_m$. In particular $T \times_F F_s = \prod G_m$ for a separable closure $F_s$ of F. We shall also sometimes write $F_{sep}$ for the separable closure of F. (b) An F-torus T is said to be split if $T \cong \prod G_m$ over F.

Say that T is anisotropic if it does not contain any split subtorus (see [64]). For example

$T(\mathbb{R}) = \left\{ \begin{pmatrix} x & y \\ -y & x \end{pmatrix} \in SL_2(\mathbb{R}) \;\middle|\; x^2 + y^2 = 1 \right\}$

is an anisotropic torus since it is compact and hence cannot be isomorphic to $\mathbb{R}^*$.

(c) Let T be an F-torus and $F_s$ a separable closure of F. Then the Abelian group of characters $X(T) = \mathrm{Hom}_{F_s}(T, G_m)$ is a module over the Galois group $\Gamma = \mathrm{Gal}(F_s/F)$. Note:
(1) X(T) is a free Abelian group, and T splits if and only if Γ operates trivially on X(T).
(2) T is anisotropic iff $X(T)^\Gamma = \{0\}$.
(3) There exists a unique maximal anisotropic F-subtorus $T_a$ of T, with $T_a \cdot T_s = T$ and $T_a \cap T_s$ finite (see [64]), where $T_s$ is the unique maximal split F-subtorus of T.

(d) 1) A torus is connected and Abelian, hence solvable, and all its elements are semi-simple. Conversely, if G is a connected Abelian algebraic group all of whose elements are semi-simple, then G is a torus. 2) The set of closed tori in G, ordered by inclusion, has a maximal element, called a maximal torus. The maximal tori are all conjugate and lie in $G^0$. Hence we may assume that G is connected.
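The example of an anisotropic torus in 3.1.6 (b) consists of the 2x2 matrices with rows (x, y) and (−y, x) satisfying x² + y² = 1. The following sketch checks, on a few rational points of the circle and with exact arithmetic, that such matrices are closed under multiplication, commute, and have inverses of the same shape, so they form an Abelian group. It does not test anisotropy itself. Storing a matrix as the pair (x, y) and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction

def element(x, y):
    """The matrix ((x, y), (-y, x)), stored as the pair (x, y); needs x^2 + y^2 = 1."""
    x, y = Fraction(x), Fraction(y)
    if x * x + y * y != 1:
        raise ValueError("determinant x^2 + y^2 must equal 1")
    return (x, y)

def multiply(a, b):
    """Matrix product; the top row of the product determines the whole matrix."""
    (x1, y1), (x2, y2) = a, b
    return element(x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

def inverse(a):
    """The inverse of ((x, y), (-y, x)) is ((x, -y), (y, x))."""
    x, y = a
    return element(x, -y)

a = element(Fraction(3, 5), Fraction(4, 5))
b = element(Fraction(5, 13), Fraction(12, 13))
ab = multiply(a, b)
print(ab, multiply(b, a), multiply(a, inverse(a)))
```

Every entry of every element has absolute value at most 1, which reflects the compactness used in the text.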

Aderemi Kuku


(e) If $G = GL_n$, the F-subgroup T of diagonal matrices is a maximal torus in G.

3.1.7. Let G be a connected F-group. A maximal connected solvable F-subgroup of G is called a Borel subgroup of G and is usually denoted by B. An F-subgroup of G is called a parabolic subgroup if it contains a Borel subgroup. It is usual to denote a parabolic subgroup by P. If $G = GL_n$, then the F-subgroup B of upper triangular matrices is a Borel subgroup of G. Note:
(1) Every maximal torus is connected and solvable and hence is contained in some Borel subgroup.
(2) All Borel subgroups in G are conjugate over F. Every element of G is contained in such a group.
(3) Let P be a closed F-subgroup of G. The quotient G/P is projective iff P is parabolic. If P is a parabolic subgroup of G, then it is connected and equal to its own normalizer in G (i.e. $P = N_G(P)$).

(4) If P, Q are two parabolic subgroups of G, and if they are conjugate, then P = Q.

3.1.8. Let G, G´ be F-groups. An F-morphism α: G´ → G is called an isogeny if ker α is finite and α is surjective over F (this means that G´, G are of the same dimension). An isogeny α is central if ker α ⊂ centre of G´. Two F-groups G, H are said to be strictly isogenous if there exist an F-group G´ and central isogenies α: G´ → G and α´: G´ → H. A semi-simple F-group G is simply connected if there is no proper isogeny G´ → G with a semi-simple F-group G´. G is adjoint if its centre is trivial.

3.1.9. (a) A connected solvable F-group G is called split if there exists a series of subgroups $\ldots \subset G_{i+1} \subset G_i \subset \ldots$ such that $G_i/G_{i+1}$ is isomorphic to either $G_a$ or $G_m$ for $i = 0, 1, \ldots, n-1$.
− A reductive F-group G is called split if it has a maximal torus which splits over F.
− A reductive group G over F is called quasi-split if it has a Borel subgroup defined over F.
(b) Let G, H be algebraic groups over F. G is called a twisted form of H if $G_{sep}$ and $H_{sep}$ are isomorphic over $F_{sep}$, where $F_{sep}$ is the separable closure of F.

3.2. Representations of G in P(F)

3.2.1. Let G be an F-group, P(F) the category of finite-dimensional vector spaces over F, and $P(F)_G$ or $\mathrm{Rep}_F(G)$ the category of representations of G in P(F). Recall from section 2 that objects of $P(F)_G$ are of the form (V, α: G → Aut(V)) where V ∈ P(F).

3.2.2. Now, for any G-scheme X, let $VB_G(X)$ be the category of G-equivariant (algebraic) vector bundles on X. This category is also denoted by P(G, X) (see 3.3.2). Let H be a closed subgroup of G and X the homogeneous space G/H. Then we have an equivalence of categories



$\mathrm{Rep}_F(H) \;\underset{\mathrm{res}}{\overset{\mathrm{ind}}{\rightleftarrows}}\; VB_G(G/H)$

where ‘ind’ and ‘res’ are defined as follows:

− res: For any vector bundle $E \xrightarrow{p} G/H$, $p^{-1}(\bar e) \in \mathrm{Rep}_F(H)$ (where $\bar e = eH = H$), since $H$ is the stabilizer of the point $\bar e$ in $G/H$.
− ind: Let $(V, \alpha : H \to \mathrm{Aut}(V)) \in \mathrm{Rep}_F(H)$. Then one has a vector bundle $(G \times V)/H \to G/H$, where $H$ acts on $G \times V$ by $(g, v)h = (g \cdot h, h^{-1}v)$, see [64]. We denote $(G \times V)/H$ by $\tilde V$. Here $h^{-1}v = \alpha(h^{-1})v$.

3.2.3. Let $\tilde G$ be a semi-simple connected and simply connected, F-split algebraic group over a field F. Let $\tilde P \subset \tilde G$ be a parabolic subgroup of $\tilde G$ containing the torus $\tilde T$. The factor variety $\mathcal{F} = \tilde G/\tilde P$ is smooth and projective (see [64], [65]). Call $\mathcal{F} = \tilde G/\tilde P$ a flag variety.

− Let $N_{\tilde G}(\tilde T)$ be the normalizer of $\tilde T$ in $\tilde G$, and $W := N_{\tilde G}(\tilde T)/\tilde T$ the Weyl group of $\tilde G$, a finite group. Let $W_{\tilde P} := \{w \in W \mid w\tilde P w^{-1} = \tilde P\}$. Put $s(\mathcal{F}) = [W : W_{\tilde P}]$.
− Let $\tilde Z$ be the center of $\tilde G$ and $\tilde Z^* = \mathrm{Hom}(\tilde Z, G_m)$ the group of characters of $\tilde Z$. Note that $\tilde Z^*$ is a finite group.
− Let $\chi \in \tilde Z^*$ and let $\mathrm{Rep}_F^\chi(\tilde P)$ be the full subcategory of $\mathrm{Rep}_F(\tilde P)$ consisting of those $V \in \mathrm{Rep}_F(\tilde P)$ such that $\tilde Z$ acts on V by the character χ. The F-group scheme $\tilde Z$ acts on V by the character χ and hence on every $\tilde V = (\tilde G \times V)/\tilde P \in VB_{\tilde G}(\mathcal{F})$. See 3.2.2 above.
− We write $VB_{\tilde G}(\mathcal{F}, \chi)$ for the full subcategory of $VB_{\tilde G}(\mathcal{F})$ consisting of $\tilde V$ such that $\tilde Z$ acts on every fibre of $\tilde V$ by the character χ.
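The index $s(\mathcal{F}) = [W : W_{\tilde P}]$ of 3.2.3 can be made concrete in one familiar case. The sketch below assumes, purely for illustration, that $\tilde G = SL_n$ with the diagonal torus and that $\tilde P$ is the block upper triangular parabolic subgroup with diagonal blocks of sizes k and n − k; in that case W is the symmetric group S_n and $W_{\tilde P}$ is the subgroup of permutations preserving the first k indices, so the flag variety is the Grassmannian of k-planes in n-space. The function name is hypothetical.

```python
from itertools import permutations
from math import comb

def weyl_index(n, k):
    """[W : W_P] for W = S_n and W_P = permutations preserving {0, ..., k-1}."""
    W = list(permutations(range(n)))
    block = set(range(k))
    W_P = [w for w in W if {w[i] for i in block} == block]
    return len(W) // len(W_P)

# Grassmannian of 2-planes in 4-space: s(F) should equal the binomial C(4, 2).
print(weyl_index(4, 2), comb(4, 2))
```

By Theorem 1.3.11 of Chapter II this index is also the rank of $R(\tilde P)$ as an $R(\tilde G)$-module.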

3.3. G-Modules on G-spaces X

3.3.1. Let G be an F-group and X a G-scheme with G-action on X given by $\theta : G \times X \to X$, $(g, x) \mapsto gx$. Let $p_2 : G \times X \to X$, $(g, x) \mapsto x$, be the projection onto X, and M(X) the category of coherent $\mathcal{O}_X$-modules. Then a coherent G-module on X is a coherent $\mathcal{O}_X$-module $\mathcal{F}$ together with an isomorphism $\varphi : \theta^*\mathcal{F} \cong p_2^*\mathcal{F}$ of $\mathcal{O}_{G \times X}$-modules on $G \times X$.

3.3.2. Let M(G, X) be the category of coherent G-modules that are coherent as $\mathcal{O}_X$-modules. Then M(G, X) is an Abelian category (which is also exact).


Let P(G, X) be the subcategory of those coherent modules that are vector bundles on X (i.e. locally free sheaves of $\mathcal{O}_X$-modules). Then P(G, X) is an exact category and $p_2^*, \theta^* : P(X) \to P(G, X)$ are exact functors, where P(X) is the category of locally free sheaves of $\mathcal{O}_X$-modules (or equivalently vector bundles on X).

3.3.3. We have the following elaborations on the situation in 3.3.2. (a) Let A be a finite-dimensional separable F-algebra, G an algebraic F-group and X a G-scheme. A G-A-module over a G-scheme X is a G-module M which is also a left $A \otimes_F \mathcal{O}_X$-module such that $g(am) = ga \cdot gm$ for $g \in G$, $m \in M$. (b) Let M(G, X, A) be the category whose objects are G-A-modules over X and whose morphisms are $A \otimes_F \mathcal{O}_X$- and G-module morphisms. Then M(G, X, A) is an Abelian category.

Let P(G, X, A) be the full subcategory of M(G, X, A) consisting of locally free $A \otimes_F \mathcal{O}_X$-modules. Then P(G, X, A) is an exact category.

CHAPTER II. HIGHER K-THEORY OF EQUIVARIANT EXACT CATEGORIES – DEFINITIONS, EXAMPLES, AND SOME RESULTS

Section 1. Brief Review of $K_n(C)$, n ≥ 0, C an Exact Category

In this section, we provide definitions and relevant examples of higher K-theory of exact categories C (including equivariant exact categories), thus developing notation for later use in the envisaged results.

1.1. Definition of $K_n(C)$

1.1.1. Definition. Let Δ be the category defined as follows: $\mathrm{ob}(\Delta) := \{\underline{n} = \{0 < 1 < \ldots < n\}\}$ and $\mathrm{Hom}_\Delta(\underline{m}, \underline{n}) = \{$monotonic maps $f : \underline{m} \to \underline{n}$, i.e. $f(i) \le f(j)$ for $i < j\}$. For any category A, a simplicial object in A is a contravariant functor X: Δ → A. Write $X_n$ for $X(\underline{n})$. A cosimplicial object in A is a covariant functor X: Δ → A.

Equivalently, one could define a simplicial object in a category A as a set of objects $X_n$ (n ≥ 0) in A and a set of morphisms $\delta_i : X_n \to X_{n-1}$ (0 ≤ i ≤ n) called face maps, as well as a set of morphisms $s_j : X_n \to X_{n+1}$ (0 ≤ j ≤ n) called degeneracies, satisfying certain simplicial identities (see [57]).

The geometric n-simplex is the topological space $\hat\Delta^n = \{(x_0, x_1, \ldots, x_n) \in \mathbb{R}^{n+1} \mid 0 \le x_i \le 1\ \forall i \text{ and } \sum x_i = 1\}$, and the functor $\hat\Delta : \Delta \to \text{Spaces}$, $\underline{n} \mapsto \hat\Delta^n$, is a cosimplicial space.


1.1.2. Definition: Let $X_*$ be a simplicial set. The geometric realization of $X_*$, written $|X_*|$, is defined by

$|X_*| = X \times_\Delta \hat\Delta = \bigcup_{n \ge 0} \big(X_n \times \hat\Delta^n\big) / \cong$

where the equivalence relation ≅ is generated by $(x, \varphi_*(y)) \cong (\varphi^*(x), y)$ for any $x \in X_n$, $y \in \hat\Delta^m$ and $\varphi : \underline{m} \to \underline{n}$ in Δ, and where $X_n \times \hat\Delta^n$ is given the product topology and $X_n$ is considered as a discrete space.

1.1.3. Definition. Now let A be a small category. The nerve of A, written NA, is the simplicial set whose n-simplices are diagrams $A_n = \{A_0 \xrightarrow{f_1} A_1 \to \cdots \xrightarrow{f_n} A_n\}$ where the $A_i$'s are A-objects and the $f_i$ are A-morphisms. The classifying space of A is defined as $|NA|$ and denoted by BA.

Remarks: BA is a CW-complex whose n-cells are in one-one correspondence with the non-degenerate diagrams $A_n$ above (see [57]).

1.1.4. Definition. Now let C be an exact category. We form a new category QC such that ob(QC) = ob C and a morphism from M to P, say, is an isomorphism class of diagrams $M \xleftarrow{j} N \xrightarrow{i} P$ where i is an admissible monomorphism (or inflation) and j is an admissible epimorphism (or deflation) in C, i.e. i and j are part of some exact sequences $0 \to N \xrightarrow{i} P \to P' \to 0$ and $0 \to N'' \to N \xrightarrow{j} M \to 0$, respectively.

Composition is also well defined (see [57]).

1.1.5. Definition: For n ≥ 0, define $K_n(C) := \pi_{n+1}(BQC)$, where for any topological space Y, $\pi_{n+1}(Y)$ is the (n+1)-th homotopy group of Y.

1.1.6. Note: The definition above, due to D. Quillen [57], coincides for n = 0 with the classical definition of $K_0(C)$ as the Abelian group on the isomorphism classes (C) of C-objects subject to the relations (C′) + (C′′) = (C) whenever $0 \to C' \to C \to C'' \to 0$ is an exact sequence in C (see [57]).
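The remark in 1.1.3, that the n-cells of BA correspond to the non-degenerate n-simplices of the nerve, can be illustrated on the smallest interesting example. The sketch below assumes A is the poset category 0 < 1 < ... < N−1, in which there is exactly one morphism a → b when a ≤ b; an n-simplex of the nerve is then a weakly increasing chain, and it is non-degenerate when no arrow in it is an identity. The choice of category and the function names are assumptions made for this illustration.

```python
from itertools import combinations_with_replacement

def nerve_simplices(n_objects, dim):
    """dim-simplices of the nerve of the poset 0 < 1 < ... < n_objects - 1:
    chains A_0 <= A_1 <= ... <= A_dim, each arrow being the unique morphism."""
    return list(combinations_with_replacement(range(n_objects), dim + 1))

def nondegenerate(simplex):
    """A chain is non-degenerate when none of its arrows is an identity."""
    return all(a != b for a, b in zip(simplex, simplex[1:]))

def cell_counts(n_objects):
    """Number of non-degenerate simplices in each dimension 0 .. n_objects - 1."""
    return [sum(nondegenerate(s) for s in nerve_simplices(n_objects, d))
            for d in range(n_objects)]

counts = cell_counts(3)
euler_characteristic = sum((-1) ** d * c for d, c in enumerate(counts))
print(counts, euler_characteristic)
```

For three objects the counts are those of a filled triangle (three vertices, three edges, one 2-cell), consistent with BA being the geometric 2-simplex in this case.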

1.2. The Plus Construction – Another Definition of $K_n(P(A)) = K_n(A)$, n ≥ 1

There is an alternative definition of $K_n(P(A)) = K_n(A)$, n ≥ 1, also due to D. Quillen. This definition, which is also very useful for computations, arises from the following theorem.

1.2.1. Theorem [65]. Let X be a connected CW-complex and N a perfect normal subgroup of $\pi_1(X)$. Then there exists a CW-complex $X^+$ (depending on N) and a map $i : X \to X^+$ such that
(i) $i_* : \pi_1(X) \to \pi_1(X^+)$ is the quotient map $\pi_1(X) \to \pi_1(X)/N$.
(ii) For any $\pi_1(X)/N$-module L, there is an isomorphism $i_* : H_*(X, i^*L) \to H_*(X^+, L)$, where $i^*L$ is L considered as a $\pi_1(X)$-module.
(iii) The space $X^+$ is universal in the sense that if Y is any CW-complex and $f : X \to Y$ is a map such that $f_* : \pi_1(X) \to \pi_1(Y)$ satisfies $f_*(N) = 0$, then there exists a unique map $f^+ : X^+ \to Y$ such that $f^+ i = f$.


1.2.2. Definition. Now let A be a ring with identity and put X = BGL(A) in the above theorem. Then $\pi_1 BGL(A) = GL(A)$ contains E(A) as a perfect normal subgroup. Hence by the theorem above, there exists a space $BGL(A)^+$. Define $K_n(A) = \pi_n(BGL(A)^+)$ for all n ≥ 1.

1.3. Examples of $K_n$ of Ordinary and Equivariant Exact Categories

1.3.1. Let C = P(A), the category of finitely generated projective modules over a ring A with identity. We write $K_n(A)$ for $K_n(P(A)) = \pi_{n+1}(BQP(A))$.

1.3.2. Let C = M(A), the category of finitely generated modules over a Noetherian ring A. We write $G_n(A)$ for $K_n(M(A)) = \pi_{n+1}(BQM(A))$.

Note. In 1.3.1 and 1.3.2 above, we shall be interested in the group ring A = RG where G is a finite group and R is the ring of integers in a number field or p-adic field F. We have indeed identified P(RG), M(RG) with some categories of G-representations in Chapter I. Since RG is an R-order in the semi-simple F-algebra FG, we shall also be interested in $K_n(A)$, $G_n(A)$ where A is an R-order in a semi-simple F-algebra Σ. Recall that A is a subring of Σ such that R is contained in the centre of A, A is a finitely generated R-module and $F \otimes_R A = \Sigma$.

1.3.3. Also, the inclusion functor P(RG) → M(RG) induces Abelian group homomorphisms

$K_n(RG) \to G_n(RG)$, n ≥ 0, which generalize to higher dimensions the Cartan map $K_0(RG) \to G_0(RG)$ (see [6] or [39]).

1.3.4. When $C = P_R(RG)$, the category of RG-lattices, where R is a commutative ring with identity, we shall write $G_n(R, G)$ for $K_n(P_R(RG))$. It is well known that when R is regular, then $G_n(R, G) \approx G_n(RG)$ (see [39], [28]). When R = F is a field of characteristic zero and G a finite group, then $G_0(F, G) \cong G_0(FG)$ coincides with the Abelian group of generalized characters χ: G → F, and this provides the initial connection between representation theory and K-theory of the group algebra FG.

1.3.5. Let X be a scheme and P(X) the category of locally free sheaves of $\mathcal{O}_X$-modules (or equivalently the category VB(X) of (algebraic) vector bundles on X). We shall write $K_n(X)$ for $K_n(P(X))$ or $K_n(VB(X))$. Recall that if X = Spec(R) for a commutative ring R with identity, we recover $K_n(R)$ as in 1.3.1.

1.3.6. Let X be a Noetherian scheme and M(X) the category of coherent sheaves of $\mathcal{O}_X$-modules. We shall write $G_n(X)$ for $K_n(M(X))$, with the observation that if X = Spec(R), we recover $G_n(R)$ as in 1.3.2.

1.3.7. Let G be an algebraic group over a field F (i.e. an F-group), X a G-scheme, and M(G, X) the category of coherent G-modules that are also coherent as $\mathcal{O}_X$-modules. We write $G_n(G, X)$ for $K_n(M(G, X))$ for all n ≥ 0. If A is a finite dimensional separable F-algebra, X



a G-scheme, and M(G, X, A) as defined in I, 3.3.3 (b), we shall write $G_n(G, X, A)$ for $K_n(M(G, X, A))$ for all n ≥ 0.

1.3.8. If G is an F-group, X a G-scheme and P(G, X) as in I, 3.3.2, then we shall write $K_n(G, X)$ for $K_n(P(G, X))$. Note that P(G, X) can be identified with the category $VB_G(X)$ of G-equivariant (algebraic) vector bundles on X, and so $K_n(G, X) = K_n(P(G, X)) \cong K_n(VB_G(X))$ for all n ≥ 0. We shall also write this group as $K_n^G(X)$. If A is a finite dimensional separable F-algebra and P(G, X, A) is as in I, 3.3.3 (b), we shall write $K_n(G, X, A)$ for $K_n(P(G, X, A))$, n ≥ 0.

1.3.9. Let G be an F-group and H a closed subgroup of G. In I, 3.2.2 we saw that there is an equivalence of categories between the exact category $P(F)_H$, or equivalently $\mathrm{Rep}_F(H)$,

of representations of H in P(F) and the exact category $VB_G(G/H)$ of G-equivariant vector bundles on G/H. Note that $K_0(P(F)_H) \cong K_0(VB_G(G/H))$, where the latter group is denoted by $K_0^G(G/H)$. It is also usual to denote the “representation group” of H (i.e. the group of generalized characters of H) by R(H). In fact R(H) is a ring, called the representation ring of H (see [64]). Note that for any algebraic group G over F, $K_0(\mathrm{Rep}_F(G)) \cong R(G)$ is a free Abelian group generated by the classes of irreducible representations, and that R(G) also has the structure of a ring induced by the tensor product.

1.3.10. Now, in the notation of I, 3.2.3, let $\tilde G$ be a semi-simple connected and simply

connected, F-split algebraic group over F, T% ⊂ G% be a maximal F-split torus of G% , P% ⊂ G% a

% parabolic subgroup of G

( )

W = N G% T%

containing the torus T% , F = G% P%

the flag variety. Let

% −1 = P% } T% be the Weyl group of G% (a finite group), WP% :{w ∈ W wPw

put s (F ) = [W : WP% ] . Then we have the following. 1.3.11. Theorem [55] R( P% ) is a free R (G% ) -module of rank s ( F ) . 1.3.12 In the notation of I,3.2.3 let VBG (F : X) be the full subcategory of

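$VB_{\tilde{G}}(\mathcal{F})$ consisting of those $\tilde{V}$ such that $\tilde{Z}$ acts on every fibre of $\tilde{V}$ by the character $\chi$. We shall write $K_n^{\tilde{G}}(\mathcal{F}, \chi)$ for $K_n(VB_{\tilde{G}}(\mathcal{F}, \chi))$ and $R_\chi(\tilde{P})$ for $K_0(\mathrm{Rep}_F^{\chi}(\tilde{P}))$.

1.3.13. Let $\tilde{G}, \tilde{Z}, \tilde{T}, \tilde{P}$ be as in I, 3.2.3, and put $G = \tilde{G}/\tilde{Z}$, $P = \tilde{P}/\tilde{Z}$, $T = \tilde{T}/\tilde{Z}$, $\mathcal{F} = \tilde{G}/\tilde{P} = G/P$. Put $\mathcal{G} = \mathrm{Gal}(F_{sep}/F)$, where $F_{sep}$ is the separable closure of F. Let $c : \mathcal{G} \to G(F_{sep})$ be a 1-cocycle (see [55]) and ${}_c\mathcal{F}$ the twisted form of $\mathcal{F}$ corresponding to c (see [55], [40]). We shall write $K_n^G({}_c\mathcal{F})$ for $K_n(VB_G({}_c\mathcal{F}))$.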

Aderemi Kuku


1.3.14. Let B be a finite-dimensional separable F-algebra, X a smooth projective variety equipped with the action of an affine algebraic group G over F, and ${}_cX$ the twisted form of X via a 1-cocycle c. Let $VB_G({}_cX, B)$ be the category of vector bundles on ${}_cX$ equipped with a left B-module structure. We write $K_n^G({}_cX, B)$ for $K_n(VB_G({}_cX, B))$.

1.3.15. Let G be a finite group, S a G-set, $\underline{S}$ the translation category of S (see I, 2.3.2), and C an exact category. We saw in I, 2.3.3 that the category $[\underline{S}, C]$ of covariant functors $\zeta : \underline{S} \to C$ is an exact category. For all $n \ge 0$, let $K_n^G(S, C)$ be the nth algebraic K-group of the category $[\underline{S}, C]$ with respect to fibre-wise exact sequences.

Recall that if $C = \mathcal{M}(R)$ and $S = G/H$, $H \le G$, then $[\underline{G/H}, \mathcal{M}(R)] \cong \mathcal{M}(RH)$, and so $K_n^G(G/H, \mathcal{M}(R)) \cong K_n(\mathcal{M}(RH)) \cong G_n(RH)$ for all $n \ge 0$. If $C = \mathcal{P}(R)$, then $[\underline{G/H}, \mathcal{P}(R)] \cong \mathcal{P}_R(RH)$, and so $K_n^G(G/H, \mathcal{P}(R)) \cong K_n(\mathcal{P}_R(RH)) \cong G_n(R, H)$ for all $n \ge 0$.

1.3.16. Let S, T be G-sets and C an exact category. Recall from I, 2.4.1 that we obtained an exact category ${}^T[\underline{S}, C]$ as follows: $\mathrm{ob}({}^T[\underline{S}, C]) = \mathrm{ob}[\underline{S}, C]$, while exact sequences in ${}^T[\underline{S}, C]$ are T-exact sequences in $[\underline{S}, C]$. We now denote by $K_n^G(S, C, T)$ the nth algebraic K-group $K_n({}^T[\underline{S}, C])$.

Note that $K_n^G(G/H, \mathcal{P}(R), T)$ (resp. $K_n^G(G/H, \mathcal{M}(R), T)$) is the nth algebraic K-group of $\mathcal{P}_R(RH)$ (resp. $\mathcal{M}(RH)$) with respect to exact sequences that split when restricted to the various subgroups H′ of H such that $T^{H'} \ne \emptyset$.

1.3.17. Let S, T be G-sets and C an exact category. Recall from I, 2.4.3 that we have an exact category $[\underline{S}, C]_T$ of T-projective functors in $[\underline{S}, C]$ with respect to split exact sequences. We write $P_n^G(S, C, T)$ for $K_n([\underline{S}, C]_T)$.

Note that if $T = G/e$, where e is the identity element of G, then $[\underline{G/H}, \mathcal{P}(R)]_{G/e} \cong \mathcal{P}(RH)$ and $K_n([\underline{G/H}, \mathcal{P}(R)]_{G/e}) \cong K_n(RH)$ for all $n \ge 0$.

1.4. Mod-$l^s$ higher K-theory (ordinary and equivariant)

1.4.1. Let C be an exact category, l a rational prime, s a positive integer, and $M^{n+1}_{l^s}$ the (n+1)-dimensional mod-$l^s$ Moore space, i.e. the space obtained from $S^n$ by attaching an (n+1)-cell via a map of degree $l^s$ (see [5], [53]).

If X is an H-space, we write $\pi_{n+1}(X, \mathbb{Z}/l^s)$ for $[M^{n+1}_{l^s}, X]$, the set of homotopy classes of maps from $M^{n+1}_{l^s}$ to X. If C is an exact category and X = BQC, we write $K_n(C, \mathbb{Z}/l^s)$ for $\pi_{n+1}(BQC, \mathbb{Z}/l^s)$ for $n \ge 1$, and $K_0(C, \mathbb{Z}/l^s)$ for $K_0(C) \otimes \mathbb{Z}/l^s$. Call $K_n(C, \mathbb{Z}/l^s)$ the mod-$l^s$ K-theory of C.

1.4.2. Examples

(i) If A is a ring with identity and $C = \mathcal{P}(A)$, the category of finitely generated projective A-modules, write $K_n(A, \mathbb{Z}/l^s)$ for $K_n(\mathcal{P}(A), \mathbb{Z}/l^s)$. Note that $K_n(A, \mathbb{Z}/l^s)$ is also $\pi_n(BGL(A)^+, \mathbb{Z}/l^s)$.

(ii) If Y is a scheme and $C = \mathcal{P}(Y)$, the category of locally free sheaves of $O_Y$-modules, write $K_n(Y, \mathbb{Z}/l^s)$ for $K_n(\mathcal{P}(Y), \mathbb{Z}/l^s)$. Note that for Y = Spec(A), A commutative, we recover $K_n(A, \mathbb{Z}/l^s)$.

(iii) Let A be a Noetherian ring and $\mathcal{M}(A)$ the category of finitely generated A-modules. We write $G_n(A, \mathbb{Z}/l^s)$ for $K_n(\mathcal{M}(A), \mathbb{Z}/l^s)$.

(iv) If Y is a Noetherian scheme and $C = \mathcal{M}(Y)$, the category of coherent sheaves of $O_Y$-modules, write $G_n(Y, \mathbb{Z}/l^s)$ for $K_n(\mathcal{M}(Y), \mathbb{Z}/l^s)$.

(v) Let G be an algebraic group over a field F, X a G-scheme and $C = \mathcal{M}(G, X)$. Write $G_n((G, X), \mathbb{Z}/l^s)$ for $K_n(\mathcal{M}(G, X), \mathbb{Z}/l^s)$.

(vi) If $C = \mathcal{P}(G, X)$, write $K_n((G, X), \mathbb{Z}/l^s)$ for $K_n(\mathcal{P}(G, X), \mathbb{Z}/l^s)$.

(vii) If $C = VB_G({}_cX, B)$, we write $K_n(({}_cX, B), \mathbb{Z}/l^s)$ for $K_n(VB_G({}_cX, B), \mathbb{Z}/l^s)$.

(viii) If $C = \mathcal{M}(G, X, A)$, we write $G_n((G, X, A), \mathbb{Z}/l^s)$ for $K_n(\mathcal{M}(G, X, A), \mathbb{Z}/l^s)$.

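1.5. Profinite Higher K-Theory (Ordinary and Equivariant)

1.5.1. Now put $M^{n+1}_{l^\infty} = \varinjlim_s M^{n+1}_{l^s}$. We define the profinite K-theory of an exact category C by $K_n^{pr}(C, \hat{\mathbb{Z}}_l) = [M^{n+1}_{l^\infty}; BQC]$. We also write $K_n(C, \hat{\mathbb{Z}}_l)$ for $\varprojlim_s K_n(C, \mathbb{Z}/l^s)$.

1.5.2. Examples

(i) If $C = \mathcal{P}(A)$, we write $K_n^{pr}(A, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{P}(A), \hat{\mathbb{Z}}_l)$ and $K_n(A, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{P}(A), \hat{\mathbb{Z}}_l)$, for A any ring.

(ii) If $C = \mathcal{P}(Y)$, we write $K_n^{pr}(Y, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{P}(Y), \hat{\mathbb{Z}}_l)$ and $K_n(Y, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{P}(Y), \hat{\mathbb{Z}}_l)$, Y a scheme.

(iii) If $C = \mathcal{M}(A)$, we write $G_n^{pr}(A, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{M}(A), \hat{\mathbb{Z}}_l)$ and $G_n(A, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{M}(A), \hat{\mathbb{Z}}_l)$, A a Noetherian ring.

(iv) If $C = \mathcal{M}(Y)$, write $G_n^{pr}(Y, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{M}(Y), \hat{\mathbb{Z}}_l)$ and $G_n(Y, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{M}(Y), \hat{\mathbb{Z}}_l)$, Y a Noetherian scheme.

(v) If $C = \mathcal{M}(G, X)$, write $G_n^{pr}((G, X), \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{M}(G, X), \hat{\mathbb{Z}}_l)$ and $G_n((G, X), \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{M}(G, X), \hat{\mathbb{Z}}_l)$, G an algebraic group, X a G-scheme.

(vi) If $C = \mathcal{P}(G, X)$, write $K_n^{pr}((G, X), \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{P}(G, X), \hat{\mathbb{Z}}_l)$ and $K_n((G, X), \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{P}(G, X), \hat{\mathbb{Z}}_l)$, X a G-scheme, G an algebraic group.

(vii) If $C = VB_G({}_cX, B)$, write $K_n^{pr}(({}_cX, B), \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(VB_G({}_cX, B), \hat{\mathbb{Z}}_l)$ and $K_n(({}_cX, B), \hat{\mathbb{Z}}_l)$ for $K_n(VB_G({}_cX, B), \hat{\mathbb{Z}}_l)$ (see earlier definitions).

(viii) If $C = \mathcal{M}(G, X, A)$, write $G_n^{pr}((G, X, A), \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{M}(G, X, A), \hat{\mathbb{Z}}_l)$ and $G_n((G, X, A), \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{M}(G, X, A), \hat{\mathbb{Z}}_l)$ (see earlier definitions).

(ix) If $C = \mathcal{P}(G, X, A)$, write $K_n^{pr}((G, X, A), \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{P}(G, X, A), \hat{\mathbb{Z}}_l)$ and $K_n((G, X, A), \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{P}(G, X, A), \hat{\mathbb{Z}}_l)$.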

Section 2. Induction Techniques for finite group actions; Mackey functors

2.1. Mackey functors: Brief Review

In this section, we briefly introduce Mackey functors in a way relevant to our context. For a more general definition and presentation see [39], [27]. Mackey functors are functors satisfying certain functorial properties, in particular the categorical version of the Mackey subgroup theorem in representation theory. Induction theory has always aimed at computing various invariants of a group G in terms of the corresponding invariants of certain classes of subgroups of G. It turns out that for such a Mackey functor M, one can always find a canonical smallest class $U_M$ of subgroups of G such that the values of M on any G-set can be computed from their restrictions to the full subcategory of G-sets of the form G/H, $H \in U_M$. For more details see [39], [27].

2.1.1. Definition. Let G be a finite group, GSet the category of (finite) G-sets. A pair $(M_*, M^*)$ of functors GSet → R-Mod is a Mackey functor if

(i) $M_*$: GSet → R-Mod is covariant, $M^*$: GSet → R-Mod is contravariant, and $M_*(X) = M^*(X) = M(X)$ for any G-set X;

(ii) $M^*$ transforms finite disjoint unions in GSet into finite products in R-Mod, i.e. the embeddings $X_i \to \dot\bigcup_{i=1}^{n} X_i$ induce an isomorphism $M(X_1 \,\dot\cup\, X_2 \,\dot\cup \cdots \dot\cup\, X_n) \cong M(X_1) \times M(X_2) \times \cdots \times M(X_n)$;

(iii) for any pull-back diagram in GSet [diagram not reproduced here], the associated diagram of induced maps commutes (Mackey subgroup property).

2.1.2. A morphism (or natural transformation) of Mackey functors $\tau : M \to N$ consists of a family of homomorphisms $\tau(X) : M(X) \to N(X)$, indexed by the objects X in GSet, such that $\tau$ is a natural transformation of $M_*$ as well as of $M^*$, i.e. such that for any G-map $f : X \to Y$ the diagrams [diagrams not reproduced here] are commutative.

2.1.3. A pairing $M \times N \to L$ of two Mackey functors M and N into a third one, called L, is a family of R-bilinear maps $M(X) \times N(X) \to L(X)$, indexed by the G-sets X, such that for any G-map $f : X \to Y$ the following diagrams commute [diagrams not reproduced here] (the last two being related to Frobenius reciprocity).

2.1.4. A Green functor is a Mackey functor $\mathcal{G}$: GSet → R-Mod together with a pairing $\mathcal{G} \times \mathcal{G} \to \mathcal{G}$ such that, for each G-set X, the R-bilinear map $\mathcal{G}(X) \times \mathcal{G}(X) \to \mathcal{G}(X)$ turns $\mathcal{G}(X)$ into an R-algebra with unit $1_{\mathcal{G}(X)}$, and such that for each G-map $f : X \to Y$ the equation $f^*(1_{\mathcal{G}(Y)}) = 1_{\mathcal{G}(X)}$ holds.


2.1.5. If $\mathcal{G}$ is a Green functor, M a Mackey functor and $\mathcal{G} \times M \to M$ a pairing such that $1_{\mathcal{G}(X)}$ acts as the identity on M(X), we shall call M, with respect to this pairing, a $\mathcal{G}$-module.

2.2. Higher K-Theory as Mackey Functors (For Finite Group Actions)

2.2.1. Let G be a finite group, S, T G-sets, and C an exact category. In I, Section 2 and Section 1 of this chapter, we obtained three equivariant exact categories with associated higher K-groups as follows:

(1) $K_n^G(S, C)$ is the nth algebraic K-group ($n \ge 0$) of the exact category $[\underline{S}, C]$ with respect to fibre-wise exact sequences. Recall that if $C = \mathcal{P}(R)$ and S = G/H, then $K_n^G(G/H, \mathcal{P}(R)) \cong G_n(R, H)$; that if $C = \mathcal{M}(R)$, then $K_n^G(G/H, \mathcal{M}(R)) \cong G_n(RH)$; and that when R is regular, $G_n(R, H) \cong G_n(RH)$.

(2) $K_n^G(S, C, T)$ is the nth algebraic K-group ($n \ge 0$) of the exact category ${}^T[\underline{S}, C]$ with respect to T-exact sequences in $[\underline{S}, C]$. Note that when S = G/H and $C = \mathcal{M}(R)$ (resp. $C = \mathcal{P}(R)$), then $K_n^G(G/H, \mathcal{M}(R), T)$ (resp. $K_n^G(G/H, \mathcal{P}(R), T)$) is the nth algebraic K-group of the exact category $\mathcal{M}(RH)$ (resp. $\mathcal{P}_R(RH)$) with respect to exact sequences which split when restricted to the various subgroups H′ of H such that $T^{H'} \ne \emptyset$ (recall that $T^{H'} = \{t \in T \mid gt = t \text{ for all } g \in H'\}$).

(3) $P_n^G(S, C, T)$ is the nth algebraic K-group ($n \ge 0$) of the exact category $[\underline{S}, C]_T$ with respect to split exact sequences.

(4) If S = G/H and $C = \mathcal{P}(R)$ (resp. $\mathcal{M}(R)$), then $P_n^G(G/H, \mathcal{P}(R), T)$ (resp. $P_n^G(G/H, \mathcal{M}(R), T)$) is the nth algebraic K-group of the exact subcategory of $\mathcal{P}_R(RH)$ (resp. $\mathcal{M}(RH)$) consisting of objects that are relatively H′-projective for subgroups H′ of H such that $T^{H'} \ne \emptyset$, with respect to split exact sequences.

Note in particular that $P_n^G(G/H, \mathcal{P}(R), G/e) = K_n(RH)$. For details and properties of this construction see [39], [10]. We now have the following.

2.2.2. Theorem [10] [39]. Let G be a finite group, T a G-set, C an exact category, and Ab the category of Abelian groups. Then $K_n^G(-, C)$, $K_n^G(-, C, T)$, $P_n^G(-, C, T)$: GSet → Ab are Mackey functors for all $n \ge 0$. If there is a pairing $C \times C \to C$ that is naturally associative and commutative and contains a natural unit, then $K_0^G(-, C)$, $K_0^G(-, C, T)$: GSet → Ab are Green functors; $K_n^G(-, C)$ is a unitary $K_0^G(-, C)$-module, and $K_n^G(-, C, T)$ and $P_n^G(-, C, T)$ are $K_0^G(-, C, T)$-modules. For a proof see [10], [39].

2.2.3. Remarks

(1) It is well known that the Burnside functor Ω: GSet → Ab is a Green functor, that any Mackey functor M: GSet → Ab is an Ω-module, and that any Green functor is an Ω-algebra (see [39], [7], [27]). Hence the above K-functors $K_n^G(-, C, T)$, $P_n^G(-, C, T)$ and $K_n^G(-, C)$ are Ω-modules, and $K_0^G(-, C, T)$ and $K_0^G(-, C)$ are Ω-algebras.

(2) Let M: GSet → Ab be any Mackey functor and X a G-set. Define $K_M(X)$ as the kernel of M(G/G) → M(X) and $I_M(X)$ as the image of M(X) → M(G/G). An important induction result is that $|G| \cdot M(G/G) \subseteq K_M(X) + I_M(X)$ for any Mackey functor M and G-set X ([39], [7]). This result also applies to all the K-theoretic functors defined above.

(3) If M: GSet → Ab is any Mackey functor and X a G-set, define a Mackey functor $M_X$: GSet → Ab by $M_X(Y) = M(X \times Y)$. The projection map pr: $X \times Y \to Y$ defines a natural transformation $\theta_X : M_X \to M$, where $\theta_X(Y) = pr_* : M(X \times Y) \to M(Y)$. M is said to be X-projective if $\theta_X$ is split surjective (see [39], [27]). Now define the defect base $D_M$ of M by $D_M = \{H \le G \mid X^H \ne \emptyset\}$, where X is a G-set (called the defect set of M) such that M is Y-projective iff there exists a G-map $f : X \to Y$ (see [39]). If M is a module over a Green functor $\mathcal{G}$, then M is X-projective iff $\mathcal{G}$ is X-projective iff the induction map $\mathcal{G}(X) \to \mathcal{G}(G/G)$ is surjective. In general, proving induction results reduces to determining G-sets X for which $\mathcal{G}(X) \to \mathcal{G}(G/G)$ is surjective, and this in turn reduces to computing $D_{\mathcal{G}}$. Thus, one could apply induction techniques to obtain results on higher K-groups which are modules over the Green functors $K_0^G(-, C)$ and $K_0^G(-, C, Y)$ for suitable exact categories C, e.g. $C = \mathcal{P}(R)$ or $\mathcal{M}(R)$ (see [39]).

(4) One can show via general induction theory principles that for suitably chosen C, all the higher K-functors $K_n^G(-, C)$, $K_n^G(-, C, T)$ are "hyperelementary computable"; see [39], [24] or below.

2.3. Some Consequent Results on Higher K-Theory of Grouprings

In this subsection, we provide some specific results on higher K-theory of grouprings that are consequences of the techniques briefly outlined in 2.1 and 2.2. For proofs of these results, see [39].

2.3.1. Theorem [39] [10]. Let G be a finite group, T a G-set, and k a field of characteristic p, p ≠ 0. Then there exists an isomorphism of Mackey functors

$$\mathbb{Z}\left(\tfrac{1}{p}\right) \otimes P_n^G(-, \mathcal{P}(k), T) \cong \mathbb{Z}\left(\tfrac{1}{p}\right) \otimes K_n^G(-, \mathcal{M}(k), T) : \mathrm{GSet} \to \mathrm{Ab}.$$

Hence for all $n \ge 0$, the Cartan map $K_n(kG) \to G_n(kG)$ induces isomorphisms

$$\mathbb{Z}\left(\tfrac{1}{p}\right) \otimes K_n(kG) \to \mathbb{Z}\left(\tfrac{1}{p}\right) \otimes G_n(kG).$$

Proof: see [39], [9]. We also have the following consequences of 2.3.1.

2.3.2. Theorem [39] [9]. Let p be a rational prime, k a field of characteristic p, G a finite group. Then for all $n \ge 1$, $K_{2n}(kG)$ is a finite p-group. The Cartan homomorphism $\varphi_{2n-1} : K_{2n-1}(kG) \to G_{2n-1}(kG)$ is surjective, and $\mathrm{Ker}\,\varphi_{2n-1}$ is the Sylow p-subgroup of $K_{2n-1}(kG)$.

2.3.3. To be able to state the next results, we need an alternative characterization of Mackey functors as functors on a category of subgroups of G rather than on G-sets. Let δG denote the subgroup category whose objects are the various subgroups of G, with $\delta G(H_1, H_2) = \{(g, H_1, H_2) \mid g \in G,\ gH_1g^{-1} \le H_2\}$ and composition of $(g, H_1, H_2) \in \delta G(H_1, H_2)$ and $(h, H_2, H_3) \in \delta G(H_2, H_3)$ defined by $(h, H_2, H_3) \circ (g, H_1, H_2) = (hg, H_1, H_3)$, so that $(e, H, H) \in \delta G(H, H)$ is the identity, where $H \le G$ and $e \in G$ is the trivial element. There is a canonical functor C, the coset functor, from δG into GSet, which sends H to G/H and associates with each morphism $(g, H_1, H_2) \in \delta G(H_1, H_2)$ the G-map $\Psi_{g^{-1}} : G/H_1 \to G/H_2$, $xH_1 \mapsto xg^{-1}H_2$.

If M: GSet → Ab is a Mackey functor, then the composite $\hat{M} = M \circ C : \delta G \to \mathrm{Ab}$ is a bifunctor which can be shown to describe situations similar to the Mackey subgroup theorem (see [11] for details). Call $\hat{M}$ a G-functor, as christened by J. A. Green (see [11]). It can be shown that there is a one-to-one correspondence between the G-functors $\hat{M} : \delta G \to \mathrm{Ab}$ and Mackey functors M: GSet → Ab. So we can identify any Mackey functor M: GSet → Ab with $\hat{M} : \delta G \to \mathrm{Ab}$ and thus sometimes write M(H) instead of M(G/H) (see [39]). To be able to state the next result we need the following definition.
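As a small sanity check on the coset functor of 2.3.3, the following sketch works in the symmetric group on three letters. It confirms that $xH_1 \mapsto xg^{-1}H_2$ is well defined exactly when $gH_1g^{-1} \le H_2$ is available, and that composition in δG goes over to composition of G-maps. The choice of S3 and all function names (`psi`, `is_morphism`, `check_functoriality`) are illustrative assumptions, not notation from the text.

```python
from itertools import permutations

# Elements of S3 as tuples p, where p[i] is the image of i (illustrative choice of G).
ELEMENTS = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(p, q):
    """Group product p*q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def generated(gens):
    """Subgroup generated by gens (finite group, so closure under products suffices)."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

# Every subgroup of S3 is generated by at most two elements.
SUBGROUPS = sorted({generated([a, b]) for a in ELEMENTS for b in ELEMENTS},
                   key=lambda H: (len(H), sorted(H)))

def coset(x, H):
    """Left coset xH."""
    return frozenset(mul(x, h) for h in H)

def is_morphism(g, H1, H2):
    """(g, H1, H2) lies in deltaG(H1, H2) iff g H1 g^{-1} <= H2."""
    return all(mul(mul(g, h), inv(g)) in H2 for h in H1)

def psi(g, H1, H2):
    """The map G/H1 -> G/H2, xH1 -> x g^{-1} H2, as a dict on cosets.
    Raises ValueError if the rule is not well defined on cosets."""
    table = {}
    for x in ELEMENTS:
        source, target = coset(x, H1), coset(mul(x, inv(g)), H2)
        if table.setdefault(source, target) != target:
            raise ValueError("map is not well defined")
    return table

def check_functoriality():
    """Check psi(hg) = psi(h) o psi(g) on all composable pairs; return the number checked."""
    checked = 0
    for H1 in SUBGROUPS:
        for H2 in SUBGROUPS:
            for H3 in SUBGROUPS:
                for g in ELEMENTS:
                    if not is_morphism(g, H1, H2):
                        continue
                    for h in ELEMENTS:
                        if not is_morphism(h, H2, H3):
                            continue
                        first, second = psi(g, H1, H2), psi(h, H2, H3)
                        composite = psi(mul(h, g), H1, H3)
                        assert all(second[first[c]] == composite[c] for c in first)
                        checked += 1
    return checked
```

Calling `check_functoriality()` runs through every composable pair of morphisms of δG for this group; a failure would surface as an `AssertionError` or a `ValueError`.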

Definition 2.3.4. Let G be a finite group, U a collection of subgroups of G closed under subgroups and isomorphic images, and A a commutative ring with identity. Then a Mackey functor M: δG → A-Mod is said to be U-computable if the restriction maps $M(G) \to \prod_{H \in U} M(H)$ induce an isomorphism $M(G) \cong \varprojlim_{H \in U} M(H)$, where $\varprojlim_{H \in U} M(H)$ is the subgroup of all $(x_H) \in \prod_{H \in U} M(H)$ such that for any H, H′ ∈ U and g ∈ G with $gH'g^{-1} \le H$, and $\varphi : H' \to H$ given by $h \mapsto ghg^{-1}$, we have $M(\varphi)(x_H) = x_{H'}$.

Now, if A is a commutative ring with identity and M: δG → Ab a Mackey functor, then A ⊗ M: δG → A-Mod is also a Mackey functor, where (A ⊗ M)(H) = A ⊗ M(H). Now let P be a set of rational primes, $\mathbb{Z}_P = \mathbb{Z}\left[\frac{1}{q} \mid q \notin P\right]$, C(G) the collection of all cyclic subgroups of G, and $h_P C(G)$ the collection of all P-hyperelementary subgroups of G, i.e.

$$h_P C(G) = \{H \le G \mid \text{there exists } H' \trianglelefteq H,\ H' \in C(G),\ H/H' \text{ is a } p\text{-group for some } p \in P\}.$$

Recall that if R is a Dedekind domain with quotient field F and G a finite group, we define for all $n \ge 0$

$$SK_n(RG) := \mathrm{Ker}(K_n(RG) \to K_n(FG)), \qquad SG_n(RG) := \mathrm{Ker}(G_n(RG) \to G_n(FG)).$$

We now have the following result.

2.3.5. Theorem [39] [24]. Let R be a Dedekind ring, G a finite group, and M any of the modules $K_n(R-)$, $G_n(R-)$, $SG_n(R-)$ over $G_0(R-)$. Then $\mathbb{Z}_P \otimes M$ is $h_P(C(G))$-computable.
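To make the notion of a P-hyperelementary subgroup concrete, here is a minimal sketch that tests the defining condition (a cyclic normal subgroup with quotient a p-group for some p in P) by brute force in the symmetric group on three letters. The group, the helper names and the representation of permutations as tuples are all assumptions made for illustration only.

```python
from itertools import permutations

# S3 as tuples p with p[i] the image of i (illustrative choice of G).
ELEMENTS = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(p, q):
    """Group product p*q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def generated(gens):
    """Subgroup generated by gens."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

SUBGROUPS = {generated([a, b]) for a in ELEMENTS for b in ELEMENTS}

def is_cyclic(H):
    return any(generated([h]) == H for h in H)

def is_normal_in(K, H):
    """K is a subgroup of H that is stable under conjugation by H."""
    return K <= H and all(mul(mul(h, k), inv(h)) in K for h in H for k in K)

def is_power_of(p, m):
    """True if m = p^e for some e >= 0."""
    while m % p == 0:
        m //= p
    return m == 1

def is_hyperelementary(H, primes):
    """H has a cyclic normal subgroup K with H/K a p-group for some p in primes."""
    return any(
        is_cyclic(K) and is_normal_in(K, H) and is_power_of(p, len(H) // len(K))
        for K in SUBGROUPS
        for p in primes
    )

def hyperelementary_subgroups(primes):
    return [H for H in SUBGROUPS if is_hyperelementary(H, primes)]
```

In this example the whole group is 2-hyperelementary (the cyclic subgroup of order 3 is normal of index 2) but not 3-hyperelementary, since no subgroup of order 2 is normal.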

CHAPTER III. SOME RESULTS ON THE ACTION OF FINITE AND ALGEBRAIC GROUPS

Section 1. Some results on $K_n(RG)$, $G_n(RG)$, $Cl_n(RG)$, $SK_n(RG)$, $SG_n(RG)$, $n \ge 0$ (G finite) and consequences for some infinite groups

1.1. On $K_n(RG)$, $G_n(RG)$, $SK_n(RG)$, $SG_n(RG)$, G finite

1.1.1. Let R be a Dedekind ring with quotient field F, G a finite group. Recall that $K_n(RG) := K_n(\mathcal{P}(RG))$ and $G_n(RG) := K_n(\mathcal{M}(RG))$, where we have earlier identified $\mathcal{P}(RG)$ and $\mathcal{M}(RG)$ as categories of G-representations. Hence the study of $K_n(RG)$, $G_n(RG)$ belongs to integral representation theory. Define

$$SK_n(RG) := \mathrm{Ker}(K_n(RG) \to K_n(FG)), \qquad SG_n(RG) := \mathrm{Ker}(G_n(RG) \to G_n(FG)) \quad \text{for all } n \ge 0.$$

Define $Cl_n(RG) := \mathrm{Ker}\big(SK_n(RG) \to \bigoplus_{p} SK_n(\hat{R}_p G)\big)$, where R is the ring of integers in a number field F, p runs through all prime ideals of R, and $\hat{R}_p$ is the completion of R at p. Call $Cl_n(RG)$ the n-dimensional (higher) class group of RG. Note that $Cl_0(RG)$ coincides with the usual class group $Cl(RG)$ of RG (see [39], [6]). We shall provide in this section some important results on $K_n(RG)$, $G_n(RG)$, $SK_n(RG)$, $SG_n(RG)$ and $Cl_n(RG)$ which constitute the core results of studies on higher K-theory of integral grouprings.

1.1.2. Remarks. We shall be interested mostly in R being the ring of integers in a number field or a p-adic field. Note that the phenomenal growth of K-theory has been due partly to the fact that the classical K-groups of grouprings (i.e. the above groups for n = 0, 1, 2) house various interesting topological/geometric invariants. For example:

(1) $Cl_0(RG) = Cl(RG)$ houses Swan-Wall invariants (see [74]).

(2) $K_1(RG)$, $SK_1(RG)$ house Whitehead torsion and are also useful in the classification of manifolds (see [54], [51]).

(3) $K_2(RG)$ helps in the understanding of pseudo-isotopy of manifolds (see [15], [39]).

First we give finiteness results on $K_n(RG)$, $G_n(RG)$, $SK_n(RG)$, $SG_n(RG)$, $Cl_n(RG)$. Note that these results are proved in [39] in the generality of arbitrary R-orders Λ in a semi-simple F-algebra Σ, and so apply in particular to Λ = RG.

1.1.3. Theorem [39] [28] [37]. Let R be the ring of integers in a number field F, G a finite group. Then for all $n \ge 1$:

$K_n(RG)$ is a finitely generated Abelian group and $K_{2n}(RG)$ is finite.

$SK_n(RG)$ is a finite group. Hence $Cl_n(RG)$ is finite. $SK_n(\hat{R}_p G)$ is also finite.

If G is a finite p-group, then $SK_n(RG)$, $SK_n(\hat{R}_p G)$ are finite p-groups.

(See [39], [28], [37] for the proofs.)

1.1.4. Theorem [39] [47] [24]. Let R be the ring of integers in a number field F, G a finite group. Then for all $n \ge 1$:

$G_n(RG)$ is a finitely generated Abelian group;

$SG_n(RG) = 0$;

$SG_n(\hat{R}_p G) = 0$, where p is a prime ideal of R and $\hat{R}_p$ is the completion of R at p.

For proofs see [39], [40], [24]. Next we present a result on the ranks of $K_n(RG)$, $G_n(RG)$.

1.1.5. Theorem [39] [31]. Let R be the ring of integers in a number field F, G a finite group, and Γ a maximal R-order containing RG. Then for all $n \ge 1$, rank $K_n(RG)$ = rank $G_n(RG)$ = rank $K_n(\Gamma)$. For a proof see [39] or [31].

Our next aim is to present a decomposition of $G_n(RG)$, $n \ge 0$, for G a finite Abelian group, and to extend these results to some non-Abelian groups, e.g. quaternion and dihedral groups. First we have to develop some notation.

1.1.6. Let R be a left Noetherian ring and C a finite cyclic group of order n, generated by t, say. We write $C = \langle t \rangle$. If $f : \mathbb{Z}C \to \mathbb{C}$ is a ring homomorphism which is injective when restricted to C, then ker f is generated by $\Phi_n(t)$, the nth cyclotomic polynomial. The ideal $\Phi_n(t)\mathbb{Z}C$ is independent of the choice of generator. Define $\mathbb{Z}(C) = \mathbb{Z}C/\Phi_n(t)\mathbb{Z}C$ and $R(C) = RC/\Phi_n(t)RC$ for any Noetherian ring R. Then $\mathbb{Z}(C)$ is an integral domain isomorphic to $\mathbb{Z}[\zeta_n]$, where $\zeta_n$ is a primitive nth root of 1. We identify the field of fractions $\mathbb{Q}(C)$ of $\mathbb{Z}(C)$ with $\mathbb{Q}(\zeta_n)$. We write $R\langle C \rangle = R(C)\left(\frac{1}{n}\right) = R\left[\zeta_n, \frac{1}{n}\right]$. Note that $R(C) = R \otimes \mathbb{Z}(C)$ and $R\langle C \rangle = R \otimes \mathbb{Z}\langle C \rangle$.

Now let G be a finite Abelian group and X(G) the set of cyclic quotients of G. Then $\mathbb{Q}G \cong \prod_{C \in X(G)} \mathbb{Q}(C)$. If Γ is a maximal order in $\mathbb{Q}G$ containing $\mathbb{Z}G$, then $\Gamma \cong \prod_{C \in X(G)} \mathbb{Z}(C)$. So, for any ring R, $R \otimes \Gamma = \prod_{C \in X(G)} R(C)$ and $R \otimes A = \prod_{C \in X(G)} R\langle C \rangle$, where $A = \prod_{C \in X(G)} \mathbb{Z}\langle C \rangle$.
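For a cyclic group G of order n, the cyclic quotients correspond to the divisors d of n, and the decomposition of $\mathbb{Q}G$ into the fields $\mathbb{Q}(\zeta_d)$ mirrors the factorization $t^n - 1 = \prod_{d \mid n} \Phi_d(t)$. The sketch below computes cyclotomic polynomials by exact integer division and checks this factorization. Polynomials are represented as coefficient lists, lowest degree first; the function names are illustrative assumptions, not part of the text.

```python
def poly_mul(a, b):
    """Product of two integer polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_divexact(a, b):
    """Quotient a / b for a monic divisor b; raises ValueError if the remainder is nonzero."""
    a = list(a)
    quotient = [0] * (len(a) - len(b) + 1)
    for k in range(len(quotient) - 1, -1, -1):
        quotient[k] = a[k + len(b) - 1]  # b is monic, so no division is needed
        for j, bj in enumerate(b):
            a[k + j] -= quotient[k] * bj
    if any(a):
        raise ValueError("division is not exact")
    return quotient

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def cyclotomic(n):
    """Phi_n(t) = (t^n - 1) / prod of Phi_d(t) over proper divisors d of n."""
    numerator = [-1] + [0] * (n - 1) + [1]
    for d in divisors(n):
        if d < n:
            numerator = poly_divexact(numerator, cyclotomic(d))
    return numerator

def product_over_divisors(n):
    """Product of Phi_d(t) over all divisors d of n; should equal t^n - 1."""
    result = [1]
    for d in divisors(n):
        result = poly_mul(result, cyclotomic(d))
    return result
```

The degrees of the factors add up to n, which matches the dimension count $\sum_{d \mid n} [\mathbb{Q}(\zeta_d) : \mathbb{Q}] = n = \dim_{\mathbb{Q}} \mathbb{Q}G$ in the cyclic case.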

1.1.7. Theorem [39] [77]. Let G be a finite Abelian group and R a Noetherian ring. Then, for all $n \ge 0$,

$$G_n(RG) \cong G_n(R \otimes A) \cong \bigoplus_{C \in X(G)} G_n(R\langle C \rangle).$$

For a proof see [39] or [77].

1.1.8. Our next aim is to obtain some extensions of Theorem 1.1.7 to some non-Abelian groups, e.g. dihedral and quaternion groups. We first give some definitions and fix notation. Let R be a ring and G a group acting on R by ring automorphisms. Call R a G-ring. The twisted groupring R # G is defined as the R-module $R \otimes \mathbb{Z}G$ with elements a ⊗ g (a ∈ R, g ∈ G) denoted a # g and multiplication defined by (a # g) ⋅ (a′ # g′) = a g(a′) # gg′. If G acts trivially on R, then R # G = RG. A subgroup G′ of G is cocyclic if G/G′ is cyclic.
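The multiplication rule (a # g)(a′ # g′) = a g(a′) # gg′ can be tried out on a small case. The sketch below assumes, purely for illustration, that R is the ring of Gaussian integers (stored as pairs of ordinary integers) and that G is the group of order 2 whose non-trivial element acts by complex conjugation; elements of R # G are dictionaries from group elements to coefficients. None of these names come from the text.

```python
# Gaussian integers as pairs (re, im); the group G = {0, 1} is Z/2 written additively,
# with 1 acting on R by complex conjugation (an illustrative choice of G-ring).
ZERO = (0, 0)

def r_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def r_mul(a, b):
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def act(g, a):
    """Action of g on a: identity for g = 0, complex conjugation for g = 1."""
    return a if g == 0 else (a[0], -a[1])

def elem(a0, a1):
    """The element a0 # 1 + a1 # gamma of R # G."""
    return {0: a0, 1: a1}

def twisted_mul(u, v):
    """Product in R # G, extending (a # g)(a' # g') = a g(a') # gg' bilinearly."""
    out = {0: ZERO, 1: ZERO}
    for g, a in u.items():
        for g2, a2 in v.items():
            out[(g + g2) % 2] = r_add(out[(g + g2) % 2], r_mul(a, act(g, a2)))
    return out

ONE = elem((1, 0), ZERO)    # 1 # 1
I = elem((0, 1), ZERO)      # i # 1
GAMMA = elem(ZERO, (1, 0))  # 1 # gamma
```

With these choices the generator of G and the element i anticommute: `twisted_mul(GAMMA, I)` has coefficient −i at γ, while `twisted_mul(I, GAMMA)` has coefficient i, which shows how the action enters the product when G does not act trivially.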


1.1.9. Theorem [39] [77]. Let $H = G \rtimes G_1$ be the semi-direct product of G and $G_1$, where G is a finite Abelian group and $G_1$ any finite group such that the action of $G_1$ on G stabilizes every cocyclic subgroup of G, so that $G_1$ acts on each cyclic quotient C of G. Let R be a Noetherian ring. Then for all $n \ge 0$,

$$G_n(RH) \cong \bigoplus_{C \in X(G)} G_n(R\langle C \rangle \# G_1).$$

Remarks. This leads to the following result on the dihedral group. Note that $D_{2n} = G \rtimes G_1$, where G is a cyclic group of order n and $G_1$ is a group of order 2.

1.1.10. Theorem. Let $D_{2n}$ be the dihedral group of order 2n. Then

$$G_n(\mathbb{Z}D_{2n}) \cong \bigoplus_{d \mid n,\ d > 2} G_n\left(\mathbb{Z}\left[\zeta_d, \tfrac{1}{d}\right]_+\right) \oplus G_n\left(\mathbb{Z}\left[\tfrac{1}{2}\right]\right)^{\varepsilon} \oplus G_n(\mathbb{Z}),$$

where $\varepsilon = 1$ if n is odd and $\varepsilon = 2$ if n is even, and $\mathbb{Z}\left[\zeta_d, \frac{1}{d}\right]_+$ is the ring of integers, with $\frac{1}{d}$ adjoined, in $\mathbb{Q}(\zeta_d)_+ = \mathbb{Q}(\zeta_d + \zeta_d^{-1})$, the maximal real subfield of the cyclotomic field $\mathbb{Q}(\zeta_d)$.

1.1.11. We next consider the generalized quaternion group of order $4 \cdot 2^s$. In general, let R be a commutative G-ring with identity (G a finite group) and $c : G \times G \to R^*$ a normalized 2-cocycle with values in $R^*$ (see [76]). Then the crossed-product ring $R \#_c G$ is the R-module $R \otimes \mathbb{Z}G$ with multiplication given by $(a \# g)(a' \# g') = a\,g(a')\,c(g, g') \# gg'$. So $R \#_c G$ is an associative ring with identity $1 \# 1_G$. If c = 1, then we obtain the twisted groupring. Now let H be a generalized quaternion group of order $4 \cdot 2^s$. Then H has a presentation

$$H = \langle x, y \mid x^{2^s} = y^2,\ y^4 = 1,\ yxy^{-1} = x^{-1} \rangle.$$

Let $G_1 = \{1, \gamma\}$ be a two-element group acting on $\mathbb{Q}[\zeta_{2^{s+1}}]$ by complex conjugation, with fixed field $\mathbb{Q}(\zeta_{2^{s+1}})_+ = \mathbb{Q}(\zeta_{2^{s+1}} + \zeta_{2^{s+1}}^{-1})$, the maximal real subfield of $\mathbb{Q}(\zeta_{2^{s+1}})$. Let $c : G_1 \times G_1 \to \mathbb{Q}(\zeta_{2^{s+1}})^*$ be the normalized cocycle given by $c(\gamma, \gamma) = -1$, let $\Sigma = \mathbb{Q}(\zeta_{2^{s+1}}) \#_c G_1$ be the crossed-product algebra, and let Γ be a maximal $\mathbb{Z}$-order in Σ. Then we have the following result.

1.1.12. Theorem. In the notation of 1.1.11, we have

$$G_n(\mathbb{Z}H) \cong \bigoplus_{j=0}^{s} G_n\left(\mathbb{Z}\left[\zeta_{2^j}, \tfrac{1}{2^j}\right]_+\right) \oplus G_n\left(\Gamma\left[\tfrac{1}{2^{s+1}}\right]\right) \oplus G_n\left(\mathbb{Z}\left[\tfrac{1}{2}\right]\right)^2.$$

Note: It is still an open problem to obtain a decomposition for $G_n(\mathbb{Z}H)$ for an arbitrary finite group H.

1.1.13. Recall that if R is the ring of integers in a number field F and G a finite group, then the higher-dimensional class group of RG is defined by $Cl_n(RG) := \mathrm{Ker}\big(K_n(RG) \to \bigoplus_p K_n(\hat{R}_p G)\big)$ for all $n \ge 0$, where $\hat{R}_p$ is the completion of R at p and p runs through all the prime ideals of R. Note that $Cl_0(RG)$ coincides with the usual class group $Cl(RG)$, which houses some topological/geometric invariants, e.g. Swan-Wall invariants. Moreover, $Cl_1(RG)$ is intimately connected with Whitehead torsion. It is also classical that $Cl_0(RG)$, $Cl_1(RG)$ are finite groups. We now present the following results.

1.1.14. Theorem [39] [18]. (1) Let R be the ring of integers in a number field, G a finite group. Then $Cl_n(RG)$ is a finite group for all $n \ge 1$. (2) For all $n \ge 1$, the only possible p-torsion in $Cl_{2n-1}(RG)$ is for those primes p dividing the order of G. For the proof of (1) see [39] or [28]. For the proof of (2) see [39] or [18].

1.1.15. Remarks. Observe that Theorem 1.1.14 (2) above was stated for odd-dimensional class groups. Proving an analogous result for even-dimensional class groups is still open. We also present the following result.

1.1.16. Theorem [39] [18]. Let $S_r$ be the symmetric group of degree r and let $n \ge 0$. Then $Cl_{4n+1}(\mathbb{Z}S_r)$ is a finite 2-torsion group, and the only possible odd torsion in $Cl_{4n-1}(\mathbb{Z}S_r)$ is for those odd primes p such that $\frac{p-1}{2}$ divides n.

For a proof see [39] or [18].

1.2. Consequences for Some Infinite Groups

In this subsection, we indicate how results on $K_n(RG)$, $G_n(RG)$ can be extended to yield results on $K_n(RV)$, $G_n(RV)$, where RV is the group ring of an infinite group V.

1.2.1. Let α be an automorphism of a ring A with identity. We shall write $A_\alpha[T] = A_\alpha[t, t^{-1}]$ for the α-twisted Laurent series ring over A. Here $T = \langle t \rangle$ is the infinite cyclic group generated by t. Note that additively $A_\alpha[T] = A[T] = A[t, t^{-1}]$, with multiplication given by $(at^i) \cdot (bt^j) = a\,\alpha^{-i}(b)\,t^{i+j}$ for a, b ∈ A.

Now let α be an automorphism of a finite group G and R the ring of integers in a number field F. We also denote by α the automorphism induced on RG by α. Then $(RG)_\alpha[T] = RV$, where $V = G \rtimes_\alpha T$ and the action of the infinite cyclic group $T = \langle t \rangle$ on G is given by $\alpha(g) = tgt^{-1}$ for all g ∈ G. V is called a virtually infinite cyclic group, and the K-theory of RV is fundamental to the Farrell-Jones conjecture, which asserts that the K-theory of an arbitrary discrete group should have as "building blocks" the K-theory of finite groups and virtually infinite cyclic groups. Note that an R-automorphism of RG extends to an F-automorphism of FG, which we also denote by α. We now have the following result.

1.2.2. Theorem [41]. Let $V = G \rtimes_\alpha T$ be a virtually infinite cyclic group, where G is a finite group, α ∈ Aut(G), and the action of T on G is given by $\alpha(g) = tgt^{-1}$. Then

(1) $G_n(RV)$ is a finitely generated Abelian group for all $n \ge 1$;

(2) $\mathbb{Q} \otimes K_n(RV) \cong \mathbb{Q} \otimes G_n(RV) \cong \mathbb{Q} \otimes K_n(FV)$ for all $n \ge 2$.

Proof: See [41]. Note that the proof of the theorem above, like many others, is in the generality of replacing RG by an arbitrary R-order Λ in a semi-simple F-algebra Σ. One then deduces the above result for Λ = RG, Σ = FG.

We also have the following result, which shows that $G_{2(m+1)}(\Lambda_\alpha[T])$ is actually finite for all odd positive integers m where F is a totally real field, when Λ = RG.

1.2.3. Theorem [41]. Let R be the ring of integers in a number field F, G a finite group, $T = \langle t \rangle$ the infinite cyclic group and $V = G \rtimes_\alpha T$, where α ∈ Aut(G) and the action of T on G is given by $\alpha(g) = tgt^{-1}$. Then for all odd positive integers m, $G_{2m+1}(RV)$ is a finite group when F is a totally real field. Proof: See [41].

1.3. Profinite higher K-theory of RG, RV

1.3.1. Recall that in II, 1.5, we defined the profinite K-theory of an exact category C by $K_n^{pr}(C, \hat{\mathbb{Z}}_l) := [M^{n+1}_{l^\infty}; BQC]$, where $M^{n+1}_{l^\infty} = \varinjlim_s M^{n+1}_{l^s}$. We also defined $K_n(C, \hat{\mathbb{Z}}_l) := \varprojlim_s K_n(C, \mathbb{Z}/l^s)$. See [39], [33].

Recall also that for any ring A with identity we write $K_n^{pr}(A, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{P}(A), \hat{\mathbb{Z}}_l)$, and for any Noetherian ring A we write $G_n^{pr}(A, \hat{\mathbb{Z}}_l)$ for $K_n^{pr}(\mathcal{M}(A), \hat{\mathbb{Z}}_l)$. We call $K_n^{pr}(A, \hat{\mathbb{Z}}_l)$ (resp. $G_n^{pr}(A, \hat{\mathbb{Z}}_l)$) the profinite K-theory (resp. G-theory) of A. Similarly we write $K_n(A, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{P}(A), \hat{\mathbb{Z}}_l)$ and $G_n(A, \hat{\mathbb{Z}}_l)$ for $K_n(\mathcal{M}(A), \hat{\mathbb{Z}}_l)$. We shall be interested in the cases A = RG and A = RV, where R is a Dedekind domain (e.g. R = ring of integers in a number field or p-adic field F) and $V = G \rtimes_\alpha T$, where α ∈ Aut(G) is given by $\alpha(g) = tgt^{-1}$ and $T = \langle t \rangle$ is the infinite cyclic group.

1.3.2. Remarks. Note that the profinite higher K-theory $K_n^{pr}(C, \hat{\mathbb{Z}}_l)$ for exact categories C (including equivariant exact categories) is a cohomology theory which generalizes classical profinite topological K-theory $K_0(C) \otimes \hat{\mathbb{Z}}_l$, where for a compact topological space X, C = VB(X) (resp. $VB_G(X)$) is the category of (finite-dimensional) $\mathbb{C}$-vector bundles on X (resp. the category of G-equivariant $\mathbb{C}$-vector bundles on a G-space X, where G acts continuously on X); see [1]. The theory $K_n^{pr}(C, \hat{\mathbb{Z}}_l)$ is also a K-theory analogue of classical continuous cohomology of schemes rooted in arithmetic algebraic geometry. As such, $K_n^{pr}(C, \hat{\mathbb{Z}}_l)$ could be called the continuous K-theory of C. The following exact sequence provides a standard mechanism for computing $K_n^{pr}(C, \hat{\mathbb{Z}}_l)$.

1.3.3. Theorem [39] [33]. If C is an exact category and l a rational prime, then for all $n \ge 1$ there exists an exact sequence

$$0 \to \varprojlim_s{}^{1} K_{n+1}(C, \mathbb{Z}/l^s) \to K_n^{pr}(C, \hat{\mathbb{Z}}_l) \to K_n(C, \hat{\mathbb{Z}}_l) \to 0.$$

Proof: see [39] or [33].

1.3.4. Remarks. It follows from 1.3.3 that

(1) if A is any ring with identity, we have an exact sequence

$$0 \to \varprojlim_s{}^{1} K_{n+1}(A, \mathbb{Z}/l^s) \to K_n^{pr}(A, \hat{\mathbb{Z}}_l) \to K_n(A, \hat{\mathbb{Z}}_l) \to 0;$$

(2) if A is a Noetherian ring, then we have an exact sequence

$$0 \to \varprojlim_s{}^{1} G_{n+1}(A, \mathbb{Z}/l^s) \to G_n^{pr}(A, \hat{\mathbb{Z}}_l) \to G_n(A, \hat{\mathbb{Z}}_l) \to 0.$$

In particular, if R is the ring of integers in a number field or a p-adic field F, G a finite group, $T = \langle t \rangle$ an infinite cyclic group and $V = G \rtimes_\alpha T$, then A = RG and A = RV fit into the exact sequences in (1) and (2) above.


1.3.5. Definition. Let l be a rational prime. An Abelian group G is said to be l-complete if $G \cong \varprojlim_s (G/l^s G)$.
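A quick way to get a feel for l-completeness is the case of a finite cyclic group $\mathbb{Z}/m$. There the quotient by $l^s$ has order gcd(m, l^s), the tower of quotients stabilizes, and the inverse limit is the l-primary part of $\mathbb{Z}/m$; so $\mathbb{Z}/m$ is l-complete exactly when m is a power of l. The sketch below encodes this elementary computation; the function names are hypothetical and it covers only finite cyclic groups, not the profinite groups of the theorems that follow.

```python
from math import gcd

def quotient_order(m, l, s):
    """Order of (Z/m) / l^s (Z/m), which is cyclic of order gcd(m, l^s)."""
    return gcd(m, l ** s)

def completion_order(m, l):
    """Order of the inverse limit over s of (Z/m)/l^s(Z/m): the largest power of l dividing m."""
    order = 1
    while m % (order * l) == 0:
        order *= l
    return order

def is_l_complete_cyclic(m, l):
    """Z/m is l-complete iff the natural map to the inverse limit is an isomorphism,
    i.e. iff m is a power of l."""
    return completion_order(m, l) == m
```

For instance, the quotients of $\mathbb{Z}/12$ by $2^s$ have orders 2, 4, 4, 4, ..., so the limit has order 4 and $\mathbb{Z}/12$ is not 2-complete, whereas $\mathbb{Z}/8$ is.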

1.3.6. Theorem [39] [33]. Let C be an exact category such that $K_n(C)$ is a finitely generated Abelian group for all $n \ge 1$. Let l be a rational prime. Then $K_n^{pr}(C, \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group for all $n \ge 2$. Moreover, $K_n^{pr}(C, \hat{\mathbb{Z}}_l) \cong K_n(C) \otimes \hat{\mathbb{Z}}_l \cong K_n(C, \hat{\mathbb{Z}}_l)$.

Proof: see [39] or [33].

1.3.7. Corollary. Let R be the ring of integers in a number field F, G a finite group. Then $K_n^{pr}(RG, \hat{\mathbb{Z}}_l)$, $G_n^{pr}(RG, \hat{\mathbb{Z}}_l)$ are l-complete profinite Abelian groups for all $n \ge 2$. Moreover, $K_n^{pr}(RG, \hat{\mathbb{Z}}_l) \cong K_n(RG) \otimes \hat{\mathbb{Z}}_l \cong K_n(RG, \hat{\mathbb{Z}}_l)$ and $G_n^{pr}(RG, \hat{\mathbb{Z}}_l) \cong G_n(RG) \otimes \hat{\mathbb{Z}}_l \cong G_n(RG, \hat{\mathbb{Z}}_l)$.

Proof: Follows from the fact that $K_n(RG)$, $G_n(RG)$ are finitely generated Abelian groups for all $n \ge 1$ (see 1.1.3 and 1.1.4).

1.3.8. Corollary [40]. Let R be the ring of integers in a number field F, G a finite group, and $V = G \rtimes_\alpha T$, where $T = \langle t \rangle$ is the infinite cyclic group. Then $G_n^{pr}(RV, \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group for all $n \ge 2$. Moreover, $G_n^{pr}(RV, \hat{\mathbb{Z}}_l) \cong G_n(RV) \otimes \hat{\mathbb{Z}}_l \cong G_n(RV, \hat{\mathbb{Z}}_l)$.

Proof: Follows since $G_n(RV)$ is finitely generated for all $n \ge 1$ (see [40]).

1.3.9. Remarks. When R is the ring of integers in a p-adic field, $K_n(RG)$, $G_n(RG)$ are no longer finitely generated. However, one can, through other techniques (see [39]), obtain the following l-completeness result for $G_n^{pr}(RV, \hat{\mathbb{Z}}_l)$.

1.3.10. Theorem [39] [40]. Let R be the ring of integers in a p-adic field F, G a finite group, and V as in 1.3.8. Then for all $n \ge 2$, $G_n^{pr}(RV, \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group.

1.3.11. Before stating the next result (1.3.12), we first explain the construction of the map φ in the theorem.

Note that for any exact category C, the natural map $M^{n+1}_{l^\infty} \to S^{n+1}$ induces a map $[S^{n+1}, BQC] \xrightarrow{\varphi} [M^{n+1}_{l^\infty}, BQC]$, i.e. $K_n(C) \xrightarrow{\varphi} K_n^{pr}(C, \hat{\mathbb{Z}}_l)$. So when $C = \mathcal{M}(RV)$, where $V = G \rtimes_\alpha T$, G a finite group, $T = \langle t \rangle$ the infinite cyclic group and α ∈ Aut(G) given by $\alpha(g) = tgt^{-1}$, we obtain a map $\varphi : G_n(RV) \to G_n^{pr}(RV, \hat{\mathbb{Z}}_l)$.

If B is an Abelian group, we write div B for the subgroup of divisible elements of B. We now state the following result.

1.3.12. Theorem [40]. Let R be the ring of integers in a number field F, G a finite group, $V = G \rtimes_\alpha T$, where $T = \langle t \rangle$ is the infinite cyclic group and α ∈ Aut(G) is given by $\alpha(g) = tgt^{-1}$, and RV the group ring of V over R. Then for all $n \ge 2$:

(i) div $G_n^{pr}(RV, \hat{\mathbb{Z}}_l) = 0$;

(ii) $G_n^{pr}(RV, \hat{\mathbb{Z}}_l) \cong G_n(RV, \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group;

(iii) the map $\varphi : G_n(RV) \to G_n^{pr}(RV, \hat{\mathbb{Z}}_l)$ is injective with uniquely l-divisible cokernel.

Section 2. Some Results on the Actions of Algebraic Groups

2.1. The representation ring R(H) and the group $K_0(VB_G(G/H))$

2.1.1. Let F be a field, G an algebraic F-group, $P(F)_G$ the category of representations of G in P(F), where P(F) is the category of finite-dimensional vector spaces over F. We shall also denote this category by $\mathrm{Rep}_F(G)$. Note that $\mathrm{Rep}_F(G)$ is an exact category and we denote $K_0(\mathrm{Rep}_F(G))$ by $R_F(G)$ or R(G, F) or just R(G) when the context is clear. Note that R(G) is a free Abelian group generated by the classes of irreducible representations and that R(G) also has a ring structure induced by tensor product. Call R(G) the representation ring (see [50], [55]). Since $\mathrm{Rep}_F(G)$ is an exact category, $K_n(\mathrm{Rep}_F(G))$ is defined for all n ≥ 0 and we denote $K_n(\mathrm{Rep}_F(G))$ by $G_n(G, F)$.

2.1.2. For any G-scheme X, let $VB_G(X)$ be the exact category of G-equivariant (algebraic) vector bundles on X. We saw in I, 3.2.2 that if H is a subgroup of G and X = G/H the homogeneous G-space, then we have an equivalence of categories:


$\mathrm{Rep}_F(H) \underset{\mathrm{res}}{\overset{\mathrm{ind}}{\rightleftarrows}} VB_G(G/H)$

where the maps ind and res were described in I, 3.2.2.

So, we have an Abelian group isomorphism $R(H) \cong K_0(VB_G(G/H))$. We shall denote $K_0(VB_G(G/H))$ by $K_0^G(G/H)$.

2.1.3. Let α : H → G be a homomorphism of algebraic groups; then α induces a ring homomorphism $\alpha^*: R(G) \to R(H)$. Hence we can consider R(H) as an R(G)-module. If α is an embedding of a subgroup H into G, call $\alpha^*$ a restriction map.

If E is a field extension of F and we write $G_E$ for $G(E) = G \times_F E$, then the exact functor $\mathrm{Rep}(G) \to \mathrm{Rep}(G_E): [V] \mapsto [V \otimes_F E]$ induces a ring homomorphism $R(G) \to R(G_E)$. If E/F is a finite extension, we also have an exact functor $\mathrm{Rep}(G_E) \to \mathrm{Rep}(G)$ which takes a $G_E$-module M over E to itself considered as a G-module over F. This induces a ‘corestriction’ map $R(G_E) \to R(G)$. Note that the composition $R(G) \to R(G_E) \to R(G)$ coincides with multiplication by [E : F]. In particular the homomorphism $R(G) \to R(G_E)$ is injective (see [50]).
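The injectivity claim can be spelled out in one line. The sketch below is only an illustration: the names res and cor for the two maps are labels introduced here (they are not notation from [50]), and the argument relies on the fact stated in 2.1.1 that R(G) is a free Abelian group.

```latex
% res : R(G) -> R(G_E), induced by [V] |-> [V \otimes_F E]   (label assumed here)
% cor : R(G_E) -> R(G), the corestriction map                 (label assumed here)
\[
  \mathrm{cor} \circ \mathrm{res} \;=\; [E:F] \cdot \mathrm{id}_{R(G)} .
\]
% Suppose res(x) = 0 for some x in R(G). Then
\[
  [E:F] \cdot x \;=\; \mathrm{cor}\bigl(\mathrm{res}(x)\bigr) \;=\; 0 .
\]
% R(G) is free Abelian, hence torsion-free, and [E:F] >= 1, so x = 0.
% Therefore res is injective.
```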

2.2. The groups $K_n(G, X)$, $G_n(G, X)$, X a G-scheme

2.2.1. Let G be an F-group and X a G-scheme. In I, 3.3.2 we defined equivariant exact categories M(G, X), P(G, X). Let $K_n(G, X)$ denote $K_n(P(G, X))$ and $G_n(G, X)$ denote $K_n(M(G, X))$.

o Note that if X = Spec(F) and n = 0, then $R(G) = G_0(G, \mathrm{Spec}(F)) = K_0(G, \mathrm{Spec}(F))$.

o When X = G/H is a homogeneous space where H is a subgroup of G, $R(H) \cong G_0(G, G/H) \cong K_0(G, G/H) \cong K_0^G(G/H)$ (see 2.1).

o If U is a unipotent subgroup of G, then there is a natural bijection between the sets of irreducible representations of G and irreducible representations of G/U. Hence the natural map $R(G/U) \to R(G)$ is an isomorphism (see [50]).

o $G_n(G, -)$ is contravariant with respect to flat G-morphisms and is covariant with respect to projective G-morphisms of G-schemes (see [50]).

o $K_n(G, -)$ is contravariant with respect to G-morphisms of G-varieties (see [50]).

o The inclusion of categories $P(G, X) \to M(G, X)$ induces a homomorphism $K_n(G, -) \to G_n(G, -)$.

o A morphism α : H → G induces exact functors $M(G, X) \to M(H, X)$, $P(G, X) \to P(H, X)$ and hence group homomorphisms $G_n(G, X) \to G_n(H, X)$, $K_n(G, X) \to K_n(H, X)$.


2.2.2. The restriction functor $f: M(G, X) \to M(X)$ induces a group homomorphism $f_*: G_n(G, X) \to G_n(X)$. The next few results provide information on the map $f_*$, which links equivariant K-theory to ordinary K-theory of X. Note that $P(G, X) \to P(X)$ also induces $K_n(G, X) \to K_n(X)$.

2.2.3. Theorem [50] If $G_0(G, X) \to G_0(X)$ is surjective, then $\mathrm{Pic}(G_E) = 0$ for any finite extension E/F. If $\mathrm{Pic}(G_E) = 0$ and X is affine, then $G_0(G, X) \to G_0(X)$ is surjective. Proof: See [50].

2.2.4. Theorem [50] Let G be an algebraic F-group. For all n ≥ 0, $G_n(G, X) \to G_n(X)$ is a split surjection if X is a smooth projective variety. Proof: See [50].

2.2.5. Theorem [50] Let G be a split reductive F-group. Then for any G-scheme X, the homomorphism $G_0(G, X) \to G_0(X)$ induces an isomorphism $\mathbb{Z} \otimes_{R(G)} G_0(G, X) \to G_0(X)$. Proof: See [50].

2.2.6. Theorem [50] Let G be a split reductive group, X a smooth projective variety. Then the homomorphism $K_n(G, X) \to K_n(X)$ induces an isomorphism $\mathbb{Z} \otimes_{R(G)} K_n(G, X) \cong K_n(X)$. Proof: See [50].

2.2.7. Theorem [50] Let U be a split unipotent group, X a U-scheme. Then the restriction homomorphism $G_n(U, X) \to G_n(X)$ is an isomorphism.

2.2.8. Recall from 2.1.3 and 2.2.1 that a group homomorphism α : H → G induces a ring homomorphism $\alpha^*: R(G) \to R(H)$ as well as group homomorphisms $G_n(G, X) \to G_n(H, X)$ and $K_n(G, X) \to K_n(H, X)$. The next few results present some information on $G_n(G, X) \to G_n(H, X)$.



2.2.9. Theorem [50] Let G be an F-group and H a closed subgroup of G such that the factor variety G/H is isomorphic to $\mathbb{A}^1_F$. Then for any G-scheme X, the restriction map $G_n(G, X) \to G_n(H, X)$ is an isomorphism for all n ≥ 0. Proof: See [50].

2.2.10. Theorem [50] Let G be a split solvable group, T ⊂ G a split maximal torus, X a G-scheme. Then the restriction homomorphism $G_n(G, X) \to G_n(T, X)$ is an isomorphism.

2.3. Higher K-theory of Twisted Flag Varieties

2.3.1. Recall from I, 3.2.3, that if $\tilde{G}$ is a semi-simple, connected and simply connected F-split algebraic group over a field F, and $\tilde{P} \subset \tilde{G}$ a parabolic subgroup of $\tilde{G}$ containing the maximal torus T, then $\mathcal{F} = \tilde{G}/\tilde{P}$ is a flag variety. We also saw in II, 1.3.11 that $R(\tilde{P})$ is a free $R(\tilde{G})$-module of rank $s(\mathcal{F})$, where $s(\mathcal{F}) = [W : W_{\tilde{P}}]$, W = the Weyl group of $\tilde{G}$, and $W_{\tilde{P}} = \{w \in W \mid w\tilde{P}w^{-1} = \tilde{P}\}$.

Now let ${}_c\mathcal{F}$ be the twisted flag variety relative to the cocycle $c: \mathrm{Gal}(F_{sep}/F) \to \tilde{G}(F_{sep})$ (see II, 1.3.13 and [55], [40]). Now for any $\tilde{V} \in VB_{\tilde{G}}(\mathcal{F})$ (see II, 1.3.8), let ${}_c\tilde{V}$ be the vector bundle over ${}_c\mathcal{F}$ obtained by twisting $\tilde{V}$ by c. Then we have a biexact functor $\mathrm{Rep}(\tilde{P}) \times P(F) \to VB({}_c\mathcal{F}): (V, M) \mapsto {}_c\tilde{V} \otimes_F M$, which induces a pairing $\mu_c: R_F(\tilde{P}) \otimes K_n(F) \to K_n({}_c\mathcal{F})$. We now have the following result.

2.3.2. Theorem [55] In the notation of 2.3.1, we have
(1) $\mu_c: R_F(\tilde{P}) \otimes K_n(F) \to K_n({}_c\mathcal{F})$ is surjective for all n ≥ 0.
(2) $\mu_c$ induces a graded ring isomorphism $R_F(\tilde{P}) \otimes_{R_F(\tilde{G})} K_*(F) \to K_*({}_c\mathcal{F})$.
(3) $R_F(\tilde{P}) \cong R_F(\tilde{G})^{s(\mathcal{F})}$, where $R_F(\tilde{P})$ is considered as an $R_F(\tilde{G})$-module by restriction of representations.
(4) If $\{a_i \mid i = 1, 2, \ldots, s\}$ is a free $R_F(\tilde{G})$-basis of $R_F(\tilde{P})$, then we have an isomorphism $\bigoplus_{i=1}^{s} K_n(F) \cong K_n({}_c\mathcal{F})$.
Proof: See [55].


2.3.3. Theorem [40] In the notation of 2.3.1 and 2.3.2, let F be a number field. Then for all n ≥ 1,
1) $K_{2n+1}({}_c\mathcal{F}, B)$ is a finitely generated Abelian group.
2) $K_{2n}({}_c\mathcal{F}, B)$ is a torsion group and has no non-trivial divisible subgroups.
Proof: See [40].

We next present the following result on the local structure.

2.3.5. Theorem [40] Let F be a p-adic field, l a rational prime such that l ≠ p. Then for all n ≥ 1, and any separable F-algebra B, $K_n({}_c\mathcal{F}, B)_l$ is a finite group. Proof: See [40].

2.4. Finiteness Results for Some Objects of the Motivic Category C(G)

2.4.1. Let G be an algebraic group over a field F. By considering a smooth projective G-scheme as an object of a category C(G) defined below, we have similar finiteness results to those for $K_n({}_cX, B)$, where c is a 1-cocycle, ${}_cX$ is the c-twisted form of X and B is a separable F-algebra.

2.4.2. The category C(G) is constructed as follows (the construction is due to I. Panin, see [55] or [40]). The objects of C(G) are pairs (X, A) where X is a smooth projective G-scheme and A is a finite dimensional separable F-algebra on which G acts by F-algebra automorphisms. Define

$\mathrm{Hom}_{C(G)}((X, A), (Y, B)) := K_0(G, X \times Y, A^{op} \otimes_F B)$.

Composition of morphisms is defined as follows: if $u: (X, A) \to (Y, B)$, $v: (Y, B) \to (Z, C)$ are two morphisms, then the composite is defined by

$v \circ u := p_{13*}\bigl(p_{23}^*(v) \otimes_B p_{12}^*(u)\bigr)$

where $p_{12}: X \times Y \times Z \to X \times Y$, $p_{13}: X \times Y \times Z \to X \times Z$, and $p_{23}: X \times Y \times Z \to Y \times Z$. The identity endomorphism of (X, A) in C(G) is the class $[A \otimes_F O_\Delta]$ (where Δ ⊂ X × X is the diagonal) in $K_0(G, X \times X, A^{op} \otimes_F A) = \mathrm{End}_{C(G)}(X, A)$. We now have the following results.

2.4.3. Theorem [40] Let $\alpha: C \to (X, F)$ be an isomorphism in the category C(G), i.e., $\alpha: (\mathrm{Spec}(F), C) \to (X, F)$. For every 1-cocycle $c: \mathrm{Gal}(F_{sep}/F) \to G(F_{sep})$ and any finite dimensional separable F-algebra B, let $K_n({}_cX, B)$ be as defined in II, 1.2.3.
a) If F is a number field, then for n ≥ 1,
(i) $K_{2n+1}({}_cX, B)$ is a finitely generated Abelian group.
(ii) $K_{2n}({}_cX, B)$ is a torsion group and has no non-trivial divisible elements.



b) If F is a p-adic field, l a rational prime such that l ≠ p, then for all n ≥ 1 and any separable F-algebra B, $K_n({}_cX, B)_l$ is a finite group.
Proof: See [40].

2.5. Profinite Higher K-Theory of Twisted Flag Varieties

In this subsection, we obtain some l-completeness and other results for some twisted flag varieties as well as Brauer-Severi varieties over number fields and p-adic fields. Recall that if l is a rational prime, an Abelian group H is said to be l-complete if $H = \varprojlim_s H/l^sH$. Recall the definition of profinite K-theory $K_n^{pr}(({}_c\mathcal{F}, B), \hat{\mathbb{Z}}_l)$ from II, 1.5.2 (vii).
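As a quick illustration of the definition of l-completeness (an example added here for orientation, not taken from [40]), one can compare a finite cyclic l-group with the integers:

```latex
% Example 1: H = Z/l^k. For s >= k we have l^s H = 0, so H / l^s H = H
% and the inverse system is eventually constant:
\[
  \varprojlim_s \, (\mathbb{Z}/l^k)\big/\, l^s(\mathbb{Z}/l^k) \;\cong\; \mathbb{Z}/l^k ,
\]
% hence Z/l^k is l-complete.
%
% Example 2: H = Z. Here the inverse limit is the ring of l-adic integers:
\[
  \varprojlim_s \, \mathbb{Z}/l^s\mathbb{Z} \;=\; \hat{\mathbb{Z}}_l \;\neq\; \mathbb{Z} ,
\]
% so Z is not l-complete, whereas \hat{Z}_l itself is l-complete.
```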

2.5.1. Theorem [40] Let F be a number field, $\tilde{G}$ a semi-simple, connected, simply connected split algebraic group over F, $\tilde{P}$ a parabolic subgroup of $\tilde{G}$, B a finite dimensional separable F-algebra. Then for all n ≥ 1,
(1) $K_n^{pr}(({}_c\mathcal{F}, B), \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group.
(2) $\mathrm{div}\, K_n^{pr}(({}_c\mathcal{F}, B), \hat{\mathbb{Z}}_l) = 0$.
Proof: See [40].

2.5.2. Remarks The following results can be proved by procedures similar to those used to prove the result above. See [40] for details.
(1) If F is a number field, then $K_{2n}^{pr}({}_\gamma\mathcal{F}, \hat{\mathbb{Z}}_l) = 0$.
(2) If V is a Brauer-Severi variety over a number field F, then for all n ≥ 2, $K_{2n}^{pr}(V, \hat{\mathbb{Z}}_l)$ is l-complete and $\mathrm{div}\, K_{2n}^{pr}(V, \hat{\mathbb{Z}}_l) = 0$.

2.5.3. Our next aim is to consider the situation when F is a p-adic field. Before doing this, we make some general observations. Note that for any exact category C, the natural map $M^{n+1}_{l^\infty} \to S^{n+1}$ induces a map $[S^{n+1}, BQC] \xrightarrow{\varphi} [M^{n+1}_{l^\infty}, BQC]$, i.e.,
(I) $K_n(C) \xrightarrow{\varphi} K_n^{pr}(C, \hat{\mathbb{Z}}_l)$, and hence maps
(II) $K_n(C)/l^s \to K_n^{pr}(C, \hat{\mathbb{Z}}_l)/l^s$ and
(III) $K_n(C)[l^s] \to K_n^{pr}(C, \hat{\mathbb{Z}}_l)[l^s]$.
We denote the maps in (II) and (III) also by ϕ by abuse of notation. We now present the following result.


2.5.4. Theorem [40] Let p be a rational prime, F a p-adic field, $\tilde{G}$ a semi-simple, connected and simply connected split algebraic group over F, $\tilde{P}$ a parabolic subgroup of $\tilde{G}$, c a 1-cocycle $\mathrm{Gal}(F_{sep}/F) \to \tilde{G}(F_{sep})$, ${}_c\mathcal{F}$ the c-twisted form of $\mathcal{F}$, B a finite dimensional separable F-algebra, l a rational prime such that l ≠ p. Then for all n ≥ 2,
(a) $K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group.
(b) $K_n^{pr}(({}_c\mathcal{F}, B), \hat{\mathbb{Z}}_l) \cong K_n(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)$.
(c) The map $\varphi: K_n({}_c\mathcal{F}, B) \to K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)$ induces isomorphisms
(1) $K_n({}_c\mathcal{F}, B)[l^s] \cong K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)[l^s]$
(2) $K_n({}_c\mathcal{F}, B)/l^s \cong K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)/l^s$.
(d) Kernel and cokernel of $K_n({}_c\mathcal{F}, B) \to K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l)$ are uniquely l-divisible.
(e) $\mathrm{div}\, K_n^{pr}(({}_c\mathcal{F}, B); \hat{\mathbb{Z}}_l) = 0$ for n ≥ 2.
Proof: See [40].

2.5.5. Remarks
(a) Let V be a Brauer-Severi variety over a p-adic field F. By a similar proof to that of 2.5.4 we have
(i) $K_n^{pr}(V, \hat{\mathbb{Z}}_l) \cong K_n(V, \hat{\mathbb{Z}}_l)$ is an l-complete profinite Abelian group.
(ii) $K_n(V)/l^s \cong K_n^{pr}(V, \hat{\mathbb{Z}}_l)/l^s$ and $K_n(V)[l^s] \cong K_n^{pr}(V, \hat{\mathbb{Z}}_l)[l^s]$.
(iii) Kernel and cokernel of $K_n(V) \to K_n^{pr}(V, \hat{\mathbb{Z}}_l)$ are uniquely l-divisible.
(iv) $\mathrm{div}\, K_n^{pr}(V, \hat{\mathbb{Z}}_l) = 0$.
(b) Finally, if ${}_cX$ is as in 2.4.3, we have similar results to those of 2.5.4 for $K_n^{pr}({}_cX, B)$, etc.

ACKNOWLEDGMENT

I would like to thank Ivan Tatchim for helping to type the manuscript for this chapter.



REFERENCES

[1] M.F. Atiyah, K-theory. W.A. Benjamin (1967).
[2] H. Bass, Algebraic K-theory. W.A. Benjamin (1968).
[3] H. Bass, Lenstra's Calculation of $G_0(R\pi)$ and Application to Morse-Smale Diffeomorphisms. Lecture Notes in Math. 88, Springer, 1981, 287–291.
[4] H. Bass, A.O. Kuku and C. Pedrini (eds), Algebraic K-theory and its Applications. ICTP K-theory Proceedings. World Scientific (1999).
[5] W. Browder, Algebraic K-theory with Coefficients $\mathbb{Z}/p$. Lecture Notes in Math. 657, Springer, 40–84.
[6] C.W. Curtis and I. Reiner, Methods of Representation Theory I and II. Wiley (1982); (1987).
[7] A.W.M. Dress, Contributions to the Theory of Induced Representations. Lecture Notes in Math., Springer 1973, 183–240.
[8] A.W.M. Dress, Induction and Structure Theorems for Orthogonal Representations of Finite Groups. Ann. Math. 102 (1975) 291–325.
[9] A.W.M. Dress and A.O. Kuku, The Cartan Map for Equivariant Higher K-groups. Comm. in Algebra 9 (1981), 727–747.
[10] A.W.M. Dress and A.O. Kuku, A Convenient Setting for Equivariant Higher Algebraic K-theory. Lecture Notes in Math. 966, Springer (1986) 58–68.
[11] J.A. Green, Axiomatic representation theory of finite groups. J. Pure Appl. Algebra 1 (1) (1971), 41–72.
[12] X. Guo and A.O. Kuku, Higher Class Groups of Generalized Eichler Orders. Comm. in Algebra 33 (2005), 709–718.
[13] X. Guo and A.O. Kuku, Wild Kernels for Higher K-theory of Division and Semi-simple Algebras. Beiträge zur Algebra und Geometrie 47 (1) (2006), 1–14.
[14] R. Hartshorne, Algebraic Geometry. Springer, NY 1977.
[15] A. Hatcher and J. Wagoner, Pseudo-isotopies on compact manifolds. Astérisque 6 (1973).
[16] M. Karoubi, K-theory: An Introduction. Springer 1978.
[17] M. Karoubi, A.O. Kuku and C. Pedrini (eds), Contemporary Development in Algebraic K-theory. ICTP Lecture Notes Series 15 (2003).
[18] M. Kolster and R.C. Laubenbacher, Higher class groups of orders. Math. Z. 228 (1998) 229–246.
[19] A.O. Kuku, Some Algebraic K-theory Application of LF and NF Functors. Proc. AMS 37 (2) 1973, 36 - 365.
[20] A.O. Kuku, Whitehead Group of Orders in p-adic Semi-simple Algebras. J. Algebra 25 (1973), 415–418.
[21] A.O. Kuku, Some Finiteness Theorems in the K-theory of Orders in p-adic Algebras. J. Lond. Math. Soc. 13 (1) 1976, 122–128.
[22] A.O. Kuku, $SK_n$ of Orders and $G_n$ of Finite Rings. Lect. Notes in Math. 51, Springer-Verlag (1976) 60–68.
[23] A.O. Kuku, $SG_n$ of Orders and Group Rings. Math. Z. 165 (1979) 291–295.
[24] A.O. Kuku, Higher Algebraic K-theory of Group Rings and Orders in Algebras over number fields. Comm. Algebra 10 (8) (1982) 805–816.


[25] A.O. Kuku, Equivariant K-theory and the cohomology of profinite groups. Lect. Notes in Math. 1046 (1984), Springer-Verlag, 234–244.
[26] A.O. Kuku, K-theory of group rings of finite groups over maximal orders in division algebras. J. Algebra 91 (1) (1984) 18–31.
[27] A.O. Kuku, Axiomatic theory of induced representations of finite groups. Les cours du CIMPA, No. 5, Nice, France, 1985.
[28] A.O. Kuku, $K_n$, $SK_n$ of integral group rings and orders. Contemp. Math. AMS 55 (1986) 333–338.
[29] A.O. Kuku, Some finiteness results in the higher K-theory of orders and group-rings. Topology Appl. 25 (1987) 185–191.
[30] A.O. Kuku, Higher K-theory of modules over EI categories. Africa Mat. 3 (1996) 15–27.
[31] A.O. Kuku, Ranks of $K_n$ and $G_n$ of orders and group rings of finite groups over integers in number fields. J. Pure Appl. Algebra 138 (1999), 39–44.
[32] A.O. Kuku, Equivariant higher K-theory for compact Lie-group actions. Beiträge zur Algebra und Geometrie 41 (1) (2000), 141–150.
[33] A.O. Kuku, Profinite and continuous higher K-theory of exact categories, orders and group-rings. K-theory 22 (2001) 367–392.
[34] A.O. Kuku, Classical Algebraic K-theory: the functors $K_0$, $K_1$, $K_2$. Handbook of Algebra 3 (2003), Elsevier, 157–196.
[35] A.O. Kuku, K-theory and Representation Theory. Contemporary developments in Algebraic K-theory, ICTP Lect. Series, No. 15 (2003) 259–356.
[36] A.O. Kuku, Higher Algebraic K-theory. Handbook of Algebra 4 (2006), Elsevier, 3–74.
[37] A.O. Kuku, Finiteness of higher K-groups of orders and group-rings. K-theory 36 (2005), 51–58.
[38] A.O. Kuku, Equivariant higher algebraic K-theory for Waldhausen categories. Beiträge zur Algebra und Geometrie (Contributions to Algebra and Geometry) 47 (2) (2006), 583–601.
[39] A.O. Kuku, Representation theory and higher Algebraic K-theory. Chapman and Hall (2007).
[40] A.O. Kuku, Profinite equivariant higher algebraic K-theory for action of algebraic groups. Homology, Homotopy and Applications (to appear).
[41] A.O. Kuku, Higher Algebraic K-theory for twisted Laurent series rings over orders and semi-simple algebras. Algebras and Representation Theory 11 (2008), 355–368.
[42] A.O. Kuku and G. Tang, Higher K-theory of group-rings of virtually infinite cyclic groups. Math. Ann. 322 (2003) 711–725.
[43] A.O. Kuku and M. Mahdavi-Hezavehi, Subgroups of $GL_n(R)$ for local rings R. Comm. in Algebra 32 (15) 2004, 1805–1902.
[44] A.O. Kuku and G. Tang, An explicit computation of "bar" homology groups of a non-unital ring. Beiträge zur Algebra und Geometrie 44 (2) 2003, 375–382.
[45] T.Y. Lam, Induction techniques for Grothendieck groups and Whitehead groups of finite groups. Ann. Sc. Ecole Norm. Sup. Paris 1 (1968) 91–148.
[46] T.Y. Lam and I. Reiner, Relative Grothendieck groups. J. Algebra 11 (1969) 213–242.



[47] R.C. Laubenbacher and D. Webb, On $SG_n$ of orders. J. Algebra 133 (1990) 125–131.
[48] H.W. Lenstra, Grothendieck groups of Abelian group-rings. J. Pure Appl. Alg. 20 (1981) 173–193.
[49] S. Mac Lane, Categories for the working mathematician. Springer-Verlag (1971).
[50] A.S. Merkurjev, Comparison of equivariant and ordinary K-theory of algebraic varieties. St. Petersburg Math. J. 9 (1998) No. 4, 815–850.
[51] J. Milnor, Whitehead torsion. Bull. Amer. Math. Soc. 72 (1966), 358–426.
[52] J. Milnor, Introduction to Algebraic K-theory. Princeton (1971).
[53] Neisendorfer, Primary Homotopy theory. Memoir Amer. Math. Soc. 232, AMS (1980).
[54] R. Oliver, Whitehead groups of finite groups. Cambridge Univ. Press (1988).
[55] I. Panin, On the algebraic K-theory of twisted flag varieties. K-theory 8 (1994) 541–585.
[56] D. Quillen, On the cohomology and K-theory of the general linear groups of a finite field. Ann. Math. 96 (1972) 552–586.
[57] D. Quillen, Higher algebraic K-theory. Lecture Notes in Math., Springer-Verlag (1973) 85–147.
[58] D. Quillen, Finite generation of the K-groups of algebraic integers. Lecture Notes in Math. 341, Springer-Verlag (1973), 195–214.
[59] G. Segal, Equivariant K-theory. Publ. Math. IHES 34 (1968).
[60] G. Segal, Representation ring of compact Lie-groups. IHES 34 (1968), 113–128.
[61] J.P. Serre, Linear representations of finite groups. Springer-Verlag (1977).
[62] J.P. Serre, Local fields. Springer-Verlag, Berlin (1979).
[63] E.A. Spanier, Algebraic topology. McGraw-Hill (1966).
[64] T.A. Springer, Linear Algebraic Groups. Second edition, Birkhäuser Boston 1998.
[65] V. Srinivas, Algebraic K-theory. Progress in Math. 90.
[66] A.A. Suslin, Stability in Algebraic K-theory. Lect. Notes in Math. 966, Springer-Verlag (1982) 304–333.
[67] A.A. Suslin and A.V. Yufryakov, K-theory of local division algebras. Soviet Math. Doklady 33 (1986) 794–798.
[68] A.A. Suslin and M. Wodzicki, Excision in algebraic K-theory. Ann. Math. (2) 136 (1) 1992, 51–122.
[69] R.G. Swan, Vector bundles and projective modules. Trans. AMS 105 (1962) 264–277.
[70] R.G. Swan, K-theory of finite groups and orders. Springer-Verlag, Lecture Notes 149 (1979).
[71] R.G. Swan, Algebraic K-theory. Lecture Notes in Math. 76, Springer-Verlag (1968).
[72] R.W. Thomason, Algebraic K-theory of group scheme actions. In: Algebraic topology and algebraic K-theory. Proceedings, Princeton, NJ (1987) 539–563.
[73] J.B. Wagoner, Continuous cohomology and p-adic K-theory. Lecture Notes in Math. 551, Springer-Verlag, 241–248.
[74] C.T.C. Wall, Finiteness conditions for CW-complexes. Ann. Math. 81 (1965) 56–69.
[75] C.T.C. Wall, Norms of units of group rings. Proc. Lond. Math. Soc. (1974) 593–632.
[76] D. Webb, Grothendieck groups of dihedral and quaternion groups. J. Pure Appl. Alg. 35 (1985) 197–223.
[77] D. Webb, The Lenstra map on classifying spaces and G-theory. Invent. Math. 84 (1980) 73–89.


[78] G.W. Whitehead, Elements of homotopy theory. Springer-Verlag, NY 1978.
[79] J.H.C. Whitehead, Simple homotopy types. Am. J. Math. 72 (1950) 1–57.



Chapter 3

LIBERAL NATIONALISM, CITIZENSHIP AND INTEGRATION 1

Sune Lægaard Centre for the Study of Equality and Multiculturalism, University of Copenhagen

ABSTRACT

Liberal nationalists such as Will Kymlicka and David Miller have endorsed the “revaluation of citizenship” currently expressed in more stringent naturalization requirements across western states. Kymlicka and Miller claim that such measures are “nation-building” policies and only make sense as attempts at “cultural integration” of immigrants. The chapter discusses liberal nationalism as a view that assigns normative significance to one sort of group membership, nationality, for the purpose of regulating access to another kind of group, namely the political community of citizens. The chapter discusses in which ways the recent “revaluated” naturalization requirements might be related to the aims of liberal nationalism. It is argued that “revaluated” naturalization requirements can indeed be characterised as a kind of nation-building policy and as attempts at cultural integration, but that the applicable sense of nationality is too “thin” to serve the functions shared nationality is supposed to serve according to liberal nationalism. Even if there is no positive connection between liberal nationalism and revaluated citizenship, such policies might still be preferable on the basis of liberal nationalism. The paper raises some doubts about this possibility as well, however, since “revaluated” naturalization requirements are exclusionary in a way that might be counterproductive from a liberal nationalist point of view.

1 Earlier versions of this paper were presented in spring 2006 at the University of Copenhagen and at a conference on ‘Globalisation and the Political Theory of the Welfare State and Citizenship’ at Aalborg University, and at the Association for Legal and Social Philosophy conference ‘Aliens and Nations: Citizenship, Sovereignty and Global Politics in the 21st Century’ at Keele University, April 2007. Thanks to Joe Carens, Abraham Doron, Nir Eyal, Nils Holtug, Kasper Lippert-Rasmussen, Anne Phillips, and Anne Julie Semb for comments.


INTRODUCTION

This chapter concerns the relationship between liberal nationalism and citizenship policies. More specifically, focus is on the claim made by the prominent representatives of liberal nationalism David Miller and Will Kymlicka that citizenship policies currently fashionable in Europe, sometimes known as “revaluated citizenship”, are properly understood as “nation-building” policies in the sense relevant to liberal nationalism, i.e. as attempts at integrating immigrants into a national culture and identity. The policies in question include the introduction or strengthening of language tests and tests of knowledge of society as additional conditions for naturalization. The chapter considers whether such policies in fact involve integration in the sense relevant to liberal nationalism. After a sketch of the type of policies in question and of liberal nationalism as a normative position, this chapter discusses the concepts of nation-building, integration and the nation. On this basis, it is argued that the naturalization policies in question do not involve nation-building or national integration in the sense relevant to liberal nationalism.

The significance of the discussion of these questions is first of all theoretical. Liberal nationalism is a political theory that posits the political significance of a certain kind of group, namely the nation. The chapter takes up one specific aspect of the political significance of nations advocated by proponents of liberal nationalism as a theoretical position, namely the application of this general normative claim to the regulation of access to another kind of political group, that of citizens of the state. Liberal nationalists envisage nationality as a requirement for access to citizenship, and as an aim and justification for naturalization policies regulating such access. The discussion is theoretical in the sense that it explicates and assesses the pretensions in these respects of liberal nationalism.
The chapter’s main claim in this regard is that nationality in the sense presupposed by versions of liberal nationalism does not function as well as supposed by proponents of liberal nationalism as a requirement for naturalization and does not justify the kind of naturalization policies endorsed by liberal nationalists. This is an internal critique of liberal nationalism on the basis of conceptions of nationality and liberal constraints on state policies that liberal nationalists themselves advocate. So at the theoretical level, the chapter suggests that the positive claims of liberal nationalists that revaluated naturalization policies are necessarily justified or best understood on the basis of liberal nationalist concerns are unfounded. But the chapter furthermore considers the possibility that the kind of naturalization policies in question may actually be problematic from the point of view of liberal nationalism. Together, these criticisms suggest that liberal nationalism is deficient as a theory of access to citizenship.

REVALUATED CITIZENSHIP

The “revaluation of citizenship” has been used as a label for the simultaneous trends, in the United States with its history of a quite inclusive citizenship regime, to restrict access to citizenship and reserve substantial benefits and privileges to citizenship status, and, in Europe, to provide opportunities for long settled immigrants and their descendants to


naturalise and to encourage this.2 What these apparently opposite developments supposedly have in common is to “reaffirm citizenship as the dominant membership principle” [Joppke and Morawska, 2003: 1] which is “now being described as an important value and identity” [Kymlicka, 2003a: 195]. This general description is compatible, however, not only with the noted differences in policies, but also with different basic justifications or rationales.3 For present purposes, “revaluated citizenship” (RC) will therefore be used in a more precise and restricted sense as a label for specific empirical developments in citizenship policies currently fashionable in Europe [De Hart and Van Oers, 2006: 318]. RC policies primarily include stricter naturalization requirements in the form of the introduction or strengthening of language tests and tests for knowledge of society as well as pledges of allegiance as additional conditions of attaining citizenship [Bauböck, et al, 2006: 24; Joppke, 2007: 41, 44; Kymlicka, 2003a: 195; Waldrauch, 2006: 151f]. RC policies are prominent in countries such as the UK, the Netherlands and Denmark. The new Dutch Nationality Act, which was approved in 2000 and entered into force in 2003, includes a naturalization exam consisting of a societal knowledge test and a test of proficiency in Dutch [De Hart and Van Oers, 2006: 326f; Entzinger, 2003: 75ff, 84; Van Oers, De Hart and Groenendijk, 2006: 414; Waldrauch, 2006: 151f]. The British government announced an examination for citizenship including tests of language skills and knowledge of society and a citizenship ceremony in its 2002 white paper [Home Office, 2002, cf. Kostakopoulou, 2003], which was legally implemented in the 2002 Nationality, Immigration and Asylum Act [Dummett, 2006: 574; Waldrauch, 2006: 151ff].4 In Denmark, a requirement for documented knowledge of Danish language, society, culture and history was introduced in 2002. 
The level of language proficiency was raised and the documentation requirement transformed into a formal nationality test in 2005 [De Hart and van Oers, 2006: 326f; Ersbøll, 2006: 128f, 131f; Waldrauch, 2006: 151ff]. Taking these empirical developments as paradigm examples of the revaluation of citizenship, the question at hand concerns the normative rationale for RC policies, more specifically whether and to what extent RC should be understood as a sort of liberal nationalism. One reason for this focus is that the general political context in which RC policies are introduced is focused on the relationship between immigrants and established societies, which is often, at least in Europe, conceived and expressed in nationalist terms. Naturalization is accordingly often debated in explicitly nationalist terms as a matter of becoming a member of the nation as a cultural community, as well as of the state as a political community, and more restrictive naturalization requirements are often demanded on the basis that citizenship should be reserved for members of the nation. These features of everyday political debate are reflected at the level of political theory. Here, the two most prominent theoretical advocates of liberal nationalism, David Miller and Will Kymlicka, endorse RC

2 Joppke and Morawska, 2003: 16f. The term was originally coined with reference to the American case by Peter Schuck, cf. Schuck, 1998, chap. 8.
3 The US Welfare Reform Act of 1996, which cut back on the social entitlements of non-citizens, was arguably primarily a budget issue, whereas the changes in European naturalization procedures are not (only) economically motivated, but are also premised on a concern with a common identity, i.e. with sending a message about what it means to be “British”, “Dutch”, “Danish” etc. The question is whether this identity concern is necessarily nationalist in a relevant sense. Thanks to Joe Carens, Anne Phillips and Nils Holtug for discussion on this point.
4 The British citizenship test can be found at www.lifeintheuktest.gov.uk/.


policies [Kymlicka, 2003a, 2006: 136f; Banting and Kymlicka, 2004: 21f; Miller, 2006: 335; 2008: 385], and they do so, moreover, on the basis that RC policies are a proper response to the underlying concerns of liberal nationalism. In order to assess their claim about the relationship between RC policies and liberal nationalism, it is necessary to specify what liberal nationalism is, and hence in what sense RC policies are claimed to be nationalist.

LIBERAL NATIONALISM

Nationalism, for present purposes, is characterised by operating with the existence of a certain kind of group, nations, and by making the normative claim that nationality should play a political role. Different kinds of nationalism can then be distinguished depending on their conception of what nationality is, or should be, and on which kind of political significance it should have. These are in principle two different issues: how the kind of group in question is understood, i.e. what defines nations in general and membership of specific nations in particular, is in principle independent from what political significance this kind of group is accorded, e.g. whether it is appealed to in claims for political self-determination or as justification for specific cultural policies. Membership of states is one of the most important nationalist concerns. Given a specific conception of the nation, it is an open question whether it should have political significance in any particular respect, although some ways of understanding nations will of course be better suited or more obvious candidates for specific roles. If the nation is primarily understood in linguistic terms, for instance, then it is most obviously relevant to language policies. But if the nation is understood as a community with pretensions for political self-determination and perhaps as having pre-political claims to a certain territory, then nations will more obviously be relevant to the drawing of boundaries of states. One respect in which liberal nationalists often claim that nationality should be considered politically relevant is in relation to immigration, i.e.
admission to and residence in the state.5 Another aspect, which is the subject of this chapter, concerns citizenship as the full and equal kind of membership of the state as a political community which is partly formal state membership and partly an attendant set of legally enforceable rights and duties; the “status” and “rights” aspects of citizenship, respectively [Joppke, 2007: 38]. Nationalism in this respect, then, is the claim that all citizens should be members of the nation, which might also be formulated as the claim that what is sometimes known as the “identity” aspect of citizenship should refer to the nation as a community [Joppke, 2007: 38, cf. Kymlicka and Norman, 2003: 211]. This normative claim implies that nationality, whatever exactly it is taken to be, must be distinct from, i.e. not equivalent to, citizenship [Miller, 1995: 18f, contrary to the legal use of “nationality” in Bauböck, et al, 2006: 17], since the prescription that all citizens should be members of the nation would otherwise be a tautological claim. The claim implies that considerations of nationality should matter with respect to access to citizenship, in particular when granting the status of citizenship to immigrants through naturalization. This is not to say that nationalism necessarily claims this; only that it follows

5 See Lægaard, 2007a, and 2009, for discussions of liberal nationalism in this regard.



from requiring a sufficiently close connection between nationality and citizenship. As will be evident below, prominent liberal nationalists do make this claim. “Liberal nationalism” denotes versions of nationalism that seek to be compatible with liberal concerns about respect for the equal moral status of persons. Liberal nationalism is liberal in virtue of a) a constrained conception of what shared nationality can legitimately consist in, b) instrumental justifications of the nationalist requirement with reference to more fundamental liberal ideals and values, and c) the liberal constraints it accepts on the political means states can employ in order to secure the nationalist goal [Kymlicka, 2001a: 39f; 2006].6 With respect to a), liberal conceptions of nationality are characterised by a denial that common nationality can be based on “race”, descent or ethnicity [Miller, 1995: 20; Kymlicka, 2001a: 40] but can only concern “public” matters such as a shared language, history and public culture [Miller, 1995: 26, 87, 158, 172; Kymlicka, 2001a: 40] in order for the common national identity to be compatible with cultural pluralism [Miller, 1995: 137, 142, 172, 179f; Kymlicka, 2006: 130].7 With respect to b), one justification for nationalism with reference to liberal ideals and values appeals to the importance of a cultural context of choice and sense of identity to free or autonomous choice, and identifies the national culture as such a context and source of identity [Kymlicka, 1989, chap. 8, 1995, chap. 5, 2001a: 208ff, 227ff, 2004, cf. Margalit and Raz, 1990; Tamir, 1993]. Another instrumental justification appeals to an empirical claim to the effect that a common national identity is a practical precondition, or at least strongly facilitating condition, for the trust and solidarity necessary for upholding liberal political ideals such as social justice and deliberative democracy. 
The claim then is that supporters of such liberal values must therefore also support liberal nationalism [Miller, 1995, chap. 4, 2000; Kymlicka, 2001a: 212-215, 225-27]. With respect to c), liberal constraints require that long-term residents who are as such subjected to the political authority of the state should be able to become citizens within a reasonable timeframe [Carens, 1989, 2002, 2005; Miller, 2008: 7, cf. Bauböck, et al, 2006: 30], which sets additional practical limits to the degree of acculturation that can be required of applicants.

REVALUATED CITIZENSHIP AS NATION-BUILDING

Nationalism is, as noted, premised on a distinction between two kinds of groups, namely the state as a political community, which primarily involves citizenship as legal status and rights, and the nation as an identity-conferring community of sentiment and shared culture. Given this distinction, nationalism as a political theory makes the normative claim that there should be a specific kind of relationship between state and nation and the corresponding kinds of group membership. One way of formulating this is in terms of “nation-building” [see Weinstock, 2004, on the different kinds of nation-building]. Almost all states are in fact nation-building

6 There are also attempts at justifying liberal nationalism in non-instrumental terms, e.g. Miller 1995, chap. 3, but this raises the question whether such non-instrumentally justified forms of nationalism can be liberal; e.g. what the relationship is between the resulting special obligations among members of the same nation and the obligations arising from liberal principles. Kymlicka’s form of liberal nationalism is purely instrumental, and thus avoids this issue.

7 Liberal nationalist conceptions of nationality in cultural terms as opposed to ethnicity understood as biological descent may be problematic, however, insofar as membership in a nation is nevertheless characterised in terms of ascriptive properties such as “belonging”, “origins” or even descent, cf. Joppke 2005b: 7f.


Sune Lægaard

states in the sense that they have attempted to diffuse a single national culture, including a common national language, throughout their territory, promoted a particular national identity based on participation in that culture, and encouraged and sometimes forced all the citizens on the territory of the state to integrate into common public institutions operating in the national language [Kymlicka, 1995: 76-80, 2001a: 1, 25ff, 2003b: 267-270]. One formulation of nationalism as a normative view is the claim that it is a legitimate function of the state to protect and promote a national culture and language within its borders, which is to say that nation-building is permissible and maybe even required [Kymlicka, 2001a: 39; Weinstock, 2004: 56]. Citizenship policies, including naturalization requirements, are commonly listed among the tools states use in nation-building – along with standardized public education, official languages, compulsory military service, public symbols etc. [e.g. Kymlicka, 2001a: 1, 155, 2003b: 267; Banting and Kymlicka, 2004: 21, 2006: 40]. Kymlicka describes RC policies as “strengthened” and more “robust” nation-building policies [Banting and Kymlicka, 2004: 21f, 2006: 40]. 
Miller argues that the prevalence of these sorts of citizenship programmes confirms the liberal nationalist thesis that formal citizenship alone is not “a sufficiently strong cement to hold together a democratic welfare state, whose successful working depends upon relatively high levels of interpersonal trust and co-operation,” but which also requires citizens “to share a cultural identity of the kind that common nationality provides” [Miller, 2008: 378]:

If democratic states require nothing more of their citizens than subscription to some formal set of constitutional principles, then citizenship programmes which include, for instance, learning the national language and some aspects of the nation’s history and political culture would amount to a misguided attempt at cultural integration where none is needed. [Miller, 2008: 380]

This claim that the RC policies should be understood as attempts at cultural integration echoes Kymlicka’s characterisation of them as nation-building policies. But Miller’s claim furthermore seems to be that RC only makes sense as an attempt at integration into a national culture, i.e. that cultural integration is the only plausible rationale for such policies [cf. also Kostakopoulou, 2003]. But on what basis are RC policies classified as policies of nation-building and cultural integration, and in what sense are they a kind of nationalism? If the claim merely is that one way of justifying RC is by appeal to liberal nationalism, but that there might be other equally good justifications, this would certainly be plausible, but it would not establish that RC is a kind of nationalism in any strong or interesting sense. In that case nationality would not have any particular theoretical significance in relation to citizenship. But as noted, liberal nationalists have not only argued that nationality might play a role in relation to citizenship, but that RC policies as such are nation-building policies and that a concern with nationality is necessary to make sense of them. This means that RC policies are nationalist in virtue of their content or character, and are thus only justifiable, rather than merely potentially justifiable, on the basis of a concern with nationality [Miller, 2008, see Kymlicka, 2001b, for an analogous claim regarding territorial borders and membership practices of liberal states]. This claim raises the question whether it is indeed the case that RC is necessarily nationalist, or whether there might be alternative justifications for these policies. But it also



invites discussion of the notions that liberal nationalists rely upon in making their claims, i.e. the concepts of cultural integration and of the nation. Since the significance of the former claim and of liberal nationalism as a political theory more generally depends on what is meant by the latter terms, there is reason to discuss them in more detail.

THE CONCEPT OF “INTEGRATION”

Nationalism has been formulated in terms of the view that the state may and even should pursue nation-building policies in order to ensure that all citizens are members of the nation. In relation to naturalization requirements, this normative view finds expression in the demand that immigrants should “integrate into the national culture” as a condition for attaining citizenship, and RC policies have been characterised as “attempts at cultural integration” by liberal nationalists and have been endorsed by them as such. But the use of the concept of integration in this context is not unproblematic. According to one common usage, “integration” denotes a liberal approach to ethnic and cultural diversity. On this usage, integration is opposed to assimilation, which is understood as the nationalist demand that all members of society adopt the national culture and abandon all cultural attachments and practices that are incompatible herewith. In Britain, this usage became prominent in 1968 when the then British home secretary, Roy Jenkins, stated in response to Enoch Powell that:

I do not regard [integration] as meaning the loss, by immigrants, of their national characteristics and culture. I do not think that we need in this country a melting pot, which will turn everybody out in a common mould, as one of a series of carbon copies of someone’s misplaced vision of the stereotyped Englishman… I define integration, therefore, not as a flattening process of assimilation but as equal opportunity, coupled with cultural diversity, in an atmosphere of mutual tolerance. [Quoted in Parekh, 1990: 64, cf. Mason, 2000: 121, note 17]

This usage is common in political theory as well [e.g. Parekh, 1990: 63ff, cf. Joppke and Morawska, 2003: 4f; Joppke, 2008: 541]. There are several problems with it, however: First of all, liberalism and nationalism are not necessarily opposites, or this is at least what liberal nationalists claim. To what degree nationalism can actually be liberal, and in what specific senses and respects, is debatable, but the possibility should not be ruled out by mere stipulation. The same is the case with respect to integration and assimilation, which, on more precise and reasonable definitions, need not be opposed or conceptually incompatible. The concept of assimilation is potentially much broader than allowed by the traditional images of “melting pot” and “carbon copies”. On one fruitful conception, assimilation does not denote complete similarity but the process whereby a group of people becomes gradually more like another group in certain respects [Brubaker, 2004: 119f]. Policies with the aim of making members of minorities abandon at least some of their customs and practices can therefore be said to be assimilationist in a relevant sense [Mason, 2000: 121]. This means that not only nationalism, but liberalism too, implies policies that are properly described as assimilationist, to some degree and in some respects, even though the liberal reason for requiring the


abandonment of a given practice will normally be that it infringes the rights of individuals [e.g. Mason, 2000, chap. 3]. So the interesting question is not “integration or assimilation?”, but in which respects immigrants are required to integrate, to what degree this requires them to abandon their customs and practices, and what political means the state can use to secure the relevant kind of integration. Regarding the first question, Kymlicka writes in general terms about “integration into society” [e.g. 2001a: 51, 155] or into the “societal culture” [e.g. 2001a: 54, 156, 161f, 171]. But “society” and the “societal culture” are vague and complex notions (for the latter concept, see below). The common assumption that “society” is unitary and that immigrants integrate into society as whole persons is problematic [Joppke and Morawska, 2003: 3; Joppke, 2005b: 10f]. So it is necessary to specify the notion of integration of immigrants in different respects, which Kymlicka actually does in several places. Although his usage is not entirely systematic, he distinguishes between the following respects in which immigrants can integrate:

• Linguistic integration, i.e. learning the national language [e.g. 2001a: 51, 54, 174], which is a practical precondition for successful integration in the second sense, namely:

• Institutional integration [e.g. 2001a: 51, 54, 155, 162, 164f, 167, 169], i.e. participation in shared institutions functioning in the national language according to the applicable rules, which includes abiding by the law and respecting the rights and duties of citizens. This subdivides into: (a) public institutional integration, i.e. subjection to, participation in and employment in common state institutions, (b) civil society integration, i.e. participation in shared associations, the labour and housing markets etc. (Institutional integration can then proceed on more or less fair terms, cf. 2001a: 162-172).

• Civic integration, i.e. civic engagement and (active, participatory) democratic citizenship in the political community [2001a: 168].

• Psychological integration [2001a: 167f], i.e. a personal sense of identification with and belonging to the new society and/or community.

The kind of integration that can reasonably be required of immigrants, according to Kymlicka, is linguistic and institutional integration (provided the terms of integration are fair). Civic and psychological integration cannot be normatively required, but may follow as empirical side effects of linguistic-institutional integration. The requirement of linguistic-institutional integration is not only reasonable on liberal grounds; it is also widespread across western states. Immigrants are everywhere required to accept political principles of democracy and toleration or pluralism that do not mark a difference between different liberal societies but are characteristic requirements of any liberal democratic society [Carens, 2000: 120-123]. The only really distinctive cultural commitment setting requirements on immigrants to different liberal democratic states apart is knowledge of the particular national languages [Carens, 2000: 131, 2005: 45; Joppke, 2007: 45, 2008: 539], i.e. the specific form of linguistic integration. This “thin” notion of integration requiring acceptance of basic liberal values and knowledge of the national language and institutions is similar across liberal states [Joppke and Morawska, 2003: 5-9; Joppke, 2005b: 237ff]. In one



sense, contemporary Western states’ membership policies, even after the revaluation of citizenship, are therefore no longer in the service of reproducing particular nationhood [Joppke 2005a: 50]:

Their membership policies may still be notionally ‘nation-building’, but only in the generic sense of forging non-ethnic, liberal-democratic collectivities that are not different here from elsewhere. [Joppke 2005a: 53f, cf. 2005b: 239]

This is said to be the case because purely linguistic-institutional integration is neither especially substantial in cultural terms, nor involves much national particularity; it is a “minimalist” kind of nation-building [Weinstock, 2004]. Even if linguistic-institutional integration may over time lead, as a side effect, to some degree of civic and psychological integration, the resulting participation need not be in the national culture as a set of cultural practices and customs distinct from the political institutions of the state. Similarly, psychological identification resulting from linguistic-institutional integration need not be directed at the nation as a community distinct from the state, but may concern the society in a broader sense and the public institutions that have come to provide the setting for the life of the immigrant. So the question is whether, and in what sense, linguistic-institutional integration can amount to integration into a nation or national culture that would qualify it as a nationalist aim.

SOCIETAL CULTURES AND NATIONS

The reason why Kymlicka considers policies aimed at linguistic-institutional integration as nation-building policies [e.g. Kymlicka, 2003b: 289f] revolves around his notion of “societal culture” and his identification of societal cultures with nations. For a societal culture to exist, what matters is not the presence of “culture” in any ordinary (quasi)anthropological sense, but established institutions operating in a shared language within a delimited territory [Kymlicka, 1995: 76, 2001a: 25]. Here, “institutions” are not primarily to be thought of in the general theoretical sense as practices governed by public rules [Rawls, 1971/1999: 55f/47ff], which might apply to most established social practices, but in a more specific, ordinary and very concrete and material sense, as “schools, media, economy, government etc.” Kymlicka’s emphasis is on such individually identifiable institutions operating in a shared language within a given territory, so it might have been just as appropriate to characterise this social complex as an “organized society” (often a state). Societal institutions do embody culture in a sense beyond a shared language, of course, in the sense of conventions and practices, but Kymlicka stresses that such abstract cultural forms must be institutionalised in the noted concrete and material sense in order for a societal culture to exist. Since a societal culture primarily denotes a set of territorially concentrated institutions operating in a shared language, linguistic-institutional integration is in fact equivalent to integration into a societal culture, more or less as a matter of definition. Kymlicka furthermore equates societal cultures with nations:

The capacity and motivation to form and maintain such a distinct culture is characteristic of 'nations' or 'peoples' (i.e. culturally distinct, geographically concentrated, and institutionally complete societies). Societal cultures, then, tend to be national cultures. [Kymlicka, 1995: 80, cf. 2004: 120f]

Given this understanding, linguistic-institutional integration is not only equivalent to integration into a societal culture, but can be characterised as integration into a national culture. And it is in this sense that the diffusion by the state of a societal culture throughout a society, as well as policies aimed at securing linguistic-institutional integration, such as RC naturalization policies, are nation-building policies. So in Kymlicka’s sense, becoming a member of a nation or integrating into a national culture simply means functioning as a member of a society with public institutions operating in a shared language [cf. Weinstock, 2004: 55f]. But this is not equivalent to Miller’s stronger sense of “integration into the cultural nation” [Miller, 2008: 376, cf. Weinstock, 2004: 53]. According to Miller, becoming a member of the nation crucially involves people's sense of identity as well as their ways of thinking, behaving and sentiments toward other members of the nation [Miller, 1995: 21-27]. Miller’s conception of what it means to integrate into the nation may be characterised by way of Kymlicka’s notion of psychological integration, with the important difference that subjective identification for Miller is not merely a possible side effect of institutional integration, but the primary constituent of nationality. Miller’s fundamentally subjective conception of nationality also includes objective elements,8 but where Kymlicka focuses on participation in public institutions, Miller focuses more on what was termed “civil society integration” above, i.e. the informal social norms and expectations governing interaction in everyday life. So the concept of nationality is used by Kymlicka as denoting a “thin” sort of societal membership and by Miller as a label for “thicker” (although still liberally constrained) membership of a nation as a primarily cultural and affective community.

NATIONALITY AND THE INSTRUMENTAL ARGUMENTS

When it comes to the instrumental justifications for the need for common nationality, however, liberal nationalism depends on a more substantial form of nationality than Kymlicka’s societal one. As noted above, liberal nationalists justify the nationalist aim with reference to empirical claims about the instrumental functions of common nationality in relation to liberal political ideals such as individual freedom, social justice and deliberative democracy [Kymlicka, 1989, chap. 8, 1995, chap. 5, 2001a: 208-215, 225-29, 2004; Margalit and Raz, 1990; Tamir, 1993; Miller, 1995, chap. 4, 2000]. The sort of nationality that is important in these respects is nationality as a cultural precondition for choice and as an affective identity providing a sense of social trust and solidarity.

8 Mason, 2000: 116f, categorises Miller as an objectivist with respect to what it means to share a national identity, since Miller also claims that common nationality depends on objective characteristics such as a shared public culture. But Mason also notes that Miller’s conception of nationality depends in part on subjective beliefs, and that objective commonalities are necessary for shared nationality, according to Miller, because the subjective beliefs otherwise are impossible to uphold. All this is compatible with my characterisation of Miller’s conception of nationality as involving a necessary and strong subjective component that is apparently missing in Kymlicka’s societal conception.



Nationality in Kymlicka’s thin societal sense might certainly be an important precondition for individual freedom, since the institutions constituting the societal culture provide many of the options among which individuals can choose. But his point is that in order for choice to be genuinely free, options should not only be materially available, but must also be meaningful from the point of view of individuals [Kymlicka, 1995: 83, 2001a: 209f, 227f, 2004: 117f, cf. Miller, 1995: 85f.]. This requires a kind of correspondence between the societal institutions providing options and the cultural frame of reference of individuals who are to choose among them. But for this to make sense, the cultural frame of reference must refer to something internalised by persons, not just to the institutionally embodied “vocabulary of tradition and convention” [Kymlicka, 1995: 76, 2004: 117]. This means that the context of choice argument for liberal nationalism not only requires the existence of a societal culture as defined by Kymlicka, but also that the individuals who are to make choices within this cultural structure have internalised the relevant cultural traditions and conventions. So if institutional integration of immigrants is to be understood as national integration in a way connecting with the context of choice argument, it must involve a measure of psychological integration and cultural assimilation enabling immigrants to understand the options provided by the institutions.9 The need for a more substantial conception of nationality is even more evident in relation to the other two instrumental justifications for liberal nationalism. Taking social justice and deliberative democracy as given liberal ideals, the argument is that these ideals make motivational demands on people to trust and to feel solidarity with one another, and that a common national identity provides such sentiments [Miller, 1995: 90-98; Kymlicka, 2001a: 225f, cf. Lægaard, 2006]. 
What is required here is an affective, subjective or psychological, sense of national identity rather than (or in addition to) participation in a set of institutions operating in a shared language. The premise of these instrumental arguments is precisely, pace Rawls [1971], that people cannot be relied on to support institutions, not even if they are publicly known to be just; there must be something external that motivates them to do so. If nationality is what is supposed to motivate people to support democratic and redistributive institutions, then it cannot simply consist in their participation in these institutions, for this is exactly what supposedly cannot be taken for granted. So once again the arguments for liberal nationalism require a more substantial notion of common nationality than suggested by Kymlicka’s primarily institutionally defined concept of a societal culture. To sum up: The strengthened naturalization requirements characteristic of revaluated citizenship can be characterised as linguistic-institutional integration, and even as policies of nation-building in the thin sense defined in relation to Kymlicka’s notion of societal cultures. But the kind of common nationality required by liberal nationalism cannot be secured by mere linguistic-institutional integration understood along the lines of a societal culture, which

9 The individuals with whose meaningful choices this argument for integration is concerned might be the immigrants, the existing citizens, or both. In the case of the immigrants, the point of the argument is to secure their opportunity for making free choices in their new society. To require them to integrate for this purpose might seem paternalist and as such potentially problematic from a liberal perspective. According to Kymlicka’s liberalism [1989, chaps. 2-4, 1995: 80ff] this is nevertheless consistent with respect for the equal moral status of the immigrants, since the liberal ideal of individual freedom is concerned with the “right” rather than with a specific conception of the “good,” and the integration requirement is justified instrumentally with reference to this ideal. As a concern for the context of choice of existing citizens, the argument might alternatively be invoked as a reason against immigration, but for it to be plausible as such immigration must destroy the context of choice provided by established institutions, cf. Lægaard, 2007a.


is primarily institutionally defined. Whether one prefers to say that linguistic-institutional integration is insufficient or that successful linguistic-institutional integration also requires psychological integration and a degree of cultural assimilation in addition to the institutional participation suggested by Kymlicka’s characterisation of a societal culture is a merely terminological issue. This conclusion might seem to support Miller’s formulation of the relationship between RC policies and liberal nationalism, but there are also problems with his position: a) The kinds of RC policies regarding naturalization supported by Miller are not sufficient to secure the kind of cultural integration required by his more substantive version of liberal nationalism, and b) naturalization policies that would be sufficient would not be liberally acceptable. As to the first point, Miller asks if immigrants can be required to “absorb some aspects of national culture as a condition of being admitted to citizenship?” [2008: 385] But the answer he gives only refers to “teaching citizenship formally” to make people understand “what’s expected of them as citizens”, which is spelled out in terms of distinctively political values such as active participation. He concludes that the RC policies involving citizenship tests can be defended on national culture grounds. But these tests merely concern “a working knowledge of the national language and some familiarity with the history and institutions of the country” [2008: 385].
As an affirmative answer to the question whether immigrants can be required to absorb some aspects of the national culture as a condition of being admitted to citizenship, his claim only makes sense if a) the “national culture” in question only concerns the obligations of citizenship, proficiency in the official language and the noted kind of familiarity, and b) if successful performance at tests of these things can be taken as evidence that immigrants have in fact “absorbed” the culture in question. But both assumptions are problematic on Miller’s own conception of nationality, since his point is exactly that nationality must consist in more than acceptance of political principles and knowledge of political institutions and must be an internalised aspect of subjective and affective identity in order to motivate people in the required way. The substantial lesson is that the sort of integration involved in RC policies is not sufficient to secure the kind of integration required by liberal nationalist arguments. And this means that the RC naturalization requirements are only nationalist in the weak sense that they might be favoured on liberal nationalist grounds over more lenient conditions, not in any stronger sense. One might take this point, however, as an argument for much stricter naturalization requirements. This would reveal liberal nationalism as implying a much more exclusionary position conditioning access to citizenship on full psychological integration and a substantial degree of cultural assimilation. But here the third element of liberal nationalism as sketched in the introduction becomes relevant, namely its acceptance of liberal constraints on nation-building policies. 
One such constraint derives from the requirement that all long-term residents of a state should be able to become citizens within a reasonable period of time, because their subjection to the political authority of the state and its coercive imposition of rules and legislation is only justifiable if they themselves can come to participate as equal citizens in the democratic determination of the state’s policies [Miller, 2008: 377]. Taken in isolation, this democratic principle implies that access to citizenship should only be conditional on residence and sets limits to the period of residence that can be required [Bauböck et al, 2006: 32; Carens, 1989, 2002, 2005]. Access to citizenship should therefore be a right governed by objective conditions rather than a discretionary act of the state premised on individually examined ability to assimilate [Joppke, 2005a: 52]. This stress on



justifiability arguably fails to capture other aspects of citizenship, e.g. those related to obligations, but as an acknowledged component of liberal nationalism it suggests that it must be possible to satisfy any additional naturalization requirements, including demands for cultural integration, within such a limited period. The acceptance of this liberal constraint means that a liberal nationalism cannot simply extend the residence requirement as a means toward securing psychological integration. Add to this the practical problems of determining the degree of psychological integration, together with the lack of any necessary or even merely reliable connection between length of residency and psychological integration, and the conclusion seems to be the following: Even though liberal nationalism is fundamentally concerned with psychological integration and a substantial degree of cultural assimilation, it cannot secure this aim through naturalization requirements, both for practical reasons and since it would then cease being liberal.

NATURALIZATION REQUIREMENTS: INCLUSION OR EXCLUSION?

The foregoing discussion of liberal nationalism has assessed the claims of liberal nationalists regarding RC policies on the basis of assumptions internal to liberal nationalism. From this perspective the relevant or interesting function of naturalization policies is to secure integration of would-be citizens into the nation. The general aim of liberal nationalism as applied to the issue of access to citizenship is that all citizens should be members of the nation, and naturalization policies must be designed and evaluated in terms of how they contribute towards this end. This is an example of how membership of one group (the community of citizens of a state) is made conditional on membership of another (the nation). Because liberal nationalists both think it possible and desirable for non-members to become members of a nation, this is not equivalent to complete exclusion of non-nationals from citizenship. And because liberal nationalists are liberals, they do not require complete assimilation as a condition for naturalization. The tests characteristic of RC policies rather require applicants for naturalization to display a sufficient degree of effort to attain knowledge and abilities relevant to membership of the nation, and this is then taken by liberal nationalists as evidence that the aim of such policies is integration into the nation and that such requirements are also likely to be sufficiently effective tools in securing this aim. I have argued that there are reasons from within liberal nationalism for thinking that RC policies are neither necessary nor sufficient for this purpose. If this conclusion holds, then the positive claims of liberal nationalists about the close link between RC policies and liberal nationalist concerns are unfounded.
But this leaves the possibility that there might be a weaker relation of support from liberal nationalism to RC policies; such naturalization requirements might be preferable from a liberal nationalist point of view over less exacting conditions, even though RC policies do not secure the liberal nationalist correlation between citizenship and nationality. This section briefly considers whether there might be reasons to doubt even this weaker claim. The reason for doubt has to do with the way in which RC policies are exclusionary, and the way in which this exclusion is related to nationality. RC policies, as all policies regulating membership of a group, might be considered as both inclusionary and exclusionary [Joppke, 2008: 542]. According to the liberal nationalist


Sune Lægaard

interpretation of these policies, their aim is inclusion of a specific kind, namely integration into the nation. But precisely because the overall aim of liberal nationalism is that all citizens should be members of the nation, RC policies understood as instruments for this purpose will equally have exclusionary functions, namely to deny naturalization to those who do not qualify for membership in the nation by not performing sufficiently well at the tests. Inclusion and exclusion are necessary complements to each other as long as the requirements of membership of the nation are substantial and not merely formal; if the conditions for inclusion are so demanding that not all applicants will automatically meet them (for instance, after a certain length of residence), some will inevitably be excluded. If one takes up an external perspective on liberal nationalism as a view about naturalization, or on RC policies in general, there may be all kinds of reasons to criticise exclusionary implications of liberal nationalism or the exclusionary functions and effects of RC policies. Such discussions go beyond the scope of the present chapter. But there is nevertheless reason, even from the point of view of liberal nationalism, to consider the exclusionary effects of naturalization requirements like those embodied in RC policies. Even if, contrary to what has been argued hitherto, there actually is a close link between liberal nationalism and RC policies, such policies might also be problematic from the point of view of liberal nationalism. One way to frame the inclusion/exclusion issue is to say that RC policies seem to mark a shift in political emphasis from the status aspect to the identity aspect of citizenship [Joppke, 2008: 534].
Insofar as liberal nationalists are right in understanding RC policies as attempts at securing that all citizens are also members of the nation, it is true that such policies are motivated by what might be termed an “identitarian” concern for unity among and integration of citizens. But at the same time RC policies in fact regulate access to the status of citizen. So even if the link between RC policies and the nation as an identity community holds, one should not say that these policies indicate a shift from citizenship status to citizenship identity, but rather that access to citizenship status has been made conditional on considerations of identity. But even if the strong link between RC policies and liberal nationalist concerns does not hold, for instance because the national “identity” in question is too “thin”, without any cultural particular content [Joppke, 2008], the problem still is that even apparently “thin” identity political aspects of RC policies might be problematic from a liberal nationalist point of view. The problem is that, even if the content of the identity promoted by way of RC policies is purely universalistic, in terms of liberal values of equality, democracy and respect for individual rights, for instance, the institutionalisation of naturalization requirements making access to citizenship conditional on a properly expressed acceptance of these “national” values might have exclusionary effects that contradict the liberal nationalist aim of inclusion [Lægaard, 2007b]. The aim of liberal nationalism is to secure integration into the nation of all prospective citizens, and liberal nationalists understand RC policies in this light, that is, as mechanisms making demands on applicants that are likely to result in the required kind of identity. 
The mechanisms by which liberal nationalists suggest this inclusionary aim be pursued, in this case naturalization policies, are exclusionary since applicants who fail at the tests will be denied citizenship. This might not be a great problem if all that is required for membership of the nation is basic proficiency in the national language and rudimentary knowledge about institutions, i.e. something like Kymlicka’s societal notion of nationality. As argued above,



however, this conception of nationality is insufficient in relation to the instrumental arguments on which liberal nationalism is based. But if the valued kind of nationality is more akin to Miller’s “thicker” notion of affective identification with a cultural community, the association of this aim with exclusionary naturalization policies might be problematic from the point of view of the very same aim of inclusion. If applicants for naturalization are presented with strict requirements in the form of tests that have to be passed in order to gain access to the desired status of citizenship, and the level required for passing is considerable, the notion of nationality associated with these tests is just as likely to be viewed as a source of opposition to be overcome rather than as an object of affective identification. Rather than identifying with the nation in the relevant sense, applicants might view the requirements in a purely instrumental and strategic perspective, i.e. as obstacles to be overcome in the most efficient manner, or they might even develop negative relations to the national ideal that threatens to prevent them from gaining the full rights of citizenship of the country in which they reside. But in that case, which is not improbable given the level of requirements of many RC policies, the naturalization policies advocated by liberal nationalists on the basis of a concern with securing affective identification not only do not secure this aim but directly contradict and oppose its realisation.

CONCLUSION

RC policies have been claimed, by some, to be a form of liberal nationalism [Kymlicka, 2003a], and, by others, to be at most nationalist in name [Joppke and Morawska, 2003; Joppke, 2005a, 2005b]. The present paper has attempted to show that both views may in a sense be right. The sense in which RC policies regarding naturalization can be nationalist is minimal, however, in that they only involve linguistic-institutional integration and acceptance of liberal principles, and not a more substantial cultural form of assimilation. Whether one would call this a kind of nation-building is really a matter of taste, but insofar as one does, it should be stressed that the sense of nationality involved is extremely thin, to the extent that it really only involves participation in societal institutions operating in a shared language, including a demand for acceptance of the liberal values on which the institutions are based. As such, this kind of nationality does not in itself involve the kind of affective national identity and national culture on which the instrumental arguments for liberal nationalism rely. The upshot is that liberal nationalism as a substantial and not merely nominal position is not plausible as a view about access to citizenship, i.e. the status dimension of citizenship. This is not a claim that nationalism simply is a claim about citizenship and that it is problematic as such, nor is it to deny that nationalism is often plausibly taken to have implications for which naturalization policies states should pursue. As noted in the beginning of this chapter, nationalism as a political theory can be formulated in many ways depending on the conception of the nation and on the political significance that is claimed for it.
The chapter has solely been concerned with one particular application of a particular form of liberal nationalism, which is completely compatible with acknowledgement that nationalism is usually not only, or even primarily, a claim about citizenship. And many forms of nationalism do of course have clear implications for naturalization policy. Assessments of these implications will usually take an external perspective, i.e. evaluate whether the implied



policies are acceptable on the basis of other values or principles. The main claim of the present chapter is different from such external criticisms, since it is based on internal features of liberal nationalism, i.e. the conceptions of the nation and the liberal constraints that Kymlicka and Miller accept. The suggestion is that these internal characteristics of liberal nationalism do not underwrite the claims about RC policies that Kymlicka and Miller make. RC policies are not nation-building policies in the relevant sense, and do not only make sense as attempts at integration of immigrants into the cultural nation. Because of the liberal nationalist view of nationality and the nature of RC policies, there is no reason to see these policies as confirmations of the normative claims of liberal nationalism, or reason to think that liberal nationalism is necessary in order to justify such policies. So liberal nationalism does not, on its own premises, provide a very strong or specific position on naturalization policies. This claim does not rule out the possibility that liberal nationalism may have more to say with respect to the content of citizenship in terms of the identity and culture that citizens can be expected and encouraged by the state to adopt and engage in, i.e. the identity aspect of citizenship as independent from the issue of access to the status of citizenship. But a liberal nationalism that is still substantive cannot, then, be a view about the conditions for membership of the state. It might more plausibly, when this is once again understood as an internal claim, be a view about the cultural and educational policies that the state should pursue with respect to its members. On this basis, Miller’s claim that revaluated citizenship only makes sense as nation-building is true in the “thin” linguistic-institutional sense, but false in the “thicker” psychological sense. 
The true version of the claim depends, however, on a conception of the nation that is too thin for Miller and Kymlicka’s purposes. The second version is false, insofar as naturalization requirements and policies concerning the identity aspect of citizenship can be sufficiently justified without reference to an affectively internalised national culture.

REFERENCES

Andrew Mason (2000). Community, Solidarity and Belonging: Levels of Community and their Normative Significance. Cambridge: Cambridge University Press.
Ann Dummett (2006). United Kingdom. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 2: Country Analyses (pp. 551-585). Amsterdam: Amsterdam University Press.
Avishai Margalit, and Joseph Raz (1990). National Self-Determination. Journal of Philosophy, 87(9), 439-461.
Betty De Hart, and Ricky van Oers (2006). European trends in nationality law. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 1: Comparative Analyses (pp. 317-357). Amsterdam: Amsterdam University Press.
Bhikhu Parekh (1990). Britain and the Social Logic of Pluralism. In Bhikhu Parekh (Ed.), Britain: A Plural Society (pp. 58-76). Commission for Racial Equality, discussion paper 3.



Christian Joppke (2005a). Selecting by Origin: Ethnic Migration in the Liberal State. Cambridge, Mass.: Harvard University Press.
Christian Joppke (2005b). Exclusion in the Liberal State: The Case of Immigration and Citizenship Policy. European Journal of Social Theory, 8(1), 43–61.
Christian Joppke (2007). Transformation of Citizenship: Status, Rights, Identity. Citizenship Studies, 11(1), 37–48.
Christian Joppke (2008). Immigration and the identity of citizenship: the paradox of universalism. Citizenship Studies, 12(6), 533–546.
Christian Joppke, and Ewa Morawska (2003). Integrating Immigrants in Liberal Nation-States: Policies and Practices. In Christian Joppke, and Ewa Morawska (Eds.), Toward Assimilation and Citizenship: Immigrants in Liberal Nation-States (pp. 1-36). Basingstoke: Palgrave Macmillan.
Daniel Weinstock (2004). Four Kinds of (Post-)nation-building. In Michel Seymour (Ed.), The Fate of the Nation State (pp. 51-68). Montreal: McGill-Queen's University Press.
David Miller (1995). On Nationality. Oxford: Clarendon Press.
David Miller (2000). Citizenship and National Identity. Cambridge: Polity.
David Miller (2006). Multiculturalism and the welfare state: Theoretical reflections. In Keith Banting, and Will Kymlicka (Eds.), Multiculturalism and the Welfare State: Recognition and redistribution in contemporary democracies (pp. 323-338). Oxford: Oxford University Press.
David Miller (2008). Immigrants, Nations, and Citizenship. Journal of Political Philosophy, 16(4), 371–390.
Dora Kostakopoulou (2003). Why Naturalization? Perspectives on European Politics and Society, 4(1), 85-115.
Eva Ersbøll (2006). Denmark. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 2: Country Analyses (pp. 105-148). Amsterdam: Amsterdam University Press.
Han Entzinger (2003). The Rise and Fall of Multiculturalism in the Netherlands. In Christian Joppke, and Ewa Morawska (Eds.), Toward Assimilation and Citizenship: Immigrants in Liberal Nation-States (pp. 59-86). Basingstoke: Palgrave Macmillan.
Harald Waldrauch (2006). Acquisition of nationality. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 1: Comparative Analyses (pp. 121-182). Amsterdam: Amsterdam University Press.
Home Office (2002). Secure Borders, Safe Haven: Integration with Diversity in Modern Britain. White Paper. Available at: http://www.archive2.official-documents.co.uk/document/cm53/5387/cm5387.pdf
John Rawls (1971). A Theory of Justice. Revised edition (1999). Harvard: Harvard University Press.
Joseph H. Carens (1989). Membership and Morality: Admission to Citizenship in Liberal Democratic States. In Rogers Brubaker (Ed.), Immigration and the Politics of Citizenship in Europe and North America (pp. 31-49). Lanham, Md.: German Marshall Fund of America and University Press of America.



Joseph H. Carens (2000). Culture, Citizenship, and Community: A Contextual Exploration of Justice as Evenhandedness. Oxford: Oxford University Press.
Joseph H. Carens (2002). Citizenship and Civil Society: What rights for residents? In Randall Hansen, and Patrick Weil (Eds.), Dual Nationality, Social Rights and Federal Citizenship in the U.S. and Europe (pp. 100-118). New York: Berghahn Books.
Joseph H. Carens (2005). The Integration of Immigrants. Journal of Moral Philosophy, 2(1), 29-46.
Keith Banting, and Will Kymlicka (2004). Do Multiculturalism Policies erode the Welfare State? Queen’s University, School of Policy Studies Working Paper #33. Available at: http://www.queensu.ca/sps/publications/working_papers/33.pdf Revised version (December 2004) of “Do Multiculturalism Policies erode the Welfare State?” in Philippe Van Parijs (ed.): Cultural Diversity versus Economic Solidarity. Brussels: Deboeck.
Keith Banting, and Will Kymlicka (2006). Introduction. Multiculturalism and the welfare state: Setting the context. In Keith Banting, and Will Kymlicka (Eds.), Multiculturalism and the Welfare State: Recognition and redistribution in contemporary democracies (pp. 1-45). Oxford: Oxford University Press.
Peter Schuck (1998). Citizens, Strangers, and In-Betweens: Essays on Immigration and Citizenship. Boulder: Westview.
Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (2006). Introduction. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 1: Comparative Analyses (pp. 15-34). Amsterdam: Amsterdam University Press.
Ricky Van Oers, Betty de Hart, and Kees Groenendijk (2006). Netherlands. In Rainer Bauböck, Eva Ersbøll, Kees Groenendijk, and Harald Waldrauch (Eds.), Acquisition and Loss of Nationality. Policies and Trends in 15 European States. Volume 2: Country Analyses (pp. 391-434). Amsterdam: Amsterdam University Press.
Rogers Brubaker (2004). Ethnicity without Groups. Cambridge, Mass.: Harvard University Press.
Sune Lægaard (2006). Feasibility and Stability in Normative Political Philosophy: The case of liberal nationalism. Ethical Theory and Moral Practice, 9(4), 399-416.
Sune Lægaard (2007a). David Miller on Immigration Policy and Nationality. Journal of Applied Philosophy, 24(3), 283-298.
Sune Lægaard (2007b). Liberal Nationalism and the Nationalisation of Liberal Values. Nations and Nationalism, 13(1), 37-55.
Sune Lægaard (2009). Liberal Nationalism on Immigration. In Nils Holtug, Kasper Lippert-Rasmussen, and Sune Lægaard (Eds.), Nationalism and Multiculturalism in a World of Immigration (pp. 1-20). Basingstoke: Palgrave.
Will Kymlicka (1989). Liberalism, Community and Culture. Oxford: Clarendon Press.
Will Kymlicka (1995). Multicultural Citizenship. Oxford: Clarendon Press.
Will Kymlicka (2001a). Politics in the Vernacular: Nationalism, Multiculturalism, and Citizenship. Oxford: Oxford University Press.
Will Kymlicka (2001b). Territorial Boundaries: A liberal egalitarian perspective. In David Miller, and Sohail H. Hashmi (Eds.), Boundaries and Justice: Diverse Ethical Perspectives (pp. 249-75). Princeton: Princeton University Press.



Will Kymlicka (2003a). Immigration, Citizenship, Multiculturalism: Exploring the Links. In Sarah Spencer (Ed.), The Politics of Migration: Managing Opportunity, Conflict and Change (pp. 195-208). Oxford: Blackwell.
Will Kymlicka (2003b). New Forms of Citizenship. In Thomas J. Courchene, and Donald J. Savoie (Eds.), The Art of the State: Governance in a World without Frontiers (pp. 265-309). Montreal: Institute for Research on Public Policy.
Will Kymlicka (2004). Dworkin on Freedom and Culture. In Justine Burley (Ed.), Dworkin and his Critics (pp. 113-133). Oxford: Blackwell.
Will Kymlicka (2006). Liberal Nationalism and Cosmopolitan Justice. In Robert Post (Ed.), Another Cosmopolitanism (pp. 128-44). Oxford: Oxford University Press.
Will Kymlicka, and Wayne Norman (2003). Citizenship. In R.G. Frey, and Christopher Heath Wellman (Eds.), A Companion to Applied Ethics (pp. 210-23). Oxford: Blackwell.
Yael Tamir (1993). Liberal Nationalism. Princeton: Princeton University Press.

In: Group Theory Editor: Charles W. Danellis, pp.103-152


Chapter 4

THE CONSIDERATION OF RAPE AS TORTURE AND AS GENOCIDE: SOME IMPLICATIONS FOR GROUP THEORY

Daniela De Vito
Roehampton University, London, United Kingdom

“...there is no pain that lasts a hundred years, nor a body that will endure it, ... there is no pain greater than the pain of being alive - a Nicaraguan saying”¹

INTRODUCTION

Within the past few decades, relatively rapid developments have occurred in relation to the promotion, protection, and even at certain levels the enforcement of human rights norms. A plethora of international, regional and national human rights instruments and mechanisms have been established. Despite these critical steps, the human rights discourse which involves States, international organisations such as the United Nations (UN), non-governmental organisations, the media, individuals, theoretical and legal developments, etc., continues to adapt and to reveal discrepancies between what is required and what in reality occurs. How rape has been conceptualised, placed, and treated by various institutions and instruments within international human rights and humanitarian law presents both inconsistencies and, in recent times, innovative conclusions. With respect to inconsistency, when rape is mentioned explicitly within, for instance, the context of international

1 Quote taken from: James Quesada. “Suffering Child: An Embodiment of War and Its Aftermath in Post-Sandinista Nicaragua.” P. 51.



humanitarian law, it tends to be associated with a woman’s “honour” and not treated as a crime of violence.² Alternatively, an emphasis is placed on the protection of women and not on the prohibition of rape. As long as no single authoritative instrument or provision exists for defining rape within regional and United Nations (UN) human rights instruments, it will not be possible to point to an overarching definition of rape that can be utilised within the context of international humanitarian law. However, in 1998, the Trial Chamber for the International Criminal Tribunal for Rwanda (ICTR) included within its Judgement in the case of the Prosecutor v. Jean-Paul Akayesu an attempt to define rape within international law.³ Highly innovative, this definition has been used as a starting point for subsequent international criminal tribunal reflections on how rape can be categorised. (See also Elements of Crimes, Rome Statute, International Criminal Court, War Crime of Rape and Crime against Humanity of Rape.) In contrast, there is a series of international crimes,⁴ such as torture, that have been conceptualised and treated as crimes of violence, and as such their prohibition within international law is considered paramount.⁵ Furthermore, beyond rape being subsumed within the categories of such international crimes as torture, genocide, the grave breaches provisions of the Geneva Conventions (1949), or crimes against humanity, rape currently does not stand on its own as an enumerated international crime. Rape is prohibited under international law, but is not specifically designated as an international crime. The result of this process is that rape must be subsumed within an established international crime, such as genocide, crimes against humanity, war crimes, etc., if it is to be prosecuted within an international criminal tribunal or the recently

2 Rhonda Copelon has argued that where rape is mentioned in the Geneva Convention (1949) it is conceptualised as an “attack against honour”, rather than depicted as a crime of violence. She argues this is problematic, because it marginalises the seriousness, as well as the violent nature, of rape under international humanitarian law. She urges that rape should be viewed as a form of torture, in order to remove the ambiguity that is the legacy of sexism and to place such crimes against women on a par with crimes against men (Rhonda Copelon, 1999: 337).
3 In its findings, the Trial Chamber defined rape as “…a physical invasion of a sexual nature, committed on a person under circumstances which are coercive.” The Chamber also stated: “…rape is a form of aggression and that the central elements of the crime of rape cannot be captured in a mechanical description of objects and body parts. This approach is more useful in international law.” ICTR, Prosecutor v. Jean-Paul Akayesu (Case No. ICTR-96-4-T, 2 September 1998): 138. The Akayesu Judgement provided an overarching and potentially progressive definition of rape where none had existed before in instruments of international law. The case also established that rape could be tried as a component of genocide if committed with the intent to destroy a targeted group.
4 There remains dispute over what is considered an international crime (“crime under international law”?). Overall, an international crime, in relation to State responsibility, may be understood as “…those acts which the ‘international community as a whole’ considers to be serious breaches of obligations essential for the protection of fundamental interests of that community.” UN Document A/CN.4/453 and Add. 1-3, “Fifth Report on State responsibility by Mr. G. Arangio-Ruiz, Special Rapporteur,” Yearbook of the International Law Commission II (1) (1993): 31.
5 See Articles 1, 2, 4 and 5 from the United Nations Convention against Torture and other Cruel, Inhuman or Degrading Treatment or Punishment (1984). P.R. Ghandhi, 2000: 109.



established International Criminal Court.⁶ As such, this chapter determines and assesses some of the theoretical implications for rape that emerge once it is subsumed into an established international crime, specifically the international crimes of torture and genocide. The differences between rape as a form of torture, with its emphasis on the individual, and rape as genocide, which focuses on violations committed against the group, will form the basis of this theoretical analysis. The question therefore becomes, “does the dynamic of rape alter when it is subsumed within these complex and contrasting international crimes?” Furthermore, it will be argued that any study of group theory, as it relates to rape within the context of international law, must appreciate the relationship between rape as it affects the individual and rape as it affects the group. In general, rape has been conceptualised as a crime committed against individual victims. In turn, the international crime of torture has been constructed as a human rights violation committed against the individual. The consideration of rape as a form of torture is a recent innovation. At the international level, the first formal association was made by the United Nations (UN) Special Rapporteur in 1986. Rape as torture still emphasises the individual victim/survivor. In contrast, the critical focus of genocide is the protection of entire human groups. It is the ‘right to existence’ of human groups and not of individuals which is the concern.⁷ When rape has been considered as genocide, which is conceived as a crime committed against certain groups, its dynamic changes. Rape is no longer just a violation against the individual. Rape becomes part of a notion set up to protect the group.
The determination of this chapter is that there is still a place for the individual victim of genocide or of rape as genocide. However, as with the current concept of human rights, this space is unequal and at times uncomfortable. Crucially, even with innovative jurisprudence there is a need to assess the complex relationship between rape that affects the individual and rape as genocide which is placed within the group dynamic. Although the chapter will touch upon international human rights and humanitarian law, it will not function within a traditional legal framework. Analysing recent changes achieved within the human rights discourse, in relation to how rape has been conceptualised, placed and treated, will instead be driven by conceptual resources found within the broad discipline of political theory. For instance, when is a human rights violation a public/private matter? Also, when are rights concerns for individuals or for groups? Underlying the chapter are two fundamental questions: (1) What are some of the theoretical implications of considering rape as a form of torture when committed within the spheres of international human rights and humanitarian law? and; (2) What are some of the theoretical implications of considering rape as genocide?

6 For more on this, please refer to the Statutes for the International Criminal Tribunal for Rwanda, the International Criminal Tribunal for the Former Yugoslavia, the Sierra Leone Special Court, and the International Criminal Court.
7 See for instance, Prosecutor v. Jean-Paul Akayesu (International Criminal Tribunal for Rwanda (ICTR), Trial Chamber Judgement, 1998). Hereinafter the Akayesu case.



Three main areas of discussion, emanating from two real-life international criminal tribunal cases, have been identified to arrive at an understanding of some of the theoretical implications that can affect rape. These two cases, one taken from the International Criminal Tribunal for the Former Yugoslavia (ICTY) and the other from the International Criminal Tribunal for Rwanda (ICTR), offer an opportunity to extract and to assess certain theoretical issues that could affect the understanding of rape. The ICTY case (2001) discusses rape and torture and also acknowledges the individual component of rape with references to violations of sexual autonomy.⁸ In contrast, the ICTR case (Akayesu case, 1998) proposes for the first time that under specific circumstances rape could be genocide, thereby constructing rape within the formulation of genocide, a crime against the group. The three main areas of deliberation are: possible tensions within the current concept of human rights between the individual and the group; understanding rape as a violation committed against the individual; and what can occur to rape once it has been subsumed within an established international crime such as torture or genocide. These factors are crucial to determining and assessing some of the theoretical implications emerging from recent and innovative international jurisprudence. The chapter is organised into six sections, including this Introduction:

Torture and Genocide Briefly Defined
Setting the Theoretical Framework: Subsuming Rape into Established International Crimes (Torture and Genocide)
Rape as a Form of Torture: Arriving at an Understanding of the Political
Rape as Genocide: Theoretical Implications for the Group
Conclusion

TORTURE AND GENOCIDE BRIEFLY DEFINED

Torture

The right not to be subjected to torture is listed in several international and regional human rights instruments.⁹ The non-derogable nature of this right, meaning that even under declared states of emergency it can never be suspended, is well founded within several regional and international human rights instruments and within international humanitarian law. Its position within what is referred to as ‘general international law’ is commonly accepted. Briefly on this subject, through various international, regional, and even domestic developments achieved since the establishment of the UN, the prohibition of torture fits into

8 Prosecutor v. Dragoljub Kunarac, Radomir Kovac and Zoran Vukovic, ICTY, IT-96-23-T & IT-96-23/1-T, Judgement, 22 February 2001. Hereinafter the Foca Case. Although the Kunarac et al. case focuses on Crimes Against Humanity, it has been chosen for analysis due to the discussions of rape and torture and the notion of sexual autonomy.
9 For example, this right is explicitly noted in the Universal Declaration of Human Rights (1948), Declaration on the Protection of All Persons from Being Subjected to Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment (1975), Convention against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment (1984), Inter-American Convention to Prevent and Punish Torture (1985), European Convention for the Prevention of Torture and Inhuman or Degrading Treatment or Punishment (1987), Body of Principles for the Protection of All Persons under any Form of Detention or Imprisonment (1988), Code of Conduct for Law Enforcement Officials (1979), etc.



the category of general international law which emanates from certain established sources of international law such as treaties or custom. (Nigel Rodley, 1999) What constitutes torture? More than one definition of torture exists, such as those outlined under the UN and the Inter-American human rights regimes. The understanding of torture varies. For example, the UN Convention (1984) mentions that torture involves the infliction of ‘severe’ pain or suffering while the Inter-American Convention (1985) leaves out this element of severity.10 In contrast, the European Convention (1987) does not include a definition of torture. Torture, as outlined in the Greek Case,11 can be understood from the perspective of a ‘sliding scale’ with a series of phases surpassed in order to arrive at this international crime: “So, for torture to occur, a scale of criteria has to be climbed. First, the behaviour must be degrading treatment: second, it must be inhuman treatment; and third, it must be an aggravated form of inhuman treatment, inflicted for certain purposes.” (Nigel Rodley, 1999: 77-78. See also: OSCE, 1999: 37; Duncan Forrest, 1991: 1-20; Bertil Duner, 1998: 3-62; and Michael Peel, 2005: 7-19)

A more contemporary examination of these degrees of acts can be found in the International Criminal Tribunal for the Former Yugoslavia (ICTY) Trial Chamber’s decision in the ‘Celebici’ case. Due to how the charges of rape were formulated, namely under the heading of ‘Torture and Cruel Treatment’, the Trial Chamber had to work through not only what constituted torture and cruel treatment, but also into which category rape should be placed in this particular case. In a detailed analysis the Trial Chamber, albeit within the context of an international armed conflict, examined the various components of torture; wilfully causing great suffering or serious injury to body or health; inhuman treatment; and cruel treatment.12 Which definition of torture, then, will be most useful for this chapter? The two cases mentioned emanate from the UN human rights system and address situations of international

10 Article 1 of the UN Declaration (1975) reads “…torture means any act by which severe pain or suffering, whether physical or mental, is intentionally inflicted…” Article 1 of the UN Convention (1984) reads “1. For the purposes of this Convention, the term ‘torture’ means any act by which severe pain or suffering, whether physical or mental, is intentionally inflicted on a person…” Article 2 of the Inter-American Convention (1985) reads “…torture shall be understood to be any act intentionally performed whereby physical or mental pain or suffering is inflicted…” P.R. Ghandhi. Pp. 109 and 315. N. Rodley. P. 389.
11 The Greek Case was, in 1967, brought before the European Commission of Human Rights by four member States alleging human rights violations, including torture, committed by the Greek Government under the European Convention for the Protection of Human Rights and Fundamental Freedoms (1950). In the words of the Commission: “…for all torture must be inhuman and degrading treatment, and inhuman treatment also degrading. The notion of inhuman treatment covers at least such treatment as deliberately causes severe suffering mental or physical, which, in the particular situation, is unjustifiable. The word torture is often used to describe inhuman treatment, which has a purpose, such as the obtaining of information or confessions, or the infliction of punishment, and it is generally an aggravated form of inhuman treatment.” Yearbook of the European Convention on Human Rights (The Greek Case). V. 12. 1969: Introduction and 186. The inclusion, by the Commission, of an element of justifiability has been criticised since it could be construed that in particular circumstances such acts could be acceptable. Nigel Rodley, 1999: 78-84.
12 Prosecutor v. Zejnil Delalić et al. ICTY, IT-96-21-T, Judgement, 16 November 1998 (hereinafter the Celebici case). Pp. 161-162.


Daniela De Vito

or non-international armed conflicts. It is not set in stone that the UN Convention (1984) and its definition of torture must be followed for situations of armed conflict. Yet, in an interesting assessment, the ICTY Trial Chamber has argued: “It may therefore be said that the definition of torture contained in the Torture Convention (1984) includes the definitions contained in both the Declaration Against Torture and the Inter-American Convention and thus reflects a consensus which the Trial Chamber considers to be representative of customary international law.” (Celebici case, 1998: 170)

In contrast, however, the ICTY Trial Chamber in another decision concludes “…that the definition of torture under international humanitarian law does not comprise the same elements as the definition of torture generally applied under human rights law. In particular, the Trial Chamber is of the view that the presence of a state official or of any other authority-wielding person in the torture process is not necessary for the offence to be regarded as torture under international humanitarian law.”13

As such, the Trial Chamber used customary international law to define torture within the confines of international humanitarian law as involving “(i) The infliction, by act or omission of severe pain or suffering, whether physical or mental. (ii) The act or omission must be intentional. (iii) The act or omission must aim at obtaining information or a confession, or at punishing, intimidating or coercing the victim or a third person, or at discriminating, on any ground, against the victim or a third person.” (Foca case, 2001: 170) Furthermore, in the Rome Statute for the recently established International Criminal Court there has been a shift away from the requirement of an ‘official link’ in the commission of torture and under certain circumstances (Crimes against humanity) a specific ‘purpose’ is not required.14 Regardless, the definition of torture as found under the UN Convention (1984) will be used for relevant cases. In other circumstances, the appropriate instrument or interpretation will be referenced on a case-by-case basis.

13 Prosecutor v. D. Kunarac (et al). IT-96-23-T & IT-96-23/1-T (hereinafter the ‘Foca case’). Judgement. 22 February 2001.
14 The UN Human Rights Committee, in reference to Article 7 of the ICCPR, also argued this point in 1992: “It is the duty of the State party to afford everyone protection…against acts prohibited by article 7, whether inflicted by people acting in their official capacity, outside their official capacity or in a private capacity.” Foca Case. P. 164. See also the Rome Statute for the International Criminal Court (ICC) – Elements of Crimes – in relation to the issue of an ‘official capacity’ and ‘purpose’. Articles 7 and 8.



Genocide

Considering rape as genocide may have an impact on the prohibition of rape and on rape as a form of torture. Why is it necessary to formulate a definition of genocide? Such definition making is critical for any attempt to understand what has taken place in a particular situation. Without a definition, it is not possible to determine if genocide was/is present. However, Helen Fein is correct in asserting that at times the term genocide is overly used, thereby watering down its meaning. (Helen Fein, 1993: 4) A definition is also necessary for prevention in the face of “…a potential or developing crisis.” (George J. Andreopoulos (ed.), 1994: 4) Despite the need to define genocide, its formulation is fraught with difficulties. That is, “There is no single correct definition of genocide waiting to be discovered.” (Michael Freeman, 1991: 186) This is not to say that the more definitions of genocide exist, the better. Rather, as witnessed by the Genocide Convention (1948), gaps can exist or clarifications need to be made in terms of what constitutes genocide. One key difference from earlier episodes of genocide is that the twentieth century versions increasingly involved genocide committed by the State against its own citizens under situations outside inter-State wars, in other words under conditions not necessarily involving conflict between States. (Chalk & Jonassohn, 1990: 9) Therefore, genocide has taken place, more and more, within the parameters of intra-State (within the State) realities.15 The push to place the prevention and punishment of genocide within the paradigm of post-World War II international legal norms originated with Raphael Lemkin. Specifically, he wanted the United Nations (UN), newly formed in 1945, to tackle the subject. However, the roots of trying to deal with some of the issues involved in ‘genocide’ can be traced centuries earlier, to the 17th Century.
The religious wars and “inhumane international conditions” in Europe during the 1600s prompted the legal theorist Hugo Grotius (1583-1645) to push for reforms. He urged the powers at the time to resolve conflicts via legal structures rather than wars. He believed that this method would promote “...co-operation, peace and more humane treatment of people.” (Kegley & Wittkopf, 2003: 481) In his examination of genocide, Raphael Lemkin determined that it was the “...denial of the right of entire groups to exist (and that) the punishment of the crime of genocide is a matter of international concern.” In December 1948, the UN’s General Assembly unanimously adopted the Genocide Convention (1948).16 It entered into force in January 1951. Precisely defining what constitutes genocide was not only difficult in 1948, but is still contentious today. (Damien Short, 2008; Adam Jones, 2006; Martin Shaw, 2006; and Martin Shaw, 2007) One of the most serious problems is deciding what constitutes a victim group of genocide.17 For example, the Genocide Convention (1948), under article II, attempts to define potential victim groups. It defines genocide as “...acts committed with the intent to destroy, in whole or in part, a national, ethnic, racial or religious group.” One ongoing

15 This fact could have important implications, for example, in relation to the tensions between State sovereignty and the issue of humanitarian intervention under international law.
16 GA Res. 260A(III), UN Doc. A/760. Ratner & Abrams, 2001: 27-28 and n. 11.
17 Other concerns stem from the precise meaning of “in whole or in part” and how to determine “intent”. For more on these issues, please see below and also refer to George J. Andreopoulos.



concern is that this definition does not provide protection to those who do not technically fall into these categories, such as members of ‘political’ groups. In turn, Kelly Dawn Askin proposes that ‘gender’ - in particular “women as a gender” - be added as one of the groups outlined in the Genocide Convention.18 She notes that gender, as it is understood, would not go beyond the intentions of those who framed the Genocide Convention since they wanted “…to extend protection to stable groups only – groups having an enduring identity”. (Kelly Dawn Askin, 1997: 342) In the Akayesu case (ICTR), rape constituted genocide since it was a method used against Tutsi women. The aim was to ‘destroy in whole or in part’ the ethnic group of Tutsis via rape. The effects of being raped, of witnessing relatives being raped, and of unwanted pregnancies contributed to this ‘destruction’. According to the ICTR Trial Chamber, the women were targeted due to their membership in the Tutsi group. However, and with the understanding that the Genocide Convention (1948) is the basis for any examination of genocide, only focussing on the fact that these women were Tutsi and putting aside the idea that they were raped because they were women or individuals may present an incomplete overview of the crimes that occurred in Rwanda. Even with a working definition of genocide, the Genocide Convention does not include all potential acts. The issue of ‘cultural genocide’ as an ‘enumerated act’ under the Genocide Convention (1948), such as “prohibiting the use of a language and destroying or preventing the use of libraries”, etc., was discussed at the General Assembly and at the UN’s Economic and Social Council (ECOSOC). This section was not included in the final draft of the Genocide Convention passed in 1948.19
In addition, the tension created by defining ‘who’ is a potential victim and what genocide could reflect highlights the need to move beyond an understanding of the subject that is solely framed by the horrible events of the Holocaust. (Alan J. Rosenbaum, 2001: 22) In other words, genocide can involve several types of groups and methods/realities beyond only “...the complete biological destruction...” of a group. (Ratner & Abrams, 2001: 22) Beyond a working definition of genocide, what are some of the other issues that the Genocide Convention (1948) outlines? Under article III, the 1948 Convention provides for the punishment of genocide.20 It also provides for the prevention of this crime. Under article VIII, any “Contracting party” may ask the UN to take “...action under the UN Charter (1945) for the prevention and suppression of acts of genocide or any of the acts in article III.” (P.R. Ghandhi, 2000: 20) Once again, Article 4 of the ICTY Statute borrows language from the relevant sections of the Genocide Convention (1948). Furthermore, genocide is considered an international crime whether it occurs during war or during peacetime.21 Although this provision of the Genocide Convention (1948) is not explicitly found in Article 4 of the ICTY Statute, the ‘Final Report of the Commission of

18 The connection is between sex crimes and genocide: “The sex crimes against women occur specifically because of their gender.” Kelly Dawn Askin, 1997: 342 and n. 1072.
19 One concern was that a definition of ‘cultural genocide’ would be difficult to develop and therefore could be open to “…abusive and illegitimate claims of genocide”. Cultural genocide had been included in two earlier drafts of the Genocide Convention. Ratner & Abrams, 2001: 31.
20 In relation to genocide and permissive universal jurisdiction.
21 Article I of the Genocide Convention reads “…genocide whether committed in time of peace or in time of war, is a crime under international law.” P.R. Ghandhi, 2001: 19.



Experts’ for the Former Yugoslavia highlights both when genocide can take place and that it is an international crime.22 Article II of the Genocide Convention (1948) – which is repeated in the ICTY Statute – mentions the important component of ‘intent’.23 That is, for genocide to be proven it must be demonstrated that the perpetrator(s) had the intent to destroy in whole or in part a group(s).24 The individual on its own is not the focus. Rather, several individuals as members of a national, ethnical, racial or religious group “as such” are the concern.25 The ‘Final Report of the Commission of Experts’ describes intent overall as “…the subjective or mental constituent element of a crime”.26 In relation to genocide, the Commission focused on whether there was enough evidence to show reasonably that the perpetrator would have been “…aware of the consequences” that his or her actions would have inflicted onto the target group(s).27 For the Commission this would reveal intent but not necessarily the motive behind the acts or omissions.28 Despite this attempt to work through what constitutes intent,

22 In particular, the Commission noted “Thus, irrespective of the context in which it occurs (for example, peace time, internal strife, international armed conflict or whatever the general overall situation) genocide is a punishable international crime.” Bassiouni & Manikas. Quoting from the ‘Final Report of the Commission of Experts’ pursuant to Security Council Resolution 780 (1992), UN SCOR (Security Council Official Record), Annex, UN Doc. S/1994/674 (27 May 1994): 523.
23 The ICTY Statute reads “Genocide means any of the following acts committed with intent to destroy, in whole or in part, a national, ethnical, racial or religious group, as such:” Ibid: 521. Social and political groups are also excluded from protection under Art 4 of the ICTY Statute. Id. 526.
24 Article 4 (2) of the ICTY Statute includes a list of act(s) - actus reus (guilty act) - which inter alia are killing (a), causing serious bodily or mental harm (b), the forcible transfer of children (e), etc. Id. 521-522.
25 According to the ‘Final Report of the Commission of Experts’ the wording of “as such” allows for an understanding of genocide to include “…the crimes against a number of individuals must be directed at them in their collectivity or at them in their collective character or capacity.” Id. 524. The categories of ‘national’, ‘ethnic’ and ‘racial’ can be understood as follows: “…the term ‘national group’ as a set of individuals whose identity as such is distinctive in terms of nationality or national origins…The difference between the terms ‘ethnic’ and ‘racial’ is perhaps harder to grasp. It seems that the ethnic bond is more cultural…The racial element, on the other hand, refers more typically to common physical traits.” Morris & Scharf, 1998: 173 at n. 698-700. It should be noted that membership in a particular group may also involve a subjective element. This means that a person may consider themselves as being part of a group even if technically they do not fit all the parameters. Prosecutor v. Alfred Musema, Judgement. Para. 161.
26 Id.
27 Id. Intent, in relation to genocide, is described as “…the criminal mens rea”. Kelly Dawn Askin, 1997: 338. Mens rea (guilty mind) involves “The state of mind that the prosecution must prove a defendant to have had at the time of committing a crime in order to secure a conviction.” Elizabeth A. Martin (ed.). Oxford Dictionary of Law. P. 290.
28 Id. To elaborate, “…most commentators agree that so long as the requisite intent is established, underlying motives are irrelevant.” Ratner & Abrams, 2001: 38.



it nonetheless remains unsettled. (Ratner & Abrams, 2001: 36) One area of debate relates to the issue of evidence, since often it may be circumstantial or indirect.29 Using the Genocide Convention (1948) as a starting point, what does genocide consist of? Steven Ratner and Jason Abrams outline “three main elements”. They are as follows:

- “the commission of at least one of the acts enumerated in Article II(a) through (e);
- the direction of that act at one of the enumerated types of groups; and
- the intent to destroy the group in whole or in part.”30

Thus, the process of defining genocide continues today. The very real tensions created by this debate must be addressed. If not, and unless workable definitions exist in international law and in relevant academic disciplines, acting on the prevention and punishment of this crime will continue to be bogged down by situations such as States incorrectly claiming that genocide is not “technically” taking place as defined by the Genocide Convention (1948). The measures undertaken during 2004 at the UN and within the broader international community, regarding whether or not the human rights violations that have taken place in the Darfur region of Sudan constitute genocide, are one example of this difficulty. Illustrating these shifting debates is not intended as a precursor to solving the tensions involved.31 Rather, it is meant to elucidate the point that definitions of genocide do not offer a complete picture of what genocide can constitute. The same can be true for any definition of rape, torture, and rape as a form of torture.32

29 Ibid and Supra n. 27 in relation to the lack of a ‘plan’ per se.
30 Ibid: 27. In terms of ‘who’ can commit genocide – that is, does it have to be the ‘State’ or can other actors through their acts also commit genocide? – during the deliberations to formulate the Genocide Convention it was decided by the Ad Hoc Committee “…not to insert in definition any words to the effect that genocide must be committed, encouraged or tolerated by heads of state or by those who lawfully or factually exercise governmental authority.” Bassiouni & Manikas, 1999: 527 and n. 100.
31 For example the 1985 Whitaker Report of the UN “…called for decisive amendment of the Convention to include all political mass murders.” George J. Andreopoulos, 1994: 75. Whitaker Report: “Revised and updated report on the question of the prevention and punishment of the crime of genocide prepared by Mr. B. Whitaker, Review of Further Developments in Fields with which the Sub-Commission Has Been Concerned”. UN Doc. E/CN.4/Sub.2/1985/6 (2 July 1985). Also, the International Criminal Tribunal for Rwanda (ICTR) Trial Chamber in the Akayesu Judgement (at para. 516) “…stated that the list of four groups in the Convention (Genocide) is not exclusive insofar as the Convention’s drafters sought to protect ‘any stable and permanent group’.” Ratner & Abrams, 2001: 42.
32 I am grateful to Nigel Rodley for insisting on this point.



SETTING THE THEORETICAL FRAMEWORK: SUBSUMING RAPE INTO ESTABLISHED INTERNATIONAL CRIMES (TORTURE AND GENOCIDE)

As outlined in the Overall Introduction, the approach taken is an analytical one driven by the resources of political theory, while touching upon international law (human rights and humanitarian). The reason for adopting this interdisciplinary format is to effectively examine some of the theoretical arguments that underpin claims to human rights, and to assess the ways in which the crime of rape has been conceptualised, placed and treated within international law. Undertaking this broader approach offers a more comprehensive discourse; theory and law should not be viewed as necessarily mutually exclusive and, more importantly, each discipline can assist the other.

This section sets out the broad theoretical framework underpinning the work. The aim is to highlight some of the critical themes such as how the status, role, and rights of the individual have changed and adapted over time to be able to fit within the current concept of human rights and how an emphasis on the individual exists within current understandings of human rights. The final aspect outlines a process whereby under certain circumstances rape can be subsumed into the established international crimes of torture and genocide. This will provide the grounding which will enable the exploration of the two theoretical implications identified: what can occur when rape is linked to torture and what can occur when rape is associated with genocide. It is the individual and the group elements of torture and genocide that will affect this analysis.

The State, the Individual and Human Rights

We can, at first glance, see an important connection between liberalism and the concept of human rights: the emphasis on the individual. Again, one can debate to what degree the concept of human rights allows for notions such as the collective, group rights, and the community. In essence, there are three main features that link the political ideology of liberalism with human rights:

- individualism
- universalism
- all individuals as being ‘morally equal’

These features, it can be argued, evoke a certain level of controversy. Yet, if one looks at Article 1 of the UDHR (1948), echoes of this liberalism emerge: “All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.” (P.R. Ghandhi, 2000: 22)33

33 Beyond liberalism the UDHR also echoes the broader notion of natural rights. As Johannes Morsink points out “Thus the similarity of language between the Universal Declaration and the eighteenth-century declarations of rights should not be taken too lightly.” Morsink, however, adds “Thus it is clear that although natural rights philosophy does inform the Universal Declaration, it does not inform all parts equally.”



The seeds leading to the development of the concept of human rights are not only found in liberalism. Furthermore, other traditions may understand and present human rights differently. The implications of this may or may not adversely affect the overall concept of human rights. So, for the sake of clarity I will understand the current concept of human rights as emphasising the individual. However, this individual can to varying degrees also have a place, role, and duties and receive benefits within his or her respective community. This approach will, hopefully, allow for more flexibility in the theoretical discussions to follow.34 During the 17th and 18th centuries changes occurred in the relationship between the ‘state’ and the individual where it was understood that the former existed for the individual and not the inverse. (Norberto Bobbio, 1995: 41) Regardless, due to how the current international system has been developed and continues to function, the role of the State vis-à-vis the individual and his or her human rights continues to be paramount. First, and with reference to the UN, its Charter and how it operates are State-centred. Human rights are indeed mentioned in the UN Charter. However, these passages must be considered alongside Article 2(7), which addresses State sovereignty and domestic jurisdiction matters, since these are still relevant despite the plethora of human rights instruments and mechanisms subsequently developed within UN structures.35 Second, and in a twist of irony, despite the fact that human rights are to be used as a buffer in the face of abusive State power and violence, it is States themselves that are charged with upholding lofty human rights norms. Taken to the extreme, this set-up is similar to a scenario whereby the fox is charged with protecting the chickens it also feeds upon.
As Article 2(1) of the International Covenant on Civil and Political Rights (1966) reads “Each State Party to the present Covenant undertakes to respect and ensure to all individuals within its territory and subject to its jurisdiction the rights recognised in the present Covenant.” (P. R. Ghandhi, 2000: 64)

Bringing complaints of human rights violations to international and regional human rights mechanisms is considered an option of last resort. The State that violates the human rights of individuals within its jurisdiction, either through acts or omissions, must, in theory at least, also rectify the situation.36

Johannes Morsink, 1984: 311 and 334.
34 For details on additional themes associated with human rights such as theories of rights (choice and benefit, etc.); the justification of human rights; and universality v. cultural relativism please refer to the relevant sections of Jack Donnelly (1989), Michael Freeman (2002) and Peter Jones (1994).
35 To elaborate: “International relations is structured around the principle of sovereignty, which grants a state exclusive jurisdiction over its own territory and resources, including its population. Sovereignty in turn implies non-intervention in the internal affairs of other states.” Jack Donnelly, 1996: 232.
36 A State’s involvement within the jurisdiction of another State can also be called into question. The International Court of Justice (ICJ), in its examination of the United States’ activities during Nicaragua’s civil war, concluded – in relation to international humanitarian law – “…violations of common article 3, including torture and ill-treatment, are violations of the ‘laws and customs of war’ relevant to any armed conflict.” Nigel Rodley, 1999. P. 72 (in reference to Military and Paramilitary Activities in and against Nicaragua (Nicaragua v. United States of America), Merits, ICJ Rep. (1986), para. 218).



The Individual/Group Schism

The current concept of human rights, as incorporated by national, regional, and international human rights mechanisms, places an emphasis on the individual. Theoretically, as argued by authors such as Jack Donnelly, these rights exist outside the reality of the modern State. Human rights are not given to human beings by the State. Individuals, by the mere fact that they are human beings, already exist with certain rights. The process that entrenches human rights into law is separate. Yet, the individual still has a place and a role within larger frameworks such as society or the State. Individual human rights and notions such as the group and the State do not always co-exist without conflict. For instance, a group of immigrants living in a Canadian city attempted to change their district’s by-laws to install permanent separate swimming areas for men and women at the municipal pool. In this case, the group, other individual bathers who did not support the need for separate swimming areas at the public pool, and the local council had to negotiate according to their respective needs, rights, and constraints. The process was protracted and complex. In the end, the request from the immigrant group was not granted since, according to the council, it did not match the overall culture of the local community and the rights of other individual swimmers. This example demonstrates how at times, and depending on the human rights violation under scrutiny, the individual has to navigate a narrower and more precarious path. Within torture and rape as torture, the individual victim has a distinctive and well-established place. Within genocide and rape as genocide, the place for the individual is less defined and more complicated due to the nature of a crime rooted in the protection of human groups. The individual has to operate within the reality of the group. The degree of this accommodation would vary from case to case.

Rape, Torture, and the Political

Returning to the crime of rape, whether occurring under national jurisdictions or international mechanisms, rape is committed against the individual. National courts and international entities address the individual victim.37 Similarly, the international crime of torture is a violation against the individual. When rape is committed in situations that could fall within the criteria of torture, it is also committed against an individual. For example, if a prisoner is raped by a prison official, and the rape occurred in an attempt to elicit information from this individual, the act could be deemed ‘rape as a form of torture’. The act of rape as torture would be a violation of that individual’s human rights. In this case, the emphasis is on the individual and on his/her individual human rights. Therefore, and following the framework within the current concept of human rights and as applied under international law, the individual as a victim of rape (be it as a form of torture or as genocide) should be the focus of the theoretical analysis.

37 Although rape as a criminal offence is formulated under a trial system with the perpetrator and the State as participants, I wish to emphasise for the purpose of this chapter that rape is a crime (or, under international law and within the rubric of human rights, a violation) committed against the individual victim. I am grateful to Sam Ashenden for insisting on this clarification.

Daniela De Vito


An additional and critical layer of complexity emerges when rape is considered as a form of torture. Individual victims of torture, recognised under international law, tend to have been politically active or have political connections. Both men and women suffer rape under conditions covered by human rights and humanitarian law. Yet women are more likely than men to be subjected to rape, both in situations of armed conflict and in situations outside of armed conflict. In a study (1990) comparing 56 politically active men and women who had been tortured in Latin America, it was found that 12 of the 28 women and 5 of the 28 men had been raped while in custody. (F. Allodi & S. Stiasny, 1990: 144-146) The results confirmed the hypothesis that for women “…torture was more frequently sexual (in nature), and its consequences more often affected the women’s sexual adaptation.”38 Furthermore, for acts falling short of rape, the report states “…men…were, as often as women, subjected to sexual humiliation and abuse.” (Ibid: 147) The report also added that both men and women who were politically active suffered higher levels of ‘persecution and torture’. (Id) A later study conducted by TAT [non-governmental association against torture] of men (47) and women (17) from the Basque region of Spain who had been arrested during 1992-1993 found “…significantly more women than men being subject to such abuse (‘sexually tortured’) (94% vs 66%).” (The Lancet, 1993: 1307) The two studies just quoted examine the numbers of men and women who have been raped and focus on those who had been politically active in their respective countries. Does this emphasis on political involvement provide an accurate description of all torture victims? Moreover, and critically for this discussion, what is meant by the ‘political’ and can this include a broad notion of the concept? Lastly, is this the best option?
Similar questions will be posed in conjunction with rape as genocide, in light of what constitutes the current concept of human rights – the individual v. the group dynamics. At the United Nations (UN) level, this focus has been addressed with the acknowledgement that in reality the majority of individuals who have been tortured were not politically active or associated with persons who had been political. Specifically, the UN Special Rapporteur on Torture has outlined: “It is true that many of the more high-profile cases of torture that come to international attention concern people involved in political activities of various sorts. Such victims of torture may well be of a class or connected with organisations that have international contacts.”39

Beyond such a perception of who is a victim of torture, the Special Rapporteur on Torture instead asserts: “…the overwhelming majority of those subjected to torture and ill-treatment are ordinary common criminals from the lowest strata of society.”40 Even in terms of considering rape as a form of torture, proponents have also at times presented their case with an emphasis on the political. For instance, Evelyn Mary Aswad has argued that rape may be considered as a form of torture if for example “…government officials rape for political

38 “Women reported…much higher incidence of sexual dysfunction, namely sexual anxiety and avoidance.” Id: 146-147. Parenthesis added by this author.
39 UN doc. A/55/290, p. 8.
40 It should be noted that these two comments were made in reference to a link between torture and poverty. Id.



purposes…” (Evelyn Mary Aswad, 1996: 1930) More recently, such an association has been presented by the Medical Foundation for the Care of Victims of Torture: “For example, how is raped used politically and what impact does it have when perpetrated in the context of torture…Political beliefs and political identity as a context.” (Michael Peel, 2005: 23)

The Medical Foundation, from their own files (from 2000-2003), examined one hundred cases whereby women had been raped (as torture). Of the 100 cases, 39 women said they were raped due to their direct political involvement and 21 stated they were raped due to a political association via a family member or partner. (Ibid: 45-47) It is interesting to note that the Medical Foundation does not consider the second category (21 cases) as falling into the active ‘political’ camp. (Id: 47) What could be the link between torture and the political? The traditional view behind the use of torture may be an important indicator. For example, according to Nigel Rodley, torture has historically been used as a way of arriving at the ‘truth’. (Nigel Rodley, 1999: 7) In ancient Greece and Rome, torture occurred to extract the truth in relation to serious crimes. In Roman times, torture was also used in cases of treason. (Ibid: 7-8) By the middle ages in Europe, the use of torture had become more formalised within a rationalised approach to getting at the truth via proof. The need for a confession and arriving at what was then known as ‘the queen of proofs’ seems to have resulted in the use of torture within the legal process. Technically, there were strict rules in place as to when torture could be used (for example, as a last resort). However, religious inquisitions in Europe led to the abuse of these so-called safeguards. Despite attempts to end its use during the Enlightenment period, torture did not vanish throughout the world. In Europe its use by the Nazis, especially against individuals suspected of participating in the resistance, is well documented. Even with the Universal Declaration of Human Rights (1948), and subsequent international and regional conventions prohibiting the use of torture, the practice is in use extensively throughout the world. The purposes behind the use of torture tend to be limited in scope. 
Most often, the torturer attempts to extract information from the victim in relation to either his/her activities or the activities of others associated with the victim. Such activities may be political or criminal in nature. The torturer may also want to intimidate the victim or others. The torturer may want a confession. (Id: 8-11) The purposes behind the use of torture, some of which have been identified, do not adequately explain why political activities tend to be associated with individual victims of torture. Therefore, it must be, as the UN Special Rapporteur on Torture and authors such as Nigel Rodley have proposed, that victims of torture with political involvement are more likely to report the violations to relevant human rights mechanisms and/or organisations. The reasons for this include, inter alia, that the victims may be better educated and more aware of existing reporting mechanisms and organisations, or that these individuals may be less likely to remain silent about their experiences. (Id: 10) It may also have been due to the mandate of some of the human rights organisations operating within this field. For instance, in its early days Amnesty International focussed heavily on releasing Prisoners of Conscience and on ensuring that these individuals were not subjected to torture. Imprisoned or ‘disappeared’ for their beliefs and opinions (without advocating or using violence), many of these individuals were politically active or were associated with those who were involved in political, trade union, or


other movements. This may have contributed to a perception that most victims of torture also had a political ‘connection’. There is, however, no clear demarcation as to why the focus of recognised cases of torture has been on the political. This fact does not reflect the nature of torture per se. It also does not have to do with how torture has been defined under international law. Instead, this reality leads to two key analytical questions: what is the political and what are the implications of referring to torture (or rape as a form of torture) within the political activities of the victims or of those associated with the victims? Finally, what is the justification for linking the individual component of the current concept of human rights (in this case torture or rape as a form of torture) with the political? On the surface, it may seem that the best way to address the emphasis placed on the individual within the concept of human rights is to examine in detail what is the individual and what is the individual’s place within the development of the concept of human rights. Although important, there is another layer of analysis utilised that will complement and contribute to the process of understanding the implications of considering rape as a form of torture – a crime committed against the individual. As such, the interest lies in working through the type of individual who has been subjected to torture, including rape as a form of torture. Since reported cases tend to include a political element, the task will be to assess what is meant by the political involvement or association of the individual victim of torture. The focus still remains on the individual, a key component within the concept of human rights, but the approach undertaken will bring into the discussion ‘what makes up this individual’? In the case of torture and rape as a form of torture, this includes examining and understanding ‘the political’.

Rape as Genocide

Recently, the International Criminal Tribunal for Rwanda (ICTR) in the Akayesu Judgement (1998) determined that rape could constitute genocide. According to the Trial Chamber, Tutsi women were raped because of their membership in an ethnic group. This was genocide since these were acts that contributed “…to bring about its (the Tutsi ethnic group) physical destruction in whole or in part.” (1948, Article II c) Genocide, according to the UN Convention on Prevention and Punishment of Genocide (1948), is a crime against a group, be it a national, ethnical, racial, or religious group. A theoretical complexity emerges once rape (which is constructed as a violation against the individual) is subsumed into a crime such as genocide (which is focussed on protecting the group). Subsequently there will be an analysis of the theoretical implications for rape when it attempts to operate both as a construct for the individual and within the confines of a crime developed for the group. What will develop is an understanding that rape can benefit from being subsumed into the crime of genocide and that it must find space for both the individual and the group. Due to how genocide has been presented within international law, this space will be unequal and may change how rape is conceived, placed and treated. With the advent of human rights, what, then, happens to the individual and to the individual in relation to the group? Prior to World War II, the term ‘human rights’ was not specifically referenced within international law. Issues such as the international movement to abolish slavery and to protect minorities – through the defunct League of Nations - and trade union rights did exist during the 19th and early 20th centuries. In 1919 the International Labour



Organisation (ILO) was established by the Treaty of Versailles. The ILO’s work included, for instance, initiating international provisions to protect freedom of association. In 1926, the Slavery Convention came into being. There were also steps in relation to protecting noncombatants during war and constraints on the conduct of war. (Nigel Rodley, 1999: 1-2) Despite this slow evolution, the individual did not possess a prime position. The State was the critical player. International relations and even international law as it was understood depended on State to State relations, needs and interests. This is evident with human rights before the 1940s whereby such matters tended to be considered within the “domaine réservé, or domestic jurisdiction, of states.” (Ibid: 2) To elaborate, during this period “…for most purposes governments could do what they wished with those under their jurisdiction and international law had no opinion on the matter, much less any means to act.” (Id) After World War II, the international community understood that what a State does in its own back garden was indeed the business of others. (Id) It was not enough to be simply concerned with how a State acted in response to another State. Individuals, in relation to the protection of their human rights, needed not only their respective States to comply with norms but also other States had to be involved. How deeply and under which circumstances this involvement should occur was not only a concern during the early days of the UN and the first human rights instruments, but these tensions continue to be problematic today. Yet, and taking into account counteracting opinions and interpretations mentioned throughout this chapter, what emerges over the past sixty years is indeed a remarkable picture. The individual has a set of human rights outside any references of the State. The individual does not receive his/her human rights from the State. 
This set of rights may change over time and the rights may not universally be replicated exactly. Regardless, States have a responsibility to uphold these rights. Within later human rights conventions that include such provisions, which unlike the UDHR are legally binding for States parties, the individual has a right to seek redress for violations of their human rights. The emphasis is for this redress to take place within the State in question, but if this process is not possible or fails then international and regional mechanisms are in place. In a dissenting voice to this overview J. Shand Watson argues: “The traditional position on the status of the individual is thus a practical and efficacious one. First, there must be state consent for the creation of rights because this is the only way to ensure tangible results. Second, only states can be the subjects of rights because only states have the political power to create and enforce them. Since individuals lack such power, they are dependent on representation by states and are thus the objects rather than the subjects of rights.” (J. Shand Watson, 1999: 295)

Although I concur with Watson’s understanding of the reality that States still wield an enormous amount of power, the position of the individual has changed over the centuries. There continue to be variations on the overview mentioned above that understand the individual as primarily existing within the group and within the ‘state’. Watson, in his book, correctly points out the inconsistencies within human rights theory and State practice. However, theoretical developments are important and should be considered as part of a more complete picture. The system is not perfect and individuals are still limited in their capacities. Human rights violations committed by the State through its acts or omissions take place. Redress is not always possible, sometimes it takes several years, or the outcome may not be


beneficial. What must, however, be emphasised is that the relationship between the State and the individual who possesses human rights has transformed. This changed relationship continues even with the difficulties or inconsistencies that exist within the concept of human rights and within the system evolved to uphold these norms. In theory, at the very least, the individual now has a place at the table. This implies that, when rape is considered as a form of torture or as genocide, the degree to which the individual is addressed within the constraints of these varied human rights violations must be factored in. In other words, although the individual has gained a firm place within the human rights discourse, variations still exist due to the specific nature of particular violations such as torture and genocide. In the former, the individual will have a stronger role while in the latter the individual will have to make room for the ‘group’. Rape and rape as a form of torture are violations against the individual. Rape as genocide is a crime against a particular group. As noted, the current concept of human rights emphasises the individual and yet makes room under certain conditions for the group. A similar schism has developed between the individual and society. However, the differences between the individual and the group/society and the current concept of human rights may not always be at odds or in conflict. For instance, it may be necessary to create space for a Tutsi woman who has been raped because of her membership within this ethnic group, to view the act of rape as a violation committed against the Tutsi group and not as a violation against her as an individual.41 This may be the solution if the aim is to bring closer together the schism created by the individual/group dilemma found within the current concept of human rights. How, then, can these differently constructed crimes (rape as a form of torture and rape as genocide) be compatible?
In addition, how does each notion fit into the current concept of human rights? These critical theoretical questions will now be discussed.

RAPE AS A FORM OF TORTURE: ARRIVING AT AN UNDERSTANDING OF THE ‘POLITICAL’

As argued earlier, the concept of human rights, as developed through the United Nations and other components of the human rights regime,42 tends to emphasise the human rights of individuals and human rights violations committed against individuals. Within the confines of international human rights and humanitarian law, rape, torture, and the consideration of rape as a form of torture are violations committed against the individual. With specific reference to rape, this conceptualisation is evident with the International Criminal Tribunal for the Former Yugoslavia (ICTY) case – Foca. According to the Trial Chamber in its Judgement, rape is a ‘serious violation of sexual autonomy’. (Foca case, 2001: 208) As with torture, rape therefore constitutes a violation against the individual. A similar approach can be taken with the consideration of rape as a form of torture.

41 I am grateful to Sam Ashenden for pointing out this possibility.
42 A variety of components make up the human rights regime including inter alia the United Nations, regional bodies such as the European Commission on Human Rights, international and regional instruments such as the Universal Declaration of Human Rights (1948), and non-governmental organisations.


Beyond presenting the individual as a victim/survivor,43 torture and the consideration of rape as a form of torture bring out an additional layer of complexity – an association of the individual with ‘the political’. Even in terms of considering rape as a form of torture, proponents have also at times presented their case with an emphasis on the political. Evelyn Mary Aswad maintains that rape may be considered as a form of torture if for example “government officials rape for political purposes…” (Evelyn Mary Aswad, 1996: 1930) Despite the fact that this is a critical issue, I am not at this point interested in examining ‘why?’ victims/survivors of torture have been perceived as being associated with ‘the political’. Briefly, this formulation has taken place in connection to his/her own activities, his/her association with others such as family members, and/or the ‘purpose’ behind the torture.44 Rather, I wish to unpack the political and determine some of the implications for the individual if the political is emphasised when considering rape as a form of torture. As will be demonstrated, this process involves several components and cannot be settled with a singular approach. That is, to determine what can be understood as the ‘political’ and how this could impact on the individual it is necessary to include an examination of the following two elements. First, what are the possible contexts or circumstances under which the label ‘political’ is attached to an individual victim of torture or rape as torture? For example, is it only an individual who is a member of a trade union or an opposition political party member who can be given the label of political activist?
Second, since the issue of torture will be the underlying theme of this overview of the political and the definition of torture developed earlier will be utilised, it is extremely important to also incorporate the notion of the ‘purpose’ behind an act of torture to determine what the ‘political’ can entail. That is, an individual who is in prison may not be involved with a trade union or may not be a member of an opposition political party but this individual may have been tortured to inter alia intimidate fellow prisoners. Also, since this section focuses on the constituent elements of torture as outlined in the UN Convention (1984), for an act of rape to qualify as a form of torture it must also meet the three-part criteria developed in this definition of torture. That is, rape as a form of torture must have involved severe pain or suffering; must have been committed for a prohibited purpose; and must have been carried out by a public official or by an individual(s) acting in an official capacity.45 Once again, influencing this study is the current concept of human rights with its focus on the individual. As such, individuals who have suffered torture and the reality that an emphasis is placed on survivors/victims who have been politically active or were linked to others who are politically active, along with an examination of what constitutes ‘the political’, will form the basis of this examination. Thus, the purpose is to understand some of the theoretical

43 The distinction between victims and survivors is used in order to denote those who have been tortured and died – either as a result of injuries inflicted during the torture, or in connection with any circumstances surrounding the torture - and those who did not. It is not, however, used to distinguish between levels of suffering.
44 I am grateful to Stanley Cohen, from his comments on an earlier version of this chapter, for identifying a third possible contextual element to the political dimension of torture – the relationship between the offender and victim. However, as I will develop, the focus of the inquiry will be on what constitutes the political and how an emphasis on the political can affect the individual victim of torture.
45 Once again, the ‘official link’ or ‘specific purpose’ may not be a requirement for all definitions of torture, such as with the International Criminal Court (ICC) in its Elements of Crimes.


implications emanating when the label of ‘political’ is attached to individuals who have been tortured or who have been raped under situations that can be considered as torture. In turn, determining whether a broader or a narrower notion of this concept is more useful when examining which ideas, issues, and acts could be considered as political is critical to this discussion. Does the label of ‘political victim’ expand or restrict the category of victim of torture or victim of rape as a form of torture? Controversially, does the label of political make the individual victim more acceptable within the international human rights discourse thereby affecting how the group is perceived?

Working Through what is ‘The Political’?

Any attempt to grasp an essentially contested concept, such as the political, is fraught with interesting challenges. However, this is not to say that these challenges are insurmountable. Much depends on the endeavour(s) and the approach(es) undertaken. In order to arrive at a more solid understanding of the political the use of two avenues is foreseen, which when combined will enable the emergence of a broader or a narrower notion of this concept. On the one hand it is necessary to determine the ‘nuts and bolts’ involved in any understanding of the political. Questions such as ‘what is it?’ and ‘which issues can be placed in the political?’ are some of the considerations involved in this phase. My aim is not to attempt the formulation of an all-encompassing definition, but to present and assess a range of perceptions that can subsequently be used, adapted, or even discarded. At the outset it may seem as if by using this approach I will in fact be moving away from a more solid view of the political. On the contrary, it is by exploring a variety of options that one can appreciate the critical nuances that make up the overall process of working through the political. On the other hand it is also important to gauge ‘where the political is applicable’. This second avenue will involve outlining realms such as the public, the social, and the private since any definition of the political will need a depository – or perhaps even a number of depositories. Alternatively, these avenues can be understood as developing the “nature and boundaries of the political.” (Judith Squires, 1999: I)46 In combination, a broader or a narrower notion of the political can emerge.47
With reference to rape as a form of torture, it will be argued that even if a broader understanding of the political is used - when it is attached to individual victims – the end result will still restrict which individuals are considered to have suffered rape as a form of torture. Key, instead, will be to move beyond insisting on the focus of a correlation between the individual and political activities or associations without fully expanding on its meanings. It is to these tasks that I now turn.

46 Also: Adriano Cavaro in Stephen K. White (Ed.) and Donald J. Moon (Ed.), 2004.
47 Although I shall be outlining each avenue separately and then combining them to see how a broader or a narrower notion of the political can emerge, the entire process will not unfold in a clearly demarcated manner. Instead, this exercise could be understood as each ‘stage’ having the potential to flow into another, much like the image of sharing a certain amount of water between three glasses.



What is the Political?

Introducing the political as a part of the State and its institutions offers an initial albeit limited understanding of the concept. Although several options are available in order to move beyond this juncture, only institutional/instrumentalist understandings of the political will be briefly outlined. However, the focus will be on framing the political as what I will develop into a more expansive perspective.48 Judith Squires writes about recent approaches used to work through definitions of the political. During the 1950s/1960s it was the “positivist perspective (endorsing objectivity)” which prevailed, while during the 1970s/1980s the focus was on “socially constructed and historically variable forces”. (Judith Squires, 1999: 7) During the last decade of the twentieth century, this second approach, also referred to as contextualist, witnessed the development of deconstructionist methodologies. (Ibid: 8) More recently, the positivist and contextualist approaches were joined by a “third perspective” which argues “Definitions of the political…are neither empirically true nor simply reflections of underlying social relations, but rather active means to shape the ‘real’ world.” Despite these varied perspectives Squires sees that an overall split has emerged in political studies between institutional and instrumental standpoints. (Id) Therefore, an institutional understanding of the political means that politics occurs in government institutions, whereas an instrumental understanding asserts that politics is concerned with power and decision-making. It should be noted that although the issue of power and relationships of power are extremely important when attempting any formulation of the political, due to the scope of this chapter the concept will be left to one side. Yet, as Neil Wood succinctly remarks “...every exercise of power usually contains some element of the political.” (Neil Wood, 2002: 27 and Noel O’Sullivan, 2000) What are some of the concerns with this so-called divide? William E. Connolly points out that politics becomes “bifurcated … between principle and instrumentality”. In the former camp, one finds individualists such as rights and justice theorists, and in the latter camp are ‘utilitarians, pragmatists’ who focus on instrumentality over principle. (William E. Connolly, 1983: 8) One result of this bifurcation is that two separate notions emerge in terms of how to distinguish between the political and the rest of society, one based on principle and the other on instrumentalism. Another worry for Connolly is that both understandings emphasise the individual as opposed to groups. (Ibid: 9) What does this mean? It is possible that what is linked to one or the other understanding of the political – such as rights and justice or power and policy - can also be found in its opposite viewpoint. To elaborate, in an attempt to understand the political A. W. Sparkes has developed a ‘model’ which I would argue includes the potential for such overlaps. (A.W. Sparkes, 1994: 61-69) It is for this reason that Sparkes’ model has been used as a starting point for the development of my understanding of the political as a more expansive perspective and for its potential transfer to the issue of rape as a form of torture. Rather than calling this model a definition of the political per se, Sparkes refers to it as “a list of ten political-making characteristics.” (Ibid) For the sake of brevity I shall not replicate the model nor shall I delve into how/why the author arrived at its ten components. However, it should be

48 The inspiration for this term comes from an article by Jean Bethke Elshtain entitled “Displacement of Politics” where she mentions “…boundary shifts in our understanding of ‘the political’…” in Weintraub & Kumar (Eds.), 1997: 169.


noted that at its crux the model is divided between the ‘Quasi-Polis’ and the Politician, which he calls “poles”.49 Using this model as a backdrop the divide between institutionalism and instrumentalism begins to lose its absoluteness. This occurs not because these two understandings of the political are outlined in an unclear manner but, rather, due to the fact that overlap can exist. Although Sparkes does emphasise ‘the polity’, his model of the political does not read in terms of an institutional/instrumental divide. This can be understood with reference to characteristic (vi) “power…” – which straddles the quasi-polis and the politician planes50 – and to characteristic (x) “politics of a collective… law and order…” Taking a momentary step back from what Sparkes is attempting to do with this model, and without presuming that this was his intention, I would suggest that point (vi) “power…” is an instrumental component and point (x) “politics of a collective … law and order…” is an institutional one.51 In other words, the former involves ‘instrumental issues’, whilst the latter involves ‘juridical’ considerations and matters of principle. If both can operate in the consideration of ‘Quasi-polis & its Governance’, then they can also overlap. This is what I mean by the ‘so-called’ divide between institutional and instrumental understandings of the political. Beyond introducing the possibility of overlaps in institutionalism/instrumentalism, Sparkes’ model also brings into play the notion of framing the political as a more expansive perspective. What then does a more expansive perspective of the political entail? Critically, it should not be understood solely as how Sparkes has presented his ideas. That is, there are several additional understandings of the political which attempt to break through supposedly set boundaries.
One such example would be Jean Bethke Elshtain, who recognises “boundary shifts in our understanding of ‘the political’” but holds that a limit needs to be present. (Jean Bethke Elshtain, 1981: 169) In other words, “But minimally, a political perspective requires that some activity called ‘politics’ be differentiated from other activities, relationships, and patterns of action.” (Ibid: 170) Thus, and in simple terms, a more expansive perspective of the political should include the following three basic points.52 First, this perspective should approach formulating both ‘what is the political?’ and ‘where should the political be placed?’ flexibly. For example, it would not be adequate to say that the political only pertains to the State and its institutions. Second, an understanding of the political should accept that due to the topic(s) in hand it operates on an open-ended plane.

49 Sparkes argues “I am quite sure that primarily the concept of the political refers to the polity, so that its applicability to the governance and internal affairs of other groups depends on certain significant resemblances between them and the polity”. With reference to quasi-polis he adds “I shall use the word ‘quasi-polis’ to refer to the polity and other groups.” Sparkes, 1994: 65.
50 Sparkes refers to this area as the “Equator”. Ibid: 65 and 68.
51 Sparkes presents law as “something imposed on those subject to it. But a respect for persons, a commitment to autonomy…requires that, so far as possible, it should not be pure imposition…” Sparkes, 1994: 195.
52 I am not claiming that these points are all encompassing. Neither am I assuming that only these three are relevant. Instead, they have been included as an introduction to the idea that the political can be understood in this manner.


125

Third, although a more expansive perspective of the political is at its essence more open to accepting varied issues and institutions, this does not mean that everything could be placed in the political or that the political can be found everywhere. To consider the act of playing ‘fetch’ with one’s dog - in a dog-friendly park - as political would be contestable. Whereas the act of stealing a loaf of bread, if one were hungry and if one’s government refused to provide programmes which could help alleviate poverty, would be easier to include in an understanding of the political.⁵³ In relation to rape as a form of torture, and its links with the political involvement or association of individual victims, it is necessary to understand where ‘the political’ can be found. Is it only within government institutions; within political activities such as trade union membership; or is the political also found in broader social relationships such as neighbourhood watch schemes, etc.? This last point brings up an interesting consideration – that of politicisation. In other words, when is an issue, act, thought, etc. political? In a sense the answer to this question is twofold. Initially it is necessary to bring into the discussion the possibility that something can indeed be non-political. At a certain level this could be upheld. For instance, one way to understand hedonism is to view it as “the doctrine that only bodily sensations are real, is but the most radical form of a non-political, totally private way of life, the true fulfilment of Epicurus’ … (‘live in hiding and do not care about the world’).” (Hannah Arendt, 1994: 112-113)⁵⁴ However, someone like Sparkes would counter this idea by insisting that “while there are things which are essentially political and things which are not essentially political, there are no things which are essentially non-political.” (A. W. 
Sparkes, 1994: 70) It may seem that in using Sparkes’ quote I have in fact contradicted myself on two counts. One is in relation to the example of playing ‘fetch’ with a dog and the other is with regard to the third point for a more expansive perspective of the political. In the former, my focus was solely on the act of, for instance, throwing a ball for the dog to chase. However, a link could in turn be made to the political process that resulted in the park being converted into a dog-friendly area or to the conditions under which, and the place where, the ball was manufactured. This is how I understand Sparkes’ idea that “there are no things which are essentially non-political.” So, and in this case, an individual throwing a ball for their dog is “not essentially political” but how the dog-friendly park came about is “essentially political”. The latter count is more difficult to resolve. One way would be to emphasise that the understandings of the political that have been presented all include certain limits. The main consideration would therefore be the degrees involved. As linked to a more expansive perspective of the political, Sparkes limits his model to the quasi-polis and to the politician. In contrast, Elshtain goes beyond this and asserts “By the political here I refer to a specific version of what is public: that which is, in principle, held and considered in common and which is, in principle, open to public scrutiny.” (Jean Bethke Elshtain, 1981: 170) So, for Elshtain the political would go beyond the limits of the quasi-polis – as framed by Sparkes – but remain within the public realm. In other words, Elshtain does not uncritically accept the notion that “The personal is political”.⁵⁵ Why should such

53 Although I have used ‘acts’ in these two examples the political can also involve other aspects such as thought.
54 Arendt distinguishes between “ancient and modern varieties” of hedonism, 310.
55 This phrase is associated with feminism and the move to expand where the political could be found. One way to understand this idea is to state that the personal “is legally constructed, culturally defined, and the site of power relations.” Jean L. Cohen in Weintraub & Kumar, 1997: 136.

Daniela De Vito

limitations be included? One concern has to do with the possibility of creating an understanding of the political “that is so wide as to lose its specificity and usefulness.” (A. W. Squires, 1994: 9) Alternatively this can be understood as something which has become “…one of those ‘sponge’ concepts that C. Wright Mills mocked as it sweeps together and liquidates indifferently whatever crosses its path.”⁵⁶ A comparison can, at this point, be made to torture. References to the term have also been used to elucidate situations that are outside formal definitions. One such example would be the following: “Waiting for the exam results was sheer torture!” Despite the idea that a built-in caveat concerning limits has also been included in a more expansive perspective of the political, individual or group perceptions of what should be considered political and in which realm(s) it should be deposited - which may go beyond the above-mentioned limitations - cannot be placed to one side and completely ignored. Therefore, a more expansive perspective of the political can include limits in terms of content and applicability, but these ‘other’ perceptions may in turn be looking in and at times may need to enter into the fold. A secondary, and complementary, way to answer ‘when is an issue political?’ involves an examination of the process by which particular issues or acts gain the mantle of ‘the political’. In other words, “There is, in fact, nothing more political than the constant attempts to exclude certain types of issues from politics.”⁵⁷

Depositing the Political

Determining where to deposit the political should not take place in isolation from the process of working out what makes up the political. Both tasks are intertwined and are crucial in any attempt to understand the political. In this part of the chapter I will examine the realms of the public, private and social, and their connection to the political. Traditionally, political discourse has included a distinction between the public and private realms. Even the ancient Greeks dealt with this split. In contrast to later developments, Aristotle thought that economics and family matters took place in the ‘private’ realm (or oikos – household), and that the “political community” was to be found in the public realm. (Weintraub & Kumar, 1997: 13 and 25)⁵⁸ More recently, however, there have been attempts to work through this ‘dichotomy’. The essays in Public and Private in Thought and Practice offer one such alternative in that they include “not a single paired opposition, but a complex family of them, neither mutually reducible nor wholly unrelated.” (Ibid, 1997: xii)

55 (continued) Advocates of a strict notion of ‘the personal is political’ would insist “the claim… (is)… not that the personal and political are interrelated in important and fascinating ways…but that the personal is political.” Elshtain, 1981: 171. To elaborate, Elshtain does think that considering political issues which were previously “hidden behind a gloss of professed concern for the sanctity of the ‘private’ realm” may be laudable, but she goes on to point out several concerns including the idea that “Nothing ‘personal’ was exempt from political definition…” Elshtain, 1981: 171-172.
56 The concept being examined in this quote is ‘revolution’. However, in this paper the quote has been used in relation to ‘the political’ and to ‘torture’. Dick Howard, 1988: 110.
57 Leftwich and Held quoted in Squires, 1994: 9.
58 For an interesting overview of how ‘economics’/‘private realm’ links were dealt with by “seventeenth-century Natural Law theorists” and subsequent developments please see Gobetti, Daniela in Weintraub & Kumar, 1997: 108-111.

With reference to rape, conceptualising and placing this violation beyond notions that it should remain within the private realm has been and continues to be discussed. Whether rape is committed as a crime or as a human rights violation, a process still exists (despite recent advances in this area) in changing perceptions and legal practices to ensure that it is not left within the private sphere and thus not adequately addressed as a matter for the involvement of State or international mechanisms. Even if rape is subsumed into the international crime of torture, a long established human rights violation often linked to activities or associations that take place within the public realm, there must be an understanding of how rape can operate within the formulation of rape as a form of torture. This is especially critical when the label of ‘political’ is associated with an individual victim of torture or of rape as a form of torture, since this firmly places the activity/association within the public sphere. It is with this spirit in mind that I approach this part of the discussion. Within these realms it has been assumed – and at times challenged - that the public contains the political and the private contains the non-political. However, even this idea is not so straightforward. In other words, depending on which understanding of the political is utilised within these parameters the public realm can also be transformed. For example, Jeff Weintraub outlines two perspectives of the “public as political” where one “means the administrative state” and the other “means a world of discussion, debate, deliberation, collective decision making and action in concert.” (Weintraub & Kumar, 1997: 10-11) In turn, the political theorist Carole Pateman in The Sexual Contract challenges accepted liberal discourse that the private and public realms are so clearly demarcated. 
For Pateman, understanding how each sphere operates and affects both how politics functions and how individuals carry out their private lives is critical. The goal is not to merge the two realms, but to examine their respective effects within each sphere. (Carole Pateman, 1988) Accepting the possibility of multiple publics, or even varied alternatives including ones that aim to articulate what is involved in a single public realm, raises the question of whether this means that the private realm is completely emptied of its contents. Two points are relevant to this inquiry. First, leaving some issues within the private realm may be necessary and a positive aspect of this discourse. As Hannah Arendt outlines, “The distinction between the private and public realms, seen from the viewpoint of privacy rather than of the body politic, equals the distinction between things that should be shown and things that should be hidden.” (Hannah Arendt, 1998: 73)

In a textured analysis of the political Hannah Arendt contributes to the discussion by outlining the social realm.⁵⁹ Arendt begins by considering the Latin roots of ‘social’ (societas) which “originally had a clear, though limited, political meaning; it indicated an alliance between people for a specific purpose, as when men (sic) organise in order to rule others or to commit a crime…. Such an alliance could also be concluded for business purposes.”⁶⁰ Later in history “the term ‘social’ begins to acquire the general meaning of a fundamental human condition.” (Ibid: 23 at note 3)⁶¹ Arendt considers that it is the social realm proper, and not the private/public worlds, which is a more recent development. (Id: 28) This is due to the idea that from ‘families’, a greater entity emerged – that of society which in turn became the ‘nation’. (Id: 29) For Arendt this social realm is a “characterisation of modern civil society.”⁶² Arendt is concerned with the development of this “curiously hybrid realm where private interests assume public significance that we call ‘society’.” (Hannah Arendt, 1998: 35)

Before proceeding to the final section I would like to develop a schema which includes the understandings of the political and the depositories for the political which have been discussed. Schema 1 includes the following realms: Public = P1, Social = S and Private = P2 along the line. These are not plotted in an exact manner but are placed to simply signify which one(s) a particular understanding of the political emphasises. Of course, this does not mean that some viewpoints could not also include elements from the ‘social’. In addition, both endpoints would relate to all three realms. These areas have been presented as extremes and their plausibility in real life would require additional analysis. Moving away from them means re-adjusting how the political could be understood and potentially which realms are applicable.

To summarise this section, if I were urged to express my own viewpoint it would rest within a more expansive perspective of the political – including the limits outlined; it would be closer to the endpoint ‘everything as political’ to signify a broader notion of the political; and it would involve all three realms, in a potentially overlapping manner, as possible depositories for the political. I have not articulated an exact map of which issues or acts could be considered as political, or how each of the three realms could appear. The reason, as noted at the beginning of this section, is that the aim was to work through how the political could be understood and not to develop a particular formulation of this concept. This area on Schema 1 has been marked with an ‘*’.

59 Nicolas Rex outlines what he has termed the “...invention of the social” at the beginning of the 20th Century. Nicolas Rex, 1999: 112.
60 Briefly, for Arendt “the political realm rises directly out of acting together, the ‘sharing of words and deeds’.” As such “The distinction between a private and a public sphere of life corresponds to the household and the political realms…” Arendt, 1998: 28 & 198.

61 Id., 24 (and Supra nn. 29 & 30).
62 Jeff Weintraub offers this assessment and links it to how he uses the term civil society in his essay: “I will use ‘civil society’ to refer to the social world of self-interested individualism, competition, impersonality, and contractual relationships – centred on the market – which as thinkers in the early modern West slowly came to recognise, seemed somehow able to run itself.” In turn, Liberalism is associated with civil society in that “its tendency is to reduce society to civil society.” Weintraub & Kumar, 1997: 13 & 35.

Schema 1

Public = P1, Social = S and Private = P2

Associating the Political with Rape as a Form of Torture

In the overview of this section I pointed out two elements that could be examined in connection to the political and its association with torture. One was rooted in the contexts or circumstances under which individuals had been subjected to torture, and the other was linked with the purpose for the torture.⁶³ In this section both aspects will be applicable to the consideration of rape as a form of torture, and some of the possible implications if this association is emphasised will be examined. Even if rape is considered as a form of torture, can its conceptualisation be problematic? In short, the answer is yes for several reasons. That is, at times, when proponents for the consideration of rape as a form of torture present their arguments one criterion discussed involves the existence of a political component to the act. For example, Evelyn Mary Aswad argues that if “government officials rape for political purposes, those rapes should be classified as torture if they inflict severe pain.” (Evelyn Mary Aswad, 1996: 1930) In this instance the link to the political falls within one of the three components of the definition of torture, namely the purpose behind an action.

63 A connection to political actions could occur if an individual is related to or associated with someone who was politically active. That is, the victim/survivor of torture would not necessarily have had to participate in a political activity, etc. for this association between the political and torture to exist. In addition: “Rape, particularly when used as a political weapon, meets the definitional criteria for torture under international law.” Deborah Anker, 1995: 784. “At the same time, the political motivations of the perpetrator or torturer in selecting rape as a method of abuse are largely ignored.” Deborah Blatt, 1994: 846. Also: “Politically and socially active women are particularly vulnerable to rape under these circumstances.” and in terms of refugee women “Public officials with authority over a woman’s fate, such as border guards, soldiers, and refugee camp administrators, extort sexual favours in exchange for food, services, or a positive determination of the woman’s refugee status…Sexual extortion of refugee women constitutes a method of torture under Article 1 of the torture convention.” It should be noted, however, that in the example given concerning refugee women, sexual extortion is not mentioned as rape but as a distinct form of torture. Ibid, 859 & 862.

Two points must be made about this emphasis on the political. First, it could be argued that the political should be broadly interpreted. Or, to borrow from Schema 1, a more expansive notion of the political should be applied. Certain components of the human rights regime have tried to work through an understanding of the political.⁶⁴ In 1997 an Expert Group Meeting under the auspices of the UN Division for the Advancement of Women (DAW) took place. The overall topic of the meeting related to “Gender-based persecution”, but in a section pertaining to the UN’s 1951 Convention Relating to the Status of Refugees (hereinafter the Refugee Convention) the meeting addressed political opinion:

“The meeting agreed that feminism constitutes ‘political opinion’ for the purpose of defining who is a refugee in the context of the Refugee Convention. It considered that political opinion or imputed political opinion may be determined from behaviour as well as expressed opinion. It noted that behaviour by a woman that does not conform to cultural or social norms with respect to gender roles may be construed as political opinion with respect to gender roles or the political opinion, feminism. The meeting also pointed out that peaceful activities by women in the course of armed conflict may be construed by opposing groups to impute political opinion. Political opinion may also be imputed to women as a result of the political opinion of male family members.”⁶⁵

The key aspects of this meeting’s understanding of the concept – albeit in terms of the Refugee Convention - are that it could involve several layers beyond simply the act of expressing one’s political opinion, and that the political can also be linked to cultural, social, or gender factors. Another attempt to understand the political is in relation to the Rome Statute of the International Criminal Court (1998) which establishes the permanent International Criminal Court (ICC). The ICC has jurisdiction over “the most serious crimes of concern to the international community as a whole” and the list includes genocide, crimes against humanity, war crimes and aggression.⁶⁶ Commenting on Article 7 of the Rome Statute (crimes against humanity), and in particular how to understand one of the ‘grounds’ for persecution listed – persecution on political grounds, M. Boot & C. Hall speculate that it could be related to the “…‘State or its government, or public affairs generally and its institutions’…” but also that it “might not be limited to grounds that concern membership of a particular political party or adherence to a particular ideology.” (M. Boot & C. Hall in Triffterer (ed.), 1999: 148) In turn, “the word ‘political’ might be understood as including public affairs issues such as environment and health; a political ground for persecution would then cover at least the existence of a difference of opinion concerning these issues as a reason for committing the acts concerned.” (Ibid) The authors go on to add, “Remarkably, social

64 Attention to the following example came from a section in: Triffterer, Otto (ed.), 1999: 148-149.
65 UN doc. EGM/GBP/1997/Report, 10.
66 The crime of aggression still needs to be worked out.

grounds have not been expressly included” in the Rome Statute. (Id: 150) In terms of rape as a form of torture, it could be argued that the political should be broadly interpreted. Therefore, a more expansive notion of the political would not only include the scenario whereby a government official subjects an individual to torture in order to gain information about a political opposition party, but it could also include the situation whereby a woman is tortured because she works for an organisation that is demanding better social services for women from the government. Even if a more expansive notion of the political is accepted a secondary point reveals itself. Emphasising the political – broadly or narrowly – limits which particular acts of rape can be considered as a form of torture. Furthermore, I would argue that emphasising the political, as for instance part of a ‘purpose’, has the effect of constraining the definition of torture. If a prison guard rapes a woman who has been convicted of stealing money from her employer, would this not be considered a form of torture? Alternatively, Article 1 of the Torture Convention (1984) includes the following: “…for such purposes as obtaining from him or a third person information or a confession, punishing him for an act he or a third person has committed or is suspected of having committed, or intimidating or coercing him or a third person, or for any reason based on discrimination of any kind.” (Nigel Rodley, 1999: 391)

This understanding of what constitutes a ‘purpose’ is quite broad. By emphasising the political even this can become lost. If a woman has been raped not while being interrogated to gain information about a political organisation or because of her involvement with a social organisation, but because the prison guard uses this method in order to keep the female prisoners in line, would this also not constitute torture? The answer to both questions is yes, if all three elements of torture are present. So, if a prison guard rapes the woman who stole money from her employer in order to intimidate her and her fellow inmates, and she suffered severe pain (physical or mental) then she would have been subjected to torture. I am not asserting that authors such as Anker, Aswad and Blatt are deliberately excluding such possibilities, and these may not be applicable in terms of rape that occurs during situations of armed conflict. However, by emphasising the political - either in relation to the victim/survivor or to the purpose for torture - the net result is a restraint in terms of which acts of rape should be considered as a form of torture. Having said that, the political must not be completely disregarded either. As such, a more appropriate understanding can be found with the UN Special Rapporteur on Violence against Women: “Rape committed as a part of political oppression is prohibited under international law as torture or cruel, inhuman or degrading treatment” and “Numerous international authorities have also recognised rape as a form of torture when rape is used to punish, coerce or intimidate, and is performed by State agents or with their acquiescence.”⁶⁷

67 UN doc. E/CN.4/1999/68/Add.4, 6.

Key is to understand the implications of associating individuals who have been raped within the context of torture with the label of ‘political’. It is not adequate to merely state that such individuals have been politically active or have had a political association. One must also develop what is meant by this concept. Furthermore, the reality that most victims/survivors of torture have not been politically active or had a political link must be emphasised. Otherwise, and I return to the controversial questions posed in the introduction to this section, the net effect of this pattern may be that such a political connection does indeed make the victim/survivor of rape as a form of torture more acceptable.

The individual has a special place within the human rights discourse. Most of the human rights violations listed in a multiplicity of regional and international human rights documents relate to the individual and to his/her rights. The right not to be subjected to torture as well as rape as a form of torture are both formulated for and addressed to individuals. Furthermore, and as noted from the ICTY Foca case, rape is a violation against one’s sexual autonomy. I did not delve into what makes up an individual in relation to the current concept of human rights or what it means to violate the human rights of an individual. Rather, I was interested in addressing the complex fact that many individuals who have been tortured or raped under circumstances that can be considered as torture have also been constrained with the additional layer of political involvement or association. I was also concerned about individuals who had not been deemed to have this political label. As such, I have explored how the political as an essentially contested concept can be understood and where it can be deposited. A less restrictive approach was promoted which resulted in a more expansive perspective of the political. This in turn meant that a broader notion of the political could emerge. Regardless, I argued that emphasising the political when considering rape as a form of torture is too restrictive. The political would not be swept away. 
Yet, it is also critical to recognise that the overall realities in which rape as a form of torture occurs are multidimensional. In turn, these factors must be considered even while acknowledging contexts or circumstances whereby the individual victim does indeed have a political affiliation or whereby the purpose behind the act of torture or rape as torture is political. Thus, rape as torture fits more comfortably within the current concept of human rights with its emphasis on the individual. Rape and torture respectively are acts committed against the individual. In addition, this section has demonstrated that torture via its frequent but incomplete perception affects the individual who is placed within and confined by the parameters of the political – even if it is a broader understanding of the political. Rape as torture is also committed against the individual and in turn must negotiate these complex realities. It was therefore necessary to elaborate on this tripartite relationship of the individual influenced by the current concept of human rights, of rape as torture, and of the political in order to appreciate what can occur to rape when it is placed within such constructs. The next step is to outline the theoretical implications for the group when rape is subsumed within the international crime of genocide and to determine what happens to the individual.

RAPE AS GENOCIDE: THEORETICAL IMPLICATIONS FOR THE GROUP

This section examines the second theoretical implication to emerge from the international cases mentioned earlier. Specifically, the possible implications of considering rape – a crime committed against the individual – as genocide, which has been constructed as a crime against the group, will unfold.⁶⁸ From this analysis, two interesting questions emerge:

a) Does compatibility exist when rape is subsumed within the category of a group violation (genocide), if rape is constructed as a violation of an individual’s sexual autonomy?⁶⁹
b) Is the link between rape and genocide necessarily problematic within this notion of autonomy for the individual victim of rape?

Once again, this process unfolds within an analytical political theory approach that touches upon the current concept of human rights with its focus on the individual. Although the key element is the individual, the concept of human rights also creates a space, albeit limited and at times controversial, for the group. The analysis brings out some of the theoretical implications when rape is subsumed within the international crime of genocide, which is defined as a violation committed against particular groups. Rape as genocide is a recent occurrence within international law. As Catherine A. MacKinnon observes, “Each time a rape law is created or applied, or a rape case is tried, communities rethink what rape is.” (Catherine A. MacKinnon, 2006: 941) Rape as a crime or as a human rights violation is conceptualised and treated as an act committed against the individual.⁷⁰ In contrast, genocide under the Convention on the Prevention and Punishment of the Crime of Genocide (1948) includes a series of acts “…committed with intent to destroy, in whole or in part, a national, ethnical, racial or religious group, as such…” (P. R. Ghandhi, 2000: 19) In other words, genocide has been conceived, placed and treated as a denial of the right to life of certain human groups. In 1998, the Trial Chamber for the International Criminal Tribunal for Rwanda (ICTR) delivered an innovative judgement in the case of the Prosecutor v. Jean-Paul Akayesu. (Akayesu case, 1998: 165-166) Jean-Paul Akayesu was a local official (bourgmestre) when the genocide against the Tutsi group in Rwanda began. Although he did not commit the particular acts of genocide - such as the rapes and killings – himself, he was found guilty of genocide due to the authority he wielded in the region. The Akayesu case was the first time that an international criminal body considered the possibility of rape as genocide. 
In its Judgement, the Trial Chamber argued that women were raped because they were members of the Tutsi ethnic group. Because genocide had been deemed by the Trial Chamber to have occurred in Rwanda during 1994, rape in relation to this case constituted genocide. Individual victims of rape testified within the context of the trial against Jean-Paul Akayesu. It was these individuals who were raped, and they were in turn members of the Tutsi ethnic group.

68 Since this chapter is firmly grounded within recent international human rights and humanitarian law developments in relation to rape, the author acknowledges earlier definitions and a cultural perspective found in national definitions of rape, but has not addressed these in an expansive manner. For more on these issues, such as the emphasis on either consent or coercion elements please refer to: Catherine A. MacKinnon, 2006: 940.
69 The Trial Chamber from the International Criminal Tribunal for the former Yugoslavia (ICTY) argued that rape constitutes a violation of an individual’s sexual autonomy. More on this subsequently. International Criminal Tribunal for the Former Yugoslavia (ICTY), Prosecutor v. Dragoljub Kunarac (et al), (IT-96-23-T & IT-96-23/1-T, Judgement, 22 February 2001), 208.
70 As noted, when brought to trial at the national or international level, it is the Prosecutor who faces the defendant in a rape case, but it is also the individual victim’s story that is considered. Furthermore, and if appropriate, it is the individual victim who testifies before the court.

One possible way to approach the questions posed earlier is to consider that in some situations it may be more beneficial to subsume rape within the international crime of genocide. Genocide is often characterised as the most heinous of all human rights violations. Its long history⁷¹ (pre-1940s and during more recent events such as in Rwanda), its devastating impact on groups and societies, etc. contribute to this conclusion. By recognising rape as part of genocide, it could be argued that the result is to elevate rape above other international crimes and human rights violations. Since rape is absent in much of international human rights law or, as noted, is distorted within international humanitarian law, this approach may be helpful as one option to counter these imbalances. On the other hand, some women who have been raped during genocidal events may deem for themselves that an association between rape and genocide is of greater consequence than to solely focus on the individual component of rape. It may be that the need to ensure a record of this association, for instance that Tutsi women were raped because they were part of the Tutsi ethnic group, is more important than to keep the violations within the parameters of an act solely committed against the individual. In addition, the issue of harm as associated with rape can affect this discourse. An article so embedded in theoretical discussions cannot forget these components. Furthermore, emphasising the importance of placing rape within the crime of genocide has been criticised since the effect may be to lessen the importance of other types of rape. 
As Rhonda Copelon argues, “By treating genocidal rape differently, one is in effect saying that all these terrible abuses of women can go forward without comparable sanction.” (Rhonda Copelon, 1995: 67) Although this caveat is an important consideration, depending on the circumstances, it may also be argued that it is crucial for rape to be considered as genocide for the victims and/or to more precisely reflect the context of a particular genocide. If an area of accommodation, which includes both the individual and the group, can be created within the crime of genocide, then rape as genocide can operate both as a violation against the group and as a violation against the individual. This area of accommodation can never involve an equal share of the space, for genocide will always need to occupy the majority of the surface area since its key concern is the survival of human groups. When rape is subsumed into genocide, which is conceived, placed and treated as a crime against enumerated groups, its dynamic changes. Rape is no longer just a violation against the individual. Rape becomes part of a notion developed to protect the group. Once again, this process relates to what can occur on a theoretical level when rape is subsumed within the international crime of genocide. Therefore, the determination of this chapter from an analytical political theory perspective will be that there is still a place for the individual victim of genocide or the individual victim of rape as genocide. However, as with the current concept of human rights, this space is unequal and not always comfortable. Crucially, even with innovative jurisprudence such as the ICTR case (Jean-Paul Akayesu, 1998) and literature on this issue, there is a need to assess this complex relationship between rape which affects the individual and rape as genocide which is placed within the group dynamic.

71 For instance, the opening sections of the Convention on the Prevention and Punishment of the Crime of Genocide (1948) read “Recognising that at all periods of history genocide has inflicted great losses on humanity; and Being convinced that, in order to liberate mankind from such an odious scourge, international co-operation is required:” P. R. Ghandhi, 2000: 19.



The Group and the Contention of Group Rights

Which areas of international law, and even general theoretical traditions, have influenced the deliberation and the creation of a definition of the crime of genocide? Raphael Lemkin focussed on the life of the group and in particular on national groups. According to Lemkin, genocide could be understood as “…the criminal intent to destroy or to cripple permanently a human group. The acts are directed against groups, as such, and individuals are selected for destruction only because they belong to these groups.” (Raphael Lemkin, 1947: 146)

With this quote, one begins to see that both the group and the individual function within the concept of genocide. However, individuals are targeted due to their membership within a particular group. The implications of this for rape as genocide will be explored later. In 1946, the newly formed UN General Assembly passed resolution (96-I) which stated: “Genocide is a denial of the right of existence of entire human groups, as homicide is the denial of the right to live of individual human beings…”72

Within the United Nations three types of law were interwoven into a concept of genocide – international criminal law (for individual criminal responsibility), human rights law, and humanitarian law. (William A. Schabas, 2000: 5) From international human rights law, a critical connection emerges. The right to life, outlined in the Universal Declaration of Human Rights (1948) and in the International Covenant on Civil and Political Rights (1966), is a human right accorded to individuals. The right to life is not an absolute right, since in times of war and under certain circumstances this right may be breached. There is also capital punishment, which technically is not prohibited under international human rights law, but whose eventual cessation is encouraged in human rights instruments. Although the right to life is imprinted within the Genocide Convention (1948), it is the right to life of ‘human groups’ that is in fact protected. In particular, it is the right of these human groups to exist (the right to existence) that should be protected. (Ibid: 6) Furthermore, the prohibition against genocide is pivotal since it is a crime “…directed against the entire international community rather than the individual.” However, genocide has also been described by William A. Schabas as “…a violent crime against the person.” (Id) It is this two-pronged interplay, a violation against the group and a violation against the individual which makes genocide and rape as genocide, such textured concepts. What then is a ‘group’? In simple terms, ‘groups consist of individuals’. (Id: 106) At the UN level, the term ‘group’ or ‘groups’ is noted in several instruments. 
For instance, the Universal Declaration of Human Rights (UDHR, 1948) describes the family as a ‘fundamental group unit of society’ and states that education will “…promote understanding, tolerance and friendship among all nations, racial or religious groups.”73 The UDHR in Article 30 speaks of ‘any State, group or person’, which means that a group consists of more than one individual. (William A. Schabas, 2000: 106) Other instruments, such as the

72 UN General Assembly Resolution, 96-I, 11 December 1946.
73 P.R. Ghandhi, 2000: especially Articles 16(3), 26(2) and Article 30, 21-25.


International Covenant on Civil and Political Rights (ICCPR, 1966) and the International Convention on the Elimination of All Forms of Racial Discrimination (ICERD, 1966), speak of ‘peoples’ having the right to self-determination and of ‘racial or ethnic groups’ respectively.74 In the latter Convention (1966), Article 14 addresses the right of petition for individuals or for groups of individuals who have suffered racial discrimination. A more formal understanding, within the framework of international law, has been proposed by Natan Lerner: “In international law, the notion of group requires the presence of those already mentioned unifying, spontaneous (as opposed to artificial or planned) and permanent factors that are, as a rule, beyond the control of the members of the group.” (William A. Schabas, 2000: 107 & Natan Lerner, 2003: 84)

Critically, what emerges from this proposal is that groups consist of individuals and that groups protected under international law involve a degree of permanence such as race or ethnicity. It may be more difficult to place religious groups within Lerner’s understanding of a ‘group’, since some may argue that religious beliefs can change. The Genocide Convention (1948), including the reference to religious groups, was framed with the notion of focussing on the ‘permanence’ of groups, thereby excluding other groups.75 However, Lerner’s wording does allow for some flexibility because he includes the words ‘permanent factors that are, as a rule, beyond the control of members.’ Furthermore, and specifically with reference to minority rights “…the right extends to ‘persons belonging to such minorities,’ and not the minority as a group.” (Bill Bowring, 1999: 3-4) The individual members of a minority group are key. How can this understanding of individuals, being part of a minority group, relate to genocide? The groups outlined in the Genocide Convention (1948) – national, ethnical, racial or religious – are not framed solely within a minority standing. That is, these groups can be in the minority or these groups can make up the majority population in a State, or they can lack power within the State. There are no restrictions within the Convention. Genocide is an international crime aimed at national, ethnical, racial or religious groups. An additional component is attached since individuals, as members of said groups, are the particular victims of genocide. With reference to when rape is considered as genocide, individuals are victims of this crime due to their membership in one or more of the enumerated groups. This may contradict the UN’s vision with reference to this crime. Specifically, in its 1946 Resolution,76 a distinction was made between the right to life of human groups and of individuals.
Leo Kuper’s work on this subject is characteristic of a contrasting view to this approach. Kuper argued that genocide “…is a crime against a collectivity, it implies an identifiable group as victim.” (Leo Kuper, 1981: 53) However, as this section will argue, there must be the possibility within any understanding of genocide to include an examination of not only what occurs to the group as a whole but also to individual victims of genocide within the

74 P.R. Ghandhi, 2000: Article 1(1), 64; Article 1(4), 56.
75 There was and continues to be concern with the Genocide Convention’s limited enumeration of groups that can be targeted. The exclusion of ‘political’ groups is one such example. There have also been calls to consider the category of ‘female’ as a group that can suffer genocide. For more on these issues please refer to: Lisa Sharlach, 2000: 91-92.
76 Supra note 72.



group. This overall conclusion may, or may not, seem to follow the ICTR Judgement in the Jean-Paul Akayesu case. Within the Akayesu Judgement, genocide was understood to involve an act (taken from the list of five which are enumerated in the Genocide Convention 1948) that is committed “…with the specific intent to destroy, in whole or in part, a particular group targeted as such.” (Akayesu case, 1998: 731) It is necessary to briefly re-visit the dichotomy between individual human rights and the suggestion of group rights. For if rape as genocide is conceptualised as a violation against an individual who is part of a group, and not as a violation exclusively committed against the group as a whole and without considering the individual, then the implications of formulating this crime within the accepted understanding of the current concept of human rights must be assessed. This will involve a brief re-visit of the current concept of human rights, with its emphasis on the individual and its acknowledgement of the ‘group’, and an introduction to the debate of whether human rights are applicable to groups as a whole and not just for individual members of a group. It was not until the Second World War and with the rise of Nazism that the current concept of human rights emerged. After the end of World War II, the newly formed United Nations (UN) set about articulating the idea of human rights. This process can be found inter alia in the UN Charter (1945) and in the Universal Declaration of Human Rights (UDHR, 1948). The current concept of human rights addresses the rights and freedoms of the individual. However, this individual can to varying degrees also have a place, a role, and duties and receive benefits within his or her respective community. As Jack Donnelly sets out in Universal Human Rights in Theory and Practice, theoretically human rights exist outside the reality of the modern State. Human rights are not given to human beings by the State. 
Individuals, by the mere fact that they are human beings, already exist with certain rights. The process that entrenches these rights into law is separate. (Jack Donnelly, 1996: 12) Yet, the individual still has a place and a role within larger frameworks such as society or the State. Furthermore, the current concept of human rights does acknowledge the ‘group’ under certain circumstances. There is Article 16(1) of the UDHR (1948) which mentions ‘family’ and in the Preamble to the ICCPR (1966) ‘peoples’ are said to have the right to self-determination. (Michael Freeman, 2002: 75) Critically, at the UN level, and for instruments that have such mechanisms it is the individual who petitions the relevant bodies such as the Committee against Torture. Furthermore, human rights violations are determined with the individual in mind. (Koen de Feyter in De Feyter and Pavlakos, 2008: 22) This leads to an important debate within the human rights discourse. Does a group have human rights? Or, as Jack Donnelly and others insist, is it only individuals as members of particular groups that bring human rights to the table? Koen De Feyter mentions the construction of some human rights instruments that address categories of human beings such as women or children but in turn he acknowledges the difficulty in recognising collective rights per se. (Ibid: 19) Returning to genocide and to rape as genocide, the implications of considering these acts as either violations committed against the group or against the individual as part of the group are critical. As stated earlier, if an individual is raped within the context of genocide it may be the case that placing this act within a crime carried out against the group is more important for the victim than if it were to remain as a crime against the individual. In turn, the


implications of this approach may be to remove the ‘individuality’ of rape and of rape as genocide.77 This section will address minority and group rights to bring out a clearer understanding of the challenges that still exist within the current concept of human rights between the individual and the group. The purpose will be to understand how the individual and the individual as part of a group are able to function, and to determine if compatibility between the individual and the group exists within differently constructed violations such as rape and rape as genocide.78 Finally, although a distinction exists between group rights (such as the right of ‘peoples’ to self-determination) and minority rights, the emphasis is to clarify the notion of the individual as opposed to the group within the framework of ‘rights’. With reference to the right of ‘peoples’ to self-determination, there is continued debate as to its exact place within international law in general. The only established parameter for this right has been in relation to peoples extracting themselves from situations of colonialism. Once colonialism has been removed, then this right has been interpreted to mean States. Recent and ongoing efforts, with reference to the rights of indigenous peoples, seem to be moving more in the direction of groups of ‘peoples’ having rights. Yet, this is far from being a settled and accepted approach.79 (Nigel Rodley, 1999: 61) However, as Jack Donnelly notes: “What is one to conclude? Is there an internationally recognised collective human right of peoples to self-determination? It is hard to say. Maybe. Sort of. Not exactly. Therefore, I propose to set this right aside. But even if there is such a right, the fact that there is (at most) precisely one internationally recognised human right of peoples can be seen largely as confirmation of the rule that human rights are the rights of individuals.” (Jack Donnelly, 2003: 132-133)80

International law and theory in general have had a difficult time in accepting that human rights could be applicable to groups. Another term for this concept is ‘collective rights’.81 (Bill Bowring, 1999: xiii) One concern, if group rights are subsumed into the

77 The inspiration for the term ‘individuality’, in conjunction with genocide, comes from Leo Kuper’s phrase “As a crime against a collectivity, it (genocide) sets aside the whole question of individual responsibility; it is a denial of individuality.” Leo Kuper, 1981: 86.
78 The aim of this section is not to outline or to resolve all the various positions within the human rights discourse related to ‘who is the bearer of rights – the individual; the individual as a member of a group; or the group as a whole?’ As such, the position used will remain within the notion that human rights are individual rights but that the group based on factors such as race, ethnicity, gender, etc. also has an important role.
79 The recently passed (2008) UN Declaration on the Rights of Indigenous Peoples (1994/95) speaks of ‘indigenous peoples’ in its articles. It is, however, a declaration and the document remained in a draft for over a decade due to its contentious content.
80 As Jack Donnelly, in reference to minority rights, observes: “I am not, let me repeat, challenging the idea of minority rights as they are already established in the major international human rights instruments (i.e., as individual rights that provide special protections to members of minority groups).” Jack Donnelly in Gene M. Lyons & James Mayall (eds.), 2003: 37. Danilo Turk in De Feyter and Pavlakos, 2008: p. 4.
81 Gerrit-Bartus Dielissen, within the context of his article, offers a narrower definition of group rights to mean the rights of any minorities. Gerrit-Bartus Dielissen in De Feyter and Pavlakos, 2008: 41.



category of human rights, is that all the initiatives and accomplishments since the Enlightenment period, which led to the current concept of human rights with a prime place given to the role and placement of the individual, would be negated. As Jack Donnelly argues: “I must insist that there are collective rights, just not collective human rights, and that a number of human rights are exercised by individuals as members of a collective group.” (Jack Donnelly, 1993: 149-150)

For Donnelly, to treat collective rights as another branch of human rights would significantly alter the concept of human rights, which developed to protect individuals. (Ibid: 145) This viewpoint, whereby individuals as members of groups can exercise human rights, tends to reflect the current approach within international law. (Neus Torbisco Casals, 2006: 44 and David Ingram, 2000: 242) However, discourse has taken place whereby the rights of groups are considered an essential component or addition to the concept of human rights. As Lyons and Mayall predict, “The question is whether the existing regime can be expanded to include group rights or whether a new set of obligations need to be added.” (Lyons & Mayall, 2003: 6) One approach is to develop group rights as a branch of human rights. Another standpoint is to leave human rights with its focus on the individual as rights bearer. (Neus Torbisco Casals, 2006: 37) In turn, there may even be a possibility to create a new category of group rights that are separate to but influenced by the current human rights regime.82 Koen De Feyter has written about this issue in terms of the ‘collective rights of groups’. In addition, these ‘rights’ would not be about the duties of the State but would in fact challenge the authority of the State. As correctly noted, when framed in such a manner, the collective rights of groups are extremely contentious and have not found much support within the established international community. (Koen De Feyter in De Feyter and Pavlakos, 2008) These are but a few examples of how the discourse surrounding the issue of group rights has emerged. As noted, the aim is not to outline in a detailed manner the varied approaches, nor is its purpose to reconcile group rights with the concept of human rights.
Rather, it should be emphasised that this section and its theoretical analysis of rape as genocide is situated within the realities of current international law whereby human rights norms and regimes tend to emphasise the individual but also acknowledge possible complex notions vis-à-vis the ‘group’.83

82 This possibility has been articulated in relation to certain minority groups. Jennifer Jackson-Preece in Lyons & Mayall (eds.), 2003: 68. In general, the line of inquiry can be understood as attempts to “…distinguish between, on the one hand, rights that depend on an individual’s belonging to a group or community, and, on the other, individual rights common to all human beings.” Neus Torbisco Casals, 2006: 57.
83 As Hurst Hannum, with reference to indigenous rights, succinctly expresses: “This chapter does not attempt to resolve the philosophical clashes over individual versus collective rights or majority versus minority rights.” Hurst Hannum in Lyons & Mayall (eds.), 2003: 94. Furthermore, and as a caution against placing women as a group within human rights, Eva Brems argues “In years to come, the collective dimension of women’s rights may come to the forefront, for example, as a framework for addressing the root causes of gender discrimination. However, from an inclusive perspective, this would be a step in the wrong direction, as the same structural problems usually also affect men, and remedies created in response to women’s group claims risk excluding them.”


John Rawls, in ‘The Idea of Public Reason Revisited’, attempts to bring the individual and the group together. In this article, Rawls writes about the place and role of citizens when dealing with the State and its institutions. In one section, he tries to link the family with the idea that equal justice can be applied to men and women within these entities. However, even Rawls is clear that any rights or responsibilities found within the family unit are very much attached to its individual members as part of the group and not to the group as a whole. As Rawls argues: “Now consider the family. Here the idea is the same: political principles do not apply directly to its internal life, but they do impose essential constraints on the family…and so guarantee the basic rights and liberties…of all its members.” (John Rawls, 1999: 595-597)

At the UN level, Donnelly’s and Rawls’ arguments are evident. Article 27 of the International Covenant on Civil and Political Rights (ICCPR, 1966) outlines the rights of individuals as part of a listed minority group(s) – “In those States in which ethnic, religious or linguistic minorities exist, persons belonging to such minorities…” This article, however, does not set out rights for the minority group as a whole. (Bill Bowring, 1999: 4) Even in a more recent UN instrument, the Declaration on the Rights of Persons Belonging to National or Ethnic, Religious and Linguistic Minorities (1992), the emphasis is on ‘persons’ belonging to such groups. (P. R. Ghandhi, 2000: 132-134) In other words, as currently framed, “…minority rights are individual rights.” (Bill Bowring, 1999: 14) However, Bill Bowring argues that international human rights law must move beyond this narrow interpretation and that it should recognise group and minority rights as such. (Ibid) An attempt to expand on this formulation has been proposed by Will Kymlicka in Multicultural Citizenship: A Liberal Theory of Minority Rights. His focus is on the possibility of developing minority rights. Human rights, it was argued by liberal thinkers after World War II, would protect and thus solve the problems faced by, for instance, ethnic and national minorities. (Will Kymlicka, 1997: 2-3) In the face of discrimination, the newly developed international human rights instruments would protect individual members of minority groups. There would be no need to include ‘minority group rights’. In fact, with this background in mind, the UN removed any reference to ethnic and national minorities in the Universal Declaration of Human Rights (1948).
(Ibid: 3) In other words, “…post-war liberals around the world have repeatedly opposed the idea that specific ethnic or national groups should be given a permanent political identity or constitutional status.” (Id: 4) For Kymlicka, the concept of human rights alone is not adequate to address the needs of minority groups. For example, the individual right of free speech does not outline what sort of minority language policy a State should enact. It is for this reason that he proposes to enhance human rights with a ‘theory of minority rights’. (Id: 5) In his words, ‘group-differentiated rights’ whereby “…members of certain groups have rights regarding land, language, representation, etc that the members of other groups do not have.” (Id: 34)

Eva Brems in Lyons & Mayall (eds.), 2003: 121.



In a similar vein, Charles Taylor’s Multiculturalism and “The Politics of Recognition” examines this interplay between the individual and the group. Writing about the importance of recognising the needs, identities, etc. of individuals, Taylor presents the idea that individuals define their respective identities with the help of others: “We define our identity always in dialogue with, sometimes in struggle against, the things our significant others want to see in us. Thus, the contribution of significant others…continues indefinitely.” (Charles Taylor, 1992: 33)

Individuals, for the most part, operate both on their own stage and within the constraints and advantages of the larger community or group such as the family. Once again, for Taylor, the group is addressed in terms of its individual members along with the potential ‘collective goals’ linked to ‘members of the community’.84 The contribution that Kymlicka’s thesis can offer towards understanding the implications of genocide and of rape as genocide is that he makes a connection between individual and group rights – a theme hinted at within the international crime of genocide. Kymlicka acknowledges that ‘group-differentiated rights’ may seem to counter efforts to emphasise the individual in that his theory focuses on the group. (Will Kymlicka, 1997: 34) Yet, Kymlicka argues that there can be compatibility between individual rights and group-differentiated rights. For instance, minority language rights are an area in which a right is attached both to an individual member of a group and to the group as a whole. In Canada, to follow his example, the right of francophones to use French in courts is one acted on by individuals. The right may be aimed at the entire francophone group, but it is exercised by individual francophones.85 Other rights in Canada, such as fishing/hunting rights for indigenous peoples, are accorded to groups. Working with both the individual and group elements of the issues, Kymlicka points out: “Just as certain individual rights flow from each individual’s interest in personal liberty, so certain community rights flow from each community’s interest in self-preservation. These community rights must then be weighed against the rights of the individuals who compose the community.” (Will Kymlicka, 1997: 47)

Hence, the preservation of the group, which is deemed critical, can operate alongside the rights and needs of individual members of the community or group. There may be conflict, for example if a group imposes restrictions on its members, but Kymlicka differentiates between internal (‘claims of a group against its own members’) and external (‘claims of a group against the larger society’) protections, both of which have limitations, such as the need to fit within human rights or to balance opportunities, etc. between groups. (Ibid: 35) Returning to genocide, acts associated with this crime are aimed at destroying in whole or in part a national, ethnical, racial or religious group. In turn, it is individual members of said groups who are killed, harmed, etc. The two components of Kymlicka’s vision, the group and

84 Taylor is speaking about the collective goals of French speaking Québecois. Charles Taylor, 1992: 39 and 53-58.
85 Kymlicka insists that French Canadians are a national minority thereby ensuring they can be accorded group-differentiated rights. Will Kymlicka, 1997: 45-46.


the individual within the group, are able to co-exist within this formulation. This does not exclude the current concept of human rights with its emphasis on the individual and his/her human rights. This part of Kymlicka’s approach does not, as warned by Jack Donnelly, completely subsume the category of group rights within human rights thereby negating a place for the individual. Rather, an area of accommodation has been created whereby the group is protected and the individual within the group is also protected, acknowledged, and able to play an active role. The next question becomes whether rape as genocide can also be placed within this area of accommodation.

Rape as Genocide

The international crime of genocide emerges first and foremost as the destruction in whole or in part of a national, ethnical, racial or religious group. Yet, by delving further into this violation and by incorporating certain elements from Kymlicka’s work, one is able to find a space for individual members of said groups since the acts of genocide enumerated in the Genocide Convention are committed against individuals. As Jack Donnelly emphasises with direct reference to genocide: “Therefore, it is individual members of threatened groups who must exercise the right.” (Jack Donnelly in Catherine Brolman et al (eds.), 1993: 131-132)

Is it acceptable, therefore, for rape as genocide to be placed within Kymlicka’s formulation? Any response to this question should simultaneously consider some of the implications emanating from the two International Criminal Tribunal Judgements mentioned in the introduction to this chapter. The reason is that these Judgements offer quite distinct presentations of rape. In the International Criminal Tribunal for the former Yugoslavia Judgement (Kunarac et al - Foca case, ICTY – 2001: 441), the Trial Chamber considered that rape could be understood as ‘a serious violation of sexual autonomy’. (Catherine A. MacKinnon, 2006: 950) In its overview of several common and civil law jurisdictions in relation to definitions of rape, the Trial Chamber concluded that the main principle linking these systems “…is that serious violations of sexual autonomy are to be penalised.” In turn, “Sexual autonomy is violated wherever the person subjected to the act has not freely agreed to it or is otherwise not a voluntary participant.” (Foca case, 2001: 457) As with the international crime of torture, this conclusion emphasises that rape should be conceptualised as a crime committed against the individual. As such, rape is an act perpetrated against the individual and it specifically violates the sexual components of the individual.86 As Catherine Mackinnon observes within the context of consent definitions of rape, “This crime (rape) basically occurs in individual psychic space.” Sexual autonomy can be understood as having three components: “The first two are mental – an internal capacity to make reasonably mature and rational choices, and an external freedom from impermissible pressures and constraints. The third

86 As Catherine A. Mackinnon states “Force abrogates autonomy just as denial of self-determination is coercive.” Catherine A. Mackinnon, 2006: 941.



dimension is equally important. The core concept of the person…the bodily integrity of the individual.” (Stephen J. Schulhofer, 1998: 111)

Although this definition of sexual autonomy crucially includes both mental and physical aspects, there is concern with the notion of making choices. A similar link can be made with theories of human rights whereby the capacity to claim human rights implies the ability to hold human rights. One concern, if taken to the extreme, would be to contemplate whether an individual in a coma truly has human rights.87 In his examination of sexual autonomy, Stephen J. Schulhofer goes on to add that determining if a violation of sexual autonomy constitutes rape is an additional step and one that can be linked to cultural factors or social conditions. As Schulhofer clarifies: “In sexual crimes, however, the labelling decision may be very important. A persistent pitfall of rape reform has been the tenacity of common culture and its influence over the interpretation of ambitious reforms.” (Ibid: 104)

In contrast, the International Criminal Tribunal for Rwanda (ICTR) in its pivotal Judgement (Prosecutor v. Jean-Paul Akayesu, 1998), for the first time in international law conceptualises rape under certain circumstances as genocide.88 The women who were raped during the genocide of 1994 were, according to the Trial Chamber, targeted for rape because they were members of the Tutsi ethnic group. The rapes were therefore considered as genocide within this context since, in the words of the Chamber, “…the Chamber is satisfied that the acts of rape and sexual violence described above, were committed solely against Tutsi women…and specifically contributing to their destruction and to the destruction of the Tutsi group as a whole.”89

The Chamber added, “These rapes resulted in physical and psychological destruction of Tutsi women, their families and their communities.” (Akayesu case, 1998: 731)

87 For more on this topic, please refer to Peter Jones, 1994: 67-71.

88 Jean-Paul Akayesu did not rape the women directly. Instead, the charges of rape in his case stem from the fact that he had administrative control and authority in the region where the rapes occurred. Akayesu case (ICTR-96-4-T, Judgment, 2 September 1998): Indictment.

89 Akayesu case, 1998: 165-166. It should be noted, in the Akayesu Judgement, that rape and other sexual violence within the parameters of genocide was “…defined by whatever causes serious bodily or mental harm.” This is due to how the Genocide Convention (1948) has been formulated. Catherine A. MacKinnon, 2006: 941.

One way in which the rapes contributed to the destruction of the Tutsi group was that many of the Tutsi women and girls who were raped were killed afterwards or died from their injuries. (Angela M. Banks, 2005: 9-10) Another critical point, in relation to how rape was deemed as genocide, relates to the reality that Tutsi women were considered as ‘sexual objects’ and, as the Trial Chamber in the Akayesu case observed, “…sexual violence was a step in the process of destruction of the Tutsi group – destruction of the spirit, of the will to live, and of life itself.” (Kelly Dawn Askin, 2005: 1010) The rapes of Tutsi women, within this context, could be placed “…under the legal definition of genocide because they represent the enemy’s intent to destroy.” (Lisa Sharlach, 2000: 93) In addition, rape as genocide can be understood as a ‘particularly effective tool of genocide’90 and a way to inflict serious bodily or mental harm on a group. As the Akayesu Trial Chamber observed, “Indeed, rape and sexual violence certainly constitute infliction of serious bodily and mental harm on the victims and are even, according to the Chamber, one of the worst ways of inflicting harm on the victim as he or she suffers both bodily and mental harm.” (Akayesu case, 1998: 731)

Some of the after-effects of the rapes that took place within the context of genocide in Rwanda included survivors becoming socially outcast and excluded. (Lisa Sharlach, 2000: 91) That is to say, an additional layer of complexity emerges linked to cultural opinions and sensitivities. As Lisa Sharlach argues in her overview of genocide in Rwanda, Bangladesh and the former Yugoslavia, rape as genocide in general stigmatises the survivor: “In such communities, women in their roles as mothers of the nation and as transmitters of culture symbolise the honour of the ethnic group. When a woman’s honour is tarnished through rape, the ethnic group is also dishonoured.” (Ibid: 90)

90 This reference relates to “The devastation that follows rape makes it a particularly effective tool of genocide because it destroys the morale of a woman, her family, and perhaps her entire community.” Lisa Sharlach.

If both cases (Kunarac and Akayesu) are considered together, does the innovative link between rape and genocide as presented in the Akayesu case result in rape losing its understanding as a violation of autonomy? Upon closer examination of the Akayesu Trial Chamber comments quoted above, it would appear that the compatibility between the individual and the group found within genocide, as discussed earlier, can be located in this case. Yes, the Trial Chamber focuses on the fact that individual victims were targeted due to their membership in the Tutsi ethnic group. However, the Trial Chamber also acknowledges that both the Tutsi group and the individual victims of rape were targeted for genocide. Recalling the words of the Chamber: “…and specifically contributing to their destruction and to the destruction of the Tutsi group as a whole.” Therefore, in this particular case, the crime of rape as genocide is conceived as both an act committed against an individual and as an act committed against the group. As such, rape as genocide has retained its understanding as a violation against one’s autonomy, but also as a violation against the group as a whole. The ICTR Trial Chamber in this particular Judgement has created an area of accommodation whereby the group (Tutsi ethnic group) and the individual (Tutsi women) are acknowledged, with the hope of protecting both in the future. Even with this compatibility between the group and the individual, it should be emphasised that the Trial Chamber insisted that the women who were raped were victims because they were Tutsi. The attachment to the group is not completely removed despite the fact that the Chamber has also acknowledged space for the individual. This means that the individual victims of rape were not targeted because they were individual women, but because they were Tutsi women. This approach may further deny the ‘individuality’ of the victims since they have been placed by the Trial Chamber within the category of Tutsi women and not within the general category of ‘women’. The construct of the Genocide Convention (1948), which the ICTR Trial Chamber must follow, would explain the restriction of only focussing on the Tutsi ethnic group.

CONCLUSION

Does this insistence on a strong connection with the ‘group’ diminish the so-called compatibility between rape (autonomy) and genocide (group)? I would argue it does not, since a key component found within the definition of genocide is the protection of groups. The United Nations Commission of Experts (for the Former Yugoslavia) offers support for this author’s argument that the relationship between rape and genocide is a compatible one but inevitably, due to how genocide has been constructed in international law, an unequal relationship: “…the crimes against a number of individuals must be directed at them in their collectivity or at them in their collective character or capacity.”91

91 UN Document S/1994/674, ‘Final Report of the Commission of Experts’, (1992): 524.

However, if deemed necessary, this compatibility should also be arrived at via a decision made by those who have suffered these crimes. As mentioned at the very beginning of this chapter, it may be beneficial for such a label to exist if the individual victims of rape as genocide consider this to be so. These victims must also have a voice in the process that constructs and interprets such international law judgements. As such, and this was a key aim of this chapter, some of the theoretical implications have been examined emanating from a situation whereby international law has placed rape simultaneously as a violation against the individual and as a violation against the group. In contrast to rape as a form of torture with its association to the individual, this chapter has determined that rape as genocide has to operate within the confines of genocide with its links to the group. Although genocide has been conceived as a denial of the right to life of human groups, it also includes scope for the individual. In other words, genocide is an attack against a national, ethnic, racial, or religious group, but it is also an attack against individual persons within these groups. This complex interplay, found within genocide, between the group and the individual, affects how rape as genocide is understood. This means that individuals are victims of rape as genocide due to their membership in one or more of the enumerated groups in the Genocide Convention (1948), and within circumstances found under the definition of genocide. Therefore, this author has argued that there must be an opportunity for rape as genocide to be conceived as a violation against an individual because they are part of a group, and not as a violation solely committed against the group as a whole. If this formulation is not possible, within international law cases addressing such issues, then the result will be an exclusion of the construction of rape as a violation of an individual’s sexual autonomy. The definition of genocide under international law cannot, however, completely remove the notion that rape is a violation committed against the individual. Victims of rape who appeared before the ICTR in the case of the Prosecutor v. Jean-Paul Akayesu, albeit within limits set out by the Tribunal and the Trial Chamber (such as time limits and the rigours of cross-examination),92 do have an opportunity to recount what happened to them as individuals during the 1994 genocide in Rwanda. The context of the genocide in Rwanda, namely that it was committed with the purpose to destroy in whole or in part the Tutsi ethnic group, was in turn a necessary component attached to the Akayesu (1998) Judgement. As such, the Trial Chamber in this case did acknowledge that the women were raped because they were members of the Tutsi ethnic group but also that they were individual Tutsi victims. As developed in this chapter, the definition of genocide, outlined in the Genocide Convention (1948) and as utilised by the international criminal tribunals, can offer a compatible space for rape. Although constrained by this definition, the group elements of this international crime can create room for the individual components of rape. For example, when this connection with the group is insisted upon, an individual’s sexual autonomy does at a certain level become enmeshed within the group. Does this change the nature of rape? To a certain degree the answer is “yes” since rape is not solely associated with the sexual autonomy of an individual. Yet, the construction of rape as a violation committed against the individual does not completely disappear. An unequal space is indeed created whereby the group elements of genocide and the individual aspects of rape operate once rape is considered as genocide.
The point is not to shift and to re-configure this unequal space, but rather to accept that this is the reality and the consequence of innovative international jurisprudence, once rape is subsumed into genocide, such as with the Akayesu Judgement (1998). In turn, victims of rape as genocide must also have the opportunity to decide for themselves how it is beneficial to consider rape as genocide. Linked to this, they must also be able to determine which aspects are held onto in terms of whether the violation affects the individual, the group, or a combination of the two. As argued in relation to both rape as torture and rape as genocide, such theoretical implications and perhaps others must also be considered when assessing juridical innovations. Theoretical challenges for group theory emerge when assessing a violation such as rape within the context of international human rights or humanitarian law. Rape is generally constructed as an act committed against the individual. Currently rape is not an enumerated international crime and must be subsumed into an established crime to be prosecuted. In this instance the focus was on rape as torture and rape as genocide. As such, any discussion of the group or of group rights must also consider the theoretical implications of what can occur to rape since it must navigate through constructs that focus on the individual, on the group or at times on both parameters. Thus, the consideration of rape as torture and of rape as genocide is not an insurmountable challenge for group theory but one that must address both the individual and the group due to how rape is understood. As noted, the harm of rape for both the individual and at times the group was an important example of this juxtaposition. This is why seemingly unrelated concepts such as human rights, the individual, and the political were in turn associated with group theory and in particular group rights. If the outsider93 of group rights is to develop a firmer place within current theoretical and legal discourse then the implications of such concepts must be fully explored.

92 ‘Rwanda Today: The International Criminal Tribunal and the Prospects for Peace and Reconciliation. An Interview with Helena Cobban’.

93 This term has been utilised simply due to the current difficulties still present within international law that prevent a fuller acceptance of group rights.

REFERENCES

Afsharipour, Afra. “Empowering Ourselves: The Role of Women’s NGOs in the Enforcement of the Women’s Convention.” Columbia Law Review, 99(1), 1999. Allodi, F. & Stiasny, S. “Women as Torture Victims.” Canadian Journal of Psychiatry, Vol. 35: 144, March 1990. Alston, Philip (ed.). The United Nations and Human Rights: A Critical Appraisal. Clarendon Paperbacks, Oxford, 1996. Andreopoulos, George J. (ed.). Genocide: Conceptual and Historical Dimensions. University of Pennsylvania Press, 1994. Anker, Deborah. “Women Refugees: Forgotten No Longer?” San Diego Law Review, 32, 1995. Archard, David. “The Wrong of Rape.” The Philosophical Quarterly, Vol. 57, No. 228, July 2007. Arendt, Hannah. The Human Condition (Second Edition). The University of Chicago Press, London, 1998. Askin, Kelly Dawn. War Crimes Against Women: Prosecution in International War Crimes Tribunals. Kluwer Law International, The Hague, 1997. Askin, Kelly Dawn. “Gender Crimes: Jurisprudence in the ICTR.” Journal of International Criminal Justice (JICJ), 3 (2005). Aswad, Evelyn Mary. “Torture by Means of Rape.” The Georgetown Law Journal, Vol. 84: 1913, 1996. Baines, Erin K. “Body Politics and the Rwandan Crisis.” Third World Quarterly, Vol 24, No 3, 2003. Ball, Terence & Dagger, Richard. Political Ideologies and the Democratic Ideal. Pearson Longman, USA, 2006. Banks, Angela. “Sexual Violence and International Criminal Law: An Analysis of the Ad Hoc Tribunal’s Jurisprudence & the International Criminal Court’s Elements of Crimes.” (The Hague: Women’s Initiatives for Gender Justice, September 2005), 9-10. Bassiouni, M. Cherif. Crimes Against Humanity in International Criminal Law (2nd Edition). Kluwer Law International, The Hague, 1999. Bassiouni, M. Cherif & Manicas, Peter. The Law of the International Criminal Tribunal for the Former Yugoslavia. Transnational Publications, N.Y., 1996. Blatt, Deborah. “Recognising Rape as a Method of Torture.” New York University Review of Law and Social Change, Vol. 5, 1994. Bobbio, Norberto. The Age of Rights. Polity Press, Cambridge, 1995. Bowring, Bill in Fottrell, Deirdre & Bowring, Bill (eds). Minority and Group Rights in the New Millennium. Martinus Nijhoff Publishers, The Hague, 1999. Brandt, Don (ed.). Violence Against Women: From Silence to Empowerment. World

Vision International, California, 2003. Brems, Eva in Lyons, Gene M. & Mayall, James. International Human Rights in the 21st Century. Rowman & Littlefield, Oxford, 2003. Brownlie, Ian & Goodwin-Gill, Guy S. Basic Documents in International Law. Oxford University Press, Oxford, 2002. Brownmiller, Susan. Against Our Will: Men, Women, and Rape. Penguin Books, New York, 1975. Buss, Doris E. “Women at the Borders: Rape and Nationalism in International Law.” Feminist Legal Studies, Vol. VI no. 2, 1998. Cahill, Anne J. Rethinking Rape. Cornell University Press, London, 2001. Campbell, Kirsten. “The Trauma of Justice: Sexual Violence, Crimes Against Humanity and the International Criminal Tribunal for the Former Yugoslavia.” Social & Legal Studies, Vol. 13(3), 2004. Card, Claudia. The Atrocity Paradigm: A Theory of Evil. Oxford University Press, Oxford, 2002. Casals, Neus Torbisco. Group Rights as Human Rights, Springer, UK, 2006 Cavaro, Adriano in Stephen K. White (Ed.) and Donald J. Moon (Ed.), What is Political Theory? Sage Publications, London, 2004. Cases Cited: - ICTR. Prosecutor v. Jean-Paul Akayesu, ICTR-96-4-T, Judgement, 2 September 1998. - ICTY. Prosecutor v. Dragolijub Kunarac, Radomir Kovac and Zoran Vukovic, ICTY, IT-96-23-T & IT-96-23/1-T, Judgement, 22 February 2001. (Foca Case) - Yearbook of the European Convention on Human Rights. The Greek Case. V. 12. 1969. 504. Chesterman, Simon. “Never Again…and Again: Law, Order, and the Gender of War Crimes in Bosnia and Beyond.” Yale Journal of International Law, Vol. 22, Number 2: 299, summer 1997. Chalk, Frank & Jonassohn, Kurt. The History and Sociology of Genocide. Yale University Press, London, 1990. Chinkin, Christine. “Rape and Sexual Abuse of Women in International Law.” European Journal of International Law (EJIL), Vol. 5: 1-341, 1994. Chinkin, Christine. “A Critique of the Public/Private Dimension.” European Journal of International Law (EJIL), Vol. 10 No. 2, 1999. Conaghan, Joanne. 
“Extending the Reach of Human Rights to Encompass Victims of Rape: MC v. Bulgaria.” Feminist Legal Studies, 13, 2005. Connolly, William E. The Terms of Political Discourse (Second Edition). Martin Robertson, Oxford, 1983. Copelon, Rhonda. “Women and War Crimes.” St. John’s Law Review, Vol. 69:37, 1995. Copelan, Rhonda. “Surfacing Gender: Reengraving the Crimes against Women in Humanitarian Law”. In: N. Dombrowski (ed.) Women and War in the Twentieth Century, Garland Publishing, NY, 1999. Crelinsten, Ronald D. Politics of Pain. COMT, Leiden, 1995. De Feyter, Ken in De Feyter and Pavlakos (Editors), The Tension Between Group Rights and Human Rights: A Multidisciplinary Approach, Oxford and Portland, Oregon, 2008. De Than, Claire & Shorts, Edwin. International Criminal Law and Human Rights. Sweet & Maxwell, London, 2003.



De Vito, Daniela. Beijing Plus Five: An Assessment. University of Essex, Human Rights Centre, 2001. Dixon, Rosalind. “Rape as a Crime in International Humanitarian Law: Where to from Here?” European Journal of International Law (EJIL), Vol. 13 No. 3, 2002. Donnelly, Jack. Universal Human Rights in Theory and Practice. Cornell University Press, London, 1996. Donnelly, Jack in Brolmann, Catherine (ed.). Peoples and Minorities in International Law. Martinus Nijhoff, London, 1993. Donnelly, Jack in Lyons, Gene M. & Mayall, James. International Human Rights in the 21st Century. Rowman & Littlefield, Oxford, 2003. Duner, Bertil (ed.). An End to Torture: Torture and Other Barbarous Practices. Palgrave New Copy, London, 1998. Durham, Helen. “Women and Civil Society: NGOs and International Criminal Law.” In Kelly Dawn Askin and Doreen M. Koenig (eds.), Women and International Human Rights Law. Transnational, Ardsley, N.Y., 1999. Eboe-Osuji, Chile. “Rape as Genocide: Some Questions Arising.” Journal of Genocide Research, 9(2), June 2007. Elshtain, Jean Beth. Public man, private woman: Women in social and political thought. Robertson, Oxford, 1981. Engle, Karen. “Feminism and its (Dis)Contents: Criminalising Wartime Rape in Bosnia and Herzegovina.” American Journal of International Law, Vol. 99, No. 4, October 2005. Fabri, Mary R. “A Clinical Perspective of the Role of Gender in the Torture Experience.” Chicago Torture Conference March 4-7 1999. Located at: http://humanities.uchicago.edu/cis/torture/abstracts/maryfabri.html printed on 31/03/99). Fallmeth, Aaron Xavier. “Feminism and International Law: Theory, Methodology and Substantive Reform.” Human Rights Quarterly, Vol. 22: 694, 2000. Fein, Helen. Genocide: a sociological perspective. Sage, London, 1993. Forrest, Duncan (ed.). Glimpse of Hell: reports on torture worldwide. Amnesty International, London, 1991. Freeman, Michael. 
“The Theory and Prevention of Genocide.” Holocaust and Genocide Studies, 6(2), 1991 Freeman, Michael. Human Rights: An Interdisciplinary Approach. Polity, Cambridge, 2002. Gaer, Felice D. “And Never the Twain Shall Meet? The Struggle to Establish Women’s Rights as International Human Rights.” American Bar Association, 1998. Gardam, Judith & Jarvis, Michelle. Women, Armed Conflict and International Law. Kluwer Law International, The Hague, 2001 Ghandi, P.R. Blackstone’s International Human Rights Documents. Blackstone Press, London, 2000. Giffard, Camille. The Torture Reporting Handbook: How to document and respond to allegations of torture within the international systems for the protection of human rights. Human Rights Centre, University of Essex, 2000. Giles, Wenona & Hyndman, Jennifer. Sites of Violence: Gender and Conflicts Zones. University of California Press, Berkeley, 2004. Grayzel, Susan. Women’s Identities at War: Gender, Motherhood and Politics in Britain and France during First World War. University of Carolina Press, Chapel Hill, 1999.


Green, Jennifer. “Uncovering Collective Rape: A Comparative Study of Political Sexual Violence.” International Journal of Sociology, 34, 2004. Green, Llezlie L. “Sexual Violence and Genocide Against Tutsi Women.” Columbia Human Rights Law Review, Summer 2002. Human Rights Watch. “Kosovo Backgrounder: Sexual Violence as an International Crime.” May 1999. Howard, Dick. The Politics of Critique. The Macmillan Press Ltd, London, 1988. Ingram, David. Group Rights: Reconciling Equality and Difference, University Press of Kansas, Kansas City, 2000. Jones, Adam. Genocide: A Comprehensive Introduction. Routledge, London, 2006. Jones, Peter. Rights. Macmillan Press, London, 1994. Kegly & Wittkopf. World Political Trend & Transformation. Wadsworth Publishing, London, 2003. Gabriel Kirk McDonald in Sienho Yee and Wang Tieya (eds.). International Law in the Post-Cold War World. Routledge, London, 2001. Kois, Lisa M. in Bertil Duner (ed.) An End to Torture: Strategies for its Eradication. Zed Books, London, 1998. Kuper, Leo. Genocide: Its Political use in the Twentieth Century. Yale University Press, New Haven, 1982. Kymlicka, Will. Multicultural Citizenship. Clarendon Press, Oxford, 1997. Lancet – letter to the editor. “Prevalence of Sexual Torture in Political Dissidents.” Vol. 345: 1307, 20 May 1995. Lemkin, Raphael. “Genocide as a Crime under International Law.” American Journal of International Law, Vol. 41: 146, 1947. Lerner, Natan. The UN Convention on the Elimination of all Forms of racial Discrimination. Kluwer Law International, Boston, 2003. Lusby, Katherine. “Hearing the Invisible Women of Political Rape.” University of Tulsa Law Review. Vo. 25: 911, 1995. Lyons, Gene M. & Mayall, James (eds.). International Human Rights in the 21ST Century: Protecting the Rights of Groups. Rowman & Littlefield, Oxford, 2003. Mackie, Vera. “Sexual Violence, Silence and Human Rights Discourse.” In Anne-Marie Hilsdon, Martha MacIntyre and Maila Stivens (eds.). 
Human Rights and Gender Politics: Asia Pacific Perspectives. Routledge, London, 2000. MacKinnon, Catherine. “On Torture.” In: K. Mahony & P. Mahoney (eds.). Human Rights in the Twenty-First Century: A Global Challenge. Martinus Nijhoff, Boston, 1993. MacKinnon, Catherine. “Rape, Genocide, and Women’s Human Rights.” Harvard Women’s Law Journal, Vol 17, 1994 MacKinnon, Catherine. “Defining Rape Internationally: A Comment on Akayesu.” Columbia Journal of Transnational Law, Vol 44, 2006. Martin, Elizabeth (ed.). Oxford Dictionary of Law. Oxford University Press, Oxford. Morris, Virginia and Scharf, Michael P. An Insider’s Guide to the International Criminal Tribunal for the Former Yugoslavia. Transnational Publications, N.Y., 1995. Morsink, Johannes. “The Philosophy of the Universal Declaration.” Human Rights Quarterly, Vol. 6(3), 1984. Mukamana, Donatilla and Collins, Amthony. “Rape Survivors of the Rwandan Genocide.” Critical Psychology, 17: Critical Psychology in Africa, 2006. O’Sullivan, Noel. Political Theory in Transition, Routledge, London, 2000.



O’Byrne, Darren J. Human Rights: an Introduction. Harlow: Longman, London, 2003. OSCE (ODIHR). Preventing Torture: A Handbook for OSCE Field Staff. OSCE, Warsaw, 1999. Peel, Michael. Rape as a Method of Torture. Medical Foundation for Survivors of Torture, London, 2005. Pateman, Carole. The Sexual Contract. Polity, Cambridge, 1988. Ratner, Steven & Abrams, Jason. Accountability for Human Rights Atrocities in International Law: Beyond the Nuremberg Legacy (2nd Edition). Oxford University Press, 2001. Rawls, John. Collected Papers. Harvard University Press, London, 1999. Rodley, Nigel S. The Treatment of Prisoners under International Law. Clarendon Press, Oxford, 1999. Rodley, Nigel S. “Conceptual Problems in the Protection of Minorities: International Legal Developments.” Human Rights Quarterly, Vol. 17: 61, 1995. Rose, Nikolas. Powers of Freedom: Reframing Political Thought. Cambridge University Press, Cambridge, 1999. Rosenbaum, Allan S. Is the Holocaust Unique? Perspectives on Comparative Genocide. Westview Press, Bolder, 2001. Rwanda Today: The International Criminal Tribunal and the Prospects for Peace and Reconciliation. An Interview with Helena Cobban. Schabas, William A. Genocide in International Law. Cambridge University Press, Cambridge, 2000. Schulhofer, Stephen J. Unwanted Sex: The Culture of Intimidation and the Failure of Law. Harvard University Press, London, 1998. Harvard University Press, London, 1998. Sharlach, Lisa. “Rape as Genocide: Bangladesh, the Former Yugoslavia, and Rwanda.” New Political Science, Volume 22, Number 1, 2000. Shaw, Martin. War & Genocide: Organised Killing in Modern Society. Polity Press, Cambridge, 2006. Shaw, Martin. What is Genocide? Polity Press, Cambridge, 2007. Short, Damien. Genocides. Zed Books, London, 2008. Sorenson, Susan & White, Jacqueline. “Adult Sexual Assault: Overview of Research.” Journal of Social Issues, 48 (1), 1991. Sparkes, A.W. Talking Politics: a wordbook. Routledge, London, 1994. Squires, Judith. 
Gender in Political Theory. Polity Press, Cambridge, 1999. Strumpen-Darrie, Christine. “Rape: A Survey of Current International Jurisprudence.” Human Rights Brief, Volume 7 Issue 3. http://www.wcl.american.edu/hrbrief/v7i3/rape.htm (Printed 05/07/2006) Taylor, Charles. Multiculturalism and ‘The Politics of Recognition’. Princeton University Press, N.J., 1992. Thomas Katie. “Sexual Violence: Weapon of War.” Forced Migration Review, Issue 27, January 2007. Triffterer, Otto (ed.). Commentary on the Rome Statute of the International Criminal Court: Observer’s Notes, Article by Article. Nomos Verl.-Ges., Baden-Baden, 1999. United Nations Documents: A/55/290 A/CN.4/453 and Add. 1-3, “Fifth Report on State responsibility by Mr. G. ArangjioRuiz, Special Rapporteur,” Yearbook of the International Law Commission II (1)


(1993 A/CONF/183/9 (July 1998) E/CN.4/1999/68/Add.4 E/CN.4/Sub.2/1985/6 (Whitaker report, 2 July 1985) EGM/GBP/1997/Report G.A. Res. 91-I, 11 December 1946 GA res.217 A (III), 10 December 1948 G.A. Res. 3318(XXIX) of 14 Dec 1974 (Declaration on Women and Children in Emergency and Armed Conflict) GA Res. 260A(III), UN Doc. A/760. Security Council Resolution 995 of 8 November 1994 S/25704 (Report of the Secretary General Pursuant to Paragraph 2 of the Security Council Resolution 808, 1993) S/1994/674, ‘Final Report of the Commission of Experts’, (1992) Watson, James Shand. Theory and Reality in the International Protection of Human Rights. Transnational Publishers, N.Y., 1999. Wentraub, Jeff & Kumar, Krishan (eds.). Public and Private in Thought and Practice: Perspectives on a Grand Dichotomy. The University of Chicago Press, Chicago, 1997. Women’s Caucus. “ICC—Grave Breaches: The Torture Cluster.” www.iccwomen.org 12-26 February 1999. Wood, Neil. Reflections on Political Theory: A Voice of Reason from the Past. Palgrave MacMillan, London, 2002.

In: Group Theory Editor: Charles W. Danellis, pp.153-166


Chapter 5

GROUP WORK IS NOT ONE, BUT A GREAT MANY PROCESSES – UNDERSTANDING GROUP WORK DYNAMICS

Eva Hammar Chiriac∗
Department of Behavioural Sciences and Learning
Linköping University, SE-581 83 Linköping
Sweden

ABSTRACT

When people come together in order to collaborate with each other, many group processes arise. Earlier research has shown that group work is not just one single activity but comprises several activities with different goals and conditions. That implies that group work may change character several times during its functioning and in the group’s lifetime. In a simplified way, this may be described as certain working modes being better suited to some parts of the group work than to others. Consequently, group work functions on several levels during different parts of the group’s work. This may result in high quality group work but also in the exact opposite. Such different conditions emanate from different dynamic processes that can either facilitate or hamper the group’s work. In order to better understand and interpret processes that might arise when people come together in a group and work together on a common task, a new model will be presented, namely The periodic system for the understanding of group processes. The model represents a contemporary approach to categorising group processes and can hopefully provide a better understanding of interactional dynamics in groups and offer greater explanatory value with respect to the group processes. This chapter aims to explain The periodic system for the understanding of group processes and to present illustrative applications of the model. A core question is whether or not the model can contribute valuable information and whether it is a practical tool for describing and interpreting what happens in groups during work. Earlier research has shown promising results indicating that this kind of tool can supply a better understanding of interactional dynamics in groups, not only from a scientific perspective but also from users’ applied perspective.

∗ E-mail: [email protected]; Tel. no.: +46 13 285735; Fax no.: +46 13 282145

INTRODUCTION

How to establish well-functioning group work in education is not self-evident. All educators and students who use or participate in group work know that it may function in various ways. Such a mode of working sometimes ends in positive experiences and learning; in other cases, however, the outcome could be the reverse. When people come together in order to collaborate, many group processes arise within the group, which probably have an impact on the group’s production and the quality of learning. Previous research has shown that group work is not just one single activity but comprises several activities, each with different goals and conditions (Hammar Chiriac, 2003, 2008; Steiner, 1972, 1976). That implies that the group’s work may change character several times during the work and its lifetime. In a simplified way, this may be described as certain working modes being better suited to some parts of the group work than to others. Additionally, this also implies that the group may need different kinds of support and assistance during various parts of its work. The question of why some group work turns out to be successful and other group work a failure is still to be decided. What is the essence behind quality group work and can participants’ actions influence the process and outcome, and if so, how? Certainly it is possible for both educators and students to affect the group work in several ways. This chapter will address the issue of group processes connected to the type of task and activity in the group, in other words: What do the group members do and how do they do it?

TYPE OF TASK Research concerning the group’s productivity as superior to the individual is contradictory. There appears to exist several reasons as to why individual performance either may increase or decrease in group context, such as category of group work and type of task. That category of group work has a great importance on the group’s work and processes seem to be well documented (Barkeley, Cross and Major, 2005; Brown, 2000; Forsyth, 2005; Hammar Chiriac, 2008). Lotan (2006) also asserts that various group tasks require different kinds of co-operation. Some group variables might provide more significant information about group work and the interaction dynamics in groups. One variable that seems to account for a greater explanation value regarding group processes is type of task. Type of task, is an important, or possibly the most important feature as regards work outcome and processes in group work. Earlier research has, for instance, shown that type of task has an even greater impact than group size, group composition or even the participants’ motivation (Steiner, 1972; Hammar Chiriac, 2003; 2008; Hammar Chriac and Granström, 2008, in progress). The educators may through the task contemplated influence the group’s working process without directing the students entirely (Hammar Chiriac, 2003, 2008; Hammar Chiriac and Hempel 2008). If the educators want to encourage real group work, i.e. a group work characterised by a common effort, utilization of the group’s competence and

Group Work Is Not One, But a Great Many Processes


joint problem solving and/or reflection, the teacher ought to design a task that stimulates that specific kind of group activity. To be able to design a task corresponding to the desired group activity and outcome, the educator requires knowledge, applicable tools and a better understanding of how group work functions on various levels in separate parts of the group’s work.

Students may also influence group work by ‘selecting’ how to approach the provided task. Even if the educator composes the group task in a certain way, there is no guarantee that the group members will really approach the task in the anticipated way. Thus, even if the teacher introduces ‘a real group work’, the students may still choose an alternative working mode to solve the task. The participants may prefer a more convenient approach, one they associate with greater effectiveness or find less provocative.

Two theories, Steiner’s theory about group processes and productivity (1972) and Bion’s theory concerning the professional group (1961), have each proved to be usable tools for describing what happens in groups. Steiner’s theory, which displays various ways to co-operate in groups, acknowledges only the working group and neglects the socio-emotional aspects of the group’s life. Bion’s theory, which focuses mainly on what the group is taking an interest in doing, emphasizes that both work and emotions must be allowed to influence the group if it is to survive.

STEINER’S THEORY OF GROUP PROCESSES AND PRODUCTIVITY

In his theory, Steiner (1966, 1972, 1976) points to the integrative role of task demands and processes in the group, and to how these factors influence the group’s actual productivity. Steiner is mainly interested in small, task-oriented groups where the members can influence each other directly, generally in face-to-face interaction. The groups are supposed to be completely concerned with task performance and not at all concerned with sociability. Additionally, Steiner (1966, 1972, 1976) considers that a group’s performance depends on three factors: (a) task demands, (b) resources of the group, and (c) group processes.

Task demands are the requirements imposed on the group by the task itself. The nature of the task also specifies the kind and amount of resources that are needed and how they should be utilised for optimal group performance. Task demands encompass all the prescriptions listed in a complete job manual and determine (a) which resources (knowledge, ability, skill or tool) are relevant for the assignment, (b) how much of each kind of resource is needed for optimal performance, and (c) how the various resources must be combined in order to produce the best possible outcome (Hammar Chiriac, 2000, 2001; Steiner, 1966, 1972, 1976).

Task demands are characterised by three different features. The first deals with whether the task is divisible or unitary. A divisible task can be divided into sub-tasks, which can then be distributed among the group members, while a unitary task is impossible to divide in a beneficial way. The second characteristic is whether the demands are maximising or optimising. A maximising task calls for as much quantity or speed as possible, as opposed to an optimising task, where performance is adjusted to some predetermined standard or preferred outcome. The third characteristic concerns the process by which a group interacts to accomplish the task. It deals with how the group combines their concentration, force and effort on the work. In short, task demands are the requirements imposed on the


Eva Hammar Chiriac

group by the task itself. The nature of the task also specifies the kind and amount of resources that are needed and how they should be utilised for optimal group performance.

The group’s resources are decided by which members are included in the group. The resources consist of all relevant knowledge, abilities, skills or tools actually possessed by the group. Furthermore, they include the distribution of resources among group members. If the group wants to obtain maximum productivity, assets must be combined and utilised in the best possible way.

The processes in the group are determined by how the group members choose to combine their resources and efforts. In order to perform high-quality group work, all resources must be utilised and coordinated in the best possible way. In a simplified way, group processes may be described as all actions by which the group transforms its resources into a product. They also include all non-productive actions prompted by inadequate understanding and frustration. Task demands and resources can therefore be evaluated before the work begins, in contrast to processes, which cannot be predicted or evaluated in advance.

Different types of task may be combined in several ways depending on these three factors (task demands, resources of the group and group processes). Based on type of problem, Steiner (1972) has identified five different forms of group work: additive, disjunctive, conjunctive, compensatory and complementary.

An additive task requires that all members’ contributions to the work are weighted equally and then added together. The group’s product depends on the sum of all members’ efforts. The added sum is the group’s result (e.g. pulling a rope). An additive task is unitary and maximising, and the members must combine their efforts simultaneously (process: all together) in order to achieve an optimal outcome.
The disjunctive task depends only on the most successful member finding and presenting the solution to the problem (e.g. problem-solving). The members must make an either-or decision, where one member’s contribution equals the group’s product. One member’s solution is given all the weight and all the other members’ proposals are rejected. Consequently, in a disjunctive task group members do not have to co-operate to accomplish the task. A disjunctive task may be characterised as divisible, optimising or maximising, and there is no need to combine or utilise any efforts; on the contrary, the process is dependent on the most successful group member (individual process).

When working with a conjunctive task, the group is dependent on the weakest member, i.e. all members in the group must complete the task (e.g. mountain climbing). The group might be compared with a chain that is no stronger than its weakest link. Conjunctive tasks are typically unitary and optimising. The members need to co-operate and coordinate their contributions with a high degree of interdependence (common process) in order to accomplish their assignment.

When working with a compensatory task, the group must balance the opinions of all group members into an average. All members are obliged to participate and give independent individual judgments, and all contributions are added together and divided to give a mean. The average is the group’s result (e.g. estimating the temperature in a room). A compensatory task could also include an assignment where all members cast a vote to choose a candidate for a commission. The group’s average is then equivalent to whoever gets the majority of the votes (e.g. choosing the best schoolmate in a class). Accordingly, a compensatory task is unitary and optimising, and the process involves a combination of all members’ contributions (common process).



A complementary task also involves the entire group’s performance. The group work depends on the distribution of work among group members and on each individual taking responsibility for his or her part of the work. The assignment is divided into sub-tasks and distributed among the group’s members. Each person is responsible for his or her part and may work on his or her own terms. The result is the sum of all members’ contributions (e.g. writing an anthology). A complementary task is divisible and optimising, and the process concerns how to appropriately match the group members to their sub-tasks (process: each member doing their own part).

The conclusion of Steiner’s theory is that the nature of the task has a great impact on the group’s productivity.
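The five forms of group work described above differ in how individual contributions are combined into a group result. The following Python sketch makes those combination rules concrete. It rests on assumptions that are not part of Steiner’s theory as presented here: each contribution is reduced to a single number where higher means better, and the function names are hypothetical labels chosen for illustration.

```python
from statistics import mean


def additive(contributions):
    """All efforts are weighted equally and added (e.g. pulling a rope)."""
    return sum(contributions)


def disjunctive(contributions):
    """Only the most successful member's contribution counts."""
    return max(contributions)


def conjunctive(contributions):
    """The group is limited by its weakest member (e.g. mountain climbing)."""
    return min(contributions)


def compensatory(judgments):
    """Independent judgments are averaged (e.g. estimating a temperature)."""
    return mean(judgments)


def complementary(subtask_results):
    """Each member handles one sub-task; the result is the sum of the parts.

    `subtask_results` maps a sub-task name to that member's contribution.
    """
    return sum(subtask_results.values())


if __name__ == "__main__":
    # Hypothetical scores for a three-person group (assumed, not empirical).
    scores = [3, 5, 4]
    print("additive:     ", additive(scores))
    print("disjunctive:  ", disjunctive(scores))
    print("conjunctive:  ", conjunctive(scores))
    print("compensatory: ", compensatory(scores))
    print("complementary:", complementary({"chapter 1": 2, "chapter 2": 3}))
```

With the same three scores, the group result ranges from the weakest member’s value (conjunctive) to the sum of all values (additive), which is one way of seeing why the nature of the task matters so much for productivity.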

BION’S THEORY OF THE PROFESSIONAL WORK GROUP

The purpose of a group is not always obvious to the members. Bion (1961) holds that all groups can act on two different levels. On one, the group acts in a mature and task-oriented way. On the other, the group acts in an immature and regressive mode. Bion labelled these conditions work group versus basic-assumption group. According to Granström (1986), Bion assumes that when disturbed or threatened, the group will act in a regressive way as a collective defence in the form of basic-assumption behaviour. The group acts ‘as if’ a certain condition exists and the group members have a common interest in managing anxiety. A group may act in three different basic-assumption modes: dependency, fight/flight and pairing (Bion, 1961, 1998). Several authors and interpreters of Bion (e.g. Armelius and Armelius, 1985; Boalt Boëthius, 1983; Granström, 1986; Karetrud, 1989; Sjøvold, 1995, 1998) prefer to separate fight and flight groups and regard them as two different basic-assumption conditions, even though Bion himself did not make this distinction.

The purpose of a work group is to manage the task, not only in a rational but also in an effective way (Bion, 1961). The leader serves the group’s purpose, and feelings such as responsibility and co-operation are predominant. The work group handles changes and conflicts in a rational way.

In the dependent condition the group’s aim is to obtain security and protection from the leader. The leader is perceived as omnipotent and the members depend on him or her. Feelings of helplessness and an inability to think critically are predominant. The dependence group acts against changes and denies conflicts.

The fight group tries to subdue anxiety by fighting a presumed shared enemy, and the leader is expected to identify the enemy and command the fight. Feelings such as aggression, hostility and paranoid imaginings occur frequently. The fight group is suspicious of new ideas, and conflicts are solved through fight.

The flight group tries to suppress anxiety through flight and hides from a perceived threat. The leader is expected to have a strategy for relieving the threat. Daydreaming, suspicion and generalisation dominate the atmosphere of the group. The members forget, ignore or laugh off suggestions for changes. Conflicts are rare, but if a conflict occurs, common strategies are flight or denial.


The pairing group tries to reproduce itself. Two members in the group, a pair, are usually more active than the other members. The main emotion expressed in the group is expectation and hopeful anticipation. The leader is not ‘born’ yet, but the group is waiting for its saviour. Members have ‘fantasies’ about a better future, and conflicts are not allowed to surface. The leader must remain unborn, or the group’s basic assumption will disappear and the group will no longer have any function. The pairing group has proved to be more difficult to isolate in a study than the other basic-assumption groups (Granström, 1986; Hammar Chiriac, 2003).

Basic-assumption activity can serve several purposes for the group, either helping or hampering its work. Thus, it is important to consider that basic-assumption activity in a group does not always have a negative impact, but can also serve the group’s purpose. Granström (1986) found that basic-assumption activity could serve three different functions for the group: (a) to support group work activity (support function), (b) to release the group from anxiety (release function) and (c) to substitute an inactive release mode (substitute function). The ideal situation in group work can be an oscillation between work and basic-assumption activity, as long as the latter does not become predominant.

Several studies have been carried out on Bion’s theory of professional work groups, and it appears that this theory is applicable in several different domains, for example meetings (Granström, 1986) and decision processes (Sjøvold, 1995, 1998).

A COMBINATION OF THE TWO THEORIES

Both Steiner’s and Bion’s theories have separately proved to be practicable tools for group analyses. The question is whether a combination of the two theories can be one way to identify, describe and interpret dynamics in study groups. Can Steiner’s and Bion’s theories support and enrich each other and reveal new information that would provide greater explanatory value with respect to group processes?

One way to develop the theories was to combine them in the form of a model. This model may be compared to the periodic system in chemistry. All cells in that system can theoretically be filled with a formula, but in reality not all cells in the periodic system hold an authentic element. Table 1 presents a theoretical combination of the two theories, based on Steiner’s and Bion’s work, in a simplified and schematic way. Each cell in the table represents one form of emotional level in the group (work or basic-assumption) combined with one type of activity (additive, disjunctive, conjunctive, compensatory or complementary).

In the model, prototypical metaphors are inserted in most of the cells that can theoretically be filled with a group process. The qualitative prototype can be considered a metaphor for each group process and gives the reader a mental picture of the process in question. The use of prototypes as metaphors for the different processes can give illustrative descriptions of variations in group behaviour. The main purpose of Table 1 is to illustrate the combination in a simplified way. A further purpose is to construct a tool that can be used for interpreting empirical data.



Table 1. The periodic system for the understanding of group processes

Type of activity   Work group       Dependence group    Fight group     Flight group              Pairing group
Additive           Tug-of-war       -                   Bullying        Stalling                  -
Disjunctive        Expert           Leader dependence   Troublemaker    Slacker                   A pair of lovers
Conjunctive        Mountaineering   -                   Guerrilla       Extended breaks           -
Compensatory       Jury             -                   Groupthink      Keeping piecework down    -
Complementary      Anthology        -                   Stoning         Coffee party              -

THE PERIODIC SYSTEM FOR THE UNDERSTANDING OF GROUP PROCESSES

The periodic system for the understanding of group processes contains 17 different ways in which groups can, and in practice do, co-operate to solve a task (Hammar Chiriac, 2000, 2001, 2003, 2008). In a simplified way, the combination is based on the question: what do the group members do and how do they do it? What the group members are doing is described in the horizontal dimension of the model and is based on Bion’s theory (1961, 1998). This dimension reveals that the group members are either working with the allotted task or with something else. The former is described in the model as the work group, while the dependence, fight, flight and pairing modes imply that the group members are doing something other than the prearranged task (Bion, 1961, 1998). The vertical dimension in the model discloses how group members perform the activity in question and is founded on Steiner’s theory (1966, 1972, 1976). The periodic system for the understanding of group processes discloses that groups may use five different modes to carry out the activity, namely an additive, disjunctive, conjunctive, compensatory or complementary way.

Well-known group processes are inserted in the model. All processes comprise a combination of what group members are doing and how they are doing it. For example, the first cell at the top left of the model, entitled tug-of-war, describes a group working on the allotted task (work group) in an additive way. To further elucidate the group processes illustrated and the concepts that characterise each type of process, each cell is described below.
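The two dimensions of the model, what the group is doing (Bion) and how it is doing it (Steiner), can be sketched as a lookup table keyed by both. The Python sketch below is only an illustration: the cell contents are taken from Table 1, while the identifier names and the choice of a dictionary are assumptions made here, not part of the model.

```python
# What the group is doing (Bion's dimension).
MODES = ("work", "dependence", "fight", "flight", "pairing")

# How the group is doing it (Steiner's dimension).
ACTIVITIES = ("additive", "disjunctive", "conjunctive",
              "compensatory", "complementary")

# Filled cells of Table 1: (mode, activity) -> prototypical metaphor.
PERIODIC_SYSTEM = {
    ("work", "additive"): "tug-of-war",
    ("work", "disjunctive"): "expert",
    ("work", "conjunctive"): "mountaineering",
    ("work", "compensatory"): "jury",
    ("work", "complementary"): "anthology",
    ("dependence", "disjunctive"): "leader dependence",
    ("fight", "additive"): "bullying",
    ("fight", "disjunctive"): "troublemaker",
    ("fight", "conjunctive"): "guerrilla",
    ("fight", "compensatory"): "groupthink",
    ("fight", "complementary"): "stoning",
    ("flight", "additive"): "stalling",
    ("flight", "disjunctive"): "slacker",
    ("flight", "conjunctive"): "extended breaks",
    ("flight", "compensatory"): "keeping piecework down",
    ("flight", "complementary"): "coffee party",
    ("pairing", "disjunctive"): "a pair of lovers",
}


def process_for(mode, activity):
    """Return the metaphor for a cell, or None if the cell is empty."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if activity not in ACTIVITIES:
        raise ValueError(f"unknown activity: {activity!r}")
    return PERIODIC_SYSTEM.get((mode, activity))


if __name__ == "__main__":
    print(len(PERIODIC_SYSTEM), "filled cells")
    print(process_for("work", "additive"))
    print(process_for("pairing", "additive"))  # an empty cell
```

Of the 25 possible combinations only 17 are filled, which mirrors the comparison with the periodic system in chemistry, where not every theoretically possible cell holds an element.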

The Work Column

The first column in the model, the work group, discloses that when the group members are focusing on the task they can use five possible modes of co-operation, namely additive, disjunctive, conjunctive, compensatory or complementary. In the column describing the work group, the prototypical metaphors tug-of-war, expert, mountaineering, jury and anthology are given as examples of processes occurring in these task-oriented ways. To further increase the comprehension of these concepts, some examples will be given.

The concept tug-of-war (work group/additive) illustrates group work where all members jointly and at the same moment help each other to accomplish the task. In a study group this type of co-operation might be found in a brainstorming activity. All members of the group participate in the brainstorming by suggesting relevant words. The words are added together into a pooled collection of words, which all group members are free to use.

The expert (work group/disjunctive) is distinguished by the fact that only one member, who is or considers himself or herself to be particularly familiar with the question concerned, takes the lead and controls the work in the group. One example of the expert might be a teacher who shares his or her knowledge with the purpose of teaching the pupils as much as possible. The students, who are jointly focusing on the assignment, ask task-oriented questions, which the teacher answers. Both sides co-operate in the most profitable way to ensure that the students gain as much as possible from the teacher’s expert knowledge.

Expert is a disjunctive concept, and the prototypical metaphors in the disjunctive row of the model differ from the other concepts in that they describe something you are, i.e. a property of individuals, while the other concepts in the model illustrate something you do, i.e. a process. This might be explained by the fact that the disjunctive activities exemplify the behaviour of separate individuals, while the remaining activities demonstrate group behaviours, which involve several individuals. The concept expert is an example of a prototypical metaphor describing properties of individuals. A corresponding process could be ‘develop expert knowledge’, but this would give the misleading impression that the group members are active and not the leader.
Mountaineering illustrates a group process in which a work group performs a conjunctive type of task. All members must jointly climb the mountain before the group has completed the project. Quality, in this case safety, is more important than speed (quantity). The mountain climbers are dependent on the least able climber in the group. In a study group this process might be found when all students are active and participate in the group’s work and, for example, try to solve a problem or generate the best idea.

In a jury (work group/compensatory) the members are regularly requested to state their opinion before a decision is rendered. All judgments are weighted equally and added together before the verdict is announced. The verdict constitutes the mean value of all jury members’ opinions. In a study group this mode of co-operation might be illustrated by a group that is asked to elect a representative. All students are allowed to vote and the members’ average is the group’s decision. The group member with the strongest support will be elected the group’s representative.

Finally, the concept anthology (work group/complementary) exemplifies the model’s last work group approach. Several authors each write a chapter for a book and thereafter all individual contributions are put together into a common volume. Each person is responsible for his or her part but is free to work when and where he or she prefers; in fact, it is not necessary for the writers to meet at all in order to complete the task. The anthology depends on each author taking responsibility for, and accomplishing, his or her well-defined part of the book. All contributions are required and are likely to affect the outcome of the work. This book in your hand is an example of a complementary type of task.



Working with Something Else

The periodic system for the understanding of group processes demonstrates that a group can ‘choose to do something else’ than the prearranged task in four different ways: in a dependent, fight, flight or pairing mode. Remember that this is normal behaviour in all groups. No group can remain in a working condition throughout the entire group work. The columns in the model describing fight and flight groups are, like the work group column, filled with concepts, while the dependent and pairing columns contain only one prototypical metaphor each, revealing a disjunctive condition.

The Dependent Column

In the column for the dependent group only the cell corresponding to the disjunctive type of activity is filled with a concept. The concept leader dependence displays a group lacking in independence towards its leader. The person in charge controls and decides what is needed in the group and the group members do not question the leader’s authority. An educator and his or her students may again serve as an example. In the dependent group the teacher is regarded as an almighty and omnipotent leader, and the pupils are dependent on the teacher’s willingness to share knowledge with them in their education and training. The students act helpless and ‘as if’ the teacher possesses the right answer. In addition, they do not consider themselves as having anything to contribute to their own learning process, but attribute all knowledge and wisdom to the teacher.

The Fight Column

The third column in the model, the fight column, contains prototypical metaphors describing processes or properties belonging to a fight group. This implies that fight activities might occur in five potential modes. The processes inserted in the periodic system for the understanding of group processes are bullying, troublemaker, guerrilla, groupthink and stoning.

Bullying (fight/additive) is probably the easiest process to recognise. Bullying is a violating act against an individual who has difficulty defending himself or herself. It is easy to picture a schoolyard where one poor scapegoat stands in the middle of a circle of schoolmates. Quite a few shout words of abuse at the scapegoat while others just stand there in silence and let this happen. All schoolmates (shouters as well as those who are quiet) join together and at the same time expose the pupil to this violation. The more schoolmates there are in the circle, the more severe the violation and the feeling of vulnerability experienced by the ‘victim’.

The troublemaker (fight/disjunctive) may serve as an example of how one disorderly pupil in a class might sabotage the entire lesson for everyone. (The troublemaker is, just like all the other concepts in the disjunctive row, a prototypical metaphor describing properties of individuals.) In accordance with earlier examples, this is a situation familiar to a school teacher. All educators have probably experienced a classroom situation being disrupted for the whole class because one or two pupils sabotage the lesson or act troublesome.


The next cell exhibits the concept guerrilla (fight/conjunctive). If the guerrillas want to succeed, they depend on the attack being carried out according to the prearranged plan and on everyone performing optimally, i.e. the behaviour of the weakest guerrilla member determines the outcome of the attack. In the same way as guerrillas fight against authorities, students may carry out a fight against their educators. Whether they succeed in their attempt depends on whether all students hold the same opinion and carry out a joint fight. If, for example, the students experience an examination as being too hard and start a common fight in order to lower the limit for the passing grade, it is crucial that all students agree and support each other, or the fight will be in vain. Another example is when all pupils in a school class decide to hand in blank papers in an exam. No one is going to answer any questions; everyone will hand in a blank test. Before the exam takes place all students come to an agreement, but if just one student decides to take the exam and answer the questions, the entire action is lost.

Groupthink (fight/compensatory) may be exemplified by a group of politicians taking poor-quality decisions. An executive political group is often a cohesive group of men participating in long meetings that include a lot of decision-making under time pressure. The group members strive for unanimity and consensus decisions, neglect reality testing and critical thinking, shield themselves from outside information and use deficient decision-making processes. This poses a risk that the politicians overestimate the group’s own competence and therefore make poor-quality decisions. The decision-making process may include voting where every contribution is weighted equally.

The last concept in the fight column is stoning (fight/complementary). The concept is based on the old biblical phrase ‘he who is without sin should throw the first stone’. Several individuals each throw a stone at the enemy in the fight. This is not an unusual situation in an educational group. If the students, for example, are annoyed that their results are not reported in time for their application for a renewed study loan, they may blame the educator (who becomes the enemy). One by one they bring forward arguments blaming the teacher and pleading their own innocence.

The Flight Column

The cells in the flight column are also filled with prototypical metaphors. Flight is the most common way for study groups to escape during work and engage in something other than the prearranged task. The most ordinary way is to ‘fly’ from work for a while by joking or laughing. The group members can ‘fly’ in five different ways and the concepts for these flight processes are stalling, slacker, extended breaks, keeping piecework down and coffee party.

Stalling (flight/additive) occurs with the purpose of postponing or escaping some task. For instance, a group of students who experience a compulsory lecture as unjustified may choose to appear late in class. As the students drip into the hall one by one, the lecture progresses slowly. The more students who turn up late, the further the lecture falls behind schedule, and the more it is thereby shortened. Pupils may also stall if, one by one, they ask for permission to sharpen their pencils, claiming they want to take notes. If the educator wants the students to take any notes, he or she must allow the sharpening of pencils and wait until all pupils have finished the procedure. The class is postponed until everybody in the classroom has functioning writing materials, and if the pupils are highly successful the class will come to an end before the work is done.

The concept slacker (flight/disjunctive) describes how one group member slips away from work and responsibility. This may also depict a property of individuals known as being a free-rider. A free-rider in group work is a pupil who uses the group’s common resources without making any contribution himself or herself.

Extended breaks (flight/conjunctive) describes the situation when all group members join each other in a common flight from work, for example through an extended lunch break. In group work this might be seen when members take overly long pauses and lunch breaks or in other ways reduce the time allotted to the task. Socialising and being together are more important than returning to work.

The last but one concept in the flight column is keeping piecework down (flight/compensatory), which may be depicted by a working team with an agreement on a certain level of piecework. The group’s ‘lower’ level of piecework is set at what is, for the group, just the right level, starting from the team’s average working pace. Nobody exceeds this level even if they could. In a study group this behaviour might be seen when the work produced is lower than the group’s actual ability. Even if one of the pupils has the capacity for higher performance, he or she will not outdo the others. Everybody keeps within the predetermined common working level. This kind of behaviour is easily connected to social loafing, which is a tendency in individuals working in a group to work less hard when they know others are working on the same task. The difference is that keeping piecework down does not imply that the students loaf about, occupied with inessential things, but that each pupil produces exactly as much as the group has decided on, even if he or she could do better.
Finally, the last prototypical metaphor in the flight column is coffee party (flight/complementary). At a coffee party each attendant contributes a piece of gossip. The various contributions need not really be connected, but each constitutes one small part of the topic of conversation during the coffee party. Together the different contributions provide the group’s common topic of discussion throughout the coffee party. A potluck, where each person brings a dish to the pooled meal, is another useful example of this mode of group process. A corresponding process might appear in group work when one student begins to talk about something else, for instance about how the last group work degenerated. One by one, the other group members throw in information about previous groups and group work. One and all contribute to the common flight by talking about their respective experiences of earlier group work.

The Pairing Column

In the last column there is only one concept inserted, a pair of lovers (pairing/disjunctive). The pair of lovers is foremost interested in themselves. Now that they have found each other, everything will turn out right and the group will survive through reproduction. In group work this process may occur when two members, not necessarily of different sexes, dominate a non-productive activity. The pair might in a familiar way discuss ‘a better future’ for the group, often founded on fantasies or wishes: ‘If only we do like this … everything will be better’. Sexual allusions or private insinuations unknown to the other group members frequently occur within the pair of lovers.


If group work gets stuck in a non-working condition, that might be a sign that something in the working situation creates anxiety. This state of anxiety might be caused by too-difficult tasks, sensitive questions, too little time, or a lack of similarities or supervision. It may also result from the socio-emotional aspects in the group being allowed to take up a great deal of time. Therefore it is important to search for, understand and attend to the reasons why the group is in a state of anxiety. On the other hand, it might be wise to consider the beneficial aspects of the non-task-oriented activities (Granström, 1986; Hammar Chiriac, 2003, 2008). The dependent, fight, flight and pairing approaches may serve the group’s purpose and relieve the pressure on the group. Shorter periods of non-productive behaviour may, under certain circumstances, be positive and facilitate the group’s work and process.

CONCLUSION

To conclude, the periodic system for the understanding of group processes demonstrates a contemporary approach to categorising group processes. The model addresses the issue of group processes connected to the type of task and activity in the group, in other words: What do the group members do and how do they do it? Well-known group processes or properties of individuals illustrate the various group activities. The model supports the implication that certain working modes are better suited to some parts of the group work than others, and that group work is not one, but a great many processes.

SUMMARY

Group work is not just one single activity but comprises several activities with different goals and conditions. Such different conditions emanate from different dynamic processes that may either facilitate or hamper the group’s work. This chapter addresses the issue of group processes connected to the type of task and activity in the group, in other words: What do the group members do and how do they do it? A contemporary approach to categorising group processes has been presented, namely the periodic system for the understanding of group processes, together with illustrative applications of the concepts inserted in the model. The model shows that a combination of Steiner’s theory (1972) and Bion’s theory (1961) may be one possible way to give a comprehensive and descriptive picture of group processes, and that it can also be a practical tool for describing and interpreting what happens in groups during work.

REFERENCES

Armelius, B-Å., and Armelius, K. (1985). Group personality, task and group culture. In M. Pines (Ed.), Bion and group psychotherapy (pp. 255–273). London: Tavistock/Routledge.
Barkley, E.F., Cross, K.P., and Major, C.H. (2005). Collaborative learning techniques. San Francisco: Jossey-Bass.
Bion, W.R. (1961). Experiences in groups. London: Tavistock Publications Limited.
Bion Talamo, P., Borgogno, F., and Merciai, A.A. (Eds.) (1998). Bion’s legacy to groups. London: Karnac Books.
Boalt Boëthius, S. (1983). Autonomy, coping and defense in small work groups. Stockholm: Almqvist and Wiksell International.
Brown, R. (2000). Group processes. Dynamics within and between groups. Oxford: Blackwell Publishers.
Forsyth, D.R. (2005). Group Dynamics. Belmont: Wadsworth Publishing Company.
Granström, K. (1986). Dynamics in meetings. On leadership and followership in ordinary meetings in different organizations. Linköping: Linköping Studies in Art and Science.
Hammar Chiriac, E. (2008). A scheme for understanding group processes in problem-based learning. Higher Education, 55, 505–518.
Hammar Chiriac, E. (2003). Grupprocesser i utbildning. En studie av gruppers dynamik vid problembaserat lärande. [Group processes in education. On dynamics in problem-based learning.] Linköping: Linköping Studies in Education and Psychology.
Hammar Chiriac, E. (2001). Group processes in problem-based learning. Paper presented at the oral symposium Whatever happened to the group dynamics in problem based learning tutorial groups?, at the VIIth European Congress of Psychology, 1–6 July 2001, London.
Hammar Chiriac, E. (2000). Group processes in PBL tutorial groups. Paper presented at the Second International Conference on Problem-Based Learning in Higher Education, 17–20 September 2000, Linköping University, Sweden.
Hammar Chiriac, E., and Granström, K. (2008). Prerequisites for meaningful group work – Students’ experiences of co-operation. In S. Jern and J. Näslund (Eds.), Proceedings from the 6th Nordic Conference on Group and Social Psychology, Lund University, May 2008.
Hammar Chiriac, E., and Granström, K. (in progress). Students’ experiences of group work in school. [Manuscript to be submitted to an international journal].
Hammar Chiriac, E., and Hempel, A. (Eds.) (2008). Handbok för grupparbete – att skapa fungerande grupparbete i undervisning. [Handbook for group work – to establish well functioning group work in education.] Lund: Studentlitteratur.
Karterud, S. (1989). A study of Bion’s basic assumption groups. Human Relations, 42, 315–335.
Lotan, R.A. (2006). Managing groupwork in the heterogeneous classroom. In C.M. Evertson and C.S. Weinstein (Eds.), Handbook of classroom management. Research, practice, and contemporary issues (pp. 525–540). Mahwah: Lawrence Erlbaum Associates, Publisher.
Sjøvold, E. (1995). Groups in harmony and tension. The development of an analysis of polarization in groups and organization based on SYMLOG-method. University of Trondheim, Department of Organisation and Work Science.
Sjövold, E. (1998). Group and organizational culture from the viewpoint of polarization. In K. Granström (Ed.), Small group studies. Proceedings from a conference on group and social psychology, Linköping University, May 1998. Linköping University, Skapande Vetande.
Steiner, I.D. (1966). Models for inferring relationship between group size and potential productivity. Behavioural Science, 11, 273–283.
Steiner, I.D. (1972). Group process and productivity. New York: Academic Press.
Steiner, I.D. (1976). Task-performing groups. In J.W. Thibaut, J.T. Spence, and R.C. Carson (Eds.), Contemporary topics in social psychology (pp. 393–422). Morristown, N.J.: General Learning Press.


ISBN 978-1-60876-175-3 © 2010 Nova Science Publishers, Inc.

Chapter 6

THE CONTINUOUS SHEARLET TRANSFORM IN HIGHER DIMENSIONS: VARIATIONS OF A THEME

Stephan Dahlke¹, Gabriele Steidl² and Gerd Teschke³‡

¹ Philipps-Universität Marburg, FB12 Mathematik und Informatik, Hans-Meerwein Straße, Lahnberge, 35032 Marburg, Germany
² Universität Mannheim, Fakultät für Mathematik und Informatik, Institut für Mathematik, 68131 Mannheim, Germany
³ Hochschule Neubrandenburg - University of Applied Sciences, Institute for Computational Mathematics in Science and Technology, Brodaer Str. 2, 17033 Neubrandenburg, Germany

Key Words: Shearlets, Lie groups, square-integrable group representations.
2000 Subject Classification: 22D10, 42C15, 46E35, 47B25.

Abstract

This note is concerned with the generalization of the continuous shearlet transform to higher dimensions. Quite recently, a first approach has been derived in [4]. We present an alternative version which deviates from [4] mainly by a different generalization of the shear component. It turns out that the resulting integral transform is again associated with a square-integrable group representation.

1. Introduction

Modern technology allows for easy creation, transmission and storage of huge amounts of data. Confronted with a flood of data, such as internet traffic, or audio and video applications, nowadays the key problem is to extract the relevant information from these sets. To this end, usually the first step is to decompose the signal with respect to suitable building blocks which are well-suited for the specific application and allow a fast and efficient extraction. In this context, one particular problem which is currently in the center of interest is the analysis of directional information. In recent studies, several approaches have been suggested such as ridgelets [1], curvelets [2], contourlets [5], shearlets [11] and many others. For a general approach see also [10]. Among all these approaches, the shearlet transform stands out because it is related to group theory, i.e., this transform can be derived from a square-integrable representation $\pi : S \to U(L^2(\mathbb{R}^2))$ of a certain group $S$, the so-called shearlet group, see [3]. Therefore, in the context of the shearlet transform, all the powerful tools of group representation theory can be exploited. So far, the shearlet transform is well developed for problems in $\mathbb{R}^2$. However, for analyzing higher-dimensional data, there is clearly an urgent need for further generalization. In [4], a first approach in this direction has been presented. Similar to the two-dimensional case, the approach outlined there is based on translations, anisotropic dilations and specific shear matrices. It has been shown that the associated integral transform originates from a square-integrable representation of a group, the full $n$-variate shearlet group. Moreover, a very useful link to the important coorbit space theory developed by Feichtinger and Gröchenig [6, 7, 8] has been established and the potential to detect singularities has been demonstrated. In this note, we want to present a slightly different approach. It deviates from [4] mainly by the choice of the shear component. Instead of the block form used in [4], we work here with a suitable subgroup of Toeplitz matrices of the group of upper triangular matrices. Moreover, in contrast to the anisotropic (parabolic) dilation employed in [4], we restrict ourselves to isotropic dilations here. This setting yields a different generalization of the continuous shearlet transform. Nevertheless, similar to [4], the associated integral transform stems from a square-integrable group representation of a specific group, so that again all the powerful tools of group representation theory can be used. This note is organized as follows. In Section 2, we establish our new underlying group and compute the associated Haar measures. Then, in Section 3, we show that this group indeed possesses a strongly continuous representation in $L^2(\mathbb{R}^n)$ which is moreover square integrable.

‡ G.T. gratefully acknowledges partial support by Deutsche Forschungsgemeinschaft Grants TE 354/5-1 and TE 354/4-1.

2. The Group Structure

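In this section, we introduce a new version of the shearlet transform on $L^2(\mathbb{R}^n)$. This requires the generalization of the shear matrix. For $a \in \mathbb{R}^* := \mathbb{R} \setminus \{0\}$ and $s \in \mathbb{R}^{n-1}$, we set
\[
A_a = \begin{pmatrix} a & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & a \end{pmatrix} = a I_n
\quad\text{and}\quad
S_s = \begin{pmatrix}
1 & s_1 & s_2 & \cdots & s_{n-1} \\
0 & 1 & s_1 & \cdots & s_{n-2} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & s_1 \\
0 & \cdots & \cdots & 0 & 1
\end{pmatrix}.
\]

Lemma 2.1 The set $\mathbb{R}^* \times \mathbb{R}^{n-1} \times \mathbb{R}^n$ endowed with the operation
\[
(a, s, t) \circ (a', s', t') = \big(aa',\ [S_{s'}^T S_s^T]_1,\ t + A_a S_s t'\big),
\]
where the bracket operation $[\cdot]_1$ extracts the last $n-1$ elements of the first column, is a locally compact group $S$. The left and right Haar measures on $S$ are given by
\[
d\mu_l(a, s, t) = \frac{1}{|a|^{n+1}}\, da\, ds\, dt
\quad\text{and}\quad
d\mu_r(a, s, t) = \frac{1}{|a|}\, da\, ds\, dt.
\]

A direct computation shows that $e := (1, 0, 0)$ is the neutral element in $S$. Let us denote $S_s^{-T} = (S_s^{-1})^T$ and observe that $S_{[S_s^T]_1} = S_s$. The inverse of $(a, s, t) \in \mathbb{R}^* \times \mathbb{R}^{n-1} \times \mathbb{R}^n$ is given by
\[
(a, s, t)^{-1} = \Big(\tfrac{1}{a},\ [S_s^{-T}]_1,\ -S_s^{-1} A_{\frac{1}{a}} t\Big),
\]
since
\[
(a, s, t) \circ \Big(\tfrac{1}{a},\ [S_s^{-T}]_1,\ -S_s^{-1} A_{\frac{1}{a}} t\Big)
= \Big(a \tfrac{1}{a},\ [S_s^{-T} S_s^T]_1,\ -A_a S_s S_s^{-1} A_{\frac{1}{a}} t + t\Big)
= (1, 0, 0).
\]
Note that by induction arguments one easily verifies that $S_s^{-1}$ is again of Toeplitz type, as is $S_s$. Furthermore, the multiplication is associative. With the observation that for $s, r \in \mathbb{R}^{n-1}$ and $a \in \mathbb{R}^*$
\[
S^T_{[S_s^T S_r^T]_1} = S_s^T S_r^T
\quad\text{and}\quad
A_a S_s = S_s A_a,
\]
we have
\begin{align*}
\big((a, s, t) \circ (a', s', t')\big) \circ (\tilde a, \tilde s, \tilde t)
&= \big(aa',\ [S_{s'}^T S_s^T]_1,\ t + A_a S_s t'\big) \circ (\tilde a, \tilde s, \tilde t) \\
&= \big(aa'\tilde a,\ [S_{\tilde s}^T S_{s'}^T S_s^T]_1,\ S_{[S_{s'}^T S_s^T]_1} A_{aa'} \tilde t + t + A_a S_s t'\big) \\
&= \big(aa'\tilde a,\ [S_{\tilde s}^T S_{s'}^T S_s^T]_1,\ S_s S_{s'} A_a A_{a'} \tilde t + A_a S_s t' + t\big) \\
&= \big(aa'\tilde a,\ [S_{\tilde s}^T S_{s'}^T S_s^T]_1,\ A_a S_s (A_{a'} S_{s'} \tilde t + t') + t\big) \\
&= (a, s, t) \circ \big(a'\tilde a,\ [S_{\tilde s}^T S_{s'}^T]_1,\ A_{a'} S_{s'} \tilde t + t'\big) \\
&= (a, s, t) \circ \big((a', s', t') \circ (\tilde a, \tilde s, \tilde t)\big).
\end{align*}
It remains to compute the Haar measures. Observing that the vector $[S_s^T S_{s'}^T]_1$ can be seen as a translation of $s$, we have for a function $F$ on $S$
\begin{align*}
\int_{\mathbb{R}^*} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}^n} & F\big(a'a,\ [S_s^T S_{s'}^T]_1,\ t' + A_{a'} S_{s'} t\big)\, dt\, ds\, \frac{da}{|a|^{n+1}} \\
&= \int_{\mathbb{R}^*} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}^n} F\big(a'a,\ [S_s^T S_{s'}^T]_1,\ t\big)\, \frac{dt}{|\det(A_{a'} S_{s'})|}\, ds\, \frac{da}{|a|^{n+1}} \\
&= \int_{\mathbb{R}^*} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}^n} F(a'a, s, t)\, \frac{dt}{|a'|^n}\, ds\, \frac{da}{|a|^{n+1}} \\
&= \int_{\mathbb{R}^*} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}^n} F(\tilde a, s, t)\, \frac{dt}{|a'|^n}\, ds\, \frac{d\tilde a}{|a'|\, |\tilde a / a'|^{n+1}} \\
&= \int_{\mathbb{R}^*} \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}^n} F(\tilde a, s, t)\, dt\, ds\, \frac{d\tilde a}{|\tilde a|^{n+1}},
\end{align*}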

so that $d\mu_l$ is indeed the left Haar measure on $S$. Similarly we can verify that $d\mu_r$ is the right Haar measure on $S$. In the following, we will use only the left Haar measure and use the abbreviation $d\mu = d\mu_l$.
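The group law of Lemma 2.1 can be checked on concrete elements. The following is only a minimal sketch for $n = 3$ in exact rational arithmetic; the helper names `shear`, `compose` and `inverse` and the sample elements are illustrative assumptions, not part of the original note. It uses the fact that $[S_{s'}^T S_s^T]_1$ is the first row of $S_s S_{s'}$ without its leading 1, and it obtains the shear parameter of $S_s^{-1}$ by solving $S_s S_r = I_n$ row-wise.

```python
from fractions import Fraction

def shear(s):
    """Upper triangular Toeplitz matrix S_s with first row (1, s_1, ..., s_{n-1})."""
    row = [Fraction(1)] + [Fraction(x) for x in s]
    n = len(row)
    return [[row[j - i] if j >= i else Fraction(0) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def compose(g, h):
    """(a, s, t) o (a', s', t') = (a a', [S_{s'}^T S_s^T]_1, t + A_a S_s t')."""
    (a, s, t), (a2, s2, t2) = g, h
    S, S2 = shear(s), shear(s2)
    n = len(t)
    # first row of S_s S_{s'}; dropping the leading 1 gives [S_{s'}^T S_s^T]_1
    first_row = [sum(S[0][k] * S2[k][j] for k in range(n)) for j in range(n)]
    shifted = matvec(S, t2)                      # S_s t'
    new_t = tuple(t[i] + a * shifted[i] for i in range(n))  # A_a = a * I_n
    return (a * a2, tuple(first_row[1:]), new_t)

def inverse(g):
    """(a, s, t)^{-1} = (1/a, [S_s^{-T}]_1, -S_s^{-1} A_{1/a} t)."""
    a, s, t = g
    row = [Fraction(1)] + [Fraction(x) for x in s]
    r = [Fraction(1)]
    for k in range(1, len(row)):                 # solve S_s S_r = I for r
        r.append(-sum(row[j] * r[k - j] for j in range(1, k + 1)))
    r = tuple(r[1:])
    a = Fraction(a)
    return (1 / a, r, tuple(-x / a for x in matvec(shear(r), t)))

# Sample elements (arbitrary choices for illustration)
e = (Fraction(1), (0, 0), (0, 0, 0))
g = (Fraction(2), (1, 2), (1, 0, -1))
h = (Fraction(-3), (0, 1), (2, 1, 1))
k = (Fraction(1, 2), (3, -1), (0, 5, 2))

print(compose(compose(g, h), k) == compose(g, compose(h, k)))
print(compose(g, inverse(g)) == e)
```

Because the arithmetic is exact, associativity and the inverse formula hold as equalities here rather than up to rounding; this illustrates the lemma on a few elements but of course does not replace its proof.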

3. The Representation

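For $f \in L^2(\mathbb{R}^n)$ we define for $(a, s, t) \in S$
\[
(\pi(a, s, t) f)(x) = f_{a,s,t}(x) := |a|^{-n/2} f\big(A_a^{-1} S_s^{-1} (x - t)\big). \tag{1}
\]
It is easy to check that $\pi : S \to U(L^2(\mathbb{R}^n))$ is a mapping from $S$ into the group $U(L^2(\mathbb{R}^n))$ of unitary operators on $L^2(\mathbb{R}^n)$. The Fourier transform of $f_{a,s,t}$ is given by
\begin{align}
(\hat\pi(a, s, t) \hat f)(\omega)
&= |a|^{-n/2} \big(f(A_a^{-1} S_s^{-1} (\cdot - t))\big)^{\wedge}(\omega) \notag \\
&= |a|^{-n/2}\, |\det(A_a^{-1} S_s^{-1})|^{-1}\, \hat f(S_s^T A_a^T \omega)\, e^{-2\pi i \langle t, \omega \rangle} \notag \\
&= |a|^{n/2}\, \hat f(S_s^T A_a^T \omega)\, e^{-2\pi i \langle t, \omega \rangle}. \tag{2}
\end{align}
Recall that a unitary representation of a locally compact group $G$ with the left Haar measure $\mu$ on a Hilbert space $H$ is a homomorphism $\pi$ from $G$ into the group of unitary operators $U(H)$ on $H$ which is continuous with respect to the strong operator topology.

Lemma 3.1 The mapping $\pi$ defined by (1) is a unitary representation of $S$.

Note that the representations $\pi$ and $\hat\pi$ are equivalent. Let $\psi \in L^2(\mathbb{R}^n)$, $\omega \in \mathbb{R}^n$, and $(a, s, t), (a', s', t') \in S$. Therefore, we obtain
\begin{align*}
\hat\pi(a, s, t) \circ \hat\pi(a', s', t') \psi(\omega)
&= \hat\pi(a, s, t) \big(\hat\pi(a', s', t') \psi\big)(\omega) \\
&= |a|^{n/2} \big(\hat\pi(a', s', t') \psi\big)(S_s^T A_a^T \omega)\, e^{-2\pi i \langle t, \omega \rangle} \\
&= |a|^{n/2} |a'|^{n/2}\, \psi(S_{s'}^T A_{a'}^T S_s^T A_a^T \omega)\, e^{-2\pi i \langle t', S_s^T A_a^T \omega \rangle}\, e^{-2\pi i \langle t, \omega \rangle} \\
&= |aa'|^{n/2}\, \psi(S_{s'}^T S_s^T A_{aa'}^T \omega)\, e^{-2\pi i \langle A_a S_s t' + t, \omega \rangle} \\
&= \hat\pi\big((a, s, t) \circ (a', s', t')\big) \psi(\omega).
\end{align*}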
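The homomorphism property just computed can also be spot-checked numerically on the Fourier side. The sketch below assumes $n = 3$ and uses floating point arithmetic; the function names `pi_hat` and `compose`, the test function `psi` (a smooth, rapidly decaying function chosen only for illustration, not an admissible shearlet) and the sample group elements are all hypothetical.

```python
import cmath
import math

def shear(s):
    """Upper triangular Toeplitz matrix S_s with first row (1, s_1, ..., s_{n-1})."""
    row = [1.0] + list(s)
    n = len(row)
    return [[row[j - i] if j >= i else 0.0 for j in range(n)] for i in range(n)]

def compose(g, h):
    """Group law (a, s, t) o (a', s', t') = (a a', [S_{s'}^T S_s^T]_1, t + A_a S_s t')."""
    (a, s, t), (a2, s2, t2) = g, h
    S, S2 = shear(s), shear(s2)
    n = len(t)
    first_row = [sum(S[0][k] * S2[k][j] for k in range(n)) for j in range(n)]
    new_t = [t[i] + a * sum(S[i][k] * t2[k] for k in range(n)) for i in range(n)]
    return (a * a2, first_row[1:], new_t)

def pi_hat(g, psi):
    """(pi_hat(a,s,t) psi)(w) = |a|^{n/2} psi(S_s^T A_a^T w) exp(-2 pi i <t, w>)."""
    a, s, t = g
    S = shear(s)
    n = len(t)
    def transformed(w):
        # S_s^T A_a^T w = a * S_s^T w, because A_a = a * I_n
        arg = [a * sum(S[k][i] * w[k] for k in range(n)) for i in range(n)]
        phase = cmath.exp(-2j * math.pi * sum(ti * wi for ti, wi in zip(t, w)))
        return abs(a) ** (n / 2) * psi(arg) * phase
    return transformed

def psi(w):
    # hypothetical test function, used only to exercise the formula
    return math.exp(-sum(x * x for x in w)) * (1 + w[0] + 0.5 * w[1] * w[2])

g = (0.5, [0.3, -0.2], [0.1, 0.2, -0.3])
h = (-2.0, [1.0, 0.4], [0.7, -0.5, 0.2])
w = [0.4, -0.7, 0.2]

lhs = pi_hat(g, pi_hat(h, psi))(w)      # pi_hat(g) applied to pi_hat(h) psi
rhs = pi_hat(compose(g, h), psi)(w)     # pi_hat(g o h) psi
print(abs(lhs - rhs))
```

The two values agree up to rounding error at the sampled frequency, which is what the chain of equalities above predicts.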

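A nontrivial function $\psi \in L^2(\mathbb{R}^n)$ is called admissible, if 0
1 the shearlet is defined by $\hat\psi(\omega) = \hat\psi_1(\omega_1)\, \hat\psi_2(\tilde\omega / \omega_1^k)$, where $\operatorname{supp} \hat\psi_1 \subseteq [-a_1, -a_0] \cup [a_0, a_1]$ for some $a_1 > a_0 \ge \alpha > 0$ and $\operatorname{supp} \hat\psi_2 \subseteq ([-b_1, -b_0] \cup [b_0, b_1])^{n-1}$ for $b_1 > b_0 \ge \beta > 0$. Furthermore, let $\tilde S$ denote the shear matrix generated by $(s_1, \dots, s_{n-2})$ and let $\bar s = \tilde S^{-T} s$. If
\[
(\bar s_m, \dots, \bar s_{n-1}) = (-1, \bar s_1, \dots, \bar s_{m-1})\, P
\quad\text{and}\quad
(t_1, \dots, t_m) = -(t_{m+1}, \dots, t_n)\, P^T,
\]
then
\[
\mathcal{SH}_\psi \nu_m(a, s, t) \sim |a|^{\frac{n}{2} - m} \quad \text{as } a \to 0. \tag{6}
\]
Otherwise, the shearlet transform $\mathcal{SH}_\psi \nu_m$ decays rapidly as $a \to 0$.

The support condition on $\hat\psi_1$ and $\hat\psi_2$ can be relaxed toward a rapid decay of the functions. An application of Plancherel's theorem for tempered distributions yields
\begin{align*}
\mathcal{SH}_\psi \nu_m(a, s, t) := \langle \nu_m, \psi_{a,s,t} \rangle
&= \langle \hat\nu_m, \hat\psi_{a,s,t} \rangle \\
&= |a|^{\frac{n}{2}} \int_{\mathbb{R}^m} e^{2\pi i \langle t_A + P t_E,\, \omega_A \rangle}\,
\bar{\hat\psi}\left( S_s^T A_a^T \begin{pmatrix} \omega_A \\ P^T \omega_A \end{pmatrix} \right) d\omega_A
\end{align*}
with $\tilde\omega_A = (\omega_2, \dots, \omega_m)^T$. Rewriting
\[
S_s^T A_a^T \begin{pmatrix} \omega_A \\ P^T \omega_A \end{pmatrix}
= \begin{pmatrix}
a \omega_1 \\[2pt]
a \omega_1 \begin{pmatrix} s_1 \\ \vdots \\ s_{n-1} \end{pmatrix}
+ a \tilde S^T \begin{pmatrix} \tilde\omega_A \\ P^T \omega_A \end{pmatrix}
\end{pmatrix},
\]
where $\tilde S$ is the $(n-1) \times (n-1)$ submatrix of $S_s$, which is also a shear matrix generated by $(s_1, \dots, s_{n-2})$, it follows by the definition of $\hat\psi$ that
\[
\mathcal{SH}_\psi \nu_m(a, s, t) = |a|^{\frac{n}{2}} \int_{\mathbb{R}^m} e^{2\pi i \langle t_A + P t_E,\, \omega_A \rangle}\,
\bar{\hat\psi}_1(a \omega_1)\,
\bar{\hat\psi}_2\left( a^{1-k} \Big( \omega_1^{1-k} s + \omega_1^{-k} \tilde S^T \begin{pmatrix} \tilde\omega_A \\ P^T \omega_A \end{pmatrix} \Big) \right) d\omega_A.
\]
Substituting $\tilde\xi_A = (\xi_2, \dots, \xi_m)^T := \tilde\omega_A / \omega_1^k$, i.e., $d\tilde\omega_A = |\omega_1|^{k(m-1)}\, d\tilde\xi_A$, we get
\begin{align*}
\mathcal{SH}_\psi \nu_m(a, s, t) = |a|^{\frac{n}{2}} \int_{\mathbb{R}^m} & e^{2\pi i \langle t_A + P t_E,\, (\omega_1,\, \omega_1^k \tilde\xi_A^T)^T \rangle}\,
\bar{\hat\psi}_1(a \omega_1)\, |\omega_1|^{k(m-1)} \\
& \times \bar{\hat\psi}_2\left( \tilde S^T \Big( a^{1-k} \omega_1^{1-k} \tilde S^{-T} s + a^{1-k} \begin{pmatrix} \tilde\xi_A \\ P^T (\omega_1^{1-k}, \tilde\xi_A^T)^T \end{pmatrix} \Big) \right) d\tilde\xi_A\, d\omega_1
\end{align*}
and by setting $\bar s = \tilde S^{-T} s$, $\bar s_a = (\bar s_1, \dots, \bar s_{m-1})^T$, and $\bar s_e = (\bar s_m, \dots, \bar s_{n-1})^T$ we obtain by substituting $\tilde\omega_A := a^{1-k} (\omega_1^{1-k} \bar s_a + \tilde\xi_A)$
\begin{align*}
\mathcal{SH}_\psi \nu_m(a, s, t) = |a|^{\frac{n}{2} + (m-1)(k-1)} \int_{\mathbb{R}^m} & e^{2\pi i \langle t_A + P t_E,\, (\omega_1,\, (\omega_1^k / a^{1-k})\, \tilde\omega_A^T - \omega_1 \bar s_a^T)^T \rangle}\,
\bar{\hat\psi}_1(a \omega_1)\, |\omega_1|^{k(m-1)} \\
& \times \bar{\hat\psi}_2\left( \tilde S^T \begin{pmatrix} \tilde\omega_A \\ a^{1-k} \omega_1^{1-k} \Big( \bar s_e + P^T \begin{pmatrix} 1 \\ -\bar s_a \end{pmatrix} \Big) + P^T \begin{pmatrix} 0 \\ \tilde\omega_A \end{pmatrix} \end{pmatrix} \right) d\tilde\omega_A\, d\omega_1.
\end{align*}
If the vector
\[
\bar s_e - P^T \begin{pmatrix} -1 \\ \bar s_a \end{pmatrix} \ne 0_{n-m} \tag{7}
\]
then at least one component of its product with $a^{1-k}$ becomes arbitrarily large as $a \to 0$. By the support property of $\hat\psi_2$, we conclude that $\hat\psi_2(\tilde S^T (\tilde\omega_A, \cdot)^T)$ becomes zero if $\tilde\omega_A$ is not in $([-b_1, -b_0] \cup [b_0, b_1])^{m-1} \subset \mathbb{R}^{m-1}$. But for all $\tilde\omega_A \in ([-b_1, -b_0] \cup [b_0, b_1])^{m-1}$ at least one component of
\[
a^{1-k} \omega_1^{1-k} \Big( \bar s_e - P^T \begin{pmatrix} -1 \\ \bar s_a \end{pmatrix} \Big) + P^T \begin{pmatrix} 0 \\ \tilde\omega_A \end{pmatrix}
\]
is not within the support of $\hat\psi_2$ for $a$ sufficiently small, so that $\hat\psi_2$ becomes zero again. Assume now that we have equality in (7). Then
\begin{align*}
\mathcal{SH}_\psi \nu_m(a, s, t)
&= |a|^{\frac{n}{2} + (m-1)(k-1)} \int_{\mathbb{R}^m} e^{2\pi i \langle t_A + P t_E,\, (\omega_1,\, (\omega_1^k / a^{1-k})\, \tilde\omega_A^T - \omega_1 \bar s_a^T)^T \rangle}\,
\bar{\hat\psi}_1(a \omega_1)\, |\omega_1|^{k(m-1)} \\
&\qquad \times \bar{\hat\psi}_2\left( \tilde S^T \begin{pmatrix} \tilde\omega_A \\ P^T \begin{pmatrix} 0 \\ \tilde\omega_A \end{pmatrix} \end{pmatrix} \right) d\tilde\omega_A\, d\omega_1 \\
&= |a|^{\frac{n}{2} - m} \int_{\mathbb{R}^{m-1}} \int_{\mathbb{R}} e^{2\pi i \frac{\xi_1}{a} \langle t_A + P t_E,\, (1,\, \xi_1^{k-1} \tilde\omega_A^T - \bar s_a^T)^T \rangle}\,
\bar{\hat\psi}_1(\xi_1)\, |\xi_1|^{k(m-1)}\, d\xi_1 \\
&\qquad \times \bar{\hat\psi}_2\left( \tilde S^T \begin{pmatrix} \tilde\omega_A \\ P^T \begin{pmatrix} 0 \\ \tilde\omega_A \end{pmatrix} \end{pmatrix} \right) d\tilde\omega_A.
\end{align*}
Consequently, for $t_A + P t_E = 0$, we have
\[
\mathcal{SH}_\psi \nu_m(a, s, t) = |a|^{\frac{n}{2} - m} \int_{\mathbb{R}} \bar{\hat\psi}_1(\xi_1)\, |\xi_1|^{k(m-1)}\, d\xi_1
\int_{\mathbb{R}^{m-1}} \bar{\hat\psi}_2\left( \tilde S^T \begin{pmatrix} \tilde\omega_A \\ P^T \begin{pmatrix} 0 \\ \tilde\omega_A \end{pmatrix} \end{pmatrix} \right) d\tilde\omega_A
\sim |a|^{\frac{n}{2} - m}.
\]
For $t_A + P t_E \ne 0$, consider
\[
\tilde\psi_{a, \tilde\omega_A}(\xi_1) := e^{2\pi i \langle t_A + P t_E,\, (\xi_1,\, \xi_1^k \tilde\omega_A^T)^T \rangle / a}\, \bar{\hat\psi}_1(\xi_1)\, |\xi_1|^{k(m-1)},
\]
which is for all $\tilde\omega_A$ and $a$ again in $C^\infty$. Due to the support property of $\hat\psi_2$, the integration with respect to $\tilde\omega_A$ is over some finite domain. Therefore, $\lim_{a \to 0} \hat{\tilde\psi}_{a, \tilde\omega_A}\big(\langle t_A + P t_E, (-1, \bar s_a^T)^T \rangle / a\big)$ is uniform in $\tilde\omega_A$. Hence, limit and integration can be exchanged, and since $\hat{\tilde\psi}_{a, \tilde\omega_A}$ is a rapidly decaying function, the limit for $a \to 0$ is zero, implying that $\mathcal{SH}_\psi \nu_m(a, s, t)$ decays rapidly as well. (Since the parameter $a$ in the definition of $\tilde\psi_{a, \tilde\omega_A}$ only appears in the exponential term, general results on the Fourier transform imply that the dependency on $a$ does not affect these arguments, see, e.g., [9], proof of Corollary 8.23 for details.)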

References

[1] E. J. Candès and D. L. Donoho, Ridgelets: a key to higher-dimensional intermittency?, Phil. Trans. R. Soc. Lond. A. 357 (1999), 2495–2509.
[2] E. J. Candès and D. L. Donoho, Curvelets - A surprisingly effective nonadaptive representation for objects with edges, in Curves and Surfaces, L. L. Schumaker et al., eds., Vanderbilt University Press, Nashville, TN (1999).
[3] S. Dahlke, G. Kutyniok, P. Maass, C. Sagiv, H.-G. Stark, and G. Teschke, The uncertainty principle associated with the continuous shearlet transform, Int. J. Wavelets Multiresolut. Inf. Process. 6 (2008), 157–181.
[4] S. Dahlke, G. Steidl, and G. Teschke, The continuous shearlet transform in arbitrary space dimensions, Preprint Nr. 2008-7, Philipps-University of Marburg, 2008, to appear in: Appl. Comput. Harmon. Anal.
[5] M. N. Do and M. Vetterli, The contourlet transform: an efficient directional multiresolution image representation, IEEE Trans. Image Process. 14(12) (2005), 2091–2106.
[6] H. G. Feichtinger and K. Gröchenig, A unified approach to atomic decompositions via integrable group representations, Proc. Conf. “Function Spaces and Applications”, Lund 1986, Lecture Notes in Math. 1302 (1988), 52–73.
[7] H. G. Feichtinger and K. Gröchenig, Banach spaces related to integrable group representations and their atomic decomposition I, J. Funct. Anal. 86 (1989), 307–340.
[8] H. G. Feichtinger and K. Gröchenig, Banach spaces related to integrable group representations and their atomic decomposition II, Monatsh. Math. 108 (1989), 129–148.
[9] G. B. Folland, Real Analysis, John Wiley & Sons, (1999).
[10] K. Guo, W. Lim, D. Labate, G. Weiss, and E. Wilson, Wavelets with composite dilations and their MRA properties, Appl. Comput. Harmon. Anal. 20 (2006), 220–236.
[11] K. Guo, G. Kutyniok, and D. Labate, Sparse multidimensional representations using anisotropic dilation and shear operators, in Wavelets and Splines (Athens, GA, 2005), G. Chen and M. J. Lai, eds., Nashboro Press, Nashville, TN (2006), 189–201.





Chapter 7

EXCEPTIONAL GROUPS, SYMMETRIC SPACES AND APPLICATIONS

Sergio L. Cacciatori¹ and B. L. Cerchiai²

¹ Dipartimento di Fisica e Matematica, Università dell’Insubria Milano, 22100 Como, Italy, and I.N.F.N., sezione di Milano, Italy.
² Lawrence Berkeley National Laboratory, Theory Group, Bldg 50A5104, 1 Cyclotron Rd, Berkeley CA 94720-8162, USA and University of California Berkeley, Center for Theoretical Physics, 366 LeConte Hall #7300, Berkeley, CA 94720-7300, USA.

Abstract

In this article we provide a detailed description of a technique to obtain a simple parametrization for different exceptional Lie groups, such as $G_2$, $F_4$ and $E_6$, based on their fibration structure. For the compact case, we construct a realization which is a generalization of the Euler angles for $SU(2)$, while for the non compact version of $G_{2(2)}/SO(4)$ we compute the Iwasawa decomposition. This allows us to obtain not only an explicit expression for the Haar measure on the group manifold, but also for the cosets $G_2/SO(4)$, $G_2/SU(3)$, $F_4/Spin(9)$, $E_6/F_4$ and $G_{2(2)}/SO(4)$ that we used to find the concrete realization of the general element of the group. Moreover, as a by-product, in the simplest case of $G_2/SO(4)$, we have been able to compute an Einstein metric and the vielbein. The relevance of these results in physics is discussed.

1. Introduction

In this article we describe our technique to analyze the structure of exceptional Lie groups, which is based on constructing a generalized Euler parametrization by starting from a suitable fibration. We review our results on $G_2$ [1, 2], $F_4$ [3] and $E_6$ [4]. We also provide some new insights on the geometry of the non compact versions of these groups, by using the Iwasawa decomposition, and in particular we apply it to $G_{2(2)}$. Our method allows us to explicitly calculate the Haar measure for the group manifold, and, as it is compatible with the fibration used to compute it, it naturally provides a metric for the corresponding coset as well. The layout of this paper is as follows. In section 2 we recall some of the basic facts about Lie groups and Lie algebras that we need later. In section 3 we explain in detail how the generalized Euler parametrization is defined and we study a toy model to exemplify it. Then in the following sections we apply it to different exceptional Lie groups. In section 4 we construct $G_2$ in two different ways as a fibration, first with $SU(3)$ as a fiber and then with $SO(4)$ as a fiber. In section 5 we determine the $Spin(9)$ Euler angles for $F_4$, which we then use in section 6 to obtain the $F_4$ Euler angles for $E_6$. Finally, in section 7 we introduce the Iwasawa decomposition for the non compact version of the Lie groups, which we then apply to $G_{2(2)}$ in section 8. Since we are able to get an explicit expression for the Haar measure on the group manifold, the most immediate application of our results is the possibility of evaluating integrals [5]. Until now the only available method to compute some of them was to use the invariance properties of the Haar measure, but knowing its explicit form gives an analytic way to calculate many of them directly. In physics exceptional Lie groups appear naturally as the symmetry (gauge) groups of field theories which are low energy limits of certain heterotic string models [6]. Besides being relevant for string phenomenology, these theories are interesting by themselves, e.g. $E_6$ as a candidate for the symmetry group in a grand unified theory of high energy physics [7] and $G_2$ as a possible example of a non confining gauge theory [8].
While the local properties of a field theory are determined exclusively at the level of the corresponding Lie algebra, in order to obtain non-perturbative results it is necessary to make use of the full global structure of the Lie group, because of the need for evaluating integrals on the group manifold. Being able to solve them analytically has drastically reduced the computer power required to run a lattice simulation. For instance our expressions for G2 are the base for the Montecarlo analysis presented in [9]. Moreover, our technique can also be applied to the noncompact versions of the Lie groups, such as G2(2), F4(4), E6(6) or E7(7). In this case, another parametrization is the Iwasawa decomposition. As its construction uses a nilpotent subalgebra, it is particularly simple and is therefore very useful. In physics these groups represent the U-duality of supergravity theories in different dimensions. One of the most interesting features of our method is that it is based on identifying a suitable subgroup and in studying the corresponding fibration. As a consequence it automatically yields an explicit expression for the coset space as well as for its metric, measure and vielbein, since the geometry on the group induces a geometry on the base. In the case of the maximal compact subgroups of noncompact exceptional Lie groups, e.g. SO(4) for G2(2) or SU (8) for E7(7), these symmetric spaces turn out to be Einstein spaces. Being solutions of Einstein equations, they are relevant by themselves for general relativity. In supergravity some of these cosets are interpreted as the scalar fields of the associated sigma model [10]. Moreover, they can represent the charge orbits of black holes when the attractor mechanism is studied [11] and they also appear as the moduli spaces for black holes. In [12] they are used to investigate the deep connection between black holes properties, duality and supergravity.

As an example, the coset space G2(2)/SO(4) studied in section 8 is relevant for black ring solutions in 5-dimensional supergravity [13]. Finally, these symmetric spaces can be used to describe the entanglement of qubits and qutrits in information theory [14].

2. General Settings

Because of their importance for the rest of the chapter and in order to set our conventions, we recall here some basic facts about semisimple Lie groups (see [15]).

2.1. Lie algebras from Lie groups

A Lie group $G$ is a group which is also a differentiable manifold and for which the group structure and the differential structure are compatible. This means that the two basic group operations, the product and the inversion, are required to be differentiable maps with respect to the differential structure. The dimension of the group is the dimension of $G$ as a manifold. Here we consider only finite dimensional groups. In this case the differentiability of the inverse map is a consequence of the differentiability of the product map and of the implicit function theorem. We use the symbol $e$ for the unit element, which therefore identifies a particular point on $G$. For any $g \in G$ we can define two maps
\[
L_g : G \longrightarrow G,\ h \mapsto gh, \qquad R_g : G \longrightarrow G,\ h \mapsto hg,
\]
called the left and the right translation respectively. Note that with respect to the composition product, $L_g$ and $R_{g'}$ commute. They define a left and a right action of the group on itself:
\[
L : G \times G \longrightarrow G;\ (g, h) \mapsto L_g(h), \qquad R : G \times G \longrightarrow G;\ (g, h) \mapsto R_g(h).
\]
Note that $L_g$ and $R_g$ are not homomorphisms. A homomorphism associated to a left action is
\[
\varphi_g : G \longrightarrow G;\ h \mapsto R_{g^{-1}} L_g h = g h g^{-1}. \tag{1}
\]
Differentiating the $L_g$ map at the identity, we have $(dL_g)_e : T_e G \longrightarrow T_g G$. This operation associates to each nonzero vector $\xi \in T_e G$ a non vanishing vector field $X_\xi$,
\[
X_\xi : G \longrightarrow TG;\ g \mapsto (dL_g)_e(\xi) \in T_g G,
\]
which is well defined globally. Note that $X_\xi(e) = \xi$. In this way, given a basis $\{\tau_1, \dots, \tau_n\}$ of $T_e G$, at each point $g$ we can obtain a set of vector fields which determine a basis for $T_g G$. This shows that the tangent bundle of $G$ is trivial.
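The purely algebraic statements about $L_g$, $R_g$ and $\varphi_g$ can be illustrated with matrices. The sketch below is a hypothetical example: it uses $2 \times 2$ integer matrices of determinant 1 (elements of $SL(2, \mathbb{R})$ chosen with integer entries only so that the arithmetic is exact), and the function names are illustrative assumptions.

```python
def mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inv(A):
    """Inverse of a 2x2 integer matrix with determinant 1."""
    (a, b), (c, d) = A
    assert a * d - b * c == 1
    return ((d, -b), (-c, a))

def L(g):
    return lambda h: mul(g, h)              # left translation  h -> g h

def R(g):
    return lambda h: mul(h, g)              # right translation h -> h g

def phi(g):
    return lambda h: mul(mul(g, h), inv(g))  # conjugation h -> g h g^{-1}

g = ((1, 2), (0, 1))
g2 = ((1, 0), (3, 1))
h = ((2, 1), (1, 1))
k = ((0, -1), (1, 0))

# L_g and R_{g'} commute (this is just associativity of the product)
print(L(g)(R(g2)(h)) == R(g2)(L(g)(h)))
# phi_g respects products, whereas L_g does not
print(phi(g)(mul(h, k)) == mul(phi(g)(h), phi(g)(k)))
print(L(g)(mul(h, k)) == mul(L(g)(h), L(g)(k)))
```

The last comparison fails for any $g \ne e$, since $g h k = g h g k$ would force $g = e$; this is the sense in which the translations are not homomorphisms while the conjugation $\varphi_g$ is.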

An important property of the field $X_\xi$ is that it is $L_g$-invariant (left invariant). This means $(L_g)_* X_\xi = X_\xi$. Vice versa, given a left invariant vector field $V$, it can be verified that $V(e) \in T_e G$ and $V = X_{V(e)}$. Thus, the left invariant vector fields form a finite dimensional vector space $\mathfrak{X}_L(G) \simeq T_e G$. Moreover, $\mathfrak{X}_L(G)$ is closed under the Lie bracket of vector fields: for all $X, Y \in \mathfrak{X}_L(G)$,
\[
[X, Y] \in \mathfrak{X}_L(G), \qquad [X, Y] = \mathcal{L}_X Y,
\]
where $\mathcal{L}_X$ is the Lie derivative along $X$. Thus, $\mathfrak{g} \equiv \mathrm{Lie}(G) := \{\mathfrak{X}_L(G), [\,,]\}$ defines an algebra: the Lie algebra associated to $G$. The Lie product is antisymmetric and satisfies the Jacobi identity.

2.2. Adjoint representations and the Killing form

A powerful way to “describe” the structure of a group is by means of its representations. A representation of a group $G$ on a vector space $V$ (real or complex) is a homomorphism $r : G \longrightarrow \mathrm{Aut}(V)$, where $\mathrm{Aut}(V)$ is the group of automorphisms of $V$ with the composition as product. A representation is irreducible if $V$ does not admit proper invariant subspaces, and it is faithful if $\mathrm{Ker}(r) = \{e\}$. In a similar way, a representation of a Lie algebra on $V$ is a homomorphism $\rho : \mathfrak{g} \longrightarrow \mathrm{End}(V)$, where $\mathrm{End}(V)$ is the Lie algebra of endomorphisms of $V$ with the bracket of operators as Lie product. Noting that $\mathrm{End}(V) = \mathrm{Lie}(\mathrm{Aut}(V))$ and identifying $\mathrm{Lie}(G)$ with $T_e G$, it can be seen that a representation of the algebra can be obtained from a representation of the group as $\rho = dr_e$. Among the representations of a group, an example which can be constructed in a natural way is the Adjoint. It is the representation over the Lie algebra $V = \mathfrak{g}$ obtained in the following way through the homomorphism $\varphi_g$ introduced above. For any fixed $g$, we define the map
\[
\mathrm{Ad}_g : T_e G \longrightarrow T_e G, \qquad \mathrm{Ad}_g := d\varphi_g, \tag{2}
\]
where $d\varphi_g$ is the differential of $\varphi_g$ at the identity. Then the Adjoint representation of the group is defined by
\[
\mathrm{Ad} : G \longrightarrow \mathrm{Aut}(T_e G);\ g \mapsto \mathrm{Ad}_g. \tag{3}
\]
Differentiating at the identity yields the adjoint representation of the Lie algebra
\[
\mathrm{ad} : \mathfrak{g} \longrightarrow \mathrm{End}(T_e G);\ a \mapsto \mathrm{ad}_a, \tag{4}
\]
where $\mathrm{ad}_a(b) = [a, b]$ for all $b \in \mathfrak{g}$. Next, from the adjoint representation of the algebra, the Killing form on $\mathfrak{g}$ can be constructed as follows:
\[
K : \mathfrak{g} \times \mathfrak{g} \longrightarrow \mathbb{K};\ (a, b) \mapsto K(a, b) := \mathrm{Tr}(\mathrm{ad}_a \mathrm{ad}_b), \tag{5}
\]
where $\mathbb{K} = \mathbb{R}, \mathbb{C}$ is the field of $\mathfrak{g}$. The Killing product is symmetric and ad-invariant, which means $K(\mathrm{ad}_a(b), c) + K(b, \mathrm{ad}_a(c)) = 0$.


This defines a symmetric two form over $T_e G$ which in turn, using the left translation, induces a symmetric two form over the whole group:
\[
K_G : G \longrightarrow T^* G \otimes T^* G;\ g \mapsto L^*_{g^{-1}} K, \tag{6}
\]
the pullback of $K$ under $L_{g^{-1}}$. In general the Killing form is degenerate. It is obviously left invariant. If we choose a basis $\{\tau_i\}$ for $\mathfrak{g}$ and define the corresponding structure constants $f_{ij}{}^k$ as $[\tau_i, \tau_j] = \sum_{k=1}^n f_{ij}{}^k \tau_k$, then the Killing form is computed to be $K_{ij} = K(\tau_i, \tau_j) = \sum_{l,m} f_{il}{}^m f_{jm}{}^l$. The ad-invariance implies that the covariant tensor $f_{ijk} := f_{ij}{}^l K_{lk}$ is totally antisymmetric. In the basis $\{\mu^i\}$ which is canonically dual to $\{\tau_j\}$ (i.e. $\mu^i(\tau_j) = \delta^i_j$), the Killing form has the particularly simple expression $K = \sum_{ij} K_{ij}\, \mu^i \otimes \mu^j$. Finally, an important role is played by the Cartan 1-form. It is a Lie algebra valued form defined as $J := \sum_i (L_g)^* (\tau_i \mu^i) = \sum_i J^i X_{\tau_i}$, and it can be used to rewrite the Killing form as $K_G = K(J, J) = \sum_{ij} J^i \otimes J^j K_{ij}$.
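As a small worked instance of the formula $K_{ij} = \sum_{l,m} f_{il}{}^m f_{jm}{}^l$, one can take the three dimensional real algebra with $[\tau_i, \tau_j] = \sum_k \epsilon_{ijk} \tau_k$, i.e. $\mathfrak{so}(3) \simeq \mathfrak{su}(2)$. This example and the variable names below are illustrative choices, not taken from the text.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

n = 3
# f[i][j][k] stands for the structure constant f_ij^k
f = [[[eps(i, j, k) for k in range(n)] for j in range(n)] for i in range(n)]

# Killing form K_ij = sum_{l,m} f_il^m f_jm^l
K = [[sum(f[i][l][m] * f[j][m][l] for l in range(n) for m in range(n))
      for j in range(n)] for i in range(n)]

# Lowered structure constants f_ijk = f_ij^l K_lk
f_low = [[[sum(f[i][j][l] * K[l][k] for l in range(n)) for k in range(n)]
          for j in range(n)] for i in range(n)]

print(K)
```

For this algebra the result is $K = -2 \cdot \mathbb{1}$, which is non degenerate and negative definite, and the lowered constants $f_{ijk} = -2\, \epsilon_{ijk}$ are totally antisymmetric, in line with the ad-invariance statement above.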

2.3. Simple Lie algebras classification

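Starting from a finite dimensional Lie group, the associated Lie algebra can be easily determined. Being a linear space, it is much easier to analyze than the group itself. There is a very interesting class of Lie algebras which are completely classified: the semisimple Lie algebras. A semisimple Lie algebra is a Lie algebra of dimension higher than 1 which does not admit any Abelian proper ideals. If it does not contain any proper ideal at all, it is called a simple Lie algebra. It can be shown that any semisimple Lie algebra can be written as a direct sum of simple algebras in a unique manner (apart from isomorphisms). An important result is that a Lie algebra is semisimple if and only if the corresponding Killing form is non degenerate. From the definition, it follows that for a semisimple algebra $\mathrm{Ker}(\mathrm{ad}) = 0$, so that the adjoint representation is faithful. The Lie algebra can then be identified with its adjoint representation. This allows a classification of all complex (finite dimensional) simple Lie algebras by performing a classification of their adjoint representation. As the main ingredients will be used later, let us recall the main steps. Any simple Lie algebra contains a unique (apart from isomorphisms) Cartan subalgebra, a maximal Abelian subalgebra $\mathfrak{h} \subset \mathfrak{g}$ such that for each $h \in \mathfrak{h}$, $\mathrm{ad}_h$ is diagonalizable. $r = \dim(\mathfrak{h})$ is called the rank of $\mathfrak{g}$. All such operators $\mathrm{ad}_h$ are simultaneously diagonalizable and define the roots $\alpha \in \mathfrak{h}^*$ of the algebra through the eigenvalue equations
\[
\mathrm{ad}_h(\lambda_\alpha) = \alpha(h)\, \lambda_\alpha, \qquad 0 \ne \lambda_\alpha \in \mathfrak{g}.
\]
Since $\mathfrak{g}$ is finite dimensional, the set $\mathrm{Root}(\mathfrak{g})$ of all roots is finite. If $\Lambda_\alpha$ is the eigenspace of $\alpha$, then $0$ is a root, $\Lambda_0 = \mathfrak{h}$, and $\mathfrak{g} = \bigoplus_{\alpha \in \mathrm{Root}(\mathfrak{g})} \Lambda_\alpha$. One easily sees that $[\lambda_\alpha, \lambda_\beta] \in \Lambda_{\alpha + \beta}$, and that it vanishes if $\alpha + \beta$ is not a root. From ad-invariance it follows that $K(\lambda_\alpha, \lambda_\beta) = 0$ if $\alpha + \beta \ne 0$. It can also be shown that if $\alpha$ is a non vanishing root, then $k\alpha$ is a root if and only if $k = 0, \pm 1$, and $\dim(\Lambda_\alpha) = 1$. If $K_C$ is the restriction of the Killing form to the Cartan subalgebra, it follows that $K_C$ is non degenerate and thus defines a natural isomorphism between $\mathfrak{h}$ and $\mathfrak{h}^*$, and a bilinear form $(\,|\,)$ on $\mathfrak{h}^*$ in an obvious way. It also follows that $\mathrm{Root}(\mathfrak{g})$ is real in the sense that it contains a basis for $\mathfrak{h}^*$ such that the remaining roots are real combinations of it, so that one can consistently define the $r$-dimensional real space $\mathfrak{h}^*_{\mathbb{R}} = \langle \mathrm{Root}(\mathfrak{g}) \rangle_{\mathbb{R}}$. Apart from a multiplicative constant, $(\,|\,)$ defines a Euclidean scalar product on $\mathfrak{h}^*_{\mathbb{R}}$. The main step is:

The Cartan Theorem: If $\alpha$ and $\beta$ are two non vanishing roots, then $n_{\alpha\beta} := 2\, \frac{(\alpha|\beta)}{(\alpha|\alpha)} \in \mathbb{Z}$ and $\beta - n_{\alpha\beta}\, \alpha$ is also a root (Weyl reflection).

This strongly constrains the relations between roots, because if $|\alpha|$ and $\theta_{\alpha\beta}$ are the norm and the angle between roots defined by the Euclidean scalar product, then
\[
\frac{|\beta|^2}{|\alpha|^2} = \frac{n_{\alpha\beta}}{n_{\beta\alpha}}, \qquad \cos^2 \theta_{\alpha\beta} = \frac{1}{4}\, n_{\alpha\beta}\, n_{\beta\alpha}.
\]
At this point it is clear that all information on the algebra is contained in the root system. A simple root system $SR$ is defined to be a basis of roots such that every remaining root is a combination of its elements with integer coefficients that are either all non negative or all non positive. It always exists (though it is not unique) and separates the roots into positive and negative ones, $\mathrm{Root}(\mathfrak{g}) = R_+ \oplus R_-$. Given a simple root system $SR = \{\alpha_1, \dots, \alpha_r\}$, the numbers $n_{ij}$ associated to it by the Cartan Theorem must all be non positive if $i \ne j$, whereas $n_{ii} = 2$ (moreover, unless both vanish, one of $|n_{ij}|$ or $|n_{ji}|$ is always 1 if $i \ne j$). They characterize $SR$ completely (apart from obvious equivalences) and define the Cartan matrix $C_{ij} = n_{ij}$. It has the properties $C_{ii} = 2$, $C_{ij} \le 0$ for $i \ne j$, and $C_{ij} \ne 0$ iff $C_{ji} \ne 0$. To classify all simple Lie algebras, one classifies all $SR$ systems and, equivalently, all Cartan matrices compatible with them. This is done graphically by means of the Dynkin diagrams: a dot ◦ is drawn for each simple root (there are $r$). Two roots are connected by $N_{ij} = n_{ij} n_{ji}$ lines, with a > indicating the direction from the longer root to the shorter one. Simple algebras correspond to connected diagrams. It results that all admissible Dynkin diagrams are divided into four classical series, $A_r$, $B_r$, $C_r$, $D_r$, $r$ being the rank of the corresponding algebras, and five exceptional algebras: $G_2$, $F_4$, $E_6$, $E_7$, $E_8$. The corresponding Dynkin diagrams can be found for example in [15]. Real Lie algebras can next be classified by searching for generators of the complex algebra which also generate a real algebra (thus having real structure constants). These are called the real forms of the algebra. In particular, each simple algebra admits the compact form, the real algebra over which the Killing form is negative definite. The corresponding Lie group is compact. All real forms are classified and are described for example in [16].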

cos2 θαβ =

1 nαβ nβα. 4

At this point it is clear that all information on the algebra is contained in the root system. A simple root system SR is defined to be a basis of roots such that all remaining roots are combinations of this with only positive or negative integer coefficients. It always exists (not unique) and defines separates the roots in positive and negative Root(g) = R+ ⊕ R− . Given a simple root system SR = {α1, . . . , αr }, then the associated numbers nij associated by the Cartan Theorem must be all non positive if i 6= j, whereas ni i = 2 (moreover, one of |nij | or |nji | is always 1 if i 6= j) They characterize completely SR (apart from obvious equivalences) and define the the Cartan matrix Cij = nij . It has the properties Cii = 2, Cij ≤ 0 and Cij 6= 0 iff Cij 6= 0, i 6= j. To classify all simple Lie algebras, one classify all SR systems and equivalently all Cartan matrices compatible with them. This is done graphically by means of the Dynkin diagrams: A dot ◦ is signed for each simple root (there are r). Two root are connected by Nij = nij nji lines with a > indicating the direction from the longer root to the shorter one. Simple algebras correspond to connected Diagram. It results that all admissible Dynkin diagrams are divided in four classical series: Ar , Br , Cr , Dr , r being the rank of the corresponding algebras, and five exceptional algebras: G2 , F4 , E6, E7, E8 . The Dynkin corresponding diagrams can be found for example in [15]. Real Lie algebras can be next classified searching for generators of the complex algebra which also generate a real algebra (thus having real structure constants). This are called the real forms of the algebra. In particular, each simple algebra admits the compact form, the real algebra over which the Killing form is negative definite. The corresponding Lie Group is compact. All real forms are classified and are described for example in [16].
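The Cartan Theorem can be made concrete with a small computation. The following is a minimal sketch, not part of the original construction: it assumes a particular coordinate choice for the two simple roots of $G_2$ (a short root and a long root in the plane x + y + z = 0 of $\mathbb{R}^3$), and the helper names are purely illustrative. It computes the integers $n_{\alpha\beta}$, the Cartan matrix, and the set of non vanishing roots obtained by repeated Weyl reflections.

```python
from fractions import Fraction

def dot(a, b):
    """Euclidean scalar product (a|b), computed exactly."""
    return sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))

def cartan_integer(alpha, beta):
    """n_{alpha beta} = 2 (alpha|beta) / (alpha|alpha)."""
    return 2 * dot(alpha, beta) / dot(alpha, alpha)

def weyl_reflect(beta, alpha):
    """Weyl reflection: beta - n_{alpha beta} alpha."""
    n = cartan_integer(alpha, beta)
    return tuple(Fraction(b) - n * Fraction(a) for a, b in zip(alpha, beta))

def cartan_matrix(simple_roots):
    """C_ij = n_ij for a list of simple roots."""
    return [[int(cartan_integer(a, b)) for b in simple_roots]
            for a in simple_roots]

def root_system(simple_roots):
    """Close the simple roots under reflections in the simple roots."""
    roots = {tuple(Fraction(x) for x in r) for r in simple_roots}
    frontier = list(roots)
    while frontier:
        beta = frontier.pop()
        for alpha in simple_roots:
            new = weyl_reflect(beta, alpha)
            if new not in roots:
                roots.add(new)
                frontier.append(new)
    return roots

# Assumed coordinates: alpha_1 short (norm^2 = 2), alpha_2 long (norm^2 = 6).
G2_SIMPLE = [(1, -1, 0), (-2, 1, 1)]
C = cartan_matrix(G2_SIMPLE)
ROOTS = root_system(G2_SIMPLE)
print(C, len(ROOTS))
```

With this choice the product $n_{12} n_{21}$ equals 3, which corresponds to the triple line of the $G_2$ Dynkin diagram and to an angle of 150 degrees between the simple roots.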

2.4. Lie groups from Lie algebras

As we have seen in the previous sections, from a Lie group it is easy to obtain the associated Lie algebra by simple differentiation. Less trivial is the issue of recovering the group from the algebra. This is indeed the main topic of the remaining sections. Here we simply recall some properties of a key instrument, the exponential map:

$$\exp : \mathrm{Lie}(G) \longrightarrow G; \qquad X \mapsto g_X(1),$$

where $g_X(t)$ is the integral curve on G associated to the left invariant vector field X, with $g_X(0) = e$. The main properties are

• $\exp(0) = e$;
• $\exp(X + Y) = \exp(X)\exp(Y)$ if $[X, Y] = 0$;
• exp is differentiable and $d\exp_0 : T_0\mathrm{Lie}(G) \longrightarrow T_e G$ realizes the natural isomorphism between Lie(G) and $T_e G$;
• exp is a local diffeomorphism between an open neighborhood of $0 \in \mathrm{Lie}(G)$ and an open neighborhood of $e \in G$.

In general the exponential map is not surjective; however, it allows one to generate the whole group starting from the algebra. For matrix groups it is easy to show that

$$\exp(X) = e^X := \sum_{n=0}^{\infty} \frac{1}{n!} X^n.$$

As we will work with definite representations, this will be our case. In this case, given a matrix realization of the group and a suitable parametrization $g(x_1, \ldots, x_n)$, it follows that the Cartan 1-form is

$$J = g^{-1} dg = \sum_i J^i \tau_i, \tag{7}$$

where $\{\tau_i\}$ is a basis for the Lie algebra. In Physics the 1-forms $J^i$ are also called the left-invariant currents. They will play a central role in our construction. The main problem is now to search for suitable parameterizations of the group which describe the whole group in a practical way, suitable for concrete physical applications. This means that we must be able not only to explicitly identify the elements of the group, but also to specify the complete range of the parameters, and to compute explicitly the significant quantities, for example the left invariant currents, the invariant measure and the Killing form.

3. Construction of Compact Lie Groups

3.1. A toy model

We start by illustrating the main ideas of our strategy with the simplest possible example, the construction of the SU(2) group, the set of all 2 × 2 unitary matrices with unit determinant. The associated Lie algebra su(2) is generated by the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{8}$$

which, after multiplication by i, indeed give a basis for the 2 × 2 traceless anti-Hermitian matrices. It is a well known fact that the generic element of SU(2) can be expressed in the form

$$g = e^{i\phi\frac{\sigma_3}{2}}\, e^{i\theta\frac{\sigma_1}{2}}\, e^{i\psi\frac{\sigma_3}{2}}, \tag{9}$$



where $\phi \in [0, 2\pi]$, $\theta \in [0, \pi]$, $\psi \in [0, 4\pi]$ are called the Euler angles for SU(2). Let us first recall the definition of the Euler angles traditionally used in classical mechanics to describe the motion of a spinning top. Fix a Cartesian frame (x, y, z) and suppose the top to be a rod of length L, with one end fixed at the origin and the other one in the starting position $\vec{L} \equiv (0, 0, L)$. The tip of the top can be moved to a generic position in the following way:

• first, rotate the system by an angle $\alpha$ around the z axis. The x axis will rotate by $\alpha$ to a new axis $x'$ in the x−y plane, and similarly for the y axis;
• then rotate the system around the axis $x'$ by an angle $\beta$. The z axis will be rotated by $\beta$ to a new axis $z''$ in the $y'$−z plane;
• finally, rotate by $\gamma$ around the $z''$ axis.

Essentially these movements represent the rotation around the vertical axis, the inclination of the top with respect to the vertical axis, and the rotation around its own axis. To describe these operations mathematically, note that a rotation $R_{\hat{n}}(\theta)$ by $\theta$ around an (oriented) axis specified by a unit vector $\hat{n} \equiv (n_x, n_y, n_z)$ can be written as

$$R_{\hat{n}}(\theta) = e^{\theta(n_x \tau_1 + n_y \tau_2 + n_z \tau_3)}, \tag{10}$$

where

$$\tau_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \qquad \tau_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad \tau_3 = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{11}$$

are the generators of the infinitesimal rotations. Thus, the generic final position of the tip of the top will be

$$\vec{L}' = e^{\gamma \tau_3''}\, e^{\beta \tau_1'}\, e^{\alpha \tau_3}\, \vec{L}, \tag{12}$$

where $\tau_1'$ and $\tau_3''$ are the generators of rotations around the $x'$ and $z''$ axes respectively:

$$\tau_1' = \cos\alpha\, \tau_1 + \sin\alpha\, \tau_2 = e^{\alpha\tau_3}\, \tau_1\, e^{-\alpha\tau_3}, \tag{13}$$

$$\tau_3'' = \cos\beta\, \tau_3 - \sin\beta\, \tau_2' = e^{\beta\tau_1'}\, \tau_3\, e^{-\beta\tau_1'}. \tag{14}$$

Using

$$e^{e^A B e^{-A}} = e^A e^B e^{-A}$$

and substituting in (12) we find

$$\vec{L}' = e^{\alpha\tau_3}\, e^{\beta\tau_1}\, e^{\gamma\tau_3}\, \vec{L}. \tag{15}$$

From the construction it is evident that for the range of the Euler angles we can take for example $\alpha, \gamma \in [0, 2\pi]$, $\beta \in [0, \pi]$. This is very similar to (9) and one is tempted to identify $\phi$, $\theta$ and $\psi$ with $\alpha$, $\beta$ and $\gamma$. Note however that for SU(2) we have $\psi \in [0, 4\pi]$, which is a consequence of the fact that SU(2) is a spin 1/2 representation of SO(3) (and is its double cover).
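The passage from (12) to (15) can be checked numerically. The sketch below is only an illustration: it uses the generators (11), a truncated power series for the matrix exponential, and arbitrarily chosen test angles; all function names are hypothetical helpers.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    """Multiply a matrix by a scalar."""
    return [[c * x for x in row] for row in a]

def expm(a, terms=40):
    """Truncated power series sum_n a^n / n! (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

TAU1 = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
TAU2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
TAU3 = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]

alpha, beta, gamma = 0.7, 1.1, 2.3  # arbitrary test angles (assumption)

# (13): tau1' = exp(alpha tau3) tau1 exp(-alpha tau3)
tau1p = matmul(matmul(expm(scale(alpha, TAU3)), TAU1),
               expm(scale(-alpha, TAU3)))
# (14): tau3'' = exp(beta tau1') tau3 exp(-beta tau1')
tau3pp = matmul(matmul(expm(scale(beta, tau1p)), TAU3),
                expm(scale(-beta, tau1p)))

# Left side of (12) against right side of (15), as matrices.
lhs = matmul(matmul(expm(scale(gamma, tau3pp)), expm(scale(beta, tau1p))),
             expm(scale(alpha, TAU3)))
rhs = matmul(matmul(expm(scale(alpha, TAU3)), expm(scale(beta, TAU1))),
             expm(scale(gamma, TAU3)))
err_euler = max_diff(lhs, rhs)

# First equality in (13): tau1' = cos(alpha) tau1 + sin(alpha) tau2.
lin = [[math.cos(alpha) * x + math.sin(alpha) * y for x, y in zip(r1, r2)]
       for r1, r2 in zip(TAU1, TAU2)]
err_tau1p = max_diff(tau1p, lin)
print(err_euler, err_tau1p)
```

Both discrepancies should be at the level of floating point rounding, reflecting the fact that the rotations about the moving axes in (12) reorder into rotations about the fixed axes in (15).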



Let us now look at the structure of the construction (9). We identified a maximal subgroup $U(1)[\phi] = e^{i\phi\frac{\sigma_3}{2}}$. Its Lie algebra is obviously a subalgebra of su(2). We then add a second generator $\sigma_1$ not in the subalgebra and, after noting that all the remaining generators can be obtained by commuting $\sigma_1$ with the subalgebra, we act on $e^{i\theta\frac{\sigma_1}{2}}$ with the subgroup both from the left and from the right:

$$g = U(1)[\phi]\; e^{i\theta\frac{\sigma_1}{2}}\; U(1)[\psi]. \tag{16}$$

This provides the structure of the generic element of the group, but one needs more information to determine the correct range for the parameters. For completeness, let us see how the geometric properties of the group can be used to identify the parameters. It is known that the group SU(2) is geometrically equivalent to a three-sphere $S^3$, and admits a Hopf fibration structure with fiber $S^1$ over the base $S^2 \simeq \mathbb{CP}^1$. To see this, note that the generic U(2) element by definition can be written in the form

$$g = \begin{pmatrix} u_1 & w_1 \\ u_2 & w_2 \end{pmatrix}, \qquad \text{where} \quad \vec{u} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \quad \vec{w} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$$

determine an orthonormal basis for $\mathbb{C}^2$. After imposing the condition $\det g = 1$ we find that it becomes

$$g = \begin{pmatrix} u_1 & -u_2^* \\ u_2 & u_1^* \end{pmatrix},$$

where $|u_1|^2 + |u_2|^2 = 1$. Setting $u_1 = x + iy$ and $u_2 = t + iz$ we see the correspondence with $S^3$. As we have seen in the previous section, SU(2), being a real compact form, is naturally endowed with an invariant metric given by the Killing product. Suitably normalized, this is $ds^2 = -\frac{1}{2}\mathrm{Tr}(g^{-1}dg \otimes g^{-1}dg)$, so that we find

$$ds^2 = \frac{1}{2}\mathrm{Tr}(dg^\dagger \otimes dg) = \left.(dx^2 + dy^2 + dt^2 + dz^2)\right|_{x^2+y^2+t^2+z^2=1}, \tag{17}$$

which is the usual round metric on the sphere $S^3$. To determine the ranges for the parameters in (9) we can compute the associated metric, identify it with the round metric and choose the range so as to cover the whole $S^3$. From (9) we get

$$ds^2 = \frac{1}{4}\left(d\phi^2 + d\theta^2 + d\psi^2 + 2\cos\theta\, d\phi\, d\psi\right), \tag{18}$$

which can be obtained from (17) by setting

$$u_1 = \cos\frac{\theta}{2}\; e^{\frac{i}{2}\epsilon_1(\phi+\psi) + i\alpha_1}, \qquad u_2 = \sin\frac{\theta}{2}\; e^{\frac{i}{2}\epsilon_2(\phi-\psi) + i\alpha_2}, \tag{19}$$

where $\epsilon_i$ are signs and $\alpha_i$ constant phases. We do not need to determine these quantities to find the ranges. Indeed, for any fixed value of these parameters, to cover $S^3$ we need to take

$$\frac{1}{2}(\phi + \psi) \in [0, 2\pi], \qquad \frac{1}{2}(\phi - \psi) \in [0, 2\pi], \qquad \theta \in [0, \pi], \tag{20}$$


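which are equivalent to the ones announced after (9). Note that in this very simple case all the phases and signs can be determined by noting that

$$e^{i\phi\frac{\sigma_3}{2}}\, e^{i\theta\frac{\sigma_1}{2}}\, e^{i\psi\frac{\sigma_3}{2}} = \begin{pmatrix} e^{\frac{i}{2}(\phi+\psi)}\cos\frac{\theta}{2} & i e^{\frac{i}{2}(\phi-\psi)}\sin\frac{\theta}{2} \\ i e^{-\frac{i}{2}(\phi-\psi)}\sin\frac{\theta}{2} & e^{-\frac{i}{2}(\phi+\psi)}\cos\frac{\theta}{2} \end{pmatrix}, \tag{21}$$

which gives $\epsilon_1 = 1$, $\epsilon_2 = -1$, $\alpha_1 = 0$ and $\alpha_2 = \frac{\pi}{2}$. This is one way to determine the ranges, but there is another way which is much simpler for higher dimensional groups G. It consists in studying the maximal subgroup U of G and the quotient G/U separately. In our case we can take $U = U(1)[\psi]$. This is a circle with metric $\frac{1}{4}d\psi^2$. The range of $\psi$ must be a period in order to cover the circle and, since

$$U(1)[\psi] = e^{i\psi\frac{\sigma_3}{2}} = \begin{pmatrix} e^{\frac{i}{2}\psi} & 0 \\ 0 & e^{-\frac{i}{2}\psi} \end{pmatrix},$$

we can take $\psi \in [0, 4\pi]$. The points of the quotient are parameterized by

$$H(\phi, \theta) = e^{i\phi\frac{\sigma_3}{2}}\, e^{i\theta\frac{\sigma_1}{2}} = \begin{pmatrix} e^{\frac{i}{2}\phi}\cos\frac{\theta}{2} & i e^{\frac{i}{2}\phi}\sin\frac{\theta}{2} \\ i e^{-\frac{i}{2}\phi}\sin\frac{\theta}{2} & e^{-\frac{i}{2}\phi}\cos\frac{\theta}{2} \end{pmatrix}, \tag{22}$$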
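The explicit form of the Euler product and the 4π period of ψ can be verified numerically. The following sketch is an illustration under stated assumptions: the matrix exponential is a truncated power series, the test angles are arbitrary, and the function names are hypothetical. It compares the product of the three exponentials in (9) with the closed form (21), and checks that shifting ψ by 2π flips the sign of g while shifting by 4π returns g.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square complex matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=60):
    """Truncated power series for the matrix exponential."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

SIGMA1 = [[0, 1], [1, 0]]
SIGMA3 = [[1, 0], [0, -1]]

def one_parameter(angle, sigma):
    """exp(i * angle * sigma / 2)."""
    return expm([[0.5j * angle * x for x in row] for row in sigma])

def euler(phi, theta, psi):
    """Product of exponentials as in (9)."""
    return matmul(matmul(one_parameter(phi, SIGMA3),
                         one_parameter(theta, SIGMA1)),
                  one_parameter(psi, SIGMA3))

def closed_form(phi, theta, psi):
    """Right hand side of (21)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    e = cmath.exp
    return [[e(0.5j * (phi + psi)) * c, 1j * e(0.5j * (phi - psi)) * s],
            [1j * e(-0.5j * (phi - psi)) * s, e(-0.5j * (phi + psi)) * c]]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

phi, theta, psi = 0.9, 1.3, 2.1  # arbitrary test angles (assumption)
g = euler(phi, theta, psi)
err_21 = max_diff(g, closed_form(phi, theta, psi))
minus_g = [[-x for x in row] for row in g]
err_half_period = max_diff(euler(phi, theta, psi + 2 * math.pi), minus_g)
err_full_period = max_diff(euler(phi, theta, psi + 4 * math.pi), g)
print(err_21, err_half_period, err_full_period)
```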

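with a residual action of $U(1)[\psi]$ on the right. For example we see that, in the quotient, $H(\phi, 0)$ degenerates to a single point, and similarly for $H(\phi, \pi)$, because

$$H(\phi, 0) = \begin{pmatrix} e^{\frac{i}{2}\phi} & 0 \\ 0 & e^{-\frac{i}{2}\phi} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} e^{\frac{i}{2}\phi} & 0 \\ 0 & e^{-\frac{i}{2}\phi} \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

$$H(\phi, \pi) = \begin{pmatrix} 0 & i e^{\frac{i}{2}\phi} \\ i e^{-\frac{i}{2}\phi} & 0 \end{pmatrix} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \begin{pmatrix} e^{-\frac{i}{2}\phi} & 0 \\ 0 & e^{\frac{i}{2}\phi} \end{pmatrix} \sim \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$

Indeed, we can first take for the quotient the representative

$$H(\phi, \theta)\, U(1)[-\phi] = \begin{pmatrix} \cos\frac{\theta}{2} & i e^{i\phi}\sin\frac{\theta}{2} \\ i e^{-i\phi}\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \tag{23}$$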

so that when $\phi$ and $\theta$ vary in their ranges, this traces a two dimensional hemisphere $x \geq 0$ in the (x, 0, t, z) space. However, the equator x = 0 is contracted to a point and the hemisphere reduces to a sphere $S^2$ of radius 1/2. This is the celebrated Hopf fibration. To see this we can compute the metric on the quotient set. This is not simply

$$ds_H^2 = -\frac{1}{2}\mathrm{Tr}(H^{-1}dH \otimes H^{-1}dH),$$

because $J_H = H^{-1}dH$ is not cotangent to the quotient, having a component cotangent to the fiber. Using (22) we have indeed

$$J_H = \frac{i}{2}\left(d\theta\, \sigma_1 - \sin\theta\, d\phi\, \sigma_2 + \cos\theta\, d\phi\, \sigma_3\right). \tag{24}$$

However, we can simply project out the component along the fiber (that is, the $\sigma_3$ part), so that

$$\tilde{J}_H := \frac{i}{2}\left(d\theta\, \sigma_1 - \sin\theta\, d\phi\, \sigma_2\right) \tag{25}$$

and

$$ds_H^2 = -\frac{1}{2}\mathrm{Tr}\, \tilde{J}_H \otimes \tilde{J}_H = \frac{1}{4}\left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \tag{26}$$

This is just the metric of a two sphere of radius $\frac{1}{2}$. Passing to the complex coordinate $z = \tan\frac{\theta}{2}\, e^{i\phi}$ and its complex conjugate, this metric reduces to the standard Fubini-Study metric for $\mathbb{CP}^1$. However, we can assume the ranges of the parameters for the quotient space to be unknown. They can then be deduced from (26) as follows: the metric becomes degenerate at $\theta = 0, \pi$. This is because, fixing $\theta$ and varying $\phi$ along a period, we obtain a circle of radius $\frac{1}{2}\sin\theta$. Thus we must restrict $\theta$ to $[0, \pi]$. The angle $\phi$ has no such constraint and could in principle vary over a period, which we know to be $4\pi$ as for $\psi$. However, this is not the right period in the quotient. Indeed, note that $-I = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \in U(1)[\psi]$ and is in the center of the group, so that

$$H(\phi, \theta) \sim H(\phi, \theta)(-I) = -H(\phi, \theta) = H(\phi + 2\pi, \theta),$$

so that $\phi \sim \phi + 2\pi$, hence $0 \sim 2\pi$, which reduces the period to $\phi \in [0, 2\pi]$.

3.2. The generalized Euler construction

Let us now generalize the previously described construction to the compact form of a generic finite dimensional simple Lie group G, with $n = \dim G$. In this case, our construction is not unique but depends on the choice of a maximal subgroup H. Because G is compact, the Killing product defines a scalar product $(\,|\,)$ on $\mathfrak{g} = \mathrm{Lie}(G)$ and it is convenient to choose an orthonormal basis $\{\tau_i\}_{i=1}^n$ of $\mathfrak{g}$. In particular, let us assume that the first $k := \dim H$ generators are a basis for $\mathfrak{h} = \mathrm{Lie}(H)$. Finally, let us call $\mathfrak{p}$ the subspace generated by the remaining generators. Note that $[\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}$. Indeed, orthogonality and ad-invariance imply $([p, h]|h') = (p|[h, h']) = 0$ for any $p \in \mathfrak{p}$ and $h, h' \in \mathfrak{h}$. This means that G/H is reductive. From this, it follows that any $g \in G$ can be written in the form

$$g = \exp a\, \exp b, \qquad a \in \mathfrak{p},\ b \in \mathfrak{h}. \tag{27}$$

For compact simple Lie groups such a parametrization is surjective; a proof can be found in [3]. We can now suppose that we have an explicit parametrization for H. This will obviously be a generalized Euler parametrization, obtained inductively by choosing a maximal subgroup $H'$ of H and proceeding in the same way. This means that $\exp b$ can be replaced by such a parametrization at the end. Actually, we would like to improve the expression for $\exp a$. To this purpose we can search for a subset of linearly independent elements $\tau_1, \ldots, \tau_l \in \mathfrak{p}$ with the following properties:

• if V is the linear subspace generated by $\tau_i$, $i = 1, \ldots, l$, then $\mathfrak{p} = \mathrm{Ad}_H(V)$, that is, the whole $\mathfrak{p}$ is generated from V through the adjoint action of H;
• V is minimal, in the sense that it does not contain any proper subspace with the previous property.

Using simplicity it is not hard to show that such a subspace V of $\mathfrak{p}$ always exists. This means that the general element g of G can be written in the form

$$g = \exp(\tilde{b})\, \exp(v)\, \exp(b), \qquad b, \tilde{b} \in \mathfrak{h},\ v \in V. \tag{28}$$

This parametrization is obviously redundant, since in general $2k + l \geq n$. The point is that one does not need the whole H to generate $\mathfrak{p}$ from V by the adjoint action, because H will contain some subgroup $H_o$ generating automorphisms of V,

$$\mathrm{Ad}_{H_o} : V \longrightarrow V. \tag{29}$$

Then $H_o$ must be r-dimensional, where $r = 2k + l - n$ is the redundancy, and the generalized Euler decomposition with respect to H will finally take the form

$$G = B\, \exp(V)\, H, \tag{30}$$

where $B := H/H_o$. We have seen that even for the simplest case of SU(2) the automorphism group $H_o$ is non trivial (even though it acts trivially on V) and coincides with $\mathbb{Z}_2$.
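The dimension count behind (30) is simple bookkeeping, sketched below. The function name is a hypothetical helper; the dimensions are those of the cases treated in this chapter (SU(2) with H = U(1), SU(3) with H = U(2), and $G_2$ with H = SU(3) or H = SO(4)).

```python
def euler_redundancy(dim_g, dim_h, dim_v):
    """Return (r, dim B) with r = 2k + l - n = dim H_o and dim B = k - r."""
    r = 2 * dim_h + dim_v - dim_g
    if r < 0:
        raise ValueError("H exp(V) H cannot cover G: too few parameters")
    return r, dim_h - r

# (n, k, l) = (dim G, dim H, dim V) for the examples of the text.
CASES = {
    "SU(2), H = U(1)": (3, 1, 1),
    "SU(3), H = U(2)": (8, 4, 1),
    "G2, H = SU(3)": (14, 8, 1),
    "G2, H = SO(4)": (14, 6, 2),
}

RESULTS = {name: euler_redundancy(*dims) for name, dims in CASES.items()}
for name, (r, dim_b) in RESULTS.items():
    print(f"{name}: dim H_o = {r}, dim B = {dim_b}")
```

A vanishing r, as for SU(2) and for $G_2$ with H = SO(4), means that $H_o$ is a finite group, while r = 3 for $G_2$ with H = SU(3) matches an SU(2) subgroup commuting with V.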

3.3. Determination of the range of parameters

The symbolic expression (30) means that the generic element of G can be written in the form

$$g = b\, e^v\, h, \qquad b \in B,\ v \in V,\ h \in H, \tag{31}$$

where h, v and b are functions of k, l and n − l − k parameters respectively. Locally they define a coordinatization for the group. However, since the parametrization is surjective, the parameters can be chosen so as to cover the whole group. As the group is in general a non trivial manifold, a surjective parametrization will not in general be injective. A good choice for the range of the parameters is a maximal open subset on which the parametrization is injective, so that its closure covers the whole group. We will call this closure the range of parameters. In general the determination of the range is a highly non trivial task. We discuss here two practical methods.

3.3.1. Geometric identification

After the parametrization $g[\vec{x}]$ is given, one can use it to describe the geometry of the group or of its quotient by the maximal subgroup. If such geometry is known from theoretical arguments, the related information can be used to determine the range of the parameters.



The metric on the group can be computed starting from the Killing metric and the Cartan 1-form. The parametrization provides a local coordinatization which can be used to determine a local expression for the Cartan 1-form

$$J = g^{-1}\frac{\partial g}{\partial x^J}\, dx^J = J^i \tau_i, \tag{32}$$

where $\tau_i$, $i = 1, 2, \ldots, n$, is a basis for the Lie algebra. This defines the structure constants $f_{ij}{}^k$, so that the Killing metric has components

$$K_{ij} = -k\, f_{il}{}^m f_{jm}{}^l, \tag{33}$$

where k is some normalization constant. As we are working with a compact form, the metric is positive definite when k is positive. We may assume that the basis and k are chosen such that $K_{ij} = \delta_{ij}$. The metric induced on the manifold is then

$$ds^2 = g_{ij}\, dx^i \otimes dx^j = J^l \otimes J^m\, \delta_{lm}. \tag{34}$$

In other words, the 1-forms $J^l$ represent the vielbein one forms on the group. In particular they can be used to compute the invariant volume n-form

$$\omega = J^1 \wedge \ldots \wedge J^n = \det(J)\, dx^1 \wedge \ldots \wedge dx^n \tag{35}$$

and the corresponding Haar measure

$$d\mu = |\det(J)| \prod_{I=1}^n dx^I. \tag{36}$$

From our parametrization (30), we can write the general element $g \in G$ in the form

$$g(x_1, \ldots, x_s, y_1, \ldots, y_m) = p(x_1, \ldots, x_s)\, h(y_1, \ldots, y_m), \tag{37}$$

where $h \in H$, $p \in B\exp(V)$, $m = \dim H$ and $m + s = n$. We can assume for simplicity that $\tau_a$, $a = s+1, \ldots, n$, generate H. Note that only H is a subgroup, so that $J_h \equiv h^{-1}dh \in \mathrm{Lie}(H)$, whereas in general $J_p \equiv p^{-1}dp \in \mathrm{Lie}(G)$. However, for the subgroup, in place of the left-invariant form we prefer to use the right-invariant form $\tilde{J}_h \equiv dh\, h^{-1}$. In this way, setting

$$J_p = \sum_{i=1}^n J_p^i\, \tau_i, \qquad \tilde{J}_h = \sum_{i=s+1}^n \tilde{J}_h^i\, \tau_i, \tag{38}$$

and using orthonormality, after a simple calculation we get

$$ds^2 = \sum_{i=s+1}^n \left(J_p^i + \tilde{J}_h^i\right)^2 + \sum_{i=1}^s \left(J_p^i\right)^2. \tag{39}$$

From this expression for the metric we can read off the structure of the fibration with fiber H over G/H. Indeed, the forms $J_p^i + \tilde{J}_h^i$, $i = s+1, \ldots, n$, lie on the fiber, whereas $J_p^a$, $a = 1, \ldots, s$, are orthogonal to the fiber. This means that

$$d\sigma^2 = \sum_{i=1}^s \left(J_p^i\right)^2 \tag{40}$$


defines the metric on the quotient space G/H, as defined by the s-dimensional vielbein $\hat{J}_p$ obtained from $J_p$ by projecting out the components along the fiber. At a practical level, this decomposition not only permits the computation of the metric on the quotient space, but also greatly simplifies the explicit computation of the metric on the whole group. To conclude, this method can be used to provide an explicit characterization of the geometry of the group and its quotients. If some of the geometrical properties of the group or its fibration are known from the theoretical framework, by comparison we can determine the range of the parameters. We will see an explicit example later.

3.3.2. A topological method

In general, however, the theoretical information about the group does not suffice to determine the explicit range for the parameters. In this case, we need to introduce an alternative method which requires minimal information to work. Fortunately, such a method, which we call the topological method, is provided by a powerful theorem due to I.G. Macdonald, which gives a simple way to compute the total volume of a compact connected simple Lie group. Let $\mathfrak{c} \subset \mathrm{Lie}(G)$ be a Cartan subalgebra, and $\mathfrak{c}_{\mathbb{Z}}$ the integer lattice generated in $\mathfrak{c}$ by a choice of simple roots (the root lattice). Then, the first geometrical ingredient is the torus $T := \mathfrak{c}/\mathfrak{c}_{\mathbb{Z}}$, whose dimension is $r = \mathrm{rank}\, \mathrm{Lie}(G)$. The second ingredient is a well known result due to Hopf [17]: the rational homology of G is equal to the rational homology of a product of odd-dimensional spheres,

$$H_*(G, \mathbb{Q}) \simeq H_*\!\left(\prod_{i=1}^k \left(S^{2i+1}\right)^{r_i}, \mathbb{Q}\right),$$

where $r_i$ is the number of times the given sphere appears, and $r_1 + \ldots + r_k = r$. The result of Macdonald [18] can thus be stated as follows: if we assign a Lebesgue measure $\mu$ on a compact simple Lie group G by means of a Euclidean scalar product $\langle\ ,\ \rangle$ on $\mathfrak{g} = \mathrm{Lie}(G)$, then the measure of the whole group is

$$\mu(G) = \mu_o(T) \cdot \prod_{i=1}^k \mathrm{Vol}\left(S^{2i+1}\right)^{r_i} \cdot \prod_{\alpha \in R(\mathfrak{g})} \frac{2}{|\alpha|}, \tag{41}$$

where $R(\mathfrak{g})$ is the set of non vanishing roots, $\mu_o$ is the Lebesgue measure on $\mathfrak{g}$ induced by the scalar product and $\mathrm{Vol}(S^{2i+1}) = 2\pi^{i+1}/i!$ is the volume of the unit sphere $S^{2i+1}$. On the other hand, we can in principle compute the measure of the whole group induced by the Killing scalar product by using (36) integrated over the range of the parameters. Using (39) we get

$$d\mu = |\det(J_p)|\, |\det(\tilde{J}_h)| \prod_{I=1}^n dx^I. \tag{42}$$

Now, we assume that we have a good parametrization at our disposal, which means that the one parameter subgroups selected by the orbits $\exp(t\tau_i)$, where $\{\tau_i\}$ is the selected basis of the Lie algebra, are embedded subgroups of G (footnote 1). Then such orbits are compact (for a compact group) and $\exp(t\tau_i)$ is periodic in t. The point is that if we correctly choose the range R for the parameters, then

$$\mu(G) = \int_R d\mu. \tag{43}$$

Footnote 1: This can even be accomplished for simple groups.

A suitable range corresponds to covering each point of G exactly once, apart from a subset of vanishing measure. Let us look at the measure weight $f := |\det(J)|$. In general it will depend explicitly on the parameters, but not on all of them. For each parameter which does not appear in f we choose its period as a range. Let us call $\bar{R}$ the range for the remaining parameters. Then $\bar{R}$ has a boundary defined by f = 0. This equation will in general define infinitely many fundamental regions, all equivalent for our purposes. With such a choice $R_o$ for the range, we are sure that its image under our parametrization map $g : R_o \to G$ will define a closed n-dimensional variety on G. As G is connected, $g(R_o)$ will cover G an integer number m of times:

$$m = \frac{1}{\mu(G)} \int_{R_o} d\mu. \tag{44}$$

If m > 1, it means that there exists an automorphism group $\Gamma : R_o \to R_o$ of order m, such that $g(\Gamma x) = g(x)$. In this case we determine the suitable range as

$$R = R_o/\Gamma. \tag{45}$$

This is indeed what we did at the end of section 3.1. Let us now illustrate our procedure for some exceptional examples.
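For the toy model SU(2) the counting (44) can be carried out explicitly. In the sketch below, which is only an illustration with hypothetical function names, the measure weight is taken to be the square root of the determinant of the metric (18), namely sin(θ)/8, and μ(G) is the volume 2π² of the unit three-sphere, consistently with (17). Taking the full period 4π for both φ and ψ gives m = 2, while halving the range of φ gives m = 1.

```python
import math

def su2_measure(phi_max, psi_max, steps=2000):
    """Integrate sin(theta)/8 over theta in [0, pi], phi in [0, phi_max],
    psi in [0, psi_max]; midpoint rule in theta, the other two integrals
    are trivial because the weight does not depend on phi and psi."""
    h = math.pi / steps
    theta_integral = sum(math.sin((i + 0.5) * h) for i in range(steps)) * h
    return theta_integral * phi_max * psi_max / 8.0

MU_G = 2 * math.pi ** 2  # volume of the unit three-sphere

def covering_number(phi_max, psi_max):
    """m = (1 / mu(G)) * integral of d mu over the trial range R_o, eq. (44)."""
    return su2_measure(phi_max, psi_max) / MU_G

m_full = covering_number(4 * math.pi, 4 * math.pi)     # trial range R_o
m_reduced = covering_number(2 * math.pi, 4 * math.pi)  # range R = R_o / Z_2
print(round(m_full, 6), round(m_reduced, 6))
```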

4. Generalized Euler Angles for G2

4.1. The Lie algebra

The exceptional group $G_2$ can be realized as the automorphism group of the octonionic algebra [19, 23]. In place of providing a theoretical proof of this fact, we will explicitly construct such a group starting from its Lie algebra, which will turn out to be of $G_2$ type. The octonionic algebra $\mathbb{O}$ is the eight dimensional real vector space generated by a real unit $e_0 \equiv 1$ and seven imaginary units $e_i$, $i = 1, \ldots, 7$. It is endowed with a distributive but non associative product defined by the relations

$$e_0 \cdot a = a \cdot e_0 = a \quad \forall a \in \mathbb{O}, \qquad e_i^2 = -e_0, \qquad e_i \cdot e_j = -e_j \cdot e_i, \quad 1 \leq i < j \leq 7,$$

and the Fano diagram, see Figure 1. Each oriented line can be thought of as an oriented circle, on which lie three distinct imaginary units $e_i, e_j, e_k$ whose products are $e_i \cdot e_j = \pm e_k$, the sign being positive iff the triple $\{e_i, e_j, e_k\}$ follows the orientation of the arrow. For example $e_1 \cdot e_3 = -e_2$ and $e_1 \cdot e_2 = e_3$. Note that each circle generates a quaternionic subalgebra.

Figure 1. The Fano diagram.

An automorphism of the algebra is an invertible linear map $A : \mathbb{O} \longrightarrow \mathbb{O}$ which must satisfy

$$A(a \cdot b) = A(a) \cdot A(b), \qquad a, b \in \mathbb{O}.$$

The set of all automorphisms is a group with respect to the composition product, and is indeed a Lie group. From this it follows immediately that its Lie algebra is the set of derivations $D(\mathbb{O})$, the linear operators

$$B : \mathbb{O} \longrightarrow \mathbb{O} \tag{46}$$

satisfying

$$B(a \cdot b) = B(a) \cdot b + a \cdot B(b), \qquad a, b \in \mathbb{O}, \tag{47}$$

with the commutator as Lie product. Note that B(1) = 0 for all $B \in D(\mathbb{O})$, so that we can search for a matrix representation of $D(\mathbb{O})$ on the real space spanned by the imaginary units. This will give the smallest fundamental representation of $G_2$, the 7 representation. Imposing the condition (47), with the help of a computer, we find a set of 14 linearly independent matrices

    C1 =     

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 1

0 0 0 0 0 1 0

0 0 0 0 −1 0 0

0 0 0 −1 0 0 0



    ,    



    C2 =     

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 −1 0

0 0 0 0 0 0 1

0 0 0 1 0 0 0

0 0 0 0 −1 0 0



    ,    

Exceptional Groups, Symmetric Spaces and Applications 

    C3 =      

    C5 =      

    C7 =      

   1  C9 = √  3    

C11

   1  = √  3    

C13

   1  = √  3   

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 1 0

0 0 0 0 0 0 −1

0 0 0 1 0 0 0

0 0 0 0 1 0 0

0 0 0 0 1 0 0

0 0 0 −1 0 0 0 0 0 0 0 0 0 0

0 −1 0 0 0 0 0

0 0 0 0 0 0 1

0 0 0 0 0 0 0

0 0 −1 0 0 0 0

0 0 0 0 0 −1 0

0 −1 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 1 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 −1

0 0 0 0 0 1 0

0 0 0 0 −1 0 0

0 0 0 1 0 0 0

0 0 0 2 0 0 0

0 0 0 0 0 0 1

0 0 0 0 0 −1 0

−2 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 1 0 0 0 0

0 −1 0 0 0 0 0

0 0 0 0 0 2 0

0 0 0 0 1 0 0

0 0 0 1 0 0 0

−2 0 0 0 0 0 0



    ,    

    ,    

−2 0 0 0 0 0 0

0 −1 0 0 0 0 0

    ,    



0 2 0 0 0 0 0

0 0 −1 0 0 0 0



0 0 0 0 0 0 0



    C4 =      

    C6 =      

   1  C8 = √  3   



    ,    

C10

0 0 0 0 0 −1 0

0 0 0 0 0 0 0

0 0 0 0 −1 0 0

0 0 0 1 0 0 0

0 0 2 0 0 0 0

0 −2 0 0 0 0 0

0 0 2 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 2 0 0

0 0 0 0 0 −1 0

0 0 0 0 0 0 −1

0 0 0 0 0 0 2

0 0 0 −1 0 0 0

0 0 0 0 1 0 0

0 0 0 0 0 0 0

   1  = √  3    

C12



    ,    

0 0 0 0 0 0 −1





    ,    

0 0 0 0 0 0 0

   1  = √  3    

C14

   1  = √  3   


0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 1 0 0 0 0

0 1 0 0 0 0 0

0 0 −1 0 0 0 0

0 1 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0 0 0 −1 0 0 −2 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0 0 0 0 1 0

0 0 0 0 0 0 1 0 0 0 0 0 0 0

0 1 0 0 0 0 0

0 0 0 0 0 0 1



    ,     

    ,     0 0 0 0 0 −1 0

0 0 0 −1 0 0 0 −2 0 0 0 0 0 0

0 0 −1 0 0 0 0

    ,     0 0 0 0 −1 0 0

0 1 0 0 0 0 0 0 0 0 0 0 0 0



0 0 1 0 0 0 0 −2 0 0 0 0 0 0



    ,     

    ,     

    .    

It is easy to check that these matrices define a Lie algebra with the commutator product, and that $\mathbb{R}^7$ is irreducible under their action, so that they realize an irreducible representation. A rank two Cartan subalgebra is generated by $C_5$ and $C_{11}$, and passing to the adjoint representation it is easy to compute all the roots, which turn out to be the roots of $G_2$, as expected (see for example [1]).
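The count of 14 independent derivations can be reproduced with a short exact computation. The sketch below is not the computation used by the authors: it assumes one common choice of oriented Fano triples (123, 145, 176, 246, 257, 347, 365), which agrees with $e_1 \cdot e_2 = e_3$ and $e_1 \cdot e_3 = -e_2$ but need not coincide with the orientation of Figure 1, and all function names are hypothetical. It imposes (47) on a generic 7 × 7 matrix acting on the imaginary units and computes the dimension of the solution space.

```python
from fractions import Fraction

# Assumed oriented triples (i, j, k), meaning e_i * e_j = e_k cyclically.
TRIPLES = [(1, 2, 3), (1, 4, 5), (1, 7, 6), (2, 4, 6),
           (2, 5, 7), (3, 4, 7), (3, 6, 5)]

def build_table():
    """table[i][j] = (sign, k) encodes e_i * e_j = sign * e_k."""
    table = [[None] * 8 for _ in range(8)]
    for i in range(8):
        table[0][i] = (1, i)
        table[i][0] = (1, i)
    for i in range(1, 8):
        table[i][i] = (-1, 0)
    for a, b, c in TRIPLES:
        for i, j, k in ((a, b, c), (b, c, a), (c, a, b)):
            table[i][j] = (1, k)
            table[j][i] = (-1, k)
    return table

def derivation_equations(table):
    """Linear equations B(e_i e_j) - B(e_i) e_j - e_i B(e_j) = 0 in the
    49 unknowns B[k][i], the e_k component of B(e_i), with B(e_0) = 0."""
    def var(k, i):
        return (k - 1) * 7 + (i - 1)
    rows = []
    for i in range(1, 8):
        for j in range(1, 8):
            eqs = [[0] * 49 for _ in range(8)]  # one row per component
            s, m = table[i][j]
            if m != 0:
                for k in range(1, 8):
                    eqs[k][var(k, m)] += s
            for k in range(1, 8):
                s1, c1 = table[k][j]   # term B(e_i) e_j
                eqs[c1][var(k, i)] -= s1
                s2, c2 = table[i][k]   # term e_i B(e_j)
                eqs[c2][var(k, j)] -= s2
            rows.extend(eqs)
    return rows

def rank(rows):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows if any(r)]
    rk = 0
    for col in range(49):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        pv = m[rk][col]
        m[rk] = [x / pv for x in m[rk]]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

DIM_DERIVATIONS = 49 - rank(derivation_equations(build_table()))
print(DIM_DERIVATIONS)
```

A basis of the null space of the same linear system would give explicit matrices playing the role of the $C_i$, up to a change of basis and normalization.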


4.2. Two Euler parameterizations

We can now realize two distinct Euler parameterizations for $G_2$, based on different choices of the maximal subgroup H. The first one is based on H = SU(3) [2], and the second one on H = SO(4) [1]. The first one will allow us to adopt the geometrical method, whereas for the second one we will need the topological method. We will call them the SU(3)-Euler parametrization and the SO(4)-Euler parametrization.

4.2.1. The SU(3)-Euler parametrization

Among the automorphisms of the octonions, we can look at the subgroup which fixes an imaginary unit. This is a subgroup of $G_2$ and is contained in the SO(6) group which rotates the remaining six imaginary units. Indeed, it turns out to be an SU(3) group. We can see it immediately from our matrices: the first 8 matrices have vanishing first row and column, so that they leave $e_1$ fixed. They generate a subalgebra, and the associated adjoint representation provides the roots of SU(3). The group $G_2$ acts transitively on the imaginary octonions of unit norm, which form a six dimensional sphere $S^6$, and SU(3) is the stabilizer of the point $e_1$, so that $G_2/SU(3) \simeq S^6$. We then choose $C_i$, $i = 1, 2, \ldots, 8$, as generators for H, so that $C_a$, $a = 9, \ldots, 14$, generate $\mathfrak{p}$. To identify V (see (30)) we note that $C_9$ generates all of $\mathfrak{p}$ under the action of H = SU(3), so that $V = \mathbb{R}C_9$. Finally, note that the subalgebra of $\mathrm{Lie}(H)$ commuting with $C_9$ is the su(2) algebra generated by $C_i$, $i = 1, 2, 3$. Thus B = SU(3)/SU(2). As a first step we need to construct the SU(3) subgroup H. We could proceed in the same way but, as the construction of SU(3) is well known, we report here the final result, see [20, 21, 2]:

$$H[x_1, \ldots, x_8] = e^{x_1 C_3}\, e^{x_2 C_2}\, e^{x_3 C_3}\, e^{x_4 C_5}\, e^{\sqrt{3} x_5 C_8}\, e^{x_6 C_3}\, e^{x_7 C_2}\, e^{x_8 C_3}, \tag{48}$$

with range

$$x_1 \in [0, \pi], \quad x_2 \in \left[0, \tfrac{\pi}{2}\right], \quad x_3 \in [0, \pi], \quad x_4 \in \left[0, \tfrac{\pi}{2}\right],$$
$$x_5 \in [0, 2\pi], \quad x_6 \in [0, 2\pi], \quad x_7 \in \left[0, \tfrac{\pi}{2}\right], \quad x_8 \in [0, \pi]. \tag{49}$$

We only note that (48) has the structure of (30) with

$$V = \mathbb{R}C_5, \qquad B = SO(3) = SU(2)/\mathbb{Z}_2, \qquad H = U(2). \tag{50}$$

Then, our SU(3)-Euler parametrization is

$$g[x_1, \ldots, x_{14}] = e^{x_1 C_3}\, e^{x_2 C_2}\, e^{x_3 C_3}\, e^{\frac{\sqrt{3}}{2} x_4 C_8}\, e^{x_5 C_5}\, e^{\frac{\sqrt{3}}{2} x_6 C_9}\, H[x_7, \ldots, x_{14}], \tag{51}$$

where we need to determine the range for $x_1, \ldots, x_6$, whereas the remaining parameters have the range of SU(3). To this aim, we will use the information $G_2/SU(3) \simeq S^6$. From

$$p[x_1, \ldots, x_6] = e^{x_1 C_3}\, e^{x_2 C_2}\, e^{x_3 C_3}\, e^{\frac{\sqrt{3}}{2} x_4 C_8}\, e^{x_5 C_5}\, e^{\frac{\sqrt{3}}{2} x_6 C_9} \tag{52}$$


we can compute $J_p = p^{-1}dp$ and then the metric (40) induced on the quotient. By a direct computation, we get

$$\frac{4}{3}\, d\sigma^2 = dx_6^2 + \sin^2 x_6 \left\{ dx_5^2 + \cos^2 x_5\, dx_4^2 + \sin^2 x_5 \left[ s_1^2 + s_2^2 + \left( s_3 + \frac{1}{2} dx_4 \right)^2 \right] \right\}, \tag{53}$$

where

$$s_1 = -\sin(2x_2)\cos(2x_3)\, dx_1 + \sin(2x_3)\, dx_2,$$
$$s_2 = \sin(2x_2)\sin(2x_3)\, dx_1 + \cos(2x_3)\, dx_2,$$
$$s_3 = \cos(2x_2)\, dx_1 + dx_3. \tag{54}$$

We can recognize this as the metric of a round six sphere $S^6$ of radius $\sqrt{3}/2$, with coordinates $(x_6, \vec{X})$, where $x_6$ is an azimuthal coordinate, $x_6 \in [0, \pi]$, and $\vec{X}$ covers a five sphere embedded in $\mathbb{C}^3$ via

$$\vec{X} = (z_1, z_2, z_3) = \left( \cos x_5\, e^{i x_4},\ \sin x_5 \cos x_2\, e^{i(x_1 + x_3 + \frac{x_4}{2})},\ \sin x_5 \sin x_2\, e^{i(x_1 - x_3 - \frac{x_4}{2})} \right),$$

$$x_1 \in [0, \pi], \quad x_2 \in \left[0, \tfrac{\pi}{2}\right], \quad x_3 \in [0, 2\pi], \quad x_4 \in [0, 2\pi], \quad x_5 \in \left[0, \tfrac{\pi}{2}\right].$$

Computing the metric $ds^2_{S^5} = |dz_1|^2 + |dz_2|^2 + |dz_3|^2$ in these coordinates we find

$$\frac{4}{3}\, d\sigma^2 = dx_6^2 + \sin^2 x_6\, ds^2_{S^5}. \tag{55}$$

This completes our identification of the range of the parameters.

4.2.2. The SO(4)-Euler parametrization

The maximal subgroup SO(4) can be singled out as follows. We know that $1, e_1, e_2, e_3$ generate a quaternionic subalgebra $\mathbb{H}$. We look at the subgroup H which leaves this subalgebra invariant. This will be generated by block diagonal matrices of the form $\{3 \times 3\} \oplus \{4 \times 4\}$. These are the matrices $C_i$, $i = 1, 2, 3, 8, 9, 10$. Indeed, $C_1, C_2, C_3$ generate an su(2) subalgebra which leaves invariant each element of $\mathbb{H}$. Let us call the corresponding group $SU(2)_I$. The matrices $C_8, C_9, C_{10}$ define a second SU(2) group, $SU(2)_{II}$, whose action restricted to $e_1, e_2, e_3$ generates the automorphisms of $\mathbb{H}$. Note that the two subgroups commute. We can then realize the surjective homomorphism

$$\varphi : SU(2)_I \times SU(2)_{II} \longrightarrow H, \qquad (a, b) \longmapsto ab.$$

The kernel $\mathrm{Ker}\,\varphi$ is the $\mathbb{Z}_2$ subgroup generated by the element $(\exp(\pi C_1), \exp(\sqrt{3}\pi C_8)) = (z, z)$ with $z = \mathrm{diag}\{I_3, -I_4\}$, $I_n$ being the n × n identity matrix. Thus

$$H \equiv SU(2)_I \times SU(2)_{II}/\mathbb{Z}_2 = SO(4). \tag{56}$$

As for SU(3), the construction of SO(4) is very easy and, starting from the construction of $SU(2)_I$ and $SU(2)_{II}$, we get

$$H(x_1, \ldots, x_6) = e^{x_1 C_3}\, e^{x_2 C_2}\, e^{x_3 C_3}\, e^{\sqrt{3} x_4 C_8}\, e^{\sqrt{3} x_5 C_9}\, e^{\sqrt{3} x_6 C_8}, \tag{57}$$

with range

$$x_1 \in [0, 2\pi], \quad x_2 \in [0, \pi/2], \quad x_3 \in [0, \pi],$$
$$x_4 \in [0, \pi], \quad x_5 \in [0, \pi/2], \quad x_6 \in [0, \pi]. \tag{58}$$

We also know that $C_5$ and $C_{11}$ generate a Cartan subalgebra, not contained in Lie(H). The action of H on this Cartan subalgebra generates the complement of Lie(H), so that we can take $V = \mathbb{R}C_5 \oplus \mathbb{R}C_{11}$. Finally, because $\dim B = \dim G_2 - \dim H - \dim V = 6 = \dim H$, we expect the subgroup $H_o$ of H which commutes with $\exp V$ to be a finite group. This means that the SO(4)-Euler parametrization will take the form

$$g[x_1, \ldots, x_{14}] = H(x_1, \ldots, x_6)\, e^{\sqrt{3} x_7 C_{11} + x_8 C_5}\, H(x_9, \ldots, x_{14}), \tag{59}$$

where $x_9, \ldots, x_{14}$ will take the range of the whole SO(4), whereas the range of the first six parameters will be restricted by the action of $H_o$. Before determining $H_o$, we remark that in this case the quotient manifold $M = G_2/SO(4)$ is known to be the eight-dimensional variety of the quaternionic subalgebras of $\mathbb{O}$. Unfortunately, we cannot use this information as before, because an invariant metric on M (independent of the one we can compute by group theory) is not known, so we must use the topological method. Let us now proceed with the determination of $H_o$. This is the subgroup of the 7 × 7 orthogonal matrices A of SO(4) whose adjoint action leaves the Cartan subalgebra invariant:

$$A\, C_i\, A^t = C_i, \qquad i = 5, 11. \tag{60}$$

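A direct computation shows that it is the finite group $\mathbb{Z}_2 \times \mathbb{Z}_2$ generated by the involutory matrices $\sigma$ ($\sigma = \sigma^{-1}$) and $\eta$ ($\eta = \eta^{-1}$)

$$\sigma = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1
\end{pmatrix}, \qquad
\eta = \begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & -1 & 0
\end{pmatrix}. \tag{61}$$

We need to look at the action of $H_o$ on $H$ to reduce the range of $x_1, \dots, x_6$. Starting with $\sigma$ we see that

$$\begin{aligned}
g &= H(x_1, x_2, x_3, x_4, x_5, x_6)\, \sigma e^{V} \sigma\, H(x_9, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}) \\
&= H\!\left(x_1, x_2, x_3 + \tfrac{\pi}{2}, x_4, x_5, x_6 + \tfrac{\pi}{2}\right) e^{V} H\!\left(x_9 + \tfrac{\pi}{2}, x_{10}, x_{11}, x_{12} + \tfrac{\pi}{2}, x_{13}, x_{14}\right).
\end{aligned} \tag{62}$$

This shows that we can restrict $0 \le x_6 < \pi/2$ to avoid redundancies. A similar computation can be done for the action of $\eta$, showing that redundancies are avoided by restricting $x_2 \in [0, \pi/4]$. The details can be found in [1]. At this point we have partially determined the ranges

$$\begin{aligned}
&x_1 \in [0, 2\pi], && x_2 \in [0, \pi/4], && x_3 \in [0, \pi], \\
&x_4 \in [0, \pi], && x_5 \in [0, \pi/2], && x_6 \in [0, \pi/2], \\
&x_9 \in [0, 2\pi], && x_{10} \in [0, \pi/2], && x_{11} \in [0, \pi], \\
&x_{12} \in [0, \pi], && x_{13} \in [0, \pi/2], && x_{14} \in [0, \pi].
\end{aligned} \tag{63}$$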
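The group structure claimed for $H_o$ can be checked mechanically. The sketch below is only illustrative: it assumes that the entries of (61) are $\sigma = \mathrm{diag}(1,-1,-1,1,1,-1,-1)$ and the signed permutation matrix $\eta$ written in the code (the names `SIGMA`, `ETA`, `matmul` and `det` are introduced here and are not part of the original computation). It verifies that both matrices are involutions of determinant one, that they commute, and that they generate four distinct elements.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if m[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return d

IDENTITY = [[1 if i == j else 0 for j in range(7)] for i in range(7)]

# Assumed reading of (61): sigma is diagonal, eta is a signed permutation.
SIGMA = [[s if i == j else 0 for j in range(7)]
         for i, s in enumerate([1, -1, -1, 1, 1, -1, -1])]
ETA = [[-1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, -1, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, -1],
       [0, 0, 0, 0, 0, -1, 0]]

# The four elements of the group generated by sigma and eta.
GROUP = [IDENTITY, SIGMA, ETA, matmul(SIGMA, ETA)]
```

Under these assumptions the four matrices in `GROUP` are pairwise distinct, which is what a $\mathbb{Z}_2 \times \mathbb{Z}_2$ structure requires. The check says nothing about the invariance condition (60), which needs the explicit $C_5$ and $C_{11}$.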

To apply the topological method we must now determine the form of the invariant measure. This is easily computed using (42) and (59) (possibly with the help of Mathematica):

$$d\mu = 27\sqrt{3}\, f(2x_7, 2x_8) \sin(2x_2) \sin(2x_5) \sin(2x_{10}) \sin(2x_{13}) \prod_{i=1}^{14} dx_i, \tag{64}$$

where

$$\begin{aligned}
f(\alpha, \beta) &= \sin\!\left(\frac{\beta - \alpha}{2}\right) \sin\!\left(\frac{\beta + \alpha}{2}\right) \sin\!\left(\frac{\beta - 3\alpha}{2}\right) \sin\!\left(\frac{\beta + 3\alpha}{2}\right) \sin(\alpha) \sin(\beta) \\
&= \frac{1}{4} (\cos\alpha - \cos\beta)(\cos 3\alpha - \cos\beta) \sin\alpha \sin\beta.
\end{aligned} \tag{65}$$

We see that for certain values of the angles $x_2, x_5, x_7, x_8, x_{10}, x_{13}$ the measure (64) vanishes. Apart from $x_7, x_8$, however, this happens only on the boundary of the chosen ranges. This means that the condition of non-vanishing measure determines the range for $x_7, x_8$ by means of the inequality $f(2x_7, 2x_8) > 0$.

Note that the period of $e^{x C_5}$, as for $e^{\sqrt{3} x C_{11}}$, is $2\pi$, so that we must solve this inequality inside the square $[0, 2\pi] \times [0, 2\pi]$. This provides a tiling of the square, but it is easy to see that all the regions of such a tiling are equivalent, so that we can choose any one, see [1]. We choose

$$x_7 \in [0, \pi/6], \qquad 3x_7 \le x_8 \le \pi/2. \tag{66}$$

Our choice for the range $R$ determines a covering $\mathcal{G}$ of $G_2$, whose volume is easily computed to be

$$\mathrm{Vol}(\mathcal{G}) = \int_R d\mu = 9\sqrt{3}\, \frac{\pi^8}{20}. \tag{67}$$

The final step consists in evaluating the volume of $G_2$ by means of the formula of Macdonald (41). This can be easily applied to our case, to give exactly $\mathrm{Vol}(G_2) = \mathrm{Vol}(\mathcal{G})$, so that we see that, with our range, the group is covered exactly once. Instead of showing the details of such calculations here, we use the previously determined $SU(3)$-parametrization to compute the volume of $G_2$. In that case, the measure was

$$d\mu^{(SU(3))}_{G_2} = \frac{27}{32} \sin^5 x_6 \cos x_5 \sin^3 x_5 \sin(2x_2)\, d\mu_{SU(3)}\, dx_6\, dx_5\, dx_4\, dx_3\, dx_2\, dx_1,$$

so that

$$\mathrm{Vol}(G_2) = 9\sqrt{3}\, \frac{\pi^8}{20}, \tag{68}$$

as expected.
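The value (67) can be reproduced numerically from (63), (64) and (66). The following is a minimal sketch, not the computation used in [1]: the function names and the midpoint-rule discretization are choices made here for illustration. The eight angles that do not enter the measure contribute the lengths of their ranges, the four factors $\sin(2x_i)$ integrate to $1/2$ or $1$, and the only non-trivial piece is the two dimensional integral of $f(2x_7, 2x_8)$.

```python
import math

def f(alpha, beta):
    """Product form of f in (65)."""
    return (math.sin((beta - alpha) / 2) * math.sin((beta + alpha) / 2)
            * math.sin((beta - 3 * alpha) / 2) * math.sin((beta + 3 * alpha) / 2)
            * math.sin(alpha) * math.sin(beta))

def f_factored(alpha, beta):
    """Second form of f in (65)."""
    return (0.25 * (math.cos(alpha) - math.cos(beta))
            * (math.cos(3 * alpha) - math.cos(beta))
            * math.sin(alpha) * math.sin(beta))

def cartan_integral(n=300):
    """Midpoint rule for f(2*x7, 2*x8) over 0 <= x7 <= pi/6, 3*x7 <= x8 <= pi/2."""
    total = 0.0
    h7 = (math.pi / 6) / n
    for i in range(n):
        x7 = (i + 0.5) * h7
        lo, hi = 3 * x7, math.pi / 2
        h8 = (hi - lo) / n
        row = sum(f(2 * x7, 2 * (lo + (j + 0.5) * h8)) for j in range(n))
        total += row * h8 * h7
    return total

def covering_volume(n=300):
    """Integral of the measure (64) over the ranges (63) and (66)."""
    pi = math.pi
    # x1, x3, x4, x6, x9, x11, x12, x14 do not appear in the measure.
    flat = (2 * pi) * pi * pi * (pi / 2) * (2 * pi) * pi * pi * pi
    # sin(2*x2) on [0, pi/4] gives 1/2; sin(2*x5), sin(2*x10), sin(2*x13) give 1.
    sines = 0.5
    return 27 * math.sqrt(3) * flat * sines * cartan_integral(n)

expected_volume = 9 * math.sqrt(3) * math.pi ** 8 / 20
```

With these ranges the two dimensional integral evaluates to $1/60$, so that the product is $27\sqrt{3}\,\pi^8/60 = 9\sqrt{3}\,\pi^8/20$, in agreement with (67).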

5. Generalized Euler Angles for F4

A very simple construction for the Lie algebras of the exceptional Lie groups $F_4$ and $E_6$ is suggested by a theorem of Chevalley and Schafer [22], which states:

Theorem 5.1. The exceptional simple Lie algebra $\mathfrak{f}_4$ of dimension 52 and rank 4 over $K$ is the derivation algebra $\mathfrak{D}$ of the exceptional Jordan algebra $\mathfrak{J}$ of dimension 27 over $K$. The exceptional simple Lie algebra $\mathfrak{e}_6$ of dimension 78 and rank 6 over $K$ is the Lie algebra

$$\mathfrak{D} + \{R_Y\}, \qquad \mathrm{Tr}\, Y = 0, \tag{69}$$

spanned by the derivations of $\mathfrak{J}$ and the right multiplications by elements $Y$ of trace 0.

See also [23]. To make it workable we must explain the main ingredients. For our purposes, $K = \mathbb{R}$. The exceptional Jordan algebra is the 27 dimensional real vector space spanned by the $3 \times 3$ octonionic hermitian matrices, endowed with the commutative product

$$A \circ B := \frac{1}{2}(AB + BA), \tag{70}$$

that is, the symmetrization of the usual matrix product. Searching for the derivation algebra of $\mathfrak{J}$, we would then find a 27 dimensional representation of the Lie algebra of $F_4$. However, it admits a decomposition in irreducible subspaces $\mathbb{R}^{27} = \mathbb{R}^{26} \oplus \mathbb{R}$, which is provided by the homomorphism

$$\ell : \mathfrak{J} \longrightarrow \mathbb{R}, \qquad A \mapsto \sum_{i=1}^{3} A_{ii}. \tag{71}$$

Its kernel is a 26 dimensional invariant subspace. We could restrict to this space from the beginning but, because the 27 dimensional representation will extend to an irreducible representation of an $E_6$ algebra, we prefer to work with the whole space. To concretely construct the representation, let us first realize an explicit isomorphism between the space of exceptional Jordan matrices and $\mathbb{R}^{27}$:

$$\Phi : \mathfrak{J} \longrightarrow \mathbb{R}^{27}, \qquad
\begin{pmatrix} a_1 & o_1 & o_2 \\ o_1^* & a_2 & o_3 \\ o_2^* & o_3^* & a_3 \end{pmatrix}
\longmapsto
\begin{pmatrix} a_1 \\ \rho(o_1) \\ \rho(o_2) \\ a_2 \\ \rho(o_3) \\ a_3 \end{pmatrix}, \tag{72}$$

where $a_i$, $i = 1, 2, 3$, are real numbers, $o_i$, $i = 1, 2, 3$, are octonions and

$$\rho : \mathbb{O} \longrightarrow \mathbb{R}^{8}, \qquad \sum_{i=0}^{7} o^i e_i \mapsto \left(o^0, o^1, o^2, o^3, o^4, o^5, o^6, o^7\right)^t. \tag{73}$$

In this way, the set of derivations $\mathfrak{D}$ is mapped into the set of endomorphisms of $\mathbb{R}^{27}$. Indeed, choosing $A_i = \Phi^{-1}(r_i)$, $r_i \in \mathbb{R}^{27}$, the identity

$$J(A \circ B) = J(A) \circ B + A \circ J(B) \tag{74}$$

for a derivation $J$ provides a set of equations for the $27 \times 27$ matrix $M := \Phi J \Phi^{-1}$. These equations can be easily solved by means of a computer. This gives a set of 52 linearly independent matrices. The matrices, with their structure constants and the program generating them, can be found in [3]. We chose to normalize the matrices with the conditions $-\frac{1}{6} \mathrm{Tr}(M_I M_J) = \delta_{IJ}$, $I, J = 1, \dots, 52$, and $[M_i, M_j] = -\sum_{k=1}^{3} \epsilon_{ijk} M_k$ for $i, j \in \{1, 2, 3\}$.

Now, we need to recognize the $26 \oplus 1$ decomposition into irreducible representations. We said that this is determined by the kernel of the map $\ell$ defined in (71). Looking at the map $\Phi$, we see that the orthogonal complement of $\ker(\ell \circ \Phi^{-1})$ is $\mathbb{R} f_{27}$, where

$$f_{27} = (e_1 + e_{18} + e_{27})/\sqrt{3}, \tag{75}$$

and $e_a$, $a = 1, \dots, 27$, is the standard basis of $\mathbb{C}^{27}$. Indeed, $f_{27} \in \ker M_I$ for all $I = 1, \dots, 52$. With respect to the new basis $\{f_a\}_{a=1}^{27}$ for $\mathbb{C}^{27}$,

$$f_1 = (e_1 - e_{18})/\sqrt{2}, \tag{76}$$
$$f_{18} = (e_1 + e_{18} - 2e_{27})/\sqrt{6}, \tag{77}$$
$$f_{27} = (e_1 + e_{18} + e_{27})/\sqrt{3}, \tag{78}$$
$$f_a = e_a \quad \text{in the other cases}, \tag{79}$$

all matrices will have the last row and column null, thus making the decomposition evident. We will call $c_i$, $i = 1, \dots, 52$, the resulting $27 \times 27$ matrices. The 26 dimensional representation can thus be obtained simply by deleting from each matrix the last row and the last column. However, we remark here that the $27 \times 27$ matrices will constitute the first 52 elements of the 27 dimensional irreducible representation of $E_6$.

Before starting with the construction of the group, let us pause to look at some properties of the algebra. Starting from the constructed matrices we can easily construct the 52 dimensional adjoint representation. Let us call $C_i$ the corresponding matrices. We can easily check that the associated Killing form is negative definite, and indeed $K_{ij} \propto \delta_{ij}$, so that we can choose the constant so as to fix the Euclidean metric as invariant metric. A possible choice for a Cartan subalgebra is $H = \mathbb{R}C_1 \oplus \mathbb{R}C_6 \oplus \mathbb{R}C_{15} \oplus \mathbb{R}C_{36}$, and the roots can then be easily computed by simultaneously diagonalizing the generators $C_a$, $a = 1, 6, 15, 36$,

$$C_a \vec{v}_i = \lambda_{a,i} \vec{v}_i, \qquad i = 1, \dots, 27, \qquad \vec{v}_i \in \mathbb{C}^{52}.$$

The resulting vectors $(\lambda_{1,i}, \lambda_{6,i}, \lambda_{15,i}, \lambda_{36,i})$, $i = 1, \dots, 27$, represent the roots, which indeed coincide with the roots of $F_4$. Thus we have obtained a compact form of $F_4$.

To construct the group it is useful to look at the subalgebras. Looking at the commutators, we see that the first 21 matrices generate an $so(7)$ subalgebra, whose $so(i)$ subalgebras, with $i = 6, 5, 4, 3$, are generated by the first $i(i-1)/2$ matrices, respectively. A possible choice for the relative Cartan subalgebras is $C_1$ for $so(3)$, $C_1, C_6$ for $so(4)$ and $so(5)$, and $C_1, C_6, C_{15}$ for $so(6)$ and $so(7)$. This can be used to compute the roots and check the algebras. If to $so(7)$ we add the matrices $c_i$, with $i = 30, \dots, 36$, we obtain an $so(8)$ subalgebra. This is the Lie algebra associated to the $Spin(8)$ subgroup of $F_4$ which leaves invariant the three Jordan matrices $J_i$, $i = 1, 2, 3$, where $J_i$ has $\{J_i\}_{ii} = 1$ as the unique non-vanishing entry. Indeed, we find that the $\Phi(J_i)$ are in the kernel of the $so(8)$ matrices. The $so(8)$ algebra can be extended to an $so(9)$ subalgebra in three different ways. The algebra $so(9)_1$ is obtained by adding $c_{45}, \dots, c_{52}$ to $so(8)$, and corresponds to the subgroup $Spin(9)_1$ of $F_4$ which leaves $J_1$ invariant. The algebra $so(9)_2$ is obtained by adding $c_{37}, \dots, c_{44}$ to $so(8)$, and corresponds to the subgroup $Spin(9)_2$ of $F_4$ which leaves $J_2$ invariant. The algebra $so(9)_3$ is obtained by adding $c_{22}, \dots, c_{29}$ to $so(8)$, and corresponds to the subgroup $Spin(9)_3$ of $F_4$ which leaves $J_3$ invariant. Again, this can be checked by applying the given matrices to $\Phi(J_1)$, $\Phi(J_2)$ and $\Phi(J_3)$ respectively. We will use $Spin(9)_1$, which we will refer to simply as $Spin(9)$. Finally, recall that if $\mathfrak{p}$ is the linear complement of $so(9)$ in $F_4$, from ad-invariance and orthogonality it follows

$$[so(9), \mathfrak{p}] \subset \mathfrak{p}, \tag{80}$$
$$[\mathfrak{p}, \mathfrak{p}] \subset so(9). \tag{81}$$

5.1. The generalized Spin(9)-Euler construction

5.1.1. The maximal subgroup

As maximal subgroup of $F_4$ we take $H = Spin(9)$. In particular, as we said, we identified it with $Spin(9)_1$. Then, the complementary subspace $\mathfrak{p}$ is the 16 dimensional real vector space generated by the matrices $c_i$, with $i = 22, \dots, 29, 37, \dots, 44$. A look at the structure constants shows that as subspace $V$ (see (30)) we can take any one dimensional subspace of $\mathfrak{p}$. We choose $c_{22}$ as generator for $V$. Since $\dim B = \dim G - \dim H - \dim V = 15$, so that $\dim H_o = \dim H - \dim B = 21$, we expect $H_o$ to be a $Spin(7)$ subgroup of $Spin(9)$. To check that this is true, let us first recall that the first 21 matrices generate an $so(7)$ algebra. We are now able to construct a new set of 21 generators $\tilde{c}_i$, $i = 1, \dots, 21$, which commute with $c_{22}$ and have the same structure constants as the $c_i$. To this end we start with the $so(8)$ subalgebra generated by $c_I$, $I = 1, \dots, 21, 30, \dots, 36$. From $c_\alpha$, $\alpha = 30, \dots, 36$, we can generate all the matrices of $so(7)$ as

$$c_{\frac{k(k-1)}{2} + i + 1} = [c_{30+i}, c_{30+k}], \qquad k = 1, \dots, 6, \quad i = 0, \dots, k - 1. \tag{82}$$

Note that for $a, b \in \{22, \dots, 29\}$ the commutator $[c_a, c_b]$ is a combination of four elements of $so(8)$, all having the same commutator with $c_{22}$. With this in mind, let us define

$$\tilde{c}_{30+i} := -[c_{22}, c_{23+i}], \qquad i = 0, \dots, 7, \tag{83}$$

and then

$$\tilde{c}_{\frac{k(k-1)}{2} + i + 1} = [\tilde{c}_{30+i}, \tilde{c}_{30+k}], \qquad k = 1, \dots, 6, \quad i = 0, \dots, k - 1. \tag{84}$$
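The index bookkeeping behind (82) and (84) is easy to get wrong, so a short check may help. The sketch below only manipulates labels, not matrices; the helper `so7_index` and the dictionaries are names introduced here for illustration. It confirms that the label $k(k-1)/2 + i + 1$ runs exactly once over $1, \dots, 21$, and that the labels quoted in the text partition the 52 generators into $so(9)_1$ and its 16 dimensional complement.

```python
def so7_index(k, i):
    """Label k(k-1)/2 + i + 1 on the left-hand side of (82) and (84)."""
    return k * (k - 1) // 2 + i + 1

# Map each so(7) label to the pair of so(8) labels whose commutator defines it.
commutator_pairs = {so7_index(k, i): (30 + i, 30 + k)
                    for k in range(1, 7) for i in range(k)}

# Label sets quoted in the text.
so7 = set(range(1, 22))
so8 = so7 | set(range(30, 37))
so9 = so8 | set(range(45, 53))                    # so(9)_1
complement = set(range(22, 30)) | set(range(37, 45))
```

For instance, `commutator_pairs[1]` is `(30, 31)`, that is $c_1 = [c_{30}, c_{31}]$, and the dimensions come out as $21$, $28$, $36$ and $16$, as expected for $so(7)$, $so(8)$, $so(9)$ and $\mathfrak{p}$.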

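Thus, the matrices $\tilde{c}_I$, $I = 1, \dots, 21, 30, \dots, 36$, have exactly the same structure constants as the $c_I$, and $[\tilde{c}_i, c_{22}] = 0$ for $i = 1, \dots, 21$. This is the $so(7)$ we were searching for; let us call it $\mathfrak{h}_o$, so that $H_o = \exp(\mathfrak{h}_o)$. In order to apply (30) we need to construct $H$. This can be done using the same methodology, choosing $SO(8)$ as a maximal subgroup, which can be constructed from its $SO(7)$ maximal subgroup, and so on. To spare the reader tedious repetitions, we give here only the final expression for $H$:

$$\begin{aligned}
Spin(9)[x_1, \dots, x_{36}] ={}& e^{x_1 c_3} e^{x_2 c_{16}} e^{x_3 c_{15}} e^{x_4 c_{35}} e^{x_5 c_5} e^{x_6 c_1} e^{x_7 c_{30}} e^{x_8 c_{45}} \\
& e^{x_9 c_3} e^{x_{10} c_{16}} e^{x_{11} c_{15}} e^{x_{12} c_{35}} e^{x_{13} c_5} e^{x_{14} c_1} e^{x_{15} c_{30}} \\
& e^{x_{16} c_3} e^{x_{17} c_5} e^{x_{18} c_4} e^{x_{19} c_7} e^{x_{20} c_{11}} e^{x_{21} c_{16}} e^{x_{22} c_3} e^{x_{23} c_5} e^{x_{24} c_4} e^{x_{25} c_7} e^{x_{26} c_{11}} \\
& e^{x_{27} c_3} e^{x_{28} c_5} e^{x_{29} c_4} e^{x_{30} c_7} e^{x_{31} c_3} e^{x_{32} c_5} e^{x_{33} c_4} e^{x_{34} c_3} e^{x_{35} c_2} e^{x_{36} c_3},
\end{aligned} \tag{85}$$

with ranges

$$\begin{aligned}
&x_i \in [0, 2\pi], && i = 1, 2, 3, 9, 10, 11, 16, 22, 27, 31, 34, \\
&x_i \in [0, \pi], && i = 4, 8, 12, 17, 21, 23, 26, 28, 30, 32, 33, 35, \\
&x_i \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && i = 5, 13, 18, 19, 20, 24, 25, 29, \\
&x_i \in \left[0, \tfrac{\pi}{2}\right], && i = 6, 7, 14, 15, \\
&x_{36} \in [0, 4\pi],
\end{aligned} \tag{86}$$

and measure

$$\begin{aligned}
d\mu_{Spin(9)}[x_1, \dots, x_{36}] ={}& \sin x_4 \cos x_5 \cos x_6 \sin^2 x_6 \cos^4 x_7 \sin^2 x_7 \sin^7 x_8 \\
& \sin x_{12} \cos x_{13} \cos x_{14} \sin^2 x_{14} \cos^2 x_{15} \sin^4 x_{15} \\
& \sin x_{17} \cos^2 x_{18} \cos^3 x_{19} \cos^4 x_{20} \sin^5 x_{21} \sin x_{23} \cos^2 x_{24} \cos^3 x_{25} \sin^4 x_{26} \\
& \sin x_{28} \cos^2 x_{29} \sin^3 x_{30} \sin x_{32} \sin^2 x_{33} \sin x_{35} \prod_{i=1}^{36} dx_i.
\end{aligned} \tag{87}$$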

5.1.2. The whole F4

To construct the quotient $B$ we need to identify the subgroup $SO(7)$ in $H$. We have seen that this group is generated by the matrices $\tilde{c}_i$, $i = 1, \dots, 21$, and that these matrices satisfy the same commutation relations as the $c_i$. Thus, if we are able to extend this to the whole $so(9)$ algebra, we can use the same expression (85) with the $\tilde{c}_i$ matrices for our purpose. Fortunately, we see that it suffices to add the matrices

$$\tilde{c}_i = c_{i+8}, \quad i = 22, \dots, 28, \qquad \tilde{c}_i = c_{i+16}, \quad i = 29, \dots, 36, \tag{88}$$

to obtain the desired set $\tilde{c}_a$, $a = 1, \dots, 36$, generating the $Spin(9)$ group. Because the last 21 exponentials in (85) generate the group $SO(7) = H_o$, we get

$$\begin{aligned}
B[x_1, \dots, x_{15}] ={}& e^{x_1 \tilde{c}_3} e^{x_2 \tilde{c}_{16}} e^{x_3 \tilde{c}_{15}} e^{x_4 \tilde{c}_{35}} e^{x_5 \tilde{c}_5} e^{x_6 \tilde{c}_1} e^{x_7 \tilde{c}_{30}} e^{x_8 \tilde{c}_{45}} \\
& e^{x_9 \tilde{c}_3} e^{x_{10} \tilde{c}_{16}} e^{x_{11} \tilde{c}_{15}} e^{x_{12} \tilde{c}_{35}} e^{x_{13} \tilde{c}_5} e^{x_{14} \tilde{c}_1} e^{x_{15} \tilde{c}_{30}}.
\end{aligned} \tag{89}$$

Thus, the resulting Euler parametrization of $F_4$ is

$$F_4[x_1, \dots, x_{52}] = B[x_1, \dots, x_{15}]\, e^{x_{16} c_{22}}\, Spin(9)[x_{17}, \dots, x_{52}]. \tag{90}$$

It remains to determine the range for the parameters $x_1, \dots, x_{16}$, the other ranges being those of $Spin(9)$. We will apply the topological method. To this end, we need to compute $\det(J_p)$ as in (42). This is a quite involved computation, and requires some technical tricks to be performed. We refer to [3] for the details. The resulting measure is

$$d\mu_{F_4}[x_1, \dots, x_{52}] = d\mu_o[x_1, \dots, x_{16}]\, d\mu_{Spin(9)}[x_{17}, \dots, x_{52}], \tag{91}$$

$$\begin{aligned}
d\mu_o[x_1, \dots, x_{16}] ={}& 2^7 \cos^7\frac{x_{16}}{2} \sin^{15}\frac{x_{16}}{2}\, \sin x_4 \cos x_5 \cos x_6 \sin^2 x_6 \cos^4 x_7 \sin^2 x_7 \sin^7 x_8 \\
& \cdot \sin x_{12} \cos x_{13} \cos x_{14} \sin^2 x_{14} \cos^2 x_{15} \sin^4 x_{15} \prod_{i=1}^{16} dx_i.
\end{aligned} \tag{92}$$

From this we can select the ranges along the lines explained in section 3.3.2. Note that the exponentials are trigonometric functions of $x_i/2$, so that the periods are $4\pi$ and we should take the range $x_i \in [0, 4\pi]$ for $i = 1, 2, 3$ and $i = 9, 10, 11$. However, for all $\tilde{c}_i \in so(7)$, $e^{2\pi \tilde{c}_i}$ commutes with $\tilde{c}_j$ and with $c_{22}$, so that it can be reabsorbed in the $Spin(9)$ factor of $F_4$ and these periods can be reduced to $[0, 2\pi]$. The ranges determined by the topological method are then

$$\begin{aligned}
&x_1 \in [0, 2\pi], && x_2 \in [0, 2\pi], && x_3 \in [0, 2\pi], && x_4 \in [0, \pi], \\
&x_5 \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && x_6 \in \left[0, \tfrac{\pi}{2}\right], && x_7 \in \left[0, \tfrac{\pi}{2}\right], && x_8 \in [0, \pi], \\
&x_9 \in [0, 2\pi], && x_{10} \in [0, 2\pi], && x_{11} \in [0, 2\pi], && x_{12} \in [0, \pi], \\
&x_{13} \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && x_{14} \in \left[0, \tfrac{\pi}{2}\right], && x_{15} \in \left[0, \tfrac{\pi}{2}\right], && x_{16} \in [0, \pi].
\end{aligned} \tag{93}$$

With this range, we cover the whole group at least once. Let us call $M$ the obtained homological cycle. Integrating the measure on the full range we obtain

$$\mu(M) = \frac{2^{26} \cdot \pi^{28}}{3^7 \cdot 5^4 \cdot 7^2 \cdot 11}. \tag{94}$$
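Since the measure factorizes into one dimensional trigonometric integrals, the value (94) can be cross-checked with a few lines of code. The sketch below is an illustration under stated assumptions, not the computation of [3]: it takes the exponents of (87) and (92) and the ranges of (86) and (93) as transcribed in the tables `BLOCK15` and `TAIL21` (names introduced here), and integrates each factor with a composite Simpson rule.

```python
import math

PI, HALF = math.pi, math.pi / 2

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def trig(p, q, a, b):
    """Integral of sin(x)**p * cos(x)**q over [a, b]."""
    return simpson(lambda x: math.sin(x) ** p * math.cos(x) ** q, a, b)

FULL = (0, 0, 0, 2 * PI)  # an angle on [0, 2*pi] that does not enter the measure

# (sin power, cos power, lower, upper) for the 15 angles shared by (87) and (92).
BLOCK15 = ([FULL] * 3 + [(1, 0, 0, PI), (0, 1, -HALF, HALF), (2, 1, 0, HALF),
                         (2, 4, 0, HALF), (7, 0, 0, PI)]
           + [FULL] * 3 + [(1, 0, 0, PI), (0, 1, -HALF, HALF), (2, 1, 0, HALF),
                           (4, 2, 0, HALF)])

# Remaining 21 angles x16..x36 of Spin(9), read off from (86) and (87).
TAIL21 = [FULL, (1, 0, 0, PI), (0, 2, -HALF, HALF), (0, 3, -HALF, HALF),
          (0, 4, -HALF, HALF), (5, 0, 0, PI),
          FULL, (1, 0, 0, PI), (0, 2, -HALF, HALF), (0, 3, -HALF, HALF),
          (4, 0, 0, PI),
          FULL, (1, 0, 0, PI), (0, 2, -HALF, HALF), (3, 0, 0, PI),
          FULL, (1, 0, 0, PI), (2, 0, 0, PI),
          FULL, (1, 0, 0, PI), (0, 0, 0, 4 * PI)]

def product(factors):
    """Product of the one dimensional integrals listed in `factors`."""
    return math.prod(trig(*fac) for fac in factors)

vol_spin9 = product(BLOCK15 + TAIL21)
# Factor of x16 in (92), integrated over [0, pi].
x16_factor = simpson(
    lambda x: 2 ** 7 * math.cos(x / 2) ** 7 * math.sin(x / 2) ** 15, 0, PI)
vol_f4 = product(BLOCK15) * x16_factor * vol_spin9
expected_f4 = 2 ** 26 * PI ** 28 / (3 ** 7 * 5 ** 4 * 7 ** 2 * 11)
```

With this reading, the $Spin(9)$ factor alone integrates to $2^{17}\pi^{20}/(3^4 \cdot 5^2 \cdot 7)$ and the first sixteen angles to $2^{9}\pi^{8}/(3^3 \cdot 5^2 \cdot 7 \cdot 11)$, whose product is the value quoted in (94).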

To be sure that we covered $F_4$ exactly once, we must compute the volume of the group by means of the Macdonald formula. Its Betti numbers were computed in [24]. For $F_4$ there are four free generators for the rational homology, corresponding to four spheres having dimensions

$$d_1 = 3, \quad d_2 = 11, \quad d_3 = 15, \quad d_4 = 23. \tag{95}$$

They contribute to the volume with a term

$$\mathrm{Vol}(S^3)\,\mathrm{Vol}(S^{11})\,\mathrm{Vol}(S^{15})\,\mathrm{Vol}(S^{23}) = 2\pi^2 \cdot \frac{2\pi^6}{5!} \cdot \frac{2\pi^8}{7!} \cdot \frac{2\pi^{12}}{11!}. \tag{96}$$
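The sphere factor can be evaluated exactly from the dimensions in (95), using the standard formula $\mathrm{Vol}(S^{2n-1}) = 2\pi^n/(n-1)!$ for unit spheres. The helper names below are introduced here for illustration only; the coefficient is kept as an exact fraction and the power of $\pi$ is tracked separately.

```python
from fractions import Fraction
from math import factorial

def odd_sphere_volume(d):
    """Return (c, n) such that Vol(S^d) = c * pi**n for odd d = 2n - 1."""
    n = (d + 1) // 2
    return Fraction(2, factorial(n - 1)), n

def sphere_product(dims):
    """Exact coefficient and total power of pi of the product of sphere volumes."""
    coeff, power = Fraction(1), 0
    for d in dims:
        c, n = odd_sphere_volume(d)
        coeff *= c
        power += n
    return coeff, power

f4_coeff, f4_power = sphere_product((3, 11, 15, 23))
```

The result is $\pi^{28}/(2^{11} \cdot 3^7 \cdot 5^4 \cdot 7^2 \cdot 11)$: the power of $\pi$ and the odd prime factors already coincide with those of (94), so that the torus and root factors of the Macdonald formula can only contribute a power of 2.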

The simple roots are [15]

$$r_1 = L_2 - L_3, \tag{97}$$
$$r_2 = L_3 - L_4, \tag{98}$$
$$r_3 = L_4, \tag{99}$$
$$r_4 = \frac{L_1 - L_2 - L_3 - L_4}{2}, \tag{100}$$

where $L_i$, $i = 1, \dots, 4$, is an orthonormal basis for the dual of the Cartan algebra. The volume of the fundamental region representing the torus is then $1/2$. Finally, there are 48 non-vanishing roots, 24 with length 1 and 24 with length $\sqrt{2}$. We found these roots explicitly; they correspond to the ones just presented, with $L_i = e_i$, the canonical basis of $\mathbb{R}^4$. They contribute the term

$$\prod_{\alpha \in R(F_4)} \frac{2}{|\alpha|} = 2^{48} (\sqrt{2})^{24}. \tag{101}$$

The volume of the group is then

$$\mathrm{Vol}(F_4) = \frac{2^{26} \cdot \pi^{28}}{3^7 \cdot 5^4 \cdot 7^2 \cdot 11}. \tag{102}$$

We conclude that the range determined covers the group exactly once.

6. The F4-Euler Angles for E6

As the construction of the Euler parametrization of $E_6$ is very similar to the one of $F_4$, we will be very brief here; see [4]. From the theorem of Chevalley and Schafer cited above, we can extend the representation of the $F_4$ algebra to the 27 dimensional irreducible representation of the whole $E_6$ algebra by adding the matrices representing the action of $R_Y$. We only need to associate a $27 \times 27$ matrix $M(A)$ to each $A \in \mathfrak{J}$, in such a way that, if $v \in \mathbb{R}^{27}$, then

$$M(A) v = \Phi(A \circ \Phi^{-1}(v)). \tag{103}$$

The set of traceless Jordan matrices being 26-dimensional, this adds 26 new generators which complete the $F_4$ algebra to the 78-dimensional $E_6$ algebra. However, computing the Killing form we can easily check that this is not the compact form: it has signature $(52, 26)$, and is indeed the non compact real form $E_{6(-26)}$. Fortunately, we can obtain the compact form by multiplying the 26 added generators by $i$. In this way, the algebra remains real and the representation becomes complex, being now $V = \mathbb{C}^{27}$. We realized such matrices using Mathematica, using the basis

$$\begin{pmatrix} a_1 & o_1 & o_2 \\ o_1^* & a_2 & o_3 \\ o_2^* & o_3^* & -a_1 - a_2 \end{pmatrix} \tag{104}$$

for the traceless Jordan matrices. They can be found in [4].

We must now choose a maximal compact subgroup. It is convenient to select the largest one, which we know to be $H = F_4$, in our case the group generated by the first 52 matrices. Its linear complement $\mathfrak{p}$ (in the algebra) contains two preferred elements, associated to the two diagonal matrices (104) with $a_1 = 1, a_2 = 0$ and $a_1 = 0, a_2 = 1$ respectively. Following the order dictated by the map $\Phi$, after orthonormalization w.r.t. the product $(J|J') = \mathrm{Trace}(J \circ J')$ in $\mathfrak{J}$, these will correspond to the matrices $c_{53}$ and $c_{70}$ respectively. This is indeed the expression we used in [4] to do the computer calculations. There, we found it convenient a posteriori to recombine these two matrices into the new generators

$$\tilde{c}_{53} = \frac{1}{2} c_{53} + \frac{\sqrt{3}}{2} c_{70}, \qquad \tilde{c}_{70} = -\frac{\sqrt{3}}{2} c_{53} + \frac{1}{2} c_{70}.$$


These, added to the four matrices previously considered for the Cartan subalgebra of $F_4$, generate a Cartan subalgebra of $E_6$, and the corresponding roots are exactly the ones described for example in [15], with $L_i$ replaced by the elements $e_i$ of the standard basis of $\mathbb{R}^6$. In any case, it is easy to check that $\tilde{c}_{53}, \tilde{c}_{70}$ can be taken as generators of $V$. Evidently they commute. To realize the Euler parametrization, we note that the redundancy is now 28-dimensional, so that we expect to find a 28 dimensional subgroup $H_o$ of $H$ which commutes with $V$. In fact this happens to be the $SO(8)$ subgroup generated by the first 28 matrices $c_i$, $i = 1, 2, \dots, 28$. We can then write

$$E_6[x_1, \dots, x_{78}] = B_{E_6}[x_1, \dots, x_{24}]\, e^{x_{25} c_{53} + x_{26} c_{70}}\, F_4[x_{27}, \dots, x_{78}],$$

with $B_{E_6} = F_4/SO(8)$ and $F_4$ as in the previous section. This means that in particular

$$B_{E_6}[x_1, \dots, x_{24}] = B[x_1, \dots, x_{15}]\, e^{x_{16} c_{22}}\, B_9[x_{17}, \dots, x_{23}]\, e^{x_{24} c_{37}}, \tag{105}$$

where $B$ is given by (89) and

$$B_9[x_1, \dots, x_7] = e^{x_1 \tilde{c}_3} e^{x_2 \tilde{c}_{16}} e^{x_3 \tilde{c}_{15}} e^{x_4 \tilde{c}_{35}} e^{x_5 \tilde{c}_5} e^{x_6 \tilde{c}_1} e^{x_7 \tilde{c}_{30}}.$$

We can now compute the associated invariant measure. The calculation is quite involved and details can be found in [4]. Here we report the result only:

$$\begin{aligned}
d\mu_{E_6} ={}& 2^7 \sin x_4 \cos x_5 \cos x_6 \sin^2 x_6 \cos^4 x_7 \sin^2 x_7 \sin^7 x_8 \\
& \cdot \sin x_{12} \cos x_{13} \cos x_{14} \sin^2 x_{14} \cos^2 x_{15} \sin^4 x_{15} \cos^{15}\frac{x_{16}}{2} \sin^7\frac{x_{16}}{2} \\
& \cdot \sin x_{20} \cos x_{21} \cos x_{22} \sin^2 x_{22} \cos^4 x_{23} \sin^2 x_{23} \sin^7 x_{24} \\
& \cdot \sin^8 x_{25}\, \sin^8\!\left(\frac{\sqrt{3}}{2} x_{26} + \frac{x_{25}}{2}\right) \sin^8\!\left(\frac{\sqrt{3}}{2} x_{26} - \frac{x_{25}}{2}\right) d\mu_{F_4}[x_{27}, \dots, x_{78}] \prod_{i=1}^{26} dx_i.
\end{aligned} \tag{106}$$

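Proceeding as for $F_4$, from this measure we can determine the range $R$ for the parameters:

$$\begin{aligned}
&x_1 \in [0, 2\pi], && x_2 \in [0, 2\pi], && x_3 \in [0, 2\pi], && x_4 \in [0, \pi], \\
&x_5 \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && x_6 \in \left[0, \tfrac{\pi}{2}\right], && x_7 \in \left[0, \tfrac{\pi}{2}\right], && x_8 \in [0, \pi], \\
&x_9 \in [0, 2\pi], && x_{10} \in [0, 2\pi], && x_{11} \in [0, 2\pi], && x_{12} \in [0, \pi], \\
&x_{13} \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && x_{14} \in \left[0, \tfrac{\pi}{2}\right], && x_{15} \in \left[0, \tfrac{\pi}{2}\right], && x_{16} \in [0, \pi], \\
&x_{17} \in [0, 2\pi], && x_{18} \in [0, 2\pi], && x_{19} \in [0, 2\pi], && x_{20} \in [0, \pi], \\
&x_{21} \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right], && x_{22} \in \left[0, \tfrac{\pi}{2}\right], && x_{23} \in \left[0, \tfrac{\pi}{2}\right], && x_{24} \in [0, \pi], \\
&x_{25} \in \left[0, \tfrac{\pi}{2}\right], && -\tfrac{x_{25}}{\sqrt{3}} \le x_{26} \le \tfrac{x_{25}}{\sqrt{3}},
\end{aligned} \tag{107}$$

and $x_j$, $j = 27, \dots, 78$, chosen to cover the whole $F_4$ group. This choice defines a 78 dimensional closed cycle $W$ having volume

$$\mathrm{Vol}(W) = \int_R d\mu_{E_6} = \frac{\sqrt{3} \cdot 2^{17} \cdot \pi^{42}}{3^{10} \cdot 5^5 \cdot 7^3 \cdot 11}. \tag{108}$$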

To complete the work we need to check that this is indeed the volume of $E_6$ as given by the Macdonald formula. The rational homology of $E_6$ is $H_*(E_6) = H_*\!\left(\prod_{i=1}^{6} S^{d_i}\right)$, with ([24])

$$d_1 = 3, \quad d_2 = 9, \quad d_3 = 11, \quad d_4 = 15, \quad d_5 = 17, \quad d_6 = 23. \tag{109}$$

$E_6$ is simply laced, with simple roots

$$r_1 = L_1 + L_2, \tag{110}$$
$$r_2 = L_2 - L_1, \tag{111}$$
$$r_3 = L_3 - L_2, \tag{112}$$
$$r_4 = L_4 - L_3, \tag{113}$$
$$r_5 = L_5 - L_4, \tag{114}$$
$$r_6 = \frac{L_1 - L_2 - L_3 - L_4 - L_5 + \sqrt{3} L_6}{2}, \tag{115}$$

where $L_i$, $i = 1, \dots, 6$, is an orthonormal basis for the dual of the Cartan algebra. The volume of the associated torus is then $\sqrt{3}$. As a check for the algebra, one can explicitly verify that the 72 roots of the algebra are indeed the roots of $E_6$, each one having length $\sqrt{2}$. They have indeed the structure given in [15], with $L_i = e_i$, the canonical basis of $\mathbb{R}^6$. The Macdonald formula then provides the result

$$\mathrm{Vol}(E_6) = \frac{\sqrt{3} \cdot 2^{17} \cdot \pi^{42}}{3^{10} \cdot 5^5 \cdot 7^3 \cdot 11}, \tag{116}$$

which concludes our check.
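The simple roots (110)-(115) can be checked directly against the $E_6$ Dynkin diagram. The sketch below is illustrative and assumes that the $L_i$ are identified with the canonical orthonormal basis of $\mathbb{R}^6$; the names `SIMPLE_ROOTS`, `gram` and `det` are introduced here. It builds the Gram matrix of the simple roots, which for a simply laced algebra with roots of squared length 2 coincides with the Cartan matrix, and computes the determinant of the root matrix, i.e. the volume of the fundamental cell spanned by the simple roots.

```python
import math

S3 = math.sqrt(3.0)

# Rows: components of r1..r6 along L1..L6, as in (110)-(115).
SIMPLE_ROOTS = [
    [1, 1, 0, 0, 0, 0],
    [-1, 1, 0, 0, 0, 0],
    [0, -1, 1, 0, 0, 0],
    [0, 0, -1, 1, 0, 0],
    [0, 0, 0, -1, 1, 0],
    [0.5, -0.5, -0.5, -0.5, -0.5, S3 / 2],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in m]
    n, d = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return d

gram = [[dot(a, b) for b in SIMPLE_ROOTS] for a in SIMPLE_ROOTS]

# Pairs of simple roots joined by an edge of the Dynkin diagram (scalar product -1).
edges = [(i, j) for i in range(6) for j in range(i + 1, 6)
         if abs(gram[i][j] + 1) < 1e-9]
degrees = sorted(sum(1 for e in edges if k in e) for k in range(6))
```

The diagram has five edges and one trivalent node ($r_3$) with legs of length 1, 2 and 2, as required for $E_6$. The Gram determinant is 3 and the determinant of the root matrix is $\sqrt{3}$ in absolute value.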

7. Construction of Non Compact Split Forms and Their Coset Manifolds

Up to now we considered compact groups only. However, as discussed in the introduction, it is important to be able to concretely realize non compact groups as well. A particular class is given by the split forms. A particularly suitable technique for these cases is the Iwasawa decomposition, which we will discuss in this section. To make the advantage of this method more evident, in the next section we will compare the construction of a split form obtained by analytic continuation of a compact one with the direct Iwasawa construction.

7.1. Analytic continuation of the generalized Euler angles

A first way to realize a split form starting from the compact one is the following. Suppose we have realized an Euler parametrization of the compact group $G$ with respect to a maximal subgroup $H$, say

$$G[x_1, \dots, x_p; y_1, \dots, y_r; z_1, \dots, z_m] = B[x_1, \dots, x_p]\, e^{V[y_1, \dots, y_r]}\, H[z_1, \dots, z_m], \qquad p + r + m = n.$$

This originates from the orthogonal decomposition $\mathfrak{g} = \mathfrak{h} + \mathfrak{p}$. We know that $[\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}$ and $[\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}$. Let us now suppose that the further condition $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$ is satisfied. This condition is also called symmetry. It is a non-trivial condition and requires $\mathfrak{h}$ to be a maximal subalgebra. Indeed, suppose we start with such a decomposition and fix a subgroup $H' \subset H$. This determines the new orthogonal decomposition $\mathfrak{g} = \mathfrak{h}' + \mathfrak{p}' = \mathfrak{h}' + \mathfrak{p} + \mathfrak{p}''$, with $\mathfrak{p}'' = \mathfrak{p}' \cap \mathfrak{h}$. Then $[\mathfrak{p}, \mathfrak{p}''] \subset \mathfrak{p} \subset \mathfrak{p}'$, which violates symmetry. On the other hand, we can easily see that symmetry is satisfied by all the examples we considered, and indeed this happens to be true for all simple Lie groups [16]. Thus, we can pass from the compact form to the split form relative to the given maximal subgroup simply by the Weyl unitary trick [16, 15]: $\mathfrak{p} \longmapsto i\mathfrak{p}$, $i$ being the imaginary unit. The Euler parametrization of the split form is then given by

$$G_{split}[x_1, \dots, x_p; y_1, \dots, y_r; z_1, \dots, z_m] = G[ix_1, \dots, ix_p; iy_1, \dots, iy_r; z_1, \dots, z_m].$$
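The effect of the substitution $x \mapsto ix$ on a single Euler factor can be seen in the smallest possible example. The sketch below is a toy illustration, not one of the groups treated here: it takes the $2 \times 2$ antisymmetric generator $J$ with $J^2 = -1$ (the function name `exp_generator` is introduced for this purpose) and shows that the trigonometric entries of $e^{xJ}$ turn into hyperbolic ones under analytic continuation.

```python
import cmath
import math

def exp_generator(x):
    """exp(x * J) for J = [[0, -1], [1, 0]]; x may be real or complex."""
    c, s = cmath.cos(x), cmath.sin(x)
    return [[c, -s], [s, c]]

x = 0.7
compact = exp_generator(x)          # entries cos(x) and sin(x)
continued = exp_generator(1j * x)   # entries cosh(x) and i*sinh(x)
```

This is the mechanism by which the sines in a compact measure such as (65) are replaced by hyperbolic sines in the split form.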

7.2. The Iwasawa decomposition

If the Euler decomposition is particularly efficient for realizing compact Lie groups, for the non compact split forms a much more suitable realization is provided by the Iwasawa decomposition [25]. This is based on the Cartan decomposition relative to a maximal subgroup $H$, $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{s}$, where $\mathfrak{h}$ is the Lie algebra associated to $H$ and $\mathfrak{s}$ its linear complement. The Cartan decomposition requires the existence of a linear involution $\theta : \mathfrak{g} \longrightarrow \mathfrak{g}$ such that, restricted to $\mathfrak{s}$, the quadratic form

$$B : \mathfrak{s} \times \mathfrak{s} \longrightarrow \mathbb{R}, \qquad (a, b) \longmapsto B(a, b) = K(a, \theta(b))$$

has a definite sign. Recall that the Killing form on the compact form is negative definite. Starting from our orthogonal decomposition $\mathfrak{g} = \mathfrak{p} \oplus \mathfrak{h}$ we see that the map

$$\theta : \mathfrak{g} \longrightarrow \mathfrak{g}, \qquad (a, b) \mapsto (a, -b), \qquad \forall (a, b) \in \mathfrak{p} \oplus \mathfrak{h},$$

satisfies the required conditions, so that we can identify $\mathfrak{p}$ with $\mathfrak{s}$. The next step consists in selecting a Cartan subalgebra of $\mathfrak{p}$. We call it $\mathfrak{a}$, and $A$ is the group it generates. Since $\mathfrak{a}$ is a Cartan subalgebra, its adjoint action is diagonalizable and we can associate to it a complete set of positive roots. The associated eigenmatrices generate a nilpotent subalgebra $\mathfrak{n}$, as follows from the basic properties of the root spaces. The Iwasawa decomposition states that the split form of $G$ associated to the maximal subgroup $H$ can be realized as

$$G = HAN = e^{\mathfrak{h}} e^{\mathfrak{a}} e^{\mathfrak{n}}. \tag{117}$$

Note that for $H$, being a compact group, we can use our Euler parametrization. All novelties are then contained in the non compact quotient $G/H$. Before looking at how this can be handled, let us make some further comments on the comparison between the Euler and Iwasawa constructions. On one hand, there is no compact counterpart of the Iwasawa construction, but surely there are many other possibilities, as for example the exponential map itself. In this case, the big advantage of the Euler construction is that it involves only parametric angles, which appear in the group elements in trigonometric form. Computationally, this is not immediately an advantage, because of the difficulty of handling trigonometric simplifications by means of a computer calculation. Indeed, it can be checked that, at some steps, manipulations by hand are necessary and much simpler than direct computer calculation. However, the true advantage arises when the explicit range of the parameters must be established. In this case, as we have seen, the trigonometric expressions provide a quite simple way to determine such ranges, whereas for a generic parametrization this involves solving certain transcendental equations, which can be handled only numerically. On the other hand, when we work with a split form the difficult task of determining the explicit range of the parameters is restricted to the compact part only. Thus, we need to take the trigonometric expressions for the compact subgroup only, but it is now possible to use a simpler realization of the non compact part, possibly much easier to handle. This is provided by the Iwasawa decomposition where, apart from the compact part, only Abelian or nilpotent matrices appear. In particular, from the structure of the root spaces the nilpotency of the non compact part will be at most the rank $r$ of the group, so that we expect polynomial terms of degree at most $r$ to appear in the non compact part, in place of trigonometric expressions (or hyperbolic ones after the Weyl trick).

7.3. The coset manifold

Let us now look at the construction of the non compact quotient $G/H$. We need to compute the induced metric (40) starting from the Iwasawa expression. Obviously we can proceed exactly as for the compact case. However, following tradition, we have written the decomposition taking $H$ as a left factor in place of a right factor, so that it will be convenient here to exchange left invariant forms with right invariant forms. This does not change the substance, the Killing form being bi-invariant. Thus, let us introduce the one form

$$J_G^R = dG\, G^{-1} = HA\, dN\, N^{-1} A^{-1} H^{-1} + H\, dA\, A^{-1} H^{-1} + dH\, H^{-1} \equiv HA J_N A^{-1} H^{-1} + H J_A H^{-1} + J_H.$$

To compute the metric of $M = G/H$ we need to eliminate the components of $J_G^R$ along the fibers ($\mathfrak{h}$), thus defining the reduced form $J_G'$, which gives the metric

$$d\sigma^2 = \kappa\, \mathrm{Tr}(J_G' \otimes J_G'),$$

$\kappa$ being a normalization constant. Let us look at the structure of this metric. First, note that the term $J_H$ which appears in $J_G^R$ is projected out to obtain $J_G'$. Moreover, the adjoint action of $H$ commutes with the projection, because it respects the direct decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Thus, if we define $J_p := \pi(A J_N A^{-1})$, where $\pi$ is the projection out of the fibers, then

$$d\sigma^2 = \kappa\, \mathrm{Tr}(J_p \otimes J_p) + \kappa\, \mathrm{Tr}(J_A \otimes J_A).$$

Note that $J_p$ is orthogonal to $J_A$. Indeed, $J_A$ is easily computed: since $A = \exp\!\left(\sum_{i=1}^{r} y_i H_i\right)$, where the $H_i$ form an orthonormal basis (w.r.t. the product $\kappa\mathrm{Tr}$) for the Cartan subspace, is an Abelian group, we have $J_A = \sum_{i=1}^{r} dy_i H_i$ and

$$\kappa\, \mathrm{Tr}(J_A \otimes J_A) = \sum_{i=1}^{r} dy_i^2.$$

On the other hand, $J_p$ can be easily determined using the simple properties of $N$. Recall that the generators of $N$ are the positive root matrices $R_l$, $l = 1, \dots, m := (n - r)/2$, so that two such matrices commute if the sum of the corresponding roots is not a root; otherwise the commutator is proportional to the matrix associated to the resulting root. Now $N(x_1, \dots, x_m) = e^{\sum_{i=1}^{m} x_i R_i}$, so that we get

$$J_N = \sum_{i=1}^{m} n^i(\vec{x}) R_i, \qquad n^i(\vec{x}) = \sum_{j=1}^{m} n^i_j(\vec{x})\, dx^j, \tag{118}$$

where the $n^i_j(x_1, \dots, x_m)$ are all polynomials of degree at most $r$ in the $x_i$. Now, the $R_i$ are eigenmatrices for the action of $A$, so that we have

$$A R_i A^{-1} = e^{\sum_{a=1}^{r} r_{i,a} y_a} R_i, \tag{119}$$

where $\vec{r}_i = (r_{i,1}, \dots, r_{i,r})$ are the positive roots, whose components are the eigenvalues $r_{i,a}$, $a = 1, \dots, r$, of $H_a$ with eigenvector $R_i$. Thus,

$$A J_N A^{-1} = \sum_{i=1}^{m} e^{\sum_{a=1}^{r} r_{i,a} y_a}\, n^i(\vec{x}) R_i, \tag{120}$$

and to obtain $J_p$ we only need to take the projection on $\mathfrak{p}$. The metric on the quotient is then

$$d\sigma^2 = \sum_{i=1}^{r} dy_i^2 + \sum_{i=1}^{m} e^i \otimes e^i, \tag{121}$$

$$e^i = \kappa\, \mathrm{Tr}[A J_N A^{-1} P_i], \tag{122}$$

where the $P_i$, together with the $H_a$, realize an orthonormal basis of $\mathfrak{p}$ with respect to the product $(a|b) = \kappa\, \mathrm{Tr}(ab)$.
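The eigenmatrix relation (119) can be illustrated in the simplest split case, $sl(2, \mathbb{R})$, which has rank one and a single positive root. The sketch below is a toy example with names chosen here (`R`, `conjugate_by_A`); the Cartan generator is taken as $\mathrm{diag}(1, -1)$ without the orthonormal normalization used in the text, so the root value is 2.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Positive root matrix of sl(2, R): [H, R] = 2 R for H = diag(1, -1).
R = [[0.0, 1.0], [0.0, 0.0]]

def conjugate_by_A(y, m):
    """Return A m A^{-1} with A = exp(y * diag(1, -1))."""
    a = [[math.exp(y), 0.0], [0.0, math.exp(-y)]]
    a_inv = [[math.exp(-y), 0.0], [0.0, math.exp(y)]]
    return matmul(matmul(a, m), a_inv)

y = 0.3
lhs = conjugate_by_A(y, R)
rhs = [[math.exp(2 * y) * entry for entry in row] for row in R]
```

Here $N(x) = e^{xR}$ gives $J_N = dx\, R$, so that $A J_N A^{-1} = e^{2y} dx\, R$: the nilpotent direction enters the metric only through an exponential of the Cartan coordinate times a polynomial form, which is the pattern of (120) and (121).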

8. Realizing G2(2) and G2(2)/SO(4)

The non compact form $G_{2(2)}$ is the split form of $G_2$ associated to the maximal compact subgroup $SO(4)$. Referring to section 4, we know that $SO(4)$ is generated by $C_i$, $i = 1, 2, 3, 8, 9, 10$, and $\mathfrak{p}$ is generated by $C_a$, $a = 4, 5, 6, 7, 11, 12, 13, 14$. To determine the split form we could multiply the $C_a$ by the imaginary unit $i$. Alternatively, noting that all matrices are antisymmetric, we prefer to transform the matrices $C_a$ into symmetric matrices. The representative matrices $Q_I$ obtained in this way are normalized by the condition $\mathrm{Tr}(Q_I Q_J) = \eta_{IJ}$, where $\eta = \mathrm{diag}\{-1, -1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, 1, 1\}$.

[Display of the fourteen $7 \times 7$ matrices $Q_1, \dots, Q_{14}$; $Q_8, \dots, Q_{14}$ carry an overall factor $1/\sqrt{3}$. The individual entries are not recoverable from this copy.]

The matrices {Q1 , Q2 , Q3 , Q8 , Q9 , Q10} generate the Lie algebra of SO(4), and the elements Q5 and Q11 commute, and generate a non compact Cartan subalgebra contained in p.

8.1. Euler construction of G2(2)/SO(4)

As we have seen, a first way to realize the split form is by analytic continuation. In this case it simply means that we have to substitute the matrices $C_I$ with $Q_I$ in (59):

$$g[x_1, \dots, x_{14}] = H(x_1, \dots, x_6)\, e^{\sqrt{3} x_7 Q_{11} + x_8 Q_5}\, H(x_9, \dots, x_{14}), \tag{123}$$

$$H(x_1, \dots, x_6) = e^{x_1 Q_3} e^{x_2 Q_2} e^{x_3 Q_3} e^{\sqrt{3} x_4 Q_8} e^{\sqrt{3} x_5 Q_9} e^{\sqrt{3} x_6 Q_8}. \tag{124}$$

210


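We can then proceed with the computation of the invariant measure exactly as for the compact space. We will skip all details here and give only the final result

$$ d\mu_{G_{2(2)}} = 27\sqrt{3}\, f(2x_7, 2x_8)\, \sin(2x_2)\sin(2x_5)\sin(2x_{10})\sin(2x_{13}) \prod_{i=1}^{14} dx_i, \tag{125} $$

where

$$ f(\alpha,\beta) = \sinh\!\Big(\frac{\beta+\alpha}{2}\Big) \sinh\!\Big(\frac{\beta-3\alpha}{2}\Big) \sinh\!\Big(\frac{\beta+3\alpha}{2}\Big) \sinh\!\Big(\frac{\beta-\alpha}{2}\Big) \sinh(\alpha)\sinh(\beta), \tag{126} $$

from which we can determine the range for the parameters:

$$ \begin{aligned}
&0 \le a_1 \le \pi, && 0 \le a_2 \le \tfrac{\pi}{2}, && 0 \le a_3 \le \tfrac{\pi}{2}, \\
&0 \le a_4 \le 2\pi, && 0 \le a_5 \le \tfrac{\pi}{4}, && 0 \le a_6 \le \pi, \\
&0 \le a_9 \le 2\pi, && 0 \le a_{10} \le \tfrac{\pi}{2}, && 0 \le a_{11} \le \pi, \\
&0 \le a_{12} \le \pi, && 0 \le a_{13} \le \tfrac{\pi}{2}, && 0 \le a_{14} \le \pi, \\
&0 \le a_7 \le \infty, && 3a_7 \le a_8 \le \infty. &&
\end{aligned} \tag{127} $$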
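The way the non-compact range in (127) follows from (126) can be checked numerically. The sketch below is only an illustration: it assumes that the parameters written $x_7, x_8$ in (125) are the ones called $a_7, a_8$ in (127), and the function names are hypothetical. It evaluates $f(2x_7, 2x_8)$ and shows that the factor is positive inside the wedge $x_8 > 3x_7 > 0$, vanishes on its two walls, and changes sign when the wall $x_8 = 3x_7$ is crossed.

```python
import math


def f(alpha, beta):
    """Factor f(alpha, beta) of equation (126)."""
    return (math.sinh((beta + alpha) / 2) * math.sinh((beta - 3 * alpha) / 2)
            * math.sinh((beta + 3 * alpha) / 2) * math.sinh((beta - alpha) / 2)
            * math.sinh(alpha) * math.sinh(beta))


def noncompact_density(x7, x8):
    """Non-compact part f(2*x7, 2*x8) of the measure (125)."""
    return f(2 * x7, 2 * x8)


if __name__ == "__main__":
    # One interior point, two boundary points, one point outside the wedge.
    for x7, x8 in [(0.2, 1.0), (0.5, 1.5), (0.0, 1.0), (0.5, 1.2)]:
        print(f"x7={x7}, x8={x8}, density={noncompact_density(x7, x8):.6g}")
```

The zeros of the density mark where the parametrization degenerates, which is why the wedge $0 \le 3a_7 \le a_8$ is a natural candidate for the range.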

Next, we can also compute the metric (40) on the quotient G2(2)/SO(4). The details are very similar to the ones in [1]. Introducing the 1-forms

$$ \begin{aligned}
I_1(x,y,z) &:= \sin(2y)\cos(2z)\,dx - \sin(2z)\,dy, \\
I_2(x,y,z) &:= \sin(2y)\sin(2z)\,dx + \cos(2z)\,dy, \\
I_3(x,y,z) &:= dz + \cos(2y)\,dx,
\end{aligned} \tag{128} $$

one gets

$$ \begin{aligned}
ds^2_{G_{2(2)}/SO(4)} ={}& da_8^2 + da_7^2 + \left(\sinh^2 a_8 \cosh^2 a_7 + \cosh^2 a_8 \sinh^2 a_7\right)\left[ da_5^2 + \sin^2(2a_5)\,da_4^2 + 3\,da_2^2 + 3\sin^2(2a_2)\,da_1^2 \right] \\
&+ \frac{1}{2}\cosh(2a_8)\cosh(2a_7)\sinh^2(2a_7)\Big\{ \left[I_1(a_4,a_5,a_6) + 3I_2(a_1,a_2,a_3)\right]^2 + \left[I_2(a_4,a_5,a_6) - 3I_1(a_1,a_2,a_3)\right]^2 \Big\} \\
&+ \frac{3}{4}\sinh^2(2a_7)\left[I_3(a_4,a_5,a_6) - I_3(a_1,a_2,a_3)\right]^2 + \frac{1}{4}\sinh^2(2a_8)\left[I_3(a_4,a_5,a_6) + 3I_3(a_1,a_2,a_3)\right]^2.
\end{aligned} \tag{129} $$

Such a computation is already quite complicated for the G2 group, and for higher-dimensional groups it becomes prohibitive.

8.2. Iwasawa construction of G2(2)/SO(4)

The Iwasawa parametrization is the most suitable for the computation of the metric on G2(2)/SO(4). We know that the Cartan subalgebra of p is generated by $H_1 := C_{11}$ and $H_2 := C_5$. The roots of G2 can thus be computed by diagonalizing the adjoint action of the $H_i$. We obtain

$$ \begin{aligned}
r_1 &= \Big(\tfrac{2}{\sqrt{3}},\, 0\Big); & r_2 &= (\sqrt{3},\, 1); & r_3 &= \Big(\tfrac{1}{\sqrt{3}},\, 1\Big); \\
r_4 &= (0,\, 2); & r_5 &= \Big(-\tfrac{1}{\sqrt{3}},\, 1\Big); & r_6 &= (-\sqrt{3},\, 1),
\end{aligned} $$

where we write only a choice of positive roots. The corresponding eigenmatrices (apart from some normalization constants) are

$$ R_1 = \sqrt{3}\,C_3 - C_8 + 2C_{12}; \tag{130} $$
$$ R_2 = \frac{1}{\sqrt{3}}(C_1 - C_2 + C_6 - C_7) - C_9 + C_{10} + C_{13} + C_{14}; \tag{131} $$
$$ R_3 = \sqrt{3}\,(C_1 + C_2 + C_6 + C_7) + C_9 + C_{10} - C_{13} + C_{14}; \tag{132} $$
$$ R_4 = C_3 - 2C_4 + \sqrt{3}\,C_8; \tag{133} $$
$$ R_5 = -\sqrt{3}\,(C_1 - C_2 + C_6 - C_7) - C_9 + C_{10} + C_{13} + C_{14}; \tag{134} $$
$$ R_6 = -\frac{1}{\sqrt{3}}(C_1 + C_2 + C_6 + C_7) + C_9 + C_{10} - C_{13} + C_{14}. \tag{135} $$

If we choose to parameterize the nilpotent subgroup N as $N(x_1,\dots,x_6) = \prod_{i=1}^{6} e^{x_i R_i}$, we get

$$ n_1 = dx_1; \tag{136} $$
$$ n_2 = dx_2 - 4\sqrt{3}\,x_1\,dx_3 + 16x_1^2\,dx_5 - \frac{64}{3\sqrt{3}}\,x_1^3\,dx_6; \tag{137} $$
$$ n_3 = dx_3 - \frac{8}{\sqrt{3}}\,x_1\,dx_5 + \frac{16}{3}\,x_1^2\,dx_6; \tag{138} $$
$$ n_4 = dx_4 + 8x_3\,dx_5 - \frac{8}{3}\,x_2\,dx_6; \tag{139} $$
$$ n_5 = dx_5 - \frac{4}{\sqrt{3}}\,x_1\,dx_6; \tag{140} $$
$$ n_6 = dx_6. \tag{141} $$

As we see, the coefficients are polynomials of low degree in the $x_i$ (at most cubic here). Then

$$ d\sigma^2 = dy_1^2 + dy_2^2 + \sum_{i=1}^{6} e^i \otimes e^i \tag{142} $$

with, writing $n_1, \dots, n_6$ for the 1-forms (136)–(141),

$$ \begin{aligned}
e^1 &= -2\,e^{-2y_2}\, n_4, \\
e^2 &= \frac{1}{\sqrt{3}}\, e^{-\sqrt{3}y_1 - y_2}\, n_2 - e^{\sqrt{3}y_1 - y_2}\, n_6 + \sqrt{3}\, e^{-\frac{1}{\sqrt{3}}y_1 - y_2}\, n_3 - e^{\frac{1}{\sqrt{3}}y_1 - y_2}\, n_5, \\
e^3 &= -\frac{1}{\sqrt{3}}\, e^{-\sqrt{3}y_1 - y_2}\, n_2 + e^{\sqrt{3}y_1 - y_2}\, n_6 + \sqrt{3}\, e^{-\frac{1}{\sqrt{3}}y_1 - y_2}\, n_3 + e^{\frac{1}{\sqrt{3}}y_1 - y_2}\, n_5, \\
e^4 &= 2\,e^{-\frac{2}{\sqrt{3}}y_1}\, n_1, \\
e^5 &= e^{-\sqrt{3}y_1 - y_2}\, n_2 - e^{\sqrt{3}y_1 - y_2}\, n_6 + e^{\frac{1}{\sqrt{3}}y_1 - y_2}\, n_5 - e^{-\frac{1}{\sqrt{3}}y_1 - y_2}\, n_3, \\
e^6 &= e^{-\sqrt{3}y_1 - y_2}\, n_2 + e^{\sqrt{3}y_1 - y_2}\, n_6 + e^{\frac{1}{\sqrt{3}}y_1 - y_2}\, n_5 + e^{-\frac{1}{\sqrt{3}}y_1 - y_2}\, n_3, \\
e^7 &= dy_1, \qquad e^8 = dy_2.
\end{aligned} $$

The polynomial dependence on the variables makes the computation feasible by a computer even for higher dimensional groups.
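As a small consistency check on the root data of this section, the following sketch verifies that the six positive roots form part of a G2 root system. It assumes the roots are read as $r_1 = (2/\sqrt{3}, 0)$, $r_2 = (\sqrt{3}, 1)$, $r_3 = (1/\sqrt{3}, 1)$, $r_4 = (0, 2)$, $r_5 = (-1/\sqrt{3}, 1)$, $r_6 = (-\sqrt{3}, 1)$ and that the inner product is the Euclidean one; the helper names are hypothetical. It tests the two properties one expects: three short and three long roots with squared-length ratio 3, and integer values of $2(u,v)/(v,v)$ for every pair.

```python
import math
from itertools import permutations

SQRT3 = math.sqrt(3.0)

# Positive roots as read from the text (an assumption about the printed values).
ROOTS = {
    "r1": (2 / SQRT3, 0.0),
    "r2": (SQRT3, 1.0),
    "r3": (1 / SQRT3, 1.0),
    "r4": (0.0, 2.0),
    "r5": (-1 / SQRT3, 1.0),
    "r6": (-SQRT3, 1.0),
}


def dot(u, v):
    """Euclidean inner product in the (y1, y2) plane."""
    return u[0] * v[0] + u[1] * v[1]


def cartan_number(u, v):
    """2 (u, v) / (v, v), which must be an integer for two roots."""
    return 2 * dot(u, v) / dot(v, v)


def squared_lengths(roots):
    """Sorted list of squared root lengths."""
    return sorted(dot(r, r) for r in roots.values())


def all_cartan_numbers_integral(roots, tol=1e-9):
    """True if every ordered pair of roots gives an integer Cartan number."""
    return all(
        math.isclose(cartan_number(u, v), round(cartan_number(u, v)), abs_tol=tol)
        for u, v in permutations(roots.values(), 2)
    )


if __name__ == "__main__":
    print("squared lengths:", squared_lengths(ROOTS))
    print("integral Cartan numbers:", all_cartan_numbers_integral(ROOTS))
```

The same six vectors also appear, up to sign, as the exponents of the exponentials multiplying the forms $n_i$ in the vielbein of (142).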

9. Conclusions

We have given a detailed explanation of the methods for studying the geometry of exceptional Lie groups that we first introduced in [1] for G2. Indeed, here we have seen the elementary reasoning that constitutes the basis of our ideas and provides a powerful tool for computing global parameterizations of Lie groups. Recall that a parametrization differs from a coordinatization in that it does not provide a diffeomorphism between the manifold and the space of parameters. However, a parametrization locally yields a coordinatization, and it is global when it covers the whole group. This means that, if R is the space of parameters and G the group, then the parametrization p : R → G is surjective. In general, however, it cannot be injective. Indeed, in general group manifolds have a non-vanishing curvature tensor and cannot be globally covered by a single chart. Nevertheless, a parametrization can be considered good when it is “minimal”, in the sense that R is the closure of an open local chart. This means that the bijectivity of p is lost only on a subset of vanishing measure, which is the boundary ∂R of R. In this case we call the set R the range of the parameters. In general, for a finite-dimensional simple Lie group the true difficulty lies not so much in constructing a global parametrization as in determining the range. Here is where the idea of the generalized Euler angles comes into play: it is particularly suitable for computing the full range of the parameters, because it allows one to express the range in terms of Cartesian products. In particular, we have seen that there are essentially two methods to determine the range. The first is geometric and is based on the detailed knowledge of the geometry of the quotient space of the group by its maximal subgroup. We have described it for the example of the SU(3)-Euler parametrization of G2, but it can be adopted for the Euler parametrization of any of the compact classical simple Lie groups, as for example SU(N) [20]. The second method is topological and can be used when the geometrical information on the quotient space is lacking or the geometrical method is not sufficient to fix the range, as for example for the SO(4)-Euler parametrization of G2 [1] or for the parameterizations of F4 [3] and E6 [4]. Finally, we have also considered the construction of non-compact Lie groups. In that case the Iwasawa construction is easier than the Euler one. In particular, we have shown how it can be applied to the non-compact Lie group G2(2). The material presented in the last section is all new.

Acknowledgments

B.L.C. would like to thank S. Ferrara and A. Marrani for enlightening discussions. The work of B.L.C. has been supported in part by the Director, Office of Science, Office of High Energy and Nuclear Physics, Division of High Energy Physics of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and in part by NSF grant 10996-1360744 PHHXM.

References

[1] S. L. Cacciatori, B. L. Cerchiai, A. Della Vedova, G. Ortenzi and A. Scotti, “Euler angles for G(2),” J. Math. Phys. 46 (2005) 083512.
[2] S. L. Cacciatori, “A simple parametrization for G2,” J. Math. Phys. 46 (2005) 083520.
[3] F. Bernardoni, S. L. Cacciatori, B. L. Cerchiai, A. Scotti, “Mapping the geometry of the F(4) group,” Adv. Theor. Math. Phys. 12, no. 4 (2008) 889–994.
[4] F. Bernardoni, S. L. Cacciatori, B. L. Cerchiai, A. Scotti, “Mapping the geometry of the E(6) group,” J. Math. Phys. 49 (2008) 012107.
[5] W. Krauth, M. Staudacher, “Yang-Mills Integrals for Orthogonal, Symplectic and Exceptional Groups,” Nucl. Phys. B 584 (2000) 641–655, hep-th/0004076.
[6] T. Friedmann, E. Witten, “Unification Scale, Proton Decay, And Manifolds of G2 Holonomy,” Adv. Theor. Math. Phys. 7 (2003) 577–617, hep-th/0211269.
[7] D. O’Reilly, “String Corrected Supergravity; A Complete and Consistent Non-Minimal Solution,” hep-th/0611068.
[8] M. Pepe, “Deconfinement in Yang-Mills: a conjecture for a general gauge Lie group G,” hep-lat/0407019.
[9] G. Cossu, M. D’Elia, A. Di Giacomo, B. Lucini, C. Pica, “Confinement: G2 group case,” PoS LAT2007 (2007) 296, arXiv:0710.0481.
[10] S. Ferrara, A. Marrani, “Symmetric Spaces in Supergravity,” arXiv:0808.3567.
[11] M. Günaydin, A. Neitzke, B. Pioline, A. Waldron, “Quantum Attractor Flows,” JHEP 0709 (2007) 056, arXiv:0707.0267.
[12] B. L. Cerchiai, S. Ferrara, A. Marrani, B. Zumino, “Duality, Entropy and ADM Mass in Supergravity,” arXiv:0902.3973.
[13] Y.-X. Chen, Y.-Q. Wang, “Supersymmetric black rings and non-linear sigma models,” arXiv:0901.1939.
[14] P. B. Slater, “Eigenvalues, Separability and Absolute Separability of Two-Qubit States,” J. Geom. Phys. 59 (2009) 17–31, arXiv:0805.0267.
[15] W. Fulton and J. Harris, “Representation Theory: A First Course,” Graduate Texts in Mathematics, Springer.
[16] R. Gilmore, “Lie Groups, Lie Algebras, and Some of Their Applications,” John Wiley & Sons, New York, 1974.
[17] H. Hopf, “Über die Topologie der Gruppen-Mannigfaltigkeiten und ihre Verallgemeinerungen,” Annals of Mathematics 42 (1941) 22–52.
[18] I. G. Macdonald, “The volume of a compact Lie group,” Invent. Math. 56 (1980) 93–95.
[19] J. C. Baez, “The Octonions,” Bull. Amer. Math. Soc. (N.S.) 39 (2002), no. 2, 145–205.
[20] S. Bertini, S. L. Cacciatori and B. L. Cerchiai, “On the Euler angles for SU(N),” J. Math. Phys. 47 (2006) 043510.
[21] T. Tilma and E. C. G. Sudarshan, “Generalized Euler angle parameterization for SU(N),” J. Phys. A: Math. Gen. 35 (2002) 10467–10501; “Generalized Euler Angle Parameterization for U(N) with Applications to SU(N) Coset Volume Measures,” J. Geom. Phys. 52, no. 3 (2004) 263–283.
[22] C. Chevalley, R. D. Schafer, “The exceptional simple Lie algebras F4 and E6,” PNAS 36 (1950) 137–141.
[23] J. F. Adams, “Lectures on Exceptional Lie Groups,” The University of Chicago Press.
[24] C. Chevalley, “The Betti numbers of the exceptional Lie groups,” Proceedings of the International Congress of Mathematicians, Cambridge, Mass., 1950, Vol. 2, Amer. Math. Soc., Providence, R.I., 1952, pp. 21–24.
[25] K. Iwasawa, “On some types of topological groups,” Annals of Mathematics (2) 50 (1949) 507–558.





Chapter 8

A SURVEY OF SOME RESULTS IN THE LIE GROUP ANALYSIS

Phillip Yam
Department of Applied Mathematics, Hong Kong Polytechnic University, Kowloon, Hong Kong

Lie Group Analysis as a mathematical discipline was born in the 1870s out of some brilliant work by the nineteenth-century mathematician Sophus Lie. Working together with a fellow student, Felix Klein, in Berlin during 1869–70, Lie conceived the notion of studying mathematical systems from the perspective of the transformation groups that leave the systems invariant. In his famous Erlanger program, Klein subsequently pursued the role of finite groups in the study of regular bodies and the theory of algebraic equations, while Lie developed his notion of continuous transformation groups and their role in the theory of differential equations. Lie’s work was a tour de force of the 19th century, and today the theory of continuous groups is a fundamental tool in such diverse areas as analysis, differential geometry, number theory, differential equations, quantum mechanics, high energy physics and gauge theory. Lie’s achievements are striking because he showed that many of the ad hoc methods of integration for ordinary differential equations that were in use before his time were actually direct consequences of his theory. Furthermore, Lie gave a classification of ordinary differential equations in terms of their symmetry groups, thereby identifying the full sets of equations that could be integrated or reduced to lower order. Lie’s classification, in particular, showed that all second-order equations that are integrable by his methods are reducible to one of exactly four distinct canonical forms simply by a suitable change of variables. It follows that by subjecting these four canonical equations to suitable changes of variables alone, one obtains all the known equations that are integrable by the old methods, as well as infinitely many more integrable equations that are not yet known.
Subsequent developments showed that Lie’s theory can in fact be used as a universal tool for tackling a large class of differential equations, both ordinary and partial, when all other means of integration fail. It is known, for example, that group analysis is the only effective and universally used method for solving many nonlinear differential equations, e.g. the KdV equation, analytically. Group theory considers the symmetry of the underlying problems in formulating mathematical models, with the result that new and often better approaches to solving complex problems are revealed. Despite its great importance and depth of insight, however, the idea of Lie groups had not enjoyed widespread acceptance in the past and the subject had been largely neglected in mathematics. Applications of Lie groups in the area of finance and financial engineering, therefore, are almost unheard of. Ad hoc methods are still being used today instead of Lie’s canonical equations.

4.1 Some Useful Results in the Theory of Ordinary Differential Equations

Existence theorems furnish the core of the general theory of differential equations. In particular, the classical existence and uniqueness theorems for systems of ordinary differential equations of the first order play a central role in the theory of Lie groups, their invariants and invariant equations.

Definition 4.1 A function $y = \varphi(x)$, defined in a neighborhood of $x_0$ and continuously differentiable n times, is said to be a solution of a differential equation $F(x, y, y', \dots, y^{(n)}) = 0$ if $F(x, \varphi(x), \varphi'(x), \dots, \varphi^{(n)}(x)) = 0$ identically for all x in an interval $(x_0 - \varepsilon, x_0 + \varepsilon)$, for some $\varepsilon > 0$.

Since any function $y = y(x)$ represents a curve in the (x, y) plane, solutions of ordinary differential equations are also called integral curves. We first consider the existence of solutions of the initial value problem for the general first-order equation:

$$ \frac{dy}{dx} = f(x, y), \qquad y(x_0) = y_0, \tag{4.1} $$

where $f(x, y)$ is a single-valued continuous function of two variables defined in a rectangular domain $|x - x_0| \le a$, $|y - y_0| \le b$, for some a, b > 0. The question of the existence of solutions to the initial value problem (4.1) was first investigated by Cauchy.
For this reason, initial value problems are also called Cauchy problems. Cauchy obtained sufficient conditions for the existence of integral curves passing through any given point $(x_0, y_0)$ within the domain of f. Furthermore, Cauchy showed that the solutions may not be unique if only the continuity of f is required. For example, the initial value problem

$$ \frac{dy}{dx} = 2\sqrt{|y|}, \qquad y(x_0) = 0 \tag{4.2} $$

has at least two solutions, $y = 0$ and $y = |x - x_0|(x - x_0)$. Cauchy continued his investigations and proved the existence and uniqueness theorem for analytic differential equations, i.e. when f is analytic. His proof is based on the ingenious method of majorants and is applicable to both ordinary and partial differential equations. An alternative approach, known as the method of successive approximations, furnishes a simple proof of the existence and uniqueness theorem for ordinary differential equations satisfying the so-called Lipschitz condition. In fact, this method can be generalized to provide proofs for some major results in functional analysis.

Definition 4.2 We say that a function $f(x, y)$ satisfies the Lipschitz condition if there exists a positive constant K such that

$$ |f(x, y_1) - f(x, y_2)| \le K|y_1 - y_2| \tag{4.3} $$

for any points $(x, y_1)$ and $(x, y_2)$ within the domain of f.

Proposition 4.1 (Existence and uniqueness of solutions for initial value problems) Let $f(x, y)$ satisfy the Lipschitz condition in its domain. Then the initial value problem (4.1) has one and only one solution $y = \varphi(x)$ defined in a neighborhood of $x_0$.
Proof: See Taylor (1996).

The above result can be generalized to systems of first-order differential equations. Indeed, let $y = (y_1, y_2, \dots, y_n)$, $f = (f_1, f_2, \dots, f_n)$ and consider an initial value problem:

$$ \frac{dy}{dx} = f(x, y), \qquad y(x_0) = y_0, \tag{4.4} $$

where $f(x, y)$ is a single-valued continuous vector-function in a domain $|x - x_0| \le a$, $\|y - y_0\| \le b$, for some a, b > 0.

Definition 4.3 The function f is said to satisfy the Lipschitz condition if there exists a positive constant K such that

$$ \|f(x, y_1) - f(x, y_2)\| \le K\|y_1 - y_2\| \tag{4.5} $$

for any values of $(x, y_1)$ and $(x, y_2)$ in the domain of f.

Proposition 4.2 There exists a unique solution to the initial value problem (4.4) for a system of first-order equations if f satisfies the Lipschitz condition.
Proof: See Taylor (1996).

It follows that the general solution of a system of n first-order differential equations

$$ \frac{dy}{dx} = f(x, y) \tag{4.6} $$

depends on precisely n arbitrary constants $C_1, \dots, C_n$, e.g. n arbitrarily chosen initial values $y_{10}, \dots, y_{n0}$ for the dependent variables at $x = x_0$. Consequently, the general solution is written as

$$ y_i = \varphi_i(x, C_1, \dots, C_n), \qquad i = 1, 2, \dots, n. \tag{4.7} $$
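The method of successive approximations mentioned above can be illustrated numerically. The sketch below is not the proof technique itself but a discretized version of it: starting from the constant function $y_0$, it repeatedly applies the map $y \mapsto y_0 + \int_{x_0}^{x} f(s, y(s))\,ds$, with the integral replaced by the trapezoidal rule on a uniform grid. The function name, the grid size and the test equation $y' = y$, $y(0) = 1$ are illustrative assumptions.

```python
import math


def picard(f, x0, y0, x_end, steps=200, iterations=20):
    """Discretized successive approximations for y' = f(x, y), y(x0) = y0.

    Returns the grid and the last iterate evaluated on that grid.
    """
    h = (x_end - x0) / steps
    xs = [x0 + i * h for i in range(steps + 1)]
    ys = [y0] * (steps + 1)          # zeroth approximation: the constant y0
    for _ in range(iterations):
        g = [f(x, y) for x, y in zip(xs, ys)]
        new = [y0]
        for i in range(steps):
            # cumulative trapezoidal rule for the integral of f(s, y(s))
            new.append(new[-1] + 0.5 * h * (g[i] + g[i + 1]))
        ys = new
    return xs, ys


if __name__ == "__main__":
    for k in (1, 2, 5, 20):
        _, ys = picard(lambda x, y: y, 0.0, 1.0, 1.0, iterations=k)
        print(k, ys[-1], "target", math.e)
```

For $f(x, y) = y$ the exact iterates are the Taylor polynomials of $e^x$, so the printed values approach $e$ as the number of iterations grows, up to the discretization error of the trapezoidal rule.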

A single equation of the nth order,

$$ y^{(n)} = f(x, y, y', \dots, y^{(n-1)}), \tag{4.8} $$

can be rewritten as a system of n first-order equations by considering the function y and its successive derivatives as new dependent variables, $y_1 = y,\ y_2 = y',\ \dots,\ y_n = y^{(n-1)}$. The nth order equation is therefore equivalent to the system

$$ \frac{dy_1}{dx} = y_2, \quad \frac{dy_2}{dx} = y_3, \quad \dots, \quad \frac{dy_{n-1}}{dx} = y_n, \quad \frac{dy_n}{dx} = f(x, y_1, \dots, y_n), \tag{4.9} $$

and so a similar result is obtained (by applying Proposition 4.2) for the initial value problem of an nth order equation:

$$ y^{(n)} = f(x, y, y', \dots, y^{(n-1)}), \tag{4.10} $$

$$ y(x_0) = y_0, \quad y'(x_0) = y_0', \quad \dots, \quad y^{(n-1)}(x_0) = y_0^{(n-1)}. \tag{4.11} $$
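The reduction of (4.8) to the system (4.9) translates directly into code. The following sketch wraps a scalar right-hand side $f(x, y, y', \dots, y^{(n-1)})$ into the vector field of (4.9) and integrates it with a classical fourth-order Runge-Kutta step. Both helper names and the test problem $y'' = -y$, $y(0) = 0$, $y'(0) = 1$ (solution $\sin x$) are assumptions made for illustration.

```python
import math


def as_first_order_system(f):
    """Right-hand side of system (4.9) for y^(n) = f(x, y, y', ..., y^(n-1)).

    The state is the list [y1, ..., yn] = [y, y', ..., y^(n-1)].
    """
    def rhs(x, ys):
        # dy1/dx = y2, ..., dy_{n-1}/dx = yn, dyn/dx = f(x, y1, ..., yn)
        return list(ys[1:]) + [f(x, *ys)]
    return rhs


def rk4(rhs, x0, ys0, x_end, steps):
    """Integrate dys/dx = rhs(x, ys) from x0 to x_end with fixed-step RK4."""
    h = (x_end - x0) / steps
    x, ys = x0, list(ys0)
    for _ in range(steps):
        k1 = rhs(x, ys)
        k2 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(ys, k1)])
        k3 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(ys, k2)])
        k4 = rhs(x + h, [y + h * k for y, k in zip(ys, k3)])
        ys = [y + h / 6 * (a + 2 * b + 2 * c + d)
              for y, a, b, c, d in zip(ys, k1, k2, k3, k4)]
        x += h
    return ys


if __name__ == "__main__":
    rhs = as_first_order_system(lambda x, y, yp: -y)   # y'' = -y
    y, yp = rk4(rhs, 0.0, [0.0, 1.0], 1.0, 100)
    print(y, math.sin(1.0), yp, math.cos(1.0))
```

The n initial values in (4.11) are exactly the initial state `ys0` of the system, which is the computational counterpart of the statement that the general solution depends on n arbitrary constants.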

Proposition 4.3 Let f be continuous in a neighborhood of $(x_0, y_0, y_0', \dots, y_0^{(n-1)})$ and satisfy the Lipschitz condition

$$ |f(x, y_1, y_1', \dots, y_1^{(n-1)}) - f(x, y_2, y_2', \dots, y_2^{(n-1)})| \le K\,\|(y_1, y_1', \dots, y_1^{(n-1)}) - (y_2, y_2', \dots, y_2^{(n-1)})\| \tag{4.12} $$

for some constant K. Then the initial value problem (4.10)–(4.11) of the nth order equation has a unique solution defined in a neighborhood of $x_0$.

Again, it follows that the general solution of the nth-order differential equation depends on precisely n arbitrary constants $C_1, \dots, C_n$.

Let a function $f(x, y)$ be locally analytic at a point $(x_0, y_0)$, i.e. expandable in a power series in x and y that converges near the point $(x_0, y_0)$ in the (x, y) plane. Let us first reduce, for the sake of brevity, the initial point $x_0$ to zero by merely changing x into $x - x_0$.

Proposition 4.4 For the Cauchy problem

$$ \frac{dy}{dx} = f(x, y), \qquad y(0) = y_0 \tag{4.13} $$

with a locally analytic function $f(x, y)$, the solution $y = y(x)$ is locally analytic near zero.
Proof: See Taylor (1996).

Proposition 4.5 Given a Cauchy problem

$$ \frac{dy}{dx} = f(x, y), \qquad y(0) = y_0, \tag{4.14} $$

if $f(x, y)$ is analytic in x, y in a neighborhood of $(0, y_0)$, then the solution $y = y(x)$ is analytic near x = 0.
Proof: See Taylor (1996).

Definition 4.4 Given a differential equation, its singular solution is an integral curve such that, for any point $(x_0, y_0)$ on this curve, there are at least two different solutions of the differential equation passing through $(x_0, y_0)$.

Example 4.1 Consider Clairaut’s equation

$$ y = xp + \psi(p). \tag{4.15} $$


By taking $p = y'$ and differentiating both sides, we obtain

$$ (x + \psi'(p))\,\frac{dp}{dx} = 0. \tag{4.16} $$

Now, by equating to zero the second factor on the left-hand side, $dp/dx = 0$, we get $p = C$. Finally, by applying Clairaut’s equation again, one obtains a solution that depends on one arbitrary constant C:

$$ y = Cx + \psi(C). \tag{4.17} $$

An alternative solution is obtained by considering the first factor on the left-hand side of equation (4.16), i.e. $x + \psi'(p) = 0$. We have

$$ x = -\psi'(p), \tag{4.18} $$
$$ y = -p\,\psi'(p) + \psi(p), \tag{4.19} $$

which give a parametric representation of an isolated solution that is distinct from the former family of straight lines. In fact, by eliminating p from equations (4.18) and (4.19), one obtains the so-called envelope of the former family. For instance, for $\psi(p) = p^2$ the family (4.17) consists of the lines $y = Cx + C^2$, and (4.18)–(4.19) give $x = -2p$, $y = -p^2$, i.e. the parabola $y = -x^2/4$. The general solution of Clairaut’s equation, therefore, is made up of the one-parameter family of straight lines (4.17) and its envelope. In particular, for every point $(x_0, y_0)$ lying on the envelope of the family (4.17), two distinct integral curves of Clairaut’s equation exist that pass through that point, namely the envelope itself and a straight line of the family (4.17) with a particular value assigned to C. Hence $(x_0, y_0)$ is a singular point of the problem and the envelope is a singular solution of Clairaut’s equation.

The existence theorems in Propositions 4.1–4.3 describe only the local behavior of integral curves in the vicinity of ordinary points. A global investigation of the existence of solutions in the whole domain requires an analysis of the behavior of integral curves near singular points as well, where the prescribed conditions (e.g. the Lipschitz condition) for the existence and uniqueness of solutions cease to hold. It appears, from our previous studies of analytic solutions, that a natural setting for the investigation of the true nature of solutions, including their singularities, is the extension of differential equations to the complex domain. Indeed, the method of majorants immediately applies to equations

$$ \frac{dw}{dz} = f(z, w) \tag{4.20} $$

with complex variables z, w and $f(z, w)$ a complex-valued function. The existence theorem can now be reformulated as:

Proposition 4.6 Let the function $f(z, w)$ be analytic in a neighborhood of $(z_0, w_0)$. Then the equation (4.20) has a unique solution which reduces to $w_0$ for $z = z_0$ and is analytic in an open disc in the complex z-plane with center at $z_0$.
Proof: Details of the proof are given in Taylor (1996).

Even though the given proof is only valid for points in the disc of convergence of the power series, global analytic solutions in the whole domain can consequently be determined from results in the theory of analytic functions. One advantage of the extension of differential equations to the complex domain is the resulting ability to determine the existence of analytic continuations. In particular, when the general solution is given in closed form, this virtually determines the solution together with all its singularities in the whole domain of existence.

4.2 Some Useful Results in the Theory of Partial Differential Equations

An acquaintance with the elements of the theory of first-order partial differential equations is a prerequisite for Lie’s theory. This section contains the basic classical devices for solving a single differential equation as well as systems of linear homogeneous equations with one dependent variable. The coefficients of all equations are assumed to be locally analytic.

Let $x = (x_1, x_2, \dots, x_n)$ be $n \ge 2$ independent variables and u a single dependent variable. We denote the gradient of u by $p = (p_1, p_2, \dots, p_n)$, where $p_i = \partial u/\partial x_i$. An equation is said to be of the first order if the partial derivatives of the highest order that occur are of order one. A single partial differential equation of the first order with one dependent variable is written as $F(x, u, p) = 0$. The following definition of the solutions to a partial differential equation is classical and similar to that for ordinary differential equations.

Definition 4.5 A function $u = \varphi(x)$, defined and continuously differentiable in a neighborhood of a point $x_0$, is said to be a solution of the partial differential equation $F(x, u, p) = 0$ if the substitution $u = \varphi(x)$, $p_i = \partial\varphi(x)/\partial x_i$ converts it into an identity in a neighborhood of the point $x_0$.

In the case of two independent variables, a solution of that partial differential equation defines a surface in three-dimensional space and therefore it is termed, in the classical literature, an integral surface. The standard form of the linear partial differential equation of the first order is

$$ \xi^1(x)\,p_1 + \dots + \xi^n(x)\,p_n + c(x)\,u = f(x), \tag{4.21} $$

where $\xi^i(x)$, $c(x)$, $f(x)$ are given functions of the independent variables. In particular, the equation is called homogeneous if both $c(x)$ and $f(x)$ are identically zero. In this case, the left-hand side of (4.21) is exactly a linear form in p, i.e. every term involves a derivative $p_i$ that occurs in the first power only. The general quasi-linear equation of the first order is

$$ \xi^1(x, u)\,p_1 + \dots + \xi^n(x, u)\,p_n = g(x, u), \tag{4.22} $$

where $\xi^i$ and g are given functions of both the independent and dependent variables. Equations that are neither linear nor quasi-linear are termed nonlinear partial differential equations. When several partial differential equations of the first order are given instead of a single one, they furnish a simultaneous system of partial differential equations of the first order. The differential equations that form a simultaneous system may be linear, quasi-linear or nonlinear.

Consider a system of ordinary differential equations of the first order with n − 1 dependent variables:

$$ \frac{dy_i}{dx} = f_i(x, y_1, \dots, y_{n-1}), \qquad i = 1, \dots, n-1. \tag{4.23} $$

By the existence theorem of Proposition 4.2, its general solution has the form

$$ y_i = \varphi_i(x, C_1, \dots, C_{n-1}), \qquad i = 1, \dots, n-1, \tag{4.24} $$

so that by the implicit function theorem and the analytic continuation of the $\varphi_i$’s, the system (4.24) can be solved for the constants of integration $C_i$:

$$ \psi_i(x, y_1, \dots, y_{n-1}) = C_i, \qquad i = 1, \dots, n-1. \tag{4.25} $$

Each relation in this collection (4.25) is called a first integral of the system and the collection of all the first integrals is called the general solution of (4.23). This set (4.25) of n − 1 first integrals, however, is not the only possible representation of the general solution of (4.23). Indeed, it can be verified that every relation of the form $\Psi(\psi_1, \dots, \psi_{n-1}) = C$ is a first integral, and so one can replace the functions $\psi_i$ by n − 1 functionally independent functions $\Psi_i(\psi_1, \dots, \psi_{n-1})$, $i = 1, \dots, n-1$.

Definition 4.6 Given a system of n − 1 first order equations (4.23), any relation of the form

$$ \psi(x, y_1, \dots, y_{n-1}) = C, \tag{4.26} $$

that is satisfied by all its solutions $y_i = y_i(x)$, $i = 1, \dots, n-1$, for some function ψ, not identically constant, is called a first integral.

In fact, the system of first order equations (4.23) can be rewritten in the form

$$ \frac{dy_1}{f_1} = \dots = \frac{dy_{n-1}}{f_{n-1}} = \frac{dx}{1}, \tag{4.27} $$

which, because the denominators can be multiplied by any function distinct from zero, can further be rewritten (replacing x by $x_1$ and $y_i$ by $x_{i+1}$, for $i = 1, \dots, n-1$) in the symmetric form

$$ \frac{dx_1}{\xi^1(x)} = \dots = \frac{dx_n}{\xi^n(x)}. \tag{4.28} $$

This form is called symmetric because of the absence of a distinguished independent variable in the n − 1 first order ordinary differential equations. Indeed, any one of the n variables $x_1, \dots, x_n$ can now be taken to be the independent variable. A first integral of the system can now be written as

$$ \psi(x) = C. \tag{4.29} $$

Definition 4.7 A set of n − 1 first integrals

$$ \psi_k(x) = C_k, \qquad k = 1, \dots, n-1, \tag{4.30} $$

is said to be independent if the functions $\psi_k(x)$ are functionally independent, i.e. if no relations of the form $F(\psi_1, \dots, \psi_{n-1}) = 0$ exist.

Clearly, any set of n − 1 independent first integrals can be used to represent the general solution of the system (4.23). Furthermore, since the general solution of a system of n − 1 first order equations depends on precisely n − 1 arbitrary constants, one arrives at the following proposition.

Proposition 4.7 Every system of n − 1 first order ordinary differential equations has n − 1 independent first integrals $\psi_1, \dots, \psi_{n-1}$. Any other first integral ψ of the system can be expressed as

$$ \psi = F(\psi_1, \dots, \psi_{n-1}), \tag{4.31} $$

for some analytic function F.

Proposition 4.8 A function $\psi(x_1, x_2, \dots, x_n)$ is a first integral of the system

$$ \frac{dx_1}{\xi^1(x)} = \dots = \frac{dx_n}{\xi^n(x)} \tag{4.32} $$

if and only if it is a solution of the partial differential equation

$$ \xi^1(x)\frac{\partial\psi}{\partial x_1} + \dots + \xi^n(x)\frac{\partial\psi}{\partial x_n} = 0. \tag{4.33} $$

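Proof: See Taylor (1996).

Proposition 4.9 Let

$$ x_i' = \varphi_i(x), \qquad i = 1, \dots, n \tag{4.35} $$

be an invertible transformation. Then the first-order operator

$$ X = \xi^1(x)\frac{\partial}{\partial x_1} + \dots + \xi^n(x)\frac{\partial}{\partial x_n} \tag{4.34} $$

can be written in terms of the new independent variables $x_1', \dots, x_n'$ in the form

$$ \bar{X} = X(\varphi_1)\frac{\partial}{\partial x_1'} + \dots + X(\varphi_n)\frac{\partial}{\partial x_n'}, \tag{4.36} $$

where

$$ X(\varphi_i) = \xi^1\frac{\partial\varphi_i}{\partial x_1} + \dots + \xi^n\frac{\partial\varphi_i}{\partial x_n}. \tag{4.37} $$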

Proof: See Taylor (1996).

The following homogeneous first-order linear equation can be readily solved by applying Propositions 4.8 and 4.9.

Proposition 4.10 The general solution to the homogeneous linear partial differential equation

$$ X(u) \equiv \xi^1\frac{\partial u}{\partial x_1} + \dots + \xi^n\frac{\partial u}{\partial x_n} = 0 \tag{4.38} $$

is

$$ u = F(\psi_1(x), \dots, \psi_{n-1}(x)), \tag{4.39} $$

where F is an arbitrary function of n − 1 variables and

$$ \psi_1(x) = C_1, \ \dots, \ \psi_{n-1}(x) = C_{n-1} \tag{4.40} $$

is a set of n − 1 independent first integrals of the system

$$ \frac{dx_1}{\xi^1(x)} = \dots = \frac{dx_n}{\xi^n(x)}. \tag{4.41} $$
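A small numerical experiment illustrates Proposition 4.10 in the simplest case n = 2. The operator chosen here, $X = -y\,\partial/\partial x + x\,\partial/\partial y$, is an illustrative assumption and not taken from the text: its characteristic system $dx/(-y) = dy/x$ gives $x\,dx + y\,dy = 0$, so $\psi = x^2 + y^2$ is a first integral, and every $u = F(x^2 + y^2)$ should satisfy $X(u) = 0$. The sketch checks this with central finite differences.

```python
import math


def apply_X(u, x, y, h=1e-5):
    """Approximate X(u) = -y u_x + x u_y at (x, y) by central differences."""
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return -y * u_x + x * u_y


def psi(x, y):
    """First integral of dx/(-y) = dy/x."""
    return x * x + y * y


def make_solution(F):
    """Build u = F(psi(x, y)), the form (4.39) for n = 2."""
    return lambda x, y: F(psi(x, y))


if __name__ == "__main__":
    u = make_solution(lambda s: math.sin(s) + s * s)
    print("X(u) for u = F(psi):", apply_X(u, 0.3, 0.7))
    print("X(u) for u = x     :", apply_X(lambda x, y: x, 0.3, 0.7))
```

The second line of output is a control: $u = x$ is not a function of ψ alone, and $X(x) = -y$ is visibly non-zero.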


Proposition 4.11 Given a particular solution $u = \varphi(x)$ of the non-homogeneous equation

$$ X(u) = f(x), \tag{4.42} $$

its general solution is obtained by adding to $\varphi(x)$ the general solution of the corresponding homogeneous equation $X(u) = 0$. Thus, by virtue of Proposition 4.10, the general solution of the equation is

$$ u = \varphi(x) + F(\psi_1(x), \dots, \psi_{n-1}(x)), \tag{4.43} $$

where F is an arbitrary function.

Proposition 4.11 reduces the problem of integration of a non-homogeneous linear partial differential equation $X(u) = f(x)$ to that of the associated system of ordinary differential equations

$$ \frac{dx_1}{\xi^1(x)} = \dots = \frac{dx_n}{\xi^n(x)}, $$

provided that a particular solution $\varphi(x)$ of the non-homogeneous equation is known. In general, it is not a simple matter to find a solution $\varphi(x)$. However, one can easily find a desired particular solution in some special cases.

Proposition 4.12 For every first-order partial differential operator X of the form (4.34), there exist variables $x_i'$ such that X is reduced to the one-dimensional form

$$ X = \xi(x')\frac{\partial}{\partial x_n'}. \tag{4.44} $$

The variables $x_1', \dots, x_n'$ can be determined by solving the homogeneous equation $X(u) = 0$.

Proposition 4.13 (Laplace’s method) Suppose $\xi^n(x) \ne 0$. Then there exist variables $x_i'$ such that the non-homogeneous equation

$$ X(u) = g(x, u) \tag{4.45} $$

can be written in the form of the first-order ordinary differential equation

$$ \xi^n(x)\frac{\partial u}{\partial x_n'} = g(x, u). \tag{4.46} $$

Proof: Choose

$$ x_1' = \psi_1(x), \ \dots, \ x_{n-1}' = \psi_{n-1}(x), \quad x_n' = x_n, \tag{4.47} $$

where $\psi_1(x), \dots, \psi_{n-1}(x)$ are any n − 1 functionally independent solutions of the associated homogeneous equation $X(u) = 0$. By Proposition 4.9, $X(x_i') = X(\psi_i) = 0$ for $i = 1, \dots, n-1$ and $X(x_n') = \xi^n(x)$, which gives (4.46).

Now, consider the general quasi-linear equation

$$ \xi^1(x, u)\frac{\partial u}{\partial x_1} + \dots + \xi^n(x, u)\frac{\partial u}{\partial x_n} = g(x, u). \tag{4.48} $$

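Let us define u as a function of x implicitly by

$$ V(x, u) = 0 \tag{4.49} $$

and treat V as an unknown function of n + 1 variables, x and u. Let

$$ D_i = \frac{\partial}{\partial x_i} + p_i\frac{\partial}{\partial u} \tag{4.50} $$

be the operator of total differentiation with respect to $x_i$. We have, upon differentiation of equation (4.49),

$$ D_i V \equiv \frac{\partial V}{\partial x_i} + p_i\frac{\partial V}{\partial u} = 0, \tag{4.51} $$

whence

$$ p_i = -\frac{\partial V}{\partial x_i}\Big/\frac{\partial V}{\partial u}, \qquad i = 1, \dots, n. \tag{4.52} $$

Replacing $p_i$ by the above expressions, one obtains the homogeneous linear equation

$$ \xi^1(x, u)\frac{\partial V}{\partial x_1} + \dots + \xi^n(x, u)\frac{\partial V}{\partial x_n} + g(x, u)\frac{\partial V}{\partial u} = 0. \tag{4.53} $$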

Proposition 4.14 Suppose the system of equations dx1 dxn du = ··· = = ξ 1(x, u) ξ n (x, u) g(x, u)

(4.54)

has a set of n independent first integrals ψ 1(x, u) = C1 , . . . , ψn (x, u) = Cn .

(4.55)

Then the general solution of the quasi-linear equation (4.48) is defined implicitly by
\[
V(x, u) \equiv F(\psi_1(x, u), \ldots, \psi_n(x, u)) = 0.
\]
In particular, if $\partial V / \partial u \neq 0$, then the solution can be written explicitly as $u = \phi(x)$.

Proof: This is a direct consequence of Proposition 4.10.

We now turn to the problem of solving a system of homogeneous linear equations. Consider the general system of $r$ homogeneous linear partial differential equations in one unknown function $u = u(x)$:
\[
\sum_{i=1}^{n} \xi^i_\alpha(x)\, p_i = 0, \quad \alpha = 1, \ldots, r, \tag{4.56}
\]
where $x = (x_1, \ldots, x_n)$ are the independent variables and $p_i = \partial u / \partial x_i$ are the partial derivatives. By introducing the differential operators
\[
X_\alpha = \xi^1_\alpha(x) \frac{\partial}{\partial x_1} + \cdots + \xi^n_\alpha(x) \frac{\partial}{\partial x_n}, \quad \alpha = 1, \ldots, r, \tag{4.57}
\]



the system (4.56) can be rewritten in the compact form
\[
X_1(u) = 0, \; \ldots, \; X_r(u) = 0. \tag{4.58}
\]
Clearly, the system (4.58) has the trivial solution $u = \text{constant}$, which is of no interest to us. It is also clear that if $u$ solves $s \le r$ equations $X_\alpha(u) = 0$, $\alpha = 1, \ldots, s$, then it also satisfies their linear combination with arbitrary variable coefficients, i.e. $\sum_{\alpha=1}^{s} \lambda_\alpha(x) X_\alpha(u) = 0$. This motivates the following definitions.

Definition 4.8 Differential operators $X_1, \ldots, X_s$ are said to be connected if there exist functions $\lambda_\alpha(x)$, not all zero, such that
\[
\lambda_1(x) X_1 + \cdots + \lambda_s(x) X_s = 0, \tag{4.59}
\]

where equation (4.59) is satisfied as an operator identity in a neighborhood of a generic point $x$. If equation (4.59) implies that $\lambda_1 = \cdots = \lambda_s = 0$, then we say that the operators $X_1, \ldots, X_s$ are unconnected. In the latter case, the corresponding differential equations $X_1(u) = 0, \ldots, X_s(u) = 0$ are said to be independent.

Definition 4.9 Let $Z_\alpha$ be $r$ linear combinations of the operators $X_\alpha$:
\[
Z_\alpha = \sum_{\beta=1}^{r} h_{\beta,\alpha}(x) X_\beta, \quad \alpha = 1, \ldots, r, \tag{4.60}
\]

where $h_{\beta,\alpha}(x)$ are variable coefficients such that the determinant $|h_{\beta,\alpha}(x)|$ is not zero. The system of linear homogeneous equations $Z_1(u) = 0, \ldots, Z_r(u) = 0$ has the same set of solutions as the original system $X_1(u) = 0, \ldots, X_r(u) = 0$. Consequently, these two systems, as well as the operators $X_\alpha$ and $Z_\alpha$, are said to be equivalent.

Now, the total number of unconnected operators in any given collection is determined by the following well-known algebraic proposition.

Proposition 4.15 The number $r_*$ of unconnected operators in a collection $X_\alpha$ is equal to the rank of the $r \times n$ matrix of their coefficients $\xi^i_\alpha(x)$:
\[
r_* = \operatorname{rank}\,(\xi^i_\alpha). \tag{4.61}
\]
The number $r_*$ is the same for any equivalent operators $Z_\alpha$.

Proof: See Taylor (1996).

According to the last result, any system of $r$ homogeneous linear equations can be replaced by a system of $r_*$ independent equations. It is clear that the system cannot be independent when there are more than $n$ equations. Furthermore, if $r = n$ and the operators are unconnected, i.e. $r_* = r = n$, the determinant of the coefficients is non-zero. In this case, the linear homogeneous algebraic equations $\sum_{i=1}^{n} \xi^i_\alpha(x) p_i = 0$, $\alpha = 1, \ldots, r$, with respect to $p = (p_1, \ldots, p_n)$ yield $p = 0$, and hence the solution of the system is trivial, $u = \text{constant}$. Thus, a necessary condition for the existence of a non-trivial solution


is $r_* < n$. However, the condition $r_* < n$ alone is not sufficient for the existence of non-trivial solutions. Let us see this from the following example.

Example 4.2 Consider the system of two equations:
\[
X_1(u) \equiv z \frac{\partial u}{\partial y} - y \frac{\partial u}{\partial z} = 0, \tag{4.62}
\]
\[
X_2(u) \equiv y \frac{\partial u}{\partial x} + z \frac{\partial u}{\partial y} = 0. \tag{4.63}
\]
It is evident that these equations are independent since they involve differentiations in different variables. One readily obtains the general solution of the first equation in the form $u = v(x, \rho)$, where $\rho = \sqrt{y^2 + z^2}$. Substituting this expression into the second equation and dividing by $y$, we get
\[
\frac{\partial v}{\partial x} + \frac{z}{\rho} \frac{\partial v}{\partial \rho} = 0. \tag{4.64}
\]
Since $v$ does not involve the variable $z$ explicitly, it follows that $\partial v / \partial \rho = 0$ and hence $\partial v / \partial x = 0$. Consequently, $u = v = \text{constant}$. Thus, the equations do not have a non-trivial solution, even though $r_* = r = 2$ is less than the number $n = 3$ of independent variables.

Any solution $u$ of the system of $r$ equations $X_1(u) = 0, \ldots, X_r(u) = 0$ must satisfy the second-order equations $X_\alpha(X_\beta(u)) = 0$ for any values of the indices $\alpha$ and $\beta$. Consequently, $u$ also solves the following equations of the first order:
\[
X_\alpha(X_\beta(u)) - X_\beta(X_\alpha(u)) \equiv \sum_{i=1}^{n} \left( X_\alpha(\xi^i_\beta) - X_\beta(\xi^i_\alpha) \right) \frac{\partial u}{\partial x_i} = 0. \tag{4.65}
\]

Hence one can state that $u$ annuls, together with the operators $X_\alpha$, all their commutators, defined as follows.

Definition 4.10 The commutator of any two operators, $X_\alpha$ and $X_\beta$, is the first-order differential operator $[X_\alpha, X_\beta]$ defined by
\[
[X_\alpha, X_\beta] = X_\alpha X_\beta - X_\beta X_\alpha, \tag{4.66}
\]
or, in the following equivalent form exhibiting the coefficients explicitly:
\[
[X_\alpha, X_\beta] = \sum_{i=1}^{n} \left( X_\alpha(\xi^i_\beta) - X_\beta(\xi^i_\alpha) \right) \frac{\partial}{\partial x_i}. \tag{4.67}
\]
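The coefficient formula (4.67) is easy to evaluate numerically. The following Python sketch (all function names are hypothetical and introduced only for illustration) computes the commutator of the two operators of Example 4.2 at a sample point by central differences, and then the determinant of the coefficient matrix of $X_1$, $X_2$ and $[X_1, X_2]$. A non-zero determinant indicates that the three operators are unconnected there, so the extended system has rank $3 = n$, which is consistent with the conclusion of Example 4.2 that only constant solutions exist.

```python
# Numerical sketch of the coefficient formula (4.67) for a commutator.
# A vector field is represented as a function mapping a point (list of
# floats) to the list of its coefficients xi^i at that point.

def apply_field(field, g, point, h=1e-5):
    """Return X(g) at `point`, i.e. sum_i xi^i * dg/dx_i (central differences)."""
    total = 0.0
    for i, coeff in enumerate(field(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        total += coeff * (g(plus) - g(minus)) / (2.0 * h)
    return total

def commutator(field_a, field_b, point):
    """Coefficients of [X_a, X_b] at `point`: X_a(xi_b^i) - X_b(xi_a^i)."""
    n = len(point)
    return [
        apply_field(field_a, lambda p, i=i: field_b(p)[i], point)
        - apply_field(field_b, lambda p, i=i: field_a(p)[i], point)
        for i in range(n)
    ]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

# Operators of Example 4.2 in the variables (x, y, z).
def X1(p):
    x, y, z = p
    return [0.0, z, -y]      # z d/dy - y d/dz

def X2(p):
    x, y, z = p
    return [y, z, 0.0]       # y d/dx + z d/dy

point = [1.0, 2.0, 3.0]
X12 = commutator(X1, X2, point)              # coefficients of [X1, X2]
det = det3([X1(point), X2(point), X12])      # non-zero means rank 3
print("commutator coefficients:", X12)
print("determinant:", det)
```

Because the coefficients here are linear in the coordinates, the central differences are exact up to rounding, so the sketch is a reliable check for this particular example; for general coefficients it is only an approximation.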

Therefore, any solution of the system of equations also solves the equations [Xα, Xβ ](u) = 0. We now have two alternatives: either some of the commutators are independent of the original operators Xα, or the commutators are linear combinations with variable coefficients of the original operators. The latter case means that the combined set of operators, those original operators together with their commutators, is connected. In the first case, one should consider an extended system of differential equations of the first order obtained by combining X1 (u) = 0, . . . , Xr (u) = 0 with all independent commutator equations



$[X_\alpha, X_\beta](u) = 0$. Then one can apply the previous procedure to this new system again. Proceeding in this manner, one ultimately reaches the second case and hence arrives at what is called a complete system.

Definition 4.11 Let $X_1(u) = 0, \ldots, X_r(u) = 0$ be a system of independent equations. It is called a complete system if all commutators are dependent on the operators $X_\alpha$:
\[
[X_\alpha, X_\beta] = \sum_{\gamma=1}^{r} h^\gamma_{\alpha,\beta}(x) X_\gamma. \tag{4.68}
\]
If $h^\gamma_{\alpha,\beta}(x) = 0$, i.e. if all commutators of the operators $X_\alpha$ vanish, we have a special case of a complete system known as a Jacobian system. The theory of complete systems is based on the following properties.

Proposition 4.16 The commutator $[X_\alpha, X_\beta]$ is invariant under changes of variables. Namely, let $x'_i = \varphi_i(x)$, $i = 1, \ldots, n$, be new variables and let $\bar{X}_\alpha$ denote the operators $X_\alpha$ written in the variables $x'_i$. Then
\[
\overline{[X_\alpha, X_\beta]} = [\bar{X}_\alpha, \bar{X}_\beta]. \tag{4.69}
\]

Therefore, we conclude that the completeness of a system does not depend on the choice of variables.

Proof: For details, see Taylor (1996).

Proposition 4.17 If the system $X_1(u) = 0, \ldots, X_r(u) = 0$ is complete, then any equivalent system $Z_1(u) = 0, \ldots, Z_r(u) = 0$ is also complete.

Proof: This follows directly from the definition of equivalent systems.

Proposition 4.18 Any complete system is equivalent to a Jacobian system.

Proof: The equations $X_1(u) = 0, \ldots, X_r(u) = 0$ of a complete system are, by definition, independent. Hence they can be solved with respect to some $r$ derivatives. Without loss of generality, we assume that the system is solved for the derivatives with respect to the first $r$ variables, $x_1, \ldots, x_r$:
\[
Z_1(u) \equiv \frac{\partial u}{\partial x_1} + \cdots = 0, \;\; \ldots, \;\; Z_r(u) \equiv \frac{\partial u}{\partial x_r} + \cdots = 0, \tag{4.70}
\]
where the dots denote terms not involving the derivatives with respect to $x_1, \ldots, x_r$. This system is complete since it is equivalent to the original complete system. It is evident that the commutators $[Z_\alpha, Z_\beta](u)$ may involve only the derivatives $\partial u / \partial x_{r+1}, \ldots, \partial u / \partial x_n$, whereas the operators $Z_\alpha$, $\alpha = 1, \ldots, r$, contain differentiations with respect to $x_1, \ldots, x_r$. Consequently, $[Z_\alpha, Z_\beta]$ can be expressed as a linear combination of the $Z_\alpha$ only if $[Z_\alpha, Z_\beta] = 0$.

According to the preceding discussion, every system of homogeneous linear partial differential equations can be converted into a complete system by simply adding all independent commutators. The following proposition provides a practical guide for integrating a complete system of independent equations $X_1(u) = 0, \ldots, X_r(u) = 0$.


Proposition 4.19 A complete system of $r$ independent equations $X_1(u) = 0, \ldots, X_r(u) = 0$ has $n - r$ functionally independent solutions, $u = \psi_1(x), \ldots, u = \psi_{n-r}(x)$. The general solution is an arbitrary function $F$ of these $n - r$ particular solutions:
\[
u = F(\psi_1(x), \ldots, \psi_{n-r}(x)). \tag{4.71}
\]

Proof: First, we integrate one of the equations, e.g. $X_1(u) = 0$. Supposing $\xi^1_1(x) \neq 0$, one obtains $n - 1$ independent solutions $u = \psi_{1,1}(x), \ldots, u = \psi_{n-1,1}(x)$. Then one introduces the change of variables $x'_1 = \varphi_1(x)$, $x'_2 = \psi_{1,1}(x)$, $\ldots$, $x'_n = \psi_{n-1,1}(x)$, where $\varphi_1$ is functionally independent of the $\psi_{i,1}$, which reduces the operator $X_1$ to the form $Y_1 = f(x)\, \partial / \partial x'_1$. Consequently, the original system is equivalent to a new complete system of the form
\[
Y_1(u) \equiv \frac{\partial u}{\partial x'_1} = 0, \tag{4.72}
\]
\[
Y_\alpha(u) \equiv \frac{\partial u}{\partial x'_\alpha} + \sum_{s=r+1}^{n} \eta_{s,\alpha} \frac{\partial u}{\partial x'_s} = 0, \quad \alpha = 2, \ldots, r. \tag{4.73}
\]
Therefore, $u = v_1(x'_2, \ldots, x'_n)$, a function of the $n - 1$ variables $x'_2, \ldots, x'_n$ only. According to the proof of Proposition 4.18, we can conclude that the system (4.72)-(4.73) is a Jacobian system. Hence, all commutators of the operators $Y_\beta$ vanish. In particular,
\[
[Y_1, Y_\alpha] = \sum_{s=r+1}^{n} \frac{\partial \eta_{s,\alpha}}{\partial x'_1} \frac{\partial}{\partial x'_s} = 0, \quad \alpha = 2, \ldots, r, \tag{4.74}
\]

whence $\partial \eta_{s,\alpha} / \partial x'_1 = 0$, i.e. every $\eta_{s,\alpha}$ is a function of $x'_2, \ldots, x'_n$ only. Therefore, the set of solutions of the original system is the same as the set of solutions of the new system $Y_2(u) = 0, \ldots, Y_r(u) = 0$, which forms a Jacobian system of $r - 1$ equations in the $n - 1$ independent variables $x'_2, \ldots, x'_n$. Applying the preceding procedure to this new system and continuing in this manner $r - 2$ more times, one ultimately reduces the original system to a single homogeneous linear equation with $n - r + 1$ independent variables. Its integration provides the general solution (4.71).

In practice, if some operators $X_\gamma$ do not enter in the right-hand sides of the commutator relations $[X_\alpha, X_\beta] = \sum_{\gamma=1}^{r} h^\gamma_{\alpha,\beta}(x) X_\gamma$, then it is better to start the integration of the system by first solving the sub-system $X_\gamma(u) = 0$ with those $X_\gamma$ involved in the right-hand sides of the commutator relations.

Example 4.3 In the space of four variables $(x, y, z, t)$, consider the system of two equations
\[
X_1(u) \equiv z \frac{\partial u}{\partial y} - y \frac{\partial u}{\partial z} = 0, \tag{4.75}
\]
\[
X_2(u) \equiv \frac{\partial u}{\partial x} + t \frac{\partial u}{\partial y} + y \frac{\partial u}{\partial t} = 0. \tag{4.76}
\]



Their commutator has the form $[X_1, X_2] = X_3$, where
\[
X_3 \equiv t \frac{\partial}{\partial z} + z \frac{\partial}{\partial t}. \tag{4.77}
\]
The three equations $X_1(u) = 0$, $X_2(u) = 0$ and $X_3(u) = 0$ form a complete system since
\[
[X_1, X_2] = X_3, \qquad [X_1, X_3] = -\frac{t}{z} X_1 - \frac{y}{z} X_3, \qquad [X_2, X_3] = -X_1. \tag{4.78}
\]
Since the right-hand sides of the commutator relations contain no $X_2$, we first solve the equations $X_1(u) = 0$, $X_3(u) = 0$. The first equation yields $u = v(x, t, \rho)$, where $\rho = y^2 + z^2$. Then $X_3(u) = 0$ reduces to the equation
\[
2t \frac{\partial v}{\partial \rho} + \frac{\partial v}{\partial t} = 0, \tag{4.79}
\]
whence $v = w(x, \lambda)$, where $\lambda = \rho - t^2 = y^2 + z^2 - t^2$. Finally, the last equation, $X_2(u) = 0$, reduces to $\partial w / \partial x = 0$. Thus, the general solution is
\[
u = F(y^2 + z^2 - t^2), \tag{4.80}
\]

where F is an arbitrary function.
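The result of Example 4.3 can be spot-checked numerically. The Python sketch below assumes one particular choice of the arbitrary function, $F = \sin$ (any smooth function would do), and evaluates $X_1(u)$, $X_2(u)$ and $X_3(u)$ at a sample point by central differences. The helper names are hypothetical, and the check is only approximate.

```python
import math

def apply_operator(coeffs, u, point, h=1e-5):
    """Approximate X(u) = sum_i xi^i * du/dx_i at `point` by central differences."""
    total = 0.0
    for i, c in enumerate(coeffs(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        total += c * (u(plus) - u(minus)) / (2.0 * h)
    return total

# Coefficients of the operators of Example 4.3, coordinates ordered (x, y, z, t).
def X1(p):
    x, y, z, t = p
    return [0.0, z, -y, 0.0]     # z d/dy - y d/dz

def X2(p):
    x, y, z, t = p
    return [1.0, t, 0.0, y]      # d/dx + t d/dy + y d/dt

def X3(p):
    x, y, z, t = p
    return [0.0, 0.0, t, z]      # t d/dz + z d/dt

def u(p):
    """Candidate solution u = F(y^2 + z^2 - t^2) with the assumed choice F = sin."""
    x, y, z, t = p
    return math.sin(y * y + z * z - t * t)

point = [0.3, 1.1, -0.7, 0.5]
residuals = [apply_operator(X, u, point) for X in (X1, X2, X3)]
print(residuals)   # each entry should be close to zero
```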

APPLICATIONS OF LIE GROUP ANALYSIS TO PARTIAL DIFFERENTIAL EQUATIONS ARISING IN FINANCE

The concept of a group is closely related to that of invariance or symmetry of mathematical objects: surfaces, functions, differential equations, etc. Indeed, given any object $M$, the set $G$ of all invertible transformations $T$ leaving the object $M$ unaltered,
\[
T : M \to M, \tag{5.1}
\]
contains the identity transformation $I$, the inverse $T^{-1}$ of any transformation $T \in G$ and the composition $T_1 T_2$ of any transformations $T_1, T_2 \in G$. Therefore, $G$ is a group, usually called a symmetry group of the object $M$.

The symmetry of algebraic equations is discussed in Galois theory, whereas Lie group theory deals with symmetries of differential equations. Lie groups, unlike Galois groups, contain infinitely many transformations and depend on continuous parameters. Consequently, Lie’s theory deals with continuous groups. The crucial idea of this theory is to employ infinitesimal transformations instead of finite group transformations. Any Lie group is uniquely determined by its infinitesimal transformations. These infinitesimal transformations, in turn, form a vector space closed under the commutator, i.e. they form a Lie algebra. For general discussions on the theory of Lie groups, see Bröker and Dieck (1985) and Chevalley (1946). See also Jacobson (1979) for general discussions on Lie algebras.

5.1 An Overview of the Theory of Lie Groups


5.1.1 Applications of Lie Groups to Differential Equations

A set $G$ of invertible point transformations in the $(x, y)$ plane,
\[
\bar{x} = f(x, y, \varepsilon), \quad \bar{y} = g(x, y, \varepsilon), \tag{5.2}
\]
with
\[
f(x, y, 0) = x, \quad g(x, y, 0) = y, \tag{5.3}
\]
depending on a parameter $\varepsilon$, is called a one-parameter continuous group if $G$ contains the identity transformation as well as the inverse of each of its elements and the composition of any two of them. The totality of these transformations is said to form a symmetry group of a differential equation $F(x, y, y', \ldots, y^{(n)}) = 0$, or a group admitted by this equation, if the equation is form-invariant with respect to the group, i.e. $F(\bar{x}, \bar{y}, \bar{y}', \ldots, \bar{y}^{(n)}) = 0$ whenever $F(x, y, y', \ldots, y^{(n)}) = 0$.

Lie’s theory reduces the construction of the largest symmetry group $G$ to the determination of its infinitesimal transformations,
\[
\bar{x} \approx x + \varepsilon\, \xi(x, y), \quad \bar{y} \approx y + \varepsilon\, \eta(x, y), \tag{5.4}
\]
defined as the linear parts, with respect to $\varepsilon$, in the Taylor series expansions of the finite transformations of $G$. It is convenient to represent an infinitesimal transformation by the linear differential operator
\[
X = \xi(x, y) \frac{\partial}{\partial x} + \eta(x, y) \frac{\partial}{\partial y}, \tag{5.5}
\]
called the infinitesimal operator or the generator of the group $G$. The generator $X$ of a group admitted by a differential equation is also termed an operator admitted by the equation.

An infinitesimal operator admitted by the differential equation satisfies a system of linear partial differential equations termed the determining equations of the infinitesimal symmetries. The chief property of the determining equations is that the vector space of their solutions is closed under the commutator, i.e. the commutator $X = [X_1, X_2]$ of infinitesimal operators $X_1$, $X_2$ is again an infinitesimal operator admitted by the same differential equation. Hence the infinitesimal operators admitted by a given differential equation form a Lie algebra.

An essential feature of a symmetry group $G$ is that it conserves the set of solutions of the differential equation admitting this group. Namely, the symmetry transformations merely permute the integral curves among themselves. It may happen that some of the integral curves are individually unaltered under $G$. Such integral curves are termed invariant solutions.

Consider an $n$-th order ordinary differential equation $F(y, y', \ldots, y^{(n)}) = 0$. Since this equation does not explicitly contain the independent variable $x$, it is not altered by any transformation $\bar{x} = x + \varepsilon$ with an arbitrary parameter $\varepsilon$. These transformations form a group known as the group of translations along the $x$-axis. In fact, this differential equation is the most general one admitting the group of translations along the $x$-axis. Moreover, any one-parameter group reduces, after a change of variables, to the group of translations. These new variables, called canonical variables $t$ and $u$, are obtained by solving the equations
\[
X(t) = 1 \quad \text{and} \quad X(u) = 0. \tag{5.6}
\]
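As an illustration of canonical variables (a standard example, not taken from the text), consider the generator of the scaling group, $X = x\,\partial/\partial x + y\,\partial/\partial y$, in the region $x > 0$. Solving (5.6) gives $t = \ln x$ and $u = y/x$, and in these variables a first-order equation of the assumed form $y' = f(y/x)$, which admits this group, becomes separable:

```latex
\begin{aligned}
& X = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}: \qquad
t = \ln x, \quad u = \frac{y}{x}, \qquad
X(t) = x \cdot \frac{1}{x} = 1, \quad
X(u) = x\left(-\frac{y}{x^2}\right) + y \cdot \frac{1}{x} = 0, \\
& y = x u \;\Longrightarrow\; y' = u + x\frac{du}{dx} = u + \frac{du}{dt}, \qquad
y' = f\!\left(\frac{y}{x}\right) \;\Longrightarrow\; \frac{du}{dt} = f(u) - u, \qquad
t = \int \frac{du}{f(u) - u} + C .
\end{aligned}
```

The reduced equation no longer contains $t$ explicitly, which is exactly the translation invariance described above.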



Therefore, an $n$-th-order ordinary differential equation $F(x, y, y', \ldots, y^{(n)}) = 0$ admitting a one-parameter group reduces to the form $G(u, u', \ldots, u^{(n)}) = 0$, and hence one can reduce its order to $n - 1$. In particular, any first-order equation with a known one-parameter symmetry group can be integrated by quadrature using canonical variables. On the other hand, integration of second-order equations requires two independent infinitesimal symmetries, i.e. a two-dimensional Lie algebra, and so on.

The great success in integration using symmetries provided Lie with an incentive to begin the classification of all ordinary differential equations of an arbitrary order in terms of symmetry groups, and thus to describe the whole set of equations integrable by his methods. The result of implementing this idea, as it applies e.g. to second-order equations, was sensational. At that time, several hundred special types of second-order differential equations were integrated by ad hoc methods. Lie’s classification showed that most of these equations can be integrated by a single method furnished by group theory and that they can be simplified by a mere change of variables and sorted into only four basic types. Lie’s four canonical equations, admitting two-dimensional Lie algebras, are
\[
y'' = f(y'), \quad y'' = f(x), \quad y'' = \frac{1}{x} f(y'), \quad y'' = f(x)\, y', \tag{5.7}
\]
for an arbitrary function $f$. It is readily seen that each of these equations is integrable by quadratures.

In 1881, Lie gave a group classification of linear second-order partial differential equations with two independent variables, $x$ and $y$:
\[
a_{11} u_{xx} + 2 a_{12} u_{xy} + a_{22} u_{yy} + b_1 u_x + b_2 u_y + c u = 0, \tag{5.8}
\]
where the coefficients $a_{ij}$, $b_i$ and $c$ are arbitrary functions of $x$ and $y$. Lie also developed a new method for constructing exact solutions known today as invariant solutions. In addition, his papers contain many of his original ideas on symmetry analysis and can serve as a concise introduction to practical methods of group classification of partial differential equations. Finally, for general discussions on applications of Lie’s theory to differential equations, see Olver (1986), Ovsyannikov (1982) and Steeb (1996).

5.1.2 Tangent Transformations

In addition to the point transformations considered in the previous subsection, contact transformations (first-order tangent transformations) have been found to be useful in mechanics, optics and geometry. Furthermore, Lie showed that the theory of partial differential equations of the first order reduces to the theory of groups of contact transformations. This is due to the fact that any first-order partial differential equation, $F(x, u, p) = 0$, can be mapped into any other equation of the first order, $H(x, u, p) = 0$, by means of a suitable contact transformation.

Contact transformations exist, in general, in the case of an arbitrary number of independent variables, but a single dependent variable only. These transformations involve the independent variables $x$, a dependent variable $u$ and its partial derivatives $p$. Thus, a contact transformation maps the points $(x, u, p) \in \mathbb{R}^{2n+1}$ into new positions $(\bar{x}, \bar{u}, \bar{p}) \in \mathbb{R}^{2n+1}$:
\[
\bar{x} = f(x, u, p), \quad \bar{u} = g(x, u, p), \quad \bar{p} = h(x, u, p), \tag{5.9}
\]


such that the transformation satisfies the contact condition
\[
d\bar{u} - \sum_i \bar{p}_i\, d\bar{x}_i = \lambda(x, u, p) \Big( du - \sum_i p_i\, dx_i \Big), \tag{5.10}
\]
where $\lambda(x, u, p)$ is an indeterminate multiplier.

Furthermore, contact transformations are of considerable use in the Hamilton-Jacobi theory of integration of Hamilton’s equations of motion and are known in mechanics as canonical transformations. The latter term reflects the chief property of the contact transformations considered in mechanics, namely that they preserve the canonical form
\[
\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i}, \quad i = 1, \ldots, n, \tag{5.11}
\]
of the equations of motion. The crux of the matter is that one can simplify an arbitrary Hamilton-Jacobi equation
\[
\frac{\partial S}{\partial t} + H(t, x, \nabla S) = 0, \tag{5.12}
\]
e.g. reduce it to one with $H = 0$ by a suitable canonical transformation, and hence immediately integrate the new canonical equations of motion. It is clear that the inverse of a canonical transformation and the composition of canonical transformations are again canonical transformations; therefore the totality of all canonical transformations constitutes a group. Thus, Hamilton-Jacobi theory reduces mechanical problems to the theory of contact transformation groups. In the case of several dependent variables, contact transformations are trivial, i.e. they are mere extended point transformations.

The definition of a one-parameter group of point transformations can naturally be generalized to contact transformations depending upon a parameter $\varepsilon$. Lie showed that all one-parameter groups of contact transformations are determined by infinitesimal contact transformations
\[
\bar{x}_i \approx x_i + \varepsilon\, \xi^i(x, u, p), \quad \bar{u} \approx u + \varepsilon\, \eta(x, u, p), \quad \bar{p}_i \approx p_i + \varepsilon\, \zeta_i(x, u, p), \tag{5.13}
\]
where (with summation over the repeated index $i$ implied in the expression for $\eta$)
\[
\xi^i = -\frac{\partial W}{\partial p_i}, \quad \eta = W - p_i \frac{\partial W}{\partial p_i}, \quad \zeta_i = \frac{\partial W}{\partial x_i} + p_i \frac{\partial W}{\partial u}, \tag{5.14}
\]

with an arbitrary function $W = W(x, u, p)$, called by Lie the characteristic function of the infinitesimal contact transformation.

According to the previous discussion, the variety of contact transformations is confined to the set of functions $W(x, u, p)$. This set is adequate for a group-theoretic description of first-order partial differential equations, but it is not sufficient for tackling higher-order equations in a similar way. Therefore, Lie raised the question of the existence of higher-order tangent transformations and emphasized their importance for the theory of higher-order partial differential equations.

Independently, A.V. Bäcklund encountered the question of whether there are surface transformations for which the second-order tangency conditions,
\[
du - \sum_i u_i\, dx_i = 0, \quad du_i - \sum_j u_{ij}\, dx_j = 0, \tag{5.15}
\]



are invariant rather than the first-order tangency condition. He investigated this question in the more general context of arbitrary finite-order tangent transformations of plane curves and of surfaces in three-dimensional space. However, Bäcklund came to a negative answer to this question. When the result was published in 1874, Bäcklund learned that the idea of higher-order tangent transformations and their possible applications in the theory of differential equations had been discussed by Lie. This fact encouraged him to prepare and publish in 1876 a revised and enlarged version of his work.

Bäcklund began with the most general transformations written in the form
\[
\bar{x}_i = f_i(x, u, u_{(1)}, u_{(2)}, \ldots), \quad \bar{u} = g(x, u, u_{(1)}, u_{(2)}, \ldots), \quad i = 1, \ldots, n, \tag{5.16}
\]
where $u_{(1)} = \{u_i\}$, $u_{(2)} = \{u_{ij}\}$, $\ldots$ are the sets of partial derivatives of the first, second and higher orders. These transformations are extended to all derivatives through differentiation and elimination to obtain
\[
\bar{u}_i = h_i(x, u, u_{(1)}, u_{(2)}, \ldots), \quad \bar{u}_{ij} = h_{ij}(x, u, u_{(1)}, u_{(2)}, \ldots), \quad \ldots \tag{5.17}
\]
The resulting extended transformations leave invariant the contact conditions of any order:
\[
du - \sum_i u_i\, dx_i = 0, \quad du_i - \sum_j u_{ij}\, dx_j = 0, \quad \ldots \tag{5.18}
\]
Hence, they can be regarded as infinite-order tangent transformations.

The main emphasis of Bäcklund’s work was on restricted types of extended transformations. He extensively discussed the following two types. The first type comprises those extended transformations that are closed and invertible in a finite-dimensional space of variables $(x, u, u_{(1)}, \ldots, u_{(s)})$, for some fixed finite $s$:
\[
\bar{x} = f, \quad \bar{u} = g, \quad \bar{u}_i = h_i, \quad \ldots, \quad \bar{u}_{i_1 \cdots i_s} = h_{i_1 \cdots i_s}, \tag{5.19}
\]
where $f$, $g$, $h_i$, $\ldots$, $h_{i_1 \cdots i_s}$ are functions of the variables $x, u, u_{(1)}, \ldots, u_{(s)}$ only. The point and contact transformations belong to this type with $s = 0$ and $s = 1$, respectively. However, Bäcklund proved that they are the only transformations of this type. Nevertheless, the limiting case $s \to \infty$, i.e. the case of infinite-order tangent transformations, remains as a possible candidate for a non-trivial generalization of contact transformations. Bäcklund himself did not investigate the details of this possibility.

The second type consists of infinite-valued transformations widely known in the literature as Bäcklund transformations. Various Bäcklund transformations were found in later times and were widely used in problems of geometry and the theory of differential equations. However, a possible path of generalization provided by infinite-order tangent transformations was not elaborated until recently. The main obstacle in this direction of study is the fact that the definition and invertibility test of the extended transformations involving infinitely many variables are, in general, incomprehensible.

Recently, an attempt was undertaken to tackle the problem by imposing one-parameter Lie group structures on these transformations and using Lie’s infinitesimal technique. A Lie group of invertible infinite-order tangent transformations is now termed a Lie-Bäcklund transformation group in recognition of the fundamental contributions of Lie


and Bäcklund. The extensions of point and contact transformation groups to all derivatives provide examples of Lie-Bäcklund transformation groups.

In modern group analysis, there exists a variety of so-called generalized symmetries which generalize Lie’s infinitesimal operators of point and contact transformation groups. However, the problem still remains whether these generalized symmetries generate, via the Lie equations, a group. The problem has thus far been solved for the generalization furnished by Lie-Bäcklund operators,
\[
X = \sum_{i=1}^{n} \xi^i \frac{\partial}{\partial x_i} + \sum_{j=1}^{m} \eta^j \frac{\partial}{\partial u^j} + \cdots, \tag{5.20}
\]
where $\xi^i$, $\eta^j$ are arbitrary locally analytic functions of the independent and dependent variables $x$, $u$, and of the derivatives $u_{(1)} = \{u^j_i\}, \ldots, u_{(s)} = \{u^j_{i_1 \cdots i_s}\}$ up to a finite order $s$. The dots mean that the action of $X$ is extended to all derivatives $u_{(2)}, u_{(3)}, \ldots$. The main theorem states that a Lie-Bäcklund operator generates a group of Lie-Bäcklund transformations represented by the exponential map:
\[
\bar{x}_i = e^{\varepsilon X}(x_i) \equiv x_i + \varepsilon X(x_i) + \cdots + \frac{\varepsilon^s}{s!} X^s(x_i) + \cdots, \tag{5.21}
\]
\[
\bar{u}^j = e^{\varepsilon X}(u^j) \equiv u^j + \varepsilon X(u^j) + \cdots + \frac{\varepsilon^s}{s!} X^s(u^j) + \cdots. \tag{5.22}
\]
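To make the exponential map (5.21) concrete, the following Python sketch sums the series for the point generator $X = x^2\, \partial/\partial x$ in one variable. This generator is an assumption chosen purely for illustration. Since $X^s(x) = s!\, x^{s+1}$, the series is geometric and should converge to $x / (1 - \varepsilon x)$ for $|\varepsilon x| < 1$. Polynomials are stored as coefficient lists, and the helper names are hypothetical.

```python
from fractions import Fraction

def apply_X(poly):
    """Apply X = x^2 d/dx to a polynomial given as coefficients [c0, c1, c2, ...]."""
    # d/dx maps c_k x^k to k c_k x^(k-1); multiplying by x^2 gives k c_k x^(k+1).
    result = [Fraction(0)] * (len(poly) + 1)
    for k, c in enumerate(poly):
        if k > 0:
            result[k + 1] += k * c
    return result

def evaluate(poly, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**k for k, c in enumerate(poly))

def exp_map(x, eps, terms=40):
    """Partial sum of e^{eps X}(x) = sum_s (eps^s / s!) X^s(x)."""
    power = [Fraction(0), Fraction(1)]   # the polynomial "x", i.e. X^0(x)
    total = Fraction(0)
    factorial = 1
    for s in range(terms):
        if s > 0:
            factorial *= s
        total += eps**s / factorial * evaluate(power, x)
        power = apply_X(power)           # next iterate X^(s+1)(x)
    return total

x, eps = Fraction(1, 2), Fraction(3, 10)
series_value = exp_map(x, eps)
closed_form = x / (1 - eps * x)
print(float(series_value), float(closed_form))
```

Exact rational arithmetic is used so that the only discrepancy between the two printed values is the truncation of the series, not rounding.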

5.1.3 Modern Theory of Lie Groups

The rich store of results of Lie’s work remained for a long time the special preserve of a few. Consequently, symmetry analysis of differential equations lay dormant until the 1950s. In the 1960s, however, the group-theoretic approach to differential equations was restored and tested in new situations. The result was widespread interest and a flurry of activity in applied group analysis, which was regarded as a tool to search for exact solutions of differential equations as well as conservation laws, in particular first integrals of ordinary differential equations.

Conservation equations express unalterable laws of nature. Consequently, fundamental mathematical models are formulated in the form of conservation laws. The concept of a conservation law generalizes the notion of first integrals of systems of first-order ordinary differential equations to arbitrary ordinary and partial differential equations. The term conservation law is motivated by the conservation of such physical quantities as energy, momentum, etc. It has been known for a long time that the conservation laws of mechanics are connected with the symmetry properties of the physical system. E. Noether, in her paper written in 1918 under the influence of F. Klein, combined the methods of variational calculus with the theory of Lie groups and offered a general approach for constructing conservation laws for Euler-Lagrange equations when their symmetries are known. The Noether theorem states, e.g. in the case of a first-order Lagrangian $L(x, u, u_{(1)})$, that if the integral $\int L\, dx$ is invariant under an infinitesimal transformation $\bar{x}_i \approx x_i + \varepsilon\, \xi^i$, $\bar{u}^j \approx u^j + \varepsilon\, \eta^j$, then the quantities
\[
T_i = L\, \xi^i + \sum_j \Big( \eta^j - \sum_k \xi^k u^j_k \Big) \frac{\partial L}{\partial u^j_i}, \quad i = 1, \ldots, n, \tag{5.23}
\]



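satisfy the conservation equation
\[
\sum_{i=1}^{n} D_i T_i = 0 \tag{5.24}
\]
whenever the functions $u^j$ solve the Euler-Lagrange equations
\[
\frac{\partial L}{\partial u^j} - \sum_{i=1}^{n} D_i \Big( \frac{\partial L}{\partial u^j_i} \Big) = 0 \quad \text{for each } j, \tag{5.25}
\]
where $D_i$ is the total differentiation operator
\[
D_i = \frac{\partial}{\partial x_i} + \sum_j u^j_i \frac{\partial}{\partial u^j} + \sum_{j,k} u^k_{ij} \frac{\partial}{\partial u^k_j}. \tag{5.26}
\]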
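As a quick check of (5.23)-(5.25) in the simplest setting (an illustrative example, not from the text), take one independent variable $x_1 = t$, one dependent variable $u$, and the Lagrangian $L = \tfrac{1}{2} u_t^2 - V(u)$, where $V$ is an assumed smooth potential. The integral $\int L\, dt$ is invariant under time translations, $\xi = 1$, $\eta = 0$, and the resulting Noether quantity is minus the energy:

```latex
\begin{aligned}
& T = L\,\xi + (\eta - \xi u_t)\frac{\partial L}{\partial u_t}
   = \tfrac{1}{2}u_t^2 - V(u) - u_t^2
   = -\left(\tfrac{1}{2}u_t^2 + V(u)\right), \\
& \frac{\partial L}{\partial u} - D_t\!\left(\frac{\partial L}{\partial u_t}\right)
   = -V'(u) - u_{tt} = 0
   \;\Longrightarrow\;
   D_t T = -u_t\left(u_{tt} + V'(u)\right) = 0 .
\end{aligned}
```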

The application of Lie’s theory of differential equations to mathematical models in the natural sciences advanced substantially in the 1960s, to a great extent due to the work of L.V. Ovsyannikov. For details, see Ovsyannikov (1982). His research awakened a broad interest in the subject amongst applied mathematicians. He supplemented Lie’s group classification of linear second-order differential equations in two independent variables and discovered new invariants of differential equations. Ovsyannikov’s invariants are closely related to the Laplace invariants and are used in modern group analysis, e.g. for the group-theoretic treatment of Riemann’s method of solving the Cauchy problem.

Special types of exact solutions, widely known today as invariant solutions, have long been successfully applied to the analysis of specific problems and were in common use in mechanics and physics prior to the popularity of group theory. Lie came to these solutions from the point of view of group invariance and proved the main theorem on invariant solutions of partial differential equations. Ovsyannikov’s work made it possible to clarify many intuitive ideas and apply them to numerous equations of mechanics, as a result of which the method of invariant solutions could be included as an important integral part of modern group analysis. It was precisely through the concept of an invariant solution that the theater of action of group analysis shifted in the 1960s from ordinary differential equations to problems of mechanics and mathematical physics. In addition, Ovsyannikov introduced a new class of solutions called partially invariant solutions. He also suggested classifying all distinctly different invariant and partially invariant solutions by constructing optimal systems of subalgebras of Lie algebras. Invariant and partially invariant solutions are represented via invariants. Finally, the search for these types of solutions reduces the total number of variables of the differential equations under consideration. For example, first-order ordinary differential equations reduce to algebraic equations.

Example 5.1 Consider a partial differential equation of the second order
\[
a_{11} u_{xx} + 2 a_{12} u_{xy} + a_{22} u_{yy} + b_1 u_x + b_2 u_y + c u = 0, \tag{5.27}
\]
where $a_{ij}$, $b_i$, $c$ are arbitrary functions independent of $y$. It is obvious that the equation admits the following generator with an arbitrary constant $\lambda$:
\[
X = \frac{\partial}{\partial y} + \lambda u \frac{\partial}{\partial u}. \tag{5.28}
\]


The equation $X(J) = 0$ provides two independent invariants, $x$ and $u e^{-\lambda y}$. Hence, one can take invariant solutions in the form $u = e^{\lambda y} v(x)$. Then the above partial differential equation reduces to
\[
a_{11} v'' + (2 \lambda a_{12} + b_1) v' + (\lambda^2 a_{22} + \lambda b_2 + c) v = 0. \tag{5.29}
\]
If $a_{11} \neq 0$, this ordinary differential equation has two linearly independent solutions, $v_1 = \varphi_1(x, \lambda)$ and $v_2 = \varphi_2(x, \lambda)$, involving $\lambda$. A solution of the original equation containing two arbitrary functions, $f_1(\lambda)$ and $f_2(\lambda)$, is given by the integrals with any constant limits $\alpha_1$, $\beta_1$, $\alpha_2$, $\beta_2$:
\[
u = \int_{\alpha_1}^{\beta_1} f_1(\lambda)\, e^{\lambda y} \varphi_1(x, \lambda)\, d\lambda + \int_{\alpha_2}^{\beta_2} f_2(\lambda)\, e^{\lambda y} \varphi_2(x, \lambda)\, d\lambda. \tag{5.30}
\]
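The reduction in Example 5.1 can be spot-checked numerically in a simple special case. The Python sketch below assumes constant coefficients $a_{11} = a_{22} = 1$ and $a_{12} = b_1 = b_2 = c = 0$ (Laplace’s equation, chosen here for illustration only), for which (5.29) becomes $v'' + \lambda^2 v = 0$ with the solution $v = \cos(\lambda x)$. Derivatives are approximated by second-order central differences, so the residuals are only expected to be small, not exactly zero.

```python
import math

LAMBDA = 0.7  # arbitrary constant lambda in the generator (5.28)

def v(x):
    """Solution of the reduced ODE v'' + LAMBDA^2 v = 0 (assumed special case)."""
    return math.cos(LAMBDA * x)

def u(x, y):
    """Invariant solution u = exp(LAMBDA * y) * v(x)."""
    return math.exp(LAMBDA * y) * v(x)

def second_derivative(g, s, h=1e-3):
    """Central-difference approximation of g''(s)."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

x0, y0 = 0.4, -0.2
u_xx = second_derivative(lambda s: u(s, y0), x0)
u_yy = second_derivative(lambda s: u(x0, s), y0)

pde_residual = u_xx + u_yy                                   # Laplace equation
ode_residual = second_derivative(v, x0) + LAMBDA**2 * v(x0)  # reduced ODE
print(pde_residual, ode_residual)
```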

When we pass from ordinary to partial differential equations, it becomes impossible (with rare exceptions), and in any case virtually futile, to write out general solutions. Mathematical physics, however, seeks only those solutions that satisfy given boundary conditions. Group invariant solutions can serve in the solution of invariant boundary-value problems. This application is governed by the following rule.

Invariance principle: If a boundary-value problem is invariant under a group $G$, then we should seek a solution among functions invariant under $G$. Here, invariance of a boundary-value problem means that the differential equation, the manifold where the data are given, and the data themselves are invariant under $G$.

Invariant solutions are usually regarded as not particularly useful for solving boundary-value problems because arbitrary data are not invariant. This argument seems irrefutable. However, it was shown recently that the invariance principle furnishes a systematic method for solving linear initial-value problems when it is combined with the theory of distributions and fundamental solutions. The new approach, however, necessitates an extension of Lie group methods to differential equations in distributions. In this thesis, I investigate this application to a number of boundary-value problems arising in mathematical finance.

5.2 Lie Groups of Transformations and Infinitesimal Transformations

Consider invertible transformations in a region D of $\mathbb{R}^n$ defined by equations of the form

\[ \bar{x}_i = f_i(x), \quad i = 1, \dots, n, \tag{5.31} \]

where $\bar{x}, x \in D$. We call a transformation smooth if it is infinitely differentiable. From here on, we shall assume every transformation to be smooth.

Definition 5.1 (Group of transformations) A group of transformations in $D \subset \mathbb{R}^n$ is a set G of transformations of the form (5.31) that contains the identity transformation I, as well as the inverse and the product (composition) of all transformations belonging to G.

5.2.1. One-parameter Lie Groups of Transformations

Definition 5.2 (One-parameter (continuous) group of transformations) Let $x$ lie in a region $D \subset \mathbb{R}^n$. A group of transformations

\[ \bar{x} = f(x, \varepsilon), \tag{5.32} \]

with each transformation indexed by a parameter $\varepsilon$ lying in an interval $U \subset \mathbb{R}$, is called a one-parameter (continuous) group of transformations if the following conditions hold:

(i) $f$ is smooth in both $x$ and $\varepsilon$.
(ii) The law of composition $\phi(\varepsilon, \delta)$, $\varepsilon, \delta \in U$, is smooth in its arguments.
(iii) There is a unique $e$ in $U$ providing the identity transformation, i.e. $f(x, e) = x$.
(iv) For any $\varepsilon, \delta$, we have $f^{-1}(x, \varepsilon) = f(x, \varepsilon^{-1})$ and $f(f(x, \varepsilon), \delta) = f(x, \phi(\varepsilon, \delta))$; here we require $(\cdot)^{-1}$ to be a smooth function in $U$.

If one thinks of $\varepsilon$ as a time variable and $x$ as spatial variables, then a one-parameter group of transformations in effect defines a stationary flow in D. We call the locus traced by a point $x_0$ the path curve, or G-orbit, of the point under the action of G.

Definition 5.3 (One-parameter Lie group of transformations) A one-parameter Lie group of transformations is a one-parameter group of transformations for which, in addition, $U$ is itself a group under the law of composition $\phi$.

Example 5.2 Consider the one-parameter family G of projective transformations

\[ T_\varepsilon : \quad \bar{x} = \frac{x}{1 - \varepsilon x}. \tag{5.33} \]

One can immediately verify that G formally satisfies the group axioms. Indeed, the consecutive application of two transformations $T_\varepsilon$, $T_\delta$ yields

\[ \bar{\bar{x}} = \frac{x}{1 - (\varepsilon + \delta) x}. \tag{5.34} \]

However, to conclude that G is a group, we must presume that the three expressions $1 - \varepsilon x$, $1 - \delta x$ and $1 - (\varepsilon + \delta) x$ do not vanish. This simple observation is of fundamental significance for the theory of Lie group analysis. To discern the true nature of the situation, fix a point $x_0 > 0$ and move along the path curve through $x_0$ as $\varepsilon$ ranges over a small interval $0 \le \varepsilon \le \gamma$. Any two transformations $T_\varepsilon$, $T_\delta$ are well defined in this interval, but their consecutive application may give the prohibited value $\varepsilon + \delta = 1/x_0$. It can readily be seen that iterated multiplication of the projective transformations inevitably leads to the prohibited value of the group parameter. One cannot solve the problem merely by fixing a "tiny" interval; the true nature of the problem is that the orbit of any point $x$ (excluding the isolated point $x = 0$) has a singularity at $\varepsilon = 1/x$.
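The behaviour described in Example 5.2 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text: the helper name `T` and the sample values of $x_0$, $\varepsilon$ and $\delta$ are illustrative assumptions. It confirms the composition law (5.34) at one point and shows that two admissible parameters can add up to the prohibited value $1/x_0$.

```python
from fractions import Fraction


def T(eps, x):
    """Projective transformation (5.33): x -> x / (1 - eps*x)."""
    denom = 1 - eps * x
    if denom == 0:
        # eps = 1/x is the singular (prohibited) parameter value
        raise ZeroDivisionError("eps = 1/x is the prohibited parameter value")
    return x / denom


x0 = Fraction(2)  # sample point (assumed), so the singular parameter is 1/x0 = 1/2
eps, delta = Fraction(1, 5), Fraction(1, 10)

# Composition law (5.34): T_delta(T_eps(x0)) equals T_{eps+delta}(x0).
composed = T(delta, T(eps, x0))
direct = T(eps + delta, x0)

# Each of these two parameters is admissible, but their sum is 1/x0.
a = Fraction(1, 4)
try:
    T(a + a, x0)
    blew_up = False
except ZeroDivisionError:
    blew_up = True

print(composed, direct, blew_up)
```

Exact fractions are used so that the singular denominator is detected exactly rather than through a floating-point tolerance.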


Phillip Yam

This projective group provides an example of what is called a local group, meaning that the composition is defined only for those transformations sufficiently close to the identity. The neighborhood of the identity transformation in which the composition is defined may depend upon the transformed point $x$. On the other hand, a global group is a transformation group such that the composition of any transformations is defined simultaneously at all generic points $x$. A common representative of a global group is the Euclidean group of rigid motions.

Definition 5.4 (One-parameter connected local Lie group of transformations) Let $x$ lie in a region $D \subset \mathbb{R}^n$. A group of transformations

\[ \bar{x} = f(x, \varepsilon), \tag{5.35} \]

with each transformation indexed by $\varepsilon$ lying in an interval $U \subset \mathbb{R}$, is called a one-parameter connected local Lie group of transformations if the following conditions hold:

(i) $f$ is smooth in both $x$ and $\varepsilon$.
(ii) There is a subinterval $U' \subset U$ such that a law of composition $\phi(\varepsilon, \delta)$, defined for $\varepsilon, \delta \in U'$ and taking values in $U$ (not necessarily in $U'$), is smooth in its arguments.
(iii) There is a unique $e$ in $U'$ providing the identity transformation, i.e. $f(x, e) = x$.
(iv) For any $\varepsilon, \delta \in U'$, we have $\varepsilon^{-1} \in U'$ and $f^{-1}(x, \varepsilon) = f(x, \varepsilon^{-1})$, $f(f(x, \varepsilon), \delta) = f(x, \phi(\varepsilon, \delta))$; here we require $(\cdot)^{-1}$ to be a smooth function in $U$.

Consider a one-parameter connected local Lie group G,

\[ \bar{x} = f(x, \varepsilon). \tag{5.36} \]

Expanding (5.36) with respect to $\varepsilon$ in a neighborhood of the identity $\varepsilon = e$, we have

\[ \bar{x} = f(x, e) + \xi(x)(\varepsilon - e) + O((\varepsilon - e)^2), \tag{5.37} \]

where

\[ \xi(x) = \left. \frac{\partial f(x, \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = e}. \tag{5.38} \]

The linear part of the transformation is called the infinitesimal transformation of the group G. Geometrically, the infinitesimal transformation defines the tangent vector $\xi(x)$, at the point $x$, to the G-orbit of $x$. Therefore, $\xi$ is called the tangent vector field of the group G.

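Proposition 5.1 (Lie's first fundamental theorem) Let G be a local group of transformations $\bar{x} = f(x, \varepsilon)$ with infinitesimal transformation $\xi$. There exists a parameterization $\tau(\varepsilon)$ such that the transformations $\bar{x} = f(x, \tau)$ solve the system of first-order ordinary differential equations, called the Lie equations,

\[ \frac{d\bar{x}}{d\tau} = \xi(\bar{x}), \tag{5.39} \]

with $\bar{x} = x$ when $\tau = 0$. Conversely, the solution of any initial-value problem for the Lie equations defines a local group of transformations.

Proof: Suppose $\bar{x} = f(x, \varepsilon)$ defines G. We have the identity

\[ f(x, \phi(\varepsilon, e + \Delta\varepsilon)) = f(f(x, \varepsilon), e + \Delta\varepsilon). \tag{5.40} \]

Expanding the left-hand side of (5.40) about $\varepsilon$ and noting that $\phi(\varepsilon, e) = \varepsilon$, we get

\[ f(x, \phi(\varepsilon, e + \Delta\varepsilon)) = f(x, \varepsilon) + \frac{d\bar{x}}{d\varepsilon} \bigl( \phi(\varepsilon, e + \Delta\varepsilon) - \phi(\varepsilon, e) \bigr) + O\bigl( (\phi(\varepsilon, e + \Delta\varepsilon) - \phi(\varepsilon, e))^2 \bigr). \tag{5.41} \]

Writing $b$ for the second argument of $\phi$, we have

\[ \phi(\varepsilon, e + \Delta\varepsilon) - \phi(\varepsilon, e) = \left. \frac{\partial \phi}{\partial b} \right|_{b = e} \Delta\varepsilon + O((\Delta\varepsilon)^2). \tag{5.42} \]

Substituting (5.42) into (5.41) gives

\[ f(x, \phi(\varepsilon, e + \Delta\varepsilon)) = f(x, \varepsilon) + \frac{d\bar{x}}{d\varepsilon} \left. \frac{\partial \phi}{\partial b} \right|_{b = e} \Delta\varepsilon + O((\Delta\varepsilon)^2). \tag{5.43} \]

Expanding the right-hand side of (5.40) about $e$, we get

\[ f(f(x, \varepsilon), e + \Delta\varepsilon) = f(x, \varepsilon) + \xi(\bar{x}) \Delta\varepsilon + O((\Delta\varepsilon)^2). \tag{5.44} \]

Equating the coefficients of $\Delta\varepsilon$ in (5.43) and (5.44), we get

\[ \frac{d\bar{x}}{d\varepsilon} \left. \frac{\partial \phi}{\partial b} \right|_{b = e} = \xi(\bar{x}). \tag{5.45} \]

Denote $H(\varepsilon) = [\partial\phi / \partial b]_{b = e}$. Since $\phi(e, b) = b$, we have $H(e) = 1$, so by continuity $H \neq 0$ near $e$. Consider the parameterization $\tau(\varepsilon)$, near $e$,

\[ \tau(\varepsilon) = \int_e^{\varepsilon} \frac{dt}{H(t)}. \tag{5.46} \]

Substituting it into equation (5.45), we obtain

\[ \frac{d\bar{x}}{d\tau} = \xi(\bar{x}), \qquad \bar{x}(\tau = 0) = x. \tag{5.47} \]

Therefore, the first part of the theorem follows. To prove the second part, we only need to verify that the solution $\bar{x} = f(x, \varepsilon)$ of the initial-value problem for the Lie equations,

\[ \frac{d\bar{x}}{d\tau} = \xi(\bar{x}), \qquad \bar{x}(0) = x, \tag{5.48} \]

satisfies the condition

\[ f(f(x, \varepsilon), \delta) = f(x, \varepsilon + \delta). \tag{5.49} \]

Define

\[ u(\delta) = f(f(x, \varepsilon), \delta), \tag{5.50} \]

\[ v(\delta) = f(x, \varepsilon + \delta). \tag{5.51} \]

Both of them satisfy the initial-value problem

\[ \frac{dw}{d\delta} = \xi(w), \qquad w(0) = f(x, \varepsilon). \tag{5.52} \]

By the uniqueness theorem for initial-value problems, $u = v$ in some neighborhood of $\delta = 0$.

Definition 5.5 The group parameter $\varepsilon$ is said to be canonical if the composition law $\phi$ reduces to addition, i.e. $\phi(\varepsilon, \delta) = \varepsilon + \delta$.

According to Lie's first fundamental theorem, after a suitable reparameterization, any composition law can be reduced to addition. From here on, we shall adopt the canonical parameter when referring to one-parameter groups unless otherwise stipulated.

Consider a one-parameter connected local Lie group with infinitesimal transformation $\xi$, which is an analytic function. The solution of the corresponding Lie equations will also be analytic, i.e.

\[ \bar{x} = f(x, \varepsilon) = x + \frac{\varepsilon}{1!} \left. \frac{d\bar{x}}{d\varepsilon} \right|_0 + \cdots + \frac{\varepsilon^s}{s!} \left. \frac{d^s \bar{x}}{d\varepsilon^s} \right|_0 + \cdots \tag{5.53} \]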

For any function $h(\bar{x})$, by the chain rule of differentiation,

\[ \left( \frac{dh}{d\varepsilon} \right)_0 = \sum_k \left( \frac{\partial h}{\partial \bar{x}_k} \frac{d\bar{x}_k}{d\varepsilon} \right)_0 = \sum_k \xi_k \frac{\partial h}{\partial x_k}, \tag{5.54} \]

or $(dh/d\varepsilon)_0 = X(h)$, where $X$ is the linear partial differential operator of the first order, called the infinitesimal generator of the connected local Lie group, defined by

\[ X = \sum_k \xi_k \frac{\partial}{\partial x_k}. \tag{5.55} \]

It is evident that $X(x) = \xi$. Consequently, one obtains, by repeated differentiation,

\[ \left( \frac{d^s \bar{x}}{d\varepsilon^s} \right)_0 = X^s(x). \tag{5.56} \]

Hence, we have

\[ \bar{x} = x + \frac{\varepsilon}{1!} X(x) + \cdots + \frac{\varepsilon^s}{s!} X^s(x) + \cdots \tag{5.57} \]

Thus we have proved the following proposition.

Proposition 5.2 Let $e^{\varepsilon X}$ be the operator defined by the power series expansion

\[ e^{\varepsilon X} = 1 + \frac{\varepsilon}{1!} X + \frac{\varepsilon^2}{2!} X^2 + \cdots + \frac{\varepsilon^s}{s!} X^s + \cdots \tag{5.58} \]

The one-parameter connected local Lie group of transformations is equivalent to

\[ \bar{x} = e^{\varepsilon X}(x). \tag{5.59} \]
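The Lie series (5.57) can be illustrated on the projective group of Example 5.2, whose infinitesimal is $\xi(x) = x^2$, so that $X = x^2\, \partial/\partial x$. The sketch below is an illustration under stated assumptions, not part of the original text: polynomials are stored as coefficient lists, and the helper names `apply_X`, `evaluate` and `lie_series` are hypothetical. It sums the series and compares the result with the closed form $x/(1 - \varepsilon x)$, which is legitimate only for $|\varepsilon x| < 1$.

```python
from fractions import Fraction
from math import factorial


def apply_X(p):
    """Apply X = x^2 d/dx to a polynomial given as a coefficient list,
    where p[k] is the coefficient of x^k."""
    out = [Fraction(0)] * (len(p) + 1)
    for k, c in enumerate(p):
        if k > 0:
            out[k + 1] += k * c  # x^2 * d/dx (c x^k) = k c x^(k+1)
    return out


def evaluate(p, x):
    """Evaluate the polynomial p at the point x."""
    return sum(c * x**k for k, c in enumerate(p))


def lie_series(x, eps, terms):
    """Partial sum of (5.57): sum over s < terms of eps^s / s! * X^s(x)."""
    p = [Fraction(0), Fraction(1)]  # the coordinate function x itself
    total = Fraction(0)
    for s in range(terms):
        total += eps**s / factorial(s) * evaluate(p, x)
        p = apply_X(p)  # p now represents X^(s+1)(x)
    return total


x, eps = Fraction(1, 2), Fraction(1, 3)  # sample values with |eps * x| < 1
approx = lie_series(x, eps, terms=40)
exact = x / (1 - eps * x)  # closed form (5.33)
print(float(approx), float(exact))
```

Here $X^s(x) = s!\, x^{s+1}$, so the series is geometric, and its radius of convergence reflects the singularity at $\varepsilon = 1/x$ discussed in Example 5.2.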

Definition 5.6 A function $F(x)$ is called an invariant of a connected local Lie group G of transformations if $F$ remains unaltered when one moves along any path curve of the group G. In other words, $F$ is an invariant if $F(f(x, \varepsilon)) = F(x)$ identically in $x$ and $\varepsilon$ in a neighborhood of $\varepsilon = 0$. It should be noted that for any given invariant $F$, any function $\Phi(F)$ is also an invariant.

Proposition 5.3 A smooth function $F$ is an invariant of the group G with infinitesimal generator $X$ if and only if it solves the homogeneous linear partial differential equation

\[ X(F) \equiv \sum_k \xi_k \frac{\partial F}{\partial x_k} = 0. \tag{5.60} \]

Proof: A direct consequence of Proposition 4.8.

Proposition 5.4 Any one-parameter connected local Lie group G of transformations has precisely $n - 1$ functionally independent invariants. Any set of independent invariants $\psi_1, \dots, \psi_{n-1}$ is termed a basis of invariants of G. Of course, a basis need not be unique. One can find a basis by finding $n - 1$ independent first integrals of the characteristic system of equation (5.60):

\[ \frac{dx_1}{\xi_1} = \cdots = \frac{dx_n}{\xi_n}. \tag{5.61} \]

An arbitrary invariant $F$ of G is given by the formula

\[ F = \Phi(\psi_1, \dots, \psi_{n-1}). \tag{5.62} \]

Proof: A direct consequence of Propositions 4.10 and 5.3.
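As a worked illustration of Propositions 5.3 and 5.4, consider a scaling group on $\mathbb{R}^2$. This group is an assumed example, not one taken from the text. Here $n = 2$, so a basis consists of a single invariant, obtained from the characteristic system (5.61) on the region $x_1 \neq 0$:

```latex
% Assumed example: the scaling group on R^2 (not taken from the text).
\begin{align*}
  \bar{x}_1 &= e^{\varepsilon} x_1, \qquad \bar{x}_2 = e^{2\varepsilon} x_2,
  \qquad X = x_1 \frac{\partial}{\partial x_1} + 2 x_2 \frac{\partial}{\partial x_2}, \\
  \frac{dx_1}{x_1} &= \frac{dx_2}{2 x_2}
  \;\Longrightarrow\; \psi_1 = \frac{x_2}{x_1^{2}}, \\
  X(\psi_1) &= x_1 \left( -\frac{2 x_2}{x_1^{3}} \right) + 2 x_2 \cdot \frac{1}{x_1^{2}} = 0, \\
  \psi_1(\bar{x}) &= \frac{e^{2\varepsilon} x_2}{e^{2\varepsilon} x_1^{2}} = \psi_1(x).
\end{align*}
```

The last two lines check the invariance twice, once infinitesimally through (5.60) and once directly on the finite transformation; every other invariant is then a function $\Phi(\psi_1)$, as in (5.62).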

Definition 5.7 Given a connected local Lie group G, canonical variables $x'$ are defined by the condition that G reduces to translations, e.g. in the direction of the $x'_n$-axis:

\[ \bar{x}'_1 = x'_1, \ \dots, \ \bar{x}'_{n-1} = x'_{n-1}, \quad \bar{x}'_n = x'_n + \varepsilon. \tag{5.63} \]

Proposition 5.5 For any one-parameter connected local Lie group, there exists a set of canonical variables.

Proof: A direct consequence of Proposition 4.12, setting $\xi = 1$.

Consider a system of $s$ equations in $\mathbb{R}^n$:

\[ F_\sigma(x) = 0, \quad \sigma = 1, \dots, s, \tag{5.64} \]

where $x \in \mathbb{R}^n$ and $s < n$. We impose the condition that the Jacobian matrix $(\partial F / \partial x)$ is of rank $s$ at all points $x$ satisfying the system of equations. The collection of all solutions of the system is then an $(n - s)$-dimensional manifold $M \subset \mathbb{R}^n$.

Definition 5.8 The system (5.64) is said to be invariant with respect to a connected local Lie group G of transformations $\bar{x} = f(x, \varepsilon)$, or to admit G, if

\[ F_\sigma(\bar{x}) = 0, \quad \sigma = 1, \dots, s, \tag{5.65} \]

whenever $x$ solves the system (5.64). Geometrically, this means that transformations of the group G carry any point of the variety $M$ along this variety. In other words, the path curve of the group G passing through any point $x \in M$ lies in $M$. Consequently, $M$ is termed an invariant manifold for G.

Proposition 5.6 A system of equations $F_\sigma(x) = 0$, $\sigma = 1, \dots, s$, is invariant under the group G with infinitesimal generator $X$ if and only if

\[ X F_\sigma(x) = 0, \quad \sigma = 1, \dots, s, \tag{5.66} \]

whenever $x$ solves the system of equations.

Proof: For details, see Olver (1986).

Consider a connected local Lie group of transformations of the form

\[ \bar{x} = f(x, \varepsilon), \tag{5.67} \]

\[ \bar{u} = g(x, \varepsilon), \tag{5.68} \]

acting on the space of $n + m$ variables $x = (x_1, \dots, x_n)$ and $u = (u^1, \dots, u^m)$, where $x$ corresponds to the $n$ independent variables and $u$ corresponds to the $m$ dependent variables appearing in a given system of differential equations. A group of transformations is said to be admitted by the system if the group maps any solution $u = \Theta(x)$ of the system into another solution of the same system, leaving the system invariant in the sense that it reads the same in terms of the transformed variables for any solution $u = \Theta(x)$. In addition, the derivatives of the dependent variables $u$ with respect to the independent variables $x$ are transformed naturally in order to preserve the contact conditions.

For $k \ge 1$, let $u^{(k)}$ denote the set of coordinates corresponding to all $k$th order partial derivatives of $u$ with respect to $x$; a coordinate in $u^{(k)}$ is denoted by $u^\alpha_{i_1 i_2 \dots i_k}$ with $\alpha = 1, 2, \dots, m$ and $i_j = 1, 2, \dots, n$, $j = 1, \dots, k$. Before proceeding further, we introduce the following convenient notation.

Definition 5.9 The total differential operator is given by the following formal infinite sum:

\[ D_i = \frac{\partial}{\partial x_i} + \sum_\alpha u^\alpha_i \frac{\partial}{\partial u^\alpha} + \sum_{\alpha, i_1} u^\alpha_{i i_1} \frac{\partial}{\partial u^\alpha_{i_1}} + \cdots, \tag{5.69} \]

where $i = 1, \dots, n$. The total differential operator $D_i$ acts on functions involving any finite number of the variables $x, u, u^{(1)}, \dots$. For simplicity, in the case of one independent variable $x$ and one dependent variable $y$, the total differential operator is denoted by

\[ D = \frac{\partial}{\partial x} + y_1 \frac{\partial}{\partial y} + y_2 \frac{\partial}{\partial y_1} + \cdots + y_{n+1} \frac{\partial}{\partial y_n} + \cdots \tag{5.70} \]

In studying the invariance properties of a $k$th order ordinary differential equation with independent variable $x$ and dependent variable $y$, we aim to find an admitted one-parameter Lie group of transformations of the form

\[ \bar{x} = f(x, y, \varepsilon), \tag{5.71} \]

\[ \bar{y} = g(x, y, \varepsilon), \tag{5.72} \]

where $y = y(x)$. We naturally extend the group action to $(x, y, y_1, \dots, y_k)$-space by demanding that each transformation preserve the contact conditions

\[ dy = y_1\, dx, \tag{5.73} \]

\[ dy_k = y_{k+1}\, dx, \quad k = 1, 2, \dots \tag{5.74} \]

In other words, the transformed derivatives $\bar{y}_k$, $k = 1, 2, \dots$, are defined successively by

\[ d\bar{y} = \bar{y}_1\, d\bar{x}, \tag{5.75} \]

\[ d\bar{y}_k = \bar{y}_{k+1}\, d\bar{x}. \tag{5.76} \]

Therefore, we have the following relations.

Proposition 5.7 A one-parameter connected local Lie group of transformations

\[ \bar{x} = f(x, y, \varepsilon), \tag{5.77} \]

\[ \bar{y} = g(x, y, \varepsilon) \tag{5.78} \]

can be extended to its $k$th extension, $k \ge 1$, which is a one-parameter connected local Lie group of transformations acting on $(x, y, y_1, \dots, y_k)$-space, such that

\[ \bar{x} = f(x, y, \varepsilon), \tag{5.79} \]

\[ \bar{y} = g(x, y, \varepsilon), \tag{5.80} \]

\[ \bar{y}_i = g_i(x, y, y_1, \dots, y_i, \varepsilon) = \frac{D g_{i-1}}{D f}, \quad i = 1, 2, \dots, k, \tag{5.81} \]

where $g_0 = g(x, y, \varepsilon)$.

Suppose that the one-parameter connected local Lie group of transformations

\[ \bar{x} = f(x, y, \varepsilon), \tag{5.82} \]

\[ \bar{y} = g(x, y, \varepsilon) \tag{5.83} \]

has infinitesimal generator

\[ X = \xi(x, y) \frac{\partial}{\partial x} + \eta(x, y) \frac{\partial}{\partial y} \tag{5.84} \]

and the corresponding $k$th extension has the infinitesimal generator

\[ X^{(k)} = \xi(x, y) \frac{\partial}{\partial x} + \eta(x, y) \frac{\partial}{\partial y} + \eta^{(1)}(x, y, y_1) \frac{\partial}{\partial y_1} + \cdots + \eta^{(k)}(x, y, y_1, \dots, y_k) \frac{\partial}{\partial y_k}, \tag{5.85} \]

where $k = 1, 2, \dots$. Then the explicit formulas for the extended infinitesimals $\eta^{(k)}$ can be found recursively according to the following proposition.

Proposition 5.8 For $\eta^{(0)} = \eta(x, y)$, we have

\[ \eta^{(k)}(x, y, y_1, \dots, y_k) = D\eta^{(k-1)} - y_k\, D\xi(x, y), \quad k = 1, 2, \dots \tag{5.86} \]

In particular, the first three infinitesimals are

\[ \eta^{(1)} = \eta_x + (\eta_y - \xi_x) y_1 - \xi_y (y_1)^2, \tag{5.87} \]

\[ \eta^{(2)} = \eta_{xx} + (2\eta_{xy} - \xi_{xx}) y_1 + (\eta_{yy} - 2\xi_{xy}) (y_1)^2 - \xi_{yy} (y_1)^3 + (\eta_y - 2\xi_x) y_2 - 3\xi_y y_1 y_2, \tag{5.88} \]

\[ \begin{aligned} \eta^{(3)} = {} & \eta_{xxx} + (3\eta_{xxy} - \xi_{xxx}) y_1 + 3(\eta_{xyy} - \xi_{xxy}) (y_1)^2 + (\eta_{yyy} - 3\xi_{xyy}) (y_1)^3 - \xi_{yyy} (y_1)^4 \\ & + 3(\eta_{xy} - \xi_{xx}) y_2 + 3(\eta_{yy} - 3\xi_{xy}) y_1 y_2 - 6\xi_{yy} (y_1)^2 y_2 - 3\xi_y (y_2)^2 + (\eta_y - 3\xi_x) y_3 - 4\xi_y y_1 y_3. \end{aligned} \tag{5.89} \]

Proof: For details, see Olver (1986).
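To see the recursion (5.86) at work, one can take the rotation group in the $(x, y)$-plane, with $\xi = -y$ and $\eta = x$. This choice is an assumed example, not one made in the text. Applying the total derivative (5.70) twice gives:

```latex
% Assumed example: rotations, X = -y d/dx + x d/dy (not taken from the text).
\begin{align*}
  \xi &= -y, \qquad \eta = x, \qquad D\xi = -y_1, \qquad D\eta = 1, \\
  \eta^{(1)} &= D\eta - y_1 D\xi = 1 + (y_1)^2, \\
  \eta^{(2)} &= D\eta^{(1)} - y_2 D\xi = 2 y_1 y_2 + y_1 y_2 = 3 y_1 y_2.
\end{align*}
```

Both results agree with the closed formulas (5.87) and (5.88): there, only $\eta_x = 1$ and $\xi_y = -1$ are non-zero, which leaves $\eta^{(1)} = \eta_x - \xi_y (y_1)^2$ and $\eta^{(2)} = -3\xi_y y_1 y_2$.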

Proposition 5.9 For $k = 2, 3, \dots$, $\eta^{(k)}$ is linear in $y_k$. Furthermore, $\eta^{(k)}$ is a polynomial in $y_1, y_2, \dots, y_k$ whose coefficients are linear homogeneous in $\xi$ and $\eta$ and their partial derivatives up to $k$th order.

Proof: For details, see Olver (1986).

In studying invariance properties of a $k$th order partial differential equation with dependent variable $u$ and independent variables $x = (x_1, x_2, \dots, x_n)$, we are naturally led to the problem of finding the extension of transformations on $(x, u)$-space to $(x, u, u^{(1)}, \dots, u^{(k)})$-space. Consider a one-parameter connected local Lie group of transformations

\[ \bar{x} = f(x, u, \varepsilon), \tag{5.90} \]

\[ \bar{u} = U(x, u, \varepsilon), \tag{5.91} \]

acting on $(x, u)$-space and extended so as to preserve the contact conditions

\[ du = \sum_i u_i\, dx_i, \quad \dots, \quad du_{i_1 i_2 \dots i_j} = \sum_i u_{i_1 i_2 \dots i_j i}\, dx_i, \tag{5.92} \]

where $j = 1, 2, \dots, k - 1$ and $i_l = 1, 2, \dots, n$. Accordingly, the transformed derivatives $\bar{u}^{(k)}$, $k = 1, 2, \dots$, are defined successively by

\[ d\bar{u} = \sum_i \bar{u}_i\, d\bar{x}_i, \quad \dots, \quad d\bar{u}_{i_1 i_2 \dots i_j} = \sum_i \bar{u}_{i_1 i_2 \dots i_j i}\, d\bar{x}_i. \tag{5.93} \]

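Proposition 5.10 The corresponding $k$th extension to $(x, u, u^{(1)}, \dots, u^{(k)})$-space of the one-parameter connected local Lie group of transformations (5.90)-(5.91) is given by

\[ \bar{x} = f(x, u, \varepsilon), \tag{5.94} \]

\[ \bar{u} = U(x, u, \varepsilon), \tag{5.95} \]

\[ \bar{u}_{i_1 i_2 \dots i_j} = U_{i_1 i_2 \dots i_j}(x, u, u^{(1)}, \dots, u^{(j)}, \varepsilon), \tag{5.96} \]

such that

\[ \begin{pmatrix} \bar{u}_1 \\ \bar{u}_2 \\ \vdots \\ \bar{u}_n \end{pmatrix} = \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{pmatrix} = A^{-1} \begin{pmatrix} D_1 U \\ D_2 U \\ \vdots \\ D_n U \end{pmatrix}, \tag{5.97} \]

\[ \begin{pmatrix} \bar{u}_{i_1 i_2 \dots i_{k-1} 1} \\ \bar{u}_{i_1 i_2 \dots i_{k-1} 2} \\ \vdots \\ \bar{u}_{i_1 i_2 \dots i_{k-1} n} \end{pmatrix} = \begin{pmatrix} U_{i_1 i_2 \dots i_{k-1} 1} \\ U_{i_1 i_2 \dots i_{k-1} 2} \\ \vdots \\ U_{i_1 i_2 \dots i_{k-1} n} \end{pmatrix} = A^{-1} \begin{pmatrix} D_1 U_{i_1 i_2 \dots i_{k-1}} \\ D_2 U_{i_1 i_2 \dots i_{k-1}} \\ \vdots \\ D_n U_{i_1 i_2 \dots i_{k-1}} \end{pmatrix}, \tag{5.98} \]

where $A$ is the $n \times n$ matrix $(D_i f_j)$.

Proof: For details, see Olver (1986).

Consider a one-parameter connected local Lie group of transformations acting on $(x, u)$-space with infinitesimal generator

\[ X = \sum_i \xi_i(x, u) \frac{\partial}{\partial x_i} + \eta(x, u) \frac{\partial}{\partial u}. \tag{5.99} \]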

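The explicit formulas for the extended infinitesimals $\eta^{(k)}$ of the corresponding $k$th extension with infinitesimal generator

\[ X^{(k)} = \sum_i \xi_i(x, u) \frac{\partial}{\partial x_i} + \eta(x, u) \frac{\partial}{\partial u} + \sum_i \eta^{(1)}_i(x, u, u^{(1)}) \frac{\partial}{\partial u_i} + \cdots + \sum_{i_1 i_2 \cdots i_k} \eta^{(k)}_{i_1 i_2 \dots i_k} \frac{\partial}{\partial u_{i_1 i_2 \dots i_k}}, \quad k = 1, 2, \dots, \tag{5.100} \]

are given by the following proposition.

Proposition 5.11

\[ \begin{pmatrix} \eta^{(1)}_1 \\ \eta^{(1)}_2 \\ \vdots \\ \eta^{(1)}_n \end{pmatrix} = \begin{pmatrix} D_1 \eta \\ D_2 \eta \\ \vdots \\ D_n \eta \end{pmatrix} - B \cdot \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \tag{5.101} \]

\[ \begin{pmatrix} \eta^{(k)}_{i_1 i_2 \cdots i_{k-1} 1} \\ \eta^{(k)}_{i_1 i_2 \cdots i_{k-1} 2} \\ \vdots \\ \eta^{(k)}_{i_1 i_2 \cdots i_{k-1} n} \end{pmatrix} = \begin{pmatrix} D_1 \eta^{(k-1)}_{i_1 i_2 \cdots i_{k-1}} \\ D_2 \eta^{(k-1)}_{i_1 i_2 \cdots i_{k-1}} \\ \vdots \\ D_n \eta^{(k-1)}_{i_1 i_2 \cdots i_{k-1}} \end{pmatrix} - B \cdot \begin{pmatrix} u_{i_1 i_2 \cdots i_{k-1} 1} \\ u_{i_1 i_2 \cdots i_{k-1} 2} \\ \vdots \\ u_{i_1 i_2 \cdots i_{k-1} n} \end{pmatrix}, \tag{5.102} \]

where $i_l = 1, 2, \dots, n$ for $l = 1, 2, \dots, k - 1$, with $k = 2, 3, \dots$, and $B$ is the $n \times n$ matrix $(D_i \xi_j)$. In particular, for $n = 2$ the infinitesimals up to order 2 are: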


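\[ \eta^{(1)}_1 = \frac{\partial \eta}{\partial x_1} + \left[ \frac{\partial \eta}{\partial u} - \frac{\partial \xi_1}{\partial x_1} \right] u_1 - \frac{\partial \xi_2}{\partial x_1} u_2 - \frac{\partial \xi_1}{\partial u} (u_1)^2 - \frac{\partial \xi_2}{\partial u} u_1 u_2, \tag{5.103} \]

\[ \eta^{(1)}_2 = \frac{\partial \eta}{\partial x_2} + \left[ \frac{\partial \eta}{\partial u} - \frac{\partial \xi_2}{\partial x_2} \right] u_2 - \frac{\partial \xi_1}{\partial x_2} u_1 - \frac{\partial \xi_2}{\partial u} (u_2)^2 - \frac{\partial \xi_1}{\partial u} u_1 u_2, \tag{5.104} \]

\[ \begin{aligned} \eta^{(2)}_{11} = {} & \frac{\partial^2 \eta}{\partial x_1^2} + \left[ 2 \frac{\partial^2 \eta}{\partial x_1 \partial u} - \frac{\partial^2 \xi_1}{\partial x_1^2} \right] u_1 - \frac{\partial^2 \xi_2}{\partial x_1^2} u_2 + \left[ \frac{\partial \eta}{\partial u} - 2 \frac{\partial \xi_1}{\partial x_1} \right] u_{11} - 2 \frac{\partial \xi_2}{\partial x_1} u_{12} \\ & + \left[ \frac{\partial^2 \eta}{\partial u^2} - 2 \frac{\partial^2 \xi_1}{\partial x_1 \partial u} \right] (u_1)^2 - 2 \frac{\partial^2 \xi_2}{\partial x_1 \partial u} u_1 u_2 - \frac{\partial^2 \xi_1}{\partial u^2} (u_1)^3 - \frac{\partial^2 \xi_2}{\partial u^2} (u_1)^2 u_2 \\ & - 3 \frac{\partial \xi_1}{\partial u} u_1 u_{11} - \frac{\partial \xi_2}{\partial u} u_2 u_{11} - 2 \frac{\partial \xi_2}{\partial u} u_1 u_{12}, \end{aligned} \tag{5.105} \]

\[ \begin{aligned} \eta^{(2)}_{12} = \eta^{(2)}_{21} = {} & \frac{\partial^2 \eta}{\partial x_1 \partial x_2} + \left[ \frac{\partial^2 \eta}{\partial x_1 \partial u} - \frac{\partial^2 \xi_2}{\partial x_1 \partial x_2} \right] u_2 + \left[ \frac{\partial^2 \eta}{\partial x_2 \partial u} - \frac{\partial^2 \xi_1}{\partial x_1 \partial x_2} \right] u_1 \\ & - \frac{\partial \xi_2}{\partial x_1} u_{22} + \left[ \frac{\partial \eta}{\partial u} - \frac{\partial \xi_1}{\partial x_1} - \frac{\partial \xi_2}{\partial x_2} \right] u_{12} - \frac{\partial \xi_1}{\partial x_2} u_{11} \\ & - \frac{\partial^2 \xi_2}{\partial x_1 \partial u} (u_2)^2 + \left[ \frac{\partial^2 \eta}{\partial u^2} - \frac{\partial^2 \xi_1}{\partial x_1 \partial u} - \frac{\partial^2 \xi_2}{\partial x_2 \partial u} \right] u_1 u_2 - \frac{\partial^2 \xi_1}{\partial x_2 \partial u} (u_1)^2 \\ & - \frac{\partial^2 \xi_2}{\partial u^2} u_1 (u_2)^2 - \frac{\partial^2 \xi_1}{\partial u^2} (u_1)^2 u_2 - 2 \frac{\partial \xi_2}{\partial u} u_2 u_{12} - 2 \frac{\partial \xi_1}{\partial u} u_1 u_{12} \\ & - \frac{\partial \xi_1}{\partial u} u_2 u_{11} - \frac{\partial \xi_2}{\partial u} u_1 u_{22}, \end{aligned} \tag{5.106} \]

\[ \begin{aligned} \eta^{(2)}_{22} = {} & \frac{\partial^2 \eta}{\partial x_2^2} + \left[ 2 \frac{\partial^2 \eta}{\partial x_2 \partial u} - \frac{\partial^2 \xi_2}{\partial x_2^2} \right] u_2 - \frac{\partial^2 \xi_1}{\partial x_2^2} u_1 + \left[ \frac{\partial \eta}{\partial u} - 2 \frac{\partial \xi_2}{\partial x_2} \right] u_{22} - 2 \frac{\partial \xi_1}{\partial x_2} u_{12} \\ & + \left[ \frac{\partial^2 \eta}{\partial u^2} - 2 \frac{\partial^2 \xi_2}{\partial x_2 \partial u} \right] (u_2)^2 - 2 \frac{\partial^2 \xi_1}{\partial x_2 \partial u} u_1 u_2 - \frac{\partial^2 \xi_2}{\partial u^2} (u_2)^3 - \frac{\partial^2 \xi_1}{\partial u^2} u_1 (u_2)^2 \\ & - 3 \frac{\partial \xi_2}{\partial u} u_2 u_{22} - \frac{\partial \xi_1}{\partial u} u_1 u_{22} - 2 \frac{\partial \xi_1}{\partial u} u_2 u_{12}. \end{aligned} \tag{5.107} \]

Proof: For details, see Olver (1986).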

5.2.2 Multi-parameter Lie Groups of Transformations and Lie Algebras

So far we have only considered one-parameter Lie groups of transformations. In this subsection, we discuss some key results pertaining to multi-parameter Lie groups of transformations. From here on, we assume only a finite number $r$ of parameters. Each parameter of an $r$-parameter Lie group of transformations leads to an infinitesimal generator. These infinitesimal generators belong to an $r$-dimensional linear vector space on which there is an additional structure, called the commutator. This special vector space is called an $r$-dimensional Lie algebra.

Consider an $r$-parameter connected local Lie group of transformations

\[ \bar{x} = f(x, \varepsilon), \tag{5.108} \]

where $\varepsilon = (\varepsilon_1, \dots, \varepsilon_r)$. Let the law of composition of parameters be denoted by

\[ \phi(\varepsilon, \delta) = (\phi_1(\varepsilon, \delta), \dots, \phi_r(\varepsilon, \delta)), \tag{5.109} \]

where $\phi$, being analytic in its domain of definition, satisfies the group axioms with $\varepsilon = 0$ corresponding to the identity.

Proposition 5.12 (Lie's first fundamental theorem) The $r$-parameter connected local Lie group of transformations is equivalent to the solution of the initial-value problem for the system of $nr$ first order partial differential equations in some neighborhood of $\varepsilon = 0$:

\[ \begin{pmatrix} \dfrac{\partial \bar{x}_1}{\partial \varepsilon_1} & \cdots & \dfrac{\partial \bar{x}_n}{\partial \varepsilon_1} \\ \vdots & & \vdots \\ \dfrac{\partial \bar{x}_1}{\partial \varepsilon_r} & \cdots & \dfrac{\partial \bar{x}_n}{\partial \varepsilon_r} \end{pmatrix} = \Psi(\varepsilon)\, \Xi(\bar{x}), \tag{5.110} \]

with $\bar{x} = x$ at $\varepsilon = 0$, where $\Xi$ is the $r \times n$ infinitesimal matrix with entries

\[ \xi_{\alpha j}(x) = \left. \frac{\partial \bar{x}_j}{\partial \varepsilon_\alpha} \right|_{\varepsilon = 0} \tag{5.111} \]

for $\alpha = 1, 2, \dots, r$, $j = 1, 2, \dots, n$, and $\Psi$ is the $r \times r$ matrix

\[ \Psi(\varepsilon) = \left( \left. \frac{\partial \phi_\beta(\varepsilon, \delta)}{\partial \delta_\alpha} \right|_{\delta = 0} \right)^{-1}. \tag{5.112} \]

Proof: For details, see Olver (1986).

Definition 5.10 The infinitesimal generator $X_\alpha$, corresponding to the parameter $\varepsilon_\alpha$ of the $r$-parameter connected local Lie group of transformations, is

\[ X_\alpha = \sum_{j=1}^{n} \xi_{\alpha j}(x) \frac{\partial}{\partial x_j}, \quad \alpha = 1, 2, \dots, r. \tag{5.113} \]

Proposition 5.13 The $r$-parameter Lie group of transformations is equivalent to both

(i)
\[ \bar{x} = \exp\Bigl( \varepsilon \sum_{\alpha=1}^{r} \lambda_\alpha X_\alpha \Bigr) x, \tag{5.114} \]
where $\varepsilon, \lambda_1, \dots, \lambda_r$ are arbitrary constants;

(ii)
\[ \bar{x} = \prod_{\alpha=1}^{r} e^{\mu_\alpha X_\alpha} x, \tag{5.115} \]
where $\mu_1, \dots, \mu_r$ are arbitrary real constants.

Proof: For details, see Olver (1986).

Definition 5.11 (Lie algebra) A Lie algebra $L$ is a vector space over some field $F$ in which a bilinear composition $[x, y]$, called a Lie bracket or commutator, is defined for any $x, y \in L$, i.e. $[x, y] \in L$. In addition, the Lie commutator satisfies the following conditions:

(i) $[\alpha_1 x_1 + \alpha_2 x_2, \beta_1 y_1 + \beta_2 y_2] = \sum_{i,j=1,2} \alpha_i \beta_j [x_i, y_j]$,
(ii) $[x, x] = 0$,
(iii) $[[x, y], z] + [[y, z], x] + [[z, x], y] = 0$,

for any $x, x_i, y, y_i, z \in L$ and $\alpha_i, \beta_i \in F$, $i = 1, 2$. The third condition is often called the Jacobi identity.

Definition 5.12 Given an $r$-parameter connected local Lie group of transformations with infinitesimal generators $X_\alpha$, $\alpha = 1, 2, \dots, r$, the commutator of $X_\alpha$ and $X_\beta$ is another first order operator

\[ [X_\alpha, X_\beta] = X_\alpha X_\beta - X_\beta X_\alpha = \sum_{j=1}^{n} \eta_j \frac{\partial}{\partial x_j}, \tag{5.116} \]

where

\[ \eta_j = \sum_{i=1}^{n} \left( \xi_{\alpha i} \frac{\partial \xi_{\beta j}}{\partial x_i} - \xi_{\beta i} \frac{\partial \xi_{\alpha j}}{\partial x_i} \right). \tag{5.117} \]

Proposition 5.14 (Lie's second fundamental theorem) The commutator of any two infinitesimal generators of an $r$-parameter Lie group of transformations is also an infinitesimal generator; in particular,

\[ [X_\alpha, X_\beta] = \sum_\gamma C^\gamma_{\alpha\beta} X_\gamma, \tag{5.118} \]

where the coefficients $C^\gamma_{\alpha\beta}$ are constants called the structure constants, $\alpha, \beta, \gamma = 1, 2, \dots, r$. Hence the infinitesimal generators $X_\alpha$ form an $r$-dimensional Lie algebra over the real field.

Proof: For details, see Olver (1986).

Proposition 5.15 Let $X^{(k)}_\alpha$, $X^{(k)}_\beta$ be the $k$th extended infinitesimal generators of the infinitesimal generators $X_\alpha$, $X_\beta$, and let $[X_\alpha, X_\beta]^{(k)}$ be the $k$th extended infinitesimal generator of the commutator $[X_\alpha, X_\beta]$. Then $[X_\alpha, X_\beta]^{(k)} = [X^{(k)}_\alpha, X^{(k)}_\beta]$, $k = 1, 2, \dots$.

Proof: For details, see Olver (1986).
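The commutator formula (5.117) and the structure constants of (5.118) can be made concrete on the real line ($n = 1$), where each generator is determined by a single coefficient $\xi(x)$. The sketch below is illustrative and not part of the original text: the three generators $\partial/\partial x$, $x\,\partial/\partial x$, $x^2\,\partial/\partial x$ are an assumed example, coefficients are stored as polynomial coefficient lists, and the helper names `deriv`, `mul`, `sub` and `commutator` are hypothetical.

```python
def deriv(p):
    """Derivative of a polynomial stored as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sub(p, q):
    """Difference p - q, with trailing zero coefficients stripped."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    r = [a - b for a, b in zip(p, q)]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r


def commutator(xi_a, xi_b):
    """Formula (5.117) for n = 1: the coefficient of [X_a, X_b]
    is xi_a * xi_b' - xi_b * xi_a'."""
    return sub(mul(xi_a, deriv(xi_b)), mul(xi_b, deriv(xi_a)))


# Assumed example: X1 = d/dx, X2 = x d/dx, X3 = x^2 d/dx.
X1, X2, X3 = [1], [0, 1], [0, 0, 1]

print(commutator(X1, X2))  # coefficient list of [X1, X2]
print(commutator(X1, X3))  # coefficient list of [X1, X3]
print(commutator(X2, X3))  # coefficient list of [X2, X3]
```

Each commutator is again a linear combination of $X_1, X_2, X_3$, so these three generators span a 3-dimensional Lie algebra in the sense of Proposition 5.14; $X_3$ is the generator of the projective group of Example 5.2.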

Definition 5.13 A subspace $J \subset L$ is called a subalgebra of the Lie algebra $L$ if $J$ itself is a Lie algebra under the Lie commutator induced from $L$.

5.3 Lie Group Analytical Approach for Valuation of Financial Derivatives

Consider a $k$th order partial differential equation represented by

\[ F(x, u, u^{(1)}, u^{(2)}, \dots, u^{(k)}) = 0. \tag{5.119} \]

Definition 5.14 A one-parameter connected local Lie group of transformations

\[ \bar{x} = f(x, u, \varepsilon), \qquad \bar{u} = U(x, u, \varepsilon) \tag{5.120} \]

leaves the partial differential equation (5.119) invariant if and only if its $k$th extension leaves the surface $F = 0$ invariant.

Proposition 5.16 Consider a one-parameter connected local Lie group of transformations having

\[ X = \sum_i \xi_i(x, u) \frac{\partial}{\partial x_i} + \eta(x, u) \frac{\partial}{\partial u} \tag{5.121} \]

as its infinitesimal generator, with

\[ X^{(k)} = \sum_i \xi_i(x, u) \frac{\partial}{\partial x_i} + \eta(x, u) \frac{\partial}{\partial u} + \sum_i \eta^{(1)}_i(x, u, u^{(1)}) \frac{\partial}{\partial u_i} + \cdots + \sum_{i_1, \dots, i_k} \eta^{(k)}_{i_1 i_2 \dots i_k}(x, u, u^{(1)}, \dots, u^{(k)}) \frac{\partial}{\partial u_{i_1 i_2 \dots i_k}} \tag{5.122} \]

as the $k$th extended infinitesimal generator. The Lie group leaves the equation (5.119) invariant if and only if $X^{(k)} F = 0$ whenever $F = 0$.

Proof: For details, see Olver (1986).

Definition 5.15 $u = \Theta(x)$ is an invariant solution of $F = 0$ corresponding to a one-parameter connected local Lie group of transformations admitted by this equation if and only if

(i) $u = \Theta(x)$ is an invariant manifold of the Lie group;
(ii) $u = \Theta(x)$ solves $F = 0$.

Proposition 5.17 Suppose that $f$ is a function not depending on $u_{i_1 \dots i_l}$. A $k$th order partial differential equation ($k \ge 2$)

\[ u_{i_1 \dots i_l} = f(x, u, u^{(1)}, u^{(2)}, \dots, u^{(k)}) \tag{5.123} \]

admits an infinitesimal generator

\[ X = \sum_i \xi_i(x, u) \frac{\partial}{\partial x_i} + \eta(x, u) \frac{\partial}{\partial u} \tag{5.124} \]

if and only if

\[ \eta^{(l)}_{i_1 \dots i_l} = \sum_j \xi_j \frac{\partial f}{\partial x_j} + \eta \frac{\partial f}{\partial u} + \sum_j \eta^{(1)}_j \frac{\partial f}{\partial u_j} + \cdots + \sum_{j_1, \dots, j_k} \eta^{(k)}_{j_1 \dots j_k} \frac{\partial f}{\partial u_{j_1 \dots j_k}} \tag{5.125} \]

whenever $u_{i_1 \dots i_l} = f$. In addition,

(i) $\eta^{(p)}_{j_1 \dots j_p}$ is linear in the components of $u^{(p)}$ if $p \ge 2$;
(ii) $\eta^{(p)}_{j_1 \dots j_p}$ is a polynomial in the components of $u^{(1)}, \dots, u^{(p)}$ whose coefficients are linear homogeneous in $\xi_i$ and $\eta$ and in their partial derivatives with respect to $(x, u)$ of order up to $p$.

Proof: For details, see Olver (1986).

If $f$ is a polynomial in the components of $u^{(1)}, \dots, u^{(k)}$, then the equation (5.125) is a polynomial equation in $u^{(1)}, \dots, u^{(k)}$ whose coefficients are linear homogeneous in $\xi_i$, $\eta$ and in their partial derivatives up to $k$th order. Clearly, at any point $x$, one can assign arbitrary values to each of $u, u^{(1)}, \dots, u^{(k)}$ provided the partial differential equation $u_{i_1 \dots i_l} = f$ is satisfied; in other words, one can assign any values to $u, u^{(1)}, \dots, u^{(k)}$ except to the coordinate $u_{i_1 \dots i_l}$. Therefore, after replacing $u_{i_1 \dots i_l}$ by $f$, the resulting polynomial equation must hold for arbitrary values of $u^{(1)}, \dots, u^{(k)}$. Consequently, the coefficients of the polynomial must vanish separately, resulting in a system of linear homogeneous partial differential equations for the $n + 1$ functions $\xi_i$ and $\eta$. This system is called the set of determining equations for the infinitesimal generator $X$ admitted by $u_{i_1 \dots i_l} = f$. In general, there are more than $n + 1$ determining equations, hence the set of determining equations is an overdetermined system. If $f$ is a non-polynomial function, one can still break up the equation

\[ \eta^{(l)}_{i_1 \dots i_l} = \sum_j \xi_j \frac{\partial f}{\partial x_j} + \eta \frac{\partial f}{\partial u} + \cdots + \sum_{j_1 \dots j_k} \eta^{(k)}_{j_1 \dots j_k} \frac{\partial f}{\partial u_{j_1 \dots j_k}} \tag{5.126} \]

into a system of linear homogeneous partial differential equations for $\xi_i$ and $\eta$ by using similar arguments.

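Proposition 5.18 Suppose $u_{i_1 \dots i_l} = f$ is of order $k \ge 2$ and is a linear partial differential equation which admits the infinitesimal generator

\[ X = \sum_i \xi_i \frac{\partial}{\partial x_i} + \eta \frac{\partial}{\partial u}. \tag{5.127} \]

Then

\[ \frac{\partial \xi_i}{\partial u} = 0 \ \text{ for } i = 1, 2, \dots, n, \qquad \frac{\partial^2 \eta}{\partial u^2} = 0; \tag{5.128} \]

hence, for $n = 2$, the infinitesimal generator is of the form

\[ X = \xi_1(x_1, x_2) \frac{\partial}{\partial x_1} + \xi_2(x_1, x_2) \frac{\partial}{\partial x_2} + [f(x_1, x_2) u + g(x_1, x_2)] \frac{\partial}{\partial u}. \tag{5.129} \]

According to Proposition 5.11, we get

\[ \eta^{(1)}_1 = \frac{\partial g}{\partial x_1} + \frac{\partial f}{\partial x_1} u + \left( f - \frac{\partial \xi_1}{\partial x_1} \right) u_1 - \frac{\partial \xi_2}{\partial x_1} u_2, \tag{5.130} \]

\[ \eta^{(1)}_2 = \frac{\partial g}{\partial x_2} + \frac{\partial f}{\partial x_2} u - \frac{\partial \xi_1}{\partial x_2} u_1 + \left( f - \frac{\partial \xi_2}{\partial x_2} \right) u_2, \tag{5.131} \]

\[ \eta^{(2)}_{11} = \frac{\partial^2 g}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_1^2} u + \left( 2 \frac{\partial f}{\partial x_1} - \frac{\partial^2 \xi_1}{\partial x_1^2} \right) u_1 - \frac{\partial^2 \xi_2}{\partial x_1^2} u_2 + \left( f - 2 \frac{\partial \xi_1}{\partial x_1} \right) u_{11} - 2 \frac{\partial \xi_2}{\partial x_1} u_{12}, \tag{5.132} \]

\[ \begin{aligned} \eta^{(2)}_{12} = \eta^{(2)}_{21} = {} & \frac{\partial^2 g}{\partial x_1 \partial x_2} + \frac{\partial^2 f}{\partial x_1 \partial x_2} u + \left( \frac{\partial f}{\partial x_2} - \frac{\partial^2 \xi_1}{\partial x_1 \partial x_2} \right) u_1 + \left( \frac{\partial f}{\partial x_1} - \frac{\partial^2 \xi_2}{\partial x_1 \partial x_2} \right) u_2 \\ & - \frac{\partial \xi_1}{\partial x_2} u_{11} + \left( f - \frac{\partial \xi_1}{\partial x_1} - \frac{\partial \xi_2}{\partial x_2} \right) u_{12} - \frac{\partial \xi_2}{\partial x_1} u_{22}, \end{aligned} \tag{5.133} \]

\[ \eta^{(2)}_{22} = \frac{\partial^2 g}{\partial x_2^2} + \frac{\partial^2 f}{\partial x_2^2} u - \frac{\partial^2 \xi_1}{\partial x_2^2} u_1 + \left( 2 \frac{\partial f}{\partial x_2} - \frac{\partial^2 \xi_2}{\partial x_2^2} \right) u_2 - 2 \frac{\partial \xi_1}{\partial x_2} u_{12} + \left( f - 2 \frac{\partial \xi_2}{\partial x_2} \right) u_{22}. \tag{5.134} \]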

Proof: For details, see Ovsyannikov (1982).

Consider a boundary-value problem for a $k$th order partial differential equation of the form $F(x, u, u^{(1)}, \dots, u^{(k)}) = 0$, defined on a domain $\Omega_x$ in $x$-space, with boundary conditions

\[ B_\alpha(x, u, u^{(1)}, \dots, u^{(k-1)}) = 0 \tag{5.135} \]

prescribed on boundary surfaces

\[ \omega_\alpha(x) = 0, \tag{5.136} \]

where $\alpha = 1, 2, \dots, s$. From here on, we only deal with those boundary-value problems having unique solutions. Therefore, the invariant solution is precisely the unique solution.

Definition 5.16 An infinitesimal generator $X$ is said to be admitted by the boundary-value problem (5.135)-(5.136) if and only if

(i) $X^{(k)} F = 0$ whenever $F = 0$;
(ii) $X \omega_\alpha = 0$ whenever $\omega_\alpha = 0$, for $\alpha = 1, 2, \dots, s$;
(iii) $X^{(k-1)} B_\alpha = 0$ whenever $B_\alpha = 0$ on $\omega_\alpha = 0$, for $\alpha = 1, 2, \dots, s$.

Proposition 5.19 Suppose that the boundary-value problem (5.135)-(5.136) admits a one-parameter connected local Lie group of transformations. Let $\Phi = (\phi_1(x), \phi_2(x), \dots, \phi_{n-1}(x))$ be $n - 1$ independent group invariants of the Lie group depending only on $x$. Let $\nu(x, u)$ be a group invariant of the Lie group such that $\partial \nu / \partial u \neq 0$. Then (5.135)-(5.136) reduces to

\[ G(\Phi, \nu, \nu^{(1)}, \dots, \nu^{(k)}) = 0, \tag{5.137} \]

defined on some domain $\Omega_\Phi$ in $\Phi$-space, with boundary conditions

\[ C_\alpha(\Phi, \nu, \nu^{(1)}, \dots, \nu^{(k-1)}) = 0 \tag{5.138} \]

prescribed on boundary surfaces

\[ \theta_\alpha(\Phi) = 0 \tag{5.139} \]

for some $G$, $C_\alpha$, $\theta_\alpha$, $\alpha = 1, 2, \dots, s$. In particular, if the infinitesimal generator is of the form

\[ X = \sum_i \xi_i(x) \frac{\partial}{\partial x_i} + f(x) u \frac{\partial}{\partial u}, \tag{5.140} \]

then $\nu = u / g(x)$ for a known function $g$, and hence an invariant solution arising from $X$ is of the separated form

\[ u = g(x) \psi(\Phi) \tag{5.141} \]

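for an arbitrary function $\psi$ of $\Phi = (\phi_1(x), \phi_2(x), \dots, \phi_{n-1}(x))$.

Proof: For details, see Ovsyannikov (1982).

The Constant Elasticity of Variance (CEV) model with time-dependent model parameters for a standard European call option is described by the boundary-value problem

\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2(t) S^\theta \frac{\partial^2 V}{\partial S^2} + (r(t) - d(t)) S \frac{\partial V}{\partial S} - r(t) V = 0 \tag{5.142} \]

with boundary condition

\[ V(S, T) = \delta(S - S_0) \tag{5.143} \]

prescribed on the boundary surface

\[ t \le T, \qquad S \ge 0, \tag{5.144} \]

where $\delta$ is the Dirac $\delta$-function, $T$ is the expiry date and $S_0$ is the strike price. Firstly, for ease of calculation, we transform the boundary-value problem to standard form by means of the transformation

\[ \bar{V} = V e^{\beta(t)}, \qquad \bar{S} = S e^{\alpha(t)}, \qquad \bar{t} = \gamma(t), \tag{5.145} \]

where $\alpha$, $\beta$ and $\gamma$ are determined as follows:

\[ \frac{\partial V}{\partial t} = \left( \frac{\partial \bar{V}}{\partial \bar{t}} \dot{\gamma} + \bar{S} \dot{\alpha} \frac{\partial \bar{V}}{\partial \bar{S}} - \dot{\beta} \bar{V} \right) e^{-\beta(t)}, \tag{5.146} \]

\[ \frac{\partial V}{\partial S} = \frac{\partial \bar{V}}{\partial \bar{S}} e^{\alpha(t)} e^{-\beta(t)}, \tag{5.147} \]

\[ \frac{\partial^2 V}{\partial S^2} = \frac{\partial^2 \bar{V}}{\partial \bar{S}^2} \bigl( e^{\alpha(t)} \bigr)^2 e^{-\beta(t)}. \tag{5.148} \]

On substituting (5.146)-(5.148) into (5.142), we have

\[ \dot{\gamma} \frac{\partial \bar{V}}{\partial \bar{t}} + \frac{1}{2} \sigma^2(t) e^{(2 - \theta) \alpha(t)} \bar{S}^\theta \frac{\partial^2 \bar{V}}{\partial \bar{S}^2} + (r - d + \dot{\alpha}) \bar{S} \frac{\partial \bar{V}}{\partial \bar{S}} - (\dot{\beta} + r) \bar{V} = 0. \tag{5.149} \]

Choosing $\alpha$, $\beta$ and $\gamma$ such that

\[ \dot{\alpha} = -(r - d), \qquad \alpha = \int_t^T (r - d)\, dt', \]

\[ \dot{\beta} = -r, \qquad \beta = \int_t^T r\, dt', \]

\[ \dot{\gamma} = -\frac{1}{2} \sigma^2(t) e^{(2 - \theta) \alpha(t)}, \qquad \gamma = \int_t^T \frac{1}{2} \sigma^2(t') e^{(2 - \theta) \alpha(t')}\, dt', \]

equation (5.149) can now be reduced to

\[ \frac{\partial \bar{V}}{\partial \bar{t}} = \bar{S}^\theta \frac{\partial^2 \bar{V}}{\partial \bar{S}^2}. \tag{5.150} \]

In addition, the original boundary condition and surface are now transformed to

\[ \bar{V}(\bar{S}, 0) = \delta(\bar{S} - S_0) \tag{5.151} \]

and

\[ \bar{t} \ge 0, \qquad \bar{S} \ge 0, \]

respectively. For the sake of reference, we replace $\bar{V}$ by $u$, $\bar{S}$ by $x_1$ and $\bar{t}$ by $x_2$ in (5.150), i.e.

\[ \frac{\partial u}{\partial x_2} = x_1^\theta \frac{\partial^2 u}{\partial x_1^2}, \tag{5.152} \]

or

\[ u_2 = x_1^\theta u_{11}. \tag{5.153} \]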
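The functions $\alpha$, $\beta$ and $\gamma$ of the transformation (5.145) can be written in closed form in the simplest special case. The following is a sketch under the assumption, not made in the text, that $r$, $d$ and $\sigma$ are constants with $r \neq d$ and $\theta \neq 2$:

```latex
% Assumption: r, d and sigma are constants, with r \neq d and \theta \neq 2.
\begin{align*}
  \alpha(t) &= (r - d)(T - t), \qquad \beta(t) = r (T - t), \\
  \gamma(t) &= \int_t^T \tfrac{1}{2} \sigma^2 e^{(2-\theta)(r-d)(T-t')}\, dt'
             = \frac{\sigma^2}{2 (2-\theta)(r-d)}
               \left( e^{(2-\theta)(r-d)(T-t)} - 1 \right), \\
  \bar{V} &= V e^{r (T-t)}, \qquad \bar{S} = S e^{(r-d)(T-t)}, \qquad \bar{t} = \gamma(t).
\end{align*}
```

All three functions vanish at $t = T$, which is why the terminal condition (5.143) becomes the initial condition (5.151) at $\bar{t} = 0$. In the limit $r \to d$ the expression for $\gamma$ tends to $\sigma^2 (T - t)/2$.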

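According to Proposition 5.17, the system of determining equations can be found from

\[ \eta^{(1)}_2 = \theta x_1^{\theta - 1} \xi_1 u_{11} + x_1^\theta \eta^{(2)}_{11}. \tag{5.154} \]

According to Proposition 5.18, $\eta^{(1)}_2$, $\eta^{(2)}_{11}$ and the infinitesimal generator $L$ are given by

\[ \eta^{(1)}_2 = \frac{\partial g}{\partial x_2} + \frac{\partial f}{\partial x_2} u - \frac{\partial \xi_1}{\partial x_2} u_1 + \left( f - \frac{\partial \xi_2}{\partial x_2} \right) u_2, \tag{5.155} \]

\[ \eta^{(2)}_{11} = \frac{\partial^2 g}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_1^2} u + \left( 2 \frac{\partial f}{\partial x_1} - \frac{\partial^2 \xi_1}{\partial x_1^2} \right) u_1 - \frac{\partial^2 \xi_2}{\partial x_1^2} u_2 + \left( f - 2 \frac{\partial \xi_1}{\partial x_1} \right) u_{11} - 2 \frac{\partial \xi_2}{\partial x_1} u_{12}, \tag{5.156} \]

\[ L = \xi_1 \frac{\partial}{\partial x_1} + \xi_2 \frac{\partial}{\partial x_2} + (f \cdot u + g) \frac{\partial}{\partial u}. \tag{5.157} \]

On substituting (5.155)-(5.156) into (5.154), we get

\[ \begin{aligned} & \frac{\partial g}{\partial x_2} + \frac{\partial f}{\partial x_2} u - \frac{\partial \xi_1}{\partial x_2} u_1 + \left( f - \frac{\partial \xi_2}{\partial x_2} \right) u_2 \\ & \quad = \theta x_1^{\theta - 1} \xi_1 u_{11} + x_1^\theta \left[ \frac{\partial^2 g}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_1^2} u + \left( 2 \frac{\partial f}{\partial x_1} - \frac{\partial^2 \xi_1}{\partial x_1^2} \right) u_1 - \frac{\partial^2 \xi_2}{\partial x_1^2} u_2 + \left( f - 2 \frac{\partial \xi_1}{\partial x_1} \right) u_{11} - 2 \frac{\partial \xi_2}{\partial x_1} u_{12} \right], \end{aligned} \]

that is,

\[ \begin{aligned} 0 = {} & \left( x_1^\theta \frac{\partial^2 g}{\partial x_1^2} - \frac{\partial g}{\partial x_2} \right) + \left( x_1^\theta \frac{\partial^2 f}{\partial x_1^2} - \frac{\partial f}{\partial x_2} \right) u + \left( 2 x_1^\theta \frac{\partial f}{\partial x_1} - x_1^\theta \frac{\partial^2 \xi_1}{\partial x_1^2} + \frac{\partial \xi_1}{\partial x_2} \right) u_1 \\ & + \left( -x_1^\theta \frac{\partial^2 \xi_2}{\partial x_1^2} - f + \frac{\partial \xi_2}{\partial x_2} \right) u_2 + \left( \theta x_1^{\theta - 1} \xi_1 + x_1^\theta f - 2 x_1^\theta \frac{\partial \xi_1}{\partial x_1} \right) u_{11} + \left( -2 x_1^\theta \frac{\partial \xi_2}{\partial x_1} \right) u_{12}. \end{aligned} \]

Since $u_2 = x_1^\theta u_{11}$, (5.154) is therefore equivalent to

\[ \begin{aligned} 0 = {} & \left( x_1^\theta \frac{\partial^2 g}{\partial x_1^2} - \frac{\partial g}{\partial x_2} \right) + \left( x_1^\theta \frac{\partial^2 f}{\partial x_1^2} - \frac{\partial f}{\partial x_2} \right) u + \left( 2 x_1^\theta \frac{\partial f}{\partial x_1} - x_1^\theta \frac{\partial^2 \xi_1}{\partial x_1^2} + \frac{\partial \xi_1}{\partial x_2} \right) u_1 \\ & + \left( -x_1^{2\theta} \frac{\partial^2 \xi_2}{\partial x_1^2} + x_1^\theta \frac{\partial \xi_2}{\partial x_2} + \theta x_1^{\theta - 1} \xi_1 - 2 x_1^\theta \frac{\partial \xi_1}{\partial x_1} \right) u_{11} + \left( -2 x_1^\theta \frac{\partial \xi_2}{\partial x_1} \right) u_{12}. \end{aligned} \tag{5.158} \]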


Phillip Yam

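On equating the coefficients of $u$ and the derivatives of $u$ to zero, we get the system of determining equations:

\[
\begin{aligned}
&(1)\quad \frac{\partial g}{\partial x_{2}} = x_{1}^{\theta}\frac{\partial^{2} g}{\partial x_{1}^{2}} \\
&(2)\quad \frac{\partial f}{\partial x_{2}} = x_{1}^{\theta}\frac{\partial^{2} f}{\partial x_{1}^{2}} \\
&(3)\quad 2x_{1}^{\theta}\frac{\partial f}{\partial x_{1}} = x_{1}^{\theta}\frac{\partial^{2}\xi_{1}}{\partial x_{1}^{2}} - \frac{\partial \xi_{1}}{\partial x_{2}} \\
&(4)\quad \theta x_{1}^{\theta-1}\xi_{1} - 2x_{1}^{\theta}\frac{\partial \xi_{1}}{\partial x_{1}} = x_{1}^{2\theta}\frac{\partial^{2}\xi_{2}}{\partial x_{1}^{2}} - x_{1}^{\theta}\frac{\partial \xi_{2}}{\partial x_{2}} \\
&(5)\quad \frac{\partial \xi_{2}}{\partial x_{1}} = 0
\end{aligned}
\]

On solving this system, we get

\[
\xi_{1} = \frac{1}{2-\theta}\,(c_{1}x_{2} + c_{2})\,x_{1}
\tag{5.159}
\]

\[
\xi_{2} = c_{1}\frac{x_{2}^{2}}{2} + c_{2}x_{2} + c_{3}
\tag{5.160}
\]

\[
f = -\frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}c_{1}x_{1}^{2-\theta} - \frac{1}{2}\,\frac{1-\theta}{2-\theta}\,c_{1}x_{2} + c_{4}
\tag{5.161}
\]

\[
g = 0
\tag{5.162}
\]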

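where $c_1$, $c_2$, $c_3$ and $c_4$ are undetermined constants. The invariance of the boundary condition and surfaces imposes further restrictions on these constants $c_i$:

(i) The condition $x_1 > 0$ implies $\xi_{1}(0, x_{2}) = 0$ ⇒ no restriction.

(ii) The condition $x_2 > 0$ implies $\xi_{2}(x_{1}, 0) = 0$ ⇒ $c_3 = 0$.

(iii) The condition $u(x_{1}, 0) = \delta(x_{1} - \hat{x}_{1})$, where $0 < \hat{x}_{1} < \infty$, implies

\[
f(x_{1}, 0)\,u(x_{1}, 0) = \xi_{1}(x_{1}, 0)\,\delta'(x_{1} - \hat{x}_{1})
\;\Rightarrow\;
f(x_{1}, 0)\,\delta(x_{1} - \hat{x}_{1}) = \xi_{1}(x_{1}, 0)\,\delta'(x_{1} - \hat{x}_{1}).
\tag{5.163}
\]

Equation (5.163) is satisfied if

i. $\xi_{1}(\hat{x}_{1}, 0) = 0$, i.e.

\[
\frac{1}{2-\theta}\,c_{2}\hat{x}_{1} = 0, \qquad c_{2} = 0;
\]

ii. $f(\hat{x}_{1}, 0) = -\dfrac{\partial \xi_{1}}{\partial x_{1}}(\hat{x}_{1}, 0) = -\dfrac{1}{2-\theta}\,c_{1}(0) = 0$,

\[
\therefore\; -\frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}c_{1}\hat{x}_{1}^{2-\theta} + c_{4} = 0, \qquad
c_{4} = \frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}c_{1}\hat{x}_{1}^{2-\theta}.
\]

Hence, we have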


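\[
\xi_{1} = \frac{1}{2-\theta}\,c_{1}x_{1}x_{2}
\tag{5.164}
\]

\[
\xi_{2} = c_{1}\frac{x_{2}^{2}}{2}
\tag{5.165}
\]

\[
f = c_{1}\left[\frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}\hat{x}_{1}^{2-\theta} - \frac{1}{2}\,\frac{1-\theta}{2-\theta}\,x_{2} - \frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}x_{1}^{2-\theta}\right]
\tag{5.166}
\]

and the infinitesimal generator

\[
L = \frac{1}{2-\theta}\,x_{1}x_{2}\frac{\partial}{\partial x_{1}} + \frac{x_{2}^{2}}{2}\frac{\partial}{\partial x_{2}}
+ \left[\frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}\hat{x}_{1}^{2-\theta} - \frac{1}{2}\,\frac{1-\theta}{2-\theta}\,x_{2} - \frac{1}{2}\left(\frac{1}{2-\theta}\right)^{2}x_{1}^{2-\theta}\right]u\,\frac{\partial}{\partial u}.
\tag{5.167}
\]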
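As a sanity check on the algebra, the coefficients $\xi_1$, $\xi_2$ and $f$ of (5.164)-(5.166) can be substituted back into the determining equations (2)-(5) numerically. The sketch below is only an illustration: it uses finite differences at one sample point, and the helper names and the sample values of θ, $\hat{x}_1$, $x_1$, $x_2$ are assumptions made here, not part of the text.

```python
def make_generator(theta, x_hat, c1=1.0):
    """Return xi1, xi2, f of (5.164)-(5.166) as Python functions."""
    k = 1.0 / (2.0 - theta)

    def xi1(x1, x2):
        return k * c1 * x1 * x2

    def xi2(x1, x2):
        return 0.5 * c1 * x2 * x2

    def f(x1, x2):
        return c1 * (0.5 * k * k * x_hat ** (2.0 - theta)
                     - 0.5 * (1.0 - theta) * k * x2
                     - 0.5 * k * k * x1 ** (2.0 - theta))

    return xi1, xi2, f

def d1(fn, x1, x2, h=1e-4):
    """First derivative in x1 (central difference)."""
    return (fn(x1 + h, x2) - fn(x1 - h, x2)) / (2.0 * h)

def d2(fn, x1, x2, h=1e-4):
    """First derivative in x2 (central difference)."""
    return (fn(x1, x2 + h) - fn(x1, x2 - h)) / (2.0 * h)

def d11(fn, x1, x2, h=1e-4):
    """Second derivative in x1 (central difference)."""
    return (fn(x1 + h, x2) - 2.0 * fn(x1, x2) + fn(x1 - h, x2)) / (h * h)

def residuals(theta, x_hat, x1, x2):
    """Left side minus right side of determining equations (2)-(5)."""
    xi1, xi2, f = make_generator(theta, x_hat)
    p = x1 ** theta
    return [
        d2(f, x1, x2) - p * d11(f, x1, x2),                          # (2)
        2.0 * p * d1(f, x1, x2)
        - (p * d11(xi1, x1, x2) - d2(xi1, x1, x2)),                  # (3)
        theta * x1 ** (theta - 1.0) * xi1(x1, x2)
        - 2.0 * p * d1(xi1, x1, x2)
        - (p * p * d11(xi2, x1, x2) - p * d2(xi2, x1, x2)),          # (4)
        d1(xi2, x1, x2),                                             # (5)
    ]

# Hypothetical sample point and parameters.
res = residuals(theta=0.5, x_hat=1.2, x1=1.5, x2=0.7)
print(res)
```

Equation (1) is satisfied trivially because $g = 0$. The boundary restrictions can be checked the same way by evaluating `xi2(x1, 0)`, `xi1(x_hat, 0)` and `f(x_hat, 0)`, all of which should vanish.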

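According to Proposition 5.19, the corresponding invariant solution is

\[
u = \frac{1}{x_{2}^{\frac{1-\theta}{2-\theta}}}\,
\exp\left[-\left(\frac{1}{2-\theta}\right)^{2}\frac{1}{x_{2}}\left(\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}\right)\right]
F\!\left(\frac{x_{1}^{\frac{2-\theta}{2}}}{x_{2}}\right).
\tag{5.168}
\]

Denote $x_{1}^{\frac{2-\theta}{2}}/x_{2}$ by $z$. Now, the partial derivatives of $u$ can be rewritten as:

\[
\frac{\partial u}{\partial x_{2}} = \frac{1}{x_{2}^{\frac{1-\theta}{2-\theta}}}\,
\exp\left[-\left(\frac{1}{2-\theta}\right)^{2}\frac{1}{x_{2}}\left(\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}\right)\right]
\left\{-\frac{1-\theta}{2-\theta}\,\frac{1}{x_{2}}\,F + \left(\frac{1}{2-\theta}\right)^{2}\frac{\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}}{x_{2}^{2}}\,F - \frac{x_{1}^{\frac{2-\theta}{2}}}{x_{2}^{2}}\,F'\right\}
\tag{5.169}
\]

\[
\frac{\partial u}{\partial x_{1}} = -\frac{1}{2-\theta}\,\frac{x_{1}^{1-\theta}}{x_{2}^{\frac{1-\theta}{2-\theta}+1}}\,
\exp\left[-\left(\frac{1}{2-\theta}\right)^{2}\frac{1}{x_{2}}\left(\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}\right)\right]F
+ \frac{2-\theta}{2}\,\frac{x_{1}^{\frac{2-\theta}{2}-1}}{x_{2}^{\frac{1-\theta}{2-\theta}+1}}\,
\exp\left[-\left(\frac{1}{2-\theta}\right)^{2}\frac{1}{x_{2}}\left(\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}\right)\right]F'
\tag{5.170}
\]

\[
\begin{aligned}
x_{1}^{\theta}\frac{\partial^{2} u}{\partial x_{1}^{2}} ={}& \frac{1}{x_{2}^{\frac{1-\theta}{2-\theta}}}\,
\exp\left[-\left(\frac{1}{2-\theta}\right)^{2}\frac{1}{x_{2}}\left(\hat{x}_{1}^{2-\theta} + x_{1}^{2-\theta}\right)\right]
\left\{-\frac{1-\theta}{2-\theta}\,\frac{1}{x_{2}}\,F + \left(\frac{1}{2-\theta}\right)^{2}\frac{x_{1}^{2-\theta}}{x_{2}^{2}}\,F - \frac{x_{1}^{\frac{2-\theta}{2}}}{x_{2}^{2}}\,F' \right. \\
&\left. {}+ \frac{2-\theta}{2}\left(\frac{2-\theta}{2} - 1\right)\frac{x_{1}^{-\frac{2-\theta}{2}}}{x_{2}}\,F' + \left(\frac{2-\theta}{2}\right)^{2}\frac{1}{x_{2}^{2}}\,F''\right\}
\end{aligned}
\tag{5.171}
\]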


On substituting (5.169) and (5.171) into (5.153), we get

\[
F'' + \left(1 - \frac{2}{2-\theta}\right)\frac{1}{z}\,F' - \frac{4}{(2-\theta)^{4}}\,\hat{x}_{1}^{2-\theta}\,F = 0,
\tag{5.172}
\]

which is a modified Bessel equation of the second type; its solution can readily be found in any standard table of Bessel functions. For a general discussion of Bessel equations, see Watson (1952). Therefore, the explicit pricing formula for a European call option is:

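\[
\begin{aligned}
P_{c}(S, t) ={}& S\,e^{-\int_{t}^{T} d(t')\,dt'}\sum_{n=0}^{\infty}\frac{z^{n}e^{-z}\,G\!\left(n + 1 + \frac{1}{2-\theta},\,\omega\right)}{\Gamma(n+1)} \\
&- S_{0}\,e^{-\int_{t}^{T} r(t')\,dt'}\sum_{n=0}^{\infty}\frac{z^{\,n+\frac{1}{2-\theta}}e^{-z}\,G(n+1,\,\omega)}{\Gamma\!\left(n + 1 + \frac{1}{2-\theta}\right)}
\end{aligned}
\tag{5.173}
\]

where

\[
z = \frac{S^{2-\theta}e^{(2-\theta)\alpha}}{(2-\theta)^{2}\gamma}
\tag{5.174}
\]

\[
\omega = \frac{S_{0}^{(2-\theta)}}{(2-\theta)^{2}\gamma}
\tag{5.175}
\]

\[
G(a, \omega) = \frac{1}{\Gamma(a)}\int_{\omega}^{\infty}\zeta^{a-1}e^{-\zeta}\,d\zeta.
\tag{5.176}
\]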
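The series (5.173) can be evaluated directly. The sketch below is a rough illustration rather than a production pricer. It assumes constant $r$, $d$ and σ, so that α and γ have closed forms; it assumes θ < 2; and it reads $S_0$ as the quantity that plays the role of the strike in (5.173). The function names, the truncation rule for the sums and the sample parameters are hypothetical. $G(a, \omega)$ is computed as one minus the standard power series for the regularized lower incomplete gamma function.

```python
import math

def G(a, w):
    """Complementary gamma function G(a, w) of (5.176), as 1 - P(a, w).

    Uses P(a, w) = exp(-w) * sum_k w**(a + k) / Gamma(a + k + 1).
    """
    if w <= 0.0:
        return 1.0
    log_w = math.log(w)
    total, k = 0.0, 0
    while True:
        term = math.exp(-w + (a + k) * log_w - math.lgamma(a + k + 1.0))
        total += term
        # Terms decrease once k exceeds w, so this stopping rule is safe.
        if k > w and term < 1e-17:
            return min(1.0, max(0.0, 1.0 - total))
        k += 1

def cev_call(S, S0, r, d, sigma, theta, tau):
    """Evaluate (5.173) for constant r, d, sigma and time to expiry tau."""
    nu = 1.0 / (2.0 - theta)
    alpha = (r - d) * tau
    c = (2.0 - theta) * (r - d)
    if abs(c) < 1e-14:
        gamma = 0.5 * sigma**2 * tau
    else:
        gamma = 0.5 * sigma**2 * math.expm1(c * tau) / c
    z = S ** (2.0 - theta) * math.exp((2.0 - theta) * alpha) / ((2.0 - theta) ** 2 * gamma)
    w = S0 ** (2.0 - theta) / ((2.0 - theta) ** 2 * gamma)
    log_z = math.log(z)
    n_terms = int(z + 12.0 * math.sqrt(z) + 50.0)  # heuristic truncation
    first = second = 0.0
    for n in range(n_terms):
        weight1 = math.exp(n * log_z - z - math.lgamma(n + 1.0))
        weight2 = math.exp((n + nu) * log_z - z - math.lgamma(n + 1.0 + nu))
        first += weight1 * G(n + 1.0 + nu, w)
        second += weight2 * G(n + 1.0, w)
    return S * math.exp(-d * tau) * first - S0 * math.exp(-r * tau) * second

# Hypothetical inputs: sigma is scaled so that sigma * S**(theta/2 - 1) = 0.2.
price = cev_call(S=100.0, S0=100.0, r=0.05, d=0.0, sigma=2.0, theta=1.0, tau=1.0)
print(price)
```

A useful check on any implementation is that the result lies between the discounted intrinsic value $S e^{-d\tau} - S_0 e^{-r\tau}$ and $S e^{-d\tau}$, and that it increases with $S$.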

REFERENCES

Attanasio, O. P. and Davis, S. J. (1997), “Relative wage movements and the distribution of consumption”, J. Political Economy, 104, pp. 1273 - 1262.
Bardhan, I. and Chao, X. (1993), “Pricing options on securities with discontinuous returns”, Stoch. Process. Appl., 48, pp. 123 - 137.
Beckers, S. (1980), “The constant elasticity of variance model and its implications for option pricing”, J. Finance, 35, pp. 661 - 673.
Black, F. (1975), “Studies of stock price volatility changes”, Proceedings of the Centre for Research in Security Prices Seminar, University of Chicago, Chicago.
Black, F. and Scholes, M. (1973), “The pricing of options and corporate liabilities”, J. Political Economy, 81, pp. 637 - 654.
Boyle, P. P. (1977), “Options: A Monte Carlo Approach”, J. Fin. Econ., 4, pp. 323 - 338.
Bröcker, T. and Dieck, T. T. (1985), Representations of Compact Lie Groups, GTM98, Springer-Verlag.
Chevalley, C. (1946), Theory of Lie Groups, Princeton: Princeton University Press.
Cox, J. C. (1983, May), “Optimal consumption and portfolio policies when asset returns follow a diffusion process”, Research Paper 658, Graduate School of Business, Stanford University.
Cox, J. C. and Ross, S. A. (1976), “The valuation of options for alternative stochastic processes”, J. Fin. Econ., 3, pp. 145 - 166.
Cox, J. C., Ross, S. A., and Rubinstein, M. (1979), “Option pricing: A simplified approach”, J. Fin. Econ., 7, pp. 229 - 264.
Duffie, D. (1992), Dynamic Asset Pricing Theory, Princeton: Princeton University Press.
Feller, W. (1951), “Two singular diffusion problems”, Annals of Mathematics, 54, pp. 173 - 182.
Goldenberg, D. H. (1991), “A unified method for pricing options on diffusion processes”, J. Fin. Econ., 29, pp. 3 - 34.
Heaton, J. and Lucas, D. (1996), “Evaluating the effects of incomplete markets for risk sharing and asset pricing”, J. Political Economy, 104, pp. 443 - 487.
Hull, J. (1997), Options, Futures and Other Derivatives, 3rd ed., Prentice Hall, Englewood Cliffs (New Jersey).
Jacobson, N. (1979), Lie Algebras, Dover.
Kwok, Y. K. (1998), Mathematical Models of Financial Derivatives, Springer, Berlin Heidelberg New York.
Lauterbach, B. and Schultz, P. (1990), “Pricing warrants: An empirical study of the Black-Scholes model and its alternatives”, J. Finance, 45, pp. 1181 - 1209.
Mace, B. (1991), “Full insurance in the presence of aggregate uncertainty”, J. Political Economy, 99, pp. 928 - 956.
Merton, R. C. (1973), “Theory of rational option pricing”, Bell J. Economics Management Science, 4, pp. 141 - 183.
Oksendal, B. (1998), Stochastic Differential Equations: An Introduction with Applications, 5th edition, Springer-Verlag.
Olver, P. J. (1986), Applications of Lie Groups to Differential Equations, GTM107, Springer-Verlag.
Ovsyannikov, L. V. (1982), Group Analysis of Differential Equations, Academic Press.
Panjer, H. et al. (1998), Financial Economics, With Applications to Investments, Insurance and Pensions, The Actuarial Foundation, Schaumburg, Illinois.
Schroder, M. (1989), “Computing the constant elasticity of variance option pricing formula”, J. Fin., 44, pp. 211 - 219.
Steeb, W. H. (1996), Continuous Symmetries, Lie Algebras, Differential Equations and Computer Algebra, World Scientific.
Taylor, M. (1996), Partial Differential Equations, AMS 115 - 117, Vol. I-III, Springer-Verlag.
Watson, G. N. (1952), A Treatise on the Theory of Bessel Functions, Cambridge: Cambridge University Press.
Wilmott, P. (1998), Derivatives: The Theory and Practice of Financial Engineering, Chichester: John Wiley & Sons.



Chapter 9

ON GRAPH GROUPOIDS: GRAPH GROUPOIDS AND CORRESPONDING REPRESENTATIONS

Ilwoo Cho∗
Saint Ambrose University, Department of Mathematics, 421 Ambrose Hall, 518 W. Locust St., Davenport, Iowa, 52803, U.S.A.

Abstract We consider countable directed graphs and their corresponding graph groupoids, and the canonical representations of them. The study of representations of graph groupoids is based on the observation of the canonical representations of categorical groupoids. The substructures of a fixed graph groupoid are considered, and the corresponding sub-representations. In particular, we observe the representations and the corresponding von Neumann algebras of (i) the subgroupoids induced by the towers of full-subgraphs, (ii) the quotient groupoids, induced by the full-subgraph-inclusion, and (iii) graph fractaloids which are the graph groupoids with fractal property.

Key Words: Connected Directed Graphs, Groupoids, Graph Groupoids, Full-Subgraphs, Subgroupoids, Quotient Graphs, Quotient Groupoids, Graph Fractaloids, Canonical Representations, Groupoid $W^*$-Algebras, Graph von Neumann Algebras.

Subject Classification: 05C99, 18B40, 47A99.

In this paper, we consider the canonical representations of groupoids. We restrict our interest to groupoids induced by countable directed graphs, called graph groupoids, and their canonical representations.

Throughout this paper, let $G$ be a countable directed graph. We consider $G$ as a combinatorial object $(V(G), E(G), s, r)$, consisting of the vertex set $V(G)$, the edge set $E(G)$, the source map $s$, and the range map $r$. The maps $s$ and $r$ are defined from $E(G)$ to $V(G)$ and depend entirely on the direction on $G$. For instance, if an edge $e$ has initial vertex (or source) $v$ and terminal vertex (or range) $v'$, then $s(e) = v$ and $r(e) = v'$. Note that the vertices $v$ and $v'$ are not necessarily distinct in $V(G)$.

∗ Email address: [email protected]


We construct an algebraic structure $\mathbb{G}$ from this combinatorial object $G$. This algebraic structure $\mathbb{G}$ of $G$ is called the graph groupoid of $G$ (see below). In [3] through [18], we introduced graph groupoids and investigated some of their operator-algebraic applications. Also, in [20] and [22], general (categorical) groupoids are introduced and considered. All our graph groupoids are groupoids, but the converse does not hold.

In this paper, we concentrate on certain purely algebraic properties induced by graph groupoids. Such algebraic properties can be explained canonically by certain natural representations of graph groupoids. Among the most interesting algebraic properties of graph groupoids are their substructures and the quotient structures induced by them, about which little is known (see also [11] and [7]).

Groupoids are not groups; however, they are group-like structures. Roughly speaking, they are group-like structures having multiple units. In fact, all groups are groupoids. So it is natural to consider generalized group-like properties in the study of groupoids. How, then, can we construct the subgroupoids of graph groupoids, and the quotient groupoids from graph groupoids? And how do their corresponding representations work? In this paper, we answer these questions.

In [12] through [15], other representations of graph groupoids are considered. In particular, representations in a fixed operator algebra $B(H)$, on which certain graph groupoids act, are studied, where $H$ is a separable infinite dimensional Hilbert space. We found that the $C^*$-subalgebras of $B(H)$ generated by (finitely many) partial isometries are ∗-isomorphic to groupoid $C^*$-algebras, where the groupoids are the graph groupoids induced by the partial isometries. However, in this paper, we concentrate on the canonical representations of graph groupoids and the corresponding graph von Neumann algebras, as in [3] through [11], and [16], [17], and [18]. In [16] and [18], we observed the fractal property on graph groupoids. We call the graph groupoids with the fractal property graph fractaloids. In Section 5, we present what we found in [16], [17], and [18] from the point of view of representation theory.

Throughout this paper, we restrict our interest to countable “connected” directed graphs, i.e., for any pair of distinct vertices, there is at least one reduced finite path connecting them (see Section 2). The general “disconnected” case can be regarded as a direct sum of connected cases, by our representations. This means that it suffices to consider the connected case.

In Section 1, we introduce categorical groupoids and their canonical representations. In Section 2, we define graph groupoids and their canonical representations, which are the main objects of this paper. In Section 3, we consider the towers (or chains) of full-subgraphs of a fixed connected directed graph and their canonical (sub-)representations. We also introduce the corresponding graph von Neumann algebras. In Section 4, the algebraic and operator-algebraic quotient structures induced by graph groupoids are studied. With respect to these structures and the representations, we find that interesting (free-probabilistic) $W^*$-algebraic structures are embedded in our graph von Neumann algebras, in general. In the final section, we consider the connected “finite” directed graphs generating graph fractaloids. Their canonical representations are also reviewed.

Most of the results in this paper are recently known (see [3] through [11], and [16] through [18]). But we approach these known results from the viewpoint of representation theory.
In particular, they are re-explained or re-proven by the canonical representations of (general) groupoids in the sense of Section 1.

1. Groupoids

In this section, we introduce categorical groupoids and the corresponding canonical representations of them.

1.1. Groupoids

We define categorical groupoids in the sense of [20].

Definition 1.1 We say an algebraic structure $(\mathcal{X}, \mathcal{Y}, s, r)$ is a (categorical) groupoid if it satisfies (i) $\mathcal{Y} \subset \mathcal{X}$, and (ii) there exists a partially defined binary operation $(x_1, x_2) \mapsto x_1 x_2$, for $x_1, x_2 \in \mathcal{X}$, depending on the source map $s$ and the range map $r$, satisfying the following:

(ii-1) the operation $x_1 x_2$ is well-determined whenever $s(x_1) = r(x_2) \in \mathcal{Y}$,

(ii-2) the associativity $(x_1 x_2)\,x_3 = x_1\,(x_2 x_3)$ holds, if the constituents are well-determined in the sense of (ii-1), for $x_1, x_2, x_3 \in \mathcal{X}$,

(ii-3) if $x \in \mathcal{X}$, then there exist $y, y' \in \mathcal{Y}$ such that $s(x) = y$ and $r(x) = y'$, satisfying $x = y'\,x\,y$,

(ii-4) if $x \in \mathcal{X}$, then there exists a unique groupoid-inverse $x^{-1}$ satisfying $x\,x^{-1} = r(x)$ and $x^{-1}x = s(x)$, in $\mathcal{Y}$.

By definition, every group is a groupoid $(\mathcal{X}, \mathcal{Y}, s, r)$ with $|\mathcal{Y}| = 1$ (and hence $s = r$ on $\mathcal{X}$). The subset $\mathcal{Y}$ of $\mathcal{X}$ is said to be the base of $\mathcal{X}$. We can naturally assume that there exists an empty element $\emptyset$ in a groupoid $\mathcal{X}$, to avoid the partial definedness of the binary operation on $\mathcal{X}$ (see [3] through [18]); i.e., by adding the empty element $\emptyset$ to $\mathcal{X}$, the binary operation on it becomes totally defined. Roughly speaking, the empty element $\emptyset$ is the element of $\mathcal{X}$ representing the products $x_1 x_2$ that are not well-defined in the sense of the above definition, for $x_1, x_2 \in \mathcal{X}$. Notice that if $|\mathcal{Y}| = 1$ (equivalently, if $\mathcal{X}$ is a group), then the empty word $\emptyset$ is not contained in the groupoid $\mathcal{X}$. However, in general, whenever $|\mathcal{Y}| \geq 2$, a groupoid $\mathcal{X}$ always has the empty word $\emptyset$. So, if there is no confusion, we automatically assume that the empty element $\emptyset$ is contained in $\mathcal{X}$.

Let $\mathcal{X}_k = (\mathcal{X}_k, \mathcal{Y}_k, s_k, r_k)$ be groupoids, for $k = 1, 2$. We say that a map $f : \mathcal{X}_1 \to \mathcal{X}_2$ is a groupoid morphism if (i) $f(\mathcal{Y}_1) \subseteq \mathcal{Y}_2$, (ii) $s_2(f(x)) = f(s_1(x))$ in $\mathcal{X}_2$, for all $x \in \mathcal{X}_1$, and (iii) $r_2(f(x)) = f(r_1(x))$ in $\mathcal{X}_2$, for all $x \in \mathcal{X}_1$. Thus it is natural that $f(\emptyset) = \emptyset$, by the definition of $f$. If a groupoid morphism $f$ is bijective, then we say that $f$ is a groupoid-isomorphism, and that the groupoids $\mathcal{X}_1$ and $\mathcal{X}_2$ are groupoid-isomorphic. We assume that a groupoid-isomorphism $f : \mathcal{X}_1 \to \mathcal{X}_2$ also satisfies $f(x_1 x_2) = f(x_1)\,f(x_2)$ in $\mathcal{X}_2$, for all $x_1, x_2 \in \mathcal{X}_1$.

Let $\mathcal{X} = (\mathcal{X}, \mathcal{Y}, s, r)$ be a groupoid. We say that this groupoid $\mathcal{X}$ acts on a set $X$ if there exists a groupoid action $\pi$ of $\mathcal{X}$ such that: (i) $\pi(x) : X \to X$ is a well-defined function, and (ii) $\pi(x_1 x_2) = \pi(x_1) \circ \pi(x_2)$ on $X$, for all $x_1, x_2 \in \mathcal{X}$, where $(\circ)$ means the usual composition. Sometimes we call the set $X$ an $\mathcal{X}$-set. We are interested in the case where an $\mathcal{X}$-set $X$ is a Hilbert space.

Let $\mathcal{X}_1 \subset \mathcal{X}_2$ be given and assume that $\mathcal{X}_k = (\mathcal{X}_k, \mathcal{Y}_k, s, r)$ are groupoids, for $k = 1, 2$. If $\mathcal{Y}_1 = \mathcal{X}_1 \cap \mathcal{Y}_2$, then the groupoid $\mathcal{X}_1$ is said to be a subgroupoid of $\mathcal{X}_2$.

Assume that $\mathcal{X}_1$ and $\mathcal{X}_2$ are groupoids and suppose $\mathcal{X}_1$ is a subgroupoid of $\mathcal{X}_2$. If $\mathcal{X}_1$ satisfies the relation $x\,\mathcal{X}_1\,x^{-1} \subseteq \mathcal{X}_1$ in $\mathcal{X}_2$, for all $x \in \mathcal{X}_2$, then, as usual, we can construct the quotient structure $\mathcal{X}_2 / \mathcal{X}_1$ algebraically. This structure $\mathcal{X}_2 / \mathcal{X}_1$ is called the quotient groupoid (see also [22] and [7]).
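To make Definition 1.1 concrete, the following sketch checks the axioms on a small example that is not taken from the text: the pair groupoid on a three-element set, where an element $(p, q)$ has range $(p, p)$ and source $(q, q)$. The set `BASE`, the encoding of elements as tuples and the use of `None` for the empty element $\emptyset$ are all assumptions made for illustration.

```python
from itertools import product

BASE = ("a", "b", "c")                      # hypothetical labels for the base
ELEMENTS = list(product(BASE, BASE))        # element (p, q): source q, range p

def source(x):
    """s(x), returned as the unit (q, q) in the base."""
    return (x[1], x[1])

def range_(x):
    """r(x), returned as the unit (p, p) in the base."""
    return (x[0], x[0])

def mul(x1, x2):
    """Partial product x1 x2, defined only when s(x1) = r(x2); None plays the empty element."""
    if x1 is None or x2 is None:
        return None
    if source(x1) != range_(x2):
        return None
    return (x1[0], x2[1])

def inverse(x):
    return (x[1], x[0])

def check_axioms():
    for x in ELEMENTS:
        # (ii-3): x = r(x) x s(x)
        if mul(mul(range_(x), x), source(x)) != x:
            return False
        # (ii-4): x x^{-1} = r(x) and x^{-1} x = s(x)
        if mul(x, inverse(x)) != range_(x) or mul(inverse(x), x) != source(x):
            return False
    # (ii-2): associativity, with None on both sides when a product is undefined
    for x1, x2, x3 in product(ELEMENTS, repeat=3):
        if mul(mul(x1, x2), x3) != mul(x1, mul(x2, x3)):
            return False
    return True

print(check_axioms())
```

With a one-element `BASE` the same code describes the trivial group, in line with the remark that groups are exactly the groupoids with $|\mathcal{Y}| = 1$.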

1.2. Canonical Representations of Groupoids

For any given groupoid $\mathcal{X}$, we have the corresponding canonical representation $(H, L)$, where $H$ is a Hilbert space and $L$ is a groupoid action of $\mathcal{X}$ acting on $H$. The construction of this representation is a natural extension of that for groups. Fix a groupoid $\mathcal{X} = (\mathcal{X}, \mathcal{Y}, s, r)$. Define a Hilbert space $H$ by

\[
H = \Big(\bigoplus_{y \in \mathcal{Y}} \mathbb{C}\,\xi_{y}\Big) \oplus \Big(\bigoplus_{x \in \mathcal{X} \setminus (\mathcal{Y} \cup \{\emptyset\})} \mathbb{C}\,\xi_{x}\Big) = \bigoplus_{x \in \mathcal{X} \setminus \{\emptyset\}} \mathbb{C}\,\xi_{x},
\]

i.e., $H$ is the Hilbert space with Hilbert basis $\{\xi_{x} : x \in \mathcal{X} \setminus \{\emptyset\}\}$. This Hilbert space $H$ is called the groupoid Hilbert space of $\mathcal{X}$. The Hilbert basis elements $\{\xi_{x}\}_{x \in \mathcal{X} \setminus \{\emptyset\}}$ satisfy the following multiplication rule:

\[
\xi_{x_1}\,\xi_{x_2} =
\begin{cases}
\xi_{x_1 x_2} & \text{if } x_1 x_2 \neq \emptyset \\
\xi_{\emptyset} \overset{def}{=} 0_{H} & \text{otherwise,}
\end{cases}
\]

for all $x_1, x_2 \in \mathcal{X}$. By this multiplication rule on the Hilbert basis of $H$, we can define the operators $L_x$ on $H$, for all $x \in \mathcal{X}$, where $L_x$ is the multiplication operator on $H$ with symbol $\xi_x$. It is easy to check that

\[
L_{x_1} L_{x_2} = L_{x_1 x_2}, \text{ for all } x_1, x_2 \in \mathcal{X},
\]

and

\[
L_{x}^{*} = L_{x^{-1}}, \text{ for all } x \in \mathcal{X}.
\]

Therefore, if $y \in \mathcal{Y} \subset \mathcal{X}$, then $L_y$ is a projection on $H$, since

\[
L_{y}^{*} = L_{y^{-1}} = L_{y} = L_{y^{2}} = L_{y}^{2}.
\]

Hence, if $x \in \mathcal{X} \setminus \mathcal{Y}$, then $L_x$ is a partial isometry on $H$, since

\[
L_{x}^{*} L_{x} = L_{x^{-1}} L_{x} = L_{x^{-1} x}
\quad\text{and}\quad
L_{x} L_{x}^{*} = L_{x} L_{x^{-1}} = L_{x x^{-1}},
\]

for all $x \in \mathcal{X}$. Recall that, for any $x \in \mathcal{X}$, the elements $x^{-1}x$ and $x\,x^{-1}$ are contained in the base $\mathcal{Y}$ of $\mathcal{X}$, so $L_{x^{-1}x}$ and $L_{x x^{-1}}$ are projections on $H$.

Define now a groupoid action $L : \mathcal{X} \to B(H)$ by

\[
L(x) \overset{def}{=} L_{x}, \text{ for all } x \in \mathcal{X},
\]

where $L_x$ is the multiplication operator on $H$ with symbol $\xi_x$. This is a well-defined action of $\mathcal{X}$ on $H$, since $L(x) = L_x$ is a (bounded) operator in $B(H)$, and it satisfies

\[
L(x_1 x_2) = L_{x_1 x_2} = L_{x_1} L_{x_2} = L(x_1) \circ L(x_2) \text{ on } H,
\]

for all $x, x_1, x_2 \in \mathcal{X}$. This groupoid action $L$ of $\mathcal{X}$, acting on $H$, is called the canonical (left) action of $\mathcal{X}$.

Definition 1.2 Let $\mathcal{X}$ be a groupoid, let $H$ be the groupoid Hilbert space, and let $L$ be the canonical action of $\mathcal{X}$ on $H$. Then the pair $(H, L)$ induced by $\mathcal{X}$ is called the canonical (left) representation of $\mathcal{X}$.

By the above canonical representation of a groupoid $\mathcal{X}$, each element $x$ of $\mathcal{X}$ can be understood as an operator $L_x$ in $B(H)$. In particular, the $C^*$-subalgebra $C^*(\mathcal{X})$ of $B(H)$ generated by $\{L_x : x \in \mathcal{X}\}$ is called the groupoid $C^*$-algebra of $\mathcal{X}$ (see [20]), and the $W^*$-subalgebra $vN(\mathcal{X})$ of $B(H)$ generated by $\{L_x : x \in \mathcal{X}\}$ is called the groupoid $W^*$-algebra of $\mathcal{X}$. We are interested in the von Neumann algebras generated by groupoids.

Let $\mathcal{X}_k$ be groupoids, and $(H_k, L^{(k)})$ the corresponding canonical representations of $\mathcal{X}_k$, for $k = 1, 2$. We say that the representations are equivalent if there exists a groupoid isomorphism $\Phi : \mathcal{X}_1 \to \mathcal{X}_2$ such that $L^{(2)}_{\Phi(x)}$ and $L^{(1)}_{x}$ are unitarily equivalent, for all $x \in \mathcal{X}_1$.
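For a finite groupoid, the canonical representation can be written down with explicit 0/1 matrices. The sketch below uses the pair groupoid on a two-element set as a hypothetical example, so that $H$ is four-dimensional with basis $\xi_x$; the matrix of $L_x$ has a 1 in row $x x'$ and column $x'$ whenever the product $x x'$ is defined. All names are illustrative, and matrices are plain lists of lists so that only the standard library is needed.

```python
from itertools import product

BASE = ("a", "b")
X = list(product(BASE, BASE))            # basis of H is indexed by these elements
INDEX = {x: i for i, x in enumerate(X)}

def mul(x1, x2):
    """Partial product of the pair groupoid; None stands for the empty element."""
    if x1 is None or x2 is None or x1[1] != x2[0]:
        return None
    return (x1[0], x2[1])

def inverse(x):
    return (x[1], x[0])

def L(x):
    """Matrix of the multiplication operator L_x on the basis (xi_x')."""
    n = len(X)
    m = [[0] * n for _ in range(n)]
    if x is None:                        # L of the empty element is the zero operator
        return m
    for j, xj in enumerate(X):
        p = mul(x, xj)
        if p is not None:
            m[INDEX[p]][j] = 1
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# L_{x1} L_{x2} = L_{x1 x2}, with the zero operator when x1 x2 is undefined.
multiplicative = all(matmul(L(x1), L(x2)) == L(mul(x1, x2))
                     for x1, x2 in product(X, repeat=2))
# L_x^* = L_{x^{-1}} (real entries, so the adjoint is the transpose).
adjoints = all(transpose(L(x)) == L(inverse(x)) for x in X)
# L_y is a projection for every unit y = (p, p).
projections = all(matmul(L((p, p)), L((p, p))) == L((p, p)) for p in BASE)

print(multiplicative, adjoints, projections)
```

For a non-unit such as $x = (a, b)$, the products `matmul(transpose(L(x)), L(x))` and `matmul(L(x), transpose(L(x)))` are the projections attached to $x^{-1}x$ and $x x^{-1}$, which is the partial isometry statement above.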

2. Representations of Graph Groupoids

In this section, we construct the graph groupoids from the given countable directed graphs. Also, we construct the suitable canonical representations of graph groupoids.

2.1. Graph Groupoids

Let $G$ be a countable directed graph with vertex set $V(G)$ and edge set $E(G)$. Denote the set of all finite paths of $G$ by $FP(G)$. Clearly, the edge set $E(G)$ is contained in $FP(G)$. Let $w$ be a finite path in $FP(G)$. Then it is represented as a word in $E(G)$. Conversely, if $e_1, ..., e_n$ are connected directed edges (in the order of indices) in $E(G)$, for $n \in \mathbb{N}$, then we have a finite path $w = e_1 ... e_n$ in $FP(G)$. If there exists a finite path $w = e_1 ... e_n$ in $FP(G)$, then we say that the directed edges $e_1, ..., e_n$ are admissible (or connected in the order $(e_1, ..., e_n)$), and the length $|w|$ of $w$ is defined to be $n$, the number of admissible edges generating $w$. Also, we say that finite paths $w_1 = e_{11} ... e_{1k_1}$ and $w_2 = e_{21} ... e_{2k_2}$ are admissible (or connected in the order $(w_1, w_2)$) if $w_1 w_2 = e_{11} ... e_{1k_1} e_{21} ... e_{2k_2}$ is again in $FP(G)$, where $e_{11}, ..., e_{1k_1}, e_{21}, ..., e_{2k_2} \in E(G)$. Otherwise, we say that $w_1$ and $w_2$ are not admissible. By definition, $|w_1 w_2| = |w_1| + |w_2|$ whenever $w_1$ and $w_2$ are admissible in $FP(G)$.

Suppose that $w$ is a finite path in $FP(G)$ with initial vertex $v_1$ and terminal vertex $v_2$. We write $w = v_1 w$, or $w = w v_2$, or $w = v_1 w v_2$, to emphasize the initial vertex of $w$, the terminal vertex of $w$, or both, respectively. Suppose $w = v_1 w v_2$ in $FP(G)$, with $v_1, v_2 \in V(G)$. Then we say that “$v_1$ and $w$ are admissible” and “$w$ and $v_2$ are admissible”. Notice that even though finite paths $w_1$ and $w_2$ are admissible, $w_2$ and $w_1$ are not admissible, in general. For instance, if $e_1 = v_1 e_1 v_2$ is an edge with $v_1 \neq v_2$ in $V(G)$ and $e_2 = v_2 e_2 v_2$ is a loop-edge in $E(G)$, then there is a finite path $e_1 e_2$ in $FP(G)$, but there is no (nonempty) finite path $e_2 e_1$ in $FP(G)$. This example shows that, in general, the admissibility on the set $V(G) \cup FP(G)$ is partially defined.

The free semigroupoid $\mathbb{F}^{+}(G)$ of $G$ is defined as the set

\[
\mathbb{F}^{+}(G) = \{\emptyset\} \cup V(G) \cup FP(G),
\]

with its binary operation $(\cdot)$ on $\mathbb{F}^{+}(G)$ defined by

\[
(w_1, w_2) \longmapsto w_1 \cdot w_2 =
\begin{cases}
w_1 & \text{if } w_1 = w_2 \text{ in } V(G) \\
w_1 & \text{if } w_1 \in FP(G),\ w_2 \in V(G) \text{ and } w_1 = w_1 w_2 \\
w_2 & \text{if } w_1 \in V(G),\ w_2 \in FP(G) \text{ and } w_2 = w_1 w_2 \\
w_1 w_2 & \text{if } w_1, w_2 \text{ in } FP(G) \text{ and } w_1 w_2 \in FP(G) \\
\emptyset & \text{otherwise,}
\end{cases}
\]

where $\emptyset$ is the empty word in $V(G) \cup E(G)$. The binary operation on $\mathbb{F}^{+}(G)$ is called the admissibility; i.e., the algebraic pair $(\mathbb{F}^{+}(G), \cdot)$ is the free semigroupoid of $G$. For convenience, we denote the free semigroupoid of $G$ simply by $\mathbb{F}^{+}(G)$. (In fact, there are free semigroupoids which do not contain the empty word. For instance, if a graph $G$ is a one-vertex-$N$-loop-edge graph with $N$ loop-edges, for $N \in \mathbb{N}$, then the free semigroupoid $\mathbb{F}^{+}(G)$ of $G$ does not contain the empty word, because all finite paths are admissible with each other via the unique vertex of $G$. But, in general, if the vertex set $V(G)$ of $G$ contains more than one vertex, then the empty word is always contained in $\mathbb{F}^{+}(G)$. So we assume automatically that free semigroupoids contain the empty word, if there is no confusion.)

For the given countable directed graph $G$, we can define a new countable directed graph $G^{-1}$, the opposite directed graph of $G$, with

\[
V(G^{-1}) = V(G) \quad\text{and}\quad E(G^{-1}) = \{e^{-1} : e \in E(G)\},
\]

where $e^{-1} \in E(G^{-1})$ is the oppositely directed edge of $e \in E(G)$; i.e., if $e = v_1 e v_2$ with $v_1, v_2 \in V(G)$, then $e^{-1} = v_2 e^{-1} v_1$ with $v_2, v_1 \in V(G^{-1}) = V(G)$. The graph $G^{-1}$ is called the shadow of $G$. As in the previous paragraph, we can construct the set $FP(G^{-1})$ of all finite paths on $G^{-1}$ and the free semigroupoid $\mathbb{F}^{+}(G^{-1})$ of $G^{-1}$. The admissibility on $\mathbb{F}^{+}(G^{-1})$ is oppositely preserved by that on $\mathbb{F}^{+}(G)$. In other words, if there is a finite path $w$ in $\mathbb{F}^{+}(G)$, then there is the corresponding shadow $w^{-1}$ in $\mathbb{F}^{+}(G^{-1})$, and vice versa. More precisely, if $w = e_1 ... e_n$ in $FP(G)$, then there always exists a unique shadow $w^{-1} = e_n^{-1} ... e_1^{-1}$ of $w$ in $FP(G^{-1})$, and vice versa. It is trivial that $(G^{-1})^{-1} = G$.

Definition 2.1 Let $G$ be a countable directed graph and let $G^{-1}$ be the shadow of $G$. The countable directed graph $\hat{G}$ is called the shadowed graph of $G$ if it is a directed graph with vertex set

\[
V(\hat{G}) = V(G) = V(G^{-1}),
\]

and edge set

\[
E(\hat{G}) = E(G) \cup E(G^{-1}).
\]

Note that even though $E(\hat{G}) = E(G) \cup E(G^{-1})$, we have

\[
FP(\hat{G}) \neq FP(G) \cup FP(G^{-1}),
\]

in general. Suppose that $e_1 e_2 \in FP(G)$ and $e_2^{-1} \in FP(G^{-1})$. Then $e_1 e_2 e_2^{-1} \in FP(\hat{G})$, but it is contained in neither $FP(G)$ nor $FP(G^{-1})$. So, in general,

\[
FP(\hat{G}) \supsetneq FP(G) \cup FP(G^{-1}).
\]

Therefore, the free semigroupoid $\mathbb{F}^{+}(\hat{G})$ satisfies

\[
\mathbb{F}^{+}(\hat{G}) \supsetneq \mathbb{F}^{+}(G) \cup \mathbb{F}^{+}(G^{-1}),
\]

too. Now, we define the graph groupoid of $G$.

Definition 2.2 Let $G$ be a countable directed graph, $\hat{G}$ the shadowed graph of $G$, and let $\mathbb{F}^{+}(\hat{G})$ be the free semigroupoid of $\hat{G}$. Define the reduction (RR) on $\mathbb{F}^{+}(\hat{G})$ by

(RR) $w\,w^{-1} = v$ and $w^{-1}w = v'$,

whenever $w = v\,w\,v'$ in $FP(\hat{G})$ with $v, v' \in V(\hat{G})$. The set $\mathbb{F}^{+}(\hat{G})$ with this reducing relation (RR) is denoted by $\mathbb{F}^{+}_{r}(\hat{G})$, and this set with the admissibility inherited from $\mathbb{F}^{+}(\hat{G})$ is called the graph groupoid of $G$. Denote this algebraic pair $(\mathbb{F}^{+}_{r}(\hat{G}), \cdot)$ simply by $\mathbb{G}$. Let $w \in \mathbb{G} \setminus \{\emptyset\}$ be an element satisfying $w \notin V(\hat{G})$. Then we say that $w$ is a reduced finite path. The subset $\mathbb{G} \setminus (V(\hat{G}) \cup \{\emptyset\})$ of $\mathbb{G}$ is said to be the reduced finite path set of $\mathbb{G}$, and we denote it by $FP_{r}(\hat{G})$.
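The admissibility and the reduction (RR) can be sketched in a few lines of code. The example below uses the two-edge graph mentioned earlier, with $e_1 = v_1 e_1 v_2$ and a loop-edge $e_2 = v_2 e_2 v_2$. The encoding is an assumption made here: a vertex is a string, a reduced finite path is a tuple of (edge name, ±1) letters, and `None` stands for the empty word $\emptyset$. Both factors passed to `mul` are assumed to be already reduced.

```python
EDGES = {"e1": ("v1", "v2"), "e2": ("v2", "v2")}   # edge -> (initial, terminal)

def ends(letter):
    """Initial and terminal vertex of a letter e or e^{-1} of the shadowed graph."""
    name, sign = letter
    src, tgt = EDGES[name]
    return (src, tgt) if sign == 1 else (tgt, src)

def initial(w):
    return w if isinstance(w, str) else ends(w[0])[0]

def terminal(w):
    return w if isinstance(w, str) else ends(w[-1])[1]

def inv(w):
    """Shadow of w: reverse the word and flip every exponent."""
    if isinstance(w, str):
        return w
    return tuple((name, -sign) for name, sign in reversed(w))

def mul(w1, w2):
    """Reduced product in the graph groupoid; None is the empty word."""
    if w1 is None or w2 is None:
        return None
    if terminal(w1) != initial(w2):        # not admissible
        return None
    if isinstance(w1, str):
        return w2
    if isinstance(w2, str):
        return w1
    left, right = list(w1), list(w2)
    # Apply (RR) at the junction: cancel a letter against its shadow.
    while left and right and left[-1][0] == right[0][0] and left[-1][1] == -right[0][1]:
        left.pop()
        right.pop(0)
    if not left and not right:
        return initial(w1)                 # w w^{-1} collapses to a vertex
    return tuple(left + right)

E1, E2 = (("e1", 1),), (("e2", 1),)
print(mul(E1, E2))                  # the finite path e1 e2
print(mul(E2, E1))                  # None: e2 and e1 are not admissible
print(mul(mul(E1, E2), inv(E2)))    # reduces to e1
print(mul(E1, inv(E1)), mul(inv(E1), E1))
```

The last line illustrates (RR): $e_1 e_1^{-1}$ reduces to the initial vertex $v_1$ of $e_1$ and $e_1^{-1} e_1$ to its terminal vertex $v_2$.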


Let $e_1, e_2 \in E(G)$ and assume $e_1 e_2 \in FP(G) \subset FP(\hat{G})$. Then we have a finite path $w = e_1 e_2 e_2^{-1}$ in $FP(\hat{G}) \subset \mathbb{F}^{+}(\hat{G})$. Under the reduction (RR), the element $e_2 e_2^{-1}$ is identified with the initial vertex of $e_2$, and hence the finite path $w = e_1 e_2 e_2^{-1}$ in $\mathbb{F}^{+}(\hat{G})$ is $e_1 (e_2 e_2^{-1}) = e_1$ in $\mathbb{F}^{+}_{r}(\hat{G})$; i.e., the finite path $w = e_1 e_2 e_2^{-1}$ in $\mathbb{F}^{+}(\hat{G})$ is identified with the “reduced” finite path $e_1$ in $\mathbb{G}$. This shows that even though we use the same notation $w_1 w_2$ for the product of $w_1$ and $w_2$ in $\mathbb{G}$, this notation means the “reduced” product of $w_1$ and $w_2$ under (RR). Notice that all elements in the graph groupoid $\mathbb{G}$ can be regarded as reduced words in $E(\hat{G})$, under the reduction (RR). In fact, the graph groupoid $\mathbb{G}$ is a (categorical) groupoid with base $V(\hat{G})$, in the sense of Section 1.1.

Now, we define a certain inner structure of graph groupoids.

Definition 2.3 Let $G$ be a countable directed graph and $\mathbb{G}$ the graph groupoid of $G$. Suppose $X_1$ and $X_2$ are subsets of $\mathbb{G}$ and assume that the $X_k$ are self-adjoint in the sense that $X_k = X_k^{-1}$, where $X_k^{-1} = \{w^{-1} : w \in X_k\}$, for $k = 1, 2$. Define the subset $X_1 * X_2$ of $\mathbb{G}$ as the set consisting of all reduced words in $X_1 \cup X_2$, and equip it with the admissibility inherited from $\mathbb{G}$. Then the algebraic structure $(X_1 * X_2, \cdot)$ is called the (reduced) free product of $X_1$ and $X_2$ inside $\mathbb{G}$. For convenience, we denote $(X_1 * X_2, \cdot)$ simply by $X_1 * X_2$.

Let $X_1$ and $X_2$ be self-adjoint subsets of $\mathbb{G}$. Then the free product $X_1 * X_2$ inside $\mathbb{G}$ is regarded as a sub-structure of $\mathbb{G}$. Notice that if either $X_1$ or $X_2$ is not self-adjoint in $\mathbb{G}$, then $X_1 * X_2$ is not a substructure of $\mathbb{G}$, in general.

Proposition 2.1 Let $G$ be a countable directed graph with graph groupoid $\mathbb{G}$, and let $X$ be a self-adjoint subset of $\mathbb{G}$. Suppose $\mathbb{X}$ is the subset of $\mathbb{G}$ consisting of all reduced words in $X$. Then there exists a graph $G_X$ having $\mathbb{X}$ as its graph groupoid.

Proof. Construct the sets

\[
V_0 = V(\hat{G}) \cap X, \qquad
V_1 = \{v \in V(\hat{G}) : w = v\,w \text{ or } w = w\,v,\ \forall\, w \in X\}
\]

and

\[
E_0 = FP_{r}(\hat{G}) \cap X.
\]

Construct a graph $\hat{G}_X$ with vertex set $V(\hat{G}_X) = V_0 \cup V_1$ and edge set $E(\hat{G}_X) = E_0$. Then we can have a graph $G_X$ having $\hat{G}_X$ as its shadowed graph. The choice of $G_X$ is not unique. This graph $G_X$ has $\mathbb{X}$ as its graph groupoid.

The above proposition means that a subset $\mathbb{X}$ of a graph groupoid $\mathbb{G}$, consisting of all reduced words in an arbitrary self-adjoint subset $X$, is regarded as another graph groupoid, of a certain directed graph $G_X$, and the graph $G_X$ is induced by $G$, as a part of it.

Let $X_1$ and $X_2$ be self-adjoint subsets of $\mathbb{G}$. Then the free product $X_1 * X_2$ inside $\mathbb{G}$ is also regarded as a graph groupoid of a certain directed graph, by the previous proposition. Notice that the admissibility of all such structures depends on the admissibility on $\mathbb{G}$. The following theorem provides the algebraic structure theorem of graph groupoids.

Theorem 2.2 Let $G$ be a countable directed graph with graph groupoid $\mathbb{G}$. Then

\[
\mathbb{G} = \underset{e \in E(G)}{*}\, X_e,
\]

where $X_e = \{e, e^{-1}\}$, for all $e \in E(G)$. Equivalently, if $\mathbb{X}_e$ is the collection of all reduced words in $X_e$, for $e \in E(G)$, then

\[
\mathbb{G} = \underset{e \in E(G)}{*}\, \mathbb{X}_e.
\]



Proof. By definition, the free product $\mathcal{G} = *_{e \in E(G)}\, X_e$ is a sub-structure of the graph groupoid $\mathbb{G}$. So it is sufficient to show that $\mathbb{G}$ is a sub-structure of $\mathcal{G}$. To show this, we need to show that all elements in $\mathbb{G}$ are alternating products of the $X_e$'s, for $e \in E(G)$. Let $v \in V(\hat{G}) \subset \mathbb{G}$. Then $v \in \mathcal{G}$, since if $e = v\,e\,v'$ is an edge in $E(G)$ with $v, v' \in V(G)$, then $e\,e^{-1} = v$ in $\mathbb{X}_e \subset \mathcal{G}$. Assume now that $w = e_1^{t_1} ... e_n^{t_n} \in FP_{r}(\hat{G}) \subset \mathbb{G}$, where $e_1, ..., e_n \in E(G)$ and $t_1, ..., t_n \in \{1, -1\}$, for $n \in \mathbb{N}$. Then $w$ is contained in $*_{j=1}^{n}\, X_{e_j} \subset \mathcal{G}$, since $e_j^{t_j} \in X_{e_j}$, for any $t_j \in \{1, -1\}$, for $j = 1, ..., n$. Finally, let $\emptyset$ be the empty word of $\mathbb{G}$. Suppose $e = v\,e\,v' \in E(G)$ with $v \neq v' \in V(G)$. Then the sub-structure $\mathbb{X}_e$ contains the empty word $\emptyset$, since $v\,v' = \emptyset = v'\,v$. Therefore, $\emptyset \in \mathcal{G}$. Hence, set-theoretically, $\mathbb{G} \subseteq \mathcal{G}$. By the definition of the free product inside a graph groupoid, $\mathcal{G}$ has the same admissibility as $\mathbb{G}$. Thus the algebraic structures $\mathcal{G}$ and $\mathbb{G}$ are identical.

Definition 2.4 Let $\mathbb{G}$ be the graph groupoid of a directed graph $G$, and let $X_e$ and $\mathbb{X}_e$ be given as in the previous theorem, for all $e \in E(G)$. Then the subsets $\mathbb{X}_e$ of $\mathbb{G}$, consisting of all reduced words in $X_e$, are called the (reduced) free blocks of $\mathbb{G}$.

Let $e \in E(G)$ and let $X_e$ and $\mathbb{X}_e$ be given as above. Then the substructure $\mathbb{X}_e$ of $\mathbb{G}$ is either the set $\mathbb{Z}$ of all integers or $X_e \cup \{\emptyset, v, v'\}$, set-theoretically. In particular, if $e$ is a loop-edge, then $\mathbb{X}_e$ is identical to $\mathbb{Z}$, and if $e$ is a non-loop edge, then it is identical to the set $X_e \cup \{\emptyset, v, v'\}$, where $v$ and $v'$ are the initial and the terminal vertices of $e$ in $G$.

The above theorem shows that every graph groupoid is a free product of certain free blocks. We can characterize the free blocks $\mathbb{X}_e$ of $\mathbb{G}$ as follows:

Theorem 2.3 Let $e \in E(G)$ and let $X_e$ and $\mathbb{X}_e$ be given as above. Then $\mathbb{X}_e$ is a substructure of $\mathbb{G}$, under the admissibility inherited from $\mathbb{G}$, and it is either the infinite cyclic Abelian group $\mathbb{Z}$ or the graph groupoid $\mathbb{G}_e$ of the two-vertices-one-edge graph $G_e$, as an algebraic structure.

Proof. Fix an edge $e \in E(G)$, and let $X_e = \{e, e^{-1}\}$ and $\mathbb{X}_e$ the collection of all reduced words in $X_e$, under the admissibility inherited from the graph groupoid $\mathbb{G}$ of $G$. There are two cases: $e$ is a loop-edge, or $e$ is a non-loop edge. Assume that $e = v\,e\,v$ is a loop-edge with $v \in V(G)$. Then we have

\[
\mathbb{X}_e = \{e^{k} : k \in \mathbb{Z}\},
\]

set-theoretically. Thus the substructure $\mathbb{X}_e$ of $\mathbb{G}$ can be regarded as the cyclic group $\langle e \rangle$ generated by $e$ (see (1) of Example 2.1). It is group-isomorphic to $\mathbb{Z}$. Assume now that $e = v_1 e\,v_2$ is a non-loop edge with $v_1 \neq v_2 \in V(G)$. Notice that $e$ is not admissible with itself, and hence $e^{m} = \emptyset$ whenever $m \in \mathbb{N} \setminus \{1\}$. And, by (RR) on $\mathbb{G}$, we have $e\,e^{-1} = v_1$ and $e^{-1}e = v_2$. So the substructure $\mathbb{X}_e$ is identical to

\[
\{\emptyset, v_1, v_2\} \cup X_e,
\]

set-theoretically. Construct the one-edge graph $G_e$ with $V(G_e) = \{v_1, v_2\}$ and $E(G_e) = \{e = v_1 e\,v_2\}$. Then the graph groupoid $\mathbb{G}_e$ of $G_e$ is identified with $\mathbb{X}_e$, as algebraic structures.

The above theorem provides the full characterization of the free blocks $\mathbb{X}_e$ of the graph groupoid $\mathbb{G}$, for $e \in E(G)$. Now we can understand the graph groupoid $\mathbb{G}$ as a free product of the free blocks $\mathbb{X}_e$, for all $e \in E(G)$, where the $\mathbb{X}_e$'s are completely characterized by the above theorem. Also, by the previous two theorems, we have the following isomorphism theorem for graph groupoids.

Theorem 2.4 (See [4] and [5]) Let $G_1$ and $G_2$ be countable directed graphs with graph groupoids $\mathbb{G}_1$ and $\mathbb{G}_2$, respectively. If $G_1$ and $G_2$ have graph-isomorphic shadowed graphs $\hat{G}_1$ and $\hat{G}_2$, then $\mathbb{G}_1$ and $\mathbb{G}_2$ are isomorphic, as graph groupoids.

Example 2.1 (1) Let $G_e$ be a one-edge graph with $V(G_e) = \{v, v'\}$ and $E(G_e) = \{e = v\,e\,v'\}$, where $v$ and $v'$ are not necessarily distinct. Assume that $v = v'$, and denote it by $v_0$. Then the only edge $e$ of $G_e$ is a loop-edge. So the graph groupoid $\mathbb{G}_e$ of $G_e$ is determined by

\[
\mathbb{G}_e = \{e^{n} : n \in \mathbb{Z},\ e^{0} \overset{def}{=} v_0\}.
\]

Notice that the graph groupoid $\mathbb{G}_e$ does not contain the empty word, since all reduced words (the vertex and the reduced finite paths) in $V(\hat{G}_e) \cup E(\hat{G}_e)$ are admissible with each other via $v_0$. We can easily check that $\mathbb{G}_e$ satisfies: (i) $(e^{k_1} e^{k_2})\,e^{k_3} = e^{k_1 + k_2 + k_3} = e^{k_1}(e^{k_2} e^{k_3})$, for all $k_1, k_2, k_3 \in \mathbb{Z}$; (ii) $e^{k} v_0 = e^{k} = v_0 e^{k}$, for all $k \in \mathbb{Z}$; (iii) for any $e^{k} \in \mathbb{G}_e \setminus \{v_0\}$, there exists a unique $e^{-k}$ such that $e^{k} e^{-k} = v_0 = e^{-k} e^{k}$, for all $k \in \mathbb{Z}$; and (iv) $e^{k_1} e^{k_2} = e^{k_1 + k_2} = e^{k_2 + k_1} = e^{k_2} e^{k_1}$, for all $k_1, k_2 \in \mathbb{Z}$. By (i) ∼ (iv), the graph groupoid $\mathbb{G}_e$ is an Abelian group. Furthermore, as a group, it is cyclic, because it is generated by $e$; i.e., $\mathbb{G}_e = \langle e \rangle$. This shows that $\mathbb{G}_e$ is the infinite Abelian cyclic group. Therefore, $\mathbb{G}_e$ is group-isomorphic to $\mathbb{Z}$. Assume now that $v \neq v'$ in $V(G_e)$.
Then we have that 𝔾_e = {∅} ∪ {v, v′} ∪ {e, e^{-1}}, by the reduction (RR).

(2) Let G_N be the one-vertex-N-loop-edge graph with V(G_N) = {v} and E(G_N) = {e_j = v e_j v : j = 1, ..., N}, for N ∈ N \ {1}. The case where N = 1 is observed in (1). The graph groupoid 𝔾_N of G_N is a group < e_1, ..., e_N > generated by E(G_N). Notice that, by the admissibility on 𝔾_N, 𝔾_N does not contain the empty word. By the previous theorem, the graph groupoid 𝔾_N is the free product ∗_{j=1}^{N} 𝕏_{e_j}, where each 𝕏_{e_j} is the subset of 𝔾_N consisting of all reduced words in {e_j, e_j^{-1}}, for all j = 1, ..., N. We can regard each 𝕏_{e_j} as the graph groupoid of a certain directed graph. In fact, 𝕏_{e_j} is the graph groupoid of the one-edge graph G_{e_j} with V(G_{e_j}) = {v} and E(G_{e_j}) = {e_j}, for all j = 1, ..., N. By (1), the 𝕏_{e_j} are group-isomorphic to Z, for all j = 1, ..., N. Therefore, the graph groupoid 𝔾_N of G_N is identified with


\[ \underbrace{\mathbb{Z} * \mathbb{Z} * \cdots * \mathbb{Z}}_{N\text{-times}}. \]

By the definition of the reduced free product inside graph groupoids, it is the group free product, in this case. Therefore, 𝔾_N is a group; moreover, it is group-isomorphic to the free group F_N with N generators.

Notice that, if two countable directed graphs G_1 and G_2 are graph-isomorphic, via a graph-isomorphism g : G_1 → G_2, in the sense that: (i) g is bijective from V(G_1) onto V(G_2), (ii) g is bijective from E(G_1) onto E(G_2), and (iii) g(e) = g(v_1 e v_2) = g(v_1) g(e) g(v_2) in E(G_2), for all e = v_1 e v_2 ∈ E(G_1), with v_1, v_2 ∈ V(G_1), then the graph groupoids 𝔾_1 and 𝔾_2 are groupoid-isomorphic. More generally, if G_1 and G_2 have graph-isomorphic shadowed graphs Ĝ_1 and Ĝ_2, via a graph-isomorphism h : Ĝ_1 → Ĝ_2, then 𝔾_1 and 𝔾_2 are groupoid-isomorphic. Indeed, we can create the groupoid-isomorphism Φ : 𝔾_1 → 𝔾_2 determined by

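\[
\Phi(w) \overset{def}{=}
\begin{cases}
h(w) & \text{if } w \in V(\hat{G}_1) \cup E(\hat{G}_1), \\[4pt]
h(e_1) \cdots h(e_n) & \text{if } w = e_1 \cdots e_n \in FP_r(\hat{G}_1) \text{ with } e_1, \ldots, e_n \in E(\hat{G}_1), \text{ for } n \in \mathbb{N} \setminus \{1\},
\end{cases}
\]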

in 𝔾_2, for all w ∈ 𝔾_1. This groupoid-isomorphism Φ is called the groupoid-isomorphism induced by the graph-isomorphism h.

Recall that a graph G_1 is a full-subgraph of G, if E(G_1) ⊆ E(G), and V(G_1) = {v, v′ ∈ V(G) : e = v e v′, ∀ e ∈ E(G_1)}. Let 𝔾_1 be the graph groupoid of a full-subgraph G_1 of G. Then 𝔾_1 is a subgroupoid of 𝔾, in the sense of Section 1. Note the difference between subgraphs and full-subgraphs: we say that G_2 is a subgraph of G, if V(G_2) ⊆ V(G) and E(G_2) = {e ∈ E(G) : e = v e v′, with v, v′ ∈ V(G_2)}. By definition, all subgraphs are full-subgraphs.
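The product on a graph groupoid described above (admissibility plus the reduction (RR)) can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, none of which come from the text: edges are stored as a dictionary `name -> (initial vertex, terminal vertex)`, a reduced path is a tuple of letters `(edge_name, ±1)`, a vertex is a string, and `EMPTY` stands for the empty word ∅. The names `make_groupoid`, `ends`, and `mul` are hypothetical.

```python
# Sketch of the graph groupoid product with admissibility and reduction (RR).
# Assumed encoding (not from the text): edges = {name: (initial, terminal)},
# a reduced path is a tuple of letters (edge_name, +1 or -1),
# a vertex is a string, and EMPTY plays the role of the empty word.

EMPTY = None


def make_groupoid(edges):
    """Return the product function of the graph groupoid of the graph `edges`."""

    def ends(x):
        # (initial vertex, terminal vertex) of a vertex or of a reduced path
        if isinstance(x, str):
            return x, x

        def letter_ends(letter):
            name, sign = letter
            s, t = edges[name]
            return (s, t) if sign == 1 else (t, s)  # e^{-1} runs backwards

        return letter_ends(x[0])[0], letter_ends(x[-1])[1]

    def mul(x, y):
        if x is EMPTY or y is EMPTY:
            return EMPTY
        if ends(x)[1] != ends(y)[0]:
            return EMPTY                      # x and y are not admissible
        if isinstance(x, str):
            return y                          # v w = w
        if isinstance(y, str):
            return x                          # w v' = w
        left, right = list(x), list(y)
        # (RR): cancel e e^{-1} and e^{-1} e where the two paths meet
        while (left and right and left[-1][0] == right[0][0]
               and left[-1][1] == -right[0][1]):
            left.pop()
            right.pop(0)
        word = tuple(left + right)
        # if everything cancelled, the product is the initial vertex of x
        return word if word else ends(x)[0]

    return mul
```

For a non-loop edge e = v_1 e v_2 this model returns ∅ for e e, v_1 for e e^{-1}, and v_2 for e^{-1} e, while for a loop-edge the powers of e never cancel unless an inverse letter meets them, in line with Example 2.1.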

2.2. Canonical Representations

By the very definition of graph groupoids, all graph groupoids are (categorical) groupoids in the sense of Section 1. Indeed, if 𝔾 is the graph groupoid of a countable directed graph G, then the quadruple (𝔾, V(Ĝ), s, r) is a groupoid, where s and r are the extended source map and the extended range map on 𝔾, respectively, defined by s(w) = s(v w) = v and r(w) = r(w v′) = v′, for all w = v w v′ ∈ 𝔾, with v, v′ ∈ V(Ĝ) = V(G), where Ĝ is the shadowed graph of G. Since 𝔾 is a groupoid in the sense of Section 1, we can construct the canonical representation (H_G, L) of 𝔾. The groupoid Hilbert space H_G of 𝔾 is defined by

\[ H_G = \Big( \bigoplus_{v \in V(\hat{G})} \mathbb{C}\,\xi_v \Big) \oplus \Big( \bigoplus_{w \in FP_r(\hat{G})} \mathbb{C}\,\xi_w \Big), \]

where FP_r(Ĝ) def= 𝔾 \ (V(Ĝ) ∪ {∅}) is the reduced finite path set in 𝔾. To emphasize that this groupoid Hilbert space H_G is induced by a graph G, we call H_G the graph Hilbert space, as in [3] and [4]. The canonical (left) action L of 𝔾 is determined similarly: L(w) def= L_w, for all w ∈ 𝔾, where L_w is the multiplication operator on H_G with its symbol ξ_w.

Notation The canonical representation (H_G, L) of a graph groupoid 𝔾 is called the canonical (graph) representation of 𝔾 (or of G).

The following theorem provides the characterization of graph representations.

Theorem 2.5 Let G_1 and G_2 be graphs. If they have graph-isomorphic shadowed graphs Ĝ_1 and Ĝ_2, then the canonical representations (H_{G_k}, L^{(k)}) of 𝔾_k, for k = 1, 2, are equivalent.

Proof. Since the shadowed graphs Ĝ_1 and Ĝ_2 are graph-isomorphic, the graph groupoids 𝔾_1 and 𝔾_2 are groupoid-isomorphic. Indeed, if g is a graph-isomorphism, then there exists the groupoid-isomorphism Φ_g induced by g.

Let G be a graph with its graph groupoid 𝔾, and let (H_G, L) be the canonical representation of 𝔾. Then the groupoid W∗-algebra vN(𝔾) is well-defined, as a W∗-subalgebra of the operator algebra B(H_G).

Definition 2.5 Let

\[ vN(\mathbb{G}) = \overline{\mathbb{C}[L(\mathbb{G})]}^{\,w} \]

be the groupoid W∗-subalgebra of B(H_G). We call vN(𝔾) the graph von Neumann algebra of G. For convenience, we denote vN(𝔾) by M_G.
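For a finite graph groupoid the canonical left action can be written down as explicit 0/1 matrices. The sketch below does this for the two-vertices-one-edge graph e = v_1 e v_2, whose nonempty groupoid elements are v_1, v_2, e and e^{-1} (written `f` here). It assumes, as a modelling choice rather than something stated in the text, that L_w sends the basis vector ξ_x to ξ_{wx} when w x ≠ ∅ and to 0 otherwise; the names `BASIS`, `ENDS`, `mul`, `L` and `matmul` are hypothetical.

```python
# Hypothetical finite model: the two-vertices-one-edge graph e = v1 e v2.
# Nonempty elements of its graph groupoid: v1, v2, e and f = e^{-1}.
BASIS = ["v1", "v2", "e", "f"]
# (initial vertex, terminal vertex) of every nonempty element
ENDS = {"v1": ("v1", "v1"), "v2": ("v2", "v2"),
        "e": ("v1", "v2"), "f": ("v2", "v1")}


def mul(x, y):
    """Groupoid product; None plays the role of the empty word."""
    if ENDS[x][1] != ENDS[y][0]:
        return None                 # not admissible
    if x in ("v1", "v2"):
        return y                    # v w = w
    if y in ("v1", "v2"):
        return x                    # w v' = w
    # here {x, y} = {e, f}: e f = v1 and f e = v2 by (RR)
    return ENDS[x][0]


def L(w):
    """Matrix of L_w on span{xi_x : x in BASIS}; column x is the image of xi_x."""
    n = len(BASIS)
    m = [[0] * n for _ in range(n)]
    for col, x in enumerate(BASIS):
        p = mul(w, x)
        if p is not None:
            m[BASIS.index(p)][col] = 1
    return m


def matmul(a, b):
    """Plain matrix product, so that matmul(L(x), L(y)) applies L(y) first."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]
```

In this model `matmul(L("e"), L("f"))` agrees with `L("v1")`, the product `matmul(L("e"), L("e"))` is the zero matrix (e is not admissible with itself), and the transpose of `L("e")` is `L("f")`, which matches the expectation that L_w is a partial isometry with adjoint L_{w^{-1}}.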



Recently, graph von Neumann algebras have been studied (see [3], [4], [5], [7] through [11], [16], [17], and [18]). We refer the reader to [4] for the basic theory of graph von Neumann algebras. The following proposition is a direct consequence of the previous theorem.

Proposition 2.6 (Also, see [3] and [4]) Let G_1 and G_2 be directed graphs and assume that G_1 and G_2 have graph-isomorphic shadowed graphs. Then the graph von Neumann algebras M_{G_1} and M_{G_2} are ∗-isomorphic.

Recall that the graph groupoid 𝔾 of a graph G is the free product of the free blocks 𝕏_e of 𝔾, where the 𝕏_e are the subgroupoids of 𝔾 consisting of all reduced words only in {e, e^{-1}}, for all e ∈ E(G). So, we have that

\[
M_G \overset{def}{=} \overline{\mathbb{C}[L(\mathbb{G})]}^{\,w}
= \overline{\mathbb{C}[\mathbb{G}]}^{\,w}
= \overline{\mathbb{C}\Big[\underset{e \in E(G)}{*} \mathbb{X}_e\Big]}^{\,w}
= \underset{e \in E(G)}{*^{r}_{D_G}} \, vN\Big(\overline{\mathbb{C}[\mathbb{X}_e]}^{\,w},\, D_G\Big)
= \underset{e \in E(G)}{*^{r}_{D_G}} \, M_e,
\]

by [3] and [4], where D_G is the diagonal subalgebra of M_G, defined by

\[ D_G \overset{def}{=} \bigoplus_{v \in V(G)} (\mathbb{C}\, L_v) \subset M_G, \]

and where

\[ M_e \overset{def}{=} vN\Big(\overline{\mathbb{C}[\mathbb{X}_e]}^{\,w},\, D_G\Big), \quad \text{for all } e \in E(G). \]

Here vN(S_1, S_2) means the von Neumann algebra generated by the sets S_1 and S_2, and the symbol "∗^r_{D_G}" means the reduced free product with amalgamation over D_G. Therefore, we obtain the following theorem.

Theorem 2.7 (See [3] and [4]) Let M_G be the graph von Neumann algebra of G. Then M_G is ∗-isomorphic to the D_G-valued reduced free product algebra of the W∗-subalgebras M_e, for all e ∈ E(G), where the reduction depends entirely on the admissibility on 𝔾.

The proof is not easy, because it requires background in amalgamated free probability. However, the meaning of the above theorem is simple: the free block structure of a graph groupoid affects the operator-algebraic free block structure of the von Neumann algebra generated by that graph groupoid. We need to keep in mind, however, that the operator-algebraic freeness is determined with amalgamation.


2.3. Connected Graphs

Let G be a countable directed graph with its graph groupoid 𝔾. We say that the graph G is connected, if, for any pair (v_1, v_2) of distinct vertices v_1 and v_2 in V(G), there always exists at least one reduced finite path w ∈ FP_r(Ĝ) ⊂ 𝔾, such that w = v_1 w v_2 and w^{-1} = v_2 w^{-1} v_1. Otherwise, we say that G is disconnected. Suppose a graph G is disconnected. Then there exist t ∈ N and a family 𝒢 = {G_1, ..., G_t} of full-subgraphs G_j of G, for j = 1, ..., t, such that: (i) the G_j are connected full-subgraphs, (ii) V(G) = ⊔_{j=1}^{t} V(G_j), (iii) E(G) = ⊔_{j=1}^{t} E(G_j), (iv) 𝒢 is the minimal family satisfying the conditions (i), (ii), and (iii), where "⊔" means the disjoint union. Notice that such a "minimal" family exists, by the axiom of choice. The elements of 𝒢 are called the connected components of G.

Let G be a disconnected graph with its connected components G_1, ..., G_t, for t ∈ N \ {1}. Then the shadowed graph Ĝ of G is also disconnected, with its connected components Ĝ_1, ..., Ĝ_t, where Ĝ_j is the shadowed graph of G_j, for all j = 1, ..., t. Therefore, we have the following.

Lemma 2.8 Let 𝔾 be the graph groupoid of a disconnected graph G, and assume that G has its connected components G_1, ..., G_t, for t ∈ N \ {1}. Then the graph groupoid 𝔾 is decomposed as 𝔾 = ⊔_{j=1}^{t} 𝔾_j, where 𝔾_j are the graph groupoids of G_j, for all j = 1, ..., t.

By the above lemma, we obtain the following theorem.

Theorem 2.9 Let G be a directed graph with its connected components G_1, ..., G_t, for t ∈ N, and let 𝔾 be the graph groupoid of G. Then the canonical representation (H_G, L) of 𝔾 satisfies

\[ H_G = \bigoplus_{j=1}^{t} H_{G_j} \quad \text{and} \quad L = \sum_{j=1}^{t} L_{G_j}, \]

where (H_{G_j}, L_{G_j}) are the canonical representations of the graph groupoids 𝔾_j of G_j, for all j = 1, ..., t.

The following corollary is a direct consequence of the previous theorem.

Corollary 2.10 Let G be a graph with its graph groupoid 𝔾, and let (H_G, L) be the canonical representation of 𝔾. Assume that G has the connected components G_1, ..., G_t. Then the graph von Neumann algebra M_G = \overline{C[L(𝔾)]}^w of G is ∗-isomorphic to ⊕_{j=1}^{t} M_{G_j}, where M_{G_j} = \overline{C[L(𝔾_j)]}^w are the graph von Neumann algebras of G_j, for all j = 1, ..., t.

By the previous theorem and corollary, it is sufficient to consider connected graphs.

Assumption In the rest of this paper, we restrict our interest to connected directed graphs. All graphs are automatically assumed to be connected, unless stated otherwise.
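The decomposition of a disconnected graph into connected components can be computed by a standard graph search on the shadowed graph, that is, by ignoring edge directions. The following is a minimal sketch, assuming (beyond what the text specifies) that the graph is finite and given as a vertex list together with a dictionary `name -> (initial vertex, terminal vertex)`; the function name `connected_components` is hypothetical.

```python
def connected_components(vertices, edges):
    """Split a finite directed graph into its connected components.

    edges: dict name -> (initial vertex, terminal vertex).
    Connectivity is tested in the shadowed graph, i.e. directions are ignored.
    Returns a list of (vertex set, edge dict) pairs.
    """
    neighbours = {v: set() for v in vertices}
    for s, t in edges.values():
        neighbours[s].add(t)
        neighbours[t].add(s)

    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                       # depth-first search from `start`
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(neighbours[v] - comp)
        seen |= comp
        # an edge belongs to the component that contains its initial vertex
        comp_edges = {n: st for n, st in edges.items() if st[0] in comp}
        components.append((comp, comp_edges))
    return components
```

The vertex sets and edge sets returned this way are pairwise disjoint and cover V(G) and E(G), which is the content of conditions (ii) and (iii) above; an isolated vertex is returned as a trivial component with no edges.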


3. Subgroupoids Induced by Full-Subgraphs

Throughout this section, let G be a connected countable directed graph. Define the collection 𝒢^G_connect as the set of all "connected" full-subgraphs of G, including the trivial graphs consisting of only one vertex. Let G′ ∈ 𝒢^G_connect be a trivial full-subgraph of G with V(G′) = {v} and E(G′) = ∅, where ∅ means the empty set. Denote G′ by G_v. Similarly, if G″ ∈ 𝒢^G_connect is a one-edge full-subgraph of G with V(G″) = {v, v′} and E(G″) = {e}, for e ∈ E(G), then we denote G″ by G_e. Here, note that the vertices v and v′ of G_e are not necessarily distinct. For example, if e is a loop-edge, then v = v′ in V(G_e).

Define the partial ordering ≤ on 𝒢^G_connect by

G_1 ≤ G_2 def⇐⇒ G_1 is a full-subgraph of G_2,

for all G_1, G_2 ∈ 𝒢^G_connect. The pair (𝒢^G_connect, ≤) is a POset (or a partially ordered set). Then we can take a subset {G_1, ..., G_k} of 𝒢^G_connect, satisfying

G_1 ≤ G_2 ≤ ... ≤ G_k.

Such a subset {G_1, ..., G_k} is called a chain (or a totally ordered subset) of 𝒢^G_connect. In particular, under the above setting, we say that G_1 is the minimal element of the chain {G_1, ..., G_k}, and G_k is the maximal element of the chain {G_1, ..., G_k}. To emphasize the total ordering on the chain {G_1, ..., G_k}, we denote it by [G_1, ..., G_k].

Example 3.1 Let G be the one-flow circulant graph with V(G) = {v_1, v_2, ..., v_n} and E(G) = {e_j = v_j e_j v_{j+1} : j = 1, ..., n, with v_{n+1} def= v_1}. Then we can take a chain [G_1, G_2, ..., G_n] in 𝒢^G_connect such that

G_1 = G_{v_1}, G_2 = G_{[v_1,v_2]}, G_3 = G_{[v_1,v_2,v_3]}, ..., G_{n−1} = G_{[v_1,...,v_{n−1}]}, and G_n = G,

where G_{[v_1,...,v_k]} are the full-subgraphs of G with V(G_{[v_1,...,v_k]}) = {v_1, ..., v_k} and E(G_{[v_1,...,v_k]}) = {e_1, ..., e_{k−1}}, where e_i = v_i e_i v_{i+1}, for all i = 1, ..., k − 1.

Let [G_1, ..., G_k] be a chain of 𝒢^G_connect, for an arbitrary fixed connected graph G. Then the graphs G_1, ..., G_k generate a tower of graphs, in the sense of [11]: a tower of graphs

G_1 ≤ G_2 ≤ ... ≤ G_n,

for n ∈ N, is a set of graphs G_1, ..., G_n satisfying the (full-subgraph) partial ordering ≤, as above, up to graph-isomorphisms. (Notice that, in [11], we do not assume the connectedness of graphs. However, as we assumed in Section 2.3, all our graphs are connected.) That is, for any chain [G_1, ..., G_k] of 𝒢^G_connect, we have the tower of graphs G_1 ≤ G_2 ≤ ... ≤ G_k, determined by the partial ordering ≤. By [11], we obtain the following result.

Theorem 3.1 Let [G_1, ..., G_k] ⊂ 𝒢^G_connect be a chain. Then the canonical representations (H_{G_j}, L^{G_j}) of the G_j are sub-representations of the canonical representation (H_G, L^G) of G, satisfying

H_{G_j} ⊆ H_G (as a subspace), and L^{G_j} = L^G |_{𝔾_j}, for all j = 1, ..., k,

where 𝔾_j are the graph groupoids of G_j.

By the previous theorem, we get the tower of von Neumann algebras

M_{G_1} ⊆ M_{G_2} ⊆ ... ⊆ M_{G_k},

whenever we have a chain [G_1, ..., G_k] of 𝒢^G_connect. Moreover, all graph von Neumann algebras M_{G_j} are W∗-subalgebras of the graph von Neumann algebra

\[ M_G = \overline{\mathbb{C}[L^{G}(\mathbb{G})]}^{\,w} \]

of G, in B(H_G), where (H_G, L^G) is the canonical representation of 𝔾. Indeed, if [G_1, ..., G_k] ⊂ 𝒢^G_connect is a chain, then we have the corresponding tower

Ĝ_1 ≤ Ĝ_2 ≤ ... ≤ Ĝ_k

of (connected) shadowed graphs, and hence we obtain the tower

𝔾_1 ⊆ 𝔾_2 ⊆ ... ⊆ 𝔾_k (as subgroupoids)

of graph groupoids, where 𝔾_j are the graph groupoids of G_j, for all j = 1, ..., k. Therefore, by the previous theorem, we have the tower of graph von Neumann algebras as above.

Corollary 3.2 Let [G_1, ..., G_k] ⊂ 𝒢^G_connect be a chain. Then we obtain the tower of graph von Neumann algebras

M_{G_1} ⊆ M_{G_2} ⊆ ... ⊆ M_{G_k},

where the graph von Neumann algebras M_{G_j} are W∗-subalgebras of the graph von Neumann algebra M_G of G.
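The full-subgraph ordering and a chain in the one-flow circulant graph can be checked mechanically. The sketch below builds one chain with n links (the trivial graph at v_1, the path-shaped full-subgraphs on v_1, ..., v_k for k = 2, ..., n − 1, and the whole graph) and tests the ordering. The encoding of a graph as a pair (vertex set, edge dictionary) and the names `full_subgraph`, `is_leq`, `circulant` and `chain` are assumptions made for illustration only.

```python
def full_subgraph(edges, names):
    """Full-subgraph spanned by the edges in `names`: its vertices are their end points."""
    sub = {n: edges[n] for n in names}
    verts = {v for st in sub.values() for v in st}
    return verts, sub


def is_leq(g1, g2):
    """G1 <= G2: every vertex and every edge of G1 also belongs to G2."""
    (v1, e1), (v2, e2) = g1, g2
    return v1 <= v2 and all(n in e2 and e2[n] == st for n, st in e1.items())


def circulant(n):
    """One-flow circulant graph: e_j runs from v_j to v_{j+1}, with v_{n+1} = v_1."""
    return {f"e{j}": (f"v{j}", f"v{j % n + 1}") for j in range(1, n + 1)}


def chain(n):
    """A chain of n connected full-subgraphs of the circulant graph on n vertices."""
    edges = circulant(n)
    links = [({"v1"}, {})]                      # the trivial graph at v1
    for k in range(2, n):                       # vertices v1..vk, edges e1..e_{k-1}
        links.append(full_subgraph(edges, [f"e{j}" for j in range(1, k)]))
    links.append(full_subgraph(edges, list(edges)))   # the whole graph
    return links
```

Calling `chain(5)` gives five graphs in which each link is ≤ the next one, while the reverse comparison of the last two links fails because the whole graph contains v_5.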


Thus, by the previous corollary, whenever we have a chain [G_1, ..., G_k] in 𝒢^G_connect, we get a tower of graph von Neumann algebras,

M_{G_1} ⊆ M_{G_2} ⊆ ... ⊆ M_{G_k}.

Also, we can construct the corresponding tower of diagonal subalgebras,

D_{G_1} ⊆ D_{G_2} ⊆ ... ⊆ D_{G_k}.

Since D_{G_j} ⊆ M_{G_j}, for all j = 1, ..., k, we obtain the ladder of von Neumann algebras,

\[
\begin{array}{ccccccc}
M_{G_1} & \subseteq & M_{G_2} & \subseteq & \cdots & \subseteq & M_{G_k} \\
\cup & & \cup & & & & \cup \\
D_{G_1} & \subseteq & D_{G_2} & \subseteq & \cdots & \subseteq & D_{G_k}.
\end{array}
\]

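As we have seen in [10] and [11], the free probability works well on such ladders. More precisely, if we define the natural surjections F_j : M_{G_j} → M_{G_{j−1}} by

\[ F_j\Big(\sum_{w \in \mathbb{G}_j} t_w L^{G}_w\Big) \overset{def}{=} \sum_{y \in \mathbb{G}_{j-1}} t_y L^{G}_y, \]

for all Σ_{w ∈ 𝔾_j} t_w L^G_w ∈ M_{G_j} (with t_w ∈ C), for all j = 2, 3, ..., k, and if we have the conditional expectations E_i : M_{G_i} → D_{G_i}, defined by

\[ E_i\Big(\sum_{w \in \mathbb{G}_i} t_w L^{G}_w\Big) \overset{def}{=} \sum_{v \in V(\hat{G}_i)} t_v L^{G}_v, \]

for all i = 1, 2, ..., k, then the diagram

\[
\begin{array}{ccccccc}
M_{G_1} & \xleftarrow{\;F_2\;} & M_{G_2} & \xleftarrow{\;F_3\;} & \cdots & \xleftarrow{\;F_k\;} & M_{G_k} \\
\downarrow {\scriptstyle E_1} & & \downarrow {\scriptstyle E_2} & & & & \downarrow {\scriptstyle E_k} \\
D_{G_1} & \xleftarrow{\;F_2'\;} & D_{G_2} & \xleftarrow{\;F_3'\;} & \cdots & \xleftarrow{\;F_k'\;} & D_{G_k}
\end{array}
\]

commutes, where F_i′ def= F_i |_{D_{G_i}}, for all i = 2, ..., k. Notice that here L^G means the canonical groupoid action of the graph groupoid 𝔾 of G, which represents L^{G_j}, for all j = 1, ..., k, by the previous theorem. Also, note that every element x of M_G has an expression

\[ x = \sum_{w \in \mathbb{G}} t_w L^{G}_w, \quad \text{with } t_w \in \mathbb{C}. \]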

So, every element x′ in M_{G_j} has an expression

\[ x' = \sum_{w \in \mathbb{G}_j} t_w L^{G_j}_w, \]

where 𝔾_j are the graph groupoids of G_j, for all j = 1, ..., k. Therefore, we obtain the following.

Proposition 3.3 If [G_1, ..., G_k] is a chain in 𝒢^G_connect, then the pairs (M_{G_j}, F_j ∘ ... ∘ F_{j−i} ∘ E_{j−i}) are amalgamated W∗-probability spaces over D_{j−i}, in the sense of Voiculescu, for j = 1, ..., k, and i = 0, ..., j − 1, where the F_n and E_n are given as in the above diagram.

We will not discuss the above proposition in detail. Its meaning is that, whenever we have a chain of connected full-subgraphs, the operator-algebraic free probability works well on the ladder induced by the chain (also, see [11]).
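On finitely supported elements, the maps F_j and E_i above simply discard coefficients, so the commutativity of one square of the ladder can be checked on a toy case. The sketch below takes the chain G_1 ≤ G_2 in which G_2 is the one-edge graph e = v_1 e v_2 and G_1 is the trivial graph at v_1. It assumes that an element Σ t_w L_w is stored as a dictionary from words to coefficients; the names `restrict`, `GROUPOID`, `VERTICES`, `F2` and `E` are hypothetical.

```python
def restrict(x, keep):
    """Keep only the terms of x = sum_w t_w L_w whose index w lies in `keep`."""
    return {w: t for w, t in x.items() if w in keep}


# Nonempty groupoid elements and vertices of the two graphs in the chain G1 <= G2
GROUPOID = {1: {"v1"}, 2: {"v1", "v2", "e", "e^-1"}}
VERTICES = {1: {"v1"}, 2: {"v1", "v2"}}


def F2(x):
    """F_2 : M_{G_2} -> M_{G_1}, dropping every term not indexed by the groupoid of G_1."""
    return restrict(x, GROUPOID[1])


def E(i, x):
    """E_i : M_{G_i} -> D_{G_i}, keeping only the vertex terms."""
    return restrict(x, VERTICES[i])
```

For any coefficient dictionary `x` supported on the groupoid of G_2, the two routes around the square, `E(1, F2(x))` and `F2(E(2, x))`, produce the same dictionary, since both keep exactly the coefficient of v_1.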

4. Quotient Graphs and Quotient Groupoids

Fix a full-subgraph K of G. Then, as a new directed graph, the graph K has its own graph groupoid 𝕂. We can understand the graph groupoid 𝕂 of K as a substructure of the graph groupoid 𝔾 of G. Indeed, 𝕂 is a subgroupoid of 𝔾, in the sense of Section 1. Then we can consider the quotient structure 𝔾 / 𝕂, in the natural manner. The main purpose of this section is to find a directed graph G_K having 𝔾 / 𝕂 as its graph groupoid, whenever the quotient groupoid 𝔾 / 𝕂 is well-determined. If we can find such a graph G_K, where 𝔾 / 𝕂 is the quotient groupoid in the sense of Section 1, then such quotient algebraic structures are again graph groupoids of certain graphs, and hence they can be understood as graph groupoids. Since we know the structures and representations of (arbitrary) graph groupoids, the study of quotient groupoids induced by graph groupoids then reduces to the study of graph groupoids. Notice that the quotient groupoid 𝔾 / 𝕂, induced by the subgroupoid inclusion 𝕂 ⊂ 𝔾, is purely algebraic. So, if such quotient groupoidal structures are again graph groupoids of certain graphs, then this relation provides another bridge between combinatorics and algebra. As before, we assume that all our graphs are connected.

4.1. Quotient Graphs

In this section, we introduce a new type of directed graphs, the so-called quotient graphs. To define a quotient graph G_K induced by the partial ordering K ≤ G, called the full-subgraph-inclusion from now on, we need to define the "collapsing".

Definition 4.1 Let G be a countable directed graph and let K ≤ G be a full-subgraph of G. Identify all elements in V(K) ∪ E(K) with the ideal vertex v_K, inside G. This process is called the collapsing of K in G, and the identified vertex v_K is said to be the collapsed vertex of K.

By collapsing a full-subgraph K in G, we can create a new directed graph G_K.


281

Definition 4..2 Let G be a countable directed graph and assume we have a full-subgraphinclusion K ≤ G. Suppose we do the collapsing K in G and let vK be the collapsed vertex of K. Define a new directed graph GK by a graph with its vertex set V (GK ) = {vK } ∪ (V (G) \ V (K)) and its edge set E(GK ) = E(G) \ E(K), under the identification (or collapsing) rule: if e ∈ E(G) \ E(K), with e = e v or e = e = vK e e. v e, for v ∈ V (K), then identify it to the new edge ee = e e vK , respectively, e This new graph GK is said to be the quotient graph of G by K. Notation and Assumption In the rest of this paper, whenever we mention a quotient graph GK or the construction of a quotient graph GK , if we write the set E(G) \ E(K), then this means not only a set-theoretical notation, but also our edge set E(GK ) of the graph GK , under the identification (or collapsing) rule. Example 4..1 (1) Let G be a countable directed graph. Then the graph G, itself is a fullsubgraph of G. Then we can do collapsing G in G and we can create the collapsed vertex vG . By definition, the quotient graph GG is nothing but the trivial graph with V (GG ) = {vG } and E(GG) = ∅, where ∅ means the empty set. (2) Let G be a countable directed graph and let Gv be a full-subgraph with V (Gv ) = {v} and E(Gv ) = ∅. Then, by doing the collapsing Gv in G, we can get the collapsed vertex vGv = v. So, the quotient graph GGv is G, itself. (3) Let ON be the one-vertex-N -loop-edge graph with V (ON ) = {v} and E(ON ) = {ej = v ej v : j = 1, ..., N }. Then we can take a full-subgraph K of G with E(K) = {ei1 , ..., eik } and V (K) = {v}, where k < N. Then this full-subgraph K is graph-isomorphic to the one-vertex-k-loop-edge graph Ok . By the collapsing K in ON , we can get the collapsed vertex vK , identified with v. Then the quotient graph (ON )K is graph-isomorphic to the one-vertex-(N − k)-loop-edge graph ON −k . 
(4) Let C3 be the one-flow circulant graph with V (C3 ) = {v1, v2 , v3 } and E(C3) = {e12, e23 , e31 }, where eij means a directed edge connecting the initial vertex vi to the terminal vertex vj . Take a full-subgraph K of G with E(K) = {e12, e23 } and V (K) = {v1, v2, v3} = V (G). By collapsing K in G, we can get the collapsed vertex vK and the quotient graph GK , which is the graph with V (GK ) = {vK } and E(GK ) = {e31}. Here, notice that the vertices v3 and v1 are identified with vK , by the collapsing K. Therefore, we can understand the edge e31 as e31 = vK e31 vK which is a loop-edge. So, the quotient graph GK is graph-isomorphic to the one-vertex-one-loop-edge graph O1. (5) Let C3 be given as in (4). Now, take a full-subgraph K with E(K) = {e12 } and V (K) = {v1, v2 }. Then the quotient graph GK is a graph with V (GK ) = {vK , v3} and E(GK ) = {e23 , e31 }. Since v1 and v2 are collapsed to vK , we can rewrite E(GK ) = {e23 = vK e23 v3 , e31 = v3 e31 vK }. Therefore, GK is graph-isomorphic to the one-flow circulant graph C2 . (6) Let T23 be the 2-regular 3-story growing tree with its root v1 ,

T23:
    v1 → v11, v12
    v11 → v111, v112          v12 → v121, v122
    v111 → v1111, v1112       v112 → v1121, v1122
    v121 → v1211, v1212       v122 → v1221, v1222

Take a full-subgraph K of T23 as the graph with E(K) = {[v11, v111], [v11, v112]} and V(K) = {v11, v111, v112}, where [v, v′] means an edge connecting the vertex v to the vertex v′. Then, by collapsing K in T23, we have the following quotient graph (T23)_K:

(T23)_K:
    v1 → vK, v12
    vK → v1111, v1112, v1121, v1122
    v12 → v121, v122
    v121 → v1211, v1212       v122 → v1221, v1222
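The collapsing rule of the quotient graph is easy to express as a function on a finite graph. The sketch below is one possible reading, under the assumption that a graph is given by a vertex set and a dictionary `name -> (initial vertex, terminal vertex)`, and that the collapsed vertex gets a label (here `"vK"`) that is not already a vertex name; the function name `quotient_graph` is hypothetical.

```python
def quotient_graph(vertices, edges, k_vertices, k_edges, collapsed="vK"):
    """Collapse the full-subgraph K = (k_vertices, k_edges) of G to one vertex.

    V(G_K) = {vK} united with V(G) minus V(K); E(G_K) = E(G) minus E(K),
    where every end point lying in V(K) is replaced by the collapsed vertex.
    """
    def rename(v):
        return collapsed if v in k_vertices else v

    q_vertices = {collapsed} | (set(vertices) - set(k_vertices))
    q_edges = {name: (rename(s), rename(t))
               for name, (s, t) in edges.items() if name not in k_edges}
    return q_vertices, q_edges
```

Applied to the circulant graph C_3 with K given by the edges e_12, e_23 and all three vertices, the function returns a single vertex with the loop-edge e_31, as in part (4) of the example; with K given by e_12 and the vertices v_1, v_2, it returns the two-vertex graph with e_23 running from v_K to v_3 and e_31 running from v_3 to v_K, as in part (5).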

Let K_1 and K_2 be full-subgraphs of G. Then we can construct a new full-subgraph K = K_1 ∪ K_2 as the directed graph with V(K) = V(K_1) ∪ V(K_2) and E(K) = E(K_1) ∪ E(K_2). We can find the relation between the quotient graphs G_{K_1}, G_{K_2} and G_K.

Theorem 4.1 Let G be a countable directed graph and let K_1 and K_2 be full-subgraphs of G. Also, let K be the full-subgraph defined by K_1 ∪ K_2, as above. Then

\[ G_K \;\overset{\text{Graph}}{=}\; (G_{K_1})_{K_2^1} \;\overset{\text{Graph}}{=}\; (G_{K_2})_{K_1^2}, \]

where "Graph=" means "graph-isomorphic" and K_i^j is the full-subgraph of G_{K_j} with its edge set


283

E(Kij ) = E(Ki) \ E(Kj ) and its vertex set V (Kij ) = {vKj } ∪ (V (Ki) \ V (Kj )) , for i 6= j ∈ {1, 2}. Here, vKi means the collapsed vertex of Ki in G, for all i ∈ {1, 2}. Proof. Let G, K1 and K2 be given as above. By the previous paragraph, we can construct a new full-subgraph K = K1 ∪ K2. Then we can create the quotient graph GK , by collapsing K in G. Then the graph GK has its edge set E(GK ) = E(G) \ E(K) and its vertex set V (GK ) = {vK } ∪ (V (G) \ V (K)), where vK is the collapsed vertex of K. Fix the fullsubgraph K1 ⊂ G. Then the quotient graph GK1 is the graph with its edge set E(GK1 ) = E(G) \ E(K1) and its vertex set V (GK1 ) = {vK1 } ∪ (V (G) \ V (K1)). Now, define a directed graph K21 with its edge set E(K21) = E(K2) \ E(K1) and its vertex set V (K21 ) = {vK1 } ∪ (V (K2) \ V (K1)) , where vK2 is the collapsed vertex of K2 in G. Then this new directed graph K21 is a full-subgraph of GK1 . Indeed, the edge set E(K21) satisfies E(K21) = E(K2) \ E(K1) = E(GK1 ) \ E(K1) and the vertex set V (K21) satisfies V (K21) = {v ∈ V (GK1 ) : e = v e or e = e v, ∀ e ∈ E(GK1 )}. By regarding K21 as a full-subgraph of GK1 , we can do collapsing K21 in GK1 . The collapsed vertex vK21 is determined (and it is identified with vK1 ). So, we have the quotient graph (GK1 )K21 . This graph has its vertex set V (GK1 )K 1 = {vK 1 = vK1 } ∪ (V (G) \ (V (K1) ∪ V (K2))) 2

2

and it has its edge set E (GK1 )K21 = E(G) \ (E(K1) ∪ E(K2)) . Since the full-subgraph K = K1 ∪ K2 , the quotient graph GK is a directed graph with V (GK ) = {vK } ∪ (V (G) \ V (K)) = {vK } ∪ (V (G) \ (V (K1) ∪ V (K2)))

and

E(G_K) = E(G) \ E(K) = E(G) \ (E(K_1) ∪ E(K_2)).

Therefore, the graphs G_K and (G_{K_1})_{K_2^1} are graph-isomorphic, by identifying v_K and v_{K_2^1}. Similarly, we can check that the quotient graph (G_{K_2})_{K_1^2} is graph-isomorphic to G_K, too.

The above theorem shows that a quotient graph G_K of G by a full-subgraph K = K_1 ∪ K_2 of G, where K_1 and K_2 are full-subgraphs of G, can be understood as a quotient graph (G_{K_i})_{K_j^i} of G_{K_i} by K_j^i, where K_j^i is a full-subgraph of G_{K_i} determined by K_j, for i ≠ j ∈ {1, 2}. The graphs K_i^j are said to be the full-subgraphs of G_{K_j} determined by K_i, for i ≠ j ∈ {1, 2}.

Notation Let G be a countable directed graph and let K be a full-subgraph of G. And let G_K be the quotient graph of G by K. Assume that H is a full-subgraph of G_K. Then the quotient graph (G_K)_H of G_K by H is denoted by G_{K:H}. Inductively, we can have a quotient graph G_{H_1:H_2:...:H_n} of G_{H_1:...:H_{n−1}}, for all n ∈ {3, 4, ...}.

The following corollary is a direct consequence of the above theorem.

Corollary 4.2 Let K_1, ..., K_n be full-subgraphs of G, for n ∈ N, and let K = ∪_{j=1}^{n} K_j be the full-subgraph with V(K) = ∪_{j=1}^{n} V(K_j) and E(K) = ∪_{j=1}^{n} E(K_j). Then the quotient graph G_K is graph-isomorphic to

\[ G_{\,K_{\sigma(1)} \,:\, K_{\sigma(2)}^{\sigma(1)} \,:\, K_{\sigma(3)}^{\sigma(1),\sigma(2)} \,:\, \cdots \,:\, K_{\sigma(n)}^{\sigma(1),\ldots,\sigma(n-1)}}, \]

for all σ ∈ S_n, where S_n is the symmetric group, for n ∈ N, and where K_{σ(2)}^{σ(1)} means the full-subgraph of K_{σ(1)} determined by K_{σ(2)}, and K_{σ(3)}^{σ(1),σ(2)} means the full-subgraph of K_{σ(2)}^{σ(1)} determined by K_{σ(3)}, and, inductively, K_{σ(t)}^{σ(1),...,σ(t−1)} means the full-subgraph of K_{σ(t−1)}^{σ(1),...,σ(t−2)} determined by K_{σ(t)}, for all t = 4, ..., n.

Example 4.2 Let G be the one-flow circulant graph C_5 with V(G) = {v_1, ..., v_5} and E(G) = {e_12, e_23, e_34, e_45, e_51}. Fix the full-subgraphs K_1 and K_2 with V(K_1) = {v_1, v_2, v_3} and E(K_1) = {e_12, e_23}, and V(K_2) = {v_2, v_3, v_4} and E(K_2) = {e_23, e_34}. Then we can create a new full-subgraph K = K_1 ∪ K_2 of G as the directed graph with


V(K) = {v_1, v_2, v_3, v_4} and E(K) = {e_12, e_23, e_34}.

By collapsing K in G, we have the collapsed vertex v_K, and hence the quotient graph G_K with

V(G_K) = {v_K, v_5} and E(G_K) = {e_45 = v_K e_45 v_5, e_51 = v_5 e_51 v_K}.

We can easily see that the quotient graph G_K is graph-isomorphic to the one-flow circulant graph C_2. Now, consider the quotient graph G_{K_1} of G by K_1. By collapsing K_1 in G, we find the collapsed vertex v_{K_1} and we create G_{K_1} as the directed graph with

V(G_{K_1}) = {v_{K_1}, v_4, v_5} and E(G_{K_1}) = {e_34 = v_{K_1} e_34 v_4, e_45 = v_4 e_45 v_5, e_51 = v_5 e_51 v_{K_1}}.

Consider the full-subgraph K_2^1 of G_{K_1} determined by K_2. It is the full-subgraph of G_{K_1} with

V(K_2^1) = {v_{K_1}, v_4} and E(K_2^1) = {e_34 = v_{K_1} e_34 v_4}.

Now collapse K_2^1 in G_{K_1}. Then we get the collapsed vertex v_{K_2^1}, and we have the quotient graph G_{K_1:K_2^1} with

V(G_{K_1:K_2^1}) = {v_{K_2^1}, v_5}

and

E(G_{K_1:K_2^1}) = {e_45 = v_{K_2^1} e_45 v_5, e_51 = v_5 e_51 v_{K_2^1}}.

The graph G_{K_1:K_2^1} is graph-isomorphic to the one-flow circulant graph C_2. Thus we can see that the graphs G_K and G_{K_1:K_2^1} are graph-isomorphic, too.
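The comparison carried out in this example can be replayed in code: collapse K = K_1 ∪ K_2 in one step, or collapse K_1 first and then the full-subgraph of the quotient determined by K_2. The sketch below makes one simplifying assumption that is not in the text: both collapsed vertices are given the same label `"vK"`, so that the graph-isomorphism becomes literal equality of vertex sets and edge dictionaries. The names `collapse`, `union_quotient` and `iterated_quotient` are hypothetical.

```python
def collapse(vertices, edges, k_vertices, k_edges, label="vK"):
    """Quotient graph: remove the edges of K and merge the vertices of K into `label`."""
    def rename(v):
        return label if v in k_vertices else v
    q_vertices = {label} | (set(vertices) - set(k_vertices))
    q_edges = {n: (rename(s), rename(t))
               for n, (s, t) in edges.items() if n not in k_edges}
    return q_vertices, q_edges


V = {f"v{i}" for i in range(1, 6)}
C5 = {"e12": ("v1", "v2"), "e23": ("v2", "v3"), "e34": ("v3", "v4"),
      "e45": ("v4", "v5"), "e51": ("v5", "v1")}
K1 = ({"v1", "v2", "v3"}, {"e12", "e23"})
K2 = ({"v2", "v3", "v4"}, {"e23", "e34"})


def union_quotient():
    """Collapse K = K1 united with K2 in a single step."""
    return collapse(V, C5, K1[0] | K2[0], K1[1] | K2[1])


def iterated_quotient():
    """Collapse K1 first, then the full-subgraph of the quotient determined by K2."""
    g_vertices, g_edges = collapse(V, C5, *K1)
    k21_vertices = {"vK"} | (K2[0] - K1[0])   # collapsed vertex plus V(K2) minus V(K1)
    k21_edges = K2[1] - K1[1]                 # E(K2) minus E(K1)
    return collapse(g_vertices, g_edges, k21_vertices, k21_edges)
```

Both functions return the same two-vertex graph, with e_45 running from the collapsed vertex to v_5 and e_51 running back, which is the circulant shape C_2 found in the example.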


4.2. Quotient Groupoids Induced by Graph-Groupoid Inclusions

Let G be a countable directed graph and let K ≤ G be a full-subgraph of G. And let 𝔾 and 𝕂 be the graph groupoids of G and K, respectively. In this section, we construct a quotient groupoid 𝔾 / 𝕂, if it exists, from the groupoid inclusion 𝕂 ⊂ 𝔾. To do that, we consider the conditions under which we can construct quotient groupoids 𝔾 / 𝕂. Notice that, even though the groupoids 𝔾 and 𝕂 are constructed from the directed graphs G and K, respectively, the quotient groupoid 𝔾 / 𝕂 (if it exists) is defined purely algebraically, for the given inclusion 𝕂 ⊂ 𝔾, as a groupoidal algebraic quotient structure. It is easy to see that the graph groupoid 𝕂 of a full-subgraph K of the given graph G is a subgroupoid of the graph groupoid 𝔾 of G. Purely algebraically, we can define the quotient structure 𝔾 / 𝕂, generated by the inclusion 𝕂 ⊂ 𝔾, if w 𝕂 w^{-1} ⊆ 𝕂, for all w ∈ 𝔾. Then the quotient groupoid 𝔾 / 𝕂 is well-defined and it is again a (categorical) groupoid (in the sense of Section 1).

Let 𝕂 ⊂ 𝔾 be a subgroupoid inclusion, where 𝕂 and 𝔾 are groupoids. Then, under a certain condition, we can define the quotient groupoid 𝔾 / 𝕂 as the quadruple (𝔾 / 𝕂, B / 𝕂, s, r) with

𝔾 / 𝕂 def= {x 𝕂, 𝕂 x : x ∈ 𝔾 \ 𝕂}, B / 𝕂 def= {y 𝕂, 𝕂 y : y ∈ B \ 𝕂},

s(x 𝕂) = s(x) 𝕂, r(x 𝕂) = 𝕂, and s(𝕂 x) = 𝕂, r(𝕂 x) = 𝕂 r(x).

However, to construct the quotient groupoid 𝔾 / 𝕂, we need the following condition: x 𝕂 x^{-1} ⊆ 𝕂, for all x ∈ 𝔾. In such a case, the algebraic structure 𝔾 / 𝕂, induced by the subgroupoid inclusion 𝕂 ⊂ 𝔾, is again a groupoid.

Definition 4.3 Let 𝕂 ⊂ 𝔾 be a subgroupoid inclusion, where 𝔾 is a groupoid and 𝕂 is a subgroupoid of 𝔾. Suppose 𝕂 satisfies x 𝕂 x^{-1} ⊆ 𝕂, for all x ∈ 𝔾. Construct the groupoid 𝔾 / 𝕂 as in the previous paragraph. This groupoid 𝔾 / 𝕂 is called the quotient groupoid of 𝔾 by 𝕂.

Since the graph-groupoid inclusion 𝕂 ⊂ 𝔾, generated by the full-subgraph inclusion K ≤ G, is again a subgroupoid inclusion, we may define the corresponding quotient groupoid 𝔾 / 𝕂, under a certain condition. Let K ≤ G be a full-subgraph inclusion and let 𝕂 ⊂ 𝔾 be the corresponding subgroupoid inclusion. Consider the quotient-groupoid-condition,


w 𝕂 w^{-1} ⊆ 𝕂, for all w ∈ 𝔾.

Let w ∈ 𝕂. Then, clearly, w 𝕂 w^{-1} ⊆ 𝕂. Assume now that w ∈ 𝔾 \ 𝕂, and let w = v w v′, with v, v′ ∈ V(Ĝ). Note that the vertices v and v′ of w are not necessarily distinct in V(Ĝ). Moreover, w itself can be a vertex in V(Ĝ). Suppose w = v is a vertex in V(Ĝ). Then v 𝕂 v is either {∅} or loop_v(K̂), where

loop_v(K̂) def= {w ∈ 𝕂 : w = v w v} ⊆ 𝕂.

In particular, if v ∈ V(Ĝ) \ V(K̂), then v 𝕂 v = {∅}, and if v ∈ V(K̂), then v 𝕂 v = loop_v(K̂).

Lemma 4.3 Let v ∈ V(Ĝ). Then v 𝕂 v is identified with either {∅} or loop_v(K̂) ⊆ 𝕂. This shows that v 𝕂 v ⊆ 𝕂, for all v ∈ V(Ĝ).

The above lemma shows that, to check the quotient-groupoid-condition for the subgroupoid inclusion 𝕂 ⊂ 𝔾, it is sufficient to consider the reduced finite paths w ∈ FP_r(Ĝ) of the shadowed graph Ĝ, and to decide whether w 𝕂 w^{-1} ⊆ 𝕂 or not.

Our main purpose is to find the graph-theoretical characterization of the quotient-groupoid-condition for the groupoid inclusion 𝕂 ⊂ 𝔾 induced by a full-subgraph inclusion K ≤ G. Suppose now that w = v w v′ ∈ FP_r(Ĝ), with v, v′ ∈ V(Ĝ), in 𝔾. To satisfy w 𝕂 w^{-1} ⊆ 𝕂, the fixed reduced finite path w = v w v′ must satisfy that the initial vertex v of w is contained in V(K̂) (⊆ V(Ĝ)).

Lemma 4.4 Let 𝕂 ⊂ 𝔾 be the given subgroupoid inclusion induced by a full-subgraph inclusion K ≤ G, and let w = v w v′ ∈ FP_r(Ĝ) be a reduced finite path with v, v′ ∈ V(Ĝ), in 𝔾. Then w 𝕂 w^{-1} ⊆ 𝕂, if and only if the initial vertex v of w is contained in V(K̂).

Proof. To prove the above characterization, we use the quotient-graph technique. Let 𝕂 ⊂ 𝔾 be given, induced by K ≤ G. Then we can construct the quotient graph G_K of G by K, by collapsing K into the collapsed vertex v_K. In other words, the full-subgraph K is identified with the vertex v_K of G_K. Then we can construct the corresponding graph groupoid 𝔾_K of G_K.
(⇐) Suppose w = v w v′ ∈ FP_r(Ĝ) is a reduced finite path with v, v′ ∈ V(Ĝ), and assume that the initial vertex v of w is contained in the vertex set V(K̂) = V(K). Then, after collapsing K into the collapsed vertex v_K of the quotient graph G_K, the element w is identified with the element w = v_K w v′ in the shadowed graph Ĝ_K of the quotient graph G_K of G by K. Since the graph groupoid 𝕂 of the given full-subgraph K is identical to the trivial subgroupoid {v_K} of 𝔾_K, the subset w 𝕂 w^{-1} of 𝔾 is identified with w v_K w^{-1} = w w^{-1} = v, and, by hypothesis, it is contained in V(Ĝ_K) ⊂ 𝔾_K. This is equivalent to w 𝕂 w^{-1} ⊆ 𝕂, in 𝔾.

(⇒) Suppose w = v w v′ ∈ FP_r(Ĝ) and let w 𝕂 w^{-1} ⊆ 𝕂. Notice that the graph groupoid 𝕂 of K is identified with the collapsed vertex v_K in the graph groupoid 𝔾_K of the

quotient graph G_K of G by K. By hypothesis, the subset w 𝕂 w^{-1} of 𝔾 is also identical to the collapsed vertex v_K of G_K. Assume now that v is not contained in V(K̂). Then, inside the graph groupoid 𝔾_K of the quotient graph G_K, the subset w 𝕂 w^{-1} is identified with w v_K w^{-1}. Since v is not in V(K̂), the element w v_K w^{-1} is either ∅ or v (≠ v_K) in 𝔾_K. If it is ∅ (where w and v_K, or equivalently, v′ and v_K, are not admissible), then there is nothing to check, but if it is nonempty, then w v_K w^{-1} = w w^{-1} = v ≠ v_K, in general. This shows that w 𝕂 w^{-1} ⊄ 𝕂 in 𝔾, in general. This contradicts our assumption.

Thus, we obtain the following theorem.

Theorem 4.5 Let 𝕂 ⊂ 𝔾 be a subgroupoid inclusion of graph groupoids 𝔾 and 𝕂, induced by a full-subgraph inclusion K ≤ G. Then the quotient groupoid 𝔾 / 𝕂 is well-determined, if and only if all reduced finite paths contained in 𝔾 \ 𝕂 have both their initial and terminal vertices in the vertex set V(K) = V(K̂) of the full-subgraph K.

Proof. By the quotient-groupoid-condition, regarding the graph-groupoid inclusion 𝕂 ⊂ 𝔾 as a subgroupoid inclusion, the quotient groupoid 𝔾 / 𝕂 is a well-defined groupoid if and only if w 𝕂 w^{-1} ⊆ 𝕂 in 𝔾, for all w ∈ 𝔾. And, by the previous lemmas, it is enough to consider the case where the w are reduced finite paths, i.e., the quotient-groupoid-condition holds if and only if w 𝕂 w^{-1} ⊆ 𝕂, for all w ∈ FP_r(Ĝ). Furthermore, we observed that, for any fixed w_0 ∈ FP_r(Ĝ), the set-inclusion w_0 𝕂 w_0^{-1} ⊆ 𝕂 holds in 𝔾, if and only if w_0 = v_0 w_0, with v_0 ∈ V(K) = V(K̂) ⊆ V(Ĝ). Therefore, the quotient groupoid 𝔾 / 𝕂 is well-determined if and only if every reduced finite path w = v w v′ in FP_r(Ĝ), with v, v′ ∈ V(Ĝ), satisfies v, v′ ∈ V(K̂).

The above theorem provides a characterization of the existence of quotient groupoids.

Proposition 4.6 Let 𝕂 ⊂ 𝔾 be a subgroupoid inclusion of graph groupoids, induced by a full-subgraph inclusion K ≤ G. Suppose that there exists the corresponding quotient groupoid 𝔾 / 𝕂. Then it is identified with the graph groupoid 𝔾_K of the quotient graph G_K of G by K.

Proof. Let 𝕂 ⊂ 𝔾 be given as above, and assume that the corresponding quotient groupoid 𝔾 / 𝕂 is well-defined. Equivalently, assume that w 𝕂 w^{-1} ⊆ 𝕂, for all w ∈ 𝔾. Then, by the very definition of quotient groupoids and by the previous observations, the quotient groupoid 𝔾 / 𝕂 and the graph groupoid 𝔾_K of G_K are groupoid-isomorphic. Indeed, we can define a groupoid-isomorphism Φ : 𝔾 / 𝕂 → 𝔾_K by

Φ(w 𝕂) = w v_K = w ∈ 𝔾_K and Φ(𝕂 w) = v_K w = w ∈ 𝔾_K,

for all w 𝕂, 𝕂 w ∈ 𝔾 / 𝕂.

By the above theorem, we get the following theorem, which gives the graph-theoretical characterization of the existence of quotient groupoids. Recall, however, that all our graphs are connected.



Theorem 4.7 Let K ≤ G be a full-subgraph inclusion with its corresponding subgroupoid inclusion K ⊂ G. Then the vertex sets of G and K satisfy V(G) = V(K) if and only if the subgroupoid inclusion induces the quotient groupoid G / K.

Proof. (⇒) Assume that V(G) = V(K). Then, for any w = v w v′ ∈ FP_r(Ĝ), with v, v′ ∈ V(Ĝ), the initial and terminal vertices v and v′ of w are contained in V(K̂) = V(K) = V(G) = V(Ĝ). Therefore, by the previous theorem, the quotient groupoid G / K is well-determined.

(⇐) Assume now that the quotient groupoid G / K is well-determined by the given subgroupoid inclusion K ⊂ G. By the previous proposition, this quotient groupoid G / K is groupoid-isomorphic to the graph groupoid G_K of the quotient graph G_K of G by K. Again, by the previous theorem, the quotient graph G_K is graph-isomorphic to the one-vertex-multi-loop-edge graph. Indeed, all edges e ∈ E(Ĝ) ⊂ FP_r(Ĝ) should satisfy the condition of the above theorem for the existence of G / K; i.e., to satisfy both e K e⁻¹ ⊆ K and e⁻¹ K e ⊆ K in G, every edge e should satisfy e = v_1 e v_2 ∈ E(Ĝ) with v_1, v_2 ∈ V(K). This shows that the existence of the quotient groupoid G / K guarantees the equality of the vertex sets V(G) and V(K).

The above theorem provides the graph-theoretical characterization of the existence of quotient groupoids for subgroupoid inclusions induced by full-subgraph inclusions (where the given graphs are connected). As we have seen in the proof of the previous theorem, we obtain another graph-theoretical characterization of the existence of quotient groupoids:

Theorem 4.8 Let K ≤ G be a full-subgraph inclusion with the corresponding subgroupoid inclusion K ⊂ G. Then the quotient groupoid G / K is well-defined if and only if the quotient graph G_K of G by K is graph-isomorphic to the one-vertex-multi-loop-edge graph.

Proof. The proof is easy, by the previous theorem. We know that the quotient groupoid G / K is well-determined if and only if V(G) = V(K). This shows that, in the quotient graph G_K of G by K, the only vertex is the collapsed vertex v_K; i.e., V(G_K) = {v_K}. Therefore, all nonempty edges of G_K should be loop-edges.

Example 4.3 Let G_n be the one-vertex-n-loop-edge graph, for all n ∈ ℕ. Assume that n > 1, and take k < n in ℕ. Then we can define a full-subgraph K of G_n consisting of k loop-edges; i.e., the graph K is graph-isomorphic to the one-vertex-k-loop-edge graph G_k. Then both the initial and terminal vertices of any reduced finite path of G_n are contained in V(G_k), since V(G_n) = V(G_k). Therefore, w G_k w⁻¹ ⊆ G_k, for all w ∈ G_n, where G_p is the graph groupoid induced by the one-vertex-p-loop-edge graph G_p, for all p ∈ ℕ. Therefore, the quotient groupoid G_n / G_k is well-defined, by the previous theorem. Moreover, by Proposition 4.6, this quotient groupoid G_n / G_k is groupoid-isomorphic to the graph groupoid G_{n:k} of the quotient graph G_{n:k} of G_n by G_k. It is easy to check that the graph G_{n:k} is graph-isomorphic to G_{n−k} (see [4], [5] and [8]). So, by [4] and [5], the graph groupoid G_{n:k} is groupoid-isomorphic to G_{n−k}. Therefore, the quotient groupoid G_n / G_k is groupoid-isomorphic to G_{n−k}; i.e.,

$$G_n / G_k \overset{\text{Groupoid}}{=} G_{n:k} \overset{\text{Groupoid}}{=} G_{n-k}, \quad \text{for all } k < n \in \mathbb{N},$$

where “$\overset{\text{Groupoid}}{=}$” means “being groupoid-isomorphic”. Recall that, in [4] and [5], we showed that the graph groupoid G_n of G_n is a group, which is group-isomorphic to the free group F_n with n generators, for all n ∈ ℕ. Therefore, we have

$$F_n / F_k \overset{\text{Group}}{=} F_{n-k}, \quad \text{for all } k < n \in \mathbb{N},$$

where “$\overset{\text{Group}}{=}$” means “being group-isomorphic”.

The above theorem shows that the quotient groupoid G / K is well-defined if and only if the full-subgraph inclusion K ≤ G induces a quotient graph G_K which is graph-isomorphic to the one-vertex-multi-loop-edge graph. Since G / K (if it exists) is groupoid-isomorphic to the graph groupoid G_K of the quotient graph G_K, we obtain the following corollary.

Corollary 4.9 Suppose the quotient groupoid G / K is well-determined, where K ⊂ G is the subgroupoid inclusion induced by the full-subgraph inclusion K ≤ G. Then the groupoid G / K is a group, and it is group-isomorphic to the free group F_{|E(G_K)|}, where E(G_K) is the edge set of the quotient graph G_K.

Proof. Since the quotient groupoid G / K is well-defined, the quotient graph G_K is graph-isomorphic to the one-vertex-multi-loop-edge graph G(n), where n = |E(G_K)|. We know that the graph groupoid G(n) of G(n) is a group, which is group-isomorphic to the free group F_n with n generators. So the graph groupoid G_K is groupoid-isomorphic to F_n, too. Therefore, the quotient groupoid G / K is groupoid-isomorphic to the group F_n.
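The criterion of Theorems 4.7 and 4.8 and the generator count of Corollary 4.9 can be tried out on small graphs. The following Python sketch is only an illustration and is not taken from [4], [5] or [8]. It assumes that a directed graph is stored as a list of (name, source, target) edges, that a full-subgraph K is specified by a set of edge names (as in Example 4.3), and that the quotient graph is obtained by deleting the edges of K and renaming every vertex of K to a single vertex called v_K. The function names are hypothetical.

```python
def vertices(edges):
    """Vertex set of a graph given as (name, source, target) triples."""
    return {v for _, s, t in edges for v in (s, t)}


def quotient_graph(edges, sub_names):
    """Collapse the subgraph K spanned by the edges named in sub_names.

    Assumption (illustrative only): the quotient graph keeps the edges
    outside K and merges all vertices of K into one vertex 'v_K'.
    """
    v_k = vertices([e for e in edges if e[0] in sub_names])

    def collapse(v):
        return "v_K" if v in v_k else v

    return [(n, collapse(s), collapse(t)) for n, s, t in edges if n not in sub_names]


def quotient_groupoid_rank(edges, sub_names):
    """Return n = |E(G_K)| when V(G) = V(K) (Theorem 4.7), otherwise None."""
    sub = [e for e in edges if e[0] in sub_names]
    if vertices(edges) != vertices(sub):
        return None  # some edge starts or ends outside V(K)
    q = quotient_graph(edges, sub_names)
    # Theorem 4.8: only loop-edges at the collapsed vertex remain.
    assert all(s == t == "v_K" for _, s, t in q)
    return len(q)


# Example 4.3 with n = 5 and k = 2: five loops at one vertex, K spanned by two.
g5 = [(f"e{i}", "v", "v") for i in range(5)]
rank = quotient_groupoid_rank(g5, {"e0", "e1"})

# A directed 3-cycle with K spanned by a single edge: V(K) differs from V(G).
cycle = [("a", 1, 2), ("b", 2, 3), ("c", 3, 1)]
no_quotient = quotient_groupoid_rank(cycle, {"a"})
```

Under these assumptions, `rank` is n − k = 3, in agreement with Example 4.3, while `no_quotient` is `None` because the vertex 3 does not belong to V(K).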

4.3. Representations of Quotient Groupoids

Let Γ be a group. Then the group Hilbert space H_Γ is defined to be the l²-space l²(Γ) generated by Γ. The canonical (left) group action λ : Γ → B(H_Γ) is defined by the left regular representation,

$$\lambda(g) \overset{\text{def}}{=} u_g \in B(H_\Gamma),$$

where u_g is the unitary operator with adjoint u_g* = u_{g⁻¹}, for all g ∈ Γ. Thus we get the canonical representation (H_Γ, λ) of a group Γ. The group von Neumann algebra

$$L(\Gamma) \overset{\text{denote}}{=} vN(\Gamma) \overset{\text{def}}{=} \overline{\mathbb{C}[\lambda(\Gamma)]}^{\,w}$$

is naturally defined, as a W*-subalgebra of B(H_Γ).

Again, recall that all our graphs are “connected.” Fix now a full-subgraph inclusion K ≤ G and the corresponding subgroupoid inclusion K ⊂ G. We observed that the quotient groupoid G / K is well-defined if and only if the quotient graph G_K is graph-isomorphic to the one-vertex-multi-loop-edge graph. This shows that the quotient groupoid G / K induced by the subgroupoid inclusion K ⊂ G, whenever it exists, is groupoid-isomorphic to a free group, since G / K is groupoid-isomorphic to the graph groupoid G_K of the quotient graph G_K induced by K ≤ G. Therefore, we obtain the following theorem.



Theorem 4.10 Let K ≤ G be a full-subgraph inclusion and K ⊂ G the corresponding subgroupoid inclusion. Assume that the quotient groupoid X = G / K is well-defined, and suppose it is non-trivial. Then the canonical representation of the groupoid X is equivalent to the canonical (group) representation (l²(F_n), λ), for n = |E(G_K)| ∈ ℕ, where λ is the left regular representation of the free group F_n.

By the previous theorem, we obtain the following corollary.

Corollary 4.11 Let K ≤ G and K ⊂ G be given as in the above theorem, and assume that the quotient groupoid G / K is non-trivially well-determined. Then the groupoid W*-algebra vN(G / K), induced by the canonical representation of G / K, is ∗-isomorphic to the group von Neumann algebra L(F_n), for some n ∈ ℕ.

Assume now that the quotient groupoid G / K is trivial; equivalently, assume that the quotient graph G_K induced by K ≤ G is graph-isomorphic to the trivial graph. Then, clearly, the groupoid W*-algebra vN(G / K) is ∗-isomorphic to ℂ, since the trivial groupoid has the canonical representation (ℂ, 1). So, more generally, we have the following theorem.

Theorem 4.12 Let K ≤ G be a full-subgraph inclusion and K ⊂ G the corresponding subgroupoid inclusion. Assume that the quotient groupoid X = G / K is well-determined. Then the canonical representation of X is equivalent to either the canonical group representation (l²(F_n), λ) of the free group F_n, for some n ∈ ℕ, or the trivial representation (ℂ, 1). So the groupoid W*-algebra vN(X) is ∗-isomorphic to either L(F_n) or ℂ.

Independently of our topic, the study of the group von Neumann algebras L(F_n) is one of the most interesting areas in von Neumann algebra theory. It is somewhat surprising that we still do not know whether the group von Neumann algebras L(F_2) and L(F_3) are ∗-isomorphic or not. Radulescu's famous alternative theorem says that either (1) or (2) holds true, where

(1) $L(F_n) \overset{*\text{-isomorphic}}{=} L(F_\infty)$, for all n ∈ ℕ,
(2) $L(F_n) \overset{*\text{-isomorphic}}{\neq} L(F_m)$, for all m ≠ n ∈ ℕ ∪ {∞},

where F_∞ means the free group with (countably) infinitely many generators (see [24]).
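The left regular representation that opens this subsection can be made concrete for a finite group. The free groups F_n are infinite, so the following Python sketch uses the cyclic group ℤ_n as a stand-in; this choice, and all function names, are assumptions made only for illustration. The sketch builds the matrix of u_g in the standard basis of l²(ℤ_n) and checks that λ is multiplicative, that u_g* = u_{g⁻¹}, and that each u_g is unitary.

```python
def left_regular(n, g):
    """Matrix of u_g on l^2(Z_n): u_g sends the basis vector delta_h to delta_{g+h}."""
    return [[1 if row == (g + col) % n else 0 for col in range(n)] for row in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def adjoint(a):
    """Adjoint of a real 0/1 matrix, which is simply its transpose."""
    return [list(col) for col in zip(*a)]


n = 5
identity = left_regular(n, 0)

# lambda(g) lambda(h) = lambda(g + h) for all g, h in Z_n
is_homomorphism = all(
    matmul(left_regular(n, g), left_regular(n, h)) == left_regular(n, (g + h) % n)
    for g in range(n) for h in range(n)
)

# u_g^* = u_{g^{-1}}, where the inverse of g in Z_n is -g mod n
adjoint_is_inverse = all(
    adjoint(left_regular(n, g)) == left_regular(n, (-g) % n) for g in range(n)
)

# u_g^* u_g = 1, so each u_g is unitary
is_unitary = all(
    matmul(adjoint(left_regular(n, g)), left_regular(n, g)) == identity
    for g in range(n)
)
```

For an infinite group such as F_n the same formulas define operators on the infinite-dimensional space l²(F_n), which is why a finite matrix model can only serve as an analogy here.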

5. Finite-Graph Fractaloids and Representations

As before, all graphs in this section are automatically assumed to be “connected.” In this section, we briefly introduce graph fractaloids in the sense of [16]. In [2], [16], [17], and [18], we considered a special kind of graph groupoids, namely groupoids with the fractal property. Let X be a groupoid and assume that X has the fractal property. Then we call X a fractaloid. In other words, the word “fractaloid” means “a groupoid with fractal property.” If a fractaloid is a graph groupoid, then we call it a graph fractaloid, to emphasize that it is induced by a graph. How, then, can we define the fractal property, sometimes called fractality? The fractality of groupoids is extended from the fractality of groups. Recently, fractal groups and fractality have been widely studied in pure mathematics, applied mathematics and many other scientific fields.


Assumption. In this section, for convenience, we restrict our interest to the case where all graphs are connected and “finite.” Recall that a graph G is finite if |V(G)| < ∞ and |E(G)| < ∞.

Let G be a connected finite graph and let v ∈ V(G). Define the in-degree deg_in(v) and the out-degree deg_out(v) of v in G by

$$\deg_{in}(v) \overset{\text{def}}{=} |\{e \in E(G) : e = e\,v\}|, \quad \text{respectively,} \quad \deg_{out}(v) \overset{\text{def}}{=} |\{e \in E(G) : e = v\,e\}|,$$

for all v ∈ V(G). The degree deg(v) of v ∈ V(G) in G is defined to be the sum of deg_in(v) and deg_out(v), i.e.,

$$\deg(v) \overset{\text{def}}{=} \deg_{in}(v) + \deg_{out}(v), \quad \text{for all } v \in V(G).$$

Under the finiteness of graphs, deg(v) < ∞, for all v ∈ V(G).

Originally, like fractal groups (see [1]), we defined graph fractaloids with the help of automata theory. Using the automata-theoretical labeling (or weighting) on graph groupoids, a given graph groupoid G is defined to be a graph fractaloid if it satisfies fractality (see [16]). Here, we only introduce the main results of [16], which are the characterizations of graph fractaloids (see also [17]). Notice, however, that the graphs in [16] and [17] are connected and “locally finite.” Recall that a graph G is locally finite if deg(v) < ∞, for all v ∈ V(G). So, in general, locally finite graphs can be infinite. For instance, all regular trees are locally finite graphs. Here we only consider finite graphs, which are automatically locally finite. The following theorem summarizes the main results of [16]:

Theorem 5.1 (See [16] and [17]) Let G be a connected “locally finite” (finite or infinite) directed graph with its graph groupoid G, and let A_G be the automaton induced by G, in the sense of [16]. Suppose T_G is the automata tree of A_G, in the sense of [16] (and [1]). Then the following statements are equivalent:
(1) G is a graph fractaloid.
(2) The automata actions {A_w : w ∈ F⁺(Ĝ)} act “fully” on T_G.
(3) The automata tree T_G is graph-isomorphic to the 2N-regular tree T_{2N}, where N = max{deg_out(v) : v ∈ V(G)}.



The characterization “(1) ⇔ (2)” provides the automata-theoretical characterization of graph fractaloids induced by connected locally finite directed graphs, and the characterization “(1) ⇔ (3)” provides the algebraic characterization of such graph fractaloids. Based on these characterizations, we found in [18] the graph-theoretical characterization of graph fractaloids induced by connected “finite” directed graphs.

Theorem 5.2 (See [18]) Let G be the graph groupoid of a connected “finite” directed graph G. Then G is a graph fractaloid if and only if the out-degrees of all vertices are identical in G; i.e., G is a graph fractaloid if and only if N = deg_out(v), for all v ∈ V(G), where N = max{deg_out(v) : v ∈ V(G)}.

If we consider “infinite” directed graphs G, generating their graph groupoids G, then, unfortunately, the above characterization does not hold true. For instance, in the n-regular tree G_o, all vertices have the same out-degree n, for n ∈ ℕ. However, the graph groupoid G_o of G_o is not a graph fractaloid, since the root v_o of G_o has in-degree 0, but all other vertices have in-degree 1. This prevents the automata actions from acting fully on the automata tree (see [2], [16], and [18]). The above theorem works only under the finiteness of graphs. In [18], however, we found the following “partial” graph-theoretical characterization of graph fractaloids induced by connected locally finite (finite or infinite) graphs:

Theorem 5.3 (See [18]) Let G be a connected locally finite (finite or infinite) graph with its graph groupoid G. Assume that G has neither sinks nor sources. Then G is a graph fractaloid if and only if the out-degrees of all vertices are identical in G.

Let G be an arbitrary graph and v ∈ V(G). Recall that the vertex v is a sink (resp., a source) if deg_out(v) = 0 (resp., deg_in(v) = 0) in G. Again, remark that all graphs are connected and finite. We will use the above characterization for defining the fractal graphs and the graph fractaloids.

Definition 5.1 Let G be a connected finite directed graph with its graph groupoid G. We say that the graph G is fractal (or G is a fractal graph) if the out-degrees of all vertices are identical. In this case, the graph groupoid G is said to be a graph fractaloid. To emphasize that we restrict our interest to the case where G is finite, we call the graph G a fractal finite graph, and the groupoid G a finite-graph fractaloid.

In [18], we showed that there are sufficiently many finite-graph fractaloids.
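The degree conditions used in Theorem 5.2, Theorem 5.3 and Definition 5.1 are easy to test on a concrete finite directed graph. The following Python sketch is only an illustration under stated assumptions: a graph is given by a list of vertices and a list of (source, target) pairs, multi-edges and loop-edges are allowed, and the function names are hypothetical. Connectedness is not checked.

```python
from collections import Counter


def degrees(vertices, edges):
    """Return the pair (deg_in, deg_out) as dictionaries indexed by vertex."""
    deg_out = Counter(s for s, _ in edges)
    deg_in = Counter(t for _, t in edges)
    return ({v: deg_in[v] for v in vertices}, {v: deg_out[v] for v in vertices})


def is_fractal(vertices, edges):
    """Definition 5.1: all out-degrees are identical (finite graph assumed)."""
    _, deg_out = degrees(vertices, edges)
    return len(set(deg_out.values())) == 1


def sinks_and_sources(vertices, edges):
    """Sinks have out-degree 0; sources have in-degree 0."""
    deg_in, deg_out = degrees(vertices, edges)
    sinks = {v for v in vertices if deg_out[v] == 0}
    sources = {v for v in vertices if deg_in[v] == 0}
    return sinks, sources


# A directed 3-cycle 1 -> 2 -> 3 -> 1: every vertex has out-degree 1.
cycle_v, cycle_e = [1, 2, 3], [(1, 2), (2, 3), (3, 1)]
# A directed path 1 -> 2 -> 3: vertex 3 is a sink, so the out-degrees differ.
path_v, path_e = [1, 2, 3], [(1, 2), (2, 3)]

cycle_is_fractal = is_fractal(cycle_v, cycle_e)
path_is_fractal = is_fractal(path_v, path_e)
path_sinks, path_sources = sinks_and_sources(path_v, path_e)
```

Here the directed cycle satisfies the out-degree condition, while the directed path fails it because its last vertex is a sink.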


Example 5.1
(1) Every one-vertex-multi-loop-edge graph is fractal. Indeed, the graph groupoids of such graphs are groups, which are group-isomorphic to free groups. Notice that free groups are fractal groups (see [1]).
(2) Every one-flow circulant graph is fractal. This shows that there are fractaloids (groupoids with fractality) which are not fractal groups.
(3) All graphs which are graph-isomorphic to the shadowed graphs of one-flow circulant graphs are fractal, too.
(4) We say that a (finite) graph G is complete if (i) G does not have loop-edges, and (ii) for any pair (v_1, v_2) of distinct vertices, there always exists a unique edge e such that e = v_1 e v_2. (Remark that, if we take (v_2, v_1), then there exists a unique edge e′ ≠ e such that e′ = v_2 e′ v_1.) Every complete graph with finitely many vertices is fractal.
(5) For any given connected finite graph G, there always exists a unique graph G_o such that (i) G ≤ G_o, and (ii) G_o is fractal (see [18] and [2]). Such a graph G_o is called the finitely fractalized graph of G. Finite fractalization is studied in detail in [2].
For more interesting examples, see [2] and [18].

Let G be a fractal finite graph with its finite-graph fractaloid G. Then, as a graph groupoid, G has its canonical representation (H_G, L), and we can construct the graph von Neumann algebra $M_G = \overline{\mathbb{C}[L(G)]}^{\,w}$, which is the groupoid W*-algebra generated by G in B(H_G).

Definition 5.2 Let G be a fractal (finite) graph, and let M_G be the graph von Neumann algebra of G. Define an operator T_G in M_G by

$$T_G \overset{\text{def}}{=} \sum_{e \in E(G)} \left( L_e + L_{e^{-1}} \right).$$

This operator is called the radial operator of G on H_G (or in M_G).

Remark that, in [16] and [17], the radial operator T_G of G is defined by the labeling operator, in terms of the labeling on G, via automata theory. However, without using the automata language, we can define the same Hecke-type operator as above. It is defined like the radial operators of groups, which are determined by the generators of the groups. Notice that the graph groupoid G is generated by the edge set E(G) of G (or by the self-adjoint set E(Ĝ), the edge set of the shadowed graph Ĝ of G). In [16] and [17], we showed that the spectral information of the operator T_G explains how the fractality of G acts on H_G.

Let M_G be the graph von Neumann algebra of a fractal graph G, and let D_G be the diagonal subalgebra of M_G. Then we can define a conditional expectation E : M_G → D_G as a continuous ℂ-linear map satisfying (i) E(d) = d, for all d ∈ D_G, (ii) E(d_1 m d_2) = d_1 E(m) d_2, for all d_1, d_2 ∈ D_G and m ∈ M_G, and (iii) E(m*) = E(m)* in D_G, for all m ∈ M_G. In particular, we have the canonical conditional expectation E : M_G → D_G, defined by

$$E\Big( \sum_{w \in G} t_w L_w \Big) \overset{\text{def}}{=} \sum_{v \in V(\hat{G})} t_v L_v,$$

for all $\sum_{w \in G} t_w L_w \in M_G$, with t_w ∈ ℂ. Then we have the free distributional data E(mⁿ) of an operator m ∈ M_G, called the D_G-valued free moments of m. In particular, if m ∈ M_G is self-adjoint, then the D_G-valued moments {E(mⁿ)}_{n=1}^∞ provide the



spectral information of the operator m over D_G. So, if we compute the D_G-valued moments E(T_G^n), for all n ∈ ℕ, we obtain the spectral data of the radial operator T_G over D_G, since T_G is self-adjoint in M_G (see [10], [16], and [17]). In [16], we computed the D_G-valued moments of T_G, and they are completely characterized by certain scalar values. This provides the spectral information of the graph fractaloid G.

Before introducing the complete computation of the D_G-valued free moments E(T_G^n) of T_G, we define some terminology. Let N ∈ ℕ be a fixed number. Define the lattices l_k in the real plane ℝ² by

$$l_k = \overrightarrow{(1, e^k)}, \quad \text{for all } k = 1, \dots, N,$$

where $\overrightarrow{(t_1, t_2)}$ means the vector connecting the origin (0, 0) to the point (t_1, t_2) in ℝ². We call l_1, ..., l_N the upward lattices induced by N. Define now the downward lattices l_{−1}, ..., l_{−N} induced by N by

$$l_{-k} = \overrightarrow{(1, -e^k)}, \quad \text{for all } k = 1, \dots, N.$$

For the given lattices l_{±1}, ..., l_{±N}, we can define the lattice paths in ℝ² by the rule: the lattice path l_i l_j is the vector addition of l_i and l_j, obtained by identifying the end point (1, ε_i e^{|i|}) of l_i with the starting point (0, 0) of l_j, where

$$\varepsilon_i \overset{\text{def}}{=} \begin{cases} 1 & \text{if } i > 0 \\ -1 & \text{if } i < 0, \end{cases}$$

for all i = ±1, ..., ±N. Inductively, we can determine the lattice paths l_{i_1} l_{i_2} ... l_{i_n}, where i_j ∈ {±1, ..., ±N}, for j = 1, ..., n, for all n ∈ ℕ.

Let l = l_{i_1} ... l_{i_n} be a lattice path generated by l_{±1}, ..., l_{±N}. Then the length |l| of the lattice path l is defined to be n, the number of lattices generating l. Define the set L_N to be the collection of all lattice paths generated by l_{±1}, ..., l_{±N}. Define now the subsets L_N(n) of L_N by

$$L_N(n) \overset{\text{def}}{=} \{ l \in L_N : |l| = n \}, \quad \text{for all } n \in \mathbb{N}.$$

Then the lattice path set L_N is decomposed as

$$L_N = \bigsqcup_{k=1}^{\infty} L_N(k).$$

Consider special lattice paths in L_N(k), for k ∈ ℕ. Let l ∈ L_N(k), and assume that it starts at (0, 0) and ends on the horizontal axis. Then we say that the lattice path l satisfies the (horizontal) axis property. Define the subsets L_N^o(n) of L_N(n) by

$$L_N^o(n) \overset{\text{def}}{=} \{ l \in L_N(n) : l \text{ satisfies the axis property} \}, \quad \text{for all } n \in \mathbb{N}.$$

It is easy to check that L_N^o(k) is empty whenever k is odd. So L_N^o(n) is nonempty only when n is even. In [16], we computed the following:

Theorem 5.4 (See [18]) Let G be a fractal graph with its graph fractaloid G, and let M_G be the graph von Neumann algebra of G. Let T_G ∈ M_G be the radial operator of G. If N = max{deg_out(v) : v ∈ V(G)}, then the D_G-valued free moments E(T_G^n) of T_G are

$$E(T_G^n) = \begin{cases} |L_N^o(n)| \cdot 1_{D_G} & \text{if } n \text{ is even} \\ 0_{D_G} & \text{if } n \text{ is odd,} \end{cases}$$

for all n ∈ ℕ.

The above theorem shows how the fractality of graph fractaloids works on graph Hilbert spaces under the canonical representations. The following proposition is introduced in [16] and [17].

Proposition 5.5 Let N ∈ ℕ. Then

$$|L_N^o(2n)| = \sum_{(i_1, \dots, i_{2n}) \in C_N^{2n}} c_{i_1, \dots, i_{2n}},$$

where

$$C_N^{2n} = \left\{ (i_1, \dots, i_{2n}) \;\middle|\; i_1, \dots, i_{2n} \in \{\pm 1, \dots, \pm N\},\ \ i_1 \le i_2 \le \dots \le i_{2n},\ \ \sum_{k=1}^{2n} i_k = 0 \right\},$$

and where c_{i_1,...,i_{2n}} satisfies the following recurrence relation:

$$c_{i_1, \dots, i_j, i_{j+1}, \dots, i_{2n}} = c_{i_1, \dots, i_j} \left( {}_{2n}C_{2n-j} \right), \quad \text{whenever } i_{j+1} = i_{j+2} = \dots = i_{2n},$$

for all n ∈ ℕ. Here,

$${}_nC_k \overset{\text{def}}{=} \frac{n!}{k!\,(n-k)!}, \quad \text{for all } k \le n \in \mathbb{N}.$$

For instance, if N = 1, then $|L_1^o(2n)| = {}_{2n}C_n$, for all n ∈ ℕ. The following example helps to understand the computation of c_{i_1,...,i_{2n}}, where (i_1, ..., i_{2n}) ∈ C_N^{2n}, for n, N ∈ ℕ:

$$\begin{aligned}
c_{-3,-2,-2,-1,1,2,2,3} &= (c_{-3,-2,-2,-1,1,2,2})\,({}_8C_1) \\
&= (c_{-3,-2,-2,-1,-1})\,({}_7C_2)\,({}_8C_1) \\
&= (c_{-3,-2,-2})\,({}_5C_2)\,({}_7C_2)\,({}_8C_1) \\
&= (c_{-3})\,({}_3C_2)\,({}_5C_2)\,({}_7C_2)\,({}_8C_1) \\
&= ({}_1C_1)\,({}_3C_2)\,({}_5C_2)\,({}_7C_2)\,({}_8C_1).
\end{aligned}$$
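For small N and n, the cardinality |L_N^o(n)| can also be obtained by direct enumeration, which gives an independent check of the special case N = 1 stated above. The following Python sketch is a brute-force illustration, not the method of [16] or [17]. It assumes that the height of the lattice l_{±k} is ±e^k with e Euler's number; since e is transcendental, a path then returns to the horizontal axis exactly when every k occurs as often upward as downward, so the test can be done with integer counts and no floating-point arithmetic. The function name is hypothetical.

```python
from itertools import product
from math import comb


def count_axis_paths(N, n):
    """Brute-force |L_N^o(n)| under the assumption that l_k rises by e^k.

    A lattice path of length n is encoded as a tuple (i_1, ..., i_n) with
    entries in {+1, -1, ..., +N, -N}. Under the stated assumption the end
    height vanishes exactly when, for every k, the indices +k and -k occur
    equally often.
    """
    steps = [sign * k for k in range(1, N + 1) for sign in (1, -1)]
    total = 0
    for path in product(steps, repeat=n):
        if all(path.count(k) == path.count(-k) for k in range(1, N + 1)):
            total += 1
    return total


# Odd lengths never return to the axis.
odd_counts = [count_axis_paths(2, n) for n in (1, 3)]

# For N = 1 the count is the central binomial coefficient.
n1_counts = [count_axis_paths(1, 2 * n) for n in (1, 2, 3)]
expected_n1 = [comb(2 * n, n) for n in (1, 2, 3)]

# Two small values for N = 2.
n2_counts = [count_axis_paths(2, 2 * n) for n in (1, 2)]
```

The enumeration visits (2N)^n paths, so it is only practical for small parameters; it is meant as a sanity check on the definitions rather than as a replacement for the recurrence.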

References

[1] A. G. Myasnikov and V. Shpilrain (editors), Group Theory, Statistics and Cryptography, Contemporary Math, 360, (2003) AMS.
[2] I. Cho, Fractalized Graphs and Corresponding von Neumann Algebras, (2008) Preprint.
[3] I. Cho, Graph von Neumann Algebras, ACTA Applied Math, 95, (2007) 95 - 134.
[4] I. Cho, Characterization of Amalgamated Free Blocks of a Graph von Neumann Algebra, Compl. Anal. Oper. Theor. 1, (2007) 367 - 398.
[5] I. Cho, Group Freeness and Certain Amalgamated Freeness, J. of KMS, 45, no. 3, (2008), 597 - 609.
[6] I. Cho, Measures on Graphs and Certain Groupoid Measures, CAOT, 2, (2008) 1 - 28.
[7] I. Cho, Operator Algebraic Quotient Structures Induced by Graphs, CAOT, (2008) To Appear.
[8] I. Cho, Vertex-Compressed Subalgebras in a Graph von Neumann Algebra, ACTA Appl. Math., (2008) To Appear.
[9] I. Cho, Von Neumann Algebras Generated by Automata, Graphene: Research, Technology, and Application, NOVA publisher, (2008) To Appear.
[10] I. Cho, C-Valued Free Probability on a Graph von Neumann Algebra, J. of KMS, (2008) To Appear.
[11] I. Cho, Towers of Graphs, and Towers of Graph von Neumann Algebras, (2008) Submitted to B. of KMS.
[12] I. Cho and P. E. T. Jorgensen, C*-Algebras Generated by Partial Isometries, JAMC, (2008) To Appear.
[13] I. Cho and P. E. T. Jorgensen, C*-Subalgebras Generated by Partial Isometries, JMP, (2008) To Appear.
[14] I. Cho and P. E. T. Jorgensen, C*-Subalgebras Generated by a Single Operator in B(H), (2008) Submitted to JMP.
[15] I. Cho and P. E. T. Jorgensen, C*-Dynamical Systems Induced by Partial Isometries, (2008) Preprint.
[16] I. Cho and P. E. T. Jorgensen, Application of Automata and Graphs: Labeling Operators in Hilbert Space I, ACTA Appl. Math.: Special Issues, (2008) To Appear.
[17] I. Cho and P. E. T. Jorgensen, Application of Automata and Graphs: Labeling Operators in Hilbert Space II, (2008) Submitted to JMP.
[18] I. Cho and P. E. T. Jorgensen, Graph Fractaloids: Graph Groupoids with Fractal Property, (2008) Submitted to J. of Phy. A.
[19] I. Raeburn, Graph Algebras, CBMS no 3, AMS (2005).
[20] P. D. Mitchener, C*-Categories, Groupoid Actions, Equivariant KK-Theory, and the Baum-Connes Conjecture, arXiv:math.KT/0204291v1, (2005), Preprint.
[21] R. Gilman, V. Shpilrain and A. G. Myasnikov (editors), Computational and Statistical Group Theory, Contemporary Math, 298, (2001) AMS.
[22] W. Dicks and E. Ventura, The Group Fixed by a Family of Injective Endomorphisms of a Free Group, Contemp. Math 195, AMS, (2000).
[23] V. Vega, Finite Directed Graphs and W*-Correspondences, (2006) Ph.D. thesis, Univ. of Iowa.
[24] F. Radulescu, Random Matrices, Amalgamated Free Products and Subfactors of the von Neumann Algebra of a Free Group, of Noninteger Index, Invent. Math., 115, (1994) 347 - 389.



Chapter 10

THE GROUP ASPECT IN THE PHYSICAL INTERPRETATION OF GENERAL RELATIVITY THEORY

Salvatore Antoci¹ and Dierck Ekkehard Liebscher²
¹ Dipartimento di Fisica “A. Volta” and IPCF of CNR, Pavia, Italia
² Astrophysikalisches Institut Potsdam, Potsdam, Deutschland

Abstract

When, at the end of the year 1915, both Einstein and Hilbert arrived at what were named the field equations of general relativity, both of them thought that their fundamental achievement entailed, inter alia, the realisation of a theory of gravitation whose underlying group was the group of general coordinate transformations. This group theoretical property was believed by Einstein to be a relevant one from a physical standpoint, because the general coordinates allowed one to introduce reference frames not limited to the inertial reference frames that can be associated with the Minkowski coordinate systems, whose transformation group was perceived to be restricted to the Poincaré group. Two years later, however, Kretschmann published a paper in which the physical relevance of the group theoretical achievement in the general relativity of 1915 was denied. For Kretschmann, since any theory, whatever its physical content, can be rewritten in a generally covariant form, the group of general coordinate transformations is physically irrelevant. This is not the case, however, for the group of the infinitesimal motions that map the metric field into itself, namely, for the Killing group. This group is physically characteristic of any given spacetime theory, since it accounts for the local invariance properties of the considered manifold, i.e., for its “relativity postulate”. In Kretschmann’s view, the so called restricted relativity of 1905 is the one with the relativity postulate of largest content, because the associated Killing group coincides with the infinitesimal Poincaré group, while for the most general metric manifold of general relativity the associated Killing group happens to contain only the identity, hence the content of its relativity postulate is nil. Of course, solutions to the field equations of general relativity whose relativity postulate has a content that is intermediate between the two above mentioned extremes exist too. They are the ones found and investigated until now by the relativists, since the a priori assumption of some nontrivial Killing invariance group generally eases the finding of solutions to the above mentioned equations. In the present chapter we show the consequences for the physical interpretation of some of these solutions, whose relativity postulate is of intermediate content, when Kretschmann’s standpoint is consistently adhered to.

1. Introduction

It may seem strange that, in the year 2009, one may conceive writing a text with the title given above. So many years have elapsed since Einstein [1] and Hilbert [2] eventually wrote the final equations of general relativity theory, and it might be reasonable to believe that by now the issue hinted at in the title should have been settled once and for all. The rôle played by group theory in the so called general theory of relativity should be clear beyond discussion, and no further paper should need to be written on this subject. However, this is not the case. Moreover, this problem emerged as early as 1917, when Erich Kretschmann challenged the group theoretical assessment given by Einstein [3], and proposed an alternative of his own [4], whose validity in principle was soon to be acknowledged by Einstein himself [5]. In Frank’s review [6] of Kretschmann’s paper one finds a short, but precise account of the main points considered by Kretschmann. It reads, in English translation:

“Einstein understands, under his general principle of relativity, the injunction that the laws of nature must be expressed through equations that are covariant with respect to arbitrary coordinate transformations. The Author shows now that any natural phenomenon obeying any law can be described by generally covariant equations. Therefore the existence of such equations does not express any physical property. For instance the uniform propagation of light in a space free from gravitation can be expressed also in a covariant way. However, there is a representation of the same phenomena that admits only a more restricted group (the Lorentz transformations). This group, that cannot be further restricted by any representation of the phenomena, is characteristic of the system under question. The invariance with respect to it is a physical property of the system and, in the sense of the Author, it represents the postulate of relativity for the corresponding domain of phenomena. In Einstein’s general theory of relativity, through appropriate choice of the coordinates, the field equations can be converted into a form that is no longer covariant under the group of coordinate transformations. The Author provides a series of examples of such conversions. But the equations converted in this way in general no longer admit any group, and in this sense Einstein’s theory of general relativity is an “absolute theory”, while the special theory of relativity satisfies the postulate of relativity for the Lorentz transformations also in the sense of the Author.”

When reading Kretschmann’s paper today, one confronts its lengthy, sometimes obscure pages with a growing sense of admiration for the keen physical intuition that drove its author to a right conclusion despite his lack of the correct mathematical tools for tackling the difficult questions that he addressed, a lack that forced him to try, one after another, several paths of thought that he critically judged not to be fully satisfactory in one way or another. In the present chapter Kretschmann’s comparison between the group theoretical assessment of the special and of the general theory of relativity is reconsidered by availing of the mathematical tool that is lacking in Kretschmann’s work, i.e. the group of infinitesimal


Killing motions to be associated with each theory endowed with a metric tensor. If the group properties of both flat and curved spaces need to be compared through the same mathematical tool, it is this group that must take the rôle of what Kretschmann calls the group of invariance, the one endowed with physical meaning, which he so often invokes in his paper as the one needed for properly assessing the “relativity postulate” of each theory. From this recognition several relevant consequences immediately follow for the way in which one must proceed when physically interpreting the solutions to the field equations of general relativity endowed with nontrivial groups of invariance. But let us go back for the moment to special relativity and to its group theoretical assessment, which constituted the mathematically and physically sound paradigm from which Kretschmann started in building his interpretation of general relativity as an “absolute theory”.

2. Finding the group of invariance in special and in general relativity

The reader will forgive us if we recall here concepts that have long been expounded in all the textbooks of relativity. We need to do so for retrieving a mathematical formulation that may have the distinct advantage of being applicable without change both to the special and to the general theory of relativity. The Poincaré group of transformations between the inertial coordinate frames¹ of special relativity can be given in principle many representations. The generally adopted one relies on what Landau and Lifshits [7] once called “Galilean coordinates” x¹ = x, x² = y, x³ = z, x⁴ = t, and on the Minkowski metric, expressed with respect to these coordinates:

$$\eta_{ik} = \operatorname{diag}(-1, -1, -1, 1), \tag{1}$$

that is invariant 2 under the coordinate transformations of the Poincaré group. When this representation is adopted, the coordinates are not just labels for identifying events; due to the particular form of ηik , they have a direct metric reading, i.e. to each particular system of coordinates a physically admissible reference frame, to be built with rods, clocks and light signals, is directly associated in one-to-one correspondence. As a consequence, by availing of this representation, one recognizes that the Poincaré group, besides being, from a mathematical standpoint, the group of invariance of ηik , is endowed with direct physical meaning. The invariance of ηik under the Poincaré group constitutes what Kretschmann once called the physically meaningful “relativity postulate” of the original theory of relativity. However, it is quite possible, and Kretschmann was fully aware of this [9, 4], that one accounts for special relativity by adopting a general system of curvilinear coordinates, with the associated group of general coordinate transformations. This move has the distinct advantage of freeing the coordinate systems from the duplicity of function that they play in the 1

The notion reference frame acquired in the meantime a meaning different from coordinate system, although in special relativity both are usually intimately connected. 2 One cannot help recalling here the ironic sentence by Felix Klein: “Was die modernen Physiker Relativitätstheorie nennen, ist die Invariantentheorie des vierdimensionalen Raum-Zeit-Gebietes, x, y, z, t (der Minkowskischen “Welt”) gegen u¨ ber einer bestimmten Gruppe von Kollineationen, eben der “Lorentzgruppe”[8].


Salvatore Antoci and Dierck Ekkehard Liebscher

previous account, i.e. both providers of labels for the identification of the events, and elements to which the transformations of the invariance group directly apply. We do not know whether these coordinate systems can maintain a physical rôle beyond the purely topological one of identifying the events, namely, whether reference frames can be associated with these curvilinear coordinates too, as was hoped for by Einstein [3]; today, the answer to the above question is the identification of a reference frame at a given event with the vector base in its tangent Minkowski space-time³. However, we are sure that in general relativity the latter coordinate systems have no relation whatsoever with the physically relevant invariance group of the special relativity theory⁴. The adoption of curvilinear coordinates for expressing the theory of special relativity is fundamental for acknowledging that the restriction of the allowed coordinate systems to the ones corresponding to the inertial frames, although very intuitive and convenient for the calculations, is conceptually inessential. The eventual recognition of the group of invariance of the metric in a given theory is the true goal that we aim at, either in the special or in the general theory of relativity, and we shall equip ourselves with the appropriate mathematical tool. Since the absolute differential calculus of Ricci and Levi-Civita is naturally expressed with curvilinear coordinates, these shall constitute an appropriate choice for accomplishing our task. There is a fundamental difference between the special and the general theory of relativity, that is decisive for the very choice of the group of invariance that we shall look at in both cases, and for the unique mathematical tool that we shall eventually adopt for the comparison.
In special relativity, as is evident from the very fact that Galilean coordinates can be used in the double rôle explained above, the representation of the group of invariance has a global character, while in a nontrivial pseudo-Riemannian manifold a group of invariance of the metric, if it exists at all, can in general be identified mathematically only in the infinitesimal neighbourhood of each event. The several keen but unsuccessful attempts by Kretschmann to provide a global identification of the invariance group through explicit analytic or geometric procedures, both in the case of special relativity as seen in curvilinear coordinates and in the case of general relativity, testify to the difficulty of the global problem, on which scarce progress has occurred since Kretschmann’s times. Happily enough, if we investigate the invariance group of the metric in the infinitesimal neighbourhood of each event, by availing of the powerful tools provided by Lie and by Killing [10] we can identify and use the algebra of the Killing vectors that prevails in each one of these neighbourhoods, both in the special and in the general theory of relativity. The conceptual problem is thereby reduced to the mathematical problem of finding the solutions of the Killing equations (14) of Appendix A and of studying the group properties of the infinitesimal Killing motions found in this way. As is evident from Appendix A, the group of the infinitesimal Killing motions does not deal with infinitesimal point transformations: by its very nature, this method analyses the invariance group of the metric under infinitesimal “Mitschleppen” (dragging along). This change of objective may appear inessential for special relativity, due to the homogeneous character of the considered manifold. In this case, the global answer that the invariance group of the metric is the Poincaré group can still be reached by starting from the infinitesimal Killing group, albeit only through a more complicated argument. The study of the invariance group for infinitesimal “Mitschleppen”, however, is the only one that is possible in general for a curved pseudo-Riemannian manifold. The infinitesimal Killing vector group is therefore the tool for realizing Kretschmann’s program of comparison of the invariance groups of the metric that prevail in the special and in the general theory of relativity respectively. The search for the Killing group for both special and general relativity is straightforward and confirms Kretschmann’s objection of 1917: while the Killing group of the metric of special relativity is the Poincaré group for infinitesimal motions, for a general solution of the field equations of general relativity the Killing group reduces to the identity, i.e. general relativity, despite its very name, is indeed an absolute theory.

³ In this way, one obtains a field of frames which generates a teleparallel transport with torsion instead of curvature. The transformations of the frames form the Lorentz group at each point separately, and event-dependent Lorentz transformations for the field of frames. Eventually, Einstein already tried to generalize the theory in this direction in order to implement the electromagnetic field.

⁴ The choice of the coordinate system as well as that of the field of frames do not enter any observable here.

3. Applying Kretschmann’s standpoint to solutions with intermediate relativity postulate

Finding exact solutions to the field equations of general relativity is a very demanding task; no wonder, then, that in the decades-long search for new solutions since Karl Schwarzschild discovered the spherically symmetric, static solution that bears his name [13], the problem has been eased by limiting the search to the simpler solutions for which the Killing groups of the metric are intermediate between the one of special relativity and the one, endowed only with the identity, of the most general solutions of general relativity. As a consequence, the invariance groups of the metric fields that we can really explore are nontrivial and, according to Kretschmann’s standpoint, intrinsic physical content is introduced a priori. Let us notice that the idea of a particular physical content associated with a particular nontrivial invariance group of the metric is fully in keeping with the findings by Hilbert and Klein and with the fundamental result by Noether [2, 11, 12] about the essential link between invariance and conservation laws.

We are therefore confronted with a very interesting, but really difficult situation. The very fact that in general relativity each particular solution of the field equations exhibits its own particular content of the physically relevant invariance group is a novel feature that counters our expectations. We were prepared to search for a unique, once-and-for-all theory of the observables of general relativity, as happens in special relativity, for which the Killing group is fixed from the outset. In general relativity these observables should behave as scalars under the group of coordinate transformations, because tensor quantities depend on the choice of the coordinates, which are today generally presumed to be mere labels for identifying events, otherwise devoid of physical meaning⁵. But we do not know how to find general exact solutions, for which these observables might display their full structure and meaning, and even if we could find these solutions and calculate their observables, the latter could not have any resemblance to the observables of special relativity. In fact, besides being invariant quantities, we know in advance that they would obey no genuine conservation law, since the Killing group of such general solutions would contain only the identity.

⁵ Scalars obtained by considering the tetrad components of some tensor with respect to some tetrad field would be equally devoid of physical meaning, due to the arbitrariness in the choice of the tetrad field.

We must content ourselves, however, with the examples provided by the particular solutions endowed with a nontrivial Killing group which, if the Riemann tensor is nonvanishing, is different from and endowed with fewer elements than the Poincaré group. Let us explore, by availing of Kretschmann’s and Noether’s standpoint, some well known solutions of general relativity, like the Schwarzschild solution, both with the original, pondered choice of the manifold made by Schwarzschild [13] himself and in the form, endowed with an inequivalent manifold, accidentally introduced⁶ by Hilbert [14], as well as its Kruskal-Szekeres maximal extension [17, 18]; the Kerr-Newman solution [19, 20] will be considered too. There is also a body of literature on the so-called boost-rotation symmetric solutions⁷ that seems worthy of analysis.

The perusal of these manifolds from the above mentioned standpoint leads to disconcerting results. All these solutions, with the exception of Schwarzschild’s original manifold [13], have one feature in common: the manifold on which the solution is defined happens to be built from the juxtaposition of submanifolds endowed with different invariance groups of the metric, hence with different intrinsic physical meaning, because, according to Noether [12], the quantities that are conserved in each one of the submanifolds are physically different. This peculiar behaviour, common to the solutions mentioned above, with the exception of Schwarzschild’s original solution, is invariably due to the presence, within the manifold, of surfaces on which the character of one Killing vector field changes from timelike to spacelike or vice versa, with a consequent change of the physical meaning of the prevailing Killing group when one crosses one such surface of junction between neighbouring submanifolds.
This is a well known behaviour, but the danger of allowing in this way for intrinsically nonsensical, patchwork manifolds, with unrelated physical processes, subject to unrelated conservation laws going on severally in each of the submanifolds, has been intimated only recently [33]. The adoption of such composite manifolds as models of some physical reality has occurred because the criteria adopted for their selection have been based exclusively on the two very important notions of local singularity and of geodesic completeness. The two notions are deeply intertwined in the studies that have been developed during many years while searching for a general, invariant and physically satisfactory definition of singular boundary in general relativity⁸. This is not the place to recall them in extenso. Suffice it to say that the notion of intrinsic, local singularity has been associated with the divergent behaviour of the polynomial invariants built with the metric $g_{ik}$, with the Levi-Civita symbol $\epsilon_{iklm}$, with the Riemann tensor $R_{iklm}$ and with its covariant derivatives, when some limit boundary is approached along a geodesic path. A manifold is said to be geodesically complete when its geodesics either can be defined for any value of their affine parameter, or meet some limit boundary where some of the above mentioned polynomial invariants diverge. The occurrence of the latter divergence is of course an appropriate, sufficient condition for defining a singularity intrinsic to the manifold, and the requirement of geodesic completeness is likely to be a geometrically and physically correct regularity criterion for a general solution, for which the Killing group reduces to the identity.

⁶ For a historical account of Schwarzschild’s original manifold and of the inequivalent choice of the manifold made by Hilbert, one may consult [15] and [16].

⁷ From the references on the subject let us quote here only the solutions with nonspinning sources, reported and investigated in [23], [24], [25], [26], [27, 28, 29], [30, 31], [32].

⁸ See for instance [34, 35, 36, 37], [38, 39, 40, 41].

When this criterion is applied to the solutions mentioned above, for which the Killing group does not reduce to the identity, the following assessment is reached:

• Schwarzschild’s original manifold [13] is defective, because of geodesic incompleteness. No geodesic reaching an intrinsic singularity due to the divergence of some polynomial invariant of the Riemann tensor can be drawn on it.
• Hilbert’s manifold is defective due to geodesic incompleteness too. Geodesics hitting an intrinsic singularity of the previously defined kind, or emanating from it, can be drawn, but one fails to assign to them a proper arrow of time⁹.
• The Kruskal-Szekeres manifold [17, 18] is geodesically complete and has a proper arrow of time.
• The Kerr-Newman manifold [19, 20] lacks geodesic completion and does not have a proper arrow of time. Both the Kerr and the Reissner-Nordström [21, 22] manifolds have been severally completed.
• The so-called boost-rotation symmetric manifolds of [23]-[32] generally await geodesic completion.

When one is confronted with the diagram of the Kruskal-Szekeres manifold, the perception that consistent reasoning has eventually led us to acknowledge the need for these four quadrants in order to describe properly, in general relativity, the gravitational field of one material particle at rest has been sufficient to raise in some relativists the following doubt. The faultless logic of the program of geodesic completion is of course likely to be quite correct for a general solution to the field equations of general relativity, for which the invariance group of the metric contains only the identity, hence it is irrelevant.
Is it not possible that the same program may instead lead us astray when applied to manifolds that happen to be invariantly, intrinsically divided into submanifolds, because nontrivial and physically different invariance groups of the metric prevail in different parts of the complete manifold? The further consideration of the infinite repetitions occurring in the diagrams needed to perfect the program of geodesic completion for both the Kerr and the Reissner-Nordström solutions cannot but strengthen the doubt raised already by the Kruskal-Szekeres manifold, and leads one to wonder whether something similar to what occurred to Goethe’s “Zauberlehrling” (sorcerer’s apprentice) is happening here. If one imposes instead the condition that, in order to be a model of some physical reality, a manifold must not contain in its interior local, invariant, intrinsic singularities, and must be endowed with a unique group of invariance, the assessment of the solutions previously considered becomes the following:

• Schwarzschild’s original manifold fulfills the condition.
• Hilbert’s manifold, that we consider here in the usual coordinate system due to Hilbert [14], does not fulfill the condition because the hypersurface-orthogonal, timelike Killing vector that can be uniquely drawn at each event for which r > 2m becomes spacelike for 0 < r < 2m.
• The Kruskal-Szekeres manifold does not fulfill the condition for the same reason as the one prevailing with Hilbert’s manifold.
• The interval of the Kerr-Newman manifold, expressed in Boyer-Lindquist coordinates, reads:

\[ ds^2 = -\frac{\varrho^2}{\Delta}\,dr^2 - \varrho^2\,d\vartheta^2 - \frac{\sin^2\vartheta}{\varrho^2}\left((r^2 + J^2)\,d\phi - J\,dt\right)^2 + \frac{\Delta}{\varrho^2}\left(dt - J\sin^2\vartheta\,d\phi\right)^2, \tag{2} \]
\[ \Delta = r^2 + J^2 + Q^2 - 2Mr, \qquad \varrho^2 = r^2 + J^2\cos^2\vartheta, \]

and the manifold does not fulfill the condition. A uniform Killing group structure prevails for $r > r_0 = M + \sqrt{M^2 - J^2 - Q^2}$. One can fulfill the condition by ending the manifold there.
• The boost-rotation symmetric manifolds quoted in footnote 7 do not fulfill the condition, because they are obtained through the juxtaposition of submanifolds endowed with physically different groups of invariance. Again, we are confronted with a hypersurface-orthogonal, timelike Killing vector that becomes spacelike on crossing certain hypersurfaces [33].

⁹ As required by Synge [42], a proper arrow of time shall satisfy both the postulate of order, according to which the affine parameter on one geodesic is always increasing or decreasing when one goes along the geodesic in a given sense, and the non-circuital postulate, according to which one cannot build, with segments of geodesics, a closed loop on which the time arrow always points in the same sense. For the arrow of time in Hilbert’s manifold see also [43, 16].

4. The singular border between submanifolds endowed with different invariance groups

Despite the fact that the submanifolds into which the previously considered solutions have been divided are invariantly defined, one might still wonder why one should truncate manifolds that are geodesically complete, when no singularities defined through the polynomial invariants of the Riemann tensor occur at the borders produced in that way, and when regular geodesics can be drawn across them. The question can be answered by remarking that geodesics are very special worldlines, and that the regularity of all the worldlines either crossing such borders or lying closer and closer to them should be investigated. With the manifolds considered in the previous section, however, one does not need to accomplish such a cumbersome program for reaching the answer. The nontrivial Killing structure of the considered solutions allows in fact the definition of local, invariant, intrinsic quantities besides the just mentioned polynomial invariants, and these quantities happen to exhibit a divergent, singular behaviour when the borders between submanifolds endowed with different invariance groups are approached.



The Killing group of Schwarzschild’s original manifold [13], hence the Killing structure of both the submanifold of Hilbert’s solution [14] for r > 2m, and of the left and right quadrants of the Kruskal-Szekeres manifold [17, 18], define at each event a unique [44], hypersurface-orthogonal, timelike Killing vector $\xi_i$:

\[ \xi_i \xi^i > 0, \qquad \xi_{i;k} + \xi_{k;i} = 0, \qquad \xi_{[i}\,\xi_{k,l]} = 0. \tag{3} \]

Due to its uniqueness, and since each hypersurface orthogonal to it is spacelike, this vector defines the unique direction of absolute rest in the manifold where it prevails, and allows one to build congruences of absolute rest. Let us calculate the first curvature of one such congruence, i.e. the four-acceleration

\[ a^i = \frac{Du^i}{ds} \equiv \frac{du^i}{ds} + \Gamma^i_{kl}\,u^k u^l, \tag{4} \]

where D/ds indicates the absolute derivative, and $u^i = dx^i/ds$ is the four-velocity tangent to the chosen congruence. From it one builds the norm

\[ \alpha = (-a_i a^i)^{1/2}. \tag{5} \]

Due to its very definition, this local, invariant quantity is also intrinsic to the manifold where it prevails. When Schwarzschild’s solution is written by using Hilbert’s coordinates $x^1 = r$, $x^2 = \vartheta$, $x^3 = \varphi$, $x^4 = t$, its interval reads

\[ ds^2 = \left(1 - \frac{2m}{r}\right)dt^2 - \frac{dr^2}{1 - 2m/r} - r^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right). \tag{6} \]

We now evaluate the norm α of the four-acceleration along a congruence of absolute rest for Schwarzschild’s manifold, that is accounted for in Hilbert’s coordinates by (6) with r > 2m. It reads

\[ \alpha = \left[\frac{m^2}{r^3 (r - 2m)}\right]^{1/2}. \tag{7} \]

This local, invariant, intrinsic quantity diverges for r → 2m. It defines a singularity that one meets when considering congruences of absolute rest closer and closer to the inner border of Schwarzschild’s manifold, i.e. closer and closer to the borders drawn in the interior of both the Hilbert and the Kruskal-Szekeres manifolds. In this case the answer to the previous question is therefore simply: the border between the submanifolds endowed with different invariance groups of the just examined solutions is to be considered singular from a geometric standpoint, as soon as one does not limit the attention to the polynomial invariants built with the Riemann tensor. As noticed long ago¹⁰ by Whittaker [45] and by Rindler [46], besides the geometrical meaning, α and its singularity have an immediate physical meaning too. Let us consider a test body of unit mass kept on a congruence of absolute rest by a dynamometer of negligible mass; the other end of the dynamometer is also assumed to follow a congruence of absolute rest. According to Whittaker and Rindler, the quantity α then equals the strength of the gravitational pull measured by the dynamometer [47].

¹⁰ Before Synge [48] eventually convinced the relativists that the wise plan is to forget about Newton’s arrow and say “gravitational field = curvature of space-time”.

Also the left and right quadrants of the so-called boost-rotation symmetric solutions of footnote 7 are endowed, at each event, with a unique timelike Killing vector that is hypersurface-orthogonal with respect to a hypersurface of spacelike character¹¹. Therefore, this Killing vector too uniquely defines a direction of absolute rest, and from it a unique congruence of absolute rest is again obtained. Since the worldlines of the material particles of these solutions never cross the congruences of absolute rest, they can only be interpreted as worldlines of particles in a condition of absolute rest. Their current interpretation as worldlines of particles executing a uniformly accelerated motion with respect to an asymptotic reference system at spatial infinity is problematic [33], because it relies on an approximate asymptotic symmetry that contradicts the exact invariance group of the metric prevailing everywhere in the submanifolds of the left and right quadrants. As happens in the Kruskal-Szekeres manifold, on crossing the boundaries between the left and right submanifolds and the upper and lower submanifolds of these solutions the unique timelike, hypersurface-orthogonal Killing vector becomes null and then spacelike. Let us calculate the norm α of the four-acceleration along congruences of absolute rest lying closer and closer to the boundaries of the left and right submanifolds with the upper and lower submanifolds of the solutions of footnote 7. We can expect¹² that we shall find a local, invariant, intrinsic singularity of nonpolynomial kind, associated with the change of the invariance group that prevails there. This is indeed the case, as was already shown in [33], to which the interested reader is referred for details.
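The divergence of the norm α in (7) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the interval (6), a worldline at rest (r, ϑ, φ fixed), and the illustrative value m = 1; the function names are hypothetical. It evaluates (4) and (5) directly from the metric and compares the result with the closed form (7).

```python
import math


def f(r, m):
    """g_tt of the interval (6): f = 1 - 2m/r, with g_rr = -1/f."""
    return 1.0 - 2.0 * m / r


def alpha_closed_form(r, m):
    """Norm of the four-acceleration as given by eq. (7)."""
    return math.sqrt(m**2 / (r**3 * (r - 2.0 * m)))


def alpha_from_metric(r, m, h=1e-6):
    """Same norm obtained from eqs. (4)-(5) for a worldline at rest.

    The only nonzero velocity component is u^t = f^(-1/2), so eq. (4) gives
    a^r = Gamma^r_tt (u^t)^2 with Gamma^r_tt = -(1/2) g^rr d(g_tt)/dr
    = (1/2) f f'. The derivative f' is estimated by central differences.
    """
    df = (f(r + h, m) - f(r - h, m)) / (2.0 * h)
    gamma_r_tt = 0.5 * f(r, m) * df
    a_r = gamma_r_tt / f(r, m)            # (u^t)^2 = 1/f
    g_rr = -1.0 / f(r, m)
    return math.sqrt(-g_rr * a_r**2)      # eq. (5)


m = 1.0  # illustrative mass parameter (assumption)
for r in (10.0, 3.0, 2.1, 2.001):
    print(r, alpha_closed_form(r, m), alpha_from_metric(r, m))
```

Under these assumptions the two columns agree, and both grow without bound as r approaches 2m from above.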
In the Kerr-Newman solution, defined in Boyer-Lindquist coordinates by the interval (2), no unique, hypersurface-orthogonal, timelike Killing vector exists. However, a singular behaviour of α on approaching the boundary located at $r = r_0 = M + \sqrt{M^2 - J^2 - Q^2}$ can be invariantly proved as follows. Let λ and μ be two constants. The elements

\[ \xi^k \frac{\partial}{\partial x^k} = \lambda\,\frac{\partial}{\partial t} + \mu\,\frac{\partial}{\partial \phi} \tag{8} \]

of the Killing group prevailing for $r > r_0$ define invariantly a set of orbits. The squared norm of the first curvature on these orbits,

\[ \alpha^2 = -g_{ij}\,a^i a^j = -g_{ij}\left(\frac{\xi^i}{N}\right)_{;k}\frac{\xi^k}{N}\left(\frac{\xi^j}{N}\right)_{;l}\frac{\xi^l}{N}, \qquad \text{with } N = \sqrt{g_{mn}\,\xi^m \xi^n}, \tag{9} \]

always contains the factor 1/∆ and diverges for orbits taken closer and closer to the surface ∆ = 0, for which $r = r_0 = M + \sqrt{M^2 - J^2 - Q^2}$. All Killing congruences defined by (8) are spacelike in the limit ∆ → 0 except for the case given by $\mu = \lambda J/(r_0^2 + J^2)$. This congruence is timelike for $r > r_0$ and null in the limit $r \to r_0$. The norm of its first curvature, i.e. the norm of its acceleration, diverges in the same limit¹³. Hence, in the Kerr-Newman case, at the surface $r = r_0$ a local, invariant, intrinsic singularity is defined, despite the fact that the polynomial invariants built with the Riemann tensor are regular there.

¹¹ In fact, the manifolds of these quadrants are diffeomorphic to the Weyl-Levi-Civita manifolds [49, 50], for which the Killing group structure was examined in [44].

¹² In complete agreement with what occurs to the norm α of the four-acceleration calculated along a congruence of absolute rest in the left and right quadrants of the Kruskal manifold.

¹³ In the case of a static metric, J = 0, the congruence turns out to be the hypersurface-orthogonal one.
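The statement about the causal character of the Killing congruences (8) near ∆ = 0 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the parameter values M = 1, J = 0.5, Q = 0.3, ϑ = π/3, λ = 1 are arbitrary choices, and the function names are hypothetical. It evaluates $N^2 = g_{mn}\xi^m\xi^n$ by substituting dt → λ, dϕ → μ, dr = dϑ = 0 in the interval (2).

```python
import math


def delta(r, M, J, Q):
    """Delta = r^2 + J^2 + Q^2 - 2 M r, as defined below the interval (2)."""
    return r**2 + J**2 + Q**2 - 2.0 * M * r


def killing_norm_sq(r, theta, lam, mu, M, J, Q):
    """N^2 = g_mn xi^m xi^n for xi = lam d/dt + mu d/dphi, read off from (2).

    Only the last two terms of (2) contribute: substitute dt -> lam,
    dphi -> mu and dr = dtheta = 0.
    """
    rho2 = r**2 + J**2 * math.cos(theta)**2
    s2 = math.sin(theta)**2
    return (delta(r, M, J, Q) * (lam - J * s2 * mu)**2
            - s2 * ((r**2 + J**2) * mu - J * lam)**2) / rho2


# Illustrative parameters (assumptions), chosen with M^2 > J^2 + Q^2.
M, J, Q = 1.0, 0.5, 0.3
theta, lam = math.pi / 3.0, 1.0
r0 = M + math.sqrt(M**2 - J**2 - Q**2)
mu_special = lam * J / (r0**2 + J**2)

print("Delta(r0) =", delta(r0, M, J, Q))
for mu in (0.0, 0.3, mu_special):
    print("mu =", mu, " N^2 at r0 =",
          killing_norm_sq(r0, theta, lam, mu, M, J, Q))
print("special mu, just outside r0:",
      killing_norm_sq(r0 + 1e-3, theta, lam, mu_special, M, J, Q))
```

With these values, N² is negative at r = r₀ for generic μ (spacelike congruence), vanishes up to rounding for μ = λJ/(r₀² + J²) (null), and is positive for that μ slightly outside r₀ (timelike), in keeping with the text.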

5. Conclusion

We all learned that the Riemann curvature is the root of all scalar invariants that can be constructed at a certain event when only the metric in its infinitesimal neighbourhood is known. In the generic case (the case with trivial Killing group) this is not questioned. However, when the Killing group of a manifold is not trivial, its properties may produce local, intrinsic, invariant quantities without counterpart in quantities built with the polynomial invariants of the Riemann tensor. Pasting together submanifolds endowed with nontrivial, physically different Killing groups, besides being a move not to be recommended per se, may produce a divergent behaviour when such invariant, intrinsic quantities are calculated at events closer and closer to the borders between the above mentioned submanifolds, even if the polynomial invariants of the Riemann tensor are not divergent there.

A. The infinitesimal Killing vectors

A very simple definition of the infinitesimal Killing vectors is given by [7] and is reproduced here for the reader’s convenience. Let us consider a pseudo-Riemannian manifold equipped with two coordinate systems $x'^i$ and $x^i$ such that

\[ x'^i = x^i + \xi^i, \tag{10} \]

where $\xi^i$ is an infinitesimal four-vector. Under this infinitesimal coordinate transformation, the components of the metric tensor $g'^{ik}$ in terms of $g^{ik}$ read

\[ g'^{ik}(x'^p) = \frac{\partial x'^i}{\partial x^l}\frac{\partial x'^k}{\partial x^m}\,g^{lm}(x^p) \approx g^{ik}(x^p) + g^{im}\frac{\partial \xi^k}{\partial x^m} + g^{km}\frac{\partial \xi^i}{\partial x^m}. \tag{11} \]

The quantities in the first and in the last term of (11) are calculated at the same event (apart from higher order infinitesimals). We desire instead to compare quantities calculated for the same coordinate value, i.e. evaluated at neighbouring events separated by the infinitesimal vector $\xi^i$. To this end, let us expand $g'^{ik}(x^p + \xi^p)$ in Taylor’s series in powers of $\xi^p$. By neglecting higher order infinitesimal terms, we can also substitute $g^{ik}$ for $g'^{ik}$ in the term containing $\xi^i$ of the expansion truncated at the first order term, and find:

\[ g'^{ik}(x^p) = g^{ik}(x^p) + g^{im}\frac{\partial \xi^k}{\partial x^m} + g^{km}\frac{\partial \xi^i}{\partial x^m} - \xi^m\frac{\partial g^{ik}}{\partial x^m}. \tag{12} \]

But the difference $\delta g^{ik}(x^p) = g'^{ik}(x^p) - g^{ik}(x^p)$ has tensorial character and can be rewritten as

\[ \delta g^{ik}(x^p) = \xi^{i;k} + \xi^{k;i} \tag{13} \]

in terms of the contravariant derivatives of $\xi^i$. When

\[ \xi^{i;k} + \xi^{k;i} = 0 \tag{14} \]

the metric tensor $g^{ik}$ goes into itself under Lie’s “Mitschleppen” [10]. An infinitesimal Killing vector is a four-vector $\xi^i$ that fulfills (14).
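As an illustration of (12) and (14) in the simplest case, the sketch below checks numerically that the infinitesimal generators of the Poincaré group leave the Minkowski metric (1) unchanged. It is a minimal sketch under stated assumptions: Galilean coordinates are used, so that the metric is constant and the covariant derivatives in (14) reduce to partial derivatives; the helper names, the random seed and the sample event are hypothetical choices. A dilation is included as a counterexample that is not a Killing vector.

```python
import random

# Diagonal of the Minkowski metric (1) in Galilean coordinates (x, y, z, t);
# the contravariant components eta^{ik} have the same values.
ETA = [-1.0, -1.0, -1.0, 1.0]


def jacobian(xi, x, h=1e-5):
    """Central-difference estimate of d xi^i / d x^m at the event x."""
    jac = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        fp, fm = xi(xp), xi(xm)
        for i in range(4):
            jac[i][m] = (fp[i] - fm[i]) / (2.0 * h)
    return jac


def delta_g(xi, x):
    """Right-hand side of (12) minus g^{ik} for the constant metric (1):
    eta^{im} d_m xi^k + eta^{km} d_m xi^i."""
    jac = jacobian(xi, x)
    return [[ETA[i] * jac[k][i] + ETA[k] * jac[i][k] for k in range(4)]
            for i in range(4)]


def poincare_generator(a, A):
    """xi^i = a^i + omega^i_k x^k, where omega^{ik} = A[i][k] is antisymmetric
    and the second index is lowered with the metric (1)."""
    def xi(x):
        return [a[i] + sum(A[i][k] * ETA[k] * x[k] for k in range(4))
                for i in range(4)]
    return xi


def max_abs(matrix):
    return max(abs(v) for row in matrix for v in row)


rng = random.Random(0)  # fixed seed, arbitrary choice
A = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    for k in range(i + 1, 4):
        A[i][k] = rng.uniform(-1.0, 1.0)
        A[k][i] = -A[i][k]
a = [rng.uniform(-1.0, 1.0) for _ in range(4)]
event = [0.3, -1.2, 0.7, 2.0]

killing_residual = max_abs(delta_g(poincare_generator(a, A), event))
dilation_residual = max_abs(delta_g(lambda x: list(x), event))
print(killing_residual, dilation_residual)
```

The first residual vanishes up to rounding for any translation vector and any antisymmetric matrix, while the dilation $\xi^i = x^i$ gives $\delta g^{ik} = 2\eta^{ik}$ and is therefore not a Killing vector.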

References

[1] Einstein, A., (1915). Sitzungsber. Preuss. Akad. Wiss., Phys. Math. Kl., 844 (submitted 25 Nov. 1915).
[2] Hilbert, D., (1915). Nachr. Ges. Wiss. Göttingen, Math. Phys. Kl., 395 (submitted 20 Nov. 1915).
[3] Einstein, A., (1916). Annalen der Physik 49, 769.
[4] Kretschmann, E., (1917). Annalen der Physik 53, 575.
[5] Einstein, A., (1918). Annalen der Physik 55, 241.
[6] Frank, Ph., (1917). Jahrbuch Forts. Math. 46, 1292.
[7] Landau, L. and Lifshits, E., (1970). Théorie des champs, Éditions Mir, Moscou.
[8] Klein, F., (1910). Jhrber. d. d. Math. Vereinig. 19, 287.
[9] Kretschmann, E., (1915). Annalen der Physik 48, 907.
[10] Schouten, J.A., (1954). Ricci-calculus; an introduction to tensor analysis and its geometrical applications, Springer, Berlin.
[11] Klein, F., (1917). Nachr. Ges. Wiss. Göttingen, Math. Phys. Kl., 469.
[12] Noether, E., (1918). Nachr. Ges. Wiss. Göttingen, Math. Phys. Kl., 235.
[13] Schwarzschild, K., (1916). Sitzungsber. Preuss. Akad. Wiss., Phys. Math. Kl., 189.
[14] Hilbert, D., (1917). Nachr. Ges. Wiss. Göttingen, Math. Phys. Kl., 53.
[15] Antoci, S., and Liebscher, D.-E., (2003). Gen. Relativ. Gravit. 35, 945.
[16] Antoci, S., and Liebscher, D.-E., (2006). General Relativity Research Trends, Albert Reimer ed., pp. 177-213, Nova Science Publishers, New York. See also: http://arxiv.org/abs/gr-qc/0406090.
[17] Kruskal, M.D., (1960). Phys. Rev. 119, 1743.
[18] Szekeres, G., (1960). Publ. Math. Debrecen 7, 285.
[19] Kerr, R.P., (1963). Phys. Rev. Lett. 11, 237.
[20] Newman, E.T., Couch, E., Chinnapared, K., Exton, A., Prakash, A., and Torrence, R., (1965). J. Math. Phys. 6, 918.
[21] Reissner, H., (1916). Annalen der Physik 50, 106.
[22] Nordström, G., (1918). Proc. R. Acad. Amsterdam 20, 1238.
[23] Bondi, H., (1957). Rev. Mod. Phys. 29, 423.
[24] Bonnor, W.B., and Swaminarayan, N.S., (1964). Zeits. f. Phys. 177, 240.
[25] Israel, W., and Khan, K.A., (1964). Nuovo Cim. 33, 331.
[26] Bonnor, W.B., (1966). Wiss. Zeits. Jena (Math-Nat. Reihe) 15, 71.
[27] Bičák, J., (1968). Proc. Roy. Soc. A 302, 201.
[28] Bičák, J., Hoenselaers, C., and Schmidt, B.G., (1983). Proc. Roy. Soc. Lond. A 390, 397, 411.
[29] Bičák, J., and Schmidt, B.G., (1984). J. Math. Phys. 25, 600.
[30] Bonnor, W.B., (1983). Gen. Rel. Grav. 15, 535.
[31] Bonnor, W.B., (1988). Gen. Rel. Grav. 20, 607.
[32] Bičák, J., and Schmidt, B.G., (1989). Phys. Rev. D 40, 1827.
[33] Antoci, S., Liebscher, D.-E., and Mihich, L., (2006). Gen. Rel. Grav. 38, 15. See also: http://arxiv.org/abs/gr-qc/0412102.
[34] Geroch, R., (1968). J. Math. Phys. 9, 450.
[35] Geroch, R., (1968). Annals of Physics 48, 526.
[36] Schmidt, B.G., (1971). Gen. Rel. Grav. 1, 269.
[37] Geroch, R., Kronheimer, E.H., and Penrose, R., (1972). Proc. R. Soc. Lond. A 327, 545.
[38] Ellis, G.F.R., and Schmidt, B.G., (1977). Gen. Rel. Grav. 8, 915.
[39] Thorpe, J.A., (1977). J. Math. Phys. 18, 960.
[40] Geroch, R., Liang Can-bin, and Wald, R.M., (1982). J. Math. Phys. 23, 432.
[41] Scott, Susan M., and Szekeres, P., (1994). J. Geom. Phys. 13, 223.
[42] Synge, J.L., (1950). Proc. R. Irish Acad. 53A, 83.
[43] Rindler, W., (2001). Relativity, special, general and cosmological, Oxford University Press, Oxford, pp. 265-267.
[44] Ehlers, J., and Kundt, W., (1964). Gravitation, An Introduction to Current Research, L. Witten ed., Wiley, New York, pp. 49-101.
[45] Whittaker, E.T., (1935). Proc. R. Soc. London A 149, 384.
[46] Rindler, W., (1960). Phys. Rev. 119, 2082.
[47] Antoci, S., Liebscher, D.-E., and Mihich, L., (2001). Class. Quantum Grav. 18, 3463. Also: http://arxiv.org/abs/gr-qc/0104035.
[48] Synge, J.L., (1966). What is Einstein’s Theory of Gravitation?, in: Hoffman, B. (ed.), Essays in Honor of Vaclav Hlavatý, Indiana University Press, Bloomington, p. 7.
[49] Weyl, H., (1917). Annalen der Physik 54, 117.
[50] Levi-Civita, T., (1919). Rend. Acc. dei Lincei 28, 3.



Chapter 11

LONG TIME BEHAVIOUR OF THE WIENER PROCESS ON A PATH GROUP

Rémi Léandre∗
Institut de Mathématiques, Université de Bourgogne, 21000, Dijon, France.

Abstract

We show that the law of the Wiener process on a path group tends, as t → ∞, to the Haar distribution on the path group.

1. Introduction

Let us recall the definition of a positive Borelian measure. Let Ω be a separable metric space. Let F be its Borelian σ-algebra. A positive Borelian measure is a map µ from F into R+ ∪ {∞} such that:

-) If A and B are two elements of F with an empty intersection,

\[ \mu(A \cup B) = \mu(A) + \mu(B). \tag{1} \]

-) If $A_n$ is an increasing family of F,

\[ \mu(\cup A_n) = \lim_{n \to \infty} \mu(A_n). \tag{2} \]

-) If A is relatively compact,

\[ \mu(A) < \infty. \tag{3} \]

Let $\tilde{G}$ be a separable topological group. The Haar measure on it is the unique positive measure $\tilde{\mu}$, if it exists, such that for all $\tilde{g} \in \tilde{G}$ and for all Borelian $\tilde{A}$,

\[ \tilde{\mu}(\tilde{g}\tilde{A}) = \tilde{\mu}(\tilde{A}\tilde{g}) = \tilde{\mu}(\tilde{A}). \tag{4} \]

It is a classical result of measure theory that the Haar measure exists on a topological group if and only if the group is locally compact. In such a case the Haar measure is unique, modulo a normalization.

∗ [email protected]


Let us consider a compact connected $d$-dimensional Lie group $G$. We can suppose without loss of generality that it is a subgroup of $SO(n)$, the group of special orthogonal transformations of the Euclidean space $\mathbb{R}^n$. We make this remark only to simplify the exposition. The Haar probability measure is denoted $dg$. Let us consider the Brownian motion $t \to B_t$, $t \ge 0$, on Lie($G$), starting from 0. Its law is characterized by the following two facts, for all $t > s$:

-i) $B_t - B_s$ is independent of the family of all $B_{s'}$, $s' \le s$.
-ii) $B_t - B_s$ has the law of a Gaussian variable on Lie($G$) of mean 0 and covariance $(t - s)\mathrm{Id}$.

By the Kolmogorov lemma ([14], [16]), $t \to B_t$ has a continuous version. We can introduce the linear stochastic differential equation in the Stratonovitch sense ([8], [16])
$$dx_t(g) = x_t(g)\,dB_t\,; \quad x_0(g) = g \tag{5}$$

In order to understand it, we can consider its Wong-Zakai approximation ([8]). Let us consider the polygonal approximation $B^n_t$ of the driving Brownian motion with mesh $1/n$. We consider the ordinary random differential equation:
$$dx^n_t(g) = x^n_t(g)\,dB^n_t\,; \quad x^n_0(g) = g \tag{6}$$
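A minimal sketch of the polygonal scheme, assuming $G = SO(3)$ and the identification of Lie($G$) with $\mathbb{R}^3$ through the usual hat map (both choices are ours, made only for illustration). On each mesh interval the polygonal path $B^n$ has constant velocity, so (6) is solved there by right multiplication with the matrix exponential of the increment. The names `hat`, `expm_so3` and `wong_zakai_endpoint` are hypothetical.

```python
import math
import random

def hat(v):
    """Map v in R^3 to the corresponding antisymmetric matrix of so(3)."""
    a, b, c = v
    return [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def matmul(x, y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm_so3(v):
    """Rodrigues formula: exp(hat(v)) = I + (sin r / r) K + ((1 - cos r) / r^2) K^2."""
    r = math.sqrt(sum(c * c for c in v))
    K = hat(v)
    K2 = matmul(K, K)
    if r < 1e-8:
        s, c = 1.0, 0.5  # limits of the two coefficients as r -> 0
    else:
        s, c = math.sin(r) / r, (1.0 - math.cos(r)) / (r * r)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + c * K2[i][j]
             for j in range(3)] for i in range(3)]

def wong_zakai_endpoint(g, n, steps, rng):
    """Solution of (6) at time steps/n for the polygonal path with mesh 1/n."""
    x = g
    for _ in range(steps):
        # Brownian increment over one mesh interval: covariance (1/n) Id on R^3
        # (assumed normalization of the metric on the Lie algebra).
        increment = [rng.gauss(0.0, math.sqrt(1.0 / n)) for _ in range(3)]
        x = matmul(x, expm_so3(increment))
    return x

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = wong_zakai_endpoint(identity, n=100, steps=100, rng=random.Random(0))
```

Because each factor is an exact rotation, the approximation stays in the group at every mesh point, which is one practical reason to read the Stratonovitch equation (5) through this limit.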

The Wong-Zakai theorem states that $x^n_t(g)$ tends to $x_t(g)$ in all the $L^p$. The process $t \to x_t(g)$ is called the Brownian motion on $G$. It defines a semi-group on $C^\infty(G)$. If $f \in C^\infty(G)$,
$$P_t[f](g) = E[f(x_t(g))] \tag{7}$$

$P_t$ satisfies the following properties:
$$P_t \circ P_s[f](g) = P_{t+s}[f](g) \tag{8}$$
$$\lim_{t \to 0} \frac{P_t[f](g) - f(g)}{t} = \frac{1}{2} L f(g) \tag{9}$$
where $L$ is the canonical Laplacian on $G$:
$$L = \sum \partial_i^2 \tag{10}$$

where $\partial_i$ is an orthonormal basis of the Lie algebra of $G$ (the elements of the Lie algebra are left invariant vector fields on $G$). The Haar probability measure is the unique invariant probability measure for the Brownian motion on $G$. This means that
$$\int_G P_t[f](g)\,dg = \int_G f(g)\,dg \tag{11}$$
Moreover, if $f$ is smooth ([8]),
$$\lim_{t \to \infty} P_t[f](g) = \int_G f(g)\,dg \tag{12}$$
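As an illustration only, the properties (8), (11) and (12) can be checked by hand on the simplest compact group, the circle $G = SO(2)$, which is not treated in the text. Assuming that the Lie algebra is identified with $\mathbb{R}$, so that $x_t(\theta) = \theta + B_t$ modulo $2\pi$, the Fourier mode $e^{ik\theta}$ is multiplied by $e^{-k^2 t/2}$ under $P_t$. The helper names below are hypothetical.

```python
import cmath
import math

# A function on the circle is stored as {k: c_k}, meaning
# f(theta) = sum_k c_k e^{i k theta}.
# Assumption (not from the text): G = SO(2), x_t(theta) = theta + B_t mod 2*pi,
# so that E[e^{i k (theta + B_t)}] = e^{i k theta} e^{-k^2 t / 2}.

def heat_semigroup(coeffs, t):
    """Apply P_t: damp the k-th Fourier mode by exp(-k^2 t / 2)."""
    return {k: c * math.exp(-0.5 * k * k * t) for k, c in coeffs.items()}

def evaluate(coeffs, theta):
    """Evaluate f(theta) from its Fourier coefficients."""
    return sum(c * cmath.exp(1j * k * theta) for k, c in coeffs.items())

def haar_integral(coeffs):
    """Integral of f against the Haar probability measure d(theta) / (2 pi)."""
    return coeffs.get(0, 0.0)

# f(theta) = 1 + cos(theta) + 0.5 sin(2 theta)
f = {0: 1.0, 1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}

for t in (0.0, 1.0, 10.0):
    # The value at a fixed point drifts towards the Haar integral of f.
    print(t, evaluate(heat_semigroup(f, t), 1.0).real, haar_integral(f))
```

In this toy case the constant mode is left untouched, which is (11), and every other mode decays exponentially, which gives (12).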

We are motivated in this work by an extension in infinite dimension of these classical results. There are two difficulties to overcome:


-) We have to define a suitable extension of the Haar measure for a non locally compact group. It is provided by the theory of Hida distributions in infinite dimension with a geometrical meaning of Léandre ([10], [11], [13]).
-) We have to define a suitable extension of the Brownian motion to an infinite dimensional Lie group. It is provided by the theory of the Brownian motion on a current group of Airault-Malliavin-Baxendale ([1], [3], [9]).

We consider in this work the simplest case. Let $G$ be a compact connected Lie group and let $C([0,1], G)$ be the set of continuous paths in $G$ starting from $e$. On $C([0,1], G)$ is constructed the Wiener process $t \to \{s \to x_t(s)\}$ starting from the unit path (see [9] for a review of applications of the infinite dimensional Wiener process on a mapping space). Léandre ([11]) has shown that the invariant measure of the semi-group associated to the Wiener process on the path group is the Haar distribution on the path group. The Haar measure on a group exists as a measure if and only if the group is locally compact. So the Haar measure on the path group does not exist as a measure, but as a distribution in the Hida-Streit approach to functional integrals. For a review of geometrical functional integrals in the Hida-Streit approach, we refer to Léandre ([10]).

Let $W.N_{\infty-}$ be some weighted Hida test functional space associated to the group $G$. To $\sigma \in W.N_{\infty-}$ we associate a functional $\Psi(\sigma)$ on the path group. The formal Haar measure $dD$ is defined on such functionals $\Psi(\sigma)$: we get a quantity $\int \Psi(\sigma)\,dD$ which is continuous on $W.N_{\infty-}$. We get the main theorem of this work:

Theorem 1 For any element $\sigma$ of $W.N_{\infty-}$, $E[\Psi(\sigma)(x_t(\cdot))]$ tends, when $t \to \infty$, to $\int_{C([0,1],G)} \Psi(\sigma)\,dD$.

We deduce easily from the main theorem the two following corollaries:

Corollary 2 If $\sigma \in W.N_{\infty-}$ and if $\Psi(\sigma) \ge 0$, then $\int_{C([0,1],G)} \Psi(\sigma)\,dD \ge 0$.

Corollary 3 If $\sigma \in W.N_{\infty-}$ and if $\Psi(\sigma) = 0$, then $\int_{C([0,1],G)} \Psi(\sigma)\,dD = 0$.

We consider the simple example of a path group because this case leads to simple computations. For the same reason, only rough estimates are performed in this work. Readers interested in other points of view on this type of problem can consult [2] and [15].
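Theorem 1 can be made explicit in a toy case that is not treated in the text: the circle group $G = U(1)$ and the one-term series $\sigma = h^1$ with $h^1(g, s) = \mathrm{Re}\,g$. Assuming $x_t(s) = e^{iB_t(s)}$ with $E[B_t(s)^2] = ts$, as (13) and (21) of Section 2 suggest, the functional (40) of Section 4 gives $E[\Psi(\sigma)(x_t(\cdot))] = \int_0^1 e^{-ts/2}\,ds$, while the Haar side $\int_0^1 \int_G \mathrm{Re}\,g\,dg\,ds$ vanishes. The sketch below evaluates the left-hand side by a midpoint rule and compares it with the closed form $(1 - e^{-t/2})/(t/2)$; the function names are hypothetical.

```python
import math

def expected_functional(t, n_nodes=10000):
    """Midpoint-rule value of int_0^1 E[cos(B_t(s))] ds,
    using E[cos(B_t(s))] = exp(-t s / 2) for a centred Gaussian of variance t s."""
    step = 1.0 / n_nodes
    return step * sum(math.exp(-0.5 * t * (j + 0.5) * step) for j in range(n_nodes))

def closed_form(t):
    """Exact value (1 - exp(-t / 2)) / (t / 2) of the same integral, for t > 0."""
    return (1.0 - math.exp(-0.5 * t)) / (0.5 * t)

for t in (1.0, 10.0, 100.0, 1000.0):
    # Both columns decrease towards the Haar value 0 as t grows.
    print(t, expected_functional(t), closed_form(t))
```

Under these assumptions the decay is only of order $2/t$, because the paths are pinned at $x_t(0) = e$ and the values at internal times $s$ close to 0 mix slowly.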

2. The Airault-Malliavin-Baxendale equation

Let us recall the theory of the Airault-Malliavin-Baxendale equation (see [9], chapter 4). Let $H_1$ be a real Hilbert space of maps $S \to h(S)$ from a compact Riemannian manifold $M$ with boundary into $\mathbb{R}$. We make the following hypothesis:

Hypothesis H: There exists a map $(S, S') \to e_S(S')$ such that
-i) $h(S) = \langle h, e_S(\cdot) \rangle_{H_1}$.
-ii) $(S, S') \to e_S(S')$ is Hoelder with Hoelder exponent $\alpha$.


We consider the Brownian motion $t \to B_t(\cdot)$ with values in $H_1$. For all $s < t$, all $S_i$, all $s'_j \le s$ and all $S'_j$, the Gaussian random variable $(B_t(S_1) - B_s(S_1), \ldots, B_t(S_r) - B_s(S_r))$ is independent of the family $B_{s'_j}(S'_j)$. Moreover
$$E[B_t(S)B_t(S')] = t\,e_S(S') \tag{13}$$

Let $B_i(t)$ be a countable family of independent $\mathbb{R}$-valued Brownian motions. Let $h_i(\cdot)$ be an orthonormal basis of $H_1$. We get
$$B_t(S) = \sum B_i(t)\,h_i(S) \tag{14}$$
Namely, we have
$$e_S(S') = \sum h_i(S)\,h_i(S') \tag{15}$$

The series (14) does not converge in $H_1$ but in a convenient Hoelder space, by Hypothesis H and the Kolmogorov lemma ([14], [16]). Let us recall the statement of the Kolmogorov lemma. Let $M$ be a compact Riemannian manifold with boundary. $S$ denotes the generic element of $M$ and $d$ the Riemannian distance. Let $S \to X(S)$ be an $\mathbb{R}^d$-valued process. We suppose that
$$E[|X(S_0)|^p] \le C_p < \infty \tag{16}$$
$$E[|X(S) - X(S')|^p] \le C^1_p\, d(S, S')^{\alpha p} \tag{17}$$

Then $S \to X(S)$ has a version which is $(\alpha - \epsilon)$-Hoelder, and the $L^p$-norms of its Hoelder norms can be estimated in terms of the system of constants $C_p$ and $C^1_p$ only. Let $e_i$, $i = 1, \ldots, d$, be an orthonormal basis of Lie($G$) and $B^i_t(\cdot)$ be $d$ independent Brownian motions with values in $H_1$. We consider the family of Brownian motions with values in $G$ parametrized by $S \in M$:
$$dx_t(S) = \sum_i x_t(S)\,e_i\,dB^i_t(S)\,; \quad x_0(S) = e \tag{18}$$

($e$ is the unit of $G$). The main result concerning the Airault-Malliavin-Baxendale equation (18) is that $S \to x_1(S)$ has almost surely an $(\alpha/2 - \epsilon)$-Hoelder version (see [9], chapter 4). In the sequel, we will choose $M = [0, 1]$ and $H$ the Hilbert space of maps $s \to h(s)$ from $[0,1]$ into Lie($G$) such that
$$\int_0^1 \|h'(s)\|^2\,ds = \|h\|^2 < \infty \tag{19}$$
We have
$$h(s) = \int_0^s h'(u)\,du \tag{20}$$
such that
$$e_s(s') = (s \wedge s') \sum e_i \tag{21}$$

which clearly satisfies Hypothesis H. Therefore we can consider the Brownian motion $(t, s) \to B_t(s)$ with values in $H$. Here $t$ is the dynamical time and $s$ the internal time. We consider the Airault-Malliavin-Baxendale equation on $C([0, 1], G)$, the continuous path space of $G$ (the set of continuous functions from $[0,1]$ into $G$ starting from $e$):
$$dx_t(s) = x_t(s)\,dB_t(s)\,; \quad x_0(s) = e \tag{22}$$
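The identity (15) can be checked numerically for the kernel (21) in the scalar case. The sketch below assumes the orthonormal basis $h_i(s) = \sqrt{2}\,\sin((i - 1/2)\pi s)/((i - 1/2)\pi)$ of the space defined by (19) and (20), a standard choice that the text does not specify, and compares partial sums of $\sum h_i(s)h_i(s')$ with $s \wedge s'$. The function names are ours.

```python
import math

def h(i, s):
    """i-th basis function (i = 1, 2, ...). Its derivative
    sqrt(2) cos((i - 1/2) pi s) belongs to an orthonormal family of
    L^2([0, 1]), and h(i, 0) = 0."""
    w = (i - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(w * s) / w

def kernel_partial_sum(s, sp, n_terms):
    """Partial sum of the series (15) for e_s(s')."""
    return sum(h(i, s) * h(i, sp) for i in range(1, n_terms + 1))

for s, sp in [(0.2, 0.7), (0.5, 0.5), (1.0, 0.3)]:
    # The two last columns should be close to each other.
    print(s, sp, kernel_partial_sum(s, sp, 2000), min(s, sp))
```

The partial sums converge slowly, with an error of order $1/N$ after $N$ terms, which is consistent with the remark that the series (14) converges in a Hoelder space and not in $H_1$.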


3. Finite dimensional approximation

Let us consider a compact connected Lie group $G$ endowed with its normalized Haar probability measure $dg$. On Lie($G$) we consider the biinvariant Killing metric. We consider on $G$ the canonical biinvariant Laplacian $L$. If $e_i$ is an orthonormal basis of Lie($G$), it corresponds to left-invariant first order differential operators $\partial_i$. We can consider the canonical Laplacian $-L = \sum_{i=1}^d \partial_i^2$. The associated semi-group $P_t$ is represented by the stochastic differential equation in the Stratonovitch sense (5), (7).

Let $A = (A_{i,j})$ be a symmetric positive definite matrix on $\mathbb{R}^d$. The self-adjoint operator associated to $A$ is given by
$$-L_A f = \sum_{i,j} A_{i,j}\,\partial_i \partial_j f \tag{23}$$
This is the generator of a diffusion semi-group $P^A_t$. $P^A_t$ has the following stochastic representation. Let $\sqrt{A}$ be a symmetric square root of $A$. Let us consider the following Stratonovitch equation:
$$dx^A_t(g) = x^A_t(g)\sqrt{A}\,dB_t\,; \quad x^A_0(g) = g \tag{24}$$
Then
$$P^A_t[f](g) = E[f(x^A_t(g))] \tag{25}$$
Moreover, the diffusion semi-group is left invariant. This means that
$$P^A_t[f](g) = P^A_t[f(g\,\cdot)](e) \tag{26}$$
because $x^A_t(g) = g\,x^A_t(e)$. The unique invariant probability measure of $P^A_t$ is the Haar probability measure on $G$. If $f$ is smooth,
$$\lim_{t \to \infty} P^A_t[f](g) = \int_G f(g)\,dg \tag{27}$$

We consider a set $I = (0 < s_1 < \cdots < s_n \le 1)$ with $|I| = n$ and $m(I) = \inf(s_{i+1} - s_i)$. If $m(I) \neq 0$, the diffusion $t \to (x_t(s_i))_{s_i \in I} = x^I_t$ constitutes a symmetric left-invariant diffusion on $G^{|I|}$. Associated to this diffusion there are a Laplacian $L^I$ and a semi-group $P^I_t$ of the type considered before, with the Lie group $G^{|I|}$ in place of $G$. $dg^{|I|}$ is the unique invariant probability measure of $P^I_t$, which satisfies (27).
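To see why the condition $m(I) \neq 0$ matters, one can look at the covariance of $(B_t(s_1), \ldots, B_t(s_n))$ at $t = 1$ which, by (13) and (21), is in the scalar case the matrix $A_{i,j} = s_i \wedge s_j$. The sketch below (function names are ours) computes its Cholesky factor, a convenient non-symmetric square root; the text uses a symmetric one in (24). The factor has entries $\sqrt{s_j - s_{j-1}}$, with $s_0 = 0$, so $A$ is invertible exactly when consecutive points are distinct.

```python
import math

def covariance(points):
    """Matrix A[i][j] = min(s_i, s_j) for the internal times s_1 < ... < s_n."""
    return [[min(a, b) for b in points] for a in points]

def cholesky(a):
    """Lower triangular factor with factor * factor^T = a.
    Raises ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return low

points = [0.1, 0.4, 0.5, 1.0]
factor = cholesky(covariance(points))
for row in factor:
    # Each nonzero entry in column j is sqrt(s_j - s_{j-1}).
    print(row)
```

When two points of $I$ coincide, the factorization fails, which reflects the fact that the finite dimensional diffusion is then degenerate.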

4. Haar distribution on a path group

Let us recall what the Hida test functional space is in a simple case ([7]). Let $H_2$ be the Hilbert space of $L^2$ functions $h(\cdot)$ from $\mathbb{R}^+$ into $\mathbb{R}$. We consider the symmetric tensor product $H_2^{\hat{\otimes} n}$ of $H_2$. It can be realized as the set of symmetric maps $h^n$ from $(\mathbb{R}^+)^n$ into $\mathbb{R}$ such that
$$\int_{(\mathbb{R}^+)^n} |h^n(s_1, \ldots, s_n)|^2\,ds_1 \cdots ds_n = \|h^n\|_2^2 < \infty \tag{28}$$


The symmetric Fock space $W.N_0$ coincides with the set of formal series $\sigma = \sum h^n$ such that $\sum n!\,\|h^n\|_2^2 < \infty$. To each $h^n$ we associate the $n$th Wiener chaos
$$\Psi(h^n) = \int_{(\mathbb{R}^+)^n} h^n(s_1, \ldots, s_n)\,\delta B_{s_1} \cdots \delta B_{s_n} \tag{29}$$

if $B_s$ is a standard $\mathbb{R}$-valued Brownian motion with law $dP$. Moreover $E_P[|\Psi(h^n)|^2] = n!\,\|h^n\|_2^2$, and $\Psi(h^n)$ and $\Psi(h^m)$ are orthogonal in $L^2(dP)$ for $n \neq m$. The $L^2$ space of the Brownian motion can be realized as the symmetric Fock space through the isometry $\Psi$.

We introduce the Laplacian $\Delta_+$ on $\mathbb{R}^+$ and we consider the Sobolev space $H_{2,k}$ associated to $(\Delta_+ + I)^k$. On the set of formal series $\sigma = \sum h^n$, we choose a slightly different Hilbert structure:
$$\|\sigma\|^2_{k,C} = \sum n!\,C^n\,\|h^n\|^2_{2,k} < \infty \tag{30}$$
We get another symmetric Fock space denoted $W.N_{k,C}$. The Hida test function space $W.N_{\infty-}$ is the intersection of the $W.N_{k,C}$, $k \ge 1$, $C \ge 1$, endowed with the projective topology. The Wiener chaos map $\Psi$ realizes a map from $W.N_{\infty-}$ into a set of continuous Brownian functionals which is dense in $L^2(dP)$.

In infinite dimensional analysis, there are basically two objects:
-i) An algebraic model.
-ii) A mapping space and a map $\Psi$ from the algebraic model into the space of functionals on this mapping space.

Getzler in his seminal paper [5] was the first to consider a map other than the Wiener chaos map. Getzler was motivated by the heuristic considerations of Atiyah-Bismut-Witten relating the structure of the free loop space of a manifold and the Index theorem on a compact spin manifold. Getzler used a Connes space as algebraic space and the Chen iterated integrals as map $\Psi$. Getzler's idea was developed by Léandre ([10]) to study various path integrals in the Hida-Streit approach with a geometrical meaning. In particular, Léandre ([11], [13]) succeeded in defining the Haar measure as a distribution on a current group. Let us recall its definition quickly. We consider a compact Riemannian manifold $M$ ($S \in M$) and a compact Lie group $G$ ($g \in G$). We consider the current group $C(M, G)$ of continuous maps $S \to g(S)$ from $M$ into $G$.

-i) Construction of the algebraic model. We consider the positive self-adjoint Laplacian $\Delta_{M \times G}$ on $M \times G$. We consider the Sobolev space $H_k$ of maps $h$ from $M \times G$ into $\mathbb{R}$ such that
$$\int_{M \times G} ((\Delta_{M \times G} + 2)^k h)^2\,dS\,dg = \|h\|_k^2 \tag{31}$$
We consider the tensor product $H_k^{\otimes n}$ associated to it and we consider the natural Hilbert norm on it ($dS$ and $dg$ are the normalized Riemannian measures on $M$ and $G$ respectively). $W.N_{k,C}$ is the set of formal series $\sigma = \sum h^n$ such that
$$\sum C^n\,\|h^n\|_k^2 = \|\sigma\|^2_{k,C} < \infty \tag{32}$$
The Hida test functional space is the space $W.N_{\infty-} = \cap W.N_{k,C}$ endowed with the projective topology.



-ii) Construction of the map $\Psi$. To $h^n$ we associate
$$\Psi(h^n)(g(\cdot)) = \int_{M^n} h^n(g(S_1), \ldots, g(S_n), S_1, \ldots, S_n)\,dS_1 \cdots dS_n \tag{33}$$

We put, if $\sigma = \sum h^n$,
$$\Psi(\sigma) = \sum \Psi(h^n) \tag{34}$$

The map $\Psi$ realizes a continuous map from $W.N_{\infty-}$ into the set of continuous functionals on $C(M, G)$.

-iii) Construction of the path integral. We put, if $h^n$ belongs to all the Sobolev Hilbert spaces $H_k$,
$$\int_{C(M,G)} \Psi(h^n)\,dD = \int_{M^n \times G^n} h^n(g_1, \ldots, g_n, S_1, \ldots, S_n)\,dg_1 \cdots dg_n\,dS_1 \cdots dS_n \tag{35}$$

This map can be extended into a continuous linear map from $W.N_{\infty-}$ into $\mathbb{R}$ (we say it is a Hida distribution). This realizes our definition ([10], [11], [13]) of the Haar distribution on the current group $C(M, G)$.

Let $I \in [0, 1]^n$. We consider the normalized Lebesgue measure $d\nu^n$ on $[0, 1]^n$. Let $L_i$ be the $i$th partial Laplacian on $G^n$. We consider the total operator
$$L^n_t = \prod (L_i + 2) \prod \left(-\frac{\partial^2}{\partial s_i^2} + 2\right) \tag{36}$$

which operates on functions $h^n$ on $G^n \times [0, 1]^n$, and we consider its power $(L^n_t)^k$. Let $h^n(g^n, I)$ be a function on $G^n \times [0, 1]^n$. We put
$$\|h^n\|^2_{C,k} = C^n \int_{G^n \times [0,1]^n} |(L^n_t)^k h^n(g^n, I)|^2\,dg^n\,d\nu^n(I) \tag{37}$$

We put
$$\sigma = \sum h^n \tag{38}$$
and we consider the Hilbert norm
$$\|\sigma\|^2_{k,C} = \sum \|h^n\|^2_{C,k} \tag{39}$$

Definition 4 The Hida Fock space $W.N_{\infty-}$ is the space constituted of the $\sigma$ defined above such that, for all $k \in \mathbb{N}$ and $C > 0$, $\|\sigma\|^2_{k,C} < \infty$.

If $\sigma$ belongs to $W.N_{\infty-}$, we associate
$$\Psi(\sigma)(g(\cdot)) = \sum \int_{[0,1]^n} h^n(g(s_1), \ldots, g(s_n), I)\,d\nu^n(I) \tag{40}$$

where $s \to g(s)$ belongs to $C([0, 1], G)$. We have a result analogous to that of [13], but we repeat the proof to be self-contained.

Theorem 5 If $\sigma \in W.N_{\infty-}$, $\Psi(\sigma)$ is a continuous bounded function on $C([0, 1], G)$.


Proof: Let $f^k$ be an orthonormal basis of eigenvectors for the operator $L + 2$ on $G$, associated to the eigenvalues $\alpha_k$. $g^k$ denotes the trigonometric functions on $[0,1]$, conveniently normalized, except for $g^0(t) = t$ (or $g^0(t) = 1 - t$) (still conveniently normalized), which are associated to the eigenvalues $\nu_k$ of the operator $-\frac{\partial^2}{\partial s^2} + 2$ on $[0, 1]$. Let $K = (k_1, \ldots, k_n, k'_1, \ldots, k'_n)$, $|K| = n$. We introduce
$$h_K(g^n, I) = \prod f^{k_i}(g_i) \prod g^{k'_i}(s_i) \tag{41}$$
We have:
$$(\|h^n\|_{C,k})^2 = C^n \int_{G^n \times [0,1]^n} |(L^n_t)^k h^n(g^n, I)|^2\,dg^n\,d\nu^n(I) \tag{42}$$
on the associated Hida Fock space $W.N_{\infty-}$. If
$$h^n = \sum_{|K| = n} \lambda_K h_K, \tag{43}$$
the fact that $\sigma = \sum h^n$ belongs to the Hida Fock space means that, for all $C > 0$ and all $k$,
$$\sum_K C^{|K|} \lambda_K^2 \int_{G^{|K|} \times [0,1]^{|K|}} |(L^{|K|}_t)^k h_K(g^{|K|}, I)|^2\,dg^{|K|}\,d\nu^{|K|} < \infty \tag{44}$$
But
$$(L^{|K|}_t)^k h_K = \alpha_K^k h_K \tag{45}$$
where $\alpha_K$ denotes the product of the eigenvalues considered. Therefore (44) reads:
$$\sum_K C^{|K|} \lambda_K^2 \alpha_K^{2k} < \infty \tag{46}$$
for all $C > 0$ and all $k$. By the Sobolev imbedding theorem ([6]), the supremum norm of $h_K$ can be estimated by $C^{|K|} \alpha_K^{k_0}$. It remains to show that
$$\sum_K \alpha_K^{k_0} |\lambda_K| C^{|K|} < \infty \tag{47}$$
as soon as (46) is checked. This comes from the fact that there exist a small $C > 0$ and a big $k$ such that $\sum \prod 1 \ldots$
