Signal Processing for Magnetic Resonance Imaging and Spectroscopy
edited by
Hong Yan University of Sydney Sydney, Aust...
92 downloads
1213 Views
9MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Signal Processing for Magnetic Resonance Imaging and Spectroscopy
edited by
Hong Yan University of Sydney Sydney, Australia
Marcel Dekker, Inc.
New York • Basel
TM
Copyright © 2002 by Marcel Dekker, Inc. All Rights Reserved.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
ISBN: 0-8247-0653-6 This book is printed on acid-free paper. Headquarters Marcel Dekker, Inc. 270 Madison Avenue, New York, NY 10016 tel: 212-696-9000; fax: 212-685-4540 Eastern Hemisphere Distribution Marcel Dekker AG Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland tel: 41-61-261-8482; fax: 41-61-261-8896 World Wide Web http://www.dekker.com The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above. Copyright 䉷 2002 by Marcel Dekker, Inc. All Rights Reserved. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher. Current printing (last digit): 10 9 8 7 6 5 4 3
2
1
PRINTED IN THE UNITED STATES OF AMERICA
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Series Introduction Over the past 50 years, digital signal processing has evolved as a major engineering discipline. The fields of signal processing have grown from the origin of fast Fourier transform and digital filter design to statistical spectral analysis and array processing, image, audio, and multimedia processing, and shaped developments in highperformance VLSI signal processor design. Indeed, there are few fields that enjoy so many applications—signal processing is everywhere in our lives. When one uses a cellular phone, the voice is compressed, coded, and modulated using signal processing techniques. As a cruise missile winds along hillsides searching for the target, the signal processor is busy processing the images taken along the way. When we are watching a movie in HDTV, millions of audio and video data are being sent to our homes and received with unbelievable fidelity. When scientists compare DNA samples, fast pattern recognition techniques are being used. On and on, one can see the impact of signal processing in almost every engineering and scientific discipline. Because of the immense importance of signal processing and the fast-growing demands of business and industry, this series on signal processing serves to report up-to-date developments and advances in the field. The topics of interest include but are not limited to the following: · · · · · · ·
Signal theory and analysis Statistical signal processing Speech and audio processing Image and video processing Multimedia signal processing and technology Signal processing for communications Signal processing architectures and VLSI design
v
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
vi
SERIES INTRODUCTION
We hope this series will provide the interested audience with high-quality, state-of-the-art signal processing literature through research monographs, edited books, and rigorously written textbooks by experts in their fields.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Signal Processing and Communications Editorial Board Maurice G. Ballanger, Conservatoire National des Arts et Métiers (CNAM), Paris Ezio Biglieri, Politecnico di Torino, Italy Sadaoki Furui, Tokyo Institute of Technology Yih-Fang Huang, University of Notre Dame Nikhil Jayant, Georgia Tech University Aggelos K. Katsaggelos, Northwestern University Mos Kaveh, University of Minnesota P. K. Raja Rajasekaran, Texas Instruments John Aasted Sorenson, IT University of Copenhagen
1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12.
Digital Signal Processing for Multimedia Systems, edited by Keshab K. Parhi and Takao Nishitani Multimedia Systems, Standards, and Networks, edited by Atul Puri and Tsuhan Chen Embedded Multiprocessors: Scheduling and Synchronization, Sundararajan Sriram and Shuvra S. Bhattacharyya Signal Processing for Intelligent Sensor Systems, David C. Swanson Compressed Video over Networks, edited by Ming-Ting Sun and Amy R. Reibman Modulated Coding for Intersymbol Interference Channels, Xiang-Gen Xia Digital Speech Processing, Synthesis, and Recognition: Second Edition, Revised and Expanded, Sadaoki Furui Modern Digital Halftoning, Daniel L. Lau and Gonzalo R. Arce Blind Equalization and Identification, Zhi Ding and Ye (Geoffrey) Li Video Coding for Wireless Communication Systems, King N. Ngan, Chi W. Yap, and Keng T. Tan Adaptive Digital Filters: Second Edition, Revised and Expanded, Maurice G. Bellanger Design of Digital Video Coding Systems, Jie Chen, Ut-Va Koc, and K. J. Ray Liu
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
13. 14. 15. 16.
Programmable Digital Signal Processors: Architecture, Programming, and Applications, edited by Yu Hen Hu Pattern Recognition and Image Preprocessing: Second Edition, Revised and Expanded, Sing-Tze Bow Signal Processing for Magnetic Resonance Imaging and Spectroscopy, edited by Hong Yan Satellite Communication Engineering, Michael O. Kolawole
Additional Volumes in Preparation
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Preface
Magnetic resonance imaging (MRI) and magnetic resonance spectroscopy (MRS) have gained widespread use for medical diagnosis in recent years. The potential also exists for magnetic resonance to be useful in building quantum computers, which may have the capacity of being billions of times faster than classical computers. An important step in the use of MRI and MRS is the processing of detected signals. Typical tasks include image reconstruction from complete or incomplete k-space data, removal of artifacts, segmentation of an image into homogeneous regions, analysis of the shape and motion of the heart, processing and visualization of functional MR images, characterization of brain tissues, and estimation of the parameters of a spectroscopic signal. This book provides a review of prevalent signal processing algorithms for solving these problems and reports the latest developments in the field. The book is divided into three parts. Part I discusses algorithms for image reconstruction from MR signals, removal of image distortions and artifacts, and image visualization. Chapter 1 reviews data sampling requirements in the k-space and provides basic mathematical formulations of image reconstruction problems. Chapter 2 presents algorithms for the reconstruction of wavelet transform coefficients directly from the Randon transform domain. The algorithms can be useful for image enhancement, feature extraction, and the study of local tomography. Chapter 3 describes the convolution regridding method for image reconstruction from data obtained on v Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
vi
Preface
a nonuniform grid in the k-space. Chapter 4 develops a technique based on a dynamic finite element mesh (DMESH) model to analyze object motion and distortions in MRI. Chapter 5 presents a method based on the projection onto convex sets (POCS) algorithm to reduce the artifacts caused by unwanted motion of an object during data acquisition. Chapter 6 provides an overview of several tagged MRI methods for fast imaging of the heart motion. Chapter 7 presents a cortical flattening method for the visualization of functional MRI (fMRI) data and a k-space correlation method for the analysis of the data based on both magnitude and phase information. Part II covers techniques for the extraction of meaningful regions in MR images, as well as the analysis of shapes, geometric features, and dynamic information. Chapter 8 presents an image segmentation algorithm based on a multiscale linking model, which can provide the gross shape information at a higher scale and subvoxel accuracy at a lower scale. Chapter 9 proposes a method for cortical surface segmentation by deforming a cellular object initialized on the volume of the brain toward the interior of the deep cortical convolutions. Chapter 10 reviews feature-based image analysis methods including registration, segmentation, and supervised and unsupervised classification. Chapter 11 presents an fMRI segmentation method that takes voxel connectivity information into account and a method for activity detection based on anisotropic diffusion. Chapter 12 describes image segmentation techniques based on neural network models including learning vector quantization (LVQ), the multilayer perceptron, and self-organizing feature maps (SOFM). Chapter 13 introduces an approach to image segmentation based on stochastic pixel and context models. Chapter 14 presents a frequency analysis-based method and a local principal component analysis (PCA) based method for brain activation signal detection in fMRI. Chapter 15 provides an overview of 2-D and 3-D image analysis methods for measuring the deformation of the heart in tagged images. Part III discusses methods for spectral estimation and analysis of MRS signals and their applications. Chapters 16 and 17 provide an overview of one-dimensional and multidimensional spectroscopic signal processing methods, respectively. The linear prediction, the state-space model, and the maximum entropy based methods described in these chapters can overcome several disadvantages of the conventional Fourier transform-based method and provide improved spectral quantification. Chapter 18 reviews several spectroscopic imaging methods that can provide fast data acquisition and improve image quality compared to the traditional Fourier transform-based chemical shift imaging method. Chapter 19 reviews techniques for brain tissue characterization and tumor identification by analysis of the MR spectra based on statistical pattern recognition algorithms. Chapter 20 describes a wavelet-pocket based algorithm for analysis of metabolic peak parameters
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Preface
vii
in MR spectra. Chapter 21 presents the Crame´r–Rao theory for analyzing the lower bounds of statistical errors in parameter estimation in spectroscopic signal processing. Error bounds can be used to determine the quality of an input signal or the performance of a parameter estimation procedure. Sophisticated signal processing methods can serve as useful tools for a cost-effective and timely solution to real-world problems. This book presents a number of ready-to-use algorithms, as well as new research results and directions in MR signal processing. Its intended audience includes researchers, graduate students, and senior undergraduate students working on MRI- and MRS in radiology, physics, chemistry, electrical engineering, and computer science, and hardware and software engineers working on MRIand MRS-related products and services. Readers of this book will be able to utilize the techniques presented to solve the problems at hand and find inspiration and encouragement to develop new or improved methods for MRI and MRS. I would like to thank the authors for their contributions to this book. I am grateful to Mrs. Inge Rogers for her clerical work and proofreading and to several referees for providing technical comments on the chapters. Most of the editing was done when I was on leave in Hong Kong from the University of Sydney and support from a research grant from the Faculty of Science and Engineering of the City University of Hong Kong was greatly appreciated. Finally, I would like to thank the editors at Marcel Dekker, Inc., for their help during the writing of this book. Hong Yan
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Contents
Series Introduction Preface Contributors
K. J. Ray Liu
iii v xiii
PART I. IMAGE RECONSTRUCTION AND RESTORATION 1. Introduction to Image Reconstruction Zhi-Pei Liang, Jim Ji, and E. Mark Haacke 2. Wavelet-Based Multiresolution Local Tomography F. Rashid-Farrokhi and K. J. R. Liu 3. The Point Spread Function of Convolution Regridding Reconstruction Gordon E. Sarty 4. Mapping Motion and Strain with MRI Yudong Zhu
1
25
59
91
ix Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
x
Contents
5. Rotational Motion Artifact Correction Based on Fuzzy Projection onto Convex Sets Chaminda Weerasinghe, Lilian Ji, and Hong Yan 6. Tagged MR Cardiac Imaging Nikolaos V. Tsekos and Amir A. Amini 7. Functional MR Image Visualization and Signal Processing Methods Alex R. Wade, Brian A. Wandell, and Thomas P. Burg PART II.
125
167
189
IMAGE SEGMENTATION AND ANALYSIS
8. Multiscale Segmentation of Volumetric MR Brain Images Wiro J. Niessen, Koen L. Vincken, Joachim Weickert, and Max A. Viergever 9. A Precise Segmentation of the Cerebral Cortex from 3-D MRI Using a Cellular Model and Homotopic Deformations Yann Cointepas, Isabelle Bloch, and Line Garnero 10. Feature Space Analysis of MRI Hamid Soltanian-Zadeh
209
239
255
11. Geometric Approaches for Segmentation and Signal Detection in Functional MRI Analysis Guillermo Sapiro
317
12. MR Image Segmentation and Analysis Based on Neural Networks Javad Alirezaie and M. E. Jernigan
341
13. Stochastic Model Based Image Analysis Yue Wang and Tu¨lay Adaı
365
14. Functional MR Image Analysis Shang-Hong Lai and Ming Fang
401
15. Tagged MR Image Analysis Amir A. Amini and Yasheng Chen
429
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Contents
PART III.
xi
SPECTROSCOPIC SIGNAL PROCESSING
16. Time-Domain Spectroscopic Quantitation Leentje Vanhamme, Sabine Van Huffel, and Paul Van Hecke
475
17. Multidimensional NMR Spectroscopic Signal Processing Guang Zhu and Yingbo Hua
509
18. Advanced Methods in Spectroscopic Imaging Keith A. Wear
545
19. Characterization of Brain Tissue from MR Spectra for Tumor Discrimination Paulo J. G. Lisboa, Wael El-Deredy, Y. Y. Barbara Lee, Yangxin Huang, Angelica R. Corona Hernandez, Peter Harris, and Carles Aru´s 20. Wavelet Packets Algorithm for Metabolite Quantification in Magnetic Resonance Spectroscopy and Chemical Shift Imaging Luca T. Mainardi, Sergio Cerutti, Daniela Origgi, and Giuseppe Scotti 21. Crame´r-Rao Bound Analysis of Spectroscopic Signal Processing Methods Sophie Cavassila, Dirk van Ormondt, and Danielle Graveron-Demilly
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
569
589
613
Contributors
Tu¨lay Adalı Ph.D. Associate Professor, Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, Maryland Javad Alirezaie, Ph.D. Assistant Professor, Electrical and Computer Engineering, Ryerson University, Toronto, Ontario, Canada Amir A. Amini, Ph.D. Associate Professor and Director, Biomedical Engineering, Medicine, and Radiology, Washington University School of Medicine in St. Louis, St. Louis, Missouri Carles Aru´s, Ph.D. Assistant Professor, Department of Biochemistry and Molecular Biology, Universitat Auto`noma de Barcelona, Barcelona, Spain Isabelle Bloch, Ph.D., H.D.R. Professor, Signal and Image Processing, Ecole Nationale Supe´rieure des Te´le´communications, Paris, France Thomas P. Burg, Dipl.Phys. Physicist, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts Sophie Cavassila, Ph.D. Assistant Professor, CNRS UMR, Universite´ Claude Bernard, Lyon 1, Villeurbanne, France xiii Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
xiv
Contributors
Sergio Cerutti, M.D. Professor, Department of Biomedical Engineering, Polytechnic University, Milan, Italy Yasheng Chen, B.Sc., M.Sc. Department of Biomedical Engineering, Washington University School of Medicine in St. Louis, St. Louis, Missouri Yann Cointepas, Ph.D. Signal and Image Processing, Ecole Nationale Supe´rieure des Te´le´communications, Paris, France Wael El-Deredy, Ph.D. School of Computing and Mathematical Sciences, John Moores University, Liverpool, United Kingdom Ming Fang, Ph.D. Project Manager, Imaging and Visualization, Siemens Corporate Research, Princeton, New Jersey Line Garnero, Ph.D. Director of Research, Cognitive Neuroscience and Cerebral Imaging, Hoˆpital La Salpeˆtrie`re, Paris, France Danielle Graveron-Demilly, Ph.D. Professor, CNRS UMR, Universite´ Claude Bernard Lyon I, Villeurbanne, France E. Mark Haacke, Ph.D. Director, The Magnetic Resonance Imaging Institute for Biomedical Research, St. Louis, Missouri Peter Harris, B.Sc., Ph.D. School of Computing and Mathematical Sciences, John Moores University, Liverpool, United Kingdom Angelica R. Corona Hernandez, B.Eng., M.Sc. School of Computing and Mathematical Sciences, John Moores University, Liverpool, United Kingdom Yangxin Huang, Ph.D. Biostatistician, Frontier Science and Technology Research, Harvard School of Public Health, Chestnut Hill, Massachusetts Yingbo Hua, Ph.D. Professor, Department of Electrical Engineering, University of California, Riverside, California M. E. Jernigan, Ph.D. Professor, Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada Jim Ji Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Contributors
xv
Lilian Ji, M.E. School of Electrical and Information Engineering, University of Sydney, Sydney, Australia Shang-Hong Lai, Ph.D. Professor, Department of Computer Science, National Tsing-Hua University, Hsinchu City, Taiwan Y. Y. Barbara Lee, B.Sc., Ph.D. School of Computing and Mathematical Sciences, John Moores University, Liverpool, United Kingdom Zhi-Pei Liang, Ph.D. Professor, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois Paulo J. G. Lisboa, B.Sc., Ph.D. Professor, School of Computing and Mathematical Sciences, John Moores University, Liverpool, United Kingdom K. J. Ray Liu, Ph.D. Professor, Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland Luca T. Mainardi, Ph.D. Professor, Department of Biomedical Engineering, Polytechnic University, Milan, Italy Wiro J. Niessen, Ph.D. Image Sciences Institute, University Medical Center, Utrecht, The Netherlands Daniela Origgi, Ph.D. Milan, Italy
Medical Physics, European Institute of Oncology,
F. Rashid-Farrokhi, Ph.D. Wireless Research Department, Lucent Technologies, Holmdel, New Jersey Guillermo Sapiro, Ph.D. Associate Professor, Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, Minnesota Gordon E. Sarty, Ph.D. Assistant Professor, Department of Psychology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada Giuseppe Scotti, M.D. Professor, Department of Neuroradiology, San Raffaele Hospital, Milan, Italy Hamid Soltanian-Zadeh, Ph.D. Senior Staff Scientist, Department of Radiology, Henry Ford Health System, Detroit, Michigan and Associate Pro-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
xvi
Contributors
fessor, Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran Nikolaos V. Tsekos, Ph.D. Assistant Professor, Radiology, Washington University School of Medicine in St. Louis, St. Louis, Missouri Leentje Vanhamme, Ph.D. Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium Paul Van Hecke, Ph.D. Professor, Biomedical NMR Unit, Katholieke Universiteit Leuven, Leuven, Belgium Sabine Van Huffel, Ph.D., M.D. Professor, Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium Dirk van Ormondt, Ph.D. Associate Professor, Applied Physics, Delft University of Technology, Delft, The Netherlands Max A. Viergever, Ph.D. Professor, Image Sciences Institute, University Medical Center, Utrecht, The Netherlands Koen L. Vincken, Ph.D. Image Sciences Institute, University Medical Center, Utrecht, The Netherlands Alex R. Wade, Ph.D. Stanford, California
Department of Psychology, Stanford University,
Brian A. Wandell, Ph.D. Professor, Department of Psychology, Stanford University, Stanford, California Yue Wang, Ph.D. Associate Professor, Electrical Engineering and Computer Science, The Catholic University of America, Washington, D.C. Keith A. Wear, Ph.D. Research Physicist, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Rockville, Maryland Chaminda Weerasinghe, Ph.D. Research Engineer, School of Electrical and Information Engineering, University of Sydney, Sydney, Australia Joachim Weickert, Ph.D. Mathematics and Computer Science, University of Mannheim, Mannheim, Germany
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Contributors
xvii
Hong Yan, Ph.D. Professor, School of Electrical and Information Engineering, University of Sydney, Sydney, Australia Guang Zhu, Ph.D. Associate Professor, Department of Biochemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China Yudong Zhu, Ph.D. Senior Scientist, Corporate Research and Development, General Electric, Schenectady, New York
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
1 Introduction to Image Reconstruction Zhi-Pei Liang and Jim Ji University of Illinois at Urbana-Champaign, Urbana, Illinois
E. Mark Haacke Magnetic Resonance Imaging Institute for Biomedical Research, St. Louis, Missouri
1 1.1
INTRODUCTION Basic Imaging Equation
In MRI, the desired image function I(⭈), representing spin density distributions weighted by relaxation effects, diffusion effects, etc., can be encoded in the measured data S(⭈) in a variety of ways [1–4]. This chapter focuses on the popular Fourier imaging scheme1 [4,5], in which I(⭈) is related to S(⭈) by the Fourier integral →
S( k ) = Ᏺ{I} =
冕
→
→
I( →r )e⫺i2 k ⭈ r d →r
(1)
1
We consider back-projection imaging [1,6] as a special case of Fourier imaging. Several nonFourier imaging schemes exist, including the Hadamard transform method [7], the wavelet transform based methods [8–12], and SVD (singular value decomposition) based methods [13–16], which are not covered in this chapter.
1
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
2
Liang et al. →
→
In the two-dimensional case considered here, k = (kx , ky), r = (x, y), and the imaging equation can be written explicitly as S(kx , ky) =
1.2
冕冕 ⬁
⬁
⫺⬁
⫺⬁
I(x, y)e⫺i2(kx x⫹ky y) dx dy
(2)
Sampling Requirements of k-Space Signals
In practice, S(kx , ky) is collected at a discrete set of k-space points. Two popular data collection schemes, known as rectilinear sampling and polar sampling, are shown in Fig. 1. It is clear from the figure that
再
kx = m ⌬kx ky = n ⌬ky
(3)
for rectilinear sampling, and
再
kx = m ⌬k cos(n ⌬) ky = m ⌬k sin(n ⌬)
(4)
for polar sampling, where m and n take integer values. We next discuss the Nyquist requirement on selecting (⌬kx , ⌬ky) and (⌬k, ⌬). Let us first review some basic definitions and results of standard sampling theory [17–19].
FIGURE 1 Two basic k-space sampling schemes used in MRI experiments: (a) rectilinear sampling and (b) polar sampling.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Introduction to Image Reconstruction
3
Definition 1 A signal g(x) is said to be support limited if there exists a finite W such that g(x) = 0 for 兩x兩 ⱖ W, and 兩x兩 < W is called the spatial support of g(x). Definition 2 A signal g(x) is said to be (frequency) band limited if its frequency spectrum G(k) = Ᏺ{g} is zero for 兩k兩 ⱖ kmax , where kmax is called the frequency bandwidth of the signal. Theorem 1 (Shannon) A band-limited function can be reconstructed perfectly from its sampled values taken uniformly at an interval not exceeding the reciprocal of twice the signal bandwidth. The following results immediately follow from the Shannon sampling theorem. (a) Let g(x) be band-limited to kmax . Then g(x) can be recovered from its sampled values g(n ⌬x) if ⌬x ⱕ
1 2kmax
(5)
Equation (5) is known as the Nyquist sampling criterion. The largest sampling interval permissible for g(x) by the Nyquist criterion is ⌬x = 1/(2kmax), which is called the Nyquist interval. (b) Let g(x) be support limited to 兩x兩 < W. Then its Fourier transform G(k) can be exactly recovered from its sampled values G(n ⌬k) provided that ⌬k ⱕ
1 2W
(6)
(c) A band-limited periodic function can be represented in terms of a finite Fourier series:
冘 N
g(x) =
cn e⫺i2nx/Wx
(7)
n=⫺N
Clearly, g(x) is periodic in x with period Wx and band-limited to N/Wx (or more precisely N/Wx ⫹ ␦ for some ␦ > 0). Since g(x) is uniquely specified by the 2N ⫹ 1 coefficients, it is expected that 2N ⫹ 1 samples taken uniformly within a single period would suffice to reconstruct g(x) uniquely. Therefore the Nyquist sampling criterion for this signal can be stated as ⌬x ⱕ
Wx 2N ⫹ 1
(8)
We next consider the sampling requirements of the two data acquisition schemes shown in Fig. 1. Note that k-space sampling is a multidimensional
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
4
Liang et al.
problem. In practice, however, one treats sampling along each dimension separately, thus reducing it to a one-dimensional problem. Although the resulting sampling pattern is not optimal, it guarantees ‘‘perfect’’ reconstruction of the underlying continuous k-space signal if infinite sampling is used. We therefore adopt this conventional treatment to determine the requirements on (⌬kx , ⌬ky) and (⌬k, ⌬). For rectilinear sampling, it is convenient to assume that the image function is support limited to a central rectangle region of widths Wx and Wy , namely, I(x, y) = 0
for
兩x兩 ⱖ Wx /2, 兩y兩 ⱖ Wy /2
(9)
Then, according to the sampling theorem, we have
再
1 Wx 1 ⌬ky ⱕ Wy ⌬kx ⱕ
(10)
For polar sampling, the following two standard assumptions are often invoked [20–22]: (a)
Support-limitedness (Fig. 2a): I(x, y) = 0
for
兹x 2 ⫹ y 2 ⱖ Rx
(11)
FIGURE 2 (a) An object is bounded by a circle of radius Rx . (b) Polar sampling of k-space.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Introduction to Image Reconstruction
(b)
5
Frequency band-limitedness (Fig. 2b): S(kx , ky) = Ᏺ{I} = 0
for
兹k x2 ⫹ k y2 ⱖ Rk
(12)
Although the first assumption is practically valid, the second assumption is only an approximation since I(x, y) cannot be both space and frequency limited simultaneously. These assumptions are, however, necessary to enable sampling along the -direction. With assumption (a), it is easy to show that ⌬k ⱕ
1 2Rx
(13)
Determining the sampling requirement along the -direction is more complicated. Let Sp (k, ) = S(k cos , k sin )
(14)
Clearly, Sp (k, ) is periodic in for a given k and can be expressed in terms of the Fourier series as
冘
cn (k)e⫺in
(15)
冕
Sp (k, )e in d
(16)
⬁
Sp (k, ) =
n=⫺⬁
where cn (k) =
1 2
⫺
The largest angular sampling interval ⌬ allowed for Sp (k, ) is determined by the number of significant terms in the Fourier series in Eq. (15). For a circularly symmetric object, Sp (k, ) is a constant over and the series will have only a dc term. In general, the number of significant terms increases with 兩k兩. A well-known result concerning this problem is derived in Ref. 20, which states that Sp (k, ) is approximately band limited to Rx 2 兩k兩 ⫹ 1 with respect to . In other words, the Fourier series coefficients cn (k) in Eq. (15) are not significant for 兩n兩 > [Rx 2 兩k兩] ⫹ 1, where the brackets represent rounding Rx 2 兩k兩 to the next higher integer. Based on the assumption of frequency-limitedness stated in Eq. (12), cn can be ignored for any measured k values if 兩n兩 > [2Rx Rk] ⫹ 1. Therefore, according to the earlier results concerning sampling of band-limited periodic functions, the angular sampling interval that satisfies the Nyquist criterion for all of the sampled k values is given by ⌬ ⱕ
2 2([2Rx Rk] ⫹ 1) ⫹ 1
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(17)
6
1.3
Liang et al.
Feasible Reconstructions
The sampling theorem states that when (⌬kx , ⌬ky) or (⌬k, ⌬) satisfies the Nyquist criterion, S(kx , ky) can be recovered from its sampled values and therefore I(x, y) can be reconstructed exactly. In practice, finite coverage of k-space is used such that S(kx , ky) is available for (kx , ky) 僆 , where is a finite point set. Without loss of generality, we may assume that =
再
(m ⌬kx , n ⌬ky)兩 ⫺
冎
M M N N ⱕm< ,⫺ ⱕn< 2 2 2 2
(18)
for rectilinear sampling, and =
再
(m ⌬k cos n ⌬, m ⌬k sin n ⌬)兩
⫺
冎
M M N N ⱕm< ,⫺ ⱕn< 2 2 2 2
(19)
for polar sampling. The image reconstruction problem can then be formally stated as Given find
S(kx , ky)
(kx , ky) 僆
for
I(x, y)
(20)
Because is a finite point set, the reconstruction of I(x, y) is not unique under the data-consistency constraint. In this case, it is useful to introduce the concept of feasible reconstructions. Definition 3 An arbitrary function Iˆ(x, y) is said to be a feasible reconstruction of I(x, y) if Iˆ(x, y) is consistent with the measured data according to the imaging equation Eq. (2). In other words, ˆI(x, y) is a feasible reconstruction if ˆ y)} = S(kx , ky) Ᏺ{I(x, Remark 1
for
(kx , ky) 僆
(21)
The true image function I(x, y) is a feasible reconstruction.
Remark 2 If Iˆ(x, y) is a feasible reconstruction for the reconstruction problem in Eq. (20), then Iˆ(x, y) ⫹ e i 2 (p⌬kx⫹q⌬ky)⌸(x⌬kx , y⌬ky) is also a feasible reconstruction for 兩p兩 > M/2 or 兩q兩 > N/2, where ⌸(x, y) is a two-dimensional rectangular window function defined as ⌸(x, y) =
再
1
兩x兩 < 1/2
0
otherwise
and
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
兩y兩 < 1/2
(22)
Introduction to Image Reconstruction
7
Remark 2 is readily understood noting that
冕 冕 1/2⌬kx
⫺(1/2⌬kx)
1/2⌬ky
e⫺i2 [(m⫺p)⌬kx⫹(n⫺q)⌬ky] dx dy = 0
⫺(1/2⌬ky)
if m ≠ p
n≠q
or
(23)
The above two remarks indicate that for a given set of measured data, a feasible reconstruction always exists. However, in the case of finite sampling, the feasible reconstruction is not unique, and therefore additional criteria need to be applied to select an image function from the many feasible ones. 2
RECONSTRUCTION TECHNIQUES
This section describes several conventional reconstruction techniques. Advanced methods incorporating a priori constraints into the reconstruction process are beyond the scope of the chapter. The interested reader is referred to Refs. 23–33. 2.1 2.1.1
Reconstruction from Rectilinear k-Space Data Reconstruction Formulas
For rectilinear sampling, we can simplify the reconstruction problem to a one-dimensional problem based on the separability of the multidimensional Fourier transform [4,19]. In this case, we can rewrite the imaging equation as
冕
⬁
Sn = S(n ⌬k) =
N N ⫺ ⱕn< 2 2
I(x)e⫺i2n⌬kx dx
⫺⬁
(24)
The following formula (commonly known as the Poisson sum formula) governs how to reconstruct I(x) from Sn :
冘 ⬁
Sn e
i2n⌬kx
n=⫺⬁
1 = ⌬k
冘冉 ⬁
I
x⫺
n=⫺⬁
冊
n ⌬k
(25)
The left-hand side of Eq. (25) can be viewed as a Fourier series, with ⌬k being the fundamental frequency and Sn being the series coefficient of the nth harmonic. The right-hand side is a periodic extension of I(x) with period 1/⌬k. Equation (25) can be proven as follows. First, the following equality holds in a distribution sense:
冘 ⬁
n=⫺⬁
e i2n⌬kx =
1 ⌬k
冘 冉 ⬁
␦ x⫺
n=⫺⬁
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
n ⌬k
冊
(26)
8
Liang et al.
Next, based on Eq. (24) we have
冘 ⬁
冘 冋冕 册 冘冕 冕 冘 冕 冘 冉 冊 冘冕 冉 冊 冘冉 冊 ⬁
⬁
Sn e
i2n⌬kx
=
n=⫺⬁
I(xˆ)e⫺i2n⌬kxˆ dxˆ
⫺⬁
n=⫺⬁
=
⬁
⬁
n=⫺⬁
⫺⬁
I(xˆ)e i2n⌬k(x⫺xˆ) dxˆ
⬁
=
⫺⬁
= =
⬁
e i2n⌬k(x⫺xˆ) dxˆ
I(xˆ)
⫺⬁ ⬁
=
e i2n⌬kx
1 ⌬k 1 ⌬k
n=⫺⬁
1 I(xˆ) ⌬k
⬁
␦ x ⫺ xˆ ⫺
n=⫺⬁
⬁
⬁
I(xˆ)␦
x ⫺ xˆ ⫺
⫺⬁
n=⫺⬁ ⬁
I
x⫺
n=⫺⬁
n ⌬k
n ⌬k
dxˆ
dxˆ
n ⌬k
which proves Eq. (25). Equation (25) suggests a reconstruction formula for I(x). Assume that I(x) is support limited to 兩x兩 < Wx /2 and that ⌬k satisfies the Nyquist criterion, namely, ⌬k ⱕ
1 Wx
(27)
Then there are no overlaps among the different components on the righthand side of Eq. (25). Consequently, the true image function I(x) can be expressed in terms of Sn as
冘 ⬁
I(x) = ⌬k
Sn ei2n⌬kx
兩x兩
A. The argument for s < ⫺A is the same. Since Hf(s) = lim ε→0
1
冕
兩t兩>ε
f(s ⫺ t) dt t
since f(t) = 0 outside [⫺A, A], and since s > A, Hf(s) =
1
冕
s⫹A
s⫺A
f(s ⫺ t) dt t
Fixing s, and expanding 1/t in a Taylor series about t = s gives for some ts 僆 [s ⫺ A, s ⫹ A], 1 Hf(s) = 1 = =
1
冕 冘冕 冕
冋冘
s⫹A
N
f (s ⫺ t)
s⫺A N
A
k=0 ⫺A s⫹A
k=0
1 f (t)t dt ⫹ 2i k
册
(s ⫺ t)k (s ⫺ t) N⫹1 ⫹ dt k⫹1 s t sN⫹2
冕
s⫹A
t s⫺N⫺2 f (s ⫺ t) N⫹1 dt
s⫺A
t ⫺N⫺2 f (s ⫺ t)(s ⫺ t) N⫹1 dt s
s⫺A
Since ts 僆 [s ⫺ A, s ⫹ A], 兩ts 兩⫺N⫺2 ⱕ 兩s ⫺ A兩⫺N⫺2, so that 兩Hf (s)兩 ⱕ
1
兩s ⫺ A兩N⫹2
冕
A
兩 f (t)t N⫹1兩 dt
⫺A
▫
The significance of this observation for local tomography is the following. If (t) is the wavelet corresponding to the scaling function (t) for a Mul-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
38
Rashid-Farrokhi and Liu
tiresolution Analysis, then at least the 0th moment of must vanish. It is possible to design wavelets that have compact support and that have many vanishing moments. In this case, the functions H⭸R ⌿ i(t), where ⌿ i, i = 1, 2, 3 are given by Eq. (10), will have very rapid decay for each . Numerically, even for wavelets with a few vanishing moments, the essential support of H⭸R ⌿ i(t) is the same as the support of R ⌿ i(t) for each . This means that, by Eq. (14), the discrete wavelet coefficients (18) can be computed locally using essentially local projections. Rapid decay after ramp filtering is also observed in scaling functions (t) provided that has vanishing moments [14]. Specifically, if (t) satisfies 兰 (t) dt = 1 and 兰 t j (t) dt = 0 for j = 1, 2, . . . , N, then ⭸ satisfies 兰 (t) dt = 0, 兰 t (t) dt = 1, and 兰 t j (t) dt = 0 for j = 2, 3, . . . , N ⫹ 1. Therefore as in Lemma 1, it follows that 兩H⭸ (s)兩 ⱕ
1 1 2 ⫹ s 兩s ⫺ A兩 N⫹3
冕
兩⭸ (t)t N⫹2兩 dt
Even though the decay is dominated by the s⫺2 term, ramp-filtered scaling functions with vanishing moments display significantly less relative energy leakage outside the support of the scaling function than those without vanishing moments. In order to quantify this locality phenomenon, we define the spread of a function f with respect to an interval I under ramp filtering to be the normalized energy of the function (兩 兩 ˆf())∨(t) outside I, i.e., with I¯ denoting the complement of I,
spread( f; I) =
冉冕 冉冕
冊 冊
1/2
兩(兩 兩 fˆ ( ))∨(t)兩 2 dt
¯I
⬁
1/2
兩(兩 兩 fˆ ())∨(t)兩 2 dt
⫺⬁
The rapid decay of the ramp-filtered scaling functions is related to the number of vanishing moments of the scaling function. Orthonormal wavelets corresponding to scaling functions with vanishing moments have been called ‘‘coiflets’’ by Daubechies in [6, Sec. 8.2]. For coiflets with 1 and 3 vanishing moments, supported on the intervals [0, 5], and [0, 11], respectively, we have measured spreads with respect to these intervals of 0.016 and 0.013, respectively. These scaling functions correspond to scaling filters with 6 and 12 taps, respectively. Daubechies has also observed [6, Sec. 8.3.5] that the symmetric biorthogonal bases constructed in Ref. 18 are numerically very close to coiflets. For the biorthogonal ‘‘near-coiflet’’ scaling functions supported on the intervals [0, 4], [0, 8], and [0, 12], we have measured spreads
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
39
with respect to these intervals of 0.029, 0.016, and 0.0095 respectively. These scaling functions correspond to scaling filters with 5, 9, and 13 taps, respectively. It is most desirable to minimize both the spread of the scaling function and the number of taps in the corresponding filter. Under these criteria, the near-coiflet filter with 5 taps is near optimal [see Fig. 3(a) and (c), and Fig. 4(a)] and is therefore used in these simulations. The measured spreads for various compactly supported wavelet and scaling functions are given in Table 1. Even if g is replaced by the scaling function given by Eq. (8), H⭸R g has essentially the same support as R g for each . Figure 3 shows the Daubechies biorthogonal wavelet and scaling function (Table 3
FIGURE 3 Wavelet with less dissimilar lengths, l = k = k˜ = 4; (a) scaling function; (b) wavelet basis; (c) ramp-filtered scaling function; (d) rampfiltered wavelet basis.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
40 TABLE 1
Rashid-Farrokhi and Liu Spread of Wavelet and Scaling Functions Wavelet spread
Scaling spread
Filter
Coefficients
Support
Haar
1 1 0.50000000000000 1.00000000000000 0.50000000000000 0.25000000000000 0.75000000000000 0.75000000000000 0.25000000000000 0.12500000000000 0.50000000000000 0.75000000000000 0.50000000000000 0.12500000000000 0.06250000000000 0.31250000000000 0.62500000000000 0.62500000000000 0.31250000000000 0.06250000000000 0.68301270189222 1.18301270189222 0.31698729810778 ⫺0.18301270189222 0.47046720778416 1.14111691583144 0.65036500052623 ⫺0.19093441556833 ⫺0.12083220831040 0.04981749973688 0.32580342805100 1.01094571509000 0.89220013842700 ⫺0.03957026356000 ⫺0.26450716736900 0.04361630047420 0.04650360107100 ⫺0.01498698933040
[0,1]
0.3837
0.6900
[0,2]
0.09167
0.3726
[0,3]
0.01691
0.1959
[0,4]
0.003767
0.1389
[0,5]
0.0009341
0.1105
[0,3]
0.03391
0.3449
[0,5]
0.005446
0.1929
[0,7]
0.001058
0.1232
Linear spline
Quadratic spline
Cubic spline
Degree 4 spline
Daubechies 4 tap filter
Daubechies 6 tap filter
Daubechies 8 tap filter
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography TABLE 1
41
Continued
Filter Daubechies 10 tap filter
Coiflet with 1 moment vanishing
Coiflet with 3 moments vanishing
Near coiflet (5 taps)
Near coiflet (9 taps)
Wavelet spread
Scaling spread
[0,9]
0.0002376
0.08907
[0,5]
0.0003069
0.01613
0.000006154
0.01307
[0,4]
0.001682
0.02890
[0,8]
0.00005151
0.01632
Coefficients
Support
0.22641898258329 0.85394354270476 1.02432694425952 0.19576696134736 ⫺0.34265671538239 ⫺0.04560113188406 0.10970265864207 ⫺0.00882680010864 ⫺0.01779187010184 0.00471742793840 ⫺0.05142972847100 0.23892972847100 0.60285945694200 0.27214054305800 ⫺0.05142997284700 ⫺0.01107027152900 0.01158759673900 ⫺0.02932013798000 ⫺0.04763959031000 0.27302104653500 0.57468239385700 0.29486719369600 ⫺0.05408560709200 ⫺0.04202648046100 0.01674441016300 0.00396788361300 ⫺0.00128920335600 ⫺0.00050950539900 ⫺0.05000000000000 0.25000000000000 0.60000000000000 0.25000000000000 ⫺0.05000000000000 0.01250000000000 ⫺0.03125000000000 ⫺0.05000000000000 0.28125000000000 0.57500000000000 0.28125000000000 ⫺0.05000000000000 ⫺0.03125000000000 0.01250000000000
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
[0,11]
42 TABLE 1
Rashid-Farrokhi and Liu Continued
Filter Near coiflet (12 taps)
Coefficients
Support
Wavelet spread
Scaling spread
⫺0.00317382812500 0.00585937500000 0.01904296875000 ⫺0.04882812500000 ⫺0.04760742187500 0.29296875000000 0.56347656250000 0.29296875000000 ⫺0.04760742187500 ⫺0.04882812500000 0.01904296875000 0.00585937500000 ⫺0.00317382812500
[0,12]
0.000001515
0.009547
of Ref. 18) as well as the ramp-filtered version of these functions. Observe that the ramp-filtered scaling functions have almost the same essential support as the scaling function itself, but this may not be the case in general and Fig. 4 shows an example.1 Therefore, in order to reconstruct the wavelet and scaling coefficients for some wavelet basis, we only need the projections passing through the region of interest, plus a margin for the support of the wavelet and scaling ramp filters. Moreover, in order to reconstruct the image from the wavelet and scaling coefficients, we have to calculate these coefficients in the region of interest plus a margin for the support of the wavelet reconstruction filters (13). Since the wavelet and scaling ramp filters and also the wavelet reconstruction filters get wider in lower scales, we need to increase the exposure to reconstruct the low-resolution coefficients in the region of interest. Local tomography methods may be categorized into two types: Pseudolocal tomography: reconstruction of a ROI from densely scanned local projection data and sparsely sampled background data True local tomography: reconstruction of a ROI from local data that only passes through the ROI and an arbitrarily small margin 1
In Fig. 4 we have plotted another wavelet and scaling function (Table 6.2 of Ref. 19) and their ramp-filtered versions, for comparison. The scaling function in this basis does spread significantly after ramp filtering.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
43
FIGURE 4 Wavelet with extremal phase and highest number of vanishing moments with length 4; (a) scaling function; (b) wavelet basis; (c) rampfiltered scaling function; (d) ramp-filtered wavelet basis.
Methods using pseudolocal tomography are implemented in Refs. 9 and 10. In practical cases we may require high resolution in a localized region and low resolution in the surrounding area. In this case multiresolution and pseudolocal tomography can be used to reduce radiation exposure. Pseudolocal tomography methods use the fact that the support of the wavelet function does not increase by the Hilbert transform. These methods use the local information to reconstruct the wavelet coefficients and a sparsely sampled global information for the reconstruction of approximation coefficients. As
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
44
Rashid-Farrokhi and Liu
a result, a high-resolution image is reconstructed in the local region of interest and a low-resolution image outside the region of interest. In order to reconstruct a region of radius r0 pixels with bandwidth B, it is sufficient to use projections at N angles evenly spaced from 0 to , such that [11] N ⱖ 0.5B(R ⫹ r0) ⫹ 2 We start from a resolution L with bandwidth BL such that NL is the number of projections needed to reconstruct the image. Then at each successive increase in resolution the number of projections is doubled by adding one projection in between the existing projection. The new projections need to be exposed only for the local region plus a margin for the support of wavelet and scaling filters. This set of projections, as illustrated in Fig. 5, is used to reconstruct the region of interest at high resolution and the rest of the image at low resolution. Methods for local tomography have been implemented in Refs. 14 and 16. In these methods, a local region of interest is reconstructed from purely local data up to a constant. These methods use the fact that for some scaling functions the Hilbert transform has little effect on the support of the function
FIGURE 5
Exposure for different projections.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
45
[14]. In fact, these methods solve the interior problem [26]. The interior problem in even dimensions is not uniquely solvable, since there are nonzero functions that have zero projections on the ROE. Clearly, a true local algorithm will be unable to reconstruct such a function. It has been noted that these functions, which are in the null space of the interior problem, do not vary much well in the ROE [14,26]. Below we calculate the amount of exposure versus the size of ROI in different methods. Let the support of reconstruction filters in the wavelet filter bank be 2rr samples. And also consider an extra margin 2rm samples in the projection domain, and denote the radius of the region of interest by ri . The radius of the region of exposure is re = ri ⫹ rm ⫹ rr pixels. The amount of exposure in local algorithms in Ref. 14 normalized to the full exposure is given by rr ⫹ rm ⫹ ri R The amount of exposure in pseudolocal algorithms with rm ⫹ rr = 12 pixels and rm ⫹ rr = 22 pixels is plotted in Fig. 6. In the Delaney and Bresler algorithm [10] the exposure is given by
FIGURE 6
Exposure percentage versus the size of the region of interest.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
46
Rashid-Farrokhi and Liu
冘 冉 L
2⫺L ⫹
2⫺q⫹1
q=1
冊
rr ⫹ rm ⫹ ri R
where L is the number of levels in the wavelet filter bank. Similar exposure is required in DeStefano and Olson’s algorithm [9]. Figure 6 shows the relative amount of exposure versus the size of the region of interest in a 256 ⫻ 256 image for rm ⫹ rr = 12 pixels for these methods. Also the amount of exposure for the algorithm proposed in Ref. 13 is plotted for comparison. All the exposures in Fig. 6 are divided by 2 if we use interlaced sampling.
5
IMPLEMENTATION
In local reconstruction, artifacts are common close to the boundary of the region of exposure. These artifacts cause large errors in the reconstructed image, at the borders of the region of exposure. Natterer [26] has suggested that extrapolation of the data outside the region of interest be carried out using a minimum norm approach that reduces the artifacts (cf. Fig. 6 VI.8 in Ref. 26). Here, using a simpler approach, we have extrapolated the projections continuously to be constant on the missing projections. The extrapolation scheme is the same even when the region of exposure is not centered. Let the region of exposure, which is the subset of projections on which R f is given, be a circle of radius re whose center is located at polar coordinates (r, 0), i.e., ROE:{s:s 僆 [r cos( ⫺ 0) ⫺ re , r cos( ⫺ 0) ⫹ re]}
(20)
We use the constant extrapolation (R )local (s) =
再
R (s) R (r cos( ⫺ 0) ⫹ re)
if if
e 僆 ROE s 僆 [r cos( ⫺ 0) ⫹ re , ⫹⬁)
R (r cos( ⫺ 0) ⫺ re)
if
s 僆 (⫺⬁, r cos( ⫺ 0) ⫺ re] (21)
We assume the support of an image is a disc of radius R, and the radius of the region of interest is ri . A region of radius re = ri ⫹ rm ⫹ rr is exposed, where rm and rr are respectively the extra margin due to the support of the decomposition filters in the projection domain, and the reconstruction filters in the image domain. Suppose the projections are sampled at N evenly spaced angles. Below we summarize the algorithm to calculate the wavelet and scaling coefficients at resolution ⫺1. These steps can be easily generalized to the reconstruction at any resolution.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
1.
2.
3.
4.
47
The region of exposure of each projection is filtered by modified wavelet filters (19), at N angles. The complexity of this part, using FFT, is (9/2)Nre log re . The bandwidth of the projections is reduced by half, after filtering with modified scaling filters. Hence we use N/2 of the projections at evenly spaced angles. These projections are extrapolated to 4re pixels, using Eq. (21), and are then filtered by modified scaling filters. The complexity of the filtering part using FFT is 3N(4re)log 4re . The filtered projections obtained in step 1 and 2 are back-projected to every other point, using Eq. (16), to obtain the approximation (17) and detail (18) coefficients at resolution 2⫺1. The remaining points are set to zero. The complexity of this part, using linear interpolation, is (7N/2)(ri ⫹ 2rr)2. The image is reconstructed from the wavelet and scaling coefficients by Eq. (12). The complexity of filtering is 4(2ri)2(3rr).
The wavelet and scaling coefficients of the 256 ⫻ 256 pixel image of the Shepp–Logan head phantom using global data are shown in Fig. 7. In this decomposition we use the Daubechies biorthogonal basis (Table III of Ref. 18). Figure 8 shows an example in which a centered disk of radius 16 pixels is reconstructed using the local wavelet reconstruction method proposed in Ref. 14. Figures 8(c) and 8(d) show the magnification of the ROI using both standard filtered back-projection using global data and local re-
FIGURE 7 (a) Wavelet coefficients; (b) the reconstruction from wavelet coefficients.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
48
Rashid-Farrokhi and Liu
FIGURE 8 Local wavelet reconstruction; (a) wavelet coefficients; (b) the reconstruction from wavelet coefficients; magnification of the region of interest; (c) reconstruction using a wavelet method (local data); (d) reconstruction using a standard filtered back-projection method (global data).
construction for comparison. In this example the projections are collected from a disk of radius 28 pixels, therefore the amount of exposure is 22% of the conventional filtered back-projection method. There is a constant bias in the reconstructed image that is natural in the interior reconstruction problem [26,28]. In the above example the mean square error between the original image and the locally reconstructed image after removing bias is computed
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
49
FIGURE 9 (a) Wavelet coefficients; (b) the reconstruction from wavelet coefficients.
over the region of interest.2 The error energy in the reconstructed image is close to that of the filtered back-projection method using full exposure data. The local reconstruction method is applied to the real data obtained from a CT scanner. In the local reconstruction, even with 12 pixels extra margin, the reconstructed image has the same quality as that of the filtered back-projection method in the local region. Figure 9 shows a 1024 ⫻ 1024 scan of the heart reconstructed from projections sampled at 720 angles over 180 degrees with each projection consisting of 1024 samples covering a recon diameter of 47.5 cm. Using the algorithm proposed in Ref. 14, a local centered region of radius 128 pixels of this scan has been reconstructed by using only 27% of exposure. The reconstruction in the region of interest is as good as that which can be obtained using the filtered back-projection method, which involves global data and 100% exposure. The magnification of the region of interest reconstructed by the local method and global standard filter back-projection is shown in Figs. 10(c) and (d) for comparison [14].
2
The mean square error is calculated by 1 N
冘
[ f (n, m) ⫺ fˆ (n, m)]2
(n,m):(n,m)僆ROI
where f is the original image, ˆf is the reconstructed image with the constant bias removed, and N is the number of pixels in the ROI.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
50
Rashid-Farrokhi and Liu
FIGURE 10 Local wavelet reconstruction; (a) wavelet coefficients; (b) the reconstruction from wavelet coefficients; magnification of the region of interest; (c) reconstruction using a wavelet method (local data); (d) reconstruction using a standard filtered back-projection method (global data).
In the following section we present the wavelet reconstruction of an image from fan-beam data. 6
WAVELET RECONSTRUCTION IN FAN-BEAM GEOMETRY
Practical scanners use fan-beam geometry because of its fast scanning ability. The rearrangement of ray sums from all fan-beam-generated projections
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
51
can create a new set of projections with the same characterization as those generated by parallel scanners. The rearranged projections can then be reconstructed into cross-sectional images by the application of parallel beam reconstruction schemes. One of the major deficiencies of these methods, known as rebinning algorithms, is the requirement to execute a two-dimensional interpolation of ray sums in both the angular dimension and between measured ray sums. As a consequence, the image quality in these rebinning algorithms is a problem due to the interpolation process involved. Furthermore, the sampling schemes in the existing local tomography schemes are well adapted to parallel geometry, which cannot be used for fan-beam data. In Ref. 15 an algorithm was proposed to reconstruct the wavelet coefficients of an image directly from fan-beam data. The proposed algorithm uses a set of angle-dependent wavelet based filters to filter and resample nonuniformly spaced ray sums, which lead to a set of uniformly spaced samples of filtered projections. These projections can be applied to standard filtered back-projection and even the parallel beam direct Fourier reconstruction methods. Figure 11 shows a projection in fan-beam geometry. The angle ␥ gives the location of a ray within a fan,  is the source angle, and D is distance from the source to the center of the scanner. A function f can be reconstructed from its fan-beam projections R (␥) (see Ref. 30)
冕 冕 2
f(x, y) =
␥m
1 L2
0
R (␥)h⬘(␥ ⬘ ⫺ ␥)D cos ␥ d␥ d
(22)
⫺␥m
where L2(r, x, y) = [x ⫹ D sin  ]2 ⫹ [ y ⫺ D cos  ]2 1 h⬘(␥) = 2
冉 冊 ␥ sin ␥
2
h(␥)
h(␥) is the standard ramp filter, and
␥ ⬘ = tan⫺1
x cos  ⫹ y sin  D ⫹ x sin  ⫺ y cos 
Similar to the parallel beam case, we can show that the wavelet coefficients can be reconstructed from fan-beam projections R (␥) by [15] W2 j (g; f )(x, y) 1 = 2
冕冕 2
0
␥m
R (␥)h⫹␥ ,2 j (L sin(␥ ⬘ ⫺ ␥))D cos ␥ d␥ d
(23)
⫺␥m
where h is given by h = H⭸R g˜
(24)
and L and ␥ ⬘ are defined as before. For an N ⫻ N pixel image with M
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
52
Rashid-Farrokhi and Liu
FIGURE 11
A fan-beam projection.
projections and N samples in each projection, the number of operations involved in the inversion formula (23) is proportional to N 3M. In order to reduce the computational complexity of inversion (23), we rearrange the projections to form a pseudoparallel projection structure, which has uniform samples in the direction and nonuniform samples in each projection. Figure 12 shows the fan-beam projection sampling points in the (t, ) plane. In this geometry the projections are sampled at =  ⫹ ␥ and t = D sin ␥. We interpolate the samples in direction using sinc interpolation and then collected the proper values of P (D sin ␥). Starting from Eq. (15), we use the change of variable t = D sin ␥ to obtain the wavelet reconstruction formula from those pseudoparallel projections, i.e., W2 j ( g; f )(x, y)
冕冕 冕冕
=
P (D sin ␥)h , 2 j (x cos ⫹ y sin ⫺ D sin ␥) ⫻ D cos ␥ d␥ d
⫺␥m
0
=
␥m
0
␥m
R⬘ (␥)h , 2 j (x cos ⫹ y sin ⫺ D sin ␥) d␥ d
(25)
⫺␥m
where R⬘( ␥) = P (D sin ␥)D cos ␥. The above reconstruction formula can be separated into a filtering step
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
FIGURE 12
53
The fan-beam sampling points.
冕
␥m
Q ,2 j (g; t) =
R⬘( ␥)h ,2 j (t ⫺ D sin ␥) d␥
⫺␥m
and a back-projection step
冕
W2 j (g; f )(x, y) =
Q ,2 j (g; x cos ⫹ y sin ) d
(26)
0
The filtering is nonlinear and cannot be implemented using FFT. The complexity of this nonlinear filtering is N 2M. Fortunately, the wavelet and scaling filters R g˜ filters have essentially the same support as wavelet and scaling functions, which are compactly supported. Therefore in order to reduce the complexity further, we can separately apply wavelet based filters R g˜ to the nonuniform samples of projections and obtain the filtered projections at a uniform grid by
冕
␥m
P⬘2 j (g; )(t) =
˜ 2 j (t ⫺ D sin ) d␥ R⬘( ␥)R g
(27)
⫺␥m
The complexity of the above prefiltering is kMN 2, where k < 1 is a constant that depends on the number of samples to be considered in filtering. The
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
54
Rashid-Farrokhi and Liu
resulting uniform samples are then processed by a parallel reconstruction algorithm such as filtered back-projection. The filtering is given by
冕
tm
Q ,2 j (g; t) =
P⬘2 j (g; )( )h(t ⫺ ) d
(28)
⫺tm
and the back-projection is defined as in Eq. (26). The filter h(t) in Eq. (28) is the standard ramp filter. In the following we apply the above algorithms to reconstruct wavelet and scaling coefficients from fan beam projections. We use a near coiflets basis with 5 taps [19]. Figure 13 shows the reconstruction of a 256 ⫻ 256 image of a Shepp–Logan head phantom using standard fan-beam reconstruction (22) from projections sampled at 256 angles over 180 degrees, and with each projection sampled at 256 points in the fan angle of 60 degrees. The reconstruction of wavelet coefficients and image using Eq. (23) is shown in Fig. 14. Figure 15 shows the reconstruction using prefiltering (27) and standard filtered back-projection (28) and (26). The quality of both reconstructions is the same as that of the standard fan-beam reconstruction, while in the second algorithm fewer computations are involved. In local tomography simulations we have collected a small amount of nonlocal data in a margin of 12 pixels outside the ROI. In order to reconstruct a local centered disc of radius 32 pixels, we have reduced the fan-beam angle from 60⬚ to 20⬚ to expose the centered region of radius 44 pixels, which reduces the exposure to 34% of full exposure. Using the local reconstruction method proposed in Ref. 15, we have reconstructed the wavelet and scaling coefficients inside a disc of radius 44 pixels as illustrated in Fig. 16(a). Also, in Fig. 16(b) we have shown the reconstruction of an image from these coefficients.
FIGURE 13 (a) The Shepp–Logan head phantom; (b) the standard filtered back-projection in fan-beam geometry.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography
55
FIGURE 14 Reconstruction using (5); (a) wavelet coefficients; (b) the reconstruction from wavelet coefficients.
FIGURE 15 Reconstruction using (7–9); (a) wavelet coefficients; (b) the reconstruction from wavelet coefficients.
FIGURE 16 (a) The local reconstruction of wavelet coefficients; (b) the local reconstruction of image.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
56
7
Rashid-Farrokhi and Liu
CONCLUSION
We have presented algorithms to reconstruct the multiresolution wavelet and scaling coefficients of a function from its Radon transform in parallel-beam or fan-beam geometry. We have observed that for some wavelet bases with sufficiently many zero moments, the scaling and wavelet functions have essentially the same support after ramp filtering. Based on this observation, we have shown how a wavelet based reconstruction scheme, with essentially local data, can be used to reconstruct a local region of a cross section of a body. REFERENCES 1.
2.
3. 4. 5.
6.
7.
8. 9. 10. 11. 12.
13.
D. Walnut. ‘‘Application of Gabor and wavelet expansions to the Radon transform.’’ In: Probabilistic and Stochastic Methods in Analysis, with Applications ( J. Byrnes et al., eds.). Kluwer Academic Publishers, 1992, 187–205. C. Berenstein, D. Walnut. ‘‘Local inversion of the Radon transform in even dimensions using wavelets.’’ In: 75 Years of Radon Transform (S. Gindikin, and P. Michor, eds.). International Press, 1994, 38–58. C. Berenstein, D. Walnut. ‘‘Local inversion of the Radon transform in even dimensions using wavelets.’’ GMU Technical Report, CAM-21/93, Jan. 1993. M. Holschneider. ‘‘Inverse Radon transforms through inverse wavelet transforms.’’ In: Inverse Problems, Vol. 7, 1991, 853–861. G. Kaiser, R. F. Streater. ‘‘Windowed Radon transforms.’’ In: Wavelets: A Tutorial in Theory and Applications (C. K. Chui, ed.). New York: Academic Press, 1992, 399–441. A. Faridani, F. Keinert, F. Natterer, E. L. Ritman, K. T. Smith. ‘‘Local and global tomography.’’ In: Signal Processing, IMA Vol. Math., Appl., Vol. 23. New York: Springer-Verlag, 1990, 241–255. A. Faridani, E. Ritman, K. T. Smith. ‘‘Local tomography.’’ SIAM J. Appl. Math. 1992; 52(2):459–484. ‘‘Examples of local tomography.’’ SIAM J. Appl. Math. 1992; 52(4):1193–1198. A. Faridani, D. Finch, E. L. Ritman, K. T. Smith. ‘‘Local tomography II.’’ SIAM J. Appl. Math. 1997; 57(4):1095–1127. J. DeStefano, T. Olson. ‘‘Wavelet localization of the Radon transform.’’ IEEE Trans. Signal Proc. 1994; 42(8). A. H. Delaney, Y. Bresler. ‘‘Multiresolution tomographic reconstruction using wavelets.’’ IEEE Intern. Conf. Image Proc. 1994; ICIP-94:830–834. A. H. Delaney, Y. Bresler. ‘‘Multiresolution tomographic reconstruction using wavelets.’’ IEEE Trans. Image Proc. 1995; 4(6). J. DeStefano, T. Olson. ‘‘Wavelet localization of the Radon transform.’’ IEEESP Int. Symp. on Time-Freq. and Time-Scale Analysis, Oct. 1992, Victoria, B.C., Canada. T. Olson. ‘‘Optimal time-frequency projections for localized tomography.’’ Annals of Biomedical Engineering 1995; 23:622–636.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Wavelet-Based Multiresolution Local Tomography 14.
15.
16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29.
30.
57
F. Rashid-Farrokhi, K. J. R. Liu, C. A. Berenstein, D. Walnut. ‘‘Wavelet based multiresolution local tomography.’’ IEEE Transactions on Image Processing, Oct. 1997. F. Rashid-Farrokhi, K. J. R. Liu, C. A. Berenstein. ‘‘Local tomography in fanbeam geometry using wavelets.’’ IEEE Int’l Conf. on Image Processing (ICIP96) 1996; II-709–712. F. Peyrin, M. Zaim, G. Goutte. ‘‘Multiscale reconstruction of tomographic images.’’ In: Proc. IEEE-SP Int. Symp. Time-Frequency Time-Scale Anal., 1992. R. R. Coifman, Y. Meyer. ‘‘Remarques sur l’analyse de Fourier ‘a feneˆtre.’’ Seˆrie I. C.R. Acad. Sci. Paris 1991; 312:259–261. M. Antonini, M. Barlaud, P. Mathieu, I. Daubechies. ‘‘Image coding using wavelet transform.’’ IEEE Trans. Image Proc. 1992; 1(2):205–220. I. Daubechies. Ten lectures on wavelets. SIAM-CBMS series. Philadelphia: SIAM, 1992. S. Mallat. ‘‘A theory for multiresolution signal decomposition: the wavelet representation.’’ IEEE Trans. on PAMI 1989; 11(7). K. T. Smith, F. Keinert. ‘‘Mathematical foundation of computed tomography.’’ Applied Optics 1985; 24(23). Y. Zhang, M. Coplan, J. Moore, C. A. Berenstein. ‘‘Computerized tomographic imaging for space plasma physics.’’ J. Appl. Physics 1990; 68. G. Beylkin, R. Coifman, V. Rokhlin. ‘‘Fast wavelet transforms and numerical algorithms.’’ Comm. Pure Appl. Math. 1991; 44:141–183. J. Guedon, M. Unser. ‘‘Least square and spline filtered backprojection.’’ Preprint, March 1994. C. Hamaker, K. T. Smith, D. C. Solomon, S. L. Wagner. ‘‘The divergent beam X-ray transform.’’ Rocky Mountain J. Math. 1980; 10:253–283. F. Natterer. The Mathematics of Computerized Tomography. New York: John Wiley, 1986. P. Maass, ‘‘The interior Radon transform.’’ SIAM J. Appl. Math. 1992; 52(3): 710–724. A. K. Louis, A. Rieder. ‘‘Incomplete data problem in x-ray computerized tomography.’’ Numerische Mathematik 1989; 56. A. I. Katsevich, A. G. Ramm. ‘‘New methods for finding values of the jumps of a function from its local tomography data.’’ Inverse Problems 1995; 11(5): 1005–1024. A. C. Kak, M. Slaney. ‘‘Principles of computerized tomographic imaging.’’ New York: IEEE Press, 1988.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
3 The Point Spread Function of Convolution Regridding Reconstruction Gordon E. Sarty University of Saskatchewan, Saskatoon, Saskatchewan, Canada
1
INTRODUCTION
The standard approach to MRI signal acquisition is to sample a Cartesian grid in the spatial frequency domain, or k-space, of the object being imaged. A straightforward application of the fast Fourier transform (FFT) [1,2] will then convert such data into an image when the spacing of the Cartesian grid is uniform. For a variety of reasons, it is sometimes necessary or desirable to acquire MRI data on a nonuniform grid in k-space. For example, the first nuclear magnetic resonance image was made from data obtained on a polar grid in k-space [3] and subsequently reconstructed by a back-projection method, similar to the methods used to reconstruct tomographic x-ray images [4,5]. More recently, in an attempt to improve on echo-planar imaging (EPI) [6], spiral [7,8], and rosette [9,10] trajectories in k-space were introduced. While it is possible to use projection reconstruction methods for spirals [11] and for a special case of the rosette known as the ROSE (described in detail below), the more general cases require a Fourier transform method for image reconstruction. Such Fourier reconstruction may be done directly, as we will see, but is is usually more computationally efficient first to interpolate the given k-space data onto a uniform Cartesian grid and then apply the FFT to 59
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
60
Sarty
obtain the reconstructed image. The process of interpolating data from the original data grid onto a Cartesian grid is known as regridding or simply gridding. The aim of this chapter is to provide a mathematical overview of regridded Fourier reconstruction in general and of the popular convolution regridding reconstruction in particular. The most important lesson to be learned from this mathematical overview is that the convolution regridding method is intimately connected to the integral (inverse) Fourier transform. Specifically, it will be shown that convolution regridding provides a computationally efficient approximation to a Riemann sum approximation of the inverse integral Fourier transform. Viewing convolution regridding this way also allows for a clear understanding of why the original data need to be weighted before regridding. Understanding the need to weight the data is more difficult when convolution regridding is viewed only from the interpolation point of view. The connection between convolution regridding and the integral Fourier transform was made before [12], but here the consequences are more fully explored through an understanding of the associated point spread function. The MRI signal model used here is given by S(t,  ) =
冕冕
→
→
→ → ⫺2 iK(t,  ) ⭈ p (p)e dp
(1)
Ꮽ
where Ꮽ is the selected slice within the sensitive volume of the radio frequency (RF) receiver coil, represents the transverse spin density in that → slice that we wish to image, and p = (x, y) is a point in that image. The → position K(t,  ) = (Kx (t,  ), Ky (t,  )) of the data point in k-space at a given point in time is proportional to the time integral of the applied magnetic → gradient fields [13,14], G(t,  ) = (Gx (t,  ), Gy (t,  )), as given by
␥ K(t,  ) = 2 →
冕
tI
tL
␥ G(,  ) d ⫹ 2 →
冕
t →
G(,  ) d
(2)
tI
where tL is the time at which the gradients are turned on, tI is the time at the beginning of signal acquisition, and ␥ is the gyromagnetic ratio. The variables t and  represent time and interleaf number, respectively. For the k-space trajectories considered below, t and  also represent the coordinates of a coordinate system that is naturally associated with the trajectories [15]. The pair (t,  ) will therefore be called natural k-space coordinates. The signal model of Eq. (1) is an approximation of the signal that is expected physically. A more accurate model will include contributions from isochromats of varying Larmor frequency. That is, would be a function of
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
61 →
Larmor frequency in addition to it being a function of p via the gradient fields. The signal will also undergo decay due to spin–spin, or T2 , relaxation and to magnetic field inhomogeneities. In practice the effect of relaxation is minor, unless the acquisition time T = tF ⫺ tI is long relative to T2 , and the effects of magnetic field inhomogeneity will need to be corrected as part of the reconstruction process. Algorithms exist for inhomogeneity correction [16–19], but they will not be considered here. The signal model of Eq. (1) therefore allows us to concentrate on the Fourier transform aspects of MRI image reconstruction. The rest of the chapter is organized as follows. First, some examples of non-Cartesian approaches to MRI data acquisition are given, so that the reader has a clear picture of the nature of the given data. Next, a review of direct reconstruction and its connection to the integral Fourier transform is given. Then a review of regridding is given followed by an analysis of the connections between convolution regridding, direct reconstruction, the desired approximation to the integral Fourier transform, and reconstruction point spread functions. Finally, a specific case of the point spread function associated with ROSE sampled data is examined in detail. At the end of the chapter is an appendix giving the mathematical conventions used in this chapter. The appendix is presented also for the purpose of convincing the reader that the apparently formal manipulation of Dirac deltas in the main presentation can be put on a rigorous mathematical footing.
2
ACQUISITION OF k-SPACE DATA ON NON-CARTESIAN GRIDS
In this section the methods for generating the popular spiral and the more recently implemented rosette trajectories in k-space are described in detail. The purpose is to provide specific examples of methods for the acquisition of MRI data on non-Cartesian grids in k-space. Simple Archemedian spiral trajectories in k-space can be generated through the application of the gradient fields Gx (t) =
2 ␥
冉 冋
册
⫺
2At sin T2
冋
Gy (t) =
2 ␥
冉 冋
册
⫹
2At cos T2
冋
2t ⫹ T
A cos T
册冊
(3)
册冊
(4)
2t ⫹ T
and A sin T
2t ⫹ T
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
2t ⫹ T
62
Sarty
where t 僆 [0, T]. The corresponding k-space trajectory is given by Eq. (2) as (Kx (t), Ky (t)) =
At T
冉 冋 cos
册 冋
2t ⫹  , sin T
册冊
2t ⫹ T
(5)
an example of which is shown in Fig. 1(a). Variation of the parameter  serves to rotate the spiral k-space trajectory and leads to interleaved spirals. Interleaved spirals are required when the samples along one trajectory are not dense enough in k-space to permit unaliased image reconstruction. Images having pixel matrices greater than 64 ⫻ 64 generally require multiple interleaved data acquisitions because the physical limitations of gradient field generation (slew rate and maximum amplitude) also limit the number of spiral turns possible during a typical data acquisition time of T = 40 ms. To overcome the physical limitations, spiral trajectories more sophisticaled than the simple Archemedian example given here are also possible [20]. These more sophisticated spirals have been designed to optimize the use of the physically limited gradient field systems and to provide sampling densities that lead to optimal signal-to-noise ratios in the reconstructed images. For the purpose of understanding the Fourier aspects of image reconstruction, however, the simple Archemedian spiral is adequate.
FIGURE 1 (a) A simple Archemedian spiral trajectory in k-space. (b) A ROSE trajectory in k-space. Here  = 0, A = 80 cycles/meter, and = 32 for both trajectories.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
63
For rosette imaging, the applied gradient fields are Gx (t) = ⫺
冉
2 2A T␥
⫹ ( ⫺ 1)sin
Gy (t) =
2 2A T␥
冉
冋
( ⫹ 1)sin
冋
( ⫺ 1)
( ⫹ 1)cos
⫺ ( ⫺ 1)cos
冋
( ⫹ 1)
册冊
册
2t ⫹ T
2t ⫺ T
冋
( ⫹ 1)
册
2t ⫹ T
册冊
2t ⫺ T
( ⫺ 1)
(6)
(7)
resulting in the k-space trajectories (Kx (t), Ky (t)) = A cos
冉 冋
⭈ cos
冋 册 册 冋 2t T
21t ⫹  , sin T
册冊
21t ⫹ T
(8)
By setting 1 = 1 the rosette pattern can be made to follow a largely radial direction that may have some advantages for reducing motion artifacts in reconstructed images. The special case of the rosette trajectory with 1 = 1 is termed the ROSE trajectory (for radially oriented sinusoidal excursions in k-space) [12,15]. An example of a ROSE trajectory is given in Fig. 1(b). Similar to the spiral trajectories, ROSE and rosette trajectories can be rotated through the variation of the parameter  for interleaved data acquisition. When interleaved non-Cartesian acquisitions are appropriately set up, as the spiral and rosette are with the parameter  as introduced above, the acquisition time t and the interleaf parameter  provide a non-Cartesian local coordinate system on the k-plane. Since this coordinate system is naturally associated with the data acquisition scheme, I call the coordinate system the natural k-plane coordinate system [12,15]. The coordinate transformation → from the global Cartesian k-plane coordinates to the natural k-plane coordinates, and its derivative, the Jacobian, can be used to give an approximation of the integral Fourier transform directly in terms of the given data. The approximation is derived in the next section. For simple Archimedean spirals, the natural k-plane coordinates are → given by the components of as
x (t,  ) = A
冉冊 冋 t T
cos
册
2t ⫹ T
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(9)
64
Sarty
y (t,  ) = A
冉冊 冋 t T
sin
2t ⫹ T
册
(10)
where t 僆 (0, T] and  僆 [⫺, ) define the domain of the natural k-plane coordinates relevant to image reconstruction. The coordinates are local and valid only on a disk of radius A in the k-plane, but a mathematically valid extension is possible by letting t 僆 (0, ⬁). In either case, the origin is a singular point of the coordinate system and so must be excluded. The exclusion of a single point in the k-plane, or more generally of a set of Lebesgue measure zero [21] in the k-plane, does not affect the value of the integral Fourier transform and so, for our purposes, that point will not be missed. The Jacobian determinant of the simple Archimedean spiral natural k-plane coordinate system is given by 兩J(t,  )兩 = A2t, which, because→ of symmetry, is independent of . For the case of a more general spiral K(t) → whose radial component 兩K(t)兩 increases monotonically, it can be shown that the Jacobian determinant associated with its natural coordinates is given by → → 兩J(t,  )兩 = 兩K(t)⭈K⬘(t)兩 [22]. The natural k-plane coordinates associated with rosette trajectories in k-space are slightly more difficult to describe than those associated with spiral trajectories. The rosette natural k-plane coordinates provide multiple local coordinate systems on a disk of radius A in k-space. For each coordinate system, the domain of t is obtained by restricting it, so that the image → under (⭈,  ) is one outward-going or one inward-going side of a petal in the rosette pattern. The domain for  (angle) is the same as in the spiral case. With these restrictions, each natural k-plane coordinate system is given by
x (t,  ) = A cos y (t,  ) = A cos
冉 冊 冉 冉 冊 冉
冊 冊
2t T
cos
⫹
21t T
(11)
2t T
sin
⫹
21t T
(12)
The associated Jacobian determinant is given by 兩J(t,  )兩 = 兩A2 cos(t)sin(t)兩. It is independent of both  and 1 due to symmetries. The existence of multiple coordinate systems merely introduces an unimportant constant, through multiple Riemann sums, into the direct reconstruction method presented in the next section. When 1 = 1, the ROSE case, a complete pattern of 2 petals is traced out in time T when is integral and even. When is integral and odd, leaves will be traced and then retraced in time T. The behavior of the rosette trajectory is more complex for other values of and 1 . When is integral for ROSE trajectories, exactly 4 natural k-plane coordinate systems will be defined in time T.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
3
65
DIRECT RECONSTRUCTION AND THE INTEGRAL FOURIER TRANSFORM
Acquired MRI data are always spatial-frequency band limited. In the case of data acquisition on a Cartesian grid, the samples are restricted to a rectangular region Ꮽ in k-space. For spiral and rosette acquisitions, the region Ꮽ is a disk of radius A. Therefore it is possible only to obtain an image reconstruction that is an approximation of a band-limited version of the ideal image. In practice, band-limiting means that the resolution of the reconstructed image will be limited but, pragmatically, it is only necessary to produce an image having a resolution comparable to the dimension of individual pixels in the reconstructed image. Restricting the inverse integral Fourier transform to a region Ꮽ in kspace leads to a band-limited version B of the image given by →
B (p) =
冕冕
→
→ →
→
ˆ (k)e 2 i k ⭈ p dk
(13)
Ꮽ
The resolution in B is limited by the point spread function B which is the inverse Fourier transform of Ꮽ , the characteristic function of the set Ꮽ (see Appendix). More explicitly, B = * B . When Ꮽ is a square, centered on the origin of k-space with sides of length 2A, then
B (x, y) = 4A2
sin(2Ax) sin(2Ay) 2Ax 2Ay
(14)
When Ꮽ is a disk of radius A centered about the origin, then
B (x, y) = A
J1(2A兹x 2 ⫹ y 2) 2 2 兹x ⫹ y
(15)
where J1 is the first-order Bessel function. It will be shown below how the point spread functions of reconstructions from samples of ˆ are approximations of B . → Given a system of natural k-plane coordinates the integral transform of Eq. (13) can be written
冕冕 F
B (x, y) =
I
tF
ˆ (x (t,  ), y (t,  ))
tI
⫻ e 2 i[xx (t, )⫹yy (t, )] 兩J (t,  )兩 dt d →
(16)
where the acquisition time t 僆 [tI , tF ] and the interleaf parameter  僆 [I , F ]. For the spirals and rosettes of Eqs. (5) and (8), tI = 0, tF = T, I = 0, and F = 2. k-space samples from interleaved trajectories that define natural
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
66
Sarty
k-plane coordinates, such as spirals and rosettes, fall on natural k-plane coordinate grids. That is, the data samples, under the model of Eq. (1), are S(tp , q) = ˆ (x (tp , q), y (tp , q))
(17)
where 1 ⱕ p ⱕ P, with P being the number of data samples collected along each k-space trajectory, tp being the sampling time, 1 ⱕ q ⱕ Q, with Q being the number of interleaved acquisitions, and q being the associated interleaf parameter. A Riemann sum approximation of Eq. (16), up to an unimportant constant ⌬t ⌬, can be written in terms of the data samples as P (x, y) =
冘冘 Q
P
q=1
p=1
S(tp , q)e 2 i[xx (tp ,q)⫹yy(tp ,q)] 兩J(tp , q)兩 →
(18)
Equation (18) is a prescription for the direct Fourier reconstruction of MRI k-space data sampled on a non-Cartesian grid. Direct reconstruction is also known as the weighted correlation reconstruction method [23]. Direct reconstruction is not generally used in practice because of the (PQ)2 multiplications and additions required. On a Cartesian grid, 兩J兩 = 1, and the FFT, requiring only on the order of PQ log(PQ) multiplications and additions, can be used to compute efficiently the values of Eq. (18) at PQ specific image points. The difference in computational complexity between the FFT and the direct sum has lead to the development of reconstruction methods where data from non-Cartesian grids are interpolated onto a Cartesian grid so that the FFT may be applied. 4
REGRIDDED RECONSTRUCTION
The first obvious approach to the regridding of data from the acquisition grid in k-space to a uniform Cartesian grid is to use standard interpolation approaches like nearest-neighbor, bilinear interpolation or finite impulse response interpolation based on the truncated sinc function [24–26] (see Appendix). More sophisticated methods, designed to anticipate the underlying continuous function on k-space, such as gradient descent methods [27] or coordinate transformation methods [28], have also been tried. However none of those fairly obvious approaches, first used for the reconstruction of polar sampled x-ray tomography data, give satisfactory image reconstruction. The issue of reconstructing images from Fourier space data acquired on non-Cartesian grids is also relevant for the image reconstruction of long baseline interferometric data in radio astronomy. Radio astronomers in the late 1960s and early 1970s introduced the basis of the currently popular convolution regridding approach now known as gridding. Their first approach was to divide the k-plane into uniform square cells and to assign to
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
67
the Cartesian grid point, at the center of each cell, the sum of the signal values from the acquisition points that fell within that cell [29]. An improvement on the cell approach used the average value of the sampled points within the cell [30] or weighted the sampled points with a function, such as a Gaussian, of the distance from the cell center. A suitable modification of the Gaussian function, leading to a partition of unity, ensured that the sum of the sample weights was one for each cell [31,32]. Weighting the samples turned out to be a crucial component of the regridding process, and highquality reconstructions were the result. Weighting was known to ‘‘taper’’ the point spread function, sharpening and concentrating its central peak [31]. The Nyquist sampling theorem [33] suggests that optimal regridding would be obtained through integral convolution with a sinc function followed by resampling onto a Cartesian grid [34]. Since the sinc function has infinite support, the practical application of integral convolution followed by resampling requires that it be replaced with a finite convolving function. The most popular convolving function is the Kaiser–Bessel function, introduced for the purpose of regridding by Jackson et al. [35]. Conceptually, the first step of the convolution regridding procedure is to convolve the sampled data with the convolving function cˆ. The sampled data are D(kx , ky) = ˆ (kx , ky)ᏹ(kx , ky)
(19)
where ᏹ is given by Eq. (70). Convolving with cˆ gives Dc (kx , ky) =
冕冕 ⬁
⬁
⫺⬁
⫺⬁
D(, )cˆ (kx ⫺ , ky ⫺ ) d d
(20)
With the signal S as given by Eq. (1), evaluating the convolution on a Cartesian grid gives Dc (n⌬kx , m⌬ky) =
冘冘 P
Q
p=1
q=1
S(tq , p)cˆ(n⌬kx ⫺ Kx (tq , p),
m⌬ky ⫺ Ky (tq , p))
(21)
which are easily computed. Choosing a small support for cˆ ensures that the time required to compute the sums and products in Eq. (21) will not be excessive. The smooth nature of D * cˆ also ensures that the evaluation of Dc will be less prone to error than conventional interpolation [36]. Naively, one could apply the FFT to the data of Eq. (21) hoping to obtain a sampled version of c and to recover by dividing by c, the inverse integral Fourier transform of the convolving function cˆ. However, such an approach will not work well. Instead of convolving cˆ with the given samples
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
68
Sarty
directly, the data first need to be weighted as discussed above. Weighting, followed by convolution, followed by sampling on the Cartesian grid, gives Dcw (n⌬kx , m⌬ky) =
冘冘 P
Q
p=1
q=1
S(tq , p)W(tq , p)
⫻ cˆ(n⌬kx ⫺ Kx (tq , p), m⌬ky ⫺ Ky (tq , p))
(22)
where W is the weighting function. Jackson et al. [35] suggest using W = 1/(ᏹ * cˆ). Pipe and Menon [37] improve on the Jackson et al. weighting function through the iteration of the generalized function ᐃi (kx , ky) =
冘冘 P
Q
p=1
q=1
␦(kx ⫺ Kx (tq , p))␦ (ky ⫺ Ky (tq , p))Wi (q, p)
(23)
using the scheme ᐃi⫹1 =
ᐃi ᐃi * ⌿
(24)
where ⌿ is an ordinary function whose Fourier transform has support exactly on an open disk of radius V, the intended image field of view. Note that the integral convolution in the denominator of Eq. (24) is reduced to a sum by the Dirac deltas and that only discrete weights Wi (q, p) are determined as opposed to a weighting function defined on all t and . Iteration is stopped when the values of Ᏹ = (ᐃ * ⌿)ᏹ fall below a predetermined threshold. In terms of a mathematically valid approach to a Riemann sum approximation of Eq. (16), we will see that the natural k-space coordinate Jacobian determinant is a better choice for the weighting function. Voronoi areas have also been widely used for the weighting function [38]. The Voronoi area associated with a point in the sampling grid is defined as the set of points in the plane that are closer to the given sampling point than to all the other sampling points. Intuitively, the weighting function compensates for the nonuniform sampling density because it is the inverse of the sampling density. We will see that the compensation provided by the weighting function is required to make the underlying Riemann sum approximation of the inverse integral Fourier transform more exact. The reconstructed image Pg is finally computed by Pg (x, y) =
IDFT(Dcw)(x, y) c(x, y)
(25)
where the inverse discrete Fourier transform (IDFT) as given by Eq. (65) is computed by the FFT. Division by the function c removes the effect of the integral convolution with cˆ implied in Eq. (21). The computation of the
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
69
function Pg of Eq. (25) is what is popularly known as the gridding reconstruction of the given k-space data. Selection of the convolution function requires some care in order to provide a good trade-off between the time required to compute the regridding and the quality of the reconstructed image. Jackson et al. compared several choices of convolving function cˆ : 1.
The ‘‘two-term’’ cosine: cˆ (kx , ky) =
冋
冋
⭈ ␣ ⫹ (1 ⫺ ␣)cos
2.
冉 冊册 冉 冊册
␣ ⫹ (1 ⫺ ␣)cos 2ky L
冋
冉 冊 冉 冊册 冉 冊 冉 冊册
␣ ⫹  cos
⫹ (1 ⫺ ␣ ⫺  )cos
冋
2kx L 4kx L
2ky L 4ky ⫹ (1 ⫺ ␣ ⫺  )cos L ⫻
␣ ⫹  cos
(27)
The case ␣ = 0.42 and  = 0.50 is known as the Blackman window [39]. The Gaussian: cˆ (kx , ky) = e ⫺(k x⫹k y)/(2 ) 2
4.
(26)
With ␣ = 0.54, this is the Hamming window. With ␣ = 0.50, this is the Hanning window. The ‘‘three-term’’ cosine: cˆ (kx , ky) =
3.
2kx L
2
2
(28)
The prolate spheroidal wave function and a numerical approximation of the prolate spheroidal wave function custom designed by Jackson et al. The prolate spheroidal wave function [40] is the eigenfunction having the largest eigenvalue of the operations of repeated forward and inverse integral Fourier transforms intermingled with multiplication by fixed characteristic functions (bandlimiting). The prolate spheroidal wave function is difficult to compute, but the Kaiser–Bessel function (presented next) provides a good approximation.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
70
Sarty
5.
The Kaiser–Bessel function:
冋 冉 冑 冉 冊 冊册 冋 冉 冑 冉 冊 冊册
cˆ (kx , ky) = ⭈
1 I0 L
1 I0 L


1⫺
1⫺
2ky L
2kx L
2
2
(29)
where I0 is the zero-order modified Bessel function of the first kind. With criteria that include minimizing aliasing energy and computation time, Jackson et al. concluded that the Kaiser–Bessel function is an optimal convolving function for use in convolution regridding reconstruction. The inverse integral Fourier transform of the Kaiser–Bessel function is c(x, y) =
sin(兹 2L2x2 ⫺  2 ) sin(兹 2L2y2 ⫺  2 ) 兹 2L2x 2 ⫺  2 兹 2L2y2 ⫺  2
(30)
A variation on convolution regridding, introduced by Rosenfeld [41], uses a singular value decomposition (SVD) approach [41]. Rosenfeld begins with an approximation to the Nyquist sampling theorem. The sampling theorem says that a band-limited function can be specified completely by a sufficiently dense set of samples. In MRI, the Fourier transform of the image, ˆ , is band limited, since its Fourier transform is, up to a reflection orthogonal transform, the image , which is contained within a finite field of view of diameter V. Therefore, assuming that the Cartesian grid is dense enough (⌬k ⱕ 1/V),
冘冘 ⬁
ˆ (kx , ky) =
⬁
ˆ (n⌬k, m⌬k) (kx ⫺ n⌬k, ky ⫺ m⌬k)
(31)
n=⫺⬁ m=⫺⬁
where
(kx , ky) =
sin(kx /⌬k) sin(ky /⌬k) kx /⌬k ky /⌬k
(32)
Truncating Eq. (31) to N ⫻ M terms, letting the sample values → ˆ (K(tp , q)) define a PQ dimensional vector b and letting ˆ (n⌬k, m⌬k) define an N M dimensional vector x by linearly ordering the values gives Ax = b
(33)
where the matrix entries of A are given by appropriate values of the sinc convolution kernel . The vector x defining the regridded values may be solved for by finding a pseudoinverse A# of A to give
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
x = A# b
71
(34)
Rosenfeld introduces a method that he calls the Block Uniform Resampling (BURS) algorithm for computing a sparse approximation to the Moore– Penrose pseudoinverse of A. The image is reconstructed by computing the discrete inverse Fourier transform of x; there is no need to divide by c The connection between Rosenfeld’s regridding method and the convolution regridding method may be best understood by expressing the convolution regridding procedure, up to just before the discrete Fourier transform, in matrix form. To simplify the presentation, let the vector entries of b and their corresponding k-space coordinates be enumerated by r or s and the vector entries of x and their corresponding k-space coordinates be enumerated by ᐍ. That is, r, s 僆 {1, . . . , PQ} and ᐍ 僆 { 1, . . . , N M }. Let the entries of the matrix B be defined by →
→
Bᐍr = cˆ (兩kr ⫺ kᐍ 兩)
(35)
and let the entries of the matrix C be defined by →
→
Crs = cˆ (兩kr ⫺ ks 兩)
(36)
Let the entries of the diagonal matrix D be defined by →
Drr = W(kr)
(37)
Then the result of convolution regridding as given by Eq. (22) is Cx = BDb
(38)
so that x = C⫺1BDb
(39)
Therefore, by setting A# = C⫺1BD
(40)
it is seen that Rosenfeld’s SVD method is equivalent to convolution regridding, provided the weights found from a solution of D in Eq. (40) are used and c = 1 is used in Eq. (25). It is interesting that the resulting weights may take on negative values, so that direct interpretation of the reconstruction as a strict Riemann sum approximation of Eq. (13) is lost. 5
REGRIDDING AND DIRECT RECONSTRUCTION COMPARED
The goal is now to show that convolution regridding reconstruction provides an approximation to the Riemann sum direct reconstruction of Eq. (18).
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
72
Sarty
Writing out IDFT(Dcw)(x, y) from Eq. (25) explicitly and changing the order of summation results in
冘冘
IDFT(Dcw)(x, y) = ⫻
冋冘冘 n
p
S(tp , q)W(tp , q)
q
册
cˆ (n⌬kx ⫺ Kx (p, q), m⌬ky ⫺ Ky (p, q))e 2 i(xn⌬kx⫹ym⌬ky)
m
(41)
The term within the square brackets of Eq. (41) is the convolution of the exponential function sampled on a Cartesian grid with cˆ, evaluated at the data sample grid points tq and p . Denoting that term by E( p, q), it explicitly is E(p, q) =
冕 冕 冘冘 ⬁
⬁
⫺⬁
⫺⬁
n
␦ (kx ⫺ n⌬kx , ky ⫺ m⌬ky)e 2 i(xkx⫹yky)
m
⫻ cˆ (kx ⫺ Kx (p, q), ky ⫺ Ky (p, q)) dkx dky
(42)
and it is claimed that E(p, q) ⬵ e 2 i(xKx ( p,q)⫹yKy (p,q)) c(x, y)
(43)
Substituting the right-hand side of Eq. (43) into Eq. (25) in place of E( p, q)/ c(x, y) gives Eq. (18), so it is seen that gridding reconstruction provides an approximation to the direct reconstruction as long as the approximation of Eq. (43) makes sense. To see what the nature of the approximation claimed in Eq. (43) is, recall that the discrete Fourier transform is used as an approximation for the integral Fourier transformation (see Appendix). Therefore
冕冕 ⬁
E(p, q) ⬵
⬁
⫺⬁
cˆ (kx ⫺ Kx (p, q), ky ⫺ Ky (p, q))e 2 i(xkx⫹yky) dkx dky
⫺⬁
= c(x, y)e 2 i(xKx ( p,q)⫹yKy ( p,q))
(44)
With this connection between gridding and a Riemann sum approximation of Eq. (13), the importance of the data weighting function becomes clear. Rewriting Eq. (16) as
冕冕 F
B (x, y) =
I
tF
ˆ (x (t,  ), y (t,  ))
tI
⫻ e 2 i[xx (t, )⫹yy(t, )] W(t,  ) dt d
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(45)
Convolution Regridding Reconstruction
73
it can be understood that the use of other functions for W, like Voronoi areas for example, also leads to Riemann sum approximations for the integral transform of Eq. (13). The intuition that the sampled data need to be compensated for their density distribution is supported by the need to provide a good approximation to the volume under the surface of ˆ in a Riemann sum approximation of a band-limited inverse integral Fourier transform. The assertion that gridding provides a good approximation to direct reconstruction and hence to a Riemann sum approximation of a band-limited integral Fourier transform rests on the heuristic approximation in Eq. (44). A more direct way to appreciate the nature of the approximation given by gridding is to look at the point spread functions. The point spread function for direct reconstruction is obtained by inserting the definition of the integral Fourier transform Eq. (60) explicitly into Eq. (18) and interchanging the double integrals with the double sums to obtain P = * J , where
J (a, b) =
冘冘 Q
P
q=1
p=1
e 2 i[ax (tp ,q)⫹by (tp ,q)] 兩J (yp , q)兩 →
(46)
or, if direct reconstruction is accomplished with a more general weighting factor, P = * W , where
W (a, b) =
冘冘 Q
P
q=1
p=1
e 2 i[ax (tp ,q)⫹by (tp ,q)]W(tp , q)
(47)
The point spread function associated with direct reconstruction is indepen→ → → → → dent of location p 0 since P␦ p 0 (p) = ␦ p 0 * W (p) = W (p ⫺ p0), which is W → centered at p0 . The fact that P can be expressed as an integral transform also proves that the operator P is linear. The operator P is the composition of two operators, the Fourier transform and the operator that represents sampling followed by reconstruction. Therefore direct reconstruction is a linear transformation of the sampled data to an image. In fact, direct reconstruction is the most general linear reconstruction method with a shift-invariant point spread function. →
→
Theorem 1 (Maeda et al. [23]) Let a reconstruction algorithm for an image from a fixed set of samples of its integral Fourier transform, → { ˆ (k(tp , q))}, be linear with a shift invariant point spread function. Then that reconstruction algorithm is equivalent to the direct reconstruction method if the functions e ⫺2 iq ⭈ k (tp ,q) are linearly independent for 1 ⱕ p ⱕ P, 1 ⱕ q ⱕ Q. → →
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
74
Sarty
Linear reconstruction implies that
Proof.
冘冘 P
→
P (p) =
Q
p=1
→
→
gp,q (p)ˆ (k(tp , q))
(48)
q=1
for some coefficient function gp,q . Substituting the definition of the integral Fourier transform into Eq. (48) gives
冕冕 ⬁
→
P (p) =
⬁
⫺⬁
→ → → → (q) (p; q) dq
(49)
⫺⬁
where the point spread function is
冘冘 P
→
→
(p; q) =
p=1
Q
→ →
→ e⫺2 i q ⭈ k (tp ,q)gp,q (p)
(50)
q=1
→ → → → Since, by hypothesis, is translation invariant, (p; q) = (p ⫺ q) and so satisfies the differential equation
⭸ ⭸ → = ⫺ → ⭸p ⭸q
(51)
Applying Eq. (51) to Eq. (50) gives
冘冘 冘冘 P
Q
p=1
q=1
dgp,q → ⫺2 i q ⭈ k (tp ,q) (p)e → dp
=
p
→ →
→
→ →
2ik(tp , q)gp,q (p)e⫺2 i q ⭈ k (tp , q) →
(52)
q
→ (Note that differentiation with respect to p = (x, y) is shorthand for a column vector with one component having differentiation with respect to x and the other component having differentiation with respect to y.) Since e⫺2 i q ⭈ k (tp , q) are assumed to be linearly independent, it follows that → →
dgp,q → → → (p) = 2ik(tp , q)gp,q (p) → dp
(53)
which may be integrated by solving two simultaneous partial differential equations to give → →
→ gp,q(p) = Wp,q e 2 i p ⭈ k (tp , q)
where the weight Wp,q is an integration constant.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(54) Q.E.D.
Convolution Regridding Reconstruction
75
The situation for the point spread function of gridding is more complex. Begin with Eqs. (41) and (25) and write
冘冘
c(x, y)Pg (x, y) = ⫻
冋冘冘 n
p
S(tp , q)W(tp , q)
q
册
cˆ (n⌬kx ⫺ Kx (p, q), m⌬ky ⫺ Ky (p, q))e 2 i(xn⌬kx⫹ym⌬ky)
m
(55)
Substituting the definition of S from Eq. (1) into Eq. (55), interchanging the order of summation and integration, and dividing by c leaves Pg (x, y) =
冕冕 ⬁
⬁
⫺⬁
⫺⬁
(a, b)g (a, b; x, y) da db
(56)
where the point spread function is given by g (a, b; x, y) =
冘冘冘冘 p
q
m
e 2 i(xn⌬kx⫺ax (q,p)⫹ym⌬ky⫺by (q,p))
n
⫻ W(Kx (q, p), Ky (q, p))cˆ (n⌬kx ⫺ x (q, p), m⌬ky ⫺ y (q, p))/c(x, y)
(57)
The operator Pg , being an integral transform, is linear, but the corresponding point spread function is a function of image location. That is, the point spread function is different, up to any symmetries that may exist, for every point in the reconstructed image. However, the gridding point spread function approximates the direct reconstruction point spread function in the sense that they would correspond at the origin of the image, g (a, b; 0, 0) = W (a, b), if cˆ = c = 1. Also, if n⌬kx and m⌬ky were replaced by x (q, p) and y (q, p), respectively, in the exponential function, if the sum over the image pixel labels n and m were eliminated, and if the functions C and c were removed from Eq. (57), it would reduce to Eq. (47). Note that if the weights as given by Eq. (40) and c = 1 (which then is no longer related to cˆ) are used in Eq. (57), the point spread function for Rosenfeld’s SVD method can also be computed. The qualitative structure of g may be made more apparent with some simplification. Let Ᏻ be the point spread function associated with the Cartesian grid in k-space:
Ᏻ(x, y) =
冘冘 m
e 2 i(xn⌬kx⫹ym⌬k ) y
(58)
n
The function Ᏻ is the distributional Fourier transform of the sampling generalized function given by Eq. (71). The properties of Ᏻ are well known, since it is the point spread function associated with the reconstruction of Cartesian sampled MRI data. It is periodic with the period equal to the field of view V = 2A/N (in the x-direction, 2A/M in the y-direction; for simplicity
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
76
Sarty
assume that N = M). The function Ᏻ is also a Riemann sum approximation of the inverse integral Fourier transform of [⫺A,A]⫻[⫺A,A] , which is a tensor product of sinc functions as given explicitly by Eq. (14). Since the effect of the convolution with cˆ is intended to be cancelled out through a division with c, an approximation of g may be obtained by setting cˆ and c equal to one in Eq. (57) to obtain
g ⬵ ᏳW
(59)
where the over-line denotes complex conjugation. Therefore the gridding point spread function has characteristics of the point spread functions associated with both direct reconstruction and Cartesian discrete Fourier transform reconstruction. In particular, the field of view will be the minimum of the field of view, as determined independently by each of Ᏻ and g . The exact nature of the gridding point spread function is best appreciated by looking at a specific example. 6
THE GRIDDING POINT SPREAD FUNCTION FOR ROSE SAMPLING
Simulations were done with ROSE sampling patterns in the k-plane, so that the differences between the point spread function of direct reconstruction, and the point spread function of gridding reconstruction, could be appreciated. Data were generated that were sufficient to allow for reconstruction of an image having a resolution comparable to the pixel size in a 32 ⫻ 32 pixel image. The small image size was chosen for two reasons. First, the computation of the function of Eq. (57) is complex and would require many hours of computer time for the calculation of point spread functions associated with larger images. Second, the detailed structure of point spread functions associated with the larger images is much more complicated than the functions associated with smaller images. The essential behavior of the point spread function’s change with image location is therefore best understood from plots of functions associated with smaller images. A total of 4096 data samples, spaced at equal ⌬t, on a single ROSE pattern, P = 1, corresponding to  = 0, A = 80 cycles/meter and = 32 in Eq. (8), were generated. The data samples were computed values of the exact Fourier transform of the Shepp and Logan [42] mathematical phantom, Fig. 2(a), which was contained within a field of view of 0.1 ⫻ 0.1 meters. These data were reconstructed in three ways as shown in Fig. 2. The first was a direct reconstruction and the other two reconstructions were convolution regridding reconstructions accomplished by regridding onto 32 ⫻ 32 and 64 ⫻ 64 Cartesian grids, respectively, in k-space, with reconstructed values outside the intended image region of 32 ⫻ 32 pixels being discarded
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
FIGURE 2 The mathematical phantom (a) and reconstructions from ROSE sampled data. The phantom is shown at a pixel resolution of 256 ⫻ 256, and the reconstructions were computed on an image matrix of 32 ⫻ 32 pixels as described in the main text. Image (b) shows a direct reconstruction, image (c) shows a convolution regridding reconstruction from data regridded onto a 32 ⫻ 32 k-space grid, and image (d) shows a convolution regridding reconstruction from data regridded onto a 64 ⫻ 64 kspace grid. The coarse pixel grid makes the differences between the reconstructions readily visible. The differences between the direct and gridding reconstructions are primarily due to the slightly different reconstruction grids. Within the phantom, the differences between the gridding reconstructions is small, but large differences in the background in the corners of the reconstructions are apparent. The reconstruction from the regridded 32 ⫻ 32 data is particularly bad in the corners where the gridding point spread function no longer provides a good approximation to the direct point spread function. 77
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
78 Sarty
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
79
FIGURE 3 Point spread functions plotted over a range of a = a0 ⫾ 0.2 meters, where a0 is the x-coordinate of the image point where the point spread function is centered. (a) The direct reconstruction point spread function for the ROSE sampling grid as described in the main text. (b) The gridding point spread function for the ROSE data → regridded onto a 32 ⫻ 32 Cartesian grid in k-space for the image point p0 = (0, 0). (c) The gridding point spread → function for the ROSE data regridded onto a 64 ⫻ 64 Cartesian grid in k-space for the image point p0 = (0, 0). (d) The gridding point spread function for the ROSE data regridded onto a 32 ⫻ 32 Cartesian grid in k-space for → the image point p0 = (0.05, 0.05). The two-times oversampled gridding point spread function matches the direct reconstruction point spread function very closely, while the nonoversampled gridding point spread function is smoother. At the scale shown here, there are no large differences in the gridding point spread function at the origin of the image and at points relatively near to the origin of the image.
80
Sarty
in the latter case. For both the direct and regridded reconstructions, the data were weighted by the determinant of the Jacobian of the transformation from natural ROSE k-plane coordinates to Cartesian coordinates. The convolution function cˆ used for both gridding reconstructions was the Kaiser–Bessel function, Eq. (29), with ␣ = 12 and L = 4. For each of the three reconstruction cases, the point spread functions were computed. For direct reconstruction, the point spread function is the same for all images points. For gridding reconstruction, the point spread → function was evaluated at p0 = (0, 0), (0.01, 0), (0.05, 0), (0.1, 0), and (0.2, → 0) along the x-axis and at p0 = (0.01, 0.01), (0.05, 0.01), (0.1, 0.1), and (0.2, → 0.2), where all dimensions are in meters and the phantom is centered at p 0 → = (0, 0). The points p0 = (0.1, 0) and (0.1, 0.1) are at the edge of the intended → field of view, while the points p 0 = (0.2, 0) and (0.2, 0.2) are outside the intended field of view. The point spread function associated with direct reconstruction, Eq. (46), is shown in Fig. 3(a) as a section through y = 0. It is nearly equal to → the gridding point spread functions for the image point p0 = (0, 0) with the function associated with the 32 ⫻ 32 Cartesian k-space grid being smoother due to the smaller number of terms in Eq. (57); see Fig. 3(b) and (c). The values of both gridding point spread functions do not change appreciably near the origin of the image. The point spread function typically appears as plotted in Fig. 3(d). Therefore, to a high degree of accuracy, properties of gridding reconstruction, such as aliasing, may be studied by consideration only of the simpler direct reconstruction point spread function. The direct reconstruction point spread function, W , is, when the weights are reasonable, a Riemann sum approximation of B , as mentioned in Sec. 3. So the resolution of the convolution regridding process is, to a good approximation, the resolution provided by B and limited by the k-space radius covered by the data samples. The aliasing properties are determined to a high degree of approximation by the structure of the direct reconstruction point spread function, W , through the distribution and spacing of the sampled points in k-space. The approximation to the direct reconstruction point spread function given by the gridding point spread function breaks down near, and beyond, the gridding field of view. The gridding field of view is determined by the sample spacing on the Cartesian grid in k-space and is independent from the field of view as determined by W . In the numerical cases investigated here, the gridding field of view is 0.1 meters for the 32 ⫻ 32 Cartesian k-space grid and 0.2 meters for the 64 ⫻ 64 Cartesian k-space grid. In contrast, the field of view as determined by W is 0.1 meters. The deterioration of the gridding point spread function near and beyond the edge of the gridding field of view is illustrated in Fig. 4. Figures 4(a) and (b) show the 64 ⫻ 64
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
81
and 32 ⫻ 32 grid point spread functions, respectively, at the edge the intended 0.1 meter field of view. The deterioration of the 32 ⫻ 32 grid point spread function is apparent. Similar deterioration of the 64 ⫻ 64 point spread function occurs at the edge of the 0.2 meter gridding field of view as shown in Fig. 4(c). The 32 ⫻ 32 grid point spread function no longer resembles a decent point spread function 0.2 meters from the origin; see Fig. 4(d). Highquality reconstruction by the convolution regridding method therefore requires that the k-space Cartesian grid be twice as dense as the image grid.
7
CONCLUSION
The trick to reconstructing a high-quality image from k-space samples is to use those samples to approximate the multidimensional integral inverse Fourier transform of a continuously defined dataset. Weighting the samples appropriately is essential for obtaining the best possible Riemann sum approximation of the integral inverse Fourier transform. When direct reconstruction is used, the point spread function is the distributional Fourier transform of the weighted k-space sample pattern, ᏹ, which in turn is a Riemann sum approximation of the inverse integral Fourier transform of the characteristic function of the region Ꮽ that contains the k-space sample points. These basic mathematical constraints are not changed when a regridding method is used, whether it be the popular convolution regridding or a singular value decomposition method. In the case of convolution regridding, the introduced convolution function reduces the computational complexity of the interpolation while also reducing interpolation error. In the case of the singular value decomposition regridding method, the computational complexity of the interpolation to the Cartesian grid is reduced by blocking the pseudoinverse matrix computation, and the interpolation error is apparently compensated for in the weighting function. Both regridding methods introduce an aspect of the distributional inverse Fourier transform of the Cartesian grid, Ᏻ, into the point spread function of the image reconstruction on top of the direct reconstruction point spread function, W . In order to remove the undesired effects of Ᏻ , the Cartesian grid in k-space must be of twice the density that would be required for data acquired directly on a Cartesian grid. In that case, the resulting gridding point spread function is approximated to a high degree of accuracy by the translation invariant point spread function W of direct reconstruction. The aliasing and resolution properties of a gridding reconstruction from MRI data sampled on a non-Cartesian grid in k-space can therefore be predicted from the structure of W .
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
82 Sarty
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction FIGURE 4 Point spread functions plotted over a range of a = a0 ⫾ 0.2 meters, where a0 is the x-coordinate of the image point where the point spread function is centered. (a) The gridding point spread function for the ROSE → data regridded onto a 64 ⫻ 64 Cartesian grid in k-space for the image point p0 = (0.1, 0.1). (b) The gridding point → spread function for the ROSE data regridded onto a 32 ⫻ 32 Cartesian grid in k-space for the image point p0 = (0.1, 0.1). (c) The gridding point spread function for the ROSE data regridded onto a 64 ⫻ 64 Cartesian grid in → k-space for the image point p0 = (0.2, 0.2). (d) The gridding point spread function for the ROSE data regridded → onto a 32 ⫻ 32 Cartesian grid in k-space for the image point p0 = (0.2, 0.2). 83
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
84
Sarty
ACKNOWLEDGMENTS G. E. Sarty is supported by salary and operating grants from the Medical Research Council of Canada and from the Health Services Utilization and Research Commission of Saskatchewan. APPENDIX:
MATHEMATICAL CONVENTIONS USED IN THIS CHAPTER
The two-dimensional integral Fourier transform of a function is defined as
冕冕 ⬁
→
ˆ (k) =
⬁
⫺⬁
→ →
→ → ⫺2i k ⭈ p (p)e dp
(60)
⫺⬁
→
where k = (kx , ky) is a point in the k-plane, the ⭈ denotes the usual vector dot product on the plane 2, and the function is such that the integral converges. With this definition, the units associated with kx and ky are cycles/ meter if the units associated with x and y are meters. Also there is no need for a constant factor to make the transform unitary on the vector space of square integrable functions, so that the inverse integral Fourier transform is given by
冕冕 ⬁
→
(p) =
⬁
⫺⬁
→
→ →
→
ˆ (k)e 2 i k ⭈ p dk
(61)
⫺⬁
To avoid integral convergence problems, it is convenient to assume that 僆 (2), the space of Schwartz functions [12]. Schwartz functions are defined below. Integral convolution of two Schwartz functions f and g is defined by
冕冕 ⬁
→ f * g(p) =
⫺⬁
⬁ → → → → f(q)g(p ⫺ q) dp
(62)
⫺⬁
The two-dimensional discrete Fourier transform (DFT) of a set of values D on a finite Cartesian grid on the plane is defined by ˜ x , ky) DFT(D)(kx , ky) = D(k
冘 冘
(M/2)⫺1 (N/2)⫺1
=
D(n⌬x, m⌬y)e⫺2 i(kxn⌬x⫹kym⌬y)
(63)
m=⫺M/2 n=⫺N/2
where ⌬x and ⌬y denote the spacing between the grid points. Equation (63) gives a Riemann sum approximation, to within an unimportant constant factor ⌬x ⌬y, of Eq. (60) when D(n⌬x, m⌬y) = (n⌬x, m⌬y). The values of D ˜ on the Cartesian grid: can be recovered exactly from the values of D
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Convolution Regridding Reconstruction
再
(kx , ky) = (n⌬kx , m⌬ky)兩 ⫺
85
N N M M ⱕ n ⱕ ⫺ 1, ⫺ ⱕ m ⱕ ⫺ 1 2 2 2 2
冎
(64)
in k-space, where ⌬kx = 1/⌬x and ⌬ky = 1/⌬y via the inverse discrete Fourier transform (IDFT) given by ˜ IDFT(D)(x, y) = D(x, y)
冘 冘
(M/2)⫺1 (N/2)⫺1
=
2 i(xn⌬kx⫹ym⌬ky) ˜ D(n⌬k x , m⌬ky)e
(65)
m=⫺M/2 n=⫺N/2
when (x, y) are on the originally given Cartesian grid
再
(x, y) = (n⌬x, m⌬y)兩 ⫺
N N M M ⱕ n ⱕ ⫺ 1, ⫺ ⱕ m ⱕ ⫺ 1 2 2 2 2
冎
(66)
With these restrictions, the IDFT provides a Riemann sum approximation, again to within an unimportant constant factor, of the inverse integral Fourier transform of Eq. (61). The FFT provides a computationally efficient method for computing Eq. (63) on the grid of Set (64) and for computing Eq. (65) on the grid of Set (66). The characteristic function of a set S is the function whose value is 1 on S and 0 everywhere else; it will be denoted by S . The characteristic function notation is more versatile than the ‘‘boxcar function’’ notation used in some engineering literature. Distributions and associated generalized functions are required to make some of the calculations presented in this chapter mathematically rigorous. Here, a brief description of Schwartz functions and tempered distributions is given to make the presentation complete. The reader not familiar with distributional theory may interpret the following as the mathematical justification for the calculations that involve generalized functions. In particular, Schwartz functions can be convolved with generalized functions using the formalism of Eq. (62). More complete information about distributions can be found in standard textbooks on functional analysis [43–46]. As mentioned above, it is convenient to assume that the image to be reconstructed, , belongs to the set of Schwartz functions on the plane, (2). The set of Schwartz functions on the plane is defined as the set of all infinitely differentiable functions that satisfy →␥ ␣ → 储 储␥ , ␣ = sup 兩p ⭸ (p)兩 > 兩S⬘ m 兩 ⫹ 兩S⬘ 0 兩 ), a large negative value is produced. In order to improve the estimation robustness against noisy k-space data, the Ci (km , j) values are averaged over all the overlapping points associated with the view km . The average over p overlap points is given by ¯ m , j) = 1 C(k p
冘 p
Ci (km , j)
(24)
i=1
Therefore the optimization problem for angle estimation at the k thm view is formulated by seeking a j that maximizes the average similarity at all overlap points, between the k thm view and the already corrected views. This is given by ¯ m , j)] est (km) = max[C(k
(25)
j
The reliability of each estimate is expressed by a membership function [31], [est (km)], which maps the maximum average similarity at the k thm view to the real interval [0, 1]. That is,
[est (km)] =
再
q 0 1
if if if
0ⱕqⱕ1 q1
(26)
where q=
¯ m , est) ⫹ 1.0 C(k 2.0
For values 兩[est (km)]兩 ⱕ 1.0, the estimated rotation angles are included in a fuzzy set [31] with the membership function 兩[est (km)]兩. This membership function is consequently used not only for interpolation of less reliable estimates but also for the artifact correction algorithm.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
140
3.2
Weerasinghe et al.
Search Space
In order to minimize the search space of test angles, the search is limited to a specified interval near the initial guesses. These initial guesses are derived using an approach based on measuring the projection width of X-directional inverse Fourier transforms (IFT) of each view. A similar method was first proposed by Wood et al. [16]. This method was given little attention in the past mainly due to its limited accuracy and dependence on the object shape. However, in this application, it is used only as a starting point for angle estimations. The algorithm for deriving initial guesses is shown in the block diagram of Fig. 2, and is described as follows. First, each view of the data is subjected separately to an inverse Fourier transform (IFT) to establish the resulting X-directional IFT width associated with each view. The width of this X-directional IFT is shown to provide the projection width of the imaged object onto the X-axis [44]. Therefore two open snake contours [32] are used on either side of the X-directional IFT to extract this projection width. As shown in Fig. 3, if the object m(x, y, km) is contained within xmin and xmax at the acquisition of the kmth view, the Y-directional projection width wl is given by

w_l = x_{max} - x_{min} + 1    (27)
The width of the X-directional IFT equates to the Y-directional projection width wl of m(x, y, km) [33]. As shown in Fig. 3, a particular value of wl corresponds to a unique orientation of the scanned object provided that the object is asymmetric. For a symmetric object, it is possible to have multiple orientations matching the same wl value. The number of such possible orientations depends on the number of possible axes of symmetry. The outer boundary of the object is extracted by using a closed snake contour. A detailed discussion on ROI boundary extraction from a motion affected MR image can be found in Ref. 33. The extracted boundary is then rotated by known stepwise angles in order to establish the projected widths it casts on the X-axis. The results of this stage are used to create a lookup table indicating the rotation angle associated with each projection width. Hence the X-directional IFT width of each view can be converted to an associated rotation angle. If multiple angles of rotation cast the same projection width onto the X-axis (i.e., due to symmetric objects), all these possible angles are clustered in the search space.

3.3 Lookup Table
FIGURE 2 Block diagram of the algorithm to select the search space for each view. (From Ref. 44.)

FIGURE 3 Illustration of Y-directional projection width (i.e., X-directional IFT width wl). (From Ref. 44.)

A lookup table is created for the Y-directional projection width versus rotation angle by rotating the estimated object boundary within the interval θmin to θmax in steps of Δθ. The probable minimum and maximum rotation angles for a known maximum angular span are denoted by θmin and θmax. The assigned values for θmin and θmax may vary according to the part of the anatomy that is being imaged and its freedom of in-plane rotation. In this work, θmin = −90° and θmax = 90° have been assigned to mimic head rotations. The value of Δθ determines the resolution of the estimated angle, but small Δθ can lead to long estimation times. For the simulation experiments described in this chapter, Δθ is set to 1°. When this lookup table is created, the initial guesses for the search space of rotation angles associated with each view can be established. Since the lookup table contains rotation angles associated with each Y-directional projection width of m(x, y, km), it is possible to derive the best
matching angles for each projection width from the X-directional IFT. It should be noted that, in some cases where the scanned object displays geometric symmetries, there could be more than one possible initial guess. In this case, multiple search spaces are formed for each associated view. The angular span of the search space can be decided upon by the total angular span of the object and time constraints involved in the estimations. Typically, 10% of the total angular span can be adequate for accurate estimations. In the event of multiple initial guesses, some of the guesses may be contained within an already formed search space. In this case, the existing search space is extended, rather than a new search space being introduced, avoiding the possibility of overlapping search spaces. The initial guesses are usually subject to errors such as:

1. Errors in object boundary extraction
2. Interpolation errors
3. Errors in the extraction of X-directional IFT projection width
4. Subpixel changes in projection width
Therefore the initial guesses only provide a rough estimate of where the correct rotation angle may be found. As an added precaution, an extra search space is formed at the estimated angle of the previous view. The search space is dynamically allocated. In the event of a perfectly circular object, this method fails to identify any specific initial guesses. In such circumstances, the search space for test angles has to be expanded to the total angular span of the object, at the expense of estimation time.
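As an illustration of the lookup-table construction of Secs. 3.2 and 3.3, the sketch below rotates an extracted ROI boundary (given here simply as an array of (x, y) points) through the candidate angles and records the projection width of Eq. (27); the function and variable names are illustrative only.

```python
import numpy as np

def projection_width_lookup(boundary_xy, theta_min=-90.0, theta_max=90.0, dtheta=1.0):
    """Build a lookup table of projection width (Eq. 27) vs. rotation angle.

    boundary_xy : (P, 2) array of ROI boundary coordinates (pixels).
    Returns a dict mapping integer width wl -> list of candidate angles (deg).
    """
    angles = np.arange(theta_min, theta_max + dtheta, dtheta)
    centre = boundary_xy.mean(axis=0)
    table = {}
    for theta in angles:
        t = np.deg2rad(theta)
        R = np.array([[np.cos(t), -np.sin(t)],
                      [np.sin(t),  np.cos(t)]])
        rotated = (boundary_xy - centre) @ R.T + centre
        x = rotated[:, 0]
        wl = int(round(x.max() - x.min())) + 1   # Eq. (27): wl = xmax - xmin + 1
        table.setdefault(wl, []).append(float(theta))
    return table

# A symmetric object maps one width to several candidate angles, which is
# why multiple search spaces may have to be formed for a single view.
```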
4 DATA CORRECTION
One of the popular data correction methods in MRI is the use of bilinear interpolation in the spatial domain followed by superposition onto the corrected k-space [14]. In this method, the MRI spatial frequency components are assumed to be the superposition of N different images corresponding to N different phase encoding steps. In each of the N images, only one line, corresponding to the phase encoding step, is nonzero and the other lines are zero. It is assumed that the zero lines have the motion parameters of the nonzero line. Hence the planar rigid motion parameters are fixed for all lines of each of the N different images. Alternatively, using the superposition property, the inverse 2-DFT of the MR signal can be obtained by adding the inverse 2-DFT of the previously discussed N images. Using the k-space data, the reconstruction algorithm is described as follows:

1. Correct the phase error of the k-space data using the translational motion parameters, which are assumed to be known or have been estimated using postprocessing techniques [34].
2. Divide the k-space data into N different images so that each image contains one nonzero line, and the other lines are filled with zeros.
3. Calculate the inverse 2-DFT of the N different images.
4. Using the bilinear interpolation method, back-rotate each of the N different images by its estimated rotation angle corresponding to the phase encoding step of the nonzero line. Rotation angle estimation can be performed using the method described in Sec. 3.
5. Calculate the 2-DFT of the N different images.
6. To obtain the corrected MR signal, the 2-DFT values of Step 4 can be superimposed on a new k-space.
7. Calculate the inverse 2-DFT of the new k-space.
In the early versions of BSA [14], the 2-DFT of the sum of the images had been applied instead of Steps 5 and 6. However, this caused the leakage spatial frequency components, generated due to interpolation, to overlap with each other, thereby degrading the quality of the resultant MR image. Although the introduction of Steps 5 and 6 eliminates this problem, it does not address the general problem of overlap data and data void regions. Since the above algorithm interpolates data on one line of the k-space using only the rotated versions of the same line of the k-space data, it gives accurate results only around the region where the lines intersect, restricting the algorithm to be effective only for small angle rotations. Steps 1 through 5 of BSA are involved in the resampling of the k-space nonuniformly sampled data. However, the interpolation is performed in the spatial domain. Therefore the interpolation errors resulting from the high variance of k-space data are avoided. To increase the speed of the algorithm, in Step 2, instead of computing the inverse 2-DFT, the inverse 1-DFT of the nonzero line along the kx direction can be computed prior to calculating the inverse 1-DFT of all the lines along the ky direction. As there is only one nonzero value at each line along the ky direction, the inverse 1-DFT can be obtained very rapidly. Since Step 3 of the algorithm must be repeated N times, the selected interpolation method has to be fast as well as accurate. The bilinear interpolation technique is ideal for this purpose due to its speed and the low variance of the intensity distribution of the images in the spatial domain. As described in Sec. 3.1, data overlapping can occur during interpolation between any number of different views rotated at different angles, hence producing significant regions of data overlap. Only N × N data samples are acquired from all the N views, and there are N × N grid points required to be filled in the uncorrupted k-space. Therefore if some of the grid points are overrepresented due to data overlap, it is logical to expect some grid points to be underrepresented due to data voids.
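A condensed NumPy/SciPy sketch of the bilinear superposition idea in Steps 2–6 above is given below; it treats each phase-encoding line as a separate image, back-rotates it in the spatial domain, and superimposes the re-transformed data on a new k-space. The names are illustrative, the rotation sign convention may need to be reversed depending on how the angles are defined, and the overlap/void handling discussed next is omitted.

```python
import numpy as np
from scipy.ndimage import rotate

def bsa_correct(kspace, angles_deg):
    """Back-rotate each phase-encoding view of an N x N complex k-space by
    its estimated rotation angle and superimpose the results."""
    corrected = np.zeros_like(kspace)
    for ky, theta in enumerate(angles_deg):
        single = np.zeros_like(kspace)
        single[ky, :] = kspace[ky, :]            # one nonzero line per image
        img = np.fft.ifft2(single)               # inverse 2-DFT of that image
        # bilinear (order=1) back-rotation of the real and imaginary parts
        img_rot = (rotate(img.real, -theta, reshape=False, order=1)
                   + 1j * rotate(img.imag, -theta, reshape=False, order=1))
        corrected += np.fft.fft2(img_rot)        # superimpose on the new k-space
    return np.fft.ifft2(corrected)               # corrected image
```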
4.1 Management of Overlap Data
FIGURE 4 (a) Deviation between the acquired view data and the required grid points; (b) graphical representation of the distance d. (From Ref. 38.)

At a data overlap, the interpolation errors associated with each data sample can differ considerably. Therefore, in practice, the overlapping samples at a particular grid point can be ranked according to their interpolation accuracy. Such a system of ranking enables the assigning of weights to each sample, prior to computing the average spatial frequency value at a particular grid point in the corrected k-space. Weighted averaging of redundant data can also lead to improved performance under noisy conditions [35]. In the bilinear superposition algorithm [14], the interpolation is performed in the spatial domain, which reduces the interpolation errors caused by high variance in the spatial frequency data. However, the fact that each of the interpolations is performed using only a single view of data at a time cannot be discounted. Effectively, the attempt is to interpolate values onto a 2-D grid using 1-D signal data spanning an inclined line on the 2-D k-space. The adjacent grid points to this inclined line are filled using interpolated values as shown in Fig. 4(a). Obviously, the greater the deviation d of the grid point from the inclined line, the less accurate the interpolated value will be. Notice that d is not the offset distance between the nearest data sample and the grid point, which is represented by g in Fig. 4(b). Since acquired
data is available along an inclined line, the interpolation error associated along the direction of this line is significantly lower than the interpolation error associated with the direction perpendicular to it. Along this perpendicular direction, there are no data available. Therefore the ranking is based on the distance d between the prospective grid point and the nearest view, which is given by

d = | k_{yo} \cos \theta_r(k_m) + k_{xo} \sin \theta_r(k_m) - k_m |    (28)

where (kxo, kyo) are the coordinates of the grid point, θr(km) is the rotation angle, and km is the phase encoded view number. Therefore the weight wr related to the rth sample Sr in an overlap region can be assigned as

w_r = \frac{1}{d_r}    (29)

The corrected spatial frequency value S(kx, ky) at the (kx, ky) grid point in the k-space can be computed by

S(k_x, k_y) = \frac{\sum_{r=1}^{m} w_r S_r}{\sum_{r=1}^{m} w_r}    (30)
where m is the total number of competing samples. A minor modification is incorporated in order to include the reliability of the estimated rotation angles μ[θest(km)]. At the reconstruction stage, weights at the data overlap points are decided not only on the distance d between the prospective grid point in the k-space and the nearest view but also on the value of μ[θest(km)]. Since the value of d may vary within the interval [0, 1], with the highest weight at 0 and the lowest at 1, the value of d is mapped to a membership function as shown in Eq. (31):

\mu(d) = \frac{1}{1 + a d^2}    (31)

where the constant a is set arbitrarily to 16 for all cases involved in this study. Since both μ(d) and μ[θest(km)] represent fuzzy sets [36], the weights (w) are decided based upon the fuzzy intersection of these sets:

w = \mu(d) \wedge \mu[\theta_{est}(k_m)]    (32)

where ∧ represents the fuzzy intersection [31]. In this work, the minimum operation is used for fuzzy intersection. However, other t-norms such as
bounded difference, Einstein product, algebraic product, or Hamacher product [36] can equally well be used. The decision to use the minimum is based on the fact that it gives the largest w value from the possible t-norms mentioned above. The algorithm to manage data overlap can be formulated as follows:

1. Divide the acquired views into p different groups, each containing a set of views subjected to rotations at the same angle. Grouping such views together not only reduces interpolation errors but also decreases the reconstruction time. If there are p such groups (i.e., p < N), only p transformations to the spatial domain are required, instead of N, to perform interpolations. If the motion is continuous, then p = N.
2. Insert each group into a zeroed N × N complex matrix.
3. Calculate the inverse 2-DFT of the p different matrices.
4. Using bilinear interpolation, back-rotate each of the p different images by the estimated rotation angle.
5. Calculate the 2-DFT of the p different images.
6. Compute the d values corresponding to each sample.
7. If a particular grid point does not have a view within a d value of unity, such a grid point is masked and set to zero. This step is included to eliminate any leakage spatial frequency components, generated due to interpolation in the spatial domain. The resulting spatial frequency values are at most one unit away from the acquired view which spans an inclined line.
8. Calculate the weights (w) associated with each corrected k-space value, incorporating the membership values μ(d) and μ[θest(km)].
9. Compute the weighted average values at each k-space grid point.
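For a single grid point, the distance, membership, and fuzzy-intersection weighting of Eqs. (28)–(32) reduce to a few lines. The sketch below assumes the competing interpolated samples, their source-view numbers and angles, and the membership values of the angle estimates are given, with a = 16 and the minimum t-norm as in the text; all names are illustrative.

```python
import numpy as np

def weighted_overlap_value(kx0, ky0, samples, km_views, theta_r_deg, mu_est, a=16.0):
    """Fuzzy-weighted average of the overlapping samples at grid point (kx0, ky0)."""
    t = np.deg2rad(np.asarray(theta_r_deg, dtype=float))
    # Eq. (28): distance between the grid point and the nearest (rotated) view line
    d = np.abs(ky0 * np.cos(t) + kx0 * np.sin(t) - np.asarray(km_views, dtype=float))
    mu_d = 1.0 / (1.0 + a * d ** 2)                          # Eq. (31)
    w = np.minimum(mu_d, np.asarray(mu_est, dtype=float))    # Eq. (32), min as t-norm
    samples = np.asarray(samples)
    return np.sum(w * samples) / np.sum(w)                   # Eq. (30) with Eq. (32) weights

# Two competing samples coming from views 4 and 6:
value = weighted_overlap_value(3.0, 5.0, [1 + 2j, 1.2 + 1.8j], [4, 6], [10.0, -5.0], [0.9, 0.6])
```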
4.2 Iterative Estimation of Missing Data
Although reconstructed image quality can be partly enhanced by weighted averaging of redundant data, the data void regions in the corrected k-space may still degrade the final image quality, by reduced intensity at particular regions and blurred edges caused by the nonavailability of particular spatial frequency components. An analytical method using the Least Squares Error (LSE) technique to estimate low spatial frequency values was proposed by Yan and Gore [37]. This method is limited to estimating a very small number of missing data, where the number of unknowns is much smaller than the number of equations in the LSE problem. However, when the number of required estimations increases, the problem becomes ill posed, and the estimation is observed to be noise sensitive and computationally unstable.
Since the rotations at large angles produce large data void regions, the estimation technique is required to be capable of estimating in excess of 10,000 samples for a 256 × 256 image. Therefore the analytical method [37] is ineffective due to the ill-posedness and prohibitive computational complexity. The problem of estimating missing spatial frequency values is often encountered in computed tomography (CT) and in confocal scanning microscopy. This problem is named the missing cone problem, and several methods have been proposed to suppress the artifacts in the reconstructed image. However, there is some problem-specific information regarding the missing cone problem, which is not valid for the frequency estimation problem described in this section. The noniterative solutions to the missing cone problem use specific information such as:

1. Missing spatial frequencies being restricted to a cone-shaped region
2. The complete sinograms of acquired data possessing bow-tie-shaped spectral support
The proposed iterative techniques mostly use projections onto convex sets (POCS) as the basis of the algorithm [26]. Considering the arbitrary shape of the data-missing regions and the available a priori information on the imaged object, the method of POCS was found to be an effective and efficient technique to estimate a large number of missing spatial frequency values. However, the quality of the POCS solution largely depends on a priori constraints used in forming the convex sets. It can be shown that the acquired k-space data are often subjected to noise, measurement errors, and interpolation errors that cause the k-space constraints to form an ill-defined convex set, inconsistent with the spatial constraints. Inconsistent constraints lead to the divergence of POCS from the desired solution, within a finite number of iterations. In this chapter, a fuzzy model is proposed for representing the reliability of available k-space values. This model is used for adaptive updating of the convex sets during the POCS iterations. The updates are based on the highly reliable k-space data and the consistency of the convex set with the spatial constraints. Therefore the proposed fuzzy model avoids divergence of the POCS solution from the required result and also ensures final image quality improvement.

4.3 Reliability of Convex Sets
FIGURE 5 Projection between two nonintersecting convex sets results in a minimum mean square limit cycle. Point f1 is the point in C1 closest to C2, and point f2 is the point in C2 closest to C1.

Decades of research have established that the quality of the POCS solution largely depends on a priori knowledge of the original form of the image and the accuracy of the signal degradation model. A major problem is utilizing
all available knowledge optimally, since not all a priori knowledge is equally reliable. In POCS, unreliable or inaccurate a priori information leads to inconsistent convex sets where the intersection is essentially empty. Inconsistent convex sets lead to slow convergence, and divergence from the desired solution, causing problems with the termination of iterations. If there are only two sets, the convergence is to the cycle between the closest points of the sets in the mean square sense. This is illustrated in Fig. 5. If there are more than two sets, POCS converge to greedy limit cycles that are dependent on the ordering of the projections [27]. Therefore the final solution does not display all the expected properties. This is geometrically illustrated in Fig. 6.
FIGURE 6 Projection between three or more nonintersecting convex sets results in greedy limit cycles. These cycles correspond to all the possible projection orders (e.g., 123, 213, 321, etc.).
It was observed in previous work [38] that the reconstructed image diverges from the desired image within a finite number of iterations, under certain circumstances. Therefore a computationally expensive regulatory error metric was used to determine the point of termination of iterations. The cause of such divergence is found to be the inconsistent k-space constraints imposed on the POCS algorithm. This inconsistency arises due to the noise, measurement errors, and interpolation errors embedded in the available k-space data that have been used for imposing k-space constraints. Spatial constraints can be affected by errors in estimation of the ROI and maximum image energy. Rotation angle estimation errors are also demonstrated to cause k-space data errors. Each of the available spatial frequency values can lead to a separate convex set of the form of a linear variety in the Hilbert space. A linear variety is defined as a translation of a subspace by a fixed vector [39]. This fixed vector is computed from the known spatial frequency value and defines the location of a particular linear variety in the Hilbert space. If the reliability of each computed k-space value is known, each resultant linear variety can be fuzzified with a membership function [40] determined by its reliability. Depending on its membership, each vector can be given freedom to move, so that a set of fuzzy linear varieties can be formed that do not contradict the a priori spatial constraints. Therefore proper fuzzification of linear varieties can transform inconsistent sets of constraints into a group of convex sets with a nonempty intersection. Such a result can be achieved by using the method of fuzzy POCS [41].

4.4 Fuzzy POCS
As shown in the previous section, POCS breaks down in the important instance where two or more convex sets do not intersect [42]. However, regularization using fuzzy constraints [41] can be applied to find valuable POCS solutions that are close enough to each of the convex constraints. The underlying concept of being close enough suggests fuzzification of the nonintersecting convex sets to fuzzy convex sets [31]. Even if two or more crisp convex sets do not intersect, α-cuts [31] of their corresponding fuzzy sets can. This is illustrated in Fig. 7. Although fuzzy POCS often do not result in a fixed-point solution, the extent of the limit cycle of convergence can be effectively reduced. There are two methods of fuzzification of crisp convex sets, using dilation of the convex sets [41]. The amount of dilation is dependent on the reliability of the constraints used to form the convex set. The most reliable convex set should be dilated the least and the least reliable convex set should be dilated the most, in order to optimize the use of reliable information.
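As a concrete illustration of fuzzification by dilation (detailed further below), the sketch generates α-cuts of a fuzzified support set with SciPy's binary dilation; the mapping from α to the number of dilation steps is an assumption made purely for illustration, with α = 1 returning the crisp set and smaller α giving a looser, more dilated cut.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def alpha_cut_by_dilation(crisp_mask, alpha, max_dilation=10):
    """Return an alpha-cut of a set fuzzified by dilating the crisp set.

    alpha = 1.0 returns the crisp set itself; smaller alpha gives a larger
    (more dilated) cut, i.e. a looser version of the original constraint.
    """
    n_iter = int(round((1.0 - alpha) * max_dilation))
    if n_iter == 0:
        return crisp_mask.copy()
    return binary_dilation(crisp_mask, iterations=n_iter)

# Example: a small square support, loosened for two alpha levels
mask = np.zeros((32, 32), dtype=bool)
mask[12:20, 12:20] = True
cut_09 = alpha_cut_by_dilation(mask, 0.9)   # close to the crisp set
cut_05 = alpha_cut_by_dilation(mask, 0.5)   # substantially dilated
```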
FIGURE 7 Three nonintersecting convex sets are fuzzified through morphological dilation. Shown contours are for two different α-cuts α1 and α2. At some combination of α-cuts, it is possible to force the contours to intersect at a point f, which is close enough to each of the three constraint sets.
If the crisp convex set is parameterized, in many cases the fuzzy convex set can be generated by fuzzification of the parameter set. If the parameter set exists on an interval, then the signal set is trivially convex. Fuzzification can be achieved simply by fuzzifying the interval [41]. Fuzzification can also be achieved by dilation of the underlying crisp set with a convex dilation kernel. If the dilation kernel is convex, then the dilation result can be interpreted as an α-cut of a fuzzy convex set [41]. The degree of membership of a signal in the fuzzy signal set is equal to that of the membership of the parameter in the fuzzified parameter set. If the crisp set of functions is not parameterized, fuzzification can be achieved through the direct morphological dilation [43] of each signal in the set. The α-cuts of the fuzzified convex set can be generated by choosing convex dilation kernels of increasing dimension. Fuzzification by dilation of convex sets assumes invariance of location of the crisp set in the Hilbert space. However, this may be disadvantageous in certain cases where one of the nonintersecting convex sets is located far away from the other sets. Use of dilation in this case will lead to an unacceptably low α-cut value, so that intersection is not possible. Therefore translation of this set towards the other sets in the Hilbert space can help maintain an acceptable α-cut value for dilation. This is illustrated in Fig. 8.

FIGURE 8 Fuzzy POCS involves dilation and translation of less reliable convex sets.

It should be noted that such translation of a convex set in the Hilbert space amounts to changing the fundamental characteristics of the a priori constraints it represents. Therefore it can only be performed adaptively, during the POCS iterations. Fuzzy POCS of another type can be applied to the case where two or more convex sets intersect at more than one point. It should be noted that such an intersection of two or more convex sets is also inherently convex. Convergence to interior points of the intersection, if they exist, can be obtained by the application of morphological erosion [43] to one or more convex sets. This is effectively the reverse of fuzzification by dilation and can be achieved simply by choosing high α-cut values of the fuzzy convex sets. Therefore convergence would be to a point within rather than on the shaded area, as shown in Fig. 8. The approach is similar to peeling away convex hulls to find the most inner of a set of points.

4.5 Fuzzy Convex Sets
Fuzzy convex sets were first introduced by Zadeh in his pioneering work, presented in 1965 [31]. A fuzzy set Cf on the universal set X is defined by the membership function μc, which maps X to the real interval [0, 1]. The fuzzy set Cf can be written as

C_f = \{ x / \mu_c(x) \mid x \in X \}    (33)

If Cf^α denotes the crisp set corresponding to an α-cut of Cf,

C_f^{\alpha} = \begin{cases} \{ x \mid \mu_c(x) \ge \alpha, \ x \in X \} & \text{for } \alpha \ne 0 \\ X & \text{for } \alpha = 0 \end{cases}    (34)
The fuzzy set Cf is considered to be convex if and only if, for every 0 ≤ λ ≤ 1,

\mu_c[\lambda x_1 + (1 - \lambda) x_2] \ge \min\{ \mu_c(x_1), \mu_c(x_2) \}    (35)

Similarly, it has been shown that Cf is convex if all of its α-cuts for 0 ≤ α ≤ 1 are convex [31]. Let m(x, y) be an arbitrary image in a Hilbert space H. If the spatial frequency distribution of m(x, y) is denoted by S,

S = F\{ m(x, y) \}, \qquad m(x, y) \in H    (36)

where F{·} is the Fourier transform. Let the following constraints on the form of m be known as a priori information.

(1) Finite support constraint

C_1 = \{ m(x, y) \mid m(x, y) = 0 \ \text{for} \ (x, y) \notin \mathrm{ROI} \}    (37)
where ROI is a prespecified region of interest. In order to fuzzify this constraint, the boundary of the ROI can be dilated morphologically. The extent of dilation is decided based on the reliability of the extracted ROI boundary [33]. Another method of dilating the set C1 would be to introduce a finite maximum energy limit allowable outside the ROI, as given in the following equation:

\frac{1}{N^2} \sum_{x} \sum_{y} \| m(x, y) \|^2 \le E_{out} \qquad \forall (x, y) \notin \mathrm{ROI}    (38)

where Eout is the maximum energy allowable outside the ROI.

(2) Amplitude constraint

If the amplitude values of m(x, y) are known to be restricted within the limits Imin and Imax, the amplitude constraint is given by

C_2 = \{ m(x, y) \mid I_{min} \le m(x, y) \le I_{max} \quad \forall (x, y) \}    (39)

The set C2 can be fuzzified through direct fuzzification of the parameters Imin and Imax.

(3) Energy constraint

If the total energy within the ROI is known to have an upper limit of Emax, the energy constraint is given by
C_3 = \left\{ m(x, y) \;\Big|\; E = \sum_{x} \sum_{y} \| m(x, y) \|^2 \le E_{max}, \;\; \forall (x, y) \in \mathrm{ROI} \right\}    (40)

Similarly to C2, C3 can also be fuzzified by direct fuzzification of the parameter Emax. Sets C1, C2, and C3 are well-established convex sets [26]. Since their fuzzification is achieved via dilation, the resultant α-cuts will also be convex sets.

(4) k-Space data constraint

Let Ŝ denote the partially defined spatial frequency data (i.e., the k-space information) interpolated from the acquired data. Then the k-space constraint can be given by

C_4 = \{ m(x, y) \mid F\{ m(x, y) \} = \hat{S}(k_x, k_y), \;\; (k_x, k_y) \in R_0 \}    (41)
where R0 is the region of k-space filled by the computed spatial frequency values. It is also known that C4 is a convex set of the form of a linear variety [39].
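A bare-bones, non-fuzzy POCS iteration built from crisp versions of the sets C1, C2, and C4 defined above might look as follows; the ROI mask, amplitude limits, k-space mask R0, and data Ŝ are assumed given, the amplitude constraint is applied here to the image magnitude, and the fuzzy relaxation of the constraints described in the text is omitted.

```python
import numpy as np

def pocs_reconstruct(S_hat, R0_mask, roi_mask, I_min, I_max, n_iter=10):
    """Alternate projections onto the k-space data set C4, the finite-support
    set C1, and the amplitude set C2 (crisp versions of Eqs. 37, 39, 41)."""
    m = np.fft.ifft2(S_hat * R0_mask)                  # initial estimate
    for _ in range(n_iter):
        # C1: finite support -- zero the image outside the ROI
        m = m * roi_mask
        # C2: amplitude constraint, applied to the magnitude image
        mag = np.clip(np.abs(m), I_min, I_max)
        m = mag * np.exp(1j * np.angle(m))
        # C4: restore the measured spatial-frequency values on R0
        S = np.fft.fft2(m)
        S = np.where(R0_mask, S_hat, S)
        m = np.fft.ifft2(S)
    return m
```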
4.6 Fuzzy Data Model
It is possible to define a fuzzy set so that its membership function can be seen as a possibility distribution. Consider some imprecise information expressed as "about β" ≈ A. This can be replaced by μA(x), where this new expression is taken as the information that expresses the possibility of A being in the vicinity of β [40]. Therefore a fuzzy number A is expressed as A = (β, c), where β expresses the center of the region of possibility and c its spread. The membership function of A can be given by

\mu_A(x) = L\!\left( \frac{x - \beta}{c} \right), \qquad c > 0    (42)

where L(x) is called the reference function [40] and possesses the following properties:

L(x) = L(−x)
L(0) = 1
L(x) is a strictly decreasing function on [0, ∞).

Examples of L(x) for p > 0 are functions such that L1(x) = max(0, 1 − |x|^p), L2(x) = e^{−|x|^p}, etc. If L1(x) is considered for p = 1, the result is a triangular fuzzy number as depicted in Fig. 9(a).

FIGURE 9 (a) μc(x) of a triangular fuzzy number; (b) membership functions of a conjugate pair.

The major obstacle to the fuzzification of k-space data is the estimation of the spread (c) of the possibility function. Since each spatial frequency
value is subjected to differing measurement errors, interpolation errors, and noise, the overall effect of such errors is highly complex and difficult to model accurately as a general rule. However, it is possible to identify pairs of complex conjugate k-space points, since |S(kx, ky)| ≈ |S*(N − kx, N − ky)|, where S*(·) indicates the complex conjugate and N is the number of phase encoding (ky) or frequency encoding (kx) steps. Although the phase of these alleged complex conjugate values can be corrupted due to acquisition timing, the magnitudes should generally be matched within an error limit. If the regridded k-space provides p [...] > E_n for all n, since the POCS algorithm is converging. Therefore, ε_n exists for all n. The membership function of ε_n is given by

\mu(\varepsilon_n) = \begin{cases} 1.0 & \text{for } \varepsilon_n \ge 0.0 \\ e^{-|\varepsilon_n|^2 / 4} & \text{for } \varepsilon_n < 0.0 \end{cases}    (45)

Hence [...] = r0 μ(εn), where r0 is a "small" number.
If

\| |S_n| - |S_{n-1}| \| < e_0 |S|, \qquad \| |S_n| - |S| \| > [...], \qquad \text{and} \qquad \mu_S(|S_n|) > 0

then the original k-space values are updated to the new values at the nth iteration, and preserved as k-space constraints.
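The reference functions quoted after Eq. (42) and the convergence membership of Eq. (45) are simple to express directly; in the sketch below the centre β, spread c, and the convergence quantity passed to the second function are assumed to be supplied by the surrounding algorithm, and the names are illustrative only.

```python
import numpy as np

def mu_fuzzy_number(x, beta, c, p=1.0):
    """Membership of the fuzzy number A = (beta, c), Eq. (42), with the
    reference function L1(x) = max(0, 1 - |x|**p)."""
    return np.maximum(0.0, 1.0 - np.abs((x - beta) / c) ** p)

def mu_convergence(eps_n):
    """Membership of Eq. (45): 1 while the error keeps decreasing, and an
    exponentially decaying value when it increases."""
    eps_n = np.asarray(eps_n, dtype=float)
    return np.where(eps_n >= 0.0, 1.0, np.exp(-(np.abs(eps_n) ** 2) / 4.0))

# Triangular fuzzy number centred at beta = 2.0 with spread c = 0.5:
x = np.linspace(1.0, 3.0, 9)
print(mu_fuzzy_number(x, beta=2.0, c=0.5))
```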
5 SIMULATION RESULTS
FIGURE 10 Objects used in the experiments: (a) Shepp & Logan phantom for cases 1, 2, and 3 (from Ref. 45); (b) axial head slice for case 4; (c) sagittal head slice for case 5.

FIGURE 11 Reconstructed MR images from motion corrupted signals, using inverse FFT: (a) case 1 (MSE = 1870.7); (b) case 2 (MSE = 1989.5); (c) case 3 (MSE = 1260.7); (d) case 4 (MSE = 1147.2); (e) case 5 (MSE = 2640.1).

Simulation experiments were conducted using both synthetic phantoms and human anatomy. The synthetic phantom [45] was deliberately chosen to be symmetric in shape, in order to examine the limitations of the proposed algorithms and their viability in practice. Motion consisting of in-plane stepwise rotation as well as continuous rotation was used in the experiments. Simulated stepwise rotations were synthesized to contain worst-case scenarios such as sudden rotations at the acquisition of the view (N/2), large-angle rotations, and consecutive angle changes that do not incur any change in the X-directional IFT width (wl). The nonperiodic continuous rotations were intended to represent worst-case scenarios in practice. The objects used in the experiments are shown in Fig. 10. Figure 11 shows the reconstructed images using conventional IFFT technique [1] on the motion affected k-space data. Severe motion artifacts are visible in all cases. The mean squared error value is calculated with respect to the corresponding objects (i.e., reference images) shown in Fig. 10. It should be noted that this measure is not available in practice due to the unavailability of a reference image. However we have included this measure to quantify the performance of the proposed algorithm on the different experiments conducted in this study. The mean square error (MSE) is defined by

\mathrm{MSE} = \frac{1}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} [g(x, y) - m(x, y)]^2    (46)
where g(x, y) is the reconstructed image and m(x, y) represents the original reference image. The extracted X-directional IFT boundaries using two open contours [44] are shown in Fig. 12. Notice that for stepwise rotations (i.e., cases 1, 2, and 3), the projected width (wl) changes abruptly, whereas for continuous rotations (i.e., cases 4 and 5), the transitions are relatively smooth. It is important to realize that the purpose of extracting the edges using snakes is to establish unbroken edges on both sides of the X-directional IFT images, so that an accurate estimate for the projection width (wl) can be computed for motion parameter estimation.

FIGURE 12 Converged snake contours for X-directional IFT width extraction: (a) case 1; (b) case 2; (c) case 3; (d) case 4; (e) case 5. (From Ref. 44.)

Object boundaries (ROI) are then extracted using an entropy minimization technique in conjunction with a closed snake contour model [33].
FIGURE 13 Converged snake contours for object boundary (ROI) extraction: (a) case 1; (b) case 2; (c) case 3; (d) case 4; (e) case 5. (From Ref. 33.)

The converged closed snake contours are shown in Fig. 13. These contours are used to construct the lookup table described in Sec. 3.3. Using this lookup table, the initial search spaces were derived for the motion estimation algorithm. Objects with approximate geometric symmetries produced multiple search spaces for a particular view. This resulted in an unnecessary elongation of the estimation time. In order to expedite the estimation process, only the search space that contained the previous estimated angle was considered. This effectively reduced the estimation time by half with good results for continuous rotations. However, the algorithm failed to track sudden angle changes involved in stepwise rotations. As a solution to this problem, the minimum slew rate among the initial guesses was computed. At large variations in rotation angles, the high slew rate provided a guide to search all possible initial guesses, providing better tracking of sudden angular changes. Since the search space near the initial guesses was fixed at an arbitrary value (10% of the total angular span) for all angle estimations, it was found that in some cases the correct rotation angle falls outside the search space. Such events were accompanied by estimations with unexpectedly low membership values. Therefore if the membership value of a particular estimation falls below 40% of the previous estimation, a reestimation was invoked for that particular view with a doubled search space near the initial guess. This was necessary to track the motion accurately, although it required up to four times the usual estimation time. At high spatial frequencies, the signal-to-noise ratio (SNR) of data was found to be inadequate to give acceptable membership values for the angle estimations. Therefore those estimates with membership value less than a predefined α-cut [36] were discarded. However, this caused little effect on the final reconstructed image, since noise corrupted data hardly gave additional information to upgrade the quality of the image. Where possible, the rotation angles of discarded estimations were derived using linear interpolation
of nearby valid estimations. The results of angle estimations (i.e., after imposing the α-cut) for the five experiments considered are shown in Fig. 14. The corresponding membership values are plotted in Fig. 15. The shift in the estimated angles from the actual motion is due to the fact that the algorithm considers the orientation of the object at the acquisition of the (N/2)th view to be 0°. This does not affect the artifact suppression scheme, but the corrected image displays a rotated version of the initial object. The high incidence of data overlap regions [44] due to large angle rotations results in more reliable angle estimations, especially in the high-frequency views, compared to small-angle rotations with less overlap. The total estimation times for continuous rotations ranged from 132 minutes to 157 minutes on a Sun SPARC 2 workstation, whereas stepwise rotations incurred a maximum of 167 minutes. The total number of valid estimations was in excess of 190 angles. Compared to previously published work [14,17], which required 4.5 h for 10 estimations [14], the proposed algorithm performs efficiently and avoids an exhaustive search on all possible combinations of angles [17]. The estimated angles are then used to suppress the rotational motion artifacts using fuzzy POCS. Fig. 16 shows the reconstructed images with motion artifact correction.
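The α-cut pruning of unreliable estimates and the linear interpolation of the discarded angles described above can be sketched as follows; the function name and the default α-cut value are illustrative only.

```python
import numpy as np

def prune_and_interpolate(angles_deg, memberships, alpha_cut=0.3):
    """Discard angle estimates whose membership falls below the alpha-cut and
    replace them by linear interpolation of the neighbouring valid estimates."""
    angles = np.asarray(angles_deg, dtype=float)
    mu = np.asarray(memberships, dtype=float)
    valid = mu >= alpha_cut
    views = np.arange(len(angles))
    if not valid.any():
        return angles                      # nothing reliable to interpolate from
    filled = angles.copy()
    filled[~valid] = np.interp(views[~valid], views[valid], angles[valid])
    return filled
```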
6 SUMMARY
The goal of this chapter was to develop algorithms for restoring MR images reconstructed from data corrupted by patient motion during the scan time. The rotational model of the motion was selected because the translational model has already been studied in depth, as reported in past literature [8– 13]. Rotational motion being a primary component of gross in-plane movements, the inclusion of such motion into the existing translational model accurately describes patient movements involving head and limbs. A general technique of rotational motion parameter estimation is provided in this chapter. An added advantage of this method is that it is capable of estimating the rotation angle associated with each view in the midst of concurrent translational motion. Due to the transformation of the estimation problem from one N-dimensional minimization to N single-dimensional minimizations, the estimation speed is greatly enhanced while reducing the computational complexity. The motion parameter estimation from the corrupted data themselves is widely regarded as a difficult task even for pure translational movements. With the added complexity of rotational movements, such estimations were highly time-consuming in the past. The new method proposed in this chapter provides a new perspective to this problem and a solution that is both effective and easy to implement.
FIGURE 14 Angle estimation results—estimated rotation angles compared to the simulated rotation (dashed line): (a) case 1; (b) case 2; (c) case 3; (d) case 4; (e) case 5. (From Ref. 44.)
FIGURE 15 Angle estimation results—membership values (solid line) representing the reliability of each angle estimate are shown with the α-cut value (dashed line) used for choosing the valid estimates: (a) case 1; (b) case 2; (c) case 3; (d) case 4; (e) case 5. (From Ref. 44.)
FIGURE 16 Reconstructed images with artifact correction using the proposed fuzzy POCS algorithm. Fuzzy POCS parameters and the number of iterations (It) are indicated for each case: (a) case 1—MSE = 78.9 (0 = 0.05, 0 = 0.3, and It = 10); (b) case 2—MSE = 69.2 (0 = 0.05, 0 = 0.3, and It = 10); (c) case 3—MSE = 151.1 (0 = 0.05, 0 = 0.3, and It = 10); (d) case 4—MSE = 86.5 (0 = 0.1, 0 = 0.3, and It = 10); (e) case 5—MSE = 408.7 (0 = 0.05, 0 = 0.3, and It = 10).
It is shown that gross in-plane rotations cause data overlaps and voids in the corrected k-space data. The proposed weighted-averaging method has been shown to be effective in managing the overlap data, which minimizes the interpolation errors and data corruption due to noise. An iterative algorithm based on POCS was developed to estimate the missing data within data void regions. Constraints based on the finite support and consistency with the collected data are used to form convex sets. The limitations commonly encountered in conventional POCS algorithms are overcome using the fuzzy POCS algorithm described in this chapter. Despite the existence of a plethora of postprocessing artifact correction techniques, none is chosen for widespread use in clinical settings. A major reason is the inadequacy of the established motion models in representing realistic movements. However, the expectation of a globally valid mathematical model is also viewed as unrealistic. Therefore the perspective of this and ongoing research is to develop models that can effectively and efficiently suppress motion artifacts due to a particular type of motion. Such an array of models may then be combined to develop a generalized technique that will enable the user to choose an appropriate combination of models. It is also envisaged that automatic selection of motion models may be possible in the future. From a more theoretical point of view, the fuzzy POCS algorithm developed in this chapter may find its uses in many other applications in signal estimation and reconstruction. Further theoretical development and application of fuzzy POCS is to be encouraged for future research.
REFERENCES

1. A. Kumar, D. Welti, R. R. Ernst. NMR Fourier zeugmatography. J. Magn. Reson. 18:69–83, 1975.
2. L. S. de Varies, L. M. S. Dubowitz, V. Dubowitz, F. M. Pennock. A color atlas of brain disorders in the newborn. England: Wolfe Medical Publ., 1990, pp. 1–15.
3. H. W. Korin, J. P. Felmlee, S. J. Riederer, R. L. Ehman. Spatial-frequency-tuned markers and adaptive correction for rotational motion. Mag. Reson. Med. 33:663–669, 1995.
4. R. L. Ehman, J. P. Felmlee. Adaptive technique for high-definition MR imaging of moving structures. Radiology 173:255–263, 1989.
5. W. A. Edelstein, J. M. S. Hutchison, G. Johnson, T. Redpath. Spin warp NMR imaging and applications to human whole-body imaging. Phys. Med. Bio. 25:751–756, 1980.
6. P. M. Pattany, J. J. Phillips, L. C. Chiu, J. D. Lipcamon, J. L. Duerk, J. M. McNally, S. N. Mohapatra. Motion artifact suppression technique (MAST) for MR imaging. J. Comp. Asst. Tomography 11:369–377, 1987.
7. D. B. Twieg, J. Katz, R. M. Peshock. A general treatment of NMR imaging with chemical shifts and motion. Mag. Reson. Med. 5:32–46, 1987.
8. M. L. Wood, R. M. Henkelman. MR image artifacts from periodic motion. Med. Phys. 12:143–151, 1985.
9. H. W. Korin, F. Farzaneh, R. C. Wright, S. J. Riederer. Compensation for the effects of linear motion in MR imaging. Mag. Reson. Med. 12:99–113, 1989.
10. L. Tang, M. Ohya, Y. Sato, S. Tamura, H. Naito, K. Harada, T. Kozuka. Artifact cancellation in MRI due to phase encoding axis motion. Systems and Computers in Japan 26:88–98, 1995.
11. M. Hedley, H. Yan. Suppression of slice selection axis motion artifacts in MRI. IEEE Trans. Med. Imaging 11:233–237, 1992.
12. M. Hedley, H. Yan. Correcting slice selection axis motion artifacts in MR imaging. Proc. 1992 International Conference on Acoustics, Speech and Signal Processing (ICASSP'92), San Francisco, 1992, 3:81–84.
13. M. Hedley, H. Yan. Motion artifact suppression: a review of postprocessing techniques. Magn. Reson. Imag. 10:627–635, 1992.
14. R. A. Zoroofi, Y. Sato, S. Tamura, H. Naito. MRI artifact cancellation due to rigid motion in the imaging plane. IEEE Trans. Med. Imaging 15:768–784, 1996.
15. J. D. O'Sullivan. A fast Sinc function gridding algorithm for Fourier inversion in computer tomography. IEEE Trans. Med. Imaging 4:200–207, 1985.
16. M. L. Wood, M. J. Shivji, P. L. Stanchev. Planar-motion correction with use of k-space data acquired in Fourier MR imaging. JMRI 5:57–64, 1995.
17. D. Atkinson, D. L. G. Hill, P. N. R. Stoyle, P. E. Summers, S. F. Keevil. Automatic correction of motion artifacts in magnetic resonance images using an entropy focus criterion. IEEE Trans. Med. Imaging 16:903–910, 1997.
18. M. Hedley, H. Yan, D. Rosenfeld. A modified Gerchberg–Saxton algorithm for one-dimensional motion artifact correction in MRI. IEEE Trans. Sig. Processing 39:1428–1432, 1991.
19. K. C. Tam, V. Perez-Mendez. Tomographic imaging with limited-angle input. J. Opt. Soc. Am. 71:582–592, 1981.
20. R. M. Glaeser, L. Tong, S. H. Kim. Three-dimensional reconstruction from incomplete data: interpretability of density maps at 'atomic' resolution. Ultramicroscopy 27:307–318, 1989.
21. L. M. Bregman. Finding the common point of convex sets by the method of successive projections. Dokl. Akad. Nauk. 162(3):487–490, 1965.
22. L. G. Gubin, B. T. Polyak, E. V. Raik. The method of projections for finding the common point of convex sets. USSR Com. Math. Phy. 7(6):1–24, 1967.
23. R. W. Gerchberg, W. O. Saxton. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 35(2):237–248, 1972.
24. R. W. Gerchberg. Super-resolution through error energy reduction. Optica Acta 21(9):709–720, 1974.
25. A. Papoulis. A new algorithm in spectral analysis and band limited extrapolation. IEEE Trans. Cir. Sys. 22(9):735–742, 1975.
26. D. C. Youla, H. Webb. Image restoration by the method of convex projections: Part I, Theory. IEEE Trans. Med. Imaging 1:81–94, 1982.
27. A. Levi, H. Stark. Image restoration by the method of generalized projections with application to restoration from magnitude. J. Opt. Soc. Am. A 1(9):932–943, 1984.
28. M. H. Goldburg, R. J. Marks II. Signal synthesis in the presence of an inconsistent set of constraints. IEEE Trans. Circ. Syst. 32:647–663, 1985.
29. W. F. Zhuo, Y. Wang, R. C. Grimm, P. J. Rossman, J. P. Felmlee, S. J. Riederer, R. L. Ehman. Orbital navigator echoes for motion measurements in magnetic resonance imaging. Mag. Reson. Med. 34:746–753, 1995.
30. C. Weerasinghe, H. Yan. Correction of motion artifacts in MRI caused by rotations at constant angular velocity. Signal Processing 70:103–114, 1998.
31. L. A. Zadeh. Fuzzy sets. Information and Control 8:338–353, 1965.
32. L. Ji, H. Yan. An intelligent and attractable snake model for contour extraction. Proc. of 1999 International Conf. on Acoustics, Speech and Signal Processing (ICASSP'99), Arizona, 1999, 6:3309–3312.
33. C. Weerasinghe, L. Ji, H. Yan. A new method for ROI extraction from motion affected MR images based on suppression of artifacts in the image background. Signal Processing 80(5):867–881, 2000.
34. M. Hedley, H. Yan, D. Rosenfeld. An improved algorithm for 2D translational motion artifact correction. IEEE Trans. Med. Imaging 10:548–553, 1991.
35. A. Macovski. Noise in MRI. Mag. Res. Med. 36:494–497, 1996.
36. H.-J. Zimmerman. Fuzzy set theory and its applications. Boston: Kluwer, 1996, pp. 31–32.
37. H. Yan, J. C. Gore. An efficient algorithm for MR image reconstruction without low spatial frequencies. IEEE Trans. Med. Imaging 9:184–189, 1990.
38. C. Weerasinghe, H. Yan. An improved algorithm for rotational motion artifact suppression in MRI. IEEE Trans. Med. Imaging 17:310–317, 1998.
39. D. G. Luenberger. Optimization by vector space methods. New York: John Wiley, 1968, pp. 11–22.
40. T. Terano, K. Asai, M. Sugeno. Fuzzy Systems Theory and Its Applications. New York: Academic Press, 1992, pp. 70–84.
41. R. J. Marks II, S. Oh, L. Laybourn, S. Lee. Fuzzy and extra crisp alternating projections onto convex sets (POCS). Proc. of the 4th Int. Conf. on Fuzzy Sys., Yokohama, 1995, 427–435.
42. D. C. Youla, V. Velasco. Extensions of a result on the synthesis of signals in the presence of inconsistent constraints. IEEE Trans. Cir. Sys. 33:465–468, 1996.
43. P. Maragos, R. W. Schafer. Morphological filter—Part I: Their set theoretic analysis and relations to linear shift-invariant filters. IEEE Trans. Acous. Speech and Sig. Proc. 35:1153–1169, 1987.
44. C. Weerasinghe, L. Ji, H. Yan. A fast method for estimation of object rotation function in MRI using a similarity criterion among k-space overlap data. Signal Processing 78:215–230, 1999.
45. L. A. Shepp, B. F. Logan. Reconstructing interior head tissue from x-ray transmissions. IEEE Trans. Nucl. Sci. 21:228–238, 1974.
6 Tagged MR Cardiac Imaging

Nikolaos V. Tsekos and Amir A. Amini
Washington University School of Medicine in St. Louis, St. Louis, Missouri
1 INTRODUCTION
Noninvasive techniques for assessing the dynamic behavior of the human heart are invaluable in the diagnosis of heart disease, as abnormalities in the myocardial motion sensitively reflect deficits in blood perfusion [1,3]. MRI is a noninvasive imaging technique that provides superb anatomic information with excellent spatial resolution and soft tissue contrast. Conventional MR studies of the heart provide accurate measures of global myocardial function, chamber volumes and ejection fractions, and regional wall motions and thickening. In MR tagging, the magnetization property of selective material points in the myocardium is altered in order to create tagged patterns within a deforming body such as the heart muscle. The resulting pattern defines a time-varying curvilinear coordinate system on the tissue. During tissue contractions, the grid patterns move, allowing for visual tracking of the grid intersections over time. The intrinsic high spatial and temporal resolutions of such myocardial analysis schemes provide unsurpassed information about local contraction and deformation in the heart wall that can be used to derive local strain and deformation indices from different myocardial regions. In this chapter, we provide a comprehensive overview of tagged MR imaging methods proposed in the literature. We continue the discussion of
tagged MRI in Chap. 15 where we discuss two promising image analysis techniques developed in our laboratory for both analysis and visualization of left-ventricular deformations from tagged images.

2 TAGGED MRI

2.1 Introduction
FIGURE 1 Illustration of the MR tagging concept. (A) The ECG signal. (B) The tagging pulse and the MR image acquisition elements. TD: a delay to determine the instance the tagging pulse will be applied relative to the ECG triggering signal. TPi (i = 1 to 6): the time instances the acquisition elements are collected.

Assessment of motion with MR tagging is based on the simple concept that the spatiotemporal deformation of a "tagged pattern," applied onto a matrix (tissue), reflects the motion of the matrix. Specifically, a change in the shape of the tagging pattern in an MR image reflects the change in the shape of the underlying matrix, between the moment the tagging pattern was originally applied and the instance the MR image was collected. Moreover, assuming that a series of MR images are collected at consecutive instances after the application of the tagging pattern, then a cinematic impression of the matrix motion can be acquired (Fig. 1). For the study of cardiac motion, the observation of the tagging pattern deformation is usually accomplished by collecting MR images over a portion or an entire cardiac cycle. In MRI, the generation of the tagging pattern is based on the spatial modulation of the longitudinal magnetization (Mz) using sequences of radio frequency (RF) pulses together with a combination of suitable pulsed and/or constant B0 magnetic field gradients. MR tagging was first proposed by Morse and Singer to measure bulk flow [2]. Imaging of the heart wall motion with MR tagging was first described by Zerhouni et al. [3], who demonstrated that tagged magnetization could be used to assess the complex de-
formation of the heart. Subsequently, Axel and Dougherty proposed spatial modulation of magnetization (SPAMM), a simple and robust scheme for generating parallel planes of saturation throughout the entire imaging volume [4]. Since then, major efforts have been devoted to the refinement and improvement of these methods for generating complex tagging patterns or imaging the tissue motion with more efficient imaging techniques. Such studies have clearly demonstrated that by combining suitable RF pulse trains or envelopes with B0 gradients, tailored tagging patterns can be generated such as starburst tags [3], parallel line patterns [4], tagging grids [5–7], striped radial tags [8], and contrast-enhanced difference patterns [9]. In general, myocardial tagging involves two distinct phases: (1) modulation of the magnetization for the generation of a specific tagging pattern and (2) acquisition of MR images to observe the motion of the tagging pattern. These two phases are also reflected in the structure of the employed pulse sequence, i.e., the series of RF pulses, gradient pulses, and data acquisition periods that are used to collect the tagged images, as illustrated in Figs. 1 and 2. Virtually any imaging sequence can be used to image the motion of the tagging pattern, by adding the appropriate tagging pulses. In practice, consideration of the rate and extent of the motion as well as the imaged medium must be taken into account when designing the appropriate tagging pattern and MR imaging sequence.

2.2 Generation of the Tagging Pattern
The MR tagging pattern may be generated by several techniques that manipulate the matrix magnetization. In general, there are two families of tagging pattern generation techniques. With the first approach, the longitudinal magnetization (Mz) of the matrix is directly modulated with the application of suitable sequences of RF pulses and magnetic field (B0) gradients. These tailored RF pulse sequences have a characteristic response in the frequency domain that in the presence of suitable magnetic field gradients generates a spatial profile. Examples of such techniques are the band-selective RF trains, originally proposed by Zerhouni et al. [3,8], and the multiband delays alternating with nutations for tailored excitation (DANTE), first introduced by Mosher and Smith [7] and implemented on the rat heart by de Crespigny et al. [10]. The second approach is the spatial modulation of magnetization (SPAMM) technique that was first introduced and later improved by Axel and Dougherty [4,6]. In SPAMM, the tagging pattern is first generated by modulating the transverse magnetization (Mxy) of the matrix and then, after it is restored on the longitudinal axis, it is observed. Hybrid DANTE/ SPAMM techniques have been introduced to improve tagging pattern generation [11] and to facilitate the use of large-bandwidth adiabatic pulses [12].
FIGURE 2 A series of four images of the tagged porcine heart from end-diastole to late diastole.
Recently, complex tagging patterns have been introduced by using tailored RF and gradient magnetic field waveforms that synergistically drive the matrix magnetization [13,14]. Several factors should be considered when designing and optimizing a tagging pulse for specific experimental conditions. As an example, to study a 1-D translational motion, a set of parallel tag lines perpendicular to the motion would be the optimal choice. For a radial motion of a round object, a set of concentric ring tags would be ideal. However, the challenge is to image more complex motions like that of the myocardium, which includes translation and rotation as well as nonrigid motions in 3-D space. In principle, any type of tagging pattern can be generated in regard to shape, size, and orientation in space, by combining RF with tailored amplitude and/or phase waveforms together with time-varying magnetic field gradients along any combination of spatial axes [11–14,16]. In a first approach, the theory of k-space excitation for small flip angle excitations described by Pauly et
al. [15] can be used for the design of tailored spatial excitations for tagging patterns. As an example, the k-space approximation has been used to design tailored tagging patterns with variable spacing [13]. However, the k-space approximation is not ideal for the study of tagging techniques because it only describes the Mx and My components of the magnetization, whereas tagging modulates the longitudinal magnetization (Mz). Recently, Kerwin and Prince [14] described an extension of the k-space approximation to relate the pulse sequence directly to the modulation of the longitudinal magnetization tagging pattern. Despite their potential for tailored imaging of specific motion patterns or matrix shapes, complicated tagging patterns have several technical and practical limitations. Specifically, highly crafted RF pulses require dense coverage of a wide portion of the k-space, increasing the imaging time. In terms of k-space coverage, the 1-D parallel line tags have low requirements [16], whereas a grid pattern (parallel lines along two orthogonal axes) and a striped radial pattern require broader coverage. To address the issue of kspace coverage, McVeigh [16] has introduced the approach of acquiring parallel line tagging patterns in different orientations. In that implementation, the readout gradient is always aligned perpendicular to the tags, to resolve the position of the tag lines with high spatial resolution and reduced sampling time. However, this method requires the imaging of tagging stripes along two orthogonal axes for the same imaging plane, thus increasing the imaging time. 2.2.1
2.2.1 Tagging with Saturation Planes
Tagging using narrow saturation planes applied perpendicular to the imaging plane was the first implementation of myocardial tagging by Zerhouni et al. [3]. In this approach, each tagging line is generated by a single RF pulse applied in the presence of a magnetic field gradient, as in slice selection. Thus the implementation offers an unmatched way to define the width and plane of orientation of each tagging plane, as well as the separation between them. This can be achieved by adjusting the duration and transmission frequency of the RF envelope together with the strength and orientation of the slice selection magnetic field gradient. The saturation planes technique has been implemented to generate parallel [3] and starburst [8] tagging patterns. Figure 3 illustrates the combination of RF pulse and gradients that can be used to generate a parallel tagging pattern with saturation planes. Figure 4A illustrates the RF pulse and gradients to generate a starburst pattern like the one illustrated on the human heart in Figure 4B. Tagging with band-selective pulses has been substituted with more time- and power-efficient approaches such as the SPAMM and DANTE sequences.
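As a rough numerical sketch of this frequency-to-position mapping, the transmission-frequency offsets needed to place five equidistant saturation planes can be computed as follows; the gradient strength, tag spacing, and tag width are assumed example values, not taken from Ref. 3.

```python
import numpy as np

# Illustrative sketch: transmission-frequency offsets for five equidistant
# saturation planes (as in Fig. 3).  All numerical values are assumed.

gamma_bar = 42.58e6          # gyromagnetic ratio / 2*pi [Hz/T] for protons
G_ss = 10e-3                 # slice-selection (tagging) gradient [T/m]
tag_spacing = 8e-3           # desired distance between tag planes [m]
tag_width = 1e-3             # desired tag-plane thickness [m]

# Positions of the five tag planes, centered about isocenter.
positions = (np.arange(5) - 2) * tag_spacing            # [m]

# Each band-selective pulse (alpha_1..alpha_5) is transmitted at the offset
# frequency that maps onto its tag-plane position under the gradient.
offsets_hz = gamma_bar * G_ss * positions               # [Hz]

# The RF envelope bandwidth sets the tag width: BW = gamma_bar * G_ss * width.
bandwidth_hz = gamma_bar * G_ss * tag_width

for n, (x, f) in enumerate(zip(positions, offsets_hz), start=1):
    print(f"alpha_{n}: plane at {x*1e3:+.1f} mm -> offset {f/1e3:+.2f} kHz")
print(f"required pulse bandwidth ~ {bandwidth_hz/1e3:.2f} kHz")
```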
FIGURE 3 Timing diagram of the RF pulse sequence, pulse transmission frequency, and gradients for the generation of five equidistant and parallel tagging lines with the saturation planes technique. The RF pulses (α1 to α5) are soft band-selective pulses. Optional crushing gradients (GS) are shown interleaved with the slice selection gradients (GSS) to spoil unwanted transverse magnetization generated by RF imperfections. If necessary, additional crushing gradients may also be applied in parallel along the other two axes.
FIGURE 4 (A) Timing diagram of the RF pulse sequence, pulse transmission frequency, and gradients for the generation of four tagging lines with a starburst pattern. Assuming that the star lines are oriented π/4 relative to the vertical axis, the slice selection gradients (GSS1 and GSS3) on the Y and Z axes for the first and third pulses should be equal. Optional crushing gradients are also shown (GS1, GS2, GS3). (B) Four-line starburst pattern on the human heart depicting a diastolic phase.
2.2.2 Tagging with Spatial Modulation of Magnetization (SPAMM)—Standard SPAMM
Tagging of the magnetization with the spatial modulation of magnetization (SPAMM) technique was introduced by Axel and Dougherty [4] and is unique with respect to the other methods: the tagging pattern is generated by manipulating the transverse magnetization (Mxy). In its simplest implementation, SPAMM is applied with a train of two pulses (1-1 SPAMM) interleaved with a pulsed magnetic field gradient applied along one direction (G_r), shown in Fig. 5A. The first pulse turns all (if α = π/2) or some (if α < π/2) of the longitudinal magnetization into the transverse plane. The gradient pulse induces a spatial distribution of the transverse magnetization phase φ(r⃗) along the direction of the applied gradient. The phase φ(r⃗) depends on the waveform (G_r(t)) and duration (T_p) of the gradient pulse, and on the position in space (r⃗):

\phi(\vec{r}) = \gamma \int_{T_p} \vec{G}_r(t) \cdot \vec{r}\, dt \qquad (1)

As a result, the Mx and My components are sinusoidally modulated as shown in Fig. 5B, with a spatial periodicity (L) given by

L = \frac{2\pi}{\gamma \int \vec{G}_r(t)\, dt} \qquad (2)
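As a minimal numerical sketch of Eqs. (1) and (2), assuming a constant 5 mT/m tagging gradient applied for 1 ms and a pair of π/2 pulses (values chosen only for illustration), the tag spacing and the resulting Mz modulation can be computed as follows.

```python
import numpy as np

# Minimal sketch of Eqs. (1) and (2) for a 1-1 SPAMM preparation.
# The gradient amplitude and duration below are assumed example values.

gamma = 2 * np.pi * 42.58e6            # gyromagnetic ratio [rad s^-1 T^-1]
dt = 1e-6
t = np.arange(0, 1e-3, dt)             # 1 ms tagging gradient lobe
G = np.full_like(t, 5e-3)              # constant 5 mT/m gradient [T/m]

moment = np.sum(G) * dt                # integral of G_r(t) dt  [T s / m]
L = 2 * np.pi / (gamma * moment)       # tag spacing, Eq. (2)  [m]

x = np.linspace(-0.02, 0.02, 512)      # positions along the gradient axis [m]
phi = gamma * moment * x               # tagging phase phi(r), Eq. (1)
Mz = np.cos(phi)                       # Mz after the second pulse (alpha = pi/2)

print(f"tag spacing L = {L * 1e3:.2f} mm")   # ~4.7 mm for these assumed values
```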
Subsequently, the second RF pulse restores the longitudinal magnetization, which is now sinusoidally modulated, generating alternating stripes of high and low signal intensity (Fig. 5B). Crushing gradients are applied at the end of the SPAMM sequence to spoil the remaining transverse magnetization. Axel and Dougherty have demonstrated that narrower tags can be generated with the SPAMM techniques using pulse sequences with a train of nonselective RF pulses, the relative amplitudes of which are distributed according to the binomial coefficients, i.e., 1-2-1, 1-3-3-1, 1-4-6-4-1 (Fig. 5B) [6]. The tagging stripes are perpendicular to the gradient axis. The sequence can be applied a second time with the gradients along an orthogonal axis to generate a rectangular tagging grid. In practice, the amplitude of each RF pulse is regulated so the composite will give the desired excitation. Tagging with the SPAMM technique is the most widely used method on clinical systems. Standard and Slice following Complementary Spatial Modulation of the Magnetization (CSPAMM). The complementary spatial modulation of the magnetization (CSPAMM) was introduced by Fischer et al. [17] to in-
FIGURE 5 (A) Timing diagram of the RF pulse train and the gradients used with a 1-1 SPAMM sequence for generation of a 2-D tagging grid. Crushing gradients are shown along all three axes to remove remaining transverse magnetization after the second pulse of each SPAMM sequence. (B) The transverse (Mxy) magnetization after the spoiling gradient and the longitudinal (Mz) magnetization after the second RF pulse for different flip angles, ranging from inversion (thick line) to partial excitation (thin line). Note that the tagging grid separation is not uniform for the intermediate flip angle. Similar effects on the profile of the magnetization occur due to T1 relaxation.
crease the ratio of the tagged and nontagged components of the magnetization. With this approach, two tagged images are collected as shown in Fig. 6. The first image is a standard 2-D 1-1 SPAMM. The second image is collected by inverting the phase of the second pulse of the second SPAMM element (gray box in Fig. 6). By subtracting the two images, the relaxed component of the magnetization is removed, maintaining the component that contains the tagging information [17]. CSPAMM is an SNR-efficient technique, but it requires the collection of two image sets, which increases the imaging time and may result in image registration errors. The concept of CSPAMM led to a special implementation, the slice following CSPAMM [18]. This technique was introduced to address a major limitation of the MR tagging techniques: their inability to account for through-plane motion. In MR tagging, the imaged slices are fixed in space relative to the magnet coordinate system. The heart undergoes a complex motion that includes components in three dimensions: a base-to-apex compression in addition to shortening and thickening on the short axis together with clockwise rotation of the base and counterclockwise rotation of the apex during systole (see Chap. 15 for more details). The first component is the main contributor to the through-plane motion, which may result in incorrect
FIGURE 6 Timing diagram of the RF sequence and the gradients used with the CSPAMM to generate a rectangular tagging grid. Two images are collected with this technique, a standard 2-D SPAMM (positive fourth pulse) and a complementary SPAMM, with the tagging grid acquired by inverting the fourth RF pulse. The gradients GTZ and GTY generate a tagging pattern in a way identical with that of the standard SPAMM. The two image sets are then subtracted.
assessment of motion with tomographic techniques like MRI. To address this issue, Fischer et al. modified the CSPAMM as shown in Fig. 7A [18]. Specifically, slice-following is achieved by tagging a thin slice within an imaged volume, rather than the entire volume. The imaged volume should be large enough to include the motion of the thin tagged slice. Two images are again collected, as with the standard CSPAMM. By subtracting the two images, the background signal of the magnetization above and below the tagged slice is eliminated, thereby leaving the image of the thin tagged slice (Fig. 7B). Figure 8 shows CSPAMM images collected with standard imaging where slice positions are fixed relative to the scanner’s coordinate system, and with slice following CSPAMM.
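The subtraction step at the heart of CSPAMM can be illustrated with a short sketch using synthetic one-dimensional signals; the relaxation constants and tag spacing below are assumed for illustration only, not taken from Refs. 17 or 18.

```python
import numpy as np

# Schematic sketch of the CSPAMM subtraction with synthetic 1-D signals.
# Relaxation time, imaging delay, and tag spacing are assumed values.

x = np.linspace(0, 0.1, 1024)          # position [m]
k_tag = 2 * np.pi / 8e-3               # tag spatial frequency (8 mm spacing)
t, T1 = 0.4, 0.85                      # time after tagging and myocardial T1 [s]

relaxed = 1.0 - np.exp(-t / T1)        # untagged magnetization regrowing toward M0
tagged = np.exp(-t / T1) * np.cos(k_tag * x)   # decaying tagged modulation

img_spamm = relaxed + tagged           # standard SPAMM acquisition
img_complement = relaxed - tagged      # second acquisition with inverted tagging pulse

cspamm = img_spamm - img_complement    # relaxed component cancels -> 2 * tagged
assert np.allclose(cspamm, 2 * tagged)
```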
2.2.3 Tagging with DANTE Pulse Sequences
Standard DANTE. Multiband delays alternating with nutations for tailored excitation (DANTE) was first used to measure the translational motion of a phantom by Mosher and Smith [7] and later was applied to assess the motion of the rat heart wall by de Crespigny et al. [10]. The DANTE sequence, originally introduced by Bodenhausen et al. for NMR applications [19], is a series of small excitation angle pulses with a frequency response composed of a central band at the carrier frequency (Ω0) and a series of
FIGURE 7 (A) Timing diagram of the RF pulses and gradient used with the CSPAMM technique. The RF pulse 1 is slice selective. (B) Illustration of the CSPAMM operation depicting a portion of the tissue immediately after the application of the tagging pulse (t = 0) and after a time tk . (From Ref. 18.)
sidebands at equidistant frequencies (Ωn) as illustrated in Fig. 9. When a linear magnetic field gradient is applied during the DANTE sequence, it linearly disperses the Larmor frequencies of the tissue. As a result, the central band and the sidebands of the pulse generate a series of equidistant and parallel stripes of reduced longitudinal magnetization compared to the unperturbed underlying matrix. A DANTE sequence can be generated from any parent RF pulse by segmentation in N pulse elements of duration ΔTp by inserting N − 1 delays of length τ (Fig. 9A). Note that the total duration of the resultant DANTE sequence is Ttot = N ΔTp + (N − 1)τ. The tagging grid is characterized by the width (l) of the inversion bands, the distance (L) between them, and the total tissue area (Ltot) tagged by the pulse (Fig. 9B). The characteristics of the tagging pattern generated by a DANTE sequence are directly related to operator-defined features of the DANTE sequence: the full width at half maximum (FWHM) bandwidth of the on-resonance and harmonic excitations generated by a DANTE sequence is proportional to a constant value (R) divided by the total duration of the pulse (Ttot). The R constant is characteristic of the parent pulse and determines the width of its excitation bandwidth (BW), i.e., BW = R/Tp. So

l \propto \frac{R}{T_{tot}} \qquad (3)
The peak-to-peak spacing of the DANTE pulse sequences is inversely
FIGURE 8 Tagged short-axis images of a healthy volunteer obtained with CSPAMM at end-diastole (left column), at end-systole (middle column), and at late diastole (right column). The top row is the result of a standard imaging technique (Fig. 6). The images in the second row are obtained with slice-following CSPAMM (Fig. 7). (Courtesy of M. P. Watkins, S. E. Fischer, C. H. Lorenz, Center for Cardiovascular MR, Barnes-Jewish Hospital, St. Louis, MO.)
proportional to the center-to-center distance (ΔTp + τ) between the DANTE pulse elements:

L \propto \Omega_n \propto \frac{1}{\Delta T_p + \tau} \qquad (4)
Specifically, the FWHM of the tagged area (Ltot) is related to the total FWHM bandwidth Ωtot of a DANTE sequence, which is inversely proportional to the duration of the elementary segments of the DANTE sequence:

L_{tot} \propto \Omega_{tot} \propto \frac{1}{\Delta T_p} \qquad (5)
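In the small-flip-angle regime the frequency response of a DANTE train is approximately the Fourier transform of B1(t), so the relations (3)–(5) can be checked numerically. The sketch below assumes N = 16 hard-pulse elements of 20 μs separated by 480 μs delays; these timing values are illustrative, not taken from the cited work.

```python
import numpy as np

# Sketch: small-tip-angle frequency response of a DANTE train (FFT of B1(t)).
# N, delta_Tp, and tau are assumed example values.

N, delta_Tp, tau = 16, 20e-6, 480e-6      # pulses, element duration, interpulse delay [s]
dt = 1e-6
period = round((delta_Tp + tau) / dt)
n_on = round(delta_Tp / dt)

b1 = np.zeros(N * period)
for k in range(N):
    b1[k * period : k * period + n_on] = 1.0   # N hard-pulse elements

spectrum = np.abs(np.fft.fftshift(np.fft.fft(b1, 1 << 16)))
freqs = np.fft.fftshift(np.fft.fftfreq(1 << 16, d=dt))

sideband_spacing = 1.0 / (delta_Tp + tau)      # Eq. (4): ~2 kHz here
envelope_width = 1.0 / delta_Tp                # Eq. (5): ~50 kHz here
print(f"sideband spacing ~ {sideband_spacing:.0f} Hz, envelope width ~ {envelope_width:.0f} Hz")
```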
The simplest magnetic field gradient that can be used with DANTE pulses is a constant one applied during the entire Ttot period [7]. However, more complicated gradient waveforms can be applied for more efficient tagging or tailored gradient patterns. Recently, a sawtooth-like gradient wave-
FIGURE 9 (A) Timing diagram of the RF pulse train of a DANTE sequence for tagging with a constant magnetic field gradient (GT). (B) The frequency profile of a DANTE sequence composed of the central band and sidebands. For a given gradient (GT), the pulse segment duration (ΔTp), the interpulse delay (τ), and the total duration of the RF train (Ttot) determine the full width at half maximum tagged area (Ltot), tag spacing (L), and tag width (l).
form (Fig. 10B) was introduced [12], which consists of a constant base gradient (GB) during the DANTE sequence pulse element (ΔTp) while it is ramped up toward a maximum value (GB + GM) and then down to the base (GB) during the interpulse delays (τ). This gradient waveform allows the reduction of the grid spacing without further increase of the gradient strength or undesired elongation of the DANTE sequence. Defining for simplicity the parameters α = τ/ΔTp and β = GM/GB, analytical expressions for the calculation of l, L, and Ltot can be derived [12]:

\frac{R}{l} = \gamma \int_{T_{tot}} G(t)\,dt \quad \text{or} \quad l = \frac{R[N + (N-1)\alpha]}{\gamma G_B T_{tot}[(N-1)(1 + \alpha + 0.5\alpha\beta) + 1]} \qquad (6)

\frac{1}{L} = \gamma \int_{\Delta T_p + \tau} G(t)\,dt \quad \text{or} \quad L = \frac{N + (N-1)\alpha}{\gamma G_B T_{tot}(1 + \alpha + 0.5\alpha\beta)} \qquad (7)

\frac{1}{L_{tot}} = \gamma \int_{\Delta T_p} G(t)\,dt \quad \text{or} \quad L_{tot} = \frac{N + (N-1)\alpha}{\gamma G_B T_{tot}} \qquad (8)

where γ is the gyromagnetic ratio. The ability of the sawtooth gradient to reduce the width and the distance between the grid lines is demonstrated by the dependence of Eqs. (6) and (7) on the parameter β. A plethora of tagging patterns can be generated to suit the specific gradient coil performance, power deposition limitations, heart rates, etc.
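The closed-form expressions (6)–(8) are straightforward to evaluate. The sketch below assumes an arbitrary example parameter set (N, ΔTp, τ, GB, GM, and R are illustrative values, and the gyromagnetic ratio is taken as γ/2π ≈ 42.58 MHz/T); it is not the published implementation.

```python
import numpy as np

# Sketch: evaluating Eqs. (6)-(8) for an assumed sawtooth-gradient DANTE sequence.
# All sequence parameters and the gyromagnetic-ratio convention are assumptions.

gamma_bar = 42.58e6          # [Hz/T]
N = 24                       # number of DANTE pulse elements
delta_Tp = 80e-6             # element duration [s]
tau = 320e-6                 # interpulse delay [s]
G_B, G_M = 3e-3, 6e-3        # base and additional ramp gradient [T/m]
R = 2.0                      # parent-pulse bandwidth factor (BW = R / Tp), assumed

alpha = tau / delta_Tp
beta = G_M / G_B
T_tot = N * delta_Tp + (N - 1) * tau

common = N + (N - 1) * alpha                      # shared numerator term
l = R * common / (gamma_bar * G_B * T_tot * ((N - 1) * (1 + alpha + 0.5 * alpha * beta) + 1))
L = common / (gamma_bar * G_B * T_tot * (1 + alpha + 0.5 * alpha * beta))
L_tot = common / (gamma_bar * G_B * T_tot)

print(f"Ttot = {T_tot*1e3:.1f} ms, tag width l = {l*1e3:.2f} mm, "
      f"spacing L = {L*1e3:.1f} mm, tagged extent Ltot = {L_tot*1e3:.0f} mm")
```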
FIGURE 10 The B1(t) (A), the phase φ(t) (B), and the sawtooth-like gradient waveform G(t) (C). The illustrated adiabatic DANTE sequence was generated from a hyperbolic secant parent pulse. The calculated longitudinal magnetization following an adiabatic DANTE sequence as a function of frequency offset (D) and B1 (E). The profiles were produced by an adiabatic DANTE sequence from a parent pulse with R = 8 (B and D). (From Ref. 12.)
Tagging with Adiabatic DANTE Sequences. The above MR tagging techniques for tag generation with uniform tagging contrast require relatively homogeneous B1 fields for RF transmission, which necessitates the use of large circumscribing coils. These coil configurations are not suitable for several experimental conditions, such as working with the human torso at high magnetic fields and in small-bore magnets for animal or human studies where ‘‘body’’ coils cannot be used due to limited available bore space. In these cases, the use of surface coils for both transmission and signal reception is the necessary solution. Such coils, however, intrinsically possess highly nonuniform B1 fields. To address this issue, Tsekos et al. implemented MR tagging with DANTE pulses generated from adiabatic parent pulses [12]. The procedure of parent pulse segmentation for generation of a
DANTE sequence is shown in Fig. 10A and 10B (together with the employed sawtooth gradient in 10B). With adiabatic pulses, both the amplitude (Fig. 10A) and the phase (Fig. 10B) of the RF pulse are modulated, and both should be segmented accordingly to create a DANTE sequence. The resultant adiabatic DANTE inversion sequences demonstrate a major advantage compared to the conventional nonadiabatic implementations, with both the tag grid constant and the width being insensitive to B1 field inhomogeneities, making them suitable for use with surface coils. Figures 10C and 10D illustrate the computer-simulated performance of the adiabatic DANTE sequence, which demonstrate that the grid contrast remains unaffected despite large changes in B1 . Figure 11A shows the superior performance of the adiabatic DANTE sequence compared to a conventional nonadiabatic on a phantom. Figure 11B shows tagged images of a canine heart at 4.7 tesla generated with an adiabatic DANTE sequence collected using an extrathoracic surface RF coil. 2.2.4
Variable Distance Tagging
Although the 1-D and 2-D parallel plane tagging patterns are suitable for the study of the heart wall motion, they are not optimal. Specifically, during systole, the tags over regions with circumferential motion demonstrate decrease in their separation, while tags over regions with radial thickening
FIGURE 11 (A) Tagged images of a phantom generated with a nonadiabatic rectangular DANTE sequence (1 and 3) and with the adiabatic DANTE inversion sequence (2 and 4). The tagging grid is generated perpendicularly (1 and 2) and parallel to the surface coil plane (3 and 4). (B) Short-axis, apical view tagged images of the canine heart at 4.7 tesla, using the adiabatic DANTE inversion sequence, showing six phases of one cardiac cycle. The grid has nominal dimensions 4 × 4 mm².
demonstrate increased separation. To address this issue, McVeigh et al. implemented a tagging sequence that generates tags with variable separation to match the left ventricular displacement pattern [13]. Using the small flip angle approximation theory [15], a modulated amplitude RF pulse train was designed with frequency response composed of regions with variable separation inversion bands. In the presented implementation, the ratio of the number of tagging bands in the central region to the number of tagging bands in the side regions was 3:5. The central tagging region was placed over the myocardium with circumferential motion, and the side bands over the myocardium with radial thickening. Thus fewer tagging lines with larger separation were placed over the area with circumferential motion, so a minimal separation of about 4 pixels would be maintained for optimal precision in the estimation of the tag position. 2.3
Imaging of the Tagging Pattern Motion
In principle, any MR imaging sequence can be fitted with tagging pulses, and several MR imaging sequences have been used for this purpose: spin echo (SE) [2,4,20], gradient recalled echo (GRE), echo planar imaging (EPI) [21], and spiral scanning [22]. However, both the tagging pattern and the imaging sequence should be optimized for the particular type of matrix and motion. Optimal use of MR tagging to study the motion of the heart requires imaging sequences that provide good contrast between tagged and background (nontagged) myocardium, sufficient spatial and temporal resolution to track the tagging pattern, and insensitivity to motion due to breathing. Originally, cardiac-gated multislice, multiphase SE imaging was used to image the motion of tagging patterns on the heart [23,24]. However, this approach often resulted in images with inconsistent quality, due to artifacts associated with patient breathing. To address this issue, image acquisition during voluntary patient breath-hold was introduced. With breath-hold, the total scan time must be limited to the period that the subject can hold his or her breath, usually on the order of 15–30 sec. Imaging in such short periods can be accomplished with pulse sequences like segmented GRE, single-shot or segmented EPI, and spiral scanning. In addition to the effect of respiratory motion, several factors should be considered when implementing MR tagging. An important factor is the loss of contrast-to-noise ratio (CNR) between the tagging pattern and the underlying matrix over time due to T1 relaxation. After the application of the tagging pulses, the magnetization of the tagged stripes relaxes, thus reducing the contrast between the tags and the untagged matrix. In practice, the train of excitation/observation pulses of the pulse sequence that follows the tagging pulses to observe the longitudinal magnetization further deteriorates the CNR [21,25]. The observation pulses pro-
gressively reduce the magnetization of the underlying matrix, making the tagged stripe and background matrix signal converge faster, thus reducing the duration of the heart cycle that can be imaged. Several approaches may be used to obviate this limitation, such as the generation of a tagging pattern with flip angles larger than π/2 [26] or the use of echo planar imaging (EPI) [21] or spiral scanning [22] pulse sequences. Another approach to obtaining tagged images over an entire cycle is to collect two sets of data. In the first, the tagging pattern is placed in the myocardium at end-diastole and in the second, the tagging pattern is placed at end-systole. Subsequently, the images from the two acquisition sets can be used to reconstruct the heart motion over an entire heart cycle [16]. Optimization of the resolution in both the real space and the inverse k-space is another important factor. In regard to the real space, it has been demonstrated that the precision in the estimation of the tag position and the optimum tag thickness impose restrictions on the grid width [27] and separation or density [17,28]. The images should have sufficient spatial resolution to allow for the accurate delineation of a tagging grid with given width and separation. Furthermore, the relation between the spatial frequency components of the tagging pattern and the frequencies sampled by the imaging sequence imposes additional restrictions. In the frequency domain, the data acquisition should rapidly sample the Fourier components that are most important for resolving the tagging pattern and tracking it over time. This can be realized when considering the simplest tagging scheme, a set of lines parallel to a real space axis, say x. Since in k-space the Fourier components of this pattern are concentrated along the ky axis, to achieve high resolution in k-space, the readout gradient can be aligned along the ky axis [11]. To measure the motion along two orthogonal directions with this implementation, two images should be collected, each one with both the tags and the readout gradient along the two orthogonal axes. Although longer acquisition times are required, the two image sets provide sampling equivalent to the standard 2-D grid pattern, but with higher resolution. The requirement to sample the k-space with sufficient resolution also implies that tagging patterns that have compact Fourier spectra are better in regard to susceptibility to motion blurring [29].
2.3.1 Segmented k-Space Gradient Recalled Echo
With segmented k-space acquisition, only a part (segment) of the k-space of the image corresponding to a specific heart phase is acquired during a given heartbeat, as illustrated in Figure 12A. The segments that correspond to the same heart phase are collected in different but consecutive heartbeats during the same breath-hold. Figure 12B illustrates an example of segmented coverage of the k-space using linear trajectories, i.e., starting from the highest
FIGURE 12 (A) Examples of segmented k-space acquisition, illustrating three linear (−kmax to +kmax) k-space trajectories. kr: readout k-space axis; kp: phase encoding k-space axis. (B) Implementation of a k-space acquisition. The image depicting the heart at a specific point in the cardiac cycle is generated by combining the required segments from different heartbeats. (C) Timing diagram of the RF and gradient pulses used with the gradient recalled echo (GRE) sequence. RF spoiling and/or refocusing of the gradients can be used (an example of refocusing of the phase gradient is illustrated).
negative k-space line and moving progressively toward the positive k-space line, skipping a number of lines equal to the number of segments. Other trajectories can be used, such as the center-out, where the innermost lines are first collected. After the collection of all segments, the k-space lines that correspond to the same image, i.e., the same heart phase and slice, should be correctly reordered by interleaving from the highest negative k-space line
(−kmax) to the highest positive k-space line (+kmax) before the Fourier transform is applied. With segmented k-space imaging, high temporal resolution is achieved, on the order of the acquisition time of a single segment and not of the entire image. As an example, if an image that requires 300 ms to be collected is split into eight segments, the temporal resolution is 37.5 ms. Although each one of the segments should be collected during different heartbeats, there is no time penalty, since multiple heart phases can be collected during the same period, as shown in Fig. 12A. It should be emphasized that GRE, EPI, and spiral scanning can be implemented with segmented k-space acquisition. In addition to high temporal resolution, with a segmented k-space GRE sequence, the signal is sampled within a brief window (30 to 40 ms) when a short TE value is used. That feature minimizes flow artifacts as well as signal loss due to magnetic susceptibility. Short TE values can be achieved with currently available state-of-the-art gradient field coils and/or utilizing fractional echoes [11]. Figure 12C shows an example of a GRE sequence with the gradients overlapping to achieve short TE and TR values. With a short TE/TR GRE sequence, RF spoiling may be incorporated to eliminate the transverse magnetization. Alternatively, a combination of the slice, readout, and phase encoding gradients may be employed to refocus the transverse magnetization and increase the SNR [11] with the potential penalty of an increased TR. The segmented k-space GRE sequence has disadvantages, originating primarily from the acquisition scheme that requires an RF pulse for each line of k-space. First, GRE has a relatively low efficiency in sampling the longitudinal magnetization that is progressively driven to a low steady state with the train of excitation pulses. Furthermore, due to the continuous excitation pulsation, the longitudinal magnetizations of the untagged and tagged tissue converge toward the same steady state, reducing the contrast of the later heart phases. To address the limitations of the segmented GRE, segmented EPI was implemented by Tang et al. for MR tagging [21]. Despite its limitations, segmented GRE is often the method of choice in MR tagging studies on clinical MR scanners.
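The segment bookkeeping and the reordering step can be sketched as follows, assuming 128 phase-encoding lines split into eight segments acquired with linear trajectories; all numbers are illustrative and not tied to a particular scanner.

```python
import numpy as np

# Sketch (assumed parameters): segment-to-line bookkeeping for a segmented
# k-space acquisition with linear trajectories, and the reordering step
# that interleaves the lines from -kmax to +kmax before the 2-D FFT.

n_pe = 128                    # phase-encoding lines per image
n_segments = 8                # heartbeats needed per image
lines_per_segment = n_pe // n_segments

# Heartbeat s acquires lines s, s + n_segments, s + 2*n_segments, ...
acquired = {s: list(range(s, n_pe, n_segments)) for s in range(n_segments)}

# Temporal resolution is set by one segment, not the whole image (300 ms / 8 here).
time_per_line = 300.0 / n_pe                   # [ms], assumed
print(f"temporal resolution ~ {time_per_line * lines_per_segment:.1f} ms per heart phase")

# Reorder: place every acquired line back at its phase-encoding index.
kspace = np.zeros((n_pe, 256), dtype=complex)  # (phase, readout), toy array
for seg, lines in acquired.items():
    data = np.random.randn(lines_per_segment, 256) + 0j   # stand-in for acquired echoes
    kspace[lines, :] = data                    # interleave into -kmax..+kmax order

image = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace)))
```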
2.3.2 Echo Planar Imaging and Spiral Scanning
With EPI, multiple k-space lines can be collected after a single RF pulse, as illustrated in Figure 13. Compared to segmented GRE, EPI has two major advantages in myocardial MR. First, since with EPI the untagged tissue magnetization experiences a smaller number of RF excitation pulses, the contrast between tagged and nontagged myocardium is substantially higher [11,21]. Myocardial tagging with EPI is also implemented with segmented
FIGURE 13 Timing diagram of the RF and gradient pulses used with the echo planar imaging (EPI) sequence with blipped phase encoding gradients, for a center-out segmented k-space trajectory.
k-space acquisition, like the segmented GRE. However, with EPI a substantially smaller number of segments is required. Second, since with EPI each segment covers a larger portion of the k-space, a complete image can be collected within fewer heartbeats, and thus multiple slices can be obtained in a single breath-hold period. This provides an additional benefit: multislice, multiphase EPI acquisitions provide parallel short-axis slices that are registered correctly in space because they are obtained within the same breath-hold. EPI has some limitations: it is susceptible to inhomogeneities of the main magnetic field, which can be addressed with localized field shimming [30], and it is also susceptible to chemical shift artifacts, which can be reduced with the use of spectral-spatial selective RF pulses [31]. In addition to EPI, a time-efficient technique to cover the k-space is segmented spiral scanning [22]. With segmented spiral scanning, after an excitation RF pulse, the two magnetic field gradients orthogonal to the slice selection gradient generate a long center-out spiral trajectory. As a result, after a single excitation, more k-space points are sampled as compared to the GRE. In that respect, the operation principle of spiral scanning is similar to that of EPI. However, spiral scanning has an inherent advantage: the readout gradient waveforms are flow compensating by design [32]. Spiral scanning shares a disadvantage with EPI, in that chemical shift effects require spectral-spatial pulses. Furthermore, the k-space points lie along a spiral trajectory, so the raw data must be interpolated to an orthogonal grid before the Fourier transform operation is applied.
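The final interpolation step can be sketched as follows, assuming an Archimedean spiral trajectory and simple linear interpolation with SciPy; practical spiral reconstructions instead use convolution gridding with density compensation, so this is only a schematic.

```python
import numpy as np
from scipy.interpolate import griddata

# Sketch (assumed trajectory and data): interpolating spiral k-space samples
# onto a Cartesian grid before the 2-D Fourier transform.

n_samp, k_max = 20000, 0.5                       # samples along the spiral, cycles/pixel
theta = np.linspace(0, 32 * 2 * np.pi, n_samp)   # 32-turn Archimedean spiral
r = k_max * theta / theta[-1]
kx, ky = r * np.cos(theta), r * np.sin(theta)

signal = np.exp(-(r / 0.2) ** 2) + 0j            # stand-in for acquired k-space data

grid_1d = np.linspace(-k_max, k_max, 128)
gx, gy = np.meshgrid(grid_1d, grid_1d)

# Interpolate real and imaginary parts separately onto the Cartesian grid.
re = griddata((kx, ky), signal.real, (gx, gy), method="linear", fill_value=0.0)
im = griddata((kx, ky), signal.imag, (gx, gy), method="linear", fill_value=0.0)
kspace_cartesian = re + 1j * im

image = np.abs(np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace_cartesian))))
```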
3 CONCLUSIONS
In this chapter, we have provided an overview of tagged MR cardiac imaging techniques published to date. These included the radial tagging, SPAMM, CSPAMM, and DANTE methods. We also gave an overview of EPI, spiral scanning, and segmented GRE, which may be fitted with tagging pulses for fast imaging of heart motion. An important challenge before us is to reduce the time required to perform accurate quantitative analysis of the collected data, and to design methods for visualization of the kinematics of heart wall motion. We will continue this discussion in Chapter 15, which is devoted to these important topics.

ACKNOWLEDGMENTS

This work was supported in part by grants from the Whitaker Biomedical Engineering Foundation, the National Science Foundation through grant IRI9796207, and the National Institutes of Health through grants HL-57628 and HL-64217.

REFERENCES

1. W. Grossman. Assessment of regional myocardial function. J. Amer. Coll. Cardiology 7(2), 327–328, 1986.
2. O. C. Morse, J. R. Singer. Blood velocity measurements in intact subjects. Science 170, 440–441, 1970.
3. E. A. Zerhouni, D. M. Parish, W. J. Rogers, A. Yang, E. S. Shapiro. Human heart: tagging with MR imaging—a method for noninvasive assessment of myocardial motion. Radiology 169, 59–63, 1988.
4. L. Axel, L. Dougherty. MR imaging of motion with spatial modulation of magnetization. Radiology 171, 841–845, 1989.
5. E. R. McVeigh, E. A. Zerhouni. Noninvasive measurement of transmural gradients in myocardial strain with MR imaging. Radiology 180, 677–683, 1991.
6. L. Axel, L. Dougherty. Heart wall motion: improved method of spatial modulation of magnetization for MR imaging. Radiology 172, 349–350, 1989.
7. T. J. Mosher, M. B. Smith. A DANTE tagging sequence for the evaluation of translational sample motion. Magn. Reson. Med. 15, 334–339, 1990.
8. B. D. Bolster, Jr., E. R. McVeigh, E. A. Zerhouni. Myocardial tagging in polar coordinates with use of striped tags. Radiology 177, 769–772, 1990.
9. S. E. Fischer, G. C. McKinnon, S. E. Maier, P. Boesiger. Improved myocardial tagging contrast. Magn. Reson. Med. 30, 191–200, 1993.
10. A. J. de Crespigny, T. A. Carpenter, L. D. Hall. Cardiac tagging in the rat using a DANTE sequence. Magn. Reson. Med. 21, 151–156, 1991.
11. E. R. McVeigh, E. Atalar. Cardiac tagging with breath-hold cine MRI. Magn. Reson. Med. 28, 318–327, 1992.
12. N. V. Tsekos, M. Garwood, H. Merkle, Y. Xu, N. Wilke, K. Ugurbil. Myocardial tagging with B1 insensitive adiabatic DANTE inversion sequences. Magn. Reson. Med. 34, 395–401, 1994.
13. E. R. McVeigh, B. D. Bolster, Jr. Improved sampling of myocardial motion with variable separation tagging. Magn. Reson. Med. 39, 657–661, 1998.
14. W. S. Kerwin, J. L. Prince. A k-space analysis of MR tagging. J. Magn. Reson. 142, 313–322, 2000.
15. J. Pauly, D. Nishimura, A. Macovski. A k-space analysis of small-tip-angle excitation. J. Magn. Reson. 81, 43–56, 1989.
16. E. R. McVeigh. MRI of myocardial function: motion tracking techniques. Magn. Reson. Imaging 14, 137–150, 1996.
17. S. E. Fischer, M. Stuber, M. B. Scheidegger, P. Boesiger. Towards high resolution myocardial tagging. Proceedings of the 12th Annual Meeting, Society of Magnetic Resonance in Medicine, New York, NY, 1993.
18. S. E. Fischer, G. C. McKinnon, M. B. Scheidegger, W. Prins, D. Meier, P. Boesiger. True myocardial motion tracking. Magn. Reson. Med. 31, 401–413, 1994.
19. G. Bodenhausen, R. Freeman, G. A. Morris. Simple pulse sequence for selective excitation in Fourier transform NMR. J. Magn. Reson. 23, 171–175, 1976.
20. M. Tyszka, N. J. Shah, R. C. Hawkes, L. D. Hall. Visualization of fluid motion by tagged magnetic resonance imaging. Flow Meas. Instrum. 2, 127–130, 1991.
21. C. Tang, E. R. McVeigh, E. A. Zerhouni. Multi-shot EPI for improvement of myocardial tag contrast: comparison with segmented SPGR. Magn. Reson. Med. 33, 443–447, 1995.
22. H. Sakuma, K. Takeda, C. B. Higgins. Fast magnetic resonance imaging of the heart. Eur. J. Radiol. 29, 101–113, 1999.
23. N. R. Clark, N. Reichek, P. Bergey, E. A. Hoffman, D. Brownson, L. Palmon, L. Axel. Circumferential myocardial shortening in the normal human left ventricle. Assessment by magnetic resonance imaging using spatial modulation of magnetization. Circulation 84, 67–74, 1991.
24. A. A. Young, L. Axel. Three-dimensional motion and deformation of the heart wall: estimation with spatial modulation of magnetization—a model-based approach. Radiology 185, 241–247, 1992.
25. S. B. Reeder, E. R. McVeigh. Tag contrast in breath-hold CINE cardiac MRI. Magn. Reson. Med. 31, 521–525, 1994.
26. W. J. Rogers, Jr., E. P. Shapiro, J. L. Weiss, M. B. Buchalter, F. E. Rademakers, M. L. Weisfeldt, E. A. Zerhouni. Quantification of and correction for left ventricular systolic long-axis shortening by magnetic resonance tissue tagging and slice isolation. Circulation 84, 721–731, 1991.
27. E. Atalar, E. R. McVeigh. Optimum tag thickness for the measurement of motion by MRI. Proceedings of the Book of Abstracts, 11th Annual Meeting, Society of Magnetic Resonance in Medicine, 1992.
28. E. R. McVeigh, L. Gao. Precision of tag position estimation in breath-hold CINE MRI: the effect of tag spacing. Proceedings of the 12th Annual Meeting, Society of Magnetic Resonance in Medicine, New York, NY, 1993.
29. C. C. Moore, S. B. Reeder, E. R. McVeigh. Tagged MR imaging in a deforming phantom: photographic validation. Radiology 190, 765–769, 1994.
30. F. A. Jaffer, H. Wen, R. S. Balaban, S. D. Wolff. A method to improve the B0 homogeneity of the heart in vivo. Magn. Reson. Med. 36, 375–383, 1996.
31. F. Schick, J. Forster, J. Machann, R. Kuntz, C. D. Claussen. Improved clinical echo-planar MRI using spatial-spectral excitation. J. Magn. Reson. Imaging 8, 960–967, 1998.
32. D. G. Nishimura, P. Irarrazabal, C. H. Meyer. A velocity k-space analysis of flow effects in echo-planar and spiral imaging. Magn. Reson. Med. 33, 549–556, 1995.
7 Functional MR Image Visualization and Signal Processing Methods
Alex R. Wade and Brian A. Wandell
Stanford University, Stanford, California
Thomas P. Burg
Massachusetts Institute of Technology, Cambridge, Massachusetts
1 INTRODUCTION
Functional magnetic resonance imaging (fMRI) indirectly measures the activity of the neurons within the gray matter of the brain. The most widespread method of measuring this activity is based on the blood oxygenation level dependent (BOLD) signal. As Ogawa and others showed, the MR signal can be used to measure the relative concentration of blood oxygen in different positions within the human brain [1,2]. This signal in turn depends on the relative activity of the nearby neurons. Because this signal can be measured using many conventional MR scanners distributed around the world, the opportunity to measure activity in the human brain has been made available to many scientists. The portions of the visual cortex that play an active role in neural computation have a relatively simple surface structure: they have the form of a folded sheet. While the brain as a whole forms a fairly large three-dimensional solid, the interior of this solid comprises almost exclusively a mass of white matter (axons) that connect the neurons. The neurons them-
selves fall within a fairly thin (3–5 mm) sheet, the cortical gray matter that covers the surface of the white matter. It is possible to take advantage of the simple shape properties of the cortical gray matter when analyzing fMRI data in at least two ways. First, restricting the analysis of fMRI signals to activity within the gray matter improves the overall sensitivity of analysis methods. Because we know a priori that the signals of interest are in the gray matter, we can reduce the number of false alarms (and thus increase the signal-to-noise ratio) by examining only activity that falls within the gray matter. Hence the segmentation of white and gray matter serves as a useful signal-processing tool. Second, because the region of interest is a surface and not a solid, there are many opportunities derived from computer graphics to render and visualize the activity in intuitive ways. And indeed, there has been a rapid increase in the number of laboratories using functional magnetic resonance imaging (fMRI). The increase in the number of scientists doing neuroimaging studies has produced an explosion of new ideas and shared software tools for visualizing and analyzing neuroimaging data. In this chapter we describe two contributions from our laboratory. One is a method for visualizing and rendering the signals measured along the surface of the human brain [3]. The second is a method for deriving activity maps in real time. This analysis method is useful for applications that require immediate verification of the experiment or even feedback control during the experiment itself. 2
CORTICAL SURFACE VISUALIZATION
Because in its natural form the sheet of gray matter is very heavily folded, viewing the entire gray matter surface in a single glance is impossible. Neural activity that falls at locations where the cortical sheet dips downward, into a fold (sulcus), is blocked from view by the protrusions (gyri). The oval shape of the surface makes it difficult to see the lateral and medial surfaces in a single view. Yet to analyze neural signals it is useful for the scientist to be able to see the entire pattern of activity at once. Such a view can clarify the spatial relationship between different regions that are activated by the experimental stimulus. Here we describe a method we have developed to flatten the visual cortex of human and monkey brains [3]. Our work forms part of a series of developments, spanning the last fifteen years, for visualizing the spatial distribution of activity within the sheet of gray matter by creating a flattened representation [3–8]. The challenges in developing algorithms that flatten the cortical surface can be divided into two main areas. The first challenge is to find a method that converts the spatial relationships inherent in the MR instrumental measurements to the relationships inherent in the cortical sur-
face. The difference between these two representations is illustrated in Fig. 1. The instrumental sampling of the MR signal is naturally thought of as falling on a regular grid of points that span the measurement volume. Adjacent points are assumed to be connected for averaging and visualizing, and this forms a rectangular sampling lattice of the MR measurements (Fig. 1a). Adjacency relationships inherent in the cortical surface, however, do not follow this three-dimensional connectivity. Rather, the connections trace out a two-dimensional sheet along a complicated path within this volume, as illustrated in Fig. 1b. Notice that adjacent points in Fig. 1a, shown by the links, are not adjacent in the cortical sheet shown in Fig. 1b. Indeed, a pair of points that are adjacent in the instrumental sampling can be on op-
FIGURE 1 (a) The sampling grid of the fMRI signal and (b) the spatial structure of cortical gray matter. The nearest neighbors in the cortex differ from the usual assignment of nearest neighbors to the fMRI signal, so that adjacent points in the fMRI grid are not adjacent in gray matter. Methods that average across the fMRI grid, without respecting the cortical neighborhoods, will provide an inaccurate picture of brain activity. (Reprinted with permission from Wandell et al., 2000.)
posite sides of a sulcus and quite far from one another when measured along the surface of the brain. The neighborhood relationships implied by the adjacency defines a topology, and we describe the discrepancy in the identity of neighboring points as a topological inconsistency between the instrumental samples and the cortical surface samples. Several aspects of visualization and analysis can be improved by using the proper cortical gray matter neighborhoods to guide visualization and analysis. First, the neighbor relationships can be used to create flattened representations so that activity from the entire surface can be viewed in a single image. Second, the neighbor relationships serve as the foundation for creating a three-dimensional rendering of the brain. In many cases, rendering the boundary between the gray and white matter provides a satisfactory method for visualizing most of the surface area in a small number of views. With the widespread use of inexpensive processors for three-dimensional rendering, these visualizations are practical and satisfactory for interactive data analysis. Third, the neighbor relationships can be used for statistical processing. It is greatly preferable to perform spatial signal processing with respect to the cortical neighborhoods rather than the instrumental spatial neighborhoods. For example, the two highlighted points shown in Fig. 1 are adjacent in the MR measurements; their values would be blurred under conventional convolution operations applied to three-dimensional brain data. With respect to the brain, however, the positions are on opposite sides of a sulcus and thus not neighbors. Hence if averages are taken with respect to the cortical neighborhoods, the signals are not near one another and their values would not be combined. Combining spatial measurements in a way that respects the gray matter neighborhoods is superior to methods based on an instrumental sampling grid. We believe that the use of cortical neighborhood relationships will become a very important part of data analysis in the next few years. 2.1
The Shape of the Cortical Surface
The character of the cortical shape is hard to appreciate or measure directly because the main appearance at a distance is merely that of a surface folded into a crumpled form. In examining small pieces of the brain, however, it is possible to see the local shape. The typical local shape of the cortical surface is illustrated in Fig. 2a. Over significant portions, the surface is a simple plane. At other points, extrusions introduce bubbles that distort the sheet. If there were no bubbles representing regions with nonzero intrinsic curvature, or technically, if the surface were developable, it would be possible computationally (or physically) to unfold the crumpled form into a
FIGURE 2 Why flattening distorts distances. (a) The surface contains three regions. One is planar, a second has a portion of a sphere inserted, and the third has a portion of an ellipsoid inserted. The perimeters of the two circles are the same, but the areas contained within them differ. The flattened representations described here mainly distort the distance relationships between nodes on the spherical and ellipsoidal regions. (b) The first stage of the unfolding procedure is to lay down a planar graph onto the 3-D sheet. (Reprinted with permission from Wandell et al., 2000.)
simple plane. The planar representation would preserve the distances between points as measured along the surface [9]; when distances are preserved, other metric quantities (e.g., angles and areas) are preserved as well. Because the cortical surface includes bubbles, no flattened representation can precisely preserve the distance relationships. One way to see why a distance-preserving flattened representation is impossible is to consider the area of the cortex within the regions marked on Fig. 2a. The circle shown
on the plane contains much less surface area (πr²) than the equal-perimeter circle that bounds the ellipsoidal extrusion (>2πr²). There is no way to flatten this surface so that the area contained within these circles maps to a flattened representation that simultaneously preserves the distance relationships between points. Knowing that it is impossible to preserve distances, one must design algorithms that choose among the various quantities that can be preserved. When creating a flattened representation, we think it is most important to preserve the local topology in the sense that the flattened representation should contain no twists or folds of the original surface. One way to describe this intuition more formally is to begin with a mesh that covers the cortical surface (Fig. 2b). As the mesh is drawn, it contains no intersecting edges and thus forms a planar graph [10]. Like Drury et al. [6], we believe that the first objective in creating a flattened representation should be to preserve the neighborhood relationships between the sample nodes so that when the flattened representation is created, the planar graph is still a planar graph. As we describe below, if one can find a planar graph that spans the surface one would like to unfold, then the nodes can be placed on a plane as a planar graph in a single step [3].
2.2 Flattening Algorithm
In a complete flattening algorithm, a number of different steps must be performed. More complete descriptions of the flattening process are described elsewhere; briefly, the basic strategy that has been adopted by several groups is to define the solid mass of white matter first [4,8,11,12]. From this solid, a surface can be defined that characterizes either the white/gray boundary or a surface that passes through the middle of the gray matter. A triangular mesh for this surface can be defined by using standard computer graphics methods (in this case, the ‘‘marching cubes’’ algorithm [13]) or by simply defining a mesh on the outside surface of the white matter voxels [8]. The nodes and edges of these meshes can serve as a graph in the calculations below. Once the mesh is defined, the flattened representation can be derived by the following computation. First, choose a region of the brain for flattening. This need not contain the entire hemisphere, as one may not have functional data from the entire brain. One reasonable method for selecting a region to unfold is to find a single node (the ‘‘start node’’) and then include all of the nodes that fall within some distance of the start node. The nodes and links between nodes (edges) that cover this region of the brain define the mesh that will be placed on the plane. By using this method to select the nodes, we can be assured of defining a convex set for flattening; as we will see below, this can be an advantage.
The nodes can be divided into two groups: perimeter and interior nodes. A node in a planar graph can be identified as being on the perimeter by determining whether the nodes connected to it form a closed loop. Assign the start node to the planar position (0, 0). The perimeter points are first placed on a circle whose radius matches the distance from the start node to the perimeter. Then, maintaining the angular position, the perimeter node positions are adjusted individually so that their planar and cortical distances from the start node approximately match. To place the interior nodes on the plane while preserving the planar graph, each node is assigned an initial location at a position equal to the average position of its neighbors (the sample nodes connected by an edge). When the neighbors form a convex shape, placing the node at the average position of the neighbors will not introduce any edge intersections, and the mesh will continue to be a planar graph [14–16]. If the nodes form a concave set, this is not guaranteed. But in our experience with hundreds of unfolds, the mesh generation algorithm combined with the initial placement procedure introduces no edge intersections among the edges of interior nodes. The initial placement of the interior nodes can be implemented in the following linear calculation. Suppose there are n interior nodes and p perimeter nodes. Let X be the n × 2 matrix containing the planar positions of the interior nodes. Let N be an n × n matrix whose (i, j)th entry is 1/m_i, where m_i is the number of edges connected to the ith node, or zero if (i, j) are not connected. The p × 2 matrix X_0 contains the two-dimensional locations assigned to the perimeter nodes. Finally, P is an n × p matrix whose (i, j)th entry is 1/m_i when perimeter node j is connected to sample node i. Each node is at the average position of its neighbors when the following equation is satisfied:

X = NX + PX_0

Hence, the initial positions X can be determined from the edges attached to each node (specified in matrices N and P) and the initial positions of the perimeter points (specified in X_0). These positions are found by solving the linear equation

X = (I − N)^{-1} P X_0

Placing the three-dimensional nodes and edges into a two-dimensional plane using a method that preserves the mesh topology, and in a single step, is a computationally efficient beginning to a flattening algorithm. This initial placement produces planar positions that do not match the cortical distances very well. However, it is possible to adjust the positions of the points within
the plane iteratively, preserving the planar graph, so that the planar distances match the cortical distances more closely [3].
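A toy version of the initial placement, assuming a small hand-built mesh rather than a real cortical surface, can be written directly from the linear equation above.

```python
import numpy as np

# Sketch of the single-step initial placement on a toy 5-node mesh.
# Two interior nodes and three perimeter nodes with assumed connectivity;
# a real cortical mesh would use sparse matrices and a sparse solver.

# interior-interior and interior-perimeter adjacency (1 = connected by an edge)
A_ii = np.array([[0, 1],
                 [1, 0]])
A_ip = np.array([[1, 1, 0],
                 [0, 1, 1]])

m = A_ii.sum(axis=1) + A_ip.sum(axis=1)      # number of edges at each interior node
N = A_ii / m[:, None]                        # N[i, j] = 1/m_i if interior j is a neighbor
P = A_ip / m[:, None]                        # P[i, j] = 1/m_i if perimeter j is a neighbor

# Perimeter nodes are first fixed on a circle (three assumed positions here).
X0 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [-1.0, 0.0]])

# Each interior node sits at the mean of its neighbors: X = N X + P X0.
X = np.linalg.solve(np.eye(len(N)) - N, P @ X0)
print(X)   # 2-D starting positions of the interior nodes
```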
2.3 An Example
Figure 3 uses a simple test surface to illustrate the distance distortions introduced by the unfolding process. The test surface consists of a plane with several extruding bubbles that vary in size. Some of the bubbles just emerge above the surface while others rise significantly; in this way, the area of the bubble with respect to the circumference of its intersection with the plane differs considerably. This structure is a useful test surface because the sizes of the bubbles span those observable in gray matter. Figure 3b shows the planar locations of a set of sample nodes selected on the surface. These nodes were placed initially in the plane using the linear method described above. Then the node positions were iteratively adjusted to match planar distances with distances measured along the original surface. It is evident from examining the sample node positions that in the regions near the large bubbles the node locations are compressed. This introduces distortions between the distance measured in the original surface and in the planar plots. Figure 3c measures this distance distortion by comparing the distance between points measured along the original surface (dm) with the distance measured through the corresponding nodes in the planar representation (dp). In order to compare distance errors for both nearby and widely separated points, we plot the fractional distance error, (dp − dm)/dm. With this formula, positive and negative errors represent an expansion and compression of the planar distances, respectively. For most of the points the distance distortions are relatively small. The largest distortions, near the large bubble, are on the order of 40%. In our experience, flattening the human cortex produces distance errors on the order of 15–20%. Hence the human cortex appears to have bubbles as large as the middle-size sphere. We will return to an application of the flattened representation at the end of this paper. Further details as well as analyses of the unfolding method described here can be found in Ref. 3 and at our Internet site http://white.stanford.edu/~brian.
3 RAPID CALCULATION OF ACTIVITY MAPS: k-CORRELATION
Next, we describe a method for rapidly converting fMRI data, acquired into k-space, into covariance and correlation maps. We developed this method originally to serve two purposes. First, in many fMRI experiments scanning time is extremely limited. For example, studies of patients with special dis-
FIGURE 3 (a) A test surface comprising a plane with different portions of a sphere inserted. (b) The positions of the sample nodes after flattening. (c) A histogram summarizing the distance errors (distance in cortex minus distance on the plane) between sample nodes. (Reprinted with permission from Wandell et al., 2000.)
orders often require significant experimental arrangements that are hard to repeat, and the patients are generally unwilling or unable to return. By rapidly converting the fMRI data while the subject is in the scanner, the experimenter can confirm that the data from individual sessions are satisfactory. Second, if the activation maps can be created during an experimental session, then the maps themselves can be used to guide the next round of experimental stimulations or tasks. This will permit experimenters to design truly interactive tasks in which the stimulus is modified in order to track brain activity. The algorithms we use to speed up these computations operate directly on the raw k-space data coming from the scanner. Calculating the covariance map using k-space data means that o-space phase information is automatically included in the analysis. In the second part of this section, we demonstrate that in addition to speeding up the generation of activation maps, this may also improve the sensitivity of the analysis compared to traditional methods.
3.1 Efficient Computations Based on k-Space Data
The standard method for performing correlation analysis on cortical BOLD fMRI signals is computationally demanding. This method transforms k-space data into object (o) space, computes the magnitudes of the transformed values, and correlates the resulting real time series with some approximation of the expected BOLD response due to the stimulus. After thresholding, the resulting cortical correlation map shows regions where the temporal BOLD signal is significantly correlated with the presence of the stimulus [17]. When the goal of the analysis is to produce a covariance map, rather than a full time series, the standard method can be greatly simplified: it is possible to reduce the computational burden by producing a covariance image directly from the k-space data. When the goal of the analysis is to produce a correlation map, many of the key steps can be performed entirely in k-space. Performing these calculations in k-space eliminates the need to transform each acquired k-space image into o-space, and when a spiral acquisition trajectory is used, the gridding and interpolating steps are also avoided. While it is not possible to compute the entire correlation map in k-space, we have found an efficient method for approximating the correlation using a limited number of k-space images. We distinguish maps generated using magnitudes in o-space from maps generated using complex values in k-space by referring to them as o-correlation and k-correlation maps, respectively.
3.1.1 k-Covariance and k-Correlation
Mathematically, the generation of the o-space correlation map from a k-space data set K can be expressed as a sequential application of two linear oper-
ators (Fourier transforms) with an intermediate, nonlinear modulus operation: C0(x) =
兩Ft{兩F k⫺1{K}兩}兩0 sd(x)
(1)
The F_t operator represents the Fourier transform in the time domain, |F_k^{-1}| represents the modulus of the inverse FT in the spatial frequency domain, and sd(x) represents the standard deviation of the fMRI response at the point x in object space. Note that the operation |F_k^{-1}| applies to the entire k-space data set: we are obliged to transform every k-space image into o-space before finding the covariance at signal frequency ω₀. In a typical spiral acquisition data set, this means performing over 100 gridding, interpolation, and 2-D Fourier transform operations for each of 16 image acquisition planes. If we are prepared to include the entire complex o-space signal in the analysis, the correlation takes the form

C_{\omega_0}(x) = \frac{\left| F_t\{ F_k^{-1}\{K\} \} \right|_{\omega_0}}{\mathrm{sd}(x)}    (2)
The numerator is now a direct combination of two linear transforms, one in time, the other in space. Since time and space are separable, these two linear operations are interchangeable, and the correlation can be expressed as

C_{\omega_0}(x) = \frac{\left| F_k^{-1}\{ F_t\{K\}_{\omega_0} \} \right|}{\mathrm{sd}(x)}    (3)
The benefit of writing the numerator in this way is that we now require only a single transform from k-space to o-space. We have effectively eliminated all but one of the k-space to o-space transforms required to calculate the numerator. The numerator is referred to as the covariance of the signal with a harmonic at frequency ω₀. It is an absolute measure of how much of the signal lies at that frequency. By computing this value for every pixel in the imaging plane, a functional activation covariance map can be generated. Figure 4a shows a typical covariance map for a single imaging plane of an fMRI scan. An activated functional area is labeled y. Note that there are other regions with high covariance in the image (x). These are due to draining blood vessels that are of no interest to us in terms of mapping brain function. They have a large covariance simply because they contain a lot of blood and the total signal in that region is therefore very high.
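As an illustration of the computation in Eq. (3), a minimal sketch could look as follows. It assumes already gridded, Cartesian k-space data in a NumPy array of shape (T, Ny, Nx) and a known temporal-frequency index for the stimulus; the function and variable names are illustrative, not part of the original processing pipeline.

```python
import numpy as np

def k_covariance_map(kspace_series, stim_bin):
    """Covariance map computed directly from k-space data, in the spirit of Eq. (3).

    kspace_series : complex ndarray, shape (T, Ny, Nx)
        One (already gridded, Cartesian) k-space image per time frame.
    stim_bin : int
        Nonzero index of the stimulus frequency in the temporal DFT.
    """
    # Temporal Fourier transform at every k-space location (axis 0 is time).
    spectra = np.fft.fft(kspace_series, axis=0)

    # Only the two temporal-frequency planes at +/- the stimulus frequency are
    # back-transformed to o-space: one inverse 2-D FFT each, instead of one per frame.
    pos = np.abs(np.fft.ifft2(spectra[stim_bin]))
    neg = np.abs(np.fft.ifft2(spectra[-stim_bin]))

    # Per the footnote in the text, the contributions at +/- the stimulus frequency
    # are combined by the root sum of squares of their magnitudes.
    return np.fft.fftshift(np.sqrt(pos ** 2 + neg ** 2))
```

Dividing this map voxelwise by an o-space standard-deviation estimate, as discussed below, would then yield the k-correlation map.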
FIGURE 4 Covariance map (a) and correlation map (b) generated from the same functional data set. The scan was designed to activate a region of the brain involved in motion processing (marked y). Note the regions of high amplitude in the covariance scan where the scan plane intersects a major blood vessel (x).
Correlation, in contrast to covariance, is a relative measure: it tells us what proportion of the total signal lies at the frequency ω₀.¹ Figure 4b shows the correlation map generated from the same data set as Fig. 4a. The signal from the blood vessels is now highly attenuated, but the signal from the activated cortex is still present. This normalization step is achieved by dividing the signal at each point by the total signal power or, equivalently for a zero-mean signal, the standard deviation sd(x). In general, we are more interested in the correlation map than the covariance map, as it is less likely to be distorted by the presence of large blood vessels and other artifacts and allows for precise statistical analysis. However, the covariance can be useful as a first quick indicator of the quality of the scan; gross problems, such as poorly placed receiver coils or excessive patient motion, can be detected in the covariance map.
3.1.2 Standard Deviation Map Estimates
This final normalization step cannot be computed efficiently in k-space. The generation of the standard deviation at every point in o-space requires a nonlinear modulus operator, and the division in o-space corresponds to a computationally demanding deconvolution in k-space. This means that while we can calculate the covariance map for a given data set almost immediately, we still have to back-transform every k-space image to o-space in order to compute the standard deviation and calculate the correlation map.

¹ Strictly speaking, because we are dealing with the entire complex data set, we must combine the two signals at ±ω₀. It can be shown that this combination can be achieved by taking the root sum of squares of the magnitudes of the signals.
Nevertheless, there is one final improvement that we can make. The standard deviation at any point is calculated from all the values of that point over time. Effectively, we are estimating the true standard deviation or power of the zero-mean signal from a finite set of sample points at discrete time intervals. What is the error in this estimate if we use fewer frames in order to compute the standard deviation? The answer, according to population sampling theory, is that in the ideal case, the error should decrease approximately as a function of the reciprocal of the number of frames sampled. In theory, this means that we can use relatively few frames in order to obtain a good estimate of the standard deviation. In practice, this will depend on how well the noise statistics of the MR signal approach the ideal case. Since we can use the entire data set to calculate the ‘‘true’’ signal standard deviation at any point in the image, we can examine how this estimate is affected by using fewer and fewer frames. Figure 5 shows a plot of the estimated standard deviation error against the number of frames used to derive that estimate for data from a typical MR scan. The theoretical 1/N line is also plotted for comparison. It can be seen that the error in the SD estimate falls off rapidly with the number of samples used. While the improvement in the estimate of the SD is not quite ideal, it is good enough
FIGURE 5 Using fewer sample frames to construct an estimate of the standard deviation speeds up the calculation. The error of this undersampled estimate compared to the SD map generated from all the frames is shown above, as a function of the number of frames sampled. Population sampling theory predicts an error that decreases approximately as 1/N (also plotted).
for us to be able to use a limited subset of the entire k-space data set in order to calculate a good estimate of the standard deviation map. In this example, we would need to use approximately 30 frames to obtain a 95% accurate estimate of the standard deviation map. We can therefore estimate the correlation in Eq. (3) to an accuracy of 95% using approximately one-third of the k-space to o-space transformations that we would normally need.
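A minimal sketch of this subsampling experiment is given below. It assumes an o-space magnitude time series of shape (T, Ny, Nx); the particular error norm (mean absolute deviation relative to the mean SD) and the random frame selection are assumptions made for illustration, since the chapter does not specify them.

```python
import numpy as np

def sd_map(o_series):
    """Standard deviation over time at every o-space voxel (input shape (T, Ny, Nx))."""
    return o_series.std(axis=0)

def sd_subset_error(o_series, n_frames, seed=0):
    """Mean relative error of an SD map estimated from n_frames randomly chosen frames."""
    rng = np.random.default_rng(seed)
    full = sd_map(o_series)                                   # "true" SD from all frames
    pick = rng.choice(o_series.shape[0], size=n_frames, replace=False)
    subset = sd_map(o_series[pick])                           # undersampled SD estimate
    return np.abs(subset - full).mean() / full.mean()

# Example: error as a function of the number of frames used, as plotted in Fig. 5.
# errors = [sd_subset_error(o_series, n) for n in range(5, o_series.shape[0], 5)]
```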
3.2 SNR Improvement
While standard correlation calculations use the magnitude of the o-space data [see Eq. (1)], the k-correlation method uses the entire complex-valued k-space data. By using all the imaging data in this way, we have the potential to improve the signal-to-noise ratio of the correlation analysis. However, we also run the risk of introducing additional noise from the complex phase component of the BOLD signal. In this section, we describe ways of removing noise from the o-space phase signal. We then compare the sensitivity of the methods (o-correlation and k-correlation) in two ways. First, we describe receiver operating curve (ROC) analyses using fMRI data with added targets. Under our measurement conditions, k-correlation improves signal discrimination (d′). Second, we examine activity measurements of signals measured in the visual pathways with known properties. The activity maps we calculate using k-correlation have activation patterns similar to those obtained using conventional correlation analysis.
3.2.1 Information in the o-space Phase
It is not generally appreciated that the o-space complex phase holds any useful information at all (although see [18]). In fact, one of the common justifications of the modulus operation during o-correlation is that it removes noise that would otherwise be introduced from the complex phase and improves the final correlation map. The o-space phase component is certainly susceptible to noise, including noise sources such as motion and flow artifacts that affect the magnitude signal to a much lower degree. However, much of the phase noise is systematic and can be attenuated by applying simple trend removal procedures to the data. Low-order trend removal is standard practice when preprocessing the o-space magnitude signal. In general, trends are routinely removed to at least 1st order, and many schemes exist for digitally filtering the o-space magnitude signal further in order to remove noise due to the cardiac cycle and respiration [19–21]. Our fMRI studies with phantom targets show that fitting and subtracting an exponential function from the phase signal at each point removes much of the noise and dramatically improves the detectability of
residual signals at the stimulus frequency. Trend removal can also be performed directly in k-space using a 3rd order polynomial, and this corrects for noise in both the o-space phase and magnitude. The result of trend removal is illustrated in Fig. 6. Figure 6a shows standard o-space correlation maps from scans designed to localize the functional brain region marked with an arrow (MT). All the correlations in this image are calculated using the o-space magnitude signal. Figure 6b shows a correlation map based on only the raw phase component of the same complex data set. This correlation map is clearly inferior to the o-correlation map, and there is very little activity in the region indicated in Fig. 6a. However, Fig. 6c shows a correlation map generated from the same phase data after low-order trend removal. Note that the region localized in Fig. 6a is identifiable again. Although there is not an exact correspondence between the pixels identified by the magnitude and phase correlations, the same general areas are identified. This suggests that the phase component of the complex o-space signal contains useful information that is normally discarded.
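A minimal sketch of low-order trend removal on a single voxel's phase time series is shown below. It uses only a polynomial fit (the text also mentions an exponential term, which is omitted here for brevity); the trend order and helper names are illustrative assumptions.

```python
import numpy as np

def detrend_phase(phase_ts, order=3):
    """Remove a low-order polynomial trend from one voxel's phase time series.

    phase_ts : 1-D array of unwrapped phase values over time (radians).
    order    : polynomial order of the trend to fit and subtract.
    """
    t = np.arange(phase_ts.size)
    coeffs = np.polyfit(t, phase_ts, order)   # least-squares polynomial fit
    trend = np.polyval(coeffs, t)
    return phase_ts - trend

# Typical use: unwrap the phase first, then detrend voxel by voxel.
# phase = np.unwrap(np.angle(o_series), axis=0)          # o_series: complex (T, Ny, Nx)
# clean = np.apply_along_axis(detrend_phase, 0, phase)
```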
3.2.2 Signal-to-Noise Measurements
We investigated the potential improvement in correlation analysis SNR using receiver operating characteristic theory applied to simulated signals embedded in real fMRI data [17,22,23]. Specifically, we took complex fMRI time
FIGURE 6 Information in the complex o-space phase. (a) A correlation map generated from o-space magnitude data. Data are from an fMRI experiment designed to activate region MT. (b) A corresponding correlation map generated from the raw o-space phase data. There is little activation evident in region MT. (c) A correlation map generated from complex o-space phase data after low-order trend removal (3rd order polynomial and exponential terms). The region MT is active in this phase-correlation map.
series from real scanning sessions with a stimulus frequency of 6 cycles per scan. We then added a simulated complex signal at a different frequency to a small region of o-space and compared the ability of o-space and k-space correlation analysis to recover this signal. Our simulated "probe" signals had amplitudes similar to those found in real data. Note that the advantage of this method is that the noise characteristics of the data are realistic. If the complex o-space phase contains prohibitively high noise levels, as is generally believed, this will be reflected in poorer SNR for the k-correlation methods compared to o-correlation. ROC analysis allows us to determine a measure of signal detectability for a signal in a known location. Figure 7 shows the results of our ROC analysis using simulated probe signals. When the probe signal contains only a magnitude component, the two methods (o-correlation and k-correlation) perform almost equally. o-correlation provides a slightly superior discriminability index (d′) compared to k-correlation, reflecting that in this case there is no information in the phase component and it therefore only contributes noise. When a small, realistic phase signal (0.01 radians in size) is added to the phase component, the story is very different. Now the k-correlation method achieves a significantly higher discriminability score than o-correlation. We regularly see clear stimulus-related phase signals of 0.02 radians or larger in real functional scans. Our conclusion is that including the phase information in the correlation analysis does not necessarily degrade the signal with excessive noise levels. Performing simple low-order trend removal prior to computing the correlation allows us to use the entire complex o-space data set to improve the detectability of a signal that has some periodic phase component at the stimulus frequency.
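The ROC comparison itself reduces to sweeping a threshold over the correlation values of voxels that do and do not contain the probe signal. A minimal sketch is given below; the d′ estimate assumes equal-variance Gaussian distributions, which is an assumption made for illustration rather than a statement of the authors' exact procedure.

```python
import numpy as np

def roc_curve(corr_signal, corr_background, n_thresh=200):
    """False-alarm and hit rates for correlation values in signal vs. background voxels."""
    lo = min(corr_signal.min(), corr_background.min())
    hi = max(corr_signal.max(), corr_background.max())
    thresholds = np.linspace(lo, hi, n_thresh)
    hits = np.array([(corr_signal > th).mean() for th in thresholds])
    false_alarms = np.array([(corr_background > th).mean() for th in thresholds])
    return false_alarms, hits

def d_prime(corr_signal, corr_background):
    """Discriminability index under an equal-variance Gaussian assumption."""
    pooled_sd = np.sqrt(0.5 * (corr_signal.var() + corr_background.var()))
    return (corr_signal.mean() - corr_background.mean()) / pooled_sd
```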
3.2.3 Restricting Phase Maps Using Magnitude and Complex Correlation Maps
Another way of confirming that the k-correlation maps are reasonable substitutes for o-correlation maps is to use them to create retinotopic maps in the visual cortex. Early visual areas are retinotopic, that is, there is a one-to-one mapping between spatial positions in the visual field and spatial positions in each early visual area. If a subject is presented with a pattern that slowly changes position over time, the area of cortical activation corresponding to that pattern in each retinotopic visual area will also change. It is possible to use specially designed stimuli to map eccentricity in the visual field onto the cortex [24–27]. Retinotopic maps are best visualized on flattened representations of the visual cortex generated using methods described in the first part of this chapter. Retinotopic maps exist in only a subsection of the human visual cortex, and one of the best ways to visualize them is to show a color-coded eccentricity map superimposed onto the flattened cortical surface, as in Fig. 8.
FIGURE 7 Receiver operating curves (ROCs) generated from simulated signals embedded in real fMRI scanning data. The height of the curve above the identity line is a measure of that method's signal detection performance. Voxels containing the signal were identified using o-space magnitude correlation (o-correlation) or complex k-space correlation (k-correlation) methods. When no complex phase component is present in the signal, both methods perform similarly (k-correlation performing slightly worse than o-correlation due to residual noise in the complex o-space phase). However, when a small complex phase component (0.01 radians in amplitude) at the stimulus frequency is introduced, k-correlation performs significantly better than o-correlation.
Clearly, if the k-correlation values are closely linked to the physiological signal, phase maps that are restricted using these correlation values should also generate reasonable retinotopies with clear progressions of eccentricity and angular location in the areas identified by the conventional o-space correlation maps.
FIGURE 8 Visual field eccentricity maps, estimated using two types of correlation measure, are shown on a flattened representation of the occipital cortex. The color codes the position in the visual field that excites each location of cortex. (a) Voxels with o-space magnitude correlations greater than 0.3. (b) Voxels selected using the complex k-correlation map. The retinotopic layout is easily observed with both correlation measures, though the computation time of the second map is reduced by a factor of 3.
Figure 8 shows two retinotopic maps generated from the same data set and identical maps of temporal signal lag. The stimulus was a circular contrast-reversing annulus that increased in radial position over the course of one scan cycle. Regions in the center of the visual field are activated at the start of the scan cycle, but more peripheral regions have later activation times. Hence the time of activation serves to measure the position within the visual field that excites each region of the visual cortex. In the retinotopic maps shown, the foveal center is marked F, and signal lag increases with radial distance from the fovea as expected. Figure 8a shows the retinotopy restricted to a correlation threshold of 0.3 using the standard correlation map generated from o-space magnitude data. Figure 8b shows the same retinotopy restricted using the complex k-correlation map. The two maps are very similar. The k-space restriction maintains the solid retinotopic region R around the fovea that corresponds to the early visual areas. Cortical regions that either do not exhibit retinotopy or are outside the portion of the visual field stimulated in these experiments show subthreshold responses in both cases. These findings support the idea that k-correlation and o-correlation maps reflect the same underlying neural activity. The additional information introduced by including the complex o-space phase information does not significantly degrade the quality of these maps and may, under some conditions, improve them.
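The restriction step itself is a simple masking operation. The sketch below assumes a precomputed temporal-lag map and a correlation map (o- or k-correlation) of the same shape; the 0.3 threshold follows the text, while the array names are illustrative.

```python
import numpy as np

def restrict_lag_map(lag_map, corr_map, threshold=0.3):
    """Mask a temporal-lag (eccentricity) map with a correlation threshold.

    Voxels whose correlation falls below the threshold are set to NaN so that
    only reliably activated cortex is color-coded on the flattened surface.
    """
    restricted = lag_map.astype(float).copy()
    restricted[corr_map < threshold] = np.nan
    return restricted
```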
4 CONCLUSION
We have illustrated two novel signal-processing methods for visualizing and analyzing functional magnetic resonance imaging (fMRI) data. The visualization method (cortical flattening) improves our ability to appreciate the spatial organization of signals, something that is much harder when the data are visualized on the folded cortex. The second method (k-space correlation) uses both the magnitude and phase information in the MR data while still reducing the analysis time by a factor of 4. We evaluate the effects of including the magnitude and phase with a signal-to-noise analysis using signal detection theory. We conclude by demonstrating a complete system that produces retinotopic maps of the visual cortex from the k-space analysis on the representation of a flattened visual cortex.
REFERENCES

1. S Ogawa, D Tank, R Menon, J Ellermann, S Kim, H Merkle, K Ugurbil. Intrinsic signal changes accompanying sensory stimulation: functional brain mapping with magnetic resonance imaging. Proc Nat Acad Sci 89, 5951–5955, 1992.
2. KK Kwong, JW Belliveau, DA Chesler, IE Goldberg, RM Weisskoff, BP Poncelet, DN Kennedy, BE Hoppel, MS Cohen, R Turner, H Cheng, TJ Brady, BR Rosen. Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. Proc Nat Acad Sci 89, 5675–5679, 1992.
3. B Wandell, S Chial, B Backus. Visualization and measurement of the cortical surface. J Cognitive Neuroscience, in press.
4. GJ Carman, HA Drury, DC Van Essen. Computational methods for reconstructing and unfolding the cerebral cortex. Cerebral Cortex 5, 506–517, 1995.
5. AM Dale, MI Sereno. Improved localization of cortical activity by combining EEG and MEG with MRI cortical surface reconstruction: a linear approach. J Cog Neurosci 5, 162–176, 1993.
6. HA Drury, DCV Essen, CH Anderson, CW Lee, TA Coogan, JW Lewis. Computerized mappings of the cerebral cortex: a multiresolution flattening method and a surface-based coordinate system. J Cog Neuroscience 8, 1–28, 1996.
7. B Fischl, MI Sereno, AM Dale. Cortical surface-based analysis. II: inflation, flattening, and a surface-based coordinate system. Neuroimage 9, 195–207, 1999.
8. AM Dale, B Fischl, MI Sereno. Cortical surface-based analysis I. Segmentation and surface reconstruction. Neuroimage 9, 179–194, 1999.
9. B Horn. Robot Vision. MIT Press, Cambridge, 1986.
10. B Bollobas. Modern Graph Theory. Springer, New York, 1991.
11. PC Teo, G Sapiro, BA Wandell. Creating connected representations of cortical gray matter for functional MRI visualization. IEEE Trans Med Imaging 16, 852–863, 1997.
12. DC Van Essen, HA Drury. Structural and functional analyses of human cerebral cortex using a surface-based atlas. J Neurosci 17, 7079–7102, 1997.
13. Kitware. Kitware, 1998.
14. WT Tutte. Convex representation of graphs. Proc London Math Soc 10, 1960.
15. MS Floater. Parameterization and smooth approximation of surface triangulations. Computer Aided Geometric Design 14, 231–250, 1997.
16. B Levy, JL Mallet. Non-distorted texture mapping for sheared triangulated meshes. SIGGRAPH '98 25, 343–352, 1998.
17. PA Bandettini, A Jesmanowicz, EC Wong, JS Hyde. Processing strategies for time-course data sets in functional MRI of the human brain. Magn Reson Med 30, 161–173, 1993.
18. S Lai, GH Glover. Detection of BOLD fMRI signals using complex data. MRM meeting poster, 1997.
19. MH Buonocore, RJ Maddock. Noise suppression digital filter for functional magnetic resonance imaging based on image reference data. Magn Reson Med 38, 456–469, 1997.
20. X Hu, TH Le, T Parrish, P Erhard. Retrospective estimation and correction of physiological fluctuation in functional MRI. Magn Reson Med 34, 201–212, 1995.
21. RM Weiskoff, J Baker, J Belliveau, TL Davis, KK Kwong, MS Cohen, BR Rosen. In: Society for Magnetic Resonance in Medicine, New York, 1993.
22. RT Constable, P Skudlarski, JC Gore. An ROC approach for evaluating functional brain MR imaging and postprocessing protocols. Magn Reson Med 34, 57–64, 1995.
23. P Skudlarski, RT Constable, JC Gore. ROC analysis of statistical methods used in functional MRI: individual subjects. Neuroimage 9, 311–329, 1999.
24. BA Wandell. Computational neuroimaging of human visual cortex. Ann Rev Neuroscience 22, 145–173, 1999.
25. EA DeYoe, P Bandettini, J Neitz, D Miller, P Winans. Functional magnetic resonance imaging (FMRI) of the human brain. J Neurosci Methods 54, 171–187, 1994.
26. EA DeYoe, GJ Carman, P Bandettini, S Glickman, J Wieser, R Cox, D Miller, J Neitz. Mapping striate and extrastriate visual areas in human cerebral cortex. Proc Natl Acad Sci USA 93, 2382–2386, 1996.
27. SA Engel, BA Wandell, DE Rumelhart, AT Lee, MS Shadlen, EJ Chichilnisky, GH Glover. FMRI measurements in human early area VI: resolution and retinotopy. Investigative Ophthalmol Vis Sci 35, 1977, 1994.
8
Multiscale Segmentation of Volumetric MR Brain Images

Wiro J. Niessen, Koen L. Vincken, and Max A. Viergever
Image Sciences Institute, University Medical Center, Utrecht, The Netherlands

Joachim Weickert
University of Mannheim, Mannheim, Germany
1 INTRODUCTION
Image segmentation is a classical problem in computer vision and of paramount importance to medical imaging. The rapidly growing availability of three-dimensional reconstruction techniques has led to the development of numerous semiautomatic and automatic classification techniques to replace manual segmentation, which is a labor-intensive, subjective, and thereby nonreproducible procedure. In the field of medical image processing, segmentation of MR (brain) images has received ample attention [1–3]. MRI distinguishes itself from other modalities owing to its excellent discrimination of soft tissues. Correct tissue classification enables:

Quantitative (volumetric) analysis, e.g., to monitor lesion growth over time [4], or to perform volumetric analysis of brain tissue in neuropsychological disorders such as schizophrenia [5] or Alzheimer's disease [6,7].
Morphological analysis, e.g., to analyze and determine structural changes in the cortical surface [8,9], to label structures in the cortex [10], to study change in temporal lobe morphology in schizophrenic patients [11], or to assess intracranial deformation caused by brain tumors [12].

Visualization, e.g., for improved diagnosis [13] and for surgical planning and guidance [14–17].

Roughly, three main directions of MR image segmentation can be distinguished: (multispectral) classification methods, region-based methods, and boundary-based methods. Furthermore, prior knowledge of anatomical shapes can be used in combination with or incorporated into the various approaches [18–27].

In classification-based techniques, pixels are labeled as belonging to a certain class. This approach has been used by many researchers. The simplest technique is based on automatic threshold selection [28]. However, this method is sensitive to intratissue intensity variations that are common in MR acquisitions. Therefore adaptive schemes have been used [20,29] that combine classification with estimation of model parameters that describe, for example, intensity inhomogeneities in the image. Leemput et al. [25] interleave classification with parameter estimation, and utilize Markov random fields to incorporate contextual information. Held et al. [30] and Rajapakse et al. [31] use Markov random fields to capture signal inhomogeneities and to incorporate regional information. Classification schemes typically assign a single tissue type to a voxel. In order to be less sensitive to partial volume effects, fuzzy classifiers that allow multiple tissue types in a voxel have been used directly by Herndon et al. [23], and after correcting for intensity inhomogeneities by Pham and Prince [24]. Because in MRI independent variables can be measured (proton density, T1, T2), it is possible to perform classification based on multispectral images; see, e.g., Refs. 13 and 32–36.

Region-based methods include region growing, region growing in combination with morphological edge detection [37], and thresholding combined with morphological operations [38,39]. Höhne and Hanson [40] used the combination of region growing and morphological operations in an interactive three-dimensional setting.

Gradient techniques extract boundaries between tissue types. Bomans et al. [41] apply a three-dimensional extension of the Marr-Hildreth edge operator. Kennedy et al. [42] extract surfaces based on edge information. Lim and Pfefferbaum [43] developed an interactive segmentation procedure based on local contrast. Zijdenbos et al. [44] refined this approach for intracranial segmentation.

Another approach, to which a whole literature is devoted (for an excellent overview, the reader is referred to the paper by McInerney and
Terzopoulos [45]), considers deformable templates that are fitted to data based on (among others) a gradient-dependent attraction force. An advantage of these approaches is that prior knowledge can be built into the parameterization process. Examples are parameterized deformable surfaces [19], where prior probabilities are assigned to the model parameters. Point distribution models, which were introduced by Cootes et al. [46], combine the descriptive power of deformable models with statistical knowledge of the structure to be segmented. Variations of this approach have been applied to MR brain segmentation [47,48]. A related approach that incorporates prior anatomical information uses brain atlases [49]. Arata et al. [50] registered individual patient data sets to an atlas to determine interpatient variability. Dawant et al. [26] and Hartmann et al. [27] used a global transformation of an atlas data set in combination with free-form transformations to segment MR images in healthy volunteers and in patients with severely atrophied brains, respectively. Recently, approaches that have great topological flexibility [51] have been applied to MR volumes. In order to constrain the flexible deformation these schemes allow, additional constraints on the deformation have successfully been applied to segment the cortical mantle [52].

Both intensity-based (classification- and region-based) methods and edge-based methods have specific limitations. Edge-based approaches suffer from spurious edges, strong variations in edge strengths, and local gaps in edges. This is particularly true for the highly curved transition between grey and white matter. Classification methods and region-based methods suffer from the fact that there is in practice a significant overlap in intensity between tissue types, owing to RF coil inhomogeneity, susceptibility artifacts, and partial volume effects. To compensate retrospectively for RF induced variations, a number of approaches have been proposed [43,53–55], but none of these is generally applicable [56].

In this chapter we consider MR brain segmentation based on a multiscale representation of image data. This approach allows us to capture the gross outline of shapes at a larger scale, while achieving (sub)voxel accuracy at lower scales. The method is based on a multiscale stack as input, and a multiscale linking model that relates the voxels at subsequent scales. Originally this method was devised for multiscale stacks that were constructed using linear scale space theory. However, it may be advantageous to construct (nonlinear) multiscale representations that preserve boundaries between tissues of interest at higher scales. Ideally, intratissue smoothing corrects for grey value fluctuations within a tissue type, whereas intertissue diffusion is limited. In this study, gradient-dependent diffusion and geometrical scale spaces are considered and evaluated as alternatives to linear scale spaces.
The chapter is organized as follows. In Sec. 2 we describe the theory of multiscale segmentation using a linking model based on a linear multiscale representation of image data. Since the linking model can also be applied to other multiscale representations, provided that a number of conditions is met, we describe two alternative multiscale representations in Sec. 3, viz., gradient-dependent diffusion and geometric flows. In Sec. 4 the resulting scale spaces and segmentation results are shown for simulated 3-D MR data, 2-D MR patient data, and 3-D MR patient data, with the digital phantom (simulated data) and manual segmentations by medical experts (patient data) serving as ground truth.
2 MULTISCALE SEGMENTATION

2.1 Linear Scale Space
The necessity of a multiscale approach to image analysis has long been recognized [57]. A formal approach, linear scale space theory, that captures the inherent multiscale nature of image data, was developed by Witkin [58] and Koenderink [59]. By definition, a linear scale space is a one-parameter family of images generated by the linear diffusion equation with the original image L_0 as initial condition:

\begin{cases} \left( \dfrac{\partial}{\partial t} - \Delta \right) L(x, t) = 0 \\ L(x, 0) = L_0(x) \end{cases}    (1)
where t is related to spatial scale as t = ½σ². One of the most important features of this diffusion process is that it satisfies a maximum principle [60]. When traversing in the positive scale direction, local maxima are always decreasing [59]. The propagator (or Green's function) of the linear diffusion equation is the Gaussian function. Therefore the analytical solution of Eq. (1) is given by

L(x, t) = L_0(x) * G(x, \sigma)    (2)
where the D-dimensional Gaussian scale space kernel G is defined by the normalized Gaussian function

G(x, \sigma) = \frac{1}{(\sigma \sqrt{2\pi})^D} \exp\left( -\frac{x^2}{2\sigma^2} \right)    (3)
A multiscale stack of an image can now be constructed by computing scaled versions of the image at increasing σ. In order not to have an a priori bias for a certain scale, the natural way to travel through a linear multiscale
representation is exponential [61]. For creating a multiscale representation for the hyperstack linking model, good results have been obtained by Vincken et al. [62] and Koster et al. [63] by using the sampling procedure

\sigma_n = \exp(n \, \Delta\tau), \qquad \Delta\tau = \tfrac{1}{2} \ln 2    (4)
2.2 Hyperstack Multiscale Linking Model
The hyperstack (for a more detailed description and extension see Refs. 61–65) is a linking-model-based segmentation technique. The entire hyperstack segmentation process requires four steps (see Fig. 1): (1) blurring, (2) linking, (3) root labeling, and (4) downward projection.

In the blurring phase, a stack of images is created. During the linking phase, voxels in two adjacent scale levels are connected by so-called child–parent linkages. The main linking criteria are the luminance value and proximity [59]; a parent is selected in a scale-dependent search region based on the intensity difference. In our implementation, the region in which a child voxel in scale level n searches for a parent voxel in level n + 1 is given by a circular or spherical region with radius r_{n,n+1}:

r_{n,n+1} = k \cdot \sigma_{n,n+1}    (5)

where \sigma_{n,n+1} is the relative scale difference between level n and n + 1, and k is a constant that is set to 1.5. Within the search region a link is found based on an affection strength, which combines heuristic and statistical features [63]. This affection is defined as
FIGURE 1 Schematic of the hyperstack segmentation process.
A = \exp\left( -\frac{d_{c,p}^2}{2 \sigma_{n,n+1}^2} \right) \left( 1 - \frac{|L_p - L_c|}{\Delta L_{\max}} \right)    (6)
The first term has the effect that the affection is smaller if the distance d_{c,p} between a parent and child increases, where distances are measured relative to the scale increase from the child level n to the parent level n + 1. The second term considers the difference in image intensity between parent (L_p) and child (L_c), normalized by the maximum intensity difference ΔL_max in the original image. The linking is a bottom-up process such that only parents that have been linked to are considered children in the next linking step. This leads to the typical treelike structure of linkages through scale space. If the linking has converged, in the sense that only a few parents are left in the top level of the hyperstack, the root labeling takes place. In this phase, the children in the tree with a relatively low affection value are labeled as roots, each of which represents a segment in the original image. Finally, in the downward projection phase the actual segments are formed by grouping all the voxels in the ground level that are linked to a single root. This can readily be executed by following the child–parent linkages downwards. Typical postprocessing consists of merging a number of segments into the desired objects.

The quality of a segmentation can be judged with respect to accuracy (e.g., if a gold standard is available) and by a criterion that determines the amount of postprocessing required. Our goal is to obtain an accurate segmentation with minimal user interaction and subjectivity. The linking model as presented here is based on image intensities. Therefore the approach is limited to image modalities in which objects can be distinguished according to their grey value, as for example in MR images. It is also possible to base linkages on other types of information, e.g., gradient or texture information. Compared to the pyramid [57], the hyperstack is more accurate owing to a better sampling in both space and scale. Another important aspect of the hyperstack is that it allows for roots at multiple levels.

Two linear scale space properties are essential for hyperstack segmentation:

Causality: The linear multiscale representation satisfies a maximum principle [60]; when traversing in the positive scale direction local maxima are always decreasing. The successive simplification of the image enables identification of the most prominent objects at larger scales, which can be tracked down to smaller scales for accurate boundary localization.
Semigroup property in differential form: The Gaussian function is closed under convolution. Two consecutive smoothing steps with a kernel are equivalent to one smoothing step with a larger kernel. At any level of scale there is a local operator that steers the evolution. This property explains why linkages can be made based on a local criterion.

The mentioned scale space properties ensure that we do not just decompose our signal into descriptions at different scales. Rather, the process of how smaller objects (or finer detail) merge into larger objects at a higher scale is of interest and should be exploited in a multiscale segmentation method. The structure in scale space, called deep structure, has to be studied. It is important to note that linear scale space satisfies a third property, namely that the luminance is conserved under the diffusion process. Therefore it makes sense to link points in adjacent levels of scale based on intensity.
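To make the linking step of Sec. 2.2 concrete, the sketch below picks the best parent for a single child voxel in a 2-D level using the search radius of Eq. (5) and the affection strength of Eq. (6). The constant k = 1.5 follows the text; everything else (exhaustive search over the neighborhood, 2-D rather than 3-D) is an illustrative simplification of the hyperstack, not the authors' implementation.

```python
import numpy as np

def best_parent(child_xy, child_val, parent_level, sigma_rel, dL_max, k=1.5):
    """Pick the parent with maximal affection A (Eq. (6)) inside radius k * sigma_rel (Eq. (5))."""
    cy, cx = child_xy
    r = k * sigma_rel
    best, best_A = None, -np.inf
    y0, y1 = int(max(cy - r, 0)), int(min(cy + r, parent_level.shape[0] - 1))
    x0, x1 = int(max(cx - r, 0)), int(min(cx + r, parent_level.shape[1] - 1))
    for py in range(y0, y1 + 1):
        for px in range(x0, x1 + 1):
            d2 = (py - cy) ** 2 + (px - cx) ** 2
            if d2 > r * r:
                continue                                  # outside the circular search region
            A = np.exp(-d2 / (2.0 * sigma_rel ** 2)) * \
                (1.0 - abs(parent_level[py, px] - child_val) / dL_max)
            if A > best_A:
                best, best_A = (py, px), A
    return best, best_A
```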
3 NONLINEAR MULTISCALE REPRESENTATIONS
From the previous section we know that two essential properties for multiscale image inputs to the hyperstack are that (1) the image should simplify with increasing scales and (2) the multiscale representation should allow linking relations across scales. Linear scale space satisfies these requirements. However, it suffers from the drawback that it smoothes over regions that can be of particular interest. Therefore much research has concentrated on the construction of multiscale image descriptions with desirable scale space properties, but with additional constraints, for example preservation of relevant edges. Here we will consider multiscale stacks based on gradient dependent diffusion and mean curvature flow, respectively. In a related approach, Acton et al. [66] consider anisotropic diffusion pyramids for image segmentation. Their method uses the same subsampling as the pyramid. Here no subsampling is applied, since local diffusion can be very limited in nonlinear diffusion approaches.
3.1 Multiscale Representation Based on Gradient Dependent Diffusion

3.1.1 Gradient Dependent Diffusion
Since multiscale representations that conserve luminance are suitable for linking points in adjacent scale levels based on their grey value, we start with the general formula for an evolution of the luminance function under a flow F(x), which is a function of local image properties:
\frac{\partial L}{\partial t} = \nabla \cdot F(x)    (7)
The fact that this evolution conserves L under a current follows from the divergence theorem with appropriate boundary conditions. The function F may be dependent on image features. The most prominent example of luminance conserving scale spaces (F = \nabla L) is the linear scale space, as discussed in Sec. 2.1. However, the evolution can be adapted to any local image feature, e.g., to preserve edges or corners. Gradient-dependent diffusion is essentially a modification of the linear diffusion (or heat) equation [Eq. (1)], first proposed by Perona and Malik [67]. A general form of a diffusion equation with conduction coefficient c is given by

\frac{\partial L}{\partial t} = \nabla \cdot (c \, \nabla L)    (8)
The idea is to choose c so that relevant objects become smooth while "edges" between objects are preserved. The rationale in the case of MR images is that we would like to compensate for the intensity variations within a tissue, thus allowing intraregion smoothing while suppressing interregion smoothing. An option is to make c a decreasing function of the gradient magnitude [67]:

c(x, t) = g(\|\nabla L(x, t)\|) = e^{-\|\nabla L\|^2 / k^2}    (9)
Here, k is a free parameter that determines the significance of the local gradient. A maximum principle has been proven provided that the solution is C². However, there is considerable empirical and theoretical evidence that the Perona and Malik equation does not have smooth solutions, since it is ill-posed [68–70]. Therefore Catté et al. [71] suggest a regularization in which the gradient in the heat conduction coefficient is calculated at a certain scale. This also implies that insignificant edges (for example resulting from high frequency noise) will not be kept. Furthermore, this approach satisfies a maximum principle [69]. For implementing gradient-dependent diffusion we use a fast, unconditionally stable approach as introduced by Weickert et al. [72]. It is an implicit scheme that makes use of additive operator splitting (AOS). Since it is unconditionally stable, even the higher levels of the stack can be calculated very fast.
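For illustration only, a single explicit 2-D step of the regularized diffusion of Eqs. (8)–(9) could be sketched as below, with the conductivity evaluated on a pre-smoothed image as in the Catté et al. regularization. Note that this explicit scheme needs a small time step and is not the implicit AOS scheme actually used in this chapter; parameter values are placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pm_step(L, k=100.0, sigma_reg=2.0, dt=0.1):
    """One explicit step of regularized Perona-Malik diffusion, Eqs. (8)-(9)."""
    Ls = gaussian_filter(L, sigma_reg)                 # regularization: gradient at scale sigma_reg
    gy, gx = np.gradient(Ls)
    c = np.exp(-(gx ** 2 + gy ** 2) / k ** 2)          # conduction coefficient, Eq. (9)
    fy, fx = np.gradient(L)                            # flux uses the gradient of L itself
    div = np.gradient(c * fy, axis=0) + np.gradient(c * fx, axis=1)   # div(c grad L)
    return L + dt * div
```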
3.1.2 Sampling the Multiscale Stack
In order to generate a multiscale representation it is possible to use an exponential sampling of the evolution parameter of Eq. (8). However, a number of problems have been reported [64] using gradient-dependent diffusion and
the hyperstack owing to slow convergence. In this implementation convergence was forced by linearly smoothing the last levels in the stack. Therefore an improvement was introduced [65] by selecting the levels based on information-theoretic measures. Weickert proved [69] that the regularized Perona and Malik equation has a class of Lyapunov functionals that are monotonically decreasing in t and bounded by the entropy of a constant image L_∞ with the same average grey value. If we consider a Lyapunov functional H(L) we can construct measures Ψ of average globality of the form

\Psi(L(t)) := \frac{H(L_0) - H(L(t))}{H(L_0) - H(L_\infty)}    (10)
that increase monotonically from 0 to 1 with increasing t. In our experiments we sampled the scale space levels based on the variance of the image (from 0.0 to 1.0 in 20 equidistant steps). We observed that for the first levels, the equation typically behaves as a nonlinear smoothing procedure, preserving features of high contrast. Eventually, if the contrast of all structures is smaller than the contrast parameter k, all regions are smoothed.
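A minimal sketch of this variance-based level selection is given below: the diffusion is iterated and a level is stored every time the normalized measure Ψ of Eq. (10), with H taken as the image variance, passes the next of 20 equidistant targets. The stopping criterion and step function are assumptions for illustration.

```python
import numpy as np

def variance_sampled_stack(L0, step_fn, n_targets=20, max_iter=10000):
    """Store diffusion levels at equidistant values of the variance-based measure Psi."""
    L = L0.astype(float).copy()
    v0, v_inf = L.var(), 0.0                   # variance of a constant image is 0
    targets = np.linspace(0.0, 1.0, n_targets)
    stack, next_target = [L.copy()], 1         # Psi = 0 at t = 0
    for _ in range(max_iter):
        L = step_fn(L)                         # e.g. one step of the diffusion sketched above
        psi = (v0 - L.var()) / (v0 - v_inf)
        while next_target < n_targets and psi >= targets[next_target]:
            stack.append(L.copy())
            next_target += 1
        if next_target == n_targets:
            break
    return np.stack(stack)
```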
3.2 Multiscale Stack Based on Mean Curvature Flow

3.2.1 Geometrical Scale Spaces
Geometry-driven diffusion is another approach that favors intratissue smoothing over intertissue smoothing. In two dimensions the Euclidean shortening flow has excellent geometric smoothing properties:

\frac{\partial L}{\partial t} = L_{vv}    (11)
Here the (v, w) gauge is used (see Fig. 2), where v denotes the tangential
FIGURE 2 Local gauge coordinates in two dimensions and three dimensions (w points in the normal direction). In three dimensions, u and v denote the principal curvature directions.
direction and w the perpendicular direction to the isophote. This equation is called the Euclidean shortening flow, since it decreases the Euclidean arc length fastest. It can be interpreted as a diffusion equation that only diffuses in the local isophote direction (\Delta L = L_{vv} + L_{ww}). In Cartesian coordinates the Euclidean shortening flow reads

\frac{\partial L}{\partial t} = \frac{L_x^2 L_{yy} + L_y^2 L_{xx} - 2 L_x L_y L_{xy}}{L_x^2 + L_y^2}    (12)
Geometric smoothing in three dimensions is the same as evolving the isophote surfaces. In three-dimensional images isophotes can generally be regarded as two-dimensional surfaces. We use the (u, v, w) gauge (see Fig. 2) to describe conveniently the geometrical properties of the isophote surface. The direction normal to the surface is denoted by w, while u and v denote directions in the surface patch of minimal (κ_min) and maximal (κ_max) curvature, respectively. The local geometry of the surface is fully described by these two curvatures. Therefore we consider a general evolution of the isophote surface S as

\frac{\partial S}{\partial t} = g(\kappa_{\min}, \kappa_{\max}) \, N    (13)
If we consider this surface as the zero level set embedded in a representation L of one additional dimension, the corresponding evolution of L is given by

\frac{\partial L}{\partial t} = \|\nabla L\| \, g(\kappa_{\min}, \kappa_{\max})    (14)
Alvarez et al. [73] have shown that this equation describes a causal, homogeneous, isotropic and morphological scale space, if g is a symmetric function, nondecreasing with respect to κ_min and κ_max. A logical choice for g is the mean curvature H, which leads to the well-studied mean curvature flow [74]:

\frac{\partial S}{\partial t} = \frac{\kappa_{\min} + \kappa_{\max}}{2} \, N = H N    (15)
In Cartesian coordinates, H is given by

H = \frac{L_x^2 L_{yy} + L_y^2 L_{xx} - 2 L_x L_y L_{xy} + (\mathrm{cycl.}(x, y, z))}{2 (L_x^2 + L_y^2 + L_z^2)^{3/2}}    (16)
where cycl.(x, y, z) denotes the two cyclic permutations (x, y, z) → (y, z, x) → (z, x, y) of the previous terms. This equation is the equivalent of the Euclidean shortening flow in that it shrinks surface areas as fast as possible,
i.e., it is the gradient flow of the area functional. Moreover, the equation can be rewritten in the form

\frac{\partial L}{\partial t} = L_{uu} + L_{vv} = \Delta L - L_{ww}    (17)
which implies that we diffuse only in the directions u and v, orthogonal to the image gradient direction. Thus no diffusion is allowed in the direction that most likely signals a transition between tissue types.
3.2.2 Sampling the Multiscale Stack
Since for mean curvature flow we are not guaranteed to have a similar information reduction as for gradient-dependent diffusion, we opt for another approach to sample Eq. (16) into a discrete set of scale levels. Recall that σ is related to evolution time as t = ½σ² in the linear scale space case. Since mean curvature motion can be interpreted as limiting the diffusion in the normal direction of isosurfaces, we can select levels based on the corresponding evolution time. In this case the local scale is always equal to or smaller than the scale in the linear multiscale representation. Mean curvature flow has been implemented using an Euler forward scheme and centered derivative operators. Owing to stability criteria, the time step is set to Δt = 0.1, which implies a long computation time for the higher scale levels.
4 RESULTS
In this section the resulting scale spaces are presented, and segmentation results are evaluated with respect to the amount of required postprocessing and accuracy. For the amount of postprocessing the different strategies are compared with respect to the number of segments that are required in order to be able to distinguish the regions of interest. In the hyperstack linking model the number of segments that is obtained can be specified. It depends of course on the multiscale representation and the linking strategy whether these segments correspond to the segments of interest. Therefore typically an oversegmentation is obtained, and postprocessing in order to merge segments is required. It is also possible to split segments, but this is a more tedious procedure that introduces more interoperator variability. With respect to the accuracy of the segmentation results, a common problem is the absence of a "gold standard." We use supervised segmentation with manual postediting and a brain phantom as ground truth. To evaluate the segmentation results we apply two error measures, viz., the percentage of volume difference and the percentage of wrongly classified voxels MV. The first measure is always smaller than or equal to the second, since in the volume difference overestimates and underestimates cancel each other.
The latter measure contains both overestimates and underestimates. The performance can also be quantified using a similarity measure E [75] between two binary segmentations A_1 and A_2:

E = 1 - MV = \frac{2 \, \|A_1 \cap A_2\|}{\|A_1\| + \|A_2\|}    (18)
4.1 Simulated 3-D MR Brain Data
To validate the segmentation of white matter, grey matter, and CSF, we utilized the MR simulation data of a digital brain phantom.¹ The advantage of this data set is threefold: (1) ground truth is available, (2) the underlying brain phantom is fuzzy, i.e., multiple tissue types can be present in one voxel, simulating the partial volume effect, and (3) the simulator includes typical acquisition artifacts such as noise and nonuniformity. We selected a T1 MR image, with an additive noise distribution that has a standard deviation of 3% of the mean intensity of white matter. The nonuniformity field is set to vary between 0.9 and 1.1.

Compared to linear diffusion and mean curvature flow, gradient-dependent diffusion required the fewest segments to distinguish the objects of interest. Therefore segmentation results of the hyperstack linking model in conjunction with gradient-dependent diffusion are presented. The threshold k in Eq. (9) was set to 80% of the cumulative histogram of the gradient. In total, eight segments were required in order to be able to distinguish accurately white matter, grey matter, and CSF. We compared the segmentation to the digital brain phantom, which now provided a real gold standard. The results are shown in Table 1. For grey matter, white matter, and the total brain the volume could be estimated accurately. The volume of CSF is underestimated. The volume of misclassified voxels is approximately 6% for the total brain, and around 20% for white matter and grey matter. These numbers seem large, but they are close to the optimal possible binary segmentation (Table 1).²

In order to investigate the influence of the binarization of segmentation results, we also segmented the simulated MR data set using a probabilistic version of the hyperstack linking model [62]. In this modification, linkages are allowed to multiple parents, allowing voxels to belong partially to, e.g., grey and white matter (see Fig. 3).
¹ The simulated MRI volumes were provided by the McConnell Brain Imaging Centre at the Montreal Neurological Institute at their Web site http://bic.mni.mcgill.ca.
² An optimal segmentation is impossible if a binary classification is used, since the ground truth has subvoxel accuracy. The best possible binary segmentation has been obtained by assigning to each voxel that tissue that it contains most. Owing to the limited resolution with respect to the curved nature of these data, this already yields significant errors [76].
TABLE 1 Errors in Tissue Segmentation

Tissue type     Nr. voxels     Volume err. (%)   Classif. err. (%)
White matter    6.63 * 10^5    +0.08             17.42 (15.34)
Grey matter     8.98 * 10^5    -1.40             24.74 (20.17)
CSF             3.71 * 10^5    -6.80             27.92 (23.86)
Total brain     1.56 * 10^6    -0.77             6.42 (5.39)
ROI             1.93 * 10^6    -1.93             22.84 (19.24)
The 'Volume err.' column indicates the error in total estimated volume. The 'Classif. err.' column indicates the total volume of wrongly classified voxels. Here the probabilistic phantom is used as ground truth. Between brackets, the minimal possible error is listed in case of a binary segmentation.
For all segments an image is generated that contains a distribution labeling which percentage of the voxel belongs to that segment. Postprocessing boils down to adding a number of segments into a tissue class. User interaction is minimal; four segments were grouped to obtain the white matter segmentation, while grey matter and CSF were present as two segments. In Table 2 we show the errors for probabilistic segmentation. White matter and total brain are segmented more accurately than would be possible using a binary segmentation, i.e., a segmentation where a voxel can only belong to one tissue class.
4.2 Two-Dimensional MR Patient Data
In this section we show the multiscale representation and the segmentation result of a 2-D MR slice. The image (called 'BRAIN.TR'; see Fig. 4) is a transversal slice with a region of interest of 170 × 210 pixels. We distinguish six segments, viz., ventricles, grey matter, white matter, tissue between skull and brain, skull, and background.
FIGURE 3 Examples of probabilistic segmentation of total brain, white matter, grey matter, and CSF, respectively.
TABLE 2 Errors in Tissue Segmentation Using a Probabilistic Approach

Tissue type     Nr. voxels     Volume error (%)   Classification error (%)
White matter    6.63 * 10^5    +2.99              10.09 (15.34)
Grey matter     8.98 * 10^5    +1.47              21.12 (20.17)
CSF             3.71 * 10^5    -8.02              25.21 (23.86)
Total brain     1.56 * 10^6    +2.12              4.65 (5.39)
ROI             1.93 * 10^6    +0.17              18.14 (19.24)
The 'Volume error' column indicates the error in total estimated volume. The 'Classification error' column indicates the total volume of wrongly classified voxels. Note that the results of white matter and total brain segmentation are better than the best possible binary segmentation (between brackets).
The objective of the study is to obtain a sufficiently accurate segmentation (compared to a gold standard obtained by manual segmentation by an expert) with minimal manual editing. We show a number of scale levels of all input images for the linear scale space, the Euclidean shortening flow, and the regularized Perona and Malik equation. For the regularized Perona and Malik equation we have to set a number of parameters to obtain the heat conduction coefficient in Eq. (9). We calculate the gradient magnitude at σ = 2 (pixel units). The threshold k, which
FIGURE 4 Left: the original BRAIN.TR image; middle: the manual segmentation of the ROI; right: segmentation result superimposed on the original image.
determines which gradients are significant, is selected upon inspecting the gradient magnitude near object boundaries. In Fig. 5 we show the scale spaces of the MR images. Recall that for the linear scale space the level n coincides with σ_n = exp(n Δτ); Δτ = ½ ln 2. To illustrate that the diffusion can be adapted in order to preserve objects of interest, we have included two values of k. For instance, the ventricles are very well preserved if k = 100. Also, depending on the signal-to-noise ratio and the frequency of the noise, the scale at which the gradient is evaluated can be adapted. For the MR images we typically have good results for σ ≈ 1–4 pixel units. Although a smaller value for k preserves more objects, the best results for the MR images shown here have been obtained using a larger k. Most of the time this is due to the fact that convergence of subparts of an object is slow for smaller k. Convergence of the linkages in a hyperstack is enforced by using a sufficient number of discrete scale space levels, so that only a few top parents remain in the highest level of the stack (a hyperstack typically needs 15 levels to converge). Note that this also requires that all the detail disappear at increasing scale; otherwise, groups of linkages may continue linking upwards without linking to each other. For instance, the stacks generated by the regularized Perona and Malik equation with small k suffer from this problem (they require longer evolution times).

In Fig. 6 we plot the object distribution obtained from the three hyperstack segmentations of the BRAIN.TR image and show error pictures of the white matter. Each segmentation produces a series of subsegments that have to be collected to end up with a distribution similar to Fig. 4. The number of subsegments that results from the hyperstack segmentation step is indicated in Fig. 6. We only need 16 segments to obtain a segmentation of the six objects using the regularized Perona and Malik equation. This shows the potential of the hyperstack, especially in combination with nonlinear multiscale representations, for speeding up segmentations; manually segmenting the white matter is a tedious task in this image.
4.3 Three-Dimensional MR Patient Data
Four T1-weighted 3-D MR brain data sets were considered, for which a manual segmentation obtained by a medical expert was available. The data were provided by the Neuroimaging Group of the Schizophrenia Project, Department of Psychiatry, University Hospital, Utrecht. Resolution of the data was 1.0 × 1.0 × 1.2 mm, and the ROI consisted of approximately 160³ voxels. Manual segmentations were obtained using a number of simple supervised thresholding and morphological operations, followed by a slice-by-slice inspection where the contours could be modified by manual tracings.
FIGURE 5 The BRAIN.TR image shown at different levels of scale for four different scale space generators. Shown are, from top to bottom: (a) Linear scale space, spatial domain (levels 2, 5, 8, 11); (b) regularized Perona and Malik equation (k = 100) (levels 2, 5, 8, 11); (c) regularized Perona and Malik equation (k = 400) (levels 2, 5, 8, 11); (d) Euclidean shortening flow (levels 2, 5, 8, 11).
FIGURE 6 Top row: Segmentations of the BRAIN.TR image. To obtain the six segmented areas a number of regions had to be merged in a postprocessing step. As a quantitative criterion based on the manual effort of a user, a heuristic determination of postprocessing costs is introduced. An indication is the number of segments needed to obtain an accurate segmentation of the six objects. The linear scale space (left) required 58 segments, Euclidean shortening flow (middle) needed 34 segments, and the regularized Perona and Malik equation (right) only 16. In this sense the regularized Perona and Malik equation performs best. Bottom row: Errors in the segmentations of the white matter. The pixels that are in agreement with each other are colored grey; the differences are colored white.
In Figs. 7, 8, and 9 we show coronal, sagittal, and transversal cross-sections of scale space levels obtained using linear scale space, gradient-dependent diffusion, and mean curvature motion, respectively. The first important criterion to compare the performance of the three multiscale stacks is the number of segments that is needed to distinguish the tissues of interest.
FIGURE 7 Coronal, sagittal, and transversal cross sections of an MR volume subjected to linear diffusion. The plotted results are the original slice and the results for σ equal to 2.0, 8.0, and 32.0 pixel units.
grey matter as separate segments, typically on the order of ten segments were required for the multiscale stack based on gradient-dependent diffusion. In the case of mean curvature flow, on average thirty segments were required, whereas linear diffusion sometimes required over one hundred segments. Therefore, in this section quantitative results are presented using the multiscale stack based on gradient-dependent diffusion. We present the segmentation results obtained by the hyperstack in conjunction with gradient-dependent diffusion in two steps. In order to segment the total brain, i.e., the region consisting of grey matter and white matter, a small number of postprocessing steps is required. After the total brain has been segmented, segmentation of grey matter, white matter, and CSF can be achieved by joining a small number of segments. Separating the segmentation of the total brain versus background and the subsequent segmentation of
FIGURE 8 Coronal, sagittal, and transversal cross sections of an MR volume subjected to gradient-dependent diffusion. The plotted results are the original slice, and the result after evolving the image until 96%, 92%, and 88% of the original variance of the image is obtained, respectively. Owing to the implicit implementation, the stack can be calculated very efficiently.
its parts is important for two reasons. First, manual segmentation of the total brain has proven to be quite accurate and is less time-consuming than segmentation of grey and white matter [35]. Second, relatively simple semiautomatic segmentation procedures, e.g., combining region growing or intensity-based classification with the use of spatial context (morphological operations) under expert supervision, are viable alternatives for total brain segmentation. In all four cases, only a limited number of segments was required to identify the total brain. In the results presented here we used between 9 and 14 segments.
FIGURE 9 Coronal, sagittal, and transversal cross sections of an MR volume subjected to mean curvature flow. The plotted results are the original slice, and the evolution after 16, 128, and 1024 iterations with a timestep of 0.1 pixel units.
By merging a number of segments, which ranged between 5 and 7, we outlined the total brain. The only postprocessing involved the removal of a number of small connective regions in the intracranial cavity (mainly dural structures) that connect the brain to part of the surrounding cranium. The five segments representing the total brain are combined into a binary segmentation that is morphologically eroded with a square structuring element of radius 1. Subsequently, the largest region (which represents the total brain) is identified. A geodesic dilation of two iterations with a square element of radius 1 is applied to compensate for the erosion in the first step. A geodesic dilation is defined as the infimum of a dilation and the original binary segmentation (the mask image).
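A minimal sketch of this postprocessing chain is given below. The 3 × 3 × 3 structuring element, the use of the SciPy morphology routines, and the handling of the empty case are assumptions for illustration, not the authors' code.

import numpy as np
from scipy import ndimage

def postprocess_total_brain(merged, n_geodesic=2):
    """merged: binary 3-D array obtained by merging the selected segments."""
    struct = np.ones((3, 3, 3), bool)            # structuring element of radius 1 (assumed 3-D)
    eroded = ndimage.binary_erosion(merged, structure=struct)

    labels, n = ndimage.label(eroded)            # keep the largest region = total brain
    if n == 0:
        return eroded
    sizes = ndimage.sum(eroded, labels, index=range(1, n + 1))
    result = labels == (np.argmax(sizes) + 1)

    for _ in range(n_geodesic):                  # geodesic dilation: dilate, then take the
        dilated = ndimage.binary_dilation(result, structure=struct)
        result = dilated & merged                # infimum with the original mask image
    return result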
TABLE 3 Quantitative Comparison of Segmentation Errors in Two MR Brain Studies

Study      Nr. voxels ROI    Volume error (%)    Classif. error (%)    Similarity
Brain 1    9.6 × 10⁵         −0.6                5.0                   0.95
Brain 2    8.3 × 10⁵         −0.3                5.3                   0.95
Brain 2    9.1 × 10⁵         −1.3                2.9                   0.97
Brain 2    8.8 × 10⁵         −0.1                4.2                   0.96
In the volume error, overestimates and underestimates cancel each other. Therefore volumes can be estimated rather accurately. For evaluating segmentation procedures, the classification error or the similarity measure (Eq. 18) is more appropriate.
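The three measures reported in Table 3 can be computed from a binary automatic segmentation and the manual reference roughly as follows. Since Eq. 18 is not reproduced in this excerpt, the similarity measure is assumed here to be the overlap (Dice) index 2|A ∩ B|/(|A| + |B|), and normalizing the classification error by the reference volume is likewise an assumption.

import numpy as np

def evaluate(auto, manual):
    """auto, manual: binary arrays of the same shape."""
    auto, manual = auto.astype(bool), manual.astype(bool)
    volume_error = (auto.sum() - manual.sum()) / manual.sum() * 100          # over/under cancel
    classif_error = np.logical_xor(auto, manual).sum() / manual.sum() * 100  # misclassified voxels
    similarity = 2 * np.logical_and(auto, manual).sum() / (auto.sum() + manual.sum())
    return volume_error, classif_error, similarity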
In Table 3 we quantify the segmentation results with respect to the supervised segmentation. The estimated volume differs by approximately 0.1 to 1.3%, while the volume of misclassified voxels is on the order of 3–5%. The volume of misclassified voxels is a more reliable estimate of the performance of a segmentation method, since in estimates of the total volume over- and underestimates cancel each other. In order to assess the performance of the automatic segmentation method visually, volume renderings were made using the VROOM package developed by Zuiderveld et al. [77] (see Fig. 10). We first render the segmentation so as to show the binary segmentation result. Subsequently, we use the binary segmentation for a visually optimal rendering. In the latter approach the segmentation is used as a mask, and the original grey values and an appropriate opacity function are used to compute the surface. Once the total brain is segmented, white matter, grey matter, and ventricles can be identified with no further user interaction than joining a small number of segments. Since manual three-dimensional segmentation of white matter is a time-consuming procedure, we show visualizations of the segmented white matter surface in Fig. 11.
5 SUMMARY
A multiscale linking model that uses large-scale information to obtain a small number of segments, and small-scale information for accuracy, has been used in conjunction with a number of multiscale representations.
FIGURE 10 Top left: Visualization based on automatic segmentation. Top right: Visualization based on supervised segmentation. Bottom left: Visualization based on the automatic segmentation. The segmentation is used as a mask, while the eventual visualization depends on the grey value of the original image and a properly adapted opacity function. This typically leads to smoother visualizations. Bottom right: Ditto for the supervised segmentation.
The output of the segmentation program consists of a number of segments that have to be grouped to obtain the regions of interest. Gradient-dependent diffusion outperforms linear diffusion and mean curvature motion, in the sense that a smaller number of segments is required to outline the interesting regions. Simple morphological postprocessing is used to outline the total brain. No further user interaction is required to segment grey matter, white matter, and CSF. The segmentation results have been compared to ground truth in the case of simulated MR data on a digital brain phantom. In the simulation study, the volume of the total brain (+0.77%), white matter (+0.08%), and grey
FIGURE 11 Top left: Transverse visualization of white matter based on automatic segmentation. Top right: Transverse visualization based on original grey values with automatic segmentation as mask. Bottom left: Sagittal visualization of white matter based on automatic segmentation. Bottom right: Sagittal visualization based on original grey values with automatic segmentation as mask.
matter (−1.40%) could be reliably estimated. The volume of CSF could be estimated reasonably well (−6.80%). We also looked at the total volume of misclassified voxels. The errors in grey and white matter are on the order of 20%. Although errors of this magnitude may seem alarming, errors in this range can be expected for any segmentation method that binarizes the segmentation results (for T1 MR acquisitions with voxels of 1 mm³). By using a probabilistic segmentation algorithm, results better than those attainable by any binary segmentation have been obtained.
In the case of patient data, the results show that whereas the volume of grey matter, white matter, and total brain can be estimated quite accurately (errors on the order of 0.5%), the percentage of wrongly classified pixels is considerably larger. We found deviations of 5% in the total brain segmentation. Whereas for objectivity and reproducibility a fully automatic segmentation procedure is desirable, it is hard, if possible at all, to obtain accurate segmentation results without any supervision. In this chapter we have shown that our low-level bottom-up approach leaves only a small number of decisions to be made to obtain the desired segments. These last few decisions can be made either by an operator or by using prior knowledge. In the former case it is important to minimize subjectivity, whereas in the latter case a study involving a large number of data sets is required to test the robustness of a (now fully automatic) system. In our experiments we segment brains according to the former method. Since the hyperstack yields a discrete number of segments, user subjectivity is limited (there are no continuous parameters that need to be set). The incorporation of a priori knowledge is an important next step. This can be done at different levels of sophistication. First, anatomical knowledge, for example from brain atlases, can be used to group segments into the desired anatomical structures, which can lead to a fully automatic segmentation procedure. Second, in the linking procedure itself segments can be tagged according to the structure they most likely represent, and the linking process can be adjusted accordingly. This modification can increase convergence speed and accuracy.

REFERENCES
1. JC Bezdek, LO Hall, LP Clarke. Review of MR image segmentation techniques using pattern recognition. Medical Physics 20(4):1033–1048, 1993.
2. AP Zijdenbos, BM Dawant. Brain segmentation and white matter lesion detection in MR images. Critical Reviews in Biomedical Engineering 22(5–6):401–465, 1994.
3. LP Clarke, RP Velthuizen, MA Camacho, JJ Heine, M Vaidyanathan, LO Hall, RW Thatcher, ML Silbiger. MRI segmentation: methods and applications. Magnetic Resonance Imaging 13(3):343–368, 1995.
4. BJ Bedell, PA Narayana. Automatic segmentation of gadolinium-enhanced multiple sclerosis lesions. Magnetic Resonance in Medicine 39(6):935–940, 1998.
5. SM Lawrie, SS Abukmeil. Brain abnormality in schizophrenia. A systematic and quantitative review of volumetric magnetic resonance imaging studies. British Journal of Psychiatry 172:110–120, 1998.
6. JL Tanabe, D Amend, N Schuff, V DiSclafani, F Ezekiel, D Norman, G Fein, MW Weiner. Tissue segmentation of the brain in Alzheimer disease. AJNR American Journal of Neuroradiology 18(1):115–123, 1997.
7. D Kidron, SE Black, P Stanchev, B Buck, JP Szalai, J Parker, C Szekely, MJ Bronskill. Quantitative MR volumetry in Alzheimer's disease. Topographic markers and the effects of sex and education. Neurology 49(6):1504–1512, 1997.
8. C Davatzikos, RN Bryan. Using a deformable surface model to obtain a shape representation of the cortex. IEEE Transactions on Medical Imaging 15(6):785–795, 1996.
9. C Xu, DL Pham, ME Rettmann, DN Yu, JL Prince. Reconstruction of the human cerebral cortex from magnetic resonance images. IEEE Transactions on Medical Imaging 18(6):467–480, 1999.
10. SR Sandor, RM Leahy. Toward automated labeling of the cerebral cortex using a deformable atlas. In: Proc. Information Processing in Medical Imaging, 127–138, 1995.
11. M Shenton, R Kikinis, FA Jolesz, SD Pollak, M LeMay, CG Wible, H Hokama, J Martin, D Metcalf, M Coleman, RW McCarley. Abnormalities of the left temporal lobe and thought disorder in schizophrenia. New England Journal of Medicine 327:604–612, 1992.
12. D Dai, B Condon, R Rampling, G Teasdale. Intracranial deformation caused by brain tumors; assessment of 3-D surface by magnetic resonance imaging. IEEE Transactions on Medical Imaging 12(4):693–702, 1993.
13. HE Cline, WE Lorensen, R Kikinis, F Jolesz. Three dimensional segmentation of MR images of the head using probability and connectivity. Journal of Computer Assisted Tomography 14(6):1037–1045, 1990.
14. WEL Grimson, GJ Ettinger, SJ White, PL Gleason, L Pérez, WM Wells, R Kikinis. An automatic registration method for frameless stereotaxy, image guided surgery, and enhanced reality visualization. IEEE Transactions on Medical Imaging 15(2):129–140, 1996.
15. T Peters, B Davey, P Munger, R Comeau, A Evans, A Olivier. Three-dimensional multimodal image-guidance for neurosurgery. IEEE Transactions on Medical Imaging 15(2):121–128, 1996.
16. S Nakajima, H Atsumi, AH Bhalerao, FA Jolesz, R Kikinis, T Yoshimine, TM Moriarty, PE Stieg. Computer-assisted surgical planning for cerebrovascular neurosurgery. Neurosurgery 41(2):403–409; discussion 409–410, 1997.
17. SS Kollias, R Bernays, RA Marugg, B Romanowski, Y Yonekawa, A Valavanis. Target definition and trajectory optimization for interactive. Journal of Magnetic Resonance Imaging 8(1):143–159, 1998.
18. M Sonka, SK Tadikonda, SM Collins. Knowledge-based interpretation of MR brain images. IEEE Transactions on Medical Imaging 15(4):443–452, 1996.
19. LH Staib, JS Duncan. Model-based deformable surface finding for medical images. IEEE Transactions on Medical Imaging 15(5):720–731, 1996.
20. WM Wells, WEL Grimson, R Kikinis, FA Jolesz. Adaptive segmentation of MRI data. IEEE Transactions on Medical Imaging 15(4):429–442, 1996.
21. N Saeed, JV Hajnal, A Oatridge. Automated brain segmentation from single slice, multislice, or whole-volume MR scans using prior knowledge. Journal of Computer Assisted Tomography 21(2):192–201, 1997.
22. MS Atkins, BT Mackiewich. Fully automatic segmentation of the brain in MRI. IEEE Transactions on Medical Imaging 17(1):98–107, 1998.
23. RC Herndon, JL Lancaster, JN Giedd, PT Fox. Quantification of white matter and gray matter volumes from three-dimensional magnetic resonance volume studies using fuzzy classifiers. Journal of Magnetic Resonance Imaging 8(5):1097–1105, 1998.
24. DL Pham, JL Prince. Adaptive fuzzy segmentation of magnetic resonance images. IEEE Transactions on Medical Imaging 18(9):737–752, 1999.
25. K Van Leemput, F Maes, D Vandermeulen, P Suetens. Automated model-based tissue classification of MR images of the brain. IEEE Transactions on Medical Imaging 18(10):897–908, 1999.
26. BM Dawant, SL Hartmann, J-P Thirion, F Maes, D Vandermeulen, P Demaerel. Automatic 3-D segmentation of internal structures of the head in MR images using a combination of similarity and free-form transformations: Part I, methodology and validation on normal subjects. IEEE Transactions on Medical Imaging 18(10):909–916, 1999.
27. SL Hartmann, MH Parks, PR Martin, BM Dawant. Automatic 3-D segmentation of internal structures of the head in MR images using a combination of similarity and free-form transformations: Part II, validation on severely atrophied brains. IEEE Transactions on Medical Imaging 18(10):917–926, 1999.
28. H Suzuki, J Toriwaki. Automatic segmentation of head MRI images by knowledge guided thresholding. Computerized Medical Imaging and Graphics 15(4):233–240, 1991.
29. T Kapur, WEL Grimson, WM Wells, R Kikinis. Segmentation of brain tissue from magnetic resonance images. Medical Image Analysis 1(2):109–127, 1996.
30. K Held, ER Kops, BJ Krause, WM Wells, R Kikinis, HW Müller-Gärtner. Markov random field segmentation of brain MR images. IEEE Transactions on Medical Imaging 16(6):878–886, 1997.
31. JC Rajapakse, JN Giedd, JL Rapoport. Statistical approach to segmentation of single-channel cerebral MR images. IEEE Transactions on Medical Imaging 16(2):176–186, 1997.
32. M Vannier, R Butterfield, D Jordon, W Murphy, RG Levitt, M Gado. Multispectral analysis of magnetic resonance images. Radiology 154:221–224, 1985.
33. M Kohn, N Tanna, G Herman, SM Resnick, PD Mozley, RE Gur, A Alavi, RA Zimmerman, RC Gur. Analysis of brain and cerebrospinal fluid volumes with MR imaging. Radiology 178:115–122, 1991.
34. G Gerig, J Martin, R Kikinis, O Kübler, M Shenton, FA Jolesz. Unsupervised tissue type segmentation of 3D dual-echo MR head data. Image and Vision Computing 10(6):349–360, 1992.
35. R Kikinis, ME Shenton, G Gerig, J Martin, M Anderson, D Metcalf, CHRG Guttmann, RW McCarley, B Lorensen, H Cline, FA Jolesz. Routine quantitative analysis of brain and cerebrospinal fluid spaces with MR imaging. Journal of Magnetic Resonance Imaging 2(6):619–629, 1992.
36. LM Fletcher, JB Barsotti, JP Hornak. A multispectral analysis of brain tissues. Magnetic Resonance in Medicine 29:623–630, 1993.
37. P Gibbs, DL Buckley, SJ Blackband, A Horsman. Tumour volume determination from MR images by morphological segmentation. Physics in Medicine and Biology 41:2437–2446, 1996.
38. ME Brummer, RM Mersereau, RL Eisner, RRJ Lewine. Automatic detection of brain contours in MRI data sets. IEEE Transactions on Medical Imaging 12(2):153–166, 1992.
39. L Lemieux, G Hagemann, K Krakow, FG Woermann. Fast, accurate, and reproducible automatic segmentation of the brain in T1-weighted volume MRI data. Magnetic Resonance in Medicine 42(1):127–135, 1999.
40. KH Höhne, WH Hanson. Interactive 3-D segmentation of MRI and CT volumes using morphological operations. Journal of Computer Assisted Tomography 16(2):285–294, 1992.
41. M Bomans, KH Höhne, U Tiede, M Riemer. 3-D segmentation of MR images of the head for 3-D display. IEEE Transactions on Medical Imaging 9(2):177–183, 1990.
42. DN Kennedy, PA Filipek, VS Caviness. Anatomic segmentation and volumetric calculations in nuclear magnetic resonance imaging. IEEE Transactions on Medical Imaging 8(1):1–7, 1989.
43. KO Lim, A Pfefferbaum. Segmentation of MR brain images into cerebrospinal fluid spaces, white and gray matter. Journal of Computer Assisted Tomography 13(4):588–593, 1990.
44. A Zijdenbos, BM Dawant, RA Margolin. Automatic detection of intracranial contours in MR images. Computerized Medical Imaging and Graphics 18:11–23, 1994.
45. T McInerney, D Terzopoulos. Deformable models in medical image analysis: a survey. Medical Image Analysis 1(2):91–108, 1996.
46. T Cootes, A Hill, C Taylor, J Haslam. Use of active shape models for locating structures in medical images. Image and Vision Computing 12(6):355–366, 1994.
47. G Székely, A Kelemen, C Brechbühler, G Gerig. Segmentation of 2-D and 3-D objects from MRI volume data using constrained elastic deformations of flexible Fourier contour and surface models. Medical Image Analysis 1(1):19–34, 1996.
48. N Duta, M Sonka. Segmentation and interpretation of MR brain images: an improved active shape model. IEEE Transactions on Medical Imaging 17(6):1049–1062, 1998.
49. MI Miller, GE Christensen, Y Amit, U Grenander. Mathematical textbook of deformable neuroanatomies. Proceedings of the National Academy of Sciences USA 90(24):11944–11948, 1993.
50. LK Arata, AP Dhawan, JP Broderick, MF Gaskil-Shipley, AV Levy, ND Volkow. Three-dimensional anatomical model-based segmentation of MR brain images through Principal Axes Registration. IEEE Transactions on Biomedical Engineering 42(11):1069–1078, 1995.
51. A Yezzi, S Kichenassamy, A Kumar, P Olver, A Tannenbaum. A geometric snake model for segmentation of medical imagery. IEEE Transactions on Medical Imaging 16(2):199–209, 1997.
52. X Zeng, LH Staib, TR Schultz, JS Duncan. Segmentation and measurement of the cortex from 3D MR images using coupled surfaces propagation. IEEE Transactions on Medical Imaging 18(10):927–937, 1999.
53. BM Dawant, AP Zijdenbos, RA Margolin. Correction of intensity variations in MR images for computer-aided tissue classification. IEEE Transactions on Medical Imaging 12(4):770–781, 1993.
54. WM Wells, R Kikinis, WEL Grimson, FA Jolesz. Adaptive segmentation of MRI data. IEEE Transactions on Medical Imaging 15(4):429–442, 1996.
55. B Johnston, MS Atkins, B Mackiewich, M Anderson. Segmentation of multiple sclerosis lesions in intensity corrected multispectral MRI. IEEE Transactions on Medical Imaging 15(2):154–169, 1996.
56. JG Sled, AP Zijdenbos, AC Evans. A comparison of retrospective intensity nonuniformity correction methods for MRI. In: JS Duncan, G Gindi, eds. Proc. Information Processing in Medical Imaging, 459–464, 1997.
57. PJ Burt, TH Hong, A Rosenfeld. Segmentation and estimation of image region properties through cooperative hierarchical computation. IEEE Transactions on Systems, Man, and Cybernetics 11(12):802–825, 1981.
58. AP Witkin. Scale-space filtering. In: Proc. International Joint Conference on Artificial Intelligence, 1019–1023, Karlsruhe, Germany, 1983.
59. JJ Koenderink. The structure of images. Biological Cybernetics 50:363–370, 1984.
60. RA Hummel. Representations based on zero crossings in scale-space. In: Proceedings of the IEEE Computer Vision and Pattern Recognition Conference, 204–209, 1986. Reproduced in: Readings in Computer Vision: Issues, Problems, Principles and Paradigms (M Fischler, O Firschein, eds.). Morgan Kaufmann, 1987.
61. LMJ Florack, BM ter Haar Romeny, JJ Koenderink, MA Viergever. Linear scale-space. Journal of Mathematical Imaging and Vision 4(4):325–351, 1994.
62. KL Vincken, ASE Koster, MA Viergever. Probabilistic multiscale image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(2):109–120, 1997.
63. ASE Koster, KL Vincken, CN De Graaf, OC Zander, MA Viergever. Heuristic linking models in multiscale image segmentation. Computer Vision and Image Understanding 65(3):382–402, 1997.
64. WJ Niessen, KL Vincken, J Weickert, MA Viergever. Nonlinear multiscale representations for image segmentation. Computer Vision and Image Understanding 66(2):233–245, 1997.
65. WJ Niessen, KL Vincken, J Weickert, BM ter Haar Romeny, MA Viergever. Multiscale segmentation of three-dimensional MR brain images. International Journal of Computer Vision 31(2/3):185–202, 1999.
66. ST Acton, AC Bovik, MM Crawford. Anisotropic diffusion pyramids for image segmentation. In: Proc. First International Conference on Image Processing. IEEE, 1994, 478–482.
67. P Perona, J Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 12(7):629–639, 1990.
68. M Nitzberg, T Shiota. Nonlinear image filtering with edge and corner enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(8):826–833, 1992.
69. J Weickert. Anisotropic diffusion in image processing. PhD thesis, Dept. of Mathematics, University of Kaiserslautern, Germany, 1996. Revised and extended version available as book (Teubner Verlag, Stuttgart, 1998).
70. S Kichenassamy. Nonlinear diffusions and hyperbolic smoothing for edge enhancement. In: MO Berger, R Deriche, I Herlin, J Jaffré, JM Morel, eds. Proc. of 12th International Conference on Analysis and Optimization of Systems, Vol. 219 of Lecture Notes in Control and Information Sciences. Springer, London, 1996, 119–124.
71. F Catté, PL Lions, JM Morel, T Coll. Image selective smoothing and edge detection by nonlinear diffusion. SIAM Journal on Numerical Analysis 29(1):182–193, 1992.
72. J Weickert, BM ter Haar Romeny, MA Viergever. Efficient and reliable schemes for nonlinear diffusion filtering. IEEE Transactions on Image Processing 7(3):398–410, 1998.
73. L Alvarez, PL Lions, JM Morel. Image selective smoothing and edge detection by nonlinear diffusion. II. SIAM Journal on Numerical Analysis 29(3):845–866, 1992.
74. G Huisken. Flow by mean curvature of convex surfaces into spheres. Journal of Differential Geometry 20:237–266, 1984.
75. A Zijdenbos, BM Dawant, RA Margolin, AC Palmer. Morphometric analysis of white matter lesions in MR images. IEEE Transactions on Medical Imaging 13(4):716–724, 1994.
76. WJ Niessen, CJ Bouma, KL Vincken, MA Viergever. Evaluation of medical image segmentation. In: R Klette, B Haralick, S Stiehl, MA Viergever, eds. Geometry-Driven Diffusion in Computer Vision, Computational Imaging and Vision. Dordrecht: Kluwer Academic Publishers, 2000, in press.
77. KJ Zuiderveld, AHJ Koning, R Stokking, JB Antoine Maintz, F Appelman, MA Viergever. Multimodality visualization of medical volume data—our techniques, applications, and experiences. Computer and Graphics 20(6):775–791, 1996.
9 A Precise Segmentation of the Cerebral Cortex from 3-D MRI Using a Cellular Model and Homotopic Deformations Yann Cointepas and Isabelle Bloch, École Nationale Supérieure des Télécommunications, Paris, France
Line Garnero, Hôpital La Salpêtrière, Paris, France
1 INTRODUCTION
Three-dimensional cerebral cortex segmentation using magnetic resonance images (MRI) has a double purpose: it provides a precise model of the brain convolutions and it provides information on the localization of functional regions. Therefore it is useful for both anatomical applications (such as visualization, registration, recognition, etc.) and functional imaging (fMRI, PET, EEG/MEG, etc.). The cortex geometry is complex because its surface presents many folds. Only one third of the cortex is located on the external part of the brain; the rest is located in the folds (also called sulci). As a result, cortex segmentation methods must overcome several inherent problems: Sulci have a "tree" structure. They are divided into many branches inside the brain.
Sulci are both surfacic, when two distinct parts of the cortex are in contact, and volumic, in places filled with cerebrospinal fluid (CSF) (Fig. 1). The thin parts of the sulci are very sensitive to partial volume effects. Because of these difficulties and because of the complex geometry of the cortical ribbon, classical segmentation techniques cannot be used to obtain a complete 3-D model of the cortex, but only the external part of the cortex surface. Therefore existing cortex segmentation methods are specific and often use anatomical information as assumptions about the final model of the cortex. The two most commonly used constraints impose on the final model the topology of a sphere and an almost constant cortex width. Let us comment on some typical examples of the existing methods. Wagner et al. [1] segmented the cortical ribbon by applying a region-growing algorithm along the limit between gray matter and white matter. This method makes it possible to compute geodesic distances between cortex points, but it does not take into account the part of the cortex surface that disappears due to the partial volume effect. Zeng et al. [2] proposed a method based on level sets [3,4]. The deformable model is composed of two curves evolving in parallel under distance constraints. These curves represent the interior and exterior surfaces of the cortex, respectively. In this method, the topology of the model is not constrained, and the partial volume effect is not taken into account. Mangin et al. [5] introduced the use of homotopic deformation (i.e., topology-preserving deformation) for cortex segmentation. The model is initialized by the voxels corresponding to the brain surface. This model is homotopically dilated in the gray matter. Then a homotopic skeletonization
FIGURE 1 Diagram of a cortex slice. The cerebral convolutions consist of both two-dimensional structures in the very narrow parts and three-dimensional parts, containing CSF.
is applied to obtain a surfacic representation. This method gives a surfacic model of the cortex with a spherical topology but does not consider the presence of CSF volumes in the sulci. Teo et al. [6] proposed a topological segmentation method based on region growing. First, the white matter is segmented using classification. Then layers of voxels are successively added to the white matter surface to obtain the cortex. The topology of the result is given by the segmentation of the white matter, but the classification used cannot guarantee the correct topology. Xu et al. [7] carried out a fuzzy segmentation of gray matter, white matter, and CSF. The white matter membership function is then thresholded to obtain an isosurface used as an initial deformable model. The correct topology is obtained by iteratively applying median filtering to the membership function. The convergence of this method is not proven. Moreover, median filtering tends to smooth the initial model. The method proposed by Le Goualher et al. [8] is similar to the one proposed by Davatzikos [9]. The brain is segmented and its curvature gives a set of curves corresponding to the external part of the sulci. These curves are then deformed toward the bottom of the sulci. The deformation of each curve gives a surface corresponding to a sulcus. These methods cannot be used to obtain the full complex structure of the cortex because some sulci are divided into several surfacic branches. In order to solve all the problems of cortex segmentation, anatomical constraints must be accompanied by a suitable modeling that makes it possible to take into account the deep geometry of the cortex. In this chapter, we present an original cortex segmentation method based on a homotopic cellular deformable model. In Sec. 2, we briefly introduce the cellular model (see Ref. 10 for more details) and present its usefulness for cortex segmentation. In Sec. 3 we introduce our segmentation method, and we present some results in Sec. 4. Section 5 contains concluding remarks.
2 CELLULAR MODELING OF THE CORTEX
The purpose of introducing a new modeling method is to allow the use of a model that can adapt to the deep geometry of the cortex and at the same time ensure the spherical topology of the result. We show in this section that a cellular modeling can be used to meet these requirements. The cellular model is a structure based on cellular complexes and embedded in the cubic grid. This structure is composed of four types of elements (cubes, facets, lines, and points) connected by an adjacency relationship (Fig. 2). The adjacency relationship is defined such that two elements are connected if and only if one is part of the border of the other. Such a cellular structure
FIGURE 2 (a) The cellular model structure, which is composed of different cells: the cubes, the facets between two cubes, the lines between facets, and the points between the lines. (b) The cells used to build cellular objects.
has been proven to be free from topological paradox [11]. Moreover, we showed in a previous work [10] how to characterize the simple cells, which are the cells that can be removed from or added to the model without changing its topology. This characterization is based on local tests performed in the neighborhood of the cell to be checked, which leads to an efficient implementation. Thus it is possible to deform a cellular model with complete control of its topology. The cellular model makes it possible to represent objects with different local dimensions. In other words, a cellular object can be composed of an assembly of volumes, surfaces, and curves (Fig. 3). This property is important to enable representation of both the surfacic and volumic parts of the
FIGURE 3 Representation of objects with different local dimensions. (a) Structures made up of voxels are always three-dimensional. (b) Thin structures can be represented with a cellular model.
sulci in the same model. Moreover, having the ability to apply homotopic deformations to a cellular model allows us to use a deformable model with adaptive local dimensions and a fixed topology. In the next section, we use such a deformable model to segment the cortex.
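To make the structure concrete, the sketch below stores cells on a doubled grid in the spirit of Kovalevsky's abstract cell complexes [11]: the dimension of a cell is the number of odd coordinates, and a cell is adjacent to another when one lies on the border of the other. This encoding is an assumption made here for illustration; the authors' characterization of simple cells [10] is not reproduced.

def cell_dimension(idx):
    """idx: (i, j, k) on the doubled grid; 3 = cube, 2 = facet, 1 = line, 0 = point."""
    return sum(c % 2 for c in idx)

def is_on_border(lo, hi):
    """True if cell lo belongs to the border (closure) of cell hi."""
    if cell_dimension(lo) >= cell_dimension(hi):
        return False
    # along each axis, the lower-dimensional cell either shares the coordinate or
    # sits one step away from an odd (open) coordinate of the higher-dimensional cell
    return all(x == y or (y % 2 == 1 and abs(x - y) == 1) for x, y in zip(lo, hi))

def is_adjacent(a, b):
    return is_on_border(a, b) or is_on_border(b, a)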
3 CORTEX SEGMENTATION
Our segmentation method is based on homotopic deformation of a cellular model. In Sec. 3.1, we briefly show how we first segment the hemispheres of the brain in order to initialize the model and to impose the initial topology. The deformation method is presented in Sec. 3.2.
3.1 Cellular Model Initialization
The cortex we want to segment is located in the hemispheres. Therefore we first segment the hemispheres to initialize the cellular model. This segmentation is done in three steps [12]:

The brain is segmented from the initial 3-D MR image with mathematical morphology operators [5].
The brain stem and the cerebellum are segmented using mathematical morphology operators and removed from the segmented brain. The resulting object represents the hemispheres.
In order to remove cavities and topological tunnels, and thus impose the topology of a filled sphere on the model, two-dimensional cavities are filled in the hemispheres slice by slice.

Once a binary 3-D image representing the hemispheres is obtained (Fig. 4), we use it to initialize a cellular model. This initialization must preserve the connectivity used in the 3-D image. We use the following algorithm:

Set all cells to the label background.
For each hemisphere voxel v:
    Set the cube corresponding to v and all its neighboring cells (of dimension lower than 3) to the label object.
End for
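The following Python sketch mirrors this initialization using the doubled-grid encoding assumed in the earlier sketch (a cube for voxel (x, y, z) sits at odd index (2x+1, 2y+1, 2z+1), and its bordering facets, lines, and points occupy the surrounding indices). It is an illustration, not the authors' implementation.

import numpy as np

def init_cellular_model(hemispheres):
    """hemispheres: binary 3-D voxel array; returns boolean cell labels (True = object)."""
    nx, ny, nz = hemispheres.shape
    cells = np.zeros((2 * nx + 1, 2 * ny + 1, 2 * nz + 1), dtype=bool)   # all background
    for x, y, z in zip(*np.nonzero(hemispheres)):
        # the cube corresponding to voxel (x, y, z) together with all cells on its border
        cells[2 * x:2 * x + 3, 2 * y:2 * y + 3, 2 * z:2 * z + 3] = True
    return cells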
3.2 Deformation Method
3.2.1 Principle
The segmentation process consists in initializing the model on the hemispheres and then penetrating this model towards the interior of the cortical
FIGURE 4 Three-dimensional views (obtained with IDL) of the segmentation result for the brain (a) and for the hemispheres (b).
folds (Fig. 5). The deformations are done by iteratively removing simple cells from the model. The use of a cellular model makes it possible to take into account the thin parts of the sulci by removing cells located between two voxels. We define three types of deformations. First, in order to penetrate into an object it is necessary to go through its surface; therefore we remove the object surface to let the model evolve (Fig. 6c).
FIGURE 5 Diagram of the deformation process. (a) The model is initialized on the external brain surface. (b) The model is penetrated towards the inside of the sulci. (c) Final state: the model surface has merged into the cortex folds and the initial topology is preserved (in 3-D).
FIGURE 6 The three types of deformation of the cellular model. (a) Diagram of a cortical convolution slice (black curve) and of its image representation. (b) Initial cellular model corresponding to the convolution; all the cells belong to the model (for clarity, we put the labels of the image on the cells of the model). (c) Cells corresponding to the model surface are removed from the object (removed cells are in black). (d) Cells corresponding to a thin part of the convolution are removed. (e) Cells corresponding to a CSF volume are removed.
Second, in places where the sulci are very narrow, we deform a surface between the voxels (Fig. 6d). Third, when the model reaches a volume containing CSF, the volume is removed from the object (Fig. 6e). Each deformation consists of removing a simple cell. Therefore, the main part of the deformation process consists in choosing, among all the simple cells, which ones must be removed to deform the model towards the desired solution. This process is done with the following algorithm:

For each simple cell s
    If s belongs to the model limit
        remove s from the model
    Else if s is a cube
        If volume(s) < svolume
            remove s from the model
        End if
    Else if s is a facet
        If surface(s) < ssurface
            remove s from the model
        End if
    End if
End for

In this algorithm, the three main conditions inside the For loop correspond to the three types of deformations. In the following, we further detail the volume deformation and the surface deformation, which are expressed through the cost functions volume and surface; svolume and ssurface are preset thresholds.
3.2.2 Volume Deformation
Volume deformations are based on a function that gives, for each voxel of the MR image (and therefore for each cube of the model), a membership to the cortex value obtained using a classification of the initial MR image based on gray levels (Fig. 7). This value is normalized between 0 and 1. It can be seen as the probability that a voxel belongs to the cortex. For example, a value of 0 means that the voxel is not in the cortex, a value of 1 means that the voxel is in the cortex, and a value of 0.90 indicates that the voxel is most probably in the cortex. We use this membership function to estimate the cortex location in order to guide the model deformation. The membership to the cortex function (called volume) is used to remove the cubes for which the membership to the cortex is too low (i.e., less than svolume). Since the voxels corresponding to CSF have a low membership to the cortex value, removing cubes with low volume allows the model to penetrate the parts of the sulci containing CSF.
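The chapter does not specify the classifier that produces the membership-to-cortex image, so the sketch below uses a simple Gaussian membership function around an estimated cortex (grey-matter) intensity purely as a stand-in; mu_gm and sigma_gm are assumed, user-supplied parameters.

import numpy as np

def cortex_membership(mri, mu_gm, sigma_gm):
    """Return a [0, 1] membership-to-cortex image from gray levels (illustrative stand-in)."""
    m = np.exp(-((mri.astype(float) - mu_gm) ** 2) / (2.0 * sigma_gm ** 2))
    return m / m.max()

# Cubes whose membership falls below the threshold svolume (0.3 in Sec. 4) become candidates
# for removal, which lets the model enter the CSF-filled parts of the sulci.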
3.2.3 Surface Deformation
The cost function surface, which guides the surface deformations inside the folds, is composed of two terms that are also expressed as cost functions. These two terms are combined with a weight value α, which is a parameter of the algorithm.
surface(s) = α · csf(s) + (1 − α) · width(s)    (1)
The goal of the function width is to detect the thin parts of the cortical folds that are not visible in the image due to the partial volume effect. This function is based on the assumption that the cortex is of almost constant
FIGURE 7 Slice of a membership to the cortex image. The grey levels represent values between 0 (black) and 1 (white).
width. Indeed, in places where the cortex surface is not visible, the cortical ribbon is locally twice as wide as the cortex width. This property can be used to extend the cortical surface (Fig. 8). The cost function width uses this property of the cortical ribbon to evaluate the position of a facet compared to the ribbon center. To carry out this evaluation, two functions are used:
FIGURE 8 Diagram of a constant-width ribbon. On the left, there is a structure twice as wide as the ribbon width. On the right, the surface has been prolonged and the width is constant.
The function width(o, dir, l) evaluates the cortex width from a point o in space (actually a facet center) in a direction dir and over a length l. This function is computed by projecting the half-line (o, dir) into the membership to the cortex image and computing the mean value over a length l. The function evolution gives, for each facet s of the model surface, a direction roughly representing the direction in which the surface of the model should be deformed near s. The function width(s) must favor the removal of a facet when it is in the center of a sulcus. Thus it reflects the thickness of the cortex on both sides of the facet by using the function width. Moreover, the removal of the facet makes the internal surface evolve towards the interior of the brain, and thus one should ensure that the facet is not at the bottom of a convolution by evaluating the depth of the cortex in the evolution direction of the facet, in order to preserve the thickness of the cortex in this direction (Fig. 9). The following function is thus obtained:
width(s) = minimum(width(center(s), normal(s), cw), width(center(s), −normal(s), cw), width(center(s), evolution(s), cw))    (2)
The functions center(s) and normal(s) respectively provide the central point of the facet and a normal direction to the facet. The length cw, expressed in millimeters, corresponds to the length on which the thickness around the facet is evaluated. It is a parameter of the algorithm, and its value must at least be equal to the thickness of the cortex. The use of the minimum function to combine the estimated thickness of the cortex makes it possible to favor surface evolution only when there is sufficient cortex around the facet in the three directions.
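Below is a rough Python rendering of Eqs. (1) and (2). The voxel-sized sampling step along the half-lines, the trilinear interpolation, and the precomputation of csf(s) as a neighborhood mean are assumptions made here; center, normal, and evolution are taken as given 3-D vectors attached to a facet.

import numpy as np
from scipy.ndimage import map_coordinates

def width(membership, o, direction, length_mm, voxel_mm=1.0):
    """Mean membership-to-cortex value along the half-line (o, direction) over length_mm."""
    o = np.asarray(o, float)
    d = np.asarray(direction, float)
    d = d / np.linalg.norm(d)
    n = max(int(round(length_mm / voxel_mm)), 1)
    pts = o[:, None] + d[:, None] * (np.arange(1, n + 1) * voxel_mm)   # (3, n) sample points
    return map_coordinates(membership, pts, order=1, mode='nearest').mean()

def surface_cost(membership, csf_mean, center, normal, evolution, alpha=0.7, cw=4.0):
    """Eq. (1), with width(s) computed as the minimum of Eq. (2)."""
    w = min(width(membership, center, normal, cw),
            width(membership, center, -np.asarray(normal, float), cw),
            width(membership, center, evolution, cw))
    return alpha * csf_mean + (1.0 - alpha) * w

Facets whose surface_cost falls below the threshold ssurface (0.65 in Sec. 4) are candidates for removal.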
FIGURE 9 Evaluation of the cortex thickness around a facet. The considered facet is in white. n corresponds to a normal to the facet and e represents the evolution direction of the facet.
FIGURE 10 Segmentation of the cerebrospinal fluid in the sulci. (From Ref. 13.)
In spite of the control over the topology of the cellular model, the deformation space is huge, and thus the use of a single local criterion to guide the model would make the result too sensitive to noise. We therefore choose to combine the guidance method that we have just presented with a more global criterion based on a preliminary segmentation of the CSF in the sulci. The CSF, which is dark in the images, can be partly segmented by the traditional tools of mathematical morphology [13] (Fig. 10). This detection can be used as a guide to evolve the internal surfaces. The deformations of the cellular model are guided by favoring displacements along surfaces close to the CSF in the sulci, computing csf(s) as the mean CSF value in the neighborhood of s.
4 RESULTS
To present the results and evaluate the localization of the model surface in the brain, we developed a visualization tool allowing the superimposition of
the model surface on an image slice. This method displays, over the slices, the surface facets (either internal or external) that are located in the slice and are perpendicular to it. We present the surface superimposed on the membership to the cortex image in order to visualize the model surface location as compared to the cortex fold locations. Figure 11 presents the results on a part of the brain. It shows the model evolution at different steps of the algorithm. One step corresponds to a traversal of all the simple cells of the model. In this example, we use the following parameters:

svolume = 0.3. This threshold is used with the volume function to deform CSF volumes in sulci. Its values are between 0 and 1. Low values represent high constraints (i.e., few deformations) and high values represent low constraints. This parameter is independent of the others, and its influence on the result is continuous.
ssurface = 0.65. This threshold is used with the surface function to guide the surface deformations in the sulci. Its limit values are the same as those of svolume. The choice of this parameter is related to the parameter α. Both parameters must represent a compromise between detection, localization, and regularity of the sulci.
α = 0.7. This parameter is a weight used to guide surface deformations. Its values are between 0 and 1 and represent a compromise between the use of width and csf to guide the model.
l = 4 mm. This parameter represents the evaluated cortex width. It does not have well-defined limits, but it is generally assumed that the cortex width is between 2 and 5 millimeters.

The surface topology is preserved during the deformation process. Although the resulting model surface is irregular and seems to be disconnected on the two-dimensional slices, the three-dimensional connectivity is preserved (this is difficult to see on a two-dimensional representation). Unfortunately, irregularities on the surface prevent the use of a three-dimensional visualization. However, in spite of the lack of regularity of the surfaces, the results show that the internal surfaces are generally well localized in the convolutions. The cost functions used during the deformation process are good guiding criteria for deformable-model-based cortex segmentation.
5 CONCLUSION
In this chapter, an original method is introduced for the segmentation of cortical surface using a homotopic cellular deformable model. This method is based on the deformations of a cellular object initialized on the volume of the brain and then deformed towards the interior of the deep cortical
FIGURE 11 Model evolution during the segmentation process. Parameters are svolume = 0.3, ssurface = 0.65, α = 0.7, and l = 4 mm.
convolutions. The cellular representation, which makes it possible to evolve surfaces and volumes simultaneously within the same model, can thus adapt to the geometry of the sulci, which are made up of both surfacic and volumic parts. The lack of regularity of the model surface makes the qualitative interpretation of the results difficult. However, it appears that the two guidance terms that we used to deform surfaces are complementary and allow detection and correct localization of the sulci, in particular in the deepest parts of the brain. Our segmentation method takes into account several inherent problems of cortex segmentation (the tree structure of the sulci, the partial volume effect, the cortex topology, and the presence of both volumes and surfaces in the sulci) that are not considered together by other methods. The cellular deformable model is embedded in the cubic grid. Therefore the results can be used directly in the original MRI space, which is not the case with traditional deformable models. Moreover, other approaches separate volume and surface modeling. Our approach gives a single model that represents both the cerebral volume and the cortex surface.
FIGURE 12 Model evolution in sulci junctions (a) and in CSF volumes (b).
The use of a homotopic cellular deformable model to segment the cortex allows us to reach two objectives: it takes into account the junctions of the deep sulci (Fig. 12a) as well as the volumes of CSF in the sulci (Fig. 12b).
REFERENCES
1. M. Wagner, M. Fuchs, H.-A. Wischmann, K. Ottenberg, O. Dössel. Cortex segmentation from 3D MR images for MEG reconstruction. In: Biomagnetism: Fundamental Research and Clinical Applications. Elsevier Science, IOS Press, 1995, pp. 433–438.
2. X. Zeng, L. H. Staib, R. T. Schultz, J. S. Duncan. Segmentation and measurement of the cortex from 3D MR images. In: Medical Image Computing and Computer-Assisted Intervention—MICCAI'98, Vol. 1496. Cambridge, MA, 1998, pp. 519–530.
3. R. Malladi, J. A. Sethian, B. C. Vemuri. Shape modeling with front propagation: a level set approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(2):158–174, 1995.
4. J. A. Sethian. Level set methods: evolving interfaces in geometry, fluid mechanics, computer vision and material science. Cambridge University Press, 1996.
5. J.-F. Mangin, O. Coulon, V. Frouin. Robust brain segmentation using histogram scale-space analysis and mathematical morphology. In: Medical Image Computing and Computer Assisted Intervention (MICCAI'98), Lecture Notes in Computer Science. Cambridge, MA, Springer-Verlag, 1998, pp. 1230–1241.
6. P. C. Teo, G. Sapiro, B. A. Wandell. Creating connected representations of cortical gray matter for functional MRI visualization. IEEE Transactions on Medical Imaging 16(6):852–863, 1997.
7. D. L. Pham, C. Xu, L. Prince. Finding the brain cortex using fuzzy segmentation, isosurfaces and deformable surface models. In: XVth International Conference on Information Processing in Medical Imaging (IPMI'97). Poultney, Springer-Verlag, 1997, pp. 399–404.
8. G. Le Goualher, E. Procyk, D. L. Collins, R. Venugopal, C. Barillot, A. C. Evans. Automated extraction and variability analysis of sulcal neuroanatomy. IEEE Transactions on Medical Imaging 18(3):206–217, 1999.
9. C. Davatzikos. Using a deformable surface model to obtain a shape representation of the cortex. IEEE Transactions on Medical Imaging 15(6):785–795, 1996.
10. Y. Cointepas, I. Bloch, L. Garnero. A cellular model for multi-objects multi-dimensional homotopic deformations. Pattern Recognition 34(9):1785–1798, 2001.
11. V. A. Kovalevsky. Finite topology as applied to image analysis. Computer Vision, Graphics, and Image Processing 46:141–146, 1989.
12. Y. Cointepas. Modélisation topologique et segmentation tridimensionnelles du cortex cérébral à partir d'IRM pour la résolution des problèmes directs et inverses en EEG et en MEG. Ph.D. thesis, École Nationale Supérieure des Télécommunications, October 1999.
13. T. Géraud. Segmentation des structures internes du cerveau en imagerie par résonance magnétique tridimensionnelle. Ph.D. thesis, École Nationale Supérieure des Télécommunications, June 1998 (ENST 98 E 012).
10 Feature Space Analysis of MRI Hamid Soltanian-Zadeh Henry Ford Health System, Detroit, Michigan and University of Tehran, Tehran, Iran
1 INTRODUCTION
The gray levels of images in magnetic resonance imaging (MRI) depend on several tissue parameters, including proton density (PD), spin-lattice (T1) and spin-spin (T2) relaxation times, flow velocity (v), and chemical shift (δ). Using different protocols and pulse sequences, MRI generates multiple images of the same anatomical site. The resulting images are called multiparametric (multispectral) images or an MRI scene sequence. They contain multiple gray levels (features) for each voxel, defining a multidimensional feature space. The information existing in this feature space may be used for image analysis (e.g., image segmentation and tissue characterization). A major goal of medical image analysis is to extract important clinical information that would improve diagnosis, treatment, and follow-up of the disease. In this chapter, a novel method is presented for the analysis of the multidimensional MRI feature space. The method is presented in the context of its application to brain images. In brain MRI, the existence of abnormal tissues may be easily detectable. However, accurate and reproducible segmentation and characterization
of abnormalities are not straightforward. For instance, a major problem in tumor treatment planning and evaluation is the determination of the tumor extent. Clinically, T2-weighted and gadolinium-enhanced T1-weighted MRI have been used to indicate regions of tumor growth and infiltration [1,2]. Conventionally, simple thresholding or region-growing techniques have been utilized for each image to segment the tissue or volume of interest for diagnosis, treatment planning, and follow-up of the patients. These methods are not effective in many image analysis tasks such as segmentation and characterization of brain tumors. These tumors may be composed of confluent areas of coagulation necrosis, compact areas of anaplastic cells, and areas of adjacent brain parenchyma infiltrated by tumor cells. They may be surrounded by reactive astrocytes and a rim of edema [3,4]. Without using all of the information obtained from different MRI protocols, segmentation and characterization of the tumor compartments are not feasible. Advanced image analysis techniques have been, and still are, being developed to use MRI data optimally and solve the problems associated with the previous techniques [5]. These approaches include artificial neural networks [6–9], a variety of clustering techniques [10–19], eigenimage filtering [20–26], and an optimal feature space method [27]. Some of these methods are presented in other chapters. The main focus of this chapter is a linear method specifically developed to visualize MRI feature space and analyze the data using the results. Pre- and postprocessing approaches are briefly presented to make the chapter self-contained. Before giving details of the methods, we introduce the related concepts and approaches. In feature space analysis of brain MRI, the input images are first preprocessed. Preprocessing consists of (1) registration of multiple MRI studies; (2) segmentation of intracranial cavity from skull, scalp, and background; (3) correction of image nonuniformities; and (4) noise suppression. Then a transformation is applied to extract features and generate a visualizable feature space (cluster plot) in which normal tissues are clustered in prespecified locations and abnormalities are clustered elsewhere. Finally, using the cluster plots, images are segmented into their components [27]. The segmented images may be used for clinical studies such as guided biopsy procedures, or they may be processed by an image classifier which assigns the segmented regions to one of several objects. The results may be used for 3-D visualization, or they may be further analyzed by an image understanding system which determines the relationships between different objects to form a scene description. These image analysis steps are shown in Fig. 1. Vannier et al. [28] presented the first work in which a feature space representation of MRI (in their terminology, multispectral MRI data) was used for tissue segmentation and characterization. Clarke et al. [29] have reviewed MRI segmentation work in recent years. As they explained, feature
FIGURE 1 Top, flowchart of a general image analysis system. Bottom, flowchart of an MRI analysis system.
extraction is a crucial step for image segmentation. Feature extraction is applied in computer vision systems as a step before the image segmentation and classification steps to improve the final results [30]. Soltanian-Zadeh et al. [27,31] pioneered the work on MRI feature extraction. They developed a linear transformation to (1) generate data for a meaningful and standard visualization of the feature space, i.e., visualization of the multidimensional histogram of the data (cluster plot); and (2) improve object identification, segmentation, and characterization results. Later, Velthuizen et al. [32–34] developed a linear transformation for MRI feature extraction to improve image segmentation results using fuzzy C-means clustering. Many researchers (e.g., in Refs. 35–40) have used cluster plots for MRI segmentation and interpretation. Generation of a cluster plot for the MRI feature space constrains its dimensionality: as the dimensionality of the feature space increases, its visualization becomes more difficult. One-dimensional (1-D) feature spaces can be visualized using a conventional histogram, and two-dimensional (2-D) feature spaces using an image whose pixel intensity is proportional to the number of data points in a certain range (a 2-D cluster plot). A 3-D cluster plot can also be generated by drawing three axes of an orthogonal coordinate system in an image for 3-D perception and making the image pixel gray levels proportional to the number of data points in certain ranges. The 3-D cluster plot can be treated as a 3-D object and rotated to see it from different directions, to gain insight into the distribution of clusters in the feature space. A 4-D feature space can be generated by creating 3-D feature
spaces and showing them in a loop, i.e., using time as the fourth axis. This visualization will be limited in that the operator cannot draw regions of interest (ROIs) on it and find the corresponding pixels in the image domain. More difficulties are involved in visualizing feature spaces of dimensions higher than four. Therefore researchers have restricted their attention to feature spaces with dimensions not larger than three. Spatial and feature domain representations are explained in the following section.
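As an illustration of the 2-D cluster plot just described, the sketch below bins the gray levels of two registered feature images (for example, a T1-weighted and a T2-weighted image of the same slice) into a two-dimensional histogram whose intensity is proportional to the number of voxels per bin; the bin count and the logarithmic display suggestion are arbitrary choices.

import numpy as np

def cluster_plot_2d(feature_a, feature_b, bins=256):
    """Return a 2-D cluster plot (joint histogram) of two co-registered feature images."""
    hist, _, _ = np.histogram2d(feature_a.ravel(), feature_b.ravel(), bins=bins)
    return hist                     # display, e.g., log(1 + hist) as an image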
1.1 Spatial and Feature Domain Representations
An MRI scene sequence shows spatial locations of different tissues, with a different contrast in each image. It can therefore be considered as a spatial domain representation of tissues. In a spatial domain representation, pixels corresponding to a specific tissue are locally connected but may be distributed over different sections of the image. A feature space (domain) representation of tissues can be generated from an MRI scene sequence. In a feature space representation, pixels corresponding to a specific tissue are connected as clusters, even though their spatial locations in the image domain may be far apart. To prepare data for a feature space representation, certain features are selected or extracted from the images. The features may be the pixel gray levels or other features calculated from them (e.g., edge and texture). Using the pixel gray levels of an MRI scene sequence consisting of n images, a feature space representation can be generated by defining an n-dimensional pixel vector for each pixel in the image (spatial) domain. The pixel gray levels of the same location from different images in the sequence will be elements of the pixel vector. Image analysis tasks that can be accomplished using feature space representations include (1) identification of objects, (2) segmentation of objects, (3) characterization of objects, and (4) generation of quantitative measurements on objects. The image analysis results can be used in decision making (diagnosis, treatment planning, and evaluation of treatment). Based on pixel intensity features and those features extracted from them by applying a transformation, methods of preparing data for an MRI feature space representation can be partitioned into three categories: (1) tissue-parameter-weighted images (e.g., [41,42]), (2) explicit calculation of tissue parameters (e.g., [43,44]), and (3) linear transformations (e.g., generation of color composite images [45,46] or principal component analysis (PCA) [47]) and nonlinear transformations (e.g., angle images [48]). Categories 1 and 2 require acquisition of multiple images using specific MRI protocols. A difficulty with category 2 is that it requires protocols that are usually different from those routinely used in clinical studies. In addition, noise propagation, through the required nonlinear calculations, combines
with the model inaccuracies and yields unsatisfactory results [49–51]. As illustrated in Ref. 52, due to the nonlinearity of the transformation from pixel intensity space to tissue parameter space, optimal linear decision functions in the intensity space translate to nonlinear decision functions in the tissue parameter space. Thus, unless these nonlinear decision functions are used, the decision is not optimal. Category 3 can be applied to any MRI scene sequence and can improve the clustering properties of the data for the feature space representation while reducing its dimensionality. However, general purpose transformations are not appropriate for MRI. For instance, a difficulty with the feature space generated by principal component images is the limitation related to the size of the objects in the scene. Small objects make slight contributions to the covariance matrix and thus are not enhanced and visualized in the first few principal component images. The first few principal component images are normally used for the feature space representation, since they have the best signal-to-noise ratios (SNRs). A concern with angle images is the nonlinearity of the transformation, which generates curves, rather than lines, for partial volume regions in the cluster plot. Soltanian-Zadeh et al. [27] have developed an optimal linear transformation to prepare MRI data for feature space analysis. Their method, which is related to discriminant analysis (DA) [53], uses some class information (signature vectors for normal tissues) and is specifically tailored to MRI, making it most appropriate for this application. The features extracted from the images define clusters in the feature space. There is a correspondence between these clusters and the tissue types in the image. This correspondence is used to segment the image using the information present in the cluster plot. Before detailing the feature extraction and segmentation methods, we review preprocessing methods suitable for MRI analysis. These methods use intraframe and interframe information to achieve the best performance. In developing many of the methods, as explained throughout the chapter, vector space notations and methods have been used.
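To make the pixel-vector construction described above concrete, the short Python sketch below stacks a co-registered scene sequence into n-dimensional pixel vectors and forms a 2-D cluster plot from two of the features. It is an illustrative example only, not code from the cited studies; the array names, image sizes, and synthetic data are assumptions.

import numpy as np

def pixel_vectors(images):
    """Stack n co-registered images (each H x W) into an (H*W, n) matrix whose rows
    are the n-dimensional pixel vectors."""
    stack = np.stack(images, axis=-1)            # H x W x n
    return stack.reshape(-1, stack.shape[-1])    # (H*W) x n

def cluster_plot_2d(feature_a, feature_b, bins=256):
    """2-D cluster plot: an image whose gray level is proportional to the number of
    pixel vectors falling into each (feature_a, feature_b) bin."""
    hist, _, _ = np.histogram2d(feature_a.ravel(), feature_b.ravel(), bins=bins)
    return hist

if __name__ == "__main__":
    # Synthetic stand-ins for a registered scene sequence (e.g., PD-, T2-, T1-weighted images).
    rng = np.random.default_rng(0)
    seq = [rng.normal(100 + 20 * k, 10, size=(128, 128)) for k in range(3)]
    V = pixel_vectors(seq)                       # one n-dimensional vector per spatial location
    plot = cluster_plot_2d(V[:, 0], V[:, 1])     # cluster plot of the first two features
    print(V.shape, plot.shape)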
2 PREPROCESSING
Preprocessing consists of (1) registration of multiple MRI studies; (2) segmentation of the intracranial cavity from skull, scalp, and background; (3) correction of image nonuniformities; and (4) noise suppression. Methods designed for these tasks are introduced in the following sections.
2.1 Registration
In order to follow sequential changes that may occur over time, it is necessary to register the image sets obtained at different times. Also, if the
patient moves between different scans, images should be registered before multispectral image processing and analysis is applied (see Fig. 2). Several methods have been proposed for medical image registration, e.g., [26,55–60]. These techniques can be partitioned into three categories: (1) landmark-based (point matching), (2) surface-based (surface matching), and (3) intensity-based (volume matching) methods. Surface-based methods may be preferred to landmark-based methods because they do not need landmarks, and they may be preferred to intensity-based methods because they are faster. Most of the surface-based methods utilize
FIGURE 2 An illustration of image registration. A, an axial T1-weighted MRI of a brain tumor, with skin edges (contour) overlaid. B, corresponding axial T2-weighted MRI, with contour of the T1-weighted image overlaid to show the need for registration. C, the T2-weighted image after being registered to the T1-weighted image, with the contour of the T1-weighted image overlaid to illustrate the match generated by the image registration method.
the head surface, brain surface, or inner/outer surface of the skull to estimate rotation and translation parameters (see [26,56–58] for details). The surface is usually characterized by a set of edge or contour points extracted from cross-sectional images. Manual drawing of the contours is very time consuming. The automatic extraction of the contour points by standard edge-based, region-based, or classification-based algorithms has some problems [61]. Both region-based and classification-based algorithms are affected by inhomogeneity artifacts. Edge-based techniques are affected by the partial volume effects creating wide transition zones between tissue types. Researchers have developed a variety of complex and heuristic systems to overcome these difficulties and to automate the contour extraction procedure for specific applications [61–65]. A thorough survey of the recent work in the area of contour extraction for intracranial cavity segmentation is given in Ref. 61. Because it is simple and can follow an appropriate path in the middle of the partial volume regions, an edge-tracking algorithm similar to the one proposed by Henrich et al. [66] can be used. In practice, this method has difficulties caused by the discontinuity of edges in the back of the eyes and ears and sometimes by edge discontinuities resulting from previous surgery or an inadequate field of view (see Fig. 3). To solve this problem, Soltanian-Zadeh and Windham [67] developed an automated method that uses a multiresolution pyramid to connect edge discontinuities. A flow chart of the algorithm is presented in Fig. 4 and an example is shown in Fig. 5.
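As a minimal illustration of estimating rotation and translation from matched surface or contour points, the following sketch uses a least-squares (Procrustes/SVD) fit. It is a generic point-matching example under assumed inputs, not the surface-matching or edge-tracking implementations of Refs. 26, 56–58, or 67.

import numpy as np

def rigid_fit(src, dst):
    """Least-squares rotation R and translation t such that R @ src_i + t ~ dst_i.
    src, dst: (N, 2) arrays of corresponding contour points (Kabsch/Procrustes)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance of the centered points
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # guard against reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

if __name__ == "__main__":
    theta = np.deg2rad(7.0)
    R_true = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    pts = np.random.default_rng(1).uniform(0, 100, size=(200, 2))   # assumed skin-edge points
    moved = pts @ R_true.T + np.array([3.0, -5.0])
    R, t = rigid_fit(pts, moved)
    print(np.allclose(R, R_true), np.round(t, 2))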
2.2 Warping
In MRI analysis, image warping is needed to overlay the histological images onto the MRI in order to evaluate tissue segmentation and characterization results. In addition, image warping is needed to match the geometry of the images acquired using an echo-planar imaging (EPI) pulse sequence (which has geometrical distortions) to those acquired using standard protocols. Several warping methods have been proposed (e.g., in Refs. 68–71). Ghanei et al. [72] developed and tested three different methods for warping brain MRIs. They utilized a deformable contour to extract and warp the boundaries of the two images. In their method, a mesh-grid coordinate system is constructed for each brain by applying a distance transformation to the resulting contours and scaling. In the first method, the first image is mapped to the second image based on a one-to-one mapping between different layers defined by the mesh-grid system. In the second method, the corresponding pixels in the two images are found using the above-mentioned mesh-grid system and a local inverse-distance weight interpolation.
FIGURE 3 Four representative MRIs in which the edge-tracking algorithm does not find correct contours because of the soft tissue discontinuity. a, discontinuity due to ears. b, discontinuity due to eyes. c, discontinuity due to inadequate field of view. d, discontinuity due to previous surgery. (From Ref. 67.)
In the third method, a subset of grid points is used to find the parameters of a spline transformation, which defines the global warping. The warping methods have been applied to clinical MRI consisting of diffusion-weighted EPI and T2-weighted spin-echo images of the human brain. The second and third methods were superior to the first (p < 0.01) in the diagnostic quality of the warped MRI, as ranked by a neuroradiologist [72]. An example is shown in Fig. 6.
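The sketch below interpolates a dense displacement field from matched contour points with inverse-distance weighting, in the spirit of the second warping method; it is a simplified, assumed formulation rather than the mesh-grid implementation of Ref. 72.

import numpy as np

def idw_displacement(points_src, points_dst, grid_shape, power=2.0, eps=1e-6):
    """Interpolate a dense displacement field from matched point pairs using
    inverse-distance weighting (IDW), a simplification of mesh-grid based warping."""
    h, w = grid_shape
    yy, xx = np.mgrid[0:h, 0:w]
    grid = np.stack([yy.ravel(), xx.ravel()], axis=1).astype(float)      # (h*w, 2)
    disp_at_points = points_dst - points_src                             # (N, 2)
    d = np.linalg.norm(grid[:, None, :] - points_src[None, :, :], axis=2)  # (h*w, N)
    wgt = 1.0 / (d + eps) ** power
    wgt /= wgt.sum(axis=1, keepdims=True)
    dense = wgt @ disp_at_points                                         # (h*w, 2)
    return dense.reshape(h, w, 2)

if __name__ == "__main__":
    src = np.array([[10.0, 10.0], [10.0, 50.0], [50.0, 10.0], [50.0, 50.0]])
    dst = src + np.array([2.0, -1.0])            # a pure shift of the contour points
    field = idw_displacement(src, dst, (64, 64))
    print(field.shape, np.round(field[32, 32], 2))   # ~[2, -1] everywhere for a pure shift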
2.3 Intracranial Segmentation
The image background does not usually contain any useful information but complicates the image restoration and tissue segmentation/classification, and increases the processing time. It is therefore beneficial to remove the image background before image restoration and analysis begins.
FIGURE 4 Steps of the multiresolution approach for automatic contour extraction. (From Ref. 67.)
In addition, in brain studies, tissues such as scalp, eyes, and others that are outside of the intracranial cavity are not of interest. Hence it is preferable to segment the intracranial cavity volume away from scalp and background. Traditionally, a supervised method using thresholding and morphological operators [73–76] was used for this segmentation. Soltanian-Zadeh and Windham [67] and Atkins and Mackiewich [77] developed automated methods to improve accuracy, reproducibility, and speed of segmentation.
FIGURE 5 Steps of the multiresolution approach for connecting gaps in MR images. a0–a4, skin edges extracted from the MR image shown in Fig. 3 and four lower-resolution versions of it, respectively. b4, exterior edge of a4. b3, points on the exterior edge of a3, found by selecting edge points in a3 that are less than 2 units of chessboard distance from a blowup of b4. Similarly, b2 is found using b3 and a2; b1 is found using b2 and a1; and b0 is found using b1 and a0. (From Ref. 67.)
2.4 Nonuniformity Correction
MRI brain images acquired using standard head coils may suffer from several possible sources of nonuniformity, including (1) main field (B0); (2) the time domain filter applied prior to Fourier transformation, in the frequency encoding direction; (3) uncompensated gradient eddy currents; (4) transmitted and received radio-frequency (RF) field; (5) RF penetration depth effects; and (6) RF standing wave effects.
FIGURE 6 An illustration of image warping. A, a T2-weighted image. B, corresponding diffusion-weighted image with the contours of the T2-weighted image to show the mismatch. C, warped diffusion-weighted image with the contours of the T2-weighted image to show the match after warping.
Simmons et al. [78,79] have investigated the magnitude of these effects on clustering properties of MRI data acquired using a GE Signa system. The first effect is usually corrected by using a multiple spin-echo sequence. Condon et al. [80] have discussed methods for correcting the second effect. However, since most of the current scanners use digital filters whose effect on the image is limited to 2 or 3 pixels at the edge of the image, this correction is usually unnecessary. The third effect, on modern MRI systems that are equipped with shielded gradients, is small for spin-echo sequences at the long repetition times used in brain studies. The fourth effect needs to be estimated and used to correct MRI scans [81]. Approaches for estimating this nonuniformity are explained in the next paragraph. The fifth and sixth effects are normally negligible in brain studies, so no correction is necessary for them. Ignoring random noise, the measured MRI pixel gray level Pij can be related to the true MRI signal by the relation Pij = Aij Iij, where Aij is the nonuniformity factor at location (i, j) in the image, and Iij is the artifact-free intensity value at the same location. A number of approaches to the correction of radio-frequency-induced intensity variations have been proposed (e.g., in Refs. 82–84). All these methods rely upon the division of the acquired image by a reference image that approximates the nonuniformity profile Aij but differ in the way the reference image is obtained. One approach is to use a water or oil phantom. These phantoms are cylindrical plastic containers of about the same size as an average human head (about 20 cm in diameter and 25 cm in height) filled with water or solid oil. Using an oil phantom has the advantage over a water phantom of avoiding spurious radio-frequency (RF) penetration depth and standing wave
effects. Both these phantoms have limitations associated with their nonuniformity pattern changing over time and with loading of the coil. Therefore approaches such as those explained below, which estimate the nonuniformity pattern from the acquired images (in a reasonable amount of time), are more appropriate.

1. Assuming that the inhomogeneity in the RF coil sensitivity manifests itself as a low-frequency component, Aij is estimated by smoothing the image using a 33 × 33 kernel of 1's [85]; a simplified sketch of this idea follows the list. We have modified this approach by not averaging the background pixels, to avoid the artifacts generated around the outside of the brain. Also, CSF and other high-contrast regions are replaced by the average of white and gray matter values to avoid other edge artifacts.

2. Making the assumptions that T1, T2, and PD values for a single tissue type do not vary significantly across a particular slice, and that reference points for at least one tissue type can be identified across the image, an intensity surface is fitted to the reference points to estimate the nonuniformity profile. Two methods (direct and indirect) have been proposed [86]. In the direct method, the user selects multiple points to define a tissue type across the field of view. In the indirect method, the user selects an initial point, and the reference points are selected automatically throughout the image using a similarity criterion. Then, using the basis functions F_i = d_i^2 ln(d_i), where d_i is the Euclidean distance, an intensity pattern is fitted to the reference points by a least-squares fit. This method has been used by researchers at the Henry Ford Health System and elsewhere [87]. An example is shown in Fig. 7.
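A minimal sketch of the first approach follows: the nonuniformity profile Aij is approximated by heavily smoothing the image over brain pixels and then divided out. The masking, kernel handling, and synthetic test data are simplifications and assumptions, not the exact procedure of Ref. 85 or our modification of it.

import numpy as np
from scipy.ndimage import uniform_filter

def correct_nonuniformity(image, brain_mask, kernel=33):
    """Estimate the low-frequency nonuniformity profile A by smoothing the image with a
    large kernel over brain pixels only, then divide it out (I = P / A, up to a scale)."""
    img = image.astype(float) * brain_mask
    # Normalize by the smoothed mask so background pixels do not bias the estimate.
    num = uniform_filter(img, size=kernel)
    den = uniform_filter(brain_mask.astype(float), size=kernel)
    profile = np.where(den > 0, num / np.maximum(den, 1e-9), 0.0)
    mean_level = img[brain_mask > 0].mean()
    corrected = np.where((brain_mask > 0) & (profile > 0),
                         image * mean_level / np.maximum(profile, 1e-9), image)
    return corrected, profile

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    yy, xx = np.mgrid[0:128, 0:128]
    bias = 1.0 + 0.3 * (xx / 127.0)                     # synthetic RF shading
    truth = 100.0 * np.ones((128, 128))
    mask = ((yy - 64) ** 2 + (xx - 64) ** 2) < 55 ** 2
    observed = truth * bias * mask + rng.normal(0, 2, (128, 128)) * mask
    corrected, _ = correct_nonuniformity(observed, mask.astype(float))
    # The intensity spread inside the brain mask should shrink after correction.
    print(round(float(corrected[mask].std()), 1), "vs", round(float(observed[mask].std()), 1))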
2.4.1 Tissue Inhomogeneities

A tissue type may have biological variations throughout the imaged volume. For example, biological properties of white matter in the anterior and posterior of the brain are slightly different. A tissue type may also have biological heterogeneity in it; many brain lesions are naturally heterogeneous. These cause variations of signal intensity for a single tissue in the imaged volume. The feature space representation of the entire volume may therefore be spread out, i.e., clusters for different tissues may overlap. Sources of this variation include the difference in the proton density and T1 and T2 relaxation times from voxel to voxel. These differences generate a different multiplicative factor for each of the image gray levels that cannot be easily corrected. Therefore feature space analysis is not recommended for the entire 3-D volume in one stage; superior results are obtained using a slice-by-slice analysis approach.
FIGURE 7 Illustration of nonuniformity correction. a, original image. b, corrected image. c, estimated nonuniformity pattern.
2.5 Noise Suppression (Image Restoration)
Noise limits the performance of both human observers and computer vision systems. As such, noise should be suppressed (images should be restored) prior to inputting data to image segmentation and classification algorithms. To reduce the computation time, noise suppression (image restoration) is performed after intracranial volume segmentation. General purpose filters such as low-pass, Wiener, median, or anisotropic diffusion filters may be used. However, a filter specifically designed for MRI [88] generates superior results. Details of the approach are beyond the scope of this chapter. In the following, we briefly introduce the filter. The filter is multidimensional, nonlinear, and edge preserving. It has been developed for multiparameter (multispectral) MR image restoration (noise suppression), where multiple images of the same section are processed together. The filter uses both intraframe (spatial) and interframe (temporal) information by considering a neighborhood around each pixel, calculating the Euclidean distance of each pixel vector in this neighborhood from the pixel vector in the center, and comparing the result with a preset threshold value. The threshold value is found by calculating probabilities of detection (PD) and false alarm (PF). The filter uses a signal model in applications for which a good model exists for the interframe information, e.g., the exponential model for multiple spin-echo images. It also uses the widely used zero-mean white Gaussian model for statistical noise. The filter works as follows. A neighborhood around each pixel is considered. The Euclidean distance of each pixel vector in the neighborhood
from the pixel vector in the center is found. If this distance is smaller than a specific threshold value, the pixel vector is included in the least squares (LS) or maximum likelihood (ML) estimation; otherwise it is not. The threshold is selected based upon the noise standard deviation in the images, the contrast between adjacent tissues, and partial volume averaging effects, which are reflected in the sharpness of edges in each image. In practice, the threshold is calculated based on the probabilities of detection and false alarm. An approximate LS or ML estimate for the average of the contributing pixel vectors is determined and saved for all of the contributing pixel vectors. Then the neighborhood is moved and the procedure is repeated. Finally, the average of the several estimates obtained for a particular pixel vector is calculated to obtain the filter output for that pixel vector. Theoretical and experimental results have shown that for processing MRI scene sequences with four or five images per slice, a 9 × 9 neighborhood and a threshold value of 4 generates optimal results [88]. Figure 8 shows a sequence of four T2-weighted and a T1-weighted MRI of a brain tumor, after registration and intracranial segmentation. It also shows noise-suppressed images generated using the filter described above. Transformed images created for the lesion by applying the optimal transformation method (explained later in Sec. 3.4) to the original and noise-suppressed images and the corresponding difference image are also shown. Note the quality improvement in the original images and the lesion image, generated by the noise suppression filter. This figure illustrates the significance of noise suppression prior to feature space analysis.
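The following much-simplified sketch conveys the idea of the multispectral filter: only neighborhood pixel vectors whose Euclidean distance from the center vector is below a threshold contribute to the average. It is not the LS/ML filter of Ref. 88; the neighborhood handling, threshold, and test data are assumptions.

import numpy as np

def multispectral_filter(stack, threshold, half=4):
    """Simplified multispectral noise suppression: for each pixel, average only those
    pixel vectors in a (2*half+1)^2 neighborhood whose Euclidean distance from the
    center pixel vector is below `threshold`, so that edges are preserved.
    stack: array of shape (n_images, H, W) of co-registered images."""
    n, h, w = stack.shape
    out = np.zeros_like(stack, dtype=float)
    padded = np.pad(stack, ((0, 0), (half, half), (half, half)), mode="edge")
    for y in range(h):
        for x in range(w):
            block = padded[:, y:y + 2 * half + 1, x:x + 2 * half + 1]   # n x 9 x 9
            center = stack[:, y, x][:, None, None]
            dist = np.sqrt(((block - center) ** 2).sum(axis=0))
            keep = dist < threshold                 # the center pixel is always kept
            out[:, y, x] = block[:, keep].mean(axis=1)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    clean = np.zeros((4, 64, 64)); clean[:, :, 32:] = 100.0             # a sharp edge
    noisy = clean + rng.normal(0, 5, clean.shape)
    filtered = multispectral_filter(noisy, threshold=40.0)
    print(round(float(np.abs(filtered - clean).mean()), 2),
          "vs", round(float(np.abs(noisy - clean).mean()), 2))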
3 FEATURE SELECTION OR EXTRACTION
Brain tumors are normally large, and detection of their existence is simple. They may be found by a symmetry analysis of the image gray levels in the axial images, since they generate significant gray level asymmetry in these images. Detection of multiple zones in the tumor and accurate estimation of the tumor extent are, however, difficult, and different image analysis approaches may be employed for this purpose. We present a feature space method and the issues related to it in the remainder of this chapter. A critical step in this approach is feature extraction. Conventional methods include explicit calculations of the tissue parameters [43,44], tissue parameter weighted images [41,42], principal component images [47,50], and angle images [48]. Explicit calculation of the tissue parameters is not usually recommended (and not explained in this chapter) for the following reasons: (1) this approach requires data acquired using several specific pulse sequences that are different from routine clinical protocols [43,44], (2) the results are prone to noise propagation through the nonlinear calculations involved [51], and (3) the optimal linear decision boundaries for the resulting feature space do not match those of the original data [52]. Clinically feasible techniques are described next.
FIGURE 8 O1–O5, four T2-weighted multiple spin echo images (TE/TR = 25–100/2000 ms) and a T1-weighted image (TE/TR = 20/500 ms) of a brain tumor, respectively, after registration and intracranial segmentation. A, transformed image created for the lesion by applying the optimal transformation method explained in Sec. 3.4 to images O1–O5. R1–R5, noise suppressed images generated using the filter described in Sec. 2.5. B, transformed image created for the lesion by applying the optimal transformation method to images R1–R5. C, difference image generated by subtracting image B from image A. Note the quality improvement in the original images and the lesion image generated by the noise suppression filter. Image C illustrates that the filter has suppressed the noise without removing useful image information.
3.1 Tissue Parameter Weighted Images
Using a multiple spin-echo pulse sequence with a short echo time (e.g., TE = 25 ms) and a long repetition time (e.g., TR = 2500 ms), a sequence of four images (corresponding to TE = 25, 50, 75, 100 ms) can be acquired. The contrast in the first image is mainly due to the PD difference between tissues; thus it is called PD weighted. The contrast in the third and fourth images is mainly due to the T2 difference between tissues; thus they are called T2 weighted. Likewise, using a single spin-echo pulse sequence with a short echo time (e.g., TE = 20 ms) and a moderate repetition time (e.g., TR = 500 ms), a T1-weighted image can be acquired. These three images define a 3-D feature space representation of the tissues. Alternatively, a T1-weighted image can be acquired using an inversion recovery pulse sequence, with a short echo time (e.g., TE = 12 ms), moderate inversion time (e.g., TI = 500 ms), and long repetition time (e.g., TR = 1500 ms). PD-, T2-, and T1-weighted images define a 3-D feature space representation of the tissues.
3.2 Principal Component Images
Principal component analysis (PCA) is a linear transformation that has been applied in a variety of fields including MRI [47,50]. It has been employed in digital image processing as a technique for image coding, compression, enhancement, and feature extraction [89–92]. For MRI feature space representation, PCA generates linear combinations of the acquired images that maximize the image variance. The weighting vectors for these linear combinations are the normalized eigenvectors of the sample covariance matrix estimated using all the acquired images. The number of principal component images equals the number of acquired images, but the variance (equivalently, the SNR) of the principal component images sharply decreases from the first image to the last. The first three principal component images contain most of the information and can be used to define a 3-D feature space representation of the tissues.
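A compact sketch of computing principal component images from a scene sequence is given below; it is a generic PCA example with assumed array shapes and synthetic data, not tied to any particular study.

import numpy as np

def principal_component_images(images, n_components=3):
    """Compute principal component images from a co-registered MRI scene sequence.
    images: array (n, H, W). Returns (n_components, H, W) projections ordered by variance."""
    n, h, w = images.shape
    X = images.reshape(n, -1).T                      # pixel vectors as rows, (H*W, n)
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)                   # n x n sample covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]  # keep the largest-variance directions
    pcs = Xc @ eigvec[:, order]                      # (H*W, n_components)
    return pcs.T.reshape(n_components, h, w)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    base = rng.normal(0, 1, (128, 128))
    seq = np.stack([base * (k + 1) + rng.normal(0, 0.1, base.shape) for k in range(5)])
    pc = principal_component_images(seq, 3)
    print(pc.shape)   # (3, 128, 128); the first PC carries most of the variance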
3.3 Angle Images
Angle images are defined by calculating a set of parameters for each pixel vector in an orthogonal subspace defined by the constant vector (cv = [1, 1, . . . , 1]^T) and the signature vectors for normal tissues (s_i, i = 1, . . . , M) that are encountered in a study, e.g., white matter, gray matter, and CSF for the
brain [48]. The orthogonal subspace is defined by inputting cv and the s_i into a Gram-Schmidt orthogonalization procedure. Among the 3-D feature spaces generated by the above parameters, the feature space generated by the following parameters separated normal and abnormal tissues in a brain study the best [48]. The parameters are (1) the Euclidean norm of each pixel vector, (2) the angle between each pixel vector and the constant vector, and (3) the angle between each pixel vector and the orthogonal complement of white matter to the constant vector.
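The sketch below computes the three angle-image features just listed (pixel-vector norm, angle to the constant vector, and angle to the component of the white matter signature orthogonal to the constant vector). The signature values and normalization details are assumptions for illustration, not those of Ref. 48.

import numpy as np

def angle_images(stack, s_wm):
    """Sketch of the angle-image features: pixel-vector norm, angle to the constant
    vector, and angle to the component of the white-matter signature orthogonal to
    the constant vector (one Gram-Schmidt step). stack: (n, H, W); s_wm: length-n signature."""
    n, h, w = stack.shape
    V = stack.reshape(n, -1)                                   # n x (H*W) pixel vectors
    cv = np.ones(n) / np.sqrt(n)                               # normalized constant vector
    wm_perp = s_wm - (s_wm @ cv) * cv                          # WM component orthogonal to cv
    wm_perp /= np.linalg.norm(wm_perp)
    norms = np.linalg.norm(V, axis=0)
    safe = np.maximum(norms, 1e-12)
    ang_cv = np.degrees(np.arccos(np.clip((cv @ V) / safe, -1.0, 1.0)))
    ang_wm = np.degrees(np.arccos(np.clip((wm_perp @ V) / safe, -1.0, 1.0)))
    return norms.reshape(h, w), ang_cv.reshape(h, w), ang_wm.reshape(h, w)

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    seq = rng.uniform(50, 200, size=(4, 64, 64))
    wm_signature = np.array([120.0, 90.0, 70.0, 60.0])         # assumed WM gray levels
    r, a1, a2 = angle_images(seq, wm_signature)
    print(r.shape, a1.min() >= 0, a2.max() <= 180)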
3.4 Optimally Transformed Images
Soltanian-Zadeh et al. [27] developed an optimal feature space method for MRI. In this method, a multidimensional histogram (cluster plot) is generated, and clusters are found by visual inspection and marked on the result. Due to visualization limitations, a 3-D cluster plot is used. To generate such a cluster plot, three images need to be extracted for each slice from the original MRI study, which may have up to six images per slice. The method for optimally extracting these features (images) is presented next.
3.4.1 Problem Formulation
In the derivation of the transformation, we use the following notation: (1) uppercase boldface letters, such as V, to refer to a vector space; (2) uppercase sans serif letters, such as V, to refer to a matrix each of whose columns is a point in V; and (3) lowercase boldface letters, such as v, to refer to the vector coordinates of a point in V. Let V and W be n-dimensional and p-dimensional real vector spaces, respectively. Then points in V and W are vectors in R^n and R^p, respectively (see Fig. 9). Further assume that a collection of data can be classified or categorized in terms of M predefined groups, with data points in a group more similar to other data points in the group than to the data points in other groups. Let each data category have a target position in W about which transformed data are expected to be well clustered. Denote the number of data points in each category as NS(j), j = 1, . . . , M. A linear transformation T is desired that maps points in V to points in W as follows:

c = T v    (1)
The transformation matrix T is found so that the ratio of interset distance (IED) to intraset distance (IAD) is maximized. The ratio of IED to IAD is a standard criterion for quantifying clustering properties of data [93] (see Fig. 9). This problem can be formulated as the following constrained optimization problem:
FIGURE 9 Transformation of categorical data to target positions and improving its clustering properties by reducing its dimensionality.
maximize    \frac{\mathrm{IED}}{\mathrm{IAD}}    (2)

subject to    c_j = T \bar{v}_j, \quad 1 \le j \le M    (3)
where c_j is the target position for the average vector of the jth group (\bar{v}_j), defined by

\bar{v}_j = \frac{1}{NS(j)} \sum_{l=1}^{NS(j)} v_j^l    (4)
with v_j^l being the lth data point in the jth group. The IED reflects the average distance between different data groups. It is defined as the average of distances between each pair of the average vectors from different groups [93], in the transformed domain:

\mathrm{IED}^2 = \frac{2}{M(M-1)} \sum_{j=1}^{M} \sum_{i=j+1}^{M} \left\| T\bar{v}_j - T\bar{v}_i \right\|^2    (5)
where \bar{v}_j and \bar{v}_i are the average vectors for the jth and ith groups, respectively, and \|\cdot\| represents the Euclidean norm. This definition makes the interset distance independent of the number of points in each data group. This is an important property in that it avoids dependency of the transformation on the object size. The IAD reflects the average variance of the data points in each group. It is defined as the average of distances between each vector in a group and the average vector from the same group, again in the transformed domain:
\mathrm{IAD}^2 = \frac{1}{\sum_{j=1}^{M} NS(j)} \sum_{j=1}^{M} \sum_{i=1}^{NS(j)} \left\| T v_j^i - T\bar{v}_j \right\|^2    (6)
where v_j^i is the ith data point in the jth group.
3.4.2 Solution
To attain an easy distinction between normal and abnormal tissues, the average vectors of the normal tissues are projected onto prespecified locations, e.g., on the axes of the new subspace. This is sometimes referred to as projection of categorical data to target positions. Once target positions are specified, IED is fixed. Therefore maximizing the ratio of IED to IAD is equivalent to minimizing IAD. Minimizing IAD, in turn, is similar to minimizing the mean-square error between specified target positions in W and projections of the measurement data [94]. The solution is obtained by taking partial derivatives of IAD^2 with respect to the elements of T and solving the resultant systems of equations.

Special Case. A special case, with an analytical solution, is defined if M = p and we decide to assign the target positions for normal tissues on the axes of the new subspace, i.e., c_i = [0, . . . , 0, c_i, 0, . . . , 0]^T with c_i > 0 in the ith row. This will require

t_i \cdot \bar{v}_j = 0, \quad 1 \le j \le M, \; j \ne i; \qquad t_i \cdot \bar{v}_i = c_i    (7)
where t_i is the ith row of the M × n transformation matrix T. For this case, it can be shown that IED^2 simplifies to

\mathrm{IED}^2 = 2 \sum_{i=1}^{M} \| c_i \|^2    (8)
Using the white noise Gaussian model with standard deviation \sigma for the MRI noise simplifies IAD^2 to

\mathrm{IAD}^2 = \sigma^2 \left[ \sum_{i=1}^{M} \| t_i \|^2 \right]    (9)
With an additional constraint that \|t_i\| = 1, 1 \le i \le M, it can be shown that Eqs. (2) and (3) can be equivalently formulated as

maximize    \frac{t_i \cdot \bar{v}_i}{[\, t_i \cdot t_i \,]^{1/2}}, \quad 1 \le i \le M    (10)

subject to    t_i \cdot \bar{v}_j = 0, \quad 1 \le j \le M, \; j \ne i    (11)
In this formulation, {\|t_i\| = 1, 1 \le i \le M} is equivalently considered by using \|t_i\| = [t_i \cdot t_i]^{1/2} in the denominator of the objective function in Eq. (10)
(see [95] for a mathematical proof of this equivalence). Soltanian-Zadeh and Windham [96] have derived analytical solutions to a class of optimization problems that are in the form given in Eqs. (10) and (11).
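For the special case of Eqs. (10) and (11), each row t_i can be obtained as the unit-norm component of the ith signature vector orthogonal to the span of the other signature vectors. The sketch below implements this construction; the optional rescaling to target positions c_i and the example signature values are assumptions for illustration, not the derivation of Ref. 96.

import numpy as np

def optimal_transformation(signatures, targets=None):
    """Special-case solution of Eqs. (10)-(11): row t_i is the component of the ith
    normal-tissue signature vector orthogonal to the span of the other signatures,
    normalized to unit length (optionally rescaled so that t_i . v_i = c_i).
    signatures: (M, n) matrix of mean signature vectors; targets: optional length-M c_i."""
    M, n = signatures.shape
    T = np.zeros((M, n))
    for i in range(M):
        others = np.delete(signatures, i, axis=0)              # (M-1, n)
        Q, _ = np.linalg.qr(others.T)                          # orthonormal basis of span{others}
        v = signatures[i]
        t = v - Q @ (Q.T @ v)                                  # remove that component from v_i
        t /= np.linalg.norm(t)
        if targets is not None:
            t *= targets[i] / (t @ v)                          # place cluster i at target c_i
        T[i] = t
    return T

if __name__ == "__main__":
    # Assumed white matter / gray matter / CSF signatures over a 5-image sequence.
    sig = np.array([[120.0, 100.0, 80.0, 70.0, 95.0],
                    [110.0, 115.0, 105.0, 95.0, 80.0],
                    [ 60.0, 130.0, 160.0, 180.0, 40.0]])
    T = optimal_transformation(sig, targets=np.array([100.0, 100.0, 100.0]))
    print(np.round(T @ sig.T, 6))   # ~diagonal: each tissue maps onto its own axis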
3.5 Discriminant Analysis Images
For a (p + 1)-class problem, the natural generalization of Fisher's linear discriminant involves p discriminant functions. Similar to the optimal transformation described above, the projection is from an n-dimensional space to a p-dimensional space. However, the objective function is defined using the within-class (S_W) and between-class (S_B) scatter matrices [53]:

S_W = \sum_{j=1}^{p+1} S_j    (12)

where

S_j = \sum_{i=1}^{NS(j)} (v_j^i - \bar{v}_j)(v_j^i - \bar{v}_j)^T    (13)

and

S_B = \sum_{j=1}^{p+1} NS(j)\, (\bar{v}_j - \bar{v})(\bar{v}_j - \bar{v})^T    (14)

where

\bar{v} = \frac{1}{\sum_{j} NS(j)} \sum_{i=1}^{p+1} NS(i)\, \bar{v}_i    (15)

To find the transformation matrix T, the following objective function is maximized:

J(T) = \frac{| T S_B T^T |}{| T S_W T^T |}    (16)

It was shown in Ref. 97 that the rows of T, i.e., t_i, can be found by numerically solving the following generalized eigenvalue problem:

S_B t_i = \lambda_i S_W t_i    (17)
Since the determinant is the product of the eigenvalues, it is the product of the variances in the principal directions, thereby measuring the square of the hyperellipsoidal scattering volume [53]. Equation (16) therefore represents the ratio of products of between-class variances to the products of within-class variances.
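A short sketch of the discriminant analysis transformation follows: the scatter matrices of Eqs. (12)–(14) are accumulated from labeled pixel vectors, and the generalized eigenvalue problem of Eq. (17) is solved numerically. The sample data and class layout are assumed for illustration.

import numpy as np
from scipy.linalg import eigh

def discriminant_transformation(samples, labels, p):
    """Fisher discriminant transformation (Eqs. (12)-(17)): build S_W and S_B from labeled
    pixel vectors and keep the p generalized eigenvectors with the largest eigenvalues of
    S_B t = lambda S_W t.  samples: (N, n); labels: (N,) class ids."""
    classes = np.unique(labels)
    n = samples.shape[1]
    overall_mean = samples.mean(axis=0)          # equals the NS(j)-weighted mean of Eq. (15)
    S_W = np.zeros((n, n))
    S_B = np.zeros((n, n))
    for c in classes:
        Xc = samples[labels == c]
        mc = Xc.mean(axis=0)
        S_W += (Xc - mc).T @ (Xc - mc)
        d = (mc - overall_mean)[:, None]
        S_B += Xc.shape[0] * (d @ d.T)
    eigval, eigvec = eigh(S_B, S_W)              # generalized symmetric eigenproblem
    order = np.argsort(eigval)[::-1][:p]
    return eigvec[:, order].T                    # rows t_i of the p x n transformation

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    means = np.array([[100.0, 80.0, 60.0], [90.0, 110.0, 70.0],
                      [60.0, 120.0, 150.0], [140.0, 130.0, 90.0]])   # p + 1 = 4 tissues
    X = np.vstack([m + rng.normal(0, 3, (200, 3)) for m in means])
    y = np.repeat(np.arange(4), 200)
    T = discriminant_transformation(X, y, p=3)
    print(T.shape)                               # (3, 3): weights of three discriminant images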
The interesting point is that if the within-class scatter is isotropic, the eigenvectors are the eigenvectors of S_B, which span the subspace defined by the vectors \bar{v}_i - \bar{v}. The rows of T can be found by applying the Gram-Schmidt orthogonalization procedure to the p vectors {\bar{v}_i - \bar{v}, i = 1, . . . , p} [53]. This is similar to the special case explained above, except that, instead of using the normal tissue signature vectors, linear combinations of them are used in the Gram-Schmidt orthogonalization procedure. Also, four tissues are needed instead of three. Since there are only three normal tissues in the brain, the DA transformation will depend on the pathology, and thus the resulting feature space and the location of the clusters change from study to study. In addition, the solution is not unique, since rotations and scaling of the axes do not change the value of J(T). However, for the optimal transformation explained in the previous section, the solution is unique, since we fixed the location of the normal tissue clusters. Also, note that instead of the p known tissues needed by the optimal method, discriminant analysis needs p + 1 known tissues.
3.5.1 Examples
The effectiveness of the above feature extraction techniques in brain tumor studies is demonstrated in Ref. 27. Here, transformed images, cluster plots, and segmented regions of a simulation obtained using different methods are shown in Figs. 10–14. The resulting feature spaces are quantitatively compared in Table 1. Note the superiority of the method proposed in Sec. 3.4 compared to the other methods. MR images, cluster plots, and segmentation results for a phantom and a human brain obtained using the optimal method are shown in Figs. 15–17 and 19–20, respectively. Note the quality of the clusters formed for each of the phantom materials in the cluster plot and the segmentation results generated by the method. In the human study, note that the clusters for normal tissues, their partial volume regions, and zones of the lesion are clearly visualized in the cluster plot and are appropriately segmented by the method.
4 IMAGE SEGMENTATION
MRI segmentation methods are mainly region based. They use image intensities or their extracted features as representatives of biological properties of the tissue. Image pixels are classified into different regions based on these features. Classification is carried out using a criterion function such as those explained in the next section.
4.1 Decision Engine
Since the Gaussian model has been widely used to characterize MRI noise [49,52,98–100], and experimental results have illustrated its validity [5,21,23,24], statistical pattern classification methods that have sound mathematical bases are normally used.
FIGURE 10 Original, restored, and transformed images of a brain simulation containing a two-zone lesion. a–e, four T2-weighted multiple spin echo images (TE/TR = 19–76/1500 ms) and a T1-weighted inversion recovery image (TE/TI/TR = 12/519/2000 ms). f–j, restored images generated using the restoration filter described in Sec. 2.5. k–m, transformed images created by applying the optimal transformation explained in Sec. 3.4, using the white matter, gray matter, and CSF signature vectors. (From Ref. 27.)
These methods are presented next. Three approximations to the optimal Bayes method [93] can be used: (1) a minimum distance pattern classifier using standard Euclidean distance, (2) a minimum distance pattern classifier using generalized Euclidean distance (each coordinate is normalized to the noise standard deviation in that direction), and (3) a maximum likelihood classifier (MLC). These techniques are similar to those used for multispectral satellite data [41]. Since the first and second methods can be considered as special cases of the third method, we briefly review the latter.
FIGURE 11 Comparison of images used for four different feature spaces, using the simulation data. a–c, tissue-parameter-weighted images. d–f, principal component images. g–i, angle images. j–l, optimally transformed images. (From Ref. 27.)
FIGURE 12 Comparison of cluster plots for feature spaces defined by images in Fig. 11. a, using images 11a–c. b, using images 11d–f. c, using images 11g–i. d, using images 11j–l. The first image corresponds to the y-axis, the second image corresponds to the z-axis, and the third image corresponds to the x-axis (according to the right-hand method). Note the superiority of the clustering properties (number of clusters identified and the interset-to-intraset Euclidean distance ratio between them) for the optimal method. (From Ref. 27.)
FIGURE 13 Comparison of the segmented tissues using cluster plots for tissue-parameter-weighted and PCA images. a–e, segmented tissues (CSF and lesion, gray matter, partial volume between white and gray matters, white matter, and background) using cluster plot 12a. f–k, segmented tissues (CSF and outer zone of the lesion, inner zone of the lesion, gray matter, partial volume between white and gray matters, white matter, and background) using cluster plot 12b. (From Ref. 27.)
The MLC classifies multivariate vectors by evaluating the probability for each class membership using the statistics for the training regions [101,102]. The criterion function for the classifier is

d_i(x) = \frac{P(i)}{(2\pi)^{n/2} |K_i|^{1/2}} \exp\left[ -\frac{1}{2} (x - \bar{x}_i)^T K_i^{-1} (x - \bar{x}_i) \right]    (18)
where x is the multivariate sample that is being classified, \bar{x}_i is the sample mean vector for the ith tissue type calculated from the transformed images, K_i and |K_i| are the sample covariance matrix and its determinant, n is the number of images in the scene sequence, P(i) is the a priori probability of class i, and T denotes transpose. The sample mean vectors and sample covariance matrices are estimated using the training data sets, and P(i) is assumed to be equal for all classes. The tissue i for which d_i(x) (equivalently P(i|x)) is the highest is assumed to be the most probable class.
FIGURE 14 Comparison of the segmented tissues using cluster plots for angle and optimally transformed images. a–e, segmented tissues (CSF and outer zone of the lesion, inner zone of the lesion, gray matter, partial volume between white and gray matters with background, and white matter) using cluster plot 12c. f–m, segmented tissues (CSF, partial volume between CSF and gray matter, outer zone of the lesion, inner zone of the lesion, gray matter, partial volume between white and gray matters, white matter, and background) using cluster plot 12d. (From Ref. 27.)
TABLE 1 Comparison of Inter-to-Intraset Euclidean Distance Ratios for Different Methods

Method                           Simulation   Patient
Tissue parameter weighted        11.2         14.26
Principal component analysis     5.42         1.96
Angle images                     13.08        15.03
Discriminant analysis            102.02       5.01
Method of Sec. 3.4               119.14       21.72
FIGURE 15 a–e, four T2-weighted multiple spin echo images (TE/TR = 25–100/2000 ms) and a T1-weighted image (TE/TR = 20/500 ms) of a solution phantom. f–j, noise suppressed images generated using the filter described in Sec. 2.5. k–m, transformed images created by applying the optimal transformation explained in Sec. 3.4, using the signature vectors for water and solutions with maximum concentrations. (From Ref. 54.)
When K_i is a diagonal matrix or a scaled version of the identity matrix, the above approximate classifiers (2) or (1) are obtained, respectively. Also, since P(i)/[(2\pi)^{n/2} |K_i|^{1/2}] is constant, the maximum of d_i(x) coincides with the minimum of (x - \bar{x}_i)^T K_i^{-1} (x - \bar{x}_i), which is called the Mahalanobis generalized distance [93].
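The sketch below evaluates the MLC criterion of Eq. (18) in log form (equivalent at the maximum and numerically safer) and assigns each pixel vector to the class with the largest value; the training means, covariances, and priors are assumed synthetic values.

import numpy as np

def mlc_classify(x, means, covs, priors=None):
    """Maximum likelihood classification of pixel vectors x (Eq. (18)), returning the index
    of the class with the largest discriminant value.
    x: (N, n); means: list of (n,) training means; covs: list of (n, n) covariances."""
    M, n = len(means), x.shape[1]
    priors = np.full(M, 1.0 / M) if priors is None else np.asarray(priors)
    scores = np.empty((x.shape[0], M))
    for i in range(M):
        diff = x - means[i]
        Kinv = np.linalg.inv(covs[i])
        maha = np.einsum("ij,jk,ik->i", diff, Kinv, diff)      # (x - m)^T K^-1 (x - m)
        _, logdet = np.linalg.slogdet(covs[i])
        scores[:, i] = np.log(priors[i]) - 0.5 * (n * np.log(2 * np.pi) + logdet + maha)
    return np.argmax(scores, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    means = [np.array([100.0, 80.0]), np.array([120.0, 120.0]), np.array([60.0, 140.0])]
    covs = [np.eye(2) * 25.0 for _ in means]
    X = np.vstack([m + rng.multivariate_normal(np.zeros(2), c, 100) for m, c in zip(means, covs)])
    labels = mlc_classify(X, means, covs)
    print(np.mean(labels == np.repeat(np.arange(3), 100)))     # classification accuracy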
4.2 Supervised Methods
In supervised methods, the operator has a significant interaction with the computer. He or she analyzes the data slice by slice. This has the advantage of visual control over what is happening throughout the imaged volume but the disadvantage of being time consuming and operator dependent. Operator dependency can be minimized by designing a step-by-step protocol that is followed closely by the operator.
FIGURE 16 Cluster plot generated by the optimal transformation explained in Sec. 3.4. 1–16, segmented objects (water, different concentrations of CuSO4, and CuSO4 plus creatine). (From Ref. 54.)
A variety of supervised methods can be designed for MRI segmentation. As an example, we present the method we have developed for brain studies based on the optimal transformation presented in Sec. 3.4. The method proceeds in the following steps.
FIGURE 17 Structure of the solution phantom shown in Figs. 15 and 16. W refers to water and numbers are actual concentrations of CuSO4 in millimoles. On the right hand side, CuSO4 solutions are located, and on the left hand side, CuSO4 solutions plus 50 millimoles of creatine are located. (From Ref. 54.)
1. The operator draws a sample ROI for each of the normal tissues (white matter, gray matter, and CSF) over the slice that he or she wants to analyze. These ROIs may be small but should be carefully drawn to include pure pixels only (i.e., without any partial volume averaging).

2. The computer program finds the sample mean and standard deviation of the pixel gray levels in each ROI for every image in the sequence. It then defines signature vectors for the normal tissues, using the mean values.

3. The operator specifies target locations for each of the normal tissues. The computer program finds the minimum mean square error transformation to these target positions, applies it to the images of the slice under consideration, and then generates a multidimensional (3-D for the brain) feature space representation, i.e., a cluster plot.

4. The operator draws ROIs for any clusters he or she finds in the cluster plot. His or her a priori knowledge regarding the location of the clusters for normal tissues will help him or her in identifying clusters corresponding to partial volume pixels and abnormal tissues (see Figs. 16 and 20).
FIGURE 18 A flowchart of the ISODATA algorithm explained in Sec. 4.3. (From Ref. 19.)
5. The computer program finds the pixels in the image domain that correspond to each of the ROIs drawn over the clusters and generates the corresponding region in the image domain for each cluster.

6. The computer program, using the statistics (sample mean and covariance matrix) of the pixels corresponding to each region, and using the Euclidean, generalized Euclidean, or Mahalanobis generalized distance, assigns each pixel in the image domain to one of the classes. It then assigns an integer number (equivalently, a color) to all pixels in each class.
FIGURE 19 O1–O5, two T2-weighted fast spin echo images (TE/TR = 22, 88/3500 ms), a FLAIR image (TE/TI/TR = 155/2200/10000 ms), and two T1-weighted images (pre- and postgadolinium with TE/TR = 14/500 ms) of a brain tumor, respectively, after registration, intracranial segmentation, and noise suppression. A–H, segmented regions generated by the ISODATA algorithm explained in Sec. 4.3.
Segmentation results may be presented as multiple binary images, each representing a segmented region (see Figs. 16 and 20). They may also be presented as a color image, not shown here due to publication limitations.
FIGURE 20 Top, optimal feature space (cluster plot) generated by the transformation described in Sec. 3.4. Selected ROIs for the clusters are superimposed on the cluster plot. A–H, segmented regions for white matter, gray matter, partial volume between white and gray matter, multiple zones of the nonenhancing lesion, CSF, and enhancing lesion, respectively.
4.3 Unsupervised Methods
In these methods, the operator has minimal interaction with the computer. He or she specifies the image set that should be analyzed and a set of parameters that is used by the unsupervised method. A computer program, implementing a cluster search algorithm such as K-means [103,104], fuzzy
C-means [55], or ISODATA [105], finds clusters for the selected data set. It defines signature vectors for each cluster and segments the image using a criterion function such as those explained in Sec. 4.1. Velthuizen et al. [15] used fuzzy C-means and an iterative least squares clustering (ILSC) approach for this purpose. Soltanian-Zadeh et al. [19] developed a variation of the ISODATA algorithm. The approach is similar in principle to K-means clustering in the sense that cluster centers are iteratively determined sample means; however, it includes a set of additional merging and splitting procedures. These steps have been incorporated into the algorithm as a result of experience gained through experimentation. A block diagram of the proposed approach is shown in Fig. 18, and details are given in the following section.
4.3.1 ISODATA Algorithm
The steps of the algorithm proposed in Ref. 19 are as follows:

1. Specify the following parameters:
   K = number of cluster centers desired
   θ_N = a parameter to which the number of samples in each cluster is compared
   θ_s = standard deviation parameter
   θ_c = lumping parameter
   L = maximum number of pairs of clusters that can be lumped in one iteration
   I = number of iterations allowed

2. Distribute the pixel vectors among the current cluster centers (Z_j) based on the smallest Euclidean distance criterion.

3. Discard cluster centers with fewer than θ_N members, and reduce the number of clusters (N_c) by the number of clusters discarded.

4. Redistribute the pixel vectors associated with the cluster centers discarded in the previous step into the remaining cluster centers.

5. Calculate the average intraset Euclidean distance (IAD_j) of the pixel vectors in each cluster from their center:

   \mathrm{IAD}_j = \frac{1}{N_j} \sum_{i=1}^{N_j} \| Z_j^i - Z_j \|    (19)

   where N_j is the number of samples and Z_j^i is the ith data point in the jth group.

6. Calculate the overall average distance of the pixel vectors (IAD) using the weighted average of the IAD_j's found for the clusters in the previous step:

   \mathrm{IAD} = \frac{1}{N} \sum_{j=1}^{N_c} N_j \, \mathrm{IAD}_j    (20)

   where N is the total number of samples.

7. Do one of the following that applies:
   If this is the last iteration, set θ_c = 0 and go to Step 11.
   If N_c ≤ 0.5K, go to Step 8.
   If this is an even-numbered iteration or if N_c ≥ 2K, go to Step 11.
   If 0.5K < N_c < 2K, go to Step 8.

8. Find the standard deviation vector for each cluster. This vector contains the standard deviations of the individual elements of the pixel vectors in the cluster. The standard deviation is estimated using the sample standard deviation formula.

9. Find the maximum element of each standard deviation vector and call it σ_j^max.

10. If σ_j^max > θ_s and either (IAD_j > IAD and N_j > 2(θ_N + 1)) or N_c ≤ 0.5K, then split Z_j into two new cluster centers Z_j^+ and Z_j^-, delete Z_j, and increase N_c by 1. The cluster center Z_j^+ is formed by adding a given quantity γ_j to the component of Z_j which corresponds to σ_j^max. Similarly, Z_j^- is formed by subtracting γ_j from the same component of Z_j. If splitting took place in this step, go to Step 2; otherwise continue.

11. Calculate the interset Euclidean distances (IED) between cluster centers:

    \mathrm{IED}_{ij} = \| Z_i - Z_j \|    (21)

12. Find the L cluster pairs that have smaller IEDs than the rest of the pairs and order them.

13. Starting with the pair that has the smallest IED, perform a pairwise lumping according to the following rule. If neither of the clusters has been used in lumping in this iteration, and if the distance between the two cluster centers is less than θ_c, merge these two clusters and calculate the center of the resulting cluster; otherwise, go to the next pair. Once all of the L pairs are considered, go to the next step.

14. If this is the last iteration, the algorithm terminates; otherwise, go to Step 2.
4.3.2 Selection of Parameters
We use K = 8 (to ensure that we do not lose any clusters), θ_N = 500, θ_s = twice the average standard deviation of white matter in the MR images being segmented, θ_c = the Euclidean distance between white and gray matter (which is usually the smallest distance between normal tissues in the brain), L = 1, I = 60, and γ_j = 0.25 σ_j^max. The algorithm does not show severe sensitivity to these parameters, but for the best performance, these parameters need to be optimized.
4.3.3 An Example
The ISODATA method is applied to MRI studies. Figure 19 shows the original images and segmentation results obtained in a representative clinical tumor study. It can be seen that the method has segmented normal tissues and multiple zones of the lesion. A comparison of the results with those given in Fig. 20 for the optimal feature space method shows that they are similar to some extent. However, note that the optimal feature space method has segmented the nonenhancing part of the lesion (Fig. 19D) into two distinct zones. Note also that, in the optimal feature space method, a correspondence exists between each segmented region and its location in the feature space compared to the normal tissue locations. This provides insight into the biological properties of the lesion zones.
5 TISSUE CLASSIFICATION AND CHARACTERIZATION
Once the image is segmented, the resulting regions need to be classified into normal and abnormal tissues. After this classification, the volumes and surface areas can be estimated and the tissue can be biologically identified using statistical measures of the MRI signature vectors. Then damaged structures and the extent of damage can be measured. Image classification should be based on both the spatial and the feature domain properties of MRI. The spatial domain properties include the relationship between a region and its neighbors, e.g., spatial size of connected pixels in a region, location of each normal tissue relative to other tissues, and connectivity of normal tissues in three dimensions. The feature domain properties include the similarity of the signature vectors associated with segmented regions (i.e., the cluster centers), relative brightness of the tissues in MRI, and the region's gradient. Reliable and validated methods for automatic classification of tissues from MRI do not exist. Therefore, in most research laboratories, the operator
classifies the regions using spatial and feature domain information. Segmentation and classification results must be validated using simulation, phantom, animal, and human studies, as described in the following section.
6 EVALUATION MATERIALS

6.1 Simulations
In the testing and evaluation tasks, computer simulations are the only test objects in which the "truth" is known on a pixel-by-pixel basis. This is particularly true in regard to partial volume averaging effects, noise, and artifacts. Therefore simulations are used as "gold standards" for testing and evaluating the software and algorithms developed for MRI. Simulations can be generated using the following methods. One method uses regularly and irregularly shaped geometric objects (e.g., circles, ellipses, and rectangles in two dimensions, and spheres, ellipsoids, cylinders, and cubes in three dimensions) that fit together to form a scene. In another method, specific tissues are segmented first from MRI, e.g., white matter, gray matter, CSF, and lesion zones from the brain images. These specific tissues are reconstructed into 2-D or 3-D simulations. An example of this type of simulation is shown in Fig. 10. In the above simulations, objects are assigned gray levels based upon the average gray levels obtained for normal and abnormal tissues in MR images. Regions representing lesions and partial volume averaging are included in the scene. For the partial volume regions, the fractional components are determined on a pixel-by-pixel basis. Gaussian noise is added to the free induction decay (FID) signals, and images are reconstructed using Fourier transformation. By introducing appropriate modifications of the FID signals or reconstructed images, nonuniformity effects are added to the data. Also, by specific modifications of the FID signals or by convolving the reconstructed images with the system's point spread function (PSF), for the protocols with a PSF significantly different from a delta function, the effects of the MRI PSF are incorporated into the simulations.
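The following sketch illustrates the noise model used in such simulations: complex Gaussian noise is added in the raw-data (k-space) domain and a magnitude image is reconstructed by inverse Fourier transform. The object geometry, gray levels, and noise level are assumed values for illustration.

import numpy as np

def simulate_mri(truth, noise_sigma, rng):
    """Add complex Gaussian noise in the raw-data (k-space) domain and reconstruct the
    magnitude image by inverse Fourier transform. The noise is scaled so that the
    image-domain noise standard deviation per channel is approximately noise_sigma."""
    kspace = np.fft.fft2(truth)
    noise = rng.normal(0, noise_sigma, truth.shape) + 1j * rng.normal(0, noise_sigma, truth.shape)
    return np.abs(np.fft.ifft2(kspace + noise * np.sqrt(truth.size)))

if __name__ == "__main__":
    rng = np.random.default_rng(9)
    truth = np.zeros((128, 128))
    yy, xx = np.mgrid[0:128, 0:128]
    truth[((yy - 64) ** 2 + (xx - 64) ** 2) < 40 ** 2] = 100.0    # a simple geometric object
    truth[((yy - 64) ** 2 + (xx - 80) ** 2) < 10 ** 2] = 150.0    # an embedded "lesion" region
    noisy = simulate_mri(truth, noise_sigma=5.0, rng=rng)
    # The background magnitude noise is Rayleigh distributed (magnitude of complex noise).
    print(round(float(noisy[:20, :20].mean()), 1), round(float(noisy[:20, :20].std()), 1))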
6.2 Phantoms
Phantoms are considered the second level of "gold standard" for testing and evaluation studies, since the "truth" is known for them in an averaging sense. For example, their partial volume fractions are not known on a pixel-by-pixel basis but can be estimated on average by calculating an object's volume, i.e., the sum of the partial volume fractions, by water displacement. Several types of phantoms are found useful for testing and evaluation of image analysis methods. There are no (or insignificant) artificial
boundaries between different objects in these phantoms. They can be simple or complex in shape. The first type uses biological materials, e.g., a hard-boiled egg in gelatin, with its shell removed. In this example, there are three materials, of which one (egg white) has an interface with the other two (gelatin and egg yolk). The second type of phantom is designed using chemicals such as creatine and CuSO4. Different vials in the phantom have different concentrations of the chemicals so that they have different MRI characteristics. Phantoms containing vials with different shapes and sizes are manufactured from these materials. Volumes of the vials are physically measured before the phantom is made and are compared to those obtained from image analysis methods. Images of a CuSO4 phantom are shown in Fig. 15. The third type of phantom is designed using tissue mimicking (TM) material [106], manufactured from water, glycerol, n-propanol, animal-hide gelatin, agar, and small amounts of formaldehyde and p-methylbenzoic acid. Different objects in the phantom have different concentrations of the above materials so that they have different MRI characteristics. Long-term stability as well as thermal stability exists for the materials when stored in glass containers sealed with a layer of petrolatum. The melting point of the materials is 75°C. Phantoms containing objects with different shapes and sizes are manufactured from these materials. Volumes of objects are measured by water displacement and are compared to those obtained from image analysis. The fourth type of phantom is designed using aqueous polymer gels that are sensitive to radiation, so that they change their T2 and T1 when treated with x-rays [107]. The spatial pattern of changes in relaxation times, as well as their magnitudes, can be precisely controlled by the radiation dose and pattern. Their PD, T2, and T1 can be altered independently using radiation dose and chemical modifications. The degree of magnetization transfer can be manipulated by the degree of monomer cross-linking, thereby mimicking real tissues [107].
6.3 Animal Studies
The animal studies may consist of MRI (T1-, T2-, PD-, diffusion-, perfusion-, and magnetization transfer-weighted images) and histological studies of rats with tumor or middle cerebral artery (MCA) infarction. Histology is used as a "gold standard" in evaluating MRI segmentation and classification results obtained for the rat images. An example is shown in Fig. 21.
6.4 Human Studies
The human studies are usually limited to selected clinical cases. These cases are restricted to patients with a specific abnormality, e.g., brain tumor or cerebral infarction.
FIGURE 21 An animal study with histology. A–D, four multiple spin-echo T2-weighted MR images (TE/TR = 30, 60, 90, 120/3000 ms) of a rat brain with middle cerebral artery (MCA) occlusion. E, an inversion recovery T1-weighted MR image (TE/TI/TR = 30/750/6000 ms). F, segmented image generated by the ISODATA algorithms; note multiple zones found for the stroke lesion. G, image F warped to the histological slide shown in H. H, a coronal section of cerebral tissue stained with hematoxylin and eosin (H and E) for light-microscopic analysis.
Existing MRI data are analyzed retrospectively. Biopsy is used as a "gold standard" in evaluating MRI segmentation and classification results obtained for the human images. Image-guided biopsy is illustrated in Fig. 22.
7 PERFORMANCE MEASURES

7.1 Accuracy and Reproducibility
To obtain initial estimates of the accuracy and reproducibility of the feature space method, the following experiments were conducted. (1) Operator-generated segmented regions from a simulation study were compared to the true regions, i.e., those used to generate the simulation. The range of the relative error and its average were recorded. (2) Segmented regions from two operator-analyzed simulation studies were compared. The relative difference in the number of pixels for each tissue was calculated. The range of the percentage difference and its average were recorded. (3) Segmented regions from two operator-analyzed brain MRI studies were compared.
FIGURE 22 The display of the ISG/Elekta Viewing Wand during the intraoperative biopsy target localization, using classified images. Multiple planes are shown with the biopsy location indicated by the crossing lines.
The range of the relative difference and its average were recorded. The results are presented in Tables 2–4. To further evaluate the reproducibility of the method, the following experiment was performed. From ten MRI studies of brain tumor patients, images of a slice through the center of the tumor were selected for processing. Patient information was removed from the image headers, and new names (unrecognizable by the image analysts) were given to the images. These images were independently analyzed by two image analysts using the methods described previously. Segmentation results were compared to assess the reproducibility of the method. For each tissue segmented in each patient study, a comparison was done by a similarity index (which is explained in the next section) to evaluate the pixels that were in both of the segmentation results obtained by the two image analysts (agreement) relative to the pixels that were not in both (disagreement).
TABLE 2 Accuracy Results for a Simulation Study (Pixel No.)

Tissue type         Actual   Experimental   Difference   % Difference   Absolute % difference
Gray matter          11220          11884         -664          -5.92                    5.92
Partial W/G           4093           3627          466          11.39                   11.39
White matter          8197           8015          182           2.22                    2.22
CSF                   1328           1312           16           1.20                    1.20
Zone 1 of lesion       967            967            0           0.00                    0.00
Zone 2 of lesion       690            690            0           0.00                    0.00
Average                                                           1.48                    3.45
TABLE 3 Intraobserver Reproducibility Results for a Simulation Study (Pixel No.)

Tissue type         Actual   Experimental   Difference   % Difference   Absolute % difference
Gray matter          11894          11697          197           1.67                    1.67
Partial W/G           3637           3830         -193          -5.17                    5.17
White matter          8075           8058           17           0.21                    0.21
CSF                   1312           1303            9           0.69                    0.69
Zone 1 of lesion       967            967            0           0.00                    0.00
Zone 2 of lesion       783            783            0           0.00                    0.00
Average                                                          -0.43                    1.29
TABLE 4 Intraobserver Reproducibility Results for a Tumor Patient Study (Pixel No.)

Tissue type          Try 1    Try 2   Difference   % Difference   Absolute % difference
Gray matter           8105     8080           25           0.31                    0.31
Partial W/G           3949     4120         -171          -4.24                    4.24
White matter          7492     7263          229           3.10                    3.10
CSF                    517      506           11           2.15                    2.15
Zone 1 of lesion       776      779           -3          -0.39                    0.39
Zone 2 of lesion      2152     2243          -91          -4.14                    4.14
Average                                                   -0.53                    2.39
TABLE 5 Interobserver Reproducibility Results (Similarity Measures S) for a Tumor Patient Study (Pixel No.)

Tissue type         Observer 1   Observer 2   Common   Similarity index S
Gray matter               6373         6207     6124                 0.97
Partial W/G               2684         2768     2393                 0.88
White matter             10281        10328    10213                 0.99
Zone 1 of lesion           763          659      592                 0.83
Zone 2 of lesion          1473         1048      802                 0.64
Zone 3 of lesion           819          805      537                 0.66
Zone 4 of lesion          1770         1643     1641                 0.96
Average                                                              0.85
the next section) to evaluate the pixels that were in both of the segmentation results obtained by the two image analysts (agreement) relative to the pixels that were not in both (disagreement). The results for one of the patients are presented in Table 5. An overall comparison of the results for segmenting each tissue type was done by finding the average of the similarity indices for individual studies (see Table 6).
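To make the comparisons reported in Tables 2-4 concrete, the short Python sketch below counts, for each tissue label, the pixels assigned by two analyses and reports the difference and percentage difference. This is offered only as an illustration; the label codes, array sizes, and the random stand-in segmentations are assumptions, not the authors' software or the study data.

    import numpy as np

    def compare_segmentations(seg_a, seg_b, labels):
        # Per-tissue pixel counts, differences, and percent differences,
        # in the spirit of Tables 2-4.
        rows = []
        for name, value in labels.items():
            n_a = int(np.sum(seg_a == value))
            n_b = int(np.sum(seg_b == value))
            diff = n_a - n_b
            pct = 100.0 * diff / n_a if n_a > 0 else 0.0
            rows.append((name, n_a, n_b, diff, pct, abs(pct)))
        return rows

    # Hypothetical label codes and stand-in segmentations.
    labels = {"gray matter": 1, "white matter": 2, "CSF": 3}
    rng = np.random.default_rng(0)
    seg_try1 = rng.integers(0, 4, size=(256, 256))
    seg_try2 = rng.integers(0, 4, size=(256, 256))
    for row in compare_segmentations(seg_try1, seg_try2, labels):
        print("%-14s n1=%6d n2=%6d diff=%6d pct=%7.2f abs=%6.2f" % row)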
7.2 Similarity Index
Measuring the agreement between different image classification results can be considered as a reliability study in which a number of targets (image pixels) are rated (classified) by different judges (operators). A reliability measure known as the kappa coefficient (κ) has been proposed for this type of study [108]. Kappa is a chance-corrected measure of agreement, defined by
TABLE 6 Overall Reproducibility Results (Similarity Measures S) for Common Tissues in Ten Tumor Patient Studies

Tissue type                 WM     GM    PWG    CSF     TL     OV
Average S in 10 studies   0.83   0.82   0.75   0.47   0.77   0.73

In this table, WM, GM, PWG, and TL represent white matter, gray matter, partial volume between white and gray matter, and total lesion, respectively. OV represents the average of all similarity indices.
κ = (p_o − p_c) / (1 − p_c)     (22)

where p_o is the observed percentage of agreement (the percentage of targets rated the same by the raters) and p_c is the percent agreement that would occur by chance alone. The denominator in this equation is a scale factor that allows κ to fall in the interval (−∞, 1]. κ = 1 indicates perfect agreement, κ = 0 indicates agreement equal to pure chance, and κ < 0 implies that the agreement is less than chance alone. It is generally accepted that 0.7 < κ < 1.0 indicates excellent agreement, 0.4 < κ < 0.7 indicates fair agreement, and κ < 0.4 indicates poor agreement [108]. When segmenting a tissue type on MRI, the number of pixels belonging to this specific tissue is usually much smaller than the total number of pixels in the image. In this condition, it has been shown in Ref. 109 that Eq. (22) can be approximated by a similarity index S defined as

S = (overlapping area) / (average of the two areas)     (23)

which is used by us and other researchers (e.g., in Ref. 109) for evaluating MRI segmentation results.
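The following Python sketch, offered as a minimal illustration rather than the authors' implementation, evaluates Eqs. (22) and (23) for two binary masks of the same tissue produced by two observers. The synthetic masks and the binary (tissue versus background) formulation of the chance agreement p_c are assumptions made for the example.

    import numpy as np

    def similarity_index(mask_a, mask_b):
        # Eq. (23): overlap divided by the average of the two areas
        # (equivalent to the Dice coefficient).
        a = np.asarray(mask_a, dtype=bool)
        b = np.asarray(mask_b, dtype=bool)
        mean_area = 0.5 * (a.sum() + b.sum())
        return float(np.logical_and(a, b).sum() / mean_area) if mean_area > 0 else 1.0

    def kappa(mask_a, mask_b):
        # Eq. (22) for a binary (tissue / not tissue) rating by two raters.
        a = np.asarray(mask_a, dtype=bool).ravel()
        b = np.asarray(mask_b, dtype=bool).ravel()
        p_o = float(np.mean(a == b))                                       # observed agreement
        p_c = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())        # chance agreement
        return (p_o - p_c) / (1.0 - p_c) if p_c < 1.0 else 1.0

    # Two hypothetical observer masks of the same tissue.
    rng = np.random.default_rng(1)
    truth = rng.random((128, 128)) < 0.1
    observer1 = truth
    observer2 = np.logical_xor(truth, rng.random((128, 128)) < 0.01)
    print("S =", similarity_index(observer1, observer2),
          " kappa =", kappa(observer1, observer2))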
7.3 Comparison to Histology
In animal studies, the level of agreement between the image analysis estimate for the extent of abnormality (e.g., ischemic cell injury or tumor invasion) and that determined by histology is found. An example is shown in Fig. 21.

7.4 Comparison to Biopsy
In human studies, to evaluate the feature space method, the results obtained from the image analysis method were compared to those reported by the pathology laboratory for the corresponding biopsy samples. To perform this comparison, two to ten biopsy samples were taken during the surgical resection of the lesion in ten brain tumor patients (total of 56 samples). It should be noted that these biopsies were intraoperative samples taken with visual inspection using viewing wand stereotaxis during the craniotomy resection procedure. The samples were taken as the open resection was taking place, not as a separate procedure, which would not be safe, ethical, or justifiable. At a minimum, biopsy samples were taken central to, and at the edge of, the gadolinium enhancement. In addition, in most cases, biopsy samples were also taken in the center and at the edge of the T2-
weighted hyperintense area. In all cases, the pathology from these locations demonstrated either tumor cells present or areas of gliosis and radiation necrosis. Histological diagnoses in these ten patients were as follows: one glioblastoma multiforme, three anaplastic astrocytomas, two oligodendrogliomas, two gangliogliomas, one atypical meningioma, and one gliosis/radiation necrosis. Localization of the biopsy sites was performed using an Allegro 3-D workstation (ISG, Toronto, Ontario, Canada). Using this system, from the 3-D MRI data, the three orthogonal planes were displayed in surgery. Biopsy sites were determined from these images using an ISG/Elekta Viewing Wand. The Wand is a surgical tool electronically connected to the 3-D data allowing visualization of its position in the patient’s brain (see Fig. 22). Biopsy samples from the above locations were then taken and their locations were recorded. Due to the lack of knowledge about the transformation between the ISG coordinates and the original image coordinates, the locations of the biopsy samples were estimated from the views recorded on the ISG system. The resulting estimates were used in evaluating the agreement between the image analysis method and the biopsy results. The image segmentation results obtained by the feature space method (in terms of regions being normal or abnormal) agreed with the biopsy results.
8 EFFECTS OF MRI PROTOCOLS
To evaluate the effects of MRI protocols on the segmentation results generated by the feature space method, the following nine image sets were defined using different MR images (see Fig. 23).

1. Four T2-weighted images and one T1-weighted image (before Gd injection)
2. Four T2-weighted images and a FLAIR image
3. Four T2-weighted images and two T1-weighted images (before and after Gd injection)
4. Two T2-weighted images (TE = 22, 77) and one T1-weighted image (before Gd injection)
5. Two T2-weighted images (TE = 22, 77) and two T1-weighted images (before and after Gd injection)
6. Two T2-weighted images (TE = 22, 77), one T1-weighted image (before Gd injection), and a FLAIR image
7. Four T2-weighted images, one T1-weighted image (before Gd injection), and a FLAIR image
8. Four T2-weighted images, two T1-weighted images (before and after Gd injection), and a FLAIR image
FIGURE 23 Seven MR images of a slice of a brain tumor, after registration, intracranial segmentation, restoration, and nonuniformity corrections are applied. a, a proton density weighted image with TE/TR = 22/ 3500 ms. b, a T2-weighted image with TE/TR = 44/3500 ms. c, a T2-weighted image with TE/TR = 77/3500 ms. d, a T2-weighted image with TE/TR = 88/3500 ms. e, an inversion recovery image (FLAIR) with TE/TI/TR = 155/2200/10004 ms. f, a T1-weighted spin-echo image with TE/TR = 14/500 ms, before Gd injection. g, a T1-weighted spin-echo image with TE/TR = 14/500 ms, after Gd injection.
9. Two T2-weighted images (TE = 22, 77), two T1-weighted images (before and after Gd injection), and a FLAIR image
An image analyst analyzed the above nine MRI sets for three tumor patients using the feature space method. In all studies, the operator was able to find clusters for the normal tissues. Also, clusters for different tumor zones were found. Based on the clusters marked for each zone, the method segmented the image into normal tissues (white matter, gray matter, and CSF), partial volume regions, and multiple tumor zones. By inspecting the image analysis results, we observed that the locations of the clusters for the tumor zones and their corresponding regions in the image domain changed to some extent as a function of the MR images (MRI protocols) used. However, the segmentation results for the total lesion and normal tissues remained almost unchanged. Also, we noticed that separation of the clusters in the feature space was a function of the MRI protocols. For
example, the FLAIR image presented a small contrast-to-noise ratio (CNR) between white and gray matter, and thus including it in the image set, instead of the T1-weighted spin-echo image, deteriorated the segmentation results for the normal tissues. On the other hand, the FLAIR image presented a large CNR between tumor zones and the normal tissues, so its inclusion in the image set improved the separation of the clusters for the tumor zones in the feature space. The T1-weighted image presented a good contrast between white and gray matter, so its inclusion in the image set resulted in a good separation of the normal tissue clusters in the feature space. To illustrate the above findings, cluster plots and segmentation results for four studies of a brain tumor are shown in Figs. 24–27. These figures show the results obtained using image sets 1, 4, 8, and 2 (as defined above), respectively.
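As a small illustration of the contrast-to-noise reasoning above, the sketch below estimates a CNR between two tissue regions as the absolute difference of their mean intensities divided by the standard deviation measured in a background region. The synthetic image and the region definitions are hypothetical; they are not the study data.

    import numpy as np

    def cnr(image, roi_a, roi_b, noise_roi):
        # Contrast-to-noise ratio between two tissues:
        # |mean(A) - mean(B)| / (standard deviation in a noise region).
        return abs(image[roi_a].mean() - image[roi_b].mean()) / image[noise_roi].std()

    # Hypothetical slice with a slightly brighter "gray matter" patch.
    rng = np.random.default_rng(2)
    img = rng.normal(100.0, 5.0, size=(256, 256))
    img[50:100, 50:100] += 8.0
    gm = np.zeros(img.shape, dtype=bool); gm[50:100, 50:100] = True
    wm = np.zeros(img.shape, dtype=bool); wm[150:200, 150:200] = True
    bg = np.zeros(img.shape, dtype=bool); bg[:20, :20] = True
    print("white/gray CNR =", cnr(img, wm, gm, bg))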
FIGURE 24 Cluster plot of the feature space generated by the optimal linear transformation using images in Fig. 23a–d, f with clusters marked and the corresponding segmented regions shown on the right.
FIGURE 25 Cluster plot of the feature space generated by the optimal linear transformation using images in Fig. 23a, c, f with clusters marked and the corresponding segmented regions shown on the right.
9 A TISSUE CHARACTERIZATION APPLICATION
To illustrate the role of the feature space method in tissue characterization, the following study was carried out. MRI studies of a patient before and after surgery and radiation therapy were analyzed by the feature space method, and the postsurgery results were compared to the presurgery results. Figures 28 and 30 show, respectively, acquired (original), processed (after registration, intracranial segmentation, noise suppression, and nonuniformity correction), and transformed images of the corresponding slice from the presurgery and postsurgery studies. Figures 29 and 31 show the cluster plots for the transformed images and the boundaries of the ROIs defined for different tissues. They also show the segmentation of the images into white matter, gray matter, partial volume between white and gray matter, zones of
FIGURE 26 Cluster plot of the feature space generated by the optimal linear transformation using images in Fig. 23a–g with clusters marked and the corresponding segmented regions shown on the right.
the lesion and its partial volume regions, and CSF. A comparison of these figures shows that the tumor and its surrounding zones (S4, S7–S9) have disappeared after surgery and radiation therapy, but a new zone, which could be radiation necrosis (labeled S10), has appeared. Note that the location of S10 on the feature space shown in Fig. 31 is different from the locations of any of the lesion zones in Fig. 29.
10 DISCUSSION
Detecting the existence of brain tumors or infarctions from MRI is relatively simple. This detection can be carried out automatically by a symmetry analysis. However, fast, accurate, and reproducible segmentation and character-
FIGURE 27 Cluster plot of the feature space generated by the optimal linear transformation using images in Fig. 23a–e with clusters marked and the corresponding segmented regions shown on the right.
ization of the lesions are complicated. As such, diagnostic classification of brain tumors, as well as that of other cerebral space occupying lesions that need to be distinguished from brain tumors, is still based on histological examination of tissue samples obtained via biopsy or excision. These examinations are carried out to (1) establish a histological diagnosis, (2) determine the histological boundaries of a lesion, and (3) establish whether the lesion comprises solid tumor tissue, isolated tumor cells within the parenchyma, or some other growth pattern. The ultimate goal of MRI processing algorithms includes accomplishing the above noninvasively. In this chapter, we cited many related image processing articles and reviewed and illustrated a recent technique that has a sound mathematical basis and the highest potential for achieving this goal. We also showed an example in which biopsy samples were used to validate
FIGURE 28 Presurgery study: O1–O6, four T2-weighted multiple spin-echo images (TE/TR = 25–100/2000 ms) and two T1-weighted (without and with gadolinium, respectively, TE/TR = 25/500 ms) images of a brain tumor. R1–R6, original images after the registration, intracranial segmentation, restoration, and nonuniformity corrections were applied. T1–T3, transformed images.
the image analysis results. Most of the techniques presented in this chapter were implemented in a user-friendly software package called ‘‘eigentool,’’ which runs on Sun workstations and is available to researchers working in this field free of charge. We presented a step-by-step methodology for the processing of brain MRI studies. Recently, compound methods have been proposed for accomplishing some of these steps together. For example, it has been proposed to
FIGURE 29 Presurgery study: Cluster plot with ROI borders defined for clusters. S1–S9, segmented regions for white matter, gray matter, partial volume between white and gray matter, tumor, and surrounding zones of the tumor, respectively.
combine nonuniformity correction and image segmentation in a single iterative algorithm. Currently, these methods are computationally intense, compared to the step-by-step procedures, and therefore their use is not clinically feasible. Most of the literature on brain MRI processing and the examples presented here use gray level features for tissue segmentation and characterization. It is expected that future work will consider the addition of other features (e.g., spatial connectivity, texture and edge information) and meth-
FIGURE 30 Postsurgery and radiation study: O1–O6, four T2-weighted multiple spin-echo images (TE/TR = 25–100/2000 ms) and two T1-weighted (without and with gadolinium, respectively, TE/TR = 25/500 ms) images of a brain tumor. R1–R6, original images after the registration, intracranial segmentation, restoration, and nonuniformity corrections were applied. T1–T3, transformed images.
ods (e.g., mathematical morphology and a priori knowledge) into the segmentation procedure. Segmentation results are presented as black-and-white cross-sectional pictures, because of publication limitations. In practice, color-coded images and 3-D visualization are normally used to present and utilize the segmentation results more efficiently. The methods presented in this chapter have generated promising results. However, further research in image acquisition and processing needs
FIGURE 31 Postsurgery and radiation study: Cluster plot with ROI borders defined for clusters. S1–S10, segmented regions for white matter, gray matter, partial volume between white and gray matter, and zones of the lesion, respectively. Compared to the cluster plot and the segmented images in Fig. 29, it can be seen that the tumor and its surrounding zones (S4, S7–S9) have disappeared, but a new zone corresponding to radiation necrosis (labeled S10) is now present.
to be undertaken in order to generate faster, more accurate, and more reproducible segmentation and characterization of tissues from MRI. New image acquisition techniques such as magnetization transfer, diffusion, and perfusion will generate additional information regarding the anatomy and physiology of the tissues, useful for their segmentation and characterization. These images can be processed by the methods described in this chapter. However,
since these new images may have different geometric distortion and resolution, utilizing them along with conventional images requires specific preprocessing to adjust the resolution and correct their geometric distortion. Magnetic resonance spectroscopy (MRS) and magnetic resonance spectroscopic imaging (MRSI) are also expected to play significant roles in MRI tissue characterization. The acquisition time and resolution of MRSI images are limiting factors for their use in clinical examinations. By advancement of the MRI technology, these techniques are becoming clinically feasible.

ACKNOWLEDGMENTS

Development and evaluation of many of the techniques presented in this chapter have been supported in part by the NIH under grants CA46124 and CA61263, and by the NSF under grant BES-9911084. Joe Windham provided mentorship and strong support for the work, and we are deeply grieved by his death. Several colleagues helped with the implementation and evaluation of the techniques. We would like to thank Donald Peck and Tom Mikkelsen for their assistance with testing and evaluation studies as well as interpretation of the results. We also would like to thank Lucie Bower, Jalal Soltanianzadeh, Amir Ghanei, Tom Brusca, and Holly Niggeman for their help with programming, and David Hearshen, Michael Jacobs, Jeff Hasenau, Linda Emery, Tanya Murnock, and Lisa Scarpace for their help with data collection and analysis.

REFERENCES

1. PA Forsyth, PJ Kelly, TL Cascino, BW Scheithauer, EG Shaw, RP Dinapoli, EJ Atkinson. Radiation necrosis or glioma recurrence: is computer-assisted stereotactic biopsy useful? J Neurosurg 82(3):436–444, 1995.
2. PJ Kelly, C Daumas-Duport, DB Kispet, BA Kall, BW Scheithauer, JJ Illig. Imaging-based stereotactic serial biopsies in untreated intracranial glial neoplasms. J Neurosurg 66(6):865–874, 1987.
3. PC Burger, P Kleihues. Cytologic composition of the untreated glioblastoma multiforme: a postmortem study of 18 cases. J Cancer 63(10):2014–2023, 1989.
4. PC Burger, BW Scheithauer, FS Vogel. Surgical Pathology of the Nervous System and Its Coverings. Churchill Livingstone, 1991.
5. H Soltanian-Zadeh, JP Windham, DJ Peck, AE Yagle. A comparative analysis of several transformations for enhancement and segmentation of magnetic resonance image scene sequence. IEEE Trans Med Imag 11:302–318, 1992.
6. M Ozkan, BM Dawant. Neural network based segmentation of multi-modal medical images: a comparative and prospective study. IEEE Trans Med Imag 12:534–544, 1993.
7.
LO Hall, AM Bensaid, LP Clarke, RP Velthuizen, MS Silbiger, JC Bezdek. A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain. IEEE Trans Neural Networks 3:672–682, 1992. J Alirezaei, ME Jernigan, C Nihmias. Neural network based segmentation of magnetic resonance images of the brain. Proc IEEE Conf Med Imag, 1995, pp 1397–1401. B Ashjaei, H Soltanian-Zadeh. A comparative study of neural network methodologies for segmentation of magnetic resonance images. Proceedings of the International Conference on Image Processing, Lausanne, Switzerland, 1996. RL DeLaPlaz, PJ Chang, JV Dave. Approximate fuzzy C-means (AFCM) cluster analysis of medical magnetic resonance image (MRI) data. Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 1987, pp 869–871. RL DeLaPaz, EH Herskovits, V Di Gesu, WA Hanson, R Bernstein. Cluster analysis of medical magnetic resonance images (MRI) data: diagnostic application and evaluation. SPIE 1259:176–181, 1990. A Simmons, SR Arridge, GJ Barker, AJ Cluckie, PS Tofts. Improvements to the quality of MRI cluster analysis. J Magn Reson Imag 12:1191–1204, 1994. MC Clark, LO Hall, DB Goldgof, LP Clarke, RP Velthuizen, MS Silbiger. MRI segmentation using fuzzy clustering techniques. IEEE Engineering in Medicine and Biology, November/December 1994, pp 730–742. ME Brandt, TP Bohan, LA Kramer, JM Fletcher. Estimation of CSF, white and gray matter volumes in hydrocephalic children using fuzzy clustering of MR images. Computerized Medical Imaging and Graphics 18:25–34, 1994. RP Velthuizen, LP Clarke, S Phuphanich, LO Hall, AM Bensaid, JA Arrington, HM Greenberg, ML Silbiger. Unsupervised measurement of brain tumor volume on MR images. J Magn Reson Imag 5:594–605, 1995. WE Phillips, RP Velthuizen, S Phuphanich, LO Hall, LP Clarke, ML Silbiger. Application of fuzzy C-means segmentation technique for tissue differentiation in MR images of a hemorrhagic glioblastoma multiforme. J Magn Reson Imag 13:277–290, 1995. M Vaidyanathan, LP Clark, RP Velthuizen, S Phuphanich, AM Bensaid, LO Hall, JC Bezdek, H Greenberg, A Trotti, M Silbiger. Comparison of supervised MRI segmentation methods for tumor volume determination during therapy. J Magn Reson Imag 13:719–728, 1995. B Ashjaei, H Soltanian-Zadeh. Adaptive clustering techniques for segmentation of magnetic resonance images. Proceedings of the ISRF-IEE International Conference on Intelligent and Cognitive Systems, Neural Networks Symposium, Tehran, Iran, 1996. H Soltanian-Zadeh, JP Windham, L Robbins. Semi-supervised segmentation of MRI stroke studies. Proceedings of SPIE Medical Imaging 1997: Image Processing Conference, Newport Beach, CA, 1997. JP Windham, MA Abd-Allah, DA Reimann, JW Froelich, AM Haggar. Eigenimage filtering in MR imaging. J Comput Assist Tomogr 12(1):1–9, 1988.
8.
9.
10.
11.
12. 13.
14.
15.
16.
17.
18.
19.
20.
Feature Space Analysis of MRI 21. 22. 23.
24.
25. 26.
27. 28.
29.
30. 31.
32.
33.
34. 35.
36.
37.
H Soltanian-Zadeh, JP Windham, JM Jenkins. Error propagation in eigenimage filtering. IEEE Trans Med Imag 9(4):405–420, 1990. DJ Peck, JP Windham, H Soltanian-Zadeh, JP Roebuck. A fast and accurate algorithm for volume determination in MRI. Med Phys 19(3):599–605, 1992. H Soltanian-Zadeh. Multi-dimensional signal processing of magnetic resonance scene sequences. PhD dissertation, University of Michigan, Ann Arbor, MI, 1992. H Soltanian-Zadeh, JP Windham, AE Yagle. Optimal transformation for correcting partial volume averaging effects in magnetic resonance imaging. IEEE Trans Nuc Sci 40(4):1204–1212, 1993. H Soltanian-Zadeh, JP Windham. Letter to the Editor: Mathematical basis of eigenimage filtering. Magn Reson Med 31(4):465–466, 1994. DJ Peck, JP Windham, L Emery, H Soltanian-Zadeh, DO Hearshen, T Mikkelsen. Cerebral tumor volume calculations using planimetric and eigenimage analysis. Medical Physics 23(12):2035–2042, 1996. H Soltanian-Zadeh, JP Windham, DJ Peck. Optimal linear transformation for MRI feature extraction. IEEE Trans Med Imag 15(6):749–767, 1996. MW Vannier, RL Butterfield, D Jordan, WA Murphy, RG Levitt, M Gado. Multispectral analysis of magnetic resonance images. Radiology 154:221– 224, 1985. LP Clarke, RP Velthuizen, MA Camacho, JJ Heine, M Vaidyanathan, LO Hall, RW Thatcher, ML Silbiger. Review of MRI segmentation: Methods and applications. Magn Reson Imag 13(3):343–368, 1995. AK Jain. Fundamentals of Digital Image Processing. Englewood Cliffs, New Jersey: Prentice-Hall, 1989. H Soltanian-Zadeh, JP Windham, DJ Peck. MRI feature extraction using a linear transformation. Proceedings of SPIE Medical Imaging 1993: Image Processing Conference, Feb 1993, Vol 1898, pp 487–500. RP Velthuizen, LO Hall, LP Clarke. MMRI feature extraction using genetic algorithms. Proceedings of the 18th Annual International Conference of the IEEE Engineering in Medicine and Biology Society 1996, Vol 3, pp 1138– 1139. RP Velthuizen, LO Hall, LP Clarke. An initial investigation of feature extraction with genetic algorithms for fuzzy clustering. Biomed Eng—Applications, Basis, and Communications 8(6):496–517, 1996. RP Velthuizen, LO Hall, LP Clarke. Feature extraction for MRI segmentation. J Neuroimaging 9(2):85–90, 1999. DD Blatter, ED Bigler, SD Gale, SC Johnson, CV Anderson, BM Burnett, N Parker, S Kurth, SD Horn. Quantitative volumetric analysis of brain MR: normative database spanning 5 decades of life. AJNR 16:241–251, 1995. DC Bonar, KA Schaper, JR Anderson, DA Rottenberg, SC Strother. Graphical analysis of MR feature space for measurement of CSF, gray-matter, and whitematter volumes. J Comput Assist Tomogr 17(3):461–470, 1993. G Gerig, R Kikinis. Segmentation of 3-D magnetic resonance data. Proceedings of 5th Int Conf on Image Analysis and Processing, Positano, Italy, 1989, pp 602–609.
38.
EF Jackson, PA Narayana, JC Falconer. Reproducibility of nonparametric feature map segmentation for determination of normal human intracranial volumes with MR imaging data. J Magn Reson Imag 4:692–700, 1994. S Vinitski, C Gonzalez, F Mohamed, T Iwanaga, RL Knobler, K Khalili, J Mack. Improved intracranial lesion characterization by tissue segmentation based on a 3-D feature map. Magn Reson Med 37:457–469, 1997. B Johannes, R Jurgen, T Thomas. Differentiation of normal and pathologic brain structures in MRI using exact T1 and T2 values followed by a multidimensional cluster analysis. MED-INFO Proceedings, 1995, pp 687–691. MW Vannier, RL Butterfield, D Jordan, WA Murphy, RG Levitt, M Gado. Multispectral analysis of magnetic resonance images. Radiology 154:221– 224, 1985. JR Mitchell, SJ Karlik, DH Lee, A Fenster. Computer-assisted identification of multiple sclerosis lesions in MR imaging volumes in the brain. J Magn Reson Imag 4(20:197–208, 1994. PH Higer, B Gernot. Tissue Characterization in MR Imaging: Clinical and Technical Approaches. New York, Springer-Verlag, 1990. M Just, M Thelen. Tissue characterization with T1, T2, and proton density values: results in 160 patients with brain tumors. Radiology 169:779–785, 1988. HK Brown, TR Hazelton, JV Fiorica, AK Parsons, LP Clarke, ML Silbiger. Composite and classified color display in MR imaging of the female pelvis. J Magn Reson Imaging 10(1):143–145, 1992. HK Brown, TR Hazelton, ML Silbiger. Generation of color composite for enhanced tissue differentiation in magnetic resonance imaging of the brain. American Journal of Anatomy 192(1):23–24, 1991. H Grahn, NM Szeverenyi, MW Roggenbuck, F Delaglio, P Geladi. Data analysis of multivariate magnetic resonance images: I. A principal component analysis approach. J Chemometrics and Intelligent Laboratory Systems 5: 311–322, 1989. JP Windham, H Soltanian-Zadeh, DJ Peck. Tissue characterization by a vector subspace method. Presented at the 33rd Annual Meeting of the American Association of Physicists in Medicine (AAPM), San Francisco, CA, July 1991. Abstract published in Med Phys 18(3):619, 1991. JN Lee, SJ Riederer. The contrast-to-noise in relaxation time, synthetic, and weighted-sum MR images. Magn Reson Med 5:13–22, 1987. U Schmiedl, DA Ortendahl, AS Mark, I Berry, L Kaufman. The utility of principle component analysis for the image display of brain lesions: a preliminary, comparative study. Magn Reson Med 4:471–486, 1987. JR MacFall, SJ Riederer, HZ Wang. An analysis of noise propagation in computed T2, pseudodensity, and synthetic spin-echo images. Med Phys 13(3):285–292, 1986. ER McVeigh, MJ Bronskill, RM Henkelman. Optimization of MR protocols: a statistical decision analysis approach. Magn Reson Med 6:314–333, 1988. RO Duda, PE Hart. Pattern Classification and Scene Analysis. New York, John Wiley, 1973, pp 114–121.
39.
40.
41.
42.
43. 44.
45.
46.
47.
48.
49. 50.
51.
52. 53.
Feature Space Analysis of MRI 54. 55.
56. 57.
58.
59.
60.
61. 62.
63. 64.
65. 66.
67. 68.
69. 70. 71.
H Soltanian-Zadeh, JP Windham. Optimal feature space for MRI. Proc of the 1995 IEEE Medical Imaging Conference, San Francisco, CA, Oct 1995. LP Clarke, RP Velthuizen, MA Camacho, JJ Heine, M Vaidyanathan, LO Hall, RW Thatcher, ML Silbiger. Review of MRI segmentation: methods and applications. J Magn Reson Imag 13(3):343–368, 1995. GTY Chen, CA Pelizzari. Image correlation techniques in radiation therapy treatment planning. Comput Med Imag Graph 13(3):235–240, 1989. MV Herk, HM Kooy. Automatic three-dimensional correlation of CT-CT, CTMRI, and CT-SPECT using chamfer matching. Med Phys 21(7):1163–1178, 1994. J Mangin, V Frouin, I Bloch, B Bendriem, J Lopez-Krahe. Fast nonsupervised 3D registration of PET and MR images of the brain. Cereb Blood Flow Metab 14:749–762, 1994. PA Van den Elsen, JB Antoine Maintz, ED Pol, MA Viergever. Automatic registration of CT and MR brain images using correlation of geometrical features. IEEE Trans Med Imag 14(2):384–396, 1995. BA Ardekani, M Braun, BF Hutton, I Kanno, H Iida. A fully automatic multimodality image registration algorithm. J Comput Asist Tomogr 19(4):615– 623, 1995. AP Zijdenbos, BM Dawant, RA Margolin. Automatic detection of intracranial contours in MRI images. Comput Med Imag Graph 18(1):11–23, 1994. A Chakraborty, LH Staib, JS Duncan. Deformable boundary finding in medical images by integrating gradient and region information. IEEE Trans Med Imag 15(6):859–870, 1996. C Davatzikos, RN Bryan. Using a deformable surface model to obtain a shape representation of the cortex. IEEE Trans Med Imag 15(6):785–795, 1996. Y Ge, JM Fitzpatrick, BM Dawant, J Bao, RM Kessler, RA Margolin. Accurate localization of cortical convolutions in MR brain images. IEEE Trans Med Imag 15(4):418–428, 1996. S Sandor, R Leahy. Surface-based labeling of cortical anatomy using a deformable atlas. IEEE Trans Med Img 16(1):41–54, 1997. G Henrich, N Mai, H Backmund. Preprocessing in computed tomography picture analysis: a bone-deleting algorithm. J Comput Assist Tomogr 3(3): 379–384, 1979. H Soltanian-Zadeh, JP Windham. A multi-resolution approach for intracranial volume segmentation from brain images. Med Phys 24(12):1844–1853, 1997. D Collins, P Neelin, DR Spelbring, TM Peters, AC Evans. Automatic 3D intersubject registration of MR volumetric data in standardized talairach space. J Comput Assist Tomogr 18(2):192–205, 1994. C Davatzikos, JL Prince, RN Bryan. Image registration based on boundary mapping. IEEE Trans Med Imag 15:112–115, 1996. C Davatzikos. Spatial normalization of 3D brain images using deformable models. J Comput Assist Tomogr 20(4):656–665, 1996. P Thompson, AW Toga. A surface-based technique for warping three-dimensional images of the brain. IEEE Trans Med Imag 15:402–417, 1996.
72.
A Ghanei, H Soltanian-Zadeh, M Jacobs. Boundary-based warping of brain images. Proc of SPIE Medical Imaging 2000: Image Processing Conference, San Diego, CA, Feb 2000. WK Pratt. Digital Image Processing. New York, John Wiley, 1978. KR Castleman. Digital Image Processing. Englewood Cliffs, New Jersey: Prentice-Hall, 1979. RC Gonzalez, P Wintz. Digital Image Processing. 2d ed. Reading, Massachusetts: Addison-Wesley, 1987. AK Jain. Fundamentals of Digital Image Processing. Englewood Cliffs, New Jersey: Prentice-Hall, 1989. MS Tkins, BT Mackiewich. Fully automatic segmentation of the brain in MRI. IEEE Trans Med Imag 17(1):98–107, 1998. A Simmons, PS Tofts, GJ Barker, DAG Wicks, SA Arridge. Considerations for RF nonuniformity correction of spin echo images at 1.5 T, SMRM’92, Book of Abstracts, 4240. A Simmons, PS Tofts, GJ Barker, SA Arridge. Improvement to dual echo clustering of neuroanatomy in MRI, SMRM’92, Book of Abstracts, 4202. BR Condon, J Patterson, D Wyper, A Jenkins, DM Hadley. Image non-uniformity in magnetic resonance imaging: its magnitude and methods for its correction. British Journal of Radiology 60(709):83–87, 1987. DAG Wicks, GJ Barker, PS Tofts. Correction of intensity non-uniformity in magnetic resonance images of any orientation. Magn Reson Imag 11(2):183– 196, 1993. WM Wells III, WEL Grimson, R Kikinis, FA Jolesz. Adaptive segmentation of MRI data. IEEE Trans Med Imag 15(4):429–442, 1996. R Guillemaud, M Brady. Estimating the bias field of MR images. IEEE Trans Med Imag 16(3):238–251, 1997. CR Meyer, PH Bland, J Pipe. Retrospective correction of MRI amplitude inhomogeneities. Proc CVRMed’96 (N Ayache, ed). Berlin, Springer-Verlag, 1995, pp 513–522. KO Lim, A Pfefferbaum. Segmentation of MR brain image into cerebrospinal fluid spaces and white and gray matter. J Comput Assist Tomogr 13(4):588– 593, 1989. BM Dawant, AP Zijdenbos, RA Margolin. Correction of intensity variation in MR images for computer-assisted tissue classification. IEEE Trans Med Imag 12(4):770–781, 1993. L Nocera, JC Gee. Robust partial volume tissue classification of cerebral MRI scans. Proc of SPIE Medical Imaging 1997: Image Processing Conference, 1997, Vol 3034, Part 1, pp 312–322. H Soltanian-Zadeh, JP Windham, AE Yagle. A multidimensional nonlinear edge-preserving filter for magnetic resonance image restoration. IEEE Trans Imag Proc 14(2):147–161, 1995. HC Andrews, CL Patterson. Singular value decomposition and digital image processing. IEEE Trans ASSP 24:26–53, 1976. TS Huang, PM Narenda. Image restoration by singular value decomposition. Appl Optics 14:2213–2216, 1975.
73. 74. 75. 76. 77. 78.
79. 80.
81.
82. 83. 84.
85.
86.
87.
88.
89. 90.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Feature Space Analysis of MRI 91. 92. 93. 94.
95. 96.
97. 98.
99.
100. 101. 102.
103. 104. 105. 106. 107. 108. 109.
315
BR Hunt, O Knubler. Karhunen-Loeve multispectral image restoration—Part I: Theory. IEEE Trans ASSP-32:592–599, 1984. N Ahmed, KR Rao. Orthogonal Transforms for Digital Image Processing. New York, Springer-Verlag, 1975, pp 189–224. JT Tou, RC Gonzalez. Pattern Recognition Principles. 2d ed. Addison-Wesley, 1977. SA Zoharian, AJ Jarghaghi. Minimum mean-square error transformation of categorical data to target positions. IEEE Trans Signal Proc 40(1):13–23, 1992. DG Luenberger. Optimization by Vector Space Methods. New York, John Wiley, 1969. H Soltanian-Zadeh, JP Windham. Novel and general approach to linear filter design for CNR enhancement of MR images with multiple interfering features in the scene. Journal of Electronic Imaging 1(2):171–182, 1992. S Wilks. Mathematical Statistics. New York, John Wiley, 1962, pp 577–578. DG Brown, JN Lee, RA Blinder, HZ Wang, SJ Riederer, LW Nolte. CNR enhancement in the presence of multiple interfering processes using linear filters. Magn Reson Med 14(1):79–96, 1990. JB de Castro, TA Tasciyan, JN Lee, F Farzaneh, SJ Riederer, RJ Herfkens. MR subtraction angiography with matched filter. J Comput Assist Tomogr 12(2):355–362, 1988. RB Buxton, F Greensite. Target-point combination of MR images. Mag Res Med 18(1):102–115, 1991. JK Gohagan, EL Spitznagel. Multispectral analysis of MR images of the breast. Radiology 163(3):703–707, 1987. HA Koenig, R Bachus, ER Reinhardt. Pattern recognition for tissue characterization in magnetic resonance imaging. Health Care Instrumentation Siemens Publication 184–187, 1986. PA DeVijver, J Kittler. Pattern Recognition: A Statistical Approach. London: Prentice-Hall International, 1982. K Fukunaga. Introduction to Statistical Pattern Recognition. 2d ed. San Diego, Academic Press, 1990. GH Ball, DJ Hall. A clustering technique for summarizing multi-variate data. Behavioral Science 12:153–155, 1967. EL Madsen, JC Blechinger, GR Frank. Low-contrast focal lesion detectability phantom for 1H MR imaging. Med Phys 18:549–554, 1991. JC Gore, MJ Maryanski, RJ Schulz. Contrast-detail quality assurance phantoms for MRI using polymer gels. Med Phys 23(6):1136–1137, 1996. JJ Bartko. Measurement and reliability: statistical thinking considerations. Schizophrenia bull 17(3):483–489, 1991. AP Zijdenbos, BM Dawant, RA Margolin, AC Palmer. Morphometric analysis of white matter lesions in MR images: method and validation. IEEE Trans Med Imag 13:716–724, 1994.
11
Geometric Approaches for Segmentation and Signal Detection in Functional MRI Analysis

Guillermo Sapiro
University of Minnesota, Minneapolis, Minnesota
1 INTRODUCTION
Functional magnetic resonance imaging (fMRI) is one of the most significant and revolutionary advances in MRI in recent years. This technique uses MRI noninvasively to map areas of increased neuronal activity in the human brain without the use of an exogenous contrast agent. Since its initial demonstration, fMRI has emerged as a powerful and revolutionary tool for studying brain function and has generated an enormous amount of interest among neuroscientists, psychologists, nuclear magnetic resonance scientists, engineers, computer scientists, mathematicians, and clinicians. To date, fMRI has been applied to study a variety of neuronal processes, ranging from activities in the primary sensory and motor cortex to cognitive functions including perception, attention, language, learning, and memory. The majority of fMRI experiments are based on the blood oxygenation level dependent (BOLD) contrast, which is derived from the fact that deoxyhemoglobin is paramagnetic. Changes in the local concentration of deoxyhemoglobin within the brain lead to alterations in the magnetic resonance signal.
It is generally assumed that neuronal activation induces an increase in regional blood flow without a commensurate increase in the regional oxygen consumption rate (CMRO2), in which case the capillary and venous deoxyhemoglobin concentrations should decrease, leading to an increase in T2* and T2. This increase is reflected as an evaluation of intensity in T2*- and T2-weighted MR images. Among the various parts of the brain, the cortex is the most prominent, and one of the most intensely studied. The cortex is composed mainly of two types of tissue: gray matter (neurons) and white matter (connectors). Gray matter forms the outer layer of the cortex, encasing the inner white matter almost completely. Despite its complex outward appearance, the structure of the gray matter in each hemisphere is quite straightforward; it is roughly that of two crumpled sheets, one for each hemisphere; neither sheet has holes or self-intersections. There are various ways to visualize gray matter. One approach is to create a 3-D model of gray matter. With this approach, mainly the exterior surface is seen. Much of cortical gray matter, however, is buried deep within the folds of the brain. Visualizing the neural activity recorded by fMRI within these sulci requires novel visualization techniques. An increasingly popular way of visualizing such mappings is to superimpose fMRI measurements on flattened or inflated representations of the cortical surface [1,8,11,12,17,35,42,43]. This can be done for example by computing a planar representation of a region of gray matter so that distances (or angles, or areas) on the plane are similar to the corresponding geometric measurements within the gray matter. To create the map, we must then be able to measure distances and other geometric characteristics within the segmented gray matter. Hence, we must identify both the gray matter and its topological connectivity. To accomplish this task, we designed and built the system described in the first part of this chapter. Once the segmentation is achieved, the fMRI data can be overlayed onto the gray matter, and then visualized and analyzed. This system is being widely used in fMRI laboratories, mainly at the laboratory of Brian Wandell at Stanford (see white.stanford.edu), where a large number of improvements have been made since the initial development of the algorithm and system; see Ref. 36. We describe below only the main characteristics and underlying concepts. We should also note that many of the concepts in Ref. 36 can be found and have been adapted by commercially available programs like Brain Voyager. In the second part of the chapter we describe an algorithm for detecting pixels associated with neural activity in fMRI. We propose to use geometric anisotropic diffusion to exploit the spatial correlation between the active pixels in functional MRI. The anisotropic diffusion flow is applied to a
probability image, obtained either from t-map statistics or via a Bayes rule. Examples with simulated and real data show improvements over classical pixel-based techniques.
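As a rough illustration of how such a probability image can be produced from t-map statistics before any spatial processing is applied, the following sketch computes a voxelwise two-sample t test between task and rest frames of a block-design series and reports 1 − p as a crude per-pixel activation probability. The block design, array sizes, and noise model are assumptions made for this example, not the experimental protocol used in this chapter.

    import numpy as np
    from scipy import stats

    def t_probability_map(frames, task, rest):
        # Voxelwise two-sample t test between task and rest frames of a
        # time series with shape (time, x, y); returns 1 - p as a crude
        # per-pixel "activation probability" image.
        result = stats.ttest_ind(frames[task], frames[rest], axis=0, equal_var=False)
        return 1.0 - result.pvalue

    # Hypothetical block design: 40 frames, an active 5x5 patch during "task".
    rng = np.random.default_rng(3)
    frames = rng.normal(0.0, 1.0, size=(40, 32, 32))
    task = np.arange(40) % 10 < 5
    frames[task, 10:15, 10:15] += 1.0
    prob = t_probability_map(frames, task, ~task)
    active = np.zeros((32, 32), dtype=bool); active[10:15, 10:15] = True
    print("mean 1-p inside patch:", prob[active].mean(),
          " outside:", prob[~active].mean())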
2 GRAY MATTER SEGMENTATION
One difficulty in achieving a direct gray matter segmentation from MR data is that MR intensity levels within gray matter significantly overlap with the levels from both white matter and nonbrain matter. Regions of gray matter can be as narrow as one or two voxels, so that a large percentage of the gray matter suffers from partial volume effects, thereby limiting the effectiveness of the intensity-based gray matter segmentation algorithms. A second problem for our application is the difficulty in determining gray matter connectivity from gray matter segmentation alone. For example, the gray matter voxels on opposite sides of a sulcus may be adjacent to one another on the sampling grid. Yet, the gray matter voxels on opposite sides of the sulcus should not be connected. Hence, gray matter connectivity cannot be discerned only from the segmentation result; additional information about the nearby white matter is necessary to determine the connectivity. Various gray-matter segmentation techniques have been proposed, e.g., see Refs. 44, 41, 7, and 2. Most of these use image segmentation methods that do not incorporate knowledge of the basic features of cortical anatomy. Most importantly, it is difficult to compute accurately the gray matter connectivity from these segmentations. There are techniques that do use some information about cortical anatomy; the manner in which such knowledge is employed tends to be local and statistical [18,40]. The methods that are closest to ours are those described in Refs. 19, 28, and 8. We discuss the relationship between these works and ours in detail in Ref. 36. Methods using deformable surfaces can produce connected segmentations [6,8,20,25,38]. Although these methods have several useful features, detailed in Ref. 36, they also have one major problem (see Ref. 28 for a detailed discussion): the minimization process used to deform the surface is prone to local minima. This frequency occurs near deep sulci with narrow openings that are present in the occipital lobe of the human cortex (see Ref. 9, Fig. 7, for an illustration of this problem). The segmentation method proposed in this paper can be used to initialize such algorithms [6,28]. Recent methods have alleviated this problem to some extent [16,45] by, among other things, introducing prior knowledge about the width of the gray matter, as is done in the algorithm described in this chapter.
2.1 The Basic Algorithm
The segmentation method comprises four steps. First, the white matter and CSF (nonbrain) regions in the MR volume are segmented.1 Second, the user selects the desired cortical white matter component. Third, the white matter structure is verified by checking for cavities and handles. Fourth, a connected representation of the gray matter is created by growing out from the white matter boundary. The gray matter growing is subject to two main constraints: (1) new gray matter cannot grow into voxels that have been already classified as CSF, white matter, or gray matter; (2) connectivity of the segmented gray matter must be maintained during the growing process. An important feature of the method we describe below is that gray matter is calculated from white matter segmentation. This is motivated in part by the nature of the data. Because cortical gray matter voxels border either on white matter or on CSF, MR signals from the gray matter often suffer from partial volume effects so that segmentation based on the intensity level of the MR gray matter signal is poor. White matter segmentation does not suffer from these problems to the same extent. The SNR of the MR data is adequate to achieve acceptable white matter segmentation using only local spatial constraints. Had the SNR been much lower, algorithms that promote global spatial constraints would probably be required [5]. A second reason for growing gray matter from white matter segmentation is that it simplifies the computation of gray matter connectivity, a main goal for our application. If one begins with a gray matter segmentation, making decisions about the connectivity of gray matter voxels on opposite sides of a sulcus is very difficult, perhaps impossible. By growing gray matter from white matter, we can keep track of which sulcal wall each gray matter voxel is on and thus develop the proper connectivity relationships. 2.1.1
Segmenting White Matter and CSF
In this step, we create three classes: white matter, CSF,2 and unknown. The unknown class contains mainly gray matter, but at this first stage, the segmentation of this class is unreliable, and in addition it contains no connectivity information. The voxel intensities within each class are first modeled as independent random variables with normal distributions. Thus the likelihood of a particular voxel, Vi, belonging to a certain class, Ci, is 1
1. The images used in this part of the chapter are obtained from a T1-weighted gradient echo volumetric acquisition system with TE set to the minimum full, TR set to 33 ms, NEX set to 1, and a 40 degree flip angle.
2. This class also includes other non-brain structure.
Pr(V_i = v | C_i = c) = [1 / (√(2π) σ_c)] exp[ −(v − μ_c)² / (2σ_c²) ]
where i is a spatial index ranging over all voxels in the MR volume and c is one of {white, unknown, CSF}. The values of μ_c and σ_c typically remain unchanged across different MR data sets collected using the same pulse sequence. Next, the posterior probability is computed for each voxel independently using Bayes' rule together with a homogeneous prior, Pr(C_i = c | V_i = v) = (1/K) Pr(V_i = v | C_i = c) Pr(C_i = c), where K is a normalizing constant independent of c. The exact prior depends on the part of the cortex and can be set in advance by the user. The system described here has been found to be robust to variations in the value of the priors. Posterior volumes are then smoothed anisotropically in three dimensions, preserving discontinuities. The anisotropic smoothing technique applied is a 3-D extension of the original 2-D version proposed by Perona et al. [30,31] (see Ref. 36 for details on the model parameters):

∂P_c/∂t = div( g(‖∇P_c‖) ∇P_c )     (1)

where P_c = Pr(C = c | V) represents the volume of posterior probabilities for class c, g(‖∇P_c‖) = exp(−(‖∇P_c‖/k_c)²), and k_c represents the rate of diffusion for class c. Figure 1 shows an example of a posterior probability derived from an homogeneous prior and its smoothed counterpart.
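A minimal sketch of the per-voxel likelihood and Bayes computation just described is given below. The class means, standard deviations, and homogeneous priors are placeholder values; as noted in the text, the actual numbers depend on the pulse sequence and are set in advance.

    import numpy as np

    # Placeholder class statistics (mu_c, sigma_c) and homogeneous priors.
    CLASS_STATS = {"white": (110.0, 10.0), "unknown": (80.0, 15.0), "csf": (40.0, 12.0)}
    PRIORS = {"white": 1.0 / 3, "unknown": 1.0 / 3, "csf": 1.0 / 3}

    def gaussian_likelihood(v, mu, sigma):
        # Pr(V = v | C = c) for a normal class-conditional density.
        return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

    def posterior_volumes(volume):
        # P_c = Pr(C = c | V = v) per voxel via Bayes' rule; the per-voxel
        # normalizer plays the role of the constant K in the text.
        weighted = {c: PRIORS[c] * gaussian_likelihood(volume, mu, sigma)
                    for c, (mu, sigma) in CLASS_STATS.items()}
        total = sum(weighted.values())
        return {c: w / total for c, w in weighted.items()}

    vol = np.random.default_rng(4).normal(90.0, 25.0, size=(8, 64, 64))  # stand-in volume
    post = posterior_volumes(vol)
    print({c: round(float(p.mean()), 3) for c, p in post.items()})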
FIGURE 1 From left to right: image of posterior probabilities corresponding to white matter class; image of corresponding MAP classification (brighter regions in the posterior image correspond to areas with higher probability, white regions in the classification image correspond to areas classified as white matter, black regions correspond to areas classified as CSF); image of white matter posterior probabilities after being anisotropically smoothed; image of MAP classification computed with smoothed posteriors.
A number of reasons motivate this step. First, anisotropic smoothing applied directly to the MR data would not take into account the fact that only three classes are segmented. In addition, anisotropic diffusion applied to the raw data is well motivated only when the noise is additive and class independent. For example, if two classes have the same mean and differ only in variance, anisotropic smoothing of the raw data is ineffective. Using anisotropic diffusion on the posterior probabilities to capture local spatial constraints was motivated by the intuition that posteriors with piecewise uniform regions result in segmentations with piecewise uniform regions. Applying anisotropic smoothing on the posterior probabilities is feasible even if the classes are described by general probability distribution functions. This novel application is related to (anisotropic) relaxation labeling [33] and is further discussed in detail in Ref. 37. Finally, the white matter and nonbrain classifications are obtained using the maximum a posteriori probability (MAP) estimate after anisotropic diffusion. That is, C* i =
arg max_{c ∈ {white, unknown, CSF}} Pr*(C_i = c | V_i = v)
where Pr*(C_i = c | V_i = v) corresponds to the posterior following anisotropic diffusion. As we see in Fig. 1, the white matter and CSF classifications are smooth and connected. The unknown class classification does not correspond to a plausible description of the gray matter. For this reason, we retain only the white matter and CSF segmentations. Figure 2 shows a schematic description of this part of the algorithm, as well as a toy example illustrating the idea.
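The sketch below illustrates one simple explicit discretization of Eq. (1) applied to a posterior volume, followed by the MAP labeling step. The conduction parameter, time step, iteration count, and the periodic boundary handling are choices made for the illustration and are not the settings of Ref. 36.

    import numpy as np

    def anisotropic_diffusion_3d(p, k=0.1, step=0.1, iterations=20):
        # Explicit discretization of Eq. (1): dP/dt = div(g(|grad P|) grad P)
        # with g(s) = exp(-(s/k)^2). Boundaries are handled periodically
        # via np.roll purely for brevity; pad the volume in practice.
        p = p.astype(np.float64).copy()
        for _ in range(iterations):
            update = np.zeros_like(p)
            for axis in range(3):
                for shift in (1, -1):
                    d = np.roll(p, shift, axis=axis) - p
                    update += np.exp(-(d / k) ** 2) * d
            p += step * update
        return p

    def map_labels(posteriors):
        # MAP rule: assign each voxel the class with the largest (smoothed) posterior.
        names = list(posteriors)
        stack = np.stack([posteriors[c] for c in names])
        return np.array(names)[np.argmax(stack, axis=0)]

    # Tiny demonstration on a noisy posterior-like volume.
    rng = np.random.default_rng(5)
    noisy = np.clip(rng.normal(0.5, 0.2, size=(8, 16, 16)), 0.0, 1.0)
    smooth = anisotropic_diffusion_3d(noisy)
    print("variance before %.4f, after %.4f" % (noisy.var(), smooth.var()))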
2.1.2 Selection of Cortical White Matter
Gray matter segmentation can be obtained from white matter segmentation because gray matter surrounds white matter. Thus it is important to ensure that we have obtained an accurate and topologically correct white matter segmentation.
FIGURE 2 (a) Schematic description of the posterior diffusion algorithm. (b) Simple example of the posterior diffusion algorithm. Two classes of the same average and different standard deviation are present in the image. The first row shows the result of the proposed algorithm (posterior, diffusion, MAP), while the second row shows the result of classical techniques (diffusion, posterior, MAP).
White matter connectivity is defined using 26-neighbor adjacency; that is, two distinct white matter voxels are adjacent to each other if their spatial coordinates differ by no more than one. Two white matter voxels are connected to each other if there is a path of white matter voxels connecting the two, such that all neighboring pairs of white matter voxels along the path are 26-neighbor adjacent. Gray matter connectivity is also defined using 26-neighbor adjacency. CSF connectivity, on the other hand, is defined using 6-neighbor adjacency; that is, two distinct voxels classified as CSF are adjacent to each other if exactly one of their spatial coordinates differ by one. The reason for defining the connectivity of CSF differently is to prevent intersections between regions of CSF and white or gray matter [23]. The initial classification generally yields several unconnected components labeled as white matter. Only one of these is the main section of white matter, the others being either parts of the cerebellum, or other non-brain materials. The user identifies a voxel in the cortical white matter component via a graphical user interface, and a flood-filling algorithm automatically identifies the entire connected component [14]. The purpose of this stage is, primarily, to remove extracortical components such as skin or cerebellum. If the MR volume has been cropped to an appropriate region of interest within the cortex, the cortical white matter component typically corresponds to the largest white matter component. 2.1.3
White Matter Topology
Gray matter is a single sheet that encases white matter. To ensure that there are not multiple gray matter sheets or self-interactions of the gray matter, we must eliminate gray matter grown within cavities or through white matter handles. Cavities are nonwhite matter regions that are completely surrounded by white matter, such as the inside of a tennis ball. Handles are nonwhite matter regions that are partially surrounded by white matter, such as the middle of a doughnut; see Ref. 36 for examples. In our application, handles may arise when the classification inappropriately assigns the white matter label to voxels that cross a sulcus, cutting through two layers of gray matter and CSF. We check for this condition in the previewer. Handles are removed by hand-editing or readjusting the parameters used to obtain the white matter classification. Automatic handles removal has recently been added by Brian Wandell and his team. It is possible to compute the number of handles using the Euler characteristic, χ, which is equal to the sum of the number of connected components and cavities, minus the number of handles. The first two quantities can be computed using flood-fill algorithms. The Euler characteristic can be
computed as the sum of the local Euler characteristic over all 2 × 2 × 2 voxel neighborhoods,3

χ_local = Σ_i ( v_i/8 − e_i/4 + f_i/2 − o_i )
where i ranges over all 2 × 2 × 2 voxel neighborhoods and v_i, e_i, f_i, o_i represent the number of vertices, edges, faces, and octants in the ith neighborhood, respectively [24]. Thus the number of handles can be computed as the sum of the number of connected components and cavities minus the Euler characteristic. A second type of segmentation error is the presence of cavities within the white matter. The problem created by cavities is that gray matter will grow on the boundary of the cavity and form a surface internal to the white matter. It is possible to eliminate white matter cavities in a number of simple ways, for example by repeatedly initiating the flood-filling algorithm from nonwhite matter voxels on the volume boundary. All nonwhite matter connected components that are not filled must be encased entirely by white matter and thus be cavities.
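For readers who prefer code to formulas, the sketch below evaluates the Euler characteristic globally as vertices minus edges plus faces minus octants of the cubical complex spanned by the foreground voxels, an equivalent but slower formulation than the table-lookup scheme of the footnote, and then counts handles as components plus cavities minus χ. The use of scipy.ndimage.label and the toy "tube" example are illustrative choices, not part of the original system.

    import numpy as np
    from scipy import ndimage

    def euler_characteristic(volume):
        # chi = vertices - edges + faces - octants of the cubical complex
        # spanned by the foreground voxels.
        verts, edges, faces = set(), set(), set()
        voxels = list(zip(*np.nonzero(volume)))
        for x, y, z in voxels:
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        verts.add((x + dx, y + dy, z + dz))
            for axis in range(3):
                for d1 in (0, 1):
                    for d2 in (0, 1):
                        e = [x, y, z]
                        e[(axis + 1) % 3] += d1
                        e[(axis + 2) % 3] += d2
                        edges.add((axis, tuple(e)))      # edge parallel to `axis`
                for d in (0, 1):
                    f = [x, y, z]
                    f[axis] += d
                    faces.add((axis, tuple(f)))          # face with normal `axis`
        return len(verts) - len(edges) + len(faces) - len(voxels)

    def count_handles(volume):
        # handles = connected components + cavities - chi (see text).
        vol = np.asarray(volume, dtype=bool)
        _, n_components = ndimage.label(vol, structure=np.ones((3, 3, 3)))  # 26-connectivity
        background, _ = ndimage.label(~vol)                                 # 6-connectivity
        border = np.unique(np.concatenate([background[0].ravel(), background[-1].ravel(),
                                           background[:, 0].ravel(), background[:, -1].ravel(),
                                           background[:, :, 0].ravel(), background[:, :, -1].ravel()]))
        cavities = set(np.unique(background)) - set(border) - {0}
        return n_components + len(cavities) - euler_characteristic(vol)

    tube = np.ones((3, 3, 5), dtype=bool)
    tube[1, 1, :] = False                     # a block with a hole drilled through it
    print("handles:", count_handles(tube))    # expected: 1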
2.1.4 Gray Matter Computation
The gray matter voxels are identified by growing, in 3-D, a sequence of layers that begin on the white matter boundary. The maximum number of gray matter layers is a parameter of the program and chosen by the user. The number of layers is determined by the spatial resolution of the MR data and the fact that the human cortex is roughly 4–5 mm thick (according to this, the number of layers that are grown). For example, suppose the MR data has a spatial resolution of 1 mm along each spatial dimension and we are identifying gray matter near the calcarine cortex in the occipital pole where the gray matter is roughly 5 mm thick. Then a maximum of 5 layers are grown. There may be fewer layers at any particular location if CSF is encountered before the n mm or gray matter from the opposite side of a sulcus is encountered. Thus the thickness of the final classification depends on (1) the maximum thickness of gray matter, (2) the CSF classification, and (3) potential collisions with gray matter growing from different portions of the white matter.
3. There are only 256 possible 2 × 2 × 2 neighborhood configurations; the local Euler characteristic of each possible configuration is precomputed and stored in a table. The Euler characteristic, and thus the number of handles, is then computed efficiently using table lookups.
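A much-simplified sketch of the layer-growing step described above is given below: each new layer consists of unclassified voxels that are 6-adjacent to the previous layer, and already-labeled voxels are never overwritten. The parent-connectivity bookkeeping that prevents growth across a sulcus, detailed in the following paragraphs, is deliberately omitted, and the label codes and toy volume are assumptions for the example.

    import numpy as np

    UNKNOWN, WHITE, CSF, GRAY = 0, 1, 2, 3    # hypothetical label codes

    def grow_gray_layers(labels, max_layers):
        # Grow gray matter outward from the white matter boundary, one
        # 6-connected layer at a time, never overwriting CSF, white matter,
        # or previously grown gray matter. Note that np.roll wraps at the
        # array borders; pad the volume in practice.
        labels = labels.copy()
        previous = labels == WHITE            # "layer 0" is the white matter itself
        for _ in range(max_layers):
            adjacent = np.zeros_like(previous)
            for axis in range(3):             # 6-neighbor adjacency via shifts
                adjacent |= np.roll(previous, 1, axis=axis)
                adjacent |= np.roll(previous, -1, axis=axis)
            new_layer = adjacent & (labels == UNKNOWN)
            if not new_layer.any():
                break
            labels[new_layer] = GRAY
            previous = new_layer
        return labels

    # Toy volume: a white matter slab, unclassified tissue above it, CSF beyond.
    vol = np.full((10, 10, 10), UNKNOWN, dtype=np.int8)
    vol[:3] = WHITE
    vol[8:] = CSF
    grown = grow_gray_layers(vol, max_layers=4)
    print("gray voxels per slab:", [(grown[z] == GRAY).sum() for z in range(10)])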
Figure 3 shows an example of gray matter classification. A simple case, in which only two layers are grown from the boundary of the white matter component, is illustrated. Each layer of gray matter grows upon the previous layer (or from the boundary of the white matter component for the first layer) in the same fashion. Adding a layer of gray matter voxels is carried out in two steps: first, new gray matter voxels are identified and labeled; second, connectivity of the new gray matter voxels is determined. For the first layer, "unknown" voxels that are 6-neighbor adjacent to some white matter boundary voxel are classified as gray. Each new voxel is classified as gray only if all its parents are connected. Connectivity in this case is determined from the 26-neighbor adjacency of the white matter voxel parents. For layer N + 1, unclassified voxels that are 6-neighbor adjacent to gray matter in layer N are classified as gray matter voxels belonging to layer N + 1. The connected gray matter voxels in layer N are known as the parents of the voxels in layer N + 1. Again, a new voxel is classified as gray only if all its parents are connected. The reason for requiring connectivity among the parents is that a voxel must not be assigned a gray classification if doing so results in a contention for connectivity among unconnected voxels in the previous layer. For example, a voxel falling between two gray matter voxels on opposite sides of a sulcus will not be classified as gray matter. During the second step, connectivity of the newly classified gray matter voxels is computed. Connectivity of gray matter voxels is divided into two categories: interlayer and intralayer.
FIGURE 3 From left to right: white matter classification; two layers of gray matter classification grown out from white matter classification; schematic showing of two layers of gray matter grown out from the white matter boundary. The connectivity of the white matter boundary and the first layer of gray matter is represented by the links between adjacent filled circles.
Gray matter voxels between different layers are considered connected if they are 6-neighbor adjacent. This occurs precisely when one gray matter voxel is a parent of the other. Ascertaining the connectivity of gray matter voxels within the same layer is a little more complicated, as it requires examining the connectivity of the voxels' parents. Figures 4a–c show the parents of pairs of gray matter voxels in all possible configurations. Two gray matter voxels within the same layer are considered connected if they (1) are 26-neighbor adjacent, (2) have a common parent or have parents that are connected (as computed in the previous connectivity step), and (3) are not in one of the configurations in Figs. 4d and 4e that give rise to intersections. The connectivity so determined cannot result in intersecting regions. Figures 4d and e show the different configurations of voxels that result in intersecting regions. In Fig. 4d, for example, if the shaded cubes (gray matter) were labeled as connected, then the digital region formed by these two cubes would intersect the digital region formed by the two other cubes (white matter parents or gray matter parents from the previous layer). Figure 4e shows all the other remaining cases. For the first layer, since connectivity of white matter voxels is determined using 26-neighbor adjacency, two 26-neighbor adjacent gray matter voxels are considered connected if they either share a common white matter parent or have white matter parents that are 26-neighbor adjacent.
FIGURE 4 Possible configurations while computing gray matter connectivity. (a–c) Unshaded cubes represent possible white matter voxels from which these gray matter voxels could have been grown and (d, e) shaded cubes represent gray matter voxels.
Despite the complexity of the connectivity algorithm, it can be efficiently implemented with tables.
2.2   Results and Discussion
Figure 5 shows comparisons between gray matter segmentation results derived manually and those computed using the current method without any manual editing (see Ref. 36 for additional examples). The next figure shows a 3-D reconstruction, followed by two examples of unfolding, curvature-based deformations, shown in Figs. 6 and 7 (see Ref. 3 for details on this method). The current method has difficulties when the white matter is extremely thin, roughly one voxel thick (this is uncommon). This is a common problem in MRI segmentation. In this case, the anisotropic smoothing algorithm tends to remove the narrow regions of white matter in favor of larger regions of CSF. The results of the segmentation stage were used to visualize fMRI measurements, overlaid on the flattened representations, obtaining qualitatively very similar results compared to those obtained with manual segmentation (see white.stanford.edu for a number of publications showing this). This segmentation work can also be of great significance for activity detection, as we discuss in the next section.
3   ACTIVITY DETECTION IN BLOCK-DESIGN fMRI
3.1   Introduction
Basically, two types of experiments are performed in fMRI: block-design and event-related. In block-design fMRI studies, the most popular ones, periods of a control state are interleaved with stimulation periods of task performance and/or sensory stimulation, and functional maps are generated by the statistical comparison of images acquired in these periods. During these periods, typically a minute or more, the subject executes the task many times. Thus studies based on the block-design paradigm can be viewed as measuring a steady-state response, and the activity map that emerges from such studies provides an averaged view of the areas engaged in the specific task. Consequently, important information regarding the temporal sequence of the neuronal activity in different parts of the brain, and cognitive effects such as learning or habituation that evolve over repeated executions, is lost. This opens up the avenue for probing neuronal activity with an added dimension, i.e., time, via event-related fMRI. In these experiments, the evolution of the fMRI signal during a single sensory or cognitive event is monitored. Event-related fMRI also permits unconventional paradigm designs such as those used for studying infrequent events and mistakes in behavior.
FIGURE 5 The first row shows manual gray matter segmentation results; the second row shows the automatically computed gray matter segmentation (sagittal slices of the occipital lobe).
One of the key issues in fMRI is to detect the areas (pixels/voxels) that are active while the subject executes the requested task, that is, pixels that show changes in their intensity due to stimulus presentation. This is not just the most crucial step in fMRI analysis but also an extremely challenging one, because the fMR signal is very weak and noisy (noise comes from thermal noise, cardiac and respiratory pulsation, subject motion, etc.). In this part of the chapter we present a novel technique for detecting activity regions in block-design fMRI (see footnote 4).
The classical method for identifying active regions in block-design fMRI is based on some thresholding of a statistical map, such as a map of t-statistics [32] or correlation coefficients, calculated on a pixel-by-pixel basis (see for example Ref. 15 and references therein). This approach looks at the fMRI time series at each pixel independently. With it, the classification of a pixel as active or inactive depends on that pixel alone, and the classification is not sensitive to any potential spatial pattern of activity. Empirical evidence suggests that fMRI data exhibit spatial correlation, making it natural to use neighboring information in the classification process.
4 The investigation of similar techniques for event-related fMRI is the subject of current research with Prof. X. Hu and colleagues; see Sec. 4.
FIGURE 6 3-D reconstruction of the human white matter from MRI, based on posterior diffusion techniques.
FIGURE 7 Unfolding the cortex, and tracking the marked regions, with a curvature-based flow and a 3-D morphing flow, respectively.
If a pixel is classified as active but is surrounded by pixels classified as inactive, it is likely that this pixel has been wrongly classified due to noise. It is the goal of this part of the chapter to introduce this spatial correlation into the classification process. More specifically, we will use anisotropic diffusion of probability maps to do so. The introduction of spatial correlation in the classification of fMRI is now an active research topic, and a number of techniques are currently being proposed in the literature; see Refs. 10, 26, and 29 and references therein for examples of recent results in this direction. Additional techniques are being studied by Prof. Hu at the University of Minnesota, and by my group (including exploiting the 3-D spatial connectivity produced by the segmentation algorithm described above).
3.2   Additional Background on Anisotropic Diffusion
A number of algorithms were proposed for anisotropic diffusion following the seminal work in Ref. 30. In contrast with the equation used for the segmentation work, for signal detection we follow Ref. 4, where, using a robust framework, improvements on the original Perona–Malik algorithm have been proposed. We will briefly describe this technique. See the mentioned papers for additional details and references. Let I(x, y, t): Ω ⊂ ℝ² × [0, ∞) → ℝ be the deforming image (representing probabilities in our case; see below), with the original image as the initial condition I(x, y, 0). The image enhancement flow is obtained from the gradient descent of

∫_Ω ρ(‖∇I‖) dΩ

which is given by

∂I/∂t = div( ρ′(‖∇I‖) ∇I / ‖∇I‖ )
where ρ is, for example, the Lorentzian or Tukey's biweight robust function. As stated before, the concept on which this popular algorithm is based is that information (gray values, probabilities, etc.) is diffused and propagated inside "uniform" regions, while the diffusion is stopped across discontinuities due to the function ρ′(‖∇I‖)/‖∇I‖ (when this is constant, the classical heat flow or isotropic diffusion equation is obtained). Therefore the image is enhanced while edges are preserved. Next, we show how to use this flow to improve the active/nonactive pixel classification in fMRI.
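A minimal 2-D sketch of this flow is shown below. It uses the Perona–Malik edge-stopping function g(s) = 1/(1 + (s/κ)²), which corresponds, up to constants, to the Lorentzian choice of ρ; the robust Tukey variant of Ref. 4 only changes g. Parameter names and values are illustrative, not the chapter's settings.

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=1.0, step=0.2):
    """Edge-preserving smoothing of a 2-D map (Perona-Malik type flow).

    g(s) = 1 / (1 + (s/kappa)^2) plays the role of rho'(s)/s: the flux is large
    inside uniform regions and vanishes across strong discontinuities.
    """
    u = np.asarray(img, dtype=float).copy()
    for _ in range(n_iter):
        # nearest-neighbor differences in the four grid directions
        # (np.roll wraps at the border, which is adequate for a sketch)
        dN = np.roll(u, -1, axis=0) - u
        dS = np.roll(u, 1, axis=0) - u
        dE = np.roll(u, -1, axis=1) - u
        dW = np.roll(u, 1, axis=1) - u
        gN = 1.0 / (1.0 + (dN / kappa) ** 2)
        gS = 1.0 / (1.0 + (dS / kappa) ** 2)
        gE = 1.0 / (1.0 + (dE / kappa) ** 2)
        gW = 1.0 / (1.0 + (dW / kappa) ** 2)
        u += step * (gN * dN + gS * dS + gE * dE + gW * dW)
    return u
```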
3.3   The Proposed Detection Algorithms
In Ref. 36, and as described above, we introduced the use of anisotropic diffusion in probability spaces. In that case, it was used to segment MRI into gray matter, white matter, and CSF regions. The advantages of this technique over other classical approaches, as well as the importance of diffusing the probability space instead of the image space, were reported as well. Next, we extend these ideas for the detection problem at hand. Two different approaches illustrating these ideas will be discussed.
3.3.1   Anisotropic Diffusion of t-Maps
As pointed out in the introduction, the classical method for identifying active regions in block-design fMRI is based on some thresholding of a statistical map, such as a map of t-statistics or correlation coefficients, calculated on a pixel-by-pixel basis. This does not exploit spatial correlation and is apt to produce false detections; see Fig. 8 (left). Our first approach to introducing this spatial correlation is to process the t-map with anisotropic diffusion before performing the thresholding operation. This way, the probability values are diffused and information is spatially propagated, while the edge information (strong jumps in the probability given by the t-map) is preserved. We therefore propose to diffuse the probability data in the t-map, with a controlled diffusion, before applying the thresholding operation. Examples are given in Figs. 8 (right) and 9.
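The sketch below illustrates this first approach under simple assumptions: a block design coded as two lists of frame indices, a two-sample t-statistic per pixel, and the anisotropic_diffusion sketch above as the smoother. Names, the exact statistic, and the threshold are placeholders rather than the experimental settings used in the chapter.

```python
import numpy as np

def t_map(frames, task_idx, rest_idx):
    """Two-sample t-statistic per pixel between task and control frames.

    frames: array of shape (T, H, W); task_idx / rest_idx list the frame indices
    of the two conditions (how the block design is coded is an assumption here).
    """
    a = frames[task_idx].astype(float)
    r = frames[rest_idx].astype(float)
    num = a.mean(axis=0) - r.mean(axis=0)
    den = np.sqrt(a.var(axis=0, ddof=1) / len(task_idx) +
                  r.var(axis=0, ddof=1) / len(rest_idx))
    return num / (den + 1e-12)

# Section 3.3.1: diffuse the t-map before thresholding it.
# t = t_map(frames, task_idx, rest_idx)
# active = anisotropic_diffusion(t, n_iter=20, kappa=1.0) > threshold
```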
3.3.2   Anisotropic Diffusion of Posterior Probabilities
FIGURE 8   Example of signal detection for simulated data.

FIGURE 9   Examples of signal detection for real data.

In the second proposed approach we apply anisotropic diffusion to posterior probabilities. In this case, we consider two classes of pixels, activity and nonactivity, and compute the posterior probability per pixel for each class,
using the Bayes rule (as we did for segmentation, where we had three classes). Classically, following the posterior computation, a pixel is classified using the maximum a posteriori (MAP) approach. As in the algorithm described above, and following Ref. 36, we apply anisotropic diffusion to each one of the two posterior maps before the maximum a posteriori decision is made. The posterior diffusion is applied in such a way that the probability functions per pixel add to one [36]. In order to compute the posterior probability maps using the Bayes rule, we need to compute the likelihood and prior probabilities, since

Posterior = C · Likelihood · Prior

where C is a normalizing constant. In our experiments, shown in Fig. 9, we have learned these probability distributions directly from the data. We define an n × n moving window (n = 4 or n = 8), and in each window we empirically compute the likelihood and prior probabilities for each one of the two classes. In order to compute these probability functions from the data, we need to decide which pixels will be used for the statistics (normalized histograms) of the activity class and which ones will be used for the statistics of the nonactivity class. We use the classical t-map as a first estimate of this classification. That is, if the classical t-map at a certain pixel is greater than a given threshold, the whole time series corresponding to that pixel is used to compute the likelihood and prior functions for the activity class. Otherwise, that pixel is used for computing the probability functions corresponding to the nonactivity class. The t-map is diffused as in the previous section.
Recapping, the second proposed algorithm is based on first using the whole fMR data and the (diffused) t-map to compute empirically the likelihood and prior probability functions. From these functions, posterior probabilities for the two classes, activity and nonactivity, are computed using the Bayes rule. These posterior probabilities are anisotropically diffused, and then the classification is performed via MAP. This way we once again exploit the spatial correlation, while preserving sharp discontinuities in the probability domain.
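Assuming the two posterior maps have already been estimated from the windowed likelihoods and priors, the second approach can be sketched as follows; the diffusion routine is passed in (for example, the anisotropic_diffusion sketch above), and the per-pixel renormalization is one simple way to keep the two probabilities summing to one. The function name and parameters are illustrative.

```python
import numpy as np

def diffuse_and_classify(post_active, post_inactive, diffuse, n_iter=20, kappa=0.05):
    """Diffuse the two posterior maps, renormalize them so that the per-pixel
    probabilities again add to one, and take the maximum a posteriori label."""
    pa = np.clip(diffuse(post_active, n_iter=n_iter, kappa=kappa), 0.0, None)
    pi = np.clip(diffuse(post_inactive, n_iter=n_iter, kappa=kappa), 0.0, None)
    z = pa + pi + 1e-12            # per-pixel normalization
    pa, pi = pa / z, pi / z
    return (pa > pi).astype(np.uint8)   # 1 = activity, 0 = nonactivity
```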
3.3.3   Examples
We have tested the proposed algorithm on both simulated and real data. Figure 8 shows results for simulated data, while Fig. 9 shows results for real data. In Fig. 8, we took real MR data from the control period of the fMRI experiment in Refs. 21 and 22 (that is, we consider MR data from the period when the subject is resting). We then simulated activity by adding a Gaussian signal, with μ = 5 and σ² = 3, to a 5 × 5 pixel square region of this data (the region was randomly selected). We used 20 frames of MR data and
added the Gaussian signal in 10 of them. On the left we see the results of a classical t-test, while on the right we see the results of the algorithm described in Sec. 3.3. Note that the classical t-test fails to detect the square completely and has a number of false positives, while our algorithm performs an almost perfect detection (white pixels correspond to the activity class and gray pixels to the nonactivity one). The two columns in Fig. 9 show results of our algorithm for real data. In the first column we used data from a motor-related fMRI experiment [21,22], while the second column uses data from a speech-related experiment. Both columns first show one time slice of the fMR data (a slice from the period of activity), followed by the results of the classical t-test, the diffused t-test approach, and the diffused posterior approach, respectively. Once again we observe, this time for real data, that the active regions detected by our algorithm are much more regular, as expected in fMRI.
3.4   Discussion
In this part of the chapter we have introduced the use of anisotropic diffusion for activity detection in fMRI. Applying anisotropic diffusion to the t-maps and posterior images reduces the noise, favoring clustered pixels over scattered pixels, hence improving the classification results. A number of future research directions are suggested by this work. For block-design fMRI, we still need to improve the likelihood functions and prior estimates used by the posterior diffusion technique. This can be done by using a number of models for the fMR signal proposed in the research literature, as well as by incorporating segmentation, e.g., as in Ref. 36, into the process of learning these functions from the data. Since activity happens only in the gray matter, having this segmentation may then improve the accuracy of these probability functions. We have considered only two classes, activity and nonactivity. It has been shown in the literature that active pixels might have different characteristics. Using additional classes will then immediately improve the results presented here, since by using only two classes we are mixing pixels with different statistics while computing the likelihood and prior distributions. We are also extending these concepts to a full 3-D connectivity approach, using the output of the segmentation previously described. Lastly, and as mentioned above, event-related fMRI provides even tougher image processing challenges, since only one or a few time slices are available. We are currently also investigating the use of anisotropic diffusion techniques in this paradigm (in collaboration with Prof. X. Hu from CMRR).
4   CONCLUDING REMARKS
In this chapter we exemplified the use of geometric techniques to address fundamental problems in fMRI. The specific problems discussed are MRI segmentation and activity detection. Other geometric techniques, based as well on partial differential equations, can be used to address additional problems like brain flattening [1] and image warping [39].

ACKNOWLEDGMENTS

The segmentation work was originally developed with Dr. Patrick Teo and Prof. Brian Wandell (Stanford University), while Patrick and I were at Hewlett-Packard Laboratories, Palo Alto, California. Since then, it has been tremendously improved and supported by Brian and his group. Brian introduced me to the world of fMRI. The cortex unfolding work is carried out jointly with Marcelo Bertalmio and Gregory Randall. Prof. X. Hu, Prof. K. Ugurbil, and J. Strupp from the Center for Magnetic Resonance Research at the University of Minnesota provided very helpful conversations and some of the data used in this paper. Prof. X. Hu provided me with an internal report on fMRI that helped us to learn about the subject. This work was partially supported by a grant from the Office of Naval Research, ONR N00014-97-1-0509, by the Office of Naval Research Young Investigator Award, by the Presidential Early Career Awards for Scientists and Engineers (PECASE), by the National Science Foundation CAREER Award, and by the National Science Foundation Learning and Intelligent Systems Program (LIS).

REFERENCES

1. 2.
3. 4. 5. 6.
S Angenent, S Haker, A Tannenbaum, R Kikinis. Laplace–Beltrami operator and brain flattening. IEEE Trans. Medical Imaging 18:686–699, 1999. T Bartlett, M Vannier, D McKeel Jr, M Gado, C Hildebolt, R Walkup. Interactive segmentation of cerebral gray matter, white matter, and CSF: photographic and MR images. Computerized Medical Imaging and Graphics 18(6): 449–460, 1994. M Bertalmio, G Sapiro, G Randall. Region tracking on level-sets methods. IEEE Trans. Medical Imaging 18:448–451, 1999. M Black, G Sapiro, D Marimont, D Heeger. Robust anisotropic diffusion. IEEE Trans. Image Processing 7(3):421–432, 1998. C Bouman, M Shapiro. A multiscale random field model for Bayesian image segmentation. IEEE-IP 3(2):162–177, 1994. V Caselles, R Kimmel, G Sapiro, C Sbert. Minimal surfaces based object segmentation. IEEE-PAMI 19(4):394–398, 1997.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Geometric Approaches for Segmentation 7.
8.
9. 10.
11.
12. 13. 14. 15.
16. 17. 18.
19. 20.
21.
22.
23. 24.
337
H Cline, W Lorensen, R Kikinis, F Jolesz. Three-dimensional segmentation of MR images of the head using probability and connectivity. Journal of Computer Assisted Tomography 14(6):1037–1045, 1990. A Dale, M Sereno. Improved localization of cortical activity by combining EEG and MEG with MRI cortical surface reconstruction: a linear approach. Journal of Cognitive Neuroscience 5(2):162–176, 1993. C Davatzikos, RN Bryan. Using a deformable surface model to obtain a shape representation of the cortex. IEEE-MI 15:785–795, 1996. X Descombes, F Kruggel, DY von Cramon. Spatio-temporal fMRI analysis using Markov random fields. IEEE Trans. Medical Imaging 17(06):1028–1039, 1998. H Drury, D Van Essen, C Anderson, C Lee, T Coogan, J Lewis. Computerized mappings of the cerebral cortex: a multiresolution flattening method and a surface-based coordinate system. Journal of Cognitive Neuroscience 8(1):1– 28, 1996. S Engel, G Glover, B Wandell. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cerebral Cortex 7:181–192, 1997. L Fletcher, J Barsotti, J Hornak. A multispectral analysis of brain tissues. Magnetic Resonance in Medicine 29(5):623–630, 1993. J Foley, A van Dam, S Feiner, J Hughes. Computer Graphics: Principles and Practice. Reading, MA, Addison-Wesley, 1990. K Friston, A Holmes, K Worseley, J Poline, C Fritch, R Frackowiak. Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2:189–210, 1995. J Gomez, O Faugeras. Reconciling distance functions and level-sets. Proc. Scale Space Workshop, Corfu, Greece, September 1999. G Hermosillo, O Faugeras, J Gomes. Cortex unfolding using level set methods. INRIA Sophia Antipolis Technical Report, April 1999. B Johnston, M Atkins, K Booth. Three-dimensional partial volume segmentation of multispectral magnetic resonance images using stochastic relaxation. Proc SPIE 2180:268–279, 1994. M Joliot, B Mazoyer. Three-dimensional segmentation and interpolation of magnetic resonance brain images. IEEE-MI 12(2):269–277, 1993. S Kichenassamy, A Kumar, P Olver, A Tannenbaum, A Yezzi. Gradient flows and geometric active contour models. Proc. ICCV. Cambridge, MA, 1995, pp. 810–815. SG Kim, J Ashe, K Hendrich, JM Ellermann, H Merkle, K Ugurbil, AP Georgopoulos. Functional magnetic resonance imaging of motor cortex: hemispheric asymmetry and handedness. Science 261:615–617, 1993. SG Kim, JE Jennings, JP Strupp, P Andersen, K Ugurbil. Functional MRI of human motor cortices during overt and imaged finger movements. Int. J. Imaging Sys. Tech. 6:271–279, 1995. T Kong, A Rosenfeld. Digital topology: introduction and survey. CVGIP 48: 357–393, 1989. T Lee, R Kashyap. Building skeleton models via 3-D medial surface/axis thinning algorithms. CVGIP 56(6):462–478, 1994.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
338 25. 26.
27.
28.
29.
30. 31. 32. 33. 34. 35. 36.
37. 38. 39. 40.
41.
42. 43.
Sapiro R Malladi, JA Sethian, BC Vemuri. Shape modeling with front propagation: a level set approach. IEEE-PAMI 17:158–175, 1995. PP Mitra, S Ogawa, X Hu, K Ugurbil. The nature of spatiotemporal changes in cerebral hemodynamics as manifested in functional magnetic resonance imaging. MRM 37:511–518, 1997. S Ogawa, D Tank, R Menon, J Ellermann, S Kim, H Merkle, K Ugurbil. Intrinsic signal changes accompanying sensory simulation: functional brain mapping with magnetic resonance imaging. Proc. Nat. Acad. Sci. 89:591–595, 1992. J-F Mangin, V Frouin, I Bloch, J Regis, J Lopez-Krahe. Automatic construction of an attributed relational graph representing the cortex topography using homotopic transformations. SPIE Math. Methods in Medical Imaging. San Diego, 1994, pp. 110–121. SC Ngan, X Hu. Analysis of functional magnetic resonance imaging data using self-organizing mapping with spatial connectivity. Magn. Reson. Med. 41:939– 946, 1999. P Perona, J Malik. Scale-space and edge detection using anisotropic diffusion. IEEE-PAMI 12:629–639, 1990. P Perona, T Shiota, J Malik. Anisotropic diffusion. In B Romeny, ed. Geometry-Driven Diffusion in Computer Vision. Kluwer, 1994, pp. 73–92. WH Press et al. Numerical Recipes in C: The Art of Scientific Computing. New York, Cambridge University Press, 1994. A Rosenfeld, RA Hummel, SW Zucker. Scene labeling by relaxation operations. IEEE Trans. on Systems, Man, and Cybernetics 6(6):420–453, 1976. PJ Rousseeuw, AM Leroy. Robust Regression and Outlier Detection. New York, John Wiley, 1987. G Sapiro, U Ramamurthy. On geometry and fMRI visualization. in preparation. P Teo, G Sapiro, B Wandell. Creating connected representations of cortical gray matter for functional MRI visualization. IEEE Trans. Medical Imaging 16(06):852–863, 1997. PC Teo, G Sapiro, B Wandell. Anisotropic smoothing of posterior probabilities. Proc. ICIP, 1997. D Terzopoulos, A Witkin, M Kass. Constraints on deformable models: recovering 3d shape and nonrigid motions. Artificial Intelligence 36:91–123, 1988. AW Toga. Brain Warping, New York, Academic Press, 1998. D Vandermeulen, R Verbeeck, L Berben, D Delaere, P Suetens, G Marchal. Continuous voxel classification by stochastic relaxation: theory and application to mr imaging and mr angiography. Image and Vision Computing 12(9):559– 572, 1994. K Vincken, A Koster, M Viergever. Probabilistic hyperstack segmentation of MR brain data. In Proc. First Int’l Conf. on Computer Vision, Virtual Reality and Robotics in Medicine. Nice, France, 1995, pp. 351–357. B Wandell, S Chial, B Backus. Cortical visualization. Journal of Cognitive Neuroscience 12(5):739–752, 2000. B Wandell, S Engel, H Hel-Or. Creating images of the flattened cortical sheet. Invest. Opth. Vis. Sci. 36(S612), 1996.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Geometric Approaches for Segmentation 44. 45.
339
WM Wells, W Grimson, R Kikinis, F Jolesz. Adaptive segmentation of MRI data. IEEE-MI 15(4):429–442, 1996. X Zeng, LH Staib, RT Schultz, JS Duncan. Segmentation and measurement of the cortex from 3D MR Images. Proceedings Medical Image Computing and Computer-Assisted Intervention, MICCAI ’98. Springer-Verlag, Cambridge, MA, 1998, pp. 519–530.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
12
MR Image Segmentation and Analysis Based on Neural Networks

Javad Alirezaie
Ryerson University, Toronto, Ontario, Canada
M. E. Jernigan University of Waterloo, Waterloo, Ontario, Canada
1   INTRODUCTION
MR imaging has a unique advantage over other modalities in that it can provide multispectral images of tissues with a variety of contrasts based on the three magnetic resonance parameters proton density (PD), T1, and T2 [1]. Tissue classification of normal and pathological tissue types using multispectral MR images has great potential in clinical practice. Possible areas of application include quantitative measurements of brain components (the volume, shape, location), radiosurgery planning by delineating the region to be treated, follow-up of brain abnormalities before and after intervention, and quantitative studies of neurodegenerative diseases such as Alzheimer’s disease and multiple sclerosis [2]. Segmentation of images obtained from magnetic resonance (MR) imaging techniques is an important step in the analysis of MR images of the human brain. In the study of Alzheimer’s disease, it is the volumes of white matter and gray matter that are of interest. If manual delineation is considered, the analysis would be time-consuming and tedious and would require a qualified observer [3]. Detection of white and gray matter structures in a 341
large number of MR images is impractical and will only become possible if reliable and robust automated methods are developed to assist human experts. Automatic segmentation not only requires less time from human experts but also can provide less variable results. Multispectral analysis has been the most successful approach to the automatic recognition of tissue types in the human body based on MR imaging [4]. During the last few years, several studies on the automatic recognition of normal tissues in the brain and its surrounding structures have been published [4–34]. Consistently, these studies have reported results that are in visual agreement with expert judgment, but a number of factors that reduce the robustness and the reliability of the classifiers have been identified.
In this chapter, the potential of artificial neural networks (ANN) in the classification and segmentation of magnetic resonance (MR) images of the brain is explored. Using multispectral MR images, we demonstrate an improved differentiation between normal tissue types of the brain and surrounding structures using pixel intensity values and spatial information of neighboring pixels. We give the essential steps of the training technique and show how our neural network approaches can be used for classification and segmentation of different tissues. We present results obtained from our method and compare them with other traditional and neural network approaches.
2   MR BRAIN IMAGES
More than 40 complete studies were obtained from the MRI department at McMaster University Medical Center for the purpose of this study. All information identifying the subjects was stripped before we had access to the data. These studies were examined using a GE 1.5T MR scanner. The axial field of view was either 20 cm or 22 cm for the T1-weighted sequences, and 22 cm for the PD- and T2-weighted sequences. Images were reconstructed onto a 256 × 256 matrix, so that the size of a pixel was either 0.78 × 0.78 mm² or 0.86 × 0.86 mm². When the T1-weighted images were acquired with a field of view of 20 cm, they were resampled to ensure accurate registration with the other images. Each slice was 5 mm thick. As an example, one of the representative images is shown in Fig. 1. Three pulse sequences were used: TR/TE = 600/16 ms for the T1-weighted images; TR/TE = 2700/30 ms for the PD-weighted images; and TR/TE = 2700/90 ms for the T2-weighted images (Figs. 1a, 1b, 1c).
3   SUPERVISED LEARNING APPROACHES
In this section we explore the potential of learning vector quantization (LVQ) ANN for the multispectral supervised classification of MR images of the brain.
FIGURE 1   (a) A T1-weighted image. (b) A PD-weighted image. (c) A T2-weighted image.
The LVQ was originally proposed by Kohonen [35] and has attracted considerable attention in the recent past [36]. The LVQ is modified here for better and more accurate classification. We give the essential steps of the training technique and show how the LVQ ANN can be used for classification and segmentation of different tissues. We present results obtained from our method and compare it with back-propagation artificial neural networks.
3.1   LVQ Classifier
Learning vector quantization ANN is a classification network that consists of two layers. The two-layer ANN classifies patterns by using an optimal set of reference vectors or codewords. A codeword is a set of connection
weights from input to output nodes (Fig. 2). The set of vectors w_1, w_2, . . . , w_k is called a codebook, in which each vector w_i is a codeword for vector quantization. If several codewords are assigned to each class, and each is labeled with the corresponding class symbol, the class region in the x space (input) is defined by simple nearest-neighbor comparison of x with the codewords w_i; the label of the closest w_i defines the classification of x. To define the optimal placement of the w_i in an iterative learning process, initial values must be set. The next phase, then, is to determine the labels of the codewords, by presenting a number of input vectors with known classification and assigning the codewords to different classes by majority voting, according to the frequency with which each w_i is closest to the calibration vectors of a particular class. The classification accuracy improves if the w_i are updated according to the algorithm described below [37,38]. The idea is to pull codewords away from the decision surface to mark (demarcate) the class borders more accurately. In the following algorithm we assume that w_i is the nearest codeword to the input vector x [Eq. (1)] in the Euclidean metric; this, then, also defines the classification of x:

‖x − w_i‖ = min_{j=1,...,k} ‖x − w_j‖     (1)
where the Euclidean distance between any two vectors X and Y is defined as
FIGURE 2   Topology of LVQ ANN.
‖X − Y‖ = [ Σ_{i=1}^{n} (x_i − y_i)² ]^{1/2}
The following algorithm shows how the codewords are updated:

w_i(t + 1) = w_i(t) + α(t)[x − w_i(t)]     (2)

if x is classified correctly (the label agrees with the codeword assignment);

w_i(t + 1) = w_i(t) − α(t)[x − w_i(t)]     (3)

if the classification of x is incorrect (the label does not agree with the codeword assignment); and

w_j(t + 1) = w_j(t),     j ≠ i     (4)

(the other codewords are not modified). Here α(t) is a learning rate such that 0 < α(t) < 1, decreasing monotonically in time t. We chose

α(t) = 0.2 (1 − t/10000)

After the desired number of iterations, the codebook typically converges and the training is terminated.
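A compact sketch of this LVQ1 training loop and the nearest-codeword classifier is given below; the codebook initialization, the K-nn screening, and the codeword balancing described later in the chapter are omitted, and the array names are illustrative.

```python
import numpy as np

def train_lvq1(codebook, code_labels, X, y, n_iter=10000):
    """LVQ1 update of Eqs. (1)-(4): pull the winning codeword toward a correctly
    classified training vector and push it away otherwise, with the decaying
    learning rate alpha(t) = 0.2 (1 - t/10000) used in the text."""
    W = np.asarray(codebook, dtype=float).copy()
    rng = np.random.default_rng(0)
    for t in range(n_iter):
        alpha = 0.2 * (1.0 - t / 10000.0)
        k = rng.integers(len(X))
        x, c = X[k], y[k]
        i = np.argmin(((W - x) ** 2).sum(axis=1))     # nearest codeword, Eq. (1)
        if code_labels[i] == c:
            W[i] += alpha * (x - W[i])                # Eq. (2)
        else:
            W[i] -= alpha * (x - W[i])                # Eq. (3); others untouched, Eq. (4)
    return W

def classify_lvq(W, code_labels, X):
    """Label each input with the class of its nearest codeword."""
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(code_labels)[np.argmin(d, axis=1)]
```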
3.2   Back-Propagation ANN Classifier
The back-propagation ANN used in this study consists of one input layer, two hidden layers, and one output layer. Figure 3 shows the topology of the network. The numbers of input and output nodes are equal to 9 and 7, respectively. The number of nodes in each hidden layer is 18.
FIGURE 3   Topology of the back-propagation ANN.
The training of the network consists of finding a map between a set of input values and a set of output values. This mapping is performed by adjusting the values of the weights w_ij using a learning algorithm. A popular learning algorithm, the generalized delta rule [39], briefly introduced here, is used. After the weights are adjusted on the training set, their values are fixed and the ANN is used to classify input vectors. Assume the input vector x_i = (x_i1, x_i2, . . . , x_iN) is applied to the input units, and o_i is the observed output vector. The generalized delta rule minimizes an error term E, which is defined as

E_i = (1/2) Σ_{j=1}^{N} (x_ij − o_ij)²     (5)
The error term is a function of the difference between the output and input vectors. This difference can be minimized by adjusting the weights w_jk in the network. The generalized delta rule implements a gradient descent in E to minimize the error [39]. The weight update is

Δ_i w_jk = β δ_ij o_ik     (6)

with

δ_ij = f′(net_ij) Σ_p δ_ip w_pj          for a hidden node
δ_ij = (x_ij − o_ij) f′(net_ij)          for an output node     (7)

in which net_ij = Σ_k w_jk o_ik + θ_j is the total input to node j, including a bias term θ_j; the parameter β is the learning rate, and layer p precedes the output layer. The output of node j due to input k is thus o_ij = f_j(net_ij), with f the activation function. If f(z) = 1/(1 + exp(−z)), Eq. (7) can be rewritten as

δ_ij = o_ij(1 − o_ij) Σ_p δ_ip w_pj      for a hidden node
δ_ij = (x_ij − o_ij) o_ij(1 − o_ij)      for an output node     (8)
Finally, by adding a momentum term to the learning rate, we obtain for the weights

Δ_i w_jk(t) = β δ_ij o_ik + η Δ_i w_jk(t − 1)     (9)

where η is the momentum rate at each iteration; the weights are thus modified as follows:

w_jk(t + 1) = w_jk(t) + Δ_i w_jk(t)     (10)
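The update rules above can be sketched as follows for a single hidden layer with sigmoid activations; the chapter's network uses two hidden layers and bias terms, which follow the same pattern, and the parameter values here are placeholders rather than the experimental settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=18, beta=0.1, eta=0.5, n_iter=100000):
    """Generalized delta rule with momentum (Eqs. 6-10), single hidden layer."""
    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
    W2 = rng.normal(scale=0.1, size=(n_hidden, T.shape[1]))
    dW1 = np.zeros_like(W1)
    dW2 = np.zeros_like(W2)
    for _ in range(n_iter):
        k = rng.integers(len(X))
        x, target = X[k:k + 1], T[k:k + 1]
        h = sigmoid(x @ W1)                            # hidden activations
        o = sigmoid(h @ W2)                            # network output
        delta_o = (target - o) * o * (1.0 - o)         # output-node delta, Eq. (8)
        delta_h = (delta_o @ W2.T) * h * (1.0 - h)     # hidden-node delta, Eq. (8)
        dW2 = beta * h.T @ delta_o + eta * dW2         # momentum update, Eq. (9)
        dW1 = beta * x.T @ delta_h + eta * dW1
        W2 += dW2                                      # Eq. (10)
        W1 += dW1
    return W1, W2
```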
4   UNSUPERVISED APPROACHES
It has been shown by several researchers [31–36] that the Self-Organizing Feature Map (SOFM) system of Kohonen is a strong candidate for
continuous-valued unsupervised pattern recognition. Kohonen's self-organizing feature maps use a linear update rule for the weights, which makes this model computationally attractive. In this section we present an unsupervised clustering technique for multispectral segmentation of MR images of the human brain. Our scheme utilizes the SOFM ANN for feature mapping and generates a set of codebook vectors. By extending the network with an additional layer, the map is classified and each tissue class is labeled. To minimize clustering artifacts, an algorithm has been developed for isolating the cerebrum prior to segmentation. The cerebrum is extracted by stripping away the skull pixels from the T2-weighted image. The c-means algorithm was used for unsupervised classification of MR brain images for comparison purposes. Since the algorithm is well known, it will not be described here; only the results will be presented.
4.1   Self-Organizing Feature Map (SOFM)
Kohonen [38] has presented an algorithm that produces what he calls self-organizing feature maps, similar to those that occur in the brain. The algorithm will map a set of input vectors onto output vectors according to some characteristic feature of the input vectors. A brief discussion of this ordering behavior follows. More details can be found in the monograph by Kohonen [40]. The basic model for self-organizing feature mapping consists of two layers. The first layer contains the input nodes and the second contains the output nodes. The output nodes are arranged in a two-dimensional grid as shown in Fig. 4. Every input is connected extensively to every output node via adjustable weights. Let X = [x_0, x_1, . . . , x_{N−1}]^T be a set of N inputs in ℝ^m, such that each x_i has m dimensions (or features). Let m be the number of input nodes and P be the number of output nodes. Let W_j = [w_0j, w_1j, . . . , w_{(m−1)j}]^T denote the weights or reference vectors; W_j is the vector containing all of the weights from the m input nodes to output node j. After enough input vectors have been fed to the system, the weights will specify clusters or vector centers that sample the input space, so that the point density function of the vector centers tends to approximate the probability density function of the input vectors [38]. Updating the weights for any given input in this model is done only for output units in a localized neighborhood. For each node j, there are NE neighbor nodes that depend on the topological neighborhood selected. A topological neighborhood consists of a rectangular or a hexagonal array of points around the selected node [40]. Figures 5a and b show simple forms of neighborhood sets around node j.
FIGURE 4   Two-dimensional array of output nodes used to form feature maps.
The neighborhood is centered on the output node whose distance d_ij is minimum. The measurement of d_ij is a Euclidean distance, defined as
d_ij = Σ_{i=0}^{N−1} (x_i − w_ij)²     (11)
where xi is the input to node j and wij is the weight from input node i to output node j.
FIGURE 5 Topological neighborhoods at different times, as feature maps are formed (0 < t1 < t2).
The neighborhood decreases in size with time until only a single node is inside its bounds. A learning rate is also required that decreases monotonically in time. Convergence to a cluster center will be controlled by the learning rate. As the learning rate decreases with more iterations, movement becomes restricted to smaller distances around the cluster center.
4.1.1   SOFM Algorithm
The SOFM algorithm can be described as follows.

Step 1: Initialize weights. Randomly initialize the weights from the N inputs. Set the initial radius of the neighborhood NE.
Step 2: Present a new input.
Step 3: Compute the distance to all nodes.
Step 4: Select the output node with minimum distance. Select node j* as the output node with minimum distance d_j.
Step 5: Update weights to node j* and its neighbors. Weights are updated for node j* and for all nodes in the neighborhood defined by NE_j*(t), following Eq. (12) below; W_j updates as

w_ij(t + 1) = w_ij(t) + h_ij(t)(x_i − w_ij(t))     for j ∈ NE_j*,  0 ≤ i ≤ N − 1     (12)

The term h_ij is called the neighborhood kernel.
Step 6: If NE ≠ 0, go back to Step 2.

The basic idea behind the SOFM approach is to move the weights towards the centers of clusters by updating the weights on each input value. The neighborhood kernel h_ij is defined over the lattice points and has a very central role for a good feature mapping. Usually h_ij = h(‖r_i − r_j‖, t), where r_j ∈ ℝ² and r_i ∈ ℝ² are the radius vectors of nodes j and i, respectively, in the array. With increasing ‖r_i − r_j‖, h_ij → 0. The average width and form of h_ij are important for convergence. The neighborhood was defined as a set of array points around node j, as shown in Fig. 5. Then h_ij = NE_j(t), which decreases monotonically in time. It can be seen that h_ij acts as a learning rate factor α(t) (0 ≤ α(t) ≤ 1). Both α(t) and NE_j(t) decrease in time during the ordering process.
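A minimal sketch of this training loop is given below. It uses a rectangular 10 × 10 grid and a Gaussian neighborhood kernel that shrinks with time, which is a common simplification of the hexagonal lattice and shrinking neighborhood sets used in the chapter; names and decay schedules are illustrative.

```python
import numpy as np

def train_sofm(X, rows=10, cols=10, n_iter=20000, alpha0=0.2, sigma0=5.0):
    """Self-organizing feature map on a rows x cols output grid.

    X: (n_samples, m) feature vectors.  Returns the reference vectors W
    (one row per output node) and the grid coordinates of the nodes.
    """
    rng = np.random.default_rng(0)
    W = rng.uniform(X.min(), X.max(), size=(rows * cols, X.shape[1]))
    # 2-D grid coordinates r_j of the output nodes
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        alpha = alpha0 * (1.0 - frac)                 # learning rate decays in time
        sigma = max(sigma0 * (1.0 - frac), 0.5)       # neighborhood shrinks in time
        x = X[rng.integers(len(X))]
        winner = np.argmin(((W - x) ** 2).sum(axis=1))
        # neighborhood kernel h_ij ~ alpha * exp(-||r_i - r_winner||^2 / (2 sigma^2))
        d2 = ((grid - grid[winner]) ** 2).sum(axis=1)
        h = alpha * np.exp(-d2 / (2.0 * sigma ** 2))
        W += h[:, None] * (x - W)                     # update of Eq. (12)
    return W, grid
```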
4.1.2   Feature Map Classification
In order to use the SOFM for clustering and classification we need to extend the network. One way of extending the SOFM is to add an associative layer to the Kohonen layer, as shown in Fig. 6. This additional set of neurons does not participate in weight updating. After the self-organizing network terminates and the weights are adjusted, the additional layer finds the weight vector (prototype) closest to each input and assigns the input to that class.
FIGURE 6 SOFM with an additional layer using the maximum likelihood training scheme.
The mapping can be accomplished by using maximum likelihood training, a supervised learning scheme. A maximum likelihood approach suggests a simple training algorithm that consists of counting the best matching units in the map corresponding to the training data. The output units are connected to the output nodes in the Kohonen layer corresponding to the class with the greatest frequency of occurrence of training data. Usually the training dataset is small; for each class, a few representatives are selected.
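The counting scheme can be sketched as follows, assuming integer class labels for a small labeled training set; each map node is assigned the class that most often wins it, and new inputs then inherit the label of their best-matching unit. The function names are illustrative.

```python
import numpy as np

def label_map(W, X_train, y_train, n_classes):
    """Label each SOFM node with the class that most often wins it on the
    labeled training set (the maximum likelihood counting scheme of Fig. 6)."""
    counts = np.zeros((len(W), n_classes), dtype=int)
    for x, c in zip(X_train, y_train):
        j = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        counts[j, c] += 1
    return counts.argmax(axis=1)                    # node -> class label

def classify_sofm(W, node_label, X):
    """Assign each input the label of its best-matching unit."""
    bmu = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)
    return node_label[bmu]
```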
4.1.3   Extracting the Cerebrum
Extracting the cerebrum is performed by stripping away the skull and scalp pixels from the T2 images. The following algorithm describes the technique.

FIGURE 7   Different steps of extracting the cerebrum from MR images of the head.

1. Divide the image into four quadrants (see Fig. 7).
2. In quadrant 4:
   (a) From the center "o", (x_o, y_o), of the T2-weighted image, identify the pixel of bone or air, and background, (x_i, y_j), by thresholding the pixel value on each row from left to right. If the pixel value is less than 100, check the neighboring pixels in a 4 × 4 (or 5 × 5) box (see Fig. 7). If the majority of pixels in the box belong to the bone class, stop the process and assign (x_i, y_j) as a boundary pixel.
   (b) Decrement o in the row, go to (a), and repeat the process until point B is reached. (B is not a fixed point; it varies depending on the shape of the head. The algorithm stores the position of B, (x_B, y_B), when it reaches this point; x_B is fixed in each quadrant.)
   (c) From the center "o", (x_o, y_o), of the T2-weighted image, identify the pixel as bone or air, and background, (x_k, y_l), by thresholding the pixel value on each column from bottom to top. If the pixel value is less than the threshold, check the neighboring pixels in a 4 × 4 (or 5 × 5) box (see Fig. 7). If the majority of pixels in the box belong to the bone class, stop the process and assign (x_k, y_l) as a boundary pixel.
   (d) Increment o in the column, go to (c), and repeat the process until point B is reached.
3. Similarly, process quadrants 1, 2, and 3 by changing the direction of the search: in quadrant 1, search right to left in the row and bottom to top in the column; in quadrant 2, search right to left in the row and top to bottom in the column.
In quadrant 3, search left to right in the row and top to bottom in the column.

By using the detected margin at this stage, a mask image is generated to remove the skull pixels from the T1- and PD-weighted images. A typical example is shown in Fig. 8. Figures 8a, b, and c show the T1-, T2-, and PD-weighted images, respectively. Figure 8d shows how the algorithm strips the skull pixels from the T2-weighted image. Figure 8e shows the mask image. Figures 8f, g, and h show the extracted cerebrum from each image component. It should be mentioned that the algorithm would not work automatically for images near the base of the brain, and user interaction is required to specify the marginal points on the image using a mouse interface.

FIGURE 8   Different stages of extracting the cerebrum.
5   EXPERIMENTS AND RESULTS
Results from the four different approaches are presented in this section. Pixel intensity values and spatial information of neighboring pixels are used as features. The number of input nodes in the network is equal to the number of features, and the number of output nodes is equal to the number of target tissue classes. In the supervised classification cases, the images were segmented into seven classes: background, cerebrospinal fluid (CSF), white matter, gray matter, bone, scalp, and lesion or tumor (if present). In the unsupervised approaches the images were segmented into four classes: CSF, white matter, gray matter, and unknown (lesion or tumor, if present).
5.1   Results of Supervised Classification Approaches
5.1.1   Results of LVQ ANN Approach
For the initialization of the codebook vectors, a set of vectors is chosen from the training data, one at a time. All the entries used for initialization must fall within the borders of the corresponding classes, and this is checked by the K-nearest neighbor (K-nn) algorithm. In fact, in this step, the placement of all codebook vectors is determined first, without taking their classification into account. The initialization program then selects the codebook vectors based on the desired number of codewords. For our segmentation problem, different sets of codebooks, including 60 to 120 codewords for each set, were tested. The accuracy of classification may depend on the number of codebook entries allocated to each class. There is no simple rule to find the best distribution of the codebook vectors. We used the method of iteratively balancing the medians of the shortest distances in all classes. Our program first computes the medians of the shortest distances for each class and corrects the distribution of codewords so that for those classes in which the distance is greater than the average, codewords are added, and for those classes in which the distance is smaller than the average, some codewords are deleted from the initialized codebook. When the codebook has been initialized properly, training is begun. Training vectors are selected by manual segmentation of multispectral MR images (T1-, T2-, and PD-weighted images) of the head. In a typical experiment, the T1-weighted images are displayed on the computer screen, one slice at a time. Next, representative regions of interest
(ROIs) for the target tissue classes are selected interactively on the computer screen using a mouse-driven interface. When the ROIs are selected, a data structure containing the labels of the regions, the spatial information of neighboring pixels in these regions, and the corresponding intensity vectors is created and stored in a file. The created file is then used to train the neural networks. Figure 9 shows the results obtained using this approach. The original proton density image is shown in Figure 9a. A complete segmented image is shown in Figure 9b; CSF, white matter, and gray matter are shown separately in Figures 9c, 9d, and 9e.
5.1.2   Results of Back-Propagation ANN Approach
The first step in using the network was to determine the learning parameters. Figure 3 illustrates the topology of the network. With that topology, the network was trained three times with different learning rates and momentum. The effect of the learning parameter and momentum rate on the speed of convergence was investigated, and the best combination was selected for classification and segmentation. The same training and test data were used for training and testing the network. The training was stopped if convergence was not reached after 100,000 iterations. In order to investigate the impact of the topological parameters on both the speed of convergence and the classification accuracy, several experiments have been set up. Neural networks with one and two hidden layers were generated, and the number of nodes in each of the layers was varied from 10 to 30. From results obtained in each test, the best combination is selected for classification and segmentation of MR images. Results obtained using this technique are shown in Fig. 10. Figure 11 shows the results of segmenting an MR image with tumor. The patient had a malignant glioma. In this case the segmentation of the MR images is complicated because image features of abnormal tissues may be very close to those of their neighboring normal tissues. Therefore abnormal tissues may be found in a normal tissue component after a segmentation. Abnormal tissues may also deform the geometry of normal tissues. It can be observed that the results for back-propagation ANN suffer from noise and misclassification but that the LVQ ANN results are quite good.
5.2   Results of Unsupervised Segmentation Approaches
In this section the network parameters and results of segmentation of MR images using our scheme and clustering technique will be presented.
FIGURE 9   (a) The original PD-weighted image. (b) A complete segmented image using LVQ ANN. (c) CSF tissues. (d) White matter. (e) Gray matter.
FIGURE 10   (a) The original PD-weighted image. (b) A complete segmented image using back-propagation ANN. (c) CSF tissues. (d) White matter. (e) Gray matter.
FIGURE 11 (a) Segmentation of a tumor in an abnormal brain using backpropagation ANN. (b) Segmentation of a tumor using LVQ ANN.
5.2.1   Results of the SOFM Network
One of the parameters that has to be set to obtain a good mapping is the form of the array. From the results of several experiments, it appeared that choosing the hexagonal lattice provided better results than the rectangular lattice. Other parameters that influenced the results were the type of neighborhood function and the map size. The size of the map defines the number of codewords or reference vectors. Map sizes of 6 × 6, 8 × 8, 10 × 10, and 11 × 11 were tested. Results improved slightly as the map size increased. However, no significant differences were observed when the size was increased from 10 × 10 to 11 × 11. Therefore the size of 10 × 10, which consists of 100 codewords, was selected. The reference vectors of the map were first initialized randomly. The lattice type of the map and the neighborhood function used in the training procedures were also defined in the initialization. The map was trained by the self-organizing feature map algorithm explained above. For other image slices from the same patient study, initialization of the network was not performed randomly. In order to speed up the learning time, previous weights were used for initialization of the network. Figure 12 shows an example of segmentation results obtained using the SOFM network.
5.2.2   Results of the c-Means Clustering Algorithm
The results of the c-means algorithm heavily depend on the number of iterations and classes. Typical results are shown in Figs. 13 and 14.
FIGURE 12 Typical results using SOFM and maximum likelihood as feature map classifier.
To demonstrate how the c-means algorithm performs clustering, different numbers of classes were chosen. In Fig. 13, results of segmentation are shown for c = 4 classes. The different class tissues are shown separately. Figure 14 shows results for c = 6 classes. As shown in Fig. 13, for c = 4, the algorithm tends to segment the image into three tissue classes. Figure 13a shows white matter and gray matter mixed together as one class, while the CSF in Figs. 13b and c is classified into two tissue classes. For c = 5, the algorithm tends to segment the image into four tissue classes, the white matter and gray matter mixed together as one class, while
FIGURE 13   (a–c) Results of applying the c-means algorithm (c = 4).
part of the gray matter is classified as a separate class. The CSF is classified into two tissue classes. For c = 6, the algorithm tends to segment the image into five tissue classes. Although in Fig. 14, results seem to be encouraging, none of the segmented regions are correct. The white matter is classified into one class (Fig. 14a), while the gray matter and the CSF are each classified into two tissue classes (Fig. 14b, c and 14d, e, respectively). The first class of gray matter is correct while the second class of the gray matter is composed of the gray matter and CSF. An attempt was made to merge the five classes to three classes, but similar results were obtained (as shown in Fig. 13).
FIGURE 14   (a–e) Results of applying the c-means algorithm (c = 6).
6   DISCUSSION AND CONCLUSION
Back-propagation neural networks are very sensitive to the training set in MR image segmentation of the brain. Back-propagation nets provide adequate brain segmentation provided that the training data are quite good. They can learn effectively on as few as 250 pixels per class in a 256 ⫻ 256 MR image, using a multilayer back-propagation network with between 10 and 20 hidden units, which means that training and testing are relatively fast. An effort to find a universal training set that would be useful on many different MR images has been made. However, because of the intensity variation across MR images, which causes instability on training data, the back-propagation network could not perform well every time. If back-propagation networks are to be used for segmentation of MR images, at least for now, reliance on an operator to select good training data for each subject and each slice of data is crucial. Results of our LVQ ANN approach are not limited to MR images recorded from a single patient, and the classification procedure is valuable for MR images recorded from other patients. Our method is insensitive to the variations of gray-level for each tissue type between different slices from
the same study. the LVQ artificial neural networks are a particularly good choice for segmentation of MR images because their generalization properties require the labeling of only a few training points, and they produce results faster than other traditional classifiers. We showed that tissue segmentation using LVQ ANN produces better and faster results than backpropagation ANN and is a powerful technique for MR image segmentation. For the unsupervised segmentation approaches we showed that the cmeans algorithm could not provide a reliable result for MR image segmentation. Although the basic theory of the c-means algorithm is similar to the SOFM, two major differences exist: (1) in the SOFM algorithm, each entry (i.e., vector x) is used to update the winning class and its neighboring classes, while in the c-means algorithm, each input vector is classified, and only the winning class is modified during each iteration; (2) in the SOFM, the weights represent the number of reference vectors, which are normally several times the number of classes, while in the c-means algorithm the number of classes is predetermined as a constant value in the algorithm. There are a number of important properties that make the SOFM suitable for use as a codebook generator for clustering schemes. They can be summarized as follows: The set of reference vectors is a good approximation to the original input space. The reference vectors are topologically ordered in the feature map, so that the correlation between the reference vectors increases as the distance between them decreases. The density of the feature map corresponds to the density of the input distribution, so that regions with a higher probability density have a better resolution than areas with a lower density. A common problem in MR image segmentation is the inhomogeneity introduced by the magnetic field in the images. This problem has admittedly not been addressed by many researchers [41,21,22,32,34]. However, in our SOFM algorithm, the inhomogeneity problem can be resolved in most cases. In the clustering stage when the Kohonen map establishes the number of clusters based on the self-organizing feature map algorithm, part of the image with relatively similar contrast and gray level values is mapped in the same neighborhood, which can then be classified as one class. However, the algorithm may fail to produce good results if the inhomogeneity is significant. In summary, our studies have shown the effectiveness of our neural network techniques for supervised and unsupervised segmentation of brain tissues in MR images. The characteristics of proposed artificial neural network schemes—including a massive parallel structure, a high degree of
interconnection, and the ability to self-organize—parallel many of the characteristics of the human visual system. The proposed approaches are promising techniques for MR image segmentation.
REFERENCES

1. R. R. Edelman, S. Warach. Magnetic resonance imaging (first of two parts). New England Journal of Medicine 328:708–716, 1993.
2. W. Bondareff, J. Raval, B. Woo, D. L. Hauser, P. M. Colletti. Magnetic resonance imaging and the severity of dementia in older adults. Arch. Gen. Psychiatry 47:47–51, 1990.
3. H. S. Stiehl. 3-D image understanding in radiology. IEEE Engineering in Medicine and Biology 9(4):24–28, 1990.
4. T. Taxt, A. Lundervold. Multispectral analysis of the brain using magnetic resonance imaging. IEEE Engineering in Medicine and Biology 13(3):470–481, 1994.
5. M. C. Clark, L. O. Hall, D. B. Goldgof, L. P. Clarke, R. P. Velthuizen, M. S. Silbiger. MRI segmentation using fuzzy clustering techniques. IEEE Engineering in Medicine and Biology, 730–742, 1994.
6. J. Alirezaie, M. E. Jernigan, C. Nahmias. Neural network based segmentation of magnetic resonance images of the brain. IEEE Transactions on Nuclear Science 44(2):194–198, 1997.
7. S. C. Amartur, D. Piraino, Y. Takefuji. Optimization neural networks for the segmentation of magnetic resonance images. IEEE Transactions on Medical Imaging 11(2):215–220, 1992.
8. H. S. Choi, Y. Kim. Partial volume tissue classification of multichannel magnetic resonance images—a mixel model. IEEE Transactions on Medical Imaging 10(3):395–407, 1991.
9. J. Lin, K. Cheng, C. Mao. A fuzzy Hopfield neural network for medical image segmentation. IEEE Transactions on Nuclear Science 43(4):2389–2398, 1996.
10. C. Lee, S. Huh, T. A. Ketter, M. Unser. Automated segmentation of the corpus callosum in midsagittal brain magnetic resonance images. Optical Engineering 39(4):924–935, 2000.
11. M. C. Clark, L. O. Hall, D. B. Goldgof, R. P. Velthuizen, F. K. Murtagh, M. S. Silbiger. Automatic tumor segmentation using knowledge-based techniques. IEEE Transactions on Medical Imaging 17(2):187–201, 1998.
12. N. Shareef, D. Wang, M. Bister. Segmentation of medical images using LEGION. IEEE Transactions on Medical Imaging 18(1):74–91, 1999.
13. T. Kapur, W. Grimson, W. Wells, R. Kikinis. Segmentation of brain tissue from magnetic resonance images. Medical Image Analysis 1(2):109–127, 1996.
14. H. E. Cline, C. L. Dumoulin, H. R. Hart, W. E. Lorensen, S. Ludke. 3D reconstruction of the brain from magnetic resonance images using a connectivity algorithm. J. Magn. Reson. Imaging 5:345–352, 1987.
15. L. P. Clarke, R. P. Velthuizen, S. Phuphanich, J. D. Schellenberg, J. A. Arrington, M. Silbiger. MRI: stability of three supervised segmentation techniques. Magnetic Resonance Imaging 11:95–106, 1993.
16. J. R. Mitchell, S. J. Karlik, D. H. Lee, A. Fenster. Computer-assisted identification and quantification of multiple sclerosis lesions in MR imaging volumes in the brain. J. Magn. Reson. Imaging 4:197–208, 1994.
17. M. W. Vannier, M. Gado, R. L. Butterfield. Multispectral analysis of magnetic resonance images. Radiology 154(1):221–224, 1985.
18. R. M. Haralick, L. G. Shapiro. Survey: image segmentation techniques. Computer Vision, Graphics, and Image Processing 29:100–132, 1985.
19. D. A. Ortendhal, J. W. Carlson. Segmentation of magnetic resonance images using fuzzy clustering. In C. N. de Graff, M. A. Viergever, eds., Information Processing in Medical Imaging. Plenum Press, New York, 1988, pp. 91–106.
20. L. O. Hall, A. M. Bensaid, R. P. Velthuizen, L. P. Clarke, M. S. Silbiger, J. C. Bezdek. A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain. IEEE Transactions on Neural Networks 3(2):672–682, 1992.
21. H. E. Cline, W. E. Lorensen, R. Kikinis, F. Jolesz. Three-dimensional segmentation of MR images of the head using probability and connectivity. Journal of Computer Assisted Tomography 14(6):1037–1045, 1990.
22. M. Joliot, B. M. Mazoyer. Three-dimensional segmentation and interpolation of magnetic resonance brain images. IEEE Transactions on Medical Imaging 9(2):269–277, 1993.
23. A. Baraldi, P. Blonda, F. Parmiggiani, G. Satalino. Contextual clustering for image segmentation. Optical Engineering 39(4):907–923, 2000.
24. J. Weng, A. Singh, M. Y. Chiu. Learning-based ventricle detection from cardiac MR and CT images. IEEE Transactions on Medical Imaging 16(4):378–391, 1997.
25. N. Duta, M. Sonka. Segmentation and interpretation of MR brain images. IEEE Transactions on Medical Imaging 17(6):1049–1062, 1998.
26. C. K. Leung, F. K. Lam. Maximum segmented image information thresholding. Graphical Models and Image Processing 60:57–76, 1998.
27. C. Li, D. B. Goldgof, L. O. Hall. Knowledge-based classification and tissue labeling of MR images of human brain. IEEE Transactions on Medical Imaging 12(4):740–750, 1991.
28. B. M. Dawant, A. P. Zijdenbos, R. A. Margolin. Correction of intensity variations in MR images for computer-aided tissue classification. IEEE Transactions on Medical Imaging 12(4):770–781, 1993.
29. M. Ozkan, R. J. Maciunas. Neural network based segmentation of multi-modal medical images: a comparative and prospective study. IEEE Transactions on Medical Imaging 12(3):534–544, 1993.
30. M. B. Riemer, M. Riemer. 3-D segmentation of MR images of the head for 3-D display. IEEE Transactions on Medical Imaging 9(2):177–183, 1990.
31. A. Alaux, P. A. Rick. Multispectral analysis of magnetic resonance images: a comparison between supervised and unsupervised classification techniques. In International Symposium on Tissue Characterization in MR Imaging, 1990, pp. 165–169.
32. Z. Liang. Tissue classification and segmentation of MR images. IEEE Engineering in Medicine and Biology 12(1):81–85, 1993.
33. A. Lundervold, G. Storvik. Segmentation of brain parenchyma and cerebrospinal fluid in multispectral magnetic resonance images. IEEE Transactions on Medical Imaging 14(2):339–349, 1995.
34. J. Chou, C. Chen, W. Lin. Segmentation of dual-echo MR images using neural networks. In Proc. SPIE, Medical Imaging, Vol. 1898, SPIE, 1993, pp. 220–227.
35. T. Kohonen. An introduction to neural computing. Neural Networks 1:3–16, 1988.
36. N. R. Pal, J. C. Bezdek, E. C. K. Tsao. Generalized clustering networks and Kohonen's self-organizing scheme. IEEE Transactions on Neural Networks 4(4):549–557, 1993.
37. T. Kohonen, G. Barna, R. Chrisley. Statistical pattern recognition with neural networks: benchmarking studies. In IEEE International Conference on Neural Networks (ICNN), Vol. 1, July 1988, pp. 61–68.
38. T. Kohonen. Self-Organization and Associative Memory. Springer-Verlag, 1989.
39. J. A. Freeman, D. M. Skapura. Neural Networks: Algorithms, Applications, and Programming Techniques. Addison-Wesley, 1991.
40. T. Kohonen. Self-Organizing Maps. Springer-Verlag, Berlin, 1995.
41. D. N. Kennedy, V. S. Caviness. Anatomic segmentation and volumetric calculations in nuclear magnetic resonance imaging. IEEE Transactions on Medical Imaging 8(1):1–7, 1989.
13 Stochastic Model Based Image Analysis

Yue Wang
The Catholic University of America, Washington, D.C.

Tülay Adalı
University of Maryland Baltimore County, Baltimore, Maryland

1 INTRODUCTION
Quantitative analysis of magnetic resonance (MR) images deals with the problem of estimating tissue quantities and segmenting the anatomy into contiguous regions of interest. The problem has recently received much attention, largely due to the improved fidelity and resolution of MR imaging systems and the effective clinical utility of image analysis and understanding in diagnosis, monitoring, and intervention (e.g., see Refs. 1–4). For example, pathological studies show that many neurological diseases are accompanied by subtle abnormal changes in brain tissue quantities and volumes, as shown in Ref. 2. Because it is virtually impossible for clinicians to quantitatively analyze these pathological changes associated with a specific disease directly from MR images, considerable effort is required to
develop accurate image analysis algorithms for identifying and quantifying these changes in vivo, as discussed in Refs. 5–11. The major tasks of MR image analysis involve tissue quantification and image segmentation. In the stochastic model based approach, it is typically assumed that each pixel can be decomposed into a pixel image and a context image, and each is considered separately. A pixel image is defined as the observed gray level associated with the pixel, whereas the context image is defined as the membership of the pixel associated with different regions. In our treatment of the stochastic MR analysis, we first address the problem of MR image statistics, and in the next section we provide the complete statistical description of MR imaging along with justification of the common assumptions made for MR image modeling. This section provides the basis for the following section, Sec. 3, which introduces the statistical models for pixel and context images. The finite mixture models, and in particular the standard finite normal mixtures (SFNM), have been widely used for pixel images, and efficient algorithms are available for calculating the parameters of the model. On the other hand, by incorporating statistical properties of context images, Markov random fields (MRF) have proven to be the most popular way to impose local consistency constraints on context images in terms of a stochastic regularization scheme, as given in Ref. 12. We introduce both of these models and show that by using an adaptive SFNM model, where information-theoretic criteria are used to determine the number of tissue types, we can incorporate the partial volume effect, tissue abnormality, and inhomogeneity distortion into a single stochastic model [3,10]. Also, an inhomogeneous MRF is introduced that uses the entropy rate distribution to determine probabilistic boundary sites in the image, and the spatial discontinuity levels to assign the model parameter values. This permits an adaptive construction of the patient-specific site model for improving both tissue quantification and image segmentation, as given in Refs. 13,14. Section 4 describes the estimation of the model parameters, i.e., the quantification and segmentation stages. The final section provides a set of two- and three-dimensional image analysis results. We also address the problem of algorithm performance evaluation and define the postglobal relative entropy for the assessment of the final image segmentation performance, as shown in Ref. 15. Based on the observation that the parameter values of a particular tissue type in the quantified and segmented images should be very close, this criterion suggests an indirect but objective approach to the difficult problem of performance evaluation in image segmentation [16].
2 MR IMAGE STATISTICS

2.1 Introduction
As discussed in Sec. 1, although some pioneering work has been reported on the statistical analysis of x-ray CT [17], positron emission tomography [18], and ultrasound scans [19], little work has been carried out on MR imaging statistics and model-based MR image analysis. For example, Ref. 20 discusses noise sources and their effects in MR imaging, Ref. 21 presents a unified SNR analysis of medical imaging systems, Ref. 22 applies information theory to the quality evaluation of MR imaging, and Ref. 24 compares several currently used statistical algorithms in MR imaging. However, none of these researchers provide a complete statistical analysis that can incorporate MR imaging statistics into MR image modeling by justifying the many heuristic assumptions typically made in the literature (e.g., see Ref. 22). In this section, we present a full investigation of the stochastic aspects of MR imaging and discuss its applications in MR image modeling. A complete statistical description of MR imaging, including the imaging equation and random field theory, is provided. Both object variability and thermal noise are considered as degradations present in the pixel images that are generated by the Fourier transform reconstruction algorithm. Gaussianity, dependence, stationarity, and ergodicity of MR pixel images are characterized as standard problems of statistics. Their justification is given in order to form the basis of the stochastic image modeling and analysis.

In the context of stochastic image modeling and regularization, the statistical properties of both pixel and context images are important and have to be considered together. Many investigations have been conducted on the image statistics of pixel images in MR or other modalities (e.g., Refs. 10, 22, 25, and 29). Reference 20 has conducted intensive research on the objective assessment of imaging statistics of x-rays and gamma rays, considering the effects of both quantum noise and object variability. Pioneering work on the speckle statistics in medical ultrasound images, in which the first-order and the second-order statistics received intensive treatment, is reported in Ref. 19. References 21 and 26 study the intrinsic signal-to-noise ratio and noise power spectrum for MR imaging; both object variability and thermal noise are considered and evaluated theoretically. Reference 24 conducts a comparative study evaluating several statistical approaches in MR imaging, including intensive real MR data analysis; Ref. 22 contributes to the measurement of the bivariant information content of MR images, in which six assumptions are made and partially justified. Furthermore, the statistics of the context image is another challenging topic for researchers. Using a randomization rule and
stochastic regularization, many investigations have been conducted (see Refs. 13, 25, 27, and 28), showing important properties. However, they suffer from some limitations: constraints on context representation are generally imposed mathematically without objective justification by the context image statistics [13,29]; since the true context is generally unobservable, the assumed models always contain some empirical parameters that are given heuristically [3,10,13]; and a homogeneity assumption is applied to the context image even when it is obviously not true [10,30]. In our MR statistical investigation, the following issues are addressed:

1. MR image statistics as applicable to typical MR data samples, using theoretical and experimental studies
2. MR image statistics in the context image
3. The relationship between the MR image statistics and the MR image models

2.2 Statistical Properties of Pixel Images

2.2.1 Gaussianity
From MR imaging, the pixel image f̂(k, l) is a weighted sum of a large number (M × N) of random variables [the free induction decay (FID) signal s(m, n) and the thermal noise s_n(m, n)]. Both the FID signal and the thermal noise have finite means and variances. Thus, according to the generalized central limit theorem, a pixel image f̂(k, l) has an asymptotic Gaussian distribution. Due to the physical limitation, this Gaussian distribution is truncated. Therefore the Gaussianity of the final MR pixel images can be summarized by the following statement:

Property 1: Given an MR image, each pixel is a random variable with a truncated asymptotic Gaussian distribution, and the whole image is a Gaussian random field.

This conclusion is strongly supported by real MRI data analysis. Reference 24 has shown, based on a number of experiments in which a single set of measurements of a slice is compared to an average image made from eight sets of measurements of the same slice, that the real (or the imaginary) part of the reconstructed pixel image is influenced by noise following a Gaussian distribution with zero mean. The histogram of the noise for the real part of the reconstructed data clearly shows that, even in the tails, a Gaussian distribution produces an excellent fit. Furthermore, the modulus image will have a Rayleigh distribution. The Rayleigh density approaches a truncated Gaussian when the SNR is relatively high [27]. Reference 24 also shows that noise at each pixel approximately follows a Gaussian distribution in the modulus images.
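As a quick illustration of Property 1 (a sketch of ours, not part of the chapter's analysis), the following Python fragment simulates a reconstructed pixel as a deterministic signal plus complex Gaussian quadrature noise; the simulated modulus is Rayleigh-like in signal-free background and close to Gaussian at high SNR, consistent with the behavior reported in Ref. 24. The amplitude, noise level, and sample size are arbitrary illustrative values.

```python
import numpy as np

# Illustrative sketch: real and imaginary parts of a reconstructed MR pixel are
# modeled as Gaussian; the modulus is Rayleigh distributed when no signal is
# present and approaches a (truncated) Gaussian at high SNR (Property 1).
rng = np.random.default_rng(0)
sigma = 1.0                      # noise std dev of each quadrature channel
n = 100_000

def modulus_samples(signal_amplitude):
    """Magnitude of (signal + complex Gaussian noise)."""
    real = signal_amplitude + rng.normal(0.0, sigma, n)
    imag = rng.normal(0.0, sigma, n)
    return np.hypot(real, imag)

background = modulus_samples(0.0)     # no signal -> Rayleigh statistics
tissue     = modulus_samples(10.0)    # high SNR  -> approximately Gaussian

for name, x in [("background", background), ("tissue (SNR=10)", tissue)]:
    skew = np.mean(((x - x.mean()) / x.std()) ** 3)
    print(f"{name:16s} mean={x.mean():6.3f} std={x.std():5.3f} skewness={skew:+.3f}")
# The background modulus is noticeably skewed (Rayleigh), while the high-SNR
# modulus has near-zero skewness, consistent with the truncated-Gaussian claim.
```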
2.2.2 Dependence
According to the second-order moment input/output relations of linear systems [31], the covariance function of the convolved linear magnetic resonance coefficients can be expressed as

$$K_f((k,l),(k',l')) = K_\rho((k,l),(k',l')) \circledast h^*((k,l),(k',l')) \circledast h((k,l),(k',l')) \tag{1}$$

where $\circledast$ denotes continuous linear convolution and $K_\rho(\cdot,\cdot)$ is the covariance function of the true linear magnetic resonance coefficients. When the distance between two pixel images is denoted by $\Delta d = \|(k,l),(k',l')\|$, we have

$$K_f((k,l),(k',l')) = \sigma_\rho(k,l)\,\sigma_\rho(k',l')\exp\!\left(-\frac{\Delta d}{\Delta r}\right) \circledast h^*(\Delta d) \circledast h(-\Delta d) \tag{2}$$

where Δr is defined as the effective correlation distance, which is on the order of 10⁻⁸ m. Thus it is easy to show that when Δd approaches infinity, the covariance of f(·,·) goes to zero, since the first convolution function in Eq. (2) is narrow ranged. This property is called asymptotic independence, or weak dependence. In our case, it is

$$\lim_{\|(k,l),(k',l')\|\to\infty} K_f((k,l),(k',l')) = 0 \tag{3}$$

Furthermore, the dependence between any two pixel images can also be evaluated by the covariance function $K_{\hat f}((k,l),(k',l'))$ given by

$$K_{\hat f}((k,l),(k',l')) = K_f((k,l),(k',l')) + K_{f_n}((k,l),(k',l')) = K_f((k,l),(k',l')) + \frac{\sigma^2_{s_n}}{MN}\,\delta(k'-k,\,l'-l) \tag{4}$$

Thus the dependence in MR pixel images can be stated as follows:

Property 2: Any two pixel image random variables (rv's) in an MR image are asymptotically independent, that is, weakly dependent. Their correlation is mainly governed by the system point spread function under the condition of narrow-ranged microscopic correlation.

The importance of Eq. (4) derives from the fact that the correlation between any two pixels in an MR image is determined only by the correlation of the
object variability part in which both microscopic correlation and system point spread function make contributions. This property is strongly confirmed by real MRI data analysis. References 22 and 24 have reported that, based on their experiments, pixel images seem to be uncorrelated, since the plot of the correlation function of an MR image is shown to be rapidly decreasing with increasing interpixel distance.
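The following sketch (an illustration of ours under assumed settings, not an experiment from Refs. 22 or 24) mimics this behavior: a white Gaussian "object variability" field is blurred by a narrow point spread function and corrupted by thermal noise, and the sample correlation between pixels falls toward zero within a few pixels of lag. The PSF width and noise level are arbitrary choices.

```python
import numpy as np

# Hedged illustration of Property 2: synthetic image = white Gaussian object
# variability blurred by a narrow PSF plus white thermal noise; the normalized
# covariance decays quickly with interpixel distance.
rng = np.random.default_rng(1)
size, psf_width = 256, 1.5          # illustrative values

y, x = np.mgrid[-8:9, -8:9]
psf = np.exp(-(x**2 + y**2) / (2 * psf_width**2))
psf /= psf.sum()

field = rng.normal(size=(size, size))
# 2-D convolution via FFT (circular boundary is adequate for this illustration)
image = np.real(np.fft.ifft2(np.fft.fft2(field) * np.fft.fft2(psf, (size, size))))
image += 0.1 * rng.normal(size=(size, size))      # thermal noise

image -= image.mean()
for lag in (0, 1, 2, 4, 8, 16):
    num = np.mean(image[:, :size - lag] * image[:, lag:])
    print(f"lag {lag:2d}: normalized covariance = {num / image.var():+.3f}")
# The printed covariance drops toward zero within a few pixels of lag, i.e.,
# the pixel images are only weakly dependent, as stated in Property 2.
```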
2.2.3 Stationarity
After investigating the first- and second-order statistics of MR pixel images, we are ready to explore the other two important properties, that is, stationarity and ergodicity. A homogeneous object in an MR image has a unique linear magnetic resonance coefficient ρ(x, y), that is, a unique mean and variance:

$$E[\rho(x,y)] = \mu_\rho \qquad E[(\rho(x,y) - \mu_\rho)^2] = \sigma_\rho^2 \tag{5}$$

From Ref. 23, we note that

$$E[\hat f(k,l)] = C_1 \qquad E[(\hat f(k,l) - C_1)^2] = C_2 \tag{6}$$

where C₁ and C₂ are the constant mean and variance of a pixel image f̂(k, l). Note that for a different homogeneous object in an image, C(·) is determined uniquely by the mean and variance of the underlying linear magnetic resonance coefficient ρ(x, y) of that object. We define a meaningful image region (or, simply, a region) mathematically as follows:

Definition: A group of pixel images is said to form a region in an MR image if they have the same means and variances.

Thus, based on the above definition, an image region in the final reconstructed MR image will correspond to a homogeneous object. Then within an image region, we can simply rewrite Eq. (2) as

$$K_f(\Delta d) = \sigma_\rho^2 \exp\!\left(-\frac{\Delta d}{\Delta r}\right) \circledast h^*(\Delta d) \circledast h(-\Delta d) \tag{7}$$

Therefore the autocorrelation function of f̂(k, l) will be

$$R_{\hat f}((k,l),(k',l')) = E[\hat f(k,l)\,\hat f(k',l')] = \begin{cases} \sigma_\rho^2 \exp\!\left(-\dfrac{\Delta d}{\Delta r}\right) \circledast h^*(\Delta d) \circledast h(-\Delta d) + C_1^2 & (k \neq k',\ l \neq l') \\ C_2 + C_1^2 & (k = k',\ l = l') \end{cases} \tag{8}$$

Since the correlation depends only on the spatial index difference, according to Ref. 31, f̂(k, l) is a stationary field in the wide sense, and also in the strict sense. By assuming that an MR image contains several distinct regions, we state stationarity as follows:
Property 3: Given an MR image, each region is stationary. The whole image is piecewise stationary.
2.2.4 Ergodicity
For a given region in an MR image, we can rewrite Eq. (4) as

$$K_{\hat f}((k,l),(k',l')) = \begin{cases} \sigma_\rho^2 \exp\!\left(-\dfrac{\Delta d}{\Delta r}\right) \circledast h^*(\Delta d) \circledast h(-\Delta d) & (k \neq k',\ l \neq l') \\ C_2 & (k = k',\ l = l') \end{cases} \tag{9}$$

which indicates that $K_{\hat f}(0) < \infty$ and $K_{\hat f}((k,l),(k',l')) \to 0$ when $\Delta d \to \infty$. Therefore f̂(k, l) satisfies a mean ergodic theorem with limiting sample average C₁:

$$\lim_{M,N\to\infty} \frac{1}{MN} \sum_{k=0}^{M} \sum_{l=0}^{N} \hat f(k,l) = C_1 \tag{10}$$

with probability 1 [31]. Note that the variances of both the thermal noise and the object variability are assumed to be finite. Then we use the Birkhoff–Khinchin theorem [31] to show that f̂(k, l) also satisfies a variance ergodic theorem. Reference 31 shows that a stationary Gaussian random field is ergodic if it has a strictly positive definite covariance function $K_{\hat f}$ such that $K_{\hat f}((k,l),(k',l'))$ is finite for all Δd and approaches 0 as Δd → ∞ with probability 1. Therefore, according to Theorem 7.3 in Ref. 31,

$$\lim_{M,N\to\infty} \frac{1}{MN} \sum_{k=0}^{M} \sum_{l=0}^{N} [\hat f(k,l) - C_1]^2 = C_2 \tag{11}$$

A detailed mathematical proof for the independent case can be found in Ref. 17. The above discussion can be summarized by the following property:

Property 4: An MR image is a piecewise ergodic random field, and each region satisfies both mean and variance ergodic theorems.

The importance of this property is that in unsupervised image analysis the quantification of the mean and variance for each region is necessary. With the ergodic theorems, spatial averages of the pixel images can be performed to
estimate these quantities. Thus only one image, i.e., only one realization of a random field, is required.
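A minimal numerical illustration of Property 4, under assumptions of our own choosing (an i.i.d. Gaussian region with hypothetical constants C₁ = 120 and C₂ = 25), shows how the spatial averages computed from a single realization approach the ensemble mean and variance as the region grows.

```python
import numpy as np

# Illustrative sketch of Property 4: within a homogeneous region, the spatial
# sample mean and variance of one realization converge to the constants C1, C2.
rng = np.random.default_rng(2)
C1, C2 = 120.0, 25.0                    # hypothetical region mean and variance

for side in (8, 32, 128, 512):
    region = rng.normal(C1, np.sqrt(C2), size=(side, side))
    print(f"{side:4d}x{side:<4d} spatial mean = {region.mean():7.3f}   "
          f"spatial variance = {region.var():6.3f}")
# As the region grows, the spatial averages approach C1 = 120 and C2 = 25,
# so one image (one realization of the random field) suffices for quantifying
# each region.
```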
2.3 Statistical Properties of Context Image

2.3.1 Stochastic Regularization and Markovian Property
As mentioned above, each pixel in an MR image is described in terms of a pixel image and a context image, where the pixel image refers to its intensity (gray level) and the context image refers to its class label (membership). In the previous section, we defined an image region as a group of pixel images that have the same mean and variance and justified the stationarity property of image regions. This investigation is based on the moments of the intensities of pixels. We also showed that an image region in an MR image corresponds to a homogeneous object; the whole image may contain several regions. Thus there is a one-to-one correspondence between a pixel label and an image region, i.e., a homogeneous object. In order to have a complete characterization of MR image statistics, we need to investigate the statistical properties of the context image. Another motivation for studying the statistics of context images in this research is to provide a basis for finding a mathematical tool that can translate a statement about the context information, on both local and global scales, into model structures and parameter values in a tractable format. As discussed in a number of references [13,25], although the true context is unobservable, a priori expectations, specific scene knowledge, and contextual clues can help to eliminate possible ambiguities and recover missing information so as to perceive the scene "correctly." In MR imaging, nearby locations typically contain similar tissue types, and several regularities appear: tones, which have locally slow variations and represent homogeneous regions; boundaries, which are usually smooth and persistent; and objects, such as tissues and organs, which have preferred relations and orientations. These "regularities" are rarely deterministic; rather, they describe the correlations and likelihoods of possible outcomes in a real scene. This type of knowledge can be captured mathematically and exploited in a stochastic framework.

In MR images, regions are piecewise contiguous; thus a pixel label takes on discrete values, and the labels of nearby pixels are strongly correlated. This correlation is primarily local and can be represented by a Markovian property. The Markovian property implies that the probability distribution of a label, given all other pixel labels, depends only on its neighboring pixel labels. Mathematically, the concept of a Markovian property can be described by the following two local properties:

1. (Positivity) P(L_i = l_i) > 0
2. (Markov property) P(L_i = l_i | l_{S∖i}) = P(L_i = l_i | l_{∂i})
where l_i is the label of pixel i, ∂i denotes the neighborhood of pixel i, and l_{S∖i} denotes all pixel labels except that of pixel i. This natural constraint can also be justified based on the mechanism of biological development [32].
2.3.2 Spatial Continuity and Context Representation
Local context images can be described by the Markovian property. Global context images are generally reflected by image regions and boundaries. Within one region, the context exhibits a strong spatial continuity, while on region boundaries, the context reflects the natural discontinuity. The level of spatial continuity in local context images will be measured by the dependence (or correlation) among pixel labels and represented by parameter values that reflect this measure. We propose a simple randomization rule to describe mathematically the statistical properties of the local context images. For each pixel i in a second-order neighborhood system,¹ after randomly reordering its eight neighboring pixel labels without specifying orientation, we can calculate the conditional histogram of local context images. Since this conditional histogram is a probability measure, we can define a meaningful probability space for the dependence on local context images based on this conditional probability measure, which determines the probability of the context image of the central pixel when the context images of its neighboring pixels are given. Similarly, by randomly reordering all pixels in the whole image without specifying orientation, we can calculate the histogram of global context images. Clearly, the global histogram is also a probability measure. We define a probability space based on this probability measure and refer to it as the global context information. This multinomial probability distribution determines the unconditional probability of any context image in the whole image, which will be further discussed in the next section.

¹ Given a central pixel, a second-order neighborhood system refers to its eight neighboring pixels [13].

3 STOCHASTIC MODELS FOR AN MR IMAGE

3.1 Introduction
The objective of stochastic modeling in image analysis is to capture the intrinsic character of images in a structure with few parameters so as to understand the statistical nature of the imaging process. Stochastic image models are useful to specify quantitatively the natural constraints and general assumptions for the purpose of statistics about the physical world and the
imaging process. They play crucial roles in unsupervised MR image analysis, such as in parameter estimation (image quantification) and image segmentation (pixel classification). Two different types of stochastic image models are required in practical applications [27]:

1. Models for Pixel Images: The models for pixel images are specifically designed to describe the type of randomness involved in the observable pixel random variables. Many stochastic models have been proposed to analyze the tone or texture pixel images, such as the conditional FNM (CFNM) [33], the standard FNM (SFNM) [17,29,35], the generalized Gaussian mixture (GGM) [29], the Gaussian random field (GRF) [10], the autoregressive (AR) model [34], and the Markov/Gibbs random field (MRF/GRF) [37].
2. Models for Context Images: The models for the underlying true context images are designed to serve as prior contextual constraints on unobserved pixel labels in terms of stochastic regularization [35]. The MRF model has been the most popular so far in this domain, although it has a number of fundamental issues still unexplored and unanswered [37].
In the previous section, we discussed the statistical properties of the MR image. Rather than making heuristic assumptions on image statistics, these statistical properties can be utilized to establish an appropriate framework of stochastic modeling for MR images. A summary of the previous section is as follows. For an MR image,

1. Each pixel is a random variable with a truncated, asymptotic Gaussian distribution. The whole image is a Gaussian random field.
2. Any two pixel image rv's are asymptotically independent (i.e., weakly dependent). Their correlation is mainly governed by the system point spread function with a narrow-ranged microscopic correlation.²
3. Each region is stationary. The whole image is piecewise stationary.
4. Each region satisfies both mean and variance ergodic theorems. The whole image is an ergodic random field.
5. Context images have multinomial distributions; they are correlated and satisfy the Markovian property.

² Here microscopic correlation refers to the intrinsic correlation among spin densities as the input of an imaging system.
In this section, based on these properties, we present a new framework for stochastic MR image modeling and provide a better mathematical understanding of the related issues. Recently, there has been considerable interest in using FNM (for pixel images) and MRF (for context images) models for medical image analysis, and impressive application results using these models have been presented [17,29,33]. Our discussion mainly focuses on these two types of models and shows that this framework results in efficient algorithms for image analysis.

3.2 Image Modeling

3.2.1 Pixel Modeling
Given a digital image consisting of N ≡ N₁ × N₂ pixels, assume that this image contains K regions and that each pixel is decomposed into a pixel image x and a context image l. By ignoring information regarding the spatial ordering of pixels, we can treat context images (i.e., pixel labels) as random variables and describe them using a multinomial distribution with unknown parameters π_k. Since this parameter reflects the distribution of the total number of pixels in each region, π_k can be interpreted as a prior probability of pixel labels determined by the global context information. Thus the relevant (sufficient) statistics are the pixel image statistics for each component mixture and the number of pixels of each component. The marginal probability measure for any pixel image, i.e., the finite mixture distribution, can be obtained by writing the joint probability density of x and l and then summing the joint density over all possible outcomes of l, i.e., by computing p(x_i) = Σ_l p(x_i, l), resulting in a sum of the following general form:

$$p_r(x_i) = \sum_{k=1}^{K} \pi_k\, p_k(x_i) \qquad i = 1, \ldots, N \tag{12}$$
where x_i is the gray level of pixel i, and the p_k(x_i) are conditional region probability density functions (pdfs) with weighting factors π_k satisfying π_k > 0 and Σ_{k=1}^{K} π_k = 1. The generalized Gaussian pdf given region k is defined by [29]

$$p_k(x_i) = \frac{\alpha\,\eta_k}{2\Gamma(1/\alpha)} \exp\bigl[-|\eta_k (x_i - \mu_k)|^{\alpha}\bigr] \qquad \alpha > 0 \tag{13}$$

where μ_k is the mean, Γ(·) is the gamma function, and η_k is a parameter related to the variance σ_k by

$$\eta_k = \frac{1}{\sigma_k}\left[\frac{\Gamma(3/\alpha)}{\Gamma(1/\alpha)}\right]^{1/2} \tag{14}$$
When α >> 1, the distribution tends to a uniform pdf; for α < 1, the pdf
becomes sharper; for α = 2.0, one has the Gaussian pdf, and for α = 1.0 the Laplacian pdf. Therefore the generalized Gaussian model is a suitable model to fit the histogram distribution of those images whose statistical properties are unknown, since the kernel shape can be controlled by selecting different α values. The finite generalized Gaussian mixture model (FGGM) for α = 2 is also commonly referred to as the standard finite normal mixture model and has been the most frequently used form. It can be written as

$$p_k(x_i) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\!\left(-\frac{(x_i - \mu_k)^2}{2\sigma_k^2}\right) \qquad i = 1, 2, \ldots, N \tag{15}$$
where μ_k and σ_k² are the mean and variance of the kth Gaussian kernel and K is the number of Gaussian components. The whole image can be closely approximated by an independent and identically distributed random field X. The corresponding joint pdf is

$$P(\mathbf{x}) = \prod_{i=1}^{N} \sum_{k=1}^{K} \pi_k\, p_k(x_i) \tag{16}$$
where x = [x₁, x₂, . . . , x_N] and x ∈ X. Based on the joint probability measure of pixel images, the likelihood function under finite mixture modeling can be expressed as

$$\mathcal{L}(r) = \prod_{i=1}^{N} p_r(x_i) \tag{17}$$

where r = {K, α, π_k, μ_k, σ_k; k = 1, . . . , K} denotes the model parameter set.
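To make the pixel model concrete, the sketch below evaluates the FGGM density of Eqs. (12)–(14) and the i.i.d. log-likelihood of Eqs. (16)–(17); for α = 2 it reduces to the SFNM kernel of Eq. (15). The three-component parameter values are hypothetical and are used only to generate a toy data set.

```python
import numpy as np
from scipy.special import gammaln

# Minimal sketch of the FGGM/SFNM pixel model of Eqs. (12)-(17); parameter
# values are made up for illustration. For alpha = 2 the kernel is Gaussian.
def fggm_pdf(x, pi, mu, sigma, alpha=2.0):
    """Finite generalized Gaussian mixture density p_r(x) of Eq. (12)."""
    x = np.asarray(x, dtype=float)[..., None]            # shape (..., 1)
    eta = (1.0 / sigma) * np.sqrt(np.exp(gammaln(3.0 / alpha) - gammaln(1.0 / alpha)))
    log_norm = np.log(alpha * eta / 2.0) - gammaln(1.0 / alpha)
    log_kernel = log_norm - np.abs(eta * (x - mu)) ** alpha
    return np.sum(pi * np.exp(log_kernel), axis=-1)

def log_likelihood(x, pi, mu, sigma, alpha=2.0):
    """log L(r) of Eq. (17) under the i.i.d. approximation of Eq. (16)."""
    return np.sum(np.log(fggm_pdf(x, pi, mu, sigma, alpha)))

# Hypothetical three-tissue model (e.g., CSF, gray matter, white matter).
pi = np.array([0.2, 0.45, 0.35])
mu = np.array([40.0, 110.0, 160.0])
sigma = np.array([12.0, 15.0, 10.0])

rng = np.random.default_rng(3)
labels = rng.choice(3, size=5000, p=pi)
pixels = rng.normal(mu[labels], sigma[labels])
print("log-likelihood of simulated pixels:", log_likelihood(pixels, pi, mu, sigma))
```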
3.2.2 Context Modeling
In MR images, regions are piecewise contiguous; thus a pixel label takes on discrete values and the labels of nearby pixels are strongly correlated. Using the equivalence between a Gibbs distribution and an MRF, it has been shown that a Gibbs distribution provides a joint probability measure for l in the following form [2,10,13]:

$$P(\mathbf{l}) = \frac{1}{Z_l}\exp(-U(\mathbf{l})) \tag{18}$$
where U(l) is the energy function and the normalizing constant Z_l is the partition function. A neighborhood system can be established by specifying the clique functions V_c^{(i)}(l), where U(l) = Σ_{i=1}^{N} V_c^{(i)}(l). The most typical configuration of an MRF model is the pairwise interaction neighborhood system, in which spatial neighbors occur only in pairs, i.e., the second-order model [13]. We define the neighborhood of pixel i, denoted by ∂i, by opening a
3 × 3 window with pixel i as the central pixel. The energy function can then be written as

$$U(\mathbf{l}) = \sum_{i=1}^{N} \sum_{j\in\partial i} \beta_i\, I(l_i, l_j) \tag{19}$$
where {β_i} is the Markov parameter and I(·,·) is the indicator function [4]. By translating local context information into the energy function of the Gibbs measure, by the use of a clique structure or a Markov parameter, an inhomogeneous MRF model can be established. Again it is important to note that, in our formulation, the Markov parameter β_i in Eq. (19) is considered to be shift-variant, not a fixed constant as in most previous work [2,10]. We believe that the correlation among context images is primarily local and should be reflected by the Markov parameter values. For example, within one region the context exhibits strong spatial continuity, while on region boundaries the context reflects the natural discontinuity.

By simply applying the Bayes law, we can construct a unified framework to integrate the pixel image model with the context image model. It has been shown that requiring the conditional independence of the observed pixel images, given the context images, is sufficient to ensure that the posterior distribution is an MRF [13,37]. That is, the posterior conditional probability function for the context images l, given the observed pixel images x, has the form of a Gibbs distribution given by

$$P(\mathbf{l}\,|\,\mathbf{x}) = \frac{1}{Z_{l|x}}\exp(-U(\mathbf{l}\,|\,\mathbf{x})) \tag{20}$$
where Z_{l|x} is a normalizing constant. Based on the pairwise interaction neighborhood system, the corresponding energy function is

$$U(\mathbf{l}\,|\,\mathbf{x}) = \sum_{i=1}^{N} \left\{ \prod_{k=1}^{K} \left[\frac{1}{2}\ln(\sigma_k^2) + \frac{(x_i - \mu_k)^2}{2\sigma_k^2}\right]^{I(l_i,k)} + \sum_{j\in\partial i} \beta_i\, I(l_i, l_j) \right\} \tag{21}$$
using Eqs. (15) and (19). Equation (20) refers to a hidden MRF, whose local property can be derived as

$$p(l_i\,|\,l_{\partial i}, \mathbf{x}) = \frac{1}{Z_i}\exp(-U(l_i\,|\,l_{\partial i}, \mathbf{x})) \tag{22}$$

where Z_i is a normalizing constant, l_{∂i} denotes the local context images, and

$$U(l_i\,|\,l_{\partial i}, \mathbf{x}) = \prod_{k=1}^{K} \left[\frac{1}{2}\ln(\sigma_k^2) + \frac{(x_i - \mu_k)^2}{2\sigma_k^2}\right]^{I(l_i,k)} + \sum_{j\in\partial i} \beta_i\, I(l_i, l_j) \tag{23}$$
The spatial statistical dependence among pixel images is one of the funda-
mental issues in the mathematical formulation for image analysis. In our approach, we tackle the problem as follows. For the purpose of tissue quantification, maximum likelihood (ML) estimation based on the SFNM model given in Eq. (17) is used. In Ref. 38 we prove a convergence theorem showing that, when the pixel images are asymptotically independent, the parameter estimates based on Eq. (17) converge to their true values with probability 1 for a sufficiently large N. For the purpose of image segmentation, the maximum posterior probability (MAP) approach is employed to update l according to Eq. (20). Since in the formulation we allow the β_i to be adjustable, the assignment of their values will incorporate the unified correlation among both the pixel and context images. Our tests with a wide class of simulated and real data demonstrate the plausibility of the approach.

3.3 Model Identification

Once the model is chosen, identification addresses the estimation of the local region parameters (π_k, μ_k, σ_k, k = 1, . . . , K) and the structural parameters (K, α). In particular, the estimation of the order parameter K is referred to as model order selection.

3.3.1 Parameter Estimation
With an appropriate system likelihood function, the objective of model identification is to estimate the model parameters by maximizing the likelihood function, or equivalently minimizing the relative entropy between the image histogram p_x(u) and the estimated pdf p_r(u), where u is the gray level [4,39]. There are a number of approaches to perform the ML estimation of finite mixture distributions [47]. The most popular method is the expectation-maximization (EM) algorithm [40,41]. The EM algorithm first calculates the posterior Bayesian probabilities of the data based on the observations and the current parameter estimates (E-step). Then it updates the parameter estimates using generalized mean ergodic theorems (M-step). The procedure moves back and forth between these two steps. The successive iterations increase the likelihood of the model parameters being estimated. A neural network interpretation of this procedure is given in Ref. 42. We can use the relative entropy (the Kullback–Leibler distance) [43] for parameter estimation, i.e., we can measure the information-theoretic distance between the histogram of the pixel images, denoted by p_x, and the estimated distribution p_r(u), which we define as the global relative entropy (GRE):

$$D(p_x \,\|\, p_r) = \sum_{u} p_x(u)\log\frac{p_x(u)}{p_r(u)} \tag{24}$$
It can be shown that when relative entropy is used as the distance measure,
distance minimization is equivalent to the maximum likelihood (ML) estimation of the model parameters [39,4]. For the case of the FGGM model, the EM algorithm can be applied to the joint estimation of the parameter vector and the structural parameter α as follows [40]:

EM Algorithm:

1. For α = α_min, . . . , α_max, m = 0, given initialized r^(0):

   E-step: for i = 1, . . . , N, k = 1, . . . , K, compute the probabilistic membership

   $$z_{ik}^{(m)} = \frac{\pi_k^{(m)} p_k(x_i)}{\sum_{k=1}^{K} \pi_k^{(m)} p_k(x_i)} \tag{25}$$

   M-step: for k = 1, . . . , K, compute the updated parameter estimates

   $$\pi_k^{(m+1)} = \frac{1}{N}\sum_{i=1}^{N_1 N_2} z_{ik}^{(m)} \qquad \mu_k^{(m+1)} = \frac{1}{N\pi_k^{(m+1)}}\sum_{i=1}^{N} z_{ik}^{(m)} x_i \qquad \sigma_k^{2(m+1)} = \frac{1}{N\pi_k^{(m+1)}}\sum_{i=1}^{N} z_{ik}^{(m)}\,(x_i - \mu_k^{(m+1)})^2 \tag{26}$$

   When |GRE^(m)(p_x ∥ p_r) − GRE^(m+1)(p_x ∥ p_r)| ≤ ε is satisfied, go to Step 2; otherwise set m = m + 1 and go to the E-step.

2. Compute the GRE, and go to Step 1.

3. Choose the optimal r̂ that corresponds to the minimum GRE.
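The sketch below is one possible implementation of this EM iteration for the SFNM (α = 2) case, with the GRE of Eq. (24), evaluated on the gray-level histogram, used to monitor convergence. The initialization (equal-size chunks of the sorted data), the histogram bin count, and the stopping threshold are our own simplifications; the chapter initializes with the ALMHQ algorithm.

```python
import numpy as np

# Compact sketch of Eqs. (25)-(26) for the SFNM (alpha = 2) case, with the GRE
# of Eq. (24) between the gray-level histogram and the fitted mixture used as
# the convergence monitor. Initialization and thresholds are illustrative.
def sfnm_em(x, K, n_iter=100, eps=1e-6):
    x = np.asarray(x, dtype=float)
    N = x.size
    chunks = np.array_split(np.sort(x), K)           # crude initialization
    pi = np.array([c.size / N for c in chunks])
    mu = np.array([c.mean() for c in chunks])
    var = np.array([c.var() + 1e-6 for c in chunks])

    hist, edges = np.histogram(x, bins=64, density=True)
    u = 0.5 * (edges[:-1] + edges[1:])

    def mixture(u):
        return np.sum(pi / np.sqrt(2 * np.pi * var)
                      * np.exp(-(u[:, None] - mu) ** 2 / (2 * var)), axis=1)

    gre_old = np.inf
    for _ in range(n_iter):
        # E-step, Eq. (25): posterior membership z_ik
        p = pi / np.sqrt(2 * np.pi * var) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        z = p / p.sum(axis=1, keepdims=True)
        # M-step, Eq. (26): generalized mean ergodic updates
        Nk = z.sum(axis=0)
        pi = Nk / N
        mu = (z * x[:, None]).sum(axis=0) / Nk
        var = (z * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        # GRE, Eq. (24), approximated on the histogram support
        pr = mixture(u)
        mask = (hist > 0) & (pr > 0)
        gre = np.sum(hist[mask] * np.log(hist[mask] / pr[mask])) * (u[1] - u[0])
        if abs(gre_old - gre) <= eps:
            break
        gre_old = gre
    return pi, mu, var, gre

# usage: pi, mu, var, gre = sfnm_em(pixels.ravel(), K=3)
```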
However, the EM algorithm generally has the reputation of being slow, since it has a first-order convergence in which new information acquired in the expectation step is not used immediately [44]. Recently, a number of online versions of the EM algorithm were proposed for large-scale sequential learning (e.g., see [4,45–48]). Such a procedure obviates the need to store all the incoming observations, and changes the parameters immediately after each data point, allowing for high data rates. Titterington et al. [47] developed a stochastic approximation procedure that is closely related to the probabilistic self-organizing mixtures (PSOM) algorithm we are introducing here and show that the solution can be consistent. Other similar formulations are due to Marroquin and Girosi [45] and Weinstein et al. [48].
For the adaptive estimation of the SFNM model parameters, we can derive an incremental learning algorithm by simple stochastic gradient descent minimization of D(p_x ∥ p_r) [4,49] given in Eq. (24), with p_r given by Eq. (15):

$$\mu_k^{(t+1)} = \mu_k^{(t)} + a(t)\,(x_{t+1} - \mu_k^{(t)})\, z_{(t+1)k}^{(t)} \tag{27}$$

$$\sigma_k^{2(t+1)} = \sigma_k^{2(t)} + b(t)\left[(x_{t+1} - \mu_k^{(t)})^2 - \sigma_k^{2(t)}\right] z_{(t+1)k}^{(t)} \qquad k = 1, \ldots, K \tag{28}$$

where a(t) and b(t) are introduced as the learning rates, two sequences converging to zero and ensuring unbiased estimates after convergence. For details of the derivation and approximations, see Refs. 4 and 9. Based on a generalized mean ergodic theorem [50], updates can also be obtained for the constrained regularization parameters π_k in the SFNM model. For simplicity, given an asymptotically convergent sequence, the corresponding mean ergodic theorem, i.e., the recursive version of the sample mean calculation, should hold asymptotically. Thus we define the interim estimate of π_k as in Ref. 51:

$$\pi_k^{(t+1)} = \frac{t}{t+1}\,\pi_k^{(t)} + \frac{1}{t+1}\, z_{(t+1)k}^{(t)} \tag{29}$$

Hence the updates given by Eqs. (27), (28), and (29), together with an evaluation of Eq. (25) using Eq. (15), provide the incremental procedure for computing the SFNM component parameters. Their practical use, however, requires the existence of a strong mixing condition and a decaying annealing procedure (learning rate decay) [50,52,53]. In finite mixture parameter estimation, the algorithm initialization must be chosen carefully and appropriately. In Ref. 51, an adaptive Lloyd–Max histogram quantization (ALMHQ) algorithm is introduced for threshold selection, which is also well suited to initialization in ML estimation. It can be used for initializing the network parameters μ_k, σ_k², and π_k, k = 1, 2, . . . , K.
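For comparison with the batch EM procedure, a hedged sketch of the incremental updates of Eqs. (27)–(29) follows; the decaying learning-rate schedules a(t) and b(t) are illustrative choices only, since specific sequences are not prescribed here.

```python
import numpy as np

# Sketch of the incremental (online) SFNM updates of Eqs. (27)-(29); the
# learning-rate schedules below are illustrative choices that decay to zero.
def online_sfnm_step(x_new, t, pi, mu, var):
    """One stochastic-gradient update using the single new pixel x_new."""
    a_t = 1.0 / (t + 10.0)          # hypothetical decaying learning rates
    b_t = 1.0 / (t + 10.0)
    # posterior membership of the new sample, Eq. (25) with the current model
    p = pi / np.sqrt(2 * np.pi * var) * np.exp(-(x_new - mu) ** 2 / (2 * var))
    z = p / p.sum()
    mu_new  = mu  + a_t * (x_new - mu) * z                      # Eq. (27)
    var_new = var + b_t * ((x_new - mu) ** 2 - var) * z         # Eq. (28)
    pi_new  = (t / (t + 1.0)) * pi + z / (t + 1.0)              # Eq. (29)
    return pi_new, mu_new, var_new

# usage: stream the pixels one at a time
# for t, x_new in enumerate(pixels.ravel()):
#     pi, mu, var = online_sfnm_step(x_new, t, pi, mu, var)
```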
3.3.2 Model Order Selection
The determination of the region parameter K directly affects the quality of the resulting model parameter estimation and in turn affects the result of segmentation. In a statistical problem formulation such as the one introduced in the previous section, the use of information-theoretic criteria for the problem of model determination arises as a natural choice. Two popular approaches are Akaike's information criterion (AIC) [55] and Rissanen's minimum description length (MDL) [56]. Akaike proposes the selection of the model that gives the minimum AIC, which is defined by

$$\mathrm{AIC}(K_a) = -2\log(\mathcal{L}(\hat r_{ML})) + 2K_a \tag{30}$$

where r̂_ML is the maximum likelihood estimate of the model parameter set
r, and K_a is the number of free adjustable parameters in the model [17,55], given by 3K − 1 for the SFNM model. The AIC selects the correct number of the image regions K₀ such that

$$K_0 = \arg\left\{ \min_{1\le K_a\le K_{MAX}} \mathrm{AIC}(K_a) \right\} \tag{31}$$
Rissanen addresses the problem from a quite different point of view. He reformulates the problem explicitly as an information coding problem in which the best model fit is measured such that high probabilities are assigned to the observed data, while at the same time the model itself is not too complex to be described [56]. The model is selected by minimizing the total description length defined by

$$\mathrm{MDL}(K_a) = -\log(\mathcal{L}(\hat r_{ML})) + 0.5\,K_a \log(N) \tag{32}$$

Similarly, the correct number of distinctive image regions K₀ can be estimated as

$$K_0 = \arg\left\{ \min_{1\le K_a\le K_{MAX}} \mathrm{MDL}(K_a) \right\} \tag{33}$$
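A short sketch of how Eqs. (30)–(33) can be applied in practice is given below; it reuses the sfnm_em routine sketched earlier (our own helper, not part of the chapter) to fit each candidate K and then selects the order minimizing AIC or MDL, with K_a = 3K − 1 free parameters as stated above.

```python
import numpy as np

# Sketch of AIC/MDL order selection, Eqs. (30)-(33), using the sfnm_em helper
# defined earlier in this section (an illustrative routine, not the chapter's).
def select_order(x, K_max=6):
    x = np.asarray(x, dtype=float).ravel()
    N = x.size
    results = {}
    for K in range(1, K_max + 1):
        pi, mu, var, _ = sfnm_em(x, K)
        p = np.sum(pi / np.sqrt(2 * np.pi * var)
                   * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)), axis=1)
        loglik = np.sum(np.log(p + 1e-300))
        Ka = 3 * K - 1                              # free parameters of the SFNM
        aic = -2.0 * loglik + 2.0 * Ka              # Eq. (30)
        mdl = -loglik + 0.5 * Ka * np.log(N)        # Eq. (32)
        results[K] = (aic, mdl)
    K_aic = min(results, key=lambda K: results[K][0])   # Eq. (31)
    K_mdl = min(results, key=lambda K: results[K][1])   # Eq. (33)
    return K_aic, K_mdl, results
```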
It is also worth noting that Schwarz [57] arrives at the same formulation as Rissanen by using Bayesian arguments. A more recent formulation of an information-theoretic criterion, the minimum conditional bias and variance (MCBV) criterion [4,58], selects a minimum conditional bias and variance model, i.e., if two models are about equally likely, MCBV selects the one whose parameters can be estimated with the smallest variance. The formulation is based on the fundamental argument that the value of the structural parameter cannot be arbitrary or infinite, because such an estimate might be said to have a low "bias" but the price to be paid is a high "variance" [25]. Since the joint maximum entropy is a function of K_a and r̂, by taking advantage of the fact that model estimation is separable into components and structure, we define the MCBV criterion as

$$\mathrm{MCBV}(K) = -\log(\mathcal{L}(\mathbf{x}\,|\,\hat r_{ML})) + \sum_{k=1}^{K_a} H(\hat r_k^{ML}) \tag{34}$$

where −log(ℒ(x | r̂_ML)) is the conditional bias (a form of information-theoretic distance) [59,50] and Σ_{k=1}^{K_a} H(r̂_k^{ML}) is the conditional variance (a measure of model uncertainty) [59,53] of the model. As both of these terms represent natural estimation errors about their true models, they can be treated on an equal basis. A minimization of the expression in Eq. (34) leads to the following characterization of the optimum estimation:
$$K_0 = \arg\left\{ \min_{1\le K\le K_{MAX}} \mathrm{MCBV}(K) \right\} \tag{35}$$

That is, if the cost of model variance is defined as the entropy of the parameter estimates, the cost of adding new parameters to the model must be balanced by the reduction they permit in the ideal code length for the reconstruction error. A practical MCBV formulation with a code-length expression is further given by [50,58]
$$\mathrm{MCBV}(K) = -\log(\mathcal{L}(\mathbf{x}\,|\,\hat r_{ML})) + \sum_{k=1}^{K_a} \frac{1}{2}\log\bigl(2\pi e\,\mathrm{Var}(\hat r_k^{ML})\bigr) \tag{36}$$
where the calculation of H(r̂_k^{ML}) requires the estimation of the true ML model parameter values. It is shown that, for a sufficiently large number of observations, the accuracy of the ML estimation quickly approaches the best possible result as determined by the Cramer–Rao lower bounds (CRLBs) [53]. Thus the CRLBs of the parameter estimates are used in the actual calculation to represent the "conditional" bias and variance [54]. We have found experimentally that the MCBV formulation for determining the value of K₀ exhibits very good performance, consistent with both the AIC and MDL criteria. It should be noted, however, that these are not the only plausible approaches to the problem of order selection; other approaches such as cross-validation techniques may also be quite useful [60–64].

3.3.3 Segmentation
Image segmentation is a technique for partitioning the image into meaningful regions corresponding to different objects. It may be considered to be a clustering process where the pixels are classified into attributed tissue types according to their gray-level values and spatial correlation. More precisely, image segmentation addresses the realization of context images l, given the observed pixel images x [36]. Based on the inhomogeneous MRF model given by Eq. (20), there are several approaches for performing the pixel classification. For example, we can use ML classification to maximize directly the individual likelihood function of the pixel images, i.e., the first term in Eq. (20), by searching the optimum l where the true pixel labels l* are considered to be functionally independent constants [36]. The major problem associated with this approach is that the classification error is high when the observed pixel images are noisy. However, it may well function as an initial solution, since the classification error is spatially uniformly distributed. By considering both terms in Eq. (20), we derive the modified iterated conditional modes (MICM) algorithm to search for a MAP image segmentation. The structure of this relaxation labeling procedure is based on two basic considerations, (1) the decomposition of the global computation
scheme into a network performing simple local computations, and (2) the use of suitable local context regularities in resolving ambiguities [36]. Let ε denote the expected segmentation error, i.e., the posterior cost of misclassification given the pixel images x, based on the following equivalence:

$$\arg\max_{k}\, P(l_i = k,\, l_{S\setminus i}\,|\,\mathbf{x}) = \arg\max_{k}\, p(l_i = k\,|\, l_{S\setminus i}, \mathbf{x}) \tag{37}$$
where l_{S∖i} denotes the pixel labels of all pixels except pixel i. References 13 and 37 show that choosing the labeling that minimizes ε is equivalent to maximizing the marginal posterior distribution, so that label l_i is updated to satisfy

$$p(l_i^{(m+1)}\,|\,\mathbf{x}, l_{S\setminus i}) \ge p(l_i^{(m)}\,|\,\mathbf{x}, l_{S\setminus i}) \tag{38}$$

for all pixels. Furthermore, by imposing the Markovian constraint, we have the following relationship [13,37]:

$$p(l_i\,|\,\mathbf{x}, l_{S\setminus i}) \propto p(x_i\,|\,l_i)\, p(l_i\,|\,l_{\partial i}) \propto p(l_i = k\,|\,x_i, l_{\partial i}) \tag{39}$$
That is,

$$\arg\max_{k}\, p(l_i\,|\,\mathbf{x}, l_{S\setminus i}) = \arg\max_{k}\, p(x_i\,|\,l_i)\,p(l_i\,|\,l_{\partial i}) = \arg\max_{k}\, p(l_i\,|\,x_i, l_{\partial i}) \tag{40}$$

Hence, based on the inhomogeneous hidden MRF formulation, the MICM algorithm is constructed as a computationally feasible alternative to obtain an MAP solution. From Eq. (22), we can show that maximizing the conditional probability is equivalent to minimizing the energy function U(l_i | l_{∂i}, x). That is, pixel i will be classified into the kth region if

$$l_i = \arg\min_{k}\, U(l_i = k\,|\,l_{\partial i}, \mathbf{x}) \tag{41}$$
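A simplified sketch of one MICM relaxation sweep implementing Eq. (41) is shown below. For brevity the Markov parameter is a single constant β rather than the shift-variant β_i of Eq. (47), and the indicator I(·,·) is read as an indicator of label disagreement so that smooth labelings have low energy; both are assumptions about conventions, not statements of the authors' exact implementation.

```python
import numpy as np

# One random-visit relaxation sweep of Eq. (41) with the local energy of
# Eq. (23); beta is taken constant here and I(.,.) counts label disagreements
# (assumed conventions for this sketch).
def micm_sweep(image, labels, mu, var, beta=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    H, W = image.shape
    K = mu.size
    for idx in rng.permutation(H * W):
        i, j = divmod(int(idx), W)
        # likelihood term: 0.5*ln(sigma_k^2) + (x - mu_k)^2 / (2*sigma_k^2)
        u_like = 0.5 * np.log(var) + (image[i, j] - mu) ** 2 / (2.0 * var)
        # neighborhood term: count the 8-neighbor labels around pixel (i, j)
        counts = np.zeros(K)
        n_nbrs = 0
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (di or dj) and 0 <= ni < H and 0 <= nj < W:
                    counts[labels[ni, nj]] += 1
                    n_nbrs += 1
        u_prior = beta * (n_nbrs - counts)       # disagreements with each class
        labels[i, j] = int(np.argmin(u_like + u_prior))
    return labels
```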
The MICM algorithm uses a specified number of iterations by randomly visiting all pixels. At each single cycle in the iterations, for the update of l_i we have

$$P(\mathbf{l}\,|\,\mathbf{x}) = p(l_i\,|\,l_{S\setminus i}, \mathbf{x})\, P(l_{S\setminus i}\,|\,\mathbf{x}) \tag{42}$$
which never decreases, hence assuring eventual convergence [13,37]. This approach demonstrates how a network of discrete units can be used to search for an optimal solution in a problem where the incorporation of context constraints is important. Performing image segmentation based on the inhomogeneous MRF model requires assignment of the Markov parameter β_i [13]. An important
area in image analysis that has recently been emphasized is the use of patient-specific site models to guide various medical image analysis tasks [14]. Hence, in the assignment of the Markov parameter, we first construct a patient-specific probabilistic atlas directly from the interim context images (e.g., after ML classification), where the entropy rate distribution is used as a measure of the pixel label dependence, and then assign the Markov parameter values based on this model through a predefined lookup table. This approach has a clear physical interpretation, since a lower entropy rate means higher dependence, and the probabilistic atlas coincides with the boundary allocation [31]. For a stationary random process l₁, l₂, . . . , l_N, the entropy rate H(ℒ) is defined as the rate of entropy growth [50]. When dealing with a stationary Markov chain, the entropy rate can be calculated by

$$H(\mathcal{L}) = \lim_{N\to\infty} H(l_N\,|\,l_{N-1}, \ldots, l_1) = \lim_{N\to\infty} H(l_N\,|\,l_{N-1}) = H(l_2\,|\,l_1) \tag{43}$$

Assume that the inhomogeneous MRF is locally stationary. Then the local entropy rate, i.e., the entropy rate distribution, is given by

$$H(\mathcal{L}_i) = \sum_{l_{\partial i}} P(l_{\partial i})\, H(l_i\,|\,l_{\partial i}) \tag{44}$$
Since the probability distribution of l_{∂i} is generally unavailable, we use a first-order stochastic approximation to estimate the entropy rate [31,53]:

$$H(\mathcal{L}_i) \approx H(l_i\,|\,l_{\partial i}) = -\sum_{k=1}^{K} p(l_i = k\,|\,l_{\partial i}) \log p(l_i = k\,|\,l_{\partial i}) \tag{45}$$
By opening a 3 × 3 neighborhood window, the conditional probability is given by

$$p(l_i = k\,|\,l_{\partial i}) = \sum_{j\in\partial i} \frac{I(l_j, k)}{8} \tag{46}$$
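The following fragment sketches Eqs. (45)–(46): for each interior pixel, the conditional label probabilities are estimated from its 3 × 3 neighborhood, and the local entropy rate is the entropy of that empirical distribution. Skipping boundary pixels is a simplification of ours.

```python
import numpy as np

# Sketch of Eqs. (45)-(46): empirical neighborhood label probabilities and the
# resulting local entropy rate for each interior pixel of a label map.
def entropy_rate_map(labels, K):
    H, W = labels.shape
    ent = np.zeros((H, W))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            nbrs = labels[i - 1:i + 2, j - 1:j + 2].ravel().tolist()
            nbrs.pop(4)                                  # drop the central pixel
            p = np.bincount(nbrs, minlength=K) / 8.0     # Eq. (46)
            p = p[p > 0]
            ent[i, j] = -np.sum(p * np.log(p))           # Eq. (45)
    return ent
# Uniform neighborhoods give zero entropy rate (strong dependence), while mixed
# neighborhoods near region boundaries give larger values; Eq. (47) then maps
# low entropy to a large Markov parameter beta_i.
```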
It can be shown that when l_{∂i} represents a uniform block, the corresponding entropy rate is zero. Finally, we use the following lookup table to assign the Markov parameter values:

$$\beta_i = \frac{\alpha}{H(\mathcal{L}_i) + \gamma} \tag{47}$$

where α is the scale factor and γ is the shifting offset; these coefficients are empirically determined based on our experience. We refer to the plot of β_i as the patient-specific probabilistic atlas, or the site model. Finally, we introduce, in addition to the iterated update of the context images l using Eq. (41) and the subsequent modification of the Markov parameter according to the
lookup table of Eq. (47), a fine tuning of the tissue conditional likelihood densities through a "classification-based learning" scheme [38]. The classification-based learning method uses the misclassified pixels to adjust the tissue density functions, which were previously estimated using the EM algorithm, so that a minimum classification error can be achieved. Integrated into the MICM algorithm, the reinforced and anti-reinforced learning rules are used to update the tissue quantities μ_k and σ_k² in the first term of Eq. (20) at each iteration:

Reinforced learning:

$$\mu_{l_i^{(n+1)}}^{(n+1)} = \mu_{l_i^{(n+1)}}^{(n)} + \eta\,(x_i - \mu_{l_i^{(n+1)}}^{(n)}) \qquad \sigma_{l_i^{(n+1)}}^{2(n+1)} = \sigma_{l_i^{(n+1)}}^{2(n)} + \eta\left[(x_i - \mu_{l_i^{(n+1)}}^{(n)})^2 - \sigma_{l_i^{(n+1)}}^{2(n)}\right]$$

Anti-reinforced learning:

$$\mu_{l_i^{(n)}}^{(n+1)} = \mu_{l_i^{(n)}}^{(n)} - \eta\,(x_i - \mu_{l_i^{(n)}}^{(n)}) \qquad \sigma_{l_i^{(n)}}^{2(n+1)} = \sigma_{l_i^{(n)}}^{2(n)} - \eta\left[(x_i - \mu_{l_i^{(n)}}^{(n)})^2 - \sigma_{l_i^{(n)}}^{2(n)}\right] \tag{48}$$
where the misclassified pixels are those with l_i^{(n+1)} ≠ l_i^{(n)}, and η is the learning rate constant, which is typically chosen to be small.
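A sketch of how the classification-based learning rules of Eq. (48) might be applied after an MICM sweep is given below; the learning rate value and the small variance floor are illustrative assumptions of ours.

```python
import numpy as np

# Sketch of Eq. (48): pixels whose label changed in the latest MICM sweep
# reinforce their new class and anti-reinforce their previous class.
def classification_based_learning(image, old_labels, new_labels, mu, var, eta=0.01):
    changed = np.argwhere(old_labels != new_labels)
    for i, j in changed:
        x = image[i, j]
        k_new, k_old = new_labels[i, j], old_labels[i, j]
        # reinforced learning for the newly assigned class
        d_new = x - mu[k_new]
        mu[k_new]  += eta * d_new
        var[k_new] += eta * (d_new ** 2 - var[k_new])
        # anti-reinforced learning for the previous (misclassified) class
        d_old = x - mu[k_old]
        mu[k_old]  -= eta * d_old
        var[k_old] -= eta * (d_old ** 2 - var[k_old])
        var[k_old] = max(var[k_old], 1e-6)   # keep the variance positive
    return mu, var
```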
In Eq. (1), |F(ν)| and |F(ν_b)| are the amplitudes of the Fourier coefficients at the frequency ν and at the base frequency ν_b, respectively. While φ represents the phase angle of the Fourier coefficient at the base frequency of each voxel, φ_p describes the phase angle of the Fourier coefficient at the base frequency of the voxel with the maximal ratio

$$\frac{|F(\nu_b)|^2}{\sum_{\nu>0} |F(\nu)|}$$

This ratio directly indicates how strong the activation is (at the base frequency of the paradigm), while the cos(φ − φ_p) term represents the degree of phase coherence between the temporal signal of a voxel and the strongest activation signal in the whole data set. In fact, the phase angle of the strongest activation signal only serves as a reference phase angle for computing the activation map. Although the phase angles of other voxels can also be selected as the reference, the use of the phase angle of the voxel with the strongest activation signal seems to be more appropriate because of its high SNR. By rotating the phase angle, a deactivation map (with a 180° phase angle shift) can be created very easily:

$$\mathrm{Deactivation}(x, y, z) = \frac{|F(\nu_b)|^2}{\sum_{\nu>0} |F(\nu)|}\,\cos(\varphi - \pi + \varphi_p) \tag{2}$$
A 3-D activation or deactivation map can be generated by computing the measure Activation(x, y, z) or Deactivation(x, y, z) for each voxel in the 3-D volume fMRI data, respectively. In contrast to our method, the spectral density method [8] uses the spectral density at the base frequency |F(ν_b)| directly to detect the brain activation. According to our experiments, the spectral density at the base frequency is sensitive to the misregistration in fMRI data. In addition, it tends to have higher values at the image contours and boundaries with great intensity changes. By introducing a relative measure (ratio)

$$\frac{|F(\nu_b)|}{\sum_{\nu>0} |F(\nu)|}$$
in addition to the spectral density at the base frequency 兩F(b)兩, more robust detection results can be achieved. We have applied our frequency analysis method with baseline correction to several fMRI data sets. All of them have shown similar results. In comparison with the standard t-test method [13], our method has shown significant performance improvement of fMRI activation signal detection. Figure 5(a) shows a slice of the fMRI data at a time instant with each volumetric MRI data containing 16 slices; the resolution for each slice is 64 ⫻ 64 pixels per. This fMRI data was acquired using an EPI pulse sequence by a Siemens MR Scanner. Figure 5 also depicts the activation maps at this slice obtained using our method, the t-test method, and the spectral density method [8]. The activation maps for all the 16 slices obtained by using the frequency analysis technique and the t-test method are shown in Fig. 6. The signal-to-noise ratio (SNR, defined as the highest activation measure in a map over the sum of activation values of all voxels) in the activation map produced by our method is about three times that of the t-test method. This means that our method is more powerful in discriminating active signals from inactive signals, which is a very desirable feature for many applications where the activation signal is weak. While the activation map produced by our method is consistent with that of the t-test method, the spectral density method failed to indicate the strongest activation at the same place and in the same slice. The spectral density method, in our experiments, tends to have high false activation values at the boundaries of the brain image. Our frequency analysis method with baseline correction has the following main advantages: (1) It does not require explicit knowledge of the activation model signal; only the paradigm used for the data acquisition is needed. (2) It is not sensitive to the baseline drift, which has been corrected before the Fourier transformation is applied. (3) It is insensitive to highfrequency noise caused by heartbeat and respiration. This is mainly because we use the sum of all frequency components, which provide robustness in
FIGURE 5 (a) A slice of the fMRI data at a time instant. (b–d) The activation maps of the slice computed using (b) our method, (c) the spectral density method, and (d) the t-test method.
the computation of activation measures. (4) In general it provides a much better SNR than the commonly used t-test method. Our experiments show that an increase of more than 300% in the SNR can be achieved. (5) The algorithm executes as fast as the conventional t-test method. A limitation of our current algorithm is the assumption that paradigms used for fMRI data acquisition are periodic. Further development is needed to extend our current method to nonperiodic paradigms.
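The frequency-domain measure above reduces to a few lines of array code. The following Python/NumPy sketch only illustrates the ratio-and-phase-coherence idea under the stated assumptions (baseline-corrected time courses, a known base-frequency index for the periodic paradigm, and a reference phase taken from the voxel with the strongest ratio); the function name and the exact normalization are ours, not the authors' implementation.

```python
import numpy as np

def activation_map(data, base_idx, deactivation=False):
    """Frequency-analysis activation (or deactivation) map.

    data     : 4-D array (x, y, z, t) of baseline-corrected time courses
    base_idx : index of the paradigm's base frequency in the rFFT
    Returns |F(nu_b)|^2 / sum_{nu>0} |F(nu)| times cos(phi - phi_p),
    with a 180-degree phase shift for the deactivation map.
    """
    F = np.fft.rfft(data, axis=-1)                 # positive-frequency spectra
    mag = np.abs(F)
    ratio = mag[..., base_idx] ** 2 / np.sum(mag[..., 1:], axis=-1)  # exclude DC
    phase = np.angle(F[..., base_idx])
    # reference phase: voxel with the strongest activation ratio
    ref = np.unravel_index(np.argmax(ratio), ratio.shape)
    phi_p = phase[ref]
    shift = np.pi if deactivation else 0.0
    return ratio * np.cos(phase - phi_p + shift)
```

A deactivation map is obtained from the same spectra by passing `deactivation=True`, which applies the 180° phase shift of Eq. (2).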
FIGURE 6 The computed activation maps obtained by using our frequency analysis technique (a) and the standard t-test method (b) for all the 16 slices.
4 ACTIVATION SIGNAL DETECTION USING LPCA
The principal component analysis (PCA) has also been applied to extract the activated regions and their time courses of activation [14–15] without explicit knowledge of the activation process. This method is based on the assumption that the activation, for the best separation, is orthogonal to other signal variations such as brain motion, physiological signals (heartbeat and respiration), and other uncorrelated noise. In many real applications, this assumption is not always valid. Since the PCA is computed from the entire MR images over time, it becomes difficult to detect the activation signal when the temporal variations of the fMRI data are different in different regions. One example is the different baseline drift caused by the partial volume effect in different spatial regions of the image. A more common approach [16] is the application of PCA to voxels identified from a priori statistical analysis [17], which again requires explicit knowledge of the activation paradigm. In this section, we present a novel local PCA (LPCA) algorithm for detecting fMRI activation signals. This algorithm does not assume any prior information for the shape of the model activation signal. Note that our LPCA method is very different in principle from the traditional PCA methods. In our LPCA algorithm, the PCA is applied to all the active and inactive segments extracted from the time course at each individual voxel. A lower-dimensional subspace is formed from some of the most dominant eigenvectors computed from PCA. The detection of activation is based on the degree of separation between the two clusters of active-period and inactive-period segments projected onto the linear eigensubspace. In contrast, the traditional methods apply the PCA globally to the entire fMRI data set, and the orthogonality assumption is applied between the activation signal and other signal components. In addition, in previous PCA methods, no use was made of paradigm information or the measure of separation between clusters formed by the signals in the active and inactive periods. In the proposed method, we extract the most dominant principal components for each voxel from the inactive-period segments as well as the active-period segments in the neighborhood of each voxel. This procedure is called local PCA or LPCA. After the computation of LPCA, an activation measure at each voxel is computed based on the distribution of the projections of the inactive-period temporal signals and the active-period temporal signals at this voxel onto the most dominant local principal components. The activation measure is large when the projections of the inactive-period signals are distributed at some distance from those of the active-period signals, and vice versa. For ease of exposition, the following discussion of our local PCA approach is focused on the analysis of the temporal signal for a single voxel,
although in general, a neighborhood of each voxel may contain several voxels in the computation of the LPCA. After partitioning the temporal signal for each voxel into active-period and inactive-period segments based on the given paradigm, each segment is represented by a vector s_k^(a) or s_k^(i), where the superscripts (a) and (i) stand for the active period and inactive period, respectively, and k is the index for the segment. Let all of the vectors s_k^(a) and s_k^(i) be appropriately chosen to be of the same size. Two separate clusters can be formed from the active-period segments s_k^(a) and the inactive-period segments s_k^(i). The degree of separation between these two clusters provides a measure of difference between the active-period segments and the inactive-period segments; thus it indicates the strength of the activation of this signal. By treating each cluster as samples from a probability distribution, we can employ statistical methods to determine the difference between the two distributions. Unfortunately, in the case of fMRI, the dimension of these two clusters, i.e., the size of the vectors s_k^(a) and s_k^(i), is normally larger than, or similar to, the number of samples in each cluster. It is very difficult to obtain a robust estimation of the second-order statistics from such a small number of samples. We usually need a number of samples several times larger than the dimension of the space to obtain a robust estimation of the second-order statistics, which are needed for determining the differences between two distributions. To resolve the above problem, we apply the PCA to reduce the dimension of the original space by projecting the vectors s_k^(a) and s_k^(i) onto a linear subspace formed by several of the most dominant components. Note that the PCA is performed by using all the vectors s_k^(a) and s_k^(i) at each voxel to obtain a compact representation for all the samples. Next we can obtain the two corresponding clusters in this projection space, and the statistics of these two probability distributions can then be estimated more reliably from the projected samples. In addition, the PCA provides the orthogonal principal components that decouple the correlation of the samples. Therefore we can assume that the new random variables defined by the new projection space are uncorrelated. This makes the estimation of the second-order statistics much easier, i.e., we only need to compute the mean and standard deviation for each principal component direction in the new projection space. Then we can apply Student's t-test for each principal component direction to obtain a measure of separation along this direction. The activation measure can be computed by combining the separation measures along all the directions in the projection space. This activation measure indicates the strength of the activation at the corresponding voxel. Figure 7 depicts an example of a temporal signal with weak activation. It is obvious that the two clusters for the active-period and inactive-period segments are mixed together in the projection space formed by the two most
FIGURE 7 An example of a temporal signal with weak activation is shown in (a) and (b), i.e., the partitioning of the signal into active-period segments and inactive-period segments. The three most dominant principal components computed from these segments are depicted in (c). The distributions of the two clusters in the projection space formed by the first and second principal components are shown in (d).
FIGURE 8 An example of a temporal signal with strong activation is shown in (a) and (b), i.e., the partitioning of the signal into active-period segments and inactive-period segments. The three most dominant principal components computed from these segments are depicted in (c). The distributions of the two clusters in the projection space formed by the first and second principal components are shown in (d).
dominant principal components, i.e., the first and second, as shown in the figure. Another example of a strong-activation signal is shown in Fig. 8. In this case, the two clusters of the active-period segments and inactive-period segments are nicely separated in the corresponding 2-D projection space. Note that our algorithm is not restricted to using only two principal components in the projection space. In this example, we use two principal components simply for visualization purposes. In fact, we use three principal components in the implementation of our LPCA algorithm. Next, we describe the details of our LPCA algorithm for fMRI activation detection. In our algorithm, we first segment the fMRI temporal signals into background signals and nonbackground signals by simple thresholding. The linear regression technique is subsequently applied to the nonbackground signals to correct the baseline drifts. Then the activation measure based on the local PCA is computed for the baseline-corrected nonbackground temporal signals. The flowchart of the local PCA algorithm is given in Fig. 9. The detailed procedure of our LPCA-based algorithm can be described as follows:

1. Apply background segmentation on the original 3-D volume MRI data. For the voxels classified into the background, we simply assign the activation measure values to zero and do not include them in further processing. The background segmentation is performed by a simple thresholding procedure.
2. Perform the linear regression procedure for the temporal signal at each voxel to obtain the baseline-corrected temporal signal for further processing.
3. Partition the baseline-corrected temporal signal f(x, y, z, t) at each voxel (x, y, z) into active and inactive segments based on the paradigm of the fMRI procedure. Each segment is taken from the center part of each active or inactive period with equal length by discarding the beginning and the ending portion of the signal, since they correspond to the transition periods in the paradigm and are sensitive to small errors in the given paradigm information. All the segments are taken with equal length for later local PCA computation.
4. Use all of the inactive-period as well as active-period segments at each voxel to perform the local principal component analysis. However, the local principal component analysis can also be applied to all active and inactive segments in the neighborhood of each voxel. For example, all segments from the eight nearest neighbor voxels in each slice can be used in the local principal component analysis.
5. Take the most dominant M principal components for each voxel computed from the previous step and project its inactive-period segments
FIGURE 9 Flowchart of our LPCA algorithm for detecting activation signals in fMRI.
and active-period segments onto a subspace formed by these M principal components.
6. Compute the means and the standard deviations from the distributions of the projections computed in the previous step for the inactive-period segments and active-period segments. Then compute the activation measure at each voxel (x, y, z) as follows:
Activation(x, y, z) = (‖f^(a)‖ − ‖f^(i)‖) · Σ_{j=1}^{M} |m_j^(a)(x, y, z) − m_j^(i)(x, y, z)| / √( [σ_j^(a)(x, y, z)]² + [σ_j^(i)(x, y, z)]² )    (3)
where the symbol ‖ ‖ denotes the two-norm operator; the symbols m and σ stand for mean and standard deviation, respectively; the superscripts (i) and (a) denote inactive and active periods, respectively; and the subscript j is the index for the principal component. Note that the first multiplication factor for the entire summation on the right-hand side of Eq. (3) is the energy difference between the active-period signal and the inactive-period signal. This term is used because we anticipate that the energy in the active period will be larger than the energy in the inactive period for the activation signal. Inside the summation is a measure of separation between the two clusters of the active and inactive segments in a projected eigensubspace. This separation measure, or discrimination measure, is similar to the Student t-test measure for two different distributions of random variables in statistics [13,18]. We generalize this measure for two distributions in a multidimensional space by summing all separation measures computed in all the orthogonal directions. Although this summing combination assumes that the distributions are uncorrelated, it has the advantages of simpler computation and better reliability for small sample sizes compared with a combination based on the covariance matrices [19]. The above procedure finally provides a 3-D activation map from the fMRI data. The computed activation map can be interpreted as the likelihood of the neural activation caused by the designated stimulation at each voxel location. The multiplication of the two terms in the definition of the activation measure is a way of combining two different types of criteria for determining the degree of activation; one is the energy difference and the other is the separation between the two clusters formed by active-period and inactive-period segments. This combination of different activation measures can provide significant SNR (signal-to-noise ratio) improvement in the activation signal detection, which will be shown subsequently by experiments. We first apply the proposed algorithm to detect the neural activation in the fMRI for the finger tapping experiment. This fMRI data set is a time sequence of 3-D volume MRI data, acquired by a multislice EPI sequence with a TE of 0.056 seconds. At each time instant of data acquisition, a volume MRI of size 64 × 64 × 16 is acquired. The images have an isotropic spatial resolution of 3 mm. Each EPI sequence needs 0.1 seconds for acquiring each image, thus 1.6 seconds for each volume. The entire data set contains 61 acquisitions of such volume MRI data. The periodic acti-
vation paradigm of this finger tapping experiment contains four repetitive periods with each period containing a finger tapping interval followed by a rest interval. The time gap between two consecutive data acquisitions is about 3 seconds. The concatenation of 16 slices of the fMRI volumetric data into a single image for the finger tapping experiment at a time instant is shown in Figure 10(a). The activation maps computed by using the standard t-test method and the proposed local PCA algorithm are shown in Figs. 10(b) and 10(c), respectively. For a fair comparison, we apply the baseline correction procedure in both methods. In the implementation of the LPCA algorithm, we use three most dominant principal components in the computation of the activation maps in all the experiments. Comparing Figs. 10(b) and 10(c), we can see that the activation map produced by the proposed LPCA method has a better SNR than that produced by the t-test method. The SNR is defined to be the ratio of the largest normalized value to the mean of the normalized values at the nonbackground points. Here the normalization is performed to make the ranges of the activation values the same for different methods. The SNR of the activation map produced by our method is 5.62 times that of the t-test method. This indicates that the proposed LPCA method is very powerful in discriminating activation signals from baseline signals for this data set. Our experiment shows that the strongest activation in the above fMRI volumetric data was detected in the sixth slice. An image of this slice at a time instant is shown in Fig. 11(a). The activation maps for this slice computed by the conventional t-test method [13] and the proposed LPCA algorithm are illustrated in Figs. 11(b) and 11(c). It is clear that the activation map computed by the LPCA algorithm is more localized at the highest activation location than that of the t-test method. The temporal signal at the strongest activation site detected by our method is depicted in Fig. 11(d). The other fMRI experiments contain four different types of stimulation, i.e. motor stimulus, sensory stimulus, phasic pain, and tonic pain, under the same nonperiodic paradigm as shown in Fig. 12(b). Figure 12(a) depicts the concatenation of all the 16 slices of the volumetric MR images at a time instant for the motor stimulation. Once again, we apply the t-test method and the proposed LPCA method to all these experiments to detect the activation areas. For the LPCA method, we still use the three most dominant principal components in the computation of the activation measure. Since the principal component analysis requires the active segments and inactive segments extracted from the temporal signal to be of equal length, we take the central intervals of a fixed length from all active and inactive epochs based on the paradigm. The fixed length for the above segment extraction is determined by the shortest epoch in the paradigm.
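Steps 3–6 and Eq. (3) can be condensed into a short per-voxel routine. The Python/NumPy sketch below assumes the segments have already been cut to equal length from the baseline-corrected time course (segments from neighboring voxels may simply be appended); the function name, the SVD-based computation of the principal components, and the use of the Frobenius norm for the energy term are our illustrative choices rather than the authors' code.

```python
import numpy as np

def lpca_activation(active_segs, inactive_segs, M=3):
    """LPCA activation measure for one voxel (a sketch of Eq. (3)).

    active_segs   : (n_a, L) array of active-period segments
    inactive_segs : (n_i, L) array of inactive-period segments
    M             : number of dominant principal components (3 in the text)
    """
    X = np.vstack([active_segs, inactive_segs])      # all segments at this voxel
    Xc = X - X.mean(axis=0)                          # center the data
    # dominant principal components = leading right singular vectors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:M]                                       # (M, L)

    proj_a = active_segs @ W.T                       # projections, (n_a, M)
    proj_i = inactive_segs @ W.T                     # (n_i, M)

    # t-test-like separation along each principal direction, summed over j
    m_a, m_i = proj_a.mean(axis=0), proj_i.mean(axis=0)
    s_a, s_i = proj_a.std(axis=0), proj_i.std(axis=0)
    separation = np.sum(np.abs(m_a - m_i) / np.sqrt(s_a ** 2 + s_i ** 2))

    # energy-difference factor (||f_a|| - ||f_i||); Frobenius norm used here
    energy = np.linalg.norm(active_segs) - np.linalg.norm(inactive_segs)
    return energy * separation
```

Voxels labeled as background in step 1 are simply assigned an activation value of zero and skipped.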
FIGURE 10 (a) The concatenation of 16 slices of the fMRI volumetric data at one time instant during the finger tapping experiment. The activation maps are computed by using (b) the t-test method and (c) the proposed LPCA algorithm. The LPCA algorithm provides 5.62 times SNR improvement in the activation map over that of the t-test method.
FIGURE 11 (a) A slice of the MRI volume data. (b) The activation map obtained by using the conventional t-test method. (c) The activation map obtained by using our local PCA based method. (d) The temporal signal of the point with the highest activation map value.
The activation maps produced by the t-test method and the LPCA method for the motor experiment are shown in Figs. 13(a) and 13(b), respectively. The proposed LPCA method renders an activation map that is more localized at the activation location than that of the t-test method. The SNR of the activation map computed by the LPCA method is 8.31 times better than that of the t-test method. Figures 14(a) and 14(b) show the ac-
FIGURE 12 (a) The concatenation of 16 slices of the fMRI volumetric data at a time instant in the motor experiment. (b) The nonperiodic paradigm as used in four different fMRI experiments.
tivation maps for the sensory experiment using the t-test method and the LPCA method. For this experiment, the SNR of the LPCA method is 3.45 times better than that of the t-test method. Similarly, for the phasic pain and tonic pain experiments, the activation maps produced by the above two methods are shown in Figures 15 and 16, respectively. The LPCA method provides better SNR in the activation map than the t-test method, i.e., 3.47 times better for the phasic pain experiment and 3.88 times better for the tonic pain experiment. In general, our LPCA method provides substantial signal-to-noise ratio improvement in the detection of functional activation signals for all of the above experiments. This SNR improvement makes activation signal detection and localization much easier and more precise, especially when the actual activation signal is weak.
FIGURE 13 The activation maps for the motor experiment using (a) the t-test method and (b) the proposed LPCA method. The SNR in the activation map for the LPCA method is about 8.31 times the SNR of the t-test method.
FIGURE 14 The activation map for the sensory experiment using (a) the t-test method and (b) the proposed LPCA method. The SNR in the activation map for the LPCA method is about 3.45 times the SNR of the t-test method.
FIGURE 15 The activation map for the phasic pain experiment using (a) the t-test method and (b) the proposed LPCA method. The SNR in the activation map for the LPCA method is about 3.47 times the SNR of the t-test method.
FIGURE 16 The activation map for the tonic pain experiment using (a) the t-test method and (b) the proposed LPCA method. The SNR in the activation map for the LPCA method is about 6.24 times the SNR of the t-test method.
5 SUMMARY
In this chapter, we first reviewed the existing methods for fMRI analysis and activation detection. We also discussed a simple linear regression method to alleviate the baseline drift artifact due to the partial volume effect or small registration errors. In addition, two methods for brain activation signal detection were presented. One is a frequency analysis–based method and the other is the LPCA-based method. The experimental results demonstrated the better performance of the frequency analysis–based method compared to the conventional t-test method. Our LPCA method took advantage of the paradigm used in the fMRI experiment to partition the temporal signal into active segments and inactive segments. The PCA was subsequently applied to these segments to project each segment onto a lower-dimensional eigensubspace. An activation measure was defined based on the separation between two clusters formed by the active and inactive segments in the eigensubspace. We demonstrated by the use of several experiments that the proposed LPCA method provides, in general, a better SNR than that of the conventional t-test method. Unlike the conventional t-test and cross-correlation methods, the proposed LPCA algorithm does not rely on a model activation signal to detect brain activation from fMRI data. This is useful when some activation processes are so complicated that the model activation signal is inaccurate or unavailable. In addition, the LPCA algorithm should be insensitive to random noise. This is because the projection of signal segments onto an eigensubspace formed by a set of the most dominant principal components can alleviate the effect of random noise in the detection of activation signals.

REFERENCES
1. M. S. Cohen, S. Y. Bookheimer. Localization of brain function using magnetic resonance imaging. Trends Neuroscience 17, 268–277, 1994.
2. K. J. Friston, A. P. Holmes, K. J. Worsley, J. P. Poline, C. D. Frith, R. S. J. Frackowiak. Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2, 189–210, 1995.
3. K. J. Friston, A. Holmes, J.-B. Poline, C. J. Price, C. D. Frith. Detecting activations in PET and fMRI: levels of inference and power. Neuroimage 4, 223–235, 1995.
4. U. E. Ruttimann, M. Unser, R. R. Rawlings, D. Rio, N. F. Ramsey, V. S. Mattay, D. W. Hommer, J. A. Frank, D. R. Weinberger. Statistical analysis of functional MRI data in the wavelet domain. IEEE Trans. on Medical Imaging 17, no. 2, 142–154, 1998.
5. X. Descombes, F. Kruggel, D. Y. von Cramon. Spatio-temporal fMRI analysis using Markov random fields. IEEE Trans. on Medical Imaging 17, no. 6, 1028–1039, 1998.
6. F. Y. Nan, R. D. Nowak. Generalized likelihood ratio detection for fMRI using complex data. IEEE Trans. on Medical Imaging 18, no. 4, 320–329, 1999.
7. J. Kershaw, B. A. Ardekani, I. Kanno. Application of Bayesian inference to fMRI data analysis. IEEE Trans. on Medical Imaging 18, no. 12, 1138–1153, 1999.
8. A. Bandettini, A. Jesmanowicz, E. C. Wong, J. S. Hyde. Processing strategies for time-course data sets in functional MRI of the human brain. Magnetic Resonance in Medicine 30, 161–173, 1993.
9. M. Singh, J. Jeong, L. Al-Dayeh, T. Kim, P. Colletti. Cross-correlation techniques to identify activated pixels in a three-condition fMRI task. IEEE Trans. Nuclear Science 46, no. 3, 520–526, 1999.
10. H. Fischer, J. Hennig. Clustering of functional MR data. Proceedings of the International Society for Magnetic Resonance in Medicine, 1996, p. 1779.
11. G. Scarth, E. Moser, R. Baumgartner, M. Alexander, R. L. Somorjai. Paradigm-free fuzzy clustering-detected activations in fMRI: a case study. Proceedings of the International Society for Magnetic Resonance in Medicine, 1996, p. 1784.
12. K.-H. Chuang, M.-J. Chiu, C.-C. Lin, J.-H. Chen. Model-free functional MRI analysis using Kohonen clustering neural network and fuzzy c-means. IEEE Trans. Medical Imaging 18, no. 12, 1117–1128, 1999.
13. G. M. Hathout, K. A. T. Kirlew, G. J. K. So, D. R. Hamilton, J. X. Zhang, U. Sinha, S. Sinha, J. Sayre, D. Gozal, R. M. Harper, R. B. Lufkin. MR imaging signal response to sustained stimulation in human visual cortex. Journal of MRI 4, 537–543, 1994.
14. T. H. Le, Xiaoping Hu. Potential pitfalls of principal component analysis in fMRI. Proceedings of the International Society for Magnetic Resonance in Medicine, 1995, p. 820.
15. P. P. Mitra, D. J. Thomson, S. Ogawa, K. Hu, K. Ugurbil. Spatio-temporal patterns in fMRI data revealed by principal component analysis and subsequent low pass filtering. Proceedings of the International Society for Magnetic Resonance in Medicine, 1995, p. 817.
16. G. Scarth, R. L. Somorjai. Fuzzy clustering versus principal component analysis of fMRI. Proceedings of the International Society for Magnetic Resonance in Medicine, 1996, p. 1782.
17. K. Friston, C. D. Frith, P. F. Liddle, R. S. J. Frackowiak. Functional connectivity: the principal component analysis of large (PET) data sets. Journal of Cerebral Blood Flow and Metabolism 13, no. 1, 5–14, 1993.
18. W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery. Numerical Recipes in C. 2d ed. Cambridge University Press, 1992.
19. C. J. Huberty. Applied Discriminant Analysis. John Wiley, 1994.
15
Tagged MR Image Analysis
Amir A. Amini and Yasheng Chen
Washington University School of Medicine in St. Louis, St. Louis, Missouri
1 INTRODUCTION
The various tagged MR image acquisition approaches elucidated in Chap. 6 provide the means for efficient encoding of tag patterns within the myocardial tissue. However, methods are needed for quantitative analysis and extraction of the tag patterns in collected images in order to measure the motion of tag lines in 2-D and more generally, tag surfaces in 3-D. There has been a flurry of activity in this area. Young et al. [1] have proposed an analysis system based on snakes. First, tag data in different slices are tracked with snakes in order to recover tag intersection motion within the image slices. This snake approach minimizes an external energy that is the sum of intensities for each slice (i.e., assuming the tag points are the darkest image points), together with an internal energy that provides smooth displacements between snakes at successive time points. User interaction is provided through attachment of springs to various points on the snakes in order to guide the curves to fall into the correct local minima. Once the tag positions on the myocardium are found, coordinates of these points in each deformed image are determined within a volumetric finite element model fitted to endocardial and epicardial contours. To fit a 3-D finite element model (FEM) to the stripe displacements, an objective function is formed that minimizes the perpendicular distance of the model point to all of the corresponding
undeformed tag planes, thereby determining the FEM nodal parameters and deforming the FEM mesh [1]. Although the FEM model provides good local strain analysis, it results in a large number of model parameters. The work described in Ref. 2 considers geometric primitives that are generalizations of volumetric ellipsoids. The use of parameter functions in this context allows for spatial variations of aspect ratios of the model to fit the LV. The models are also further generalized to parameterize the twisting motion about the long axis of the LV as a function of distance along the long axis as well as the radial direction. The forces are computed from SPAMM data points. Contour points apply forces to the nodes of the closest triangles on the surface in the initial time frame. Each SPAMM intersection point applies a force proportional to the distance to the intersecting line of its two original and perpendicular undeformed tag planes. The forces are applied to model points by distributing forces to FEM nodes with a Lagrangian dynamics, physics-based formulation, and the model comes to rest when all the applied forces come to equilibrium. As the output, the recovered information includes parameter functions describing the aspect ratios of the model as well as parameters that quantify twisting about the long axis of the LV. Though the model provides useful information, it only utilizes information available at tag crossings. The work described in Ref. 3 extends the approach in Ref. 2 by incorporating information from the entire lengths of 1-D tags extracted from multiview SPAMM images, and the approach was applied to the modeling and analysis of the right ventricular shape and motion. Methods based on optical flow have also been applied to the analysis of tagged MR images. An approach called variable brightness optical flow (VBOF) accounts for temporal variation of signal intensities, thus generalizing Horn and Schunck’s optical flow constraint equation [4]. The primary assumption in deriving Horn and Schunck’s optical flow constraint is that the intensity of a material point does not change with time. Hence this approach is not directly applicable to tagged MR images. The drawback of VBOF as stated in Ref. 5 is that the algorithm requires knowledge of the parameters D0 (proton density), T1 , and T2 in the region of interest. A new algorithm described by Gupta and Prince [5] relaxes the intensity constancy constraint and allows for intensity variations to be modeled by a more accurate local linear transformation. The physics of MR is used to derive approximations for the intensity variations by assuming a local linear transformation of the intensity, no motion, and adequate temporal resolution that subsequently is utilized in a new optical flow algorithm requiring only approximate knowledge of T1 in the image. The approach in Ref. 6 for the analysis of radial tagged images determines the ventricular boundaries on a morphologically processed image. The
morphological operators fill in the tag points on the image. A graph-search technique then determines the optimal inner and outer boundaries of the myocardium. In order to determine the tag locations, tag profiles are simulated as a function of time, assuming linear gradients and a sinc RF pulse for generating the flip angle of rotation for samples as a function of distance along the gradient. The simulated profiles are subsequently used to find tag lines by determining the points one after the other in a sequence, using initial search starting points on the determined LV boundaries. The outer boundary serves as a stopping criterion for the tag localization procedure in short axis images. Parallel tags in long axis images of the ventricle are processed similarly to the short axis images. More recent research uses images from three sequences of parallel tags from segmented k-space imaging obtained at a number of times [7]. After processing the images as described in the previous paragraph for determining tag locations, the authors perform least-squares fitting of a truncated power series in the prolate spheroidal coordinate system on the whole of the myocardium in order to measure dense displacements. The steps in the fitting procedure include determining ⌬x, ⌬y, and ⌬z in reference to tag planes in the initial undeformed state. The gross affine linear fit is then performed in Cartesian coordinates and is subtracted from the initial displacement data to remove large bulk motions and linear stretches and shears. A fit is then performed in prolate spheroidal coordinates with a power series up to the linear term in the radial direction and up to the fifth order in the azimuthal and angular directions. This leads to a 50 parameter fit for each of the prolate spheroidal coordinate directions. An alternate approach to motion reconstruction developed by Denney and Prince [8] utilizes a multidimensional stochastic model (each component of the displacement field is modeled by a Brownian surface) for the true displacement field and the Fisher estimation framework to estimate displacement vectors in points on the lattice. The advantage of this framework is that an error covariance can be established that determines the number of tag lines needed to achieve a given estimation accuracy. In Refs. 9 and 10, tag lines are tracked with dynamic programming Bsnakes and B-snake grids. Radeva et al. [11] proposed a volumetric B-solid model to track tag lines concurrently in different image slices by implicitly defined B-surfaces that align themselves with tagged points. The solid is a 3-D tensor product B-spline whose isoparametric curves deform under image forces from tag lines in different image slices. For the motion reconstruction problem, an original solution was proposed in Refs. 9 and 12. In this formulation only information at tag crossings was utilized as part of the reconstruction algorithm. In ref. 10, the method was further extended to that of
constrained thin-plate reconstruction of the displacement field from points and lines based on a variational solution. The rest of this chapter is organized as follows: Sec. 2 gives a description of a cardiac motion simulator for validating tagged MR image analysis techniques. The simulator is capable of generating realistic deformations of the LV together with tagged images at arbitrary orientations. Section 3 describes coupled B-snake grids and techniques for thin-plate spline reconstruction of deformations. A key issue addressed in this section is the degree to which the use of tag intersections (as opposed to the use of entire tag lines) is sufficient to characterize the deformations of the left ventricle. Additionally, strain, which is a measure of local stretching, is discussed and computed for in vivo data. Finally, in Sec. 4, a SPAMM imaging protocol for noninvasive placement and tracking of myocardial beads with tagged MRI is described. Furthermore, in this section, efficient algorithms for the reconstruction of tag surfaces and localization of bead locations are provided.
2 CARDIAC MOTION SIMULATOR
An environment based on a 13-parameter kinematic model developed by Arts et al. [13] has been implemented as was described in Ref. 14 for simulating a time sequence of tagged MR images at arbitrary orientations. Based on user-selected discretization of the space between two concentric shells and by varying the canonical parameters of the model, both a sequence of tagged MR images and a ‘‘ground truth’’ vector field of actual material point deformations are available. A pair of prolate spheroids represents the endocardial and epicardial LV surfaces and provides a geometric model of the LV myocardium. The motion model involves the application of a cascade of incompressible linear transformations describing rigid as well as nonrigid motions. Once a discretization step is assumed and a mesh for tessellating the 3-D space is generated, linear matrix transformations are applied in a sequence to all the mesh points so as to deform the reference model. The parameters of the motion model, referred to as k-parameters, and the transformations to which they correspond, are given in Table 1. To illustrate the effect of varying the parameters, Figs. 1 and 2 show the deformations of an undeformed model when varying k2 and k4 . Figure 3 displays a different undeformed 3-D model and a deformed model when a number of k-parameters are simultaneously varied as listed in Table 2. In order to simulate MR images, an imaging plane intersecting the geometric model is selected, and tagged spin-echo imaging equations are
TABLE 1 Description of the Thirteen k Parameters
k1    Radially dependent compression
k2    Left ventricular torsion
k3    Ellipticalization in long axis planes
k4    Ellipticalization in short axis planes
k5    Shear in x direction
k6    Shear in y direction
k7    Shear in z direction
k8    Rotation about x-axis
k9    Rotation about y-axis
k10   Rotation about z-axis
k11   Translation in x direction
k12   Translation in y direction
k13   Translation in z direction
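The parameter functions themselves (and the incompressibility constraints) are specified in Refs. 13 and 14 and are not reproduced in this chapter, but the overall mechanism, a cascade of parameterized transformations applied to every mesh point between the two prolate spheroids, is easy to prototype. The Python sketch below is only a schematic stand-in: a torsion-like twist about the long axis plays the role of k2 and a rigid translation the role of k11–k13; the function names and functional forms are illustrative, not the model of Arts et al.

```python
import numpy as np

def twist_about_z(points, k2):
    """Stand-in for the torsion parameter: rotate each mesh point about the
    long (z) axis by an angle proportional to its z coordinate."""
    x, y, z = points.T
    theta = k2 * z
    xr = np.cos(theta) * x - np.sin(theta) * y
    yr = np.sin(theta) * x + np.cos(theta) * y
    return np.column_stack([xr, yr, z])

def translate(points, k11, k12, k13):
    """Rigid translation (k11, k12, k13)."""
    return points + np.array([k11, k12, k13])

def deform(points, params):
    """Apply a cascade of parameterized transformations to the LV mesh points."""
    out = twist_about_z(points, params.get("k2", 0.0))
    out = translate(out, params.get("k11", 0.0),
                         params.get("k12", 0.0),
                         params.get("k13", 0.0))
    return out

# Example using the torsion and translation entries of Table 2:
mesh = np.random.rand(500, 3) - 0.5        # placeholder mesh points
deformed = deform(mesh, {"k2": -0.03, "k11": 0.355, "k12": 0.47, "k13": 0.38})
```

Because every mesh point is transformed, the same routine that produces the deformed model also yields the ground truth displacement field (deformed minus undeformed positions) used later for validation.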
FIGURE 1 Deformed models of the LV resulting from change of k2 from 0.2 to 0.8 in increments of 0.2. (From Ref. 21.)
FIGURE 2 Deformed models of the LV resulting from change of k4 from ⫺0.02 to ⫺0.08 in increments of ⫺0.02. (From Ref. 21.)
FIGURE 3 Simulated LV in the undeformed reference configuration (left), and a deformed LV resulting from a composition of translation, rotation, scaling, shearing, and torsional deformations (right).
applied for simulating the in vivo imaging process. Figure 4 displays simulated SA and LA images for an undeformed reference configuration.
3 ANALYSIS OF 2-D TISSUE DEFORMATIONS
3.1 Coupled B-Snake Grids
B-splines are suitable for representing a variety of industrial and anatomical shapes [10,15–17]. The advantages of B-spline representations are that (1) they are smooth, continuous parametric curves that can represent open or closed curves (for our application, due to parametric continuity, B-splines will allow for subpixel localization of tags); (2) they are completely specified by a few control points; moreover (3) individual movement of control points will only affect their shape locally. In medical imaging, local tissue deformations can easily be captured by movement of individual control points without affecting static portions of the curve. A B-spline curve is expressed as
α(u) = Σ_{i=0}^{N−1} Vi Bi,k(u)    (1)

where the Bi,k(u) are the B-spline basis functions of order k, having polynomial form and local support, and the Vi are the sequence of control points of the B-spline curve. A coupled snake grid is a sequence of spatially ordered deformable curves represented by a B-spline, which responds to image forces and tracks
TABLE 2 The Values of the Thirteen k Parameters for the Deformed Model in Fig. 3(b)
k1      k2      k3      k4     k5     k6    k7     k8      k9     k10     k11     k12    k13
−0.02   −0.03   −0.015  0.04   0.01   0     0.07   −0.015  0.15   −0.25   0.355   0.47   0.38
FIGURE 4 Simulated short-axis (left) and long-axis image (right), in the undeformed reference configuration.
nonrigid tissue deformations from SPAMM data. The spline grids are constructed by having the horizontal and vertical grid lines share control points. By moving a spline control point, the corresponding vertical and horizontal snakes deform. This representation is reasonable since the point of intersection of two tag lines is physically the same material point, the tissues are connected, and furthermore, through shared control points, a more efficient representation is achieved. We define an MN spline grid by {(M ⫻ N) ⫺ 4} control points, which we represent by the set {{p12 , p13 , . . . , p1,N⫺1}, {p21 , p22 , . . . , p2,N }, . . . , {pM,2, pM,3 , . . . , pM,N⫺1}}
(2)
where pij is the spline control point at row i and column j (Fig. 5). To detect and localize SPAMM tag lines, we optimize grid locations by finding the minimum intensity points in the image, as tag lines are darker than surrounding tissues. However, there is an additional energy term present in our formulation that takes account of the local 2-D structure of image intensity values at tag intersections. Although we cannot specify an exact correspondence for points along a tag line, we do know the exact correspondence for points at tag intersections, assuming 2-D motion for the LV.1 1
We note that in general the motion of LV is three-dimensional. However, there is a little through-plane motion at the apical end of the LV, especially during systole [18].
FIGURE 5 The spatial organization of control points for a coupled B-snake grid. Dependence of horizontal and vertical splines of deformable grids is captured by the shared control points. (From Ref. 21.)
This is the familiar statement of the aperture problem in image sequence analysis. The way to incorporate this familiar knowledge into our algorithm and therefore distinguish between 1-D and 2-D tagged points is to utilize the SSD (sum of squared differences) function [19] in the minimization
Ψ(p12, . . . , pM,N−1) = λ1 Σk ∫ I(αk(u)) du + λ2 Σij SSD(vij)    (3)

where I is the image intensity function, αk is a horizontal or vertical spline, vij denotes the intersection point on the pixel grid of snake curves, and λ1 and λ2 are preset constants. The SSD function determines the sum of squared differences of pixels in a window around point vij in the current frame (with intensity function I) with a window around the corresponding B-snake grid
intersection in the previous frame (with intensity function J). That is, when the location of the grid in I is sought,
SSD(vij) = Σ_{i=1}^{K} (I(qi) − J(q′i))²    (4)
for corresponding locations qi and q′i in an image window with K total pixels. To minimize the discrete version of Ψ, at each iterative step, we compute the gradient of Ψ with respect to pij, perform a line search in the −∇Ψ direction, move the control points to the minimum location (which will result in real values for pij), and continue the procedure until the change in energy is less than a small number, defining convergence. To compute the gradient of Ψ, it is helpful to note that with pij = (xij, yij),

∂Ψ/∂xij ≈ [Ψ(p12, . . . , (xij + Δx, yij), . . . , pM,N−1) − Ψ(p12, . . . , (xij, yij), . . . , pM,N−1)] / Δx
∂Ψ/∂yij ≈ [Ψ(p12, . . . , (xij, yij + Δy), . . . , pM,N−1) − Ψ(p12, . . . , (xij, yij), . . . , pM,N−1)] / Δy    (5)

In practice, we have an additional constraint in computing the energy function: we only use the intersections and points on the snake grid that lie on the heart tissue. To track the grid over multiple frames, the localized grid for the current temporal frame becomes the initial grid for the next frame, which is then optimized on the new data.
3.1.1 Validations
To assess tracking performance with the coupled B-snakes, starting with a user-initialized grid in frame 1, the algorithm was used to track tag lines automatically in a time sequence of simulated images. In a separate setting, an expert was also asked to outline the tags by manually adjusting the control point locations of a coupled B-snake grid in each frame of the same image sequence. The figure of merit proposed in Ref. 20 was then modified to compare each manually outlined B-snake grid G1 with the corresponding grid G2 automatically determined by the algorithm

d(G1, G2) = [1 / (M + N − 4)] Σk (1/2) { max_u min_w D(αk,1(u), αk,2(w)) + max_w min_u D(αk,2(w), αk,1(u)) }    (6)

where αk,1 and αk,2 are corresponding B-snake curves (horizontal or vertical),
u and w are their corresponding spline parameters (a 100 points per span sampling of u and w was employed), M + N − 4 is the total number of splines in the deformable grid, and D is the Euclidean distance metric. Figures 6, 7, and 8 show error plots [Eq. (6)] for our spline tracker (over entire image sequences) for a series of 2-D deformations obtained by varying k4, k5, k10, k11, and k12 with a temporal resolution of 20 ms. In all experiments, λ1 = 75 and λ2 = 1. The details of the spline tracker were as follows: the splines were of order 4, and a B-spline span roughly corresponded to a tag spacing. To illustrate, Figs. 1 and 2 show the deformations of the undeformed model when varying k2 and k4, and Fig. 9 shows the automatically localized grids overlaid on top of four frames of the simulated image sequence corresponding to k2. An in vivo result from localization and tracking of spline grids with gradient descent is shown in Fig. 10. Note that the portions of the snake grids lying within ventricular blood and in the liver do not contribute to the energy of the snake grid. The endocardial and epicardial contours were each manually segmented throughout the sequence using a six-control-point B-spline representation. The contours delimit the part of the snake grids on the myocardium that are used by the algorithm.2 Initialization of B-snake grids can be performed automatically based on the imaging protocol of Sec. 4.1, which includes SA as well as LA acquisitions. Since locations of one set of SA tag planes are related to LA image slices and furthermore, the tag interspacings are known, the B-snake grid is automatically initialized at the outset from image headers without any user intervention.
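The tracking loop of Eqs. (3)–(5) can be summarized in a short sketch. In the Python/NumPy fragment below, the energy callback (the λ1 intensity term along the spline curves plus the λ2 SSD term at grid intersections) is left to the caller, the line search of the text is replaced by a fixed step for brevity, and the SSD window is assumed to lie fully inside the image; the function names, window size, and step sizes are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def ssd(I, J, p, q, half=3):
    """Eq. (4): sum of squared differences between a (2*half+1)^2 window
    around point p in the current frame I and around point q in the
    previous frame J (points rounded to integer pixels for brevity)."""
    (r1, c1) = np.round(p).astype(int)
    (r2, c2) = np.round(q).astype(int)
    wI = I[r1 - half:r1 + half + 1, c1 - half:c1 + half + 1]
    wJ = J[r2 - half:r2 + half + 1, c2 - half:c2 + half + 1]
    return float(np.sum((wI - wJ) ** 2))

def minimize_grid(energy, P0, dx=0.5, step=0.25, tol=1e-3, max_iter=200):
    """Minimize the grid energy of Eq. (3) over the control points.

    energy : callable taking a control-point array of shape (M, N, 2)
    P0     : initial control points (e.g., from the image headers)
    The gradient is formed by the forward differences of Eq. (5)."""
    P = P0.astype(float).copy()
    E = energy(P)
    for _ in range(max_iter):
        G = np.zeros_like(P)
        for idx in np.ndindex(P.shape):          # one coordinate of one p_ij at a time
            Pp = P.copy()
            Pp[idx] += dx
            G[idx] = (energy(Pp) - E) / dx
        P = P - step * G                         # fixed step instead of a line search
        E_new = energy(P)
        if abs(E - E_new) < tol:                 # convergence: small change in energy
            break
        E = E_new
    return P
```

Only intersections and snake points lying on the myocardium should contribute to the energy passed in, mirroring the constraint noted above.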
3.2 Constrained Thin-Plate Spline Warps
Tracking tissue deformations with SPAMM using snake grids provides 2-D displacement information at tag intersections and 1-D displacement information along other 1-D snake points [9,12,10,21]. The displacement measurements from tag lines however are sparse; interpolation is required to reconstruct a dense displacement field from which strain, torsion, and other mechanical indices of function can be computed at all myocardial points. To proceed more formally, the vector field that warps the deformed image into the undeformed image is assumed to be C1 continuous. This smoothness requirement is achieved by minimizing an objective function that combines spatial derivatives of both components of the reconstructed displacement as follows:
2 We have found automatic determination of endocardial and epicardial contours in SPAMM image sequences to be a formidable task due to the presence of tag lines.
FIGURE 6 The figure of merit [Eq. (6)] for coupled B-snake tracking as a function of k2 and k4. (From Ref. 21.)
FIGURE 7 The figure of merit [Eq. (6)] for coupled B-snake tracking as a function of k5 and k10. (From Ref. 21.)
FIGURE 8 The figure of merit [Eq. (6)] for coupled B-snake tracking as a function of k11 and k12. (From Ref. 21.)
FIGURE 9 Results of coupled B-snake tracker on a simulated image sequence (λ1 = 75, λ2 = 1). From top left (k2 = 0.2) to bottom right (k2 = 0.8) in increments of 0.2. Temporal resolution is 20 ms. (From Ref. 21.)
Φ1 = ∫∫ (u_xx² + 2u_xy² + u_yy²) dx dy + ∫∫ (v_xx² + 2v_xy² + v_yy²) dx dy    (7)
characterizing approximating thin-plate splines. It should be noted that although thin-plate warps have been used in the past for other medical imaging applications [22], they have only been utilized to interpolate a warp given
FIGURE 10 Results of tracking SPAMM images with deformable spline grids (λ1 = 100, λ2 = 1). (From Ref. 21.)
homologous landmark points. Assuming 2-D tissue motion (as is roughly the case towards the apical end of the heart [18]), two possible physical constraints [in addition to Eq. (7)] can be considered: 1. Corresponding homologous landmarks obtained from intersecting horizontal and vertical tag grid lines should come into correspondence after application of the thin-plate warp. The intersections of two tag grid lines are ‘‘pulled’’ towards one another by minimizing
Φ2 = Σ [(u − u_int)² + (v − v_int)²]    (8)
where u_int and v_int are the x and y components of displacement at tag intersections. 2. Corresponding homologous curves should come into correspondence, that is, any point on a deformed tag in frame n must be warped to lie on its corresponding undeformed tag in frame 0 of the sequence. For a vector field to perform such a warp, h(u, v) of Fig. 11 must be minimized. Let P1 = P1(u, v) = (x, y) + (u, v) as in Fig. 11, and let P2 be any point on the undeformed tag line. The following term is then summed over all deformed horizontal and vertical grid points:

Φ3 = Σ (h(u, v))² = Σ {(P1(u, v) − P2) · n}²    (9)
where the summation is performed on all points (x, y) on the deformed grids. In order to compare curve-warps with landmark warps, an objective function may be formed by a linear combination of the terms in Eqs. (7), (8), and (9):

Φ = β1Φ1 + β2Φ2 + β3Φ3    (10)

where β1, β2, β3 are nonnegative weights. The following test cases can then
FIGURE 11 Distance of a point P1 to a straight line can easily be calculated with knowledge of any point, P2 on the line and the normal vector (left). The residual distance from a warped point of a deformed tag line to its corresponding undeformed tag line in frame 0 of the sequence, h(u, v ), can similarly be calculated (right). (From Ref. 25.)
be considered: (I) {β1 > 0, β2 > 0, β3 = 0} and (II) {β1 > 0, β2 = 0, β3 > 0}. Additionally, within each of case I (landmark-based warps) and II (curve-based warps), the number of homologous landmark points/curves can be varied to determine the effect on the accuracy of reconstructions. In order to optimize Φ with conjugate-gradient descent, partial derivatives of Φ1 can be calculated using the computational molecule approach discussed in Ref. 23 and also used in Ref. 21. Partial derivatives of Eq. (8) can easily be obtained. Finding partial derivatives of Φ3 is also straightforward:

∇Φ3 = 2 Σ {(P1(u, v) − P2) · n} n    (11)
As may be concluded from the last paragraph, minimization of Eq. (10) leads to a system of linear equations with full rank under nontrivial deformations. Therefore Eq. (10) is a convex objective function with a global minimum. It should be noted that in comparison to Ref. 21, in the present discussion we describe thin-plate techniques that require knowledge of homologous landmark points and/or curves, though with an additional constraint, namely that one set of the homologous curves should be straight. This is a modification of the framework in Ref. 21, making the objective function convex, bypassing local minima, and resulting in reduced computational costs.
3.2.1 Validations
For the purpose of comparing 2-D displacement field reconstructions, we have used the parameters k2, k4, k5, and k10 for generating 2-D deformations of the geometric model, based on which simulated tagged images and 2-D displacement vector fields of actual material points are produced. The error norms used in comparing the ground truth vector field (Vg) with the vector field measured by the warp algorithms (Vm) are rms = √[(1/N) Σ |Vm − Vg|²] and dist = √[(1/N) Σ (h(u, v))²], where N is the total number of deformed grid points on the myocardial region, the summation in rms is performed on entire myocardial regions, and the summation in dist is performed on all points (x, y) on the deformed grid (Fig. 11). In order to test the accuracy of reconstructions as a function of the number of homologous landmarks and homologous curves, a number of experiments were carried out. The experiments were designed to determine the effect of the number of homologous landmark points versus the number of homologous curves on the accuracy of thin-plate spline reconstruction of deformations. The cardiac simulator was used to generate a sequence of SA tagged images in addition to ground truth displacement vector fields. The imaging parameters chosen resulted in 7 × 7 image tag grids. The intersec-
tions of the tag lines in the myocardial area yielded all of the available landmark points, and the tag segments that were in the myocardial region yielded all of the available curves for warping. In order to vary the number of homologous landmarks and curves, the 7 ⫻ 7 grid was subsampled to a 1 ⫻ 1, 2 ⫻ 2, 3 ⫻ 3, 4 ⫻ 4, and the original 7 ⫻ 7 grid. To compare landmark-based and curve-based warps, for each subsampled grid considered, the intersections of the corresponding grids were used as the landmark points. The findings for parameter k4 (short-axis ellipticalization) are displayed in Figs. 12 and 13 (plots for k2 , k5 , and k10 generally follow the same patterns). Please note that for these figures, there are no landmarks for the 1 ⫻ 1 grid, and additionally the results of the 7 ⫻ 7 grid are labeled as
FIGURE 12 Plots of rms as a function of the number of tag lines used in thin-plate spline reconstruction of deformations. Different plots correspond to different values of k4 , ellipticalization in the short-axis plane. (From Ref. 25.)
FIGURE 13 Plots of dist as a function of the number of tag lines used in thin-plate spline reconstruction of deformations. Different plots correspond to different values of k4 , ellipticalization in the short-axis plane. (From Ref. 25.)
case "5" on the x-axis of the plots. For the plots, "1 1 0" corresponds to β1 = 1, β2 = 1, β3 = 0. Similarly, "1 0 1" corresponds to β1 = 1, β2 = 0, β3 = 1, and "1 1 1" corresponds to β1 = 1, β2 = 1, β3 = 1. The plots indicate that for small deformations, there is no significant difference between using curve-based and landmark-based warps, but that for larger deformations, there is a more graceful degradation of curve-based warps in the reconstruction of true deformations in comparison to landmark-based warps. There are also no significant differences between curve-based and landmark-based warps when the number of landmarks/curves increases. In the rest of the section, {β1 = 1, β2 = 0, β3 = 1} was used for analysis of the in vivo data.
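For completeness, the combined objective of Eq. (10) is straightforward to evaluate on a discrete displacement field. The sketch below uses simple finite differences in place of the computational-molecule derivatives and omits the conjugate-gradient solver; the helper names and the (x, y) indexing convention are ours, and the default weights correspond to the {β1 = 1, β2 = 0, β3 = 1} setting used for the in vivo data.

```python
import numpy as np

def bending_energy(u, v):
    """Phi_1 of Eq. (7), approximated with finite differences on the pixel grid."""
    def term(f):
        fx, fy = np.gradient(f, axis=1), np.gradient(f, axis=0)
        fxx, fyy = np.gradient(fx, axis=1), np.gradient(fy, axis=0)
        fxy = np.gradient(fx, axis=0)
        return np.sum(fxx ** 2 + 2 * fxy ** 2 + fyy ** 2)
    return term(u) + term(v)

def landmark_term(u, v, xy, disp):
    """Phi_2 of Eq. (8): mismatch at tag-intersection landmarks.
    xy: (K, 2) integer (x, y) locations; disp: (K, 2) measured (u_int, v_int)."""
    x, y = xy[:, 0], xy[:, 1]
    return np.sum((u[y, x] - disp[:, 0]) ** 2 + (v[y, x] - disp[:, 1]) ** 2)

def curve_term(u, v, xy, p2, n):
    """Phi_3 of Eq. (9): squared distance of each warped tag point to its
    straight undeformed tag line (p2: a point on the line, n: its unit normal,
    both supplied per tag point as (K, 2) arrays)."""
    x, y = xy[:, 0], xy[:, 1]
    p1 = xy + np.column_stack([u[y, x], v[y, x]])        # warped point P1
    return np.sum(np.sum((p1 - p2) * n, axis=1) ** 2)

def objective(u, v, xy_int, disp, xy_tag, p2, n, betas=(1.0, 0.0, 1.0)):
    """Phi = beta1*Phi1 + beta2*Phi2 + beta3*Phi3 of Eq. (10)."""
    b1, b2, b3 = betas
    return (b1 * bending_energy(u, v)
            + b2 * landmark_term(u, v, xy_int, disp)
            + b3 * curve_term(u, v, xy_tag, p2, n))
```

The rms and dist error norms of the validation study can be computed from the same ingredients: rms compares the reconstructed vectors with the ground truth over the myocardium, and dist is the root mean square of the point-to-line residuals accumulated inside curve_term.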
3.3 In Vivo Results
Figure 14 shows six short-axis images at 202, 225, 247, 270, 292, and 315 ms after the ECG trigger (the image at the top is at 0 ms) in a pig. Figure 15 shows six short-axis images at the same time points and slice positions after inducing a myocardial infarction. The computed displacement vectors V for each of these time points for the baseline study and the postmyocardial infarction (post-MI) study are plotted in Figs. 16 and 17. The magnitudes of the motion fields for each of these figures are shown next in Figs. 18 and 19. Although the infarct zone can readily be recognized from the vector fields, these zones clearly stand out in the vector field magnitude maps. Figures 20 and 21 show the absolute value of the radial and circumferential strains for the baseline study after application of a 5 ⫻ 5 moving average low-pass filter to reduce noise. Figures 22 and 23 show the same quantities for the post-MI study. An example of how the radial and circumferential directions are determined is given in Fig. 24, which shows the radial strain directions for frame 2 of Fig. 14. Circumferential directions are orthogonal to the directions plotted. In order to compute the radial direction, a least-squares fit is made of a circle to the endocardium. For each point, the unit vector pointing to the center of this circle determines the radial direction. Finally, Fig. 25 shows the picture of a stained histological slice of the LV roughly corresponding to the position of image slices for the porcine model.
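The radial and circumferential directions described above (and in Fig. 24) reduce to a circle fit plus a normalization. The chapter states only that a least-squares circle is fitted to the endocardium; the algebraic form of the fit and the function names below are one common choice, shown as a sketch.

```python
import numpy as np

def fit_circle(points):
    """Least-squares circle fit to endocardial contour points (K, 2)."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = np.sqrt(c + cx ** 2 + cy ** 2)
    return np.array([cx, cy]), radius

def strain_directions(points, center):
    """Unit radial direction (towards the circle center) and the orthogonal
    circumferential direction at each myocardial point."""
    radial = center - points.astype(float)
    radial /= np.linalg.norm(radial, axis=1, keepdims=True)
    circumferential = np.column_stack([-radial[:, 1], radial[:, 0]])
    return radial, circumferential
```

Strain components along these two directions can then be derived from the spatially smoothed displacement field for every time frame.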
4 MEASUREMENT OF 3-D MOTION OF MYOCARDIAL BEADS
The procedure outlined in the previous section provides a mechanism for tracking points within short-axis image slices. However, as shown in Fig. 26, in MRI, the positions of image slices are fixed relative to the magnet’s coordinate system, and therefore this approach can only yield 2-D motion of material points within a short-axis slice. To obtain information about the movements of points in the out-of-plane direction, a second sequence of images is acquired with slices parallel to the heart’s long axis and with the requirement that tag planes intersecting the new slices be in parallel to shortaxis images. This requirement however is not sufficient for 3-D tracking, and there should yet be an additional requirement, namely that the same set of material points should be imaged in both views. The imaging protocol in Sec. 4.1 accomplishes this goal.
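The geometric core of the bead idea is that a material point is pinned down by three mutually intersecting tag planes, so recovering its 3-D position amounts to solving a small linear system. The sketch below is purely illustrative; in practice the plane normals and offsets come from the tag orientations and spacings recorded in the image headers, as described in the imaging protocol of Sec. 4.1.

```python
import numpy as np

def plane_intersection(normals, offsets):
    """Intersection of three planes n_i . x = d_i, i.e., a 'myocardial bead'.
    normals: (3, 3) array of plane normals; offsets: (3,) array of d_i.
    Assumes the planes are in general position (nonsingular system)."""
    return np.linalg.solve(np.asarray(normals, float), np.asarray(offsets, float))

# Illustrative example: two orthogonal short-axis tagging planes and one tag
# plane coinciding with a short-axis slice position (7 mm spacing, as in the
# protocol below).
n = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
d = [7.0, 14.0, 21.0]
bead = plane_intersection(n, d)     # -> array([ 7., 14., 21.])
```

Tracking over the cardiac cycle then follows the corresponding tag features in both the short-axis and long-axis acquisitions, as developed in the remainder of this section.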
FIGURE 14 Six baseline short-axis tagged images of a cine sequence for a given slice position from a normal pig heart. The image on top is the undeformed tag configuration at 0 ms after the ECG trigger.
FIGURE 15 Six short-axis images from a tagged sequence with the same time and slice positions as in Fig. 14, after induction of a myocardial infarction. The image on top is the undeformed tag configuration at 0 ms after the ECG trigger.
FIGURE 16 Six displacement vector fields for the baseline study, corresponding to the same slice and time positions of the images in Fig. 14. Segmental motion of all myocardial points can easily be quantitated and visualized from the location, direction, and length of the displayed vectors.
FIGURE 17 Six displacement vector fields for the post-MI study. Segmental motion can easily be quantitated and visualized.
FIGURE 18 The magnitude of the displacement vectors of Fig. 16 (normal baseline study). Brighter areas correspond to points having larger motion.
FIGURE 19 The magnitude of the displacement vectors of Fig. 17 (post-MI study). Dark regions of the myocardium, which are the areas with depressed function, can be recognized.
FIGURE 20 This figure shows radial strains for the baseline study.
FIGURE 21 This figure shows circumferential strains for the baseline study.
FIGURE 22 This figure shows radial strains for the post-MI study. Notice that the lower right area consistently has small strain values.
FIGURE 23 This figure shows circumferential strains for the post-MI study. Once again, the lower right area consistently has small strain values.
FIGURE 24 This figure illustrates the technique used to determine the radial and circumferential strain directions. For each time frame, a circular fit is made to the endocardium. The unit vector pointing to the circle’s center is the radial direction (as displayed in the figure). The orthogonal direction is the circumferential direction.
FIGURE 25 Stained histological slice of the pig’s LV, roughly corresponding to image slice position in Figs. 14 and 15. The area in the lower right (between 2 and 7 o’clock locations) of the picture with no dye uptake (bright region) is the infarct zone. Note also the distinct narrowing of the myocardial regions in these areas. (From Ref. 24.)
FIGURE 26 Imaging geometry in MRI. The positions of slices (S1, . . . , S6) are fixed relative to the magnet's coordinate system. However, a dynamic organ such as the heart moves in and out of the slices. Motion of points A1, . . . , A6 illustrates this. (From Ref. 25.)
4.1 Imaging Protocol
A breath-hold SPAMM pulse sequence was used to collect multiple images in both short-axis and long-axis views of the entire heart without gaps. Immediately after the ECG trigger, rf tagging pulses were applied in two orthogonal directions. The repetition time (TR) of the imaging sequence was approximately 7.1 ms, the echo time (TE) was 2.9 ms, the rf pulse flip angle was 15 degrees, and the time extent of the rf tag pulses was 22 ms. Echo sharing was used in collecting each time-varying image sequence for a given slice position. Five data lines were collected for any time frame during each heart cycle, but two data lines were overlapped between two consecutive cardiac frames, so that three new data lines are acquired per frame (3 × 7.1 ms ≈ 21 ms), resulting in an effective temporal resolution of approximately 22 ms. Other imaging parameters were: field of view = 330 mm, data acquisition matrix size = 160 × 256 (phase encoding by readout), in-plane resolution = 2.1 × 1.3 mm², slice thickness = 7 mm, and tag spacing = 7 mm. The image orientations for the short-axis and long-axis views of the heart were first determined by collecting multiple oblique-angle scout images. For the short-axis images, one of the tagging planes was placed parallel to the long-axis imaging planes of the heart by manually setting the angles of the tagging plane in the coordinate system of the magnet to be the same as those of the long-axis view as determined from the scout images. The coordinates of the center of the central tagging plane in the reference coordinate system (relative to the center of the magnet) were set to be the same as those of the center of one of the long-axis image planes to be acquired, again determined by the scout images. As a result, one set of tagging planes intersecting short-axis image slices coincided with long-axis images, since both tag spacing and slice thickness were 7 mm center-to-center. The other short-axis tagging plane was placed orthogonal to the first tagging plane. Similarly, long-axis images were acquired with their tag planes coinciding with short-axis slice positions. Figure 27 illustrates the relationship between tag planes and image slices as accomplished by this imaging protocol. Figure 28 displays the position of the short-axis image slices on one long-axis image at end-diastole. As a result of the imaging protocol outlined in this section, the tag intersections from short-axis images are the material points corresponding precisely to the intersection of three tag planes, revealing for all time points in the cardiac cycle the 3-D motion of these special points.
4.2 Reconstruction of Tag Planes from Coupled B-Snake Grids
Given a spatial stack of m curves on m image slices, each represented by n control points, a matrix of control points is constructed as follows:
FIGURE 27 Figure illustrates the relationship between tag planes and image slices in imaging the left ventricle of the heart. At the time of tag placement, long-axis tag planes coincide with short-axis image slices, and long-axis image slices coincide with one set of short-axis tag planes. Note that this correspondence subsequently becomes invalid due to the motion of the heart. (From Ref. 26.)
FIGURE 28 Position of short-axis image slices at the time of tag placement is drawn on a long-axis image acquired at the same time point in the heart cycle. (From Ref. 26.)
$$\begin{bmatrix} \vec{V}_{11} & \cdots & \vec{V}_{1n} \\ \vdots & \ddots & \vdots \\ \vec{V}_{m1} & \cdots & \vec{V}_{mn} \end{bmatrix} \qquad (12)$$
where the first index denotes the ordering of control points across image slices along the z axis, and the second index denotes the ordering of control points along the curve within an image slice. The matrix immediately gives rise to the surface
$$\vec{S}(u, v) = \sum_{i,j} \vec{V}_{ij}\, B_{i,k}(u)\, B_{j,k}(v) \qquad (13)$$
where we have used nonperiodic blending functions as defined in Ref. 15 with the $t_i$'s as the knot sequence:
$$B_{i,k}(u) = \frac{(u - t_i)\, B_{i,k-1}(u)}{t_{i+k-1} - t_i} + \frac{(t_{i+k} - u)\, B_{i+1,k-1}(u)}{t_{i+k} - t_{i+1}} \qquad (14)$$
We have applied cubic splines (i.e., of order k = 4) to ensure the necessary flexibility in the parametrized shapes (note that B_{j,k} takes on a form identical to B_{i,k}). Furthermore, given that tag lines and image slices are approximately equally spaced, uniform B-splines are used, so that the knots are located at consecutive integer values of the parametric variables. Figures 29 and 30 illustrate the construction of intersecting cubic B-spline tag surfaces from a spatial stack of coupled B-snake grids.
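A minimal Python/NumPy sketch of how the surface of Eqs. (13)-(14) can be evaluated is given below. The recursion follows Eq. (14), and the uniform integer knot sequence follows the text; the helper names, the simple (non-clamped) knot choice, and the recursive evaluation are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def bspline_basis(i, k, u, t):
    """Order-k blending function B_{i,k}(u) over knot vector t, per Eq. (14).
    0/0 terms are treated as zero; support is the half-open span [t[i], t[i+k))."""
    if k == 1:
        return 1.0 if t[i] <= u < t[i + 1] else 0.0
    left = 0.0
    if t[i + k - 1] != t[i]:
        left = (u - t[i]) / (t[i + k - 1] - t[i]) * bspline_basis(i, k - 1, u, t)
    right = 0.0
    if t[i + k] != t[i + 1]:
        right = (t[i + k] - u) / (t[i + k] - t[i + 1]) * bspline_basis(i + 1, k - 1, u, t)
    return left + right

def surface_point(V, u, v, k=4):
    """Evaluate the tensor-product surface of Eq. (13) at (u, v).
    V is an (m, n, 3) array of control points, one row of points per image slice."""
    m, n, _ = V.shape
    tu = np.arange(m + k, dtype=float)   # uniform integer knots, as in the text
    tv = np.arange(n + k, dtype=float)
    S = np.zeros(3)
    for i in range(m):
        Bu = bspline_basis(i, k, u, tu)
        if Bu == 0.0:
            continue
        for j in range(n):
            S += V[i, j] * Bu * bspline_basis(j, k, v, tv)
    return S
```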
4.3 Computing Time-Dependent Coordinates of Material Points
As in the case of coupled B-snakes on short-axis images, deformations of tag planes in the long-axis orientation are again measured by creating B-spline surfaces from stacks of B-snakes. The difference between the short-axis and long-axis image acquisitions, however, is that only one set of parallel tag planes intersects the long-axis images. Figure 31 illustrates a tag surface constructed from a spatial sequence of long-axis images. Coordinates of material points can be obtained by computing the intersections of three intersecting B-spline surfaces representing three intersecting tag surfaces. For each triplet of intersecting B-spline surfaces, S1(u1, v1), S2(u2, v2), S3(u3, v3), the following computation is carried out:
FIGURE 29 A grid of B-spline snakes on short-axis image slices in midsystole. Note that for better illustration, in this figure and in all analysis to follow, every second tag line is utilized. The analysis area of interest is the region within the myocardial borders of the LV. (From Ref. 26.)
$$\min_{\vec{P}_1, \vec{P}_2, \vec{P}_3} \; d^2(\vec{S}_1, \vec{S}_2) + d^2(\vec{S}_1, \vec{S}_3) + d^2(\vec{S}_2, \vec{S}_3) \qquad (15)$$
where point $\vec{P}_i$ belongs to surface $\vec{S}_i$ and $d$ is the Euclidean distance metric. The minimization is carried out using the method of conjugate gradient descent, which ensures fast convergence. Note that the overall distance function above can be written as
$$\|\vec{S}_1(u_1, v_1) - \vec{S}_2(u_2, v_2)\|^2 + \|\vec{S}_2(u_2, v_2) - \vec{S}_3(u_3, v_3)\|^2 + \|\vec{S}_1(u_1, v_1) - \vec{S}_3(u_3, v_3)\|^2 \qquad (16)$$
with the goal of finding the parameters (u_i, v_i) for the triplet of surfaces. The computed parameters are in fact the surface parameters of the intersection point. For the iterative optimization process, a good initial set of parameters has been found to be the parameters of the intersection point computed under the assumption of linear B-spline bases.
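The minimization of Eqs. (15)-(16) can be sketched as follows, reusing the hypothetical surface_point helper from the previous sketch and a SciPy conjugate gradient optimizer. The centroid reported at the end and all names are illustrative choices made for this sketch, not part of the original method.

```python
import numpy as np
from scipy.optimize import minimize

def intersect_three_surfaces(V1, V2, V3, params0, k=4):
    """Estimate a material-point location at the intersection of three B-spline
    tag surfaces by minimizing the pairwise squared distances of Eq. (16)
    over (u1, v1, u2, v2, u3, v3), starting from params0."""
    def cost(p):
        s1 = surface_point(V1, p[0], p[1], k)
        s2 = surface_point(V2, p[2], p[3], k)
        s3 = surface_point(V3, p[4], p[5], k)
        return (np.sum((s1 - s2) ** 2) +
                np.sum((s2 - s3) ** 2) +
                np.sum((s1 - s3) ** 2))

    res = minimize(cost, params0, method='CG')   # conjugate gradient, as in the text
    p = res.x
    pts = np.array([surface_point(V1, p[0], p[1], k),
                    surface_point(V2, p[2], p[3], k),
                    surface_point(V3, p[4], p[5], k)])
    # Report the centroid of the three nearly coincident surface points.
    return pts.mean(axis=0), res.fun
```

A practical starting point params0 is the intersection computed with linear (order-2) bases, as noted above.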
FIGURE 30 The reconstruction of two orthogonal tag surfaces from a sequence of coupled B-spline curves in short-axis image slices (as in Fig. 29). Two views of the same tag planes are shown. The dark surface corresponds to the second vertical grid line from the right in Fig. 29. The bright vertical surface orthogonal to this surface corresponds to the fourth horizontal grid line from the top. Finally, the bright horizontal surface is the long-axis tag plane corresponding to the short-axis image slice. (From Ref. 26.)
FIGURE 31 B-spline surface representation of a long-axis tag plane reconstructed from a spatial stack of B-snakes. Two views of a reconstructed long-axis tag surface are displayed horizontally. ‘‘Out-of-plane’’ movement of the heart is visualized by the deviation from flatness of the long-axis B-surface. (From Ref. 26.)
4.4 Validations
In order to validate the accuracy of the method in measuring 3-D motion of myocardial material points, the cardiac motion simulator of Sec. 2 was used to generate SA and LA images as prescribed by the imaging protocol of Sec. 4.1. The correspondence of LA tag planes and SA image slices, as well as of one set of SA tag planes and the LA image slices, in the undeformed state was ensured by computing the exact location of the tag plane intersecting the 3-D model using Eq. (11) of Ref. 14. The location of the image slice in each case was chosen to be identical to this position. The SA image planes are located at z = -1.57, -0.785, 0, 0.785, 1.57, 2.355, 3.14, 3.925 (all locations in cm), which coincide with the positions of LA tag planes in the undeformed state. The LA image planes are located at y = -2.355, -1.57, -0.785, 0, 0.785, 1.57, 2.355 (all locations in cm), which coincide with the positions of one set of SA tag planes. Therefore there are eight SA slices and seven LA slices. The slice separation as well as the tag separations in SA and LA images were 0.785 cm. To compute a figure of merit for the algorithm, the RMS error between the theoretical intersections, $\vec{P}_i^{\,t}$, and the computed intersections, $\vec{P}_i^{\,c}$, is calculated by
$$\mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left| \vec{P}_i^{\,t} - \vec{P}_i^{\,c} \right|^2} \qquad (17)$$
where the sum is over all myocardial material points across all spatial slices for a given volumetric time frame. Because the accuracy of the intersection calculations is affected by both the tag line detection/tracing procedure and the tag spacings, and in order to remove the effect of errors in tag line detection/tracing, we use the theoretical tag line locations (known from the simulator parameters) for validating the computed intersections for each of the six image collection time points for the k3, k7, and k9 sequences. The 3-D RMS error curves are shown in Fig. 32. The plots indicate that these errors, which are solely due to numerical optimization and round-off errors in computing the location of myocardial beads, are truly negligible (on the order of 0.01 cm). Figure 33 displays how the RMS error varies with the tag spacing under tag line detection/manual correction for the k1 and k2 sequences.³
³For the k3, k7, and k9 image sequences, it is possible to obtain the exact B-snake grid control point locations (and consequently the 3-D coordinates of the tag curves) for all time points by directly applying the simulator deformation matrix [14] to the coordinates of the control points in the undeformed state. However, due to the structure of the B-spline bases, this cannot be achieved for the k1 and k2 (radially dependent compression and torsion) parameters.
FIGURE 32 RMS error plots [Eq. (17)] as functions of k3 , k7 , and k9 . To generate these error curves, the deformation matrix [14] was directly applied to the locations of tag line control points in the undeformed reference state. Intersections of the triplet of tag surfaces were then found and the RMS error between the computed and theoretical intersections was calculated. In all cases, ki = 0 represents the undeformed reference state at time 0. Please note that in the above plots, depending on whether ki increases or decreases from zero, time increases from left to right or right to left.
FIGURE 33 RMS error plots for all the simulated myocardial material points as functions of k1 and k2 with tag detection and manual correction. In all cases, ki = 0 represents the undeformed reference state at time 0. For the k1 plot, time increases from right to left, and for k2 plot, time increases from left to right. In each case, the top plot represents the RMS error with tag spacing of 0.785 cm and the bottom plot is the error curve with tag spacing of 0.393 cm.
FIGURE 34 RMS error plots for all the simulated myocardial material points during the whole cardiac cycle when all the k parameters are concurrently varied. The k parameters for each time point were found by least-squares fitting to an in vivo canine heart over the entire ECG cycle [13].
FIGURE 35 Intersections computed for the k1 sequence: slice 1 of the spatial stack at times 0 and 5 (top). Slices 1, 3, 5, and 7 at times 0 (k1 = 0) and 5 (k1 = -0.1) (bottom). For a QuickTime movie, please visit http://www-cv.wustl.edu/demos/index.html.
FIGURE 36 Intersections computed for the k2 sequence: slice 0 of the spatial stack at times 0 (k2 = 0) and 5 (k2 = 0.15) (top). Slices 0, 2, 4, and 6 at times 0 and 5 (bottom).
FIGURE 37 Intersections computed for the k3 sequence: slice 1 of the spatial stack at times 0 (k3 = 0) and 5 (k3 = 0.05) (top). Slices 0, 2, 4, and 6 at times 0 and 5 (bottom).
FIGURE 38 Initial 3-D location of material points shown on every second slice of the MRI data (top); location of the material points, for every fourth slice, of the MRI data (left); new computed location of material points, one-third through systole (right). The nonrigid motion of the material points of the heart can be seen: points further up in slices (around the base) move downwards, whereas points near the heart’s apex are relatively stationary. (From Ref. 26.)
The error curves in this case clearly indicate larger errors than those in Fig. 32. Note, however, that when the tag spacing is reduced by half, the RMS error is reduced by a half or a third for the parameters k1 and k2, respectively. Figure 34 shows the RMS error for the entire cardiac cycle when all 13 k parameters are varied concurrently. The time evolution of the 13 k parameters is as reported in Refs. 13 and 14, representing canonical deformations of an in vivo canine LV. Figures 35, 36, and 37 show the intersections computed for each of the k1, k2, and k3 simulated image sequences from frame 0 to frame 5. The algorithm was also tested on a normal human image sequence that included 17 slices and 19 frames (17 × 19 images), yielding temporal positions of around 250 beads over the heart cycle. In a movie of these material points, the 3-D motion of individual SPAMM points of the myocardium is clearly apparent. Figure 38 displays results of the intersection computation for a few of the material points. Figure 39 shows results from the application of the algorithm to images collected from a patient with an old, healed, anteroseptal myocardial infarction.
5 CONCLUSIONS
In this chapter, we have provided an overview of novel 2-D and 3-D image analysis methods for measurement of deformations of the left ventricle of the heart from tagged images. The analysis methods presented reconstruct dense displacement vector fields depicting local and regional deformation of the myocardium. Once a dense displacement vector field is available, the strain of deformation at all myocardial points can be measured (the interested reader is referred to Ref. 24 for extensions of these methods to 3-D and 4-D space-time). Methods were also presented for computing the 3-D location of beads encoded within the ventricular wall with a tagging pulse sequence. Validation of the techniques was carried out with a cardiac motion simulator. It is believed that the methods presented will prove valuable in assessing the cardiac patient’s ventricular function.
ACKNOWLEDGMENTS
This work was supported in part by grants from the Whitaker Biomedical Engineering Foundation, the National Science Foundation through grant IRI-9796207, and the National Institutes of Health through grants HL-57628 and HL-64217.
FIGURE 39 MRI of the left-ventricular long-axis view with tissue tagging (left) in a patient with a history of an anteroseptal myocardial infarction. The anteroseptal wall (arrows) shows no motion of the beads during the systolic phase of the cardiac cycle (end-diastole to mid-systole to end-systole), reflecting akinesis of these segments, compatible with myocardial infarction (right). Please note that the location of the beads at end-diastole corresponds to the short-axis slice positions, as shown on the long-axis image.
REFERENCES
1. A. Young, D. Kraitchman, L. Dougherty, L. Axel. Tracking and finite element analysis of stripe deformation in magnetic resonance tagging. IEEE Transactions on Medical Imaging 14(3):413-421, 1995.
2. J. Park, D. Metaxas, L. Axel. Volumetric deformable models with parameter functions: a new approach to the 3D motion analysis of the LV from MRI-SPAMM. In: International Conference on Computer Vision, 1995, pp. 700-705.
3. E. Haber, D. Metaxas, L. Axel. Motion analysis of the right ventricle from MRI images. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI'98. MIT, Cambridge, MA, 1998.
4. B. K. P. Horn, B. G. Schunck. Determining optical flow. Artificial Intelligence 17(1-3):185-203, 1981.
5. S. Gupta, J. Prince. On variable brightness optical flow for tagged MRI. In: Information Processing in Medical Imaging (IPMI), 1995, pp. 323-334.
6. M. A. Guttman, J. L. Prince, E. R. McVeigh. Tag and contour detection in tagged MR images of the left ventricle. IEEE Transactions on Medical Imaging 13:74-88, 1994.
7. W. O'Dell, C. Moore, W. Hunter, E. Zerhouni, E. McVeigh. Three-dimensional myocardial deformations: calculation with displacement field fitting to tagged MR images. Radiology 195(3):829-835, 1995.
8. T. Denney, J. Prince. Reconstruction of 3-D left ventricular motion from planar tagged cardiac MR images: an estimation-theoretic approach. IEEE Transactions on Medical Imaging 14(4):625-635, 1995.
9. A. A. Amini et al. Energy-minimizing deformable grids for tracking tagged MR cardiac images. In: Computers in Cardiology. Durham, NC, 1992, pp. 651-654.
10. A. A. Amini, R. W. Curwen, J. C. Gore. Snakes and splines for tracking non-rigid heart motion. In: European Conference on Computer Vision, Cambridge University, 1996, pp. 251-261.
11. P. Radeva, A. Amini, J. Huang. Deformable B-solids and implicit snakes for 3D localization and tracking of SPAMM MRI data. Computer Vision and Image Understanding 66(2):163-178, 1997.
12. A. A. Amini et al. MR physics-based snake tracking and dense deformations from tagged MR cardiac images (oral presentation). In: AAAI Symposium on Applications of Computer Vision to Medical Image Processing, Stanford University, Stanford, CA, 1994.
13. T. Arts, W. Hunter, A. Douglas, A. Muijtjens, R. Reneman. Description of the deformation of the left ventricle by a kinematic model. J. Biomechanics 25(10):1119-1127, 1992.
14. E. Waks, J. Prince, A. Douglas. Cardiac motion simulator for tagged MRI. In: Proc. of Mathematical Methods in Biomedical Image Analysis, 1996, pp. 182-191.
15. M. E. Mortenson. Geometric Modeling. John Wiley, New York, 1985.
16. S. Menet, P. Saint-Marc, G. Medioni. B-snakes: implementation and application to stereo. In: Proceedings of the DARPA Image Understanding Workshop, Pittsburgh, PA, 1990, pp. 720-726.
17. A. Klein, F. Lee, A. Amini. Quantitative coronary angiography with deformable spline models. IEEE Transactions on Medical Imaging 16(5):468-482, 1997.
18. N. Reichek. Magnetic resonance imaging for assessment of myocardial function. Magnetic Resonance Quarterly 7(4):255-274, 1991.
19. P. Anandan. A computational framework and an algorithm for the measurement of visual motion. International Journal of Computer Vision 2:283-310, 1989.
20. V. Chalana, Y. Kim. A methodology for evaluation of boundary detection algorithms on medical images. IEEE Transactions on Medical Imaging 16(5):642-652, 1997.
21. A. A. Amini, Y. Chen, R. Curwen, V. Mani, J. Sun. Coupled B-snake grids and constrained thin-plate splines for analysis of 2D tissue deformations from tagged MRI. IEEE Transactions on Medical Imaging 17(3):344-356, 1998.
22. F. Bookstein. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-11:567-585, 1989.
23. D. Terzopoulos. Multiresolution computation of visible representation. Ph.D. thesis, MIT, 1984.
24. J. Huang, D. Abendschein, V. Davila-Roman, A. Amini. Spatio-temporal tracking of myocardial deformations with a 4D B-spline model from tagged MRI. IEEE Transactions on Medical Imaging 18(10):957-972, 1999.
25. A. A. Amini, Y. Chen, D. Abendschein. Comparison of landmark-based and curve-based thin-plate warps for analysis of left-ventricular motion from tagged MRI. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI'99, Cambridge University, 1999.
26. A. A. Amini, P. Radeva, D. Li. Measurement of 3D motion of myocardial material points from explicit B-surface reconstruction of tagged data. In: Medical Image Computing and Computer-Assisted Intervention. MIT, Cambridge, MA, 1998.
16
Time-Domain Spectroscopic Quantitation

Leentje Vanhamme, Sabine Van Huffel, and Paul Van Hecke
Katholieke Universiteit Leuven, Leuven, Belgium
This chapter gives an overview of modern time-domain analysis methods in magnetic resonance spectroscopy (MRS). In Sec. 1, the mathematical model functions that are used to describe MRS signals are given and the important concept of prior knowledge is explained. Section 2 concerns the analysis of signals in the time domain under "ideal" circumstances, i.e., when the model function of the entire signal is known. The principles of existing time-domain quantitation methods are explained, and relations between the methods are pointed out. Section 3 deals with parameter estimation under "nonideal" circumstances brought about by deviations from the model function and the presence of unknown or uninteresting spectral features. In Sec. 4 some extensions of the methods to process multiple signals simultaneously are described briefly. Section 5 gives some practical examples, and Sec. 6 summarizes the main conclusions.
1 INTRODUCTION
Accurate quantitation of MRS signals is an essential step prior to the conversion of the estimated signal parameters to biochemical quantities (e.g., concentration, pH). The signals are characterized by highly overlapping peaks and a low signal-to-noise ratio and therefore require sophisticated analysis tools. The quantitation of the signal can be done directly in the measurement domain or, alternatively, in the frequency domain after transformation by the discrete Fourier transform (DFT). Although time- and frequency-domain processing are theoretically equivalent, processing the signal in the time domain offers some distinct advantages. If a model function other than the Lorentzian model (see Sec. 1.1) is used, a simple exact analytical expression for the DFT of the model function is not available. In order to minimize the influence of distorted signal samples or of broad resonances underlying the metabolites of interest, the first data points are often discarded in the time domain. This effect can also be taken into account in the frequency domain, but it makes the model function again more complicated. In general, it is very easy in the time domain to adapt the model function to changes in the sampling vector (i.e., nonuniform sampling or missing data points), while this is more complicated in the frequency domain. In summary, the mathematical model functions used to model MRS data have a simpler form in the time domain and hence are more efficient to compute.
1.1 Model Function
The model function often used to represent the MRS signal is the Lorentzian model function
$$y_n = \bar{y}_n + e_n = \sum_{k=1}^{K} a_k e^{j\phi_k} e^{(-d_k + j 2\pi f_k) t_n} + e_n, \qquad n = 0, 1, \ldots, N-1 \qquad (1)$$
K represents the number of different resonances and j = √−1. The amplitude ak is related to the concentration of the metabolite, and the frequency fk characterizes it. The damping dk provides, among other factors, information about its mobility and molecular environment. Often the acquisition of the signal is only started after a time delay t0, the receiver dead time. Therefore the sampling time points tn can be written as tn = nΔt + t0, with Δt the sampling interval. The quantity 2πfk t0 is also called the first-order phase. The phase φk consists of a so-called zero-order phase φ0 and an additional phase factor φ′k, which represents extra degrees of freedom that may be required under certain experimental conditions (but usually all φ′k are zero). The noise term en is assumed to be circular¹ complex white Gaussian noise. It mainly consists of thermal noise from the sample and noise from the electronic components.
¹The term circular means that the real and imaginary parts of the noise sequence are uncorrelated and have equal variance.
Even though individual metabolite signals can intrinsically be represented by a damped complex exponential, in practical situations a perfectly homogeneous magnetic field cannot be obtained throughout the sample under investigation. This results in deviations from the ideal model function. An alternative model function often used is the Gaussian model
$$y_n = \bar{y}_n + e_n = \sum_{k=1}^{K} a_k e^{j\phi_k} e^{(-g_k t_n + j 2\pi f_k) t_n} + e_n, \qquad n = 0, 1, \ldots, N-1 \qquad (2)$$
Recently, the Voigt model [1,2], which combines the Lorentz and Gauss models, has received some attention.
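For illustration, the following Python/NumPy sketch generates a synthetic FID according to the Lorentzian model of Eq. (1) or the Gaussian model of Eq. (2), with circular white Gaussian noise added. The function name and the simple interface are assumptions made for this example, not part of any standard package.

```python
import numpy as np

def simulate_fid(a, phi, d, f, n_points, dt, t0=0.0, sigma=0.0, model='lorentz', g=None):
    """Sum of K damped complex exponentials plus circular white Gaussian noise.

    a, phi, d, f : amplitudes, phases (rad), dampings (1/s) and frequencies (Hz)
                   of the K components; g holds the Gaussian damping factors
                   when model='gauss'.  sigma is the noise standard deviation
                   of the real and imaginary parts.
    """
    t = np.arange(n_points) * dt + t0
    y = np.zeros(n_points, dtype=complex)
    for k in range(len(a)):
        if model == 'lorentz':
            decay = np.exp(-d[k] * t)          # Lorentzian damping, Eq. (1)
        else:
            decay = np.exp(-g[k] * t**2)       # Gaussian damping, Eq. (2)
        y += a[k] * np.exp(1j * phi[k]) * decay * np.exp(2j * np.pi * f[k] * t)
    noise = sigma * (np.random.randn(n_points) + 1j * np.random.randn(n_points))
    return y + noise
```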
1.2 The Concept of Prior Knowledge
In MRS applications, prior knowledge concerning the spectral components is often present. This prior knowledge can in some cases be derived from spin properties, as illustrated hereafter using the adenosine triphosphate (ATP) example. In Fig. 1 the molecular structure of the ATP compound is depicted. The three phosphorus atoms, designated Pα, Pβ, and Pγ, experience a different chemical/electronic environment, and as a result of shielding they will resonate at slightly different frequencies, giving rise to three groups of spectral lines in the spectrum. As a result of the spin–spin coupling between the neighboring phosphorus nuclei, the groups corresponding to the α and γ phosphates of ATP are split into two peaks (a doublet) and that of the β phosphate is decomposed into three peaks (a triplet), as shown in Fig. 2. From quantum mechanics, the following relations between the parameters of ATP can be derived. The frequency differences between the individual resonances within a multiplet are equal and known: f1 − f2 = f3 − f4 = f5 − f6 = f6 − f7 = Δf.
FIGURE 1 Structure of the chemical compound ATP. All three phosphorus atoms experience a different chemical environment.
FIGURE 2 Theoretical frequency spectrum of ATP. Three groups of spectral lines can be observed corresponding to the Pα, Pβ, and Pγ atoms. In addition, each group is split into a multiplet of lines as a result of spin–spin coupling.
The dampings of all peaks are equal: d1 = d2 = ··· = d7. The amplitude ratios within the multiplets are 1:1 for Pα, 1:2:1 for Pβ, and 1:1 for Pγ. The total amplitudes of Pα, Pβ, and Pγ relate as 1:1:1. In summary: a1 = a2 = a3 = a4 = 2a5 = a6 = 2a7. The phases of all peaks are equal: φ1 = φ2 = ··· = φ7. The above prior knowledge is valid under some assumptions that must be evaluated for individual applications (for more information see Ref. 3). The prior knowledge can also be derived from phantom spectra of pure metabolite solutions measured under the same experimental conditions as the in vivo spectra [4,5]. Alternatively, the complete phantom metabolite spectra can be used as a model, circumventing the need for a mathematical model function [6]. In order to avoid the need to acquire phantom spectra for each field strength and pulse sequence used, quantum-mechanical spectral simulation has been used to deduce a priori information [7–11].
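This prior knowledge can be expressed compactly by parameterizing the seven resonances with a reduced parameter set. The sketch below is one consistent reading of the relations listed above; the assignment of peak indices 1-7 to the two doublets and the triplet, and all names, are illustrative assumptions. The expanded parameters could then be passed to a simulation or fitting routine such as the simulate_fid sketch given earlier.

```python
import numpy as np

def atp_parameters(a, phi, d, f_doublet1, f_doublet2, f_triplet, delta_f):
    """Expand a reduced ATP parameter set into 7 resonances, imposing common
    damping and phase, known splitting delta_f, and 1:1 / 1:2:1 / 1:1
    amplitude ratios with equal total amplitude per phosphorus group.

    a is the amplitude of a single doublet line; f_* are the first-line
    frequencies of each multiplet (Hz)."""
    amps = np.array([a, a, a, a, a / 2, a, a / 2])        # a1 ... a7
    freqs = np.array([f_doublet1, f_doublet1 - delta_f,
                      f_doublet2, f_doublet2 - delta_f,
                      f_triplet, f_triplet - delta_f, f_triplet - 2 * delta_f])
    dampings = np.full(7, d)                              # d1 = ... = d7
    phases = np.full(7, phi)                              # phi1 = ... = phi7
    return amps, phases, dampings, freqs
```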
TIME-DOMAIN QUANTITATION METHODS
Time-domain methods are usually divided into interactive (Sec. 2.1) and black-box (Sec. 2.2) methods. 2.1
Interactive Methods
An approach often used in all kinds of parameter estimation problems is to maximize P( ), the probability density function of the parameters , given the data vector y = [y0 , . . . , yN⫺1]T
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Time-Domain Spectroscopic Quantitation
P(y兩 )P( ) P(y)
max P( 兩y) = max
479
(3)
Since the denominator of Eq. (3) is independent of the parameters, it is sufficient to maximize the numerator. If the probability density function of the parameters is unavailable, the most reasonable choice for P( ) is that of a uniform distribution. P(y兩 ) is known as the likelihood function, and the estimates obtained by maximizing P(y兩 ) are called maximum likelihood (ML) estimates. In the case of circular white Gaussian noise with standard deviation for the real and imaginary parts, the following holds:
写 N⫺1
P(y兩 ) =
P(Re(en))P(Im(en)) =
n=0
1 = (2 2)N
写
1 (22)N
写 N⫺1
2
2
2
2
e⫺(Re(en)) /2 e⫺(Im(en)) /2
n=0
N⫺1
2
2
2
e⫺(Re( yn⫺y¯ n)) /2 e⫺(Im( yn⫺y¯ n)) /2
2
n=0
where Re(⭈), Im(⭈) denote the real, respectively imaginary part of a complex quantity. Maximizing P(y兩 ) is equivalent to maximizing the log-likelihood function L, i.e. L = ln(P(y兩 )) = ⫺N ln(2 2) ⫺
1 2 2
冘 N⫺1
兩yn ⫺ y¯n兩2
(4)
n=0
兩⭈兩 represents the modulus of a complex entity. The above equation shows clearly that in the case of circular white Gaussian noise, obtaining the ML estimates is the same as minimizing the squared difference between the data and the model function or, equivalently, solving a nonlinear least-squares (NLLS) problem. In vector notation the functional to be minimized is written as 㛳y ⫺ y¯㛳2
(5)
㛳⭈㛳 denotes the Euclidean vector norm, y¯ = [y¯0 , . . . , y¯N⫺1]T. It is possible to obtain the ML estimates by minimizing a so-called variable projection functional [12], since the model functions used in MRS (see Sec. 1) can be written as
冘 K
y¯n =
ck␥k (␣k , n)
n = 0, . . . , N ⫺ 1
(6)
k=1
where ck are the complex amplitudes ak e jk and ␥k (␣k , n) are independent functions of the nonlinear parameter vector ␣k . E.g., in the case of the model function of Eq. (1), ␣k is equal to [ fk dk t0]. Using matrix notation, Eq. (6) becomes
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
480
Vanhamme et al.
y¯ = ⌫c
(7) T
with c = [c1 , . . . , cK ] and ⌫=
冋
␥1(␣1 , 0) ⭈⭈ ⭈ ␥1(␣1 , N ⫺ 1)
⭈⭈⭈ ⭈⭈ ⭈ ⭈⭈⭈
册
␥K (␣K , 0) ⭈⭈⭈ ␥K (␣K , N ⫺ 1)
an N ⫻ K matrix of full rank. Suppose that the nonlinear parameters ␣k , k = 1, . . . , K are known; then the matrix ⌫ can be computed and an estimate for the linear parameters c is obtained as the solution of a linear least-squares (LS) problem: cˆ = ⌫†y with ⌫† = (⌫H⌫)⫺1⌫H the pseudo-inverse of ⌫ (the superscript H denotes the complex conjugate). The cost function of Eq. (5) then becomes 2 㛳y ⫺ ⌫⌫†y㛳2 = 㛳P ⬜ ⌫ y㛳
(8)
Equation (8) defines the variable projection functional, and P ⬜ ⌫ is the orthogonal projector on the column space of ⌫. The linear parameters have been eliminated from the cost function of Eq. (8). As a result, a minimization ˆ k are obtained as the problem in fewer variables is obtained. The estimates ␣ ˆ is computed and parameters that minimize Eq. (8). Using ␣ ˆ k , the matrix ⌫ the estimates cˆk of the complex amplitudes are obtained as the LS solution of ˆ ⬇y ⌫c ˆ †y. In Ref. 12 it is proven that the parameters obtained in this i.e., cˆ = ⌫ way are the same as the ones obtained by minimizing Eq. (5) directly. The prior knowledge described in Sec. 1.2 can easily be incorporated into these functionals. By imposing the prior knowledge, the parameter accuracy increases [13]. The prior knowledge is expressed as linear equality constraints that are substituted into the original functional, resulting in an unconstrained NLLS problem. The latter can be solved using local or global optimization theory. Global optimization has already been used in MRS [14–16]. The main disadvantage of these methods is the poor computational efficiency. On the other hand, it is possible to obtain good starting values for the frequencies and dampings by a procedure called peak picking for use in local optimization methods, which results in an acceptable solution in a reasonable time under most circumstances. Peak picking can be explained as follows. The signal is Fourier transformed and the real part is phased correctly. For each peak, the user indicates the top and a point be-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Time-Domain Spectroscopic Quantitation
481
longing to the peak at half height. The top value provides a starting value for the frequency of the peak, and the difference between the top value and the value at half height is directly proportional to the damping of that peak. This step, in combination with the proper choice of prior knowledge to be imposed, involves a lot of interaction with the user. Therefore these model fitting methods are mostly called interactive methods. The first widely available and used interactive method in MRS data quantification is VARPRO [17]. As the name suggests, the method actually minimizes Eq. (8) using a modified version of Osborne’s Levenberg–Marquardt algorithm [18]. Recently, an improved method called AMARES (advanced method for accurate, robust and efficient spectral fitting) was developed [19]. AMARES actually minimizes the cost function of Eq. (5) using a sophisticated NLLS algorithm. AMARES allows the inclusion of more prior knowledge about the signal parameters, the model function (Lorentz, Gauss, and Voigt), and the type of signal (FID or echo). This results in an increased accuracy and flexibility. Overall an algorithm is obtained that outperforms VARPRO in terms of accuracy, robustness, and flexibility. See Ref. 20 for a detailed comparison (in terms of robustness, implementation details, flexibility) between VARPRO and AMARES. Alternative ways to minimize the cost function have been developed. In Refs. 21 and 22 expectation-maximization (EM) algorithm is used to reduce the high dimensional search for the optimum. The EM algorithm divides the problem into K independent optimizations, each involving the parameters of a single peak. The methods presented in Refs. 23 and 24 are very similar in nature and also try to reduce the complexity of the minimization problem by performing multiple one-dimensional searches. The possibility of imposing prior knowledge was not addressed in those publications. In Ref. 25, the methods presented in Refs. 23 and 24 are extended so that the algorithms use the a priori knowledge of the possible frequency intervals of the damped exponentials. 2.2
Black-Box Methods
The basic underlying principles of linear prediction (LP) and state-space methods are treated separately without going into all mathematical details in Secs. 2.21 and 2.2.2, respectively. For each of these two classes, the principles are first outlined for the ideal case of noiseless data. Subsequently, noisy data are considered, and an overview of the methods that compute a statistically suboptimal solution is provided. Finally, it is indicated how ML estimates can be obtained for both black-box approaches. More details about the methods and additional references can be found in Refs. 26 – 29.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
482
Vanhamme et al.
2.2.1
Linear Prediction Methods
2.2.1 Linear Prediction Methods
y¯n =
$$\bar{y}_n = \sum_{k=1}^{K} (a_k e^{j\phi_k})\, (e^{(-d_k + j 2\pi f_k) n \Delta t}) = \sum_{k=1}^{K} c_k z_k^n, \qquad n = 0, \ldots, N-1 \qquad (9)$$
p(z) =
冘 K
(z ⫺ zk) =
k=1
pi z K⫺i
( p0 = 1)
(10)
i=0
pk are the so-called forward prediction coefficients. Using the fact that p(zk) = 0, 1 ⱕ k ⱕ K, the following holds for all k:
冘 K
z =⫺ m k
pi z m⫺i k
mⱖK
(11)
i=1
Combining Eqs. (11) and (9) gives
冘 冉冘 冊 冘 冉冘 冊 冘 K
y¯n = ⫺
K
pi z kn⫺i
ck
k=1
i=1
K
K
=⫺
ck z n⫺i k
pi
i=1
k=1
K
=⫺
pi y¯n⫺i
nⱖK
(12)
i=1
The theoretical series y¯n , n = 0, . . . , N ⫺ 1 of Eq. (9) can be extended by an arbitrary number of points using Eq. (12), and this procedure is therefore called forward linear prediction. A similar reasoning as above can be repeated to define backward linear prediction. To determine the prediction coefficients, two different approaches have been proposed. The first is based on the autocorrelation coefficients of the data points (see Ref. 28 and references therein), while the second works on the data points directly. Here only the second option is explored further. Writing down Eq. (12) for n = K, . . . , N ⫺ 1 results in the following set of equations:
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Time-Domain Spectroscopic Quantitation
$$\begin{bmatrix} \bar{y}_K \\ \bar{y}_{K+1} \\ \vdots \\ \bar{y}_{N-1} \end{bmatrix} = - \begin{bmatrix} \bar{y}_0 & \bar{y}_1 & \cdots & \bar{y}_{K-1} \\ \bar{y}_1 & \bar{y}_2 & \cdots & \bar{y}_K \\ \vdots & \vdots & & \vdots \\ \bar{y}_{N-K-1} & \bar{y}_{N-K} & \cdots & \bar{y}_{N-2} \end{bmatrix} \begin{bmatrix} p_K \\ p_{K-1} \\ \vdots \\ p_1 \end{bmatrix}, \qquad \bar{h}_{lp} = -\bar{H}_{lp}\, p \qquad (13)$$
Equation (13) is exactly solvable and gives the exact linear prediction coefficients p. By rooting the polynomial p(z) specified in Eq. (10), the signal poles zk, k = 1, . . . , K, are obtained, from which the frequencies and dampings can easily be computed. The signal poles are then filled in in Eq. (9), and the exact amplitudes and phases can be determined from this set of equations.
2.2.1.2 Noisy Data—Suboptimal Solutions. If noise is introduced, Eq. (12) becomes
$$y_n = -\sum_{k=1}^{K} p_k (y_{n-k} - e_{n-k}) + e_n = -\sum_{k=1}^{K} p_k y_{n-k} + \sum_{k=0}^{K} p_k e_{n-k} \qquad (p_0 = 1) \qquad (14)$$
Equation (14) represents a special case of an autoregressive moving average (ARMA) model with identical autoregressive and moving average parameters. Since an ARMA model can be arbitrarily well approximated by a high-order AR model, Eq. (14) is replaced by
$$y_n \approx -\sum_{m=1}^{M} p_m y_{n-m} + v_n \qquad (15)$$
where vn, n = 0, . . . , N − 1, is considered to be circular white Gaussian noise. Equation (15) represents an AR model, which is much easier to solve. It is of course evident that the parameters obtained by using the formalism of Eq. (15) are statistically suboptimal due to the simplified underlying noise model. The prediction coefficients are found by setting up Eq. (15) for n = M, . . . , N − 1, i.e.,
$$\begin{bmatrix} y_M \\ y_{M+1} \\ \vdots \\ y_{N-1} \end{bmatrix} \approx - \begin{bmatrix} y_0 & y_1 & \cdots & y_{M-1} \\ y_1 & y_2 & \cdots & y_M \\ \vdots & \vdots & & \vdots \\ y_{N-M-1} & y_{N-M} & \cdots & y_{N-2} \end{bmatrix} \begin{bmatrix} p_M \\ p_{M-1} \\ \vdots \\ p_1 \end{bmatrix}, \qquad h_{lp} \approx -H_{lp}\, p \qquad (16)$$
From Eq. (16), the maximum value for M can be derived. Since N − M has
to be greater than or equal to M, in order not to have an underdetermined set of equations, the largest possible value for M is N/2. No exact solution exists for the above equations. The popular Kumaresan and Tufts’ LPSVD method [30,31] starts by computing the singular value decomposition (SVD) of Hlp . A rank K approximation of Hlp , the matrix HlpK , is then obtained by retaining the largest K singular values and setting the others to zero. In this way, a significant noise contribution is removed. This latter procedure is also called discrete regularization. Solving the set of equations hlp ⬇ HlpK p in a LS sense results in an estimate pˆ of the linear prediction coefficients. The estimates zˆk of the signal poles zk , k = 1, . . . , K, are derived from the K largest (in absolute value) roots of the prediction polynomial 1 ⫹ 兺 M ˆ m z ⫺m = 0. m=1 p A lot of variants of the above outlined algorithm exist. Truncating for example the SVD of [Hlp hlp ] to rank K and computing the total least squares (TLS) solution results in an improved algorithm called LPTLS [32]. Instead of truncating the SVD to rank K, a continuous regularization technique called LPSVD(CR) [33] can be applied to allow automatic determination of the number of components. In EPLPSVD [34,35], the Hankel structure of the HK matrix is restored by applying a SVD-based signal enhancement algorithm developed by Cadzow. In Ref. 36 the truncation step in Cadzow’s algorithm is followed by a correction of the retained singular values according to the principles of minimum variance. All algorithms mentioned are however statistically suboptimal because the ARMA model is not taken into account correctly. 2.2.1.3 Noisy Data—ML Solutions. It is now shown how the LP problem can be solved in a ML way. Two different approaches are considered. The first is based on the optimal solution of the LP equations directly; the other rewrites Eq. (8) in terms of the LP coefficients. Consider again Eq. (14) and set up the linear prediction equations as in Eq. (16) with M = K. Basically, the LP problem solution comes down to computing a solution to the set of incompatible linear equations hlp ⬇ ⫺Hlpp. To compute a solution to this set of equations, corrections [⌬Hlp ⌬hlp ] have to be made to [Hlp hlp ] in order to ensure rank deficiency of [Hlp ⫹ ⌬Hlp hlp ⫹ ⌬hlp ]. In the classical LS approach, the following problem is solved: min 㛳⌬hlp 㛳
⌬h lp ,p
such that ⫺H lp p = hlp ⫹ ⌬hlp
(17)
The solution p to Eq. (17) is only statistically optimal in case the errors are
identically, independently distributed (i.i.d.) and confined to hlp . The latter condition is obviously violated in this case. The TLS method is more general than the LS approach in the sense that it solves min ⌬H lp ,⌬hlp ,p
㛳[⌬H lp
⌬hlp ]㛳 F
such that ⫺(H lp ⫹ ⌬H lp)p = hlp ⫹ ⌬hlp
(18)
yielding statistically optimal results in case of rowwise i.i.d. errors on [H lp hlp]. Note that 㛳⭈㛳 F denotes the Frobenius norm of a matrix. Since [H lp hlp] has a Hankel structure, the rowwise independency assumption is not valid either. The only correct way to obtain ML estimates for p is to use the structured TLS (STLS) formulation [37–39], which comes as a natural extension to the TLS problem when structured matrices are involved. The STLS approach can be formulated as follows: min
qHq
⌬H lp ,⌬hlp ,p
such that ⫺(H lp ⫹ ⌬H lp)p = hlp ⫹ ⌬hlp and [⌬Hlp
⌬hlp]
has the same structure as
[H lp
hlp]
(19)
where q 僆 ⺓ contains the different entries of [⌬H lp ⌬hlp], i.e., the elements in the first column and last row of this matrix. In recent years many different formulations have been proposed for the STLS problem: the constrained TLS (CTLS) approach [40,41], the structured total least norm (STLN) approach [39,42], and the Riemannian SVD (RiSVD) approach [38]. All these formulations start more or less from a formulation similar to the one above, but the final formulation, for which an algorithm is developed, can be quite different. In any case, there exists no closed-form solution to the minimization problem of Eq. (19), and it has to be solved using an iterative algorithm. Once the ML estimates of the linear prediction coefficients have been obtained by solving the above STLS problem formulation, ML estimates of frequencies, dampings, amplitudes, and phases are determined as outlined before. Since the parameters obtained in this way are optimal in a ML sense, it comes as no surprise that the model fitting methods as presented in Sec. 2.1 and this STLS approach of the LP problem formulation are totally equivalent. To prove this, the following derivation is made. Let ⌬yk = yk ⫺ y¯k . The problem stated in Eq. (19) can then be written more explicitly as N⫻1
冘 N⫺1
min ⌬yn,p
兩⌬yn兩2
n = 0, . . . , N ⫺ 1
n=0
such that ⫺(H lp ⫹ ⌬H lp)p = hlp ⫹ ⌬hlp with
⌬H lp =
冋
⌬y0 ⌬y1 ⭈⭈ ⭈ ⌬yN⫺K⫺1
⌬h lp = [⌬yK
⭈⭈⭈ ⭈⭈⭈ ⭈⭈ ⭈ ⭈⭈⭈
⌬yK⫺1 ⌬yK ⭈ ⭈⭈ ⌬yN⫺2
(20)
册
⌬yK⫹1 ⭈⭈⭈ ⌬yN⫺1]T
(21)
(22)
The constraint in Eq. (20) basically expresses that [H lp ⫹ ⌬H lp hlp ⫹ ⌬hlp] is a rank-deficient matrix. Each rank-deficient Hankel matrix ⺓(N⫺K)⫻(K⫹1) can however be constructed from a signal representing a sum of damped complex exponentials 兺Kk=1 ak e jke (⫺dk⫹j2 fk)n⌬t, n = 0, . . . , N ⫺ 1. This parameterization follows directly from the fact that if [H lp ⫹ ⌬H lp hlp ⫹ ⌬hlp] is a rank-deficient Hankel matrix, there exists a vector p such that ⫺[H lp ⫹ ⌬H lp]p = [hlp ⫹ ⌬hlp]. The latter equation is nothing other than a set of LP equations, similar to Eq. (13). The complex amplitudes and signal poles can be retrieved using the procedure outlined in the beginning of this section. Using this parameterization of the elements yn , n = 0, . . . , N ⫺ 1 of the Hankel matrix, it is possible to recast Eq. (20) as the minimization of the K jk (⫺dk⫹j2 fk)n⌬t 2 cost function 兺N⫺1 e 兩 w.r.t. ak , dk , fk , k , k = 1, n=0 兩yn ⫺ 兺k=1 ak e . . . , K. The constraint present in Eq. (20) has been eliminated by using the particular choice of parameterization of the rank-deficient matrix [H lp ⫹ ⌬H lp hlp ⫹ ⌬hlp]. This cost function is totally equivalent to the one of Eq. (5) in case y¯ is modeled by a sum of uniformly sampled damped complex exponentials. By solving the STLS formulation, an optimal Hankel matrix of rank K is found as an approximation to [H lp hlp]. Note that Cadzow’s algorithm computes only an approximate solution to the STLS problem, in the sense that not the closest (to [H lp hlp]) rank-deficient matrix is found. Cadzow’s algorithm uses an iterative two-step procedure. First a rank K matrix is computed using the truncated SVD. Since the latter approximation no longer has a Hankel structure, the second step proceeds by replacing each matrix entry by the value obtained by averaging over the corresponding antidiagonal. These two steps are repeated until convergence.
It is now shown how Eq. (8) can be rewritten in terms of the LP coefficients. In case the model function of Eq. (9) is used, the matrix ⌫ defined in Eq. (7) becomes
⌫=
冋
1 z1 ⭈⭈ ⭈ N⫺1
z1
⭈⭈⭈ ⭈⭈⭈ ⭈⭈ ⭈ ⭈⭈⭈
1 zK ⭈⭈ ⭈ N⫺1
zK
册
The cost function of Eq. (8) can be rewritten as 㛳y ⫺ ⌫⌫†y㛳2 = yH(I ⫺ ⌫⌫†)y
(23)
(I ⫺ ⌫⌫†) projects on the orthogonal complement of the column space of ⌫. Defining the ((N ⫺ K) ⫻ N ) Toeplitz matrix P as
P=
冋
pK 0 ⭈ ⭈⭈ 0
pK⫺1 pK ⭈ ⭈⭈ ⭈⭈⭈
⭈⭈⭈ pK⫺1 ⭈ ⭈⭈ pK
p1 ⭈⭈⭈ ⭈ ⭈⭈ pK⫺1
1 p1 ⭈ ⭈⭈ ⭈⭈⭈
0 1 ⭈ ⭈⭈ ⭈⭈⭈
⭈⭈⭈ 0 ⭈ ⭈⭈ ⭈⭈⭈
⭈⭈⭈ ⭈⭈⭈ ⭈ ⭈⭈ p1
0 0 ⭈ ⭈⭈ 1
册
(24)
and taking into account that p(zk) = 0, k = 1, . . . , K, the relation P⌫ = 0 holds. Combining this with the observation that the rank of ⌫ is K and the one of P is N ⫺ K, P H is seen to be a basis for the orthogonal complement of the column space of P. Therefore the associated projector becomes I ⫺ ⌫⌫† = P H(PP H)⫺1P When H is defined as the extended matrix H = [H lp holds: Py = Hp
(25) hlp], the following (26)
By combining Eqs. (25) and (26), the cost function of Eq. (23) becomes pHHH(PP H)⫺1Hp
(27)
Minimizing Eq. (27) w.r.t. p results in ML estimates for the linear prediction coefficients. The iterative quadratic maximum likelihood (IQML) algorithm, initially proposed in Ref. 43, is often used (see, e.g., [44]) to minimize Eq. (27). However, it is not widely known that the solution obtained by IQML is suboptimal. In fact, in Ref. 44 it is still claimed that the IQML method is a ML estimation algorithm. For a formal proof of the suboptimality of IQML, the interested reader is referred to Ref. 45. In Ref. 45, an alternative algorithm to determine the minimum of Eq. (27) is outlined.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
488
2.2.2
Vanhamme et al.
State-Space Methods
2.2.2.1 Basic Theory—Noiseless Data. The term state-space originates from the corresponding theory derived in the control and identification literature [46]. In the following, the basic principles of the method are explained using principles from linear algebra. We arrange the noiseless data y¯n , n = 0, . . . , N ⫺ 1 in a Hankel matrix as follows:
¯s = H
冋
y¯0 y¯1 ⭈⭈⭈
y¯N⫺M⫺1
y¯1 y¯2 ⭈⭈⭈
y¯N⫺M
⭈⭈⭈ ⭈⭈⭈ ⭈⭈⭈ ⭈⭈⭈
y¯M y¯M⫹1 ⭈⭈⭈ y¯N⫺1
册
M ⱖ K, N ⫺ M > K
(28)
Using the model function of Eq. (9), the following Vandermonde decom¯ s is easily obtained: position of H
¯s = H
冋
z N⫺M⫺1 1
z N⫺M⫺1 2
⭈⭈⭈ ⭈⭈⭈ ⭈⭈ ⭈ ⭈⭈⭈
1 z1 ⭈ ⭈ ⭈⭈ zM 1
1 z2 ⭈ ⭈⭈M z2
⭈⭈⭈ ⭈⭈⭈ ⭈ ⭈⭈ ⭈⭈⭈
1 zK ⭈ ⭈⭈M zK
1 z1 ⭈⭈ ⭈
冋
1 z2 ⭈⭈ ⭈
1 zK ⭈⭈ ⭈
册
¯ ¯ ¯T = SCT
z N⫺M⫺1 K
册冋
c1 0 ⭈⭈ ⭈ 0
0 c2 ⭈⭈ ⭈ 0
⭈⭈⭈ ⭈⭈⭈ ⭈⭈ ⭈ ⭈⭈⭈
0 0 ⭈⭈ ⭈ cK
册
T
(29)
(30)
From this Vandermonde decomposition the complex amplitudes and signal poles can immediately be derived. However, no standard algorithms exist that compute the Vandermonde decomposition directly, and as a result the parameters have to be determined indirectly. To this end, the following rea¯ ) exhibits the soning is made. From Eq. (30) it is seen that S¯ (but also T shift-invariant property ¯ S¯ ↑ = S¯ ↓Z
(31)
where the up (down) arrow stands for deleting the top (bottom) row, and ¯ 僆 ⺓K⫻K is a diagonal matrix composed of the K signal poles zk , k = 1, Z ¯ , and T ¯ is equal to K. From the LP relations that . . . , K. The rank of S¯, C ¯ s , it follows that H ¯ s itself has exactly rank exist between the columns of H ¯ s has the form K. That means that the SVD of H
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Time-Domain Spectroscopic Quantitation
¯ s = U⌺V ¯ ¯ ¯ H = [U ¯K H ¯ KV ¯ K⌺ ¯ HK =U
¯ 2] U
冋 册 ¯K ⌺ 0
0 0
¯K [V
489
¯ 2]H V (32)
¯ K 僆 ⺓K⫻K, V ¯ K 僆 ⺓(N⫺M)⫻K, U ¯ 2 僆 ⺓(N⫺M)⫻(N⫺M⫺K), ⌺ ¯ K 僆 ⺓(M⫹1)⫻K, and where U (M⫹1)⫻(M⫹1⫺K) ¯ V2 僆 ⺓ . From a comparison of Eq. (30) and Eq. (32) it can be ¯ K and S¯ span the same column space and hence are equal up inferred that U ¯ 僆 ⺓K⫻K to a multiplication by a nonsingular matrix Q ¯ K = SQ ¯¯ U
(33)
Using Eq. (33), the shift-invariance property of Eq. (31) translates to ¯ ↑K = U ¯ K↓Q ¯ ⫺1ZQ ¯ ¯ U
(34)
¯ , subsequent truncation of the SVD, and setComputation of the SVD of H ¯Q ¯ ). Since the eigenvalues ¯ ⫺1Z ting up Eq. (34) allows the determination of (Q ⫺1 ¯ ¯ ¯ ¯ of (Q ZQ) and Z are equal, the signal poles are easily derived by ¯ ⫺1ZQ) ¯ ¯ = eig(Z) ¯ = {zk , k = 1, . . . , K} eig(Q The function eig(⭈) determines the eigenvalues of the matrix between brackets. From these signal poles, frequencies and dampings are easily calculated. Amplitudes and phases are determined by filling in the obtained signal poles in the model function Eq. (9) and solving the resulting equations. Note that ¯ a similar reasoning can be made to using the shift-invariant property of T ↑ ¯ ¯ derive a relation between VK↓ and VK similar to Eq. (34). 2.2.2.2 Noisy Data—Suboptimal Solutions. If noise is considered, relation (32) no longer holds and there is no exact solution (Q⫺1ZQ) of the shift-invariant property in UK anymore. However Hs , having the same struc¯ s , but constructed from the noisy data, can be approximated by the ture as H truncated SVD of Hs : Hs = U⌺VH ⬇ UK⌺KVKH = HsK
(35)
where UK and VK are respectively the first K columns of U and V, and ⌺K is the leading (K ⫻ K) submatrix of ⌺. This step brings along the suboptimality of the method. HK is a matrix of rank K, but its Hankel structure has been destroyed by the truncation of the SVD. Therefore there exists no exact solution to the set of equations U↑K ⬇ UK↓Q⫺1ZQ
(36)
Instead, in the method called HSVD [46,47], an estimate of the matrix (Q⫺1ZQ) is obtained by solving Eq. (36) in a LS sense. The eigenvalues of the latter matrix are the estimates zˆk , k = 1, . . . , K, of the signal poles.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
490
Vanhamme et al.
These estimates are filled in in the model equations of Eq. (9) and fitted to the original data with the LS method. A lot of variants of HSVD exist. An improved variant is the HTLS algorithm [48], which computes the TLS solution of Eq. (36). Cadzow’s method or the minimum variance technique can evidently also be used as preprocessing techniques to improve quantitation by means of the basic HSVD and HTLS algorithms [36]. 2.2.2.3 Noisy Data—ML Solutions. This state-space formulation can also be solved in a statistically optimal way. Consider again Hs , with M = K:
Hs =
冋
y0 y1 ⭈⭈⭈
yN⫺1⫺K
⭈⭈⭈ ⭈⭈⭈ ⭈⭈⭈ ⭈⭈⭈
册
yK yK⫹1 = [H1 ⭈⭈⭈ yN⫺1
hs]
(37)
¯s = with H1 僆 ⺓(N⫺K)⫻K and hs 僆 ⺓(N⫺K)⫻1. When no noise is present Hs = H ¯ 1 h¯ s], and the rank of [H ¯ 1 h¯ s] is K, implying the existence of a vector [H ¯ 1x = h¯ s . By adding noise, the corresponding set of linear x such that H equations H1x ⬇ hs is no longer compatible, but corrections [⌬H1 ⌬hs] can be computed using a STLS algorithm such that [H1 ⫹ ⌬H1 hs ⫹ ⌬hs] has exactly the rank K and still has Hankel structure. Comparing H1 , hs with Eq. (16) shows that hs = hlp and H1 = H lp if M is taken to be equal to K in Eq. (16). The problem as stated above thus reduces to the one stated in Eq. (19) or Eq. (20). Once the optimal rank K approximation of the matrix Hs is found, the parameters are determined as outlined above for the HSVD algorithm. Note that Eq. (35) and Eq. (36) can now be solved exactly and that the thus obtained parameters have ML properties. It is clear that the optimal solution of the LP and state-space problems give the same results, since both start from the same optimal rank K matrix. Only the way in which the parameters are derived differs. 2.3
Note
The underlying principles of the matrix pencil (MP) method [49] applied to MRS data in Ref. 50, are very similar to the ideas used to derive the statespace formalism. The method finds the estimates of the signal poles as eigenvalues of a matrix. In this way polynomial rooting and root selection are avoided. In fact, in the noiseless data case, the MP and state-space method give the same result. In the noisy data case, there is almost no difference between estimates obtained by either method [51].
Recently, black-box methods have been proposed in which some form of prior knowledge can be incorporated, but limitations on the prior knowledge that can be imposed on the model function parameters are inherent in this type of method. In the research literature, the following types of prior knowledge have been incorporated: known signal poles [52–54], known (frequency and) phase [55], known frequency difference and equal dampings [56], and known signal poles and phases [55].

2.4 Summary and Additional Remarks
It was shown in the previous sections that minimizing ‖y − ȳ‖² gives the same ML parameter estimates as

1. Minimizing the variable projection functional ‖y − ΓΓ†y‖² and obtaining the estimates for the linear parameters by solving Γc ≈ y in the LS sense.

2. Solving

\min_{\Delta H,\, \Delta h,\, p} \; q^H q

such that −(H + ΔH)p = h + Δh, [ΔH Δh] has the same structure as [H h], and q contains the different elements of [ΔH Δh].

In the LP problem setting, H = H_lp ∈ ℂ^{(N−K)×K}, h = h_lp ∈ ℂ^{(N−K)×1}, ΔH = ΔH_lp, and Δh = Δh_lp. Once the K optimal LP coefficients p̂ are obtained, estimates of the frequencies and dampings can be derived from the roots of the polynomial 1 + Σ_{m=1}^{K} p̂_m z^{−m} = 0. The estimates for amplitudes and phases are obtained by solving Γc ≈ y in the LS sense.

In the state-space setting, H = H_1 ∈ ℂ^{(N−K)×K}, h = h_s ∈ ℂ^{(N−K)×1}, ΔH = ΔH_1, and Δh = Δh_s. The eigenvalues are determined from the matrix E obtained by solving U_K^↑ = U_K^↓ E; note that this equation now holds exactly. U_K here is the matrix derived by the SVD of the optimal rank-K approximation [H_1 + ΔH_1  h_s + Δh_s]. Estimates for frequencies and dampings are obtained from the eigenvalues of E, and the estimates for amplitudes and phases by solving Γc ≈ y in the LS sense.

3. Minimizing p^H H^H (PP^H)^{−1} H p with respect to p. Once the K optimal LP coefficients p̂ are obtained, estimates of the frequencies and dampings can be derived from the roots of the polynomial 1 + Σ_{m=1}^{K} p̂_m z^{−m} = 0. The estimates for amplitudes and phases are obtained by solving Γc ≈ y in the LS sense.

On the other hand, LPSVD, LPTLS, HSVD, and HTLS, possibly combined with signal enhancement, e.g., the minimum variance technique or Cadzow's method, provide statistically suboptimal solutions. HSVD and HTLS are a better alternative to the LP methods, since they circumvent the
steps of polynomial rooting and root selection. Polynomial rooting is increasingly time-consuming for larger prediction orders, whereas root selection may be difficult when the SNR is low. In general, state-space methods are computationally more efficient and provide more accurate parameter estimates than the LP methods [57].

The interactive methods are very flexible. All types of model functions can be used, and prior knowledge can easily be incorporated. Besides these advantages, interactive methods provide ML parameter estimates if the underlying assumptions concerning the model function and noise distribution are satisfied. The most important drawbacks are the necessary user interaction and the fact that no methods exist that guarantee convergence to the global minimum within a reasonable time.

Black-box methods require minimal user involvement or expertise. One of the drawbacks, however, is that the model function used is restricted to a sum of uniformly sampled damped complex exponentials as specified in Eq. (1). Another disadvantage is that only limited forms of prior knowledge can be used. In those cases where the quality of the data is poor or the resolution is low due to extensive peak overlap, only interactive methods, implementing all available prior knowledge, can provide reliable and accurate parameter estimates.

The difference in computational complexity between interactive and black-box methods depends, among other things, on the quality of the initial parameter estimates and the amount of prior knowledge that is imposed. It is impossible to predict which method will turn out to be faster on a particular data set.

3 PARAMETER ESTIMATION IN NONIDEAL CIRCUMSTANCES
In many cases the biomedical MRS data do not obey the theoretical model functions of Sec. 1 and contain disturbing resonances of which little is known. Examples include eddy currents that distort the lineshape and residual water in proton spectra. The quantitation results can often be improved using techniques that adjust for these data imperfections. Section 3.1 discusses methods that deal with model imperfections. Different techniques to reduce the influence of unwanted spectral features are discussed in Sec. 3.2.

3.1 Corrections for Model Imperfections
Severe signal distortions most often appear in biomedical MRS due to the restriction of molecular motion, instrumental imperfections, and sample inhomogeneities. These shortcomings might limit the spectral resolution and result in systematic quantitation errors. These errors occur in particular for
model-based quantitation algorithms, since in general the resonance peaks do not have the assumed ideal lineshapes (such as Lorentzian, Gaussian, or Voigt). Different techniques to handle such problems have been proposed in the research literature. The common concept behind time-domain preprocessing techniques in this context is to determine the signal distortions by comparing an experimental reference peak to a peak having an ideal lineshape. The required reference peak can be acquired as a separate measurement or extracted from the spectrum itself. If a reference peak is obtained from a separate measurement, this results in a high-quality estimate of the deviations. Time is, however, spent on performing the additional measurements, and furthermore it is questionable whether the distortions are actually equal across different experiments even though the same measurement sequences and hardware are used. Extracting a reference peak from the spectrum itself offers, as an obvious advantage, that no separate measurement needs to be performed. Unfortunately, in many cases no suitable reference peak can be found in the spectrum that is sufficiently separated from other peaks and has a high enough SNR. Some of the work in this area based on both approaches is described in more detail below.

Initial efforts dealt with the correction of eddy-current-induced distortions. The eddy currents give rise to a time-varying phase shift in the acquired data. In Ref. 58 it was proposed to determine the phase shift from a one-peak reference signal y_n^{(ref)}, n = 0, ..., N − 1, and subsequently correct the data according to

y_n^{(corr)} = y_n \, e^{-j \phi_n^{(ref)}}    (38)

where

\phi_n^{(ref)} = \tan^{-1}\!\left( \frac{\mathrm{Im}(y_n^{(ref)})}{\mathrm{Re}(y_n^{(ref)})} \right)    (39)
denotes the instantaneous phase of the reference signal. The method was used again by Klose [59] for the special case of 1H MRS. In this case, the instantaneous phase is obtained from separate data acquired without any hardware water suppression. The high signal energy of the unsuppressed water peak compared to the metabolite peaks is expected to give a reliable measure of the instantaneous phase of the water peak only. Thereafter the corrections are applied to the acquired water-suppressed data. This method is standard for eddy current corrections in 1H MRS.

More general methods, able to adjust for other types of data imperfections, have also been proposed. In Ref. 60 a reference peak is chosen as one of the peaks in the experimental data. An ideal reference peak signal y_n^{(ideal)}, n = 0, ..., N − 1, is constructed via the inverse FT of the predicted
reference peak spectrum. Finally, the corrected data y_n^{(corr)}, n = 0, ..., N − 1, are obtained by

y_n^{(corr)} = y_n \, \frac{y_n^{(ideal)}}{y_n^{(ref)}}, \qquad n = 0, \ldots, N-1    (40)
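Both corrections are simple to state in code. The sketch below applies either the phase-only correction of Eqs. (38)–(39) or the full complex division of Eq. (40); the function names and the epsilon guard against near-zero reference samples are our illustrative assumptions, not part of Refs. 58–62.

```python
import numpy as np

def eddy_current_correct(y, y_ref):
    """Phase-only correction, Eqs. (38)-(39): remove the instantaneous phase
    of a one-peak reference signal from the data."""
    phi_ref = np.angle(y_ref)               # instantaneous phase, as in Eq. (39)
    return y * np.exp(-1j * phi_ref)        # Eq. (38)

def reference_deconvolve(y, y_ref, y_ideal, eps=1e-8):
    """Full lineshape correction, Eq. (40): elementwise division by the reference,
    with very small reference values replaced to avoid numerical blow-up."""
    ref = np.where(np.abs(y_ref) < eps, eps, y_ref)
    return y * y_ideal / ref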
A potential drawback is that the reference signal might be equal to or close to zero, which can lead to numerical problems. This can be avoided if the values of y_n^{(ref)} are examined and replaced when necessary. The distortions associated with the extraction of the needed reference peak from the spectrum, and possible remedies, are discussed in Ref. 61. An algorithm based on the same principle as that proposed by Morris [60] is called QUALITY (quantitation by converting lineshapes to the Lorentzian type). It was proposed in Ref. 62, and it addresses the removal of arbitrary distortions by elementwise division by an estimated lineshape deviation, using either separate data or an isolated peak in the data to be quantitated as a reference peak. Further developments for automating the above procedures for various specific spectroscopic experiments are proposed in, for example, Ref. 63 (1H spectroscopic imaging) and Ref. 64 (1H spectroscopy).

In Ref. 65 a technique called self-deconvolution is used to avoid the need for a reference peak. The method is iterative and based on an initial estimate of the parameters of the spectral components. In Refs. 66 and 67 an experimental spatial map of the variations in the magnetic field is set up. Those variations are included in the quantitation, thereby eliminating distortions due to field inhomogeneities. More general approaches can be found in, for example, Refs. 68, 2, and 6, where lineshape distortions are parameterized in various ways and included in the fitting procedures. By including the distortions in the model function directly, either as additional parameters to be estimated or by multiplying the model function with the derived model distortions, these methods also avoid the problems associated with division by a reference signal that is equal to or close to zero.

No final conclusion can be drawn about which method should be used for a specific scenario, since no systematic study of the accuracy of the methods has been performed. Most evaluations have been done only by visual inspection of the corrected spectrum. A systematic study of the parameter accuracy achieved by the different techniques, using both simulations and an extensive experimental study, is desirable.

3.2 Removal of Unwanted Components
In biomedical MRS the acquired signal is often composed of many resonance frequencies, of which only a few are of clinical interest. Preprocessing
the MRS signal to eliminate the unwanted features prior to quantitation can be favorable in some cases. Removal of the features of no interest (called nuisance peaks below) can reduce the computational burden in the final estimation phase and improve the accuracy of the parameter estimates (remove bias and/or reduce variance). Depending on the characteristics of the nuisance peaks, different approaches can be taken. Three scenarios, in which the removal of nuisance peaks has to be handled in different ways, can be distinguished.

The most straightforward case is when the nuisance peaks and the peaks of interest are separated in frequency. Estimation of parameters of selected peaks in the presence of unknown or uninteresting spectral features separated in frequency from the peaks of interest is denoted by frequency-selective (FS) parameter estimation in the following. A large number of methods that handle such cases have been presented in the literature. An important example is solvent suppression in 1H spectroscopy, which has received much attention. The huge water resonance that is inevitably present in those spectra cannot be described by an analytical function, mainly because of magnetic field inhomogeneity and measurement suppression techniques. Since no model function is available for the water signal, a method such as AMARES cannot be used without first removing the disturbing peak. Note that the problem is also present if one uses a frequency-domain quantification method; in this case, care has to be taken of the tilted baseline caused by the presence of the water resonance. As a consequence, irrespective of the method used for the actual quantification of these proton spectra, a preprocessing step is necessary to remove the unwanted water contribution. For disturbing components that are separated in frequency from the wanted metabolites and have an intensity in the same range as those metabolites, the problem is less difficult, and simpler techniques (like time-domain weighting, as described in the next section) can be used. An overview of techniques used to remove the influence of nuisance peaks separated in frequency from the peaks of interest is given in Sec. 3.2.1.

Another possibility is that the nuisance peaks have a much larger linewidth than the peaks of interest. Such features originate from large, less mobile molecules or from sequence or hardware artifacts. These peaks can be efficiently removed by discarding the first samples of the acquired data, even though they severely distort the spectrum and apparently overlap with peaks of interest. The loss of SNR for the peaks of interest when discarding a few samples is most often negligible.

Finally, the most difficult scenario occurs when the nuisance peaks are neither separated in frequency nor in damping from the peaks of interest and as a result strongly overlap. This is for example encountered in
short-echo proton spectra, where macromolecular baseline resonances pose problems. A number of different solutions have been proposed. Most methods use some kind of model for the baseline. In Ref. 69 the baseline is modeled by a sum of Voigt lines and in Ref. 5 by a set of Gaussian peaks. In the frequency domain, the baseline has also been approximated by a spline function [6] and by wavelet coefficients [70].

3.2.1 Removal of Components Separated in Frequency
Time-Domain Weighting. The influence of nuisance peaks in NLLS parameter estimation techniques such as VARPRO and AMARES was studied in Ref. 71. The bias term of the amplitude estimates, assuming correct estimates of frequencies and dampings, was derived, and it was seen that this term could be reduced by introducing an appropriate weighting vector wn in the NLLS cost function
\sum_{n=0}^{N-1} \left| w_n (y_n - \bar{y}_n) \right|^2    (41)
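As an illustration of how such a weighting enters the cost function, the following sketch builds the quarter-wave sinusoid weighting recommended in Ref. 71 and evaluates the weighted residual of Eq. (41). The function and variable names are ours, and the default of 20 edge samples simply mirrors the recommendation quoted below.

```python
import numpy as np

def quarter_wave_weights(N, n_edge=20):
    """Weighting vector w_n: a quarter-wave sinusoid over the first (and last)
    n_edge samples, 1 elsewhere."""
    w = np.ones(N)
    ramp = np.sin(0.5 * np.pi * np.arange(n_edge) / n_edge)   # rises from 0 toward 1
    w[:n_edge] = ramp
    w[-n_edge:] = ramp[::-1]
    return w

def weighted_cost(y, y_model, w):
    """Weighted NLLS cost of Eq. (41)."""
    return np.sum(np.abs(w * (y - y_model)) ** 2)
```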
The choice of the weighting vector is a tradeoff between reducing the nuisance peaks' influence and the loss of SNR. The weighting vector should be matched to the frequency distance, amplitude, and damping of the nuisance peaks to find the optimal tradeoff. This is, however, not feasible in practice, and instead a generic weighting vector is applied that is expected to work reasonably well in most cases. In Ref. 71 a weighting consisting of a quarter-wave sinusoid for the first (and last) 20 samples was recommended. The method gives good results for relatively well-separated peaks. However, the technique always leads to a loss of SNR, resulting in an increased variance of the parameter estimates. If the nuisance peaks have a large amplitude or are close in frequency to the peaks of interest, the method breaks down. This method can therefore not be used to reduce the influence of the water peak in 1H MRS.

Use of Black-Box Methods. Another approach to solve the problem of FS estimation is to model the nuisance peaks by a black-box method and subtract the reconstructed time-domain signal from the original signal prior to parameter estimation. In general, black-box methods provide a very good mathematical fit of the data by a sum of exponentially damped complex-valued sinusoids. A black-box method can therefore be used to approximate complicated features of the nuisance peaks by modeling them as a superposition of Lorentzian peaks. After subtraction of the nuisance peaks, the residual signal can be quantified using any parameter estimation method.
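Combined with a pole estimator such as the HSVD sketch given earlier, the subtract-the-nuisance idea reduces to a few lines of code. The suppression band of ±f_cut Hz around 0 Hz is an illustrative choice, not a prescription from the cited work.

```python
import numpy as np

def remove_band(y, freq, damp, amp, phase, dt, f_cut=50.0):
    """Reconstruct and subtract all estimated components whose frequency lies
    within +/- f_cut Hz of the nuisance (e.g., water) region assumed at 0 Hz."""
    n = np.arange(len(y))
    nuisance = np.zeros(len(y), dtype=complex)
    for f, d, a, p in zip(freq, damp, amp, phase):
        if abs(f) < f_cut:                     # component falls in the suppression band
            nuisance += a * np.exp(1j * p) * np.exp((-d + 2j * np.pi * f) * n * dt)
    return y - nuisance
```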
This technique was first used in the context of water suppression in proton MRS [72,73]. A black-box method that can be used in this context is the HSVD method [72]. A drawback of using HSVD to model the nuisance peaks is the high computational complexity involved in computing an entire SVD. Fast variants of HSVD that compute only part of the SVD exist [74,75]. In addition to the frequency regions to be suppressed, the user has to choose the model order K. The choice of model order K in HSVD is not trivial in cases where the nearby nuisance peaks have an unknown or non-Lorentzian model function, and in some cases the choice of the model order has an influence on the parameter accuracy [76].

Filtering. In the context of the removal of residual water in proton spectra, the earliest preprocessing techniques were based on filtering. In Ref. 77 a simple high-pass finite impulse response (FIR) filter was used to suppress the water peak. From a filtering point of view, these are very basic filters that strongly influence the metabolite peaks, making accurate quantitation using the filtered signal impossible. A more elaborate technique in that sense was proposed in Ref. 78, where the main idea consists of using a low-pass FIR filter of relatively high order to suppress all metabolite resonances at higher frequencies. The filtered signal then ideally consists of the pure water signal, which can be subtracted from the original signal. A simple preprocessing method, the ER-filter [79], has been proposed in the more general context of suppressing any unwanted region in the spectrum. Although this technique inherently distorts the signal, it can be used in cases where the wanted spectral region is small compared to the full spectral width and the number of data points is large.

A general comment on this filter-related research is the lack of a satisfactory discussion of the design of the proposed filters. The estimation results are strongly influenced by the choice of filter type and filter order, and only limited guidelines have been provided for making that choice. Moreover, none of the papers mentioned above discusses the influence of the techniques used on the parameter estimates of the metabolites of interest. In Ref. 80 a particular narrow pass-band FIR filter was developed for suppressing neighboring peaks in the spectrum. The peak of interest was thereafter quantitated using the HSVD method and the parameters adjusted for the filter influence. This method was proposed to reduce the computational burden involved in a total quantitation and to achieve robust estimates with high accuracy. In Ref. 81, however, AMARESf, a method based on the use of maximum-phase FIR filters, is presented, and a detailed study of the use and design of FIR filters in the context of solvent suppression is made. The effect of the filter is taken into account in the parameter estimation phase, thereby minimizing the distortions introduced
by the use of the filter. In Ref. 76 the use of these FIR filters is generalized to the suppression of any region in the spectrum, and a detailed comparison is carried out with other methods such as time-domain weighting, frequency-domain fitting using a polynomial baseline, and the time-domain HSVD filter method. The ease of use, low computational complexity, and high resulting parameter accuracy of the maximum-phase FIR filter method make it an attractive approach for FS parameter estimation.

4 EXTENSIONS
In biochemical studies MRS signals are often acquired consecutively to monitor metabolic changes over time. Time series, for example, are measured to follow reactions occurring after administration of a certain substance, to observe changes in metabolites after pinching off an artery in perfusion experiments where an organ such as the heart is isolated, and to study the metabolic changes during and after muscle exercise. Often, information concerning the time evolution of some of the parameters is available. AMARESts [82], an extension of AMARES, takes the common information present in the spectra of a time series into account in a statistically optimal way. The method performs very well in practical situations. AMARESts leads to improved and more robust estimates than those obtained by processing the signals individually with AMARES, since in the latter case the information shared between the spectra cannot be taken into account. HTLSstack and HTLSsum [83] are extensions of HTLS; although these algorithms do not perform as well in terms of precision as AMARESts and need more fine-tuning, they perform better than HTLS applied to every signal individually and are promising black-box techniques to further automate MRS data processing.

Very recently, the concept of Principal Component Analysis (PCA) has been used in the time domain, and comparisons with HTLSstack and HTLSsum were carried out [84]. PCA demonstrates good performance in terms of both accuracy and computational efficiency for MRS data quantification but is not suitable for direct quantification of more than one lineshape. HTLSstack and HTLSsum simultaneously quantify multiple lineshapes and outperform PCA, in particular when the lineshape is Lorentzian, since they exploit this prior knowledge. Only at low SNR does PCA outperform the HTLS-based methods, if the data set contains only one non-Lorentzian lineshape.

5 EXAMPLES
Some of the time-domain quantitation methods are illustrated using 31P and 1H examples.
The 31P signal analyzed was obtained from an ex vivo perfused rat liver. It was recorded at 81.1 MHz (4.7 T Bruker Biospec) and is the summation of 2048 signals obtained in 100 s. The signal consists of 128 complex data points, and the sampling interval is 0.2 ms. Peaks from an external standard (ES), phosphomonoesters (PME), inorganic phosphate (Pi), phosphodiesters (PDE), and adenosine triphosphate (α-, β-, and γ-ATP) can be observed. The broad resonance underlying the metabolites mentioned originates from phosphorus nuclei in less mobile molecules. In order to reduce its influence, the first five data points are excluded in the time-domain fit. The following prior knowledge was imposed: equal dampings within each of the multiplets of ATP; a 16 Hz frequency splitting between peaks belonging to the same multiplet; and amplitude ratios of 1:2:1, 1:1, and 1:1 for the β-, α-, and γ-ATP multiplets, respectively. The dampings of PME and Pi were constrained to be equal. The starting time was estimated, and the phases φ_k of all peaks were constrained to be equal. The model function used was Lorentzian. The resulting fit with AMARES is shown in Fig. 3.

The 1H signal was obtained from a volume of 2 × 2 × 2 cm³ in the human brain, acquired at 63.6 MHz (1.5 T magnetic field) using a volume-selective double echo PRESS sequence with echo time TE = 135 ms. The spectrum is the sum of 128 signals acquired every 2 s. The starting time and zero-order phase were estimated, and the resulting fit with AMARESf is shown in Fig. 4. The quantified spectral peaks correspond to total creatine (creatine + phosphocreatine) (peaks denoted by 1), choline (Cho) (peak denoted by 2), and N-acetyl-aspartate (NAA) (peak denoted by 3). The peak at 0 kHz is the residual water signal.
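To make the notion of imposed prior knowledge concrete, the sketch below builds a model FID for one ATP multiplet in which the 16 Hz splitting, the fixed amplitude ratios, and a common phase are hard-coded rather than fitted, so that only four parameters per multiplet remain free. The center frequency and damping used in the call are placeholders for illustration, not the values estimated from the data of Fig. 3; the number of points and sampling interval do follow the description above.

```python
import numpy as np

def atp_multiplet_model(n, dt, f0, damp, amp, phase, pattern, split=16.0):
    """One multiplet as a sum of Lorentzian (damped exponential) lines.

    pattern encodes the fixed amplitude ratios, e.g. (1, 2, 1) for beta-ATP and
    (1, 1) for alpha- or gamma-ATP; split is the fixed 16 Hz frequency splitting.
    Only f0, damp, amp, and phase would be estimated from the data.
    """
    t = np.arange(n) * dt
    offsets = (np.arange(len(pattern)) - (len(pattern) - 1) / 2) * split
    fid = np.zeros(n, dtype=complex)
    for w, df in zip(pattern, offsets):
        fid += w * amp * np.exp(1j * phase) * np.exp((2j * np.pi * (f0 + df) - damp) * t)
    return fid

# Example: a beta-ATP-like 1:2:1 triplet, 128 points, 0.2 ms dwell time
fid = atp_multiplet_model(n=128, dt=2e-4, f0=-1000.0, damp=40.0,
                          amp=1.0, phase=0.0, pattern=(1, 2, 1))
```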
6 CONCLUSIONS
In this chapter an overview was given of time-domain algorithms used to estimate the parameters of MRS signals. On the one hand, interactive methods exist that fit a model function to the data by minimizing a cost function. These methods are very flexible in terms of possible model functions and the prior knowledge that can be imposed. If the assumptions concerning the model function and the noise are satisfied, ML parameter estimates are obtained. The drawback is that a solution to a difficult minimization problem has to be found. Black-box methods, on the other hand, are easy to use and require no expert knowledge, but they are restricted in that they can only be applied to uniformly sampled complex damped exponentials, and only limited forms of prior knowledge can be imposed. Their use in demanding in vivo applications is therefore limited. LP methods and state-space variants exist. Most algorithms only approximately solve the LP or state-space problem setting. It was shown here that the LP and state-space
problems provide ML parameter estimates if they are solved without introducing approximations; however, this again requires the solution of a minimization problem. A short overview was also given of time-domain techniques that can be used to improve the quantitation when model distortions or unwanted/unknown components are present.

FIGURE 3 Analysis of an ex vivo 31P signal from a perfused rat liver with AMARES. From bottom to top: the FT spectrum of the original signal, the individual fitted Lorentzians, and the residual, which is the difference between the original signal and the reconstructed signal.

FIGURE 4 Analysis of an in vivo 1H signal from within the white matter of the human brain with AMARESf. From bottom to top: the FT spectrum of the original signal, the individual fitted Lorentzians, and the residual, which is the difference between the original signal and the reconstructed signal.

ACKNOWLEDGMENTS

This work is supported by the Belgian Programme on Interuniversity Poles of Attraction (IUAP-4/2 & 24), initiated by the Belgian State, Prime Minister's Office for Science, Technology and Culture, by the EU Programme "Training and Mobility of Researchers" project (contract ERBFMRXCT970160) entitled "Advanced signal processing for medical magnetic resonance imaging and spectroscopy," by a Concerted Research Action (GOA) project of the Flemish Community, entitled "Mathematical Engineering for Information and Communication Systems Technology," and by the Fund for Scientific Research-Flanders (F.W.O.) grant G.0360.98. LV is a postdoc funded by the F.W.O. SVH is a senior research associate with the F.W.O.
REFERENCES

1. I. Marshall, J. Higinbotham, S. Bruce, A. Freise. Use of Voigt lineshape for quantification of in vivo 1H spectra. Magn. Reson. Med. 37:651–657, 1997.
2. J. Slotboom, C. Boesch, R. Kreis. Versatile frequency domain fitting using time domain models and prior knowledge. Magn. Reson. Med. 39:899–911, 1998.
3. R. A. de Graaf. In Vivo NMR Spectroscopy. John Wiley, Chichester, 1998.
4. Š. Mierisová, A. van den Boogaart, I. Tkáč, P. Van Hecke, L. Vanhamme, T. Liptaj. New approach for quantitation of short echo time in vivo 1H MR spectra of brain using AMARES. NMR Biomed. 11:32–39, 1998.
5. R. Bartha, D. J. Drost, P. C. Williamson. Factors affecting the quantification of short echo in-vivo 1H MR spectra: prior knowledge, peak elimination, and filtering. NMR Biomed. 12:205–216, 1999.
6. S. Provencher. Estimation of metabolite concentrations from localized in vivo proton NMR spectra. Magn. Res. Med. 30:672–679, 1993.
7. D. Graveron-Demilly, A. Diop, A. Briguet, B. Fenet. Product-operator algebra for strongly coupled spin systems. J. Magn. Reson. (Series A) 101:233–239, 1993.
8. K. Young, V. Govindaraju, B. J. Soher, A. A. Maudsley. Automated spectral analysis I: formation of a priori information by spectral simulation. Magn. Res. Med. 40:812–815, 1998.
9. R. B. Thompson, P. S. Allen. Quantification of the coupled 1H metabolites using PRESS—numerical modeling and basis function calculation. In: Proc. 7th Scientific Meeting ISMRM, Philadelphia, May 1999, p. 1569.
10. R. B. Thompson, P. S. Allen. The response of coupled spins to the STEAM sequence—improving quantification. In: Proc. 7th Scientific Meeting ISMRM, Philadelphia, May 1999, p. 587.
11. R. B. Thompson, P. S. Allen. Sources of variability in the response of coupled spins to the PRESS sequence and their potential impact on metabolite quantification. Magn. Reson. Med. 41:1162–1169, 1999.
12. G. H. Golub, V. Pereyra. The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM J. Numer. Anal. 10(2):413–432, 1973.
13. S. Cavassila, S. Deval, C. Huegen, D. van Ormondt, D. Graveron-Demilly. The beneficial influence of prior knowledge on the quantitation of in vivo magnetic resonance spectroscopy signals. Investigative Radiology 34(3):242–246, 1999.
14. K. Sekihara, N. Ohyama. Parameter estimation for in vivo magnetic resonance spectroscopy (MRS) using simulated annealing. Magn. Reson. Med. 13:332–339, 1990.
15. F. S. DiGennaro, D. Cowburn. Parametric estimation of time-domain NMR signals using simulated annealing. J. Magn. Reson. 96:582–588, 1992.
16. G. J. Metzger, M. Patel, X. Hu. Application of genetic algorithms to spectral quantification. J. Magn. Reson. (Series B) 110:316–320, 1996.
17. J. W. C. van der Veen, R. de Beer, P. R. Luyten, D. van Ormondt. Accurate quantification of in vivo 31P NMR signals using the variable projection method and prior knowledge. Magn. Res. Med. 6:92–98, 1988.
18. M. R. Osborne. Some aspects of non-linear least squares calculations. In: F. A. Lootsma, ed. Numerical Methods for Non-Linear Optimization. Academic Press, London, 1972.
19. L. Vanhamme, A. van den Boogaart, S. Van Huffel. Improved method for accurate and efficient quantification of MRS data with use of prior knowledge. J. Magn. Reson. 129:35–43, 1997.
20. L. Vanhamme. Advanced time-domain methods for nuclear magnetic resonance spectroscopy data analysis. Ph.D. thesis, Dept. of Electr. Eng., Katholieke Universiteit Leuven, November 1999. Also downloadable from ftp.esat.kuleuven.ac.be/sista/members/vanhamme/reports/phd.ps.gz.
21. M. I. Miller, A. S. Greene. Maximum-likelihood estimation for nuclear magnetic resonance spectroscopy. J. Magn. Reson. 83:525–548, 1989.
22. S. C. Chen, T. J. Schaewe, R. S. Teichman, M. I. Miller, S. N. Nadel, A. S. Greene. Parallel algorithms for maximum-likelihood nuclear magnetic resonance spectroscopy. J. Magn. Reson. (Series A) 102:16–23, 1993.
23. Z.-S. Liu, J. Li, P. Stoica. RELAX-based estimation of damped sinusoidal signal parameters. Signal Processing 62(3):311–321, 1997.
24. S. Umesh, D. W. Tufts. Estimation of parameters of exponentially damped sinusoids using fast maximum likelihood estimation with application to NMR spectroscopy data. IEEE Trans. Signal Processing 44(9):2245–2259, 1996.
25. Z. Bi, A. P. Bruner, J. Li, K. N. Scott, Z.-S. Liu, C. B. Stopka, H.-W. Kim, D. C. Wilson. Spectral fitting of NMR spectra using an alternating optimization method with a priori knowledge. J. Magn. Reson. 140:108–119, 1999.
26. D. S. Stephenson. Linear prediction and maximum entropy methods in NMR spectroscopy. Prog. in NMR Spectrosc. 20:515–626, 1988.
27. H. Gesmar, J. J. Led, F. Abildgaard. Improved methods for quantitative spectral analysis of NMR data. Prog. in NMR Spectrosc. 22:255–288, 1990.
28. P. Koehl. Linear prediction spectral analysis of NMR data. Prog. in NMR Spectrosc. 34:257–299, 1999.
29. R. de Beer, D. van Ormondt. Analysis of NMR data using time-domain fitting procedures. In: NMR Basic Principles and Progress, Vol. 26. Springer-Verlag, Berlin/Heidelberg, 1992, pp. 201–248.
30. R. Kumaresan, D. W. Tufts. Estimating the parameters of exponentially damped sinusoids and pole-zero modeling in noise. IEEE Trans. Acoustics, Speech and Signal Processing ASSP-30(6):833–840, 1982.
31. H. Barkhuysen, R. de Beer, W. M. M. J. Bovée, D. van Ormondt. Retrieval of frequencies, amplitudes, damping factors, and phases from time-domain signals using a linear least-squares procedure. J. Magn. Reson. 61:465–481, 1985.
32. C. F. Tirendi, J. F. Martin. Quantitative analysis of NMR spectra by linear prediction and total least squares. J. Magn. Reson. 85:162–169, 1989.
33. W. Kölbel, H. Schäfer. Improvement and automation of the LPSVD algorithm by continuous regularization of the singular values. J. Magn. Reson. 100:598–603, 1992.
34. A. Diop, A. Briguet, D. Graveron-Demilly. Automatic in vivo NMR data processing based on an enhancement procedure and linear prediction method. Magn. Reson. Med. 27:318–328, 1992.
35. A. Diop, W. Kölbel, D. Michel, A. Briguet, D. Graveron-Demilly. Full automation of quantitation of in vivo NMR by LPSVD(CR) and EPLPSVD. J. Magn. Reson. (Series B) 103:217–221, 1994.
36. H. Chen, S. Van Huffel, C. Decanniere, P. Van Hecke. A signal enhancement algorithm for time-domain data MR quantification. J. Magn. Reson. (Series A) 109:46–55, 1994.
37. B. De Moor. Structured total least squares and L2 approximation problems. In: Linear Algebra and Its Applications, Special Issue on Numerical Linear Algebra Methods in Control, Signals and Systems (Van Dooren, Ammar, Nichols, and Mehrmann, eds.), 188–189:163–207, 1993.
38. B. De Moor. Total least squares for affinely structured matrices and the noisy realization problem. IEEE Trans. Signal Processing 42:3004–3113, 1994.
39. J. B. Rosen, H. Park, J. Glick. Total least norm formulation and solution for structured problems. SIAM Journal on Matrix Anal. Appl. 17(1):110–128, 1996.
40. T. J. Abatzoglou, J. M. Mendel. Constrained total least squares. In: Proc. of IEEE International Conf. on Acoustics, Speech and Signal Processing, Dallas, TX, 1987, pp. 1485–1488.
41. T. J. Abatzoglou, J. M. Mendel, G. A. Harada. The constrained total least squares technique and its applications to harmonic superresolution. IEEE Trans. Signal Processing 39:1070–1086, 1991.
42. S. Van Huffel, H. Park, J. B. Rosen. Formulation and solution of structured total least norm problems for parameter estimation. IEEE Trans. Signal Processing 44(10):2464–2474, 1996.
43. Y. Bresler, A. Macovski. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Trans. Acoust., Speech and Signal Processing ASSP-34:1081–1089, 1986.
44. G. Zhu, W. Y. Choy, B. C. Sanctuary. Spectral parameter estimation by an iterative quadratic maximum likelihood method. J. Magn. Reson. 135:37–43, 1998.
45. P. Lemmerling. Structured total least squares: analysis, algorithms and applications. Ph.D. thesis, Dept. of Electr. Eng., Katholieke Universiteit Leuven, 1999.
46. S. Y. Kung, K. S. Arun, D. V. Bhaskar Rao. State-space and singular-value decomposition-based approximation methods for the harmonic retrieval problem. J. Opt. Soc. Am. 73(12):1799–1811, 1983.
47. H. Barkhuysen, R. de Beer, D. van Ormondt. Improved algorithm for noniterative time-domain model fitting to exponentially damped magnetic resonance signals. J. Magn. Reson. 73:553–557, 1987.
48. S. Van Huffel, H. Chen, C. Decanniere, P. Van Hecke. Algorithm for time-domain NMR data fitting based on total least squares. J. Magn. Reson. (Series A) 110:228–237, 1994.
49. Y. Hua, T. K. Sarkar. Matrix pencil method for estimating parameters of exponentially damped/undamped sinusoids in noise. IEEE Trans. Acoust., Speech and Signal Processing ASSP-38:814–824, 1990.
50. Y.-Y. Lin, P. Hodgkinson, M. Ernst, A. Pines. A novel detection-estimation scheme for noisy NMR signals: applications to delayed acquisition data. J. Magn. Reson. 128:30–41, 1997.
51. B. D. Rao. Relationship between matrix pencil and state space based harmonic retrieval methods. IEEE Trans. Acoust., Speech and Signal Processing 38(1):177–179, 1990.
52. H. Chen. Subspace-based parameter estimation of exponentially damped sinusoids with application to nuclear magnetic resonance spectroscopy data. Ph.D. thesis, Dept. of Electr. Eng., Katholieke Universiteit Leuven, 1996.
53. H. Chen, S. Van Huffel, D. van Ormondt, R. de Beer. Parameter estimation with prior knowledge of known signal poles for the quantification of NMR spectroscopy data in the time domain. J. Magn. Reson. (Series A) 119:225–234, 1996.
54. H. Chen, S. Van Huffel, J. Vandewalle. Improved methods for exponential parameter estimation in the presence of known poles and noise. IEEE Trans. Signal Process. 45(5):1390–1393, 1997.
55. H. Chen, S. Van Huffel, A. J. W. Van den Boom, P. P. J. Van den Bosch. Subspace-based parameter estimation of exponentially damped sinusoids using prior knowledge of frequency and phase. Sign. Proc. 59:129–136, 1997.
56. S. Van Huffel, D. van Ormondt. Subspace-based exponential data modeling using prior knowledge. In: Proc. of the IEEE Benelux Chapter Signal Processing Symposium (IEEEBSPS), Leuven, Belgium, March 1998, pp. 211–214.
57. S. Van Huffel, L. Aerts, J. Bervoets, J. Vandewalle, C. Decanniere, P. Van Hecke. Improved quantitative time-domain analysis of NMR data by total least squares. In: J. Vandewalle, R. Boite, M. Moonen, A. Oosterlinck, eds. Proc. EUSIPCO 92, Vol. 3, Brussels, Belgium, August 24–27. Elsevier, 1992, pp. 1721–1724.
58. R. Ordidge, I. Cresshull. The correction of transient B0 field shifts following the application of pulsed gradients by phase correction in the time domain. J. Magn. Reson. 69:151–155, 1986.
59. U. Klose. In vivo proton spectroscopy in presence of eddy currents. Magn. Reson. Med. 14:26–30, 1990.
60. G. A. Morris. Compensation of instrumental imperfections by deconvolution using an internal reference signal. J. Magn. Reson. 80:547–552, 1988.
61. A. Gibbs, G. A. Morris. Reference deconvolution. Elimination of distortions arising from reference line truncation. J. Magn. Reson. 91:77–83, 1991.
62. A. A. de Graaf, J. E. van Dijk, W. M. M. J. Bovée. QUALITY: Quantification improvement by converting lineshapes to the Lorentzian type. Magn. Reson. Med. 13:343–357, 1990.
63. A. A. Maudsley, Z. Wu, D. J. Meyerhoff, M. W. Weiner. Automated processing for proton spectroscopic imaging using water reference deconvolution. Magn. Reson. Med. 31:589–595, 1994.
64. P. G. Webb, N. Sailasuta, S. J. Kohler, T. Raidy, R. A. Moats, R. E. Hurd. Automated single-voxel proton MRS: technical development and multisite verification. Magn. Reson. Med. 31:365–373, 1994.
65. A. A. Maudsley. Spectral lineshape determination by self-deconvolution. J. Magn. Reson. (Series B) 106:47–57, 1995.
66. D. Spielman, P. Webb, A. Macovski. Water referencing for spectroscopic imaging. Magn. Reson. Med. 12:38–49, 1989.
67. P. Webb, D. Spielman, A. Macovski. Inhomogeneity correction for in vivo spectroscopy by high-resolution water referencing. Magn. Reson. Med. 23:1–11, 1992.
68. J. Raz, T. Chenevert, E. J. Fernandez. A flexible spline model of the spin echo with applications to estimation of the spin-spin relaxation time. J. Magn. Reson. (Series A) 111:137–149, 1994.
69. L. Hofmann, J. Slotboom, C. Boesch, R. Kreis. Model fitting of 1H-MR spectra of the human brain: incorporation of short-T1 components and evaluation of parametrized vs. non-parametrized models. In: Proc. 7th Scientific Meeting ISMRM, Philadelphia, PA, 1999, p. 586.
70. K. Young, B. J. Soher, A. A. Maudsley. Automated spectral analysis II: application of wavelet shrinkage for characterization of non-parameterized signals. Magn. Res. Med. 40:816–821, 1998.
71. A. Knijn, R. de Beer, D. van Ormondt. Frequency-selective quantification in the time domain. J. Magn. Reson. 97:444–450, 1992.
72. A. van den Boogaart, D. van Ormondt, W. W. F. Pijnappel, R. de Beer, M. Ala-Korpela. Removal of the water resonance from 1H magnetic resonance spectra. In: J. G. McWhirter, ed. Mathematics in Signal Processing III. Clarendon Press, Oxford, 1994, pp. 175–195.
73. J. H. J. Leclerc. Distortion-free suppression of the residual water peak in proton spectra by postprocessing. J. Magn. Reson. (Series B) 103:64–67, 1994.
74. W. W. F. Pijnappel, A. van den Boogaart, R. de Beer, D. van Ormondt. SVD-based quantification of magnetic resonance signals. J. Magn. Reson. 97:122–134, 1992.
75. L. Vanhamme, R. D. Fierro, S. Van Huffel, R. de Beer. Fast removal of residual water in proton spectra. J. Magn. Reson. 132:197–203, 1998.
76. L. Vanhamme, T. Sundin, P. Van Hecke, S. Van Huffel, R. Pintelon. Frequency-selective quantification of biomedical magnetic resonance spectroscopy data. J. Magn. Reson. 143(1):1–16, 2000.
77. Y. Kuroda, A. Wada, T. Yamazaki, K. Nagayama. Postacquisition data processing method for suppression of the solvent signal. J. Magn. Reson. 84:604–610, 1989.
78. D. Marion, M. Ikura, A. Bax. Improved solvent suppression in one- and two-dimensional NMR spectra by convolution of time-domain data. J. Magn. Reson. 84:425–430, 1989.
79. S. Cavassila, B. Fenet, A. van den Boogaart, C. Remy, A. Briguet, D. Graveron-Demilly. ER-Filter: a preprocessing technique for frequency-selective time-domain analysis. J. Magn. Reson. Anal. 3:87–92, 1997.
80. I. Dologlou, S. Van Huffel, D. van Ormondt. Frequency-selective MRS data quantification with frequency prior knowledge. J. Magn. Reson. 130:238–243, 1998.
81. T. Sundin, L. Vanhamme, P. Van Hecke, I. Dologlou, S. Van Huffel. Accurate quantification of 1H spectra: from finite impulse response filter design for solvent suppression to parameter estimation. J. Magn. Reson. 139:189–204, 1999.
82. L. Vanhamme, S. Van Huffel, P. Van Hecke, D. van Ormondt. Time-domain quantification of series of biomedical magnetic resonance spectroscopy signals. J. Magn. Reson. 140:120–130, 1999.
83. L. Vanhamme, S. Van Huffel. Multichannel quantification of biomedical magnetic resonance spectroscopic signals. In: Proc. SPIE Conference on Advanced Signal Processing: Algorithms, Architectures, and Implementations VIII, San Diego, CA, 1998, pp. 237–248.
84. Y. Wang, S. Van Huffel, N. Mastronardi. Quantification of resonances in magnetic resonance spectra via principal component analysis and Hankel total least squares. In: Proc. of the Program for Research on Integrated Systems and Circuits (ProRISC99), Mierlo, The Netherlands, 1999, pp. 585–591.
17 Multidimensional NMR Spectroscopic Signal Processing Guang Zhu The Hong Kong University of Science and Technology, Kowloon, Hong Kong, China
Yingbo Hua University of California, Riverside, California
1 INTRODUCTION
Nuclear magnetic resonance (NMR) spectroscopy has many applications in different fields [1]. One of these applications is the study of the structure and function of biomolecules in solution, which can mimic physiological conditions [2]. Many earlier NMR spectroscopic studies involved continuous-wave NMR, a type of one-dimensional (1-D) NMR. The application of Fourier transforms to NMR [3] greatly changed NMR spectroscopy and dramatically improved sensitivity. The introduction of two-dimensional (2-D) NMR spectroscopy during the early 1980s revolutionized traditional 1-D NMR spectroscopy and made it easier to study more complex molecular systems, such as proteins and nucleic acids [1,2,4]. As the molecular weight of biomolecules and the number of hydrogen atoms under NMR spectroscopic investigation increase, the spectral overlap becomes more severe. It is therefore increasingly difficult to analyze such 1-D spectra, as is shown
in Fig. 1. 2-D NMR experiments can be performed to spread the one-dimensional resonance peaks into an orthogonal dimension in order to alleviate the overlap problem of 1-D NMR experiments. For the same reason, three-dimensional (3-D) and four-dimensional (4-D) NMR experiments have been developed to study even bigger biomolecules [5,6]. However, there are problems associated with the multidimensional (n-D) NMR approaches. Because of technical considerations relating to the stability of biomolecules and of the NMR spectrometers, the time needed to acquire an NMR data set is normally confined to less than a week. Due to this limitation, the acquisition of 3-D and 4-D data must be truncated. This truncation introduces the so-called truncation artifact when the discrete Fourier transform (DFT) is applied to obtain the corresponding 3-D and 4-D NMR spectra, resulting in spectral line broadening and introducing sinc wiggles, or side bands, which reduce the sensitivity and resolution of the spectra, hence hindering accurate analysis.
FIGURE 1 The 1-D 1H spectrum of loloatin (10 amino acid residues) (A) and the 1-D 1H spectrum of BC domain of CNTFR (109 amino acid residues) (B) in H2O recorded on a Varian Inova 750 MHz NMR spectrometer.
To minimize the truncation artifacts in 3-D and 4-D NMR spectra, high resolution methods are available. Linear prediction methods [7–10] and maximum entropy methods [10–14] have been documented widely in NMR spectroscopy research literature. The most commonly used in n-D NMR data processing is, however, the linear prediction method, which is applied to extrapolate time domain data before Fourier transformation. In this chapter, we will review these methods and a few other related methods. In addition, we will show that the multidimensional NMR signals have a structure that has been thoroughly investigated by researchers in the signal processing community. In particular, all the parameters of the n-D NMR signals can, in principle, be easily retrieved by using the matrix pencil method incorporating the concept of subspace decomposition. 2
A MATHEMATICAL MODEL OF THE MULTIDIMENSIONAL NMR SIGNAL
To observe a nuclear resonance transition of the biological sample placed in an external static magnetic field B0 , a radio frequency electromagnetic field B1 is applied to it, at the frequency close to the resonance frequency of the nuclei under study. The detected time domain signal is termed a free induction decay (FID) signal, which is subsequently Fourier transformed to deconvolute the different precession frequencies, yielding a 1-D NMR spectrum as shown in Fig. 1. Mathematically, a 1-D FID with a single resonance can be expressed as M(t) = M0 e ⫺t/T2 e i t⫹i
(1)
where M0 and T2 are the amplitude and the transverse relaxation time of a magnetization, respectively, is the precession frequency, and is the initial phase. This continuous FID is sampled at discrete time interval ⌬t, the dwell time. The intensity of this signal is digitized by an analog-to-digital converter. The time Tacq = N ⌬t is the total acquisition time needed to record an FID of N complex data points. Subjecting an FID signal to a discrete Fourier transformation (DFT) yields an NMR spectrum with a spectral width of SW = 1/⌬t. If the digital resolution of a spectrum is required to be ⌬ (Hz), the acquisition time Tacq should be set to 1/⌬. In general, for time-domain multidimensional NMR signals, the n-D FID, can be expressed as
冘冘 冘 N1
S(t1 , t 2 , . . . , tn) =
N2
Nn
⭈⭈⭈
l1 =1 l2 =1
al1 ,l2 , . . . , ln exp{( jl1 ⫺ R2l1)t1 ⫹ jl1)}
ln =1
exp{( jl 2 ⫺ R2l 2)t2 ⫹ jl 2)} . . . exp{( jln ⫺ R2ln)tn ⫹ jln)}
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(2)
512
Zhu and Hua
where Nm , m = 1, 2, . . . , n, represent the number of resonances in the mth dimension; R2l m , l m , l m , denote the damping factor, frequency, and phase of the lth resonance in that dimension; al1 ,l 2 , . . . , ln are the amplitude of ndimensional peaks resonating at l1 , l 2 , . . . , ln , and j = 兹⫺1. It is always desirable to analyze multidimensional NMR data in their frequency domain obtained by the Fourier transform of the above FID. Since all terms in Eq. (2) are the simple products of exp{( jl1 ⫺ R2l1)t1 ⫹ jl1)}, exp{( jl 2 ⫺ R2l 2)t1 ⫹ jl 2)}, . . . , and exp{( jl n ⫺ R2l n)t1 ⫹ jl n)}, and have no interference terms, the multidimensional Fourier transformation of an n-D FID, S(t1 , t 2 , . . . , tn), can be simplified to the successive use of 1-D Fourier transform to each dimension. Generally, multidimensional NMR spectra are presented and analyzed by 2-D cross sections taken from hypercubes, and newly developed computer software can be easily used to analyze multidimensional NMR spectra [10].
3
SPECTRAL FOLDING AND BASELINE DISTORTION IN THE DFT NMR SPECTRA
Since its introduction to NMR in 1966 [3], the Fourier transform has become an indispensable tool used to process acquired NMR data and to convert the time-domain signals into frequency-domain spectra. In principle, Fourier transformation is a straightforward integral transformation and can be described as follows: given a time domain FID S(t), the corresponding frequency response F() is computed from
冕
⬁
F() =
S(t)exp(⫺i2t) dt
(3)
⫺⬁
where t and stand for time and frequency, respectively. In practice however, only the discrete Fourier transform (DFT) can be employed, since the acquired NMR data are sampled discretely. The DFT is in the form F
冉 冊 冘 l N ⌬t
N⫺1
=
k=0
S(k ⌬t)exp
冉
⫺i2lk N
冊
(4)
where S is the FID composed of N data points, ⌬t is the sampling interval or the dwell time, and l and k indicate the lth and kth data points in the time and frequency domains, respectively. The DFT yields data only at frequencies = l/N ⌬t. It is well known that a delay in sampling the first data point of the FID can result in distortion of the spectral baseline (offset and curvature) of the corresponding frequency-domain spectrum [15–17]. The baseline distortion is problematic when one attempts to analyze accurately
smaller peaks next to peaks several orders larger, since the baseline distortion is proportional to the amplitudes of resonances. The spectral properties associated with the first data point can be analyzed mathematically, and methods to minimize the baseline distortion can be proposed. To simplify the discussion, S(t) is considered to consist of a single exponentially damped sinusoid, and the FID is given by Sk = S(k ⌬t) = a exp[(i20 ⫺ R2)(k ⌬t ⫹ Td) ⫹ i] k = 0, 1, 2, . . . , N ⫺ 1
(5)
where , a, R2 , and 0 are the initial phase, amplitude, damping factor, and frequency of the resonance considered, respectively, and Td is the sampling delay between the RF pulse and the first data point of the FID. For complex data, the initial phase can be removed by multiplication of the FID by exp(⫺i), and this term can be ignored. The spectrum is obtained by substituting Eq. (5) into Eq. (4) and carrying out the summation as follows:
冘 N⫺1
F() =
A exp{[k(i2 (0 ⫺ ) ⫺ R2) ⌬t]}
k=0
=A
1 ⫺ exp{k(i2 (0 ⫺ ) ⫺ R2) ⌬t]Tacq} 1 ⫺ exp{k(i2 (0 ⫺ ) ⫺ R2) ⌬t]/SW}
(6)
where the spectral width SW equals 1/⌬t, the total acquisition time Tacq equals N ⌬t, and A = a exp[(i20 ⫺ R2)Td]. The equation above reduces to a Lorentizan in the limits ⌬t → 0 and Tacq → ⬁. As A in Eq. (6) contains a linear frequency-dependent phase term exp(i20Td), a linear phase correction is generally used to make all signals absorptive in order to obtain the highest spectral resolution possible. The linear phase correction involves multiplication of the nth data point by exp{i(0 ⫹ 1n/N)} when the frequency range is within [⫺SW/2, SW/2], where 0 and 1 are the zeroth- and first-order phase corrections for the spectrum of N complex data points. Because phase correction at 0 = 0 is not needed, it immediately yields that 0 = ⫺1/2 and 1 = 2iTd /⌬t. To increase the digital resolution of 3-D and 4-D heteronuclear NMR spectra, it is necessary to limit the size of the spectral windows in the indirectly detected dimensions to less than the actual frequency dispersion of resonances. For example, setting Td = ⌬t/2, half a dwell time, 0 = ⫺90⬚, and 1 = 180⬚, resonances that have been aliased will appear with opposite phase with respect to nonaliased resonances. This enables separation of aliased and nonaliased resonances as shown in Fig. 2.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
514
Zhu and Hua
FIGURE 2 (a) and (b) Simulated 1-D spectra with Td = 0.75⌬t obtained by DFT with the respective scaling factors 0.75 and 0.6 on the first data points. (c) The same spectrum obtained with the Td = 0.5⌬t with no scaling on the first data point. (d) The spectrum obtained if the spectral window for the spectrum in (c) is reduced by 33%. For (c) and (d) the linear phase correction is 180⬚. (Reproduced with permission from Ref. 15.)
Equation (6) can be expanded based on the following expansion: exp[b(1 ⫺ x)] = 1 ⫹ b(1 ⫺ x) ⫹
冉冊 b2 4
(1 ⫺ x)2 ⫹ ⭈⭈⭈
(7)
and csch(b) =
1 b 7b3 ⫺ ⫹ ⫹ ⭈⭈⭈ b 12 2880
(8)
where b is a constant. A ‘‘phased’’ spectrum of the following form can be obtained with 0 = 2 (0 ⫺ ), h = SW/R2 , and x = 2Td /⌬t. Re[F()] = a[L0(0) ⫹ f0(x) ⫹ f1(0 , x) ⫹ ⭈⭈⭈]
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(9)
Multidimensional NMR Spectroscopic Signal Processing
515
with L(0) = f0 =
再
h 1 ⫹ (0 /R 2)2
(1 ⫺ x) [(1/3) ⫺ (1 ⫺ x)2] [x(1 ⫺ x)(2 ⫺ x)] ⫺ ⫺ 2 8h 48h2 [7 ⫺ 30(1 ⫺ x)2 ⫹ 15(1 ⫺ x)4] ⫹ ⫹ ⭈⭈⭈ 5760h3
冎
and f1(0 , x) =
(0 /SW)2[x(1 ⫺ x)(2 ⫺ x)] 48 ⫹
(0 /SW)2[7 ⫺ 30(1 ⫺ x)2 ⫹ 15(1 ⫺ x)4] ⫹ ⭈⭈⭈ 1920h
Equation (9) describes the relationship between x and the baseline. The first term in this equation gives rise to a Lorenzian-type peak. The function f0(x) describes a constant baseline offset, whereas f1(0 , x) represents baseline curvature. Important baseline properties of DFT spectra can be readily obtained from the above equation. When the spectral width is much larger than the linewidth, i.e., h >> 1, one obtains f0(x) = (1 ⫺ x)/2 and f1(0 , x) = (0 /SW)2x(1 ⫺ x)(2 ⫺ x)/48. Thus the offset of the baseline can only be zero when x is 1, namely, the delay time is one-half the dwell time, whereas curvature can only be zero when x equals 0, 1, and 2, that is, when Td is zero, one-half, and one dwell time. It is proposed that baseline distortion can be corrected with the multiplication of the first data point S0 by a factor (1 ⫹ x)/2 [17]. However, detailed studies reveal that the spectra obtained with unscaled phase-corrected Fourier transform after subtracting [1 ⫺ x]S0 /2 have better baseline properties [16]. 4
THREE-DIMENSIONAL MAXIMUM ENTROPY METHOD
Maximum entropy methods (MEM) [11–14,18–31] have been used for minimizing the resulting truncation effects in the corresponding Fourier transformed spectra and for optimizing the resolution. The application of maximum entropy methods to one-dimensional and two-dimensional [11–13,18–31] NMR data has been discussed in great detail elsewhere. Below, the discussion will be limited to the details pertinent to our implementation of the algorithm for the processing of 3-D NMR spectra. Although there is no fundamental difference in the application of
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
516
Zhu and Hua
MEM to 1-D, 2-D, and 3-D NMR data, the size of the data matrix that needs to be considered rapidly increases with the dimensions of the data set, resulting in a rapid decrease in the stability of the algorithm and a steep increase in the computational time needed for MEM processing. The 3-D MEM is superior to the 1-D and 2-D MEM in processing 3-D and 4-D NMR data because the accumulated computational error can be greatly reduced, and information about resonance in all three dimensions are simultaneously used to reconstruct the MEM spectrum. Analogous to the solution used by Mazzeo et al. [23] for processing large 1-D data sets, we propose to apply 3-D MEM to small but crowded 3-D sections of 3-D or 4-D Fourier transformed spectra. The described implementation of 3-D MEM is based on the algorithm of Gull and Daniell [19], and the application of this algorithm was previously discussed in detail by Delsuc [26]. Given a 3-D experimental FID of noisy Nt1 ⫻ Nt2 ⫻ Nt3 hypercomplex data points Dijk , where Ntk is the number of complex data points in the kth dimension, one seeks a frequency domain spectrum F(1 , 2 , 3) consisting of Nf 1 ⫻ Nf 2 ⫻ Nf 3 real points Flmn with l = 1, . . . , Nf 1 ; m = 1, . . . , Nf 2 ; n = 1, . . . , Nf 3 , and Nf k ⱖ 2Ntk , so that the following criteria are satisfied. 1.
The spectral entropy S is maximized with S being defined as
冘
Nf 1Nf 2Nf 3
S=⫺
Flmn ln Flmn
Flmn > 0
(10)
l=1,m=1,n=1
2.
The square difference
冘
Nf 1Nf 2Nf 3
= 2
l=1,m=1,n=1
(Dijk ⫺ Rijk)2 2ijk
(11)
is converged to 2Nt1 ⫻ 2Nt2 ⫻ 2Nt3 , where Rijk is the inverse Fourier transformation of the MEM spectrum F(1 , 2 , 3), and ijk is the standard deviation of the time domain noise at the coordinate (i, j, k), which is commonly considered as a constant. The 3-D MEM spectrum Flmn , which is satisfied with the above conditions and is obtained by maximizing Q = S ⫺ 2 with the correct choice of the Lagrangian multiplier , such that 2 ⬵ 2Nt1 ⫻ 2Nt2 ⫻ 2Nt3 , can be expressed as [12,13,18–29] Flmn = A exp
再
⫺1 ⫹ 2
冘
Nf 1 ,Nf 2 ,Nf 3
i=1, j=1,k=1
where
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
lmn T ijk
冋
册冎
(Dijk ⫺ Rijk)Wijk 2ijk
(12)
Multidimensional NMR Spectroscopic Signal Processing
Wijk =
再
517
i ⱕ Nt1 , j ⱕ Nt2 , k ⱕ Nt3 Nt1 < i ⱕ Nf 1 , Nt2 < j ⱕ Nf 2 , Nt3 < k ⱕ Nf 3
1 0
and T lmn ijk , the transformation matrix, satisfies the equation ⭸ 2 =2 ⭸Flmn
冘
Nf 1 ,Nf 2 ,Nf 3 lmn T ijk
i=1, j=1,k=1
冋
册
(Dijk ⫺ Rijk)Wijk 2 ijk
(13)
The introduction of the constant A is required when a constraint on the total power is incorporated into the MEM algorithm [21]. The 3-D transformation T lmn expressed in Eq. (13) can be substituted by a 3-D ijk Fourier transformation. The optimal in each iteration is calculated by ⵜS ⭈ ⵜQ = 0 [26]. The 3-D MEM spectrum is obtained based on an iterative method [19]. Beginning with a flat spectrum of amplitude A, the iterative method substitutes the Rijk obtained from the inverse Fourier transformation of the nth iteration F n(1 , 2 , 3) on the right-hand side of Eq. (12) to obtain the solution of the (n ⫹ 1)th iteration F n⫹1(1 , 2 , 3). Due to the exponential form of Eq. (12), this process is very unstable. Several methods have been proposed to stabilize this procedure. In this implementation the so-called fixed-point method [21,26] is used to slow down the iterative procedure in the manner detailed below. First, the new spectrum F n⫹1 new (1 , 2 , 3) is calculated as a linear combination of the result of the previous iteration F nnew(1 , 2 , 3) and the outcome of Eq. (12), F n⫹1(1 , 2 , 3): n n⫹1 F n⫹1 (1 , 2 , 3) new (1 , 2 , 3) = (1 ⫺ ␣)F new(1 , 2 , 3) ⫹ ␣F
(14)
The factor ␣ is used to control how much of the calculated spectrum F n⫹1(1 , 2 , 3) is needed to construct the new spectrum F n⫹1 new (1 , 2 , 3). The factor ␣ can be calculated [26] or given as a constant. A similar approach can also be applied to compute A and by Wu’s method [21]. The optimal An⫹1 of the (n ⫹ 1)th iteration can be computed by the following:
冘 冘
Nt 1Nt 2Nt 3
An⫹1 =
n 兩Dijk⭈R ijk 兩
i=1, j=1,k=1
An
Nt 1Nt 2Nt 3
(15)
兩R 兩
n 2 ijk
i=1, j=1,k=1
with R n being the trial FID of the nth iteration and 兩 兩 being defined as the norm of a complex number. The dot product of two hypercomplex data expressed in Eq. (15) is defined as a sum of the products of the corresponding components. Similar to the fixed-point method used for n⫹1 F n⫹1 new (1 , 2 , 3), Anew is given by
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
518
Zhu and Hua n n⫹1 An⫹1 new = (1 ⫺ )Anew ⫹ A
(16)
The factor  is a constant given by the user, controlling how much the calculated An⫹1 is used for the actual amplitude An⫹1 new . The initial amplitude A0 is given by A0 =
Max 1ⱕiⱕNt 1 ,1ⱕiⱕNt 2 ,1ⱕiⱕNt 3
兩Dijk兩
(17)
Nt 1 Nt 2 Nt 3
Wu’s correction for obtaining the MEM spectrum of each iteration [21] is given as n⫹1 F new =
1 ⫹ (NF nnew / 2) n⫹1 F 1 ⫹ (NF n/ 2)
(18)
To further stabilize the iterative algorithm, the used for each iteration is calculated by n n⫹1 n⫹1 new = (1 ⫺ )new ⫹
(19)
where is the used for the (n ⫹ 1)th iteration and is calculated from ⵜS⭈ ⵜQ = 0. 0 is the small constant (1.0 ⫻ 10⫺4). determines the speed of the convergence and in our experience its value is 0.01–0.2. In addition, if the MEM spectrum of the nth iteration is scaled by a constant a, i.e., F nnew = F n⫹1/a, the speed of convergence can also be controlled by varying a. The iterative process is stopped when the 2 no longer improves more than a given percentage or 2 approaches 2Nt 1 ⫻ 2Nt 2 ⫻ 2Nt 3 . One can also monitor the convergence by observing the angle between ⵜS and ⵜ 2, because at the solution point they are antiparallel to each other. The iterative method described above ensures rapid convergence of the MEM spectra. In order to extract and expand a 3-D section of a 3-D spectrum, it is necessary to apply window functions along the truncated dimensions to obtain the Fourier transformed spectrum before it is processed by the 3-D MEM. This is necessary so that truncation effects do not distribute sinc wiggles outside the region being analyzed. If this is not done, the reconstruction of the MEM spectrum will be distorted as the information of the peaks in the sinc wiggles is lost. By applying window functions, the information of a peak is concentrated at its center, guaranteeing minimal loss of information when the zoom region of a 3-D FT spectrum, to be expanded by the MEM method, is selected. The line broadening due to the application of window functions can be readily deconvoluted with the MEM method [12] by applying the same window function to the exponent of Eq. (12). If the selection of the analyzed spectral region includes a border peak, the final MEM spectra will have a folded portion of this resonance peak due to the use of the Fourier transformation in the MEM algorithm. n⫹1 new
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
n⫹1
Multidimensional NMR Spectroscopic Signal Processing
519
It is important to notice that when the proposed procedure for selecting the spectral region is applied, the noise distribution is no longer constant. The distribution of the noise is modulated according to the apodization functions used. Therefore, in theory, the modulated ijk should be applied in Eq. (12). In practice, less steep apodization functions are employed to prevent the exponent of that equation being divided by a very small number or zero. If this is not necessary, the same window function can be applied. If ijk is used as a constant, only a portion of the original 3-D FID, where the original window functions assume large values, is fitted most effectively by MEM, and the performance of the MEM algorithm is less optimal. In some cases, this may also introduce a small peak between two close resonances. Figure 3A shows the 2-D section of a 3-D zoomed region selected from the Fourier transformed 3-D HCCH-TOCSY spectrum of the protein calmodulin and taken at the 45.8 ppm in the carbon dimension. The original 3-D matrix size of the time domain data is 384*(H) ⫻ 64*(H) ⫻ 24*(C). A sine-squared-bell window function was applied to the F3 dimension, and sine-bell window functions were also applied to the F1 and F2 dimensions. After zero filling and applying a Fourier transformation, a final spectrum with the size of 1024 ⫻ 256 ⫻ 128 was obtained. The size of this section of the 3-D spectrum is 64 ⫻ 32 ⫻ 32. The 1-D Z-trace was taken from the black dot shown on this 2-D slice. Figure 3B shows the MEM spectrum obtained by expanding a region of the Fourier transformed 3-D HCCHTOCSY spectrum. After applying Hilbert and inverse Fourier transformations, the corresponding time domain data with the size of 24* ⫻ 8* ⫻ 12* was obtained. The final size of the MEM spectrum was 64 ⫻ 64 ⫻ 64, and a total of 12 iterations were carried out. Deconvolutions on J couplings and window functions were also applied. The resulting 3-D MEM spectrum showed great improvement in resolution in all three dimensions when compared with the original spectrum in Fig. 3A. The Z-trace was taken from the black dot shown on this 2-D slice. The relatively simple 3-D MEM algorithm described above is capable of increasing the apparent spectral resolution and signal-to-noise ratio in all three dimensions simultaneously, which is difficult to achieve by using 1-D or 2-D MEM methods. The more sophisticated MEM procedure proposed by Skilling and Bryan [20] can also be applied to achieve better stability and convergence. 5
LINEAR PREDICTION IN MULTIDIMENSIONAL NMR SIGNAL PROCESSING
In contrast to the more complex maximum entropy method discussed above, linear prediction (LP) is based on a simple yet effective algorithm. The linear
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
520
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Zhu and Hua
FIGURE 3 (A) 2-D section of a 3-D zoomed region selected from the Fourier transformed 3-D HCCH-TOCSY spectrum of the protein calmodulin and taken at the 45.8 ppm in the carbon dimension. (B) MEM spectrum obtained by expanding a region of the Fourier transformed 3-D HCCH-TOCSY spectrum. The Z-trace was taken from the black dot shown on this 2-D slice. (Reproduced with permission from Ref. 14.)
Multidimensional NMR Spectroscopic Signal Processing
521
prediction method can be used to predict a datum by a linear combination of the data before it, the backward LP, or the data after it, the forward LP. When the LP method was first introduced to NMR signal processing [32–35], it was used to obtain spectral parameters from 1-D FID signals, which are composed of damped sinusoids. The parameters can be used directly to analyze the spectral frequencies, or to reconstruct spectral data as an alternative to the fast Fourier transform (FFT). Unfortunately, LP is much slower than the FFT for large 1-D applications, and its accuracy can be poor when the signal-to-noise ratio is low, a condition that is often encountered in NMR spectral analysis. The spectral quality of the reconstructed LP spectra is not much better than that of the FFT spectra, and a more serious defect of this approach is that sometimes LP introduces spurious peaks. So far, the 1-D LP method has been most frequently used in extending the truncated multidimensional NMR signals to reduce the so-called truncation artifacts, which broadens spectral peaks and introduces sinc wiggles interfering accurate spectral analysis. The 1-D LP extrapolation method is usually able to predict future data accurately throughout many periods of the signal, while other methods, such as the polynomial extrapolation method, generally become seriously inaccurate after one cycle or two. The sinusoidal nature of the NMR signals ideally fits the LP model. Therefore there are many applications of this type of extrapolation technique. For example, 1-D LP can be used to replace missing or distorted points, or as an alternative to zero filling [7–10,36]. The 1-D LP extrapolation method can be readily applied to multidimensional NMR data processing as shown in Fig. 4, since the multidimensional NMR spectra can be considered as the direct product of the 1-D NMR spectrum. The basis for this application is Prony’s LP method, which is described in Chap. 18. Here, for the consistency with our description, we briefly outline this method again as follows. Prony’s method claims that the following linear recursion relationship is true:
冘 P
xn =
ck xn⫺k
n = P, . . . , N ⫺ 1
(20)
k=1
where P is the LP order, ck are the complex valued constants, and N is the total number of complex data points xn , which consists of K (ⱕP) number of damped sinusoids and can be expressed as
冘 K
xn =
ak exp[(␣k ⫹ i2k)n] ⫹ n
n = 0, 1, . . . , N ⫺ 1
(21)
k=1
ak in Eq. (21) are the complex-valued amplitudes for the sinusoids having the damping factors ␣k , frequencies k , and white noise n .
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
522
Zhu and Hua
FIGURE 4 The flow diagram of the 3-D NMR spectral processing procedure with the use of 1-D LP.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
523
Equation (20) can be written in matrix form as x = Tc
(22)
where x = [xP , . . . , xN]T, c is the vector of LP coefficients, and the matrix T is an (N ⫺ P) ⫻ P matrix made by xi . The prediction constants ck are obtained by singular value decomposition (SVD) [37], provided, that is, that P ⱕ N/2. With the prediction coefficients determined by SVD, the frequencies and damping factors of the signal components encoded in these coefficients can be calculated from the P roots (z1 , . . . , zP) of the polynomial Z P ⫺ c1Z P⫺1 ⫺ ⭈⭈⭈ ⫺ cP = 0
(23)
The frequencies and damping factors are given as 1 ln兩Zk兩 ⌬t 1 Im(Zk) fk = ⫺ tan⫺1 2 ⌬t Re(Zk)
␣k =
(24)
where ⌬t is the time interval between sampling points. The amplitudes and phases can be obtained by substituting the damping factors and frequencies obtained above into Eq. (21) and solving the corresponding least-squares problem again. However, in the multidimensional NMR data processing procedure shown in Fig. 4, we normally use the so-called root-reflection method to force all the roots calculated from Eq. (23) into the unit circle for the forward LP method to stabilize the LP extrapolation algorithm. The newly calculated LP coefficients are used to predicate new data based on Eq. (20). In order to minimize the number of peaks introduced by the dispersion part of the spectra during the processing, we only retain the absorptive 1-D spectra in the dimensions other than the dimension where the 1-D LP is used. Hence the use of the Hilbert transformation is essential (Fig. 4). 6
PHASE AND DAMPING FACTOR CONSTRAINED LINEAR PREDICTION
In multidimensional NMR applications, the LP method is used to extrapolate 1-D FID prior to DFT. The extrapolation is accomplished by the repetitive use of the recursion relationship described by Eq. (20). As the number of peaks K is generally unknown in NMR applications, the LP prediction order is typically set in the N/4–N/3 range to allow the least-squares methods to work robustly when the data are contaminated with noise. Therefore the robustness of the extrapolation method is very much dependent on the quality of the prediction coefficients ck . It is generally
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
524
Zhu and Hua
known that the least-squares methods perform more robustly by utilizing constraints. As described in Sec. 3, the phases of the spectra are determined by the dwell time of the first data point in an FID. In addition, many socalled constant time NMR experiments give rise to FIDs with no damping, i.e., the damping factors R2 are zeros. This knowledge can be used as a constraint to improve the robustness of the extrapolation algorithms. Mirror image linear prediction [8] is proposed based on this concept and is used to process 1-D truncated FIDs with no damping and known phases. In multidimensional NMR experiments, the phase of the NMR signal is well known in all the indirectly detected dimensions [15]. Also, if the signal is severely truncated, the small amount of signal decay during the short time evolution can be adequately removed by the multiplication of a single damping factor. In the absence of signal decay, and for the case where the time-domain signal has zero phase at time zero, the time domain can be extended to negative, allowing more resonance peaks to be detected by the LP method. If x0 corresponds to the data point sampled at time 0 and x1 is the data sampled after one dwell time ⌬t, the negative data points x⫺N⫹1 , where x* . . . , x⫺1 can be added, with x⫺n = x*, n n is the complex conjugate of xn . The new time domain data is composed of 2N ⫺ 1 complex data points ideally allowing more frequency components to be detected by the LP method. Figure 5 shows the superiority of the mirror image LP when compared with the commonly used LP method. As discussed above, in multidimensional NMR experiments, it is advantageous to set the first data points at half dwell time for obtaining flat spectral baseline and desirable folding properties. In this circumstance, the mirror image of the time-domain signals results in doubling the length of FID, allowing double the number of signals to be resolved. It is critical in NMR data analysis by LP to resolve maximum numbers of resonances since in many NMR applications the number of poles is unknown a priori. The mirror image LP has been widely applied in multidimensional NMR data
> FIGURE 5 (a) 2-D cross section (F1, F3) of the 3-D HCA(CO)N spectrum of the protein calmodulin obtained with the Fourier transformation from 32 complex data points in F3 dimension, apodized with cosine-squarebell and zero filling to 128 points in the t1 domain. (b) Same as (a), but using only the first 8 complex data points. (c) Same as (b), but using LP to predict 32 complex data points before spectral processing as described in (a). (d) The 2-D spectrum obtained with the use of the same processing procedures as in (c) but with the mirror image LP method instead of commonly used LP. (Reproduced with permission from Ref. 8.)
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
525
526
Zhu and Hua
processing, especially in constant time experiments in which the time-domain data have little damping. The simple procedures described above can greatly enhance the performance of LP applied to obtain multidimensional NMR spectra, since in practice, only 32 to 64 complex data points are generally sampled in the indirectly detected dimensions of a 3-D experiment. Also, as little as eight complex data points may be sampled in the indirectly detected dimensions of a 4-D NMR experiment. Therefore every data point bearing important information on the spectra has to be treated with great care. 7
FORWARD-BACKWARD LP METHOD
Very often, FIDs obtained from the NMR experiments are damping timedomain signals. In order to minimize artifacts introduced by LP extrapolation of FIDs in multidimensional NMR data, accurate LP coefficients are required. Generally, this is done with the so-called root-reflection method in the forward LP applications, in which the roots, or poles zk , are replaced by zk / 兩zk兩2 if 兩zk兩 > 1, ensuring that all signal components are decay exponentials. Hence prediction artifacts are reduced. As was pointed out by Porat and Friedlander [38], and applied to NMR by Delsuc et al. [39], the LP can also be applied in a backward manner:
冘 P
xn =
dk xn⫹k
(25)
k=1
The backward LP coefficients dk can be calculated in the same way as for the forward LP coefficients ck . The poles related to frequencies and damping factors are calculated from the polynomial
P ⫺ d1 P⫺1 ⫺ ⭈⭈⭈ ⫺ dP = 0
(26)
The signal-related backward roots, after root reflection and complex conjugation, should be identical to the signal-related roots from the polynomial formed by forward LP coefficients. The noise-related roots do not show such correlation. It is suggested that searching for pairs of most-similar roots from the forward and root-reflected backward linear prediction, renders an effective means of distinguishing noise and signals. The main problem with this approach in practical applications is that it requires some prior knowledge about the number of signal components, and the search for the closest root pairs may result in missing roots for weak signal components. Therefore it cannot be readily applied to processing multidimensional NMR data, since very often the number of resonances is unknown, and the analysis of weak signals is important.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
527
In order to circumvent these problems associated with the procedure mentioned above, a simple yet effective method was proposed. The more accurate LP coefficients can be computed by simply averaging the two sets of coefficients obtained from the forward LP and the root-reflected backward LP, resulting in a reduction in random noise present in these two sets of coefficients. The procedure is shown in the flow diagram in Fig. 6A. A comparison of spectra (Fig. 6B) obtained by the regular LP and the forwardbackward LP demonstrates that the latter method is more stable than the regular LP method. The forward-backward LP and mirror image LP are now routinely used in the processing of multidimensional NMR data [10].
8
TWO-DIMENSIONAL LINEAR PREDICTION
Up to now, there has been no ‘‘true’’ 2-D LP algorithm that has been applied to 2-D NMR data, although there are numerous reports suggesting methods that will make the 1-D LP method perform like a 2-D method [40,41]. The chief drawback of these methods is that the number of peaks they are able to resolve is limited, when compared to the FFT method when the number of 2-D NMR time-domain data is severely truncated. This limitation restricts applications of these methods to multidimensional NMR data.
FIGURE 6 (A) The flow diagram of the forward-backward linear prediction procedure. (Reproduced with permission from Ref. 9.)
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
528
Zhu and Hua
FIGURE 6 (B) (Left) The 2-D cross section of a 3-D spectrum of S. nuclease processed with the regular LP method to extend the t2 domain data from 32 complex to 256 complex data points before the Fourier transformation. (Right) The spectrum obtained with the use of the forwardbackward LP method. (Reproduced with permission from Ref. 9.)
The 2-D linear prediction method is closely analogous to the 1-D method and can be described as follows [42–44]. Given a time-domain matrix of M ⫻ N real data points, the 1-D LP equation is replaced by x(m, n) = or x(m, n) =
冘冘
ckl x(m ⫺ k, n ⫺ l)
(27)
冘冘
ckl x(m ⫺ k, n ⫺ l)
(28)
Q
P
k=1
l=1
Q
P⫺1
k=1
l=0
The coefficients ckl are obtained from these linear equations using SVD. The total number of frequency components in the 2-D time-domain signal must
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
529
now be smaller than (Q ⫻ P)/4. It is well known that 1-D LP can increase the apparent resolution of a spectrum and reduce truncation effects by extrapolating time-domain data before the Fourier transform is applied. For the same reason, the 2-D LP can also be applied to increase the same effects by extending the 2-D data in the time domain. In principle, 2-D LP improves discrimination between signal and noise compared to the 1-D LP. Similar to the 1-D LP extrapolation scheme, the 2-D LP method for extending the timedomain data before the application of the Fourier transform also requires the determination of 2-D LP coefficients ckl by the least squares method from an experimental 2-D data matrix. It is easy to construct the following matrix equation based on Eq. (27) a = Xb
(29)
where a and b are the column vectors, and X is a matrix. They are defined as a = [x (P ⫹ 1, Q), x (P ⫹ 2, Q), . . . , x (P ⫹ 1, Q ⫹ 1), . . . , x (M, N)]T
冋
b = [C (1, 1), C (2, 1), . . . , C (1, 2), . . . , C (P, Q)]T
x (P,Q) x (P⫺1,Q) x (P⫹1,Q) x (P, Q) X= ⭈ ⭈ x (M⫺1,N⫺1) x (M⫺2,N⫺1)
册
. . . x (1,Q) x (P,Q⫺1). . . x (1,Q⫺1) . . . x (1,1) . . . x (2,Q) x (P⫹1,Q⫺1). . . x (2,Q⫺1) . . . x (2,1) ... ⭈ ⭈ ⭈ ... ⭈ ... x (M⫺P,N⫺Q)
where M and N are the total number of complex data points in t1 and t2 . P and Q are the block size of a 2-D causal linear prediction window. The 2-D LP coefficients ckl are readily obtained from the SVD method. One drawback of the 2-D LP method is the stability of the algorithm. There is no root-reflection type of algorithm that ensures the stability of this 2-D LP method. With a Q ⫻ P prediction matrix C, x(m, n) can only be predicted for which m > Q and n > P [Eq. (27)] and n ⱖ P [Eq. 28)]. As a result, it is impossible to predict data points x(M ⫹ 1, n) for n < P, without additional information about the data. In NMR applications, however, we can mirrorimage the first few data points to the negative time domain so that a fully predicted data matrix can be obtained (Fig. 7A). The 2-D LP method discussed here was applied to a 2-D cross section of a 4-D 13C/13C-separated NOESY spectrum of protein calmodulin. As shown in Fig. 7B, the 2-D LP method described here could be used to increase the apparent spectral resolution and signal-to-noise ratio.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
530
Zhu and Hua
FIGURE 7 (A) Shapes of M ⫻ N data matrices used in the 2-D linear prediction procedures. Solid dots correspond to the experimental data points, open circle is the predicted data, and the ⫻ indicates the first few mirror-imaged data points.
9
SOLVENT SUPPRESSION BY SVD
Solvent resonance suppression is critical in many nuclear magnetic resonance spectroscopy (NMR) applications. For a protein dissolved in 90% H2O/10% D2O the concentration of solvent protons (⬇100 M) is about five orders of magnitude greater than the concentration of the protons of interest in the solute (typically ⬇1 mM). Consequently, the strong solvent resonance that dominates the NMR spectrum of the solute can hide signals of interest. A great number of numerical methods have been designed to suppress sol-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
531
FIGURE 7 (B) (Left) The 2-D cross section of 4-D 13C/13C-separated NOESY spectrum of the protein calmodulin obtained from DFT of 8* ⫻ 8* data matrix with the use of cosine-squared bell apodization and zerofilling to 64* ⫻ 64*. (Right) The spectrum obtained from DFT of 16* ⫻ 16* matrix, which is extrapolated from 8* ⫻ 8* matrix. (Reproduced with permission from Ref. 44.)
vent signals. These have been reviewed in detail [45,46]. Most of the postacquisition solvent suppression methods are based on bandpass filtering [47,48] or subtracting the synthetic solvent peak, which is calculated by either LPSVD (linear prediction method based on singular value decomposition) methods [49] or nonlinear least-squares methods [50]. Some of these methods are also described in Chap. 18. Here we describe a simple alternative approach [51] in which a Toeplitz matrix is formed from a 1-D FID, and the corresponding solvent suppressed 1-D FID is then constructed by removing the largest singular value of the Toeplitz matrix. This method has been proven to be effective in removing a strong solvent peak even when it is 1014 times larger than the other resonances in a simulated spectrum. This indicates that, theoretically, little or no solvent suppression is needed in an experiment if this postacquisition solvent suppression method is applied. A 1-D FID is composed of the summation of K exponentially damped sinusoids, and a forward Toeplitz matrix T can be formed as
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
532
T=
冋
x1 x2 ⭈
x2 x3 ⭈
xN⫺M⫹1
xN⫺M⫹2
... ... ... ...
册
xM xM⫹1 ⭈ xN
Zhu and Hua
(30)
where N is the number of data points in the 1-D FID and M is the window size of the Toeplitz matrix. This matrix is similar to the one used in the forward linear prediction method [7–9]. An (N ⫺ M) ⫻ M matrix T can be expressed as the product of an (N ⫺ M) ⫻ (N ⫺ M) unitary matrix U, and (N ⫺ M) ⫻ M diagonal matrix ⌺, and an M ⫻ M unitary matrix V [37]: T = U⌺VH
(31)
where H denotes a conjugate transpose. The diagonal elements i of the matrix ⌺ are positive and its off-diagonal elements are zero. Furthermore, the largest component 1 corresponds to the largest sinusoid component in the 1-D FID, and the noise will have the smallest singular values, M⫹1 , . . . , N⫺M . By zeroing the largest singular value and constructing a new Toeplitz matrix T⬘ in the following manner T⬘ = U⌺⬘VH
(32)
where ⌺⬘ is the matrix ⌺ with 1 = 0 and the rest of the elements are unchanged, a new 1-D FID can be constructed from the matrix T⬘ using Eq. (32). The spectrum obtained from the Fourier transformation of the reconstructed FID will have the solvent resonance removed from it, if the solvent peak is much stronger than the other resonances, which is normally the case in the NMR applications. By zeroing the smallest singular values, the noise can be removed from the corresponding spectrum, provided there is a clear division in the singular value distribution between the noise and the signals. Similarly, solvent suppression can be achieved by the same procedure on the backward Toeplitz matrix. We applied the SVD solvent suppression method to a 1-D FID, which is the first FID of a 2-D NOESY spectrum of the protein lysozyme recorded on a Varian Inova-400 spectrometer with the 2-D FID matrix being 1024* ⫻ 200*. Water presaturation was used in the experiment. The window size chosen was M = 30. The residual water signal in the processed spectrum, as shown in Fig. 8, may come from the presaturation and imperfect shimming, which could split the water peak into many resonances, and the algorithm described has only removed the largest component of these resonances. In conclusion, this method exploits the fundamental characteristics of the Toeplitz matrix constructed from the sampled data points of the sinu-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
533
FIGURE 8 (A) The 1-D spectrum obtained by Fourier transformation of the first FID of the 2-D NOESY experiment on the protein neural grandin recorded with a 1–1 jump return pulse and water presaturation. (B) The corresponding water suppressed 1-D spectrum obtained in the same manner as in (A) after applying the solvent suppression method with a window size of 20. (Reproduced with permission from Ref. 51.)
soidal signals. It is different from the approach proposed by Mitschang et al. [52], where the autocorrelation matrix was formed and a Karhunen– Loeve (KL) transformation was applied. As indicated in the paper by Brown and Campbell, the SVD and KL are different and may not yield the same results in 2-D image processing [53]. The method described can also be applied to remove diagonal peaks from COSY or NOESY spectra [54]. This simple method can be easily applied to multidimensional NMR spectra for postacquisition water suppression and diagonal peak removal. 10
ITERATIVE QUADRATIC MAXIMUM LIKELIHOOD METHOD
In order to analyze spectra quantitatively, such as those obtained from nuclear magnetic resonance spectroscopy, the linear prediction method based on singular value decomposition (LPSVD) [32] can be used for estimating
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
534
Zhu and Hua
amplitudes, phases, frequencies, and damping factors of resonances, directly from an entire experimental data set without data preprocessing, such as windowing and zero filling. An alternative parametric algorithm, the iterative quadratic maximum likelihood (IQML) method [55–60], for spectral analysis has been demonstrated to be much less sensitive to the noise perturbation than is the LPSVD and total least squares (TLS) methods. Detailed mathematical treatment of the IQML method is present in Chap. 18. Here we briefly outline the procedure to use the IQML method for estimating spectral parameters from 1-D NMR data. For the quantitative analysis of 1-D NMR data, {xi 兩i = 1, . . . , N}, the IQML algorithm [59,60] is composed of the following steps, (a) through (g). Iteration of the method stops when the convergence criterion is met (normally, it requires less than five steps). (a)
(b)
Initialization: set m = 0 and b0 = constant vector, where b is a vector of K components. Compute C(m) = X⫹(BB⫹)⫺1X where
X=
and
B=
(c)
冋 冋
xK xK⫹1
xK⫺1 xK
xN⫺1
xN⫺1
bK 0 ⭈ 0
bK⫺1 bK ⭈ ⭈
... ... ... ...
... bK⫺1 ⭈ bK
(33)
x0 x1
册
xN⫺K
b1 ... ... bK⫺1
1 b1 ⭈ ⭈
0 1 ⭈ ⭈
... ... ... b1
册
0 0 ⭈ 1
Solve the quadratic minimization problem: (m) min b ⫹ b(m⫹1) (m⫹1) C
b(m⫹1)僆
(d) (e) (f)
where are the constraints on the coefficients bi , 0 ⱕ i ⱕ K. Increase m: m = m ⫹ 1 Check the convergence: 㛳b(m⫺1) ⫺ b(m)㛳 < ? if yes, go to step (f), else if no, go to step (b). Find the frequencies and damping factors from the roots of the characteristic polynomial formed by b
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
ZK ⫹ b1ZK⫺1 ⫹ b2ZK⫺2 ⫹ ⭈⭈⭈ ⫹ bK = 0 (g)
535
(34)
and the damping factors and frequencies are given by Eq. (24). Amplitudes and phases can be obtained by substituting the damping factors and frequencies obtained above into Eq. (1) and solving the corresponding least-squares problem again.
The possible constraints for optimization in NMR applications that could be implemented in step (c) are listed as follows: (1) Constrain roots inside or on the unit circle; (2) forward and backward LP constraints on b [9] can be used for a more accurate estimation; (3) Im(b) = 0 for real time series; and (4) constrain roots outside the unit circle for the backward LP arrangement to distinguish the signal from the noise. These constraints can be used individually or can be combined according to the application. One advantage of the IQML method is that prior knowledge of an NMR signal can be easily incorporated into the algorithm, resulting in a better estimation of spectral parameters. We also used these methods to process an experimental 1-D 13C sucrose NMR spectrum. 1024 complex data points with 128 scans were recorded on a 0.1 M sucrose sample. Figure 9A shows the 1-D sucrose FFT spectrum using 128 scans. Figure 9B shows the 1-D sucrose LPSVD reconstructed spectrum with use of 140 complex data points from the experimental FID, starting from the 5th data point of the FID, with the LP order and rank of the matrix being 30 and 12 respectively. Figure 9C shows the 1-D total least squares method (TLS) [58] reconstructed spectrum under the same conditions as the LPSVD spectrum (Fig. 1B). Figure 9D shows the 1-D IQML reconstructed spectrum under the same conditions. The IQML method retrieved all the resonances, whereas the LPSVD and TLS methods had either missing peaks or artifacts. It is important to point out that Fig. 9B– D represent a case with high probability under the condition described above, and the increased computational complexity of the IQML method makes it about N times slower than the LPSVD method, with N being the number of iteration used for the IQML.
11
MULTIDIMENSIONAL NMR SIGNAL PROCESSING BY MATRIX PENCIL
For an introductory treatment of one-dimensional spectral estimation, one should read the book by Stoica and Moses [67]. The following discussion presents a state-of-the-art multidimensional spectral estimation method known as the matrix pencil method. To simplify the discussion below, n-D NMR signal model [Eq. (2)] can be written as
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
536
Zhu and Hua
FIGURE 9 (A) 1-D Fourier transformed 13C sucrose NMR spectrum. (B) The LPSVD spectrum obtained with use of 140* data points, LP order and rank being 30 and 12 respectively. (C) The TLS spectrum obtained under the same conditions as (B). (D) The IQML spectrum obtained under the same conditions as (B) and (C). (Reproduced with permission from Ref. 60.)
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
s(k1 , k2 , . . . , kn) =
冘冘 冘 N1
N2
l1=1
l2=1
537
Nn
⭈⭈⭈
al 1 ,l 2 , . . . , ln z lk11 z lk22 . . . z lknn
(35)
ln=1
where 1 ⱕ ki ⱕ Ki zl i = exp(⫺rl i ⫹ jl i ) al 1 ,l 2 , . . . , ln = 兩al 1 ,l 2 , . . . , ln 兩exp( jl 1 ⫹ jl 2 ⫹ ⭈⭈⭈ ⫹ jl n ) Step 1: Estimation of the poles zl i = exp(⫺rl i ⫹ jl i ) Define
Si (k1 , . . . , ki⫺1 , ki⫹1 , . . . , kn ) =
冋
si (1) si (2) si (2) si (3) ⭈⭈ ⭈⭈ ⭈ ⭈ si (Li ) si (Li ⫹ 1)
册
. . . si (Ki ⫺ Li ⫹ 1) . . . si (Ki ⫺ Li ⫹ 2) ⭈⭈ ... ⭈ ... si (Ki )
where si (ki ) = s(k1 , . . . , ki , . . . , kn ) Ni < Li ⱕ Ki ⫺ Ni ⫹ 1 Experience shows that the recommended choice of Li should be just slightly larger than Ni [61,62]. Then, define
冘 冘冘 冘 K1
Ri =
Ki⫺1
Ki⫹1
⭈⭈⭈
k1=1
Kn
⭈⭈⭈
ki⫺1=1 ki⫹1=1
Si (k1 . . . , ki⫺1 , ki⫹1 , . . . , kn )
kn=1
⭈Si (k1 , . . . ,ki⫺1 , ki⫹1 , . . . , kn )H where H denotes the conjugate transpose. If the damping factor rl i is known to be zero, one can replace R i by R i,FB = R i ⫹ PR*P, where R* i i is the complex conjugate of R i and P is the permutation matrix, i.e.,
P=
冋
1 ⭈ ⭈⭈
1
1
册
The above method of constructing R i (or R i,FB ) follows from a concept applied by Hua [63]. Now compute the eigenvalue decomposition of R i , i.e., R i = E i⌳i EHi
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
(36)
538
Zhu and Hua
and E i = e i1
ei 2
⭈⭈⭈
e iLi
⌳ i = diag(i 1 , i 2 , . . . , iL i )
i 1 ⱖ i 2 ⱖ ⭈⭈⭈ ⱖ iLi ⱖ 0 One can determine an estimate of Ni by counting the number of dominant eigenvalues in R i or using a least-square detection method [65]. With an estimate of Ni , we now define E i,0 = [ei 1ei 2 ⭈⭈⭈ eiNi ] =
冋 册 冋 册 E i,1 —
=
— E i,2
i.e., E i,1 is E i,0 without its last row, and E i,2 is E i,0 without its first row. It then follows [1,2] that {zl i ; li = 1, 2, . . . , Ni } can be found by computing the Ni eigenvalues of the Ni ⫻ Ni matrix: H ⫺1 H E⫹ E i,1 E i,2 i,1 E i,2 = (E i,1 E i,1)
where ‘‘⫹’’ denotes the pseudoinverse. The above procedure can be implemented in various ways. In particular, one can use singular value decompositions (SVD) on a data matrix stacked from Si (k1 , . . . , ki⫺1 , ki⫹1 , . . . , kn), instead of the eigenvalue decompositions (EVD) on Ri . SVD is known to be numerically more robust than EVD because the former avoids the use of ‘‘squaring.’’ However, in situations where computer precision is very high in comparison to the noise level in the data, there is little reason to choose SVD instead of EVD. In fact, SVD is more costly in computation than EVD. Step 2: Estimation of the amplitudes al 1 ,l 2 , . . . , ln With the estimates obtained from step 1 (after being repeated for all i), one can construct all the N1 N2 ⭈ ⭈ ⭈ Nn multidimensional modes: z kl 11 z kl 22 ⭈⭈⭈ z kl nn Using these modes in (1), one can solve a set of K1K2 ⭈ ⭈ ⭈ Kn linear equations (with N1N2 ⭈ ⭈ ⭈ Nn unknowns) to obtain a least square estimate of the N1N2 ⭈⭈⭈ Nn amplitudes al 1 ,l 2 , . . . , ln . However, if N1N2 ⭈ ⭈ ⭈ Nn is too large a number, and the actual number of nonzero amplitudes is N = N1 = N2 = ⭈ ⭈ ⭈ = Nn (i.e., there is no pair of modes that shares a common pole), then one should construct the corresponding N nodes: z kl 1 z kl 2 ⭈⭈⭈ z kl n from the estimates of {zl i ; li = 1, 2, . . . , N} before the corresponding N amplitudes are computed. This will save a large number of computations. But the estimates from Step 1 are in random order. To obtain the correct set
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing
539
of N modes, one must carry out a correct association among {zl i ; li = 1, 2, . . . , N} for all i. A systematic treatment of a class of association techniques is available [64,66]. Remarks: The matrix pencil method shown above is a natural multidimensional extension of the principle shown in [61–64]. It is a very robust method. Such an extension for the SVD-Prony method developed by Kumaresan and Tufts (which is equivalent to the SVD linear prediction method) is not as straightforward. The matrix pencil method is computationally more efficient than the SVD-Prony method. Unlike the latter, the former has no problems such as spurious spectral peaks or noise zeros. In comparison to the maximum likelihood method such as the IQML method, the matrix pencil method has no problem such as convergence to local minima. Note that the IQML method is an iterative procedure that is not guaranteed to converge to the best estimate. But the most efficient implementation of the IQML method is available [57].
12
CONCLUSION
Development of multidimensional NMR spectroscopy has imposed new problems on traditional signal processing. In this chapter, some of solutions are described in an attempt to tackle these problems. Properties of DFT relating to spectral folding and baseline distortion described in Sec. 3 are important in multidimensional NMR experimental setups by NMR spectroscopists. In order to obtain high-resolution DFT n-D NMR spectra, both forward-backward LP and mirror-image LP methods are applied routinely. To minimize the truncation artifacts in 3-D and 4-D NMR spectra, high resolution methods are available. The linear prediction methods and the maximum entropy methods have been documented widely in NMR spectroscopy literature. In particular, all the parameters of the n-D NMR signals can be retrieved in principle by using the matrix pencil method incorporating the concept of subspace decomposition. The matrix pencil method requires a moderate amount of data and exploits the data structure in the most efficient way, and is essentially noniterative, unlike the iterative quadratic maximum likelihood (IQML) method that is not guaranteed to converge to the optimal solution. The matrix pencil method is also in great contrast to the maximum entropy method that exploits the least data model. The maximum entropy method essentially maximizes the ambiguity or flatness of the spectrum of a given data set. If there is 0 confidence in the data model (i.e., the superimposed exponentials), the maximum entropy method is a natural choice. If there is 100% confidence in the data model, the matrix pencil method should be a very practical solution. If the confidence level is some-
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
540
Zhu and Hua
where in between 0 and 100%, then one should examine the results from both methods. In fact, the estimates given by the matrix pencil method can also be used to initiate the iterative procedure of the maximum entropy method. More detailed elaboration of the matrix pencil method and other methods will be carried out in future. ACKNOWLEDGMENT This work was supported by the Research Grant Council of Hong Kong (HKUST6038/98M and HKUST6199/99M). The authors thank Dr. K. Sze for stimulating discussion. REFERENCES 1. 2. 3. 4. 5.
6. 7.
8. 9.
10.
11.
12.
13.
R. R. Ernst, G. Bodenhausen, A. Wokaun. Principles of Nuclear Magnetic Resonance in One and Two Dimensions. Clarendon Press, Oxford, 1987. K. Wuthrich. NMR of Proteins and Nucleic Acids. John Wiley, New York, 1986. R. R. Ernst, W. A. Anderson. Rev. Sci. Instrum. 37:93, 1966. A. Bax. Two-Dimensional Nuclear Magnetic Resonance in Liquids. Delft University Press, Delft, The Netherlands, 1982. G. M. Clore, A. M. Gronenborn. Application of three- and four-dimensional heteronuclear NMR spectroscopy to protein structure determination. Prog. NMR Spectroscopy 23:43–92, 1991. A. Bax, S. Grzesiek. Methodological advances in protein NMR. Acc. Chem. Res. 26:131–138, 1993. E. T. Olejniczak, H. L. Eaton. Extrapolation of time-domain data with linear prediction increases resolution and sensitivity. J. Magn. Reson. 87:628–632, 1990. G. Zhu, A. Bax. Improved linear prediction for truncated signals of known phase. J. Magn. Reson. 90:405–410, 1990. G. Zhu, A. Bax. Improved linear prediction of truncated damped sinusoids using modified backward-forward linear prediction. J. Magn. Reson. 100:202– 207, 1992. F. Delaglio, S. Grzesiek, G. Vuister, G. Zhu, J. Pfeifer, A. Bax. NMRPipe: a multidimensional spectral processing system based on Unix pipes. J. Biomol. NMR 6:277–293, 1995. E. D. Laue, R. G. Brereton, S. Sibisi, J. Skilling, J. Staunton. Maximum entropy method in magnetic resonance spectroscopy. J. Magn. Reson. 62:437–452, 1985. E. D. Laue, M. R. Mayger, J. Skilling, J. Staunton. Reconstruction of phasesensitive two-dimensional NMR spectra by maximum entropy. J. Magn. Reson. 68:14–29, 1986. J. Hoch. In Methods in Enzymology (N. Oppenheimer, T. L. James, eds.), Vol. 176, Academic Press, San Diego, 1989, p. 216.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing 14.
15.
16.
17.
18. 19. 20. 21. 22. 23.
24. 25. 26. 27. 28.
29. 30. 31.
32.
541
G. Zhu. Application of a three-dimensional maximum entropy method to processing sections of three-dimensional NMR spectra. J. Magn. Reson (Series B) 113:248–251, 1996. A. Bax, M. Ikura, L. E. Kay, G. Zhu. Removal of F1 baseline distortion and optimization of folding in multi-dimensional NMR spectra. J. Magn. Reson. 91:174–178, 1991. G. Zhu, D. Torchia, A. Bax. Discrete Fourier transformation of NMR signals. The relationship between sampling delay time and spectral baseline. J. Magn. Reson. (Series A) 105:219–222, 1993. G. Otting, H. Wider, G. Wagner, K. Wuthrich. Origin of T1 and T2 ridges in 2D NMR spectra and procedures for suppression. J. Magn. Reson. 66:187– 193, 1986. E. T. Jaynes. Statistical Physics. W. A. Benjamin, New York, 1963. S. F. Gull, G. J. Daniell. Image reconstruction from incomplete and noise data. Nature 272:686–690, 1978. J. Skilling, R. K. Bryan. Maximum-entropy image-reconstruction—general algorithm. Mon. Not. R. Astron. Soc. 211:111, 1984. N. L. Wu. A revised Gull–Daniell algorithm in the maximum entropy method. Astron. Astrophys. 139:555–557, 1984. F. Ni, H. Scheraga. Constrained iterative spectral deconvolution with applications in NMR spectroscopy. J. Magn. Reson. 82:413–418, 1989. A. R. Mazzeo, M. A. Delsuc, A. Kumar, G. C. Levy. Generalized maximumentropy deconvolution of spectral segments. J. Magn. Reson. 81:512–519, 1989. D. S. Stephenson. Linear prediction and maximum entropy methods in NMR spectroscopy. Prog. NMR Spectrosc. 20:515–626, 1988. G. J. Daniell, P. J. Hore. Maximum-entropy and NMR, a new approach. J. Magn. Reson. 84:515–536, 1989. M. A. Delsuc. Maximum Entropy and Bayesian Methods. J. Skilling, ed. Kulwer Academic Publisher, 1989, p. 285. J. A. Jones, P. J. Hore. The maximum entropy method—appearance and reality. J. Magn. Reson. 92:363–376, 1991. W. Boucher, E. D. Laue, S. C. Burk, P. J. Domaille. 4-Dimensional heteronuclear triple resonance NMR methods for the assignment of backbone nuclei in proteins. J. Am. Chem. Soc. 114:2262–2264, 1992. P. Schmeider, A. S. Stern, G. Wagner, J. C. Hoch. Application of nonlinear sampling schemes to COSY-type spectra. J. Biomol. NMR 3:569–576, 1993. E. Bartholdi, R. R. Ernst. Fourier spectroscopy and causality principle. J. Magn. Reson. 11:9–15, 1973. M. A. Delsuc, G. Levy. The application of maximum entropy processing to the deconvolution of coupling patterns in NMR. J. Magn. Reson. 76:306–315, 1988. H. Barkhuijsen, R. De Beer, W. M. M. J. Bovee, D. Van Ormondt. Retrieval of frequencies, amplitudes, damping factors, and phases from time domain signals using a linear least squares procedure. J. Magn. Reson. 61:465–481, 1985.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
542 33. 34.
35.
36.
37. 38.
39.
40. 41.
42. 43. 44. 45. 46. 47.
48.
49.
50. 51.
Zhu and Hua H. Gesmar, J. J. Led. Spectral estimation of complex time-domain NMR signals by linear prediction. J. Magn. Reson. 76:183–192, 1988. H. Gesmar, J. J. Led. Spectral estimation of two-dimensional NMR signals applying linear prediction to both dimensions. J. Magn. Reson. 76:575–586, 1988. J. Tang, J. R. Norris. Linear prediction zeta-transform method, Pade rational approximation, and the Burg maximum entropy extrapolation. J. Magn. Reson. 78:23–30, 1988. D. Marion, A. Bax. Baseline correction of 2D FT NMR-spectra using a simple linear prediction extrapolation of the time domain data. J. Magn. Reson. 83: 205–211, 1989. G. H. Golub, C. F. Van Loan. Matrix Computations. Johns Hopkins, 1983. B. Porat, B. Fiedlander. A modification of the Kumaresan–Tufts method for estimating rational impulse responses. IEEE Trans. Acoust. Speech. Signal Process. ASSP-34:1336–1338, 1986. M. A. Delsuc, F. Ni, G. C. Levy. Improvement of linear prediction processing of NMR spectra having very low signal-to-noise. J. Magn. Reson. 73:548– 552, 1987. H. Gesmar, J. J. Led. Two-dimensional linear prediction NMR spectroscopy. J. Magn. Reson. 83:53–64, 1989. Y. Imanishi, T. Matsuura, C. Yamasaki, T. Yamazaki, K. Ogura, M. Imanari. An autoregressive spectra-analysis of 2D NMR data. J. Magn. Reson. (Series A) 110:175–182, 1994. S. L. Marple. Digital Spectral Analysis: With Applications. Prentice-Hall, Englewood Cliffs, NJ, 1987. R. Kumaresan, D. W. Tufts. Proc. IEEE 69:1515, 1981. G. Zhu, A. Bax. Two-dimensional linear prediction for signals truncated in both dimensions. J. Magn. Reson. 98:192–199, 1992. P. J. Hore. Methods in Enzymology, (N. J. Oppenheimer, T. L. James, eds.), Vol. 176. Academic Press, San Diego, 1989, p. 64. M. Gueron, P. Plateau, M. Decorps. Solvent signal suppression in NMR. Prog. NMR Spectrosc. 23:135–209, 1991. Y. Kuroda, A. Wada, T. Yamazaki, K. Nagayama. Postacquisition data-processing method for suppression of the solvent signal. J. Magn. Reson. 84:604– 610, 1989. D. Marion, M. Ikura, A. Bax. Improved solvent suppression in one-dimensional and two-dimensional NMR-spectra by convolution of time domain data. J. Magn. Reson. 84:425–430, 1989. J. H. J. Leclerc. Distortion-free suppression of the residual water peaks in proton spectra by post-processing. J. Magn. Reson. (Series B) 103:64–67, 1994. M. Deriche, X. Hu. Elimination of water signal by post-processing. J. Magn. Reson. (Series A) 101:229–232, 1993. G. Zhu, D. Smith, Y. Hua. Post-acquisition solvent suppression by singular value decomposition. J. Magn. Reson. 124:286–290, 1997.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Multidimensional NMR Spectroscopic Signal Processing 52.
53. 54. 55. 56.
57. 58. 59. 60.
61.
62.
63. 64. 65. 66. 67.
543
L. Mistchang, C. Cieslar, T. A. Holak, H. Oschkinat. Application of the Karhunen-Loeve transformation to the suppression of undesirable resonances in 3dimensional NMR. J. Magn. Reson. 92:208–217, 1991. D. E. Brown, T. W. Campbell. Enhancement of 2D NMR spectra using singular value decomposition. J. Magn. Reson. 89:255–264, 1990. G. Zhu, W. Choy, B. C. Sanctuary, G. Song. Suppression of diagonal peaks by singular value decomposition. J. Magn. Reson. 132:176–178, 1998. R. Kumaresan, L. L. Scharf, A. K. Shaw. An algorithm for pole-zero modeling and spectral-analysis. IEEE Trans. ASSP-34:637–640, 1986. Y. Bresler, A. Macovski. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Trans. ASSP-34:1081–1089, 1986. Y. Hua. The most efficient implementation of the IQML algorithm. IEEE Transactions on Signal Processing 42:2203–2204, 1994. C. F. Tirendi, J. F. Martin. Quantitative analysis of NMR spectra by linear prediction and total least squares. J. Magn. Reson. 85:162–169, 1989. G. Zhu, Y. Hua. Quantitative NMR spectral analysis by an interactive maximum likelihood method. Chem. Phys. Lett. 264:424–429, 1997. G. Zhu, W. Choy, B. C. Sanctuary. Quantitative NMR spectral parameter estimation by an iterative maximum likelihood method. J. Magn. Reson. 135: 37–43, 1998. Y. Hua, T. K. Sarkar. Matrix pencil method for estimating parameters of exponentially damped/undamped sinusoids in noise. IEEE Trans. Acoustics, Speech, and Signal Processing 38:814–824, 1990. Y. Hua, T. K. Sarkar. On SVD for estimating generalized eigenvalues of singular matrix pencils in noise. IEEE Trans. on Signal Processing 39:892–900, 1991. Y. Hua. Estimating two-dimensional frequencies by matrix enhancement and matrix pencil. IEEE Trans. on Signal Processing 40:2267–2280, 1992. Y. Hua. High resolution imaging of continuously moving object using stepped frequency radar. Signal Processing (EURASIP) 35(1):33–40, 1994. Q. Cheng, Y. Hua. Detection of cisoids using least square error function. IEEE Transactions on Signal Processing 45:1584–1590, 1997. Y. Hua, K. Abed-Meraim. Techniques of eigenvalues estimation and association. Digital Signal Processing. Vol. 7. Academic Press, 1997, pp. 253–259. P. Stoica, R. Moses. Introduction to Spectral Analysis, Prentice-Hall, 1997.
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
18 Advanced Methods in Spectroscopic Imaging Keith A. Wear U.S. Food and Drug Administration, Rockville, Maryland
1
INTRODUCTION
The goal of magnetic resonance spectroscopy is to make quantitative measurements of relative or absolute concentrations of chemical compounds of biological importance. In spectroscopic imaging, a one-, two-, or threedimensional map of concentration as a function of position is generated and displayed in an image format. One fundamental form of magnetic resonance spectroscopy is chemical shift imaging (CSI). In CSI, a set of phase-encoded free induction decay signals is acquired. Phase encoding is performed in one, two, or three dimensions. The two-, three-, or four-dimensional Fourier transform (with a time to temporal frequency transformation of the free induction decay signals accounting for the added dimension) of this data set yields a one-, two-, or three-dimensional grid of spectra that corresponds to a line, plane, or volume in the human body [1,2]. Additional localization may be achieved through use of a surface coil. The most common method for Fourier transform estimation is the discrete Fourier transform (DFT) [3]. The DFT may be computed efficiently using the fast Fourier transform (FFT) [4]. If short lengths (either in time domain or k-space) are used, the DFT exhibits limited resolution in the transform domain. In addition, the DFT may produce Gibbs ringing artifacts. 545
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
546
Wear
The extent of k-space that may be sampled in a clinical spectroscopic examination is limited by the time available during which to acquire data. In traditional CSI, there is a fixed number of phase-encoded acquisitions per k value and the time between successive acquisitions is limited by T1 relaxation effects. Often, investigators are faced with a difficult trade-off between acquisition time and spatial resolution in the reconstruction. Two important applications with challenging spatial resolution issues are 1H spectroscopic imaging and 31P spectroscopic imaging. In the former case, contamination from subcutaneous fat layers can produce errors in spectroscopic images of tissues of interest such as the brain. In the latter case, contamination from skeletal muscle can corrupt images of tissues of interest such as the liver. Numerous authors have proposed alternatives to the FFT in MRI and spectroscopy [5]. Hu and coworkers [6] have developed a method called SLIM (spectral localization by imaging) with which they incorporate prior knowledge obtained from the conventional MR image in the localization algorithm. This approach can reduce the number of input signals required and therefore reduce clinical examination time. A modification of this method called GSLIM (generalized SLIM) is reported to reduce spectral leakage in CSI and inhomogeneity errors in SLIM [7]. Von Kienlin and Mejia [8] have developed another localization method, based on a priori image information, called SLOOP (spectral localization with optimized pointspread function), which provides voxels of arbitrary shape and improves SNR. Plevritis and Macovski [9,10] have developed a method in which anatomical information from the nonspectroscopic image is incorporated in an algorithm to enhance resolution in spectroscopic images. They have also proposed an alternative k-space sampling distribution that can enhance resolution in spectroscopic images [11]. Maudsley and coworkers [12–14] have developed alternative k-space sampling methods that can enhance imaging performance. Parker et al. and Hu et al. have proposed strategies in which phase-encodes (spatial frequencies) are measured with varying amounts of repetition or varying repetition times in order to reduce ringing artifacts [15– 17]. Hu and Stillman have developed a method for obtaining CSI with reduced ringing using anatomical information [18]. Autoregressive approaches have been proposed for MRI by Smith et al. [19], Martin and Tirendi [20], Barone and Sebastiani [21], and for spectroscopy by Wear et al. [22]. Hendrich and coworkers [23] have utilized a 2-D FSW (Fourier series window) approach and circular voxels. Wear et al. have extended a method called constrained reconstruction, developed by Haacke et al. [24] for MRI, to magnetic resonance spectroscopy [25]. This approach models the object of interest in one dimension as
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
Advanced Methods in Spectroscopic Imaging
547
a set of contiguous boxcar shaped tissues. Constrained reconstruction has been shown to enhance spatial resolution in 2-D CSI of the liver. More recently, echoplanar spectroscopic imaging has attracted much interest because of its relatively fast data acquisition speed [26–38]. Bayesian methodology has been employed to estimate confidence intervals for metabolite concentration levels in spectroscopic imaging [39]. As an alternative or supplement to the signal processing methods mentioned above, suppression pulses may be applied prior to data acquisition in order to minimize undesired signals such as proton signals emanating from subcutaneous lipids in the brain. Enhanced spectroscopic imaging performance using suppression pulses has been reported by numerous investigators [40–43]. The present chapter however will focus on k-space-based signal processing methods, which by themselves represent a fairly large topic. Suppression techniques will not be considered. 2
CHEMICAL SHIFT IMAGING (CSI)
Chemical shift imaging is one of the foundations of clinical magnetic resonance spectroscopic imaging and was introduced by Brown, Kincaid, and Ugurbil in 1982 [1]. A related approach was developed by Maudsley et al. [2]. The object is to recover a spatial distribution of chemical shifts, ρ(x, δ), where x refers to spatial position and δ refers to the chemically shifted frequency. If an RF pulse is applied in the presence of a magnetic field consisting of the sum of a spatially uniform static field, H0, and a linear time-varying gradient along the x direction, G(t), then the frequency (or time derivative of phase, φ) of the resultant free induction decay (FID) signal will be

dφ/dt = γ[H0 + G(t)·x][1 + σ] ≅ δ + γG(t)·x     (1)

where γ is the gyromagnetic ratio, σ accounts for the electronic shielding that causes the chemical shift effect, and δ = γ(1 + σ)H0. The approximate equality arises from ignoring a term equal to γσG(t)x, which is valid as σ is generally less than or on the order of 10⁻⁵. Integration of Eq. (1) to obtain the spatially and temporally varying phase gives

φ(x, t) = δt + γx ∫_0^t G(t′) dt′     (2)
One standard implementation of chemical shift imaging entails a gradient function, G(t), which is a constant, G, for a specified interval of time (0 < t < t0) and zero thereafter. Equation (2) becomes

φ(x, t) = δt + γGt0 x = δt + kx     (t > t0)     (3)

where k = γGt0. Now the resultant FID will be a weighted integral over all chemical shifts and over space:

S(t) = ∫ ρ(x, δ) exp(i[δt + kx]) dx dδ     (4)
So the FID (after time t0) samples the inverse Fourier transform of the object at time t and spatial frequency k. By taking many repeated measurements over a range of values for k (generated by using a range of gradient strengths, G) and by applying a discrete spatial-temporal Fourier transformation, the spatial distribution of chemical shifts, ρ(x, δ), may be reconstructed. A more general treatment is given by Brown et al. [1]. Ideally, a perfect reconstruction of ρ(x, δ) is possible provided information over all of k-space is acquired. In conventional CSI, however, k-space is typically acquired over a limited range of values kn = (nx Δkx, ny Δky, nz Δkz) where −Nx/2 < nx < Nx/2, −Ny/2 < ny < Ny/2, and −Nz/2 < nz < Nz/2. The extent of k-space sampling is severely limited by time constraints in data acquisition. Sampling over such a limited range leads to the Gibbs artifact, which results in spectral leakage and often an undesirably large point spread function [7]. See Fig. 1. Various schemes to circumvent these problems are described below. In 3-D CSI, the volume of interest in the body can often be such that only a small number of voxels (e.g., four) are needed along one dimension, whereas a greater number (e.g., sixteen) are desired along the other two dimensions. Conventional CSI, using n regularly spaced phase encodes to acquire n slices along a dimension, becomes ineffective for small n due to poor spatial resolution. In these cases, hybrid 2-D CSI/Hadamard spectroscopic imaging techniques offer advantages over 3-D CSI. Hadamard spectroscopic imaging has a comparative advantage for small n and can be used in the small n dimension. Transverse Hadamard spectroscopic imaging spatially encodes the spins' relative transverse phases parallel or antiparallel corresponding to the +1 or −1 of the ith row of an nth order Hadamard matrix [44–48]. Examples of 2-D CSI from a sample syringe object and from human liver can be seen in Figs. 2 through 5.
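To make the limited-k-space behavior of Eqs. (3) and (4) concrete, the following minimal Python/NumPy sketch simulates a one-dimensional CSI acquisition and reconstruction. The object, the number of phase encodes, and all numerical values are made-up illustrations rather than anything from the studies cited above.

```python
import numpy as np

# Minimal 1-D chemical shift imaging simulation (illustrative sketch): a
# two-compartment object with two resonances is phase encoded over a small
# range of k, and rho(x, delta) is recovered by Fourier transformation.
# The truncated k range produces the sinc-shaped point spread function
# (Gibbs ringing) discussed in the text.
Nx, Nt, Nk = 64, 128, 16          # spatial bins, FID samples, acquired phase encodes
x = np.arange(Nx)
t = np.arange(Nt)

rho = np.zeros((Nx, Nt))          # discrete rho(x, delta); columns index chemical shift
rho[10:25, 20] = 1.0              # compartment A, resonance at shift bin 20
rho[35:55, 60] = 0.7              # compartment B, resonance at shift bin 60

delta = 2 * np.pi * np.arange(Nt) / Nt                  # chemical-shift frequencies
k = 2 * np.pi * (np.arange(Nk) - Nk // 2) / Nx          # limited symmetric k range

fid = rho @ np.exp(1j * np.outer(delta, t))             # s_x(t), shape (Nx, Nt)
S = np.exp(1j * np.outer(k, x)) @ fid                   # acquired S(k, t), cf. Eq. (4)

# Reconstruction: inverse transform k -> x (explicit DFT over the measured k only),
# then t -> chemical shift with an FFT.
recon_x = np.exp(-1j * np.outer(x, k)) @ S / Nk         # blurred in x by truncation
recon = np.fft.fft(recon_x, axis=1) / Nt                # estimate of rho(x, delta)

profile = np.abs(recon[:, 20])    # spatial profile of compartment A's resonance
print(profile.round(2))           # boxcar edges show ringing from the 16 phase encodes
```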
3 SPECTRAL LOCALIZATION BY IMAGING (SLIM) AND GENERALIZED SPECTRAL LOCALIZATION BY IMAGING (GSLIM)
SLIM (spectral localization by imaging) incorporates prior knowledge obtained from a conventional MR image in the spectral localization algorithm
FIGURE 1 Computer simulation of the MRSI acquisition showing Gibbs ringing effects for different numbers of k-space points. Data is shown for a single slice through a 3-D MRSI data set with a prolate spheroidal object, and for cubic k-space dimensions of 9, 13, 17, 21, 25, and 31 points for images (a) through (f), respectively. (From AA Maudsley et al. Reduced phase encoding in spectroscopic imaging. Magn Reson Med 31:645–651. Copyright © 1994. Reprinted by permission of Wiley-Liss, Inc., a subsidiary of John Wiley & Sons, Inc.)
[6]. This is a natural approach, as most spectroscopic studies are preceded by a nonspectroscopic imaging study. One principal advantage of SLIM is that it can require fewer input signals than a conventional rectangular grid phase-encoded chemical shift imaging experiment and hence result in a faster data acquisition time. The subject of interest is modeled as being composed of a number N of spatially homogeneous compartments (e.g., a limb may exhibit MR signals from muscle, fat, bone marrow, and tumor). If pi(t) is the ith phase-encoded spectroscopic signal from the sample, and cj(t) is the signal from the jth compartment in the absence of phase encoding, then

pi(t) = Σ_{j=1}^{N} gij cj(t),     i = 1, 2, . . . , M     (5)

where

gij = ∫_{Compartment j} exp(−iki·r) d³r     (6)
where ki is the ith phase-encoding vector. M refers to the number of phase-
FIGURE 2 Proton image of syringe containing inorganic phosphate (Pi). The syringe is immersed in a water tank. The vertically oriented bright object, just right of center, is the syringe. A 5 × 5 subset of the original 8 × 8 grid of voxels is superimposed. The voxel dimensions are length × width × thickness = 3 × 3 × 4 cm. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
encoded signals, which must be greater than or equal to N. In order to evaluate the coefficients gij, one needs to know the spatial boundaries for each of the N compartments. These boundaries are determined from the nonspectroscopic image. Equation (5) can be rewritten in matrix form:

P(t) = GC(t)     (7)

where P(t) is an M × 1 vector, G is an M × N matrix, and C(t) is an N × 1 vector. A least-squares fitting method, in which G is factored by means of a singular value decomposition, can be used to solve this equation to determine C(t). C(t) can be Fourier transformed to obtain the spectra of the individual compartments. From the anatomical boundaries determined from
FIGURE 3 2-D CSI of syringe containing inorganic phosphate (Pi) (see proton image in Fig. 2). The signal occupies not only the six proper voxels but many others as well. In addition, ringing is apparent. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
the nonspectroscopic image and the compartmental spectra, a spectroscopic image can be generated. When the compartmental homogeneity assumption is violated, spectral information can leak among the inhomogeneous compartments [7]. Inhomogeneity can arise from variations within compartments in concentration, composition, and/or magnetic field. This spectral leakage can degrade localization accuracy. GSLIM (generalized spectral localization by imaging) was developed in order to suppress this problem. The GSLIM approach employs a generalized series expansion for the spatial distribution of chemical shifts rather than being restricted to the Fourier series model implicit in conventional CSI. This flexibility enables the incorporation of a variety of a priori constraints into the model. If no a priori information is imposed, GSLIM reduces to the Fourier series model. If, on the other hand, a finite collection of contiguous homogeneous compartments is imposed (as in SLIM), GSLIM reduces to the SLIM representation. Generally speaking, the nonspectroscopic image can provide substantial structural information (namely the locations of boundaries between tissues) to be incorporated into the model. The Fourier series part is particularly useful for modeling low-frequency variations [7].
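The compartmental model of Eqs. (5)–(7) can be prototyped in a few lines. The sketch below is an assumed, discretized toy version (synthetic compartment masks, arbitrary phase-encoding vectors, and an invented noise level), not the SLIM authors' implementation; it only illustrates how gij is built from the compartment geometry and how C(t) follows from a pseudo-inverse (SVD-based) least-squares fit.

```python
import numpy as np

# Sketch of the SLIM model of Eqs. (5)-(7) on a discrete 2-D grid (assumed setup).
# Compartment masks would come from the nonspectroscopic image; here they are
# simple synthetic regions.
rng = np.random.default_rng(0)
ny, nx, Nt, Ncomp = 16, 16, 64, 2
masks = np.zeros((Ncomp, ny, nx), dtype=bool)
masks[0, 2:8, 2:8] = True          # compartment 1 (e.g., muscle)
masks[1, 9:14, 4:12] = True        # compartment 2 (e.g., liver)

# True compartmental FIDs c_j(t): decaying complex exponentials at different shifts
t = np.arange(Nt)
c_true = np.stack([np.exp((1j * 0.30 - 0.02) * t),
                   0.6 * np.exp((1j * 0.65 - 0.03) * t)])

# M phase-encoding vectors k_i (fewer than a full grid); g_ij is the discrete
# form of Eq. (6): a sum of exp(-i k_i . r) over the voxels of compartment j.
M = 12
kvecs = rng.uniform(-np.pi, np.pi, size=(M, 2))
yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
G = np.zeros((M, Ncomp), dtype=complex)
for i in range(M):
    phase = np.exp(-1j * (kvecs[i, 0] * yy + kvecs[i, 1] * xx))
    for j in range(Ncomp):
        G[i, j] = phase[masks[j]].sum()

# Simulated phase-encoded signals P(t) = G C(t) plus noise, Eq. (7)
P = G @ c_true + 0.5 * (rng.standard_normal((M, Nt)) + 1j * rng.standard_normal((M, Nt)))

# Least-squares estimate of the compartmental signals via the pseudo-inverse (SVD)
c_hat = np.linalg.pinv(G) @ P
spectra = np.fft.fft(c_hat, axis=1)          # compartmental spectra
print(np.abs(c_hat - c_true).max())          # residual is small relative to the unit-amplitude FIDs
```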
FIGURE 4 Proton image of liver of normal human subject. A 3 × 5 subset of the original 8 × 8 grid of voxels is superimposed. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
4 SPECTRAL LOCALIZATION WITH OPTIMAL POINT SPREAD FUNCTION (SLOOP)
Spectral localization with optimal point spread function (SLOOP) is an extension of SLIM [8]. In SLOOP, the phase-encoding gradients and the number of accumulations in the spectroscopic experiment are chosen in such a way as to optimize the signal-to-noise ratio and to minimize errors due to the existence of some degree of inhomogeneity within compartments. One extension is to allow for a variable number of signal accumulations for each phase-encoding step, so that Eq. (6) may be rewritten as

gij = NAi ∫_{Compartment j} exp(−iki·r) d³r     (8)
where NAi is the number of accumulations for the ith phase-encoding step. SLOOP employs a numerical optimization technique in order to choose the set of phase encoding vectors ki and the accumulation numbers NAi so as to maximize measurement performance. The parameter used to characterize
FIGURE 5 2-D CSI of normal human liver in vivo (see proton image in Fig. 4). The vertical dotted lines correspond to the chemical shift for PCr. The arrows in the lower right corner indicate the presence of PCr in the reconstruction in voxels that do not contain muscle and therefore should not contain PCr. In particular, notice that in the rightmost column, PCr, which should be confined to the top row (where muscle is located), has been vertically smeared into the two deeper voxels due to the sinc convolution effect inherent in the FFT process. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
measurement performance is a weighted sum of two variables. One variable describes efficiency and corresponds to the percentage of available magnetization detected in each compartment. The other variable describes contamination and is derived from the integral of the point spread function over all space outside each compartment. The weighted sum of the worst case (over all compartments) efficiency and the worst case (over all compartments) contamination is optimized by the numerical procedure. The worst-case efficiency typically arises from the smallest compartment. SLOOP has been extended to use higher order gradients (i.e., gradients having a nonlinear spatial dependence), resulting in enhanced signal-to-noise ratio and localization [49]. One of the advanced features of SLOOP is that the underlying model also takes into account all information available about spatially varying flip angles, spatially varying saturation (dependent on T1 relaxation), the B1
field during reception, etc. It is the incorporation of all this information into the reconstruction procedure that allows one to measure absolute concentrations. SLOOP has been employed for many applications including quantification of cardiac phosphorus metabolite concentrations in 31P spectroscopy [50,51]. An example of a set of spectra derived using SLOOP is shown in Fig. 6. Measurements of phosphorus metabolite concentrations in human heart in vivo from nine volunteers obtained from CSI and SLOOP are given in Table 1. Absolute concentrations, obtained without NOE, are in good agreement with biopsy data from human heart surgery which have shown ATP concentrations of 5.5 mmol/kg wet wt [51].
FIGURE 6 Phosphorus spectra from left ventricular myocardium, liver, chest muscle, and blood obtained using SLOOP. (From cover, J Magn Reson 134 and R. Loffler et al. Localized spectroscopy from anatomically matched compartments: improved sensitivity and localization for cardiac 31P MRS in humans. J Magn Reson 134:287–299, 1998.)
TABLE 1 Phosphate Metabolite Concentrations in Human Heart Measured Using SLOOP and CSI

PCr          ATP          Reference                                              Method
11.0 ± 2.7   6.9 ± 1.6    PA Bottomley et al. Magn Reson Med 14:425–434, 1990    3-D-CSI, external reference
12.1 ± 4.3   7.7 ± 3.0    T Yabe et al. Circulation 92:15–23, 1995               2-D-CSI, external reference
10.0 ± 2.0   5.8 ± 1.6    PA Bottomley et al. Magn Reson Med 35:664–670, 1996    1-D-CSI, internal reference
9.0 ± 1.2    5.3 ± 1.2    M Meininger et al. Magn Reson Med 41:657–663, 1999     SLOOP without NOE
7.9 ± 2.2    4.8 ± 0.6    M Meininger et al. Magn Reson Med 41:657–663, 1999     SLOOP with NOE

Concentrations are in mmol/kg wet weight, mean ± SD. These data represent an overview of quantitative cardiac examinations described in the literature. PCr concentration obtained using SLOOP without NOE and ATP concentration obtained using SLOOP with NOE have relatively low standard deviations. (From M. Meininger et al. Concentrations of human cardiac phosphorus metabolites determined by SLOOP 31P NMR spectroscopy. Magn Reson Med 41:657–663. Copyright © 1999. Reprinted by permission of Wiley-Liss, Inc., a subsidiary of John Wiley & Sons, Inc.)
5 ALTERNATIVE K-SPACE SAMPLING METHODS
Various investigators have proposed alternatives to the straightforward rectangular grid (square in 2-D, cubic in 3-D) approach to CSI. Maudsley et al. and Hugg et al. have investigated spherical phase encoding. One motivation for the spherical approach is as follows. Spatial resolution in any direction is proportional to the extent of k-space sampled in that direction. For 3-D spectroscopy with a cubic grid, the extent of k-space sampled along the diagonals of the cube is longer by a factor of √3 than along the principal axes. Using a spherical volume of k-space, on the other hand, results in isotropic spatial resolution [13]. See Fig. 7. Apodization (allowing the number of acquisitions to be averaged to vary with k value) can also be employed to improve imaging performance. Generally speaking, the optimal k-space sampling distribution and apodization depend on the specific imaging task of interest and can entail a trade-off between voxel size and sensitivity. It has been demonstrated in computer simulations [14] and in proton spectroscopy of the brain [13] that, in some cases, spherical and acquisition-
FIGURE 7 Two-dimensional plot through the resultant spatial response functions for (a) cubic k-space acquisition with 17 × 17 × 17 points, and (b) spherical k-space acquisition with diameter of 21 points. (From AA Maudsley et al. Reduced phase encoding in spectroscopic imaging. Magn Reson Med 31:645–651. Copyright © 1994. Reprinted by permission of Wiley-Liss, Inc., a subsidiary of John Wiley & Sons, Inc.)
weighted k-space offer attractive advantages over the standard uniformly weighted rectilinear approach [14]. Hu et al. have devised a method designed to reduce signal contamination in low-intensity compartments of interest emanating from adjacent compartments with high intensity [16]. Examples of compartments responsible for this kind of distortion would include subcutaneous fat in the case of 1H spectroscopy and skeletal muscle in the case of 31P spectroscopy. With this approach, high values of k are sampled with a small number of acquisitions, and low values of k are sampled with larger numbers of acquisitions. One variation of this method entails using a reduced repetition time (TR) for sampling high values in k-space [17]. This technique has been demonstrated in 1H spectroscopic images of human brain.
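The isotropy argument for spherical k-space coverage can be checked numerically. The sketch below builds cubic and spherical sampling masks of roughly matched size and compares their spatial response functions along a principal axis and along the cube diagonal; the grid size and coverage radius are assumptions chosen only for illustration.

```python
import numpy as np

# Sketch (assumed parameters) comparing point spread functions for cubic versus
# spherical k-space coverage, in the spirit of Fig. 7. A binary sampling mask is
# Fourier transformed; its transform is the spatial response function.
N = 33                                   # reconstruction grid (odd, centered)
half = 8                                 # cubic coverage: 17 points per axis
idx = np.arange(N) - N // 2
kx, ky, kz = np.meshgrid(idx, idx, idx, indexing="ij")

cubic = (np.abs(kx) <= half) & (np.abs(ky) <= half) & (np.abs(kz) <= half)
radius = half * np.cbrt(6 / np.pi)       # sphere of roughly the same k-space volume
sphere = kx**2 + ky**2 + kz**2 <= radius**2
print(cubic.sum(), sphere.sum())         # similar numbers of phase encodes

def srf(mask):
    # spatial response function = Fourier transform of the k-space sampling mask
    psf = np.fft.fftshift(np.fft.ifftn(np.fft.ifftshift(mask.astype(float))))
    return np.abs(psf) / np.abs(psf).max()

for name, mask in [("cubic", cubic), ("spherical", sphere)]:
    p = srf(mask)
    c = N // 2
    axis_profile = p[c:c + 5, c, c]                               # along a principal axis
    diag_profile = np.array([p[c + i, c + i, c + i] for i in range(5)])  # along the diagonal
    # cubic: diagonal response is narrower than the axial one; spherical: nearly identical
    print(name, "axis:", np.round(axis_profile, 3), "diagonal:", np.round(diag_profile, 3))
```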
6 ESTIMATION OF UNSAMPLED VALUES OF K BY EXTRAPOLATION
Spectroscopic imaging often entails sampling a range of points throughout k-space and then inverse Fourier transforming to obtain the spatial distri-
bution of chemical shifts. It is often implicitly assumed that the unsampled values in k-space are zero. Plevritis and Macovski [10] have developed a method whereby these unsampled values in k-space can be estimated from the sampled data by incorporating the knowledge of the limited spatial extent of the image (i.e., within the anatomical boundaries of the object) into the reconstruction algorithm. In one dimension, the effects of sampling a limited range of k-space can be described by

sl(k) = s(k) rect(k)     for k = {1 · · · N}     (9)
where s(k) is the true Fourier transform of the object, sl(k) corresponds to the measured Fourier transform, and rect(k) = 1 for k = {1 · · · M} and zero otherwise. The acquired spectroscopic image, Sl(m), is related to the desired reconstruction, S(m), via

Sl(m) = Σ_{n=1}^{N} sinc(n − mN/M) S(n)     for m = {1 · · · M}     (10)
where sinc(n) is the inverse discrete Fourier transform of rect(k). Plevritis and Macovski choose the minimum energy reconstruction consistent with the known spatial extent of the object (determined from the nonspectroscopic image) and the measured k values. Mathematically, this corresponds to

min Σ_{n=1}^{N} |Ŝ(n)|²

(where ˆ denotes the reconstruction) subject to the condition of finite spatial support

Ŝ(n) = T(n)Ŝ(n)     (11)
where T(n) = 1 within the boundaries of the object to be reconstructed and zero elsewhere, and to the requirement of consistency with measured k values
Sl(m) = Σ_{n=1}^{N} sinc(n − mN/M) Ŝ(n)     (12)

It can be shown that the solution to the optimization problem can be expressed in a matrix equation [10],

Ŝ = TLᵀ(LTLᵀ)⁻¹ Sl     (13)
where now T is a diagonal matrix whose (n, n) element is T(n) and L is a circulant matrix whose (m, n) element is sinc(n − mN/M). This technique has been demonstrated to enhance resolution without increasing acquisition time or reducing SNR in brain [10].
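A one-dimensional toy version of Eqs. (10)–(13) is easy to write down. In the sketch below the object, its support, and the choice of N and M are invented, indices are taken zero-based for convenience, and np.sinc (sin(πz)/(πz)) stands in for the kernel; the point is only to show the matrix solution of Eq. (13) reproducing the measured data while confining energy to the known support.

```python
import numpy as np

# Toy version of the bounded-support extrapolation of Eqs. (10)-(13).
N, M = 63, 16
n = np.arange(N)                              # high-resolution grid (0-based)
m = np.arange(M)                              # measured low-resolution grid
support = (n >= 20) & (n <= 44)               # known spatial extent T(n)
T = np.diag(support.astype(float))

S_true = np.zeros(N)
S_true[24:40] = np.hanning(16) + 0.5          # object confined to the support

L = np.sinc(n[None, :] - m[:, None] * N / M)  # (M, N) kernel matrix
S_low = L @ S_true                            # low-resolution measurement, cf. Eq. (10)

# Minimum-energy solution consistent with support and measurements, cf. Eq. (13)
S_hat = T @ L.T @ np.linalg.pinv(L @ T @ L.T) @ S_low

print("data consistency residual:", np.round(np.abs(L @ S_hat - S_low).max(), 6))
print("max |S_hat| outside support:", np.round(np.abs(S_hat[~support]).max(), 6))
```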
Another method for extrapolation to values in k-space beyond the measured range, based on the Papoulis–Gerchberg algorithm, has been reported by Haupt et al. [52] to be an effective strategy for removing lipid artifacts in 1H spectroscopic imaging of the brain. An example of this approach is shown in Fig. 8.
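Although the details in Haupt et al. [52] differ, the basic Papoulis–Gerchberg iteration can be sketched as alternating projections: force the image to vanish outside its known support, then restore the measured k-space samples. The toy example below uses an invented one-dimensional object and sampling pattern.

```python
import numpy as np

# Papoulis-Gerchberg style extrapolation in 1-D (assumed toy setup).
N, Nmeas = 128, 24
x = np.arange(N)
support = (x >= 40) & (x < 80)                    # known object extent
f_true = support * (1 + 0.3 * np.sin(2 * np.pi * x / 20))

k_full = np.fft.fft(f_true)
measured = np.zeros(N, dtype=bool)
measured[:Nmeas // 2] = True                      # only the low spatial frequencies
measured[-Nmeas // 2:] = True
k_est = np.where(measured, k_full, 0)

for _ in range(200):
    img = np.fft.ifft(k_est)
    img[~support] = 0                             # project onto the known support
    k_est = np.fft.fft(img)
    k_est[measured] = k_full[measured]            # restore the measured k samples

err0 = np.linalg.norm(np.fft.ifft(np.where(measured, k_full, 0)) - f_true)
err1 = np.linalg.norm(np.fft.ifft(k_est) - f_true)
print(round(err0, 3), round(err1, 3))             # the error does not increase, and here it drops
```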
7 CONSTRAINED RECONSTRUCTION
A thorough explanation of the constrained reconstruction algorithm for MRI is provided by Haacke and coworkers [24]. A related algorithm was developed independently by Martin and Tirendi [20]. The object to be imaged is modeled, in one dimension, as a weighted sum of contiguous boxcar functions of varying amplitudes and widths. Each boxcar corresponds to a different tissue or structure in the body. The phase-encoded data (the sampled Fourier transform of the object) can be expressed as a weighted sum of functions of the form sinc(αk)exp(−i2βk), where sinc(z) = sin(z)/z, k is the sampled spatial frequency (determined by the phase-encode gradients), and α and β correspond to the widths and centers of the boxcar functions. If this sum is multiplied by k, it then becomes a weighted sum of terms of the form

sin(αk)exp(−i2βk) = [exp(iαk) − exp(−iαk)] exp(−i2βk)/(2i)     (14)

Thus the product of the phase-encoded data with k is a simple sum of complex exponentials. The widths and centers of the constituent boxcars can be obtained from the frequencies of the exponentials. In general, the frequencies of the exponentials can be estimated using a singular value decomposition method outlined by Haacke and coworkers [24]. If the number of boxcars is small, however, easier approaches may be implemented. For example, for a single boxcar object, the magnitude of the phase-encoded data can be least-squares fitted to the functional form |sinc(αk)|, and the phase may be least-squares fitted to a line, −2βk. Once the widths and centers are known, the amplitudes of the boxcars can be obtained from a linear least-squares procedure [24]. Wear et al. have applied constrained reconstruction to 2-D CSI [25]. The application of constrained reconstruction to spectroscopy is somewhat more challenging than that for MRI for the following reasons. First, in CSI, far fewer data samples are typically acquired in one phase-encode dimension (e.g., 8 or 16, as opposed to, say, 256 or 512 in MRI). Therefore much less data is available to be fitted to the reconstruction. Second, spectroscopic data
FIGURE 8 NAA images from human brain for TE = 50 ms: (a) zero-filled Fourier reconstruction; (b) after Papoulis–Gerchberg (PG) extrapolation of the lipid region; (c) MRI (2500/80) corresponding to the center of the metabolite image slice; (d) NAA image after PG extrapolation and spectral baseline fit-and-subtract procedure; (e) spatially smoothed version of the zero-filled reconstruction (a); (f) spatially smoothed version of the PG extrapolation result (b), after removal of signals contained within the region of the lipid mask. (From CI Haupt et al. Removal of lipid artifacts in 1H spectroscopic imaging by data extrapolation. Magn Reson Med 35:678–687. Copyright © 1996. Reprinted by permission of Wiley-Liss, Inc., a subsidiary of John Wiley & Sons, Inc.)
tends to be noisier than MRI data (SNR typically on the order of 8:1 as opposed to typically 100:1). Constrained reconstruction has been applied to 31P spectroscopic imaging in the liver. The two main tissues contributing 31P peaks in this application are muscle and liver. The main species of interest (ATP, PDE, Pi , and PME) are found in both tissues. PCr exists in muscle but not in liver. Therefore at frequencies (chemical shifts) corresponding to PCr, the data can be modeled as a single boxcar (muscle). The locations of the boundaries of the muscle can be estimated by applying a single component boxcar model to the data (at PCr frequencies). The remaining boundary of interest (determined by the sensitivity volume of the surface coil) can be determined from the nonspectroscopic 1H image (e.g., Fig. 4). Once the boundaries (or equivalently, the widths and centers) of the boxcars are determined, the amplitudes can be obtained from a linear least-squares procedure. Since the concentrations of the various metabolites are not equal, the least-squares procedure must be repeated for each frequency (chemical shift). Examples of constrained reconstruction spectroscopic images derived from the same data shown in Figs. 2 through 5 can be seen in Figs. 9 and
FIGURE 9 Constrained reconstruction of syringe containing inorganic phosphate (Pi). Constrained reconstruction was applied in both directions. Using the proton image (see Fig. 2) as a reference, it can be seen that the constrained reconstruction performs better than the 2-D CSI (see Fig. 3) in that the reconstruction is confined to the true location of the object and ringing is suppressed. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
FIGURE 10 Constrained reconstruction of normal human liver in vivo (see proton image in Fig. 4). Constrained reconstruction was applied in the vertical direction while an FFT was applied in the horizontal direction. The vertical dotted lines correspond to the chemical shift for PCr. The erroneous PCr signals in the 3-D CSI reconstruction are not present in the constrained reconstruction (see arrows in the lower right corners of Figs. 5 and 10). The PCr peak corresponding to the top right voxel has not been smeared (due to the sinc convolution effect) into the two deeper voxels as was the case with the 2-D CSI. Hence it has a greater magnitude than in the 2-D CSI. (From KA Wear et al. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597. Copyright © 1997, IEEE.)
10. An FFT was used for transformation from the time domain to the temporal frequency (chemical shift) domain. For human data (Fig. 10), constrained reconstruction was used for one transform from spatial frequency to position (for the direction in which high resolution was critical), while an FFT was used for the remaining spatial frequency to position transform. For the syringe containing 31P (Fig. 9), constrained reconstruction was used in both spatial directions. The algorithm for analyzing human data can be summarized as follows:

1. Acquire a data set of phase-encoded FIDs, D(kx, ky, t), where kx and ky are spatial frequencies.
2. Use an FFT to transform from the time domain to the chemical shift domain, D(kx, ky, f).
3. To determine the muscle boundaries for each column in the reconstruction:
   a. Average the phase-encoded data over the range of frequencies corresponding to PCr: DPCr(kx, ky) = (1/N) Σi D(kx, ky, fi), where the N discrete frequencies fi lie in the spectral region corresponding to PCr.
   b. Take the FFT in the horizontal direction, DPCr(x, ky).
   c. Apply the single-boxcar constrained reconstruction in the vertical direction to each column (a numerical sketch of this fit is given after the list).
4. Determine the posterior boundary (due to the falloff in sensitivity of the surface coil) from the proton image.
5. For each column in the reconstruction, with knowledge of the locations of the three important boundaries in this two-tissue (boxcar) model, solve for the amplitudes and phases of the boxcars, as functions of frequency (chemical shift), using a least-squares procedure [24].
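As a concrete illustration of the single-boxcar estimation used in step 3c, the sketch below fits simulated phase-encoded data of the form A·sinc(αk)exp(−i2βk), the model quoted in the text, with sinc(z) = sin(z)/z. The k values, noise level, and grid-search ranges are assumptions, and the k range is chosen so that sinc(αk) stays positive, avoiding the phase-wrap subtleties that a practical implementation would need to handle.

```python
import numpy as np

# Single-boxcar constrained reconstruction fit (illustrative assumptions only).
rng = np.random.default_rng(1)
k = np.linspace(0.05, 1.2, 16)                 # 16 positive phase-encode values (assumed)
alpha_true, beta_true, A_true = 2.0, 1.2, 5.0

def sinc(z):
    return np.sinc(z / np.pi)                  # np.sinc(x) = sin(pi x)/(pi x), so this is sin(z)/z

data = A_true * sinc(alpha_true * k) * np.exp(-2j * beta_true * k)
data += 0.05 * (rng.standard_normal(16) + 1j * rng.standard_normal(16))

# Width alpha: grid search of |data| against |sinc(alpha k)|, amplitude solved in closed form
def magnitude_residual(a):
    model = np.abs(sinc(a * k))
    amp = model @ np.abs(data) / (model @ model)
    return np.sum((np.abs(data) - amp * model) ** 2)
alphas = np.linspace(0.5, 6.0, 200)
alpha_hat = alphas[np.argmin([magnitude_residual(a) for a in alphas])]

# Centre beta: unwrapped phase least-squares fitted to the line -2*beta*k
beta_hat = -np.polyfit(k, np.unwrap(np.angle(data)), 1)[0] / 2

# Amplitude: linear least squares with width and centre fixed
basis = sinc(alpha_hat * k) * np.exp(-2j * beta_hat * k)
A_hat = np.real(np.vdot(basis, data) / np.vdot(basis, basis))

print(round(alpha_hat, 2), round(beta_hat, 2), round(A_hat, 2))   # close to (2.0, 1.2, 5.0)
```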
8 ECHO-PLANAR SPECTROSCOPIC IMAGING
Traditional CSI can be too slow for many applications, particularly 3-D studies, as it requires a new excitation pulse for each value of k-space sampled. The interval between successive acquisitions (i.e., k values) is limited by the relaxation time(s) (T1) of the metabolite(s) of interest. For example, to generate a 16 × 16 × 16 voxel data set would require 2.3 hours, assuming a repetition time (TR) of 2 seconds (typical for brain spectroscopic studies) [32]. The use of time-varying readout gradients allows for the sampling of a range of k values following a single excitation pulse, which can result in much faster data acquisition [26–38]. Echo-planar approaches, which may track zigzag, rectilinear, or spiral trajectories in k-space [53], have been employed for spectroscopic imaging [32]. From Eq. (2) it can be seen that

k(t) = γ ∫_0^t G(t′) dt′     (15)
A spiral trajectory in k-space can be described by k(t) = At exp(iωt), where k(t) = kx(t) + iky(t). This trajectory can be achieved through the use of oscillating gradients, G(t) = γ⁻¹ dk/dt, or Gx(t) = A cos ωt − Aωt sin ωt and Gy(t) = A sin ωt + Aωt cos ωt [53]. Generation of a spectroscopic image with nonuniformly sampled data, such as is obtained with spiral-trajectory echoplanar spectroscopic imaging, is much more complicated than that with conven-
tional rectangular grid phase-encoded data. One approach is to use a gridding algorithm [54,55] to resample nonuniformly sampled data onto a rectilinear grid. In this approach, the nonuniformly sampled data, Ms(k), is represented as the product of the continuous Fourier transform of the object, M(k), and a sampling function, S(k), which is a set of Dirac delta functions located at the sampled points in k-space:

Ms(k) = M(k)·S(k)     (16)
In order to resample this data onto a new (e.g., uniform) grid, the original sampled data is weighted by the density of the sampling structure, ρ(k), and convolved with a kernel, such as a Gaussian, a sinc, or a small finite window [55], C(k), and then resampled [31,54–56]:

Mc(k) = {[Ms(k)/ρ(k)] * C(k)}·Π(k)     (17)
where Π(k) is the new sampling function. The reconstruction is obtained by discrete Fourier transformation.
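A minimal one-dimensional gridding sketch in the spirit of Eq. (17) is given below. The density compensation derived from sample spacing, the Gaussian kernel width, and the omission of the usual deapodization step are all simplifying assumptions made for brevity.

```python
import numpy as np

# 1-D convolution-gridding sketch (assumed toy example, not a full spiral reconstruction):
# nonuniform k-space samples are density compensated, convolved with a small Gaussian
# kernel onto a Cartesian grid, and then Fourier transformed.
rng = np.random.default_rng(2)
N = 128
x = np.arange(N) - N // 2
obj = np.where(np.abs(x) < 20, 1.0, 0.0)                 # simple boxcar object

k_cart = np.fft.fftfreq(N)                               # Cartesian grid (cycles/sample)
k_samp = np.sort(rng.uniform(-0.5, 0.5, 400))            # nonuniform sample locations
M_samp = np.array([(obj * np.exp(-2j * np.pi * kk * x)).sum() for kk in k_samp])

# Density compensation: weight each sample by its local k-space spacing
spacing = np.gradient(k_samp)
weights = spacing / spacing.mean()

# Convolve the weighted samples onto the Cartesian grid with a Gaussian kernel C(k)
sigma = 1.0 / N
grid = np.zeros(N, dtype=complex)
for kk, mm, ww in zip(k_samp, M_samp, weights):
    d = (k_cart - kk + 0.5) % 1.0 - 0.5                  # periodic distance in k
    grid += mm * ww * np.exp(-0.5 * (d / sigma) ** 2)

recon = np.fft.fftshift(np.fft.ifft(grid))               # back to image space
recon /= np.abs(recon).max()
# The boxcar is recovered, attenuated toward the edges by the kernel apodization
# (a practical reconstruction would divide out the kernel's image-domain roll-off).
print(np.round(np.abs(recon[N // 2 - 25:N // 2 + 25:5]), 2))
```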
9 DISCUSSION
Magnetic resonance spectroscopy is a powerful diagnostic tool with numerous applications. The most straightforward approach to spectroscopy is CSI. CSI requires Fourier transformation of digitized phase-encoded data. Application of the DFT for Fourier transformation often results in inadequate resolution and ringing artifacts. Many investigators have developed advanced signal processing methods for spectroscopy in order to circumvent the need for the DFT. These methods may exhibit a multitude of advantages over traditional CSI, including faster acquisition time, improved spatial resolution, improved signal-to-noise ratio, and more accurate quantitation. These benefits have been demonstrated in a variety of settings including 1H spectroscopic imaging in the human brain and 31P spectroscopic imaging in the human liver.

REFERENCES

1. TR Brown, BM Kincaid, K Ugurbil. NMR chemical shift imaging in three dimensions. Proc Natl Acad Sci USA 79:3523, 1982.
2. AA Maudsley, SK Hilal, WH Perlman, HE Simon. Spatially resolved high resolution spectroscopy by four dimensional NMR. J Magn Reson 51:147–152, 1983.
3. AV Oppenheim, RW Schafer. Digital Signal Processing. Prentice-Hall, Englewood Cliffs, NJ, 1975.
4. JW Cooley, PAW Lewis, PD Welch. The fast Fourier transform and its applications. IBM, Yorktown Heights, NY, Res. Paper RC-1743, 1967.
5. ZP Liang, FE Boada, RT Constable, EM Haacke, PC Lauterbur, MR Smith. Constrained reconstruction methods in MR imaging. Reviews Magn Reson Med 4:67, 1992.
6. X Hu, DN Levin, PC Lauterbur, T Spraggins. SLIM: spectral localization by imaging. Magn Reson Med 8:314, 1988.
7. ZP Liang, PC Lauterbur. A generalized series approach to MR spectroscopic imaging. IEEE Trans Med Imag 10:132, 1991.
8. M von Kienlin, R Mejia. Spectral localization with optimal pointspread function. J Magn Reson 94:268–287, 1991.
9. S Plevritis, A Macovski. Spectral extrapolation of spatially bounded images. IEEE Trans Med Imag 14:487–497, 1995.
10. S Plevritis, A Macovski. MRS imaging using anatomically based k-space sampling and extrapolation. Magn Reson Med 34:686–693, 1995.
11. S Plevritis, A Macovski. A new k-space sampling distribution for spectroscopic imaging. Proc Soc Magn Reson 1179, 1994.
12. AA Maudsley, JW Hugg, EJ Fernandez, GB Matson, MW Weiner. Application of reduced k-space sampling in spectroscopic imaging. Abstracts of the 10th Meeting of the Society of Magnetic Resonance in Medicine, Vol. 1, Aug. 10–16, San Francisco, 1991, p. 186.
13. AA Maudsley, GB Matson, JW Hugg, MW Weiner. Reduced phase encoding in spectroscopic imaging. Magn Reson Med 31:645–651, 1994.
14. JW Hugg, AA Maudsley, MW Weiner, GB Matson. Comparison of k-space sampling schemes for multidimensional MR spectroscopic imaging. Magn Reson Med 36:469–473, 1996.
15. DL Parker, GT Gullberg, PR Frederick. Gibbs artifact removal in magnetic resonance imaging. Med Phys 14:640, 1987.
16. X Hu, M Patel, K Ugurbil. A new strategy for spectroscopic imaging. J Magn Reson B 103:30–38, 1994.
17. X Hu, M Patel, W Chen, K Ugurbil. Reduction of truncation artifacts in chemical-shift imaging by extended sampling using variable repetition time. J Magn Reson B 106:292–296, 1995.
18. X Hu, AE Stillman. Technique for reduction of truncation artifacts in chemical shift images. IEEE Trans Med Imag 10(3):290–294, 1991.
19. MR Smith, ST Nichols, RM Henkelman, ML Wood. Application of autoregressive moving average parametric modeling in magnetic resonance image reconstruction. IEEE Trans Med Imag MI-5:132, 1986.
20. JF Martin, CF Tirendi. Modified linear prediction modeling in magnetic resonance imaging. J Magn Reson 82:392–399, 1989.
21. P Barone, G Sebastiani. A new method of magnetic resonance image reconstruction with short acquisition time and truncation artifact reduction. IEEE Trans Med Imag 11(2):250–259, 1992.
22. KA Wear, KJ Myers, RF Wagner, SS Rajan, LW Grossman. An autoregressive method for high-resolution one-dimensional chemical shift imaging. J Magn Reson B 105:172, 1994.
23. K Hendrich, H Liu, H Merkle, J Zhang, K Ugurbil. B1 voxel shifting of phase-modulated spectroscopic localization techniques. J Magn Reson 97:486–497, 1992.
24. EM Haacke, ZP Liang, SH Izen. Constrained reconstruction: a superresolution, optimal signal-to-noise alternative to the Fourier transform in magnetic resonance imaging. Med Phys 16:388, 1989.
25. KA Wear, KJ Myers, SS Rajan, LW Grossman. Constrained reconstruction applied to 2D chemical shift imaging. IEEE Trans Med Imag 16:591–597, 1997.
26. P Mansfield. Spatial imaging of the chemical shift in NMR. Magn Reson Med 1:370–386, 1984.
27. A Macovski. Volumetric NMR imaging with time-varying gradients. Magn Reson Med 2:29–40, 1985.
28. DB Twieg. Multiple-output chemical shift imaging (MOCSI): a practical technique for rapid spectroscopic imaging. Magn Reson Med 12:64–73, 1989.
29. DN Guilfoyle, A Blamire, B Chapman, RH Ordidge, P Mansfield. PEEP—a rapid chemical-shift imaging method. Magn Reson Med 10:282–287, 1989.
30. S Posse, G Tedeschi, R Risinger, R Ogg, D LeBihan. High speed 1H spectroscopic imaging in human brain by echo planar spatial-spectral encoding. Magn Reson Med 33:34–40, 1995.
31. E Adalsteinsson, P Irarrazabal, DM Spielman, A Macovski. Three-dimensional spectroscopic imaging with time-varying gradients. Magn Reson Med 33:461–466, 1995.
32. E Adalsteinsson, P Irarrazabal, S Topp, C Meyer, A Macovski, DM Spielman. Volumetric spectroscopic imaging with spiral-based k-space trajectories. Magn Reson Med 39:889–898, 1998.
33. E Adalsteinsson, DM Spielman. Spatially resolved two-dimensional spectroscopy. Magn Reson Med 41:8–12, 1999.
34. JM Star-Lack. Optimal gradient waveform design for projection imaging and projection reconstruction echoplanar spectroscopic imaging. Magn Reson Med 41:664–675, 1999.
35. F Hyder, R Renken, DL Rothman. In vivo carbon-edited detection with proton echo-planar spectroscopic imaging (ICED-PEPSI): [3,4-13CH2]glutamine tomography in rat brain. Magn Reson Med 42:997–1003, 1999.
36. AR Guimaraes, JR Baker, BG Jenkins, PL Lee, RM Weissdoff, BR Rosen, RB Gonzalez. Echoplanar chemical shift imaging. Magn Reson Med 41:877–882, 1999.
37. EA Adalsteinsson, J Star-Lack, CH Meyer, DM Spielman. Reduced spatial side lobes in chemical shift imaging. Magn Reson Med 42:314–323, 1999.
38. S Morikawa, T Inubushi, H Ishii, Y Nakasu. Effects of blood sugar level on rat transient focal brain ischemia consecutively observed by diffusion-weighted EPI and 1H echo planar spectroscopic imaging. Magn Reson Med 42:895–902, 1999.
39. K Young, D Khetselius, BJ Soher, AA Maudsley. Confidence images for MR spectroscopic imaging. Magn Reson Med 44:537–545, 2000.
40. J Star-Lack, SJ Nelson, J Kurhanewicz, LR Huang, DB Vigneron. Improved water and lipid suppression for 3D PRESS CSI using RF band selective inversion with gradient dephasing (BASING). Magn Reson Med 38:311–321, 1997.
41. JW Pan, DB Twieg, HP Heatherton. Quantitative spectroscopic imaging of the human brain. Magn Reson Med 40:363–369, 1998.
42. TC Tran, DB Vigneron, N Sailasuta, J Tropp, P Leroux, J Kurhanewicz, SJ Nelson, R Hurd. Very selective suppression pulses for clinical MRSI studies of brain and prostate cancer. Magn Reson Med 43:23–33, 2000.
43. RG Males, DB Vigneron, J Star-Lack, SC Falbo, SJ Nelson, H Hricak, J Kurhanewicz. Clinical application of BASING and spectral/spatial water and lipid suppression pulses for prostate cancer staging and localization by in vivo 3D 1H magnetic resonance spectroscopic imaging. Magn Reson Med 43:17–22, 2000.
44. G Goelman, V Harihara Subramanian, JS Leigh. Transverse Hadamard spectroscopic imaging technique. J Magn Reson 89:437–454, 1990.
45. G Goelman. Fast Hadamard spectroscopic imaging techniques. J Magn Reson B 104:212–218, 1994.
46. O Gonen, J Hu, R Stoyanova, JS Leigh, G Goelman, TR Brown. Hybrid three dimensional (1D-Hadamard, 2D chemical shift imaging) phosphorus localized spectroscopy of phantom and human brain. Magn Reson Med 33:300–308, 1995.
47. O Gonen, F Arias-Mendoza, G Goelman. 3D localized in vivo 1H spectroscopy of human brain by using a hybrid of 1D-Hadamard with 2D chemical shift imaging. Magn Reson Med 37:644–650, 1997.
48. O Gonen, JB Murdoch, R Stoyanova, G Goelman. 3D multivoxel proton spectroscopy of human brain using a hybrid of 8th-order Hadamard encoding with 2D chemical shift imaging. Magn Reson Med 39:34–40, 1998.
49. R Pohmann, E Rommel, M von Kienlin. Beyond k-space: spectral localization using higher order gradients. J Magn Reson 141:197–206, 1999.
50. R Loffler, R Sauter, H Kolem, A Hasse, M von Kienlin. Localized spectroscopy from anatomically matched compartments: improved sensitivity and localization for cardiac 31P MRS in humans. J Magn Reson 134:287–299, 1998.
51. M Meininger, W Landschutz, M Beer, T Seyfarth, M Horn, T Pabst, A Haase, D Hahn, S Neubauer, M von Kienlin. Concentrations of human cardiac phosphorus metabolites determined by SLOOP 31P NMR spectroscopy. Magn Reson Med 41:657–663, 1999.
52. CI Haupt, N Schuff, MW Weiner, AA Maudsley. Removal of lipid artifacts in 1H spectroscopic imaging by data extrapolation. Magn Reson Med 35:678–687, 1996.
53. ZP Liang, PC Lauterbur. Principles of Magnetic Resonance Imaging: A Signal Processing Perspective. New York, IEEE Press, 2000.
54. JD O'Sullivan. A fast sinc function gridding algorithm for Fourier inversion in computed tomography. IEEE Trans Med Imag MI-4:200–207, 1985.
55. JJ Jackson, CH Meyer, DG Nishimura, A Macovski. Selection of a convolution function for Fourier inversion using gridding. IEEE Trans Med Imag 10:473–478, 1991.
56. AR Thompson, RN Bracewell. The main beam and ring-lobes of an east-west rotation-synthesis array. Astronom J 29:77–94, 1973.
19
Characterization of Brain Tissue from MR Spectra for Tumor Discrimination

Paulo J. G. Lisboa, Wael El-Deredy, Y. Y. Barbara Lee, Angelica R. Corona Hernandez, and Peter Harris
John Moores University, Liverpool, United Kingdom

Yangxin Huang
Harvard School of Public Health, Chestnut Hill, Massachusetts

Carles Arús
Universitat Autònoma de Barcelona, Spain
This chapter critically reviews the state-of-the-art methodology in automatic brain tissue and tumor identification using statistical pattern recognition of in vivo 1H NMR spectra. A systematic methodology for tissue assignment to one of many tissue types directly from the spectra is proposed. This relies on the repeated application of a resampling method, the bootstrap, to provide a robust identification of a small subset of predictive resonance intensities from the entire spectrum, as well as unbiased estimates of misclassification errors. The selected resonances can be used to identify the metabolites most relevant for discrimination between pairs of tissue types, and the pairwise classifiers are optimally combined to obtain class assignment probabilities.

1 INTRODUCTION
Magnetic resonance spectroscopy (MRS) is a noninvasive tool that is unique in providing a detailed fingerprint of the biochemistry of living tissue. In
clinical cases, where the results of imaging are still ambiguous for tissue identification, greater specificity may be sought using the additional information contained in the spectrum. For instance, Kondziolka et al. [1] reported that imaging alone is not sufficient to predict reliably the histological diagnosis of astrocytomas and recommended that all patients suspected of having supratentorial mass lesions should undergo stereotactic biopsy for confirmation. However, biopsies are invasive, risky, and expensive procedures that carry risks of morbidity and mortality. Given the wealth of evidence already available from clinical symptoms and radiological imaging, it is useful to characterize the predictive power of MRS for brain tumor classification as well as for differential diagnosis. From the standpoint of data-based analysis, the attraction in making tissue assignments independently of prior clinical knowledge is tempered by the small sample sizes usually available for study. This has the double effect of reducing the evidence for predictive components of the spectrum, severely limiting the size and scope of the models supported by the data, and resulting in conservative estimates of classification performance. In this chapter we show that current sample sizes are still some way from achieving optimal classification rates. In general, the spectral information is eyeballed by the interpreter, and specific frequency intensities form the basis for the clinical assessment of the spectrum—based on clinical and biochemical prior knowledge. While the richness of the MR spectrum and the experience of the interpreter are clearly important in this process, there are no guarantees that the spectral regions usually identified as clinically relevant are indeed the most predictive of tissue type or tumor grade on the basis of unbiased statistical evidence. Other parts of the spectrum, and combinations of the spectral components, may be optimal for specific discrimination tasks. This chapter focuses on the identification of the individual spectral components (metabolites) that, in combination, yield maximum predictive power. Here we are motivated by the robustness of parsimonious statistical models. The ultimate aim is to interpret clinically the relevance and role of the metabolic indicators found to be predictive for differential diagnosis between grades and types of brain tumor. The data that illustrate the results of this study consist of single voxel PROBE spectra acquired in vivo for five tumor types and cystic regions from some of those tumors as well as cystic growths. The protocol used for data acquisition is detailed in Tate et al. [2], and the total sample of 98 spectra used in this study comprises 17 low grade astrocytoma (AST), 24 glioblastoma (GL), 10 metastases (ME), 22 meningiomas (MM), 9 oligodendrogliomas (OD), and 16 spectra from cystic regions (CY). Figure 1 shows the median and the upper and lower quartiles of the six tissue types.
FIGURE 1 Median spectral intensities (solid line) and lower and upper quartiles (broken lines) for each of the tissue types used in the study.
2 PATTERN RECOGNITION OF MRS DATA FOR INVESTIGATING BRAIN TUMORS
The role of MRS in the diagnosis of brain tumors remains controversial [3]. Although MRS has been shown to be capable of identifying tissue types and detecting abnormalities, its clinical potential has not been fully utilized. Tissue heterogeneity is probably the main cause of the large variation between spectral patterns of samples with the same clinical diagnosis. A recent study by Lin et al. [4] used clinical measures to assess the efficacy of 1H MRS in clinical decision making by following up patients to determine if MRS accurately predicted the clinical outcome or the surgical findings. Large overlaps between spectra from different diagnoses, poor signal-to-noise ratios, and the technical difficulties of producing good field homogeneity and high spectral resolution make the identification of lower concentration metabolites difficult. As MRS becomes a routine measurement in neurological investigation, automated decision support systems for classifying brain tissue based on MRS will require robust and rigorous evaluation paradigms in order for these systems to be clinically acceptable and practically usable. During the last decade, numerous applications of statistical pattern recognition and neural networks for MRS data analysis have been proposed (see Refs. 5 and 6 for extensive reviews on pattern recognition in MRS data analysis). The majority of reported results reduce the dimensionality of the spectra using principal components analysis (PCA) (e.g., Ref. 7) or by selecting known metabolites and using peak heights relative to another metabolite (usually creatine or choline) (e.g., Ref. 8). Linear discriminant analysis (LDA) is then used to generate classification boundaries between the known grouping of the data [2]. Other techniques include neural networks [9,10,11], genetic programming [12], and a mixture of all these techniques in a voting system [13]. One may argue that the success of the pattern representation process is largely dependent on the way the dimensionality of the spectral patterns is reduced, or on the choice of metabolites. For example, whereas Kugel et al. [14] reported that no correlation between lactate and tumor grade was found, Preul et al. [8] used lactate as one of the six variables for the LDA classifier. Smaller dimension patterns are desirable because they yield more compact and less complex pattern recognition systems, which, by the principle of Ockham's razor, are generally more robust. In practice, the predictive power of statistical pattern recognition methods deteriorates quickly as the number of variables used comes close to the size of the data sample. The performance of the classifier deteriorates as the dimension of its input increases (the curse of dimensionality) [15] as noise effects present in the covariates accumulate and affect the diagnostic prediction. Therefore there
is an unavoidable trade-off to be made between having sufficient predictive power in the covariates for the discrimination task and keeping the number of covariates to a minimum. The majority of reported statistical analysis techniques for MRS are based on studies using a number of data points (observations) of the same order of magnitude as the number of variables and parameters estimated to build the system. Hagberg [6] reported the number of variables used in designing the classifier, the size of the design data and the test data and the reported accuracy of the classifier. Although it may be possible to achieve reasonable classification accuracy using the small data set available, the predictive power of such a classifier when applied to future data, for which there is no histological information, is questionable. In addition, there is very little reporting of confidence levels on estimates of classification performance for out-of-sample data. Yet, demonstrating the reliability of the computer-based decision support systems, in safety-critical situations like medical diagnosis, is essential, both in order to gain practical acceptance by clinicians and for regulatory purposes including clinical trial design and certification.

3 SYSTEMATIC METHODOLOGIES FOR TISSUE DISCRIMINATION WITH 1H SPECTRA
In vivo 1H MRS, typically of brain tumors, is characterized by a significant mixing between tissue types (classes) and small sample sizes. Tissue characterization is difficult because of the inherent complexity in the signal due to tissue inhomogeneity, partial volume effects, and interpatient variation, together with a low signal-to-noise ratio. The extensive mixing between classes can be gauged by identifying the source signals underlying the complete data set. Since this paper is intended to review purely data-based methods, this unmixing must be carried out without recourse to prior knowledge of the spectral profile of pure metabolites or assumptions concerning the profiles of different tissue types present in tumors, such as infiltrating tumor, necrotic, and cystic tissue. The state-of-the-art methodology in blind signal separation is the application of independent components analysis (ICA) [16]. This approach decomposes the original signals into a linear combination of supra-Gaussian sources, in a manner analogous to treating the spectra as sound traces recorded over a number of time intervals equal to the spectral bins into which the spectra are digitized, with each spectrum corresponding to the recording from a separate microphone. In our data set, there are 194 spectral components, and 98 spectra, from which we aim to identify the independent sources of the spectra, which are analogous to the number of independent speakers in the
room. The ICA formalism enables the evidence for a number of independent sources to be estimated from the data, resulting in the plot shown in Fig. 2 of the likelihood of ICA models for different numbers of IC sources [16], identified using the software package JADE [17]. There is a clear optimum for just two sources, indicating how much the small sample size impacts the evidence for separating the metabolic source signals that are known to have generated the signal. One of the two sources, shown in Fig. 3, consists largely of the Mobile lipid signals at about 1.3 and 0.9 ppm. The other is a linear combination of the remaining metabolites. The mixing matrix components for the two sources identified from the AST and GL spectra are plotted in Fig. 4. A similar result is obtained for the complete data set, whose
FIGURE 2 Evidence in favor of different numbers of sources, estimated from a penalized likelihood function, for the combined AST and GL spectra.
FIGURE 3 Two sources identified as the most likely to generate the AST and GL spectra.
mixing matrix, shown in Fig. 5, demonstrates an extensive overlap between the five tumor categories. This overlap is reflected in the performance of a single block classifier, simultaneously assigning spectra to all tissue classes. This has a maximum overall accuracy of 56%, compared with that achieved for an optimal combination of pairwise linear discriminant classifiers, which is 73% accuracy overall and considerably better for each differential decision in isolation. The spectral complexity, substantial noise, and small sample sizes suggest the use of simple linear discriminant models for automatic tissue assignment. This is the method of choice for most of the studies reported on in the literature and is also utilized for the experimental results reported in this paper. The following are important issues for data-based characterization of MR spectra using statistical and artificial intelligence methods:
FIGURE 4 Representation of the AST and GL spectra using the coefficients of the mixing matrix of the independent components, which reconstructs the spectra from the two sources identified by ICA.
1. Selecting an appropriate discriminant model—this should involve a verification of any assumptions about the distribution made by the selected methodology.
2. Dimensionality reduction—this is a critical stage in data-based modeling and has a major effect on the discriminant power of the resulting discriminant models. Two alternative approaches are the selection of the most predictive frequency components and taking linear combinations of all frequencies. An important aspect of dimensionality reduction is to demonstrate the robustness of the method used, against possible bias induced by the small sample sizes.
3. Performance evaluation—the robustness of the misclassification error estimation due to small sample sizes is of paramount importance. Another important issue is in optimizing the structure of the discriminant model, for instance, choosing between pairwise, hierarchical, and block-structure classifiers.
FIGURE 5 Combined graphical display of spectra from five tumor types, each represented by the mixing coefficients for two independent component sources, identified from the combined data.
4. Modelling the inherent uncertainty in the tissue assignment—there is a requirement to make explicit the uncertainty in the model predictions for individual spectra, involving also the ability to detect novel data, which lie outside all the classes represented in the model. This is a difficult issue to address rigorously, yet it is a cornerstone for the clinical utilization of statistical methods for decision support systems using MRS.
5. Visualisation—clinicians often find it useful when decision support includes a direct graphical display of the spectra in relation
to historical samples from the highest ranked candidates for class assignment. This may take the form of linear score plots or two-dimensional displays.

3.1 Selecting the Discriminant Model
Selection of a discriminant model involves a choice between (a) selecting individual spectral components as opposed to linear combinations such as PCA and partial least squares (PLS), (b) choosing between linear and nonlinear models, and (c) parametric or nonparametric models; each choice involves particular reliance on assumptions about the distributions. The preferred discriminant model structure depends on the sample size, how many covariates are entered into the model, and the extent of the mixing between classes. Also, when particular distribution assumptions are made, they must be tested for, for instance in the application of Bayesian models to obtain smoothly varying class assignment probabilities. When only a small number of predictive covariates are used to represent the spectra, and the class mixing is substantial, the preferred model is the simplest, i.e., LDA, optimized separately for each class pair. Since there are five distinct tumor types in this study, ten classifiers are generated. Spectra from cystic regions separate well from those of tumors, so this discrimination is carried out using a single model. One disadvantage of the LDA method, however, is that it predicts binary outcomes only, with the consequence that the closeness to the decision boundary separating the two classes can only be inferred informally from the score plot. On the other hand, LDA can be upgraded to provide a probabilistic class assignment, if the spectra are assumed to be normally distributed in the multivariate sense. In this study, the bootstrap resampling [18] procedure is applied to the determination of an "envelope" for the standard tests of multivariate normality, enabling this assumption to be confirmed for the subsets of predictive variables. This is shown graphically in Fig. 6.
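The sketch below illustrates how a pairwise LDA can be extended to posterior class probabilities under the shared-covariance Gaussian assumption discussed above. The features, sample sizes, and the new spectrum are synthetic stand-ins; this is not the study's data or code.

```python
import numpy as np

# Pairwise LDA upgraded to posterior class probabilities via Bayes' theorem under
# a shared-covariance Gaussian assumption (synthetic features standing in for a
# small subset of selected resonance intensities).
rng = np.random.default_rng(3)
n1, n2, d = 17, 24, 3                     # e.g., AST vs GL sample sizes, 3 features
X1 = rng.multivariate_normal([0.0, 0.5, 1.0], 0.3 * np.eye(d), n1)
X2 = rng.multivariate_normal([0.6, 0.0, 1.4], 0.3 * np.eye(d), n2)

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
pooled = ((n1 - 1) * np.cov(X1.T) + (n2 - 1) * np.cov(X2.T)) / (n1 + n2 - 2)
prior1, prior2 = n1 / (n1 + n2), n2 / (n1 + n2)

def log_gauss(x, mu, cov):
    diff = x - mu
    return -0.5 * diff @ np.linalg.solve(cov, diff) \
           - 0.5 * np.log(np.linalg.det(2 * np.pi * cov))

def posterior_class1(x):
    a = log_gauss(x, mu1, pooled) + np.log(prior1)
    b = log_gauss(x, mu2, pooled) + np.log(prior2)
    return 1.0 / (1.0 + np.exp(b - a))    # P(class 1 | x) by Bayes' theorem

x_new = np.array([0.2, 0.3, 1.2])         # a hypothetical new spectrum's features
print(round(posterior_class1(x_new), 3))
```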
3.2 Dimensionality Reduction
A critical stage in any statistical analysis of complex data is the determination of an optimal subset of predictive variables. This has a major effect on the predictive power of the resulting discriminant models when applied to data outside the design sample, since missing out on important predictors reduces the discrimination power for the design data, and including nuisance variables introduces wanton noise into the predictions made on future data. The most common approach to variable selection in LDA is stepwise forward selection, which is available by default in most commercial statistical software. However, when this is applied to the design data, it may select
FIGURE 6 Multivariate normality envelopes for the variables selected to discriminate between AST and GL spectra. These 95% intervals were estimated using bootstrap resampling methods, and they serve to verify that the data from each tissue class are consistent with a multivariate normal distribution, thus enabling Bayes’s theorem to be used to generate continuous estimates of class membership probabilities.
too many covariates, resulting in a model that performs very well for the design data but very poorly for any other data. In an attempt to make the model predictions more robust, it is commonplace to use prior clinical knowledge in order to select particular variables, or combinations of measured variables, typically using area ratios. While this results in more robust predictions for out-of-sample spectra applications, it can miss out on statistically powerful variables. We apply the bootstrap method [18] to variable selection, ranking each frequency component on the basis of the frequency with which it is included in the final discriminant model. This procedure is applied 5,000 times to each pair of tumors, each time resampling the available data in order to simulate the variation in model selection due to the small sample size [18]. The spectral intensities are first ranked in order of the multivariate separation between the two classes, as measured by the Mahalanobis distance, which is a multivariate extension of the ratio of between- to within-class variances. Once the predictive variables are ranked in order of importance, a stepwise model selection procedure is carried out,
Copyright 2002 by Marcel Dekker, Inc. All Rights Reserved.
which relies on estimates of the predictive performance of LDA using the candidate variable subset. For any variable selection problem in discriminant analysis, a preliminary univariate examination of all candidate variables should be carried out first [19]. This ensures that only variables showing significant discriminatory power enter the subsequent analysis, and hence reduces the influence of noise. Here, we adopt a univariate T²-test for the difference in means between the spectra from the two classes [20].
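A minimal sketch of the ranking stage described above, under stated assumptions: the per-resample selection is approximated here by keeping the top-m variables of a univariate between- to within-class variance ratio, whereas the study itself uses the full stepwise LDA/McNemar procedure; `top_m`, the variable names, and the 5,000 resamples are illustrative.

```python
import numpy as np

def bootstrap_selection_frequency(X_a, X_b, n_boot=5000, top_m=5, seed=0):
    """Count how often each spectral intensity ranks among the most discriminant
    variables across bootstrap resamples of the two classes."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X_a.shape[1])
    for _ in range(n_boot):
        A = X_a[rng.integers(0, len(X_a), len(X_a))]   # resample class A with replacement
        B = X_b[rng.integers(0, len(X_b), len(X_b))]   # resample class B with replacement
        between = (A.mean(axis=0) - B.mean(axis=0)) ** 2
        within = A.var(axis=0, ddof=1) + B.var(axis=0, ddof=1)
        ratio = between / np.maximum(within, 1e-12)    # separation of the two classes
        counts[np.argsort(ratio)[-top_m:]] += 1        # record the selected variables
    return counts / n_boot                             # inclusion frequency per variable
```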
3.3 Performance Evaluation
Another important aspect is the robustness of the method against bias induced by the small sample sizes. This is addressed by a third application of bootstrap resampling, which provides bias-corrected estimates of the misclassification error expected when the model is applied to future data. This is known as the .632 method, reflecting the way predictions for each spectrum in the design sample are formed by combining bootstrap resamples in which that spectrum is used to fit the LDA with those in which it is held out [18]. The significance criterion for inclusion of new variables into the model, or for the rejection of candidate variables, is McNemar's test [21,22], a nonparametric test for the significance of the difference in performance between any two classifiers, based solely on the spectra misclassified by just one of the two. The classification performances for the ten class pairs, and for cysts vs. the rest, are shown in Tables 1 and 2 for the LDA and its Bayesian extension, respectively. In each case, a different optimal set of predictive variables is selected by the automatic application of the method outlined in the previous section.
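The .632 estimate itself reduces to a weighted combination of the apparent (resubstitution) error and the error measured on spectra left out of each bootstrap resample [18]. The sketch below assumes generic `fit`/`predict` callables standing in for the pairwise LDA; the number of resamples is illustrative.

```python
import numpy as np

def err_632(X, y, fit, predict, n_boot=200, seed=0):
    """Efron's .632 estimator: 0.368 * apparent error + 0.632 * out-of-bootstrap error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    err_app = np.mean(predict(fit(X, y), X) != y)      # apparent (resubstitution) error
    err_sum = np.zeros(n)
    cnt = np.zeros(n)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                    # bootstrap resample indices
        out = np.setdiff1d(np.arange(n), idx)          # spectra not drawn this time
        if out.size == 0:
            continue
        model = fit(X[idx], y[idx])
        err_sum[out] += (predict(model, X[out]) != y[out])
        cnt[out] += 1
    eps0 = np.mean(err_sum[cnt > 0] / cnt[cnt > 0])    # leave-out bootstrap error
    return 0.368 * err_app + 0.632 * eps0
```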
TABLE 1 Leave-One-Out Estimate of the Predictive Accuracy of LDA Using McNemar's Test as the Significance Test for Variable Selection (%)

        GL    ME    MM    OD
AST     78    89    77    73
OD      64    89    87
MM      76    84
ME      68
CY      98 (vs. Rest)
TABLE 2 Predictive Accuracy of LDA-Bayes (%)

        GL    ME    MM    OD
AST     68    89    77    69
OD      69    89    84
MM      78    88
ME      79
CY      96 (vs. Rest)
A tentative assignment of the predictive variables selected is shown in Table 3. Although the two sets of results are broadly consistent, there are significant differences both in the predicted performance and in the selection of the predictive frequency components. This is symptomatic of the sensitivity of any systematic method of discriminant analysis when it is applied to a complex class assignment task, on the basis of a relatively small sample size. Once the above modeling and evaluation are completed, it is necessary at least to gauge the extent to which the sample sizes are limiting the predictive power of MRS. This is done empirically by estimating the predictive performance of a fixed variable subset for two class pairs, using progressively larger subsets of the design data. The results, shown in Fig. 7, indicate that the current sample sizes do not achieve optimal performance accuracy
TABLE 3 Selected Subset of Resonances in ppm with Nearest Metabolite Assignments

        GL                         ME                             MM               OD
AST     1.90, 2.22 (GABA, n/a)     2.34, 3.05 (Glu, FA), (PCr)    2.32 (Glu, FA)   2.32 (Glu, FA)
OD      2.34 (Glu, FA)             1.06 (FA)                      2.85 (n/a)
MM      3.73 (Glu)                 3.73 (Glu)
ME      2.68 (n/a)
CY      1.38 (Lac, FA) (vs. Rest)

Glu, glutamic acid; FA, fatty acid; Lac, lactate; (P)Cr, (phospho)creatine; n/a, an unassigned resonance.
FIGURE 7 Bias-corrected estimates of misclassification error, calculated using the bootstrap for increasing subsets of the available samples, for two pairs of tumors.
for these two class pairs. A numerical listing of the results is shown in Table 4.
3.4 Modelling Uncertainty
Uncertain classifications can be identified in the Bayesian model by class assignment probabilities close to 50%. Figure 8 shows how the classification performance for the separation of glioblastomas from astrocytomas varies with the rejection threshold. When these models are applied in a blind test involving new data, it is necessary to combine the pairwise predictors to obtain a single, consistent estimate of the overall class assignment probability for the complete set of candidate tissue categories. This can be achieved in two fundamentally different ways. The first, due to Friedman, is to count how many times each class "wins" a pairwise contest, treating spectra for which two or more classes win the same number of contests as candidates for rejection. An alternative method, introduced by Hastie and Tibshirani [23], is to estimate a single set of consistent real-valued class assignment probabilities. However, for N classes there are N(N−1)/2 observed pairwise classifications, whereas only N degrees of freedom correspond to the class assignment probabilities of each class against the rest, so an exact match of the observed pairwise results is not in general possible. For this reason, optimization methods are used [23].
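A sketch of the first (max-wins) combination rule, with tie handling as described above; the dictionary layout of the pairwise predictions and the class labels are assumptions for illustration, and the coupling approach of Hastie and Tibshirani [23], which solves an optimization problem for a consistent probability vector, is not shown.

```python
def max_wins(pairwise_winners, classes):
    """Friedman's rule: every pairwise classifier casts one vote for its predicted class.

    pairwise_winners: dict mapping (class_i, class_j) -> winning class for one spectrum.
    Returns (assigned_class_or_None, vote_counts); None marks a tie, i.e., a
    candidate for rejection."""
    votes = {c: 0 for c in classes}
    for winner in pairwise_winners.values():
        votes[winner] += 1
    best = max(votes.values())
    leaders = [c for c, v in votes.items() if v == best]
    return (None if len(leaders) > 1 else leaders[0]), votes

# Hypothetical usage for one spectrum and three of the tumor classes:
label, votes = max_wins({("AST", "GL"): "GL",
                         ("AST", "MM"): "AST",
                         ("GL", "MM"): "GL"}, ["AST", "GL", "MM"])
```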
TABLE 4 Bias-Corrected Estimates of Misclassification Error for Increasing Subsets of the Available Data for Two Pairs of Tumors

Class pair     Sample sizes [class 1, class 2]    Estimated misclassification error
AST vs. GL     [7,14]                             0.30
               [8,16]                             0.24
               [11,18]                            0.29
               [12,20]                            0.23
               [15,22]                            0.21
               [17,24]                            0.16
AST vs. MM     [9,14]                             0.29
               [11,16]                            0.19
               [13,18]                            0.17
               [15,20]                            0.12
               [17,22]                            0.12
FIGURE 8 Trade-off between classification accuracy and rejection rate for spectra from AST and GL, as a function of an upper rejection threshold, applied to the class assignment probabilities produced by an LDA-Bayes classifier.
The overall accuracy of class assignment for the optimal variable subsets is estimated for the whole data set by leave-one-out (LOO) cross-validation. These subsets are derived through the triple application of the bootstrap procedure outlined earlier: in testing for normality, in variable selection and, again, in pairwise performance estimation. The overall classification accuracy of Friedman's method is 74% with 18 ties using LOO-LDA, and 65% with 13 ties for Bayes's LOO-LDA; the classification performance of the method of Hastie and Tibshirani, which requires Bayes's LDA, is 73% with 36 rejected spectra at a rejection threshold of 0.5.
3.5 Visualization
Finally, clinicians often find it useful when decision support includes a visualization of patient information in two-dimensional plots or even linear scoring scales. The preferred visualization of spectra from class pairs relies heavily, if indirectly, upon the complex variable selection methodology, which in turn is strongly influenced by the choice of classifier. This is because the plot (shown in Fig. 9) is composed of the statistics of the predictive variable subset that the earlier analysis demonstrated to be the most robust for class assignments of future data, and whose classification performance is characterized by the tables above. The visualization can be shown to capture both mean and variance (MV) [24] information in the data, resulting in plots that accurately reflect the distance between class means, in relation to the covariance structure of the data, given the selection of covariates.
FIGURE 9 Mean-variance (MV) plot for AST vs. GL spectra, using the statistics of the resonances selected as the most predictive of class membership.
The visual improvement in class separation relative to the original independent components plots can be readily observed by comparing Figs. 9 and 4. The MV method will, of course, return a two-dimensional map regardless of the dimensionality of the vector of covariates.
4 PERSPECTIVES FOR FUTURE RESEARCH
As the application of MRS by clinicians to resolve ambiguities in MRI becomes established, it is hoped that semiautomated methods of decision support will add to the benefits that can be gained by pooling data across clinical centers, thus significantly increasing the statistical power of the databases. It was noted earlier that the returns that can be gained by sharing data may act as a spur for large-scale collaborations, leading to a greater standardization of the acquisition protocols across the MR community. It is also increasingly the case that MRS is one among multiple noninvasive modalities for use in tissue identification and analysis. Other MR modalities typically include perfusion traces. The fusion of multimodal MR data is an important and immediate priority for future research. Finally, the physical modeling of the MR spectrum is seldom reported in tissue assignment studies. Three aspects of this are metabolite quantification, particularly as a potential vehicle for intercenter compatibility of data; modeling the variation in position of the chemical shift spectrum, in order to make the discriminant models insensitive to small but unavoidable frequency shifts; and the determination of the characteristic sources that drive the spectra for different tissue classes, using advanced modeling techniques borrowed from blind signal separation.
5 CONCLUSION
This chapter has reviewed the use of statistical pattern recognition for the automatic tissue assignment of magnetic resonance spectra from the brain. Tissue discrimination on the basis of historical data is possible with statistical rigor, yielding varying levels of accuracy. Differential tissue assignments are carried out with models based on a small number of frequency components, selected for their collective predictive power, and overall tissue assignment is demonstrated by optimally combining the predictions from pairwise models. Arguably the main limiting factor for a clear demonstration of the predictive power of statistical pattern recognition of brain MRS is the need for larger sample sizes, which must be judiciously constructed using information about the classification accuracy for different tissue types. A systematic methodology is proposed for model design, performance evaluation, and data visualization, whose robustness against bias effects from small data samples is assured by the repeated and extensive application of
bootstrap resampling methods. The tissue assignments predicted by these models will be valuable even if they only rank tissue types preferentially in pairs which, in conjunction with complementary clinical and imaging information, allow an existing ambiguity to be resolved.

ACKNOWLEDGMENTS

The authors gratefully acknowledge A. R. Tate, from St. George's Hospital Medical School, London, for making available preprocessed and normalized PROBE spectra for this study. C. A. was funded by CICYT SAF99-0101.

REFERENCES
1. D Kondziolka, LD Lunsford, AJ Martinez. Unreliability of contemporary neurodiagnostic imaging in evaluating suspected adult supratentorial (low-grade) astrocytoma. J Neurosurg 79:533–536, 1993.
2. AR Tate, JR Griffiths, I Martínez-Pérez, A Moreno, I Barba, ME Cabañas, D Watson, J Alonso, F Bartumeus, F Isamat, E Ferrer, A Capdevila, C Arús. Towards a method for automated classification of 1H MRS spectra from brain tumours. NMR Biomed 11:171–191, 1998.
3. A Shiino, S Nakasu, M Matsuda, J Handa, S Morikawa, T Inubushi. Noninvasive evaluation of the malignant potential of intracranial meningiomas performed using proton magnetic resonance spectroscopy. J Neurosurg 91:928–934, 1999.
4. A Lin, S Bluml, AN Mamelak. Efficacy of proton magnetic resonance spectroscopy in clinical decision making for patients with suspected malignant brain tumors. J Neurooncol 45:69–81, 1999.
5. W El-Deredy. Pattern recognition approaches in biomedical and clinical magnetic resonance spectroscopy: a review. NMR Biomed 10:99–124, 1997.
6. G Hagberg. From magnetic resonance spectroscopy to classification of tumors. A review of pattern recognition methods. NMR Biomed 11:148–156, 1998.
7. G Hagberg, AP Burlina, I Mader, W Roser, EW Radue, J Seelig. In vivo proton MR spectroscopy of human gliomas: definition of metabolic coordinates for multi-dimensional classification. Magn Reson Med 34:242–252, 1995.
8. MC Preul, Z Caramanos, DL Collins, J-G Villemure, R Leblanc, A Olivier, P Ronald, DL Arnold. Accurate, non-invasive diagnosis of human brain tumours by using proton magnetic resonance spectroscopy. Nat Med 2:323–325, 1996.
9. NM Branston, RJ Maxwell, SL Howells. Generalization performance using backpropagation algorithms applied to patterns derived from tumor 1H-NMR spectra. J Microcomputer Appl 16:113–123, 1993.
10. J-P Usenius, S Tuohimets, P Vainio, M Ala-Korpela, Y Hiltunen, R Kauppinen. Automated classification of human brain tumours by neural network analysis using in vivo 1H magnetic resonance spectroscopic metabolite phenotypes. NeuroReport 7:1597–1600, 1996.
11. H Poptani, J Kaartinen, RK Gupta, M Niemitz, Y Hiltunen, RA Kauppinen. Diagnostic assessment of brain tumours and non-neoplastic brain disorders in vivo using proton nuclear magnetic resonance spectroscopy and artificial neural networks. J Cancer Res Clin Oncol 125:343–349, 1999.
12. HF Gray, RJ Maxwell, I Martinez-Perez, C Arus, S Cerdan. Genetic programming for classification and feature selection: analysis of H-1 nuclear magnetic resonance spectra from human brain tumour biopsies. NMR Biomed 11:217–224, 1998.
13. RL Somorjai, AE Nikulin, N Pizzi, D Jackson, G Scarth, B Dolenko, H Gordon, P Russell, CL Lean, L Delbridge, CE Mountford, ICP Smith. Computerized consensus diagnosis: a classification strategy for the robust analysis of MR spectra. I. Application to 1H spectra of thyroid neoplasms. Magn Reson Med 33:257–263, 1995.
14. H Kugel, W Heindel, RJ Ernestus, J Bunke, R Dumesnil, G Friedmann. Human brain tumors: spectral patterns detected with localized H-1 MR spectroscopy. Radiology 183:701–709, 1992.
15. RO Duda, PE Hart. Pattern Classification and Scene Analysis. New York: John Wiley, 1973.
16. R Everson, SJ Roberts. Independent components analysis. In: PJG Lisboa, EC Ifeachor, PS Szczepaniak, eds. Artificial Neural Networks in Biomedicine. London: Springer (Perspectives in Neural Computing), 2000, pp 153–168.
17. J-F Cardoso. Higher-order contrasts for independent component analysis. Neural Computation 11:157–192, 1999.
18. B Efron, RJ Tibshirani. An Introduction to the Bootstrap. New York: Chapman and Hall, 1993.
19. GP McCabe. Computations for variable selection in discriminant analysis. Technometrics 17:103–109, 1975.
20. AC Rencher. Multivariate Statistical Inference and Applications. New York: John Wiley, 1998.
21. BD Ripley. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press, 1996.
22. JL Fleiss. Statistical Methods for Rates and Proportions. New York: Wiley Interscience, 1973.
23. T Hastie, RJ Tibshirani. Classification by pairwise coupling. Technical Report, University of Toronto, 1996.
24. WC Cheng. A graph for two training samples in a discriminant analysis. Appl Statist 36:82–91, 1987.
20 Wavelet Packets Algorithm for Metabolite Quantification in Magnetic Resonance Spectroscopy and Chemical Shift Imaging

Luca T. Mainardi and Sergio Cerutti
Polytechnic University, Milan, Italy
Daniela Origgi European Institute of Oncology, Milan, Italy
Giuseppe Scotti San Raffaele Hospital, Milan, Italy
1 INTRODUCTION
In the past few decades there has been an increasing interest in magnetic resonance spectroscopy (MRS) in different fields, ranging from analytical chemistry through material sciences to biomedical applications. In particular, the well-known success and the widespread application of MRS techniques in biomedical research and medical practice have been supported by a number of inherent advantages of this technique: MRS performs repetitive, nondestructive measurements of metabolic processes in situ as they proceed in their own environment, and it allows the extraction of valuable in vivo
information on the physiological and pathological states of human tissues in different organs [1]. It is known that the metabolic information contained in the magnetic resonance (MR) signal (the free induction decay [FID] signal) is apparent in its spectrum. In fact, in MRS spectra different compounds appear as different resonance peaks: the position of the resonance identifies the specific compound, the peak line width indicates the transverse relaxation time of the nucleus (an index linked to the mobility of the molecule), and the total area of the resonance peak is proportional to the concentration of the detected substance. The quantification of these parameters leads to metabolic tissue characterization [1,2].

Since MRS was first applied to humans, 1H-MRS has attracted much attention. The reason is that the proton is the most sensitive stable nucleus and hydrogen atoms are abundant in living tissues [3]. Nowadays, thanks to recent technical advances in MR instrumentation, 1H-MRS is routinely applied in clinical settings, especially in brain studies, where it has been documented to be effective in the diagnosis, prognosis, and treatment selection of cerebral tumors [4,5], cerebral ischemia [6], epilepsy [7,8], and multiple sclerosis [9,10]. Changes in the 1H-MRS resonance patterns have been observed between normal brain and cerebral tumors, with potential applications for the grading and classification of different tumor types [11,12]. The metabolism of cerebral ischemia shows both acute and chronic changes, with relevant implications from both a pathophysiological and a therapeutic point of view [24,25]. Finally, a few studies describe the possibility of investigating, by MR spectroscopy, the metabolic characteristics of multiple sclerosis (MS) lesions classified as acute, subacute, and chronic [9,10,26]. MS also presents several clinical patterns, including benign, relapsing-remitting, secondary progressive, and primary progressive, which are mostly not well recognized in MR images. The possibility offered by MRS to study and to distinguish the different MS forms can therefore be of great value in establishing diagnosis, monitoring progression, and evaluating therapy [27].

Recently, the development of localization techniques has led to the introduction of chemical shift imaging (CSI) methods and to the production of images reflecting the spatial distributions of various chemical species. In this way, both localized metabolic information (through single-voxel (SV) techniques) and maps of metabolite concentration are available to the clinician [13,14]. Despite its clinical importance, the processing of in vivo 1H-MRS signals and the extraction of the relevant information is not a trivial task. Major problems, which may limit the theoretical potential of the technique, include the narrow chemical shift range of 1H signals, which requires precise shimming of the B0 field, and the presence of unwanted water and
lipid contributions, which overwhelm the small contributions of the metabolites of medical interest [3,15]. In addition, the presence of severe phase distortions [16] and substantial overlaps among spectral peaks [17] makes it difficult to quantify the parameters of interest, especially in a clinical environment where short echo times (TE = 20 ms) and low-intensity magnetic fields (B0 ≤ 2 T) are employed. Finally, good shimming and correct suppression of the water contribution usually depend on the intervention of the experimenters, thus limiting the repeatability of the study [3]. For these reasons, there is a need for robust and reliable signal processing methods that make it possible to extract the relevant FID information. The methods should be fast, automatic, and operator-independent.

In this chapter a time-domain signal processing technique suitable for the extraction of the MRS parameters is described and applied to the processing of both single-voxel and CSI data. The method is based on a subband orthonormal decomposition of the original FID performed through wavelet packets (WP). Recently introduced [18,19], WPs provide a signal-adaptive framework for time-frequency analysis, a natural extension of the wavelet transform [20–22] in which arbitrary time and frequency resolution can be obtained according to the signal. The combination and superimposition of wavelets make it possible to realize an ad hoc decomposition of the original FID signals, in which the contributions of the different metabolites are separated and easily quantified.

This chapter is organized as follows: first, an introduction to the problem of FID data processing is presented; then, after a brief review of basic wavelet concepts, the proposed approach is developed; next, simulations using synthetic FID signals and phantom acquisitions are presented and the performance of the method is described; finally, a few examples of applications to real data are shown, in the construction of metabolic maps in both healthy subjects and diseased patients.
2 DATA PROCESSING
The purpose of signal processing methods in MRS is to estimate the concentration of the various metabolites that contribute to the free induction decay (FID) signal. The FID is usually described as a sum of p complex damped sinusoids according to [23]
x(t) = \sum_{k=1}^{p} A_k\, e^{j\phi_k}\, e^{-b_k t}\, e^{j\omega_k t} + w(t)    (1)

where A_k is the amplitude of the kth sinusoid, ω_k is its frequency, b_k is the damping factor, φ_k is the phase, and w(t) is a Gaussian noise process. A_k is
the relevant parameter for metabolite quantification. In fact, A_k is proportional to the number of nuclei contributing to the response at the frequency ω_k, a number which depends on the metabolite concentration. Ignoring the noise term w(t), in the frequency domain this relationship becomes
X(\omega) = \sum_{k=1}^{p} \frac{A_k\, e^{j\phi_k}}{j\left(\omega - (\omega_k + j b_k)\right)}    (2)
consisting of a set of Lorentzian functions. After correct phasing (φ_k = 0) of the spectrum, the absorption-mode spectrum (i.e., its real part) becomes
\mathrm{Re}\{X(\omega)\} = \sum_{k=1}^{p} \frac{A_k\, b_k}{b_k^2 + (\omega - \omega_k)^2}    (3)
and the metabolite concentration is easily obtained as the integral of each spectral line. Therefore the calculation of metabolite concentration can be carried out in either the time domain or the frequency domain, requiring accurate estimation of the amplitude or the peak area, respectively. Several algorithms have been proposed in the research literature for both domains. Frequency-domain methods are based on accurate fitting of spectral lines by different curves (Lorentzian, Gaussian, or a mixture of the two) [29–31] through linear and nonlinear fitting procedures, with or without a priori knowledge [32]. Most time-domain techniques are based on linear prediction theory using a least-squares (LS) identification procedure [33,34], but nonlinear LS optimization has also been employed [35]. Recently, methods based on the wavelet transform have been introduced [37,38]. A multicenter study comparing the performances of the different methods can be found in de Beer et al. [39].
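As a concrete illustration of the signal model in Eqs. (1)–(3), the sketch below generates a synthetic FID as a sum of complex damped sinusoids plus Gaussian noise and computes its spectrum; the three resonances, their parameters, and the sampling settings are invented for illustration and do not correspond to any acquisition reported in this chapter.

```python
import numpy as np

def synthetic_fid(amps, freqs_hz, damps, phases, n=1024, dt=1e-3, noise_sd=0.01, seed=0):
    """Sum of p complex damped sinusoids plus complex Gaussian noise, as in Eq. (1).

    Frequencies are given in Hz (omega_k = 2*pi*f_k), dampings in 1/s, phases in rad."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) * dt
    fid = np.zeros(n, dtype=complex)
    for A, f, b, ph in zip(amps, freqs_hz, damps, phases):
        fid += A * np.exp(1j * ph) * np.exp(-b * t) * np.exp(2j * np.pi * f * t)
    fid += noise_sd * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    # The real part of the correctly phased spectrum is a sum of Lorentzian lines (Eq. 3),
    # whose areas are proportional to the amplitudes A_k.
    spectrum = np.fft.fftshift(np.fft.fft(fid))
    freq_axis = np.fft.fftshift(np.fft.fftfreq(n, d=dt))
    return t, fid, freq_axis, spectrum

# Three hypothetical resonances: (amplitude, frequency in Hz, damping in 1/s, phase in rad)
t, fid, f_axis, spec = synthetic_fid([1.0, 0.6, 0.4], [50.0, 120.0, 160.0],
                                     [8.0, 10.0, 12.0], [0.0, 0.0, 0.0])
```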
2.1 Wavelets and Wavelet Packets
A signal x(t) ∈ L²(R) (the space of square-integrable functions) can be expressed by a linear combination of elementary building blocks φ_i, usually referred to as basis functions:

x(t) = \sum_{i} \langle \varphi_i(t), x(t) \rangle\, \varphi_i(t) = \sum_{i} c_i\, \varphi_i(t)    (4)

where ⟨·,·⟩ is the inner product in L²(R), t is the time index, and ⟨φ_i, φ_j⟩ = δ_{i,j} [36]. Equation (4) is the signal expansion of x(t) through the φ_i. For analytical and practical reasons, all the basis functions φ_i are obtained from a small set of transformations of a single prototype function. These transformations include modulation, shift, and scaling operations. An interesting family of orthonormal bases is obtained from a discrete set of dilated (index i) and translated (index k) versions of a single prototype ψ, called the mother wavelet:
\psi_{i,k}(t) = 2^{-i/2}\, \psi\!\left(\frac{t}{2^{i}} - k\right), \qquad i, k \in \mathbb{Z}    (5)
Associated with the concept of wavelet expansion is the concept of multiresolution analysis of the signal x(t) [40]. At a generic resolution level J we may rewrite (4) as
x(t) = \sum_{j=1}^{J} \sum_{k} \langle \psi_{j,k}(t), x(t) \rangle\, \psi_{j,k}(t) + \sum_{k} \langle \varphi_{J,k}(t), x(t) \rangle\, \varphi_{J,k}(t)
     = \sum_{j=1}^{J} \sum_{k} c_{j,k}\, \psi_{j,k}(t) + \sum_{k} d_{J,k}\, \varphi_{J,k}(t)    (6)
where φ_{J,k} is called the scaling function. In Eq. (6) the first term represents the wavelet signal decomposition up to level J, while the second term is the coarse approximation of x(t) at level J. Figure 1 shows a wavelet decomposition of a signal using Haar wavelets. The lower panels show the coarser approximations of the original signal, while the upper panels show the details at different resolutions. At every level of the decomposition, details are extracted from the previous coarse approximation of the signal using combinations of the same wavelet function. The wavelet shape is shown below the details. Thus the wavelet decomposition at level J consists of the coarsest approximation [second term of Eq. (6)] and all the details from level 1 to J [first term of Eq. (6)]. It has been shown [20,21] that the coefficients c_{j,k} and d_{J,k} can be computed through specially designed quadrature mirror filters (QMF) by iterating the basic filtering structure of Fig. 2a. Here h[n] and g[n] represent the impulse responses of high-pass and low-pass FIR filters, respectively. In the orthonormal case the following relations link h[n], g[n], ψ, and φ [40]:
\varphi(t) = 2 \sum_{k} g[k]\, \varphi(2t - k), \qquad \psi(t) = 2 \sum_{k} h[k]\, \varphi(2t - k), \qquad h[k] = (-1)^{k}\, g[1 - k]    (7)
Wavelet decomposition can be obtained by iterating the filtering procedure according to the classical dyadic structure in Fig. 3a. Instead of selecting a unique wavelet basis ψ, it is possible to construct a library of orthonormal functions and search for the best expansion with respect to a given application [41]. This is achieved by introducing the concept of wavelet packets (WP), i.e., linear combinations and superimpositions of wavelets [41]. In practice, it is possible for every node of the tree to become the parent of two immediate descendants, or children, as shown in Fig. 3b. Because of the orthogonality properties of the quadrature filters, different subsets of possible orthonormal bases are created. In fact, it can be dem-
FIGURE 1 Details (upper panels) and coarser approximations (lower panels) of a signal using Haar wavelet. Different decomposition levels are shown. Sketches of scaled versions of wavelet and scaling functions are shown below the panels.
FIGURE 2 (a) One-stage decomposition block. (b) Modulus of the frequency response of the low-pass (g[n]) and high-pass (h[n]) filters shown in (a).
onstrated that removing any subtree from the complete tree still yields an orthonormal decomposition of the signal [41,42]. It is worth noting that the creation of new children corresponds to a partitioning of the frequency axis, due to the low-pass and high-pass properties of g[n] and h[n]. If the whole decomposition tree is considered (Fig. 3b), at a generic level J the frequency axis is equally partitioned (from 0 to the Nyquist frequency) into 2^J sectors. Moreover, if h[n] and g[n] are swapped at each parent with an odd sequence, a frequency-ordered partitioning is obtained [18]. As we proceed along the tree, frequency localization is increased, while time resolution is reduced due to the down-sampling. The problem is now to find the best basis, or the optimal decomposition tree (as in Fig. 3c), for a given type of signal with respect to a given criterion [41]. This will be discussed in the next section.
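The following sketch illustrates the one-stage analysis block of Fig. 2a (filtering by g[n] and h[n] followed by down-sampling by two) and its recursive application to build a full WP tree; Haar filters are used for simplicity, the normalization differs from the convention of Eq. (7), and the plain recursion does not perform the filter swapping needed for frequency-ordered subbands.

```python
import numpy as np

# Haar quadrature mirror pair; note that h[k] = (-1)**k * g[1 - k]
g = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass analysis filter
h = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass analysis filter

def split(x):
    """One-stage decomposition block: filter, then keep every second sample."""
    return np.convolve(x, g)[1::2], np.convolve(x, h)[1::2]

def wp_tree(x, depth):
    """Full wavelet-packet tree: recursively split every node down to `depth`,
    yielding 2**depth subband signals."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(depth):
        nodes = [child for node in nodes for child in split(node)]
    return nodes

# Usage: eight subbands of a length-64 test signal
subbands = wp_tree(np.random.default_rng(0).standard_normal(64), depth=3)
```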
2.2 Wavelet Packet Decomposition in 1H-MRS Signals
In the analysis of MRS signals, the desired outcome of the WP decomposition is to split the different metabolic contributions into different subbands, or at least to create a reduced set of subsignals each containing only a few resonance peaks. In other words, the WP decomposition should favor frequency localization. It is indeed well known from information theory that the lower the number of sinusoids in a signal, the better the estimation of their parameters [43]. Thus the basic concept is to use the WP decomposition to create a set of subsignals, in order to reduce the complex problem of full-band spectral estimation to a set of simpler spectral estimation problems in subbands. In theory, when pure sinusoidal signals are considered and ideal QMFs are employed, the superiority of spectral estimation performed in subbands, as compared to that performed on the original signal, has been demonstrated
FIGURE 3 Sketches of various decomposition trees. (a) The wavelet basis. (b) The full-tree basis. (c) Wavelet packets.
[44]. In the ideal case, it would therefore be desirable to proceed along the decomposition as far as possible. In practice, due to the nonideal characteristics of the QMF filters and to the broad-band pattern of each resonance peak, the influence of aliasing must be considered every time a parent node is split into two children. At every splitting, aliasing may lead to the attenuation, or even the complete disappearance, of some components. Therefore a signal-dependent decomposition scheme must be applied. The trade-off is to find a decomposition tree that is as deep as possible but still has no aliased components [42]. Practically speaking, when a 1H-MRS spectrum is considered, the splitting of a given node into two children is suggested when all the relevant metabolic peaks lie inside the band of the analysis filters and no signal mode is attenuated or created by the splitting. The estimation of the number of signal modes, before and after the decomposition, can be obtained through the minimum description length (MDL) criterion [45]. It is known that an analytical expression of the MDL criterion exists for complex exponential signals:
MDL(k) = -2 \log\!\left( \frac{\left( \prod_{i=k+1}^{L} \lambda_i \right)^{N}}{\left( \frac{1}{L-k} \sum_{i=k+1}^{L} \lambda_i \right)^{(L-k)N}} \right) + \frac{1}{2}\, k\,(2L - k)\, \log(N)    (8)
where the λ_i are the L eigenvalues of the autocorrelation matrix and N is the number of samples. The number of modes is then the value of k that minimizes the criterion. The MDL thus provides a quantitative criterion for the decomposition: if the number of modes remains constant between the parent and the children, the split is accepted; otherwise it is not [42]. When the criterion is applied to the analysis of a 1H-MRS human brain signal, the decomposition tree reported in Fig. 4 is obtained. Five subband signals are created. Let band_{J,k} denote the frequency band associated with the kth subsignal at level J. The bands containing the relevant peaks are band_{3,2} = 3.68–2.69 ppm, band_{3,3} = 2.69–1.70 ppm, and band_{3,4} = 1.70–0.71 ppm. Residual water-peak contributions fall into band_{3,1}. Thus the WP decomposition automatically isolates this contribution without the need for separate preprocessing of the signal. Once the signal is decomposed, the next step is the identification of the signal components in the selected subbands. Different methods can be applied; here we used the linear prediction singular value decomposition (LPSVD) method [33].
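A sketch of the order-estimation step, under stated assumptions: the eigenvalues are taken from a sample autocorrelation matrix built from an L x N data (Hankel) matrix of the subband signal, a construction that is common in linear prediction methods but is not spelled out in the text; the criterion evaluated is Eq. (8) rewritten in terms of the geometric and arithmetic means of the L - k smallest eigenvalues.

```python
import numpy as np

def mdl_order(x, L=20):
    """Estimate the number of complex-exponential modes in x by minimizing Eq. (8)."""
    x = np.asarray(x)
    N = len(x) - L + 1
    H = np.array([x[i:i + N] for i in range(L)])       # L x N data (Hankel) matrix
    R = (H @ H.conj().T) / N                           # sample autocorrelation matrix
    lam = np.sort(np.linalg.eigvalsh(R))[::-1]         # eigenvalues, descending order
    lam = np.clip(lam.real, 1e-16, None)
    mdl = np.empty(L)
    for k in range(L):
        tail = lam[k:]                                 # the L - k smallest eigenvalues
        geo = np.exp(np.mean(np.log(tail)))            # geometric mean
        ari = np.mean(tail)                            # arithmetic mean
        # Eq. (8): the log term equals -2 (L - k) N log(geo / ari)
        mdl[k] = (-2.0 * (L - k) * N * np.log(geo / ari)
                  + 0.5 * k * (2 * L - k) * np.log(N))
    return int(np.argmin(mdl)), mdl
```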
FIGURE 4 Optimal decomposition tree for the 1H-MRS signals of brain tissues. Frequency ordered partitioning of the frequency axis is obtained by swapping h[n] and g[n] at each parent node with an odd sequence [18].
In order to quantify the effect of the WP decomposition on the original signal components, let us denote by H[ω] and G[ω] the discrete Fourier transforms (DFT) of the analysis filters h[n] and g[n], respectively. In addition, let us suppose that the discrete-time FID x[n] is expressed as

x[n] = \langle x(t), \varphi(t - n) \rangle    (9)
We can describe the effects of the filtering block of Fig. 2a on x[n]. At a generic level J we can write

\begin{cases} X_{J+1}[\omega_{J+1}] = \left| G[\omega_J] \right| e^{\,j\angle G[\omega_J]}\, X_J[\omega_J] \\ D_{J+1}[\omega_{J+1}] = \left| H[\omega_J] \right| e^{\,j\angle H[\omega_J]}\, X_J[\omega_J] \end{cases}

\omega_{J+1} = \begin{cases} 2\omega_J, & 0 \le \omega_J \le \pi/2 \\ 2(\pi - \omega_J), & \pi/2 < \omega_J \le \pi \end{cases}