Lecture Notes in Computer Science 2489
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo
Takeyoshi Dohi Ron Kikinis (Eds.)
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2002 5th International Conference Tokyo, Japan, September 25-28, 2002 Proceedings, Part II
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Takeyoshi Dohi
Department of Mechano-informatics, Graduate School of Information Science and Technology
University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8656 Tokyo, Japan
E-mail: [email protected]

Ron Kikinis
Department of Radiology, Brigham and Women's Hospital
75 Francis St., Boston, MA 02115, USA
E-mail: [email protected]

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Medical image computing and computer assisted intervention : 5th international conference ; proceedings / MICCAI 2002, Tokyo, Japan, September 25-28, 2002. Takeyoshi Dohi ; Ron Kikinis (ed.). - Berlin ; Heidelberg ; New York ; Hong Kong ; London ; Milan ; Paris ; Tokyo : Springer
Pt. 2. - (2002) (Lecture notes in computer science ; Vol. 2489)
ISBN 3-540-44225-1

CR Subject Classification (1998): I.5, I.4, I.3.5-8, I.2.9-10, J.3
ISSN 0302-9743
ISBN 3-540-44225-1 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Olgun Computergrafik
Printed on acid-free paper
SPIN: 10870643 06/3142 543210
Preface
The fifth International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2002) was held in Tokyo from September 25th to 28th, 2002. This was the first time the conference had been held in Asia since its foundation in 1998. The objective of the conference is to offer clinicians and scientists the opportunity to collaboratively create and explore this new medical field. Specifically, MICCAI offers a forum for discussing the state of the art in computer-assisted intervention, medical robotics, and image processing among experts from multidisciplinary professions, including but not limited to clinicians, computer scientists, and mechanical and biomedical engineers. The expectations of society are very high; the advancement of medicine will depend on computer and device technology in the coming decades, as it did in the decades past.

We received 321 manuscripts, of which 41 were chosen for oral presentation and 143 for poster presentation. Each paper is included in these proceedings in the eight-page full-paper format, with no differentiation between oral and poster papers. Adherence to this full-paper format, together with a number of submissions that surpassed all our expectations, led us to issue two proceedings volumes for the first time in MICCAI's history. Keeping to a single volume by assigning fewer pages to each paper was certainly an option, given our budget constraints. However, we decided to expand the proceedings to offer authors the maximum opportunity to present the state of the art of their work and to initiate constructive discussions among the MICCAI audience.

It was our great pleasure to welcome all MICCAI 2002 attendees to Tokyo. Japan in the fall is known for its beautiful foliage throughout the country, and traditional Japanese architecture always catches the eye of visitors. We hope that all the MICCAI attendees took the opportunity to enjoy Japan and that they had a scientifically fruitful time at the conference. For those who could not attend, we hope these proceedings will serve as a valuable source of information for their academic activities. We look forward to seeing you at another successful MICCAI in Toronto in 2003.
July 2002
Takeyoshi Dohi and Ron Kikinis
Organizing Committee
Honorary Chair
Kintomo Takakura, Tokyo Women's Medical University, Japan
General Chair
Takeyoshi Dohi, The University of Tokyo, Japan
Terry Peters, University of Western Ontario, Canada
Junichiro Toriwaki, Nagoya University, Japan
Program Chair
Ron Kikinis, Harvard Medical School and Brigham and Women's Hospital, USA
Program Co-chairs
Randy Ellis, Queen's University at Kingston, Canada
Koji Ikuta, Nagoya University, Japan
Gabor Szekely, Swiss Federal Institute of Technology, ETH Zentrum, Switzerland
Tutorial Chair
Yoshinobu Sato, Osaka University, Japan
Industrial Liaison
Masakatsu Fujie, Waseda University, Japan
Makoto Hashizume, Kyushu University, Japan
Hiroshi Iseki, Tokyo Women's Medical University, Japan
Organization
Program Review Committee
Alan Colchester, University of Kent at Canterbury, UK
Wei-Qi Wang, Dept. of E., Fudan University, China
Yongmei Wang, The Chinese University of Hong Kong, China
Jocelyne Troccaz, TIMC Laboratory, France
Erwin Keeve, Research Center Caesar, Germany
Frank Tendick, University of California, San Francisco, USA
Sun I. Kim, Hanyang University, Korea
Pierre Hellier, INRIA Rennes, France
Pheng Ann Heng, The Chinese University of Hong Kong, China
Gabor Szekely, Swiss Federal Institute of Technology Zurich, Switzerland
Kirby Vosburgh, CIMIT/MGH/Harvard Medical School, USA
Allison M. Okamura, Johns Hopkins University, USA
James S. Duncan, Yale University, USA
Baba Vemuri, University of Florida, USA
Terry M. Peters, The John P. Robarts Research Institute, Canada
Allen Tannenbaum, Georgia Institute of Technology, USA
Richard A. Robb, Mayo Clinic, USA
Brian Davies, Imperial College London, UK
David Hawkes, King's College London, UK
Carl-Fredrik Westin, Harvard Medical School, USA
Chris Taylor, University of Manchester, UK
Derek Hill, King's College London, UK
Ramin Shahidi, Stanford University, USA
Demetri Terzopoulos, New York University, USA
Shuqian Luo, Capital University of Medical Sciences, China
Paul Thompson, UCLA School of Medicine, USA
Simon Warfield, Harvard Medical School, USA
Gregory D. Hager, Johns Hopkins University, USA
Kiyoyuki Chinzei, AIST, Japan
Shinichi Tamura, Osaka University, Japan
Jun Toriwaki, Nagoya University, Japan
Yukio Kosugi, Tokyo Institute of Technology, Japan
Jing Bai, Tsinghua University, China
Philippe Cinquin, UJF (University Joseph Fourier), France
Xavier Pennec, INRIA Sophia-Antipolis, France
Frithjof Kruggel, Max-Planck-Institute for Cognitive Neuroscience, Germany
Ewert Bengtsson, Uppsala University, Sweden
Ève Coste-Manière, INRIA Sophia Antipolis, France
Milan Sonka, University of Iowa, USA
Branislav Jaramaz, West Penn Hospital, USA
Dimitris Metaxas, Rutgers University, USA
Tianzi Jiang, Chinese Academy of Sciences, China
Tian-ge Zhuang, Shanghai Jiao Tong University, China
Masakatsu G. Fujie, Waseda University, Japan
Takehide Asano, Chiba University, Japan
Ichiro Sakuma, The University of Tokyo, Japan
Alison Noble, University of Oxford, UK
Heinz U. Lemke, Technical University Berlin, Germany
Robert Howe, Harvard University, USA
Michael I. Miga, Vanderbilt University, USA
Hervé Delingette, INRIA Sophia Antipolis, France
D. Louis Collins, Montreal Neurological Institute, McGill University, Canada
Kunio Doi, University of Chicago, USA
Scott Delp, Stanford University, USA
Louis L. Whitcomb, Johns Hopkins University, USA
Michael W. Vannier, University of Iowa, USA
Jin-Ho Cho, Kyungpook National University, Korea
Yukio Yamada, University of Electro-Communications, Japan
Yuji Ohta, Ochanomizu University, Japan
Karol Miller, The University of Western Australia, Australia
William (Sandy) Wells, Harvard Medical School, Brigham and Women's Hospital, USA
Kevin Montgomery, National Biocomputation Center/Stanford University, USA
Kiyoshi Naemura, Tokyo Women's Medical University, Japan
Yoshihiko Nakamura, The University of Tokyo, Japan
Toshio Nakagohri, National Cancer Center Hospital East, Japan
Yasushi Yamauchi, AIST, Japan
Masaki Kitajima, Keio University, Japan
Hiroshi Iseki, Tokyo Women's Medical University, Japan
Yoshinobu Sato, Osaka University, Japan
Amami Kato, Osaka University School of Medicine, Japan
Eiju Watanabe, Tokyo Metropolitan Police Hospital, Japan
Miguel Angel Gonzalez Ballester, INRIA Sophia Antipolis, France
Yoshihiro Muragaki, Tokyo Women's Medical University, Japan
Makoto Hashizume, Kyushu University, Japan
Paul Suetens, K.U. Leuven, Medical Image Computing, Belgium
Michael D. Sherar, Ontario Cancer Institute/University of Toronto, Canada
Kyojiro Nambu, Medical Systems Company, Toshiba Corporation, Japan
Naoki Suzuki, Institute for High Dimensional Medical Imaging, Jikei University School of Medicine, Japan
Nobuhiko Sugano, Osaka University, Japan
Etsuko Kobayashi, The University of Tokyo, Japan
Grégoire Malandain, INRIA Sophia Antipolis, France
Russell H. Taylor, Johns Hopkins University, USA
Maryellen Giger, University of Chicago, USA
Hideaki Koizumi, Advanced Research Laboratory, Hitachi, Ltd., Japan
Örjan Smedby, Linköping University, Sweden
Karl Heinz Höhne, University of Hamburg, Germany
Sherif Makram-Ebeid, Philips Research France
Stéphane Lavallée, PRAXIM, France
Josien Pluim, University Medical Center Utrecht, The Netherlands
Darwin G. Caldwell, University of Salford, England
Régis Vaillant, GEMS, Switzerland
Nassir Navab, Siemens Corporate Research, USA
Eric Grimson, MIT AI Lab, USA
Wiro Niessen, University Medical Center Utrecht, The Netherlands
Richard Satava, Yale University School of Medicine, USA
Takeyoshi Dohi, The University of Tokyo, Japan
Guido Gerig, UNC Chapel Hill, Department of Computer Science, USA
Ferenc Jolesz, Brigham and Women's Hospital, Harvard Medical School, USA
Leo Joskowicz, The Hebrew University of Jerusalem, Israel
Antonio Bicchi, University of Pisa, Italy
Wolfgang Schlegel, DKFZ, Germany
Richard Bucholz, Saint Louis University School of Medicine, USA
Robert Galloway, Vanderbilt University, USA
Juan Ruiz-Alzola, University of Las Palmas de Gran Canaria, Spain
Tim Salcudean, University of British Columbia, Canada
Stephen Pizer, University of North Carolina, USA
J. Michael Fitzpatrick, Vanderbilt University, USA
Gabor Fichtinger, Johns Hopkins University, USA
Koji Ikuta, Nagoya University, Japan
Jean Louis Coatrieux, University of Rennes-INSERM, France
Jaydev P. Desai, Drexel University, USA
Chris Johnson, Scientific Computing and Imaging Institute, USA
Luc Soler, IRCAD, France
Wieslaw L. Nowinski, Biomedical Imaging Lab, Singapore
Andreas Pommert, University Hospital Hamburg-Eppendorf, Germany
Heinz-Otto Peitgen, MeVis, Germany
Rudolf Fahlbusch, Neurochirurgische Klinik, Germany
Simon Wildermuth, University Hospital Zurich, Inst. Diagnostic Radiology, Switzerland
Chuck Meyer, University of Michigan, USA
Johan Van Cleynenbreugel, Medical Image Computing, ESAT-Radiologie, K.U. Leuven, Belgium
Dirk Vandermeulen, K.U. Leuven, Belgium
Karl Rohr, International University in Germany, Germany
Martin Styner, Duke Image Analysis Lab and UNC Neuro Image Analysis Lab, USA
Catherina R. Burghart, University of Karlsruhe, Germany
Fernando Bello, Imperial College of Science, Technology and Medicine, UK
Colin Studholme, University of California, San Francisco, USA
Dinesh Pai, University of British Columbia, Canada
Paul Milgram, University of Toronto, Canada
Michael Bronskill, University of Toronto/Sunnybrook Hospital, Canada
Nobuhiko Hata, The University of Tokyo, Japan
Ron Kikinis, Brigham and Women's Hospital and Harvard Medical School, USA
Lutz Nolte, University of Bern, Switzerland
Ralph Mösges, IMSIE Univ. of Cologne, Germany
Bart M. ter Haar Romeny, Eindhoven University of Technology, The Netherlands
Steven Haker, Brigham and Women's Hospital and Harvard Medical School, USA
Local Organizing Committee
Ichiro Sakuma, The University of Tokyo, Japan
Mitsuo Shimada, Kyushu University, Japan
Nobuhiko Hata, The University of Tokyo, Japan
Etsuko Kobayashi, The University of Tokyo, Japan
MICCAI Board
Alan C.F. Colchester (General Chair), University of Kent at Canterbury, UK
Nicholas Ayache, INRIA Sophia Antipolis, France
Anthony M. DiGioia, UPMC Shadyside Hospital, Pittsburgh, USA
Takeyoshi Dohi, University of Tokyo, Japan
James Duncan, Yale University, New Haven, USA
Karl Heinz Höhne, University of Hamburg, Germany
Ron Kikinis, Harvard Medical School, Boston, USA
Stephen M. Pizer, University of North Carolina, Chapel Hill, USA
Richard A. Robb, Mayo Clinic, Rochester, USA
Russell H. Taylor, Johns Hopkins University, Baltimore, USA
Jocelyne Troccaz, University of Grenoble, France
Max A. Viergever, University Medical Center Utrecht, The Netherlands
Table of Contents, Part I
Robotics – Endoscopic Device

Using an Endoscopic Solo Surgery Simulator for Quantitative Evaluation of Human-Machine Interface in Robotic Camera Positioning Systems . . . . . . 1
A. Nishikawa, D. Negoro, H. Kakutani, F. Miyazaki, M. Sekimoto, M. Yasui, S. Takiguchi, M. Monden

Automatic 3-D Positioning of Surgical Instruments during Robotized Laparoscopic Surgery Using Automatic Visual Feedback . . . . . . 9
A. Krupa, M. de Mathelin, C. Doignon, J. Gangloff, G. Morel, L. Soler, J. Leroy, J. Marescaux
Development of a Compact Cable-Driven Laparoscopic Endoscope Manipulator . . . . . . 17
P.J. Berkelman, P. Cinquin, J. Troccaz, J.-M. Ayoubi, C. Létoublon

Flexible Calibration of Actuated Stereoscopic Endoscope for Overlay in Robot Assisted Surgery . . . . . . 25
F. Mourgues, È. Coste-Manière

Metrics for Laparoscopic Skills Trainers: The Weakest Link! . . . . . . 35
S. Cotin, N. Stylopoulos, M. Ottensmeyer, P. Neumann, D. Rattner, S. Dawson

Surgical Skill Evaluation by Force Data for Endoscopic Sinus Surgery Training System . . . . . . 44
Y. Yamauchi, J. Yamashita, O. Morikawa, R. Hashimoto, M. Mochimaru, Y. Fukui, H. Uno, K. Yokoyama

Development of a Master Slave Combined Manipulator for Laparoscopic Surgery – Functional Model and Its Evaluation . . . . . . 52
M. Jinno, N. Matsuhira, T. Sunaoshi, T. Hato, T. Miyagawa, Y. Morikawa, T. Furukawa, S. Ozawa, M. Kitajima, K. Nakazawa

Development of Three-Dimensional Endoscopic Ultrasound System with Optical Tracking . . . . . . 60
N. Koizumi, K. Sumiyama, N. Suzuki, A. Hattori, H. Tajiri, A. Uchiyama

Real-Time Haptic Feedback in Laparoscopic Tools for Use in Gastro-Intestinal Surgery . . . . . . 66
T. Hu, A.E. Castellanos, G. Tholey, J.P. Desai

Small Occupancy Robotic Mechanisms for Endoscopic Surgery . . . . . . 75
Y. Kobayashi, S. Chiyoda, K. Watabe, M. Okada, Y. Nakamura
Robotics in Image-Guided Surgery

Development of MR Compatible Surgical Manipulator toward a Unified Support System for Diagnosis and Treatment of Heart Disease . . . . . . 83
F. Tajima, K. Kishi, K. Nishizawa, K. Kan, Y. Nemoto, H. Takeda, S. Umemura, H. Takeuchi, M.G. Fujie, T. Dohi, K. Sudo, S. Takamoto

Transrectal Prostate Biopsy Inside Closed MRI Scanner with Remote Actuation, under Real-Time Image Guidance . . . . . . 91
G. Fichtinger, A. Krieger, R.C. Susil, A. Tanacs, L.L. Whitcomb, E. Atalar

A New, Compact MR-Compatible Surgical Manipulator for Minimally Invasive Liver Surgery . . . . . . 99
D. Kim, E. Kobayashi, T. Dohi, I. Sakuma

Micro-grasping Forceps Manipulator for MR-Guided Neurosurgery . . . . . . 107
N. Miyata, E. Kobayashi, D. Kim, K. Masamune, I. Sakuma, N. Yahagi, T. Tsuji, H. Inada, T. Dohi, H. Iseki, K. Takakura

Endoscope Manipulator for Trans-nasal Neurosurgery, Optimized for and Compatible to Vertical Field Open MRI . . . . . . 114
Y. Koseki, T. Washio, K. Chinzei, H. Iseki

A Motion Adaptable Needle Placement Instrument Based on Tumor Specific Ultrasonic Image Segmentation . . . . . . 122
J.-S. Hong, T. Dohi, M. Hashizume, K. Konishi, N. Hata
Robotics – Tele-operation

Experiment of Wireless Tele-echography System by Controlling Echographic Diagnosis Robot . . . . . . 130
K. Masuda, N. Tateishi, Y. Suzuki, E. Kimura, Y. Wie, K. Ishihara

Experiments with the TER Tele-echography Robot . . . . . . 138
A. Vilchis, J. Troccaz, P. Cinquin, A. Guerraz, F. Pellisier, P. Thorel, B. Tondu, F. Courrèges, G. Poisson, M. Althuser, J.-M. Ayoubi

The Effect of Visual and Haptic Feedback on Manual and Teleoperated Needle Insertion . . . . . . 147
O. Gerovichev, P. Marayong, A.M. Okamura

Analysis of Suture Manipulation Forces for Teleoperation with Force Feedback . . . . . . 155
M. Kitagawa, A.M. Okamura, B.T. Bethea, V.L. Gott, W.A. Baumgartner
Remote Microsurgery System for Deep and Narrow Space – Development of New Surgical Procedure and Micro-robotic Tool . . . . . . 163
K. Ikuta, K. Sasaki, K. Yamamoto, T. Shimada

Hyper-finger for Remote Minimally Invasive Surgery in Deep Area . . . . . . 173
K. Ikuta, S. Daifu, T. Hasegawa, H. Higashikawa
Robotics – Device

Safety-Active Catheter with Multiple-Segments Driven by Micro-hydraulic Actuators . . . . . . 182
K. Ikuta, H. Ichikawa, K. Suzuki

A Stem Cell Harvesting Manipulator with Flexible Drilling Unit for Bone Marrow Transplantation . . . . . . 192
K. Ohashi, N. Hata, T. Matsumura, N. Yahagi, I. Sakuma, T. Dohi

Liver Tumor Biopsy in a Respiring Phantom with the Assistance of a Novel Electromagnetic Navigation Device . . . . . . 200
F. Banovac, N. Glossop, D. Lindisch, D. Tanaka, E. Levy, K. Cleary

Non-invasive Measurement of Biomechanical Properties of in vivo Soft Tissues . . . . . . 208
Lianghao Han, Michael Burcher, J. Alison Noble

Measurement of the Tip and Friction Force Acting on a Needle during Penetration . . . . . . 216
H. Kataoka, T. Washio, K. Chinzei, K. Mizuhara, C. Simone, A.M. Okamura

Contact Force Evaluation of Orthoses for the Treatment of Malformed Ears . . . . . . 224
A. Hanafusa, T. Isomura, Y. Sekiguchi, H. Takahashi, T. Dohi

Computer-Assisted Correction of Bone Deformities Using a 6-DOF Parallel Spatial Mechanism . . . . . . 232
O. Iyun, D.P. Borschneck, R.E. Ellis
Robotics – System

Development of 4-Dimensional Human Model System for the Patient after Total Hip Arthroplasty . . . . . . 241
Y. Otake, K. Hagio, N. Suzuki, A. Hattori, N. Sugano, K. Yonenobu, T. Ochi

Development of a Training System for Cardiac Muscle Palpation . . . . . . 248
T. Tokuyasu, S. Oota, K. Asami, T. Kitamura, G. Sakaguchi, T. Koyama, M. Komeda
Preliminary Results of an Early Clinical Experience with the Acrobot™ System for Total Knee Replacement Surgery . . . . . . 256
M. Jakopec, S.J. Harris, F. Rodriguez y Baena, P. Gomes, J. Cobb, B.L. Davies

A Prostate Brachytherapy Training Rehearsal System – Simulation of Deformable Needle Insertion . . . . . . 264
A. Kimura, J. Camp, R. Robb, B. Davis

A Versatile System for Computer Integrated Mini-invasive Robotic Surgery . . . . . . 272
L. Adhami, È. Coste-Manière

Measurements of Soft-Tissue Mechanical Properties to Support Development of a Physically Based Virtual Animal Model . . . . . . 282
C. Bruyns, M. Ottensmeyer
Validation

Validation of Tissue Modelization and Classification Techniques in T1-Weighted MR Brain Images . . . . . . 290
M. Bach Cuadra, B. Platel, E. Solanas, T. Butz, J.-Ph. Thiran

Validation of Image Segmentation and Expert Quality with an Expectation-Maximization Algorithm . . . . . . 298
S.K. Warfield, K.H. Zou, W.M. Wells

Validation of Volume-Preserving Non-rigid Registration: Application to Contrast-Enhanced MR-Mammography . . . . . . 307
C. Tanner, J.A. Schnabel, A. Degenhard, A.D. Castellano-Smith, C. Hayes, M.O. Leach, D.R. Hose, D.L.G. Hill, D.J. Hawkes

Statistical Validation of Automated Probabilistic Segmentation against Composite Latent Expert Ground Truth in MR Imaging of Brain Tumors . . . . . . 315
K.H. Zou, W.M. Wells III, M.R. Kaus, R. Kikinis, F.A. Jolesz, S.K. Warfield

A Posteriori Validation of Pre-operative Planning in Functional Neurosurgery by Quantification of Brain Pneumocephalus . . . . . . 323
É. Bardinet, P. Cathier, A. Roche, N. Ayache, D. Dormont

Affine Transformations and Atlases: Assessing a New Navigation Tool for Knee Arthroplasty . . . . . . 331
B. Ma, J.F. Rudan, R.E. Ellis

Effectiveness of the ROBODOC System during Total Hip Arthroplasty in Preventing Intraoperative Pulmonary Embolism . . . . . . 339
K. Hagio, N. Sugano, M. Takashina, T. Nishii, H. Yoshikawa, T. Ochi
Medical Image Synthesis via Monte Carlo Simulation . . . . . . 347
J.Z. Chen, S.M. Pizer, E.L. Chaney, S. Joshi

Performance Issues in Shape Classification . . . . . . 355
S.J. Timoner, P. Golland, R. Kikinis, M.E. Shenton, W.E.L. Grimson, W.M. Wells III
Brain-Tumor, Cortex, Vascular Structure

Statistical Analysis of Longitudinal MRI Data: Applications for Detection of Disease Activity in MS . . . . . . 363
S. Prima, N. Ayache, A. Janke, S.J. Francis, D.L. Arnold, D.L. Collins

Automatic Brain and Tumor Segmentation . . . . . . 372
N. Moon, E. Bullitt, K. van Leemput, G. Gerig

Atlas-Based Segmentation of Pathological Brains Using a Model of Tumor Growth . . . . . . 380
M. Bach Cuadra, J. Gomez, P. Hagmann, C. Pollo, J.-G. Villemure, B.M. Dawant, J.-Ph. Thiran

Recognizing Deviations from Normalcy for Brain Tumor Segmentation . . . . . . 388
D.T. Gering, W.E.L. Grimson, R. Kikinis

3D-Visualization and Registration for Neurovascular Compression Syndrome Analysis . . . . . . 396
P. Hastreiter, R. Naraghi, B. Tomandl, M. Bauer, R. Fahlbusch

3D Guide Wire Reconstruction from Biplane Image Sequences for 3D Navigation in Endovascular Interventions . . . . . . 404
S.A.M. Baert, E.B. van der Kraats, W.J. Niessen

Standardized Analysis of Intracranial Aneurysms Using Digital Video Sequences . . . . . . 411
S. Iserhardt-Bauer, P. Hastreiter, B. Tomandl, N. Köstner, M. Schempershofe, U. Nissen, T. Ertl

Demarcation of Aneurysms Using the Seed and Cull Algorithm . . . . . . 419
R.A. McLaughlin, J.A. Noble

Gyral Parcellation of the Cortical Surface Using Geodesic Voronoï Diagrams . . . . . . 427
A. Cachia, J.-F. Mangin, D. Rivière, D. Papadopoulos-Orfanos, I. Bloch, J. Régis

Regularized Stochastic White Matter Tractography Using Diffusion Tensor MRI . . . . . . 435
M. Björnemo, A. Brun, R. Kikinis, C.-F. Westin
Sulcal Segmentation for Cortical Thickness Measurements . . . . . . 443
C. Hutton, E. De Vita, R. Turner

Labeling the Brain Surface Using a Deformable Multiresolution Mesh . . . . . . 451
S. Jaume, B. Macq, S.K. Warfield
Brain – Imaging and Analysis

New Approaches to Estimation of White Matter Connectivity in Diffusion Tensor MRI: Elliptic PDEs and Geodesics in a Tensor-Warped Space . . . . . . 459
L. O'Donnell, S. Haker, C.-F. Westin

Improved Detection Sensitivity in Functional MRI Data Using a Brain Parcelling Technique . . . . . . 467
G. Flandin, F. Kherif, X. Pennec, G. Malandain, N. Ayache, J.-B. Poline

A Spin Glass Based Framework to Untangle Fiber Crossing in MR Diffusion Based Tracking . . . . . . 475
Y. Cointepas, C. Poupon, D. Le Bihan, J.-F. Mangin

Automated Approximation of Lateral Ventricular Shape in Magnetic Resonance Images of Multiple Sclerosis Patients . . . . . . 483
B. Sturm, D. Meier, E. Fisher

An Intensity Consistent Approach to the Cross Sectional Analysis of Deformation Tensor Derived Maps of Brain Shape . . . . . . 492
C. Studholme, V. Cardenas, A. Maudsley, M. Weiner

Detection of Inter-hemispheric Asymmetries of Brain Perfusion in SPECT . . . . . . 500
B. Aubert-Broche, C. Grova, P. Jannin, I. Buvat, H. Benali, B. Gibaud

Discriminative Analysis for Image-Based Studies . . . . . . 508
P. Golland, B. Fischl, M. Spiridon, N. Kanwisher, R.L. Buckner, M.E. Shenton, R. Kikinis, A. Dale, W.E.L. Grimson

Automatic Generation of Training Data for Brain Tissue Classification from MRI . . . . . . 516
C.A. Cocosco, A.P. Zijdenbos, A.C. Evans

The Putamen Intensity Gradient in CJD Diagnosis . . . . . . 524
A. Hojjat, D. Collie, A.C.F. Colchester

A Dynamic Brain Atlas . . . . . . 532
D.L.G. Hill, J.V. Hajnal, D. Rueckert, S.M. Smith, T. Hartkens, K. McLeish
Model Library for Deformable Model-Based Segmentation of 3-D Brain MR-Images . . . . . . 540
J. Koikkalainen, J. Lötjönen

Co-registration of Histological, Optical and MR Data of the Human Brain . . . . . . 548
É. Bardinet, S. Ourselin, D. Dormont, G. Malandain, D. Tandé, K. Parain, N. Ayache, J. Yelnik
Segmentation

An Automated Segmentation Method of Kidney Using Statistical Information . . . . . . 556
B. Tsagaan, A. Shimizu, H. Kobatake, K. Miyakawa

Incorporating Non-rigid Registration into Expectation Maximization Algorithm to Segment MR Images . . . . . . 564
K.M. Pohl, W.M. Wells, A. Guimond, K. Kasai, M.E. Shenton, R. Kikinis, W.E.L. Grimson, S.K. Warfield

Segmentation of 3D Medical Structures Using Robust Ray Propagation . . . . . . 572
H. Tek, M. Bergtholdt, D. Comaniciu, J. Williams

MAP MRF Joint Segmentation and Registration . . . . . . 580
P.P. Wyatt, J.A. Noble

Statistical Neighbor Distance Influence in Active Contours . . . . . . 588
J. Yang, L.H. Staib, J.S. Duncan

Active Watersheds: Combining 3D Watershed Segmentation and Active Contours to Extract Abdominal Organs from MR Images . . . . . . 596
R.J. Lapeer, A.C. Tan, R. Aldridge
Cardiac Application

Coronary Intervention Planning Using Hybrid 3D Reconstruction . . . . . . 604
O. Wink, R. Kemkers, S.J. Chen, J.D. Carroll

Deformation Modelling Based on PLSR for Cardiac Magnetic Resonance Perfusion Imaging . . . . . . 612
J. Gao, N. Ablitt, A. Elkington, G.-Z. Yang

Automated Segmentation of the Left and Right Ventricles in 4D Cardiac SPAMM Images . . . . . . 620
A. Montillo, D. Metaxas, L. Axel

Stochastic Finite Element Framework for Cardiac Kinematics Function and Material Property Analysis . . . . . . 634
P. Shi, H. Liu
Atlas-Based Segmentation and Tracking of 3D Cardiac MR Images Using Non-rigid Registration . . . . . . 642
M. Lorenzo-Valdés, G.I. Sanchez-Ortiz, R. Mohiaddin, D. Rueckert

Myocardial Delineation via Registration in a Polar Coordinate System . . . . . . 651
N.M.I. Noble, D.L.G. Hill, M. Breeuwer, J.A. Schnabel, D.J. Hawkes, F.A. Gerritsen, R. Razavi

Integrated Image Registration for Cardiac MR Perfusion Data . . . . . . 659
R. Bansal, G. Funka-Lea

4D Active Surfaces for Cardiac Analysis . . . . . . 667
A. Yezzi, A. Tannenbaum

A Computer Diagnosing System of Dementia Using Smooth Pursuit Oculogyration . . . . . . 674
I. Fukumoto

Combinative Multi-scale Level Set Framework for Echocardiographic Image Segmentation . . . . . . 682
N. Lin, W. Yu, J.S. Duncan

Automatic Hybrid Segmentation of Dual Contrast Cardiac MR Data . . . . . . 690
A. Pednekar, I.A. Kakadiaris, V. Zavaletta, R. Muthupillai, S. Flamm

Efficient Partial Volume Tissue Classification in MRI Scans . . . . . . 698
A. Noe, J.C. Gee

In-vivo Strain and Stress Estimation of the Left Ventricle from MRI Images . . . . . . 706
Z. Hu, D. Metaxas, L. Axel

Biomechanical Model Construction from Different Modalities: Application to Cardiac Images . . . . . . 714
M. Sermesant, C. Forest, X. Pennec, H. Delingette, N. Ayache

Comparison of Cardiac Motion Across Subjects Using Non-rigid Registration . . . . . . 722
A. Rao, G.I. Sanchez-Ortiz, R. Chandrashekara, M. Lorenzo-Valdés, R. Mohiaddin, D. Rueckert
Computer Assisted Diagnosis

From Colour to Tissue Histology: Physics Based Interpretation of Images of Pigmented Skin Lesions . . . . . . 730
E. Claridge, S. Cotton, P. Hall, M. Moncrieff

In-vivo Molecular Investigations of Live Tissues Using Diffracting Sources . . . . . . 739
V. Ntziachristos, J. Ripoll, E. Graves, R. Weissleder
Automatic Detection of Nodules Attached to Vessels in Lung CT by Volume Projection Analysis . . . . . . 746
G.-Q. Wei, L. Fan, J.Z. Qian

LV-RV Shape Modeling Based on a Blended Parameterized Model . . . . . . 753
K. Park, D.N. Metaxas, L. Axel

Characterization of Regional Pulmonary Mechanics from Serial MRI Data . . . . . . 762
J. Gee, T. Sundaram, I. Hasegawa, H. Uematsu, H. Hatabu

Using Voxel-Based Morphometry to Examine Atrophy-Behavior Correlates in Alzheimer's Disease and Frontotemporal Dementia . . . . . . 770
M.P. Lin, C. Devita, J.C. Gee, M. Grossman

Detecting Wedge Shaped Defects in Polarimetric Images of the Retinal Nerve Fiber Layer . . . . . . 777
K. Vermeer, F. Vos, H. Lemij, A. Vossepoel

Automatic Statistical Identification of Neuroanatomical Abnormalities between Different Populations . . . . . . 785
A. Guimond, S. Egorova, R.J. Killiany, M.S. Albert, C.R.G. Guttmann

Example-Based Assisting Approach for Pulmonary Nodule Classification in 3-D Thoracic CT Images . . . . . . 793
Y. Kawata, N. Niki, H. Ohmatsu, N. Moriyama
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
Table of Contents, Part II
Tubular Structures Automated Nomenclature Labeling of the Bronchial Tree in 3D-CT Lung Images . . . 1 H. Kitaoka, Y. Park, J. Tschirren, J. Reinhardt, M. Sonka, G. McLennan, E.A. Hoffman
Segmentation, Skeletonization, and Branchpoint Matching – A Fully Automated Quantitative Evaluation of Human Intrathoracic Airway Trees . . . 12 J. Tschirren, K. Palágyi, J.M. Reinhardt, E.A. Hoffman, M. Sonka Improving Virtual Endoscopy for the Intestinal Tract . . . 20 M. Harders, S. Wildermuth, D. Weishaupt, G. Székely Finding a Non-continuous Tube by Fuzzy Inference for Segmenting the MR Cholangiography Image . . . 28 C. Yasuba, S. Kobashi, K. Kondo, Y. Hata, S. Imawaki, M. Ishikawa Level-Set Based Carotid Artery Segmentation for Stenosis Grading . . . 36 C.M. van Bemmel, L.J. Spreeuwers, M.A. Viergever, W.J. Niessen
Interventions – Augmented Reality PC-Based Control Unit for a Head Mounted Operating Microscope for Augmented Reality Visualization in Surgical Navigation . . . . . . . . . . . . . 44 M. Figl, W. Birkfellner, F. Watzinger, F. Wanschitz, J. Hummel, R. Hanel, R. Ewers, H. Bergmann Technical Developments for MR-Guided Microwave Thermocoagulation Therapy of Liver Tumors . . . . . . . . . . . . . . . . . . . . . . . . . 52 S. Morikawa, T. Inubushi, Y. Kurumi, S. Naka, K. Sato, T. Tani, N. Hata, V. Seshan, H.A. Haque Robust Automatic C-Arm Calibration for Fluoroscopy-Based Navigation: A Practical Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 H. Livyatan, Z. Yaniv, L. Joskowicz Application of a Population Based Electrophysiological Database to the Planning and Guidance of Deep Brain Stereotactic Neurosurgery . . . . . . . . 69 K.W. Finnis, Y.P. Starreveld, A.G. Parrent, A.F. Sadikot, T.M. Peters
An Image Overlay System with Enhanced Reality for Percutaneous Therapy Performed Inside CT Scanner . . . . . . . . . . . . . . . 77 K. Masamune, G. Fichtinger, A. Deguet, D. Matsuka, R. Taylor High-Resolution Stereoscopic Surgical Display Using Parallel Integral Videography and Multi-projector . . . . . . . . . . . . . . . 85 H. Liao, N. Hata, M. Iwahara, S. Nakajima, I. Sakuma, T. Dohi Three-Dimensional Display for Multi-sourced Activities and Their Relations in the Human Brain by Information Flow between Estimated Dipoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 N. Take, Y. Kosugi, T. Musha
Interventions – Navigation 2D Guide Wire Tracking during Endovascular Interventions . . . . . . . . . . . . . 101 S.A.M. Baert, W.J. Niessen Specification Method of Surface Measurement for Surgical Navigation: Ridgeline Based Organ Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 N. Furushiro, T. Saito, Y. Masutani, I. Sakuma An Augmented Reality Navigation System with a Single-Camera Tracker: System Design and Needle Biopsy Phantom Trial . . . . . . . . . . . . . . . . . . . . . . 116 F. Sauer, A. Khamene, S. Vogt A Novel Laser Guidance System for Alignment of Linear Surgical Tools: Its Principles and Performance Evaluation as a Man–Machine System . . . . 125 T. Sasama, N. Sugano, Y. Sato, Y. Momoi, T. Koyama, Y. Nakajima, I. Sakuma, M. Fujie, K. Yonenobu, T. Ochi, S. Tamura Navigation of High Intensity Focused Ultrasound Applicator with an Integrated Three-Dimensional Ultrasound Imaging System . . . . . . 133 I. Sakuma, Y. Takai, E. Kobayashi, H. Inada, K. Fujimoto, T. Asano Robust Registration of Multi-modal Images: Towards Real-Time Clinical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 S. Ourselin, R. Stefanescu, X. Pennec 3D Ultrasound System Using a Magneto-optic Hybrid Tracker for Augmented Reality Visualization in Laparoscopic Liver Surgery . . . . . . 148 M. Nakamoto, Y. Sato, M. Miyamoto, Y. Nakamjima, K. Konishi, M. Shimada, M. Hashizume, S. Tamura Interactive Intra-operative 3D Ultrasound Reconstruction and Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 D.G. Gobbi, T.M. Peters
Projection Profile Matching for Intraoperative MRI Registration Embedded in MR Imaging Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 N. Hata, J. Tokuda, S. Morikawa, T. Dohi
Simulation A New Tool for Surgical Training in Knee Arthroscopy . . . 170 G. Megali, O. Tonet, M. Mazzoni, P. Dario, A. Vascellari, M. Marcacci Combining Volumetric Soft Tissue Cuts for Interventional Surgery Simulation . . . 178 M. Nakao, T. Kuroda, H. Oyama, M. Komori, T. Matsuda, T. Takahashi Virtual Endoscopy Using Cubic QuickTime-VR Panorama Views . . . 186 U. Tiede, N. von Sternberg-Gospos, P. Steiner, K.H. Höhne High Level Simulation & Modeling for Medical Applications – Ultrasound Case . . . 193 A. Chihoub Generation of Pathologies for Surgical Training Simulators . . . 202 R. Sierra, G. Székely, M. Bajka Collision Detection Algorithm for Deformable Objects Using OpenGL . . . 211 S. Aharon, C. Lenglet Online Multiresolution Volumetric Mass Spring Model for Real Time Soft Tissue Deformation . . . 219 C. Paloc, F. Bello, R.I. Kitney, A. Darzi Orthosis Design System for Malformed Ears Based on Spline Approximation . . . 227 A. Hanafusa, T. Isomura, Y. Sekiguchi, H. Takahashi, T. Dohi Cutting Simulation of Manifold Volumetric Meshes . . . 235 C. Forest, H. Delingette, N. Ayache Simulation of Guide Wire Propagation for Minimally Invasive Vascular Interventions . . . 245 T. Alderliesten, M.K. Konings, W.J. Niessen Needle Insertion Modelling for the Interactive Simulation of Percutaneous Procedures . . . 253 S.P. DiMaio, S.E. Salcudean 3D Analysis of the Alignment of the Lower Extremity in High Tibial Osteotomy . . . 261 H. Kawakami, N. Sugano, T. Nagaoka, K. Hagio, K. Yonenobu, H. Yoshikawa, T. Ochi, A. Hattori, N. Suzuki
Simulation of Intra-operative 3D Coronary Angiography for Enhanced Minimally Invasive Robotic Cardiac Intervention . . . . . . . . . . 268 G. Lehmann, D. Habets, D.W. Holdsworth, T. Peters, M. Drangova Computer Investigation into the Anatomical Location of the Axes of Rotation in the Normal Knee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 S. Martelli, A. Visani
Modeling Macroscopic Modeling of Vascular Systems . . . 284 D. Szczerba, G. Székely Spatio-temporal Directional Filtering for Improved Inversion of MR Elastography Images . . . 293 A. Manduca, D.S. Lake, R.L. Ehman RBF-Based Representation of Volumetric Data: Application in Visualization and Segmentation . . . 300 Y. Masutani An Anatomical Model of the Knee Joint Obtained by Computer Dissection . . . 308 S. Martelli, F. Acquaroli, V. Pinskerova, A. Spettol, A. Visani Models for Planning and Simulation in Computer Assisted Orthognatic Surgery . . . 315 M. Chabanas, C. Marecaux, Y. Payan, F. Boutault Simulation of the Exophthalmia Reduction Using a Finite Element Model of the Orbital Soft Tissues . . . 323 V. Luboz, A. Pedrono, P. Swider, F. Boutault, Y. Payan A Real-Time Deformable Model for Flexible Instruments Inserted into Tubular Structures . . . 331 M. Kukuk, B. Geiger Modeling of the Human Orbit from MR Images . . . 339 Z. Li, C.-K. Chui, Y. Cai, S. Amrith, P.-S. Goh, J.H. Anderson, J. Teo, C. Liu, I. Kusuma, Y.-S. Siow, W.L. Nowinski Accurate and High Quality Triangle Models from 3D Grey Scale Images . . . 348 P.W. de Bruin, P.M. van Meeteren, F.M. Vos, A.M. Vossepoel, F.H. Post Intraoperative Fast 3D Shape Recovery of Abdominal Organs in Laparoscopy . . . 356 M. Hayashibe, N. Suzuki, A. Hattori, Y. Nakamura
Statistical Shape Modeling Integrated Approach for Matching Statistical Shape Models with Intra-operative 2D and 3D Data . . . 364 M. Fleute, S. Lavallée, L. Desbat Building and Testing a Statistical Shape Model of the Human Ear Canal . . . 373 R. Paulsen, R. Larsen, C. Nielsen, S. Laugesen, B. Ersbøll Shape Characterization of the Corpus Callosum in Schizophrenia Using Template Deformation . . . 381 A. Dubb, B. Avants, R. Gur, J. Gee 3D Prostate Surface Detection from Ultrasound Images Based on Level Set Method . . . 389 S. Fan, L.K. Voon, N.W. Sing A Bayesian Approach to in vivo Kidney Ultrasound Contour Detection Using Markov Random Fields . . . 397 M. Martín, C. Alberola Level Set Based Integration of Segmentation and Computational Fluid Dynamics for Flow Correction in Phase Contrast Angiography . . . 405 M. Watanabe, R. Kikinis, C.-F. Westin Comparative Exudate Classification Using Support Vector Machines and Neural Networks . . . 413 A. Osareh, M. Mirmehdi, B. Thomas, R. Markham A Statistical Shape Model for the Liver . . . 421 H. Lamecker, T. Lange, M. Seebass Statistical 2D and 3D Shape Analysis Using Non-Euclidean Metrics . . . 428 R. Larsen, K.B. Hilger, M.C. Wrobel Kernel Fisher for Shape Based Classification in Epilepsy . . . 436 N. Vohra, B.C. Vemuri, A. Rangarajan, R.L. Gilmore, S.N. Roper, C.M. Leonard A Noise Robust Statistical Texture Model . . . 444 K.B. Hilger, M.B. Stegmann, R. Larsen A Combined Statistical and Biomechanical Model for Estimation of Intra-operative Prostate Deformation . . . 452 A. Mohamed, C. Davatzikos, R. Taylor
Registration – 2D/3D Fusion “Gold Standard” 2D/3D Registration of X-Ray to CT and MR Images . . . 461 D. Tomaževič, B. Likar, F. Pernuš
A Novel Image Similarity Measure for Registration of 3-D MR Images and X-Ray Projection Images . . . 469 T. Rohlfing, C.R. Maurer Jr. Registration of Preoperative CTA and Intraoperative Fluoroscopic Images for Assisting Aortic Stent Grafting . . . 477 H. Imamura, N. Ida, N. Sugimoto, S. Eiho, S. Urayama, K. Ueno, K. Inoue Preoperative Analysis of Optimal Imaging Orientation in Fluoroscopy for Voxel-Based 2-D/3-D Registration . . . 485 Y. Nakajima, Y. Tamura, Y. Sato, T. Tashiro, N. Sugano, K. Yonenobu, H. Yoshikawa, T. Ochi, S. Tamura
Registration – Similarity Measures A New Similarity Measure for Nonrigid Volume Registration Using Known Joint Distribution of Target Tissue: Application to Dynamic CT Data of the Liver . . . . . . . . . . . . . . . . . . . . . . . . 493 J. Masumoto, Y. Sato, M. Hori, T. Murakami, T. Johkoh, H. Nakamura, S. Tamura 2D-3D Intensity Based Registration of DSA and MRA – A Comparison of Similarity Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501 J.H. Hipwell, G.P. Penney, T.C. Cox, J.V. Byrne, D.J. Hawkes Model Based Spatial and Temporal Similarity Measures between Series of Functional Magnetic Resonance Images . . . . . . . . . . . . . . . 509 F. Kherif, G. Flandin, P. Ciuciu, H. Benali, O. Simon, J.-B. Poline A Comparison of 2D-3D Intensity-Based Registration and Feature-Based Registration for Neurointerventions . . . . . . . . . . . . . . . . . . 517 R.A. McLaughlin, J. Hipwell, D.J. Hawkes, J.A. Noble, J.V. Byrne, T. Cox Multi-modal Image Registration by Minimising Kullback-Leibler Distance . 525 A.C.S. Chung, W.M. Wells III, A. Norbash, W.E.L. Grimson Cortical Surface Registration Using Texture Mapped Point Clouds and Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533 T.K. Sinha, D.M. Cash, R.J. Weil, R.L. Galloway, M.I. Miga
Non-rigid Registration A Viscous Fluid Model for Multimodal Non-rigid Image Registration Using Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 E. D’Agostino, F. Maes, D. Vandermeulen, P. Suetens
Non-rigid Registration with Use of Hardware-Based 3D Bézier Functions . . . 549 G. Soza, M. Bauer, P. Hastreiter, C. Nimsky, G. Greiner Brownian Warps: A Least Committed Prior for Non-rigid Registration . . . 557 M. Nielsen, P. Johansen, A.D. Jackson, B. Lautrup Using Points and Surfaces to Improve Voxel-Based Non-rigid Registration . . . 565 T. Hartkens, D.L.G. Hill, A.D. Castellano-Smith, D.J. Hawkes, C.R. Maurer Jr., A.J. Martin, W.A. Hall, H. Liu, C.L. Truwit Intra-patient Prone to Supine Colon Registration for Synchronized Virtual Colonoscopy . . . 573 D. Nain, S. Haker, W.E.L. Grimson, E. Cosman Jr., W.W. Wells, H. Ji, R. Kikinis, C.-F. Westin Nonrigid Registration Using Regularized Matching Weighted by Local Structure . . . 581 E. Suárez, C.-F. Westin, E. Rovaris, J. Ruiz-Alzola Inter-subject Registration of Functional and Anatomical Data Using SPM . . . 590 P. Hellier, J. Ashburner, I. Corouge, C. Barillot, K.J. Friston
Visualization Evaluation of Image Quality in Medical Volume Visualization: The State of the Art . . . 598 A. Pommert, K.H. Höhne Shear-Warp Volume Rendering Algorithms Using Linear Level Octree for PC-Based Medical Simulation . . . 606 Z. Wang, C.-K. Chui, C.-H. Ang, W.L. Nowinski Line Integral Convolution for Visualization of Fiber Tract Maps from DTI . . . 615 T. McGraw, B.C. Vemuri, Z. Wang, Y. Chen, M. Rao, T. Mareci On the Accuracy of Isosurfaces in Tomographic Volume Visualization . . . 623 A. Pommert, U. Tiede, K.H. Höhne A Method for Detecting Undisplayed Regions in Virtual Colonoscopy and Its Application to Quantitative Evaluation of Fly-Through Methods . . . 631 Y. Hayashi, K. Mori, J. Hasegawa, Y. Suenaga, J. Toriwaki
Novel Imaging Techniques 3D Respiratory Motion Compensation by Template Propagation . . . 639 P. Rösch, T. Netsch, M. Quist, J. Weese
An Efficient Observer Model for Assessing Signal Detection Performance of Lossy-Compressed Images . . . 647 B.M. Schmanske, M.H. Loew Statistical Modeling of Pairs of Sulci in the Context of Neuroimaging Probabilistic Atlas . . . 655 I. Corouge, C. Barillot Two-Stage Alignment of fMRI Time Series Using the Experiment Profile to Discard Activation-Related Bias . . . 663 L. Freire, J.-F. Mangin Real-Time DRR Generation Using Cylindrical Harmonics . . . 671 F. Wang, T.E. Davis, B.C. Vemuri Strengthening the Potential of Magnetic Resonance Cholangiopancreatography (MRCP) by a Combination of High-Resolution Data Acquisition and Omni-directional Stereoscopic Viewing . . . 679 T. Yamagishi, K.H. Höhne, T. Saito, K. Abe, J. Ishida, R. Nishimura, T. Kudo
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
Automated Nomenclature Labeling of the Bronchial Tree in 3D-CT Lung Images Hiroko Kitaoka1,6, Yongsup Park2, Juerg Tschirren3, Joseph Reinhardt4, Milan Sonka3, Geoffrey McLennan5, and Eric A. Hoffman1,4 1
Division of Physiologic Imaging, Dept. of Radiology, College of Medicine, University of Iowa, 200 Hawkins Drive, Iowa City, Iowa 52242, USA 2 Dept. of Informatics and Mathematical Science, Graduate School of Engineering Science, Osaka University, 2-2 Yamadaoka, Suita, Osaka 363-0871, Japan 3 Dept. of Electrical and Computer Engineering, College of Engineering, University of Iowa, 1402 SC, Iowa City, Iowa 52242, USA 4 Dept. of Biomedical Engineering, College of Engineering, University of Iowa, 1402 SC, Iowa City, Iowa 52242, USA 5 Dept. of Internal Medicine, College of Medicine, University of Iowa, 200 Hawkins Drive, Iowa City, Iowa 52242, USA 6 Biomedical Physics Laboratory, Brussels Free University, Campus Erasme cp 613/3, 808 Route de Lennik, 1070 Brussels, Belgium

Abstract. A nomenclature labeling algorithm for the human bronchial tree down to sub-lobar segments is proposed as a means of inter- and intra-subject comparison for the evaluation of lung structure and function. The algorithm is a weighted maximum clique search of an association graph between a reference tree and an object tree. The adjacency between nodes in the association graph is defined so as to reflect the consistency between the bronchial names in the reference tree and the node connectivity in the object tree. Nodes in the association graph are weighted according to the similarity between the two tree nodes in the respective trees. The algorithm is robust to varying branching patterns and to false branches that arise during segmentation. Experiments were performed on nine airway trees extracted automatically from clinical 3D-CT data, each containing approximately 250 branches. Of these, 95% were accurately named.
1 Introduction
Isotropic volume data acquisition for medical imaging is now rapidly spreading in clinical use due to technological progress in multi-detector CT scanners. 3D image processing techniques have enabled precise structural analysis of living organs. Anatomical nomenclature is an important step in sharing a common understanding of organ structure. Inter-individual and intra-individual comparisons are meaningful only when accurate nomenclatures are applied to the structures. Accuracy of nomenclature is also critical for diagnosis and surgical planning. However, the anatomical knowledge used for establishing the nomenclature of biological structures is challenging to encode in robust computational algorithms, because of the complexity and diversity of biologic structure. Discrepancy in anatomical nomenclature even between experts is not uncommon. The human airway tree is a typical example of the difficulty of nomenclature and labeling because of its hierarchical properties and the considerable variation in branching pattern. Mori et al. reported a knowledge-based labeling method for the bronchial branches and applied it to seven cases of CT images with a slice thickness of 2 or 3 mm [1]. In their experiment, the number of extracted branches for each subject was about thirty, and none of the trees extracted from the seven cases contained all segmental bronchi. With modern multi-detector CT scanners, more than a hundred bronchial branches can be extracted. The increase in the number of branches identified increases the complexity of establishing a robust labeling scheme. In this paper, we first explain how the bronchial nomenclature is constructed in terms of graph theory, and introduce an algorithm based on a weighted maximum clique search of an association graph between a reference tree and an object tree. We demonstrate its performance on volumetric human lung CT data sets. We believe the proposed algorithm will be applicable to tree systems not only in the lung but also in other organs and across species.
2 Principles of Bronchial Nomenclature

2.1 General Aspects of Bronchial Nomenclature
The human airway tree begins at the trachea and branches repeatedly into smaller and smaller bronchi, ending in the terminal bronchioles, whose diameter is about 0.5 mm. The total number of airway branches is over 50,000 in the normal adult human [2], and bronchial nomenclatures are defined for the 74 proximal branches down to the sub-segmental bronchi [3], [4], [5]. Currently, the most clinically important nomenclatures cover the 32 branches down to the segmental bronchi. Peripheral bronchi lying downstream from a segmental bronchus are usually named using the nomenclature of their parent segmental bronchus. The bronchial nomenclature is assigned according to the region of the lung to which a bronchus supplies air. There is a clear definition for the spatial division of the lung, as shown in Figure 1. The classes of lobe, segment, and subsegment form a hierarchic structure, and the members of each class partition the whole lung without overlap. There is an exact one-to-one correspondence between a branch and the lung region supplied with air by that branch, because there are no loops in the airway tree. Therefore, the bronchial nomenclatures are based upon the regional nomenclatures: lobar bronchus, segmental bronchus, and so on. The most common way to describe the airway tree mathematically is by a graph representation using a rooted tree. However, for the purpose of bronchial nomenclature, a tree representation can lead to confusion, because the hierarchy of the rooted tree does not correspond to the nomenclature hierarchy. Figure 2 shows a standard branching pattern of the human bronchial tree [3], where thick lines indicate bronchi having anatomical nomenclatures. In this branching pattern, the levels of the segmental bronchi range from the 3rd to the 7th generation. Furthermore, as shown in Figure 3, there are differences in branching patterns even across normal subjects. It is thus obvious that the same nomenclature does not imply the same level in the tree representations of different branching patterns.
Fig. 1. Hierarchy of space division of the lung. s: lung segment; ss: sub-segment
Fig. 2. A typical example of the human bronchial tree. s: segmental bronchus
Fig. 3. Branching patterns of segmental bronchi arising from the right upper lobar bronchus (UB). Frequencies for respective branching patterns are according to [3]
Since there are only five lobar bronchi and there is little variation in their branching pattern, nomenclature for the lobar bronchi is not difficult. On the other hand, determining nomenclature for the segmental bronchi is much more difficult because of the large variety of branching patterns.

2.2 Nomenclature of the Segmental Bronchus
There are ten lung segments in the right lung and eight in the left lung. The names of the lung segments describe their locations within the lung; for example, there are apical, lateral, and anterior segments of the right upper lobe. For simplicity, the numbers 1 to 10 are used to distinguish locations. Both the right and left lower lobes sometimes have accessory segments called sub-superior segments, which are often located below the superior segmental bronchus (B6); they are usually denoted by an asterisk (*) instead of a number [3], [4], [5]. Since branchpoints in the bronchial tree have only one upward branch, it is reasonable to assign bronchial nomenclatures to branchpoints, as shown in Figure 4. Each segmental bronchus is located neither upstream nor downstream of the other segmental bronchi, since their supplying regions are independent of each other. In addition, the segmental bronchi are always located distal to their parent lobar bronchi regardless of the branching order, because each lobe is composed of its member segments. These two relationships appear trivial but are very important for clarifying the node connectivity in a rooted tree in terms of graph theory. Meanwhile, intermediate branches between lobar and segmental bronchi have no anatomical names because of their ambiguous relationships. These relationships do not change even if a tree contains false branches or misses true branches due to image processing steps such as segmentation and skeletonization. Errors occurring in the segmentation and skeletonization steps serve as a primary source of difficulty when seeking to automatically label the bronchial tree.
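The two connectivity relationships above can be checked mechanically on a rooted tree stored as parent pointers. A toy sketch (the node names and the tiny right-upper-lobe fragment are invented for illustration, not taken from the paper's reference model):

```python
# Toy rooted tree as parent pointers; "upstream" means towards the trachea.

def ancestors(parent, u):
    """Set of nodes strictly upstream of u."""
    out = set()
    while parent[u] is not None:
        u = parent[u]
        out.add(u)
    return out

def upstream(parent, u, v):
    """True if u lies upstream of v."""
    return u in ancestors(parent, v)

# trachea -> right main bronchus -> upper lobar bronchus -> {B1, B2, B3}
parent = {"trachea": None, "rmb": "trachea", "ulb": "rmb",
          "B1": "ulb", "B2": "ulb", "B3": "ulb"}

segmental = ["B1", "B2", "B3"]
# No segmental bronchus is upstream or downstream of another ...
assert all(not upstream(parent, a, b)
           for a in segmental for b in segmental if a != b)
# ... and every segmental bronchus is distal to its lobar bronchus.
assert all(upstream(parent, "ulb", s) for s in segmental)
```

Note that both properties hold regardless of the branching order, which is what makes them usable as matching constraints.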
Fig. 4. Scheme of the bronchial nomenclatures. Each segmental node is connected upward to its segmental bronchus. Some of the lobar nodes presented here differ from the traditional definitions; see the text in Sect. 3.3
There is one more important characteristic of the bronchial nomenclature that can provide node attributes in the airway tree. Each lung segment is supplied with air by its corresponding segmental bronchus, and all branches within the segment are descendants of the segmental bronchus. Therefore, the position and direction of a segmental bronchus correspond to the position and central axis of its associated lung segment. The segmental bronchial nomenclature is defined according to this correspondence, regardless of branching order.
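The position and direction of a bronchus can be derived from branchpoint coordinates. A hypothetical sketch, assuming each branch is summarized by the coordinates of its two endpoints (the function name and example values are invented):

```python
import numpy as np

def node_attributes(parent_point, node_point):
    """Position P_u of a branchpoint and unit direction V_u of the edge
    connecting it to its parent branchpoint (sign convention assumed)."""
    p = np.asarray(node_point, float)
    v = p - np.asarray(parent_point, float)
    return p, v / np.linalg.norm(v)

P, V = node_attributes([0.0, 0.0, 0.0], [0.0, 3.0, 4.0])
print(V)  # unit direction vector
```

These two quantities are exactly the attributes used to weight association-graph nodes later in the paper.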
3 Bronchial Nomenclature Algorithm
Automated bronchial nomenclature labeling can be viewed as a tree matching problem between an object tree and a standard airway tree. The nomenclature labeling then assigns to each node in the object tree the same name as its corresponding node in the reference tree. The algorithm is based on a weighted tree matching method proposed by Pelillo et al. [6]. Their method seeks the maximum weight clique in a tree association graph (TAG), which is equivalent to finding the maximum similarity subtree isomorphism between two trees. We modify the definition of adjacency of TAG nodes and construct a similarity measure between a reference tree and an object tree according to the properties of the bronchial nomenclature explained in the previous section. Before explaining our algorithm, Pelillo's original method is briefly described.
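Once corresponding node pairs are available, the labeling step itself is a simple transfer of names from reference to object nodes. A minimal sketch, where the node identifiers and the name table are hypothetical:

```python
# Each matched pair (reference node, object node) from the maximum weight
# clique transfers the reference node's anatomical name, if it has one.

def transfer_labels(matched_pairs, reference_names):
    """matched_pairs: iterable of (ref_node, obj_node) pairs."""
    labels = {}
    for ref, obj in matched_pairs:
        if ref in reference_names:   # intermediate branches stay unnamed
            labels[obj] = reference_names[ref]
    return labels

ref_names = {"r1": "trachea", "r2": "right main bronchus", "r5": "B1"}
clique = [("r1", "t0"), ("r2", "t3"), ("r4", "t7"), ("r5", "t9")]
print(transfer_labels(clique, ref_names))
# {'t0': 'trachea', 't3': 'right main bronchus', 't9': 'B1'}
```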
3.1 Weighted Tree Association Graph by Pelillo et al.
Let G = (V, E) be a graph, where V is the set of nodes and E is the set of edges. Let T1 = (V1, E1) and T2 = (V2, E2) be two rooted trees and let u1, v1 ∈ V1 and u2, v2 ∈ V2 be distinct nodes of the respective trees. The tree association graph (TAG) of T1 and T2 is the graph G = (V, E) where V = V1 × V2, and TAG nodes (u1, u2) and (v1, v2) are adjacent when the connectivity between u1 and v1 is equivalent to that between u2 and v2. Pelillo et al. define equivalence between two pairs of nodes in the respective trees by comparing path length and level difference in the tree hierarchy. With this definition, there exists a one-to-one correspondence between maximal subtree isomorphisms and maximal cliques of the TAG of the two trees, so searching for a maximal clique in the TAG is equivalent to tree matching. Next, let T(V, E, α) be an attributed tree, where α is a function which assigns an attribute vector α(u) to each node u ∈ V. Let σ be a similarity measure in attribute space. The subtree isomorphism with the largest similarity is called the "maximum similarity subtree isomorphism". The weighted TAG (WTAG) of two attributed trees T1 and T2 is the weighted graph G = (V, E, ω), where ω is a function which assigns a positive weight to each node z = (u, v) ∈ V as follows: ω(z) = ω(u, v) = σ(α1(u), α2(v)). The weight matrix W = (mij) is defined as follows:

mij = 1 - 0.5 σmin / ω(ui), if i = j,
mij = 1, if i ≠ j and ui is adjacent to uj,
0 ≤ mij < 1 - 0.5 σmin / (ω(ui) + ω(uj)), otherwise,

where σmin denotes the minimum value of the similarity measure σ. Pelillo et al. used the following method to search for a maximum weight clique in the WTAG. Let G = (V, E, ω) be an arbitrary weighted graph of order n. The characteristic vector xc of any subset of nodes C ⊆ V is defined as follows:

xic = ω(ui) / Ω(C), if ui ∈ C,
xic = 0, otherwise,

where Ω(C) is the total weight of C.
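Assuming the weight-matrix definition above (with the unconstrained "otherwise" entries set to 0), the maximizer can be sought by iterating the replicator dynamics used by Pelillo et al. [9], x_i <- x_i (Wx)_i / (x^T W x). A minimal NumPy sketch on an invented four-node graph:

```python
import numpy as np

def weight_matrix(omega, adjacency, sigma_min=0.1):
    """Build W per the three cases above; off-clique entries set to 0,
    a valid choice within the stated bound."""
    n = len(omega)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                W[i, j] = 1.0 - 0.5 * sigma_min / omega[i]
            elif adjacency[i][j]:
                W[i, j] = 1.0
    return W

def replicator(W, iters=1000):
    """Discrete replicator dynamics on the simplex."""
    x = np.full(len(W), 1.0 / len(W))   # start at the barycentre
    for _ in range(iters):
        y = x * (W @ x)                 # numerator: x_i (Wx)_i
        x = y / y.sum()                 # denominator: x^T W x
    return x

# Invented example: nodes 0,1,2 are mutually adjacent; node 3 is isolated.
omega = np.array([0.9, 0.8, 0.7, 0.6])
adj = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
x = replicator(weight_matrix(omega, adj))
clique = [i for i, xi in enumerate(x) if xi > 1e-3]  # support of x
print(clique)  # the heaviest clique {0, 1, 2}
```

The support of the converged vector indicates the clique members; this is a sketch of the mechanics, not the paper's implementation.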
It has been proved that C is a maximum weight clique of G if and only if xc is a global maximizer of the function x^T W x, where x^T denotes the transpose of x [7], [8]. Pelillo et al. used a replicator dynamical system to seek the maximizer [9].

3.2 Modification of the Weighted TAG for Bronchial Nomenclature
Pelillo et al. defined TAG-node adjacency as an exact agreement between the connectivity of two nodes in one tree and that of their corresponding nodes in the other tree [6]. We propose an alternative definition of TAG-node adjacency constructed for the purpose of bronchial nomenclature. First, a relationship function r between nodes u, v, and w in a rooted tree is defined as follows:

r(u, v) = 1 and r(v, u) = -1, when u is located upstream of v,
r(u, w) = r(w, u) = 0, when u is located neither upstream nor downstream of w.
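A direct reading of r on a parent-pointer tree; the tree fragment is made up for illustration:

```python
def r(parent, u, v):
    """+1 if u is upstream of v, -1 if downstream, 0 if neither."""
    def is_ancestor(a, b):
        while parent[b] is not None:
            b = parent[b]
            if b == a:
                return True
        return False
    if is_ancestor(u, v):
        return 1
    if is_ancestor(v, u):
        return -1
    return 0

parent = {"root": None, "a": "root", "b": "root", "a1": "a"}
assert r(parent, "root", "a1") == 1
assert r(parent, "a1", "root") == -1
assert r(parent, "a", "b") == 0
```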
Another relationship function, q in a reference tree is defined as follows: The basic relationship is the same as r, however, when u and v are segmental nodes, q(u, v) = q(v, u) = 2, When u is a node having no nomenclature, q(u, v) = q(v, u) = 3 Both of those relationship functions are applicable to multiple branching. From two relationship functions, the adjacency of TAG nodes A is defined as follows: A=1 if q < 2 and r = q, A=1 if q = 2 and r = 0, A=0 if q = 2 and r ≠ 0, or q < 2 and r ≠ q, A=0 if q = 3 The definition of the lobar node in the algorithm is slightly different from the anatomical definition of the lobar bronchi. The bilateral lower lobe bronchi are very short, and sometimes the superior segment bronchi of the lower lobe (B6) arise from the right intermediate bronchus and left main bronchus. Therefore, instead of the lower lobe bronchi, the basal bronchi are used in the algorithm. The reference tree is indicated in Figure 4. There are 30 branches having anatomical names excluding two lower lobe bronchi. The node attribute vector α (u) of a tree is constructed by the position of a node, denoted Pu, and the direction of the upward edge, denoted Vu, The similarity measure σ is defined as follows: σ (α (u1), α (u2)) =1- β1(1- (Vu1, Vu2) ) -β2 | Pu1 – Pu2| σ ( u, v) = σmin , if σ ( u, v)< σmin, where β1, β2, and σmin (>0)are determined experimentally as 0.5, 0.1/cm, and 0.1, respectively. In order to compare node positions in different trees, size normalization and approximate registration are necessary. The practical methods are described in the next section. 3.3
3.3 Correction of Labeled Node
In the above algorithm, a descendant of the true segmental node may be labeled as the segmental node when its similarity is higher than that of the true segmental node. Therefore, it is necessary to check whether a true segmental node exists among the ancestors of a labeled node. Since no descendant of a segmental node is a descendant of any other segmental node, each sibling subtree of a segmental node should contain at least one different segmental node (the sibling itself included). If a sibling subtree does not, one of the node's ancestors must be the true segmental node. Correction of a segmental node is therefore performed by moving the segmental label upward until a sibling subtree containing at least one segmental node is found. If the parent is labeled as a lobar node even though its sibling subtree contains no segmental node, the sibling and its descendants are regarded as belonging to an unknown segment. If unknown segments remain after all labeled segmental nodes have been checked, each unknown node is assigned the nomenclature with the highest similarity. Proximal branches beyond segmental nodes are relabeled using the relationship between lobes and segments shown in Figure 1. For example, if a node is located upstream of all unilateral segments, it is assigned as the main bronchus. When false branches are generated in a proximal branch, the above algorithm does not recognize all parts of the branch; by adding this step, however, all proximal branches are obtained excluding the false branches.
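The upward correction step can be sketched as follows (a simplified illustration with assumed data structures; the lobar-node and unknown-segment handling described above are omitted):

```python
def subtree_has_segmental(children, labels, node):
    """True if node's subtree (including node) contains a segmental label."""
    stack = [node]
    while stack:
        n = stack.pop()
        if labels.get(n) == "segmental":
            return True
        stack.extend(children.get(n, []))
    return False

def correct_segmental(parent, children, labels, node):
    """Move a segmental label upward until some sibling subtree also
    contains a segmental node (hypothetical simplification of the
    correction step)."""
    while node in parent:
        p = parent[node]
        siblings = [c for c in children[p] if c != node]
        if any(subtree_has_segmental(children, labels, s) for s in siblings):
            return node
        labels.pop(node, None)   # the label was attached too far downstream
        labels[p] = "segmental"  # move it one level up
        node = p
    return node

# root -> a -> {b, d}, root -> c; the true segmental node is a, but b was labeled
parent = {"a": "root", "b": "a", "d": "a", "c": "root"}
children = {"root": ["a", "c"], "a": ["b", "d"], "b": [], "d": [], "c": []}
labels = {"b": "segmental", "c": "segmental"}
assert correct_segmental(parent, children, labels, "b") == "a"
```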
Automated Nomenclature Labeling of the Bronchial Tree in 3D-CT Lung Images
4 Experiments
Nine 3D-CT data sets of human lungs were used for testing. Scanning occurred with lung volume held near total lung capacity and with subjects lying in the supine posture. The slice thickness was 1.25 mm with 0.6 mm spacing, and the pixel size ranged from 0.58 mm to 0.72 mm. All subjects were studied under an approved University of Iowa IRB protocol. Segmentation and skeletonization of the airway were performed by the methods reported by Kiraly et al. [10] and Palágyi et al. [11], respectively. More than 190 branches were extracted for each case. In most cases, there were several false branches in the proximal portion of the tree, which could be automatically recognized as false. However, in some cases there were clusters of numerous false branches in the peripheral lung regions distal to the segmental bronchi. These peripheral false branches were due to incorrect segmentation at the periphery and could not be automatically recognized as false; they were therefore manually identified and excluded from the evaluation. False branches located in the proximal part were evaluated as to whether they were correctly labeled "false". The gold standard for the bronchial nomenclature was provided by careful observation of the CT images by one of the authors, a pulmonologist with expertise in chest CT. An existing 3D mathematical model of the human airway tree [12] was slightly modified and used as the reference tree. Its branching pattern was designed to represent a standard airway tree [3], and bilateral sub-superior segmental bronchi were added as shown in Figure 4. Since the maximum thoracic width of this model is fixed at 30 cm [12], size normalization was performed according to the maximum thoracic width in the CT images. Approximate registration was performed by matching the carina point of a normalized object tree to that of the reference tree.
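The size normalization and carina-based registration described above amount to a scale-then-translate transform. A minimal sketch with illustrative numbers (the function name and inputs are assumptions, not from the paper):

```python
REF_THORACIC_WIDTH = 30.0  # cm, fixed in the reference airway model [12]

def normalize_and_register(point, thoracic_width_cm, carina, ref_carina):
    """Scale a node position by the thoracic-width ratio, then translate
    so the object tree's carina maps onto the reference carina."""
    s = REF_THORACIC_WIDTH / thoracic_width_cm
    return tuple(s * p - s * c + rc
                 for p, c, rc in zip(point, carina, ref_carina))

# Illustrative numbers only: width 15 cm gives scale 2
p = normalize_and_register((2, 2, 2), 15.0, carina=(1, 1, 1), ref_carina=(0, 0, 0))
assert p == (2.0, 2.0, 2.0)
```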
Automated detection of the carina point was performed by finding the longest branch located at the center of the thorax. Only branch nodes in an extracted airway tree were subjected to the nomenclature-labeling algorithm; terminal nodes were labeled later, because the extracted branches extended peripherally beyond the segmental bronchi. Table 1 shows the number of extracted branches, labeled branches, and correctly labeled branches for each subject. Almost all branches were accurately labeled except in subject 3. Overall accuracy for the nine cases was 95 %.

Table 1. Result of automatic labeling of the bronchial tree extracted from 3D-CT data

Subject   Extracted branches   Labeled branches   Correctly labeled   Accuracy (%)
1         245                  245                245                 100
2         197                  197                169                 96
3         203                  203                148                 74
4         245                  244                244                 100
5         192                  192                192                 100
6         327                  327                327                 100
7         268                  266                266                 99
8         195                  195                195                 100
9         301                  298                288                 96
Total     2,173                2,167              2,074               95
Figures 5, 6, and 7 show the bronchial trees of subjects 1, 2, and 3, respectively. In these figures, the segmented bronchial regions and their skeletons are superimposed. Each segmental bronchus and its descendants are distinguished by color. Proximal branches beyond the segmental bronchi are colored white. Incorrectly labeled branches
are colored gray. In Figure 5, even though there were several false branches among the proximal bronchi, labeling was performed with an accuracy of 100 %. The false branches in the proximal bronchi were correctly labeled as false, and the left sub-superior bronchus (B*) was correctly labeled.
Fig. 5. Labeled bronchial tree in subject 1. Anterior view (left), right lateral view (middle), and left lateral view (right). All branches are correctly labeled, including false branches
Fig. 6. Labeled bronchial tree in subject 2. There are two mislabeled sub-segmental bronchi
Fig. 7. Labeled bronchial tree in subject 3. The right main and intermediate bronchi are unlabeled. Half of right segmental bronchi are mislabeled
There were several incorrectly labeled branches in subject 2. Mislabeling occurred at the level of the sub-segmental bronchi, as shown in Figure 6. In this case, two sub-segmental bronchi of the anterior segmental bronchus of the left lower lobe (B8) arose without a common trunk. One sub-segmental bronchus was correctly labeled as B8, but the other was labeled as the lateral segmental bronchus of the left lower lobe (B9), as shown by an arrow in Figure 6. A sub-segmental bronchus of the apico-posterior segmental bronchus of the left upper lobe (B1+2) was mislabeled
as the anterior segmental bronchus of the left upper lobe (B3): because the true B3 arose at a lower position than usual, the apical sub-segmental bronchus of B1+2 was labeled as B3 first, as shown by an arrowhead. Mislabeling in subjects 7 and 9 was likewise due to the lack of a common trunk of sub-segmental bronchi, as with the left B8 in subject 2. Subject 3 had a very rare variant branching pattern in which the apical segmental bronchus of the right upper lobe (B1) arose from the right main bronchus, as shown in Figure 7. Since the right upper lobe was much larger than usual, the positions and directions of the other branches differed from those in a usual airway tree. Therefore, only half of the branches in the right lung were correctly labeled. The right main bronchus and the intermediate bronchus were left unlabeled (colored gray in Figure 7) because of the inconsistent relationship between lobes and segments; this indicates that the branching abnormality occurred at the level of the main bronchus. There were also two unlabeled branches in the left lung: although they were sub-segmental bronchi, they were terminal branches in the extracted tree.
5 Discussion
The experimental results indicate that the proposed algorithm is useful for bronchial nomenclature labeling up to the segmental level of the airway tree in human CT images, with 95 % accuracy. The lowest accuracy occurred in subject 3, which exhibited a very rare branching pattern. According to Yamashita [3], such a pattern did not occur in the 170 specimens studied. Automated nomenclature labeling is unlikely to succeed in such cases, and manual correction by an expert will be required. The proposed algorithm can alert the user when such unusual patterns are encountered by assigning no nomenclature to the proximal bronchi. Excluding subject 3, the average accuracy was 98 %, which is considered satisfactory for practical application. The main cause of mislabeling was the lack of a common trunk of sub-segmental bronchi. Although the mislabeled bronchus in the left lower lobe of subject 2 was recognized as a sub-segmental bronchus of B8 by one of the authors, other experts might name it a sub-segmental bronchus of B9 or B*. Branching patterns at the sub-segmental level are more varied than at the segmental level [3], and hence labeling sub-segmental bronchi is considerably harder; extending the proposed algorithm to the sub-segmental level would be useful for resolving such ambiguities. There are three parameters in the weighted TAG that determine the similarity between a reference tree node and an object tree node. Although fixed values were used in the experiment, optimal values should be investigated as the number of clinical cases increases. Accuracy of the nomenclature labeling is also influenced by the generality of the reference tree. We used a model-derived tree [12] as the reference in the experiment. The model-derived tree consists of the most common branching pattern in each lobe and contains accessory segmental bronchi, which will rarely be found in real cases.
The reference tree is expected to be refined by statistically analyzing morphometric data of airway trees in 3D CT images as we continue to study additional subjects. Mori et al. proposed a knowledge-based labeling method for the airway tree; nomenclature labeling in their method is executed in the direction of the periphery from
the trachea, using a depth-first search [1]. However, as they discussed in their paper, the depth-first search propagates proximal mislabeling into the periphery. Searching for a global solution, as in our algorithm, may be more suitable for bronchial nomenclature labeling. Krass et al. reported automated bronchial labeling based on graph theory [13], but details of the algorithm were not described in their paper. Automated nomenclature labeling of the airway tree in 3D-CT images is a promising technique for both clinical and fundamental imaging investigations. One can easily begin to recognize branching patterns and to catalogue the spatial distribution of the airway tree. Although it is difficult to obtain precise segmentation of the peripheral small airways, it is possible to label pulmonary arteries adjacent to labeled airways and to track the more peripheral branches of the pulmonary arteries; arterial labels can then likely be transferred to their adjacent airways. These processes will provide a better understanding of the segmental anatomy of the lung.
6 Conclusion
We have proposed a bronchial nomenclature labeling algorithm that is robust to various branching patterns and to false branches arising during image segmentation and skeletonization. The results show very accurate labeling for more than 200 branches. This technique will be useful for both clinical and fundamental imaging investigations of the lung.
Acknowledgements: This work was supported in part by NIH HL-04368 and HL-060158 and NSF 0092758.
References
1. Mori, K., Hasegawa, J., Suenaga, Y., Toriwaki, J.: Automated Anatomical Labeling of the Bronchial Branch and Its Application to the Virtual Bronchoscopy System. IEEE Trans. Med. Imag. 19 (2000) 103-114
2. Weibel, E.R.: Morphometry of the Human Lung. Academic Press, New York (1963)
3. Yamashita, H.: Roentgenologic Anatomy of the Lung. Igaku-shoin, Tokyo (1978)
4. Moore, K.L.: Clinically Oriented Anatomy. Williams & Wilkins, Baltimore (1985) 49-148
5. Agur, A.M.R., Lee, M.J.: Grant's Atlas of Anatomy. Williams & Wilkins, Baltimore (1991) 1-76
6. Pelillo, M., Siddiqi, K., Zucker, S.W.: Matching Hierarchical Structures Using Association Graphs. IEEE Trans. PAMI 21 (1999) 1105-1120
7. Motzkin, T.S., Straus, E.G.: Maxima for Graphs and a New Proof of a Theorem of Turán. Canadian J. Math. 17 (1965) 533-540
8. Bomze, I.M., Budinich, M., Pardalos, P.M., Pelillo, M.: The Maximum Clique Problem. In: Du, D.-Z., Pardalos, P.M. (eds.): Handbook of Combinatorial Optimization, Vol. 4. Kluwer Academic, Boston (1999)
9. Pelillo, M.: The Dynamics of Nonlinear Relaxation Labeling Processes. J. Math. Imag. and Vision 7 (1997) 309-323
10. Kiraly, A., Higgins, W.E., Hoffman, E.A., McLennan, G., Reinhardt, J.M.: 3D Human Airway Segmentation for Virtual Bronchoscopy. In: Proc. of SPIE Conf. on Medical Imaging (2002) (in press)
11. Palágyi, K., Sorantin, E., Balogh, E., Kuba, A., Halmai, C., Erdohelyi, B., Hausegger, K.: A Sequential 3D Thinning Algorithm and its Medical Applications. In: 17th Int. Conf. IPMI (2001) 409-415
12. Kitaoka, H., Takaki, R., Suki, B.: A Three-Dimensional Model of the Human Airway Tree. J. Appl. Physiol. 87 (1999) 2207-2217
13. Krass, S., Selle, D., Boehm, D., Jend, H.H., Kriete, A., Rau, W.S., Peitgen, H.O.: Determination of Bronchopulmonary Segments Based on HRCT Data. In: Lemke, H.U., et al. (eds.): Computer Assisted Radiology and Surgery. Elsevier, Amsterdam (2000) 584-589
Segmentation, Skeletonization, and Branchpoint Matching – A Fully Automated Quantitative Evaluation of Human Intrathoracic Airway Trees

J. Tschirren¹, K. Palágyi⁴, J. M. Reinhardt², E. A. Hoffman³,², and M. Sonka¹

¹ Department of Electrical and Computer Engineering
² Department of Biomedical Engineering
³ Department of Radiology
The University of Iowa, Iowa City, IA 52242, USA
⁴ Department of Applied Informatics, University of Szeged, Hungary
Abstract. Modern multislice X-ray CT scanners provide high-resolution volumetric image data containing a wealth of structural and functional information. The size of the volumes makes it more and more difficult for human observers to visually evaluate their contents. Similar to other areas of medical image analysis, highly automated extraction and quantitative assessment of volumetric data is increasingly important in pulmonary physiology, diagnosis, and treatment. We present a method for a fully automated segmentation of a human airway tree, its skeletonization, identification of airway branches and branchpoints, as well as a method for matching the airway trees, branches, and branchpoints for the same subject over time and across subjects. The validation of our method shows a high correlation between the automatically obtained results and reference data provided by human observers.
1 Introduction
Quantitative assessment of intrathoracic airway trees is critically important for objective evaluation of bronchial tree structure and function. Several approaches to three-dimensional reconstruction of the airway tree have been developed in the past. None of them, however, allows direct comparison of airway trees across and within subjects. Functional understanding of pulmonary anatomy, as well as of the natural course of respiratory diseases like asthma, emphysema, cystic fibrosis, and many others, is limited by our inability to repeatedly evaluate the same region of the lungs and perform accurate and reliable positionally corresponding measurements. Consequently, quantitative analysis of disease status, progression, and regression, as well as longitudinal physiologic and functional analyses, are impossible. In this paper, we describe an integrated approach to quantitative analysis of intrathoracic airway trees and inter-tree matching using high-resolution volumetric computed tomography (CT) images. T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 12–19, 2002. © Springer-Verlag Berlin Heidelberg 2002
2 Methods
The reported system consists of three main blocks: airway tree segmentation, skeletonization and branchpoint localization, and branchpoint matching. Each of these blocks is described separately in the following subsections.
2.1 Airway tree segmentation
The airway segmentation takes advantage of the relatively high contrast in CT images between the center of an airway and the airway wall. Seeded region growing is employed, starting from an automatically identified seedpoint within the trachea. New voxels are added to the region if they have an X-ray density similar to that of a neighbor voxel that already belongs to the region. The similarity measure is designed so that the region growing can overcome subtle gray-level changes (such as those caused by beam hardening). At the same time, "leaking" into the surrounding lung tissue has to be avoided; this is realized by setting an upper limit on the allowed gray-value difference between two neighboring voxels. Our region-growing algorithm utilizes a breadth-first search [1], which allows a fast and memory-friendly implementation. After airway segmentation, a binary subvolume is formed that represents the extracted airway tree.
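A minimal sketch of this seeded region growing with breadth-first search (the dictionary-based volume representation and the single gray-level threshold are simplifications of the paper's similarity measure):

```python
from collections import deque

def grow_airway(volume, seed, max_step):
    """Seeded region growing via BFS: a voxel joins the region when its
    gray value differs from an already-accepted 6-neighbor by at most
    max_step (a simplified stand-in for the paper's similarity measure)."""
    region = {seed}
    queue = deque([seed])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
            n = (x + dx, y + dy, z + dz)
            if n in volume and n not in region and \
               abs(volume[n] - volume[(x, y, z)]) <= max_step:
                region.add(n)
                queue.append(n)
    return region

# Illustrative line of voxels: gentle gradient, then an airway wall
volume = {(0,0,0): -1000, (1,0,0): -995, (2,0,0): -990, (3,0,0): -200}
assert grow_airway(volume, (0, 0, 0), 10) == {(0,0,0), (1,0,0), (2,0,0)}
```

The BFS frontier keeps memory proportional to the region surface, which is what makes this growing both fast and memory-friendly.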
2.2 Skeletonization
The binary airway tree formed in the previous step is skeletonized to identify the three-dimensional centerlines of individual branches and to determine the branchpoint locations. A sequential 3D thinning algorithm reported by Palágyi et al. [2] was customized for our application. To obtain the skeleton, a thinning function deletes border voxels that can be removed without changing the topology of the tree. This thinning step is applied repeatedly until no more points can be deleted. The thinning is performed symmetrically, and the resulting skeleton is guaranteed to lie in the middle of the cylindrically shaped airway segments. After completion of the thinning step, the skeleton is smoothed, false branches are pruned, the branchpoint locations are identified, and the complete tree is converted into a graph structure using an adjacency-list representation. Fig. 1 shows a close-up view of a skeleton produced by the algorithm. Skeleton branchpoints are identified as skeleton points with more than two neighboring skeleton points.
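The branchpoint criterion can be sketched directly (assuming a set-of-voxels skeleton representation and 26-connectivity, which are our own assumptions rather than the paper's adjacency-list structure):

```python
from itertools import product

def branchpoints(skeleton):
    """Identify branchpoints: skeleton voxels with more than two
    26-connected skeleton neighbors."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    result = set()
    for x, y, z in skeleton:
        neighbors = sum((x + dx, y + dy, z + dz) in skeleton
                        for dx, dy, dz in offsets)
        if neighbors > 2:
            result.add((x, y, z))
    return result

# A tiny Y-shaped skeleton: a stem plus two diverging branch tips
skel = {(0,0,0), (0,0,1), (0,0,2), (1,0,3), (-1,0,3)}
assert branchpoints(skel) == {(0, 0, 2)}
```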
2.3 Branchpoint matching
The goal of branchpoint matching is to find anatomically corresponding branchpoints in two different airway trees. Two types of matching are of utmost interest: intra-subject and inter-subject matching. In the first case, trees coming from different scans of the same subject are matched; in the second case, trees originating from two or more different subjects are matched. The latter case only allows matching of the primary branchpoints (the first three or four generations). These
Fig. 1. Example of segmentation and skeletonization applied to an airway tree phantom.
primary branchpoints are frequently (although not universally) identical among humans. The branching pattern of higher airway generations varies from subject to subject, much like fingerprints do. In the mathematical sense, an airway tree is a graph (rooted tree). The branchpoints correspond to vertices and the airway segments correspond to graph edges. There are many graph-theoretic approaches to graph matching. A widely used method for matching hierarchical relational structures is to map them onto an association graph and then find its maximum clique [3], with many variations existing [4, 5]. To the best of our knowledge, only one application of the method was employed for matching airway trees [6]. A disadvantage of finding the maximum clique is its NP-completeness [7]. This means that for all but small graphs, an exhaustive search is not feasible. There are two basic ways of decreasing the computational complexity: minimizing the overall problem size or splitting the problem into several smaller subproblems. Our method uses both of these strategies. Terminal branches that are shorter than a predefined length are mostly spurious (caused by inaccuracies in the segmentation and skeletonization processes) and are pruned out of the tree in the late stages of the skeletonization process. Additionally, the major vertices (branchpoints) are identified. A vertex is considered to be major if it has at least N vertices hierarchically underneath it, and if these vertices have a spatial extent that exceeds a predefined threshold. The spatial extent is defined as the maximum of the three differences xmax − xmin , ymax − ymin , and zmax − zmin . Next, the two trees undergo a rigid registration, using the major branchpoints as landmark points. The major branchpoints are matched using an association graph. After that, a separate association graph is created for every subtree starting from a set of matched major branchpoints. 
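The major-vertex test described above can be sketched as follows (the count threshold N and the extent threshold are left as parameters, since the paper does not state their values):

```python
def spatial_extent(points):
    """Max of the coordinate ranges x_max - x_min, y_max - y_min, z_max - z_min."""
    return max(max(p[i] for p in points) - min(p[i] for p in points)
               for i in range(3))

def is_major(vertex, descendants, positions, min_count, min_extent):
    """A vertex is major if it has at least min_count descendants and
    their spatial extent exceeds min_extent."""
    desc = descendants[vertex]
    if len(desc) < min_count:
        return False
    return spatial_extent([positions[d] for d in desc]) > min_extent

# Illustrative data: three descendants spanning 5 units along x
desc = {"v": ["a", "b", "c"]}
pos = {"a": (0, 0, 0), "b": (5, 0, 0), "c": (0, 2, 0)}
assert is_major("v", desc, pos, min_count=3, min_extent=4) is True
assert is_major("v", desc, pos, min_count=3, min_extent=6) is False
```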
When creating the association graphs for the sub-trees, only vertex-pairs that lie relatively close to each other are considered. This reduces the size of the association graph. Edges are added to the association graph based on the topological and geometrical distances, inheritance relationships, and geometrical length and directions. For all of these measures tolerances are allowed. For the topological
distance, a tolerance of ±2 segments is allowed. A parent–child and a child–parent relationship are regarded as equivalent if the geometrical distance between the two branchpoints does not exceed 2 mm in both trees. This introduces tolerance for cases where two branchpoints are very close to each other and — due to tolerances in segmentation and skeletonization — their order is swapped between the two trees. For the lengths and angles of segments, tolerances of ±20% and ±0.2 radians are allowed, respectively. Allowing for these tolerances introduces robustness against false branches and missing branches. In a final step, the maximum clique is found for every association graph.
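Because the size reductions above leave only small association graphs, even an exhaustive maximum-clique search is feasible. A brute-force sketch (the vertex labels are illustrative candidate branchpoint pairs, not from the paper):

```python
from itertools import combinations

def max_clique(vertices, edges):
    """Exhaustive maximum-clique search, feasible only for small graphs."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # Try subset sizes from largest to smallest; first fully connected
    # subset found is a maximum clique.
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if all(b in adj[a] for a, b in combinations(subset, 2)):
                return set(subset)
    return set()

# Each vertex pairs a branchpoint of tree P with one of tree Q
verts = ["p1q1", "p2q2", "p3q3", "p2q3"]
edges = [("p1q1", "p2q2"), ("p1q1", "p3q3"), ("p2q2", "p3q3"),
         ("p1q1", "p2q3")]
assert max_clique(verts, edges) == {"p1q1", "p2q2", "p3q3"}
```

The exponential worst case is exactly the NP-completeness concern raised above; the tolerance-based edge pruning is what keeps the instances tractable.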
Fig. 2. Result of branchpoint matching for an in vivo scan (TLC and FRC): total view and detail view of the same matching. The two trees are shown in bold black and bold gray; the matches are represented by fine black lines.
3 Experimental Methods
To test the method, CT scans of two different physical phantoms and in vivo scans of human airway trees were used.
3.1 Data
Two different phantoms were available. The first is a hollow rigid plastic phantom (Fig. 3a) made by a rapid prototyping machine. The phantom's geometry is based on real human data; consequently, a human-like airway tree with parameters known to a high degree of accuracy is available. This phantom consists of about 100 airway tree branches with about 50 branchpoints (not counting the terminal points of airway segments). The second phantom is a hollow rubber phantom (Fig. 3b) made from a human airway tree cast. This second phantom is more complex, consisting of about 400 branches and 200 branchpoints. The
rubber phantom was scanned in a Perspex container filled with potato flakes to resemble the texture of the surrounding pulmonary parenchyma. Since this rubber phantom was not built using a numerical rapid prototyping approach and is not rigid, exact branchpoint locations were not known. The rigid phantom was CT-scanned at three different angles (0◦, 10◦, and 25◦) by rotating it on the scanner table (rotation around one axis). The rubber phantom was scanned twice and rotated in a similar way as the rigid phantom; the rotation angle was 8◦ in this case. The pixel size was 0.49 × 0.49 × 0.60 mm³ for the rigid phantom and 0.68 × 0.68 × 0.60 mm³ for the rubber phantom. The volume sizes were 512 × 512 × 500–600 voxels.
Fig. 3. Phantoms. a) Rigid phantom, b) Rubber phantom. In both phantoms, all the airway segments are hollow.
Two scans were available for each of 18 in vivo subjects, for a total of 36 volumetric high-resolution in vivo CT scans. For each subject, a scan close to total lung capacity (TLC) was acquired (at 85% lung volume), and a scan close to functional residual capacity (FRC) was acquired (at 55% lung volume). All in vivo scans have a nearly isotropic resolution of 0.7 × 0.7 × 0.6 mm³ and consist of 500–600 image slices of 512 × 512 pixels each. In two of these 18 CT data pairs (4 volumes, two from a diseased and two from a normal subject), branchpoints were manually identified by human observers and were used for quantitative validation.
3.2 Validation indices
The validation was done in two parts. First, the reproducibility of the segmentation and skeletonization was tested. Next, the accuracy of the branchpoint matching was examined. The reproducibility of the segmentation and skeletonization was measured by comparing the lengths of corresponding airway segments between the different scans of the two phantoms.
The accuracy of the branchpoint matching was measured by comparing the results obtained using the automated method with the results of manual matching. The manual matching was done separately and independently by six different observers. A matched pair of branchpoints was only included in the independent standard if it was matched by a majority of human observers involved.
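The majority-vote rule for building the independent standard can be sketched as follows (the pair labels are illustrative):

```python
from collections import Counter

def independent_standard(observer_matches, n_observers):
    """Keep a branchpoint pair in the independent standard only if a
    majority of the observers matched it."""
    counts = Counter(pair for matches in observer_matches for pair in matches)
    return {pair for pair, c in counts.items() if c > n_observers / 2}

# Six observers, as in the study; each set holds the pairs that observer matched
obs = [{"A-a", "B-b"}, {"A-a", "B-b"}, {"A-a"}, {"A-a", "B-b"},
       {"A-a", "C-c"}, {"A-a", "B-b"}]
assert independent_standard(obs, 6) == {"A-a", "B-b"}  # C-c lacks a majority
```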
4 Results
Our method was successfully applied to all 5 phantom and 36 human datasets. In all cases, the method generated reliable trees, well-positioned skeletons and branchpoints, and provided consistent intra-subject matches. Quantitative validation results are reported below. Fig. 4 compares airway segment lengths. The p-values are calculated by analysis of variance (ANOVA), using an F-statistic, with the null hypothesis that the mean values are equal. The means and standard deviations of the segment length differences were:

Rigid phantom, 0◦ versus 10◦: µ = 0.03 mm, σ = 0.86 mm
Rigid phantom, 10◦ versus 25◦: µ = −0.07 mm, σ = 2.45 mm
Rigid phantom, 0◦ versus 25◦: µ = −0.31 mm, σ = 1.96 mm
Rubber phantom, 0◦ versus 8◦: µ = 0.24 mm, σ = 1.04 mm
Table 1. Results for accuracy assessment of branchpoint matching.

                                        rigid phantom   in vivo       in vivo
                                        0◦ vs. 10◦      normal        diseased
Correct matches (computer-determined
vs. independent standard)               38/39 (97%)     11/13 (85%)   17/19 (89%)
Wrong matches                           0               1             0
Missing matches                         1               1             2
Total computer matches                  47              46            31
Table 1 lists the results for the branchpoint matching. The segmentation, skeletonization, and matching processes execute very quickly on a 1.2 GHz AMD Athlon-based Linux system. For an image volume containing 512 × 512 × 524 voxels, the segmentation step finishes in less than one second; the complete skeletonization, smoothing, and graph-generation process executes in about 48 seconds; and matching of two trees containing 150–200 branchpoints each requires one to two seconds. Consequently, a pair of trees can be analyzed and matched in about 100 seconds using our moderate-speed hardware.
Fig. 4. Segment length comparison for rigid phantom and rubber phantom. [Four scatter plots of corresponding segment lengths between scans; per-panel regression statistics recovered from the figure: slope = 0.98, intercept = 0.21, N = 36, r = 0.99, p = 0.99; slope = 0.96, intercept = 0.67, N = 42, r = 0.97, p = 0.98; slope = 0.97, intercept = 0.70, N = 30, r = 0.98, p = 0.89; slope = 0.96, intercept = 0.14, N = 121, r = 0.99, p = 0.76.]
5 Discussion
The comparison of segment lengths as determined in phantoms showed high correlation between the reference data and the computer-determined data (Fig. 4). Agreement between segment lengths identified in the 0◦ and 10◦ rotated phantoms and in the 10◦ and 25◦ rotated phantoms was very good. For 0◦ versus 25◦, somewhat larger differences between the lengths were observed. This is mainly caused by a few outliers likely associated with the relatively large change of the CT scanning conditions, and it is not practically important, since a 25◦ difference between long-axis orientations of human subjects in a CT scanner is unlikely. The comparison of computer-matched and hand-matched branchpoints shows a high matching rate in the phantom case (97%) as well as in the human data (85–89%). Notice that the human data contained a relatively high number of non-matching branches in the pairs of matched TLC and FRC datasets. Indeed, there is a considerable difference in the number of branches and in the identifiable parts of the tree structures between FRC and TLC scans, due to changes of lung volume and consequently lung geometry. When comparing the matches identified manually and automatically, it is important to distinguish between missing and extra matches. Comparing between these two classes only, a missing match is preferred over an extra match, since no incorrect information is introduced. As can be seen in Table 1, only a single incorrect extra match was observed in the tested in vivo datasets. At the same time, a total of only four missing matches occurred, an encouraging sign considering that 77 correct matches were identified overall in the in vivo datasets and that additional correct matches were found by the computer approach that were not identified manually. The current implementation is not free of shortcomings. The segmentation step is currently limited to the first 6 to 8 generations of airway tree segments; while this is substantially better than any of our previously reported approaches, additional improvements are under development. The branchpoint matching process is under review, with the goal of avoiding the small number of mismatches present in the current study. Needless to say, additional datasets are being manually analyzed by human observers to form a larger and more representative set of independent-standard data for future validation studies.
6 Conclusion
We presented an approach that allows reliable segmentation, skeletonization, and branchpoint matching in human airway trees. When tested on two kinds of physical phantoms derived from casts of human airway trees and on 36 in vivo acquired airway trees of normal subjects as well as subjects suffering from various pulmonary diseases, the method's performance was incomparably faster than manual analysis and yielded close-to-identical results.
Acknowledgements: This work was supported in part by the NIH grant HL-064368.
References
1. J. Silvela and J. Portillo, "Breadth-first search and its application to image processing problems," IEEE Transactions on Image Processing, vol. 10, pp. 1194–1199, Aug. 2001.
2. K. Palágyi, E. Sorantin, E. Balogh, A. Kuba, C. Halmai, B. Erdohelyi, and K. Hausegger, "A sequential 3D thinning algorithm and its medical applications," in 17th Int. Conf. Information Processing in Medical Imaging, IPMI 2001, Davis, CA, USA, Lecture Notes in Computer Science 2082, pp. 409–415, 2001.
3. A. P. Ambler, H. G. Barrow, C. M. Brown, R. M. Burstall, and R. J. Popplestone, "A versatile computer-controlled assembly system," in Proceedings of the International Joint Conference on Artificial Intelligence, pp. 298–307, 1973.
4. D. H. Ballard and C. M. Brown, Computer Vision. Prentice Hall PTR, 1982.
5. M. Pelillo, K. Siddiqi, and S. W. Zucker, "Matching hierarchical structures using association graphs," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, pp. 1105–1120, Nov. 1999.
6. Y. Park, Registration of linear structures in 3-D medical images. PhD thesis, Osaka University, Japan, Department of Informatics and Mathematical Science, Jan. 2002.
7. T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms. MIT Press, 1990.
Improving Virtual Endoscopy for the Intestinal Tract

Matthias Harders¹, Simon Wildermuth², Dominik Weishaupt², and Gábor Székely¹

¹ Swiss Federal Institute of Technology, Communication Technology Laboratory, ETH Zentrum, CH-8092 Zürich, Switzerland
² University Hospital Zurich, Institute of Diagnostic Radiology, Raemistrasse 100, CH-8091 Zürich, Switzerland
{mharders,szekely}@vision.ee.ethz.ch, {dominik.weishaupt,simon.wildermuth}@dmr.usz.ch
Abstract. We present a system that opens the way to applying virtual endoscopy to the small intestine. A high-quality image acquisition technique based on MR imaging as well as a haptically assisted interactive segmentation tool were developed. The system was used to generate a topologically correct model of the small intestine. The influence of haptic interaction on the efficiency of centerline definition has been demonstrated in a user study.
1 Introduction
The importance of radiologic imaging in the diagnosis of diseases of the intestinal tract has increased dramatically in recent years. One precursor of this development is virtual colonoscopy, a promising method for colorectal cancer screening. In the early 1990s, Vining et al. [10] were the first to report on the technical feasibility of virtual colonoscopy simulating conventional endoscopic examinations. Its advantages are increased patient comfort due to noninvasiveness, reduced cost, and reduced sedation time. Results from recent studies [1, 7] show its accuracy to be comparable to conventional colonoscopy for the detection of polyps of significant size. Nevertheless, virtual endoscopic evaluation of the intestines has so far been limited to the colon, although several diseases exist that also necessitate a radiologic exam of the small intestines - especially since the small bowel cannot be assessed completely by conventional methods. Virtual endoscopy of the small intestines is much more difficult than virtual colonoscopy because the tubular structure often follows a tortuous and curved path through 3D space. This makes accurate tracing of the geometry an extremely difficult task. Furthermore, the tightly folded structure is often sliced at an oblique angle, resulting in extreme deterioration of image quality as the tangential slicing direction is approached. Apart from these limitations, further general problems hinder a wide dissemination of virtual endoscopy of the intestinal tract as a primary population screening procedure. These include the relatively lengthy time required for data interpretation, poor patient compliance regarding bowel cleansing, and concerns over the CT radiation dose. Our current research is directed at solving these problems by using MR imaging for virtual endoscopy of the intestinal tract - especially the small bowel. To improve patient compliance, we propose a new concept with an oral approach that avoids the need for invasive intubation and is more acceptable to the patient. Furthermore, we enhance the image analysis process with interactive haptic segmentation methods.

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 20–27, 2002. © Springer-Verlag Berlin Heidelberg 2002
2 Medical Background
The prevalence of small bowel disease - the most common being Crohn’s Disease (chronic inflammatory bowel disease) and small bowel carcinoid tumor or tumor metastasis - is low, and the clinical diagnosis is complicated by nonspecific symptoms and a low index of suspicion. This frequently leads to delays in diagnosis and treatment. An accurate radiologic examination is, therefore, important not only for recognition of small bowel disease but also to help reliably document normal morphology [8]. The limitations of conventional enteroclysis (small bowel barium contrast x-ray) investigation, which needs invasive nasoduodenal intubation for contrast material application, have been recognized for a long time [4].
Fig. 1. Small intestine image data. (a) 2D slice view. (b) Thresholded 3D view.
Despite advances in fibre-optic endoscopy, the majority of the small bowel still remains inaccessible. Although recently developed small endoscopes allow true endoscopy of the duodenum and proximal jejunum, conventional and cross-sectional gastroenterologic imaging methods currently represent the only reliable technique for evaluating the small bowel. The functional information, soft-tissue contrast, direct multiplanar capabilities, and lack of ionizing radiation suggest that MR imaging has a greater potential than other techniques to become the ideal diagnostic method for imaging of the small bowel. After acquisition of the volumetric data with MR, the mesenteric small bowel has to be assessed by a radiologist. However, doing this on cross-sectional images is somewhat analogous to separating out and identifying a large number of writhing snakes in a crowded reptile tank. A more promising approach is to use virtual endoscopy techniques, which have been a major research focus in recent years. The creation of three-dimensional images with perspective distortion promises to advance diagnostics for the small bowel as well. Nevertheless, virtual endoscopy of the small bowel is much more difficult than virtual colonoscopy and cannot yet be performed with the currently available postprocessing tools. Manual path definition proves difficult in the sharp turns of the small bowel, and loss of orientation is the most obvious problem (Figure 1). The most crucial task for future integration of small bowel examination into clinical routine is the development of more reliable segmentation tools and path finding systems for virtual endoscopy of the small intestine. As a consequence, we aim at enhancing this process with a new interactive haptic tool for segmentation and centerline definition.
3 Image Acquisition
Driven by public concern about medical radiation exposure, we developed a robust, albeit complex, technique for high-quality MR imaging [11, 7]. Prior to MR imaging of the small bowel, patients were prepared by oral ingestion of four doses of stool softener spiked with a clinically used MR contrast agent, starting three hours before the examination. This mixture forms a viscous hydrogel within the intestinal lumen, giving good luminal distension, constant signal homogeneity, sufficient demarcation of the bowel content from surrounding tissues, and a low rate of artifacts, thus permitting non-invasive high-quality MRI of the small bowel. According to the reports of three volunteers and twelve patients, the oral mixture was well tolerated apart from slight abdominal discomfort and a sensation of fullness. Data acquisition was performed during breath-hold in the coronal plane, with the patient in a prone position. This near-isotropic volume acquisition strategy permits multiplanar and three-dimensional reconstructions. Because MR imaging remains a motion-sensitive technique, bowel peristalsis is reduced by intravenous administration of a spasmolytic drug. The availability of high-performance gradient systems allows for the acquisition of large data volumes within a single breath-hold [6], thereby eliminating respiratory motion artifacts. To assure data acquisition in apnea, imaging times are kept under 30 seconds, limiting the number of contiguous 2 mm sections to 48-64. The technique is based on the use of very short echo and repetition times, rendering most tissues, including fat, dark. Signal is evident only within regions containing T1-shortening contrast in a concentration sufficient to reduce T1 relaxation times below 50 ms [9].
4 Interactive Segmentation System
After acquiring the image data of the small intestines, the data sets have to be segmented into their major structural components before any high-level reasoning can be applied. As a consequence of the complex, tightly packed geometry of the small bowel, no method has been available so far that can reliably provide a topologically correct segmentation. Even manual identification of the organ outline on 2D slices, usually the last resort when no alternatives exist, proved to be inappropriate due to the difficulties discussed in the introduction. Therefore, we had to apply a new virtual reality-based interaction metaphor for semi-automatic segmentation of medical 3D volume data [2, 3]. The mouse-based, manual initialization of deformable surfaces in 3D represents a major bottleneck in interactive segmentation. In our multi-modal system we enhance this process with additional sensory feedback. A 3D haptic device is used to extract the centerline of a tubular structure. Based on the obtained path, a cylinder with varying diameter is generated, which in turn is used as the initial guess for a deformable surface. In the following sections we describe our approach in detail.

4.1 Data Preparation
The initial step of our multi-modal approach is the haptically assisted extraction of the centerline of a tubular structure. First we create a binarization of our data volume by thresholding. We have to emphasize that this step is not sufficient for a complete segmentation of the datasets we are interested in, due to the often low quality of the image data caused by unevenly distributed contrast agent, pathological changes, and partial volume effects. Nevertheless, in this initial step we are not interested in a topologically correct segmentation; we only need a rough approximation of our object of interest. For each voxel that is part of the tubular structure we compute the Euclidean distance to a voxel of the surrounding tissue. In the next step we negate the 3D distance map and approximate the gradients by central differences. Moreover, to ensure the smoothness of the computed forces, we apply a 5×5×5 binomial filter. This force map is precomputed before the actual interaction to ensure a stable force update. Because the force vectors are located at discrete voxel positions, we have to perform trilinear interpolation to obtain the continuous gradient force map needed for stable haptic interaction. Furthermore, we apply a low-pass filter in time to further suppress instabilities.

4.2 Centerline Extraction
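As a concrete sketch of the force-map precomputation described in Sect. 4.1, which supplies the guiding forces used during ridge following (NumPy/SciPy; all function names are illustrative, and the 5×5×5 binomial filter is applied separably):

```python
import numpy as np
from scipy import ndimage

def _binomial_smooth(vol):
    """Separable 5x5x5 binomial filter ([1 4 6 4 1]/16 along each axis)."""
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    for axis in range(3):
        vol = ndimage.convolve1d(vol, kernel, axis=axis, mode='nearest')
    return vol

def precompute_force_map(binary):
    """Precompute the gradient force map from a thresholded (binary) volume:
    Euclidean distance to the surrounding tissue, negated, differentiated by
    central differences, and smoothed."""
    dist = ndimage.distance_transform_edt(binary)
    neg = -dist                    # centerline becomes a valley of minima
    gz, gy, gx = np.gradient(neg)  # central differences
    return np.stack([_binomial_smooth(g) for g in (gz, gy, gx)])

def sample_force(forces, p):
    """Trilinearly interpolate the force field at a continuous position
    p = (z, y, x), as needed for a stable haptic update."""
    coords = np.asarray(p, dtype=float)[:, None]
    return np.array([ndimage.map_coordinates(f, coords, order=1)[0]
                     for f in forces])
```

The temporal low-pass filtering mentioned in the text would be applied on top of `sample_force` at the haptic update rate.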
The goal of the centerline extraction process is to identify the ridge line through the resulting distance map. As in most object identification tasks, the basic problem is to ensure the connectivity of the result by closing gaps in areas where the ridge is less pronounced. Haptic feedback proved to be a very efficient and intuitive metaphor to solve this problem. In the optimal case of good data quality, the user “falls through” the data set, guided along the 3D ridge created by the forces. While moving along the path, control points are set, which are used to approximate the path with a spline. In regions with less clear image information, an expert can use his knowledge to guide the 3D cursor through fuzzy or ambiguously defined areas by exerting force on the haptic device to actively support path definition.

4.3 Segmentation
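The spline approximation of the picked control points can be sketched as follows; the text does not specify the spline type, so a Catmull-Rom interpolant (which passes through every control point) is assumed here:

```python
import numpy as np

def catmull_rom(points, samples_per_segment=10):
    """Interpolate a smooth path through 3D control points set during
    haptic interaction (Catmull-Rom sketch; the spline type is an
    assumption, not taken from the paper)."""
    pts = np.asarray(points, dtype=float)
    # Duplicate the endpoints so the curve covers the first/last segment
    pts = np.vstack([pts[0], pts, pts[-1]])
    path = []
    for i in range(1, len(pts) - 2):
        p0, p1, p2, p3 = pts[i - 1], pts[i], pts[i + 1], pts[i + 2]
        for t in np.linspace(0.0, 1.0, samples_per_segment, endpoint=False):
            t2, t3 = t * t, t * t * t
            path.append(0.5 * ((2 * p1) + (-p0 + p2) * t
                               + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
                               + (-p0 + 3 * p1 - 3 * p2 + p3) * t3))
    path.append(pts[-2])  # include the final control point
    return np.array(path)
```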
The next step is to use the extracted centerline to generate a good initialization for a deformable surface model. To do this, we create a tube around the path with varying thickness according to the precomputed distance map. This object is then deformed subject to a thin-plate-under-tension model. Assuming position-independent damping and homogeneous material properties, and using discrete approximations of the differential operators, we can use Gauss-Seidel iteration to solve the resulting system of Euler-Lagrange equations:

γv_t − τ∆v + (1 − τ)∆²v = −δP/δv

Due to the good initialization, only a few steps are needed to approximate the desired object. The path initialization can be seen in Figure 2(a). Note that the 3D data is rendered semi-transparently to visualize the path in the lower left portion of the data. Figure 2(b) depicts the surface model during deformation.
Fig. 2. Interactive Segmentation. (a) Initialized path. (b) Deforming tube.
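The Gauss-Seidel relaxation of the discretized Euler-Lagrange equation can be illustrated on a closed 1D contour: the stencil below discretizes γv − τ∆v + (1 − τ)∆²v = f with unit spacing and periodic boundaries (a sketch under these assumptions, not the authors' implementation):

```python
import numpy as np

def gauss_seidel_contour(f, gamma=1.0, tau=0.5, sweeps=200):
    """Solve gamma*v - tau*Lap(v) + (1-tau)*Lap^2(v) = f on a closed 1D
    contour by Gauss-Seidel sweeps. The operator is symmetric positive
    definite for gamma > 0, so the sweeps converge."""
    n = len(f)
    v = np.zeros(n)
    a0 = gamma + 2.0 * tau + 6.0 * (1.0 - tau)  # diagonal coefficient
    a1 = -tau - 4.0 * (1.0 - tau)               # coefficient of v[i-1], v[i+1]
    a2 = 1.0 - tau                              # coefficient of v[i-2], v[i+2]
    for _ in range(sweeps):
        for i in range(n):
            off = (a1 * (v[(i - 1) % n] + v[(i + 1) % n])
                   + a2 * (v[(i - 2) % n] + v[(i + 2) % n]))
            v[i] = (f[i] - off) / a0
    return v
```

In the actual surface model the same relaxation is applied per coordinate on the tube mesh, with f playing the role of the image force −δP/δv.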
In order to further improve the interaction with complicated data sets, we adopt a step-by-step segmentation approach by hiding already segmented loops. This allows a user to focus his attention on the parts that still have to be extracted. For this purpose we have to turn the 3D surface model back into voxels, which has to happen fast enough to maintain real-time interaction. To achieve this goal we make use of the graphics hardware by implementing a z-buffer based approach as described in [5]. This process is shown in Figure 3.
Fig. 3. Hiding segmented parts. (a) Voxelization. (b) Removed segmented part.
4.4 System Evaluation
We carried out an initial test study to evaluate the influence of haptic interaction on the performance of centerline extraction and the following segmentation. The experiment followed a within-subjects repeated-measures design. Five participants took part in the study, only one had used a haptic device before.
Fig. 4. Centerline extraction. (a) Start of process. (b) Complete extraction.
Subjects were introduced to the interactive segmentation tool and were shown how to set a centerline with the system. Also, to familiarize the subjects with force-feedback, we presented haptic rendering of the surface of the voxel objects, based on data gradients. Each subject carried out the experiment under two conditions: without and with haptic enhancement for centerline tracing. The segmentation task was performed on an artificial and a real data set. The performance measure was the interaction time in seconds for model initialization. After setting the path, a 3D deformable surface was initialized and deformed without user interaction. The effect of the path quality on the segmentation process was also examined in our study. The data samples taken are paired, allowing an analysis of differences to be undertaken. The distributions were successfully tested for normality, thus allowing the use of a paired t-test. A summary of the acquired data is shown in Table 1. This initial study has shown that there is a statistically significant performance improvement in the trial time (t = 3.59, df = 9, p ≤ 0.007) when using haptically enhanced interaction in 3D segmentation. Also, in the haptic condition the quality of segmentation was always superior to the one without force-feedback. In seven out of ten cases, the deformable surface initialized based only on visual feedback collapsed in parts of the structure, thus requiring additional user interaction. This is due to imprecise initialization of the centerline, which causes the deformable model to fail to automatically extract the object of interest in poorly initialized regions. Subjects reported that 3D positioning was substantially facilitated by the force-feedback. Moreover, although most of the participants expressed a need for longer training in haptic interaction itself, all of them were already able to take advantage of the technology. Generally, subjects stated that in the haptically assisted condition they mainly focused on using the forces for guidance, while the visual feedback was only used for fine-tuning.

Table 1. Results of initial study (initialization time).

                     Visual only   With haptics
Mean trial time (s)        147.0           73.1
Standard deviation          85.6           37.5
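The paired t-test used above can be reproduced as follows; the per-trial times were not published, so the two arrays here are purely hypothetical stand-ins:

```python
import math

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two conditions
    measured on the same subjects."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # unbiased variance
    return mean / math.sqrt(var / n), n - 1

# Hypothetical per-trial times in seconds (10 trials); the individual
# measurements behind Table 1 are not published.
visual = [260, 90, 150, 300, 85, 120, 210, 60, 95, 100]
haptic = [120, 55, 70, 140, 50, 60, 95, 40, 45, 56]
t, df = paired_t(visual, haptic)
```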
5 Results
Three healthy volunteers (without any history of gastrointestinal disease or surgery) and twelve patients (under evaluation for small bowel obstruction or chronic inflammatory bowel disease) participated in this preliminary study. We were able to use our system to obtain the centerline through the small intestines. Figure 4(a) shows the start of the process and Figure 4(b) displays the final outcome. Note that some of the shorter sections in the first image were combined into longer ones.
6 Conclusions
We have shown a new approach to generating computer models for virtual endoscopy based on MR image acquisition and haptically enhanced interactive segmentation. We acquired high-quality images of the small intestines and used our system to completely segment the small bowel - to the best of our knowledge, this has not been achieved before. Whether our approach will readily replace currently used methods of small bowel imaging will depend on how it can be integrated into the clinical setting in a manner that is practical and acceptable to patients, referring clinicians, and surgeons. To become the primary method for investigation of small-bowel disease, MR imaging will have to provide reliable evidence of normalcy, allow diagnosis of early or subtle structural abnormalities, and influence treatment decisions in patient care. Further research and experience will help clarify whether our approach should be the primary method for investigation of the small bowel or used only as a problem-solving examination. Preliminary clinical investigations using the described system have given rise to the following recommended improvements: increasing spatial and temporal resolution in MR imaging of the small bowel to achieve true isotropic imaging, assessing the ideal timing between intake of the oral contrast agent and imaging, optimizing small bowel distension, and further refining the current tools for segmentation and path definition. Nevertheless, the developed imaging, segmentation, and navigation methods have already opened a way for the extension of virtual endoscopy investigations to the whole intestinal tract.
Acknowledgment This work has been performed within the framework of the Swiss National Center of Competence for Research in Computer Aided and Image Guided Medical Interventions (NCCR CO-ME), supported by the Swiss National Science Foundation.
References
1. H.M. Fenlon, D.P. Nunes, P.C. Schroy, M.A. Barish, P.D. Clarke, and J.T. Ferrucci. A comparison of virtual colonoscopy and conventional colonoscopy for the detection of colorectal polyps. N Engl J Med, pages 1496–1503, 1999.
2. M. Harders and G. Székely. Improving medical segmentation with haptic interaction. In IEEE Computer Society Conf. on Virtual Reality, 2002.
3. M. Harders and G. Székely. New paradigms for interactive 3D volume segmentation. Journal of Visualization and Computer Animation, 2002.
4. H. Herlinger and D.D.T. Maglinte. Clinical radiology of the small intestine, pages 41–44, 1989.
5. E.-A. Karabassi, G. Papaioannou, and T. Theoharis. A fast depth-buffer-based voxelization algorithm. Journal of Graphics Tools, 4(4):5–10, 1999.
6. D.A. Leung, G.C. McKinnon, C.P. Davis, T. Pfammatter, G.P. Krestin, and J.F. Debatin. Breathheld contrast-enhanced 3D MR angiography. Radiology, 1996.
7. W. Luboldt, P. Bauerfeind, S. Wildermuth, B. Marincek, M. Fried, and J.F. Debatin. Colonic masses: detection with MR colonography. Radiology, 2000.
8. D.D.T. Maglinte, K. O’Connor, J. Bessette, S.M. Gernish, and F.M. Kelvin. The role of the physician in the late diagnosis of primary malignant tumors of the small intestine. American Journal of Gastroenterology, 86:304–308, 1991.
9. M.R. Prince. Gadolinium-enhanced MR aortography. Radiology, 1994.
10. D.J. Vining. Virtual endoscopy: Is it. Radiology, pages 30–31, 1996.
11. S. Wildermuth and J.F. Debatin. Virtual endoscopy in abdominal MR imaging. Magn Reson Imaging Clin N Am., pages 349–364, 1999.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 28–35, 2002. © Springer-Verlag Berlin Heidelberg 2002

Finding a Non-continuous Tube by Fuzzy Inference

C. Yasuba et al.
Level-Set Based Carotid Artery Segmentation for Stenosis Grading

C.M. van Bemmel, L.J. Spreeuwers, M.A. Viergever, and W.J. Niessen
Image Sciences Institute, Room E 01.334, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
{kees,luuk,max,wiro}@isi.uu.nl
Abstract. A semi-automated method is presented for the determination of the degree of stenosis of the internal carotid artery (ICA) in 3D contrast-enhanced (CE) MR angiograms. Hereto, we determined the central vessel axis (CA), which subsequently is used as an initialization for a level-set based segmentation of the stenosed carotid artery. The degree of stenosis is determined by calculating the average diameters of cross-sectional planes along the CA. For twelve ICAs the degree of stenosis was determined and correlated with the scores of two experts (NASCET criterion). The Spearman’s correlation coefficient for the proposed method was 0.96 (p

FV(x) = 0, if λ2 > 0 or λ3 > 0; otherwise FV(x) = max over σmin ≤ σ ≤ σmax of v(x, σ),  (5)

where

v(x, σ) = (1 − e^(−RA²/(2α²))) · e^(−RB²/(2β²)) · (1 − e^(−S²/(2γ²))).  (5a)
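A single-scale evaluation of the vesselness measure of Eqs. (5)/(5a) can be sketched from the eigenvalues of the scale-space Hessian (NumPy/SciPy; the eigenvalue ordering and σ²-normalization are standard choices and assumptions here, not taken from the paper):

```python
import numpy as np
from scipy import ndimage

def vesselness(image, sigma, alpha=0.5, beta=0.5, gamma=25.0):
    """Single-scale vesselness v(x, sigma) from the eigenvalues of the
    Hessian at scale sigma, for bright tubes on a dark background."""
    H = np.empty(image.shape + (3, 3))
    for i in range(3):
        for j in range(3):
            order = [0, 0, 0]
            order[i] += 1
            order[j] += 1
            # Second derivative at scale sigma, sigma^2-normalized
            H[..., i, j] = sigma ** 2 * ndimage.gaussian_filter(
                image, sigma, order=order)
    # Sort eigenvalues by increasing magnitude: |l1| <= |l2| <= |l3|
    lam = np.linalg.eigvalsh(H)
    idx = np.argsort(np.abs(lam), axis=-1)
    lam = np.take_along_axis(lam, idx, axis=-1)
    l1, l2, l3 = lam[..., 0], lam[..., 1], lam[..., 2]
    eps = 1e-10
    Ra = np.abs(l2) / (np.abs(l3) + eps)               # plate vs. line
    Rb = np.abs(l1) / np.sqrt(np.abs(l2 * l3) + eps)   # blob vs. line
    S = np.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)           # structure strength
    v = ((1 - np.exp(-Ra ** 2 / (2 * alpha ** 2)))
         * np.exp(-Rb ** 2 / (2 * beta ** 2))
         * (1 - np.exp(-S ** 2 / (2 * gamma ** 2))))
    v[(l2 > 0) | (l3 > 0)] = 0.0   # the first case of Eq. (5)
    return v
```

The multi-scale response FV(x) would then be the voxel-wise maximum of `vesselness` over the scale range.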
The parameters α, β, and γ tune the sensitivity of the filter to deviations in RA, RB, and S, respectively. The filter is applied at multiple scales that span the expected vessel widths.

4. Combined speed function. Since the speed terms mentioned above have different properties, a combined speed function can be composed by multiplying them:

F = FI · F∇ · FV,  (6)

where a speed term is set to 1 if it is not included. All speed terms are normalized, so their values are in the range [0, 1].

2.3 Vessel Quantification
The degree of stenosis according to the NASCET criterion [2] is given by (see also Figure 2):

(1 − Minimal Residual Lumen / Distal ICA Lumen Diameter) · 100%.  (7)
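Given a diameter profile sampled along the CA, Eq. (7) reduces to a one-liner (the profile values below are purely illustrative):

```python
def nascet_stenosis(diameters, distal_diameter):
    """Degree of stenosis (Eq. 7): relative reduction of the minimal
    residual lumen with respect to the distal ICA lumen diameter."""
    return (1.0 - min(diameters) / distal_diameter) * 100.0

# Hypothetical diameter profile (mm) along the resampled central axis;
# the distal reference diameter is taken beyond the stenosis.
profile = [5.5, 5.2, 2.1, 4.8, 5.0]
grade = nascet_stenosis(profile, distal_diameter=5.0)  # about 58% stenosis
```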
This measure is defined for DSA data, which are projection images. In the method we propose, the degree of stenosis is determined from cross-sectional MR slices. In order to obtain a degree of stenosis comparable to the NASCET criterion, we used the average diameter of the cross-sectional planes along the CA. In this study 12 stenosed carotid arteries were screened. For all carotid arteries, both a DSA dataset (consisting of three projections: posteroanterior, oblique, and lateral) and a CE-MRA dataset were available.
Fig. 2. Schematic view of the linear lumen reduction measuring method according to the NASCET stenosis criterion ((1 − a/b) · 100%) used for the internal carotid artery.
3 Results

3.1 Central Axis Determination
In order to determine the CA, the vesselness image is computed at 25 scales (exponentially increasing) in the range σ = 0.25–7.5 mm. For the vessel enhancement, the parameters α and β were both fixed at 0.5, while γ equals 25% of the maximum occurring pixel value in the 3D dataset. In case one of the eigenvalues is large, S will be large; the output of this filtering process is rather insensitive to the value of γ. In all datasets, the CA was everywhere located inside the lumen and could be used as the initialization for the level-set based segmentation.

3.2 Level-Set Based Vessel Segmentation
Vessel segmentation is achieved via level-set techniques using the CA as initialization. Hereto, we implemented Equation 2 using a simple Euler forward scheme with time step ∆t = 0.1. We tested the influence of the different speed functions given by Equations 3 through 6 separately. The gradient-based speed image was computed using σgrad = 0.75 mm, which is a trade-off between noise suppression on the one hand and taking the width of the ICA into account on the other. The parameters of the vesselness-based speed image were equal to those used for the CA determination (see Section 3.1). It was found that the segmentation was most robustly estimated using a combination of the speed terms. Therefore, the evaluation on all datasets was carried out by evolving a front utilizing the CA as initialization and the speed function given by F = FI · F∇ · FV. In Figure 3 a typical segmentation and a diameter-vs-length plot are shown.

3.3 Stenosis Grading
Expert grading of the DSA images was performed by averaging the scores from all available projections without vessel over-projection. Quantification of CE-MR angiograms was done by two experts by averaging the degree of stenosis computed from MIPs in posteroanterior, oblique, and lateral views without vessel over-projection. The same ICAs were graded with the level-set based technique by determining the average diameter of cross-sectional planes along the CA, which was resampled every 0.5 millimeter. Table 1 shows the results of the comparison between DSA and CE-MRA for the two experts and the level-set based technique. The correlation coefficient indicates a better agreement with DSA for the level-set based technique than for the experts. Figure 4 shows the linear regression with the 95% confidence intervals (0.89, 0.88, and 0.96 for expert I, expert II, and the level-set based method, respectively).

Fig. 3. Maximum Intensity Projection (MIP) of a 3D CE-MR angiogram of the ICA (left) with corresponding segmentation (middle) and diameter-vs-length plot (right), from which the stenosis grade can be determined.

Fig. 4. CE-MRA vs DSA. Degree of stenosis measured in 12 carotid arteries. Linear regression for expert I (left), expert II (middle), and the level-set based technique (right). Dashed lines indicate 95% confidence. It can be observed that the semi-automatic method correlates better with the gold standard provided by DSA. Moreover, the bias introduced by the method is smaller and the confidence bounds are tighter.
4 Discussion
A method has been presented for segmentation of the ICA, based on level-set techniques. By using the CA as initialization, the method is well suited for segmenting vascular structures, since the initialization is everywhere near the vessel wall. The method has been applied to carotid artery stenosis grading in CE-MRA data and compared to measurements made by clinical experts. The results show that the presented method correlates better (Spearman’s correlation
coefficient 0.96 (p

50%), whereas the larger portion of a type II myoma is in an intramural location, corresponding to an obtuse angle. The type of myoma has implications for the hysteroscopy, as only type 0 and some type I myomas are regarded as safely resectable in one session; the resection should never extend further than the inner border of the myometrium [3]. Leiomyomas may be solitary or multiple, and over 90% are found in the uterine corpus. Five percent arise in the cervix, and a smaller number are found in the broad ligament. The size of a myoma can vary from a pearl to a melon. A uterus with multiple myomas may even give the impression of a sack filled with different-sized potatoes [14]. Despite the amount of research in this area, the exact etiology of myomas is not known. It is assumed that the genesis is initiated by regular muscle cells with increased growth potential and that the growth of myomas is driven by estrogen. They are thus related to the function of the ovaries. Therefore myomas do not appear before puberty and do not emerge after menopause, when already existing myomas even tend to shrink. In general they grow slowly but continuously until the beginning of menopause [6]. An increase in volume by a factor of two usually takes several months or years. Slow-growing myomas tend to be squeezed out by the healthy surrounding muscular meshes. Therefore, they seem to migrate over months or years towards the inner surface (endometrium, submucosal) or towards the outer surface (serosa, subserosal). Fast-growing myomas, which are potentially malignant, tend to overwhelm this process by stretching out and thinning the healthy surrounding myometrium. They are able to completely deform the organ’s appearance. A myoma has a much stronger tendency to keep its shape than any of the tissues surrounding it, as it is composed of very dense fibrotic tissue.
Therefore, the myoma will be able to grow almost independently from its surroundings by keeping a spherical shape. This holds for both the intramural and the submucosal myoma. The surrounding tissue consists of clustered myometrium. There is no actual capsule around the myoma. The tissue of the myoma as well as the surrounding tissue of the myometrium have a layered structure. This often simplifies the resection of the myomas as they can be peeled off the myometrium [14]. The endometrium is a highly reactive tissue covering the whole uterine cavity as well as protruding myomas of any degree. Therefore the endometrium defines the myomas’ visual appearance. The myometrium is an active muscular mesh which exhibits slow waves of contractions. This mechanism extrudes any tumor or foreign body affecting the uterine cavity and finally leads to pedunculated myomas.
Generation of Pathologies for Surgical Training Simulators
Table 1. List of required and neglected features
Required:
- realistic shape
- fully automatic generation
- randomness
- provide information for: texturing, blood perfusion, biomechanical modeling
- incorporation in organ model

Neglected:
- exact cellular interaction [15,7]
- stability of growth [1]
- patient-specific modeling [18]
- biomechanical deformation of surrounding tissue [12]
- observation of the growing process [9]
For a diagnostic description of a myoma the physician will specify: its location within the uterus (fundal, corporal, or cervical), the degree of protrusion into the cavity (type 0, type I, or type II), and the lengths of the three main axes [2].
3 Modeling Requirements
The generation of pathologies for surgical simulators has to fulfill a number of application-specific requirements, which are summarized in Table 1. Tumor growth has previously been modeled with different objectives in mind, but most of the features of these models can be neglected in this application. The main property needed in a surgical simulator is a realistic appearance of the pathology. A surgical training scenario will be configured by a physician and should not need any additional interaction with a simulator expert. Therefore the generation process has to be fully automatic after initialization. The physician has to be able to specify a desired pathology in medical terminology, i.e. by specifying the pathology type and optionally the size and position. An alternative to defining the size is specifying the tumor’s age. The actual generation procedure can be computed off-line, and no modifications of the tumor size or position are needed during simulation. That is, the pathology itself does not change during the intervention, although it might of course be altered by the trainee, for example by cutting. With respect to the implementation, the goal is to incorporate a wide range of variations in size, geometry, and position within one framework. This demands a nondeterministic model. Once the shape of the pathology has been generated, additional properties needed by the simulator have to be added, such as texturing, blood perfusion, and biomechanical properties. At the end of the generation process, the tumor has to be seamlessly incorporated into the organ model.
4 Implementation
Cellular automata are well suited for the modeling of pathologies, as they imitate the development simply by applying the same set of rules multiple times.
R. Sierra, G. Székely, and M. Bajka
Once a cellular automaton is able to generate a pathology in one of its most developed stages, any of the intermediate stages can be obtained with no additional effort. The main advantages of using a cellular automaton are the simplicity and extendibility of the implementation. Rules can easily be added, removed, or modified. Computational stability is intrinsically a part of a cellular automaton [5]. The synthesis of a cellular automaton is equivalent to seeking a minimal set of rules that allows a certain behavior to be modeled. There will always be a trade-off between realism and tractability of the model. Real myomas consist of millions of cells, which is more than can reasonably be modeled in a computer simulation. An exact modeling of single cells also implies the use of a non-regular mesh [9]. This approach may be used to model exactly the behavior of a tumor in its very early stage, but it will not be manageable at the orders of magnitude considered. Thus, the actual value in a node of the cellular automaton can be regarded as a cell conglomeration rather than as a single cell. The implemented cellular automaton comprises two cell types and the background, or cell-free space. The first cell type is the tissue, which consists of the muscle cells of the myometrium; the second cell type represents the tumor cells. The background describes the uterine cavity. Additional factors can be added, e.g. to model the influence of the contraction of the uterine muscles. A single node with a tumor component is enough to initiate the growing process of a myoma. The cellular automaton is defined by a regular, three-dimensional, cubic lattice L, an interaction neighborhood template Nb, the set of elementary states E, and the local space- and time-independent transition rules Ri. The local rules are either probabilistic or deterministic. Since a cell of the automaton does not model a biological cell, the term node is used to avoid misinterpretation.
A node is specified by its position p = (x, y, z) in the lattice. The neighborhood template Nb specifies the nodes that influence the state of the node under scrutiny. In the automaton described, Nb is rule-dependent and can be either a 6-neighborhood (N6, von Neumann neighborhood) or a 26-neighborhood (N26, Moore neighborhood). Whenever possible the smaller neighborhood was selected. The set of elementary states Etumor for the tumor is bounded above by 1 and defined as multiples of the predefined step ∆ = 1/n, n ∈ ℕ, whereas the set Etissue is represented by the floating point values in the range [0, 1]. With each node, a tumor and a tissue channel can be associated, ctumor(p) and ctissue(p); thus tumor and tissue do not exclusively occupy a node. The idea is that only nodes with a value of ctumor(p) = 1 are considered to be part of the tumor, while any value smaller than 1 indicates a reactive shell around it. The tissue is initialized with values around 0.5, decreasing towards the surface. At least one node needs a tumor component with a value t0 ≥ ∆ for the growing process to start. The strict concept of a cellular automaton is relaxed to allow the integration of global knowledge in two aspects: global cost functions can be evaluated and
Generation of Pathologies for Surgical Training Simulators
207
a global rule Rglobal controlling the application of the single rules Ri is introduced. Different rules Ri are applied sequentially in a loop, but each rule is applied synchronously on every node. The application of a rule Ri is the transition from time step t to t + 1 with R : E^ν → E, where ν = |Nb|. A first rule Rgrow determines the growing process of the tumor:

Rgrow:   ctumor(p)^(t+1) = min(1, ctumor(p)^t + ∆)
A node will receive a tumor component with a certain probability if one of its neighbors in N26 has a tumor component. If the neighbor is a direct one, i.e. in N6, the probability of becoming a tumor node is close to one; otherwise the probability is much smaller. Once a tumor component is in a node, it is continuously incremented by the same rule. Three objectives are modeled with this rule: the spherical shape of the myoma, the inhomogeneity of the surface, and the reactive shell around the tumor. A global cost function for the current tumor position is computed: for each node with a tumor component, the amount of tissue as well as the gradient value at its neighbors in N6 are accumulated separately, yielding six overall costs Ci, one per direction. The rule Rmoving then moves the tumor in the direction d corresponding to the optimal cost min(Ci):

Rmoving:   ctumor(p)^(t+1) = ctumor(p − d)^t
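A minimal sketch of Rgrow and Rmoving on a NumPy lattice. The probabilities and ∆ are illustrative placeholders (the paper gives no values), only the N6 seeding case is shown (the smaller-probability N26 diagonal case is omitted), and `np.roll` wraps at the boundary, which the paper's automaton does not:

```python
import numpy as np

rng = np.random.default_rng(0)

def r_grow(c_tumor, delta=0.05, p_face=0.95):
    """One synchronous application of R_grow (illustrative probabilities)."""
    new = c_tumor.copy()
    has_tumor = c_tumor > 0
    # nodes with at least one N6 (face) neighbor carrying a tumor component
    face = np.zeros_like(c_tumor, dtype=bool)
    for axis in range(3):
        for shift in (1, -1):
            face |= np.roll(has_tumor, shift, axis=axis)
    # seed new tumor components next to existing ones, with probability p_face
    seed = (~has_tumor) & face & (rng.random(c_tumor.shape) < p_face)
    new[seed] = delta
    # existing components keep growing by the same rule, capped at 1
    new[has_tumor] = np.minimum(1.0, c_tumor[has_tumor] + delta)
    return new

def r_moving(c_tumor, d):
    """Shift the tumor field one node along d: c(p)^{t+1} = c(p - d)^t."""
    return np.roll(c_tumor, shift=d, axis=(0, 1, 2))
```

With `p_face=1.0` the seeding becomes deterministic, which is convenient for checking that one application turns a single seed node into the node plus its six face neighbors.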
A third rule Radaption models the adaption of the surrounding tissue to the new situation. In a first pass the displacement of the tissue introduced by the growing tumor is modeled. In this rule the incremental property of Rgrow is used:

Radaption:   ctissue(p)^(t+1) = min(1, ctissue(q*)^t + ctissue(p)^t)
             ctissue(q*)^(t+1) = 0,   where q* = arg max over q ∈ N26 of ctumor(q)
In a second pass the displacement is propagated into the surrounding area by smoothing the tissue state. This could be represented by an additional rule, but for simplicity Gauss filters of variable length are used in the actual implementation. A final rule Rclose closes the covering hull of the myoma so that no node with a tumor component ever touches the background in N26. This rule is introduced to ensure that the endometrium always covers the tumor, but it can be skipped if the relaxation area is large enough, e.g. by applying Radaption several times. The global rule Rglobal defines which rule Ri is applied. This rule is time-dependent, so that the sequence of rules Ri applied changes during evolution. This allows a faster movement of the tumor to be modeled while it is small. As the tumor size increases, the rule Rmoving is applied less often to model a slower motion. As soon as the tumor is pedunculated, this rule can once again be applied more frequently.
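The time-dependent scheduling by Rglobal could be sketched as follows. The thresholds and the every-5th-step cadence are our own placeholders, not values from the paper; only the qualitative behavior (move often while small or pedunculated, rarely in between) mirrors the text:

```python
def r_global(step, tumor_volume, target_volume, pedunculated=False):
    """Return the rules to fire at this step (illustrative schedule)."""
    rules = ["grow"]
    # move every step while the tumor is small, only every 5th step once it
    # is large, and frequently again once the myoma is pedunculated
    if pedunculated or tumor_volume < 0.1 * target_volume or step % 5 == 0:
        rules.append("move")
    rules += ["adapt", "close"]
    return rules
```

The main loop would then simply apply the returned rules in order at each time step and exit once the tracked volume reaches the physician-specified target.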
Fig. 1. Artificial myoma after 15, 25, 35 and 43 iterations.

Fig. 2. Comparison of a real (left) and two artificial myomas.
As stated in Section 2, the physician will specify the myoma by its location, protrusion and size, so an explicit time-volume relation cannot be used. Therefore, the Gompertz model for tumor growth, which has been proposed on different occasions [9,15], is not suitable for this application. Through adaptation of the global rule Rglobal the desired myoma can be generated. By counting the number of applications of rule Rgrow and multiplying this number by the respective probability for N6, one can keep track of the volume. Once the desired value is reached, the procedure exits the main loop.
5
Results
This very simple model is sufficient to produce highly satisfactory results and fulfill the requirements of a simulator. Figure 1 shows an exemplary sequence of the growing myoma. The procedure is fully automatic and does not need any interaction with a physician. The volume in which the tumor was grown consisted of a cubic lattice with 100³ nodes. For the conversion to a surface model the marching cubes algorithm was used [13]. No additional factors were used, and the global rule states that the moving rule is applied less often as the tumor grows. Figure 2 shows an image of a real myoma on the left, which was taken during hysteroscopy. The two following images are synthetic myomas generated with the cellular automaton described. Texture images were taken from real hysteroscopies and mapped onto the artificial surfaces. The visual inspection shows a high degree of realism and proves the approach to be suitable for the random
generation of myomas for hysteroscopy simulation. This and more examples are available for download at http://www.vision.ee.ethz.ch/~rsierra/miccai. The differentiation between tumor and normal tissue is needed both for the incorporation of the vascularization and for the biomechanical properties. The vascularization is confined to the healthy tissue around the myoma. Different biomechanical properties can be assigned to the tumor and the tissue, so that the pathology consists of a stiff inner sphere surrounded by a softer tissue layer. The structures that can be generated by the proposed cellular automaton are not limited to myomas in the uterine cavity. The automaton can easily be tuned to produce other structures where a growing object is more rigid than the surrounding media. Validation of the resulting myomas is a major task, as it is in general for any training system. To our knowledge there are currently no other systems generating artificial pathologies for surgical training devices that could serve as a reference. The structures described have been analyzed subjectively by experienced gynecologists, who attested to a very high visual resemblance to actual cases. In the future, validation will be twofold. On the one hand, the training system will be evaluated, which also entails the behavioral and visual aspects of the myoma. The definition of useful metrics for measuring the resemblance as well as the training efficiency is a preliminary task towards objective measurements. On the other hand, the growing process of tumors will be investigated more deeply and compared with the cellular automaton described.
6
Conclusion and Future Research
A cellular automaton that is able to generate submucosal myomas has been described. The generated tumors show a high resemblance to real cases and can be seen as a step forward in the creation of high-fidelity simulators. Future research will, on the one hand, address the generation of other pathologies and, on the other hand, incorporate the existing tumor generation model in more complex situations. In the next step, the possible pathologies will be extended to incorporate surgically relevant degenerations of the uterus. The incorporation of the existing model in the fundus and corpus of the uterus is straightforward. Close to the fallopian tubes, where the myometrium is much thinner, a myoma will deform more than one surface at a time. In such cases, the incorporation into the organ model has to be investigated further. The generation of a vascular system for the pathology is closely related to the pathology itself. In the future we plan to merge the generation of the pathology with the generation of the vascular system [16].

Acknowledgments. This work has been performed within the frame of the Swiss National Center of Competence in Research on Computer Aided and Image Guided Medical Interventions (NCCR CO-ME) supported by the Swiss National Science Foundation.
References
1. J. Adam. A simplified mathematical model of tumor growth. Math. Biosci., 81:229–244, 1986.
2. M. Bajka. Empfehlungen zur Gynäkologischen Sonographie. Schweizerische Gesellschaft für Ultraschall in der Medizin, 2001.
3. P. Brandner, K. Neis, and P. Diebold. Hysteroscopic resection of submucous myomas. Contrib Gynecol Obstet., 20:81–90, 2000.
4. Cootes et al. Active shape models – their training and application. Computer Vision and Image Understanding, 61(1):38–59, 1995.
5. S. Dormann. Pattern Formation in Cellular Automaton Models. PhD thesis, Universität Osnabrück, August 2000.
6. F. H. Netter. Farbatlanten der Medizin, Band 3: Genitalorgane. Georg Thieme Verlag, Stuttgart, New York, second edition, 1987.
7. H. Greenspan. On the growth and stability of cell cultures and solid tumors. J. theor. Biol., 56:229–242, 1976.
8. A. Heuck and M. Reiser. Abdominal and Pelvic MRI. Springer, 2000.
9. A. Kansal et al. Simulated brain tumor growth dynamics using a three-dimensional cellular automaton. J. theor. Biol., 203:367–382, 2000.
10. Kelemen et al. Elastic model-based segmentation of 3-D neuroradiological data sets. IEEE Transactions on Medical Imaging, 18(10):828–839, 1999.
11. C. Kuhn. Modellbildung und Echtzeitsimulation deformierbarer Objekte zur Entwicklung einer interaktiven Trainingsumgebung für Minimal-Invasive Chirurgie. Forschungszentrum Karlsruhe GmbH, Karlsruhe, 1997.
12. S. Kyriacou et al. Nonlinear elastic registration of brain images with tumor pathology using a biomechanical model. IEEE Transactions on Medical Imaging, 18(7):580–592, 1999.
13. W. Lorensen and H. Cline. Marching cubes: A high resolution 3D surface construction algorithm. Computer Graphics, 21(4):163–170, July 1987.
14. Pschyrembel, Strauss, and Petri. Praktische Gynäkologie für Studium, Klinik und Praxis. de Gruyter, Berlin, New York, fifth edition, 1990.
15. A. Qi et al. A cellular automaton model of cancerous growth. J. theor. Biol., 161:1–12, 1993.
16. D. Szczerba. Macroscopic modelling of vascular systems. Submitted to MICCAI, 2002.
17. K. Wamsteker, M. Emanuel, and J. de Kruif. Transcervical hysteroscopic resection of submucous fibroids for abnormal uterine bleeding: Results regarding the degree of intramural extension. Obstet Gynecol, 82:736–740, 1993.
18. R. Wasserman and R. Acharya. A patient-specific in vivo tumor model. Math. Biosci., 136:110–140, 1996.
19. LASSO Project. http://www.vision.ee.ethz.ch/projects/Lasso/start.html, 2001.
Collision Detection Algorithm for Deformable Objects Using OpenGL

Shmuel Aharon and Christophe Lenglet

Imaging and Visualization Department, Siemens Corporate Research, Princeton, NJ
[email protected] Abstract. This paper describes a collision detection method for polygonal deformable objects using OpenGL, which is suitable for surgery simulations. The method relies on the OpenGL selection mode, which can be used to find out which objects or geometrical primitives (such as polygons) in the scene are drawn inside a specified region, called the viewing volume. We achieve a significant reduction in detection time by using a data structure based on an AABB tree. The strength of our method is that it does not require the AABB hierarchy tree to be updated from bottom to top. We use only a limited set of bounding volumes, much smaller than the object's number of polygons. This enables us to perform a fast update of our structure when objects deform. Therefore, our approach appears to be a reasonable choice for collision detection of deformable objects.
1
Introduction
Many interactive virtual environments, such as surgery simulations, need to determine if two or more surfaces are colliding, that is, if there are surfaces touching and/or intersecting each other. Finding the exact locations of these areas is a key process in this kind of application. For realistic interactions/simulations these calculations require good timing performance and accuracy. Collision detection algorithms have been published extensively. The most general and versatile algorithms are based on bounding volume hierarchies to detect collisions between polygonal models. These algorithms are primarily categorized by the type of bounding volume used at each node of the hierarchy tree: axis-aligned bounding boxes (AABB) [1], object-oriented bounding boxes (OOBB) [2], or bounding spheres [3], [4], [5]. The main limitation of these algorithms is that for deformable objects one needs to update (or re-build) the hierarchy trees at every step of the simulation. This is a time-consuming step that significantly reduces the efficiency of these algorithms. As far as we can tell, there are only two algorithms known today that use graphics hardware acceleration for collision detection. One of them is the approach of Hoff et al. [6], which is limited to collisions between two-dimensional objects, or to some specialized three-dimensional scenes, such as those whose objects collide only in a two-dimensional plane. The second approach, suggested by Lombardo et al. [7], uses the OpenGL selection mode to identify collisions between polygonal surfaces. However, it is limited to collisions between a deformable polygonal surface and an object with a very simple shape, such as a cylinder or a box. Furthermore, the performance of this algorithm decreases significantly as the object's number of polygons increases. Therefore, it is limited to objects with a relatively small number of polygons. This paper presents a new approach for collision detection using OpenGL which allows a fast and accurate way to detect the collisions between the individual polygons of deformable objects.

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 211–218, 2002. © Springer-Verlag Berlin Heidelberg 2002
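For axis-aligned boxes under an orthographic camera, the selection-mode query that the method builds on reduces geometrically to a per-axis interval-overlap test: a box is "drawn inside" the viewing volume exactly when the two boxes intersect. A hedged, OpenGL-free sketch of that test (function and variable names are ours, not part of any API):

```python
def boxes_overlap(a, b):
    """a, b: ((xmin, ymin, zmin), (xmax, ymax, zmax)). True if the two
    AABBs intersect (touching counts as intersecting). This is the
    geometric question OpenGL selection answers when the viewing volume
    is box a and box b is rendered."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[i] <= bmax[i] and bmin[i] <= amax[i] for i in range(3))

def select(view_volume, named_boxes):
    """Return the 'names' of the boxes that fall inside the viewing
    volume, as the selection buffer would report them."""
    return [name for name, box in named_boxes
            if boxes_overlap(view_volume, box)]
```

The point of the paper's method is that the graphics card performs this culling in hardware while rendering named primitives, so the CPU only sees the surviving "names".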
2
The Collision Detection Algorithm
The collision detection algorithm suggested here detects collisions between a specified object, the reference object, and all other objects in the scene. We are assuming that each object is built from a set of polygons, which are either triangles or quadrangles. Following is a description of the various steps needed to complete a collision query with our algorithm. The algorithm is based on rendering the scene in selection mode, using an orthographic camera, with OpenGL graphics. Selection is a mode of operation for OpenGL, which automatically tells which objects in the scene are drawn inside a specified region, called the viewing volume. Before rendering it is necessary to provide a “name” for each object/primitive in the scene. After rendering in selection mode, OpenGL will return the list of all the “names” of the objects/primitives that are drawn inside the viewing volume. Further details about the OpenGL selection mode can be found in [7], [9]. 2.1
Bounding Volumes Hierarchy Creation
As a pre-processing step, it is necessary to divide the surface of each object into a set of Axis Aligned Bounding Boxes (AABB), in such a way that each bounding box contains no more than a specified number of polygons, and each polygon belongs to one and only one bounding box. We start building an AABB tree using the method suggested in [1], with the following modifications. Only the root and the leaves of the tree are used, ignoring all its internal nodes. This allows us to perform very fast updates since there is no need to consider the complete tree structure. We stop subdividing a bounding box when the number of polygons it contains is below a specified threshold. It may happen that the final bounding box has large empty spaces. This will be the case if it contains polygons that are not continuously connected in space. To prevent this, such a box is further subdivided into two bounding boxes as described in [1]. If the resulting two bounding boxes are completely separated, keep them; otherwise, keep their parent bounding box. 2.2
The Collision Query
Step 1: Find the objects’ bounding boxes that intersect with the global bounding box of the reference object. This is done by defining the global bounding box of the reference object as the OpenGL viewing volume, and rendering all the bounding boxes of all other objects, defined as triangle strips, in selection mode, using a different
“name” for each one. We use triangle strips, which are the most optimized OpenGL primitive, to ensure high performance [8]. Collisions can occur only in the area of the bounding boxes that intersect with the global bounding box of the reference object. Therefore, only these bounding boxes will be processed in the next step. If there is no bounding box that intersects the global bounding box of the reference object, then there is no collision and the algorithm stops. Step 2: Find the bounding boxes of the reference object that intersect with the bounding boxes found in the previous step. This is done, similarly to step 1, by defining the OpenGL viewing volume as one of the bounding boxes found in the previous step, and rendering all the bounding boxes of the reference object, defined as triangle strips, in selection mode, using a different “name” for each one. Repeat this process using all of the bounding boxes that have been found in the previous step. The goal of this step is to have, for each bounding box from an object that is potentially involved in a collision, the list of all bounding boxes from the reference object that intersect with it, if any. Step 3: Find the list of all the reference object's polygons that have potential collisions with an object (or objects) in the scene. For a given object bounding box B, found in step 1, render in selection mode all the polygons contained in the bounding boxes from the reference object that were found in step 2 to intersect B. This allows a major computational saving since it can greatly reduce the final number of polygon-polygon intersection checks. In order to minimize the number of viewing volume definitions, this step can be combined with step 2. Step 4: Find all the polygons, from all objects, that intersect with the bounding boxes of the reference object. These polygons have potential collisions with the reference object.
To do this define, as was done in step 3, the OpenGL viewing volume as one of the reference object bounding boxes that has known intersections with one or more objects' bounding boxes. Then render in selection mode all the polygons from the objects' bounding boxes that were found to intersect with this reference object bounding box, using a different “name” for each polygon. Repeat this procedure for all the reference object bounding boxes. The result of this processing provides the following for each reference object bounding box:

• The list L_r^i of the potentially colliding polygons inside this bounding box (i) of the reference object (found in step 3).
• The list L_ri^jk of the polygons potentially colliding with polygons of L_r^i inside the bounding boxes (k) from object (j) of the scene (found in step 4), where j goes from 1 to the number of the scene's objects (excluding the reference object), and k goes from 1 to the number of bounding boxes for object j.

This limits the number of polygons that have possible collisions, and hence significantly reduces the number of polygon-polygon intersection tests.

Step 5: Find the polygons of the reference object that are colliding with polygons from an object, or objects, and the list of these polygons. For every polygon, P, in an L_r^i list, with i going from 1 to the number of reference object bounding boxes, find whether or not this polygon really intersects the polygons from the L_ri^jk lists. To do so, define the polygon's viewing volume to be a tightly fitting volume around the polygon P (see section 2.3 for details). Then render in selection mode all the polygons in an L_ri^jk list, giving a different name to each of them. Every polygon that is found inside the specified polygon's viewing volume is actually intersecting the given polygon, P. The accuracy of this detection algorithm is limited by how accurately the polygon's viewing volume bounds the region that this polygon occupies in the world, and can be made as good as required at no additional cost. Repeat this step for all the non-empty L_ri^jk lists, and for all the L_r^i lists.
This step provides the list of all the polygons from the reference object and the polygons from the scene’s object(s) that intersect. This is the desired result of the collision detection algorithm. 2.3
Defining the Polygon’s Viewing Volume
The goal is to define a small region tightly covering a given polygon, P, as the OpenGL viewing volume, and then render all other polygons of interest in selection mode to find out if they are drawn inside the specified polygon viewing volume, which means that they are intersecting with it. The accuracy of this detection method is defined by the accuracy of the viewing volume definition and how well it really describes the region that this polygon occupies in the world. The rendering is done using an orthographic camera. In this case the OpenGL viewing volume is a rectangular parallelepiped (or, more informally, a box), defined by 6 values representing 6 planes named left, right, bottom, top, near and far. Below are the steps to define a polygon's viewing volume. First, find the polygon's two-dimensional bounding box that resides in its plane. Then specify the viewing volume around this bounding box, that is, define the left, right, bottom, and top planes of the viewing volume as the four edges of this bounding box. Next define the depth of the volume to be a very small number, ε, which specifies the required accuracy (we found that ε = 0.001 gives good results). That is, specify the near and far planes to be
−ε/2 and +ε/2 from the polygon's plane along the polygon's normal. If the polygon P under consideration is a triangular polygon, one needs to add two clipping planes to limit the viewing volume to a pyramid, tightly fitted around the polygon, as shown in Fig. 1. This method can easily be adapted to quadrangular polygons. In this case, three clipping planes might be needed to limit the viewing volume to the quadrangle edges. 2.4
Updating the AABB Structure for Object’s Deformations
When dealing with deformable objects it is necessary to update the Axis Aligned Bounding Box (AABB) structure after every step of the simulation. As mentioned before, we only keep the root of the AABB tree (the global bounding box of an object) and its leaves. Due to the relatively small number of bounding boxes that need to be updated, this task is performed rather quickly. Every bounding box is re-fitted around all the polygons that it contains. We further optimized this step by using the Streaming SIMD Extensions (SSE) provided by Intel processors since the release of the Pentium III processor (see [10] for details). This allows us to perform the update of the AABB structure twice as fast.
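Re-fitting a leaf box around its polygons is a data-parallel min/max reduction, which is what makes it amenable to SSE. A sketch using NumPy vectorization in place of hand-written SSE intrinsics (names are ours):

```python
import numpy as np

def refit_leaf(vertices, polygon_indices):
    """Re-fit one leaf AABB around the polygons it contains after a
    deformation step.

    vertices: (n, 3) array of current vertex positions.
    polygon_indices: flat index array of the vertices referenced by this
    leaf's polygons.

    Returns (min_corner, max_corner). The vectorized min/max over the
    point set is the same data-parallel pattern the paper accelerates
    with SSE."""
    pts = vertices[polygon_indices]
    return pts.min(axis=0), pts.max(axis=0)
```

Because only the root and the leaves are kept, a full update is just this reduction run once per leaf plus once over all leaf corners for the root, with no bottom-up tree traversal.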
Fig. 1. Viewing volume fitting a triangular polygon (shaded). For clarity, only one clipping plane is shown.
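The construction of section 2.3 can be sketched in plain geometry: build an orthonormal basis in the polygon's plane, take the 2D bounding box of the vertices in that basis, and set near/far to ±ε/2 along the normal. An illustrative sketch for a triangle (our own code; it omits the extra clipping planes shown in Fig. 1 and assumes a non-degenerate triangle):

```python
import numpy as np

def polygon_viewing_volume(tri, eps=0.001):
    """Tight orthographic viewing volume around a triangle, following
    sec. 2.3: left/right/bottom/top from the in-plane 2D bounding box,
    near/far at -eps/2 and +eps/2 along the polygon normal.
    Returns (origin, (u, v, n), (left, right, bottom, top, near, far))."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in tri)
    n = np.cross(p1 - p0, p2 - p0)
    n /= np.linalg.norm(n)                    # unit normal
    u = (p1 - p0) / np.linalg.norm(p1 - p0)   # first in-plane axis
    v = np.cross(n, u)                        # second in-plane axis
    # 2D coordinates of the vertices in the polygon's plane
    coords = np.array([[(p - p0) @ u, (p - p0) @ v] for p in (p0, p1, p2)])
    left, bottom = coords.min(axis=0)
    right, top = coords.max(axis=0)
    return p0, (u, v, n), (left, right, bottom, top, -eps / 2, eps / 2)
```

The depth ε directly sets the detection tolerance: any polygon rendered inside this slab is reported as intersecting P.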
3
Results
We performed several tests of our collision detection algorithm in order to evaluate its performance, the effect of the maximum number of polygons allowed in a bounding box, and the effect of the object's number of polygons on the detection time. The tests were done on an Intel Pentium III 930 MHz processor with a Matrox Millennium G450 32 MB graphics card. All reported values are the average of a set of about 2000 collisions with approximately 10-15 colliding polygons. We tested the algorithm using a model of a scalpel consisting of 128 triangles, and one of the following polygonal models: Face – 1252 triangles, Teapot – 3752 triangles, Airways – 14436 triangles, Colon – 32375 triangles, and Spinal Column with Hips – 51910 triangles. The last three models were generated from clinical CT data. 3.1
Effect of Number of Polygons per Bounding Box on the Performance
The effect of the number of polygons allowed in a bounding box on the performance of our collision detection method is shown in Fig. 2, for objects with different numbers of polygons. As expected, increasing the number of polygons in a bounding box decreases the collision detection performance, since it requires rendering a large number of polygons for every box that has potentially colliding polygons and performing many polygon-polygon intersection tests. On the other hand, having a small number of polygons in a bounding box leads to too many boxes for each mesh, increasing the number of boxes that have to be tested and again decreasing the efficiency of the detection algorithm.
Fig. 2. Effect of the number of polygons in a bounding box (BBox) on the performance of the collision detection (upper-left), the update of the AABB tree structure (upper-right), and the total iteration time (bottom) for objects with different number of triangular polygons
However, as can be seen in Fig. 2, the performance of the collision detection is not very sensitive to the selection of the number of polygons in a bounding box. Therefore, a simple rule of thumb can be used to specify this number: about 300-600 polygons per bounding box gives the best performance for large objects (more than 5000 polygons), and 50-150 polygons per bounding box for small objects (fewer than 5000 polygons). It is worth mentioning that changing the number of polygons in a bounding box, and hence the number of bounding boxes, hardly affects the performance of updating the AABB structure. This is not surprising, since no matter how many bounding boxes there are, all the polygons within them, that is, all the object's polygons, have to be processed. 3.2
Effect of the Object’s Number of Polygons on the Performance
The effect of the object’s number of polygons on the collision detection performance is shown in Fig. 3.
Fig. 3. Effect of the object’s number of polygons on the collision detection performance
As can be seen in Fig. 3, the collision detection time increases with the total number of polygons. However, the increase rate is far below linear (i.e. O(n)). This means that our algorithm can efficiently handle deformable models with a large number of polygons without a huge penalty in performance. This is the main strength of the algorithm presented here. 3.3
Performance Evaluation
Our goal was to compare the performance of our method to others. However, not all of the implementations were available to us. Therefore we performed a ballpark evaluation using public benchmark information [11]. Using the benchmark information we were able to estimate the difference in performance between the machines used to provide timing information for the various collision detection methods. Although this gives only ballpark estimates, it is sufficient to provide an idea of how well our algorithm performs in comparison to others. The Object Oriented Bounding Box (OOBB) method [2], used by RAPID, is very efficient in the collision query. However, it is necessary to rebuild or refit the OOBB tree at every step when objects deform. This is a time-consuming task (on the order of tens of milliseconds for objects with a couple of thousand polygons, see [1]) that makes this method unsuitable for use with deformable objects. The AABB method is also a very fast method for collision queries, and it can be updated for object deformation [1]. However, updating its tree structure is still the bottleneck of this method. As reported in [1] (and estimated for our machine), it takes about 1.5 milliseconds to update the AABB tree for an object with 3752 polygons, while our modified AABB structure can be updated in 0.39 milliseconds. The cost of the tree update increases significantly with the number of polygons using the AABB method. Therefore, although the AABB tree performs a collision query much faster than our method, the overall performance for each iteration is slower than ours, in particular for large objects. Finally, Brown et al. [5] suggested a method to update the bounding sphere tree reported by Quinlan [4]. They reported a time of about 0.02 milliseconds per triangle for the tree structure updates.
That is, it would be on the order of 20 milliseconds for an object with 1000 triangles, while with our method we can update 50,000 triangles in about 5 milliseconds. This again implies that our method is much more efficient for large deformable objects.
4
Conclusions
We have presented an algorithm for collision detection between polygonal deformable objects using OpenGL. The algorithm is based on the OpenGL selection mode combined with a structure of axis-aligned bounding boxes. It performs well on deformable objects with a large number of polygons, with a relatively small cost in performance as the number of polygons increases. This method is particularly suitable for surgery simulations, where fast interaction is essential.
References
1. Van Den Bergen G.: Efficient collision detection of complex deformable models using AABB trees. Journal of Graphics Tools (USA), vol. 2, no. 4, pp. 1-13, 1997.
2. Gottschalk S., Lin M.C., and Manocha D.: OBBTree: a hierarchical structure for rapid interference detection. Proceedings of the 23rd International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96), New Orleans, LA, USA, 4-9 Aug. 1996.
3. Larsen E., Gottschalk S., Lin M.C., and Manocha D.: Fast distance queries with rectangular swept sphere volumes. Proceedings 2000 ICRA, IEEE International Conference on Robotics and Automation, vol. 4, San Francisco, CA, USA, 24-28 April 2000.
4. Quinlan S.: Efficient distance computation between non-convex objects. Proceedings of the 1994 IEEE International Conference on Robotics and Automation, vol. 4, San Diego, CA, USA, 8-13 May 1994.
5. Brown J., Sorkin S., Bruyns C., Latombe J.C., Montgomery K., and Stephanides M.: Real-Time Simulation of Deformable Objects: Tools and Application. Computer Animation, Seoul, Korea, November 7-8, 2001.
6. Hoff K.E., Zaferakis A., Lin M.C., and Manocha D.: Fast and simple 2D geometric proximity queries using graphics hardware. Proceedings of the 2001 Symposium on Interactive 3D Graphics, pp. 145-148, ACM Press, New York, NY, USA, 2001.
7. Lombardo J.C., Cani M.P., and Neyret F.: Real-time collision detection for virtual surgery. Proceedings Computer Animation, Geneva, Switzerland, pp. 82-90, 26-29 May 1999.
8. Evans F., Skiena S., and Varshney A.: Optimizing Triangle Strips for Fast Rendering. Proceedings of IEEE Visualization '96, pp. 316-326, 27 October – 1 November 1996.
9. Woo M., Neider J., Davis T., and Shreiner D.: OpenGL Programming Guide. Third Edition. Addison-Wesley, Massachusetts, USA, 2000.
10. Intel Corporation: Data Alignment and Programming Issues for the Streaming SIMD Extensions with the Intel C/C++ Compiler. January 1999.
11. SPEC CPU95 Benchmark. Standard Performance Evaluation Corporation. www.spec.org.
Online Multiresolution Volumetric Mass Spring Model for Real Time Soft Tissue Deformation

Celine Paloc 1, Fernando Bello 2, Richard I. Kitney 1, and Ara Darzi 2

1 Dept. of Bioengineering, Imperial College, London, UK
2 Dept. of Surgical Oncology and Technology, Imperial College, London, UK
[email protected]

Abstract. Recent years have seen an increase in the acceptance of and demand for Virtual Reality surgical simulators. Although significant advances have been made in the area, real-time accurate simulation of soft tissue deformation is still a major obstacle in developing simulators with haptic feedback. In this paper we present a new multi-resolution volumetric mass-spring model that offers high visual and haptic resolution in and around the region of interaction and other critical regions. Visual and haptic resolution decreases in proportion to the distance from such regions, making it possible to distribute the computational workload optimally in order to achieve real-time haptic simulation.
1 Introduction
Surgical simulation is an extremely challenging area of research combining medical imagery, computer graphics and mathematical modelling. Recent advances make it possible to represent complex tissue structures and perform virtual fly-through operations, but a great deal of research in soft tissue modelling is still needed to develop the next generation of surgical simulators.

1.1 Physically-Based Deformable Models
Various approaches founded on the laws governing the dynamics of non-rigid bodies have been proposed for simulating deformable soft tissue. The Finite Element Method (FEM) is a common and accurate way to compute complex deformations of soft tissue, but conventional FEM has high computational cost and large storage requirements. Hybrid models based on global parameterized deformations and local deformations based on FEM have been introduced [1,2,3] to tackle this problem. Large-scale multi-processor computers have also been employed to obtain soft tissue deformation at interactive rates [4], while in [5,6] pre-computed elementary deformations and speed-up algorithms were used. Most of these methods, however, are only applicable to linear deformations and valid for small displacements. Furthermore, they tend to rely on pre-computing the complete matrix system and are therefore unable to cope with topological changes that occur during cutting or tearing. Mass Spring Systems (MSS) have been widely used in soft tissue simulation [7,8,9,10] because of their ability to generate dynamic behaviors that allow real-time deformation and topological changes. The main limitation of MSS is

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 219–226, 2002.
© Springer-Verlag Berlin Heidelberg 2002
computing the mass and spring parameters needed to set up a homogeneous material. Since damped springs are positioned along the edges of a given mesh, the geometrical and topological structure of this mesh strongly influences the material behavior and may generate undesired anisotropy. This anisotropy tends to disappear as the density of the mesh increases, but using an extremely dense mesh reduces efficiency. For these reasons, a trade-off between accuracy and computational time is typically required in MSS.

1.2 Multiresolution in Physically-Based Simulation
Multiresolution is a very active field of research in computer graphics and image processing. It consists of using representations of a geometric object at different levels of accuracy and complexity. This concept can be extended to physically-based simulation by dynamically and locally adapting the density of the mesh in regions of interest, depending on the desired accuracy. Most of the work in deformable modelling has used fixed space discretization. However, there have been some attempts at locally refining a model in and around regions of interest. Hutchinson et al. [11] simulated a piece of draped cloth with a MSS which can be refined in regions of high curvature, using a multi-level hierarchical mesh to represent varying levels of granularity. More recently, a model based on a multiresolution triangulation stored with a pre-processed DAG was used to refine a volumetric mass-spring network near user-controlled cutting lines [12]. Debunne et al. [13] combined a linear finite-volume based mechanical model with a non-hierarchical refinement technique, and Wu et al. [14] proposed a scheme for mesh adaptation based on an extension of the progressive mesh concept for simulation with non-linear FEM. The problem with these approaches is the use of a pre-processing phase. Such pre-processing fixes the range and accuracy of the stored resolutions and limits the flexibility of the method. Moreover, the pre-processed resolutions depend on the topology of the object and thus prohibit topological modifications.

1.3 Refinement for Unstructured Mesh Generation
Implementation of a flexible multiresolution soft tissue model requires the online refinement and simplification of an unstructured mesh. Starting with a coarse mesh, a refinement procedure based on traditional unstructured mesh generation algorithms can be applied until the desired nodal density has been achieved. One existing approach to element refinement is to divide an element into several smaller ones by inserting a single node inside it or on its boundary, depending on its location in the mesh [15,16]. The quality of the resulting elements can be improved by deleting the local elements and reconnecting the nodes to the triangulation using the Delaunay criterion. Our refinement scheme is based on Shewchuk's Delaunay refinement algorithm for 3D quality mesh generation [16]. The remainder of this paper is organized as follows: Section 2 explains in detail the proposed online multiresolution approach. In Section 3 we present and discuss the results of applying the new approach, making a comparison with standard single-resolution techniques. Lastly, in Section 4 we formulate our conclusions and comment on our future work.
2 Methodology
Recently, we presented a volumetric MSS that offers topological and geometric flexibility for the efficient modelling of complex anatomical structures and simulation of interactions such as cutting or suturing [17]. Building on our model, we now introduce a flexible and truly dynamic multiresolution volumetric mass-spring representation offering high visual and haptic resolution in and around the region of interaction and other critical regions.

2.1 Behavior Consistency
One of the main problems in multiresolution models is ensuring that the deformable model stays self-consistent despite changes of resolution. Since there has never been any underlying physical model to refer to in order to find what parameter changes will guarantee the most consistent behavior at different resolutions, it has been commonly assumed that it is difficult to change the density of the mesh during the simulation while maintaining the same global mechanical properties. To address this problem, we studied the oscillations of a deformable tissue block under gravity at several resolutions using different parameter definitions as described below. Some frames of the simulations are presented in Figure 1. Figures 2 and 3 show the results of the simulation at three different resolutions and at combined resolutions.
Fig. 1. Deformable tissue block (a) under gravity forces using different parameter definitions: (b) single resolution; (c) combined resolutions using k = 1/l and constant damping; (d) combined resolutions using (2) and (3).
Point mass. In our volumetric MSS, masses are allocated at the vertices of the tetrahedral mesh and damped springs along the edges. To accurately distribute the total mass of the mesh, we compute the mass m_i of each vertex i according to the volumes V_j of its adjacent tetrahedra j. If D is the material density, then:

    m_i = (D / 4) Σ_j V_j    (1)

Spring stiffness. The easiest way is to use a constant value for the stiffness (k). More commonly, k is computed as k = 1/l, where l is the length of the spring at rest. In [18], Van Gelder suggested a formula to compute spring stiffness for a 3D mesh that is the closest to an elastic continuous representation. Let E be the material elastic modulus; then:
Fig. 2. Comparison of the vertical position of an oscillating cube at several and combined resolutions using different k: (a) constant; (b) k = 1/l; (c) using (2). (The curves compare single resolutions of 87, 187 and 553 nodes with a combined resolution of 269 nodes.)
    k = E Σ_j V_j / l^2    (2)
Figure 2 shows the behavior of the tissue block using (1) to obtain the mass of each element and the above stiffness definitions. It clearly illustrates that using a constant k or k = 1/l fails to ensure the same amplitude and frequency of oscillations at different resolutions, while using (2) results in a consistent physical behavior at different and combined resolutions.

Spring damping. The question of how to assign different damping (c) values to the various springs in a MSS has been largely ignored in the literature. Traditionally, c is treated as a constant throughout the system. If we assume that our multi degree-of-freedom (DOF) system can be transformed into a set of uncoupled single DOF systems, the damping ratio d_i of each spring taken separately can be expressed as d_i = c / (2 √(kM)), where k is the spring stiffness and M the effective mass at its ends, m_i + m_j. To limit the oscillations without overdamping the system, we let d_i = 1, such that c = 2 √(kM). We performed the same simulation as before using (1) and (2) to calculate m and k, first letting c be a constant and then defining it as c = 2 √(kM). Figure 3(b) shows that the proposed formula guarantees the best frequency consistency for different resolutions, but fails to ensure the same amplitude of oscillation, which tends to increase with the resolution. To compensate for this effect, we adjust c to be inversely proportional to l, the length of the spring at rest:

    c = 2 √(kM) / l    (3)

As shown in Figure 3(c), this new formula ensures the best behavior consistency for different and combined resolutions. We have thus demonstrated that it is possible to ensure a coherent physical behavior of a volumetric mass spring system at different and combined resolutions by dynamically updating the parameters using equations (1), (2) and (3).
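The three parameter rules above can be collected into a small sketch (hypothetical helper functions, not the authors' code; only the formulas of equations (1), (2) and (3) are taken from the paper):

```python
import math

def vertex_mass(density, adjacent_tet_volumes):
    # Equation (1): m_i = (D / 4) * sum of the volumes of the adjacent tetrahedra.
    return density * sum(adjacent_tet_volumes) / 4.0

def spring_stiffness(elastic_modulus, rest_length, adjacent_tet_volumes):
    # Equation (2), Van Gelder's formula: k = E * sum_j V_j / l^2,
    # summing over the tetrahedra adjacent to the spring.
    return elastic_modulus * sum(adjacent_tet_volumes) / rest_length ** 2

def spring_damping(stiffness, mass_i, mass_j, rest_length):
    # Equation (3): c = 2 * sqrt(k * M) / l with M = m_i + m_j,
    # i.e. critical damping scaled by the inverse rest length.
    return 2.0 * math.sqrt(stiffness * (mass_i + mass_j)) / rest_length
```

Re-evaluating these three quantities for the affected vertices and springs whenever the local mesh density changes is what keeps the behavior consistent across resolutions.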
2.2 Online Tetrahedral Refinement
In order to refine a given mesh at a desired location, we extend the concept of tetrahedral Delaunay refinement. A coarse mesh is created offline by forming an
Fig. 3. Comparison of the vertical position of an oscillating cube at several and combined resolutions using different c: (a) constant; (b) c = 2 √(kM); (c) c = 2 √(kM)/l. (The curves compare single resolutions of 87, 187 and 553 nodes with a combined resolution of 269 nodes.)
initial boundary constrained Delaunay tetrahedralization of the input vertices and triangles. The input vertices are used as a reference and cannot be deleted. During online simulation, the given mesh can be locally refined or simplified by inserting or deleting vertices. In this section, we present the details of our method.

Addition of mass points. During the simulation, additional mass points can be inserted in a tetrahedron of the previous triangulation in its resting configuration. Using the barycentric coordinates of the point within the tetrahedron, we interpolate its previous and actual position and velocity in the deformed mesh.

Delaunay refinement. The resting configuration of the mesh is then updated using the Bowyer/Watson algorithm [19,20] to maintain the Delaunay property. Once all the new points have been inserted, the tetrahedra which might have appeared in the concavities are removed.

Vertex removal. If a vertex is removed, all tetrahedra incident at the vertex are also removed, leaving a "hole" in the mesh. The re-triangulation of the hole is done by building the Delaunay triangulation of the adjacent vertices [21].

Parameters update. Each time a tetrahedron is created or removed, all relevant parameters (m, k, c) are updated using equations (1), (2) and (3).

Data structures. The efficiency of the online tetrahedral refinement depends heavily on the data structures. We use dynamic memory reallocation and two mesh data structures for mesh manipulation: the tetrahedron-based data structure [22] and the triangle-edge data structure for face classification and mesh manipulation [23]. The data structures have been extended by adding various flags to quickly trace mesh changes after an update, and specific attributes related to the mesh deformation.
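The mass-point addition step might be sketched as follows (an illustrative numpy fragment with assumed array layouts, not the authors' implementation): given the barycentric coordinates of the new point inside a tetrahedron, its rest position, deformed position and velocity are interpolated from the four vertices.

```python
import numpy as np

def interpolate_mass_point(bary, rest_pos, deformed_pos, velocity):
    """bary: (4,) barycentric weights (non-negative, summing to 1).
    rest_pos, deformed_pos, velocity: (4, 3) arrays holding the state of the
    tetrahedron's four vertices. Returns the interpolated state of the
    inserted mass point."""
    w = np.asarray(bary, dtype=float)
    assert abs(w.sum() - 1.0) < 1e-9, "barycentric weights must sum to 1"
    return (w @ np.asarray(rest_pos, dtype=float),
            w @ np.asarray(deformed_pos, dtype=float),
            w @ np.asarray(velocity, dtype=float))
```

The same weights are applied to all three fields, so the new point starts out consistent with both the resting and the deformed configurations of the surrounding tetrahedron.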
2.3 Refinement Locations
An essential step in the implementation of our model is to determine where to refine the mesh. Our refinement locations are defined based on the region of interaction and on the topology of the mesh.

Region of interaction. The interactions between the surgical instruments and the soft tissue structures are handled by our collision detection algorithm. We approximate the geometric models of the instruments with a set of oriented
bounding boxes (OBBs), and have further optimized the collision detection by also using axis-aligned bounding box (AABB) trees for both the surgical instruments and the soft tissue models. AABB trees can be updated very quickly as the shape of the model changes [24], allowing us to compute in real time the exact points of intersection between the surgical instrument and the soft tissue model. New points are then inserted into the soft tissue mesh at the points of intersection, and the mesh is refined using the Online Tetrahedral Refinement method described above. Our collision detection also allows us to quickly compute the distance between the new points inserted into the original mesh and the OBBs of the surgical instrument. If the distance is larger than a specified threshold, the point is removed from the mesh using the Vertex Removal algorithm.

Mesh Topology. We use a quality refinement procedure to incrementally insert new points in the areas of high strain in the mesh. Such areas are normally represented by low-quality tetrahedra in the initial coarse mesh input. One possible measure for analyzing the quality of a tetrahedron is its circumradius-to-shortest-edge ratio. Any tetrahedron whose circumradius-to-shortest-edge ratio is larger than a threshold B is split by inserting a vertex at its circumcenter. By using different values for the threshold B, it is possible to make the degree of refinement dependent on local strain.
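The quality test could be sketched as follows (a generic numpy implementation of the circumradius-to-shortest-edge criterion; the function names are ours and the threshold B is application-dependent):

```python
import numpy as np
from itertools import combinations

def circumradius(verts):
    """Circumradius of a tetrahedron given a (4, 3) vertex array.
    The circumcenter x satisfies |x - v_i|^2 = |x - v_0|^2 for i = 1..3,
    which reduces to the 3x3 linear system 2 (v_i - v_0) . x = |v_i|^2 - |v_0|^2."""
    v = np.asarray(verts, dtype=float)
    a = 2.0 * (v[1:] - v[0])
    b = np.sum(v[1:] ** 2 - v[0] ** 2, axis=1)
    center = np.linalg.solve(a, b)
    return np.linalg.norm(center - v[0])

def needs_refinement(verts, bound):
    # Split the tetrahedron when circumradius / shortest edge exceeds B.
    pts = np.asarray(verts, dtype=float)
    shortest = min(np.linalg.norm(p - q) for p, q in combinations(pts, 2))
    return circumradius(verts) / shortest > bound
```

A flagged tetrahedron would then be split by inserting its circumcenter with the point-insertion machinery of Section 2.2.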
3 Results and Discussion
To demonstrate the accuracy and efficiency of our refinement method, we simulated the deformation of an organ under the interaction of a surgical instrument using single (low/high) resolution and multiresolution models. The multiresolution model incorporated refinement at the regions of interaction as described in Section 2. Every simulation lasted 20 seconds, including the time the instrument spends approaching and leaving the organ. Figure 4 shows three frames taken during each simulation as the instrument is deforming the organ. They clearly illustrate the lack of deformation accuracy of the low-resolution model, whereas the high-resolution and the multiresolution models show similar behavior. The underlying physical model described in Section 2.1 allows the refined mesh to stay perfectly consistent despite the combination of several resolutions. Unlike continuous models, which introduce diverging instabilities when different resolutions are combined [13], our refinement model behaves smoothly, without any visual artifacts. In order to quantify the performance of our model, we plotted for each simulation the elapsed time of every iteration, broken down into different steps. The results are shown in Figure 4. The relaxation time of the multiresolution model, being proportional to the density of the mesh, is kept as small as that of the low-resolution model. Figure 4(c) also shows that the time required for the refinement is small enough to keep the simulation at a rate much faster than the high-resolution model. In fact, the iteration rate of the multiresolution model varies between 2 and 20 times that of the high-resolution model.
Fig. 4. Performance comparison between different models: (a) single low resolution; (b) single high resolution; (c) multiresolution. (Each plot breaks the elapsed time per iteration, in ms, into collision and relaxation steps, plus refinement for the multiresolution model.)
4 Conclusions and Future Work
We have developed a soft tissue model that allows localized online refinement, offering high visual and haptic resolution in the regions of interest. Our results show that the refinement is made completely transparent to the user, as the model stays self-consistent despite the changes of resolution. Our method for 3D mesh refinement is truly dynamic, offering full flexibility even in the case of topological changes. In fact, we believe that our multiresolution model will facilitate the implementation of topology-modifying interactions such as cutting, since we are able to refine the model along the path of a surgical instrument. As part of our future work, we will implement an original method for accurate soft tissue cutting. We also plan to develop a multifrequency relaxation for our model. The integration time step of classical MSS is closely dependent on the spring parameters. Since our parameters are directly proportional to the resolution, we shall optimize the relaxation time in relation to the density of the mesh. This will enable us to distribute the computational workload and help us towards achieving real-time haptic simulation.
References

1. Delingette H., Cotin S., and Ayache N.: A hybrid elastic model allowing real-time cutting, deformations and force-feedback for surgery training and simulation. In Computer Animation, 1999.
2. Ramanathan R. and Metaxas D.: Dynamic deformable models for enhanced haptic rendering in virtual environments. In IEEE Virtual Reality, pages 31–35, 2000.
3. Jianyun C., Jian S., and Zesheng T.: Hybrid FEM for deformation of soft tissues in surgery simulation. In IEEE Medical Imaging and Augmented Reality, pages 298–303, 2001.
4. Székely G., Brechbühler C., Hutter R., Rhomberg A., Ironmonger N., and Schmid P.: Modelling of soft tissue deformation for laparoscopic surgery simulation. Medical Image Analysis, 4, March 2000.
5. Cotin S., Delingette H., and Ayache N.: Real-time elastic deformations of soft tissues for surgery simulation. IEEE Transactions on Visualization and Computer Graphics, 5(1):62–73, 1999.
6. James D.L. and Pai D.K.: ArtDefo: accurate real time deformable objects. In SIGGRAPH 1999, Computer Graphics Proceedings, pages 65–72, Los Angeles, 1999.
7. Joukhadar A. and Laugier C.: Fast dynamic simulation of rigid and deformable objects. In IEEE International Conference on Intelligent Robots and Systems (IROS), August 1995.
8. Desbrun M., Schröder P., and Barr A.: Interactive animation of structured deformable objects. In Graphics Interface, pages 1–8, 1999.
9. Kühnapfel U., Çakmak H., and Maaß H.: Endoscopic surgery training using virtual reality and deformable tissue simulation. Computers and Graphics, 24:671–682, 2000.
10. Brown J., Montgomery K., Latombe J.C., and Stephanides M.: A microsurgery simulation system. In MICCAI, pages 137–144, 2001.
11. Hutchinson D., Preston M., and Hewitt T.: Adaptive refinement for mass/spring simulations. In Computer Animation and Simulation '96, pages 31–45.
12. Cignoni P., Ganovelli F., and Scopigno R.: Introducing multiresolution representation in deformable modeling. In SCCG, pages 149–158, April 1999.
13. Debunne G., Desbrun M., Cani M.P., and Barr A.H.: Adaptive simulation of soft bodies in real-time. In CA, pages 15–20, 2000.
14. Wu X.M.: Adaptive nonlinear finite elements for deformable body simulation using dynamic progressive meshes. Eurographics, pages 439–448, 2001.
15. Ruppert J.: A Delaunay refinement algorithm for quality 2-dimensional mesh generation. J. Algorithms, 18(3):548–585, 1995.
16. Shewchuk J.R.: Tetrahedral mesh generation by Delaunay refinement. In Symposium on Computational Geometry, pages 86–95, 1998.
17. Paloc C., Kitney R.I., Bello F., and Darzi A.: Virtual reality surgical training and assessment system. In Computer Assisted Radiology and Surgery, pages 207–212, June 2001.
18. Van Gelder A.: Approximate simulation of elastic membranes by triangulated spring meshes. Journal of Graphics Tools, 3(2):21–41, 1998.
19. Bowyer A.: Computing Dirichlet tessellations. Computer Journal, 24:162–166, 1981.
20. Watson D.: Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Computer Journal, 24:167–172, 1981.
21. Renze K. and Oliver J.: Generalized surface and volume decimation for unstructured tessellated domains. In VRAIS, pages 111–121, March 1996.
22. Shewchuk J.R.: Delaunay Refinement Mesh Generation. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, May 1997.
23. Mücke E.P.: Shapes and Implementations in Three-Dimensional Geometry. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1993.
24. Van Den Bergen G.: Efficient collision detection of complex deformable models using AABB trees. Journal of Graphics Tools, 2(4):1–13, 1997.
Orthosis Design System for Malformed Ears Based on Spline Approximation

Akihiko Hanafusa 1, Tsuneshi Isomura 1, Yukio Sekiguchi 1, Hajime Takahashi 2, and Takeyoshi Dohi 3

1 Department of Rehabilitation Engineering, Polytechnic University
4-1-1 Hashimotodai, Sagamihara, Kanagawa 229-1196, Japan
[email protected]
2 Department of Plastic Surgery, Tokyo Metropolitan Toshima Hospital
33-1 Sakaemachi, Itabashi-ku, Tokyo 173-0015, Japan
[email protected]
3 Graduate School of Information Science and Technology, The University of Tokyo
7-3-1 Hongou, Bunkyou-ku, Tokyo 113-8654, Japan
[email protected]

Abstract. Malformed ears of neonates can be effectively treated by employing an orthosis of suitable shape. Currently, we use orthoses made of nitinol shape memory alloy wire and have developed a computer-assisted design system to manufacture the orthosis. Using this method, extracted contours of the helix and auriculotemporal sulcus are approximated by splines, and the orthosis shape can be designed by moving the control points of the spline with reference to the control points of the target auricular shape. The system also functions to evaluate the contact force between the orthosis and auricle. Using this system, orthoses were designed and manufactured for 16 patients with malformed ears. Treatment was more effective in cases where it was necessary to extend the helix.
1 Introduction
In Japan, approximately 20% of neonates are born with an auricular deformity that will not heal spontaneously. Most cases can be treated by mounting an appropriately shaped orthosis in the auricle [1]. Currently, the authors use orthoses made of nitinol shape memory alloy wire covered with an expanded polytetrafluoroethylene tube. An example of the treatment of a folded helix using an orthosis is illustrated in Fig. 1. For manufacture of the orthosis, the wire is fixed in the appropriate shape and the shape is memorized by heating to 500 °C for 30 minutes. An iron plate is grooved in the shape of the orthosis and the wire is inserted into the groove to fix the shape. We have recently developed a computer-assisted design system [2] for use in constructing the orthosis. Previously, there have been only limited attempts to employ such a system for this purpose. One example is a system that constructs a wax auricular model using a three-dimensional shape measuring system and a numerically controlled machine tool; this has been used for planning of a microtia operation [3]. Here, we describe a newly developed orthosis design method that focuses on the post-therapeutic auricular shape, in which the orthosis shape is generated based on a spline-approximated curve. The system also permits an estimation of the contact force between the auricle and orthosis by finite element analysis. This system was developed using MATLAB (The MathWorks Inc.), and we have introduced several clinical

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 227–234, 2002.
© Springer-Verlag Berlin Heidelberg 2002
applications of the system. In each case, treatment was performed only after thorough explanation of the procedure by the physician, and with the agreement of the patients' parents.

Fig. 1. A case of folded helix, and treatment using an orthosis made of nitinol shape memory wire: (a) before treatment; (b) orthosis mounted in the auricle during treatment.
2 Spline Approximation of Auricular Shape [4]
Spline curves are widely used in the field of CAD/CAM and computer graphics, with B-splines being the most popular in these applications. To approximate auricular shape in the current investigation, a B-spline function of order four with four internal knots is used. An initial step is to convert the co-ordinates of the auricular contour points. This conversion is based on the ear base line, that is, a line connecting the Otobasion superius (obs) and Otobasion inferius (obi), as illustrated in Fig. 2(a). In addition, the distance is normalized based on the length of the line between O and obs, thus permitting the comparison of various auricle shapes. The X and Y co-ordinates of the contour points are parameterized by θ, and the approximated co-ordinates are defined by the B-spline function as shown in equation (1):

    x(θ) = Σ_{i=1..8} α_i B_{i,4}(θ),  y(θ) = Σ_{i=1..8} β_i B_{i,4}(θ)    (1)

The coefficients (α_i, β_i) are calculated using the least square method and plotted on the XY plane as control points (CP_i). The position of the control points can then be used as an indicator of auricular shape. Using photographs, we examined the auricular shape of 550 ears of Japanese neonates under the age of 7 days, and classified them as normal or abnormal (five categories). Fig. 2(b) illustrates the distribution of control points for 130 normal ears. The large X's are the average normal control points, that is, the average position of the control points for normal ears, and the thick curve represents the average shape of a normal ear as determined by the position of the average normal control points. When an individual control point is specified, the Mahalanobis generalized distance and the probability of belonging to each group can be calculated from the distribution of control points. We derive the normal rate as the ratio of the probability of the normal group to that of the abnormal groups.
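The fit of equation (1) can be sketched as below (an illustrative reimplementation with numpy and scipy assumed; the paper's system was written in MATLAB, and the function name and uniform knot placement here are our assumptions):

```python
import numpy as np
from scipy.interpolate import BSpline

def fit_contour_coordinate(theta, values):
    """Least-squares fit of one contour co-ordinate, x(theta) or y(theta),
    with a B-spline of order 4 (cubic) and 4 internal knots, i.e. 8 basis
    functions B_{i,4} and 8 coefficients as in equation (1)."""
    order, n_internal = 4, 4
    lo, hi = float(theta.min()), float(theta.max())
    internal = np.linspace(lo, hi, n_internal + 2)[1:-1]
    knots = np.concatenate([[lo] * order, internal, [hi] * order])  # clamped
    n_coef = len(knots) - order                                     # = 8
    # Design matrix: column i holds basis function B_{i,4} sampled at theta.
    basis = np.column_stack([
        BSpline(knots, np.eye(n_coef)[i], order - 1)(theta)
        for i in range(n_coef)
    ])
    coef, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return knots, coef
```

Fitting x and y separately yields the eight coefficient pairs (α_i, β_i), i.e. the control points CP_i that summarize the auricular shape.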
Fig. 2. Approximated spline of an auricle: (a) axes of co-ordinates; (b) distribution of control points and average shape of a normal ear.
3 Generation of Orthosis Shape
An orthosis shape is generated as follows: 1) an auricular three-dimensional model is composed; 2) contours of the helix and auriculotemporal sulcus are obtained; 3) the contours are approximated by splines and the positions of the control points are compared with those of the target post-therapeutic auricular shape, with the control points then moved toward the target positions; and 4) the orthosis shape and output manufacturing data are generated.
3.1 Composition of the Auricular Three-Dimensional Model
The three-dimensional auricular shape is measured using a non-contact laser measurement system, which can acquire not only three-dimensional position data but also RGB color data. Since it is not possible to acquire data from both the frontal side of the auricle and the rear side (including the auriculotemporal sulcus) at the same time, it is necessary to compose a three-dimensional image using data obtained from measurements made from various directions. In order to improve the accuracy of auricular matching, we use a composition method that can match not only the surface contour but also the color. This method is based on the Iterative Closest Point (ICP) algorithm [5], with improvements to permit handling of both three-dimensional and RGB color co-ordinate distance. Equation (2) defines the united distance (d) combining three-dimensional distance and color distance. Here, P_a, P_b and C_a, C_b are the three-dimensional co-ordinates and color co-ordinates of points a and b respectively, and kp and kc are weighting coefficients.
    d^2 = kp^2 ||P_a − P_b||^2 + kc^2 ||C_a − C_b||^2    (2)

Usually, a plaster cast of the auricle, colored in a striped pattern, is used to compose the three-dimensional model, and the color distance is used auxiliarily by setting the coefficients to kp = 0.9 and kc = 0.1.
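Equation (2) can be sketched as a one-function numpy helper (hypothetical signature, not the authors' code; the default weights are the plaster-cast values kp = 0.9, kc = 0.1):

```python
import numpy as np

def united_distance(p_a, p_b, c_a, c_b, kp=0.9, kc=0.1):
    # Equation (2): d^2 = kp^2 ||P_a - P_b||^2 + kc^2 ||C_a - C_b||^2,
    # combining 3-D position distance with RGB colour distance.
    dp = np.linalg.norm(np.asarray(p_a, dtype=float) - np.asarray(p_b, dtype=float))
    dc = np.linalg.norm(np.asarray(c_a, dtype=float) - np.asarray(c_b, dtype=float))
    return np.hypot(kp * dp, kc * dc)
```

Within an ICP iteration, this metric would replace the plain Euclidean distance when searching for each point's closest counterpart in the other scan.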
3.2 Extraction of Contours of the Helix and Auriculotemporal Sulcus
Fig. 3 shows the triangular element modification and line segment extraction system. The system includes a function to trace to the next point, situated where the difference in curvature and in the direction of the normal vector from the current point is smallest. Contours can be obtained by selecting the Otobasion superius (obs) and tracing either the helix or the auriculotemporal sulcus points, automatically or manually, to the Otobasion inferius (obi).
Fig. 3. Overlaid display of the generated orthosis and auricular model (annotated with the orthosis shape and the extracted contour).
3.3 Approximation of Contours by Spline and Comparison of Control Points
To fix and memorize the orthosis shape in the shape memory alloy wire, the helix-side and the auriculotemporal-sulcus-side plates are grooved separately. The approximating plane is first calculated and the co-ordinates of the points on the contour are projected onto this plane. The approximated spline is then calculated using the method described in the previous section. Subsequently, the current positions of the control points and the positions of the control points of the post-therapeutic target shape are compared. As a post-therapeutic target shape, the spline approximation of a normal ear shape, obtained from a
normal ear on the opposite side or from a sibling or parent, can be used. Alternatively, the average position of the normal ear control points can be employed. By moving the positions of the control points, the shape of the contours can be modified.
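The control-point adjustment amounts to a simple blend toward the target positions, sketched below (hypothetical helper; fraction = 0.5 corresponds to the "halfway" case examined in Section 4):

```python
import numpy as np

def move_control_points(current, target, indices, fraction=1.0):
    """Move the selected spline control points a given fraction of the way
    toward the target (e.g. average normal) positions; the rest stay put."""
    cp = np.asarray(current, dtype=float).copy()
    tgt = np.asarray(target, dtype=float)
    cp[indices] += fraction * (tgt[indices] - cp[indices])
    return cp
```

Re-evaluating equation (1) with the moved coefficients then yields the modified contour from which the orthosis shape is generated.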
3.4 Generation of the Orthosis Shape
In accordance with the modified shape of the contours, the orthosis shape is generated. Fig. 3 illustrates an overlaid display of a generated orthosis and auricular model. Finally, the shape is converted to Numerical Control (NC) data for the grooving process on a machine tool.
4 Finite Element Analysis of Contact Force [6]
It is important to evaluate the contact force between the orthosis and auricle, to ensure that it is sufficient to correct the auricle shape while not so excessive that it may cause a decubitus-like inflammation of the auricle. To evaluate the contact force, a finite element analysis that can handle the material non-linearity of the auricle and the contact deformation of auricle and orthosis was developed. To contend with the material non-linearity, an incremental method is employed and the displacement is increased by inserting the orthosis gradually. Moreover, we have also performed a tensile test using pig's auricular cartilage, and applied the resultant strain-stress diagram as the material property. Multiple point constraints are also used to represent the contact deformation of auricle and orthosis. The constraint conditions should be updated in every insertion step of the incremental method.
Z (mm)
Orthosis
X (mm)
Y (mm)
(a) Control points lie halfway between their original position and that of the average control points in a normal situation
Orthosis
X (mm)
Y (mm)
(b) Control points are moved to the position of average control points in a normal situation
Fig. 4. Simulation results to demonstrate the effect of moving control points on the contact force distribution on the auricle
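The incremental insertion scheme described above can be sketched as a loading loop. The code below is a toy illustration only: the one-dimensional nonlinear "auricle" with reaction force f(u) = k·u + k·u² is a stand-in for the measured stress-strain law, and the contact-constraint update and FE solve are indicated only as comments, not implemented:

```python
def insert_orthosis(total_depth_mm=4.0, n_steps=8, k=2.0):
    """Toy incremental loading: advance the orthosis step by step and record
    the reaction force of a 1-D nonlinear model f(u) = k*u + k*u**2."""
    step = total_depth_mm / n_steps       # 0.5 mm per step, as in the paper
    u, forces = 0.0, []
    for _ in range(n_steps):
        u += step                         # advance the orthosis by one increment
        # a real implementation would re-detect contact here, update the
        # multiple point constraints, and solve the FE system for this step
        forces.append(k * u + k * u ** 2)
    return forces

forces = insert_orthosis()                # monotonically increasing contact force
```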
Where an orthosis was used in the case of an upper folded helix, as illustrated in Fig. 1, the contact force was compared for different positions of the control points. Figs. 4(a) and 5(a) illustrate the outcome of moving control points CP2 to CP4 (see Fig. 2(b)) halfway towards the positions of the average normal control points, as illustrated in Fig. 6(a) (middle line). Figs. 4(b) and 5(b) illustrate the effect of moving CP2 to CP4 fully to the positions of the average normal control points. The helix-side orthosis is inserted in 8 steps of 0.5 mm each, and the contact force is evaluated. The start edge of the orthosis and the auriculotemporal sulcus of the auricle are clipped. The number of elements of the auricle is 630, and that of the orthosis is 28. Fig. 4 shows the distribution of the force on the auricle; the brighter area, where more force is applied, increases when the average normal position is used. Fig. 5 illustrates the orthosis deformation and the force produced by the contact. When the control points are midway between the current and the average normal positions, the force at the clip point is 0.63 times, and the maximum contact force 0.67 times, that of the fully moved case.

Fig. 5. Simulation results demonstrating the effect of moving control points on orthosis deformation and contact force on the orthosis: (a) control points lie halfway between their original position and that of the average control points in a normal situation; (b) control points are moved to the position of the average control points in a normal situation.
Fig. 6. Treatment of the folded helix case shown in Fig. 1: (a) control points used to design the orthosis; (b) clinical result after four months of treatment.
5 Clinical Applications
The clinical application of nitinol orthoses designed and manufactured using the developed system was evaluated in 16 cases: 6 cases of cryptotia, 5 cases of folded helix, 3 cases of folded lobulus auriculae, one case of Stahl's ear, and one case of protruding auricle. Currently 11 of these cases are under treatment, and the effectiveness of treatment has been confirmed in 9 cases. Where treatment required extension of the helix, employing this method generally resulted in improvement. However, where it was necessary to form the anthelix as part of treatment, it was impossible to treat the deformity effectively using only the developed orthosis.

5.1 Treatment of a Folded Helix
Fig. 1(a) shows a folded helix in a one-month-old baby, and Fig. 6(a) shows the spline data used to generate the orthosis for this case. From the center of the figure outwards, the three lines represent the approximated spline of the auricle before treatment, the orthosis shape, and the average shape of a normal ear, respectively. Control points CP2, CP3 and CP4 are moved halfway between their current positions and the positions of the average normal points. Fig. 1(b) shows the orthosis mounted in the auricle and, as illustrated in Fig. 6(b), after 4 months of treatment the degree of folding was improved.

5.2 Treatment of a Folded Lobulus-Auriculae
Figs.7 and 8 illustrate the use of the system to correct a folded lobulus-auriculae in a three-month-old baby (Fig.7(a)). The orthosis shape shown in Fig.7(b) is generated by moving control points CP4, CP7 and CP8 toward the normal average position, as demonstrated in Fig.8(a). After one month, the degree of folding was improved (Fig.8(b)).
Fig. 7. A case of folded lobulus-auriculae and treatment using an orthosis: (a) before treatment; (b) orthosis mounted in the auricle during treatment.
Fig. 8. Treatment of the folded lobulus-auriculae case shown in Fig. 7: (a) control points used to design the orthosis; (b) clinical result after one month of treatment.
6 Conclusion
Using the orthosis design system, a three-dimensional auricular model can be produced that considers both surface contour and color. Extracted contours of the helix and auriculotemporal sulcus are approximated by splines, and the orthosis shape can be designed by moving the spline control points with reference to the control points of the target auricular shape. The system also permits evaluation of the contact force between the orthosis and the auricle. By moving control points halfway towards the positions of the average normal control points, the contact force is lower than when moving them fully to the average normal positions. Orthoses for 16 cases of malformed ears were designed and manufactured using the system, and treatment was effective in 9 cases where the helix had to be extended.
References
1. Matsuo K, Hirose T, Tomono T, Iwasawa M, Katohda S, Takahashi N, Koh B: Nonsurgical Correction of Congenital Auricular Deformities in the Early Neonate, Plast. Reconstr. Surg. 73 (1984) 38-50.
2. Hanafusa A, Takahashi H, Akagi K, Isomura T: Development of Computer Assisted Orthosis Design and Manufacturing System for Malformed Ears, Computer Aided Surgery 2 (1997) 276-285.
3. Kaneko T: A System for Three-Dimensional Shape Measurement and its Application in Microtia Ear Reconstruction, Keio J. Med. 42(1) (1993) 22-40.
4. Hanafusa A, Takahashi H, Isomura T, Dohi T: Analyses of Japanese Neonates' Auricular Shape Using Spline Approximation, Proc. of the CAR'98 (1998) 951.
5. Besl PJ, McKay ND: A Method for Registration of 3-D Shapes, IEEE Trans. Pattern Analysis and Machine Intelligence, 14(2) (1992) 239-256.
6. Hanafusa A, Isomura T, Sekuguchi Y, Takahashi H, Dohi T: Computer Assisted Orthosis Design System for Malformed Ears - Automatic Shape Modification Method for Preventing Excessive Corrective Force -, Proc. of the World Congress on Medical Physics and Biomedical Engineering Chicago 2000 (2000) 1-3.
Cutting Simulation of Manifold Volumetric Meshes
C. Forest, H. Delingette, and N. Ayache
Epidaure Research Project, INRIA Sophia Antipolis, 2004 route des Lucioles, 06902 Sophia Antipolis, France
Abstract. One of the most difficult problems in surgical simulation is simulating the removal of soft tissue. This paper proposes an efficient method for locally refining and removing tetrahedra in a real-time surgical simulator. One of the key features of this algorithm is that the tetrahedral mesh remains a 3-manifold volume during the cutting simulation. Furthermore, our approach minimizes the number of generated tetrahedra while trying to optimize their shape quality. The removal of tetrahedra is performed with a strategy that combines local refinement with the removal of neighboring tetrahedra when a topological singularity is found.
1 Introduction
The simulation of cutting soft tissue is one of the major components of a surgical simulator. In fact, in a surgical simulation procedure, the word cutting may describe two different actions: incising, which can be performed with a scalpel, and removing soft tissue material, which is performed with an ultrasound cautery. Despite their different nature, the simulations of these two actions raise a common problem: the topology modification of a volumetric mesh. Incising algorithms have been the most commonly studied approaches in the literature. Most of them are based on subdivision algorithms whose principle is to divide each tetrahedron across a virtual surface defined by the path of the edge of a cutting tool [BG00,MK00]. These algorithms create a smooth and accurate surface of cut, but they suffer from two limitations. First, they tend to generate a large number of small tetrahedra of low shape quality. The number of new tetrahedra may be reduced by moving mesh vertices along the surface of cut [NvdS01], but this tends to worsen the quality of the tetrahedra. Second, the simulation of incising supposes that the cut surface generated by the motion of a surgical tool is very smooth or even, as found in some articles, locally planar. However, in most real-time surgical simulation systems, there are no constraints on the motion of surgical tools, and most of the time the cut surface is not even close to a smooth surface. Furthermore, surgical cutting gestures do not usually consist of a single large gesture but are made of repeated small incisions. Therefore, these algorithms are not of practical use for cutting volumetric meshes, for instance when simulating a hepatectomy.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 235–244, 2002. © Springer-Verlag Berlin Heidelberg 2002
In this paper, we focus on the simulation of the removal of soft tissue material as performed with an ultrasound cautery. The targeted application is the simulation of hepatectomy, i.e. the resection of a functional segment of the liver. We suppose that all volumetric anatomical structures (for instance a liver) are represented with tetrahedral meshes that meet certain criteria (see section 2.1). To remove soft tissue, we need to perform two distinct tasks: remove tetrahedral elements, and control the element size such that the material cavities have the same size as the cautery device. The former task may seem trivial, since it could reduce to a self-evident "remove this tetrahedron from the list of tetrahedra". However, due to additional constraints on the mesh topology (which has to remain a manifold), it actually turns out to be a difficult problem to solve. The latter task consists in locally refining the mesh around the cut path. Indeed, in order to speed up the deformation of soft tissue, it is preferable to use meshes with as few tetrahedra as possible. To obtain a realistic cut, it is therefore necessary to refine the mesh dynamically, as opposed to previous approaches [CDA00,PDA01] where the mesh was refined beforehand in regions where the cutting could occur. In our approach, the tasks of removing and refining tetrahedra are both devised to satisfy two constraints. First, they must minimize computation time and therefore cannot use sophisticated remeshing algorithms, because each cutting operation should be performed in a few tens of milliseconds (typically 50-100 ms). Second, they should produce tetrahedra with a high shape quality.
2 Tetrahedral Meshes for Surgery Simulation

2.1 Topological Constraints
For the simulation of volumetric soft tissue, we use tetrahedral meshes as a geometric model. These meshes are built from triangulated surfaces with dedicated mesh generation software. For our application, we chose to restrict the set of possible tetrahedral meshes to those that are both conformal and manifold. Conformality is required since we are using finite element modeling for the spatial discretization of the elastic energy (see [PDA01] for more details). In a conformal mesh, the intersection of two tetrahedra is either empty or a common vertex, edge, or triangle. Furthermore, the tetrahedral mesh is a 3-manifold [BY98], which implies that the neighborhood of a vertex is homeomorphic to a topological sphere for inner vertices and to a half-sphere for surface vertices. More precisely, a tetrahedral mesh is a 3-manifold if the shell of a vertex (resp. an edge), i.e. the set of tetrahedra adjacent to that vertex (resp. edge), has only one connected component. In Figure 1, we show two examples of non-manifold volumetric objects. When the neighborhood of a vertex or an edge is not singly connected, we say that there exists a topological singularity at that vertex or edge.

Fig. 1. Examples of non-manifold objects: (left) edge singularity; (right) vertex singularity

Having a manifold tetrahedral mesh is not mandatory for a finite element algorithm. However, because it allows computing a normal for each surface vertex, this feature is useful for the rendering of the mesh, for example when using Gouraud shading or PN triangles [VPBM01], and becomes necessary for computing the reaction force to be sent to a force-feedback device. Furthermore, it simplifies the computation of edge and vertex neighborhoods, decreasing the redundancy of the data structure. Removing tetrahedra from a conformal tetrahedral mesh is a trivial task, but it proves to be much more difficult for manifold meshes. Thus, at least two authors have reported problems with removing topological singularities during cutting simulation [NvdS01,MLBdC01] without providing practical solutions.

2.2 Data Structure
We found that the design of the data structure for a tetrahedral mesh has a strong impact on the computational efficiency of the cutting simulation and on the ease of implementation. As a detailed description of this data structure would fall outside the scope of this paper, we only describe its main features below. Our data structure relies on the notion of a manifold mesh, and the topological information is mainly stored in vertices and tetrahedra. These two objects are stored in lists, and each tetrahedron points towards its 4 vertices, 6 edges, 4 triangles and 4 neighboring tetrahedra. We also have edge (resp. triangle) objects, which are stored in a hash table indexed by the references of their two (resp. three) vertices. To each vertex (resp. edge), we add a pointer to a neighboring tetrahedron (resp. triangle) in order to build its neighborhood. We "close" the volumetric mesh topologically by adding virtual vertices, edges, triangles and tetrahedra for each surface vertex.
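A possible realization of two of the ingredients described above is sketched below (this is an assumed illustration, not the paper's actual code, and the class and method names are invented): tetrahedra index their edges and triangles through hash tables keyed by sorted tuples of vertex references, and the manifold test of Sect. 2.1 counts the connected components of a vertex shell, where two shell tetrahedra are neighbours if they share a triangle containing the vertex.

```python
from collections import deque
from itertools import combinations

class TetMesh:
    def __init__(self):
        self.tetrahedra = []   # each: tuple of 4 vertex indices
        self.edges = {}        # sorted (v1, v2) -> incident tetrahedron ids
        self.triangles = {}    # sorted (v1, v2, v3) -> incident tetrahedron ids

    def add_tetrahedron(self, verts):
        tid = len(self.tetrahedra)
        self.tetrahedra.append(tuple(verts))
        for e in combinations(sorted(verts), 2):
            self.edges.setdefault(e, []).append(tid)
        for f in combinations(sorted(verts), 3):
            self.triangles.setdefault(f, []).append(tid)

    def vertex_shell_components(self, v):
        """Connected components of the shell of v; 1 means locally manifold."""
        shell = [i for i, t in enumerate(self.tetrahedra) if v in t]
        adjacent = {i: [] for i in shell}
        for f, tids in self.triangles.items():
            if v in f and len(tids) == 2:   # triangle containing v shared by 2 tets
                adjacent[tids[0]].append(tids[1])
                adjacent[tids[1]].append(tids[0])
        seen, components = set(), 0
        for i in shell:
            if i in seen:
                continue
            components += 1
            queue = deque([i])
            seen.add(i)
            while queue:
                for j in adjacent[queue.popleft()]:
                    if j not in seen:
                        seen.add(j)
                        queue.append(j)
        return components

mesh = TetMesh()
mesh.add_tetrahedron((0, 1, 2, 3))
mesh.add_tetrahedron((0, 4, 5, 6))   # glued to the first only at vertex 0
# mesh.vertex_shell_components(0) is 2: a vertex singularity as in Fig. 1
```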
3 Cutting Algorithm

3.1 Problem Position
In our surgical simulator, the user manipulates a force-feedback device which has the same kinematics as a real surgical instrument in a laparoscopic procedure. The position and orientation of this virtual instrument are read periodically, and collisions with the different virtual organs are detected with an efficient method [LCN99] based on standard OpenGL hardware acceleration. The input of the cutting algorithm is therefore a set of surface triangles and consequently a set of tetrahedra T_initial (those adjacent to these triangles). When simulating an ultrasound cautery, all tetrahedra in the neighborhood of the tool extremity are simply removed. We proceed in two stages: first, neighboring tetrahedra are refined, and then a subset of these tetrahedra is removed. We first describe the refinement stage before detailing the deletion stage.

3.2 Local Mesh Refinement
We have chosen, for the sake of efficiency and simplicity, to locally refine a mesh by splitting edges in two. More precisely, the input of the refinement algorithm is a set of edges; these edges may be inside the volume or on the surface. Then, we compute the set of tetrahedra that are adjacent to any edge in the set. Finally, we split each tetrahedron in this set in a systematic manner which depends on the number and the adjacency of the split edges. There are 10 distinct configurations for refinement, which are displayed in Figure 2. In order to obtain continuity across two neighboring tetrahedra (conformality), it is required in some configurations to take into account the lexicographic order of the vertices (any other order between vertices could be used).

Fig. 2. The 10 distinct refinement configurations.
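The simplest of the refinement configurations, splitting a single edge (A, B) of a tetrahedron (A, B, C, D) at its midpoint m, replaces the tetrahedron by the two children (m, B, C, D) and (A, m, C, D). A hedged sketch of that one case (an illustration, not the authors' code; consistent child orientation and the multi-edge configurations of Figure 2 are omitted):

```python
import numpy as np

def split_one_edge(tet, coords, edge):
    """Split edge (a, b) of tetrahedron tet at its midpoint and return the
    two child tetrahedra (m, b, c, d) and (a, m, c, d)."""
    a, b = edge
    m = len(coords)                        # index of the new midpoint vertex
    coords.append(0.5 * (np.asarray(coords[a], float) + np.asarray(coords[b], float)))
    c, d = [v for v in tet if v not in edge]
    return [(m, b, c, d), (a, m, c, d)]

coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
new_tets = split_one_edge((0, 1, 2, 3), coords, (0, 1))   # midpoint gets index 4
```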
Fig. 6. Results of the leave-one-out test: area of surface with error > 5 mm (left); mean and maximal surface distance as a function of the number of modes (right)
4 Discussion and Conclusion
We have analyzed the variations of a 3D statistical shape model. One major challenge in building a shape model is the determination of a good correspondence between two surfaces. We have presented a novel method based on a patchification and the geometric idea of minimizing distortion (local scaling and shearing of the surface). The algorithm discussed here gives only an approximate solution to this objective. As an improvement, we intend to relax the surface nodes after the initial parametrization, constraining only those nodes that represent distinct features (landmarks, nodes along lines of high curvature, etc.). This relaxation should improve the correspondence and hence the representation of new shapes by the model. The correspondence method can be extended to arbitrary topologies without difficulty; this certainly reflects one major advantage of the method presented here. Yet more manual effort for decomposing the surface into patches is needed for more complicated topologies. We are working on an automatic extraction of anatomical feature lines. The results of the compactness analysis show that this correspondence method is suitable for 3D shape analysis of complex and variable anatomical shapes. Fig. 5 (left) indicates that the dimensionality of the model has not reached convergence, which implies that a larger training set is needed. Of course, a liver model might never be complete, though it may be complete enough for image segmentation. Our experiments with the leave-one-out test suggest that the essential features of an arbitrary shape will be accounted for. For the task of accurate image segmentation the model must be extended, as Fig. 6 exemplifies. A comparison between different strategies for establishing correspondence would be of high interest. As mentioned in the beginning, judging the quality of a correspondence is not trivial and may depend on the application.
A Statistical Shape Model for the Liver
We have developed an efficient and intuitive approach for the creation of 3D shape models from segmented training data. This provides a basis for automated 3D image segmentation incorporating a priori knowledge.
Acknowledgements Thomas Lange is supported by the Deutsche Forschungsgemeinschaft (DFG) project “Intraoperative Navigation” 201879. Martin Seebass and Hans Lamecker are supported by DFG collaborative research project “Hyperthermia: Clinical Aspects and Methodology” SFB 273.
Statistical 2D and 3D Shape Analysis Using Non-Euclidean Metrics
Rasmus Larsen, Klaus Baggesen Hilger, and Mark C. Wrobel
Informatics and Mathematical Modelling, Technical University of Denmark, Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby, Denmark
{rl,kbh,mcw}@imm.dtu.dk, http://www.imm.dtu.dk

Abstract. The contribution of this paper is the adaptation of data-driven methods for non-Euclidean metric decomposition of tangent space shape coordinates. The basic idea is to extend principal component analysis to take into account the noise variance at different landmarks and in different shapes. We show examples where these non-Euclidean metric methods allow easier interpretation by decomposition into biologically meaningful modes of variation. The extensions to PCA are based on adapting the maximum autocorrelation factor and minimum noise fraction transforms to shape decomposition. A common basis of the applied methods is the assessment of the annotation noise variance at individual landmarks, based on local models or on repeated annotations by independent operators.
1 Introduction
For the analysis and interpretation of multivariate observations, a standard method has been the application of principal component analysis (PCA) to extract latent variables. Cootes et al. applied PCA to the analysis of tangent space shape coordinates [1]. For various purposes, different procedures for PCA using non-Euclidean metrics have been proposed. The maximum autocorrelation factor (MAF) transform proposed by Switzer [2] defines maximum spatial autocorrelation as the optimality criterion for extracting linear combinations of multispectral images; PCA, by contrast, seeks linear combinations that exhibit maximum variance. Because imaged phenomena often exhibit some sort of spatial coherence, spatial autocorrelation is often a better optimality criterion than variance. We have previously adapted the MAF transform for the analysis of tangent space shape coordinates [3]. In [4] the noise adjusted PCA, or minimum noise fraction (MNF), transform was used for the decomposition of multispectral satellite images. The MNF transform is a PCA in a metric space defined by a noise covariance matrix estimated from the data; for image data the noise process covariance is conveniently estimated using spatial filtering. In [5] the MNF transform is applied to texture modelling in active appearance models [6]. Bookstein proposed using bending energy and inverse bending energy as metrics in the tangent space [7]: using the bending energy puts emphasis on large scale variation, while using the inverse bending energy puts emphasis on small scale variation.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 428–435, 2002. © Springer-Verlag Berlin Heidelberg 2002
2 Methods

2.1 Maximum Autocorrelation Factors
Let the spatial covariance function of a multivariate stochastic variable Z_k, where k denotes spatial position and ∆ a spatial shift, be Π(∆) = Cov{Z_k, Z_{k+∆}}. Then, letting the covariance matrix of Z_k be Σ and defining the covariance matrix Σ_∆ = D{Z_k − Z_{k+∆}}, we find

Σ_∆ = 2Σ − Π(∆) − Π(−∆)   (1)

The autocorrelation at shift ∆ of a linear combination of Z_k is then

Corr{w_i^T Z_k, w_i^T Z_{k+∆}} = 1 − (1/2) · (w_i^T Σ_∆ w_i) / (w_i^T Σ w_i)   (2)
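The directions w_i maximizing the autocorrelation in (2) are conjugate eigenvectors of Σ_∆ with respect to Σ. A small numpy sketch (an illustrative implementation, not the authors' code; symmetrized whitening via a Cholesky factor is one standard way to solve the generalized eigenproblem):

```python
import numpy as np

def maf(Z, shift=1):
    """MAF transform of Z (n ordered observations x m variables).

    Returns eigenvalues kappa (ascending) and conjugate eigenvectors W of
    Sigma_Delta w = kappa * Sigma * w; the autocorrelation of component i
    is 1 - kappa_i / 2, so the first column of W is the first MAF.
    """
    Zc = Z - Z.mean(axis=0)
    Sigma = np.cov(Zc, rowvar=False)
    D = Zc[shift:] - Zc[:-shift]          # differences at the chosen shift
    Sigma_d = np.cov(D, rowvar=False)
    L = np.linalg.cholesky(Sigma)         # Sigma = L @ L.T
    Linv = np.linalg.inv(L)
    kappa, V = np.linalg.eigh(Linv @ Sigma_d @ Linv.T)
    return kappa, Linv.T @ V

# toy example: one smooth (highly autocorrelated) component plus two noise ones
rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 500)
Z = np.column_stack([np.sin(t), rng.normal(size=500), rng.normal(size=500)])
kappa, W = maf(Z)                         # kappa[0] is small: MAF1 is the sine
```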
The MAF transform is given by the set of conjugate eigenvectors of Σ_∆ with respect to Σ, W = [w_1, ..., w_m], corresponding to the eigenvalues κ_1 ≤ ... ≤ κ_m [2]. The resulting new variables are ordered so that the first MAF is the linear combination that exhibits maximum autocorrelation. The i-th MAF is the linear combination that exhibits the highest autocorrelation subject to being uncorrelated with the previous MAFs. The autocorrelation of the i-th component is 1 − κ_i/2.

2.2 Minimum Noise Fractions
As before we consider a multivariate stochastic variable Z_k. We assume an additive noise structure Z_k = S_k + N_k, where S_k and N_k are uncorrelated signal and noise components with covariance matrices Σ_S and Σ_N, respectively. Thus Cov{Z_k} = Σ = Σ_S + Σ_N. Defining the signal-to-noise ratio (SNR) as the ratio of the signal variance to the noise variance, we find for a linear combination of Z_k

SNR_i = V{w_i^T S_k} / V{w_i^T N_k} = (w_i^T Σ_S w_i) / (w_i^T Σ_N w_i) = (w_i^T Σ w_i) / (w_i^T Σ_N w_i) − 1   (3)
So the minimum noise fractions are given by the set of conjugate eigenvectors of Σ with respect to Σ_N, W = [w_1, ..., w_m], corresponding to the eigenvalues κ_1 ≥ ... ≥ κ_m [4]. The resulting new variables are ordered so that the first MNF is the linear combination that exhibits maximum SNR. The i-th MNF is the linear combination that exhibits the highest SNR subject to being uncorrelated with the previous MNFs. The SNR of the i-th component is κ_i − 1. The central problem in the calculation of the MNF transformation is the estimation of the noise, with the purpose of generating a covariance matrix that approximates Σ_N. Usually the spatial nature of the data is utilized, and the noise is approximated by the difference between the original measurement and a spatially filtered version or a local parametric function (e.g. a plane or a quadratic function).
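The following numpy sketch mirrors this scheme (an assumed implementation, not the authors' code): the noise at each observation is approximated by its residual from the average of its two neighbours along the spatial ordering, an estimated Σ_N is formed, and the conjugate eigenvectors of Σ with respect to Σ_N are computed; the SNR of component i is κ_i − 1:

```python
import numpy as np

def mnf(Z):
    """MNF transform of Z (n spatially ordered observations x m variables)."""
    Zc = Z - Z.mean(axis=0)
    Sigma = np.cov(Zc, rowvar=False)
    # noise proxy: residual from the local average of the two neighbours
    N = Zc[1:-1] - 0.5 * (Zc[:-2] + Zc[2:])
    Sigma_N = np.cov(N, rowvar=False)
    L = np.linalg.cholesky(Sigma_N)       # Sigma_N = L @ L.T
    Linv = np.linalg.inv(L)
    kappa, V = np.linalg.eigh(Linv @ Sigma @ Linv.T)
    order = np.argsort(kappa)[::-1]       # decreasing: MNF1 has maximum SNR
    return kappa[order], (Linv.T @ V)[:, order]

# toy example: a smooth signal column has tiny local residuals, hence huge SNR
rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 500)
Z = np.column_stack([np.sin(t), rng.normal(size=500), rng.normal(size=500)])
kappa, W = mnf(Z)                         # kappa[0] - 1: SNR of the sine mode
```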
2.3 MNF and MAF for Shape Decomposition
We have previously [3] shown how to adapt the MAF transform to shape decomposition by utilizing the ordering of landmarks (variables) instead of the ordering of pixels (observations), i.e. by transposing the data matrix. Furthermore, it was shown that Molgedey-Schuster's [8] independent component analysis (ICA) is equivalent to MAF. If the matrices in Equations (2) and (3) are singular, the solution must be found in the affine support of the matrix in the denominator, e.g. by means of a generalized singular value decomposition.
3 Materials
We demonstrate the properties of the proposed techniques on two datasets. The first dataset consists of 2D annotations of the outlines of the right and left lungs from 115 standard PA chest radiographs. The chest radiographs were randomly selected from a tuberculosis screening program and contained normal as well as abnormal cases. The annotation process was conducted by identifying three anatomical landmarks on each lung outline, followed by equidistant distribution of pseudo-landmarks along the 3 resulting segments of the outline. In Fig. 1(b) the landmarks used for annotation are shown. Each lung field was annotated independently by two observers, Dr. Bram van Ginneken and Dr. Bart M. ter Haar Romeny. The dataset was supplied to us by Dr. Bram van Ginneken; for further information the reader is referred to the Ph.D. thesis of van Ginneken [9]. The second dataset consists of 4D landmarks of a set of surfaces of human mandibles (the lower jaw) registered over time. The surfaces were extracted in a previous study by Dr. Per R. Andresen from CT scans of 7 Apert patients, each imaged 3-5 times from the age of 3 months to 12 years. The mandibles are assumed to exhibit normal growth. The scans were performed for diagnostic and treatment planning purposes and supplied by Dr. Sven Kreiborg (School of Dentistry, University of Copenhagen, Denmark) and Dr. Jeffrey L. Marsh (Plastic and Reconstructive Department for Pediatric Plastic Surgery, Washington University School of Medicine at St. Louis Children's Hospital, St. Louis, Missouri, USA). The surface extraction and registration were carried out by matching of the extremal mesh followed by a geometry-constrained diffusion procedure described in [10,11]. The surfaces contain approximately 14,000 homologous points.
4 Results

4.1 Lung Dataset
We intend to use the annotations by the two independent observers to estimate the annotation uncertainty. Initially the lung annotations are aligned to a common reference frame by concatenating the annotations of the two observers and performing a generalized Procrustes analysis (GPA) [12,13]. Now we can compute the differences between the two sets of annotations and estimate an inter-observer covariance matrix of the landmark coordinates. Obviously we would like to view the intercorrelation per landmark and not per coordinate; rotation of the frame of reference will shift the correlation between the x and y coordinates, which may cause some confusion. To overcome this problem, for each pair of landmarks we estimate the maximum correlation between linear combinations of their coordinates. These are the canonical correlations [14]. In Fig. 1 we see these correlations for the right and left lung. The inter-lung correlations are negligible. For both lungs we see a high degree of correlation along the curved top outline of the lungs; for both lungs, landmark 1 is the top point. Again, for both lungs there is little or no correlation across the two anatomical landmarks that delimit the bottom segment of the outlines.

Fig. 1. Landmarks of the left and right lung. Landmark numbers are shown in the middle. The right lung is annotated by 40 landmarks, and the left lung by 36. The anatomical landmarks on the right field are points 1, 17, and 26; on the left field they are points 1, 17, and 22. (a),(c) Inter-observer difference canonical correlations between landmarks for the right and left lungs. (d),(e) Inter-neighbour landmark difference canonical correlations between landmarks for the right and left lung.

The inter-observer covariance matrix defines one sensible metric to use when decomposing the shape variability. It puts less emphasis on landmarks with high annotation variance and more emphasis on landmarks with low annotation variance, and results in a minimum noise fraction transform. As an alternative to assessing the inter-observer differences, we may consider the covariance of the differences of neighbouring landmarks; the correlation structure of these is also shown in Fig. 1. Here the partitioning of the landmarks into three segments for each lung is more pronounced. Using this covariance as the metric corresponds to the MAF transform.

Fig. 2. The 6 most important principal components (PC), principal components on a standardized dataset (PCC), annotation noise adjusted principal components (EPC), maximum autocorrelation factors (MAF), and relative warps (REL). The blue curve is the mean shape, and the green and red curves represent ±5 standard deviations as observed in the training set.

In Fig. 2 the 6 most important principal components (PC), principal components on a standardized dataset (PCC), annotation noise adjusted principal components (EPC), maximum autocorrelation factors (MAF), and relative warps (REL) are shown. The relative warps use the bending matrix of the estimated mean shape as the metric. The PCs and PCCs are fairly similar, but the EPCs, MAFs, and RELs are different; the latter three all use metrics that differ significantly from the Euclidean one. The first EPC is an aspect-ratio variation, and the following 5 EPCs seem to be a mix of the first PCs. The first MAF is also an aspect-ratio variation, and the following MAFs also have evident large-scale interpretations; in particular, MAF4 is the relative size of the lungs. The relative warps also give various large-scale variations, but they are not as easily interpretable as the MAFs.
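The per-landmark canonical correlations used above can be computed as the singular values of the whitened cross-covariance between the two coordinate blocks. A numpy sketch (an assumed implementation, not the authors' code):

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two coordinate blocks (n x p each)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / len(X)
    Syy = Yc.T @ Yc / len(Y)
    Sxy = Xc.T @ Yc / len(X)
    inv_chol = lambda S: np.linalg.inv(np.linalg.cholesky(S))
    K = inv_chol(Sxx) @ Sxy @ inv_chol(Syy).T   # whitened cross-covariance
    return np.linalg.svd(K, compute_uv=False)    # decreasing, in [0, 1]

# toy example: the two 2-D landmarks share one direction of variation
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = np.column_stack([X[:, 0] + 0.1 * rng.normal(size=200),
                     rng.normal(size=200)])
rho = canonical_correlations(X, Y)               # rho[0] close to 1
```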
Statistical 2D and 3D Shape Analysis Using Non-Euclidean Metrics
Fig. 3. Scatter plots of the first 3 PCs and age, log age, and centroid size. A strong correlation of the shape variation in PC1 and MNF1 with size is demonstrated. Because size has been filtered out of the shape decomposition in the Procrustes analysis, these components can be interpreted as shape change due to growth. The lower order components exhibit variation between individuals.
4.2 Mandible Dataset
A major objective for the analysis/decomposition of the mandible dataset is the construction of a growth model that allows prediction of mandible size and shape from early scans (1-3 months). When performing pediatric cranio-facial surgery, prediction of growth patterns is extremely important. Growth modelling will also add to basic understanding as well as have teaching implications. Here we demonstrate the use of the MNF transformation for decomposition of a 3D dataset as an alternative to PCA. The mandibles are aligned using a generalized 3D Procrustes analysis [15] and projected into tangent space. Each mandible is represented by a triangulated surface based on the 14000 landmarks. This triangulation allows us to determine the neighboring landmarks easily. We estimate the noise covariance matrix in Equation (3) as the covariance matrix of the deviations from the mean displacements between landmark coordinates and planes fitted locally to all landmarks in a neighbourhood. In the example shown we have used a 4th order neighbourhood. In Fig. 3 pairwise scatter plots of the first three components and age, log age, and centroid size are shown for PCs as well as MNFs. For the PCs we see a strong relationship between PC1, age, and size. This means that PC1 relates to mandible growth, as was also concluded and utilized in [10]. PC2 and PC3 do not correlate with age or size but contain variation between individuals. For the MNFs we see that we have captured two uncorrelated modes of variation, MNF1 and MNF2, that relate to size and age. MNF3 is a contrast between the three younger mandible scans of subject number 5 and the rest of the mandibles. In Figs. 4 and 5 the first two PCs and MNFs are shown. In each plot a
greenish mean shape and a goldish positive or negative deviation are shown. For PC1 we see a contrast between young, broad, flat mandibles with small condyles and older, slimmer, higher mandibles with large condyles and erupted teeth. For MNF1 and MNF2 we see different patterns of growth.
(a) PC1 ’-’
(b) PC1 ’+’
(c) PC2 ’-’
(d) PC2 ’+’
Fig. 4. Principal components 1 and 2 shown as ±2 standard deviations across the training set.
(a) MNF1 ’-’
(b) MNF1 ’+’
(c) MNF2 ’-’
(d) MNF2 ’+’
Fig. 5. Minimum noise fractions 1 and 2 shown as ±2 standard deviations across the training set.
5 Conclusion
We have demonstrated a series of data driven methods for constructing non-Euclidean metric linear decompositions of the tangent space shape variability in 2D and 3D. We have demonstrated ways of constructing such a metric based on repeated measurements as well as by use of the spatial nature of the outline and surface models considered. It turns out that the MAF and MNF transforms are superior in terms of interpretability for decomposing large scale variation. These methods are tools for determining uncorrelated biological modes of variation.
Acknowledgements The work was supported by the Danish Technical Research Council under grant number 26-01-0198 which is hereby gratefully acknowledged. The authors thank Dr. Bram van Ginneken for use of the lung annotation data set. The authors also thank Dr. Sven Kreiborg and Tron Darvann (School of Dentistry, University of Copenhagen, Denmark) for providing insight into the study of mandibular growth.
References

1. T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, "Training models of shape from sets of examples," in British Machine Vision Conference: Selected Papers 1992, (Berlin), Springer-Verlag, 1992.
2. P. Switzer, "Min/max autocorrelation factors for multivariate spatial imagery," in Computer Science and Statistics (L. Billard, ed.), pp. 13-16, Elsevier Science Publishers B.V. (North Holland), 1985.
3. R. Larsen, H. Eiriksson, and M. B. Stegmann, "Q-MAF shape decomposition," in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2001, 4th International Conference, Utrecht, The Netherlands, vol. 2208 of Lecture Notes in Computer Science, pp. 837-844, Springer, 2001.
4. A. A. Green, M. Berman, P. Switzer, and M. D. Craig, "A transformation for ordering multispectral data in terms of image quality with implications for noise removal," IEEE Transactions on Geoscience and Remote Sensing, vol. 26, pp. 65-74, Jan. 1988.
5. K. B. Hilger, M. B. Stegmann, and R. Larsen, "A noise robust statistical texture model," in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2002, 5th International Conference, Tokyo, Japan, 2002. 8 pp. (submitted).
6. T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," in Proceedings of the European Conf. on Computer Vision, pp. 484-498, Springer, 1998.
7. F. L. Bookstein, Morphometric Tools for Landmark Data. Cambridge University Press, 1991. 435 pp.
8. L. Molgedey and H. G. Schuster, "Separation of a mixture of independent signals using time delayed correlations," Physical Review Letters, vol. 72, no. 23, pp. 3634-3637, 1994.
9. B. van Ginneken, Computer-Aided Diagnosis in Chest Radiographs. PhD thesis, Image Sciences Institute, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands, 2001. 184 pp.
10. P. R. Andresen, F. L. Bookstein, K. Conradsen, B. K. Ersbøll, J. L. Marsh, and S. Kreiborg, "Surface-bounded growth modeling applied to human mandibles," IEEE Transactions on Medical Imaging, vol. 19, pp. 1053-1063, Nov. 2000.
11. P. R. Andresen and M. Nielsen, "Non-rigid registration by geometry-constrained diffusion," Medical Image Analysis, vol. 5, no. 2, pp. 81-88, 2001.
12. J. C. Gower, "Generalized Procrustes analysis," Psychometrika, vol. 40, pp. 33-50, 1975.
13. C. Goodall, "Procrustes methods in the statistical analysis of shape," Journal of the Royal Statistical Society, Series B, vol. 53, no. 2, pp. 285-339, 1991.
14. H. Hotelling, "Relations between two sets of variates," Biometrika, vol. 28, pp. 321-377, 1936.
15. J. M. F. ten Berge, "Orthogonal Procrustes rotation for two or more matrices," Psychometrika, vol. 42, pp. 267-276, 1977.
Kernel Fisher for Shape Based Classification in Epilepsy

N. Vohra^1, B. C. Vemuri^1, A. Rangarajan^1, R. L. Gilmore^2, S. N. Roper^3, and C. M. Leonard^4

1 CISE Department, University of Florida, Gainesville, FL 32601, USA
2 Dept. of Neurology, University of Florida, Gainesville, FL 32601, USA
3 Dept. of Neurosurgery, University of Florida, Gainesville, FL 32601, USA
4 Dept. of Neuroscience, University of Florida, Gainesville, FL 32601, USA
Abstract. In this paper, we present the application of kernel Fisher in the statistical analysis of shape deformations that might indicate the hemispheric location of an epileptic focus. The scans of two classes of patients with epilepsy, those with a right and those with a left medial temporal lobe focus (RATL and LATL), as validated by clinical consensus and subsequent surgery, were compared to a set of age and sex matched healthy volunteers using both volume and shape based features. Shape based features are derived from the displacement field between the left and right hippocampi of a healthy subject/patient. The results show a significant improvement in distinguishing between the controls and the rest (RATL and LATL) using only shape as opposed to volume based features. We also achieve a reasonable improvement in accuracy when distinguishing between RATL and LATL based on shape in comparison to volume information. It should be noted that automated identification of hemispheric foci of epilepsy has not been previously reported.
1 Introduction
Statistical analysis of shape deformations, such as those likely to occur in epilepsy and other neurological disorders, necessitates both global and local parameter based characterization of the object under study. The most popular approach has been size and volume based analysis. However, this captures only one of the aspects necessary for complete characterization, while shape based analysis gives much more information, which can be combined with the former to help understand the anatomical structures better. In this paper, we focus on developing an automatic technique which can aid in distinguishing between controls and patients with epilepsy and can indicate the hemispheric location of an epileptic focus (right medial temporal lobe or left medial temporal lobe) in the patients. It should be noted that the work does not attempt to determine the precise coordinates of the epilepsy focus in the patients.

1.1 Literature Review
Using classification techniques such as Support Vector Machines and the Fisher discriminant for the statistical analysis of anatomical shape
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 436-443, 2002. © Springer-Verlag Berlin Heidelberg 2002
differences between different populations has been the focus of various researchers in the recent past. The choice of feature vectors that capture maximum information plays an important part in such studies. Gerig and Styner [3] proposed the use of both volume measurements and shape based features (mean square distance) to detect group differences in hippocampal shape in schizophrenia. The class differences are then accounted for by using an SVM, followed by performance evaluation using the leave-one-out technique. From the results reported, it can be concluded that shape alone could not capture the class differences. This failure can be attributed either to weak shape features or to the fact that the nature of the groups under study is such that shape alone cannot represent the entire class character. Joshi et al. [4] used high dimensional transformations of a brain template to compare the hippocampal volume and shape characteristics in schizophrenia and control subjects. Linear discriminant analysis was used to measure the performance of the selected feature vector. However, the fluid flow model used is computationally expensive. Recently, Golland et al. [5] have studied hippocampal shape differences between schizophrenia patients and normal controls. The shapes were represented using implicit functions and classified using an SVM with a Gaussian kernel. Marginal improvements over volume based methods were reported using this technique.

1.2 Overview of Our Algorithm
In this paper, we demonstrate the application of the kernel Fisher algorithm for shape based classification of hippocampal shapes in controls and epilepsy. Given a pair of sparse sets of data points corresponding to the outlines of the left and right hippocampi of a subject, appropriate shape features are extracted by first fitting a model to the data sets using a deformable pedal surface [2]. This is followed by a rigid and a non-rigid registration of the left and right hippocampi using the Iterative Closest Point algorithm [1] and the level-set method [7], respectively. The local deformations obtained by non-rigid registration are then fed into the kernel Fisher classifier to capture the statistical difference between the three known classes in epilepsy. The choice of kernel Fisher as the classifier is motivated by the fact that it can separate the classes in a very high or infinite dimensional space using a linear classifier and is simple to implement. The kernel Fisher training algorithm used in this work does not require non-linear optimization, unlike the SVM, and hence is computationally more efficient. Note that the optimization the SVM solves is a constrained quadratic programming problem and is computationally demanding. Kernel Fisher has shown results comparable to SVM in various other applications [6].

1.3 Organization of the Paper
The rest of the paper is organized as follows: Section 2 describes the snake pedal model used for fitting a model to given data points, followed by the procedure for selecting shape based features. In Section 3, we summarize the kernel Fisher method as a classifier. Section 4 presents the experimental results, followed by conclusions in Section 5.
2 Shape Extraction and Features
In this section we discuss the schemes used to segment the region of interest, followed by the methods employed for rigid and non-rigid registration of the corresponding hippocampi for a given subject.

2.1 Overview of the Shape Modeling Scheme
In order to segment the region of interest in the given image, we use a deformable pedal surface described in [2]. Pedal curves/surfaces are defined as the loci of the feet of the perpendiculars to the tangents of a fixed curve/surface from a fixed point called the pedal point [2]. A large class of shapes exhibiting both global and local deformations can be synthesized by varying the position of the pedal point. Physics-based control is introduced by using a snake to represent the position of this varying pedal point. The model is therefore called a "snake pedal" and allows interactive manipulation through forces applied to the snake. The model also allows representation of global deformations such as bending and twisting without introducing additional parameters. To fit a model to a given set of data points in 2D/3D, a non-linear optimization scheme is employed, with the Levenberg-Marquardt (LM) method in the outer loop estimating the global parameters and the Alternating Direction Implicit (ADI) method in the inner loop estimating the local parameters of the model [2].

2.2 Shape Registration
Shape registration, in general, is required at both global and local levels. In the present work we use the Iterative Closest Point (ICP) algorithm proposed in [1] to determine the rotation and translation between a subject's left and right hippocampus. The choice of the ICP algorithm is motivated by the fact that snake-pedal based model fitting yields an extrinsic parameterization which is not suitable for finding the corresponding points between the left and right hippocampi. The corresponding left and right hippocampi of a subject may differ by a global scaling factor, which is accounted for by approximating each shape by the smallest ellipsoid that encloses the hippocampus and then equalizing their corresponding eigenvalues. The problem of finding the non-rigid deformation can be formulated as a motion estimation task, in particular, estimation of the displacement field between the two given shapes. We use the level-set formulation described in [7] to estimate the displacement field, which leads to the following governing equation:

\vec{V}_t = \left[ d_2(X) - d_1(\vec{V}(X)) \right] \frac{\nabla\left(G_\sigma * d_1(\vec{V}(X))\right)}{\left\| \nabla\left(G_\sigma * d_1(\vec{V}(X))\right) \right\| + \alpha} + \lambda \begin{pmatrix} \Delta u \\ \Delta v \\ \Delta w \end{pmatrix}, \qquad \vec{V}(X, 0) = 0 \qquad (1)

where d_1 and d_2 denote the signed distance images of the source and target shapes, \nabla denotes the gradient, \Delta denotes the Laplacian operator, G_\sigma is a Gaussian kernel, and \alpha is a small
positive number called a stabilizing factor. The above differential equation can be solved using the numerical implementation described in [7]. Note that the signed distance images can be obtained by using the Fast Marching Method (FMM) described in Sethian [8].
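The rigid step above, ICP, alternates nearest-neighbour matching with a closed-form least-squares rotation/translation fit. The following is a compact sketch of the standard algorithm (not the authors' code), using an SVD-based orthogonal Procrustes solve for each iteration:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=20):
    """Rigidly align `source` (M x 3) to `target` (K x 3): standard ICP."""
    src = source.copy()
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # 1) Match each source point to its closest target point.
        _, idx = tree.query(src)
        matched = target[idx]
        # 2) Closed-form rigid fit (Kabsch/SVD) for the matched pairs.
        mu_s, mu_m = src.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_m))
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T          # rotation (reflection corrected)
        t = mu_m - R @ mu_s
        src = src @ R.T + t
        # Accumulate the composed transform.
        R_total, t_total = R @ R_total, R @ t_total + t
    return src, R_total, t_total
```

The reflection correction via the diagonal matrix D guarantees a proper rotation even when the SVD yields an improper one.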
3 Kernel Fisher
The classification problem can be approached in two ways, namely supervised and unsupervised, with the discriminant function being linear or non-linear. The classical approach begins with the optimal Bayes classifier; assuming a normal distribution for the classes, linear discriminant analysis leads to the Fisher algorithm. The Fisher approach is based on projecting d-dimensional data onto a line in the hope that the projections are well separated by class. Thus, the line is oriented to maximize this class separation. However, the features in the input space may not possess sufficient discriminatory power for separation of the classes via linear projection techniques. This problem can be tackled by mapping the input data into a very high dimensional space and using a linear classifier in this new feature space, thereby giving an implicit non-linear classification in the input space. This is the basic idea behind the kernel Fisher algorithm. Let \phi be a non-linear mapping to some feature space F. Separation in the new feature space can then be found by maximizing

J(w) = \frac{w^T S_B^\phi w}{w^T S_W^\phi w} \qquad (2)

where w \in F, and S_B^\phi and S_W^\phi are defined as

S_B^\phi = (m_1^\phi - m_2^\phi)(m_1^\phi - m_2^\phi)^T, \qquad S_W^\phi = \sum_{i=1,2} \sum_{x \in \chi_i} (\phi(x) - m_i^\phi)(\phi(x) - m_i^\phi)^T \qquad (3)

with m_i^\phi = \frac{1}{l_i} \sum_{j=1}^{l_i} \phi(x_j^i). Eqn. (2) can be solved by formulating it in terms of dot products (\phi(x) \cdot \phi(y)) of the training patterns [6], which can then be evaluated using Mercer kernels (k(x, y) = (\phi(x) \cdot \phi(y))) [6]. As explained in [6], using kernel theory, (2) can be rewritten as

J(\alpha) = \frac{\alpha^T M \alpha}{\alpha^T N \alpha} \qquad (4)

where M = (M_1 - M_2)(M_1 - M_2)^T with

(M_i)_j = \frac{1}{l_i} \sum_{k=1}^{l_i} k(x_j, x_k^i), \qquad N = \sum_{j=1,2} K_j (I - 1_{l_j}) K_j^T \qquad (5)

K_j being the l \times l_j kernel matrix with (K_j)_{nm} = k(x_n, x_m^j), I the identity matrix, and 1_{l_j} the l_j \times l_j matrix with all entries 1/l_j,
and \alpha the vector of coefficients corresponding to the training patterns, such that w = \sum_{i=1}^{l} \alpha_i \phi(x_i) [6]. The optimum direction of projection can be found by taking the leading eigenvector of N^{-1} M. This approach is called the Kernel Fisher Discriminant (KFD) [6]. The projection of a new vector x onto w can be obtained by

(w \cdot \phi(x)) = \sum_{i=1}^{l} \alpha_i k(x_i, x) \qquad (6)
The proposed setting is ill-posed, since l-dimensional covariance structures are estimated from l samples, which can cause the matrix N to be non-positive [6]. The problem can be solved by adding a multiple of the identity matrix to N [6], such that

N_\mu = N + \mu I \qquad (7)
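The KFD computation of Eqns. (4)-(7) can be sketched compactly; the following is an illustrative implementation on synthetic data, not the authors' code, assuming an RBF Mercer kernel and a small ridge term μ:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """Gaussian (RBF) Mercer kernel matrix k(x, y) = exp(-gamma ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_fit(X1, X2, mu=1e-3):
    """Two-class Kernel Fisher Discriminant: return alpha and pooled training set."""
    X = np.vstack([X1, X2])
    l, (l1, l2) = len(X), (len(X1), len(X2))
    K1, K2 = rbf(X, X1), rbf(X, X2)            # l x l_j kernel matrices
    M1, M2 = K1.mean(1), K2.mean(1)            # (M_i)_j as in Eqn. (5)
    M = np.outer(M1 - M2, M1 - M2)
    N = sum(Kj @ (np.eye(lj) - np.full((lj, lj), 1.0 / lj)) @ Kj.T
            for Kj, lj in [(K1, l1), (K2, l2)])
    N += mu * np.eye(l)                        # N_mu = N + mu I, Eqn. (7)
    # Leading eigenvector of N^{-1} M maximises the Rayleigh quotient (4).
    vals, vecs = np.linalg.eig(np.linalg.solve(N, M))
    alpha = np.real(vecs[:, np.argmax(np.real(vals))])
    return alpha, X

def kfd_project(alpha, X_train, x):
    """Projection (w . phi(x)) = sum_i alpha_i k(x_i, x), Eqn. (6)."""
    return rbf(np.atleast_2d(x), X_train) @ alpha
```

New samples are classified by thresholding the projection, e.g. at the midpoint of the projected class means.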
4 Experimental Results and Validation
In this section, we present the experimental results obtained by testing the performance of linear Fisher and kernel Fisher on hippocampal shapes of 25 control subjects, 11 LATL and 14 RATL patients. Given the point sets for the left and right hippocampi of a healthy subject/patient, we begin with model fitting using the snake pedal model. The superimposed mesh of 21 x 40 on the model gives a new point set of size 840 x 3 (each point is in 3D). The fitted point sets are first registered globally using the ICP algorithm and the ellipsoid based technique. This is followed by local registration using the level-set method, which uses signed distance images of size 128 x 128 x 128 obtained by the Fast Marching method. The displacement field obtained for the fitted point sets is then used to form two types of shape based features: the first is called the sign of displacement, and the other the direction of the displacement vector. The sign of the displacement is defined as follows. Given the displacement vector for a point on the zero set of the source image, determine the cube in which the displaced point falls in the source image. Depending on the signs of the vertices of the enclosing cube (each vertex was assigned a +/- sign while forming the distance image), assign a sign to the magnitude of the displacement. The direction vector is obtained by finding the unit vector corresponding to the displacement vector at each point on the zero set. We have chosen this as our feature vector since we believe that the displacement vector direction allows us to capture the differences between the two classes. However, the issue of including the magnitude information of the displacement field at each point on the zero set is an important one, and we hope to investigate it in future work. The feature vector for the sign of displacement is of length 762, while the direction vector is of length 762 x 3 (each point has x, y, z components of displacement). These numbers derive from the fact that there are 840 points on the zero set and the first and last rows of the 21 x 40 mesh represent the north and south poles as described in [2]. In the present
study, we did not include feature pruning. Since the feature vector dimensionality far exceeds the number of retrospective patient studies, feature selection and pruning using principal component analysis (PCA) or related strategies can play an important role in improving the generalization performance. Our preliminary forays into feature pruning are very promising, and we plan to pursue this line of investigation vigorously in future work. The shape based results are also compared to those obtained by using the volume information only, with L/R and (L-R)/(L+R) as the feature vector, where L and R are the volumes of the left and right hippocampi.
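The sign-of-displacement feature described above can be sketched as follows; this is a simplified illustration (not the authors' code), where the sign rule over the eight enclosing voxel corners is paraphrased as a majority decision on their signed-distance values:

```python
import numpy as np

def sign_of_displacement(points, disp, signed_dist):
    """For each zero-set point, attach a sign to |displacement| based on the
    signs of the 8 voxel corners enclosing the displaced point."""
    feats = np.empty(len(points))
    for i, (p, d) in enumerate(zip(points, disp)):
        q = p + d                                        # displaced point
        i0 = np.clip(np.floor(q).astype(int), 0,
                     np.array(signed_dist.shape) - 2)
        # Eight corners of the enclosing cube in the signed-distance image.
        corners = signed_dist[i0[0]:i0[0] + 2,
                              i0[1]:i0[1] + 2,
                              i0[2]:i0[2] + 2]
        sign = 1.0 if corners.sum() >= 0 else -1.0
        feats[i] = sign * np.linalg.norm(d)
    return feats
```

Outward displacements (into the positive side of the signed distance) thus yield positive features, inward displacements negative ones.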
(a) Linear Fisher
(b) Linear Fisher
(c) Kernel Fisher
Fig. 1. CTRL vs Rest (a) Fvec: Volume based, (b) Fvec: Sign of displacement, (c) Fvec: Sign of displacement
Table 1. Controls vs Rest (Controls=24, Rest=25)

                 Volume              Sign of Displacement   Direction Vector
Classifier       Training  Testing   Training   Testing     Training  Testing
Linear Fisher    64.43%    61.22%    95.92%     87.76%      96.5%     85.71%
KF-Poly (d=2)    55.2%     55%       100%       91.84%      100%      95.9%
KF-RBF           64.93%    61.22%    100%       89.98%      100%      93.8%
Fig. 1 shows the results of linear Fisher and kernel Fisher (using a radial basis function kernel) with feature vectors based on volume and shape for controls vs patients. It can be seen from Fig. 1a that using just the volume information does not distinguish between the subjects who need surgery and those who do not. Fig. 1b and Fig. 1c show considerable improvement with the sign of displacement as the feature vector, in particular with kernel Fisher as the classifier. Plots using the direction vector are similar to those obtained using the sign of displacement. Table 1 summarizes the training set accuracy and the cross-validation accuracy using leave-one-out for the feature vectors and classifiers considered. Given the classification between controls and patients, the next task is to identify the side of focus. This is again done by comparing the shape features for RATL and LATL. Fig. 2 shows the separation between the two classes using volume and shape features. It is clear that volume (Fig. 2a) cannot distinguish between them easily. The sign of displacement using linear Fisher (Fig. 2b) also does not show a good separation. However, kernel Fisher with shape features
(Fig. 2c) is able to capture it much better. Table 2 summarizes the training and leave-one-out accuracy for RATL and LATL using volume and shape features.
(a) Linear Fisher
(b) Linear Fisher
(c) Kernel Fisher
Fig. 2. LATL vs RATL. (a) Fvec: Volume based, (b) Fvec: Sign of displacement, (c) Fvec: Sign of displacement

Table 2. LATL vs RATL (LATL=11, RATL=14)

                 Volume              Sign of Displacement   Direction Vector
Classifier       Training  Testing   Training   Testing     Training  Testing
Linear Fisher    66.88%    64%       74.88%     64%         88%       68%
KF-Poly (d=2)    66.88%    64%       100%       72%         100%      72%
KF-RBF           66.88%    64%       100%       72%         100%      68%
It can be seen that we are able to distinguish between controls and patients with much higher accuracy than between RATL and LATL. This can be due to various reasons. The shape differences among the pathologies may be highly correlated, making it difficult to separate them. Also, the number of data samples for the patients with pathology is quite small, which hinders a sufficient representation of the population. This is also reflected in the high training accuracy but low test accuracy among patients.
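The leave-one-out estimates reported in Tables 1 and 2 follow a generic recipe: each subject is held out in turn, the classifier is refit on the rest, and the held-out prediction is scored. A generic sketch with a stand-in nearest-class-mean classifier (not the authors' pipeline):

```python
import numpy as np

def leave_one_out_accuracy(X, y, fit, predict):
    """Leave-one-out cross-validation: refit on all-but-one, test on the one."""
    correct = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        model = fit(X[mask], y[mask])
        correct += predict(model, X[i:i + 1])[0] == y[i]
    return correct / len(X)

# Stand-in classifier: nearest class mean (illustration only).
def fit_means(X, y):
    return {c: X[y == c].mean(0) for c in np.unique(y)}

def predict_means(model, X):
    classes = list(model)
    d = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return [classes[j] for j in d.argmin(0)]
```

With only 25 patients, this per-subject refitting is what makes the reported testing accuracies meaningful despite the small sample.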
5 Discussion and Conclusion
Our entire approach is predicated on using shape-based features for discriminating between normal controls and subjects diagnosed with epilepsy as well as indicating the hemispheric location of epileptic focus in the patients. It should be noted that the work does not attempt to determine the precise coordinates of the focus of epilepsy in the patients. Since the feature vectors may not be linearly separable, we embarked upon a kernel Fisher strategy in which the patterns are first mapped to an infinite dimensional space before computing the Fisher discriminant. The choice of kernel is crucial for achieving good generalization. This issue requires much more empirical testing and validation in order to determine the best kernel for the task. Unfortunately, the deeper and more fundamental relationship between the feature vector density function and the
choice of the kernel mapping function cannot be empirically explored due to data being available only for a small number of subjects. The choice of feature vector is key to achieving good training and generalization performance. We have shown that the sign of the displacement vector can capture some of the shape differences between the two classes of subjects. Based on our empirical results, we conclude that control subjects and subjects of pathology can be discriminated using shape features. However, the same shape features are less successful in inter-hemispheric discrimination between subjects of pathology. We expect to improve the classification performance in this area by i) increasing the number of patient studies, ii) better feature selection and pruning and iii) improving the classifier.
Acknowledgment This research was in part funded by the NSF grant IIS-9811042 and NIH RO1RR13197.
References

1. Besl, P.J., McKay, N.D.: A Method for Registration of 3-D Shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2 (1992) 239-255
2. Vemuri, B.C., Guo, Y.: Snake Pedals: Compact and Versatile Geometric Models with Physics-based Control. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 5 (2000) 445-459
3. Gerig, G., Styner, M., Shenton, M.E., Lieberman, J.A.: Shape vs Size: Improved Understanding of the Morphology of Brain Structures. MICCAI (2001) 24-32
4. Csernansky, J.G., Joshi, S., Wang, L., Haller, J.W., Gado, M., Miller, J.P., Grenander, U., Miller, M.I.: Hippocampal morphometry in schizophrenia by high dimensional brain mapping. Neurobiology, Vol. 95, Issue 19 (1998) 11406-11411
5. Golland, P., Grimson, W.E.L., Shenton, M.E., Kikinis, R.: Small Sample Size Learning for Shape Analysis of Anatomical Structures. MICCAI, LNCS 1935 (2000) 72-82
6. Mika, S., Rätsch, G., Weston, J.: Fisher Discriminant Analysis with Kernels. Neural Networks for Signal Processing IX, IEEE (1999) 41-48
7. Vemuri, B., Ye, J., Chen, Y., Leonard, C.: A Level-set Based Approach to Image Registration. Workshop on Mathematical Methods in Biomedical Image Analysis, June 11-12 (2000) 86-93
8. Sethian, J.A.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision and Material Science. Cambridge University Press (1999)
A Noise Robust Statistical Texture Model

Klaus B. Hilger, Mikkel B. Stegmann, and Rasmus Larsen

Informatics and Mathematical Modelling, Technical University of Denmark (DTU), Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
{kbh,mbs,rl}@imm.dtu.dk
http://www.imm.dtu.dk

Abstract. This paper presents a novel approach to the problem of obtaining a low dimensional representation of the texture (pixel intensity) variation present in a training set after alignment using a Generalised Procrustes analysis. We extend the conventional analysis of training textures in the Active Appearance Models segmentation framework. This is accomplished by augmenting the model with an estimate of the covariance of the noise present in the training data. This results in a more compact model maximising the signal-to-noise ratio, thus favouring subspaces rich in signal but low in noise. Differences between the methods are illustrated on a set of left cardiac ventricles obtained using magnetic resonance imaging.
1 Introduction
Over the past few years, models capable of synthesising complete images of objects have proven very useful for interpreting images. One example is the Active Appearance Models (AAMs) [1,2]. Applications of AAMs include recovery and variation analysis of anatomical structures in medical images, such as magnetic resonance images (MRIs) [3], radiographs [4,5] and ultrasound images [6]. Images can be synthesised in many ways; e.g. [7] uses a linear combination of shape-compensated training images. To reduce dimensionality, AAMs use a principal component (PC) analysis of the training set to synthesise new images. By maximising variance only, the PC transform models any noise present in the training set along with the uncontaminated hidden image data. In this paper, we propose to extend the AAM framework by augmenting the image representation with noise characteristics. This is accomplished by applying the Minimum Noise Fraction (MNF) transformation [8]. The ancestor of AAMs, the Active Shape Models [9], have previously been extended by means of a variant of the MNF in the analysis of shapes, see [5]. Here, we extend this work to pixel intensities, henceforth denoted texture. The MNF extracts important, otherwise occluded information in the correlation structures of the data, and aims at obtaining a low dimensional model representation. As opposed to the PC transform, the MNF transform takes the spatial nature of the image into account. Whereas the PC transform only requires knowledge of the dispersion (covariance) matrix, the MNF transform requires an estimate of the dispersion matrix of the noise structure as additional information.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 444-451, 2002. © Springer-Verlag Berlin Heidelberg 2002
The MNF transform was originally proposed as a transformation for ordering multispectral data in terms of image quality with applications for noise removal. This paper is organised as follows. Section 2 summarises AAMs and describes the applied statistical models. Section 3 describes the data analysed, and Section 4 presents a comparative study of the PC and MNF. In Section 5 we summarise and give some concluding remarks.
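Once a noise dispersion estimate is available, the MNF basis can be computed as the generalized eigenvectors of the texture dispersion against the noise dispersion, ordering components by signal-to-noise ratio. An illustrative sketch on synthetic shape-free textures (not the authors' implementation; here the noise dispersion is simply assumed known):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)

# Synthetic shape-free textures: P training examples, M pixels each,
# built as a low-rank signal plus additive noise.
P, M = 50, 64
signal = rng.normal(size=(P, 4)) @ rng.normal(size=(4, M))
noise = 0.3 * rng.normal(size=(P, M))
textures = signal + noise

sigma = np.cov(textures, rowvar=False)                  # texture dispersion
sigma_noise = np.cov(noise, rowvar=False) + 1e-6 * np.eye(M)  # noise dispersion (assumed given)

# MNF: generalized eigenvectors of (sigma, sigma_noise); components with
# large eigenvalues carry a high signal-to-noise ratio and are kept when
# truncating to a compact model.
vals, vecs = eigh(sigma, sigma_noise)
mnf_basis = vecs[:, np.argsort(vals)[::-1]]
mnf_scores = textures @ mnf_basis
```

Truncating `mnf_basis` after the leading columns then discards the noise-dominated subspaces, which is the compactness argument made in the abstract.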
2 Methods
In the following, AAMs are summarised along with a description of the traditional AAM texture model, the PC transform, and the proposed alternative, the MNF transform.

2.1 Active Appearance Models
Active Appearance Models [1,2] establish a compact parameterisation of object variability, as learned from a training set by estimating a set of latent variables. From these quantities new images similar to the training set can be generated. Objects are defined by marking up each example with points of correspondence over the set, either by hand or by semi- to completely automated methods. Exploiting prior knowledge about the local nature of the optimisation space, these models can be rapidly fitted to unseen images, given a reasonable initialisation. Shape and texture variability is conventionally modelled by means of PC transforms. Let there be given P training examples for an object class, and let each example be represented by a set of N landmark points and M texture samples. The P shape examples are aligned to a common mean using a Generalised Procrustes analysis. The Procrustes shape coordinates are subsequently projected into the tangent plane of the shape manifold, at the pole denoted by the mean shape. The P textures are warped into correspondence using a suitable warp function and subsequently sampled from this shape-free reference. Typically, this geometrical reference is the Procrustes mean shape. Let s and t denote a synthesized shape and texture, and let \bar{s} and \bar{t} denote the corresponding means. New instances are now generated by adjusting the PC scores b_s and b_t in

s = \bar{s} + \Phi_s b_s, \qquad t = \bar{t} + \Phi_t b_t \qquad (1)
where \Phi_s and \Phi_t are eigenvectors of the shape and texture dispersions estimated from the training set. To regularise the model and improve speed and compactness, \Phi_s and \Phi_t are truncated, usually such that a certain amount of variance in the training set is explained. To obtain a combined shape and texture parameterisation, c, the values of b_s and b_t over the training set are combined into

b = \begin{bmatrix} W_s b_s \\ b_t \end{bmatrix} = \begin{bmatrix} W_s \Phi_s^T (s - \bar{s}) \\ \Phi_t^T (t - \bar{t}) \end{bmatrix}.    (2)

Notice that a suitable weighting between pixel distances and pixel intensities is done through the diagonal matrix W_s. To recover any correlation between shape and texture, a third PC transform is applied,
K.B. Hilger, M.B. Stegmann, and R. Larsen
b = \Phi_c c    (3)

obtaining the combined appearance model parameters, c, that generate new object instances by

s = \bar{s} + \Phi_s W_s^{-1} \Phi_{c,s} c, \qquad t = \bar{t} + \Phi_t \Phi_{c,t} c, \qquad \Phi_c = \begin{bmatrix} \Phi_{c,s} \\ \Phi_{c,t} \end{bmatrix}.    (4)

The object instance, (s, t), is synthesised into an image by warping the pixel intensities of t into the geometry of the shape s. Given a suitable measure of fit, the model is matched to an unseen image using an iterative updating scheme based on a fixed Jacobian estimate [10,11] or a reduced rank regression [2].

2.2
Principal Components Transformation
Consider a set of P texture vectors \{t_i\}_{i=1}^{P} laid out as a set of P shape-free images with grey levels r_i(x), i = 1, ..., P, where x is the coordinate vector denoting the grid point of the sample. Let r(x) = [r_1(x) \cdots r_P(x)]^T and assume first and second order stationarity, i.e. E\{r(x)\} = 0 and D\{r(x)\} = \Sigma. The PC transformation thus chooses P linear transformations z_i(x) = a_i^T r(x), i = 1, ..., P, such that the variance of z_i(x) is maximum among all linear transforms orthogonal to z_j(x), j = 1, ..., i-1. The variance in the ith PC is given by

Var\{a_i^T r\} = \lambda_i = a_i^T \Sigma a_i.    (5)

We see that the basis for the PCs is identified as the conjugate eigenvectors of the dispersion matrix. Let \lambda_1 \geq \cdots \geq \lambda_P \geq 0 be the eigenvalues with the corresponding conjugate eigenvectors A = [a_1 \cdots a_P]. Above, the PC problem is solved in Q-mode. Using the Eckart-Young theorem, the R-mode solution becomes \Phi_t = R^T \Lambda^{-1/2} A, where R = [r_1 \cdots r_M] with r_j containing spatially corresponding intensities over the training set, and \Lambda a diagonal matrix of the eigenvalues.

2.3
Minimum Noise Fractions Transformation
Consider the random signal variable r(x) from above. Assuming that an additive noise structure applies, r(x) = s(x) + n(x) with Corr\{s(x), n(x)\} = 0, the dispersion structure can be separated into

D\{r(x)\} = \Sigma = \Sigma_s + \Sigma_n.    (6)

The Minimum Noise Fractions transformation chooses P linear combinations z_i(x) = a_i^T r(x), i = 1, ..., P, which maximise the signal-to-noise ratio (SNR) for the ith component,

SNR_i = \frac{V\{a_i^T s(x)\}}{V\{a_i^T n(x)\}} = \frac{a_i^T \Sigma a_i}{a_i^T \Sigma_n a_i} - 1,    (7)

and the problem reduces to solving a generalized eigenproblem, \Sigma a_i = \lambda_i \Sigma_n a_i. Let \lambda_1 \geq \cdots \geq \lambda_P be the eigenvalues of \Sigma with respect to \Sigma_n with the corresponding conjugate eigenvectors a_1, ..., a_P. Then z_i(x) is the ith MNF. A high
order component has a high noise fraction and thus little signal. A low order component has a high SNR, hence the name Minimum Noise Fraction transform. The central issue in obtaining good MNF components is the estimation of the dispersion matrix of the noise. Using the difference between a pixel and its neighbours as a noise estimate, MNF maximises the spatial autocorrelation. Let \Delta^T = [\Delta_1 \Delta_2] represent a spatial shift. Introducing \Sigma_\Delta = D\{r(x) - r(x + \Delta)\}, which, when considered as a function of \Delta, is a multivariate variogram, and assuming a proportional covariance model [12], the covariance of the noise can be estimated by \Sigma_n = \Sigma_\Delta / 2. When the covariance structure for the noise is proportional to the identity matrix, the MNF transform reduces to the PC transform. In [13] several other models for estimating image noise are presented. When maximising autocorrelation, the MNF analysis qualifies as an Independent Components Analysis (ICA) similar to the Molgedey-Schuster algorithm [14], see [5]. A comparative study of the PC and MNF transforms can be found in [15,16].
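Concretely, the MNF basis described above can be sketched in a few lines; this is a minimal numpy illustration on synthetic data (the band count, grid size, and noise level are arbitrary assumptions, and Cholesky whitening, which assumes a positive-definite \Sigma_n, stands in for any generalized eigensolver):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the P shape-free texture images: 6 "bands" on a 32x32 grid
# with a smooth signal plus white noise (illustrative data, not the cardiac MRIs).
P, H, W = 6, 32, 32
x = np.linspace(0.0, 2.0 * np.pi, W)
signal = np.stack([np.tile(np.sin((i + 1) * x), (H, 1)) for i in range(P)])
r = signal + 0.3 * rng.normal(size=(P, H, W))
r_flat = r.reshape(P, -1)

Sigma = np.cov(r_flat)                        # total dispersion Sigma

# Noise dispersion from one-pixel horizontal differences: Sigma_n ~ Sigma_Delta / 2,
# following the proportional covariance model mentioned in the text.
d = (r[:, :, 1:] - r[:, :, :-1]).reshape(P, -1)
Sigma_n = np.cov(d) / 2.0

# Generalized eigenproblem Sigma a = lambda Sigma_n a, solved by Cholesky
# whitening (assumes Sigma_n is positive definite).
L = np.linalg.cholesky(Sigma_n)
Li = np.linalg.inv(L)
lam, V = np.linalg.eigh(Li @ Sigma @ Li.T)
order = np.argsort(lam)[::-1]                 # sort by descending SNR
A = Li.T @ V[:, order]                        # MNF basis vectors a_1 ... a_P
snr = lam[order] - 1.0                        # eq. (7): SNR_i = lambda_i - 1
```

When the estimated noise covariance is proportional to the identity, the resulting basis coincides (up to scaling) with the PC basis, matching the remark in the text.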
3
Data
Short-axis, end-diastolic cardiac MRIs were selected from 28 subjects. MRIs were acquired using a whole-body MR unit (Siemens Impact) operating at 1.0 Tesla. The chosen MRI slice position represented low and high morphologic complexity, as judged by the presence of papillary muscles. Images were acquired using an ECG-triggered, breath-hold, fast low angle shot (FLASH) cinematographic pulse sequence. Slice thickness: 10 mm; field of view: 263×350 mm; matrix: 256×256. The endocardial and epicardial contours of the left ventricle were annotated manually by placing 33 landmarks along each contour, see Figure 3.
4
Results and Discussion
Noise is added to the training data to simulate different SNRs, i.e. varying MRI quality due to inter-patient and inter-operator variation, etc. This is done in order to examine the robustness of the texture representation in the MNF basis compared to the PC basis. Gaussian noise is applied with a standard deviation randomly chosen to produce training images with an SNR down to 6 dB. This knowledge of the noise structure is not used in the subsequent analyses.

4.1
Learning Based Image Representation
To examine the robustness of the MNF transform, 101 leave-one-out studies were carried out: one on the uncorrupted and 100 on the noise-degraded shape-free sets of 28 MRIs. Results of the cross-validation analyses are presented in Figure 1. The left plot corresponds to uncorrupted MRIs and the right to a randomly chosen analysis on a degraded training set. The curves with o/x symbols mark the performance of the MNF/PC models and give the mean squared texture error (MSE) as a function of the model rank. For the noise-free scenario, the performance of the MNF and the PC transform is very similar. Notice, however, that the MNF is better for almost all numbers of modes. The general trend for the noise-degraded data is reflected in the MSE curves in Figure 1 (right). The MNF
and PC are competing for low rank models, but for an intermediate number of modes the MNF outperforms the PC transform. The MNF thus does a better job of separating important signal from noise in the training data. Figure 2 shows the PC and the MNF eigenvectors (the columns of \Phi_t) computed from the mean-shape-aligned, noise-degraded cardiac data of the 28 subjects, for which the leave-one-out texture representation curve in Figure 1 (right) was generated. All images in Figure 2 are stretched between the mean ±3 std. The top four rows correspond to the PC eigenvectors and the four bottom rows to the MNFs. The components are ordered row-wise according to the amount of variance/SNR they explain. The last component in both cases shows the mean texture sample, \bar{t}. Notice that the MNF gives a better ordering of components in terms of texture quality. A higher degree of speckle noise is present in all PC components compared to the MNF components. Moreover, the last components of the PC analysis appear to include a relatively higher amount of auto-correlated signal. This explains the better performance of the MNF representation in the cross-validation study.
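The leave-one-out representation experiment can be sketched as below; the texture matrix here is synthetic (its shape and spectrum are arbitrary assumptions), and a PC basis is used — an MNF basis would drop in by replacing the SVD step with the generalized eigendecomposition of the previous section:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the 28 shape-free texture samples.
P, M = 28, 400
T = rng.normal(size=(P, M)) @ np.diag(np.linspace(3.0, 0.1, M))

def loo_mse(T, max_modes):
    """Leave-one-out texture reconstruction MSE versus model rank,
    mirroring the curves of Figure 1 (here with a PC basis)."""
    P = T.shape[0]
    errs = np.zeros(max_modes)
    for i in range(P):
        train = np.delete(T, i, axis=0)
        mu = train.mean(axis=0)
        # R-mode PCA of the centred training matrix via SVD
        _, _, Vt = np.linalg.svd(train - mu, full_matrices=False)
        for k in range(1, max_modes + 1):
            Phi = Vt[:k].T                      # first k texture modes
            recon = mu + Phi @ (Phi.T @ (T[i] - mu))
            errs[k - 1] += np.mean((T[i] - recon) ** 2)
    return errs / P

mse = loo_mse(T, max_modes=10)                  # non-increasing in the rank
```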
Fig. 1. Leave-one-out study on cardiac MRIs. Without noise (left). With noise (right).
4.2
Cardiac Segmentation
Hitherto, the PC and the MNF transforms have been evaluated w.r.t. representation. To assess the transforms' capabilities in a de facto segmentation setting, a cross-validation study was carried out on the cardiac data set. To maximise the effective size of the training set, validation was performed using a leave-one-out evaluation on the set of 28 short-axis cardiac MRIs. A total of 56 AAMs were built on noise-contaminated versions of the 28 cardiac
Fig. 2. PC (top) and MNF (bottom) decomposition of noise degraded cardiac MRIs.
Fig. 3. Example annotation of the left ventricle using 66 landmarks (left). Segmentation result on noise contaminated cardiac MRI (right).
MRIs, i.e. 28 PC AAMs and 28 MNF AAMs. In both transforms, the 14 largest texture modes were included in the models. This model rank was chosen as half the maximum basis size, producing a cut-off point where an average of 85% of the total amount of variation is explained. Each model was initialised on the image that was left out, in its mean configuration (i.e. mean shape and mean texture), displaced ±8 pixels from the ground truth centre in image coordinates. From this position the AAM search was started. Refer to Figure 3 (right) for an example segmentation. Two performance measures were evaluated: normalised texture MSE and mean point-to-point distance between corresponding landmarks of the model and the ground truth. Segmentations with a pt.-pt. distance larger than ten pixels were deemed outliers and removed. The PC/MNF AAMs yielded a mean normalised MSE of 3.55 ± 3.35 / 3.43 ± 2.67 and a pt.-pt. landmark error of 5.03 ± 1.60 / 4.79 ± 1.51 pixels, respectively. In the two PC/MNF runs, 2 / 1 outliers were removed. Thus, a modest improvement in both performance measures and their corresponding uncertainties is observed for the MNF AAMs. Notice the rather high MSE standard deviations, due to the large inhomogeneity in the noise characteristics.
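The evaluation protocol above (pt.-pt. landmark error with a ten-pixel outlier cut-off) can be sketched as follows; the landmark arrays in the example are hypothetical placeholders, not the cardiac annotations:

```python
import numpy as np

def pt_pt_error(model_pts, truth_pts):
    """Mean point-to-point distance between corresponding landmarks (pixels)."""
    return float(np.mean(np.linalg.norm(model_pts - truth_pts, axis=1)))

def summarize(results, outlier_thresh=10.0):
    """Drop searches whose pt.-pt. error exceeds the threshold, then report
    mean and std over the remaining cases, as done for the AAM searches."""
    errs = np.array([pt_pt_error(m, t) for m, t in results])
    kept = errs[errs <= outlier_thresh]
    return kept.mean(), kept.std(), int((errs > outlier_thresh).sum())

# Hypothetical example: two moderate fits and one outlier (66 landmarks each).
truth = np.zeros((66, 2))
results = [(truth + 1.0, truth), (truth + 2.0, truth), (truth + 20.0, truth)]
mean_err, std_err, n_outliers = summarize(results)
```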
5
Conclusion
We have shown that a more compact representation of texture can be obtained by extending the PC to the MNF transformation in the AAM framework. The novel approach shows better performance in leave-one-out representation studies both on original and on noise-degraded cardiac MRIs. Thus, by separating important signal from noise, the MNF transform generalises better than the PC transform. The MNF texture representation was applied in a leave-one-out AAM segmentation study in comparison to a conventional PC basis of equal rank. Even though the MNF extension only affects the texture and not the shape representation, and the texture model rank is chosen relatively high compared to the amount of noise present in the data, improvements in both landmark and texture error and the corresponding uncertainties are observed for the MNF AAMs. In contrast to the PC analysis, the new approach, by maximising SNR, is invariant to linear transformations such as scaling of the individual components in the training set. As a consequence, the MNF decomposition is expected to be useful in future AAM studies involving data fusion of multiple features of a different nature measured at different scales. This includes derived physiological measures, textural quantities, and multiple imaging modalities. Moreover, the MNF analysis can in itself be applied as a data-driven method probing for uncorrelated modes of biological variation in non-Euclidean space, and thus constitutes a useful tool in exploratory analysis of medical data.
Acknowledgments MRIs were provided by M.D., Jens Chr. Nilsson and M.D., Bjørn A. Grønning, Danish Research Centre of Magnetic Resonance, H:S Hvidovre Hospital.
References
1. Edwards, G., Taylor, C.J., Cootes, T.F.: Interpreting face images using active appearance models. In: Proc. 3rd IEEE Int. Conf. on Automatic Face and Gesture Recognition, IEEE Comput. Soc (1998) 300–305
2. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. In: Proc. European Conf. on Computer Vision. Volume 2, Springer (1998) 484–498
3. Mitchell, S., Lelieveldt, B., van der Geest, R., Bosch, H., Reiber, J., Sonka, M.: Multistage hybrid active appearance model matching: segmentation of left and right ventricles in cardiac MR images. IEEE Transactions on Medical Imaging 20 (2001) 415–423
4. Stegmann, M.B., Fisker, R., Ersbøll, B.K.: Extending and applying active appearance models for automated, high precision segmentation. In: Proc. 12th Scandinavian Conf. on Image Analysis. Volume 1 (2001) 90–97
5. Larsen, R., Eiriksson, H., Stegmann, M.B.: Q-MAF shape decomposition. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2001, 4th International Conference, Utrecht, The Netherlands. Volume 2208 of Lecture Notes in Computer Science, Springer (2001) 837–844
6. Bosch, H., Mitchell, S., Lelieveldt, B., Nijland, F., Kamp, O., Sonka, M., Reiber, J.: Active appearance-motion models for endocardial contour detection in time sequences of echocardiograms. Proceedings of SPIE 4322 (2001) 257–268
7. Jones, M., Poggio, T.: Multidimensional morphable models: a framework for representing and matching object classes. International Journal of Computer Vision 29 (1998) 107–131
8. Green, A.A., Berman, M., Switzer, P., Craig, M.D.: A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing 26 (1988) 65–74
9. Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.: Active shape models – their training and application. Computer Vision and Image Understanding 61 (1995) 38–59
10. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 681–685
11. Cootes, T.F., Taylor, C.J.: Statistical Models of Appearance for Computer Vision. Technical Report, University of Manchester (2001)
12. Switzer, P., Green, A.A.: Min/max autocorrelation factors for multivariate spatial imagery. Technical Report 6, Department of Statistics, Stanford University (1984)
13. Olsen, S.I.: Estimation of noise in images: an evaluation. Graphical Models and Image Processing 55 (1993) 319–323
14. Molgedey, L., Schuster, H.G.: Separation of a mixture of independent signals using time delayed correlations. Physical Review Letters 72 (1994) 3634–3637
15. Nielsen, A.A.: Analysis of Regularly and Irregularly Sampled Spatial, Multivariate, and Multi-temporal Data. PhD thesis, Department of Mathematical Modelling, Technical University of Denmark, Lyngby (1994)
16. Hilger, K.B.: Exploratory Analysis of Multivariate Data, Unsupervised Image Segmentation and Data Driven Linear and Nonlinear Decomposition. PhD thesis, Informatics and Mathematical Modelling, Technical University of Denmark, Kgs. Lyngby (2001) 186 pp.
A Combined Statistical and Biomechanical Model for Estimation of Intra-operative Prostate Deformation

Ashraf Mohamed1,2, Christos Davatzikos1,2, and Russell Taylor1

1 CISST NSF Engineering Research Center, Department of Computer Science, Johns Hopkins University
[email protected], http://cisstweb.cs.jhu.edu/
2 Center for Biomedical Image Computing, Department of Radiology, Johns Hopkins University School of Medicine
{ashraf,hristos}@rad.jhu.edu, http://cbic.jhoc1.jhmi.edu/
Abstract. An approach for estimating the deformation of the prostate caused by transrectal ultrasound (TRUS) probe insertion is presented. This work is particularly useful for brachytherapy procedures, in which planning for radioactive seed insertion is performed on preoperative scans, and significant deformation of the prostate can occur during the procedure. The approach makes use of a patient-specific biomechanical model to run simulations of TRUS probe insertion, extract the main modes of deformation of the prostate, and use this information to establish a deformable registration between 2 orthogonal cross-sectional ultrasound images and the preoperative prostate. In the work presented here, the approach is tested on an anatomy-realistic biomechanical phantom of the prostate, and results are reported for 5 test simulations. More than 73% of the maximum deformation of the prostate was recovered, with the estimation error mostly attributed to the relatively small number of biomechanical simulations used for training.
1
Introduction
Transrectal ultrasound (TRUS) guided brachytherapy is one of the common therapy alternatives for prostate cancer. The goal of the procedure is to insert a number of radioactive seeds at specific locations into the prostate tissue by using surgical needles. The locations and number of seeds within the prostate gland are decided by means of a surgical planning software that makes use of preoperative volumetric scans of the prostate, typically CT or MRI. During brachytherapy, several factors can cause deformation of the prostate gland from its preoperative shape. These factors include insertion of the ultrasound probe inside the rectum, insertion of the surgical needles, edema, and change in the patient's posture between the preoperative and the intraoperative conditions [1]. This deformation of the prostate from the preoperative condition induces uncertainties in the radioactive seed insertion locations, which are the result of planning on preoperative images. Thus, deformation of the prostate can

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 452–460, 2002.
© Springer-Verlag Berlin Heidelberg 2002
A Combined Statistical and Biomechanical Model
453
affect the dose distribution in and around the prostate and therefore adversely affect the outcome of the procedure [2]. The goal of the work reported here is to describe a framework for estimating the deformation of the prostate based on sparse data from 2D ultrasound images that can be obtained during a typical brachytherapy procedure. Such an estimator can update the preoperative plan to account for deformations, thereby reducing the uncertainty in the radioactive seed insertion locations. Our approach builds upon [3], in which a general framework for statistical estimation of intra-operative deformations was presented. The problem of estimating the deformed shape of the prostate at each probe location can therefore be cast as a deformable 2D/3D registration problem [4], i.e. registration of a 3D preoperative image to two cross-sections of the deformed volume of the same patient. To obtain the main modes of deformation of the prostate under TRUS probe insertion, a patient-specific biomechanical model is constructed from segmented preoperative images. The main modes of deformation are extracted by performing Principal Component Analysis (PCA) on a small number of deformed shapes resulting from simulations of TRUS probe insertion run on the biomechanical model. Each of the simulations corresponds to specific insertion angles and a specific insertion depth of the TRUS probe. To further simplify the estimator, we derive an analytic representation of the coefficients of the principal modes of deformation as a function of the probe insertion angles and insertion depth. Our goal is to develop a fast, statistically based model that can be used in real time to track deformations. This model is trained on computationally intensive biomechanical simulations, which are performed preoperatively. In Section 2, the construction of the prostate phantom, the deformable prostate model, and the estimator are described.
In the preliminary work of this paper, the proposed approach is tested on 5 biomechanical simulations that were not used for training. Simulated ultrasound prostate contours are obtained from the deformed prostate and are used by the estimator to recover the deformed shapes. The results reported in Section 3 indicate good accuracy in the estimation of deformed shapes. In Section 4, we discuss how the current work can be extended to deal with real patient data and with different subjects.
2
Methods
In this section, the construction of the estimator of the deformed prostate shapes is detailed. First, the biomechanical model used for simulation of prostate deformation is described. This biomechanical model is patient-specific and is constructed from the patient's segmented preoperative CT or MRI scan. In the work presented here, a biomechanical model of an anatomy-realistic prostate phantom is used instead. This provides a means for validating the estimates of the deformed prostate by comparing them to the true deformed shapes, which are not usually available for real patients unless intra-operative imaging is used. We use the patient-specific biomechanical model to run a
number of biomechanical simulations with different insertion depths and entry angles of the TRUS probe, thereby constructing a number of deformed shapes of the same prostate. The simulations include the entry angles of the probe to account for misalignment between the axes of the rectum and the probe. From the simulated deformed shapes, we extract the principal modes of deformation for that prostate under TRUS probe insertion. Noting the dependency between the modes of deformation and the parameters of the simulations (the insertion depth and angles of the TRUS probe), a functional approximation is sought by fitting a 3rd-degree Bernstein polynomial. Any deformed prostate shape can therefore be described in terms of the principal modes of deformation and the corresponding principal components, which are in turn directly related to the insertion angles and insertion depth of the probe. From transaxial and sagittal sections of the prostate obtained through the TRUS probe, a deformable 2D/3D registration is established between the estimated deformed prostate shapes and the images obtained intra-operatively.

2.1
Biomechanical Model
A patient-specific biomechanical model is needed for estimation of the range of deformations of the prostate. Finite element biomechanical meshes can be automatically generated from segmented images of the patient (e.g. [5,6]). For the model to capture the deformation of the prostate accurately, it should include structures such as the prostate, surrounding tissues, the rectum, and the surrounding bony structures (sacrum and pubic arch) that control the boundary conditions. In the preliminary work reported here, we used an anatomy-realistic 3D phantom of the prostate and the surrounding structures for the reasons stated earlier. A side view and a 3D rendered view of the phantom are shown in Figure 1. The phantom is composed mainly of a block of soft tissue of dimensions 12 cm × 16 cm × 12 cm. The prostate is modeled as an egg-shaped structure of dimensions 3 cm × 3 cm × 3.5 cm. The rectum is modeled as a straight cylinder with a circular cross section of radius 0.5 cm that runs 0.25 cm below the lower surface of the prostate. The surfaces of the sacrum and pubic arch were generated using a spline curve extruded in 3D. The sacrum and the pubic arch are assumed to be pinned, and therefore they define the boundary conditions for the problem. No other boundary conditions are imposed on any other structure. A tetrahedral mesh was automatically generated for the prostate and surrounding soft tissue within the Abaqus CAE Finite Element (FE) environment [7], which is also used for solving the biomechanical FE model. Even though there is evidence that most soft tissues exhibit non-linear material behavior, a linear material model has been used in many studies dealing with the biomechanical behavior of soft tissue (e.g. [5]). A linear material model is typically chosen because it produces results faster than a non-linear material model and is easier to implement. The values of material parameters (e.g. Young's modulus and Poisson's ratio for a linear material model) vary widely from one tissue type to another and from one person to another for the same soft tissue type, especially in the presence of tissue anomalies such as cancer. To our
Fig. 1. The biomechanical prostate phantom. (a) 2D profile. (b) 3D wireframe with no-displacement boundary conditions imposed on the sacrum and the pubic arch.
knowledge, the material parameter values for the in-vivo human prostate have not been determined experimentally or otherwise. In [1], a linear material model was used for the prostate with different stiffness values for the central gland and the peripheral zone. A linear material model is only valid for small deformations and therefore offers limited accuracy in problems that involve large deformation. In the work presented here, since the expected deformation is large, we used a homogeneous Mooney-Rivlin non-linear material model for the prostate tissue with an initial Young's modulus (stiffness) value E = 2 kPa and an almost incompressible behavior. This is consistent with the values used in [1] for the peripheral zone of the prostate. For the soft tissue surrounding the prostate, a Mooney-Rivlin material model was also assumed, but with an initial stiffness 10 times as large as that assumed for the prostate tissue. Such values of the material parameters produced deformations that are consistent with deformations observed in real TRUS images. Recently, Magnetic Resonance Elastography has been proposed for in-vivo estimation of material parameter values [8]. If accurate patient-specific material parameter values are known, they can be used directly in the model. In Section 4 we discuss how our approach can be extended to deal with deformations even if the material parameters are not known accurately, but are known to lie in a certain range. It is important to note that we do not need exact knowledge of the elastic parameters, since our goal is to develop a statistical prior model that will follow the actual deformation in TRUS images rather than fully predict the deformation of the prostate. During TRUS-guided prostate brachytherapy, the ultrasound probe is inserted at increasing depths with known constant displacements in between. This causes dilation of the rectum and exertion of pressure on the surrounding tissues, including the prostate.
The displacement of the probe along its axis measured from the start of the rectum as a reference point is denoted by the variable u. The angles φ2 and φ3 denote the rotation angles around the 2nd and 3rd coordinate axes respectively (see Figure 1) and are referred to here as the entry angles of the probe.
Fig. 2. The mean shape of the statistical model is shown in the middle, with added -3 standard deviations (left) and +3 standard deviations (right) of the 2nd mode of the deformation.
2.2
Prostate Deformable Model
For training purposes of the deformable statistical model of the prostate, simulations of TRUS probe insertion with entry angles spanning the range −4 to 4 degrees were performed. A total of 25 such simulations were performed, each with 9 corresponding probe displacements that simulate imaging of the whole prostate in 2D cross sections. Successive probe displacements were 0.5 cm apart, which is consistent with the stepping in available TRUS systems used for brachytherapy. A total of 225 deformed prostate shapes were therefore available from these simulations. For each simulation, the coordinates of the finite element node locations of each of the deformed shapes were assembled into a vector q that represents the deformed shape. Principal Component Analysis (PCA) [9] was performed on the deformed prostate shapes to obtain the main modes of deformation of the prostate. Therefore, any of the simulated deformed shapes can be approximated by

q = \mu + \sum_{i=1}^{M} \alpha_i x_i    (1)
where \mu is the mean shape of the deformed prostate, x_i are the principal modes of deformation, \alpha_i are the expansion coefficients, and M is the number of retained modes of deformation. More than 99% of the variation in the training samples was explained by the first 6 modes of deformation alone, and therefore M = 6 was used for the results reported in this work. Some of the modes of deformation were highly correlated with the physical parameters of the biomechanical simulations (modes 1, 2, and 4). In Figure 2, the second mode of deformation of the prostate is shown. The second principal component correlates well (correlation coefficient of 0.88) with the displacement of the TRUS probe, u. Similarly, modes 1 and 4 correlated highly (correlation coefficients ≥ 0.67) with \phi_2 and \phi_3, respectively. Given this observation, a functional relationship was assumed between the principal components of the deformation and the biomechanical simulation parameters, i.e.
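The construction of this statistical deformation model can be sketched as follows; the shape matrix is synthetic low-rank data standing in for the 225 FE node-coordinate vectors, and the 99% variance cut-off follows the text:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for the 225 deformed-shape vectors q (FE node coordinates):
# low-rank structure plus a little noise.
n_shapes, n_coords = 225, 300
Q = rng.normal(size=(n_shapes, 5)) @ rng.normal(size=(5, n_coords))
Q += 0.01 * rng.normal(size=(n_shapes, n_coords))

mu = Q.mean(axis=0)
_, s, Vt = np.linalg.svd(Q - mu, full_matrices=False)
var = s ** 2 / (n_shapes - 1)

# Retain the smallest M explaining at least 99% of the variance (paper: M = 6).
M = int(np.searchsorted(np.cumsum(var) / var.sum(), 0.99) + 1)
X = Vt[:M].T                                  # principal modes x_i (columns)
alpha = (Q - mu) @ X                          # expansion coefficients alpha_i

# Eq. (1): approximate a deformed shape from its coefficients.
q_hat = mu + alpha[0] @ X.T
```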
\alpha_i = f_i(u, \phi_2, \phi_3), \qquad 1 \leq i \leq M.    (2)
Linear least squares fitting was used to approximate each f_i by fitting 3rd-degree Bernstein polynomials [10] for each of the principal components in terms of the simulation parameters. Therefore, a deformed shape is related to the biomechanical simulation parameters by

q = G(u, \phi_2, \phi_3) = \mu + \sum_{i=1}^{M} f_i(u, \phi_2, \phi_3) x_i    (3)
To evaluate the error introduced by fitting Bernstein polynomials for the functions f_i of equation (3), the true deformed shapes that resulted from biomechanical simulations were compared to the deformed shapes constructed by equation (3) for the same simulation parameter values. The maximum error was 0.09 cm, while the maximum deformation encountered in the simulated shapes was 0.7 cm. Therefore, a maximum error of 12.9% was introduced in the training samples by the approximation of equation (3) and the use of a finite number of deformation modes (M = 6).

2.3
Estimation of Deformed Shapes
During prostate brachytherapy, and before inserting any radioactive seeds, a number of 2D ultrasound images are usually obtained to cover the whole prostate. The displacements between the locations at which the images are obtained are known, since a mechanical stepper is typically used to advance the TRUS probe. The goal is to estimate the deformed shape of the prostate at each of those probe locations. If the known displacement between consecutive probe locations is denoted by u, then the deformed shapes are given by

q_j = G(u_o + u(j-1), \phi_{2o}, \phi_{3o}), \qquad 1 \leq j \leq K,    (4)
where K is the number of probe locations, u_o is the displacement for the first probe location, and \phi_{2o} and \phi_{3o} are the insertion angles of the probe. Thus, if u_o, \phi_{2o}, and \phi_{3o} were known, then the whole set of deformed shapes at the different locations of the probe would be available. In the work presented here, it is assumed that at each location of the probe, 2 orthogonal images of the prostate are available. These images are readily obtained by most TRUS probes currently in use for brachytherapy. From each image, coordinates of points on the surface of the deformed prostate can be extracted using manual or automatic outlining. Let the 3D coordinates of the points obtained at the jth location of the probe, relative to the ultrasound crystal, be denoted by V_j, where 1 \leq j \leq K. A coordinate frame transformation relates the coordinate frame of the crystal to the coordinate frame in which the simulations were performed. This transformation can be computed in terms of the geometry of the probe, the parameters u, \phi_2, and \phi_3, and t_o, an unknown translation between the coordinate frames. Let this frame transformation for the jth location
of the probe be denoted by T_j. Given an estimate of T_j, let V_j^t denote the points V_j transformed into the simulation coordinate frame by T_j. Also, let the sum of squared distances between the points V_j^t and their closest corresponding points on the deformed surface q_j be denoted by E_j(u_o, \phi_{2o}, \phi_{3o}, t_o). Therefore, we seek the values \hat{u}_o, \hat{\phi}_{2o}, \hat{\phi}_{3o}, and \hat{t}_o that minimize the sum of squared errors:

(\hat{u}_o, \hat{\phi}_{2o}, \hat{\phi}_{3o}, \hat{t}_o) = \arg\min \sum_{j=1}^{K} E_j(u_o, \phi_{2o}, \phi_{3o}, t_o)    (5)

Using \hat{u}_o, \hat{\phi}_{2o}, and \hat{\phi}_{3o} in equation (4) yields the estimates of the deformed shapes, \hat{q}_j, 1 \leq j \leq K. The optimization problem is solved using the Nelder-Mead non-linear optimization method [11] from within the Matlab environment. Similar to the approach in [4], the optimization for the parameters u_o, \phi_{2o}, \phi_{3o}, and t_o is performed in 2 different alternating steps for deformable 2D/3D registration and pure translation.
3 Results
Five simulations of TRUS probe insertion were performed at values of u_o, φ_2o, and φ_3o that differed from those used for training but lay within the range of the training values. A pair of orthogonal simulated TRUS prostate image contours was generated at each probe location. The estimator described above was then used to obtain the deformed shape of the prostate at each probe location. We computed the estimation error, defined as the difference between the estimated deformed prostate shape and the true deformed shape obtained by biomechanical simulation:

\[ \hat{e}_j = \hat{q}_j - q_j, \quad 1 \le j \le K \tag{6} \]

We also computed the reconstruction error for the deformed shapes, defined as the difference between a deformed shape and its best possible reconstruction in the space spanned by the retained principal modes of deformation:

\[ \check{e}_j = \check{q}_j - q_j, \quad 1 \le j \le K \tag{7} \]

where \(\check{q}_j = \mu + \sum_{i=1}^{M} \check{\alpha}_i x_i\) and the \(\check{\alpha}_i\) are obtained by projecting the deformed shape q_j onto the orthogonal principal modes x_i. The estimation error can therefore be decomposed into two orthogonal components [12]:

\[ \hat{e}_j = \check{e}_j + \tilde{e}_j \tag{8} \]

The reconstruction error ě_j is due to the inability to represent the deformed shape q_j as the sum of the mean and a linear combination of the principal modes of deformation, while the error ẽ_j is due to the inability to estimate the deformed shape perfectly from the 2D information provided by the TRUS images, and to the approximation of equation (3). The maximum estimation error and reconstruction error for each of the simulations are shown in Figure 3. In the worst test case (case number 4), the maximum estimation error was 26.7% of the maximum
A Combined Statistical and Biomechanical Model
459
Fig. 3. The computed maximum estimation error, reconstruction error, and deformation of the prostate for 5 different simulations of TRUS probe insertion.
deformation encountered in this simulation. However, the reconstruction error accounted for more than 57% of the estimation error in this case. The availability of more training samples obtained from additional biomechanical simulations would reduce this error, at the expense of an increased computational burden.
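The projection onto the retained principal modes that underlies equations (6)-(8) can be sketched as follows; the dimension, mean shape, and modes here are synthetic stand-ins, not the paper's training data:

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 30, 3                        # flattened shape dimension, retained modes
mu = rng.normal(size=D)             # mean shape (synthetic)
X, _ = np.linalg.qr(rng.normal(size=(D, M)))   # orthonormal modes x_i (columns)

def reconstruct(q, mu, X):
    """Best reconstruction of q in span{x_i}: q_check = mu + X X^T (q - mu)."""
    alpha = X.T @ (q - mu)          # coefficients alpha_i by projection
    return mu + X @ alpha

q = mu + X @ np.array([1.0, -0.5, 2.0]) + 0.01 * rng.normal(size=D)
q_check = reconstruct(q, mu, X)
e_check = q_check - q               # reconstruction error of eq. (7)
```

Because q_check is an orthogonal projection, e_check is orthogonal to every retained mode, which is exactly why the decomposition of equation (8) has orthogonal components.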
4 Summary and Future Work
We presented an approach that combines biomechanical and statistical modeling to estimate the shape of the prostate as it deforms during TRUS probe insertion. Our approach makes use of a patient-specific biomechanical model constructed from a segmented volumetric scan of the patient's prostate. Since it is virtually impossible to perform biomechanical simulations for every possible value of probe displacement and entry angles, only a small number of biomechanical simulations are used to extract the modes of deformation of the prostate. The coefficients of those modes were then related to the parameters of the biomechanical simulation, namely the insertion angles and insertion depth of the TRUS probe. This enabled the parameterization of the deformed prostate shape in terms of the biomechanical simulation parameters, and therefore provided a means for the combined estimation of a set of deformed prostate shapes given sparse 2D ultrasound images. The framework of [3] can be used to extend the approach presented here to a deformable model of the prostate that includes modes of deformation as well as modes of shape. Such a model could be constructed from several subjects instead of using a patient-specific biomechanical model. Another possible extension to
the work presented here is to treat the material parameters as an additional simulation variable related to the modes of deformation, and to seek an estimate of their values as part of the optimization step.
Acknowledgement. The work reported in this paper was supported in part by the National Science Foundation under Engineering Research Center grant EEC-9731478, and by the National Institutes of Health under grant R01NS42645.
References

1. Bharatha, A., Hirose, M., Hata, N., Warfield, S., Ferrant, M., Zou, K.H., Suarez-Santana, E., Ruiz-Alzola, J., D'Amico, A., Cormack, R.A., Kikinis, R., Jolesz, F.A., Tempany, C.M., Evaluation of Three-Dimensional Finite Element-Based Deformable Registration of Pre- and Intraoperative Prostate Imaging. Medical Physics 28(12) December (2001) 2551–2560
2. Booth, J.T., Zavgorodni, S.F., Set-up Error and Organ Motion Uncertainty: a Review. Australas. Phys. Eng. Sci. Med. 22(2) June (1999) 29–47
3. Davatzikos, C., Shen, D., Mohamed, A., Kyriacou, S.K., A Framework for Predictive Modeling of Anatomical Deformations. IEEE Transactions on Medical Imaging 20(8) August (2001) 836–843
4. Fleute, M., Lavallée, S., Nonrigid 3-D/2-D Registration of Images Using Statistical Models. Lecture Notes in Computer Science, Vol. 1679: Medical Image Computing and Computer-Assisted Intervention 1999. Springer-Verlag, Berlin Heidelberg New York (1999) 138–147
5. Ferrant, M., Warfield, S.K., Guttmann, C.R., Mulkern, R.V., Jolesz, F.A., Kikinis, R., 3D Image Matching Using a Finite Element Based Elastic Deformation Model. Lecture Notes in Computer Science, Vol. 1679: Medical Image Computing and Computer-Assisted Intervention 1999. Springer-Verlag, Berlin Heidelberg New York (1999) 202–209
6. Sullivan, J.M., Charron, G., Paulsen, K.D., A Three-Dimensional Mesh Generator for Arbitrary Multiple Material Domains. Finite Elements in Analysis and Design 25 (1997) 219–241
7. Abaqus version 6.1. Hibbitt, Karlsson, and Sorensen, Inc., USA (2000)
8. Weaver, J.B., Van Houten, E.E., Miga, M.I., Kennedy, F.E., Paulsen, K.D., Magnetic Resonance Elastography Using 3D Gradient Echo Measurements of Steady-State Motion. Medical Physics 28(8) August (2001) 1620–1628
9. Jolliffe, I.T., Principal Component Analysis. Springer-Verlag, Berlin Heidelberg New York (1986)
10. Farin, G., Curves and Surfaces for Computer Aided Geometric Design. Academic Press Limited, London, UK (1997)
11. Nelder, J.A., and Mead, R., A Simplex Method for Function Minimization. Computer J. 7 (1965) 308–313
12. Mohamed, A., Kyriacou, S.K., Davatzikos, C., A Statistical Approach for Estimating Brain Tumor Induced Deformation. Proc. IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (2001) 52–59
“Gold Standard” 2D/3D Registration of X-Ray to CT and MR Images

Dejan Tomaževič, Boštjan Likar, and Franjo Pernuš

University of Ljubljana, Faculty of Electrical Engineering, Tržaška 25, 1000 Ljubljana, Slovenia
{dejan.tomazevic,bostjan.likar,franjo.pernus}@fe.uni-lj.si
Abstract. Validation of the registration techniques needed for image-guided surgery is an important problem that has received little attention in the literature. In this paper we address the challenging problem of generating a reliable gold standard for evaluating the accuracy of surgical 2D/3D registration. We have devised a cadaveric lumbar spine phantom with fiducial markers and established a highly accurate correspondence between 3D CT and MR images and 18 2D X-ray images. The expected target registration errors are on the order of 0.2 mm for CT to X-ray registration and on the order of 0.3 mm for MR to X-ray registration. As such, the gold standard images, which are available on request from the authors, are useful for testing 2D/3D registration methods in image-guided surgery.
1 Introduction
In image-guided orthopedic surgery, 3D preoperative medical data, such as CT and MRI, are commonly used to plan, simulate, guide, or otherwise assist a surgeon in performing a medical procedure. The plan, specifying how tasks are to be performed during surgery, is developed in the coordinate system of the preoperative images. To monitor and guide a surgical procedure, the preoperative image and plan need to be transformed into physical space, i.e. a patient-related coordinate system. The spatial transformation is obtained by acquiring intraoperative data and registering them to data extracted from the preoperative images [1]. More recent and promising approaches to obtaining the spatial transformation rely on intraoperative X-ray projections acquired with a calibrated X-ray device. The location and orientation of a structure in a 3D CT or MR image with respect to the geometry of the X-ray device is determined by 2D/3D registration [2-7]. A necessary step before widespread clinical use of any novel registration technique is the evaluation and validation of the method. While several researchers have addressed the validation problem in the context of particular methods [2-7], very little formal research has been done in this area. One difficulty in evaluating a registration technique is the need for a highly accurate gold standard. Because it is practically impossible to establish gold standard registration with real patient data, simulated data or phantoms have to be considered. In this paper, we report on the creation of a cadaveric lumbar spine phantom to which fiducial markers were attached. 3D CT and MR and 2D X-ray images were acquired, and an accurate gold standard rigid registration between the 3D and 2D images was established by means of the fiducial markers. The accuracy of the gold standard registration was assessed by the target registration error [8].

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 461–468, 2002.
© Springer-Verlag Berlin Heidelberg 2002
462
D. Tomaževič, B. Likar, and F. Pernuš

2 Phantom Creation
A cadaveric lumbar spine of an 80-year-old female, comprising vertebrae L1–L5 with some soft tissue, was placed into a plastic tube and tied with thin nylon strings (Fig. 1, top-left). The tube was filled with water to simulate soft tissue and, therefore, to obtain more realistic MR, CT, and X-ray images. Six fiducial markers were rigidly attached to the outside of the plastic tube (Fig. 1, bottom-left). Each fiducial marker had two parts: a base that could be screwed to a rigid body and a replaceable marker. Different markers were used for MR and for CT and X-ray imaging. Markers containing a metal ball (1.5 mm in diameter) were used for CT and X-ray imaging, while markers with a spherical cavity (2 mm in diameter) filled with a water solution of the Dotarem contrast agent (Gothia) were used for MR.
Fig. 1. The spine fastened in a plastic tube (top-left), the final phantom with fiducial markers attached to the plastic tube (bottom-left), CT image (top-center), MR image (top-right), AP X-ray image (bottom-center), and lateral X-ray image (bottom-right).
3 Image Acquisition
The CT image (Fig. 1, top-center) was obtained with a General Electric HiSpeed CT/i scanner at 100 kV. Axial slices were taken with an intra-slice resolution of 0.27x0.27 mm and a 1 mm inter-slice distance. For MR imaging, a Philips Gyroscan NT Intera 1.5 T scanner and a T1 protocol were used (Fig. 1, top-right). Axial slices were obtained with 0.39x0.39 mm intra-slice resolution and 1.9 mm inter-slice distance. After acquisition, the MR image was retrospectively corrected for intensity inhomogeneity by the information minimization method [9]. X-ray images (Fig. 1) were captured by a PIXIUM 4600 (Trixell) digital X-ray detector. The detector had a 429x429 mm active surface, with 0.143x0.143 mm pixel size and 14-bit dynamic range. To simulate C-arm acquisition, the X-ray source and sensor plane were fixed while the spine phantom was rotated on a turntable (Fig. 2, left). In this way, mechanical distortion due to gravitational force and other mechanical imperfections of C-arms were avoided, which resulted in a more precise acquisition. By rotating the spine phantom around its long axis in steps of 20°, 18 X-ray images were acquired. The X-ray images were filtered with a 3x3 median filter and then sub-sampled by a factor of two in order to remove dead-pixel artifacts and to reduce the resolution.
“Gold Standard” 2D/3D Registration of X-Ray to CT and MR Images
463

4 Finding Centers of Fiducial Markers
In all 3D and 2D images, a rough position p_m of each fiducial marker was first defined manually. Next, an intensity threshold I_T that separated a marker from the surrounding tissues was selected for each marker. Finally, the center p_c of each marker was defined as:

\[ p_c = \frac{\sum_{p \in \Omega} (I(p) - I_T)\, p}{\sum_{p \in \Omega} (I(p) - I_T)} \tag{1} \]
where I(p) is the intensity at point p and Ω is a small neighborhood around point p_m. By this method, the centers of markers may be found to sub-pixel or sub-voxel accuracy. Let X_MR and X_CT be 3x6 matrices, each containing six 3D vectors representing the centers of the fiducial markers found in MR and CT, respectively:

\[ X_{MR} = [r_1^{MR}, r_2^{MR}, \dots, r_6^{MR}], \qquad X_{CT} = [r_1^{CT}, r_2^{CT}, \dots, r_6^{CT}] \tag{2} \]

where r = (x, y, z)^T. Similarly, let X_ϕ be a 2x6 matrix containing the six 2D points representing the centers of the markers found in the X-ray images obtained after rotating the phantom by ϕ degrees (ϕ = 0°, 20°, ..., 340°):

\[ X_\varphi = [p_1^{\varphi}, p_2^{\varphi}, \dots, p_6^{\varphi}] \tag{3} \]

where p = (x, y)^T.
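The sub-voxel center estimate of equation (1) can be sketched as below, here for a 2D image; the neighborhood radius and the clipping of negative weights to zero are assumptions, since the text does not specify how Ω is chosen or how intensities below I_T are handled:

```python
import numpy as np

def marker_center(image, p_m, I_T, radius=3):
    """Intensity-weighted centroid of eq. (1) in a small neighborhood Omega
    around the rough manual position p_m, giving a sub-pixel center."""
    lo = np.maximum(np.asarray(p_m) - radius, 0)
    hi = np.minimum(np.asarray(p_m) + radius + 1, image.shape)
    window = tuple(slice(a, b) for a, b in zip(lo, hi))
    w = np.clip(image[window] - I_T, 0.0, None)    # weights I(p) - I_T
    grids = np.meshgrid(*[np.arange(a, b) for a, b in zip(lo, hi)],
                        indexing="ij")
    coords = np.stack(grids, axis=-1)              # (..., 2) pixel coordinates p
    return (w[..., None] * coords).sum(axis=(0, 1)) / w.sum()
```

Applied to a bright blob, this returns the (row, col) centroid to a fraction of a pixel, which is exactly the sub-pixel accuracy claimed in the text.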
Fig. 2. X-ray image acquisition (left) and reconstruction of 3D marker position (right).
5 X-Ray Setup Calibration
The X-ray setup was calibrated retrospectively using the centers X_ϕ of the fiducial markers found in the X-ray images and the corresponding centers X_CT of the markers found in the CT volume. Calibration of the acquisition setup (Fig. 2) required the determination of the X-ray projection geometry and of the rotation between the coordinate system of the phantom and the coordinate system of the X-ray setup. This involved determining 12 geometrical parameters, 3 intrinsic (w_I) and 9 extrinsic (w_E), collected in the calibration parameter vector w = (w_I^T, w_E^T)^T. The intrinsic parameters describe the X-ray projection geometry, while the extrinsic parameters describe the rotation between the coordinate system of the phantom and the coordinate system of the X-ray setup.

The intrinsic parameters w_I = (x_s, y_s, z_s)^T define the position of the X-ray source r_s in the coordinate system S_s of the sensor plane and, therefore, define the projection P_S(w_I) of any 3D point described in the sensor coordinate system S_s onto the 2D sensor plane. Nine extrinsic parameters w_E are needed to describe the rotation between the phantom and the X-ray system. Four parameters define the axis of rotation in the coordinate system S_v of the phantom; we chose the coordinate system of the CT volume for S_v. The axis of rotation is defined by the point (tx_v, ty_v), which is the intersection of the axis with the x-y coordinate plane of S_v, and by the rotation angles (ωx_v, ωy_v) of the axis around the x and y axes of S_v. Similarly, four parameters (tx_s, ty_s) and (ωx_s, ωy_s) define the same axis of rotation in the coordinate system S_s of the X-ray sensor plane. The additional parameter needed to fix the relation between S_s and S_v along the rotation axis is the distance d_vs between the two points of intersection (tx_v, ty_v) and (tx_s, ty_s). The extrinsic parameters w_E = (tx_v, ty_v, ωx_v, ωy_v, d_vs, tx_s, ty_s, ωx_s, ωy_s)^T define the transformation T_VS(ϕ, w_E) that maps, for a given rotation ϕ of the phantom, any 3D point in coordinate system S_v to a 3D point in coordinate system S_s:
\[ T_{VS}(\varphi, w_E) = T_{RS}(tx_s, ty_s, \omega x_s, \omega y_s) \cdot T(d_{vs}) \cdot R(\varphi) \cdot T_{VR}(tx_v, ty_v, \omega x_v, \omega y_v) \tag{4} \]
where T_VR is the transformation from coordinate system S_v to the axis of rotation, R(ϕ) is the rotation around the rotation axis, T(d_vs) is the translation along the rotation axis, and T_RS is the final transformation to the coordinate system S_s. By composing the projection P_S(w_I) with the transformation T_VS(ϕ, w_E) between the coordinate systems, the projection P_VS(ϕ, w) of a 3D point defined in the coordinate system S_v to a 2D point in the sensor plane of S_s can be obtained for any rotation ϕ:

\[ P_{VS}(\varphi, w) = P_S(w_I)\, T_{VS}(\varphi, w_E) \tag{5} \]
To calibrate the X-ray acquisition system, we thus need to define 12 geometrical parameters w of the projection PVS(ϕ,w). The optimal calibration parameters w are the ones that bring the fiducial markers XCT in CT volume to the best correspondence with the corresponding fiducial markers Xϕ in X-ray images. To find the optimal parameters we project the centers of fiducial markers XCT in CT volume to the sensor plane and compute the root mean squared (RMS) distance Ecalib to the corresponding centers of fiducial markers Xϕ in X-ray images:
\[ E_{calib}(w) = \sqrt{ \frac{1}{MN} \sum_{\varphi \in \Phi} \sum_{i=1}^{N} \left( p_i^{\varphi} - P_{VS}(\varphi, w)\, r_i^{CT} \right)^2 } \tag{6} \]
where N and M stand for the number of fiducial markers and the number of X-ray images, respectively, and Φ = {ϕ_1, ϕ_2, …, ϕ_M} defines the X-ray images taken at different phantom rotations. To find the optimal calibration parameters w, we used nine X-ray images, Φ = {0°, 40°, ..., 320°}, and iterative optimization, which resulted in a minimum RMS distance E_calib of 0.31 mm. The small RMS indicates that the calibration was performed well and reflects the uncertainty of fiducial marker localization in the CT and X-ray images.
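Equation (6) is simply an RMS reprojection error over all markers and images; a sketch, where the toy orthographic `project` below is a hypothetical stand-in for the perspective model P_VS(ϕ, w):

```python
import numpy as np

def calib_rms(project, markers_3d, markers_2d_by_angle):
    """RMS reprojection error of eq. (6); project(phi, r) maps a 3-D marker r
    to the sensor plane for phantom rotation phi, with the calibration
    parameters w folded into `project`."""
    sq = [np.sum((p2 - project(phi, r3)) ** 2)
          for phi, pts2 in markers_2d_by_angle.items()
          for r3, p2 in zip(markers_3d, pts2)]
    return float(np.sqrt(np.mean(sq)))

# Toy stand-in projection: rotate about z by phi, then drop the z coordinate.
def project(phi, r):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([c * r[0] - s * r[1], s * r[0] + c * r[1]])

markers_3d = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 3.0, -1.0])]
markers_2d = {phi: [project(phi, r) for r in markers_3d]
              for phi in (0.0, np.pi / 2)}
rms = calib_rms(project, markers_3d, markers_2d)   # exact data, so rms is ~0
```

In the calibration itself, a nonlinear optimizer would adjust the 12 parameters inside the projection model to drive this RMS to its minimum.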
6 Reconstruction of 3D Markers from Calibrated X-Ray Images
Once the X-ray acquisition system was calibrated, the positions of the X-ray fiducial markers in 3D could be reconstructed from the 2D X-ray images. Each point p_i^ϕ, representing the center of the i-th fiducial marker in the X-ray image taken at rotation ϕ, was back-projected to the X-ray source r_s, which yielded the projection line L_i^ϕ (Fig. 2, right). Line L_i^ϕ defines the perspective projection of a 3D marker to the 2D X-ray plane. The projection line L_i^ϕ can be expressed in the coordinate system S_v of the phantom by mapping the X-ray source r_s to the point c_ϕ:

\[ c_\varphi = T_{VS}^{-1}(\varphi, w_E)\, r_s \tag{7} \]

and by expressing the line direction in S_v as:

\[ v_i^{\varphi} = \frac{T_{VS}^{-1}(\varphi, w_E)\,(p_i^{\varphi} - r_s)}{\left\| T_{VS}^{-1}(\varphi, w_E)\,(p_i^{\varphi} - r_s) \right\|} \tag{8} \]
where r_s and p_i^ϕ are points defined in the sensor coordinate system S_s. We reconstructed a 3D marker position from the X-ray images by finding the position r_i^R in the coordinate system S_v that minimizes the RMS distance E_rec from the point r_i^R to all lines L_i^ϕ. E_rec can be expressed using vector products:
\[ E_{rec}(r_i^R) = \sqrt{ \frac{1}{M} \sum_{\varphi \in \Phi} \left\| (r_i^R - c_\varphi) \times v_i^{\varphi} \right\|^2 } \tag{9} \]
Reconstruction of the 3D positions of the six fiducial markers from the nine X-ray images Φ = {20°, 60°, ..., 340°}, which were not used for calibration, by iterative minimization of E_rec yielded an RMS of less than 0.06 mm for each of the six fiducial markers. The reconstructed fiducial markers from the X-ray images were collected in a 3x6 matrix X_R = [r_1^R, r_2^R, ..., r_6^R]. By using different sets of X-ray images for reconstruction and calibration, we were able to validate the calibration procedure. The small RMS of 0.06 mm indicates that the uncertainty of fiducial marker localization in the X-ray images was smaller than in the CT images and that the calibration had been performed well. The major source of calibration uncertainty is therefore the uncertainty of fiducial marker localization in the CT images; its effect on calibration precision is, however, evidently very small.
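Although the paper minimizes E_rec iteratively, the minimizer of equation (9) is also available in closed form by solving the normal equations of the point-to-line distances; a sketch (assuming unit direction vectors v_ϕ):

```python
import numpy as np

def reconstruct_marker(cs, vs):
    """Least-squares 3-D point minimizing summed squared distance to the
    lines L_phi = {c_phi + t * v_phi}; closed-form counterpart of eq. (9)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, v in zip(cs, vs):
        P = np.eye(3) - np.outer(v, v)   # projector orthogonal to the line
        A += P
        b += P @ c
    return np.linalg.solve(A, b)
```

With two or more non-parallel lines the matrix A is invertible; in the paper's setting there are nine lines per marker, one per X-ray image used for reconstruction.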
7 Gold Standard Registration
After calibrating the X-ray acquisition system and reconstructing the 3D markers X_R from the X-ray images, we were able to establish the gold standard registration between the X-ray and CT images, and between the X-ray and MR images, in the coordinate system S_v of the phantom. This was achieved by the rigid 3D/3D transformation T that minimizes the RMS distance E_reg between the reconstructed fiducial markers X_R from the X-ray images and the marker points X_CT from the CT or X_MR from the MR images:
\[ E_{reg}(T) = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left\| r_i^R - T\, r_i \right\|^2 } \tag{10} \]
where r_i stands for the points r_i^CT or r_i^MR. The closed-form solution of this minimal RMS problem is known [8]. The rigid transformation T can be decomposed into a rotation component R and a translation vector t:

\[ T r = R r + t \tag{11} \]

The optimal solution for the translation component is given by:

\[ t = \bar{r}^R - R \bar{r} \tag{12} \]

where \(\bar{r}^R\) and \(\bar{r}\) stand for the mean positions of the point sets X_R and X, respectively, where X can be either X_CT or X_MR. The optimal solution for the rotation component is given by:

\[ R = B A^T \tag{13} \]

where A and B are two orthogonal matrices obtained by singular value decomposition (SVD) of the matrix:

\[ \bar{X}_R \bar{X}^T = A D B^T \tag{14} \]

where D is a diagonal matrix and \(\bar{X}_R\) and \(\bar{X}\) are the point sets X_R and X centered at the corresponding mean positions \(\bar{r}^R\) and \(\bar{r}\), respectively. Rigid registration of the point sets (X_CT, X_R) and (X_MR, X_R) resulted in a minimum RMS distance E_reg of 0.27 mm for CT to X-ray and 0.44 mm for MR to X-ray registration. The higher RMS for MR than for CT can be attributed to three factors: first, CT was used for calibration; second, the intra- and inter-slice resolution of the MR images was lower than that of the CT images, which resulted in a higher fiducial localization uncertainty; and third, MR images suffer from non-rigid spatial distortion.
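A sketch of the closed-form SVD alignment of equations (11)-(14); the reflection guard is standard practice rather than something stated in the text, and the cross-covariance convention may differ from the text's by a transpose:

```python
import numpy as np

def rigid_fit(X, Y):
    """Least-squares rigid transform (R, t) with Y ~ (R @ X.T).T + t,
    computed in closed form via SVD; X, Y are (N, 3) corresponding points."""
    xc, yc = X.mean(axis=0), Y.mean(axis=0)
    H = (X - xc).T @ (Y - yc)            # 3x3 cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against an improper rotation
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = yc - R @ xc                      # translation, cf. eq. (12)
    return R, t
```

Applied to (X_CT, X_R) or (X_MR, X_R), the RMS of the residuals Y − (R X^T)^T − t is the minimum E_reg reported above.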
8 Gold Standard Validation
The minimum RMS distance E_reg is also known as the fiducial registration error (FRE) and can be used to evaluate the accuracy of point-based rigid registration [8]. Knowing FRE, we can determine the target registration error (TRE), which is the distance between the true, but unknown, position of a target and the target position obtained by registration. The expected TRE of a target point r can be estimated from FRE [8]:

\[ TRE^2(r) = \frac{FLE^2}{N} \left( 1 + \frac{1}{3} \sum_{k=1}^{3} \frac{d_k^2}{f_k^2} \right) \tag{15} \]

where f_k is the RMS of the projections of the fiducial markers onto the k-th principal axis of the marker configuration, d_k is the projection of the target point r onto principal axis k, N is the number of fiducial markers, and FLE is the fiducial localization error obtained from FRE:

\[ FLE^2 = \frac{N}{N - 2}\, FRE^2 \tag{16} \]
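Equations (15) and (16) can be turned into a small routine; obtaining the per-axis RMS spreads f_k from the singular values of the centered marker matrix is an implementation choice, not something prescribed by the text:

```python
import numpy as np

def expected_tre(target, fiducials, fre):
    """Expected TRE at `target` from FRE via eqs. (15)-(16);
    `fiducials` is the (N, 3) array of marker positions."""
    N = len(fiducials)
    fle2 = fre ** 2 * N / (N - 2)                    # eq. (16)
    centered = fiducials - fiducials.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    f = s / np.sqrt(N)                               # RMS spread along axis k
    d = Vt @ (np.asarray(target) - fiducials.mean(axis=0))
    tre2 = (fle2 / N) * (1.0 + np.sum(d ** 2 / f ** 2) / 3.0)   # eq. (15)
    return float(np.sqrt(tre2))
```

At the centroid of the markers d = 0, so TRE = FLE/√N; with N = 6 and FRE = 0.27 mm this gives about 0.14 mm, consistent with the sub-0.2 mm pedicle TREs reported for CT to X-ray registration.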
Using the above formulation, we validated the gold standard registration by manually defining eight target points, four per pedicle (Fig. 3), in each of the 5 vertebrae, and computing the mean TRE for each vertebra. The results of the gold standard validation for CT to X-ray and MR to X-ray registration are given in Table 1. The
expected target registration errors for the pedicles are on the order of 0.2 mm for CT to X-ray registration and on the order of 0.3 mm for MR to X-ray registration.
Fig. 3. The position of eight target points (◆) on the pedicle borders.
Table 1. The expected RMS TREs for gold standard registration in mm.

        L1     L2     L3     L4     L5
  CT    0.13   0.14   0.24   0.21   0.12
  MR    0.35   0.25   0.29   0.36   0.4

9 Discussion and Conclusion
We have devised a lumbar spine phantom and obtained and validated a gold standard rigid 2D/3D registration with the aim of testing the performance of methods for 2D/3D registration of X-ray to CT and MR images. The phantom data are composed of CT and MR volumes of the lumbar spine and a set of 18 X-ray projection images. X-ray images were obtained by rotating the phantom in steps of 20° around its principal axis, which mimics intraoperative acquisition with a C-arm. As such, the phantom is useful for testing 2D/3D registration methods devised for intraoperative image-guided surgery. The X-ray acquisition system was calibrated retrospectively by matching the projections of the CT markers with the corresponding markers in the X-ray images. Calibration with CT markers is generally superior to calibration with MR markers because CT offers better resolution and spatial stability. This observation was confirmed experimentally, as CT-based calibration yielded a smaller calibration error E_calib of 0.31 mm versus the 0.47 mm found with MR-based calibration. CT-based calibration of the X-ray image acquisition setup already provides a registration of CT to X-ray images but does not give any indication of the registration accuracy. We therefore reconstructed the 3D positions of the markers from the calibrated 2D X-ray images, which allowed us to perform 3D/3D registration between the reconstructed markers and those found in the CT and MR volumes. The result of such a registration reflects: a) the uncertainty of marker localization in the 2D X-ray images, b) the uncertainty of marker localization in the 3D CT or MR images, c) the uncertainty of the X-ray acquisition calibration, and d) the uncertainty of marker reconstruction. Together, these uncertainties caused the fiducial registration error (FRE) of the 3D/3D registration, which was used to evaluate the target registration error (TRE) of the gold standard CT to X-ray and MR to X-ray registration by the theory developed in [8].
The results in Table 1 indicate that the gold standard registration is highly accurate and therefore useful for testing 2D/3D registration methods. However, it should be stressed that the expected TREs for the CT to X-ray gold standard registration may be slightly larger than those presented in Table 1. This is because the same CT markers were used for X-ray system calibration and for CT to X-ray registration, which could have introduced the same bias into both the calibration and the registration. Nevertheless, if we assume that localization errors for CT markers are much smaller than for MR markers, the expected TREs for CT to X-ray gold standard registration should be
close to those given in Table 1, and they are certainly not larger than the TREs for MR to X-ray registration. The gold standard image data are available on request from the authors, who believe they will prove useful for validating newly developed methods on the same data, thereby enabling comparison among different registration methods, especially given the lack of publicly available gold standards for 2D/3D registration.
Acknowledgements The authors would like to thank Laurent Desbat, Markus Fleute, and Raphael Martin, University Joseph Fourier, Grenoble, France, François Estève of Rayonnement Synchrotron et Recherche Médicale, Grenoble, France, and Uroš Vovk of the University of Ljubljana for their generous help and support in the acquisition of images. This work was supported by the IST-1999-12338 project, funded by the European Commission, and by the Ministry of Education, Science and Sport, Republic of Slovenia.
References

1. R. L. Galloway, “The process and development of image-guided procedures,” Annual Rev. Biomed. Eng., vol. 3, pp. 83–108, 2001.
2. S. Lavallée and R. Szeliski, “Recovering the position and orientation of free-form objects from image contours using 3D distance maps,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 378–390, 1995.
3. A. Guéziec, P. Kazanzides, B. Williamson, and R. H. Taylor, “Anatomy-based registration of CT-scan and intraoperative X-ray images for guiding a surgical robot,” IEEE Transactions on Medical Imaging, vol. 17, pp. 715–728, 1998.
4. L. Lemieux, R. Jagoe, D. R. Fish, N. D. Kitchen, and D. G. T. Thomas, “A patient-to-computed-tomography image registration method based on digitally reconstructed radiographs,” Medical Physics, vol. 21, pp. 1749–1760, 1994.
5. J. Weese, G. P. Penney, P. Desmedt, T. M. Buzug, D. L. G. Hill, and D. J. Hawkes, “Voxel-Based 2-D/3-D Registration of Fluoroscopy Images and CT Scans for Image-Guided Surgery,” IEEE Transactions on Information Technology in Biomedicine, vol. 1, pp. 284–293, 1997.
6. G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes, “A Comparison of Similarity Measures for Use in 2-D-3-D Medical Image Registration,” IEEE Transactions on Medical Imaging, vol. 17, pp. 586–595, 1998.
7. D. LaRose, J. Bayouth, and T. Kanade, “Transgraph: interactive intensity-based 2D/3D registration of X-ray and CT data,” Medical Imaging 2000, San Diego, USA, K. M. Hanson (ed), SPIE Press 3979:385–396 (2000).
8. J. M. Fitzpatrick, J. B. West, and C. R. Maurer, “Predicting Error in Rigid-Body Point-Based Registration,” IEEE Transactions on Medical Imaging, vol. 17, pp. 694–702, 1998.
9. B. Likar, M. A. Viergever, and F. Pernuš, “Retrospective correction of MR intensity inhomogeneity by information minimization,” IEEE Transactions on Medical Imaging, vol. 20, pp. 1398–1410, 2001.
A Novel Image Similarity Measure for Registration of 3-D MR Images and X-Ray Projection Images Torsten Rohlfing and Calvin R. Maurer, Jr. Image Guidance Laboratories, Department of Neurosurgery, Stanford University 300 Pasteur Drive, MC 5327, Room S-012, Stanford, CA 94305-5327, USA {rohlfing,calvin.maurer}@igl.stanford.edu
Abstract. Most existing methods for registration of three-dimensional tomographic images to two-dimensional projection images use simulated projection images and either intensity-based or feature-based image similarity measures. This paper suggests a novel class of similarity measures based on probabilities. We compute intensity distributions along simulated rays through the 3-D image rather than ray sums. Using a finite state machine, we eliminate background voxels from the 3-D image while preserving voxels from air-filled cavities and other low-intensity regions that are part of the imaged object (e.g., bone in MRI). The resulting tissue distributions along all rays are compared to the corresponding pixel intensities of the real projection image by means of a probabilistic extension of histogram-based similarity measures such as (normalized) mutual information. Because our method does not compute ray sums, its application, unlike that of DRR-based methods, is not limited to X-ray CT images. In the present paper, we show the ability of our similarity measure to successfully identify the correct position of an MR image with respect to a set of orthogonal DRRs computed from a co-registered CT image. In an initial evaluation, we demonstrate that the capture range of our similarity measure is approximately 40 mm, with an accuracy of approximately 4 mm.
1 Introduction
Most current methods for registering three-dimensional (3-D) tomographic images to two-dimensional (2-D) projection images (e.g., X-ray fluoroscopy, electronic portal images (EPIs) in radiation therapy) make use of digitally reconstructed radiographs (DRRs) computed from CT images. The physical foundations of 3-D CT and 2-D X-ray projection imaging are very similar [1]. Therefore, by casting virtual rays through a CT image, one can compute simulated projection images that resemble actual X-ray images (likewise for EPI) of the same patient in the appropriate pose. These simulated projections are compared to the real projections using standard intensity-based image similarity measures in order to achieve registration of projections and 3-D volume [2,3]. Other approaches use geometrical features, such as edges [4] or point-based landmarks

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 469–476, 2002.
© Springer-Verlag Berlin Heidelberg 2002
470
T. Rohlfing and C.R. Maurer, Jr.
(either anatomical or artificial) that are back-projected from the 2-D projections into 3-D space and registered using 3-D point-based algorithms [5,6]. Methods based on artificial landmarks (fiducials) are necessarily invasive. Anatomical landmarks, on the other hand, are hard to identify reliably, especially in multimodal images and 2-D projections. Our group has recently introduced a third class of approaches to the registration of 3-D volumetric images and 2-D projections that is based neither on intensities nor on features, but instead on probabilities [7]. Using probabilistic DRRs (pDRRs) and a probabilistic extension of histogram-based image similarity measures, we were able to preserve the spatial information present in volumetric images and use it during registration. An additional advantage is that pDRR computation is not based on the physical interpretation of voxel intensities as X-ray attenuation coefficients. The method can therefore be applied in a meaningful way to tomographic images other than X-ray CT. In the present paper, we apply our probabilistic similarity measure based on pDRRs to the registration of 3-D MR images to standard DRR projection images computed from CT. Here, the deterministic DRR images serve as a model for real X-ray projection images, but with a highly accurate known pose, thanks to CT-to-MR co-registration. We also describe a method for distinguishing bone in MRI from image background on the fly while iterating over the voxel samples along a ray. In summary, this work is, as far as we are aware, the first to introduce a direct way of registering MR images with projection images without any segmentation or other pre-processing.
2 Methods
Probabilistic DRR. For the ray associated with the detector position xdet we define the probabilistic DRR (pDRR) as the distribution P of intensities µ sampled discretely at N uniformly-spaced locations xi along this ray:

pDRR(xdet, c) = P[µ(xi) = c | 0 ≤ i < N].    (1)
In order to save computation time, the range of samples visited along the ray is restricted to the actual intersection of the ray and the 3-D image. This is achieved by computing the index Iin of the entry point of the ray into the volume and the index Iout of its exit point, which can be found efficiently by solving a system of inequalities, as originally described in an algorithm for 3-D line clipping on viewport boundaries by Liang and Barsky [8]. The probabilistic DRR can thus be equivalently rewritten as

pDRR(xdet, c) = P[µ(xi) = c | 0 ≤ Iin ≤ i ≤ Iout < N].    (2)
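The entry and exit parameters can be found with a Liang–Barsky-style slab test. The following is a minimal Python/NumPy sketch (function names and interface are ours, not from the paper):

```python
import numpy as np

def ray_volume_interval(origin, direction, vol_min, vol_max):
    """Clip a ray origin + t*direction against an axis-aligned volume
    (Liang-Barsky-style slab test). Returns (t_in, t_out) with
    t_in <= t_out if the ray intersects the volume, else None."""
    t_in, t_out = -np.inf, np.inf
    for o, d, lo, hi in zip(origin, direction, vol_min, vol_max):
        if abs(d) < 1e-12:            # ray parallel to this pair of faces
            if not (lo <= o <= hi):
                return None           # outside the slab: no intersection
        else:
            t0, t1 = (lo - o) / d, (hi - o) / d
            if t0 > t1:
                t0, t1 = t1, t0
            t_in, t_out = max(t_in, t0), min(t_out, t1)
            if t_in > t_out:
                return None           # slab intervals do not overlap
    return t_in, t_out

def sample_index_range(t_in, t_out, delta, N):
    """Convert the parametric interval into sample indices I_in, I_out
    for N samples spaced delta apart along the ray."""
    i_in = max(0, int(np.ceil(t_in / delta)))
    i_out = min(N - 1, int(np.floor(t_out / delta)))
    return i_in, i_out
```

Only the samples with index between I_in and I_out then need to be visited when accumulating the ray histogram.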
For a particular pose (position and orientation) of a CT image, we compute a pDRR by generating a histogram of CT intensities along each projection ray. Each pixel in the pDRR image therefore corresponds to a distribution of CT values along the ray that resulted in the projection value at that pixel. In order
A Novel Image Similarity Measure for Registration of 3-D MR Images
471
Fig. 1. Registration of MRI and projection image (e.g., fluoroscopy, EPI, DRR) using pDRR and pMI. For each projection pixel, the distribution of MR intensities along the corresponding ray is computed. The histogram is normalized to total mass 1, and added to the row in the 2-D histogram that is indexed by the intensity of the current pixel in the projection image.
to avoid interpolation, the original intensities along the ray are entered into the histogram. The proximity of each voxel to the ray is taken into account by adding to the respective histogram bin only a fractional value between 0 and 1, identical to the weight that would otherwise be used for this voxel in the interpolation (partial volume integration [9]). Later in this paper we will apply the same principle in order to handle non-scalar (in our particular case, probabilistic) data during the computation of histogram-based similarity measures.

Probabilistic Mutual Information. The mutual information (MI) image similarity measure [9] has been used with great success in the registration of 3-D to 3-D images [10] (single or multiple modalities). Based on our previous experience, we usually apply the normalized mutual information (NMI) image similarity measure [11], which is derived from MI and appears to be less susceptible to changes in mutual image overlap. Both measures are usually computed from discrete 2-D histograms. A 2-D histogram is a matrix H in which each row corresponds to a range of voxel intensities of one of the two images, and each column corresponds to a range of voxel intensities of the other image. A pair of corresponding voxels under the current coordinate transformation therefore indexes one of the matrix fields. The 2-D histogram defined by two images and a particular transformation is the matrix in which every entry equals the number of corresponding voxel pairs indexing this entry. In 3-D to 3-D image registration, the voxel intensities of one of the two images (the "floating" or "interpolation" image) need to be determined at the voxel locations of the other image (the "reference" image). Different methods can be used to enter the resulting voxel pairs into the 2-D histogram. The most straightforward techniques involve computing an interpolated intensity value
from the intensities of the eight voxels enclosing the respective location. Let, for example, r be the intensity of a particular reference image voxel and fi for i = 0, . . . , 7 the intensities of the eight enclosing voxels in the floating image. Then one may increment the histogram bin indexed by r and the interpolated floating voxel intensity f, producing an updated histogram H:

Hr,f = Hr,f + 1, where f = Σi wi fi.    (3)
The two most commonly used interpolation schemes, nearest neighbor and trilinear interpolation, are both special cases of this expression, each with a specific way of computing the interpolation coefficients wi. However, Maes et al. [9] suggest a technique called partial volume integration that completely avoids intensity interpolation. Instead of applying an interpolation scheme such as the one outlined above to the voxel intensities, each of them is entered into the histogram with a weight determined by the trilinear interpolation coefficients that would be applied in the particular situation. As the histogram is 2-D, this means that each of the eight values is paired with the single value taken from the other image, and all pairs are entered into the histogram with the respective weights:

Hr,fi = Hr,fi + wi for all i.    (4)
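The partial volume update of Eq. (4), together with the (N)MI computation from the accumulated 2-D histogram, can be sketched as follows (a minimal Python/NumPy illustration; function names are ours and intensities are assumed to be pre-binned integers):

```python
import numpy as np

def pv_update(H, r, corner_bins, weights):
    """Partial volume integration: enter each of the eight enclosing
    floating-image intensities f_i with its trilinear weight w_i into
    row r of the joint histogram (Eq. 4), instead of interpolating."""
    for f_i, w_i in zip(corner_bins, weights):
        H[r, f_i] += w_i

def trilinear_weights(dx, dy, dz):
    """Weights of the 8 voxels enclosing a point with fractional
    coordinates (dx, dy, dz); they sum to 1 and can therefore be read
    as probabilities over the eight corner intensities."""
    return np.array([cx * cy * cz
                     for cx in (1 - dx, dx)
                     for cy in (1 - dy, dy)
                     for cz in (1 - dz, dz)])

def nmi(H):
    """Normalized mutual information (H(A) + H(B)) / H(A, B) from a
    joint histogram whose entries may be fractional counts."""
    P = H / H.sum()
    def entropy(p):
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)))
    return (entropy(P.sum(axis=1)) + entropy(P.sum(axis=0))) / entropy(P.ravel())
```

Because each update adds a total mass of 1 to the histogram, nothing else in the MI/NMI computation needs to change when the entries become fractional.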
This behavior can be understood as adding to the matrix H the result of the outer product of two vectors. One of the vectors is the unit column vector dr indexing the r-th row of H, while the second vector is the distribution of weights assigned to the columns of H:

H = H + dTr ( Σi=0,...,7 wi dfi ).    (5)
Here and in all following equations we assume that the respective vector dimensions match the number of rows and columns of H, respectively. The interpolation weights wi are all between 0 and 1 with a total sum of 1. They can therefore easily be re-interpreted as probabilities in a distribution of discrete values (see Fig. 1). We refer to the similarity measures MI and NMI computed from the histograms thus generated as probabilistic MI (pMI) and probabilistic NMI (pNMI), respectively.

Background and Air vs. Bone Detection. Clinical images usually show the region of interest of the patient's body embedded in air. This is useful to ensure that the image boundaries do not crop the presentation of the patient, which would lead to incorrect computation of projections due to missing data. From an image processing point of view, the object of interest is thus surrounded by more or less extended regions of image background, easily detected by their low pixel intensities. For standard DRR computation, values close to zero have no substantial effect on the result.
Fig. 2. Finite state machine to distinguish image background from air-filled cavities and surface folds. The lower object voxel threshold is denoted by T , the intensity of the next voxel along the ray is denoted by v. The inequalities over each arrow indicate the condition that leads to the respective state transition. The textual description under the arrows is the operation performed upon this transition.
However, when computing the distribution of intensities along a ray, the background pixels do have a substantial impact on the result. On the other hand, one cannot simply ignore all voxels identified as background by an intensity threshold, as this would also remove voxels that represent air-filled cavities inside the patient's body or surface folds. Both obviously carry important information about the shape and distribution of tissues inside the patient. When considering MR images, not discarding voxels below a certain threshold becomes even more essential, since bony structures, from which most information in X-ray projection images originates, would also be removed by such an operation. Instead of simple thresholding, we have implemented a finite state machine (FSM) to distinguish between air-filled cavities and bone inside the patient, whose voxels are included in the resulting tissue distribution, and image background, whose voxels are discarded. The FSM is illustrated in Fig. 2. Its fundamental principle of operation is to enter voxels encountered along the ray into either the ray histogram ("Hray") or a temporary histogram ("Htemp"), depending on the state the FSM is in. The temporary histogram buffers below-threshold voxels, which are moved to the main histogram when the next above-threshold voxel is encountered.
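The buffering behavior described above can be sketched as follows (a minimal Python/NumPy illustration of the principle, not the authors' implementation; voxel intensities are assumed to be pre-binned integers comparable to the threshold T):

```python
import numpy as np

def ray_histogram(bins, nbins, T):
    """Finite-state buffering of below-threshold voxels along one ray
    (cf. Fig. 2): runs of values < T after the first object voxel are
    buffered in Htemp; if another value >= T follows, the run was an
    interior cavity or surface fold and Htemp is merged into Hray,
    otherwise it was trailing background and is discarded. Leading
    background is discarded as well."""
    Hray = np.zeros(nbins)
    Htemp = np.zeros(nbins)
    inside = False                 # True once an object voxel was seen
    for v in bins:
        if v >= T:
            Hray += Htemp          # buffered run was inside the patient
            Htemp[:] = 0
            Hray[v] += 1
            inside = True
        elif inside:
            Htemp[v] += 1          # cavity or background: decide later
        # values < T before any object voxel: leading background, skip
    return Hray                    # trailing Htemp is dropped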
3 Results
We have computed the pNMI image similarity measure between probabilistic DRRs computed from a 3-D MR image and a DRR computed from a co-registered CT image¹. The results are visualized in Fig. 4. For translations of up to 40 mm in either direction along the x, y, and z axes, we found a peak of the similarity measure at the known correct pose (translation in the x and z directions), or at least close to it (within 4 mm in the y direction).

¹ The registration transformation between CT and MRI was computed using an intensity-based algorithm based on NMI [12]. Our algorithm has been validated to achieve better than 1 mm accuracy for CT-to-MR registration using the Vanderbilt image data [10].
Fig. 3. DRR images (top row) and spatially equivalent MR ray sum images (bottom row), each shown in AP and lateral views. The 3-D CT and MR image were aligned using an intensity-based rigid-body image registration algorithm.
4 Discussion
This paper has presented a novel approach to the registration of 3-D tomographic images with 2-D projections. Our method is based on probabilities rather than intensities or geometric features. We have described an extension and reinterpretation of histogram-based similarity measures that allows us to compute these between probabilistic, non-scalar images. We have also introduced a probabilistic extension of DRR computation that is not based on the physical interpretation of voxel intensities as X-ray attenuation. Therefore, this extension and the subsequent computation of entropy-based similarity measures can be applied to imaging modalities other than CT. In particular, we have demonstrated the capability of our similarity measure to identify the correct pose of an MRI volume with respect to two orthogonal DRR images. It is worth noting that the described procedure of computing pMI (pNMI) from pDRRs is fundamentally equivalent to back-projecting the real projection image into 3-D space and computing standard MI (NMI) between the 3-D image and this back-projection. This observation may provide some justification for our method and explain to some extent how and why it works. In comparison, however, our method avoids problems resulting from the non-orthogonal grid of the back-projected data when working with the common projection geometries. Furthermore, our approach allows for easy detection of background vs. bone and air-filled cavities along each ray, and the integration of fuzzy-segmented X-ray projections [13] is straightforward. Obviously, the problem of registering MRI to real, especially intraoperative, X-ray projections is substantially harder than registering to DRRs due to noise, the presence of surgical instruments, and possible geometric distortions.
We are therefore currently acquiring multi-modal 3-D image data (CT and MRI) and 2-D flat-panel X-ray images of patient anatomy with implanted markers, which will provide gold-standard pose information against which to validate our similarity measure.
Fig. 4. Probabilistic normalized mutual information (pNMI) image similarity measure, plotted for translations from −40 mm to +40 mm along the x (frontal view), y (lateral view), and z (lateral view) axes. Probabilistic DRRs were computed from MRI for different poses and compared to a single DRR image computed from a co-registered CT image. For translations along the x axis, image similarity was computed from the AP (frontal) projection image, since due to the near-parallel projection geometry there was insufficient perspective scaling of the lateral projection images.
Acknowledgments
TR was supported by the National Science Foundation under Grant No. EIA-0104114. We acknowledge support for this research provided by CBYON, Inc., Mountain View, CA.
References
1. G. T. Herman, Image Reconstruction from Projections, Academic Press, 1980.
2. G. P. Penney, P. G. Batchelor, D. L. G. Hill, D. J. Hawkes, and J. Weese, "Validation of a two- to three-dimensional registration algorithm for aligning preoperative CT images and intraoperative fluoroscopy images," Med Phys 28, pp. 1024–1032, June 2001.
3. G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes, "A comparison of similarity measures for use in 2D-3D medical image registration," IEEE Trans Med Imaging 17, pp. 586–595, Aug. 1998.
4. D. Tomaževič, B. Likar, and F. Pernuš, "Rigid 2D/3D registration of intraoperative digital X-ray images and preoperative CT and MR images," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
5. M. J. Murphy, J. R. Adler, M. Bodduluri, J. Dooley, K. Forster, J. Hai, Q.-T. Le, G. Luxton, D. Martin, and J. Poen, "Image-guided radiosurgery for the spine and pancreas," Comput Aided Surg 5, pp. 278–288, 2000.
6. J. R. Adler, M. J. Murphy, S. D. Chang, and S. L. Hancock, "Image-guided robotic radiosurgery," Neurosurgery 44, pp. 1299–1307, June 1999.
7. T. Rohlfing, D. B. Russakoff, M. J. Murphy, and C. R. Maurer, Jr., "An intensity-based registration algorithm for probabilistic images and its application for 2-D to 3-D image registration," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
8. Y.-D. Liang and B. Barsky, "A new concept and method for line clipping," ACM Transactions on Graphics 3, pp. 1–22, Jan. 1984.
9. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximisation of mutual information," IEEE Trans Med Imaging 16(2), pp. 187–198, 1997.
10. J. B. West, J. M. Fitzpatrick, M. Y. Wang, B. M. Dawant, C. R. Maurer, Jr., R. M. Kessler, R. J. Maciunas, C. Barillot, D. Lemoine, A. Collignon, F. Maes, P. Suetens, D. Vandermeulen, P. A. van den Elsen, S. Napel, T. S. Sumanaweera, B. Harkness, P. F. Hemler, D. L. G. Hill, D. J. Hawkes, C. Studholme, J. B. A. Maintz, M. A. Viergever, G. Malandain, X. Pennec, M. E. Noz, G. Q. Maguire, Jr., M. Pollack, C. A. Pelizzari, R. A. Robb, D. Hanson, and R. P. Woods, "Comparison and evaluation of retrospective intermodality brain image registration techniques," J Comput Assist Tomogr 21(4), pp. 554–566, 1997.
11. C. Studholme, D. L. G. Hill, and D. J. Hawkes, "An overlap invariant entropy measure of 3D medical image alignment," Pattern Recognit 33(1), pp. 71–86, 1999.
12. T. Rohlfing, J. B. West, J. Beier, T. Liebig, C. A. Taschner, and U.-W.
Thomale, "Registration of functional and anatomical MRI: Accuracy assessment and application in navigated neurosurgery," Comput Aided Surg 5(6), pp. 414–425, 2000.
13. D. B. Russakoff, T. Rohlfing, and C. R. Maurer, Jr., "Fuzzy segmentation of fluoroscopy images," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
Registration of Preoperative CTA and Intraoperative Fluoroscopic Images for Assisting Aortic Stent Grafting Hiroshi Imamura1 , Noriaki Ida1 , Naozo Sugimoto1 , Shigeru Eiho1 , Shin-ichi Urayama2 , Katsuya Ueno3 , and Kanji Inoue3 1
Graduate School of Informatics, Kyoto University, Uji-city, Kyoto, Japan 611-0011 {imamura,nrak,sugi,eiho}@image.kuass.kyoto-u.ac.jp 2 National Cardiovascular Center Research Institute, Suita-city, Osaka, Japan 565-8565
[email protected] 3 Takeda Hospital, Kyoto-city, Kyoto Japan 600-8558
Abstract. We investigated a registration method between preoperative 3D-CTA and intraoperative fluoroscopic images during intervention. Our final goal is assisting endovascular stent grafting for aortic aneurysm. In our method, DRRs (Digitally Reconstructed Radiographs) are generated by voxel projection of 3D-CTA after extracting the aorta region. By increasing/decreasing the CT value in the aorta region of the CTA, DRRs with/without contrast media injection are obtained. Subsequently, we iteratively calculate matching measures between DRRs and fluoroscopic images while changing the imaging parameters, and the DRR most similar to the fluoroscopic image is selected. We investigated the characteristics of several matching measures using simulated fluoroscopic images; based on the simulation results, we use the M-estimator of the residual in our method. In an application to clinical data, registration with the M-estimator of the residual was successful.
1 Introduction
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 477–484, 2002. © Springer-Verlag Berlin Heidelberg 2002

Endovascular stent grafting is a minimally invasive treatment of aortic aneurysm [1]. Currently, 2D fluoroscopic images are used to visualize the position of the lesion or the interventional device. The disadvantage of fluoroscopic images is the lack of information on the 3D structure of the object. To recover this information, registration of the preoperative 3D CT angiogram (3D-CTA) with the intraoperative 2D fluoroscopic image is useful, and such 3D-2D registration has been investigated by several groups [2][3][4][5]. As an application to intervention, Penney et al. developed new intensity-based similarity measures, pattern intensity and gradient difference [5], and reported that both measures are robust to soft tissue deformation and the presence of interventional devices. Their method uses a perspective model to project the 3D CT image onto the 2D fluoroscopic image. Therefore, the method has to search
many imaging parameters (10 parameters), and it suffers from long calculation times and a narrow capture range. They reported that the estimation error in the direction perpendicular to the projection plane was significantly bigger than in the other directions. For interventions such as stent graft placement, CTA (with intravenous contrast injection) is usually taken as the preoperative data, and fluoroscopic images with/without contrast agent injection as the intraoperative data. Penney et al. did not, however, investigate the influence of contrast agent injection. In this paper we evaluate several intensity-based measures including gradient difference. To reduce calculation time and obtain a sufficient capture range, we use a parallel projection model to project the 3D CT image onto the 2D fluoroscopic image, reducing the number of imaging parameters to 4. With parallel projection it is impossible to estimate the position in the direction perpendicular to the projection plane; however, we consider estimating the rotation angles to be much more important in aortic stent grafting than estimating position in the perpendicular direction. We also investigate the influence of contrast agent in both the preoperative and intraoperative images.
2 Materials and Methods

2.1 Imaging Geometry and Image Specification
Fig. 1 shows the coordinate system used in our method. We assume 4 imaging parameters: position (x, z) and angle (rotation, angulation). For the simulation study described in Sects. 2.3 and 3.1, a CTA covering a wide area (thorax to abdomen; matrix size: 512×512×313 voxels, voxel size: 0.664×0.664×1.250 mm) is used. A set of preoperative CTA (matrix size: 512×512×153 voxels, voxel size: 0.625×0.625×1.50 mm) and intraoperative fluoroscopic images (matrix size: 450×450 pixels, pixel size: 0.390×0.390 mm) of the same patient is also used in Sects. 2.4 and 3.2. These images were taken for placement of a stent graft for an abdominal aneurysm. Figures 2 and 3 show CTA images (axial, sagittal, and coronal slices), and Fig. 4 shows one of the fluoroscopic images. The rectangular area on the fluoroscopic image is used for the matching process described in the following subsections.
Fig. 1. Imaging geometry. Fig. 2. CTA for simulation. Fig. 3. CTA for clinical application. Fig. 4. Clinical fluoroscopic image.
2.2 Method
Generation of Digitally Reconstructed Radiographs. First, the aorta region is extracted from the CTA using the extraction method we previously developed [6]. Subsequently, by increasing/decreasing the voxel values in the aorta region, two kinds of CT data are produced: one resembles a CT with intra-aortic contrast injection, and the other resembles a CT without contrast injection. Digitally reconstructed radiographs (DRRs) with/without contrast injection are then produced by parallel voxel projection of the CT image with/without the high-contrast aorta. By changing the imaging parameters (rotation and angulation in Fig. 1), many DRRs are produced, and the DRR most similar to a fluoroscopic image is searched for among them.

Detecting Contrast Media Injection in the Fluoroscopic Image. To detect contrast media injection in a fluoroscopic image, we accumulate the pixel values inside the ROI of the fluoroscopic image (Fig. 4). If the sum of pixel values is bigger/smaller than a threshold, we judge the fluoroscopic image to be contrasted/non-contrasted and use the DRR with/without contrast media injection for registration.

Matching Measures

Residual. The residual between the fluoroscopic image (Ifl) and the DRR (IDRR) is defined as follows. The residual can be calculated easily, but it depends on changes of brightness and contrast of the images. By using the sequential similarity detection algorithm (SSDA) [7], the calculation time to find a minimum value of this measure can be reduced significantly:

R = Σj=1..N Σi=1..M |Ifl(i, j) − IDRR(i, j)|
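The SSDA idea, applied to the residual, can be sketched as follows (a minimal Python/NumPy illustration; function names are ours):

```python
import numpy as np

def residual(I_fl, I_drr, best=np.inf):
    """Sum of absolute differences between fluoroscopic image and DRR.
    SSDA-style early termination: stop accumulating as soon as the
    partial sum exceeds the smallest residual found so far."""
    total = 0.0
    for row_f, row_d in zip(np.asarray(I_fl, float), np.asarray(I_drr, float)):
        total += float(np.abs(row_f - row_d).sum())
        if total > best:
            return None              # cannot beat the current best DRR
    return total

def best_drr(I_fl, drrs):
    """Select the index of the DRR most similar to the fluoroscopic image."""
    best, best_idx = np.inf, None
    for k, d in enumerate(drrs):
        r = residual(I_fl, d, best)
        if r is not None and r < best:
            best, best_idx = r, k
    return best_idx
```

Because the residual is a monotonically growing sum, most candidate DRRs can be rejected after only a fraction of the pixels have been visited.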
M-estimator. Robust estimation is a statistical method that is robust to noise included in only one of the images. In an interventional procedure, instruments such as a stent or catheter appear only in the fluoroscopic image, and there is also the possibility that a mismatched pair of images (a contrasted/non-contrasted fluoroscopic image with a non-contrasted/contrasted DRR) is used. The M-estimator is one of the most popular criteria in robust estimation:

M = Σi,j |Ifl(i, j) − IDRR(i, j)|² / (σ² + |Ifl(i, j) − IDRR(i, j)|²)    (σ: constant)
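A minimal Python/NumPy sketch of this measure (the function name is ours):

```python
import numpy as np

def m_estimator(I_fl, I_drr, sigma):
    """M-estimator of the residual: each squared difference d^2 is
    damped to d^2 / (sigma^2 + d^2), so gross outliers present in only
    one image (catheter, stent) saturate toward 1 per pixel instead of
    dominating the measure."""
    d2 = (np.asarray(I_fl, float) - np.asarray(I_drr, float)) ** 2
    return float(np.sum(d2 / (sigma ** 2 + d2)))
```

Small differences contribute roughly d²/σ², while an arbitrarily large difference contributes at most 1, which is what makes the measure robust.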
Gradient Difference. Penney et al. [5] presented a similarity measure for 3D-2D registration based on the residual of gradient images: the gradient difference. They showed that it is robust to soft tissue deformation, because low-frequency components are filtered out in the gradient images, and that it is also robust to the presence of linear high-intensity structures such as a stent or a catheter.
G = Σi,j Av / (Av + {IdiffV(i, j)}²) + Σi,j Ah / (Ah + {IdiffH(i, j)}²)

where Av and Ah are constants and

IdiffV(i, j) = dIfl/di − s · dIDRR/di,    IdiffH(i, j) = dIfl/dj − s · dIDRR/dj.

In the above form, an SSDA-like fast algorithm cannot be utilized. However, the equation can be transformed into the following form:

G = Σi,j (1 − {IdiffV(i, j)}² / (Av + {IdiffV(i, j)}²)) + Σi,j (1 − {IdiffH(i, j)}² / (Ah + {IdiffH(i, j)}²))
  = Σi,j 2 − Σi,j {IdiffV(i, j)}² / (Av + {IdiffV(i, j)}²) − Σi,j {IdiffH(i, j)}² / (Ah + {IdiffH(i, j)}²)
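The distance form (the second term above) can be sketched as follows (a minimal Python/NumPy illustration; `np.gradient` merely stands in for whatever derivative filter is actually used, and the scale factor and constants are placeholders):

```python
import numpy as np

def gradient_difference_distance(I_fl, I_drr, s=1.0, Av=1.0, Ah=1.0):
    """Second term of the transformed gradient-difference formula:
    a sum of damped squared gradient differences. Minimizing this
    distance maximizes G, and the distance form grows monotonically,
    which admits SSDA-like early termination."""
    I_fl, I_drr = np.asarray(I_fl, float), np.asarray(I_drr, float)
    dv = np.gradient(I_fl, axis=0) - s * np.gradient(I_drr, axis=0)
    dh = np.gradient(I_fl, axis=1) - s * np.gradient(I_drr, axis=1)
    return float(np.sum(dv**2 / (Av + dv**2)) + np.sum(dh**2 / (Ah + dh**2)))
```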
Maximizing G gives the same result as minimizing the second term in the above formula. The second term is a distance measure, so an SSDA-like fast algorithm can be used to minimize it. In this paper we minimize the second term, which corresponds to an M-estimator of the residual of the gradient images. We also investigated three further measures: the residual of the gradient image, the correlation coefficient, and mutual information. We do not show results for these measures in this paper, because the results for the residual of the gradient image, cross correlation, and mutual information resembled those for gradient difference, residual, and M-estimator of the residual, respectively.

Optimization. We used multi-resolution analysis to search for the optimal imaging parameters. In this paper, we used data at three resolutions.

2.3 Simulation Study
To investigate the characteristics of the matching measures, a simulation study was performed. Simulated fluoroscopic images were produced in almost the same way as the DRRs (Sect. 2.2), except that perspective projection was used instead of parallel projection, and S-shaped and pincushion distortions were added (Fig. 5). An artificial line of high intensity was then added as a simulated catheter; Fig. 6 shows a finally obtained simulated fluoroscopic image. To investigate the influence of rotation and angulation, we produced images from 3 imaging orientations (anterior, rotated, and angulated). In each case, both thoracic and abdominal images, with and without contrast agent, were generated, giving 12 fluoroscopic images for our study. Using these images and the DRRs in Fig. 7, we investigated the characteristics of the matching measures described in Sect. 2.2. In this study we determined Av and Ah for calculating the gradient difference in the same way as described by Penney et al. [5].
Fig. 5. Added geometric distortion to simulated fluoroscopy. Fig. 6. Simulated fluoroscopic images (left: contrasted, right: non-contrasted). Fig. 7. Generated DRRs with/without contrast agent injection.
Table 1. Average and standard deviation of the estimation error in rotation [deg], angulation [deg], X [mm], and Z [mm] for each matching measure (residual, M-estimator of residual, gradient difference) and each image pairing ((n)cFL: (non-)contrasted fluoroscopic image; (n)cDRR: (non-)contrasted DRR; pairings cFL-cDRR, cFL-ncDRR, ncFL-cDRR, ncFL-ncDRR).

2.4 Application to Clinical Data
We tested our algorithm on contrasted and non-contrasted clinical fluoroscopic images. First, we produced DRRs at three resolutions. Subsequently, we detected contrast media injection in the fluoroscopic image and registered it against the low-resolution DRRs. For the lowest- and medium-resolution DRRs, we used cross correlation as the matching measure; for the highest-resolution data, we used the M-estimator of the residual.
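The coarse-to-fine selection can be sketched as follows (a minimal Python/NumPy illustration under our own assumptions: each level provides the fluoroscopic image and a bank of candidate DRRs indexed by pose; cross correlation scores the coarse levels, the M-estimator of the residual scores the finest level, and all names are ours):

```python
import numpy as np

def cross_correlation(a, b):
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

def m_estimator_score(a, b, sigma=50.0):
    d2 = (np.asarray(a, float) - np.asarray(b, float)) ** 2
    return -float(np.sum(d2 / (sigma**2 + d2)))   # negated: larger is better

def multires_register(fluoro_levels, drr_levels, keep=4):
    """fluoro_levels[k]: fluoroscopic image at level k (coarse to fine);
    drr_levels[k]: dict mapping candidate pose -> DRR at level k.
    Coarse levels are scored by cross correlation, the finest level by
    the M-estimator of the residual; only the best `keep` candidate
    poses survive each level."""
    poses = list(drr_levels[0])
    last = len(fluoro_levels) - 1
    for k, (fl, bank) in enumerate(zip(fluoro_levels, drr_levels)):
        score = m_estimator_score if k == last else cross_correlation
        poses.sort(key=lambda p: score(fl, bank[p]), reverse=True)
        poses = poses[:keep]
    return poses[0]
```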
3 Results

3.1 Simulation Study
To examine the accuracy of parameter estimation, we calculated the average and standard deviation of the estimation error for each parameter (Table 1). The distributions of the matching measures around the peak point were calculated one-dimensionally for a chosen variable. Profiles for the residual, M-estimator of the residual, and gradient difference are shown in Fig. 8 (a), (b), and (c), respectively. In these figures, one imaging parameter is varied while the other parameters are fixed at their values at the peak point.
Fig. 8. One-dimensional profiles of the matching measure distributions around the peak point: (a) residual, (b) M-estimator of residual, (c) gradient difference (second term).
Fig. 9. Result of application to clinical data: (a) contrasted, (b) non-contrasted.
3.2 Application to Clinical Data
Experimental results for clinical fluoroscopic images with DRRs are shown in Fig. 9 (a) and (b). In these figures, the left-side image is the original fluoroscopic image, and the right-side image is the DRR with the estimated imaging parameters.
4 Discussion
Regarding the influence of contrast injection, Table 1 shows that the pairing of a contrasted fluoroscopic image with a contrasted DRR is best, that a non-contrasted fluoroscopic image with a non-contrasted DRR is second best, and that the other pairings are poor, with much bigger averages and standard deviations of the estimation error. The standard deviation for gradient difference is much bigger than for the other matching measures in the non-contrasted/non-contrasted case (ncFL-ncDRR): when the edge of the catheter in the non-contrasted fluoroscopic image matches the edge of a rib in the non-contrasted DRR, an incorrect DRR is selected. An example of such a case is shown in Fig. 10, where the edge of a rib and the catheter are indicated by the white ellipses in the left and right images, respectively. The simulation study shows that the residual of the gradient, the M-estimator of the residual, gradient difference, and mutual information have a sufficiently sharp peak but several local optima; the residual and cross correlation, on the other hand, have a broad peak around the ground truth. We therefore apply cross correlation to the low-resolution data and the M-estimator of the residual to the high-resolution data. In the clinical application, appropriate imaging parameters appear to have been estimated.
5 Conclusion
In this paper, we investigated a registration method between preoperative 3D CT angiography (3D-CTA) and intraoperative fluoroscopic images (with/without contrast injection) for assisting endovascular stent grafting. In particular, we examined the influence of contrast agent in both the preoperative and intraoperative images. The simulation results and the application to clinical data show that the M-estimator of the residual is a suitable matching measure.
Fig. 10. Incorrectly selected differentiated DRR (left, middle) and differentiated simulated fluoroscopic image (right).
Acknowledgements
This research was partially supported by a Grant-in-Aid for Scientific Research (C)(2) (No. 13680935) from the Japan Society for the Promotion of Science (JSPS).
References
1. K. Inoue, H. Hosokawa, T. Iwase, M. Sato, Y. Yoshida, K. Ueno et al.: Aortic Arch Reconstruction by Transluminally Placed Endovascular Branched Stent Graft. Circulation 100 (1999) 316–321.
2. S. Lavallée and R. Szeliski: Recovering the Position and Orientation of Free-form Objects from Image Contours Using 3-D Distance Maps. IEEE Trans. PAMI 17 (1995) 378–390.
3. A. Guéziec, P. Kazanzides, B. Williamson, and R. H. Taylor: Anatomy Based Registration of CT-scan and X-ray Images for Guiding a Surgical Robot. IEEE Trans. Med. Imag. 17 (1998) 715–728.
4. L. Zöllei, E. Grimson, A. Norbash, W. Wells: 2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators. IEEE CVPR (2001).
5. G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes: A Comparison of Similarity Measures for Use in 2-D-3-D Medical Image Registration. IEEE Trans. Med. Imag. 17 (1998) 586–595.
6. H. Imamura, N. Sugimoto, S. Eiho, S. Urayama, K. Ueno, K. Inoue: Extraction and Quantitative Analysis of Aneurysmal Aorta for Aiding Endovascular Stent Grafting. IEICE J84-D-II (2001) 2468–2476.
7. D. I. Barnea and H. F. Silverman: A class of algorithms for fast digital image registration. IEEE Trans. Comput. C-21 (1972) 179–186.
Preoperative Analysis of Optimal Imaging Orientation in Fluoroscopy for Voxel-Based 2-D/3-D Registration
Yoshikazu Nakajima1, Yuichi Tamura2, Yoshinobu Sato1, Takahito Tashiro1, Nobuhiko Sugano3, Kazuo Yonenobu4, Hideki Yoshikawa3, Takahiro Ochi2, and Shinichi Tamura1
1 Division of Interdisciplinary Image Analysis, 2 Division of Computer Integrated Orthopaedic Surgery, 3 Department of Orthopaedic Surgery, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
4 Osaka Minami National Hospital, Kawachinagano, Osaka 586-8521, Japan
Abstract. We have developed a system for the 3-D localization of anatomical structures without the need for surgical exposure by using multiple-view fluoroscopy images. In this paper, we describe the system and evaluate its application to the estimation of optimal imaging orientations in fluoroscopy. For positional measurement, a voxel-based 2-D/3-D registration technique was employed. Since the measurement condition depends on the object shape, the spatial distribution of X-ray absorption, and the overlap of organs or structures (which differs at each imaging position), determining the optimal combination of fluoroscopy orientations is important. We propose a system for preoperative determination of the optimal imaging orientation using an estimate of the stereo localization accuracy derived from single-plane localization results. In an experiment, the computation time needed was 10 hours, about 14 times shorter than the time required for a full search of imaging orientation combinations.
1 Introduction
Two-dimensional (2-D)/three-dimensional (3-D) registration between an intraoperative fluoroscopy image and a preoperative 3-D CT image [1] [2] is an effective means of organ localization without surgical exposure. Early research on 2-D/3-D registration for medical applications was based on the contour-surface matching technique [3] [4] [5]. The contour-based method is, however, not stable with respect to false contours, and exact extraction of bone edges is not always feasible because of material heterogeneity and the overlap of peripheral organs. On the other hand, voxel-based registration methods, which use digitally reconstructed radiographs (DRRs) generated from a 3-D CT image, are generally robust compared with the contour-based method [6] [7] [8] [9] [10]. For these reasons, we have employed a voxel-based method [6] for our purpose of vertebra localization.

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 485-492, 2002. © Springer-Verlag Berlin Heidelberg 2002
For accurate pose measurement, registration from two stereo images has been proposed [3] [11]. The object shape, the distribution of X-ray absorption, and the overlap of peripheral organs or structures, all of which change with the imaging orientation, cause spatial heterogeneity of the pose measurement accuracy. Therefore, to improve the localization accuracy, it is important to optimize the imaging orientations. Some approaches to this have been reported [12] [13] [14] [15], but optimal view analysis for registration has not hitherto been proposed. Here, we discuss determination of the optimal imaging orientation in fluoroscopy for voxel-based 2-D/3-D registration.
2 Methods
2.1 Voxel-Based 2-D/3-D Registration

The algorithm for voxel-based 2-D/3-D registration [6] [7] consists of three major components: 2-D image (DRR) generation from a 3-D CT image, similarity evaluation between a DRR and a fluoroscopy image, and optimization. In the DRR generation process, the anatomical structure of interest is segmented in the preoperative 3-D CT image and pseudo projections of the segmented CT volume are cast to generate a DRR (Fig. 1(a)). In the similarity evaluation, the gradient correlation is employed for single-plane pose measurement, which is given by

$$GC = \frac{1}{2}\left( \frac{\sum_{(i,j)\in T} F_i D_i}{\sqrt{\sum_{(i,j)\in T} F_i^2 \sum_{(i,j)\in T} D_i^2}} + \frac{\sum_{(i,j)\in T} F_j D_j}{\sqrt{\sum_{(i,j)\in T} F_j^2 \sum_{(i,j)\in T} D_j^2}} \right), \qquad (1)$$

where

$$F_i = \frac{\partial I_{fl}(i,j)}{\partial i} - \overline{\frac{\partial I_{fl}}{\partial i}}, \quad F_j = \frac{\partial I_{fl}(i,j)}{\partial j} - \overline{\frac{\partial I_{fl}}{\partial j}}, \quad D_i = \frac{\partial I_{DRR}(i,j)}{\partial i} - \overline{\frac{\partial I_{DRR}}{\partial i}}, \quad D_j = \frac{\partial I_{DRR}(i,j)}{\partial j} - \overline{\frac{\partial I_{DRR}}{\partial j}},$$

and $I_{fl}$ and $I_{DRR}$ are the pixel intensities of the fluoroscopy image and the DRR, respectively. $\partial I_{fl}/\partial i$, $\partial I_{fl}/\partial j$, $\partial I_{DRR}/\partial i$, and $\partial I_{DRR}/\partial j$ are the pixel intensities of the horizontal and vertical gradient images of the fluoroscopy image and the DRR, and the overlines denote their means over the region $T$. For stereo imaging measurement, the evaluation function, the sum of the gradient correlations at both imaging positions, is used. Estimation of the CT volume pose is realized by an optimization approach based on the Powell method [6] [7]. In our experiments, an anatomical coordinate system of the vertebra of interest (Fig. 2(b)) [16] is determined by manually specifying surface points on the backface and topface (Fig. 2(a)) in the 3-D CT image.
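The gradient correlation of Eq. (1) is straightforward to prototype. The sketch below is our own illustrative NumPy version, not the paper's implementation; the function name and the use of `np.gradient` for the derivative images are our choices:

```python
import numpy as np

def gradient_correlation(fluoro, drr):
    """Gradient correlation of Eq. (1): the mean of the normalized
    cross-correlations of the vertical and horizontal gradient images
    of the fluoroscopy image and the DRR."""
    gc = 0.0
    for axis in (0, 1):                          # d/di and d/dj
        f = np.gradient(fluoro.astype(float), axis=axis)
        d = np.gradient(drr.astype(float), axis=axis)
        f -= f.mean()                            # zero-mean, as in Eq. (1)
        d -= d.mean()
        denom = np.sqrt((f * f).sum() * (d * d).sum())
        if denom > 0.0:
            gc += (f * d).sum() / denom
    return 0.5 * gc
```

In the full system this score would be maximized over the six pose parameters; a Powell-type optimizer (e.g. `scipy.optimize.minimize` with `method="Powell"`) matches the optimization strategy the paper cites from [6] [7].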
2.2 Estimation of Optimal Imaging Orientation
We propose a method of estimating the optimal imaging orientation in fluoroscopy for stereo 2-D/3-D registration. Since the cost of error computation for 2-D/3-D registration is relatively high, a method of estimating the stereo localization accuracy from single-plane localization results is employed. An overview of the system is shown in Fig. 3. From a CT volume and position parameters, an
Fig. 1. Digitally reconstructed radiographs (DRRs). (a) DRR from segmented CT image of L1 vertebra. (b),(c),(d) DRR from original CT image. The imaging angles are 0 deg, 45 deg, and 90 deg, respectively. Overlaps of vertebrae, ribs and soft tissues change along with the imaging angle.
Fig. 2. Anatomical coordinate system of a vertebra. (a) Sampling points on the vertebra surface. A (set of white four points) determines the origin and the backface. B (set of gray points) determines the topface. (b) Anatomical coordinate system determined by the origin, backface, and topface.
X-ray fluoroscopy image is generated for the computer simulation. The CT volume of the vertebra of interest is segmented in the original CT volume. Initial estimates of the position parameters, which are $(t_x, t_y, t_z)$ for translation and $(\theta_x, \theta_y, \theta_z)$ for rotation, are made by adding random noise to the true position parameters. Then, 2-D/3-D registration is performed. By changing the position parameters of the CT volume with respect to the imaging system, an error profile of single-plane localization is computed. Errors of translation and rotation are described as a covariance matrix, and the calculated covariance matrix is transformed into the local coordinate system of the anatomical structure. The transformed matrix $C'_{single\,plane}$ in the local coordinate system of the anatomical structure is given by

$$C'_{single\,plane} = (M^T M)^{-1} M^T \, C_{single\,plane} \, ((M^T M)^{-1} M^T)^T, \qquad (2)$$

where $C_{single\,plane}$ is the covariance matrix of rotation and translation errors, and $M$ is the Jacobian of the transformation between the local coordinate system of the target anatomical structure and the coordinate system of the imaging plane.
The error of stereo localization, determined by using a method for combining error distributions [17], is given by

$$C_{stereo} = C'_{single\,plane2} \, (C'_{single\,plane1} + C'_{single\,plane2})^{-1} \, C'_{single\,plane1}, \qquad (3)$$

where $C_{stereo}$ is the covariance matrix of the stereo measurement and $C'_{single\,plane1}$ and $C'_{single\,plane2}$ are the covariance matrices of the single-plane measurements. Then, to project the error onto the local coordinate axes of the anatomical structure for clinical evaluation, it is approximately evaluated by fitting a 3-D Gaussian function $G(x, y, z; \sigma_x, \sigma_y, \sigma_z)$ whose axes correspond to the local coordinate system of the anatomical structure. When the number of imaging orientations for the optimal-orientation analysis of single-plane localization is $n$, the number of imaging-orientation combinations for the optimal-orientation analysis of stereo localization is $n(n+1)/2$. Since the proposed method only requires the accuracy of single-plane localization, the number of imaging orientations for accuracy analysis is $n$ and the computation time of an iteration is half that required for stereo localization. In the case of 15-degree resolution (13 positions) of imaging orientation and 50 iterations at each orientation, the computation time needed for the proposed method was 10 hours (with a Pentium Xeon 1.7 GHz, 2 CPUs, and 2 GB memory), which is 1/14 of that required for a full search over imaging-orientation pairs.
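Equations (2) and (3), together with the search over the $n(n+1)/2$ orientation pairs, can be sketched as follows. This is an illustrative NumPy sketch; the function names and the use of the covariance trace as the scalar error criterion are our own choices, not the paper's:

```python
import numpy as np

def to_local(C_plane, M):
    """Eq. (2): transform a pose-error covariance into the local coordinate
    system of the anatomical structure via the pseudo-inverse of M."""
    Mp = np.linalg.inv(M.T @ M) @ M.T            # (M^T M)^{-1} M^T
    return Mp @ C_plane @ Mp.T

def stereo_covariance(C1, C2):
    """Eq. (3): combine two single-plane error covariances."""
    return C2 @ np.linalg.inv(C1 + C2) @ C1

def best_pair(covs):
    """Search the n(n+1)/2 orientation pairs for the smallest stereo error,
    scored here (our choice) by the trace of the combined covariance."""
    best, best_err = None, np.inf
    for i in range(len(covs)):
        for j in range(i, len(covs)):
            err = np.trace(stereo_covariance(covs[i], covs[j]))
            if err < best_err:
                best, best_err = (i, j), err
    return best, best_err
```

Note that Eq. (3) always yields a covariance no larger than either input, which is why combining two well-chosen views improves on the best single view.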
Fig. 3. Accuracy estimation of stereo localization
3 Experiment
3.1 Effects of Overlap

We assessed the effects of organ or other structure overlap. Since our 2-D/3-D registration method registers a segmented CT volume of the vertebra of interest,
overlap of upper and lower vertebrae, ribs, and soft tissues causes pose estimation error. The CT data used in the experiment sections were a set of abdominal images acquired for clinical diagnosis of intestinal disease. The FOV, slice thickness, and matrix size were 360 mm, 2.0 mm, and 512 × 512 × 128 voxels, respectively. The images included the lower thoracic and lumbar spine (from T10 to L4). In the system, the CT volume was segmented into the vertebra of interest (the L1 vertebra), the other vertebrae, and the ribs and soft tissues. By controlling their visibility, we evaluated the effect of organ and other structure overlap. The experimental conditions are shown in Table 1. In the experiment, the matrix size of the generated 2-D image was 128 × 128 pixels. Let $\sigma_{error0}$, $\sigma_{error1}$, and $\sigma_{error2}$ be the standard deviations of error in conditions 0, 1, and 2, respectively, and $\sigma_{effect0}$, $\sigma_{effect1}$, and $\sigma_{effect2}$ the overlap effects of each organ or structure. The overlap effects are given by

$$\sigma_{effect0} = \sigma_{error0}, \qquad (4)$$

$$\sigma_{effect1} = \sqrt{\sigma_{error1}^2 - \sigma_{error0}^2}, \qquad (5)$$

$$\sigma_{effect2} = \sqrt{\sigma_{error2}^2 - \sigma_{error1}^2}, \qquad (6)$$

respectively.

Table 1. Experimental conditions for error analysis of organ or other structure overlap

Segmented part          Condition 0   Condition 1   Condition 2
L1 vertebra             visible       visible       visible
Other vertebrae         invisible     visible       visible
Ribs and soft tissues   invisible     invisible     visible
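Equations (4)-(6) amount to peeling off the variance contributed by each newly visible structure, assuming the error sources add in quadrature. A minimal sketch (function name ours):

```python
import numpy as np

def overlap_effects(sigma_errors):
    """Eqs. (4)-(6): given cumulative error SDs for conditions 0, 1, 2, ...,
    return the per-structure overlap effects, assuming independent error
    sources that add in quadrature."""
    s = [float(v) for v in sigma_errors]
    effects = [s[0]]                                            # Eq. (4)
    for prev, cur in zip(s[:-1], s[1:]):
        effects.append(np.sqrt(max(cur**2 - prev**2, 0.0)))     # Eqs. (5), (6)
    return effects
```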
The results are shown in Fig. 4. The error of localization of the L1 vertebra was 0.43 ± 0.45 degree and 1.27 ± 0.70 mm. The error caused by the overlap of the upper and lower vertebrae was 0.76 ± 0.29 degree and 1.15 ± 0.60 mm. The error caused by the overlap of ribs and soft tissues was 0.01 ± 0.11 degree and 0.11 ± 0.38 mm, which was smaller than the other errors. In a clinical CT image for spine surgery, the imaging volume is limited to the area around the surgical site and does not include all of the ribs and soft tissues. The above results showed the feasibility of preoperative estimation of the optimal imaging orientation using a segmented CT image of vertebrae in clinical use.

3.2 Accuracy Computation of Single-Plane Localization
The localization accuracy of the single-plane measurement was validated. The results are shown in Fig. 5. In the figure, the optimal orientation was ±60 degrees of y-axis rotation. For x-axis rotation, tilting of less than ±15 degrees might be acceptable. The results beyond ±15 degrees of tilt were affected by the overlap of the upper/lower vertebrae.
Fig. 4. Effects of tissue overlap. (a) Rotation error. (b) Translation error.
Fig. 5. Localization accuracy of single-plane measurement. The horizontal axis shows the imaging angle of the fluoroscope in the vertebra coordinate system. The vertical axis shows the root mean square (RMS) errors of the localized pose of the target vertebra. (a) Rotation around the y-axis. (b) Rotation around the x-axis.
3.3 Estimation of Optimal Imaging Orientations for Stereo Fluoroscopy

The optimal imaging orientations for stereo fluoroscopy were estimated. The accuracy estimation results are shown in Fig. 6. Panels (a) and (c) respectively depict the rotational errors of the estimation using single-plane localization accuracy as described in Section 2.2, and the error measurements obtained by using a full-search simulation. Panels (b) and (d) respectively depict the translational errors of the estimation and the full-search error measurements. Similar error tendencies were observed in both sets of results. With respect to the rotation accuracy, the optimal pairs of imaging points were {0, 60} in the estimation and {0, 75} in the full-search simulation. The errors at {0, 60} in the estimation were 0.53 and 0.54 mm, respectively, while the error at {0, 75} in the full-search simulation was 0.44 mm. In the case of the translation accuracy, the optimal imaging points were {0, 90} in both results. In situations where the fluoroscopy geometry is restricted, the preoperative analysis of the optimal orientation was effective. For example, when the geometry of fluoroscopy was restricted to ±30 degrees, the optimal combination of imaging orientations was {0, 30} and not {-30, 30}.
Fig. 6. Results of estimated accuracy of stereo measurement. Horizontal axes show the imaging positions of each fluoroscopy. Vertical axes show the root mean square (RMS) errors of the localized pose of the target vertebra. (a) Estimated rotation error. (b) Estimated translation error. (c) Simulated rotation error. (d) Simulated translation error.
4 Discussion and Conclusions
In this paper, a method of estimating the optimal imaging orientation in fluoroscopy for voxel-based 2-D/3-D registration is proposed. The optimal imaging orientation pairs estimated experimentally were {0, 60} for rotation and {0, 90} for translation. Using optimal view estimation of stereo imaging from the registration accuracy of single-plane imaging, the system computed the optimal orientation in about 10 hours, representing a computation cost 14 times lower than that required by the full-search method. The method was validated for optimal view determination of stereo localization from the accuracy of single-plane localization. In this method, the stereo localization accuracy is estimated by using pseudo X-ray images (DRRs). Although this is convenient for preoperative analysis, DRR and X-ray fluoroscopy images have different pixel intensities. Their spatial resolutions also differ. These problems are discussed in [9]. As our next step, we intend to validate the method with respect to the limitations of its application to clinical use. The results reported here were good enough to estimate optimal imaging orientations, but were not adequate for estimating registration accuracy.
In the future, we will take up the challenge of evaluating the effect of imaging parameters (the CT slice thickness, etc.) and integrating error components to estimate the registration accuracy and evaluate the optimal imaging parameters. Acknowledgement: This work was partly supported by the Japan Society for the Promotion of Science (JSPS) Research for the Future Program JSPSRFTF99I00903 and the JSPS Grant-in-Aid for Scientific Research (Encouragement of Young Scientists (B) 14780281).
References
1. A. Hamadeh and P. Cinquin: "Kinematic Study of Lumbar Spine Using Functional Radiographies and 3D/2D Registration", CVRMed-MRCAS '97, pp. 109-118, 1997.
2. K. Takayanagi, K. Takahashi, M. Yamagata, H. Moriya, H. Kitahara, T. Tamaki: "Using Cineradiography for Continuous Dynamic-Motion Analysis of the Lumbar Spine", Spine, 26(17), pp. 1858-1865, 2001.
3. S. Lavallée, R. Szeliski: "Recovering the Position and Orientation of Free-Form Objects from Image Contours Using 3D Distance Maps", IEEE Trans. on PAMI, 17(4), pp. 378-390, 1995.
4. S.A. Banks, W.A. Hodge: "Accurate Measurement of Three-Dimensional Knee Replacement Kinematics Using Single-Plane Fluoroscopy", IEEE Trans. on Biomedical Engineering, 43(6), pp. 638-649, 1996.
5. S. Zuffi, A. Leardini, F. Catani, S. Fantozzi, A. Cappello: "A Model-Based Method for Reconstruction of Total Knee Replacement Kinematics", IEEE Trans. on Medical Imaging, 18(10), 1999.
6. J. Weese, G.P. Penney, P. Desmedt, T.M. Buzug, D.L.G. Hill, D.J. Hawkes: "Voxel-Based 2-D/3-D Registration of Fluoroscopy Images and CT Scans for Image-Guided Surgery", IEEE Trans. on Information Technology in Biomedicine, 1(4), pp. 284-293, 1997.
7. G.P. Penney, J. Weese, J.A. Little, P. Desmedt, D.L.G. Hill, D.J. Hawkes: "A Comparison of Similarity Measures for Use in 2-D-3-D Medical Image Registration", IEEE Trans. on Medical Imaging, 17(4), pp. 586-595, 1998.
8. A. Guéziec, P. Kazanzides, B. Williamson, R.H. Taylor: "Anatomy-Based Registration of CT-Scan and Intraoperative X-Ray Images for Guiding a Surgical Robot", IEEE Trans. on Medical Imaging, 17(5), pp. 715-728, 1998.
9. G.P. Penney: "Registration of Tomographic Images to X-ray Projections for Use in Image Guided Interventions", Ph.D. thesis, University of London, 1999.
10. L. Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", thesis, Massachusetts Institute of Technology, 2001.
11. B. You, P. Siy, W. Anderst, and S. Tashman: "In Vivo Measurement of 3-D Skeletal Kinematics from Sequences of Biplane Radiographs: Application to Knee Kinematics", IEEE Trans. on Medical Imaging, 20(6), pp. 514-525, 2001.
12. A.C.M. Dumay, J.H.C. Reiber, and J.J. Gerbrands: "Determination of Optimal Angiographic Viewing Angles: Basic Principles and Evaluation Study", IEEE Trans. on Medical Imaging, 13(1), pp. 13-24, 1994.
13. Y. Sato, T. Araki, M. Hanayama, H. Naito, S. Tamura: "A Viewpoint Determination System for Stenosis Diagnosis and Quantification in Coronary Angiographic Image Acquisition", IEEE Trans. on Medical Imaging, 17(1), pp. 121-137, 1998.
14. D.L. Wilson, D.D. Royston, J.A. Noble, J.V. Byrne: "Determining X-ray Projections for Coil Treatments of Intracranial Aneurysms", IEEE Trans. on Medical Imaging, 18(10), pp. 973-980, 1999.
15. A.S. Talukdar and D.L. Wilson: "Modeling and Optimization of Rotational C-Arm Stereoscopic X-ray Angiography", IEEE Trans. on Medical Imaging, 18(7), pp. 604-616, 1999.
16. M.M. Panjabi, T. Tanaka, V. Goel, D. Federico, T. Oxland, J. Duranceau, and M. Krag: "Thoracic Human Vertebrae (Quantitative Three-Dimensional Anatomy)", Spine, 16(8), pp. 888-901, 1991.
17. W. Hoff and T. Vincent: "Analysis of Head Pose Accuracy in Augmented Reality", IEEE Trans. on Visualization and Computer Graphics, 6(4), pp. 1-15, 2000.
A New Similarity Measure for Nonrigid Volume Registration Using Known Joint Distribution of Target Tissue: Application to Dynamic CT Data of the Liver

Jun Masumoto1, Yoshinobu Sato1, Masatoshi Hori2, Takamichi Murakami2, Takeshi Johkoh2, Hironobu Nakamura2, and Shinichi Tamura1

1 Division of Interdisciplinary Image Analysis, 2 Department of Radiology, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
Abstract. A new similarity measure for volume registration is proposed, which uses the assumption that the joint distribution of a target tissue is known. This similarity measure is designed so that it can deal with the tissue slide that occurs at boundaries between the target tissue and other tissues. Pre-segmentation of the target tissue is unnecessary. We intend to apply the proposed measure to registering volumes acquired at different time-phases in dynamic CT scans of the liver using contrast materials. In order to derive the similarity measure, we first formulate the ideal case where the joint distributions of all the tissues are known, after which we derive the measure for a realistic case where only the joint distribution of the target tissue is known. We applied the proposed measure experimentally to eight dynamic CT data sets of the liver. After describing a practical method for estimating the joint distribution of the liver from real CT data, we show that the problem of tissue slide is effectively dealt with using the proposed measure.
1 Introduction
Dynamic contrast-enhanced CT scans are an effective means of disease diagnosis and surgical planning for the liver. In a dynamic CT study, several CT volumes are typically acquired at different time-phases, not in a single breath-hold. Hence, these volumes are not guaranteed to be registered between different time-phases due to respiratory motion. Their registration by post-processing is highly desirable on account of the following advantages: (1) Accurate correlation between different time-phase images can be performed. (2) In 3D rendering of the liver, portal/hepatic veins and tumors, which are enhanced at different phases, can be registered more accurately. (3) Time-density curves can be estimated at every voxel, which should eventually permit automatic cancer characterization [1]. In this paper, we address the problem of nonrigid registration between volumes acquired at different time-phases of dynamic CT scans of the liver. An important issue in registration of the liver is tissue slide, which occurs along

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 493–500, 2002. © Springer-Verlag Berlin Heidelberg 2002
boundaries between the liver and other tissues, resulting in discontinuities in the 3D vector field describing the nonrigid deformation [2] [3]. Previous attempts to deal with this problem have required pre-segmentation of the liver region [4] or specification of places where tissue slide may occur before registration [5]. However, because segmentation of the liver from CT data is a far from easy task [6], the ability to perform direct registration between raw CT volumes without segmentation is desirable in the clinical environment. As a means of dealing with tissue slide without pre-segmentation, we propose a new similarity measure for volume registration. In dynamic CT, the tissue contrast during scans at different time-phases changes differently depending on the particular tissue involved. Thus, unlike a cross-correlation measure, a new similarity measure should also be able to cope with differences in contrast between the volumes to be registered. Although mutual information [7] (or the entropy correlation coefficient, ECC [8]) is known to be useful as a similarity measure in such a case [9] [10], we instead employ the following assumption: "The joint distribution of a target tissue is known." The use of the known joint distribution was originally suggested by Leventon et al. [11]. The main difference between their method and ours is that we utilize the known joint distribution of only the target tissue, while they use that of the entire volume. Thus, our method tries to register only the target tissue, for example, the liver, but ignores non-target tissues. By taking this approach, we effectively cope with the tissue slide which is inevitable in registration of the abdominal domain.
2 Theory

2.1 Ideal Case: Assuming Joint Distributions of All Tissues Are Known
We consider the joint distribution $P_o(I, J)$ of two volumes whose intensity values are represented by $I$ and $J$, respectively. These two volumes are assumed to be correctly registered. If we assume that the volume consists of the tissue set $\Gamma = \{\gamma_1, \gamma_2, \cdots, \gamma_n\}$ and the joint conditional distribution for each tissue is known, $P_o(I, J)$ can be decomposed into

$$P_o(I, J) = \sum_{\gamma \in \Gamma} P(I, J, \gamma) = \sum_{\gamma \in \Gamma} P(I, J|\gamma) \cdot P(\gamma), \qquad (1)$$

where $\sum_{\gamma \in \Gamma} P(\gamma) = 1$. In this case, an optimal similarity measure $B(X)$ should be maximum when $X = P_o(I, J)$ is satisfied. Here, we introduce a concept that we call "exclusive". We define $P(I, J|\gamma_i)$ as being "exclusive" if it satisfies the following condition for all $\gamma_i$ $(1 \le i \le n)$:

$$\forall (I_0, J_0) \in \{(I_0, J_0) \mid P(I_0, J_0|\gamma_i) \neq 0\}: \quad \sum_{\substack{\gamma_k \in \Gamma \\ k \neq i}} \sum_{J} P(I_0, J|\gamma_k) = 0 \;\wedge\; \sum_{\substack{\gamma_k \in \Gamma \\ k \neq i}} \sum_{I} P(I, J_0|\gamma_k) = 0. \qquad (2)$$

Using the "exclusive" condition, $B(X)$ can be decomposed into
A New Similarity Measure for Nonrigid Volume Registration
B(X) =
495
Wγ · Tγ (X)
γ∈Γ
= Wγ1 · Tγ1 (X) + Wγ2 · Tγ2 (X) + · · · + Wγn · Tγn (X),
(3)
where Tγi (X) is maximum when γi is correctly registered (that is, when X = P (I, J|γi )), and Wγi is its weight coefficient satisfying γ∈Γ Wγ = 1. By substituting Po for X in B(X), we have B(P0 ) = Wγ · Tγ (P0 ) γ∈Γ
= Wγ1 · Tγ1 (P0 ) + · · · + Wγi · Tγi (P0 ) + · · · + Wγn · Tγn (P0 ).
(4)
Using the exclusive condition, Tγi (P0 ) is described as Tγi (P0 ) = Tγi (P (I, J|γ1 ) · P (γ1 )) + Tγi (P (I, J|γ2 ) · P (γ2 )) + · · · + Tγi (P (I, J|γi ) · P (γi )) + · · · + Tγi (P (I, J|γn ) · P (γn )) = 0 + 0 + · · · + P (γi ) · Tγi (P (I, J|γi )) + · · · + 0 (5) = P (γi ) · Tγi (P (I, J|γi )). Therefore, Tγi (X) should be maximum when X = P (I, J|γi ). 2.2
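The decomposition of Eq. (1) and the exclusivity condition of Eq. (2) can be illustrated numerically with two synthetic tissue-conditional distributions on disjoint intensity supports; all values below are made up for illustration:

```python
import numpy as np

# Two synthetic tissue-conditional joint distributions P(I,J|gamma) on a
# 4x4 intensity grid, with disjoint supports (illustrative values only).
P1 = np.zeros((4, 4)); P1[0, 1] = 0.5; P1[1, 0] = 0.5   # tissue gamma_1
P2 = np.zeros((4, 4)); P2[2, 3] = 1.0                   # tissue gamma_2
w1, w2 = 0.7, 0.3                                       # priors P(gamma_1), P(gamma_2)

# Eq. (1): the observed joint distribution is the prior-weighted mixture.
Po = w1 * P1 + w2 * P2
assert np.isclose(Po.sum(), 1.0)

# Exclusivity, Eq. (2): wherever gamma_1 has support, no other tissue has
# any mass sharing that I-row or J-column (and symmetrically for gamma_2).
rows1, cols1 = P1.any(axis=1), P1.any(axis=0)
assert not P2[rows1, :].any() and not P2[:, cols1].any()
```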
2.2 Realistic Case: Assuming Joint Distribution of One Target Tissue Is Known
Here, we consider a more realistic case. We assume that our target for registration is only liver tissue. Let the tissue set $\Gamma$ consist of only two tissues, liver ($L$) and others ($O$), where $O$ represents all the tissues except the liver. When the occurrence probability of liver is $P(L) = \alpha$, that of the others is $P(O) = 1 - \alpha$. Using Equation (1), we therefore have

$$P_r(I, J) = \alpha \cdot P(I, J|L) + (1 - \alpha) \cdot P(I, J|O). \qquad (6)$$

As a practical supposition, we assume that the joint conditional distribution of liver tissue, $P(I, J|L)$, is known, while $P(I, J|O)$ is unknown. By assuming that $P(I, J|L)$ and $P(I, J|O)$ are exclusive, we have

$$B(X) = W_L \cdot T_L(X) + W_O \cdot T_O(X). \qquad (7)$$

By substituting $P_r$ for $X$,

$$B(P_r) = W_L \cdot T_L(P_r) + W_O \cdot T_O(P_r) = \alpha \cdot W_L \cdot T_L(P(I, J|L)) + (1 - \alpha) \cdot W_L \cdot T_L(P(I, J|O)) + \alpha \cdot W_O \cdot T_O(P(I, J|L)) + (1 - \alpha) \cdot W_O \cdot T_O(P(I, J|O)). \qquad (8)$$

Since $P(I, J|L)$ and $P(I, J|O)$ are exclusive, $T_L(X)$ and $T_O(X)$ should satisfy the following conditions:
1. $T_L(X)$ is zero when $X = P(I, J|O)$.
2. $T_O(X)$ is zero when $X = P(I, J|L)$.
3. $T_L(X)$ is maximum when $X = P(I, J|L)$.
4. $T_O(X)$ is maximum when $X = P(I, J|O)$.
It should be noted here that $P(I, J|O)$ is unknown. Thus, condition 4 above should be satisfied for any possible $P(I, J|O)$. Our aim is to derive a similarity measure satisfying these four conditions.

2.3 Derivation of Similarity Measure for the Realistic Case
We assume that $P(I, J|L)$ is well approximated by the Gaussian function given by

$$P(I, J|L) = \frac{1}{2\pi |\Sigma|^{1/2}} \, e^{-\frac{1}{2}\left((I, J) - (\bar I, \bar J)\right) \Sigma^{-1} \left((I, J) - (\bar I, \bar J)\right)^T}, \qquad (9)$$

where $(\bar I, \bar J)$ and $\Sigma$ are the average values and covariance matrix, respectively. In order to obtain an approximated similarity measure satisfying the above four conditions, we use

$$T_L(X) = \sum_{I,J} \{F_L(I, J) \cdot X(I, J)\}, \qquad T_O(X) = \sum_{I,J} \{F_O(I, J) \cdot X(I, J)\}, \qquad (10)$$

where

$$F_L(I, J) = e^{-\frac{1}{2}\left((I, J) - (\bar I, \bar J)\right) \Sigma^{-1} \left((I, J) - (\bar I, \bar J)\right)^T}, \qquad (11)$$

$$F_O(I, J) = 1 - \max\left( e^{-\frac{(I - \bar I)^2}{2\sigma_I}}, \; e^{-\frac{(J - \bar J)^2}{2\sigma_J}} \right), \qquad (12)$$

in which $\bar I$ and $\bar J$ are the average values of the projections of $P(I, J|L)$ onto the $I$-axis and $J$-axis, and $\sigma_I$ and $\sigma_J$ are their variances. Finally, we have the similarity measure $B(X)$ given by

$$B(X) = \beta \cdot T_L(X) + (1 - \beta) \cdot T_O(X). \qquad (13)$$
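Equations (10)-(13) can be evaluated directly on a discrete joint histogram. The sketch below is our own illustrative NumPy version; the function name, the integer intensity grid, and reading the marginal variances off the diagonal of $\Sigma$ are our assumptions:

```python
import numpy as np

def similarity_B(X, mean, cov, beta=0.5):
    """Eqs. (10)-(13): score a joint histogram X against a known Gaussian
    liver model with mean (Ibar, Jbar) and 2x2 covariance Sigma."""
    nI, nJ = X.shape
    I, J = np.meshgrid(np.arange(nI), np.arange(nJ), indexing="ij")
    d = np.stack([I - mean[0], J - mean[1]], axis=-1)
    Sinv = np.linalg.inv(cov)
    # F_L (Eq. 11): unnormalized Gaussian weight on the (I, J) plane.
    FL = np.exp(-0.5 * np.einsum("ijk,kl,ijl->ij", d, Sinv, d))
    # F_O (Eq. 12): rewards mass whose marginals are far from the liver model.
    sI, sJ = cov[0, 0], cov[1, 1]                 # marginal variances
    FO = 1.0 - np.maximum(np.exp(-(I - mean[0]) ** 2 / (2.0 * sI)),
                          np.exp(-(J - mean[1]) ** 2 / (2.0 * sJ)))
    TL = (FL * X).sum()                           # Eq. (10)
    TO = (FO * X).sum()
    return beta * TL + (1.0 - beta) * TO          # Eq. (13)
```

Note how the design penalizes misregistered liver-like mass: a histogram peak that shares one marginal with the liver model but sits off the Gaussian gets nearly zero weight from both $F_L$ and $F_O$.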
3 Experiments

3.1 Method for Estimating Joint Distribution of the Liver
We have assumed that the joint distribution of a target tissue is known. To apply the theory described in the previous section, a practical method of estimating the
Fig. 1. Method for estimating joint distribution of the liver. (a) Volume of interest (VOI) used for the estimation. (b) Estimated FL (I, J) ( Equation (11)). (c) Estimated FO (I, J) ( Equation (12)).
joint distribution of a target tissue from two unregistered volumes is necessary. The field of view (FOV) for abdominal CT scans is usually set based on the spine position. We set the volume of interest (VOI) so that it would be mostly occupied by liver tissue (Fig. 1(a)). The position of the VOI could be fixed for each patient, since the position of the liver relative to the spine did not differ greatly between cases. We estimated the averages $(\bar I, \bar J)$ and covariance matrix $\Sigma$ of the joint probability distribution $P(I, J|L)$ of Equation (9) by analyzing the joint histogram inside the VOI of the two volumes. $(\bar I, \bar J)$ and $\Sigma$ were estimated from the histogram region whose center was the mode of the joint histogram and whose horizontal and vertical widths were three times the full width at half maximum (FWHM) values of the 1-D histograms projected onto the $I$- and $J$-axes, respectively. Although the two volumes were not registered at this point, this still gave a good approximation. Figure 1 shows an example of the above estimation.
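The estimation procedure of this section (histogram mode plus a window of three times the FWHM of the projected 1-D histograms) might be sketched as follows; the bin count, the slice-tuple VOI interface, and the half-width arithmetic are our illustrative choices, not the paper's implementation:

```python
import numpy as np

def estimate_liver_joint(vol1, vol2, voi, bins=64):
    """Sec. 3.1: estimate (Ibar, Jbar) and Sigma of P(I,J|liver) from the
    joint histogram of a liver-dominated VOI (a slice tuple)."""
    H, ie, je = np.histogram2d(vol1[voi].ravel(), vol2[voi].ravel(), bins=bins)
    ic, jc = 0.5 * (ie[:-1] + ie[1:]), 0.5 * (je[:-1] + je[1:])  # bin centers
    mi, mj = np.unravel_index(np.argmax(H), H.shape)             # histogram mode

    def fwhm_bins(h):                     # FWHM of a 1-D histogram, in bins
        above = np.flatnonzero(h >= 0.5 * h.max())
        return max(above[-1] - above[0], 1)

    wi = 3 * fwhm_bins(H.sum(axis=1)) // 2        # half-width = 1.5 x FWHM
    wj = 3 * fwhm_bins(H.sum(axis=0)) // 2
    rows = slice(max(mi - wi, 0), mi + wi + 1)
    cols = slice(max(mj - wj, 0), mj + wj + 1)
    Hw = H[rows, cols]
    Ic, Jc = np.meshgrid(ic[rows], jc[cols], indexing="ij")
    w = Hw / Hw.sum()                             # weights inside the window
    mean = np.array([(w * Ic).sum(), (w * Jc).sum()])
    dI, dJ = Ic - mean[0], Jc - mean[1]
    cov = np.array([[(w * dI * dI).sum(), (w * dI * dJ).sum()],
                    [(w * dI * dJ).sum(), (w * dJ * dJ).sum()]])
    return mean, cov
```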
3.2 Registration Method
Nonrigid volume registration methods typically comprise three steps: definition of the similarity measure, representation of the deformation, and maximization of the defined similarity measure. With respect to the latter two steps, we employed an existing nonrigid registration method using free-form deformation on a hierarchical B-spline grid, proposed by Rueckert et al. [2]. The hierarchical grid consisted of three levels: 42 mm, 21 mm, and 10.5 mm. We embedded into this registration method both the proposed similarity measure and the entropy correlation coefficient (ECC), which is essentially equivalent to normalized mutual information [8], and compared the two similarity measures. The parameter value employed in Equation (13) was $\beta = 0.5$.
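For reference, the ECC used as the comparison measure can be computed from a joint histogram as $2 \cdot MI(A;B)/(H(A)+H(B))$ [8]; a minimal sketch (the bin count is our choice):

```python
import numpy as np

def ecc(a, b, bins=32):
    """Entropy correlation coefficient, ECC = 2*MI(A;B)/(H(A)+H(B)):
    1 for identical images (up to binning), near 0 for independent ones."""
    H, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = H / H.sum()
    pa, pb = p.sum(axis=1), p.sum(axis=0)        # marginal distributions

    def ent(q):                                  # Shannon entropy in nats
        q = q[q > 0]
        return -(q * np.log(q)).sum()

    mi = ent(pa) + ent(pb) - ent(p.ravel())      # mutual information
    return 2.0 * mi / (ent(pa) + ent(pb))
```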
3.3 CT Data Sets
Eight data sets of dynamic CT scans of the liver acquired at Osaka University Hospital and the National Cancer Center were used for performance evaluation.
Fig. 2. Illustrative examples of registration results. (a) Left: ECC (which is equivalent to normalized mutual information). Right: Proposed similarity measure. (b) Left: ECC. Right: Proposed similarity measure.

Table 1. Summary of evaluation results. The quality of the registration results is ranked into five groups based on the visually observed discrepancy: A (discrepancy 0-2 mm), B (2-4 mm), C (4-6 mm), D (6-8 mm), E (8- mm).

Case #   Thickness (Interval)   FOV           Phase 1          Phase 2   Initial   Proposed   ECC
1        2.5 mm (1.25 mm)       34 × 34 cm2   early arterial   portal    E         A          C
2        2.5 mm (1.25 mm)       34 × 34 cm2   early arterial   portal    C         A          B
3        2.5 mm (1.25 mm)       34 × 34 cm2   early arterial   portal    E         A          C
4        2.5 mm (1.25 mm)       34 × 34 cm2   early arterial   portal    C         A          A
5        2.0 mm (1.0 mm)        28 × 28 cm2   pre-contrast     portal    D         A          C
6        2.0 mm (1.0 mm)        32 × 32 cm2   pre-contrast     portal    C         A          B
7        2.0 mm (1.0 mm)        32 × 32 cm2   pre-contrast     portal    B         A          A
8        2.0 mm (1.0 mm)        32 × 32 cm2   pre-contrast     portal    E         E          E
The imaging conditions are summarized in Table 1. Each CT data set originally consisted of volumes at three or four different time-phases, out of which two phases were registered. One was before the injection of the contrast material (pre-contrast) or the early arterial phase (when the effect of the contrast material is small); the other was the portal phase (when the effect of contrast enhancement is large). Because the volumes at these two phases were not acquired in a single breath-hold, there was a possibility of deformation between them due to respiratory motion. The original volume size was 512 × 512 × 150-200 (voxels), which was reduced to half size along each axis direction.

3.4 Results
Table 1 summarizes the evaluation results for the eight data sets. In the evaluation, we classified the quality of the registration results into five groups (see the caption of Table 1) based on the visually observed discrepancy throughout the volumes. Registration error was reduced from the initial state with both the proposed and ECC measures, but the proposed similarity measure was more effective. Figure 2 shows illustrative examples of comparisons between the proposed measure and ECC. The two volumes are displayed using the checker-board method. In Fig. 2(a), tissue slide between the liver and the gallbladder is evident. Using the proposed method, the boundaries of the liver are continuous in the checker-board display, which means they are well registered, whereas with ECC the boundaries of the two volumes are not well registered. In Fig. 2(b), with ECC the ribs are well registered but the liver is not. In this case, even though the ribs and liver are in close proximity, their motions were largely dissimilar and there is a discontinuity in the deformation field between them. The liver is well registered using the proposed measure, since it tracks only liver tissue. However, it should be noted that the boundary of the ribs is not well registered, since the joint distribution of the ribs (bone tissue) differs from the known distribution.
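The checker-board display used for this visual assessment is simple to reproduce for a 2-D slice; this is a generic sketch (tile size ours), not code from the paper:

```python
import numpy as np

def checkerboard(a, b, tile=16):
    """Checker-board composite of two registered slices (as in Fig. 2):
    residual misregistration shows up as broken edges at tile borders."""
    assert a.shape == b.shape
    iy, ix = np.indices(a.shape)
    mask = ((iy // tile + ix // tile) % 2).astype(bool)
    return np.where(mask, a, b)
```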
4 Discussion
For application to dynamic CT data of the liver, the proposed measure showed better results than ECC. The reason is considered to be that the new method can more effectively deal with tissue slide. Using the proposed measure, the registration process does not try to register the entire volume but only those regions having the known joint distribution. It simply ignores non-target tissues. Consequently, it is not affected by discontinuities in the deformation field that occur at boundaries of two tissues. In fact, the rib boundary (Fig. 2(b)) was not well-registered using the proposed method, but this is not considered disadvantageous because the aim is to register only the target (i.e. liver) tissue. The proposed similarity measure assumes that the joint distribution of the target tissue is known. One problem is how this should be estimated. The method using histogram analysis of the fixed VOI, explained above in section 3.1, was quite effective so long as the relative position of the target tissue in the FOV was roughly determined. We applied the method to CT data sets acquired at two hospitals and confirmed that the liver region inside the VOI was more than 50% of the entire VOI. The estimation was successful with all the data sets used in our experiments. It should be noted that the boundaries of the target tissue are appropriately registered based on the proposed similarity measure, though the information provided on intensity patterns may be insufficient for registering the inner part of the tissue. The deformation field is considered to be estimated mostly based on B-spline interpolation in the inner part. One approach to addressing this problem would be to use a biomechanically appropriate interpolation method rather than B-splines.
5 Conclusion
We have proposed a novel similarity measure for volume registration when the joint distribution of a target tissue is known. Application of the proposed measure to dynamic CT data sets of the liver confirmed that it could effectively deal with
J. Masumoto et al.
tissue slide without the need for any pre-segmentation or manual interaction. We further showed a method for estimating a good approximation of the joint distribution of the target tissue from two unregistered volumes. The proposed measure works well for registering the boundaries of the target tissue, while the registration of the inner part of the tissue is estimated mostly based on B-spline interpolation. Future problems include quantitative evaluation of the proposed similarity measure and developing a post-processing method able to register the inner part of a tissue by taking intensity patterns into account.
Acknowledgements This work was partly supported by JSPS Research for the Future Program JSPSRFTF99I00903 and JSPS Grant-in-Aid for Scientific Research (B)(2) 12558033.
2D-3D Intensity Based Registration of DSA and MRA – A Comparison of Similarity Measures
John H. Hipwell1, Graeme P. Penney1, Tim C. Cox2, James V. Byrne3, and David J. Hawkes1
1 Division of Radiological Sciences, UMDS, Guy's & St Thomas' Hospitals, London SE1 9RT, UK
{[email protected]}
2 National Hospital for Neurology and Neurosurgery, Department of Radiology, Queen Square, London WC1N 3BG, UK
3 Department of Radiology, University of Oxford, The Radcliffe Infirmary, Oxford OX2 6HE, UK
Abstract. We have compared the performance of six similarity measures for registration of three-dimensional (3D) magnetic resonance angiography (MRA) to two-dimensional (2D) x-ray angiography images of the cerebral vasculature. The accuracy and robustness of each measure was investigated using a ground truth registration of a neuro-vascular phantom, obtained using fiducial markers, and using "gold-standard" registrations of four clinical data sets calculated using manual alignment by a neuro-radiologist. Of the six similarity measures, pattern intensity, gradient difference and gradient correlation performed consistently accurately and robustly for all data sets. Using these similarity measures, and for starting positions within 8° rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth positions, we obtained a success rate of greater than 80% for the clinical data sets, whilst none of the phantom registrations failed. The root-mean-square (rms) target reprojection error was less than 1.3mm for the clinical data sets. The rms target reprojection error for the phantom images was less than 1mm when using the most accurate similarity measures.
1 Introduction
Registration of interventional digital subtraction angiography (DSA) to pre-operative magnetic resonance angiograms (MRA) can greatly enhance visualisation during minimally invasive neuro-interventions and introduces potentially useful complementary information such as three-dimensional (3D) blood flow. Whilst there have been a number of papers describing 2D-3D registrations of MRA and x-ray images, these studies have tended to favour a feature-based approach in which, for instance, 2D and 3D vascular skeletons are extracted and matched using a suitable distance metric [3,4,5]. In this paper we apply the intensity-based registration of Penney et al. [6] to the registration of MRA and DSA of the cerebral vasculature. In order to determine the most appropriate similarity measure for this new application, we compare the performance of six measures when applied to the registration of both a physical phantom and routinely acquired clinical images. T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 501–508, 2002. © Springer-Verlag Berlin Heidelberg 2002
J.H. Hipwell et al.
2 The Registration Algorithm
Six rigid-body extrinsic parameters describe the position (X, Y, Z) and orientation (θx, θy, θz) of the 3D data set. These parameters are iteratively varied and digitally reconstructed radiographs (DRRs) generated and compared to the DSA image using a suitable similarity measure. DRRs of the vasculature are produced by casting rays through the segmented 3D data set, from the x-ray source to each pixel location in the DSA image. As each ray passes through a spherical volume of interest (VOI) selected by the user, the intensities of the intercepted voxels are integrated and projected onto the imaging plane to produce a DRR. A gradient descent search strategy [6] is used to search the extrinsic parameter space to optimise the similarity measure. To reduce processing time and improve the robustness of the algorithm, a multi-resolution strategy has been adopted and a pair of concentric regions of interest (ROI) are specified. The smaller of the two ROI is used to obtain an initial approximate match between the images and the larger is used to refine this registration. The radii of these ROI are set to a quarter and a half of the projected VOI radius. We have compared the performance of normalised cross correlation (CC), gradient correlation (Grad. CC), entropy of the difference image (Entropy), mutual information (Mut. Info.), pattern intensity (Pat. Int.) and gradient difference (Grad. Diff.) when used to quantify the similarity of the DRR and DSA images. Please refer to [6] for an overview of these measures.
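A minimal sketch of the DRR step may help fix ideas. It substitutes parallel projection and SciPy's generic affine resampling for the perspective ray caster and spherical VOI described above, so it is an assumption-laden simplification rather than the authors' method; the function name and toy volume are illustrative.

```python
import numpy as np
from scipy.ndimage import affine_transform
from scipy.spatial.transform import Rotation

def drr(volume, angles_deg, translation):
    """Resample a segmented volume under a rigid-body transform, then
    integrate along one axis: a parallel-beam stand-in for perspective
    ray casting from the x-ray source to each DSA pixel."""
    r = Rotation.from_euler("xyz", angles_deg, degrees=True).as_matrix()
    centre = (np.array(volume.shape) - 1) / 2
    # Map output coordinates back into the input volume about its centre.
    offset = centre - r @ (centre + np.asarray(translation, dtype=float))
    moved = affine_transform(volume, r, offset=offset, order=1)
    return moved.sum(axis=0)            # integrate intensities along rays

# Toy binary "vessel": a bright line through a 32^3 segmented volume.
vol = np.zeros((32, 32, 32))
vol[:, 16, 16] = 1.0
image = drr(vol, angles_deg=(0, 0, 0), translation=(0, 0, 0))
assert image.shape == (32, 32)
assert image[16, 16] > image[0, 0]      # projected vessel is brightest
```

In the algorithm proper, the six extrinsic parameters fed to such a projector would be varied by the gradient descent optimiser until the chosen similarity measure between DRR and DSA peaks.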
Fig. 1. The neuro-vascular phantom. Left: surface rendering of a thresholded CT scan of the phantom with the positions of the eight fiducial markers clearly visible. Middle: the maximum opacity image from the DSA sequence showing the concentric ROI masks used. Right: DRR corresponding to a registration with 0.5mm target reprojection error.
3 Phantom Experiment
In order to assess the accuracy of the algorithm we have applied it to images acquired from a physical silicone neuro-vascular aneurysm phantom (middle cerebral artery bifurcation aneurysm, figure 1). This phantom was made by Professor D. Rufenacht and Dr. K. Tokunaga of the Division of Neuro-Radiology, Geneva University Hospital, Switzerland. It was mounted in a perspex box and filled with a 15% (by weight) aqueous solution of gelatin to give realistic x-ray attenuation and scatter.
Fig. 2. Bottom: DRRs of the gold standard registrations of the clinical data sets (left to right, patients 1 to 4). Top: The maximum opacity images of the DSA sequence (inverted) to which the DRRs are registered showing the concentric ROI masks used.
DSA images were acquired using an Advantx DX (GE Medical Systems) x-ray set (two views). The PAL composite video output from the x-ray set was digitised via a Pulsar frame capture card (Matrox Imaging) and saved via a PC workstation as 512 × 512 pixel matrix, 8 bit grey-level images. The x-ray tube voltage was 85 kV and the phantom was placed in the isocentre of the x-ray system. Two views were acquired with the C-arm orientated at approximately ±45° to the vertical. A distortion-correction phantom and software were used to correct for pincushion distortion in the fluoroscopy images. Phase-contrast MR angiography of the phantom was performed using a GE Medical Systems Signa Echospeed 1.5T. The acquired image contained 256 × 256 × 124 voxels, each with dimensions 0.86 × 0.86 × 1.0 mm. Blood vessels were segmented as described in [1] to produce a binary image (the 3D phantom model). The intrinsic perspective parameters of the "ground truth" registration were calculated from images of a 60mm acrylic calibration cube in which 14 radio-opaque ball-bearings are embedded, at each of the vertices and at the centres of each face. The extrinsic rigid-body parameters were calculated using eight fiducial markers attached to the perspex box containing the phantom. The fiducial markers consisted of a post to which two different types of acrylic imaging cap could be attached. The MR imaging cap contained a void which was filled with contrast fluid (0.5 mM gadolinium). The x-ray imaging cap contained a divot holding a 3mm diameter steel ball bearing. The caps have been accurately manufactured so that the centre of the ball coincides with the centre-of-gravity of the contrast fluid.
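Recovering rigid extrinsic parameters from paired fiducial positions is commonly done with a least-squares SVD fit (the Arun/Umeyama construction). The sketch below assumes the eight marker centres have been localised in both coordinate systems; it is illustrative and not the authors' calibration software.

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rotation R and translation t with dst ~= R @ src + t,
    via the SVD method widely used for fiducial-based registration."""
    sc, dc = src.mean(0), dst.mean(0)
    h = (src - sc).T @ (dst - dc)              # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))     # guard against reflection
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, dc - r @ sc

rng = np.random.default_rng(1)
markers = rng.uniform(-30, 30, (8, 3))         # eight fiducials, as in the text
true_r, _ = np.linalg.qr(rng.normal(size=(3, 3)))
true_r *= np.linalg.det(true_r)                # force a proper rotation
moved = markers @ true_r.T + np.array([5.0, -2.0, 1.0])
r, t = rigid_fit(markers, moved)
assert np.allclose(r, true_r, atol=1e-6)
assert np.allclose(t, [5.0, -2.0, 1.0], atol=1e-6)
```

With noisy marker localisations the same fit minimises the fiducial registration error in a least-squares sense rather than recovering the transform exactly.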
4 Clinical Validation
We obtained clinical MRA and DSA images from three patients undergoing treatment for cerebral aneurysms and one patient with an arteriovenous malformation (AVM).
Table 1. Displacements of the test registration starting positions from the ground truth (phantom) or gold standard (clinical data) registrations, in terms of the extrinsic parameters X, θx, θy and θz. Also given are the mean target reprojection errors for these starting positions.

Start Position | δX      | δθx  | δθy  | δθz  | No. of Reg'ns | Mean Reproj. Error (mm)
1              | ±25 mm  | ±4°  | ±4°  | ±4°  | 16            | 2.4
2              | ±50 mm  | ±8°  | ±8°  | ±8°  | 16            | 4.7
3              | ±75 mm  | ±12° | ±12° | ±12° | 16            | 6.9
4              | ±100 mm | ±16° | ±16° | ±16° | 16            | 9.1
Digital subtraction angiograms were obtained for all patients (two views per patient) using a GE Medical Systems Advantx DX x-ray set. These images were acquired using a Matrox Meteor II frame grabber, captured at half-second intervals and saved as 512 × 512 pixel matrix, 8 bit grey-level images. A distortion-correction phantom and software were used to correct for pincushion distortion of the acquired images. Phase-contrast MR angiography was performed using a Siemens Magnetom Vision 1.5T. The acquired images contained 256 × 256 × 64 voxels with a resolution of 0.8 × 0.8 × 0.5 mm. Blood vessels were segmented from three of these data sets as described in [1] and from the fourth as described in [2] to produce binary images (the clinical 3D models). No perspective calibration cube images were available for these clinical data sets, so the four intrinsic parameters were estimated from the known focal length and image resolution of the x-ray set. This estimation is not expected to introduce significant errors into the registration; however, any errors that are present will be included in the estimated target reprojection error calculation (section 5). The extrinsic parameters of the gold standard registrations were generated via manual manipulation of a surface rendering of the 3D model using an interactive, graphical tool. To assess the reproducibility of the gold standard registrations, two of the data sets were chosen and eight additional manual registrations were carried out by two observers. The first observer was a consultant neuro-radiologist (JVB) and the second a research fellow in medical imaging science (JHH). The mean reprojection error (calculated over the points described in section 5) between the gold standard and these manual registrations was calculated to be 1.7mm (standard deviation 0.4mm).
5 Registration Accuracy and Robustness Experiments
From the phantom ground truth registration, and the gold standard registrations for the clinical data, a total of 64 starting positions were generated by altering the positions of the 3D data sets using the perturbations given in table 1. The in-plane translation (δY or δZ) is assumed to be eliminated using a trivial manual alignment procedure; however, we simulate errors in this alignment by introducing a random perturbation (3mm standard deviation) of the in-plane (Y and Z) position. In order to assess the performance of the registrations, sets of between 12 and 18 target points were chosen by two consultant neuro-radiologists on the 3D phantom and clinical models. These points coincided with features such as bifurcations and points of
Fig. 3. Robustness results for the phantom (left) and clinical data sets (right), comparing the performance of the six similarity measures when registering the segmented MRA to the maximum opacity DSA images. (Each plot shows success rate (%) against starting-position distance from the ground truth or gold standard, with curves for CC, Entropy, Pat. Int., Grad. CC, Grad. Diff. and Mut. Info.)
high vessel curvature. For each registration the mean target reprojection error of these points was calculated with respect to the corresponding ground truth or gold standard registration. If this mean error was greater than 4mm then the registration was deemed to have failed. Two images were generated from each image sequence. The first was a single, approximately mid-sequence frame exhibiting good opacity of all arterial blood vessels. The second was a maximum opacity image in which the intensity of each pixel was set equal to the maximum opacity achieved during the DSA sequence.
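The two image-derived quantities used here — the maximum opacity image and the mean target reprojection error with its 4mm failure threshold — can be sketched as follows (helper names are illustrative, not the authors' code):

```python
import numpy as np

def max_opacity(frames):
    """Collapse a DSA sequence (frames stacked along axis 0) into a single
    image whose pixels hold the maximum opacity reached in the sequence."""
    return np.max(frames, axis=0)

def mean_reprojection_error(projected, gold, fail_mm=4.0):
    """Mean 2D distance between projected target points and their
    gold-standard positions; a mean above fail_mm marks a failure."""
    err = np.linalg.norm(projected - gold, axis=1).mean()
    return err, err > fail_mm

# Toy sequence: three uniform frames; the middle one is the most opaque.
frames = np.stack([np.full((4, 4), v) for v in (0.2, 0.9, 0.5)])
assert np.all(max_opacity(frames) == 0.9)

# Two target points, each reprojected 3mm off: mean error 3mm, a success.
gold = np.array([[0.0, 0.0], [10.0, 0.0]])
err, failed = mean_reprojection_error(gold + [3.0, 0.0], gold)
assert np.isclose(err, 3.0) and not failed
```

In the experiments above, the error would be evaluated over the 12-18 clinician-chosen target points per data set, once per starting position.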
6 Results
The results for registering the phantom MRA to the maximum opacity DSA images are summarised in figure 3 (left) and table 2 (top). 100% success rates were obtained for the two closest starting positions using pattern intensity, gradient correlation and gradient difference. As the starting position is moved further from the ground truth, however, these measures fail more often than entropy. Mutual information performs less well than these measures, and correlation is the least successful. Pattern intensity and gradient difference are the most accurate of the similarity measures, both achieving target reprojection errors of less than 1mm for all successful registrations. The results for registering the clinical MRA data sets to the maximum opacity DSA images for all four patients are summarised in figure 3 (right) and table 2 (bottom). There
Table 2. Target reprojection error results for the phantom (top) and clinical data sets (bottom), comparing the performance of the six similarity measures when registering the segmented MRA to the maximum opacity DSA image.

Phantom Reprojection Errors in mm (SD).
Start Posn. | CC          | Entropy     | Pat. Int.   | Grad. CC    | Grad. Diff. | Mut. Info.
1           | 1.59 (0.01) | 1.08 (0.02) | 0.89 (0.05) | 0.90 (0.03) | 0.88 (0.02) | 1.25 (0.07)
2           | 1.66 (0.13) | 1.14 (0.01) | 0.88 (0.05) | 1.25 (0.36) | 0.90 (0.05) | 1.38 (0.06)
3           | 2.10 (0.25) | 1.10 (0.06) | 0.94 (0.05) | 1.10 (0.03) | 0.99 (0.08) | 1.45 (0.35)
4           | 2.13 (0.39) | 1.20 (0.07) | 0.94 (0.00) | 1.42 (0.13) | 0.91 (0.05) | 1.11 (0.08)

Clinical Reprojection Errors in mm (SD).
Start Posn. | CC          | Entropy     | Pat. Int.   | Grad. CC    | Grad. Diff. | Mut. Info.
1           | 1.26 (0.01) | 1.69 (0.02) | 1.12 (0.04) | 1.18 (0.04) | 1.28 (0.02) | 1.15 (0.05)
2           | 1.38 (0.00) | 1.99 (0.07) | 1.15 (0.04) | 1.08 (0.04) | 1.26 (0.04) | 1.20 (0.05)
3           | 1.67 (0.03) | 2.17 (0.20) | 1.24 (0.04) | 1.26 (0.03) | 1.39 (0.02) | 1.36 (0.25)
4           | 1.64 (0.00) | 2.02 (0.06) | 1.35 (0.08) | 1.34 (0.10) | 1.58 (0.01) | 1.74 (0.03)
are a number of differences between these results and those obtained for the phantom. The first is that entropy performs markedly worse than all the other measures for this clinical data, whereas it was at least as good as, if not better than, the majority of the other measures for the phantom data. Of the other measures, gradient difference performs consistently well for the clinical data sets, followed by pattern intensity, which outperforms gradient difference for start position 2, and gradient correlation. For these clinical data sets pattern intensity and gradient correlation achieve the lowest target reprojection errors of 1.1 to 1.4mm. The target reprojection errors of gradient difference and mutual information are less than 1.3mm for start positions 1 and 2 but rise more steeply for more distant start positions. Entropy produces the least accurate registrations. The results for registering to the single DSA images were similar to those obtained for registration to the maximum opacity images. In nearly all cases, however, registering to the maximum opacity DSA images resulted in a higher success rate (by up to 10%) compared to registering to the single mid-sequence frame. The maximum opacity registrations were also more accurate (by up to 0.5 mm in some cases). This result is not unexpected, as these images have higher contrast and signal-to-noise ratio than the single frames.
7 Discussion
We have found that gradient correlation, pattern intensity and gradient difference perform best of the six similarity measures compared. This is in agreement with the findings of Penney et al. [6] for the comparison of these measures when used to register computed tomography (CT) to fluoroscopy images of a spine phantom. This is despite the large differences in modality and anatomy between these two applications, and the fact that none of these similarity measures was specifically developed for application to MRA-DSA registration.
Fig. 4. Registration of patient 4, view 2. Left: histogram of final values of the gradient difference similarity measure (number of registrations against similarity measure value). Right: comparison of registrations (with sub-regions enlarged). Central column: maximum opacity DSA image to which the MRA is registered. Left column: the gold-standard registration. Right column: the "best" gradient difference registration, i.e. that producing the highest value of the gradient difference similarity measure.
We have found that the success rate of these registrations falls off rapidly once the start position exceeds position 2, that is 8° rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth registration. This is representative of intensity-based registration algorithms, which have a "capture range" within which a certain fraction of corresponding features must be approximately aligned. However, we have found that manual alignment to within these tolerances can be rapidly and easily achieved using an interactive tool. The registration success rates varied significantly between the four clinical data sets. For patients 1, 2 and 3, for instance, 90% of the registrations obtained using the gradient difference similarity measure were successful. For the second view of patient 4, however, only 48% succeeded. The histogram of similarity measure values for these registrations of patient 4 (figure 4) reveals a small cluster of 9 of the 128 registrations that all have consistently high similarity measures and very similar extrinsic parameter values. This mean registration position differs from the gold-standard position by 10°, 6° and 10° rotations about the x, y and z axes respectively. Visual comparison of these two registration positions (figure 4), however, suggests that the position found by the algorithm is much more accurate than the gold-standard position. The algorithm currently takes approximately 10 minutes running on a 1.2 GHz AMD processor PC with 1 GByte of RAM; however, considerable speed-up could be achieved using techniques such as shear-warp factorisation. The timing variation with different similarity measures was found to be negligible.
8 Conclusions
We have applied an intensity based 2D-3D registration algorithm to the multi-modality alignment of MRA and DSA images. Of the six similarity measures compared, gradient difference, pattern intensity and gradient correlation performed consistently accurately and robustly for all data sets. Using these similarity measures, and for starting positions within 8° rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth positions, we obtained a success rate of greater than 80% (less than 4mm target reprojection error) for the clinical data sets, whilst none of the phantom registrations failed. The root-mean-square (rms) target reprojection error of the clinical registrations was less than 1.3mm (less than the 1.7mm reprojection error estimated for the gold standard registrations) and for the phantom images less than 1mm when using the most accurate similarity measures.
Acknowledgments We would like to thank Kawaldeep Rhode, Robert McLaughlin, Albert Chung and Paul Summers for their assistance in acquiring the images used in this paper and also Kawaldeep Rhode for distortion correcting the DSA images used. This research is funded by EPSRC grant GR/M55015.
References
1. A.C.S. Chung and J.A. Noble. Statistical 3D vessel segmentation using a Rician distribution. In Proc. MICCAI, pages 83–89, 1999.
2. A.C.S. Chung, J.A. Noble and P. Summers. Fusing speed and phase information for vascular segmentation in phase contrast MR angiograms. In Proc. MICCAI, pages 166–175, 2000.
3. J. Feldmar, G. Malandain, N. Ayache, S. Fernandezvidal, E. Maurincomme and Y. Trousset. Matching 3D MR angiography data and 2D x-ray angiograms. In Proc. CVRMed/MRCAS, pages 129–138. Berlin, Germany: Springer-Verlag, 1997.
4. Y. Kita, D.L. Wilson and J.A. Noble. Real-time registration of 3D cerebral vessels to x-ray angiograms. In Proc. MICCAI, pages 1125–1133, 1997.
5. A. Liu, E. Bullitt and S.M. Pizer. 3D/2D registration via skeletal near projective invariance in tubular objects. In Proc. MICCAI, pages 952–963, 1998.
6. G.P. Penney, J. Weese, J.L. Little, P. Desmedt, D.L.G. Hill, and D.J. Hawkes. A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Transactions on Medical Imaging, 17(4):586–595, 1998.
7. G.P. Penney, P.G. Batchelor, D.L.G. Hill, D.J. Hawkes and J. Weese. Validation of a two- to three-dimensional registration algorithm for aligning preoperative CT images and intraoperative fluoroscopy images. Medical Physics, 28(6):1024–1032, 2001.
Model Based Spatial and Temporal Similarity Measures between Series of Functional Magnetic Resonance Images
Ferath Kherif1,2, Guillaume Flandin1,3, Philippe Ciuciu1,2, Habib Benali2,4, Olivier Simon2,5, and Jean-Baptiste Poline1,2
1 Service Hospitalier Frédéric Joliot, CEA, 91401 Orsay, France
{kherif,poline}@shfj.cea.fr
2 Institut Fédératif de Recherche 49 (Imagerie Neurofonctionnelle), Paris, France
3 INRIA, Epidaure Project, Sophia Antipolis, France
4 INSERM U 494, CHU Pitié-Salpêtrière, Paris, France
5 INSERM U 334, CEA, 91401 Orsay, France

Abstract. We present a method that provides relevant distances or similarity measures between temporal series of brain functional images. The method makes it possible to perform a multivariate comparison between data sets of several subjects in the time or in the space domain. These analyses are important for assessing globally the inter-subject variability before averaging subjects to draw conclusions at the population level. We adapt the RV-coefficient to measure meaningful spatial or temporal similarities and use multidimensional scaling for visualisation.
1 Introduction
Functional brain imaging has been an extremely active field of research during the last fifteen years, first with Positron Emission Tomography and more recently with the advent of functional Magnetic Resonance Imaging (fMRI), because of their potential for understanding the organisation of human brain function. An fMRI experiment consists, for one subject, in the acquisition of a large number (100 to 1500) of 3D volumes (64×64×32) measuring a parameter related to the brain neural activity in each voxel. The subject is submitted to an experimental paradigm consisting of different conditions designed to study a particular brain system (e.g. memory, language, vision, ...). An entire study consists in the acquisition of data for approximately 10 to 30 subjects. The most challenging problem of the neuro-imaging field is to extract the relevant information from this vast amount of data. It is especially important to be able to draw conclusions from the study across subjects, therefore at the population level. This is complex because of the anatomical and functional differences between subjects. A standard way to analyse multi-subject data is to summarise the relevant information per subject in one brain volume (for instance the average difference between conditions A and B) and use the inter-subject variance to infer results at the population level (the so-called random effect group analyses) [8]. T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 509–516, 2002. © Springer-Verlag Berlin Heidelberg 2002
F. Kherif et al.
These multi-subject analyses are based on the assumption that subjects are drawn from a single population, and therefore assume some homogeneity between subjects. Clearly, this assumption may not be verified. Subjects are not necessarily homogeneous in the spatial domain (different brain regions are activated) or in the time domain (the time courses of the brain responses to a common experimental paradigm differ between subjects). This non-homogeneity can be due to many factors, including different strategies across subjects, or acquisition differences that cannot be controlled. If the group studied is not homogeneous, this may lead at best to less efficient analyses and at worst to erroneous results and interpretations [8]. Although this topic is clearly of major importance for the analysis of fMRI data, it has so far received little attention. This is probably due to both the complexity and the amount of data to be analysed, which depend on the experimental paradigm and the noise characteristics. In this paper, we introduce a general technique to derive relevant distances between series of 3D fMRI brain images in order to assess their similarity in the time or spatial domain. The technique is based on the RV-coefficient adapted for this purpose and the use of multi-dimensional scaling to visualize the group structure. It is flexible, and makes it possible to compare data sets (e.g. subjects) in the light of a specific question relating to the experimental paradigm in a reasonable computational time. In the following, we first briefly review possible distances or similarity measures and discuss their pros and cons in relation to their application to neuroimaging data. In section 2.2 we introduce the selected distance based on the RV-coefficient. In section 2.4 we present the experimental data set and some results on the differences in the spatial and temporal domains between subjects for this fMRI study.
2 Methods
2.1 Candidate Similarity Measures or Distances Given the Data and Problem Specificities
In this section, we briefly present an overview of measures or distances that could be used for comparing series of images. We first review some characteristics of our data and some desirable features for the distance measures.
Data and Problem Specificities.
(C1) The data originating from different subjects may have different numbers of voxels. Conversely, if images are put in a common spatial reference (realigned to an atlas), the number of scans (time dimension) may not be the same across subjects.
(C2) When addressing the temporal (resp. spatial) aspect of the data, the mean image (data averaged across time, resp. the mean time course) does not convey any meaningful information.
(C3) Time series at different positions in the brain may have different variances due to physiological reasons. The distance looked for should be insensitive to a voxel-per-voxel variance scaling and to the overall data variance.
(C4) The estimated covariance structure in the time domain can be inverted (although this may not be advisable given the estimation noise). This inversion is not possible in the spatial domain.
(C5) Data may have non-Gaussian components.
(C6) Similarity measure computations have to be fast enough to be used by clinicians or brain scientists. This is a challenge given the size of the data.
(C7) The measure should be able to include information from the experimental paradigm and from the known noise characteristics of the data. This is mainly addressed in section 2.3 through the modeling of the experimental variance due to the paradigm.
Some Candidate Similarity Measures. In this section, we only address the comparison in the time domain and mention the comparison in the space domain when dimensions cannot simply be swapped. The presentation follows an "incremental" line of thought: distance measures successively address different (more complex) aspects of the data. Let Yi be the sample data for the i-th subject, represented as a matrix with ni rows (voxel dimension) and ti columns (scan or time dimension), with ni ≫ ti. The corresponding (time × time) sample covariance matrices are denoted Σi for the i-th subject. Σi+j denotes the pooled covariance matrix of the i-th and j-th subjects.
• Mahalanobis distance D². This widely used distance is most meaningful when data are multi-normal and measures the weighted distance between data set means [1]. It relies on the inversion of the (common) covariance matrix of the data and is computed as D² = (Ȳi − Ȳj)ᵗ Σi+j⁻¹ (Ȳi − Ȳj). It can be tested for the null hypothesis that the two groups have the same mean through an F-test. Clearly, this distance can only be computed in the time dimension (cf. (C4)) and cannot reflect complex links.
• Covariance equality test and distance. Once the data means have been compared, a likelihood ratio statistic such as Box's M can be conducted to test the hypothesis that covariance matrices are equal [1]. The resulting coefficient B = e^(−M/df) can be used as a distance measure between covariance matrices. Two data sets have similar density volumes if B is close to one. This test is meaningful with multi-normal data but is not robust otherwise (cf. (C5)).
• Canonical Correlation Analysis (CCA). CCA is used to identify linear relations between two data sets [1], found to have an overall link with the covariance distance.
CCA finds successive pairs of linear combinations (one canonical eigenvector per data set) that best explain this relation, and the corresponding canonical roots inform about the strength of the relation. It is therefore more general than the Mahalanobis distance, since the search for the linear link is done in a greater space. However, it relies on the computation of the inverse covariance
512
F. Kherif et al.
matrix of the data in the time and space domains, while the latter is not tractable (cf (C4)).
• Krzanowski's method. One problem with CCA is that it explicitly searches for linear links between two data sets corrected for their covariance structures. This might not be the most relevant comparison between fMRI data series. An alternative can be found in the seminal work of Krzanowski [4], who suggests comparing data sets through the computation and comparison of the eigen-components of their covariance matrices. The comparison is performed by computing angles between the sub-spaces spanned by their first principal components. The drawbacks of this method are those of PCA: it is not scale invariant and depends strongly on the pre-processing steps (centering, normalisation, ...) (cf (C3)).
• Distribution distance. While the previous methods hold in general under multi-normal assumptions and linear links, it is easy to define more general distances through the sampled distributions of the data. Several authors [5,7] have proposed such measures, related to mutual information, whose expression is simplified if the data are normal. For example, Matusita derives a separability measure between densities [7]. The difficulty lies in the efficient computation of probability densities (C7). Nevertheless, we plan to investigate these similarity measures in the future.
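As an illustration of the first candidate measure, the Mahalanobis distance between two subjects' data sets can be sketched as follows. This is a minimal numpy sketch; the function name and the pooled covariance estimator are our assumptions, and the accompanying F-test is omitted:

```python
import numpy as np

def mahalanobis_d2(Y_i, Y_j):
    """Squared Mahalanobis distance between the mean time courses of two
    data sets, using a pooled time-by-time covariance matrix.
    Y_i, Y_j: (n_voxels, n_scans) arrays with the same number of scans."""
    mu_i = Y_i.mean(axis=0)                  # mean time course, shape (t,)
    mu_j = Y_j.mean(axis=0)
    # Pooled sample covariance (time x time), playing the role of Sigma_{i+j}
    cov_i = np.cov(Y_i, rowvar=False)
    cov_j = np.cov(Y_j, rowvar=False)
    n_i, n_j = Y_i.shape[0], Y_j.shape[0]
    pooled = ((n_i - 1) * cov_i + (n_j - 1) * cov_j) / (n_i + n_j - 2)
    diff = mu_i - mu_j
    # Solve a linear system rather than explicitly inverting the covariance
    return float(diff @ np.linalg.solve(pooled, diff))
```

The pooled matrix is invertible when ni >> ti, which is the regime assumed in the text.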
2.2 The RV-coefficient as a Similarity Measure
The RV-coefficient was first described by Robert [9] for evaluating multidimensional linear association between several data sets. For each data set, the matrix Si = Yi^t Yi (a time by time, ti × ti matrix) can be considered as a point in R^(ti²). The comparison of two data sets i, j in this space can be made by computing the RV-coefficient as follows:

RVi,j = trace(Si Sj^t) / [trace(Si Si^t) trace(Sj Sj^t)]^(1/2)    (1)
Escoufier [9] considers each Si as an operator and derives an inner product (and a distance metric) based on the Hilbert-Schmidt norm |A|² = trace(A^t A) for a given matrix A. In this context, the RV-coefficient is seen as the cosine of the angle between Si and Sj. The RV-coefficient can also be considered as a multivariate extension of the classical Pearson correlation coefficient. Lavit showed that if RVi,j is one, then one can derive the eigen-components of data set i from data set j through a homothetic transformation [6]. For comparing fMRI data sets, the RV-coefficient has several advantages. First, it reflects the linear link between data set covariances but is normalised for the absolute amount of variance in the data (C3). Second, it can be used in both the spatial and the temporal domain (C1) and (C2). Third, it is fast to compute (C6). Fourth, it does not necessarily require the inversion of a covariance structure, although such a normalisation can be included when possible (C4). Lastly, it should be robust with non-Gaussian data (C5), and is easily adaptable to compare data
Spatial and Temporal Similarity Measures
513
sets considering a specific question (one that can be put in the framework of standard fMRI analysis (C7)). This is the subject of the next section.
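The RV-coefficient of equation (1) translates directly into code. The following minimal numpy sketch (function name is ours) computes it from two data sets with the same number of time points:

```python
import numpy as np

def rv_coefficient(Y_i, Y_j):
    """RV-coefficient between two data sets, following equation (1).
    Y_i, Y_j: (n_voxels, t) arrays sharing the same number of time points t."""
    S_i = Y_i.T @ Y_i                        # (t x t) cross-product matrix
    S_j = Y_j.T @ Y_j
    num = np.trace(S_i @ S_j.T)
    den = np.sqrt(np.trace(S_i @ S_i.T) * np.trace(S_j @ S_j.T))
    return num / den
```

Since the Si are positive semi-definite, the coefficient lies in [0, 1] and equals one when a data set is compared with itself.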
2.3 Model-Based RV-coefficient in fMRI Analysis
Adapting the RV-coefficient. The analysis of fMRI data generally relies on the specification of an a priori model describing the expected time courses derived from the experimental paradigm. The model consists of a time by parameter matrix X (t × p), assumed to explain all deterministic temporal variations of the data. An fMRI analysis consists of linearly regressing the model at every voxel and testing a contrast of the parameters that reflects the neuroscience question under study. So-called statistical parametric maps are constructed from the test statistic computed at each voxel. The model X used to analyse the data can be introduced in the similarity measure. Rather than considering for each subject the covariance matrix estimated from the raw data, it is generally more meaningful to consider the covariance matrix between the data and the model (C7). This allows comparing subjects' data sets depending on how well the model X predicts the data, and is obtained through the modified RV-coefficient computed with Si = Yi^t X X^t Yi, a (p × p) matrix. More often than not, only a sub-space G of the model X is of interest (for instance the subspace representing the difference between experimental conditions). In such a case, both model and data can be projected onto this subspace: the model X becomes XG and the data Y becomes YG, leading to an RV-coefficient tuned to the specific question represented by G. The RV-coefficient allows the introduction of two metrics, M and N, for the column (temporal) and row (voxel) spaces respectively, defined by:

M^(-1/2) = (XG^t V XG)^(-1/2)    (2)

N^(-1/2) = diag{σ̂1^(-1), σ̂2^(-1), ..., σ̂n^(-1)}    (3)
The metric M corrects for the scaling differences in the model regressors and takes into account the temporal correlation represented by the (estimated or assumed) time by time matrix V. The diagonal elements of the metric N are the inverses of the square roots of the residual variances estimated for each voxel. This leads to computing Si with Yi = M^(-1/2) XG^t YG N^(-1/2).
Spatial and Temporal Similarity Measures. We have so far constructed the Si matrices as time by time cross-product matrices. If all Yi have an identical number of rows, the same computation can be made in the voxel space by considering Si = Yi Yi^t, a ni × ni matrix. This leads to the same formulation of the RV-coefficient, and provides a similarity measure in the space domain.
Computational Cost. The method based on the RV-coefficient involves the computation of traces of matrices. Due to the large amount of data in an fMRI
514
F. Kherif et al.
experiment, computation and data storage can be very cumbersome. Our implementation is designed to avoid direct computation of the products between the matrices (using the Hadamard product): only one pass through the data, simultaneously for all subjects, is needed. Whenever possible, computations are performed in the model parameter space, which considerably reduces the computational cost. For the data presented in the following, the computation time was of the order of a few minutes on a Sun workstation (0.8 GHz, 512 MB RAM).
Results Visualization. The two-by-two similarity measures Ri,j are first transformed into distance measures with di,j = √(2(1 − Ri,j)). A symmetric distance matrix (di,j) with k(k − 1)/2 distinct values is constructed and processed through Multidimensional Scaling (MDS) [2] to obtain the best Euclidean representation of these distances.
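The visualization step above can be sketched as follows: the pairwise RV similarities are turned into distances and embedded with classical (Torgerson) MDS. The paper cites MDS [2] without fixing a variant, so the classical algorithm used here is our assumption:

```python
import numpy as np

def rv_to_distances(RV):
    """Distances from pairwise RV similarities, d_ij = sqrt(2(1 - RV_ij))."""
    return np.sqrt(np.maximum(2.0 * (1.0 - RV), 0.0))

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS: embed a k x k distance matrix D into
    `dim` Euclidean dimensions, returning (k, dim) coordinates."""
    k = D.shape[0]
    J = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1]              # largest eigenvalues first
    w, V = w[order][:dim], V[:, order][:, :dim]
    return V * np.sqrt(np.maximum(w, 0.0))
```

When the distances are exactly Euclidean, classical MDS recovers them up to a rigid motion of the configuration.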
2.4 Experimental Paradigm and Data
Data were obtained from nine subjects who underwent a calculation task and a control task [10]. During fMRI scanning, six blocks of 26 s each, alternating computation and control tasks, were presented. Each subject performed two such sequences. A total of 186 scans (64 × 64 × 28 voxels per scan) were acquired per subject. The (linear) model used for analysing the data consisted of three regressors per condition (computation and control) derived from a standard hemodynamic response. Within this model, a sub-space of interest was formed to highlight activations induced by the calculation task relative to the control task.
3 Results and Conclusion
This section presents the results of the temporal and spatial comparisons of the nine subjects' data sets, using the adapted RV-coefficient to investigate inter-subject distances with respect to the comparison between activation and control. For this purpose, we use equation (1) and the formulas of section 2.3 with a sub-space G spanning the expected activation space.
Temporal Distances. Figure 1 shows a 2D MDS representation of the temporal distances between subjects. Although the subjects cannot easily be divided into more than one group, subjects 3 and 4 lie far apart from the group centre of mass. This indicates a different temporal behaviour, such that these subjects should probably be considered outliers. These temporal differences between subjects can be observed in figure 2, which shows the first components of the output of a Multivariate Linear Model (MLM) analysis described in [11]. The components summarise the temporal behaviour and are clearly similar for two subjects (8, 9) that are close on figure 1. Conversely, those patterns are clearly different from the component of subject 4, a subject that is also far from subjects 8 and 9 on figure 1. This result is in accordance with
Fig. 1. Inter-subject variability in terms of temporal (left panel) and spatial (right panel) distances.
Fig. 2. Illustration of the temporal variability observed in figure 1 (left) with the first temporal MLM eigencomponents.
another study [3] that showed the particular temporal behaviour of these two subjects' data.
Spatial Distances. Figure 1 (right panel) shows a 2D MDS representation of the spatial distances between subjects. In this plot, part of the inhomogeneity found in the temporal domain is observed again. In particular, subjects 1, 3, and 4 are found to be the farthest from the group centre. This spatial distance is illustrated on a statistical parametric map showing the activation effect in figure 3 (one axial slice for each subject). The distances between subject 4 and subjects 8 and 9 are mainly reflected by a greater activity in the left parietal lobe for subjects 8 and 9.
4 Conclusion
We have developed an easy-to-use, fast, and flexible method to analyse the similarity of different subjects' fMRI time series in the temporal or spatial domain, taking into account the specificities of these complex data. The method has the potential to detect outliers in the time or space domain before performing any kind of group analysis, or to detect any particular
Fig. 3. Illustration of the spatial variability observed in figure 1 (right) on an axial slice.
grouping in the data that would invalidate such group analyses. In the future, the method will be coupled with clustering and outlier detection tests. The method is likely to find a number of applications in clinical contexts (e.g. aiding the diagnosis of psychiatric diseases) or neuroscience contexts (e.g. relating the distances to genetic or phenotypic information).
References
1. T.W. Anderson. An Introduction to Multivariate Statistical Analysis. John Wiley, 1984.
2. J.C. Gower. Multidimensional scaling displays. In H.G. Law et al., editors, Research Methods for Multimode Data Analysis. Praeger, New York, 1984.
3. F. Kherif, J.B. Poline, G. Flandin, H. Benali, S. Dehaene, and K.J. Worsley. Multivariate model specification for fMRI data. NeuroImage, 2002 (submitted).
4. W.J. Krzanowski. Between-groups comparison of principal components. Journal of the American Statistical Association, 74:703–704, 1979.
5. S. Kullback and R.A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22:79–86, 1951.
6. C. Lavit. Analyse conjointe de tableaux quantitatifs. Masson, 1984.
7. K. Matusita. Decision rules based on the distance for problems of fit. Annals of Mathematical Statistics, 26:631–640, 1955.
8. K.M. Petersson, T.E. Nichols, J.B. Poline, and A.P. Holmes. Statistical limitations in functional neuroimaging II: signal detection and statistical inference. Philos. Trans. R. Soc. Lond. B Biol. Sci., 354(1387):1261–81, Jul 1999.
9. P. Robert and Y. Escoufier. A unifying tool for linear multivariate statistical methods: the RV-coefficient. Applied Statistics, 25:257–265, 1976.
10. O. Simon, J.F. Mangin, L. Cohen, D. Le Bihan, and S. Dehaene. Topographical layout of hand, eye, calculation, and language-related areas in the human parietal lobe. Neuron, 33(3):475–87, Jan 2002.
11. K.J. Worsley, J.B. Poline, K.J. Friston, and A.C. Evans. Characterizing the response of PET and fMRI data using multivariate linear models. NeuroImage, 6(4):305–19, Nov 1997.
A Comparison of 2D-3D Intensity-Based Registration and Feature-Based Registration for Neurointerventions Robert A. McLaughlin1 , John Hipwell2 , David J. Hawkes2 , J. Alison Noble1 , James V. Byrne3 , and Tim Cox4 1
Medical Vision Laboratory, Dept. Engineering Science, University of Oxford, Oxford, England
[email protected] 2 CISG, Division of Radiological Sciences, Guy’s Hospital King’s College London, London, England
[email protected] 3 Department of Radiology, Radcliffe Infirmary, Oxford, England 4 National Hospital for Neurology and Neurosurgery Queen Square, London, England
Abstract. Registration of 2D-3D data can improve visualisation during minimally-invasive neurointerventions. Using four clinical data sets, we quantitatively compared two approaches: an intensity-based algorithm and a feature-based algorithm. The intensity-based approach was found to be more accurate, with an average registration accuracy of 1.4 mm, compared to the feature-based algorithm with an average accuracy of 2.3 mm. The intensity-based algorithm was also found to be more reliable. The reliability of the feature-based algorithm was found to be more sensitive to the complexity of the vascular structure.
1 Introduction
The registration of 2D-3D data sets is important in minimally invasive neurointerventions, such as the coiling of brain aneurysms or the glueing of arteriovenous malformations (AVM). During such interventions a neuro-radiologist guides a catheter through the brain vasculature using 2D X-ray images. The 2D nature of the images can make it difficult to navigate and position the catheter accurately in a complicated 3D angioarchitecture. One solution would be to utilise a pre-operative phase contrast magnetic resonance angiography (PC-MRA) scan. Such a scan could be segmented [1][2] to produce a 3D model of the vasculature. By registering the intra-operative X-ray image with this 3D model, it would be possible to accurately display the position of the catheter relative to the 3D model. In this paper, we compare two approaches to 2D-3D registration: an intensity-based method [3] and a feature-based method [4]. We compare the accuracy and robustness of these two algorithms on four clinical data sets. T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 517–524, 2002. © Springer-Verlag Berlin Heidelberg 2002
518
R.A. McLaughlin et al.

2 Method

2.1 Intensity-Based Registration
The intensity-based registration algorithm builds on the work in [3] and iteratively optimises the six rigid-body parameters describing the location and rotation of the 3D model. Digitally reconstructed radiographs (DRRs) are generated by casting rays through the segmented volume, and are compared to the digital subtraction angiography (DSA) image using the gradient difference similarity measure [5]. Gradient images are computed for both the DSA image and the DRR using 3 × 3 Sobel templates, and the gradient difference similarity measure minimises the difference between these gradient images. Details are given in [3]. Some modifications to the algorithm of [3] were required to adapt it to work with segmented 3D data and DSA images, rather than unsegmented CT data and fluoroscopy images. The primary modification was the use of a spherical volume-of-interest (VOI), manually defined around the feature of interest (aneurysm or AVM). Only voxels lying within the VOI were used in the registration. The VOI was projected onto the DRR as a circular mask. A concentric circular mask with one quarter of the radius was then defined, and pixels within this smaller mask were used in an initial registration. The radius of the smaller mask was then doubled, and this larger mask was used to refine the registration. The centre of rotation for the volume was set to the centre of the VOI. To reduce processing time at each stage, a multi-resolution strategy was adopted whereby the DRRs and DSA images were sub-sampled by a factor of four. These dimensions were subsequently doubled until the optimisation of the parameters was completed with both images at their full resolutions.
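The gradient difference comparison can be sketched as follows. This is a simplified numpy illustration: the intensity scale factor that [5] optimises is held fixed here, and the normalisation constants are taken as the variances of the DSA gradient images; both simplifications are our assumptions.

```python
import numpy as np

def sobel(img, axis):
    """3x3 Sobel derivative of a 2D image along the given axis."""
    k = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    if axis == 0:                            # derivative along rows
        k = k.T
    p = np.pad(img, 1, mode='edge')
    out = np.zeros(img.shape)
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * p[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out

def gradient_difference(drr, dsa, scale=1.0):
    """Gradient difference similarity between a DRR and a DSA image
    (after [5], simplified). Higher values indicate better alignment."""
    gx = sobel(dsa, 1) - scale * sobel(drr, 1)
    gy = sobel(dsa, 0) - scale * sobel(drr, 0)
    a_x = np.var(sobel(dsa, 1))              # normalisation constants (assumed)
    a_y = np.var(sobel(dsa, 0))
    return np.sum(a_x / (a_x + gx ** 2)) + np.sum(a_y / (a_y + gy ** 2))
```

The measure is largest when the difference between the gradient images vanishes, which is what the optimiser seeks.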
2.2 Feature-Based Registration
The feature-based registration algorithm [4] first skeletonises vessels in the DSA image, reducing the thickness of each to a single pixel. Blood vessels in the 3D model are also skeletonised by extracting the medial axis of each vessel. The algorithm registers the data sets by matching the skeletonised DSA image with a projection of the skeletonised 3D model. For each 3D point, the closest corresponding point in the skeletonised DSA image is found using a territory-based correspondence search as described in [4]. Using these pairs of points and the method outlined in [6], the algorithm finds the optimal rotation and translation of the 3D model to achieve a registration. The registration was performed in three stages, using the VOI defined in Section 2.1. Registration was initially performed with the small mask, refined using the larger mask and finally completed using the entire DSA image. A similar use of masks was presented in [7].
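The rotation and translation update from matched point pairs can be illustrated with the standard SVD-based (Kabsch) least-squares solution. This is a sketch of the kind of closed-form update used in such point matching, not necessarily the exact scheme of [6]:

```python
import numpy as np

def optimal_rigid_transform(P, Q):
    """Least-squares rotation R and translation t mapping the matched
    points P onto Q (both (n, 3) arrays), via the SVD-based solution."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t
```

Iterating the correspondence search and this transform update is the essence of the feature-based registration loop.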
3 Experiments

3.1 Data
Phase-contrast MRA (PC-MRA) scans were obtained for three patients with aneurysms (patients 1-3) and one patient with an AVM (patient 4). Scans were
2D-3D Intensity-Based Registration versus Feature-Based Registration
519
acquired on a Siemens Magnetom Vision 1.5 T with voxel size 0.78 × 0.78 × 1.5 mm and image dimensions 256 × 256 × 64. The scans for patients 1, 2 and 4 contained flow speed information and were segmented as described in [1]. An improved scan was used for patient 3, giving both flow speed and flow direction information, and this extra information was used to give an improved segmentation [2]. Visualisations of the segmented MRA data sets are shown in Figure 1.
(panels, left to right: Patient 1, Patient 2, Patient 3, Patient 4)
Fig. 1. 3D visualisations of segmented MRA scans.
For each patient, two DSA runs at different orientations were acquired using a GE Medical Systems Advantx DX, and digitised from the PAL composite video signal at an image resolution of 512 × 512 pixels using a Matrox Meteor II framegrabber. For each DSA run, three to seven images were acquired at half-second intervals. These were used to generate two images: a maximal image, where the images were combined so that the maximal level of contrast over the run is recorded for each pixel; and a single-frame image, where an image that had maximal opacification of the arterial system was chosen. A distortion-correction phantom and software were used to correct for pincushion distortion in the images [5]. Typical DSA images produced are shown in Figure 2.
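The per-pixel combination used for the maximal image can be sketched in a line of numpy. In DSA, injected contrast typically appears dark, so the per-pixel minimum retains the maximal contrast; this sign convention is our assumption, and the paper does not specify it:

```python
import numpy as np

def maximal_image(frames):
    """Combine a DSA run (array of frames, shape (n, h, w)) so that the
    maximal level of contrast is kept at each pixel. Assumes contrast is
    dark (per-pixel minimum); use np.max for a bright-contrast convention."""
    return np.min(np.asarray(frames, dtype=float), axis=0)
```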
3.2 Calculation of "Gold-Standard" Registration
Parameters for the gold-standard registration may be described as either intrinsic or extrinsic. Intrinsic parameters describe properties of the imaging system,
(panels, left to right: Patient 1, Patient 2, Patient 3, Patient 4)
Fig. 2. The first of two DSA runs obtained for each patient. Patients 1 and 2 show single-frame DSA images, while Patients 3 and 4 show maximal DSA images.
such as the perspective projection matrix. Extrinsic parameters describe the orientation (rotation) and position (translation) of the 3D model [5]. Intrinsic parameters for the gold-standard registration were computed from parameters obtained from the X-ray machine display during acquisition. As no fiducial markers were available in either the PC-MRA scans or the DSA images, the extrinsic parameters were obtained by a manual registration performed by JVB (neuro-radiologist). The manual registration was performed using 3D visualisation software which simulated X-ray images for a specified translation and rotation, allowing the neuro-radiologist to align clinically relevant points in the images. To test the stability of these manual results, the registrations for patient 2, DSA run 1 and patient 4, DSA run 1 were each repeated eight times, and the variation in the results was computed using the reprojection distance described in the next section.
3.3 Experiments for Accuracy and Robustness
The segmented MRA data sets were registered with both maximal and single-frame DSA images. Starting positions for the registrations were chosen by perturbing the gold-standard values by set amounts; this methodology was used in [5]. Four experiments were performed, with the amount of perturbation increased each time, as shown in Table 1. For each experiment, different combinations of the four perturbations resulted in sixteen different starting positions. Note that there were no in-plane translations (δX or δY), as these can be accurately calculated by selecting a single corresponding point in both the DSA image and the DRR simulated from the MRA data. To measure accuracy, the reprojection distance was used, as defined by Masutani et al. [8]. A number of anatomically visible points on the segmented 3D model were chosen, along with the corresponding points in the DSA image. Using the rotation and translation matrix resulting from each registration, the positions of the 3D points were recomputed. The minimum distance (in mm) from each point to the ray passing from the X-ray source to the corresponding DSA image point was then calculated. This gave a measurement of the accuracy of the registration when projecting from 3D to 2D. A discussion of the measurement
Table 1. Perturbations of the starting positions from the gold standard for four of the six rigid-body parameters.

Experiment #   δZ        δθx    δθy    δθz
1              ±25 mm    ±4°    ±4°    ±4°
2              ±50 mm    ±8°    ±8°    ±8°
3              ±75 mm    ±12°   ±12°   ±12°
4              ±100 mm   ±16°   ±16°   ±16°
can be found in [5]. Finally, the average RMS error of all such points for each experiment was computed. If the average RMS error for a particular registration was less than 4 mm, the registration was judged to have succeeded.
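The reprojection distance described above reduces to a point-to-ray distance followed by an RMS. A minimal numpy sketch (function and argument names are ours; the back-projection of the 2D landmarks into ray directions is assumed done using the intrinsic parameters):

```python
import numpy as np

def reprojection_rms(points_3d, ray_dirs, source, R, t):
    """RMS of minimum distances from transformed 3D landmarks to the rays
    from the X-ray source through the corresponding DSA image points.
    ray_dirs: (n, 3) unit direction vectors from `source` through each
    2D landmark; R, t: the registration rotation and translation."""
    p = points_3d @ R.T + t                  # apply the registration transform
    v = p - source                           # source -> point vectors
    # distance from a point to a line: || v - (v . d) d || for unit d
    proj = np.sum(v * ray_dirs, axis=1, keepdims=True) * ray_dirs
    d = np.linalg.norm(v - proj, axis=1)
    return float(np.sqrt(np.mean(d ** 2)))
```

With the 4 mm threshold of the text, a registration would be judged successful when this value is below 4.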
4 Experiment and Results
Figure 3a plots registration accuracy for each algorithm, using both the maximal DSA images and the single-frame DSA images. These are plotted against the variability in the manual "gold-standard" registration, which was computed as 1.7 mm. Figure 3b plots the percentage of successful registrations. Only successful registrations were used in computing the accuracies shown in figure 3a. Results of typical registrations are displayed in figure 4.
Fig. 3. (a) Registration accuracy (RMS error in mm versus experiment number) for the feature-based and intensity-based algorithms on single-frame and maximal DSA images, with the variability of the manual registration shown for reference. (b) Registration reliability (percentage of successful registrations versus experiment number).
The percentage of successful registrations varied with the data sets, being notably higher with Patients 1 and 2 than with Patients 3 and 4. Graphs showing the percentage of successful registrations with each data set are shown in figure 5.
Fig. 4. Results of registration. Registered 3D vessels are overlaid in black. (a) Original DSA image for Patient 2. (b) Typical successful registration for patient 2. (c, d) Failed registrations for patient 2. (e) Original DSA image for Patient 4. (f) Typical successful registration for patient 4. 100
Fig. 5. Registration reliability (percentage of successful registrations versus experiment number) for each patient data set. (a) Feature-based algorithm. (b) Intensity-based algorithm.
5 Discussion
The intensity-based algorithm had the greater accuracy of the two algorithms, with an average accuracy of 1.4 mm. This compared to an average value of 2.3 mm
for the feature-based algorithm. Recall that the feature-based algorithm registers a skeleton of the 3D model with a skeleton of the DSA image. It is thus sensitive to inaccuracies in the positions of the 2D and 3D skeletonised points. This accounts for its lower accuracy when compared to the intensity-based algorithm, which registers the intensity values of every individual pixel. Note that the difference in image quality between the maximal and single-frame DSA images did not noticeably alter the accuracies of either algorithm. The intensity-based algorithm was also more robust. This is in contrast to the experimental results of [7], which found the feature-based algorithm to be more robust. The experiments in [7] were performed using the far simpler vasculature of an in-vitro silicon aneurysm phantom (middle cerebral artery bifurcation aneurysm). Our results suggest that while the feature-based approach may be more robust for simple angioarchitectures, in complicated situations the intensity-based approach is superior. The results in figure 5 support this conclusion. The robustness trends shown in the graphs fall into two distinct classes, with Patients 1 and 2 proving to be more robust than Patients 3 and 4. Recall that while the scans for Patients 1 and 2 contained only flow speed information, Patient 3 contained both flow speed and direction information. This led to a more complicated segmentation, with small vessels detected. Patient 4 was complex due to the angioarchitecture of the AVM. These results suggest that the robustness of the feature-based approach could be greatly improved if, in some initial stage of processing, the vasculature in the 3D model and DSA image could be simplified to contain only the most significant vessels. An essential difference between the two algorithms lies in the method by which they combine conflicting information and iteratively improve the current state of the registration.
The intensity-based algorithm tests each minor perturbation to the current rotation and translation, minimising a similarity measure that is summed over the entire data set. In contrast, in the feature-based algorithm each pair of matching points (one from the 3D skeleton and one from the skeletonised DSA image) specifies an optimal change to the current rotation and translation, and it is the average rotation and translation that is chosen. This renders the algorithm sensitive to the misregistration of one or two erroneous vessels, as these will produce greatly different estimates of the rotation and translation. This suggests that the use of a robust fitting method such as RANSAC [9] may greatly improve the reliability of the feature-based algorithm. The computation time of the algorithms has important ramifications for the clinical suitability of either approach to registration. The feature-based algorithm is far less computationally intensive than the intensity-based algorithm, resulting in a much faster registration. This is because the feature-based algorithm operates on a small number of skeletonised points, rather than the exhaustive pixel-based approach of the intensity-based algorithm. In future work, we will seek to quantify these differences.
6 Conclusion
We have compared an intensity-based and a feature-based registration algorithm for the registration of 3D PC-MRA data to DSA images. The algorithms were tested using four clinical PC-MRA data sets and eight DSA runs. The intensity-based registration algorithm produced more accurate registrations, with an average RMS reprojection error of 1.4 mm; the feature-based algorithm was found to have an average RMS reprojection error of 2.3 mm. The intensity-based algorithm was also found to converge to the correct solution with greater reliability. Our results suggest that the reliability of the feature-based algorithm is more affected by the complexity of the angioarchitecture than is that of the intensity-based method. In future work we will explore whether the feature-based approach may be made more reliable by the incorporation of a robust fitting method.
Acknowledgements We wish to thank Dr. G.P. Penney, K. Rhode, Dr. A.C.S. Chung and Dr. Y. Kita for their help in undertaking this research. This work was supported by EPSRC grants GR/M55008 and GR/M55015.
References
1. Chung, A.C.S., Noble, J.A.: Statistical 3D vessel segmentation using a Rician distribution. In: Proc. MICCAI (1999) 82–89
2. Chung, A., Noble, J.: Fusing magnitude and phase information for vascular segmentation in phase contrast MR angiograms. In: Proc. MICCAI (2000) 166–175
3. Penney, G.P., Weese, J., Little, J.A., Desmedt, P., Hill, D.L.G., Hawkes, D.J.: A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Transactions on Medical Imaging 17 (1998) 586–595
4. Kita, Y., Wilson, D.L., Noble, J.A.: Real-time registration of 3D cerebral vessels to X-ray angiograms. In: MICCAI'98 (1998) 1125–1133
5. Penney, G.P.: Registration of Tomographic Images to X-ray Projections for Use in Image Guided Interventions. PhD thesis, University College London, CISG, Division of Radiological Sciences, Guy's Hospital, King's College London, London SE1 9RT, England (2000)
6. Heuring, J.J., Murray, D.W.: Visual head tracking and slaving for visual telepresence. In: Proc. of IEEE Int. Conf. on Robotics and Automation (1996) 2908–2914
7. McLaughlin, R.A., Hipwell, J., Penney, G.P., Rhode, K., Chung, A., Noble, J.A., Hawkes, D.J.: Intensity-based registration versus feature-based registration for neurointerventions. In: Proceedings of Medical Image Understanding and Analysis (MIUA) (2001) 69–72
8. Masutani, Y., Dohi, T., et al.: Interactive virtualized display system for intravascular neurosurgery. In: CVRMed-MRCAS'97 (1997) 427–435
9. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24 (1981) 381–395
Multi-modal Image Registration by Minimising Kullback-Leibler Distance Albert C.S. Chung1 , William M. Wells III2,3 , Alexander Norbash2 , and W. Eric L. Grimson3 1
Dept. of Computer Science, Hong Kong University of Science & Technology, HK 2 Harvard Medical School, Brigham & Women’s Hospital, Boston, MA USA 3 MIT Artificial Intelligence Laboratory, Cambridge, MA USA
[email protected] Abstract. In this paper, we propose a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from aligned training images. The goal of the registration is to find the optimal transformation such that the discrepancy between the expected and the observed joint intensity distributions is minimised. The difference between distributions is measured using the Kullback-Leibler distance (KLD). Experimental results in 3D-3D registration show that the KLD based registration algorithm is less dependent on the size of the sampling region than the Maximum log-Likelihood based registration method. We have also shown that, if manual alignment is unavailable, the expected joint intensity distribution can be estimated based on the segmented and corresponding structures from a pair of novel images. The proposed method has been applied to 2D-3D registration problems between digital subtraction angiograms (DSAs) and magnetic resonance angiographic (MRA) image volumes.
1 Introduction
A key issue in the medical imaging field is multi-modal image registration. As the use of co-registration packages spreads, the number of aligned image pairs in image databases (aligned either by manual or automatic methods) increases dramatically. These image pairs can serve as a set of training data, in which the statistical joint intensity properties can be observed and learned in order to acquire useful a priori knowledge for future registration tasks. In this paper, we propose a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from aligned training images. One of the key features is the use of the expected joint intensity distribution between two pre-aligned training images as a reference distribution. The goal is to align any two images of the same or different acquisitions such that the expected distribution and the observed joint intensity distribution are well matched. In other words, the registration algorithm aligns two different images based on the expected outcomes. The difference between distributions is measured using the Kullback-Leibler distance (KLD), which is a frequently used information theoretic similarity measure in the machine learning and information theory fields. The KLD value tends to zero when the two distributions become equal. The registration procedure is an iterative process, and is terminated when the KLD value becomes sufficiently small. Experimental results in 3D-3D registration show that the KLD based registration algorithm is less dependent on the size of the sampling region than the Maximum log-Likelihood based method. We have also shown that, if manual alignment is unavailable, the expected joint intensity distribution can be estimated based on the segmented and corresponding structures from a pair of novel images. The proposed method has been applied to 2D-3D registration problems between DSAs and MRA image volumes.

A.C.S. Chung et al.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 525–532, 2002. © Springer-Verlag Berlin Heidelberg 2002
2 Description of the Registration Algorithm

2.1 The Expected and Observed Joint Intensity Distributions
Expected joint intensity distribution: there are two ways of constructing the expected joint intensity distribution. Firstly, the joint distribution can be constructed by manual alignment, which can be done by experienced clinicians with the help of external or internal markers. Let I1 and I2 be the intensity values of two training images of the same or different acquisitions, and X1 and X2 be their image domains respectively. Assume that the values of image pixels are independent of each other. Since the two images have already been aligned, samples of intensity pairs Î = {i1(x), i2(x) | i1 ∈ I1, i2 ∈ I2} can be drawn from I1 and I2, where x are the pixel coordinates, x ∈ X and X = X1 = X2. The expected joint intensity distribution P̂(I1, I2) can be approximated by either Parzen windowing or histogramming [1]. Histogramming is employed in this paper because the approach is computationally efficient, and the intensity histogram size is practical (the histogram has only two dimensions in this case). To achieve sub-voxel accuracy, histogram partial volume (PV) interpolation [7] can be used. A smooth histogram can be obtained by convolving with a Gaussian density function, given by

G_ψ(z) = (2π)^(−n/2) |ψ|^(−1/2) exp(−(1/2) zᵀ ψ⁻¹ z),   (1)
where ψ is the co-variance of the Gaussian function and z can be a vector or scalar value. If manual alignment is unavailable, a second method of constructing the expected joint intensity distribution is to perform segmentations separately in the two images, I1 and I2 , such that the internal anatomical structures are labelled. Let sk , k = 1 . . . M , be the internal structures, where M represents the number of anatomical structures. Then, samples of intensity pairs Iˆ = {i1 (x), i2 (y)|i1 ∈ I1 , i2 ∈ I2 , x, y ∈ sk , k = 1 . . . M } can be drawn if x and y belong to the same structure sk , where x and y are the pixel coordinates in X1 and X2 respectively. Similarly, the expected joint intensity distribution Pˆ (I1 , I2 ) can be approximated by either Parzen windowing or histogramming.
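The histogramming construction of the expected distribution can be sketched as follows. This is a minimal illustration using NumPy/SciPy, not the authors' code; the toy aligned image pair, the bin count of 32 and the kernel width are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def expected_joint_distribution(img1, img2, bins=32, sigma=1.0):
    """Estimate P^(I1, I2) from a pre-aligned image pair by 2D
    histogramming, then smooth by convolving with a Gaussian kernel
    (Eq. 1 with diagonal covariance sigma^2 * I)."""
    # Draw one intensity pair (i1(x), i2(x)) per pixel x.
    hist, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    hist = gaussian_filter(hist, sigma=sigma)  # smooth histogram
    return hist / hist.sum()                   # normalise to a probability mass

# Toy aligned "T1/T2" pair: the second modality is a function of the first.
rng = np.random.default_rng(0)
i1 = rng.integers(0, 256, size=(64, 64)).astype(float)
i2 = 255.0 - i1 + rng.normal(0.0, 5.0, size=i1.shape)
p_hat = expected_joint_distribution(i1, i2)
print(p_hat.shape)  # (32, 32)
```

The same helper applies unchanged to the segmentation-based variant: samples are then drawn only from pixel pairs belonging to the same labelled structure.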
Multi-modal Image Registration by Minimising Kullback-Leibler Distance
527
Observed joint intensity distribution: given a new image pair with a hypothesized transformation T, samples of intensity pairs Io = {i1(x), i2(T(x)) | i1 ∈ I1, i2 ∈ I2} can be drawn from I1 and I2, where x are the pixel coordinates, x ∈ Ω and Ω ⊆ X1 ∪ X2. This means Ω represents a sampling domain that is equal to or inside X1 ∪ X2. Note that the observed joint intensity distribution Po^T(I1, I2) depends on the transformation T and changes during the registration. The Parzen windowing or histogramming approach can also be used to estimate the distribution Po^T.

2.2 Kullback-Leibler Distance (KLD)

Given the expected P̂ and observed Po^T joint intensity distributions, the Kullback-Leibler distance between the two distributions is given by

D(Po^T || P̂) = Σ_{i1,i2} Po^T(i1, i2) log [ Po^T(i1, i2) / P̂(i1, i2) ].   (2)

According to [3,5], D(Po^T || P̂) has two important properties: 1. D(Po^T || P̂) ≥ 0; and 2. D(Po^T || P̂) = 0 iff Po^T = P̂. These properties show that, when the two images I1 and I2 are not perfectly registered, the value of the KLD, D, will be non-zero and positive because the observed and expected joint intensity distributions are not equal, Po^T ≠ P̂. On the other hand, if the images are well registered, then the value of the KLD is equal to zero, i.e. D = 0.

2.3 Optimisation of the Transformation T
The goal of the registration is to find the optimal transformation T̂ by minimising the difference between the observed Po^T and expected P̂, which is formulated as

T̂ = arg min_T D(Po^T || P̂).   (3)
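The objective of Eqs. 2 and 3 can be evaluated directly from the two joint histograms. A minimal sketch; the eps regularisation for empty histogram bins is our implementation choice, not specified in the paper:

```python
import numpy as np

def kld(p_obs, p_exp, eps=1e-12):
    """Kullback-Leibler distance D(Po^T || P^) between observed and
    expected joint intensity distributions (Eq. 2). The small eps
    guards against empty bins (an assumption of this sketch)."""
    p, q = p_obs + eps, p_exp + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

p = np.array([[0.25, 0.25], [0.25, 0.25]])
q = np.array([[0.40, 0.10], [0.10, 0.40]])
print(kld(p, p))       # 0.0: equal distributions (property 2)
print(kld(p, q) > 0)   # positive when they differ (property 1)
```

Registration then amounts to minimising `kld(p_obs, p_exp)` over the transformation parameters that produce `p_obs`.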
The proposed method is conceptually different from the mutual information based registration method, which encourages the functional dependence between the two image random variables, I1 and I2. The KLD based registration method guides the transformation T based on the difference between the expected P̂ and observed Po^T joint intensity distributions, or, in other words, based on the expected outcomes learned from the training data. In this paper, the value of KLD is minimised by Powell's method with a multi-resolution strategy [9] because it does not require gradient calculations and, hence, is simpler in terms of implementation. Powell's method iteratively searches for the minimum value of KLD along each parameter axis of T (1D line minimisation) while the other parameters are kept constant. The search step ΔT is relatively large at coarse resolution and decreases as the resolution gets higher; ΔT is set to 2, 1 and 0.5 mm in this paper (see Section 3.2). The iteration process stops when the change in KLD is sufficiently small (set to 0.001 in this paper).
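The coarse-to-fine, per-axis line search described above can be sketched as follows. This is a simplified stand-in for Powell's method from [9], with a toy quadratic objective in place of the KLD; the parameter names and step schedule follow the text:

```python
import numpy as np

def coordinate_descent(objective, t0, steps=(2.0, 1.0, 0.5), tol=1e-3):
    """Minimise `objective` over transformation parameters: search each
    axis in turn, shrink the step across resolutions (2, 1, 0.5 mm),
    and stop a pass when the improvement falls below tol."""
    t = np.asarray(t0, dtype=float)
    best = objective(t)
    for step in steps:                      # coarse -> fine resolution
        improved = True
        while improved:
            improved = False
            for axis in range(t.size):      # 1D search along each axis
                for delta in (+step, -step):
                    cand = t.copy()
                    cand[axis] += delta
                    val = objective(cand)
                    if best - val > tol:
                        t, best, improved = cand, val, True
    return t, best

# Stand-in objective with its minimum at (3, -1, 0.5).
f = lambda t: float(np.sum((t - np.array([3.0, -1.0, 0.5])) ** 2))
t_opt, v = coordinate_descent(f, [0.0, 0.0, 0.0])
print(np.round(t_opt, 2))  # close to [3., -1., 0.5]
```

The 0.5 mm final step lets the search settle on the half-integer coordinate that the coarser steps cannot reach.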
Fig. 1. (a) T1 and (b) T2 images.
3 Experimental Results

3.1 T1 – T2 (3D-3D) Registration
The T1 and T2 datasets are obtained from the BrainWeb Simulated Brain Database (277 × 241 × 181 voxels and 1 × 1 × 1 mm³) [2], in which all the corresponding images have already been perfectly aligned and can be used as a testing platform for studying the performance of different objective functions. Maximum log-Likelihood (ML) [6] and Mutual Information (MI) [10] were compared with the KLD; their definitions are given by

ML = Σ_x log P̂(i1(x), i2(T(x))),   (4)

and

MI = Σ_{i1,i2} Po^T(i1, i2) log [ Po^T(i1, i2) / (Po^T(i1) Po^T(i2)) ]   (5)
respectively, where Po^T(i1) and Po^T(i2) are the marginal distributions, x are the pixel coordinates, x ∈ Ω and Ω ⊆ X1 ∪ X2. One of the pairs of 2D T1 and T2 image slices is shown in Figs. 1a and 1b respectively, with their intensity values and image domains represented by I1 and I2, and X1 and X2 respectively. Since these images in the datasets are aligned, the expected joint intensity distribution P̂(I1, I2) can be estimated based on the method described in Section 2.1 (only slices from positions 30 to 160 were used in order to avoid the inherent image artifacts in the dataset). In order to study the performance of the objective functions, X2 was shifted horizontally and rotated, whereas the position and orientation of X1 were fixed. Given a transformation T, if any pixel x2 in X2 fell between the voxel positions of X1, then its corresponding intensity value i1 was computed by linearly interpolating the values of its four neighbouring pixels in X1 to achieve sub-voxel accuracy. The observed joint intensity distribution Po^T was then estimated according to Section 2.1. In this paper, the number of bins was set to 32 and the co-variance matrix ψ in Eq. 1 was the diagonal matrix diag(σ², σ²) with σ² = 1.
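The horizontal-shift probing with linear interpolation can be mimicked in a few lines. This is a sketch, not the authors' code: `scipy.ndimage.shift` with order=1 stands in for the four-neighbour linear interpolation, and the toy image pair is an assumption:

```python
import numpy as np
from scipy.ndimage import shift

def observed_joint_distribution(img1, img2, tx, bins=32):
    """Estimate Po^T(I1, I2) for a horizontal shift tx of img2,
    using linear interpolation (order=1) for sub-pixel offsets."""
    img2_t = shift(img2, (0.0, tx), order=1, mode="nearest")
    hist, _, _ = np.histogram2d(img1.ravel(), img2_t.ravel(), bins=bins)
    return hist / hist.sum()

rng = np.random.default_rng(1)
a = rng.random((64, 64))
b = 1.0 - a                      # toy 'second modality' of the same scene
p0 = observed_joint_distribution(a, b, 0.0)
p5 = observed_joint_distribution(a, b, 5.0)
# At the identity offset the intensity pairing is much tighter
# (mass concentrated on the anti-diagonal) than at a 5-pixel offset.
print(p0.max() > p5.max())
```

Evaluating an objective function at a grid of `tx` values reproduces the kind of performance curve plotted in Figs. 2–4.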
Fig. 2. T1-T2 registration performance analysis (panels: a. KLD, b. ML, c. MI). T2 image was shifted horizontally. The offset values range from −40 mm to 40 mm.
Fig. 3. T1-T2 registration performance analysis (panels: a. KLD, b. ML, c. MI). T2 image was rotated. The offset values range from −40° to 40°.
We set Ω = X2 for ease of implementation in this paper. If x2 fell outside the domain of image X1, then an arbitrary intensity value in the background of X1 was assigned to i1. As plotted in Figs. 2 and 3, the performances of the three measures (KLD, ML and MI) are comparable when the T2 image (X2) was shifted horizontally between −40 mm and 40 mm, and rotated between −40° and 40°. However, it is also common to discard a sample (i1(x), i2(T(x))) if it falls outside the overlapping region, i.e. x ∉ X1 ∩ X2. As shown in Fig. 4, when Ω was set to X1 ∩ X2, the performance of ML was adversely affected when only samples drawn from the overlapping region were included in the calculation. As compared with ML, the figure shows that KLD and MI are less dependent on the size of the sampling region Ω. The major reason is that, from Eq. 4, the value of ML depends only on the observed samples x. Therefore, when the area of the overlapping region is small, fewer samples are obtained and thus the value of ML increases. In contrast, given the same set of observed samples, the value of KLD consists of the contributions of the observed samples and, most importantly, the penalties of the unobserved samples from the expected joint intensity distribution P̂. Therefore, the entire distribution P̂ is utilised in the KLD measure. Finally, the value of MI depends mostly on the randomness of the observed samples. The decrease in overlapping area increases the sample randomness and, hence, the value of MI decreases.

Fig. 4. T1-T2 registration performance analysis (panels: a. KLD, b. ML, c. MI). T2 image was shifted horizontally; however, only samples which fell in the overlapping region of the two images were included in the calculations.

In terms of computational efficiency, comparing Eq. 2 with Eq. 4, it is observed that, since KLD does not require the calculation of the marginal distributions, Po^T(i1) and Po^T(i2), it can be more computationally efficient than MI. From Eq. 4, the efficiency of ML is directly proportional to the number of samples drawn. On the other hand, the efficiency of KLD is directly proportional to the product of the numbers of bins B1 and B2 partitioning I1 and I2 respectively. As such, the efficiencies of ML and KLD are related to different parameters, and their comparison is parameter dependent.

3.2 DSA – MRA (2D-3D) Registration
The proposed method was applied to 2D-3D registration problems and tested on two clinical datasets, which were acquired at the Department of Radiology, Brigham and Women's Hospital, Boston, USA. Each dataset consists of a pre-interventional 3D magnetic resonance angiographic (MRA) image volume (256 × 256 × 60 voxels and 0.78 × 0.78 × 1.3 mm³), and a 2D digital subtraction angiogram (DSA) acquired during the interventional treatment. Figs. 5a and 5d show the two cropped DSAs. The DSAs were distortion-corrected using a distortion correction object with a uniform grid pattern [4]. A maximum intensity projection (MIP) of each MRA volume was generated using the projective geometry and ray casting method [8,11], with six rigid-body transformation parameters (three translational and three rotational). The initial transformations were obtained from the machine readings of the C-arm X-ray systems, as shown in Figs. 5d and 5h. For each dataset, the expected joint intensity distribution was estimated based on the segmented and corresponding structures from the novel DSA and the initial non-registered MIP. These structures consist of vessel and background regions, each defined by a manually selected intensity range for the two datasets (more advanced methods can be applied but they are not the focus of this paper). The expected distribution P̂ was estimated by randomly drawing samples of the same structures from the DSA and MIP, as described in Section 2.1. Then, the observed distribution Po^T was generated during the registration and used to guide the rigid-body transformation using the KLD measure, as defined in Eqs. 2 and 3. The optimal transformation was searched using Powell's method with a multi-resolution strategy, as described in Section 2.3.

Fig. 5. 2D-3D registration results: (a,e) digital subtraction angiograms (DSAs) (vessels are black in colour), (b,f) final image alignments, maximum intensity projections (MIPs) of the magnetic resonance angiographic (MRA) image volumes (vessels are white in colour and their intensity is directly proportional to the flow speed), (c,g) segmented MIPs overlaid on their corresponding DSAs, and (d,h) initial image alignments.

Figs. 5b and 5f show the MIPs of the registered MRA volumes, and the results are promising. Segmented vessel regions of the MIPs are overlaid on the corresponding DSAs, as shown in Figs. 5c and 5g. Note that the remaining discrepancy between the DSA and MIP may be caused by (a) some vessels being visible in one image but not in the other, due to different vessel delineation properties in different acquisitions and different regions of interest selected, (b) signal loss in the MRA images (e.g. turbulent or eddy flow), or (c) geometric distortion due to the MR gradient field nonlinearity.
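For intuition, the simplest orthographic form of a maximum intensity projection keeps the brightest voxel along each ray, which is why bright MRA vessels survive the projection. A sketch only; the method used in the paper [8,11] additionally applies projective geometry and the six rigid-body parameters:

```python
import numpy as np

def mip(volume, axis=2):
    """Orthographic maximum intensity projection: each output pixel
    keeps the brightest voxel along the casting direction."""
    return volume.max(axis=axis)

vol = np.zeros((8, 8, 4))
vol[2, 3, 1] = 7.0               # a single bright 'vessel' voxel
proj = mip(vol)
print(proj.shape, proj[2, 3])    # (8, 8) 7.0
```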
4 Summary and Conclusions
In this paper, we have proposed a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from aligned training images. The difference between the expected and observed joint intensity distributions is measured by the Kullback-Leibler distance (KLD), which has a non-zero and positive value when there is any discrepancy between the two distributions. The KLD-based registration algorithm guides the transformations by minimising the KLD value until the two datasets are aligned. The results based on T1-T2 (3D-3D) registration experiments show that, as compared with the Maximum log-Likelihood (ML) based registration method, the KLD-based registration algorithm is less dependent on the size of the sampling region. In DSA-MRA (2D-3D) registration experiments, we have shown that the expected joint intensity distribution can also be estimated based on the segmented and corresponding structures (vessel and background regions) from the novel DSA and the initial non-registered MIP. The DSA-MRA registration results are promising and demonstrate the applicability of our method in 2D-3D registration. Future work will include further validation of the proposed algorithm by applying it to a large number of datasets.
Acknowledgements

We would like to thank K. Rhode and D. Hawkes at Guy's Hospital, London, U.K. for sharing the DSA image distortion correction software. W. M. Wells III would like to acknowledge support from the NSF ERC grant (JHU Agreement #8810-274) and the NIH (grant #1P41RR13218).
References

1. C.M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
2. D.L. Collins, A.P. Zijdenbos, et al. Design and Construction of a Realistic Digital Brain Phantom. IEEE Trans. Med. Img., 17(3):463–468, 1998.
3. T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., 1991.
4. P. Haaker, E. Klotz, et al. Real-time distortion correction of digital X-ray II/TV-systems: an application example for digital flashing tomosynthesis (DFTS). International Journal of Cardiac Imaging, 6(1):39–45, 1990–91.
5. S. Kullback. Information Theory and Statistics. Dover Publications, Inc., 1968.
6. M.E. Leventon and W.E.L. Grimson. Multi-Modal Volume Registration Using Joint Intensity Distributions. In MICCAI, pages 1057–1066, 1998.
7. F. Maes, A. Collignon, et al. Multimodality Image Registration by Maximization of Mutual Information. IEEE Trans. Med. Img., 16(2):187–198, 1997.
8. G.P. Penney, J. Weese, et al. A Comparison of Similarity Measures for Use in 2D-3D Medical Image Registration. IEEE Trans. Med. Img., 17(4):586–595, 1998.
9. W.H. Press, S.A. Teukolsky, et al. Numerical Recipes in C, 2nd Edition. Cambridge University Press, 1992.
10. W.M. Wells, P. Viola, et al. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35–51, 1996.
11. L. Zöllei. 2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images. MIT Masters Dissertation, 2001.
Cortical Surface Registration Using Texture Mapped Point Clouds and Mutual Information Tuhin K. Sinha, David M. Cash, Robert J. Weil, Robert L. Galloway, and Michael I. Miga Vanderbilt University, Nashville TN 37235, USA {tk.sinha,dave.cash,michael.i.miga}@vanderbilt.edu
[email protected] http://bmlweb.vuse.vanderbilt.edu
Abstract. An inter-modality registration algorithm that uses textured point clouds and mutual information is presented within the context of a new physical-space to image-space registration technique for imageguided neurosurgery. The approach uses a laser range scanner that acquires textured geometric data of the brain surface intraoperatively and registers the data to grayscale encoded surfaces of the brain extracted from gadolinium enhanced MR tomograms. Intra-modality as well as inter-modality registration simulations are presented to evaluate the new framework. The results demonstrate alignment accuracies on the order of the resolution of the scanned surfaces (i.e. submillimetric). In addition, data are presented from laser scanning a brain’s surface during surgery. The results reported support this approach as a new means for registration and tracking of the brain surface during surgery.
1 Introduction
Understanding the geometric characteristics and the impact of intraoperative surgical events upon the cortical brain surface has important implications for the development of image-guided surgery (IGS) systems. In recent studies [1], the need for brain shift compensation strategies to prevent compromising IGS navigation has become an important area of research [2]. When using a computational approach to correct for brain shift [3], capturing the geometric and visual changes of the brain surface due to deformation may be a valuable source of intra-operative data. To achieve this end, a laser range scanning system capable of capturing textured surfaces with sub-millimetric accuracy will be used. Using features from the cortical surface to register does have precedent. Nakajima et al. demonstrated an average of 2.3 ± 1.3 mm fiducial registration error (FRE) using cortical vessels for registration [4]. More recently, Nimsky et al. reported a deformable surface approach to quantify surface shifts using a variation on the iterative closest point (ICP) algorithm [1]. Also, some preliminary work utilizing a scanning-based system for cortical surface registration has been reported, but a systematic evaluation has not been performed to date [5]. The novelty of the approach reported here is that both vessel information and three-dimensional topography will be used as the basis of alignment. Furthermore, the scanner provides a highly accurate method for tracking the brain surface that can be used in the model-updating framework. As an initial step, an implementation has been developed using an iterative closest point (ICP) [6] framework with mutual information (MI) [7]. Although ICP and MI have been used extensively [8][9], previously published registration frameworks do not entirely apply to the unique data provided by the scanner or this particular registration approach. The data acquired by the scanner provides a one-to-one correspondence between contour point and image intensity. However, intensity correspondence between a three-dimensional MR surface and an intraoperatively acquired laser-scanned cortical surface is somewhat more elusive. The most similar work relating to this registration framework is that by Johnson and Kang [10], in which these investigators used an objective function for registration based on a combined Euclidean distance and color difference metric. Used primarily in a landscape alignment application, this technique would not be amenable to the alignment process here, since the intensity distribution between scanner and MR image data is fundamentally very different. To our knowledge, no registration algorithm has been developed that will register textured three-dimensional surfaces from two different imaging modalities within the context of cortical surface registration.

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 533–540, 2002. © Springer-Verlag Berlin Heidelberg 2002
2 Methods
In the realization of this approach, a laser range scanning system (RealScan 3D, 3D Digital Corporation, Danbury, CT) capable of capturing three-dimensional textured surfaces to sub-millimeter accuracy has been utilized (see Figure 1). The scanner is lightweight, compact and has a standard tripod mount. The scanning field consists of 500 horizontal by 494 vertical points per scan and is accomplished in approximately 5 seconds. Extensive calibration and characterization has been performed by Cash et al. and has demonstrated the fidelity at which surface data can be acquired [11]. Additionally, the device is approved for use in neurosurgery by the Vanderbilt University Medical Center Institutional Review Board.
Fig. 1. Laser scanner used to acquire textured point clouds.
The registration framework involves two primary steps in its execution. The first step involves acquisition and preparation of the registration surfaces. With respect to laser-scanned surfaces, the scanner is currently placed approximately 1-2 feet from the surface of interest (achieved either by passive arm or monopod for intraoperative use). The horizontal range of the scanner is established and a vertical laser stripe passes over the surface in approximately 5 seconds. The data acquired consists of a three-dimensional point cloud with each Cartesian coordinate color-encoded via texture mapping into a digital image that is acquired just after scanning. The texture-space to scanner-space registration is calibrated by the manufacturer. The MR-generated point cloud is prepared by segmenting the brain volume, followed by ray-casting to find surface points, and averaging subsequent voxels to generate gray-scale values for each surface point (Analyze AVW - Biomedical Imaging Resource). The final step in our approach is to perform surface registration using a two-stage process. An iterative closest point (ICP) algorithm is performed initially to align the point clouds of interest (i.e. laser-scanned surface and/or MR surface). The second stage is a constrained intensity-based registration. The constraint requires the alignment transformation to operate only in spherical coordinates with known radius R; the radius is provided by sphere-fitting the target surface [12]. By enforcing this restriction on the transformation, the degrees of geometric freedom are reduced from six to three, i.e. elevation φ, azimuthal θ, and roll ψ. For the method of intensity-based registration, a maximization of normalized mutual information (NMI) [13] approach is conducted using Powell's optimization algorithm [14]. Referred to as Surface MI in this work, the method aligns textured surfaces only and does not use volumetric image data. The results presented here do not reflect true cross-modality registration (i.e. scanner to MR).
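Normalised mutual information in the sense of [13] is the ratio (H(A) + H(B)) / H(A, B) of marginal to joint entropies, computed here from a joint intensity histogram. This is a sketch only: the surface sampling and the spherical-coordinate transform are omitted, and the bin count is an assumption:

```python
import numpy as np

def nmi(i1, i2, bins=32):
    """Studholme's normalised mutual information between two intensity
    samples: (H1 + H2) / H12, maximal when the samples are aligned."""
    h12, _, _ = np.histogram2d(i1.ravel(), i2.ravel(), bins=bins)
    p12 = h12 / h12.sum()
    p1, p2 = p12.sum(axis=1), p12.sum(axis=0)   # marginals

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    return (entropy(p1) + entropy(p2)) / entropy(p12.ravel())

rng = np.random.default_rng(2)
x = rng.random(10_000)
# Self-alignment (NMI -> 2) scores higher than an unrelated sample (NMI -> 1).
print(nmi(x, x) > nmi(x, rng.random(10_000)))
```

In the Surface MI stage, the two intensity samples would come from corresponding points of the textured surfaces, and `nmi` would be maximised over (φ, θ, ψ) with Powell's method.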
3 Registration Experiments
To evaluate robustness and accuracy of Surface MI, an initial series of experiments was conducted using a spherical phantom with a heterogeneous intensity pattern on the surface. The range-scanned surface acquired for registration experiments occupied a solid angle of Ω = 1.2π steradians¹ and contained 67257 points (see Figure 2). A known transformation was then applied to the target surface to generate the floating surface. The limits for elevation, azimuthal and roll angle perturbations were ±13, ±13, and ±25 degrees, respectively (the radius of the spherical phantom was approximately 110 mm). The floating and target surfaces are then re-registered using Surface MI. Five hundred randomly distributed combinations of φ, θ, and ψ were tested for registration accuracy. The second series of experiments employed the point clouds generated from surface projections of the MR volume. The target surface that was generated using a clipping plane had a solid angle of approximately Ω = 0.38533π steradians and contained 48429 points (see Figure 3). Similar to the spherical phantom experiments, perturbations in φ, θ, and ψ were applied to the MR surface over 500 trials. The ranges for the parameters φ, θ, and ψ were the same as those for the previous experiment with similar radius (R = 105 mm).

¹ The solid angle of a unit sphere is Ω = 4π steradians.
Fig. 2. Sample textured point cloud generated using a laser range scanner.
Fig. 3. Sample textured point cloud generated using surface projection on a gadolinium enhanced MR volume.
Fig. 4. Use of a clipping plane to select a region of interest in the surface projection.
The last series of experiments evaluated the efficacy of the developed algorithm in registering surfaces across modalities. Inter-modality surfaces were simulated by inverting the texture of the point cloud. Five hundred trials registering a texture-inverted region of interest (ROI) to the original MR brain surface were performed with initial misregistrations comparable to the spherical phantom experiments. The ROIs were generated by varying the normal of the clipping plane used to create the target surface between ±0.1 cm in the sagittal and coronal axes while holding the axial value at 1 cm (see Figure 4). To create the misregistration between the float and target surfaces, each surface was re-centered about its geometric centroid.
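Because target and floating surfaces share a one-to-one point correspondence in these simulations, the mean target registration error reduces to an average point-to-point Euclidean distance. A minimal sketch; the point cloud and the 5-degree roll perturbation are illustrative, not data from the paper:

```python
import numpy as np

def mean_tre(points_a, points_b):
    """Mean target registration error for corresponding point clouds:
    the average Euclidean distance between matched points."""
    return float(np.mean(np.linalg.norm(points_a - points_b, axis=1)))

rng = np.random.default_rng(3)
cloud = rng.random((1000, 3)) * 100.0
theta = np.deg2rad(5.0)                      # small roll perturbation
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
perturbed = cloud @ R.T
print(mean_tre(cloud, cloud))                # 0.0 for identical clouds
print(mean_tre(cloud, perturbed) > 0.0)
```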
4 Registration Results and Discussion
Since the same scan was used for both target and floating surfaces in the registration process, the one-to-one correspondence in points was known. This allowed calculation of the mean target registration error (TRE) between point clouds as well as the global maximum for NMI. Sample registration results are presented for each experiment series (i.e. spherical phantom, intra-modality MR, simulated inter-modality MR) in Figure 5. In addition, a distribution of TREs for each series of experiments can be seen in Figure 6. Registration results from the 500 trials using the spherical phantom yielded a mean TRE of 11.38±28.75 mm (min.=0.04, max.=127.61 mm). Although this result is less than remarkable, it should be noted that 70% of the trials achieved a mean TRE of 0.20±0.05 mm (min.=0.04, max.=0.31 mm). Furthermore, the misalignment range during surgery is expected to be ±5 degrees within each angular coordinate. Within this range, the registration process achieved a 100% success rate (i.e. NMI optimization reached its global maximum). With respect to the intra-modality MR experiments, all 500 trials resulted in an ideal value of NMI. The mean TRE for the 500 trials was 0.14±0.04 mm (min.=0.04, max.=0.27 mm). The increased success rate of this series of experiments as compared to the previous trials is likely due to the differences in the geometric structure of the intensity information. Most of the intensity information of the spherical phantom is contained in the central region of the surface. In some cases, when the initial misregistration of the spherical phantom caused sufficient non-overlap of the central area, the algorithm did not register the surfaces correctly. For the brain, the intensity pattern of the vessel structure occupies most of the surface. Thus, even though the brain's surface occupies a smaller solid angle than that of the ball, the distribution of the intensity pattern allows the alignment of more severely misregistered surfaces.
The last series of experiments simulating inter-modality registration generated a mean TRE of 3.38±7.18 mm (min.=0.07,max.=53.75 mm). Similar to the spherical phantom, 67% of these trials produced a mean TRE of 0.37±0.19 mm (min.=0.07,max.=1.00 mm). Analysis of the failed trials indicated that the spherical constraint prevented accurate registration. In general, the algorithm failed to register surfaces clipped from or containing the periphery of the surface projection, which contained a much higher surface curvature as compared to the target surface. This discrepancy in surface curvatures between target and floating surfaces caused the sub-optimal registrations. In general, the occurrence of curvature discrepancies intra-operatively will be limited since vessel landmarks will be used to provide an initial alignment for the Surface MI.
5 Conclusions and Future Work
The results of this paper show that the ICP and MI framework is a useful tool for cortical surface registration. Results of both intra- and inter-modality surface registration show sub-millimetric accuracies using a phantom. This paper outlines preliminary steps taken with the laser range scanner and the Surface MI algorithm. In vivo analysis of the registration results is currently in progress. Figure 7 shows intra-operative data of the cortical surface acquired by the laser range scanner. More quantitative studies of the laser range scanner and registration algorithm are also planned using an optical tracking system. Algorithmically, the ability to track and register cortical deformations is also being studied.

Fig. 5. Sample registration results. Top row, from left to right: on-axis view of misregistered and registered surfaces of the spherical phantom; off-axis view of misregistered and registered surfaces. Middle row: sample results of the intra-modality registration, presented similarly to the top row. Bottom row, from left to right: misregistered and registered surfaces from simulated inter-modality experiments.
Acknowledgements

The authors acknowledge Dr. Hill for his correspondence on MI. VTK (Kitware Inc.) and Analyze AVW (Mayo Clinic) provided software. A grateful acknowledgement to the VUMC Neurosurgical staff. This project is supported in part by the Vanderbilt University Discovery Grant Program.
Fig. 6. Distribution of Target Registration Error (TRE) for each series of experiments.
Fig. 7. Example dataset taken with the laser range scanner in the operating room. Left, a CCD image of the surgical area. Right, a tessellated point cloud with texture mapped points on the right.
References

1. Nimsky, C., Ganslandt, O., Cerny, S., Hastreiter, P., Greiner, G., Fahlbusch, R.: Quantification of, visualization of, and compensation for brain shift using intraoperative magnetic resonance imaging. Neurosurgery 47 (2000)
2. Roberts, D., Miga, M., Hartov, A., Eisner, S., Lemery, J., Kennedy, F., Paulsen, K.: Intraoperatively updated neuroimaging using brain modeling and sparse data. Neurosurgery 45 (1999)
3. Miga, M., Paulsen, K., Lemery, J., Eisner, S., Hartov, A., Kennedy, F., Roberts, D.: Model-updated image guidance: Initial clinical experiences with gravity-induced brain deformation. IEEE Trans. on Med. Img. 18 (1999)
4. Nakajima, S., Atsumi, H., Kikinis, R., Moriarty, T.M., Metcalf, D.C., Jolesz, F.A., Black, P.M.: Use of cortical surface vessel registration for image-guided neurosurgery. Neurosurgery 40 (1997)
5. Audette, M.A., Siddiqi, K., Peters, T.M.: Level-set surface segmentation and fast cortical range image tracking for computing intrasurgical deformations. LNCS: Med. Image Computing and Computer-Assisted Intervention 1679 (1999)
6. Besl, P.J., McKay, N.D.: A method for registration of 3-D shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence 14 (1992)
7. Wells, W.M., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.: Multi-modal volume registration by maximization of mutual information. Med. Image Analysis 1 (1996) 35–51
8. Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., Suetens, P.: Multimodality image registration by maximization of mutual information. IEEE Trans. on Med. Imag. 16 (1997) 187–198
9. Audette, M.A., Ferrie, F.P., Peters, T.M.: An algorithmic overview of surface registration techniques for medical imaging. Med. Image Analysis 4 (2000) 201–217
10. Johnson, A.E., Kang, S.B.: Registration and integration of textured 3D data. Image and Vision Computing 17 (1999) 135–147
11. Cash, D.M., Sinha, T.K., Chapman, W.C., Galloway, R.L., Miga, M.I.: Fast accurate surface acquisition using a laser scanner for image-guided surgery. SPIE: Med. Imag. 2002 (2002)
12. Ahn, S.J., Rauh, W., Warnecke, H.J.: Least-squares orthogonal distances fitting of circle, sphere, ellipse, hyperbola, and parabola. Pattern Recognition 34 (2001)
13. Studholme, C., Hill, D.L.G., Hawkes, D.J.: An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 32 (1999) 71–86
14. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C: The Art of Scientific Computing. Second edn. Cambridge University Press (1993)
A Viscous Fluid Model for Multimodal Non-rigid Image Registration Using Mutual Information

E. D’Agostino, F. Maes, D. Vandermeulen, and P. Suetens

Katholieke Universiteit Leuven, Faculties of Medicine and Engineering, Medical Image Computing (Radiology - ESAT/PSI), University Hospital Gasthuisberg, Herestraat 49, B-3000 Leuven, Belgium
[email protected]

Abstract. We propose a multimodal free-form registration algorithm based on maximization of mutual information. Images to be aligned are modeled as a viscous fluid that deforms under the influence of forces derived from the gradient of the mutual information registration criterion. Parzen windowing is used to estimate the joint intensity probability of the images to be matched. The method was verified for registration of simulated T1-T1, T1-T2 and T1-PD images with known ground truth deformation. The results show that the root mean square difference between the recovered and the ground truth deformation is smaller than 1 voxel.
1 Introduction
Maximization of mutual information has been demonstrated to be a very general and reliable approach for affine registration of multimodal images of the same patient or from different patients, including atlas matching [7,9]. In applications where local morphological differences need to be quantified, affine registration is no longer sufficient and non-rigid registration (NRR) is required, aiming at finding a 3D vector field describing the deformation at each point. Applications for NRR include shape analysis (to warp all shapes to a standard space) and atlas-based segmentation (to compensate for gross morphological differences between atlas and study images). Different approaches have been proposed for extending the mutual information criterion to NRR. Spline-based approaches [8,6] can correct for gross shape differences, but a dense grid of control points is required to characterize the deformation at voxel level detail, implying high computational complexity. Block matching [4] or free-form approaches, using a non-parameterized expression for the deformation field, assign a local deformation vector to each voxel individually, but need appropriate constraints for spatial regularization of the resulting vector field. Elastic constraints are suitable when displacements can be assumed to be small, while for large magnitude deformations a viscous fluid model is more appropriate.
Frederik Maes is Postdoctoral Fellow of the Fund for Scientific Research - Flanders (FWO-Vlaanderen, Belgium).
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 541–548, 2002. © Springer-Verlag Berlin Heidelberg 2002
Recently, a multimodal NRR algorithm was presented in [5], defining the forces driving the deformation at each voxel such that mutual information is maximized and using a regularization functional derived from linear elasticity theory. In this paper, we extend the approach of [5] by replacing the elastic model by the viscous fluid regularization model of Christensen et al. [3] and thus generalize the method of [3] to multimodal image registration based on maximization of mutual information. The Navier-Stokes equation modelling the viscous fluid is solved by iteratively updating the deformation field and convolving it with a Gaussian filter. The deformation field is regridded as needed during iterations as in [3] to assure that its Jacobian remains positive everywhere, such that the method can handle large deformations. We verified the robustness of the method by applying realistic known deformations to simulated multispectral MR images and evaluating the difference between the recovered and ground truth deformation fields in terms of displacement errors and of tissue classification errors when using the recovered deformation for atlas-based segmentation.
2 Method

2.1 The Viscous Fluid Algorithm
We follow the approach of [3] to deform a template image F onto a target image G, using an Eulerian reference frame to represent the mapping T = x − u(x) of fixed voxel positions x in target space onto the corresponding positions x − u(x) in the original template space. The deforming template image is considered as a viscous fluid whose motion is governed by the Navier-Stokes equation of conservation of momentum. Using the same simplifications as in [3], this equation can be written as

∇²v + ∇(∇·v) + F(x, u) = 0   (1)

with F(x, u) a force field acting at each position x that depends on the deformation u and drives the deformation in the appropriate direction, and with v(x, t) the deformation velocity experienced by a particle at position x:

v = du/dt = ∂u/∂t + Σ_{i=1}^{3} v_i ∂u/∂x_i   (2)

with v = [v_1(x, t), v_2(x, t), v_3(x, t)]^T and u = [u_1(x, t), u_2(x, t), u_3(x, t)]^T. In Section 2.2, we derive an expression for the force field F such that the viscous fluid flow maximizes mutual information between corresponding voxel intensities. When the forces are given, solving (1) yields deformation velocities, from which the deformation itself can be computed by integration over time. In [3] the Navier-Stokes equation is solved by Successive Over-Relaxation (SOR), but this is a computationally expensive approach. Instead, we follow the approach of [2] and obtain the velocity field by convolution of the force field with a Gaussian kernel ψ:

v = ψ ⋆ F   (3)
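For concreteness, the convolution solution (3) together with the update (4)-(5) can be sketched in NumPy/SciPy. This is an illustrative reconstruction, not the authors' implementation; the kernel width sigma is an assumed value, while the cap of 0.3 voxels per iteration is the value quoted later in Section 2.3:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fluid_step(u, force, sigma=2.0, du_max=0.3):
    """One viscous-fluid iteration for (3, X, Y, Z) vector fields u, force."""
    # Eq. (3): approximate the Navier-Stokes solve by Gaussian smoothing
    v = np.stack([gaussian_filter(force[c], sigma) for c in range(3)])
    # Eq. (5): perturbation R = v - sum_i v_i * (du/dx_i)
    R = v.copy()
    for c in range(3):                  # component of u
        grads = np.gradient(u[c])       # du_c/dx_i for i = 1..3
        for i in range(3):
            R[c] -= v[i] * grads[i]
    # Eq. (4), with dt chosen so no voxel moves more than du_max per step
    dt = du_max / max(np.abs(R).max(), 1e-12)
    return u + dt * R

# toy demonstration on a small grid with a random force field
rng = np.random.default_rng(0)
u = np.zeros((3, 8, 8, 8))
u = fluid_step(u, rng.standard_normal((3, 8, 8, 8)))
print(np.abs(u).max())  # bounded by du_max = 0.3
```

A full registration would iterate this step, re-deriving the force field from the current deformation and regridding when the Jacobian check described below fails.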
The displacement u^{(k+1)} at iteration (k + 1) is then given by:

u^{(k+1)} = u^{(k)} + R^{(k)} Δt   (4)

with R^{(k)} the perturbation to the deformation field:

R^{(k)} = v^{(k)} − Σ_{i=1}^{3} v_i^{(k)} ∂u^{(k)}/∂x_i   (5)

The time step Δt is constrained by Δt · max‖R‖ ≤ Δu, with Δu the maximal voxel displacement that is allowed in one iteration. To preserve the topology of the object image, the Jacobian of the deformation field should not become negative. When the Jacobian becomes smaller than some positive threshold anywhere, regridding of the deformed template image is applied as in [3] to generate a new template, setting the incremental displacement field to zero. The total deformation is the concatenation of the incremental deformation fields associated with each propagated template.

2.2 Force Field Definition
We define an expression for the force field F(x, u) in (1) such that the viscous fluid deformation strives at maximizing the mutual information I(u) of corresponding voxel intensities between the deformed template image F(x − u) and the target image G(x). We adopt here the approach of [5], who derived an expression for the gradient ∇_u I of I with respect to the deformation field u, modelling the joint intensity distribution p_u^{F,G}(i1, i2) of template and target images as a continuous function using Parzen windowing. If the deformation field u is perturbed into u + εh, variational calculus yields the first variation of I:

∂I(u + εh)/∂ε |_{ε=0} = ∂/∂ε ∫ p_{u+εh}^{F,G}(i1, i2) log [ p_{u+εh}^{F,G}(i1, i2) / (p_{u+εh}^F(i1) p_{u+εh}^G(i2)) ] di1 di2 |_{ε=0}
= ∫ [ 1 + log ( p_u^{F,G}(i1, i2) / (p_u^F(i1) p_u^G(i2)) ) ] ∂p_{u+εh}^{F,G}(i1, i2)/∂ε |_{ε=0} di1 di2   (6)

The joint intensity probability is constructed from the domain of overlap V of both images (with volume V), using the Parzen windowing kernel ψ(i1, i2):

p_u^{F,G}(i1, i2) = (1/V) ∫_V ψ(i1 − F(x − u), i2 − G(x)) dx   (7)

Inserting (7) in (6) and rearranging as in [5] yields

∂I(u + εh)/∂ε |_{ε=0} = (1/V) ∫_V [ ∂L_u/∂i1 ⋆ ψ ] (F(x − u), G(x)) ∇F(x − u) · h(x) dx   (8)

with

L_u(i1, i2) = 1 + log [ p_u^{F,G}(i1, i2) / (p_u^F(i1) p_u^G(i2)) ]   (9)
Fig. 1. Left: T1 MPRAGE patient image; Middle: CSF segmented using standard prior; Right: CSF segmented after non-rigid matching of the atlas.

Table 1. Root mean square error ΔT in millimeter between ground truth and recovered deformation fields within the brain region for different multimodal image combinations of BrainWeb simulated MR brain images at different noise levels.

Case | T1/T1: 0%, 3%, 7%   | T1/T2: 0%, 3%, 7%   | T1/PD: 3%
1    | 0.384  0.430  0.465 | 0.577  0.759  0.685 | 0.723
2    | 0.304  0.398  0.433 | 0.443  0.640  0.649 | 0.661
3    | 0.351  0.411  0.459 | 0.505  0.753  0.775 | 0.772
We therefore define the force field F at x to be equal to the gradient of I with respect to u(x), such that F drives the deformation to maximize I:

F(x, u) = ∇_u I = (1/V) [ ∂L_u/∂i1 ⋆ ψ ] (F(x − u), G(x)) ∇F(x − u)   (10)

2.3 Implementation Issues
The method was implemented in Matlab, with the image resampling and histogram computation coded in C. The histogram was computed using 128 bins for both template and target images. Parzen windowing was performed by convolution of the joint histogram with a 2D Gaussian kernel. The maximal displacement at each iteration Δu was set to 0.3 voxels and regridding was performed when the Jacobian became smaller than 0.5. Iterations were continued as long as mutual information I(u) increased, with a maximum of 75 iterations. A multiresolution optimization strategy was adopted by smoothing and downsampling the images at 3 different levels of resolution, starting the process at the coarsest level and gradually increasing resolution as the method converged. Computation time for matching two images of size 128 × 128 × 80 is about 50 minutes.
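The histogram-based estimate described above can be sketched as follows: a 128-bin joint histogram, Parzen smoothing by convolution with a 2D Gaussian, and mutual information computed from the smoothed joint probability. This is an illustrative NumPy version (the kernel width here is an assumed value), not the authors' Matlab/C code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mutual_information(img1, img2, bins=128, parzen_sigma=2.0):
    """MI from a Parzen-smoothed joint histogram (128 bins as in the text)."""
    h, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    h = gaussian_filter(h, parzen_sigma)      # Parzen windowing: 2D Gaussian
    p = h / h.sum()                           # joint probability p(i1, i2)
    p1 = p.sum(axis=1, keepdims=True)         # marginal of the template
    p2 = p.sum(axis=0, keepdims=True)         # marginal of the target
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (p1 * p2)[nz])).sum())

rng = np.random.default_rng(1)
a = rng.random((32, 32, 32))
b = rng.random((32, 32, 32))
# identical images share far more information than independent ones
print(mutual_information(a, a) > mutual_information(a, b))  # True
```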
Table 2. Overlap coefficient for different tissue classes of tissue maps obtained with ground truth and recovered deformation fields for different multimodal image combinations of BrainWeb simulated MR brain images. Noise level was 3% in each case.

Case | T1/T1: WM, GM, CSF    | T1/T2: WM, GM, CSF    | T1/PD: WM, GM, CSF
1    | 0.9282 0.9179 0.8698  | 0.8579 0.8320 0.7645  | 0.8604 0.8454 0.7844
2    | 0.9253 0.9277 0.8969  | 0.8595 0.8463 0.7839  | 0.8564 0.8373 0.7818
3    | 0.9279 0.9260 0.8795  | 0.8460 0.8270 0.7579  | 0.8552 0.8028 0.7413

3 Experiments
The method was validated on simulated images generated by the BrainWeb MR simulator [1] with different noise levels. In all experiments the images were non-rigidly deformed by known deformation fields T ∗ . These were generated by using our method to match the T1 weighted BrainWeb image to real T1 weighted images of 3 periventricular leukomalacia patients, typically showing enlarged ventricles. We evaluate how well the recovered deformation T , obtained by matching the original T1 weighted BrainWeb image to the T1, T2 or proton density (PD) weighted images deformed by T ∗ , resembles the ground truth T ∗ . Both deformations were compared by their root mean square (RMS) error ∆T evaluated in millimeter over all brain voxels B:
ΔT = sqrt( (1/N_B) Σ_{x∈B} |T(x) − T^*(x)|² )   (11)
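Equation (11) translates directly into code; a small illustrative helper (names assumed), taking the deformation fields as 3-component voxel arrays and a boolean brain mask:

```python
import numpy as np

def rms_deformation_error(T, T_star, brain_mask):
    """Eq. (11): RMS of |T(x) - T*(x)| over the brain voxels B."""
    diff = np.linalg.norm(T - T_star, axis=0)   # |T(x) - T*(x)| per voxel
    return float(np.sqrt((diff[brain_mask] ** 2).mean()))

# sanity check: a constant 1 mm offset along x gives an RMS error of 1 mm
T = np.zeros((3, 4, 4, 4))
T_star = T.copy()
T_star[0] += 1.0
mask = np.ones((4, 4, 4), dtype=bool)
print(rms_deformation_error(T, T_star, mask))  # 1.0
```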
We also verified the impact of possible registration errors on atlas-based segmentation by comparing the (hard classified) tissue maps M and M^*, obtained by deforming the tissue maps of the original image using T and T^* respectively. We measure the difference between M and M^* by their overlap coefficient O_j(M, M^*) for 3 tissue types j: white matter (WM), grey matter (GM) and cerebro-spinal fluid (CSF):

O_j(M, M^*) = 2 V_j(M, M^*) / ( V_j(M) + V_j(M^*) )   (12)
with Vj (M, M ∗ ) the volume of the voxels that are assigned to class j in both maps and Vj (M ) and Vj (M ∗ ) the volume of the voxels assigned to class j in each map separately. Figure 1 shows the registration result of the BrainWeb T1 image to one of the patient images and the segmentation of CSF obtained using the method of [9] with affine and with our non-rigid atlas registration procedure. Note how the segmentation of the enlarged ventricles is much improved by using non-rigid atlas warping. Table 1 shows the RMS error ∆T computed for T1 to T1, T2 and PD registration of the BrainWeb images at different noise levels (each time identical for
Fig. 2. Left: Original BrainWeb T1 template; right: BrainWeb target image obtained by applying a known deformation; middle: template matched to target. Top: T1/T1 registration; middle: T1/T2; bottom: T1/PD.
object and target images), for 3 different ground truth deformations. All values are smaller than one voxel, with the most accurate results being obtained for T1/T1-matching. The overlap coefficients for WM, GM and CSF in the ground truth and recovered tissue maps are tabulated in table 2. The results are visualized in figure 2 and figure 3.
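The overlap coefficients reported above follow (12); for one tissue class it is the familiar Dice-style measure on two hard-classified label maps. A small illustrative helper (not from the paper):

```python
import numpy as np

def overlap_coefficient(M, M_star, label):
    """Eq. (12): overlap coefficient for one tissue class j."""
    a = (M == label)
    b = (M_star == label)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

# toy label maps: class 1 covers 2 voxels in each map, 1 voxel is shared
M      = np.array([1, 1, 0, 2])
M_star = np.array([1, 0, 1, 2])
print(overlap_coefficient(M, M_star, 1))  # 2*1/(2+2) = 0.5
```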
4 Discussion
We present an algorithm for non-rigid multimodal image registration using a viscous fluid model by defining a force field that drives the deformation such that mutual information of corresponding voxel intensities is maximized. Our method is in fact the merger of the mutual information based registration functional presented in [5] with the viscous fluid regularization scheme of [3]. The joint intensity probability of the images to be matched is estimated using Parzen windowing and is differentiable with respect to the deformation field. The size of the Parzen windowing kernel needs to be properly chosen such that the criterion is a more or less smooth function of the deformation field. This choice is related to the image noise. For all experiments described above, the same kernel was used, independently of the multispectral nature of the images. In the current implementation, the extent of the Parzen estimator is automatically computed using a leave-k-out cross-validation technique that maximizes an empirical likelihood of the marginal densities [10,11]. The impact of the Parzen windowing kernel on the registration process needs further investigation.
Fig. 3. Misclassified WM (left), GM (middle) and CSF (right) voxels of recovered vs ground truth deformation using the results in figure 2. Top: T1/T1 registration; middle: T1/T2; bottom: T1/PD.
Another relevant implementation parameter is the time step Δt, or the maximal displacement Δu allowed at each iteration, which is specified to update the displacements after solving the Navier-Stokes equation. Selecting a larger value for Δt will result in larger displacement steps and a more frequent regridding of the template, as the Jacobian of the transformation is more likely to become non-positive. A smaller value of Δt, on the other hand, implies a larger number of iterations for convergence. More experiments are needed to properly tune this parameter. We validated our algorithm using simulated T1, T2 and PD images from BrainWeb with different noise levels and different realistic ground truth deformations generated by registration of the simulated image with real patient images. Although the RMS error was found to be subvoxel in all cases, T1/T1 registration gave more accurate results than T1/T2 or T1/PD registration. The contrast between gray and white matter, especially, is much better in T1 than in T2 or PD, and the algorithm succeeds better at recovering the interface between both tissues in T1 than in T2 or PD. We also compared T1-to-T2 versus T2-to-T1 registration and found that somewhat better results are obtained using T1 as the template image. This can be explained by the fact that the forces driving the registration depend on the gradient of the template image, which is better defined in T1 than in T2 at the interface between white and gray matter.
5 Conclusions
We have presented a multimodal free-form registration algorithm based on maximization of mutual information that models the images as a viscous fluid. The forces deforming the images are defined as the gradient of mutual information with respect to the deformation field, using Parzen windowing to estimate the joint intensity probability. We have validated our method for matching simulated T1-T1, T1-T2 and T1-PD images, showing that the method performs quite well in both mono- and multimodal conditions. Future work includes the introduction of more spatial information and more specific intensity models into the similarity criterion in order to make the registration more robust.
References

1. BrainWeb. Available at http://www.bic.mni.mcgill.ca/brainweb/.
2. M. Bro-Nielsen, C. Gramkow. Fast Fluid Registration of Medical Images. Proc. Visualization in Biomedical Computing (VBC’96), Lecture Notes in Computer Science, vol. 1131, pp. 267-276, Springer, 1996.
3. G.E. Christensen, R.D. Rabbitt, M.I. Miller. Deformable Templates Using Large Deformation Kinematics. IEEE Trans. Image Processing, 5(10):1435–1447, 1996.
4. T. Gaens, F. Maes, D. Vandermeulen, P. Suetens. Non-rigid multimodal image registration using mutual information. Proc. Medical Image Computing and Computer-Assisted Intervention (MICCAI’98), Lecture Notes in Computer Science, vol. 1496, pp. 1099-1106, Springer, 1998.
5. G. Hermosillo, C. Chef d’Hotel, O. Faugeras. A Variational Approach to Multi-Modal Image Matching. INRIA Technical Report No. 4117, February 2001.
6. B. Likar, F. Pernus. A hierarchical approach to elastic registration based on mutual information. Image and Vision Computing, 19:33-44, 2000.
7. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, P. Suetens. Multimodality image registration by maximization of mutual information. IEEE Trans. Medical Imaging, 16(4):187–198, 1997.
8. D. Rueckert, L.I. Sonoda, C. Hayes, D.L.G. Hill, M.O. Leach, D.J. Hawkes. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans. Medical Imaging, 18(8):712–721, 1999.
9. K. Van Leemput, F. Maes, D. Vandermeulen, P. Suetens. Automated model-based tissue classification of MR images of the brain. IEEE Trans. Medical Imaging, 18(10):897–908, 1999.
10. G. Hermosillo Valadez. Variational methods for multimodal image matching. Doctoral Thesis, Université de Nice - Sophia Antipolis, pp. 138-141, 3 May 2002.
11. B.A. Turlach. Bandwidth selection in kernel density estimation: a review. Discussion Paper 9317, Institut de Statistique, UCL, Louvain-la-Neuve, 1993.
Non-rigid Registration with Use of Hardware-Based 3D Bézier Functions

Grzegorz Soza¹, Michael Bauer¹, Peter Hastreiter¹'², Christopher Nimsky², and Günther Greiner¹

¹ Computer Graphics Group, University of Erlangen-Nuremberg, Am Weichselgarten 9, 91058 Erlangen, Germany
[email protected]
² Neurocenter, Department of Neurosurgery, University of Erlangen-Nuremberg
Abstract. In this paper we introduce a new method for non-rigid voxel-based registration. In many medical applications an alignment between two image datasets has to be established. Often a registration of a time-shifted medical image sequence exhibiting soft tissue deformation (e.g. pre- and intraoperative data) has to be conducted. Soft tissue deformations are usually highly non-linear. To handle this phenomenon and to obtain an optimal non-linear alignment of the respective datasets, we transform one of them using 3D Bézier functions, which provide inherent smoothness as well as elasticity. Finding the optimal transformation requires many evaluations of this Bézier function; to make the method efficient, graphics hardware is extensively used. We applied our non-rigid algorithm successfully to MR brain images in several clinical cases and showed its value.
1 Introduction
Non-rigid registration and elastic warping of medical images have been addressed in numerous works. Bajcsy et al. [1] were the first to demonstrate non-rigid registration of medical images. Generally, registration algorithms can be categorized into several different groups. The first group consists of pure voxel-based algorithms, where the computations analyze only the voxel grey-value information contained in the image datasets, usually according to special similarity measures such as mutual information [14, 4], without any assumptions about the external factors causing the deformation. Optical flow [7] and viscous fluid approaches [3] form another group. Further, there are physically-based methods, where the deformation of the soft tissue is described by physically motivated, mostly differential equations that are discretized on a 3D grid and then approximately solved using finite element methods [5, 9]. In order to validate any of these registration algorithms, a precise quantification of the occurring deformation is necessary [8]. The method we introduce is a novel voxel-based approach that combines elements of geometric transformations with computations done in graphics hardware in order to reduce computation time. Interpolation features of the hardware

T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 549–556, 2002. © Springer-Verlag Berlin Heidelberg 2002
are used in a special manner for the approximation of the deformation function with 3D piecewise linear patches. Our own experience with rigid [6] and non-rigid registration [13] and, generally, with the application of graphics hardware [11] was extended in this work to allow for the computation of Free-Form Deformations (FFD) [2, 12]. An optimal solution is searched for in the space of Bézier functions, as they are flexible and have some inherent elasticity, which makes them suitable for describing deformation of the brain tissue. The paper is divided into 4 sections. An introduction to the theory of Bézier transformations is given in Section 2. Subsequently, a general Free-Form Deformation approach and our hardware-based modification of the method are described. In Subsection 2.3 the operations done in graphics hardware are explained. At the end of Section 2 the non-linear registration algorithm based on the modified FFD is described. In order to evaluate our registration algorithm we applied it to pre- and intraoperative MR images of the brain and summarize the results in Sections 3 and 4.
2 Method

2.1 Mathematical Background
Registration of medical images can be treated as a deformation of one of them such that the deformed image aligns with the reference image. We deform medical data using Free-Form Deformation (FFD). The idea is to warp the space surrounding an object, which is then warped implicitly. For the purpose of deforming the space we take three-dimensional Bézier functions, as they provide a mechanism for their modification and behave intuitively under such changes. This kind of Free-Form Deformation contains inherent elasticity as well, which makes it a good choice for describing the movements of the soft tissue. Let us consider the object space OS parameterized with the function P : PS → OS leading from the parameter space PS, being [0, 1]³, into this object space. The object space is associated with one of the datasets, which will be transformed; the second dataset remains fixed. Let us introduce the deformation function D : PS → TS leading from the parameter space into the texture space TS (defined in the next section). We assume D is a Bézier function, thus the shape of this deformation function is uniquely defined by the corresponding lattice of control points b_{i,j,k} (i = 0, ..., l, j = 0, ..., m, k = 0, ..., n) placed in the texture space. This deformation function can then be expressed as a trivariate tensor product:

D(s, t, u) = Σ_{i=0}^{l} Σ_{j=0}^{m} Σ_{k=0}^{n} B_i^l(s) B_j^m(t) B_k^n(u) b_{i,j,k}   (1)
Movements of the control points b_{i,j,k} in the lattice are followed by immediate changes in the form of the deformation function D. The basis functions B_i^l, B_j^m, B_k^n are Bernstein polynomials of order l, m and n, respectively.
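Evaluating the trivariate tensor product (1) is direct; the NumPy sketch below is purely illustrative (a CPU stand-in for what the paper later maps to graphics hardware), with a helper for the Bernstein basis:

```python
import numpy as np
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_i^n(t)."""
    return comb(n, i) * t**i * (1 - t)**(n - i)

def bezier_deform(ctrl, s, t, u):
    """Eq. (1): trivariate tensor-product Bezier function D(s, t, u).
    ctrl has shape (l+1, m+1, n+1, 3) -- the lattice of control points."""
    l, m, n = ctrl.shape[0] - 1, ctrl.shape[1] - 1, ctrl.shape[2] - 1
    Bs = np.array([bernstein(l, i, s) for i in range(l + 1)])
    Bt = np.array([bernstein(m, j, t) for j in range(m + 1)])
    Bu = np.array([bernstein(n, k, u) for k in range(n + 1)])
    return np.einsum('i,j,k,ijkc->c', Bs, Bt, Bu, ctrl)

# an undisplaced uniform lattice reproduces the identity (linear precision
# of the Bernstein basis), so D(0.5, 0.5, 0.5) = (0.5, 0.5, 0.5)
g = np.linspace(0.0, 1.0, 3)
ctrl = np.stack(np.meshgrid(g, g, g, indexing='ij'), axis=-1)
print(bezier_deform(ctrl, 0.5, 0.5, 0.5))  # [0.5 0.5 0.5]
```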
2.2 Free-Form Deformation (FFD)
In order to accelerate the FFD we make extensive use of graphics hardware. In the first step the image volume from the object space is recomputed and loaded into the 3D texture memory of the graphics card, since we want to do the most expensive computations in the texture processing unit of the graphics hardware. This image data is loaded into the texture memory only once, at the beginning. In order to perform texture mapping, the texture space TS, being [0, 1]³, is associated with the texture memory. A single FFD of an object is divided into three steps. In the first step the object is embedded in the initial lattice of control points (the lattice lies in the texture space; the object lies physically in texture memory and, in a logical sense, in the object space). In the next step control points are moved to their new locations in the texture space, thereby changing the control lattice. The movement is denoted by the function M : TS → TS:

M(b_x, b_y, b_z) = (t_x, t_y, t_z)   (2)
In our approach we do not consider the absolute coordinates of the control points. We consider the free parameters of our Bézier transformation to be offset vectors (t_x, t_y, t_z) from the initial control point positions (b_x, b_y, b_z). Such a treatment allows us to view the occurring deformation as a change of a vector field that deforms an object placed within it, which is closer to the physical nature of the phenomenon. Initially the vector field is everywhere 0. For technical reasons the vector field is kept 0 at the border during the whole registration process, so only the inner control points of the Bézier function are free for optimization. This is also well motivated in practice, because usually no deformation occurs at the boundary of a 3D image volume, as the interesting information is contained in the interior of medical images. After executing these steps, in classical FFD approaches [12] the new positions of every object point are explicitly calculated, based on the new locations of the control points. Instead, in our algorithm texture coordinates are computed with the function D only for some uniform discrete sparse grid of points in the parameter space:

D(s, t, u) = Σ_{i=0}^{l} Σ_{j=0}^{m} Σ_{k=0}^{n} B_i^l(s) B_j^m(t) B_k^n(u) (b_{i,j,k} + M(b_{i,j,k}))   (3)
It should be mentioned that this grid can be denser than the control lattice, in order to get closer to the shape of the original 3D function. Having these texture coordinates on the sparse grid, we use them to approximate the 3D Bézier function with piecewise linear 3D patches. The motivation for such an approach is to enable the use of graphics hardware to speed up the time-consuming computations. An analogous approximation (in 2D only) is shown in Figure 1.
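The sparse-grid sampling of D followed by piecewise linear propagation can be emulated in software. The sketch below uses SciPy's linear grid interpolator in place of the 3D texture unit; the grid sizes are assumed values for the example, and D is passed in as a callable:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def piecewise_linear_approx(D, sparse=9, full=33):
    """Evaluate the deformation D exactly only on a sparse grid, then
    propagate it to the full grid by trilinear interpolation -- the role
    played by the 3D texture hardware in the paper."""
    g = np.linspace(0.0, 1.0, sparse)
    S, T, U = np.meshgrid(g, g, g, indexing='ij')
    coarse = D(S, T, U)                                  # (sparse,)*3 + (3,)
    interp = RegularGridInterpolator((g, g, g), coarse)  # linear by default
    gf = np.linspace(0.0, 1.0, full)
    pts = np.stack(np.meshgrid(gf, gf, gf, indexing='ij'), -1).reshape(-1, 3)
    return interp(pts).reshape(full, full, full, 3)

# a linear (identity) D is reproduced exactly by piecewise linear patches
approx = piecewise_linear_approx(lambda s, t, u: np.stack([s, t, u], -1))
print(approx.shape)  # (33, 33, 33, 3)
```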
Fig. 1. Subdivision of a slice into 2D piecewise linear patches. The Bézier function is defined over a 3 × 3 lattice. Control point b_{1,1} was moved from its initial position (0.5, 0.5) to (0.1, 0.1), which resulted in D(0.5, 0.5) = (0.4, 0.4). a) Values from the image of the function D on a uniform discrete 3 × 3 grid. b) Resulting 2D piecewise linear subdivision of the slice. c) Values from the image of the function D on a uniform 5 × 5 grid. d) Piecewise linear subdivision of the slice based on the values from c)
2.3 3D Bézier Function and 3D Textures
Based on the values of the function D on this sparse grid, the deformation is then propagated to the whole volume using trilinear interpolation. To accelerate this operation, the texture processing operations of graphics hardware are used. With this approach less computational time is needed, as we do not need to process the whole 3D image voxel by voxel in software in order to obtain the new positions of the object points. For this purpose the uniform grid in the parameter space is sliced with planes parallel to one of the main axes in this space. The intersection points of the grid with the planes create a uniform quadrilateral structure in each slice. The number of resulting slices is equal to the resolution of the image volume in the direction perpendicular to the slices. For each slice a corresponding deformed slice in the texture space is computed (see Figure 2). Such a deformed slice consists of non-planar quadrilaterals whose vertices are defined by texture coordinates linearly interpolated from the values of the function D on the sparse grid. For the purpose of rendering, the deformed texture coordinates are then assigned to their corresponding vertices in the parameter space. In order to avoid artifacts caused by an incorrect automatic triangulation in OpenGL, the quadrilaterals are explicitly triangulated. Polygons are then rendered into the frame buffer. These polygons are texture mapped according to the computed texture coordinates and the corresponding image information obtained after trilinear interpolation in the graphics subsystem.

2.4 Registration with Bézier Functions
As an initial estimate for the non-rigid registration, rigidly registered datasets are taken [6]. After the rigid registration, which is also accelerated with graphics hardware, we consider one of the datasets, load it into the texture memory and embed it in a lattice of control points, thereby creating a structure allowing intuitive deformation of this image data. Initially the lattice has the form of a uniform
Fig. 2. Explicitly triangulated slice from the parameter space with corresponding texture coordinates: a) initially and b) after transformation of the texture coordinates
parallelepiped. The main idea of the non-rigid registration is to manipulate free control points in the lattice in such a way that the volume deformed with FFD (as described in Section 2.2) aligns with the reference volume. This makes the registration tantamount to a multidimensional optimization problem. The quality of the alignment is assessed based on mutual information. For the purpose of optimization Powell’s direction set method [10] is used. As we consider the occurring deformation as a deformation of a vector field, the degrees of freedom during the optimization are the translation vectors from the initial positions of the inner control points in the lattice. The control points on the lattice boundary remain fixed during the registration. In each optimization step a coordinate in one dimension of only one control point is changed, and the new volume obtained with FFD is computed. The procedure continues until the similarity measure computed between the deformed volume and the reference dataset reaches its optimum within a desired precision.
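The optimization loop described above can be sketched as follows. This is a schematic CPU version using SciPy's Powell optimizer over the inner control-point offsets; the FFD warp and the similarity measure (mutual information in the paper) are passed in as callables, and all names here are hypothetical, not the authors' code:

```python
import numpy as np
from scipy.optimize import minimize

def register(moving, fixed, similarity, deform, lattice=(5, 5, 5)):
    """Optimize the inner control-point offsets with Powell's method so
    that similarity(deform(moving, offsets), fixed) is maximal."""
    offsets = np.zeros(lattice + (3,))
    inner = np.zeros(lattice, dtype=bool)
    inner[1:-1, 1:-1, 1:-1] = True     # boundary control points stay fixed

    def cost(x):
        offsets[inner] = x.reshape(-1, 3)
        return -similarity(deform(moving, offsets), fixed)  # minimize -sim

    res = minimize(cost, np.zeros(int(inner.sum()) * 3), method='Powell')
    offsets[inner] = res.x.reshape(-1, 3)
    return offsets

# toy stand-ins: one free control point, a "warp" that just translates,
# and negative SSD as the similarity -- the optimum recovers the shift
toy_deform = lambda img, off: img + off[1, 1, 1]
toy_sim = lambda a, b: -float(np.sum((a - b) ** 2))
off = register(np.zeros(3), np.array([0.2, -0.1, 0.3]), toy_sim,
               toy_deform, lattice=(3, 3, 3))
print(np.round(off[1, 1, 1], 3))
```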
3 Results
We validated the algorithm in 7 clinical cases of patients with brain tumors. The experiments were carried out with pairs of T1-weighted MR scans of the head acquired before and during open-skull surgery at the Department of Neurosurgery of the University of Erlangen-Nuremberg. All scans were done with a Siemens Magnetom Open 0.2 Tesla scanner at a resolution of 256 × 256 × 112 voxels and a voxel size of 0.97 mm × 0.97 mm × 1.5 mm. Note the difference between the pre- and intraoperative MR images, although the same pulse sequence was applied for both. This is due to a special coil used for taking the intraoperative images and to artifacts resulting from the operating environment. In all cases a significant brain shift effect had occurred. This phenomenon is
influenced by a variety of factors, like gravity and leakage of cerebrospinal fluid. The effect was compensated for with our non-linear registration procedure. Each pair of datasets was first registered rigidly and then registered non-linearly with our method, as already described. The experiments with the non-rigid registration method were conducted with a control lattice of 5 × 5 × 5 control points. However, in order to better approximate the corresponding Bézier function, the function was sampled on a denser grid of 9 × 9 × 9 points. This divided the object space uniformly into parallelepipeds of 3.12 cm × 3.12 cm × 2.40 cm. To accelerate the execution time we also experimented with downsampled data. The downsampling was done completely in hardware, so its computational cost was very low. The results so obtained were almost identical to those with the original data; however, a significant acceleration was achieved. An average non-linear registration took between 6 and 7 minutes per dataset. We would like to mention that no expensive or special hardware was needed for the computations. All computations can be done on a PC equipped with one of the commonly available graphics cards that support 3D texture mapping. Our experiments were executed on a PC with an AMD Athlon 1.2 GHz processor and a GeForce3 64MB graphics card. After the datasets had been successfully registered, the results were inspected visually by neurosurgeons. A good quality of the registration was observed, above all in the region of the brain surface (cortex), as presented in Figure 3. However, in the vicinity of the ventricles some small artifacts were seen. This could be explained by the quite sparse lattice of control points used for the registration (5 × 5 × 5), and could be compensated for with a denser lattice of control points.
However, as a trade it would result in a higher number of free parameters in the optimization and consequently longer computation times. Finally, we did a quantitative assessment of our algorithm to determine more precisely the quality of the method. The decisive evaluation criterion was the maximal extent of the brain shift measured at the cortex. We considered the magnitude of the brain shift (in mm) after a rigid registration only and after deforming the preoperative image with our approach in a non-linear way. The summarized results of the comparison are collected in Table 1. We can see from the table that the registration algorithm could compensate for the brain shift phenomenon with satisfying precision.
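The deformation model used above, a tensor-product Bézier free-form deformation over a 5 × 5 × 5 control lattice, can be sketched in software as follows. This is a simplified CPU version for illustration only; the paper's implementation evaluates the function via 3D texture mapping on the graphics card, and all function and variable names here are our own.

```python
import numpy as np
from math import comb

def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t) on [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier_ffd(point, control_lattice):
    """Deform a point in the unit cube with a tensor-product Bezier FFD.

    point           -- (x, y, z) with coordinates in [0, 1]
    control_lattice -- array of shape (nx, ny, nz, 3) of control points
    """
    nx, ny, nz, _ = control_lattice.shape
    x, y, z = point
    out = np.zeros(3)
    for i in range(nx):
        bi = bernstein(nx - 1, i, x)
        for j in range(ny):
            bj = bernstein(ny - 1, j, y)
            for k in range(nz):
                bk = bernstein(nz - 1, k, z)
                out += bi * bj * bk * control_lattice[i, j, k]
    return out

# A 5 x 5 x 5 "identity" lattice: control points at their regular grid positions.
grid = np.linspace(0.0, 1.0, 5)
lattice = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)

# With an undisplaced lattice the FFD reproduces the input point exactly
# (linear precision of the Bernstein basis); displacing control points
# produces a smooth, global deformation of the volume.
p = bezier_ffd((0.25, 0.5, 0.75), lattice)
```

During optimization, only the control-point positions are free parameters, which is why a 5 × 5 × 5 lattice keeps the search space small while the dense 9 × 9 × 9 sampling grid captures the resulting deformation more accurately.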
4 Conclusion
We presented a novel non-linear registration approach based on free-form deformation. In comparison to traditional approaches, the flexibility of the Bézier transformation is combined with the performance achieved by applying graphics hardware in a special manner. Tests conducted with data from real patients demonstrated the robustness and efficiency of the method. Despite the poor contrast of the intraoperative images, the algorithm correctly matched them to the preoperative data in all cases.
Non-rigid Registration with Use of Hardware-Based 3D Bézier Functions
Fig. 3. Results of rigid and non-linear registration of pre- and intraoperative MR scans of the brain. Top row: axial view; bottom row: sagittal view. a) A preoperative slice image. b) The corresponding rigidly registered slice of the intraoperative image, superimposed with the contours of the brain extracted from the preoperative scan. c) The slice from a) after non-linear deformation. d) The slice of the intraoperative image overlaid with the brain contours from the deformed image.

Table 1. Quantitative results of the experiments

Patient  Age       Diagnosis            Location of        Max shift at the cortex
No       (gender)                       the tumor          rigid       non-linear
1        50 (F)    astrocytoma WHO II   right ventricle    10.97 mm    1.80 mm
2        38 (F)    cavernoma            frontal             9.47 mm    1.74 mm
3        67 (F)    metastasis           left frontal        7.11 mm    1.26 mm
4        59 (F)    glioblastoma         left frontal        6.67 mm    1.28 mm
5        54 (M)    glioblastoma         left temporal      10.53 mm    2.13 mm
6        53 (F)    metastasis           left frontal        7.26 mm    1.88 mm
7        50 (F)    metastasis           left frontal        8.02 mm    1.97 mm
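Averaged over the seven cases in Table 1, the maximal cortical shift drops from about 8.6 mm after rigid registration to about 1.7 mm after the non-linear step, a reduction of roughly 80%. This follows directly from the listed values:

```python
# Maximal brain shift at the cortex, taken from Table 1
rigid = [10.97, 9.47, 7.11, 6.67, 10.53, 7.26, 8.02]    # mm, rigid registration only
nonlinear = [1.80, 1.74, 1.26, 1.28, 2.13, 1.88, 1.97]  # mm, after non-linear registration

mean_rigid = sum(rigid) / len(rigid)            # about 8.58 mm
mean_nonlinear = sum(nonlinear) / len(nonlinear)  # about 1.72 mm
reduction = 100.0 * (1.0 - mean_nonlinear / mean_rigid)  # about 80 %
```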
For our experiments we segmented the brains in a semi-automatic way, since only the soft tissue undergoes deformation while the skull remains rigid. This preprocessing incurs some time cost, which is certainly a limitation of the procedure. However, numerous methods exist that allow a completely automatic segmentation of the brain. In this work we concentrated exclusively on the registration method, which in the presented approach is completely automatic and computationally very efficient. Moreover, the method is generally very flexible and, since it is based on a statistical similarity measure, namely mutual information, it can be used to register images of different modalities. It has already been used to carry out successful registration experiments with CT and MR data.
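A histogram-based estimate of mutual information, the statistical similarity measure underlying the registration, can be sketched as follows. This is a minimal NumPy version for illustration; the function name and parameters are our own and are independent of the paper's hardware-accelerated implementation.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=64):
    """Estimate mutual information between two images from their joint histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()        # joint intensity probability
    px = pxy.sum(axis=1)             # marginal distribution of img_a
    py = pxy.sum(axis=0)             # marginal distribution of img_b
    nz = pxy > 0                     # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))

# An image compared with itself has maximal MI (equal to its entropy);
# comparing it with independent noise yields a much lower value.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
noise = rng.integers(0, 256, size=(64, 64)).astype(float)
```

Because mutual information depends only on the statistical relationship between intensity distributions, not on their absolute values, it remains applicable when the two images come from different modalities, such as CT and MR.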
Acknowledgments. We gratefully acknowledge the help of Joel Heersink in proofreading this paper. This work was funded by the Deutsche Forschungsgemeinschaft in the context of the project Gr 796/2-1.