DATA HANDLING IN SCIENCE AND TECHNOLOGY - VOLUME 17
Data analysis for hyphenated techniques
DATA HANDLING IN SCIENCE AND TECHNOLOGY Advisory Editors: B.G.M. Vandeginste and S.C. Rutan
Other volumes in this series:
Volume 1 Microprocessor Programming and Applications for Scientists and Engineers by R.R. Smardzewski
Volume 2 Chemometrics: A Textbook by D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte and L. Kaufman
Volume 3 Experimental Design: A Chemometric Approach by S.N. Deming and S.L. Morgan
Volume 4 Advanced Scientific Computing in BASIC with Applications in Chemistry, Biology and Pharmacology by P. Valkó and S. Vajda
Volume 5 PCs for Chemists, edited by J. Zupan
Volume 6 Scientific Computing and Automation (Europe) 1990, Proceedings of the Scientific Computing and Automation (Europe) Conference, 12-15 June, 1990, Maastricht, The Netherlands, edited by E.J. Karjalainen
Volume 7 Receptor Modeling for Air Quality Management, edited by P.K. Hopke
Volume 8 Design and Optimization in Organic Synthesis by R. Carlson
Volume 9 Multivariate Pattern Recognition in Chemometrics, illustrated by case studies, edited by R.G. Brereton
Volume 10 Sampling of Heterogeneous and Dynamic Material Systems: theories of heterogeneity, sampling and homogenizing by P.M. Gy
Volume 11 Experimental Design: A Chemometric Approach (Second, Revised and Expanded Edition) by S.N. Deming and S.L. Morgan
Volume 12 Methods for Experimental Design: principles and applications for physicists and chemists by J.L. Goupy
Volume 13 Intelligent Software for Chemical Analysis, edited by L.M.C. Buydens and P.J. Schoenmakers
Volume 14 The Data Analysis Handbook, by I.E. Frank and R. Todeschini
Volume 15 Adaption of Simulated Annealing to Chemical Optimization Problems, edited by J.H. Kalivas
Volume 16 Multivariate Analysis of Data in Sensory Science, edited by T. Naes and E. Risvik
Volume 17 Data Analysis for Hyphenated Techniques, by E.J. Karjalainen and U.P. Karjalainen
E.J. KARJALAINEN and U.P. KARJALAINEN
Department of Clinical Chemistry, University of Helsinki, 02290 Helsinki, Finland

1996
ELSEVIER
Amsterdam - Lausanne - New York - Oxford - Shannon - Tokyo
ELSEVIER SCIENCE B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands
ISBN 0-444-82237-2
© 1996 Elsevier Science B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.
Special regulations for readers in the USA: This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the USA. All other copyright questions, including photocopying outside of the USA, should be referred to the copyright owner, Elsevier Science B.V., unless otherwise specified.
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
This book is printed on acid-free paper. Printed in The Netherlands
MATLAB® is a registered trademark of The MathWorks, Inc. Macintosh® is a registered trademark of Apple Computer, Inc. Acrobat™ and Photoshop™ are trademarks of Adobe Systems, Inc.
Table of Contents

Preface
Summary of the book
    MATLAB programs • Companion CD-ROM • Preprocessing and data compression • Alternating regression • Validating the solution • Application of OSCAR

PART 1: Analysis of hyphenated data

1 Analysis of overlapping spectra
    1.1 Data overload
    1.2 Strategies to deal with data
    1.3 Dealing with the full information
    1.4 The spectral overlap
    1.5 Mixture spectroscopy in hyphenated instruments
    1.6 The advantages of hyphenated data analysis
2 Data preprocessing
    2.1 A MATLAB example
    2.2 Handling noise
    2.3 Visualization and filtering
    2.4 Types of filters
    2.5 Image processing and hyphenated data
    2.6 Cleaning up data
    2.7 Correction for the sampling delay
3 Compression with principal components
    3.1 Singular value decomposition
    3.2 Performing a svd on GC-MS data
4 Techniques for library searches
    4.1 How do we measure similarity between spectra?
    4.2 A MATLAB program for library searching
    4.3 Speeding up the library search
5 Neighborhood operations on hyphenated data
    5.1 Background subtraction
    5.2 Homogeneity checks
    5.3 Local purity checks by svd
    5.4 Local peeling
    5.5 Sharpening the chromatographic peaks
6 Alternating Regression
    6.1 Deconvolution and AR
    6.2 The OSCAR approach
    6.3 Defining the objective function
    6.4 Finetuning the solution
    6.5 The elution curve constraints
    6.6 The steps in the "core AR" algorithm
    6.7 Repeating the AR iterations
    6.8 Summing up
7 Applying the OSCAR algorithm: A practical example
    7.1 The OSCAR process
    7.2 Taking a first look with MATLAB
    7.3 Preprocessing the data with AutoAR
    7.4 Setting up the constraint space
    7.5 Gathering the AR statistics
    7.6 Inspecting the solution found with AutoAR
    7.7 Plotting the spectra and elution curves
    7.8 Calculating the reproducibility
8 Applications in other spectroscopies
    8.1 Single-dimensional signals and AR
    8.2 Other hyphenated instruments
    8.3 Two-dimensional data with internal continuity
    8.4 Using OSCAR for spectra of discrete samples
9 Validation
    9.1 Statistical validation
    9.2 Experimental validation
10 AR and factor analysis
    10.1 Mapping between spectra and PCA
    10.2 The relative speed of AR calculations
11 Looking ahead
    11.1 The two kinds of constraints
    11.2 The guiding constraints
    11.3 OSCAR as realized by AutoAR
    11.4 The structural constraints

PART 2: Computer programs for hyphenated data

12 The starting point
13 Selecting and preprocessing the raw data
14 Gathering the statistics from the AR experiment
15 AR statistics in two dimensions
16 AR statistics in three dimensions
17 Selecting an optimal parameter set
18 Displaying the spectra and elution curves
19 Calculating the confidence ranges of spectra and elution curves
20 Looking at the confidence ranges of the spectra and elution curves
21 Looking at the measurements
22 Finding help

Appendix
    Program 2.2: savgol1.m
    Program 2.3: savgol2.m
    Program 2.4: savgol3.m
    Program 2.5: outlier.m
    Program 2.6: bellshap.m
    Program 3.1: callgau.m
    Program 3.2: gaussian.m
    Program 4.1: libsear1.m
    Program 4.2: libsear2.m
    Program 5.1: purity.m
    Program 5.3: locpeel.m
    Program 5.4: highpass.m
Index
Preface

When we analyze complex mixtures such as biological samples, the data captured by hyphenated instruments is sufficient to resolve thousands of components. The information is there. If the computer is able to separate the "pure" components hidden in the overlapping chromatographic peaks, significant benefits arise:

• We can lower the detection limit in our methods.
• The runtimes can be shortened, because less separation is necessary.
• We can quantitate partially separated components.
• We can resolve a larger number of compounds than is presently possible.

This book is written for the analyst who wants to use the computer to extract spectra from overlapping observations. The reader can initially use the programs as such. With proficiency in the use of MATLAB® the user can modify the programs for his specific purposes. The programs are not "black boxes"; they are fully documented, readable source code.

The approach to spectral decomposition presented in this monograph does not need libraries for spectra or retention times. The algorithm starts developing the spectra from random numbers. It stays in the positive range of intensities during the solution process. Baseline constraints are gradually tightened until the solutions that are repeatedly found from different random starting points converge to one common point. The approach is called OSCAR ("Optimization by Stepwise Constraining of Alternating Regression").

The OSCAR philosophy is possible because of the speed of modern microcomputers. OSCAR uses alternating regression, AR, as its core algorithm to rapidly solve the spectral decomposition problem hundreds of times. The accumulated statistics give us an overview of the reproducibility of the solution. In this way we can estimate the quality of the solution. OSCAR calculates standard deviations for the spectra and elution curves in the solution.

The OSCAR approach is not limited to hyphenated instruments. With it, it is also possible to estimate the spectra of the underlying components in mixtures. In the book, we present an example in which kidney stones are decomposed into their basic constituents in this way.
Full MATLAB listings are given in the program part of the book. The companion CD-ROM contains the program listings in the book and collections of instrumental data. The CD-ROM is in hybrid format, which is directly readable by PC and Macintosh. There are example runs with GC-MS data and IR spectra of discrete samples. Altogether there are more than 10 000 spectra on the CD-ROM. The software and data can be used under Windows and Macintosh operating systems. The contents should also be readable on UNIX workstations that have a MATLAB interpreter. Students can try out the programs with the low-cost MATLAB student versions available for PC and Macintosh.

The CD-ROM contains QuickTime videos with spoken commentary. They demonstrate step by step the use of the MATLAB programs. The CD-ROM also contains a full manual for the programs. The manual is in Adobe Acrobat™ format, which is readable on PC and Macintosh.

Many persons have influenced the authors over the years. We want to thank Svante Wold for trying to convert us to SIMCA in 1978, and for many fruitful discussions. Pentti Minkkinen deserves thanks for promoting chemometrics in Scandinavia. Veli-Matti Taavitsainen and Heikki Haario initially recommended MATLAB to us. It is always a pleasure to discuss anything from algorithms to multimedia with J.T. Clerc; a new point of view is always certain to emerge. Ray Dessy has always been ready with friendly advice. Finally, we would like to pay tribute to the late Manfred Donike by having a cup of his "Kaffee Donike", hot water poured over ground coffee in a pyrex beaker. He gave us the initial guidance for analyses in doping control. The pressures of doping control analysis led us to develop the programs described and listed in this book.

In Helsinki, 7th November 1995
Erkki and Ulla Karjalainen
Summary of the book

This book has two goals. The first goal is to get the reader started in the analysis of data from hyphenated instruments. For this, the book contains a fully documented MATLAB program that can be used to analyze two-dimensional data matrices. The second goal is to get the reader started in making his or her own numerical experiments with MATLAB. The book provides the user with the concepts that are needed and offers background ideas for data analysis.

The starting point is to master the operation of the analytical instrument. After that, the analyst must interpret the results. Interpretation means that he must understand multivariate methods in statistics and mathematics. Understanding means understanding the ideas, not just the mechanics of running the results through a standard software package. All statistical methods work as a computer program in some programming language. To validate a statistical calculation the program itself should be checked and understood. This means that the analyst must be familiar with programming and some parts of matrix algebra. We try to introduce the necessary background information about mathematics and programming as we work through the examples in the book.

Books about chemometric techniques have been long on theory but short on practice. The illustrated examples have been calculated with a special "black box" program. This book tries to open the "black boxes" for the reader. The second half of the book contains the full source code for a MATLAB program that analyzes hyphenated data. The user can apply the program to any problem involving hyphenated methods. It is also usable for any two-dimensional data from discrete samples.
MATLAB programs

All calculations are made with MATLAB programs that are documented in the book. The reader still needs one piece of commercial software. That is the MATLAB interpreter. MATLAB is a portable programming environment developed by a software company, The MathWorks Inc. in Massachusetts, USA. MATLAB is available for Windows, Macintosh and most workstations based on the UNIX operating system. Many MATLAB programs are available in publications and textbooks [Coleman and Van Loan 1988, Etter 1993, O'Haver 1992]. Some software listings are also available as separate commercial software packages from MathWorks. Public domain listings are available over the Internet from The MathWorks Inc. and universities.

The great benefit of MATLAB is openness. All programs are listings that can be studied and changed by the reader. Many listings in the book are short and can be entered easily by hand. Because the MATLAB programs are portable, they should work on any computer that has a MATLAB interpreter. Memory size limitations on computers may make it necessary to scale down the problems in some cases. There are student versions of MATLAB available for PC and Macintosh that have built-in size limitations regarding matrix dimensions. The newest version of MATLAB for students can handle at most 8,000 elements in each array.

The programs in the book were developed using version 4 of the interpreter. The main additions to the interpreter in version 4 are the sparse matrices and the new elements in the language that support GUI, the graphical user interface. The programming style was influenced by the need to use GUI. The goal has been to minimize the total number of separate script files needed to use even the most demanding applications.
Companion CD-ROM

There is a second problem with a book like this. It concerns access to realistic data sets. A data set for a typical gas chromatographic-mass spectrometric (GC-MS) run is large. It easily needs 10 megabytes of disk space in ASCII format. Data sets for some analytical problems occupy hundreds of megabytes. This amount of data does not fit on floppy disks. The medium for distribution of data files should have a format that is readable by the different computers on the market. CD-ROM is a medium that can be read by all computers having a CD-ROM reader. CD-ROM readers are constantly finding new uses. If you do not have a reader, you can often find one in a library where CD-ROM readers are used for bibliographic databases.

Some readers will want to improve the programs in the book. They could have ideas that produce faster or more accurate programs. The original data sets are available on a companion CD-ROM. The CD-ROM in the ISO 9660 format contains the data files in ASCII format. There are separate folders for the PC, Macintosh and UNIX. The CD-ROM is a so-called hybrid. The PC user sees only his own data and programs. The same is true for the Macintosh users. The CD-ROM can be used on most UNIX machines that read the ISO 9660 format. The data can be read by any computer that has a CD-ROM reader.

We hope that this data set is useful. Common test data sets are necessary for people developing algorithms. Using known data sets as benchmarks, the developers can compare the performance of different approaches more objectively. As a convenience, the MATLAB program listings contained in the book are on the CD-ROM as well.
Preprocessing and data compression

Data preprocessing is a crucial step in the analysis process. Yet this important aspect is not considered by the textbooks. Data should be smoothed by proper filters. Too little smoothing may prevent the convergence of the algorithms; with too much smoothing, essential features are lost. Some spectral features could be emphasized by high-pass filtering before attempting to decompose the data matrix. The important Savitzky-Golay filter is explained in detail [Savitzky and Golay 1964, Steinier et al. 1972, Rutan 1992]. The outliers should be eliminated at the preprocessing stage. We show how data can be inspected for local outliers. Because convolution is so central to all preprocessing, there is a step-by-step discussion of convolution in one and two dimensions.

Data compression is an important tool in the analysis of hyphenated data. It is also used for spectral libraries [Harrington 1988]. The result of a successful data analysis is a more compact representation of the original data matrix. The spectra and concentration profiles need less storage space than the original data. In spectrum decomposition the analysis usually starts by compressing the observation matrix. The first step is principal component analysis or factor analysis. This book uses factor analysis for a different purpose. It is not used as a first approximation to the desired solution. We do not attempt to modify the factors to the "true" solution. The factors are used as a compact substitute for the data matrix to speed up calculations of alternating regression. The factor analysis gives an estimate of the number of components.
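As an illustration of smoothing by convolution, the following is a minimal sketch of a Savitzky-Golay smoother. It is not one of the listings of the book. The window length (5 points), the polynomial order (2) and the test signal are assumptions chosen only for the example. The weights are obtained from a local least-squares polynomial fit and then applied with the standard MATLAB function conv.

```matlab
% Savitzky-Golay smoothing weights from a local least-squares fit.
halfwidth = 2;                      % assumed: window of 2*halfwidth+1 = 5 points
order     = 2;                      % assumed: quadratic polynomial
k = (-halfwidth:halfwidth)';        % positions inside the window
A = zeros(numel(k), order + 1);
for p = 0:order
    A(:, p + 1) = k.^p;             % design matrix of the polynomial fit
end
P = pinv(A);                        % least-squares fit operator
w = P(1, :);                        % row that gives the fitted value at the window centre
disp(w * 35);                       % approximately -3 12 17 12 -3

% Hypothetical test signal: a Gaussian peak with a rapid disturbance added.
x      = linspace(0, 10, 101);
clean  = exp(-0.5 * ((x - 5) / 0.8).^2);
noisy  = clean + 0.05 * sin(37 * x);    % deterministic stand-in for noise
smooth = conv(noisy, w, 'same');        % smoothing is a convolution with the weights
```

A wider window or a lower polynomial order smooths more strongly, at the cost of distorting narrow peaks. The first and last two points of smooth are affected by the zero padding used by conv and should not be trusted.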
Alternating regression

Then we come to the main theme of the book, AR or alternating regression [Karjalainen and Karjalainen 1985, 1991, Karjalainen 1990]. AR did not start from a statistical theory. It arose in the late seventies from a practical need, rapid analyses for doping control. It was published after several years of use. The initial estimates of spectra are random numbers. AR tries to keep the spectra and elution profiles in the positive space during the whole solution process. We derive the AR method in the book in stepwise fashion. The role of the constraints is crucial. Then we start to optimize the algorithm. The ideal would be a program that can perfectly analyze any data set thrown at it. Of course, we are not there yet. We call the current stage in the evolution of the AR the OSCAR algorithm. The name OSCAR is derived from the words "Optimization by Stepwise Constraining of Alternating Regression".

The solution that is optimal gives a good fit and good reproducibility. Constraints are gradually tightened to a point where the two modeling errors can be combined in an optimal way. There is always the danger of using a model with too many degrees of freedom. The fit is better, but the solution is not reproducible. When we are overfitting we get different solutions every time we repeat the solution process. The solutions are linear combinations of some basic components. If we do not constrain the solution with more constraints, we cannot choose among the candidate solutions. The optimum combines two opposing criteria or error types. These two criteria are the fit and the variance of the solution. The two criteria are combined by proper weighting coefficients to derive a combined criterion for the optimal model.
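The alternating idea can be sketched in a few lines. The following is not the OSCAR or AutoAR program of the book; it is a generic alternating least-squares loop under stated assumptions: a noise-free toy data set (all numbers hypothetical), two components, a fixed number of iterations, and positivity enforced simply by setting negative values to zero. The baseline and elution-curve constraints that the book treats as crucial are left out.

```matlab
% Hypothetical toy data: two overlapping components, no noise.
t     = 1:30;                                   % scan index
Ctrue = [exp(-0.5 * ((t - 12) / 3).^2);
         exp(-0.5 * ((t - 17) / 3).^2)];        % 2 x 30 elution profiles
Strue = [1.0 0.1; 0.6 0.0; 0.0 0.9;
         0.2 0.7; 0.0 0.3; 0.5 0.0];            % 6 x 2 pure spectra
X     = Strue * Ctrue;                          % 6 x 30 observation matrix

ncomp = 2;                                      % assumed number of components
S = rand(size(X, 1), ncomp);                    % random positive starting spectra
for iter = 1:200
    C = max(pinv(S) * X, 0);   % regress profiles on current spectra, clip negatives (assumption)
    S = max(X * pinv(C), 0);   % regress spectra on current profiles, clip negatives (assumption)
end
relerr = norm(X - S * C, 'fro') / norm(X, 'fro');   % relative misfit of the model
disp(relerr);
```

Running the sketch several times from different random starts and comparing the resulting S and C gives a feeling for the reproducibility question raised above: with positivity as the only constraint, a good fit does not guarantee that the same spectra are found every time.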
Validating the solution

We obtain spectra and elution profiles as a solution to the decomposition problem. At the same time we should get the confidence interval of the solution. If the variation is too large, we must obtain new analytical data to get more precise results. We introduce methods for calculating the confidence intervals of spectra and elution profiles. We discuss several statistical approaches for estimating the variance of the solution. Jackknifing, bootstrapping and cross-validation are such computationally intensive methods. We show how to estimate the confidence intervals for spectra and elution curves with OSCAR. The results should make chemical and physical sense. The results of a mathematical analysis are not enough.
Application of OSCAR

It is important to get by with general constraints that are not too specific. The AR framework can be easily extended in several directions. Using diode-array UV-Vis spectrometers it is possible to do titration experiments and analyze the results with a variation of the AR algorithm. The only real difference from the "standard" AR is the nature of the second dimension, where the requirement for unimodality is dropped. The constraining needed to get a unique solution is achieved by adjusting the background values. We illustrate here the broad usefulness of OSCAR in analyzing the constraints needed for different spectroscopies.

Applications for discrete samples are shown by applying OSCAR to a large collection of IR spectra. Analysis of the IR spectra from 300 samples of kidney stones is interesting. The analysis does not use any library of the expected components: OSCAR is completely unguided with respect to the compositions. The components found by OSCAR are the pure chemical components.

The AR point of view applies to problems in single-dimensional chromatography. We show how AR is used for analyzing sets of single-dimensional gas chromatograms. Whole batches of samples are analyzed simultaneously. The benefit of using AR is the better precision and improved knowledge about the exact peak shapes of the compounds. We hope that this book represents a modest beginning in the full documentation of calculation methods.
References

Coleman TF, Van Loan C. Handbook of matrix computations. Philadelphia: SIAM, 1988, 264 pages.
Etter DM. Engineering problem solving with MATLAB®. Englewood Cliffs: Prentice-Hall, 1993, 434 pages.
Harrington PB, Isenhour TL. Application of robust eigenvectors to the compression of infrared spectral libraries. Anal Chem 1988; 60: 2687-2692.
Karjalainen EJ, Karjalainen UP. Mathematical chromatography - Resolution of overlapping spectra in GC/MS. In: Roger FH, Grönroos P, Tervo-Pellikka R, O'Moore R, eds. Medical Informatics Europe 85, Proceedings, Helsinki, Finland, August 25-29, 1985. Berlin - Heidelberg - New York - Tokyo: Springer-Verlag, 1985, pp. 572-578.
Karjalainen EJ. Isolation of pure spectra in GC/MS by mathematical chromatography: Entropy considerations. In: Meuzelaar HLC, ed. Computer-enhanced analytical spectroscopy, vol. 2. New York - London: Plenum Press, 1990, pp. 49-70.
Karjalainen EJ, Karjalainen UP. Component reconstruction in the primary space of spectra and concentrations. Alternating regression and related direct methods. Analytica Chimica Acta 1991; 250: 169-179.
O'Haver TC. Teaching and learning chemometrics with MATLAB. In: Brereton RG, Scott DR, Massart DL et al., eds. Chemometrics Tutorials II. Amsterdam - London - New York - Tokyo: Elsevier, 1992, pp. 1-9.
Rutan SC. Fast on-line digital filtering. In: Brereton RG, Scott DR, Massart DL et al., eds. Chemometrics Tutorials II. Amsterdam - London - New York - Tokyo: Elsevier, 1992, pp. 67-77.
Savitzky A, Golay MJE. Smoothing and differentiation of data by simplified least squares procedures. Anal Chem 1964; 36: 1627-1639.
Steinier J, Termonia Y, Deltour J. Comments on smoothing and differentiation of data by simplified least square procedure. Anal Chem 1972; 44: 1906-1909.
PART 1: Analysis of hyphenated data
Chapter 1

Analysis of overlapping spectra

• Data overload
• Strategies to deal with data
• Dealing with the full information
• The spectral overlap
• Mixture spectroscopy in hyphenated instruments
• The advantages of hyphenated data analysis
1.1 Data overload
The analytical chemist faces an interesting dilemma. Hyphenated instruments produce a wealth of readings in a very short time. If we continuously scan spectra in a modern chromatographic instrument such as a GC-MS or HPLC-UV-Vis, the hard disk of the instrument is rapidly filled. It is impossible to inspect all the spectra produced. The analyst collects a large amount of raw data, but gets little information. The information that the chemist would like to have is not what he gets from the instrument. He would like to get "pure" spectra of the injected components. Instead, he gets spectra that are of mixtures of chemical components (Fig. 1.1). Few spectra, if any, are pure. There is an information famine in the midst of the data glut.
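The statement that each recorded spectrum is a mixture can be made concrete with a small numerical sketch. The matrices below are hypothetical toy values; the names S (pure spectra, one column per component), C (concentration profiles, one row per component) and X (observations) are our own choices for the illustration. The observation matrix is formed as the product of the two, and a single recorded scan is shown to be a weighted sum of the pure spectra.

```matlab
% Hypothetical toy example: 2 components, 5 spectral channels, 8 scans.
S = [1.0 0.0;
     0.5 0.2;
     0.0 1.0;
     0.3 0.6;
     0.1 0.0];                              % pure spectra (channels x components)
t = 1:8;                                    % scan index
C = [exp(-0.5 * ((t - 3.5) / 1.0).^2);
     exp(-0.5 * ((t - 5.0) / 1.0).^2)];     % concentration profiles (components x scans)
X = S * C;                                  % observation matrix (channels x scans)

% The spectrum recorded in scan 4 is a mixture of the two pure spectra,
% weighted by the concentrations present in that scan.
scan4 = X(:, 4);
mix4  = C(1, 4) * S(:, 1) + C(2, 4) * S(:, 2);
disp(max(abs(scan4 - mix4)));               % zero up to rounding
```

In a real measurement only X is recorded; S and C are the unknowns that the analyst would like to recover.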
Figure 1.1 The basic situation for analyzing data from hyphenated instruments. We know only the observation matrix that can be thought to be the product of a "spectra" matrix and a "concentrations" matrix. What we would like to know are these two unknown matrices. [Panel labels: Spectra, Concentrations, Observations (known).]

1.2 Strategies to deal with data
Because the raw data is not what is really wanted, the chemist resorts to several strategies to reduce its amount. The first strategy reduces the amount of recorded data. Only a single spectrum is kept for each chromatographic peak. The spectrum on the top of a chromatographic peak is selected to represent the compound eluted. The analyst limits his analysis to the highest peaks and skips the rest.

Another way to reduce the volume of data is to record only parts of the spectrum. In mass spectroscopy, this strategy is known as mass fragmentography (or single/selective ion monitoring, SIM, or single/selective ion recording, SIR). Here only a handful of selected masses is recorded. The smaller amount of data is not the only reason for fragmentography. The main benefit is that it gives more sensitivity, too.

Another trick used is to subtract backgrounds in recorded spectra. A suitable multiple of a neighboring spectrum is subtracted from the actual spectrum. This produces a simpler spectrum, but the process of background subtraction has its own pitfalls. Still, the various tricks used to reduce the amount of information are not the best solution to the data overload. They do not use the full potential of the instruments.
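Background subtraction as described above can be sketched as follows. All spectra are hypothetical, and the way the "suitable multiple" is chosen here (from one channel assumed to contain background only) is our assumption for the example, not a rule taken from the book. The second half shows one possible pitfall, again as an assumption: if the neighboring scan already contains some of the analyte, part of the analyte signal is subtracted as well.

```matlab
% Hypothetical spectra over 6 channels.
analyte    = [0 5 0 2 0 1]';                     % pure spectrum we hope to recover
background = [4 0 3 1 2 0]';                     % spectrum of the background
peak_scan     = 1.0 * analyte + 0.8 * background;   % scan at the peak top
neighbor_scan = 0.5 * background;                   % neighbor assumed to hold background only

ref = 1;                                         % channel assumed to carry background only
k   = peak_scan(ref) / neighbor_scan(ref);       % the "suitable multiple"
corrected = peak_scan - k * neighbor_scan;
disp(corrected');                                % the analyte spectrum, up to rounding

% Possible pitfall (assumption): the neighbor also contains some analyte.
neighbor2  = 0.5 * background + 0.2 * analyte;
k2         = peak_scan(ref) / neighbor2(ref);
corrected2 = peak_scan - k2 * neighbor2;
disp(corrected2' ./ max(analyte' , eps));        % analyte channels are scaled down to 0.68
```

In the second case the shape of the spectrum survives, but every analyte intensity is reduced to 68 per cent of its true value, so a quantitative result based on it would be too low.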
1.3 Dealing with the full information
The only satisfactory solution to the data overload problem is true data reduction. Instead of mixture spectra, the analytical instruments should output pure spectra and elution profiles for the pure components. The instrument manufacturers react to the marketplace. They produce only instruments specified by the instrument buyers. It is the responsibility of instrument users to develop the ideas for the next generation of instruments. If the users can show the manufacturers that the raw data should be analyzed in a better fashion, a new generation of automatic instruments will appear on the market. The competition between manufacturers will take care of that.

The time is now ripe for the full analysis of data produced by the hyphenated instruments. Modern computers can analyze the raw data at low cost and high speed. The cost of a computation goes down by some 25 per cent every year. We shall soon be in a situation where we cannot afford to leave the data unanalyzed.

We hope that the MATLAB programs documented in this book provide a starting point for the analytical chemist. The mathematical ideas may be difficult to grasp initially. The only way to learn mathematical modeling is by doing it. MATLAB is an ideal tool for learning mathematics and modeling. The MATLAB programs resemble mathematical notation and are pleasantly short. Instead of pages of CALL-statements in FORTRAN programs, only a few lines of MATLAB are necessary for the same effect. MATLAB is available on most computer platforms including PC, Macintosh, and UNIX workstations. For extremely heavy computing needs, even a Cray version exists. MATLAB is used as a teaching device for mathematics and engineering. The chemists have not used it as widely, but it is gaining momentum in chemometrics. MATLAB is useful as a publishing language because the programs are completely documented tools [O'Haver 1989].

Figure 1.2 The specificity of most methods in analytical chemistry is limited. Three different samples can give identical absorbance readings. The first sample, containing only the compound of interest, gives a correct result. The second sample has an interfering substance present that absorbs at the same wavelength as the compound we are interested in. This sample gives a concentration that is too high. The third sample contains another interfering substance that reduces the absorbance of the compound of interest. This sample shows a concentration that is too low. [Axes: absorbance and concentration for samples 1, 2 and 3; legend: compound of interest, interfering compound.]
1.4 The spectral overlap
The basic situation facing the analytical chemist is familiar to anyone who has done analysis. The analyst wants to analyze the nature and the concentrations of the chemical compounds in some real-world materials. The sample could, for example, come from the environment or from a patient. The sample contains thousands of compounds, but only a few are of interest.

The measurement process is a battle against spectral overlap. Our example is about photometric methods, but the same situation is valid for most spectroscopies (Fig. 1.2). The compound of interest reacts with a reagent, forming a colored compound. The color is measured with a photometer. If only the compound of interest is present in the sample, the answer is reliable. The calibration curve prepared with the pure compound is then used to estimate the amount of the unknown.

In reality, the calibration process is flawed. There are many other compounds present in the sample that have an effect on the photometric reading. Some react with the reagent to form related (or identical) colored compounds. Other compounds reduce the amount of color produced in the reaction for the compound of interest. What we obtain in our photometric reading is a sum of several spectral compounds. The true answer remains unknown; what we get is an approximation to the true concentration [Martens and Næs 1989].
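The three cases of the photometric example can be put into numbers. The sketch below assumes a linear calibration through the origin and uses hypothetical values throughout: the slope, the true concentrations, the absorbance added by the first interfering substance and the suppression factor of the second are all invented for the illustration.

```matlab
% Calibration with pure standards (hypothetical values, linear response assumed).
c_std = [0 2 4 6 8]';                 % concentrations of the standards
A_std = 0.1 * c_std;                  % their absorbance readings
slope = c_std \ A_std;                % least-squares slope through the origin

% Three samples that give the same absorbance reading.
true_conc = [5 3 7];
A_obs = [0.1 * 5, ...                 % sample 1: compound of interest only
         0.1 * 3 + 0.2, ...           % sample 2: interferent adds absorbance
         0.1 * 7 * (5 / 7)];          % sample 3: interferent suppresses the color

est = A_obs / slope;                  % concentrations read from the calibration line
err = est - true_conc;                % error of each estimate
disp(est);                            % all three estimates are close to 5
disp(err);                            % close to 0, +2 and -2
```

A single absorbance reading cannot distinguish the three samples. Only additional information, such as the full spectrum, can reveal that samples 2 and 3 are not pure.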
1.5 Mixture spectroscopy in hyphenated instruments
A similar situation is present in most of our measurements. When we use a hyphenated instrument, such as a GC-MS, an HPLC-UV-Vis or a capillary zone electrophoresis-diode-array instrument, spectral overlap is always present. The first dimension in the two-dimensional data matrix is due to the chromatographic process. The second dimension is due to the spectroscopic part of the instrument. The chromatographic separation is seldom complete. The separation power of the chromatographic column is not sufficient to separate all compounds in real-world samples completely. The result is overlap in the recorded spectra. We could equally well call hyphenated instruments instruments for the spectroscopy of mixtures.
Of course, we can never analyze all compounds that are present in the sample. We cannot get enough information from a single instrument run. What we can hope for is to resolve most of the overlapping compounds that are present in sufficient concentrations. This means hundreds of compounds in a single sample that is analyzed on a sensitive instrument. We need a way to continue the separation process inside the computer.
The chromatographic overlap is a fact we cannot overcome experimentally in a general sense. Of course, we can always optimize the situation for a few compounds of interest. We can use a somewhat different packing material in a column or we can run our separation a bit differently to get some molecules better separated. We cannot improve the separation for all compounds simultaneously and get rid of the overlap entirely. It is logical to continue the separation in the next instrument down the analytical chain, the computer. The spectral dimension contains underutilized information. We can use that extra information to compensate for the lack of chromatographic separation. It is a new tradeoff between the two dimensions containing information. We get a separation in the abstract, mathematical domain. This separation is not possible in the physical dimension, but it is possible in the data analysis that follows the data acquisition.
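The two-dimensional data matrix can be pictured with a small synthetic example. The sketch below assumes that each compound contributes the product of its elution profile and its spectrum; the profiles, the spectra and all variable names are invented for illustration. It also assumes, for the last step only, that the pure spectra are known, which is not the case in a real analysis.

```matlab
% Sketch: a hyphenated data matrix as a sum of two overlapping compounds.
% All numbers below are invented for illustration.
t  = (1:40)';                          % scan (time) index
c1 = exp(-(t-18).^2/(2*4^2));          % elution profile of compound 1
c2 = 0.6*exp(-(t-23).^2/(2*4^2));      % elution profile of compound 2 (overlaps)
C  = [c1 c2];                          % 40 x 2 matrix of elution profiles
S  = [1.0 0.5 0.0 0.2                  % spectrum of compound 1 (4 channels)
      0.0 0.3 1.0 0.4];                % spectrum of compound 2
X  = C*S;                              % 40 x 4 data matrix (scans x channels)

% Every channel records a mixture of both compounds, yet the matrix
% as a whole contains only two independent components.
r = rank(X);

% ASSUMPTION: the pure spectra S are known. The profiles then follow
% by least squares, a separation made in the computer only.
C_est  = X/S;                          % solves C_est*S = X
maxerr = max(abs(C_est(:) - C(:)));
disp(r), disp(maxerr)
```

The chromatographic overlap is untouched in this example; only the spectral dimension is used to pull the two profiles apart.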
1.6
The advantages of hyphenated data analysis
The analysis of data from hyphenated instruments has one major advantage compared with some other domains of science. The spectra that we obtain by analyzing the raw information are directly verifiable. If we obtain a spectrum from a mixture that suggests a novel compound, we can always try to isolate that molecule. The direct recording of the spectrum from the purified compound is the final proof of the finding. On the other hand, if we cannot isolate the compound, the finding should be treated with caution.
Chemical compounds are usually stable entities. Not all fields of study are blessed with this property in the objects they study. The astronomer is not that fortunate. He cannot go directly to pick up a sample of interstellar gas to verify the presence of an organic molecule. The astronomer's conclusions remain more tentative than the findings of the analytical chemist.
The final verification for a new spectral entity is the purification of the novel compound. Even before this step, we can obtain information about the reliability of the new spectrum. We can estimate statistically how reliable the spectra and elution profiles are. We shall devote an entire section in the book to the methods for assessing the reliability of the solution.
In the future, the cost of computing will be so low that extensive analyses should be possible for data from hyphenated instruments. The sample is put through a number of hyphenated instruments and the results from all instruments can be subjected to a common processing step in the computer. A large collection of special methods can be replaced by one standard method. The economics favor the new trend. It is cheaper to add computing capacity to an existing analytical instrument than to refine the instrument with more precise mechanical or optical components. What counts is the real cost of obtaining a bit of information concerning the sample. Chemometrics has a bright future ahead [Massart et al. 1990, Brereton et al. 1992].
With a new generation of improved tools for data analysis the analyst will have more time to contribute to the interpretation and effective use of the data.
References
Brereton RG, Scott DR, Massart DL et al., eds. Chemometrics Tutorials II. Amsterdam-London-New York-Tokyo: Elsevier, 1992, 314 pages.
Martens H, Naes T. Multivariate calibration. Chichester-New York-Brisbane-Toronto-Singapore: John Wiley & Sons, 1989, pp. 111-112.
Massart DL, Brereton RG, Dessy RE, Hopke PK, Spiegelman CH, Wegscheider W, eds. Chemometrics Tutorials. Amsterdam-Oxford-New York-Tokyo: Elsevier, 1990, 427 pages.
O'Haver TC. Teaching and learning chemometrics with MatLab. Chemometrics and Intelligent Laboratory Systems, 1989; 6: 95-103.
Chapter 2
Data preprocessing
• A MATLAB example
• Handling noise
• Visualization and filtering
• Types of filters
• Image processing and hyphenated data
• Cleaning up data
• Correction for the sampling delay
2
Data preprocessing
Data from hyphenated instruments is often a huge collection of numbers. The first difficulty is managing the massive amount of data. It is necessary to process the raw data to gain some initial insight into the nature of the data set. The preprocessing needed to have a look at the data is called visualization. Methods for visualization are a special field in mathematical modeling and programming. Techniques used for visualization are often identical with the techniques used in the actual analysis of chemical data. Chemists should familiarize themselves with the basic tools for image processing, such as the program Photoshop. These tools are very useful in taking a first look at data matrices.
Preprocessing of the instrumental data is crucial for the success of the actual signal processing. The goal is to suppress noise and emphasize the information. When data is transferred between computers, transmission errors may creep in and these errors must be detected. Not all readings are equally useful in the later steps. Inessential data can be eliminated. The information can sometimes be compacted by statistical means. A large data matrix can be represented by fewer numbers to give a more compact representation. The readings of the original matrix can then be reconstructed from the compact representation, and hopefully some noise has been lost in the process.
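The idea of a compact representation can be sketched with a truncated singular value decomposition. This is only one possible choice of "statistical means", assumed here for illustration; the data, the noise level and the number of components kept (k) are all invented.

```matlab
% Sketch: compacting a data matrix with a truncated SVD (all data invented).
t = (1:50)';
C = [exp(-(t-20).^2/50), exp(-(t-28).^2/50)];    % two elution profiles
S = [linspace(1,0,30); linspace(0,1,30)];        % two simple "spectra"
X = C*S + 0.01*randn(50,30);                     % noisy 50 x 30 data matrix

[U,D,V] = svd(X,0);                              % economy-size decomposition
k  = 2;                                          % components kept (assumed known)
Xk = U(:,1:k)*D(1:k,1:k)*V(:,1:k)';              % reconstruction from k components

stored_full    = numel(X);                       % 50*30 = 1500 numbers
stored_compact = k*(size(X,1) + size(X,2) + 1);  % 2*(50+30+1) = 162 numbers

s = diag(D);
discarded = norm(X - Xk,'fro');                  % what the compaction left out
expected  = sqrt(sum(s(k+1:end).^2));            % the same quantity from the
                                                 % discarded singular values
disp([stored_full stored_compact])
```

The part that is left out consists of the small singular values. If the noise is spread evenly over all components, most of it ends up in this discarded part.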
2.1
A MATLAB example
2.1.1
Conversion to MATLAB matrix format
To process GC-MS readings, we must first convert the readings from the "raw" instrument data format into the matrix format used by MATLAB. Let us assume that the very first step in this conversion has already been made with a small Basic program. This step is so specific to each instrument that it is not possible to cover all of them. For simplicity we assume that the collected spectra have been reduced to triples. Each triple contains the number of the scan, the mass number and the abundance or intensity of one spectral line in the observations (Fig. 2.1). The elements of each triple are separated by commas in an ASCII file written out by the Basic program. As the conversion is done only once, there is no great need for the highest speed in the program. Sometimes, the conversion from the "raw" instrument format to a format acceptable to MATLAB can be done manually with a text editor. In UNIX systems the "vi" editor is usually available.
Figure 2.1 A small part of a mass spectrometry data file in the triple format suitable for the MATLAB program triquad.m. The first column is the spectrum number, the second is the mass number (here multiplied by ten) and the last column is the abundance or intensity value. The intensity is expressed in absolute counts, not in relative units.
1, 552, 93
1, 562, 30
1, 572, 108
1, 591, 39
1, 611, 31
1, 672, 56
Program triquad.m converts triples into the matrix format needed by MATLAB (Prog. 2.1). The dimensions of the matrix are the number of scans and the number of masses. The values in the matrix are the individual abundances. We assume that the spectra are evenly spaced in time. We walk through the program statement by statement to explain the operation of MATLAB. Later in the book we explain only new and important statements. Let us now see how the program triquad works. The first line is:
%TRIQUAD.M converts data from "triple" into matrix format.
It has a percent sign as the first character of the line to show that this line is a comment. TRIQUAD is the name we have chosen for our small conversion utility. The extension ".M" signifies that the file is a MATLAB script (or function). A script can contain empty lines for improving the clarity of the code. All explanations in the program code must start with a percent sign.
clc
This line clears the output in the Command window by scrolling the display and putting the cursor on the first line of the new display. All commands and functions are written in lower case letters in MATLAB. Filenames or names of variables may use upper case letters as well. Remember that MATLAB distinguishes between the variables az and AZ. Some platforms are not case-sensitive concerning filenames.
clear
This command empties the working memory (RAM) of MATLAB. The MATLAB interpreter differs from most other interpreters in that it retains the contents of the working memory even when the interpreter is restarted for a different program script. In practice this means that the memory must be emptied explicitly with the clear statement. If this is not done, the memory often turns out to be too small for the problem.
    %TRIQUAD.M converts data from "triple"
    % into matrix format.
    %
    % Data handling for hyphenated techniques
    % (c) Erkki & Ulla Karjalainen 1995
    %
    clc
    clear
    start=clock;
    load STERO
    TR=STERO;
    [n,m]=size(TR);
    nscans=max(TR(:,1));
    nlines=fix(max(TR(:,2)/10));
    Obs=zeros(nscans,nlines);
    for i=1:n
      scan=TR(i,1);
      line=fix(TR(i,2)/10);
      abu=TR(i,3);
      Obs(scan,line)=Obs(scan,line)+abu;
    end
    save Obs Obs
    plot(sum(Obs'))
    disp('Strike any key to continue'),pause
    convtime=etime(clock,start);
    echo off
    disp('Conversion time triples to matrix '),...
    disp(convtime)
    disp('The number of triples = '),disp(n)
    disp('The resulting matrix contains '),...
    disp(nscans),...
    disp(' scans')
    disp('and '),disp(nlines),...
    disp(' spectral lines after truncation')
    disp('Strike any key to continue'),pause
    echo on
Program 2.1 Program triquad.m converts spectral data which is in "triple" format into a matrix format for use with MATLAB.
start=clock;
It is important to be able to measure the running time of programs. For this, we use the clock function in MATLAB. The current time of the clock is put into the vector start. When we later call clock again, we can calculate the elapsed time. The semicolon at the end of the line is very important: without it, the program will output the result of that line. This is not too inconvenient for scalars or short vectors but it is intolerable for large arrays.
load STERO
This command loads into the working memory a narrow matrix containing the triples (Fig. 2.1). Each horizontal row of the matrix has three elements. The elements are the number of the scan, the mass number as an integer and the abundance reading for that particular spectral line. From the point of view of the operating system the file is a normal ASCII file containing printable characters. If the file is large, the reading process takes a long time because the interpreter checks that the data has the correct form. If the matrix has an erroneous format it is rejected by the MATLAB interpreter.
TR=STERO;
The purpose of this statement is to convert the name of the matrix into a more convenient form. In this simple example, the name of the ASCII data file (STERO) is "hard-coded" into the program. If you input a different data file, you must change the name of the file in the MATLAB program. A better way is to change the program so that it asks the user for the name of the file. This is shown later in other programs.
[n,m]=size(TR);
This statement stores the size of the triple matrix for later use. The format of the statement is worth noting. It is a function call to a routine called size. The input variable to the routine is the matrix TR. The output from the routine is returned on the left side of the equal sign. There are two output values: n corresponds to the number of rows and m corresponds to the number of columns. It is worth remembering that multiple return arguments are returned in square brackets. If there is only a single return argument no square brackets are needed.
nscans=max(TR(:,1));
The number of scans in the spectrum matrix is found using this statement. As we see from the statement, the maximum value in the first column of TR is the result. We assume here that the first spectrum number is one. Otherwise we should calculate the real number of spectra as a difference. It is again worth noting how the colon is used. As an index in a matrix it means "all" rows or "all" columns, depending on its position; here it selects all rows of the first column. It is a very practical way to express things, as we shall see later.
nlines=fix(max(TR(:,2)/10));
The number of spectral lines is defined using this statement. What happens here is the following: the values in the second column of TR are divided by ten, the maximum is found, and the result is converted to an integer by the function fix. Here the original spectral data were recorded with one decimal in the mass numbers. The decimal point was just left out, which is the same thing as multiplying by ten. To avoid the formation of a huge matrix the mass numbers were scaled to integer values. We assume here that spectral lines start from one. If they do not, a number of columns in the matrix always contain zeros. This makes the matrix larger, but the original positions of the mass numbers are retained.
Obs=zeros(nscans,nlines);
Space needed for the observation matrix is reserved in the working memory under the name Obs. The size of the matrix depends on the maximum dimensions. The dimensions are nscans and nlines; nscans is the number of spectra, while nlines is the number of spectral lines. The function zeros fills the Obs matrix initially with zeros. The next six lines of the program must be seen as an entity. The statements in the program fragment form a loop. The loop is between the for and end statements.
for i=1:n
  scan=TR(i,1);
  line=fix(TR(i,2)/10);
  abu=TR(i,3);
  Obs(scan,line)=Obs(scan,line)+abu;
end
Let us dissect this small piece of code. Programmers using Basic or other high-level languages can certainly see what this program fragment does. The loop is traversed n times starting with the index value of 1 for the index variable i. For each pass through the loop i is incremented by one. In each pass we take the elements from a row of matrix TR. We place them into three variables, scan, line and abu, corresponding to the scan number, the mass of the fragment and the abundance or intensity of the fragment.
Obs(scan,line)=Obs(scan,line)+abu;
This statement performs the actual conversion from triples into matrix format. Because we are truncating the mass numbers into unit masses, it is possible that intensities from more than one fragment must be summed together. This is handled by the statement above.
Figure 2.2 Total abundance of GC-MS sample data, plotted against scan number (vertical axis: total abundance, scale ×10^5). The sample is the derivatized neutral steroid fraction extracted from a urine sample. Segments of this data are used later in the examples of this chapter to illustrate filtering.
save Obs Obs
The previous statement saves the results of the conversion on disk into a file Obs.mat. The first Obs is the filename, the second is the variable name used inside the program. MATLAB automatically adds the ending ".MAT" to the filename.
plot(sum(Obs'))
MATLAB contains several functions that are used for displaying graphical output. The plot function is used here to show the total abundance of the chromatographic run (Fig. 2.2). The total abundance is calculated by the sum function. The argument of the sum function is the transpose of the Obs matrix. If the sum function is given a matrix as the argument, the result is a vector that contains the sums for each column in the matrix. Because we want to see the sums of all rows in the Obs matrix, we must first form the transpose of the Obs matrix. The formation of a transpose is very handy in MATLAB. We simply put a single quote after the matrix or vector and obtain its transpose (Obs').
convtime=etime(clock,start);
The conversion time depends on many factors. To measure the effect of different program changes, we must be able to measure the time needed by the computer. The function etime calculates the elapsed time in seconds and stores the result as the variable convtime.
echo off
disp('Conversion time triples to matrix '),...
disp(convtime)
disp('The number of triples = '),disp(n)
disp('The resulting matrix contains '),disp(nscans),...
disp(' scans')
disp('and '),disp(nlines),...
disp(' spectral lines after truncation')
This final group of statements simply shows some statistics about the observations. The number of input triples is displayed as well as the dimensions of the resulting observation matrix. The three dots at the end of some statements mean that the statement continues on the next line. When you run the program you see in the Command window how it proceeds. The statement echo off stops the printout of the program listing. This way the program listing does not disturb the actual data output of the program. When all disp statements are over, the echo on statement can again be given to enable the program listing.
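The conversion loop of triquad can also be written without an explicit loop. The following is a minimal sketch, not part of the original program: it assumes a MATLAB version that provides the function accumarray, and it uses a small matrix typed in by hand instead of the STERO file. The first six triples are those of Fig. 2.1; the seventh is invented to show the summing of two lines that fall into the same unit mass.

```matlab
% Sketch: building the observation matrix without an explicit loop.
TR = [1 552  93
      1 562  30
      1 572 108
      1 591  39
      1 611  31
      1 672  56
      1 558   7];                  % hypothetical extra line, also unit mass 55

scan = TR(:,1);                    % scan numbers
mass = fix(TR(:,2)/10);            % mass numbers truncated to unit masses
abu  = TR(:,3);                    % abundances

nscans = max(scan);
nlines = max(mass);

% accumarray sums all abundances that share the same (scan, mass) position
Obs = accumarray([scan mass], abu, [nscans nlines]);

% Row sums give the total abundance of each scan;
% sum(Obs,2) does this without forming the transpose.
total = sum(Obs,2);

disp(Obs(1,55))                    % 93 + 7 = 100
disp(total)                        % 364
```

The variable is called mass rather than line here only to avoid hiding the MATLAB graphics function of the same name.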
2.2
Handling noise
2.2.1
Signal-to-noise ratio
The analyst tries to get instrument readings that are very precise. Simultaneously he knows that due to imperfections in the instruments he can never measure the phenomena exactly. His readings can contain only a certain number of decimals. Electrical noise interferes with his measurements. Volumes cannot be measured with high precision. Electron multipliers produce background counts due to thermal electrons or cosmic rays. Instrumental data are always a combination of true readings and noise.
It is not easy to define precisely what the noise is. For modelling purposes noise is often defined as the residuals that are not explained by the models. The reason for the residuals can be noise in instrument readings or an imperfect model. We can never be sure which of the two is behind the residuals. Sometimes, the interesting part of the analysis turns out to be the residuals. In those cases a plausible mathematical model must be found to describe the residuals and experimental error better.
The noise in measurements can be reduced by proper signal processing. If noise has a frequency spectrum different from that of the phenomenon of interest, it is possible to reduce noise in a selective fashion. If the signal is located at the low frequencies, noise that occurs at high frequencies can be suppressed by low-pass filtering.
2.2.2
Types of noise
Salt-and-pepper noise
There are several types of noise. The term "salt-and-pepper" noise comes from image processing. In an image, a missing point is white ("salt") and an extra point is black ("pepper") [Justusson 1981]. In spectral data, if a spectral line is missing, we have "salt" type of noise (Fig. 2.3a). If we have a spectral line in a position where there should be none, we have "pepper" type of noise. If there is a shift in the position of a spectral line, we have both types of noise at the same time. This type of noise is common in mass spectrometry, because the statistics of cumulated ions are reflected in both intensities and positions of spectral lines. If the counts are low, this phenomenon is common.
Constant and proportional noise
If the noise does not increase with the level of the signal, it is constant (Fig. 2.3b). Proportional noise is related to the level of the signal; it grows in direct proportion to the signal (Fig. 2.3c). For many types of signals the noise can be handled as proportional as a first approximation. The signal may have noise that is proportional over higher values of the signal. At lower intensities the error may be higher than proportional to the signal.
Figure 2.3 Types of noise. a) "Salt-and-pepper" type noise is sudden missing points ("salt") or points with high intensity ("pepper"). b) Constant noise is independent of the level of a signal, while c) proportional noise increases with increasing level of the signal.
In mass spectroscopy, the noise is roughly related to the square root of the accumulated pulses. This is an approximation to the Poisson statistics of the pulse arrival. There is an additional error in the values of the mass numbers. The assignment of centroids to integral mass numbers is not perfect. At lower intensities, proportionally more of the readings get "misclassified" because the statistics of the centroid calculation produce uncertainty in the calculated mass values. This results in higher errors for abundances at lower intensities.
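The difference between the noise models is easiest to see in the relative error. The sketch below compares constant noise, proportional noise and square-root (counting) noise at four signal levels. The levels and the noise parameters are invented, and the simulated reading uses a Gaussian approximation to the Poisson statistics.

```matlab
% Sketch: three noise models applied to the same signal levels (values invented).
s = [10 100 1000 10000];             % true signal levels, e.g. ion counts

sd_const = 5*ones(size(s));          % constant noise: same spread at every level
sd_prop  = 0.02*s;                   % proportional noise: 2 % of the signal
sd_count = sqrt(s);                  % counting noise: square root of the counts

% Relative error (standard deviation divided by signal) for each model
rel_const = sd_const./s;
rel_prop  = sd_prop./s;
rel_count = sd_count./s;

% One simulated reading per level under the counting model
% (Gaussian approximation; a fresh random draw on every run)
reading = s + sd_count.*randn(size(s));

% Columns: signal, relative error for constant, proportional, counting noise
disp([s; rel_const; rel_prop; rel_count]')
```

The relative error stays the same for proportional noise but grows towards low intensities for both constant and counting noise.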
2.3
Visualization and filtering
The computer display can show many things in color that cannot be reproduced on a book page. There is a large body of literature about image processing. The methods that are useful in manipulating pictures are relevant to hyphenated methods. Microcomputer programs like Photoshop or NIH Image permit one to change the images with sharpening filters or different convolution operations. The problem is how to transform instrumental data into an image that can then be inspected. Line spectra can be filtered with low-pass filters to change them into color maps with gradually changing tones. The color hues make the line spectra suitable for the human eye.
By a suitable filtering operation it is possible to reduce noise in the observations [Smit 1992a,b, Rutan 1992]. Smoothing of the observations is useful for speeding up the convergence of complex models. Convolution is a common operation in signal processing. Linear filtering operations are often performed using convolution. A careful choice of filtering coefficients reduces noise in the observations and brings out the features of interest.
Convolution is almost trivial as a computer program. We see that by walking through our example program. It must be emphasized that this program has not been written in MATLAB style. It has been written with explicit program loops to clarify the logic of the program. In practice the loops are replaced by single lines making calls to the built-in functions of MATLAB.
Convolution is a "gliding dot product". The coefficients defining the filter are placed in a vector. The filter is then placed over the time series and the coefficients in matching positions in both series are multiplied. The results of the term-by-term multiplication are summed together and placed as a data point into a new time series. The ends of the time series are somewhat problematic, because the first and last terms cannot have the same treatment as the other terms in the series. The new series can be defined to be longer than the original. As a second alternative, the length of the original is kept, but the ends of the filtered series have low values. Note that the order of the filtering coefficients is reversed left to right before the local multiplication. It is customary to use an odd number of coefficients for the filter. Most of the filters are constructed to use coefficients that are symmetrical around the center point.
The convolution is a heavy numerical calculation. MATLAB has a function, conv, that makes this calculation directly in the interpreter. If you intend to do any convolutions with MATLAB, you should use this special function for speed reasons. Similarly, any two-dimensional convolutions should be made using the corresponding two-dimensional function conv2.
We shall illustrate the effects of different convolution filters by a series of figures that have been calculated by our example program (Figs. 2.4-2.9). We generate a synthetic time series using sine waves on a constant background. To this basic signal we add white noise as random numbers. The amount of noise is controlled by a multiplier that is input by the user. If the generated waveform is noise-free, the filtering operations have little visible effect (Fig. 2.4). Close inspection shows that the longer averaging filters make the signal a bit lower in amplitude (Fig. 2.6). The story changes completely when we add noise to the time series (Fig. 2.5). Progressively longer low-pass filters smooth out the noise (Fig. 2.7). Simultaneously the amplitudes of the peaks in the sine curve get lower. If the filter has only one coefficient, the original time series is copied unchanged to the output (Fig. 2.4a and Fig. 2.5a).
High-pass filters have negative tails and a positive middle. The simplest high-pass filter uses the coefficients -1,3,-1. If the time series is noise-free, we see only a small emphasis at the maxima and minima of the series (Fig. 2.8). If we use a high-pass filter on a time series containing noise, the effect is dramatic. The noise gets amplified with this filter. This is obviously not the purpose, but it shows well the dangers of applying high-pass filters to data containing large amounts of noise (Fig. 2.9).
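The "gliding dot product" can be written out in a few lines. The sketch below is not the example program of the book; it is a minimal illustration with an invented signal, an assumed noise multiplier and a three-point moving average scaled so that its coefficients sum to one. The result of the explicit loop is compared with the built-in function conv.

```matlab
% Sketch: convolution written out as a "gliding dot product".
% Signal, noise level and filter are invented for illustration.
n = 100;
k = 1:n;
noise = 0.1;                          % noise multiplier (0 gives a clean signal)
x = 1 + sin(2*pi*k/50) + noise*randn(1,n);   % sine wave on a constant background

h  = [1 1 1]/3;                       % three-point moving average
m  = length(h);
hr = h(end:-1:1);                     % coefficients reversed left to right
xp = [zeros(1,m-1) x zeros(1,m-1)];   % zero padding takes care of the two ends

y = zeros(1,n+m-1);                   % "full" result, longer than the original
for i = 1:n+m-1
    y(i) = sum(xp(i:i+m-1).*hr);      % multiply matching terms and sum them
end

y_builtin = conv(x,h);                % the built-in function gives the same series
maxdiff = max(abs(y - y_builtin));

y_same = y(2:n+1);                    % second alternative: keep the original length
y_hp   = conv(x,[-1 3 -1]);           % the simple high-pass filter, for comparison
disp(maxdiff)
```

Because of the zero padding, the first and last points of the filtered series are lower than their neighbors, as described in the text.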
Figure 2.4 Convolution. a) The original noise-free data is a sine wave. b) The data is filtered using a low-pass filter. The filter uses a three point moving average. [Panel titles: a) Coeffs. = 1, Noise = 0; b) Coeffs. = 1, 1, 1, Noise = 0. Horizontal axis: Point #.]
Figure 2.5 Convolution. a) A sine wave containing noise. b) The data is filtered using a low-pass filter that is a three point moving average. The filtering reduces noise. [Panel titles: a) Coeffs. = 1, Noise = 0.5; b) Coeffs. = 1, 1, 1, Noise = 0.5. Horizontal axis: Point #.]
Figure 2.6 Convolution by low-pass filtering. Filtering of noise-free sine wave data with a five point moving average (a) and with a nine point moving average (b). The signal is progressively flattened by the longer filters. [Panel titles: a) Coeffs. = 1, 1, 1, 1, 1, Noise = 0; b) Coeffs. = 1, 1, 1, 1, 1, 1, 1, 1, 1, Noise = 0. Horizontal axis: Point #.]
Figure 2.7 Convolution by low-pass filtering. Filtering of noise-containing sine wave with a five point moving average (a) and with a nine point moving average (b). The longer filter reduces noise more effectively, but at the expense of dynamics. [Panel titles: a) Coeffs. = 1, 1, 1, 1, 1, Noise = 0.5; b) Coeffs. = 1, 1, 1, 1, 1, 1, 1, 1, 1, Noise = 0.5. Horizontal axis: Point #.]
Figure 2.8 High-pass filtering of noise-free data using a three-point filter. The high-pass filter vector is -1,3,-1. The contrast in the signal is slightly increased. [Panel title: Coeffs. = -1, 3, -1, Noise = 0. Horizontal axis: Point #.]
The broadening of the injected sample in a chromatographic run could be simulated by applying a Gaussian filter to a series of spikes that represent the retention times of the different compounds. The sharp spikes get broadened out in the chromatogram by the low-pass filter of the chromatographic process. In theory we could apply the inverse function of the broadening process to the data and get back the initial sharp peaks. The inverse function of a low-pass filter is a high-pass filter that can be derived from it. If our data were completely noise-free, the recovery of the original unbroadened peaks would work. In practice, this theoretical idea is worthless because the noise that gets amplified by the high-pass filter destroys our attempts. A simple inverse filter is not enough; we must use more sophisticated methods.
The word "filtering" has been borrowed from electronics. In electronic devices, analog circuits perform filtering operations. Resistors and capacitors can be combined with operational amplifiers to form electronic filters that amplify or suppress different frequencies in continuous electric signals. A common function for electronic filters in instruments is to suppress the frequency of 50 or 60 Hz that is present everywhere due to the frequency of the AC supply voltage. A special "notch" filter eliminates this frequency from the signal of interest.
In digital signal processing, computer-based methods have been developed that perform operations similar to those of the analog electronic filters. The filtering operations are made by mathematical calculations on numbers. The operations can be performed with the convolution operation on time series. Another way of getting the same result is to make the filtering operation in the frequency or Fourier domain. In the Fourier domain, the frequency spectrum can be manipulated to get the same effect as filtering by convolution. The operations in the Fourier space have one major advantage: the laborious convolving operation is replaced by a simple multiplication. If the coefficients of the convolution filter are identical and sum to unity, the resulting time series is just a moving average.
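The equivalence of convolution and multiplication in the Fourier domain can be checked numerically. The following sketch uses an invented test signal and a five-point moving average. Both series are padded with zeros to the length of the full convolution so that the circular nature of the discrete Fourier transform does not disturb the comparison.

```matlab
% Sketch: the same filtering done by convolution and in the Fourier domain.
n = 64;
k = 0:n-1;
x = 1 + sin(2*pi*k/16);              % invented test signal
h = [1 1 1 1 1]/5;                   % five-point moving average

y_conv = conv(x,h);                  % time-domain convolution, length n+4

N = n + length(h) - 1;               % pad so that wrap-around cannot occur
Y = fft(x,N).*fft(h,N);              % one multiplication per frequency
y_fft = real(ifft(Y));               % back to the time domain

maxdiff = max(abs(y_conv - y_fft));
disp(maxdiff)
```

For a short filter such as this one the direct convolution is fast enough; the Fourier route pays off when both the series and the filter are long.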
Figure 2.9 High-pass filtering of data containing noise using a three-point filter. The original amount of noise added is 0.5 percent. The high-pass filter vector is -1,3,-1. The apparent noise level has been increased by filtering. [Panel title: Coeffs. = -1, 3, -1, Noise = 0.5. Horizontal axis: Point #.]
The convolution filter can have a Gaussian shape. This bell-shaped filter does local smoothing by emphasizing the values in the middle of the series. Many other types of filters are possible.
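Two simple numbers help to compare filters of this kind: the sum of the coefficients, which is the factor applied to a constant signal, and the sum of the squared coefficients, which is the factor applied to the variance of white noise. The sketch below computes both for a moving average, a Gaussian-shaped filter and the high-pass filter -1,3,-1. The length and width of the Gaussian are invented for illustration.

```matlab
% Sketch: three small filters and two numbers that summarize each of them.
ma = ones(1,5)/5;                    % five-point moving average (low-pass)
j  = -3:3;
g  = exp(-j.^2/(2*1.5^2));           % Gaussian shape, width 1.5 points (invented)
g  = g/sum(g);                       % scaled so that the coefficients sum to one
hp = [-1 3 -1];                      % the simple high-pass filter of the text

filters    = {ma, g, hp};
dc_gain    = zeros(1,3);
noise_gain = zeros(1,3);
for f = 1:3
    h = filters{f};
    dc_gain(f)    = sum(h);          % factor applied to a constant signal
    noise_gain(f) = sum(h.^2);       % factor applied to white-noise variance
end
disp([dc_gain; noise_gain])
```

All three filters leave a constant signal unchanged, but the high-pass filter multiplies the noise variance by 11 while the two smoothing filters reduce it.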
2.4
Types of filters
Numerical filters were first used by electronic engineers. With the lower cost of computers the digital filters have been rapidly taken up by all scientists measuring signals [Oppenheim 1978].
2.4.1
Low-pass filters
Low-pass filters correspond to local smoothing operations. To be useful, the signal that is processed with a low-pass filter should contain noise that is located at higher frequencies than the signal of interest. In chemical instruments it is not usually possible to separate the frequency domains of signal and noise. Still, the signal often has characteristics that make low-pass filtering useful as a first step in the data analysis.
Figure 2.10 The frequency curves of different types of filters. a) A low-pass filter allows the low frequencies to pass while blocking the high frequencies. b) The high-pass filter allows the high frequencies to pass while blocking the low frequencies. c) The band-pass filter combines the high-pass and low-pass filters to allow the medium frequencies to pass.
In mass spectra there is normally some "jitter" in the positions of the lines in a spectrum. If there is a slight mismatch between the model and data in the position of the mass, this causes a large mismatch in the fit of the model. The error is small, but it is in the wrong dimension, namely along the x-axis. Low-pass filtering along the mass axis makes the error less noticeable. It distributes the intensity from a single line to neighboring lines and improves the fit considerably. Yet, this is not the best way to handle the problem. The mass spectra should be reintegrated into the unit masses after the initial broadening of the spectral lines.
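The effect of line "jitter" on the fit can be shown with a single spectral line. In the sketch below the model expects a line at one position and the data has it one position further along the mass axis. The spectra and the weights of the smoothing filter are invented, and the option 'same' of conv (assumed to be available in the MATLAB version used) keeps the length of the original spectrum.

```matlab
% Sketch: a one-unit shift in line position, before and after smoothing.
model = zeros(1,11);  model(6) = 1;      % model expects the line at position 6
data  = zeros(1,11);  data(7)  = 1;      % measured line appears at position 7

ss_raw = sum((data - model).^2);         % sum of squared residuals: 2

h = [1 2 1]/4;                           % small low-pass filter (invented weights)
model_s = conv(model,h,'same');          % broaden both spectra along the mass axis
data_s  = conv(data,h,'same');

ss_smooth = sum((data_s - model_s).^2);  % residuals after broadening: 0.25
disp([ss_raw ss_smooth])
```

The mismatch has not disappeared, but the broadened lines overlap and the residual is eight times smaller.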
2.4.2
High-pass filters
High-pass filtering is an everyday operation in chemistry. Background subtraction is an operation that removes a slow component from the signal and is common for chromatographic signals. The slow component in the signal often has an ascending or a descending shape. The first step in the analysis before integration of the peak is to remove the slow background. This operation is not commonly called high-pass filtering but this is what it is.
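Background subtraction can be sketched as follows. The peak, the sloping background and the choice of peak-free regions are all invented. A straight line is fitted to the regions assumed to contain background only and is then subtracted from the whole signal; this is one simple way of removing the slow component, not a general recipe.

```matlab
% Sketch: removing a slow, sloping background before peak integration.
t = 1:100;
peak = 10*exp(-(t-50).^2/(2*3^2));       % invented chromatographic peak
background = 2 + 0.05*t;                 % slowly ascending background
x = peak + background;                   % recorded signal

idx = [1:20 81:100];                     % regions ASSUMED to be free of peaks
p = polyfit(t(idx), x(idx), 1);          % straight line through the background
corrected = x - polyval(p, t);           % subtract the slow component

area_true      = sum(peak);
area_corrected = sum(corrected);
maxerr = max(abs(corrected - peak));
disp([area_true area_corrected])
```

In real data the background is rarely a perfect straight line and the peak-free regions must be chosen with care.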
Taking a derivative of the signal is related to high-pass filtering. It is common to perform this operation on digital signals by forming a digital difference of the original series. The running difference is an approximation of the first derivative. For higher derivatives, the operation is repeated as often as necessary. The differentiation can be carried out by convolution. We have a filter with two terms, +1 and -1. Applying this filter to the original series gives an approximate derivative of the series.
In theory, the signal-broadening effects in chromatography could be reversed by high-pass filtering of the original data. In practice, simple high-pass filtering is fraught with difficulty, because taking a derivative results in higher noise. If the data were noise-free, we could recover the original undistorted signal. In a blurred photograph, the higher frequencies are lost due to unsharp focus or movement during exposure. We could use an inverse filter that emphasizes the higher frequencies to recover the original sharp picture. Elimination of signal broadening is not possible using a simple high-pass filter. The noise is amplified as well, and the potentially sharp original picture is distorted by amplified noise. A similar situation applies to attempts to sharpen chromatographic peaks by inverse filters. Increased noise in the reconstituted signal makes the newly separated peaks unusable. More sophisticated methods are needed to recover sharp chromatographic peaks. The Wiener filter is an example of a better way to reduce the difficulties due to the amplification of noise.
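The two-term filter can be tried out directly. The following is a minimal sketch and not one of the program listings of the book; the Gaussian test peak, the alternating disturbance, and all variable names are assumptions chosen for illustration. It checks that convolution with the terms +1 and -1 reproduces the running difference, and it shows that a disturbance at the highest frequency leaves the operation with twice its amplitude.

```matlab
t     = (1:200)';
clean = exp(-((t - 100).^2) / (2*15^2));   % hypothetical Gaussian test peak

% Convolution with the filter [+1 -1]. The first and the last output
% value involve only one data point each and are dropped.
d_conv = conv(clean, [1; -1]);
d_conv = d_conv(2:end-1);

d_diff  = diff(clean);                     % built-in running difference
max_dev = max(abs(d_conv - d_diff));       % agreement of the two routes

% A disturbance that alternates from point to point (the highest
% frequency the series can hold) comes out with twice its amplitude.
noise      = 0.01 * (-1).^t;
noise_gain = max(abs(diff(noise))) / max(abs(noise));
```

The slowly varying peak is changed only gently by the difference, while the point-to-point disturbance is doubled, which is the noise amplification described above.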
2.4.3 Band-pass filters
Band-pass filters let through a certain frequency range. They are formed by combining a suitable low-pass and a high-pass filter. In chromatography we often apply band-pass filtering to the raw signal before the final processing steps. The lowest frequencies are eliminated first and the highest frequencies are then eliminated by low-pass filtering. The opposite of a band-pass filter is a notch filter that suppresses a certain band of frequencies. If noise is limited to a certain frequency band this is a useful way to reduce it.
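Combining a high-pass and a low-pass step into one filter can be sketched as follows. This is an illustration under stated assumptions, not a listing from the book: the short signal and the two small filters are hypothetical, and a practical band-pass filter would normally be longer. The sketch convolves the two filters with each other and checks that one pass with the combined filter gives the same result as two successive passes.

```matlab
x  = [0 0 1 4 9 4 1 0 0 2 2 2 0 0];   % hypothetical raw signal
lp = [1 2 1] / 4;                     % short smoothing (low-pass) filter
hp = [1 -1];                          % running difference (high-pass)

two_steps = conv(conv(x, hp), lp);    % high-pass first, then low-pass
bp        = conv(hp, lp);             % the two filters combined into one
one_step  = conv(x, bp);
max_dev   = max(abs(two_steps - one_step));

% Response of the combined filter to a constant series (lowest frequency)
% and to an alternating series (highest frequency): both are blocked.
resp_low  = sum(bp);
resp_high = sum(bp .* (-1).^(0:length(bp)-1));
```

Because both steps are linear, the order of the two filters does not matter, and the combined coefficients need to be computed only once.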
2.4.4 Savitzky-Golay filters
Removing the noise from spectral measurements is not easy. When the signal is passed through a low-pass filter, the spectra are smoothed. At the same time, important spectral features are lost. Peaks lose height when subjected to an averaging filter. Moving average filters are easy to program, but they destroy important spectral data. The ideal filter would produce smooth-looking data without flattening the peaks. What is needed is a selective filtering process. It should not reduce the lowest frequencies as they often contain the information responsible for the heights. The filter should not change the halfwidths of the peaks. It is not possible to find a solution that can meet all these requirements.
The Savitzky-Golay filter is well known for signal processing in gas chromatography. It has been very popular in this application area, but there are several other areas in chemistry where it has been successfully used. The starting point for the Savitzky-Golay filter is the idea of using local polynomials to describe data [Savitzky and Golay 1964, Steinier et al. 1972]. A central point is chosen as the point that is approximated by the flexible polynomial. A number of adjoining points to the left and right of the central point are used to fit the polynomial. We can look at the Savitzky-Golay filter as a context-sensitive smoothing function. If we take too many points into the list of points used to define the polynomial function, the highest peaks lose height and are flattened. On the other hand, if we have too few points flanking the central point, the results may become erratic (Figs. 2.11 to 2.18).
We shall now progress through the derivation of the algorithm for the Savitzky-Golay filter. We shall initially use some special functions available in MATLAB for the polynomials. Then we shall use more direct calculations instead of the tailored polynomial functions to gain more speed. Finally we shall convert the solution into one that is based on the convolution function of MATLAB to get maximal speed.
We first generate some test data with differing peak widths to get a feel for the algorithm. The test peaks we generate are purely Gaussian ones. The peak halfwidths and peak areas are chosen by parameters. The number of points to be used is selected as well as the position of the Gaussian in the vector of data points. The Savitzky-Golay filter uses a polynomial function to describe the selected set of data points. The central point in the set is then approximated with the resulting polynomial. Next we shall discuss a fragment of the MATLAB program savgol1.m. The program is fully listed in the Appendix (Prog. 2.2).
    TS=TS1+TS2+TS3+TS4+TS5+TS6;
    tic
    B=(1:pit)';
    TSP=zeros(size(TS));
    for ijk=1:tsl-pit
        YY=TS(ijk:ijk+pit-1);
        p=polyfit(B,YY,2);
        yp=polyval(p,(n1+1));
        TSP(n1+ijk,1)=yp;
    end
    toc
The calculation is very slow because the polynomial equation must be solved again for each data point. The program contains an explicit loop that repeats the calculations for all points in the series. If we want to speed up the program we must find steps in the calculations that are repeated. The program repeatedly uses two MATLAB functions, polyfit and polyval. These functions make the program short and simple, but they are slow. The execution time for the initial version of the program savgol1.m is about 50 seconds.
Figure 2.11 A nine point Savitzky-Golay filter uses a second degree polynomial to approximate a time series. The lower panel shows the actual values for the filter coefficients that are used in the convolution.
Figure 2.12 A sixty five point Savitzky-Golay filter uses a second degree polynomial to approximate a time series. A longer filter works better with broader peaks at right, but lowers sharper peaks.
Figure 2.13 A nine point Savitzky-Golay filter uses a fourth degree polynomial to approximate a time series. With this filter length the second and fourth degree polynomials behave almost identically.
Figure 2.14 A sixty five point Savitzky-Golay filter uses a fourth degree polynomial to approximate a time series. It describes sharp peaks better than a second degree polynomial.
To see how the program can be made faster, let us take a look at the program savgol2.m. The program is fully listed in the Appendix (Prog. 2.3).
    TS=TS1+TS2+TS3+TS4+TS5+TS6;
    tic
    A=vander(1:pit);
    A=fliplr(A);
    A=A(:,1:m+1);
    C=pinv(A);
    AH=A(n1+1,:);
    CF=AH*C;
    TSS=TS(1:pit,1);
    Y=(toeplitz(TS,TSS));
    Y=[Y;zeros(n1,pit)];
    TSP=(CF*Y')';
    TSP(1:n1)=[];
    toc
We find that the coefficient matrix A on the left-hand side of the problem stays constant. The right-hand side contains the values of the signal for the selected segment of data points. The coefficient matrix on the left-hand side is a collection of different powers of the sequence 1, 2, 3, ..., N, where N is the total number of data points in the segment. This A matrix is easily formed using the special function in MATLAB that forms Vandermonde matrices. The Vandermonde matrix contains a regular array of data points that are different powers of a succession of numbers [Press et al. 1992a]. The solution of the matrix equation is found using the pseudoinverse C of the A matrix. In MATLAB there is a special function that finds the pseudoinverse. The statement that does this is simply: C=pinv(A);
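The role of the Vandermonde matrix and its pseudoinverse can be checked on a small case. The sketch below is an illustration only; the five data values are hypothetical and were chosen to lie exactly on the parabola y = x^2 - 2x + 3, so that the fitted coefficients are known in advance. The names pit and m follow the listing above.

```matlab
pit = 5;                      % number of points in the segment
m   = 2;                      % degree of the polynomial

A = fliplr(vander(1:pit));    % columns hold x.^0, x.^1, ..., x.^(pit-1)
A = A(:, 1:m+1);              % keep the powers 0 .. m
C = pinv(A);                  % pseudoinverse, size (m+1) x pit

y    = [2; 3; 6; 11; 18];     % hypothetical data: y = x.^2 - 2*x + 3 at x = 1..5
coef = C * y;                 % least-squares coefficients, lowest power first
```

Since A depends only on the segment length and the degree, C is computed once and reused for every segment of the series.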
As we want to find the approximated value for the central data point in the segment, we need to calculate the value for one point only. This saves a lot in the calculations. We form an auxiliary matrix CF by multiplying one row of the A matrix and the pseudoinverse C. To solve for one value it is necessary only to multiply the points in the selected segment with this auxiliary matrix CF. The right-hand sides for all points in a series form a very regular matrix. This matrix is well known in statistics and it is called the Toeplitz matrix [Press et al. 1992a]. There is a special function available in MATLAB that can rapidly form the special repetitive Toeplitz matrix. To define a Toeplitz matrix we need two vectors: the first vector defines the number of rows in the Toeplitz matrix, the second vector is a piece from the beginning of the first vector. The latter defines the size of the Toeplitz matrix by defining the number of columns that are needed.
The calculation time using this version of the Savitzky-Golay program, savgol2.m, is about 2 seconds for our model problem. The gain of more than an order of magnitude in calculation speed looks good and might be considered satisfactory. Still, there are ways to get more speed. The MATLAB interpreter contains a special function conv that is used to calculate convolutions for time series. This function is built into the machine language code of the interpreter and it is optimized for each platform. We shall make use of it in the program savgol3.m. The program is fully listed in the Appendix (Prog. 2.4).
    TS=TS1+TS2+TS3+TS4+TS5+TS6;
    tic
    A=vander(1:pit);
    A=fliplr(A);
    A=A(:,1:m+1);
    C=pinv(A);
    AH=A(n1+1,:);
    CF=AH*C;
    TSP=conv(TS,CF);
    TSP(1:n1)=[];
    tsplen=length(TSP);
    TSP(tsplen-n1+1:tsplen)=[];
    toc
When the coefficients are then used in the convolution function we obtain the filtered new signal: TSP=conv(TS,CF);
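The equivalence of the convolution route and the explicit polynomial loop can be verified on a small test signal. The following is a minimal sketch rather than the appendix program: the single Gaussian peak and its parameters are assumptions, the names pit, m, n1, CF and TSP follow the fragments above, and the trimming is written with index ranges instead of deletions. The coefficient vector is passed as a column so that the output of conv has the same orientation as the signal.

```matlab
t  = (1:120)';
TS = exp(-((t - 60).^2) / (2*6^2));   % hypothetical Gaussian test peak

pit = 9;  m = 2;  n1 = (pit - 1)/2;   % window length, degree, half-window

% Coefficients for the central point, from the pseudoinverse.
A  = fliplr(vander(1:pit));
A  = A(:, 1:m+1);
CF = A(n1+1, :) * pinv(A);

% Fast route: one convolution, trimmed by n1 points at both ends.
full = conv(TS, CF(:));
TSP  = full(n1+1 : end-n1);

% Slow route: fit and evaluate a polynomial for every window.
B   = (1:pit)';
ref = zeros(size(TS));
for k = 1:length(TS)-pit+1
    p = polyfit(B, TS(k:k+pit-1), m);
    ref(n1+k) = polyval(p, n1+1);
end

% Compare where the window lies completely inside the series.
interior = n1+1 : length(TS)-n1;
max_dev  = max(abs(TSP(interior) - ref(interior)));
```

The first and last n1 points of TSP are computed from incomplete windows and should be treated with caution. The smoothing coefficients are symmetric, so the reversal of the filter that is implicit in a convolution has no effect here.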
The calculation time for the newest incarnation of the Savitzky-Golay calculations is just 0.2 seconds, a speed-up of about 200 times from our first version. We do not claim that still higher speeds cannot be programmed. Still, we doubt that it is possible to speed up the fastest version another 200 times. Of course it would be nice, because then the calculation time would be about a millisecond.
Figure 2.15 A nine point Savitzky-Golay filter uses a second degree polynomial to approximate a time series. The middle panel shows the approximation around the sharpest peak. The lowest panel shows the region around the broadest peak.
Figure 2.16 A sixty five point Savitzky-Golay filter uses a second degree polynomial to approximate a time series. The sharpest peak is described better with the shorter filter (Fig. 2.15). The longer filter is again better for the broadest peak.
Figure 2.17 A nine point Savitzky-Golay filter uses a fourth degree polynomial to approximate a time series. A higher degree of polynomial is better for sharper peaks.
Figure 2.18 A sixty five point Savitzky-Golay filter uses a fourth degree polynomial to approximate a time series. A longer filter is not beneficial for the sharpest peaks.
Now let us study the behavior of the Savitzky-Golay filtering process. It is easy to simulate the behavior of the moving average using the Savitzky-Golay filter. We simply choose the degree of the polynomial to be zero. In that case the function is reduced to the average value of the chosen segment. The narrower peaks are severely reduced in height by this moving-average filter. When we choose a higher degree for the Savitzky-Golay filter we see that the peak heights are better approximated and the peaks do not get wider. With the moving average, by contrast, the peak areas tend to be preserved, so the lowering of the peak tops is accompanied by a widening of the peaks.
One of the main uses of Savitzky-Golay filtering is in chromatographic data processing. The integrators detect peaks by the first derivative. The Savitzky-Golay filter reduces the effect of noise. We see that the noise is considerably reduced when the number of points in the Savitzky-Golay filter is about the same as the half-width of the peak. When the peaks are much wider the noise reduction is not as effective. The degree of the polynomial should be high enough to preserve the peak top heights. If the degree of the polynomial is too high, the results from the Savitzky-Golay filter become erratic. When the polynomial is raised to the tenth degree, the predicted points tend to overshoot the true value. One should not use too high a degree, because the approximation process becomes unstable with high-degree polynomials. Second and fourth degree polynomials are the types that are normally used. The best way to find optimal values for the length of the filter and the degree of the polynomial is direct experimentation using real data. With experience, a certain instrument is known to require a filter with well-known parameters for the length and degree.
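The effect of the polynomial degree on a narrow peak can be tried out as follows. This is a sketch under stated assumptions and not one of the appendix programs: the test peak (a Gaussian of unit height with a standard deviation of 3 points) and the nine point window are hypothetical choices. For degree zero the coefficients reduce to those of the moving average.

```matlab
t  = (1:200)';
TS = exp(-((t - 100).^2) / (2*3^2));   % hypothetical narrow peak, height 1

pit = 9;  n1 = (pit - 1)/2;            % nine point window
degrees = [0 2 4];
heights = zeros(size(degrees));        % peak height after filtering
coeffs  = zeros(numel(degrees), pit);  % one row of coefficients per degree

for k = 1:numel(degrees)
    m  = degrees(k);
    A  = fliplr(vander(1:pit));
    A  = A(:, 1:m+1);
    CF = A(n1+1, :) * pinv(A);         % coefficients for the central point
    coeffs(k, :) = CF;
    s  = conv(TS, CF(:));
    s  = s(n1+1 : end-n1);             % trim to the original length
    heights(k) = max(s);
end
```

With these settings the first row of coeffs contains nine equal values of 1/9, and the filtered peak height increases with the degree of the polynomial while staying below the true height of 1.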
2.4.5 Non-linear and Kalman filters
The filtering operations that we have handled thus far are all linear filters. If we apply several filtering steps, the steps can be combined into one filter. The effects of the processing steps are additive. There are other types of filters whose effects are not simply additive. These filters are non-linear filters. A common example is the median filter. Each value in a time series is replaced by a new value that is the median of the values in the surrounding time series. Median filtering can detect salt-and-pepper noise. The median is the "middlemost" value in a series. It is calculated by taking shorter samples from the time series. The values in the sample are sorted. Finally, the middle value is used to replace the current value. Median filtering is a sensitive method to detect certain types of outliers. It is a non-linear type of filtering operation and demands many computing cycles to perform. The median is calculated for an odd number of points. Other non-linear filters have calculation formulas that contain product terms of the original time series. These cross-terms are multiplied by filter coefficients before getting the final value by summation. The number of possible non-linear filters is nearly infinite and we shall not go into detail here. The neural networks are a family of useful non-linear filters that are often used in classification problems [Rumelhart et al. 1986, McClelland et al. 1986, Masters 1995].
Kalman filters are recursive adaptive filters that are useful for processing chemical time series information. Their usefulness can be seen from several examples in the literature [Rutan and Motley 1987, Rutan 1992].
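The median filter described above can be sketched in a few lines. The short series with two spikes, the window of five points, and the threshold of 10 are hypothetical values chosen for illustration; the end points, where the window would be incomplete, are simply left unchanged in this sketch.

```matlab
x = [1 1 2 50 2 3 3 4 -40 4 5]';   % hypothetical series with two spikes
w = 5;                             % odd number of points in the window
h = (w - 1)/2;

y = x;                             % end points are left unchanged
for k = h+1 : length(x)-h
    y(k) = median(x(k-h : k+h));   % sort the sample, take the middle value
end

resid    = x - y;                  % large residuals mark the spikes
suspects = find(abs(resid) > 10);  % hypothetical threshold
```

The two spikes are replaced by values close to their neighbors, and their positions are recovered from the residuals, which is how the median filter serves as a detector for salt-and-pepper noise.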
2.5 Image processing and hyphenated data
All filtering operations can be performed in either the time domain or the frequency domain. Fourier transforms make it possible to process signals in the frequency domain (Fig. 2.19). Fourier-based methods have been routinely used for improving images sent from the space probes since the late sixties [Gonzalez and Wintz 1977]. These methods are useful for hyphenated data as well. We first convert a time series to the frequency domain by making a Fourier transform calculation. After we have performed the necessary processing in the frequency space, we return to the time domain. This conversion from frequency space to time space is made by the inverse Fourier transform.
The Fourier transform has been known for a long time. The French soldier and mathematician J. B. J. Fourier discovered it during the wars of Napoleon. It was not very useful without computers. Even after the computers arrived, it was too time-consuming on the computer to be useful. The situation changed during the mid-sixties, when Cooley and Tukey discovered the Fast Fourier Transform (FFT) [Cooley and Tukey 1965]. This way to calculate the Fourier transforms uses symmetry found in the calculations to drastically reduce the operation count. FFT made the routine application of the Fourier transform practical.
If we have to perform convolution operations on very large time series it is often cheaper to perform the operation in Fourier space. This is because the computation-intensive convolution operation is replaced by the computationally cheaper multiplication operation in the Fourier domain. Two time series are convolved by multiplying their Fourier transforms and then returning to time space by the inverse transform. In practice, the two series of numbers are not equally long. This difficulty is circumvented by padding the shorter series with zeros to the same length as the other series.
The Fourier transform has many uses other than filtering. It is a useful way to look at the data. Hidden frequencies can be found in the data. Such hidden periodic phenomena are nearly impossible to find in the time domain [Brigham 1974].
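Both uses of the Fourier transform can be sketched with the built-in functions fft and ifft. The two short series and the sine wave are hypothetical. One assumption goes beyond the text: both series are zero-padded to the length of the full convolution (the sum of the two lengths minus one), which avoids the wrap-around that the transform would otherwise introduce.

```matlab
% Convolution by multiplication in the Fourier domain.
a = [1 2 3 4 5 6 7 8];            % hypothetical longer series
b = [1 -1 2];                     % hypothetical shorter series
n = length(a) + length(b) - 1;    % length of the full convolution

c_fft   = real(ifft(fft(a, n) .* fft(b, n)));   % fft(., n) pads with zeros
c_dir   = conv(a, b);                           % direct convolution
max_dev = max(abs(c_fft - c_dir));

% Finding a periodic component: a sine with 8 cycles in 64 points.
k    = 0:63;
x    = sin(2*pi*8*k/64);
spec = abs(fft(x));
[pk, ibin] = max(spec(1:32));     % search the first half of the spectrum
cycles = ibin - 1;                % bin 1 holds the zero frequency
```

For series as short as these the direct convolution is faster; the Fourier route starts to pay off only when both series are long.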
Figure 2.19 Fourier transforms of sine waves (a) without noise (Noise = 0) and (b) with noise (Noise = 0.5). A pure sine wave generates a single peak in the Fourier spectrum.
Figure 2.20 Contour plot of a segment of the GC-MS data from the derivatized urinary neutral steroid fraction. The figure shows the original, unfiltered data matrix plotted against m/z. The number of contour levels is ten. There are two very high peaks and a number of smaller ones.
2.5.1 Two-dimensional filtering by convolution or FFT
The methods for handling one-dimensional time series can be generalized to two dimensions (Figs. 2.20 to 2.26). Convolutions are possible in two dimensions (Figs. 2.21, 2.22, 2.25, and 2.26), as are the different Fourier operations (Fig. 2.23). There is even special computer hardware available that has been built for two-dimensional (2D) signal processing. It is often useful to perform a low-pass filtering operation on line spectra. The smoother contours are easy to inspect by eye. The images can be easily manipulated by image processing programs such as Photoshop. The human eye is the best tool for pattern recognition. For the best use of the eye the chemical information must be mapped into proper colors. Line spectra are too sharp to look at in a contour plot. The two-dimensional convolutions are a good way to spread out the data for best color coding. Similarly, high-pass filtering is often beneficial before looking at the smooth optical spectra as a contour plot.
Figure 2.21 a) The upper contour plot shows the data in Fig. 2.20 after horizontal filtering with the coefficients 0.2, 0.2, 0.2, 0.2, 0.2. The convolution process was performed with a row vector. b) The lower plot shows the data after being filtered in both directions (H = 0.2, 0.2, 0.2, 0.2, 0.2; V = 0.2, 0.2, 0.2, 0.2, 0.2). A separate convolution was performed by two vectors. The first vector was a row, the second a column vector. The effect is a convolution by a matrix.
Figure 2.22 a) The original steroid GC-MS data set as a mesh diagram. The data is the same as in Fig. 2.20. b) The data matrix is smoothed by convolving with a bell-shaped matrix. This data corresponds to the contour plot in Fig. 2.21b.
Figure 2.23 Fourier transforms of unfiltered (a) and low-pass filtered (b) GC-MS data. The data set is the same as in Fig. 2.20. The peak in the middle of the plots corresponds to the lowest frequencies. Points at the periphery are the higher frequencies. The effect of low-pass filtering is to remove some high-frequency components.
Figure 2.24 Contour plot of a GC-MS data set (original data, plotted against m/z). The number of contour levels is ten. The plot is hard to analyze visually even when plotted using pseudocolors on the computer screen.
The future of signal processing will be determined by the economics of computing. Special integrated circuits can sometimes speed up the calculations ten or a hundred times. If the circuits can find common applications in many fields the price of the special computations can be dramatically lower. In this sense the photographic market has an effect on analytical chemistry. Special integrated circuits that are needed in the digital processing of photographic images can be very useful in analytical chemistry.
Calculating the convolution in two dimensions is time-consuming. It takes a large number of calculations to multiply two matrices element-wise and to sum up the products. This slow process must be repeated for each element of a large matrix. Parallel computers that apply many small computers to the task will be needed for large-scale application of filtering operations. It is conceivable that future integrated circuits could contain hundreds of central processing units that operate in parallel. There are some super-computers on the market that contain up to 64,000 microprocessors as central processing units.
If the convolving matrix is very large it is generally faster to perform the two-dimensional convolution in the Fourier domain. One way to speed up the two-dimensional convolution is to replace it by a sequence of two one-dimensional convolutions. This is possible in those cases where the two-dimensional filter can be formed as an outer product of two vectors (Fig. 2.21b). The effect of two separate convolutions using separate row and column vectors is shown in Figs. 2.21 and 2.22. The effect of the two vectors is the same as applying the matrix-shaped filter that is shown in Fig. 2.25. The number of multiplications per data point drops from 49 for the 7 x 7 matrix to 14 for the two seven-point vectors, so the amount of calculation is reduced from seven time units to two.
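The replacement of one two-dimensional convolution by two one-dimensional ones can be sketched with the MATLAB function conv2, which accepts either a filter matrix or a column vector and a row vector. The bell-shaped vector and the small data matrix with two isolated "lines" are hypothetical examples, not the steroid data of the figures.

```matlab
v = [1 2 3 2 1]';   v = v / sum(v);   % hypothetical bell-shaped vector
H = v * v';                           % 5 x 5 filter as an outer product

X = zeros(20, 30);                    % hypothetical data matrix
X(8, 12)  = 100;                      % two isolated "lines"
X(14, 20) = 40;

Y2d  = conv2(X, H);                   % one convolution with the matrix
Ysep = conv2(v, v', X);               % columns with v, then rows with v'
max_dev = max(abs(Y2d(:) - Ysep(:)));

% Multiplications per output point: n^2 for the matrix, 2n for two vectors.
n = length(v);
cost_ratio = n^2 / (2*n);
```

The saving grows with the filter size: for this five-point example the ratio is 2.5, and for the seven-point filter of Fig. 2.25 it is 3.5, the "seven time units to two" mentioned in the text.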
2.6 Cleaning up data
2.6.1 Eliminating data outliers
Some readings in the observation matrix are obviously wrong. Such aberrations are called outliers in the data. The numbers have perhaps been corrupted in transferring data between different devices in the experiment. Some characters have been lost in transmission and two smaller numbers can be fused into a huge number. Some values have been entered by hand. A typing error is always possible. The measuring instrument can give rise to wild values by registering current spikes and other electrical interferences.
Figure 2.25 Bell-shaped two-dimensional filter for smoothing. The filter size is 7 x 7 points. The filter is a product of two bell-shaped vectors. It is cheaper to calculate two separate convolutions using two separate vectors.
Figure 2.26 The line spectra can be spread out as smooth patterns using bell-shaped filters. The smoother patterns are easier to inspect visually when shown using smooth color hues on the computer screen. The upper figure (a) has been smoothed using a filter consisting of 5 x 5 points. The filter for the lower figure (b) was sharper, 3 x 3 points. Anomalies in the data such as "salt-and-pepper" noise are easy to find.
Figure 2.27 Outliers can be seen when the residuals, after local smoothing, are sorted. Only a few points deviate markedly from the model. These potential outliers are at the ends of the sorted sequence.
After a model has been fitted we can list the residuals in sorted order to find out where the largest deviations between model and measurements are located. These extreme residuals are an indicator for outliers that are caused by errors in data transfers and conversions. If the deviation stands out from the other residuals we have to replace the deviating point with a new value. There is a limit to how many corrections should be made to the data. We do not want to force-fit our measurements to our models. We just want to eliminate a few technical defects in our measurements. The outliers often prevent the correct fitting of the model and make the rest of the data useless. Another approach to finding outliers is to use some local smoothing function and see if any values do not fit well. A good way to find missing data or extra points in otherwise empty regions of a data matrix is to use a median filter. The median filter replaces each point in a time series with the median value of a short data segment surrounding the point under study. For example, we can take five data points and we can then find the median of these five data points. The median
    >>libsear1
    Position of unknown = 4
    duration = 60.2800
     1 Pos = 2718 Corr = 0.903
     2 Pos = 1921 Corr = 0.903
     3 Pos =  340 Corr = 0.886
     4 Pos = 2018 Corr = 0.832
     5 Pos = 2569 Corr = 0.831
     6 Pos = 1662 Corr = 0.829
     7 Pos =  968 Corr = 0.797
     8 Pos = 3636 Corr = 0.693
     9 Pos =  618 Corr = 0.603
    10 Pos = 4972 Corr = 0.573
    11 Pos = 3325 Corr = 0.572
    12 Pos = 1658 Corr = 0.560
    13 Pos = 4768 Corr = 0.559
    14 Pos = 2280 Corr = 0.525
    15 Pos = 1277 Corr = 0.503
    16 Pos = 2542 Corr = 0.499
    17 Pos = 2051 Corr = 0.497
    18 Pos = 2561 Corr = 0.482
    19 Pos = 2958 Corr = 0.481
    20 Pos = 3603 Corr = 0.476
Figure 4.2 Program libsear1.m calculates the correlation between the unknown spectrum and the library using the function corrcoef. The calculation takes 60 seconds on the PowerMacintosh 7100. The time needed depends on the computer.
...MATLAB function corrcoef. The correlation coefficients that are obtained are saved in vector x. After calculating the correlations, they are sorted. The sort statement always produces an ascending list. As we want to have a descending list, the sorted vector must be reversed into descending order. We note that for a library of 5,000 entries the search takes about 60 seconds.
4.3 Speeding up the library search
We can use many tricks to gain more speed in the library search operations. One method is to start with an approximation to the solution. If this approximation can be done rapidly, we can then calculate the final correlation for a subset of the library. The result would be an overall saving in calculation time. There are dangers in using just a subset of the library. We may omit spectra that could have good correlations. A better solution is to optimize the speed of the calculation of the correlations. Next we shall speed up the calculations by a factor of roughly one hundred.
4.3.1 The speeded-up program "libsear2.m" for library searching
We can reduce calculating the correlation coefficients to a single vector multiplication operation. This is shown in the MATLAB program libsear2.m (Appendix, Prog. 4.2). It is much faster to calculate a dot product between two vectors than to calculate a correlation coefficient for the same two vectors. The basic method is very simple. All spectra in the library are first centered and then scaled to unit length in the L2-norm. Centering means subtracting the average value of the lines from all lines. The scaling of the library spectra is done only once. After scaling, the spectra are stored on disk. The spectrum library that is retrieved from disk is ready for the multiplication by the unknown spectrum. As a preparation for the search the unknown spectrum must first be centered and scaled. The length of the unknown spectrum as measured by the L2-norm is scaled to one.
The actual searching operation is a single multiplication. We have all library spectra in a large matrix. We multiply the spectrum matrix by the unknown spectrum (vector) from the left. The result of this calculation is a new vector x that contains the correlations. A companion vector z keeps the indexes to the positions of the elements in the original vector x. The correlation vector x is then sorted together with z. The result is two vectors in descending order.
    >>libsear2
    Position of unknown = 4
    duration = 0.6120
     1 Pos = 2718 Corr = 0.903
     2 Pos = 1921 Corr = 0.903
     3 Pos =  340 Corr = 0.886
     4 Pos = 2018 Corr = 0.832
     5 Pos = 2569 Corr = 0.831
     6 Pos = 1662 Corr = 0.829
     7 Pos =  968 Corr = 0.797
     8 Pos = 3636 Corr = 0.693
     9 Pos =  618 Corr = 0.603
    10 Pos = 4972 Corr = 0.573
    11 Pos = 3325 Corr = 0.572
    12 Pos = 1658 Corr = 0.560
    13 Pos = 4768 Corr = 0.559
    14 Pos = 2280 Corr = 0.525
    15 Pos = 1277 Corr = 0.503
    16 Pos = 2542 Corr = 0.499
    17 Pos = 2051 Corr = 0.497
    18 Pos = 2561 Corr = 0.482
    19 Pos = 2958 Corr = 0.481
    20 Pos = 3603 Corr = 0.476
Figure 4.3 Program libsear2.m lists the 20 highest correlations in the library when comparing the input spectrum. The correlation is calculated by a faster method than in libsear1.m. The calculation takes 0.61 seconds on the PowerMacintosh 7100.
The speed of calculations in libsear2.m is much higher than in libsear1.m. The reason for the speedup is very simple. In the MATLAB function corrcoef, most of the calculation time is taken up by the scaling of the vectors as a preparation for the calculation of the correlation coefficient for the pair. In libsear2.m most of the calculations for scaling purposes are done once while preparing the library. The calculation of the correlation coefficient is mathematically just a multiplication of two vectors that have been properly centered and scaled. Because most of the calculations have been made beforehand on the library spectra, there is just a small amount of calculation left in the final matching step. The total speed is improved about one hundred times.
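The equivalence between the correlation coefficient and the product of centered, scaled vectors can be checked on a toy library. The sketch below is not the program libsear2.m: the four library spectra and the unknown are hypothetical, each row of the matrix holds one spectrum, and the sort option 'descend' of current MATLAB versions is used in place of reversing an ascending list.

```matlab
lib = [1 0 3 0 5 2;            % hypothetical library, one spectrum per row
       0 2 0 4 1 0;
       2 1 6 0 9 4;
       5 5 5 1 0 0];
u   = [1 0 3 1 5 2];           % hypothetical unknown spectrum

% One-time preparation of the library: center each row, scale to unit length.
nch = size(lib, 2);
L   = lib - repmat(mean(lib, 2), 1, nch);
L   = L ./ repmat(sqrt(sum(L.^2, 2)), 1, nch);

% Preparation of the unknown: center and scale to unit L2-norm.
uc = u - mean(u);
uc = uc / norm(uc);

% The search itself: one matrix-vector product, then a sort.
x = L * uc';                   % correlations with all library spectra
[xs, z] = sort(x, 'descend');  % z holds the library positions

% Reference values from corrcoef, one pair at a time.
ref = zeros(size(x));
for k = 1:size(lib, 1)
    R = corrcoef([lib(k, :)' u']);
    ref(k) = R(1, 2);
end
max_dev = max(abs(x - ref));
```

Replacing the vector uc' by a matrix whose columns are several prepared unknown spectra gives the correlations for all of them in a single multiplication.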
The next modification to the program is left to the reader. By multiplying the library with a matrix, the correlations between all the spectra in a segment can be calculated at the same time. One multiplication calculates the library search for hundreds of spectra simultaneously.
References
Brown CW, Lynch PF, Obremski RJ, Lavery DS. Matrix representations and criteria for selecting analytical wavelengths for multicomponent spectroscopic analysis. Anal Chem 1982; 54: 1472-1479.
Chapman JR. Computers in mass spectrometry. London, New York, San Francisco: Academic Press, 1978, pp. 101-149.
Clerc JT. Automated spectra interpretation and library search system. In: Meuzelaar HLC, Isenhour TL, eds. Computer-enhanced analytical spectroscopy. Vol. 1. New York, London: Plenum Press, 1987, pp. 145-162.
Frans SD, Harris JM. Selection of analytical wavelengths for multicomponent spectrophotometric determinations. Anal Chem 1985; 57: 2680-2684.
Heller SR, Lowry SR. Library storage and retrieval methods in infrared spectroscopy. In: Meuzelaar HLC, Isenhour TL, eds. Computer-enhanced analytical spectroscopy. Vol. 1. New York, London: Plenum Press, 1987, pp. 223-237.
Hill DW, Kelley TR, Langner KJ. Computerized library search routine for comparing ultraviolet spectra of drugs separated by high-performance liquid chromatography. Anal Chem 1987; 59: 350-353.
Kahaner D, Moler C, Nash S. Numerical methods and software. Singapore: Prentice-Hall International, 1989, pp. 53-54.
McLafferty FW. Interpretation of mass spectra. Mill Valley: University Science Books, 1980, pp. 231-239.
Stein SE, Scott DR. Optimization and testing of mass spectral library search algorithms for compound identification. J Am Soc Mass Spectrom 1994; 5: 859-866.
Warr WA. Spectral databases. In: Brereton RG. Chemometrics Tutorials II. Amsterdam, London, New York, Tokyo: Elsevier, 1992, pp. 25-38.
Warren FV Jr., Bidlingmeyer BA, Delaney MF. Selection of wavelengths for absorbance ratio monitoring in liquid chromatography. Anal Chem 1987; 59: 1897-1907.
Warren FV Jr., Bidlingmeyer BA, Delaney MF. Selection of representative wavelength sets for monitoring in liquid chromatography with multichannel ultraviolet-visible detection. Anal Chem 1987; 59: 1890-1896.
Chapter 5
Neighborhood operations on hyphenated data
Background subtraction
Homogeneity checks
Local purity checks by svd
Local peeling
Sharpening the chromatographic peaks
5
Neighborhood operations on hyphenated data
In this chapter we study methods that analyze segments in a chromatogram. The aim of these methods is to extract the components by local operations on spectra in the segment. Usually, only a small subset of the spectra is used. It is often useful to find approximate solutions to the component identification problem. In many situations we try to find components that are new. If we can conclude that a spectrum is familiar, we can skip it as uninteresting. We can conclude that there is nothing new if all components in a region can be located in the library of well-known compounds. If we can get spectra that are 99 percent correct, we can exclude whole regions of spectra in the chromatographic run from further consideration. If we find that some extracted component gives a poor correlation with the library of known compounds we should investigate the matter further.
5.1
Background subtraction
We start with background subtraction [McLafferty 1980, Dyson 1990]. By background we mean a constant element that is present in all spectra. This is the so-called instrumental background. In mass spectrometry the instrumental background can originate in the vacuum system of the mass spectrometer: it could be due to diffusion pump oil or to an atmospheric gas such as oxygen or nitrogen. It could also be due to column bleed, when the chromatographic column slowly releases some chemical component. In liquid chromatography the background is the spectrum of the eluent. The idea in the subtraction is to take a spectrum next to a chromatographic peak, in an area of the run where the baseline stays constant (Fig. 5.1). This spectrum is assumed to be the instrumental background. When it is subtracted from the highest spectrum in the chromatographic peak we obtain a simpler spectrum that corresponds better to the chemical compound responsible for the chromatographic peak. The first problem is finding a representative background spectrum. The second problem is finding the proper scaling for the background before subtraction.
[Figure 5.1: schematic with two panels, "Background spectrum" and "Spectrum at the peak".]
Figure 5.1 A schematic description of the background subtraction. A scan near the chromatographic peak is selected to represent the instrumental background. This spectrum is subtracted from the spectrum corresponding to the peak maximum.
In practice, it is not easy to find spectra that represent the instrumental background. The valleys between chromatographic peaks are often narrow and the background spectra may contain other chemical entities. What often happens is that there is overlap between the chemical components and this second overlapping component is subtracted at the same time as the background is subtracted. Still, the idea of the background elimination is easy to grasp. It is a step that most spectroscopists routinely do before storing a spectrum in their personal compound library. Although this is the first step in purifying a spectrum it is not sufficient for many purposes. Several conditions should be fulfilled before a mass spectrum is good enough to be accepted in public spectral libraries [Speck, Venkataraghavan and McLafferty 1978]. These conditions are more stringent than just a background subtraction.
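The subtraction itself takes only a few statements. The lines below are a minimal sketch on invented data. The background scan is picked by hand on the flat baseline and is subtracted with a scaling factor of one, which is an assumption made only to keep the example short.

```matlab
% Hypothetical run: rows are scans, columns are spectral lines.
bg   = [5 1 0 2];                     % assumed constant instrumental background
pure = [0 10 4 0];                    % assumed spectrum of the eluting compound
prof = [0 0 1 3 1 0]';                % assumed elution profile over six scans
Obs  = ones(6, 1) * bg + prof * pure;

abu = sum(Obs, 2);                    % total abundance of each scan
[top, ipeak] = max(abu);              % scan at the peak maximum
ibg  = 1;                             % baseline scan chosen by the analyst
spec = Obs(ipeak, :) - Obs(ibg, :);   % background-subtracted spectrum
```

With real data the chosen baseline scan may contain traces of a neighboring component, in which case that component is subtracted as well.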
5.2
Homogeneity checks
The effect of the background subtraction can be checked by some simple MATLAB programs. In principle, if a chromatographic peak is homogeneous, the correlation between all spectra in the peak should be one. If the correlation between spectra at the leading edge and the trailing edge is less than one, we suspect that the chromatographic peak is not homogeneous. The program purity.m (Appendix, Prog. 5.1) is designed for this purpose. It is simple in its structure. The correlations are calculated between all selected spectra (Fig. 5.2). The diagonal elements corresponding to the correlation of each spectrum to itself are ignored.
A sample run with the program purity.m is shown in Fig. 5.3. The user inputs the first and last spectrum numbers for a chromatographic peak in the chromatographic run. The program displays the highest and lowest correlations between the spectra in the selected range. The program estimates the average correlation coefficient as well as the standard deviation between all correlation coefficients obtained. The graphics output from the program shows the shape of the chromatographic total abundance curve and the span of the selected spectra in the run (Fig. 5.4). The program is terminated by inputting a zero for the spectrum number.
[Figure 5.2: chromatographic peak plotted against "Scan number", with a span marked "Spectra selected for correlation calculations".]
Figure 5.2 The purity of a chromatographic peak can be checked by calculating correlations between the spectra corresponding to the chromatographic peak. The choice of the first and last spectrum for a chromatographic peak is critical. If the analytical system has some instrumental background left the statistics of the recorded spectra at the root of the peak become poor. This will show up as an elevated standard deviation of the correlations.
    *** Start of program "purity"
    First - 27
    Last - 35
    Maximum correlation - 0.9988
    Minimum correlation - 0.9953
    Average correlation - 0.9972
    Std. of correlation - 0.000908
Figure 5.3 A sample run of program purity.m outputs correlations between all spectra in the chosen interval. The range of selected spectra is controlled manually by the experimenter. If the correlations found inside the range are not high enough, the selected range can be made shorter.
[Figure 5.4: plot titled "Purity of a chromatographic peak"; horizontal axis 0 to 400, vertical axis scaled by 10^6.]
Figure 5.4 The rectangle below the first high peak shows the location of spectra that are selected for purity checking by program purity.m. This graphical output does not show whether the region is pure or not.
The homogeneity check is a way to make certain that a given native spectrum represents a pure chemical species. If a chromatographic peak is homogeneous it is possible to use it in libraries without further mathematical manipulations.
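The calculation behind such a check can be sketched in a few lines. This is not the listing of purity.m from the Appendix. The data and the scan range are invented, and we assume that the built-in function corrcoef is an acceptable way to obtain the correlations.

```matlab
% Hypothetical data: six scans (rows) of a five-line spectrum plus a
% small fixed disturbance that stands in for measurement noise.
pure  = [0 10 4 1 7];
prof  = [1 2 4 5 3 1]';
noise = 0.01 * [ 1 -1  0  1 -1; ...
                -1  0  1 -1  1; ...
                 0  1 -1  0  1; ...
                 1  0 -1  1  0; ...
                -1  1  0 -1  1; ...
                 0 -1  1  0 -1];
Obs = prof * pure + noise;

first = 2;  last = 5;                  % scan range chosen by the analyst
R = corrcoef(Obs(first:last, :)');     % each column of the argument is a spectrum
n = last - first + 1;
offdiag = R(~eye(n));                  % ignore the self-correlations
fprintf('Maximum correlation - %.4f\n', max(offdiag));
fprintf('Minimum correlation - %.4f\n', min(offdiag));
fprintf('Average correlation - %.4f\n', mean(offdiag));
fprintf('Std. of correlation - %.6f\n', std(offdiag));
```

Every pair of spectra appears twice in offdiag because the correlation matrix is symmetric. This does not change the maximum, minimum or average.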
5.3
Local purity checks by svd
The complete chromatographic run contains too much raw data for efficient inspection. It is not possible to check just by looking at the individual spectra whether a given chromatographic peak is pure. One possible approach is to calculate some sort of index that shows how many chemical species are present at each point in time. The chemometric method called evolving factor analysis studies the changes in the estimated number of components when the matrix under study is expanded from the first spectrum to the last or from the last spectrum to the first. Changes in the number of components indicate the presence of peaks [Maeder 1987].
Singular value decomposition is a built-in procedure in MATLAB. Function svd calculates the singular value decomposition of the input matrix. In practice svd performs the same calculation as the so-called principal component analysis or PCA. The two routines may differ in the signs of the resulting vectors, but mathematically the results from svd and the PCA by the NIPALS algorithm are identical. If we invoke the built-in MATLAB routine by the statement:
>>B=svd(A);
we get a vector that contains the singular values associated with each principal component. The matrix A contains the spectra to be analyzed. If we have 5 spectra with 400 lines each, matrix A is dimensioned to be 400 by 5. Vector B contains 5 values. B(1) is the first singular value, B(2) is the second singular value and so on. It is not meaningful to perform the singular value decomposition for one spectrum. Instead, we must take at least as many input spectra as there are singular values of interest. If we need three principal components, we must analyze more than three spectra. For five principal components we must take more than five spectra.
To judge the purity of a chromatogram we perform an iteration where we estimate the local purity for each spectrum. This means that for a chromatogram with one hundred spectra we must run the svd analysis a hundred times. This is not a trivial calculation and even on a fast computer takes roughly one second for each set of five spectra. This means that the complete analysis takes about two minutes.
Let us take a look at the program nofcomp2.m (Prog. 5.2). The program starts by asking the user to input the name of the file containing the data matrix from the chromatographic run. It reads in the file and copies the data into the matrix Obs. Then an assignment statement gives a value of three to the variable magic1. This variable controls the number of spectra that are analyzed by svd in each step. The program analyzes 2*magic1-1 spectra centered on the current scan, which for a value of 3 is 5 spectra.
%NOFCOMP2.M estimates local number of components in all
% spectra of a chromatographic run.
%
% Data handling for hyphenated techniques
% (c) by Erkki & Ulla Karjalainen 1995
%
format compact
clf                           % clear the figure (clg in older MATLAB versions)
hold off
clear
disp([' *** Start of program "nofcomp" ***'])
fil=input('File to be checked - ','s');
eval(['load ',fil])           % the file must hold a matrix named like the file
s=['Obs= ',fil,';'];
eval(s)                       % copy that matrix into Obs (scans by lines)
[nscans,nlines]=size(Obs);
abu=sum(Obs');                % total abundance of each scan
magic1=3;                     % the window holds 2*magic1-1 spectra
C1=zeros(nscans,2*magic1-1);
for ii=magic1:nscans-magic1
  ii                          % echo the scan number to show progress
  A=Obs(ii-magic1+1:ii+magic1-1,:)';
  B=svd(A);                   % singular values of the local window
  B=(1/sum(B))*B;             % scale the sum of the singular values to one
  B=abu(ii)*B;                % rescale to the abundance of scan ii
  C1(ii,:)=B';
end
plot(C1)

Program 5.2 Program for estimating the local number of components in all spectra of a chromatographic run.
[Figure 5.5: plot of the scaled singular values; horizontal axis 0 to 40, vertical axis 0 to 7 x 10^6.]
Figure 5.5 Estimating the local number of components in a chromatographic run using the program nofcomp2.m. The singular values have been scaled so that their sum is unity. After that the singular values have been multiplied by the total abundance. The curves under the sum curve represent the different singular values in this abundance scale. The plot indicates the complexity of the eluate.
The results are scaled so that the sum of the singular values is forced to unity. Then the vector containing the singular values is multiplied by the total intensity contained in the spectrum. The intensities were summed into the vector abu. Finally the curves with the five singular values normalized to correspond to the original abundance are plotted as a diagram. The results from one run are shown in Fig. 5.5. The results should be interpreted very cautiously. The presence of more than one "component" should be taken as a warning. It is not possible to conclude more from the results about the exact number of components present at each point in time. The presence of mixtures is a sign that further studies are needed.
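The behavior of the local singular values is easy to try out on artificial data. The sketch below is not part of the book's programs. The three spectra and the Gaussian elution profiles are invented, and the window of five scans follows the setting magic1=3 used in Prog. 5.2.

```matlab
% Hypothetical run: species 1 elutes alone, species 2 and 3 overlap.
s1 = [10 0 5 0 1];   s2 = [0 8 0 6 1];   s3 = [3 0 0 9 2];
t  = (1:40)';
c1 = exp(-((t - 12) / 3).^2);
c2 = exp(-((t - 27) / 3).^2);
c3 = exp(-((t - 30) / 3).^2);
Obs = c1 * s1 + c2 * s2 + c3 * s3;     % scans by spectral lines

magic1 = 3;                            % window of 2*magic1-1 = 5 scans
win = @(ii) Obs(ii-magic1+1 : ii+magic1-1, :)';

Bpure = svd(win(12));                  % window inside the isolated peak
Bpure = Bpure / sum(Bpure);            % scale the sum to one
Bmix  = svd(win(28));                  % window inside the overlapping pair
Bmix  = Bmix / sum(Bmix);
disp([Bpure Bmix])                     % compare the two sets of singular values
```

In the pure window practically all of the sum sits in the first singular value, whereas the mixed window shows a clearly visible second value. With noisy data the small singular values no longer vanish, which is one reason for the caution recommended above.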
5.4
Local peeling
Another way to obtain approximate solutions for pure spectra is to use a local peeling method. By peeling we mean a method where we take the spectrum with the largest abundance as a "pure" component and subtract it from the others. This operation is then repeated.
[Figure 5.6, panels a and b: total abundance curves; horizontal axis 0 to 400, vertical axis scaled by 10^7.]
Figure 5.6 Local peeling. a) The original total abundance curve of our urinary steroid sample. b) The first (the highest) chromatographic peak has been subtracted from the chromatogram. It is replaced by a horizontal line.
[Figure 5.6 (cont.), panels c and d: total abundance curves; horizontal axis 0 to 400, vertical axis scaled by 10^7.]
Fig. 5.6 (cont.) c) The second highest peak has been removed. Notice that the shoulder of the second peak remains in the chromatogram. d) The eighth chromatographic peak has been subtracted from the chromatogram. There are practically no more peaks left.
[Figure 5.7: two panels, "Original" and "High-pass filtered"; horizontal axis 0 to 400, vertical axis scaled by 10^7.]
Figure 5.7 High-pass filtering of the urinary steroid data. Notice that the shoulder in the second large peak of the original chromatogram is clearly separated in the high-pass filtered chromatogram.
The simplest variant is to take the local maxima as the pure spectra. We know that this approximation is not very realistic, because it will miss components that are not locally dominant. Components that are hidden under other components will remain undetected. Still, this approach has several advantages. It is simple to implement and it works extremely rapidly. It works best with data that have been properly preprocessed to expose the hidden components, as we will see later.
This peeling method has been implemented as the program locpeel.m (Appendix, Prog. 5.3). The program asks the user to input the name of the data file to be processed by peeling. It then proceeds to store the spectra of the local maxima in the chromatographic run as "pure" spectra in matrix pS. This simple form of the program removes chromatographic peaks based on the total abundance. A peak is eliminated starting from the top. Spectra are removed until the total abundance starts to rise again. Each time a peak is subtracted from the chromatogram the spectrum corresponding to the maximum of the peak is stored in matrix pS. At the end, the matrix pS is saved on disk.
Graphical output from the program locpeel.m is shown in Figs. 5.6a-d and 5.8a-f. They show how peak after peak is subtracted from the original chromatographic profile. The data used in Fig. 5.6 were the original observations. In Fig. 5.8 the original data were first run through a high-pass filtering process (Fig. 5.7) before the local peeling was used.
A much more sophisticated chemometric algorithm uses the so-called rank annihilation [Ho et al. 1978, Sánchez and Kowalski 1986]. In this method a known standard is subtracted from a region and the rank of the remaining matrix is followed. When the rank has decreased by one, the standard has been properly subtracted. This algorithm operates on two-dimensional spectra.
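The peeling loop can be sketched as follows. This is not the listing of locpeel.m. The data are invented, the number of peaks to remove is fixed in advance as an assumption, and removed scans are simply set to zero in a working copy of the total abundance curve.

```matlab
% Hypothetical run with two species; rows are scans.
prof1 = [1 2 6 3 1 0 0 0 0 0]';
prof2 = [0 0 0 0 1 4 9 5 1 0]';
Obs   = prof1 * [1 0 2] + prof2 * [0 3 1];

work   = sum(Obs, 2);                  % working copy of the total abundance
nscans = numel(work);
npeaks = 2;                            % assumed number of peaks to peel
pS     = zeros(npeaks, size(Obs, 2));  % "pure" spectra, one row per peak
for k = 1:npeaks
    [top, itop] = max(work);           % highest remaining local maximum
    pS(k, :) = Obs(itop, :);           % store its spectrum
    lo = itop;                         % walk down the leading edge
    while lo > 1 && work(lo - 1) < work(lo)
        lo = lo - 1;
    end
    hi = itop;                         % walk down the trailing edge
    while hi < nscans && work(hi + 1) < work(hi)
        hi = hi + 1;
    end
    work(lo:hi) = 0;                   % removed peak becomes a horizontal line
end
```

In this example the scan in the valley between the two peaks contains both species, yet the spectra stored at the two maxima are pure. With stronger overlap the stored spectra would only be approximations, as the text points out.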
5.5
Sharpening the chromatographic peaks
High-pass filtering can be used to counteract the spreading tendency of the chromatographic peaks. In chromatography the peaks are initially very narrow, because the sample injection volume is small. When the separation progresses, the peaks get broader. The peak broadening effect can be compared with the effect of low-pass filtering on an initially sharp peak. It is natural to try to recover the original unbroadened signal by applying a reverse filtering process. The simplest tool for this purpose is a high-pass filter that makes broad peaks narrow. We cannot fully recover the original signal by the simple strategy of high-pass filtering, but we can uncover some structure in the chromatogram that would not be visible without the filtering.
The process of high-pass filtering is very easy to do in MATLAB. We have earlier discussed convolution as a tool of signal processing. The convolution process is a time-consuming calculation. It is, however, easy to express the convolution process in MATLAB; a single statement does the actual work. The difficulty is not in programming the high-pass filtering process, but in finding an optimal filter for the high-pass filtering of a given signal. We have constructed a simple filter that has a roughly bell-shaped middle and two smaller negative side lobes. If the chromatographic peaks are broader, the filter should be made correspondingly broader. If the chromatographic peaks are very narrow, the simplest possible filter would have only three coefficients -1,3,-1. Because the sum of these coefficients is one, the average intensity of the observation matrix is retained.
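As a minimal sketch, the three-coefficient filter can be applied along the time axis of a data matrix with one call to conv2. The data below are invented, and the broader filter used for the figures in this chapter is not reproduced here.

```matlab
% Hypothetical data: two overlapping Gaussian peaks sharing one spectrum.
t    = (1:60)';
prof = exp(-((t - 25) / 5).^2) + 0.6 * exp(-((t - 34) / 5).^2);
Obs  = prof * [4 0 1 2];               % scans by spectral lines

f   = [-1 3 -1];                       % filter coefficients, sum equal to one
Flt = conv2(Obs, f', 'same');          % column kernel filters along the scans

subplot(2, 1, 1), plot(sum(Obs, 2)), title('Original')
subplot(2, 1, 2), plot(sum(Flt, 2)), title('High-pass filtered')
```

The filtered peak is higher and narrower than the original, the total intensity is practically unchanged, and small negative values appear in the tails of the peaks.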
[Figure 5.8, panels a to c: total abundance curves; horizontal axis 0 to 400, vertical axis from -4 to 12 x 10^6.]
Figure 5.8 Local peeling using high-pass filtered data as input. The first three iterations are shown. Notice the horizontal parts showing the removed chromatographic peaks.
[Figure 5.8 (cont.), panels d to f: total abundance curves; horizontal axis 0 to 400, vertical axis from -4 to 12 x 10^6.]
Fig. 5.8 (cont.) Iterations 4, 15, and the last, 26, are shown. The high-pass filtering causes some swings in the chromatogram. Nevertheless the positive peaks contain useful information.
When we apply the filter, we notice some negative intensities in the valleys between the chromatographic peaks. These negative intensities are very hard to avoid. If we apply less high-pass filtering, the negative swings disappear. At the same time, many features in the data matrix that were visible will disappear. It is best to retain the spectra at the tops of the local peaks as approximations of the pure spectra and ignore the negative intensities.
The program highpass.m (Appendix, Prog. 5.4) asks the user to enter the name of the input file prior to performing the high-pass filtering operation. After the filtering has been done, the program asks the user under what name the high-pass filtered file is to be saved. The program displays the total abundances as curves before and after the high-pass filtering steps. The graphical output from the program is shown in Fig. 5.7.
The peeling operation can be made on the high-pass filtered data matrix. The high-pass filtering brings out structure that was hidden before the operation. Because only spectra corresponding to the tops of the local chromatographic peaks are retained, the spectra are almost completely positive with only a few negative lines. The result of the high-pass filtering has many more chromatographic peaks than the original, unfiltered chromatogram. These can be seen clearly when the maxima of the local peaks are stored in a succession of steps by the program locpeel.m (Figs. 5.6 and 5.8).
The best way to learn about the effects of high-pass filtering is to use image processing programs on hyphenated data. Photoshop is the best-known program of this kind, but there are many public-domain programs that work in a similar interactive fashion. NIH Image is the best-known example. Conversion programs between Photoshop-acceptable data formats and the user's data format are left as an exercise. Some tools are available on the Internet.
High-pass filtering is a very difficult operation to perform without getting negative values as a by-product. A more sophisticated treatment is possible with Wiener filtering [Press et al. 1992, Oppenheim 1978], which amplifies the noise component less. Yet no local operation can find the true spectra; local operations can only find approximations. The only way to find better spectra is to use more computer power to statistically estimate which solution is optimal.
References
Dyson N. Chromatographic integration methods. In: Smith RM, ed. RSC Chromatography monographs. Royal Society of Chemistry, 1990, pp. 122-123.
Ho C-N, Christian GD, Davidson ER. Applications of the method of rank annihilation to quantitative analysis of multicomponent fluorescence data from the video fluorometer. Anal Chem 1978; 50: 1108-1113.
Maeder M. Evolving factor analysis for the resolution of overlapping chromatographic peaks. Anal Chem 1987; 59: 527-530.
McLafferty FW. Interpretation of mass spectra. In: Turro NJ, ed. Organic chemistry series. Mill Valley: University Science Books, 1980, p. 12.
Oppenheim AV, ed. Applications of digital signal processing. Englewood Cliffs: Prentice-Hall, 1978, pp. 204-223.
Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical recipes in FORTRAN. The art of scientific computing. Cambridge University Press, 1992, pp. 539-542.
Sánchez E, Kowalski BR. Generalized rank annihilation factor analysis. Anal Chem 1986; 58: 496-499.
Speck DD, Venkataraghavan R, McLafferty FW. A quality index for mass spectra. Org Mass Spectrom 1978; 13: 209-213.
Chapter 6
Alternating Regression
Deconvolution and AR
The OSCAR approach
Defining the objective function
Finetuning the solution
The elution curve constraints
The steps in the "core AR" algorithm
Repeating the AR iterations
Summing up
6
Alternating Regression
6.1
Deconvolution and AR
The main theme in this book is a chemometric method called alternating regression (AR). Alternating regression approaches the deconvolution problem from a different angle than most of the existing literature. It has been customary to start the solution process by performing a factor analysis [Malinowski 1980]. The factor analysis is often called principal component analysis in the chemometric literature. In chemometrics both terms are used to mean the same method, which is called PCA in the statistical literature. The central idea of alternating regression is keeping the solution in the positive space during the solution process. When some component turns out to be negative the negative value is simply replaced by a zero.
The term "AR" is used here in two separate meanings. The first meaning is a "core AR" algorithm for developing a solution by using a sequence of regression calculations. The core of the algorithm has two regression steps with the positivity constraints applied after both steps. The second meaning describes an algorithm, the OSCAR algorithm, that solves the complete deconvolution problem in an optimized way. The process uses adjustable constraints to reduce the solution space to a very small volume, a "unique solution".
This OSCAR algorithm has three main parts. The first part is experimental design where the constraint ranges are selected. The second part performs the calculation repeatedly under varying constraints using the "core AR" calculations. In the third part the analyst evaluates the results and picks out the unique solution. When the unique solution is examined in detail, the reproducibility of the solution is analyzed by repeating the optimum solution several times from several random starting points.
The OSCAR algorithm is different from the earlier methods in the field because it produces an estimate of the uniqueness of the solution found with it. This is possible because the computational cost of the "core AR" is low. The solution process can be repeated several times to get statistics about its uniqueness and reproducibility.
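The two regression steps with their positivity constraints can be sketched as below. This is an illustration of the idea only, not the authors' "core AR" program. The data, the fixed number of iterations and the use of pinv for the regressions are assumptions of the sketch.

```matlab
% Hypothetical two-species data; all numbers are invented.
t     = (1:30)';
Ctrue = [exp(-((t - 12) / 3).^2), exp(-((t - 17) / 3).^2)];  % elution curves
Strue = [8 0 3 1 0 2; 0 5 1 0 6 2];                          % spectra
Obs   = Ctrue * Strue;                 % scans by spectral lines

nspec = 2;                             % assumed number of species
S = rand(nspec, size(Obs, 2));         % random starting spectra
for iter = 1:50                        % assumed fixed number of iterations
    C = Obs * pinv(S);                 % regression 1: elution curves from spectra
    C(C < 0) = 0;                      % negative values are replaced by zero
    S = pinv(C) * Obs;                 % regression 2: spectra from elution curves
    S(S < 0) = 0;                      % negative values are replaced by zero
end
fit = norm(Obs - C * S, 'fro') / norm(Obs, 'fro');   % relative fit error
```

Because the starting spectra are random, repeated runs give slightly different C and S. The species may also come out in a different order and with a different scaling between C and S, so repeated solutions have to be matched and normalized before they are compared.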
[Figure 6.1: flow diagram with the elements "Data", "Constraints", "Number of species", "OSCAR process", "core AR" and "Stable result".]
Figure 6.1 The OSCAR algorithm finds an optimum solution to the deconvolution problem. The bulk of the calculations is carried out by the "core AR" program that converges to the optimal solution from several directions.
A problem that had to be solved in "core AR" besides negative values was that the elution profiles that were found were not always unimodal. A simple solution to this problem was developed. The highest intensity in the elution curve was frozen and the intensities to the left and right of it were sorted. Because two sorts were performed the unimodal nature of the elution profile was guaranteed.
The reliability of the solution is as important as the solution itself. Initially the robustness of the solution was tested simply by repeating the solution process. Later, a more systematic approach was developed in OSCAR. Here one of the basic features of AR was very beneficial.
One crucial point in the "core AR" algorithm is the choice of the initial spectra. Because we do not use any library spectra or approximated spectra in the solution, it was decided that the best choice was to use random numbers. Because random numbers are used, it is possible to gather statistics about the solution by repeating the whole AR process several times. A different starting point is chosen every time. The optimum is approached from several directions. As the precise optimum is not fully reached, there remains a certain distance between the precise optimum and the solution obtained. We can get a measure of this distance by repeating the approach to the optimum from several directions. The simplest measure of the reproducibility of the solution is to compare the statistics with different numbers of spectral species. If the solution is perfectly reproducible the concentrations found for all components are stable and there is no dependence on the starting point.
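The sorting step that enforces unimodality is short enough to show in full. The elution curve below is invented, and the sketch assumes a single curve stored as a row vector.

```matlab
c = [0 2 1 5 3 9 4 6 1 2 0];              % hypothetical curve, not unimodal
[top, itop] = max(c);                     % the highest intensity is frozen
left  = sort(c(1:itop-1));                % rising toward the maximum
right = sort(c(itop+1:end), 'descend');   % falling away from the maximum
u = [left, top, right];                   % unimodal version of the curve
```

The sorted curve keeps all the original intensity values, so its sum is unchanged. Only the positions of the values on each side of the maximum are rearranged.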
6.2
The OSCAR approach
With time, AR has matured from its first applications in the late seventies into a more sophisticated tool [Karjalainen and Karjalainen 1985, 1987, 1991]. The OSCAR algorithm makes it possible to get a measure of the reproducibility of the solution and the uncertainty remaining in the spectra and elution curves of the solution [Karjalainen and Karjalainen 1995]. The minor components cannot be analyzed with the same precision as the major constituents. The uncertainty about the smaller components is visible as larger variations in the shape of the elution curves and the spectra.
The difficulty with all solutions to the deconvolution problem has been finding solutions that are truly unique. By uniqueness we intuitively mean a solution that is the only possible one. The unique solution should depend on the data, not the analyst. The unique result should be identifiable as such, because no other solution can replace it. We take a solution as the unique solution if the "same" solution is always achieved and the solution does not depend on the starting point. By "same" we mean a small region in multidimensional space where all repeated solutions converge. We can talk here about a starting point because the "core AR" process always starts from a different initial set of random numbers.
The solution produced by OSCAR should be unique and it should produce a good fit. In practice the error in reproducibility and the error in fit are combined into one error measure. It gives equal weight to both kinds of errors. The solution with the smallest combined error is the optimal solution.
The OSCAR algorithm has been put into the form of a MATLAB program, AutoAR, that can be used by chemists not familiar with MATLAB and programming. The user has a graphical user interface with buttons for performing the different steps necessary for the solution. The purpose of the program is not to remain a black box. The graphical user interface is intended to make it easier to start using MATLAB. After this start, the analyst should start improving AutoAR.
The word "algorithm" should sometimes be replaced by the word "approach". OSCAR is not a recipe that guarantees results in one hundred per cent of the cases. The educated user is still needed.
6.3
Defining the objective function
The deconvolution problem is a high-level problem that needs a more precise definition. The difficulty here, as with any optimization problem, is setting up the correct function to be optimized. If we can define the function to be optimized, we are often able to solve the problem. If we cannot clearly define an objective function, we may be solving an incorrect problem.
There are two error components that we must balance against each other. First, there is the classical fit, the sum of least squares. This fit measures the difference between a solution and the observations. Unfortunately, one kind of error is not enough. There is a second kind of error that we must measure. That error is the reproducibility of the solution. If our solution approaches the observations too closely we speak of overfitting. The overfitting situation is not visible from inspecting a single solution. The true situation is revealed when we repeat the solution process using AR. When there is overfitting present, each solution fits the observations well but the solutions are not reproducible. We cannot temporarily discard from the observation matrix information which is needed for bootstrapping or cross-validation. Cross-validation can detect overfitting.
What we need is a balance between the two kinds of errors. We try to reduce the error described by the fit indicator without increasing the error indicator for reproducibility. This means of course that we must give a subjective weight to both kinds of errors to optimize their weighted sum. To simplify our discussion we choose to call the reproducibility of the solutions scatter.
If we did not use AR, we could define the deconvolution problem as a problem in mathematical optimization. The formulation is simple. We define it as a problem in non-linear optimization. The objective function to be optimized contains two main parts. One part describes the fit, the other uses redundant observations as a measure of the reproducibility of the solution. Both parts of the error function are combined by suitable weighting into one function that is to be optimized. The only trouble with this direct attack is the dimensionality. If we have one thousand spectral lines and five species we have five thousand parameters in the spectra alone. We could perhaps squeeze the total number to five hundred but not much less. We see that it is not possible to solve the problem directly as a problem in non-linear optimization.
The OSCAR algorithm does not directly optimize the spectral lines or elution curves. Instead, the higher level optimization only follows the two kinds of modeling errors to find the best point in the constraint space. The goal is reached as a by-product of small, purely local calculations. No global function exists at the core level. The bulk of the calculations is carried out at the low "core level" that does not have any knowledge about the exploration of constraint space.
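To make the two error components concrete, the following sketch combines a fit error and a scatter value with equal weights. The numbers are invented. The definition of scatter used here, the relative spread of one spectrum over repeated runs, is an assumption of the sketch, since the exact formula used in OSCAR is not given in this section.

```matlab
% Hypothetical results from four repeated runs with different random starts.
fitres = [0.021 0.019 0.020 0.020];    % relative fit error of each run
Srep = [10.0 0.1 5.0 1.0; ...
         9.8 0.0 5.1 1.1; ...
        10.1 0.2 4.9 0.9; ...
        10.1 0.1 5.0 1.0];             % the same spectrum as found in each run
nrep = size(Srep, 1);

Smean   = mean(Srep, 1);                              % average spectrum
dev     = Srep - ones(nrep, 1) * Smean;               % deviations from it
scatter = norm(dev, 'fro') / (sqrt(nrep) * norm(Smean));  % relative spread (assumed definition)
fitErr  = mean(fitres);                               % average fit error
combined = 0.5 * fitErr + 0.5 * scatter;              % equal weights
```

Such a combined value would be computed for each setting of the constraints, and the setting with the smallest value would be taken as the best point found in the constraint space.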
6.4
Finetuning the solution
The solution to the deconvolution problem is optimal when the degrees of freedom in the observations closely match the degrees of freedom that are present in the fitted model. When a close match is achieved by careful constraining, the solution that is found is very reproducible and the solution space is minimized. The first constraint is to find the number of the components that are present. The number of components is estimated by repeating the solution process for different numbers of components. We see that the scatter increases when the number of components that are used in the solution is too large. It may be that the number of components that is optimal does not produce as low a value for the fit as the higher estimates for the number of components. The risk of overfitting can be eliminated by following the scatter indicator. After the best number of components is known, we should refine our estimates more by applying carefully chosen constraints. Constraints apply some restrictions to the problem and thus reduce the degrees of freedom. The constraints are often some facts that we know beforehand, a priori. If these constraints are some facts that we know with certainty, there is no danger of distorting the solution. We must be very careful in our choice of the restrictions. The shape of the elution curves is very unpredictable. There is often a temptation to apply some form constraints to the elution curves. A typical simplification of the elution curves is to assume that the elution curves are Gaussian in shape [Said 1981]. This reduces the number of parameters that are needed to describe the shape of the elution curves radically. The application of real shape functions to elution curves is often not justified. The substitution of two parameters for tens or hundreds of parameters is simply too much. Instead we should try to reduce the number of parameters in our model much more slowly, in carefully measured steps. 
A typical method for carefully reducing the number of parameters in our model is to gradually change the smoothness of the spectra. This method is especially suitable for continuous spectra such as UV spectra. When we increase the smoothness of the spectra they gradually approach a flat shape without any
troughs or peaks. There is a limit, of course, to the amount of smoothing that can be applied without making all spectra identical. For mass spectra, we can gradually vary the number of active mass peaks (mass fragments) remaining in the solution. The number of the peaks in the spectra can be gradually reduced until one of the components is left totally without any mass peaks. We must then stop reducing the number of peaks in the solution. Otherwise, the number of chemical compounds in our solution would be smaller than assumed. We have earlier described how the number of the peaks can be controlled in the space of the product matrices that are formed by multiplying the individual spectra and elution curves [Karjalainen 1990]. When the vectors for one species are multiplied to form an outer product they form an intermediate matrix with the same dimension as the observation matrix. If we have six species, the observation matrix should be the sum of six intermediate matrices. The smallest terms from the six intermediate matrices can be set to zero to constrain the solution to the desired number of intermediate terms. Because the contrast can be manipulated in the intermediate space, the method has some resemblance to the maximum entropy methods that optimize the contrast in images. The most general type of constraint for spectra is the background value. It is very important that only active spectral elements are considered when constraining with the background. The AutoAR program uses adjustable backgrounds as guiding constraints.
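The intermediate matrices described above can be illustrated with a small numerical sketch. The sizes, the values and the names P, Pc and nterms are made up for the illustration, and the thresholding rule is only one plausible reading of "setting the smallest terms to zero"; it is not taken from the AutoAR program.

```matlab
% Made-up example with two species, five scans and three spectral lines.
C = [0 1 2 1 0; 0 0 1 3 1]';          % elution curves as columns
S = [4 0 1; 0 2 2];                   % spectra as rows
ncomp = size(S, 1);

P = zeros(size(C, 1), size(S, 2), ncomp);
for k = 1:ncomp
    P(:, :, k) = C(:, k) * S(k, :);   % intermediate (outer product) matrix of species k
end
O = sum(P, 3);                        % observation matrix as the sum of the intermediates

nterms = 5;                           % hypothetical number of terms to keep
v = sort(P(:), 'descend');
Pc = P;
Pc(Pc < v(nterms)) = 0;               % zero the smallest terms; ties at the threshold survive
```

Each slice of P has the same dimensions as the observation matrix, so the constraint acts on individual intensity contributions rather than on whole spectra or whole elution curves.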
6.5 The elution curve constraints
The constraints can be gradually applied to the elution curves. Here the first type of constraining operation is to simply sort the points of each elution curve into a shape that is unimodal. On both sides of the elution peak, the intensities are sorted into decreasing order starting from the highest single intensity in the elution curve. The sorting operation guarantees the unimodality of the solution. We can apply other kinds of smoothing to the points in the elution curve. The smoothing function is gradually "tightened" by some kind of parameter adjustment as, e.g., in the smoothing process with spline functions [Späth 1973]. The smoothness is a function of the form control parameter. One way to implement gradual smoothing is to use polynomials of different degrees. When a lower degree of the polynomial is chosen, the curve gets smoother in shape but may get badly distorted. Another way to gradually control the degree of smoothing is to use some kind of filtering operation on the elution curves. We can do the filtering in the Fourier domain [Aubanel and Oldham 1985]. For this, we first transform the points into the Fourier domain. Then we return from the Fourier domain back into primary
space. If the Fourier coefficients that we use for the inverse calculation are identical to those that were obtained initially, there is no change in the elution curve. If we drop some higher frequencies in the Fourier spectra we get a smoother curve than that from the original points. In practice, it is best to limit the influence of the higher frequencies gradually, by applying a tapering function to the points. With the tapering function, from some frequency on, the intensity of higher frequencies is gradually reduced until finally the highest frequency is totally suppressed. For elution curves, background (or baseline) values can be used as constraints. The baseline constraints are useful in the analysis of any two-dimensional data. It is not critical which kinds of constraints are used to reduce the solution space to the optimum value. Several alternative kinds are possible. The critical thing is that the constraints should be adjustable in steps.
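The Fourier filtering with a taper can be sketched in a few lines of MATLAB. The test curve, the raised-cosine shape of the taper and the names fcut and fstop are assumptions made for this illustration; the text does not prescribe a particular tapering function.

```matlab
n = 64;                               % number of scans (even, for simplicity)
t = (0:n-1)';
c = exp(-((t - 30) / 6).^2) + 0.05 * cos(pi * t);   % made-up peak plus a fast ripple

F = fft(c);                           % into the Fourier domain
k = min(t, n - t);                    % frequency index of each coefficient, 0 ... n/2
fcut = 8;                             % frequencies up to fcut pass unchanged (assumed value)
fstop = n / 2;                        % the highest frequency is totally suppressed

w = ones(n, 1);                       % tapering function
ramp = k > fcut;
w(ramp) = 0.5 * (1 + cos(pi * (k(ramp) - fcut) / (fstop - fcut)));

c_smooth = real(ifft(F .* w));        % back into primary space
```

Moving fcut toward zero tightens the smoothing in steps, which is the adjustable behavior asked of a constraint. With w equal to one everywhere the curve is returned unchanged.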
6.6 The steps in the "core AR" algorithm
6.6.1 Filling the spectra with random numbers The "core AR" algorithm starts by filling the spectra with random numbers (Fig. 6.2). The choice of these numbers is not critical. However, they must not be linearly dependent. The uniform random number generator which is available in many programming languages can be used for this purpose. If we want the spectral statistics to resemble those of real spectra, we can make some adjustments to the raw random numbers. For example, we can raise them to some higher power to obtain a distribution histogram that better approximates the shape of the histogram of the observation intensities in the measured data. In practice the random numbers are not critical. If we like we can improve the orthogonality of our initial guess by the simple strategy of filling just one element for each spectral line. Let us say that we have four compounds in our solution. We fill out all mass numbers using a simple loop. We can first choose one compound using a random number generator. This compound gets an intensity again from the random number generator. The other compounds do not get any intensity. Their values remain filled with zeroes. For each mass number just one component has an initial intensity different from zero. This simple algorithm for filling the initial guess guarantees that the first candidate solution is orthogonal, a property that is not necessarily present in the final results.
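The orthogonal filling strategy can be written as a short loop. This is a minimal sketch: the matrix name S0 and the number of spectral lines are assumptions, and the spectra are stored one per row.

```matlab
ncomp = 4;                            % number of compounds, as in the example above
nlines = 340;                         % number of mass numbers (assumed for the sketch)
S0 = zeros(ncomp, nlines);            % one spectrum per row, initially all zeroes

for j = 1:nlines
    k = randi(ncomp);                 % choose one compound at random
    S0(k, j) = rand;                  % only that compound gets an intensity
end

G = S0 * S0';                         % off-diagonal elements are zero: the rows are orthogonal
```

Because no two spectra share a nonzero mass number, every inner product between different rows is exactly zero. Replacing rand by a power of rand would skew the intensities toward small values, as suggested in the text.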
6.6.2 Solving for the elution curves The second step of the AR algorithm solves for the concentrations using the current value for spectra as a basis. This operation requires the solution of a simple multiple regression problem (Fig. 6.2). The MATLAB program for this step is delightfully simple. It needs just a single program line. Behind the scenes, MATLAB does much work, but it is not visible to the user. If the data do not come from chromatography we have to talk about concentration curves, not elution curves.
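One way the single program line could look is shown below. The sketch assumes that the observation matrix O holds one spectrum per row, that the spectra S are stored one per row, and that O is modeled as C*S; the synthetic data and the variable names are made up for the demonstration.

```matlab
nscan = 40; nlines = 340; ncomp = 3;  % made-up sizes
S = rand(ncomp, nlines);              % current spectra, one per row
Ctrue = rand(nscan, ncomp);           % synthetic concentrations, for the demo only
O = Ctrue * S;                        % noise-free synthetic observations

C = O / S;                            % the regression step: least-squares solution of C*S = O
```

With noise-free data the true concentrations are recovered. With real data and random starting spectra the result contains negative values, which is why the constraining step follows.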
6.6.3 Constraining the elution curves When the elution curves have been solved, we are not near the solution yet. Many elution curves initially have their highest intensities in the same point of time. This is not the case for the correct solution. Also, there are multiple local maxima in the elution curves. Some points have negative values, which is not possible for a physically plausible solution. These defects are repaired in the third step that constrains the elution curves (Fig. 6.3). Initially the points are inspected for negative values. If a negative value is found, it is simply chopped off and replaced by a zero. The next phase in the constraining operation guarantees the unimodal nature of the elution curves. The highest point in the elution curves is sought and kept unchanged. The points that precede it in the elution curve are then sorted into an ascending order. The highest point can be omitted from the sorting because we know beforehand that it is higher than the preceding points. The points after the highest intensity are then sorted into a descending order. Again, the highest point is not carried into the sorting process, because it is the highest value. When all the points have been processed, the solution is inspected for intensities. In practice this step is performed after chopping off the negative values. If it is found that the values for intensities were all negative, the elution curve is filled with a small amount of noise. This filling takes place automatically because a tiny amount of noise is always added to the elution curves. This is purely an insurance against difficulties in the later steps. We cannot go on with the other steps of the AR algorithm if the elution curves totally vanish. This rerandomizing step serves as an insurance against the rare situation where an elution curve totally vanishes during the constraining step. It is important that the spectra and elution curves are correctly scaled during the process. 
For this reason, the lengths of the vectors are adjusted to a constant length of one. This means that all spectra have about the same importance in the regression step. Similarly all elution curves carry the same weight during their regression step. The amounts of the components are kept stored in a separate vector. This vector is sorted into a descending order by the intensities of the components in the solution. The spectra and elution curves are sorted in synchrony with the concentration vector. Finally the spectra and elution curves are sorted into the order of the locations of the peak maxima in the second dimension, usually elution curves.
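The constraining step can be sketched as follows. The small matrix of unconstrained curves, the noise level of 1e-12 and the name amount are illustrative assumptions; the final reordering of the components is not shown.

```matlab
% Two made-up unconstrained elution curves, stored as columns.
C = [ 0.1  0.0; -0.2  0.3; 0.5  0.1; 0.2  0.9; 0.4 -0.1; 0.1  0.2];

C(C < 0) = 0;                         % chop off the negative values
C = C + 1e-12 * rand(size(C));        % tiny noise as insurance against vanishing curves

amount = zeros(1, size(C, 2));        % lengths of the curves, kept in a separate vector
for k = 1:size(C, 2)
    c = C(:, k);
    [~, imax] = max(c);               % the highest point is kept in place
    c(1:imax-1)   = sort(c(1:imax-1), 'ascend');     % points before the maximum
    c(imax+1:end) = sort(c(imax+1:end), 'descend');  % points after the maximum
    amount(k) = norm(c);
    C(:, k) = c / amount(k);          % scale the curve to a length of one
end
```

Sorting only rearranges intensities within a curve, so the total intensity on each side of the maximum is preserved while all secondary maxima disappear.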
Figure 6.2 The first step in decomposing by AR is to fill the spectrum matrix with random numbers. Then the spectra are considered "known". Using regression, the concentration matrix is solved based on the observation matrix and the spectra. Notice that the elution profiles of the components (the concentration matrix) also contain negative values, which are not physically possible. [Panels: Observations; Spectra; Unconstrained concentrations.]
Figure 6.3 To get physically realistic elution profiles, constraints are used. The negative values are replaced by zero. Then the profile data is sorted to become unimodal. [Panels: Unconstrained concentrations; Concentrations.]
Figure 6.4 In the third step the constrained elution profiles are considered to be "known". Using regression, the spectrum matrix is solved based on the observation matrix and the concentrations. Note that the spectra contain some negative peaks. [Panels: Concentrations; Observations; Unconstrained spectra.]
Figure 6.5 To get physically realistic spectra, constraints are used. The negative values of spectral lines are replaced by zeroes. [Panels: Unconstrained spectra; Spectra.]
6.6.4 Solving for the spectra The next step in the algorithm solves for the spectra based on the constrained elution curves (Fig. 6.4). The algorithm is ordinary multiple regression; there is nothing fancy about it. As the elution curves were scaled to be roughly equal in size there is no difficulty getting spectra that reflect all components of the solution. The spectra that are obtained differ in their lengths, because the amounts of the components are now reflected in the scaling of the spectra.
6.6.5 Constraining the spectra In the initial iterations the spectra can be nonphysical; they have some negative points in them. The exact constraining steps that are performed depend on the spectra (Fig. 6.5). Initially, we eliminate all negative intensities by chopping them off. If a spectrum component totally vanishes at this step, the spectrum is refilled with random numbers. If the spectra come from an instrument that records smooth spectra, some sort of smoothing is applied now to the spectra. If the spectra are not smooth but line spectra, the number of spectral lines is sometimes adjusted to some fixed maximum number and the smaller components are dropped off.
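The two spectral steps can be sketched together. The synthetic data, the sizes and the limit nkeep are assumptions made for the illustration; the truncation to nkeep lines applies only to line spectra, and smooth spectra would be smoothed instead.

```matlab
nscan = 40; nlines = 60; ncomp = 3;   % made-up sizes
C = rand(nscan, ncomp);               % stands in for the constrained elution curves
O = C * rand(ncomp, nlines) + 0.05 * randn(nscan, nlines);   % synthetic noisy observations

S = C \ O;                            % regression: least-squares solution of C*S = O
S(S < 0) = 0;                         % chop off the negative intensities

nkeep = 10;                           % hypothetical maximum number of lines per spectrum
for k = 1:ncomp
    if ~any(S(k, :))
        S(k, :) = rand(1, nlines);    % refill a vanished spectrum with random numbers
    end
    [~, order] = sort(S(k, :), 'descend');
    S(k, order(nkeep+1:end)) = 0;     % line spectra only: drop the smaller lines
end
```

The backslash operator mirrors the slash used for the elution curves: the roles of the known and the unknown matrix are simply exchanged.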
6.7 Repeating the AR iterations
The iterations described are repeated ten to twenty times. The result converges to a stable end point. The calculations can be continued, but they improve the result only slightly.
6.8 Summing up
The goal in research is to obtain so much redundant information about a problem, that the modeling process needs the minimum amount of a priori constraints for its solution. If the overlap between the components is severe and we wish to find a minority component buried under larger components, there is a limit to how far we can get with statistics. In some situations, it is better to try to reduce the overlap by experimental means. The analyst can perhaps simplify the mixture by extracting out some interfering compound or by concentrating the compound of interest. The use of constraints is the central point in AR. Everything depends on the constraints. If they are wisely applied, interesting results can be obtained in situations that look nearly hopeless. On the other hand, mechanical application of unnecessary constraints leads to a solution that looks good on the surface, but is an artifact. With experience, the analyst learns which constraints work best with his material.
OSCAR finds the most stable spectra and elution curves. The solution is possible by increasing the baseline constraints to values that are just sufficient [Karjalainen and Karjalainen 1995]. This forces the solution to a stable value without degrading the fit too much. The practical realization of the OSCAR algorithm is the AutoAR program.
References
Aubanel EE, Oldham KB. Fourier smoothing without the Fast Fourier Transform. Byte 1985; 10: 207-218.
Karjalainen EJ. Isolation of pure spectra in GC/MS by mathematical chromatography: Entropy considerations. In: Meuzelaar HLC, ed. Computer-enhanced analytical spectroscopy, vol. 2. New York--London: Plenum Press, 1990, pp. 49-70.
Karjalainen EJ, Karjalainen UP. Mathematical chromatography--Resolution of overlapping spectra in GC/MS. In: Roger FH, Grönroos P, Tervo-Pellikka R, O'Moore R, eds. Medical Informatics Europe 85, Proceedings. Springer-Verlag 1985, pp. 572-578.
Karjalainen EJ, Karjalainen UP. Mathematical chromatography in GC/MS. Finding the pure mass spectra. Clinical Chemistry Research Foundation Library, vol. 2. Helsinki, 1987, pp. 1-48.
Karjalainen EJ, Karjalainen UP. Component reconstruction in the primary space of spectra and concentrations. Alternating regression and related direct methods. Analytica Chimica Acta 1991; 250: 169-179.
Karjalainen EJ, Karjalainen UP. Robust recovery of spectra with OSCAR. Kemia--Kemi 1995; 22: in press.
Malinowski ER, Howery DG. Factor analysis in chemistry. New York--Chichester--Brisbane--Toronto: John Wiley & Sons, 1980, 251 pages.
Said AS. Theory and mathematics of chromatography. Heidelberg--Basel--New York: Dr. Alfred Hüthig Verlag, 1981, pp. 71-81.
Späth H. Spline-Algorithmen zur Konstruktion glatter Kurven und Flächen. München-Wien: R. Oldenbourg Verlag, 1973, 134 pages.
Chapter 7
Applying the OSCAR algorithm--A practical example
The OSCAR process
Taking a first look with MATLAB
Preprocessing the data with AutoAR
Setting up the constraint space
Gathering the AR statistics
Inspecting the solution found with AutoAR
Plotting the spectra and elution curves
Calculating the reproducibility
7.1 The OSCAR process
The analytical results are analyzed using the OSCAR algorithm, which has been implemented as a MATLAB program, AutoAR. The calculation has six phases:
1. Setting up the constraint space.
2. Running the AR calculations and collecting the results as a function of guiding constraints.
3. Looking at the statistics as a function of guiding constraints.
4. Selecting the best solution from the set of solutions.
5. Recalculating the statistics for the best solution.
6. Looking at the best spectra and elution curves with their associated confidence ranges.
7.2 Taking a first look with MATLAB
We shall now work through a practical AR example using some mass spectrometric data. The data are derived from a gas chromatographic analysis of urinary organic acids. The derivatized samples are first separated using gas chromatography on a capillary column. The spectra were collected at a constant scanning speed, so we get a matrix of intensities that are evenly spaced in time. The number of spectra collected during one chromatographic separation is too large to permit the analysis in just one batch of spectra. Due to computer memory limitations it is necessary to divide one chromatographic run into smaller sections. A typical section may consist of 30 to 150 spectra, depending on the data. A section is selected in such a fashion that it starts and ends on the spectral baseline where only background components are present. It is not always possible to segment a chromatographic run into clearly separated segments. The goal should be to make the cuts at local minima. If the baseline remains high in a segment, the analysis should be repeated with segments that overlap.
In the following example we use a segment of a GC-MS run of derivatized organic acids. ORG40 is a data matrix containing 40 spectra. It stores the individual spectra in horizontal rows. The intensity of the mass spectra can be summed and plotted in one statement. Because the matrix can be summed along either of its two dimensions, there are two ways to calculate a sum of ions from the complete observation matrix. We can take a look at the total ionization curve by calculating the sum of ions in each spectrum. This operation requires that the transpose of ORG40 is formed first, because sum adds down the columns of a matrix. The following MATLAB command will plot the total ionization curve:
>> plot(sum(ORG40'))
The result (the total ionization, or total abundance) is shown in Fig. 7.1. If we calculate the sum of the data matrix along the other dimension of the matrix, we get the sum spectrum of all 40 mass spectra (Fig. 7.2). This is done by using a MATLAB command:
>> bar(sum(ORG40))
Figure 7.1 The total ionization curve is calculated from 40 sequential mass spectra, ORG40. The spectra of derivatized organic acids were collected using the HP GCD instrument that contains a quadrupole mass spectrometer coupled to a gas chromatograph.
Figure 7.2 The sum spectrum of the 40 mass spectra. The bar command gives a better looking sum spectrum for line spectral data, e.g., for mass spectra. For optical spectra we get a better display using the plot statement instead.
We are looking at a section of the run that contains 40 mass spectra. To get some idea about the heterogeneity in the sample we shall make some preliminary analyses of the data before starting with the actual processing. The following MATLAB program loads in the short segment of spectra and then calculates the correlation matrix between all spectra:
clear
load ORG40
K = corrcoef(ORG40');
mesh(K)
The correlation coefficients are displayed as a three-dimensional mesh diagram. If all spectra were identical, all correlation coefficients would have values close to one. This is clearly not the case as we can see in Fig. 7.3. The smallest correlations are between 0.3 and 0.4. Another way to take a look at the correlations in the run segment is to use a plot statement:
>> plot(K)
Figure 7.3 The correlation coefficients between the 40 mass spectra in a segment selected for analysis. The diagonal values have a correlation of 1, because each spectrum is correlated there against itself. The lowest values are much smaller, between 0.3 and 0.4, suggesting the presence of at least two components in the segment. Clearly further analysis is necessary to find out how many components are present in the spectra.
This results in the display of Fig. 7.4. Several consecutive high correlations suggest the presence of a "pure" component at that region. Low correlations suggest the relative absence of a given component in the region. Crossing lines suggest a region between two overlapping components. Around spectrum number 30 there is a clear overlapping area, while the region around spectrum number 25 contains a relatively pure chromatographic peak. This does not mean that there are no other components present at this region, only that there are no other relatively large components present. Small background components are not easy to locate in the plot. A correlation plot like this can be used to obtain a rough estimate about the number of components in a segment. We can take a direct look at two raw spectra and compare them to see if they are different. We select spectrum number 24 and spectrum number 36:
>> bar(ORG40(24, :))
>> bar(ORG40(36, :))
The results of these commands are shown in Fig. 7.5. We see by comparing the spectra that the largest peaks in spectrum number 24 and in spectrum number 36 are different. The exact mass values cannot be read accurately from the figures of mass spectra because of their small size, but the following MATLAB command gives the precise values for both the intensity and the location of the main peak:
>> [big, place] = max(ORG40(24, :))
big =
   3.6753e+04
place =
   165
>> [big, place] = max(ORG40(36, :))
big =
   3.4335e+03
place =
   75
Figure 7.4 The correlation matrix between all spectra in the segment has been plotted. The x-axis is the spectrum number, the y-axis is the correlation. Each line in the plot shows the correlations of one spectrum with all other spectra. We see that the spectra in the neighborhood of spectrum 30 have high correlation values with all other spectra although the correlation varies elsewhere quite substantially. This suggests that the spectra near the spectrum 30 are mixtures.
Figure 7.5 Spectrum number 24 from the initial part of the chromatographic segment that contains 40 mass spectra from a sample of urinary organic acids is shown in the upper figure. The lower figure shows the spectrum number 36. The spectra are far from identical indicating the presence of more than one component in the collection of 40 spectra.
We can also get an idea about the heterogeneity in the sample by trying out some other plots and mathematical functions available in MATLAB. We can easily write a short routine that plots the principal components by performing a singular value decomposition of the observation matrix. In chemometrics, the output of an svd is often called PCA or Principal Component Analysis (Fig. 7.6).
>> clear
>> load ORG40
>> s = svd(ORG40);
>> plot(s(1:7))
The graphical output of the first seven PCA components is shown in Fig. 7.6.
Figure 7.6 The main result of the PCA is the set of singular values from the svd of the observation matrix. We see that the number of components in the segment of 40 mass spectra is between three and five. Numbers higher than these probably are due to noise.
The svd function in MATLAB has uses in data compression. By replacing the original matrix with a matrix reconstructed from a smaller number of PCA components we are reducing noise. This is often a practical way to do some smoothing of the data matrix prior to the analysis. The first svd components are multiplied together to form back a somewhat smoothed observation matrix. The following program asks for the number of PCA components used for reconstruction of the matrix.
clear
load ORG40
K = ORG40';
[U, S, V] = svd(K, 0);                      % economy-size decomposition
ncomp = input('Number of components for reconstruction = ');
T1 = U(:, 1:ncomp) * S(1:ncomp, 1:ncomp);
T2 = T1 * V(:, 1:ncomp)';                   % smoothed matrix from the first ncomp components
D = K - T2;
diff = 100 * norm(D, 'fro') / norm(K, 'fro')   % relative difference in percent
Figure 7.7 The unprocessed data matrix containing mass spectra from a GC run of derivatized organic acids (ORG40). The data matrix is shown as a mesh diagram. The matrix contains 40 spectra (scans) and 340 spectral lines or mass fragments. [Mesh plot "Original Data Matrix"; axes: Number of scans, Number of lines.]
The program gives 1.89 percent for the difference between the original data and the smoothed data when the number of components is nine. We should be careful not to oversmooth our data. If we are looking for two or three components in a mixture we should retain six to nine PCA components in the smoothing process. A good rule is to use at least three times more PCA components than the expected number of chemical components in the mixture. If we use too few components in the smoothed observation matrix, there is a danger that we shall discard some essential information.
7.3 Preprocessing the data with AutoAR
The AutoAR program is a tool for the main functions needed in deconvolution. Before we start with the actual analysis we should get acquainted with the preprocessing tools. The preprocessing is integrated as a component of the AutoAR program because it is a rapid way to make the first adjustments to the data. When we get feedback from the actual results produced by the AR process we can easily improve the preprocessing settings. In preprocessing there are three different ways to handle the data:
1. The data matrix can be transposed, if necessary.
2. The analyst can take a segment of the matrix for further use.
3. The raw data can be smoothed using a filter vector or svd.
Figure 7.8 The data matrix shown in Fig. 7.7 has been transposed. Now the number of spectral lines is 40 and the number of scans (spectra) is 340. [Mesh plot "Transposed Data Matrix"; axes: Number of scans, Number of lines.]
ORG40 contains the first 40 scans from a larger data matrix ORG, which is on the CD-ROM. Because a large data matrix is difficult to display on paper, we only use this small matrix to describe the possibilities of the program. In the Command window of MATLAB we get a short log of all steps we do. The content of the Command window can be printed or stored as a document. First we load the data file. Its name is displayed in the Command window after the program name.
*** AutoAR/Preprocessing ***
Data file = ORG40.mat
When the data file is loaded the original data matrix can be displayed as a mesh diagram (Fig. 7.7). The size of the data matrix and its orientation are shown in the Command window.
Original Data Matrix
x = Number of lines
y = Number of scans
No. of collected spectra = 40
No. of spectral lines = 340
Figure 7.9 The data matrix shown in Fig. 7.7 has been smoothed by convolving it with a bell-shaped filter matrix FLTRB7V.mat. The peak tops are rounded when compared to the original data. [Mesh plot "Convolved Data Matrix"; axes: Number of scans, Number of lines.]
If necessary, the data matrix can be transposed. The transposed matrix is displayed as a mesh diagram (Fig. 7.8). Note that the number of spectral lines and the number of spectra are exchanged. The transpose is needed if the collected data are initially arranged so that the rows do not correspond to the different scans or spectra. The transposed data matrix can be saved and reloaded for later use. The original data can be smoothed using the convolution operation with a bell-shaped filter. The filtering can be made with the Savitzky-Golay type of filter [Savitzky and Golay 1964]. A collection of Savitzky-Golay filters is located on the companion CD-ROM. A seven-point bell-shaped filter, FLTRB7V.mat, is used in the example. The resulting matrix is displayed as a mesh diagram (Fig. 7.9). The user can store the smoothed matrix on the disk. This matrix can then be used as input for the AR calculations.
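The convolution smoothing can be sketched with a standard seven-point Savitzky-Golay filter. The contents of FLTRB7V.mat are not listed in the text, so the weights below are the textbook quadratic/cubic smoothing weights and not necessarily those of the file; the test data and the choice to filter along the scan direction are also assumptions.

```matlab
f = [-2 3 6 7 6 3 -2]' / 21;          % 7-point quadratic/cubic Savitzky-Golay smoothing weights

nscan = 40;
t = (1:nscan)';
O = [t.^2, 3 * t + 1];                % two made-up traces (columns) that are polynomial in t

Osm = conv2(O, f, 'same');            % the column vector f filters down the columns (scans)
```

A filter of this type reproduces low-degree polynomial segments exactly, so away from the ends the traces above pass through unchanged, while sharp peak tops in real data are rounded. The first and last three scans are distorted because conv2 pads the matrix with zeroes.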
Figure 7.10 The mesh diagram of the organic acid data after smoothing using five principal components. [Mesh plot "Data Matrix after SVD"; axes: Number of scans, Number of lines.]
The second way to smooth the data matrix is to use the svd as described earlier in this book. First we can take a look at the possible number of components needed for the svd. To do this we select the "Factorizing" function. The diagram shows the magnitude of the singular values of the svd components (Fig. 7.6). When the magnitude of a component is small enough, the component has no effect on the solution. We should remember, however, that we need at least three times as many components for smoothing as there are chemical components expected to be present. The smoothed data matrix is displayed as a mesh diagram (Fig. 7.10). The relative error displayed in the Command window is the difference between the original and the smoothed data as a percentage of the original data. At the end the smoothed matrix is stored on the disk for further use.
>> The number of components = 5
Wait, processing svd...
Relative error = 2.68 (percent)
Another way to inspect the data is to look at the individual rows in the matrix corresponding to the measured spectra or the individual columns in the matrix
corresponding to the elution curves of single spectral lines. This part of AutoAR allows you to step from one spectrum to the next or to the previous one. Also it is possible to move from one spectral line to the next or to the previous one. This way it is easy to find the most intense spectrum on a chromatographic peak.
7.4 Setting up the constraint space
The philosophy of the AutoAR program can be summarized in a few sentences. Constrain the solution in such a fashion that the fit is good and the solution is reproducible. The uniqueness of the solution is measured by the scatter of the solutions. The scatter is a good measure of reproducibility due to the nature of the AR procedure. The AR process begins with random initial values for the spectra. If the process always converges to a single solution the result is reproducible. The actual analysis of the overlapping problem starts by screening the parameter space. The goal is to find the best combination of constraints and the number of components. The variables we have to adjust are the number of species and the values for spectral and elution curve baselines. We give some reasonable ranges for these variables. On the basis of the PCA or the inspection of the correlation matrix between spectra we can select the range for the number of components. The only way to get good ranges for the spectral and the elution curve baselines is to become familiar with the data. Initially we should select a broad range for both of these. If we are going to look at the solution space in three dimensions, it is necessary to select at least three possible values for each adjustable parameter. Otherwise the surfaces will be rather uninteresting. The more points we select the more accurately the space will be scanned. However, the time needed for the calculation increases with the number of parameter values. Ten is usually a sufficient number of "core AR" iterations at each new starting point. The calculation of scatter needs to start the "core AR" calculations at least twice using the same parameter values. More than two repeats give a more accurate estimate for the scatter value. Ten is a good initial value.
7.5 Gathering the AR statistics
The AutoAR reports its progress in the Command window. First the selected parameter ranges are documented and then the total number of sets needed is calculated. This number can be used to follow the progress of the calculations during the AutoAR run.
*** AutoAR/Run_AR ***
Data file = ORG40
Max iterations = 20                              Repeats = 20
Elution curve baseline = 0.000000 - 0.000010     Log steps = 3
Spectral baseline = 0.000000 - 0.000010          Log steps = 3
No. of species = 3 - 6
Total sets = 720
Figure 7.11 The last view on the screen after the program has processed all requested species. The number of components in the figure is six, the spectral baseline and the elution curve baseline are both 0.00001. The 20th iteration with these parameter values gives a value of 45.57 percent for the fit. As can be seen from the shape of elution curves, six components is probably more than the optimum. [Plot title: "AR iteration #20 Fit=45.57"; x-axis: Scan number.]
The elution curve baseline and the spectral baseline can be selected to be either linearly or logarithmically spaced. If we use logarithmic spacing, it is convenient to select the minimum and the maximum values so that the intermediate levels will be multiples, e.g., starting from 0.0001 and ending at 0.01 using three steps we get values of 0.0001, 0.001 and 0.01. This is for esthetics only. Any number is good for the calculations. The program saves the selected parameters into a text file called params.txt. The text file is used by other programs in the AutoAR package, but it can be used for documentation purposes as well.
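The way the total number of sets arises can be sketched as follows. The baseline values are the logarithmically spaced example values from the text, the variable names are hypothetical, and the nesting order of the loops is an inference from the set counts in the Command window listing; it is not taken from the AutoAR source.

```matlab
elu  = logspace(-4, -2, 3);           % elution curve baselines: 0.0001, 0.001, 0.01
spec = logspace(-4, -2, 3);           % spectral baselines, same three log steps
nsp  = 3:6;                           % numbers of species to try
repeats = 20;                         % repeated random starts per parameter combination

total = numel(elu) * numel(spec) * numel(nsp) * repeats;
sets = zeros(total, 3);               % one row per set: [elution, spectral, species]
row = 0;
for e = elu
    for s = spec
        for n = nsp
            for r = 1:repeats
                row = row + 1;
                sets(row, :) = [e, s, n];
            end
        end
    end
end
```

Three levels for each baseline, four numbers of species and twenty repeats give 3 x 3 x 4 x 20 = 720 sets, the total reported in the log. Every additional level multiplies the computing time accordingly.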
Each "set" starts by finding a new set of random numbers for the spectra. Then the program solves for the elution curves and uses the proper values for the constraints. After this the program solves for the spectra and removes any negative spectral lines. The constraints are applied as the last step in solving for spectra. This is the first iteration cycle. The results for each "set" are shown in the Command window and the elution curves of solved components are displayed in the Figure window (Fig. 7.11). Only the last iteration is shown.

    set     el. curve   spectral    no. of    iteration   fit %
    count   baseline    baseline    species   per level
    ...
    321     0.000001    0.000001    3         20          10.958
    322     0.000001    0.000001    3         20           8.566
    ...
    359     0.000001    0.000001    4         20           5.420
    360     0.000001    0.000001    4         20           5.372
    361     0.000001    0.000001    5         20           4.731
    362     0.000001    0.000001    5         20           6.960
    ...
    399     0.000001    0.000001    6         20           5.100
    400     0.000001    0.000001    6         20           7.187
    401     0.000001    0.000010    3         20           9.886
    402     0.000001    0.000010    3         20          10.761
    ...
    439     0.000001    0.000010    4         20           5.937
    440     0.000001    0.000010    4         20           5.829
    441     0.000001    0.000010    5         20           3.849
    442     0.000001    0.000010    5         20           3.963
    ...
    639     0.000010    0.000001    6         20           6.861
    640     0.000010    0.000001    6         20           4.926
    641     0.000010    0.000010    3         20           8.628
    642     0.000010    0.000010    3         20           8.628
    ...
    679     0.000010    0.000010    4         20           5.619
    680     0.000010    0.000010    4         20           5.700
    681     0.000010    0.000010    5         20           6.249
    682     0.000010    0.000010    5         20           3.964
    ...
    719     0.000010    0.000010    6         20          31.904
    720     0.000010    0.000010    6         20          45.569
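The iteration cycle described for each "set" can be sketched as an alternating regression. The Python code below is an illustration under stated assumptions, not the AutoAR implementation: the function names are hypothetical, the baselines are applied simply as lower floors, the fit is taken as the relative residual norm in percent, a small ridge term is added for numerical safety, and the sorting step that enforces unimodal elution curves is left out.

```python
import math
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def regress(basis, data, ridge=1e-9):
    """Least-squares X (k x q) minimising ||data - basis^T X||.

    basis is k x p and data is p x q. The normal equations
    (basis basis^T) X = basis data are solved by Gauss-Jordan elimination.
    The ridge term (an assumption) only guards against a singular system.
    """
    k = len(basis)
    gram = matmul(basis, transpose(basis))
    rhs = matmul(basis, data)
    scale = sum(gram[i][i] for i in range(k)) / k or 1.0
    for i in range(k):
        gram[i][i] += ridge * scale
    aug = [gram[i][:] + rhs[i] for i in range(k)]
    for i in range(k):
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        for j in range(k):
            if j != i:
                factor = aug[j][i]
                aug[j] = [vj - factor * vi for vj, vi in zip(aug[j], aug[i])]
    return [row[k:] for row in aug]

def core_ar(data, n_species, iterations=20, elution_floor=0.0,
            spectral_floor=0.0, rng=None):
    """One "set": random spectra, then alternating constrained regressions.

    data is scans x channels. Returns (elution, spectra, fit_percent) with
    elution as scans x species and spectra as species x channels.
    """
    rng = rng or random.Random()
    n_channels = len(data[0])
    spectra = [[rng.random() for _ in range(n_channels)]
               for _ in range(n_species)]
    for _ in range(iterations):
        # Solve for the elution curves given the spectra, then constrain.
        elution_t = regress(spectra, transpose(data))
        elution_t = [[max(v, elution_floor) for v in row] for row in elution_t]
        # Solve for the spectra; negative lines are removed as the last step.
        spectra = regress(elution_t, data)
        spectra = [[max(v, spectral_floor) for v in row] for row in spectra]
    elution = transpose(elution_t)
    model = matmul(elution, spectra)
    resid = sum((d - m) ** 2 for dr, mr in zip(data, model)
                for d, m in zip(dr, mr))
    total = sum(d * d for dr in data for d in dr)
    # Assumed definition of the fit: relative residual norm in percent.
    fit_percent = 100.0 * math.sqrt(resid / total) if total else 0.0
    return elution, spectra, fit_percent

# Tiny synthetic example: two species, six scans, four channels.
true_elution = [[0, 0], [1, 0], [2, 1], [1, 2], [0, 1], [0, 0]]
true_spectra = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
data = matmul(true_elution, true_spectra)
elution, spectra, fit = core_ar(data, n_species=2, rng=random.Random(0))
print(round(fit, 3))
```

Running `core_ar` several times with different random generators and the same parameter values gives the repeated solutions from which a scatter value could be estimated.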
At the end of the solution process the scatter and the fit value for each parameter set are calculated. The values are the arithmetic means of the individual fit and scatter values for each parameter combination. The vectors shown in the Command window are also stored on the disk in MAT-file format so they can be used and displayed directly by MATLAB. The vectors are displayed as rows in the Command window only for easier reading.

    Scatter:
      20.5315   13.9789   45.1993   62.5423   21.7679    8.3912
      41.3833   53.8959   17.9329    6.1806   40.0731   53.8574
      20.9934   21.3072   39.0645   67.5575   23.2139    7.3409
      33.5659   63.5548   21.8441    6.2886   48.0597   62.9158
      19.2511    8.0877   44.7887   62.7849   20.5948   10.4537
      41.7206   57.8301   21.5030   10.3796   34.2706   64.9903

    Fit:
       9.4580    7.4995    8.5550   18.1851    9.7438    5.7713
       7.9022   23.1883    9.5520    5.7440    8.8526   13.5459
       9.2378    7.5869    8.0875   21.7223    9.6148    5.7295
       6.5995   12.8533    9.6066    5.7673   11.3113   18.4729
       9.4230    5.7203   10.5183   11.1512    9.2385    6.8411
       9.4649   14.0997    9.1804    6.2374    5.5592   20.3920
These results can be displayed in two or three dimensions to observe the solution space.
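Before plotting, it helps to map each entry of the scatter and fit vectors back to its parameter combination. The Python sketch below does this for the values listed above. The loop order (elution curve baseline outermost, number of species innermost) and the lowest baseline level of 1e-7 are assumptions inferred from the set numbering in the progress log and from the figure titles; the names are hypothetical.

```python
from itertools import product

elution_baselines = [1e-7, 1e-6, 1e-5]
spectral_baselines = [1e-7, 1e-6, 1e-5]
species_counts = [3, 4, 5, 6]

# Mean scatter and fit per parameter combination, in Command-window order.
scatter = [
    20.5315, 13.9789, 45.1993, 62.5423, 21.7679, 8.3912,
    41.3833, 53.8959, 17.9329, 6.1806, 40.0731, 53.8574,
    20.9934, 21.3072, 39.0645, 67.5575, 23.2139, 7.3409,
    33.5659, 63.5548, 21.8441, 6.2886, 48.0597, 62.9158,
    19.2511, 8.0877, 44.7887, 62.7849, 20.5948, 10.4537,
    41.7206, 57.8301, 21.5030, 10.3796, 34.2706, 64.9903,
]
fit = [
    9.4580, 7.4995, 8.5550, 18.1851, 9.7438, 5.7713,
    7.9022, 23.1883, 9.5520, 5.7440, 8.8526, 13.5459,
    9.2378, 7.5869, 8.0875, 21.7223, 9.6148, 5.7295,
    6.5995, 12.8533, 9.6066, 5.7673, 11.3113, 18.4729,
    9.4230, 5.7203, 10.5183, 11.1512, 9.2385, 6.8411,
    9.4649, 14.0997, 9.1804, 6.2374, 5.5592, 20.3920,
]

# Assumed loop order: elution baseline outermost, species innermost.
combos = list(product(elution_baselines, spectral_baselines, species_counts))
stats = {c: (s, f) for c, s, f in zip(combos, scatter, fit)}

def scatter_vs_species(elution_baseline, spectral_baseline):
    """Scatter as a function of the number of species, baselines fixed."""
    return {n: stats[(elution_baseline, spectral_baseline, n)][0]
            for n in species_counts}

print(stats[(1e-6, 1e-6, 4)])          # (7.3409, 5.7295)
print(scatter_vs_species(1e-7, 1e-7))  # {3: 20.5315, 4: 13.9789, 5: 45.1993, 6: 62.5423}
```

Under this ordering the entry for four species with both baselines at 0.000001 is a scatter of 7.3409 and a fit of 5.7295, the same pair that the statistics run of Section 7.8 reports, which supports the assumed order.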
7.6 Inspecting the solution found with AutoAR
There are two output variables calculated by the AutoAR program that describe the quality of the solution. These are the fit and the scatter. The fit measures how closely a solution matches the data, and the scatter describes how repeatable the solution is. Often one of them decreases while the other increases. The optimum solution is the one where both the fit and the scatter are small. This is the reason why the fit and the scatter are combined into a third variable, which best describes the overall optimum. We can look at the behavior of these three variables as a function of one parameter at a time, while the two other parameters remain at a selected, constant level. The parameters here are the number of components, the elution curve baseline, and the spectral baseline. In Fig. 7.12 the three error variables are shown as a function of the number of components. Three different situations are displayed.
[Figure 7.12: three panels plotted against the number of species. a) Elution curve baseline = 1e-07, spectral baseline = 1e-07. b) Elution curve baseline = 1e-05, spectral baseline = 1e-07. c) Elution curve baseline = 1e-07, spectral baseline = 1e-05.]

Figure 7.12 The fit, the scatter, and their combination are shown as a function of the number of components. The dashed line corresponds to the fit, the dotted line to the scatter and the solid line to the combined variable. a) The elution curve baseline and the spectral baseline are very small. b) The elution curve baseline is increased by a factor of 100. The spectral baseline has the same value as in figure a. c) The spectral baseline is increased by a factor of 100. The elution curve baseline has the same value as in figure a.
In Fig. 7.12a the values for the elution curve baseline and the spectral baseline are very small. In Fig. 7.12b the elution curve baseline is increased by a factor of one hundred. In Fig. 7.12c the spectral baseline is increased by a factor of one hundred and the elution curve baseline is the same as in Fig. 7.12a. A clear minimum in the values of all three error variables is found at four components. The scatter and the combined variable increase sharply when the number of components becomes larger. The fit increases more slowly.

The program reads the parameter values that were earlier selected by the user in setting up the constraint space and displays them in the Command window. These values cover the examined constraint space. The text displayed in the Command window can be stored as a log file. The selected constant variables are shown in the window. The line types and colors used in the figure are explained in the text.

The next series of 2D figures displays the fit, the scatter and their combination as a function of elution curve baseline levels. The fixed parameters are the number of components and the spectral baseline. The number of components increases from Fig. 7.13a to b. We can see that the fit does not change much along the elution curve baseline axis. When the number of components becomes too large the scatter increases (Fig. 7.13b). The solution is therefore unreliable. The fit value is larger with this parameter set, which means that the solution differs more from the observations than with a smaller number of components. The reason for the worse fit is the unstable convergence of AR when there are too many components.

[Figure 7.13: two panels plotted against the elution curve baseline (1e-7 to 1e-5). a) Number of species = 4, spectral baseline = 1e-06. b) Number of species = 6, spectral baseline = 1e-06.]

Figure 7.13 The fit, the scatter, and their combination are shown as a function of the elution curve baseline. The dashed line corresponds to the fit, the dotted line to the scatter and the solid line to the combined variable. a) The number of components is four and the spectral baseline is 0.0001 percent or 1e-6. b) The number of components is six and the spectral baseline has the same value as in figure a.

The third series of 2D figures displays the error variables as a function of the spectral baseline, while the number of components and the elution curve baseline are fixed for each figure (Fig. 7.14a and b). The smallest values are found when the number of components is four. The changes are not very large along the spectral baseline axis.

[Figure 7.14: two panels plotted against the spectral baseline (1e-7 to 1e-5). a) Number of species = 4, elution curve baseline = 1e-06. b) Number of species = 6, elution curve baseline = 1e-06.]

Figure 7.14 The fit, the scatter, and their combination are shown as a function of the spectral baseline. The dashed line corresponds to the fit, the dotted line to the scatter and the solid line to the combined variable. a) The number of components is four and the elution curve baseline is 0.0001 percent or 1e-6. b) The number of components is six and the elution curve baseline has the same value as in figure a.

The solutions can be studied as a function of two parameters while one parameter remains constant. The three-dimensional (3D) figures display one of the error indicator variables (the fit, the scatter or their combination) at a time. The shape of the 3D surface describes the variation in the selected variable better than the individual 2D figures. The error indicators needed for the 3D figures are produced by the AR calculation. They are the same variables, fit and scatter, that are used for the 2D figures. Here we select the varied parameters for the x- and the y-axes and a fixed value for the third parameter. The combined error variable shows the overall situation. If you are interested in the changes of the fit and scatter you can inspect their behavior as well.

First we look at the combination of the fit and the scatter. Fig. 7.15 shows the response surface as a function of the number of components and the elution curve baseline. The figure format is a contour plot. It is not possible to see directly from the figure in which direction the values are increasing or decreasing. The plot is shown in the Figure window and explanatory comments are written to the Command window of MATLAB.

[Figure 7.15: contour plot; title "Spectrum baseline = 1e-07"; axes: number of species, elution curve baseline.]

Figure 7.15 The combined variable of the fit and the scatter is displayed as a contour plot. The axes are the number of components and the elution curve baseline. The spectral baseline is held constant.

[Figure 7.16: mesh plot; title "Spectrum baseline = 1e-07"; axes: number of species, elution curve baseline.]

Figure 7.16 The combined variable is shown as a mesh plot. The x-axis is the number of components and the y-axis is the elution curve baseline. The spectral baseline is held constant.

The other possible 3D plot formats are the mesh plot (Fig. 7.16), the surf plot (Fig. 7.17), the surfc plot, and the surfl plot (Fig. 7.18). Each of these can be produced with a single MATLAB statement. These three-dimensional plots clearly show the minima and the maxima if they are present in the surfaces. The increasing and the decreasing directions are easily observable. The mesh plot in Fig. 7.16 shows the combination of the fit and scatter as a function of the number of components and the elution curve baseline. The data are the same as in Fig. 7.15. While the contour plot shows the isocurves along which the plotted variable has the same value, it gives no idea about the direction of the function. The mesh plot shows that the combined variable is rather insensitive to the elution curve baseline value. The highest number of components (six) seems to be out of the question at all elution curve baseline values tested. The smallest number of components (three) is not an optimum. The minimum seems to be at four components.
[Figure 7.17: surface plot of the scatter; title "Spectrum baseline = 1e-07"; axes: number of species, elution curve baseline.]

Figure 7.17 The scatter is shown as a surfl plot. The x-axis is the number of components. The y-axis is the elution curve baseline. The spectral baseline is held constant. The shape of the surface resembles that in Fig. 7.16. This shows that the scatter has more effect on the form of the combined error surface than the fit.

[Figure 7.18: surface plot of the fit; title "Elution curve baseline = 1e-06"; axes: number of species, spectrum baseline.]

Figure 7.18 The fit is shown as a surf plot. The x-axis is the number of components. The y-axis is the spectral baseline. The elution curve baseline is held constant.
The difference between the mesh plot and the surf plot is that the latter has a surface of solid "plates". This looks better on the color screen but gives no more information than the mesh plot. The surfc plot is a combination of the surf and the contour plots. The contour plot, however, is not always shown. It depends on the form of the surface; the data should be evenly spaced, otherwise the contour plot cannot be displayed by MATLAB. In these cases the result is the same as with the surf plot. The surfl plot has lighting added to the surf plot.
7.7 Plotting the spectra and elution curves
After inspecting the statistics of the fit and scatter functions the analyst decides to display the solution that is optimal. This point in the constraint space gives the lowest combination of fit and scatter. The next step in the analysis is calculating the detailed solution for the optimum. After the analyst has selected the best parameter combination (the number of components, the value for the elution curve baseline, and the value for the spectral baseline), the spectra and the elution curves of the different components can be displayed.
[Figure 7.19: "Species #1"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.19 The elution curve and the mass spectrum of the first component. The elution curve is shown without the total elution curve and scaled to the maximum. The mass spectrum is shown as a bar graph.

[Figure 7.20: "Species #1"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.20 The elution curve and the mass spectrum of the first component. The elution curve (solid line) is shown with the total elution curve (dotted line). The elution curve of the first component corresponds to 8.4 percent of the total area in the total ionization curve.
The reason for the separate calculation step is simply the storage space. The exploratory step that studies the solution at all points of the constraint space cannot store all intermediate solutions. In this program step you can refine the fit and the scatter values by calculating the solutions with more repeats than were used in the previous screening step. The number of repeats that should be selected depends on the use of the spectra and elution curves. If just the spectra are needed, this part of the program is better than the portion of the program that estimates the reproducibility of spectra and elution curves. The spectra and the elution curves displayed by the program on the screen are from a single step, representing the last repeat. The matrices which are output represent the more precise averages.

In the following example the analyst has decided that the optimal number of components is four. The values for the elution curve baseline and the spectral baseline are set optimally. The report from the Command window shows the baseline values with six decimals. Even smaller values can be used and they are saved correctly in the params.txt file.

The spectrum and the elution curve of a component are displayed simultaneously on the computer screen. The elution curve can be displayed alone or with the sum of all elution curves. For the baseline components the plain elution curves often look odd when they are scaled to the full range. When the total elution curve is shown together with the elution curve of the selected component, the relative magnitude of the component is easier to see. The location of the peak maximum relative to the total elution curve is easy to compare. The spectrum and the elution curve of the first component are shown in Figs. 7.19 and 7.20 without and with the total elution curve, respectively.

    *** AutoAR / Select ***
    Max iterations-           20
    Elution curve baseline-   0.000000 - 0.000010    Log steps-  3
    Spectral baseline-        0.000000 - 0.000010    Log steps-  3
    No. of species-           3 - 6
    Repeats-                  20
    Total sets-               720

    Selection-
    set     el.curve    spectral    no. of    iteration   fit %
    count   baseline    baseline    species   per level
    1       0.000001    0.000001    4         20          6.035
The fit value for each solution cycle is reported. At the end of the report the scatter and the mean value for the fit are calculated. If only one repeat is calculated the scatter is zero and the fit is the single calculated value, here 6.0 percent. On the basis of the data shown in Fig. 7.16 there is not a great difference in the combined error variable between four and five components. However, when more repeats are calculated the scatter and the fit values for five components are much higher than for four components. The scatter for five components is 50.8 and the fit is 32.4.
7.8 Calculating the reproducibility
The variation in the spectra and elution curves is an indicator of the quality and reliability of the solution. This estimation of the confidence ranges around the spectra and the elution curves is possible with AutoAR. More precise statistics are calculated for the solution chosen by the user. In this example the elution curve baseline and the spectral baseline are set to their lowest value. The number of components is set to four as before. The spectra and the elution curves are solved ten times to get better statistics. Because the AR algorithm is started each time from a different, random starting point there is variation in the solutions found by AR.
[Figure 7.21: "Species #1"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.21 The elution curve and the mass spectrum of the first component. The elution curve is shown with the total elution curve. The standard deviations for the elution curves and for the spectra are shown plotted downwards.

[Figure 7.22: "Species #2"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.22 The elution curve and the mass spectrum of the second component, which forms 44.5 percent of the total area in the total ionization curve. The elution curve is shown with the total elution curve. The standard deviations for the elution curves and for the spectra are plotted downwards.
[Figure 7.23: "Species #3"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.23 The elution curve and the mass spectrum of the third component, which forms 23.6 percent of the total area in the total ionization curve. The elution curve is shown with the total elution curve. The standard deviations for the elution curves and for the spectra are plotted downwards.

[Figure 7.24: "Species #4"; upper panel: elution curve plotted against scan number; lower panel: mass spectrum.]

Figure 7.24 The elution curve and the mass spectrum of the fourth component, which forms 23.5 percent of the total area in the total ionization curve. The elution curve is shown together with the total elution curve. The standard deviations for the elution curves and for the spectra are plotted downwards.

The resulting spectra and the elution curves are displayed with their standard deviations in Figs. 7.21 to 7.24. The spectra and the elution curves are plotted upwards as usual while their standard deviations are displayed downwards in the same figure. If the elution curves are displayed together with the sum of all elution curves the y-axis is scaled according to it. In Fig. 7.21 the scatter in the elution curve is so small that it does not differ from zero at the scaling used in the figure. In the other figures (Figs. 7.22 to 7.24) the scatter in both the elution curves and the spectra is visible although small.

    *** AutoAR / Statistics ***
    Max iterations-           20
    Elution curve baseline-   0.000000 - 0.000010    Log steps-  3
    Spectral baseline-        0.000000 - 0.000010    Log steps-  3
    No. of species-           3 - 6
    Repeats-                  20
    Total steps-              720

    Selection-
    step    el.curve    spectral    no. of    iteration   fit %
    count   baseline    baseline    species   per level
    1       0.000001    0.000001    4         20          6.035
    2       0.000001    0.000001    4         20          5.408
    3       0.000001    0.000001    4         20          5.675
    4       0.000001    0.000001    4         20          5.429
    5       0.000001    0.000001    4         20          6.887
    6       0.000001    0.000001    4         20          5.797
    7       0.000001    0.000001    4         20          5.434
    8       0.000001    0.000001    4         20          5.704
    9       0.000001    0.000001    4         20          5.420
    10      0.000001    0.000001    4         20          5.420
    11      0.000001    0.000001    4         20          5.574
    12      0.000001    0.000001    4         20          5.883
    13      0.000001    0.000001    4         20          5.724
    14      0.000001    0.000001    4         20          5.914
    15      0.000001    0.000001    4         20          5.532
    16      0.000001    0.000001    4         20          6.307
    17      0.000001    0.000001    4         20          6.209
    18      0.000001    0.000001    4         20          5.448
    19      0.000001    0.000001    4         20          5.420
    20      0.000001    0.000001    4         20          5.372

    Scatter: 7.3409
    Fit:     5.7295
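The reported fit can be checked against the twenty individual values, since the text states that it is their arithmetic mean. The short Python sketch below only repeats that arithmetic; it does not recompute the scatter, which depends on the solutions themselves and whose formula is not given in this section.

```python
from statistics import mean

# Fit values (percent) of the 20 repeats listed in the statistics run.
fits = [6.035, 5.408, 5.675, 5.429, 6.887, 5.797, 5.434, 5.704, 5.420, 5.420,
        5.574, 5.883, 5.724, 5.914, 5.532, 6.307, 6.209, 5.448, 5.420, 5.372]

mean_fit = mean(fits)
print(round(mean_fit, 3))  # 5.73
```

The mean of the printed values is about 5.7296, which agrees with the reported 5.7295 to within the rounding of the three-decimal listing.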
The purpose of the previous programs is to make it simpler for the analyst to find the best values for the three guiding constraints. The guiding constraints are the number of the species, the background value for the spectra and the background value for the elution curves. After the analyst has found the optimum point he illustrates the optimum with spectra and elution curves. If confidence intervals are needed, they can be calculated. The AutoAR has a user-selectable switch for the sorting operation that produces the unimodality of the elution curves. If this switch has not been selected, the requirement for unimodality is dropped. This makes possible the analysis of data from sources other than hyphenated instruments.
Chapter 8
Applications in other spectroscopies

• Single-dimensional signals and AR
• Other hyphenated instruments
• Two-dimensional data with internal continuity
• Using OSCAR for spectra of discrete samples
The OSCAR algorithm for calculating the solution to the deconvolution problem is a general-purpose tool that is easily modified to handle problems in other areas of chemistry. The program sometimes needs additional constraints to handle the data in a given application area. In most cases there is no need to change the program, only the way we look at the data. We start by examining how OSCAR is used to handle single-dimensional chromatographies. After that particular application we shall see how the method works for different hyphenated instruments and methods. Then we apply it to signals where there is no physical separation (chromatography) in the second dimension. A typical example of this kind of signal is the data produced in a titration experiment. Finally we apply OSCAR to situations where there is no continuity between the spectra in the data matrix. This last case is valid for discrete samples (Fig. 8.1).

[Figure 8.1: "IR spectra of kidney stones"; axes: sample number, spectra.]

Figure 8.1 The component spectra can be found with AutoAR for collections of discrete samples. In this case the "chromatographic" axis is discontinuous.
8.1 Single-dimensional signals and AR
It is possible to use OSCAR for calculations that are based on single-dimensional chromatographies [Karjalainen and Karjalainen 1992]. In everyday work a common instrument of this kind is a high-performance liquid chromatograph using UV detection. This type of instrument is used in clinical laboratories for the determination of drugs. Industrial laboratories use it for routine quality control checks. Automatic sample changers keep the machines operating around the clock.

Normally the signals from these instruments are handled by integrators. These devices do not have any internal model of the shape of a chromatographic peak. They use an approximated baseline curve and different ways of isolating the single peaks from each other by graphically plotting separation lines between them. We know well that the peaks are not isolated from each other by separation lines, because in reality the peaks do overlap. The separating lines defined by the integrator can be plotted in many different ways. The results from all these separating lines have an error, but we do not know how large it is. Different integrators use different algorithms and get different results from the same data [Papas and Tougas 1990]. The integration and peak isolation process is repeated with different triggering conditions and settings until the operator using the integrator is satisfied with the result. There remains a large subjective element in the results, because the judgment about the quality of the peak isolation depends primarily on the experience of the operator and on what he considers to be an esthetic solution.

OSCAR gives us the possibility of handling the single-dimensional signals better than the integrators. The calculations are made with the AutoAR program, and no change in the program itself is necessary. The basic idea is to regard the individual samples as "mass numbers" in one chromatogram.
We mentally combine all samples and standards in one analytical batch into a single "virtual GC-MS run" (Fig. 8.2). In this imaginary run the individual samples correspond to single m/z lines. For this data matrix we perform the same analysis as for a GC-MS run. The result is also similar. We obtain the shapes of the elution curves for each component. This is a new kind of result that is not obtained when using the hardware or software of the integrators. The second kind of output we obtain is the intensities for single peaks. These intensities are "the mass spectra" of the solution. In the case of single samples the mass spectra are concentration lists for each kind of compound that is found in the samples. The "mass spectrum lines" are the amounts corresponding to each sample. The spectral lines with the "mass" 1 are the amounts of different compounds in the first sample. The spectral lines with the "mass" 2 are the amounts for the second sample in the batch of samples.

[Figure 8.2: schematic with two parts. "2D observations, one sample": observation matrix and the elution profile for one compound. "1D observations, many samples": a single run, the elution profiles of the compounds and the concentrations for a sample.]

Figure 8.2 Samples from a batch of a single-dimensional chromatographic run can be put together to form a two-dimensional matrix, which can then be analyzed by AutoAR. Before the analysis can be performed the time scales of the chromatograms have to be synchronized. (Reprinted with permission of Elsevier Science Publishers B.V. from the Analytica Chimica Acta, from Karjalainen and Karjalainen 1991).

The difficulty in applying the AutoAR to single-dimensional data is the preprocessing that is necessary before this step. We must initially synchronize all single chromatograms in such a fashion that the peaks in all of them have identical retention times. We must deform the chromatograms in such a way that the tops of corresponding peaks have the same positions on the time axis. At the same time, if we deform the signal shapes along the longitudinal axis we must compensate for any elongation or shortening by making compensatory changes in the intensities of the signal. We must provide for input from the human operator, because the best judge of the overall shape of the chromatogram is an experienced operator. The operator shows the synchronizing software which peaks should be considered identical. The final adjustments that make the peaks fully synchronized in time are then better left to the software. After this synchronization has been performed, the analysis with AutoAR can be made.

There are some general precautions that must be observed when handling single-dimensional samples. There should be some variation in the relative amounts of the peak areas. If the ratio between two chromatographic peaks in all samples is constant, the problem is unsuitable for this kind of analysis. This holds for the pure standards as well. The standards should not be prepared by the simple dilution of one stock solution because this produces "mass traces" that are fully correlated. It requires a bit more effort to make "orthogonal" (uncorrelated) standards when preparing the standard mixtures, but it is an effort that is well spent. Similarly, if we know that the amounts of any two components in all samples are constant or the ratio between the two components remains constant, we cannot use this method for calculating the concentrations. There should be some variation in the relative proportions to make the method possible. If we know before the analysis that the amounts remain fully constant we should add known "spiking" to some of the samples to perturb this ratio. Mathematically, the best results are achieved in those cases where the variations in the concentrations are large because it means that the resulting "spectra" are maximally dissimilar. The more dissimilar, i.e. orthogonal, the spectra are, the better is the precision that we can expect.

The analysis of whole batches of samples as an entity is possible using OSCAR.
The instrument that is best suited for this is the gas chromatograph, because the retention times are very reproducible. Less preprocessing is needed to synchronize the analytical traces. The liquid chromatographs are more difficult. The retention times are not as reproducible as with the gas chromatograph. Additionally, there is interaction between successive peaks and sometimes even between successive samples. It is a common observation that the impurities in some samples contaminate even the runs following them.

The analysis of whole batches of samples as an entity is the logical next step in the evolution of the integration methods. The first generation of the integrators reacted to the signal on the fly. The amount of internal memory in the integrators was so limited that the data for a single sample could not be stored in RAM to permit reintegration of the signal. The current generation of integrators is better because the data can be kept in RAM and they are automatically stored on disk. The main deficiency in the current integrators is the primitive peak model or the lack of a peak model. The "integration" of the signal using OSCAR is the next step. The memory of the computers has now reached the point where a whole batch of samples can be analyzed as an entity. The benefit of this is improved precision in the final results. Other research groups have applied the same idea to analyzing concentrations in industrial processes [Tauler et al. 1993].
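The "virtual GC-MS run" of this section, and the warning about standards made by simple dilution, can be illustrated with a short Python sketch. It is not taken from the AutoAR package: the function names, the correlation limit of 0.999 and the toy chromatograms are all assumptions chosen for illustration, and the chromatograms are taken to be already synchronized.

```python
import math

def stack_runs(chromatograms):
    """Build the matrix: rows = time points, columns = samples ("masses")."""
    n_points = len(chromatograms[0])
    if any(len(c) != n_points for c in chromatograms):
        raise ValueError("chromatograms must be synchronized to equal length")
    return [list(row) for row in zip(*chromatograms)]

def correlation(x, y):
    """Pearson correlation between two traces (0.0 for a constant trace)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def fully_correlated_pairs(chromatograms, limit=0.999):
    """Index pairs of samples whose traces are (nearly) proportional."""
    pairs = []
    for i in range(len(chromatograms)):
        for j in range(i + 1, len(chromatograms)):
            if abs(correlation(chromatograms[i], chromatograms[j])) > limit:
                pairs.append((i, j))
    return pairs

stock = [0.0, 1.0, 4.0, 2.0, 3.0, 1.0, 0.0]
diluted = [v / 2 for v in stock]               # simple dilution of the stock
spiked = [0.0, 1.0, 4.0, 2.0, 6.0, 2.0, 0.0]   # second peak perturbed
runs = [stock, diluted, spiked]

matrix = stack_runs(runs)
print(len(matrix), len(matrix[0]))    # 7 3
print(fully_correlated_pairs(runs))   # [(0, 1)]
```

The diluted standard is flagged because its trace is proportional to the stock solution, whereas the spiked sample changes the ratio between the two peaks and therefore adds new information to the matrix.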
8.2 Other hyphenated instruments
The AutoAR is used to analyze data from many sources other than GC-MS, for which the method was initially developed. The precision of the results depends on the overall statistical properties of the observation matrix. If we have highly overlapping peaks and the spectra of the overlapping components are nearly identical, we cannot expect to get highly precise estimates of the concentrations.

HPLC instruments often are equipped with diode array detectors that cover the optical spectrum in the ultraviolet and visible range. The usual spectral range is between 200 and 800 nanometers. The number of photodiodes varies between 100 and 1000 elements. The spectral features present in the spectra of most samples are so broad that the sampling frequency along the spectral axis is quite sufficient in these instruments. The AutoAR handles this kind of measurement. The background values for the spectra are much higher than those for the mass spectra, but proper values are easily found after some initial experiments. The standard errors that are found for the resulting spectra are higher than for GC-MS data. The standard deviations for the elution curves are higher than the corresponding standard deviations for GC-MS data. This is to be expected due to the higher correlations between different optical spectra. The mass spectra are inherently more dissimilar to each other than the optical spectra. The reason that some users of AR have had difficulties with HPLC-UV-Vis data is the background values, which are higher in optical spectra than in mass spectra. OSCAR searches for the best values of the backgrounds, thereby facilitating the analysis of HPLC-UV-Vis data.

If the information from one chromatographic run is not sufficient, experimental tricks should be used to increase the available information in HPLC. The same sample could be analyzed on two different analytical systems run in parallel.
With this arrangement it is possible to obtain more information about the molecules present in the sample. Because this is an experiment with three-dimensional data, the AutoAR program is not equipped to handle it. The modifications to the AutoAR needed to handle three-dimensional data are not impossible to implement and should serve as a stimulus to the programming reader. The noise in the diode array detectors grows with the aging of the instrument. There is some direct damage in the diodes due to the UV photons. In practice the
detector element should be changed after some years of use, otherwise the signal-to-noise ratio suffers, making the data analysis more difficult. The list of hyphenated instruments is extremely long if we count all examples that have been mentioned in the literature. Infrared spectra can be combined with gas and liquid chromatographic instruments. The quality of the IR signal is generally much worse than for the UV-Vis spectra. The GC-IR signal is best analyzed as part of one extended data matrix. In this extended GC-MS matrix the IR signal has been appended to the mass spectroscopic data as extra "masses". This is not possible with all combined GC-IR-MS instruments because the sample does not flow in synchrony through both instruments. Efforts should be made to convert the IR data into a synchronized format by proper preprocessing because in this extended format the AutoAR can analyze the data.
8.3 Two-dimensional data with internal continuity
There are several combined methods in chemistry that do not possess the unimodal chromatographic peaks in the second dimension. Still, the readings along the second dimension are continuous. A typical example of this kind of information is kinetic experiments. These experiments are often measured by continuously scanning optical spectra with a diode array instrument. The changes in the spectra are smooth as a function of time; there are no arbitrary jumps in the data. A second familiar example is titration, or experiments where the temperature is gradually changed. The spectral dimension can be visible spectra, ultraviolet spectra or infrared spectra. In some cases even mass spectra are registered. An example is the combination of pyrolysis with GC-MS [Windig and Meuzelaar 1984]. With these kinds of data the non-spectral dimension follows a different curve than in chromatography. The curve changes smoothly but there are no clear unimodal maxima as in chromatography.
The AutoAR can serve in these situations if the program is slightly modified. We omit the sorting operation in the "core AR" routine that makes sure that the peaks stay unimodal. With this simple modification, two-dimensional data matrices can be handled without difficulties. Because the requirement to handle non-chromatographic data is frequently encountered, we have put the sorting operation into the AutoAR program as a selectable option. A check box is marked by the user to force unimodality.
We could use other types of constraints here than just the adjustment of baselines in both dimensions. We could gradually "tighten" the smoothness of spectra and concentration curves by some smoothing function that can be adjusted in small steps. This is not necessary in practice because the simpler method of baseline adjustment works well. If other types of constraints are needed as guiding constraints, the AutoAR program should be modified to use these constraint types.
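The selectable sorting step can be pictured with a small sketch. It is only an assumed form of such an operation, not the routine used inside "core AR": the flag forceUnimodal (a stand-in for the check box) and the test curve are hypothetical. The values on each side of the maximum are sorted so that the curve rises to the peak and falls after it.

```matlab
% Sketch only: one possible way to force a curve to be unimodal by sorting.
% forceUnimodal is a hypothetical flag standing in for the check box.
c = [0 2 1 5 9 4 6 1 0]';      % hypothetical elution curve with spurious dips
forceUnimodal = true;          % false: leave the curve as is (kinetics, titrations)

if forceUnimodal
    [~, k] = max(c);                       % position of the highest point
    c(1:k)   = sort(c(1:k), 'ascend');     % rising flank becomes non-decreasing
    c(k:end) = sort(c(k:end), 'descend');  % falling flank becomes non-increasing
end
disp(c')                                   % 0 1 2 5 9 6 4 1 0
```

With forceUnimodal set to false the curve passes through untouched, which corresponds to the non-chromatographic case described above.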
[Figure 8.3: plot titled "IR spectrum of component #8"; x-axis: position in the data vector (0 to 500); y-axis: relative absorbance (0 to 100).]
Figure 8.3 The IR absorbance spectrum of the largest component reconstructed by AutoAR from a set of 300 kidney stones. The numbers on the x-axis are simply the positions in the vector holding the values. The IR data have been collected between wavenumbers of 4000 to 400 cm-1. The spectrum is displayed so that the smaller wavenumbers are located on the left hand side in the figure. The y-axis is scaled here to have the value 100 for the highest peak.
For this kind of data the background values can be very high along the nonspectral axis. Here it is important to provide sufficient dynamic range for AutoAR to find the proper backgrounds.
8.4 Using OSCAR for spectra of discrete samples
The last area where the OSCAR approach has been used is isolating the constituent spectra in batches of distinct samples with no internal continuity in the second dimension. This situation arises when the only possible manipulation that can be made on a sample is taking the spectrum. The samples can represent totally different specimens that have no relation to each other. The only link between the individual samples is that they contain similar components. The proportions of the components vary from sample to sample. The samples can come from individual patients or the environment; there is no need for continuity between them.
[Figure 8.4: plot titled "IR spectrum of component #4"; x-axis: position in the data vector (0 to 500); y-axis: relative absorbance (0 to 100).]
Figure 8.4 The IR absorbance spectrum of component number four found by AutoAR from a set of 300 kidney stones. The numbers on the x-axis are the positions in the vector holding the values. The data have been collected between wavenumbers of 4000 to 400 cm-1. The wavenumbers grow from the left to the right. The y-axis is scaled to have the value 100 for the highest peak in the spectrum.
The second situation where we have two-dimensional data matrices without continuity in the second non-spectral dimension is process control data or quality control data. Samples are taken from a reaction vessel after some manipulation or addition. There is no continuous scanning of spectra that would form a continuity between the successive spectra. The time intervals between discrete samples are typically irregular. The analysis with AutoAR is possible in these cases as well. The unimodality requirement in "core AR" is turned off and the data are analyzed in the conventional way. The result is a set of spectra and concentrations that are maximally reproducible. They may not always correspond to the physical spectra but they represent the solution that is most reproducible, based on the current data set. If two components are present in a constant concentration ratio in all samples, their spectra are fused by the method. There is no way to take them apart because the raw data contain no information about them separately. Still, the analysis of the sum spectra can be rewarding because it gives us ideas about the mechanisms involved. Even in those cases where we notice the fusion of more than one component because of equilibria in the mixtures the knowledge
[Figure: plot panel titled "Species #4"; the remaining axis labels are not recoverable.]
Figure 9.4 The fit between the synthetic and the reconstructed observation matrices is shown here. The fit (vertical axis) is excellent with the lower values of the background constraint (right axis). The results show 30 synthetic spectra (left axis). The problem is that the lower values of the constraint do not reproduce the true solution. This can be seen in the previous picture (Fig. 9.3).
[...] constraints needed for optical spectra. With proper background AR is stable. OSCAR can be used for automation of the spectral extraction process with optical spectra. The recovery experiments are successful only if the range for the number of species that are used in AutoAR includes the number of species that was originally used to generate the synthetic observations. Likewise, the dynamic range of the baselines that is tried out by the AutoAR program must be large enough to cover the range encountered in the original spectra and elution curves.
Recovery experiments are used to show how robust the AR algorithm is in general. In this sense, the algorithm can be validated. The backgrounds for spectra and elution curves are adjusted to the point where the AR process combines good fit with a stable repeatable solution. The validation of a single result spectrum found from experimental data is not possible. The only way to validate a single spectrum is to show that the spectrum corresponds to earlier observations. If the spectrum is unfamiliar and new, the only way to validate it is the experimental route.
The previous experiments show that the AR algorithm behaves in a stable predictable way when the constraints are active. OSCAR can systematically study the best values for constraints. With faster computers, we are rapidly approaching full automation.
9.2 Experimental validation
Experimental validation is started by comparing the results with earlier results. If we are able to identify the spectra found by the AutoAR using library searches we can be relatively confident that the deconvolution has been successful. If the spectrum is not known to us from earlier experiments or if it is not found in spectral searches against large spectral libraries, we must spend much more effort to make certain that we are really dealing with a new entity and not some kind of artifact. The first thing to do is to repeat the numerical analysis. If the spectrum still persists we can repeat the AutoAR process with odd and even spectra. If we still obtain the same spectrum in both data sets we can assume that there is some kind of novel molecule producing the spectrum.
Most mass spectrometers are bought with their own libraries for spectral library searches. The NIST Mass Spectral Search Program can accept mass spectra from several sources. We provide here a MATLAB program (makenist.m, Prog. 9.1) that converts AutoAR data into a form acceptable to the NIST system.
Even in the case of a novel molecular structure we do not know if the new structure is really present in the original unchanged sample material or if it is present due to some artifact in our chemical determination process. It is not simple to prove that a certain molecule exists in the original sample material such as patient serum or some environmental sample. Low concentrations of molecules are particularly difficult in this respect. If the amounts are small we cannot isolate the molecule using some batch process such as crystallizing a molecule after a complete purification. We should vary the isolation process and the chromatographic system as much as possible. If we obtain the same new spectrum from quite different isolation and chromatographic experiments we can be more confident that the compound represents something present in the original sample material.
Even if we can show that the molecule is found in the original sample material, it does not prove that the molecule has some significant role in the system being studied. If the novel compound is a close structural relative of a familiar molecule, such as an administered drug, we can assume that the modified form is generated by metabolism. If the novel compound has no metabolic relatives in other known molecules it is difficult to show that the new structure is really present in the intact system.
Finally, our measuring instruments are so sensitive that we are starting to measure some kind of metabolic noise. We know the structure of some 30,000 steroid molecules. There are probably many more that will be found with better tools. It can be argued that a great number of these molecules are just "molecular accidents" that are formed only because no enzyme is fully specific. Side reactions are catalyzed to a small extent by all enzymes. The concentrations of the molecules formed by these side reactions are too low to have any biological significance in the organism. Still, this molecular noise can be interesting to the many busy investigators who go on enlarging their stamp collections of new molecules.
% MAKENIST.M Converts the mass spectra (S.mat) found
% by AutoAR to the NIST format
%
% Data analysis for hyphenated techniques
% by Erkki & Ulla Karjalainen 1995
load S.mat
[n,m]=size(S)                    % n mass channels, m spectra (one per column)
fid = fopen('NIST.msp','w');
for i=1:m
  fprintf(fid,'Name: SPECTRUM %3.0f\n',i);
  V=S(:,i);
  iso=max(V)                     % base peak of spectrum i
  V=(V > iso/100).*V;            % drop peaks below 1 % of the base peak
  V=(1000/iso)*V;                % scale the base peak to 1000
  vc=sum(V > 0);                 % number of peaks that remain
  fprintf(fid,'Num Peaks: %3.0f\n',vc);
  ct=0;
  for j=1:n
    if V(j) > 0
      ct=ct+1;
      y=[j V(j)];                % mass (row index) and scaled intensity
      fprintf(fid,'%4.0f %5.0f\n',y);
    end
  end
end
fclose(fid);
Figure 9.5 Program makenist.m converts spectra in matrix form to the "NIST-acceptable" format. It adds the necessary header text and gives here a standard name NIST.msp for it.
References
Efron B. The jackknife, the bootstrap, and other resampling plans. Philadelphia: SIAM, 1982, 92 pages.
NIST Mass Spectral Search Program and The NIST/EPA/NIH Mass Spectral Library. Version 1.0. For use with Microsoft® Windows™. U.S. Department of Commerce, Technology Administration, National Institute of Standards and Technology, Standard Reference Data Program, Gaithersburg, MD 20899, U.S.A.
Chapter 10
AR and factor analysis
10.1 Mapping between spectra and PCA
Factor analysis is a well-established method in statistics [Wold et al. 1987] and has found many uses in all branches of science. The method finds new combinations of the variables that can effectively express the information in the original data matrix. If the original data matrix contains redundant variables the factor analysis can compress the information. Factor analysis compresses the original matrix into two smaller matrices. These matrices are called scores and loadings. The terms factor analysis and PCA are used to mean the same technique in chemometrics.
[Figure 10.1: diagram relating scores and loadings to concentrations and spectra. Labels: Scores * F = Conc.; F^-1 * Loadings = Spectra.]
Figure 10.1 The scores of the PCA can be transformed into elution curves (concentrations) with a transformation matrix F. Multiplication by the inverse of F transforms the loadings into spectra.
The results from factor analysis are optimal in a statistical sense. The information that is of interest to the analytical chemist is different. The observation matrix should be expressed as a product of two matrices, concentrations and spectra. There is a clear-cut relationship between the two kinds of solution. The chemical spectra and concentrations can be transformed into the factor analytical solution and vice versa (Fig. 10.1). The "mapping" between the two solution types can be made by a simple matrix multiplication. If we multiply the concentrations C by a transformation matrix F we get the scores matrix. Similarly if the spectrum matrix S is multiplied by the inverse of the F matrix we get the loadings. It follows that the transformations from the factor space into the spectral space are possible by using the inverses of both types of transformation matrices.
%MAPSPEC.M - AR to PCA mapping.
% Data handling for hyphenated techniques
% (c) by Erkki & Ulla Karjalainen 1995
%
clear
C=rand(10,2);                    % synthetic concentrations
S=rand(2,5);                     % synthetic spectra
Obs=C*S;                         % synthetic observation matrix
[U,SS,V]=svd(Obs,0);
US=U*SS;
US=US(:,1:2);
TObs=US*V(:,1:2)';               % reconstruction from two PCA components
D=TObs-Obs;
disp('Difference between original and PCA reconstruction')
absres1 = sum(sum(abs(D)))
F=pinv(C)*U(:,1:2);              % transformation matrix, C*F = U(:,1:2)
G=inv(F);
L=G*S;                           % scaled loadings
FObs=U(:,1:2)*L;
D2=FObs-Obs;
disp('Reconstruction from mapped components')
absres2 = sum(sum(abs(D2)))
disp('Transformation matrix F')
disp(F)
disp('Inverse of matrix F')
disp(G)
Program 10.1 The solution by factor analysis can be transformed into the "chemical" solutions when the "chemical" solution is known.
>>
Difference between original and PCA reconstruction
absres1 =
   5.0168e-15
Reconstruction from mapped components
absres2 =
   9.4820e-15
Transformation matrix F
    0.3627    0.6510
    0.3080   -0.6238
Inverse of matrix F
    1.4617    1.5255
    0.7218   -0.8498
>>
Figure 10.2 The mapping matrix F and its inverse are used to map the spectra and concentration matrices into scores and loadings. The variable absres1 displays the difference between the original observation matrix and the reconstructed observation matrix. The variable absres2 displays the difference between the original matrix and the reconstructed factors formed by mapping.
We can verify these relationships by simple numerical experiments that use MATLAB. The following code fragment (Prog. 10.1, mapspec.m) first constructs a synthetic observation matrix Obs as a product of two matrices. These two matrices S and C contain spectra and concentrations. Next, we perform an SVD or PCA analysis. We verify that the two PCA components are enough to reconstruct the original observations. For the reconstruction we use only two vectors from the solution. As we multiply the two small matrices together they form matrix TObs. The sum of absolute differences, or the L1 norm, between the original Obs and the reconstructed TObs is very small, reflecting only the numerical errors in the calculations.
After this verification we construct the transformation matrix F between the concentrations and scores matrices. Matrix F is obtained by simply multiplying the pseudoinverse of the concentration matrix C with the first two columns of the scores matrix U. The inverse of the transformation matrix F is then obtained by forming a matrix inverse of F. The result is called matrix G. When the spectra matrix S is multiplied from the left by this G matrix the result is a scaled version of the loadings matrix, namely matrix L. The first two columns of scores matrix U and the L matrix are finally multiplied to obtain matrix FObs. We form the L1 norm of the difference between the freshly reconstructed observation matrix and the original observations. We see again that the resulting difference is due to numerical noise only (Fig. 10.2).
The fact that real spectra can be converted into loadings and vice versa is very useful. It does not give us any short-cut to obtain the real spectra, however. We can easily discover the transforming matrix F only after we have reconstructed the spectra by some other method. There are many methods in the literature that start out by first making a factor analysis of the observation matrix. Factor analysis gives us very useful information about the number of components we can expect. Additional steps are required to obtain real spectra and concentrations.
The factor analytical solution has several uses. One way to use it is as a compact way to represent the original data matrix. The original data matrix can be replaced by the factor analytical solution components during several routine calculations. If the original data matrix is highly compressible, there can be major savings in the amount of calculations that are necessary. This way of using the PCA components does not try to transform the PCA solution itself into something else. It simply uses the PCA as a compact stand-in for the original data to get savings in the amount of calculations. The AR process can benefit from this use of PCA as a numerical speedup device [Karjalainen and Karjalainen 1991].
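The mapping that Prog. 10.1 checks numerically can also be written out symbolically. The lines below follow the variable names of the program; the symbol U_2 for the first two columns of U and the superscript + for the pseudoinverse are notation introduced here for illustration only. The step C F = U_2 assumes, as in the synthetic example, that the observation matrix has exactly two components and that C has full column rank.

```latex
\begin{align}
\mathrm{Obs} &= C\,S \\
F &= C^{+}\,U_2 \quad\Longrightarrow\quad C\,F = U_2 \\
G &= F^{-1}, \qquad L = G\,S \\
\mathrm{FObs} &= U_2\,L = C\,F\,F^{-1}\,S = C\,S = \mathrm{Obs}
\end{align}
```

Read from top to bottom, the last line explains why absres2 in the program is zero up to rounding errors.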
10.2 The relative speed of AR calculations
One of the main benefits of the AR algorithm is the speed of the calculations. The typical approach to deconvolution first makes a PCA analysis of the problem. This factor approach produces an answer that is optimal in the statistical sense of the word. The solution produced by PCA is orthogonal: the scores and loadings are a set of orthogonal vectors. The PCA solution is a very compact way to express the variations in the original data matrix. Additionally, the PCA analysis gives us information about the number of components necessary to explain the variation. Unfortunately, the PCA solution is not optimal in the chemical and physical sense. The scores and loadings are not concentrations and spectra. They must be converted by a transformation matrix into the concentrations and spectra. These calculations are generally performed using some iterative algorithms that are not guaranteed to produce the desired answer. The number of iterations needed to find the solution is hard to predict.
AR is an iterative method. The number of iterations needed is rather constant, a satisfactory amount being ten to twenty. If we want to compare the computational effort needed for the PCA-based approach versus the AR, we have to estimate how many calculations are needed for both approaches. To make such a comparison a numerical experiment was performed. Spectra and concentration matrices were initially filled with random numbers. After this they were multiplied together to form a synthetic observation matrix. This matrix was then analyzed by both approaches. The number of AR iterations was set to ten. The PCA analysis was performed only once. To get an idea about the influence of the data matrix, the size of the data matrix was varied in a systematic fashion. It was assumed that five components were involved. Further, the number of spectral lines (or wavelengths etc.) was varied in steps between 50 and 1,000. The amount of floating point operations that were needed was measured in flops, a unit that is proportional to the number of floating point calculations in MATLAB. The proportion of the flops needed for ten iterations of AR was given as a percentage of the number of flops needed for PCA. As we can see from the shape of the curve, with larger matrices AR needs only about one tenth of the effort of the PCA calculation (Fig. 10.3).
[Figure 10.3: plot; x-axis: number of elements in the observation matrix (0.6 to 2, x10^4); y-axis: AR calculations as a percentage of PCA calculations (10 to 22).]
Figure 10.3 The ratio of the execution times between PCA and AR as a function of the problem size. The problem size (the x-axis) is expressed as the number of elements in the observation matrix. The y-axis shows the amount of calculations needed by AR as a percentage of the calculations needed for the PCA.
The savings in calculations can be used for many different purposes. We can repeat the solution process in AR several times. If we spend as much total time for AR calculations as in PCA, we can repeat the solution process about ten times. The additional benefit we get from doing these repeated solutions is an estimate about the reproducibility of the solution. It is not as simple to make an estimate about the reproducibility of the solution using PCA. Simply repeating the PCA calculations produces the same result every time. In AR no two solutions are exactly the same, because the algorithm starts from a different set of random spectra each time the calculations are made.
The savings in calculations in the AR process can be used for other purposes besides getting an idea about the reproducibility of the solution. We can vary constraints in a systematic fashion and find out what is the optimal combination of constraints. The optimal solution gives the best possible fit in combination with the smallest scatter between repeated solutions. The optimum solution is defined to be some combination of fit and scatter. The rapid calculations in the AR process make it feasible to perform extensive searches in the parameter space of constraints.
PCA and AR are complementary. PCA is optimal in a statistical sense. AR is optimal in the sense of being more physical. PCA is maximally orthogonal. AR is minimally orthogonal. PCA produces maximum contrast between the vectors in the solution. AR minimizes the contrast between the vectors. PCA and AR are rotated versions of the same solution.
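A reader who wants to repeat this kind of comparison could start from the sketch below. It rests on several assumptions: the flops counter used in the text is not available in current MATLAB releases, so wall-clock time from tic and toc is measured instead; the loop is a bare alternating least-squares regression with clipping of negative values, not the book's "core AR" routine with its background constraints; and the matrix sizes are hypothetical. Timings for matrices this small are noisy, so the numbers should be read as indicative only.

```matlab
% Sketch only: bare alternating least squares, not the book's "core AR".
nComp = 5;  nScans = 20;  nChan = 1000;   % hypothetical problem size
C = rand(nScans, nComp);                  % random concentrations
S = rand(nComp, nChan);                   % random spectra
Obs = C * S;                              % synthetic observation matrix

tic;
[U, SS, V] = svd(Obs, 0);                 % PCA step, performed once
tPCA = toc;

tic;
Sest = rand(nComp, nChan);                % random starting spectra
for it = 1:10                             % ten iterations, as in the text
    Cest = max(Obs / Sest, 0);            % regress concentrations, clip negatives
    Sest = max(Cest \ Obs, 0);            % regress spectra, clip negatives
end
tAR = toc;

relErr = norm(Obs - Cest * Sest, 'fro') / norm(Obs, 'fro');
fprintf('PCA: %.4f s, 10 AR iterations: %.4f s, relative residual: %.3g\n', ...
        tPCA, tAR, relErr);
```

Varying nChan in steps, as described above, and plotting 100*tAR/tPCA against nScans*nChan would give a rough wall-clock counterpart of Fig. 10.3.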
References
Karjalainen EJ, Karjalainen UP. Component reconstruction in the primary space of spectra and concentrations. Alternating regression and related direct methods. Analytica Chimica Acta 1991; 250: 169-179.
Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics and Intelligent Laboratory Systems 1987; 2: 37-52.
Chapter 11
Looking ahead
11.1 The two kinds of constraints
OSCAR is not a precise algorithm, it is a point of view. OSCAR defines one approach to the deconvolution problem. Deconvolution of overlapping spectra is a mathematical optimization problem. We are looking for a unique, stable solution, i.e. a repeatable solution. The same solution is returned every time. The result does not depend on the starting point. The process rapidly converges to the same point, every time. The philosophy of OSCAR revolves around two kinds of constraints. There are guiding constraints and the structural constraints. Let us define these.
11.2 The guiding constraints
The secret to finding a stable solution to the deconvolution problem is exploring the constraint space. This constraint space is typically three-dimensional. The axes are: the number of chemical species, the spectral background and the elution curve background. We shall stick to this terminology even in those cases where the problem area is different from chromatography. We explore in OSCAR a volume that is spanned by these three axes. We set upper and lower limits to the number of species. This defines one axis. The limits given to the two backgrounds (or baselines) define the two other axes.
We study the modelling error of the AR process as defined by the three guiding constraints. The modelling error is a combination of two kinds of errors. The first error type is the fit that indicates how closely a certain model describes the observations. The second error is scatter, the repeatability of the model. If the AR process is repeated many times, each convergence ends at a different solution. The standard deviation between the convergence endpoints is the scatter. When constraints are tightened, the backgrounds are increased or the number of species is decreased.
The three constraints can be manipulated in a search for the best point of the constraint space. This optimal point gives the lowest combined error. The combined error is a weighted sum of fit and scatter. With the best values for the guiding constraints we get the lowest combined error. At the optimum point of constraint space the AR process tends to always converge to a single point in the result space. At the optimum, the result does not depend on the initial random spectra. We call the spectra and elution curves the result parameters.
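The search for the lowest combined error can be sketched as a lookup on a three-dimensional grid. Everything in the sketch is hypothetical: the grid values, the equal weights, the variable names, and above all the random arrays fitErr and scatErr, which merely stand in for the fit and scatter that repeated AR runs would deliver at each grid point.

```matlab
% Sketch only: all numbers and names below are hypothetical.
nSpecies = 2:6;                    % axis 1: number of chemical species
specBg   = 0:0.02:0.10;            % axis 2: spectral background
elutBg   = 0:0.02:0.10;            % axis 3: elution curve background

% Stand-ins for fit and scatter collected from repeated AR runs.
gridSize = [numel(nSpecies), numel(specBg), numel(elutBg)];
fitErr   = rand(gridSize);
scatErr  = rand(gridSize);

wFit = 0.5;  wScat = 0.5;          % weights chosen by the experimenter
combined = wFit * fitErr + wScat * scatErr;

[bestErr, idx] = min(combined(:)); % lowest combined error on the grid
[iS, iB, iE] = ind2sub(gridSize, idx);
fprintf('Best point: %d species, spectral bg %.2f, elution bg %.2f (error %.3f)\n', ...
        nSpecies(iS), specBg(iB), elutBg(iE), bestErr);
```

An exhaustive grid is the simplest choice; any other search strategy over the same three axes would fit the description equally well.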
11.3 OSCAR as realized by AutoAR
The AutoAR is one incarnation of these ideas. The hunt for the best point in constraint space is guided in AutoAR by the experimenter. He defines the regions of the constraint space that should be studied. When the AR solution process is then allowed to converge repeatedly, the statistics from the trajectories are collected and evaluated. Fit and scatter are combined using weights that look right to the experimenter. When the optimum point in constraint space has been reached, the best point is studied in greater detail. The AR is allowed to converge repeatedly at the optimum point. Repeated AR solutions make it possible to calculate the repeatability of the solution. Standard deviations for the spectra and elution curves can be calculated.
With increasing computer power the process could be made more automated. The computer could be programmed to select autonomously the optimal point in the constraint space. A non-linear optimization program could handle the search. The role of the experimenter would be less direct; only the rules of the optimization would then be set by him. The result parameters would then be automatically extracted.
The type of constraints that is adjusted by the guidance program is probably not very critical. The constraint should be continuously adjustable in a wide range. Background values are ideal in this respect because they can be adjusted in the range 0 to 1, if 1 is used to designate the highest point in a spectrum or elution curve. In many problems, the role of the background adjustments can be replaced by other kinds of constraints. The least informative solution would be spectra and elution curves that have just constant values. Such a solution could be called uniformly gray. We can continuously adjust the spectra and elution curves between a totally unconstrained situation and the situation where only one value is permitted by several kinds of functions.
These functions have a kind of non-parametric flavor to them. These kinds of functions are probably as useful as the backgrounds as constraints. The constraint space could be explored in "flatness space" equally well as in the "background space".
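One function of this kind can be sketched in a line of MATLAB. The linear blend toward the mean, the name flatten and the parameter alpha are assumptions made for illustration; the text does not specify which functions were considered. With alpha equal to 0 the curve is untouched, and with alpha equal to 1 it becomes "uniformly gray".

```matlab
% Sketch only: a linear blend toward the mean is one assumed form of such a function.
flatten = @(v, alpha) (1 - alpha) * v + alpha * mean(v);

v = [0 1 4 9 4 1 0]';              % hypothetical spectrum or elution curve
for alpha = [0 0.25 0.5 1]         % 0 = unconstrained, 1 = "uniformly gray"
    fprintf('alpha = %.2f: %s\n', alpha, mat2str(flatten(v, alpha)', 3));
end
```

Like a background value, alpha is continuously adjustable between 0 and 1, so it could serve as an axis of the constraint space in the same way.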
11.4 The structural constraints
There is a second kind of constraint that need not be adjustable in the same sense as the guiding constraints. These constraints are structural constraints that characterize the model type being optimized with OSCAR. The different model types are encountered with different chemical measurements. When we have discrete samples, the second non-spectral dimension is not continuous between samples. There is no possibility of enforcing the continuity by some kind of constraining operation during the "core AR" process. There is an upper limit to the concentrations but this is not needed for finding the solution.
In titrations, kinetics, and experiments where one adjustable parameter, such as temperature, is changed in a continuous fashion we can enforce continuity between successive points on the concentration axis. This continuity is useful as it reduces the degrees of freedom in the model. The information content of the concentration axes is reduced due to the continuity.
Finally, the strictest constraints that are local to the experiment type are found in a situation with physical separation such as chromatography. This is the situation with hyphenated instruments. The elution curves can be defined to be unimodal. If there are isomers that do not have any differences in the measured spectra, that is another problem, not a problem in defining what is a peak.
Structural constraints can be present in the direction of the spectral axis as well. Mass spectra are line spectra that do not have the continuity properties of many other spectroscopies. The most we can say about continuity of mass spectra is that isotope peaks tend to be situated to the right of the major peaks. With optical spectra such as UV-Vis, IR and NIR there is a continuity between adjacent wavelengths or wavenumbers. This continuity can be enforced during the AR process. When we are processing single one-dimensional chromatograms we do not have any continuity between the points of the spectral axes; the concentrations in single samples are not linked together in some continuous way.
The least amount of continuity is found with line spectra of discrete samples. In chemistry this could be mass spectra recorded directly from mixtures. Two-dimensional NMR spectra from populations of discrete samples can also be handled in this manner. If we have no continuity on the "spectral axis" and no continuity on the "elution profile" axis, we are facing the general statistical situation. The data could be any two-dimensional data matrix that is reduced into proper components. Thus far, the dominant methods have been PCA or factor analysis. OSCAR could be used to take apart large matrices as efficiently as the classical statistical methods. The same number of OSCAR factors describes the original data with the same precision as PCA factors. The difference lies only in the nature of the solution. The OSCAR-based solution is closer to the physical and chemical mechanisms, because everything is expressed as positive entities. The statistical solution, PCA, is optimal in orthogonality. The mapping between the PCA world and the OSCAR world is possible once the transformation matrix has been calculated. The solution found using OSCAR is a rotated version of the PCA solution. It will be interesting to see if the positive nature of the OSCAR solution finds uses in statistics. The OSCAR solution is easier to describe in words than a PCA solution having several negative elements in the scores and loadings. In chemistry these positive vectors are the molecular spectra and concentrations. OSCAR is a start for developing a robust way to automate the extraction of spectra from instrumental readings. As we have shown with the validation experiment, the AR process is stable when the critical value for the background constraint has been exceeded. The optimization of three parameters (the number of components, the spectral background, and the elution curve background) can be trusted to a fast computer. In the not-too-far-off future our instruments will come with a special deconvolution chip.
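The relation between a PCA-type solution and a positive solution can be sketched with a small made-up example. The matrices below are invented for illustration, and the transformation matrix T is computed here from elution curves that are already known; this is an assumption made only to show the mapping. OSCAR itself has to find the transformation by optimization, which this sketch does not attempt.

```matlab
% Made-up two-component data: columns of C are elution curves, rows of S are spectra.
C = [0 1 3 1 0; 0 0 1 3 1]';        % 5 scans x 2 components, all non-negative
S = [1 0 2 0; 0 3 0 1];             % 2 components x 4 spectral channels
X = C * S;                          % noise-free data matrix (5 x 4)

[U, D, V] = svd(X, 0);              % economy-size SVD (the PCA-type solution)
k = 2;                              % number of components
scores   = U(:, 1:k) * D(1:k, 1:k); % abstract scores, may contain negative values
loadings = V(:, 1:k)';              % abstract loadings, orthonormal rows

T = scores \ C;                     % transformation matrix: scores * T = C
C_rot = scores * T;                 % rotated scores, here the positive elution curves
S_rot = T \ loadings;               % rotated loadings, here the positive spectra

err_pca = norm(X - scores * loadings, 'fro');
err_rot = norm(X - C_rot * S_rot, 'fro');
```

Both factorizations reproduce X with the same residual, since the second is the first multiplied by T and its inverse. Only the rotated one consists of non-negative vectors that can be read as curves and spectra.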
PART 2: Computer programs for hyphenated data
Chapter
12
The starting point
The program zzzmenu.m is the main menu of the AutoAR program. All programs are called from this menu program. The overall organization of the programs and the data files is shown in Fig. 12.1.

[Figure 12.1: flow diagram showing the programs PRE, AR, DP, DP3, DAR, POST, DAR2 and POST2 together with the files they use: the data file, the modified data file, params.txt, kombi.mat, current.txt, C.mat, S.mat, CC.mat, SS.mat, FIT.mat, SCATTER.mat, Obs.mat, current2.txt, CCC.mat and SSS.mat.]
Figure 12.1 The organization of programs and data files in AutoAR. The programs are shown in the boxes.

The MATLAB m-files are the following:

zzzpre.m     preprocessing program (Prepro)
zzzar.m      analyzing program (Run_AR)
zzzdp.m      displays the solution in two dimensions (Show_2D)
zzzdp3.m     displays the solution in three dimensions (Show_3D)
zzzdar.m     selects the solution for display (Select)
zzzpost.m    displays the spectra as bar or line spectra and the elution curves
             with or without the total elution curve (Spectra)
zzzdar2.m    selects the solution for display and calculates the confidence
             intervals (Run_Stat)
zzzpost2.m   displays the spectra as bar or line spectra and the elution curves
             with or without the total elution curve, both with their confidence
             ranges (Statis)
zzzorig.m    displays spectra from any original data file (Data)
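The correspondence between menu entries and m-files can be written down as a small lookup table. The sketch below is not part of AutoAR. It uses containers.Map, which did not exist in the MATLAB version the programs were written for, and the variable names programOf, label and cmd are hypothetical. The labels and file names are those of the list above, and the three arguments are the ones with which the menu program starts each subprogram.

```matlab
% Labels and m-file names as given in the list of programs.
labels   = {'Prepro', 'Run_AR', 'Show_2D', 'Show_3D', 'Select', ...
            'Spectra', 'Run_Stat', 'Statis', 'Data'};
programs = {'zzzpre', 'zzzar', 'zzzdp', 'zzzdp3', 'zzzdar', ...
            'zzzpost', 'zzzdar2', 'zzzpost2', 'zzzorig'};
programOf = containers.Map(labels, programs);   % hypothetical lookup table

% Build the call that starts the program belonging to one menu entry.
label = 'Select';
cmd = sprintf('%s(''initpage'', ''initgui'', 0)', programOf(label));
disp(cmd)    % could be passed to eval once the m-files are on the MATLAB path
```

The sketch only builds and displays the command string, so it runs without the AutoAR files being present.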
Next is the complete listing of the program zzzmenu.m.

%ZZZMENU.M the starting point for AutoAR.
% Date: 30 Sep 1995
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function zzzmenu(page, gui, action);
global fg_autoar
global mi_autoar_Quit
global mi_set_Quit
global mn_autoar_Files
global mn_set_Files
global pb_autoar_Data
global pb_autoar_Help
global pb_autoar_Preproc
global pb_autoar_Run_AR
global pb_autoar_Run_stat
global pb_autoar_Select
global pb_autoar_Show_2D
global pb_autoar_Show_3D
global pb_autoar_Spectra
global pb_autoar_Start
global pb_autoar_Statis
global tx_autoar_frame1
global tx_autoar_Tagtxt1
global tx_autoar_Tagtxt2
global tx_autoar_Tagtxt3
if nargin == 0
   page = 'initpage';
   gui = 'initgui';
   action = 0;
end
if strcmp(page, 'initpage')
   fg_autoar = figure;
   zzzmenu('autoar', 'initgui', 0);
   page = '';
end
if strcmp(page, 'autoar')
   figure(fg_autoar)
   set(gcf, 'NumberTitle', 'off', ...
      'Name', 'autoar', ...
      'backingstore', 'off');
   if strcmp(gui, 'initgui')
      zzzmenu('autoar', 'autoar_Menu1', 0);
      zzzmenu('autoar', 'autoar_Menu2', 0);
      zzzmenu('autoar', 'autoar_frame1', 0);
      zzzmenu('autoar', 'autoar_Tagtxt1', 0);
      zzzmenu('autoar', 'autoar_Tagtxt2', 0);
      zzzmenu('autoar', 'autoar_Tagtxt3', 0);
      zzzmenu('autoar', 'autoar_Help', 0);
      zzzmenu('autoar', 'autoar_Preproc', 0);
      set(pb_autoar_Preproc, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Data', 0);
      set(pb_autoar_Data, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Run_AR', 0);
      set(pb_autoar_Run_AR, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Show_2D', 0);
      set(pb_autoar_Show_2D, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Show_3D', 0);
      set(pb_autoar_Show_3D, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Select', 0);
      set(pb_autoar_Select, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Spectra', 0);
      set(pb_autoar_Spectra, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Run_stat', 0);
      set(pb_autoar_Run_stat, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Statis', 0);
      set(pb_autoar_Statis, 'Visible', 'off');
      zzzmenu('autoar', 'autoar_Start', 0);
   end
   if strcmp(gui, 'autoar_Menu1')
      if action == 0
         mn_autoar_Files = uimenu(fg_autoar, ...
            'Label', 'Files');
         mi_autoar_Quit = uimenu(mn_autoar_Files, ...
            'Label', 'Quit', ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_bye'',1);');
      end
   end
   if strcmp(gui, 'set_Menu2')
      if action == 0
         mn_set_Files = uimenu(fg_set, ...
            'Label', 'Exit');
         mi_set_Quit = uimenu(mn_set_Files, ...
            'Label', 'Quit', ...
            'CallBack', 'zzzmenu(''set'',''set_bye'',1);');
      end
   end
   if strcmp(gui, 'autoar_holder')
      if action == 1
      end
   end
   if strcmp(gui, 'autoar_Help')
      if action == 0
         pb_autoar_Help = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.857 0.077 0.129 0.058], ...
            'String', 'Help', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Help'',1);');
      end
      if action == 1
         help helpar
      end
   end
   if strcmp(gui, 'autoar_bye')
      if action == 1
         close(fg_autoar)
      end
   end
   if strcmp(gui, 'autoar_Preproc')
      if action == 0
         pb_autoar_Preproc = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.014 0.077 0.129 0.058], ...
            'String', 'Preproc', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Preproc'',1);');
      end
      if action == 1
         zzzpre('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Run_AR')
      if action == 0
         pb_autoar_Run_AR = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.186 0.077 0.129 0.058], ...
            'String', 'Run_AR', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Run_AR'',1);');
      end
      if action == 1
         zzzar('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Select')
      if action == 0
         pb_autoar_Select = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.514 0.077 0.129 0.058], ...
            'String', 'Select', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Select'',1);');
      end
      if action == 1
         zzzdar('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Spectra')
      if action == 0
         pb_autoar_Spectra = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.514 0.000 0.129 0.058], ...
            'String', 'Spectra', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Spectra'',1);');
      end
      if action == 1
         zzzpost('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Show_2D')
      if action == 0
         pb_autoar_Show_2D = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.186 0.000 0.129 0.058], ...
            'String', 'Show_2D', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Show_2D'',1);');
      end
      if action == 1
         zzzdp('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Show_3D')
      if action == 0
         pb_autoar_Show_3D = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.329 0.000 0.129 0.058], ...
            'String', 'Show_3D', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Show_3D'',1);');
      end
      if action == 1
         zzzdp3('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Start')
      if action == 0
         pb_autoar_Start = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.857 0.000 0.129 0.058], ...
            'String', 'Start', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Start'',1);');
      end
      if action == 1
         load map
         load FIG1
         colormap(map)
         image(FIG1)
         axis('off')
         set(pb_autoar_Start, 'Visible', 'off');
         set(pb_autoar_Preproc, 'Visible', 'on');
         set(pb_autoar_Data, 'Visible', 'on');
         set(pb_autoar_Run_AR, 'Visible', 'on');
         set(pb_autoar_Select, 'Visible', 'on');
         set(pb_autoar_Spectra, 'Visible', 'on');
         set(pb_autoar_Show_2D, 'Visible', 'on');
         set(pb_autoar_Show_3D, 'Visible', 'on');
         set(pb_autoar_Run_stat, 'Visible', 'on');
         set(pb_autoar_Statis, 'Visible', 'on');
      end
   end
   if strcmp(gui, 'autoar_Run_stat')
      if action == 0
         pb_autoar_Run_stat = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.686 0.077 0.129 0.058], ...
            'String', 'Run_stat', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Run_stat'',1);');
      end
      if action == 1
         zzzdar2('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Statis')
      if action == 0
         pb_autoar_Statis = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.686 0.000 0.129 0.058], ...
            'String', 'Statis', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Statis'',1);');
      end
      if action == 1
         zzzpost2('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_Data')
      if action == 0
         pb_autoar_Data = uicontrol(fg_autoar, ...
            'Style', 'pushbutton', ...
            'Units', 'normalized', 'Position', [0.014 0.000 0.129 0.058], ...
            'String', 'Data', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Data'',1);');
      end
      if action == 1
         zzzorig('initpage', 'initgui', 0)
      end
   end
   if strcmp(gui, 'autoar_frame1')
      if action == 0
         tx_autoar_frame1 = uicontrol(fg_autoar, ...
            'Style', 'text', 'BackgroundC', [0.5 0.5 0.5], ...
            'ForegroundC', [1 1 1], ...
            'Units', 'normalized', 'Position', [0.143 0.365 0.343 0.212], ...
            'String', '', ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_frame1'',1);');
      end
   end
   if strcmp(gui, 'autoar_Tagtxt1')
      if action == 0
         tx_autoar_Tagtxt1 = uicontrol(fg_autoar, ...
            'Style', 'text', 'BackgroundC', [0.5 0.5 0.5], ...
            'ForegroundC', [1 1 1], ...
            'Units', 'normalized', 'Position', [0.157 0.500 0.314 0.058], ...
            'String', 'Data Analysis', ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Tagtxt1'',1);');
      end
   end
   if strcmp(gui, 'autoar_Tagtxt2')
      if action == 0
         tx_autoar_Tagtxt2 = uicontrol(fg_autoar, ...
            'Style', 'text', 'BackgroundC', [0.5 0.5 0.5], ...
            'ForegroundC', [1 1 1], ...
            'Units', 'normalized', 'Position', [0.157 0.442 0.314 0.058], ...
            'String', 'for', ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Tagtxt2'',1);');
      end
   end
   if strcmp(gui, 'autoar_Tagtxt3')
      if action == 0
         tx_autoar_Tagtxt3 = uicontrol(fg_autoar, ...
            'Style', 'text', 'BackgroundC', [0.5 0.5 0.5], ...
            'ForegroundC', [1 1 1], ...
            'Units', 'normalized', 'Position', [0.157 0.385 0.314 0.058], ...
            'String', 'Hyphenated Techniques', ...
            'CallBack', 'zzzmenu(''autoar'',''autoar_Tagtxt3'',1);');
      end
   end
end
% ##### End of program #####
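Every control in the listing above is handled in two phases by the same function: a call with action equal to 0 creates the control, and the 'CallBack' string of that control calls the function again with action equal to 1 to carry out the command. The callback is an ordinary MATLAB string, so each quote inside it has to be doubled. The following sketch shows only this quoting rule; the variable names cb, cb2 and same are hypothetical, and no window is opened.

```matlab
% A callback string as it appears in the listing: inner quotes are doubled.
cb = 'zzzmenu(''autoar'',''autoar_Help'',1);';
disp(cb)

% The same string assembled from its parts with sprintf.
page = 'autoar';
gui  = 'autoar_Help';
cb2  = sprintf('zzzmenu(''%s'',''%s'',%d);', page, gui, 1);
same = strcmp(cb, cb2);     % true when both strings are identical
```

When MATLAB evaluates such a string, it sees a normal function call with two string arguments and the number 1.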
Chapter
13
Selecting and preprocessing the raw data
It is always necessary to select the observations that are used in the actual analysis. The data preprocessing program zzzpre.m allows you to segment, transpose or smooth the data matrix. The segmentation process isolates a smaller subset of the raw data. It can also be used to take a look at the data matrix as a whole.

%ZZZPRE.M prepares the data for AR processing.
% Date: 30 Sep 1995
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function zzzpre(page, gui, action);
global BObs
global NF
global nlines
global nofile
global nscans
global Obs
global P
global SavedFile
global segs
global SObs
global T
global trnsp;
global unlines
global unscans
global znspec
[Figure 13.1: flow diagram: data file, PRE, preprocessed data file.]
Figure 13.1 The user can select the data file, which should be in matrix form, either a MAT-file or an ASCII file. The processed matrix can be written into a file for later use.
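The three operations named at the start of this chapter can be tried on a small matrix outside the program. The sketch below is not taken from zzzpre.m: the matrix, the scan limits fscan and lscan, the filter F and the result names SegObs, TObs and SmObs are all made-up illustration values. Only the call conv2(Obs, F, 'same') follows the form used in the program listing.

```matlab
Obs = reshape(1:24, 6, 4);       % stand-in data matrix: 6 scans x 4 spectral lines

% Segment: keep only scans 2 to 5 (arbitrary limits for the illustration).
fscan = 2;
lscan = 5;
SegObs = Obs(fscan:lscan, :);

% Transpose: swap the roles of scans and spectral lines.
TObs = Obs';

% Smooth: convolve with a filter and keep the original matrix size.
F = ones(3, 1) / 3;              % hypothetical 3-point moving average along the scans
SmObs = conv2(Obs, F, 'same');
```

With 'same', conv2 pads the matrix with zeros at its borders, so the first and last scans of SmObs are averaged against zeros and come out lower than their neighbours would suggest.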
global ed_PREset_File1txt
global ed_PREset_Filestxt
global ed_PREset_Filter
global ed_PREset_No_line1
global ed_PREset_No_line2
global ed_PREset_No_scan1
global ed_PREset_No_scan2
global ed_PREset_No_spe
global ed_PREset_Seg1
global ed_PREset_Seg2
global fg_PRErun
global fg_PREset
global fr_PREset_Fileframe
global fr_PREset_Methodframe
global mi_PRErun_Back
global mi_PRErun_Help
global mi_PRErun_Quit
global mi_PRErun_Save
global mi_PRErun_Save_as
global mi_PREset_Help
global mi_PREset_Load
global mi_PREset_Quit
global mi_PREset_Run
global mi_PREset_Save
global mi_PREset_Save_as
global mn_PRErun_Files
global mn_PRErun_GOTO
global mn_PREset_Files
global mn_PREset_GOTO
global pb_PRErun_Back
global pb_PRErun_Help
global pb_PREset_Convolve
global pb_PREset_Factorize
global pb_PREset_Help
global pb_PREset_Load
global pb_PREset_None
global pb_PREset_Run
global pb_PREset_Save
global pb_PREset_Segment
global pb_PREset_svd
global pb_PREset_Transpose
global rb_PREset_Conv_r
global rb_PREset_Nfact_r
global rb_PREset_None_r
global rb_PREset_Segm_r
global rb_PREset_svd_r
global rb_PREset_Trans_r
global tx_PREset_File1header
global tx_PREset_Filesheader
global tx_PREset_Filtertxt
global tx_PREset_No_line1txt
global tx_PREset_No_line2txt
global tx_PREset_No_scan1txt
global tx_PREset_No_scan2txt
global tx_PREset_No_spetxt
global tx_PREset_Paramheader
global tx_PREset_Seg1txt
global tx_PREset_Seg2txt
if nargin == 0
   page = 'initpage';
   gui = 'initgui';
   action = 0;
end
if strcmp(page, 'initpage')
   fg_PREset = figure;
   fg_PRErun = figure;
   zzzpre('PRErun', 'initgui', 0);
   zzzpre('PREset', 'initgui', 0);
   page = '';
end
if strcmp(page, 'PREset')
   figure(fg_PREset)
   set(gcf, 'NumberTitle', 'off', ...
      'Name', 'PREset', ...
      'backingstore', 'off');
   if strcmp(gui, 'initgui')
      zzzpre('PREset', 'PREset_Menu1', 0);
      zzzpre('PREset', 'PREset_Menu2', 0);
      zzzpre('PREset', 'PREset_Fileframe', 0);
      zzzpre('PREset', 'PREset_File1header', 0);
      zzzpre('PREset', 'PREset_Filesheader', 0);
      zzzpre('PREset', 'PREset_Load', 0);
      zzzpre('PREset', 'PREset_File1txt', 0);
      zzzpre('PREset', 'PREset_Save', 0);
      zzzpre('PREset', 'PREset_Filestxt', 0);
      zzzpre('PREset', 'PREset_No_scan1txt', 0);
      zzzpre('PREset', 'PREset_No_scan1', 0);
      zzzpre('PREset', 'PREset_No_scan2txt', 0);
      zzzpre('PREset', 'PREset_No_scan2', 0);
      zzzpre('PREset', 'PREset_No_line1txt', 0);
      zzzpre('PREset', 'PREset_No_line1', 0);
      zzzpre('PREset', 'PREset_No_line2txt', 0);
      zzzpre('PREset', 'PREset_No_line2', 0);
      zzzpre('PREset', 'PREset_Methodframe', 0);
      zzzpre('PREset', 'PREset_Paramheader', 0);
      zzzpre('PREset', 'PREset_None_r', 0);
      zzzpre('PREset', 'PREset_None', 0);
      zzzpre('PREset', 'PREset_Conv_r', 0);
      zzzpre('PREset', 'PREset_Convolve', 0);
      zzzpre('PREset', 'PREset_Filtertxt', 0);
      zzzpre('PREset', 'PREset_Filter', 0);
      zzzpre('PREset', 'PREset_Trans_r', 0);
      zzzpre('PREset', 'PREset_Transpose', 0);
      zzzpre('PREset', 'PREset_Nfact_r', 0);
      zzzpre('PREset', 'PREset_Factorize', 0);
      zzzpre('PREset', 'PREset_Segm_r', 0);
      zzzpre('PREset', 'PREset_Segment', 0);
      zzzpre('PREset', 'PREset_svd_r', 0);
      zzzpre('PREset', 'PREset_svd', 0);
      zzzpre('PREset', 'PREset_Seg1txt', 0);
      zzzpre('PREset', 'PREset_Seg1', 0);
      zzzpre('PREset', 'PREset_Seg2txt', 0);
      zzzpre('PREset', 'PREset_Seg2', 0);
      zzzpre('PREset', 'PREset_No_spetxt', 0);
      zzzpre('PREset', 'PREset_No_spe', 0);
      zzzpre('PREset', 'PREset_Run', 0);
      zzzpre('PREset', 'PREset_Help', 0);
      zzzpre('PREset', 'PREset_Boot', 1);
end if s t r c m p (gui, 'P R E s e t F i l e f r a m e ' ) if a c t i o n == 0 fr_PREset_Fileframe = u i c o n t r o l ( f g _ P R E s e t .... 'S t y l e ' , ' f r a m e ' , 'B a c k g r o u n d C o l o r ' , [ 0 . 5 0 0 . 5 0 0 . 5 0 ] .... 'Units','normalized','Position', [0.043 0.635 0.914 0.269]);
end end i f s t r c m p (gui, 'P R E s e t _ M e t h o d f r a m e ' ) if a c t i o n == 0 fr_PREset_Methodframe = uicontrol ( f g _ P R E s e t .... 'S t y l e ' , ' f r a m e ' , 'B a c k g r o u n d C o l o r ' , [ 0 . 5 0 0 . 5 0 0 . 5 0 ] .... 'Units','normalized','Position', [0.043 0.077 0.914 0.538]); end end if s t r c m p (gui, 'P R E s e t _ M e n u l ' ) if a c t i o n == 0 ran_PREset_Files = u i m e n u (f g _ P R E s e t .... 'Label', 'Files') ; mi_PREset_Load = uimenu (mn_PREsetFiles .... 'Label', 'Load',... l CallBack l , l zzzpre(' l PREset l I , l l PREset_loadthem mi_PREset_Save = uimenu (mn_PREsetFiles .... ' L a b e l ' , ' S a v e ' , ... 'C a l l B a c k ' , ' z z z p r e ( ' 'P R E s e t ' ', ' 'P R E s e t _ s a v e t h e m ' mi_PREset_Save_as = uimenu(mn P R E s e t F i l e s ....
l
l
', i) ; ' ) ;
'Label', 'Save_as',... ,, ,, 'C a l l B a c k ' , ' z z z p r e ( ' 'P R E s e t , PREset_saveasthem' mi_PREset_Quit = uimenu (mn_PREsetFiles ....
end
' L a b e l ' , 'Quit' , . . . 'C a l l B a c k ' , ' z z z p r e ( ' 'P R E s e t '
', ' 'P R E s e t _ b y e '
, I) ; I ) ;
', i) ; ' ) ;
', i) ; ' ) ;
end i f s t r c m p (gui, 'P R E s e t _ M e n u 2 ') if a c t i o n == 0 r a n _ P R E s e t _ C / r i D = u i m e n u ( f g _ P R E s e t .... ' L a b e l ' , 'GOTO' ) ; mi_PREset_Run = u i m e n u ( m n _ P R E s e t _ C K 3 T O .... 'Label', 'Run',... I CallBack l , l zzzpre( l l PREset l I , l I PREset_runthem mi_PREset_Help = uimenu(mn_PREset_C43TO ....
l
l
/
i) ;') ;
end
'Label', 'Help',... 'CallBack', 'zzzpre(' 'PREset'', '' P R E s e t _ H e l p '
', i) ; ') ;
end if s t r c m p (gui, 'P R E s e t _ s e t t h e m ' ) if a c t i o n == 1 f i g u r e (fg_PREset) p a g e = 1; end end if s t r c m p (gui, 'P R E s e t _ r u n t h e m ' ) if a c t i o n == 1 f i g u r e (fg_PRErun) page = 2 ; end end if s t r c m p (gui, 'P R E s e t _ L o a d ' ) if a c t i o n == 0 pb_PREset_Load = u i c o n t r o l ( f g _ P R E s e t .... 'Style', ' p u s h b u t t o n ' , . . . 'Units', 'normalized', 'Position', !0~086 0.769 0~i00 0.058] .... 'String', ' L o a d ' , ' B a c k g r o u n d C o l o r , [0.8 0 8 0.8],... 'C a l l B a c k ' , 'z z z p r e ( ' 'PREset' ', ' 'P R E s e t _ L o a d ' ', I) ; ' ) ; end if a c t i o n == 1 %global nscans %global nlines % g l o b a l Obs % g l o b a l SObs %global BObs z z z p r e ( 'PREset', 'P R E s e t _ L o a d _ f i l e ' , i) if n o f i l e == 0 [nscans, n l i n e s ] : s i z e ( O b s ) ; set ( e d _ P R E s e t _ N o _ s c a n l , 'String', n u m 2 s t r (nscans)) set ( e d _ P R E s e t _ N o _ l i n e l , 'String', n u m 2 s t r (nlines) ) set ( e d _ P R E s e t _ F i l e s txt, 'String' ,n u m 2 s t r ( [ ] ) ) set ( e d _ P R E s e t _ N o _ s c a n 2 , 'String', n u m 2 s t r (nscans)) set ( e d _ P R E s e t _ N o _ l i n e 2 , 'String', n u m 2 s t r (nlines)) set ( e d _ P R E s e t _ F i l t e r , 'String', n u m 2 s t r ( [ ] ) ) set ( e d _ P R E s e t _ S e g l , 'String', n u m 2 s t r ( [ ] ) ) set ( e d _ P R E s e t _ S e g 2 , 'String', n u m 2 s t r ( [ ] ) ) set ( e d _ P R E s e t _ N o _ s p e , 'String', n u m 2 s t r ( [ ] ) ) z z z p r e ( 'PREset', 'P R E s e t _ N o n e ' , i) end end end if s t r c m p (gui, 'P R E s e t _ S a v e ' ) if a c t i o n == 0 pb_PREset_Save = u i c o n t r o l (f g _ P R E s e t .... 'Style', ' p u s h b u t t o n ' , . . . 'Units', 'normalized', 'Position', [0.557 0.769 0. i00 0.058] .... 'String', 'Save', 'B a c k g r o u n d C o l o r ' , [0.8 0.8 0.8 ],. . 'CallBack', 'zzzpre(' 'PREset' ', ''PREset_Save' ,i) ;');
      end
      if action == 1
         zzzpre('PREset', 'PREset_Save_file', 1)
      end
   end
   if strcmp(gui, 'PREset_Load_file')
      if action == 1
         %global nofile
         [filename, path] = uigetfile('*.mat', 'File to load?');
         nofile = 1;
         if filename ~= 0
            rapu = ['load ' filename];
            eval(rapu);
            pit = length(filename);
            vari = filename;
            if pit > 4
               vari = filename(1:pit-4);
            end
            rapu = ['Obs=' vari ';'];
            eval(rapu);
            set(ed_PREset_File1txt, 'String', vari);
            fprintf('Original datafile: '); disp(filename); fprintf('\n');
            trnsp = 0;
            segs = 0;
            nofile = 0;
         end
      end
   end
   if strcmp(gui, 'PREset_Save_file')
      if action == 1
         %Pre_Save.m
         [filename, path] = uiputfile('SObs.mat', 'File to be saved');
         pit = length(filename);
         if filename ~= 0
            vari = filename;
            if pit > 4
               vari = filename(1:pit-4);
            end
            zapu = [vari, ' = SObs;'];
            eval(zapu)
            rapu = ['save ', vari, ' ', vari];
            eval(rapu)
            set(ed_PREset_Filestxt, 'String', vari);
            fprintf('Modified data stored in file: '); disp(filename); fprintf('\n');
         end
      end
   end
   if strcmp(gui, 'PREset_None_r')
      if action == 0
         rb_PREset_None_r = uicontrol(fg_PREset, ...
            'Style', 'radio', ...
            'Units', 'normalized', 'Position', [0.086 0.462 0.029 0.058], ...
            'String', 'None_r', 'BackgroundColor', [0.8 0.8 0.8], ...
            'CallBack', 'zzzpre(''PREset'',''PREset_None_r'',1);');
      end
      if action == 1
         set(rb_PREset_None_r, 'Value', 1);
         set(rb_PREset_Trans_r, 'Value', 0);
         set(rb_PREset_Segm_r, 'Value', 0);
         set(rb_PREset_Conv_r, 'Value', 0);
         set(rb_PREset_Nfact_r, 'Value', 0);
         set(rb_PREset_svd_r, 'Value', 0);
         SObs = Obs;
         set(ed_PREset_No_scan2, 'String', num2str(nscans))
         set(ed_PREset_No_line2, 'String', num2str(nlines))
         trnsp = 0;
         segs = 0;
      end
end if s t r c m p (gui, 'P R E s e t _ N o n e ' ) if a c t i o n == 0 pb_PREset_None = u i c o n t r o l (f g _ P R E s e t .... 'Style', 'pushbutton', ... 'Units', 'normalized', 'Position', [0.129 0.462 0.171 0.058] .... 'String', 'None', 'B a c k g r o u n d C o l o r ' , [0.8 0.8 0.8 ] .... 'C a l l B a c k ' , 'z z z p r e ( ' 'PREset' ', ' 'P R E s e t _ N o n e ' ', I) ; ' ) ; end if a c t i o n == 1 set ( r b _ P R E s e t _ N o n e _ r , 'Value', i) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', 0) ; set ( r b _ P R E s e t _ S e g m _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ C o n v _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ N f a c t _ r , 'Value', 0) ; set ( r b _ P R E s e t _ s v d _ r , 'Value' , 0 ) ; SObs=Obs ; set ( e d _ P R E s e t _ N o _ s c a n 2 , 'String', n u m 2 s t r (nscans)) set ( e d _ P R E s e t _ N o _ l i n e 2 , 'String' ,nitro2str (nlines)) trnsp=0 ; segs=0 ; end end if s t r c m p (gui, 'P R E s e t _ C o n v _ r ' ) if a c t i o n == 0 rb_PREset_Conv_r = u i c o n t r o l ( f g _ P R E s e t .... 'Style', 'radio',... 'Units"'n°rmalized"'P°siti°n"[e'543 0"462 0"0290 " 1 5 8 ] , .... 'String', 'Conv_r', ' B a c k g r o u n d C o l o r ' , [0.8 0.8 0 8] .. 'CallBack', 'zzzpre(' 'PREset'', ' ' P R E s e t _ C o n v _ r ' ' , i) ; ) • end if a c t i o n == 1 set ( r b _ P R E s e t _ N o n e _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', 0) ; set ( r b _ P R E s e t _ S e g m _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ C o n v _ r , 'Value', i) ;
set (rb_PREset_Nfact_r, 'Value', 0) ; set (rb_PREset_svd_r, 'Value', 0 ) ; set (ed_PREset_No_scan2, 'String', n u m 2 s t r (nscans) ) set (ed_PREset_No_line2, 'String', n u m 2 s t r (nlines)) [filename, p a t h ] = u i g e t f i l e ( ' * . m a t ' , ' F i l e to load?'); if f i l e n a m e - = 0 rapu= [' load ' filename] ; eval (rapu) p i t = l e n g t h (filename) ; v a r i = f i l e n a m e (1 :pit-4) ; rapu=['F=' vari ';']; eval (rapu) set (ed_PREset_Filter, 'String', vari) ; d i s p ( [ ' F i l t e r file: ' vari]) SObs=conv2 (Obs, F, 'same' ) ; else set (rb_PREset_Conv_r, 'Value', 0 ) ; set (rb_PREset_None_r, 'Value', 1 ) ; SObs=Obs; end trnsp=0; segs=0;
end end if s t r c m p (gui, 'PREset_Convolve' ) if a c t i o n == 0 p b _ P R E s e t _ C o n v o l v e = u i c o n t r o l (fg_PREset .... 'Style', 'pushbutton',... ' U n i t s ' , ' n o r m a l i z e d ' , ' P o s i t i o n ' , [0.586 0.462 0.171 0.058] .... 'String','Convolve', 'BackgroundColor', [0.8 0.8 0.8] 'CallBack', 'zzzpre ( ' 'PREset' ', ' 'PREset_Convolve' ', i) ); end if a c t i o n == 1 set (rb_PREset_None_r, 'Value', 0) ; set (rb_PREset_Trans_r, 'Value', 0) ; set (rb_PREset_Segm_r, 'Value', 0) ; set (rb_PREset_Conv_r, 'Value', i) ; set (rb_PREset_Nfact_r, 'Value', 0) ; set (rb_PREset_svd_r, 'Value', 0) ; set (ed_PREset_No_scan2, 'String', n u m 2 s t r (nscans)) set (ed_PREset_No_line2, 'String', nun2 str (nlines)) [filename, p a t h ] = u i g e t f i l e ( ' * . m a t ' , ' F i l e to load?'); if f i l e n a m e - = 0 rapu= [ 'load ' f ilename ] ; eval (rapu) p i t = l e n g t h (filename) ; v a r i = f i l e n a m e (i -pit-4) ; rapu=['F=' vari ';']; eval (rapu) set (ed_PREset_Filter, 'String', vari) ; d i s p ( [ ' F i l t e r file- ' vari]) S O b s = c o n v 2 (Obs, F, 'same' ) ; else
set (rb_PREset_Conv_r, 'Value', 0) ; set (rb_PREset_None_r, 'Value', i) ; SObs=Obs;
end trnsp=0; segs=0 ;
end end if strc-~p (gui, 'P R E s e t _ T r a n s _ r ' ) if a c t i o n == 0 r b _ P R E s e t _ T r a n s _ r = u i c o n t r o l (fg_PREset .... 'Style' 'radio', 'Units', 'normalized', 'Position', [0.086 0.365 0.029 0.058] .... 'String','Trans_r','BackgroundColor',[0.8 0.8 0.8] .... 'CallBack', 'zzzpre(' 'PREset'', '' P R E s e t _ T r a n s _ r ' ', I) ; ') ; end if a c t i o n == 1 set (rb_PREset_None_r, 'Value', 0) ; set (rb_PREset_Trans_r, 'Value', i) ; set (rb_PREset_Segm_r, 'Value', 0 ) ; set (rb_PREset_Conv_r, 'Value', 0 ) ; set (rb_PREset_Nfact_r, 'Value', 0) ; set (rb_PREset_svd_r, 'Value', 0) ; if trnsp == 0 trnsp= 1; S O b s = O b s '; else trnsp=0; SObs=Obs; end [unscans, u n l i n e s ] =size (SObs) ; set (ed_PREset_No_scan2, 'String', n u m 2 s t r (unscans) ) set (ed_PREset_No_line2, 'String', n u m 2 s t r (unlines) ) end end if s t r c m p (gui, 'P R E s e t _ T r a n s p o s e ' ) if a c t i o n == 0 p b _ P R E s e t _ T r a n s p o s e = u i c o n t r o l (fg_PREset .... 'Style' 'pushbutton' 'Units', 'normalized', 'Position', [0.129 0.365 0.171 0.058] .... 'String', 'Transpose', 'BackgroundColor', [0.8 0.8 0.8] .... 'CallBack',' zzzpre(' 'PREset'',' ' P R E s e t _ T r a n s p o s e ' ' , i) ;') ; end if a c t i o n == 1 % g l o b a l trnsp; set (rb_PREset_None_r, 'Value', 0) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', I) ; set (rb_PREset_Segm_r, 'Value', 0) ; set (rb_PREset_Conv_r, 'Value', 0) ; set ( r b _ P R E s e t _ N f a c t _ r , 'Value', 0) ; set (rb_PREset_svd_r, 'Value', 0 ) ; if trnsp == 0 t r n s p = 1; ,
SObs=Obs '; else trnsp=0; SObs=Obs;
end [unscans, u n l i n e s ] = s i z e (SObs) ; set ( e d _ P R E s e t _ N o _ s c a n 2 , 'String', n u m 2 s t r (unscans) ) set ( e d _ P R E s e t _ N o _ l i n e 2 , 'String', n u m 2 s t r (unlines) )
end end if s t r c m p ( g u i , ' P R E s e t _ N f a c t _ r ' ) if a c t i o n == 0 r b _ P R E s e t _ N f a c t _ r = u i c o n t r o l (fg_PREset .... 'Style' 'radio', 'Units', 'normalized', 'Position', [0.543 0.308 0.029 0.058] .... 'String','Nfact_r','BackgroundColor', [0.8 0.8 0.8] .... 'C a l l B a c k ' , 'z z z p r e ( ' 'PREset' ', ' 'P R E s e t _ N f a c t _ r ' ', i) ; ' ) ; end if a c t i o n == 1 set ( r b _ P R E s e t _ N o n e _ r , 'Value', 0) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', 0) ; set ( r b _ P R E s e t _ S e g m _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ C o n v _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ N f a c t _ r , 'Value', i) ; set ( r b _ P R E s e t _ s v d _ r , 'Value', 0) ; set ( e d _ P R E s e t _ N o _ s c a n 2 , 'String', n u m 2 s t r (nscans) ) set ( e d _ P R E s e t _ N o _ l i n e 2 , 'String', n u m 2 s t r (nlines)) [NF] = s v d (Obs) ; trnsp=0; segs=0; end end if s t r c m p (gui, 'P R E s e t _ F a c t o r i z e ' ) if a c t i o n == 0 pb_PREset_Factorize = u i c o n t r o l (fg_PREset .... 'Style' 'pushbutton' ' U n i t s ' , ' n o r m a l i z e d ' , ' P o s i t i o n ' , [0.586 0.308 0.171 0.058] .... 'String', 'F a c t o r i z e ' , 'B a c k g r o u n d C o l o r ' , [0.8 0.8 0.8 ] .... 'CallBack', 'zzzpre(' 'PREset' ', ' ' P R E s e t _ F a c t o r i z e ' ', i) ; ') ; end if a c t i o n == 1 %global NF set ( r b _ P R E s e t _ N o n e _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', 0) ; set ( r b _ P R E s e t _ S e g m _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ C o n v _ r , 'Value', 0) ; set ( r b _ P R E s e t _ N f a c t _ r , 'Value', i) ; set ( r b _ P R E s e t _ s v d _ r , 'Value', 0 ) ; set ( e d _ P R E s e t _ N o _ s c a n 2 , 'String', n u m 2 s t r (nscans)) set ( e d _ P R E s e t _ N o _ l i n e 2 , 'String', n u m 2 s t r (nlines) ) [N F ] = s v d (Obs) ; trnsp=0; segs=0; ,
end end if s t r c m p ( g u i , ' P R E s e t _ S e g m _ r ' ) if a c t i o n == 0 rb_PREset_Segm_r = u i c o n t r o l (fg_PREset .... 'Style', 'radio',... 'Units', 'normalized', 'Position', [0.086 0.269 0.029 0.058] .... 'String', 'Segm_r', 'B a c k g r o u n d C o l o r ' , [0.8 0.8 0.8 ] .... 'CallBack','zzzpre(''PREset'',''PREset_Segm_r'',l);'); end if a c t i o n == 1 set ( r b _ P R E s e t _ N o n e _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ T r a n s _ r , 'Value', 0) ; set ( r b _ P R E s e t _ S e g m _ r , 'Value', 1 ) ; set ( r b _ P R E s e t _ C o n v _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ N f a c t _ r , 'Value', 0 ) ; set ( r b _ P R E s e t _ s v d _ r , 'Value', 0) ; [nscans, nlines] = s i z e (Obs) ; f s c a n = s t r 2 n u m (get ( e d _ P R E s e t _ S e g l , 'String' ) ) ; i s c a n = s t r 2 n u m (get ( e d _ P R E s e t _ S e g 2 , 'String' ) ) ; if f s c a n > 0 & f s c a n 0 [nscans, n l i n e s ] = s i z e (Obs) ; BObs=Obs; if n s c a n s > n l i n e s B O b s = O b s ';
        [U1, SS, V1] = svd(BObs, 0);
        U=V1';
        V=U1';
end if nscans 4
          vari = filename(1:pit-4);
        end
        nimi=vari;
        rapu=['Obs=' vari ';'];
        eval(rapu);
        OBS=Obs;
        save OBS OBS
        nofile=0;
      end
    end
  end
  if strcmp(gui, 'ARset_Log_elp_baseline')
    if action == 0
      rb_ARset_Log_elp_baseline = uicontrol(fg_ARset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.543 0.731 0.286 0.058], ...
        'String', 'Log_elp_baseline', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARset'',''ARset_Log_elp_baseline'',1);');
    end
    if action == 1
      set(rb_ARset_Lin_elp_baseline, 'Value', 0);
      set(rb_ARset_Log_elp_baseline, 'Value', 1);
      elpchoice = 2;
    end
  end
  if strcmp(gui, 'ARset_Save_s')
    if action == 1
      [filename, path] = uiputfile('SS.mat', 'Save spectra file:');
      if filename ~= 0
        pit = length(filename);
        vari = filename;
        if pit > 4
          vari = filename(1:pit-4);
        end
        zapu=[vari, ' = SS;'];
        eval(zapu)
        rapu=['save ', vari, ' ', vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui, 'ARset_Save_c')
    if action == 1
      [filename, path] = uiputfile('CC.mat', ...
        'Save elution curves file:');
      if filename ~= 0
        pit = length(filename);
        vari = filename;
        if pit > 4
          vari = filename(1:pit-4);
        end
        zapu=[vari, ' = CC;'];
        eval(zapu)
        rapu=['save ', vari, ' ', vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui, 'ARset_Lin_spect_baseline')
    if action == 0
      rb_ARset_Lin_spect_baseline = uicontrol(fg_ARset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.543 0.538 0.286 0.058], ...
        'String', 'Lin_spect_baseline', ...
        'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARset'',''ARset_Lin_spect_baseline'',1);');
    end
    if action == 1
      %global spectchoice
      set(rb_ARset_Lin_spect_baseline, 'Value', 1);
      set(rb_ARset_Log_spect_baseline, 'Value', 0);
      spectchoice = 1;
    end
  end
  if strcmp(gui, 'ARset_Boot')
    if action == 1
      fid = fopen('params.txt','r');
      gmaxiter = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Maxiter, 'String', num2str(gmaxiter));
      grptct = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Rptct, 'String', num2str(grptct));
      gmaxspec = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Maxspec, 'String', num2str(gmaxspec));
      gminspec = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Minspec, 'String', num2str(gminspec));
      gspectsteps = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Spectsteps, 'String', num2str(gspectsteps));
      gspectmax = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Spectmax, 'String', num2str(gspectmax));
      gspectmin = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Spectmin, 'String', num2str(gspectmin));
      gelpsteps = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Elpsteps, 'String', num2str(gelpsteps));
      gelpmax = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Elpmax, 'String', num2str(gelpmax));
      gelpmin = fscanf(fid, '%f', [1,1]);
      set(ed_ARset_Elpmin, 'String', num2str(gelpmin));
      gtotalsteps = fscanf(fid, '%f', [1,1]);
      spectchoice = fscanf(fid, '%f', [1,1]);
      if spectchoice ~= 1
        set(rb_ARset_Lin_spect_baseline, 'Value', 0);
        set(rb_ARset_Log_spect_baseline, 'Value', 1);
      else
        set(rb_ARset_Lin_spect_baseline, 'Value', 1);
        set(rb_ARset_Log_spect_baseline, 'Value', 0);
      end
      elpchoice = fscanf(fid, '%f', [1,1]);
      if elpchoice ~= 1
        set(rb_ARset_Lin_elp_baseline, 'Value', 0);
        set(rb_ARset_Log_elp_baseline, 'Value', 1);
      else
        set(rb_ARset_Lin_elp_baseline, 'Value', 1);
        set(rb_ARset_Log_elp_baseline, 'Value', 0);
      end
      unichoice = fscanf(fid, '%f', [1,1]);
      if unichoice ~= 1
        set(cb_ARset_Unimodal, 'Value', 0);
      else
        set(cb_ARset_Unimodal, 'Value', 1);
      end
      status = fclose(fid);
      fl_open=0;
    end
  end
  if strcmp(gui, 'ARset_Log_spect_baseline')
    if action == 0
      rb_ARset_Log_spect_baseline = uicontrol(fg_ARset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.543 0.462 0.286 0.058], ...
        'String', 'Log_spect_baseline', ...
        'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARset'',''ARset_Log_spect_baseline'',1);');
    end
    if action == 1
      set(rb_ARset_Lin_spect_baseline, 'Value', 0);
      set(rb_ARset_Log_spect_baseline, 'Value', 1);
      spectchoice = 2;
    end
  end
  if strcmp(gui, 'ARset_Last')
    if action == 1
    end
  end
  if strcmp(gui, 'ARset_loadthem')
    if action == 1
      [filename, pathname] = uigetfile('*.mat', ...
        'Choose a MATLAB Data file', 50, 50);
      nofile=1;
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['load ' SavedFile ';']);
        nofile=0;
      end
    end
  end
  if strcmp(gui, 'ARset_savethem')
    if action == 1
      %global SavedFile
      [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
        50, 50);
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui, 'ARset_saveasthem')
    if action == 1
      if (strcmp(SavedFile, ''))
        [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
          50, 50);
      else
        [filename, pathname] = uiputfile(SavedFile, ...
          'Data Filename', ...
          50, 50);
      end
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui, 'ARset_bye')
    if action == 1
      close(fg_ARset)
      close(fg_ARrun)
    end
  end
  if strcmp(gui, 'ARset_Run')
    if action == 0
      pb_ARset_Run = uicontrol(fg_ARset, ...
        'Style', 'pushbutton', ...
        'Units', 'normalized', 'Position', [0.014 0.000 0.157 0.058], ...
        'String', 'Run', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARset'',''ARset_Run'',1);');
    end
    if action == 1
      %collect the settings
      %global gmaxiter
      %global grptct
      %global gminspec
      %global gmaxspec
      %global gspectsteps
      %global gspectmin
      %global gspectmax
      %global gelpsteps
      %global gelpmax
      %global gelpmin
      %global gtotalsteps
      %global nscans
      %global nlines
      %global Obs
      %global nofile
      zzzar('ARset', 'ARset_Load_file', 1)
      if nofile == 0
        [nscans, nlines]=size(Obs);
        gmaxiter = str2num(get(ed_ARset_Maxiter, 'String'));
        if isempty(gmaxiter)
          gmaxiter = 20;
          set(ed_ARset_Maxiter, 'String', num2str(gmaxiter));
        end
        if gmaxiter 4
          vari = filename(1:pit-4);
        end
        zapu=[vari, ' = SS;'];
        eval(zapu)
        rapu=['save ', vari, ' ', vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui, 'ARrun_Save_c')
    if action == 1
      [filename, path] = uiputfile('CC.mat', ...
        'Save elution curves file:');
      if filename ~= 0
        pit = length(filename);
        vari = filename;
        if pit > 4
          vari = filename(1:pit-4);
        end
        zapu=[vari, ' = CC;'];
        eval(zapu)
        rapu=['save ', vari, ' ', vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui, 'ARrun_Execute')
    if action == 1
      %global gVF
      %global gFF
      %global stepcount
      %global kombi
      TObs=Obs;
      Obs=TObs';
      totalsteps = gspectsteps*gelpsteps*(gmaxspec-gminspec+1)*grptct;
      kombi=zeros(totalsteps, 10);
      tstp=totalsteps/grptct;
      gVF=zeros(tstp, 1);
      gFF=zeros(tstp, 1);
      fprintf('\n*** AutoAR/Run_AR ***\n')
      fprintf('Data file: ')
      disp(nimi)
      fprintf('\n')
      fprintf('Max iterations: %3.0f Repeats: %3.0f Total sets:%3.0f\n', ...
        gmaxiter, grptct, totalsteps)
      if elpchoice == 1
        fprintf('Elution curve baseline: %7.6f - %7.6f Lin steps: %4.0f\n', ...
          gelpmin/100, gelpmax/100, gelpsteps)
      else
        fprintf('Elution curve baseline: %7.6f - %7.6f Log steps: %4.0f\n', ...
          gelpmin/100, gelpmax/100, gelpsteps)
      end
      if spectchoice == 1
        fprintf('Spectral baseline: %7.6f - %7.6f Lin steps: %4.0f\n', ...
          gspectmin/100, gspectmax/100, gspectsteps)
      else
        fprintf('Spectral baseline: %7.6f - %7.6f Log steps: %4.0f\n', ...
          gspectmin/100, gspectmax/100, gspectsteps)
      end
      fprintf('No. of species: %2.0f -%2.0f\n\n', gminspec, gmaxspec)
      if elpchoice == 1
        elpvector = linspace(gelpmin/100, gelpmax/100, gelpsteps);
      else
        elpmi = log10(gelpmin/100);
        elpma = log10(gelpmax/100);
        elpvector = logspace(elpmi, elpma, gelpsteps);
      end
      if spectchoice == 1
        spectvector = linspace(gspectmin/100, ...
          gspectmax/100, gspectsteps);
      else
        spectmi = log10(gspectmin/100);
        spectma = log10(gspectmax/100);
        spectvector = logspace(spectmi, spectma, gspectsteps);
      end
      stepcount = 0;
      disp('set el. curve spectral no. of iteration fit')
      disp('count baseline baseline species per level %')
      rand('seed', 0);
      for elp = elpvector
        for spect = spectvector
          for nspec = gminspec:gmaxspec
            F = zeros(grptct, 1);
            TT = zeros(1, nspec);
            TTT = zeros(grptct, nspec);
            [nlines, nscans] = size(Obs);
            total = norm(Obs, 'fro');
            for rr = 1:grptct
              if rr == 1
                siid1 = rand('seed');
                siid2 = randn('seed');
              end
              S = rand(nlines, nspec);
              bestfit=9999999;
              for iter=1:gmaxiter
                S = (S > zeros(size(S))).*S;
                S = S+0.00000000001*abs(randn(size(S)));
                for j = 1:nspec
                  S(1:nlines, j)=S(1:nlines, j)/sum(S(1:nlines, j));
                end
                if iter == gmaxiter
                  P1=S*C;
                  D1 = Obs-P1;
                  diff = 100*norm(D1,'fro')/total;
                  stepcount = stepcount+1;
                  fprintf('%5.0f %7.6f %7.6f %4.0f %4.0f %5.3f\n', ...
                    stepcount, elp, spect, nspec, iter, diff)
                  set(ed_ARrun_Stepct, 'String', num2str(stepcount));
                end
                C1 = pinv(S);
                C = C1*Obs; % solve concentrations
                C = (C > zeros(size(C))).*C;
                C = C + 0.00000000001*abs(randn(size(C)));
                for j = 1:nspec
                  if unichoice==1
                    % Make the curve monotonic
                    temp = C(j, 1:nscans);
                    [arvo paikka] = max(temp);
                    templs = sort(temp(1:paikka));
                    temprs = sort(temp(paikka:nscans));
                    C(j, 1:nscans) = [templs temprs((nscans-paikka):-1:1)];
                  end
                  W = C(j, 1:nscans);
                  wbig = max(W);
                  wsmall = min(W);
                  wlimit = wbig*elp;
                  if wsmall < wlimit
                    ero = wlimit-wsmall;
                    W=W+ero;
                  end
                  C(j,1:nscans) = W;
                end %j
                % Last iteration route: show the result graphically
                if iter == gmaxiter
                  [WWW, KKK] = max(C');
                  [LLL,MMM] = sort(KKK);
                  C = C(MMM,:);
                  S = S(:,MMM);
                  plot(C')
                  title(['AR iteration #', num2str(iter), ' Fit=', num2str(diff)])
                  xlabel('Scan number')
                  drawnow
                end
                G = pinv(C');
                ST = G*TObs;
                ST = (ST > zeros(size(ST))).*ST;
                ST = ST + 0.00000000001*abs(randn(size(ST)));
                for j = 1:nspec
                  W = ST(j, 1:nlines);
                  wbig = max(W);
                  wsmall = min(W);
                  wlimit = wbig*spect;
                  if wsmall < wlimit
                    ero = wlimit-wsmall;
                    W=W+ero;
                  end
                  ST(j,1:nlines) = W;
                end %j
                S = ST';
              end %iter
              F(rr)=diff;
              TT=sum(C');
              TTT(rr,:)=TT;
            end %rr
            VV = std(TTT);
            MM = mean(TTT);
            sct = stepcount/grptct;
            gVF(sct) = 100*sum(VV)/sum(MM);
            gFF(sct) = mean(F);
            kombi(sct, 1) = nspec;
            kombi(sct, 2) = elp;
            kombi(sct, 3) = spect;
            kombi(sct, 4) = mean(F);
            kombi(sct, 5) = 100*sum(VV)/sum(MM);
            kombi(sct, 6) = siid1;
            kombi(sct, 7) = siid2;
            SS=S;
            CC=C;
            save SS SS
            save CC CC
            KOMBI=kombi;
            save KOMBI KOMBI
            FIT=gFF;
            SCATTER=gVF;
            save FIT FIT
            save SCATTER SCATTER
          end %nspec
        end %spect
      end %elp
      fprintf('\nScatter:\n')
      fprintf('%6.4f %6.4f %6.4f %6.4f %6.4f %6.4f\n', gVF)
      fprintf('\nFit:\n')
      fprintf('%6.4f %6.4f %6.4f %6.4f %6.4f %6.4f\n', gFF)
    end
  end
  if strcmp(gui, 'ARrun_savethem')
    if action == 1
      %global SavedFile
      [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
        50, 50);
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui, 'ARrun_saveasthem')
    if action == 1
      %global SavedFile
      if (strcmp(SavedFile, ''))
        [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
          50, 50);
      else
        [filename, pathname] = uiputfile(SavedFile, ...
          'Data Filename', ...
          50, 50);
      end
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui, 'ARrun_bye')
    if action == 1
      close(fg_ARset)
      close(fg_ARrun)
    end
  end
  if strcmp(gui, 'ARrun_Back')
    if action == 0
      pb_ARrun_Back = uicontrol(fg_ARrun, ...
        'Style', 'pushbutton', ...
        'Units', 'normalized', 'Position', [0.014 0.000 0.157 0.058], ...
        'String', 'Back', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARrun'',''ARrun_Back'',1);');
    end
    if action == 1
      plot(1,'k')
      axis('off')
      figure(fg_ARset)
      page = 1;
    end
  end
  if strcmp(gui, 'ARrun_Help')
    if action == 0
      pb_ARrun_Help = uicontrol(fg_ARrun, ...
        'Style', 'pushbutton', ...
        'Units', 'normalized', 'Position', [0.857 0.000 0.143 0.058], ...
        'String', 'Help', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzar(''ARrun'',''ARrun_Help'',1);');
    end
    if action == 1
      help helparl;
    end
  end
  if strcmp(gui, 'ARrun_Stepct')
    if action == 0
      ed_ARrun_Stepct = uicontrol(fg_ARrun, ...
        'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.657 0.000 0.086 0.058], ...
        'String', '', ...
        'CallBack', 'zzzar(''ARrun'',''ARrun_Stepct'',1);');
    end
  end
  if strcmp(gui, 'ARrun_Maxiter')
    if action == 0
      ed_ARrun_Maxiter = uicontrol(fg_ARrun, ...
        'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.757 0.000 0.086 0.058], ...
        'String', '', ...
        'CallBack', 'zzzar(''ARrun'',''ARrun_Maxiter'',1);');
    end
  end
end
% ##### End of program #####
Chapter
15
AR statistics in two dimensions
The results of the AR calculations can be inspected in two dimensions with the program zzzdp.m. The user can then decide which combination of constraints best represents the underlying components. The statistics can be displayed as a function of any of the constraints that were varied in the numerical AR calculations.

%ZZZDP.M shows the effect of constraints on AR in two dimensions.
% Date: 30 Sep 1995
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function zzzdp(page, gui, action);
global elpchoice
global gelpmax
global gelpmin
global gelpsteps
global gmaxiter
global gmaxspec
global gminspec
global grptct
global gspectmax
global gspectmin
global gspectsteps
global gtotalsteps
global kombi
global p1button
(Diagram: the files params.txt and kombi.mat feed the program DP.)

Figure 15.1 Program zzzdp.m reads the parameter ranges from the file params.txt and the fit and the scatter from the file kombi.mat. It displays on the screen the fit, the scatter and the combined error variable as a function of a selected parameter: the elution curve baseline, the spectral baseline or the number of components. The two other parameters are fixed.
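The selection described in Figure 15.1 can be sketched in a few lines. This is only an illustration, not the code of zzzdp.m: the toy numbers are invented, the tolerance tol and the names xcol, fixcols and fixvals are hypothetical, and the column layout (1 = number of species, 2 = elution curve baseline, 3 = spectral baseline, 4 = fit, 5 = scatter) is assumed from the way the AR run program fills the kombi table.

```matlab
% Toy kombi table (made-up numbers, for illustration only).
% Assumed columns: 1 = number of species, 2 = elution curve baseline,
% 3 = spectral baseline, 4 = fit, 5 = scatter.
kombi = [2 0.01 0.001 5.2 3.1;
         3 0.01 0.001 2.4 1.8;
         4 0.01 0.001 2.1 6.5;
         2 0.05 0.001 5.6 3.4;
         3 0.05 0.001 2.9 2.2;
         4 0.05 0.001 2.7 7.0];

xcol    = 1;             % parameter shown on the x axis: number of species
fixcols = [2 3];         % the two parameters that are held fixed
fixvals = [0.01 0.001];  % their chosen values
tol     = 1e-9;          % hypothetical tolerance for comparing baselines

% Keep only the rows in which both fixed parameters match.
keep = true(size(kombi, 1), 1);
for k = 1:numel(fixcols)
  keep = keep & (abs(kombi(:, fixcols(k)) - fixvals(k)) <= tol);
end
rows = kombi(keep, :);

% Sort the slice along the x-axis parameter and pull out the statistics.
[x, order] = sort(rows(:, xcol));
fit     = rows(order, 4);
scatter = rows(order, 5);
```

With these toy values the slice contains the three rows for 2, 3 and 4 species at the first elution curve baseline; a call such as plot(x, fit) would then show the fit as a function of the number of species.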
global p2button
global par1label
global par1val
global par2label
global par2val
global SavedFile
global spectchoice
global stepcount
global xchoice
global ychoice
global ed_DPset_P1T1
global ed_DPset_P1T10
global ed_DPset_P1T2
global ed_DPset_P1T3
global ed_DPset_P1T4
global ed_DPset_P1T5
global ed_DPset_P1T6
global ed_DPset_P1T7
global ed_DPset_P1T8
global ed_DPset_P1T9
global ed_DPset_P2T1
global ed_DPset_P2T10
global ed_DPset_P2T2
global ed_DPset_P2T3
global ed_DPset_P2T4
global ed_DPset_P2T5
global ed_DPset_P2T6
global ed_DPset_P2T7
global ed_DPset_P2T8
global ed_DPset_P2T9
global ed_DPset_Par1txt
global ed_DPset_Par2txt
global ed_DPset_Xaxistxt
global ed_DPset_Yaxistxt
global fg_DPlook
global fg_DPset
global fr_DPset_Par1frame
global fr_DPset_Par2frame
global fr_DPset_Xaxisframe
global fr_DPset_Yaxisframe
global mi_DPlook_Back
global mi_DPlook_Help
global mi_DPlook_Quit
global mi_DPset_Help
global mi_DPset_Look
global mi_DPset_Quit
global mn_DPlook_Files
global mn_DPlook_GOTO
global mn_DPset_Files
global mn_DPset_GOTO
global pb_DPlook_Back
global pb_DPlook_Help
global pb_DPset_Help
global pb_DPset_Look
global rb_DPset_All
global rb_DPset_Both
global rb_DPset_Combined
global rb_DPset_Elution_baseline
global rb_DPset_Fit
global rb_DPset_No_of_species
global rb_DPset_P1B1
global rb_DPset_P1B10
global rb_DPset_P1B2
global rb_DPset_P1B3
global rb_DPset_P1B4
global rb_DPset_P1B5
global rb_DPset_P1B6
global rb_DPset_P1B7
global rb_DPset_P1B8
global rb_DPset_P1B9
global rb_DPset_P2B1
global rb_DPset_P2B10
global rb_DPset_P2B2
global rb_DPset_P2B3
global rb_DPset_P2B4
global rb_DPset_P2B5
global rb_DPset_P2B6
global rb_DPset_P2B7
global rb_DPset_P2B8
global rb_DPset_P2B9
global rb_DPset_Scatter
global rb_DPset_Spectral_baseline
if nargin == 0
  page = 'initpage';
  gui = 'initgui';
  action = 0;
end
if strcmp(page, 'initpage')
  fg_DPset = figure;
  fg_DPlook = figure;
  zzzdp('DPlook', 'initgui', 0);
  zzzdp('DPset', 'initgui', 0);
  page = '';
end
if strcmp(page, 'DPset')
  figure(fg_DPset)
  set(gcf, 'NumberTitle', 'off', ...
    'Name', 'DPset', ...
    'backingstore', 'off');
  if strcmp(gui, 'initgui')
    zzzdp('DPset', 'DPset_Menu1', 0);
    zzzdp('DPset', 'DPset_Menu2', 0);
    zzzdp('DPset', 'DPset_Xaxisframe', 0);
    zzzdp('DPset', 'DPset_Par1frame', 0);
    zzzdp('DPset', 'DPset_Par2frame', 0);
    zzzdp('DPset', 'DPset_Xaxistxt', 0);
    zzzdp('DPset', 'DPset_Par1txt', 0);
    zzzdp('DPset', 'DPset_Par2txt', 0);
    zzzdp('DPset', 'DPset_No_of_species', 0);
    zzzdp('DPset', 'DPset_P1B1', 0);
    zzzdp('DPset', 'DPset_P1T1', 0);
    zzzdp('DPset', 'DPset_P2B1', 0);
    zzzdp('DPset', 'DPset_P2T1', 0);
    zzzdp('DPset', 'DPset_Elution_baseline', 0);
    zzzdp('DPset', 'DPset_P1B2', 0);
    zzzdp('DPset', 'DPset_P1T2', 0);
    zzzdp('DPset', 'DPset_P2B2', 0);
    zzzdp('DPset', 'DPset_P2T2', 0);
    zzzdp('DPset', 'DPset_Spectral_baseline', 0);
    zzzdp('DPset', 'DPset_P1B3', 0);
    zzzdp('DPset', 'DPset_P1T3', 0);
    zzzdp('DPset', 'DPset_P2B3', 0);
    zzzdp('DPset', 'DPset_P2T3', 0);
    zzzdp('DPset', 'DPset_P1B4', 0);
    zzzdp('DPset', 'DPset_P1T4', 0);
    zzzdp('DPset', 'DPset_P2B4', 0);
    zzzdp('DPset', 'DPset_P2T4', 0);
    zzzdp('DPset', 'DPset_P1B5', 0);
    zzzdp('DPset', 'DPset_P1T5', 0);
    zzzdp('DPset', 'DPset_P2B5', 0);
    zzzdp('DPset', 'DPset_P2T5', 0);
    zzzdp('DPset', 'DPset_P1B6', 0);
    zzzdp('DPset', 'DPset_P1T6', 0);
    zzzdp('DPset', 'DPset_P2B6', 0);
    zzzdp('DPset', 'DPset_P2T6', 0);
    zzzdp('DPset', 'DPset_Yaxisframe', 0);
    zzzdp('DPset', 'DPset_Yaxistxt', 0);
    zzzdp('DPset', 'DPset_P1B7', 0);
    zzzdp('DPset', 'DPset_P1T7', 0);
    zzzdp('DPset', 'DPset_P2B7', 0);
    zzzdp('DPset', 'DPset_P2T7', 0);
    zzzdp('DPset', 'DPset_Fit', 0);
    zzzdp('DPset', 'DPset_P1B8', 0);
    zzzdp('DPset', 'DPset_P1T8', 0);
    zzzdp('DPset', 'DPset_P2B8', 0);
    zzzdp('DPset', 'DPset_P2T8', 0);
    zzzdp('DPset', 'DPset_Scatter', 0);
    zzzdp('DPset', 'DPset_P1B9', 0);
    zzzdp('DPset', 'DPset_P1T9', 0);
    zzzdp('DPset', 'DPset_P2B9', 0);
    zzzdp('DPset', 'DPset_P2T9', 0);
    zzzdp('DPset', 'DPset_Both', 0);
    zzzdp('DPset', 'DPset_P1B10', 0);
    zzzdp('DPset', 'DPset_P1T10', 0);
    zzzdp('DPset', 'DPset_P2B10', 0);
    zzzdp('DPset', 'DPset_P2T10', 0);
    zzzdp('DPset', 'DPset_Combined', 0);
    zzzdp('DPset', 'DPset_All', 0);
    zzzdp('DPset', 'DPset_Look', 0);
    zzzdp('DPset', 'DPset_Help', 0);
  end
  if strcmp(gui, 'DPset_Xaxisframe')
    if action == 0
      fr_DPset_Xaxisframe = uicontrol(fg_DPset, ...
        'Style', 'frame', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.014 0.538 0.300 0.385]);
    end
  end
  if strcmp(gui, 'DPset_Par1frame')
    if action == 0
      fr_DPset_Par1frame = uicontrol(fg_DPset, ...
        'Style', 'frame', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.329 0.077 0.286 0.846]);
    end
  end
  if strcmp(gui, 'DPset_Par2frame')
    if action == 0
      fr_DPset_Par2frame = uicontrol(fg_DPset, ...
        'Style', 'frame', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.629 0.077 0.286 0.846]);
    end
  end
  if strcmp(gui, 'DPset_Yaxisframe')
    if action == 0
      fr_DPset_Yaxisframe = uicontrol(fg_DPset, ...
        'Style', 'frame', 'BackgroundColor', [0.50 0.50 0.50], ...
        'Units', 'normalized', 'Position', [0.014 0.077 0.300 0.442]);
    end
  end
  if strcmp(gui, 'DPset_Menu1')
    if action == 0
      mn_DPset_Files = uimenu(fg_DPset, ...
        'Label', 'Files');
      mi_DPset_Quit = uimenu(mn_DPset_Files, ...
        'Label', 'Quit', ...
        'CallBack', 'zzzdp(''DPset'',''DPset_bye'',1);');
    end
  end
  if strcmp(gui, 'DPset_Menu2')
    if action == 0
      mn_DPset_GOTO = uimenu(fg_DPset, ...
        'Label', 'GOTO');
      mi_DPset_Look = uimenu(mn_DPset_GOTO, ...
        'Label', 'Look', ...
        'CallBack', 'zzzdp(''DPset'',''DPset_lookthem'',1);');
      mi_DPset_Help = uimenu(mn_DPset_GOTO, ...
        'Label', 'Help', ...
        'CallBack', 'zzzdp(''DPset'',''DPset_Help'',1);');
    end
  end
  if strcmp(gui, 'DPset_setthem')
    if action == 1
      figure(fg_DPset)
      page = 1;
    end
  end
  if strcmp(gui, 'DPset_lookthem')
    if action == 1
      figure(fg_DPlook)
      page = 3;
    end
  end
  if strcmp(gui, 'DPset_No_of_species')
    if action == 0
      rb_DPset_No_of_species = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.029 0.769 0.271 0.058], ...
        'String', 'No_of_species', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_No_of_species'',1);');
    end
    if action == 1
      %global p1button
      %global p2button
      %global xchoice
      %global elpchoice
      %global spectchoice
      set(rb_DPset_No_of_species, 'Value', 1);
      set(rb_DPset_Elution_baseline, 'Value', 0);
      set(rb_DPset_Spectral_baseline, 'Value', 0);
      zzzdp('DPset', 'DPset_loadthem', 1);
      par1label = 'Elution curve baseline';
      par2label = 'Spectral baseline';
      p1button = zeros(10, 1);
      p2button = zeros(10, 1);
      if elpchoice == 1
        elpvector = linspace(gelpmin/100, gelpmax/100, gelpsteps);
      else
        elpmi = log10(gelpmin/100);
        elpma = log10(gelpmax/100);
        elpvector = logspace(elpmi, elpma, gelpsteps);
      end
      if spectchoice == 1
        spectvector = linspace(gspectmin/100, ...
          gspectmax/100, gspectsteps);
      else
        spectmi = log10(gspectmin/100);
        spectma = log10(gspectmax/100);
        spectvector = logspace(spectmi, spectma, gspectsteps);
      end
      elpaskel = (gelpmax-gelpmin)/(gelpsteps-1);
      ct = 0;
      for elp = gelpmin:elpaskel:gelpmax
        ct = ct + 1;
      end
      spectaskel = (gspectmax-gspectmin)/(gspectsteps-1);
      ct2 = 0;
      for spect = gspectmin:spectaskel:gspectmax
        ct2 = ct2 + 1;
      end
      p1button(1:ct) = elpvector;
      p2button(1:ct2) = spectvector;
      xchoice = 1;
      zzzdp('DPset', 'DPset_showthem', 1);
    end
  end
  if strcmp(gui, 'DPset_P1B1')
    if action == 0
      rb_DPset_P1B1 = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.357 0.769 0.029 0.058], ...
        'String', 'P1B1', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_P1B1'',1);');
    end
    if action == 1
      set(rb_DPset_P1B1, 'Value', 1);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
    end
  end
  if strcmp(gui, 'DPset_P2B1')
    if action == 0
      rb_DPset_P2B1 = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.657 0.769 0.029 0.058], ...
        'String', 'P2B1', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_P2B1'',1);');
    end
    if action == 1
      set(rb_DPset_P2B1, 'Value', 1);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
    end
  end
  if strcmp(gui, 'DPset_Elution_baseline')
    if action == 0
      rb_DPset_Elution_baseline = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.029 0.712 0.271 0.058], ...
        'String', 'Elution_baseline', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_Elution_baseline'',1);');
    end
    if action == 1
      set(rb_DPset_No_of_species, 'Value', 0);
      set(rb_DPset_Elution_baseline, 'Value', 1);
      set(rb_DPset_Spectral_baseline, 'Value', 0);
      zzzdp('DPset', 'DPset_loadthem', 1);
      par1label = 'Number of species';
      par2label = 'Spectral baseline';
      p1button=zeros(10, 1);
      p2button=zeros(10, 1);
      ct = 0;
      for ii = gminspec:gmaxspec
        ct = ct + 1;
        p1button(ct) = ii;
      end
      if spectchoice == 1
        spectvector = linspace(gspectmin/100, ...
          gspectmax/100, gspectsteps);
      else
        spectmi = log10(gspectmin/100);
        spectma = log10(gspectmax/100);
        spectvector = logspace(spectmi, spectma, gspectsteps);
      end
      spectaskel = (gspectmax-gspectmin)/(gspectsteps-1);
      ct2 = 0;
      for spect = gspectmin:spectaskel:gspectmax
        ct2 = ct2 + 1;
      end
      p2button(1:ct2) = spectvector;
      zzzdp('DPset', 'DPset_showthem', 1);
      xchoice = 2;
    end
  end
  if strcmp(gui, 'DPset_P1B2')
    if action == 0
      rb_DPset_P1B2 = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.357 0.712 0.029 0.058], ...
        'String', 'P1B2', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_P1B2'',1);');
    end
    if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 1);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
    end
  end
  if strcmp(gui, 'DPset_P2B2')
    if action == 0
      rb_DPset_P2B2 = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.657 0.712 0.029 0.058], ...
        'String', 'P2B2', 'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_P2B2'',1);');
    end
    if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 1);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
    end
  end
  if strcmp(gui, 'DPset_Spectral_baseline')
    if action == 0
      rb_DPset_Spectral_baseline = uicontrol(fg_DPset, ...
        'Style', 'radio', ...
        'Units', 'normalized', 'Position', [0.029 0.654 0.271 0.058], ...
        'String', 'Spectral_baseline', ...
        'BackgroundColor', [0.8 0.8 0.8], ...
        'CallBack', 'zzzdp(''DPset'',''DPset_Spectral_baseline'',1);');
    end
    if action == 1
      set(rb_DPset_No_of_species, 'Value', 0);
      set(rb_DPset_Elution_baseline, 'Value', 0);
      set(rb_DPset_Spectral_baseline, 'Value', 1);
      zzzdp('DPset', 'DPset_loadthem', 1);
      par1label = 'Number of species';
      par2label = 'Elution curve baseline';
      p1button=zeros(10, 1);
      p2button=zeros(10, 1);
      ct = 0;
      for ii = gminspec:gmaxspec
        ct = ct + 1;
        p1button(ct) = ii;
      end
      if elpchoice == 1
        elpvector = linspace(gelpmin/100, gelpmax/100, gelpsteps);
      else
         elpmi = log10(gelpmin/100);
         elpma = log10(gelpmax/100);
         elpvector = logspace(elpmi, elpma, gelpsteps);
      end
      elpaskel = (gelpmax-gelpmin) / (gelpsteps-1);
      ct2 = 0;
      for elp = gelpmin:elpaskel:gelpmax
         ct2 = ct2 + 1;
      end
      p2button(1:ct2) = elpvector;
      zzzdp('DPset', 'DPset_showthem', 1);
      xchoice = 3;
   end
end
if strcmp(gui, 'DPset_P1B3')
   if action == 0
      rb_DPset_P1B3 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.654 0.029 0.058], ...
         'String', 'P1B3', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B3'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 1);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B3')
   if action == 0
      rb_DPset_P2B3 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.654 0.029 0.058], ...
         'String', 'P2B3', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B3'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 1);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P1B4')
   if action == 0
      rb_DPset_P1B4 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.596 0.029 0.058], ...
         'String', 'P1B4', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B4'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 1);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B4')
   if action == 0
      rb_DPset_P2B4 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.596 0.029 0.058], ...
         'String', 'P2B4', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B4'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 1);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P1B5')
   if action == 0
      rb_DPset_P1B5 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.538 0.029 0.058], ...
         'String', 'P1B5', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B5'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 1);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B5')
   if action == 0
      rb_DPset_P2B5 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.538 0.029 0.058], ...
         'String', 'P2B5', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B5'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 1);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P1B6')
   if action == 0
      rb_DPset_P1B6 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.481 0.029 0.058], ...
         'String', 'P1B6', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B6'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 1);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B6')
   if action == 0
      rb_DPset_P2B6 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.481 0.029 0.058], ...
         'String', 'P2B6', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B6'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 1);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P1B7')
   if action == 0
      rb_DPset_P1B7 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.423 0.029 0.058], ...
         'String', 'P1B7', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B7'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 1);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B7')
   if action == 0
      rb_DPset_P2B7 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.423 0.029 0.058], ...
         'String', 'P2B7', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B7'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 1);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_Fit')
   if action == 0
      rb_DPset_Fit = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.029 0.365 0.271 0.058], ...
         'String', 'Fit', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Fit'',1);');
   end
   if action == 1
      %global ychoice
      set(rb_DPset_Fit, 'Value', 1);
      set(rb_DPset_Scatter, 'Value', 0);
      set(rb_DPset_Both, 'Value', 0);
      set(rb_DPset_Combined, 'Value', 0);
      set(rb_DPset_All, 'Value', 0);
      ychoice = 1;
   end
end
if strcmp(gui, 'DPset_P1B8')
   if action == 0
      rb_DPset_P1B8 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.365 0.029 0.058], ...
         'String', 'P1B8', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B8'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 1);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B8')
   if action == 0
      rb_DPset_P2B8 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.365 0.029 0.058], ...
         'String', 'P2B8', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B8'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 1);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_Scatter')
   if action == 0
      rb_DPset_Scatter = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.029 0.308 0.271 0.058], ...
         'String', 'Scatter', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Scatter'',1);');
   end
   if action == 1
      set(rb_DPset_Fit, 'Value', 0);
      set(rb_DPset_Scatter, 'Value', 1);
      set(rb_DPset_Both, 'Value', 0);
      set(rb_DPset_Combined, 'Value', 0);
      set(rb_DPset_All, 'Value', 0);
      ychoice = 2;
   end
end
if strcmp(gui, 'DPset_P1B9')
   if action == 0
      rb_DPset_P1B9 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.308 0.029 0.058], ...
         'String', 'P1B9', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B9'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 1);
      set(rb_DPset_P1B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_P2B9')
   if action == 0
      rb_DPset_P2B9 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.308 0.029 0.058], ...
         'String', 'P2B9', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B9'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 1);
      set(rb_DPset_P2B10, 'Value', 0);
   end
end
if strcmp(gui, 'DPset_showthem')
   if action == 1
      %global stepcount
      %global par1label
      %global par2label
      set(ed_DPset_Par1txt, 'String', par1label);
      set(ed_DPset_Par2txt, 'String', par2label);
      set(ed_DPset_P1T1, 'String', num2str(p1button(1)));
      if p1button(2) == 0
         set(ed_DPset_P1T2, 'Visible', 'off');
         set(rb_DPset_P1B2, 'Visible', 'off');
      else
         set(ed_DPset_P1T2, 'Visible', 'on');
         set(rb_DPset_P1B2, 'Visible', 'on');
         set(ed_DPset_P1T2, 'String', num2str(p1button(2)));
      end
      if p1button(3) == 0
         set(ed_DPset_P1T3, 'Visible', 'off');
         set(rb_DPset_P1B3, 'Visible', 'off');
      else
         set(ed_DPset_P1T3, 'Visible', 'on');
         set(rb_DPset_P1B3, 'Visible', 'on');
         set(ed_DPset_P1T3, 'String', num2str(p1button(3)));
      end
      if p1button(4) == 0
         set(ed_DPset_P1T4, 'Visible', 'off');
         set(rb_DPset_P1B4, 'Visible', 'off');
      else
         set(ed_DPset_P1T4, 'Visible', 'on');
         set(rb_DPset_P1B4, 'Visible', 'on');
         set(ed_DPset_P1T4, 'String', num2str(p1button(4)));
      end
      if p1button(5) == 0
         set(ed_DPset_P1T5, 'Visible', 'off');
         set(rb_DPset_P1B5, 'Visible', 'off');
      else
         set(ed_DPset_P1T5, 'Visible', 'on');
         set(rb_DPset_P1B5, 'Visible', 'on');
         set(ed_DPset_P1T5, 'String', num2str(p1button(5)));
      end
      if p1button(6) == 0
         set(ed_DPset_P1T6, 'Visible', 'off');
         set(rb_DPset_P1B6, 'Visible', 'off');
      else
         set(ed_DPset_P1T6, 'Visible', 'on');
         set(rb_DPset_P1B6, 'Visible', 'on');
         set(ed_DPset_P1T6, 'String', num2str(p1button(6)));
      end
      if p1button(7) == 0
         set(ed_DPset_P1T7, 'Visible', 'off');
         set(rb_DPset_P1B7, 'Visible', 'off');
      else
         set(ed_DPset_P1T7, 'Visible', 'on');
         set(rb_DPset_P1B7, 'Visible', 'on');
         set(ed_DPset_P1T7, 'String', num2str(p1button(7)));
      end
      if p1button(8) == 0
         set(ed_DPset_P1T8, 'Visible', 'off');
         set(rb_DPset_P1B8, 'Visible', 'off');
      else
         set(ed_DPset_P1T8, 'Visible', 'on');
         set(rb_DPset_P1B8, 'Visible', 'on');
         set(ed_DPset_P1T8, 'String', num2str(p1button(8)));
      end
      if p1button(9) == 0
         set(ed_DPset_P1T9, 'Visible', 'off');
         set(rb_DPset_P1B9, 'Visible', 'off');
      else
         set(ed_DPset_P1T9, 'Visible', 'on');
         set(rb_DPset_P1B9, 'Visible', 'on');
         set(ed_DPset_P1T9, 'String', num2str(p1button(9)));
      end
      if p1button(10) == 0
         set(ed_DPset_P1T10, 'Visible', 'off');
         set(rb_DPset_P1B10, 'Visible', 'off');
      else
         set(ed_DPset_P1T10, 'Visible', 'on');
         set(rb_DPset_P1B10, 'Visible', 'on');
         set(ed_DPset_P1T10, 'String', num2str(p1button(10)));
      end
      set(ed_DPset_P2T1, 'String', num2str(p2button(1)));
      if p2button(2) == 0
         set(ed_DPset_P2T2, 'Visible', 'off');
         set(rb_DPset_P2B2, 'Visible', 'off');
      else
         set(ed_DPset_P2T2, 'Visible', 'on');
         set(rb_DPset_P2B2, 'Visible', 'on');
         set(ed_DPset_P2T2, 'String', num2str(p2button(2)));
      end
      if p2button(3) == 0
         set(ed_DPset_P2T3, 'Visible', 'off');
         set(rb_DPset_P2B3, 'Visible', 'off');
      else
         set(ed_DPset_P2T3, 'Visible', 'on');
         set(rb_DPset_P2B3, 'Visible', 'on');
         set(ed_DPset_P2T3, 'String', num2str(p2button(3)));
      end
      if p2button(4) == 0
         set(ed_DPset_P2T4, 'Visible', 'off');
         set(rb_DPset_P2B4, 'Visible', 'off');
      else
         set(ed_DPset_P2T4, 'Visible', 'on');
         set(rb_DPset_P2B4, 'Visible', 'on');
         set(ed_DPset_P2T4, 'String', num2str(p2button(4)));
      end
      if p2button(5) == 0
         set(ed_DPset_P2T5, 'Visible', 'off');
         set(rb_DPset_P2B5, 'Visible', 'off');
      else
         set(ed_DPset_P2T5, 'Visible', 'on');
         set(rb_DPset_P2B5, 'Visible', 'on');
         set(ed_DPset_P2T5, 'String', num2str(p2button(5)));
      end
      if p2button(6) == 0
         set(ed_DPset_P2T6, 'Visible', 'off');
         set(rb_DPset_P2B6, 'Visible', 'off');
      else
         set(ed_DPset_P2T6, 'Visible', 'on');
         set(rb_DPset_P2B6, 'Visible', 'on');
         set(ed_DPset_P2T6, 'String', num2str(p2button(6)));
      end
      if p2button(7) == 0
         set(ed_DPset_P2T7, 'Visible', 'off');
         set(rb_DPset_P2B7, 'Visible', 'off');
      else
         set(ed_DPset_P2T7, 'Visible', 'on');
         set(rb_DPset_P2B7, 'Visible', 'on');
         set(ed_DPset_P2T7, 'String', num2str(p2button(7)));
      end
      if p2button(8) == 0
         set(ed_DPset_P2T8, 'Visible', 'off');
         set(rb_DPset_P2B8, 'Visible', 'off');
      else
         set(ed_DPset_P2T8, 'Visible', 'on');
         set(rb_DPset_P2B8, 'Visible', 'on');
         set(ed_DPset_P2T8, 'String', num2str(p2button(8)));
      end
      if p2button(9) == 0
         set(ed_DPset_P2T9, 'Visible', 'off');
         set(rb_DPset_P2B9, 'Visible', 'off');
      else
         set(ed_DPset_P2T9, 'Visible', 'on');
         set(rb_DPset_P2B9, 'Visible', 'on');
         set(ed_DPset_P2T9, 'String', num2str(p2button(9)));
      end
      if p2button(10) == 0
         set(ed_DPset_P2T10, 'Visible', 'off');
         set(rb_DPset_P2B10, 'Visible', 'off');
      else
         set(ed_DPset_P2T10, 'Visible', 'on');
         set(rb_DPset_P2B10, 'Visible', 'on');
         set(ed_DPset_P2T10, 'String', num2str(p2button(10)));
      end
   end
end
if strcmp(gui, 'DPset_Both')
   if action == 0
      rb_DPset_Both = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.029 0.250 0.271 0.058], ...
         'String', 'Both', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Both'',1);');
   end
   if action == 1
      set(rb_DPset_Fit, 'Value', 0);
      set(rb_DPset_Scatter, 'Value', 0);
      set(rb_DPset_Both, 'Value', 1);
      set(rb_DPset_Combined, 'Value', 0);
      set(rb_DPset_All, 'Value', 0);
      ychoice = 3;
   end
end
if strcmp(gui, 'DPset_P1B10')
   if action == 0
      rb_DPset_P1B10 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.357 0.250 0.029 0.058], ...
         'String', 'P1B10', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1B10'',1);');
   end
   if action == 1
      set(rb_DPset_P1B1, 'Value', 0);
      set(rb_DPset_P1B2, 'Value', 0);
      set(rb_DPset_P1B3, 'Value', 0);
      set(rb_DPset_P1B4, 'Value', 0);
      set(rb_DPset_P1B5, 'Value', 0);
      set(rb_DPset_P1B6, 'Value', 0);
      set(rb_DPset_P1B7, 'Value', 0);
      set(rb_DPset_P1B8, 'Value', 0);
      set(rb_DPset_P1B9, 'Value', 0);
      set(rb_DPset_P1B10, 'Value', 1);
   end
end
if strcmp(gui, 'DPset_P2B10')
   if action == 0
      rb_DPset_P2B10 = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.657 0.250 0.029 0.058], ...
         'String', 'P2B10', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2B10'',1);');
   end
   if action == 1
      set(rb_DPset_P2B1, 'Value', 0);
      set(rb_DPset_P2B2, 'Value', 0);
      set(rb_DPset_P2B3, 'Value', 0);
      set(rb_DPset_P2B4, 'Value', 0);
      set(rb_DPset_P2B5, 'Value', 0);
      set(rb_DPset_P2B6, 'Value', 0);
      set(rb_DPset_P2B7, 'Value', 0);
      set(rb_DPset_P2B8, 'Value', 0);
      set(rb_DPset_P2B9, 'Value', 0);
      set(rb_DPset_P2B10, 'Value', 1);
   end
end
if strcmp(gui, 'DPset_loadthem')
   if action == 1
      %global grptct
      %global gminspec
      %global gmaxspec
      %global gmaxiter
      %global gspectmax
      %global gspectmin
      %global gspectsteps
      %global gelpsteps
      %global gelpmin
      %global gelpmax
      %global gtotalsteps
      %global stepcount
      %global spectchoice
      %global elpchoice
      p1button = zeros(10, 1);
      p2button = zeros(10, 1);
      fid = fopen('PARAMS.TXT', 'r');
      gmaxiter = fscanf(fid, '%f', [1,1]);
      grptct = fscanf(fid, '%f', [1,1]);
      gmaxspec = fscanf(fid, '%f', [1,1]);
      gminspec = fscanf(fid, '%f', [1,1]);
      gspectsteps = fscanf(fid, '%f', [1,1]);
      gspectmax = fscanf(fid, '%f', [1,1]);
      gspectmin = fscanf(fid, '%f', [1,1]);
      gelpsteps = fscanf(fid, '%f', [1,1]);
      gelpmax = fscanf(fid, '%f', [1,1]);
      gelpmin = fscanf(fid, '%f', [1,1]);
      gtotalsteps = fscanf(fid, '%f', [1,1]);
      spectchoice = fscanf(fid, '%f', [1,1]);
      elpchoice = fscanf(fid, '%f', [1,1]);
      status = fclose(fid);
   end
end
if strcmp(gui, 'DPset_Combined')
   if action == 0
      rb_DPset_Combined = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.029 0.192 0.271 0.058], ...
         'String', 'Combined', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Combined'',1);');
   end
   if action == 1
      set(rb_DPset_Fit, 'Value', 0);
      set(rb_DPset_Scatter, 'Value', 0);
      set(rb_DPset_Both, 'Value', 0);
      set(rb_DPset_Combined, 'Value', 1);
      set(rb_DPset_All, 'Value', 0);
      ychoice = 4;
   end
end
if strcmp(gui, 'DPset_savethem')
   if action == 1
      %global SavedFile
      [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
         50, 50);
      if (filename ~= 0)
         SavedFile = [pathname filename];
         clear filename pathname;
         eval(['save ' SavedFile ';']);
      end
   end
end
if strcmp(gui, 'DPset_All')
   if action == 0
      rb_DPset_All = uicontrol(fg_DPset, ...
         'Style', 'radio', ...
         'Units', 'normalized', 'Position', [0.029 0.135 0.271 0.058], ...
         'String', 'All', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_All'',1);');
   end
   if action == 1
      set(rb_DPset_Fit, 'Value', 0);
      set(rb_DPset_Scatter, 'Value', 0);
      set(rb_DPset_Both, 'Value', 0);
      set(rb_DPset_Combined, 'Value', 0);
      set(rb_DPset_All, 'Value', 1);
      ychoice = 5;
   end
end
if strcmp(gui, 'DPset_saveasthem')
   if action == 1
      %global SavedFile
      if (strcmp(SavedFile, ''))
         [filename, pathname] = uiputfile('*.mat', 'Data Filename', ...
            50, 50);
      else
         [filename, pathname] = uiputfile(SavedFile, ...
            'Data Filename', ...
            50, 50);
      end
      if (filename ~= 0)
         SavedFile = [pathname filename];
         clear filename pathname;
         eval(['save ' SavedFile ';']);
      end
   end
end
if strcmp(gui, 'DPset_bye')
   if action == 1
      close(fg_DPset)
      close(fg_DPlook)
   end
end
if strcmp(gui, 'DPset_Look')
   if action == 0
      pb_DPset_Look = uicontrol(fg_DPset, ...
         'Style', 'pushbutton', ...
         'Units', 'normalized', 'Position', [0.014 0.000 0.171 0.058], ...
         'String', 'Look', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Look'',1);');
   end
   if action == 1
      % isempty replaces the printed comparison with [], which does not
      % evaluate to true in current MATLAB versions
      if isempty(xchoice)
         set(rb_DPset_No_of_species, 'Value', 1);
         zzzdp('DPset', 'DPset_No_of_species', 1);
      end
      if isempty(ychoice)
         set(rb_DPset_Fit, 'Value', 1);
         zzzdp('DPset', 'DPset_Fit', 1);
      end
      flag1 = 0;
      flag2 = 0;
      if get(rb_DPset_P1B1, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T1, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B2, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T2, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B3, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T3, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B4, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T4, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B5, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T5, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B6, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T6, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B7, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T7, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B8, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T8, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B9, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T9, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P1B10, 'Value') == 1
         par1val = str2num(get(ed_DPset_P1T10, 'String'));
         flag1 = 1;
      end
      if get(rb_DPset_P2B1, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T1, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B2, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T2, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B3, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T3, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B4, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T4, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B5, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T5, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B6, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T6, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B7, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T7, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B8, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T8, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B9, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T9, 'String'));
         flag2 = 1;
      end
      if get(rb_DPset_P2B10, 'Value') == 1
         par2val = str2num(get(ed_DPset_P2T10, 'String'));
         flag2 = 1;
      end
      if flag1 == 0
         set(rb_DPset_P1B1, 'Value', 1);
         par1val = str2num(get(ed_DPset_P1T1, 'String'));
      end
      if flag2 == 0
         set(rb_DPset_P2B1, 'Value', 1);
         par2val = str2num(get(ed_DPset_P2T1, 'String'));
      end
      if exist('par1val') ~= 0 & exist('par2val') ~= 0
         figure(fg_DPlook)
         page = 2;
         zzzdp('DPlook', 'DPlook_plotit', 1);
      end
   end
end
if strcmp(gui, 'DPset_Help')
   if action == 0
      pb_DPset_Help = uicontrol(fg_DPset, ...
         'Style', 'pushbutton', ...
         'Units', 'normalized', 'Position', [0.743 0.000 0.171 0.058], ...
         'String', 'Help', 'BackgroundColor', [0.8 0.8 0.8], ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Help'',1);');
   end
   if action == 1
      help helpdp;
   end
end
if strcmp(gui, 'DPset_Xaxistxt')
   if action == 0
      ed_DPset_Xaxistxt = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.057 0.846 0.214 0.058], ...
         'String', 'X-axis', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Xaxistxt'',1);');
   end
end
if strcmp(gui, 'DPset_Par1txt')
   if action == 0
      ed_DPset_Par1txt = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.357 0.846 0.229 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Par1txt'',1);');
   end
end
if strcmp(gui, 'DPset_Par2txt')
   if action == 0
      ed_DPset_Par2txt = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.657 0.846 0.229 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_Par2txt'',1);');
   end
end
if strcmp(gui, 'DPset_P1T1')
   if action == 0
      ed_DPset_P1T1 = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.414 0.769 0.171 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1T1'',1);');
   end
end
if strcmp(gui, 'DPset_P2T1')
   if action == 0
      ed_DPset_P2T1 = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.714 0.769 0.171 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2T1'',1);');
   end
end
if strcmp(gui, 'DPset_P1T2')
   if action == 0
      ed_DPset_P1T2 = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.414 0.712 0.171 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P1T2'',1);');
   end
end
if strcmp(gui, 'DPset_P2T2')
   if action == 0
      ed_DPset_P2T2 = uicontrol(fg_DPset, ...
         'Style', 'edit', 'BackgroundColor', [0.50 0.50 0.50], ...
         'Units', 'normalized', 'Position', [0.714 0.712 0.171 0.058], ...
         'String', '', ...
         'CallBack', 'zzzdp(''DPset'',''DPset_P2T2'',1);');
   end
  end
  if strcmp(gui,'DPset_P1T3')
    if action == 0
      ed_DPset_P1T3 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.654 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T3'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T3')
    if action == 0
      ed_DPset_P2T3 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.654 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T3'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T4')
    if action == 0
      ed_DPset_P1T4 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.596 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T4'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T4')
    if action == 0
      ed_DPset_P2T4 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.596 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T4'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T5')
    if action == 0
      ed_DPset_P1T5 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.538 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T5'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T5')
    if action == 0
      ed_DPset_P2T5 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.538 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T5'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T6')
    if action == 0
      ed_DPset_P1T6 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.481 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T6'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T6')
    if action == 0
      ed_DPset_P2T6 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.481 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T6'',1);');
    end
  end
  if strcmp(gui,'DPset_Yaxistxt')
    if action == 0
      ed_DPset_Yaxistxt = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.057 0.442 0.214 0.058],...
        'String','Y-axis',...
        'CallBack','zzzdp(''DPset'',''DPset_Yaxistxt'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T7')
    if action == 0
      ed_DPset_P1T7 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.423 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T7'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T7')
    if action == 0
      ed_DPset_P2T7 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.423 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T7'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T8')
    if action == 0
      ed_DPset_P1T8 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.365 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T8'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T8')
    if action == 0
      ed_DPset_P2T8 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.365 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T8'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T9')
    if action == 0
      ed_DPset_P1T9 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.308 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T9'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T9')
    if action == 0
      ed_DPset_P2T9 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.308 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T9'',1);');
    end
  end
  if strcmp(gui,'DPset_P1T10')
    if action == 0
      ed_DPset_P1T10 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.414 0.250 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P1T10'',1);');
    end
  end
  if strcmp(gui,'DPset_P2T10')
    if action == 0
      ed_DPset_P2T10 = uicontrol(fg_DPset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.714 0.250 0.171 0.058],...
        'String','',...
        'CallBack','zzzdp(''DPset'',''DPset_P2T10'',1);');
    end
  end
end
if strcmp(page,'DPlook')
  figure(fg_DPlook)
  set(gcf,'NumberTitle','off',...
    'Name','DPlook',...
    'backingstore','off');
  if strcmp(gui,'initgui')
    zzzdp('DPlook','DPlook_MENU1',0);
    zzzdp('DPlook','DPlook_Menu2',0);
    zzzdp('DPlook','DPlook_Back',0);
    zzzdp('DPlook','DPlook_Help',0);
  end
  if strcmp(gui,'DPlook_MENU1')
    if action == 0
      mn_DPlook_Files = uimenu(fg_DPlook,...
        'Label','Files');
      mi_DPlook_Quit = uimenu(mn_DPlook_Files,...
        'Label','Quit',...
        'CallBack','zzzdp(''DPlook'',''DPlook_bye'',1);');
    end
  end
  if strcmp(gui,'DPlook_Menu2')
    if action == 0
      mn_DPlook_GOTO = uimenu(fg_DPlook,...
        'Label','GOTO');
      mi_DPlook_Back = uimenu(mn_DPlook_GOTO,...
        'Label','Back',...
        'CallBack','zzzdp(''DPlook'',''DPlook_setthem'',1);');
      mi_DPlook_Help = uimenu(mn_DPlook_GOTO,...
        'Label','Help',...
        'CallBack','zzzdp(''DPlook'',''DPlook_Help'',1);');
    end
  end
  if strcmp(gui,'DPlook_setthem')
    if action == 1
      figure(fg_DPset)
      page = 1;
    end
  end
  if strcmp(gui,'DPlook_lookthem')
    if action == 1
      figure(fg_DPlook)
      page = 3;
    end
  end
  if strcmp(gui,'DPlook_plotit')
    if action == 1
      %global kombi
      %global par1val
      %global par2val
      %global par1label
      %global par2label
      p1button=zeros(10,1);
      p2button=zeros(10,1);
      load KOMBI
      kombi=KOMBI;
      [n,m] = size(kombi);
      % isempty replaces the comparison with [], which never evaluates as true
      if isempty(elpchoice)
        elpchoice = 1;
      end
      if isempty(spectchoice)
        spectchoice = 1;
      end
      tstp = gtotalsteps * grptct;
      fprintf('*** AutoAR/dp ***\n\n')
      fprintf('Max iterations: %3.0f Repeats: %3.0f Total steps: %3.0f\n',...
        gmaxiter,grptct,tstp)
      if elpchoice == 1
        fprintf('Elution curve baseline: %6.5f - %6.5f Lin steps: %4.0f\n',...
          gelpmin/100,gelpmax/100,gelpsteps)
      else
        fprintf('Elution curve baseline: %6.5f - %6.5f Log steps: %4.0f\n',...
          gelpmin/100,gelpmax/100,gelpsteps)
      end
      if spectchoice == 1
        fprintf('Spectral baseline: %6.5f - %6.5f Lin steps: %4.0f\n',...
          gspectmin/100,gspectmax/100,gspectsteps)
      else
        fprintf('Spectral baseline: %6.5f - %6.5f Log steps: %4.0f\n',...
          gspectmin/100,gspectmax/100,gspectsteps)
      end
      fprintf('No. of species: %2.0f -%2.0f\n\n',gminspec,gmaxspec)
      it = 0.0000001;
      xkaha = [];
      kaha1 = [];
      kaha2 = [];
      kaha3 = [];
      if xchoice == 1
        xtext = 'Number of species';
        pct = 0;
        for ii = 1:gtotalsteps
          if abs(kombi(ii,2)-par1val)<it & abs(kombi(ii,3)-par2val)<it
            pct = pct+1;
            xkaha(pct) = kombi(ii,1); %nspec
            kaha1(pct) = kombi(ii,4); %AF
            kaha2(pct) = kombi(ii,5); %SF
            xax=1;
          end
        end
        af = sum(kaha1);
        sf = sum(kaha2);
        for ii = 1:pct
          kaha3(ii) = kaha1(ii) + kaha2(ii)*af/sf;
        end
      end
      if xchoice == 2
        xtext = 'Elution curve baseline';
        pct = 0;
        for ii = 1:gtotalsteps
          if abs(kombi(ii,1)-par1val)<it & abs(kombi(ii,3)-par2val) [...] 2 & ys2 > 2
          h = surfl(xkaha,ykaha,S);
        else
          xtext = 'X, Y, and Z must be at least 3-by-3.';
          ytext=' ';
          ztext=' ';
        end
      end
      set(gca,'XScale','linear')
      if xax == 2
        set(gca,'XScale','log')
      end
      set(gca,'YScale','linear')
      if yax == 2
        set(gca,'YScale','log')
      end
      xlabel(xtext)
      ylabel(ytext)
      zlabel([ztext ' %'])
      title([par1label ' = ' num2str(par1val)])
      disp([par1label ' = ' num2str(par1val)])
      fprintf('x = ');disp(xtext);
      fprintf('y = ');disp(ytext);
      fprintf('z = ');disp(ztext);
      fprintf('\n');
    end
  end
  if strcmp(gui,'3Dlook_Help')
    if action == 0
      pb_3Dlook_Help = uicontrol(fg_3Dlook,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.914 0.000 0.086 0.058],...
        'String','Help','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdp3(''3Dlook'',''3Dlook_Help'',1);');
    end
    if action == 1
      help helpdp3;
    end
  end
end
% ##### End of program #####
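
The listings in this part of the book all follow one calling convention: a handler is selected by comparing gui against a name, action == 0 builds the control, and action == 1 responds to it. A minimal sketch of that convention follows, written without any graphics calls so that it runs as a plain script. The handler name Demo_Help and the variables calls and events are hypothetical and serve only as illustration; they do not come from the programs above.

```matlab
% Sketch of the (gui, action) dispatch convention used by the listings.
% 'Demo_Help', calls and events are hypothetical names for illustration.
calls = {'Demo_Help', 0; 'Demo_Help', 1};  % first build the control, then simulate a click
events = {};
for k = 1:size(calls,1)
  gui = calls{k,1};
  action = calls{k,2};
  if strcmp(gui,'Demo_Help')
    if action == 0
      % the listings create the push button with uicontrol at this point
      events{end+1} = 'create Help button';
    end
    if action == 1
      % the listings run a help command at this point
      events{end+1} = 'show help text';
    end
  end
end
disp(events)
```

In the real programs the callback string of each control re-enters the same function with action set to 1, so one file holds both the layout and the behaviour of every control.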
Chapter 17
Selecting an optimal parameter set
The program zzzdar.m is used to select an optimal set from the results of AR for closer inspection of the spectra and elution curves. It is very similar to the program zzzdar2.m, which also calculates the confidence ranges. A similar function is kept in both programs because zzzdar.m runs faster when the full confidence ranges are not needed.

%ZZZDAR.M selects AR solution for display.
% Date: 30 Sep 1995
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function zzzdar(page,gui,action);
global C
global elpchoice
global gcuriter
global gcurrpt
global gcurspec
global gelpcur
global gelpmax
global gelpmin
global gelpsteps
global gmaxiter
global gmaxspec
[Diagram: Obs.mat, params.txt, kombi.mat -> DAR -> current.txt, C.mat, S.mat]
Figure 17.1 Program zzzdar.m reads the data matrix Obs.mat, the parameters from params.txt and the starting situation (kombi.mat) for producing the spectra (S.mat) and the elution curves (C.mat). The selected parameters are stored in current.txt.
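
The input side of Figure 17.1 can be illustrated with a short sketch that reads a parameter file and builds the two baseline grids from it. The order of the fourteen values and the linspace/logspace construction follow the listing in this chapter. The numerical values, the temporary file, and the variable names vals, fname and p are assumptions made for the example only.

```matlab
% Sketch: read a parameter file and build the baseline grids.
% The values in vals and the temporary file are hypothetical; the order of
% the fields follows the fscanf calls in the SELEset_Boot handler.
vals = [20 3 5 2 4 10 0.1 4 10 0.1 64 1 2 1];
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'%g\n',vals);
fclose(fid);

fid = fopen(fname,'r');
p = fscanf(fid,'%f',[1,14]);
fclose(fid);
delete(fname);

gmaxspec = p(3);    gminspec = p(4);
gspectsteps = p(5); gspectmax = p(6); gspectmin = p(7);
gelpsteps = p(8);   gelpmax = p(9);   gelpmin = p(10);
spectchoice = p(12); elpchoice = p(13);

% The file stores baselines as percentages; the grids are fractions.
if elpchoice == 1
  elpvector = linspace(gelpmin/100,gelpmax/100,gelpsteps);
else
  elpvector = logspace(log10(gelpmin/100),log10(gelpmax/100),gelpsteps);
end
if spectchoice == 1
  spectvector = linspace(gspectmin/100,gspectmax/100,gspectsteps);
else
  spectvector = logspace(log10(gspectmin/100),log10(gspectmax/100),gspectsteps);
end

% Number of parameter combinations, as computed in SELEset_Boot.
totalsteps = gspectsteps*gelpsteps*(gmaxspec-gminspec+1);

gelpcur = 2;  % hypothetical selected step
fprintf('Elution curve baseline at step %d: %g %%\n',gelpcur,100*elpvector(gelpcur));
fprintf('Total steps: %d\n',totalsteps);
```

The selected step is an index into the grid, which is why the program clamps it to the range from 1 to the number of steps before using it.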
global gminspec
global grptct
global gspectcur
global gspectmax
global gspectmin
global gspectsteps
global gtotalsteps
global kombi
global nofile
global Obs
global S
global SavedFile
global spectchoice
global stepcount
global unichoice
global ed_SELErun_Stepct
global ed_SELEset_Curiter
global ed_SELEset_Currpt
global ed_SELEset_Curspec
global ed_SELEset_Elpcur1
global ed_SELEset_Elpcur2
global ed_SELEset_Elplinlog
global ed_SELEset_Elpmax
global ed_SELEset_Elpmin
global ed_SELEset_Elpsteps
global ed_SELEset_Maxspec
global ed_SELEset_Minspec
global ed_SELEset_Spectcu1
global ed_SELEset_Spectcu2
global ed_SELEset_Spectlinlog
global ed_SELEset_Spectmax
global ed_SELEset_Spectmin
global ed_SELEset_Spectsteps
global ed_SELEset_Unimod
global fg_SELErun
global fg_SELEset
global fr_SELEset_Currentframe
global fr_SELEset_Paramframe
global mi_SELErun_Back
global mi_SELErun_Help
global mi_SELErun_Quit
global mi_SELErun_Save_curves
global mi_SELErun_Save_spectra
global mi_SELEset_Help
global mi_SELEset_Quit
global mi_SELEset_Run
global mn_SELErun_Files
global mn_SELErun_GOTO
global mn_SELEset_Files
global mn_SELEset_GOTO
global pb_SELErun_Back
global pb_SELErun_Help
global pb_SELEset_Help
global pb_SELEset_Load_Default
global pb_SELEset_Run
global tx_SELEset_Curitertx
global tx_SELEset_Currpttx
global tx_SELEset_Curspectx
global tx_SELEset_Elpcurtx1
global tx_SELEset_Elpcurtx2
global tx_SELEset_Elpmaxtx
global tx_SELEset_Elpmintx
global tx_SELEset_Elprof
global tx_SELEset_Elpstepstx
global tx_SELEset_Maxspectx
global tx_SELEset_Minspectx
global tx_SELEset_Paramheader
global tx_SELEset_Species
global tx_SELEset_Spectconst
global tx_SELEset_Spectcurtx1
global tx_SELEset_Spectcurtx2
global tx_SELEset_Spectmaxtx
global tx_SELEset_Spectmintx
global tx_SELEset_Spectstepstx
global tx_SELEset_Subsetheader
if nargin == 0
  page = 'initpage';
  gui = 'initgui';
  action = 0;
end
if strcmp(page,'initpage')
  fg_SELEset = figure;
  fg_SELErun = figure;
  zzzdar('SELErun','initgui',0);
  zzzdar('SELEset','initgui',0);
  page = '';
end
if strcmp(page,'SELEset')
  figure(fg_SELEset)
  set(gcf,'NumberTitle','off',...
    'Name','SELEset',...
    'backingstore','off');
  if strcmp(gui,'initgui')
    zzzdar('SELEset','SELEset_Menu1',0);
    zzzdar('SELEset','SELEset_Menu2',0);
    zzzdar('SELEset','SELEset_Paramframe',0);
    zzzdar('SELEset','SELEset_Currentframe',0);
    zzzdar('SELEset','SELEset_Paramheader',0);
    zzzdar('SELEset','SELEset_Subsetheader',0);
    zzzdar('SELEset','SELEset_Elprof',0);
    zzzdar('SELEset','SELEset_Unimod',0);
    zzzdar('SELEset','SELEset_Elpmin',0);
    zzzdar('SELEset','SELEset_Elpmintx',0);
    zzzdar('SELEset','SELEset_Elpcurtx1',0);
    zzzdar('SELEset','SELEset_Elpcur1',0);
    zzzdar('SELEset','SELEset_Elpmax',0);
    zzzdar('SELEset','SELEset_Elpmaxtx',0);
    zzzdar('SELEset','SELEset_Elpcur2',0);
    zzzdar('SELEset','SELEset_Elpcurtx2',0);
    zzzdar('SELEset','SELEset_Elplinlog',0);
    zzzdar('SELEset','SELEset_Elpsteps',0);
    zzzdar('SELEset','SELEset_Elpstepstx',0);
    zzzdar('SELEset','SELEset_Spectcurtx1',0);
    zzzdar('SELEset','SELEset_Spectcu1',0);
    zzzdar('SELEset','SELEset_Spectconst',0);
    zzzdar('SELEset','SELEset_Spectcu2',0);
    zzzdar('SELEset','SELEset_Spectcurtx2',0);
    zzzdar('SELEset','SELEset_Spectmin',0);
    zzzdar('SELEset','SELEset_Spectmintx',0);
    zzzdar('SELEset','SELEset_Spectmax',0);
    zzzdar('SELEset','SELEset_Spectmaxtx',0);
    zzzdar('SELEset','SELEset_Curspectx',0);
    zzzdar('SELEset','SELEset_Curspec',0);
    zzzdar('SELEset','SELEset_Spectlinlog',0);
    zzzdar('SELEset','SELEset_Spectsteps',0);
    zzzdar('SELEset','SELEset_Spectstepstx',0);
    zzzdar('SELEset','SELEset_Currpttx',0);
    zzzdar('SELEset','SELEset_Currpt',0);
    zzzdar('SELEset','SELEset_Species',0);
    zzzdar('SELEset','SELEset_Minspec',0);
    zzzdar('SELEset','SELEset_Minspectx',0);
    zzzdar('SELEset','SELEset_Curitertx',0);
    zzzdar('SELEset','SELEset_Curiter',0);
    zzzdar('SELEset','SELEset_Maxspec',0);
    zzzdar('SELEset','SELEset_Maxspectx',0);
    zzzdar('SELEset','SELEset_Run',0);
    zzzdar('SELEset','SELEset_Load_Default',0);
    zzzdar('SELEset','SELEset_Help',0);
    zzzdar('SELEset','SELEset_Boot',1);
  end
  if strcmp(gui,'SELEset_Paramframe')
    if action == 0
      fr_SELEset_Paramframe = uicontrol(fg_SELEset,...
        'Style','frame','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.071 0.058 0.443 0.904]);
    end
  end
  if strcmp(gui,'SELEset_Currentframe')
    if action == 0
      fr_SELEset_Currentframe = uicontrol(fg_SELEset,...
        'Style','frame','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.529 0.058 0.414 0.904]);
    end
  end
  if strcmp(gui,'SELEset_Menu1')
    if action == 0
      mn_SELEset_Files = uimenu(fg_SELEset,...
        'Label','Files');
      mi_SELEset_Quit = uimenu(mn_SELEset_Files,...
        'Label','Quit',...
        'CallBack','zzzdar(''SELEset'',''SELEset_bye'',1);');
    end
  end
  if strcmp(gui,'SELEset_Menu2')
    if action == 0
      mn_SELEset_GOTO = uimenu(fg_SELEset,...
        'Label','GOTO');
      mi_SELEset_Run = uimenu(mn_SELEset_GOTO,...
        'Label','Run',...
        'CallBack','zzzdar(''SELEset'',''SELEset_runthem'',1);');
      mi_SELEset_Help = uimenu(mn_SELEset_GOTO,...
        'Label','Help',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Help'',1);');
    end
  end
  if strcmp(gui,'SELEset_setthem')
    if action == 1
      figure(fg_SELEset)
      page = 1;
    end
  end
  if strcmp(gui,'SELEset_runthem')
    if action == 1
      figure(fg_SELErun)
      page = 2;
    end
  end
  if strcmp(gui,'SELEset_Save_s')
    if action == 1
      [filename,path]=uiputfile('S.mat','Save spectra file:');
      pit=length(filename);
      if filename ~= 0
        vari=filename(1:pit-4);
        zapu=[vari,' = S;'];
        eval(zapu)
        rapu=['save ',vari,' ',vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui,'SELEset_Save_c')
    if action == 1
      [filename,path]=uiputfile('C.mat','Save elution curves file:');
      vari=filename;
      pit=length(filename);
      if filename ~= 0
        if pit > 4
          vari=filename(1:pit-4);
        end
        zapu=[vari,' = C;'];
        eval(zapu)
        rapu=['save ',vari,' ',vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui,'SELEset_Boot')
    if action == 1
      %global gmaxiter
      %global grptct
      %global gminspec
      %global gmaxspec
      %global gspectsteps
      %global gspectmin
      %global gspectmax
      %global gelpsteps
      %global gelpmax
      %global gelpmin
      %global gtotalsteps
      %global elpchoice
      %global spectchoice
      %global unichoice
      %global kombi
      %global Obs
      fid = fopen('PARAMS.TXT','r');
      gmaxiter = fscanf(fid,'%f',[1,1]);
      grptct = fscanf(fid,'%f',[1,1]);
      gmaxspec = fscanf(fid,'%f',[1,1]);
      gminspec = fscanf(fid,'%f',[1,1]);
      gspectsteps = fscanf(fid,'%f',[1,1]);
      gspectmax = fscanf(fid,'%f',[1,1]);
      gspectmin = fscanf(fid,'%f',[1,1]);
      gelpsteps = fscanf(fid,'%f',[1,1]);
      gelpmax = fscanf(fid,'%f',[1,1]);
      gelpmin = fscanf(fid,'%f',[1,1]);
      gtotalsteps = fscanf(fid,'%f',[1,1]);
      spectchoice = fscanf(fid,'%f',[1,1]);
      elpchoice = fscanf(fid,'%f',[1,1]);
      unichoice = fscanf(fid,'%f',[1,1]);
      status = fclose(fid);
      load KOMBI
      kombi=KOMBI;
      load OBS
      TObs=OBS;
      Obs=TObs';
      totalsteps = gspectsteps*gelpsteps*(gmaxspec-gminspec+1);
      set(ed_SELEset_Elpmin,'String',num2str(gelpmin));
      set(ed_SELEset_Elpmax,'String',num2str(gelpmax));
      set(ed_SELEset_Elpsteps,'String',num2str(gelpsteps));
      set(ed_SELEset_Spectmin,'String',num2str(gspectmin));
      set(ed_SELEset_Spectmax,'String',num2str(gspectmax));
      set(ed_SELEset_Spectsteps,'String',num2str(gspectsteps));
      set(ed_SELEset_Minspec,'String',num2str(gminspec));
      set(ed_SELEset_Maxspec,'String',num2str(gmaxspec));
      if elpchoice == 1
        teksti='Lin';
      else
        teksti='Log';
      end
      set(ed_SELEset_Elplinlog,'String',teksti);
      if spectchoice == 1
        teksti='Lin';
      else
        teksti='Log';
      end
      set(ed_SELEset_Spectlinlog,'String',teksti);
      if unichoice == 1
        teksti='Unimod';
      else
        teksti='Polymod';
      end
      set(ed_SELEset_Unimod,'String',teksti);
    end
  end
  if strcmp(gui,'SELEset_loadthem')
    if action == 1
      %global SavedFile
      %global nofile
      [filename,pathname] = uigetfile('*.mat',...
        'Choose a MATLAB Data file',50,50);
      nofile=1;
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['load ' SavedFile ';']);
        nofile=0;
      end
    end
  end
  if strcmp(gui,'SELEset_savethem')
    if action == 1
      %global SavedFile
      [filename,pathname] = uiputfile('*.mat',...
        'Data Filename',50,50);
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui,'SELEset_saveasthem')
    if action == 1
      %global SavedFile
      if (strcmp(SavedFile,''))
        [filename,pathname] = uiputfile('*.mat',...
          'Data Filename',50,50);
      else
        [filename,pathname] = uiputfile(SavedFile,...
          'Data Filename',50,50);
      end
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui,'SELEset_bye')
    if action == 1
      close(fg_SELEset)
      close(fg_SELErun)
    end
  end
  if strcmp(gui,'SELEset_Run')
    if action == 0
      pb_SELEset_Run = uicontrol(fg_SELEset,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.057 0.000 0.171 0.058],...
        'String','Run','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdar(''SELEset'',''SELEset_Run'',1);');
    end
    if action == 1
      %global gelpcur
      %global gspectcur
      %global gcurspec
      %global gcurrpt
      %global gcuriter
      gelpcur = str2num(get(ed_SELEset_Elpcur1,'String'));
      % isempty replaces the comparison with [], which never evaluates as true
      if isempty(gelpcur)
        gelpcur = 1;
      end
      if gelpcur < 1
        gelpcur = 1;
        set(ed_SELEset_Elpcur1,'String',num2str(gelpcur));
      end
      if gelpcur > gelpsteps
        gelpcur = gelpsteps;
        set(ed_SELEset_Elpcur1,'String',num2str(gelpcur));
      end
      if elpchoice == 1
        elpvector = linspace(gelpmin/100,gelpmax/100,gelpsteps);
      else
        elpmi = log10(gelpmin/100);
        elpma = log10(gelpmax/100);
        elpvector = logspace(elpmi,elpma,gelpsteps);
      end
      set(ed_SELEset_Elpcur2,'String',num2str(100*elpvector(gelpcur)));
      gspectcur = str2num(get(ed_SELEset_Spectcu1,'String'));
      if isempty(gspectcur)
        gspectcur = 1;
      end
      if gspectcur < 1
        gspectcur = 1;
        set(ed_SELEset_Spectcu1,'String',num2str(gspectcur));
      end
      if gspectcur > gspectsteps
        gspectcur = gspectsteps;
        set(ed_SELEset_Spectcu1,'String',num2str(gspectcur));
      end
      if spectchoice == 1
        spectvector = linspace(gspectmin/100,...
          gspectmax/100,gspectsteps);
      else
        spectmi = log10(gspectmin/100);
        spectma = log10(gspectmax/100);
        spectvector = logspace(spectmi,spectma,gspectsteps);
      end
      set(ed_SELEset_Spectcu2,'String',num2str(100*spectvector(gspectcur)));
      gcurspec = str2num(get(ed_SELEset_Curspec,'String'));
      if isempty(gcurspec)
        gcurspec = gminspec;
      end
      if gcurspec < gminspec
        gcurspec = gminspec;
      end
      if gcurspec > gmaxspec
        gcurspec = gmaxspec;
      end
      gcurrpt = str2num(get(ed_SELEset_Currpt,'String'));
      if isempty(gcurrpt)
        gcurrpt = 1;
      end
      if gcurrpt [...] 4
          vari=filename(1:pit-4);
        end
        zapu=[vari,' = S;'];
        eval(zapu)
        rapu=['save ',vari,' ',vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui,'SELEset_Load_Default')
    if action == 0
      pb_SELEset_Load_Default = uicontrol(fg_SELEset,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.529 0.000 0.171 0.058],...
        'String','Load_Default','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdar(''SELEset'',''SELEset_Load_Default'',1);');
    end
    if action == 1
      gelpcur = 1;
      set(ed_SELEset_Elpcur1,'String',num2str(gelpcur));
      set(ed_SELEset_Elpcur2,'String',num2str(100*kombi(gelpcur,2)));
      gspectcur = 1;
      set(ed_SELEset_Spectcu1,'String',num2str(gspectcur));
      set(ed_SELEset_Spectcu2,'String',num2str(100*kombi(gspectcur,3)));
      gcurspec = gminspec;
      set(ed_SELEset_Curspec,'String',num2str(gcurspec));
      gcurrpt = 1;
      set(ed_SELEset_Currpt,'String',num2str(gcurrpt));
      gcuriter = 20;
      set(ed_SELEset_Curiter,'String',num2str(gcuriter));
    end
  end
  if strcmp(gui,'SELEset_Help')
    if action == 0
      pb_SELEset_Help = uicontrol(fg_SELEset,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.771 0.000 0.171 0.058],...
        'String','Help','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdar(''SELEset'',''SELEset_Help'',1);');
    end
    if action == 1
      help helpdar;
    end
  end
  if strcmp(gui,'SELEset_Paramheader')
    if action == 0
      tx_SELEset_Paramheader = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.143 0.904 0.286 0.038],...
        'String','Parameters for Analysis',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Paramheader'',1);');
    end
  end
  if strcmp(gui,'SELEset_Subsetheader')
    if action == 0
      tx_SELEset_Subsetheader = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.629 0.904 0.200 0.038],...
        'String','Selected Subset',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Subsetheader'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elprof')
    if action == 0
      tx_SELEset_Elprof = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.071 0.846 0.271 0.038],...
        'String','Elution Curve Baseline',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elprof'',1);');
    end
  end
  if strcmp(gui,'SELEset_Unimod')
    if action == 0
      ed_SELEset_Unimod = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.100 0.769 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Unimod'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpmin')
    if action == 0
      ed_SELEset_Elpmin = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.769 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpmin'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpmintx')
    if action == 0
      tx_SELEset_Elpmintx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.769 0.157 0.058],...
        'String','Minimum %',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpmintx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpcurtx1')
    if action == 0
      tx_SELEset_Elpcurtx1 = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.557 0.673 0.171 0.154],...
        'String','Selected step for elution curve baseline',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpcurtx1'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpcur1')
    if action == 0
      ed_SELEset_Elpcur1 = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.769 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpcur1'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpmax')
    if action == 0
      ed_SELEset_Elpmax = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.692 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpmax'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpmaxtx')
    if action == 0
      tx_SELEset_Elpmaxtx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.692 0.157 0.058],...
        'String','Maximum %',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpmaxtx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpcur2')
    if action == 0
      ed_SELEset_Elpcur2 = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.692 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpcur2'',1);');
    end
end if s t r c m p (gui, 'S E L E s e t _ E l p c u r t x 2 ') if a c t i o n == 0 tx_SELEset_Elpcurtx2 = uicontrol ( f g _ S E L E s e t .... 'Style','text','BackgroundC', [0.5 0 . 5 0.5] .... 'ForegroundC', [i 1 i] .... 'Units','normalized','Position', [ 0 . 8 5 7 0 . 6 9 2 0 . 0 5 7 0 . 0 5 8 ] .... 'String', ' (%)',... 'C a l l B a c k ' , ' z z z d a r ( ' 'S E L E s e t ' ', ' 'S E L E s e t _ E l p c u r t x 2 ' ', i) ; ' ) ;
    end
  end
  if strcmp(gui,'SELEset_Elplinlog')
    if action == 0
      ed_SELEset_Elplinlog = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.100 0.615 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elplinlog'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpsteps')
    if action == 0
      ed_SELEset_Elpsteps = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.615 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpsteps'',1);');
    end
  end
  if strcmp(gui,'SELEset_Elpstepstx')
    if action == 0
      tx_SELEset_Elpstepstx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.615 0.157 0.058],...
        'String','No. of steps',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Elpstepstx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectcurtx1')
    if action == 0
      tx_SELEset_Spectcurtx1 = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.557 0.481 0.171 0.154],...
        'String','Selected step for spectral baseline',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectcurtx1'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectcu1')
    if action == 0
      ed_SELEset_Spectcu1 = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.577 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectcu1'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectconst')
    if action == 0
      tx_SELEset_Spectconst = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.100 0.538 0.243 0.038],...
        'String','Spectral Baseline',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectconst'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectcu2')
    if action == 0
      ed_SELEset_Spectcu2 = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.500 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectcu2'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectcurtx2')
    if action == 0
      tx_SELEset_Spectcurtx2 = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.857 0.500 0.057 0.058],...
        'String','(%)',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectcurtx2'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectmin')
    if action == 0
      ed_SELEset_Spectmin = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.462 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectmin'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectmintx')
    if action == 0
      tx_SELEset_Spectmintx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.462 0.157 0.058],...
        'String','Minimum %',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectmintx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectmax')
    if action == 0
      ed_SELEset_Spectmax = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.385 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectmax'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectmaxtx')
    if action == 0
      tx_SELEset_Spectmaxtx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.385 0.157 0.058],...
        'String','Maximum %',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectmaxtx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Curspectx')
    if action == 0
      tx_SELEset_Curspectx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.557 0.308 0.171 0.096],...
        'String','Number of species',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Curspectx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Curspec')
    if action == 0
      ed_SELEset_Curspec = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.346 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Curspec'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectlinlog')
    if action == 0
      ed_SELEset_Spectlinlog = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.100 0.308 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectlinlog'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectsteps')
    if action == 0
      ed_SELEset_Spectsteps = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.308 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectsteps'',1);');
    end
  end
  if strcmp(gui,'SELEset_Spectstepstx')
    if action == 0
      tx_SELEset_Spectstepstx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.308 0.157 0.058],...
        'String','No. of steps',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Spectstepstx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Currpttx')
    if action == 0
      tx_SELEset_Currpttx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','position',[0.557 0.192 0.171 0.096],...
        'String','Number of repeats',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Currpttx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Currpt')
    if action == 0
      ed_SELEset_Currpt = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.231 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Currpt'',1);');
    end
  end
  if strcmp(gui,'SELEset_Species')
    if action == 0
      tx_SELEset_Species = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.100 0.231 0.243 0.038],...
        'String','Number of Species',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Species'',1);');
    end
  end
  if strcmp(gui,'SELEset_Minspec')
    if action == 0
      ed_SELEset_Minspec = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.154 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Minspec'',1);');
    end
  end
  if strcmp(gui,'SELEset_Minspectx')
    if action == 0
      tx_SELEset_Minspectx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.154 0.157 0.058],...
        'String','Minimum',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Minspectx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Curitertx')
    if action == 0
      tx_SELEset_Curitertx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.557 0.077 0.171 0.096],...
        'String','Number of AR iterations',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Curitertx'',1);');
    end
  end
  if strcmp(gui,'SELEset_Curiter')
    if action == 0
      ed_SELEset_Curiter = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.743 0.115 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Curiter'',1);');
    end
  end
  if strcmp(gui,'SELEset_Maxspec')
    if action == 0
      ed_SELEset_Maxspec = uicontrol(fg_SELEset,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.214 0.077 0.086 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Maxspec'',1);');
    end
  end
  if strcmp(gui,'SELEset_Maxspectx')
    if action == 0
      tx_SELEset_Maxspectx = uicontrol(fg_SELEset,...
        'Style','text','BackgroundC',[0.5 0.5 0.5],...
        'ForegroundC',[1 1 1],...
        'Units','normalized','Position',[0.314 0.077 0.157 0.058],...
        'String','Maximum',...
        'CallBack','zzzdar(''SELEset'',''SELEset_Maxspectx'',1);');
    end
  end
end
if strcmp(page,'SELErun')
  figure(fg_SELErun)
  set(gcf,'NumberTitle','off',...
    'Name','SELErun',...
    'backingstore','off');
  if strcmp(gui,'initgui')
    zzzdar('SELErun','SELErun_MENU1',0);
    zzzdar('SELErun','SELErun_Menu2',0);
    zzzdar('SELErun','SELErun_Back',0);
    zzzdar('SELErun','SELErun_Help',0);
    zzzdar('SELErun','SELErun_Boot',1);
    zzzdar('SELErun','SELErun_Stepct',0);
  end
  if strcmp(gui,'SELErun_MENU1')
    if action == 0
      mn_SELErun_Files = uimenu(fg_SELErun,...
        'Label','Files');
      mi_SELErun_Save_spectra = uimenu(mn_SELErun_Files,...
        'Label','Save_spectra',...
        'CallBack','zzzdar(''SELErun'',''SELErun_Save_s'',1);');
      mi_SELErun_Save_curves = uimenu(mn_SELErun_Files,...
        'Label','Save_curves',...
        'CallBack','zzzdar(''SELErun'',''SELErun_Save_c'',1);');
      mi_SELErun_Quit = uimenu(mn_SELErun_Files,...
        'Label','Quit',...
        'CallBack','zzzdar(''SELErun'',''SELErun_bye'',1);');
    end
  end
  if strcmp(gui,'SELErun_Menu2')
    if action == 0
      mn_SELErun_GOTO = uimenu(fg_SELErun,...
        'Label','GOTO');
      mi_SELErun_Back = uimenu(mn_SELErun_GOTO,...
        'Label','Back',...
        'CallBack','zzzdar(''SELErun'',''SELErun_setthem'',1);');
      mi_SELErun_Help = uimenu(mn_SELErun_GOTO,...
        'Label','Help',...
        'CallBack','zzzdar(''SELErun'',''SELErun_Help'',1);');
    end
  end
  if strcmp(gui,'SELErun_setthem')
    if action == 1
      figure(fg_SELEset)
      page = 1;
    end
  end
  if strcmp(gui,'SELErun_runthem')
    if action == 1
      figure(fg_SELErun)
      page = 2;
    end
  end
  if strcmp(gui,'SELErun_Back')
    if action == 0
      pb_SELErun_Back = uicontrol(fg_SELErun,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.000 0.000 0.186 0.058],...
        'String','Back','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdar(''SELErun'',''SELErun_Back'',1)');
    end
    if action == 1
      plot(1,'k')
      axis('off')
      figure(fg_SELEset)
      page = 1;
    end
  end
  if strcmp(gui,'SELErun_bye')
    if action == 1
      close(fg_SELEset)
      close(fg_SELErun)
    end
  end
  if strcmp(gui,'SELErun_savethem')
    if action == 1
      %global SavedFile
      [filename,pathname] = uiputfile('*.mat',...
        'Data Filename',50,50);
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui,'SELErun_saveasthem')
    if action == 1
      %global SavedFile
      if (strcmp(SavedFile,''))
        [filename,pathname] = uiputfile('*.mat',...
          'Data Filename',50,50);
      else
        [filename,pathname] = uiputfile(SavedFile,...
          'Data Filename',50,50);
      end
      if filename ~= 0
        SavedFile = [pathname filename];
        clear filename pathname;
        eval(['save ' SavedFile ';']);
      end
    end
  end
  if strcmp(gui,'SELErun_Execute')
    if action == 1
      %global grptct
      %global gminspec
      %global gmaxspec
      %global gmaxiter
      %global gspectmax
      %global gspectmin
      %global gspectsteps
      %global gelpsteps
      %global gelpmin
      %global gelpmax
      %global stepcount
      %global S
      %global C
      load OBS
      TObs=OBS;
      Obs=TObs';
      [nk,mk] = size(kombi);
      totalsteps = gspectsteps*gelpsteps*(gmaxspec-gminspec+1)*gcurrpt;
      SVF=zeros(1,1);
      SFF=zeros(1,1);
      tstp = gtotalsteps * grptct;
      fprintf('\n*** AutoAR/Select ***\n\n')
      fprintf('Max iterations:%3.0f Repeats:%3.0f Total sets:%3.0f\n',...
        gmaxiter,grptct,tstp)
      if elpchoice == 1
        fprintf('Elution curve baseline: %7.6f - %7.6f Lin steps:%4.0f\n',...
          gelpmin/100,gelpmax/100,gelpsteps)
      else
        fprintf('Elution curve baseline: %7.6f - %7.6f Log steps:%4.0f\n',...
          gelpmin/100,gelpmax/100,gelpsteps)
      end
      if spectchoice == 1
        fprintf('Spectral baseline: %7.6f - %7.6f Lin steps:%4.0f\n',...
          gspectmin/100,gspectmax/100,gspectsteps)
      else
        fprintf('Spectral baseline: %7.6f - %7.6f Log steps:%4.0f\n',...
          gspectmin/100,gspectmax/100,gspectsteps)
      end
      fprintf('No. of species : %2.0f -%2.0f\n\n',gminspec,gmaxspec)
      if elpchoice == 1
        elpvector = linspace(gelpmin/100,gelpmax/100,gelpsteps);
      else
        elpmi = log10(gelpmin/100);
        elpma = log10(gelpmax/100);
        elpvector = logspace(elpmi,elpma,gelpsteps);
      end
      elp = elpvector(gelpcur);
      if spectchoice == 1
        spectvector = linspace(gspectmin/100,...
          gspectmax/100,gspectsteps);
      else
        spectmi = log10(gspectmin/100);
        spectma = log10(gspectmax/100);
        spectvector = logspace(spectmi,spectma,gspectsteps);
      end
      spect = spectvector(gspectcur);
      siemen = 0;
      siemen2 = 0;
      fprintf('Selection: \n\n')
      for ii = 1:nk
        wnspec = kombi(ii,1);
        welp = kombi(ii,2);
        wspect = kombi(ii,3);
        wAF = kombi(ii,4);
        wSF = kombi(ii,5);
        wsiemen = kombi(ii,6);
        wsiemen2 = kombi(ii,7);
        it = 0.00000001;
        if wnspec == gcurspec
          if abs(welp-elp) < it
            if abs(wspect-spect) < it
              siemen = wsiemen;
              siemen2 = wsiemen2;
            end
          end
        end
      end
      figure(fg_SELErun)
      stepcount = 0;
      disp('set   el.curve spectral no. of  iteration fit')
      disp('count baseline baseline species per level %')
      rand('seed',siemen);
      randn('seed',siemen2);
      nspec = gcurspec;
      F = zeros(gcurrpt,1);
      TT = zeros(1,nspec);
      TTT = zeros(gcurrpt,nspec);
      [nlines,nscans] = size(Obs);
      total = norm(Obs,'fro');
      for rr = 1:gcurrpt
        S = rand(nlines,nspec);
        bestfit=9999999;
        for iter=1:gcuriter
          S = (S > zeros(size(S))).*S;
          S = S+0.00000000001*abs(randn(size(S)));
          for j=1:nspec
            S(1:nlines,j)=S(1:nlines,j)/sum(S(1:nlines,j));
          end
          if iter == gcuriter
            P1=S*C;
            D1 = Obs-P1;
            diff = 100*norm(D1,'fro')/total;
            stepcount = stepcount+1;
            fprintf('%5.0f %7.6f %7.6f %4.0f %4.0f %5.3f\n',...
              stepcount,elp,spect,nspec,iter,diff)
            set(ed_SELErun_Stepct,'String',num2str(stepcount));
          end
          C1 = pinv(S);
          C = C1*Obs;                     %solve concentrations
          C = (C > zeros(size(C))).*C;
          C = C + 0.00000000001*abs(randn(size(C)));
          for j = 1:nspec
            if unichoice == 1             % Make monotonic
              temp = C(j,1:nscans);
              [arvo paikka] = max(temp);
              templs = sort(temp(1:paikka));
              temprs = sort(temp(paikka:nscans));
              C(j,1:nscans) = [templs temprs((nscans-paikka):-1:1)];
            end
            W = C(j,1:nscans);
            wbig = max(W);
            wsmall = min(W);
            wlimit = wbig*elp;
            if wsmall < wlimit
              ero = wlimit-wsmall;
              W=W+ero;
            end
            C(j,1:nscans) = W;
          end %j
          % Last iteration: show & save results
          if iter == gcuriter
            if unichoice == 1
              [WWW,KKK] = max(C');
              [LLL,MMM] = sort(KKK);
              C = C(MMM,:);
              S = S(:,MMM);
            else
              TT = sum(C');
              [WWW,KKK] = sort(TT);
              C = C(KKK,:);
              S = S(:,KKK);
            end
            plot(C')
            title(['AR iteration #',num2str(iter),' Fit=',num2str(diff)])
            xlabel('Scan number')
            drawnow
            save C C
            save S S
          end
          G = pinv(C');
          ST = G*TObs;
          ST = (ST > zeros(size(ST))).*ST;
          ST = ST + 0.00000000001*abs(randn(size(ST)));
          for j = 1:nspec
            W = ST(j,1:nlines);
            wbig = max(W);
            wsmall = min(W);
            wlimit = wbig*spect;
            if wsmall < wlimit
              ero = wlimit-wsmall;
              W=W+ero;
            end
            ST(j,1:nlines) = W;
          end %j
          S = ST';
        end %iter
        F(rr)=diff;
        TT=sum(C');
        TTT(rr,:)=TT;
      end %rr
      if gcurrpt == 1
        VV=0;
      else
        VV = std(TTT);
      end
      MM = mean(TTT);
      SVF(1) = 100*sum(VV)/sum(MM);
      SFF(1) = mean(F);
      fprintf('\nScatter: \n')
      fprintf('%6.4f %6.4f %6.4f %6.4f %6.4f %6.4f\n',SVF)
      fprintf('\nFit : \n')
      fprintf('%6.4f %6.4f %6.4f %6.4f %6.4f %6.4f\n',SFF)
      fprintf('\n')
    end
  end
  if strcmp(gui,'SELErun_Help')
    if action == 0
      pb_SELErun_Help = uicontrol(fg_SELErun,...
        'Style','pushbutton',...
        'Units','normalized','position',[0.814 0.000 0.098 0.058],...
        'String','Help','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzdar(''SELErun'',''SELErun_Help'',1);');
    end
    if action == 1
      help helpdar;
    end
  end
  if strcmp(gui,'SELErun_Save_s')
    if action == 1
      [filename,path] = uiputfile('S.mat','Save spectra file:');
      vari=filename;
      pit=length(filename);
      if filename ~= 0
        if pit > 4
          vari=filename(1:pit-4);
        end
        zapu=[vari,' = S;'];
        eval(zapu)
        rapu=['save ',vari,' ',vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui,'SELErun_Save_c')
    if action == 1
      [filename,path] = uiputfile('C.mat',...
        'Save elution curves file:');
      vari=filename;
      pit=length(filename);
      if filename ~= 0
        if pit > 4
          vari=filename(1:pit-4);
        end
        zapu=[vari,' = C;'];
        eval(zapu)
        rapu=['save ',vari,' ',vari];
        eval(rapu)
      end
    end
  end
  if strcmp(gui,'SELErun_Stepct')
    if action == 0
      ed_SELErun_Stepct = uicontrol(fg_SELErun,...
        'Style','edit','BackgroundColor',[0.50 0.50 0.50],...
        'Units','normalized','Position',[0.686 0.000 0.100 0.058],...
        'String','',...
        'CallBack','zzzdar(''SELErun'',''SELErun_Stepct'',1);');
    end
  end
end
% ##### End of program #####
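
The constraints that the SELErun_Execute branch of the listing applies to each elution curve in every iteration (non-negativity, the optional unimodality step, and the baseline offset controlled by elp) can be sketched outside MATLAB as follows. This is only an illustrative sketch in plain Python, not part of AutoAR: the function names clip_nonnegative, make_unimodal and lift_baseline are hypothetical, and a curve is assumed to be a simple list of numbers rather than a row of the matrix C.

```python
def clip_nonnegative(curve):
    """Set non-positive values to zero, like C = (C > 0).*C in the listing."""
    return [v if v > 0 else 0.0 for v in curve]


def make_unimodal(curve):
    """Force a single maximum: rising up to the peak, falling after it.

    Mirrors the sort-based step in the listing: the values up to and
    including the peak are sorted ascending, the values after the peak
    are sorted descending.
    """
    peak = max(range(len(curve)), key=lambda i: curve[i])  # first maximum
    left = sorted(curve[:peak + 1])
    right = sorted(curve[peak:], reverse=True)[1:]  # drop the duplicated peak
    return left + right


def lift_baseline(curve, fraction):
    """Raise the whole curve if its minimum is below fraction * maximum.

    'fraction' plays the role of elp (or spect for spectra) in the listing.
    """
    limit = max(curve) * fraction
    shortfall = limit - min(curve)
    if shortfall > 0:
        return [v + shortfall for v in curve]
    return list(curve)


if __name__ == "__main__":
    raw = [1, 3, 2, 5, 4, 1, 2]
    print(make_unimodal(raw))
    print(lift_baseline([0, 2, 10, 4, 0], 0.5))
```

The sort-based unimodality step keeps the values of the curve but reorders them around the maximum, so the total area of the curve is unchanged; the baseline step, in contrast, adds the same offset to every point.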
Chapter 18
Displaying the spectra and elution curves
When a point in the constraint space has been selected, the optimized results are recalculated for display purposes by the program zzzdar.m. The display of the selected results is then finally handled by the program zzzpost.m.

%ZZZPOST.M displays the spectra and elution curves found by AR.
% Date: 30 Sep 1995
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function zzzpost(page, gui, action);
global C
global echoice
global m
global n
global S
global SavedFile
global schoice
global sps
global ed_POSTset_Areatxt
global ed_POSTset_AreaVal
global fg_POSTset
global mi_POSTset_Help
global mi_POSTset_Quit
global mn_POSTset_Files
global mn_POSTset_GOTO
global pb_POSTset_Help
Figure 18.1 Program zzzpost.m displays the solved elution curves (C.mat) and their spectra (S.mat).
global pb_POSTset_Sp1
global pb_POSTset_Sp10
global pb_POSTset_Sp2
global pb_POSTset_Sp3
global pb_POSTset_Sp4
global pb_POSTset_Sp5
global pb_POSTset_Sp6
global pb_POSTset_Sp7
global pb_POSTset_Sp8
global pb_POSTset_Sp9
global rb_POSTset_Bar
global rb_POSTset_Line
global rb_POSTset_Spec
global rb_POSTset_Total
if nargin == 0
  page = 'initpage';
  gui = 'initgui';
  action = 0;
end
if strcmp(page,'initpage')
  fg_POSTset = figure;
  zzzpost('POSTset','initgui',0);
  page = '';
end
if strcmp(page,'POSTset')
  figure(fg_POSTset)
  set(gcf,'NumberTitle','off',...
    'Name','POSTset',...
    'backingstore','off');
  if strcmp(gui,'initgui')
    zzzpost('POSTset','POSTset_Menu1',0);
    zzzpost('POSTset','POSTset_Menu2',0);
    zzzpost('POSTset','POSTset_Sp1',0);
    zzzpost('POSTset','POSTset_Sp2',0);
    zzzpost('POSTset','POSTset_Sp3',0);
    zzzpost('POSTset','POSTset_Sp4',0);
    zzzpost('POSTset','POSTset_Sp5',0);
    zzzpost('POSTset','POSTset_Total',0);
    zzzpost('POSTset','POSTset_Spec',0);
    zzzpost('POSTset','POSTset_Sp6',0);
    zzzpost('POSTset','POSTset_Areatxt',0);
    zzzpost('POSTset','POSTset_AreaVal',0);
    zzzpost('POSTset','POSTset_Sp7',0);
    zzzpost('POSTset','POSTset_Sp8',0);
    zzzpost('POSTset','POSTset_Sp9',0);
    zzzpost('POSTset','POSTset_Sp10',0);
    zzzpost('POSTset','POSTset_Bar',0);
    zzzpost('POSTset','POSTset_Line',0);
    zzzpost('POSTset','POSTset_Help',0);
    zzzpost('POSTset','POSTset_Boot',1);
  end
  if strcmp(gui,'POSTset_Menu1')
    if action == 0
      mn_POSTset_Files = uimenu(fg_POSTset,...
        'Label','Files');
      mi_POSTset_Quit = uimenu(mn_POSTset_Files,...
        'Label','Quit',...
        'CallBack','zzzpost(''POSTset'',''POSTset_bye'',1);');
    end
  end
  if strcmp(gui,'POSTset_Menu2')
    if action == 0
      mn_POSTset_GOTO = uimenu(fg_POSTset,...
        'Label','GOTO');
      mi_POSTset_Help = uimenu(mn_POSTset_GOTO,...
        'Label','Help',...
        'CallBack','zzzpost(''POSTset'',''POSTset_Help'',1);');
    end
  end
  if strcmp(gui,'POSTset_setthem')
    if action == 1
      figure(fg_POSTset)
      page = 1;
    end
  end
  if strcmp(gui,'POSTset_lookthem')
    if action == 1
    end
  end
  if strcmp(gui,'POSTset_Sp1')
    if action == 0
      pb_POSTset_Sp1 = uicontrol(fg_POSTset,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.929 0.769 0.071 0.058],...
        'String','Sp1','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzpost(''POSTset'',''POSTset_Sp1'',1);');
    end
    if action == 1
      sp = 1;
      if m >= 1
        subplot(2,1,1)
        are = sum(C(sp,:))*100/sum(sps);
        set(ed_POSTset_AreaVal,'String',num2str(are));
        if echoice == 1
          plot(sps,'g-')
          hold on
          plot(C(sp,:),'r')
          hold off
        else
          plot(C(sp,:),'r')
        end
        title(['Species #1 '])
        xlabel('Scan number')
        ylabel('Relative Amount')
        subplot(2,1,2)
        if schoice == 1
          bar(S(:,sp),'r')
        else
          plot(S(:,sp),'r')
        end
      end
      drawnow
    end
  end
  if strcmp(gui,'POSTset_Sp2')
    if action == 0
      pb_POSTset_Sp2 = uicontrol(fg_POSTset,...
        'Style','pushbutton',...
        'Units','normalized','Position',[0.929 0.712 0.071 0.058],...
        'String','Sp2','BackgroundColor',[0.8 0.8 0.8],...
        'CallBack','zzzpost(''POSTset'',''POSTset_Sp2'',1);');
    end
    if action == 1
      sp=2;
      if m >= 1 & sp [...]

[...] prints out the initial comments in a program file. The zzzhelp.m displays the overall help text for the AutoAR program. The text is displayed in the Command window of MATLAB. When the Help button on the start-up page of AutoAR is pushed, it triggers the following call-back:

%ZZZHELP.M reads and displays the Help text of AutoAR.
% Date: 30 Sep 1995
function zzzhelp(page, gui, action);
help helpar
% ##### End of program #####
The following listing is the common help text, the text in the file helpar.m. Each subprogram has its own help text. They follow the listing of helpar.m.
Help text for A u t o A R 30 Sep 1995
M E N U (zzzmenu.m, AutoAR) is the starting point of A R analysis. T h e r e is a logical sequence you should use w h e n running the p r o g r a m for the first time with a n e w data setPreproc RunAR Show_2D Show_3D Select Spectra Run_stat Statis Data
-- P r e p r o c e s s i n g data -- Calculates spectra and elution curves w i t h a n u m b e r of different constraints -- Displays the solutions as a 2D plot -- Displays the solutions as a 3D plot -- Selects a subset of results for d i s p l a y -- Displays the extracted spectra and elution curves -- Solves for spectra and elution curves and calculates statistics for a selected p a r a m e t e r set -- Displays the extracted spectra and elution curves together with their standard deviations -- Displays m e a s u r e d spectra
% Before you can start to solve for spectra and elution curves (Run_AR)
% with different constraints you should do a preprocessing step (Preproc)
% where you select the data set and perhaps use smoothing etc. You can
% also save the selection under a new name.
%
% When you have calculated Run_AR, you can browse the results. You
% analyze which of the different constraint sets gives the best
% combination of fit and scatter (the repeatability of the solution).
% When the best combination has been found the solution is repeated
% for the best combination of constraints. This optimal parameter set
% is then used to produce the final spectra and elution curves.
% Another program calculates the standard deviation for the spectra
% and elution curves.
%
% Preproc   prepares the data set for the actual AR process.
%
% Run_AR    defines the constraint ranges for solving a number of AR
%           problems. The Run_AR combines the experiment design step
%           with the collection of statistics about fit and scatter of
%           the solutions.
%
% Show_2D   displays the solutions calculated by "Run_AR". The results
%           are shown as two-dimensional plots. Fit and scatter are
%           shown alone or in combination. You select the parameters
%           among the values used by the previous AR calculation step.
%
% Show_3D   displays the solutions calculated by "Run_AR". The results
%           are shown as three-dimensional plots. Fit and scatter or
%           their sum is shown. You select the parameters among the
%           values used by the previous AR calculation step.
%
% Select    is used to pick one parameter set from the many produced by
%           running the "Run_AR" program. The solution can be inspected
%           by the "Spectra" program. The number of repeats and the
%           number of iterations can be freely extended.
%
% Spectra   shows one of the resolved components at a time. The elution
%           curve is shown with or without the original chromatographic
%           curve. The spectra can be shown in bar graph or line graph
%           format.
%
% Run_stat  is used to pick one parameter set from the many produced by
%           running the "Run_AR" program. The solution can be inspected
%           by the "Statis" program. The number of repeats and the
%           number of iterations can be freely extended.
%
Statis
shows one of the resolved components at a time together with its standard deviation. The elution curve is shown with or without the original chromatographic curve. The spectra and their standard deviations (below zero values) can be shown in bar graph or line graph format.
Data
displays a measured spectrum or a spectral line. You must input one spectrum/spectral line number. Then you can move to the next or to the previous one by pressing a pushbutton.
Help
gives program-specific advice. You can use either a push button or the MATLAB menu item GOTO to get help. The help text will be printed in MATLAB's Command window.
AutoAR adds two pulldown menus to MATLAB. These are:
Files
In all parts of AutoAR you can select Quit to get to the main page of AutoAR. Sometimes there are additional choices.
GOTO
contains in all parts of AutoAR at least a Help selection. It gives the same help text as the Help button in the same view. In most parts of the program you can toggle between the dialog page (for user input) and the figure page (for results).
AutoAR and other MATLAB programs write results and comments into MATLAB's Command window. This report can be printed on a printer. The whole content of the window or a selection of it can be printed. For these purposes the Print... or Print Selection... commands from the File menu are used.
To save the report into a file, you should enter the 'diary on' command in the MATLAB Command window before you start the AutoAR program. This is a normal MATLAB feature.
NOTE that you can do only one of the following operations at a time: TRANSPOSE, SEGMENT, CONVOLVE or SVD. All of these start their process using the freshly loaded data matrix. If you intend to do several sequential operations, you have to save the data between the operations.
%HELPPRE Help text for Preproc.
% Date: 30 Sep 1995
The preprocessing program (zzzpre.m) does the following:
- LOAD
Reads in an observation matrix (in .MAT format). Do not enter anything into the text field, which is filled automatically after you have made your choice. Push the LOAD button. The size of the data matrix (no. of scans, no. of lines) is shown along with its name. Now you are ready to apply the preprocessing methods.
- SAVE
Saves the modified observation matrix in .MAT format using the filename you give. Do not enter anything into the text field, which is filled automatically after you have made your choice. To avoid destroying the original data, never use the same filename as the original matrix.
- TRANSPOSE
You can transpose the matrix if the axes are swapped. When you push the TRANSPOSE button, the size of the transposed matrix appears in the fields of "File to be saved". To start the transpose process push the RUN button. The transposed data matrix is shown on a new figure page as a mesh diagram. Return to the dialog page by pressing the BACK button. You can undo this operation by pressing the NONE button.
- SEGMENT
Lets you select a segment from the original data matrix. First enter the values for the first and the last scan into the text fields. Then press the SEGMENT button. The size of the matrix segment appears in the fields of "File to be saved". To start the actual segment selection press the RUN button. The selected part of the data matrix is shown on a new figure page as a mesh diagram. Return to the dialog page by pressing the BACK button. You can undo this operation by pressing the NONE button.
- CONVOLVE
You can make a convolution using a filter matrix. First press the CONVOLVE button. Then select an existing filter matrix (in .MAT format). To perform the convolution press the RUN button. The convolved data matrix is shown on a new figure page as a mesh diagram. Return to the dialog page by pressing the BACK button. You can undo this operation by pressing the NONE button.
- FACTORIZE
By factorizing the data matrix you get an estimate of the number of components in the data. To perform the factorization push first the FACTORIZE button and then the RUN button. A diagram of the magnitude of eigenvalues is shown on the figure page. Return to the dialog page by pressing the BACK button. You can undo this operation by pressing the NONE button.
- SVD
Lets you perform the SVD (Singular Value Decomposition). This operation is also known as PCA in the chemometric literature. The number of species you give here should be estimated first. First enter the number of species into the corresponding field and then push the SVD button. In the Command window you will see the value of 'noise', which is the percentage difference between the model and the data. Push the RUN button to perform the SVD. Return to the dialog page by pressing the BACK button. You can undo the SVD operation by pressing the NONE button.
- NONE
Undoes the effect of the previous operation and reselects the original data matrix. To perform this push first the NONE button and then the RUN button. Return to the first figure page by pressing the BACK button.
It is not necessary to save the processed matrix if you only want to use it in one analysis session. If you intend to use the same modified matrix later or if you want to make sequential operations, then the easiest way to do this is to save the modified matrix under a new name.
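The SVD step and the 'noise' value mentioned in the help text can be illustrated with a minimal sketch. It assumes that 'noise' is the Frobenius norm of the residual relative to the norm of the data, expressed as a percentage. The help text does not give the exact formula, and the names Obs, nspec, Model and noise_pct are illustrative only.

```matlab
% Sketch: rank-limited SVD model of a synthetic observation matrix.
% Assumption: noise = 100 * ||Obs - Model||_F / ||Obs||_F (not confirmed by the help text).
nscans = 40; nlines = 30; nspec = 2;
t = (1:nscans)';
C = [exp(-((t-15)/4).^2), exp(-((t-22)/5).^2)];   % two synthetic elution curves
k = 1:nlines;
S = [abs(sin(0.3*k)); abs(cos(0.2*k))];           % two synthetic spectra
Obs = C*S + 0.001*sin((1:nscans)'*(1:nlines));    % small deterministic perturbation
[U, D, V] = svd(Obs, 0);                          % economy-size decomposition
Model = U(:,1:nspec) * D(1:nspec,1:nspec) * V(:,1:nspec)';   % keep nspec factors
noise_pct = 100 * norm(Obs - Model, 'fro') / norm(Obs, 'fro');
disp(['noise = ', num2str(noise_pct), ' %'])
```

With two true components and nspec = 2, the reported value stays small. Lowering nspec to 1 in the sketch makes it rise sharply, which is the kind of change one would look for when estimating the number of species.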
%HELPARI Help text for Run_AR.
% Date: 30 Sep 1995
This program (zzzar.m) designs the AR experiment and collects the results with their statistics. The goal is to choose parameters which optimize the combination of fit and scatter (the repeatability of the solution).
If you don't know the proper inputs, you can run the program with default parameter values, which the program generates if no "params.txt" file is found. By observing the results you can change the parameter values to correspond better to your data.
First you select, on the dialog page, a range of parameters which constrain the solution space.
- RUN
Starts the analysis. You are asked to select the data file before the program continues.
- HELP
Lists this text in the Command window.
User inputs on the dialog page:
Elution curve baseline: The lowest baseline value is forced to be at least a certain percentage of the highest value. If we choose a value of 0.1% for the baseline, the smallest value is set to be at least 0.1% of the maximum. We can define the baseline values in the analysis in steps, starting from a low value and proceeding to the highest value. The maximum number of steps is 10.
Unimodality forces the elution curve to have a single maximum. The checkbox should be on for chromatography and off for discrete samples.
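The baseline constraint described above can be sketched in a few lines. This is only an illustration under the assumption that the constraint is applied by clipping small values. The help text does not say how AutoAR enforces it, and the variable names are hypothetical.

```matlab
% Sketch: force the lowest value of an elution curve to be at least a
% given percentage of its maximum (assumed here to be done by clipping).
baseline_pct = 0.1;                         % baseline, in percent of the maximum
t = 1:40;
c = exp(-((t-20)/3).^2);                    % synthetic elution curve
floor_value = (baseline_pct/100) * max(c);  % smallest value allowed
c_constrained = max(c, floor_value);        % raise everything below the floor
ratio_pct = 100 * min(c_constrained) / max(c_constrained);
disp(['lowest value as % of maximum = ', num2str(ratio_pct)])
```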
Spectral baseline: The lowest spectrum value is forced to be at least a certain percentage of the highest value. The baseline can vary between 0.0001% and 100% of the highest intensity.
The number of species: The number of species (chemical components) that are present in the observation matrix is defined here. The number of species can vary between one and ten. It is possible to vary the number of components in the analysis by stating the lowest and highest number of components to be used.
The number of repeats: The precision of the results depends on this number. The precision of the estimated fit and scatter can suffer if not enough repeat experiments from different starting points are performed.
The number of AR iterations: The AR process converges to a solution typically after ten iterations of the algorithm that starts with random numbers for the spectra. To be safe, 20 iterations are recommended.
Program output on the dialog page:
The number of parameter sets to be calculated: The number of calculations that are performed in an AR analysis depends on the number of different parameter sets. Let us assume that we vary the number of components from one to five. Likewise, we try five different values for the baseline of elution curves and five different values for the spectral baseline. We end up with 5 x 5 x 5 = 125 different parameter sets. This number is shown during the calculations in the "Run_AR" window. The number of the current parameter set is shown in the left field. It can be compared with the total number of parameter sets that is shown in a field at right.
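The count of 5 x 5 x 5 parameter sets can be reproduced by enumerating all combinations of the three constraints. The sketch below is an illustration only: the baseline values are hypothetical, and the way AutoAR stores its parameter sets is not described in the help text.

```matlab
% Sketch: enumerate every combination of three constraint parameters.
nspecies  = 1:5;                        % number of components, one to five
elu_base  = [0.01 0.1 1 5 10];          % elution curve baselines in % (hypothetical values)
spec_base = [0.001 0.01 0.1 1 10];      % spectral baselines in % (hypothetical values)
[N, E, S] = ndgrid(nspecies, elu_base, spec_base);
param_sets = [N(:), E(:), S(:)];        % one row per parameter set
nsets = size(param_sets, 1);
disp(['Number of parameter sets = ', int2str(nsets)])
```

Each parameter set is solved once per repeat, so the total work grows with the product of the number of parameter sets and the number of repeats.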
%HELPDP Help text for Show_2D Plots.
% Date: 30 Sep 1995
The 2D Plots program (zzzdp.m) is used to inspect the results of the AR experiment obtained by executing the 'Run_AR' program. The goal is to find which combination of constraints produces the best combination of fit and scatter (the repeatability of the solution).
User inputs on the dialog page:
First you should select the parameter for the x-axis using radio buttons. When this is done the other two choices show up in two collections of fields. If you have selected, e.g., the number of species for the x-axis, you see the possible values of the elution curve baseline in the middle field collection and the values of the spectrum baseline in the rightmost field collection. You select one value from each collection.
The y-axis displays one or more of the model metrics.
- Fit
expresses the L2-norm of the modelling error compared to the L2-norm of the original observations. The ratio of the norms is expressed as a percentage.
- Scatter
expresses the reproducibility of the repeated solutions. The percentage is based on relative component concentrations.
- Fit_et_Scatter
Shows both the fit and the scatter curves.
- Combined
Shows the curve formed by the weighted sum of the fit and the scatter.
- All
Shows the fit, the scatter and the combined curves.
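The fit metric defined above, and the weighted sum behind the combined curve, can be sketched as follows. The sketch assumes that the L2-norm of a matrix means the Frobenius norm. The scatter value and the weights are hypothetical placeholders, since the help text does not state them.

```matlab
% Sketch: fit as a ratio of norms in percent, and a weighted combination.
t = (1:40)';
C = [exp(-((t-15)/4).^2), exp(-((t-22)/5).^2)];   % synthetic elution curves
k = 1:30;
S = [abs(sin(0.3*k)); abs(cos(0.2*k))];           % synthetic spectra
Obs = C*S;                                        % "observed" data
C_est = 1.02 * C;                                 % deliberately imperfect estimate
Model = C_est * S;                                % model reconstructed from the estimate
fit_pct = 100 * norm(Obs - Model, 'fro') / norm(Obs, 'fro');
scatter_pct = 1.5;                                % hypothetical scatter value in percent
w_fit = 0.5; w_scatter = 0.5;                     % hypothetical weights
combined = w_fit*fit_pct + w_scatter*scatter_pct;
disp([fit_pct, combined])
```

A 2% scaling error in the elution curves gives a fit of 2%, which shows that the metric reads directly as a relative modelling error.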
- LOOK
Starts the plotting.
%HELPDP3 Help text for Show_3D Plots.
% Date: 30 Sep 1995
The 3D Plots program (zzzdp3.m) is used to inspect the results of the AR experiment obtained by executing the 'Run_AR' program. The goal is to find which combination of constraints produces the best combination of fit and scatter (the repeatability of the solution).
User inputs on the dialog page:
First you should select the parameters for the x-axis and y-axis using the radio buttons. Note that you can choose two parameters from a set of three. The choice of the fixed value for the third parameter is made by the user from a set of radio buttons formed dynamically by the program. The set of radio buttons is dependent on the other choices of the user. The error surfaces can be shown from different points of view depending on the choice of x and y.
Because we have a surface on the plot we can display only one error type at a time. In the uppermost right-hand corner we have the radio buttons for the choice of fit, scatter or their combination. Finally the plotting function is selected from the collection of radio buttons in the bottom right-hand corner.
%HELPDAR Help text for Select.
% Date: 30 Sep 1995
This program (zzzdar.m, Select) selects a subset of AR results for display. The program is used to select the combination of constraints that seems best. The results of this selection are displayed later by the program 'Spectra'.
The first page contains two sets of parameter fields. The leftmost set (Parameters for Analysis) shows the parameters which were used for running the AR experiment by executing the 'Run_AR' program. The rightmost set (Selected Subset) displays your selection.
- Load_Default
Pushing this button will fill the fields in the rightmost set (Selected Subset) with default values. You cannot give values larger than the maximum value or smaller than the minimum value of each parameter. When starting the program any erroneous parameter values will be changed to the nearest possible values. The default values are:
- Selected steps for elution curve baseline and spectral baseline are the minimum values given on the left set.
- The selected number of species is the smallest number which has been used for calculation.
- Number of repeats is 1. This gives a zero for the scatter of results. You can input any value here.
- Number of AR iterations is 20. You can input any value here.
The user should select the step which produces the optimum value for a constraint. The constraints that are selected are the elution curve baseline and the spectral baseline. The user fills the step number into the upper field for each constraint. The numerical value for the parameter is filled later automatically into the lower text field by the program.
- RUN
Starts the analysis. If you have not input the parameters, default values are assumed.
%HELPPOST Help text for Spectra.
% Date: 30 Sep 1995
The program (zzzpost.m) displays the extracted spectra and elution curves. You can select the species to view by pushing a button at the right-hand side of the screen. The numerical values displayed by this program must be selected first by the program 'Select'. The species to display is chosen by pushing a button with the desired label ('Sp1', 'Sp2' etc.).
The number of species: The maximum number of species the program can handle is 10. If there are fewer, the unnecessary pushbuttons (Sp2...Sp10) are hidden.
Elution curve: The area under the elution curve corresponds to the relative amount of this species. The elution curve of a compound can be displayed with (radio button Total) or without (radio button Spec) the total chromatogram. The total elution curve (= sum curve) is plotted using a green dotted line. The elution curve of the current species is plotted using a red solid line.
Spectrum: The spectrum vector is scaled to a length of one. This means that the sum of all intensities is one. The spectrum can be displayed as a line graph or as a bar graph. The line graph format (radio button Line) is suitable for UV-Vis or NIR spectra while the mass spectra look better in bar graph format (radio button Bar).
%HELPDAR2 Help text for Run_stat.
% Date: 30 Sep 1995
This program (zzzdar2.m, Run_stat) selects a subset of AR results for display. The program is used to pick the combination of constraints that seems best. The results of this selection are displayed later by the program 'Statis'. The spectra and elution curves have additionally their standard deviations displayed.
User inputs on the dialog page:
The first page contains two sets of parameter fields. The leftmost set (Parameters for Analysis) shows the parameters which were used for running the AR experiment by executing the 'Run_AR' program. The rightmost set (Selected Subset) displays your selection.
- Load_Default
Pushing this button will fill the fields in the rightmost set (Selected Subset) with default values. You cannot give values larger than the maximum value or smaller than the minimum value of each parameter. When starting the program any erroneous parameter values will be changed to the nearest possible values. The default values are:
- Selected steps for elution curve baseline and spectral baseline are the minimum values given on the left set.
- The selected number of species is the smallest number which has been used for calculation.
- Number of repeats is 1. This gives a zero for the scatter of results. You can input any value here.
- Number of AR iterations is 20. Here you can give a number which is larger than the one you used for previous calculations. This way you can get a more accurate result.
The user should select the step which corresponds to the optimum value for a constraint. The constraints that are selected are the elution curve baseline and the spectral baseline. The user fills the step number into the upper field for each constraint. The numerical value for the parameter is filled later automatically into the lower text field by the program.
- RUN
Starts the analysis. If you have not input the parameters, default values are assumed.
%HELPPOS2 Help text for Statis.
% Date: 30 Sep 1995
The program (zzzpost2.m) displays the extracted spectra and elution curves together with their standard deviations. You can select the species to view by pushing a button on the right-hand side of the screen. The numerical values displayed by this program must be selected first by the program 'Select'. The species to display is chosen by pushing a button with the desired label ('Sp1', 'Sp2' etc.).
The number of species: The maximum number of species the program can handle is 10. If there are fewer, the unnecessary pushbuttons (Sp2...Sp10) are hidden.
Elution curve: The area under the elution curve corresponds to the relative amount of this species. The elution curve of a compound can be displayed with (radio button Total) or without (radio button Spec) the total chromatogram. The total elution curve (= sum curve) is plotted using a green dotted line. The elution curve of the current species is plotted using a red solid line. The standard deviation in the elution curve of the species is plotted downwards using a blue solid line.
Spectrum: The spectrum vector is scaled to a length of one. This means that the sum of all intensities is one. The spectrum can be displayed as a line graph or as a bar graph. The line graph format (radio button Line) is suitable for UV-Vis or NIR spectra while the mass spectra look better in bar graph format (radio button Bar). The spectrum of the selected species is plotted using a red solid line. The standard deviation in the spectrum of a species is plotted downwards using a blue solid line.
%HELPORIG Help text for Data.
% Date: 30 Sep 1995
The program (zzzorig.m) displays measured spectra and spectral line curves. You can select both the number of the spectrum and the number of the spectral line you want to display. You can alternate between the display of the spectrum and the display of the spectral line.
LOAD
Reads in an observation matrix (in .MAT format). Do not enter anything into the text field, which is filled automatically after you have made your choice. Push the LOAD button. The size of the data matrix (no. of scans, no. of lines) is shown along with its name. You can load either an original data matrix or one of the preprocessed matrices.
Spectrum: The spectrum can be displayed as a line graph or as a bar graph. The line graph format (radio button Line) is suitable for UV-Vis or NIR spectra while the mass spectra look better in bar graph format (radio button Bar). The spectrum of the selected species is plotted using a red solid line.
Line curve: The area under the curve corresponds to the relative amount of this spectral line. The curve of an individual spectral line can be displayed with (radio button 'Total') or without (radio button 'Single') the total sum chromatogram. The total elution curve (= sum curve) is plotted using a green dotted line. The curve of the individual spectral line is plotted using a red solid line.
Scaling factor: The default scaling factor is 1. Higher values can be used to emphasize spectral line curves with low intensities relative to the total chromatogram.
The controls on display page:
Prev: Used to toggle to the previous spectral line curve or spectrum.
Next: Used to toggle to the next spectral line curve or spectrum.
Show: Redraws the spectral line curve or spectrum after changing their settings (Single/Total or Bar/Line).
1995 by Erkki & Ulla Karjalainen
Appendix
Program 2.2: savgol1.m
Program 2.3: savgol2.m
Program 2.4: savgol3.m
Program 2.5: outlier.m
Program 2.6: bellshap.m
Program 3.1: callgau.m
Program 3.2: gaussian.m
Program 4.1: libsear1.m
Program 4.2: libsear2.m
Program 5.1: purity.m
Program 5.3: locpeel.m
Program 5.4: highpass.m
Program 2.2: savgol1.m
%SAVGOL1.M Savitzky-Golay filtering using polynomial functions.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
rand('seed',0)
tsl=1200;
m=1;
nr=16;
nl=16;
pit=nr+nl+1;
par(1)=1;      %first point on x-axis
par(2)=tsl;    %last point on x-axis
par(3)=tsl;    %number of divisions
par(4)=100;    %position of the gaussian peak center 1
par(5)=16;     %halfwidth of the gaussian curve 1
par(6)=200;    %area under the gaussian curve 1
TS1=(gaussian(par))';
par(4)=200;    %position of the gaussian peak center 2
par(5)=24;     %halfwidth of the gaussian curve 2
par(6)=300;    %area under the gaussian curve 2
TS2=(gaussian(par))';
par(4)=300;    %position of the gaussian peak center 3
par(5)=32;     %halfwidth of the gaussian curve 3
par(6)=500;    %area under the gaussian curve 3
TS3=(gaussian(par))';
par(4)=500;    %position of the gaussian peak center 4
par(5)=48;     %halfwidth of the gaussian curve 4
par(6)=600;    %area under the gaussian curve 4
TS4=(gaussian(par))';
par(4)=900;    %position of the gaussian peak center 5
par(5)=240;    %halfwidth of the gaussian curve 5
par(6)=3000;   %area under the gaussian curve 5
TS5=(gaussian(par))';
TS6=rand(tsl,1);
TS=TS1+TS2+TS3+TS4+TS5+TS6;
tic
B=(1:pit)';
TSP=zeros(size(TS));
for ijk=1:tsl-pit
  YY=TS(ijk:ijk+pit-1);
  p=polyfit(B,YY,2);
  yp=polyval(p,(nl+1));
  TSP(nl+ijk,1)=yp;
end
toc
plot(TS,'.')
title(['Savitzky-Golay ' num2str(pit) ' points, degree = ' num2str(m)])
hold on
plot(TSP)
hold off
Program 2.3: savgol2.m
%SAVGOL2.M Savitzky-Golay filtering using matrix algebra.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
rand('seed',0)
tsl=1200;
m=2;
nr=16;
nl=16;
pit=nr+nl+1;
par(1)=1;      %first point on x-axis
par(2)=tsl;    %last point on x-axis
par(3)=tsl;    %number of divisions
par(4)=100;    %position of the gaussian peak center 1
par(5)=16;     %halfwidth of the gaussian curve 1
par(6)=200;    %area under the gaussian curve 1
TS1=(gaussian(par))';
par(4)=200;    %position of the gaussian peak center 2
par(5)=24;     %halfwidth of the gaussian curve 2
par(6)=300;    %area under the gaussian curve 2
TS2=(gaussian(par))';
par(4)=300;    %position of the gaussian peak center 3
par(5)=32;     %halfwidth of the gaussian curve 3
par(6)=500;    %area under the gaussian curve 3
TS3=(gaussian(par))';
par(4)=500;    %position of the gaussian peak center 4
par(5)=48;     %halfwidth of the gaussian curve 4
par(6)=600;    %area under the gaussian curve 4
TS4=(gaussian(par))';
par(4)=900;    %position of the gaussian peak center 5
par(5)=240;    %halfwidth of the gaussian curve 5
par(6)=3000;   %area under the gaussian curve 5
TS5=(gaussian(par))';
TS6=rand(tsl,1);
TS=TS1+TS2+TS3+TS4+TS5+TS6;
tic
A=vander(1:pit);
A=fliplr(A);
A=A(:,1:m+1);
C=pinv(A);
AH=A(nl+1,:);
CF=AH*C;
TSS=TS(1:pit,1);
Y=(toeplitz(TS,TSS));
Y=[Y; zeros(nl,pit)];
TSP=(CF*Y')';
TSP(1:nl)=[];
toc
plot(TS,'.')
title(['Savitzky-Golay ' num2str(pit) ' points, degree = ' num2str(m)])
hold on
plot(TSP)
hold off
Program 2.4: savgol3.m
%SAVGOL3.M Savitzky-Golay filtering using convolutions.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
rand('seed',0)
tsl=1200;
m=4;
nr=16;
nl=16;
pit=nr+nl+1;
par(1)=1;      %first point on x-axis
par(2)=tsl;    %last point on x-axis
par(3)=tsl;    %number of divisions
par(4)=100;    %position of the gaussian peak center 1
par(5)=16;     %halfwidth of the gaussian curve 1
par(6)=200;    %area under the gaussian curve 1
TS1=(gaussian(par))';
par(4)=200;    %position of the gaussian peak center 2
par(5)=24;     %halfwidth of the gaussian curve 2
par(6)=300;    %area under the gaussian curve 2
TS2=(gaussian(par))';
par(4)=300;    %position of the gaussian peak center 3
par(5)=32;     %halfwidth of the gaussian curve 3
par(6)=500;    %area under the gaussian curve 3
TS3=(gaussian(par))';
par(4)=500;    %position of the gaussian peak center 4
par(5)=48;     %halfwidth of the gaussian curve 4
par(6)=600;    %area under the gaussian curve 4
TS4=(gaussian(par))';
par(4)=900;    %position of the gaussian peak center 5
par(5)=240;    %halfwidth of the gaussian curve 5
par(6)=3000;   %area under the gaussian curve 5
TS5=(gaussian(par))';
TS6=rand(tsl,1);
TS=TS1+TS2+TS3+TS4+TS5+TS6;
tic
A=vander(1:pit);
A=fliplr(A);
A=A(:,1:m+1);
C=pinv(A);
AH=A(nl+1,:);
CF=AH*C;
TSP=conv(TS,CF);
TSP(1:nl)=[];
tsplen=length(TSP);
TSP(tsplen-nl+1:tsplen)=[];
toc
plot(TS,'.')
title(['Savitzky-Golay ' num2str(pit) ' points, degree = ' num2str(m)])
hold on
plot(TSP)
hold off
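The coefficient vector CF used in savgol2.m and savgol3.m can be examined on its own. The following is a minimal sketch, not part of the book's listings, that builds CF in the same way and checks two properties expected of Savitzky-Golay smoothing weights: they sum to one, and they reproduce a polynomial of degree m or lower exactly at the window centre. The test polynomial y is an arbitrary illustration.

```matlab
% Sketch: Savitzky-Golay centre-point weights via the pseudoinverse.
nl = 16; nr = 16; m = 2;
pit = nl + nr + 1;                 % number of points in the window
A = fliplr(vander(1:pit));         % columns are x.^0, x.^1, x.^2, ...
A = A(:, 1:m+1);                   % keep the powers 0..m
CF = A(nl+1, :) * pinv(A);         % weights giving the fitted value at the centre
x = (1:pit)';
y = 3 + 0.5*x - 0.01*x.^2;         % arbitrary polynomial of degree 2
centre_fit = CF * y;               % smoothed value at the window centre
disp([sum(CF), centre_fit, y(nl+1)])
```

For a symmetric window these smoothing weights are symmetric about the centre, so the kernel reversal performed by conv does not change the result.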
Program 2.5: outlier.m
%OUTLIER.M checks data for outliers.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
echo on
clear
clg
clc
% If you want to be successful in the later steps of the data
% treatment you must inspect the data to see if there are
% any gross errors in it. Outliers in the data can ruin
% the actual modelling programs. Therefore, they must be
% eliminated.
% After the visual inspection, the computer should be used
% to develop criteria for faulty data.
%
% The first step is always to inspect some projections of
% the data to see if there are any obviously aberrant
% points. The following small program loads a data matrix
% and then shows a number of projections for visual
% inspection.
load Obs
frstsc=1;
lastsc=100;
AObs=Obs(frstsc:lastsc,:);
Obs=AObs;
AObs=[];
pack
plot(sum(Obs'))
%
pause % Hit any key
%
plot(sum(Obs))
%
pause % Hit any key
%
contour(Obs)
%
pause % Hit any key
%
plot(Obs)
%
pause % Hit any key
%
plot(Obs')
%
pause % Hit any key
% After these inspections, the data can be subjected to
% some processing.
% For example, we can take the first derivative
% of the observations and plot the results.
%
DDObs=diff(diff(Obs));
%
plot(DDObs)
pause %Hit any key
clg
% Another approach is to find suspect data points
% by local smoothing operations.
% We can calculate a smoothed version of the original
% data base and then we can find out where the largest
% differences between the original and smoothed versions
% are located. The largest residuals indicate candidate
% points for outliers.
nfilter=5
[nscans,nlines]=size(Obs)
F=(1/sum(bellshap(nfilter)))*bellshap(nfilter);
SObs=conv2(Obs,F);
save SObs SObs
[snscans,snlines]=size(SObs)
RSObs=SObs((nfilter-1)/2+1:snscans-(nfilter-1)/2,:);
SObs=RSObs;
RSObs=[];
pack
DObs=Obs-SObs;
Obs=[];
SObs=[];
pack
w=sort(DObs(:));
plot(w)
pause %Hit any key
load Obs;
load SObs;
for i=1:10
  subplot(111)
  clg
  disp(i)
  v1=min(w);
  H=(DObs == v1);
  xc=find(sum(H));
  yc=find(sum(H'));
  oval=Obs(yc,xc);
  disp(['Maximum value at line =',int2str(xc)])
  disp(['Maximum value at scan =',int2str(yc)])
  disp(['Observed value =',int2str(oval)])
  disp(['Smoothed value =',int2str(v1)])
  subplot(211)
  plot(Obs(yc,:))
  subplot(212)
  plot(Obs(:,xc))
  hold on
  plot(SObs(:,xc))
  pause
  hold off
  v2=max(w);
  H=(DObs == v2);
  xc=find(sum(H));
  yc=find(sum(H'));
  oval=Obs(yc,xc);
  disp(['Minimum value at line =',int2str(xc)])
  disp(['Minimum value at scan =',int2str(yc)])
  disp(['Observed value =',int2str(oval)])
  disp(['Smoothed value =',int2str(v2)])
  subplot(211)
  plot(Obs(yc,:))
  subplot(212)
  plot(Obs(:,xc))
  hold on
  plot(SObs(:,xc))
  pause
  hold off
  wlen=length(w);
  w(wlen)=[];
  w(1)=[];
end
Program 2.6: bellshap.m
% FUNCTION GGG = bellshap(par) generates a bellshaped curve
% the function has par points.
% PAR should be odd.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
function ggg = bellshap(par)
xxx=linspace(1,par,par);
center=(par+1)/2;
dxxx=xxx-center;
hw=par/3;
ggg=exp(-(dxxx./(0.600561*hw)).^2);
ggg=ggg-min(ggg);
ggg=(1/sum(ggg)*ggg);
end
Program 3.1: callgau.m
%CALLGAU.M main program for the generation of four gaussian
% chromatographic peaks and random spectra.
% Calls the function GAUSSIAN.
%
% Data handling for hyphenated techniques
% (c) by Erkki & Ulla Karjalainen 1995
%
nspec=4;
nlines=100;
nscans=40;
rand('seed',0)
init=input('Empty randoms ');
A=rand(init,1);
S=rand(nspec,nlines);
P=rand(size(S));
PP=P>0.8;
S=PP.*S;
for i=1:nspec
  S(i,:)=(1/sum(S(i,:)))*S(i,:);
end
par(1)=1;        %first point on x-axis
par(2)=nscans;   %last point on x-axis
par(3)=nscans;   %number of divisions
par(4)=9;        %position of the gaussian peak center
par(5)=3;        %halfwidth of the gaussian curve
par(6)=20;       %area under the gaussian curve
row1=gaussian(par);
par(4)=12;       %position of the gaussian peak center
par(5)=5;        %halfwidth of the gaussian curve
par(6)=8;        %area under the gaussian curve
row2=gaussian(par);
par(4)=17;       %position of the gaussian peak center
par(5)=7;        %halfwidth of the gaussian curve
par(6)=15;       %area under the gaussian curve
row3=gaussian(par);
par(4)=28;       %position of the gaussian peak center
par(5)=9;        %halfwidth of the gaussian curve
par(6)=25;       %area under the gaussian curve
row4=gaussian(par);
C=[row1; row2; row3; row4];
C=C';
plot(C)
title('Synthetic elution curves');
hold on
Obs1=C*S;
plot(sum(Obs1'))
hold off
save Obs1 Obs1
save C C
save S S
pause
mesh(Obs1');
title('Synthetic GC-MS data');
pause
contour(Obs1);
title('Synthetic GC-MS data');
pause
bar(S(1,:));
title('Spectrum #1');
pause
bar(S(2,:));
title('Spectrum #2');
pause
bar(S(3,:));
title('Spectrum #3');
pause
bar(S(4,:));
title('Spectrum #4');
Program 3.2: gaussian.m

function ggg=gaussian(par)
% FUNCTION GAUSSIAN(PAR)
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
% Generates gaussian curves with defined properties
% par(1) is the first point on x-axis
% par(2) is the last point on x-axis
% par(3) is the number of divisions in the curve to be made
% par(4) is the position of the gaussian peak center
% par(5) is the halfwidth of the gaussian curve
% par(6) is the area under the gaussian curve
xxx = linspace(par(1),par(2),par(3));
dxxx = xxx-par(4);
ggg = exp(-(dxxx./(0.600561*par(5))).^2);
ggg = par(6)*((1/sum(ggg))*ggg);
end;
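The constant 0.600561 in gaussian.m agrees with 1/(2*sqrt(log(2))) to six decimals, which means par(5) behaves as the full width at half maximum of the unscaled curve, and the last statement scales the sampled values so that they add up to par(6). The sketch below checks both properties numerically. It repeats the statements of gaussian.m inline so that it runs on its own; the names k, half_height and total are illustrative and do not come from the book.

```matlab
% Sketch: check the width and area conventions used in gaussian.m.
par  = [1 40 40 17 7 15];            % same layout as par(1)..par(6)
xxx  = linspace(par(1), par(2), par(3));
dxxx = xxx - par(4);
ggg  = exp(-(dxxx ./ (0.600561*par(5))).^2);

% 0.600561 is 1/(2*sqrt(log(2))) to six decimals
k = 1/(2*sqrt(log(2)));

% Height of the unscaled curve half a width away from the centre
half_height = exp(-((par(5)/2) / (0.600561*par(5)))^2);

% After scaling, the samples add up to par(6)
ggg   = par(6) * ((1/sum(ggg)) * ggg);
total = sum(ggg);

fprintf('k = %.6f, half height = %.4f, total = %.4f\n', k, half_height, total);
```

The sum of the samples equals the area under the curve only when the x-axis spacing is one unit, as it is in callgau.m, where par(3) equals the number of scans.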
Program 4.1: libsear1.m

%LIBSEAR1 makes library searches by full correlation.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995

clear
load OBS3
[nscans,nlines]=size(OBS3);
Obs=OBS3';
load LIB
[libsize,nlines]=size(LIB);
uk=input('Position of unknown = ');
disp(' ')
sk=OBS3(uk,:);
ave=mean(sk);
sk=sk-ave;
sk=(1/sqrt(sk*sk'))*sk;
start=clock;
for i=1:libsize
  K=corrcoef(sk,LIB(i,:));
  x(i)=K(2,1);
end
[y,z]=sort(-1*x);
y=-1*y;
finish=clock;
duration=etime(finish,start)
for i=1:20
  disp([int2str(i),' Pos = ',int2str(z(i)),...
        ' Corr = ',num2str(y(i))])
end
Program 4.2: libsear2.m

%LIBSEAR2 makes faster library searches by correlation.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
load OBS3
[nscans,nlines]=size(OBS3);
Obs=OBS3';
load LIB
[libsize,nlines]=size(LIB);
uk=input('Position of unknown = ');
disp(' ')
sk=OBS3(uk,:);
ave=mean(sk);
sk=sk-ave;
sk=(1/sqrt(sk*sk'))*sk;
start=clock;
st=sk';
x=LIB*st;
[y,z]=sort(-1*x);
y=-1*y;
finish=clock;
duration=etime(finish,start)
for i=1:20
  disp([int2str(i),' Pos = ',int2str(z(i)),...
        ' Corr = ',num2str(y(i))])
end
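libsear2.m replaces the corrcoef loop of libsear1.m with the single product LIB*st. The two approaches return the same correlation values when every library row has been mean-centred and scaled to unit length beforehand; whether the LIB file is stored in that form is an assumption made here, not something the listing states. The sketch below illustrates the equivalence on random data. The names LIBn, x1 and x2 and the sizes are illustrative.

```matlab
% Sketch: matrix-product correlation versus a corrcoef loop.
% Assumption: library rows are mean-centred and scaled to unit length.
libsize = 50;
nlines  = 100;
LIB = rand(libsize, nlines);              % stand-in for a spectral library
sk  = LIB(7,:) + 0.05*rand(1, nlines);    % noisy copy of entry 7 as the unknown

% Centre and normalise the unknown, as in libsear1.m and libsear2.m
sk = sk - mean(sk);
sk = (1/sqrt(sk*sk')) * sk;

% Centre and normalise every library row in the same way
LIBn = LIB - mean(LIB, 2) * ones(1, nlines);
LIBn = LIBn ./ (sqrt(sum(LIBn.^2, 2)) * ones(1, nlines));

% Full correlation, one library entry at a time
x1 = zeros(libsize, 1);
for i = 1:libsize
    K = corrcoef(sk, LIB(i,:));
    x1(i) = K(2,1);
end

% The same numbers from one matrix-vector product
x2 = LIBn * sk';

[best, pos] = max(x2);
fprintf('largest difference = %g, best match at position %d\n', ...
        max(abs(x1 - x2)), pos);
```

The normalisation of the library is done once, so each later search costs only one matrix-vector product.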
Program 5.1: purity.m

%PURITY.M checks homogeneity of a chromatographic peak.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
clg
disp(['*** Start of program "purity" ***'])
load Obs3
[nscans,nlines]=size(Obs3);
SObs=zeros(nscans,nlines);
sc=sum(Obs3');
mark=zeros(nscans,1);
big=max(sc);
plot(sc)
fs=1;
while fs ~= 0
  fs=input(' First = ');
  if fs == 0, break, end
  ls=input(' Last = ');
  if ls == 0, break, end
  if fs == 0, break, end
  for ii=fs:ls
    mark(ii,1)=big/40;
  end
  hold on
  plot(mark)
  hold off
  span=ls-fs+1;
  SS=Obs3(fs:ls,:);
  K=corrcoef(SS');
  sk=sort(K(:));
  maxcorr=sk(span*span-span-1);
  disp(['Maximum correlation = ',num2str(maxcorr)])
  mincorr=sk(1);
  disp(['Minimum correlation = ',num2str(mincorr)])
  sk(span*span-span-1:span*span)=[];
  avcorr=mean(sk);
  disp(['Average correlation = ',num2str(avcorr)])
  stdcorr=std(sk);
  disp(['Std. of correlation = ',num2str(stdcorr)])
end
title('Purity of a chromatographic peak')
disp(['*** End of program ***'])
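purity.m judges the homogeneity of a peak from the correlation coefficients between the spectra of the selected scans. As a rough illustration of why this works, the sketch below compares a synthetic pure peak with a synthetic two-component peak. The random spectra, the Gaussian elution profiles and all variable names are assumptions chosen for the example, not data from the book.

```matlab
% Sketch: scan-to-scan correlation for a pure and a mixed peak.
nlines = 60;
s1 = rand(1, nlines);               % spectrum of component 1
s2 = rand(1, nlines);               % spectrum of component 2
t  = (1:21)';
c1 = exp(-((t - 9)/4).^2);          % elution profile of component 1
c2 = exp(-((t - 13)/4).^2);         % elution profile of component 2

pure  = c1*s1;                      % every scan is a multiple of s1
mixed = c1*s1 + c2*s2;              % the spectrum changes across the peak

Kp = corrcoef(pure');               % correlations between scans, as in purity.m
Km = corrcoef(mixed');

fprintf('pure:  minimum correlation = %.4f\n', min(Kp(:)));
fprintf('mixed: minimum correlation = %.4f\n', min(Km(:)));
```

For the pure peak all scan spectra are proportional, so every correlation is one up to rounding; for the mixed peak the first and last scans are dominated by different components and the minimum correlation falls well below one.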
Program 5.3: locpeel.m

%LOCPEEL.M "peels" chromatographic peaks.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
clg
fil=input('The name of input file = ','s');
eval(['load ',fil])
s=['Obs= ',fil,';'];
eval(s)
s=[fil,'=[];'];
eval(s)
[nscans,nlines]=size(Obs);
W=Obs;
nega=0;
PS=[];
abu=sum(Obs');
orbig=max(abu);
empty=zeros(1,nlines);
for jj=1:nscans
  if nega == 1
    break
  end
  abu=sum(Obs');
  if jj == 1
    plot(abu)
    limits=axis;
  end
  if jj > 1
    axis(limits);
    plot(abu)
  end
  [big,place]=max(abu)
  peak=sum(Obs(place,:))
  if peak < orbig/1000000
    nega=1;
  end
  PS=[PS; W(place,:)];
  for ii=place:nscans-1
    a=sum(Obs(ii,:));
    b=sum(Obs(ii+1,:));
    if a > b
      Obs(ii,:)=empty;
    else
      break
    end
  end
  for ii=place-1:-1:2
    a=sum(Obs(ii,:));
    b=sum(Obs(ii-1,:));
    if a > b
      Obs(ii,:)=empty;
    else
      break
    end
  end
end
save PS PS
Program 5.4: highpass.m

%HIGHPASS.M performs highpass filtering.
%
% Data analysis for hyphenated techniques
% (c) Erkki & Ulla Karjalainen 1995
%
clear
clg
disp(['*** Start of program "highpass" ***'])
fil=input('File to be filtered = ','s');
eval(['load ',fil])
s=['Obs= ',fil,';'];
eval(s)
subplot(211)
plot(sum(Obs'))
title('Original')
[nscans,nlines]=size(Obs);
F=[-0.2024 -0.5952 -0.2024 0 0.6071 1.7857 0.6071 0 -0.2024 -0.5952 -0.2024]';
HObs=conv2(Obs,F);
[nscans2,nlines2]=size(HObs);
diff=(nscans2-nscans)/2;
HObs(1:diff,:)=[];
HObs(nscans+1:nscans+diff,:)=[];
subplot(212)
plot(sum(HObs'))
title('High-pass filtered')
fil=input('Filtered output file = ','s');
s=[fil,'= HObs;'];
eval(s)
eval(['save ',fil,' ',fil])
subplot(111)
disp(['*** End of program "highpass" ***'])
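The eleven coefficients of F in highpass.m add up to 0.9999, so the kernel leaves the total signal almost unchanged while its negative side lobes narrow each peak. The sketch below applies the same kernel and the same trimming to a single synthetic Gaussian profile. The profile, its width and the variable names are assumptions made for the illustration.

```matlab
% Sketch: apply the kernel from highpass.m to one Gaussian elution profile.
F = [-0.2024 -0.5952 -0.2024 0 0.6071 1.7857 0.6071 0 -0.2024 -0.5952 -0.2024]';
t = (1:60)';
c = exp(-((t - 30)/6).^2);          % synthetic peak, one column of "Obs"

h = conv2(c, F);                    % full convolution: 10 extra rows
d = (length(h) - length(c))/2;      % 5 rows to trim at each end
h = h(d+1:end-d);                   % same trimming as highpass.m, in one step

width_before = sum(c > max(c)/2);   % number of scans above half height
width_after  = sum(h > max(h)/2);

fprintf('sum(F) = %.4f\n', sum(F));
fprintf('scans above half height: %d before, %d after\n', ...
        width_before, width_after);
```

Trimming d rows from each end keeps the filtered profile aligned with the original scan numbers, which is why the listing removes the extra rows symmetrically.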
Index

A
aberrations 59
adaptive filter 52
alternating regression 9, 71, 127-138
  "core AR" 127, 176, 178, 205
  factor analysis and 195-200
applications to spectroscopies 171-180
AR 9, 71, 127-138
  "core AR" 127, 176, 178, 205
  deconvolution and 127-129
  defining the objective function 130-131
  factor analysis and 195-200
  single-dimensional signals and 172-175
AR algorithm 133-137
ASCII file 25, 28
AutoAR 129, 130, 141, 175
  constraint space 152
  help texts 443-454
  plotting the spectra and elution curves 162-164
  preprocessing 148-152
  solution found with 155-162
  validation 186
AutoAR program 211-217
average
  moving 36

B
background 186
  elution curve 206
  instrumental 109, 110
  subtraction 42, 109-110
background constraint 188
background space 204
band-pass filter 42, 43
baseline
  elution curve 156
  spectral 141, 156
baseline constraints 188
Basic 25, 30
bell-shaped filter 41, 59, 150
bellshap.m 462
benchmark 184
bootstrapping 184
broadening of chromatographic peaks 35, 43, 119

C
C language 70
calibration 20
callgau.m 71, 463
CD-ROM 8, 9, 150
"chemical" solution 92
chemometrics 21, 62, 70
chromatographic peak
  broadening of 35, 43, 119
  shape of 172
  sharpening of 43, 119-122
chromatography
  single-dimensional 171
cleaning up data 59-63
clinical laboratories 172
combined error 203
compression by factor analysis 195
compression methods 69, 69-95
computers
  parallel 58
  PowerMacintosh 102, 104
  super-computers 58
confidence ranges 164
constraining
  the elution curves 134
  the spectra 137
constraint space 203, 204
constraints 131, 203
  background 188
  baselines 176, 188
  elution curve 132-133, 134
  guiding 203-204
  local 205
  spectrum 137
  structural 203, 204-206
  types of 176
conversion
  to MATLAB matrix format 25
  triples into matrix 26
convolution 34-41, 43, 52
  smoothing by 56
  two-dimensional 54
convolution filter 40
Cooley, J.W. 52
"core AR" 127, 176, 178, 205
"core AR" algorithm 133-137
correlation 143, 465
correlation coefficient 102, 143
  Pearson's 101
correlation matrix 145
cross-validation 184

D
data compression 63
data conversion
  to MATLAB matrix format 25
  triples into matrix 26
data format
  ASCII file 25, 28
  MATLAB file type .MAT 31
  MATLAB matrix format 25
  Photoshop-acceptable 122
data overload 17
data preprocessing 25-65
decomposition 70
  spectral 86
  svd 63, 70-96, 113, 147
deconvolution 171
  AR and 127-129
demosvd.m 78, 79
derivative of the signal 43
deskew.m 64
deskewing strategy 65
digital processing of photographic images 58
digital signal processing 40
diode array detectors 175
diode array instrument 176
discrete samples 172, 177
drugs
  determination of 172

E
eigenvalue analysis 70
eigenvalues 81, 93
elution curve
  background 206
  baseline 156
error
  combined 203
error estimates 183
error indicator 130
error norm
  L1-norm 101, 197, 198
  L2-norm 101, 103
  the sum of squared residuals 101
  sum of the absolute deviations 101
Euclidean length 80
evolving factor analysis 113
execution time 199
experimental validation 184, 190-192

F
FA 62, 70, 80, 127
factor analysis 62, 70, 80, 127
  AR and 195-200
factor space 197
factors 70, 77, 80
  number of 93
Fast Fourier Transform 52, 54
FFT 52, 54
filtering 34-59
  by convolution 40
  high-pass 40, 41, 118, 119, 122, 468
  linear 34
  low-pass 36-39, 57, 119
  Savitzky-Golay 43-51, 457, 458, 459
  smoothing with 132
  two-dimensional 54
filters 41-52, 150
  adaptive 52
  band-pass 42, 43
  bell-shaped 41, 59, 150
  convolution 40
  electronic 40
  Gaussian 35, 187
  high-pass 35, 42-43
  inverse 35, 43
  Kalman 51-52
  low-pass 35, 41-42
  median 51, 61
  non-linear 51-52
  notch 43
  two-dimensional 59
  Wiener 43
fit 155, 156, 189, 200, 203
fit indicator 130
fit of the model 42
fit, the sum of least squares 130
"flatness space" 204
format
  ASCII file 28
  ISO 9660 8
  MATLAB file type .MAT 31
  MATLAB matrix 25
FORTRAN 18
Fourier domain 40, 52, 132
Fourier space 40, 52
Fourier spectra 53, 133
Fourier transform 52, 57
  two-dimensional 63
Fourier, J.B.J. 52
fragmentography 18
Frobenius norm 78
function
  bar 144
  clock 28
  conv 35, 48
  conv2 35
  corrcoef 103, 104
  etime 31
  fix 29
  fliplr 47
  length 48
  max 29, 145
  mesh 150
  norm 147
  pinv 47
  plot 31
  rand 71
  size 28
  sum 31
  svd 70, 77, 113
  toeplitz 47
  vander 47
  zeros 29

G
Gaussian curves 131, 464
Gaussian filter 35, 187
Gaussian peaks 71, 463
Gaussian shape 41, 131, 187
gaussian.m 71, 464
GC-IR signal 176
GC-MS 25, 58, 71, 92, 175, 176
  "virtual GC-MS run" 172
GC-MS data 30, 142
  synthetic 71
graphical user interface 8
GUI 8
guiding constraints 203-204

H
help texts 443-454
hidden components 118
hidden frequencies 52
high-pass filter 35, 42-43
high-pass filtering 40, 41, 118, 119, 122, 468
highpass.m 122, 468
high-performance liquid chromatograph 172
homogeneity checks 110-113, 466
homogeneous peak 110
HPLC 63, 172
HPLC and diode array detector 175
hyphenated instruments 20, 175-176
  GC-IR 176
  GC-MS 17, 20, 175
  HPLC with a diode array detector 175, 187
  HPLC-UV-Vis 17, 20

I
image processing 25, 52, 122
  hyphenated data and 52
  photographic 58
industrial laboratories 172
infrared spectra 176
instrumental background 109, 110
integrators 51, 172, 174
Internet 122
inverse filter 35, 43
inverse transform 52
IR signal 176
IR spectra 99, 205
  from kidney stones 179
IR spectral library
  Mattson KS Library 179
ISO 9660 format 8

J
jackknifing 184

K
Kalman filter 51-52
kidney stones 179
  IR spectral library 179
kinetic experiments 176, 205

L
L1-norm 101, 197, 198
L2-norm 101, 103
libraries
  IR spectra 99, 179
  kidney stone library 179
  mass spectra 99
  Mattson KS Library 179
  NIST/EPA/NIH Mass Spectral Library 192
  NMR spectra 99
  on-line 99
  spectral 99
  UV-visible spectra 99
library search 99-105, 464, 465
library search algorithms 99
libsear1.m 102, 103, 464
libsear2.m 103, 465
line spectra
  synthetic 73
linear filtering 34
loadings 63, 71, 77, 81-85, 95, 195
loadings matrix 78
local operations 109-122
local peeling 115-119, 467
local purity checks 113-115
local smoothing 41, 61, 71
locpeel.m 118, 119, 122, 467
low-pass filter 35, 41-42
low-pass filtering 36-39, 57, 119

M
.M-extension 26
makenist.m 190, 191
mapspec.m 196, 197
mass fragment 30
mass fragmentography 18
mass spectra 132, 205
Mass Spectral Library
  NIST/EPA/NIH 192
mass spectrometric data 141
MAT-file type 31
The MathWorks Inc. 7
MATLAB 100
  for Cray 18
  for Macintosh 7, 18, 100
  for PC 18
  for UNIX workstations 7, 18
  for Windows 7
  in library search 100
  student version 8
matrix format
  for MATLAB 26
Mattson KS Library
  IR spectra 179
maximum entropy 132
median filter 51, 61
metabolic noise 191
MEX routine 70
mixture spectroscopy 20
molecular accidents 191
moving average 36
multiple regression 137
multiplication
  element-wise 72

N
neighborhood operations 109-122
neural networks 52
neutral steroids 92
NIH Image 34, 122
NIPALS 63, 70, 71, 77, 113
NIR spectra 205
NIST Mass Spectral Search Program 190, 192
NIST/EPA/NIH Mass Spectral Library 192
NMR spectra 99
  two-dimensional 205
nofcomp2.m 113, 114
noise 32-34, 175
  constant 32, 33
  electrical 32
  metabolic 191
  numerical 80
  proportional 32, 33
  salt-and-pepper type 32, 33, 51
  white 35
non-chromatographic data 176
non-linear filter 51-52
non-linear optimization 204
norm 101
  Frobenius 78
  L1-norm 101, 197, 198
  L2-norm 101, 103
notch filter 43
number of components 114, 115, 157, 206
number of factors 93

O
objective function
  defining 130-131
optimization 204
  non-linear 204
Optimization by Stepwise Constraining of Alternating Regression 10
organic acids 141, 146
orthogonal matrix 77, 80
orthogonal solution 133
OSCAR 10, 172, 174, 203
  application of 10
  realized by AutoAR 204
  robustness of 184-190
OSCAR algorithm 127, 128, 171
  applying 141-168
OSCAR approach 129-130, 177
OSCAR process 141
outlier.m 62, 460, 462
outliers 51, 460
  elimination of 59, 71
overfitting 130, 131, 184
overlap 70, 110, 172
overlapping spectra
  analysis of 17-21

P
parallel computers 58
PCA 62, 63, 69, 70, 113, 127, 146, 195, 197
PCA and spectra
  mapping between 195-198
peak heights 51
peak isolation 172
peak shape 172
Pearson's correlation coefficient 101
periodic phenomena 52
Photoshop 25, 34, 54, 122
Poisson statistics 34
polynomials 44
  smoothing with 132
PowerMacintosh 102, 104
preprocessing 25-65
  single-dimensional data 173
  with AutoAR 148-152
principal component analysis 62, 63, 69, 70, 113, 127, 146, 195
principal components 69
process control data 178
programs
  AutoAR programs
    zzzar.m 245-268
    zzzdar.m 341-364
    zzzdar2.m 383-406
    zzzdp.m 271-304
    zzzdp3.m 307-338
    zzzhelp.m 443
    zzzmenu.m 211-217
    zzzorig.m 427-440
    zzzpost.m 367-379
    zzzpost2.m 409-423
    zzzpre.m 221-242
  callgau.m 71, 463
  demosvd.m 78, 79
  deskew.m 64
  gaussian.m 71, 464
  highpass.m 122, 468
  libsear1.m 102, 103, 464
  libsear2.m 103, 465
  locpeel.m 118, 119, 122, 467
  makenist.m 190, 191
  mapspec.m 196, 197
  nofcomp2.m 113, 114
  outlier.m 62, 460, 462
  purity.m 111, 112, 466
  savgol1.m 44, 457
  savgol2.m 47, 48, 458
  savgol3.m 48, 459
  triquad.m 26, 27
pseudoinverse 47, 198
pseudo-random numbers 71
purity checks
  local 113-115
purity.m 111, 112, 466
pyrolysis with GC-MS 176

Q
quality control checks 172
quality control data 178

R
random number generator 71, 133
rank annihilation 119
rank of the matrix 77
recovery experiments 188, 189
redundancy 69, 183
redundant information 86
relative retention times 184
repeatability of the solution 204
reproducibility
  calculation of 164-168
reproducibility of the solution 130, 200
residuals 61
robustness of OSCAR 184-190

S
salt-and-pepper noise 32, 33, 51
sampling delay
  correction of 63-65
savgol1.m 44, 457
savgol2.m 47, 48, 458
savgol3.m 48, 459
Savitzky-Golay filtering 43-51, 457, 458, 459
scatter 130, 155, 156, 203
  between repeated solutions 200
scores 63, 71, 77, 81-85, 95, 195
scores matrix 77
selective ion monitoring 18
shape functions 131
sharpening of chromatographic peaks 43, 119-122
signal broadening
  elimination of 43
signal processing 25
  digital 40
  two-dimensional 54
signal-broadening in chromatography 43
signal-to-noise ratio 32
SIM 18
similarity of spectra 101-102
sine waves 35
single ion recording 18
single-dimensional chromatograms
  synchronizing 173
single-dimensional chromatography 171
single-dimensional data
  preprocessing 173
single-dimensional samples
  precautions 174
single-dimensional signals and AR 172-175
singular value decomposition 63, 70-96, 113, 147
  calculation of 77-80
  performing on GC-MS data 92-96
SIR 18
skewness 63
  correction of 65
smoothing 132
  by convolution 56
  the elution curves 132
  with filtering 132
  local 41, 61, 71
  with polynomials 132
  with spline functions 132
smoothness of the spectra 131
solution
  "chemical" 92
  statistical 86
solving
  for elution curves 133-134
  for spectra 137
sparse matrices 100
spectra and PCA
  mapping between 195-198
spectral background 206
spectral baseline 141, 156
spectral decomposition 86
spectral features 99
spectral libraries 99
  IR spectra 99, 179
  kidney stone library 179
  mass spectra 99
  Mattson KS Library 179
  NIST/EPA/NIH Mass Spectral Library 192
  NMR spectra 99
  on-line 99
  UV-visible spectra 99
spectral overlap 19
spectral space 197
spectroscopy 63
spectrum matching algorithms 99
spline functions 62
  smoothing with 132
standardized test data set 184
statistical validation 183-190
steroids 92, 116
structural constraints 203, 204-206
super-computers 58
svd 63, 70-96, 113, 147
  calculation of 77-80
  performing on GC-MS data 92-96
svd process
  solution of 80
synchronizing single-dimensional chromatograms 173
synthetic elution curves 74

T
tapering function 133
The MathWorks Inc. 7
titration experiments 172, 176, 205
Toeplitz matrix 48
total ionization curve 142
transformation matrix 197, 198
transpose 31
triquad.m 26, 27
Tukey, J.W. 52
two-dimensional data with internal continuity 176-177
two-dimensional filtering 54
two-dimensional filters 59
two-dimensional Fourier transform 63
two-dimensional NMR 205

U
unimodal 132, 134, 205
unimodal curve 176
unimodality 176, 178
  of the solution 132
unique solution 127
uniqueness of the solution 129, 152
UNIX 26
UV detection 172
UV spectra 131
UV-Vis spectra 176, 205

V
validation 183-192
  experimental 184, 190-192
  statistical 183-190
Vandermonde matrix 47
vi editor 26
visualization 25, 34

W
wavelets 63
white noise 35
Wiener filter 43