[Table 4.3 (layout lost in scanning): typical specifications for each communication standard, listing the year introduced, the data transfer rate (Mbit/s), and the maximum cable length — up to 10,000 m with fiber-based cables, and up to 95 m (outdoors) for wireless links.]
From a data-acquisition point of view, the user is typically concerned with the speed and the length over which data can be transferred from the external device to a computer. Table 4.3 lists typical specifications for each standard. In many cases, the transfer rate and cable length depend on other factors, such as the type of cable and the presence of electromagnetic noise in the system. Repeaters and hubs can also be used to extend the length, and recent advances in fiber-optic communications have increased these distances significantly. Each family of standards has a unique connector design that allows one to quickly identify the type of device being used and prevents accidental connection to other types of connections. Many standards (e.g., USB and FireWire) are backward compatible, such that a device based on the old standard can be connected to a device or computer designed using the more recent standard, albeit with the speed limited to that of the old specification. In addition, most recent standards have a provision for delivering DC power to the connected device, eliminating the need for an external power source for devices requiring relatively low power.
More recently, wireless connections between devices have become prevalent. The most common standards found in home networks are IEEE 802.11b and IEEE 802.11g, which allow devices to connect over a limited distance. In addition, cellular networks are also allowing many devices to connect to a computer remotely via the Internet. The length and speed of wireless connections depend greatly on the strength of the signal between the host and receiver and are generally significantly slower than their hardwired counterparts. Wireless connections, however, allow for mobility, such that a measurement system is not confined to a single location. For example, a technician may be able to travel to a remote site with a handheld device and upload the data to a central computer.

4.4.4 Virtual Instruments
The modular nature of digital data-acquisition devices gives an engineer a tremendous amount of flexibility in the design of a measurement system. Various "off-the-shelf"
components can be selected and integrated into a single measurement system using the various connections listed in the previous section. For example, one may wish to measure the pressure and temperature of a fluid at several locations as it travels through a large and complex piping system. A simple, but not very efficient, approach would be to have separate sensors and displays located at each station, with the values manually checked periodically to ensure that things are operating smoothly. A more elegant and useful approach, however, would be to combine all of the signals together and display them on a single screen. This is possible using commercial software packages such as National Instruments LabVIEW, which allow one to create a custom "virtual instrument" designed for a particular application. These commercial software packages have become extremely sophisticated, allowing the user to take data, display it in real time, write the data to files, perform real-time data processing, conduct process control, and perform safety checks, to name just a few functions. Typically, the software provides a graphical user interface that allows the user to design a custom screen consisting of various menus and icons connected to the various components in the system. With Ethernet connections, these components can even be spread across several states or countries. In the above scenario, the custom software may consist of an on-screen schematic of the piping system with the current pressure and temperature displayed at each measurement station. The interface may also allow various settings to be changed, such as the closing or opening of a valve, by simply clicking on a button located near the valve in the schematic. The system may also be set up to take appropriate action, such as automatic shutdown, in the event of a failure that might compromise safety or cause property damage.
Virtual instruments stand in contrast to measurement hardware that has a predefined function. For example, a digital multimeter uses an A/D converter and a digital display to read and display a voltage level. Other functions, such as measuring AC voltage amplitude, are hard-wired into the device, and new functions cannot be added as needed. A computer with an A/D converter and appropriate software can perform the same functions but provides additional flexibility and customizability by exploiting the capabilities of the computer to which it is attached. For example, one could build a program to not only indicate the amplitude of the AC signal but also display the signal on the screen and indicate the frequency and phase at which it is oscillating. It would also be possible to display multiple signals on a single screen or perform mathematical operations on the signals. While this example would be straightforward to implement for even a lightly trained programmer, it would still take time and resources and may be overkill for certain tasks. As such, the system engineer must balance the function of the device against its cost.
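As a sketch of the kind of processing such a program might perform, the snippet below estimates the amplitude, frequency, and phase of a sampled AC signal; the 60-Hz test signal, 10-kHz sampling rate, and all names are illustrative assumptions, not values from the text:

```python
import numpy as np

fs = 10_000.0                                  # assumed sampling rate, Hz
t = np.arange(0, 1.0, 1/fs)                    # one second of samples
signal = 3.0 * np.sin(2*np.pi*60*t + 0.5)      # hypothetical 60-Hz input

# Amplitude from the RMS value of the record
amplitude = np.sqrt(2) * np.sqrt(np.mean(signal**2))

# Frequency and phase from the largest spectral peak (DC bin excluded)
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(signal.size, 1/fs)
k = np.argmax(np.abs(spectrum[1:])) + 1
frequency = freqs[k]
phase = np.angle(spectrum[k]) + np.pi/2        # shift because the input is a sine

print(amplitude, frequency, phase)             # ≈ 3.0, 60.0, 0.5
```

A real virtual instrument would wrap this kind of computation behind an on-screen display updated in real time.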
4.4.5 Digital Storage Oscilloscopes
Digital storage oscilloscopes (DSOs) display data in a fashion similar to analog CRT scopes; however, the signal is digitized using an analog-to-digital (A/D) converter prior to being displayed on the screen. Figure 4.14 presents a block diagram of the basic architecture of a DSO. The signal is first amplified and then digitized using a special A/D converter commonly referred to as a high-speed digitizer. High-speed digitizers are similar to conventional A/D circuits except that they emphasize speed, or sampling rate, over precision. Digitizers are available with speeds in the GHz range and typically have 8-bit precision. As the digitization occurs at speeds greater than that at which the signal can be processed and displayed, the digital output is immediately stored in memory. The stored signal is then sent to a microprocessor and a display unit, such as an LCD screen. The inclusion of a microprocessor in the system architecture allows advanced data-processing and triggering algorithms to be included as part of the function of the DSO. For example, the scope can be used to measure the signal frequency, amplitude, pulse width, and rise time, or set to trigger on a particular type of event, to name a few of the functions commonly available. In addition, digital scopes are capable of acquiring and displaying multiple signals simultaneously on the same screen.

FIGURE 4.14
Block diagram of a digital storage oscilloscope: high-speed digitizer (A/D), acquisition memory, microprocessor, display memory, display.

Many DSOs, taking advantage of the digital nature of the device, are also outfitted with components found in personal computers (e.g., hard drives, DVD-ROMs, etc.) and include an operating system, allowing the instrument to be integrated into more complex test environments and to run customized software. Similarly, high-speed digitizers can be purchased separately and installed in a computer, allowing the computer to function as an oscilloscope. Regardless of whether the scope is analog, digital, or PC-based, there are a number of performance parameters that must be considered when selecting a scope for a particular measurement. The first is the bandwidth, which is defined as the frequency at which a sinusoidal input is attenuated by -3 dB (i.e., the output is 70.7% of the input amplitude); it is also the parameter best correlated with the cost of the scope. The bandwidth takes into account the frequency response of all elements of the oscilloscope prior to digitization (e.g., amplifier, connection circuitry, etc.) and is distinct from the sampling rate of the scope, which refers to the rate at which the digitizer samples the signal and is typically much greater than the bandwidth. Insufficient bandwidth can lead to distortion of the input signal, particularly if the signal contains high frequencies and/or sharp edges, as is common in high-speed digital signals. More details about the limitations associated with bandwidth and
sampling rate can be found in Chapter 5. Additional parameters of importance are the record length, which is the number of points that can be acquired and stored for a given waveform, and the waveform capture rate, which is a measure of how quickly successive waveforms can be read and captured by the scope.

4.4.6 Data Loggers
Data loggers are used to collect and store data in the same way as data-acquisition systems, but they are generally simpler and often specialized. A general-purpose data logger might be temporarily installed in a building air-conditioning system in order to obtain diagnostic performance data over a period of time (such as temperature, airflow, and fan operation). Some buildings are permanently instrumented with data loggers to obtain acceleration and other data if and when an earthquake occurs. The flight data recorder in a commercial aircraft is a specialized data logger and provides important information after accidents. Recent automobiles often include a data logger (possibly part of the airbag system) that records a small amount of speed and other data just before and after an accident. The following are some common characteristics of data loggers:
• A microprocessor and memory (often nonvolatile)
• A limited number of channels (4 to 8 being common)
• Low sampling rates (sometimes as low as 1 sample/second)
• Unmanned operation
• Durable packaging
• Battery or battery-backup operation
• Limited or no display and no keyboard
• Connection to a personal computer for programming/setup and data download
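Those characteristics can be pictured as a minimal logging loop; the `read_channels` and `store` callables below are hypothetical stand-ins for real sensor and nonvolatile-storage drivers, not an API from the text:

```python
import time

def run_logger(read_channels, store, n_channels=4, period_s=1.0, max_records=None):
    """Minimal data-logger loop: sample a small number of channels at a low
    rate and append each timestamped record to nonvolatile storage.
    `read_channels` and `store` are hypothetical driver callables."""
    count = 0
    while max_records is None or count < max_records:
        record = (time.time(), read_channels(n_channels))
        store(record)                 # append to nonvolatile memory
        count += 1
        time.sleep(period_s)          # low sampling rate (e.g., 1 sample/s)
    return count
```

In an unmanned installation, `max_records` would normally be left as `None` so the loop runs until power is removed.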
Data loggers may have a wide variety of features, such as wireless communications, connection to the Internet, or high sampling rates over a short time span. Many systems are commercially available, so a suitable solution for a particular application should be readily obtainable.

4.5 SOFTWARE FOR DATA-ACQUISITION SYSTEMS
For a computerized data-acquisition system (with possible control functions) to perform satisfactorily, the system must be operated using suitable software. To take a data sample, for example, the following instructions must be executed:
1. Instruct the multiplexer to select a channel.
2. Instruct the A/D converter to make a conversion.
3. Retrieve the result and store it in memory.
In most applications, other instructions are also required, such as setting the amplifier gain or causing a simultaneous sample-and-hold system to take data. The software required depends on the application.
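In pseudocode form, one scan across the multiplexed channels follows the three steps above; the driver-call names here are hypothetical placeholders for whatever a vendor's API actually provides:

```python
def acquire_scan(daq, channels):
    """Read one sample from each channel through the multiplexer."""
    sample = {}
    for ch in channels:
        daq.select_channel(ch)            # 1. multiplexer selects the channel
        daq.start_conversion()            # 2. A/D converter makes a conversion
        sample[ch] = daq.read_result()    # 3. retrieve the result and store it
    return sample
```

A real driver would also handle amplifier gain settings and conversion-complete polling between steps 2 and 3.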
4.5.1 Commercial Software Packages
In the process-control industry, sophisticated computer programs have been available for some time. Using selections from various menus, the operator can configure the program for the particular application. These programs can be configured to take data from transducers at the times requested, display the data on the screen, and use the data to perform required control functions. Such systems are often configured by technicians rather than engineers or programmers, so it is important that the software setup be straightforward. For complicated processing or control functions, it is possible to include instructions programmed in a higher-level language such as C.
There are a number of very sophisticated software packages now available for personal computer-based data-acquisition systems. These packages are very capable: they can take data, display it in real time, write the data to files for subsequent processing by another program, and perform some control functions. The programs are configured for a particular application using menus or icons, and they may allow for the incorporation of C program modules. These software packages are the best choice for the majority of experimental situations.

REFERENCES
[1] ANDERSON, N. (1980). Instrumentation for Process Measurement and Control, Chilton, Radnor, PA.
[2] BARNEY, G. (1988). Intelligent Instrumentation, Prentice Hall, Englewood Cliffs, NJ.
[3] BOYES, W. (2003). Instrumentation Reference Book, 3rd ed., Elsevier Science, Burlington, MA.
[4] BISHOP, R. (2007). LabVIEW 8 Student Edition, Prentice Hall, Englewood Cliffs, NJ.
[5] DALLY, J., RILEY, W., AND MCCONNELL, K. (1993). Instrumentation for Engineering Measurements, 2nd ed., Wiley, New York.
[6] FRANCO, S. (2002). Design with Operational Amplifiers and Analog Integrated Circuits, McGraw-Hill, New York.
[7] IEEE COMPUTER SOCIETY (2002). IEEE Std 802-2001, IEEE Standard for Local and Metropolitan Area Networks: Overview and Architecture, The Institute of Electrical and Electronics Engineers, Inc., New York.
[8] IEEE COMPUTER SOCIETY (2008). IEEE 1394-2008, IEEE Standard for a High-Performance Serial Bus, The Institute of Electrical and Electronics Engineers, Inc., New York.
[9] INTELLIGENT INSTRUMENTATION (1994). The Handbook of Personal Computer Instrumentation, Intelligent Instrumentation, Tucson, AZ.
[10] NATIONAL INSTRUMENTS (2009). Introduction to LabVIEW: 6-Hour Hands-On Tutorial, National Instruments Corporation, Austin, TX.
[11] SHAPIRO, S. F. (1987). Board-level systems set the trend in data acquisition, Computer Design, Apr. 1.
[12] SHEINGOLD, D. H. (1986). Analog-Digital Conversion Handbook, Prentice Hall, Englewood Cliffs, NJ.
[13] SONY (1989). Semiconductor IC Data Book 1990, A/D, D/A Converters, Sony Corp., Tokyo, Japan.
[14] TAYLOR, J. L. (1990). Computer-Based Data Acquisition, Instrument Society of America, Research Triangle Park, NC.
[15] TEKTRONIX (2009). XYZs of Oscilloscopes, Tektronix, Beaverton, OR.
[16] TURNER, J. D. (1988). Instrumentation for Engineers, Springer-Verlag, New York.
PROBLEMS
4.1 Convert the decimal number 147 to 8-bit simple binary.
4.2 Convert the decimal number 145 to 8-bit simple binary.
4.3 Convert the decimal number 1149 to 12-bit simple binary.
4.4 Convert the decimal number 872 to 12-bit simple binary.
4.5 Convert the numbers +121 and -121 to 2's-complement 8-bit binary numbers.
4.6 Convert the numbers +101 and -101 to 2's-complement 8-bit binary numbers.
4.7 Find the 12-bit 2's-complement binary equivalent of the decimal number 891.
4.8 Find the 12-bit 2's-complement binary equivalent of the decimal number 695.
4.9 The number 10010001 is an 8-bit 2's-complement number. What is its decimal value?
4.10 The number 10010001 is an 8-bit 2's-complement number. What is its decimal value?
4.11 How many bits are required for a digital device to represent the decimal number 27,541 in simple binary? How many bits for 2's-complement binary?
4.12 How many bits are required for a digital device to represent the decimal number 12,034 in simple binary? How many bits for 2's-complement binary?
4.13 How many bits are required to represent the number -756 in 2's-complement binary?
4.14 How many bits are required to represent the number -534 in 2's-complement binary?
4.15 A 12-bit A/D converter has an input range of ±8 V, and the output code is offset binary. Find the output (in decimal) if the input is
(a) 4.2 V. (b) -5.7 V. (c) 10.9 V. (d) -8.5 V.
4.16 A 12-bit A/D converter has an input range of ±8 V, and the output code is offset binary. Find the output (in decimal) if the input is
(a) 2.4 V. (b) -6.3 V. (c) 11.0 V. (d) -9.2 V.
4.17 An 8-bit A/D converter has an input range of 0 to 10 V and an output in simple binary. Find the output (in decimal) if the input is
(a) 5.75 V. (b) -5.75 V. (c) 11.5 V. (d) 0 V.
4.18 An 8-bit A/D converter has an input range of 0 to 10 V and an output in simple binary. Find the output (in decimal) if the input is
(a) 6.42 V. (b) -6.42 V. (c) 12.00 V. (d) 0 V.
4.19 A 12-bit A/D converter has an input range of ±10 V and an amplifier at the input with a gain of 10. The output of the A/D converter is in 2's-complement format. Find the output of the A/D converter if the input to the amplifier is
(a) 1.5 V. (b) 0.8 V. (c) -1.5 V. (d) -0.8 V.
4.20 A 12-bit A/D converter has an input range of ±10 V and an amplifier at the input with a gain of 10. The output of the A/D converter is in 2's-complement format. Find the output of the A/D converter if the input to the amplifier is
(a) 0.52 V. (b) 1.3 V. (c) -0.52 V. (d) -1.3 V.
4.21 A 16-bit A/D converter has an input range of 0 to 5 V. Estimate the quantization error (as a percent of reading) for an input of 1.36 V.
4.22 A 12-bit A/D converter has an input range of 0 to 5 V. Estimate the quantization error (as a percent of reading) for an input of 2.45 V.
4.23 An A/D converter has an input range of ±8 V. If the input is 7.5 V, what is the quantization error in volts and as a percent of input voltage if the converter has 8 bits, 12 bits, and 16 bits?
4.24 An A/D converter has an input range of ±10 V. If the input is 8.0 V, what is the quantization error in volts and as a percent of input voltage if the converter has 8 bits, 12 bits, and 16 bits?
4.25 A 12-bit A/D converter has an input range of -8 to +8 V. Estimate the quantization error (as a percentage of reading) for an input of -4.16 V.
4.26 A 12-bit A/D converter has an input range of -5 to +5 V. Estimate the quantization error (as a percentage of reading) for an input of -2.46 V.
4.27 An A/D converter uses 12 bits and has an input range of ±10 V. An amplifier is connected to the input and has selectable gains of 10, 100, and 1000. A time-varying signal from a transducer varies between +15 and -15 mV and is input to the amplifier. Select the best value for the gain to minimize the quantizing error. What will be the quantizing error (as a percentage of the reading) when the transducer voltage is 3.75 mV? Could you attenuate the signal before amplification to reduce the quantizing error?
4.28 A 12-bit A/D converter has an input range of ±10 V and is connected to an input amplifier with programmable gain of 1, 10, 100, or 500. The connected transducer has a maximum output of 7.5 mV. Select the appropriate gain to minimize the quantization error, and compute the quantization error as a percent of the maximum input voltage.
4.29 A 12-bit A/D converter has an input range of ±10 V and is connected to an input amplifier with programmable gain of 1, 10, 100, or 1000. The connected transducer has a maximum output of 10 mV. Select the appropriate gain to minimize the quantization error, and compute the quantization error as a percent of the maximum input voltage.
4.30 An 8-bit digital-to-analog converter has an output range of 0 to 5 V. Estimate the analog voltage output if the input is simple binary and has the decimal value of 32.
4.31 A 12-bit digital-to-analog converter has an output range of 0 to 10 V. Estimate the analog voltage output if the input is simple binary and has the decimal value of 45.
4.32 A 3.29-V signal is input to a 12-bit successive-approximations converter with an input range of 0 to 10 V and simple binary output. Simulate the successive-approximations process to determine the simple binary output.
4.33 Repeat Problem 4.8(a) using the successive-approximations simulation used in Example 4.8.
4.34 What are the errors that a digital data-acquisition system may introduce into a measurement? Specify whether these errors are of bias or precision type.
4.35 Imagine that you want to purchase a digital data-acquisition system. List the questions that you want to discuss with an application engineer working for a supplier.
CHAPTER 5
Discrete Sampling and Analysis of Time-Varying Signals
Unlike analog recording systems, which can record signals continuously in time, digital data-acquisition systems record signals at discrete times and record no information about the signal between these times. Unless proper precautions are taken, this discrete sampling can cause the experimenter to reach incorrect conclusions about the original analog signal. In this chapter we introduce restrictions that must be placed on the signal and the discrete sampling rate. In addition, techniques are introduced to determine the frequency components of time-varying signals (spectral analysis), which can be used to specify and evaluate instruments and also to determine the required sampling rate and filtering.
5.1 SAMPLING-RATE THEOREM
When measurements are made of a time-varying signal (measurand) using a computerized data-acquisition system, measurements are made only at a discrete set of times, not continuously. For example, a reading (sample) may be taken every 0.1 s or every second, and no information is taken for the time periods between the samples. The experimenter is then left with the problem of deducing the actual measurand behavior from selected samples. The rate at which measurements are made is known as the sampling rate, and incorrect selection of the sampling rate can lead to misleading results. Figure 5.1 shows a sine wave with a frequency, fm, of 10 Hz. We are going to explore the output data of a discrete sampling system for which this continuous time-dependent signal is an input. The important characteristic of the sampling system here is its sampling rate (normally expressed in hertz). Figures 5.2 to 5.5 show the sampled values for sampling rates of 5, 11, 18, and 20.1 samples per second. To infer the form of the original signal, the sample data points have been connected with straight-line segments. In examining the data in Figure 5.2, with the sampling rate of 5 Hz, it is reasonable to conclude that the sampled signal has a constant (dc) value. However, we know that the sampled signal is, in fact, a sine wave. The amplitude of the sampled data is also
FIGURE 5.1
10-Hz sine wave to be sampled.
FIGURE 5.2
Results of sampling a 10-Hz sine wave at a rate of 5 Hz. (Sampled data points connected by a straight-line interpretation.)
misleading: it depends on when the first sample was taken. This behavior (a constant value of the output) occurs if the wave is sampled at any rate that is an integer fraction of the base frequency fm (e.g., fm, fm/2, fm/3, etc.). The data in Figure 5.3 appear to be a sine wave, as are the sampled data, but only one cycle appears in the time in which 10 cycles of the original signal occurred. This frequency, 1 Hz, is the difference between the sampled-data frequency, 10 Hz, and the sampling rate, 11 Hz. The data in Figure 5.4, sampled at 18 Hz, also represent a periodic wave. The apparent frequency is 8 Hz, the difference between the sampling rate and the signal frequency, and is again incorrect relative to the input frequency. These incorrect frequencies that appear in the output data are known as aliases. Aliases are false frequencies that appear in the output data; they are simply artifacts of the sampling process and do not in any manner occur in the original data. Figure 5.5, with a sampling rate of 20.1 Hz, can be interpreted as showing a frequency of 10 Hz, the same as the original data. It turns out that for any sampling rate greater than twice fm, the lowest apparent frequency will be the same as the actual
FIGURE 5.3
Results of sampling a 10-Hz sine wave at a rate of 11 Hz.
FIGURE 5.4
Results of sampling a 10-Hz sine wave at a rate of 18 Hz.
FIGURE 5.5
Results of sampling a 10-Hz sine wave at a rate of 20.1 Hz.
frequency. This restriction on the sampling rate is known as the sampling-rate theorem. The theorem states that the sampling rate must be greater than twice the highest-frequency component of the original signal in order to reconstruct the original waveform correctly. In equation form, this is expressed as

fs > 2fm    (5.1)
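This requirement can be illustrated numerically: when fs is not greater than 2fm, the samples of the true signal coincide exactly with those of a lower-frequency alias. The check below (our construction, not from the text) uses the 10-Hz wave of Figure 5.3 sampled at 11 Hz:

```python
import numpy as np

fm, fs = 10.0, 11.0                  # signal frequency and sampling rate, Hz
t = np.arange(22) / fs               # two seconds of sample times
original = np.sin(2*np.pi*fm*t)
alias = np.sin(2*np.pi*(fm - fs)*t)  # the apparent 1-Hz (negative-frequency) wave

# The two waves are indistinguishable at the sample instants:
print(np.allclose(original, alias))  # True
```

Since 11 Hz < 2 × 10 Hz, the sampled record cannot distinguish the 10-Hz input from its 1-Hz alias.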
where fm is the signal frequency (or the maximum signal frequency if there is more than one frequency in the signal) and fs is the sampling rate. The theorem also specifies methods that can be used to reconstruct the original signal. The amplitude in Figure 5.5 is not correct, but this is not a major problem, as discussed in Section 5.4. The sampling-rate theorem has a well-established theoretical basis. There is some evidence that the concept dates back to the nineteenth-century mathematician Augustin Cauchy (Marks, 1991). The theorem was formally introduced into modern technology by Nyquist (1928) and Shannon (1948) and is fundamental to communication theory. The theorem is often known by the names of the latter two scientists. A comprehensive but advanced discussion of the subject is given by Marks (1991). In the design of an experiment, to eliminate alias frequencies in the sampled data, it is necessary to determine a sampling rate and appropriate signal filtering; this process is discussed in some detail later in the chapter. Even if the signal is correctly sampled (i.e., at a frequency greater than twice the signal frequency), the data can be interpreted to be consistent with specific frequencies that are higher than the signal frequency. For example, Figure 5.6 shows the same data as in Figure 5.5: a 10-Hz signal sampled at 20.1 samples per second. The sampled data are shown as the small squares. However, these data are not only consistent with a 10-Hz sine wave; in this case, the data are also consistent with 30.1 Hz. In fact, there are an infinite number of higher frequencies that are consistent with the data. If, however, the requirements of the sampling-rate theorem have been met (perhaps with suitable filtering), there will be no frequencies less than half the sampling rate that are consistent with the data except the correct signal frequency.
The higher frequencies can be eliminated from consideration since it is known that they don't exist.
FIGURE 5.6
Higher-frequency aliases: the sampled points of the 10-Hz signal also lie on a 30.1-Hz sine wave.
FIGURE 5.7
Folding diagram (fa/fN versus fm/fN, where fm = sampled frequency, fs = sampling rate, fa = alias frequency, and fN = folding frequency).
In some cases, the requirements of the sampling-rate theorem may not have been met, and it is desired to estimate the lowest alias frequency. The lowest is usually the most obvious in the sampled data. A simple method to estimate alias frequencies involves the folding diagram shown in Figure 5.7 [Taylor (1990)]. This diagram enables one to predict the alias frequencies based on a knowledge of the signal frequency and the sampling rate. To use the diagram, it is necessary to compute a frequency fN, called the folding frequency, which is half the sampling rate, fs. The use of the diagram is demonstrated in Example 5.1.

Example 5.1
Compute the lowest alias frequencies for the following cases:
(a) fm = 80 Hz and fs = 100 Hz.
(b) fm = 100 Hz and fs = 60 Hz.
(c) fm = 100 Hz and fs = 250 Hz.

Solution:
(a) fN = 100/2 = 50 Hz
fm/fN = 80/50 = 1.6
Find fm/fN on the folding diagram, draw a vertical line down to the intersection with line AB, and read 0.4 on line AB. The lowest alias frequency can then be determined from
fa = (fa/fN)fN = 0.4 × 50 = 20 Hz (a false frequency)
(b) fN = 60/2 = 30 Hz
fm/fN = 100/30 = 3.333
Finding 3.333 on the folding diagram and drawing the vertical line down to AB, we find fa/fN = 0.667. The lowest alias frequency is then
fa = 0.667 × 30 = 20 Hz (a false frequency)
(c) fN = 250/2 = 125 Hz
fm/fN = 100/125 = 0.8
This falls on line AB, so fa/fN = 0.8 and fa = 0.8 × 125 = 100 Hz, which is the same as the sampled frequency.
Comment: In part (a), the sampling frequency is between the signal frequency and the minimum frequency for correct sampling; the lowest alias frequency is the difference between the sampling frequency and the signal frequency. In part (b), the sampling frequency is less than the signal frequency; the folding diagram is the simplest method to determine the lowest alias frequency. In part (c), the requirement of the sampling-rate theorem has been met, and the lowest apparent frequency is in fact the signal frequency.
We will always find a lowest frequency using the folding diagram, whether it is a correct frequency or a false alias. To know that the frequency is correct, we must ensure that the sampling rate is at least twice the actual frequency, usually by using a filter to remove any frequency higher than half the sampling rate.
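The folding-diagram construction can also be carried out numerically: folding fm repeatedly about fN is equivalent to computing fa = |fm − fs·round(fm/fs)|. A small sketch (the function name is ours, not the text's):

```python
def lowest_alias(fm, fs):
    """Lowest apparent (alias) frequency when a signal of frequency fm
    is sampled at rate fs; equivalent to folding fm about fN = fs/2."""
    return abs(fm - fs * round(fm / fs))

# The three cases of Example 5.1:
print(lowest_alias(80, 100))   # 20 Hz (a false frequency)
print(lowest_alias(100, 60))   # 20 Hz (a false frequency)
print(lowest_alias(100, 250))  # 100 Hz (the true signal frequency)
```

When fm < fN, the rounding term is zero and the "alias" is simply the true frequency, matching part (c) of the example.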
5.2 SPECTRAL ANALYSIS OF TIME-VARYING SIGNALS
When a signal is a pure sine wave, determining the frequency is a simple process. However, the general time-varying signal does not have the form of a simple sine wave; Figure 5.8 shows a typical example. As discussed below, complicated waveforms can be considered to be constructed from the sum of a set of sine or cosine waves of different frequencies. The process of determining these component frequencies is called spectral analysis.
There are two times in an experimental program when it may be necessary to perform spectral analysis on a waveform: first in the planning stage, and second in the final analysis of the measured data. In planning experiments in which the data vary with time, it is necessary to know, at least approximately, the frequency characteristics of the measurand in order to specify the required frequency response of the transducers and other instruments and to determine the required sampling rate. While the actual signal from a planned experiment will not be known, data from similar experiments may be used to determine frequency specifications.
FIGURE 5.8
Typical measured time-varying waveform.
In many time-varying experiments, the frequency spectrum of a signal is one of the primary results. In structural vibration experiments, for example, the acceleration of the vibrating body may be a complicated function resulting from the various resonant frequencies of the system. The measurement system is thus designed to respond properly to the expected range of frequencies, and the resulting data are analyzed for the specific frequencies of interest. To examine the methods of spectral analysis, we first look at a relatively simple waveform, a 1000-Hz sawtooth wave as shown in Figure 5.9. At first, one might think that this wave contains only a single frequency, 1000 Hz. However, it is much more complicated, containing all frequencies that are an odd-integer multiple of 1000, such as 1000, 3000, and 5000 Hz. The method used to determine these component frequencies is known as Fourier-series analysis. The lowest frequency, f0, in the periodic wave shown in Figure 5.9, 1000 Hz, is called the fundamental or first-harmonic frequency. The fundamental frequency has period T0 and angular frequency ω0. (Note: The angular frequency ω = 2πf, where
FIGURE 5.9  1000-Hz sawtooth waveform.
5.2  Spectral Analysis of Time-Varying Signals
f = 1/T.) As discussed by Den Hartog (1956), Churchill (1987), and Kamen (1990), any periodic function f(t) can be represented by the sum of a constant and a series of sine and cosine waves. In symbolic form, this is written

f(t) = a₀ + a₁ cos ω₀t + a₂ cos 2ω₀t + ··· + aₙ cos nω₀t
         + b₁ sin ω₀t + b₂ sin 2ω₀t + ··· + bₙ sin nω₀t        (5.2)
The constant a₀ is simply the time average of the function over the period T. This can be evaluated from

a₀ = (1/T) ∫₀ᵀ f(t) dt        (5.3)

The constants aₙ can be evaluated from

aₙ = (2/T) ∫₀ᵀ f(t) cos nω₀t dt        (5.4)
and the constants bₙ can be evaluated from

bₙ = (2/T) ∫₀ᵀ f(t) sin nω₀t dt        (5.5)
Although it can be tedious, the constants aₙ and bₙ can be computed in a straightforward manner for any periodic function. Of course, Eq. (5.2) is an infinite series, so the constants a and b can only be determined for a limited number of terms. Since f(t) cannot, in general, be expressed in equation form, it is normal to evaluate Eqs. (5.3), (5.4), and (5.5) by means of numerical methods. The function f(t) is considered to be an even function if it has the property that f(t) = f(−t); f(t) is considered to be an odd function if f(t) = −f(−t). If f(t) is even, it can be represented entirely with a series of cosine terms, which is known as a Fourier cosine series. If f(t) is odd, it can be represented entirely with a series of sine terms, which is known as a Fourier sine series. Many functions are neither even nor odd and require both sine and cosine terms. If Eqs. (5.3) through (5.5) are applied to the sawtooth wave in Figure 5.9 (either using direct integration or a numerical method), it will be found that all the a's are zero (it is an odd function) and that the first seven b's are

b₁ = 1.6211    b₂ = 0.0000    b₃ = −0.1801    b₄ = 0.0000
b₅ = 0.0648    b₆ = 0.0000    b₇ = −0.0331
It is not surprising that the a's are zero, since the wave in Figure 5.9 looks much more like a sine wave than a cosine wave. b₁, b₃, b₅, and b₇ are the amplitudes of the first, third, fifth, and seventh harmonics of the function f(t). These have frequencies of 1000, 3000, 5000, and 7000 Hz, respectively. It is useful to present the amplitudes of the harmonics on a plot of amplitude versus frequency, as shown in Figure 5.10. As can be
FIGURE 5.10  Amplitudes of harmonics for a sawtooth wave (amplitude versus harmonic number n).
seen, harmonics beyond the fifth have a very low amplitude. Often, it is the energy content of a signal that is important, and since the energy is proportional to the amplitude squared, the higher harmonics contribute very little energy. Figure 5.11 shows the first and third harmonics and their sum compared with the function f(t). As can be seen, the sum of the first and third harmonics does a fairly good job of representing the sawtooth wave. The main problem is apparent as a rounding near the peak, a problem that would be reduced if the higher harmonics (fifth, seventh, etc.) were included. Fourier analysis of this type can be very useful in specifying the frequency response of instruments. If, for example, the experimenter considers the first-plus-third-harmonic sum to be a satisfactory approximation to the sawtooth wave, the sensing instrument need only have an upper frequency limit of 3000 Hz. If a better
FIGURE 5.11  Harmonics of sawtooth wave: the first harmonic, the third harmonic, and their sum compared with f(t).
representation is required, the experimenter can examine the effects of higher harmonics on the representation of the wave and then select a suitable transducer. Example 5.2 demonstrates the process of determining Fourier components for another function. The process of determining Fourier coefficients using numerical methods is demonstrated in Section A.1, Appendix A.

Example 5.2
For the periodic function shown in Figure E5.1, find the amplitude of the first, second, and third harmonic components.
Solution: The fundamental frequency for this wave is 10 Hz, and the angular frequency, ω₀ = 2πf, is 62.83 rad/s. The first complete cycle of this function can be expressed in equation form as

f(t) = 60t            0 ≤ t < 0.05
f(t) = 60t − 6        0.05 ≤ t < 0.10

Since the average value of the function is zero over one cycle, the coefficient a₀ will have a value of zero. Also, by examination, we can conclude that it is an odd function, so the cosine terms will be zero and only the sine terms will be required. Using Eq. (5.5), the first harmonic coefficient can be computed from

b₁ = (2/0.1)[∫₀^0.05 60t sin(62.83t) dt + ∫_{0.05}^{0.10} (60t − 6) sin(62.83t) dt]

This can be evaluated using standard methods to give a value of 1.9098. Similarly, the second and third harmonics can be evaluated from

b₂ = (2/0.1)[∫₀^0.05 60t sin(2 × 62.83t) dt + ∫_{0.05}^{0.10} (60t − 6) sin(2 × 62.83t) dt]

b₃ = (2/0.1)[∫₀^0.05 60t sin(3 × 62.83t) dt + ∫_{0.05}^{0.10} (60t − 6) sin(3 × 62.83t) dt]

to give values of −0.9549 and 0.6366, respectively.

FIGURE E5.1  Sawtooth wave.
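The integrals above can also be evaluated numerically, as the text suggests. The sketch below (plain Python; the midpoint rule and the 100,000-step count are choices made for this illustration, not the book's method) applies Eq. (5.5) to the wave of Example 5.2:

```python
import math

def sawtooth(t):
    """One period of the Example 5.2 wave: f(t) = 60t, then 60t - 6 (T = 0.1 s)."""
    t = t % 0.1
    return 60.0 * t if t < 0.05 else 60.0 * t - 6.0

def fourier_b(n, f, T, steps=100_000):
    """b_n from Eq. (5.5), evaluated with a midpoint Riemann sum."""
    w0 = 2.0 * math.pi / T
    dt = T / steps
    total = sum(f((k + 0.5) * dt) * math.sin(n * w0 * (k + 0.5) * dt)
                for k in range(steps))
    return (2.0 / T) * total * dt

T = 0.1
b1, b2, b3 = (fourier_b(n, sawtooth, T) for n in (1, 2, 3))
# b1, b2, b3 approximate the analytical values 1.9098, -0.9549, and 0.6366.
```

With 100,000 steps the sums agree with the closed-form results to roughly four decimal places; coarser steps trade accuracy for speed.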
FIGURE 5.12  Duplicating a signal to make it harmonic: the original signal is followed by duplicate copies of itself.
One problem associated with Fourier-series analysis is that it appears to be useful only for periodic signals. In fact, this is not the case: there is no requirement that f(t) be periodic in order to determine the Fourier coefficients for data sampled over a finite time. We could force a general function of time to be periodic simply by duplicating the function in time, as shown in Figure 5.12 for the function in Figure 5.8. If we directly apply Eqs. (5.3), (5.4), and (5.5) to a function taken over a time period T, the resulting Fourier series will have an implicit fundamental angular frequency ω₀ equal to 2π/T. However, if the resulting Fourier series were used to compute values of f(t) outside the time interval 0 to T, it would produce values that would not necessarily (and probably would not) resemble the original signal. The analyst must be careful to select a large enough value of T so that all wanted effects can be represented by the resulting Fourier series. An alternative method of finding the spectral content of signals, the Fourier transform, is discussed next.
5.3  SPECTRAL ANALYSIS USING THE FOURIER TRANSFORM
Although a discussion of Fourier series is useful to introduce the concept that most functions can be represented by the sum of a set of sine and cosine functions, the technique most commonly used to spectrally decompose functions is the Fourier transform, a generalization of Fourier series. The Fourier transform can be applied to any practical function, does not require that the function be periodic, and, for discrete data, can be evaluated quickly using a modern computational technique called the Fast Fourier Transform. In presenting the Fourier transform, it is common to start with the Fourier series, but in a different form than Eq. (5.2), called the complex exponential form. It can be shown that sine and cosine functions can be represented in terms of complex exponentials:

cos x = (e^{jx} + e^{−jx})/2        sin x = (e^{jx} − e^{−jx})/2j        (5.6)
where j = √−1. These relationships can be used to transform Eq. (5.2) into a complex exponential form of the Fourier series [see basic references on signals, such as Kamen (1990)]. The resulting exponential form of the Fourier series can be stated as

f(t) = Σ_{n=−∞}^{∞} cₙ e^{jnω₀t}        (5.7)

where

cₙ = (1/T) ∫_{−T/2}^{T/2} f(t) e^{−jnω₀t} dt        (5.8)
Each coefficient, cₙ, is, in general, a complex number with a real and an imaginary part. In Section 5.2 we showed how a portion of a nonperiodic function can be represented by a Fourier series by assuming that the portion of duration T is repeated periodically. The fundamental angular frequency, ω₀, is determined by this selected portion of the signal (ω₀ = 2π/T). If a longer value of T is selected, the lowest frequency will be reduced. This concept can be extended to make T approach infinity and the lowest frequency approach zero. In this case, frequency becomes a continuous variable. It is this approach that leads to the concept of the Fourier transform. The Fourier transform of a function f(t) is defined as
F(ω) = ∫_{−∞}^{∞} f(t) e^{−jωt} dt        (5.9)

F(ω) is a continuous, complex-valued function. Once a Fourier transform has been determined, the original function f(t) can be recovered from the inverse Fourier transform:

f(t) = (1/2π) ∫_{−∞}^{∞} F(ω) e^{jωt} dω        (5.10)
In experiments, a signal is measured only over a finite time period, and with computerized data-acquisition systems, it is measured only at discrete times. Such a signal is not well suited to analysis by the continuous Fourier transform. For data taken at discrete times over a finite time interval, the discrete Fourier transform (DFT) has been defined as

F(kΔf) = Σ_{n=0}^{N−1} f(nΔt) e^{−j(2πkΔf)(nΔt)}        k = 0, 1, 2, …, N − 1        (5.11)
where N is the number of samples taken during a time period T. The frequency increment Δf is equal to 1/T, and the time increment (the sampling period Δt) is equal to T/N. The F's are complex coefficients of a series of sinusoids with frequencies of 0, Δf, 2Δf, 3Δf, …, (N − 1)Δf. The amplitude of F for a given frequency represents the relative contribution of that frequency to the original signal. Only the coefficients for the sinusoids with frequencies between 0 and (N/2 − 1)Δf are used in the analysis of the signal. The coefficients of the remaining frequencies provide redundant information and have a special meaning, as discussed by Bracewell (2000). The requirements of the Shannon sampling-rate theorem also prevent the use of any
frequencies above (N/2)Δf. The sampling rate is N/T, so the maximum allowable frequency in the sampled signal will be less than one-half this value, or N/2T = NΔf/2. The original signal can also be recovered from the DFT using the inverse discrete Fourier transform, given by

f(nΔt) = (1/N) Σ_{k=0}^{N−1} F(kΔf) e^{j(2πkΔf)(nΔt)}        n = 0, 1, 2, …, N − 1        (5.12)
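Equations (5.11) and (5.12) translate directly into code. The following sketch (plain Python with the standard cmath module; the function names and test record are our own) implements both transforms by direct summation and checks that the inverse recovers the original samples:

```python
import cmath

def dft(samples):
    """F(k*df) for k = 0..N-1, by direct summation of Eq. (5.11).
    Since df = 1/T and dt = T/N, the exponent (2*pi*k*df)(n*dt) = 2*pi*k*n/N."""
    N = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

def idft(coeffs):
    """f(n*dt) recovered from the coefficients via Eq. (5.12)."""
    N = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N
            for n in range(N)]

# Round trip on a short arbitrary record: idft(dft(x)) reproduces x.
x = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
x_back = [c.real for c in idft(dft(x))]
```

The round trip is exact to within floating-point error, mirroring the relationship between Eqs. (5.11) and (5.12).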
The F values from Eq. (5.11) can be evaluated by direct numerical integration. The amount of computer time required is roughly proportional to N². For large values of N, this can be prohibitive. A sophisticated algorithm called the Fast Fourier Transform (FFT) has been developed to compute discrete Fourier transforms much more rapidly. This algorithm requires a time proportional to N log₂ N to complete the computations, much less than the time for direct integration. The only restriction is that the value of N be a power of 2: for example, 128, 256, 512, and so on. Programs to perform fast Fourier transforms are widely available and are included in major spreadsheet programs. The fast Fourier transform algorithm is also built into devices called spectrum analyzers, which can discretize an analog signal and use the FFT to determine the component frequencies. It is useful to examine some of the characteristics of the discrete Fourier transform. To do this, we will use as an example a function that has 10- and 15-Hz components:
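The speedup comes from recursively splitting the transform into even- and odd-indexed halves. A minimal radix-2 Cooley-Tukey sketch (plain Python, not the book's implementation; it assumes the record length is a power of 2, as noted above) looks like this:

```python
import cmath

def fft(x):
    """Radix-2 FFT: same coefficients as the direct DFT of Eq. (5.11),
    but in O(N log2 N) operations.  len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])          # transform of even-indexed samples
    odd = fft(x[1::2])           # transform of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        tw = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # twiddle factor
        out[k] = even[k] + tw
        out[k + N // 2] = even[k] - tw
    return out
```

For an 8-point record this already returns the same values as the N² summation, and the advantage grows rapidly with N.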
f(t) = 2 sin 2π10t + sin 2π15t        (5.13)

This function is plotted in Figure 5.13. Since this function is composed of two sine waves with frequencies of 10 and 15 Hz, we would expect to see large values of F at these frequencies in the DFT analysis of the signal. If we discretize one second of the signal into 128 samples and perform an FFT (we used a spreadsheet program as demonstrated in Section A.2), the result will be as shown in Figure 5.14, which is a plot of the magnitude of the DFT component, |F(kΔf)|, versus the frequency, kΔf. As expected, the magnitudes of F at f = 10 and f = 15 are dominant. However, there are some adjacent frequencies showing appreciable magnitudes. In this case, these significant magnitudes of F at frequencies not in the signal are due to the relatively small
FIGURE 5.13  The function 2 sin 2π10t + sin 2π15t.
FIGURE 5.14  FFT of Eq. (5.13), N = 128, T = 1 s.
number of points used to discretize the signal. If we use N = 512, the situation improves significantly, as shown in Figure 5.15. It can be noticed that the magnitude of |F| for the 10-Hz component is different in Figures 5.14 and 5.15 and does not equal the amplitude of the first term on the right side of Eq. (5.13). This is a consequence of the definition of the discrete Fourier transform and the FFT algorithm. To get the correct amplitude of the input sine wave, |F| should be multiplied by 2/N. In Figure 5.14, the value of |F| for 10 Hz is 128. If we multiply this by 2/N we get 128 × 2/128 = 2, the same as in Eq. (5.13). Similarly, for Figure 5.15, |F| is 512, so the amplitude of the input sine wave is 512 × 2/512 = 2. In many cases (finding a natural frequency, for example), only the relative amplitudes of the Fourier components are important, so this conversion step is not necessary.
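The 2/N scaling can be confirmed numerically. This sketch evaluates two bins of the DFT of Eq. (5.13) directly (plain Python; the single-bin helper is our own construction, equivalent to one term of Eq. (5.11)):

```python
import math

def dft_bin_mag(samples, k):
    """|F(k*df)| for a single DFT bin, straight from Eq. (5.11)."""
    N = len(samples)
    re = sum(samples[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
    im = sum(samples[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
    return math.hypot(re, im)

# One second of Eq. (5.13) sampled at 128 samples/s, so df = 1 Hz.
N = 128
x = [2 * math.sin(2 * math.pi * 10 * n / N) + math.sin(2 * math.pi * 15 * n / N)
     for n in range(N)]

mag10 = dft_bin_mag(x, 10)           # 128, as read from Figure 5.14
amp10 = 2 * mag10 / N                # scaling by 2/N recovers the amplitude 2
amp15 = 2 * dft_bin_mag(x, 15) / N   # recovers the amplitude 1
```

Because the record holds whole numbers of cycles of both components, the scaled magnitudes reproduce the input amplitudes exactly.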
FIGURE 5.15  FFT of Eq. (5.13), N = 512, T = 1 s.
It should be noted that the requirements of the Shannon sampling-rate theorem are satisfied for the results shown in both Figures 5.14 and 5.15. For Figure 5.14, the sampling rate is 128 samples per second, so the maximum frequency in the sample should be less than 64 Hz; the actual maximum frequency is 15 Hz. For the FFTs shown in Figures 5.14 and 5.15, the data sample contained integral numbers of complete cycles of both component sinusoids. In general, the experimenter will not know the spectral composition of the signal and will not be able to select a sampling time T such that there will be an integral number of cycles of every frequency in the signal. This complicates the process of Fourier decomposition. To demonstrate this point, we will modify Eq. (5.13) by changing the lowest frequency from 10 Hz to 10.3333 Hz:

f(t) = 2 sin 2π10.3333t + sin 2π15t        (5.14)
An FFT is then performed on a 1-s period with 512 samples. Although 15 complete cycles of the 15-Hz component are sampled, 10.3333 cycles of the 10.3333-Hz component are sampled. The results of the DFT are shown in Figure 5.16. The first thing we notice is that the Fourier coefficient for 10.3333 Hz is distributed between 10 and 11 Hz, with the magnitude of the 10-Hz component slightly higher than that at 11 Hz. It should be recognized that without a priori knowledge, the user would not be able to deduce whether the signal had separate 10-Hz and 11-Hz components or just a single component at 10.3333 Hz, as in this case. An unexpected result is the fact that the entire spectrum (outside of 10, 11, and 15 Hz) has also been altered, yielding significant coefficients at frequencies not present in the original signal. This effect is called leakage and is caused by the fact that there are a non-integral number of cycles of the 10.3333-Hz sinusoid. Since one does not
FIGURE 5.16  FFT of Eq. (5.14), N = 512, T = 1 s.
know the signal frequency in advance, this leakage effect will normally occur in a sample. The actual cause is that the sampled value of a particular frequency component at the start of the sampling interval is different from its value at the end. A common method to work around this problem is the use of a windowing function to attenuate the signal at the beginning and the end of the sampling interval. A windowing function is a waveform that is applied to the sampled data. A commonly used window function is the Hann function, which is defined as

w(n) = (1/2)[1 − cos(2πn/(N − 1))]        (5.15)

The length of the window, N, is the same as the length of the sampled data. This equation is plotted in Figure 5.17. The sampled data are multiplied by this window function, producing a new set of data with smoother edges. Figure 5.18 shows the data used to plot Figure 5.16 both with and without the Hann window function applied. The Hann function is superimposed on top of the data, with its sinusoidal shape apparent. The central portion of the signal is unaffected, while the amplitude at the edges is gradually reduced to create a smoother transition. Figure 5.19 shows the distribution of Fourier coefficients for the windowed data. Compared to Figure 5.16, the frequency content at 10 and 15 Hz is much more distinct; however, the amplitude of the signal at these frequencies is clearly underestimated. This should not come as a surprise, as the windowing function clearly suppresses the average amplitude of the original signal. This tradeoff between frequency resolution and amplitude accuracy is inherent in all window types. Windows that present good resolution in frequency but poor determination of amplitude are often referred to as being of high resolution with low dynamic range. A variety of window functions and their characteristics have been defined in the literature. Each type of window has its own unique characteristics, with the proper choice depending on the application and the preferences of the user. Some common window functions are rectangular, Hamming, Hann, cosine, Lanczos, Bartlett, triangular, Gauss, Bartlett-Hann, Blackman, Kaiser, Blackman-Harris, and Blackman-Nuttall. See Engelberg (2008), Lyons (2004), Oppenheim et al. (1999), and Smith (2003) for more information on the windowing process.
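The effect of the Hann window on leakage can be demonstrated numerically. In this sketch (plain Python; the 40-Hz test bin is an arbitrary choice well away from the signal frequencies), windowing the signal of Eq. (5.14) sharply reduces the leakage far from its 10.3333- and 15-Hz components:

```python
import math, cmath

def dft_bin_mag(x, k):
    """|F(k*df)| for one bin, by direct summation of Eq. (5.11)."""
    N = len(x)
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))

# One second of Eq. (5.14) at 512 samples/s: 10.3333 cycles of the low component.
N = 512
x = [2 * math.sin(2 * math.pi * 10.3333 * n / N) + math.sin(2 * math.pi * 15 * n / N)
     for n in range(N)]

hann = [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]  # Eq. (5.15)
xw = [xi * wi for xi, wi in zip(x, hann)]

leak_raw = dft_bin_mag(x, 40)    # leakage from the non-integral 10.3333-Hz cycles
leak_win = dft_bin_mag(xw, 40)   # far smaller once the record's edges are smoothed
```

The windowed spectrum still shows a clear peak near 10 Hz (reduced in amplitude, as the text notes), while the leakage at distant bins drops by more than an order of magnitude.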
FIGURE 5.17  Hann windowing function (Eq. 5.15) for N = 128.
FIGURE 5.18  (a) Plot of Eq. (5.14) showing the Hann window function. (b) Modified data.
FIGURE 5.19  FFT of data depicted in Fig. 5.18 with N = 512, T = 1 s.
An additional consideration in spectral analysis is the type of plot used to display the data. In the previous figures, a linear scale has been used for both the amplitude and the frequency. It is common, however, to plot one or both axes on a logarithmic scale. In the case of amplitude, it is common to plot the spectral power density, which is typically represented in units of decibels (dB). The majority of signals encountered in practice will have record lengths much longer than the examples presented here. Consider a microphone measurement sampled at 40 kHz over a 10-s period of time. The record length in this case would
be 400,000 values. While it is possible to compute the DFT of a signal with N = 400,000, this is not always optimal and can give misleading results. In this case, the apparent frequency resolution, Δf, of the FFT would be 0.1 Hz. As already seen, spectral leakage is likely to limit the usable resolution to much coarser values, and it is unlikely that this level of resolution would be needed in practice. Rather, it is more common to use methods such as Bartlett's or Welch's method. In these methods, the sampled signal is divided into equal-length segments with a window function applied to each segment. An FFT is then computed for each segment, and the results are averaged together to produce a single FFT for the entire signal. For example, the record above could be divided into approximately 390 segments with 1024 values in each segment. The FFT of each segment would have a frequency resolution of 39.06 Hz, which would be sufficient for many applications. The major advantage of this form of analysis is that the uncertainty in the Fourier coefficients is reduced through the averaging of multiple FFT coefficients, typically yielding a much smoother curve than a single FFT. In Bartlett's method there is no overlap between segments, while in Welch's method some overlap (typically 50%) is allowed, thus taking more of the signal content into account. See Lyons (2004) and Oppenheim et al. (1999) for more information on these averaging methods.
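A minimal sketch of the segment-averaging idea (plain Python; the 1024-Hz rate, 2-s record, and 256-sample segments are assumptions chosen so the example runs quickly, and the per-segment window is omitted for brevity, as in the rectangular-window form of Bartlett's method):

```python
import math, cmath

def half_spectrum(seg):
    """|F(k*df)| for k = 0..N/2-1 of one segment, by the direct DFT of Eq. (5.11)."""
    N = len(seg)
    return [abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2)]

def bartlett(x, seg_len):
    """Split x into non-overlapping segments, transform each, and
    average the spectra bin by bin."""
    segs = [x[i:i + seg_len] for i in range(0, len(x) - seg_len + 1, seg_len)]
    spectra = [half_spectrum(s) for s in segs]
    return [sum(col) / len(spectra) for col in zip(*spectra)]

fs = 1024.0                                                     # assumed sampling rate
x = [math.sin(2 * math.pi * 64 * n / fs) for n in range(2048)]  # 2 s of a 64-Hz tone
avg = bartlett(x, 256)   # 8 averaged spectra; df = fs/256 = 4 Hz, so the peak is bin 16
```

Welch's method would differ only in stepping the segment start by half a segment (50% overlap) and multiplying each segment by a window before transforming.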
5.4  SELECTING THE SAMPLING RATE AND FILTERING

5.4.1  Selecting the Sampling Rate
As shown in previous sections, a typical signal can be considered to be composed of a set of sinusoids of different frequencies. In most cases the experimenter can determine the maximum signal frequency of interest, which we shall call f_c. However, the signal frequently contains significant energy at frequencies higher than f_c. If the signal is to be recorded with an analog device, such as an analog tape recorder, these higher frequencies are usually of no concern: they will either be recorded accurately or attenuated by the recording device. If, however, the signal is to be recorded only at discrete values of time, the potential exists for the generation of false, alias signals in the recording. The sampling-rate theorem does not state that to avoid aliasing the sampling rate must be twice the maximum frequency of interest, but that the sampling rate must be greater than twice the maximum frequency in the signal, here denoted by f_m. As an example, consider a signal that has Fourier sine components of 90, 180, 270, and 360 Hz. If we are only interested in frequencies below 200 Hz, we might set the sampling rate at 400 Hz. In our sampled output, however, we will see frequencies of 130 Hz and 40 Hz, which are aliases caused by the 270- and 360-Hz components of the signal. Section 5.1 provided a simple method to determine alias frequencies for a given signal frequency and sampling rate. If the sampling rate is denoted by f_s, the sampling-rate theorem is formally stated as
f_s > 2f_m        (5.16)
If we set the sampling rate, f_s, to a value that is greater than twice f_m, we should not only avoid aliasing but also be able to recover (at least theoretically) the original waveform. In the foregoing example, in which f_m is 360 Hz, we would select a sampling rate, f_s, greater than 720 Hz.† If this sampling-rate restriction is met, the original waveform can be recovered using the following formula, called the cardinal series (Marks, 1991):

f(t) = Σ_{n=−∞}^{∞} f(nΔT) · sin[π(t/ΔT − n)] / [π(t/ΔT − n)]        (5.17)
In this formula, f(t) is the reconstructed function. The factor f(nΔT) represents the discretely sampled values of the function, n is an integer corresponding to each sample, and ΔT is the sampling period, 1/f_s. One important characteristic of this equation is that it assumes an infinite set of sampled data and is hence an infinite series. Real sets of sampled data are finite. However, the series converges, and discrete samples in the vicinity of the time t contribute more than samples far from it. Hence, the use of a finite number of samples can lead to an excellent reconstruction of the original signal. As an example, consider the function sin(2π0.45t), a simple sine wave with a frequency of 0.45 Hz and a peak amplitude of 1.0. It is sampled at a rate of 1 sample per second, and 700 samples are collected. Note that f_s exceeds 2f_m, so this requirement of the sampling-rate theorem is satisfied. A portion of the sampled data is shown in Figure 5.20. The sampled data have been connected with straight-line segments and do not appear to closely resemble the original sine wave. Using Eq. (5.17) and the 700 data samples, the original curve has been reconstructed for the interval of time between 497 and 503 s, as shown by the heavy curve in Figure 5.20. The original sine wave has been recovered from the data samples with a high degree of accuracy. Reconstructions with small sample sizes or near the ends of the data record will, in general, not be as good as this example. Although Eq. (5.17) can be used to reconstruct data, other methods, beyond the scope of the present text, are often used (Marks, 1991). In most cases, the use of very high sampling rates can eliminate the need to use reconstruction methods to recover the original signal.
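This reconstruction can be sketched as follows (plain Python; the evaluation time of 500.5 s is an arbitrary point near the middle of the record, and the series of Eq. (5.17) is truncated to the 700 available samples):

```python
import math

DT = 1.0                    # sampling period: 1 sample per second
samples = [math.sin(2 * math.pi * 0.45 * n * DT) for n in range(700)]

def cardinal(t, samples, dt):
    """Truncated cardinal series, Eq. (5.17): sinc interpolation of the samples."""
    total = 0.0
    for n, fn in enumerate(samples):
        u = t / dt - n
        total += fn if u == 0 else fn * math.sin(math.pi * u) / (math.pi * u)
    return total

t = 500.5                                   # halfway between two samples
recon = cardinal(t, samples, DT)
exact = math.sin(2 * math.pi * 0.45 * t)    # recon agrees closely with this
```

Well inside the record the truncation error is small, which is the behavior described in the text; near the ends of the record the omitted terms matter more and the agreement degrades.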
† Although it is normally desirable to select such a sampling rate, it is possible to relax this requirement in some cases. The requirement that f_s > 2f_m is more stringent than is necessary to eliminate alias frequencies in the 0 to f_c range. As we noted in Section 5.1, for sampling rates between a sampled frequency f and twice that frequency, the lowest alias frequency is the difference between f and the sampling rate. Assume that we have set the sampling rate such that the minimum alias frequency f_a has a value just equal to f_c. The highest frequency in the signal that could cause aliasing is f_m. (All frequencies above f_m have zero amplitude.) Then this alias frequency, f_a, is the difference between f_s and f_m:

f_a = f_s − f_m = f_c

Solving, we obtain f_s = f_c + f_m. If this basis is used to select the sampling rate, there will be aliases in the sampled signal with frequencies in the f_c to f_s range but not in the 0 to f_c range. Digital filtering techniques using software can be used to eliminate these alias frequencies. If f_c has a value of 200 Hz and f_m has a value of 400 Hz, we would select a sampling rate of 600 Hz, significantly lower than twice f_m (800 Hz). For further discussion, see Taylor (1990).
x₂, …, xₙ, then the probability of occurrence of a particular value xᵢ is P(xᵢ), where P is the probability mass function for the variable x. As an example, consider a single die, which has six possible states, each with equal probability 1/6. Then P(xᵢ) = 1/6 for each value of xᵢ. As another example, consider a biased coin that has a probability of heads of 2/3 and a probability of tails of 1/3. Considering heads to be x₁ and tails to be x₂, the complete function P(x) is represented by P(x₁) = 2/3 and P(x₂) = 1/3. The sum of the probabilities of all possible values of x must be 1:
Σ_{i=1}^{n} P(xᵢ) = 1        (6.13)
The mean of the population for a discrete random variable is given by

μ = Σ_{i=1}^{N} xᵢ P(xᵢ)        (6.14)
The quantity μ is also called the expected value of x, E(x). The variance of the population is given by

σ² = Σ_{i=1}^{N} (xᵢ − μ)² P(xᵢ)        (6.15)
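As a quick illustration of Eqs. (6.13) through (6.15), the sketch below evaluates them for the fair die mentioned above (plain Python; the results 3.5 and 35/12 follow directly from the definitions):

```python
# Fair six-sided die: P(x_i) = 1/6 for each face.
faces = [1, 2, 3, 4, 5, 6]
P = {x: 1.0 / 6.0 for x in faces}

total = sum(P.values())                            # Eq. (6.13): must equal 1
mu = sum(x * P[x] for x in faces)                  # Eq. (6.14): mean, 3.5
var = sum((x - mu) ** 2 * P[x] for x in faces)     # Eq. (6.15): variance, 35/12
```

The same three lines work for any discrete probability mass function, such as the biased coin above.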
Probability Density Function  For a continuous random variable, a function f(x), called a probability density function, is defined such that the probability of occurrence of the random variable in an interval between xᵢ and xᵢ + dx is given by

P(xᵢ ≤ x ≤ xᵢ + dx) = f(xᵢ) dx        (6.16)
6.3  Probability

To evaluate the probability that x will occur in a finite interval from x = a to x = b, this equation can be integrated to obtain

P(a ≤ x ≤ b) = ∫ₐᵇ f(x) dx        (6.17)
For a continuous random variable, the probability of x having a single unique value is zero. If the limits of integration are extended to negative and positive infinity, we can be sure that the measurement is in that range, and the probability will be

P(−∞ ≤ x ≤ ∞) = 1

The definition of f(x) now allows us to define the mean of a population with probability density function f(x):

μ = ∫_{−∞}^{∞} x f(x) dx        (6.18)

This is also the expected value of the random variable, E(x). The variance of the population is given by

σ² = ∫_{−∞}^{∞} (x − μ)² f(x) dx        (6.19)
Example 6.1
The life of a given type of ball bearing can be characterized by the probability density function

f(x) = 0            x < 10 h
f(x) = 200/x³       x ≥ 10 h

f(x) is shown in Figure E6.1.
(a) Calculate the expected life of the bearings.
(b) If we pick a bearing at random from this batch, what is the probability that its life (x) will be less than 20 h, greater than 20 h, and, finally, exactly 20 h?
Solution:
(a) Using Eq. (6.18), we have

E(x) = μ = ∫₀^∞ x f(x) dx = ∫₁₀^∞ (200/x²) dx = [−200/x]₁₀^∞ = 20 h

FIGURE E6.1  f(x) versus lifetime (h).
Chapter 6  Statistical Analysis of Experimental Data
(b) The probability that the lifetime is less than 20 h is given by

P(x < 20) = ∫_{−∞}^{20} f(x) dx = ∫_{−∞}^{10} 0 dx + ∫₁₀^{20} (200/x³) dx = 0.75

Also,

P(x ≥ 20) = 1 − P(x ≤ 20) = 0.25
P(x = 20) = 0
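The analytical results of Example 6.1 can be checked by numerical integration (a midpoint-rule sketch in plain Python; the 10,000-h upper limit is our stand-in for infinity, which slightly truncates the tail):

```python
def f(x):
    """Bearing-life density of Example 6.1."""
    return 200.0 / x**3 if x >= 10.0 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) for k in range(steps)) * h

total = integrate(f, 10.0, 1.0e4)                   # ~1: f is a valid density
mean = integrate(lambda x: x * f(x), 10.0, 1.0e4)   # ~20 h, per Eq. (6.18)
p_lt_20 = integrate(f, 10.0, 20.0)                  # ~0.75, matching part (b)
```

The small departures from 1 and 20 h come from the truncated upper limit, not from the integration itself; extending the limit shrinks them further.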
Cumulative Distribution Function  The cumulative distribution function is another method of presenting the distribution of a random variable. It is used to determine the probability that a random variable has a value less than or equal to a specified value. The cumulative distribution function for a continuous random variable (rv) is defined as

F(rv ≤ x) = F(x) = ∫_{−∞}^{x} f(x) dx        (6.20)

For a discrete random variable, it is defined as

F(rv ≤ xⱼ) = P(rv ≤ xⱼ) = Σ_{i=1}^{j} P(xᵢ)        (6.21)

The next two relations follow from the definition of the cumulative distribution function:
P(a < x ≤ b) = F(b) − F(a)
P(x > a) = 1 − F(a)        (6.22)

The use of the cumulative distribution function is demonstrated in Example 6.2.

Example 6.2
Find the probability that one of the ball bearings of Example 6.1 will have a lifetime of (a) less than 15 h and (b) less than 20 h, using the cumulative distribution function.
Solution:
(a) Using Eq. (6.20), we obtain, for the cumulative distribution function,

F(x) = ∫_{−∞}^{x} f(x) dx

F(x) = 0                                      for x ≤ 10
F(x) = ∫₁₀ˣ (200/x³) dx = 1 − 100/x²          for x > 10

This function is plotted in Figure E6.2. Either by substituting 15 into the equation or by reading from the graph, we find that the probability that the lifetime is less than 15 h is 0.55.
(b) Similarly, the probability that the lifetime is less than 20 h is 0.75.
FIGURE E6.2  Cumulative distribution function F(x).

6.3.2  Some Probability Distribution Functions with Engineering Applications
A number of distribution functions are used in engineering applications. In this section, some of the most common are described briefly and their applications are discussed.

Binomial Distribution  The binomial distribution describes discrete random variables that can have only two possible outcomes: "success" and "failure." This distribution has application in production quality control, where the quality of a product is either acceptable or unacceptable. The following conditions need to be satisfied for the binomial distribution to be applicable to a given experiment:

1. Each trial in the experiment can have only the two possible outcomes of success or failure.
2. The probability of success remains constant throughout the experiment. This probability is denoted by p and is usually known or estimated for a given population.
3. The experiment consists of n independent trials.

The binomial distribution provides the probability (P) of finding exactly r successes in a total of n trials and is expressed as

P(r) = (n r) pʳ (1 − p)ⁿ⁻ʳ        (6.23)

In this relation, r is an integer that is always less than or equal to n, and

(n r) = n! / [r!(n − r)!]        (6.24)
is called n combination r, which is the number of ways that we can choose r items from n items. The expected number of successes in n trials for the binomial distribution is

μ = np        (6.25)
The standard deviation of the binomial distribution is

σ = √(np(1 − p))        (6.26)
In some engineering applications, one is interested in finding the probability that an event will happen less than or equal to a certain number of times. If k is this number of occurrences (k ≤ n), then

P(r ≤ k) = Σ_{i=0}^{k} P(r = i) = Σ_{i=0}^{k} (n i) pⁱ (1 − p)ⁿ⁻ⁱ        (6.27)
Example 6.3
A manufacturer of a brand of computer claims that his computers are reliable and that only 10% of the computers require repairs within the warranty period. Determine the probability that, in a batch of 20 computers, 5 will require repair during the warranty period.

Solution: We can apply the binomial distribution because of the pass/fail outcome of the process. Success will be defined as not needing repair within the warranty period; in this case, based on the manufacturer's claim, p = 0.9. Other assumptions underlying the application of this distribution are that all trials are independent and that the probabilities of success and failure are the same for all computers. The problem amounts to determining the probability P of having 15 successes (r) out of 20 machines (n). Using Eqs. (6.23) and (6.24) yields

    P(15) = C(20, 15) (0.9)^15 (1 - 0.9)^5

    C(20, 15) = 20! / [15! (20 - 15)!] = 15,504

    P(15) = 0.032

The conclusion here is that there is a fairly small chance (3.2%) that exactly 5 computers out of a batch of 20 will require repair.
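The arithmetic in Example 6.3 is easy to check numerically. The sketch below evaluates Eq. (6.23) with Python's standard library; the helper name binomial_pmf is ours, not from the text.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Eq. (6.23): probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Example 6.3: success = "no repair needed", p = 0.9, n = 20, r = 15
print(comb(20, 15))                          # 15504 ways to pick the 15 survivors
print(round(binomial_pmf(15, 20, 0.9), 3))   # 0.032, as in the text
```

The same helper evaluates Eq. (6.27) when summed over r from 0 to k.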
Example 6.4
A light-bulb manufacturing company has discovered that, for a given batch, 10% of the light bulbs are defective. If we buy four of these bulbs, what are the probabilities of finding that four, three, two, one, and none of the bulbs are defective?

Solution: Again, we can use the binomial distribution. The number of trials is 4, and if we define success as bulb failure, p = 0.1. The probabilities of having four, three, two, one, and zero defective light bulbs can be calculated by using Eq. (6.23). The probability of finding four defective bulbs is determined from

    P(r = 4) = C(4, 4) (0.1)^4 (1 - 0.1)^(4 - 4) = 0.0001

Note: 0! = 1. Similarly,

    P(r = 3) = 0.0036
    P(r = 2) = 0.0486
    P(r = 1) = 0.2916
    P(r = 0) = 0.6561

The total probability of all five possible outcomes is

    P = P(r = 4) + P(r = 3) + P(r = 2) + P(r = 1) + P(r = 0) = 1.0000
Example 6.5
For the data of Example 6.4, calculate the probability of finding up to and including two defective light bulbs in the sample of four.

Solution: We use Eq. (6.27) for this purpose:

    P(r ≤ 2) = P(r = 0) + P(r = 1) + P(r = 2)
             = 0.6561 + 0.2916 + 0.0486
             = 0.9963

This means that the probability of finding two or fewer defective light bulbs in four is 99.63%.

Poisson Distribution  The Poisson distribution is used to estimate the number of random occurrences of an event in a specified interval of time or space when the average number of occurrences is already known. For example, if it is known that, on average, 10 customers visit a bank per five-minute period during the lunch hour, the Poisson distribution can be used to predict the probability that 8 customers will visit during a particular five-minute period. The Poisson distribution can also be used for spatial variations. For instance, if it is known that there are, on average, two defects per square meter of printed circuit boards, the distribution can be used to predict the probability that there will be four defects in a square meter of boards. The following two assumptions underlie the Poisson distribution:

1. The probability of occurrence of an event is the same for any two intervals of the same length.
2. The probability of occurrence of an event is independent of the occurrence of other events.

The probability of occurrence of x events is given by
    P(x) = e^(-λ) λ^x / x!    (6.28)

where λ is the expected or mean number of occurrences during the interval of interest. The expected value of x for the Poisson distribution, the same as the mean μ, is given by

    E(x) = μ = λ    (6.29)

The standard deviation is given by

    σ = sqrt(λ)    (6.30)
In some cases, the goal is to find the probability that a certain number of events or fewer will occur. To compute the probability that the number of occurrences is less than or equal to k, the probabilities of k, k - 1, ..., 0 events must be summed. This is given by

    P(x ≤ k) = Σ (i = 0 to k) e^(-λ) λ^i / i!    (6.31)

Example 6.6
In a binary data stream, it is known that there is an average of three errors per minute. Find the probability that there are exactly zero errors during a one-minute period.

Solution: For this problem, λ = 3 and x = 0. Substituting into Eq. (6.28) yields

    P(0) = e^(-3) 3^0 / 0! = 0.050

The probability that there are no errors in a one-minute interval is thus 0.050.
Example 6.7
In Example 6.6, what is the probability that there will be (a) three or fewer errors, (b) more than three errors?

Solution: First we need to compute the probabilities of 0, 1, 2, and 3 errors using Eq. (6.28):

    P(3) = e^(-3) 3^3 / 3! = 0.224

Similarly,

    P(2) = 0.224
    P(1) = 0.149
    P(0) = 0.050

Thus,

    P(x ≤ 3) = 0.050 + 0.149 + 0.224 + 0.224 = 0.647

The probability that x is greater than 3 is then

    P(x > 3) = 1 - P(x ≤ 3) = 1 - 0.647 = 0.353
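Examples 6.6 and 6.7 can be reproduced directly from Eqs. (6.28) and (6.31). A sketch (the helper name poisson_pmf is ours):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Eq. (6.28): probability of exactly x occurrences when the mean count is lam."""
    return exp(-lam) * lam**x / factorial(x)

# Example 6.6: lam = 3 errors per minute, probability of zero errors
print(f"{poisson_pmf(0, 3):.3f}")   # 0.050

# Example 6.7: Eq. (6.31) cumulative sum for three or fewer errors
p_le_3 = sum(poisson_pmf(i, 3) for i in range(4))
print(round(p_le_3, 3))             # 0.647
print(round(1 - p_le_3, 3))         # 0.353
```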
Example 6.8
It has been found in welds joining pipes that there is an average of five defects per 10 linear meters of weld (0.5 defects per meter). What is the probability that there will be (a) a single defect in a weld that is 0.5 m long, (b) more than one defect in a weld that is 0.5 m long?

Solution: λ, the average number of defects in 0.5 m, is 0.5 × 0.5 = 0.25. Using Eq. (6.28), we find that the probability of one defect is

    P(1) = e^(-0.25) (0.25)^1 / 1! = 0.194

There is thus a probability of 0.194 that there will be a single defect. The probability that there will be more than one defect is

    P(x > 1) = 1 - P(0) - P(1)

P(0) can be computed from Eq. (6.28) as 0.778. The probability is

    P(x > 1) = 1 - 0.778 - 0.194 = 0.028

Thus, the probability of more than one defect is only 0.028.

Normal (Gaussian) Distribution  The normal (or Gaussian) distribution is a simple distribution function that is useful for a large number of common problems involving continuous random variables. The normal distribution has been shown to describe the dispersion of data for measurements in which the variation in the measured value is due entirely to random factors and occurrences of both positive and negative deviations are equally probable. The equation for the normal probability density function is
    f(x) = [1 / (σ sqrt(2π))] e^(-(x - μ)^2 / (2σ^2))    (6.32)
In this equation, x is the random variable. The function has two parameters: the population standard deviation, σ, and the population mean, μ. A plot of f(x) versus x for different values of σ and a fixed value of the mean is shown in Figure 6.4. As the figure shows, the distribution is symmetric about the mean value, and the smaller the standard deviation, the higher is the peak value of f(x) at the mean.

FIGURE 6.4  Normal distribution function.

According to the definition of the probability density function [Eq. (6.17)], for a given population, the probability of having a single value of x between a lower limit of x1 and an upper limit of x2 is

    P(x1 ≤ x ≤ x2) = ∫ (x1 to x2) f(x) dx    (6.33)

Since f(x) is in the form of an error function, the preceding integral cannot be evaluated analytically, and as a result, the integration must be performed numerically. To simplify the numerical integration process, the integrand is usually modified with a change of variable so that the numerically evaluated integral is general and useful for all problems. A nondimensional variable z is defined as

    z = (x - μ) / σ    (6.34)

It is now possible to define the function

    f(z) = [1 / sqrt(2π)] e^(-z^2 / 2)    (6.35)
which is called the standard normal density function. It represents the normal probability density function for a random variable z with mean μ equal to zero and σ = 1. This normalized function is plotted in Figure 6.5. Taking the differential of Eq. (6.34), we have dx = σ dz. Equation (6.33) then transforms to

    P(x1 ≤ x ≤ x2) = ∫ (z1 to z2) f(z) dz    (6.36)

The probability that x is between x1 and x2 is the same as the probability that the transformed variable z is between z1 and z2:

    P(x1 ≤ x ≤ x2) = P(z1 ≤ z ≤ z2) = P[(x1 - μ)/σ ≤ z ≤ (x2 - μ)/σ]    (6.37a)
The probability P(z1 ≤ z ≤ z2) has a value equal to the crosshatched area shown in Figure 6.5. The curve shown in the figure is symmetric with respect to the vertical axis at z = 0, which indicates that, with this distribution, the probabilities of positive and negative deviations from z = 0 are equal. Mathematically, this can be stated as

    P(-z1 ≤ z ≤ 0) = P(0 ≤ z ≤ z1) = P(-z1 ≤ z ≤ z1) / 2.0    (6.37b)

FIGURE 6.5  Standard normal distribution function.
As mentioned, the integral in Eq. (6.33) has two free parameters (μ and σ), and as a practical matter, it would have to be integrated numerically for each application. On the other hand, the integral in Eq. (6.36) has no free parameters. If z1 in Eq. (6.36) is chosen to be zero, it is practical to perform the integration numerically and tabulate the results as a function of z2, or simply z. The results of this integration are shown in Table 6.3. Since the standard normal distribution function is symmetric about z = 0, the table can be used to predict the probability of occurrence of a random variable anywhere in the range -∞ to +∞ if the population follows the basic characteristics of a normal distribution. The table presents the value of the probability that the random variable has a value between 0 and z, for z values shown in the left column and the top row; the top row serves as the second decimal place of the left column. The process of using Table 6.3 to predict probabilities is demonstrated in Example 6.9.
Example 6.9  The results of a test that follows a normal distribution have a mean value of 10.0 and a standard deviation of 1. Find the probability that a single reading is (a) between 9 and 12, (b) between 8 and 9.55, (c) less than or equal to 9, (d) greater than 12.

Solution:

(a) Using Eq. (6.34), we have z1 = (9 - 10)/1 = -1 and z2 = (12 - 10)/1 = 2. We are looking for the area under the standard normal distribution curve from z = -1 to z = 2 [Figure E6.9(a)]. This will be broken into two parts: from z = -1 to 0 and from z = 0 to z = 2. The area under the curve from z = -1 to 0 is the same as the area from z = 0 to z = 1. From Table 6.3, this value is 0.3413; that is, P(-1 ≤ z ≤ 0) = 0.3413. The area from z = 0 to z = 2 is P(0 ≤ z ≤ 2) = 0.4772. Hence,

    P(-1 ≤ z ≤ 2) = 0.3413 + 0.4772 = 0.8185    or 81.85%

(b) z1 = (8 - 10)/1 = -2 and z2 = (9.55 - 10)/1 = -0.45. We are looking for the area between z = -2 and z = -0.45 [Figure E6.9(b)]. From Table 6.3, the area from z = -2 to z = 0 is 0.4772 and the area from z = -0.45 to z = 0 is 0.1736. The result we seek is the difference between these two areas:

    P(-2 ≤ z ≤ -0.45) = 0.4772 - 0.1736 = 0.3036    or 30.36%

(c) z1 = -∞ and z2 = -1 [Figure E6.9(c)]. From Table 6.3, we find that P(-1 ≤ z ≤ 0) is 0.3413. We also know that P(-∞ ≤ z ≤ 0) is 0.5. Hence,

    P(-∞ ≤ z ≤ -1) = 0.5 - 0.3413 = 0.1587    or 15.87%
TABLE 6.3  Area Under the Normal Distribution from z = 0 to z

 z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
0.1  .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2  .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3  .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4  .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
0.5  .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6  .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
0.7  .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8  .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9  .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389
1.0  .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1  .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2  .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3  .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4  .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
1.5  .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6  .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7  .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8  .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9  .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
2.0  .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
2.1  .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
2.2  .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
2.3  .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
2.4  .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936
2.5  .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
2.6  .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
2.7  .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
2.8  .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
2.9  .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986
3.0  .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990
3.1  .4990  .4991  .4991  .4991  .4992  .4992  .4992  .4992  .4993  .4993
3.2  .4993  .4993  .4994  .4994  .4994  .4994  .4994  .4995  .4995  .4995
3.3  .4995  .4995  .4995  .4996  .4996  .4996  .4996  .4996  .4996  .4997
3.4  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4998
3.5  .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998  .4998
3.6  .4998  .4998  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.7  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.8  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.9  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000
4.0  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000  .5000
FIGURE E6.9(a, b)

FIGURE E6.9(c, d)
(d) z1 = (12 - 10)/1 = 2 and z2 = ∞. We are seeking the area from z = 2 to z = ∞ [Figure E6.9(d)]. From Table 6.3, the area from z = 0 to z = 2 is 0.4772. The area from z = 0 to z = ∞ is 0.5. The required probability is then the difference between these two areas:

    P(2 ≤ z ≤ ∞) = P(0, ∞) - P(0, 2) = 0.5 - 0.4772 = 0.0228    or 2.28%
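Table 6.3 tabulates P(0 ≤ z ≤ z). The same areas can be computed without a table, since the cumulative standard normal is Φ(z) = (1/2)[1 + erf(z/√2)]. A sketch re-checking Example 6.9 (small last-digit differences from the four-place table are expected):

```python
from math import erf, sqrt

def phi(z):
    """Cumulative standard normal probability, P(-inf <= Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 10.0, 1.0
z = lambda x: (x - mu) / sigma             # Eq. (6.34)

print(round(phi(z(12)) - phi(z(9)), 4))    # (a) 0.8186 (table: 0.8185)
print(round(phi(z(9.55)) - phi(z(8)), 4))  # (b) 0.3036
print(round(phi(z(9)), 4))                 # (c) 0.1587
print(round(1 - phi(z(12)), 4))            # (d) 0.0228
```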
It is also of interest to determine the probability that a measurement will fall within one or more standard deviations (σ's) of the mean. The value for 1σ (z = 1) is evaluated from Table 6.3. For z = 1, the probability is 0.3413. Since we are looking for ±1σ, the probability we seek is twice this value, or 0.6826 (68.26%). This means that if a measurement is performed in an environment where the deviation from the mean value is totally influenced by random variables, we can be 68.26% confident that it will fall within one standard deviation of the mean. For 2σ, the probability is 95.44%. This value is close to 95%; hence, in practical usage, 2σ and 95% are often used interchangeably.
TABLE 6.4  Confidence Intervals and Probabilities

Confidence interval    Confidence level (%)
±1σ                    68.26
±2σ                    95.44
±3σ                    99.74
±3.5σ                  99.96
Table 6.4 shows the probability (confidence level) for different deviation limits (confidence intervals). The concepts of confidence intervals and confidence levels are discussed in greater detail later in the chapter.

Example 6.10  In the mass production of engine blocks for a single-cylinder engine, the tolerance in the diameter (D) of the cylinders is normally distributed. The average diameter of the cylinders, μ, has been measured to be 4.000 in., and the standard deviation is 0.002 in. What are the probabilities of the following cases?

(a) A cylinder is measured with D ≤ 4.002.
(b) A cylinder is measured with D ≥ 4.005.
(c) A cylinder is measured with 3.993 ≤ D ≤ 4.003.
(d) If cylinders with a deviation from the mean diameter of greater than 2σ are rejected, what percent of the blocks will be rejected?
Solution: The deviation in diameter in this case is d = D - μ, where D is the diameter of the cylinder and μ is the mean diameter. It has been established that the diameter is normally distributed, so we can use the standard normal distribution function and Table 6.3 to solve this problem.

(a) z = (D - μ)/σ = (4.002 - 4.000)/0.002 = 1. From Table 6.3, for z = 1, the area from z = 0 to z = 1 is 0.3413. The area from z = -∞ to z = 0 is 0.5. Then

    P(-∞ ≤ z ≤ 1) = 0.5 + 0.3413 = 0.8413    or 84.13%

(b) z = (4.005 - 4.000)/0.002 = 2.5. From Table 6.3, for z = 2.5, the area from z = 0 to z = 2.5 is 0.4938. The area from z = 0 to z = ∞ is 0.5. Then

    P(z > 2.5) = 0.5 - 0.4938 = 0.0062    or 0.62%

(c) z1 = (3.993 - 4.000)/0.002 = -3.5 and z2 = (4.003 - 4.000)/0.002 = 1.5. From Table 6.3, we obtain 0.4998 and 0.4332 for z = 3.5 and 1.5, respectively. Then

    P(-3.5 ≤ z ≤ 1.5) = 0.4998 + 0.4332 = 0.9330    or 93.3%

(d) If the rejection criterion is 2σ, the fraction of rejected blocks will be

    1 - 2 × P(0 ≤ z ≤ 2) = 1 - 2 × 0.4772 = 0.0456    or 4.56%
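The same erf-based normal CDF reproduces Example 6.10 without table lookups; again, the last digit can differ slightly from the four-place table:

```python
from math import erf, sqrt

phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))    # cumulative standard normal

mu, sigma = 4.000, 0.002                        # mean diameter and std dev, inches
z = lambda d: (d - mu) / sigma

print(round(phi(z(4.002)), 4))                  # (a) 0.8413
print(round(1 - phi(z(4.005)), 4))              # (b) 0.0062
print(round(phi(z(4.003)) - phi(z(3.993)), 4))  # (c) 0.9330
print(round(2 * (1 - phi(2.0)), 4))             # (d) 0.0455 (table rounding: 0.0456)
```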
FIGURE 6.6  Distribution suitable for lognormal analysis.
Standard Lognormal Distribution  In some experimental situations, the measured variable is restricted to positive values; its values are usually small but can occasionally be very large. When plotted as a histogram, such data give rise to skewed distributions with very long upper tails, as depicted in Figure 6.6. Such variables can sometimes be efficiently described by a lognormal distribution. If the experimental variable is X and we define a new variable, Y, such that Y = ln(X), and if Y is normally distributed, then the experimental variable X follows a lognormal distribution. If the mean of X is m and its standard deviation is s, then Y is normally distributed with a mean μ = ln(m) and a standard deviation σ = ln(s). This distribution is demonstrated in the following example.

Example 6.11  If X follows a lognormal distribution with a mean m = 20 and a standard deviation s = 10, what is the probability that the value of X will (a) lie between 50 and 100, (b) be greater than 1000?

Solution: If we define Y = ln(X), then Y is normally distributed with mean μ = ln(20) = 2.9957 and standard deviation σ = ln(10) = 2.3026.

(a) P(50 < X < 100) = P(ln(50) < Y < ln(100)) = P(3.9120 < Y < 4.6052). Converting to the standard normal variable gives z1 = (3.9120 - 2.9957)/2.3026 = 0.40 and z2 = (4.6052 - 2.9957)/2.3026 = 0.70, so from Table 6.3,

    P(0.40 < z < 0.70) = 0.2580 - 0.1554 = 0.1026

(b) P(X > 1000) = P(Y > ln(1000)) = P(Y > 6.9078) = P(z > 1.70). From Table 6.3,

    P(z > 1.70) = 0.5 - 0.4554 = 0.0446
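A lognormal probability can be checked by applying the normal CDF to Y = ln(X), following the text's convention μ = ln(m), σ = ln(s). A sketch for m = 20 and s = 10; the answers agree with table-based values to about three decimal places:

```python
from math import erf, log, sqrt

phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))   # cumulative standard normal

mu, sigma = log(20), log(10)        # parameters of Y = ln(X), per the text
z = lambda x: (log(x) - mu) / sigma

print(round(phi(z(100)) - phi(z(50)), 4))   # P(50 < X < 100)
print(round(1 - phi(z(1000)), 4))           # P(X > 1000)
```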
Exponential Distribution  The exponential distribution describes the interval, in time or space, between successive occurrences of random events when the mean rate of occurrence is constant (for example, the time to failure of a component). Its probability density function is

    f(x) = λ e^(-λx),  x ≥ 0
    f(x) = 0,          x < 0    (6.38)

where λ is always greater than zero. For this distribution, the mean and the standard deviation of x are equal to each other and are given by μ = 1/λ and σ = 1/λ. Since the probability of occurrence of an event in an interval is the integral of f(x) over that interval, it can be shown that

    P(x1 ≤ x ≤ x2) = e^(-λx1) - e^(-λx2)

and that

    P(x ≤ a) = 1 - e^(-λa)
Example 6.13  The lifetime of the light-emitting diodes (LEDs) used in a digital instrument display panel is described by an exponential distribution, and it is known that the mean life of an LED is 2000 days. Find the probability that a given LED will (a) fail within 1000 days, (b) fail between 1000 and 2000 days, (c) last at least 1500 days.

Solution: From the given data, μ = 1/λ = 2000 days, so that λ = 1/2000 = 0.0005/day.
(a) Since P(x ≤ a) = 1 - e^(-λa),

    P(x ≤ 1000) = 1 - e^(-0.0005×1000) = 1 - e^(-0.5) = 1 - 0.6065 = 0.3935

(b) P(x1 ≤ x ≤ x2) = e^(-λx1) - e^(-λx2), so that

    P(1000 ≤ x ≤ 2000) = e^(-0.5) - e^(-1.0) = 0.6065 - 0.3679 = 0.2387

(c) P(x ≥ 1500) = 1 - P(x ≤ 1500). Since P(x ≤ a) = 1 - e^(-λa),

    P(x ≥ 1500) = 1 - (1 - e^(-0.0005×1500)) = e^(-0.75) = 0.4724
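The exponential probabilities of Example 6.13 follow directly from P(x ≤ a) = 1 - e^(-λa). A sketch:

```python
from math import exp

lam = 1 / 2000                 # failure rate per day (mean life 2000 days)

def exp_cdf(a, lam):
    """P(x <= a) = 1 - e^(-lam*a) for the exponential distribution."""
    return 1 - exp(-lam * a)

print(round(exp_cdf(1000, lam), 4))                       # (a) 0.3935
print(round(exp_cdf(2000, lam) - exp_cdf(1000, lam), 4))  # (b) 0.2387
print(round(1 - exp_cdf(1500, lam), 4))                   # (c) 0.4724
```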
Other Distribution Functions  Several other types of probability distribution functions are used in engineering experiments. A detailed discussion of all these functions is beyond the scope of this book. In Table 6.5, we briefly summarize important distributions that have significant applications in engineering. For details, the reader is referred to more comprehensive texts on the subject, such as Lipson and Sheth (1973) and Lapin (1990).
6.4 PARAMETER ESTIMATION
In most experiments, the sample size is small relative to the population, yet the princi pal intention of a statistical experiment is to estimate the parameters describing the entire population. While there are many population parameters that can be estimated,
TABLE 6.5  Summary of Important Distributions in Engineering

Binomial: Used for quality control (rejection of defective products), failure-success, good-bad types of observation. This is a discrete distribution.

Poisson: Used to predict the probability of occurrence of a specific number of events in a space or time interval when the mean number of occurrences is known.

Normal (Gaussian): Continuous, symmetrical, and the most widely used distribution in experimental analysis in physical science. Used to describe random variables in engineering experiments, such as gas molecule velocities and the electrical power consumption of households.

Student's t: Continuous, symmetrical; used for analysis of the variation of the sample mean for experimental data with sample size less than 30. For sample sizes greater than 30, Student's t approaches the normal distribution.

Chi-squared (χ²): Continuous, nonsymmetrical; used for analysis of the variance of samples in a population. For example, the consistency of chemical reaction time is of prime importance in some industrial processes, and the χ² distribution is used for its analysis. This distribution is also used to determine the goodness of fit of a distribution for a particular application.

Weibull: Continuous, nonsymmetrical; used for describing the life phenomena of parts and components of machines.

Exponential: Continuous, nonsymmetrical; used for analysis of the failure and reliability of components, systems, and assemblies.

Lognormal: Continuous, nonsymmetrical; used for life and durability studies of parts and components.

Uniform: Continuous, symmetrical; used for estimating the probabilities of random values generated in computer simulation.
the most common estimated parameter is the population mean, μ. In some cases, it is also necessary to estimate the population standard deviation, σ. An estimate of the population mean, μ, is the sample mean, x̄ [as defined by Eq. (6.1)], and an estimate of the population standard deviation, σ, is the sample standard deviation, S [as defined by Eq. (6.6)]. However, simple point estimates of these parameters are not sufficient: different samples from the same population will yield different values. Consequently, it is also necessary to determine uncertainty intervals for the estimated parameters. In the sections that follow, we discuss the determination of these uncertainty intervals. For further details, see Walpole and Myers (1998).

6.4.1 Interval Estimation of the Population Mean

We wish to make an estimate of the population mean, which takes the form

    μ = x̄ ± δ    or    x̄ - δ ≤ μ ≤ x̄ + δ    (6.39)

where δ is an uncertainty and x̄ is the sample mean. The interval from x̄ - δ to x̄ + δ is called the confidence interval on the mean. However, the confidence interval depends on a concept called the confidence level, sometimes called the degree of confidence. The confidence level is the probability that the population mean will fall within the specified interval:

    confidence level = P(x̄ - δ ≤ μ ≤ x̄ + δ)    (6.40)

We will have a higher confidence level that the mean will fall within a large interval than within a small one. For example, we might state that the mean is 11 ± 5 with a 95% confidence level or that the mean is 11 ± 3 with a 60% confidence
level. The confidence level is normally expressed in terms of a variable α, called the level of significance:

    confidence level = 1 - α    (6.41)

α is then the probability that the mean will fall outside the confidence interval. The central limit theorem makes it possible to make an estimate of the confidence interval with a suitable confidence level. Consider a population of the random variable x with a mean value of μ and standard deviation σ. From this population, we could take several different samples, each of size n. Each of these samples would have a mean value x̄_i, but we would not expect each of these means to have the same value. In fact, the x̄_i's are values of a random variable. The central limit theorem states that if n is sufficiently large, the x̄_i's follow a normal distribution, and the standard deviation of these means is given by

    σ_x̄ = σ / sqrt(n)    (6.42)
The population need not be normally distributed for the means to be normally distributed and for the standard deviation of the means to be given by Eq. (6.42). The standard deviation of the mean is also called the standard error of the mean. For the central limit theorem to apply, the sample size n must be large; in most cases, n should exceed 30 to be considered large (Harnett and Murphy, 1975; Lapin, 1990). The following are important conclusions from the central limit theorem:

1. If the original population is normal, the distribution of the x̄_i's is normal.
2. If the original population is not normal and n is large (n > 30), the distribution of the x̄_i's is normal.
3. If the original population is not normal and n < 30, the x̄_i's follow a normal distribution only approximately.
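The central limit theorem and Eq. (6.42) can be illustrated with a short simulation: means of samples drawn from a decidedly non-normal (uniform) population scatter with a standard deviation close to σ/√n. A sketch; the sample counts are chosen only for illustration:

```python
import random
import statistics

random.seed(1)                    # fixed seed so the run is repeatable
n = 36                            # size of each sample
sigma = (1 / 12) ** 0.5           # std dev of a uniform(0, 1) population

# Draw 2000 samples of size n and record each sample mean x_i
means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(2000)]

print(round(statistics.stdev(means), 4))   # observed scatter of the sample means
print(round(sigma / n**0.5, 4))            # Eq. (6.42) prediction: 0.0481
```

The two printed values agree closely even though the underlying population is not normal.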
If the sample size is large, we can use the central limit theorem directly to make an estimate of the confidence interval. Since x̄ is normally distributed, we can use the statistic z defined by

    z = (x̄ - μ) / σ_x̄    (6.43)

Using the standard normal distribution function and the standard normal distribution tables, we can estimate the confidence interval on z. σ is the population standard deviation, which, in general, is not known. However, for large samples, the sample standard deviation, S, can be used as an approximation for σ.

The estimation of the confidence interval is shown graphically in Figure 6.7. If z has a value of zero, the estimate x̄ is exactly the population mean, μ. However, we only expect the true value of μ to lie somewhere in the confidence interval, which is ±z_{α/2}. The probability that z lies in the confidence interval, between -z_{α/2} and z_{α/2}, is the area under the curve between these two z values and has the value 1 - α. The confidence level is this probability, 1 - α. The term α is equal
FIGURE 6.7
Concept of confidence interval of the mean.
to the sum of the areas of the left- and right-hand tails in Figure 6.7. These concepts can then be restated as

    P[-z_{α/2} ≤ z ≤ z_{α/2}] = 1 - α    (6.44)

Substituting for z, we obtain

    P[-z_{α/2} ≤ (x̄ - μ)/(σ/sqrt(n)) ≤ z_{α/2}] = 1 - α    (6.45a)

This can be rearranged to give

    P[x̄ - z_{α/2} σ/sqrt(n) ≤ μ ≤ x̄ + z_{α/2} σ/sqrt(n)] = 1 - α    (6.45b)

It can also be stated as

    μ = x̄ ± z_{α/2} σ/sqrt(n)    with confidence level 1 - α    (6.46)
In some instances, we are more interested in knowing what our data predict about the maximum value that the mean can take at a given level of confidence, rather than in knowing the interval within which the mean falls. Similarly, in other situations we may be more interested in finding, at a given level of confidence, the minimum value that the mean can take. Such questions are answered by one-sided confidence intervals rather than by the two-sided confidence interval given in Eq. (6.46).

The one-sided confidence interval for the upper limit of the mean at a confidence level of (1 - α) is given by

    -∞ ≤ μ < x̄ + z_α σ/sqrt(n)

This means that, based on the given data, the likelihood that -∞ < μ ≤ x̄ + z_α σ/sqrt(n) is (1 - α). Note that for the one-sided confidence interval we use z_α rather than z_{α/2}.

Similarly, the one-sided confidence interval for the lower limit of the mean at a confidence level of (1 - α) is given by

    x̄ - z_α σ/sqrt(n) ≤ μ < ∞

As before, this means that the likelihood of the mean, μ, being greater than x̄ - z_α σ/sqrt(n) is (1 - α).
Example 6.14  We would like to determine the confidence interval of the mean of a batch of resistors made using a certain process. Based on 36 readings, the average resistance is 25 Ω and the sample standard deviation is 0.5 Ω. (a) Determine the 90% confidence interval of the mean resistance of the batch. (b) Determine the one-sided confidence interval for the maximum value of the mean at a confidence level of 90%. Determine the one-sided confidence interval for the minimum value of the mean at the same confidence level.

Solution:

(a) The desired confidence level is 90%, so 1 - α = 0.90 and α = 0.1. Because the number of samples is greater than 30, we can use the normal distribution and Eq. (6.46) to determine the confidence interval. Based on the nomenclature of Figure 6.7, we can determine the value of z_{α/2}. Since the area between z = 0 and z = ∞ is 0.5, the area between z = 0 and z_{α/2} is 0.5 - α/2 = 0.45. Using Table 6.3 with this value of probability (area), we find z_{α/2} = 1.645. Using the sample standard deviation, S, as an approximation for the population standard deviation, σ, we can estimate the uncertainty interval on μ:

    x̄ - z_{α/2} S/sqrt(n) ≤ μ ≤ x̄ + z_{α/2} S/sqrt(n)
    25 - 1.645 × 0.5/6 ≤ μ ≤ 25 + 1.645 × 0.5/6
    24.86 Ω ≤ μ ≤ 25.14 Ω

This indicates that the average resistance is expected to be 25 ± 0.14 Ω with a confidence level of 90%.

(b) Given that α = 0.1, we use the procedure described above to find z_{0.1} = z_α from Table 6.3 as z_α = 1.2816. We can now estimate the 90% confidence level for the upper limit of μ as

    x̄ + z_α S/sqrt(n) = 25 + 1.2816 × 0.5/sqrt(36) = 25.1068 Ω

This result is interpreted as equivalent to saying "the maximum value of the mean resistance is expected to be no more than 25.1068 Ω with a confidence level of 90%." Using a similar procedure, we can find the 90% confidence level for the minimum value of the mean resistance as

    x̄ - z_α S/sqrt(n) = 25 - 1.2816 × 0.5/sqrt(36) = 24.8932 Ω
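The computations of Example 6.14 can be sketched in a few lines (the z values are taken from Table 6.3, as in the text):

```python
from math import sqrt

n, xbar, S = 36, 25.0, 0.5    # readings, sample mean (ohms), sample std dev
z_half = 1.645                # z_{alpha/2} for a 90% two-sided interval
z_one = 1.2816                # z_alpha for a 90% one-sided bound

half_width = z_half * S / sqrt(n)
print(round(xbar - half_width, 2), round(xbar + half_width, 2))  # 24.86 25.14

print(round(xbar + z_one * S / sqrt(n), 4))   # one-sided upper bound: 25.1068
print(round(xbar - z_one * S / sqrt(n), 4))   # one-sided lower bound: 24.8932
```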
If the sample size is small ( n < 30 ) , the assumption that the population standard deviation can be represented by the sample standard deviation may not be accurate. Due to the uncertainty in the standard deviation, for the same confidence level, we
would expect the confidence interval to be wider. In the case of small samples, a statistic called Student's t is used:

    t = (x̄ - μ) / (S/sqrt(n))    (6.47)
In this formula, x̄ is the sample mean, S is the sample standard deviation, and n is the sample size. In contrast to the normal distribution, which is independent of sample size, there is a family of t-distribution functions that depend on the number of samples. The functional form of the t-distribution is given by (see Lipson and Sheth, 1973)

    f(t, ν) = [Γ((ν + 1)/2) / (sqrt(νπ) Γ(ν/2))] (1 + t²/ν)^(-(ν + 1)/2)    (6.48)

Γ(x) is a standard mathematical function known as the gamma function and can be obtained from tables of mathematical functions, such as those of the U.S. Department of Commerce (1964). ν is a parameter known as the degrees of freedom, given by the number of independent measurements minus the minimum number of measurements that are theoretically necessary to estimate a statistical parameter. For the t-distribution, one degree of freedom is used in estimating the mean value of x, so ν = n - 1. It is unlikely that an engineer would actually use Eq. (6.48) to perform an analysis; the function is built into spreadsheet and statistical analysis programs. For purposes of the present course, all required values can be obtained from a single table.

Figure 6.8 [a plot of Eq. (6.48)] shows the Student's t-distribution for different values of the degrees of freedom ν. Like the normal distribution, these are symmetric curves. As the number of samples increases, the t-distribution approaches the normal distribution. For smaller values of ν, the distribution is broader with a lower peak.
FIGURE 6.8  Probability density function using the Student's t-distribution.
FIGURE 6.9
Confidence interval for the t-distribution.
The t-distribution can be used to estimate the confidence interval on the mean value of a sample with a certain confidence level for small sample sizes (less than 30). The t-distribution is used in much the same manner as the normal distribution, except that the curve for the appropriate value of ν is selected, as shown in Figure 6.9. The probability that t falls between -t_α/2 and t_α/2 is then 1 - α. This can be stated as

P[-t_α/2 ≤ t ≤ t_α/2] = 1 - α    (6.49)

Substituting for t, we obtain

P[-t_α/2 ≤ (x̄ - μ)/(S/√n) ≤ t_α/2] = 1 - α    (6.50a)

This can be rearranged to give

P[x̄ - t_α/2 S/√n ≤ μ ≤ x̄ + t_α/2 S/√n] = 1 - α    (6.50b)

This equation can also be stated as

μ = x̄ ± t_α/2 S/√n    with confidence level 1 - α    (6.51)
Since complete tables of the t-distribution would be quite voluminous, it is common to present only a table of critical values, as shown in Table 6.6. Only the most commonly used values of t are given, namely those corresponding to common confidence levels. For example, for a 95% confidence level, α = 1 - 0.95 = 0.05 and α/2 = 0.025. Similarly, for a 99% confidence level, α = 0.01 and α/2 = 0.005.
Chapter 6  Statistical Analysis of Experimental Data

TABLE 6.6 Student's t as a Function of α and ν
(The tabulated value t_α/2 cuts off an upper-tail area of α/2.)

 ν     α/2 = 0.100   0.050    0.025    0.010    0.005
 1         3.078     6.314   12.706   31.823   63.658
 2         1.886     2.920    4.303    6.964    9.925
 3         1.638     2.353    3.182    4.541    5.841
 4         1.533     2.132    2.776    3.747    4.604
 5         1.476     2.015    2.571    3.365    4.032
 6         1.440     1.943    2.447    3.143    3.707
 7         1.415     1.895    2.365    2.998    3.499
 8         1.397     1.860    2.306    2.896    3.355
 9         1.383     1.833    2.262    2.821    3.250
10         1.372     1.812    2.228    2.764    3.169
11         1.363     1.796    2.201    2.718    3.106
12         1.356     1.782    2.179    2.681    3.054
13         1.350     1.771    2.160    2.650    3.012
14         1.345     1.761    2.145    2.624    2.977
15         1.341     1.753    2.131    2.602    2.947
16         1.337     1.746    2.120    2.583    2.921
17         1.333     1.740    2.110    2.567    2.898
18         1.330     1.734    2.101    2.552    2.878
19         1.328     1.729    2.093    2.539    2.861
20         1.325     1.725    2.086    2.528    2.845
21         1.323     1.721    2.080    2.518    2.831
22         1.321     1.717    2.074    2.508    2.819
23         1.319     1.714    2.069    2.500    2.807
24         1.318     1.711    2.064    2.492    2.797
25         1.316     1.708    2.060    2.485    2.787
26         1.315     1.706    2.056    2.479    2.779
27         1.314     1.703    2.052    2.473    2.771
28         1.313     1.701    2.048    2.467    2.763
29         1.311     1.699    2.045    2.462    2.756
30         1.310     1.697    2.042    2.457    2.750
 ∞         1.283     1.645    1.960    2.326    2.576
Example 6.15

A manufacturer of VCR systems would like to estimate the mean failure time of a VCR brand with 95% confidence. Six systems are tested to failure, and the following data (in hours of playing time) are obtained: 1250, 1320, 1542, 1464, 1275, and 1383. Estimate the population mean and the 95% confidence interval on the mean.

Solution: Because the sample size is small (n < 30), we should use the t-distribution to estimate the confidence interval. But first we have to calculate the mean and standard deviation of the data:

x̄ = (1250 + 1320 + 1542 + 1464 + 1275 + 1383)/6 = 1372 h

S = {[(1250 - 1372)² + (1320 - 1372)² + (1542 - 1372)² + (1464 - 1372)² + (1275 - 1372)² + (1383 - 1372)²]/5}^(1/2) = 114 h

A 95% confidence level corresponds to α = 0.05. From Table 6.6 of the t-distribution, for ν = n - 1 = 5 and α/2 = 0.025, t_α/2 = 2.571. From Eq. (6.51), the mean failure time will be

μ = x̄ ± t_α/2 S/√n = 1372 ± 2.571 × 114/√6 = 1372 ± 120 h

It should be noted that if we were to increase the confidence level, the estimated interval would also expand, and vice versa.
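The arithmetic in this example is easily checked with a short Python sketch (an editorial illustration; the critical value is read from Table 6.6 rather than computed):

```python
import math

data = [1250, 1320, 1542, 1464, 1275, 1383]  # failure times, h
n = len(data)
xbar = sum(data) / n
S = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

t_crit = 2.571                     # Table 6.6: nu = 5, alpha/2 = 0.025
half = t_crit * S / math.sqrt(n)   # half-width of the interval, Eq. (6.51)
# mean failure time: xbar +/- half, with 95% confidence
```

Running the sketch reproduces x̄ ≈ 1372 h, S ≈ 114 h, and a half-width of about 120 h.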
Example 6.16

In Example 6.15, to reduce the 95% confidence interval from ±120 h to ±50 h, the VCR manufacturer decides to test more systems to failure. Determine how many more systems should be tested.

Solution: Since we do not know the number of samples in advance, we cannot select the appropriate t-distribution curve. Hence, the solution process is one of converging trial and error. To obtain a first estimate of the number of measurements n, we assume that n > 30, so that we can use the normal distribution. Then Eq. (6.46) can be applied, and the confidence interval becomes

z_α/2 σ/√n = 50

so

n = (z_α/2 σ/50)²

For a 95% confidence level, α/2 = 0.025. Using the standard normal distribution, we find from Table 6.3 that z_0.025 = 1.96. Using S = 114 (from Example 6.15) as an estimate for σ, we obtain a first estimate of n:

n = (1.96 × 114/50)² = 20

Because n < 30, we have to use the t-distribution instead of the standard normal distribution. We can use n = 20 for the next trial. For ν = n - 1 = 19 and α/2 = 0.025, from Table 6.6 we obtain t = 2.093. This value of t can be used with Eq. (6.51) to estimate a new value for n:

t_α/2 S/√n = 50

n = (t_α/2 S/50)² = (2.093 × 114/50)² = 23

We can use this number as a trial value to get a new value of t and recalculate n, but the result will be the same. Note that with the additional tests, the value of the sample mean, x̄, may also change and may no longer have the value 1372 h.
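The converging trial-and-error procedure above can be sketched as a short loop (Python, standard library only; only the rows of Table 6.6 that this particular iteration visits are hardcoded, since the full table is not needed):

```python
import math

S, target, z = 114.0, 50.0, 1.96       # S from Example 6.15; +/-50 h goal; z_0.025
t_table = {19: 2.093, 22: 2.074}       # Table 6.6, alpha/2 = 0.025, for nu = 19 and 22

n = math.ceil((z * S / target) ** 2)   # normal-distribution first guess: 20
while True:
    t = t_table[n - 1]                 # t for nu = n - 1
    n_new = math.ceil((t * S / target) ** 2)
    if n_new == n:                     # converged
        break
    n = n_new
```

The loop converges to n = 23, in agreement with the hand calculation.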
6.4.2 Interval Estimation of the Population Variance
In many situations, the variability of the random variable is as important as its mean value. The best estimate of the population variance, σ², is the sample variance, S². As with the population mean, it is also necessary to establish a confidence interval for the estimated variance. For normally distributed populations, a statistic called χ² (pronounced "kye squared") is used for the purpose of establishing a confidence interval. Consider a random variable x with population mean value μ and standard deviation σ. If we assume that x̄ is equal to μ, Eq. (6.6) can be written

S² = Σ_{i=1}^{n} (x_i - μ)²/(n - 1)    (6.52)

The random variable χ² is defined as

χ² = Σ_{i=1}^{n} (x_i - μ)²/σ²    (6.53)

Combining Eqs. (6.52) and (6.53), we obtain

χ² = (n - 1)S²/σ²    (6.54)

That is, the variable χ² relates the sample variance to the population variance. χ² is a random variable, and it has been shown that in the case of a normally distributed population its probability density function is given by

f(χ²) = (χ²)^((ν-2)/2) e^(-χ²/2) / (2^(ν/2) Γ(ν/2))    for χ² > 0    (6.55)

where ν is the number of degrees of freedom and Γ is a gamma function, which can be obtained from standard tables such as those of the U.S. Department of Commerce (1964).
FIGURE 6.10 Chi-squared distribution function.

FIGURE 6.11 Confidence interval for the chi-squared distribution.
The number of degrees of freedom is the same as for the Student's t-distribution: the number of samples minus 1. f(χ²) is plotted in Figure 6.10 for several values of the degrees of freedom. As with other probability density functions, the probability that the variable χ² falls between any two values is equal to the area under the curve between those values (as shown in Figure 6.11). In equation form, this is

P(χ²_{ν,1-α/2} ≤ χ² ≤ χ²_{ν,α/2}) = 1 - α    (6.56)

where α is the level of significance defined earlier and is equal to (1 - confidence level). Substituting for χ² from Eq. (6.54), we obtain

P[χ²_{ν,1-α/2} ≤ (n - 1)S²/σ² ≤ χ²_{ν,α/2}] = 1 - α    (6.57)

Since χ² is always positive, this equation can be rearranged to give a confidence interval on the population variance:

(n - 1)S²/χ²_{ν,α/2} ≤ σ² ≤ (n - 1)S²/χ²_{ν,1-α/2}
If r_xy is greater than r_t, we can presume that y does depend on x in a nonrandom manner, and we can expect that a linear relationship will offer some approximation of the true functional relationship. A value of r_xy less than r_t implies that we cannot be confident that a linear functional relationship exists. It is not necessary for the functional relationship actually to be linear for a significant correlation coefficient to be calculated. For example, a parabolic functional relationship that shows little data scatter will often show a large correlation coefficient. On the other hand, some functional relationships (e.g., a multivalued circular function), while very strong, will result in a very low value of r_xy. Two additional precautions in using the correlation coefficient should be mentioned. First, a single bad data point can have a strong effect on the value of r_xy. If possible, outliers should be eliminated before evaluating the coefficient. It is also a mistake to conclude that a significant value of the correlation coefficient implies
that a change in one variable causes the other variable to change. Causality should be determined from other knowledge about the problem.

Example 6.19

It is thought that lap times for a race car depend on the ambient temperature. The following data for the same car with the same driver were measured at different races:

Ambient temperature (°F)   40     47     55     62     66     88
Lap time (s)               65.3   66.5   67.3   67.8   67.0   66.6
Does a linear relationship exist between these two variables?

Solution: First, we plot the data as shown in Figure E6.19. From the plot, it looks as though there might be a weak positive correlation between lap time and ambient temperature, but we can compute the correlation coefficient to determine whether this correlation is real or might be due to pure chance. We can determine this coefficient using Eq. (6.60). The following computation table is prepared (x is the ambient temperature and y is the lap time):

  x      y       x - x̄     (x - x̄)²   y - ȳ     (y - ȳ)²   (x - x̄)(y - ȳ)
  40    65.3    -19.67     386.78    -1.45      2.10       28.52
  47    66.5    -12.67     160.44    -0.25      0.06        3.17
  55    67.3     -4.67      21.78     0.55      0.30       -2.57
  62    67.8      2.33       5.44     1.05      1.10        2.45
  66    67.0      6.33      40.11     0.25      0.06        1.58
  88    66.6     28.33     802.78    -0.15      0.02       -4.25

Σx = 358.00   Σy = 400.50   Σ(x - x̄)² = 1417.33   Σ(y - ȳ)² = 3.66   Σ(x - x̄)(y - ȳ) = 28.90

FIGURE E6.19 Lap time (s) versus ambient temperature (°F).
6.6 Correlation of Experimental Data
We can now compute the correlation coefficient, using Eq. (6.60):

r_xy = 28.9/(1417.33 × 3.66)^(1/2) = 0.4013

For a 95% confidence level, α = 1 - 0.95 = 0.05. For six pairs of data, from Table 6.9, we obtain a value of r_t of 0.811. Since r_xy is less than r_t, we conclude that the apparent trend in the data is probably caused by pure chance. The calculation of r_xy is a built-in feature of some spreadsheet programs; the user needs only to input two columns of numbers and then call the appropriate function.
6.6.2 Least-Squares Linear Fit
It is a common requirement in experimentation to correlate experimental data by fitting mathematical functions such as straight lines or exponentials through the data. One of the most common functions used for this purpose is the straight line. Linear fits are often appropriate for the data, and in other cases the data can be transformed to be approximately linear. As shown in Figure 6.13, if we have n pairs of data (x_i, y_i), we seek to fit a straight line of the form

Y = ax + b    (6.62)
through the data. We would like to obtain values of the constants a and b. If we have only two pairs of data, the solution is simple, since the two points completely determine the straight line. However, if there are more points, we want to determine a "best fit" to the data. The experimenter can use a ruler and "eyeball" a straight line through the data, and in some cases this is the best approach. A more systematic and appropriate approach is to use the method of least squares, or linear regression, to fit the data. Regression is a well-defined mathematical formulation that is readily automated. Let us assume that the test data consist of data pairs (x_i, y_i). For each value of x_i (which is assumed to be error free), we can predict a value Y_i according to the linear relationship Y = ax + b. For each value of x_i, we then have an error

e_i = Y_i - y_i = ax_i + b - y_i    (6.63)
FIGURE 6.13 Fitting a straight line through data.
and the square of the error is

e_i² = (Y_i - y_i)² = (ax_i + b - y_i)²    (6.64)

The sum of the squared errors for all the data points is then

E = Σ_{i=1}^{n} (ax_i + b - y_i)²    (6.65)

We now choose a and b to minimize E by differentiating E with respect to a and b and setting the results to zero:

∂E/∂a = 0 = Σ 2(ax_i + b - y_i)x_i
∂E/∂b = 0 = Σ 2(ax_i + b - y_i)    (6.66)

These two equations can be solved simultaneously for a and b:

a = [n Σx_iy_i - (Σx_i)(Σy_i)] / [n Σx_i² - (Σx_i)²]    (6.67a)

b = [Σx_i² Σy_i - (Σx_i)(Σx_iy_i)] / [n Σx_i² - (Σx_i)²]    (6.67b)
The resulting line, Y = ax + b, is called the least-squares best fit to the data represented by (x_i, y_i). When a linear regression analysis has been performed, it is desirable to determine how good the fit actually is. Some idea of this measure can be obtained by examining a plot, such as Figure 6.13, of the data and the best-fit line. However, it is desirable to have a mathematical expression of how well the best-fit line represents the actual data. A good measure of the adequacy of the regression model is called the coefficient of determination, given by

r² = 1 - Σ(Y_i - y_i)² / Σ(y_i - ȳ)²    (6.68)

In the second term on the right-hand side of Eq. (6.68), the numerator is the sum of the squared deviations of the data from the best fit. The denominator of this term is the sum of the squares of the variation of the y data about the mean, ȳ, which is the average of the y_i's. For a good fit, r² should be close to unity. That is, the best-fit line accounts for most of the variation in the y data. For engineering data, r² will normally be quite high (0.8-0.9 or higher), and a low value might indicate that there exists some important variable that was not considered, but that is affecting the result.
FIGURE 6.14 Least-squares line with forced origin.
Another measure of how well the best-fit line represents the data is called the standard error of estimate, given by

S_y·x = [Σ(Y_i - y_i)²/(n - 2)]^(1/2)    (6.69)

This is, effectively, the standard deviation of the differences between the data points and the best-fit line, and it has the same units as y. In some cases, a somewhat different form of linear regression is used in which the line is forced to go through the origin (x = 0, Y = 0). This form is often employed to calibrate instruments in which the zero offset can be adjusted to zero prior to making measurements. This situation is shown in Figure 6.14. If a linear best fit of the form of Eq. (6.62) is used, the best-fit line will not go through the origin. If the line is forced to go through the origin, the fit is not quite as good, but the fitted curve will give the correct value when x = 0. With the origin fixed at (0, 0), the fitted line will have the form
Y = ax    (6.70a)

The value of a is calculated from

a = Σ_{i=1}^{n} x_iy_i / Σ_{i=1}^{n} x_i²    (6.70b)
This form of the least-squares fit is a standard feature of common spreadsheet programs.

Example 6.20

The following table represents the output (volts) of a linear variable differential transformer (LVDT, an electric output device used for measuring displacement) for six length inputs:

L (cm)   0.00   0.50   1.00   1.50   2.00   2.50
V (V)    0.05   0.52   1.03   1.50   2.00   2.56

Determine the best linear fit to these data, draw the data on a (V, L) plot, and calculate the standard error of estimate as well as the coefficient of determination.
Solution: To solve this problem, we substitute the data into Eq. (6.67). The following table shows how the individual sums are computed (x is the displacement L and y is the voltage V):

  x_i     y_i     x_i²    x_iy_i   y_i²
  0      0.05    0       0        0.0025
  0.5    0.52    0.25    0.26     0.2704
  1      1.03    1       1.03     1.0609
  1.5    1.50    2.25    2.25     2.25
  2      2.00    4       4        4.0
  2.5    2.56    6.25    6.4      6.5536

Σx_i = 7.5   Σy_i = 7.66   Σx_i² = 13.75   Σx_iy_i = 13.94   Σy_i² = 14.137

Then,

a = (6 × 13.94 - 7.5 × 7.66)/(6 × 13.75 - 7.5²) = 0.9977

and

b = (13.75 × 7.66 - 7.5 × 13.94)/(6 × 13.75 - 7.5²) = 0.0295
The resulting best-fit line is then

Y = 0.9977x + 0.0295

where Y is the voltage and x is the displacement. The best-fit line, together with the data, is plotted in Figure E6.20. We can now compute the coefficient of determination and the standard error. Noting that ȳ = (1/6)Σy_i = 7.66/6 = 1.2767, we compute the following table:

  x_i     y_i     (Y_i - y_i)²   (y_i - ȳ)²
  0      0.05    0.000419      1.504711
  0.5    0.52    7.02E-05      0.572544
  1      1.03    7.63E-06      0.060844
  1.5    1.50    0.000681      0.049878
  2      2.00    0.000623      0.523211
  2.5    2.56    0.00131       1.646944

Σ(Y_i - y_i)² = 0.00311   Σ(y_i - ȳ)² = 4.358133

FIGURE E6.20 LVDT output voltage versus displacement, with best-fit line.

r² is thus

r² = 1 - Σ(Y_i - y_i)²/Σ(y_i - ȳ)² = 1 - 0.00311/4.358133 = 0.999286

This is high, close to unity, and consistent with the plot of the results shown in Figure E6.20. The standard error of the estimate is

S_y·x = [0.00311/(6 - 2)]^(1/2) = 0.0278
Comment: Linear regression is a standard feature of statistical programs and most spreadsheet programs. It is only necessary to input the columns of data; the remaining calculations are performed immediately.
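For readers working outside a spreadsheet, the full calculation of Example 6.20 can be reproduced with Eqs. (6.67)-(6.69) in a few lines of Python (an editorial sketch, standard library only):

```python
import math

L = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]        # x, displacement (cm)
V = [0.05, 0.52, 1.03, 1.50, 2.00, 2.56]  # y, LVDT output (V)
n = len(L)

sx, sy = sum(L), sum(V)
sxx = sum(x * x for x in L)
sxy = sum(x * y for x, y in zip(L, V))

a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)    # Eq. (6.67a)
b = (sxx * sy - sx * sxy) / (n * sxx - sx ** 2)  # Eq. (6.67b)

ybar = sy / n
sse = sum((a * x + b - y) ** 2 for x, y in zip(L, V))
r2 = 1 - sse / sum((y - ybar) ** 2 for y in V)   # Eq. (6.68)
s_yx = math.sqrt(sse / (n - 2))                  # Eq. (6.69)
```

The sketch reproduces a ≈ 0.9977, b ≈ 0.0295, r² ≈ 0.9993, and S_y·x ≈ 0.0278.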
A frequent application of a least-squares fit of experimental data is the prediction of the value of the dependent variable, y, for a given value of x. If x = x*, we can easily obtain the expected value y* of the dependent variable by substituting x = x* into the formula for our linear regression line, Y = ax + b. However, it is more informative if we can construct a confidence interval for the resulting value of y*. This confidence interval is often called a prediction interval. A two-sided (1 - α) confidence interval can be generated for y* at x = x* using the following formula [see Hayter (2002)]:

y* = (ax* + b) ± S_y·x t_{α/2,n-2} [1 + 1/n + (x* - x̄)²/(Σx_i² - n x̄²)]^(1/2)    (6.71a)

Here, t_{α/2,n-2} is the value obtained from Table 6.6 for the Student's t-distribution corresponding to α/2 and degrees of freedom (n - 2), and S_y·x is the standard error given by Eq. (6.69).
Example 6.21

Using the information from Example 6.20 and taking a confidence level of 95%, find the value of the voltage output and the confidence interval for a displacement (L) of 1.72 cm.

Solution: From the results in the solution of Example 6.20, we find that a = 0.9977, b = 0.0295, x̄ = Σx_i/n = 7.5/6 = 1.25, Σx_i² = 13.75, and S_y·x = 0.0278. α = 1 - 0.95 = 0.05; α/2 = 0.025; ν = n - 2 = 4. From Table 6.6, t_{α/2,n-2} = 2.776. Substituting into Eq. (6.71a),

y* = 0.9977 × 1.72 + 0.0295 ± 0.0278 × 2.776 × [1 + 1/6 + (1.72 - 1.25)²/(13.75 - 6 × 1.25²)]^(1/2)
   = 1.746 ± 0.085 with a 95% confidence level
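The prediction-interval calculation of Eq. (6.71a) can be sketched as follows (Python, standard library only; the fit parameters are carried over from Examples 6.20 and 6.21, and the t value is read from Table 6.6):

```python
import math

# Values carried over from Examples 6.20/6.21
a, b = 0.997714, 0.029524       # slope and intercept (unrounded)
s_yx, xbar, sxx, n = 0.0278, 1.25, 13.75, 6
t = 2.776                       # Table 6.6: nu = n - 2 = 4, alpha/2 = 0.025

x_star = 1.72
y_star = a * x_star + b
half = s_yx * t * math.sqrt(
    1 + 1 / n + (x_star - xbar) ** 2 / (sxx - n * xbar ** 2))  # Eq. (6.71a)
# y* = y_star +/- half with 95% confidence
```

The sketch gives y* ≈ 1.746 V with a half-width of about 0.085 V.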
In many practical situations, we may be interested in constructing a linear model of the relationship between the variables y and x, of the form Y = ax + b. In such cases, the expected values of the parameters a and b can be determined from Eqs. (6.67a) and (6.67b). It is often useful to also estimate confidence intervals for these parameters in order to get a sense of how reliable the model is. The two-sided (1 - α) confidence interval for the slope parameter a is given by Hayter (2002) as

a ± S_y·x t_{α/2,n-2} / (Σx_i² - n x̄²)^(1/2)    (6.71b)

while the two-sided (1 - α) confidence interval for the intercept parameter, b, is given by Mason et al. (2003) as

b ± S_y·x t_{α/2,n-2} [1/n + x̄²/(Σx_i² - n x̄²)]^(1/2)    (6.71c)
Example 6.22

Using the data from Example 6.20, find the 95% confidence intervals for the parameters a and b, where Y = ax + b.

Solution: From the results in the solutions of Examples 6.20 and 6.21, we note that a = 0.9977, b = 0.0295, Σx_i² = 13.75, n = 6, x̄ = 1.25, S_y·x = 0.0278, and t_{α/2,n-2} = t_{0.025,4} = 2.776.

Σx_i² - n x̄² = 13.75 - (6 × 1.25²) = 4.375

The 95% confidence interval for the slope parameter, a, is

a ± S_y·x t_{α/2,n-2} / (Σx_i² - n x̄²)^(1/2) = 0.9977 ± (0.0278 × 2.776)/√4.375 = 0.9977 ± 0.0369

The 95% confidence interval for the intercept parameter, b, is

b ± S_y·x t_{α/2,n-2} [1/n + x̄²/(Σx_i² - n x̄²)]^(1/2) = 0.0295 ± 0.0278 × 2.776 × [1/6 + 1.25²/4.375]^(1/2) = 0.0295 ± 0.0559
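Equations (6.71b) and (6.71c) can be checked against Example 6.22 with a short sketch (Python, standard library only; inputs carried over from Example 6.20):

```python
import math

a, b = 0.9977, 0.0295
s_yx, xbar, sxx, n = 0.0278, 1.25, 13.75, 6
t = 2.776                      # t_{0.025,4} from Table 6.6

sxx_c = sxx - n * xbar ** 2    # = 4.375
half_a = s_yx * t / math.sqrt(sxx_c)                      # Eq. (6.71b)
half_b = s_yx * t * math.sqrt(1 / n + xbar ** 2 / sxx_c)  # Eq. (6.71c)
```

The sketch gives half-widths of about 0.037 for the slope and 0.056 for the intercept.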
There are some important considerations to bear in mind about the least-squares method:

1. Variation in the data is assumed to be normally distributed and due to random causes.
2. In deriving the relation Y = ax + b, we are assuming that random variation exists in the y values, while the x values are error free.
3. Since the error has been minimized in the y direction, an erroneous conclusion may be made if x is estimated based on a value for y. A linear regression of x in terms of y (X = cy + d) cannot simply be derived from Y = ax + b.
6.6.3 Outliers in x-y Data Sets
In Section 6.5, the Thompson τ technique was presented as a method for eliminating bad data when several measurements are made of a single variable. When a variable y is measured as a function of an independent variable x, in most cases there is only one value of y measured for each value of x. As a result, at each x there are insufficient data to determine a meaningful standard deviation. One way to identify outliers is to plot the data and the best-fit line, as shown in Figure 6.15. A point such as A, which shows a much larger deviation from the line than the other data, might be identified as an outlier. Montgomery et al. (1998) suggest a more sophisticated method of identifying outliers in x-y data sets. The method involves computing the ratio of the residuals [e_i, Eq. (6.63)] to the standard error of estimate [S_y·x, Eq. (6.69)] and making a graph. These ratios, called the standardized residuals, can be plotted as a function of the independent variable, x, the dependent variable, y, or the time or sequence in which the x-y pair was measured. Figure 6.16 shows a plot of the standardized residuals as a function of x for the same data shown in Figure 6.15. If we assume that the residuals are normally distributed, then we would expect that 95% of the standardized residuals would be in the range ±2 (that is, within two standard deviations from the best-fit line). We might consider data values with standardized residuals exceeding 2 to be outliers, since they only have a 5% chance of occurring due to simple randomness in the measurement. In the example, point A has a value of -3, and hence deviates from the best-fit line by three standard deviations. Since it is not consistent with its neighbors, it is almost certainly an outlier.
FIGURE 6.15 x-y data showing an outlier.
FIGURE 6.16 Plot of standardized residuals.
However, there are several complications, and the process of determining outliers will require considerable judgment on the part of the experimenter. The first problem occurs when the data set is relatively small. In this case, the potential outlier itself has a major effect on the value of S_y·x. As a result, it is unlikely that the standardized residual for this suspect data point will have a large value. The Thompson τ test in Section 6.4.3 accounts for this effect by reducing the value of τ for a small sample size. No similar guidance exists for the x-y outlier determination for small samples in common statistics practice. A second problem occurs when the data themselves are not linear. In this case, correct data may result in high standardized residuals. Consider the nonlinear data plotted in Figure 6.17. The standardized residuals are plotted in Figure 6.18. Although the magnitude of the normalized residual at the highest value of x is greater than 2, this is not an outlier. The plot of the normalized residuals is smooth, and these data simply reflect nonlinear behavior. Potential outliers at the extreme ends of the data are likely real data. It is also possible that there is a change in the character of the physical phenomenon represented at the high or low values of the independent variable.
FIGURE 6.17 Nonlinear data.

FIGURE 6.18 Normalized residuals for nonlinear data.
In summary, outliers in x-y data sets cannot be determined by simple mechanistic rules. The experimenter can make use of plots of the data and plots of the standardized residuals, but ultimately, it is a judgment call as to whether to reject any data point. It may in fact be necessary to consider the actual data-gathering process in order to make a decision.
Example 6.23

The following data were taken from a small water-turbine experiment:

rpm            100    201    298    402    500    601    699    799
torque (N·m)   4.89   4.77   3.79   3.76   2.84   4.00   2.05   1.61

Fit a least-squares straight line to these data, and determine whether any of the torque values appear to be outliers.

Solution: We will use a spreadsheet program to fit the straight line. This is shown in Figure E6.23(a). Examining Figure E6.23(a), we find that the torque value at about 600 rpm appears to be inconsistent with the other data and is probably an outlier. To confirm this, we can prepare a plot of the standardized residuals. For these data, the best-fit equation is

torque = 5.433 - 0.0044 rpm

and S_y·x = 0.5758. This information can be used to prepare the following table:

rpm    torque (measured)   torque (predicted)   e_i/S_y·x
100    4.89                4.99                  0.18
201    4.77                4.55                 -0.38
298    3.79                4.12                  0.58
402    3.76                3.66                 -0.16
500    2.84                3.23                  0.68
601    4.00                2.79                 -2.10
699    2.05                2.36                  0.53
799    1.61                1.92                  0.54

The standardized residuals are plotted against the rpm in Figure E6.23(b). Clearly, the torque value at 601 rpm has a high probability of being an outlier: its standardized residual has a magnitude greater than 2. Furthermore, the standardized-residual plot appears random and shows no trend, so the high standardized residual is not likely caused by nonlinearity.
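The outlier screen of this example can be automated with a short sketch (Python, standard library only; the fit uses Eqs. (6.67a) and (6.67b) rather than a spreadsheet):

```python
import math

rpm    = [100, 201, 298, 402, 500, 601, 699, 799]
torque = [4.89, 4.77, 3.79, 3.76, 2.84, 4.00, 2.05, 1.61]
n = len(rpm)

sx, sy = sum(rpm), sum(torque)
sxx = sum(x * x for x in rpm)
sxy = sum(x * y for x, y in zip(rpm, torque))
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)          # slope, Eq. (6.67a)
b = (sxx * sy - sx * sxy) / (n * sxx - sx ** 2)        # intercept, Eq. (6.67b)

resid = [a * x + b - y for x, y in zip(rpm, torque)]   # e_i, Eq. (6.63)
s_yx = math.sqrt(sum(e * e for e in resid) / (n - 2))  # Eq. (6.69)
std_resid = [e / s_yx for e in resid]

# Flag points whose standardized residual exceeds 2 in magnitude.
outliers = [x for x, r in zip(rpm, std_resid) if abs(r) > 2]
```

Running the sketch flags only the 601-rpm point, in agreement with the table above.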
FIGURE E6.23(a) Torque versus rpm with least-squares line.

FIGURE E6.23(b) Standardized residuals versus rpm.

6.6.4 Linear Regression Using Data Transformation
Commonly, test data do not show an even approximately linear relationship between the dependent and independent variables, and linear regression is not directly useful. In some cases, however, the data can be transformed into a form that is linear, and a straight line can then be fitted by linear regression. Examples of data relationships that can easily be transformed to linear form are y = ax^b and y = ae^(bx). In these relations, a and b are constant values and x and y are variables. For example, we examine the equation y = ae^(bx). Taking the natural logarithm of each side, we obtain ln(y) = bx + ln(a). Since ln(a) is just a constant, we now have a form in which ln(y) is a linear function of x.
Example 6.24

In a compression process in a piston-cylinder device, air pressure and temperature are measured. Table E6.24 presents these data. Determine an explicit relation of the form T = F(P) for the given data.

Solution: The data are plotted in Figure E6.24(a). As can be seen, the temperature-pressure relationship is curved. In a compression process of this type, it is known from thermodynamics that the temperature and the pressure are related by an equation of the form

T/T₀ = (P/P₀)^((n-1)/n)

where T is the absolute temperature, P is the absolute pressure, and n is a number, called the polytropic exponent, that depends on the actual compression process. For this system of units, T_abs = T_F + 460.0. If we take natural logarithms of both sides of the equation for T/T₀, we obtain an equation of the form

ln(T) = a ln(P) + b
TABLE E6.24

Temperature (°F)   Pressure (psia)
 44.9               20.0
102.4               40.4
142.3               60.8
164.8               80.2
192.2              100.4
221.4              120.3
228.4              141.1
249.5              161.4
269.4              181.9
270.8              201.4
291.5              220.8
287.3              241.8
313.3              261.1
322.3              280.4
325.8              300.1
337.0              320.6
332.6              341.1
342.9              360.8
FIGURE E6.24(a) Temperature versus pressure.

FIGURE E6.24(b) ln(T + 460) versus ln(P).
where a = (n - 1)/n and b is another constant. In this form, ln(T) is a linear function of ln(P). Natural logarithms of the pressure and absolute temperature have been plotted in Figure E6.24(b). The data now appear to follow an approximate straight line. Using the method of least squares, the equation of the best-fit straight line is

ln(T + 460) = 0.1652 ln(P) + 5.7222

This is the functional form required; T is in °F and P is in psia.
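The transform-then-fit procedure of this example can be sketched in Python (standard library only; the fit is an ordinary least-squares line on the log-transformed data, and the text's reported coefficients are noted in a comment for comparison):

```python
import math

temp_F = [44.9, 102.4, 142.3, 164.8, 192.2, 221.4, 228.4, 249.5, 269.4,
          270.8, 291.5, 287.3, 313.3, 322.3, 325.8, 337.0, 332.6, 342.9]
press  = [20.0, 40.4, 60.8, 80.2, 100.4, 120.3, 141.1, 161.4, 181.9,
          201.4, 220.8, 241.8, 261.1, 280.4, 300.1, 320.6, 341.1, 360.8]

# Transform: ln(T_abs) versus ln(P), then a straight-line fit, Eq. (6.67).
X = [math.log(p) for p in press]
Y = [math.log(t + 460.0) for t in temp_F]
n = len(X)

sx, sy = sum(X), sum(Y)
sxx = sum(x * x for x in X)
sxy = sum(x * y for x, y in zip(X, Y))
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b = (sxx * sy - sx * sxy) / (n * sxx - sx ** 2)
# The text reports a = 0.1652 and b = 5.7222 for this fit.
```

From the fitted a, the polytropic exponent follows as n = 1/(1 - a).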
6.6.5 Multiple and Polynomial Regression
Regression analysis is much more general than the methods presented so far. Best-fit functions can be determined for situations with more than one independent variable (multiple regression) or for polynomials of the independent variable (polynomial regression). The calculations for these methods can be quite tedious, but they are standard features of statistical-analysis programs, and many features are available in common spreadsheet programs. In multiple regression, we seek a function of the form

Y = a₀ + a₁X₁ + a₂X₂ + ⋯ + a_kX_k    (6.72)

We can have several independent variables, X₁ … X_k. At first, it would appear that the dependent variable will be a linear function of each of the independent variables, but in fact the method is more general, similar in concept to the transformation of variables demonstrated in Section 6.5.4. It is for this reason that X rather than x is used in Eq. (6.72). The X's can be independent variables, or they can be functions of the independent variables. As an example, consider a situation with two independent variables, x₁ and x₂. We could use three X's:

X₁ = x₁    X₂ = x₂    X₃ = x₁x₂
In this case, X₃ is the product of the two independent variables. The theoretical basis is the same as for simple linear regression (Section 6.5.2). For each data point, the error is

e_i = a₀ + a₁X₁ᵢ + a₂X₂ᵢ + ⋯ + a_kX_kᵢ - y_i

The sum of the squares of the errors is then

E = Σ_{i=1}^{n} (a₀ + a₁X₁ᵢ + a₂X₂ᵢ + a₃X₃ᵢ + ⋯ + a_kX_kᵢ - y_i)²

E is then minimized by partially differentiating with respect to each a and setting each resulting equation to zero. The set of equations can then be solved simultaneously for the values of the a's.
One important aspect of multiple regression is that evaluating the adequacy of the best fit is considerably more difficult than it is for simple linear regression. This is because it may not be possible to generate suitable plots (since the function has three or more dimensions) and because it is more difficult to interpret the statistical parameters, such as the coefficient of determination (r²). [Consult texts such as Devore (1991) and Montgomery (1998) for greater detail on interpreting multiple-regression results.]

Example 6.25

The following data represent the density of an oil mixture as a function of the temperature and the mass fractions of three different component oils:

T (K)   m₁     m₂     m₃     ρ_mixt
300     0      1      0      879.6
320     0      0.5    0.5    870.6
340     0      0      1      863.6
360     0.5    0      0.5    846.4
380     0.5    0.25   0.25   830.8
400     0.5    0.5    0      819.1
420     1      0      0      796
440     1      0      0      778.2

Find the coefficients for a multiple regression of the form

ρ_mixt = a₀ + a₁T + a₂m₁ + a₃m₂ + a₄m₃

where ρ_mixt is the mixture density, T is the temperature, and m₁, m₂, and m₃ are the mass fractions of the component oils.

Solution: In an Excel® spreadsheet, the data are entered as five columns (T, m₁, m₂, m₃, ρ_mixt), exactly as in the table above. After calling the regression function in the spreadsheet program, the following output is obtained:

Summary Output

Intercept      4636.991971
X Variable 1   -0.592043796
X Variable 2   -3592.60073
X Variable 3   -3578.259367
X Variable 4   -3570.666667

"Intercept" is a₀, "X Variable 1" is a₁, "X Variable 2" is a₂, "X Variable 3" is a₃, and "X Variable 4" is a₄. Thus, the regression equation is

ρ_mixt = 4636.99 - 0.592044T - 3592.60m₁ - 3578.26m₂ - 3570.67m₃

One should show great care in rounding the coefficients. The regression equation often involves differences between large numbers and is hence very sensitive to the significant figures of the coefficients.
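One point worth noting about this example: since m₁ + m₂ + m₃ = 1 for every row, the intercept column equals the sum of the three mass-fraction columns, so the five-coefficient model is collinear and its individual coefficients are not unique (one reason they are so large); the predictions, however, are unique. The sketch below (Python, standard library only; the `solve` helper is an ad hoc Gaussian elimination written for this illustration, not a library routine) eliminates m₃ = 1 - m₁ - m₂ and solves the resulting full-rank normal equations:

```python
def solve(A, b):
    """Ad hoc Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

T  = [300, 320, 340, 360, 380, 400, 420, 440]
m1 = [0, 0, 0, 0.5, 0.5, 0.5, 1, 1]
m2 = [1, 0.5, 0, 0, 0.25, 0.5, 0, 0]
rho = [879.6, 870.6, 863.6, 846.4, 830.8, 819.1, 796, 778.2]

# Reduced model: rho = c0 + c1*T + c2*m1 + c3*m2  (m3 = 1 - m1 - m2 absorbed)
rows = [[1.0, t, a, b] for t, a, b in zip(T, m1, m2)]
k = 4
AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
Atb = [sum(r[i] * y for r, y in zip(rows, rho)) for i in range(k)]
c = solve(AtA, Atb)                     # normal equations X'Xc = X'y

pred = [sum(ci * ri for ci, ri in zip(c, r)) for r in rows]
```

Because the collinearity lies entirely among the intercept and mass-fraction columns, the temperature coefficient is the same in the reduced and full models, which provides a check against the spreadsheet output above.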
Many physical relationships cannot be represented by a simple straight line, but can be easily fit with a polynomial. Polynomial regression involves only a single inde pendent variable, but will involve terms with powers higher than unity. The form of a polynomial regression equation is (6.73) where k, the exponent of the highest-order term, is the degree of the polynomial. Poly nomial regression can be performed in statistical programs by simply inputting the data and the order of the polynomial desired. It can also be treated as a subset of mul tiple regression by making the x's in Eq. (6.72) be x, x 2 , x3, etc. Example 6.26 Consider the following x-y data:
x    0      1      2      3      4      5       6       7       8      9      10
y    4.997  6.165  6.950  8.218  9.405  10.404  10.425  10.440  9.393  7.854  5.168
Perform a polynomial regression for a third-order polynomial on these data.

Solution:
We will use the multiple-regression function in Excel®. The input field is

x     x²    x³     y
0     0     0      4.997
1     1     1      6.165
2     4     8      6.95
3     9     27     8.218
4     16    64     9.405
5     25    125    10.404
6     36    216    10.425
7     49    343    10.44
8     64    512    9.393
9     81    729    7.854
10    100   1000   5.168

Correlation of Experimental Data   183
Note that columns for x² and x³ have been created, and these, together with x, will be used to generate the regression model. The spreadsheet output is as follows:
Coefficients
Intercept       5.023965035
X Variable 1    0.836644911
X Variable 2    0.158657343
X Variable 3    -0.024113442
Thus, the regression equation is

y = 5.02396 + 0.836645x + 0.158657x² − 0.0241134x³
The regression curve and the data are shown in Figure E6.26.

In the Excel® spreadsheet program, a simpler approach is available. A polynomial-regression feature is built into the plotting function. It is only necessary to plot the data, select the trendline function, and specify the order of the polynomial.
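The third-order fit can also be checked with NumPy's polynomial least-squares routine (a sketch; `polyfit` returns coefficients from the highest power down):

```python
# Third-order polynomial regression of the x-y data from Example 6.26.
import numpy as np

x = np.arange(11, dtype=float)          # x = 0, 1, ..., 10
y = np.array([4.997, 6.165, 6.950, 8.218, 9.405, 10.404,
              10.425, 10.440, 9.393, 7.854, 5.168])

c3, c2, c1, c0 = np.polyfit(x, y, 3)    # highest power first
# The recovered coefficients match the spreadsheet output:
# c0 ~ 5.02396, c1 ~ 0.836645, c2 ~ 0.158657, c3 ~ -0.0241134
```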
[FIGURE E6.26: Plot of the data points and the third-order regression curve, y versus x for 0 ≤ x ≤ 15.]
Comment: The use of high-order polynomials to match data can sometimes have unforeseen complications. Slight irregularities in the data can produce nonphysical wiggles in the best fit. There is normally no problem with second-order fits, but third- or higher-order polynomials may produce unreasonable results. To ensure that this doesn't happen, it is necessary to examine the fit by comparing it with the data.
6.7 LINEAR FUNCTIONS OF RANDOM VARIABLES

A variable that is a function of other random variables is itself a random variable. Consider the simplest case, in which the variable y is a linear function of the variables x1, x2, …, xn:

y = a0 + a1x1 + a2x2 + … + anxn   (6.74)

Devore (1991) shows that if the x's are independent of each other, the mean value and standard deviation of y can be evaluated as

μy = a0 + a1μ1 + a2μ2 + … + anμn   (6.75)

and

σy = [a1²σ1² + a2²σ2² + … + an²σn²]^(1/2)   (6.76)

These equations are useful in evaluating tolerances in fabricated parts and other applications.

Example 6.27
Consider a shaft in a bearing, as shown in Figure E6.27. The shaft diameter, Ds, is 25.400 mm, and the bearing inside diameter, Db, is 25.451 mm. The standard deviation of the shaft diameter is 0.008 mm, and the standard deviation of the bearing diameter is 0.010 mm. For satisfactory operation, the difference in diameters (the clearance) must be between 0.0381 mm and 0.0635 mm. What fraction of the final assemblies will be unsatisfactory?

Solution: The difference in diameters, ΔD, is a linear function of Db and Ds: ΔD = Db − Ds. Equations (6.75) and (6.76) can then be used to evaluate the mean and standard deviation of the clearance:

μΔD = 25.451 − 25.400 = 0.051 mm
σΔD = [0.010² + 0.008²]^(1/2) = 0.013 mm

The fraction of acceptable parts equals the area under the standard normal distribution function between

z1 = (0.0381 − 0.051)/0.013 = −0.99   and   z2 = (0.0635 − 0.051)/0.013 = 0.96

Using the same method as in Example 6.9, we find that the area under the curve is 0.67, so 67% of the assemblies will be acceptable and 33% will be rejected.
[FIGURE E6.27: Shaft in a bearing.]
Comment: Such a high rejection rate would not be acceptable. There are two things that could be done to reduce it. The standard deviations of the dimensions of the parts could be reduced by improving the manufacturing process. Alternatively, the parts could be paired on a selective basis: large shafts mated with large bearings, small shafts with small bearings.
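Example 6.27 can be verified numerically with nothing more than the standard normal CDF, which the Python standard library provides through `math.erf` (a sketch of the calculation above):

```python
# Fraction of acceptable shaft/bearing assemblies (Example 6.27).
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu    = 25.451 - 25.400               # mean clearance, mm (Eq. 6.75)
sigma = sqrt(0.010**2 + 0.008**2)     # std. dev. of clearance, mm (Eq. 6.76)

z1 = (0.0381 - mu) / sigma
z2 = (0.0635 - mu) / sigma
accepted = phi(z2) - phi(z1)          # fraction of acceptable assemblies
```

Carrying sigma unrounded gives about 0.68 acceptable; rounding sigma to 0.013 mm reproduces the 0.67 in the text.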
6.8 APPLYING COMPUTER SOFTWARE FOR STATISTICAL ANALYSIS OF EXPERIMENTAL DATA

Statistical analysis and data presentation have become a necessary feature of many engineering and business projects. Most modern spreadsheet programs contain some statistical functions, and some contain extensive statistical capabilities. The better programs include not only calculations for the mean and the standard deviation, data sorting into bins, and histogram plotting, but also calculations of linear-regression coefficients and correlation coefficients. They also have built-in tables for the common distribution functions (normal, t-distribution, and chi-squared). There are also several specialized software packages that perform customized statistical-analysis tasks. The reader should check the field for available packages if major statistical analysis is planned.

REFERENCES
[1] ANDERSON, DAVID R., SWEENEY, DENNIS J., AND WILLIAMS, THOMAS A. (1991). Introduction to Statistics, Concepts and Applications, West Publishing Co., St. Paul, Minnesota.
[2] ASME (1998). Measurement Uncertainty, Part I, ASME PTC 19.1-1998.
[3] BLAISDELL, ERNEST A. (1998). Statistics in Practice, Saunders College, Fort Worth, Texas.
[4] CROW, E., DAVIS, F., AND MAXFIELD, M. (1960). Statistics Manual, Dover Publications, New York.
[5] DEVORE, JAY L. (1991). Probability and Statistics for Engineering and the Sciences, Brooks/Cole, Pacific Grove, California.
[6] DUNN, OLIVE JEAN, AND CLARK, VIRGINIA A. (1974). Applied Statistics: Analysis of Variance and Regression, John Wiley, New York.
[7] HARNETT, D., AND MURPHY, J. (1975). Introductory Statistical Analysis, Addison-Wesley, Reading, Massachusetts.
[8] HAYTER, A.J. (2002). Probability and Statistics for Engineers and Scientists, 2d ed., Duxbury, Pacific Grove, CA.
[9] JOHNSON, R. (1988). Elementary Statistics, PWS-Kent, Boston.
[10] LAPIN, L. (1990). Probability and Statistics for Modern Engineering, PWS-Kent, Boston.
[11] LIPSON, C., AND SHETH, N. (1973). Statistical Design and Analysis of Engineering Experiments, McGraw-Hill, New York.
[12] MASON, R.L., GUNST, R.F., AND HESS, J.L. (2003). Statistical Design and Analysis of Experiments with Applications to Engineering and Science, 2d ed., J. Wiley, New York.
[13] MONTGOMERY, DOUGLAS C., RUNGER, GEORGE C., AND HUBELE, NORMA F. (1998). Engineering Statistics, John Wiley, New York.
[14] REES, D. (1987). Foundations of Statistics, Chapman and Hall, London and New York.
[15] SCHEAFFER, RICHARD L., AND MCCLAVE, JAMES T. (1995). Probability and Statistics for Engineers, Duxbury Press, Belmont, California.
[16] U.S. DEPARTMENT OF COMMERCE (1964). Handbook of Mathematical Functions, U.S. Dept. of Commerce, Washington, D.C.
[17] WALPOLE, R., AND MYERS, R. (1998). Probability and Statistics for Engineers and Scientists, Prentice Hall, Upper Saddle River, New Jersey.
PROBLEMS

Note: An asterisk (*) denotes a spreadsheet problem.

6.1 A certain length measurement is made with the following results:
Reading   1     2     3     4     5     6     7     8     9     10
x (cm)    49.3  50.1  48.9  49.2  49.3  50.5  49.9  49.2  49.8  50.2

(a) Arrange the data into bins with width 2 mm.
(b) Draw a histogram of the data.
6.2 A certain length measurement is made with the following results:

Reading   1     2     3     4     5     6     7     8     9     10
x (in)    58.7  60.0  58.8  59.1  59.2  60.4  59.8  59.3  59.8  60.3

(a) Arrange the data into bins with width 0.1 inch.
(b) Draw a histogram of the data.
6.3 The air pressure (in psi) at a point near the end of an air supply line is monitored every hour in a 12-h period, and the following readings are obtained:

Reading   1    2    3    4   5   6   7    8    9    10   11   12
P (psi)   110  104  106  94  92  89  100  114  120  108  110  115

Draw a histogram of the data with a bin width of 5 psi.

6.4 The air pressure (in bar) at a point near the end of an air supply line is monitored every hour in a 12-h period, and the following readings are obtained:
Reading   1    2    3    4    5    6    7    8    9     10  11   12
P (bar)   9.5  9.3  9.4  8.9  8.8  8.7  9.0  9.8  10.2  10  9.5  9.9

Draw a histogram of the data with a bin width of 0.2 bar.
6.5 Calculate the standard deviation, mean, median, and mode of the values of the data in Problem 6.1.

6.6 Calculate the standard deviation, mean, median, and mode of the values of the data in Problem 6.2.

6.7 Calculate the standard deviation, mean, median, and mode of the values of the data in Problem 6.3.

6.8 Calculate the standard deviation, mean, median, and mode of the values of the data in Problem 6.4.

6.9 Calculate the probability of having a 6 and a 3 in tossing two fair dice.

6.10 Calculate the probability of having a 4 and a 2 in tossing two fair dice.

6.11 At a certain university, 15% of the electrical engineering students are women and 80% of the electrical engineering students are undergraduates. What is the probability that an electrical engineering student is an undergraduate woman?

6.12 At a certain university, 55% of the biology students are women and 85% of the biology students are undergraduates. What is the probability that a biology student is an undergraduate woman?

6.13 A distributor claims that the chance that any of the three major components of a computer (CPU, monitor, and keyboard) is defective is 3%. Calculate the chance that all three will be defective in a single computer.

6.14 The chance that either of the two components of a measurement system (transducer and transmitter) is defective is 2%. Calculate the chance that both components in a measurement system are defective.

6.15 In filling nominally 12-oz beer cans, the probability that a can has 12 or more ounces is 99%. If five cans are filled, what is the probability that all five will have 12 or more ounces? What is the probability that all of the cans will have less than 12 ounces?

6.16 In filling nominally 8-oz drink cans, the probability that a can has 8 or more ounces is 98%. If five cans are filled, what is the probability that all five will have 8 or more ounces? What is the probability that all of the cans will have less than 8 ounces?
6.17 It has been found that the probability that a light bulb will last longer than 3000 hours is 90%. If a room contains six of these bulbs, what is the probability that all six will last longer than 3000 hours?

6.18 It has been found that the probability that a light bulb will last longer than 3600 hours is 95%. If a room contains six of these bulbs, what is the probability that all six will last longer than 3600 hours?

6.19 The probability that a certain electronic component will fail in less than 1000 hours is 0.2. There are two of these components in an instrument. What is the probability that either one or both will fail before 1000 hours?

6.20 Consider the following probability distribution function for a continuous random variable:

f(x) = 3x²/35   for −2 ≤ x ≤ 3
     = 0        elsewhere

[…] 0°C. What will be the resistance at 350°C?
Solution: Substituting into Eq. (9.10) yields

Rt = 100{1 + 0.00392[350 − 1.49(0.01 × 350 − 1)(0.01 × 350) − 0 × (0.01 × 350 − 1)(0.01 × 350)³]}
   = 232.08 Ω

This is a substantial change in resistance, much larger than observed in strain measurements. Alternatively, we could use Table 9.3 and obtain R = 231.89 Ω.
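The substitution above is easy to script. The sketch below encodes the Callendar-Van Dusen form used in Eq. (9.10), with R0 = 100 Ω, α = 0.00392, δ = 1.49, and β = 0 (β applies only below 0°C):

```python
# Platinum RTD resistance from the Callendar-Van Dusen equation.
def rtd_resistance(T, R0=100.0, alpha=0.00392, delta=1.49, beta=0.0):
    """Resistance (ohms) at temperature T in deg C; beta = 0 above 0 deg C."""
    t = T / 100.0
    return R0 * (1.0 + alpha * (T - delta * (t - 1.0) * t
                                  - beta * (t - 1.0) * t**3))

R350 = rtd_resistance(350.0)   # about 232.1 ohms, as in the example
```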
9.2.3 Thermistor and Integrated-Circuit Temperature Sensors
As with the RTD, the thermistor is a device that has a temperature-dependent resistance. However, the thermistor, a semiconductor device, shows a much larger change in resistance with respect to temperature than the RTD, on the order of 4% per degree Celsius. It is possible to construct thermistors that have a resistance-versus-temperature characteristic with either a positive or a negative slope. However, the most common thermistor devices have a negative slope; that is, increasing temperature causes a decrease in resistance, the opposite of RTDs. They are highly nonlinear, showing a logarithmic relationship between resistance and temperature:

1/T = A + B ln R + C(ln R)³   (9.15)
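Eq. (9.15) is widely used in this (Steinhart-Hart) form to convert a measured resistance to temperature. A minimal sketch follows; the coefficients are assumed values typical of a generic 10-kΩ NTC thermistor, not taken from this text:

```python
# Thermistor temperature from resistance via Eq. (9.15).
from math import log

# Assumed Steinhart-Hart coefficients for a generic 10-kOhm NTC bead.
A = 1.129241e-3
B = 2.341077e-4
C = 8.775468e-8

def thermistor_temperature(R):
    """Absolute temperature (K) from thermistor resistance (ohms)."""
    ln_r = log(R)
    return 1.0 / (A + B * ln_r + C * ln_r**3)

T_room = thermistor_temperature(10_000.0)   # close to 298 K (25 deg C)
```

Note the negative slope: a smaller resistance maps to a higher temperature, as described above.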
9.2 Measuring Temperature   309

As semiconductor devices, thermistors are restricted to relatively low temperatures; many are limited to 100°C, and they are generally not available to measure temperatures over 300°C. While probes can be as small as 0.10 in. in diameter, they are still large compared to the smallest thermocouples and have inferior spatial resolution and transient response. Thermistor sensors can be quite accurate, on the order of ±0.1°C, but many are much less accurate. Thermistors are often used in commercial moderate-temperature measuring devices.

Thermistor-based temperature-measuring systems can be very simple, as in Figure 9.23, which shows a circuit frequently used in automobiles to measure engine water temperature. The current flow through the thermistor is measured with a simple analog mechanical meter, and the nonlinearity can be handled by placing a nonlinear scale on the meter dial. Thermistors are also used in electronic circuits to compensate for circuit temperature dependency, and in simple temperature controllers; for example, thermistor-based circuits can be used to activate relays to prevent overheating in many common devices, such as VCRs. Thermistors are not widely used by either the process industry or in normal engineering experiments, since RTDs and thermocouples usually have significant advantages.

[FIGURE 9.23: Thermistor circuit: a battery, the thermistor, and an analog meter in series.]

As mentioned, the resistance of a thermistor is highly nonlinear. When inserted into a suitable circuit, however, a fairly linear response can readily be achieved. For example, a typical thermistor shows a highly nonlinear resistance variation with temperature, as in Figure 9.24(b). However, the voltage-divider circuit shown in Figure 9.24(a), which includes the thermistor, has an approximately linear output, as shown in Figure 9.24(b). In Figure 9.24(b) both resistance and voltage are normalized by their values at 25°C.
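The linearizing effect of a divider like the one in Figure 9.24(a) can be illustrated with a simple beta-model thermistor (β = 3950 K and R25 = 10 kΩ are assumed, illustrative values, as is the choice of a fixed resistor equal to R25): the output voltage rises smoothly and roughly linearly even though the resistance itself falls steeply.

```python
# Voltage-divider linearization of an NTC thermistor (beta model).
from math import exp

R25, BETA = 10_000.0, 3950.0          # assumed thermistor parameters
T25 = 298.15                          # 25 deg C in kelvin
R_FIX, V_SUP = 10_000.0, 5.0          # fixed divider resistor and supply

def r_thermistor(T_c):
    """NTC resistance (ohms) at T_c deg C, beta approximation."""
    T = T_c + 273.15
    return R25 * exp(BETA * (1.0 / T - 1.0 / T25))

def v_out(T_c):
    """Divider output across the fixed resistor; increases with temperature."""
    return V_SUP * R_FIX / (R_FIX + r_thermistor(T_c))

temps = list(range(0, 101, 10))
resistances = [r_thermistor(t) for t in temps]   # steeply falling
voltages = [v_out(t) for t in temps]             # roughly linear rise
```

At 25°C the thermistor equals the fixed resistor, so the output sits at half the supply; picking R_FIX near the midrange resistance is the usual design choice for best linearity.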
�c �c
v.s -
2
0.8
1 .5 0.6
1 Thermistor resistance
0.4 (a) 0.2
0
20
0.5 40
60 80 Temperature ('C) (b)
FIGURE 9.24
Linearization circuit for thermistor.
100
0 120
�c
on N
310   Chapter 9   Measuring Pressure, Temperature, and Humidity
Integrated-circuit temperature transducers, which combine several components on a single chip, have a variety of special characteristics that make them useful for some applications. Some circuits provide a high-level voltage output (range 0 to 5 V) that is a linear function of temperature. Others create a current that is a linear function of temperature. These chips generally have the same temperature-range limitations as thermistors and are larger, giving them poorer transient response and poorer spatial resolution. One common application is measuring temperature in the connection boxes used when thermocouples are connected to a data-acquisition system. (See Section 9.2.1.)

9.2.4 Mechanical Temperature-Sensing Devices

Liquid-in-Glass Thermometer

Probably the best known of temperature-measuring devices is the liquid-in-glass thermometer, shown in Figure 9.25. The most common liquid is mercury, but alcohol and other organic liquids are also used, depending on the temperature range. Liquid-in-glass thermometers are useful when a ready indication of temperature is required. They are also rather simple devices and are likely to maintain accuracy over long periods of time (many years); consequently, they are useful for calibrating other temperature-measuring devices. The user should check that the liquid column is continuous (no gaps in the column) and that the glass envelope is free of cracks. The accuracy of liquid-in-glass thermometers can be quite good. Measurement uncertainty depends on the range, but uncertainties in the ±0.2°C range are quite possible. High-accuracy thermometers are of the total-immersion type, meaning that the thermometer is immersed into the fluid from the bulb to the upper end of the liquid column in the stem.

Bimetallic-Strip Temperature Sensors

These devices are based on the differential thermal expansion of two different metals that have been bonded together, as shown in Figure 9.26(a). As the device is heated or cooled, it will bend, producing a deflection of the end. Figures 9.26(b) and (c) show alternative geometries of these devices. Bimetallic devices have been widely used as the sensing element in simple temperature-control systems. They have the advantage that they can do sufficient work to perform mechanical
[FIGURE 9.25: Liquid-in-glass thermometer, showing the bulb, stem, capillary tube, and scale.]
[FIGURE 9.26: Bimetallic strip devices: (a) straight strip of bonded metals A and B; (b), (c) alternative geometries, including a spiral. ((b) and (c) based on Doebelin, 1990.)]
functions, such as operating a switch or controlling a valve. Household furnaces are frequently controlled with bimetallic sensors. They are sometimes used for temperature measurement (domestic oven thermometers are often of this type) but are not normally used for accurate temperature measurement.

Pressure Thermometers

A pressure thermometer, such as the one shown in Figure 9.27, consists of a bulb, a capillary tube, and a pressure-sensing device such as a bourdon gage. The capillary is of variable length and serves to locate the pressure indicator in a suitable location. The system may be filled with a liquid, a gas, or a combination of vapor and liquid. When filled with a gas, the gage measures the pressure of an effectively constant-volume gas, so pressure is proportional to bulb temperature. With a liquid, sensing is due to the differential thermal expansion between the liquid and the bulb. In both cases the temperature of the capillary might have a slight effect on the reading. In a vapor-liquid system, the gage reads the vapor pressure of the liquid. These liquid-vapor devices are nonlinear, since vapor pressure is
[FIGURE 9.27: Pressure thermometer: a fluid-filled bulb connected by a capillary tube to a bourdon-type pressure gage.]
usually a very nonlinear function of temperature, but they are insensitive to the temperature of the capillary. Pressure thermometers are used in simple temperature-control systems in such devices as ovens. At one time they were widely used as engine water-temperature-measuring systems in automobiles, but this function has been taken over by thermistor systems.

9.2.5 Radiation Thermometers (Pyrometers)
Contact temperature measurements are very difficult at high temperatures, since the measurement device will either melt or oxidize. As a result, noncontact devices were developed. These devices, called radiation thermometers, measure temperature by sensing the thermally generated electromagnetic radiation emitted from a body. The term pyrometer is used to name high-temperature thermometers (including some contact devices) but is usually applied to noncontact devices. Radiation thermometers can also be used at lower temperatures as a nonintrusive alternative to contact methods.

Any body emits electromagnetic radiation continuously, and the power and wavelength distribution of this radiation are functions of the temperature of the body. An ideal radiating body is called a blackbody, and no body at a given temperature can thermally radiate more. The total power radiated by a blackbody, Eb, in watts per square meter of surface, is given by the Stefan-Boltzmann law:

Eb = σT⁴   (9.16)
where T is the absolute temperature in kelvin and σ is the Stefan-Boltzmann constant, 5.669 × 10⁻⁸ W/m²·K⁴. The distribution of this radiation with wavelength is described with the variable Ebλ, called the monochromatic emissive power. The monochromatic emissive power represents the power in a narrow band of wavelengths, Δλ, and is a function of wavelength. The expression for Ebλ is

Ebλ = C1λ⁻⁵ / (e^(C2/λT) − 1)   (9.17)

where λ is the wavelength, C1 = 3.743 × 10⁸ W·μm⁴/m², and C2 = 1.4387 × 10⁴ μm·K.
Figure 9.28, which plots Eq. (9.17), shows several interesting points:

1. The wavelength of the maximum monochromatic emissive power decreases with increasing temperature.
2. The total radiation (the area under each curve) increases significantly with temperature.
3. At low temperatures there is very little radiation in the visible range, but at higher temperatures the radiation in this range is significant.
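These observations can be checked numerically from Eq. (9.17) (a sketch, using the C1 and C2 values given above and simple trapezoidal integration): integrating Ebλ over wavelength should recover the Stefan-Boltzmann total of Eq. (9.16), and the peak wavelength should shift downward as T rises.

```python
# Blackbody monochromatic emissive power, Eq. (9.17), with checks on Eq. (9.16).
from math import exp

C1 = 3.743e8       # W-um^4/m^2
C2 = 1.4387e4      # um-K
SIGMA = 5.669e-8   # W/m^2-K^4

def e_b_lambda(lam_um, T):
    """Monochromatic emissive power, W/(m^2 um); lam_um in um, T in K."""
    return C1 * lam_um**-5 / (exp(C2 / (lam_um * T)) - 1.0)

def total_emissive_power(T, lam_lo=0.05, lam_hi=1000.0, n=100_000):
    """Trapezoidal integral of e_b_lambda over wavelength, W/m^2."""
    h = (lam_hi - lam_lo) / n
    s = 0.5 * (e_b_lambda(lam_lo, T) + e_b_lambda(lam_hi, T))
    s += sum(e_b_lambda(lam_lo + i * h, T) for i in range(1, n))
    return s * h

def peak_wavelength(T):
    """Wavelength (um) of maximum emission, found on a coarse grid."""
    grid = [0.1 * k for k in range(1, 500)]    # 0.1 to 49.9 um
    return max(grid, key=lambda lam: e_b_lambda(lam, T))
```

At 1000 K the integral comes out close to σT⁴, and the peak moves from roughly 2.9 μm at 1000 K to roughly 1.4 μm at 2000 K, consistent with points 1 and 2 above.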
There are then several possible measurement devices that can be used to determine the temperature of the source:

(a) Narrow-band devices that measure the radiation in a limited band of wavelengths.
(b) Wide-band devices that measure the radiation in a wide range of wavelengths.
(c) Ratio devices that measure the radiation intensity in more than one narrow band (usually two bands).
[FIGURE 9.28: Monochromatic emissive power of a blackbody versus wavelength for several temperatures, up to T = 2500 K.]
9.4 Fiber-Optic Devices   325

[FIGURE 9.40: Typical optical fiber, showing the core, the cladding, and the protective layer.]
[FIGURE 9.41: Ray transmission through an optical fiber. Rays within the acceptance cone are optically waveguided; a light ray not within the acceptance cone is eventually lost by radiation.]
Basically, the optical fiber is a cylinder of transparent dielectric material surrounded by another dielectric material, called cladding, with a lower refractive index. In practice, a third, protective layer is also required. (See Figure 9.40.) Figure 9.41, which shows the ray diagram of an optical fiber, indicates that rays that enter the fiber beyond the acceptance angle may not be fully transmitted through the fiber and will eventually be lost.

Optical fibers have extensive application in telecommunication and computer networking, but their application as sensing devices is not yet that widespread. Applying fiber-optic-based sensors is an emerging technology and is expected to grow in the near future. Optical sensing and signal transmission have several potential advantages over conventional electric-output transducers and electric signal transmission. Major advantages are as follows:

1. Nonelectric (optical fibers are immune to electromagnetic and radio-frequency interference).
2. Explosion proof.
3. High accuracy.
4. Small size (both the fibers and the attached sensors can be very small, applicable to small spaces with minimum loading and interference effects).
5. High capacity and signal purity.
6. Can be easily interfaced with data-communication systems.
7. Multiplexing capability (numerous signals can be carried simultaneously, allowing a single fiber to monitor multiple points along its length or to monitor several different parameters).
According to Krohn (2000), most physical properties can be sensed with fiber-optic sensors. Light intensity, displacement (position), pressure, temperature, strain, flow, magnetic and electric fields, chemical composition, and vibration are among the measurands for which fiber-optic sensors have been developed. In the sections that follow, basic principles and some typical fiber-optic sensors are introduced.

9.4.2 General Characteristics of Fiber-Optic Sensors

Fiber-optic sensors can be divided into two general categories, intrinsic and extrinsic. In intrinsic sensors, the fiber itself performs the measurement, while in extrinsic sensors a coating or a device at the fiber tip performs the measurement. Figure 9.42 shows schematic diagrams of these two types of sensors. Depending on the sensed property of light, fiber-optic sensors are also divided into phase-modulated sensors and intensity-modulated sensors. Phase-modulated sensors compare the phase of light in a sensing fiber to that in a reference fiber in a device called an interferometer; the phase difference can be measured with extreme sensitivity. In intensity-modulated sensors, the perturbation causes a change in received light intensity, which is a function of the phenomenon being measured (the measurand). Intensity-modulated sensors are simpler, more economical, and more widespread in application, so the discussion here is limited to this type. What follows is a brief introduction to these sensors for some mechanical measurements.
[FIGURE 9.42: Two types of fiber-optic sensors: (a) extrinsic, in which incident light travels to a sensor at the fiber tip and returns to a detector; (b) intrinsic, in which the fiber itself acts as the sensor.]
9.4.3 Fiber-Optic Displacement Sensors

Two concepts that are widely used in fiber-optic sensors are the reflective and microbending concepts. Both sense displacement but can be used for other measurements if the measurand can be made to produce a displacement. Figure 9.43 shows the basic concept of a reflective displacement sensor. In a reflective sensor, two bundles of fibers are used. One bundle transmits the light to a reflecting target, while the other collects (traps) the reflected light and transmits it to a detector. Any motion or displacement of the reflecting target affects the reflected light that reaches the detector: the intensity of the reflected light captured depends on the distance of the reflecting target from the optic probe. A typical response curve is also shown. Following the basics of geometric optics, the behavior of this response curve can be interpreted and used in measuring displacement. Plain reflective displacement sensors have a limited dynamic range of about 0.2 in. This can be improved to 5 or more inches by using a lens system, shown schematically in Figure 9.44 (Krohn, 2000). Disadvantages of this type of sensor are that it is sensitive to the orientation of the reflective surface and to contamination of the reflective surface.
[FIGURE 9.43: Reflective fiber-optic displacement sensor and its response curve. A light source feeds a transmit leg; light reflected from the target surface returns through a receive leg to a detector. Relative output versus distance (0 to 0.200 in.) rises to a peak and then falls along a back slope. (After Krohn, 2000.)]
[FIGURE 9.44: Fiber-optic displacement transducer with lens: (a) configuration, with fiber-optic probe, lens, light source, and detector; (b) response curve, with the range expanded to about 5 in. (After Krohn, 1992.)]