index previous next

Peak Finding and Measurement (May, 2012)

[findpeaks]  [iPeak[FindPeakSliders]  [Video Demonstration]

A common requirement in signal processing is to detect peaks in a signal and to measure their positions, heights, and widths. A common way to do this is to make use of the fact that the first derivative of a peak has a downward-going zero-crossing at the peak maximum.  But the presence of random noise in real experimental signal will cause many false zero-crossing simply due to the noise. To avoid this problem, the technique described here first smooths the first derivative of the signal, before looking for downward-going zero-crossings, and then it takes only those zero crossings whose slope exceeds a certain pre-determined minimum (called the "slope threshold") at a point where the original signal exceeds a certain minimum (called the "amplitude threshold").  By carefully adjusting the smoothwidth, slope threshold, and amplitude threshold, it's possible to detect only the desired peaks and ignore peaks that are too small, too wide, or too narrow.  Moreover, because smoothing can distort peak signals, reducing peak heights and increasing peak widths, this technique estimates the position, height, and width of each peak by least-squares curve-fitting of a segment of the original unsmoothed signal in the vicinity of the zero-crossing. Thus, even if heavy smoothing of the first derivative is necessary to provide reliable discrimination against noise peaks, the peak parameters extracted by curve fitting are not distorted. (This is useful primarily for signals that have several data points in each peak, not for spikes that have only one or two points).  This technique is capable of measuring peak positions and heights quite accurately, but the measurements of peak widths and areas is accurate only if the peaks are approximately Gaussian in shape.  (For the most accurate measurement of highly overlapped peaks, iPeak version 3 is capable of utilizing non-linear iterative curve fitting).

The routine is available in three different versions:
(1) the basic command-line findpeaks function (findpeaks.m);
(2) an interactive keypress-operated function, called iPeak (ipeak.m) for adjusting the peak detection criteria in real-time to optimize for any particular peak type; and
(3) an older script using mouse-controlled sliders for interactive control, which works only in Matlab 6.5 but not in more recent versions. These are all detailed below.

Click here to download the ZIP file "PeakFinder.zip", which includes findpeaks.m, ipeak.m, and a sample data file and demo scripts for testing. You can also download iPeak and other programs of mine from the Matlab File Exchange.

The findpeaks function (findpeaks.m)

function P=findpeaks(x,y,SlopeThreshold,AmpThreshold,SmoothWidth,FitWidth,smoothtype)
This is the basic command-line function to locate and measure the positive peaks in a noisy data sets. It detects peaks by looking for downward zero-crossings in the smoothed first derivative that exceed SlopeThreshold and peak amplitudes that exceed AmpThreshold, and determines the position, height, and approximate width of each peak by least-squares curve-fitting the top part of the peak. Returns a list (in matrix P) containing the peak number and the estimated position, height, and width of each peak.
It can find and measure over 1000 peaks per second in very large signals.

The data are passed to the findpeaks funciton in the vectors x and y (x = independent variable, y = dependent variable). The other parameters are user-adjustable:

There is also a similar function for finding valleys (minima), called findvalleys.m, which works the same way as findpeaks.m, except that it locates minima instead of maxima.  Right-click the link and select "Save link as...."

Examples:

>> x=[0:.01:50];y=cos(x);findpeaks(x,y,0,-1,5,5)
ans =
            1       6.2832            1       2.3569
            2       12.566            1       2.3569
            3        18.85            1       2.3568
            4       25.133            1       2.3569
            5       31.416            1       2.3569
            6       37.699            1       2.3568
            7       43.982            1       2.3569

>> x=[0:.01:50];y=cos(x);findvalleys(x,y,0,-1,5,5)
ans =
            1       3.1416           -1       2.3571
            2       9.4248           -1       2.3571
            3       15.708           -1       2.3571
            4       21.991           -1       2.3571
            5       28.274           -1       2.3571
            6       34.558           -1       2.3571
            7       40.841           -1      
2.3571


Note: The accuracy of the measurements of  peak position, height, and width by the findpeaks function depends on the shape of the peaks, the extent of peak overlap, and signal-to-noise ratio.  The width measurements particularly are strongly influenced by peak overlap, noise, and the choice of FitWidth. Isolated peaks of Gaussian shape are measured most accurately.  See "ipeakdemo" (below) for an accuracy trial. For the most accurate measurements of highly overlapping peaks of various shapes, use the Interactive Peak Fitter (ipf.m).

DemoFindPeak (DemoFindPeak.m)

Click to view enlarged figure
Click to view enlarged figure
A simple demonstration script using the findpeaks function on noisy synthetic data. Numbers the peaks and prints out the P matrix in the Matlab command window: 
    Peak #   Position   Height    Width
P =
1.0000 10.0235 0.1814 1.7506
2.0000 29.9964 0.6412 1.9931
3.0000 50.0205 0.4315 2.1433
4.0000 70.0120 1.3935 2.0202
5.0000 79.9567 0.2580 1.8267
6.0000 100.0086 1.0862 2.0306
......


The iPeak Function (ipeak.m), Version 3.9  Download as a ZIP file with sample data.


Peaks in annual sunspot numbers from 1700 to 2009 (download the datafile)*.


Press U key to switch between peak and valley mode.




                     Example 2




ECG data (normal and inverted)

ipeak

ipeak4
Example 3


Example 5. Pressing 'L' toggles ON and OFF the peak labels in the upper window.


Example 8. The peak identification function applied to a high-resolution atomic spectrum.


Example 8. Three peaks near 296 nm isolated and identified. Press the I key to display the peak ID names.



Normal Peak Fit (N key)



iPeak 3.2 and later, showing log scale (Y key) and Autozero mode (T key)


Click to view larger figures.
iPeak is a keyboard-operated Interactive Peak Finder for time series data, based on the "findpeaks.m" function. It accepts data in a single vector, a pair of vectors, or a matrix with the independent variable in the first column and the dependent variable in the second column:

Example
1:   Two input arguments; data in separate x and y vectors
              
>> x=[0:.1:100];y=(x.*sin(x)).^2;ipeak(x,y);
 
Example 2:    One input argument; data in two columns of a matrix
              
>> x=[0:.01:5]';y=x.*sin(x.^2).^2;M=[x y];ipeak(M)
 
Example 3:   One input argument; data in single vector
              
>> y=cos(.1:.1:100);ipeak(y) 

Example 4:   Additional input argument (after the data) to control peak sensitivity.

       >> x=[0:.1:100];y=5+5.*cos(x)+randn(size(x));ipeak(x,y,10);
   or  >> ipeak([x;y],10);
   or  >> ipeak(humps(0:.01:2),3)
   or  >> x=[0:.1:10];y=exp(-(x-5).^2);ipeak([x' y'],1)

The additional argument 
is an estimate of the ratio of the typical peak width to the length of the entire data record (PeakD). Small values detect fewer peaks; larger values detect more peaks. It effects only the starting values for the peak detection parameters. (This is just a quick way to set reasonable initial values of the peak detection parameters, rather than specifying each one individually as in the following example).

iPeak displays the entire signal in the lower half of the Figure window and an adjustable zoomed-in section in the upper window. (Press the Enter key to change the plot color.)  Adjust the peak detection parameters AmpThreshold (A/Z keys), SlopeThreshold (S/X), SmoothWidth (D/C), FitWidth (F/V) so that it detects the desired peaks and ignores those that are too small, too broad, or too narrow to be of interest.  Detected peaks are numbered from left to right.  Press P to display the peak table of all the detected peaks (Peak #, Position, Height, Width, Area). Press U to switch between peak and valley mode (version 3.7 and above).

Example 5: Six input arguments. As above, but input arguments 3 to 6 directly specifies initial values of AmpThreshold (AmpT), SlopeThreshold (SlopeT), SmoothWidth (SmoothW), FitWidth (FitW) . PeakD is ignored in this case, so just type a '0' as the second argument after the data matrix).

                 >> ipeak(datamatrix,0,.5,.0001,20,20);

Keystrokes allow you to pan and zoom the upper window, to inspect each peak in detail if desired.  You can set the initial values of pan and zoom in optional input arguments 7 ('xcenter') and 8 ('xrange').  See example 6 below.

The peak cloest to the center of the upper window is labeled in the upper left of the top window, and it peak position, height, and width are listed. The Spacebar/Tab keys
jump to the next/previous detected peak and displays it in the upper window at the current zoom setting (use the up and down cursor arrow keys to adjust the zoom range) The Y key toggles between linear and log y-axis scale in the lower window (a log axis is good for inspecting signals with high dynamic range; it effects only the lower window display and has no effect on the peak detection or measurements). 
 
Example 6: Eight input arguments. As above, but input arguments 7 and 8 specifiy the initial pan and zoom settings, 'xcenter' and 'xrange', respectively.  In this example, the x-axis data are wavelengths in nanometers (nm), and the upper window zooms in on a very small 0.4 nm region centered on 249.7 nm. (These data, provided in the ZIP file, are from a high-resolution atomic spectrum).

        >> load ipeakdata.mat
       
>> ipeak(Sample1,0,100,0.05,3,4,249.7,0.4);

Autozero mode. The T key toggles the autozero mode off and on.  When autozero is OFF, peak heights are measured relative to zero.  (If the peaks are superimpored on a background, use the baseline subtract keys - B and G - first to subtract the background).  When autozero is ON, peak heights are automatically measured relative to the local baseline on either side of the peak; use the zoom controls to isolate the peaks so that the signal returns to the local baseline between the peaks as displayed in the upper windowWhen autozero is ON, the peak heights, widths, and areas in the peak table (R or P keys) will be automatically corrected for the baseline.  (Autozero OFF will give better results when the baseline is zero, or has been subtracted using the B key, even if the peaks are partly overlapped. Autozero ON will work best if the peaks are well separated so that the signal returns to the local baseline between the peaks.  If the peaks are highly overlapped, or if they are not Gaussian in shape, the best results will be obtained by using the curve fitting function - the N or M keys). 
Example 7:  Nine input arguments. As example 6, but the 9th input argument turns ON the autozero mode (equivalent to pressing the T key).  If not specified, autozero is initially OFF.

       
>> ipeak(Sample1,0,100,0.00,3,4,249.7,0.4,1);

Normal and Multiple Curve fitting: The N key applies variable-shape iterative curve fitting to the detected peaks that are displayed in the upper window (referred to a "Normal" curve fitting).  If the peaks are superimposed on a background, turn on the Autozero mode (T key). Then use the pan and zoom keys to select a peak or a group of overlapping peaks in the upper window, with the signal returning all the way to the local baseline at the ends of the upper window.  Make sure that AmpThreshold, SlopeThreshold, SmoothWidth are adjusted so that each peak is numbered once.  Then press the N key, type a number for the desired peak shape at the prompt in the Command window and press Enter (1=Gaussian (default); 2=Lorentzian; 3=logistic, 4=Pearson; 5=exponentionally broadened Gaussian, 6=equal-width Gaussians; 7=Equal-width Lorentzians; 8=exponentionally broadened equal-width Gaussian), then type in a number of repeat trial fits and press Enter (the default is 1; start with that and then increase if necessary).  The program will perform the fit, display the results graphically in Figure window 2, and print out a table of results in the command window, e.g.:

Peak shape (1-8): 2
Number of trials: 1

Least-squares fit to Lorentzian peak model
Fitting Error 1.1581e-006%
          Peak#     Position     Height      Width         Area 
            1          100            1           50       71.652
            2          350            1          100       146.13
            3          700            1          200       267.77

The use of the iterative least-squares function can result in more accurate peak parameter measurements that the normal
peak table (R or P keys), especially if the peaks are non-Gaussian in shape or are highly overlapped.

There is also an experimental "Multiple" peak fit function (M key) that will attempt to apply
non-linear iterative curve fitting to to all the detected peaks in the signal automatically.  Before using this function, use the baseline correction function first (B key) to remove the background signal.  Then press M and proceed as for the normal curve fit. With some signals, this function may not work as well as using the normal (N-key) curve fitting function on each group of peaks separately.  Note: The N and M key fitting functions perform non-linear iterative curve fitting using the peakfit.m function. The number of peaks and the starting values of peak positions and widths for the curve fit function are automatically supplied by the the findpeaks function, so it is essential that the peak detection variables in iPeak be adjust so that all the peaks in the selected region are detected and numbered once.  (For more flexible curve fitting, use ipf.m, which allows manual optimization of peak groupings and start positions). 

Peak identification. There is an optional "peak identification" function if optional input arguments 9 ('MaxError'), 10 ('Positions'), and 11 ('Names') are included. The "I" key toggles this function ON and OFF. This function compares the found peak positions (maximum x-values) to a database of known peaks, in the form of an array of known peak maximum positions ('Positions') and matching cell array of names ('Names'). If the position of a found peak in the signal is closer to one of the known peaks by less than the specified maximun error ('MaxError'), then that peak is considered a match and its name is displayed next to the peak in the upper window. When when the 'O' key is pressed (the letter 'O'), the peak positions, names, errors, and amplitudes are printed out in a table in the command window.
 
Example 8: Eleven input arguments. As above, but also specifies 'MaxError', 'Positions', and 'Names' in optional input arguments 9, 10, and 11, for peak identification function. Pressing the 'I' key toggles off and on the peak identification labels in the upper window. Pressing "O" prints the peak positions, names, errors, and amplitudes in a table in the command window. These data (provided in the ZIP file) are from a high-resolution atomic spectrum (x-axis in nanometers).

       
>> load ipeakdata.mat
       
>> ipeak(Sample1,0,100,0.05,3,6,296,5,0.1,Positions,Names);

Note: The ZIP file contains the latest version of the iPeak function as well as some sample data to demonstrate peak identification (Example 8).  Obviously for your own applications, it's up to you to provide your own array of known peak maximum positions ('Positions') and matching cell array of names ('Names') for your particular types of signals.

Keyboard Controls:

 Pan signal left and right...Coarse pan: < and >  
                             Fine pan: left and right cursor arrow keys
                             Nudge
left and right: [ and ]

 Zoom in and out.............Coarse zoom: / and '  
                             Fine zoom: up and down cursor arrow keys
 Resets pan and zoom.........ESC
 Change plot color...........Enter
 Adjust AmpThreshold.........A,Z  (A increases, Z decreases)
 Adjust SlopeThreshold.......S,X  (S increases, X decreases)
 Adjust SmoothWidth..........D,C  (D increases, C decreases)
 Adjust FitWidth.............F,V  (F increases, Z decreases)
 Baseline correction.........B, then click baseline at multiple points
 Restore original signal.....G  to cancel previous background subtraction
 Invert signal...............-  Invert (negate) the signal (flip + and -)
 Set minimum to zero.........0 (Zero) Sets minumun signal to zero
 Toggle log y mode OFF/ON....Y  Plot log Y axis on lower graph 
 Toggle autozero OFF/ON......T  Auto background subtraction on upper graph
 Toggle valley mode OFF/ON...U  Switch to valley mode
 Print peak table............P  Prints Peak #, Position, Height, Width
 Step through peaks..........Space/Tab  Jumps to next/previous peak
 Normal peak fit.............N  Fit peaks in upper window with peakfit.m
 Multiple peak fit...........M  Fit all peaks in signal with peakfit.m
 Print keyboard commands.....K  Prints this list
 Print findpeaks arguments...Q  Prints findpeaks function with arguments.
 Print ipeak arguments.......W  Prints ipeak function with all arguments.
 Print report................R  Prints Peak table and parameters
 Peak labels ON/OFF......... L  Displays peak parameters in upper window.
 Peak ID ON/OFF..............I  Identifies closest peaks in 'Names' database.
 Print peak IDs..............O  Prints table of peaks IDs


*Sunspot data downloaded from NOAA: ftp://ftp.ngdc.noaa.gov/STP/SOLAR_DATA/SUNSPOT_NUMBERS/INTERNATIONAL/yearly/YEARLY.PLT

iPeak Demo scripts

The ZIP file at http://terpconnect.umd.edu/~toh/spectrum/ipeak3.zip contains several demo scripts (ipeakdemo.m, ipeakdemo1.m, etc) that illustrate various aspects of the iPeak function and how it can be used effectively. Download the zip file, right-click and select "Extract all", then put the resulting files in the Matlab path and run them by typing their names at the Matlab command window prompt or by opening them in the Matlab editor and clicking the green "Run" button on the editor toolbar.


ipeakdemo: effect of the peak detection parameters

Four Gaussian peaks with the same heights but different widths (10, 30, 50 and 70 units) This demonstrates the effect of SlopeThreshold and SmoothWidth on peak detection. Increasing SlopeThreshold (S key) will discriminate against the broader peaks. Increasing SmoothWidth (D key) will discriminate against the narrower peaks and noise. FitWidth (F/V keys) controls the number of points around the "top part" of the (unsmoothed) peak that are taken to estimate the peak heights, positions, and widths. A reasonable value is ordinarily about equal to 1/2 of the number of data points in the half-width of the peaks. In this case, where the peak widths are quite different, set it to about 1/2 of the number of data points in the narrowest peak.
ipeakdemo1: the autozero mode

Demonstration of autozero background correction, for separated, narrow peaks on a large baseline. Each time you run this demo, you will get a different set of peaks and noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. Jump to the next/previous peaks using the Spacebar/Tab keys.

Hint: Turn on the Autozero mode (T key), adjust the zoom setting so that the peaks are shown one at a time in the upper window, then press the P key to display the peak table. 

ipeakdemo2: peak overlap and the curve fitting fuctions.

Demonstration of error caused by overlapping peaks on a large offset baseline. Each time you run this demo, you will get a different set of peaks and noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. Jump to the next/previous peaks using the Spacebar/Tab keys.

Hint: Use the B key and click on the baseline points, then press the P key to display the peak table. Or turn on the Autozero mode (T key) and use the Normal curve fit (N key) or Multiple curve fit (M key). 

ipeakdemo3 Non-Gaussian peak shapes

Demonstration of overlapping Lorentzian peaks, without an added background. Overlap of peaks causes significant errors in peak height, width, and area. Jump to the next/previous peaks using the Spacebar/Tab keys. Each time you run this demo, you will get a different set of noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window.

Hint: turn OFF the Autozero mode (T key) and use the Normal curve fit (N key) or Multiple curve fit (M key) with peak shape 2 (Lorentzian). 

ipeakdemo4: dealing with very noisy signals

Detection and measurement of four peaks in a very noisy signal. The signal-to-noise ratio of first peak is 2. Each time you run this demo, you will get a different set of noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. Jump to the next/previous peaks using the Spacebar/Tab keys. The peak at x=100 is usually detected, but the accuracy of peak parameter measurement is poor because of the low signal-to-noise ratio.

Hint: With very noisy signals it is usually best to increase SmoothWidth and FitWidth to help reduce the effect of the noise.



FindPeakSliders, for Matlab 6.5 (does not work reliably in more recent versions of Matlab; use iPeak instead.)

Click to view enlarged figure
Chromatogram, before baseline correction

Click to view enlarged figure

After baseline correction

Interactive findpeaks script using mouse-controlled sliders for interactive control. Requires Matlab 6.5. This can be used to determine what values of the parameters give the most reliable peak detection. Load your data set into the vectors x and y (x = independent variable, y = dependent variable), then run this m-file and adjust the sliders to change the four arguments of the peakfind function. The four sliders correspond to the arguments of the findpeaks function described above: SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth (Smooth), and FitWidth (Fit). The range of these sliders is easily changed in lines 65 - 70 to suit a wide range of data types. The BG button is used for baseline (background) subtraction: click once on the BG button, then click on the baseline at five points starting to the left of the lowest x-value and ending to the right of the higest x-value. The background will be subtracted. You can repeat as needed. (You can also change the number of baseline points by changing BaselinePoints in the function BG.m). Peak number and the estimated position, height, and width of each peak is returned in the matrix P:
>> P
P =
1.0000 45.0095 0.1328 4.0102
2.0000 90.5301 0.3655 4.7127
3.0000 94.9673 1.4075 4.0967
......

DemoFindPeakSliders

Click to view enlarged figure
Click to view enlarged figure
A self-contained interactive demonstration of FindPeakSliders applied to noisy synthetic data set consisting of a random number of narrow peaks superimposed on a gently curved background. Requires Matlab 6.5. Use the sliders to explore the effect of the variables SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth (Smooth), and FitWidth (Fit), and the baseline correct (BG) button. Peak number and the estimated position, height, and width of each peak is returned in the matrix P.

DemoFindPeakSliders2


Click to view enlarged figure
A self-contained interactive demonstration of FindPeakSliders applied to a data set containing four simple peaks with increasing peak height and peak width. Use this to understand the difference between the variables SlopeThreshold (SlopeT), which discriminates on the basis of peak width, and AmpThreshold (AmpT), which discriminates on the basis of peak amplitude. Peak number and the estimated position, height, and width of each peak is returned in the matrix P.
>> P
P =
1.0000 49.9096 1.9895 20.1960
2.0000 126.1034 3.0585 42.3508
3.0000 201.3947 4.0752 63.3354
4.0000 299.9242 4.9925 82.2000


FindPeakSlidersG

Click to view enlarged figure
Click to view enlarged figure
Click to view enlarged figure
Click to view enlarged figure

Interactive findpeaks script with peak-zoom feature, for pre-defined data in x,y. Requires Matlab 6.5. Load a typical data set into the vectors x,y, then run this m-file and adjust the sliders to determine what values of the parameters give the most reliable peak detection. Peak number and position, height, and width of each peak is returned in the matrix P. This version has an additional slider, Peak, which zooms in on any one of the detected peaks and shows the fitted top of the peak (FitWidth) as a blue curve. To see all the peaks, set the Peaks slider to zero (all the way down).
Note: If the slider ranges are not appropriate for your data, change them in lines 63-69.

DemoFindPeakSlidersG

A self-contained interactive demonstration of FindPeakSlidersG. Requires Matlab 6.5.

Video Demonstration of FindPeakSliders

This is a 26-second, 1.4 MByte video (PeakFind2.wmv) of FindPeakSliders (version 1.1), showing the effect of varying the four parameters SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth, and FitWidth (labeled "PeakWidth" in this video).

Tom O'Haver
Professor Emeritus
Department of Chemistry and Biochemistry
The University of Maryland at College Park
toh@umd.edu
http://www.wam.umd.edu/~toh

Last updated, May, 2012


index previous next

Number of unique visits since May 17, 2008:

counter
create hit