index previous next

Peak Finding and Measurement

[findpeaks]  [findvalleys]  [Background correction]  [Signal-to-noise ratio]  [Peak Start and End]  [Finding and fitting peaks]   [Peak Identification]  [iPeak]  [iPeak demo functions]  [Spreadsheets]  [Animated Demonstration]  [Have a question? Email me]

A common requirement in signal processing is to detect peaks in a signal and to measure their positions, heights, and widths. A common way to do this is to make use of the fact that the first derivative of a peak has a downward-going zero-crossing at the peak maximum. But the presence of random noise in real experimental signal will cause many false zero-crossing simply due to the noise. To avoid this problem, the technique described here first smooths the first derivative of the signal, before looking for downward-going zero-crossings, and then it takes only those zero crossings whose slope exceeds a certain predetermined minimum (called the "slope threshold") at a point where the original signal exceeds a certain minimum (called the "amplitude threshold"). By carefully adjusting the smoothwidth, slope threshold, and amplitude threshold, it's possible to detect only the desired peaks and ignore peaks that are too small, too wide, or too narrow. Moreover, because smoothing can distort peak signals, reducing peak heights and increasing peak widths, this technique estimates the position, height, and width of each peak by least-squares curve-fitting of a segment of the original unsmoothed signal in the vicinity of the zero-crossing. Thus, even if heavy smoothing of the first derivative is necessary to provide reliable discrimination against noise peaks, the peak parameters extracted by curve fitting are not distorted. (This is useful primarily for signals that have several data points in each peak, not for spikes that have only one or two points). This technique is capable of measuring peak positions and heights quite accurately, but the measurements of peak widths and areas is accurate only if the peaks are approximately Gaussian in shape.  (For the most accurate measurement of highly overlapped peaks, iPeak version 3 and above is capable of utilizing non-linear iterative curve fitting).

The routine is now available in several different versions:
(1) a set of command-line functions for Matlab or Octave: findpeaks.m (don't confuse with the findpeaks function in the Signal Processing Toolbox), findvalleys.m, findpeaksL.m, findpeaksb.m, findpeaksplot.m, peakstats.m, findpeaksnr.m, findpeaksE, findpeaksGSS.m , findpeaksLSS.m, and findpeaksfit.m. These can be used as components in creating your own custom scripts and functions.
(2) an interactive keypress-operated function, called iPeak (ipeak.m), for adjusting the peak detection criteria interactively to optimize for any particular peak type (Matlab only). iPeak runs in the Figure window and use a simple set of keystroke commands to reduce screen clutter, minimize overhead, and maximize processing speed.
(3) A spreadsheet, available in Excel and in OpenOffice formats.

Click here to download the ZIP file "PeakFinder.zip", which includes findpeaks.m, ipeak.m, and a sample data file and demo scripts for testing. You can also download iPeak and other programs of mine from the Matlab File Exchange.

The basic command-line function, findpeaks.m, for Matlab or Octave.

function P=findpeaks(x,y,SlopeThreshold,AmpThreshold,SmoothWidth,FitWidth,smoothtype)
This is my basic command-line function to locate and measure the positive peaks in a noisy data sets. It detects peaks by looking for downward zero-crossings in the smoothed first derivative that exceed SlopeThreshold and peak amplitudes that exceed AmpThreshold, and determines the position, height, and approximate width of each peak by least-squares curve-fitting the top part of the peak. Returns a list (in matrix P) containing the peak number and the estimated position, height, width, and area of each peak.
It can find and measure over 1000 peaks per second in very large signals. Version 5.0 (December, 2012) has improved error catching for very noisy data.

The data are passed to the findpeaks function in the vectors x and y (x = independent variable, y = dependent variable). The other parameters are user-adjustable: >> x=[0:.01:50];y=cos(x);P=findpeaks(x,y,0,-1,5,5)
P =
    1.0000    6.2832    1.0000    2.3547    2.5068
    2.0000   12.5664    1.0000    2.3547    2.5068
    3.0000   18.8496    1.0000    2.3547    2.5068
    4.0000   25.1327    1.0000    2.3547    2.5068
    5.0000   31.4159    1.0000    2.3547    2.5068
    6.0000   37.6991    1.0000    2.3547    2.5068
    7.0000   43.9823    1.0000    2.3547    2.5068

Optimization of peak finding
The optimum values of the input arguments for findpeaks and related functions depend on the signal and on which features of the signal are important for your work. Rough values for these parameters can be estimated based on the width of the peaks that you wish to detect, as described above, but in many cases it will be best to fine-tune these parameters for your particular signal.

In the example shown on the left (using the interactive peak detector iPeak program described below), the important parts of the signal are two broad peaks
at x=4 and x=6, the second one half the height of the first. The small jagged features are just random noise. We want to detect the two peaks but ignore the noise. (The detected peaks are numbered 1,2,3,...in the lower panel of this graphic). This is what it looks like if the AmpThreshold is too small or too large, if the SlopeThreshold is too small or too large, if the SmoothWidth is too small or too large, if the FitWidth is too small or too large If these parameters are in the optimum range for this objective, the findpeaks functions will return something like this (although the exact values will vary with the noise and with the value of FitWidth):

          Peak#     Position     Height      Width          Area       
            1       3.9649      0.99919       1.8237      1.94      
            2       5.8675      0.53817       1.6671      0.95516      


 
How is 'findpeaks' different from 'max' in Matlab or 'findpeaks' in the Signal Processing Toolkit?

The 'max' function simply returns the largest single value in a vector.  Findpeaks in the Signal Processing Toolbox can be used to find the values and indices of all the peaks in a vector that are higher than a specified peak height and are separated from their neighbors by a specified minimum distance.  My version of findpeaks accepts both an independent variable (x) and dependent variable (y) vectors, finds the places where the average curvature over a specified region is concave down, fits that region with a least-squares fit, and returns the peak position (in x units), height, width, and area, of any peak that exceeds a specified height. For example, let's create a noisy series of peaks (plotted on the right) and apply each of the functions to the resulting data.

x=[0:.1:100];
y=5+5.*sin(x)+randn(size(x));

plot(x,y)

Now, anyone looking at this plot of data would count 16 peaks, with peak heights averaging about 10 units. Every time the statements are run, the random noise is different, but you would still count the 16 peaks. But the findpeaks function in the Signal Processing Toolbox,

[PKS,LOCS]=findpeaks(y,'MINPEAKHEIGHT',5,'MINPEAKDISTANCE',11)

counts anywhere from 11 to 20 peaks, with an average height (
PKS) of 11.5. In contrast, my findpeaks function returns 16 peaks every time, with a mean height of 10 ħ0.3.   

findpeaks(x,y,0.001,5,11,11,3)


It also measures the width and area, if the peaks are Gaussian (or Lorentzian, in the variant findpeaksL). On the other hand, f
indpeaks in the Signal Processing Toolbox works better for peaks that have only 1-3 data points on the peak; my function is better for peaks that have a lot of points.

findvalleys. There is also a similar function for finding valleys (minima), called findvalleys.m, which works the same way as findpeaks.m, except that it locates minima instead of maxima.  Only valleys above (that is, more positive or less negative than) the AmpThreshold are detected; if you wish to detect valleys that have negative minima, then AmpThreshold must be set more negative than that.

>> x=[0:.01:50];y=cos(x);P=findvalleys(x,y,0,-1,5,5)
P =
    1.0000    3.1416   -1.0000    2.3549         0
    2.0000    9.4248   -1.0000    2.3549         0
    3.0000   15.7080   -1.0000    2.3549         0
    4.0000   21.9911   -1.0000    2.3549         0
    5.0000   28.2743   -1.0000    2.3549         0
    6.0000   34.5575   -1.0000    2.3549         0
    7.0000   40.8407   -1.0000    2.3549         0
    8.0000   47.1239   -1.0000    2.3549         0


The accuracy of the measurements of  peak position, height, width, and area by the findpeaks function depends on the shape of the peaks, the extent of peak overlap, and signal-to-noise ratio. The width and area measurements particularly are strongly influenced by peak overlap, noise, and the choice of FitWidth. Isolated peaks of Gaussian shape are measured most accurately.  For peak of Lorentzian shape, use findpeaksL.m instead (the only difference is that the reported peak heights, widths, and areas will be more accurate if the peak are actually Lorentzian). See "ipeakdemo.m" below for an accuracy trial for Gaussian peaks. For the most accurate measurements of highly overlapping peaks of various shapes, use peakfit.m or the Interactive Peak Fitter (ipf.m).

For a direct comparison of the accuracy of findpeaks vs peakfit, run the script peakfitVSfindpeaks.m. This script (figure on the left) generates four very noisy peaks of different heights and widths, then applies findpeaks.m and peakfit.m to measure the peaks and compares the results. The peaks detected by findpeaks are labeled "Peak 1", "Peak 2", etc. If you run this script several times, you'll find that both methods work well most of the time, with peakfit giving smaller errors in most cases, but occasionally findpeaks will miss the first (lowest) peak and rarely it will detect an extra peak that is not there if the signal is very noisy.


findpeaksb.m is a variant of findpeaks.m that more accurately measures peak parameters by using iterative least-square curve fitting based on peakfit.m. This yields better peak parameter values that findpeaks alone, because it can be set for different peak shapes and it fits the entire peak, not just the top part, and because it has provision for background subtraction (linear or quadratic). This function works best with isolated peaks that do not overlap. For version 3, the syntax is P = findpeaksb(x,y, SlopeThreshold, AmpThreshold, smoothwidth, peakgroup, smoothtype, windowspan, PeakShape, extra,
background correction mode). The first seven input arguments are exactly the same as for the findpeaks.m function; if you have been using findpeaks or iPeak to find and measure peaks in your signals, you can use those same input argument values for findpeaksb.m. The remaining four input arguments of are for the peakfit function: "windowspan" specifies the number of data points over which each peak is fit to the model shape (it must be large enough to cover the entire peak and get down to the background on both sides of the peak); "PeakShape" specifies the model peak shape: "peakshape" (1=Gaussian, 2=Lorentzian, etc), "extra" is the shape modifier variable that is used for the Voigt, Pearson, exponentially broadened Gaussian and Lorentzian, Gaussian/Lorentzian blend, and bifurcated Gaussian and Lorentzian shapes to fine-tune the peak shape; "background correction" is 0, 1, 2, or 3 for no, linear, quadratic, or flat background subtraction. The peak table returned by the function has a 6th column listing the percent fitting errors for each peak. The demonstration script DemoFindPeaksb.m shows how findpeaksb works with multiple peaks on a curved background. Here is a simple example with three Gaussians on a linear background, comparing plain findpeaks to findpeaksb with and without background subtraction:

x=1:.2:100;Heights=[1 2 3];Positions=[20 50 80];Widths=[3 3 3];
y=2-(x./50)+modelpeaks(x,3,1,Heights,Positions,Widths)+.02*randn(size(x));plot(x,y);
disp('          Peak      Position      Height      Width        Area         % error')

PlainFindpeaks=findpeaks(x,y,.00005,.5,30,20,3)
NoBackgroundSubtraction=findpeaksb(x,y,.00005,.5,30,20,3,150,1,0,0)
LinearBackgroundSubtraction=findpeaksb(x,y,.00005,.5,30,20,3,150,1,0,1)


findpeaksb3.m is a variant of findpeaksb.m that fits each detected peak along with the previous and following peaks found by findpeaks.m, so as to deal better with overlap of the adjacent peaks. The syntax is function FPB=findpeaksb3(x,y, SlopeThreshold, AmpThreshold, smoothwidth, peakgroup, smoothtype, PeakShape, extra, NumTrials, AUTOZERO, ShowPlots). The demonstration script DemoFindPeaksb3.m shows how findpeaksb3 works with multiple overlapping Lorentzian peaks, as in the example on the left


The function findpeaksplot.m is a simple variant of findpeaks.m that also plots the x,y data and numbers the peaks on the graph (if any are found).  An example is shown on the right.
 
The function peakstats.m uses the same algorithm as findpeaks, but it computes and returns a table of summary statistics of the peak intervals (the x-axis interval between adjacent detected peaks), heights, widths, and areas, listing the maximum, minimum, average, and percent standard deviation of each, and optionally plotting the x,y data with numbered peaks in figure window 1, printing the table of peak statistics in the command window, and plotting the histograms of the peak intervals, heights, widths, and areas in the four quadrants of figure window 2. Type "help peakstats". The syntax is the same as findpeaks, with the addition of a 8th input argument to control the display and plotting. Example:

>> x=[0:.1:1000];y=5+5.*cos(x)+randn(size(x));PS=peakstats(x,y,0,-1,15,23,3,1);


Peak Summary Statistics
158 peaks detected

           Interval      Height      Width          Area
Maximum    6.6195       10.8991      4.7749       47.3653
Minimum    5.9086       9.1022       2.4123       26.3111
Mean       6.2833       10.0183      3.2346       34.4027
% STD      1.7362       3.2154       13.028       10.9314


With the last input argument omitted or equal to zero, the plotting and printing in the command window are omitted; the numerical values of the peak statistics table are returned as a 4x4 array, in the same order as the example above.

FindpeaksE.m is a variant of findpeaks.m that additionally estimates the percent relative fitting error of each peak (assuming a Gaussian peak shape) and returns it in the 6th column of the peak table.  Example:

>> x=[0:.01:5];y=x.*sin(x.^2).^2+.1*whitenoise(x);P=findpeaksE(x,y,.0001,1,15,10)
P =
            1       1.3175       1.3279      0.25511      0.36065       5.8404
            2       1.4245       1.2064      0.49053      0.62998       10.476
            3       2.1763       2.1516      0.65173       1.4929       3.7984
            4       2.8129       2.8811       0.2291      0.70272       2.3318
            5       3.2815       3.2034      0.62416       2.1286        2.981
            6       3.7631        3.872       0.1785      0.73581       3.0948
            7       4.1583       4.1711       0.1882      0.83572       2.4155
            8       4.5199       4.4808      0.18566      0.88566        2.001
            9       4.8532       4.7332      0.18627       0.9386      0.94514


Peak start and end.  Defining the "start" and "end" of the peak (the x-values where the peak begins and ends) is a bit arbitrary because typical peak shapes approach the baseline asymptotically far from the peak maximum. You might define the peak start and end points as the x values where the y value is some small fraction, say 1%, of the peak height, but then the random noise on the baseline is likely to be a large fraction of the signal amplitude at that point. Smoothing to reduce noise is likely to distort and broaden peaks, effectively changing their start and end points. One solution to this dilemma is to fit each peak to a model shape, then calculate the peak start and end from the model expression. That minimizes the noise problem by fitting the data over the entire peak, but it works only if the peaks can be accurately modeled by available fitting programs. For example, Gaussian peaks can be shown to reach a fraction a of the peak height at x=pħsqrt(w^2 log(1/a))/(2 sqrt(log(2))) where p is the peak position and w is the peak width.  So, for example if a=.01, x=pħw*sqrt((log(2)+log(5))/(2 log(2)))=1.288784*w. Lorentzian peaks can be shown to reach a fraction a of the peak height at x=pħsqrt[(w^2 - a w^2)/a]/2. If a=.01, x=pħ(3/2 sqrt(11)*w)=4.97493*w. The findpeaks variants findpeaksGSS.m and findpeaksLSS.m, for Gaussian and Lorentzian peaks respectively, compute the peak start and end positions in this manner and return them in the 6th and 7th columns of the peak table P. (For greater accuracy with overlapping peaks, use peakfit.m or the Interactive Peak Fitter (ipf.m) and calculate the start and end from the peak positions and width using the formulas above).

findpeaksfit.m is a findpeaks variant that is essentially a serial combination of findpeaks.m and peakfit.m. It uses the number of peaks found and the peak positions and widths determined by findpeaks as input for the peakfit.m function, which then fits the entire signal with the specified peak model. This combination yields better values that findpeaks alone, because peakfit fits the entire peak, not just the top part, and it deals with non-Gaussian and overlapped peaks. However, it fits only those peaks that are found by findpeaks. The syntax is

function [P,FitResults,LowestError,BestStart,xi,yi] =
findpeaksfit(x,y,SlopeThreshold,AmpThreshold,smoothwidth,peakgroup,smoothtype,peakshape,extra,NumTrials,baseline correction mode,fixedparameters,plot
s)

The first seven input arguments are exactly the same as for the findpeaks.m function; if you have been using findpeaks or iPeak to find and measure peaks in your signals, you can use those same input argument values for findpeaksfit.m. The remaining six input arguments of findpeaksfit.m are for the peakfit function; if you have been using peakfit.m or ipf.m to fit peaks in your signals, you can use those same input argument values for findpeaksfit.m. Type "help findpeaksfit" for more information. 

Note:
findpeaksfit.m differs from findpeaksb.m in that findpeaksfit.m fits all the found peaks at one time with a single multi-peak model, whereas findpeaksb.m fits each peak separately with a single-peak model.  As a result, findpeaksfit.m works better with a relatively small number of peak that overlap, whereas findpeaksb.m works better with a large number of isolated non-overlapping peaks .

findsteps.m Syntax: P=findsteps(x,y,SlopeThreshold,AmpThreshold,peakgroup) locates positive steps in noisy x-y time series data, by computing the first derivative of y that exceed SlopeThreshold, computes the step height as the difference between the maximum and minimum y values over a number of data point equal to "Peakgroup". Returns list (P) with step number, x position, and the step height of each step detected; "SlopeThreshold" and "AmpThreshold" control step sensitivity; higher values will neglect smaller features. See findsteps.png for a real example.


Rectangular pulses. Rectangular pulses (square waves) require a different approach, based on amplitude discrimination rather than differentiation. The function "findsquarepulse.m" (syntax S=findsquarepulse(t,y,threshold) locates the rectangular pulses in the signal t,y that exceed a y-value of "threshold" and determines their start time, average height (relative to the baseline) and width. DemoFindsquare.m creates a test signal and calls findsquarepulse.m to demonstrate. If the signal is very noisy, some preliminary rectangular smoothing (e.g. using fastsmooth.m) before calling findsquarepulse.m may be helpful to eliminate false peaks.

Using the peak table. All these peak finding functions return the peak table as a matrix, with one row for each peak detected and with several columns listing the peak number, position, height, width, and area in columns 1 - 5 (with additional columns included for the variants findpeaksnr.m, findpeaksGSS.m, and findpeaksLSS.m). You can assign this matrix to a variable (e.g. P, in the examples above) and then use Matlab/Octave functions to extract specific information from that matrix. For example:

[P(:,2) P(:,3)] is the time series of peak heights (peak position in the 1st column and peak height in the 2nd column)
mean(P(:,3)) returns the average peak height of all peaks (because peak height is in column 3);
max(P(:,3)) returns the maximum peak height;
hist(P(:,3)) displays the histogram of peak heights;
std(P(:,4))/mean(P(:,4)) returns the relative standard deviation of the peak widths (column 4)
for n=1:length(P)-1;d(n)=max(P(n+1,2)-P(n,2));end  creates "d" as the vector of x-axis distance between adjacent peaks (because peak position is in column 2).

Using my val2ind.m function (download this function and place in the Matlab path):
val2ind(P(:,3),7.5) returns the peak number of the peak whose height (column 3) is closest to 7.5;
P(val2ind(P(:,2),7.5),3) returns the peak height (column 3) of the peak whose position (column 2) is closest to 7.5;
and so on.

Demo scripts:  DemoFindPeak.m and DemoFindPeaksb.m

Click to view
                  enlarged figure
Click to view enlarged figure
DemoFindPeak.m is a simple demonstration script using the findpeaks function on noisy synthetic data. Numbers the peaks and prints out the peak table in the Matlab command window: 
          Peak #  Position    Height    Width     Area
Measuredpeaks =
1 799.95 6.9708 51.222 380.12
2 1199.4 3.9168 50.44 210.32
3 1600.6 2.9955 49.683 158.44
4 1800.4 2.0039 50.779 108.33

......
DemoFindPeakSNR is a variant of DemoFindPeak.m that uses findpeaksnr.m to compute the signal-to-noise ratio (SNR) of each peak and returns it in the 5th column.  
DemoFindPeaksb.m is a similar demonstration script that uses the findpeaksb function on noisy synthetic data consisting of several Gaussian peaks superimposed on a curved background.  (The findpeaks function would not give accurate measurements of peak height, width, and area for this signal, because it does not correct for the background).
 
     Peak #   Position   Height    Width    Area      Fitting error
Measuredpeaks =
        1     599.13     1.0059    49.533    53.044    9.6798
        2     799.44     1.0206    45.073    48.974    8.6772
        3    1000        3.9501    49.221    206.99    2.2302
        4    1200.8      1.9413    49.133    101.54    5.195
        5    1599.8      3.9979    49.138    209.14    2.4324
      .....

Peak Identification  

The command line function idpeaks.m is used for identifying peaks according to their x-axis maximum positions. The syntax is
[IdentifiedPeaks,AllPeaks]=idpeaks(DataMatrix,AmpT,SlopeT,SmoothWidth,FitWidth,maxerror,Positions,Names)
It finds peaks in the signal "DataMatrix" (x-values in column 1 and y-values in column 2), according to the peak detection parameters "AmpT", "SlopeT", "SmoothWidth", "FitWidth" (see the "findpeaks" function above), then compares the found peak positions (x-values) to a database of known peaks, in the form of an array of known peak maximum positions ('Positions') and matching cell array of names ('Names'). If the position of a peak found in the signal is closer to one of the known peaks by less than the specified maximum error ('maxerror'), that peak is considered a match and its peak position, name, error, and peak amplitude (height) are entered into the output cell array "IdentifiedPeaks". The full list of detected peaks, identified or not, is returned in "AllPeaks". Use "cell2mat" to access numeric elements of IdentifiedPeaks,e.g. cell2mat(IdentifiedPeaks(2,1)) returns the position of the first identified peak, cell2mat(IdentifiedPeaks(2,2)) returns its name, etc. Obviously for your own applications, it's up to you to provide your own array of known peak maximum positions ('Positions') and matching cell array of names ('Names') for your particular types of signals. The interactive iPeak function described in the next section has this function built in as one of the keystroke commands.

Example:  Download idpeaks.zip, extract it, and place the extracted files in the Matlab or Octave path. This contains a high-resolution atomic emission spectrum of copper ('spectrum') and a data table of known Cu I and II atomic lines ('DataTable') containing their positions and names.

>> load DataTable
>> load spectrum
>> idpeaks(Cu,0.01,.001,5,5,.01,Positions,Names)

ans=
'Position' 'Name' 'Error' 'Amplitude'
[ 221.02] 'Cu II 221.027' [ -0.0025773] [ 0.019536]
[ 221.46] 'Cu I 221.458' [ -0.0014301] [ 0.4615]
[ 221.56] 'Cu I 221.565' [-0.00093125] [ 0.13191]
.................... etc...

The Matlab iPeak Function (ipeak.m), Version 6  Download as a ZIP file with sample data.  Animated step-by-step instructions.

ipeak
Zoom control (up/down cursor arrows)

Gaussian or Lorentzian mode (Shift-G)

Jump to Next/Previous peak (Space/Tab keys)

Amplitude Threshold adjust (A/Z keys)
Click to enlarge
iPeak is a keyboard-operated interactive peak finder for time series data, based on the "findpeaks.m" function, for Matlab only. It accepts data in a single vector, a pair of vectors, or a matrix with the independent variable in the first column and the dependent variable in the second column:
 
Example 1:   One input argument; data in single vector
              
>> y=cos(.1:.1:100);ipeak(y) 


Example 2:   One input argument; data in two columns of a matrix
              
>> x=[0:.01:5]';y=x.*sin(x.^2).^2;M=[x y];ipeak(M)

                    
Example  3:   Two input arguments; data in separate x and y vectors
               >> x=[0:.1:100];y=(x.*sin(x)).^2;ipeak(x,y);

ipeak4

Example 4:   Additional input argument (after the data) to control peak sensitivity.
       >> x=[0:.1:100];y=5+5.*cos(x)+randn(size(x));ipeak(x,y,10);
   or  >> ipeak([x;y],10);
   or  >> ipeak(humps(0:.01:2),3)
   or  >> x=[0:.1:10];y=exp(-(x-5).^2);ipeak([x' y'],1)

The additional numeric argument 
is an estimate of maximum peak density (PeakD), the ratio of the typical peak width to the length of the entire data record. Small values detect fewer peaks; larger values detect more peaks. It effects only the starting values for the peak detection parameters. (This is just a quick way to set reasonable initial values of the peak detection parameters, rather than specifying each one individually).


Peaks in annual sunspot numbers from 1700 to 2009 (download the datafile). Sunspot data downloaded from NOAA: ftp://ftp.ngdc.noaa.gov/STP/SOLAR_DATA/SUNSPOT_NUMBERS/INTERNATIONAL/yearly/YEARLY.PLT

iPeak displays the entire signal in the lower half of the Figure window and an adjustable zoomed-in section in the upper window. Pan and zoom the portion in the upper window using the cursor arrow keys.  The peak closest to the center of the upper window is labeled in the upper left of the top window, and it peak position, height, and width are listed. The Spacebar/Tab keys jump to the next/previous detected peak and displays it in the upper window at the current zoom setting (use the up and down cursor arrow keys to adjust the zoom range). Or you can press the J key to jump to a specified peak number (Version 5.7 and later). 

Adjust the peak detection parameters AmpThreshold (A/Z keys), SlopeThreshold (S/X), SmoothWidth (D/C), FitWidth (F/V) so that it detects the desired peaks and ignores those that are too small, too broad, or too narrow to be of interest. You can also type in a specific value of AmpThreshold by pressing Shift-A or a specific value of SlopeThreshold by pressing Shift-S. Detected peaks are numbered from left to right.
Press P to display the peak table of all the detected peaks (Peak #, Position, Height, Width, Area, and percent fitting error):
    Gaussian shape mode (press Shift-G to change)
    Window span: 169 units
    Linear baseline subtraction
          Peak#     Position     Height      Width          Area        Error
            1       500.93       6.0585       34.446       222.17       9.5731
            2       767.75       1.8841       105.58       211.77       25.979
            3       1012.8      0.20158       35.914       7.7071       269.21
             .............
Press Shift-G to switch between Gaussian and Lorentzian shape modes. Press Shift-P to save peak table as disc file. Press U to switch between peak and valley mode.
Don't forget that only valleys above (that is, more positive or less negative than) the AmpThreshold are detected; if you wish to detect valleys that have negative minima, then AmpThreshold must be set more negative than that.  Note: to speed up the operation for signals over 100,000 points in length, the lower window is refreshed only when the number of detected peaks changes or if the Enter key is pressed. Press K to see all the keystroke commands.

Press U key to switch between peak and valley mode.

If the density of data points on the peaks is too low - less than about 4 points - the peaks may not be reliably detected; you can improve reliability by using the interpolation command (Shift-I) to re-sample the data by linear interpolation to a larger number of points.  Conversely, if
the density of data points on the peaks of interest is very high - say, more than 100 points per peak - then you can speed up the operation of iPeak by re-sampling to a smaller number of points.

Peak Summary Statistics. The E key prints a table of summary statistics of the peak intervals (the x-axis interval between adjacent detected peaks), heights, widths, and areas, listing the maximum, minimum, average, and percent standard deviation, and displaying the histograms of the peak intervals, heights, widths, and areas in figure windows 2 through 5.
 
Peak Summary Statistics
15 peaks detected
No baseline correction
         Interval   Height   Width      Area
Maximum   6.3795    10.5308   3.2354   34.943
Minimum   6.1649     9.7355   2.6671   29.9008
Mean      6.291     10.1559   3.0149   32.5771
% STD     0.91178    1.904    5.2584    4.3022


Example 5: Six input arguments. As above, but input arguments 3 to 6 directly specifies initial values of AmpThreshold (AmpT), SlopeThreshold (SlopeT), SmoothWidth (SmoothW), FitWidth (FitW). PeakD is ignored in this case, so just type a '0' as the second argument after the data matrix).
                 >> ipeak(datamatrix,0,.5,.0001,20,20);


Pressing 'L' toggles ON and OFF the peak labels in the upper window.

Keystrokes allow you to pan and zoom the upper window, to inspect each peak in detail if desired.  You can set the initial values of pan and zoom in optional input arguments 7 ('xcenter') and 8 ('xrange').  See example 6 below.

The Y key toggles between linear and log y-axis scale in the lower window (a log axis is good for inspecting signals with high dynamic range; it effects only the lower window display and has no effect on the peak detection or measurements). 


Log scale (Y key) and background correction mode (T key)

Example 6: Eight input arguments. As above, but input arguments 7 and 8 specify the initial pan and zoom settings, 'xcenter' and 'xrange', respectively.  In this example, the x-axis data are wavelengths in nanometers (nm), and the upper window zooms in on a very small 0.4 nm region centered on 249.7 nm. (These data, provided in the ZIP file, are from a high-resolution atomic spectrum).

        >> load ipeakdata.mat
       
>> ipeak(Sample1,0,100,0.05,3,4,249.7,0.4);

Baseline correction
modes. The T key cycles the baseline correction mode from off, to linear, to quadratic, to flat mode, then back to off. The current mode is displayed above the upper panel. When the baseline correction mode is OFF, peak heights are measured relative to zero. (Use this mode when the baseline is zero or if you have previously subtracted the baseline from the entire signal using the B key). In the linear or quadratic modes, peak heights are automatically measured relative to the local baseline interpolated from the points at the ends of the segment displayed in the upper panel; use the zoom controls to isolate a group of peaks so that the signal returns to the local baseline at the beginning and end of the segment displayed in the upper window. The peak heights, widths, and areas in the peak table (R or P keys) will be automatically corrected for the baseline. The linear or quadratic modes will work best if the peaks are well separated so that the signal returns to the local baseline between the peaks.  (If the peaks are highly overlapped, or if they are not Gaussian in shape, the best results will be obtained by using the curve fitting function - the N or M keys. The flat mode is used only for curve fitting function, to account for a flat baseline offset without reference to the edges of the signal segment being fit).
Example 7:  Nine
input arguments. As example 6, but the 9th input argument sets the background correction mode (equivalent to pressing the T key)' 0=OFF; 1=linear; 2=quadratic, 3=flat.  If not specified, it is initially OFF.

       
>> ipeak(Sample1,0,100,0.00,3,4,249.7,0.4,1);

Converting to command-line functions.  To aid in writing your own scripts and function to automate processing, the 'Q' key prints out the findpeaks, findpeaksb, and findpeaksfit commands for the segment of the signal in the upper window and for the entire signal, with most or all of the input arguments in place, so you can Copy and Paste into your own scripts. The 'W' key similarly prints out the peakfit and ipf commands.

Ensemble averaging. For signals that contain repetitive waveform patterns occurring in one continuous signal, with nominally the same shape except for noise, the ensemble averaging function (Shift-E) can compute the average of all the repeating waveforms. It works by detecting a single peak in each repeat waveform in order to synchronize the repeats (and therefore does not require that the repeats be equally spaced or synchronized to an external reference signal). To use this function, first adjust the peak detection controls to detect only one peak in each repeat pattern, then zoom in to isolate any one of those repeat patterns, and then press Shift-E. The average waveform is displayed in Figure 2 and saved as “EnsembleAverage.mat” in the current directory. See iPeakEnsembleAverageDemo.m.

Normal and Multiple Peak fitting: The N key applies iterative curve fitting to the detected peaks that are displayed in the upper window (referred to here as "Normal" curve fitting). The use of the iterative least-squares function can result in more accurate peak parameter measurements than the normal peak table (R or P keys), especially if the peaks are non-Gaussian in shape or are highly overlapped. (If the peaks are superimposed on a background, select the baseline correction mode using the T key, then use the pan and zoom keys to select a peak or a group of overlapping peaks in the upper window, with the signal returning all the way to the local baseline at the ends of the upper window if you are using the linear or quadratic baseline modes).  Make sure that the AmpThreshold, SlopeThreshold, SmoothWidth are adjusted so that each peak is numbered once. Only numbered peaks are fit. Then press the N key, type a number for the desired peak shape at the prompt in the Command window and press Enter (1=Gaussian (default), 2=Lorentzian, 3=logistic, 4=Pearson, 5=exponentially broadened Gaussian; 6=equal-width Gaussians; 7=Equal-width Lorentzians; 8=exponentially broadened equal-width Gaussian, 9=exponential pulse, 10=sigmoid, 11=Fixed-width Gaussian, 12=Fixed-width Lorentzian; 13=Gaussian/Lorentzian blend; 14=bifurcated Gaussian, 15=bifurcated Lorentzian, 16=Fixed-position Gaussians; 17=Fixed-position Lorentzians; 18=exponentially broadened Lorentzian; 19=alpha function; 20=Voigt profile), then type in a number of repeat trial fits and press Enter (the default is 1; start with that and then increase if necessary).  If you have selected a variable-shape peak (numbers 4, 5, 8 ,13, 14, 15, 18, or 20), the program will ask you to type in a number that controls the shape ("extra" in the peakfit input arguments). The program will then perform the fit, display the results graphically in Figure window 2, and print out a table of results in the command window, e.g.:

Peak shape (1-8): 2
Number of trials: 1

Least-squares fit to Lorentzian peak model
Fitting Error 1.1581e-006%
  Peak#   Position   Height   Width   Area 
    1        100        1       50    71.652
    2        350        1      100    146.13
    3        700        1      200    267.77


Normal Peak Fit (N key) applied to a group of six overlapping Gaussians peaks

There is also a "Multiple" peak fit function (M key) that will attempt to apply
iterative curve fitting to all the detected peaks in the signal simultaneously. Before using this function, it's best to turn off the automatic baseline correction (T key) and use the multi-segment baseline correction function (B key) to remove the background (because the baseline correction function will probably not be able to subtract the baseline from the entire signal). Then press M and proceed as for the normal curve fit. A multiple curve fit may take a minute or so to complete if the number of peaks is large, possibly longer than the Normal curve fitting function on each group of peaks separately. 

The N and M key fitting functions perform non-linear iterative curve fitting using the peakfit.m function. The number of peaks and the starting values of peak positions and widths for the curve fit function are automatically supplied by the the findpeaks function, so it is essential that the peak detection variables in iPeak be adjust so that all the peaks in the selected region are detected and numbered once.  (For more flexible curve fitting, use ipf.m, which allows manual optimization of peak groupings and start positions). 

Example 8. 
This example generates four Gaussian peaks, all with the exact same peak height (1.00) and area (1.773). The first peak (at x=4) is isolated, the second peak (x=9) is slightly overlapped with the third one, and the last two peaks (at x= 13 and 15) are strongly overlapped.

    >> x=[0:.01:20];
   
>> y=exp(-(x-4).^2)+exp(-(x-9).^2)+exp(-(x-13).^2)+exp(-(x-15).^2);
   
>> ipeak(x,y)


By itself, iPeak does a fairly good just of measuring peaks positions and heights by fitting just the top part of the peaks,
because the peaks are Gaussian, but the areas and widths of the last two peaks (which should be 1.665 like the others) are quite a bit too large because of the overlap:
          Peak#     Position     Height      Width          Area
            1            4            1       1.6651       1.7727
            2            9            1       1.6651       1.7727
            3       13.049       1.0198       1.8381       1.9956
            4       14.951       1.0198       1.8381       1.9956

In this case, curve fitting (using the N or M keys) does a much better job,
even if the overlap is even greater, but only if the peak shape is known:  
          Peak#     Position     Height      Width         Area 
            1            4            1       1.6651       1.7724
            2            9            1       1.6651       1.7725
            3           13            1       1.6651       1.7725
            4           15      0.99999       1.6651       1.7724

Note 1
: If the peaks are too overlapped to be detected and numbered separately, try pressing the H key to activate the sharpen function before pressing M (version 4.0 and above only).
This does not effect the signal itself, only the peak detection.
Note 2
If you plan to use a variable-shape peak (numbers 4, 5, 8 ,13, 14, 15, 18, or 20) for the Multiple peak fit, it's a good idea to obtain a reasonable value for the requested "extra" shape parameter by performing a Normal peak fit on an isolated single peak (or small group of partly-overlapping peaks) of the same shape, then use that value for the Multiple curve fit of the entire signal.
Note 3: Although the peak widths can vary from peak to peak, the curve fit routines assume that the peak shape is the same for all peaks in one fitting operation, so if the peak shape varies across the signal, use the Normal peak fit to fit each section with a different shape rather than the Multiple peak fit.

Peak identification. There is an optional "peak identification" function if optional input arguments 9
('MaxError'), 10 ('Positions'), and 11 ('Names') are included. The "I" key toggles this function ON and OFF. This function compares the found peak positions (maximum x-values) to a reference database of known peaks, in the form of an array of known peak maximum positions ('Positions') and matching cell array of names ('Names'). If the position of a found peak in the signal is closer to one of the known peaks by less than the specified maximum error ('MaxError'), then that peak is considered a match and its name is displayed next to the peak in the upper window. When when the 'O' key is pressed (the letter 'O'), the peak positions, names, errors, and amplitudes are printed out in a table in the command window.
 
Example 9: Eleven input arguments. As above, but also specifies 'MaxError', 'Positions', and 'Names' in optional input arguments 9, 10, and 11, for peak identification function. Pressing the 'I' key toggles off and on the peak identification labels in the upper window.These data (provided in the ZIP file) are from a high-resolution atomic spectrum (x-axis in nanometers).

       
>> load ipeakdata.mat
       
>> ipeak(Sample1,0,100,0.05,3,6,296,5,0.1,Positions,Names);

The peak identification function applied to a high-resolution atomic spectrum.


Three peaks near 296 nm isolated and identified. Press the I key to display the peak ID names.
Pressing "O" prints the peak positions, names, errors, and amplitudes in a table in the command window.
    Name            Position    Error        Amplitude
    'Mg I 295.2'     [295.2]    [0.058545]    [129.27]
    'Cu 296.1 nm'    [296.1]    [0.045368]    [124.6]
    'Hg 297.6 nm'    [297.6]    [0.023142]    [143.95]

Here is another example, from a large atomic emission spectrum with over 10,000 data points and many hundreds of peaks. The reference table of known peaks in this case is taken from Table 1 of  ASTM C1301 - 95(2009)e1. With the settings I was using, 10 peaks were identified, shown in the table below. You can see that some of these elements have more than one line identified. Obviously, the lower the settings of the AmpThreshold, SlopeThreshold, and SmoothWidth, the more peaks will be detected; and the higher the setting of "MaxError", the more peaks will be close enough to be considered identified. In this example, the element names in the table below are hot-linked to the screen image of the corresponding peak detected as identified by iPeak. Some of these lines, especially Nickel 231.66nm, Silicon 288.18nm, and Iron 260.1nm, are rather weak and have poor signal-to-noise ratios, so their identification might be in doubt (especially Iron, since its wavelength error is greater than the rest). It's up to the experimenter to decide which peaks are strong enough to be significant. In this example, I used an independently published table of element wavelengths, rather than data acquired on that same instrument, which depends on the accurate wavelength calibration of the instrument; in fact, the wavelength calibration is likely excellent, based on the very small error for the two well-known and relatively strong Sodium lines at 589 and 589.59 nm.  (Even so, I set MaxError to 0.2 nm in this example to loosen up the wavelength matching requirements). 

'Name'        'Position'     'Error'      'Amplitude'
'Cadmium'     [  226.46]   [-0.039471]   [   44.603]
'Nickel'      [  231.66]   [ 0.055051]   [   26.381]
'Silicon'     [  251.65]   [ 0.041616]   [   45.275]
'Iron'        [   260.1]   [    0.156]   [    38.04]
'Silicon'     [  288.18]   [ 0.022458]   [   27.214]
'Strontium'   [  421.48]   [-0.068412]   [   41.119]
'Barium'      [  493.35]   [-0.057923]   [   72.466]
'Sodium'      [     589]   [0.0057964]   [   405.23]
'Sodium'      [  589.57]   [-0.015091]   [    315.2]
'Potassium'   [  766.54]   [ 0.051585]   [   61.987]

Note: The ZIP file contains the latest version of the iPeak function as well as some sample data to demonstrate peak identification (Example 8).  Obviously for your own applications, it's up to you to provide your own array of known peak maximum positions ('Positions') and matching cell array of names ('Names') for your particular types of signals.

Keyboard Controls (version 6.01): 
 
 Pan signal left and right...Coarse pan: < or >                            
                             Fine pan: left or right cursor arrow keys

                             Nudge
one point left or right: [ and ]

 Zoom in and out.............Coarse zoom: / or '  
                             Fine zoom: up or down cursor arrow keys
 Resets pan and zoom.........ESC
 
Select entire signal........Ctrl-A
 Refresh entire plot.........Enter (Updates cursor position in lower plot)
 Change plot color...........Shift-C (cycles through standard colors)
 Adjust AmpThreshold.........A,Z  (Larger values ignore short peaks)
 Type in AmpThreshold........Shift-A  (Type value and press Enter)
 Adjust SlopeThreshold.......S,X  (Larger values ignore broad peaks)
 Type in SlopeThreshold......Shift-S  (Type value and press Enter)
 Adjust SmoothWidth..........D,C  (Larger values ignore sharp peaks}
 Adjust FitWidth.............F,V  (Adjust to cover just top part of peaks)
 Toggle sharpen mode ........H  Helps detect overlapped peaks.
 Baseline correction.........B, then click baseline at multiple points
 Restore original signal.....G  to cancel previous background subtraction
 Invert signal...............-  Invert (negate) the signal (flip + and -)
 Set minimum to zero.........0 (Zero) Sets minimum signal to zero
 
Interpolate signal..........Shift-I  Interpolate (re-sample) to N points
 Toggle log y mode OFF/ON....Y 
Plot log Y axis on lower graph
 Cycles baseline modes.......T 0=none; 1=linear; 2=quadratic; 3=Flat.
 Toggle valley mode OFF/ON...U  Switch to valley mode
 Gaussian/Lorentzian switch..Shift-G  Switch between Gaussian/Lorentzian mode
 Print peak table............P  Prints Peak #, Position, Height, Width
 Save peak table.............Shift-P  Saves peak table as disc file
 Step through peaks..........Space/Tab  Jumps to next/previous peak
 
Jump to peak number.........J  Type peak number and press Enter
 Normal peak fit.............N  Fit peaks in upper window with peakfit.m
 Multiple peak fit...........M  Fit all peaks in signal with peakfit.m
 Ensemble average all peaks..Shift-E  (Read instructions first)
 Print keyboard commands.....K  Prints this list
 Print findpeaks arguments...Q  Prints findpeaks function with arguments.
 Print ipeak arguments.......W  Prints ipeak function with all arguments.
 Print report................R  Prints Peak table and parameters
 Print peak statistics.......E  prints mean, std of peak intervals, heights, etc.
 Peak labels ON/OFF......... L  Label all peaks detected in upper window.
 Peak ID ON/OFF..............I  Identifies closest peaks in 'Names' database.
 Print peak IDs..............O  Prints table of peaks IDs

Animated step-by-step instructions

iPeak Demo functions

demoipeak.m is a simple demo function that generates a noisy signal with peaks, calls iPeak, and then prints out a table of the actual peak parameters and a list of the peaks detected and measured by iPeak for comparison. Before running this demo, ipeak.m must be downloaded and placed in the Matlab path. The ZIP file at http://terpconnect.umd.edu/~toh/spectrum/ipeak6.zip contains several demo functions (ipeakdemo.m, ipeakdemo1.m, etc) that illustrate various aspects of the iPeak function and how it can be used effectively. Download the zip file, right-click and select "Extract all", then put the resulting files in the Matlab path and run them by typing their names at the Matlab command window prompt. To test for the proper installation and operation of iPeak, run testipeak.


ipeakdemo: effect of the peak detection parameters

Four Gaussian peaks with the same heights but different widths (10, 30, 50 and 70 units) This demonstrates the effect of SlopeThreshold and SmoothWidth on peak detection. Increasing SlopeThreshold (S key) will discriminate against the broader peaks. Increasing SmoothWidth (D key) will discriminate against the narrower peaks and noise. FitWidth (F/V keys) controls the number of points around the "top part" of the (unsmoothed) peak that are taken to estimate the peak heights, positions, and widths. A reasonable value is ordinarily about equal to 1/2 of the number of data points in the half-width of the peaks. In this case, where the peak widths are quite different, set it to about 1/2 of the number of data points in the narrowest peak.
ipeakdemo1: the baseline correction mode

Demonstration of background correction, for separated, narrow peaks on a large baseline. Each time you run this demo, you will get a different set of peaks and noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. Jump to the next/previous peaks using the Spacebar/Tab keys.

Hint: Set the linear baseline correction mode (T key), adjust the zoom setting so that the peaks are shown one at a time in the upper window, then press the P key to display the peak table. 

ipeakdemo2: peak overlap and the curve fitting functions.

Demonstration of error caused by overlapping peaks on a large offset baseline. Each time you run this demo, you will get a different set of peaks and noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. (Jump to the next/previous peaks using the Spacebar/Tab keys).

Hint: Use the B key and click on the baseline points, then press the P key to display the peak table. Or turn on the background correction mode (T key) and use the Normal curve fit (N key) with peak shape 1 (Gaussian). 

ipeakdemo3 Baseline shift caused by overlapping peaks

Demonstration of overlapping Lorentzian peaks, without an added background. A table of the actual peak positions, heights, widths, and areas is printed out in the command window; in this example, the true peak heights are 1,2 3,...10. Overlap of peaks can cause significant errors in measuring peak parameters, especially for Lorentzian peaks, because they have gently sloping sides that contribute to the baseline of any peaks in the region.

Hint: turn OFF the background correction mode (T key) and use the Normal curve fit (N key) to fit small groups of 2-5 peaks numbered in the upper window, with peak shape 2 (Lorentzian). For the greatest accuracy in measuring a particular peak, include one or two additional peaks on either side, to help account for the baseline shift caused by the sides of those neighboring peaks. Alternatively, if the total number of peaks is not too great, you can use the Multiple curve fit (M key) to fit the entire signal in the lower window.

ipeakdemo4: dealing with very noisy signals

Detection and measurement of four peaks in a very noisy signal. The signal-to-noise ratio of first peak is 2. Each time you run this demo, you will get a different set of noise. A table of the actual peak positions, heights, widths, and areas is printed out in the command window. Jump to the next/previous peaks using the Spacebar/Tab keys. The peak at x=100 is usually detected, but the accuracy of peak parameter measurement is poor because of the low signal-to-noise ratio. ipeakdemo6 is similar but has the option of different kinds on noise (white, pink, proportional, etc).

Hint: With very noisy signals it is usually best to increase SmoothWidth and FitWidth to help reduce the effect of the noise.

ipeakdemo5: dealing with highly overlapped peaks

In this demo the peaks are so highly overlapped that only one or two of the highest peaks yield distinct maxima that are detected by iPeak. The height, width, and area estimates are highly inaccurate because of this overlap. The normal peak fit function ('N' key) would be useful in this case, but it depends on iPeak for the number of peaks and for the initial guesses, and so it would fit only the peaks that were found and numbered.

To help in this case, pressing the 'H' key will activate the peak sharpen function that decreases peak width and increases peak height of all the peaks, making it easier to detect and number all the peaks for use by the peakfit function (N key). Note: peakfit fits the original unmodified peaks; the sharpening is used only to help locate the peaks to provide peakfit with suitable starting values..

Which to use: iPeak, ipf, or peakfit?  Download these Matlab demos that compare iPeak.m with ipf.m and peakfit.m for signals with a few peaks and signals with many peaks and that shows how to adjust iPeak to detect broad or narrow peaks. These are self-contained demos that include all required Matlab functions. Just place them in your path and click Run or type their name at the command prompt. Or you can download all these demos together in idemos.zip. For peak fitting, the N-key and M-key functions of iPeak have the advantage of using the peak positions and widths determined by the automatic peak finder routine as the first-guess vector for peakfit.m, often resulting in faster and more robust fits that ipf.m or peakfit.m themselves with default first guesses. Try fitting the x,y data in ipeakexampledata.mat for some examples of using iPeak for fitting overlapping peaks. On the other hand, ipf.m is better for fitting peaks that don't make distinct maxima and thus are not detected detected by the automatic peak finder routine.


Peak Finding and Measurement Spreadsheet

The spreadsheet pictured above implements the "findpeaks" peak finding and measurement method in a spreadsheet.  The input x,y data are contained in Sheet1, column A and B, rows 8 to 263. You can Copy and Paste your own data there. The amplitude threshold and slope threshold are set in cells B3 and E3, respectively. 

Smoothing and differentiation are performed by the convolution technique used by the spreadsheets DerivativeSmoothing.xls described previously. The Smooth Width and the Fit Width are both controlled by the number of non-zero convolution coefficients in row 5, columns J through Z. (In order to compute a symmetrical first derivative, the coefficients in columns J to Q must be the negatives of  the positive coefficients in columns S to Z). The original data and the smoothed derivative are shown in the two charts on Sheet1.

To detect peaks in the data, a series of three conditions are tested for each data point in columns F, G, and H, corresponding to the three nested loops in findpeaks.m:  

  1. Is the signal greater than Amplitude Threshold? (line 45 of findpeaks.m; column F in the spreadsheet)
  2. Is there a downward directed zero crossing in the smoothed first derivative? (line 43 of findpeaks.m; column G in the spreadsheet)
  3. Is the slope of the derivative at that point greater than the Slope Threshold? (line 44 of findpeaks.m; column H in the spreadsheet)

If the answer to all three questions is yes (highlighted by blue cell coloring), a peak is registered at that point (column I), counted in column J, and assigned an index number in column K.  

For the first 10 peaks found, the x,y original unsmoothed data are copied to Sheets 2 through 11, respectively, where that segment of data is subjected to a Gaussian least-squares fit, using the same technique as GaussianLeastSquares.xls.  The best-fit Gaussian parameter results are copied back to Sheet1, in the table in columns AH-AK.  (In its present form. the spreadsheet is limited to measuring 10 peaks, although it can detect any number of peaks.  Also it is limited in Smooth Width and Fit Width by the 17-point convolution coefficients).  

The spreadsheet is available in OpenOffice (ods) and in Excel (xls) and (xlsx) formats. They are functionally equivalent and differ only in minor cosmetic aspects. An example spreadsheet, with data, is available.  A demo version, with a calculated noisy waveform that you can modify, is also available.  Note: To enter data into the .xls and .xlsx versions, click the "Enable Editing" button in the yellow bar at the top.

To expand this spreadsheet to larger numbers of data points, simply drag down row 263, columns A through K, so that the formulas in those rows are replicated for the required number of additional rows, then adjust the charts to accommodate the extra rows. Expanding the spreadsheet to larger numbers of measured peaks is more difficult. You'll have to drag down row 17, columns AC through AK, and adjust the formulas in those rows for the required number of additional peaks, then copy all of Sheet11 and paste it into a series of new sheets (Sheet12, Sheet13, etc), one for each additional peak, and finally adjust the formulas in columns B and C in each of these additional sheets to refer to the appropriate row in Sheet1. Modify these additional equations in the same pattern as those for peaks 1-10.

A comparison of this spreadsheet to its Matlab/Octave equivalent findpeaksplot.m is instructive. On the positive side, the spreadsheet literally "spreads out" the data and the calculations spatially over a large number of cells and sheets, breaking down the discrete steps in a very graphic way.  In particular, the use of conditional formatting in columns F through K make the peak detection decision process more evident for each peak, and the least-squares sheets 2 through 11 lay out every detail of those calculations. Spreadsheet programs have many flexible and easy-to-use formatting options to make displays more attractive. On the down side, a spreadsheet as complicated as this one is far more difficult to construct than its Matlab/Octave equivalent. Much more serious, the spreadsheet is less flexible and harder to expand to larger signals and larger number of peaks. In contrast, the Matlab/Octave equivalent, while requiring some understanding of programming, is faster, much more flexible, and can easily handle signals and smooth/fit widths of any size. Moreover, because it is written as a Matlab/Octave function, it can be readily employed as an element in your own custom Matlab/Octave programs to perform even larger tasks.









Obsolete version: FindPeakSliders, for Matlab 6.5 

This is an older version that used mouse-controlled sliders for control.  However, as the program grew and more functions were added, it soon turned out that the slider version became impractical because the screen became cluttered more and more screen space would have to have been devoted to controls display, reducing the screen space available for plotting the all-important data.  Moreover, the slider function used does not work reliably in more recent versions of Matlab. Development was halted in 2008 and it therefore lacks all the important newer features of iPeak.  

Click to view enlarged figure
Chromatogram, before baseline correction

Click to view enlarged
                  figure

After baseline correction

Interactive findpeaks script using mouse-controlled sliders for interactive control. Requires Matlab 6.5. This can be used to determine what values of the parameters give the most reliable peak detection. Load your data set into the vectors x and y (x = independent variable, y = dependent variable), then run this m-file and adjust the sliders to change the four arguments of the findpeaks function. The four sliders correspond to the arguments of the findpeaks function described above: SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth (Smooth), and FitWidth (Fit). The range of these sliders is easily changed in lines 65 - 70 to suit a wide range of data types. The BG button is used for baseline (background) subtraction: click once on the BG button, then click on the baseline at five points starting to the left of the lowest x-value and ending to the right of the highest x-value. The background will be subtracted. You can repeat as needed. (You can also change the number of baseline points by changing BaselinePoints in the function BG.m). Peak number and the estimated position, height, and width of each peak is returned in the matrix P:
>> P
P =
1.0000 45.0095 0.1328 4.0102
2.0000 90.5301 0.3655 4.7127
3.0000 94.9673 1.4075 4.0967
......

DemoFindPeakSliders

Click to view
                enlarged figure
Click to view enlarged figure
A self-contained interactive demonstration of FindPeakSliders applied to noisy synthetic data set consisting of a random number of narrow peaks superimposed on a gently curved background. Requires Matlab 6.5. Use the sliders to explore the effect of the variables SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth (Smooth), and FitWidth (Fit), and the baseline correct (BG) button. Peak number and the estimated position, height, and width of each peak is returned in the matrix P.

DemoFindPeakSliders2


Click to view enlarged figure
A self-contained interactive demonstration of FindPeakSliders applied to a data set containing four simple peaks with increasing peak height and peak width. Use this to understand the difference between the variables SlopeThreshold (SlopeT), which discriminates on the basis of peak width, and AmpThreshold (AmpT), which discriminates on the basis of peak amplitude. Peak number and the estimated position, height, and width of each peak is returned in the matrix P.
>> P
P =
1.0000 49.9096 1.9895 20.1960
2.0000 126.1034 3.0585 42.3508
3.0000 201.3947 4.0752 63.3354
4.0000 299.9242 4.9925 82.2000


FindPeakSlidersG

Click to view
                enlarged figure
Click to view enlarged figure
Click to view
                enlarged figure
Click to view enlarged figure

Interactive findpeaks script with peak-zoom feature, for predefined data in x,y. Requires Matlab 6.5. Load a typical data set into the vectors x,y, then run this m-file and adjust the sliders to determine what values of the parameters give the most reliable peak detection. Peak number and position, height, and width of each peak is returned in the matrix P. This version has an additional slider, Peak, which zooms in on any one of the detected peaks and shows the fitted top of the peak (FitWidth) as a blue curve. To see all the peaks, set the Peaks slider to zero (all the way down).
Note: If the slider ranges are not appropriate for your data, change them in lines 63-69.

DemoFindPeakSlidersG

A self-contained interactive demonstration of FindPeakSlidersG. Requires Matlab 6.5.

Video Demonstration of FindPeakSliders

This is a 26-second, 1.4 MByte video (PeakFind2.wmv) of FindPeakSliders (version 1.1), showing the effect of varying the four parameters SlopeThreshold (SlopeT), AmpThreshold (AmpT), SmoothWidth, and FitWidth (labeled "PeakWidth" in this video).

Tom O'Haver
Professor Emeritus
Department of Chemistry and Biochemistry
The University of Maryland at College Park
toh@umd.edu
http://terpconnect.umd.edu/~toh/

Last updated, July, 2014

Website created with SeaMonkey.


index previous next

Number of unique visits since May 17, 2008:

counter create hit