Appendix Q: Measuring a buried peak

This simulation explores the problem of measuring the height of a small peak (also called a "child peak") that is buried in the tail of a much stronger overlapping peak (a "parent peak"), in the especially challenging case where the smaller peak is not even visible to the unaided eye. Three different measurement approaches will be explored: iterative least-squares, classical least-squares regression, and peak detection, using the Matlab/Octave tools peakfit.m, cls.m, and findpeaksG.m, respectively. (Alternatively, you could use the corresponding spreadsheet templates.)

In this example the larger peak is located at x=4 and has a height of 1.0 and a width of 1.66; the smaller measured peak is located at x=5 and has a height of 0.1 and the same width of 1.66. Of course, for the purposes of this simulation, we pretend that we don't necessarily know all of these facts, and we will try to find methods that extract as much of this information as possible from the data, even if the signal is noisy. The measured peak is small enough, and close enough to the stronger overlapping peak (separated by less than the width of the peaks), that it
never forms a distinct maximum in the total signal. It looks as though there is only one peak, as shown in the figure on the right. For that reason the findpeaks.m function (which automatically locates peak maxima) will not be useful by itself for finding the smaller peak. Other simple approaches also fail to provide a way to measure the smaller second peak: inspecting the derivatives of the signal (the smoothed fourth derivative shows some evidence of asymmetry, but that could be due merely to the shape of the larger peak), or Fourier self-deconvolution to narrow the peaks until they are distinguishable (unlikely to succeed with this much noise). Least-squares methods work better when the signal-to-noise ratio is poor, and they can be fine-tuned to make use of available information or constraints, as will be demonstrated below.
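The geometry of the problem is easy to reproduce. The following Python/NumPy sketch (a stand-in for the Matlab/Octave simulation; the 500-point x-axis and the 2% noise level are illustrative assumptions, not values taken from the script) builds the two Gaussians with the stated heights, positions, and widths, and confirms that the noiseless sum has only a single maximum:

```python
import numpy as np

def gauss(x, height, pos, width):
    """Gaussian peak; 'width' is the full width at half maximum (FWHM)."""
    return height * np.exp(-((x - pos) / (0.60056 * width)) ** 2)

x = np.linspace(0, 10, 500)
parent = gauss(x, 1.0, 4.0, 1.66)   # larger interfering peak
child = gauss(x, 0.1, 5.0, 1.66)    # small buried peak to be measured
clean = parent + child

rng = np.random.default_rng(0)
signal = clean + 0.02 * rng.standard_normal(x.size)  # assumed noise level

# The child peak never produces its own maximum in the noiseless sum:
n_maxima = int(np.sum(np.diff(np.sign(np.diff(clean))) < 0))
print(n_maxima)  # only one local maximum
```

Because the child sits less than one peak-width from the parent and is ten times smaller, the sum is merely a slightly asymmetric single bump.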
 
The selection of the best method will depend on what is known about the signal and the constraints that can be imposed; this will depend on your knowledge of your experimental signal. In this simulation (performed by the Matlab/Octave script SmallPeaks.m), the signal is composed of two Gaussian peaks (although that can be changed if desired in line 26). The first question is: is there more than one peak here? If we perform an unconstrained iterative fit of a
single Gaussian to the data, as shown in the figure on the right, it shows little or no evidence of a second peak - the residuals look pretty random. (If you could reduce the noise, or ensemble-average even as few as 10 repeat signals, then the noise would be low enough to see graphical evidence of a second peak). But as it is, there is nothing that suggests a second peak.
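A single-Gaussian fit of this kind can be sketched with SciPy's curve_fit standing in for peakfit.m (an assumption; the original uses Matlab/Octave, and the noise level here is illustrative). The residual standard deviation comes out close to the added noise level, which is why the residual plot offers no hint of the second peak:

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, height, pos, width):
    # Gaussian with FWHM 'width'
    return height * np.exp(-((x - pos) / (0.60056 * width)) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 500)
truth = gauss(x, 1.0, 4.0, 1.66) + gauss(x, 0.1, 5.0, 1.66)
signal = truth + 0.02 * rng.standard_normal(x.size)  # assumed 2% noise

# Unconstrained fit of a single Gaussian: height, position, and width all free.
popt, _ = curve_fit(gauss, x, signal, p0=[1, 4, 2])
residuals = signal - gauss(x, *popt)
print(popt, np.std(residuals))
```

The fitted position lands a little to the right of x=4, pulled slightly by the hidden child peak, but the residuals are dominated by the random noise.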

But suppose we suspect that there
should be another peak of the same Gaussian shape just on the right side of the larger peak. We can try fitting a pair of Gaussians to the data (figure on the left), but in this case the random noise is enough that the fit is not stable. When you run SmallPeaks.m, the script performs 20 repeat fits ("NumSignals" in line 20) with the same underlying peaks but with 20 different random noise samples, revealing the stability (or instability) of each measurement method. The fitted peaks in Figure window 1 bounce around all over the place as the script runs, as you can see in the animation on the left. The fitting error is on average lower than that of the single-Gaussian fit, but that by itself does not mean that the measured peak parameters are reliable; the fit could just be "fitting the noise". If it were isolated all by itself, the small peak would have a signal-to-noise ratio of about 5, and its height could be measured to a precision of about 3%, but the presence of the larger interfering peak makes the measurement much more difficult. (Hint: after running SmallPeaks.m the first time, spread out all the figure windows so they can all be seen separately and don't overlap. That way you can compare the stability of the different methods more easily.)
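The repeat-fit experiment can be imitated in Python (again with curve_fit standing in for peakfit.m; the 2% noise level and parameter bounds are assumptions): fit an unconstrained pair of Gaussians to 20 independent noise realizations and look at the scatter of the fitted child-peak heights:

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gauss(x, h1, p1, w1, h2, p2, w2):
    g = lambda h, p, w: h * np.exp(-((x - p) / (0.60056 * w)) ** 2)
    return g(h1, p1, w1) + g(h2, p2, w2)

x = np.linspace(0, 10, 500)
truth = two_gauss(x, 1.0, 4.0, 1.66, 0.1, 5.0, 1.66)
rng = np.random.default_rng(0)

heights = []                    # fitted child-peak heights, one per noise sample
for _ in range(20):             # 20 repeats, like NumSignals in the script
    signal = truth + 0.02 * rng.standard_normal(x.size)
    try:
        popt, _ = curve_fit(two_gauss, x, signal,
                            p0=[1, 4, 1.7, 0.1, 5, 1.7],
                            bounds=([0, 0, 0.1, 0, 0, 0.1],
                                    [2, 10, 5, 2, 10, 5]),
                            maxfev=20000)
        heights.append(popt[3])
    except RuntimeError:
        pass                    # an occasional non-converged fit is itself a symptom

if heights:
    print(np.std(heights) / 0.1)   # relative scatter of the child-height estimates
```

The run-to-run scatter of the six free parameters is what the bouncing curves in Figure window 1 show graphically.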


But suppose that we have reason to expect that the
two peaks will have the same width, but we don't know what that width might be. We could try an equal-width Gaussian fit (peak shape #6, shown in Matlab/Octave Figure window 2); the resulting fit is much more stable and shows that a small peak is located at about x=5, on the right side of the bigger peak, as shown below on the left. On the other hand, if we know the peak positions beforehand, but not the widths, we can use a fixed-position Gaussian fit (shape #16), shown on the right (Figure window 3). In the very common situation where the objective is to measure an unknown concentration of a known component, it's often possible to prepare standard samples in which the concentration of the sought component is high enough for its position or width to be determined with certainty.
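Both constrained fits are easy to express as custom model functions, since a constraint simply removes a parameter from the search. A sketch with curve_fit standing in for peakfit.m shapes 6 and 16 (an assumption, as is the 1% noise level):

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, h, p, w):
    return h * np.exp(-((x - p) / (0.60056 * w)) ** 2)

x = np.linspace(0, 10, 500)
rng = np.random.default_rng(0)
signal = gauss(x, 1.0, 4.0, 1.66) + gauss(x, 0.1, 5.0, 1.66) \
         + 0.01 * rng.standard_normal(x.size)          # assumed 1% noise

# Equal-width fit (like shape #6): both peaks share one width parameter.
def equal_width(x, h1, p1, h2, p2, w):
    return gauss(x, h1, p1, w) + gauss(x, h2, p2, w)

popt_ew, _ = curve_fit(equal_width, x, signal, p0=[1, 4, 0.1, 5, 2])

# Fixed-position fit (like shape #16): positions 4 and 5 known, widths free.
def fixed_pos(x, h1, w1, h2, w2):
    return gauss(x, h1, 4.0, w1) + gauss(x, h2, 5.0, w2)

popt_fp, _ = curve_fit(fixed_pos, x, signal, p0=[1, 2, 0.1, 2])

print(popt_ew[2:4], popt_fp[2])   # child height/position estimates
```

With one fewer (or two fewer) free parameters, the child-peak height stays near 0.1 from run to run instead of bouncing around.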


So far all of these examples have used iterative peak fitting with at least one peak parameter (position and/or width) unknown and determined by measurement. If, on the other hand, all the peak parameters are known except the peak heights, then the faster and more direct classical least-squares regression (CLS) can be employed (Figure window 4). In this case you need to know the peak position and width of both the measured peak and the larger interfering peak (the computer will calculate their heights). If the positions and widths really are constant and known, then this method gives the best stability and precision of measurement. It's also computationally faster, which might be important if you have lots of data to process automatically.
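Because the model is then linear in the unknown heights, CLS reduces to a single matrix solve. A minimal NumPy sketch standing in for cls.m (signal parameters as in the simulation; the noise level is assumed):

```python
import numpy as np

def gauss(x, pos, width):
    # unit-height Gaussian; 'width' is the FWHM
    return np.exp(-((x - pos) / (0.60056 * width)) ** 2)

x = np.linspace(0, 10, 500)
rng = np.random.default_rng(0)
signal = 1.0 * gauss(x, 4.0, 1.66) + 0.1 * gauss(x, 5.0, 1.66) \
         + 0.01 * rng.standard_normal(x.size)          # assumed 1% noise

# Classical least squares: each column of A is one known peak shape,
# so the unknown heights come from one linear least-squares solve.
A = np.column_stack([gauss(x, 4.0, 1.66), gauss(x, 5.0, 1.66)])
heights, *_ = np.linalg.lstsq(A, signal, rcond=None)
print(heights)   # close to the true heights [1.0, 0.1]
```

There is no iteration and no starting guess, which is why CLS is both faster and, when its assumptions hold, more precise.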

The problem with CLS is that it fails to give accurate measurements if the peak position and/or width changes without warning, whereas two of the iterative methods (the unconstrained and equal-width Gaussian fits) can adapt to such changes. In some experiments it is quite common to have small unexpected shifts in peak position, especially in chromatography or other flow-based measurements, caused by unexpected changes in temperature, pressure, flow rate, or other instrumental factors. In SmallPeaks.m, such x-axis shifts can be simulated using the variable "xshift" in line 18. It's initially zero, but if you set it to something greater (e.g. 0.2) you'll find that the equal-width Gaussian fit (Figure window 2) works better than CLS, because it can keep up with the changes in peak position.
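The sensitivity of CLS to an unmodeled shift can be shown directly: fit a noiselessly shifted signal with templates that stay put at x=4 and x=5. A NumPy sketch under the simulation's parameters (the shift values mirror the "xshift" variable):

```python
import numpy as np

def gauss(x, pos, width):
    return np.exp(-((x - pos) / (0.60056 * width)) ** 2)

x = np.linspace(0, 10, 500)
# CLS templates assume the peaks sit exactly at x=4 and x=5
A = np.column_stack([gauss(x, 4.0, 1.66), gauss(x, 5.0, 1.66)])

results = {}
for xshift in (0.0, 0.2):
    # noiseless signal whose peaks have drifted by xshift
    y = 1.0 * gauss(x, 4.0 + xshift, 1.66) + 0.1 * gauss(x, 5.0 + xshift, 1.66)
    heights, *_ = np.linalg.lstsq(A, y, rcond=None)
    results[xshift] = heights
    print(xshift, heights)
```

With xshift=0 the recovered heights are exact; with xshift=0.2 the child-peak height comes out roughly three times too large, because part of the drifted parent peak is misattributed to the child template.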

But with a greater x-axis shift (xshift=1.0), even the equal-width fit has trouble. Still, if we know the separation between the two peaks, it's possible to use the findpeaksG function to locate the larger peak and then calculate the position of the smaller one from the known separation. The CLS method, with the peak positions so determined for each separate signal, shown in Figure window 5 and labeled "findpeaksP" in the table below, then works better. Alternatively, the findpeaks results can be used in a variation of the equal-width iterative fitting method in which the first-guess peak positions (line 82) are derived from the findpeaks results, shown in Figure window 6 and labeled "findpeaksP2" in the table below; that method does not depend on accurate knowledge of the peak widths, only on their equality.
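The findpeaksP2 idea can be sketched with scipy.signal.find_peaks standing in for findpeaksG (an assumption; findpeaksG additionally refines the position by fitting the peak top), using xshift=1 so that the true peaks sit at x=5 and x=6. The detected parent position serves only as a first guess, so its exact accuracy does not matter:

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.optimize import curve_fit

def gauss(x, h, p, w):
    return h * np.exp(-((x - p) / (0.60056 * w)) ** 2)

x = np.linspace(0, 10, 500)
rng = np.random.default_rng(0)
xshift = 1.0                         # large unannounced drift, as in the text
signal = gauss(x, 1.0, 4.0 + xshift, 1.66) + gauss(x, 0.1, 5.0 + xshift, 1.66) \
         + 0.01 * rng.standard_normal(x.size)

# Locate the parent peak on a lightly smoothed copy; the child is assumed
# to sit at a known separation of +1 to its right.
smoothed = np.convolve(signal, np.ones(9) / 9, mode="same")
peaks, _ = find_peaks(smoothed, height=0.5)
parent_pos = x[peaks[np.argmax(smoothed[peaks])]]

# Use the detected positions only as FIRST GUESSES for an equal-width fit.
def equal_width(x, h1, p1, h2, p2, w):
    return gauss(x, h1, p1, w) + gauss(x, h2, p2, w)

p0 = [1, parent_pos, 0.1, parent_pos + 1.0, 2]
popt, _ = curve_fit(equal_width, x, signal, p0=p0)
print(popt[2], popt[3])   # child height and position estimates
```

Seeding the iterative fit this way lets it follow arbitrary x-axis drifts while keeping the stabilizing equal-width constraint.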

Each time you run SmallPeaks.m, all of these methods are computed "NumSignals" times (set in line 20) and compared in a table giving the average percent error in measured peak height over all the repeat runs:

xshift=0
Unconstr. EqualW  FixedP FixedP&W findpeaksP findpeaksP2
35.607    16.849  5.1375  4.4437   13.384     16.849

xshift=1
Unconstr. EqualW  FixedP FixedP&W findpeaksP findpeaksP2
31.263    44.107  22.794  46.18    10.607     10.808


Bottom line. The more you know about your signals, the better you can measure them. A stable signal with known peak positions and widths is the most precisely measurable in the presence of random noise ("FixedP&W"), but if the positions or widths vary from measurement to measurement, different methods must be used, and precision is degraded because more of the available information is used to account for changes other than the ones you want to measure.

This page is part of "A Pragmatic Introduction to Signal Processing", created and maintained by Prof. Tom O'Haver, Department of Chemistry and Biochemistry, The University of Maryland at College Park. Comments, suggestions and questions should be directed to Prof. O'Haver at toh@umd.edu. Updated July, 2022.