This simulation
explores the problem of measuring the height of a small peak
(also called a "child peak") that is buried in the tail of a
much stronger overlapping peak (a "parent peak"),
in the especially challenging case in which the smaller peak is not even visible to the unaided eye. Three different measurement approaches will be explored: iterative least-squares curve fitting, classical least-squares regression, and peak detection, using the Matlab/Octave tools peakfit.m, cls.m, and findpeaksG.m, respectively. (Alternatively, you could use the corresponding spreadsheet templates).
In this example the larger peak is located at x=4 and has a height of 1.0 and a width of 1.66; the smaller measured peak is located at x=5 and has a height of 0.1 and the same width of 1.66. Of course, for the purposes of this simulation, we pretend that we don't necessarily know all of these facts, and we will try to find methods that extract as much of this information as possible from the data, even if the signal is noisy. The
measured peak is small enough and close enough to the stronger
overlapping peak (separated by less than the width of the peaks)
that it never forms a maximum
in the total signal. So it looks like there is only
one peak, as
shown in the figure on the right. For that reason, the findpeaksG.m function (which automatically locates peak maxima) will not, by itself, be useful for locating the smaller peak. Simpler approaches to detecting the second peak also fail to provide a way to measure it: inspecting the derivatives of the signal is inconclusive (the smoothed fourth derivative shows some evidence of asymmetry, but that could just as well be due to the shape of the larger peak), and Fourier self-deconvolution could in principle narrow the peaks enough to make them distinguishable, but it is unlikely to succeed with this much noise. Least-squares methods work better when the signal-to-noise ratio is poor, and they can be fine-tuned to make use of available information or constraints, as will be demonstrated below.
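To make the discussion concrete, here is a minimal Matlab/Octave sketch of how such a signal can be simulated (this is not the SmallPeaks.m script itself; the sampling interval and the noise level are assumptions, with a noise standard deviation of 0.02 chosen so that the small peak on its own would have a signal-to-noise ratio of about 5):

  % Minimal sketch of the simulated signal: two overlapping Gaussians of
  % equal width (FWHM = 1.66) plus white noise. Numerical values that are
  % not given in the text above are assumptions.
  x = 0:0.05:10;                                          % assumed sampling grid
  gaussian = @(x,pos,wid) exp(-((x-pos)./(0.60056120439323.*wid)).^2); % unit height, FWHM width
  y = 1.0.*gaussian(x,4,1.66) + 0.1.*gaussian(x,5,1.66);  % parent peak plus buried child peak
  y = y + 0.02.*randn(size(x));                           % assumed noise level
  plot(x,y)                                               % looks like a single peak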
The selection of the best method will depend on what is known about the signal and on the constraints that can be imposed, which in turn depends on your knowledge of your experimental system. In this simulation (performed by the Matlab/Octave script SmallPeaks.m), the signal is composed of two Gaussian peaks (although that can be changed if desired in line 26). The first question is: is there more than one peak here? If we perform an unconstrained iterative fit of a single Gaussian to the data, as shown in the figure on the right, it shows little or no evidence of a second peak; the residuals look essentially random. (If you could reduce the noise, or ensemble-average even as few as 10 repeat signals, the noise would be low enough to reveal graphical evidence of a second peak.) But as it is, there is nothing that suggests a second peak.
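As a hedged illustration, a single-Gaussian unconstrained fit to the simulated signal above might look like this (the argument list follows peakfit.m's documented call signature; check the help text of your version):

  % Fit one unconstrained Gaussian (shape 1) to the whole signal.
  % FitResults rows have the form [peak# position height width area];
  % the second output reports the goodness of fit.
  [FitResults,GOF] = peakfit([x' y'],0,0,1,1);   % 1 peak, peak shape 1 = Gaussian
  disp(FitResults)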
But suppose we suspect that there should
be another peak of the same Gaussian
shape just on the right side of the larger peak. We can try
fitting a pair of Gaussians to the data (figure on the left), but in
this case the random noise is enough that the fit is not stable.
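A sketch of the corresponding two-Gaussian fit, again assuming peakfit.m's standard argument order:

  % Fit two unconstrained Gaussians; with this much noise the fitted
  % parameters bounce around from one noise sample to the next.
  [FitResults2,GOF2] = peakfit([x' y'],0,0,2,1); % 2 peaks, shape 1 = Gaussian
  disp(FitResults2)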
When you run SmallPeaks.m, the script performs 20 repeat fits (NumSignals in line 20) with the same underlying peaks but with 20 different random noise samples, revealing the stability (or instability) of each measurement method. The fitted peaks in Figure window 1 bounce around all over the place as the script runs, as you can see in the animation on the left. The fitting error is on average lower than that of the single-Gaussian fit, but that by itself does not mean that the peak parameters so measured will be reliable; it could just be "fitting the noise". If it were isolated all by itself, the small peak would have an S/N ratio of about 5 and could be measured to a peak-height precision of about 3%, but the presence of the larger interfering peak makes the measurement much more difficult. (Hint: after running SmallPeaks.m the first time, spread out all the figure windows so they can all be seen separately and don't overlap. That way you can compare the stability of the different methods more easily.)
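The repeat-run comparison can be mimicked with a short loop like the following sketch, which regenerates the noise, refits, and averages the percent error of the small peak's measured height; the variable names and the way the small peak is picked out are illustrative, not the script's own code:

  % Repeat the simulated measurement NumSignals times and average the
  % percent error of the measured height of the smaller peak (true height = 0.1).
  NumSignals = 20;
  PercentError = zeros(1,NumSignals);
  for k = 1:NumSignals
      yk = 1.0.*gaussian(x,4,1.66) + 0.1.*gaussian(x,5,1.66) + 0.02.*randn(size(x));
      FR = peakfit([x' yk'],0,0,2,1);          % two unconstrained Gaussians
      [~,row] = max(FR(:,2));                  % the fitted peak at larger x is taken as the small one
      PercentError(k) = 100.*abs(FR(row,3)-0.1)./0.1;
  end
  mean(PercentError)                           % average percent error of the peak height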
But suppose we have reason to expect that the two peaks will have the same width, even though we don't know what that width might be. We could try an equal-width Gaussian fit (peak shape #6, shown in Matlab/Octave Figure window 2); the resulting fit is much more stable and shows that a small peak is located at about x=5 on the right side of the bigger peak, as shown below on the left. On the other hand, if we know the peak positions beforehand, but not the widths, we can use a fixed-position Gaussian fit (shape #16), shown on the right (Figure window 3). In the very common situation where the objective is to measure an unknown concentration of a known component, it's possible to prepare standard samples in which the concentration of the sought component is high enough for its position or width to be determined with certainty.
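Hedged sketches of these two constrained fits, assuming peakfit.m's documented argument order (signal, center, window, NumPeaks, peakshape, extra, NumTrials, start, autozero, fixedparameters), in which the 10th argument carries the fixed positions for shape 16; consult the peakfit.m help for the exact form in your version:

  % Equal-width Gaussian fit: shape 6 forces both peaks to share one width.
  EqualWidthResults = peakfit([x' y'],0,0,2,6);
  % Fixed-position Gaussian fit: shape 16 holds the positions at the values
  % given in the 10th argument ([4 5] here); widths and heights are fitted.
  FixedPosResults = peakfit([x' y'],0,0,2,16,0,1,0,0,[4 5]);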
So far all of these examples have used iterative
peak fitting with at least one peak parameter (position and/or
width) unknown and determined
by measurement. If, on the other hand, all
the peak parameters are known except the peak height,
then the faster and more direct classical least-squares regression
(CLS) can be employed (Figure window 4). In this case you need
to know the peak position and width of both the measured and the
larger interfering peaks (the computer will calculate their
heights). If the positions and widths really are constant and known, then this method gives the best stability and
precision of measurement. It's also computationally faster,
which might be important if you have lots of data to process
automatically.
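The essence of CLS can be written out in a few lines with the matrix backslash operator; this is a from-scratch sketch using the x, y, and gaussian defined in the earlier sketch, rather than a call to cls.m itself. The model matrix holds unit-height peaks at the known positions and widths, and the unknown heights drop out of a single non-iterative least-squares solution:

  % Classical least-squares: positions (4 and 5) and width (1.66) are known,
  % so only the two heights remain to be determined.
  A = [gaussian(x',4,1.66) gaussian(x',5,1.66)];  % model matrix of unit-height peaks
  heights = A\y';                                 % least-squares heights: [parent; child]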
The problem with CLS is that it fails to give accurate
measurements if the peak position and/or width changes without
warning, whereas two of the iterative methods (unconstrained
Gaussian and equal-width Gaussian fits) can adapt to such
changes. In some experiments it is quite common to have small
unexpected shifts in the peak position, especially in
chromatography or other flow-based measurements, caused by
unexpected changes in temperature, pressure, flow rate or other
instrumental factors. In SmallPeaks.m, such x-axis shifts can be
simulated using the variable "xshift" in line 18. It's initially
zero, but if you set it to something greater (e.g. 0.2) you'll
find that the equal-width Gaussian fit (Figure window 2) works
better because it can keep up with the x-axis shifts.
But with a greater x-axis shift (xshift=1.0), even the equal-width fit has trouble. Still, if we know the separation between the two peaks, it's possible to use the findpeaksG function to search for and locate the larger peak and then to calculate the position of the smaller one. The CLS method, with the peak positions so determined for each separate signal, then works better; it is shown in Figure window 5 and labeled "findpeaksP" in the table below. Another way to use the findpeaksG results is a variation of the equal-width iterative fitting method in which the first-guess peak positions (line 82) are derived from the findpeaksG results, shown in Figure window 6 and labeled "findpeaksP2" in the table below; that method does not depend on accurate knowledge of the peak widths, only on their equality.
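A hedged sketch of the findpeaksG-plus-CLS combination, reusing x, y, and gaussian from the earlier sketches; the detection-threshold settings are only illustrative, and findpeaksG's output rows are assumed to take the form [peak# position height width area]:

  % Locate the big peak maximum, infer the small peak's position from the
  % known separation of 1 x-unit, then measure both heights by CLS.
  P = findpeaksG(x,y,1e-6,0.5,15,15,3);    % SlopeThreshold, AmpThreshold, smoothwidth, peakgroup, smoothtype
  BigPosition = P(1,2);                    % measured position of the larger peak
  SmallPosition = BigPosition + 1;         % known peak separation
  A = [gaussian(x',BigPosition,1.66) gaussian(x',SmallPosition,1.66)];
  heights = A\y';                          % CLS heights using the measured positions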
Each time you run
SmallPeaks.m, all of these methods are
computed "NumSignals" times (set in
line 20) and compared in a table giving the average percent error of the measured peak height over all the repeat runs:
             Unconstr.   EqualW    FixedP    FixedP&W   findpeaksP   findpeaksP2
xshift=0      35.607     16.849     5.1375    4.4437      13.384       16.849
xshift=1      31.263     44.107    22.794    46.18        10.607       10.808
Bottom line. The more
you know about your signals, the better you can measure them. A
stable signal with known peak positions and widths is the most precisely
measurable in the presence of random noise ("FixedP&W"), but
if the positions or widths vary from measurement to measurement,
different methods must be used, and precision is degraded
because more of the available information is used to account for
changes other than the ones you want to measure.
This page is part of "A Pragmatic Introduction to Signal
Processing", created and maintained by Prof. Tom O'Haver ,
Department of Chemistry and Biochemistry, The University of
Maryland at College Park. Comments, suggestions and questions
should be directed to Prof. O'Haver at toh@umd.edu. Updated July, 2022.