Appendix AG. Using real-signal modeling to determine measurement
accuracy
It's common to use
computer-generated signals whose characteristics are known
exactly to establish the accuracy of a proposed signal
processing method, analogous to the use of standards in
analytical chemistry.
But the problem with computer-generated signals is that they are
often too simple or too ideal: for example, a series of peaks that
are all equal in height and width, of some idealized shape such
as a pure Gaussian, with added random white noise.
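For readers who want to experiment, here is a minimal Python sketch of the kind of "too ideal" test signal just described: equally spaced pure Gaussian peaks, all of the same height and width, with added white noise. The names and parameter values are illustrative only and are not taken from any script on this site.

import numpy as np

def gaussian(x, pos, width):
    # Unit-height Gaussian; "width" is the full width at half maximum (FWHM).
    return np.exp(-(((x - pos) / (0.60056 * width)) ** 2))

x = np.linspace(0, 100, 1000)
positions = [20, 40, 60, 80]             # equally spaced peaks
height, width = 1.0, 10.0                # all peaks identical
ideal = sum(height * gaussian(x, p, width) for p in positions)
noisy = ideal + 0.01 * np.random.randn(x.size)   # idealized white noise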
For the measurement
of the areas of partly overlapping peaks, such ideal peaks will result in overly
optimistic estimates of area measurement accuracy. One way to
create more realistic known signals for a particular application
is to use iterative
curve fitting. If it is possible to find a model that fits the experimental
data very well, with a very low fitting error and with random
residuals, then the peak parameters from that fit can be used to
construct a more realistic synthetic signal that yields a much
better evaluation of the measurement method. Moreover, synthetic
signals can be modified at will to explore how the proposed
measurement method might work under other experimental
conditions (e.g., if the sampling frequency were higher).
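As an illustration of this fit-then-resynthesize idea, here is a hedged Python sketch. The two-Gaussian model, the simulated "experimental" data, and all parameter values below are assumptions made only so that the example runs; they are not the method used on this page.

import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, height, pos, width):
    # "width" is the FWHM; area of each Gaussian = 1.0645 * height * width
    return height * np.exp(-(((x - pos) / (0.60056 * width)) ** 2))

def model(x, h1, p1, w1, h2, p2, w2):
    return gaussian(x, h1, p1, w1) + gaussian(x, h2, p2, w2)

# Stand-in "experimental" signal (in practice, load your measured data here)
x = np.linspace(0, 100, 500)
y = model(x, 1.0, 40, 12, 0.6, 55, 15) + 0.01 * np.random.randn(x.size)

# Step 1: iterative curve fitting of the experimental data
popt, _ = curve_fit(model, x, y, p0=[1, 38, 10, 0.5, 57, 12])
residuals = y - model(x, *popt)

# Step 2: rebuild a realistic synthetic signal with exactly known peak areas,
# adding noise at the level actually observed in the fit residuals
synthetic = model(x, *popt) + np.std(residuals) * np.random.randn(x.size)
true_areas = [1.0645 * popt[i] * popt[i + 2] for i in (0, 3)]

The synthetic signal can then be passed to whatever measurement method is being evaluated, and its results compared with the exactly known areas in true_areas.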
To
demonstrate this idea, I downloaded a spectrum from the NIST IR
database that contained a set of four highly fused peaks.
To determine the true peak areas as accurately as possible, I
iteratively fit those peaks with four Gaussian/Lorentzian sum (GLS)
peaks (41% Gaussian) of different widths, yielding a fitting error
of only 0.3% and an R2 of 0.99988, with unstructured random
residuals, as shown on the left. The best-fit peak parameters and the residual noise
were then used in a self-contained script to create a synthetic
model signal that is essentially identical to the experimental
spectrum, except that it has exactly known peak areas.
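Since the model signal is constructed from the fitted GLS peaks, it may help to see one common way such a profile is written: a weighted blend of a Gaussian and a Lorentzian that share the same position and width. The exact form used by the fitting program may differ, so treat this Python sketch as an assumption.

import numpy as np

def gls(x, height, pos, fwhm, f):
    # f = fraction-Gaussian character (0.41 for the fit described above)
    g = np.exp(-(((x - pos) / (0.60056 * fwhm)) ** 2))    # unit-height Gaussian
    l = 1.0 / (1.0 + ((x - pos) / (0.5 * fwhm)) ** 2)     # unit-height Lorentzian
    return height * (f * g + (1.0 - f) * l)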
The script then measures those areas by the simpler and faster
perpendicular drop method: it uses second differentiation to
locate the original peak positions, measures each peak area by
perpendicular drop (which by itself is not expected to work well
on such heavily overlapped peaks), and finally repeats the area
measurements after sharpening the peaks by Fourier
self-deconvolution, using a low-pass Fourier filter to control
the noise.
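The following Python sketch shows one way to implement these two measurement steps: peak location from the smoothed second derivative, and perpendicular-drop area measurement by splitting the signal at the valley minima between adjacent peaks. It is only an approximation of the self-contained script described above; the smoothing window, prominence threshold, and function names are assumptions.

import numpy as np
from scipy.signal import savgol_filter, find_peaks

def perpendicular_drop_areas(x, y, window=15, polyorder=3):
    dx = x[1] - x[0]
    # Peak maxima appear as sharp minima of the second derivative,
    # even when the peaks are heavily overlapped.
    d2 = savgol_filter(y, window, polyorder, deriv=2, delta=dx)
    peaks, _ = find_peaks(-d2, prominence=0.1 * np.max(-d2))
    # Perpendicular-drop boundaries: the lowest point of y between adjacent peaks.
    bounds = [0]
    for left, right in zip(peaks[:-1], peaks[1:]):
        bounds.append(left + int(np.argmin(y[left:right])))
    bounds.append(len(y) - 1)
    # Area of each segment by trapezoidal integration.
    areas = [np.trapz(y[a:b + 1], x[a:b + 1])
             for a, b in zip(bounds[:-1], bounds[1:])]
    return peaks, areas

The measured areas can then be compared directly with the exactly known areas built into the synthetic model signal.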
As shown by the first
figure below, self-deconvolution sharpening can in fact improve
the peak area accuracy substantially, from an average error of
29% for the original signal to only 3.1% after deconvolution.
But because the peaks have different widths, there is no single
optimum deconvolution width. Tests show that the best overall
results are obtained when the deconvolution function shape
is the same as in the original signal and when the deconvolution
function width is 1.1 times the average peak
width in the signal. In the second figure, the peaks in the
model signal have been spread out artificially, with no other
change, just to show more clearly that this choice of
deconvolution function width causes the third peak to be "over
sharpened", resulting in negative lobes for that peak. (But
recall that deconvolution is done in a way that conserves total
peak area.) A more
conservative approach, using the largest deconvolution width possible without the
signal ever going negative (about 0.8 times the average peak width in this case), results in
only a modest improvement in area accuracy (from 27% to 12%; see graphic).
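Here is a minimal Python sketch of the Fourier self-deconvolution step, under two assumptions: that the deconvolution function is a zero-centered GLS peak of the same shape as the signal peaks, and that the "frequency cutoff" means keeping only the lowest fraction of the frequency axis. The script used for the figures may differ in detail.

import numpy as np

def self_deconvolute(y, dwidth_pts, shape=0.41, cutoff=0.20):
    # y: evenly sampled signal; dwidth_pts: deconvolution width in points
    # (e.g. 1.1 times the mean peak width); shape: fraction-Gaussian character;
    # cutoff: fraction of the frequency axis kept by the low-pass filter.
    n = len(y)
    k = np.arange(n) - n // 2
    g = np.exp(-((k / (0.60056 * dwidth_pts)) ** 2))
    l = 1.0 / (1.0 + (k / (0.5 * dwidth_pts)) ** 2)
    deconv = shape * g + (1.0 - shape) * l
    deconv /= deconv.sum()              # unit area, so total signal area is conserved
    deconv = np.fft.ifftshift(deconv)   # move the peak center to index 0
    ratio = np.fft.fft(y) / np.fft.fft(deconv)
    # Low-pass Fourier filter to control the noise amplified by the division
    freq = np.fft.fftfreq(n)            # cycles per point, in [-0.5, 0.5)
    ratio[np.abs(freq) > 0.5 * cutoff] = 0.0
    return np.real(np.fft.ifft(ratio))

For example, self_deconvolute(y, 23.7) with the default 20% cutoff corresponds to the settings listed for the first figure; reducing the width to about 0.8 times the mean peak width gives the more conservative sharpening discussed above.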
First figure: Sampling interval (cm-1) = 2; Change in peak separation ("PeakSpread") = 0; Noise = 5e-05; GLS Shape (fraction Gaussian) = 0.41; Deconvolution Width = 23.7 points (1.1 times the mean signal peak width); Frequency Cutoff = 20%.

Second figure: Change in peak separation ("PeakSpread") = 100. All other parameters are unchanged.
This page is part of "A Pragmatic Introduction to Signal
Processing", created and maintained by Prof. Tom O'Haver,
Department of Chemistry and Biochemistry, The University of
Maryland at College Park. Comments, suggestions and questions
should be directed to Prof. O'Haver at toh@umd.edu. Updated January 2023.