Interfacing measurement instruments to small computers for online
data acquisition is now standard practice in the modern laboratory.
Once the data are in digital form, a large number of computer-based
numerical methods can be used for signal processing, data analysis,
and storage: to transform signals into more useful forms, detect and
measure peaks, reduce noise, improve the resolution of overlapping
peaks, compensate for instrumental artifacts, test hypotheses,
optimize measurement strategies, diagnose measurement difficulties,
and decompose complex signals into their component parts. These
techniques can often make difficult measurements easier by
extracting more information from the available data. Many of these
techniques are based on laborious mathematical procedures that
were not even practical before the advent of computerized
instrumentation. It is
important to appreciate the abilities, as well as the
limitations, of these techniques. In recent
decades, computer storage and digital processing have become far
less costly and literally millions of times more capable,
reducing the cost of raw data and making complex computer-based
signal processing techniques both more practical and more necessary.
It's not just the growth of computers: there are now new
materials, new instruments, new fabrication techniques, and new
automation capabilities. We have lasers, fiber optics,
superconductors, supermagnets, holograms, quantum technology,
nanotechnology, and more. Sensors are now smaller and cheaper and
faster than ever before; we
can measure over a wider range of speeds, temperatures,
pressures, and locations. There are new kinds of data that we never had
before. As Erik Brynjolfsson and Andrew McAfee wrote in
The Second Machine Age (W. W. Norton, 2014): "...many types
of raw data are getting dramatically cheaper, and as data get
cheaper, the bottleneck increasingly is the ability to interpret
and use data".
This essay covers only
basic topics related to one-dimensional time-series signals, not
two-dimensional data such as images. It uses a pragmatic
approach and is limited to mathematics only up to the most
elementary aspects of calculus, statistics, and matrix math. If
you are math-phobic, you should know that this essay does not dwell
on the math and that it contains more than twice as many figures
as equations. Data processing without math? Not really! Math is
essential, just as it is for the technology of cell
phones, GPS, digital photography, the Web, and computer games.
But you can get started using these tools without
understanding all the underlying math and software details.
Seeing it work makes it more likely that you'll want to
understand how it works. But in the long run, it's not
enough just to know how to operate the software, any more than
knowing how to use a word processor or a MIDI sequencer makes
you a good author or musician.
Why do I title this document "signal processing"
rather than "data processing"? By "signal" I mean the
continuous x,y numerical data recorded by
scientific instruments as time-series,
where x may be time or another quantity like energy
or wavelength, as in the various forms of
spectroscopy. "Data" is a more general term that includes categorical
data as well. In other words, I'm oriented to data
that you would plot in a spreadsheet using the scatter
chart type rather than bar or pie charts.
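As a minimal sketch of what such a signal looks like in Matlab/Octave, x and y can be stored as two numeric vectors of equal length. The wavelength range, band position, and band width below are hypothetical values chosen only for illustration.

```matlab
% Hypothetical example: a simulated spectrum stored as x,y numerical data.
x = 400:1:700;                    % independent variable (assumed wavelength axis, nm)
y = exp(-((x - 550) ./ 30).^2);   % dependent variable: one Gaussian-shaped band
signal = [x' y'];                 % two-column matrix, one row per (x,y) point
[ymax, imax] = max(y);            % largest y value and its index
xAtMax = x(imax);                 % x position of that maximum
```

Typing plot(x, y) afterwards draws the points joined by a line, which is the equivalent of the spreadsheet scatter chart mentioned above.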
Some of the examples come from my
own areas of research in analytical chemistry, but these
techniques have been used in a wide
range of application areas. My software has been cited in
journal papers, theses, and patents, covering fields ranging from
industrial, environmental, medical, engineering, earth science,
space, military, financial, and agricultural applications to music and
linguistics. Suggestions and experimental data sent by hundreds of
readers from their own work have helped shape my writing and
software development. Much effort has gone into making this
document concise and understandable; it has been highly
praised by many readers.
At the present time, this work does not cover
image processing, wavelet transforms, pattern recognition, or
factor analysis. For more advanced topics and for a more rigorous
treatment of the underlying mathematics, refer to the extensive
literature on signal processing and on statistics and
This site had its origin in one of the
experiments in a course called "Electronics
and Computer Interfacing for Chemists" that I developed and
taught at the University of Maryland in the 1980s and 1990s. The
first Web-based version went up in 1995. Subsequently it has been
revised and greatly expanded based on feedback from users. It is
still a work in progress and, as such, benefits from feedback from
readers and users.
This tutorial makes
considerable use of Matlab, a
high-performance commercial and proprietary numerical computing
environment and "fourth generation" programming language that is
widely used in research (14, 17, 19, 20), and Octave, a free
Matlab alternative that runs almost all of the programs and
examples in this tutorial. There is a good reason why this
language is so popular in science and engineering: it's
powerful, fast, and relatively easy to learn; it comes
with built-in functions for doing data processing tasks like
matrix math, Fourier transforms, convolution and deconvolution,
multilinear regression, and optimization; you can download
thousands of useful user-contributed functions; it can interface
to C, C++, Java, Fortran, and Python; and it's extensible to symbolic
computing and model-based
design for dynamic and embedded
systems. There are many code examples in this text that
you can copy and paste (or drag and drop) into the
Matlab/Octave command line to run or modify, which is especially
convenient if you can split
your screen between the two.
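To give a flavor of those built-in functions, here is a minimal sketch that can be pasted into the Matlab/Octave command line. The simulated peak, the noise level, and the 5-point smoothing width are hypothetical choices made only for this illustration. It uses only the standard functions randn, conv, fft, abs, and max.

```matlab
% Hypothetical test signal: a Gaussian peak centered at x = 5, plus random noise.
x = 0:0.1:10;                          % independent variable (e.g. time)
y = exp(-(x - 5).^2);                  % noiseless Gaussian peak
noisy = y + 0.05 .* randn(size(x));    % add normally distributed noise

% Reduce the noise with a 5-point sliding average, using built-in convolution.
kernel = ones(1, 5) ./ 5;              % equal weights that sum to 1
smoothed = conv(noisy, kernel, 'same');% 'same' keeps the original length

% Magnitude of the Fourier transform of the noisy signal.
spectrum = abs(fft(noisy));

% Locate the peak as the maximum of the smoothed signal.
[peakHeight, index] = max(smoothed);
peakPosition = x(index);
```

Adding plot(x, noisy, '.', x, smoothed, '-') shows the noisy points with the smoothed line drawn through them. Because the noise is random, the measured peakHeight and peakPosition will differ slightly from run to run.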
Some of the illustrations were produced on my old
1990s-era freeware signal-processing application for Macintosh
OS8, called S.P.E.C.T.R.U.M. (Signal
Processing for Experimental Chemistry Teaching
and Research / University of Maryland).
Most of the techniques covered in this work can
also be performed in spreadsheets
(11, 22, 23) such as Excel or OpenOffice Calc. Octave
and the OpenOffice
Calc spreadsheet program can be downloaded without cost
from their respective web sites.
If you are unfamiliar with Matlab/Octave, read these sections
about basics and functions and scripts
for a quick start-up. These are not really general-purpose
programming languages like C++ or Python; rather, they are
specifically suited to matrix manipulations, plotting of
functions and data, implementation of algorithms, creation of
user interfaces, and interfacing with programs written in other
languages - essentially the needs of numerical computing by
scientists and engineers. Matlab and Octave are more loosely
typed and are less well structured in a formal sense than
other languages, and thus they tend to be more favored by
scientists and engineers and less well liked by computer
scientists and professional programmers. Bringing a general-purpose
language like Python up to the point where Matlab starts requires
installing add-on "packages", which takes considerable effort and
some familiarity with computer jargon.
This work is dedicated to the Joy of Uncompetitive Purposefulness.
"...in our culture of competitive self-comparison, we can
choose to amplify each other’s accomplishments because there
is, after all, enough to go around." Maria
"People are generally better persuaded by the reasons which they
have themselves discovered than by those which have come into
the mind of others." Blaise Pascal
"...producing technologies, and then teaching them to others, ... pushes
humankind ahead". David Premack
"A computer does not substitute for judgment any more than a
pencil substitutes for literacy. But writing without a pencil is
no particular advantage." Robert
"...in the course of looking deeply within ourselves, we
may challenge notions that give comfort before the terrors of
the world. Supporters of superstition and pseudoscience are
human beings with real feelings, who, like the skeptics, are
trying to figure out how the world works and what our role in it
might be. Their motives are in many cases consonant with
science." Carl Sagan, in The
Demon-Haunted World: Science as a Candle in the Dark.
"...[be] full of wonder, generously open to every notion,
[dismiss] nothing except for good reason, but at the same time,
and as second nature, [demand] stringent standards of evidence,
...[applied] with at least as much rigor to what [you] hold dear
as to what [you] are tempted to reject with impunity." Carl
1. Douglas A. Skoog, Principles of Instrumental Analysis,
Third Edition, Saunders, Philadelphia, 1984. Pages 73-76.
2. Gary D. Christian and James E. O'Reilly, Instrumental
Analysis, Second Edition, Allyn and Bacon, Boston, 1986.
3. Howard V. Malmstadt, Christie G. Enke, and Gary Horlick, Electronic
Measurements for Scientists, W. A. Benjamin, Menlo Park, 1974. Pages
4. Stephen C. Gates and Jordan Becker, Laboratory Automation
using the IBM PC, Prentice Hall, Englewood Cliffs, NJ, 1989.
5. Muhammad A. Sharaf, Deborah L Illman, and Bruce R. Kowalski, Chemometrics,
John Wiley and Sons, New York, 1986.
8. A. Felinger, Data Analysis and Signal Processing in
Chromatography, Elsevier Science (19 May 1998).
9. Matthias Otto, Chemometrics: Statistics and Computer
Application in Analytical Chemistry, Wiley-VCH (March 19, 1999).
Some parts viewable in Google
10. Steven W. Smith, The Scientist and Engineer's Guide to
Digital Signal Processing. (Downloadable chapter by chapter
in PDF format from http://www.dspguide.com/pdfbook.htm).
This is a much more general treatment of the topic.
23. R. de Levie, Advanced Excel for scientific data analysis,
Oxford University Press, New York (2004)
24. S. K. Mitra, Digital Signal Processing, a computer-based
approach, 4th edition, McGraw-Hill, New York, 2011.
25. “Calibration in Continuum-Source AA by Curve Fitting the
Transmission Profile”, T. C. O'Haver and J. Kindervater, J.
of Analytical Atomic Spectroscopy 1, 89 (1986)
26. “Estimation of Atomic Absorption Line Widths in
Air-Acetylene Flames by Transmission Profile Modeling”, T. C.
O'Haver and Jing-Chyi Chang, Spectrochim. Acta 44B,
27. “Effect of the Source/Absorber Width Ratio on the
Signal-to-Noise Ratio of Dispersive Absorption
Spectrometry”, T. C. O'Haver, Anal. Chem. 68, 164-169 (1991).
28. “Derivative Luminescence Spectrometry”,
G. L. Green and T. C. O'Haver, Anal. Chem. 46, 2191 (1974).
29. “Derivative Spectroscopy”, T. C. O'Haver
and G. L. Green, American Laboratory 7, 15 (1975).
30. “Error Analysis of Derivative Spectroscopy for the Quantitative
Analysis of Mixtures”, T. C. O'Haver and G. L. Green, Anal.
Chem. 48, 312 (1976).
31. “Derivative Spectroscopy: Theoretical
Aspects”, T. C. O'Haver, Anal. Proc. 19, 22-28 (1982).
32. “Derivative and Wavelength Modulation
Spectrometry”, T. C. O'Haver, Anal. Chem. 51, 91A (1979).
33. “A Microprocessor-based Signal
Processing Module for Analytical Instrumentation”, T. C.
O'Haver and A. Smith, American Lab. 13, 43 (1981).
34. “Introduction to Signal Processing in
Analytical Chemistry”, T. C. O'Haver, J. Chem. Educ.
35. “Applications of Computers and Computer
Software in Teaching Analytical Chemistry”, T. C. O'Haver, Anal.
Chem. 68, 521A (1991).
36. “The Object is Productivity”, T. C.
O'Haver, Intelligent Instruments and Computers March-April,
1992, p 67-70.
44. Nate Silver, The
Signal and the Noise: Why
So Many Predictions Fail-but Some Don't, Penguin Press,
2012. ISBN 159420411X. A much broader look at "signal" and
"noise", aimed at a general audience, but still worth reading.
59. T. C. O'Haver, Teaching and Learning Chemometrics with Matlab,
Chemometrics and Intelligent Laboratory Systems 6, 95-103
60. Allen B. Downey, "Think DSP", Green Tree Press, 2014.
(164-page PDF download). Python code instruction using sound as a
61. Purnendu K. Dasgupta, et al., "Black Box Linearization for
Greater Linear Dynamic Range: The Effect of Power Transforms
on the Representation of Data", Anal. Chem. 2010, 82,
62. Joseph Dubrovkin, Mathematical Processing of Spectral Data in
Analytical Chemistry: A Guide to Error Analysis, Cambridge
Scholars Publishing, 2018, 379 pages. ISBN 978-1-5275-1152-1.
63. Power Law Approach as a Convenient Protocol for Improving Peak
Shapes and Recovering Areas from Partially Resolved Peaks, M.
Farooq Wahab, Fabrice Gritti, Thomas C. O’Haver, Garrett
Hellinghausen, Daniel W. Armstrong, Chromatographia