
From a signal-to-noise perspective, the stock market is
an interesting example. A national or global stock market is an
aggregation of large numbers of buyers and sellers of shares in
publicly traded companies. Such markets are described by stock market *indexes*, which are
computed as the weighted average of a large number of selected
stocks. For example, the S&P
500 index is computed from the stock valuations of 500
large US companies. Millions of individuals and organizations
participate in the buying and selling of stocks on a daily
basis, so the S&P 500 index is a prototypical "big data"
conglomerate, reflecting the overall value of 500 of the largest
companies in the largest stock market on earth. Other stock
indexes, such as the Russell 2000, include an even larger number
of smaller companies. Individual stocks can fail or fall
drastically in value, but the market indexes average out the
performance of hundreds of companies.

A plot of the daily value, V, of the S&P 500 index vs time,
T, from 1950 through September of 2016 is shown in the following
graphs.

Each plot contains
16608 data points, one for each business day, shown in red. The
graph on the left plots V and the graph on the right plots the *natural logarithm* of V,
ln(V). There are considerable up-and-down fluctuations over time
that can be related to historical events: the oil crisis of the
1970s, the tech boom and bust of 2000, the subprime mortgage
crisis of 2008. Still, the *long-term*
trend of the value is upwards: the value in 2016
is over 100 times greater than its value in 1950. This is
basically why people invest in the stock market, because on
average, *over the long run*, stock values usually go up. The most common way to
model this overall long-term increase over time is based on the
equation for compound interest that predicts the growth of investments
that have a constant rate of return, such as savings accounts or
certificates of deposit:

V = S*(1 + R)^{T}

where V is the value, S is the starting value, R
is the annual rate of return, and T is the time in years. By itself, this
expression would yield a smooth curve, without all the peaks and
dips. The values of S and R that result in the *best fit* to the stock
market data (shown by the blue lines in the graphs) can be
determined in two ways:

(1) directly, using the iterative curve fitting method, shown on the left above, or

(2) by taking the logarithm of the values and fitting those to a straight line, shown on the right above.
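The second method works because taking the natural logarithm turns the compound interest expression into a straight line in T. As a short worked step, with a and b denoting the intercept and slope of the fitted line (symbols introduced here for illustration only, not taken from the script):

$$
\ln V = \ln S + T\,\ln(1 + R) = a + bT,
\qquad S = e^{a},
\qquad R = e^{b} - 1
$$

For example, a 7% annual rate corresponds to a slope of ln(1.07), or about 0.0677 per year.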

FitSandP.m is a Matlab/Octave script that performs both of these
calculations using the data in SandPfrom1950.mat. When applied to
the S&P 500 index data, the rate of
return R is about 0.07 (or 7%), but interestingly these two
methods give slightly *different results*, even though the *exact same data* are used
for both, and even though *both* methods yield the *same* 7% rate if applied to noiseless
*synthetic data* calculated
from this expression. This difference between methods is caused
by the irregularities in the stock data that deviate from a
smooth line (in other words, the *noise*), and it is exacerbated by the large range of the
value data V over time and by the fact that the average return
from 1950 to 1983 is slightly lower than that from 1983 to 2016.

From the point of view of curve fitting, the deviations from the
smooth curve described by the compound interest expression are
just *noise*.
But from the point of view of the stock market investor, those
deviations can be an opportunity and a warning. Naturally, most investors would like to know how the
stock market will behave in the future, but that requires
extrapolation beyond the range of the available data, which is
always uncertain and dangerous. Still, it's *most likely* (but not
certain) that the *long-term* behavior of the market (say, over a period of 10 years
or more) will be similar to the past: that is, growing
exponentially at about the same rate as before but with
unpredictable fluctuations similar to those that have occurred in the
past. We can take a closer look at those fluctuations by
inspecting the *residuals*, that is, by subtracting the fitted
curve from the raw data, as shown in iSignal
on the left.
There are several notable features of this "noise". First, the *deviations
are roughly proportional to V* and thus roughly constant in size
when plotted on a log scale. Second, the noise has a distinctly
*low-frequency* character;
the periodogram
(lower panel, in red) shows peaks at periods of 33, 16, 8, and 4 years.
There are also, notably, numerous instances over the years when
there is a sharp dip followed by a slower recovery close to the
previous value. And conversely, every peak is eventually
followed by a dip. The conventional advice in investing is to
"buy low" (on the dips) and "sell high" (on the peaks). But of
course the problem is that you cannot reliably determine *in
advance* exactly where the peaks and dips will fall; you
have only the past to guide you. Still, if the current market
value is much *higher* than the long-term trend, it will likely fall, and if
the market value is much *lower* than the long-term trend, it will likely rise,
eventually. The only thing you can be sure of is that, in the
long run, the market will rise. This is why saving for
retirement by investing in the stock market, and *starting as soon as possible*, is so important: over a 30-year working life, the
market is almost guaranteed to rise substantially. The most
painless way to do this is with your employer's 401k or 403b
automatic payroll withdrawal plan. You cannot actually invest
in the stock market as a whole, but you can invest in *index mutual funds* or *exchange-traded
funds* (ETFs), which are collections of stocks that
are constructed to match or track the components of a market index. Such funds
typically have *very low management fees*, an important factor
in selecting an investment. Other mutual funds attempt to
"beat the market" by carefully buying and selling stocks in an
attempt to create a return that is greater than the overall
market indexes; some are *temporarily* successful in
doing that, but they charge higher management fees. Mutual
funds and ETFs are much less risky investments than individual
stocks.
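Returning to the residuals discussed above, the observation that the deviations are roughly proportional to V can be illustrated with a small Python sketch. This is not the author's iSignal analysis. It uses synthetic data with an assumed starting value of 17, an assumed 10% independent proportional noise, and a hypothetical helper named `fit_log_line`. Real market noise is strongly low-frequency rather than independent from point to point, so only the scaling of the residuals is meant to carry over.

```python
import math
import random
import statistics

def fit_log_line(t, v):
    """Least-squares line through (t, ln v); returns (S, R) of V = S*(1+R)**T."""
    y = [math.log(x) for x in v]
    t_mean = statistics.fmean(t)
    y_mean = statistics.fmean(y)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), math.exp(slope) - 1.0

# Synthetic stand-in for the index (assumed values, not the real S&P data):
# 7% annual growth from 17, with 10% independent proportional noise.
rng = random.Random(1)
t = [i / 12 for i in range(66 * 12 + 1)]          # monthly samples, in years
v = [17.0 * 1.07 ** ti * (1.0 + rng.gauss(0.0, 0.10)) for ti in t]

S, R = fit_log_line(t, v)
fit = [S * (1.0 + R) ** ti for ti in t]

# Residuals in the original domain and in the log domain.
abs_resid = [vi - fi for vi, fi in zip(v, fit)]
log_resid = [math.log(vi) - math.log(fi) for vi, fi in zip(v, fit)]

# Compare the spread of the residuals in the second half with the first half.
half = len(t) // 2
abs_ratio = statistics.pstdev(abs_resid[half:]) / statistics.pstdev(abs_resid[:half])
log_ratio = statistics.pstdev(log_resid[half:]) / statistics.pstdev(log_resid[:half])

print(f"fitted R = {R:.4f}")
print(f"spread of V - fit, late half / early half:         {abs_ratio:.1f}")
print(f"spread of ln(V) - ln(fit), late half / early half: {log_ratio:.2f}")
```

With proportional noise, the spread of V minus the fit should grow several-fold from the first half to the second, while the spread of the log residuals should stay roughly constant.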

Some companies
periodically distribute payouts to investors called
"dividends". Those dividends are independent of the day-to-day
variations in stock price, so even if the stock value drops
temporarily, you still get the same dividend. For that reason
it's important that you set your investment account to "automatically
reinvest dividends", so when the share price drops, the
dividends are buying shares at the *lower price*. The S&P 500
index values used above, called *price returns*, did *not*
include dividend
reinvestment; the *total returns* with dividends reinvested
(https://en.wikipedia.org/wiki/S%26P_500_Index#Versions)
would have been substantially
higher, closer to 11%. (With an average total annual
return of 11%, and starting with an investment of $170 the
first month, which is less than $6 a day, and increasing it 5%
each year, you could accumulate over $600,000 over a
30-year working life, or $1,000,000 if you continued investing
for an additional 5 years, as shown by the spreadsheet
graphic on the right.) And that's starting at just $6 per
day, about the cost of a fancy coffee at Starbucks. Think
about that the next time you see a line of young people
waiting to order their daily coffee.
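The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch under stated assumptions, not a reproduction of the author's spreadsheet: deposits are made at the end of each month, the 11% annual return is constant and applied as an equivalent monthly growth factor, and the deposit steps up by 5% once a year. The function name `future_value` and its parameters are hypothetical.

```python
def future_value(years, first_monthly=170.0, annual_return=0.11, annual_raise=0.05):
    """Balance after `years` of end-of-month deposits that step up once a year."""
    # Monthly growth factor that compounds to exactly `annual_return` per year.
    monthly_growth = (1.0 + annual_return) ** (1.0 / 12.0)
    balance = 0.0
    deposit = first_monthly
    for _year in range(years):
        for _month in range(12):
            balance = balance * monthly_growth + deposit
        deposit *= 1.0 + annual_raise      # raise the monthly deposit each year
    return balance

for years in (30, 35):
    print(f"{years} years: ${future_value(years):,.0f}")
```

Under these assumptions the 30-year balance comes out above $600,000 and the 35-year balance above $1,000,000, consistent with the figures quoted above. A real market return varies from year to year, so the smooth result is an idealization.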

To illustrate how much influence stock market volatility ("noise") has on the measured market gains, the Matlab/Octave script SnPsimulation.m adds proportional noise to the compound interest calculation to mimic the S&P data, performs the two curve fitting methods described above, repeats the calculations over and over with independent samples of proportional noise, and then calculates the mean and the relative standard deviation (RSD) of the rates of return. A typical result is:

TrueRateOfReturn = 0.07

| Method | Measured Rate | RSD |
|---|---|---|
| Coordinate transformation | 0.07112 | 8.9% |
| Iterative curve fitting | 0.07972 | 19.9% |

As you can see, the two methods don't agree. In this example, the
return calculated by the iterative method is higher, but it *could
just as easily have been the other way*. The fact is that the
standard deviations are fairly large, and the iterative method
always has a higher *standard deviation*, because it
weights the higher values more heavily, where deviations from
the line are larger, whereas the log transformation method
weights the data more evenly. Even with this uncertainty,
investing in a stock market index fund almost always performs
better *in the long run* than more predictable investments
such as savings accounts or CDs, which have much lower rates of
return.
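The kind of simulation described above can be sketched in Python as follows. This is not a port of SnPsimulation.m, and several choices here are assumptions: the noise is independent Gaussian proportional noise (10%), the data are sampled twice a year from an assumed starting value of 17, and a simple golden-section search over the rate stands in for whatever iterative optimizer the script uses. The function names are hypothetical.

```python
import math
import random
import statistics

def fit_log(t, v):
    """Coordinate transformation: straight-line fit to ln(V); returns R."""
    y = [math.log(x) for x in v]
    tm, ym = statistics.fmean(t), statistics.fmean(y)
    slope = (sum((a - tm) * (b - ym) for a, b in zip(t, y))
             / sum((a - tm) ** 2 for a in t))
    return math.exp(slope) - 1.0

def sse(rate, t, v):
    """Squared error of S*(1+rate)**T in the V domain, using the best S for this rate."""
    g = [(1.0 + rate) ** a for a in t]
    s = sum(b * c for b, c in zip(v, g)) / sum(c * c for c in g)
    return sum((b - s * c) ** 2 for b, c in zip(v, g))

def fit_iterative(t, v, lo=0.0, hi=0.25, steps=30):
    """Iterative fit in the V domain: golden-section search over the rate."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(steps):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if sse(a, t, v) < sse(b, t, v):
            hi = b          # minimum lies in [lo, b]
        else:
            lo = a          # minimum lies in [a, hi]
    return (lo + hi) / 2.0

def simulate(trials=100, true_rate=0.07, noise=0.10, seed=0):
    """Fit many independently noisy copies of the same compound-interest curve."""
    rng = random.Random(seed)
    t = [i / 2 for i in range(2 * 66 + 1)]     # half-yearly samples over 66 years
    log_rates, iter_rates = [], []
    for _ in range(trials):
        v = [17.0 * (1.0 + true_rate) ** a * (1.0 + rng.gauss(0.0, noise)) for a in t]
        log_rates.append(fit_log(t, v))
        iter_rates.append(fit_iterative(t, v))
    return log_rates, iter_rates

def summarize(rates):
    """Mean rate and relative standard deviation in percent."""
    mean = statistics.fmean(rates)
    return mean, 100.0 * statistics.stdev(rates) / mean

# Noiseless check: both methods should recover the true rate.
t_clean = [float(i) for i in range(67)]
v_clean = [17.0 * 1.07 ** a for a in t_clean]
print(f"noiseless: log fit {fit_log(t_clean, v_clean):.4f}, "
      f"iterative fit {fit_iterative(t_clean, v_clean):.4f}")

log_rates, iter_rates = simulate()
for name, rates in (("Coordinate transformation", log_rates),
                    ("Iterative curve fitting", iter_rates)):
    mean, rsd = summarize(rates)
    print(f"{name}: mean rate {mean:.4f}, RSD {rsd:.1f}%")
```

Because the assumed noise is independent from point to point, the RSDs from this sketch will be much smaller than the values quoted above, which come from a simulation designed to mimic the S&P data. The qualitative pattern should be the same, though: both methods center near the true rate, and the iterative fit in the V domain scatters more than the log-domain fit.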

When investing in the stock market, it's important to focus on the
long-term trends and not to be frightened by the short-term up
and down fluctuations. It's similar to the difference between weather
and climate;
the large and dramatic short-term *weather* variations
tend to disguise the much smaller long-term *climate* warming
that is slowly melting the icecaps and raising
the sea levels (whether it is caused by human activity or
by natural causes alone or by a combination of both).

For
a spreadsheet template that allows you to calculate the possible
returns on long-term investments in stock market mutual funds,
see https://terpconnect.umd.edu/~toh/simulations/Investment.html.

**Note added in June 2020**. The stock market data used above are
now several years old. You might be wondering how good those
data were at predicting the stock market trends since 2016.
Since that time, there have been many market disruptions, in
particular the trade wars of 2019 and the Coronavirus pandemic
of 2020.

The recent changes are evident if you take a close look at the
period from 2016 to 2020; the return over that short
period was actually greater, about 9.5%. These 2019 and 2020 dips,
although they were quite sharp and caused a lot of anxiety at
the time, recovered quickly and as a result had little effect on
the overall long-term performance. When stocks drop, even for
well-known and valid reasons, some investors buy shares at the
reduced prices, and when stocks rise, especially when they hit
all-time highs, some investors sell shares, to "lock in their
gains". This behavior acts as a natural brake on the
fluctuations of the market.

If we extend the 1950 - 2016 plot to include the S&P
results up to June 2020, the effect is remarkably small, as seen in the log
plot below. The added data are just at the extreme top
right-hand corner and those fluctuations are small compared
to the previous historical events. The overall average
return is still about 7% (without dividend reinvestment). In
other words, the 1950 – 2016 data were pretty good
predictors of more recent market performance, despite the
alarming recent disruptions.

This page is part of "