In physics, we take information and make predictions:
With speed \(V\), angle \(\theta\), distance \(d\), and height \(h\), we can say “exactly” where the ball will go.
But that’s an easy problem!
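A minimal sketch of that “easy” problem, assuming simple projectile motion with no air resistance (the function name and example numbers are mine, not from the lecture):

```python
import math

def landing_distance(v, theta_deg, h, g=9.81):
    """Horizontal distance traveled by a ball launched at speed v (m/s),
    angle theta_deg (degrees) above horizontal, from height h (m)."""
    theta = math.radians(theta_deg)
    vx, vy = v * math.cos(theta), v * math.sin(theta)
    # Time until the ball reaches the ground: solve h + vy*t - g*t^2/2 = 0
    t_land = (vy + math.sqrt(vy**2 + 2 * g * h)) / g
    return vx * t_land

print(landing_distance(v=20, theta_deg=45, h=1.5))  # a single, deterministic answer
```

Given the inputs, the answer is exact: no randomness required.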
What do you need to know to predict how a coin will land?
(from: Stochastic Modeling of Scientific Data, P. Guttorp and V. N. Minin)
Is it feasible to “predict” exactly?
No.
Is it feasible to “predict” probabilistically?
Easy! 50% chance Heads - 50% chance Tails
This “random” result weirdly tells us just about everything we need to know about the very complex problem of the coin-flip.
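A quick sketch (my own, not from the lecture) of what that probabilistic “prediction” buys us: we cannot say how any single flip lands, but the long-run frequency is about 50/50.

```python
import random

random.seed(1)
flips = [random.choice(["H", "T"]) for _ in range(10_000)]
prop_heads = flips.count("H") / len(flips)
print(f"Proportion of heads in 10,000 flips: {prop_heads:.3f}")  # close to 0.5
```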
Probability Theory is the science of understanding mathematical “randomness”.
For example:

- Distributions
- Relationships
- Relationships with factors
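A minimal illustration (my own; all numbers are made up) of those three ideas: a distribution, a relationship, and a relationship that depends on a factor.

```python
import random

random.seed(2)

# Distribution: heights drawn from an assumed normal distribution
heights = [random.gauss(130, 7) for _ in range(1000)]

# Relationship: weight increases with height, plus random noise
weights = [0.5 * h - 35 + random.gauss(0, 3) for h in heights]

# Relationship with a factor: the relationship shifts by group ("A" vs "B")
groups = [random.choice(["A", "B"]) for _ in heights]
weights_by_group = [w + (2 if g == "A" else -2) for w, g in zip(weights, groups)]

print(sum(heights) / len(heights), sum(weights) / len(weights))
```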
Statistics is the art of sweeping complexity under the rug of **randomness** (not because we’re lazy, but because it’s not practical).
(OK, sometimes it’s because we’re lazy.)
A question: How big are 2nd graders at Anshu County School in Nepal?

- Population: The 2nd graders in Anshu County School.
- Sample: The 2nd graders in Anshu County School.
- Individual: A 2nd grader in Anshu County School.
- Variable: Height, weight, etc.
- Covariate: Sex, age, etc.
Descriptive statistics are used to describe a population that has been completely sampled. Sample = Population.
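A sketch of descriptive statistics when Sample = Population: we simply summarize every individual we measured (the heights below are hypothetical).

```python
from statistics import mean, stdev

heights_cm = [118, 121, 125, 117, 123, 120, 126, 119]  # hypothetical complete census
print(f"n = {len(heights_cm)}, mean = {mean(heights_cm):.1f} cm, sd = {stdev(heights_cm):.1f} cm")
```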
Question: What factors impact health of children in “general”?
This is a complex question! How do we best collect data? What’s the most we can learn from these data? How do we tease apart confounding effects in the variables? How do we even define some of these variables?
**Inference** is when you use knowledge gained about a **sample** to extrapolate (infer) something about a larger **population**. Statistics that allow us to infer about greater populations are referred to as **inferential statistics**.
Sample < Population
\[Y = f(X | \theta)\]
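A minimal inference sketch for \(Y = f(X \mid \theta)\): assume a linear \(f\) and estimate \(\theta\) (intercept and slope) from a sample by least squares. The linear form and the simulated data are assumptions for illustration, not the lecture’s example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(100, 140, size=50)            # e.g., heights of a sample of children
y = 0.5 * x - 35 + rng.normal(0, 3, size=50)  # "true" relationship, plus noise

X = np.column_stack([np.ones_like(x), x])     # design matrix with an intercept column
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated theta (intercept, slope):", theta_hat)
```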
For each of these tasks there are different procedures, more or less rigorous, but in ALL cases good judgement and reasoning are the MOST important step!
An important goal of this course is to learn how to apply “good judgement” and “good reasoning”. Statistics is NEVER just “plug and chug”!!
Predictive statistics has a lot of overlap / equivalence with inference, but differs in its goals.
- You DO NOT care about the model \(f(\cdot)\)
- You DO NOT care about \(\widehat{\theta}\)
- You ONLY care about the prediction: \[ \widehat{Y}_j = E(f(X_j, \ldots)) \] where \(f(\cdot)\) was obtained using a training subset \((X_i, Y_i)\) and is predicted onto a validation subset \((X_j, Y_j)\).
More specifically, you want to minimize the prediction error \[Y_j - \widehat{Y}_j\] using whatever algorithms you can: parametric / non-parametric / supervised / unsupervised.
In practice (varieties of) REGRESSION are used most of all.
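A sketch of that prediction workflow with a simple linear regression: fit \(f\) on a training subset \((X_i, Y_i)\), predict onto a validation subset \((X_j, Y_j)\), and look only at how far \(\widehat{Y}_j\) is from \(Y_j\). The model form, the simulated data, and RMSE as the error summary are my own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=200)

train, valid = np.arange(150), np.arange(150, 200)   # simple train/validation split

# Fit (a variety of) regression on the training subset only
X_train = np.column_stack([np.ones_like(x[train]), x[train]])
theta_hat, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)

# Predict onto the validation subset and summarize the error we want to be small
X_valid = np.column_stack([np.ones_like(x[valid]), x[valid]])
y_hat = X_valid @ theta_hat
rmse = np.sqrt(np.mean((y[valid] - y_hat) ** 2))
print(f"validation RMSE: {rmse:.3f}")
```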