The Student t-distribution
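Let's say we are given some data, a series of scalar values $x_i$ with $i$
running from 1 to $n$, drawn from a distribution with true mean $\mu$. The
Student t-statistic is defined as
t &=& \frac{\bar{x}-\mu}{\sqrt{s^2/n}}
where $\bar{x}$ is the sample mean.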
Now we can write $s^2$ as our unbiased estimator for the variance,
s^2 &=& \sum_i^n \frac{(x_i-\bar{x})^2}{n-1}
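As a concrete illustration, the statistic can be computed directly from a list of values. The sketch below assumes the usual definition $t = (\bar{x}-\mu)/\sqrt{s^2/n}$ together with the unbiased $s^2$ above; the function name `t_statistic` and the sample numbers are made up for illustration.

```python
import math

def t_statistic(xs, mu):
    """Student t-statistic of the sample xs against a hypothesised mean mu."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points to estimate the variance")
    x_bar = sum(xs) / n
    # Unbiased variance estimator: divide by n - 1, not n.
    s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return (x_bar - mu) / math.sqrt(s2 / n)

if __name__ == "__main__":
    data = [4.8, 5.1, 5.3, 4.9, 5.4]  # illustrative values, not real data
    print(t_statistic(data, mu=5.0))
```

A sample whose mean equals `mu` gives exactly zero, and the statistic grows as the sample mean moves away from `mu` in units of the estimated standard error.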
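Let's play with the t-statistic representation a little more, dividing the
numerator and the denominator by the true standard deviation $\sigma$:
t &=& \frac{(\bar{x}-\mu)\sqrt{n}/\sigma}{\sqrt{s^2/\sigma^2}}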
We can now see that the numerator is a random variable with expectation value
0. Thinking about things a little further, assume that each and every data
point is drawn from a Gaussian distribution with mean $\mu$ and standard
deviation $\sigma$:
x_i \sim N(\mu,\sigma)
Then adding up all of our $x_i$, in order to create the mean, is a
convolution of all of our Gaussians, and thus an addition of their second
cumulants, the variances:
n\bar{x}=\sum_i^n x_i \sim N(\mu,\sigma) \star N(\mu,\sigma) \star
\cdots \star N(\mu,\sigma) &=& N(n\mu,\sqrt{n}\sigma)
This is of course the same idea as the width of a random walker: with $n$
steps, the variance of the final probability density grows like $n$, so its
width grows like the square root of $n$. Dividing the sum by $n$ divides the
width by $n$, and subtracting the mean just centers our PDF:
\bar{x}-\mu=\sum_i^n \frac{x_i-\mu}{n} &\sim& N(0,\sigma/\sqrt{n})
And so we find, if we scale by this theoretical standard deviation, that the
random variable in the numerator is determined by a particularly tractable
normal distribution, with unit variance:
X=\frac{(\bar{x}-\mu)\sqrt{n}}{\sigma}&\sim& N(0,1)
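The scaling of the numerator can be checked with a quick simulation. The sketch below is only an illustration under the Gaussian assumption above: it draws many samples of size $n$, records the sum and the scaled variable $X$, and prints their empirical spread. The function name `simulate_numerator`, the seed and the parameter values are arbitrary choices, not part of the derivation.

```python
import math
import random
import statistics

def simulate_numerator(n, mu, sigma, trials, seed=0):
    """Draw `trials` Gaussian samples of size n; return the sums and the scaled means X."""
    rng = random.Random(seed)
    sums, scaled = [], []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        total = sum(xs)
        sums.append(total)
        # X = (x_bar - mu) * sqrt(n) / sigma
        scaled.append((total / n - mu) * math.sqrt(n) / sigma)
    return sums, scaled

if __name__ == "__main__":
    n, mu, sigma = 10, 1.0, 2.0
    sums, scaled = simulate_numerator(n, mu, sigma, trials=20000)
    print("std of sum:", statistics.pstdev(sums), "vs sqrt(n)*sigma =", math.sqrt(n) * sigma)
    print("mean of X :", statistics.fmean(scaled))
    print("var of X  :", statistics.pvariance(scaled))
```

With these settings the spread of the sum should come out near $\sqrt{n}\sigma$, while $X$ should have mean near 0 and variance near 1, up to sampling noise.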
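Now for the denominator. Without the square root sign we have:
\frac{s^2}{\sigma^2}&=& \sum_i^n \frac{(x_i-\bar{x})^2}{(n-1)\sigma^2}
And we can see immediately that this is built from the squares of our former
Gaussian variables. Because the deviations from $\bar{x}$ sum to zero, the
sum carries only $n-1$ independent degrees of freedom and is distributed like
a sum of $n-1$ squared standard normals:
x_i-\mu &\sim& N(0,\sigma) \\
\frac{x_i-\mu}{\sigma} &\sim& N(0,1)\\
\frac{(x_i-\mu)^2}{\sigma^2} &\sim& g_{1/2,1/2} \\
\sum_i^n \frac{(x_i-\bar{x})^2}{(n-1)\sigma^2} &\sim& g_{(n-1)/2,(n-1)/2}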
Where the final PDF written is the standard Gamma density
N(0,1) &=& \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\\
g_{\alpha,\nu}(x)&=& \frac{1}{\Gamma(\nu)}\alpha^\nu
x^{\nu-1}e^{-\alpha x}\\
g_{1/2,1/2}(x^2) &=& \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{x^2}} e^{-x^2/2}
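Which is the required PDF for the square of our Gaussian random variable, with
unit variance. Convolving the $n-1$ independent terms adds the shape
parameters, giving $g_{1/2,(n-1)/2}$, and dividing by $n-1$ multiplies the
rate $\alpha$ by $n-1$. Now in order to convert to the square root of the
denominator variable, let's call it $Y$, we need to multiply by the Jacobian
factor $2Y$:
\sum_i^n \frac{(x_i-\bar{x})^2}{(n-1)\sigma^2}=\frac{s^2}{\sigma^2} &\sim &
g_{(n-1)/2,(n-1)/2}(Y^2) \\
Y=\sqrt{\frac{s^2}{\sigma^2}} &\sim & 2Y g_{(n-1)/2,(n-1)/2}(Y^2)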
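The degrees-of-freedom bookkeeping is easy to get wrong, so a simulation makes a useful cross-check. The sketch below relies on the standard facts that a Gamma density $g_{\alpha,\nu}$ has mean $\nu/\alpha$ and variance $\nu/\alpha^2$, and compares simulated moments of $s^2/\sigma^2$ with those of the candidate $\alpha=\nu=(n-1)/2$. The helper names `gamma_moments` and `simulate_scaled_variance` and all parameter values are assumptions made for this illustration.

```python
import random
import statistics

def gamma_moments(alpha, nu):
    """Mean and variance of the density g_{alpha,nu}(x) ~ x^(nu-1) exp(-alpha x)."""
    return nu / alpha, nu / alpha ** 2

def simulate_scaled_variance(n, sigma, trials, seed=0):
    """Return `trials` draws of s^2 / sigma^2 for Gaussian samples of size n."""
    rng = random.Random(seed)
    draws = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        x_bar = sum(xs) / n
        s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)  # unbiased estimator
        draws.append(s2 / sigma ** 2)
    return draws

if __name__ == "__main__":
    n = 5
    draws = simulate_scaled_variance(n, sigma=3.0, trials=20000)
    print("simulated mean, variance:", statistics.fmean(draws), statistics.pvariance(draws))
    print("Gamma with alpha = nu = (n-1)/2:", gamma_moments((n - 1) / 2, (n - 1) / 2))
```

For $n=5$ the candidate Gamma density has mean 1 and variance $2/(n-1)=0.5$, and the simulated moments should land close to those values.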
And so we see that
T &=& \frac{X}{Y}
where $X$ and $Y$ each have their own probability density functions and, for
Gaussian data, are independent. The best way to combine these PDFs is to
constrain $X$ and then integrate over all possible values of $Y$. For example,
the cumulative distribution function for $t$ should be:
X &\sim & f\\
Y &\sim & g\\
T &\sim & P(t)\\
\Phi(t)=P(T \leq t)&=& \int_{-\infty}^t \int_0^\infty y f(t^\prime y) g(y)
dy dt^\prime \\
P(t) = \frac{d\Phi}{dt} &=& \int_0^\infty y f(ty)g(y) dy
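The ratio formula can be tried out on a case with a known answer: a standard normal divided by an independent half-normal variable follows the Cauchy density $1/(\pi(1+t^2)$). The sketch below evaluates $\int_0^\infty y f(ty) g(y) dy$ with a plain midpoint rule. The function names, the cutoff `y_max` and the step count are assumptions chosen for illustration, and the cutoff only works when $g$ is negligible beyond it.

```python
import math

def ratio_density(t, f, g, y_max=12.0, steps=20000):
    """Approximate P(t) = integral_0^inf y f(t y) g(y) dy with the midpoint rule.

    The upper limit is truncated at y_max, which assumes g(y) is negligible beyond it.
    """
    h = y_max / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += y * f(t * y) * g(y)
    return total * h

def standard_normal(x):
    """Density of N(0,1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def half_normal(y):
    """Density of |Z| for Z ~ N(0,1), defined for y >= 0."""
    return 2.0 * standard_normal(y)

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        cauchy = 1.0 / (math.pi * (1.0 + t * t))
        print(t, ratio_density(t, standard_normal, half_normal), cauchy)
```

The two printed columns should agree to several digits, which is a reasonable sanity check on the factor of $y$ in the integrand.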
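Plugging in our probability densities from before, with $Y$ carrying $n-1$
degrees of freedom, we have:
f(ty) &=& \frac{1}{\sqrt{2\pi}} e^{-t^2y^2/2} \\
g(y) &=& \frac{2}{\Gamma(\frac{n-1}{2})}
\left(\frac{n-1}{2}\right)^{\frac{n-1}{2}} y^{n-2} e^{-(n-1)y^2/2} \\
P(t) &=& \int_0^\infty y \frac{1}{\sqrt{2\pi}} e^{-t^2y^2/2} g(y) dy \\
&=& \frac{2}{\sqrt{2\pi}\,\Gamma(\frac{n-1}{2})}
\left(\frac{n-1}{2}\right)^{\frac{n-1}{2}}
\int_0^\infty y^{n-1}
e^{-\frac{(n-1)y^2}{2}\left(1+\frac{t^2}{n-1}\right)} dy
Now making the tedious variable change
s= \frac{(n-1)y^2}{2}\left(1+\frac{t^2}{n-1}\right)
we find
y^{n-1} dy &=& 2^{\frac{n}{2}-1}
\left[(n-1)\left(1+\frac{t^2}{n-1}\right)\right]^{-\frac{n}{2}}
s^{\frac{n}{2}-1} ds
and so P(t) reduces to a Gamma integral:
\int_0^\infty y^{n-1}
e^{-\frac{(n-1)y^2}{2}\left(1+\frac{t^2}{n-1}\right)} dy &=&
2^{\frac{n}{2}-1}
\left[(n-1)\left(1+\frac{t^2}{n-1}\right)\right]^{-\frac{n}{2}}
\int_0^\infty s^{\frac{n}{2}-1} e^{-s} ds \\
&=& 2^{\frac{n}{2}-1} \Gamma\left(\frac{n}{2}\right)
\left[(n-1)\left(1+\frac{t^2}{n-1}\right)\right]^{-\frac{n}{2}} \\
P(t) &=& \frac{\Gamma(\frac{n}{2})}{\sqrt{\pi(n-1)}\,\Gamma(\frac{n-1}{2})}
\left(1+\frac{t^2}{n-1}\right)^{-\frac{n}{2}}
This is the Student t density with $n-1$ degrees of freedom, in agreement with
the Wikipedia article on the Student t, and a good exercise in combining PDFs.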
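As a final cross-check, the integral can be compared numerically with the standard closed-form Student t density. The sketch below assumes $n-1$ degrees of freedom for the denominator, takes $f$ to be the standard normal density, and uses a simple midpoint rule. The function names, the cutoff `y_max` and the step count are illustrative choices rather than anything fixed by the derivation.

```python
import math

def student_t_density(t, dof):
    """Closed-form Student t density with `dof` degrees of freedom."""
    norm = math.gamma((dof + 1) / 2) / (math.sqrt(math.pi * dof) * math.gamma(dof / 2))
    return norm * (1.0 + t * t / dof) ** (-(dof + 1) / 2)

def y_density(y, dof):
    """Density of Y = sqrt(s^2 / sigma^2), i.e. 2 y g_{dof/2, dof/2}(y^2)."""
    a = dof / 2
    return 2.0 * a ** a / math.gamma(a) * y ** (dof - 1) * math.exp(-a * y * y)

def t_density_by_integration(t, dof, y_max=10.0, steps=20000):
    """Midpoint-rule estimate of integral_0^inf y f(t y) g(y) dy, f standard normal."""
    h = y_max / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        f = math.exp(-0.5 * (t * y) ** 2) / math.sqrt(2.0 * math.pi)
        total += y * f * y_density(y, dof)
    return total * h

if __name__ == "__main__":
    n = 5            # sample size (illustrative)
    dof = n - 1      # degrees of freedom of the variance estimate
    for t in (0.0, 0.5, 2.0):
        print(t, t_density_by_integration(t, dof), student_t_density(t, dof))
```

If the two columns disagree, the usual suspects are the degrees of freedom ($n$ versus $n-1$) and the rescaling of the rate parameter when the sum of squares is divided by $n-1$.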