Editor's Note: Because of difficulties in displaying a square root symbol on the web, we have used exponential notation. Whenever you see X^0.5 we are expressing the square root of X.
In the
previous parts of this series we have talked about what Standard
Deviation (SD) and Standard Error (SE) really mean. The formulas
for actually calculating them are not really important in
this day and age. How to use them and what they
mean become more important all the time. A brief review:
HOW ARE STANDARD DEVIATION AND STANDARD ERROR RELATED?

SE = SD / n^0.5

Now this is really quite a simple and beautiful little formula. It should be considered THE most important formula in all of statistics.
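As a quick illustration of the formula, here is a minimal Python sketch. The function name `standard_error` and the sample sizes are made up for the illustration; only the formula itself comes from the text.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SE = SD / n^0.5."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / math.sqrt(n)


# Illustrative values only: the same SD of 25 with growing sample sizes.
# Quadrupling the sample size cuts the standard error in half.
for n in (4, 25, 100):
    print(f"n = {n:>3}   SE = {standard_error(25.0, n):.2f}")
```

The point to notice is that the standard error shrinks with the square root of the sample size, not with the sample size itself.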
All this happens with a simple little formula that anybody can understand and remember. It only gets ugly and complicated-looking when you have multiple layers of sampling or lots of strata mixed together. The IDEA is simple and easy to grasp.

WHAT IS A "t-TABLE"?

Luckily, some nice person has figured this out and published another table called the "t-table". It is very close to the Z-table except at very small sample sizes. In fact, beyond a sample size of about 30 there is virtually no difference (which just means that you are now getting a very good estimate of the standard deviation). You often hear in statistics that "after a sample size of 30 it is correct to use the Z-table". This isn't really true, but there is so little difference between the tables that nobody worries about doing it.

The t-table value depends on the sample size you have used to estimate the standard deviation. These tables sometimes use a special term for "the sample size minus 1" (n-1). They call this the "degrees of freedom". At any rate, the t-table just tells you how many standard deviations to go, each way, when you are making a confidence interval. An example of such a t-table is shown below.
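As a separate illustration of how the t-table closes in on the Z-table, the sketch below compares the two at the 95% level. The t-values are the ones printed in standard published t-tables for a two-sided 95% interval; they are not copied from this article's table. The names `T_95`, `Z_95` and `t_minus_z` are invented for the illustration.

```python
from statistics import NormalDist

# Two-sided 95% t-values from standard published t-tables,
# keyed by degrees of freedom (sample size minus 1).
T_95 = {
    1: 12.706,
    2: 4.303,
    5: 2.571,
    10: 2.228,
    20: 2.086,
    30: 2.042,
    60: 2.000,
    120: 1.980,
}

# The matching Z-table value: 2.5% is left in each tail.
Z_95 = NormalDist().inv_cdf(0.975)


def t_minus_z(df):
    """How much farther out the t-table sends you than the Z-table."""
    return T_95[df] - Z_95


for df in T_95:
    print(f"df = {df:>3}   t = {T_95[df]:.3f}   t - Z = {t_minus_z(df):.3f}")
```

With very few degrees of freedom the t-value is far larger than the Z-value, and by about 30 degrees of freedom the gap is under a tenth of a standard deviation.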
A COMPLETE EXAMPLE

We now know that 95% of the things in the population are within ±2.086 standard deviations of the sample mean. What is that in pounds? 2.086 * 25 pounds = 52.15 pounds each way. The "confidence interval" is therefore 200 pounds ±52 pounds (between 148 and 252 pounds if you prefer to state the end points).

And how close is our SAMPLE MEAN to the true population mean? Well, even if the population were not normally distributed, we can still use its SD to estimate how widely spread the sample means will be. We know that sample means tend to be normally distributed, even when the population is not. We need to calculate the STANDARD ERROR, and we do this using the SE formula: 25 / 21^0.5 = 5.46 pounds.

Suppose we have decided to get a 90% confidence interval for the sample mean. We have to go out 1.725 standard errors each way according to the t-table, and in units this would be 1.725 * 5.46 = ±9.4 pounds. We can now estimate that the true population mean is 200 ±9.4 pounds (or 190.6 to 209.4 pounds).

If you can follow the logic of this example you will be able to do the most practical parts of statistical analysis. It may take practice to do it quickly, but these are the main logical ideas you need to understand. When you read a statistics book there are a lot more terms you run into, but many of them are just slightly different ways of saying the same thing. Next time we will try to sort out a few of these so they don't get in your way. Once you see the pattern you will realize that SD and SE are really ALL you need to worry about.

The business of how to create a confidence interval, and understanding standard deviation and standard error, is the longest and hardest part of this series. From now on it gets easier. Remember, this statistics business has to do with somebody's MONEY and SWEAT, and if you can understand some of the basics, you might save a lot of each. It's worth the effort.
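For readers who want to retrace the arithmetic of the complete example, here is a minimal Python sketch. The sample figures (21 items, mean 200 pounds, SD 25 pounds) and the two t-table values are taken from the example; the helper name `interval` is made up for the illustration.

```python
import math

# Figures from the complete example.
n = 21          # sample size, so 20 degrees of freedom
mean = 200.0    # sample mean, in pounds
sd = 25.0       # sample standard deviation, in pounds
t_95 = 2.086    # t-table value, 95%, 20 degrees of freedom
t_90 = 1.725    # t-table value, 90%, 20 degrees of freedom


def interval(center, multiplier, spread):
    """Go `multiplier` units of `spread` each way from `center`."""
    half_width = multiplier * spread
    return center - half_width, center + half_width


# Where do 95% of the individual items fall? Use the SD.
pop_low, pop_high = interval(mean, t_95, sd)

# How close is the sample mean to the true mean? Use the SE.
se = sd / math.sqrt(n)
mean_low, mean_high = interval(mean, t_90, se)

print(f"95% of items:       {pop_low:.0f} to {pop_high:.0f} pounds")
print(f"standard error:     {se:.2f} pounds")
print(f"90% CI for the mean: {mean_low:.1f} to {mean_high:.1f} pounds")
```

The same `interval` step is used both times; the only things that change are the t-table multiplier and whether the spread is the SD (for individual items) or the SE (for the sample mean).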