Chi-Square Probabilities

Here is a table showing some of the chi-square values you are most likely to actually use. It is not a complete table, but complete tables are easy to find if you need one, and plenty of stats programs will calculate p values for a chi-square test for you.  This table is for people who don't want to buy and learn a whole stats package but would still find it useful to run a chi-square test from time to time.


      Probability (p)
df    0.95    0.90    0.70    0.50    0.30    0.20    0.10    0.05    0.01    0.001
1     0.004   0.016   0.15    0.46    1.07    1.64    2.71    3.84    6.64    10.83
2     0.10    0.21    0.71    1.39    2.41    3.22    4.61    5.99    9.21    13.82
3     0.35    0.58    1.42    2.37    3.67    4.64    6.25    7.82    11.35   16.27
4     0.71    1.06    2.20    3.36    4.88    5.99    7.78    9.49    13.28   18.47
5     1.15    1.61    3.00    4.35    6.06    7.29    9.24    11.07   15.09   20.52
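If you would rather compute a probability than read it off the table, here is a minimal Python sketch that uses only the standard library. The function name chi_square_p_value is made up for illustration, and the sketch assumes df is a whole number (df is explained just below). It uses the standard closed-form recurrence for the upper tail of the chi-square distribution. For anything serious, a real stats package is the safer choice.

```python
import math


def chi_square_p_value(chi_square, df):
    """Probability of a chi-square number at least this large arising by chance.

    Illustrative helper (not from any library); df must be a positive integer.
    """
    if df < 1 or chi_square < 0:
        raise ValueError("df must be >= 1 and chi_square must be >= 0")
    t = chi_square / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-t)              # exact upper tail for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(t))   # exact upper tail for df = 1
    # Each step adds two degrees of freedom: Q(a+1, t) = Q(a, t) + t^a e^-t / Gamma(a+1)
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Check against the p = 0.05 column of the table above.
for df, critical in [(1, 3.84), (2, 5.99), (3, 7.82), (4, 9.49), (5, 11.07)]:
    print(df, critical, round(chi_square_p_value(critical, df), 3))
```

Each printed probability should come out close to 0.05, allowing for the rounding in the table.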

"df", or degrees of freedom, will be the number of categories you had in your chi-square test, minus one.  So if you had as categories black, some pink, lots of pink, then that's 3 categories and your df will be 2.  Even if your guess is that you're working with a two-gene system, you may not have more than four categories (or you might have as many as nine categories, depending on how the system works).

Therefore, in this case look at the row where df = 2.  The chi-square number you calculated was 1.64; in that row it falls between 1.39 (probability 0.50) and 2.41 (probability 0.30).  That means that if your guess is right, chance alone would produce a difference at least this large between your observed numbers and the expected numbers somewhere between 30% and 50% of the time.  In other words, the difference you see is not great enough to indicate that a real difference exists.  Your guess about the inheritance pattern for this trait is supported by this result because there does not appear to be a real difference between the expected 1 : 2 : 1 ratio and the ratio you actually got.
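The table lookup just described can be sketched in Python as follows. The names P_LEVELS, TABLE and bracket_probability are made up for illustration, the numbers are simply copied from the table above, and None is used here to mean that the chi-square number runs off one end of the table.

```python
# Column headings and rows copied from the table above.
P_LEVELS = [0.95, 0.90, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]
TABLE = {
    1: [0.004, 0.016, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 10.83],
    2: [0.10, 0.21, 0.71, 1.39, 2.41, 3.22, 4.61, 5.99, 9.21, 13.82],
    3: [0.35, 0.58, 1.42, 2.37, 3.67, 4.64, 6.25, 7.82, 11.35, 16.27],
    4: [0.71, 1.06, 2.20, 3.36, 4.88, 5.99, 7.78, 9.49, 13.28, 18.47],
    5: [1.15, 1.61, 3.00, 4.35, 6.06, 7.29, 9.24, 11.07, 15.09, 20.52],
}


def bracket_probability(chi_square, df):
    """Return (upper_p, lower_p): the two table probabilities the number falls between.

    None in either position means the number is off that end of the table.
    """
    upper_p = None
    for p, critical in zip(P_LEVELS, TABLE[df]):
        if chi_square < critical:
            return upper_p, p
        upper_p = p
    return upper_p, None


categories = ["black", "some pink", "lots of pink"]
df = len(categories) - 1                  # number of categories minus one
print(df, bracket_probability(1.64, df))  # prints: 2 (0.5, 0.3)
```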

A lot of people simplify all this by just looking at the p = 0.05 column in the above table.  If the chi-square number they calculate is smaller than the relevant number in that column (5.99 for 2 degrees of freedom), then they assume that their observed numbers really do agree with expectation: any difference between observed and expected is too small to actually count.  If the number is greater than the number in that column, they conclude that there is a significant deviation between observed and expected and that they guessed wrong about mode of inheritance.
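As a rough sketch, the shortcut looks like this in Python. The names CRITICAL_P05 and deviates_significantly are hypothetical, and the cutoffs are the p = 0.05 column copied from the table above.

```python
# p = 0.05 column of the table, keyed by degrees of freedom.
CRITICAL_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}


def deviates_significantly(chi_square, df):
    """True if the chi-square number is greater than the p = 0.05 cutoff for this df."""
    return chi_square > CRITICAL_P05[df]


print(deviates_significantly(1.64, 2))  # False: observed agrees with expected
print(deviates_significantly(6.50, 2))  # True: significant deviation
```

A chi-square number exactly equal to the cutoff is treated here as not significant; that is a choice made for this sketch, since the rule above only covers "smaller" and "greater".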

The choice of the 0.05 level is partly tradition.  At this level, you accept a 5% chance of being wrong when you reject a guess: a correct guess will still produce a chi-square number above the cutoff about 5% of the time by chance alone.  For example, if your chi-square number had been 5.89, then you would still be under 5.99, but not much under; a difference that large would arise by chance only a little more than 5% of the time.  5% may not seem like a big chance, but keep in mind that if you do twenty chi-square tests on guesses that are actually correct, you would expect to wrongly reject one of them.  A 5% error rate is actually pretty large.
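The arithmetic behind that warning can be checked with a few lines of Python. This is only a sketch: the variable names are made up, the twenty tests are assumed to be independent, and every guess being tested is assumed to be correct. The last line uses the fact that for 2 degrees of freedom the probability is exactly e^(-chi-square/2).

```python
import math

alpha = 0.05   # the traditional 0.05 level
tests = 20     # number of chi-square tests, each on a correct guess

# Average number of correct guesses rejected by chance alone.
expected_false_rejections = alpha * tests

# Chance that at least one of the twenty tests rejects a correct guess.
chance_of_at_least_one = 1 - (1 - alpha) ** tests

# Probability for a chi-square number of 5.89 (valid for df = 2 only).
p_for_5_89 = math.exp(-5.89 / 2)

print(round(expected_false_rejections, 2))  # 1.0
print(round(chance_of_at_least_one, 2))     # 0.64
print(round(p_for_5_89, 3))                 # 0.053
```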

In practice, the smaller your chi-square number, the better your observed numbers agree with your expectation, and the less reason you have to think there is a real difference between observed and expected.  If you get a number that approaches being "too large", then you would usually want to increase your sample size by counting more puppies.