Punnett Squares and Probability


Suppose you’re dealing with the quintessential “simple” situation in genetics:  a single-gene two-allele trait that shows complete dominance and recessiveness.  There are plenty of these traits in dogs, although, alas, there are lots more that are complicated in one way or another.  Still, let’s start with this very simple situation.

There are roughly six zillion web sites where you can take a look at this kind of genetic situation, frequently with pretty pictures of purple- and white-flowering peas (for example, bowlingsite.mcf.com/genetics/).  Lots of them are fine sites and I almost didn’t bother to include anything similar here.  But what I want to do is explain not just how to use a Punnett square to predict outcomes for simple traits, but also how a real breeder might really use Punnett squares and (faster, less error-prone) probability calculations to predict outcomes for more complicated traits as well, because in the real world we are often going to be dealing with complicated traits.

Being able to use Punnett squares and / or probability to figure out what you expect in puppies from particular crosses is handy in all kinds of ways.  Most obviously, this is the kind of analysis that would come in handy if you were trying to formulate a sensible guess about the inheritance pattern of a trait you see in your dogs.


The simplest case:  a single-gene trait --

Let’s take pyruvate kinase deficiency as our example to start with, before we get into anything complicated.  Pyruvate kinase is an enzyme that red blood cells need for their energy supply; without it they break down prematurely, so animals severely deficient in this enzyme are anemic.  PKD is known in Basenjis, Beagles, Cairns and Westies, and Poodles, according to Padgett (Control of Canine Genetic Diseases).  It’s thought to be a simple recessive, which is plausible because it’s a simple enzyme deficiency.  Moreover, a simple recessive inheritance pattern has been demonstrated pretty conclusively at least in Basenjis, as described by Hutt (Genetics for Dog Breeders).

So, let’s let K stand for normal and k for deficient, and let’s say we cross two carriers.  What do we expect in the puppies?

First write down the parents we are crossing:  Kk x Kk.

Then set up the Punnett square like this:

        K     k
K       KK    Kk
k       Kk    kk

Now, what we have is Dad’s gametes (say) across the top and Mom’s (we assume) down the side.  Notice that Dad himself has the genotype Kk – two alleles.  So does Mom.  What we have across the top of the Punnett square are the types of gametes (sperm cells) that Dad could possibly make – he can give each sperm cell just one letter and either it will be a big K or a little k.  Same for Mom with her egg cells, down the side.

What we have inside the Punnett square are the puppies that result when sperm fertilizes egg:  bring Dad’s gamete down and Mom’s across and fill in each puppy genotype.  Each puppy gets a letter from Dad and one from Mom.  I’m making a big point of this because in my experience if you’re really clear on this – one allele of each gene per gamete – then you’ll avoid most common errors when looking at more complicated situations, which are coming up shortly.

Now, obviously, what we have in the square is one KK puppy, two Kk puppies, and one kk puppy.  The KK and Kk puppies will be clinically normal and the kk puppy will be anemic because it has pyruvate kinase deficiency.  You do not expect each litter to have exactly four puppies, three normal and one anemic.  You expect roughly a 3 normal : 1 anemic ratio of puppies.  Put another way:  every puppy born to this kind of cross has a ¼ chance (25%) of being anemic, versus a ¾ chance (75%) of being normal.  It’s like rolling a four-sided die for each puppy conceived and every time you roll a “4” the puppy is anemic.  If you were doing chi-square calculations because you had just guessed at the inheritance pattern and wanted to see if your guess was right, you would use .75 and .25 as your expected ratio for normal and affected puppies.
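
If you would rather let a computer fill in the boxes, the same square can be sketched in a few lines of Python.  This is only an illustration under the assumptions already stated (one gene, two alleles, complete dominance); the function name cross_one_gene and the four-puppy litter are made up for the example.  It also shows why you shouldn't expect exactly one anemic puppy in every litter of four.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def cross_one_gene(dad, mom):
    """Return {genotype: probability} for a single-gene cross such as 'Kk' x 'Kk'."""
    boxes = Counter()
    # Each parent puts exactly one of its two alleles into each gamete.
    for dad_allele, mom_allele in product(dad, mom):
        # Sort so that 'kK' and 'Kk' count as the same genotype (capital first).
        boxes["".join(sorted(dad_allele + mom_allele))] += 1
    total = sum(boxes.values())
    return {genotype: Fraction(n, total) for genotype, n in boxes.items()}

offspring = cross_one_gene("Kk", "Kk")
affected = offspring.get("kk", Fraction(0))   # 1/4
print(offspring)
print("chance any one puppy is anemic:", affected)

# Hypothetical litter of four: chance that EXACTLY one puppy is anemic.
litter = 4
p_exactly_one = comb(litter, 1) * affected * (1 - affected) ** (litter - 1)
print("chance of exactly one anemic puppy in four:", p_exactly_one)   # 27/64
```

So even with a true 3 : 1 ratio, fewer than half of four-puppy litters would show exactly three normal puppies and one anemic one.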

Punnett squares can get more complicated than this example suggests.  Let’s set up a more interesting Punnett square and then look at how we can avoid having to use big, cumbersome Punnett squares if we want to.


A Basic Complication:  Two-Gene Systems --

As long as traits are relatively simple in mode of inheritance (controlled by one, two or three genes), it is usually possible to work out Mendelian expectations for proposed matings.  For an example, let's take a look at a complicated Real World color system:  harlequin pattern in Great Danes.

Suppose you cross two harlequin Danes.  Now what we have here is a system where if an animal is homozygous for the merle allele (MM), it's probably defective (eye and ear defects) or dead, and if it's homozygous for the harlequin modifier (HH), it's dead again.  In order to be harlequin, it has to be heterozygous for both genes at the same time (MmHh).  If it's Mmhh, it'll be merle; if it's mmHh, it'll be black but able to pass on the harlequin modifier, and if it's mmhh, it'll be pure for black.

If you were going to set up a Punnett square for a harlequin x harlequin cross, which as we shall see is not strictly necessary, it would look like this:

        MH      Mh      mH      mh
MH      MMHH    MMHh    MmHH    MmHh
Mh      MMHh    MMhh    MmHh    Mmhh
mH      MmHH    MmHh    mmHH    mmHh
mh      MmHh    Mmhh    mmHh    mmhh

Of the puppies from the above cross --

7/16 of these puppies will die or at least have the problems associated with homozygous merle (they have the MM--, --HH, or MMHH genotypes).

4/16, or 1/4, of the puppies will have the desired harlequin markings (MmHh).

2/16, or 1/8, of the puppies will be merle (Mmhh).

3/16 of the puppies will be black, and of those, two thirds will carry harlequin (mmHh), which do not show harlequin but can pass it on; and one third will be pure for solid black (mmhh).

Results like these explain why breeders might prefer to cross harlequin Danes to black -- an MmHh x mmhh cross would give 1/4 undesirable merle puppies, but would totally avoid the lethality problem, while still maintaining the proportion of harlequin puppies in the litter at 1/4.  A sensible solution might be to make merle an allowed color in the breed, if you want to breed for harlequin.  Not knowing how this system works probably led to the disallowance of merle in this breed, thus creating a situation where breeders were accidentally encouraged to breed in ways that cause puppies to be zapped by the lethals inherent in the system.
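
For anyone who wants to check these fractions without counting boxes by hand, here is a rough Python sketch of the two-gene square.  It assumes exactly the rules described above (MM or HH is lethal or defective, MmHh is harlequin, Mmhh is merle, anything mm that survives is black); the function names are invented for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def gametes(genotype):
    """All gametes for a genotype written like 'MmHh': one allele of each gene."""
    return [a + b for a, b in product(genotype[:2], genotype[2:])]

def puppy(g1, g2):
    """Combine two gametes into a genotype, capital letter first within each gene."""
    return "".join("".join(sorted(g1[i] + g2[i])) for i in range(2))

def phenotype(genotype):
    merle, harl = genotype[:2], genotype[2:]
    if merle == "MM" or harl == "HH":
        return "lethal or defective"
    if merle == "Mm":
        return "harlequin" if harl == "Hh" else "merle"
    return "black"   # mmHh or mmhh

def cross(dad, mom):
    """Return {phenotype: fraction of boxes} for the full Punnett square."""
    counts = Counter(phenotype(puppy(d, m))
                     for d, m in product(gametes(dad), gametes(mom)))
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}

harl_x_harl = cross("MmHh", "MmHh")
harl_x_black = cross("MmHh", "mmhh")
print("harlequin x harlequin:", harl_x_harl)
print("harlequin x black:    ", harl_x_black)
```

The second cross has no "lethal or defective" entry at all, which is the whole point of breeding harlequin to black.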


A Twist on the Theme:  A Different Kind of Two-Gene System -- and How to Avoid The Cumbersome Punnett Square --

Here's another trait, even more complicated:

Suppose we have a type of epilepsy which is characterized by infrequent but big grand-mal type seizures.  Onset is at about two years of age.  More males than females are affected.  This is a type of epilepsy seen in, for example, Welsh Springer Spaniels and Standard Schnauzers.  Well, researchers working with this trait think it’s a two-gene system and they also think that one of the genes involved is sex-linked.  They think that both genes have to be homozygous recessive at the same time for the epilepsy to appear.  This kind of situation, where having just one dominant allele of either of two different genes is sufficient to create the normal phenotype, is sometimes called duplicate gene action.  This is one way in which you could see interaction between multiple genes to produce a single trait.  In this particular case, duplicate gene action gives us this situation:

Affected:  aaXbY or aaXbXb

Normal carriers:   AAXBXb  or AAXbXb or AaXBXB  or AaXBXb or AaXbXb or aaXBXB or aaXBXb or aaXBY or AaXbY or AaXBY or AAXbY

Normal Non-carriers (clear of the trait):  AAXBXB or AAXBY

Obviously there are lots of different kinds of carriers, depending on whether they’re male or female, carrying just one deleterious recessive or both.  This situation is a good deal more annoying to work with than a "normal" two-gene system because of the sex linkage.  It's obvious, probably, that the chance of a carrier passing on a deleterious allele varies a lot, because a carrier might be carrying only one deleterious allele, or up to three.

Now, if you get an epileptic puppy, you know that both parents must have been carrying at least one recessive allele for the autosomal (non-sex-linked) gene (both parents must have been either Aa or aa).  This is true because if either parent had had the AA genotype, all the puppies would have gotten a big-A from that parent and they would all have been protected regardless of what else they inherited.  So it's correct to say that an AAXBXb  bitch is a carrier, but that doesn't mean she can pass epilepsy on to any of her puppies -- the most she can do is pass on carrier status.

This is probably a new idea for those who are used to thinking of all carriers as being able to pass on the recessive trait to their offspring, so it's worth emphasizing.  As you can see, it ain't necessarily so.

But if you match up the "wrong" sort of carriers with each other, you lose that kind of protection.  Let’s cross two dogs, both clinically normal, and see how this might work.

Dad:  AaXBY          Mom:  AaXBXb

Below is the Punnett square for this cross.  We have more boxes than in the above example because we’re dealing with more genes.  Again, we have gametes across the top (Dad's) and down the side (Mom's) and puppy genotypes inside the square.  Each gamete gets one allele of each gene.  You want all possible combinations of alleles that Dad can put in a sperm cell, and all possible combinations Mom can put in an egg cell.

          AXB       AY        aXB       aY
AXB       AAXBXB    AAXBY     AaXBXB    AaXBY
AXb       AAXBXb    AAXbY     AaXBXb    AaXbY
aXB       AaXBXB    AaXBY     aaXBXB    aaXBY
aXb       AaXBXb    AaXbY     aaXBXb    aaXbY

For each puppy born from a cross like this, we expect a 15/16 chance of normality and just a 1/16 chance of epilepsy  – and epileptic puppies will always be male.  This is because Dad had only one X chromosome to give his daughters and it was a big-B X – so all of his daughters are protected, no matter what alleles they got from Mom.  However, there is only a 2/16, or 1/8, chance that the puppy will not be a carrier for at least one of the genes involved in creating this form of epilepsy.
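
The same sixteen boxes can be counted in Python.  This is a sketch only: I am assuming each parent can be written as an autosomal pair plus a pair of sex chromosomes, and the helper names (offspring, is_affected, is_clear) are invented for the example.

```python
from fractions import Fraction
from itertools import product

def offspring(dad, mom):
    """List the equally likely puppies; a parent is (autosomal pair, sex chromosome pair)."""
    dad_gametes = list(product(dad[0], dad[1]))   # one autosomal allele + one sex chromosome
    mom_gametes = list(product(mom[0], mom[1]))
    pups = []
    for (dad_a, dad_sex), (mom_a, mom_sex) in product(dad_gametes, mom_gametes):
        autosomal = "".join(sorted(dad_a + mom_a))   # capital letter first
        sex = tuple(sorted((dad_sex, mom_sex)))      # sorts as 'XB' < 'Xb' < 'Y'
        pups.append((autosomal, sex))
    return pups

def is_affected(pup):
    autosomal, sex = pup
    # Affected only if aa AND there is no protective XB anywhere.
    return autosomal == "aa" and "XB" not in sex

def is_clear(pup):
    autosomal, sex = pup
    # Non-carrier: no little-a and no Xb at all.
    return autosomal == "AA" and "Xb" not in sex

dad = ("Aa", ("XB", "Y"))
mom = ("Aa", ("XB", "Xb"))
pups = offspring(dad, mom)

p_affected = Fraction(sum(map(is_affected, pups)), len(pups))   # 1/16
p_clear = Fraction(sum(map(is_clear, pups)), len(pups))         # 1/8
affected_all_male = all("Y" in pup[1] for pup in pups if is_affected(pup))
print(p_affected, p_clear, affected_all_male)
```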

How about for a cross like this:

AaXbY   x   AaXBXb

Suppose all we want to know is what proportion of affected puppies we expect from a cross like this.  Do we have to draw another huge Punnett square and painfully count out puppy genotypes?  No.  We can use a much quicker and less typo-prone method if we wish.

First we explicitly identify what the genotypes of affected animals look like, as we did above when describing the full system.  Because there is a sex-linked gene involved in this system, there are, as shown above, two possible affected genotypes, one for boys and one for girls:  aaXbY  and  aaXbXb.

What kind of autosomal allele can Dad give a puppy in this case (remember, he is AaXbY)?  Big A or little a.  There’s exactly the same chance he’ll pass on each of these two alleles.  Put another way, there’s a 50% chance he’ll give any puppy his A and a 50% chance he’ll pass on his a.  Put still another way, 1/2 of the time he’ll pass on each allele.  The same for Mom (AaXBXb ) with her autosomal alleles, right?  Then . . .

½ a from Dad  x   ½ a from Mom = ¼ aa

We have simply multiplied the chance that Dad will contribute an a by the chance that Mom will do the same.

If we repeated this line of reasoning for the X-linked gene, we’d get:

½ Xb  x   ½ Xb  =  ¼  XbXb

½ Y    x   ½ Xb  =  ¼  XbY

Now let’s put the genes back together to make complete puppy genotypes:

¼ aa  x  ¼ XbXb  =  1/16  aaXbXb

¼ aa  x  ¼ XbY  =  1/16  aaXbY

From this cross (AaXbY   x   AaXBXb) we therefore expect roughly 1/8  (2/16)  of all the puppies born to be affected with epilepsy, and we expect (from this particular cross) the affected puppies to be split evenly between males and females.

Again, if all you care about is the proportion of affected puppies, you don't have to set up the whole Punnett square.  All you have to know is what genotypes are affected and what the genotypes of the parents are.  Then you can just multiply the chance that dad will give his deleterious alleles by the chance that mom will and go straight to the affected puppies.
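
The multiplication shortcut for the AaXbY  x  AaXBXb cross is easy to write down in Python, using exact fractions so nothing gets rounded.  The helper p_pass is a made-up name; it just says "what fraction of this parent's gametes carry this allele?"

```python
from fractions import Fraction

def p_pass(parent_alleles, allele):
    """Chance that a parent hands one particular allele to a puppy."""
    alleles = list(parent_alleles)
    return Fraction(alleles.count(allele), len(alleles))

# Dad is AaXbY, Mom is AaXBXb.
dad_autosomal, dad_sex = "Aa", ("Xb", "Y")
mom_autosomal, mom_sex = "Aa", ("XB", "Xb")

# Autosomal gene: 1/2 a from Dad x 1/2 a from Mom = 1/4 aa
p_aa = p_pass(dad_autosomal, "a") * p_pass(mom_autosomal, "a")

# Sex-linked gene: Dad gives Xb (daughter) or Y (son); Mom must give Xb.
p_XbXb = p_pass(dad_sex, "Xb") * p_pass(mom_sex, "Xb")   # 1/4
p_XbY = p_pass(dad_sex, "Y") * p_pass(mom_sex, "Xb")     # 1/4

p_affected_female = p_aa * p_XbXb   # 1/16 aaXbXb
p_affected_male = p_aa * p_XbY      # 1/16 aaXbY
p_affected = p_affected_female + p_affected_male
print("affected females:", p_affected_female)
print("affected males:  ", p_affected_male)
print("all affected:    ", p_affected)   # 1/8
```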


Putting it all to use --

Now, let’s see how a breeder might use this stuff when she is trying to figure out a system that looks simple in inheritance.  Perhaps she is trying to determine whether Cavalier Episodic Falling Syndrome is a single-gene recessive (which, by the way, somebody has done, and that's what they concluded it is).  How do you get to such a conclusion?  By gathering data on litters where puppies develop Episodic Falling and seeing if the data matches the expected Mendelian ratios for a single-gene recessive.

If this is single-gene recessive, you expect a carrier x carrier cross to give you 1/4 affected puppies. 

Aa  x  Aa

½ a from Dad x  ½ a from Mom =  ¼ aa

Likewise, you expect an affected x carrier cross to give you 1/2 affected puppies.  You expect an affected x affected cross to give you 100% affected puppies.  You either gather data from "accidental" test crosses (crosses that turned out to be test crosses in retrospect as somebody discovered an animal was a carrier after breeding it) or you deliberately do test crosses of your own.  Then you use Punnett Squares or probability calculations to give you those ratios, you gather data on a reasonable number of puppies (more than one litter of each type, for Cavaliers), and then you use a chi-square test to decide whether what you got matches what you thought you ought to get. 


What if you think the system is not so simple?

Suppose a breeder finds herself facing a problem that looks sorta-simple but not like a clear-cut single-gene Mendelian recessive.  In this case, let’s say, the breeder has encountered a type of myelopathy that is increasingly seen in her breed and looks like it might really cause problems.

The problem is the development in puppyhood of rear-end ataxia and weakness, progressing sometime after two years of age to complete rear-end paralysis.  Phenotypically the syndrome seems pretty consistent – affected animals all seem to generally be affected with about the same severity.  No one knows how this problem is inherited but the breeder suspects a simple recessive (that’s plausible:  Padgett lists about 14 types of ataxia and myelopathy, of which half are described as recessive and the rest as unknown).  Nor is this particular breeder okay with just stumbling along for the rest of her breeding career never knowing when the problem is going to crop up -- being a scientifically-minded sort of person, she wants to really try and figure out what's going on so she can get the problem out of her breeding program for good, not to mention contribute to getting it out of the breed.

So the breeder takes her data for litters in which affected puppies occurred.  For a total of five litters, she has 46 puppies.  Let’s assume that she has not underestimated normal puppies – litter size is large and she is confident she has not missed any carrier-to-carrier breedings that didn’t happen to produce affected puppies.  (If she was worried that she had missed some unaffected puppies, she would apply the standard correction.)  Given a simple one-gene recessive system, the breeder expected 1/4 of the puppies produced to be affected.

Suppose that she has, instead, 20 affected puppies and only 26 normal puppies.  That’s a heck of a lot of affected puppies for a simple single-gene system, unless the problem is dominant.  But it can't be a simple dominant either, because if it were, either Dad or Mom would have to be affected.  We have specified an affected phenotype which is pretty consistent and pretty hard to miss, so that's out.  The guess that this is a single-gene problem looks wrong.  If she does a chi-square test, she’ll find that this ratio is too far from expectations to agree with a single-gene assumption.
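
Here is roughly what that chi-square test looks like, sketched in plain Python.  The assumptions are mine: one degree of freedom (two classes, normal and affected), the usual 5% cutoff (critical value 3.841), and no correction for missed litters.  The function names are invented for the example.

```python
from math import erfc, sqrt

def chi_square(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(stat):
    """P-value for one degree of freedom only (two classes)."""
    return erfc(sqrt(stat / 2))

observed = [26, 20]                       # normal, affected
stat = chi_square(observed, [3, 1])       # single-gene recessive: 3 normal : 1 affected
p = p_value_df1(stat)

CRITICAL_5_PERCENT_DF1 = 3.841
reject_single_gene = stat > CRITICAL_5_PERCENT_DF1
print(f"chi-square = {stat:.3f}, p = {p:.4f}, reject 3:1? {reject_single_gene}")
```

With 46 puppies, a 3 : 1 ratio predicts 34.5 normal and 11.5 affected, and the statistic comes out a bit over 8, well past the cutoff.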

She could make a leap and declare that this problem must be a dominant with incomplete penetrance, and far too many people do exactly that when faced with a situation like the one described.  There is more to genetics than single-gene systems, and it is a mistake to decide that anything that's not a single-gene recessive must be a single-gene dominant, and then invoke incomplete penetrance to explain why it doesn't in fact follow the expected pattern for a single-gene dominant.

Yet, this problem does still “look” simple – there are a lot of ataxias and myelopathies that are thought to be simple recessives in other breeds, it’s easy to visualize the problem as produced by a specific enzyme dysfunction, it’s not complicated anatomically, animals are clinically either completely normal or completely affected, normal animals are producing affected puppies and never the reverse, the problem is very consistent in presentation and doesn’t appear to be affected by environmental factors.  What gives?

Well, maybe it wouldn’t hurt to try a two-gene system – that’s still simple (relatively), but it will definitely mess up your expected 3 normal : 1 affected ratio.  The breeder makes a guess:

Suppose the breeder guesses that only aabb puppies are affected.

There would then be several possible carrier x carrier crosses that could produce affected puppies:

AaBb x AaBb gives 1/4 aa  x  1/4 bb  =  1/16  aabb

Aabb x AaBb gives 1/4 aa x 1/2 bb = 1/8 aabb

aaBb x AaBb gives 1/2 aa x 1/4 bb = 1/8 aabb

aaBb x Aabb gives 1/2 aa x 1/2 bb = 1/4 aabb

    -- or no more than 1/4 of the puppies from any carrier x carrier cross should be affected, probably fewer.  The breeder is seeing a lot more affected puppies than that.  And notice that she doesn't have to work any of these expectations out with big Punnett squares.  She simply figures the expected proportion of affected puppies from each possible cross and moves on.
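
The four crosses above can be run through the same multiplication rule in Python.  This is a sketch assuming the breeder's first guess (only aabb is affected) and genotypes written as four letters like "AaBb"; the function names are made up for the illustration.

```python
from fractions import Fraction

def p_homozygous_recessive(dad_pair, mom_pair):
    """Chance a puppy gets the little-letter allele of one gene from BOTH parents."""
    recessive = dad_pair[0].lower()
    p_dad = Fraction(dad_pair.count(recessive), 2)
    p_mom = Fraction(mom_pair.count(recessive), 2)
    return p_dad * p_mom

def p_aabb(dad, mom):
    """Multiply the two genes together: chance of aa times chance of bb."""
    return (p_homozygous_recessive(dad[:2], mom[:2])
            * p_homozygous_recessive(dad[2:], mom[2:]))

crosses = [("AaBb", "AaBb"), ("Aabb", "AaBb"), ("aaBb", "AaBb"), ("aaBb", "Aabb")]
results = {pair: p_aabb(*pair) for pair in crosses}
for (dad, mom), p in results.items():
    print(f"{dad} x {mom}: {p} aabb")

worst_case = max(results.values())   # 1/4
```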

And since none of these crosses agrees with reality, she guesses again about the system.  Suppose that instead of just one dominant allele from either gene being enough to create the normal phenotype, you need a dominant allele from each gene at the same time to create the normal phenotype.  Now we have this situation:

A_B_ is normal (either a big letter or a little letter can be in each blank).

A_bb or aaB_ or aabb is affected.

Let’s try that AaBb  x  AaBb cross again.

From this cross, we expect:

¾ A_  x  ¾ B_  =  9/16 normal

¾ A_  x  ¼ bb  =  3/16 affected

¼ aa  x  ¾ B_  =  3/16 affected

¼ aa  x  ¼ bb  =  1/16 affected

        --- for a total of 7/16 affected (3/16 + 3/16 + 1/16)

If you do a Punnett square, which you do not need to do, this is what you’ll see in the 16 boxes.  We collapse all the different genotypes that could be described as “normal” (AABB, AaBB, AABb, AaBb) into the phenotypic category A_B_, because that makes this analysis so much quicker.  Similarly for the affected categories, of course.

A ratio of 9 normal : 7 affected is in fact very, very close to what this breeder saw in her puppies (26 : 20).  If she does a chi-square test she will certainly find that this kind of genetic system is consistent with what she is seeing.  Just because it's consistent doesn't mean it's true.  But it could be true.  Taking this system as a working hypothesis, the breeder can then do some work (i.e., test crosses) to either gather support for or disprove this hypothesis, thus leading toward an understanding of the system that underlies this problem and eventually eliminating the problem in her breeding program and then in the breed.
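
To see just how close 26 : 20 is to 9 : 7, here is a short Python sketch that builds the expected ratio from the multiplication rule and then runs the chi-square arithmetic.  As before, the one-degree-of-freedom p-value formula and the variable names are my own choices for the illustration, not anything from the breeder's data.

```python
from fractions import Fraction
from math import erfc, sqrt

# AaBb x AaBb: a puppy is normal only if it has at least one A AND at least one B.
p_A_ = Fraction(3, 4)
p_B_ = Fraction(3, 4)
p_normal = p_A_ * p_B_        # 9/16
p_affected = 1 - p_normal     # 7/16

observed = {"normal": 26, "affected": 20}
total = sum(observed.values())
expected = {"normal": total * p_normal, "affected": total * p_affected}

# Chi-square statistic, kept as an exact fraction.
stat = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# P-value for one degree of freedom (two classes).
p_value = erfc(sqrt(float(stat) / 2))
print("expected:", {k: float(v) for k, v in expected.items()})
print(f"chi-square = {float(stat):.5f}, p = {p_value:.3f}")
```

The expected counts are 25.875 normal and 20.125 affected, so the statistic is tiny and the p-value is close to 1: nothing here argues against the 9 : 7 hypothesis, though of course that doesn't prove it.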

The kind of two-gene system described above is sometimes called complementary gene action.

We see this kind of genetic system in sweet pea flower color, as was shown by Bateson and Punnett (that’s where the square got its name) in some very early work on genetics.  I mention this only to establish that this kind of gene interaction is not just theoretically possible, but actually does occur in real life.

In sweet peas, there is a flow of events in which multiple genes make multiple enzymes that each catalyze particular steps in a metabolic chain reaction that leads eventually to pigment formation.  Blocking any step in the chain causes the process to break down, leading to a lack of pigment in the finished flower.  Because the chain can break at multiple points, problems with any of the genes involved can produce the same end result: white flowers that lack pigment.  This is why different dominant alleles from two different genes must both be present at the same time to create pigment.  You see how easy it is to visualize similar systems in animals.

A form of complementary gene action is thought to account for a paralysis of the hind legs in Great Dane x St. Bernard crosses, as described by Hutt.  Hutt also describes an experiment in which two different types of “rex” rabbits, when crossed, produced offspring that did not show the rex phenotype – but when those offspring were crossed to produce an F2 generation, the result was a 9 normal : 7 rex ratio, exactly as described above.  This clearly indicates that there are two separate rex mutations, either of which can independently create the rex phenotype.

Willis suggests that the standard type of extended-white deafness we see in Dalmatians and various other breeds may be a two-gene system (Practical Genetics for Dog Breeders), but, as is unfortunately usual for this book, does not describe the evidence underlying this suggestion.  Brooks and Sargan, in “Genetic Aspects of Disease in Dogs” (The Genetics of the Dog, eds. Ruvinsky & Sampson), suggest that a type of renal dysplasia in Lhasas, Shih Tzus, and Soft-Coated Wheaten Terriers is probably not a simple monogenic disorder (and they do give their reference, although no comments on or summary of the data).


Punnett squares and probability were not just invented to annoy students in genetics classes.  These are powerful tools that are very useful in formulating and testing hypotheses about modes of inheritance.  For fun and practice, why not try the following problems?  Answers, with commentary, are here.

1.  Dragons either prefer to eat sheep or princesses, never both.  The preference for princesses is completely dominant.  They also prefer to collect either silver or gold, never both.  The preference for gold is completely dominant.  These two traits are not linked and each is controlled by a single gene with two alleles.

On the one hand you have a dragon who prefers both gold and princesses.  His mother, however, preferred both silver and sheep.  On the other hand, you have a dragon who prefers gold and sheep.  Her father preferred silver and princesses.  When these two dragons get together, what is the probability that their dragonet will prefer gold and princesses?


2.  In Labrador Retrievers, one gene controls black and brown pigment; black (B) is dominant to brown (b).  If a dog is chocolate, even its nose and eye rims will be brown, not black – it cannot form black pigment at all.  A second gene controls yellow pigment, and not-yellow (E) is dominant to yellow (e).  If a dog is yellow, it doesn’t matter what type of B or b alleles it has:  it is yellow.  Yellow dogs that can form black pigment will, however, have black noses and eye rims.

If you cross two black Labs, both heterozygous for both genes, what phenotypic ratio would you expect in the puppies?


3.  In Doberman Pinschers, one gene controls black and brown pigment; black (B) is dominant to brown (b), just as in Labs (except in Dobermans we call brown dogs red rather than chocolate).  A second independent gene imposes the blue dilute on top of whatever color was created by the b-locus.  Not-blue (D) is dominant to blue (d).  Blue in combination with black gives the blue color; blue in combination with red gives fawn (Isabella).

If you cross a BbDd dog with a bbDd bitch, what is the chance that you’ll get a fawn puppy?


4.  In the same system, there is a problem called “color dilution alopecia”, in which an animal with the blue dilute (dd) loses a lot of its hair and basically becomes bald.  Let’s say that this condition is controlled by a third gene, where if the dominant allele A is present in combination with the dd genotype, the dog will develop alopecia.  [This is probably not what is actually going on with color dilution alopecia!  I just made up the system, though the problem is real enough.]

Now, if you cross two dogs that are both red and both carrying d and a, then what is the chance that any given puppy born will develop alopecia?