Reference

Some of what follows may be somewhat out of date, but it is still relevant.  For the most up-to-date information on DNA testing for genealogy, see the ISOGG (International Society of Genetic Genealogy) website.

Participant Identifiers

In order to protect the privacy of those whose DNA has been tested and discussed here, each participant has been assigned an identifying code comprised of 2 alpha characters and a numeral.  The alpha characters are meant to denote the place of origin (as much as is known) of the participants' ancestors, while the numeral separates one party from another when their ancestors came from the same place:

  • CB = Caribbean
  • CE = Central Europe
  • EA = East Anglia (England)
  • FR = France
  • IR = Ireland
  • LA = Louisiana (USA)
  • MI = Midlands (England)
  • ND = Normandy (France)
  • WC = West Country (England)

Definitions

  • Haplotype: A unique genetic pattern of an individual or shared by a small group of individuals, pointing to common ancestry in the relatively recent past, used to differentiate families from one another within a genealogical timeframe.
  • Haplogroup: A genetic pattern shared by a large group of individuals, pointing to common ancestry far back in time, used to determine the common ethnic and geographical origins of the group.
  • TMRCA: "Time to Most Recent Common Ancestor" - measured in generations or years, dependent upon a given rate of mutation for the markers tested, and an average value for years per generation. 
  • Genealogical Timeframe: Generally, the period of history during which surnames have existed, roughly the last 500 to 1000 years. The Malet name is known to have existed for 1000 years or more.
  • Transmission Event: One transmission event is counted for each marker tested, in every generation of the male line, i.e. each time a father has a son and passes on the Y chromosome.
  • FTDNA: "Family Tree DNA", the DNA service provider for this and many other Surname DNA studies.
  • NPE: "Non-Paternal Event", or "not the parent expected", simply means that somewhere in a given father to son line the Y-DNA inheritance was broken.  This most often occurs in the case of an illegitimate birth.

Methodology

The study involves an analysis of the Y-chromosome, carried only by males, that is passed from father to son (along with the surname) with, usually, no change. There are known mutation rates for the Y chromosome though, and this allows one family to differentiate itself from another over time.

Mutation Rates

There has been and continues to be a great deal of discussion on the topic of mutation rates for the markers on the Y chromosome. There have been 4 major studies done to date:

  • Heyer et al (1997)
  • Bianchi et al (1998)
  • Kayser et al (2000)
  • Holtkemper et al (2001)

The generally accepted rate is 1 mutation per 500 transmission events (.002), because this is the average of the values found by the first 3 studies. The 4th study, however, determined that the rate was 1 per 250 transmission events (.004). Most researchers feel that .002 is too low, and .004 is too high, so perhaps the actual rate lies somewhere between the two. In fact, Doug McDonald (thanks to Terry Barton for making this available on the web) has done his own analysis of FTDNA's 25 marker test (the one we have used), and concluded that the average mutation rate for those 25 markers is .0028.

Suffice it to say that the jury is still out, but hopefully a more precise value will be available in the future. FTDNA has undertaken a new mutation rate study based on their large and ever-growing database of results. Their goal is to determine an actual mutation rate for each of the markers that they test, something the other studies did not do.

Until a more definitive rate is available, we will use 3 different rates (.002, .003, and .004) when making calculations, and present our conclusions as a range of values.

Calculating TMRCA

One of the most useful aspects of a DNA study is to estimate the general timeframe in which a common ancestor for two otherwise unrelated families might have lived. If this falls within an acceptable genealogical timeframe, then the two families can be considered to be related in a genealogical sense, i.e. they have the same surname, and they probably have a common ancestor within the timeframe that the surname is known to have existed.

This can be looked at in two ways:

  • Generations
  • Years

Generations

Generations per Mutation

Rate    12 markers    25 markers    37 markers
.002    42            20            14
.003    28            13            9
.004    21            10            7

The table above shows the number of generations represented by a single mutation when a given number of markers has been tested. It is given by the formula: "1 / (mutation rate x number of markers tested)". FTDNA offers 12, 25, and 37 marker tests, so each is represented in the table. As discussed above, we are using 3 different mutation rates in our calculations, so a value is provided for each mutation rate and each number of markers tested.

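As a minimal sketch, the formula above can be written in Python to reproduce the values in the table. The function and constant names here are hypothetical and chosen only for illustration, and rounding to the nearest whole generation is an assumption inferred from the table values.

```python
# Illustrative sketch only: names are hypothetical, not part of the study.
RATES = (0.002, 0.003, 0.004)      # the 3 mutation rates used in this study
MARKER_COUNTS = (12, 25, 37)       # the marker tests mentioned in the text


def generations_per_mutation(rate, markers):
    """Generations represented by one mutation: 1 / (rate x markers)."""
    return 1.0 / (rate * markers)


def build_table():
    """Return {rate: [generations for 12, 25, 37 markers]}, rounded
    to the nearest whole generation (an assumption about the table)."""
    return {
        rate: [round(generations_per_mutation(rate, m)) for m in MARKER_COUNTS]
        for rate in RATES
    }


if __name__ == "__main__":
    for rate, row in build_table().items():
        print(rate, row)
```

Keeping the unrounded value inside `generations_per_mutation` and rounding only for display avoids compounding rounding error in later steps.
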
Years

The 3 tables below convert the number of generations calculated in the table above into years, using 3 different values for the length of an average generation: 20, 25, or 30 years.

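12 Markers

Rate    Generations    20 years    25 years    30 years
.002    42             840         1050        1260
.003    28             560         700         840
.004    21             420         525         630

25 Markers

Rate    Generations    20 years    25 years    30 years
.002    20             400         500         600
.003    13             260         325         390
.004    10             200         250         300

37 Markers

Rate    Generations    20 years    25 years    30 years
.002    14             280         350         420
.003    9              180         225         270
.004    7              140         175         210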
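
The conversion from generations to years in the tables above is a single multiplication. The following Python sketch shows it under the assumption that the rounded generation counts from the first table are used as inputs; the names `GENERATIONS` and `years_table` are hypothetical.

```python
# Illustrative sketch only: names are hypothetical, not part of the study.
# Rounded generations per mutation for rates .002, .003, .004 (in that order).
GENERATIONS = {
    12: (42, 28, 21),
    25: (20, 13, 10),
    37: (14, 9, 7),
}
YEARS_PER_GENERATION = (20, 25, 30)


def years_table(markers):
    """Rows follow the rate order (.002, .003, .004); columns follow
    YEARS_PER_GENERATION. Each cell is generations x years per generation."""
    return [
        [gens * years for years in YEARS_PER_GENERATION]
        for gens in GENERATIONS[markers]
    ]


if __name__ == "__main__":
    for markers in GENERATIONS:
        print(markers, "markers:", years_table(markers))
```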

How Relationships are Determined

A "transmission event" occurs for each marker tested each time a father has a son and the Y chromosome is passed on. There is no way to predict when a mutation will occur, but given enough transmission events we can reasonably predict that one will have occurred somewhere along the line, based on the values presented in the above tables.

If there are very few mutations, e.g. 2 participants differ by 2 markers in a 25 marker test, then we know that the two individuals are very closely related. We can predict the general timeframe in which their common ancestor must have lived by multiplying the actual number of mutations by one of the factors from the above tables and dividing by 2. We divide by 2 because the above factors assume 25 markers per generation, whereas the 2 individuals together contribute 50 markers (one set of 25 in each line of descent) for each generation.
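
The rule described above (mutations multiplied by the table factor, then divided by 2) might be sketched in Python as follows. This is an illustration under the stated assumptions, not a definitive implementation; the function names are hypothetical, and the unrounded factor is used rather than the rounded table value.

```python
# Illustrative sketch only: names are hypothetical, not part of the study.
def generations_per_mutation(rate, markers):
    """Table factor: generations represented by one mutation."""
    return 1.0 / (rate * markers)


def pairwise_tmrca_generations(mutations, markers, rate):
    """Estimated generations back to the common ancestor of 2 participants.

    The factor is halved because mutations accumulate in both lines of
    descent, so 2 x markers are transmitted per generation.
    """
    return mutations * generations_per_mutation(rate, markers) / 2.0


if __name__ == "__main__":
    # 2 participants who differ by 2 markers on a 25 marker test.
    for rate in (0.002, 0.003, 0.004):
        gens = pairwise_tmrca_generations(2, 25, rate)
        print(f"rate {rate}: about {gens:.1f} generations")
```

For the 2 mutation example in the text, this gives roughly 10 to 20 generations across the 3 mutation rates.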

Example 1 - Confirming the Paperwork

A given group has three participants, and each of them tested for 25 markers. There are 3 separate lines, all stemming from a common ancestor, each separated from one another by 9 generations, representing 27 transmission events. There is one mutation among the 3. The model predicts:

  • 25 markers x 27 transmission events x 0.2% = 1.35 mutations.

We can therefore accept that the one participant whose haplotype differs from the other two by one count on one marker (one mutation) is still related to them, because the actual number of mutations is less than what is predicted by the model, even at its most conservative mutation rate.
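
The arithmetic in Example 1 can be checked with a short Python sketch. The names are hypothetical, and the comparison is the simple threshold described in the text (observed mutations no greater than the predicted number), not a formal statistical test.

```python
# Illustrative sketch only: names are hypothetical, not part of the study.
def expected_mutations(markers, transmission_events, rate):
    """Predicted mutations: markers x transmission events x mutation rate."""
    return markers * transmission_events * rate


def consistent_with_model(observed, markers, transmission_events, rate=0.002):
    """True when the observed count does not exceed the prediction.

    The default rate of .002 is the lowest of the 3 rates used here,
    so it gives the smallest predicted number of mutations.
    """
    return observed <= expected_mutations(markers, transmission_events, rate)


if __name__ == "__main__":
    predicted = expected_mutations(25, 27, 0.002)
    print(f"predicted mutations: {predicted:.2f}")
    print("one observed mutation is consistent:",
          consistent_with_model(1, 25, 27))
```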

Example 2 - Predicting a Timeframe for a Common Ancestor

5 participants have tested for 25 markers and are divided into 2 family groups according to the results of the test. Group 1 has one mutation among 3 participants. Group 2 contains 2 individuals who are a perfect match. Each group has a documented genealogy going back 10 generations.

Since 2 of the 3 participants in group 1 match, we will consider their result to be the haplotype for that group, essentially ignoring the mutation in the 3rd person's line. There are 6 differences between the haplotype for group 1 and the haplotype for group 2. We will compare one set of markers for each group, so we have 50 markers to consider, which represents 50 transmission events per generation. We will use median values for mutation rate (.003) and years per generation (25) in our calculation of TMRCA:

  • Generations per mutation = 1 / (50 x .003) = 6.67
  • TMRCA (in generations) = 6.67 x 6 = 40
  • TMRCA (in years) = 40 x 25 = 1000
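
Example 2 uses the median values only. Since the study states that conclusions are presented as a range, the following Python sketch repeats the same calculation across all 3 mutation rates and all 3 generation lengths. The function names are hypothetical, and applying the full range to this example is an illustration rather than a result reported by the study.

```python
# Illustrative sketch only: names are hypothetical, not part of the study.
RATES = (0.002, 0.003, 0.004)
YEARS_PER_GENERATION = (20, 25, 30)


def tmrca_generations(differences, markers_compared, rate):
    """differences x generations per mutation, where generations per
    mutation = 1 / (markers compared x rate)."""
    return differences / (markers_compared * rate)


def tmrca_range(differences, markers_compared):
    """Return ((min, max) generations, (min, max) years) over all
    combinations of mutation rate and years per generation."""
    gens = [tmrca_generations(differences, markers_compared, r) for r in RATES]
    years = [g * y for g in gens for y in YEARS_PER_GENERATION]
    return (min(gens), max(gens)), (min(years), max(years))


if __name__ == "__main__":
    # 6 differences between the 2 group haplotypes, 50 markers compared.
    median_gens = tmrca_generations(6, 50, 0.003)
    print(f"median estimate: {median_gens:.0f} generations, "
          f"{median_gens * 25:.0f} years")
    (g_lo, g_hi), (y_lo, y_hi) = tmrca_range(6, 50)
    print(f"range: {g_lo:.0f} to {g_hi:.0f} generations, "
          f"{y_lo:.0f} to {y_hi:.0f} years")
```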