Tuesday, September 18, 2012

Paleoanthropology, genealogy and the miracle of DNA. Part Six: William’s DNA results



As per my previous explanatory posts, remember that the various types of tests (a) look at different ancestral lineages (male, female, both) and (b) focus on different genealogical and anthropological timeframes.

Y-DNA test

This test identifies genealogical connections on the direct paternal lineage and also suggests paternal ancestral origins via the Y-chromosome Short Tandem Repeats (STR) or Single Nucleotide Polymorphisms (SNP).

Y-Haplogroup (STR prediction)

I belong to the R1b1a2 Y-Haplogroup.  Haplogroup R is one of the two branches of the mega-haplogroup P and originated approximately 30,000 years ago in Central Asia. It has two main branches, R1 and R2. R1 spread from Central Asia into Europe while R2 spread east into the Indian subcontinent.

R1b is the most frequently occurring Y-chromosome haplogroup in Western Europe. One study put its origin at about 18,500 years before the present and described its members as direct descendants of the Cro-Magnon people who dominated the Upper Paleolithic expansion into Europe.  R1b1a2 is believed to have expanded throughout Europe as humans recolonized it after the last ice age, which ended approximately 10,000-12,000 years ago. The Egyptian pharaoh Tutankhamun (c. 1341 – 1323 BC) also belongs to the haplogroup R1b1a2.

Y-Haplogroup Deep Clade (SNP)

The SNP test put me in the R1b1a2a1a1a1 haplogroup (short name U198), which confirms the “lower resolution” STR prediction. According to the FTDNA group project, the highest concentration of R-U198 to date is in men of English ancestry (but not in the Gaelic population), but even there R-U198 is uncommon, making up about 2% of the male population. This does not mean that R-U198 necessarily originated in England, but it seems likely that members of this haplogroup last shared a common direct-line paternal ancestor around 2,000-3,000 years ago, which is indeed young enough for the haplogroup to have originated in England after the last ice age had ended. It is believed that R1b in general spread rapidly (and relatively recently) westwards across Europe, so the likely point of origin of R-U198 may lie anywhere on a track from south-eastern Europe to Britain.

FTDNA Y-chromosome Database Matches

My paternal line database match results are rather unexciting, due to size limitations of the databases. Over time, more and more individuals will hopefully add their data to them and more high-level matches could result.

·         67 marker test (shorter time to MRCA): No matches.
·         37 marker test: No matches.
·         25 marker test: 11 matches. One is Step 1 (a match of 24/25 markers); the others are Step 2 (a 23/25 match).  The countries of origin are UK/England & Germany.
·         12 marker test (longer time to MRCA): 749 matches, of which many are Step Zero (12/12 match). Many European countries are represented, but especially UK/England and Germany. Five have the last name Hudson; all of these are matches at Step 1.

Hudson surname Y-DNA Project

No Hudson matches at the 25 marker level

Mitochondrial test (Full Genome MtDNA)

This tests sequences of the HVR1, HVR2 and Coding Regions of the mitochondrial DNA. Because of mtDNA’s slow mutation rate, the tests are often more applicable to deep-ancestry predictions than to more recent genealogical applications.

MtDNA Haplogroup

I belong to the Haplogroup U5a. The U5 Haplogroup (Bryan Sykes’ clan Ursula) is the oldest mtDNA haplogroup found in European Homo sapiens. It has a broad geographic distribution, ranging from Europe and North Africa to India and Central Asia. The wide distribution is due to its antiquity, with its appearance immediately following that of haplogroup R, after the “Out of Africa” exit.

The age of U5 is estimated at about 50,000 BP and approximately 11% of Europeans belong to this haplogroup. U5 most likely appeared in the Near East and spread into Europe in an early expansion that took place before the last Ice Age and also pre-dates the expansion of agriculture in Europe.  It was actually the principal mtDNA haplogroup of the Paleolithic hunter-gatherers in Northern Europe but declined in later times due to the influence of subsequent migrations. Interestingly, U5 individuals may have come into contact with Neanderthals living in Europe at the time.

The sub-haplogroup or subclade U5a is a later mutation that arose around 20,000 years ago and thus most likely evolved during the last ice age.  The remains of Cheddar Man, a Mesolithic male found in Gough's Cave in Cheddar Gorge, Somerset and who died circa 7150 BC, were DNA tested and it was found that he also belonged to the U5a haplogroup.

The FTDNA U5 project placed me in sub-haplogroup U5a2a1d which is estimated to be about 3,500 years old.  The defining mutations for this group can be seen at Phylotree http://www.phylotree.org/tree/subtree_U.htm.

For many reasons, mtDNA and Y-chromosome lineages may indicate different migration patterns. However, the mtDNA haplogroup U is often associated with the Y-chromosome group R, predictably in Europe.

FTDNA MtDNA Database Matches and Project Groups

HVR1 only.   There are 38 matches, all in the U5 or U5a haplogroup. All have a maternal country of origin from Western or Central Europe.

HVR1 + HVR2. There are seven matches, all of haplogroup U5 or U5a and with maternal ancestry from England, Germany and Spain.

HVR1 + HVR2 + Coding Region (aka Full Genome Sequence, FGS). There are no matches at this (highest) level of resolution.

There are seven other people in the FTDNA U5 project group who are in the sub-haplogroup U5a2a1d. Of these, only three others have known ancestry and are from England, France and Wales.  Apparently, I also have one extra mutation at position 5892 that is unique for this group and this should be useful for identifying people who share a more recent common maternal ancestor.

Family Finder (autosomal) test

This test identifies and matches SNPs (Single Nucleotide Polymorphism) in the autosomal DNA which comes equally but randomly from both parents. It is thus a “gender neutral” test.

FTDNA Autosomal Database Matches

This can potentially match you with relatives descended from any of your ancestral lines from up to five generations.  I had eight pages of “matches” but most only in the “speculative” range with five in the “distant cousin” range (possibly the 3rd – 5th cousin range). One match included Hudson as an ancestral name but no other name matches were identified.

Autosomal Population Finder

This compares the subject’s autosomal signature to a world DNA population database which reflects the last 100 to 2,000 years (about 4 to 80 generations).  This data is based on rapidly-emerging technology and will undoubtedly change over the coming years as the population definitions are further refined.

Continent (Subcontinent): Europe
Population: Orcadian reference group.  The DNA make-up is of ancient Britons with some components from the Picts (Iron-Age Celtic folks living in Scotland) and Vikings (Norse explorers who settled in wide areas of Europe from the late 8th  to the mid-11th century).
Percentage:  100.00%
Margin of Error: ±0.01%

FTDNA’s autosomal DNA tests use the Human Genome Diversity Project (HGDP) at Stanford University to provide the reference groups for their population studies. In the breakdown I noticed that the West European groups represented were Basque, French, Orcadian (Orkney Islands) and Spanish.  My own results happened to come back 100% Orcadian. Now clearly these groups are no more or less than reference populations, but surely anybody with a post-Last Glacial Maximum heritage in northwest Europe would show other components. What happened to the Britons, the Celts and the Anglo-Saxons?

I pursued this concern with the Stanford group. Dr. Bruce Winney of the Department of Oncology at the University of Oxford is a coordinator of the “People of British Isles” study which is itself a component of the HGDP. He wrote:

You are absolutely correct in thinking that the HGDP doesn't really make a good set of reference populations for Europe.  This may well be why you came out 100% Orcadian and it cannot capture Ancient British and Anglo-Saxon ancestry at all.  However, we are hoping that with our project and collaborations, better possibilities are in the way, although I do suspect it will take a year or so before we get there.  You may want to keep an eye on our website (www.peopleofthebritishisles.org ), where we will make available a really interesting paper that we are writing at the moment.

A fascinating preliminary paper is here http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3260910/?tool=pubmed 

Contrary to this point of view would be the argument that the DNA of most white Britons has been passed down from relatively few individuals who occupied the region immediately after the last Ice Age; some studies claim that the DNA of the base-population of the British Isles has not changed much since 6000 BC. This would indicate a genetic signature that largely minimizes the effect of subsequent migrations to Britain from Europe. In this instance, the Ancient British (late Paleolithic – Neolithic) component of the Orcadian reference group would make it an acceptable surrogate.  I guess this just reinforces that we are at the cutting edge of the science. At the present time, “Orcadian” remains no more than an autosomal surrogate for the British Isles.

Tuesday, September 11, 2012

Paleoanthropology, genealogy and the miracle of DNA. Part Five.



Sharon’s (mitochondrial & autosomal) & brother Willy’s (Y-chromosome) DNA results

Family Finder test (autosomal; Sharon)

This test identifies and matches SNPs (Single Nucleotide Polymorphism) in the autosomal DNA which comes equally but randomly from both parents. It is thus a “gender neutral” test.

FTDNA Autosomal Database Matches

This can potentially match you with relatives descended from any of your ancestral lines from up to five or six generations back.  Sharon got one confirmed 2nd cousin in the “close/immediate” category with whom she has already exchanged emails.  There were also 27 pages of total matches, most of which are in the 3rd - 5th cousin or “distant/speculative” categories.

Autosomal Population Finder

This compares the subject’s autosomal signature to a world DNA population database which reflects the last 100 to 2,000 years (about 4 to 80 generations).  This data is based on rapidly-emerging technology and will undoubtedly change over the coming years as the population definitions are further refined.

| Region | Population | % | Margin of Error |
| --- | --- | --- | --- |
| Western European | Orcadian reference group. Poss. descendants of Picts (Iron Age “Celts”) & Vikings (Norse explorers from the late 8th cent.). | 94.57% | ±1.67% |
| Middle East | Palestinian, Bedouin, Bedouin South, Druze, Iranian, Jewish. | 5.43% | ±1.67% |

Full genome mitochondrial test (MtDNA; Sharon)

This tests sequences of the HVR1, HVR2 and Coding Regions of the mitochondrial DNA. Because of mtDNA’s slow mutation rate, the tests are more applicable to deep-ancestry predictions than to more recent genealogical applications.

MtDNA Haplogroup

Sharon belongs to the mitochondrial haplogroup J2a1a.  J (Bryan Sykes’ clan Jasmine) is believed to have originated in the Near/Middle East about 45,000 years ago and is thus one of the oldest in Europe and the Middle East.  J2 is thought to be associated with the spread of agriculture from Mesopotamia during the Neolithic period, roughly 10,000 years ago.  J2 is also interesting because it has been detected in Turkey, Italy, Sardinia, Iberia, and Iceland. These are all populations with traditionally prominent fishing industries, so this connection might suggest recent migration paths related to the economic opportunities offered by fishing. J2a is now found homogeneously across most of Europe but seems to be largely absent elsewhere. In searching for famous historical figures who are members of the J2 mtDNA haplogroup, I only came up with Francesco Petrarca (1304 – 1374), known as Petrarch, who was an Italian scholar & poet and one of the earliest humanists.

FTDNA MtDNA Database Matches and Project Groups

HVR1 only.   There are 105 matches, all in the J or J2a1a haplogroups. Most have a maternal country of origin from Western Europe.

HVR1 + HVR2. There are nine matches, all of haplogroups J or J2a1a and with maternal ancestry from the British Isles.

HVR1 + HVR2 + Coding Region (Full Genome Sequence, FGS). Sharon has two matches. Of these, one is a J2a1a with a maternal origin from Ireland. There is no data on the other. However, mtDNA accumulates relatively few mutations and thus can pass down the female line of descent for many generations before a mutation happens. Consequently, paper-trail connections are often elusive and the relevance of random matches, even at the FGS level, can be low in genealogical terms, although a shared geographic maternal origin can sometimes be indicated.  Nevertheless, matches on the Coding Region could be important; for an exact FGS match, there is a 90% chance that the MRCA is within 16 generations or about 400 years. Both matches were contacted and one responded, but there were no obvious genealogical links within the time-frame of the two documented pedigrees.

Y-DNA test (Willy)

This test identifies genealogical connections on the direct paternal lineage and also suggests paternal ancestral origins via the Y-chromosome short tandem repeats (STR) or Single Nucleotide Polymorphisms (SNP).

Y-Haplogroup (STR prediction)

Willy belongs to the R1b1a2 Y-Haplogroup.  Haplogroup R is one of the two branches of the mega-haplogroup P and originated approximately 30,000 years ago in Central Asia. It has two main branches, R1 and R2. R1 spread from Central Asia into Europe while R2 spread east into the Indian subcontinent. Several of Willy’s matches (see below) are from the R1b1a2a1a1b5a sub-haplogroup, which has a high occurrence amongst the Basque, French, Spanish and Portuguese. For Willy’s own sub-haplogroup to be determined, the FTDNA Deep Clade test would be required.

R1b is the most frequently occurring Y-chromosome haplogroup in Western Europe. One study put its origin at about 18,500 years before the present and described its members as direct descendants of the Cro-Magnon people who dominated the Upper Paleolithic expansion into Europe.  R1b1a2 is believed to have expanded throughout Europe as humans recolonized it after the last ice age, which ended approximately 10,000-12,000 years ago. The Egyptian pharaoh Tutankhamun (c. 1341 – 1323 BC) also belongs to the haplogroup R1b1a2.

FTDNA Y-chromosome Database Matches

Willy has a very high number of matches, many with the same last name or a variant of it.  At the higher marker levels, the most distant (chronologically) paternal countries of origin known to the matched subjects are all in the British Isles (plus one in Germany).

  • 67-marker level.  12 matches.   1-7 step “genetic distance”. Six have the Isaacs last name (or variant).
  • 37-marker level.  24 matches.  0-4 step “genetic distance”. Fifteen have the Isaacs last name (or variant).
  • 25-marker level.  105 matches.  0-2 step “genetic distance”. Fifteen have the Isaacs last name (or variant).
  • 12-marker level.  418 matches.  0-1 step “genetic distance”. Sixteen have the Isaacs last name (or variant).


At the 67-marker level, men who share a common surname and who match at genetic distances of 0-2 steps most likely share a common ancestor within a genealogical timeframe. For an exact match (0 steps), this means there is a 50% chance that the most recent common ancestor (MRCA) is within three generations or less and a 90% chance it is within five generations or less. Willy does not have any exact (0 step) 67-marker matches but does have five matches at this marker level with genetic distances of 1 or 2 and the same last name. These should be the first people to contact to establish a genealogical relationship.

Monday, September 10, 2012

Paleoanthropology, genealogy and the miracle of DNA. Part Four.



Tests purchased from FamilyTreeDNA (Houston)


| | Y-chromosome (STR) | Mitochondrial | Autosomal (Family Finder) |
| --- | --- | --- | --- |
| What is tested? | Y chromosome DNA mutations (short tandem repeats, STR). See note #3. | Mitochondrial DNA mutations. | Single Nucleotide Polymorphisms (SNPs) from 22 pairs of autosomal chromosomes. See note #7. |
| DNA inheritance | Passed on by the father to sons only (hence only men can be tested) | Passed on by the mother to all children (& thus both men & women can be tested) | Inherited from both maternal and paternal lines (men & women can be tested) |
| Level of test | Four marker sequences; 12, 25, 37, 67. Highest level (111) was not included. | Full Mitochondrial Sequence (FMS) test which sequences the HVR1 region, the HVR2 region and the Coding Region. See note #4. | 710,000 pairs of locations (SNPs) |
| Raw data | Number of sequence repeats (allele) at a given marker position (DYS). | Usually reported as differences from the revised Cambridge Reference Sequence (CRS) | |
| Genealogical goals | Identify male relatives with a common paternal ancestor | Identify distant relatives with a common maternal ancestor | Identify relatives with a common ancestor |
| Genealogical/historical timeframe | Recent to hundreds of years | Hundreds to thousands of years. See note #5. | Up to five or six generations (circa 100 – 150 years) |
| Close matches | The genetic “distance” between you and the match is expressed as “Steps”. See note #1. | Matches are likely to be very distant. See note #6. | Predicted relationships defined by autosomal DNA matched (in centiMorgans) & the longest matched segment. See notes #8 & #9. |
| “Deep ancestral” (archaeological) goals | Determine paternal line ancestral haplogroup (see note #2). Predicted from STR results. See note #3. | Determine maternal line ancestral haplogroup | Determine possible geographic origins by comparing results with world-wide type-populations. See note #10. |
| “Deep ancestral” (archaeological) timeframe | Usually tens of thousands of years | Tens of thousands of years | Hundreds to 2,000 years (up to about 80 generations). |

Notes:

1)  For the various levels of  Y-chromosome STR test, the steps between the subject and a match equate to the number of markers that match as follows:

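| “Step” level | 12 markers | 25 markers | 37 markers | 67 markers |
| --- | --- | --- | --- | --- |
| 0 | 12/12 | 25/25 | 37/37 | 67/67 |
| 1 | 11/12 | 24/25 | 36/37 | 66/67 |
| 2 | NA | 23/25 | 35/37 | 65/67 |
| 3 | NA | NA | 34/37 | 64/67 |
| 4 | NA | NA | NA | 63/67 |
| 5 | NA | NA | NA | 62/67 |
| 6 | NA | NA | NA | 61/67 |
| 7 | NA | NA | NA | 60/67 |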
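
As a rough illustration of how the step levels in note 1 relate to marker counts, here is a minimal Python sketch. It assumes the simplification used in this note, namely one step per tested marker that does not match; the function name `step_level` is my own and is not part of any FTDNA tool.

```python
def step_level(markers_matched, markers_tested):
    """Return the step level, assuming one step per non-matching marker."""
    if not 0 <= markers_matched <= markers_tested:
        raise ValueError("markers_matched must be between 0 and markers_tested")
    return markers_tested - markers_matched


# A few of the match ratios mentioned in these posts.
for matched, tested in [(12, 12), (24, 25), (23, 25), (60, 67)]:
    print(f"{matched}/{tested} markers -> step {step_level(matched, tested)}")
```

Whether a given step level is actually reported as a match depends on the test level (the NA entries in the table), which this sketch does not try to model.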

2)  Haplotype =  a set of closely-linked genetic markers which tend to be inherited together. Haplogroup = all of the descendants of a single individual who first showed a particular unique genetic characteristic. Haplogroups characterize the early migrations of specific population groups. Individuals with the same genetic mutation or "marker" can thus be linked back to the population where the marker first made an appearance.

3)  A Y-chromosome Single Nucleotide Polymorphism (SNP) test is available if either (a) the error in the haplogroup prediction from the STR test is high or (b) more detail is required about the exact position within the haplogroup (aka the “Deep Clade” test).

4)  Mitochondrial DNA (mtDNA) has two major parts, the control region and the coding region. The control region is the hypervariable region (HVR), which is relatively fast changing. It may be further divided into two hypervariable regions, HVR1 and HVR2. The coding region is the part of the mtDNA genome that contains genes and is believed to mutate more slowly than the control region. Often, it is the mutations found in the coding region that are used to define the haplogroups.

5)  Mitochondrial DNA mutations typically occur only once every 50 generations or about 1000 years.

6)  For an mtDNA match, the time window to the MRCA narrows and the confidence level increases with the higher-resolution tests.

| Matching Level | Maximum # Generations to MRCA | Confidence Level |
| --- | --- | --- |
| HVR1 | 52 (about 1,300 years) | 50% |
| HVR1 & HVR2 | 28 (about 700 years) | 50% |
| HVR1, HVR2 & Coding Region | 16 (about 400 years) | 90% |
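
The year figures in note 6 appear to follow from a generation length of about 25 years (52 generations is about 1,300 years). The short Python sketch below reproduces that arithmetic; the 25-year generation is my inference from the table rather than a published FTDNA parameter, and the names are only illustrative.

```python
YEARS_PER_GENERATION = 25  # assumption inferred from the table in note 6

# Matching level -> (maximum generations to MRCA, confidence level), as quoted in note 6.
MTDNA_MATCH_LEVELS = {
    "HVR1": (52, 0.50),
    "HVR1 & HVR2": (28, 0.50),
    "HVR1, HVR2 & Coding Region": (16, 0.90),
}


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into an approximate number of years."""
    return generations * years_per_generation


for level, (generations, confidence) in MTDNA_MATCH_LEVELS.items():
    years = generations_to_years(generations)
    print(f"{level}: {confidence:.0%} chance the MRCA is within "
          f"{generations} generations (about {years:,} years)")
```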

7)  Autosomal short tandem repeat (STR) tests are available but more often used in paternity testing etc. and are not normally offered to genealogists.

8) Autosomal matches are quantified using two measurements.

(i)  Shared cM - This is the sum of the autosomal DNA, given in centiMorgans (cM), that you and your genetic match share. When you share DNA segments with larger cM values with a match, your common ancestors are likely to come from generations that are more recent.
(ii) Longest Block - This is the largest DNA segment given in centiMorgans you and your genetic match share. A DNA segment (block) that is between 5 and 10 centiMorgans (cM) implies shared ancestry. A block that is 10 centiMorgans or larger indicates conclusive shared ancestry.
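
The longest-block thresholds quoted in note 8 can be written as a simple rule. This is only a sketch of the two cut-offs given above (5 cM and 10 cM); the function name and the wording of the labels are my own, and a real matching algorithm takes more into account than a single segment.

```python
def classify_longest_block(longest_block_cm):
    """Label a longest shared block using the 5 cM and 10 cM cut-offs from note 8."""
    if longest_block_cm < 0:
        raise ValueError("block length cannot be negative")
    if longest_block_cm >= 10:
        return "conclusive shared ancestry"
    if longest_block_cm >= 5:
        return "implied shared ancestry"
    return "below the thresholds in note 8"


for block in (3.2, 7.5, 12.0):
    print(f"{block} cM: {classify_longest_block(block)}")
```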

9) Matches are then grouped by predicted relationships:

(i) Relationship Range. This represents the upper and lower limits to the predicted relationship. The relationship is predicted with high confidence to fall within these limits.
(ii) Suggested Relationship. This is statistically the most likely relationship based on the amount of sharing between you and each match.

10)  This compares the subject’s autosomal signature to a world DNA population database which reflects the last 100 to 2,000 years (about 4 to 80 generations).  The test can detect small traces of genetic ancestry as low as 3% (about 5 to 6 generations) from a distinct Continental group. The continental groups and subgroups are based on genetic similarities and do not precisely match geographical regions. Similarly, the populations are strictly selected “type groups” designed to collectively represent the human genome. However these representative groups are not yet all-inclusive and more are gradually being added to the global database.
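
To see where the figures in note 10 come from: on average each generation halves the share of autosomal DNA inherited from any one ancestor, so an ancestor five generations back contributes about 3%. The Python sketch below works through that and the years-to-generations conversion. It assumes a 25-year generation and average inheritance (actual shares vary because of recombination), and the function names are just for illustration.

```python
def expected_share(generations_back):
    """Average share of autosomal DNA from a single ancestor n generations back."""
    return 0.5 ** generations_back


def years_to_generations(years, years_per_generation=25):
    """Approximate number of generations in a span of years (25-year generation assumed)."""
    return years / years_per_generation


for n in (4, 5, 6):
    print(f"{n} generations back: about {expected_share(n):.2%} on average")

print(f"100 years is about {years_to_generations(100):.0f} generations")
print(f"2,000 years is about {years_to_generations(2000):.0f} generations")
```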