Saturday, April 28, 2012


Paleoanthropology, genealogy and the miracle of DNA

Part Two.

William Hudson. Latest update 28th April 2012

Cellular structure

Cells can be thought of as the building blocks of living organisms. They consist primarily of the body of the cell, or cytoplasm, and the nucleus. The human nucleus contains 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex chromosomes), giving a total of 46 per cell. The sex chromosomes are X and Y: two X chromosomes produce a female; an X and a Y produce a male. The Y-chromosome can only be passed on from father to son and is usually passed on without alteration or genetic mixing.

Chromosomes are, in turn, made up of DNA molecules. DNA is built from four chemical bases: adenine (A), cytosine (C), guanine (G) and thymine (T). The DNA molecule is arranged as a double helix, like a twisted ladder, with the bases making up the “rungs” and sugar-phosphate chains making up the “rails”. The building blocks of DNA (nucleotides) each consist of a base plus a sugar and a phosphate. Each rung of the DNA ladder consists of two bases (a base pair), but A can only bond with T and C can only bond with G. Thus four “rung combinations” are possible: AT, TA, CG and GC.
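The pairing rule (A with T, C with G) is easy to try out in a few lines of Python. This is only an illustrative sketch; the function names `complement` and `rungs` are made up for this example and do not come from any genetics library.

```python
# Pairing rules from the text: A bonds only with T, C bonds only with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that sits opposite each base of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())


def rungs(strand: str) -> list[str]:
    """List each rung of the ladder as a two-letter base pair."""
    return [base + PAIRS[base] for base in strand.upper()]


print(complement("GATTACA"))  # CTAATGT
print(rungs("GATC"))          # ['GC', 'AT', 'TA', 'CG']
```

Whatever strand is supplied, every rung it produces is one of the same four pairs, because each base has exactly one partner.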

The genetic code of organisms is determined by the arrangement of these base pairs within the DNA of the chromosomes.  A gene is the fundamental unit of heredity and is simply a sequence of nucleotides on a chromosome (the coding region). However, about 95% of the DNA in the human genome is non-coding (aka “junk”). Outside the nucleus, the cytoplasm of the cell includes mitochondria, which have their own DNA (mtDNA), independent of the DNA in the nuclear chromosomes. Mitochondrial DNA is divided into two parts, the control region and the coding region. mtDNA is passed on only by mothers, but to both sons and daughters.

A genetic marker is a distinctive feature of the DNA molecule that allows a particular position (or locus) on the molecule to be flagged.

DNA provides the “instructions” for cells to make identical copies of themselves. When spontaneous or random changes do occur, these are known as mutations. Mutations that occur in the coding regions of chromosomes account for many of the functional genetic differences between humans, whereas mutations in the non-coding or junk DNA usually have little or no effect on the organism.

Y-chromosome testing

The Y-chromosome is inherited from father to son with minimal changes from one generation to the next. One type of Y-chromosome mutation occurs at a rate fast enough to be of use in a genealogical timeframe; it is known as a Short Tandem Repeat (STR). The repeated unit is usually 2-5 bases in length; for example, in GATAGATAGATA the four-base sequence GATA is repeated three times. Results are expressed as the number of repeats (the allele) at a given marker. For example, DYS390 – 24 means 24 repeats at the marker DYS390. The complete set of a subject’s results is known as his haplotype.
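Counting repeats is the whole idea behind an STR result, and a short Python sketch shows it. The function name `max_tandem_repeats` is hypothetical, and a real laboratory measures repeats from sequencing or fragment-length data rather than from a typed string.

```python
import re


def max_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    # Find every stretch made up purely of consecutive copies of the motif.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    # Convert each stretch's length into a repeat count and keep the largest.
    return max((len(run) // len(motif) for run in runs), default=0)


print(max_tandem_repeats("GATAGATAGATA", "GATA"))  # 3
```

The number returned plays the role of the allele value reported for a marker.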

Just knowing the haplotype numbers in and of itself means little. However, the differences in STRs at selected markers on the Y-chromosome (polymorphisms) provide a basis for comparison among individuals and populations.  If the mutation rate is known, the time frame in which two individuals shared a most recent common ancestor (MRCA) can be estimated. If their test results are a perfect or near-perfect match, they are related within genealogy's time frame. For example, using Family Tree DNA’s 67-marker comparison, two men will most likely share a common ancestor within a genealogical timeframe with a match of 60 out of 67 or better. Probabilities can also be assigned. For example, with an exact match in a high-resolution 111-marker test, there is a 95% probability that the common ancestor lived within five generations.
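A comparison of two haplotypes boils down to counting the markers on which the repeat numbers agree. The sketch below assumes each haplotype is stored as a simple marker-to-repeat-count table; the function name `compare_haplotypes` and the repeat values for the two men are invented purely for illustration, and this is not how any testing company necessarily scores a match.

```python
def compare_haplotypes(
    a: dict[str, int], b: dict[str, int]
) -> tuple[int, int, dict[str, tuple[int, int]]]:
    """Compare two Y-STR haplotypes on the markers both men were tested for.

    Returns (number of matching markers, number of markers compared,
    and the differing markers with both repeat counts).
    """
    shared = sorted(set(a) & set(b))
    diffs = {m: (a[m], b[m]) for m in shared if a[m] != b[m]}
    return len(shared) - len(diffs), len(shared), diffs


# Hypothetical results for two men at four markers.
man_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
man_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11}

matches, compared, diffs = compare_haplotypes(man_1, man_2)
print(f"{matches} out of {compared} markers match; differences: {diffs}")
```

With 67 markers instead of four, the same count gives the "60 out of 67" style of result. Turning that count into a probable number of generations to the MRCA needs mutation rates, which this sketch does not attempt.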

Another type of mutation is the Single Nucleotide Polymorphism (SNP), a change at a single nucleotide at a specific position in a chromosome that is assumed to have arisen only once, in a single individual. An SNP is one type of Unique Event Polymorphism (UEP), which has a mutation rate so low that it can be treated as a one-time event and is more applicable to “Deep Ancestry” studies. A group of descendants who all share the same UEP is known as a haplogroup.  Y-DNA haplogroups are identified by the letters A through T, with A being the African group from which all modern haplogroups are descended. Note: STR results can sometimes predict a likely haplogroup but this can only be confirmed by SNP testing.

mtDNA testing

Mitochondrial DNA is only passed through the maternal line, and also with minimal changes from one generation to another. Testing can be done in one or both of two areas of the control region known as the Hypervariable Regions (HVR1 and HVR2), although it is now possible to obtain a complete mtDNA sequence (16,569 bases or nucleotides). The test result is a string of bases, defined by their letters and ranging from a few hundred to upwards of a thousand, that is compared to the Cambridge Reference Sequence (CRS), and the differences (which represent substitutions of bases) are noted.  In absolute terms, the mtDNA mutation rate is low and most people have only a handful of differences from the CRS. Results are often more applicable to “Deep Ancestry” studies than to shorter-timeframe genealogical projects. The complete set of mtDNA polymorphisms then represents the individual’s haplotype. mtDNA haplogroups are also identified with letters but the sequence denotes the order in which they were discovered. As with the Y-DNA test, haplogroups can be indicated by mtDNA haplotypes but confirmation can only be obtained by SNP testing, which is usually carried out in the coding region.
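The idea of reporting only the differences from a reference sequence can be sketched in Python. The two short sequences and the starting position below are invented toy values, not real CRS data, and the function name `differences_from_reference` is hypothetical. The sketch also handles substitutions only, assuming both sequences are already aligned and of equal length.

```python
def differences_from_reference(
    reference: str, sample: str, start: int = 1
) -> list[str]:
    """List substitutions as '<position><base found in the sample>'.

    `start` is the position number of the first base in both strings.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        f"{position}{found}"
        for position, (expected, found) in enumerate(
            zip(reference, sample), start=start
        )
        if expected != found
    ]


# Toy six-base stretch beginning at (made-up) position 100.
print(differences_from_reference("ACCTGA", "ACTTGG", start=100))
# ['102T', '105G']
```

An exact match between two people then simply means that their lists of differences are identical.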

In some instances, mtDNA tests can have genealogical relevance but a nearly perfect match is not as helpful as it is for the above Y-DNA case. In the matrilineal case, it takes a perfect match to be really useful and even then the MRCA could have lived hundreds of years ago. The higher the resolution, the higher the chance that an exact match indicates a maternal common ancestor.

Autosomal testing

This test is carried out on the 22 non-gender determining chromosome pairs which do undergo changes (known as recombination) from one generation to another. A mixture of autosomal DNA is inherited from both parents in a roughly equal mix but it is shuffled up with each generation. Thus the test crosses gender lines; it is not restricted to either the paternal or maternal lines only. Conclusions from autosomal testing tend to be somewhat generic and the method suffers from a high error rate. Two types of test are available.

One identifies the number of times a given sequence repeats at each location (Short Tandem Repeats or STRs).  These tests would be applicable, for example, to paternity or sibling verification or adoption issues.

The second method tests for Single Nucleotide Polymorphisms (SNPs) and identifies the number and length of DNA segments that are shared between individuals. The more shared segments there are, and the longer those segments, the more likely it is that the individuals share recent common ancestors.
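To give a feel for what a "shared segment" means, here is a toy Python sketch. It assumes each person's result is an ordered list of two-letter genotypes, one per SNP, and treats two people as matching at an SNP when they have at least one letter in common. The names `half_identical` and `shared_segments`, the eight-SNP data and the threshold of three SNPs are all assumptions made for illustration; real tests use far more SNPs and stricter criteria.

```python
def half_identical(g1: str, g2: str) -> bool:
    """True if two genotypes (e.g. 'AG' and 'GG') share at least one letter."""
    return bool(set(g1) & set(g2))


def shared_segments(
    person_1: list[str], person_2: list[str], min_snps: int = 3
) -> list[tuple[int, int]]:
    """Return (first, last) index pairs of runs of matching SNPs.

    Only runs containing at least `min_snps` consecutive SNPs are kept.
    """
    segments = []
    run_start = None
    for i, (g1, g2) in enumerate(zip(person_1, person_2)):
        if half_identical(g1, g2):
            if run_start is None:
                run_start = i  # a new run begins here
        else:
            if run_start is not None and i - run_start >= min_snps:
                segments.append((run_start, i - 1))
            run_start = None  # the run is broken by a mismatch
    # Close a run that continues to the end of the data.
    n = min(len(person_1), len(person_2))
    if run_start is not None and n - run_start >= min_snps:
        segments.append((run_start, n - 1))
    return segments


# Made-up genotypes at eight consecutive SNPs.
cousin_1 = ["AA", "AG", "GG", "CC", "CT", "TT", "AA", "AG"]
cousin_2 = ["AG", "GG", "AG", "TT", "CC", "CT", "AG", "AA"]
print(shared_segments(cousin_1, cousin_2))  # [(0, 2), (4, 7)]
```

Counting the segments returned and measuring how long each one is gives the two quantities the test reports.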

Family Tree DNA

There are several genealogical DNA testing companies, most of which are in the US. Of all of these, it is difficult to argue against using Family Tree DNA, based in Houston. They offer the most complete suite of tests and, although not cheap, are competitively priced. Results are stored free for 25 years; they host many genealogical test projects, manage the largest DNA databases and are in partnership with the National Geographic Genographic Project.

Y-DNA testing is offered at three levels with 37, 67 and 111 markers. If a customer’s Y-DNA STR haplogroup cannot be predicted with 100% confidence, the Backbone SNP deep ancestry test is offered at no charge.  mtDNAPlus is a mid-level maternal line test that includes HVR1 and HVR1+HVR2 matches. mtFullSequence also tests the Coding Region. Family Finder is Family Tree DNA’s version of the autosomal DNA test.
