Paleoanthropology, genealogy and the miracle of DNA
Part Two.
William Hudson. Latest update 28th April 2012
Cellular structure
Cells can be thought of as the building blocks of living organisms. They consist
primarily of the body of the cell, or cytoplasm, and the nucleus. The human nucleus
contains 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex
chromosomes), giving a total of 46 per cell. The sex chromosomes are X and Y: two X
chromosomes produce a female; an X and a Y produce a male. The Y-chromosome can
only be passed on from father to son and is usually passed on without alteration or
genetic mixing.
Chromosomes are, in turn, made up of DNA molecules. DNA is made up of four chemical
bases: adenine (A), cytosine (C), guanine (G) and thymine (T). The DNA molecule is
arranged as a double helix, like a twisted ladder, with the four bases making up the
“rungs” and sugar-phosphates making up the “rails”. The building blocks (nucleotides)
of DNA each consist of a base plus a sugar and a phosphate. Each rung of the DNA
ladder consists of two bases (a base pair), but A can only bond with T and C can only
bond with G. Thus four “rung combinations” are possible: AT, TA, CG and GC.
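The pairing rule above (A with T, C with G) can be illustrated with a short Python sketch. The function names `complementary_strand` and `rungs` are hypothetical helpers invented for this illustration, and the input strands are arbitrary examples rather than real sequences.

```python
# Pairing rule from the text: A bonds only with T, C bonds only with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner base for every base along one rail of the ladder."""
    return "".join(PAIRS[base] for base in strand.upper())


def rungs(strand: str) -> list[str]:
    """Return each rung as a two-letter base pair (this rail's base first)."""
    return [base + PAIRS[base] for base in strand.upper()]


print(complementary_strand("GATTACA"))  # CTAATGT
print(sorted(set(rungs("ACGT"))))       # ['AT', 'CG', 'GC', 'TA']
```

Given one rail of the ladder, the other rail is fully determined, which is why a sequence is normally written as a single string of letters.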
The genetic code of organisms is determined by the arrangement of these base pairs
within the DNA of the chromosomes. A gene is the fundamental unit of heredity and is
simply a sequence of nucleotides on a chromosome (the coding region). However, about
95% of the DNA in the human genome is non-coding (aka “junk”). Outside of the
nucleus, the cytoplasm of the cell includes mitochondria, which have their own DNA
(mtDNA), independent of the DNA in the nuclear chromosomes. Mitochondrial DNA is
divided into two parts, the control region and the coding region. mtDNA can only be
passed on by females, but to both sons and daughters.
A genetic
marker is a distinctive feature of the DNA molecule that allows a
particular position (or locus) on the molecule to be
flagged.
DNA provides the “instructions” for cells to make identical copies of themselves.
When spontaneous or random changes do occur, these are known as mutations. Mutations
that occur in the coding regions of chromosomes account for the functional genetic
differences between humans, whereas mutations in the non-coding or junk DNA
generally have little or no effect.
Y-chromosome testing
The Y-chromosome is inherited from father to son with minimal changes from one
generation to the next. One type of Y-chromosome mutation occurs at a relatively fast
rate and is therefore of use in a genealogical timeframe: the Short Tandem Repeat
(STR). The repeated units are usually 2-5 bases in length; for example, in
GATAGATAGATA the four-base sequence GATA is repeated three times. Results are
expressed as the number of repeats (or allele) at a given marker. For example,
DYS390 – 24 would be 24 repeats at the marker DYS390. The complete set of a subject’s
results is known as his haplotype.
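As a rough illustration of how a repeat count is read off a sequence, the Python sketch below finds the longest run of back-to-back copies of a motif. The function name `count_tandem_repeats` and the flanking bases in the sample are invented for this example; it is not how any testing laboratory actually scores a marker.

```python
import re


def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the length, in repeats, of the longest unbroken run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Illustrative sequence only: the flanking bases CCT and GGC are made up.
sample = "CCT" + "GATA" * 3 + "GGC"
print(count_tandem_repeats(sample, "GATA"))  # 3

# A haplotype can then be held as marker name -> repeat count.
# DYS390 = 24 is the example from the text; a real haplotype has many markers.
haplotype = {"DYS390": 24}
```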
Just knowing the haplotype numbers in and of itself means little. However, the
differences in STRs at select markers on the Y-chromosome (or polymorphs) can provide
a basis for comparison among individuals and populations. If the mutation rate is
known, the time frame in which two individuals shared a most recent common ancestor
(MRCA) can be estimated. If their test results are a perfect or near-perfect match,
they are most likely related within genealogy's time frame. For example, using Family
Tree DNA’s 67-marker comparison, two men will most likely share a common ancestor
within a genealogical timeframe with a match of 60 out of 67 or better. Probabilities
can also be assigned. For example, with an exact match in a high-resolution
111-marker test, there is a 95% probability that the common ancestor lived within
five generations.
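A match such as “60 out of 67” is simply a count of markers on which two haplotypes agree. The Python sketch below shows that count on a tiny example. The function name `matching_markers` is hypothetical and the repeat values are invented for illustration. It treats every mismatch alike, which is a simplification, and it makes no attempt to estimate the time to the common ancestor.

```python
def matching_markers(a: dict[str, int], b: dict[str, int]) -> tuple[int, int]:
    """Return (markers that match, markers compared) for two haplotypes."""
    shared = sorted(set(a) & set(b))  # only markers tested in both men
    matches = sum(1 for marker in shared if a[marker] == b[marker])
    return matches, len(shared)


# Invented repeat counts for four markers; real tests compare 37, 67 or 111.
man_1 = {"DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13}
man_2 = {"DYS390": 24, "DYS391": 10, "DYS392": 13, "DYS393": 13}

matches, compared = matching_markers(man_1, man_2)
print(f"{matches} out of {compared} markers match")  # 3 out of 4 markers match
```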
Another type of mutation is the Single Nucleotide Polymorphism (SNP), which occurs
at a single nucleotide at a specific position in a chromosome and is assumed to have
arisen only once, in a single individual. An SNP is one type of Unique Event
Polymorphism (UEP), which has a mutation rate so low that it can be treated as a
one-time event, and is more applicable to “Deep Ancestry” studies. A group of
descendants who each share the same UEP is known as a haplogroup. These are
identified by the letters A through S, with A being the African group from which all
modern haplogroups are descended. Note: STR results can sometimes predict a likely
haplogroup but this can only be confirmed by SNP testing.
mtDNA testing
Mitochondrial DNA is only passed through the maternal line and also with minimal
changes from one generation to another. Testing can be done in one or both of two
areas of the control region known as the Hypervariable Regions (HVR1 and HVR2),
although it is now possible to obtain a complete mtDNA sequence (16,569 bases or
nucleotides). The test result is a string of bases, defined by their letters and
ranging from a few hundred to upwards of a thousand, that is compared to the
Cambridge Reference Sequence (CRS) and the differences (which represent
substitutions of bases) noted.
In absolute terms, the mtDNA mutation rate is low and most people have
only a handful of differences with the CRS. Results are often more applicable
to “Deep Ancestry” studies, rather than shorter timeframe genealogical
projects. The complete set of mtDNA polymorphs then represents the individual’s
haplotype. mtDNA haplogroups are also identified with letters but the sequence
denotes the order in which they were discovered. As with the Y-DNA test,
haplogroups can be indicated by mtDNA haplotypes but confirmation can only be
obtained by SNP testing which is usually carried out in the coding region.
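The comparison against a reference sequence described above can be sketched in Python as follows. Both 12-base fragments are invented and are not taken from the CRS, the function name `differences_from_reference` is hypothetical, and the “position plus new base” notation is an assumption about how results are commonly written. The sketch handles substitutions only.

```python
def differences_from_reference(reference: str, sample: str, start: int = 1) -> list[str]:
    """List substitutions as '<position><new base>', numbering from `start`."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles substitutions, so lengths must match")
    return [
        f"{start + i}{s}"
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Invented fragments for illustration; a real HVR1 result covers hundreds of bases.
reference = "ATTCTAATTTAA"
sample = "ATTCCAATTTAG"
print(differences_from_reference(reference, sample))  # ['5C', '12G']
```

The short list of differences, rather than the full string of bases, is what makes up the haplotype in this scheme.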
In some
instances, mtDNA tests can have genealogical relevance but a nearly perfect
match is not as helpful as it is for the above Y-DNA case. In the matrilineal
case, it takes a perfect match to be really useful and even then the MRCA could
have lived hundreds of years ago. The higher the resolution, the higher the
chance that an exact match indicates a maternal common ancestor.
Autosomal testing
This test is carried out on the 22 non-sex-determining chromosome pairs (the
autosomes), which do undergo changes (known as recombination) from one generation to
another. Autosomal DNA is inherited from both parents in a roughly equal mix, but it
is shuffled with each generation. Thus the test crosses gender lines; it is not
restricted to either the paternal or the maternal line. Conclusions from autosomal
testing tend to be somewhat generic and the method suffers from a high error rate.
Two types of test are available.
One
identifies the number of times a given sequence repeats at each location (Short
Tandem Repeats or STRs). These tests
would be applicable, for example, to paternity or sibling verification or
adoption issues.
The second method tests for Single Nucleotide Polymorphisms (SNPs) and identifies
the number and length of DNA segments that are shared between individuals. The more
shared segments there are, and the longer those segments, the more likely the
individuals are to share a recent common ancestor.
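The “number and length” of shared segments can be summarised with a few lines of Python. This is a minimal sketch under stated assumptions: the `SharedSegment` class, the `summarize` function, the minimum-length cut-off and all the segment values are invented for illustration, and the length unit is left unspecified because the text does not name one.

```python
from dataclasses import dataclass


@dataclass
class SharedSegment:
    chromosome: int  # autosome number, 1 to 22
    length: float    # segment length in an unspecified unit (assumption)


def summarize(segments: list[SharedSegment], min_length: float = 0.0) -> tuple[int, float, float]:
    """Return (count, total length, longest length) of segments at or above `min_length`."""
    kept = [s for s in segments if s.length >= min_length]
    total = sum(s.length for s in kept)
    longest = max((s.length for s in kept), default=0.0)
    return len(kept), total, longest


# Invented segments shared by two hypothetical test takers.
shared = [SharedSegment(1, 12.5), SharedSegment(7, 8.0), SharedSegment(15, 3.0)]
print(summarize(shared, min_length=5.0))  # (2, 20.5, 12.5)
```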
Family Tree DNA
There are several
genealogical DNA testing companies, most of which are in the US. Of all of these,
it is difficult to argue against using Family Tree DNA, based in Houston. They
offer the most complete suite of tests and, although not cheap, are competitively
priced. Results are stored free for 25 years; they host many genealogical test projects,
manage the largest DNA databases and are in partnership with the National
Geographic Genographic Project.
Y-DNA
testing is offered at three levels with 37, 67 and 111 markers. If a customer’s
Y-DNA STR haplogroup cannot be predicted with 100% confidence, the Backbone SNP
deep ancestry test is offered at no charge. mtDNAPlus
is a mid-level maternal line test that includes HVR1 and HVR1+HVR2 matches. mtFullSequence
also tests the Coding Region. Family Finder is Family Tree DNA’s version of the
autosomal DNA test.