A basic sequencing technique is the chain termination method, also known as the dideoxy method or the Sanger DNA sequencing method, developed by Frederick Sanger in 1977. The method involves replicating a single-stranded DNA template in a reaction that contains a DNA primer to initiate synthesis of the complementary strand, DNA polymerase, a mix of the four regular deoxynucleotide (dNTP) monomers, and a small proportion of dideoxynucleotides (ddNTPs), each carrying a detectable label. The ddNTPs are monomers that lack the hydroxyl group (–OH) at the 3′ carbon, the site at which the next nucleotide normally attaches to extend the chain. Whenever a ddNTP is incorporated at random into the growing complementary strand, it terminates replication of that particular strand. The result is a collection of replicated DNA strands of different lengths, each terminated at a different point. When the reaction mixture is subjected to gel electrophoresis, the newly replicated strands separate by size and form a ladder of bands. Because the ddNTPs are labeled, each band on the gel marks the length the DNA strand had reached when a ddNTP terminated the reaction.
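The random termination described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the template is written in the direction the polymerase reads it, the primer is ignored, every position has the same chance of receiving a ddNTP, and the names `synthesize_once`, `read_ladder`, and `ddntp_fraction` are invented for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_once(template, ddntp_fraction, rng):
    """Copy the template base by base; stop as soon as a ddNTP is incorporated.

    Returns (new_strand, terminated), where terminated is True if a ddNTP
    ended the strand before or at the last template position.
    """
    strand = []
    for base in template:
        strand.append(COMPLEMENT[base])
        # Assumption: each added nucleotide is a ddNTP with a fixed probability.
        if rng.random() < ddntp_fraction:
            return "".join(strand), True
    return "".join(strand), False


def read_ladder(template, n_strands=5000, ddntp_fraction=0.05, seed=0):
    """Simulate many strands, then read the terminal bases by fragment length."""
    rng = random.Random(seed)
    terminal_base_by_length = {}
    for _ in range(n_strands):
        strand, terminated = synthesize_once(template, ddntp_fraction, rng)
        if terminated:
            # The labeled ddNTP is the last base of the fragment.
            terminal_base_by_length[len(strand)] = strand[-1]
    # Shorter fragments travel farther on the gel; reading the ladder from
    # shortest to longest gives the new strand one base at a time.
    lengths = sorted(terminal_base_by_length)
    return "".join(terminal_base_by_length[n] for n in lengths)


print(read_ladder("TACGGATC"))  # ATGCCTAG
```

With thousands of simulated strands, every fragment length is represented, so the ladder spells out the complete complementary strand. With too few strands, some lengths would be missing and the read would have gaps.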
In Sanger’s day, four reactions were set up for each DNA molecule being sequenced, each reaction containing only one of the four possible ddNTPs. Each ddNTP was labeled with radioactive phosphorus. The products of the four reactions were then run in separate lanes side by side on long, narrow PAGE gels, and the bands of varying lengths were detected by autoradiography. Today, the process has been simplified: each of the four ddNTPs is labeled with a different colored fluorescent dye, or fluorochrome, so a single sequencing reaction containing all four ddNTPs suffices for each DNA molecule being sequenced. The fluorochromes are detected by fluorescence spectroscopy. Recording the fluorescence color of each band as it passes the detector, from the shortest fragment to the longest, gives the nucleotide sequence of the newly synthesized strand, which is complementary to the template strand.
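Reading a four-color run amounts to sorting the detected bands by fragment length and translating each color into a base. The sketch below assumes an arbitrary, made-up dye assignment (`DYE_TO_BASE`) and idealized data with exactly one band per length. Real instruments use their own dye sets and signal processing.

```python
# Hypothetical dye-to-base assignment, chosen only for this example.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def call_bases(bands):
    """Translate (fragment_length, dye_color) pairs into the new strand.

    Bands may arrive in any order; sorting by length mimics the order in
    which fragments pass the detector (shortest first).
    """
    return "".join(DYE_TO_BASE[color] for _, color in sorted(bands))


def reverse_complement(sequence):
    """Infer the template strand, written 5' to 3', from the new strand."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


bands = [(3, "yellow"), (1, "green"), (2, "red"), (4, "blue")]
new_strand = call_bases(bands)
print(new_strand)                      # ATGC
print(reverse_complement(new_strand))  # GCAT
```

Because all four dyes are distinguishable in a single lane, one reaction carries the same information that previously required four separate lanes.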
Since 2005, the automated sequencing techniques used by laboratories have fallen under the umbrella of next generation sequencing, a group of automated techniques for rapid DNA sequencing. These methods have revolutionized the field of molecular genetics because low-cost sequencers can generate sequences of hundreds of thousands or millions of short fragments (25 to 600 base pairs) in just one day. Although different companies make several variants of next generation sequencing technology (for example, 454 Life Sciences’ pyrosequencing and Illumina’s Solexa technology), they all allow millions of bases to be sequenced quickly, making the sequencing of entire genomes relatively easy, inexpensive, and commonplace. In 454 sequencing (pyrosequencing), for example, a DNA sample is fragmented into 400–600-bp single-stranded fragments, which are modified by the addition of DNA adapters to both ends of each fragment. Each DNA fragment is then immobilized on a bead and amplified by PCR, using primers designed to anneal to the adapters, producing a bead that carries many copies of that fragment. Each bead is then placed in a separate well containing sequencing enzymes. The four nucleotides are added to the well one after the other; when a nucleotide is incorporated, pyrophosphate is released as a byproduct of polymerization, and the enzymes in the well convert this release into a small flash of light that is recorded by a detector. This reveals the order in which nucleotides are incorporated as a new strand of DNA is made, and it is an example of sequencing by synthesis. Next generation sequencers rely on sophisticated software to handle the cumbersome task of putting all the fragments in order. Overall, these technologies continue to advance rapidly, lowering the cost of sequencing and making sequence data from a wide variety of organisms available more quickly.
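The cyclic addition of nucleotides in pyrosequencing can be sketched as follows. The code assumes an idealized, noise-free signal in which the light recorded for each nucleotide flow equals the number of bases incorporated during that flow (so a run of identical bases gives a proportionally brighter flash). That assumption goes beyond the description above. The flow order `TACG` and the function names are likewise assumptions made for illustration.

```python
FLOW_ORDER = "TACG"  # assumed order in which nucleotides are washed over the well


def simulate_flowgram(new_strand, n_cycles, flow_order=FLOW_ORDER):
    """Return the idealized light signal for each nucleotide flow.

    new_strand is the strand being synthesized (complementary to the
    fragment on the bead). A flow yields 0 when the offered nucleotide
    does not match the next position.
    """
    signals = []
    position = 0
    for _ in range(n_cycles):
        for nucleotide in flow_order:
            incorporated = 0
            # A run of identical bases is incorporated within a single flow.
            while position < len(new_strand) and new_strand[position] == nucleotide:
                incorporated += 1
                position += 1
            signals.append(incorporated)
    return signals


def decode_flowgram(signals, flow_order=FLOW_ORDER):
    """Rebuild the synthesized strand from the per-flow signals."""
    bases = []
    for index, signal in enumerate(signals):
        nucleotide = flow_order[index % len(flow_order)]
        # Rounding tolerates slightly noisy signals such as 0.9 or 2.1.
        bases.append(nucleotide * round(signal))
    return "".join(bases)


flowgram = simulate_flowgram("TTAGC", n_cycles=2)
print(flowgram)                   # [2, 1, 0, 1, 0, 0, 1, 0]
print(decode_flowgram(flowgram))  # TTAGC
```

Because the order of the flows is known in advance, the detector only needs to record when light appears and how strong it is. The identity of each base follows from which nucleotide was being offered at that moment.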
The National Center for Biotechnology Information houses a widely used genetic sequence database called GenBank, in which researchers deposit genetic information for public use. Upon publication of sequence data, researchers upload the data to GenBank, giving other researchers access to the information. This sharing allows researchers to compare newly discovered or unknown sample sequences with the vast array of sequence data that already exists.