Strategies for the systematic sequencing of complex genomes

ABSTRACT

Recent spectacular advances in the technologies and strategies for DNA sequencing have profoundly accelerated the detailed analysis of genomes from myriad organisms. The past few years alone have seen the publication of near-complete or draft versions of the genome sequence of several well-studied, multicellular organisms, most notably the human. As well as providing data of fundamental biological significance, these landmark accomplishments have yielded important strategic insights that are guiding current and future genome-sequencing projects.

KEY POINTS

* The genome sequences of several eukaryotic organisms have been reported in recent years, including a yeast (_Saccharomyces cerevisiae_), a nematode (_Caenorhabditis elegans_), an insect (_Drosophila melanogaster_), a plant (_Arabidopsis thaliana_) and human (_Homo sapiens_).
* These spectacular achievements have been associated with a range of technical advances in basic sequencing methodology, the automation of many of the key steps in the sequencing pipeline, the adoption of industrial-scale experimental protocols and the development of improved computational tools for sequence analysis.
* The two main strategies used for sequencing large, complex genomes are clone-by-clone shotgun sequencing and whole-genome shotgun sequencing. Both approaches were used to generate the recently reported working draft human sequences.
* In clone-by-clone sequencing, individual clones are selected from a contig map (a type of physical map) and each is then sequenced by a shotgun-sequencing strategy. In turn, the genome sequence is assembled by pasting together the sequences of the individual clones.
* In whole-genome shotgun sequencing, the genome is broken into fragments of defined size classes, which are then cloned and used to generate sequence reads. In turn, the genome sequence is assembled from the entire collection of sequence reads.
* Each of the two main strategies has strengths and weaknesses, and a hybrid strategy that involves both whole-genome and clone-by-clone shotgun-sequencing components is being adopted in many current projects. However, it remains to be determined how much sequencing should be done by each strategy when implementing a hybrid approach.
* For new sequencing projects, it is also important to consider whether the genome needs to be sequenced to high accuracy or whether a more draft-level sequence can provide the information that is required. This consideration will probably influence the choice and implementation of a particular sequencing strategy.
* Sequencing the genome of a complex, multicellular eukaryote still poses massive technological challenges and requires a significant amount of funds. Choosing the appropriate sequencing strategy is therefore a crucial step in any genome-sequencing project.
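
The key point on whole-genome shotgun sequencing, in which the genome sequence is assembled from the entire collection of sequence reads, can be sketched as a toy greedy overlap merge in Python. This is only an illustration under stated assumptions, not the method of any production assembler: the function names (`overlap`, `greedy_assemble`), the `min_len` threshold and the 25-base example sequence are all made up for this example. Real assemblers must also handle sequencing errors, repeats, both DNA strands and read-pair constraints.

```python
from itertools import permutations


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of sequences until no overlaps remain.

    Returns the list of resulting contigs (one contig if everything joins).
    """
    # Drop exact duplicates and reads wholly contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != s and r in s for s in contigs)]
    while True:
        best_k, best_pair = 0, None
        for a, b in permutations(contigs, 2):
            k = overlap(a, b, min_len)
            if k > best_k:
                best_k, best_pair = k, (a, b)
        if best_pair is None:
            return contigs
        a, b = best_pair
        merged = a + b[best_k:]
        # Remove the two merged sequences and anything else now contained.
        contigs = [c for c in contigs if c not in merged] + [merged]


# Hypothetical 25-base "genome" and four overlapping reads, given out of order.
genome = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = [genome[12:22], genome[0:10], genome[17:25], genome[6:16]]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGCCTAGGATCCA']
```

In a clone-by-clone project the same kind of read assembly would be run separately on the reads from each clone, and the clone sequences would then be joined along the contig map. In a whole-genome project it is run once over all reads, which is why repeated sequences are a much greater concern there.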




ACKNOWLEDGEMENTS

I thank F. Collins, J. Touchman and R. Wilson for critical reading of this manuscript.

AUTHOR INFORMATION

Eric D. Green, Genome Technology Branch and NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA


GLOSSARY

* FINISHED SEQUENCE: Complete sequence of a clone or genome, with a defined level of accuracy and contiguity.
* SEQUENCE-TAGGED SITE (STS): Short (for example, <1,000 bp), unique sequence associated with a PCR assay that can be used to detect that site in the genome.
* CONTIG: Overlapping series of clones or sequence reads (for a clone contig or sequence contig, respectively) that corresponds to a contiguous segment of the source genome.
* MINIMAL TILING PATH: A minimal set of overlapping clones that together provides complete coverage across a genomic region.
* COVERAGE: The average number of times a genomic segment is represented in a collection of clones or sequence reads (synonymous with redundancy).
* SEQUENCE-READY MAP: Typically considered an overlapping bacterial clone map (for example, a BAC contig map) with sufficiently redundant clone coverage to allow for the rational selection of clones for sequencing.
* UNIVERSAL PRIMING SITE: A short sequence (for example, 16–24 bases) in a cloning vector, immediately adjacent to the vector–insert junction, to which a common (that is, universal) sequencing primer can anneal.
* FULL-SHOTGUN SEQUENCE: A type of prefinished sequence, in this case with sufficient coverage to make it ready for sequence finishing (typically on the order of 8–10-fold coverage).
* WORKING DRAFT SEQUENCE: A type of prefinished sequence, often meant to correspond to sequence with coverage that puts it at roughly the halfway point towards full-shotgun sequence.
* PREFINISHED SEQUENCE: Sequence derived from a preliminary assembly during a shotgun-sequencing project (at this stage, the sequence is often neither contiguous nor highly accurate).
* RADIATION HYBRID MAP: Physical map of markers (typically STSs) positioned on the basis of the frequency with which they are separated by radiation-induced breaks (map construction involves the PCR analysis of rodent cell lines, each containing different fragments of the source genome).

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Green, E. Strategies for the systematic sequencing of complex genomes. _Nat Rev Genet_ 2, 573–583 (2001). https://doi.org/10.1038/35084503

* Issue Date: 01 August 2001
* DOI: https://doi.org/10.1038/35084503
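
The glossary entries for COVERAGE and MINIMAL TILING PATH can be illustrated with a short Python sketch. It assumes that each clone in a contig map has already been assigned start and end coordinates on the region. The clone names, the coordinates (in kb), the `min_overlap` parameter and both function names are hypothetical and chosen only for this example. The selection uses standard greedy interval covering, which is one reasonable way to pick a tiling path and is not claimed to be the procedure used by any particular sequencing centre.

```python
def fold_coverage(clones, region_length):
    """Average number of times each position in the region is represented.

    `clones` maps a clone name to its (start, end) coordinates.
    """
    total = sum(end - start for start, end in clones.values())
    return total / region_length


def minimal_tiling_path(clones, region_start, region_end, min_overlap=0):
    """Pick a smallest set of overlapping clones spanning the region.

    Greedy interval covering: among clones that start early enough to join
    what is already covered, always take the one that extends furthest.
    """
    path, covered = [], region_start
    while covered < region_end:
        # After the first pick, a clone must overlap the previous one
        # by at least `min_overlap` so the two sequences can be joined.
        latest_start = covered if not path else covered - min_overlap
        usable = [
            (end, name)
            for name, (start, end) in clones.items()
            if start <= latest_start and end > covered
        ]
        if not usable:
            raise ValueError(f"gap in clone coverage at position {covered}")
        covered, name = max(usable)
        path.append(name)
    return path


# Hypothetical BAC contig over a 450-kb region (coordinates in kb).
clones = {
    "BAC-A": (0, 150),
    "BAC-B": (40, 190),
    "BAC-C": (120, 280),
    "BAC-D": (170, 330),
    "BAC-E": (260, 400),
    "BAC-F": (310, 450),
}
print(fold_coverage(clones, 450))                          # 2.0
print(minimal_tiling_path(clones, 0, 450, min_overlap=10)) # ['BAC-A', 'BAC-C', 'BAC-E', 'BAC-F']
```

Here the six clones give twofold clone coverage, and four of them suffice to span the region. The redundant clone coverage of a sequence-ready map is what leaves room for this kind of choice.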