Genome Sequencing Progress

by Robert J. Bradbury

Background

DNA genomes are the blueprints or "instruction programs" for biological machines (organisms).  Genomes are written in the four letters of the DNA alphabet (A, C, G, T), which stand for the molecular bases adenine, cytosine, guanine and thymine that are linked together to form DNA.  Linear DNA sequences, which range in size from a few hundred bases to a few hundred thousand bases, are known as "genes".  These genes are the words of the DNA language and are converted by the machinery of a cell into the proteins that carry out most of the construction and destruction activities required to support life.  In addition to the gene sequences, there are much smaller regulatory sequences (usually 10-20 letters long), which may be associated with each gene and are used by cells to control when and how much of each gene product should be produced.  In the genomes of more evolved organisms, there may be a large amount of "junk" DNA left over from the process of evolution.  At this time many scientists believe that this DNA serves no purpose.

As Tables 1 and 2 show, genome centers around the world are making rapid progress in sequencing the genomes of various organisms.
 

Table 1: Simple Genome Sequencing: complete or in progress
Organism
Size (Mbp) (note 1)
Progress
# Genes
Date
Source(s)
Haemophilus influenzae
100%
 1,738
8/31/95
TIGR, Science 269(5223):496-512 (28 Jul 1995)
Mycoplasma genitalium
100%
470
10/31/95
TIGR, Science 270(5235):397-404 (20 Oct 1995)
Methanococcus jannaschii
100%
 1,727
8/23/96
TIGR; Science 273(5278):1058-1073 (23 Aug 1996)
Synechocystis sp. PCC6803
100%
 3,168
9/30/96
KDRI (Japan)
Aquifex aeolicus
1.5
100%
1,512
1998
(pub)
Diversa
Nature 392:353-358 (26 Mar 1998)
Archaeoglobus fulgidus
100%
2,436
11/27/97
TIGR
Nature 390:364-370 (27 Nov 1997)
Helicobacter pylori
100%
 1,590
8/7/97
TIGR, GTC 
Nature 388:539-547 (7 Aug 1997)
Streptococcus pneumoniae
100%
 2,236
 11/21/97
TIGR, Science 293:498-506 (20 Jul 2001) [Supp]
GTC(?), Alabama
Escherichia coli
100%
 4,288
1/31/97
U. Wisconsin, Japan
Science 277(5331):1453-1462 (5 Sep 1997)
Methanobacterium thermoautotrophicum
100%
 1,918
5/31/97
GTC; Ohio State University; Jnl of Bacteriology 179:7135-55 (1997)
Bacillus subtilis
100%
 ~4100
7/19/97
Europe, Japan
Nature 390:249-256 (20 Nov 1997)
Mycoplasma pneumoniae
100%
 679
11/30/96
U. Heidelberg
Neisseria meningitidis
   (Serogroup A strain Z2491)
   (Serogroup B Strain MC58)
 
100%
100%
2,121
2,158
 late '97

Sanger, Nature 404:502-506 (30 Mar 2000)
TIGR, Oxford; Science 287(5459):1809-1815 (10 Mar 2000)
Pyrobaculum aerophilum
2.22
100%
5,402
1/15/02
CalTech, UCLA
Ureaplasma urealyticum
0.8
100%
 
 late '97
U. Alabama, ABI
Pyrococcus horikoshii
 100%
 1,956
 late '97
MITI , Univ. Tokyo
DNA Research 5:55-76 (30 Apr 1998)
Clostridium acetobutylicum
4.03
98%
 
9/30/97
GTC
Mycobacterium leprae
~100%
 
7/7/99
Sanger, GTC
Neisseria gonorrhoeae
97%
 
12/8/97
U. Oklahoma
Streptococcus pyogenes
100%
 1,752
12/8/97
U. Oklahoma
Sanger; PNAS 98(8):4658-4663 (2001)
Mycobacterium tuberculosis
100%
~4000
6/10/99
Sanger, Nature 393(6685):537-544 (11 Jun 1998),
TIGR, GTC
Deinococcus radiodurans
100%
 3,187
~ 1998 
23/11/99
TIGR
Science 286(5444):1571-1577 (1999)
Rhodobacter capsulatus
4.00
100%
 
~ 1998
Univ. Chicago
Thermotoga maritima
1.8
100%
 
~ 1998
TIGR
Nature 399:323-329 (1999)
Enterococcus faecalis
100%
6/11/98
TIGR, Genome Therapeutics
Treponema pallidum
100%
1,041
 7/17/98
TIGR, Univ. Texas
Science 281(5375):375-388 (1998)
Borrelia burgdorferi
100%
 
 11/12/97
TIGR, BNL
Nature 390:580-586
Vibrio cholerae
4.03
100%
3,885
~1998
TIGR; Nature 406(6795):477-483 (2000)
Rickettsia prowazekii 
1.12
100%
 834
1998
Univ. of Uppsala
Nature 396(6707):133-140 (1998)
Chlorobium tepidum         TIGR
Caulobacter crescentus 
4.02
100% 
 3,767
 3/27/01
TIGR; PNAS 98(7):4136-4141 (2001)
Legionella pneumophila
4.1
      TIGR
Mycobacterium avium
4.7
   
~ 2000
TIGR
Porphyromonas gingivalis
~100%
 
6/11/98
TIGR; Science Daily article
Shewanella putrefaciens
4.5
      TIGR
Mycoplasma capricolum 
1.2
      GMU
Pyrococcus furiosus
100%
 2,065
8/?/1999
U. Maryland, U. Utah
Pyrococcus abyssi
 1.8
 100%
 1,765
12/23/00
France; @Genoscope
Sulfolobus solfataricus
100%
 
2/15/01
IMB (Canada)
LBMGE (Orsay, France)
PNAS 98(14):7835-7840
Thiobacillus ferrooxidans
2.6
~100%
2,159
3/28/00
PNAS 97(7):3509-3514
Salmonella typhimurium
4.85
 100%
 4,330
10/25/01
WashU (in assembly), SGSCC
Infect. Immun. 66:4305-4312 (1998)
FEMS Microbiology Letters 173:411-423 (1999)
TIGR
Nature 413:852 - 856 (25 Oct 2001)
Salmonella typhi
4.5
~100%
 4,599
10/25/01
Sanger
Nature 413:848 - 852 (25 Oct 2001)
Salmonella paratyphi  
 2x
 
1999
WashU (2x shotgun coverage)
Klebsiella pneumoniae  
 3x
 
1999
WashU (3x shotgun coverage)
Aeropyrum pernix
1.67
~100%
 2,694
1999
NITE
DNA Research 6:83-101 (1999)
Streptomyces coelicolor
8+
 ~50%
 
~ 2001
Sanger
Chlamydia trachomatis
    Serovar D

    MoPn/Strain Nigg
    Strain AR39

 

1.04

1.07
1.22

100%

100%
100%

894

924
1,052

1998

UC Berkeley, Stanford
Science 282(5389):754-759 (23 Oct 1998)
TIGR
Nuc. Acids Res. 28:1397-1406 (2000)
Chlamydia pneumoniae
1.23
100%
1,073
1999
UC Berkeley, Stanford, Incyte, Nature Genetics 21(4):385-389 (1999)
Giardia lamblia          
Clostridium difficile
4.4
 4.281
 
2000
Sanger (in assembly 11/01)
Campylobacter jejuni
100%
1,708
1999
Sanger; Nature 403:665-668 (10 Feb 2000)
Yersinia pestis
4.38
 4.941
 4,012
2000
Sanger; Nature 413(6855):523-527 (4 Oct 2001)
Bordetella pertussis
3.88
4.067
 
2000
Sanger (in assembly 08/00)
Bordetella bronchiseptica
 4.9
4.453
96.4%
 
2000
Sanger (in assembly 08/00)
Bordetella parapertussis  
2.7%
 
2000
Sanger
Mycobacterium bovis
4.4
4.230
 
1999
Sanger (in assembly 08/00)
Pseudomonas aeruginosa
~100%
 
12/15/99
Pseudomonas Genome Project
Nature 406:959-964 (31 Aug 2000) 
Thermotoga maritima
100%
1,877
5/27/99
TIGR;
Nature 399(6734):323-329 (27 May 1999)
Deinococcus radiodurans
3.28
100%
3,245
11/19/99
Science 286(5444):1571-7 (19 Nov 1999)
Bacillus stearothermophilus
3.13
~100%
 
8/12/00
U. Oklahoma (gap closure)
Actinobacillus actinomycetemcomitans
 ~100%
 
 8/14/00
U. Oklahoma (gap closure)
Staphylococcus aureus
    NCTC 8325
    MRSA strain EMRSA-16
    MSSA strain
 2.8
~100%
~99%
0%
   
U. Oklahoma
Sanger
Streptococcus mutans         U. Oklahoma
Xylella fastidiosa
2.7
100% 
2,904
7/13/00
UNICAMP; Nature 406:151-157 (13 Jul 2000)
Thermoplasma acidophilum
1.6
 100%
 1,509
9/28/00
Max-Planck-Institute for Biochemistry;
Nature 407:508-513 (28 Sep 2000) 
Thermoplasma volcanium
 100%
 1,522
 2/5/01
AIST
Halobacterium salinarium
4.0
      Max-Planck-Institute for Biochemistry 
Methanosarcina mazei
4.10
100%
 3,371
4/?/02
Göttingen Genomics lab
Methanogenium frigidum
?
      UNSW, Sydney
Methanococcus maripaludis
1.66
100%
1,722 
10/?/04
University of Washington
Sulfolobus tokodaii
2.69
100%
2,826 
8/31/01
None?
Ferroplasma acidarmanus
2.0
      JGI
Methanosarcina barkeri fusaro
2.8
      JGI
Halobacterium NRC-1
2.6
100%
2,630
10/3/00
PNAS 97(22):12176-81 (24 Oct 2000)
Cenarchaeum symbiosum
~2.5
      Diversa
Sinorhizobium meliloti
6.7
100%
6,204
7/27/01
Science 293(5530):668-672
Pasteurella multocida
2.26
100%
2,014
 3/6/01
PNAS 98(6):3460-3465
Nitrosomonas europaea
 100%
    JGI
Prochlorococcus marinus
100%
    JGI
Rhodopseudomonas palustris
?
    JGI
Nostoc punctiforme
100%
    JGI
Marine synechococcus
-
    JGI
Cytophaga hutchinsonii         JGI
Ralstonia metallidurans  
100%?
  ~12/??/00 JGI
Pyrolobus fumarii
1.85
100%
~2000
10/4/01
Diversa & Celera; BBC article
Streptomyces coelicolor
8.66
100%
7,825
5/8/02
ABCNews article
Fusobacterium nucleatum (ATCC 25586)
2.17
100%
2,037
4/?/02
J. Bacteriol 184(7):2005-18 (Apr 2002)
Brucella melitensis 
3.29
100%
3,197
12/26/01
PNAS 99(1):443-448 (2002)
Brucella suis
3.31
100%
 
9/23/02
PNAS 99(20):13148-13153 (2002)
Shewanella oneidensis
4.97
100%
4,758
10/7/02
Nat Biotechnol. 20(11):1118-23 (Nov 2002)
Bacteroides thetaiotaomicron
?
100%
~4,800
3/28/03
Science, March 28, 2003
Science Daily article
Porphyromonas gingivalis (W83)
2.34
100%
?
1/?/04
J. Bacteriol 186(2):593 (Jan 2004)
Wolbachia pipientis wMel
1.27
100%
?
3/16/04
PLoS Biol. 2(3):E69 (16 Mar 2004)
Science Daily article
The Japan Times article (25 Mar 2004)
Treponema denticola
2.84
100%
?
4/2/04
PNAS 101(15):5646-51 (13 Apr 2004)
Desulfovibrio vulgaris
3.57
 100%
 ?
 4/11/04
Science Daily article (14 Apr 2004)
Nature Biotechnology 22(5):554-9 (May 2004).
Geobacter sulfurreducens  
100%
 
12/12/03
Science 302(5652):1967-9 (12 Dec 2003)
U. Mass. Press Release (11 Dec 2003)
Silicibacter pomeroyi
~4.6
100%
 
12/16/04
Nature 432(7019):910-3 (16 Dec 2004)
Dehalococcoides ethenogenes
~1.4
100%
 
1/7/05
Science 307(5706):105-8 (7 Jan 2005);
Science Daily article (1/20/05)
Cryptococcus neoformans
~20
100%
~6500
1/13/05
Science Epub (13 Jan 2005);
Science Daily article (1/24/05)
Total: ~76 organisms, ~251 Mbp (~8% of the Human Genome)


 
 
Table 2: Complex Genome Sequencing: complete or in progress
Organism
Size (Mbp)
Progress
# Genes
Date
Source(s)
Saccharomyces cerevisiae (yeast)
100%
6340
4/24/96
Stanford, WashU, Sanger, Europe
Science 274(5287):546-567
(25 Oct 1996)
C. elegans (nematode)
~100
100%
18,425
8/1/98
Sanger, WashU
Science 282:2012-2018 (11 Dec 1998);
WormPep 18 (11 Oct 1999)
Schizosaccharomyces pombe (yeast)
~14
>90%
 
~2000
Sanger
Arabidopsis thaliana
~125
92.3%
25,498
 12/14/2000
Stanford; A thaliana Genome Center; TAIR
Nature 408:791;796-815 (14 Dec 2000)
Plasmodium falciparum (malaria)
23
100%
 5,300
10/3/2002
Sanger, TIGR, Stanford
Nature 419(6906):498-511 (3 Oct 2002).
Leishmania major
 33.6
 ~20%
 
 2003
Sanger
Candida albicans (yeast)
~16
~100%
~5920
May 2002 
Stanford, Sanger, U. Minnesota, Incyte
Trypanosoma brucei
14.2
 
 
 
TIGR, Sanger, Cambridge
Pneumocystis carinii (pneumonia)
7.7
      Sanger
Aspergillus fumigatus
 ~33
 
    Sanger; Aspergillus web site
Aspergillus nidulans
31
 > 2.4 Mbp
    Aspergillus Genomics; U. Oklahoma
Dictyostelium discoideum
34
 
 
 
Sanger
Babesia bovis
9.4
      Sanger
Plasmodium chabaudi
~30
      Sanger
Drosophila melanogaster (fruit fly)
~120
100%
13,601
1999
Celera Genomics
Science 287:2185-95 (24 Mar 2000)
Homo sapiens (man)
3213
~100%
 ~42,000
~2000
NCBI, Celera Genomics
Sequencing ~ complete June 2000
Papers released 11 Feb 2001
Oryza sativa ssp. japonica (rice)
430
~100%
 
4/4/2000
Monsanto, University of Washington,
IRGSP
Mus musculus (mouse)
2600
>99%
 
4/27/2001
Whitehead/MIT, Celera Genomics;
Science now  427:2 (2001)
>94%
    Public sequence, unassembled
Nature 411:121 (2001)
Apis mellifera (honey bee)      
1/7/2004
Baylor; NHGRI: Honey Bee Genome Assembled (6x coverage); S.D.: Honey Bee Genome Assembled
Rattus norvegicus (rat)
~2750
>90%
20,973+
28,516 (note 2)
~2004
Baylor, Celera Genomics, Genome Therapeutics
Science 305:3 (5 Mar 2001)
Nature 428(6982):493-521 (1 Apr 2004).
Fugu rubripes (pufferfish)
~365
~100%
>30,000
~2002
Fugu Project; Fugu Genome Page
Science 297:1301-10 (23 Aug 2002).
Anopheles gambiae (mosquito)
~278
91%
~14,000
2002
Science 306:3 (6 Mar 2001); R. A. Holt et al.; Science 298:129-149 (4 Oct 2002);
The Scientist article (3 Oct 2002).
Cryptosporidium parvum
~9
100%
 
3/2004
Science Daily article (31 Mar  2004);
Science 304:441-5 (16 Apr 2004)
Cryptosporidium hominis
9.2
100%
 
11/2004
Science Daily article (2 Nov  2004);
Nature 431:1107-12 (28 Oct 2004)
Brachydanio rerio (Zebrafish)
~1500
~2005
Sanger, Max-Planck-Institut für Entwicklungsbiologie, Harvard; U. Oregon; Zebrafish Genome Initiative
Phanerochaete chrysosporium
(white rot fungus)
~30
100%
11,777
2004
JGI (4 May 2004).  Science Daily article; Press release; Nat Biotechnol. Epub 2004 May 02
Gallus gallus (chicken)
~1000
~100%
 
3/1/2004
Sanger, Iowa State; Washington University; Press Release; White Paper; Chicken Genome Assembled; Nature article (8 Dec 2004)
Bos taurus (cow)
~3000
      Cow Databases
Oreochromis niloticus (Tilapia [fish])         Tilapia Genome Project
Musa acuminata calcutta 4
(Banana)
~600
   
 ~2004
Global Musa Genomics Consortium
INIBAP PROMUSA;  BBC article
Future Harvest article
Science article
Neurospora crassa
(bread mold)
~40
100%
~10,000
2003
Nature 422(6934):859-68 (24 Apr 2003);
MIT Press Release (26 Sep 2000).
Pan troglodytes (chimpanzee)
~3100
~100%
draft
 
12/10/2003
RIKEN's Genomic Sciences Center with centers from Germany, China, Taiwan, and Korea (Science note)
Chimp Genome Assembled by Sequencing Centers (4x coverage);
Science Daily article
BCM Chimpanzee Genomic Analysis
Canis familiaris (dog)
~2500
100%
 
7/14/2004
New Scientist ns99994682 (16 Feb 2004);
Dog Genome Assembled (7x coverage);
Science Daily article.
Phytophthora ramorum
(sudden oak death)
65
100%
~15,000
6/11/2004
New Scientist ns99995102 (11 Jun 2004).
Phytophthora sojae          
Phanerochaete chrysosporium
(white rot fungus)
~30
       
Xenopus tropicalis
~1700
~2005?
White Paper
Macropus eugenii
(tammar wallaby)
~3000    
~2007?
Kangaroo Hops in Line for Genome Sequencing (2x coverage); White Paper
Monodelphis domestica
(short-tailed opossum)
     
~2005?
White Paper
Coffee
?
~100%
draft(?)
~35,000
8/10/2004
Non-public sequence reported by São Paulo Research Assistance Foundation (Fapesp) and the Genetic Resources and Biotechnology Center (Cenargen) of the Brazilian Agricultural Research Company (Embrapa) in Brazil.
Trichomonas vaginalis
~80
~100%
(draft)
? ?
Science Daily article
Entamoeba histolytica
~24
~100%
(draft)
~10,000
Nature 433:865-868 (24 Feb 2005)
Pathema Database; Science Daily article
Zea mays (corn)
~5000
0.5%
?
?
http://www.maizeseq.org/
Fusarium graminearum
(Wheat Scab Fungus)
~36
~100%
~11,600
4/13/2004
@ Broad Institute
@USDA
Magnaporthe grisea
(Rice Blast Fungus)
?
~100%
~11,000
?
N.C.S.U., Broad Institute
Science Daily article
Mycosphaerella graminicola
(Wheat leaf spotting fungus)
?
~0%
~15,000
~6/30/2006
Science Daily article
USDA ARS announcement

Progress on the Human Genome is accelerating: at the current time the fraction of the Human Genome that has been sequenced in finished form is 24.7%, with a further 66.2% in draft form (as of Sept. 18, 2000).  This data is available in public databases.  An additional 85,000+ gene sequence clusters from cDNAs, representing ~100 Mbases (3% of the human genome), are present in public and private databases (Unigene, and comments made by Incyte and HGS at public conferences).  In 1996, only ~20 MBases/year of large scale sequencing capacity was directed towards the HGP.  This led the Wellcome Trust to commit funding to increase the capacity of the Sanger Center so that it could sequence 1/6th of the Human Genome (500 MBases), which would require a minimum capacity of 150 MBases/year (assuming a coverage of 2).  The entry of Celera Genomics in late 1998 caused the NIH to significantly accelerate genome sequencing in 1999, so estimates for the completion of the genome are now in the 2001-2002 time period.  In September of 1999, Incyte announced that it predicted the number of human genes would be between 129,769 and 142,634.
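
The capacity figures in the paragraph above can be sanity-checked with a little arithmetic.  The following is only a sketch: it assumes a round 3,000 MBase human genome (so that 1/6th gives the 500 MBases quoted), and it assumes that "coverage of 2" simply doubles the raw sequence needed.  The function name is illustrative, not taken from any genome center's software.

```python
# Assumed round figure; 1/6th of it gives the 500 MBases quoted in the text.
HUMAN_GENOME_MBASES = 3000


def years_to_sequence(target_mbases: float, coverage: float,
                      capacity_mbases_per_year: float) -> float:
    """Years needed to produce `coverage`-fold raw sequence over the target."""
    raw_mbases = target_mbases * coverage
    return raw_mbases / capacity_mbases_per_year


sanger_share = HUMAN_GENOME_MBASES / 6  # 500 MBases
years = years_to_sequence(sanger_share, coverage=2, capacity_mbases_per_year=150)
print(f"Sanger share: {sanger_share:.0f} MBases, "
      f"about {years:.1f} years at 150 MBases/year")
```

Under these assumptions the 150 MBases/year figure corresponds to a project lasting a little under seven years; at the ~20 MBases/year available in 1996 the same share would take about fifty.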
 

Table 3: Human Genome Chromosome Progress
Chromosomes: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y

The sequencing capacity required for the HGP understates the overall requirement.  There are projects, initial efforts, plans or good scientific reasons for sequencing the genomes of the 50 major human disease-causing organisms (500+ Mbp), Drosophila (~120 Mbp of coding sequence), Mouse (3 Gbp) and Zebrafish (1.7 Gbp).

In agriculture, mapping or sequencing projects are in progress for Cattle (3 Gbp), Pig (2.7 Gbp), Sheep, Chicken (1.2 Gbp), Arabidopsis (100 Mbp), Maize (5 Gbp), Rice (400 Mbp), Barley, Bean, Cotton, Lettuce, Mushroom, Pine, Rye, Sorghum, Sugarcane, Wheat, Apple, Brassicas, Cabbage, Clover, Cucumber, Grass, Lilium, Pea, Peach, Snapdragon, Spruce, Tobacco, Tomato, Alfalfa, Almond, Asparagus, Berry, Carrot, Celery, Chrysanthemum, Citrus, Cocoa, Cuphea, Lentil, Oat, Onion, Papaya, Peanut, Pear, Pepper, Plum, Poplar, Potato, Rose, Soybean, Squash, Sunflower and Turf Grass.

The sequencing requirements for these organisms should exceed 200 billion base pairs!

The Stanford Genome Center, SW Medical Center, and the Whitehead Institute have devoted significant resources to developing highly automated, high-throughput methods for genome mapping and sequencing.  The Stanford Genome Center intends to make available, in October of 1997, the parts lists, plans and diagrams for the automated machines which have been developed for genome sequencing (www-sghc.stanford.edu).  Estimates from Stanford place the cost of finished DNA sequence information at around $0.25 per base by the year 2000, implying that Genome Centers which are sequencing 10 Mbp/year have budgets in excess of $2.5 million per year.

NIH grants in 1997 were targeting a cost of $0.05 per base for genome sequencing.  These targets may not have taken into account the throughput increases that newer capillary sequencing machines would allow.
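
The budget implied by a per-base cost is a simple product, sketched below for the two cost figures quoted above.  This is an illustration only: the 10 Mbp/year output is the example used in the text, the function name is made up for this sketch, and real center budgets include items a per-base figure does not capture.

```python
def annual_budget(bases_per_year: int, cost_per_base: float) -> float:
    """Yearly sequencing cost in dollars for a given finished-base output."""
    return bases_per_year * cost_per_base


CENTER_OUTPUT = 10_000_000  # 10 Mbp/year, the example output used in the text

for label, cost in [("Stanford estimate ($0.25/base)", 0.25),
                    ("NIH 1997 target ($0.05/base)", 0.05)]:
    print(f"{label}: ${annual_budget(CENTER_OUTPUT, cost):,.0f} per year")
```

At the NIH target the same 10 Mbp/year center would cost roughly one fifth as much to run, which is the scale of saving the grants were aiming for.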

In understanding sequencing, it is necessary to recognize that the bacterial and yeast genome sequencing done thus far has emphasized shotgun sequencing methods.  This approach requires that the genome be randomly fragmented, sequenced to high redundancy and assembled by computer matching of overlapping fragments.  Genomes sequenced in this way require 5 to 10 times as much raw sequence data as the actual genome size.  The lowest redundancy level reached thus far is around 2.5x genome size, achieved by the University of Heidelberg on M. pneumoniae, primarily using primer walking sequencing methods.  Current estimates from Stanford would place human genome sequencing requirements at 3-4x the genome size.
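
The idea of assembling overlapping fragments can be illustrated with a toy example.  The sketch below uses a simple greedy merge of the best-overlapping pair of reads; it is not the method used by any of the genome centers mentioned here, and it assumes error-free reads that all come from the same strand.  The 25-base "genome" and its four reads are invented for the illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Invented example: a 25-base "genome" covered by four 10-base reads.
genome = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = ["TTAGCCTAGG", "ATGGCGTACG", "CTAGGATCCA", "GTACGTTAGC"]

contigs = greedy_assemble(reads)
redundancy = sum(len(r) for r in reads) / len(genome)
print(contigs)
print(f"raw sequence / genome size = {redundancy:.1f}x")
```

Here 40 raw bases reconstruct a 25-base sequence, a redundancy of 1.6x.  Real projects need the much higher redundancy described above because fragments are random, reads contain errors, and repeated sequences create ambiguous overlaps.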

Table 4 lists the existing large-scale genome centers and their current or planned participation in large-scale sequencing efforts and the Human Genome Project.
 
 

Table 4: Genome Sequencing Centers.
Genome Center | Sequencing Capacity | Interest Areas
Sanger Center, U.K. | 37 ABI 373's, 35 ABI 377's = ~4 Mbp/day (raw) (1997); 1/6th of Human Genome (planned) | Yeast, C. elegans, Human (Chr. 22, X, 6, 1, 20, 13)
Washington University, St. Louis, MO | 8/97 capacity ~3.5 Mbp/month (finished); 1 Chromosome+ | C. elegans, ESTs, Human (Chr. 2, 7 + 3, 5, 8, 12, 16, 22, 14, X partial)
DOE Genome Center (LLNL, LBL, LANL), Foster City, CA | 3 Chromosomes, center currently under construction | Human (Chr. 21, 19, 16, 5), Drosophila
TIGR, MD | 21 ABI 373's, 6 ABI 377's (11/1995); 6 ABI 373's, 6 ABI 377's devoted to Chr. 16p in 1996 | ESTs, Microbial genomes, Human (Chr. 16p)
Stanford Human Genome Center, Stanford University, CA | 1 Chromosome, 5-10 Mbp/year in 1997, 17.5 Mbp 1997-1999 |
SW Medical Center, Univ. Texas, TX | 1 Chromosome, 10 Mbp anticipated 1/8/97 to 1/8/98 | Human (Chr. 11, 15)
Whitehead Institute, MIT, MA | 158 capillary based sequencers (9/26/2000), capacity 135,000 sequencing reactions (~70+ Mbp)/day | Mouse
University of Oklahoma, OK | 1 Chromosome | Microbial genomes, Human (Chr. 22, 7)
Baylor College, TX | 2 Chromosomes | Human (Chr. X, 8)
University of Washington, WA | 1 Chromosome | Human (Chr. 7)
University of Texas, San Antonio, TX | 2 Chromosomes | Human (Chr. 3 and 8)
Columbia University, NY | 2 Chromosomes | Human (Chr. 13 and 1)
Albert Einstein/Yale, CT | 1 Chromosome | Human (Chr. 12)
University of Pennsylvania, PA | 1 Chromosome | Human (Chr. 22); NCHGR Chr22
University College London | 2 Chromosomes | Human (Chr. 9 and Y)
JSC Corp. | Portions of 3 Chromosomes | Human (Chr. 3, 6, 21)
Ordered by current sequencing capacity, or by estimated potential future capacity if the genome center is currently being built.

There are several companies which have large amounts of sequencing capacity but which are not actively involved in the Human Genome Project as shown in Table 5.
 
 

Table 5: Commercial organizations with significant sequencing capacity.
Company | Sequencing Capacity | Interest Areas
Celera Genomics | 1999: 200+ ABI 3700's | Drosophila Genome, Human Genome, Rice Genome
Human Genome Sciences | Similar to Incyte | ESTs
Incyte | 1997: 68 ABI 377's | ESTs
Genome Therapeutics | Much less than Incyte | Human (Chr. 10)
Millennium | Much less than Incyte |
Sequana | Much less than Incyte | asthma, obesity, osteoporosis, diabetes, schizophrenia, manic depression

At the current time these companies are focused primarily on human diseases, with some emphasis on agricultural genomes.  With the exception of Genome Therapeutics, they are not expected to play a major role in genome sequencing at this time.


Notes

  1. Genome sizes are typically measured in millions of base pairs (Mbp) or billions of base pairs (Gbp), that is, the number of A, C, G and T nucleotides.
  2. This is the number of transcripts identified.  Transcripts may not produce functional genes.


Created: October 1997
Last Modified: 20 Jan 2007
HTML Editor: Robert J. Bradbury