SARS-Associated Coronavirus (SARS-CoV) Sequencing

On April 14, 2003, the Centers for Disease Control and Prevention (CDC) announced that it had completed full-length sequencing of the genome of the SARS-associated coronavirus (SARS-CoV). The sequence data confirmed that SARS-CoV is a previously unrecognized coronavirus. Information provided by collaborators at the National Microbiology Laboratory, Canada; University of California at San Francisco; Erasmus University, Rotterdam; and Bernhard-Nocht Institute, Hamburg, facilitated the sequencing effort.

The entire sequence, except for the leader sequence, was derived directly from viral RNA. The genome of SARS-CoV is 29,727 nucleotides in length, and the genome organization is similar to that of other coronaviruses. Open reading frames corresponding to the predicted polymerase protein (polymerase 1a, 1b), spike protein (S), small membrane protein (E), membrane protein (M), and nucleocapsid protein (N), plus several other open reading frames of unknown function, have been identified.
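As an illustration of what identifying open reading frames involves, the following is a minimal Python sketch that scans the three forward reading frames of a nucleotide sequence for stretches that begin with a start codon (ATG) and end at the first in-frame stop codon. It is not the method used for the SARS-CoV annotation; the function name find_orfs, the min_codons parameter, and the toy sequence are assumptions made for this example. A simple scan of this kind does not account for features such as the ribosomal frameshifting that coronaviruses use between polymerase 1a and 1b.

```python
# Stop codons in the standard genetic code, written in DNA letters.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    Hypothetical helper for illustration only. Each ORF starts at an ATG
    and ends just after the first in-frame stop codon; `end` is exclusive,
    so seq[start:end] includes the stop codon. `min_codons` counts the
    start and stop codons as well.
    """
    # Accept RNA input by converting U to T.
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one codon (three nucleotides) at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence, not taken from the SARS-CoV genome.
toy = "CCATGAAATAGGG"
orfs = find_orfs(toy)
```

For the toy sequence, the only open reading frame found is ATGAAATAG, in the third forward frame. Real annotation pipelines also apply length thresholds and compare predicted proteins with those of known coronaviruses.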

Persons interested in viewing published GenBank information on SARS-CoV (Urbani strain) sequences may do so at the website of the National Center for Biotechnology Information, National Library of Medicine. The accession number for the sequence of SARS-CoV (Urbani strain) is AY278741.
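For readers who prefer to retrieve the record programmatically, the following is a minimal Python sketch that uses the NCBI E-utilities efetch service with the accession number given above. The helper names build_efetch_url, fetch_fasta, and fasta_length are assumptions made for this example, and the choice of the nuccore database and FASTA output is one reasonable option among several.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession number for SARS-CoV (Urbani strain), as stated in the text.
ACCESSION = "AY278741"


def build_efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,
        "rettype": rettype,   # "fasta" for sequence, "gb" for the full record
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fasta_length(fasta_text):
    """Count sequence characters in FASTA text, skipping header lines."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if line.strip() and not line.startswith(">")
    )


def fetch_fasta(accession):
    """Download the FASTA record; requires network access."""
    with urlopen(build_efetch_url(accession), timeout=30) as response:
        return response.read().decode("utf-8")


url = build_efetch_url(ACCESSION)
```

Calling fasta_length(fetch_fasta(ACCESSION)) would download the record and report its length in nucleotides, which can then be compared with the genome length stated above. The record may have been revised since the original announcement.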
