This artwork, created in March 2020, was inspired by the coronavirus that is causing the COVID-19 outbreak and is currently playing havoc across the world.

This image was made from a DNA sequence, but I used a completely different technique from the one behind my usual SangerArtworks. It was made using a JavaScript program that plotted all 29,903 bases of the COVID-19 coronavirus genome in a square, with the shape of the virus layered into the sequence.
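
To give a feel for what plotting 29,903 bases in a square involves, here is a minimal sketch. It is written in Python rather than JavaScript and is not the program used for the artwork. The function names and the row-by-row filling order are assumptions made for illustration, and the sketch does not attempt to layer the virus shape into the grid.

```python
import math


def square_side(n_bases: int) -> int:
    """Smallest side length whose square has at least n_bases cells."""
    side = math.isqrt(n_bases)
    return side if side * side == n_bases else side + 1


def base_positions(sequence: str):
    """Yield (row, column, base) for each base, filling the square row by row.

    Row-by-row order is an assumption; the artwork may use another layout.
    """
    side = square_side(len(sequence))
    for index, base in enumerate(sequence):
        row, column = divmod(index, side)
        yield row, column, base


# 29,903 is not a perfect square, so the grid is slightly larger than the genome.
side = square_side(29903)
print(side, side * side - 29903)  # 173 26
```

Under these assumptions the genome fits in a 173 by 173 grid, with 26 cells at the end of the last row left empty.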

Viruses are not actually living organisms. They do have genomes but are completely reliant on other "real" living things for their existence and replication. They cannot survive and reproduce on their own, and so by most definitions they are not alive. Viruses are made up of complex organic molecules such as proteins, lipids, carbohydrates and the all-important nucleic acids (either DNA or RNA) that make up their genomes.

Viruses come in all shapes and sizes and can infect the cells of all living things, from simple bacteria to complex life forms such as us humans.

The virus that causes COVID-19 is part of a family of coronaviruses that infect mammals and birds and cause respiratory illnesses. Some coronaviruses can cause the common cold, although most cases of the cold are due to a different group of viruses called rhinoviruses. The coronavirus that causes COVID-19 can cause much more severe symptoms and can lead to pneumonia and death.


Coronaviruses have a single-stranded RNA genome that is enclosed in an envelope of lipid and protein. The proteins that make up the virus are encoded in the RNA genome and include the characteristic spike glycoprotein (S) that protrudes from the surface of the envelope. The spikes give the virus its name: "corona" refers to the way the virions resemble the solar corona when viewed under an electron microscope.

The virus that causes COVID-19 is related to the coronaviruses that caused the Severe Acute Respiratory Syndrome (SARS) outbreak in 2002-2003 and, more recently, Middle East Respiratory Syndrome (MERS). The official name of the virus is not COVID-19; that term refers to the disease itself (coronavirus disease 2019). The virus is officially called SARS-CoV-2 because it is a relative of the original SARS-causing virus, SARS-CoV-1.

The genome of the new SARS-CoV-2 was sequenced by Chinese researchers and published in early February 2020 in the journal Nature. I downloaded the full sequence (accession number: NC_045512.2) from the National Center for Biotechnology Information (NCBI) website. The researchers sequenced the entire genome of a virus isolate obtained from a single patient who was a worker at the Seafood Market in Wuhan, China, and was admitted to hospital on 26 December 2019. The scientists used a next-generation sequencing (NGS) method to reveal the entire sequence of the viral genome. This method, called RNA-seq, requires the conversion of RNA into DNA, a process called reverse transcription, before the resulting DNA is sequenced using one of several possible methods. The Illumina method of DNA sequencing was used in this case and revealed a genome 29,903 nucleotides in length.
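
In terms of sequence letters, the reverse transcription step mentioned above amounts to building the complementary DNA (cDNA) strand of an RNA molecule. The sketch below shows only that base-pairing logic. The function name is hypothetical, and a real RNA-seq workflow involves enzymes, library preparation and sequencing chemistry that are not modelled here.

```python
# Pairing rules for copying RNA into complementary DNA (cDNA).
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand complementary to `rna`, written 5' to 3'.

    The strands are antiparallel, so the RNA is read in reverse
    while each base is swapped for its DNA partner.
    """
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


print(reverse_transcribe("AUGC"))  # GCAT
```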

Comparison of the genome sequence of the new COVID-19-causing virus with those of known viruses revealed that it is most closely related to a coronavirus found in bats, showing 89.1% nucleotide identity. It is also related to the original SARS-causing virus, but with lower sequence similarity. Importantly, the region of the spike glycoprotein that helps the virus to infect cells is very similar to that in the SARS-causing virus, suggesting that the virus infects cells through the same route, which involves the ACE2 protein on the surface of cells. An article in the New York Times gives a great visual representation of how the coronavirus infects cells and hijacks the cell's machinery to copy itself and spread the infection.
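
A figure such as the 89.1% nucleotide identity above is the share of positions at which two aligned sequences carry the same base. The sketch below illustrates that calculation under a simplifying assumption: the two sequences have already been aligned to the same length and contain no gaps. The function name is hypothetical, and this is not the pipeline the researchers used, which would first align the whole genomes.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy example: nine of the ten positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
```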

Working out the sequence of the SARS-CoV-2 genome has been immensely important for both diagnosing and combating COVID-19. The sequence depicted in this image has been used to develop real-time PCR-based tests for diagnosis and is also being used to design vaccines. Most vaccine development strategies target the sequences encoding the spike (S) protein on the surface of the virus particle. This shows just how important DNA sequencing has been, and will continue to be, in battling this virus.

© 2020 by Daniel Wallace