Peer-Reviewed

Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis

Received: 15 June 2020    Accepted: 3 July 2020    Published: 13 August 2020
Abstract

As COVID-19 has spread from the first reported cases into a global pandemic, there have been a number of efforts to understand the mutations and clusters of genetic lineages of the SARS-CoV-2 virus. The virus's high mutation rate and rapid spread make it possible to track chains of infection and to put individual sequences in context. Whole genomes of the SARS-CoV-2 virus are being collected and shared from across the globe. With the advent of affordable and prolific Next-Generation Sequencing, this is the first pandemic in which the genomic evolution of the pathogen can be tracked in near real-time. Phylogenetic analysis methods have recently found broader application in this regard. Here we demonstrate that Principal Component Analysis (PCA), used heavily in population genetics, corroborates the existing findings while providing unique new capabilities to understand our public repositories of complete virus sequences. This novel application of PCA is demonstrated on all SARS-CoV-2 samples publicly available from GenBank and other open-access databases through mid-April 2020. We show that PCA is a useful and easy-to-use tool for analyzing SARS-CoV-2 genomes alongside phylogenetic analysis. It offers a previously untapped opportunity to analyze the dynamics of the current SARS-CoV-2 pandemic in a new way.
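
To make the idea concrete, the following is a minimal sketch of how PCA can separate viral genomes into clusters. It is not the authors' pipeline: the reference string, the sample names, and the helper functions `variant_matrix` and `pca` are all hypothetical, and the sequences are made-up toy data assumed to be already aligned to the reference. Each genome is encoded as a 0/1 vector marking the sites where it differs from the reference, and the centered matrix is decomposed with NumPy's SVD.

```python
import numpy as np


def variant_matrix(reference, sequences):
    """Encode each aligned sequence as 0/1 per site (1 = differs from reference)."""
    ref = np.array(list(reference))
    seqs = np.array([list(s) for s in sequences])
    if seqs.ndim != 2 or seqs.shape[1] != ref.size:
        raise ValueError("sequences must be aligned to the reference length")
    diffs = (seqs != ref).astype(float)
    # Sites that are constant across all samples carry no signal for PCA.
    variable = diffs.std(axis=0) > 0
    return diffs[:, variable]


def pca(matrix, n_components=2):
    """Return sample scores on the top components and their explained variance share."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Hypothetical toy alignment: two invented "lineages" (a and b), not real genomes.
REFERENCE = "ACGTACGTACGT"
SAMPLES = {
    "sample_a1": "ACGTACGTACGT",
    "sample_a2": "ACGTACGTACGA",
    "sample_b1": "TCGTACCTACGT",
    "sample_b2": "TCGTACCTACGG",
}

genotypes = variant_matrix(REFERENCE, list(SAMPLES.values()))
scores, explained = pca(genotypes)
for name, (pc1, pc2) in zip(SAMPLES, scores):
    print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

In this toy case the two sites shared by the "b" samples dominate the first component, so samples from the same invented lineage fall on the same side of PC1. A real analysis would start from variants called against a reference genome and would typically plot the first few components by sampling location or date.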

Published in European Journal of Clinical and Biomedical Sciences (Volume 6, Issue 4)
DOI 10.11648/j.ejcbs.20200604.11
Page(s) 49-55
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

SARS-CoV-2, COVID-19, Principal Component Analysis, Next-Generation Sequencing

Cite This Article
  • APA Style

    Christiane Scherer, James Grover, Darby Kammeraad, Gabe Rudy, Andreas Scherer. (2020). Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis. European Journal of Clinical and Biomedical Sciences, 6(4), 49-55. https://doi.org/10.11648/j.ejcbs.20200604.11

    ACS Style

    Christiane Scherer; James Grover; Darby Kammeraad; Gabe Rudy; Andreas Scherer. Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis. Eur. J. Clin. Biomed. Sci. 2020, 6(4), 49-55. doi: 10.11648/j.ejcbs.20200604.11

    AMA Style

    Christiane Scherer, James Grover, Darby Kammeraad, Gabe Rudy, Andreas Scherer. Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis. Eur J Clin Biomed Sci. 2020;6(4):49-55. doi: 10.11648/j.ejcbs.20200604.11

  • BibTeX

    @article{10.11648/j.ejcbs.20200604.11,
      author = {Christiane Scherer and James Grover and Darby Kammeraad and Gabe Rudy and Andreas Scherer},
      title = {Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis},
      journal = {European Journal of Clinical and Biomedical Sciences},
      volume = {6},
      number = {4},
      pages = {49-55},
      doi = {10.11648/j.ejcbs.20200604.11},
      url = {https://doi.org/10.11648/j.ejcbs.20200604.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ejcbs.20200604.11},
      year = {2020}
    }
    

  • RIS

    TY  - JOUR
    T1  - Investigating the Global Spread of SARS-CoV-2 Leveraging Next-Gen Sequencing and Principal Component Analysis
    AU  - Christiane Scherer
    AU  - James Grover
    AU  - Darby Kammeraad
    AU  - Gabe Rudy
    AU  - Andreas Scherer
    Y1  - 2020/08/13
    PY  - 2020
    N1  - https://doi.org/10.11648/j.ejcbs.20200604.11
    DO  - 10.11648/j.ejcbs.20200604.11
    T2  - European Journal of Clinical and Biomedical Sciences
    JF  - European Journal of Clinical and Biomedical Sciences
    JO  - European Journal of Clinical and Biomedical Sciences
    SP  - 49
    EP  - 55
    PB  - Science Publishing Group
    SN  - 2575-5005
    UR  - https://doi.org/10.11648/j.ejcbs.20200604.11
    VL  - 6
    IS  - 4
    ER  - 

Author Information
  • Christiane Scherer: Department of Microbiology and Hygiene, Institute of Laboratory Medicine, Evangelical Clinical Bethel, Bielefeld, Germany

  • James Grover: Golden Helix, Inc, Bozeman, Montana, United States

  • Darby Kammeraad: Golden Helix, Inc, Bozeman, Montana, United States

  • Gabe Rudy: Golden Helix, Inc, Bozeman, Montana, United States

  • Andreas Scherer: Golden Helix, Inc, Bozeman, Montana, United States
