Session Type: 1-hour Mini Oral Flash
Session Title: 1-hour Mini Oral Flash
Author(s): N. Strepis, B. Goderie, S. Hu, C.H.W. Klaassen
Author Affiliation(s): Erasmus MC University Medical Center, Netherlands
Background:
Next-generation sequencing (NGS) is routinely used for bacterial analysis in research and diagnostic centers. However, for coverage, the most basic NGS metric, no well-established standard exists yet. Furthermore, quality assessment of assembled genomes is often based on simple parameters. In this study, we critically assessed ten coverage depths and six bacterial genome assemblers across three bacterial species.
Methods: We used Illumina short reads from 100 samples each of Escherichia coli (Gram-negative), Klebsiella pneumoniae (Gram-negative) and Staphylococcus aureus (Gram-positive). Only samples with coverage >100x were included, and they originated from different sequencing centers. Reads from these samples were trimmed in silico to mimic coverage depths of 10x to 100x in steps of 10x. We evaluated six assemblers: ABySS, Unicycler, SPAdes, SKESA, Shovill and CLC Genomics. All assemblies were evaluated by general assembly statistics and, in more depth, by the distribution of single nucleotide polymorphisms (SNPs) based on core genome multilocus sequence typing (cgMLST) and by k-mer analysis.
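As an illustration of the coverage reduction step, the arithmetic can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: it assumes that coverage depth is estimated as total sequenced bases divided by genome size, and that depth is reduced by randomly keeping a fraction of read pairs. The function names, the 5 Mb genome size and the 600 Mb read total are hypothetical values chosen for the example.

```python
import random


def coverage_depth(total_bases: int, genome_size: int) -> float:
    """Estimate mean coverage depth as sequenced bases per genome base."""
    return total_bases / genome_size


def subsample_fraction(current_depth: float, target_depth: float) -> float:
    """Fraction of read pairs to keep to go from current to target depth."""
    if target_depth > current_depth:
        raise ValueError("target depth exceeds available depth")
    return target_depth / current_depth


def subsample_pairs(pairs, fraction: float, seed: int = 0):
    """Keep each read pair independently with probability `fraction`.

    Assumption: random subsampling of whole pairs; the study itself
    only states that reads were trimmed in silico.
    """
    rng = random.Random(seed)
    return [pair for pair in pairs if rng.random() < fraction]


# Hypothetical sample: 600 Mb of reads on an assumed 5 Mb genome.
depth = coverage_depth(600_000_000, 5_000_000)

# Fractions needed for the ten target depths (10x to 100x, steps of 10x).
fractions = {t: subsample_fraction(depth, t) for t in range(10, 101, 10)}
```

Because each pair is kept independently, the resulting depth only approximates the target; with the millions of reads in a typical sample the deviation is small.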
Results: ABySS assemblies showed the poorest assembly statistics and major discrepancies in SNP calling. In contrast, Unicycler, SPAdes, SKESA and Shovill produced valid assembly statistics, although SNP calling varied significantly among these assemblers. CLC Genomics showed valid assembly quality but significant discrepancies in SNP calling. For a few assemblies of identical isolates, the SNP variation among assemblers exceeded the cgMLST thresholds used to cluster identical isolates. With respect to sequencing coverage, assembly quality remained largely similar at >20x for E. coli and K. pneumoniae, and at >30x for S. aureus. Overall, Unicycler was the more reliable assembler for the Gram-negative species, whereas CLC Genomics was the more reliable for the Gram-positive species.
Conclusions: We evaluated six commonly used assemblers against multiple criteria and demonstrated their performance on three different bacterial species. The minimum sequence coverage was addressed for E. coli, K. pneumoniae and S. aureus. By choosing the right coverage depth and assembler, research and diagnostic centers can generate bacterial genomes that reliably serve their research goals or clinical decision-making.
Keyword(s): genomes, diagnostics, bacteria