Accurately assembling nanopore sequencing data of highly pathogenic bacteria

Bibliographic Details
Published in: BMC Genomics vol. 26 (2025), p. 1
Main Author: Thomas, Christine
Other Authors: Brangsch, Hanka, Galeone, Valentina, Hölzer, Martin, Marz, Manja, Linde, Jörg
Published:
Springer Nature B.V.
Subjects:
Online Access: Citation/Abstract
Full Text

MARC

LEADER 00000nab a2200000uu 4500
001 3247110401
003 UK-CbPIL
022 |a 1471-2164 
024 7 |a 10.1186/s12864-025-11793-6  |2 doi 
035 |a 3247110401 
045 2 |b d20250101  |b d20251231 
084 |a 58495  |2 nlm 
100 1 |a Thomas, Christine 
245 1 |a Accurately assembling nanopore sequencing data of highly pathogenic bacteria 
260 |b Springer Nature B.V.  |c 2025 
513 |a Journal Article 
520 3 |a Background: Bacterial genome exploration and outbreak analysis rely heavily on robust whole-genome sequencing and bioinformatics analysis. Widely used genomic methods, such as genotyping and detection of genetic markers, demand high sequencing accuracy and precise genome assembly for reliable results. Methods: To assess the utility of nanopore sequencing for genotyping highly pathogenic bacteria with low mutation rates, we sequenced six reference strains using Oxford Nanopore Technologies (ONT) R10.4.1 chemistry and Illumina and evaluated different assembly strategies. The publicly available RefSeq assemblies were chosen as the ground truth. Publicly available sequencing data from key foodborne and public-health-related bacterial pathogens were examined to provide a broader context for the analysis. Results: While an almost perfect assembly was achieved for Bacillus (Ba.) anthracis, results varied for other species. For Brucella (Br.) spp., the final assemblies differed from the Sanger-sequenced references by five to 46 nucleotides. For some key foodborne and public-health-related bacterial pathogens (Klebsiella (K.) variicola, Listeria spp., Mycobacterium (M.) tuberculosis, Staphylococcus (Sta.) aureus, and Streptococcus (Str.) pyogenes), perfect genomes were obtained. Enhanced basecalling models have generally improved assembly accuracy; however, for certain species such as Br. abortus, older models have produced higher accuracy. While long-read polishing mainly improves assembly quality, with only one round needed, our results indicate that this process may also degrade assembly quality. Overall, 81% of the observed errors in ONT assemblies were located within coding sequences (CDS). Furthermore, we found that methylation caused 6.5% of the errors, and the bacterial methylation-aware medaka polishing model reduced the number of errors linked to methylation. Core-genome Multilocus Sequence Typing (cgMLST) analysis revealed allele differences in Ba. anthracis, Br. abortus, and Francisella (F.) tularensis for some assemblers, although with fewer than five allele differences. In the case of Br. melitensis, some assemblies included five allele differences, whereas for Br. suis the correct cgMLST alleles were observed. Conclusions: Assemblies of nanopore data from pathogenic bacteria vary in quality across different species and methods. Errors persist in the final assemblies, including within cgMLST loci, influencing the reliability of outbreak predictions. Nevertheless, specific combinations of existing tools can generate perfect genome assemblies from bacterial ONT sequencing data for outbreak analysis without short-read polishing. 
653 |a DNA methylation 
653 |a Nucleotides 
653 |a Bacteria 
653 |a Public health 
653 |a Klebsiella 
653 |a Methylation 
653 |a Mutation 
653 |a Bioinformatics 
653 |a Alleles 
653 |a Assembling 
653 |a Assemblies 
653 |a Outbreaks 
653 |a Nucleotide sequence 
653 |a Accuracy 
653 |a Pathogens 
653 |a Plasmids 
653 |a Genotyping 
653 |a Gene sequencing 
653 |a Polishing 
653 |a Chromosomes 
653 |a Genomic analysis 
653 |a Errors 
653 |a Whole genome sequencing 
653 |a Genetic markers 
653 |a Mutation rates 
653 |a Multilocus sequence typing 
653 |a Genomes 
653 |a Environmental 
700 1 |a Brangsch, Hanka 
700 1 |a Galeone, Valentina 
700 1 |a Hölzer, Martin 
700 1 |a Marz, Manja 
700 1 |a Linde, Jörg 
773 0 |t BMC Genomics  |g vol. 26 (2025), p. 1 
786 0 |d ProQuest  |t Health & Medical Collection 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3247110401/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3247110401/fulltext/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch