NGSTroubleFinder: A tool for detection and quantification of contamination and kinship across human NGS data

Bibliographic Details
Published in: bioRxiv (Feb 5, 2025)
Main Author: Valentini, Samuel
Other Authors: Venturelli, Tecla; Gallego, Xavier; Perez-Cano, Laura; Guney, Emre
Published:
Cold Spring Harbor Laboratory Press
Subjects: Transcriptomes; Transcriptomics; Contamination; Quality control
Online Access: Citation/Abstract
Full Text - PDF
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3163596310
003 UK-CbPIL
022 |a 2692-8205 
024 7 |a 10.1101/2025.01.31.635690  |2 doi 
035 |a 3163596310 
045 0 |b d20250205 
100 1 |a Valentini, Samuel 
245 1 |a NGSTroubleFinder: A tool for detection and quantification of contamination and kinship across human NGS data 
260 |b Cold Spring Harbor Laboratory Press  |c Feb 5, 2025 
513 |a Working Paper 
520 3 |a Quality control is a fundamental but often neglected step in any NGS pipeline. Detecting issues such as cross-sample contamination and sample swaps is essential to ensure data integrity. Here, we present NGSTroubleFinder, a novel Python tool to detect cross-sample contamination in human Whole-Genome and Whole-Transcriptome Sequencing data, sample swaps, and mismatches between the reported and the inferred genetic and transcriptomic sexes. NGSTroubleFinder is implemented in Python and incorporates a custom-built parallelized pileup engine written in C. The tool reports extensive information on the samples in both textual and HTML formats, including key plots for easy interpretation of the results. Availability and Implementation: NGSTroubleFinder is written in Python and C, and it can be easily installed with pip. The tool source code and the models are freely available on GitHub (https://github.com/STALICLA-RnD/NGSTroubleFinder) and a containerized version is available on Docker Hub (https://hub.docker.com/r/staliclarnd/ngstroublefinder). Competing Interest Statement: Authors are employees of STALICLA DDS. Footnotes: * https://github.com/STALICLA-RnD/NGSTroubleFinder 
653 |a Transcriptomes 
653 |a Transcriptomics 
653 |a Contamination 
653 |a Quality control 
700 1 |a Venturelli, Tecla 
700 1 |a Gallego, Xavier 
700 1 |a Perez-Cano, Laura 
700 1 |a Guney, Emre 
773 0 |t bioRxiv  |g (Feb 5, 2025) 
786 0 |d ProQuest  |t Biological Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3163596310/abstract/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3163596310/fulltextPDF/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u https://www.biorxiv.org/content/10.1101/2025.01.31.635690v1