Democratizing cheminformatics: interpretable chemical grouping using an automated KNIME workflow

Bibliographic details
Published in: Journal of Cheminformatics vol. 16, no. 1 (Dec 2024), p. 101
Main author: Moreira-Filho, José T.
Other authors: Ranganath, Dhruv; Conway, Mike; Schmitt, Charles; Kleinstreuer, Nicole; Mansouri, Kamel
Published: Springer Nature B.V.
Online access: Citation/Abstract; Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3093686877
003 UK-CbPIL
022 |a 1758-2946 
024 7 |a 10.1186/s13321-024-00894-1  |2 doi 
035 |a 3093686877 
045 2 |b d20241201  |b d20241231 
084 |a 113329  |2 nlm 
100 1 |a Moreira-Filho, José T.  |u National Institute of Environmental Health Sciences, National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Division of Translational Toxicology, Research Triangle Park, USA (GRID:grid.280664.e) (ISNI:0000 0001 2110 5790) 
245 1 |a Democratizing cheminformatics: interpretable chemical grouping using an automated KNIME workflow 
260 |b Springer Nature B.V.  |c Dec 2024 
513 |a Journal Article 
520 3 |a With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for the analysis, exploration, visualization, and extraction of information from these data. One such technique is chemical grouping, where chemicals with common characteristics are categorized into distinct groups based on physicochemical properties, use, biological activity, or a combination. However, existing tools for chemical grouping often require specialized programming skills or the use of commercial software packages. To address these challenges, we developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low/no-code, data analytics platform. The workflow serves as an all-encompassing tool, expertly incorporating a range of processes such as molecular descriptor calculation, feature selection, dimensionality reduction, hyperparameter search, and supervised and unsupervised machine learning methods, enabling effective chemical grouping and visualization of results. Furthermore, we implemented tools for interpretation, identifying key molecular descriptors for the chemical groups, and using natural language summaries to clarify the rationale behind these groupings. The workflow was designed to run seamlessly in both the KNIME local desktop version and KNIME Server WebPortal as a web application. It incorporates interactive interfaces and guides to assist users in a step-by-step manner. We demonstrate the utility of this workflow through a case study using an eye irritation and corrosion dataset. Scientific contributions: This work presents a novel, comprehensive chemical grouping workflow in KNIME, enhancing accessibility by integrating a user-friendly graphical interface that eliminates the need for extensive programming skills. 
This workflow uniquely combines several features such as automated molecular descriptor calculation, feature selection, dimensionality reduction, and machine learning algorithms (both supervised and unsupervised), with hyperparameter optimization to refine chemical grouping accuracy. Moreover, we have introduced an innovative interpretative step and natural language summaries to elucidate the underlying reasons for chemical groupings, significantly advancing the usability of the tool and interpretability of the results. 
653 |a Source code 
653 |a Physicochemical properties 
653 |a Applications programs 
653 |a Skills 
653 |a Workflow 
653 |a Unsupervised learning 
653 |a Biological activity 
653 |a Feature selection 
653 |a Biological properties 
653 |a Machine learning 
653 |a User interfaces 
653 |a Automation 
653 |a Irritation 
653 |a Natural language (computers) 
653 |a Learning algorithms 
653 |a Natural language 
653 |a Data analysis 
653 |a Informatics 
653 |a Chemical activity 
653 |a Corrosion mechanisms 
653 |a Summaries 
653 |a Algorithms 
653 |a Corrosion tests 
653 |a Software 
653 |a Visualization 
653 |a Economic 
700 1 |a Ranganath, Dhruv  |u University of North Carolina at Chapel Hill, Chapel Hill, USA (GRID:grid.10698.36) (ISNI:0000 0001 2248 3208) 
700 1 |a Conway, Mike  |u National Institute of Environmental Health Sciences, Research Triangle Park, USA (GRID:grid.280664.e) (ISNI:0000 0001 2110 5790) 
700 1 |a Schmitt, Charles  |u National Institute of Environmental Health Sciences, Division of Translational Toxicology, Research Triangle Park, USA (GRID:grid.280664.e) (ISNI:0000 0001 2110 5790) 
700 1 |a Kleinstreuer, Nicole  |u National Institute of Environmental Health Sciences, National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Division of Translational Toxicology, Research Triangle Park, USA (GRID:grid.280664.e) (ISNI:0000 0001 2110 5790) 
700 1 |a Mansouri, Kamel  |u National Institute of Environmental Health Sciences, National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Division of Translational Toxicology, Research Triangle Park, USA (GRID:grid.280664.e) (ISNI:0000 0001 2110 5790) 
773 0 |t Journal of Cheminformatics  |g vol. 16, no. 1 (Dec 2024), p. 101 
786 0 |d ProQuest  |t Health & Medical Collection 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3093686877/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3093686877/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch