Software Performance Optimization for Classification and Linking of Administrative Documents

Guardado en:
Detalles Bibliográficos
Publicado en:Programming and Computer Software vol. 50, no. 6 (Dec 2024), p. 457
Autor principal: Slavin, O. A.
Publicado:
Springer Nature B.V.
Materias:
Acceso en línea:Citation/Abstract
Full Text
Full Text - PDF
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Resumen:This paper discusses technologies for software performance optimization. Optimization methods are divided into high-level and low-level, as well as parallelization. The described optimization methods are applied to programs and software systems for processing large volumes of information, which have hot spots. An algorithm for classifying and linking fields in a recognized image of an administrative document is described. The implementation features of the classification and linking tasks, which consist in using constellations of text key points and a modified Levenshtein distance, are considered. For optical character recognition (OCR), Smart Document Engine and Tesseract are employed. Several methods used to optimize the performance of functions for document classification and linking are described. The performance optimization of the system for sorting administrative document image streams is considered. The proposed methods for software performance optimization are suitable not only for image processing algorithms but also for computational algorithms with cyclic information processing. The approach can also be used in modern CAD systems to analyze the content of recognized text files.
ISSN:0361-7688
1608-3261
DOI:10.1134/S0361768824700324
Fuente:Advanced Technologies & Aerospace Database