Software Performance Optimization for Classification and Linking of Administrative Documents

Bibliographic Details
Published in: Programming and Computer Software vol. 50, no. 6 (Dec 2024), p. 457
Main author: Slavin, O. A.
Published: Springer Nature B.V.
Description
Abstract: This paper discusses technologies for software performance optimization. Optimization methods are divided into high-level and low-level, as well as parallelization. The described optimization methods are applied to programs and software systems for processing large volumes of information, which have hot spots. An algorithm for classifying and linking fields in a recognized image of an administrative document is described. The implementation features of the classification and linking tasks, which consist in using constellations of text key points and a modified Levenshtein distance, are considered. For optical character recognition (OCR), Smart Document Engine and Tesseract are employed. Several methods used to optimize the performance of functions for document classification and linking are described. The performance optimization of the system for sorting administrative document image streams is considered. The proposed methods for software performance optimization are suitable not only for image processing algorithms but also for computational algorithms with cyclic information processing. The approach can also be used in modern CAD systems to analyze the content of recognized text files.
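Note: The abstract does not say how the Levenshtein distance is modified. For orientation only, here is a minimal Python sketch of the classic edit distance with a pluggable substitution cost. The `sub_cost` hook, the `CONFUSABLE` table, and the 0.5 penalty are hypothetical illustrations of one way such a distance could be adapted to OCR output. They are not the paper's method.

```python
from typing import Callable


def default_sub_cost(x: str, y: str) -> float:
    """Classic Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if x == y else 1.0


def levenshtein(a: str, b: str,
                sub_cost: Callable[[str, str], float] = default_sub_cost) -> float:
    """Edit distance between a and b using two rolling rows (O(len(b)) memory)."""
    prev = [float(j) for j in range(len(b) + 1)]  # distance from "" to b[:j]
    for i, ca in enumerate(a, start=1):
        curr = [float(i)]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                   # delete ca
                curr[j - 1] + 1.0,               # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical table of character pairs that OCR engines often confuse.
CONFUSABLE = {frozenset("0O"), frozenset("1l"), frozenset("5S")}


def ocr_sub_cost(x: str, y: str) -> float:
    """Assumed variant: confusable pairs cost less than other substitutions."""
    if x == y:
        return 0.0
    if frozenset((x, y)) in CONFUSABLE:
        return 0.5
    return 1.0


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(levenshtein("INVOICE", "INV0ICE", ocr_sub_cost))
```

Keeping only two rows of the dynamic-programming table is itself a small low-level optimization of the kind the abstract refers to, since the inner loop touches a compact working set.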
ISSN: 0361-7688, 1608-3261
DOI: 10.1134/S0361768824700324
Source: Advanced Technologies & Aerospace Database