Software Performance Optimization for Classification and Linking of Administrative Documents
| Published in: | Programming and Computer Software vol. 50, no. 6 (Dec 2024), p. 457 |
|---|---|
| Main author: | |
| Published: | Springer Nature B.V. |
| Subjects: | |
| Abstract: | This paper discusses technologies for software performance optimization. Optimization methods are grouped into high-level methods, low-level methods, and parallelization. The described methods are applied to programs and software systems that process large volumes of information and contain hot spots. An algorithm for classifying and linking fields in a recognized image of an administrative document is described. The implementation of the classification and linking tasks, which relies on constellations of text key points and a modified Levenshtein distance, is considered. Smart Document Engine and Tesseract are employed for optical character recognition (OCR). Several methods used to optimize the performance of the document classification and linking functions are described, and the performance optimization of a system for sorting streams of administrative document images is considered. The proposed methods are suitable not only for image processing algorithms but also for computational algorithms with cyclic information processing. The approach can also be used in modern CAD systems to analyze the content of recognized text files. |
|---|---|
| ISSN: | 0361-7688 1608-3261 |
| DOI: | 10.1134/S0361768824700324 |
| Source: | Advanced Technologies & Aerospace Database |
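
The abstract mentions a modified Levenshtein distance for classifying and linking recognized text, but this record does not say what the modification is. As a rough illustration only, the sketch below uses the classic, unmodified Levenshtein distance to match a noisy OCR word against a list of keywords. The function names, the `max_ratio` threshold, and the sample keywords are assumptions made for this example and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1 each.

    This is the standard algorithm, NOT the paper's modified variant.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def best_keyword(word, keywords, max_ratio=0.34):
    """Return the closest keyword, or None if nothing is close enough.

    max_ratio is a hypothetical tolerance: the allowed distance as a
    fraction of the keyword length.
    """
    best, best_dist = None, None
    for kw in keywords:
        dist = levenshtein(word.lower(), kw.lower())
        if best_dist is None or dist < best_dist:
            best, best_dist = kw, dist
    if best is not None and best_dist <= max_ratio * len(best):
        return best
    return None


if __name__ == "__main__":
    # "lnvoice" imitates a typical OCR confusion of "I" and "l".
    print(best_keyword("lnvoice", ["invoice", "contract"]))
```

The distance keeps only two rows of the dynamic-programming table, so memory grows with the shorter string rather than with the product of both lengths, which matters when a function like this sits in a hot spot and is called for every recognized word.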