1. DESIGNING A DOCUMENT IMAGE ANALYSIS SYSTEM ON 3 AXIS: EDUCATION, RESEARCH AND PERFORMANCE
Authors: Vlasceanu Giorgiana Violeta, Boiangiu Costin Anton, Deaconescu Razvan Adrian, Rughinis Razvan, Prodan Marcel, Avatavului Cristian, Mocanu Irina
2019 | Volume 1 | DOI: 10.12753/2066-026X-19-027 | Pages: 202-208
Abstract
Technology advances to make life easier for people. We tend to surround ourselves with devices that are as small as possible and have the highest computing power. The need to access data from anywhere is an important factor. As a consequence, digital documents have been gaining ground on printed ones, and in some sectors they have replaced them entirely.
The need and the obligation to preserve the written cultural heritage, represented by books and valuable documents, some of them rare or even unique, led us to envision a system that protects this patrimony while also making it accessible. To make books easily available to the public at the lowest possible risk to the originals, we set out to design and build an efficient digitization system for these records.
This article presents the proposed architecture of a Document Image Analysis System that processes information with an individual module for each type of operation. The main purpose of such a tool is to recognize information in documents and extract it for electronic use. The flow of operations is specified by the user, and some steps can be skipped depending on the user's needs.
To design an efficient Document Image Analysis System, we need a three-axis approach: Education, involving students who can receive tasks for replacing modules and validating their homework; Research, performing various tests; and Performance, testing the module interconnection and enabling the system to be extremely configurable.
Whichever axis is considered, the main goal is the flexibility of the system, achieved through individual modules implemented as physical binaries or collections of binaries linked via scripts. Each module is designed to accomplish a certain major task by executing several sub-tasks whose results, in most cases, are subject to an intelligent voting process that produces the module's output data.
Keywords
Retroconversion; Document Image Analysis; Optical Character Recognition; OCR; Digitization; Document Export; Lib2life.
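The modular flow described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the voting works, so simple majority voting is assumed here, and plain Python functions stand in for the binaries linked via scripts. The names `vote`, `make_module` and `run_pipeline`, as well as the toy sub-tasks, are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# Hypothetical stand-in: a sub-task maps a module's input to one candidate result.
SubTask = Callable[[str], str]
Module = Callable[[str], str]


def vote(candidates: Sequence[str]) -> str:
    """Majority vote (an assumption); ties go to the earliest candidate."""
    counts = Counter(candidates)
    best = max(counts.values())
    return next(c for c in candidates if counts[c] == best)


def make_module(sub_tasks: Sequence[SubTask]) -> Module:
    """Build a module whose output is the voted result of its sub-tasks."""
    def module(data: str) -> str:
        return vote([task(data) for task in sub_tasks])
    return module


def run_pipeline(data: str, modules: Dict[str, Module], steps: Sequence[str]) -> str:
    """Run only the steps the user selected, in the order given."""
    for name in steps:
        data = modules[name](data)
    return data


# Toy modules: each has three sub-tasks, one of which disagrees with the others.
modules = {
    "normalize": make_module([str.strip, str.strip, str.rstrip]),
    "uppercase": make_module([str.upper, str.upper, str.lower]),
}

full = run_pipeline("  Lib2life ", modules, ["normalize", "uppercase"])
partial = run_pipeline("  Lib2life ", modules, ["normalize"])
```

Replacing a module then amounts to swapping one entry in `modules`, and skipping a step amounts to leaving its name out of `steps`, which mirrors the flexibility the abstract calls for.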