A deep learning-based solution for digitization of invoice images with automatic invoice generation and labelling


Creative Commons License

Arslan H., Işık Y. E., Görmez Y.

INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, vol.27, no.1, pp.97-109, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 27 Issue: 1
  • Publication Date: 2024
  • DOI: 10.1007/s10032-023-00449-4
  • Journal Name: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Page Numbers: pp.97-109
  • Sivas Cumhuriyet University Affiliated: Yes

Abstract

The volume of invoices exchanged between companies has grown enormously. Invoices are crucial financial documents, and companies need to extract the information they contain so that it can be accessed and checked quickly when necessary. While electronic invoices can be transferred easily to a company’s ERP system with the help of integrators, information from printed invoices must be entered into the ERP system. This entry is generally performed manually by company employees, so the probability of error is high. Automatic recognition of the information in printed invoices would reduce the possibility of error and save time and money by reducing workforce requirements. This study proposes a deep learning-based solution for detecting fields in invoice images, a task in high demand among businesses. The system offers an end-to-end solution that includes a novel method for generating synthetic invoices and labeling them automatically. Three invoice templates were used to evaluate the usability of the system, and an adaptive fine-tuning-based solution is proposed for newly arriving invoice templates. Furthermore, six object detection models were compared to find the most suitable one for the problem. The system was also tested on 1022 manually labeled real invoice images to assess real-world usage. The results indicated that the fine-tuned model achieved an accuracy 8.4% higher than the baseline models. In tests performed on CPU, the TOOD and Cascade-RCNN models were the most successful, while YOLOv5 was the fastest. Depending on which need has priority, either the most successful or the fastest model can be preferred for real-time detection of invoice fields. The synthetic invoice generation code is available at https://github.com/SCU-CENG/Invoice-Generation.
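The idea of generating synthetic invoices with automatic labeling can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above); it only shows why labels come for free when the generator itself decides where each field is placed. The field names, the template coordinates, the fixed character size, and the helper names `random_value`, `generate_invoice`, and `to_yolo_labels` are all assumptions made for illustration. Rendering the text onto an image is omitted. The label lines follow the normalized `class x_center y_center width height` layout commonly used with YOLO-style detectors.

```python
import random

# Hypothetical field classes; the actual label set is not given in the abstract.
FIELD_CLASSES = ["invoice_number", "invoice_date", "total_amount"]

# Assumed character width and line height in pixels, standing in for real font metrics.
CHAR_W, LINE_H = 12, 24


def random_value(field, rng):
    """Produce a plausible random string for a field (illustrative only)."""
    if field == "invoice_number":
        return "INV-" + "".join(rng.choice("0123456789") for _ in range(8))
    if field == "invoice_date":
        return f"{rng.randint(1, 28):02d}.{rng.randint(1, 12):02d}.{rng.randint(2015, 2023)}"
    return f"{rng.uniform(10, 99999):.2f}"


def generate_invoice(template, page_w, page_h, rng):
    """Place one value per field near its template anchor and record its box."""
    fields = []
    for name, (ax, ay) in template.items():
        text = random_value(name, rng)
        w, h = CHAR_W * len(text), LINE_H
        # Jitter the anchor slightly, then clamp so the box stays on the page.
        x = min(max(ax + rng.randint(-20, 20), 0), page_w - w)
        y = min(max(ay + rng.randint(-10, 10), 0), page_h - h)
        fields.append({"name": name, "text": text, "box": (x, y, w, h)})
    return fields


def to_yolo_labels(fields, page_w, page_h):
    """Convert pixel boxes to 'class cx cy w h' lines with normalized coordinates."""
    lines = []
    for f in fields:
        x, y, w, h = f["box"]
        cls = FIELD_CLASSES.index(f["name"])
        cx, cy = (x + w / 2) / page_w, (y + h / 2) / page_h
        lines.append(f"{cls} {cx:.6f} {cy:.6f} {w / page_w:.6f} {h / page_h:.6f}")
    return lines


# One hypothetical template: field name -> top-left anchor in pixels.
TEMPLATE_A = {
    "invoice_number": (900, 120),
    "invoice_date": (900, 160),
    "total_amount": (880, 1500),
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fields = generate_invoice(TEMPLATE_A, page_w=1240, page_h=1754, rng=rng)
labels = to_yolo_labels(fields, 1240, 1754)
```

Because every box is recorded at the moment the text is placed, no manual annotation step is needed for the synthetic data; supporting a new invoice template in this sketch would amount to adding another anchor dictionary.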