Document Classification and Key Information Extraction Using Multimodal Transformers | IEEE Conference Publication | IEEE Xplore

Abstract:

Companies manage and track their expenses either physically or through software applications. However, manual expense entry is prone to errors, and these errors cause losses of money, time, and productivity. Therefore, this study presents a novel system for automating document information entry, with a special focus on financial documents, through machine learning techniques. The methodology involves training LayoutLM models for sequence and token classification to categorize and extract detailed information from various financial documents such as receipts and invoices. The proposed system integrates state-of-the-art models such as LayoutLMv2, LayoutLMv3, and fastText to achieve accurate document classification and information extraction. The designed system was implemented and tested on various types of receipts and invoices containing financial values, using evaluation metrics such as accuracy, precision, recall, and F1-score. The proposed system achieves accuracy, precision, and F1 scores above 90% across various document types and automated document processing tasks, which supports its suitability for document processing applications.
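
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a single label in a token classification setting. It is not the authors' evaluation code: the helper name `classification_metrics`, the label names (`TOTAL`, `DATE`, `O`), and the sample data are assumptions made purely for illustration.

```python
from collections import Counter


def classification_metrics(gold, pred, positive):
    """Accuracy over all items, plus precision/recall/F1 for one label.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    counts = Counter()
    for g, p in zip(gold, pred):
        if p == positive and g == positive:
            counts["tp"] += 1  # predicted positive, actually positive
        elif p == positive:
            counts["fp"] += 1  # predicted positive, actually another label
        elif g == positive:
            counts["fn"] += 1  # missed a true positive

    correct = sum(g == p for g, p in zip(gold, pred))
    accuracy = correct / len(gold) if gold else 0.0

    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up token labels for a receipt (illustrative only).
gold = ["TOTAL", "O", "DATE", "TOTAL", "O"]
pred = ["TOTAL", "TOTAL", "DATE", "O", "O"]
metrics = classification_metrics(gold, pred, positive="TOTAL")
```

Here accuracy is computed over all tokens, while precision, recall, and F1 are computed for the chosen label only; averaging these per-label scores across labels would be one common way to obtain a single figure, though the abstract does not state which averaging scheme was used.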
Date of Conference: 26-28 October 2024
Date Added to IEEE Xplore: 11 December 2024
Conference Location: Antalya, Turkiye
