Appearance Enhancement for Camera-Captured Document Images in the Wild


Impact Statement:
Document AI has attracted enormous interest due to increased industrial demand. An automatic, accurate, and rapid Document AI system can significantly improve productivity. More and more images encountered in Document AI are now coming from photographs, which usually contain appearance degradations due to uncontrolled photographic environments. Such degradations hamper the clarity of content and preclude subsequent application of Document AI systems. The existing methods only account for limited scenarios and are not sufficiently practical. To address these issues and tackle the problem of document image appearance enhancement, which is an open problem that has not been well studied, we propose GCDRNet. With global context learning, detail-related restoration, multiscale, and multiloss designs, GCDRNet provides superior performance and can handle various types of appearance degradations simultaneously. We also propose a new dataset called RealDAE, which, to the best of our knowledge, i...

Abstract:

Camera-captured document images usually suffer from various appearance degradations, which reduce the clarity of the content and impede subsequent analysis and recognition systems. Most existing methods are tailored to one or relatively few degradations, making them feasible only in limited scenarios. In real-world applications, however, degradations are more diverse, and different degradations may occur simultaneously in a single image. To remedy this limitation, we aim to achieve appearance enhancement for camera-captured document images in the wild. To this end, we propose a new end-to-end neural network called GCDRNet, which consists of two cascaded subnets: a global context learning network (GC-Net) and a detail restoration network (DR-Net). The GC-Net is used for global context modeling, and the DR-Net is used for detail restoration through a multiscale and multiloss training s...
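
The cascade described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the stand-in stages `toy_gc_net` and `toy_dr_net`, the choice to pass the original image to the second stage, the use of average pooling, and the plain L1 loss summed over scales are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def downsample(img: Image, factor: int) -> Image:
    """Average-pool by `factor`; assumes both dimensions are divisible by it."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / factor ** 2
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def l1(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of equal size."""
    n = len(a) * len(a[0])
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def enhance(
    img: Image,
    gc_net: Callable[[Image], Image],
    dr_net: Callable[[Image, Image], List[Image]],
) -> List[Image]:
    """Two-stage cascade: a coarse global estimate, then detail refinement.

    Assumption: the second stage sees both the input and the coarse estimate
    and returns predictions at full, 1/2, 1/4, ... resolution.
    """
    coarse = gc_net(img)
    return dr_net(img, coarse)


def multiscale_l1(preds: List[Image], target: Image) -> float:
    """Sum of L1 losses, comparing preds[i] with the target reduced by 2**i."""
    return sum(l1(p, downsample(target, 2 ** i)) for i, p in enumerate(preds))


def toy_gc_net(img: Image) -> Image:
    # Hypothetical stand-in for a learned stage: global contrast stretch
    # so that the brightest pixel becomes 1.0.
    peak = max(max(row) for row in img)
    return [[v / peak for v in row] for row in img]


def toy_dr_net(img: Image, coarse: Image) -> List[Image]:
    # Hypothetical stand-in: forwards the coarse estimate at two scales
    # and ignores `img`; a real network would refine details here.
    return [coarse, downsample(coarse, 2)]


photo = [[0.5, 0.25], [0.25, 0.5]]
clean = [[1.0, 1.0], [1.0, 1.0]]
preds = enhance(photo, toy_gc_net, toy_dr_net)
loss = multiscale_l1(preds, clean)
```

In an actual training setup, each stage would be a learned network and the per-scale terms would typically combine several loss types; the sketch only shows how the data flows through a cascade and how supervision can be applied at more than one resolution.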
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 5, May 2024)
Page(s): 2319 - 2330
Date of Publication: 02 October 2023
Electronic ISSN: 2691-4581
