Towards a canonical and structured representation of PDF documents through reverse engineering | IEEE Conference Publication | IEEE Xplore