Work-efficient parallel non-maximum suppression for embedded GPU architectures | IEEE Conference Publication | IEEE Xplore