Parallelizing general histogram application for CUDA architectures | IEEE Conference Publication | IEEE Xplore