This work presents an efficient CUDA implementation of the Canny edge detection filter for the Insight Segmentation and Registration Toolkit (ITK). The algorithm is tested on three generations of NVIDIA GPGPUs and runs 3.6 to 50 times faster than the standard ITK Canny running on two CPU models. The CUDA-enabled Canny is also compared with a more efficient Canny implementation from the OpenCV library. Coding strategies that avoid warp serialization in CUDA are illustrated with an optimized implementation of the Sobel filter, as well as with other algorithms.
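As an illustration of what avoiding warp serialization can look like in a Sobel filter, the sketch below handles image borders by clamping neighbor indices instead of branching on "is this pixel on the border". The abstract does not describe the authors' actual strategy, so this is only one common pattern and not their implementation. The names `clampIndex`, `sobelMagnitude`, and `sobel` are hypothetical, as is the replicate-border policy. The code is plain C++ so it can be compiled without a GPU; the `HD` macro marks the per-pixel functions as `__host__ __device__` when built with `nvcc`, so the same body could be called from a kernel with one thread per pixel.

```cpp
#include <cmath>
#include <vector>

// When compiled by nvcc, make the per-pixel helpers callable from device code.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Clamp i into [0, n-1]. Simple conditional expressions like this are
// typically compiled to select/predicated instructions on GPUs, so every
// thread in a warp follows the same instruction path (assumption: this
// depends on the compiler and target architecture).
HD inline int clampIndex(int i, int n) {
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Sobel gradient magnitude at pixel (x, y) of a row-major image.
// Border pixels read replicated neighbors via clampIndex, so there is no
// separate "border" code path that would split a warp.
HD inline float sobelMagnitude(const float* img, int width, int height,
                               int x, int y) {
    const int xm = clampIndex(x - 1, width);
    const int xp = clampIndex(x + 1, width);
    const int ym = clampIndex(y - 1, height);
    const int yp = clampIndex(y + 1, height);

    // 3x3 neighborhood (the center pixel is not needed by Sobel).
    const float a = img[ym * width + xm];
    const float b = img[ym * width + x];
    const float c = img[ym * width + xp];
    const float d = img[y * width + xm];
    const float f = img[y * width + xp];
    const float g = img[yp * width + xm];
    const float h = img[yp * width + x];
    const float i = img[yp * width + xp];

    const float gx = (c + 2.0f * f + i) - (a + 2.0f * d + g);  // horizontal gradient
    const float gy = (g + 2.0f * h + i) - (a + 2.0f * b + c);  // vertical gradient
    return std::sqrt(gx * gx + gy * gy);
}

// Host-side reference loop. In a CUDA kernel, the two loops would be
// replaced by the thread's (x, y) coordinates.
std::vector<float> sobel(const std::vector<float>& img, int width, int height) {
    std::vector<float> out(img.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            out[y * width + x] = sobelMagnitude(img.data(), width, height, x, y);
        }
    }
    return out;
}
```

The design choice here is to trade a few redundant memory reads at the image border for a uniform control flow across all threads. Whether that is a net gain on a given GPU generation would need to be measured.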