Threaded Ray eXecution (TRaX) is a highly parallel multithreaded multicore processor architecture designed for real-time ray tracing. The TRaX architecture consists of a set of thread processors that include commonly used functional units (FUs) for each thread and that share larger FUs through a programmable interconnect. The memory system takes advantage of the application's read-only access to the scene database and write-only access to the frame buffer output to provide efficient data delivery with a relatively simple memory system. One specific motivation behind TRaX is to accelerate single-ray performance instead of relying on ray packets in single-instruction-multiple-data mode to boost throughput, which can fail as packets become incoherent with respect to the objects in the scene database. In this paper, we describe the TRaX architecture and our performance results compared to other architectures used for ray tracing. Simulated results indicate that a multicore version of the TRaX architecture running at a modest speed of 500 MHz provides real-time ray-traced images for scenes of a complexity found in video games. We also measure performance as secondary rays become less coherent and find that TRaX exhibits only minor slowdown in this case while packet-based ray tracers show more significant slowdown.
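The claim that packet-based tracing degrades as rays become incoherent can be illustrated with a toy model. The sketch below is not the TRaX simulator or any method from the paper; it assumes a simplified masked-packet scheme in which a packet must visit every acceleration-structure node that any of its rays needs, with only the rays that need a given node doing useful work there. The function name `packet_simd_utilization` and the node numbering are hypothetical.

```python
from typing import List, Set


def packet_simd_utilization(paths: List[Set[int]]) -> float:
    """Fraction of SIMD lane-steps that do useful work for one ray packet.

    Assumption (illustrative only): each entry of `paths` is the set of
    node ids a single ray needs to visit. The packet visits the union of
    all those nodes, and at each node only the rays that need it are active.
    """
    if not paths:
        raise ValueError("a packet needs at least one ray")
    visited = set().union(*paths)       # nodes the whole packet must step through
    if not visited:
        return 1.0                      # nothing to traverse, no lanes wasted
    useful = sum(len(p) for p in paths) # lane-steps where a ray actually needs the node
    issued = len(paths) * len(visited)  # every lane is occupied at every visited node
    return useful / issued


# Four coherent rays that all follow the same path through the hierarchy.
coherent = [{0, 1, 2, 3} for _ in range(4)]

# Four incoherent rays that share only the root node (id 0).
incoherent = [{0, 1, 2, 3}, {0, 4, 5, 6}, {0, 7, 8, 9}, {0, 10, 11, 12}]

print(packet_simd_utilization(coherent))
print(packet_simd_utilization(incoherent))
```

Under this model the coherent packet keeps every lane busy, while the incoherent packet issues 52 lane-steps for 16 useful ones. A single-ray design does not pay this masking cost, because each thread follows only its own ray's path; the model says nothing about the other costs of either approach.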