Volume rendering is an indispensable tool in visualization, with applications ranging from simulation data analysis to imaging medical and biological scan data. The principal means of visualizing a volume consist of choosing an isosurface or interpreting the volume in its entirety via direct volume rendering (DVR). Traditionally, these modalities have been implemented using separate algorithms, and were employed with different visualization and application goals. Isosurfaces are commonly generated by extracting a triangle mesh, and are useful in understanding topological and geometric behavior of a scalar field at implicitly defined boundaries. Direct volume rendering is a more expressive method of visualizing volume data, in which the user supplies a transfer function mapping scalar values to colors. As opposed to a single isovalue corresponding to a 2-manifold surface, a transfer function allows for rendering 3-manifold segments of the volume.

In principle, an isosurface can be defined by a transfer function with a Dirac impulse at the chosen isovalue. However, rendering such a transfer function poses problems for most conventional volume rendering algorithms. Methods involving uniform sampling and postclassification will invariably miss an infinitely fine impulse, entirely omitting the desired surface features. Preintegrated transfer functions remedy this, but introduce new artifacts due to their discretization of scalar values into a 2D lookup table, and weighting assumptions on the volume rendering integral. Most commonly, the solution to rendering isosurfaces within a volume rendering framework has been to increase the sampling rate and to smooth the transfer function. Nonetheless, doing so can be computationally wasteful and limits classification.

Spatial traversal strategies for GPU isosurface ray casting closely mirror those for DVR sampling. We pair these processes by identifying isovalues of interest at peaks of a 1D transfer function, using the uniform volume ray casting process to isolate these roots, and sampling directly at the desired isovalues. While numerous applications allow for multi-modal DVR and isosurface visualization, to the best of our knowledge our approach of sampling isosurfaces directly within the volume rendering integral has not previously been employed. Perhaps this is because standard techniques employing smooth transfer functions were considered sufficient. Nonetheless, definition and accurate rendering of sharp transfer functions is desirable, not only in terms of overall image quality but in the ability to classify features flexibly and render accurately with a fixed sampling budget. To further ensure samples are spent wisely, we devise a novel approach to volumetric sampling using a quadratic function for incrementing samples based on ray differential propagation. This helps in sufficiently sampling features close to the viewpoint, and is particularly useful when employing higher-order filters for which samples are expensive. While orthogonal, these techniques work well together, particularly in rendering nearby high-frequency features with high fidelity. We compare both methods with standard techniques, and show how they offer higher quality imaging and better classification for various data sets.

SECTION 2

## Related Work
Levoy [16] employed ray casting in the first implementation of direct volume rendering. The advent of z-buffer hardware and built-in texture interpolation units allowed for interactive performance with slice-based rasterization approaches [2], [4]. Similarly, rasterization methods employing splatting [32] proved to be efficient, particularly for applications involving unstructured data and higher-order reconstruction filters [34], [35]. While optimized CPU algorithms are capable of interactive volume rendering [11], [15], GPU approaches gained popularity, due to improved computational throughput and built-in texture fetching and interpolation. With programmable shader support for branching and looping, volume ray casting methods experienced a resurgence on the GPU [14], [26].

The conventional means of rendering discrete isosurfaces from volume data has been to extract a mesh using marching cubes [20]. Mesh extraction methods can be combined with min-max spatial subdivision structures [33], as well as view-dependent [18] approaches for further efficiency. Marching cubes only approximates the implicit surface on a coarse scale, and more sophisticated methods [28] are generally not suited for dynamic extraction. However, it is possible to combine extraction with splatting [19] for efficient rendering, or to employ splatting directly on isosurfaces [3].

Ray casting methods were first applied towards volumetric isosurfacing by Sramek [31]. Parker et al. [24], [25] implemented a tile-based parallel ray tracer and achieved interactive rendering of isosurfaces from large structured volumes, employing a hierarchical grid as a min-max acceleration structure and an analytical cubic root solving technique for trilinear patches. Hadwiger et al. [7] combined rasterization of min-max blocks with adaptive sampling and a secant method solver to ray cast discrete isosurfaces on the GPU. Our peak finding method is close in spirit to this approach; however we employ our solving method not only in rendering isosurfaces but in handling potentially sharp unidimensional transfer functions. Ray differentials were introduced by Igehy [8] as a way of calculating image-space derivatives of pixels as rays are transmitted, reflected and refracted in world-space, and using these values for filtering. While similar concepts have been used in multiresolution isosurface ray casting [13], to our knowledge no approach has used ray differentials for volumetric sampling.

A large body of volume rendering literature deals with transfer functions, both in how to construct them and employ them in classification. To limit artifacts when sampling high-frequency features of a transfer function, the best existing approaches are preintegration [5], [21], [27] and analytical integration of specially constructed transfer functions [10]. Hadwiger et al. [6] analyze the transfer function for discontinuities to generate a pre-compressed visibility function employed in volumetric shadow mapping. Our approach is similar except that we search for local maxima, and use these directly in enhancing classification.

SECTION 3

## Background and Overview

Direct volume rendering is the process of modeling a volume as a participating optical medium, and estimating the emission and absorption of these media according to a discrete approximation of the radiative transport equation. On a segment of a ray, irradiance is formulated as
$$I(a,b) = \int_a^b {\rho_E (f(s))\rho_\alpha (f(s))e^{ - \int_a^s {\rho_\alpha (f(t))dt} } ds}$$
where *ρ*_{E} is the emissive (color) term and *ρ*_{α} is the opacity term of the transfer function; *a*, *b* are the segment endpoints, and *f*(*t*) is the scalar field function evaluated at a distance *t* along the ray. To compute this integral, we must approximate it discretely. The conventional approach of Levoy [16] is to break up the ray into equally spaced segments, approximating the opacity integral as a Riemann sum,
$$e^{ - \int_a^s {\rho_\alpha (f(t))dt} } \approx \prod\limits_{i = 0}^n {e^{ - \Delta t\rho_\alpha (f(i\Delta t))} } = \prod\limits_{i = 0}^n {(1 - \alpha _i)}$$
where Δ *t* is the uniform sampling step, *n* = (*s – a*)*/*Δ *t*, and
$$\alpha _i \approx 1 - e^{ - \Delta t\rho_\alpha (f(i\Delta t))}$$
Discretizing the integral on [*a*, *b*] in Equation 1 as a summation, we have the following discrete approximation for *I*,
$$I \approx \sum\limits_{i = 0}^n {\rho _E (i)\,\alpha _i \prod\limits_{j = 0}^{i - 1} {(1 - \alpha _j)} }$$
where *ρ*_{E}(*i*) = *ρ*_{E}(*i*Δ *t*) is given by the transfer function. Evaluating the transfer function after reconstruction is known as postclassification. Typical sampling behavior of postclassification with uniform sampling along the ray is illustrated in Figure 1(a). When high-frequency features are present in *ρ*_{α}(*f*(*t*)), many samples are required to accurately integrate along the ray.
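As a concrete reference, the discrete approximation above can be sketched as a front-to-back compositing loop. This is a minimal Python illustration of Equations 2–4; function and parameter names are ours, not from the paper's GLSL implementation.

```python
import math

def composite_postclassified(f, rho_E, rho_alpha, a, b, dt):
    """Front-to-back compositing: postclassify each uniformly spaced
    sample, then accumulate radiance and transmittance."""
    radiance = 0.0
    transmittance = 1.0            # running product of (1 - alpha_j)
    n = round((b - a) / dt)        # number of uniform steps
    for i in range(n + 1):
        value = f(a + i * dt)                            # reconstruct field
        alpha = 1.0 - math.exp(-dt * rho_alpha(value))   # Equation 3
        radiance += transmittance * rho_E(value) * alpha
        transmittance *= 1.0 - alpha
    return radiance
```

For a homogeneous medium (constant opacity and color), the loop reproduces the analytic result 1 − *e*^{−τ} up to discretization error, which is a quick sanity check for any implementation.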

To eliminate artifacts and achieve high-quality volume rendering, we must adequately sample with respect to the Nyquist limits of all component functions contributing to the signal. The principal signal sources consist of the scalar field function *f* and the transfer function *ρ*. Engel et al. [5] note that this frequency can be either the maximum Nyquist frequency of all separate sources, or the product of the Nyquist frequencies of these sources. By discretizing the transfer function and scalar field integrals separately, preintegration can achieve greater fidelity for high-frequency transfer functions with fewer samples, as illustrated in Figure 1(b).

Separately integrating the transfer and field functions via preintegration presents its own issues, however. Problems occur when the scalar field or transfer function are undersampled by their respective discrete integrations. Like postclassification, preintegration is susceptible to undersampling, though artifacts are manifested differently. Preintegration assumes the scalar field function varies piecewise-smoothly between the entry and exit samples *f*_{i} = *f*(*t*) and *f*_{o} = *f*(*t* + Δ*t*). Depending on the frequency of the field function, this is often not the case. Specifically, computing the opacity integral on a segment uses the trapezoid rule (or similar numerical integration), which scales the opacity summation by Δ *f* = | *f*_{i} – *f*_{o}| to approximate *ρ*_{α}. When *ρ*_{α}(*f*) is smooth (specifically, Lipschitz) this approximation behaves nicely. However, sharp features in the transfer function break this assumption, leading to bias and improperly scaled opacity. Though blending the integrals of front and back samples smooths results [21], it does not accurately capture sharp peaks. In addition, preintegration relies on a fixed quantization of entry and exit opacities into a table. Permitting dynamic changes in the transfer function limits the size of this table, hence the minimum width Δ *f* between two field values used to query the transfer function integral. Nonetheless, visualizing features with higher precision can be desirable for more accurate classification.
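To make the scaling issue concrete, the following hypothetical Python sketch (ours, not the paper's table construction) computes the segment opacity the way preintegration models it: the field is assumed to vary linearly from *f*_{i} to *f*_{o}, so the transfer function is averaged over that value range. A Dirac-like peak in *ρ*_{α} then contributes only in proportion to its width relative to Δ*f*, which is exactly the improperly scaled opacity described above.

```python
import math

def preintegrated_segment_alpha(rho_alpha, f_i, f_o, dt, steps=64):
    """Segment opacity under preintegration's linear-field assumption:
    average rho_alpha over [f_i, f_o], then scale by segment length dt."""
    if f_i == f_o:
        return 1.0 - math.exp(-dt * rho_alpha(f_i))
    total = 0.0
    for s in range(steps):  # midpoint rule over the value range
        v = f_i + (f_o - f_i) * (s + 0.5) / steps
        total += rho_alpha(v)
    return 1.0 - math.exp(-dt * total / steps)
```

With a constant transfer function this reduces to the usual 1 − *e*^{−Δtρ}; with a narrow peak, two segments sampling the same impulse but with different Δ*f* yield different opacities.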

This paper describes two techniques that overcome deficiencies of existing methods. The main contribution is peak finding, which overcomes many limitations of postclassification and preintegration by sampling directly at sharp features of a transfer function. This consists of analyzing a transfer function for local maxima, and explicitly solving for roots of the filter function to render isosurfaces at these peaks. As with preintegration, peak finding employs a 2D lookup table; however rather than querying an approximation of the integral itself, we query which peaks possibly lie within that range of field values. The general concept is illustrated in Figure 1(c), and its implementation is described in detail in Section 4.

In Section 5 we present differential sampling. We note that transformations on the ray from world-space to image-space convolve the volume rendering integral, and provide a new sampling method respecting the Nyquist frequency of the image plane. Our method borrows from the ray differentials formulations of Igehy [8] in developing its sampling strategy. This is discussed in Section 5.

Our system consists of a straightforward volumetric ray caster, employing a grid acceleration structure traversed per-ray in a GLSL fragment shader, and classifying via a 1D transfer function specified as a piecewise-linear set of points. Section 6 discusses how to integrate differential sampling and peak finding into this framework. The end goal of this work is to enable interactive high-fidelity volume rendering with sharp transfer function features using fewer samples than conventional methods. We show how these algorithms help to accomplish that in Section 7.

SECTION 4

## Peak Finding
Peak finding is motivated by the shortcomings of both standard post-classified (Figure 1(a)) and preintegrated (Figure 1(b)) volume rendering with transfer functions containing sharp features approaching Dirac impulses. The general approach is similar to isosurface ray casting in that we solve directly for roots. Ray-isosurface intersection consists of solving the continuous reconstruction filter function as a 1D implicit function of *t* at an isovalue *v*: *f*(*t*) = *v*.

Numerous numerical methods exist for solving roots of this equation; interactive ray tracing algorithms commonly employ a combination of Descartes' rule of signs and an iterative solver [7], [22], [30], or more robust recursive methods such as interval arithmetic [12]. The substantial difference is that in these systems, the isovalue is given explicitly by the user; whereas in ours the isovalue must be inferred from the transfer function. By employing these root-finding methods in searching for peaks of the transfer function, we have a far smaller chance of missing them, allowing for smoother reconstruction and better shading of isosurface features within our volume rendering framework. The general concept is illustrated in Figure 1(c).

### 4.1 Determining peaks and building the lookup table

Peak finding is similar to preintegration in that we query a 2D lookup table for each segment along the ray. However, rather than storing a preintegrated radiance approximation, our table stores an isovalue *ν* or set of isovalues *ν*_{i} that possibly exist within this segment, sorted from the first to last peak value encountered on a given segment defined by the entry and exit values of the scalar field function, [*f*_{i}, *f*_{o}].

Before building the lookup table, we analyze our transfer function *ρ*_{α} and search for peaks. Specifically, we consider whether a given point is a local maximum (i.e. greater than both its immediate neighbors) with respect to the opacity component. The set of peaks consists of at most half the number of actual data points in our piecewise-linear transfer function, but typically it is far less. Smooth 1D functions such as splines would have relatively fewer peaks, existing at the critical points of these functions. As we are interested in sharp features, we consider piecewise-linear functions. It is equally possible to use this technique to search for local minima; however due to their low radiance contribution the impact of doing so is not generally noticeable.

Having computed the array of peaks, we construct the lookup table, given a range of values [*i,j*] corresponding to lookup entries from our volume. If *i < j,* we search our transfer function for the next peak point *ν* (or in the case of multiple peaks, the next 4 peaks) such that *ν* > *i* and *ν* ≤ *j.* If *i > j,* we search in descending order for peaks with *ν* ≤ *i* and *ν* > *j.* When necessary, a segment spanning multiple peaks will reverse the sorting order to register all possible peaks within that segment. This process is again similar to preintegration, except that separate discrete peak values are stored instead of a single integral approximation. In each table entry, we store the isovalue(s) *ν* corresponding to each peak. When no peak exists, we use a flag outside of the range of scalar values in the volume. Building the lookup table is relatively undemanding, and proceeds in *O*(*N*^{2}) time, similarly to the algorithm of [21] for preintegration. In practice, building a peak-finding table is roughly twice as fast as building a preintegrated table at the same resolution. Moreover, in many cases a coarser discretization (128 bins) is sufficient for peak finding, whereas preintegration would require a larger table for comparable quality when rendering near-discrete isosurfaces.
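Both steps — peak detection and table construction — can be sketched compactly. This is a simplified Python illustration (names and the single-peak-per-entry restriction are ours; the paper's table stores up to four peaks per entry):

```python
NO_PEAK = -1.0  # flag outside the volume's scalar range

def find_peaks(tf_points):
    """Local maxima (greater than both neighbors) of a piecewise-linear
    opacity transfer function given as sorted (value, opacity) points."""
    peaks = []
    for k in range(1, len(tf_points) - 1):
        v, op = tf_points[k]
        if op > tf_points[k - 1][1] and op > tf_points[k + 1][1]:
            peaks.append(v)
    return peaks

def build_peak_table(peaks, bins, vmin, vmax):
    """For each (entry bin i, exit bin j), store the first peak isovalue
    inside the spanned value range, or NO_PEAK."""
    edges = [vmin + (vmax - vmin) * k / bins for k in range(bins + 1)]
    table = [[NO_PEAK] * bins for _ in range(bins)]
    for i in range(bins):
        for j in range(bins):
            lo, hi = (i, j) if i <= j else (j, i)
            inside = [v for v in peaks if edges[lo] < v <= edges[hi + 1]]
            if inside:
                # ascending segments take the lowest peak first,
                # descending segments the highest (nearest to entry)
                table[i][j] = min(inside) if i <= j else max(inside)
    return table
```

The quadratic build cost is visible in the nested loops over (i, j), matching the *O*(*N*^{2}) behavior noted above.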

### 4.2 Root solving and classification

Peak finding occurs between samples in the main ray casting loop. Before sampling at the next step *t* + Δ*t*, we fetch the nearest peak value from a 2D texture using the same entry and exit values [*f*_{i}, *f*_{o}] as preintegration. If the peak exists, we subtract that isovalue from the entry and exit values, and employ Descartes' rule of signs. If this test succeeds, we assume the segment contains a root. Bracketed by *t*_{0} = *t* and *t*_{1} = *t* + Δ*t*, we use three iterations of a secant method (also employed by [7], [22]) to solve the root:
$$t_2 = t_0 - f(t_0){{t_1 - t_0 } \over {f(t_1) - f(t_0)}}$$
When the secant method completes, we have an estimate for the root *t* along the ray segment. We now sample at this position and perform postclassification. However, sampling at the peak requires two subtle choices. First, we do not evaluate our field *f* at the root, but rather assume that the value at this point is our desired isovalue. This works because we are solving for the root position, not its value; moreover for sharp transfer functions it is crucial in avoiding Moiré patterns. Second, we do not scale *ρ*_{α} by the segment distance Δ *t* (in Equation 3) but instead use a constant Δ *t* = 1. Although this may seem counterintuitive, the scaled extinction coefficient is itself a correction mechanism for the inherently discrete approximation of the volume rendering integral. Moreover, an unscaled opacity assumes that we always sample at this isovalue regardless of the sampling rate or local behavior of *ρ*_{α}(*f*) along the ray segment. This is precisely our goal with peak finding. While the resulting approach arguably biases the volume rendering integration towards these peaks, it is critical in detecting them without excessively increasing the sampling rate. In practice this strategy does not greatly bias our integral, as the relative contribution of values outside the peak is small.
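The root solve and the two classification choices above can be sketched as follows; this is an illustrative Python version (the paper's implementation is GLSL), with names of our choosing:

```python
import math

def secant_root(f, t0, t1, iso, iterations=3):
    """Estimate the root of f(t) - iso = 0 bracketed by [t0, t1]
    with a fixed number of secant iterations (cf. Equation 5)."""
    g0, g1 = f(t0) - iso, f(t1) - iso
    for _ in range(iterations):
        if g1 == g0:
            break  # flat segment; avoid division by zero
        t2 = t0 - g0 * (t1 - t0) / (g1 - g0)
        t0, g0, t1, g1 = t1, g1, t2, f(t2) - iso
    return t1

def classify_peak(rho_E, rho_alpha, iso):
    """Postclassify at the peak: assume the field value is exactly iso,
    and use an unscaled opacity (dt = 1 in Equation 3)."""
    return rho_E(iso), 1.0 - math.exp(-rho_alpha(iso))
```

Note that `classify_peak` never evaluates the reconstructed field at the root, mirroring the first choice described above.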

Finding multiple peaks can be useful when the step size Δ *t* is large, or when peaks are spaced closely together. Our implementation handles up to four peaks within a single segment with a straightforward extension, which can be enabled at runtime as necessary. As described in Section 4.1 we construct the peak finding table with four sequential peaks contained within the given segment [*f*_{i}, *f*_{o}]. Since isosurfaces are encountered in precomputed order between the minimum and maximum field values, we can simply perform peak finding sequentially on all four values in that order.

### 4.3 Algorithm integration and usage

Peak finding is equivalent to volume rendering with a discrete isosurface-finding step in between. One can trivially modify the algorithm to support different rendering modalities. We allow for:

Sampling from *both* uniformly/differentially sampled DVR and peak finding (default).

Sampling from *either* uniformly/differentially sampled DVR or peak finding (PEAK_XOR_DVR).

Transparent isosurfacing of peaks only (PEAK_ONLY).

DVR only, disabling peak finding (DVR_ONLY).

These options can be invoked with small switches to the shader code and incur no performance penalty or code overhead. The uppercase flags above correspond to macros in the GLSL pseudocode provided in the Appendix.

Peak finding is attractive in that its algorithm is not significantly different from either volume rendering or isosurface ray casting. Both algorithms employ regular sampling, in the case of DVR to compute the volume rendering integral and in the case of isosurfacing to isolate roots. Peak finding takes advantage of this and does both. As a result, this technique can be implemented quickly by extending existing renderers. Although we propose peak finding in conjunction with differential sampling, the two techniques are orthogonal. It is equally possible to employ peak finding in a uniform sampling ray caster, a slice-based volume renderer, or a shear-warp system.

Overall, peak finding and preintegration are similar, but make different assumptions about the integral over a given segment. Preintegration assumes this integral can be accurately approximated by piecewise summation. This works well when the transfer function and convolved field are smooth, but encounters difficulties when they are not. Peak finding assumes this integral can be approximated by one or several discrete impulses. This introduces bias, but is better suited for noisy data and sharp *C*_{0} transfer functions for which standard techniques fail.

SECTION 5

## Differential Sampling for Volume Rendering

Uniform sampling ignores an important component of the convolved volume rendering integral and its resulting Nyquist limit. With a pinhole camera, the projective transformation on the image plane is itself a signal convolution. Thus, regular sampling in world-space undersamples features close to the viewpoint relative to those further away. To remedy this, we can employ a sampling strategy that uses the ray distance itself as a sampling metric. This can be accomplished with a new function *T* whose derivative varies linearly with distance, i.e.
$$\Delta T = {{\partial T} \over {\partial t}} = at + b\quad T(t) = {a \over 2}t^2 + bt + c$$
Then we sample along the ray at *t* = *T*(*x*) for *x* = 0, 1, 2, …. The question remains how to choose *a, b* and *c* so that the sampling step is proportional to pixel width. We turn to the concept of ray differentials [8], which quantifies world-space transformations in image-space derivatives. Specifically, we use the ray differential transfer equation to formulate *T* as a function of image-space.
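The quadratic form of *T* in Equation 6 implies that successive sample spacings grow linearly by *a*. A small Python sketch (values chosen arbitrarily for illustration) makes this visible:

```python
def make_T(a, b, c=0.0):
    """Quadratic sampling function of Equation 6:
    T(x) = (a/2) x^2 + b x + c, so dT/dx = a x + b."""
    return lambda x: 0.5 * a * x * x + b * x + c

T = make_T(a=0.01, b=0.1)
positions = [T(x) for x in range(5)]                      # sample distances
steps = [positions[i + 1] - positions[i] for i in range(4)]
# each step exceeds the previous one by exactly a
```

Samples therefore bunch together near the viewpoint and spread out with distance, which is the behavior motivated above.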

### 5.1 Ray differentials

With ray differentials [8], the general goal is to compute the image-space derivatives of a series of functions convolving the image plane, beginning with generation of rays in a pinhole camera,
$$\vec d(x,y) = \vec w + x\vec u + y\vec v$$
where *w⃗* is the central view direction and *u⃗*, *v⃗* are the right and up vectors. Unitizing a ray comprises another transformation:
$$\vec O(x,y) = \vec o\quad \vec D(x,y) = (\vec d)(\vec d \cdot \vec d)^{ - 1/2}$$
Then the unit-parameterized ray has the image-space partial with respect to *x* (and similarly for *y*):
$${{\partial \vec R} \over {\partial x}}(t) = {{\partial \vec O} \over {\partial x}} + t{{\partial \vec D} \over {\partial x}} + {{\partial t} \over {\partial x}}\vec D$$
As this relation is linear in *t*, it holds for any discrete difference Δ *t* as well. For our purposes of choosing a constant image-space measure, it suffices to consider only *x* differentials. Lastly, the differential of the unitized direction *D⃗* with respect to the *x* image-space coordinate is:
$${{\partial \vec D} \over {\partial x}} = {{(\vec d \cdot \vec d)\vec u - (\vec d \cdot \vec u)\vec d} \over {(\vec d \cdot \vec d)^{3/2} }}$$
Derivations are given in more detail in the original paper [8].
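Equation 8 is straightforward to evaluate. As a sanity check, a small Python sketch (vector helpers are ours) confirms that for a unit-length central direction orthogonal to *u⃗*, the derivative is simply *u⃗*:

```python
import math

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def unit_dir_dx(d, u):
    """Image-space derivative of the unitized direction (Equation 8):
    ((d.d) u - (d.u) d) / (d.d)^{3/2}, with d unnormalized."""
    dd, du = dot(d, d), dot(d, u)
    s = 1.0 / (dd * math.sqrt(dd))   # (d.d)^{-3/2}
    return tuple(s * (dd * ui - du * di) for ui, di in zip(u, d))
```

The magnitude |∂*D⃗*/∂*x*| of this quantity is the only ingredient the sampling construction in Section 5.2 requires.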

### 5.2 Differential sampling construction

Our general strategy is to define a base sampling rate proportional to an image-space quantity, and use the ray differential transfer equation (Equation 9) to derive our sampling function *T.* To accomplish this, we use the image-space *x* as our discretization, and construct a sampling scheme where the step Δ *t* is proportional to the differential quantity |∂*D*/∂*x*|. Since *D⃗* is normalized and our discrete step Δ *t* is arbitrary, the user can choose any *k* and preserve a correlation between the distance-based sampling step Δ *t* and *x.* To use *x* as the unit of measure along the ray, we project ∂*D*/∂*x* so that it is collinear with *D⃗*, i.e. ∂*D*/∂*x* ≈ |∂*D*/∂*x*|*D⃗*. Then from Equation 9, we have:
$$\displaylines{ {{\partial \vec R} \over {\partial x}}(\Delta t) = \Delta t{{\partial \vec D} \over {\partial x}} + {{\partial \Delta t} \over {\partial x}}\vec D = kx\left| {{{\partial \vec D} \over {\partial x}}} \right|\vec D + k\vec D \cr = \vec D\left({k\left| {{{\partial \vec D} \over {\partial x}}} \right|x + k} \right) \cr}$$

Since |*D⃗*| = 1, this gives us
$$|{{\partial \vec R} \over {\partial x}}(\Delta t)| = k|{{\partial \vec D} \over {\partial x}}|x + k$$
From Figure 2, notice that the image-plane offset *x* and the corresponding world-space offset are related by tan(*θ*). Since the angle *θ* between any two rays is constant, tan(*θ*) is also constant and can be computed once. This can be incorporated into a new constant *k′ = k* tan(*θ*); or if *k* is arbitrarily chosen we can omit this step and use *k′ = k.* We then employ the differential construction of *T* in Equation 6, but in terms of image-space *x*,
$${{\partial T} \over {\partial x}} = \Delta t = k'|{{\partial \vec D} \over {\partial x}}|x + k'$$
For convenience let *a* = *k′*|∂*D*/∂*x*| and *b = k′.* The antiderivative yields our differential sampling function *T:*
$${{\partial ^2 T} \over {\partial x^2 }} = a\quad {{\partial T} \over {\partial x}} = ax + b\quad T(x) = {a \over 2}x^2 + bx + c$$
When we begin sampling at *t* = *T*(0) = 0, we can assume *c* = 0.

### 5.3 Computing and incrementing samples

Differential sampling is simple to implement in a volume ray casting framework. We first compute |∂*D*/∂*x*| from Equation 10. While the user can choose any *k,* we ensure it is some multiple *s*_{k} of the world-space pixel footprint at the image plane. From this we compute *k′* and, if necessary, *a* and *b.* Theoretically *s*_{k} < 1/2 is required to satisfy the Nyquist limit of the image plane. In practice this rate is excessive, and *s*_{k} = 4 is a good conservative default.

From the ray origin, the sampling process begins at *x* = 0, where
$${{\partial T} \over {\partial x}} = b\quad T(x) = 0$$
Then at each ray casting iteration, we sample at the current position *P⃗*_{0}, and perform the following increments, where Δ *t* is our discretization of ∂*T*/∂*x*,
$$\vec P_1 = \vec P_0 + \Delta t_0 \vec D\quad \Delta t_1 = \Delta t_0 + a$$
Thus, incrementing the position from one sample to the next consists only of an extra vector multiplication and addition, on top of the vector addition for uniform sampling. This is also outlined in the pseudocode in the Appendix.
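The full marching loop of Equations 16–17 can be sketched in a few lines of Python (an illustration of the increments, not the paper's GLSL shader; names are ours):

```python
def march_differential(origin, D, a, b, t_max):
    """Incremental differential sampling: start with dt = b at x = 0,
    then advance P by dt*D and grow dt by a each iteration (Eq. 17)."""
    P = list(origin)
    dt, t = b, 0.0
    visited = []                 # ray distances at which we sampled
    while t <= t_max:
        visited.append(t)        # sample the volume at P here
        P = [p + dt * Di for p, Di in zip(P, D)]
        t, dt = t + dt, dt + a
    return visited
```

Per iteration this adds only a scalar addition (`dt += a`) and a vector multiply-add over plain uniform stepping, matching the cost claim above.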

SECTION 6

## Implementation
We implemented our ray casting framework in OpenGL and GLSL. The pinhole camera vectors *u⃗*, *v⃗* and *w⃗* are computed on the CPU and then sent to the fragment shader, where a ray is generated from the pixel *x* and *y* values according to Equation 7. The 1D transfer function is given as a set of points {*v*, {*r*, *g*, *b*, *a*}}, then processed into a fairly wide (8K elements) 1D texture, allowing for rapid access on the GPU and generally sufficient transfer function precision (Δ*f* on the order of 1*e* − 4). We implemented a tricubic B-spline filter using the method of [29], with the BC smoothing (*B* = 2, *C* = 1) kernel of [23]. We optionally employ this for both DVR sampling and root solving.

### 6.1 Space skipping

Even with fairly dense transfer functions, most data sets are sparse enough to warrant an empty space skipping mechanism. We choose a simple uniform grid with a 3DDDA algorithm [1] where each grid cell stores min-max values of enclosed voxels. Fairly coarse grids (64^{3} cells) work best on the GPU, and this structure can be updated interactively when the transfer function changes. The fragment shader then traverses the macrocell grid using the 3DDDA algorithm in an outer loop. When a macrocell is nonempty, we enter the volume rendering loop, with peak finding tests taken between samples. To begin sampling, we find the first *t* at which to sample when entering a macrocell. With differential sampling, we solve for the maximum *x* after *T*_{enter},
$$ax^2 {\rm{/}}2 + bx = T_{enter} \quad x_{tenter} = (- b + \sqrt {b^2 + 2aT_{enter} }){\rm{/}}a$$
We then compute the floor value ⌊*x*_{tenter}⌋, along with *T*(⌊*x*_{tenter}⌋) and the corresponding step Δ*t*, which can be simplified significantly from Equation 15; and subsequently sample and increment as in Equation 17. To avoid duplicate samples, we store the greatest *t* at which we already sampled, and use the maximum of that and *T*_{enter}.
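The entry computation amounts to solving one quadratic and snapping back to the sampling lattice. A hedged Python sketch (names ours) of Equation 18 and the floor step:

```python
import math

def entry_sample(T_enter, a, b):
    """Solve (a/2) x^2 + b x = T_enter for the entry index (Eq. 18),
    floor it to the lattice, and return T and dT/dx at that index."""
    x = (-b + math.sqrt(b * b + 2.0 * a * T_enter)) / a
    x0 = math.floor(x)
    t0 = 0.5 * a * x0 * x0 + b * x0   # T(x0)
    dt0 = a * x0 + b                  # dT/dx at x0
    return x0, t0, dt0
```

Starting one lattice point before *T*_{enter} (and then clamping against the last sampled *t*) guarantees the macrocell boundary itself is neither skipped nor sampled twice.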

### 6.2 Adaptive sampling

As discussed in [7], purely adaptive methods (for example based on local gradient) perform poorly on GPUs due to poor thread coherence. However, we do achieve better performance by varying the sampling rate on a per-macrocell basis. In this scheme, each macrocell computes a metric based on the ratio of the maximum standard deviation of its voxels to that of the entire volume. As this represents a multiplier for the frequency, its inverse can be used to vary the sampling step size Δ *t.* In practice we wish this to be a positive integer, and a multiplier *M* = 2^{*m*−1} + 1 delivers good results. With uniform sampling one simply employs *M*Δ *t* as the new sampling rate. With differential sampling *M* modifies our increments as follows:
$${{\partial ^2 T_M } \over {\partial x^2 }} = \sum\limits_{i = 1}^M {ia} = {{M(M + 1)} \over 2}a\quad {{\partial T_M } \over {\partial x}} = Max + {{\partial ^2 T_M } \over {\partial x^2 }} + b$$
No modifications to *T*(*x*) are required, since the initial *x* for that macrocell can be any integer.

SECTION 7

## Results
Unless otherwise stated, all results were collected on a 2.5 GHz Intel Xeon and an NVIDIA GTX 285 GPU, with trilinear filtering, differential sampling *(s*_{k} = 4) and the exclusive-or peak finding modality. For each scene we plot the transfer function (*f*, *ρ*_{α}(*f*)), scaled to the maximum opacity of the transfer function. To evaluate complexity, we count the total number of filter evaluations (including peak finding) or DVR-only samples (without peak finding), and divide these by the number of pixels. As with any DVR system, performance varies widely with the number of samples taken. Opaque isosurfaces and low-frequency scenes are simplest and render at real-time rates. The focus of our work is in handling sharp features, which requires higher sampling rates. Overall, image quality is excellent and our system is generally interactive (Table 1). While analysis of macrocells falls outside the scope of this paper, they usually deliver 1.2x to 5x performance improvement depending on the scene. Although other approaches have greater total sample throughput, our system is competitive in how it spends samples and resulting quality.

### 7.1 Peak finding

Peak finding is useful when the combined frequency of the volume and transfer function is too high for effective regular sampling. In such cases, postclassification would require near-infinite sampling to accurately reproduce features. Preintegration succeeds in detecting high-frequencies of the transfer function, but integrates and shades them incorrectly when undersampling the scalar field.

An obvious scenario in which conventional sampling methods fail is a transfer function containing one or more Dirac-like features, as shown in Figure 3. Peak finding succeeds in reproducing these features as semi-transparent isosurfaces, and rendering smoother volumetric features in the correct order. While postclassification misses peak features outright, preintegration detects and reproduces a surface. However, with preintegration the range Δ *f* along a given segment can significantly skew the opacity integral; two segments with different Δ *f* may sample the same impulse but have different irradiances. With peak finding, this is not the case. In addition, preintegration shades at the segment endpoint, as opposed to locally at the hit position of the isosurface, resulting in Moiré patterns. Finally, when an impulse is defined with a discretization smaller than that of the preintegrated table, peak finding with a smaller table can reproduce features that preintegration misses. In practice, this is less a concern than the aforementioned integration and shading issues with preintegration.

Peak finding is an intriguing method for rendering noisy or entropic data, for example from scanned sources in medicine or biology. Here, even when the transfer function is sampled adequately, the filtered field function of the volume (hence the convolved signal) is not. While artifacts are not as noticeable due to the noisy nature of renderings, high-frequency features are again omitted. Due to convolution of the high data frequency, features can be lost even with moderate-frequency transfer functions. Simply increasing opacity at peaks does not correct the problem, and widening the transfer function broadens the classification. Choosing a higher sampling rate can remedy this, but at high performance cost. Meanwhile, at sampling rates well below the Nyquist limit, peak finding successfully reproduces sharp features with the desired opacity and color, as shown in Figure 4. The fireset in Figure 6 also illustrates this phenomenon.

Finding multiple peaks is typically not necessary unless several sharp features are close together in the transfer function. This option better ensures peaks are rendered in the correct order, at a cost of roughly 20% in performance (Figure 5, left). More significantly, we find that the bias from always sampling at peaks is manageable. Figure 5 (right) considers a smooth transfer function that looks nearly identical with peak finding and postclassification (Figure 5c,e). Peaks with opacity magnified by 16 (Figure 5d) and peak isosurfaces only (Figure 5f) are shown for contrast. The only disadvantage of peak finding in such cases is that it is unnecessary and more costly. While it is possible to construct transfer functions for which peak regions contribute relatively more to the radiance and thus show greater bias, for the most part peak finding accentuates isosurface-like features as desired.

### 7.2 Differential sampling

Differential sampling delivers better results close to the viewpoint, with no noticeably worse quality in the distance. A major appeal of this method is that the sampling rate is view-dependent; it automatically and locally matches sampling to the frequency of the image plane, requiring less work on the part of the user. In evaluating differential sampling, it is difficult to enforce a constant average sampling rate, so we use frame rate as the control variable and compare the results in Figure 6. Exact performance figures are given in Table 2. At similar frame rates, uniform sampling undersamples nearby features; differential sampling remedies this, yielding consistently better quality up close with surprisingly little quality loss further away. Peak finding amplifies undersampling artifacts at silhouettes; as a result, differential sampling in conjunction with peak finding is particularly desirable up close.
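The basic mechanism can be illustrated as follows (an assumed form for illustration, not our exact update rule; in practice the growth term would be derived from the propagated ray differential): if each step grows linearly with the sample index, the accumulated ray parameter is quadratic in that index, so sample spacing tracks a pixel footprint that widens with distance from the viewpoint.

```python
def differential_sample_positions(t_near, t_far, s0, ds):
    """Sample positions along a ray where the step grows linearly with
    sample index k: step_k = s0 + k * ds.  The accumulated position is
    therefore quadratic in k, t_k = t_near + k*s0 + k*(k-1)/2 * ds,
    sampling nearby features densely and distant ones more sparsely."""
    positions = []
    t, k = t_near, 0
    while t < t_far:
        positions.append(t)
        t += s0 + k * ds  # linearly growing step
        k += 1
    return positions
```

With ds = 0, this degenerates to uniform sampling; a larger ds (from a wider ray differential) shifts the sample budget toward the near field.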

More subjectively, we can choose a single converged image as the control and compare the frame rates required for each scheme to achieve comparable quality. We use Figure 7 and the differential-sampling halves of Figure 6 as references; results are given in Table 2 (bottom). Adequately sampled, these scenes look broadly similar under uniform and differential schemes. However, differential sampling can deliver up to 3× better frame rates, particularly when overall frequency is low. In Figure 7(a,b), converged images of the aneurism with postclassification and B-spline filtering look nearly identical, but run at 1.0 and 2.7 fps with uniform and differential sampling, respectively (0.86 and 1.8 fps with peak finding). Conversely, in cases where data is entropic and classified with multiple peaks, differential sampling is less effective, requiring a smaller *s*_{k} to adequately sample faraway regions while oversampling nearby features. This is more noticeable with peak finding, where adequate sampling is necessary for robust root isolation of isosurfaces. Overall, differential sampling seldom delivers worse quality than uniform sampling at the same frame rate. The backpack in Figure 7(c,d), a noisy scanned volume classified with peak finding and multiple peaks, still renders at 1.6 fps with both sampling methods and similar quality.

As evident in Table 2, differential sampling often requires half as many samples as uniform sampling, or fewer, for equivalent visual quality. Ideally, half as many samples would correspond to exactly double the frame rate. In practice this is not the case, due to the parallel nature of GPUs and worse memory coherence at faraway samples under differential sampling. With tricubic B-spline filtering, the higher cost of computing samples outweighs this penalty, yielding relatively better performance with differential sampling than with uniform (1.5-3× as opposed to 1-2×). Nonetheless, differential sampling remains clearly worthwhile with trilinear filtering.

### 8 Conclusion
Our proposed techniques advance the state of the art in high-quality volume ray casting. Peak finding allows near-discrete isosurfaces to be specified within a volume rendering transfer function, and provides a new tool in the classification arsenal. It yields viable classification of entropic and noisy data, handles pathological cases that are unaddressed by postclassification and preintegration, and is not significantly slower than those techniques. Differential sampling allows better-quality rendering of features closer to the camera, with less overall sampling and correspondingly higher frame rates.

The main drawback of peak finding is that it is more costly than preintegration, and unnecessary when the transfer function and data are smooth. Again, an argument can be made that introducing discrete isosurfaces into the volume rendering integral is inherently biased. In addition, the rule of signs is not a robust root isolation method, and surfaces can be missed near sharp silhouettes. The main limitation of differential sampling is that it would be difficult to implement outside of a ray casting framework. When *s*_{k} is very small, differential sampling encounters numerical problems resulting in worse artifacts at greater sampling rates, shown in the close-up in Figure 7(c,d). This is rarely an issue in practice, and could be remedied with double-precision GPU arithmetic. The chief drawback of our implementation is that it traverses an acceleration structure in the fragment shader, which is likely slower than rasterized bricking or slicing. Most of our chosen scenes are costly to sample regardless of space skipping, but we could employ a proxy rasterization technique such as [17] for better performance.
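The missed-surface failure mode is easy to see with a toy sign-change test (illustrative Python with a hypothetical helper name): a segment whose endpoints lie on the same side of an isovalue is rejected even if the field crosses the isosurface an even number of times in between, which is exactly what happens when a ray grazes a surface near a silhouette.

```python
def brackets_isovalue(f0, f1, v):
    """Sign-change test: reports a crossing only when the segment's
    endpoint values straddle the isovalue v."""
    return (f0 - v) * (f1 - v) <= 0.0

# A segment that enters and exits the isosurface between its endpoints
# (two crossings, as along a grazing ray) is missed, because both
# endpoints lie below v even though the field peaks above v inside:
assert not brackets_isovalue(0.4, 0.4, 0.5)
```

Robust isolation of such roots would require bounding the field's extrema within the segment, not just testing its endpoints.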

Several extensions to this work are worth pursuing. Differential sampling could be used in more traditional applications of ray differentials such as multiscale filtering and level of detail, which could improve quality and allow efficient rendering of large data. Peak finding could be extended to handle multidimensional and multifield transfer functions, which could use topological methods to find peaks in higher dimensions. We are also interested in combining preintegration and peak finding for better classification.

### Acknowledgments

This work was supported by the German Research Foundation (DFG) through the University of Kaiserslautern International Research Training Group (IRTG 1131), as well as by the National Science Foundation under grants CNS-0615194, CNS-0551724, CCF-0541113, and IIS-0513212, the DOE VACET SciDAC, and KAUST GRP KUS-C1-016-04. Additional thanks to Liz Jurrus and Tolga Tasdizen for the zebrafish data, and to the anonymous reviewers for their comments.