### 3.2 Apparatus

The apparatus consisted of two stereo displays (Mitsubishi Diamond Pro 2070SB, 22" display, 1280×1024 resolution per display, approx. 83 dpi) arranged as left and right monitors (see Figure 1). The frame rate was 15 fps. Each display had a display area of 16 × 12 inches, giving an individual pixel size of 0.012 inches (0.03 cm). Head-tracking was not used in this study. Participants sat in a standard desk chair at a nominal viewing distance of 1.5 feet, although they were free to move their head closer or further away; we estimate that the viewing distance from the display could vary between 1 and 2 feet. Thus, the visual angle per pixel could vary between 3.44 minutes of arc (1-foot viewing distance) and 1.72 minutes of arc (2-foot viewing distance).
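The per-pixel visual angle figures above follow from simple trigonometry. As an illustration (our own sketch, not part of the study software):

```python
import math

def visual_angle_arcmin(pixel_size_in, viewing_distance_in):
    """Visual angle subtended by one pixel, in minutes of arc."""
    return math.degrees(math.atan2(pixel_size_in, viewing_distance_in)) * 60.0

# 0.012-inch pixels viewed from 1 foot (12 in) and 2 feet (24 in)
near = visual_angle_arcmin(0.012, 12.0)   # ~3.44 arcmin
far = visual_angle_arcmin(0.012, 24.0)    # ~1.72 arcmin
```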

Participants wore StereoGraphics active shutter glasses in both the monoscopic and stereoscopic viewing conditions to keep display parameters such as brightness constant. The left monitor displayed the 3D visualizations; the right monitor was used during the "Type of CP?" trials to display the CPs shown in Figure 4 for reference.

Participants rotated the dataset about the X and Y axes relative to the screen using the cursor keys. Answers were given by typing on the keyboard. The room was darkened to provide a clearer view of the display, but indirect lighting illuminated the keyboard buttons when viewed through the stereo glasses.

### 3.3 Conditions

*Stereoscopic viewing:* In the stereoscopic condition, views were generated for each eye assuming an eye separation of 2.5 inches. In the monoscopic condition, the views presented to the two eyes were identical.

*Integral curve renderings:* All trials visualized a set of integral curves computed from a 4×4×4 array of seed points spaced regularly through the volume. Integral curves were computed using VTK's RungeKutta4 integrator [12] with the propagation unit set to cell length, a maximum propagation of 100 units, and a terminal speed of 0.1 units. Two rendering methods were used for the integral curves: a) 1-pixel-wide lines and b) textured, lit tubes 8 pixels in diameter (see Figures 2 and 3, respectively). Both used anti-aliased rendering, and in both the integral curves were colored so that curves with similar spatial position and shape had similar color [7]. In addition, each method draws geometric glyphs along the curve and spaces them proportionally to the magnitude (speed) of the vector field at that location: the larger the gap between shapes, the faster the vector field along that segment of the curve. The shapes themselves are arrow-like and indicate the direction of movement along the curve.
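The study used VTK's RungeKutta4 integrator; as a minimal stand-alone sketch of the underlying scheme (not the study's actual code, and with a step budget standing in loosely for the 100-unit propagation cap), a fourth-order Runge-Kutta tracer with a terminal-speed cutoff might look like:

```python
import numpy as np

def rk4_step(v, p, h):
    """One fourth-order Runge-Kutta step of the ODE dp/dt = v(p)."""
    k1 = v(p)
    k2 = v(p + 0.5 * h * k1)
    k3 = v(p + 0.5 * h * k2)
    k4 = v(p + h * k3)
    return p + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def trace(v, seed, h=0.05, max_steps=2000, terminal_speed=0.1):
    """Integrate from a seed point until the local speed drops below
    terminal_speed or the step budget is exhausted."""
    pts = [np.asarray(seed, dtype=float)]
    for _ in range(max_steps):
        if np.linalg.norm(v(pts[-1])) < terminal_speed:
            break
        pts.append(rk4_step(v, pts[-1], h))
    return np.array(pts)
```

For example, tracing the attracting linear field v(p) = −p from (1, 0, 0) produces a curve that spirals into the origin and stops once the speed falls below 0.1.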

Tubes were shaded by two directional lights: light #1, a "headlight" always located at the viewer's eye position and shining along the virtual camera's viewing direction, and light #2, fixed in world space at the location (-5, -5, -2), which at the start of each trial placed it "over the viewer's right shoulder". (Note that the volume is centered at the world origin and has bounding volume (-1, -1, -1) to (1, 1, 1).) The tube surface was textured with an arrow texture that encoded the local speed and direction of the vector field. At a larger scale, "feathers" like those on an archer's arrow were drawn on the tubes to indicate speed and direction, just as the arrows do in the lines method. Finally, the tubes also had a black halo around them to emphasize their front-to-back ordering [22]. Pilot studies indicated that the halos on tubes were beneficial, and no participants commented negatively about them. Tubes had a radius of 0.0115 (as explained in Section 3.3.1). Halos were drawn by enabling OpenGL's front-face culling and drawing each tube a second time with its radius doubled (i.e., a radius of 0.0230); several halo radii were tested, and pilot subjects found that doubling the tube radius worked well.

#### 3.3.1 Pilot studies

We ran a series of pilot studies to determine parameters for the seeding strategies and the visualization methods. In the pilot studies we used the regularly spaced seeding method found to work best in [13]; we compared *n* × *n* × *n* arrays for n = 3, 4, 5, 6 and found n = 4 to be the best balance of clarity and information from streamlines. We did find that slightly different densities were best for different tasks, so the chosen density was a compromise (practical constraints on study length required selecting a single density value). For the visualization methods, one subject first explored the parameter space and tuned variables such as tube radius, halo radius, glyph spacing, glyph scale, texture shape, texture scale, and texture spacing to his or her preferred settings. Then for each parameter we performed a "step wedge," examining as a group of four a range of parameter values neighboring each setting selected by the subject. A step wedge is a controlled process for exploring the effect of some process on a range of inputs; for example, in image processing a step wedge could show the effect of a filter by applying it to a set of equally spaced gray-value inputs and displaying each output value adjacent to its input value. Our step wedge explored not image-processing filters but parameterizations of the visualization methods: our pilot subjects assigned a score to each parameter value (e.g., seed spacing, tube radius, etc.), and we ultimately selected the average value for each parameter.

A difficult decision was what degree of interaction to allow: interactivity is a vital element of data visualization, but it can confound formal user study results. A highly interactive system allows many usage patterns, requires more training, and encourages the development of different strategies by users. Because our main interest here was visual performance, we minimized user interaction. At first, participants could only rotate about the Y-axis, but since many requested rotation about the X-axis as well, it was added. Rotation was controlled through the cursor keys on the keyboard.

*Tasks:* Our five tasks all aimed at testing how well subjects understand "chunks of 3D flow". Informally, understanding chunks of 3D flow is the commonality we have noted from working on 3D flow tasks and developing visualization methods in collaboration with fluids researchers. While flow experts often search for different scientific features and often only indicate what they look for given a statement of the problem, our best general description of a flow scientist's task when visualizing 3D flow fields is that, to varying degrees, they all explore or study a localized point, often considering the neighborhood around a region of interest. Finding and describing many common flow features fits this categorization, including swirling flow, stagnation points, vortices, flow separation, flow reversal, and high residence time. 3D flow scientists sometimes reduce their problems to 2D visualization or quantitative analysis, but in the context of this paper we consider that a different problem from 3D flow visualization.

Specifically, the five tasks were:

Task 1: Is a given point a critical point (CP)?

Task 2: Identify the type of a given CP

Task 3: Does the field advect from point A to B?

Task 4: Is there swirl at a given point?

Task 5: Is the speed faster at point A or B?

Note that Task 5 tests whether the advection speed (i.e., magnitude of the vector field) is faster at point A or B.

All tasks were binary choices except for Task 2, which involved picking one of eight CPs (see Figure 4). The total number of conditions was the product of the factors above: 2 viewing × 2 rendering × 5 tasks = 20. Participants performed four instances of each condition (two for each possible binary answer), except for conditions involving Task 2, for which participants performed two instances of each of the eight possible answers; pilot tests showed that more instances would have been too fatiguing. Thus the total number of trials was 2×2×4×4 + 2×2×8×2 = 128.
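The trial count can be verified directly (a trivial check; the factor names are our own):

```python
viewing = 2            # stereoscopic vs. monoscopic
rendering = 2          # lines vs. tubes
binary_tasks = 4       # tasks 1, 3, 4, 5
binary_instances = 4   # two per possible binary answer
cp_answers = 8         # task 2: eight CP types
cp_instances = 2       # two per CP type

trials = viewing * rendering * (binary_tasks * binary_instances
                                + cp_answers * cp_instances)
# 2 * 2 * (16 + 16) = 128
```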

### 3.4 Datasets

Each trial consisted of a dataset and visualization method pairing. In pilot studies we observed that task difficulty sometimes appeared to be a function of the dataset, and we thus wanted to ensure that all participants saw the same visual stimuli.

We required a controlled set of stimuli to perform this study. We generated 1000 3D vector fields and then selected a subset of 128 for use in the study (see details below). The vector fields were generated from a Gaussian-based radial basis function [8]. Each field was generated by first selecting six random locations uniformly distributed over the volume [-1, 1] × [-1, 1] × [-1, 1]. (We tested using more and fewer random locations, but found that six generally gave a good level of vector field complexity.) At each random location, a vector was generated with all three components drawn from a uniform random distribution between -1 and 1. Using the Gaussian-based radial basis function with a shape parameter of 1.2, we sampled the field onto a 32 × 32 × 32 regular grid spanning [-1, 1] × [-1, 1] × [-1, 1]. An example of a full dataset presented to a participant during a training trial is shown in Figure 5.
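A minimal sketch of this generation process, under two assumptions the text leaves open: the shape parameter enters through the common Gaussian kernel exp(−(εr)²) with ε = 1.2, and the random vectors are superposed directly rather than solved for in an RBF interpolation system:

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.2                                    # assumed RBF shape parameter
centers = rng.uniform(-1, 1, size=(6, 3))    # six random locations
weights = rng.uniform(-1, 1, size=(6, 3))    # random vector at each one

def field(p):
    """Gaussian-RBF vector field: weighted sum of kernels exp(-(eps*r)^2)."""
    r = np.linalg.norm(centers - p, axis=1)
    phi = np.exp(-(eps * r) ** 2)
    return phi @ weights

# Sample onto a 32 x 32 x 32 regular grid over [-1, 1]^3
axis = np.linspace(-1, 1, 32)
grid = np.array([[[field(np.array([x, y, z])) for z in axis]
                  for y in axis] for x in axis])
```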

We used the Newton-Raphson method and eigenanalysis to detect and classify critical points [27]. The eight types of first-order critical points in stable vector fields are represented in our datasets: two node types, two saddle types, and four spiral types (see Figure 4). We removed vector fields that did not have 1, 2, 3, or 4 critical points. Pilot studies [9] suggested that the fields were complex enough to measure the effectiveness of visualization methods.
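The first-order type of a critical point follows from the eigenvalues of the Jacobian there: a complex-conjugate pair indicates a spiral, and the number of eigenvalues with positive real part distinguishes sinks, sources, and the two saddle varieties, yielding eight types in all. A sketch of the classification (the type labels here are our own; the paper's Figure 4 defines its own names for the eight categories):

```python
import numpy as np

def classify_cp(jacobian):
    """Classify a 3D first-order critical point from its Jacobian's eigenvalues."""
    ev = np.linalg.eigvals(jacobian)
    spiral = bool(np.any(np.abs(ev.imag) > 1e-9))  # complex pair -> rotation
    pos = int(np.sum(ev.real > 0))                 # count of repelling directions
    kind = {0: "sink", 1: "1:2 saddle", 2: "2:1 saddle", 3: "source"}[pos]
    return ("spiral " if spiral else "") + kind
```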

From the pool of 1000 datasets we selected datasets for each task satisfying the conditions below. We required CPs to be located in the middle part of the dataset (i.e., the central third of the X, Y, and Z dimensions) so that the surrounding context was more likely to be useful. The following details the specific parameters used to select the datasets for each task. The trial generation phase produced 128 trials. All subjects experienced the same trial conditions, but in an order determined by Latin squares [19].

The study preparation involved a task parameter tuning phase whose objective was to cause participant accuracy to average about the midpoint between guessing and a perfect score. Through iterative testing we ultimately selected datasets and 3D points for the tasks (where applicable) using the criteria below. Below, "Near the center of the dataset" means that the x, y, and z coordinates of a point are in the range [−1/3, 1/3].

**Task 1: Is a CP?** CP near the center of the dataset, 4 instances per visualization method (4 × 4 = 16 trials), 2 instances are CPs, 2 instances not CPs.

**Task 2: Type of CP?** CP near the center of the dataset, 2 instances of each of 8 CPs per visualization method (16×4=64 trials).

**Task 3: Advection task** First point (P1) near the center of the dataset volume, second point (P2) on surface of sphere of radius 0.1 × √3 centered at P1, 4 instances per visualization method (4 × 4 = 16 trials), 2 instances on surface, 2 instances rotated by 20° about a random vector relative to P1.

**Task 4: Is there swirl?** Query point near the center of the dataset volume, 4 instances per visualization method (4 × 4 = 16 trials), one instance per method with the λ_{2} value at the query point in each of the ranges [-4, -3], [-1, 0], [0, 1], and [3, 4]. The majority of λ_{2} values across all datasets were in the range [-7, 7].

**Task 5: Which point is faster?** Query points near the center of the dataset volume, 4 instances per visualization method (4 × 4 = 16 trials), difference in speed between query points in range *speed*_{max} × [0.2,0.5] (where *speed*_{max} is the maximum speed for the specific dataset).
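The λ_{2} values used to select Task 4 stimuli presumably refer to the λ_{2} vortex criterion of Jeong and Hussain, under which λ_{2} < 0 indicates swirling flow; a minimal implementation over a velocity-gradient tensor (our sketch, not the study's code):

```python
import numpy as np

def lambda2(J):
    """Jeong-Hussain lambda-2 criterion: the second-largest eigenvalue of
    S^2 + Omega^2, where S and Omega are the symmetric and antisymmetric
    parts of the velocity gradient J. lambda2 < 0 indicates swirl."""
    S = 0.5 * (J + J.T)
    O = 0.5 * (J - J.T)
    ev = np.sort(np.linalg.eigvalsh(S @ S + O @ O))  # ascending order
    return ev[1]                                     # middle eigenvalue
```

For example, a rigid-rotation gradient yields a negative λ_{2} (swirl), while a pure-strain diagonal gradient yields a positive one (no swirl).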

After the culling process, each dataset had between one and three CPs (66 datasets had 1, 53 had 2, and 9 had 3).