Date 25-26 Oct. 1993

  • Author index

    Page(s): 0_2
    Freely Available from IEEE
  • Proceedings of 1993 IEEE Parallel Rendering Symposium

    Freely Available from IEEE
  • Integrating volume data analysis and rendering on distributed memory architectures

    Page(s): 89 - 96, 112

    The ability to generate visual representations of data, and the ability to enhance data into a suitable form for the purpose of visual representation, form two key components in a scientific visualization system. By a visual representation we mean the ability to render the data, using visual cues, such that the important features are readily perceived by the user. By the ability to enhance data we mean the ability to apply transformations to the data so that salient features embedded in the data become discernible and quantifiable. The rendering of data (computer graphics) and the enhancement of data (image processing) have emerged over the last twenty years as separate scientific disciplines. However, in scientific visualization and other applications of empirical data interpretation, we are increasingly confronted with the need to combine both data rendering and data transformation capabilities under one system framework. This paper describes the design issues and implementation of a program for visualizing and enhancing volume data on distributed memory architectures. Our design is motivated by the desire to interactively view, transform, and interpret volume data acquired using seismic imaging techniques. Experimental results derived from an implementation on the Connection Machine CM-5 are described.

  • A task adaptive parallel graphics renderer

    Page(s): 27 - 34, 107

    This paper presents a graphics renderer which incorporates new partitioning methodologies of memory and work for efficient execution on a parallel computer. The task adaptive domain decomposition scheme is an image space method involving dynamic partitioning of rectangular pixel area tasks. We show that this method requires little overhead, allows coherence within a parallel context, handles worst case scenarios with reasonable speedup, executes efficiently, and requires minimal processor synchronization. The implementation analysis indicates that load imbalance is the major cause of performance degradation at the higher processor counts. Even so, on a variety of test scenes, an average rendering speedup of 79 was achieved utilizing 96 processors on the BBN TC2000 multiprocessor with processor efficiency ranging from 66% to 94%.
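
The speedup and efficiency figures quoted in this abstract are related by a simple ratio. The short sketch below works through that arithmetic, assuming the usual definition of parallel efficiency as speedup divided by processor count; the function name `parallel_efficiency` is hypothetical and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return efficiency as a fraction, assuming efficiency = speedup / processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Average speedup of 79 on 96 processors, as quoted in the abstract.
average_efficiency = parallel_efficiency(79, 96)
print(f"{average_efficiency:.1%}")  # prints "82.3%"
```

Under that definition the average sits inside the 66% to 94% per-scene range reported in the abstract.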

  • Parallel volume-rendering algorithm performance on mesh-connected multicomputers

    Page(s): 97 - 104

    This work examines the network performance of mesh-connected multicomputers applied to parallel volume rendering algorithms. This issue has not been addressed in papers describing particular parallel implementations, but is pertinent to anyone designing or implementing parallel rendering algorithms. Parallel volume rendering algorithms fall into two main classes: image partitions and object partitions. Communication requirements for algorithms in these classes are analyzed. Network performance for these algorithms is estimated by using an existing model of mesh network behavior. The performance estimates are verified by tests on the Touchstone Delta. The results indicate that, for a fixed screen size, the performance of 2D mesh networks scales very well when used with object partition algorithms: the time required for communication actually decreases as the data and system sizes increase. A Touchstone Delta implementation of an object partition algorithm is briefly described to illustrate the algorithm's low communication requirements.

  • A pyramid-based approach to interactive terrain visualization

    Page(s): 67 - 70, 106

    This paper describes a multiresolution approach to the visualization of surface data. The algorithms discussed allow the generation of arbitrary views of 3-dimensional surfaces. Image processing and texture mapping techniques are combined in a new 3-pass scanline algorithm to achieve smooth and continuous translations, rotations, and scale changes of large data sets. The implementation of the algorithms on a massively parallel SIMD video supercomputer, the Princeton Engine, allows the scenes to be generated interactively at video rates.

  • Progressive refinement radiosity on ring-connected multicomputers

    Page(s): 71 - 76

    The progressive refinement method is investigated for parallelization on ring-connected multicomputers. A synchronous scheme, based on static task assignment, is proposed in order to achieve better coherence during the parallel light distribution computations. An efficient global circulation scheme is proposed for the parallel light distribution computations, which reduces the total volume of concurrent communication by an asymptotic factor. The proposed parallel algorithm is implemented on a ring-embedded Intel iPSC/2 hypercube multicomputer. The load balance quality of the proposed static assignment schemes is evaluated experimentally. The effect of coherence in the parallel light distribution computations on the shooting patch selection sequence is also investigated.

  • Pixel merging for object-parallel rendering: A distributed snooping algorithm

    Page(s): 49 - 56

    In the purely object-parallel approach to multiprocessor rendering, each processor is assigned responsibility to render a subset of the graphics database. When rendering is complete, pixels from the processors must be merged and globally z-buffered. On an arbitrary multiprocessor interconnection network, the straightforward algorithm for pixel merging requires d̄A total network bandwidth per frame, where d̄ is the depth complexity of the scene and A is the area of the screen or window. This algorithm is used by the Kubota Pacific Denali and appears to be used by the Evans and Sutherland Freedom series. An alternative algorithm, the PixelFlow algorithm, requires nA network bandwidth per frame, where n is the number of processors. But the merging is pipelined in PixelFlow so that each network link must only support A bandwidth per frame. However, that algorithm requires a separate special-purpose network for pixel merging. In this paper we present and analyze an expected-case log(d̄)A algorithm for pixel merging that uses network broadcast, and we discuss the algorithm's applicability to shared-memory bus architectures.
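
As a rough illustration of the straightforward merge this abstract uses as its baseline, the sketch below z-buffers per-processor pixel fragments into one image and counts how many fragments had to be sent. It is not the paper's snooping algorithm. The names `merge_zbuffered` and `fragments_sent`, the dictionary representation, and the convention that a smaller depth is nearer the viewer are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]        # (x, y) screen coordinate
Fragment = Tuple[float, str]   # (depth, color); smaller depth is assumed nearer


def merge_zbuffered(partials: List[Dict[Pixel, Fragment]]) -> Dict[Pixel, Fragment]:
    """Keep, for every pixel, the nearest fragment rendered by any processor."""
    merged: Dict[Pixel, Fragment] = {}
    for partial in partials:
        for pixel, (depth, color) in partial.items():
            best = merged.get(pixel)
            if best is None or depth < best[0]:
                merged[pixel] = (depth, color)
    return merged


def fragments_sent(partials: List[Dict[Pixel, Fragment]]) -> int:
    """Count every rendered fragment, since the naive merge transmits all of them."""
    return sum(len(partial) for partial in partials)


# Two hypothetical processors that both rendered pixel (0, 0).
processor_a = {(0, 0): (0.7, "red"), (1, 0): (0.2, "red")}
processor_b = {(0, 0): (0.4, "blue")}

image = merge_zbuffered([processor_a, processor_b])
print(image[(0, 0)])                               # (0.4, 'blue')
print(fragments_sent([processor_a, processor_b]))  # 3
```

The fragment count grows with the number of surfaces covering each pixel, which is why the naive merge costs bandwidth proportional to depth complexity times screen area.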

  • Scalable parallel volume raycasting for nonrectilinear computational grids

    Page(s): 81 - 88, 111

    A scalable approach to parallel volume raycasting of structured and unstructured computational grids is presented. The algorithm is general enough to handle non-convex grids and cells, grids with voids, grids constructed from multiple grids, and embedded geometrical primitives. The algorithm is designed for a highly parallel MIMD architecture which features both local memory and shared memory with nonuniform access times. It has been implemented on a BBN TC2000 and benchmarked on several datasets. A variation of the algorithm which provides fast image updates for a changing transfer function is also presented. A distributed approach to controlling the execution of the volume renderer is used, and the graphical user interface designed for this purpose is briefly described.

  • Segmented ray casting for data parallel volume rendering

    Page(s): 7 - 14

    Interactive volume rendering is important to the timely analysis of three-dimensional data, but workstations take seconds to minutes to render data sets of a few megabytes. We have developed a parallel ray-casting technique, called Segmented Ray Casting, which can render a 128×128×128 data set at 2-3 frames per second on a 4K-processor DECmpp 1200/Sx Model 100. Pixel values in the image plane are computed by casting rays through the volume data. The rays are segmented based on the intersection with the data subblocks in the processors. Each processor computes the color and opacity of the ray segments which pass through its subblock, which are then sent to the appropriate processor for composition with other segment values. Unlike other data-parallel volume renderers, Segmented Ray Casting does not require the transposition of volume data between processors at any time, nor does it suffer from resampling artifacts due to shearing.

  • A multicomputer polygon rendering algorithm for interactive applications

    Page(s): 43 - 48, 109

    This paper presents a new multicomputer polygon rendering algorithm that is specialized for interactive applications. The algorithm differs from previous algorithms in two ways. First, it load balances the rasterization once per frame, instead of as the frame progresses, using the previous frame's distribution of polygons on the screen as input to the load-balancing algorithm. Second, it uses a new message sending scheme that reduces the number of messages required. These characteristics mean that the algorithm only requires global synchronization between frames, which allows for higher frame rates. The algorithm was selected using a simulator which confirmed that using the previous frame's polygon distribution on the screen is nearly as good as using the current frame's distribution. The algorithm is implemented on Caltech's Intel Touchstone Delta, a 512-processor multicomputer system, and preliminary performance figures are given. The highest performance achieved to date is 930,000 triangles per second using 256 processors and an 806,640-triangle data set.
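
The idea of balancing a frame's rasterization load from the previous frame's polygon distribution can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the paper's algorithm: it assumes the screen is split into contiguous column strips and that a per-column polygon count from the previous frame is available. The function name `balance_columns` and that histogram input are hypothetical.

```python
from typing import List, Tuple


def balance_columns(prev_counts: List[int], processors: int) -> List[Tuple[int, int]]:
    """Split screen columns into one contiguous strip per processor.

    prev_counts[c] is the (assumed) number of polygons that touched column c
    in the previous frame. Strips are half-open ranges (start, end) chosen so
    each carries roughly total / processors polygons.
    """
    if processors <= 0:
        raise ValueError("processors must be positive")
    total = sum(prev_counts)
    bounds = [0]
    running = 0
    for col, count in enumerate(prev_counts):
        running += count
        # Close a strip each time the cumulative load reaches the next target.
        while len(bounds) < processors and running * processors >= total * len(bounds):
            bounds.append(col + 1)
    # Pad with empty strips if there were too few columns, then close the last one.
    while len(bounds) < processors:
        bounds.append(len(prev_counts))
    bounds.append(len(prev_counts))
    return list(zip(bounds[:-1], bounds[1:]))


# Previous frame: polygons cluster in columns 0, 3, and 4.
strips = balance_columns([4, 0, 0, 4, 8, 0, 0, 0], processors=2)
print(strips)  # [(0, 4), (4, 8)]
```

Because the assignment is computed once from last frame's data, processors need no coordination while the current frame is being drawn, which matches the abstract's point about synchronizing only between frames.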

  • Parallel volume rendering and data coherence

    Page(s): 23 - 26, 106

    The two key issues in implementing a parallel ray-casting volume renderer are the work distribution and the data distribution. We have implemented such a renderer on the Fujitsu AP1000 using an adaptive image-space subdivision algorithm based on the worker-farm paradigm for the work distribution, and a distributed virtual memory, implemented in software, to provide the data distribution. Measurements show that this scheme works efficiently and effectively utilizes the data coherence that is inherent in volume data.

  • An efficient parallel ray tracing scheme for distributed memory parallel computers

    Page(s): 77 - 80

    The ray-tracing algorithm produces high-quality images by taking multiple lighting effects into account. Hence, it requires many computations and a large memory capacity. Parallel machines offer a way to significantly reduce the synthesis time. Distributed Memory Parallel Computers offer an interesting performance/cost ratio but require both computations and data to be distributed. This paper is a study of the implementation of the ray-tracing algorithm on a Distributed Memory Parallel Computer. An original solution, which combines a data parallelism approach with a task parallelism approach, is presented. A dynamic load redistribution mechanism ensures a good load balance during the synthesis phase. At the end of the paper, some results of our transputer implementation are presented.

  • A MIMD rendering algorithm for distributed memory architectures

    Page(s): 35 - 42, 108

    We present a parallel rendering algorithm targeted to MIMD distributed-memory message-passing architectures. For maximum performance, the algorithm exploits both object-level and image-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. The results show that the choice of message size has a significant impact on performance. Scalability to large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 confirms the analytical results and demonstrates increasing performance from 1 to 128 processors across a wide range of scene complexities.

  • Parallel approximate computation of projections for animated volume rendered displays

    Page(s): 61 - 66

    We present an approximate volume rendering algorithm that can compute multiple views of a 3D voxel-based data set concurrently. The approach employs a unique new method for combining partial results from neighboring projections to compute a sequence of rotated views in fewer instructions than would be required for independent computations. For instance, the algorithm can compute a set of N projections through an N×N×N data set in only O(log N) parallel steps, using only O(N^3) total operations (work), matching the bounds for computing a single projection by conventional methods.
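
The single-projection bound that this abstract cites as its baseline can be illustrated with a pairwise tree reduction. The sketch below assumes a simple additive projection, in which the N samples along one ray are summed; it is not the paper's multi-view method. Each loop iteration stands for one parallel step in which all pairs would be added at once. The name `tree_reduce_sum` is hypothetical.

```python
from typing import List, Tuple


def tree_reduce_sum(samples: List[float]) -> Tuple[float, int]:
    """Sum samples pairwise, returning (total, number of parallel steps)."""
    if not samples:
        raise ValueError("samples must not be empty")
    level = list(samples)
    steps = 0
    while len(level) > 1:
        # One parallel step: every adjacent pair is combined independently.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # an odd element carries over unchanged
        level = paired
        steps += 1
    return level[0], steps


# Eight samples along one ray take log2(8) = 3 parallel steps.
print(tree_reduce_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 3)
```

Summing one ray this way performs N - 1 additions, so the N×N rays of a single projection cost on the order of N^3 operations in O(log N) steps, under the additive-projection assumption above.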

  • A data distributed, parallel algorithm for ray-traced volume rendering

    Page(s): 15 - 22, 105

    This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
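
To show why the subimages must be composited "in the proper order", the sketch below applies the standard "over" operator to one pixel's contributions, front to back. It is a sequential illustration only, not the paper's parallel compositing method. It assumes a single grayscale channel with opacity-premultiplied color, and the names `over` and `composite_front_to_back` are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (premultiplied color, opacity) for one pixel


def over(front: Sample, back: Sample) -> Sample:
    """Composite `front` over `back`, assuming premultiplied color."""
    front_color, front_alpha = front
    back_color, back_alpha = back
    remaining = 1.0 - front_alpha  # how much of the back sample shows through
    return (front_color + remaining * back_color,
            front_alpha + remaining * back_alpha)


def composite_front_to_back(samples: List[Sample]) -> Sample:
    """Fold per-subvolume samples for one pixel, nearest subvolume first."""
    result: Sample = (0.0, 0.0)  # fully transparent start
    for sample in samples:
        result = over(result, sample)
    return result


near = (0.25, 0.5)  # half-transparent contribution from the nearer subvolume
far = (1.0, 1.0)    # opaque contribution from the farther subvolume

print(composite_front_to_back([near, far]))  # (0.75, 1.0)
print(composite_front_to_back([far, near]))  # (1.0, 1.0)
```

Swapping the two samples changes the result, which is why the compositing order has to follow the subvolumes' depth order along the viewing direction.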

  • Permutation warping for data parallel volume rendering

    Page(s): 57 - 60, 110

    Volume rendering algorithms visualize sampled three-dimensional data. A variety of applications create sampled data, including medical imaging, simulations, animation, and remote sensing. Researchers have sought to speed up volume rendering because of the high run time and wide application. Our algorithm uses permutation warping to achieve linear speedup on data parallel machines. This new algorithm calculates higher quality images than previous distributed approaches, and also provides more view angle freedom. We present permutation warping results on the SIMD MasPar MP-1. The efficiency results from nonconflicting communication. The communication remains efficient with arbitrary view directions, larger data sets, larger parallel machines, and high-order filters. We show constant run time versus view angle, tunable filter quality, and efficient memory implementation.
