Next: SIMD Algorithms Up: Splatting on Multicomputers Previous: Pixel-Plane 5

Splatting Algorithm

The idea of splatting as defined by Ulrich Neumann [13] is to process the dataset one plane at a time. The planes are taken along the orthogonal direction of the dataset that most closely approximates the image plane direction. At the start, each GP is assigned, in round-robin fashion, a set of planes to compute.
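The plane selection and the round-robin assignment can be sketched as follows. This is a minimal illustration, not Neumann's code: the function names are hypothetical, and the slicing direction is assumed to be the dataset axis with the largest component along the viewing direction, which is one reading of the description above.

```python
def slicing_axis(view_dir):
    """Return the index (0, 1 or 2) of the dataset axis most aligned with
    view_dir. Assumption: this is the axis along which planes are taken."""
    return max(range(3), key=lambda i: abs(view_dir[i]))


def assign_planes(num_planes, num_gps):
    """Round-robin assignment: plane p goes to GP number p % num_gps."""
    assignment = [[] for _ in range(num_gps)]
    for p in range(num_planes):
        assignment[p % num_gps].append(p)
    return assignment


# Example: 7 planes shared among 3 GPs.
example_axis = slicing_axis((0.2, -0.9, 0.3))
example_assignment = assign_planes(7, 3)
```

With this scheme consecutive planes land on different GPs, so neighboring planes can be prepared at the same time.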

Each GP processes a whole plane at a time. It transforms, clips, and breaks each voxel in the plane into a kernel description, which is sent to the renderers, where the QEE is used to evaluate the kernel. For efficiency, the Gaussian kernel is approximated by a quadratic kernel that can be computed in the QEE of the renderers. Compositing is done in the renderers as they receive new planes.
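To make the kernel approximation concrete, the sketch below expands a radially symmetric quadratic footprint into the coefficients of a quadratic expression in the pixel coordinates. The specific footprint (1 - r^2/R^2, clamped at zero), the coefficient layout, and all function names are assumptions chosen for illustration; the text does not give the actual quadratic used in [13].

```python
import math


def gaussian_kernel(x, y, cx, cy, sigma):
    """Reference Gaussian footprint centered at (cx, cy)."""
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    return math.exp(-r2 / (2.0 * sigma * sigma))


def quadratic_coefficients(cx, cy, radius):
    """Coefficients (a, b, c, d, e, f) of a*x^2 + b*x*y + c*y^2 + d*x + e*y + f
    obtained by expanding the assumed footprint 1 - ((x-cx)^2 + (y-cy)^2) / radius^2."""
    inv = 1.0 / (radius * radius)
    return (-inv, 0.0, -inv,
            2.0 * cx * inv, 2.0 * cy * inv,
            1.0 - (cx * cx + cy * cy) * inv)


def evaluate_quadratic(coeffs, x, y):
    """Evaluate the quadratic expression at pixel (x, y), clamped at zero
    so the footprint vanishes outside the kernel radius."""
    a, b, c, d, e, f = coeffs
    return max(0.0, a * x * x + b * x * y + c * y * y + d * x + e * y + f)


# Example: a footprint of radius 2 centered at pixel (3, 4).
coeffs = quadratic_coefficients(3.0, 4.0, 2.0)
center_value = evaluate_quadratic(coeffs, 3.0, 4.0)
```

The point of the expansion is that a GP only has to ship six numbers per kernel, and the per-pixel work reduces to evaluating one quadratic expression.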

The algorithm takes advantage of the fact that the Pixel-Plane 5 can be configured with a variable number of GPs and renderers. The GPs prepare data for the renderers, and once a GP has sent its data it has to wait for the renderers to finish rendering. The algorithm can operate in two regimes: either the GPs can send more planes than the renderers can process (Render-Bound performance), or the renderers can process planes faster than the GPs can send them (GP-Bound performance). The slowest link determines the speed of the algorithm. For instance, in the Render-Bound case, a token is created for each renderer, and at any given time only one GP, the one that holds the corresponding token, can send its current processed plane to that renderer. GPs pass the tokens around as the renderers finish processing their planes.
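The token scheme can be illustrated with a small sequential model. Several things here are assumptions made for the sketch and are not stated in the text: every plane is sent to every renderer, each token starts at GP 0, and a token moves to the next GP in round-robin order after each send. Timing is not modeled; the function name is hypothetical.

```python
def simulate_token_passing(num_planes, num_gps, num_renderers):
    """Return, for each renderer, the (gp, plane) sends in the order they occur."""
    # Planes owned by each GP under round-robin assignment.
    planes_of = [list(range(g, num_planes, num_gps)) for g in range(num_gps)]
    # next_index[g][r]: position in planes_of[g] of the next plane that
    # GP g will send to renderer r.
    next_index = [[0] * num_renderers for _ in range(num_gps)]
    # One token per renderer; token_holder[r] is the GP allowed to send to r.
    token_holder = [0] * num_renderers
    received = [[] for _ in range(num_renderers)]

    for _ in range(num_planes):
        for r in range(num_renderers):
            g = token_holder[r]                      # only this GP may send to r
            plane = planes_of[g][next_index[g][r]]
            received[r].append((g, plane))
            next_index[g][r] += 1
            token_holder[r] = (g + 1) % num_gps      # pass the token along
    return received


# Example: 5 planes, 2 GPs, 2 renderers.
log = simulate_token_passing(5, 2, 2)
```

Under these assumptions each renderer receives the planes in order 0, 1, 2, ..., which is what compositing in the renderers requires, even though the planes are prepared by different GPs.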

  
Figure 8: Data synchronization on the Pixel-Plane 5. The left GP holds the right to renderer R1 while passing the usage of renderer R0 to the GP on the right.

When the host loads the data into each GP at the beginning, each GP has to hold three planes for every plane it processes. Even though each GP only does computations for one plane, it needs both neighboring planes to calculate the gradient at each voxel, which is used in shading.
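The need for three planes is easy to see in a gradient computation. The sketch below uses central differences, which is an assumption (the text does not say which gradient estimator is used); the function name and the plane layout (a list of rows, indexed as plane[y][x]) are also illustrative.

```python
def gradient_at(prev_plane, cur_plane, next_plane, x, y):
    """Central-difference gradient at interior voxel (x, y) of cur_plane.
    The z component is the only one that touches the neighboring planes."""
    gx = (cur_plane[y][x + 1] - cur_plane[y][x - 1]) / 2.0
    gy = (cur_plane[y + 1][x] - cur_plane[y - 1][x]) / 2.0
    gz = (next_plane[y][x] - prev_plane[y][x]) / 2.0
    return (gx, gy, gz)


# Example: three 3x3 planes sampled from the linear field 2x + 3y + 5z,
# whose gradient is (2, 3, 5) everywhere.
planes = [[[2 * x + 3 * y + 5 * z for x in range(3)] for y in range(3)]
          for z in range(3)]
g = gradient_at(planes[0], planes[1], planes[2], 1, 1)
```

The x and y components come from the plane being processed, while the z component reads one voxel from each neighbor, so a GP must keep the plane before and the plane after resident.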

This algorithm lends itself to a very efficient implementation on the Pixel-Plane 5 because the compositing and the kernel calculations are done in the renderers. The system is able to achieve 4 frames per second on a dataset using a GP-Bound configuration (40 GPs and 16 renderers). Neumann [13] did a detailed analysis of the number of cycles it takes to render the datasets. He found that the utilization of the computing power (both GPs and renderers) is lower than peak performance. One of the main problems is the need for synchronization at every plane.

Other implementations of splatting on parallel architectures were done by Westover on the Pixel-Plane 5 [12] and on a network of Sun SPARCs [11]. Todd Elvins [40] did an implementation on the nCUBE in which he tried several combinations of optimizations of the algorithm. His times for a dataset were around 23 seconds for an 8-processor configuration.






Claudio Silva
Thu Apr 20 16:03:37 EDT 1995