Next: Interactivity Considerations with Up: Interactivity and Performance Previous: Performance limits on

Rendering Performance on the Paragon

Table 2 lists the complete timings for the Intel Paragon using PVR with the clustering scheme described. The data set used for Table 2 is the negative potential of a high-potential iron protein, which is a simulated data set of resolution (see Figure 13 for a rendering example). The table deserves a few explanations. The ``Processors'' column gives the number of processors used for rendering, and the ``Clusters'' column gives the number of clusters. For instance, when we have 16 processors in 8 clusters, the data set is being replicated 2 times, once every 8 processors.

Extracting the rate at which images are generated from this table is less obvious. The reason is that the ``Rendering time'' column only gives the time to calculate a single image, but depending on the -group option, several images may be calculated concurrently. As explained previously, every time a number is repeated in the -group option, the host schedules that image to several clusters concurrently, thus speeding up the generation of that image. At the same time, however, the collector has more work, as it needs to reassemble the image from its pieces (that is why we allow multiple collectors).
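The way a -group argument maps images to clusters can be illustrated with a small sketch. This is not PVR code: the function name `clusters_per_image` is hypothetical, and the sketch assumes that the argument is a comma-separated list with one image number per cluster, which is how the examples in this section read.

```python
from collections import Counter

def clusters_per_image(group_spec):
    """Count how many clusters work on each image number.

    Assumption (not taken from PVR itself): group_spec is a
    comma-separated list with one image number per cluster,
    e.g. "0,0" or "0,1".
    """
    image_ids = [int(token) for token in group_spec.split(",")]
    return dict(Counter(image_ids))

# Both clusters share image 0, so one image is in flight.
print(clusters_per_image("0,0"))  # {0: 2}
# Each cluster renders its own image, so two images are in flight.
print(clusters_per_image("0,1"))  # {0: 1, 1: 1}
```

Under this reading, the number of distinct keys is the number of images calculated concurrently, and each value is the number of pieces the collector has to reassemble for that image.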

The numbers in the table correspond to the total time taken to calculate and regroup the images. In a 32/2 -group 0,0 configuration (line 28), it takes only 0.82 seconds to generate one image, which corresponds to 1.2 frames/sec. A 32/2 -group 0,1 configuration (line 27) takes 1.3 seconds to generate one image, but 2 images are generated concurrently, so the average frame rate is 1.5 frames/sec. In this way, the flexible PVR configuration system can be used to trade image generation latency for image throughput. In some applications it is more important to generate as many images as possible in a given time, while in others the fastest rendering time for a single image is the goal. The PVR cluster mechanisms can be used to study these effects.
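The arithmetic behind these two frame rates can be checked with a short sketch. The helper `frame_rate` is a hypothetical name introduced here for illustration; it simply divides the number of images calculated concurrently by the time the table reports for one image.

```python
def frame_rate(seconds_per_image, concurrent_images):
    """Average frames/sec when concurrent_images finish together
    every seconds_per_image seconds (illustrative helper, not PVR)."""
    return concurrent_images / seconds_per_image

# 32/2 -group 0,0: one image every 0.82 s (lower latency).
print(round(frame_rate(0.82, 1), 1))  # 1.2
# 32/2 -group 0,1: two images every 1.3 s (higher throughput).
print(round(frame_rate(1.3, 2), 1))   # 1.5
```

The first configuration delivers each image sooner, while the second delivers more images per second, which is the latency and throughput trade discussed above.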

Table 2 and Figures 10 and 11 have to be used together to determine the possible rendering times and the time at which the first image (of each bulk command) gets out of the machine. One has to keep in mind that Table 2 does not reflect the compositing latency, as its numbers only reflect the actual rendering time. We wrote PVR scripts to generate the data for Table 2 and Figures 10 and 11 automatically. For the MRI head data set (256 x 256 x 113) we were able to achieve about 2 frames per second with a 64/2 -group 0,0 configuration for images (see Figure 14).

  
Table 2: Rendering timings calculated on the Intel Paragon running OSF/1 for images, using a data set.






Claudio Silva
Thu Apr 20 13:45:22 EDT 1995