A Parallel Scheme for Large‐scale Polygon Rasterization on CUDA‐enabled GPUs
Published online on May 13, 2016
Abstract
		
This research develops a parallel scheme that uses multiple graphics processing units (GPUs) to accelerate large‐scale polygon rasterization. Three new parallel strategies are proposed. First, a decomposition strategy that accounts for both the computational complexity of the polygons and the limited GPU memory is developed to balance workloads among multiple GPUs. Second, a parallel CPU/GPU scheduling strategy is proposed to hide data read/write times: the CPU handles data reads and writes while the GPU rasterizes the polygons in parallel. This overlap saves considerable time otherwise spent on reading and writing, further improving parallel efficiency. Third, a strategy for utilizing the GPU's internal memory and cache is proposed to reduce data access time. The parallel boundary algebra filling (BAF) algorithm is implemented using the programming models of compute unified device architecture (CUDA), message passing interface (MPI), and open multi‐processing (OpenMP). Experimental results confirm that the implemented parallel algorithm delivers substantial acceleration on a massive dataset (50.32 GB with approximately 1.3 × 10⁸ polygons), reducing conversion time from 25.43 to 0.69 h and obtaining a speedup ratio of 36.91. The proposed parallel strategies outperform the conventional method and can be effectively extended to a CPU‐based environment.
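For readers unfamiliar with BAF, the following is a minimal serial sketch of the boundary algebra idea as it is commonly described. It is not the paper's CUDA/MPI/OpenMP implementation. For every non-horizontal polygon edge, the polygon's value is added to (upward edge) or subtracted from (downward edge) all cells to the left of the edge on each raster row it crosses, so the contributions cancel outside the polygon and sum to the polygon's value inside. The function names, the pixel-center sampling rule, and the orientation normalization via signed area are assumptions made for this illustration.

```python
import math

def signed_area(vertices):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def baf_rasterize(vertices, value, width, height):
    """Rasterize one simple polygon onto a height x width grid (hypothetical helper).

    A cell (row r, column c) is sampled at its center (c + 0.5, r + 0.5).
    """
    grid = [[0] * width for _ in range(height)]
    # Normalize orientation so interior cells always end up at +value.
    orient = 1 if signed_area(vertices) > 0 else -1
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if y0 == y1:
            continue  # horizontal edges cross no row centers
        delta = orient * value if y1 > y0 else -orient * value
        y_lo, y_hi = min(y0, y1), max(y0, y1)
        # Rows whose center lies in the half-open span [y_lo, y_hi).
        r_start = max(0, math.ceil(y_lo - 0.5))
        r_end = min(height, math.ceil(y_hi - 0.5))
        for r in range(r_start, r_end):
            yc = r + 0.5
            x_cross = x0 + (yc - y0) * (x1 - x0) / (y1 - y0)
            # Columns whose center lies strictly left of the crossing point.
            c_end = min(width, max(0, math.ceil(x_cross - 0.5)))
            for c in range(c_end):
                grid[r][c] += delta
    return grid

# Usage: a 3 x 3 square with value 7 on a 6 x 6 grid.
square = [(1, 1), (4, 1), (4, 4), (1, 4)]
raster = baf_rasterize(square, 7, 6, 6)
```

Because each edge contributes only additions and subtractions, the edge-row pairs are independent of one another. This suggests why the method maps well to GPU threads, for example with atomic adds on the output grid, although the kernel design actually used in the paper is not given in the abstract.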