Parallel Light Speed Labeling: an efficient connected component algorithm for labeling and analysis on multi-core processors
Over the last decade, many papers have presented sequential connected component labeling (CCL) algorithms. As modern processors are multi-core and are moving toward many-core designs, the design of a CCL algorithm should address parallelism and multithreading. After a review of sequential CCL algorithms and a study of their variations, this paper presents the parallel version of Light Speed Labeling (LSL) for Connected Component Analysis (CCA) and compares it to our parallelized implementations of state-of-the-art sequential algorithms. We provide benchmarks that help to identify the intrinsic differences between these parallel algorithms. We show that, thanks to its run-based processing, LSL is intrinsically more efficient and faster than all pixel-based algorithms. We also show that all the pixel-based algorithms are memory-bound on multi-socket machines, and are therefore inefficient and do not scale, whereas LSL, thanks to its RLE compression, can scale on such high-end machines. On a 4×15-core machine, and for 8192×8192 images, LSL outperforms its best competitor by a factor of 10.8 and achieves a throughput of 42.4 gigapixels labeled per second.
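To illustrate what run-based processing means here, the following is a minimal sketch of run-length encoding (RLE) one row of a binary image. It is not the paper's implementation: the function name `rle_encode_row` and the half-open `(start, end)` run format are assumptions chosen for illustration only.

```python
def rle_encode_row(row):
    """Return the foreground runs of a binary row as (start, end) pairs.

    `end` is exclusive, so a run covers pixels start .. end - 1.
    """
    runs = []
    start = None  # start index of the run currently being scanned, if any
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                  # a new foreground run begins
        elif not pixel and start is not None:
            runs.append((start, x))    # the current run ends before x
            start = None
    if start is not None:
        runs.append((start, len(row)))  # run reaches the end of the row
    return runs


row = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
print(rle_encode_row(row))  # [(1, 4), (6, 8), (9, 10)]
```

In this example, ten pixels are reduced to three runs. A run-based labeling pass would then work on the runs instead of on individual pixels, which is the kind of data reduction the abstract credits for lower memory traffic.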
Journal of Real-Time Image Processing, Springer Verlag, 2016. ISSN: 1861-8200, EISSN: 1861-8219. DOI: 10.1007/s11554-016-0574-2. http://link.springer.com/article/10.1007/s11554-016-0574-2 https://hal.archives-ouvertes.fr/hal-01361188 2016-03-24