EDITE Pierre FORTIN
Academic status
Thesis defended
Laboratory: permanent staff
Thesis supervision (since 2007)
Blue ellipse: PhD student; yellow ellipse: PhD graduate; green rectangle: permanent staff; yellow rectangle: HDR holder. Green line: thesis advisor; blue line: thesis director; dotted line: mid-term evaluation committee or thesis committee.
Scientific publications
Towards solving the Table Maker's Dilemma on GPU
Since 1985, the IEEE 754 standard has defined formats, rounding modes and basic operations for floating-point arithmetic. In 2008 the standard was extended, and recommendations were added about the rounding of some elementary functions such as trigonometric functions (cosine, sine, tangent and their inverses), exponentials, and logarithms. However, to guarantee the exact rounding of these functions, one has to approximate them with sufficient precision. Finding this precision is known as the Table Maker's Dilemma. To determine this precision, it is necessary to find the hardest-to-round argument of these functions. In 1998, Lefèvre et al. proposed an algorithm that improves on exhaustive search by computing a lower bound on the distance between a line segment and a grid. In this paper we present an analysis of this algorithm in order to deploy it efficiently on GPU. We obtain a speedup of 15.4 on an NVIDIA Fermi GPU over a single high-end CPU core.
20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, Garching, Germany 2012
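As a rough illustration of the problem described in the abstract above, the following minimal sketch runs the naive exhaustive search at a toy precision. It is not the algorithm of Lefèvre et al. and not the GPU implementation from the paper. The function names `midpoint_distance` and `hardest_to_round`, the choice of exp on [0.5, 1), the round-to-nearest mode, and the 60-digit working precision are all assumptions made for this example.

```python
from decimal import Decimal, getcontext
import math

# Working precision (decimal digits) chosen far above the toy format,
# so the reference value of exp(x) can be treated as exact here.
getcontext().prec = 60

def midpoint_distance(m, p):
    """Distance, in ulps of the result, from exp(m / 2**p) to the nearest
    midpoint between two consecutive p-bit floating-point numbers."""
    x = Decimal(m) / Decimal(2) ** p
    y = x.exp()
    e = math.floor(math.log2(float(y)))   # 2**e <= y < 2**(e + 1)
    ulp = Decimal(2) ** (e - (p - 1))     # spacing of p-bit floats near y
    scaled = y / ulp
    frac = scaled - int(scaled)           # position between two neighbours
    return abs(frac - Decimal("0.5"))     # 0 would be an exact midpoint

def hardest_to_round(p):
    """Exhaustively scan every p-bit argument in [0.5, 1)."""
    candidates = range(2 ** (p - 1), 2 ** p)
    best = min(candidates, key=lambda m: midpoint_distance(m, p))
    return best, midpoint_distance(best, p)

if __name__ == "__main__":
    p = 10
    m, dist = hardest_to_round(p)
    extra_bits = -math.log2(float(dist))
    print(f"hardest argument: {m}/2**{p}, distance {float(dist):.3e} ulp")
    print(f"about {extra_bits:.1f} extra bits needed beyond the {p}-bit format")
```

The scan visits 2**(p - 1) arguments, so its cost doubles with each added bit of precision. That growth is why exhaustive search becomes impractical for real formats and why faster algorithms and parallel hardware are of interest.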
Comparisons of different codes for galactic N-body simulations
Vol. 531, A120, online, 2011
Deployment on GPUs of an application in computational atomic physics
12th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC) in conjunction with the 25th International Parallel and Distributed Processing Symposium (IPDPS), Anchorage, Alaska, USA 2011
Efficient Complex Matrix Multiplication on the Synergistic Processing Element of the Cell Processor
Workshop on Parallel Programming and Applications on Accelerator Clusters (PPAAC10), held in conjunction with IEEE Cluster 2010 2010
High-performance BLAS formulation of the adaptive Fast Multipole Method
Vol. 51, No. 3-4, pp. 177-188 2010
Fast Multipole Method on the Cell Broadband Engine: the Near Field Part
Parallel Computing: From Multicores and GPU's to Petascale, Selected Papers from the international Parallel Computing Conference (ParCo2009) 2009