URN for citing the version on EPub Bayreuth: urn:nbn:de:bvb:703-epub-6559-0
Bibliographic details
Lehmann, Moritz ; Krause, Mathias ; Amati, Giorgio ; Sega, Marcello ; Harting, Jens ; Gekle, Stephan:
Accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit, and customized 16-bit number formats.
In: Physical Review E.
Vol. 106 (26 July 2022), Issue 1. - No. 015308.
ISSN 1550-2376
DOI of the publisher's version: https://doi.org/10.1103/PhysRevE.106.015308
Project information
Project title:
Official project title | Project ID |
---|---|
SFB 1357 Mikroplastik | 391977956 |
FOR 2688, Instabilities, Bifurcations and Migration in Pulsating Flow, Projects No. B3 | 417989940 |
FOR 2688, Instabilities, Bifurcations and Migration in Pulsating Flow, Projects No. B2 | 417989464 |
Project funding: Deutsche Forschungsgemeinschaft
Abstract
Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory intensive. Alongside a reduction in memory footprint, significant performance benefits can be achieved by using FP32 (single) precision compared to FP64 (double) precision, especially on GPUs. Here we evaluate the possibility of using even FP16 and posit16 (half) precision for storing fluid populations, while still carrying out arithmetic operations in FP32. For this, we first show that the commonly occurring number range in the LBM is a lot smaller than the FP16 number range. Based on this observation, we develop customized 16-bit formats (based on a modified IEEE-754 and on a modified posit standard) that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six different test systems with increasing complexity: Poiseuille flow, Taylor-Green vortices, Kármán vortex streets, lid-driven cavity, a microcapsule in shear flow (utilizing the immersed-boundary method), and, finally, the impact of a raindrop (based on a volume-of-fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that for a large number of cases even 16-bit is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that significant speedup is achieved with mixed FP32/16-bit.
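
The idea of storing populations in 16 bits while doing arithmetic in FP32 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the scaling factor 2**15, and the test value range are assumptions chosen for illustration only. The sketch compares plain IEEE FP16 storage with a scaled variant that shifts the representable range toward small magnitudes, in the spirit of a 16-bit format tailored to a narrow number range.

```python
import numpy as np

# Assumed scaling factor (illustrative, not taken from the abstract).
# A power of two makes the scaling itself exact in FP32.
SCALE = np.float32(2.0**15)

def store_fp16(f32):
    """Compress FP32 values to plain IEEE FP16 for storage."""
    return f32.astype(np.float16)

def load_fp16(f16):
    """Decompress stored FP16 values to FP32 before any arithmetic."""
    return f16.astype(np.float32)

def store_fp16_scaled(f32):
    """Scale up before conversion so small values avoid FP16 subnormals."""
    return (f32 * SCALE).astype(np.float16)

def load_fp16_scaled(f16):
    """Convert to FP32 first, then undo the scaling."""
    return f16.astype(np.float32) / SCALE

def max_rel_error(exact, approx):
    """Largest relative round-trip error over all samples."""
    return float(np.max(np.abs(approx - exact) / np.abs(exact)))

# Hypothetical small magnitudes that fall into the FP16 subnormal range.
rng = np.random.default_rng(0)
values = rng.uniform(1e-6, 1e-5, size=1000).astype(np.float32)

err_plain = max_rel_error(values, load_fp16(store_fp16(values)))
err_scaled = max_rel_error(values, load_fp16_scaled(store_fp16_scaled(values)))

print(f"plain FP16 storage:  max relative error = {err_plain:.2e}")
print(f"scaled FP16 storage: max relative error = {err_scaled:.2e}")
```

In this sketch the scaled variant gives up the large-magnitude end of the FP16 range, which the chosen sample values never reach, in exchange for full mantissa precision at small magnitudes. All arithmetic, including the error evaluation, is carried out on FP32 arrays; only the stored copies are 16-bit.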