A dynamic block activation framework for continuum models


ABSTRACT Efficient utilization of massively parallel computing resources is crucial for advancing scientific understanding through complex simulations. However, existing adaptive methods often face challenges in implementation complexity and scalability on modern parallel hardware. Here we present dynamic block activation (DBA), an acceleration framework that can be applied to a broad range of continuum simulations by strategically allocating resources on the basis of the dynamic features of the physical model. By exploiting the hierarchical structure of parallel hardware and dynamically activating and deactivating computation blocks, DBA optimizes performance while maintaining accuracy. We demonstrate DBA’s effectiveness through solving representative models spanning multiple scientific fields, including materials science, biophysics and fluid dynamics, achieving 216–816 central processing unit core-equivalent speedups on a single graphics processing unit (GPU), up to fivefold acceleration compared with highly optimized GPU code and nearly perfect scalability up to 32 GPUs. By addressing common challenges, such as divergent memory access, and reducing programming burden, DBA offers a promising approach to fully leverage massively parallel systems across multiple scientific computing domains.
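The idea of activating and deactivating computation blocks can be illustrated with a minimal NumPy sketch. This is not the authors' GPU implementation. The function names `active_block_mask` and `diffusion_step`, the activation criterion (value spread within a block above a tolerance `tol`), the neighbour-growth rule, the block size and the diffusion model are all assumptions chosen only to show the general pattern of skipping blocks where nothing is changing.

```python
import numpy as np


def active_block_mask(field, block, tol):
    """Flag blocks whose values vary by more than `tol`, plus their neighbours.

    Assumes both grid dimensions are divisible by `block` (illustrative choice).
    """
    ny, nx = field.shape
    by, bx = ny // block, nx // block
    # View the grid as (by, block, bx, block) tiles and measure the spread per tile.
    tiles = field.reshape(by, block, bx, block)
    spread = tiles.max(axis=(1, 3)) - tiles.min(axis=(1, 3))
    active = spread > tol
    # Also activate the four neighbouring blocks (periodic wrap), so that a
    # moving front always enters a block that is already being updated.
    grown = active.copy()
    for axis in (0, 1):
        for shift in (-1, 1):
            grown |= np.roll(active, shift, axis=axis)
    return grown


def diffusion_step(field, active, block, dt=0.1):
    """One explicit diffusion step on a periodic grid, applied only to active blocks."""
    padded = np.pad(field, 1, mode="wrap")  # one-cell periodic halo
    new = field.copy()                      # inactive blocks are left untouched
    for j, i in zip(*np.nonzero(active)):
        y0, x0 = j * block, i * block
        # Tile plus halo; the padded array is offset by one cell.
        t = padded[y0:y0 + block + 2, x0:x0 + block + 2]
        lap = (t[:-2, 1:-1] + t[2:, 1:-1] + t[1:-1, :-2] + t[1:-1, 2:]
               - 4.0 * t[1:-1, 1:-1])
        new[y0:y0 + block, x0:x0 + block] = t[1:-1, 1:-1] + dt * lap
    return new


# Hypothetical test problem: a disc of ones on a background of zeros.
field = np.zeros((64, 64))
yy, xx = np.mgrid[0:64, 0:64]
field[(yy - 32) ** 2 + (xx - 32) ** 2 <= 100] = 1.0

active = active_block_mask(field, block=8, tol=1e-6)
stepped = diffusion_step(field, active, block=8)
print(f"{active.sum()} of {active.size} blocks active")
```

In this toy problem only the blocks around the disc edge are updated, while uniform regions far from the interface are skipped. On a GPU the per-block loop would typically map onto thread blocks, which is the hierarchical structure the abstract refers to, but that mapping is not shown here.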


DATA AVAILABILITY All of the experiments in this Resource are based on simulations, and there are no input data. Source data are provided with this paper. CODE AVAILABILITY The code that supports


the results within this Resource is available via GitHub at https://github.com/zhangruoyao68/DBA and via Zenodo at https://doi.org/10.5281/zenodo.14868458 (ref. 66).

REFERENCES

* Kirk, D. B. & Hwu, W.-M. W. _Programming Massively Parallel Processors_ 3rd edn (Morgan Kaufmann, 2016).
* Wang, Q., Ihme, M., Chen, Y.-F. & Anderson, J. A TensorFlow simulation framework for scientific computing of fluid flows on tensor processing units. _Comput. Phys. Commun._ 274, 108292 (2022).
* Castro, M. D., Vilariño, D. L., Torres, Y. & Llanos, D. R. The role of field-programmable gate arrays in the acceleration of modern high-performance computing workloads. _Computer_ 57, 66–76 (2024).
* Steinkraus, D., Buck, I. & Simard, P. Y. Using GPUs for machine learning algorithms. In _Eighth International Conference on Document Analysis and Recognition (ICDAR’05)_ 2, 1115–1120 (IEEE, 2005).
* Fung, J. Computer vision on the GPU. In _GPU Gems 2: Programming Techniques for High-Performance Graphics and General Purpose Computation_ 1st edn (eds Pharr, M. et al.) Chap. 40 (Addison-Wesley, 2005).
* Götz, A. W., Wölfle, T. & Walker, R. C. Quantum chemistry on graphics processing units. In _Annual Reports in Computational Chemistry_ Vol. 6 (ed Wheeler, R. A.) Chap. 2 (Elsevier, 2010).
* Anderson, J. A., Glaser, J. & Glotzer, S. C. HOOMD-blue: a Python package for high-performance molecular dynamics and hard particle Monte Carlo simulations. _Comput. Mater. Sci._ 173, 109363 (2020).
* Phillips, J. C. et al. Scalable molecular dynamics on CPU and GPU architectures with NAMD. _J. Chem. Phys._ 153, 044130 (2020).
* Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. _Nature_ 630, 493–500 (2024).
* Niemeyer, K. E. & Sung, C.-J. Recent progress and challenges in exploiting graphics processors in computational fluid dynamics. _J. Supercomput._ 67, 528–564 (2014).
* Michalakes, J. & Vachharajani, M. GPU acceleration of numerical weather prediction. In _Proc. IEEE International Symposium on Parallel and Distributed Processing_ 1–7 (IEEE, 2008).
* Eklund, A., Dufort, P., Forsberg, D. & LaConte, S. M. Medical image processing on the GPU—past, present and future. _Med. Image Anal._ 17, 1073–1094 (2013).
* Berger, M. J. & Oliger, J. Adaptive mesh refinement for hyperbolic partial differential equations. _J. Comput. Phys._ 53, 484–512 (1984).
* Berger, M. J. & Colella, P. Local adaptive mesh refinement for shock hydrodynamics. _J. Comput. Phys._ 82, 64–84 (1989).
* Teunissen, J. & Ebert, U. Afivo: a framework for quadtree/octree AMR with shared-memory parallelization and geometric multigrid methods. _Comput. Phys. Commun._ 233, 156–166 (2018).
* Rollier, M., Zielinski, K. M. C., Daly, A. J., Bruno, O. M. & Baetens, J. M. A comprehensive taxonomy of cellular automata. _Commun. Nonlinear Sci. Numer. Simul._ 140, 108362 (2025).
* Provatas, N., Goldenfeld, N. & Dantzig, J. Adaptive mesh refinement computation of solidification microstructures using dynamic data structures. _J. Comput. Phys._ 148, 265–290 (1999).
* Gaston, D., Newman, C., Hansen, G. & Lebrun-Grandié, D. MOOSE: a parallel computational framework for coupled systems of nonlinear equations. _Nucl. Eng. Des._ 239, 1768–1778 (2009).
* Greenwood, M. et al. Quantitative 3D phase field modelling of solidification using next-generation adaptive mesh refinement. _Comput. Mater. Sci._ 142, 153–171 (2018).
* DeWitt, S., Rudraraju, S., Montiel, D., Andrews, W. B. & Thornton, K. PRISMS-PF: a general framework for phase-field modeling with a matrix-free finite element method. _npj Comput. Mater._ 6, 29 (2020).
* Popinet, S. An accurate adaptive solver for surface-tension-driven interfacial flows. _J. Comput. Phys._ 228, 5838–5866 (2009).
* Zhang, W. et al. AMReX: a framework for block-structured adaptive mesh refinement. _J. Open Source Softw._ 4, 1370 (2019).
* Teyssier, R. Cosmological hydrodynamics with adaptive mesh refinement—a new high resolution code called RAMSES. _Astron. Astrophys._ 385, 337–364 (2002).
* Bryan, G. L. et al. ENZO: an adaptive mesh refinement code for astrophysics. _Astrophys. J. Suppl. Ser._ 211, 19 (2014).
* Stone, J. M., Tomida, K., White, C. J. & Felker, K. G. The Athena++ adaptive mesh refinement framework: design and magnetohydrodynamic solvers. _Astrophys. J. Suppl. Ser._ 249, 4 (2020).
* Zhang, W., Myers, A., Gott, K., Almgren, A. & Bell, J. AMReX: block-structured adaptive mesh refinement for multiphysics applications. _Int. J. High Perform. Comput. Appl._ 35, 508–526 (2021).
* Schive, H.-Y. et al. GAMER-2: a GPU-accelerated adaptive mesh refinement code—accuracy, performance, and scalability. _Mon. Not. R. Astron. Soc._ 481, 4815–4840 (2018).
* Wang, P., Abel, T. & Kaehler, R. Adaptive mesh fluid simulations on GPU. _New Astron._ 15, 581–589 (2010).
* Giuliani, A. & Krivodonova, L. Adaptive mesh refinement on graphics processing units for applications in gas dynamics. _J. Comput. Phys._ 381, 67–90 (2019).
* Liu, Z., Tian, F.-B. & Feng, X. An efficient geometry-adaptive mesh refinement framework and its application in the immersed boundary lattice Boltzmann method. _Comput. Methods Appl. Mech. Eng._ 392, 114662 (2022).
* Farooqi, M. N. et al. Asynchronous AMR on multi-GPUs. In _Lecture Notes in Computer Science_ Vol. 11887 (eds Weiland, M. et al.) 113–123 (Springer, 2019).
* Beckingsale, D., Gaudin, W., Herdman, A. & Jarvis, S. Resident block-structured adaptive mesh refinement on thousands of graphics processing units. In _Proc. 44th International Conference on Parallel Processing_ 61–70 (IEEE, 2015).
* Wang, J. & Yalamanchili, S. Characterization and analysis of dynamic parallelism in unstructured GPU applications. In _Proc. IEEE International Symposium on Workload Characterization_ 51–60 (IEEE, 2014).
* Hohenberg, P. C. & Halperin, B. I. Theory of dynamic critical phenomena. _Rev. Mod. Phys._ 49, 435–479 (1977).
* Kobayashi, R. Modeling and numerical simulations of dendritic crystal growth. _Physica D_ 63, 410–423 (1993).
* Steinbach, I. Phase-field models in materials science. _Model. Simul. Mat. Sci. Eng._ 17, 073001 (2009).
* Francois, M. M. et al. Modeling of additive manufacturing processes for metals: challenges and opportunities. _Curr. Opin. Solid State Mater. Sci._ 21, 198–206 (2017).
* Berry, J. et al. Toward multiscale simulations of tailored microstructure formation in metal additive manufacturing. _Mater. Today_ 51, 65–86 (2021).
* Allen, S. M. & Cahn, J. W. Ground state structures in ordered binary alloys with second neighbor interactions. _Acta Metall._ 20, 423–433 (1972).
* Mullins, W. W. & Sekerka, R. F. Stability of a planar interface during solidification of a dilute binary alloy. _J. Appl. Phys._ 35, 444–451 (1964).
* Plapp, M. & Karma, A. Multiscale random-walk algorithm for simulating interfacial pattern formation. _Phys. Rev. Lett._ 84, 1740–1743 (2000).
* Zhang, R., Mao, S. & Haataja, M. P. Chemically reactive and aging macromolecular mixtures. II. Phase separation and coarsening. _J. Chem. Phys._ 161, 184903 (2024).
* Brangwynne, C. P. et al. Germline P granules are liquid droplets that localize by controlled dissolution/condensation. _Science_ 324, 1729–1732 (2009).
* Hyman, A. A., Weber, C. A. & Jülicher, F. Liquid–liquid phase separation in biology. _Annu. Rev. Cell Dev. Biol._ 30, 39–58 (2014).
* Berry, J., Brangwynne, C. P. & Haataja, M. Physical principles of intracellular organization via active and passive phase transitions. _Rep. Prog. Phys._ 81, 046601 (2018).
* Mao, S., Kuldinow, D., Haataja, M. P. & Košmrlj, A. Phase behavior and morphology of multicomponent liquid mixtures. _Soft Matter_ 15, 1297–1311 (2019).
* Cahn, J. W. & Hilliard, J. E. Free energy of a nonuniform system. III. Nucleation in a two-component incompressible fluid. _J. Chem. Phys._ 31, 688–699 (1959).
* Lifshitz, I. M. & Slyozov, V. V. The kinetics of precipitation from supersaturated solid solutions. _J. Phys. Chem. Solids_ 19, 35–50 (1961).
* Wagner, C. Theorie der Alterung von Niederschlägen durch Umlösen (Ostwald Reifung). _Z. Elektrochem. Ber. Bunsenges. Phys. Chem._ 65, 581–591 (1961).
* Helmholtz, H. XLIII. On discontinuous movements of fluids. _Lond. Edinb. Dublin Philos. Mag. J. Sci._ 36, 337–346 (1868).
* Toro, E. F. _Riemann Solvers and Numerical Methods for Fluid Dynamics_ 3rd edn (Springer, 2009).
* McNally, C. P., Lyra, W. & Passy, J.-C. A well-posed Kelvin–Helmholtz instability test and comparison. _Astrophys. J. Suppl. Ser._ 201, 18 (2012).
* Foullon, C., Verwichte, E., Nakariakov, V. M., Nykyri, K. & Farrugia, C. J. Magnetic Kelvin–Helmholtz instability at the Sun. _Astrophys. J. Lett._ 729, L8 (2011).
* Smyth, W. & Moum, J. Ocean mixing by Kelvin–Helmholtz instability. _Oceanography_ 25, 140–149 (2012).
* Rusanov, V. V. The calculation of the interaction of non-stationary shock waves and obstacles. _USSR Comput. Math. Math. Phys._ 1, 304–320 (1962).
* Burau, H. et al. PIConGPU: a fully relativistic particle-in-cell code for a GPU cluster. _IEEE Trans. Plasma Sci. IEEE Nucl. Plasma Sci. Soc._ 38, 2831–2839 (2010).
* Crespo, A. C., Dominguez, J. M., Barreiro, A., Gómez-Gesteira, M. & Rogers, B. D. GPUs, a new tool of acceleration in CFD: efficiency and reliability on smoothed particle hydrodynamics methods. _PLoS ONE_ 6, e20685 (2011).
* Montessori, A. et al. Thread-safe lattice Boltzmann for high-performance computing on GPUs. _J. Comput. Sci._ 74, 102165 (2023).
* Teyssier, R., Chapon, D. & Bournaud, F. The driving mechanism of starbursts in galaxy mergers. _Astrophys. J. Lett._ 720, L149–L154 (2010).
* Shaw, D. E. et al. Anton 3. In _Proc. International Conference for High Performance Computing, Networking, Storage and Analysis_ 1–11 (ACM, 2021).
* Mudigere, D. et al. Software–hardware co-design for fast and scalable training of deep learning recommendation models. In _Proc. 49th Annual International Symposium on Computer Architecture_ 993–1011 (ACM, 2022).
* Cong, J. et al. FPGA HLS today: successes, challenges, and opportunities. _ACM Trans. Reconfigurable Technol. Syst._ 15, 1–42 (2022).
* Dally, W. J., Turakhia, Y. & Han, S. Domain-specific hardware accelerators. _Commun. ACM_ 63, 48–57 (2020).
* Rocki, K. et al. Fast stencil-code computation on a wafer-scale processor. In _SC20: International Conference for High Performance Computing, Networking, Storage and Analysis_ Vol. 58 1–14 (IEEE, 2020).
* Watanabe, S. & Aoki, T. Large-scale flow simulations using lattice Boltzmann method with AMR following free-surface on multiple GPUs. _Comput. Phys. Commun._ 264, 107871 (2021).
* Zhang, R. & Xia, Y. Source code for a dynamic block activation framework for continuum models. _Zenodo_ https://doi.org/10.5281/zenodo.14868458 (2025).


ACKNOWLEDGEMENTS R.Z. was supported by the National Science Foundation (NSF) Materials Research Science and Engineering Center Program through the Princeton Center for Complex Materials (PCCM) (grant no. DMR-2011750). Y.X. was supported by the National Natural Science Foundation of China (grant no. 12204162). Useful discussions with M. P. Haataja, R. Teyssier, S. Cohen, J. Lalmansingh and Q. Cai are gratefully acknowledged. The simulations presented in this Resource were performed on computational resources managed and supported by Princeton Research Computing, a consortium of groups including the Princeton Institute for Computational Science and Engineering (PICSciE) and Research Computing at Princeton University.


AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA Ruoyao Zhang * College of Materials Science and Engineering, Hunan University, Changsha, People’s Republic of China Yang Xia

CONTRIBUTIONS R.Z. and Y.X. conceptualized the study, designed the computational framework, implemented the code, analyzed results, visualized simulations and drafted the paper.

CORRESPONDING AUTHOR Correspondence to Yang Xia.

ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests.

PEER REVIEW PEER REVIEW INFORMATION _Nature Computational Science_ thanks Cody Permann, Tatu Pinomaa, Nicolò Scapin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Fernando Chirigati, in collaboration with the _Nature Computational Science_ team.

ADDITIONAL INFORMATION


PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

SUPPLEMENTARY INFORMATION Supplementary Figs. 1 and 2, Supplementary model descriptions and Supplementary Tables 1 and 2.

SOURCE DATA FIG. 3 Excel file for the source data used in Fig. 3. SOURCE DATA FIG. 4 Excel file for the source data used in Fig. 4. SOURCE DATA FIG. 5 Excel file for the source data used in Fig. 5.

RIGHTS AND PERMISSIONS Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

ABOUT THIS ARTICLE CITE THIS ARTICLE Zhang, R. & Xia, Y. A dynamic block activation framework for continuum models. _Nat. Comput. Sci._ 5, 345–354 (2025). https://doi.org/10.1038/s43588-025-00780-2

* Received: 30 August 2024
* Accepted: 19 February 2025
* Published: 17 March 2025
* Issue Date: April 2025
* DOI: https://doi.org/10.1038/s43588-025-00780-2