File Per Process IO, performance summary in docs, new example case. #256

Merged
merged 8 commits into from
Dec 14, 2023
Changes from 4 commits
6 changes: 5 additions & 1 deletion docs/documentation/case.md
@@ -344,6 +344,7 @@ Note that `time_stepper` $=$ 3 specifies the total variation diminishing (TVD),
| `format` | Integer | Output format. [1]: Silo-HDF5; [2] Binary |
| `precision` | Integer | [1] Single; [2] Double |
| `parallel_io` | Logical | Parallel I/O |
| `file_per_process` | Logical | Whether or not to write one IO file per process |
| `cons_vars_wrt` | Logical | Write conservative variables |
| `prim_vars_wrt` | Logical | Write primitive variables |
| `alpha_rho_wrt(i)` | Logical | Add the partial density of the fluid $i$ to the database |
@@ -377,7 +378,10 @@ The table lists formatted database output parameters. The parameters define vari
With parallel I/O, MFC inputs and outputs a single file throughout pre-process, simulation, and post-process, regardless of the number of processors used.
Parallel I/O enables the use of a different number of processors in each of the processes (e.g., simulation data generated using 1000 processors can be post-processed using a single processor).

- `cons_vars_wrt` and `prim_vars_wrt} activate output of conservative and primitive state variables into the database, respectively.
- `file_per_process` deactivates shared-file MPI-IO and activates file-per-process MPI-IO. The default behaviour is to use a shared file.
File per process is useful when running on tens of thousands of ranks.

- `cons_vars_wrt` and `prim_vars_wrt` activate output of conservative and primitive state variables into the database, respectively.

- `[variable's name]_wrt` activates output of each specified variable into the database.
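
As a minimal sketch of how these I/O options might sit together in a case file: the snippet below assumes a Python-dictionary case format (as suggested by the `'model_eqns' : 3` notation used in the performance docs) and assumes logical parameters are passed as `'T'`/`'F'` strings. Only the parameter names come from the table above; the values and the variable names `io_options` and `case_fragment` are illustrative.

```python
import json

# Hypothetical fragment of a case dictionary. The keys are taken from the
# parameter table; the 'T'/'F' values are an assumption for illustration.
io_options = {
    'parallel_io': 'T',       # enable parallel I/O
    'file_per_process': 'T',  # one I/O file per process instead of a shared file
    'prim_vars_wrt': 'T',     # write primitive variables
}

# Serialize the fragment so it can be inspected or merged into a larger case.
case_fragment = json.dumps(io_options)
print(case_fragment)
```

Leaving `file_per_process` unset would keep the default shared-file behaviour described above.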

74 changes: 74 additions & 0 deletions docs/documentation/expectedPerformance.md
@@ -0,0 +1,74 @@
# Performance Results

MFC has been extensively benchmarked on both CPUs and GPUs. A summary of these results follows.

## Expected time-steps/hour

The following table outlines expected performance in terms of number of time-steps per hour
(rounded to the nearest hundred) for various problem sizes and hardware for an inviscid, 6-equation,
3D simulation. CPU results utilize an entire die.

| Hardware | # Ranks | 1M Cells | 4M Cells | 8M Cells | Compiler | Computer |
| ---: | :----: | :----: | :---: | :---: | :----: | :--- |
| Nvidia V100 | 1 | 88.5k | 18.7k | N/A | NVHPC 22.11 | PACE Phoenix |
| Nvidia A100 | 1 | 114.4k | 34.6k | 16.5k | NVHPC 23.5 | Wingtip |
| AMD MI250x | 1 | 77.5k | 22.3k | 11.2k | CCE 16.0.1 | OLCF Frontier |
| Intel Xeon Gold 6226 | 12 | 2.5k | 0.7k | 0.4k | GNU 10.3.0 | PACE Phoenix |
| Apple Silicon M2 | 6 | 2.8k | 0.6k | 0.2k | GNU 13.2.0 | N/A |
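
To show how a table entry translates into a run-time estimate, here is a small sketch. The function name `wall_hours` and the run length of 100,000 time steps are assumptions made for illustration; only the 16.5k time-steps/hour figure (Nvidia A100, 8M cells, 6-equation model) is taken from the table.

```python
def wall_hours(n_steps: int, steps_per_hour: float) -> float:
    """Estimated wall-clock hours needed to advance n_steps time steps."""
    if steps_per_hour <= 0:
        raise ValueError("steps_per_hour must be positive")
    return n_steps / steps_per_hour


# Table value: Nvidia A100, 8M cells, 6-equation model.
a100_8m_steps_per_hour = 16.5e3

# Hypothetical run length (an assumption, not a value from the docs).
hours = wall_hours(100_000, a100_8m_steps_per_hour)
print(f"{hours:.2f} h")
```

Because the table values are rounded to the nearest hundred, estimates like this are only approximate.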

If `'model_eqns' : 3` is replaced by `'model_eqns' : 2`, an inviscid 5-equation model is used.
The following table outlines expected performance in terms of number of time-steps per hour
(rounded to the nearest hundred) for various problem sizes and hardware for an inviscid, 5-equation,
3D simulation. CPU results utilize an entire die.

| Hardware | # Ranks | 1M Cells | 4M Cells | 8M Cells | Compiler | Computer |
| ---: | :----: | :----: | :---: | :---: | :----: | :--- |
| Nvidia V100 | 1 | 113.4k | 26.2k | N/A | NVHPC 22.11 | PACE Phoenix |
| Nvidia A100 | 1 | 153.5k | 48.0k | 22.5k | NVHPC 23.5 | Wingtip |
| AMD MI250x | 1 | 104.2k | 31.0k | 14.8k | CCE 16.0.1 | OLCF Frontier |
| Intel Xeon Gold 6226 | 12 | 5.4k | 1.6k | 0.8k | GNU 10.3.0 | PACE Phoenix |
| Apple Silicon M2 | 6 | 3.7k | 11.0k | 0.3k | GNU 13.2.0 | N/A |

## Weak scaling

Weak scaling results are obtained by increasing the problem size with the number of processes
so that work per process remains constant.
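
One common way to quantify weak scaling is the ratio of the base-case wall time to the wall time at scale; the sketch below assumes that definition, which the docs do not state explicitly. The function name and the timings are hypothetical and are not measured MFC data.

```python
def weak_scaling_efficiency(t_base: float, t_n: float) -> float:
    """Weak-scaling efficiency: base wall time divided by wall time at scale.

    Work per process is constant, so ideal scaling keeps the wall time
    unchanged and yields an efficiency of 1.0.
    """
    if t_base <= 0 or t_n <= 0:
        raise ValueError("wall times must be positive")
    return t_base / t_n


# Hypothetical wall times per time step in seconds (illustrative only).
eff = weak_scaling_efficiency(t_base=1.00, t_n=1.04)
print(f"{eff:.1%}")
```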

### AMD MI250X GPU
MFC weak scales to 65,536 AMD MI250X GPUs on OLCF Frontier with 96% efficiency.

<img src="../res/weakScaling/frontier.svg" style="height: 50%; width:50%; border-radius: 10pt"/>

### Nvidia V100 GPU
MFC weak scales to 13,824 Nvidia V100 GPUs on OLCF Summit with 97% efficiency.

<img src="../res/weakScaling/summit.svg" style="height: 50%; width:50%; border-radius: 10pt"/>

### IBM Power9 CPU
MFC weak scales to 13,824 Power9 CPU cores on OLCF Summit to within 1% of ideal scaling.

<img src="../res/weakScaling/cpuScaling.svg" style="height: 50%; width:50%; border-radius: 10pt"/>

## Strong scaling

Strong scaling results are obtained by keeping the problem size constant and increasing
the number of processes so that work per process decreases.
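
Strong scaling is often summarized by the speedup relative to a base case and by that speedup divided by the ideal speedup. The sketch below assumes those conventional definitions; the function name and the timings are hypothetical and do not come from MFC measurements.

```python
from typing import Tuple


def strong_scaling(t_base: float, n_base: int, t_n: float, n: int) -> Tuple[float, float]:
    """Return (speedup, efficiency) relative to a base case.

    The problem size is fixed, so ideal scaling reduces the wall time by the
    same factor that the process count grows.
    """
    if min(t_base, t_n) <= 0 or min(n_base, n) <= 0:
        raise ValueError("times and process counts must be positive")
    speedup = t_base / t_n
    ideal_speedup = n / n_base
    return speedup, speedup / ideal_speedup


# Hypothetical timings: 8.0 s on 8 processes, 1.25 s on 64 processes.
speedup, efficiency = strong_scaling(t_base=8.0, n_base=8, t_n=1.25, n=64)
print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.1%}")
```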

### Nvidia V100 GPU

For these tests, the base case utilizes 8 GPUs with one MPI process per GPU. The performance
is analyzed at two problem sizes, 16M and 64M grid points, with the base case using
2M and 8M grid points per process, respectively.
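
The per-process work quoted above follows from dividing the total grid size by the process count. The sketch below reproduces the 8-GPU base case from the text; the larger GPU counts and the helper name `points_per_process` are assumptions chosen for illustration, not the counts used in the actual tests.

```python
def points_per_process(total_points: int, n_procs: int) -> float:
    """Grid points handled by each process for an evenly divided problem."""
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    return total_points / n_procs


M = 1_000_000

for total in (16 * M, 64 * M):
    # 8 GPUs is the base case from the text; the other counts are illustrative.
    for n_gpus in (8, 16, 32, 64):
        per_proc = points_per_process(total, n_gpus) / M
        print(f"{total // M}M points on {n_gpus} GPUs: {per_proc:g}M per process")
```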

#### 16M Grid Points
<img src="../res/strongScaling/strongScaling16.svg" style="width: 50%; border-radius: 10pt"/>

#### 64M Grid Points
<img src="../res/strongScaling/strongScaling64.svg" style="width: 50%; border-radius: 10pt"/>

### IBM Power9 CPU

CPU strong scaling tests are done with problem sizes of 16M, 32M, and 64M grid points, with the
base case using 2M, 4M, and 8M cells per process, respectively.

<img src="../res/strongScaling/cpuStrongScaling.svg" style="width: 50%; border-radius: 10pt"/>
1 change: 1 addition & 0 deletions docs/res/strongScaling/S01.95
@@ -0,0 +1 @@
<!doctype html><html lang="en" class="no-js"><head><meta charset="utf-8"> <!-- begin SEO --><title>Towards exascale multiphase compressible flow simulation via scalable interface capturing-based solvers and GPU acceleration - Anand Radhakrishnan</title><meta property="og:locale" content="en-US"><meta property="og:site_name" content="Anand Radhakrishnan"><meta property="og:title" content="Towards exascale multiphase compressible flow simulation via scalable interface capturing-based solvers and GPU acceleration"><link rel="canonical" href="https://anandrdbz.github.io/https:/meetings.aps.org/Meeting/DFD22/Session/S01.95"><meta property="og:url" content="https://anandrdbz.github.io/https:/meetings.aps.org/Meeting/DFD22/Session/S01.95"><meta property="og:description" content=""><meta property="og:type" content="article"><meta property="article:published_time" content="2022-11-21T00:00:00-08:00"> <script type="application/ld+json"> { "@context" : "http://schema.org", "@type" : "Person", "name" : "Anand", "url" : "https://anandrdbz.github.io", "sameAs" : null } </script> <!-- end SEO --><link href="https://anandrdbz.github.io/feed.xml" type="application/atom+xml" rel="alternate" title="Anand Radhakrishnan Feed"> <!-- http://t.co/dKP3o1e --><meta name="HandheldFriendly" content="True"><meta name="MobileOptimized" content="320"><meta name="viewport" content="width=device-width, initial-scale=1.0"> <script> document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js '; </script> <!-- For all browsers --><link rel="stylesheet" href="https://anandrdbz.github.io/assets/css/main.css"><meta http-equiv="cleartype" content="on"> <!-- start custom head snippets --><link rel="apple-touch-icon" sizes="57x57" href="https://anandrdbz.github.io/images/apple-touch-icon-57x57.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="60x60" href="https://anandrdbz.github.io/images/apple-touch-icon-60x60.png?v=M44lzPylqQ"><link rel="apple-touch-icon" 
sizes="72x72" href="https://anandrdbz.github.io/images/apple-touch-icon-72x72.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="76x76" href="https://anandrdbz.github.io/images/apple-touch-icon-76x76.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="114x114" href="https://anandrdbz.github.io/images/apple-touch-icon-114x114.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="120x120" href="https://anandrdbz.github.io/images/apple-touch-icon-120x120.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="144x144" href="https://anandrdbz.github.io/images/apple-touch-icon-144x144.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="152x152" href="https://anandrdbz.github.io/images/apple-touch-icon-152x152.png?v=M44lzPylqQ"><link rel="apple-touch-icon" sizes="180x180" href="https://anandrdbz.github.io/images/apple-touch-icon-180x180.png?v=M44lzPylqQ"><link rel="icon" type="image/png" href="https://anandrdbz.github.io/images/favicon-32x32.png?v=M44lzPylqQ" sizes="32x32"><link rel="icon" type="image/png" href="https://anandrdbz.github.io/images/android-chrome-192x192.png?v=M44lzPylqQ" sizes="192x192"><link rel="icon" type="image/png" href="https://anandrdbz.github.io/images/favicon-96x96.png?v=M44lzPylqQ" sizes="96x96"><link rel="icon" type="image/png" href="https://anandrdbz.github.io/images/favicon-16x16.png?v=M44lzPylqQ" sizes="16x16"><link rel="manifest" href="https://anandrdbz.github.io/images/manifest.json?v=M44lzPylqQ"><link rel="mask-icon" href="https://anandrdbz.github.io/images/safari-pinned-tab.svg?v=M44lzPylqQ" color="#000000"><link rel="shortcut icon" href="/images/favicon.ico?v=M44lzPylqQ"><meta name="msapplication-TileColor" content="#000000"><meta name="msapplication-TileImage" content="https://anandrdbz.github.io/images/mstile-144x144.png?v=M44lzPylqQ"><meta name="msapplication-config" content="https://anandrdbz.github.io/images/browserconfig.xml?v=M44lzPylqQ"><meta name="theme-color" content="#ffffff"><link rel="stylesheet" 
href="https://anandrdbz.github.io/assets/css/academicons.css"/> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { equationNumbers: { autoNumber: "all" } } }); </script> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ tex2jax: { inlineMath: [ ['$','$'], ["\\(","\\)"] ], processEscapes: true } }); </script> <script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/latest.js?config=TeX-MML-AM_CHTML' async></script> <!-- end custom head snippets --></head><body> <!--[if lt IE 9]><div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</div><![endif]--><div class="masthead"><div class="masthead__inner-wrap"><div class="masthead__menu"><nav id="site-nav" class="greedy-nav"> <button><div class="navicon"></div></button><ul class="visible-links"><li class="masthead__menu-item masthead__menu-item--lg"><a href="https://anandrdbz.github.io/">Anand Radhakrishnan</a></li><li class="masthead__menu-item"><a href="https://anandrdbz.github.io/publications/">Publications</a></li><li class="masthead__menu-item"><a href="https://anandrdbz.github.io/talks/">Conferences</a></li></ul><ul class="hidden-links hidden"></ul></nav></div></div></div><div id="main" role="main"><div class="sidebar sticky"><div itemscope itemtype="http://schema.org/Person"><div class="author__avatar"> <img src="https://anandrdbz.github.io/images/profile_pic.png" class="author__avatar" alt="Anand Radhakrishnan"></div><div class="author__content"><h3 class="author__name">Anand Radhakrishnan</h3><p class="author__bio">PhD Student at Georgia Tech</p></div><div class="author__urls-wrapper"> <button class="btn btn--inverse">Follow</button><ul class="author__urls social-icons"><li><i class="fa fa-fw fa-map-marker" aria-hidden="true"></i> S1347D, Coda Building</li><li><i class="fa fa-fw fa-map-marker" aria-hidden="true"></i> Georgia 
Institute of Technology</li><li><a href="mailto:[email protected]"><i class="fas fa-fw fa-envelope" aria-hidden="true"></i> Email</a></li><li><a href="https://github.com/anandrdbz"><i class="fab fa-fw fa-github" aria-hidden="true"></i> Github</a></li><li><a href="https://scholar.google.com/citations?user=sBOAsEYAAAAJ&hl=en"><i class="fas fa-fw fa-graduation-cap"></i> Google Scholar</a></li></ul></div></div></div><article class="page" itemscope itemtype="http://schema.org/CreativeWork"><meta itemprop="headline" content="Towards exascale multiphase compressible flow simulation via scalable interface capturing-based solvers and GPU acceleration"><meta itemprop="description" content=""><meta itemprop="datePublished" content="November 21, 2022"><div class="page__inner-wrap"><header><h1 class="page__title" itemprop="headline">Towards exascale multiphase compressible flow simulation via scalable interface capturing-based solvers and GPU acceleration</h1><p class="page__date"><strong><i class="fa fa-fw fa-calendar" aria-hidden="true"></i>Date:</strong> <time datetime="2022-11-21T00:00:00-08:00">November 21, 2022</time></p></header><section class="page__content" itemprop="text"></section><footer class="page__meta"></footer><section class="page__share"><h4 class="page__share-title">Share on</h4><a href="https://twitter.com/intent/tweet?text=https://anandrdbz.github.io/https:/meetings.aps.org/Meeting/DFD22/Session/S01.95" class="btn btn--twitter" title="Share on Twitter"><i class="fab fa-twitter" aria-hidden="true"></i><span> Twitter</span></a> <a href="https://www.facebook.com/sharer/sharer.php?u=https://anandrdbz.github.io/https:/meetings.aps.org/Meeting/DFD22/Session/S01.95" class="btn btn--facebook" title="Share on Facebook"><i class="fab fa-facebook" aria-hidden="true"></i><span> Facebook</span></a> <a href="https://www.linkedin.com/shareArticle?mini=true&url=https://anandrdbz.github.io/https:/meetings.aps.org/Meeting/DFD22/Session/S01.95" class="btn btn--linkedin" 
title="Share on LinkedIn"><i class="fab fa-linkedin" aria-hidden="true"></i><span> LinkedIn</span></a></section><nav class="pagination"> <a href="https://anandrdbz.github.io/https:/sc22.supercomputing.org/presentation/?id=rpost122&sess=sess275" class="pagination--pager" title="Scalable GPU Accelerated Simulation of Multiphase Compressible Flow ">Previous</a> <a href="#" class="pagination--pager disabled">Next</a></nav></div></article></div><div class="page__footer"><footer> <!-- start custom footer snippets --> <a href="/sitemap/">Sitemap</a> <!-- end custom footer snippets --><div class="page__footer-follow"><ul class="social-icons"><li><strong>Follow:</strong></li><li><a href="http://github.com/anandrdbz"><i class="fab fa-github" aria-hidden="true"></i> GitHub</a></li><li><a href="https://anandrdbz.github.io/feed.xml"><i class="fa fa-fw fa-rss-square" aria-hidden="true"></i> Feed</a></li></ul></div><div class="page__footer-copyright">&copy; 2023 Anand. Powered by <a href="http://jekyllrb.com" rel="nofollow">Jekyll</a> &amp; <a href="https://github.com/academicpages/academicpages.github.io">AcademicPages</a>, a fork of <a href="https://mademistakes.com/work/minimal-mistakes-jekyll-theme/" rel="nofollow">Minimal Mistakes</a>.</div></footer></div><script src="https://anandrdbz.github.io/assets/js/main.min.js"></script> <script> (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); ga('create', '', 'auto'); ga('send', 'pageview'); </script></body></html>