Add support for the Metal backend (#48)
This PR adds support for the Metal KernelAbstractions (KA) backend.
Showing 19 changed files with 513 additions and 333 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
name = "Chmy" | ||
uuid = "33a72cf0-4690-46d7-b987-06506c2248b9" | ||
authors = ["Ivan Utkin <[email protected]>, Ludovic Raess <[email protected]>, and contributors"] | ||
-version = "0.1.19" | ||
+version = "0.1.20" | ||
|
||
[deps] | ||
Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" | ||
|
@@ -13,10 +13,12 @@ MacroTools = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09" | |
[weakdeps] | ||
AMDGPU = "21141c5a-9bdb-4563-92ae-f87d6854732e" | ||
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba" | ||
+Metal = "dde4c033-4e86-420c-a63e-0dd931031962" | ||
|
||
[extensions] | ||
ChmyAMDGPUExt = "AMDGPU" | ||
ChmyCUDAExt = "CUDA" | ||
+ChmyMetalExt = "Metal" | ||
|
||
[compat] | ||
AMDGPU = "0.8, 0.9, 1" | ||
|
@@ -25,4 +27,5 @@ CUDA = "5" | |
KernelAbstractions = "0.9" | ||
MPI = "0.20" | ||
MacroTools = "0.5" | ||
+Metal = "1" | ||
julia = "1.9" |
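The hunks above follow Julia's package-extension convention: the trigger package is listed under `[weakdeps]`, mapped to an extension module under `[extensions]`, and bounded under `[compat]`. A minimal sketch of that consistency check follows, using only the `TOML` standard library. The helper `undeclared_triggers` is hypothetical (it is not part of Chmy.jl or Pkg), and it only inspects `[weakdeps]` and `[compat]`.

```julia
using TOML

# Excerpt of the sections touched by this commit (values copied from the diff).
project = TOML.parse("""
[weakdeps]
AMDGPU = "21141c5a-9bdb-4563-92ae-f87d6854732e"
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Metal = "dde4c033-4e86-420c-a63e-0dd931031962"

[extensions]
ChmyAMDGPUExt = "AMDGPU"
ChmyCUDAExt = "CUDA"
ChmyMetalExt = "Metal"

[compat]
AMDGPU = "0.8, 0.9, 1"
CUDA = "5"
Metal = "1"
""")

# Hypothetical helper: report every extension whose trigger package is
# missing from [weakdeps] or from [compat].
function undeclared_triggers(project)
    weak   = keys(get(project, "weakdeps", Dict()))
    compat = keys(get(project, "compat", Dict()))
    missing_entries = String[]
    for (ext, trigger) in project["extensions"]
        # An extension may name one trigger (string) or several (array).
        triggers = trigger isa AbstractString ? [trigger] : trigger
        for t in triggers
            (t in weak && t in compat) || push!(missing_entries, "$ext => $t")
        end
    end
    return sort(missing_entries)
end
```

Calling `undeclared_triggers(project)` on this excerpt returns an empty list, since each of the three extensions has both entries.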
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,16 @@ | ||
# Examples Overview | ||
|
||
-This page provides an overview of [Chmy.jl](https://github.com/PTsolvers/Chmy.jl) examples. These selected examples demonstrate how [Chmy.jl](https://github.com/PTsolvers/Chmy.jl) can be used to solve various numerical problems using architecture-agnostic kernels both on a single-device and in a distributed way. | ||
+This page provides an overview of [Chmy.jl](https://github.com/PTsolvers/Chmy.jl) examples. These selected examples demonstrate how Chmy.jl can be used to solve various numerical problems using architecture-agnostic kernels both on a single-device and in a distributed way. | ||
|
||
## Table of Contents | ||
|
||
-| Example | Description | | ||
+| Example | Description | | ||
|:------------|:------------| | ||
-| [Diffusion 2D](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d.jl) | Solving the 2D diffusion equation on an uniform grid. | | ||
-| [Diffusion 2D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d_mpi.jl) | Solving the 2D diffusion equation on an uniform grid distributedly using MPI. | | ||
-| [Single-Device Performance Optimization](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d_perf.jl) | Revisiting the 2D diffusion problem with focus on performance optimization techniques on a single-device architecture | | ||
-| [Stokes 2D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/stokes_2d_inc_ve_T_mpi.jl) | Solving the 2D Stokes equation with thermal coupling on an uniform grid. | | ||
-| [Stokes 3D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/stokes_3d_inc_ve_T_mpi.jl) | Solving the 3D Stokes equation with thermal coupling on an uniform grid distributedly using MPI. | | ||
-| [2D Grid Visualization](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/grids_2d.jl) | Visualization of a 2D `StructuredGrid`. | | ||
-| [3D Grid Visualization](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/grids_3d.jl) | Visualization of a 3D `StructuredGrid` | | ||
+| [Diffusion 2D](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d.jl) | Solving the 2D diffusion equation on a uniform grid. | | ||
+| [Diffusion 2D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d_mpi.jl) | Solving the 2D diffusion equation on a uniform grid and distributed parallelisation using MPI. | | ||
+| [Single-Device Performance Optimisation](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_2d_perf.jl) | Revisiting the 2D diffusion problem with focus on performance optimisation techniques on a single-device architecture. | | ||
+| [Stokes 2D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/stokes_2d_inc_ve_T_mpi.jl) | Solving the 2D Stokes equation with thermal coupling on a uniform grid. | | ||
+| [Stokes 3D with MPI](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/stokes_3d_inc_ve_T_mpi.jl) | Solving the 3D Stokes equation with thermal coupling on a uniform grid and distributed parallelisation using MPI. | | ||
+| [Diffusion 1D with Metal](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/diffusion_1d_mtl.jl) | Solving the 1D diffusion equation using the Metal backend and single precision (`Float32`) on a uniform grid. | | ||
+| [2D Grid Visualization](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/grids_2d.jl) | Visualization of a 2D `StructuredGrid`. | | ||
+| [3D Grid Visualization](https://github.com/PTsolvers/Chmy.jl/blob/main/examples/grids_3d.jl) | Visualization of a 3D `StructuredGrid`. | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
using Chmy, Chmy.Architectures, Chmy.Grids, Chmy.Fields, Chmy.BoundaryConditions, Chmy.GridOperators, Chmy.KernelLaunch | ||
using KernelAbstractions | ||
using Printf | ||
using CairoMakie | ||
|
||
using Metal | ||
|
||
@kernel inbounds = true function compute_q!(q, C, χ, g::StructuredGrid, O) | ||
    I = @index(Global, NTuple) | ||
    I = I + O | ||
    q.x[I...] = -χ * ∂x(C, g, I...) | ||
end | ||
|
||
@kernel inbounds = true function update_C!(C, q, Δt, g::StructuredGrid, O) | ||
    I = @index(Global, NTuple) | ||
    I = I + O | ||
    C[I...] -= Δt * divg(q, g, I...) | ||
end | ||
|
||
@views function main(backend=CPU(); nx=(32, )) | ||
    arch = Arch(backend) | ||
    # geometry | ||
    grid = UniformGrid(arch; origin=(-1f0, ), extent=(2f0, ), dims=nx) | ||
    launch = Launcher(arch, grid; outer_width=(4, )) | ||
    # physics | ||
    χ = 1.0f0 | ||
    # numerics | ||
    Δt = minimum(spacing(grid))^2 / χ / ndims(grid) / 2.1f0 | ||
    nt = 100 | ||
    # allocate fields | ||
    C = Field(backend, grid, Center()) | ||
    q = VectorField(backend, grid) | ||
    # initial conditions | ||
    set!(C, rand(Float32, size(C))) | ||
    bc!(arch, grid, C => Neumann()) | ||
    # visualisation | ||
    fig = Figure(; size=(400, 320)) | ||
    ax = Axis(fig[1, 1]; xlabel="x", ylabel="y", title="it = 0") | ||
    plt = lines!(ax, centers(grid)..., interior(C) |> Array) | ||
    display(fig) | ||
    # action | ||
    for it in 1:nt | ||
        @printf("it = %d/%d \n", it, nt) | ||
        launch(arch, grid, compute_q! => (q, C, χ, grid)) | ||
        launch(arch, grid, update_C! => (C, q, Δt, grid); bc=batch(grid, C => Neumann())) | ||
    end | ||
    KernelAbstractions.synchronize(backend) | ||
    plt[2] = interior(C) |> Array | ||
    ax.title = "it = $nt" | ||
    display(fig) | ||
    return | ||
end | ||
|
||
n = 64 | ||
|
||
main(MetalBackend(); nx=(n, ) .- 2) |
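For comparison, the same explicit flux/update scheme can be written with plain arrays on the CPU. This is a sketch under stated assumptions, not the Chmy.jl implementation: `diffuse_1d` and its keyword names are illustrative, the two boundary faces carry zero flux (assuming the default `Neumann()` condition is homogeneous), and the cell size is taken as the extent divided by the number of cells.

```julia
# Plain-array sketch of the two kernels above; names are illustrative, not Chmy API.
function diffuse_1d(C0::Vector{Float32}; χ=1.0f0, lx=2.0f0, nt=100)
    C  = copy(C0)
    nx = length(C)
    Δx = lx / nx
    # Same stability-limited time step as in the example, with ndims = 1.
    Δt = Δx^2 / χ / 1 / 2.1f0
    # Fluxes live on the nx + 1 cell faces; the boundary faces stay zero (no flux).
    q  = zeros(Float32, nx + 1)
    for _ in 1:nt
        for i in 2:nx                       # counterpart of compute_q!
            q[i] = -χ * (C[i] - C[i-1]) / Δx
        end
        for i in 1:nx                       # counterpart of update_C!
            C[i] -= Δt * (q[i+1] - q[i]) / Δx
        end
    end
    return C
end

C0 = rand(Float32, 62)   # 62 cells, matching nx = (64,) .- 2 in the example
C  = diffuse_1d(C0)
```

With zero boundary flux the sum of `C` is conserved up to rounding, and because `Δt * χ / Δx^2 = 1/2.1` is below 1/2, the solution stays within the bounds of the initial data.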
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
module ChmyMetalExt | ||
|
||
using Metal, KernelAbstractions | ||
|
||
import Chmy.Architectures: heuristic_groupsize, set_device!, get_device, pointertype | ||
|
||
Base.unsafe_wrap(::MetalBackend, ptr::Metal.MtlPtr, dims) = unsafe_wrap(MtlArray, ptr, dims) | ||
|
||
pointertype(::MetalBackend, T::DataType) = Metal.MtlPtr{T} | ||
|
||
set_device!(dev::Metal.MTL.MTLDeviceInstance) = Metal.device!(dev) | ||
|
||
get_device(::MetalBackend, id::Integer) = Metal.MTL.MTLDevice(id) | ||
|
||
heuristic_groupsize(::MetalBackend, ::Val{1}) = (256,) | ||
heuristic_groupsize(::MetalBackend, ::Val{2}) = (32, 8) | ||
heuristic_groupsize(::MetalBackend, ::Val{3}) = (32, 8, 1) | ||
|
||
end |
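The `heuristic_groupsize` methods above select a workgroup shape by dispatching on the dimensionality wrapped in `Val`. A minimal sketch of that pattern, runnable without Metal.jl: `DemoBackend`, `pick_groupsize`, and `ngroups` are hypothetical stand-ins introduced here for illustration, and how Chmy.jl actually calls `heuristic_groupsize` is not shown in this diff.

```julia
# Stand-in backend type so the sketch runs without Metal.jl (not part of Chmy or Metal).
struct DemoBackend end

# Same tuples as the MetalBackend methods in the extension.
heuristic_groupsize(::DemoBackend, ::Val{1}) = (256,)
heuristic_groupsize(::DemoBackend, ::Val{2}) = (32, 8)
heuristic_groupsize(::DemoBackend, ::Val{3}) = (32, 8, 1)

# Hypothetical caller: derive the group size from the dimensionality of the work size.
pick_groupsize(backend, worksize::NTuple{N,Int}) where {N} =
    heuristic_groupsize(backend, Val(N))

# Number of workgroups needed per dimension (rounded up).
ngroups(worksize, groupsize) = cld.(worksize, groupsize)

gs = pick_groupsize(DemoBackend(), (64, 64))
ng = ngroups((64, 64), gs)
```

All three shapes contain 256 work-items per group; only the way they are spread across dimensions differs.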
e4f9e84
@JuliaRegistrator register
Registration pull request created: JuliaRegistries/General/116651
Tip: Release Notes
Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.
To add them here just re-invoke and the PR will be updated.
Tagging
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via: