run the code on multi-node platform #231

Open
Dongxueyang opened this issue Jul 6, 2020 · 7 comments
Comments

@Dongxueyang

Hi @stoiver:

I want to run a large simulation. To speed up the computation, I would like to run the program on a supercomputing platform across multiple nodes. Does the code support multi-node computing?

@stoiver
Member

stoiver commented Jul 6, 2020

@Dongxueyang yes, anuga can run on multi-node supercomputers. Parallelisation is implemented via MPI. The Python 2 version has been run extensively in parallel on the NCI (raijin); I haven't yet tried it on gadi. It uses the pypar MPI Python wrapper.

We are just moving over to Python 3. We seem to have a working version that uses mpi4py as the MPI Python wrapper. It would be great if you could test the Python 3 version. I will push it to the GA git repository (branch anuga_py3).

@Dongxueyang
Author

@stoiver That is great, thank you so much. I can download the anuga_py3 version and try it on a multi-node supercomputer (with mpi4py). Is the anuga_py3 branch available now? Where can I download it for testing?

@stoiver
Member

stoiver commented Jul 12, 2020

@Dongxueyang You can use the anuga_py3 branch of the anuga_core repository. It might be best to clone a new copy of anuga_core on that branch, i.e.

git clone -b anuga_py3 https://github.com/GeoscienceAustralia/anuga_core.git

You can get a hint about which Python libraries to install by looking at the shell scripts in the tools directory of the downloaded repository.

@Dongxueyang
Author

Hi @stoiver. I want to run a simulation on a multi-node platform, using two nodes and 24 cores (12 cores per node).
I use this command:
mpirun -machinefile machinefile -np 24 python test.py
and this machinefile:

node1_id
node2_id

Is the command right?
If it is wrong, how should I run the simulation on two nodes (24 cores)?

Thanks a lot. I look forward to your reply.
Dong
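
For reference: a machinefile that lists only host names does not say how many processes each node should take. Open MPI's hostfile format accepts a `slots=N` entry per host, and MPICH's Hydra launcher uses `host:N`. Below is a minimal Python sketch that prints such a file for the two-node, 12-cores-per-node case in the question. The function name `hostfile_lines` is hypothetical, and `node1_id` / `node2_id` are the placeholders from the question, not real host names.

```python
def hostfile_lines(hosts, cores_per_node, flavour="openmpi"):
    """Return hostfile lines giving each host a fixed number of process slots."""
    if flavour == "openmpi":
        # Open MPI hostfile syntax: "<host> slots=<n>"
        return [f"{host} slots={cores_per_node}" for host in hosts]
    if flavour == "mpich":
        # MPICH (Hydra) hostfile syntax: "<host>:<n>"
        return [f"{host}:{cores_per_node}" for host in hosts]
    raise ValueError(f"unknown MPI flavour: {flavour!r}")


if __name__ == "__main__":
    # Placeholder host names taken from the question above.
    for line in hostfile_lines(["node1_id", "node2_id"], 12):
        print(line)
```

Redirecting the output into `machinefile` and keeping `-np 24` on the mpirun command line should then ask for 12 processes on each node, provided the cluster's launcher and scheduler allow it.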

@Dongxueyang
Author

@stoiver
Did you see the question above, and could you give me some advice? I tried to run example/simple_examples/channel3_parallel.py on two nodes (48 cores), but I cannot get the simulation to run.

Dong

@stoiver
Member

stoiver commented Jul 31, 2020

@Dongxueyang you need to set up MPI to run on your 24 cores. This will depend on whether you are using openmpi or mpich. Do you have a system administrator for your system? You should be able to set up your mpirun command to run by default on your two nodes. I recall from working on a cluster a few years ago that you need to ensure you can log into the two nodes automatically using ssh keys. But, as suggested, get help from your system admin.
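
One way to check the MPI setup independently of anuga is a small mpi4py script that reports how many ranks landed on each host. This is a minimal sketch, assuming mpi4py is installed in the same Python environment on both nodes; the helper name `ranks_per_host` and the file name `check_placement.py` are hypothetical.

```python
import socket
from collections import Counter


def ranks_per_host(hostnames):
    """Count how many MPI ranks reported each host name."""
    return dict(Counter(hostnames))


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not importable in this Python environment")
        return
    comm = MPI.COMM_WORLD
    # Every rank sends its host name to rank 0.
    names = comm.gather(socket.gethostname(), root=0)
    if comm.Get_rank() == 0:
        print(f"{comm.Get_size()} ranks in total")
        for host, count in sorted(ranks_per_host(names).items()):
            print(f"{host}: {count} ranks")


if __name__ == "__main__":
    main()
```

Launched with the same mpirun options as the real job (for example `mpirun -machinefile machinefile -np 24 python check_placement.py`), it should list both nodes. If only one host appears, or the job hangs at start-up, the problem is likely in the MPI or ssh configuration rather than in anuga.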

@Dongxueyang
Author

@stoiver OK, thanks. I use openmpi on the cluster. I will ask the system admin first. Thanks a lot.
Also, do I need to install the same openmpi on every node?
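
The thread does not answer this last question. In most cluster setups, every node runs the same MPI installation, often from a shared filesystem, and mixing different Open MPI releases within one job is generally not expected to work. The sketch below, assuming mpi4py is available on all nodes, gathers the MPI library version string seen by each rank so that mismatches show up; `hosts_by_version` is a hypothetical helper name.

```python
import socket


def hosts_by_version(reports):
    """Group host names by the MPI library version string they reported."""
    grouped = {}
    for host, version in reports:
        grouped.setdefault(version, set()).add(host)
    return grouped


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not importable in this Python environment")
        return
    comm = MPI.COMM_WORLD
    # Keep only the first line of the (possibly multi-line) version string.
    version = MPI.Get_library_version().strip().split("\n")[0]
    reports = comm.gather((socket.gethostname(), version), root=0)
    if comm.Get_rank() == 0:
        grouped = hosts_by_version(reports)
        for ver, hosts in grouped.items():
            print(f"{ver}: {', '.join(sorted(hosts))}")
        if len(grouped) > 1:
            print("Warning: ranks report different MPI library versions")


if __name__ == "__main__":
    main()
```

If the versions differ badly enough, the job may not start at all and this check cannot run. In that case, comparing the output of `mpirun --version` on each node is the simpler first step.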
