HyperShell is an elegant, cross-platform, high-throughput computing utility for processing shell commands over a distributed, asynchronous queue. It is a highly scalable workflow automation tool for many-task scenarios.
It is built on Python and tested on Linux, macOS, and Windows.
Several tools offer similar functionality, but not all together in a single tool with the user ergonomics we provide. Novel design elements include, but are not limited to:
- Cross-platform: run on any platform where Python runs. In fact, the server and client can run on different platforms in the same cluster.
- Client-server: workloads do not need to be monolithic. Run the server as a stand-alone service with SQLite or Postgres as a persistent database and dynamically scale clients as needed.
- Staggered launch: at the largest scales (1000s of nodes, 100k+ workers), the launch process can be challenging. Come up gradually to balance the workload.
- Database in-the-loop: run in-memory for quick, ad-hoc workloads. Otherwise, include a database for persistence, recovery when restarting, and search.
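To make the staggered launch idea above concrete, here is a minimal Python sketch of one way to spread client start times over a ramp-up window. It is an illustration only and not HyperShell's implementation: the function name `staggered_delays` and its parameters are hypothetical, and the slot-plus-jitter scheme is an assumption chosen for simplicity.

```python
import random
from typing import List, Optional


def staggered_delays(num_clients: int, window_seconds: float,
                     seed: Optional[int] = None) -> List[float]:
    """Return one start delay (in seconds) per client.

    Hypothetical helper: the window is divided into equal slots, one per
    client, and each client starts at a random point inside its own slot.
    """
    if num_clients <= 0:
        return []
    rng = random.Random(seed)
    step = window_seconds / num_clients
    # Client i is offset by i slots, plus random jitter within its slot,
    # so connections arrive gradually instead of all at once.
    return [i * step + rng.uniform(0, step) for i in range(num_clients)]


if __name__ == "__main__":
    for i, delay in enumerate(staggered_delays(5, 60.0, seed=1)):
        print(f"client {i}: start after {delay:.1f}s")
```

Each client would sleep for its delay before connecting to the server, so the server sees a steady trickle of new connections rather than a burst. The jitter keeps clients from synchronizing on slot boundaries.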
Documentation is available at hypershell.readthedocs.io.
For basic usage information on the command line, use: hs --help. For a more comprehensive usage guide on the command line, you can view the manual page with: man hs.
Contributions are welcome. If you find bugs or have questions, open an Issue here. We've added a Code of Conduct recently, adapted from the Contributor Covenant, version 2.0.
If HyperShell has helped in your research, please consider citing us.
@inproceedings{lentner_2022,
author = {Lentner, Geoffrey and Gorenstein, Lev},
title = {HyperShell v2: Distributed Task Execution for HPC},
year = {2022},
isbn = {9781450391610},
publisher = {Association for Computing Machinery},
url = {https://doi.org/10.1145/3491418.3535138},
doi = {10.1145/3491418.3535138},
booktitle = {Practice and Experience in Advanced Research Computing},
articleno = {80},
numpages = {3},
series = {PEARC '22}
}