init
kazcw committed Oct 20, 2021
0 parents commit 153540a
Showing 6 changed files with 659 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
/target
241 changes: 241 additions & 0 deletions Cargo.lock

14 changes: 14 additions & 0 deletions Cargo.toml
@@ -0,0 +1,14 @@
[package]
name = "jerbs"
version = "0.1.0"
edition = "2018"
authors = ["Kaz Wesley <[email protected]>"]
description = "Command-line work-stealing scheduler."
repository = "https://github.com/kazcw/jerbs"
license = "GPL-3.0"
categories = ["command-line-utilities"]
exclude = [".gitignore"]

[dependencies]
clap = "2.33"
rusqlite = "0.26"
59 changes: 59 additions & 0 deletions README.md
@@ -0,0 +1,59 @@
# jerbs

Command-line work-stealing scheduler.

## Operation

Create a job database:
```
$ jerbs work.db new
```

Define a job and enqueue some repetitions:
```
$ jerbs work.db new-job --count 17 <<< "info for thing to do 17 times"
1
```
The output is the job id, which you can use to edit the job later.

See what's scheduled:
```
$ jerbs work.db list-jobs -v
1 17 "info for thing to do 17 times"
```
(Note: do not use verbose output (`-v`) for scripting. It is intended to be
human-readable and the format is unstable.)

Run a worker:
```
$ while read -r JOB < <(jerbs work.db take $$); do echo "$JOB"; done
```
Now start some more!

## Typical Usage

I made this so I could have a tmux with a worker process in each pane, all
taking jobs from the same queue. The worker processes run a shell script that
uses this utility to pick the next job.
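
As a rough sketch, such a worker script might look like the following. The names `JERBS`, `handle_job`, and `run_worker` are made up for illustration and are not part of `jerbs`. The loop assumes that `take` prints one job payload per call and prints nothing once the queue is empty, and that payloads fit on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical worker script; JERBS, handle_job and run_worker are
# illustrative names, not part of jerbs itself.
JERBS=${JERBS:-jerbs}

# Replace the body with whatever a job means for you.
handle_job() {
    echo "worker $$ got: $1"
}

# Take jobs from the database at $1 until `take` stops producing output.
run_worker() {
    local db=$1 job
    # Process substitution keeps `read` in the current shell, so $job is
    # still set inside the loop body (a plain pipe into `read` would set
    # it in a subshell under bash).
    while read -r job < <("$JERBS" "$db" take "$$"); do
        handle_job "$job"
    done
}
```

Ending the script with `run_worker work.db` and starting it once per tmux pane gives each pane its own worker, identified by its PID.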

A job's payload is a blob of data. What's in the blob is up to you. If a job
needs multiple parameters, the blobs could be filenames indicating where to
find the job data, or you could pack the data directly into the blob using a
delimiter-based format or JSON (parsed with `jq`, for example).
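
For example, a delimiter-based payload could be packed and unpacked with a pair of small bash helpers. This is only a sketch: `pack_job` and `unpack_job` are hypothetical names, the field values are invented, and the approach assumes no field contains a tab or newline and no field is empty (bash treats tab as IFS whitespace, so empty fields would collapse).

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of jerbs.

# Join all arguments into one tab-delimited line.
pack_job() {
    local IFS=$'\t'
    printf '%s\n' "$*"
}

# Split a blob ($1) into the variables named by the remaining arguments.
unpack_job() {
    local _blob=$1
    shift
    IFS=$'\t' read -r "$@" <<< "$_blob"
}

# Example values, made up for illustration.
blob=$(pack_job input.csv 0.5 out/)
unpack_job "$blob" infile threshold outdir
echo "$infile $threshold $outdir"
```

The packed blob can be fed to `new-job` on standard input, and a worker can call `unpack_job` on whatever `take` returns.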

Worker IDs can be any UTF-8 string. If your worker is a bash script, you can
pass `$$` to use your worker's PID.

Because the data blob for your task may contain characters that are subject to
string interpolation hazards, any command that requires a blob will read it
from standard input by default. If your blobs are shell-safe, you can instead
use `--data` to include your blob in the arguments.

## Comparison to alternatives

Other work-stealing schedulers (like GNU Parallel) are frameworks; they own the
worker processes, so you can only configure workers through the framework.
`jerbs` inverts this paradigm: `jerbs` is a utility to be used from your worker
script. With `jerbs` you can easily assign unique resources to the workers, pin
workers to CPUs/NUMA nodes, or dynamically vary the number of simultaneous
jobs. At last, the workers control the means of production.