Exposing a PerspectiveProxyServer #19

Closed
wants to merge 20 commits into from
3 changes: 2 additions & 1 deletion .bumpversion.cfg
@@ -1,7 +1,8 @@
[bumpversion]
current_version = 0.2.1
current_version = 0.2.2
commit = True
tag = False
commit_args = -s

[bumpversion:file:pyproject.toml]
search = version = "{current_version}"
1 change: 0 additions & 1 deletion .github/workflows/build.yml
@@ -7,7 +7,6 @@ on:
    tags:
      - v*
    paths-ignore:
      - docs/
      - LICENSE
      - README.md
  pull_request:
29 changes: 29 additions & 0 deletions .github/workflows/wiki.yml
@@ -0,0 +1,29 @@
name: Publish Docs

on:
  push:
    branches:
      - main
    paths:
      - "docs/**"
      - "README.md"
  workflow_dispatch:

concurrency:
  group: docs
  cancel-in-progress: true

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Upload Documentation to Wiki
        uses: Andrew-Chen-Wang/github-wiki-action@v4
        with:
          path: docs/wiki
26 changes: 15 additions & 11 deletions Makefile
@@ -6,9 +6,12 @@
build: ## build python/javascript
	python -m build .

requirements: ## install prerequisite python build requirements
	python -m pip install --upgrade pip toml
	python -m pip install `python -c 'import toml; c = toml.load("pyproject.toml"); print("\n".join(c["build-system"]["requires"]))'`
	python -m pip install `python -c 'import toml; c = toml.load("pyproject.toml"); print(" ".join(c["project"]["optional-dependencies"]["develop"]))'`

develop: ## install to site-packages in editable mode
	python -m pip install --upgrade build pip setuptools toml twine wheel
	pip install `python -c 'import toml; c = toml.load("pyproject.toml"); print(" ".join(c["project"]["optional-dependencies"]["develop"]))'`
	cd js; yarn
	python -m pip install -e .[develop]

@@ -20,37 +23,38 @@ install: ## install to site-packages
###########
.PHONY: testpy testjs test tests

testpy: ## Clean and Make unit tests
testpy: ## run the python unit tests
	python -m pytest -v raydar/tests --junitxml=junit.xml --cov=raydar --cov-report=xml:.coverage.xml --cov-branch --cov-fail-under=1 --cov-report term-missing

testjs: ## Clean and Make js tests
testjs: ## run the javascript unit tests
	cd js; yarn test

test: tests
tests: testpy testjs ## run the tests
tests: testpy testjs ## run all the unit tests

###########
# Linting #
###########
.PHONY: lintpy lintjs lint fixpy fixjs fix format

lintpy: ## Black/flake8 python
lintpy: ## lint python with isort and ruff
	python -m isort raydar setup.py --check
	python -m ruff check raydar setup.py
	python -m ruff format --check raydar setup.py

lintjs: ## ESlint javascript
lintjs: ## lint javascript with eslint
	cd js; yarn lint

lint: lintpy lintjs ## run linter
lint: lintpy lintjs ## run all linters

fixpy: ## Black python
fixpy: ## autoformat python code with isort and ruff
	python -m isort raydar setup.py
	python -m ruff format raydar setup.py

fixjs: ## ESlint Autofix JS
fixjs: ## autoformat javascript code with eslint
	cd js; yarn fix

fix: fixpy fixjs ## run black/tslint fix
fix: fixpy fixjs ## run all autofixers
format: fix

#################
158 changes: 6 additions & 152 deletions README.md
@@ -10,165 +10,19 @@

A [perspective](https://perspective.finos.org/) powered, user editable ray dashboard via ray serve.

[![Build Status](https://github.com/Point72/raydar/actions/workflows/build.yml/badge.svg)](https://github.com/Point72/raydar/actions/workflows/build.yml)
[![Build Status](https://github.com/Point72/raydar/actions/workflows/build.yml/badge.svg?branch=main&event=push)](https://github.com/Point72/raydar/actions/workflows/build.yml)
[![PyPI Version](https://img.shields.io/pypi/v/raydar.svg)](https://pypi.python.org/pypi/raydar)
[![License](https://img.shields.io/pypi/l/raydar.svg)](https://github.com/Point72/raydar/blob/main/LICENSE)
[![Python Versions](https://img.shields.io/badge/python-3.8_%7C_3.9_%7C_3.10_%7C_3.11-blue)](https://github.com/Point72/raydar/blob/main/pyproject.toml)

<br/>

[![Python Versions](https://img.shields.io/badge/python-3.8_%7C_3.9_%7C_3.10_%7C_3.11_%7C_3.12-blue)](https://github.com/Point72/raydar/blob/main/pyproject.toml)

## Features

The `raydar` module provides an actor which can process collections of ray object references on your behalf, and can serve a [perspective](https://github.com/finos/perspective) dashboard in which to visualize that data.

```python
from raydar import RayTaskTracker
task_tracker = RayTaskTracker()
```

Passing collections of object references to this actor's `process` method causes those references to be tracked in an internal polars dataframe, as they finish running.

```python
import ray


@ray.remote
def example_remote_function():
    import time
    import random

    time.sleep(1)
    if random.randint(1, 100) > 90:
        raise Exception("This task should sometimes fail!")
    return True


refs = [example_remote_function.remote() for _ in range(100)]
task_tracker.process(refs)
```

This internal dataframe can be accessed via the `.get_df()` method.

`raydar` provides an interface to create and interact with [perspective](https://github.com/finos/perspective) tables, as well as a UI served through [ray serve](https://docs.ray.io/en/latest/serve/index.html). It comes with a variety of ray integrations, including a detailed and scalable task tracker which scales far beyond the ray default task tracking view.

| task_id | user_defined_metadata | attempt_number | name | ... | start_time_ms | end_time_ms | task_log_info | error_message |
| :--- | :--- | :--- | :--- | :-- | :--- | :--- | :--- | :--- |
| `str` | `f32` | `i64` | `str` | | `datetime[ms,America/New_York]` | `datetime[ms,America/New_York]` | `struct[6]` | `str` |
| | | | | | | | | |
| 16310a0f0a... | `null` | 0 | `example_remote_function` | ... | 2024-01-29 07:17:09.340 EST | 2024-01-29 07:17:12.115 EST | `{"/tmp/ray/session_2024-01-29_07...` | `null` |
| c2668a65bd... | `null` | 0 | `example_remote_function` | ... | 2024-01-29 07:17:09.341 EST | 2024-01-29 07:17:12.107 EST | `{"/tmp/ray/session_2024-01-29_07...` | `null` |
| 32d950ec0c... | `null` | 0 | `example_remote_function` | ... | 2024-01-29 07:17:09.342 EST | 2024-01-29 07:17:12.115 EST | `{"/tmp/ray/session_2024-01-29_07...` | `null` |
| e0dc174c83... | `null` | 0 | `example_remote_function` | ... | 2024-01-29 07:17:09.343 EST | 2024-01-29 07:17:12.115 EST | `{"/tmp/ray/session_2024-01-29_07...` | `null` |
| f4402ec78d... | `null` | 0 | `example_remote_function` | ... | 2024-01-29 07:17:09.343 EST | 2024-01-29 07:17:12.115 EST | `{"/tmp/ray/session_2024-01-29_07...` | `null` |

Additionally, setting the `enable_perspective_dashboard` flag to `True` in the `RayTaskTracker`'s construction serves a perspective dashboard with live views of your completed references.

```python
task_tracker = RayTaskTracker(enable_perspective_dashboard=True)
```

![Example](docs/img/example_perspective_dashboard.gif)

## Create/Store Custom Views
From the developer console, save your workspace layout locally.

```javascript
let workspace = document.getElementById('perspective-workspace');

// Save the current layout
workspace.save().then(config => {
    // Convert the configuration object to a JSON string
    let json = JSON.stringify(config);

    // Create a Blob object from the JSON string
    let blob = new Blob([json], {type: "application/json"});

    // Create a download link
    let link = document.createElement('a');
    link.href = URL.createObjectURL(blob);
    link.download = 'workspace.json';

    // Append the link to the document body and click it to start the download
    document.body.appendChild(link);
    link.click();
    document.body.removeChild(link);
});
```

Then, move this json file to `js/src/layouts/default.json`.

![Example](docs/img/example_perspective_dashboard_layouts.gif)
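
The "move this json file" step can also be scripted. The following is only a sketch: `install_layout` is a hypothetical helper, not part of `raydar`, and it assumes the saved layout is a JSON object, as produced by the `JSON.stringify(config)` call above.

```python
import json
import shutil
from pathlib import Path


def install_layout(downloaded: Path, target: Path) -> Path:
    """Validate a saved workspace layout and copy it into place (hypothetical helper)."""
    # Parse first, so a truncated or hand-edited download fails loudly
    # instead of being copied into the source tree.
    with downloaded.open(encoding="utf-8") as fh:
        layout = json.load(fh)
    if not isinstance(layout, dict):
        raise ValueError("expected the saved layout to be a JSON object")

    # Create js/src/layouts/ (or whatever the target's parent is) if needed.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(downloaded, target)
    return target
```

For example, `install_layout(Path("workspace.json"), Path("js/src/layouts/default.json"))` would copy a layout downloaded into the current directory; the download location is an assumption and depends on the browser.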

## Expose Ray GCS Information
The data available to you includes much of what Ray's GCS tracks, and also allows for user defined metadata per task.

Specifically, tracked fields include:
* `task_id`
* `user_defined_metadata`
* `attempt_number`
* `name`
* `state`
* `job_id`
* `actor_id`
* `type`
* `func_or_class_name`
* `parent_task_id`
* `node_id`
* `worker_id`
* `error_type`
* `language`
* `required_resources`
* `runtime_env_info`
* `placement_group_id`
* `events`
* `profiling_data`
* `creation_time_ms`
* `start_time_ms`
* `end_time_ms`
* `task_log_info`
* `error_message`

![Example](docs/img/example_task_metadata.gif)

## Custom Sources / Update Logic

The proxy server held by the `RayTaskTracker` is exposed via the `.proxy_server()` method, meaning we can create new tables as follows:


```python
task_tracker = RayTaskTracker(enable_perspective_dashboard=True)
proxy_server = task_tracker.proxy_server()
proxy_server.remote(
    "new",
    "metrics_table",
    {
        "node_id": "str",
        "metric_name": "str",
        "value": "float",
        "timestamp": "datetime",
    },
)
```
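
Since the schema above is a plain mapping of column names to type names, rows can be checked against it locally before they are sent. The sketch below is an illustration only: `check_row` and `PYTHON_TYPES` are hypothetical names, the mapping from type names to Python types is an assumption, and nothing here claims that `raydar` or perspective performs this check.

```python
from datetime import datetime
from typing import Any, Dict, List

# Assumption: how the schema's type names map onto Python types.
PYTHON_TYPES = {"str": str, "float": float, "datetime": datetime}

# Same columns as the "metrics_table" schema above.
METRICS_SCHEMA = {
    "node_id": "str",
    "metric_name": "str",
    "value": "float",
    "timestamp": "datetime",
}


def check_row(schema: Dict[str, str], row: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the row fits the schema."""
    problems = []
    for column, type_name in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], PYTHON_TYPES[type_name]):
            found = type(row[column]).__name__
            problems.append(f"{column}: expected {type_name}, got {found}")
    # Columns the table does not declare are reported as well.
    problems.extend(f"unexpected column: {c}" for c in row if c not in schema)
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything wrong with a row in one pass.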

### Example: Live Per-Node Training Loss Metrics

Suppose a user then updates this table with data from, for example, a PyTorch model training loop:

```python
import time

import ray


def my_model_training_loop():
    for epoch in range(num_epochs):
        # ... my training code here ...

        data = dict(
            node_id=ray.get_runtime_context().get_node_id(),
            metric_name="loss",
            value=loss.item(),
            timestamp=time.time(),
        )
        proxy_server.remote("update", "metrics_table", [data])
```

Then they can expose a live view of per-node loss metrics across the model training process:

![Example](docs/img/example_custom_metrics.gif)
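
Because the `"update"` call in the training loop takes a list of rows, several rows could be collected and sent together rather than one call per row. The following is a minimal sketch of that idea, not something the README describes: `MetricBuffer` is a hypothetical helper, and the `send` callable stands in for whatever actually forwards the rows.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class MetricBuffer:
    """Collect rows and hand them to `send` in batches (hypothetical helper)."""

    def __init__(self, send: Callable[[List[Row]], None], batch_size: int = 10) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send = send
        self._batch_size = batch_size
        self._rows: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row, flushing once the batch is full."""
        self._rows.append(row)
        if len(self._rows) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send any queued rows; a no-op when nothing is queued."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._send(batch)
```

In the training loop above, this might be wired up as `MetricBuffer(lambda rows: proxy_server.remote("update", "metrics_table", rows))`, with a final `flush()` after the last epoch so the remaining rows are not lost.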
[More information is available in our wiki](https://github.com/Point72/raydar/wiki)

## Installation

`raydar` can be installed via [pip](https://pip.pypa.io) or [conda](https://docs.conda.io/en/latest/), the two primary package managers for the Python ecosystem.

To install `raydar` via **pip**, run this command in your terminal:
@@ -184,5 +38,5 @@ conda install raydar -c conda-forge
```

## License
This software is licensed under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.

This software is licensed under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.
1 change: 1 addition & 0 deletions docs/wiki/API-Reference.md
@@ -0,0 +1 @@
Coming soon!