Add index.md and CONTRIBUTING.md to docs folder (logicalclocks#55)
robzor92 authored Jan 28, 2022
1 parent 78af96a commit 794621a
Showing 2 changed files with 309 additions and 0 deletions.
215 changes: 215 additions & 0 deletions docs/CONTRIBUTING.md
@@ -0,0 +1,215 @@
## Python development setup
---

- Fork and clone the repository

- Create a new Python environment with your favourite environment manager, e.g. virtualenv or conda

- Install the repository in editable mode with development dependencies:

```bash
cd python
pip install -e ".[dev]"
```

- Install [pre-commit](https://pre-commit.com/) and then activate its hooks. pre-commit is a framework for managing and maintaining multi-language pre-commit hooks. The Model Registry uses pre-commit to enforce code style and formatting through [black](https://github.com/psf/black) and [flake8](https://gitlab.com/pycqa/flake8). Run the following commands from the repository root:

```bash
cd python
pip install --user pre-commit
pre-commit install
```

Afterwards, pre-commit will run whenever you commit.

- To run formatting and code-style separately, you can configure your IDE, such as VSCode, to use black and flake8, or run them via the command line:

```bash
cd python
flake8 hsml
black hsml
```

### Python documentation

We follow a few best practices for writing the Python documentation:

1. Use the Google docstring style:

```python
"""[One Line Summary]

[Extended Summary]

[!!! example
    import xyz
]

# Arguments
    arg1: Type[, optional]. Description[, defaults to `default`]
    arg2: Type[, optional]. Description[, defaults to `default`]
# Returns
    Type. Description.
# Raises
    Exception. Description.
"""
```

If Python 3 type annotations are used, they are inserted automatically.


2. Model registry entity engine methods (e.g. ModelEngine) only require a single-line docstring.
3. REST API implementations (e.g. ModelApi) should be fully documented with docstrings without defaults.
4. Public API, such as metadata objects, should be fully documented with defaults.
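
As an illustration of the docstring template above, a fully documented public function might look like the following sketch. The function `scale_metric` and its parameters are invented for this example and are not part of HSML.

```python
def scale_metric(value: float, factor: float = 100.0) -> float:
    """Scale a metric value by a constant factor.

    Useful for turning a fraction into a percentage before reporting it.

    !!! example
        scale_metric(0.94)

    # Arguments
        value: float. The raw metric value.
        factor: float, optional. Multiplier applied to `value`, defaults to `100.0`.
    # Returns
        float. The scaled metric value.
    # Raises
        ValueError. If `factor` is zero.
    """
    if factor == 0:
        raise ValueError("factor must be non-zero")
    return value * factor
```

Because this hypothetical function is public API, the default of `factor` is spelled out in the `# Arguments` section, in line with rule 4.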

#### Setup and Build Documentation

We use `mkdocs` together with `mike` ([for versioning](https://github.com/jimporter/mike/)) to build the documentation, and a plugin called `keras-autodoc` to auto-generate Python API documentation from docstrings.

**Background about `mike`:**
`mike` builds the documentation and commits it as a new directory to the gh-pages branch. Each directory corresponds to one version of the documentation. Additionally, `mike` maintains a JSON file in the root of gh-pages that maps versions and aliases to the available directories. With aliases you can define extra names such as `dev` or `latest` to indicate unstable and stable releases, respectively.
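
To make the version/alias mapping more concrete, here is a minimal sketch of how such a mapping can be read. The file name `versions.json` and the entry layout (a list of objects with `version`, `title`, and `aliases`) are assumptions about `mike`'s output rather than something stated in this guide, and the sample values are illustrative only. The helper `resolve` is hypothetical.

```python
import json
from typing import List, Optional

# Illustrative content only; the layout is an assumption about mike's mapping file.
SAMPLE_VERSIONS_JSON = """
[
  {"version": "2.2.0-SNAPSHOT", "title": "2.2.0-SNAPSHOT", "aliases": ["dev"]},
  {"version": "2.1.5", "title": "2.1.5", "aliases": ["latest"]},
  {"version": "2.1.4", "title": "2.1.4", "aliases": []}
]
"""


def resolve(entries: List[dict], name: str) -> Optional[str]:
    """Return the version directory that a version or alias name points to."""
    for entry in entries:
        if name == entry["version"] or name in entry["aliases"]:
            return entry["version"]
    return None


entries = json.loads(SAMPLE_VERSIONS_JSON)
print(resolve(entries, "latest"))
```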

1. Currently we are using our own version of `keras-autodoc`:

```bash
pip install git+https://github.com/logicalclocks/keras-autodoc@split-tags-properties
```

2. Install HSML with `docs` extras:

```bash
pip install -e ".[dev,docs]"
```

3. To build the docs, first run the auto doc script:

```bash
cd ..
python auto_doc.py
```

##### Option 1: Build only current version of docs

4. Either build the docs, or serve them dynamically:

Note: Links and pictures might not resolve properly later on when checking with this build.
The reason is that the docs are deployed with versioning on docs.hopsworks.ai, which adds
another level to all paths, e.g. `docs.hopsworks.ai/[version-or-alias]`.
Relative links should not be affected by this; however, building the docs with versioning
(Option 2) is recommended.

```bash
mkdocs build
# or
mkdocs serve
```

##### Option 2 (Preferred): Build multi-version doc with `mike`

###### Versioning on docs.hopsworks.ai

On docs.hopsworks.ai we implement the following versioning scheme:

- current master branches (e.g. of hsml corresponding to master of Hopsworks): rendered as current Hopsworks snapshot version, e.g. **2.2.0-SNAPSHOT [dev]**, where `dev` is an alias to indicate that this is an unstable version.
- the latest release: rendered with full current version, e.g. **2.1.5 [latest]** with `latest` alias to indicate that this is the latest stable release.
- previous stable releases: rendered without alias, e.g. **2.1.4**.

###### Build Instructions

4. For this you can either check out and make a local copy of the `upstream/gh-pages` branch, where
`mike` maintains the current state of docs.hopsworks.ai, or just build the documentation for the branch you are updating:

Building *one* branch:

Check out your dev branch with modified docs:
```bash
git checkout [dev-branch]
```

Generate API docs if necessary:
```bash
python auto_doc.py
```

Build docs with a version and alias:
```bash
mike deploy [version] [alias] --update-alias
# for example, if you are updating documentation to be merged to master,
# which will become the new SNAPSHOT version:
mike deploy 2.2.0-SNAPSHOT dev --update-alias
# if you are updating docs of the latest stable release branch
mike deploy [version] latest --update-alias
# if you are updating docs of a previous stable release branch
mike deploy [version]
```

If no gh-pages branch existed in your local repository, this will have created it.
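
The three `mike deploy` cases above follow the versioning scheme used on docs.hopsworks.ai. As a rough sketch, the choice of arguments could be expressed as follows. The helper `mike_deploy_command` and the branch-kind names `snapshot`, `latest`, and `previous` are hypothetical and only serve to illustrate the mapping.

```python
from typing import Dict, List, Optional

# Alias per kind of branch; previous stable releases are deployed without an alias.
ALIASES: Dict[str, Optional[str]] = {
    "snapshot": "dev",
    "latest": "latest",
    "previous": None,
}


def mike_deploy_command(version: str, kind: str) -> List[str]:
    """Build the `mike deploy` argument list for a given version and branch kind."""
    alias = ALIASES[kind]
    command = ["mike", "deploy", version]
    if alias is not None:
        # Move the alias to this version if it already points elsewhere.
        command += [alias, "--update-alias"]
    return command


print(" ".join(mike_deploy_command("2.2.0-SNAPSHOT", "snapshot")))
```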

**Important**: If no previous docs were built, you will have to choose a version as the default to be loaded as index, as follows:

```bash
mike set-default [version-or-alias]
```

You can now check out the gh-pages branch and serve:
```bash
git checkout gh-pages
mike serve
```

You can also list all available versions/aliases:
```bash
mike list
```

Delete and reset your local gh-pages branch:
```bash
mike delete --all
# or delete single version
mike delete [version-or-alias]
```

#### Adding new API documentation

To add new documentation for APIs, you need to add information about the method/class to document to the `auto_doc.py` script:

```python
PAGES = {
    "connection.md": [
        "hsml.connection.Connection.connection",
        "hsml.connection.Connection.setup_databricks",
    ],
    "new_template.md": [
        "module",
        "xyz.asd",
    ],
}
```

Now you can add a template markdown file to the `docs/templates` directory with the name you specified in the auto-doc script. The `new_template.md` file should contain a tag to identify the place at which the API documentation should be inserted:

```
## The XYZ package

{{module}}

Some extra content here.

!!! example
    ```python
    import xyz
    ```

{{xyz.asd}}
```

Finally, run the `auto_doc.py` script, as described above, to update the documentation.
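
Conceptually, each `{{tag}}` in a template is replaced by the documentation generated for the entry of the same name. The sketch below illustrates that idea only; it is not how `keras-autodoc` is implemented, and the function `fill_template` is hypothetical.

```python
import re
from typing import Dict

# Matches tags such as {{module}} or {{xyz.asd}}.
TAG = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def fill_template(template: str, generated: Dict[str, str]) -> str:
    """Replace each known {{tag}} with its generated text; leave unknown tags as-is."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return generated.get(name, match.group(0))

    return TAG.sub(substitute, template)


template = "## The XYZ package\n{{module}}\nSome extra content here.\n{{xyz.asd}}"
print(fill_template(template, {"module": "Docs for module"}))
```

Leaving unknown tags untouched makes a tag that is missing from `PAGES` easy to spot in the rendered page.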

For information about Markdown syntax and available admonitions, highlighting, etc., see
the [Material for MkDocs reference documentation](https://squidfunk.github.io/mkdocs-material/reference/abbreviations/).
94 changes: 94 additions & 0 deletions docs/index.md
@@ -0,0 +1,94 @@
# Hopsworks Model Registry

<p align="center">
<a href="https://community.hopsworks.ai"><img
src="https://img.shields.io/discourse/users?label=Hopsworks%20Community&server=https%3A%2F%2Fcommunity.hopsworks.ai"
alt="Hopsworks Community"
/></a>
<a href="https://docs.hopsworks.ai"><img
src="https://img.shields.io/badge/docs-HSML-orange"
alt="Hopsworks Model Registry Documentation"
/></a>
<a href="https://pypi.org/project/hsml/"><img
src="https://img.shields.io/pypi/v/hsml?color=blue"
alt="PyPiStatus"
/></a>
<a href="https://archiva.hops.works/#artifact/com.logicalclocks/hsml"><img
src="https://img.shields.io/badge/java-HSML-green"
alt="Scala/Java Artifacts"
/></a>
<a href="https://pepy.tech/project/hsml/month"><img
src="https://pepy.tech/badge/hsml/month"
alt="Downloads"
/></a>
<a href="https://github.com/psf/black"><img
src="https://img.shields.io/badge/code%20style-black-000000.svg"
alt="CodeStyle"
/></a>
<a><img
src="https://img.shields.io/pypi/l/hsml?color=green"
alt="License"
/></a>
</p>

HSML is the library to interact with the Hopsworks Model Registry. The library makes it easy to export and manage models.

The library automatically configures itself based on the environment it is run in.
However, to connect from an external Python environment, additional connection information, such as host and port, is required. For more information about the setup from external environments, see the setup section.
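
As a rough sketch of what supplying host and port from an external environment could look like, the helper below collects both values from environment variables. The variable names `HOPSWORKS_HOST` and `HOPSWORKS_PORT`, the default port `443`, and the helper itself are assumptions made for illustration. It is also assumed that `hsml.connection` accepts `host` and `port` keyword arguments. Authentication settings are not shown; refer to the setup section for those.

```python
import os
from typing import Any, Dict, Mapping


def external_connection_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect host and port for an external connection (hypothetical helper)."""
    return {
        "host": env["HOPSWORKS_HOST"],  # assumed variable name
        "port": int(env.get("HOPSWORKS_PORT", "443")),  # assumed name and default
    }


# Intended use, assuming `host` and `port` are valid keyword arguments:
# import hsml
# connection = hsml.connection(**external_connection_kwargs())
```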

## Getting Started On Hopsworks

Instantiate a connection and get the project model registry handle
```python
import hsml

# Create a connection
connection = hsml.connection()

# Get the model registry handle for the project's model registry
mr = connection.get_model_registry()
```

Create a new model
```python
mnist_model_meta = mr.tensorflow.create_model(
    name="mnist",
    version=1,
    metrics={"accuracy": 0.94},
    description="mnist model description",
)
mnist_model_meta.save("/tmp/model_directory")
```

Download a model
```python
mnist_model_meta = mr.get_model("mnist", version=1)

model_path = mnist_model_meta.download()
```

Delete a model
```python
mnist_model_meta.delete()
```

Get best performing model
```python
mnist_model_meta = mr.get_best_model('mnist', 'accuracy', 'max')

```
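
To clarify what the metric name and direction arguments mean, the following pure-Python sketch selects the best entry from a list of model records. It only illustrates the selection semantics and is not how HSML implements `get_best_model`. The function `pick_best` and the record layout are hypothetical.

```python
from typing import Dict, List, Optional


def pick_best(models: List[Dict], metric: str, direction: str) -> Optional[Dict]:
    """Return the record with the highest ('max') or lowest ('min') metric value."""
    if direction not in ("max", "min"):
        raise ValueError("direction must be 'max' or 'min'")
    # Ignore records that do not report the requested metric.
    candidates = [m for m in models if metric in m["metrics"]]
    if not candidates:
        return None
    chooser = max if direction == "max" else min
    return chooser(candidates, key=lambda m: m["metrics"][metric])


records = [
    {"name": "mnist", "version": 1, "metrics": {"accuracy": 0.94}},
    {"name": "mnist", "version": 2, "metrics": {"accuracy": 0.97}},
    {"name": "mnist", "version": 3, "metrics": {}},
]
print(pick_best(records, "accuracy", "max")["version"])
```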

You can find more examples on how to use the library in [examples.hopsworks.ai](https://examples.hopsworks.ai).

## Documentation

Documentation is available at [Hopsworks Model Registry Documentation](https://docs.hopsworks.ai/).

## Issues

For general questions about the usage of Hopsworks Machine Learning please open a topic on [Hopsworks Community](https://community.hopsworks.ai/).

Please report any issue using [GitHub issue tracking](https://github.com/logicalclocks/machine-learning-api/issues).


## Contributing

If you would like to contribute to this library, please see the [Contribution Guidelines](CONTRIBUTING.md).
