Merge branch 'mlflow:master' into master
lu-ohai authored Sep 11, 2023
2 parents 53d0139 + 59880b7 commit 1a97945
Showing 657 changed files with 25,449 additions and 9,455 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/autoformat.yml
@@ -109,7 +109,7 @@ jobs:
run: |
ruff --fix .
black .
- blacken-docs $(git ls-files '*.py' '*.rst' '*.md')
+ blacken-docs $(git ls-files '*.py' '*.rst' '*.md') || true
# ************************************************************************
# js
# ************************************************************************
2 changes: 1 addition & 1 deletion .github/workflows/cancel.yml
@@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@v3
- uses: actions/github-script@v6
with:
- github-token: ${{ secrets.MLFLOW_AUTOMATION_TOKEN }}
+ github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const script = require(
`${process.env.GITHUB_WORKSPACE}/.github/workflows/cancel.js`
1 change: 1 addition & 0 deletions .github/workflows/master.yml
@@ -235,6 +235,7 @@ jobs:
with:
submodules: recursive
- uses: ./.github/actions/untracked
- uses: ./.github/actions/free-disk-space
- uses: ./.github/actions/setup-python
- uses: ./.github/actions/setup-pyenv
- uses: ./.github/actions/setup-java
2 changes: 1 addition & 1 deletion .github/workflows/preview-docs.yml
@@ -15,7 +15,7 @@ jobs:
pip install requests
- name: Create preview link
env:
- GITHUB_TOKEN: ${{ secrets.MLFLOW_AUTOMATION_TOKEN }}
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
python dev/preview_docs.py \
--commit-sha ${{ github.event.pull_request.head.sha }} \
34 changes: 34 additions & 0 deletions .github/workflows/rerun.js
@@ -0,0 +1,34 @@
module.exports = async ({ context, github, workflow_id }) => {
const { owner, repo } = context.repo;
const { data: workflowRunsData } = await github.rest.actions.listWorkflowRuns({
owner,
repo,
workflow_id,
event: "schedule",
});

if (workflowRunsData.total_count === 0) {
return;
}

const { id: run_id, conclusion } = workflowRunsData.workflow_runs[0];
if (conclusion === "success") {
return;
}

const jobs = await github.paginate(github.rest.actions.listJobsForWorkflowRun, {
owner,
repo,
run_id,
});
const failedJobs = jobs.filter((job) => job.conclusion !== "success");
if (failedJobs.length === 0) {
return;
}

await github.rest.actions.reRunWorkflowFailedJobs({
repo,
owner,
run_id,
});
};
25 changes: 25 additions & 0 deletions .github/workflows/rerun.yml
@@ -0,0 +1,25 @@
# Cross version tests sometimes fail due to transient errors. This workflow reruns failed tests.
name: rerun-cross-version-tests

on:
schedule:
# Run this workflow daily at 17:00 UTC (4 hours after cross-version-tests.yml workflow)
- cron: "0 17 * * *"

concurrency:
group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.head_ref || github.ref }}
cancel-in-progress: true

jobs:
set-matrix:
runs-on: ubuntu-latest
timeout-minutes: 10
if: github.repository == 'mlflow-automation/mlflow'
steps:
- uses: actions/checkout@v3
- uses: actions/github-script@v6
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const rerun = require(`${process.env.GITHUB_WORKSPACE}/.github/workflows/rerun.js`);
await rerun({ context, github, workflow_id: "cross-version-tests.yml" });
2 changes: 1 addition & 1 deletion .github/workflows/stale.yml
@@ -11,7 +11,7 @@ jobs:
steps:
- uses: harupy/stale@mlflow-stale-bot
with:
- repo-token: ${{ secrets.MLFLOW_AUTOMATION_TOKEN }}
+ repo-token: ${{ secrets.GITHUB_TOKEN }}
stale-issue-message: "This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 35 days."
close-issue-message: "This issue was closed because it has been stalled for 14 days with no activity."
days-before-stale: 14
8 changes: 4 additions & 4 deletions README.rst
@@ -53,14 +53,14 @@ Job Statuses
|examples| |cross-version-tests| |r-devel| |test-requirements| |stale| |push-images|

.. |examples| image:: https://img.shields.io/github/actions/workflow/status/mlflow-automation/mlflow/examples.yml?branch=master&event=schedule&label=Examples&style=for-the-badge&logo=github
- :target: https://github.com/mlflow-automation/mlflow/actions?query=workflow%3AExamples+event%3Aschedule
+ :target: https://github.com/mlflow-automation/mlflow/actions/workflows/examples.yml?query=workflow%3AExamples+event%3Aschedule
:alt: Examples Action Status
.. |cross-version-tests| image:: https://img.shields.io/github/actions/workflow/status/mlflow-automation/mlflow/cross-version-tests.yml?branch=master&event=schedule&label=Cross%20version%20tests&style=for-the-badge&logo=github
- :target: https://github.com/mlflow-automation/mlflow/actions?query=workflow%3ACross%2Bversion%2Btests+event%3Aschedule
+ :target: https://github.com/mlflow-automation/mlflow/actions/workflows/cross-version-tests.yml?query=workflow%3A%22Cross+version+tests%22+event%3Aschedule
.. |r-devel| image:: https://img.shields.io/github/actions/workflow/status/mlflow-automation/mlflow/r.yml?branch=master&event=schedule&label=r-devel&style=for-the-badge&logo=github
- :target: https://github.com/mlflow-automation/mlflow/actions?query=workflow%3AR+event%3Aschedule
+ :target: https://github.com/mlflow-automation/mlflow/actions/workflows/r.yml?query=workflow%3AR+event%3Aschedule
.. |test-requirements| image:: https://img.shields.io/github/actions/workflow/status/mlflow-automation/mlflow/requirements.yml?branch=master&event=schedule&label=test%20requirements&logo=github&style=for-the-badge
- :target: https://github.com/mlflow-automation/mlflow/actions?query=workflow%3ATest%2Brequirements+event%3Aschedule
+ :target: https://github.com/mlflow-automation/mlflow/actions/workflows/requirements.yml?query=workflow%3A"Test+requirements"+event%3Aschedule
.. |stale| image:: https://img.shields.io/github/actions/workflow/status/mlflow/mlflow/stale.yml?branch=master&event=schedule&label=stale&logo=github&style=for-the-badge
:target: https://github.com/mlflow/mlflow/actions?query=workflow%3AStale+event%3Aschedule
.. |push-images| image:: https://img.shields.io/github/actions/workflow/status/mlflow/mlflow/push-images.yml?event=release&label=push-images&logo=github&style=for-the-badge
17 changes: 17 additions & 0 deletions docs/source/auth/index.rst
@@ -612,6 +612,23 @@ Authentication configuration is located at ``mlflow/server/auth/basic_auth.ini``
- Default admin username if the admin is not already created
* - ``admin_password``
- Default admin password if the admin is not already created
* - ``authorization_function``
- Function to authenticate requests

The ``authorization_function`` setting supports pluggable authentication: set it
if you want to use an authentication method other than HTTP basic auth. Its value
has the form ``module_name:function_name``, and the referenced function must have
the following signature:

.. code-block:: python

    def authenticate_request() -> Union[Authorization, Response]:
        ...

The function should return a ``werkzeug.datastructures.Authorization`` object if
the request is authenticated, or a ``Response`` object (typically
``401: Unauthorized``) if it is not. For an example of how to implement a custom
authentication method, see ``tests/server/auth/jwt_auth.py``.
**NOTE:** This example is not intended for production use.
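
For illustration, here is a minimal sketch of such a pluggable function. The
``X-Api-Key`` header name, the accepted key value, and the returned username are
assumptions made for this example only; they are not part of MLflow. In
``basic_auth.ini`` you would then point ``authorization_function`` at the function,
for example ``authorization_function = my_auth_module:authenticate_request``
(a hypothetical module path).

.. code-block:: python

    from typing import Union

    from flask import Response, make_response, request
    from werkzeug.datastructures import Authorization


    def authenticate_request() -> Union[Authorization, Response]:
        # Hypothetical check: accept any request carrying a fixed API key header.
        if request.headers.get("X-Api-Key") == "my-secret-key":
            return Authorization(auth_type="api_key", data={"username": "user"})
        # Otherwise reject with 401 so the client knows authentication is required.
        response = make_response("Missing or invalid X-Api-Key header.")
        response.status_code = 401
        return response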

Custom Authentication
=====================
3 changes: 2 additions & 1 deletion docs/source/conf.py
@@ -39,7 +39,7 @@
"sphinx.ext.viewcode",
"sphinx.ext.napoleon",
"sphinx_click.ext",
"test_code_block",
"testcode_block",
]

# Add any paths that contain templates here, relative to this directory.
@@ -329,6 +329,7 @@
("py:class", "pandas.core.series.Series"),
("py:class", "pandas.core.frame.DataFrame"),
("py:class", "pandas.DataFrame"),
("py:class", "pyspark.sql.DataFrame"),
("py:class", "pyspark.sql.dataframe.DataFrame"),
("py:class", "matplotlib.figure.Figure"),
("py:class", "plotly.graph_objects.Figure"),
4 changes: 2 additions & 2 deletions docs/source/gateway/index.rst
@@ -979,13 +979,13 @@ Here are some examples for how you might use curl to interact with the Gateway:
curl -X GET http://my.gateway:8888/api/2.0/gateway/routes/embeddings
- 2. List all routes: ``GET /api/2.0/gateway/routes``
+ 2. List all routes: ``GET /api/2.0/gateway/routes/``

This endpoint returns a list of all routes.

.. code-block:: bash
- curl -X GET http://my.gateway:8888/api/2.0/gateway/routes
+ curl -X GET http://my.gateway:8888/api/2.0/gateway/routes/
3. Query a route: ``POST /gateway/{route}/invocations``

Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ def _func(): # <- obj_line
Docstring <- obj_line + offset + extra_offset
- .. test-code-block:: <- obj_line + offset + extra_offset + lineno_in_docstring
+ .. testcode:: <- obj_line + offset + extra_offset + lineno_in_docstring
...
"""
pass
@@ -92,7 +92,7 @@ def run(self):


def setup(app):
app.add_directive("test-code-block", TestCodeBlockDirective)
app.add_directive("testcode", TestCodeBlockDirective)
return {
"version": "builtin",
"parallel_read_safe": False,
2 changes: 1 addition & 1 deletion examples/evaluation/README.md
@@ -30,7 +30,7 @@ pip install scikit-learn xgboost shap>=0.40 matplotlib

Run in this directory with Python.

- ```
+ ```sh
python evaluate_on_binary_classifier.py
python evaluate_on_multiclass_classifier.py
python evaluate_on_regressor.py
2 changes: 2 additions & 0 deletions examples/jwt_auth/__init__.py
@@ -0,0 +1,2 @@
"""The jwt_auth.py example in this module directory is also used by
tests/server/auth/test_auth.py."""
44 changes: 44 additions & 0 deletions examples/jwt_auth/jwt_auth.py
@@ -0,0 +1,44 @@
"""Sample JWT authentication module for testing purposes.
NOT SUITABLE FOR PRODUCTION USE.
"""
import logging
from typing import Union

import jwt
from flask import Response, make_response, request
from werkzeug.datastructures import Authorization

BEARER_PREFIX = "bearer "

_logger = logging.getLogger(__name__)


def authenticate_request() -> Union[Authorization, Response]:
_logger.debug("Getting token")
error_response = make_response()
error_response.status_code = 401
error_response.set_data(
"You are not authenticated. Please provide a valid JWT Bearer token with the request."
)
error_response.headers["WWW-Authenticate"] = 'Bearer error="invalid_token"'

token = request.headers.get("Authorization")
if token is not None and token.lower().startswith(BEARER_PREFIX):
token = token[len(BEARER_PREFIX) :] # Remove prefix
try:
# NOTE:
# - This is a sample implementation for testing purposes only.
# - Here we're using a hardcoded key, which is not secure.
# - We also aren't validating that the user exists.
token_info = jwt.decode(token, "secret", algorithms=["HS256"])
if not token_info: # pragma: no cover
_logger.warning("No token_info returned")
return error_response

return Authorization(auth_type="jwt", data=token_info)
except jwt.exceptions.InvalidTokenError:
pass

_logger.warning("Missing or invalid authorization token")
return error_response
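
As a usage sketch (the tracking URI, endpoint, and username below are assumptions;
PyJWT and `requests` are assumed to be installed), a client could mint a token that
this module accepts, signed with the same hardcoded `"secret"` key and `HS256`
algorithm, and send it as a Bearer token:

```python
import jwt
import requests

# Sign a token with the key and algorithm the sample module decodes with.
token = jwt.encode({"username": "alice"}, "secret", algorithm="HS256")

# Attach the token as a Bearer credential on an example tracking-server call.
response = requests.get(
    "http://localhost:5000/api/2.0/mlflow/experiments/get",
    params={"experiment_id": "0"},
    headers={"Authorization": f"Bearer {token}"},
)
print(response.status_code)
```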
2 changes: 1 addition & 1 deletion examples/pytorch/CaptumExample/README.md
@@ -61,7 +61,7 @@ mlflow run . -P max_epochs=5 -P learning_rate=0.01 -P use_pretrained_model=True

Or to run the training script directly with custom parameters:

- ```
+ ```sh
python Titanic_Captum_Interpret.py \
--max_epochs 50 \
--lr 0.1
2 changes: 1 addition & 1 deletion examples/pytorch/MNIST/README.md
@@ -60,7 +60,7 @@ mlflow run . -P max_epochs=5 -P devices=1 -P batch_size=32 -P num_workers=2 -P l

Or to run the training script directly with custom parameters:

- ```
+ ```sh
python mnist_autolog_example.py \
--trainer.max_epochs 5 \
--trainer.devices 1 \
4 changes: 3 additions & 1 deletion mlflow/_promptlab.py
@@ -132,7 +132,9 @@ def save_model(

if conda_env is None:
if pip_requirements is None:
- inferred_reqs = infer_pip_requirements(path, "mlflow._promptlab", [])
+ inferred_reqs = infer_pip_requirements(
+     path, "mlflow._promptlab", [f"mlflow[gateway]=={__version__}"]
+ )
default_reqs = sorted(inferred_reqs)
else:
default_reqs = None
3 changes: 1 addition & 2 deletions mlflow/artifacts/__init__.py
@@ -66,8 +66,7 @@ def download_artifacts(
artifact_repo = get_artifact_repository(
add_databricks_profile_info_to_artifact_uri(artifact_uri, tracking_uri)
)
- artifact_location = artifact_repo.download_artifacts(artifact_path, dst_path=dst_path)
- return artifact_location
+ return artifact_repo.download_artifacts(artifact_path, dst_path=dst_path)


def load_text(artifact_uri: str) -> str:
3 changes: 1 addition & 2 deletions mlflow/azure/client.py
@@ -158,8 +158,7 @@ def _append_query_parameters(url, parameters):
query_dict.update(parameters)
new_query = urllib.parse.urlencode(query_dict)
new_url_components = parsed_url._replace(query=new_query)
- new_url = urllib.parse.urlunparse(new_url_components)
- return new_url
+ return urllib.parse.urlunparse(new_url_components)


def _build_block_list_xml(block_list):
9 changes: 6 additions & 3 deletions mlflow/data/spark_dataset.py
@@ -1,7 +1,7 @@
import json
import logging
from functools import cached_property
- from typing import Any, Dict, Optional, Union
+ from typing import TYPE_CHECKING, Any, Dict, Optional, Union

from mlflow.data.dataset import Dataset
from mlflow.data.dataset_source import DatasetSource
@@ -16,6 +16,9 @@
from mlflow.types.utils import _infer_schema
from mlflow.utils.annotations import experimental

if TYPE_CHECKING:
import pyspark

_logger = logging.getLogger(__name__)


@@ -28,7 +31,7 @@ class SparkDataset(Dataset, PyFuncConvertibleDatasetMixin):

def __init__(
self,
- df,
+ df: "pyspark.sql.DataFrame",
source: DatasetSource,
targets: Optional[str] = None,
name: Optional[str] = None,
@@ -255,7 +258,7 @@ def load_delta(

@experimental
def from_spark(
- df,
+ df: "pyspark.sql.DataFrame",
path: Optional[str] = None,
table_name: Optional[str] = None,
version: Optional[str] = None,
5 changes: 4 additions & 1 deletion mlflow/gateway/utils.py
@@ -45,7 +45,7 @@ def kill_child_processes(parent_pid):
"""
parent = psutil.Process(parent_pid)
for child in parent.children(recursive=True):
- child.terminate()
+ try:
+     child.terminate()
+ except psutil.NoSuchProcess:
+     pass
_, still_alive = psutil.wait_procs(parent.children(), timeout=3)
for p in still_alive:
p.kill()
3 changes: 1 addition & 2 deletions mlflow/lightgbm.py
@@ -841,8 +841,7 @@ def get_input_example():

def infer_model_signature(input_example):
model_output = model.predict(input_example)
- model_signature = infer_signature(input_example, model_output)
- return model_signature
+ return infer_signature(input_example, model_output)

# Whether to automatically log the trained model based on boolean flag.
if _log_models: