Commit 7f44453 (1 parent: b19a056)
Showing 12 changed files with 110 additions and 0 deletions.
34 changes: 34 additions & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two-integrated/README.md
@@ -0,0 +1,34 @@
# Deploy an integrated inference container
This example shows how to deploy a fully integrated bring-your-own-container (BYOC) inference image using TFServing. The model is built into the container image itself, so both `model` and `code` are absent from the deployment YAML.

## How to deploy
To deploy this example, execute the `deploy-custom-container-half-plus-two-integrated.sh` script in the `CLI` directory.

## Testing
The endpoint can be tested using the code at the end of the deployment script:
```bash
curl -d @sample-data.json -H "Content-Type: application/json" -H "Authorization: Bearer $KEY" "$SCORING_URL"
```

The inputs are a list of tensors of dimension 2.
```json
{
    "inputs" : [
        [[1,2,3,4]],
        [[0,1,1,1]]
    ]
}
```

## Model
This model is a simple [Half Plus Two](https://www.tensorflow.org/tfx/serving/docker) model which returns `0.5X+2` for input tensors of dimension 2.

```python
class HPT(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec(shape=[None,None], dtype=tf.float32)])
    def __call__(self, x):
        return tf.math.add(tf.math.multiply(x, 0.5),2)
```

## Image
This is a BYOC TFServing image with the model integrated into the container itself.
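As a quick sanity check on the arithmetic of the `HPT` module shown in this README, the following minimal sketch applies `0.5 * x + 2` elementwise to the sample inputs in plain Python. The helper name `half_plus_two` is hypothetical and not part of the repository; it mirrors only the arithmetic, not TFServing's response format.

```python
def half_plus_two(x):
    """Apply 0.5 * x + 2 elementwise to a number or an arbitrarily nested list."""
    if isinstance(x, list):
        return [half_plus_two(item) for item in x]
    return 0.5 * x + 2


# Same values as the "inputs" field of sample-data.json.
sample_inputs = [[[1, 2, 3, 4]], [[0, 1, 1, 1]]]
expected = half_plus_two(sample_inputs)
print(expected)  # [[[2.5, 3.0, 3.5, 4.0]], [[2.0, 2.5, 2.5, 2.5]]]
```

Comparing these values with the numbers returned by the endpoint is a cheap way to confirm that the container is serving the intended model.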
17 changes: 17 additions & 0 deletions
...stom-container/tfserving/half-plus-two-integrated/half-plus-two-integrated-deployment.yml
@@ -0,0 +1,17 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: integratedbyoc
endpoint_name: {{ENDPOINT_NAME}}
environment:
  image: {{ACR_NAME}}.azurecr.io/azureml-examples/tfsintegrated:1
  inference_config:
    liveness_route:
      port: 8501
      path: /v1/models/hpt
    readiness_route:
      port: 8501
      path: /v1/models/hpt
    scoring_route:
      port: 8501
      path: /v1/models/hpt:predict
instance_type: Standard_DS3_v2
instance_count: 1
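The three routes in this deployment file follow TFServing's REST layout: the model status path `/v1/models/<name>` is reused for the liveness and readiness probes, and `:predict` is appended for scoring. A minimal sketch of that mapping, using a hypothetical helper `tfserving_routes` (it is not an Azure ML or TFServing API):

```python
def tfserving_routes(model_name, port=8501):
    """Build an inference_config-style dict for a TFServing REST endpoint."""
    status_path = f"/v1/models/{model_name}"
    return {
        # The model status endpoint serves as both health probes.
        "liveness_route": {"port": port, "path": status_path},
        "readiness_route": {"port": port, "path": status_path},
        # Predictions are served from the same path with a ":predict" suffix.
        "scoring_route": {"port": port, "path": f"{status_path}:predict"},
    }


routes = tfserving_routes("hpt")
print(routes["scoring_route"]["path"])  # /v1/models/hpt:predict
```

The model name passed in has to match the `MODEL_NAME` the container serves, otherwise the probes would return an error and the deployment would not become healthy.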
5 changes: 5 additions & 0 deletions
...e/custom-container/tfserving/half-plus-two-integrated/half-plus-two-integrated.Dockerfile
@@ -0,0 +1,5 @@
FROM docker.io/tensorflow/serving:latest

ENV MODEL_NAME=hpt

COPY models /models
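The stock `tensorflow/serving` image serves the model found under `${MODEL_BASE_PATH}/${MODEL_NAME}` (with `MODEL_BASE_PATH` defaulting to `/models`) and expects numerically named version subdirectories, each containing a `saved_model.pb`. Before building the image, the local `models` directory can be checked against that layout. The sketch below assumes it is run from the directory that contains `models`; `find_model_versions` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_model_versions(models_dir, model_name):
    """Return the sorted version numbers that contain a saved_model.pb file."""
    model_root = Path(models_dir) / model_name
    versions = []
    if not model_root.is_dir():
        return versions
    for child in model_root.iterdir():
        # Only numerically named directories are treated as model versions.
        if child.is_dir() and child.name.isdigit() and (child / "saved_model.pb").is_file():
            versions.append(int(child.name))
    return sorted(versions)


# An empty list means the image would be built without a loadable model.
print(find_model_versions("models", "hpt"))
```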
Binary file added (+8.32 KB, not shown): ...ts/online/custom-container/tfserving/half-plus-two-integrated/models/hpt/1/saved_model.pb
Binary file added (+89 Bytes, not shown): ...r/tfserving/half-plus-two-integrated/models/hpt/1/variables/variables.data-00000-of-00001
Binary file added (+144 Bytes, not shown): ...ustom-container/tfserving/half-plus-two-integrated/models/hpt/1/variables/variables.index
6 changes: 6 additions & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two-integrated/sample-data.json
@@ -0,0 +1,6 @@
{
    "inputs" : [
        [[1,2,3,4]],
        [[0,1,1,1]]
    ]
}
17 changes: 17 additions & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two/README.md
@@ -0,0 +1,17 @@
# Deploy the Half Plus Two model using TFServing
In this example, we deploy a single model (half-plus-two) using a TFServing custom container.

This example can be run end-to-end by executing the `deploy-custom-container-tfserving-half-plus-two.sh` script in the `CLI` directory.

## Model
This example uses the `half-plus-two` model, which is downloaded in the script. In the deployment YAML it is registered as a model and mounted at runtime at the path given by the `AZUREML_MODEL_DIR` environment variable, as in standard deployments. The default location for model mounting is `/var/azureml-app/azureml-models/<MODEL_NAME>/<MODEL_VERSION>` unless overridden by the `model_mount_path` field in the deployment YAML.

This path is passed to TFServing through the `MODEL_BASE_PATH` environment variable in the deployment YAML.

## Build the image
This example uses the `tensorflow/serving` image with no modifications, as defined in `tfserving.dockerfile`. Although this example demonstrates the usual workflow of building the image on an ACR instance, this deployment could bypass the ACR build step and use the `docker.io` path of the image as the image URL in the deployment YAML.

## Environment
The environment is defined inline in the deployment YAML and references the ACR URL of the image. To deploy successfully, the ACR must be associated with the workspace (or have a user-assigned managed identity that enables `AcrPull`).

The environment also contains an `inference_config` block that defines the `liveness`, `readiness`, and `scoring` routes by path and port. Because this example uses the stock `tensorflow/serving` image, these values point at TFServing's REST port and model paths, and they must be included since we are now using a custom image.
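To make the path relationship in the Model section concrete, the sketch below composes the default mount location and the directory TFServing would then serve from. It assumes the stock `tensorflow/serving` entrypoint, which loads the model from `MODEL_BASE_PATH/MODEL_NAME`; the helper `default_mount_path` and the version number `1` are hypothetical and chosen only for illustration.

```python
from posixpath import join


def default_mount_path(model_name, model_version):
    """Default Azure ML model mount location, as described in the README."""
    return join("/var/azureml-app/azureml-models", model_name, str(model_version))


# Value that the deployment YAML passes to TFServing as MODEL_BASE_PATH.
model_base_path = default_mount_path("tfserving-mounted", 1)

# The stock entrypoint appends MODEL_NAME to MODEL_BASE_PATH.
served_dir = join(model_base_path, "half_plus_two")
print(served_dir)  # /var/azureml-app/azureml-models/tfserving-mounted/1/half_plus_two
```

If `model_mount_path` is set in the deployment YAML, `MODEL_BASE_PATH` would need to be changed to match it.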
1 change: 1 addition & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two/sample_request.json
@@ -0,0 +1 @@
{"instances": [1.0, 2.0, 5.0]}
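For readers who prefer Python over `curl`, the following sketch builds the scoring request for this payload with the standard library and computes the values Half Plus Two should return (`0.5 * x + 2`). The URL and token are placeholders, `build_scoring_request` is a hypothetical helper, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_url, token, payload):
    """Build (but do not send) a JSON POST request for an online endpoint."""
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


payload = {"instances": [1.0, 2.0, 5.0]}
# Placeholder URL and token; substitute the endpoint's scoring URI and key.
request = build_scoring_request("https://example.invalid/score", "<token>", payload)

# Values to compare against the "predictions" field of the response.
expected_predictions = [0.5 * x + 2 for x in payload["instances"]]
print(request.get_method(), expected_predictions)  # POST [2.5, 3.0, 4.5]
```

Passing `request` to `urllib.request.urlopen` would send it once real values are filled in.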
26 changes: 26 additions & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two/tfserving-deployment.yml
@@ -0,0 +1,26 @@
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: tfserving-deployment
endpoint_name: tfserving-endpoint
model:
  name: tfserving-mounted
  version: {{MODEL_VERSION}}
  path: ./half_plus_two
environment_variables:
  MODEL_BASE_PATH: /var/azureml-app/azureml-models/tfserving-mounted/{{MODEL_VERSION}}
  MODEL_NAME: half_plus_two
environment:
  #name: tfserving
  #version: 1
  image: docker.io/tensorflow/serving:latest
  inference_config:
    liveness_route:
      port: 8501
      path: /v1/models/half_plus_two
    readiness_route:
      port: 8501
      path: /v1/models/half_plus_two
    scoring_route:
      port: 8501
      path: /v1/models/half_plus_two:predict
instance_type: Standard_DS3_v2
instance_count: 1
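The `{{MODEL_VERSION}}` token in this file is a placeholder that has to be replaced with a concrete version before the file is passed to the CLI. How the deployment script performs that substitution is not shown in this diff; the following is only an illustrative sketch, with a hypothetical helper `render_placeholders` and an example version of `1`.

```python
import re


def render_placeholders(text, values):
    """Replace {{NAME}} tokens in text; raises KeyError if a value is missing."""
    return re.sub(r"\{\{(\w+)\}\}", lambda match: str(values[match.group(1)]), text)


line = "MODEL_BASE_PATH: /var/azureml-app/azureml-models/tfserving-mounted/{{MODEL_VERSION}}"
print(render_placeholders(line, {"MODEL_VERSION": 1}))
```

Raising on a missing value, instead of leaving the token in place, avoids deploying with a literal `{{MODEL_VERSION}}` in the model path.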
3 changes: 3 additions & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two/tfserving-endpoint.yml
@@ -0,0 +1,3 @@
$schema: https://azuremlsdk2.blob.core.windows.net/latest/managedOnlineEndpoint.schema.json
name: tfserving-endpoint
auth_mode: aml_token
1 change: 1 addition & 0 deletions
cli/endpoints/online/custom-container/tfserving/half-plus-two/tfserving.dockerfile
@@ -0,0 +1 @@
FROM tensorflow/serving