diff --git a/.gitignore b/.gitignore
index 1682c94e1bd..6a4b1af2b57 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,7 @@
+# Operating System files
+.DS_Store
+Thumbs.db
+
 log/
 obj/
 _site/
diff --git a/articles/ai-services/openai/how-to/deployment-types.md b/articles/ai-services/openai/how-to/deployment-types.md
index 06f4b379b8f..189fd08ae0b 100644
--- a/articles/ai-services/openai/how-to/deployment-types.md
+++ b/articles/ai-services/openai/how-to/deployment-types.md
@@ -30,7 +30,7 @@ Azure OpenAI offers three types of deployments. These provide a varied level of

 | **Offering** | **Global-Batch** | **Global-Standard** | **Standard** | **Provisioned** |
 |---|:---|:---|:---|:---|
-| **Best suited for** | Offline scoring

Workloads that are not latency sensitive and can be completed in hours.

For use cases that do not have data processing residency requirements.| Recommended starting place for customers.

Global-Standard will have the higher default quota and larger number of models available than Standard.

For production applications that do not have data processing residency requirements. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.| +| **Best suited for** | Offline scoring

Workloads that are not latency sensitive and can be completed in hours.

For use cases that do not have data processing residency requirements.| Recommended starting place for customers.

Global-Standard will have the higher default quota and larger number of models available than Standard. | For customers with data residency requirements. Optimized for low to medium volume. | Real-time scoring for large consistent volume. Includes the highest commitments and limits.| | **How it works** | Offline processing via files |Traffic may be routed anywhere in the world | | | | **Getting started** | [Global-Batch](./batch.md) | [Model deployment](./create-resource.md) | [Model deployment](./create-resource.md) | [Provisioned onboarding](./provisioned-throughput-onboarding.md) | | **Cost** | [Least expensive option](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/)
50% less cost compared to Global Standard prices. Access to all new models with larger quota allocations. | [Global deployment pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | [Regional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) | May experience cost savings for consistent usage |
diff --git a/articles/ai-studio/how-to/costs-plan-manage.md b/articles/ai-studio/how-to/costs-plan-manage.md
index 072a1760dd2..a16bd59a65c 100644
--- a/articles/ai-studio/how-to/costs-plan-manage.md
+++ b/articles/ai-studio/how-to/costs-plan-manage.md
@@ -18,7 +18,10 @@ author: Blackmist

 [!INCLUDE [Feature preview](~/reusable-content/ce-skilling/azure/includes/ai-studio/includes/feature-preview.md)]

-This article describes how you plan for and manage costs for Azure AI Studio. First, you use the Azure pricing calculator to help plan for Azure AI Studio costs before you add any resources for the service to estimate costs. Next, as you add Azure resources, review the estimated costs.
+This article describes how you plan for and manage costs for Azure AI Studio. First, you use the Azure pricing calculator to help plan for Azure AI Studio costs before you add any resources for the service to estimate costs. Next, as you add Azure resources, review the estimated costs.
+
+> [!TIP]
+> Azure AI Studio does not have a specific page in the Azure pricing calculator. Azure AI Studio is composed of several other Azure services, some of which are optional. This article provides information on using the pricing calculator to estimate costs for these services.

 You use Azure AI services in Azure AI Studio. Costs for Azure AI services are only a portion of the monthly costs in your Azure bill. You're billed for all Azure services and resources used in your Azure subscription, including the third-party services.
diff --git a/articles/ai-studio/how-to/deploy-models-cohere-command.md b/articles/ai-studio/how-to/deploy-models-cohere-command.md index 6d2b26f9ef4..6a7ec5cea5a 100644 --- a/articles/ai-studio/how-to/deploy-models-cohere-command.md +++ b/articles/ai-studio/how-to/deploy-models-cohere-command.md @@ -231,7 +231,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -244,7 +244,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` @@ -256,13 +256,15 @@ Cohere Command chat models can create JSON outputs. Set `response_format` to `js ```python +from azure.ai.inference.models import ChatCompletionsResponseFormatJSON + response = client.complete( messages=[ SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using." " the following format: { ""answer"": ""response"" }."), UserMessage(content="How many languages are in the world?"), ], - response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT } + response_format=ChatCompletionsResponseFormatJSON() ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-cohere-embed.md b/articles/ai-studio/how-to/deploy-models-cohere-embed.md index e49a992bc19..2bb7a7302c9 100644 --- a/articles/ai-studio/how-to/deploy-models-cohere-embed.md +++ b/articles/ai-studio/how-to/deploy-models-cohere-embed.md @@ -617,12 +617,12 @@ Cohere Embed V3 models can optimize the embeddings based on its use case. 
| Description | Language | Sample | |-------------------------------------------|-------------------|-----------------------------------------------------------------| -| Web requests | Bash | [Command-R](https://aka.ms/samples/cohere-command-r/webrequests) - [Command-R+](https://aka.ms/samples/cohere-command-r-plus/webrequests) | +| Web requests | Bash | [cohere-embed.ipynb](https://aka.ms/samples/embed-v3/webrequests) | | Azure AI Inference package for JavaScript | JavaScript | [Link](https://aka.ms/azsdk/azure-ai-inference/javascript/samples) | | Azure AI Inference package for Python | Python | [Link](https://aka.ms/azsdk/azure-ai-inference/python/samples) | -| OpenAI SDK (experimental) | Python | [Link](https://aka.ms/samples/cohere-command/openaisdk) | -| LangChain | Python | [Link](https://aka.ms/samples/cohere/langchain) | -| Cohere SDK | Python | [Link](https://aka.ms/samples/cohere-python-sdk) | +| OpenAI SDK (experimental) | Python | [Link](https://aka.ms/samples/cohere-embed/openaisdk) | +| LangChain | Python | [Link](https://aka.ms/samples/cohere-embed/langchain) | +| Cohere SDK | Python | [Link](https://aka.ms/samples/cohere-embed/cohere-python-sdk) | | LiteLLM SDK | Python | [Link](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/litellm.ipynb) | #### Retrieval Augmented Generation (RAG) and tool use samples diff --git a/articles/ai-studio/how-to/deploy-models-jais.md b/articles/ai-studio/how-to/deploy-models-jais.md index f333192f295..b1831700770 100644 --- a/articles/ai-studio/how-to/deploy-models-jais.md +++ b/articles/ai-studio/how-to/deploy-models-jais.md @@ -201,7 +201,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -214,7 +214,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-llama.md b/articles/ai-studio/how-to/deploy-models-llama.md index 66140ad7a08..8cdc9350477 100644 --- a/articles/ai-studio/how-to/deploy-models-llama.md +++ b/articles/ai-studio/how-to/deploy-models-llama.md @@ -255,7 +255,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -268,7 +268,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-mistral-nemo.md b/articles/ai-studio/how-to/deploy-models-mistral-nemo.md index 95eb2177578..8367b3a1b25 100644 --- a/articles/ai-studio/how-to/deploy-models-mistral-nemo.md +++ b/articles/ai-studio/how-to/deploy-models-mistral-nemo.md @@ -209,7 +209,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -222,7 +222,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` @@ -234,13 +234,15 @@ Mistral Nemo chat model can create JSON outputs. Set `response_format` to `json_ ```python +from azure.ai.inference.models import ChatCompletionsResponseFormatJSON + response = client.complete( messages=[ SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using." " the following format: { ""answer"": ""response"" }."), UserMessage(content="How many languages are in the world?"), ], - response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT } + response_format=ChatCompletionsResponseFormatJSON() ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-mistral-open.md b/articles/ai-studio/how-to/deploy-models-mistral-open.md index 9c1051e6745..20aa4714814 100644 --- a/articles/ai-studio/how-to/deploy-models-mistral-open.md +++ b/articles/ai-studio/how-to/deploy-models-mistral-open.md @@ -257,7 +257,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -270,7 +270,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-mistral.md b/articles/ai-studio/how-to/deploy-models-mistral.md index a6a54194fcb..ba88312a4da 100644 --- a/articles/ai-studio/how-to/deploy-models-mistral.md +++ b/articles/ai-studio/how-to/deploy-models-mistral.md @@ -239,7 +239,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -252,7 +252,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` @@ -264,13 +264,15 @@ Mistral premium chat models can create JSON outputs. Set `response_format` to `j ```python +from azure.ai.inference.models import ChatCompletionsResponseFormatJSON + response = client.complete( messages=[ SystemMessage(content="You are a helpful assistant that always generate responses in JSON format, using." 
" the following format: { ""answer"": ""response"" }."), UserMessage(content="How many languages are in the world?"), ], - response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT } + response_format=ChatCompletionsResponseFormatJSON() ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-phi-3-5-moe.md b/articles/ai-studio/how-to/deploy-models-phi-3-5-moe.md index d688e988f8c..8c167738f7b 100644 --- a/articles/ai-studio/how-to/deploy-models-phi-3-5-moe.md +++ b/articles/ai-studio/how-to/deploy-models-phi-3-5-moe.md @@ -219,7 +219,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -232,7 +232,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md b/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md index a74fb95ade6..207dd6c34aa 100644 --- a/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md +++ b/articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md @@ -215,7 +215,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -228,7 +228,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-phi-3-vision.md b/articles/ai-studio/how-to/deploy-models-phi-3-vision.md index a79761c982e..2542fb3e05b 100644 --- a/articles/ai-studio/how-to/deploy-models-phi-3-vision.md +++ b/articles/ai-studio/how-to/deploy-models-phi-3-vision.md @@ -215,7 +215,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -228,7 +228,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/how-to/deploy-models-phi-3.md b/articles/ai-studio/how-to/deploy-models-phi-3.md index 40f9da64d97..849f972e8a0 100644 --- a/articles/ai-studio/how-to/deploy-models-phi-3.md +++ b/articles/ai-studio/how-to/deploy-models-phi-3.md @@ -256,7 +256,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -269,7 +269,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/ai-studio/index.yml b/articles/ai-studio/index.yml index 984caefc6de..9279c0e670a 100644 --- a/articles/ai-studio/index.yml +++ b/articles/ai-studio/index.yml @@ -8,6 +8,7 @@ metadata: ms.service: azure-ai-studio ms.custom: - build-2024 + - copilot-learning-hub ms.topic: landing-page ms.reviewer: eur ms.author: eur diff --git a/articles/ai-studio/reference/reference-model-inference-api.md b/articles/ai-studio/reference/reference-model-inference-api.md index 233ce5f5ee1..076ff8e775d 100644 --- a/articles/ai-studio/reference/reference-model-inference-api.md +++ b/articles/ai-studio/reference/reference-model-inference-api.md @@ -318,7 +318,7 @@ The following example shows the response for a chat completion request indicatin ```python import json -from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormat +from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormatJSON from azure.core.exceptions import HttpResponseError try: @@ -327,7 +327,7 @@ try: SystemMessage(content="You are a helpful assistant."), UserMessage(content="How many languages are in the world?"), ], - response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT } + response_format=ChatCompletionsResponseFormatJSON() ) except HttpResponseError as ex: if ex.status_code == 422: diff --git a/articles/machine-learning/.openpublishing.redirection.machine-learning.json b/articles/machine-learning/.openpublishing.redirection.machine-learning.json index 9beeab11120..bfb22970cf4 100644 --- 
a/articles/machine-learning/.openpublishing.redirection.machine-learning.json +++ b/articles/machine-learning/.openpublishing.redirection.machine-learning.json @@ -2972,8 +2972,13 @@ }, { "source_path_from_root": "/articles/machine-learning/v-fake/how-to-cicd-data-ingestion.md", - "redirect_url": "/azure/machine-learning/how-to-cicd-data-ingestion", - "redirect_document_id": true + "redirect_url": "/azure/machine-learning/how-to-devops-machine-learning", + "redirect_document_id": false + }, + { + "source_path_from_root": "/articles/machine-learning/v1/how-to-cicd-data-ingestion.md", + "redirect_url": "/azure/machine-learning/how-to-devops-machine-learning", + "redirect_document_id": false }, { "source_path_from_root": "/articles/machine-learning/how-to-debug-pipelines-application-insights.md", diff --git a/articles/machine-learning/component-reference/add-columns.md b/articles/machine-learning/component-reference/add-columns.md index f60fa19a88f..1c5a5f98bd0 100644 --- a/articles/machine-learning/component-reference/add-columns.md +++ b/articles/machine-learning/component-reference/add-columns.md @@ -18,8 +18,6 @@ This article describes a component in Azure Machine Learning designer. Use this component to concatenate two datasets. You combine all columns from the two datasets that you specify as inputs to create a single dataset. If you need to concatenate more than two datasets, use several instances of **Add Columns**. - - ## How to configure Add Columns 1. Add the **Add Columns** component to your pipeline. @@ -42,4 +40,4 @@ If there are two columns with the same name in the input datasets, a numeric suf ## Next steps -See the [set of components available](component-reference.md) to Azure Machine Learning. \ No newline at end of file +See the [set of components available](component-reference.md) to Azure Machine Learning. 
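Review note: the `how-to-managed-network.md` hunks in this diff change `MLClient(...)` calls from positional arguments to keyword arguments (`subscription_id=`, `resource_group_name=`, `workspace_name=`). The sketch below illustrates why that style is less error-prone. It assumes a hypothetical stand-in class, `SketchClient`, whose parameter names are copied from the diff. It is not the real `azure.ai.ml.MLClient`, and the sample values are made up.

```python
# Hypothetical stand-in used only to illustrate argument binding.
# It is NOT the real azure.ai.ml.MLClient; the parameter names are copied
# from the keyword arguments that appear in the diff.
class SketchClient:
    def __init__(self, credential, subscription_id=None,
                 resource_group_name=None, workspace_name=None):
        self.credential = credential
        self.subscription_id = subscription_id
        self.resource_group_name = resource_group_name
        self.workspace_name = workspace_name


credential = object()           # placeholder for a real credential object
subscription_id = "sub-0000"    # made-up value
resource_group = "example-rg"   # made-up value

# Positional style (old side of the diff): correctness depends on order,
# so swapping two arguments goes unnoticed.
swapped = SketchClient(credential, resource_group, subscription_id, "myworkspace")

# Keyword style (new side of the diff): each value is bound by name,
# so the order in which the arguments are written no longer matters.
explicit = SketchClient(
    credential,
    workspace_name="myworkspace",
    resource_group_name=resource_group,
    subscription_id=subscription_id,
)

print(swapped.subscription_id)   # example-rg (silently wrong)
print(explicit.subscription_id)  # sub-0000
```

With keyword arguments, a reordered call still binds every value to the intended parameter, which is presumably the motivation for the change.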
diff --git a/articles/machine-learning/how-to-deploy-models-phi-3-5-moe.md b/articles/machine-learning/how-to-deploy-models-phi-3-5-moe.md index e7fd0e8d373..2b96be6f139 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3-5-moe.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3-5-moe.md @@ -219,7 +219,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -232,7 +232,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md index e9cbee839c3..b8df2003caa 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3-5-vision.md @@ -214,7 +214,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -227,7 +227,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/machine-learning/how-to-deploy-models-phi-3-vision.md b/articles/machine-learning/how-to-deploy-models-phi-3-vision.md index d4a92afbadc..807c8dc6a4e 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3-vision.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3-vision.md @@ -213,7 +213,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](reference-model-inference-api.md). ```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -226,7 +226,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/machine-learning/how-to-deploy-models-phi-3.md b/articles/machine-learning/how-to-deploy-models-phi-3.md index 035de19dd7c..8ffa1a0ec6f 100644 --- a/articles/machine-learning/how-to-deploy-models-phi-3.md +++ b/articles/machine-learning/how-to-deploy-models-phi-3.md @@ -257,7 +257,7 @@ print_stream(result) Explore other parameters that you can specify in the inference client. For a full list of all the supported parameters and their corresponding documentation, see [Azure AI Model Inference API reference](https://aka.ms/azureai/modelinference). 
```python -from azure.ai.inference.models import ChatCompletionsResponseFormat +from azure.ai.inference.models import ChatCompletionsResponseFormatText response = client.complete( messages=[ @@ -270,7 +270,7 @@ response = client.complete( stop=["<|endoftext|>"], temperature=0, top_p=1, - response_format={ "type": ChatCompletionsResponseFormat.TEXT }, + response_format=ChatCompletionsResponseFormatText(), ) ``` diff --git a/articles/machine-learning/how-to-managed-network.md b/articles/machine-learning/how-to-managed-network.md index 22f18384a37..02352e0cef4 100644 --- a/articles/machine-learning/how-to-managed-network.md +++ b/articles/machine-learning/how-to-managed-network.md @@ -8,7 +8,7 @@ ms.subservice: enterprise-readiness ms.reviewer: None ms.author: larryfr author: Blackmist -ms.date: 04/11/2024 +ms.date: 08/29/2024 ms.topic: how-to ms.custom: - build-2023 @@ -142,7 +142,7 @@ Before following the steps in this article, make sure you have the following pre resource_group = "" # get a handle to the subscription - ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group) + ml_client = MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group) ``` # [Azure portal](#tab/portal) @@ -294,7 +294,7 @@ To configure a managed VNet that allows internet outbound communications, use th ```python # Get the existing workspace - ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, "myworkspace") + ml_client = MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name="myworkspace") ws = ml_client.workspaces.get() # Basic managed VNet configuration @@ -568,7 +568,7 @@ To configure a managed VNet that allows only approved outbound communications, u ```python # Get the existing workspace - ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, "myworkspace") + ml_client = 
MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name="myworkspace") ws = ml_client.workspaces.get() # Basic managed VNet configuration @@ -793,7 +793,7 @@ To enable the [serverless Spark jobs](how-to-submit-spark-jobs.md) for the manag ```python # Connect to a workspace named "myworkspace" - ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name="myworkspace") + ml_client = MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name="myworkspace") # whether to provision Spark vnet as well include_spark = True @@ -839,7 +839,7 @@ The following example shows how to provision a managed VNet: ```python # Connect to a workspace named "myworkspace" -ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name="myworkspace") +ml_client = MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name="myworkspace") # whether to provision Spark vnet as well include_spark = True @@ -889,7 +889,7 @@ resource_group = "" workspace = "" ml_client = MLClient( - DefaultAzureCredential(), subscription_id, resource_group, workspace + DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name=workspace ) # Get workspace info @@ -936,7 +936,7 @@ The following example demonstrates how to manage outbound rules for a workspace ```python # Connect to the workspace -ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace_name="myworkspace") +ml_client = MLClient(DefaultAzureCredential(), subscription_id=subscription_id, resource_group_name=resource_group, workspace_name="myworkspace") # Specify the rule name rule_name = "" diff --git a/articles/machine-learning/reference-model-inference-api.md b/articles/machine-learning/reference-model-inference-api.md 
index 6b65eb665e0..ecd8e0b929a 100644 --- a/articles/machine-learning/reference-model-inference-api.md +++ b/articles/machine-learning/reference-model-inference-api.md @@ -313,7 +313,7 @@ The following example shows the response for a chat completion request indicatin ```python import json -from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormat +from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormatJSON from azure.core.exceptions import HttpResponseError try: @@ -322,7 +322,7 @@ try: SystemMessage(content="You are a helpful assistant."), UserMessage(content="How many languages are in the world?"), ], - response_format={ "type": ChatCompletionsResponseFormat.JSON_OBJECT } + response_format=ChatCompletionsResponseFormatJSON() ) except HttpResponseError as ex: if ex.status_code == 422: diff --git a/articles/machine-learning/reference-yaml-component-command.md b/articles/machine-learning/reference-yaml-component-command.md index 4a2ca4e1972..15a75fc6e97 100644 --- a/articles/machine-learning/reference-yaml-component-command.md +++ b/articles/machine-learning/reference-yaml-component-command.md @@ -37,7 +37,7 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late | `is_deterministic` | boolean |This option determines if the component will produce the same output for the same input data. You should usually set this to `false` for components that load data from external sources, such as importing data from a URL. This is because the data at the URL might change over time. | | `true` | | `command` | string | **Required.** The command to execute. | | | | `code` | string | Local path to the source code directory to be uploaded and used for the component. | | | -| `environment` | string or object | **Required.** The environment to use for the component. 
This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.

To reference an existing custom environment, use the `azureml::` syntax. To reference a curated environment, use the `azureml://registries/azureml/environment//versions/` syntax. For more information on how to reference environments see [How to Manage Environments](https://learn.microsoft.com/azure/machine-learning/how-to-manage-environments-v2)

To define an environment inline, follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they aren't supported for inline environments. | | | +| `environment` | string or object | **Required.** The environment to use for the component. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.

To reference an existing custom environment, use the `azureml::` syntax. To reference a curated environment, use the `azureml://registries/azureml/environment//versions/` syntax. For more information on how to reference environments see [How to Manage Environments](how-to-manage-environments-v2.md)

To define an environment inline, follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they aren't supported for inline environments. | | | | `distribution` | object | The distribution configuration for distributed training scenarios. One of [MpiConfiguration](#mpiconfiguration), [PyTorchConfiguration](#pytorchconfiguration), or [TensorFlowConfiguration](#tensorflowconfiguration). | | | | `resources.instance_count` | integer | The number of nodes to use for the job. | | `1` | | `inputs` | object | Dictionary of component inputs. The key is a name for the input within the context of the component and the value is the component input definition.

Inputs can be referenced in the `command` using the `${{ inputs. }}` expression. | | | diff --git a/articles/machine-learning/reference-yaml-job-command.md b/articles/machine-learning/reference-yaml-job-command.md index 38a396d876e..eb8bf6b1a30 100644 --- a/articles/machine-learning/reference-yaml-job-command.md +++ b/articles/machine-learning/reference-yaml-job-command.md @@ -9,7 +9,7 @@ ms.topic: reference ms.custom: cliv2, devx-track-python, update-code author: Blackmist ms.author: larryfr -ms.date: 07/25/2024 +ms.date: 08/29/2024 ms.reviewer: balapv --- @@ -34,9 +34,9 @@ The source JSON schema can be found at https://azuremlschemas.azureedge.net/late | `experiment_name` | string | Experiment name to organize the job under. Each job's run record is organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure Machine Learning defaults it to the name of the working directory where the job was created. | | | | `description` | string | Description of the job. | | | | `tags` | object | Dictionary of tags for the job. | | | -| `command` | string | **Required (if not using `component` field).** The command to execute. | | | +| `command` | string | The command to execute. | | | | `code` | string | Local path to the source code directory to be uploaded and used for the job. | | | -| `environment` | string or object | **Required (if not using `component` field).** The environment to use for the job. Can be either a reference to an existing versioned environment in the workspace or an inline environment specification.

To reference an existing environment, use the `azureml::` syntax or `azureml:@latest` (to reference the latest version of an environment).

To define an environment inline, follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they aren't supported for inline environments. | | | +| `environment` | string or object | The environment to use for the job. Can be either a reference to an existing versioned environment in the workspace or an inline environment specification.

To reference an existing environment, use the `azureml::` syntax or `azureml:@latest` (to reference the latest version of an environment).

To define an environment inline, follow the [Environment schema](reference-yaml-environment.md#yaml-syntax). Exclude the `name` and `version` properties as they aren't supported for inline environments. | | | | `environment_variables` | object | Dictionary of environment variable key-value pairs to set on the process where the command is executed. | | | | `distribution` | object | The distribution configuration for distributed training scenarios. One of [MpiConfiguration](#mpiconfiguration), [PyTorchConfiguration](#pytorchconfiguration), or [TensorFlowConfiguration](#tensorflowconfiguration). | | | | `compute` | string | Name of the compute target to execute the job on. Can be either a reference to an existing compute in the workspace (using the `azureml:` syntax) or `local` to designate local execution. **Note:** jobs in pipeline didn't support `local` as `compute` | | `local` | diff --git a/articles/machine-learning/toc.yml b/articles/machine-learning/toc.yml index 35953b631d1..af015069571 100644 --- a/articles/machine-learning/toc.yml +++ b/articles/machine-learning/toc.yml @@ -327,9 +327,6 @@ - name: Data preparation with Azure Synapse displayName: data, data prep, spark, spark pool, cluster, spark cluster,dataset, datastore href: ./v1/how-to-data-prep-synapse-spark-pool.md - - name: DevOps for data ingestion - displayName: data, ingestion, devops - href: ./v1/how-to-cicd-data-ingestion.md - name: Import data in the designer displayName: designer, data, import, dataset, datastore href: ./v1/how-to-designer-import-data.md diff --git a/articles/machine-learning/v1/how-to-cicd-data-ingestion.md b/articles/machine-learning/v1/how-to-cicd-data-ingestion.md deleted file mode 100644 index 61b07b09fed..00000000000 --- a/articles/machine-learning/v1/how-to-cicd-data-ingestion.md +++ /dev/null @@ -1,480 +0,0 @@ ---- -title: DevOps for a data ingestion pipeline -titleSuffix: Azure Machine Learning -description: Learn how to apply DevOps practices to build a data ingestion 
pipeline to prepare data using Azure Data Factory and Azure Databricks. -services: machine-learning -ms.service: azure-machine-learning -ms.subservice: mlops -ms.topic: how-to -ms.custom: UpdateFrequency5, data4ml -ms.author: larryfr -author: Blackmist -manager: davete -ms.reviewer: iefedore -ms.date: 08/17/2022 -# Customer intent: As an experienced data engineer, I need to create a production data ingestion pipeline for the data used to train my models. ---- - -# DevOps for a data ingestion pipeline - -In most scenarios, a data ingestion solution is a composition of scripts, service invocations, and a pipeline orchestrating all the activities. In this article, you learn how to apply DevOps practices to the development lifecycle of a common data ingestion pipeline that prepares data for machine learning model training. The pipeline is built using the following Azure services: - -* __Azure Data Factory__: Reads the raw data and orchestrates data preparation. -* __Azure Databricks__: Runs a Python notebook that transforms the data. -* __Azure Pipelines__: Automates a continuous integration and development process. - -## Data ingestion pipeline workflow - -The data ingestion pipeline implements the following workflow: - -1. Raw data is read into an Azure Data Factory (ADF) pipeline. -1. The ADF pipeline sends the data to an Azure Databricks cluster, which runs a Python notebook to transform the data. -1. The data is stored to a blob container, where it can be used by Azure Machine Learning to train a model. - -![data ingestion pipeline workflow](media/how-to-cicd-data-ingestion/data-ingestion-pipeline.png) - -## Continuous integration and delivery overview - -As with many software solutions, there is a team (for example, Data Engineers) working on it. They collaborate and share the same Azure resources such as Azure Data Factory, Azure Databricks, and Azure Storage accounts. The collection of these resources is a Development environment. 
The data engineers contribute to the same source code base. - -A continuous integration and delivery system automates the process of building, testing, and delivering (deploying) the solution. The Continuous Integration (CI) process performs the following tasks: - -* Assembles the code -* Checks it with the code quality tests -* Runs unit tests -* Produces artifacts such as tested code and Azure Resource Manager templates - -The Continuous Delivery (CD) process deploys the artifacts to the downstream environments. - -![cicd data ingestion diagram](media/how-to-cicd-data-ingestion/cicd-data-ingestion.png) - -This article demonstrates how to automate the CI and CD processes with [Azure Pipelines](https://azure.microsoft.com/services/devops/pipelines/). - -## Source control management - -Source control management is needed to track changes and enable collaboration between team members. -For example, the code would be stored in an Azure DevOps, GitHub, or GitLab repository. The collaboration workflow is based on a branching model. - -### Python Notebook Source Code - -The data engineers work with the Python notebook source code either locally in an IDE (for example, [Visual Studio Code](https://code.visualstudio.com)) or directly in the Databricks workspace. Once the code changes are complete, they are merged to the repository following a branching policy. - -> [!TIP] -> We recommended storing the code in `.py` files rather than in `.ipynb` Jupyter Notebook format. It improves the code readability and enables automatic code quality checks in the CI process. - -### Azure Data Factory Source Code - -The source code of Azure Data Factory pipelines is a collection of JSON files generated by an Azure Data Factory workspace. Normally the data engineers work with a visual designer in the Azure Data Factory workspace rather than with the source code files directly. 
- -To configure the workspace to use a source control repository, see [Author with Azure Repos Git integration](/azure/data-factory/source-control#author-with-azure-repos-git-integration). - -## Continuous integration (CI) - -The ultimate goal of the Continuous Integration process is to gather the joint team work from the source code and prepare it for the deployment to the downstream environments. As with the source code management this process is different for the Python notebooks and Azure Data Factory pipelines. - -### Python Notebook CI - -The CI process for the Python Notebooks gets the code from the collaboration branch (for example, ***master*** or ***develop***) and performs the following activities: -* Code linting -* Unit testing -* Saving the code as an artifact - -The following code snippet demonstrates the implementation of these steps in an Azure DevOps ***yaml*** pipeline: - -```yaml -steps: -- script: | - flake8 --output-file=$(Build.BinariesDirectory)/lint-testresults.xml --format junit-xml - workingDirectory: '$(Build.SourcesDirectory)' - displayName: 'Run flake8 (code style analysis)' - -- script: | - python -m pytest --junitxml=$(Build.BinariesDirectory)/unit-testresults.xml $(Build.SourcesDirectory) - displayName: 'Run unit tests' - -- task: PublishTestResults@2 - condition: succeededOrFailed() - inputs: - testResultsFiles: '$(Build.BinariesDirectory)/*-testresults.xml' - testRunTitle: 'Linting & Unit tests' - failTaskOnFailedTests: true - displayName: 'Publish linting and unit test results' - -- publish: $(Build.SourcesDirectory) - artifact: di-notebooks -``` - -The pipeline uses [flake8](https://pypi.org/project/flake8/) to do the Python code linting. It runs the unit tests defined in the source code and publishes the linting and test results so they're available in the Azure Pipelines execution screen. 
- -If the linting and unit testing is successful, the pipeline will copy the source code to the artifact repository to be used by the subsequent deployment steps. - -### Azure Data Factory CI - -CI process for an Azure Data Factory pipeline is a bottleneck for a data ingestion pipeline. -There's no continuous integration. A deployable artifact for Azure Data Factory is a collection of Azure Resource Manager templates. The only way to produce those templates is to click the ***publish*** button in the Azure Data Factory workspace. - -1. The data engineers merge the source code from their feature branches into the collaboration branch, for example, ***master*** or ***develop***. -1. Someone with the granted permissions clicks the ***publish*** button to generate Azure Resource Manager templates from the source code in the collaboration branch. -1. The workspace validates the pipelines (think of it as of linting and unit testing), generates Azure Resource Manager templates (think of it as of building) and saves the generated templates to a technical branch ***adf_publish*** in the same code repository (think of it as of publishing artifacts). This branch is created automatically by the Azure Data Factory workspace. - -For more information on this process, see [Continuous integration and delivery in Azure Data Factory](/azure/data-factory/continuous-integration-delivery). - -It's important to make sure that the generated Azure Resource Manager templates are environment agnostic. This means that all values that may differ between environments are parametrized. Azure Data Factory is smart enough to expose the majority of such values as parameters. 
For example, in the following template the connection properties to an Azure Machine Learning workspace are exposed as parameters: - -```json -{ - "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#", - "contentVersion": "1.0.0.0", - "parameters": { - "factoryName": { - "value": "devops-ds-adf" - }, - "AzureMLService_servicePrincipalKey": { - "value": "" - }, - "AzureMLService_properties_typeProperties_subscriptionId": { - "value": "0fe1c235-5cfa-4152-17d7-5dff45a8d4ba" - }, - "AzureMLService_properties_typeProperties_resourceGroupName": { - "value": "devops-ds-rg" - }, - "AzureMLService_properties_typeProperties_servicePrincipalId": { - "value": "6e35e589-3b22-4edb-89d0-2ab7fc08d488" - }, - "AzureMLService_properties_typeProperties_tenant": { - "value": "72f988bf-86f1-41af-912b-2d7cd611db47" - } - } -} -``` - -However, you may want to expose your custom properties that are not handled by the Azure Data Factory workspace by default. In the scenario of this article an Azure Data Factory pipeline invokes a Python notebook processing the data. The notebook accepts a parameter with the name of an input data file. - -```Python -import pandas as pd -import numpy as np - -data_file_name = getArgument("data_file_name") -data = pd.read_csv(data_file_name) - -labels = np.array(data['target']) -... -``` - -This name is different for ***Dev***, ***QA***, ***UAT***, and ***PROD*** environments. In a complex pipeline with multiple activities, there can be several custom properties. 
It's good practice to collect all those values in one place and define them as pipeline ***variables***: - -![Screenshot shows a Notebook called PrepareData and M L Execute Pipeline called M L Execute Pipeline at the top with the Variables tab selected below with the option to add new variables, each with a name, type, and default value.](media/how-to-cicd-data-ingestion/adf-variables.png) - -The pipeline activities may refer to the pipeline variables while actually using them: - -![Screenshot shows a Notebook called PrepareData and M L Execute Pipeline called M L Execute Pipeline at the top with the Settings tab selected below.](media/how-to-cicd-data-ingestion/adf-notebook-parameters.png) - -The Azure Data Factory workspace ***doesn't*** expose pipeline variables as Azure Resource Manager templates parameters by default. The workspace uses the [Default Parameterization Template](/azure/data-factory/continuous-integration-delivery-resource-manager-custom-parameters) dictating what pipeline properties should be exposed as Azure Resource Manager template parameters. To add pipeline variables to the list, update the `"Microsoft.DataFactory/factories/pipelines"` section of the [Default Parameterization Template](/azure/data-factory/continuous-integration-delivery-resource-manager-custom-parameters) with the following snippet and place the result json file in the root of the source folder: - -```json -"Microsoft.DataFactory/factories/pipelines": { - "properties": { - "variables": { - "*": { - "defaultValue": "=" - } - } - } - } -``` - -Doing so will force the Azure Data Factory workspace to add the variables to the parameters list when the ***publish*** button is clicked: - -```json -{ - "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#", - "contentVersion": "1.0.0.0", - "parameters": { - "factoryName": { - "value": "devops-ds-adf" - }, - ... 
- "data-ingestion-pipeline_properties_variables_data_file_name_defaultValue": { - "value": "driver_prediction_train.csv" - } - } -} -``` - -The values in the JSON file are default values configured in the pipeline definition. They're expected to be overridden with the target environment values when the Azure Resource Manager template is deployed. - -## Continuous delivery (CD) - -The Continuous Delivery process takes the artifacts and deploys them to the first target environment. It makes sure that the solution works by running tests. If successful, it continues to the next environment. - -The CD Azure Pipelines consists of multiple stages representing the environments. Each stage contains [deployments](/azure/devops/pipelines/process/deployment-jobs) and [jobs](/azure/devops/pipelines/process/phases?tabs=yaml) that perform the following steps: - -* Deploy a Python Notebook to Azure Databricks workspace -* Deploy an Azure Data Factory pipeline -* Run the pipeline -* Check the data ingestion result - -The pipeline stages can be configured with [approvals](/azure/devops/pipelines/process/approvals?tabs=check-pass) and [gates](/azure/devops/pipelines/release/approvals/gates) that provide additional control on how the deployment process evolves through the chain of environments. 
- -### Deploy a Python Notebook - -The following code snippet defines an Azure Pipeline [deployment](/azure/devops/pipelines/process/deployment-jobs) that copies a Python notebook to a Databricks cluster: - -```yaml -- stage: 'Deploy_to_QA' - displayName: 'Deploy to QA' - variables: - - group: devops-ds-qa-vg - jobs: - - deployment: "Deploy_to_Databricks" - displayName: 'Deploy to Databricks' - timeoutInMinutes: 0 - environment: qa - strategy: - runOnce: - deploy: - steps: - - task: UsePythonVersion@0 - inputs: - versionSpec: '3.x' - addToPath: true - architecture: 'x64' - displayName: 'Use Python3' - - - task: configuredatabricks@0 - inputs: - url: '$(DATABRICKS_URL)' - token: '$(DATABRICKS_TOKEN)' - displayName: 'Configure Databricks CLI' - - - task: deploynotebooks@0 - inputs: - notebooksFolderPath: '$(Pipeline.Workspace)/di-notebooks' - workspaceFolder: '/Shared/devops-ds' - displayName: 'Deploy (copy) data processing notebook to the Databricks cluster' -``` - -The artifacts produced by the CI are automatically copied to the deployment agent and are available in the `$(Pipeline.Workspace)` folder. In this case, the deployment task refers to the `di-notebooks` artifact containing the Python notebook. This [deployment](/azure/devops/pipelines/process/deployment-jobs) uses the [Databricks Azure DevOps extension](https://marketplace.visualstudio.com/items?itemName=riserrad.azdo-databricks) to copy the notebook files to the Databricks workspace. - -The `Deploy_to_QA` stage contains a reference to the `devops-ds-qa-vg` variable group defined in the Azure DevOps project. The steps in this stage refer to the variables from this variable group (for example, `$(DATABRICKS_URL)` and `$(DATABRICKS_TOKEN)`). The idea is that the next stage (for example, `Deploy_to_UAT`) will operate with the same variable names defined in its own UAT-scoped variable group. 
- -### Deploy an Azure Data Factory pipeline - -A deployable artifact for Azure Data Factory is an Azure Resource Manager template. It's going to be deployed with the ***Azure Resource Group Deployment*** task as it is demonstrated in the following snippet: - -```yaml - - deployment: "Deploy_to_ADF" - displayName: 'Deploy to ADF' - timeoutInMinutes: 0 - environment: qa - strategy: - runOnce: - deploy: - steps: - - task: AzureResourceGroupDeployment@2 - displayName: 'Deploy ADF resources' - inputs: - azureSubscription: $(AZURE_RM_CONNECTION) - resourceGroupName: $(RESOURCE_GROUP) - location: $(LOCATION) - csmFile: '$(Pipeline.Workspace)/adf-pipelines/ARMTemplateForFactory.json' - csmParametersFile: '$(Pipeline.Workspace)/adf-pipelines/ARMTemplateParametersForFactory.json' - overrideParameters: -data-ingestion-pipeline_properties_variables_data_file_name_defaultValue "$(DATA_FILE_NAME)" -``` -The value of the data filename parameter comes from the `$(DATA_FILE_NAME)` variable defined in a QA stage variable group. Similarly, all parameters defined in ***ARMTemplateForFactory.json*** can be overridden. If they are not, then the default values are used. - -### Run the pipeline and check the data ingestion result - -The next step is to make sure that the deployed solution is working. The following job definition runs an Azure Data Factory pipeline with a [PowerShell script](https://github.com/microsoft/DataOps/tree/master/adf/utils) and executes a Python notebook on an Azure Databricks cluster. The notebook checks if the data has been ingested correctly and validates the result data file with `$(bin_FILE_NAME)` name. 
- -```yaml - - job: "Integration_test_job" - displayName: "Integration test job" - dependsOn: [Deploy_to_Databricks, Deploy_to_ADF] - pool: - vmImage: 'ubuntu-latest' - timeoutInMinutes: 0 - steps: - - task: AzurePowerShell@4 - displayName: 'Execute ADF Pipeline' - inputs: - azureSubscription: $(AZURE_RM_CONNECTION) - ScriptPath: '$(Build.SourcesDirectory)/adf/utils/Invoke-ADFPipeline.ps1' - ScriptArguments: '-ResourceGroupName $(RESOURCE_GROUP) -DataFactoryName $(DATA_FACTORY_NAME) -PipelineName $(PIPELINE_NAME)' - azurePowerShellVersion: LatestVersion - - task: UsePythonVersion@0 - inputs: - versionSpec: '3.x' - addToPath: true - architecture: 'x64' - displayName: 'Use Python3' - - - task: configuredatabricks@0 - inputs: - url: '$(DATABRICKS_URL)' - token: '$(DATABRICKS_TOKEN)' - displayName: 'Configure Databricks CLI' - - - task: executenotebook@0 - inputs: - notebookPath: '/Shared/devops-ds/test-data-ingestion' - existingClusterId: '$(DATABRICKS_CLUSTER_ID)' - executionParams: '{"bin_file_name":"$(bin_FILE_NAME)"}' - displayName: 'Test data ingestion' - - - task: waitexecution@0 - displayName: 'Wait until the testing is done' -``` - -The final task in the job checks the result of the notebook execution. If it returns an error, it sets the status of pipeline execution to failed. - -## Putting pieces together - -The complete CI/CD Azure Pipeline consists of the following stages: -* CI -* Deploy To QA - * Deploy to Databricks + Deploy to ADF - * Integration Test - -It contains a number of ***Deploy*** stages equal to the number of target environments you have. Each ***Deploy*** stage contains two [deployments](/azure/devops/pipelines/process/deployment-jobs) that run in parallel and a [job](/azure/devops/pipelines/process/phases?tabs=yaml) that runs after deployments to test the solution on the environment. 
- -A sample implementation of the pipeline is assembled in the following ***yaml*** snippet: - -```yaml -variables: -- group: devops-ds-vg - -stages: -- stage: 'CI' - displayName: 'CI' - jobs: - - job: "CI_Job" - displayName: "CI Job" - pool: - vmImage: 'ubuntu-latest' - timeoutInMinutes: 0 - steps: - - task: UsePythonVersion@0 - inputs: - versionSpec: '3.x' - addToPath: true - architecture: 'x64' - displayName: 'Use Python3' - - script: pip install --upgrade flake8 flake8_formatter_junit_xml - displayName: 'Install flake8' - - checkout: self - - script: | - flake8 --output-file=$(Build.BinariesDirectory)/lint-testresults.xml --format junit-xml - workingDirectory: '$(Build.SourcesDirectory)' - displayName: 'Run flake8 (code style analysis)' - - script: | - python -m pytest --junitxml=$(Build.BinariesDirectory)/unit-testresults.xml $(Build.SourcesDirectory) - displayName: 'Run unit tests' - - task: PublishTestResults@2 - condition: succeededOrFailed() - inputs: - testResultsFiles: '$(Build.BinariesDirectory)/*-testresults.xml' - testRunTitle: 'Linting & Unit tests' - failTaskOnFailedTests: true - displayName: 'Publish linting and unit test results' - - # The CI stage produces two artifacts (notebooks and ADF pipelines). 
- # The pipelines Azure Resource Manager templates are stored in a technical branch "adf_publish" - - publish: $(Build.SourcesDirectory)/$(Build.Repository.Name)/code/dataingestion - artifact: di-notebooks - - checkout: git://${{variables['System.TeamProject']}}@adf_publish - - publish: $(Build.SourcesDirectory)/$(Build.Repository.Name)/devops-ds-adf - artifact: adf-pipelines - -- stage: 'Deploy_to_QA' - displayName: 'Deploy to QA' - variables: - - group: devops-ds-qa-vg - jobs: - - deployment: "Deploy_to_Databricks" - displayName: 'Deploy to Databricks' - timeoutInMinutes: 0 - environment: qa - strategy: - runOnce: - deploy: - steps: - - task: UsePythonVersion@0 - inputs: - versionSpec: '3.x' - addToPath: true - architecture: 'x64' - displayName: 'Use Python3' - - - task: configuredatabricks@0 - inputs: - url: '$(DATABRICKS_URL)' - token: '$(DATABRICKS_TOKEN)' - displayName: 'Configure Databricks CLI' - - - task: deploynotebooks@0 - inputs: - notebooksFolderPath: '$(Pipeline.Workspace)/di-notebooks' - workspaceFolder: '/Shared/devops-ds' - displayName: 'Deploy (copy) data processing notebook to the Databricks cluster' - - deployment: "Deploy_to_ADF" - displayName: 'Deploy to ADF' - timeoutInMinutes: 0 - environment: qa - strategy: - runOnce: - deploy: - steps: - - task: AzureResourceGroupDeployment@2 - displayName: 'Deploy ADF resources' - inputs: - azureSubscription: $(AZURE_RM_CONNECTION) - resourceGroupName: $(RESOURCE_GROUP) - location: $(LOCATION) - csmFile: '$(Pipeline.Workspace)/adf-pipelines/ARMTemplateForFactory.json' - csmParametersFile: '$(Pipeline.Workspace)/adf-pipelines/ARMTemplateParametersForFactory.json' - overrideParameters: -data-ingestion-pipeline_properties_variables_data_file_name_defaultValue "$(DATA_FILE_NAME)" - - job: "Integration_test_job" - displayName: "Integration test job" - dependsOn: [Deploy_to_Databricks, Deploy_to_ADF] - pool: - vmImage: 'ubuntu-latest' - timeoutInMinutes: 0 - steps: - - task: AzurePowerShell@4 - displayName: 
'Execute ADF Pipeline' - inputs: - azureSubscription: $(AZURE_RM_CONNECTION) - ScriptPath: '$(Build.SourcesDirectory)/adf/utils/Invoke-ADFPipeline.ps1' - ScriptArguments: '-ResourceGroupName $(RESOURCE_GROUP) -DataFactoryName $(DATA_FACTORY_NAME) -PipelineName $(PIPELINE_NAME)' - azurePowerShellVersion: LatestVersion - - task: UsePythonVersion@0 - inputs: - versionSpec: '3.x' - addToPath: true - architecture: 'x64' - displayName: 'Use Python3' - - - task: configuredatabricks@0 - inputs: - url: '$(DATABRICKS_URL)' - token: '$(DATABRICKS_TOKEN)' - displayName: 'Configure Databricks CLI' - - - task: executenotebook@0 - inputs: - notebookPath: '/Shared/devops-ds/test-data-ingestion' - existingClusterId: '$(DATABRICKS_CLUSTER_ID)' - executionParams: '{"bin_file_name":"$(bin_FILE_NAME)"}' - displayName: 'Test data ingestion' - - - task: waitexecution@0 - displayName: 'Wait until the testing is done' - -``` - -## Next steps - -* [Source Control in Azure Data Factory](/azure/data-factory/source-control) -* [Continuous integration and delivery in Azure Data Factory](/azure/data-factory/continuous-integration-delivery) -* [DevOps for Azure Databricks](https://marketplace.visualstudio.com/items?itemName=riserrad.azdo-databricks) diff --git a/breadcrumb/azure-ai/toc.yml b/breadcrumb/azure-ai/toc.yml index 24a4f665463..1e5849a0ff9 100644 --- a/breadcrumb/azure-ai/toc.yml +++ b/breadcrumb/azure-ai/toc.yml @@ -26,8 +26,8 @@ items: tocHref: /azure/ai-services/custom-vision-service/ topicHref: /azure/ai-services/custom-vision-service/ - name: Language Understanding (LUIS) - tocHref: /azure/ai-services/LUIS/ - topicHref: /azure/ai-services/LUIS/ + tocHref: /azure/ai-services/luis/ + topicHref: /azure/ai-services/luis/ - name: QnA Maker tocHref: /azure/ai-services/qnamaker/ topicHref: /azure/ai-services/qnamaker/