Rakoll/nlp ft component sample (#2448)
* Merge branch 'main' of https://github.com/Azure/azureml-examples

* Fixed typos in ner sweep notebook

* Fixed a typo in the registry

* Revert to head version

* Updated NLP notebooks with subgraph orchestration feature

* Fixed a typo

* Removed snippet for listing models from azureml-staging registry

* Added link to sweep notebook in all three NLP notebooks

* Updated notebooks to include pipeline_id_override flag

* Updated compute to NC6s_v3 sku

* Removed pipeline_id override flag from job properties.

* Updated compute creation

* Fixed formatting

* Added create compute cell for component runtime section
raviskolli authored Aug 22, 2023
1 parent 4df5a43 commit 25f87b8
Showing 4 changed files with 433 additions and 0 deletions.
@@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -15,6 +16,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -46,6 +48,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -88,6 +91,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -111,6 +115,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -174,6 +179,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -183,6 +189,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -242,6 +249,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -288,6 +296,108 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.3 Runs with models from Hugging Face (Preview)\n",
"\n",
"In addition to the model algorithms supported natively by AutoML, you can launch individual runs to explore any model algorithm from the Hugging Face transformers library that supports text classification. Refer to this [documentation](https://huggingface.co/models?pipeline_tag=text-classification&library=transformers&sort=trending) for the list of supported models.\n",
"\n",
"If you wish to try a specific model algorithm (for example, microsoft/deberta-large-mnli), you can configure your AutoML NLP job as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Compute target setup\n",
"\n",
"from azure.ai.ml.entities import AmlCompute\n",
"from azure.core.exceptions import ResourceNotFoundError\n",
"\n",
"compute_name = \"gpu-cluster-nc6s-v3\"\n",
"\n",
"try:\n",
" _ = ml_client.compute.get(compute_name)\n",
" print(\"Found existing compute target.\")\n",
"except ResourceNotFoundError:\n",
" print(\"Creating a new compute target...\")\n",
" compute_config = AmlCompute(\n",
" name=compute_name,\n",
" type=\"amlcompute\",\n",
" size=\"Standard_NC6s_v3\",\n",
" idle_time_before_scale_down=120,\n",
" min_instances=0,\n",
" max_instances=4,\n",
" )\n",
" ml_client.begin_create_or_update(compute_config).result()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create the AutoML job with the related factory function.\n",
"\n",
"text_classification_hf_job = automl.text_classification(\n",
" experiment_name=exp_name,\n",
" compute=compute_name,\n",
" training_data=my_training_data_input,\n",
" validation_data=my_validation_data_input,\n",
" target_column_name=\"Sentiment\",\n",
" primary_metric=\"accuracy\",\n",
" tags={\"my_custom_tag\": \"My custom value\"},\n",
")\n",
"\n",
"text_classification_hf_job.set_limits(timeout_minutes=120)\n",
"text_classification_hf_job.set_featurization(dataset_language=dataset_language_code)\n",
"text_classification_hf_job.set_training_parameters(\n",
" model_name=\"roberta-base-openai-detector\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Submit the AutoML job\n",
"\n",
"returned_hf_job = ml_client.jobs.create_or_update(\n",
" text_classification_hf_job\n",
") # submit the job to the backend\n",
"\n",
"print(f\"Created job: {returned_hf_job}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.jobs.stream(returned_hf_job.name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.4 Hyperparameter Sweep Runs (Public Preview)\n",
"\n",
"AutoML allows you to easily train models for single-label text classification on your text data. You can control which model algorithm is used, specify hyperparameter values for your model, and perform a sweep across the hyperparameter space to generate an optimal model.\n",
"\n",
"When using AutoML for text tasks, you can specify the model algorithm using the `model_name` parameter. You can either specify a single model or choose to sweep over multiple models. Please refer to the <font color='blue'><a href=\"https://github.com/Azure/azureml-examples/blob/48957c70bd53912077e81a180f424f650b414107/sdk/python/jobs/automl-standalone-jobs/automl-nlp-text-named-entity-recognition-task-distributed-sweeping/automl-nlp-text-ner-task-distributed-with-sweeping.ipynb\">sweep notebook</a></font> for detailed instructions on configuring and submitting a sweep job."
]
},
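The sweep workflow described above can be sketched with the `azure.ai.ml` SDK. This is a minimal, hedged sketch, not the notebook's own code: it assumes the same `exp_name`, `compute_name`, and data inputs defined in earlier cells, the model names and hyperparameter ranges are illustrative placeholders, and the exact `SearchSpace` import path and parameter set should be checked against the linked sweep notebook.

```python
# Hypothetical sweep configuration for an AutoML NLP text-classification job.
# Assumes azure-ai-ml is installed and that exp_name, compute_name,
# my_training_data_input, and my_validation_data_input exist from prior cells.
from azure.ai.ml import automl
from azure.ai.ml.automl import SearchSpace
from azure.ai.ml.sweep import BanditPolicy, Choice, Uniform

text_classification_sweep_job = automl.text_classification(
    experiment_name=exp_name,
    compute=compute_name,
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="Sentiment",
    primary_metric="accuracy",
)

# Sweep over two candidate model algorithms and a learning-rate range
# (placeholder values for illustration only).
text_classification_sweep_job.extend_search_space(
    [
        SearchSpace(
            model_name=Choice(["bert-base-cased", "roberta-base"]),
            learning_rate=Uniform(1e-5, 5e-5),
        ),
    ]
)

# Random sampling with early termination of poorly performing trials.
text_classification_sweep_job.set_sweep(
    sampling_algorithm="Random",
    early_termination=BanditPolicy(evaluation_interval=2, slack_factor=0.05),
)
text_classification_sweep_job.set_limits(
    timeout_minutes=120, max_trials=4, max_concurrent_trials=2
)
```

As with the single-model run in section 2.3, the configured job would then be submitted with `ml_client.jobs.create_or_update(...)`; the sweep notebook linked above is the authoritative reference for this API.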
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -308,6 +418,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -366,6 +477,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -388,6 +500,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -419,6 +532,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -476,6 +590,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -561,6 +676,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -627,6 +743,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -702,6 +819,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -741,6 +859,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -807,6 +926,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
@@ -839,6 +959,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {