# Building the Entailment Model

## Model Training

The process is documented in the Jupyter notebooks in this folder.

The relevant training arguments were as follows:

    num_train_epochs = 10,              # Total number of training epochs
    per_device_train_batch_size = 128,  # Batch size per device during training
    per_device_eval_batch_size = 256,   # Batch size for evaluation
    warmup_steps = 500,                 # Number of warmup steps for the learning rate scheduler
    weight_decay = 0.01,                # Strength of weight decay
    lr_scheduler_type = "inverse_sqrt", # Inverse square root learning rate schedule
    save_strategy = 'epoch',            # Save a checkpoint at the end of every epoch
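
For context, the sketch below shows how these arguments would typically be wired into a HuggingFace `TrainingArguments`/`Trainer` pair. The starting checkpoint, output directory, and toy dataset are illustrative placeholders, not the exact values used in the notebooks.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative starting checkpoint; any NLI-capable encoder could be substituted.
checkpoint = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

def tokenize(batch):
    # Premise/hypothesis pairs are encoded together, as in standard NLI.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, padding="max_length", max_length=64)

# Toy stand-in for the training data prepared in the notebooks.
toy = Dataset.from_dict({
    "premise": ["what is the cheapest flight from boston to denver"],
    "hypothesis": ["This example is about a flight"],
    "label": [2],  # entailment index in this checkpoint's label mapping
}).map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="entailment-model",
    num_train_epochs=10,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=256,
    warmup_steps=500,
    weight_decay=0.01,
    lr_scheduler_type="inverse_sqrt",
    save_strategy="epoch",
)

trainer = Trainer(model=model, args=training_args,
                  train_dataset=toy, eval_dataset=toy)
trainer.train()
```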

## Model Structure

To perform the inference, we need the following data (a minimal loading sketch follows the list):

- our fine-tuned model
- the tokenizer (which we did not change)
- the list of possible labels, including multilabels like `flight+airfare`
- the list of base labels, like `flight` and `airfare`
- the list of hypotheses for each base label
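
As a sketch, these artifacts could be loaded as follows, assuming the labels and hypotheses live in plain-text files such as the `labels.txt` and `base_labels.tsv` mentioned under Considerations; the checkpoint directory and the tab-separated "base label, hypothesis" layout are assumptions, not a description of the actual files.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "entailment-model"  # illustrative path to the fine-tuned checkpoint
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
tokenizer = AutoTokenizer.from_pretrained(model_dir)

# labels.txt: one possible label per line, including multilabels like "flight+airfare".
with open("labels.txt") as f:
    labels = [line.strip() for line in f if line.strip()]

# base_labels.tsv: assumed here to map each base label to its custom hypothesis.
base_hypotheses = {}
with open("base_labels.tsv") as f:
    for line in f:
        base_label, hypothesis = line.rstrip("\n").split("\t")
        base_hypotheses[base_label] = hypothesis
```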

We use custom hypotheses, for example

> This example asks for a rental car or taxi price

instead of the standard

> This example is about `ground_fare`
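
As a rough sketch of how such a hypothesis is used (reusing the model and tokenizer loaded above, and assuming an MNLI-style label mapping where index 2 is entailment), scoring a single base label amounts to running the utterance/hypothesis pair through the model and taking the entailment probability:

```python
import torch

def base_label_probability(premise: str, hypothesis: str) -> float:
    # Encode the user utterance (premise) together with the label hypothesis.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    return probs[2].item()  # entailment index is assumed to be 2

# Illustrative utterance paired with the custom hypothesis quoted above.
score = base_label_probability(
    "how much is a taxi from the airport to downtown",
    "This example asks for a rental car or taxi price",
)
```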

To improve performance, we run inference only on the base labels and combine them into multiclass labels (the "probability" of a multiclass label is computed as the sum of its base label probabilities with a penalty).

We return the top 3 label choices, provided their probability is above the 0.2 threshold. For the train and test sets, however, the probability is usually close to 1 for single-class labels, which reflects their relative simplicity.
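
The scoring logic can be sketched as follows, reusing the helpers above; the exact form of the penalty is an assumption made for illustration.

```python
def score_labels(premise: str, penalty: float = 0.1) -> list[tuple[str, float]]:
    # Run entailment inference once per base label only.
    base_probs = {base: base_label_probability(premise, hyp)
                  for base, hyp in base_hypotheses.items()}

    scores = {}
    for label in labels:
        parts = label.split("+")
        if len(parts) == 1:
            scores[label] = base_probs[label]
        else:
            # Multiclass labels: sum of base probabilities minus a penalty
            # (the actual penalty used by the service may differ).
            scores[label] = sum(base_probs[p] for p in parts) - penalty

    # Return at most the top 3 labels whose score clears the 0.2 threshold.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(label, s) for label, s in ranked[:3] if s > 0.2]
```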

## Considerations

We have considered how to simplify model deployment for a new set of intents. In the simplest case, intents can be added directly to `labels.txt` and `base_labels.tsv`. If that is not sufficient, a similar fine-tuning procedure can be performed.

A/B testing and blue/green deployment strategies are made easier by the service's ability to switch models dynamically.
