Grid Demo | Text Classification

In this demo, you'll train a text classification model using PyTorch Lightning, transformers, and datasets!

The full tutorial can be found in the Grid documentation here.

If you haven't already set up the Grid CLI, follow this 1-minute guide to install it.

TLDR: pip install lightning-grid --upgrade

grid login

Overview

This example involves three steps:

Downloading the data
Uploading the data to Grid using Grid datastores
Running the train.py training script using grid run

Upload the data to Grid

For this example we use the Lightning Flash IMDB dataset.

grid datastore create --source https://pl-flash-data.s3.amazonaws.com/imdb.zip --name imdb-ds

After submitting the command, check the status of the datastore with grid datastore list. Wait until the datastore's Status shows Succeeded before moving to the next step.
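The "wait until Succeeded" step can be scripted as a simple polling loop. The following is a minimal sketch, not part of the Grid CLI: `wait_for_status` and its `get_status` callback are hypothetical names, and the callback is assumed to return the current Status string (for example, by running grid datastore list and reading the Status value for imdb-ds). Parsing the CLI output is left out because its layout is not specified here.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "Succeeded",
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns target.

    Returns True once the target status is seen, or False if max_polls
    checks pass without seeing it. get_status is a caller-supplied
    (hypothetical) function that reports the datastore's current status.
    """
    for attempt in range(max_polls):
        if get_status() == target:
            return True
        # Do not sleep after the final check.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return False
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.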

Submit a training run with Grid

Training Parameters

Here are the parameters we'll specify to grid run:

Grid flags:

--instance_type: the cloud instance type, which determines the available GPUs and memory
--gpus: the number of GPUs per experiment
--datastore_name: the name of the datastore (created above) that you'd like to attach to this training run
--datastore_version: the version of the datastore to attach to this training run (defaults to 1)
--disk_size: the disk size in GB to allocate to each node in the cluster

Then we'll specify the script we're using to train our model followed by the script arguments.
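The ordering matters: Grid flags come before the script name, and everything after the script name is passed to the script. As an illustration only, here is a small sketch that assembles such a command as an argument list. `build_grid_run` is a hypothetical helper, not part of the Grid CLI; the flag names and values are the ones used in this README.

```python
import shlex
from typing import Dict, List


def build_grid_run(
    grid_flags: Dict[str, object],
    script: str,
    script_args: Dict[str, object],
) -> List[str]:
    """Return an argv list shaped as: grid run <grid flags> <script> <script args>."""
    argv = ["grid", "run"]
    # Flags placed before the script name are interpreted by Grid.
    for name, value in grid_flags.items():
        argv += [f"--{name}", str(value)]
    argv.append(script)
    # Flags placed after the script name are passed to the script itself.
    for name, value in script_args.items():
        argv += [f"--{name}", str(value)]
    return argv


command = build_grid_run(
    grid_flags={"instance_type": "p3.2xlarge", "gpus": 1, "datastore_name": "imdb-ds"},
    script="train.py",
    script_args={"max_epochs": 1},
)
print(shlex.join(command))
# prints: grid run --instance_type p3.2xlarge --gpus 1 --datastore_name imdb-ds train.py --max_epochs 1
```

The resulting list could be handed to `subprocess.run(command, check=True)` on a machine where the Grid CLI is installed and logged in.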

Script: train.py

These are the arguments defined by the train.py script:

Script arguments:

train_file
valid_file
test_file
max_epochs
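For readers who want to see how a script might expose these arguments, here is a minimal argparse sketch. It is an assumption about the shape of the interface, not a copy of this repository's train.py, which may define its arguments differently; the types, defaults, and `required` settings below are illustrative choices.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four script arguments listed above (illustrative)."""
    parser = argparse.ArgumentParser(description="Text classification training (sketch)")
    parser.add_argument("--train_file", type=str, required=True, help="Path to the training CSV")
    parser.add_argument("--valid_file", type=str, required=True, help="Path to the validation CSV")
    parser.add_argument("--test_file", type=str, required=True, help="Path to the test CSV")
    # Default of 1 is an assumption made for this sketch.
    parser.add_argument("--max_epochs", type=int, default=1, help="Number of training epochs")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


args = parse_args(
    [
        "--train_file", "/datastores/imdb-ds/train.csv",
        "--valid_file", "/datastores/imdb-ds/valid.csv",
        "--test_file", "/datastores/imdb-ds/test.csv",
        "--max_epochs", "1",
    ]
)
print(args.train_file, args.max_epochs)
# prints: /datastores/imdb-ds/train.csv 1
```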

Cool! Now we can spin up a Grid run.

Submit the command below to start a training run on a single GPU:

grid run \
    --name imdb-demo \
    --gpus 1 \
    --instance_type p3.2xlarge \
    --datastore_name imdb-ds \
    --disk_size 500 \
    train.py \
    --gpus 1 \
    --train_file /datastores/imdb-ds/train.csv \
    --valid_file /datastores/imdb-ds/valid.csv \
    --test_file /datastores/imdb-ds/test.csv \
    --max_epochs 1

You can use the grid status command to check on the status of the run. To view progress in the Grid UI, use grid view.
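The script arguments in the command above point at files under /datastores/imdb-ds, the path where this README expects the datastore to be mounted. If a run fails early, one thing worth ruling out is a missing file. The following is a small sketch of a check that could be placed at the start of a training script; `missing_files` is a hypothetical helper, not something provided by Grid or by this repository.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(root: Path, names: Iterable[str]) -> List[str]:
    """Return the entries of names that are not regular files under root."""
    return [name for name in names if not (root / name).is_file()]


# Mount path and file names as used in the grid run command in this README.
datastore_root = Path("/datastores/imdb-ds")
missing = missing_files(datastore_root, ["train.csv", "valid.csv", "test.csv"])
if missing:
    print(f"Missing from {datastore_root}: {', '.join(missing)}")
```

Run outside of Grid, this simply reports all three files as missing, since the mount path only exists inside the run.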

gridai/grid-text-classification

Folders and files

Latest commit

History

Repository files navigation

Grid Demo | Text Classification

Overview

Upload the data to Grid

Submit a training run with Grid

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages