Add Rclone-Cloudflare download instructions to README.md
nathanw-mlc authored and pgmpablo157321 committed Feb 21, 2024
1 parent 5d0c221 commit dc94ae3
Showing 1 changed file with 25 additions and 20 deletions.
`recommendation/dlrm_v2/pytorch/README.md`
```
cd $HOME/mlcommons/inference/loadgen
CFLAGS="-std=c++14" python setup.py develop --user
```
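
As a quick sanity check, the loadgen Python bindings should now import cleanly (a minimal check, assuming the `develop --user` install is on your Python path):
```
python -c "import mlperf_loadgen; print('loadgen OK')"
```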


### Downloading model weights

framework | Size in bytes (`du *`) | MD5 hash (`md5sum *`)
-|-|-
pytorch | 97.31GB | -

#### CM method

The following MLCommons CM commands can be used to programmatically download the model checkpoint.

```
cm pull repo mlcommons@ck
cm run script --tags=get,ml-model,dlrm,_pytorch,_weight_sharded,_rclone -j
```
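
The `-j` flag prints the script's JSON output, which should include the local path of the downloaded checkpoint. If you want to inspect what CM has cached afterwards, something like the following should work (a sketch only; the `cm show cache` tags and output format may vary across cmind versions):
```
cm show cache --tags=ml-model,dlrm
```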

#### Manual method

The above command automatically runs a set of Rclone commands to download the data from a Cloudflare R2 bucket. However, if you'd like to run the Rclone commands manually, you can do so as follows:

To run Rclone on Windows, you can download the executable [here](https://rclone.org/install/#windows).
To install Rclone on Linux/macOS/BSD systems, run:
```
sudo -v ; curl https://rclone.org/install.sh | sudo bash
```
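
You can verify the installation before proceeding:
```
rclone version
```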
Once Rclone is installed, run the following command to authenticate with the bucket:
```
rclone config create mlc-inference s3 provider=Cloudflare access_key_id=f65ba5eef400db161ea49967de89f47b secret_access_key=fbea333914c292b854f14d3fe232bad6c5407bf0ab1bebf78833c2b359bdfd2b endpoint=https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com
```
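
To confirm the remote was configured correctly, you can list the bucket before starting the full download (a quick check, assuming the read-only credentials permit listing):
```
rclone lsd mlc-inference:mlcommons-inference-wg-public
```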
You can then navigate in the terminal to your desired download directory and run the following command to download the model weights:

```
rclone copy mlc-inference:mlcommons-inference-wg-public/model_weights ./model_weights -P
```

To speed up future downloads, we recommend saving the weights in a cloud bucket (e.g. GCP, AWS). For example, after saving the checkpoint in a GCP bucket, you can download the weights faster by running:
```
export BUCKET_NAME=<BUCKET_CONTAINING_MODEL>
cd $HOME/mlcommons/inference/recommendation/dlrm_v2/pytorch/model/
gsutil -m cp -r "gs://$BUCKET_NAME/model_weights/*" .
```

### Downloading dataset
| Original dataset | download link |
| ---- | ---- |
| Criteo Terabyte (day 23) | https://labs.criteo.com/2013/12/download-terabyte-click-logs/ |

#### (optional)

1. A fake Criteo dataset can be created in place of the real dataset to facilitate debugging and testing. We provide a fake (random) data generator that quickly produces samples in a format compatible with the original dataset. Run the following script in `./tools` to create random samples for the corresponding model; they will be placed in the `./fake_criteo` directory:
```
./make_fake_criteo.sh
mv ./fake_criteo .. && cd ..
export DATA_DIR=./fake_criteo
```
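
As a quick check that the fake dataset was generated, list the target directory (the exact file names depend on the generator):
```
ls -lh "$DATA_DIR"
```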


