PyTorch Neuron (transformers-neuronx) Samples for AWS Inf2 & Trn1

This directory contains sample Jupyter Notebooks demonstrating tensor parallel inference for various PyTorch large language models (LLMs) on AWS Inferentia (Inf2) and AWS Trainium (Trn1) instances.

For additional information on these samples, please refer to the tutorials in the official Inferentia and Trainium documentation.

Inference

The following samples are available for LLM tensor parallel inference:

Name | Instance type
--- | ---
facebook/opt-13b | Inf2 & Trn1
facebook/opt-30b | Inf2 & Trn1
facebook/opt-66b | Inf2
meta-llama/Llama-2-13b | Inf2 & Trn1
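
For readers new to the idea, the tensor parallelism these samples rely on can be illustrated with a minimal pure-Python sketch. It is not the transformers-neuronx API and does not run on Neuron devices; the function names and the `tp_degree` parameter are hypothetical and chosen only for illustration. It assumes a column-parallel split, where each device holds a slice of a layer's weight columns and computes the matching slice of the output.

```python
# Illustrative only: a column-parallel linear layer in plain Python.
# None of these names come from transformers-neuronx; they are hypothetical.

def matmul(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def split_columns(w, tp_degree):
    """Split w column-wise into tp_degree equal shards, one per device."""
    cols = len(w[0])
    assert cols % tp_degree == 0, "column count must divide evenly"
    step = cols // tp_degree
    return [[row[k * step:(k + 1) * step] for row in w] for k in range(tp_degree)]

def tensor_parallel_matmul(x, w, tp_degree):
    """Each shard computes a slice of the output; the slices are concatenated."""
    out = []
    for shard in split_columns(w, tp_degree):
        # On real hardware the shards would be processed concurrently.
        out.extend(matmul(x, shard))
    return out

x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
result = tensor_parallel_matmul(x, w, tp_degree=2)
print(result)
```

The sharded result matches the unsharded `matmul(x, w)`, which is the property that lets a model too large for one device be spread across several.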