GitHub - pytorch-labs/applied-ai: Applied AI experiments and examples for PyTorch

Applied AI repo

For experiments and research on Applied AI.

Projects

Kernels

A collection of Triton and CUDA kernels for training and inference.

Kernels labeled as inference kernels do not support the backward pass.

Triton Kernels

1 - Triton - MoE (Mixtral) GEMM for accelerating inference. Uses a column-major access pattern to increase data locality.
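As a rough illustration of what a column-major access pattern can mean for a tiled GEMM, the sketch below maps a linear program id to output-tile coordinates. It is plain Python, not the repo's Triton kernel: the function name `tile_coords` and its parameters are hypothetical, and it assumes the pattern refers to the order in which output tiles are scheduled. With column-major ordering, consecutive program ids walk down one column of output tiles, so they reuse the same column of the weight matrix while it is likely still in cache.

```python
from typing import Tuple


def tile_coords(pid: int, num_tiles_m: int, num_tiles_n: int,
                column_major: bool = True) -> Tuple[int, int]:
    """Map a linear program id to (tile_row, tile_col) of the output matrix.

    Hypothetical helper for illustration only; not taken from the repo.
    """
    if column_major:
        # Consecutive pids share tile_col, so they read the same
        # column of weight tiles one after another.
        return pid % num_tiles_m, pid // num_tiles_m
    # Row-major: consecutive pids share tile_row instead.
    return pid // num_tiles_n, pid % num_tiles_n


if __name__ == "__main__":
    # A 3 x 2 grid of output tiles, visited in both orders.
    print([tile_coords(p, 3, 2) for p in range(6)])
    print([tile_coords(p, 3, 2, column_major=False) for p in range(6)])
```

In a Triton kernel the same index arithmetic would typically be applied to the value returned by `tl.program_id(0)`; whether it helps depends on matrix shapes and cache sizes.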

2 - Triton - Fused Softmax for both training and inference.
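For reference, here is a minimal sketch of the computation a fused softmax kernel performs on each row, written in plain Python rather than Triton. The function names are hypothetical and this is not the repo's implementation. A fused kernel would usually carry out the max, exponentiate, sum, and normalize steps in a single pass over a row held in fast memory, instead of launching a separate operation for each step.

```python
import math
from typing import List


def softmax_row(row: List[float]) -> List[float]:
    """Numerically stable softmax of one row (reference only)."""
    row_max = max(row)                      # subtract the max to avoid overflow
    exps = [math.exp(x - row_max) for x in row]
    denom = sum(exps)
    return [e / denom for e in exps]


def softmax(matrix: List[List[float]]) -> List[List[float]]:
    """Apply softmax independently to every row."""
    return [softmax_row(row) for row in matrix]


if __name__ == "__main__":
    print(softmax([[1.0, 2.0, 3.0], [1000.0, 1000.0]]))
```

A reference like this is mainly useful for checking a kernel's output within a tolerance.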

3 - Triton - Fused RMSNorm for both training and inference.

Fused RMSNorm Kernel
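The sketch below shows the standard RMSNorm formula, y = x / sqrt(mean(x^2) + eps) * weight, for a single row in plain Python. It is a reference for the math only: the name `rms_norm`, the default `eps`, and the list-based interface are assumptions, and the repo's Triton kernel may differ in details such as precision handling.

```python
import math
from typing import List


def rms_norm(row: List[float], weight: List[float],
             eps: float = 1e-6) -> List[float]:
    """RMSNorm of one row: scale by the inverse root mean square,
    then by a per-element weight (reference only)."""
    mean_sq = sum(x * x for x in row) / len(row)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [x * inv_rms * w for x, w in zip(row, weight)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

With unit weights and `eps` set to zero, the mean of the squared outputs is 1, which gives a quick sanity check for a kernel's forward pass.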

Other projects from Applied AI

CUDA Mode - Reading group for learning CUDA programming - (Discord, Lecture Materials, Lecture recordings)
llama-recipes - Recipes for fine-tuning and inference for Llama model series
NeurIPS'23 LLM Efficiency Challenge - 1LLM + 1GPU + 1Day competition - (website, code, NeurIPS Workshop recordings)

Papers and Publications

PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation (paper)
Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK Work Decomposition (paper)
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel (paper)
Sustainable AI: Environmental Implications, Challenges and Opportunities (paper)

License

The applied-ai repo is released under the BSD 3-Clause license.

