From 43962731d5af8168231d9b0645acecd46284ff63 Mon Sep 17 00:00:00 2001
From: Dmitry Matora
Date: Thu, 2 May 2024 05:52:24 +0300
Subject: [PATCH] Init

---
 .gitignore |  1 +
 README.md  | 21 +++++++++++++++
 data.json  |  3 +++
 index.html | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 100 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.md
 create mode 100644 data.json
 create mode 100644 index.html

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..485dee6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+.idea
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..459b227
--- /dev/null
+++ b/README.md
@@ -0,0 +1,21 @@
+# LLM Inference Speeds
+
+This repository contains benchmark data for various Large Language Models (LLMs), based on their inference speeds measured in tokens per second. The benchmarks are performed across different hardware configurations using the prompt "tell a story".
+
+## About the Data
+
+The data represents the performance of several LLMs, detailing the tokens processed per second on specific hardware setups. Each entry includes the model name, the hardware used, and the measured speed.
+
+## Explore the Benchmarks
+
+You can view and interact with the benchmark data through a searchable table on our GitHub Pages site. Use the search field to filter by model name and compare performance across different hardware setups.
+
+**[View the Inference Speeds Table](https://dmatora.github.io/inference-speed/)**
+
+## Contributing
+
+Contributions to the benchmark data are welcome! Please refer to the contributing guidelines for more information on how you can contribute.
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
diff --git a/data.json b/data.json
new file mode 100644
index 0000000..b27b2c9
--- /dev/null
+++ b/data.json
@@ -0,0 +1,3 @@
+[
+  {"model": "Model Mistral Instruct 7B Q4", "hardware": "i7-7700HQ", "speed": "3 tokens/sec", "proof": "https://github.com/dmatora/inference-speed/issues/1"}
+]
diff --git a/index.html b/index.html
new file mode 100644
index 0000000..8464157
--- /dev/null
+++ b/index.html
@@ -0,0 +1,75 @@
[index.html hunk omitted: its markup is not recoverable here; the page's title and heading are "LLM Inference Speeds" and it renders a table with Model, Hardware, Speed, and Proof columns]
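
For context, below is a minimal sketch of how such a page could load `data.json` and render the searchable table the README describes. It is illustrative only: the element ids (`search`, `rows`), the `fetch`-based loading, the inline script, and the "link" text in the Proof column are assumptions, not the markup of the committed 75-line index.html.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>LLM Inference Speeds</title>
</head>
<body>
<h1>LLM Inference Speeds</h1>
<input type="text" id="search" placeholder="Filter by model name">
<table>
    <thead>
    <tr><th>Model</th><th>Hardware</th><th>Speed</th><th>Proof</th></tr>
    </thead>
    <tbody id="rows"></tbody>
</table>
<script>
    // Fetch the benchmark entries and render one table row per entry,
    // re-rendering whenever the search field changes.
    fetch('data.json')
        .then(response => response.json())
        .then(entries => {
            const tbody = document.getElementById('rows');
            const render = filter => {
                tbody.innerHTML = '';
                entries
                    .filter(e => e.model.toLowerCase().includes(filter.toLowerCase()))
                    .forEach(e => {
                        const row = tbody.insertRow();
                        row.insertCell().textContent = e.model;
                        row.insertCell().textContent = e.hardware;
                        row.insertCell().textContent = e.speed;
                        const link = document.createElement('a');
                        link.href = e.proof;
                        link.textContent = 'link';
                        row.insertCell().appendChild(link);
                    });
            };
            render('');
            document.getElementById('search')
                .addEventListener('input', event => render(event.target.value));
        });
</script>
</body>
</html>
```

Keeping the benchmarks in a standalone data.json means a contribution is just one more object in that array; the page picks it up without any markup changes.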