Benchmark Client

A small tool you run on your own computer to measure how fast your local AI models are. Results are shared anonymously with the community.

Download

Pick your platform. No installation needed — just download and run.

On Linux or macOS, make the file executable first: chmod +x llm-benchmark*
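In a terminal, that step might look like the following. The filename is a hypothetical example; substitute the name of the build you actually downloaded (the `touch` line only stands in for that download so the snippet is self-contained):

```shell
# Hypothetical filename for a Linux x86-64 build; yours will differ.
f=llm-benchmark-linux-amd64
touch "$f"              # stand-in for the file you downloaded
chmod +x "$f"           # grant execute permission, as described above
test -x "$f" && echo "ready: ./$f"
```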

How it works

You need Ollama, LM Studio, vLLM, or HF TGI running locally with at least one model loaded. The whole process takes about 5–15 minutes.
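Each of these tools serves a local HTTP API, so a client can check whether one is running by probing its usual port. The sketch below shows one way to do that; the port numbers are the common defaults and are assumptions, since users can configure different ones:

```python
import socket

# Common default ports for each tool's local API server (assumptions;
# your installation may use different ports).
CANDIDATES = {
    "Ollama": 11434,
    "LM Studio": 1234,
    "vLLM": 8000,
    "HF TGI": 8080,
}

def detect(host="127.0.0.1", timeout=0.2):
    """Return the names of tools whose default port accepts a TCP connection."""
    found = []
    for name, port in CANDIDATES.items():
        with socket.socket() as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:
                found.append(name)
    return found

print(detect())  # e.g. ["Ollama"] if only Ollama is running
```

A real client would follow this up with an API call (such as a model-list request) to confirm what is actually listening on the port.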

1
Run the client
Open a terminal, navigate to where you downloaded the file, and run it. It starts an interactive prompt.
2
Select your AI tool and model
The client automatically finds any running AI tools on your machine and lists the available models.
3
Wait for the benchmark
It sends a series of test prompts to your model and measures how fast it responds.
4
Results are submitted
Once done, the results are automatically added to the community leaderboard so everyone can compare.
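The timing part of step 3 can be sketched as follows: watch a stream of generated tokens, record how long the first one takes to arrive (time to first token), and divide the token count by the total elapsed time (generation speed). The `fake_stream` generator below simulates a model's output so the sketch is self-contained; a real client would consume the streaming response of the selected tool instead:

```python
import time

def benchmark(stream):
    """Measure time-to-first-token and tokens/second over a token stream."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in stream:
        if first is None:
            first = time.perf_counter() - start  # time to first token
        count += 1
    total = time.perf_counter() - start
    return {"ttft_s": first, "tokens_per_s": count / total}

def fake_stream(n=50, delay=0.001):
    """Simulated token stream standing in for a local model's output."""
    for _ in range(n):
        time.sleep(delay)
        yield "tok"

print(benchmark(fake_stream()))
```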

What data is collected

What is sent

  • GPU model and video memory size
  • CPU model and number of cores
  • Total system RAM
  • Operating system name and version
  • AI tool name and version (e.g. Ollama 0.3)
  • Model name and parameter size (e.g. llama3 8B)
  • Benchmark results: generation speed, memory usage, time to first token
  • The benchmark prompt texts and your model's responses to them (for server-side output quality scoring)
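Put together, a submitted record might look roughly like this. All field names and values are illustrative, not the client's actual schema:

```python
# Illustrative payload shape only; the real schema is an assumption.
payload = {
    "gpu": {"model": "RTX 4090", "vram_gb": 24},
    "cpu": {"model": "Ryzen 9 7950X", "cores": 16},
    "ram_gb": 64,
    "os": "Ubuntu 24.04",
    "tool": {"name": "Ollama", "version": "0.3"},
    "model": {"name": "llama3", "params": "8B"},
    "results": {"tokens_per_s": 92.4, "memory_gb": 6.1, "ttft_ms": 210},
    # Prompt/response pairs, used for server-side output quality scoring.
    "samples": [{"prompt": "Explain TCP in one paragraph.", "response": "..."}],
}
```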

What is never sent

  • Your name, email, or any personal information
  • Your IP address (seen by the server during upload, but never stored)
  • Your model weights or any local files
  • Your own conversations or chat history

The only purpose of data collection is to power the community comparison tables on this site. All submitted results are public and visible to everyone.