Account Setup
Get your API keys and configure your environment to upload benchmark results.
Create Your Account
Sign up at app.cliwatch.com. You'll land in your personal workspace. This is where benchmark results are stored and where you manage team access.
Each workspace has its own API keys, benchmark history, and team members.
Get Your CLIWATCH_API_KEY
This key authenticates uploads and CLI commands against the CLIWatch API.
- Go to app.cliwatch.com and click API Keys in the sidebar
- Click Create API Key and give it a label (e.g., "CI" or "local dev")
- Copy the key immediately (it starts with cw_ and is scoped to the current workspace)
API keys are only displayed once at creation time. If you lose a key, revoke it and create a new one.
You can create multiple keys (e.g., one for CI, one for local dev). Revoke keys anytime from the same page.
Get Your AI_GATEWAY_API_KEY
This key authenticates LLM calls through the Vercel AI Gateway. All model calls (Anthropic, OpenAI, Google, etc.) go through a single gateway, so you only need one key for all providers.
- Go to the Vercel AI Gateway dashboard
- Create an API key (it starts with vck_)
- Configure which LLM providers to enable (Anthropic, OpenAI, Google, etc.)
The AI Gateway key is not a CLIWatch key. It is issued by Vercel, which is a separate service. You need both keys to run benchmarks.
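Because the two keys come from different services and use different prefixes, a quick pre-flight check can catch a missing or swapped key before a run. The sketch below is not part of CLIWatch: the function name check_key is hypothetical, and it only confirms that a variable is set and starts with the prefix described above. It cannot tell whether a key is valid or has been revoked.

```shell
#!/bin/sh
# Hypothetical helper (not part of CLIWatch): verify that an environment
# variable is set and begins with the expected prefix.
# Usage: check_key VAR_NAME PREFIX
check_key() {
    name=$1
    prefix=$2
    # Read the variable whose name is stored in $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    case $value in
        "$prefix"*) return 0 ;;
        *)
            echo "$name does not start with $prefix" >&2
            return 1
            ;;
    esac
}
```

For example, `check_key CLIWATCH_API_KEY cw_ && check_key AI_GATEWAY_API_KEY vck_` succeeds only when both variables are set and carry the expected prefixes.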
Add Secrets to CI
GitHub Actions
- Go to your repo → Settings → Secrets and variables → Actions
- Click New repository secret and add:
- CLIWATCH_API_KEY: your cw_... key
- AI_GATEWAY_API_KEY: your vck_... key
Secrets are available as ${{ secrets.CLIWATCH_API_KEY }} in workflows. See the full GitHub Actions guide.
GitLab CI
- Go to your project → Settings → CI/CD → Variables
- Add both variables with:
- Masked enabled (hides values in logs)
- Protected enabled if you only run benchmarks on protected branches
See the full GitLab CI guide.
Verify Your Setup
Test your CLIWatch API key
export CLIWATCH_API_KEY="cw_..."
cliwatch runs
If the key is valid, you'll see your recent benchmark runs (or an empty list for new workspaces).
Test your config (no keys needed)
cli-bench --dry-run
This validates your cli-bench.yaml and prints the prompt that would be sent to the LLM, without making any API calls. Use it to verify your task suite before spending API credits.
Run a real benchmark
export CLIWATCH_API_KEY="cw_..."
export AI_GATEWAY_API_KEY="vck_..."
cli-bench --upload
Results appear at app.cliwatch.com. See the Dashboard Guide for how to interpret them.
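The dry run and the real run can be chained so that credits are only spent once the config passes validation. This is a minimal sketch, not an official script: the function name run_benchmark and the BENCH_CMD override are hypothetical, and it assumes that cli-bench --dry-run exits with a non-zero status when validation fails, which the text above does not state. Both keys are expected to be exported already.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of CLIWatch): validate first, upload second.
# BENCH_CMD defaults to the cli-bench command used above; it can be overridden
# so the wrapper can be exercised without the real tool.
# Assumption: "cli-bench --dry-run" exits non-zero when validation fails.
run_benchmark() {
    bench=${BENCH_CMD:-cli-bench}
    if ! "$bench" --dry-run; then
        echo "dry run failed; skipping upload" >&2
        return 1
    fi
    # Only reached when the dry run succeeded.
    "$bench" --upload
}
```

Calling run_benchmark in place of cli-bench --upload keeps the two steps in a fixed order. If the dry run turns out to report problems without a failing exit status, the check would need to inspect its output instead.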