An LLM benchmark for Svelte 5 based on the HumanEval methodology.
Work in progress
SvelteBench evaluates LLM-generated Svelte components by testing them against predefined test suites. It works by sending prompts to LLMs, collecting the generated Svelte components, and verifying their functionality through automated tests.
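Conceptually, a run boils down to a loop like the sketch below. The helper types and names here are illustrative only, not the actual SvelteBench internals:

```ts
// Illustrative sketch of the benchmark loop; "callLlm" and "runVitest" stand in
// for whatever the harness really uses and are passed in as assumptions here.
type LlmCall = (model: string, prompt: string) => Promise<string>;
type TestRunner = (testName: string, componentSource: string) => Promise<boolean>;

async function runBenchmark(
  models: string[],
  tests: { name: string; prompt: string }[],
  callLlm: LlmCall,
  runVitest: TestRunner,
): Promise<{ model: string; test: string; passed: boolean }[]> {
  const results: { model: string; test: string; passed: boolean }[] = [];
  for (const model of models) {
    for (const test of tests) {
      // Ask the LLM to generate a Svelte component from the test prompt.
      const componentSource = await callLlm(model, test.prompt);
      // Verify the generated component with the test's predefined Vitest suite.
      const passed = await runVitest(test.name, componentSource);
      results.push({ model, test: test.name, passed });
    }
  }
  return results;
}
```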
```bash
nvm use
npm install

# Create .env file from example
cp .env.example .env
```
Then edit the `.env` file and add your API keys.
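The variable names below are an assumption for illustration; copy the real ones from `.env.example`:

```bash
# Key names are illustrative; use the actual variables from .env.example
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-openai-key
```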
```bash
# Run the benchmark with settings from .env file
npm start
```
For faster development, you can enable debug mode in your `.env` file:
```bash
DEBUG_MODE=true
DEBUG_PROVIDER=anthropic
DEBUG_MODEL=claude-3-7-sonnet-20250219
DEBUG_TEST=counter
```
Debug mode runs only one provider/model combination, making it much faster for testing during development.
You can provide a context file (like Svelte documentation) to help the LLM generate better components:
```bash
# Run with a context file
npm run run-tests -- --context ./context/svelte.dev/llms-small.txt && npm run build
```
The context file will be included in the prompt to the LLM, providing additional information for generating components.
After running the benchmark, you can visualize the results using the built-in visualization tool:
```bash
npm run build
```
You can now find the visualization in the `dist` directory.
To add a new test:

1. Create a new directory in `src/tests/` with the name of your test
2. Add a `prompt.md` file with instructions for the LLM
3. Add a `test.ts` file with Vitest tests for the generated component (a sketch follows the example structure below)

Example structure:
```
src/tests/your-test/
├── prompt.md   # Instructions for the LLM
└── test.ts     # Tests for the generated component
```
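A minimal `test.ts` might look like the sketch below. It assumes the generated component is importable next to the test (the import path is an assumption) and that `@testing-library/svelte` is available; adapt it to how the harness actually wires tests:

```ts
// Hypothetical test.ts sketch for a "counter" test; the import path and the
// expected markup are assumptions, not the actual SvelteBench test suite.
import { describe, it, expect } from "vitest";
import { render, screen, fireEvent } from "@testing-library/svelte";
import Counter from "./Counter.svelte"; // assumed location of the generated component

describe("counter", () => {
  it("increments the displayed count when the button is clicked", async () => {
    render(Counter);
    const button = screen.getByRole("button");
    await fireEvent.click(button);
    // Assumes the component renders the current count as text.
    expect(screen.getByText("1")).toBeTruthy();
  });
});
```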
After running the benchmark, results are saved to a JSON file in the `benchmarks` directory. The file is named `benchmark-results-{timestamp}.json`.
When running with a context file, the results filename will include "with-context" in the name: `benchmark-results-with-context-{timestamp}.json`.