An LLM benchmark for Svelte 5 based on the HumanEval methodology.
Work in progress
SvelteBench evaluates LLM-generated Svelte components by testing them against predefined test suites. It works by sending prompts to LLMs, generating Svelte components, and verifying their functionality through automated tests.
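Conceptually, each benchmark run follows a HumanEval-style generate-and-verify loop. The sketch below is illustrative only and does not mirror SvelteBench's internals; it assumes the OpenAI SDK, one `prompt.md` per test, Vitest as the test runner, and arbitrary choices for the model name and output path.

```ts
import { readFileSync, writeFileSync } from "node:fs";
import { execSync } from "node:child_process";
import OpenAI from "openai";

// Illustrative sketch of the generate-and-verify loop (not SvelteBench's actual code).
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function runTest(testDir: string): Promise<boolean> {
  // 1. Send the test's prompt to the LLM.
  const prompt = readFileSync(`${testDir}/prompt.md`, "utf8");
  const response = await client.chat.completions.create({
    model: "gpt-4o", // hypothetical model choice
    messages: [{ role: "user", content: prompt }],
  });

  // 2. Save the generated Svelte component next to the test file.
  writeFileSync(`${testDir}/Component.svelte`, response.choices[0].message.content ?? "");

  // 3. Verify the component with the test suite; a non-zero exit means failure.
  try {
    execSync(`npx vitest run ${testDir}/test.ts`, { stdio: "inherit" });
    return true;
  } catch {
    return false;
  }
}
```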
To get started, install the dependencies and create your environment file:

```bash
nvm use
npm install

# Create .env file from example
cp .env.example .env
```
Then edit the `.env` file and add your API keys:
```env
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# Anthropic Configuration
ANTHROPIC_API_KEY=your_anthropic_api_key_here
```
```bash
# Run the benchmark with settings from .env file
npm start
```
After running the benchmark, you can visualize the results using the built-in visualization tool:
```bash
npm run build
```
You can now find the visualization in the `dist` directory.
To add a new test:

1. Create a new directory in `src/tests/` with the name of your test
2. Create a `prompt.md` file with instructions for the LLM
3. Create a `test.ts` file with Vitest tests for the generated component (see the sketch after the example structure below)

Example structure:
```
src/tests/your-test/
├── prompt.md   # Instructions for the LLM
└── test.ts     # Tests for the generated component
```
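For reference, a `test.ts` might look like the minimal sketch below. It assumes the generated component is written to `Component.svelte` inside the test directory and that `@testing-library/svelte` is available; adjust the import path and assertions to match the actual harness and whatever behaviour your `prompt.md` asks for.

```ts
import { describe, expect, it } from "vitest";
import { render, screen } from "@testing-library/svelte";
// Assumed location of the LLM-generated component; adjust to the harness's actual output path.
import Component from "./Component.svelte";

describe("generated counter component", () => {
  it("renders the initial count", () => {
    render(Component);
    // Example assertion only; it should match what prompt.md asks the LLM to build.
    expect(screen.getByText("Count: 0")).toBeTruthy();
  });
});
```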
After running the benchmark, results are saved to a JSON file in the `benchmarks` directory. The file is named `benchmark-results-{timestamp}.json`.
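If you want to inspect a run programmatically, a results file can be loaded like any other JSON. The snippet below is a minimal sketch that picks the most recent `benchmark-results-*.json`; it makes no assumptions about the fields inside the file.

```ts
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Pick the most recent results file from the benchmarks directory.
const dir = "benchmarks";
const latest = readdirSync(dir)
  .filter((f) => f.startsWith("benchmark-results-") && f.endsWith(".json"))
  .sort()
  .at(-1);

if (latest) {
  const results = JSON.parse(readFileSync(join(dir, latest), "utf8"));
  console.log(`Loaded ${latest}`, results);
}
```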