Browser-based AI that learns to play Pokémon Silver. No server required.
TALLGRASS combines an LLM (Qwen2.5-1.5B via WebGPU), a trainable policy network (TensorFlow.js), and a Game Boy Color emulator (binjgb/WASM) to play Pokémon Silver entirely client-side.
The curriculum follows the canonical Pokémon Silver progression: Johto story → Elite Four + Champion → all 8 Kanto gym badges (16 badges total). Episode management and checkpoint-based save states keep long-horizon training from stalling.
git clone https://github.com/sidmohan0/tallgrass.git
cd tallgrass/svelte-app
npm install
npm run dev
                    ┌───────────────────┐
                    │    Application    │
                    │      Startup      │
                    └─────────┬─────────┘
                              │
                              ▼
                    ┌───────────────────┐
                    │   Configuration   │
                    │    Resolution     │
                    └─────────┬─────────┘
                              │
                              ▼
                    ┌───────────────────┐
                    │  Runtime Engine   │
                    │   Construction    │
                    └─────────┬─────────┘
                              │
           ┌──────────────────┴──────────────────┐
           │                                     │
           ▼                                     ▼
┌─────────────────────┐              ┌────────────────────────┐
│    Deterministic    │              │  LLM-Enhanced Engine   │
│       Engine        │              │                        │
│                     │              │  ┌──────────────────┐  │
│  • Rules            │              │  │ LLM Provider     │  │
│  • Heuristics       │              │  │                  │  │
│                     │              │  │ • WebLLM (local) │  │
└─────────────────────┘              │  │ • Qwen2.5-1.5B   │  │
                                     │  └────────┬─────────┘  │
                                     │           │            │
                                     │  ┌────────▼─────────┐  │
                                     │  │  Policy Network  │  │
                                     │  │ (TensorFlow.js)  │  │
                                     │  └──────────────────┘  │
                                     └────────────────────────┘
                              │
                              ▼
                    ┌───────────────────┐
                    │ Interaction Flow  │
                    │ (New / Continue)  │
                    └─────────┬─────────┘
                              │
                              ▼
                    ┌───────────────────┐
                    │   User Actions    │
                    │ → Engine → Output │
                    └───────────────────┘
                            ┌───────────────────┐
                            │   Player Action   │
                            │ (Message / Move)  │
                            └─────────┬─────────┘
                                      │
                                      ▼
                            ┌───────────────────┐
                            │  Intent Parsing   │
                            │  (Battle / Info)  │
                            └─────────┬─────────┘
                                      │
          ┌───────────────────────────┼───────────────────────────┐
          │                           │                           │
          ▼                           ▼                           ▼
┌──────────────────┐        ┌────────────────────┐      ┌──────────────────┐
│ Battle Decision  │        │ Game Knowledge     │      │ Meta / Utility   │
│                  │        │                    │      │                  │
│ • Move choice    │        │ • Type matchups    │      │ • Help           │
│ • Switch logic   │        │ • Abilities        │      │ • Rules          │
│ • Risk assessment│        │ • Status effects   │      │ • Settings       │
└─────────┬────────┘        └─────────┬──────────┘      └─────────┬────────┘
          │                           │                           │
          ▼                           ▼                           ▼
┌──────────────────┐        ┌────────────────────┐      ┌──────────────────┐
│ Outcome Modeling │        │ Knowledge Lookup   │      │ Static Responses │
│                  │        │                    │      │                  │
│ • Damage ranges  │        │ • Dex data         │      │ • Deterministic  │
│ • Speed checks   │        │ • Learnsets        │      │ • No AI needed   │
│ • KO chances     │        │ • Items / Moves    │      │                  │
└─────────┬────────┘        └─────────┬──────────┘      └──────────────────┘
          │                           │
          └─────────────┬─────────────┘
                        ▼
              ┌───────────────────┐
              │ Decision Synthesis│
              │   (Best Action)   │
              └─────────┬─────────┘
                        │
                        ▼
              ┌───────────────────┐
              │  Player Feedback  │
              │  (Explain / Act)  │
              └───────────────────┘
Emulator (emulator.js): binjgb WASM wrapper. Direct RAM access via readMemory(), save/load state.
State Detection (silver-state.js): Reads Gen 2 memory addresses (map group/number, badges, battle state, coordinates). SilverStateDetector provides game mode detection (overworld/menu/battle/dialogue) and location tracking.
LLM Engine (llm.js): Qwen2.5-1.5B via WebLLM. WebGPU-accelerated inference. Model weights cached in browser (~1.5GB).
Policy Network (browser-trainer.js): TensorFlow.js feedforward network (12 features → 64 → 32 → 6 actions). Trained via experience replay with prioritized sampling.
Agent (agent.js + rl-agent.js): Coordinates LLM planning and policy network. Action masking restricts action space by game mode. Menu guardrails prevent spam.
Curriculum (curriculum/silver.js): 47 ordered checkpoints across 3 phases (Johto → League → Kanto). Completion conditions detect badges, locations, story flags.
Episode Manager (episode-manager.js): Ends an episode on a party wipe, at the 100k-step cap, or when the watchdog counts 50k steps without checkpoint progress. A checkpoint save is created automatically whenever a curriculum checkpoint completes. Resets preserve curriculum progress.
Action Masking (action-mask.js): Mode-based filtering (overworld: directional+A, menu: directional+A+B, dialogue: A only, battle: battle actions). Menu guardrails enforce novelty and cooldown.
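The mode-based filtering described above can be sketched as a boolean mask applied to the policy's six outputs before an action is chosen. This is a minimal illustration, not the code in action-mask.js: the action order in `ACTIONS`, the names `maskFor` and `maskedArgmax`, and the choice to allow all six buttons in battle are assumptions, since the README only says "battle actions".

```javascript
// Hypothetical action order; the README only states that the policy has 6 outputs.
const ACTIONS = ["UP", "DOWN", "LEFT", "RIGHT", "A", "B"];

// Allowed buttons per game mode, following the list in the README.
const MODE_ACTIONS = {
  overworld: ["UP", "DOWN", "LEFT", "RIGHT", "A"],
  menu: ["UP", "DOWN", "LEFT", "RIGHT", "A", "B"],
  dialogue: ["A"],
  // Assumption: the README does not enumerate "battle actions".
  battle: ["UP", "DOWN", "LEFT", "RIGHT", "A", "B"],
};

// Build a boolean mask aligned with ACTIONS for the given mode.
function maskFor(mode) {
  const allowed = MODE_ACTIONS[mode];
  if (!allowed) throw new Error(`unknown game mode: ${mode}`);
  return ACTIONS.map((action) => allowed.includes(action));
}

// Pick the highest-scoring action among those the mask allows.
function maskedArgmax(logits, mode) {
  const mask = maskFor(mode);
  let best = -1;
  for (let i = 0; i < logits.length; i++) {
    if (mask[i] && (best === -1 || logits[i] > logits[best])) best = i;
  }
  return ACTIONS[best];
}
```

With logits `[5, 1, 0, 0, 2, 9]`, the unmasked best action would be B, but in the overworld B is filtered out and the sketch falls back to UP; in dialogue only A survives the mask.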
Johto (0-8 badges): Starter → All 8 Johto gym badges
League (8 badges): Victory Road → Elite Four → Champion. Hall of Fame is a midpoint, not completion.
Kanto (9-16 badges): S.S. Aqua → All 8 Kanto gym badges. Completion = 16 badges total.
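As a rough sketch of how the three phases relate to badge counts, the function below maps a progress snapshot to a phase name. The function name `curriculumPhase` and the field names are hypothetical; the actual checkpoint logic in curriculum/silver.js tracks 47 checkpoints and is not reproduced here.

```javascript
// Map a progress snapshot to a curriculum phase.
// Field names (johtoBadges, kantoBadges, championDefeated) are illustrative assumptions.
function curriculumPhase({ johtoBadges, kantoBadges, championDefeated }) {
  if (johtoBadges < 8) return "johto";
  // Eight Johto badges but no Hall of Fame entry yet: still in the League phase.
  if (!championDefeated) return "league";
  // The Hall of Fame is a midpoint: Kanto continues until all 16 badges are held.
  if (johtoBadges + kantoBadges < 16) return "kanto";
  return "complete";
}
```

Note that beating the Champion alone never returns `"complete"`; only the 16-badge total does, which mirrors the "midpoint, not completion" rule.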
Termination: Party wipe (true blackout, not scripted losses), max steps (100k), or watchdog (50k steps without checkpoint progress).
Checkpoint Saves: Created automatically when curriculum checkpoints complete. Only most recent save kept. Stored in localStorage as tallgrass-checkpoint-{id}.
Reset Behavior: Loads most recent checkpoint save, preserves curriculum progress and badges, resets episode counters.
Scripted Loss Handling: Early rival fight losses don't trigger resets. Only true wipes that result in forced healing terminate episodes.
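The termination rules above reduce to three checks per step. The sketch below is one way to express them; the episode fields (`steps`, `lastProgressStep`, `partyWiped`, `scriptedLoss`) and the function name are assumptions for illustration and may not match episode-manager.js.

```javascript
const MAX_STEPS = 100000; // hard cap per episode
const WATCHDOG_STEPS = 50000; // steps allowed without checkpoint progress

// Returns the reason an episode should end, or null if it should continue.
// `ep` is a hypothetical snapshot object, not the project's actual episode type.
function terminationReason(ep) {
  // Scripted losses (e.g. an early rival fight) do not count as a wipe.
  if (ep.partyWiped && !ep.scriptedLoss) return "party_wipe";
  if (ep.steps >= MAX_STEPS) return "max_steps";
  if (ep.steps - ep.lastProgressStep >= WATCHDOG_STEPS) return "watchdog";
  return null;
}
```

For example, an episode at step 60,000 whose last checkpoint completed at step 5,000 would end with `"watchdog"`, while the same episode with progress at step 20,000 would continue.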
| Data Type | Storage | Size |
|---|---|---|
| LLM Weights | Browser Cache | ~1.5GB |
| Policy Network | IndexedDB | ~10-50MB |
| Experience Buffer | IndexedDB | Variable |
| Checkpoint Saves | localStorage | ~1-5MB |
| Curriculum Progress | localStorage | <1KB |
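To illustrate the "only the most recent save is kept" rule for checkpoint saves, here is a minimal sketch against the standard Web Storage interface (`length`, `key`, `getItem`, `setItem`, `removeItem`). The function names, the JSON byte-array encoding, and the `memoryStorage` stand-in for non-browser runs are assumptions; only the `tallgrass-checkpoint-{id}` key format comes from this README.

```javascript
const KEY_PREFIX = "tallgrass-checkpoint-";

// Store a save state under tallgrass-checkpoint-{id}, dropping any older checkpoint save.
function saveCheckpoint(storage, checkpointId, stateBytes) {
  const stale = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key && key.startsWith(KEY_PREFIX)) stale.push(key);
  }
  stale.forEach((key) => storage.removeItem(key));
  // Assumption: bytes are stored as a JSON array for simplicity, not compactness.
  storage.setItem(KEY_PREFIX + checkpointId, JSON.stringify(Array.from(stateBytes)));
}

// Return { id, state } for the single stored checkpoint, or null if none exists.
function loadLatestCheckpoint(storage) {
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key && key.startsWith(KEY_PREFIX)) {
      const state = Uint8Array.from(JSON.parse(storage.getItem(key)));
      return { id: key.slice(KEY_PREFIX.length), state };
    }
  }
  return null;
}

// In-memory stand-in for localStorage so the sketch also runs outside a browser.
function memoryStorage() {
  const map = new Map();
  return {
    get length() { return map.size; },
    key: (i) => Array.from(map.keys())[i] ?? null,
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem: (k, v) => { map.set(k, String(v)); },
    removeItem: (k) => { map.delete(k); },
  };
}
```

In the browser, `saveCheckpoint(window.localStorage, "some-id", bytes)` would be the equivalent call. Keys without the prefix, such as curriculum progress, are left untouched.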
| Address | Purpose |
|---|---|
| 0xD35D | Map group (0-26) |
| 0xD35E | Map number within group |
| 0xD361 | Player Y coordinate |
| 0xD362 | Player X coordinate |
| 0xD057 | Battle type (0 = no battle) |
| 0xD125 | Text box ID (0 = no dialogue/menu) |
| 0xCC26 | Menu type |
| 0xD356 | Johto badges (bits 0-7) |
| 0xD357 | Kanto badges (bits 0-7) |
| 0xD163 | Party count |
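The table above can be turned into a small state reader. The sketch below assumes a `readMemory(address)` callback that returns one byte; the README names `readMemory()` but does not give its signature, so treat that as an assumption. The rule used to tell menus from dialogue (a non-zero menu type) is also a guess for illustration, not the logic of `SilverStateDetector`.

```javascript
// Addresses copied from the table above.
const ADDR = {
  mapGroup: 0xd35d,
  mapNumber: 0xd35e,
  playerY: 0xd361,
  playerX: 0xd362,
  battleType: 0xd057,
  textBoxId: 0xd125,
  menuType: 0xcc26,
  johtoBadges: 0xd356,
  kantoBadges: 0xd357,
  partyCount: 0xd163,
};

// Count set bits in one byte; each badge occupies one bit.
function popcount8(byte) {
  let count = 0;
  for (let bits = byte & 0xff; bits !== 0; bits >>= 1) count += bits & 1;
  return count;
}

// Read a snapshot of game state through a caller-supplied readMemory(address) function.
function readState(readMemory) {
  let mode = "overworld";
  if (readMemory(ADDR.battleType) !== 0) {
    mode = "battle";
  } else if (readMemory(ADDR.textBoxId) !== 0) {
    // Assumption: a non-zero menu type separates menus from plain dialogue.
    mode = readMemory(ADDR.menuType) !== 0 ? "menu" : "dialogue";
  }
  return {
    mode,
    mapGroup: readMemory(ADDR.mapGroup),
    mapNumber: readMemory(ADDR.mapNumber),
    x: readMemory(ADDR.playerX),
    y: readMemory(ADDR.playerY),
    johtoBadges: popcount8(readMemory(ADDR.johtoBadges)),
    kantoBadges: popcount8(readMemory(ADDR.kantoBadges)),
    partyCount: readMemory(ADDR.partyCount),
  };
}
```

Passing the reader in as a function keeps the sketch testable with a plain `Map` of address to byte, without a running emulator.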
Base Rewards: Badge +1000, Pokémon caught +100, level up +50, new map +200, trainer battle won +75, wild battle won +30. Penalties: whiteout -500, fainted -100.
Progress Rewards: Continuous shaping based on distance to next checkpoint.
Adaptive Rewards: Every 10 steps the LLM generates verifiable tests, scored on a 50-1000 scale.
Curriculum Checkpoints: Badge 1000, key item 500, location 200, event 300, story 400.
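The base reward values listed above amount to a lookup table summed over the events of a step. The sketch below shows that, plus one possible form of distance-based shaping; the event key names, the function names, and the "reward equals reduction in distance" rule are assumptions, since the README does not specify how progress shaping is computed.

```javascript
// Reward values from the README; the event key names are illustrative.
const BASE_REWARDS = {
  badge: 1000,
  pokemonCaught: 100,
  levelUp: 50,
  newMap: 200,
  trainerBattleWon: 75,
  wildBattleWon: 30,
  whiteout: -500,
  fainted: -100,
};

// Sum rewards for a step given event counts, e.g. { badge: 1, fainted: 2 }.
function baseReward(events) {
  let total = 0;
  for (const [name, count] of Object.entries(events)) {
    if (!(name in BASE_REWARDS)) throw new Error(`unknown event: ${name}`);
    total += BASE_REWARDS[name] * count;
  }
  return total;
}

// Assumed shaping rule: reward the reduction in distance to the next checkpoint.
// Moving closer yields a positive value, moving away a negative one.
function progressReward(previousDistance, currentDistance, scale = 1) {
  return scale * (previousDistance - currentDistance);
}
```

For instance, earning a badge while two party members faint nets `1000 - 200 = 800`, and closing the distance to the next checkpoint from 10 to 7 tiles adds 3 at the default scale.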
cd svelte-app
npm run build
npm run preview
Deploy svelte-app/build/ to any static hosting.
TALLGRASS/
├── svelte-app/                  # Main SvelteKit application
│   ├── src/
│   │   ├── lib/
│   │   │   ├── components/      # UI components
│   │   │   ├── core/            # Core game logic
│   │   │   │   ├── game/        # Game-specific modules
│   │   │   │   ├── curriculum/  # Curriculum system
│   │   │   │   └── ...
│   │   │   └── stores/          # Svelte stores
│   │   └── routes/              # SvelteKit routes
└── README.md
MIT
Built by Abdillahi Nur