Tallgrass

Browser-based AI that learns to play Pokémon Silver. LLM + policy network + GBC emulator, all client-side.

TALLGRASS combines an LLM (Qwen2.5-1.5B via WebGPU), a trainable policy network (TensorFlow.js), and a Game Boy Color emulator (binjgb/WASM) to play Pokémon Silver entirely client-side.

The curriculum follows a canonical Pokémon Silver progression: Johto story → Elite Four + Champion → all 8 Kanto gym badges (16 badges total). Episode management and checkpoint-based save states keep long-horizon training from stalling.

Requirements

Chrome/Edge 113+ (WebGPU)
Pokémon Silver ROM (.gbc)
~2GB free space (for LLM model download)
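
Because the LLM depends on WebGPU, a page can check for support before starting the model download. The following is a minimal sketch that assumes the standard `navigator.gpu.requestAdapter()` browser API; the function name `hasWebGPU` and the injectable `nav` parameter are illustrative and not taken from the project.

```javascript
// Returns true when WebGPU is exposed and an adapter can be obtained.
// `nav` defaults to the browser's navigator; it is a parameter only so the
// check can be exercised outside a browser.
async function hasWebGPU(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return false; // API not exposed (older browser)
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined; // null: no suitable GPU
  } catch {
    return false; // treat any failure as "not supported"
  }
}
```

Calling `hasWebGPU()` at startup lets the UI show a clear message on unsupported browsers instead of failing partway through the download.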

Quick Start

git clone https://github.com/sidmohan0/tallgrass.git
cd tallgrass/svelte-app
npm install
npm run dev

Architecture

                          ┌───────────────────┐
                          │   Application     │
                          │     Startup       │
                          └─────────┬─────────┘
                                    │
                                    ▼
                          ┌───────────────────┐
                          │ Configuration     │
                          │   Resolution      │
                          └─────────┬─────────┘
                                    │
                                    ▼
                          ┌───────────────────┐
                          │ Runtime Engine    │
                          │   Construction    │
                          └─────────┬─────────┘
                                    │
                 ┌──────────────────┴──────────────────┐
                 │                                     │
                 ▼                                     ▼
      ┌─────────────────────┐            ┌────────────────────────┐
      │  Deterministic      │            │  LLM-Enhanced Engine   │
      │      Engine         │            │                        │
      │                     │            │  ┌──────────────────┐  │
      │  • Rules            │            │  │ LLM Provider     │  │
      │  • Heuristics       │            │  │                  │  │
      │                     │            │  │ • WebLLM (local) │  │
      └─────────────────────┘            │  │ • Qwen2.5-1.5B   │  │
                                         │  └────────┬─────────┘  │
                                         │           │            │
                                         │  ┌────────▼─────────┐  │
                                         │  │ Policy Network   │  │
                                         │  │ (TensorFlow.js)  │  │
                                         │  └──────────────────┘  │
                                         └────────────────────────┘
                                    │
                                    ▼
                          ┌───────────────────┐
                          │ Interaction Flow  │
                          │ (New / Continue)  │
                          └─────────┬─────────┘
                                    │
                                    ▼
                          ┌───────────────────┐
                          │   User Actions    │
                          │ → Engine → Output │
                          └───────────────────┘

                          ┌───────────────────┐
                          │   Player Action   │
                          │ (Message / Move)  │
                          └─────────┬─────────┘
                                    │
                                    ▼
                          ┌───────────────────┐
                          │  Intent Parsing   │
                          │ (Battle / Info)   │
                          └─────────┬─────────┘
                                    │
        ┌───────────────────────────┼───────────────────────────┐
        │                           │                           │
        ▼                           ▼                           ▼
┌──────────────────┐     ┌────────────────────┐     ┌──────────────────┐
│ Battle Decision  │     │ Game Knowledge     │     │ Meta / Utility   │
│                  │     │                    │     │                  │
│ • Move choice    │     │ • Type matchups    │     │ • Help           │
│ • Switch logic   │     │ • Abilities        │     │ • Rules          │
│ • Risk assessment│     │ • Status effects   │     │ • Settings       │
└─────────┬────────┘     └─────────┬──────────┘     └─────────┬────────┘
          │                        │                          │
          ▼                        ▼                          ▼
┌──────────────────┐     ┌────────────────────┐     ┌──────────────────┐
│ Outcome Modeling │     │ Knowledge Lookup   │     │ Static Responses │
│                  │     │                    │     │                  │
│ • Damage ranges  │     │ • Dex data         │     │ • Deterministic  │
│ • Speed checks   │     │ • Learnsets        │     │ • No AI needed   │
│ • KO chances     │     │ • Items / Moves    │     │                  │
└─────────┬────────┘     └─────────┬──────────┘     └──────────────────┘
          │                        │
          └──────────────┬─────────┘
                         ▼
               ┌────────────────────┐
               │ Decision Synthesis │
               │ (Best Action)      │
               └─────────┬──────────┘
                         │
                         ▼
               ┌────────────────────┐
               │ Player Feedback    │
               │ (Explain / Act)    │
               └────────────────────┘

Components

Emulator (emulator.js): binjgb WASM wrapper. Direct RAM access via readMemory(), save/load state.

State Detection (silver-state.js): Reads Gen 2 memory addresses (map group/number, badges, battle state, coordinates). SilverStateDetector provides game mode detection (overworld/menu/battle/dialogue) and location tracking.
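
To illustrate how game mode detection could be derived from those memory reads, here is a rough sketch. It uses the battle type, text box ID, and menu type addresses listed under Memory Addresses; the `readByte` callback, the `detectMode` name, and the rule that a non-zero menu type separates menus from dialogue are assumptions, not the actual SilverStateDetector logic.

```javascript
// Addresses as listed in this README's "Memory Addresses (Gen 2)" table.
const ADDR = { battleType: 0xd057, textBoxId: 0xd125, menuType: 0xcc26 };

// `readByte(address)` is assumed to return one unsigned byte of emulator RAM.
function detectMode(readByte) {
  if (readByte(ADDR.battleType) !== 0) return 'battle';
  if (readByte(ADDR.textBoxId) === 0) return 'overworld';
  // Assumption: a non-zero menu type distinguishes menus from plain dialogue.
  return readByte(ADDR.menuType) !== 0 ? 'menu' : 'dialogue';
}

// Usage with a fake RAM map standing in for the emulator.
const fakeRam = new Map([[ADDR.textBoxId, 3]]);
const mode = detectMode((addr) => fakeRam.get(addr) ?? 0);
```

Checking the battle flag first means a text box shown during a battle is still classified as battle, which matches how the action mask treats the four modes as mutually exclusive.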

LLM Engine (llm.js): Qwen2.5-1.5B via WebLLM. WebGPU-accelerated inference. Model weights cached in browser (~1.5GB).

Policy Network (browser-trainer.js): TensorFlow.js feedforward network (12 features → 64 → 32 → 6 actions). Trained via experience replay with prioritized sampling.
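
The README does not say which prioritization scheme the replay buffer uses. As one possibility, the sketch below shows proportional prioritized sampling, where transition i is drawn with probability p_i^alpha / sum_j p_j^alpha. The function name, the `alpha` default of 0.6, and the injectable `rand` are assumptions made for illustration.

```javascript
// Proportional prioritized sampling: P(i) = p_i^alpha / sum_j p_j^alpha.
// Returns `batchSize` indices into `priorities` (sampled with replacement).
// `rand` is injectable so draws can be made deterministic.
function samplePrioritized(priorities, batchSize, alpha = 0.6, rand = Math.random) {
  const weights = priorities.map((p) => Math.pow(p, alpha));
  const total = weights.reduce((a, b) => a + b, 0);
  if (!(total > 0)) throw new Error('need at least one positive priority');
  const picks = [];
  for (let n = 0; n < batchSize; n++) {
    let r = rand() * total;
    let idx = weights.length - 1; // fallback guards against rounding error
    for (let i = 0; i < weights.length; i++) {
      r -= weights[i];
      if (r < 0) { idx = i; break; }
    }
    picks.push(idx);
  }
  return picks;
}
```

With `alpha = 0` every stored transition is equally likely; larger values concentrate sampling on high-priority transitions, typically those with large temporal-difference error.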

Agent (agent.js + rl-agent.js): Coordinates LLM planning and policy network. Action masking restricts action space by game mode. Menu guardrails prevent spam.

Curriculum (curriculum/silver.js): 47 ordered checkpoints across 3 phases (Johto → League → Kanto). Completion conditions detect badges, locations, story flags.
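
An ordered checkpoint list with completion conditions can be pictured as follows. This is only a sketch: the four entries, their ids, and the fields on `state` are hypothetical placeholders, not the 47 checkpoints defined in curriculum/silver.js.

```javascript
// Hypothetical checkpoint entries; each has a phase and a completion predicate.
const checkpoints = [
  { id: 'get-starter', phase: 'johto', isComplete: (s) => s.partyCount >= 1 },
  { id: 'badge-1', phase: 'johto', isComplete: (s) => s.johtoBadges >= 1 },
  { id: 'champion', phase: 'league', isComplete: (s) => s.flags.has('hall-of-fame') },
  { id: 'badge-16', phase: 'kanto', isComplete: (s) => s.johtoBadges + s.kantoBadges >= 16 },
];

// Advance past every checkpoint whose condition already holds, in order.
// Returns the new index and the ids completed on this call.
function advance(index, state, list = checkpoints) {
  const completed = [];
  while (index < list.length && list[index].isComplete(state)) {
    completed.push(list[index].id);
    index++;
  }
  return { index, completed };
}
```

Because the list is ordered, progress is a single index, which is cheap to persist and keeps later checkpoints from being credited before earlier ones.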

Episode Manager (episode-manager.js): Episode termination (party wipe, max steps 100k, watchdog 50k). Checkpoint saves created automatically on checkpoint completion. Resets preserve curriculum progress.

Action Masking (action-mask.js): Mode-based filtering (overworld: directional+A, menu: directional+A+B, dialogue: A only, battle: battle actions). Menu guardrails enforce novelty and cooldown.
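
The mode-based filtering described above can be sketched as a lookup table plus a masked argmax over the six policy outputs. The action order, the treatment of battle mode as allowing all six buttons, and the function names are assumptions; the menu guardrails (novelty and cooldown) are not modeled here.

```javascript
// Assumed order of the 6 policy outputs.
const ACTIONS = ['up', 'down', 'left', 'right', 'a', 'b'];
const DIRS = ['up', 'down', 'left', 'right'];

const ALLOWED = {
  overworld: [...DIRS, 'a'],
  menu: [...DIRS, 'a', 'b'],
  dialogue: ['a'],
  battle: ACTIONS, // assumption: the README only says "battle actions"
};

// Boolean mask aligned with ACTIONS; unknown modes allow everything.
function actionMask(mode) {
  const allowed = new Set(ALLOWED[mode] ?? ACTIONS);
  return ACTIONS.map((a) => allowed.has(a));
}

// Pick the highest-scoring action among those the mask allows.
function pickAction(scores, mode) {
  const mask = actionMask(mode);
  let best = -1;
  for (let i = 0; i < scores.length; i++) {
    if (mask[i] && (best === -1 || scores[i] > scores[best])) best = i;
  }
  return ACTIONS[best];
}
```

Masking before selection, rather than penalizing invalid actions through reward, keeps the policy from spending steps on button presses that cannot do anything in the current mode.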

Curriculum

Johto (0-8 badges): Starter → All 8 Johto gym badges

League (8 badges): Victory Road → Elite Four → Champion. Hall of Fame is a midpoint, not completion.

Kanto (9-16 badges): S.S. Aqua → All 8 Kanto gym badges. Completion = 16 badges total.

Episode & Reset Semantics

Termination: Party wipe (true blackout, not scripted losses), max steps (100k), or watchdog (50k steps without checkpoint progress).
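
The three termination conditions can be expressed as a single check. The sketch below uses the step limits stated above; the field names on `ep` and the use of `>=` for the limits are assumptions, since the README does not say whether the bounds are inclusive.

```javascript
const MAX_STEPS = 100_000;     // per-episode step cap
const WATCHDOG_STEPS = 50_000; // steps allowed without checkpoint progress

// Returns the termination reason, or null if the episode should continue.
// `ep` fields are hypothetical names for counters an episode manager would track.
function terminationReason(ep) {
  if (ep.partyWiped && !ep.scriptedLoss) return 'party_wipe';
  if (ep.steps >= MAX_STEPS) return 'max_steps';
  if (ep.steps - ep.lastCheckpointStep >= WATCHDOG_STEPS) return 'watchdog';
  return null;
}
```

For example, an episode at step 60,000 whose last checkpoint completed at step 5,000 ends with `'watchdog'`, while a wipe flagged as a scripted loss does not end the episode.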

Checkpoint Saves: Created automatically when curriculum checkpoints complete. Only most recent save kept. Stored in localStorage as tallgrass-checkpoint-{id}.

Reset Behavior: Loads most recent checkpoint save, preserves curriculum progress and badges, resets episode counters.
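
Keeping only the most recent save under the `tallgrass-checkpoint-{id}` key scheme might look like the sketch below. It relies on the standard Web Storage methods (`length`, `key`, `setItem`, `removeItem`); the function name and the assumption that the save state is already serialized to a string (for example base64) are illustrative.

```javascript
const PREFIX = 'tallgrass-checkpoint-';

// Store `data` under the checkpoint's key and drop every other checkpoint save.
// `storage` is any Storage-like object; in the browser, pass `localStorage`.
function saveCheckpoint(storage, checkpointId, data) {
  const keep = PREFIX + checkpointId;
  const stale = [];
  // Collect keys first: removing while iterating would shift the indices.
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key !== null && key.startsWith(PREFIX) && key !== keep) stale.push(key);
  }
  stale.forEach((key) => storage.removeItem(key));
  storage.setItem(keep, data);
}
```

Removing stale saves before writing the new one also frees quota first, which matters because localStorage limits are small relative to a ~1-5MB save.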

Scripted Loss Handling: Early rival fight losses don't trigger resets. Only true wipes that result in forced healing terminate episodes.

Storage

Data Type	Storage	Size
LLM Weights	Browser Cache	~1.5GB
Policy Network	IndexedDB	~10-50MB
Experience Buffer	IndexedDB	Variable
Checkpoint Saves	localStorage	~1-5MB
Curriculum Progress	localStorage	<1KB

Memory Addresses (Gen 2)

Address	Purpose
0xD35D	Map group (0-26)
0xD35E	Map number within group
0xD361	Player Y coordinate
0xD362	Player X coordinate
0xD057	Battle type (0 = no battle)
0xD125	Text box ID (0 = no dialogue/menu)
0xCC26	Menu type
0xD356	Johto badges (bits 0-7)
0xD357	Kanto badges (bits 0-7)
0xD163	Party count
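
Since each badge byte stores one badge per bit, the badge count is the number of set bits. A small sketch using the two badge addresses from the table; the `readByte` callback and the function names are assumptions.

```javascript
// Count set bits in one byte (each bit is one badge).
function popcount8(byte) {
  let n = 0;
  for (let v = byte & 0xff; v !== 0; v >>= 1) n += v & 1;
  return n;
}

// `readByte(address)` is assumed to return one unsigned byte of emulator RAM.
function readBadges(readByte) {
  const johto = popcount8(readByte(0xd356));
  const kanto = popcount8(readByte(0xd357));
  return { johto, kanto, total: johto + kanto };
}
```

A `total` of 16 corresponds to the curriculum's completion condition.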

Reward System

Base Rewards: Badge +1000, Pokémon caught +100, level up +50, new map +200, trainer battle won +75, wild battle won +30. Penalties: whiteout -500, fainted -100.
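
The base rewards amount to a lookup table summed over the events observed in a step. In this sketch the reward values come from the line above, while the event names and the `baseReward` function are illustrative.

```javascript
// Values from the "Base Rewards" line; the event names are illustrative.
const BASE_REWARDS = {
  badge: 1000, caught: 100, levelUp: 50, newMap: 200,
  trainerWin: 75, wildWin: 30, whiteout: -500, fainted: -100,
};

// Sum rewards for a map of event counts, e.g. { levelUp: 2, fainted: 1 }.
function baseReward(events) {
  let total = 0;
  for (const [name, count] of Object.entries(events)) {
    if (!Object.hasOwn(BASE_REWARDS, name)) throw new Error(`unknown event: ${name}`);
    total += BASE_REWARDS[name] * count;
  }
  return total;
}
```

For instance, one badge, two level-ups, and one fainted Pokémon give 1000 + 2 × 50 − 100 = 1000.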

Progress Rewards: Continuous shaping based on distance to next checkpoint.

Adaptive Rewards: LLM generates verifiable tests every 10 steps (50-1000 scale).

Curriculum Checkpoints: Badge 1000, key item 500, location 200, event 300, story 400.

Development

cd svelte-app
npm run build
npm run preview

Deploy svelte-app/build/ to any static hosting.

Project Structure

TALLGRASS/
├── svelte-app/              # Main SvelteKit application
│   ├── src/
│   │   ├── lib/
│   │   │   ├── components/  # UI components
│   │   │   ├── core/        # Core game logic
│   │   │   │   ├── game/    # Game-specific modules
│   │   │   │   ├── curriculum/  # Curriculum system
│   │   │   │   └── ...
│   │   │   └── stores/      # Svelte stores
│   │   └── routes/          # SvelteKit routes
└── README.md

License

MIT

Built by Abdillahi Nur
