Experimental full-text and vector search engine with a web crawler and a modern UI.
⚠️ Experimental - Active development, APIs may change.
Backend (Rust)
crawler - Web crawler with rate limiting
parser - HTML parser and content extraction
indexer - Inverted index builder
embedder - Generate embeddings (Ollama/NVIDIA/Cloudflare)
api_server - REST API + WebSocket server

Frontend (SvelteKit)
git clone <repo>
cd search
cp .env.example .env
# Crawl websites
cargo run --bin crawler -- --seed https://example.com --max-pages 100
# Parse HTML
cargo run --bin parser
# Build index (with PageRank)
cargo run --bin indexer -- --pagerank
# Generate embeddings (optional, for vector search)
cargo run --bin embedder
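
The `--pagerank` flag implies that the indexer can score pages by link structure. This README does not describe how, so the following is only a minimal sketch of textbook power-iteration PageRank. The `pagerank` function, its adjacency-list input, and the damping value are illustrative assumptions, not the project's actual code.

```rust
/// Power-iteration PageRank over an adjacency list (hypothetical helper).
/// `links[i]` holds the indices of the pages that page `i` links to;
/// every index is assumed to be smaller than `links.len()`.
fn pagerank(links: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = links.len();
    if n == 0 {
        return Vec::new();
    }
    // Start from a uniform distribution.
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every page receives the "random jump" share.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        // Rank held by pages without outlinks is spread evenly over all pages.
        let dangling: f64 = links
            .iter()
            .zip(&rank)
            .filter(|(out, _)| out.is_empty())
            .map(|(_, r)| *r)
            .sum();
        for slot in next.iter_mut() {
            *slot += damping * dangling / n as f64;
        }
        // Each page splits its damped rank equally among its outlinks.
        for (page, out) in links.iter().enumerate() {
            if out.is_empty() {
                continue;
            }
            let share = damping * rank[page] / out.len() as f64;
            for &target in out {
                next[target] += share;
            }
        }
        rank = next;
    }
    rank
}
```

Calling `pagerank(&links, 0.85, 100)` on a small link graph returns one score per page, and the scores sum to 1. A damping factor of 0.85 is the commonly quoted default; whether the indexer uses it is not stated here.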
# Full-text search (default)
cargo run --bin api_server
# Vector search
cargo run --bin api_server -- -v
# Hybrid search
cargo run --bin api_server -- --hybrid
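
How `--hybrid` combines full-text and vector results is not documented here. One common way to merge two ranked lists is reciprocal rank fusion, sketched below purely as an illustration. The function name, the `k` constant, and the choice of this technique are assumptions and may not match what `api_server` does.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion (hypothetical helper): every list contributes
/// 1 / (k + rank) to a document's score, with rank starting at 1.
fn reciprocal_rank_fusion(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, doc) in list.iter().enumerate() {
            let contribution = 1.0 / (k + (i + 1) as f64);
            *scores.entry((*doc).to_string()).or_insert(0.0) += contribution;
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by document id for a stable order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Given a full-text ranking `["a", "b", "c"]` and a vector ranking `["b", "c", "d"]`, documents that appear in both lists (`b`, `c`) rise above those found by only one engine. The approach needs only ranks, so the two engines' raw scores do not have to be comparable.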
cd frontend
npm install
npm run dev
API_HOST=0.0.0.0
API_PORT=5050
# For cloud embeddings
NVIDIA_API_KEY=your_key
CLOUDFLARE_API_TOKEN=your_token
CLOUDFLARE_ACCOUNT_ID=your_id
PUBLIC_API_URL=http://localhost:5050
PUBLIC_WS_URL=ws://localhost:5050
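
As a rough illustration of how `API_HOST` and `API_PORT` could be turned into a bind address, here is a small std-only sketch. The helper names are hypothetical, and falling back to `0.0.0.0` and `5050` when a variable is unset is an assumption taken from the example values above, not documented behavior.

```rust
use std::net::SocketAddr;

/// Build the bind address from a key lookup (hypothetical helper).
/// The lookup is injected so the logic can be exercised without touching
/// the real process environment.
fn bind_addr(get: impl Fn(&str) -> Option<String>) -> Result<SocketAddr, String> {
    // Assumed defaults, mirroring the sample .env values.
    let host = get("API_HOST").unwrap_or_else(|| "0.0.0.0".to_string());
    let port = get("API_PORT").unwrap_or_else(|| "5050".to_string());
    // SocketAddr parsing expects an IP literal, so a hostname such as
    // "localhost" (or an unbracketed IPv6 address) is reported as an error.
    format!("{host}:{port}")
        .parse()
        .map_err(|e| format!("invalid API_HOST/API_PORT ({host}:{port}): {e}"))
}

/// Same as `bind_addr`, reading from the process environment.
fn bind_addr_from_env() -> Result<SocketAddr, String> {
    bind_addr(|key| std::env::var(key).ok())
}
```

With neither variable set, `bind_addr_from_env()` yields `0.0.0.0:5050` under these assumptions; a malformed port produces an error string that names the offending value.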
/ or Ctrl+K - Focus search
Esc - Clear search
↑/↓ - Navigate suggestions
←/→ - Previous/next page
j/k - Scroll down/up

GET /search?q=query&limit=10
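
A client calling the search endpoint has to percent-encode the `q` value. The sketch below builds such a URL with the standard library only; `encode_query_value` and `search_url` are hypothetical helper names, and only the `q` and `limit` parameters shown above are assumed to exist.

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters (letters, digits, '-', '_', '.', '~') untouched.
fn encode_query_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including each byte of a multi-byte UTF-8
            // character, becomes %XX.
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Build a GET /search URL against the given base (hypothetical helper).
fn search_url(base: &str, query: &str, limit: usize) -> String {
    format!(
        "{}/search?q={}&limit={}",
        base.trim_end_matches('/'),
        encode_query_value(query),
        limit
    )
}
```

For example, `search_url("http://localhost:5050", "rust web crawler", 10)` produces `http://localhost:5050/search?q=rust%20web%20crawler&limit=10`.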
ws://localhost:5050/suggest
// Send: "query prefix"
// Receive: {"prefix": "...", "suggestions": [...]}
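
To make the suggest exchange concrete, the sketch below looks up completions for a prefix in a sorted term list and formats a reply in the shape shown above. It is an assumption-laden illustration: the real server's data structure, ranking, and serialization are not described in this README, and `suggest`, `json_string`, and `suggest_message` are hypothetical names.

```rust
/// Return up to `limit` terms from a sorted slice that start with `prefix`.
fn suggest<'a>(sorted_terms: &[&'a str], prefix: &str, limit: usize) -> Vec<&'a str> {
    // First index whose term is >= prefix; in a sorted list all terms
    // sharing the prefix sit contiguously from here.
    let start = sorted_terms.partition_point(|term| *term < prefix);
    sorted_terms[start..]
        .iter()
        .take_while(|term| term.starts_with(prefix))
        .take(limit)
        .copied()
        .collect()
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Format a reply as {"prefix": "...", "suggestions": [...]}.
fn suggest_message(prefix: &str, suggestions: &[&str]) -> String {
    let items: Vec<String> = suggestions.iter().map(|s| json_string(s)).collect();
    format!(
        "{{\"prefix\": {}, \"suggestions\": [{}]}}",
        json_string(prefix),
        items.join(", ")
    )
}
```

With the sorted terms `ruby, rust, rustc, rustup, search`, the prefix `rus` and a limit of 2 give `rust` and `rustc`. Echoing the prefix in the reply lets a client discard responses that arrive after the user has typed further.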
search/
├── src/
│ ├── bin/ # Binaries
│ ├── lib.rs # Library modules
│ ├── indexer.rs # Inverted index
│ ├── vector_index.rs
│ ├── crawler.rs
│ ├── parser.rs
│ └── ...
├── frontend/
│ └── src/routes/ # SvelteKit pages
├── Cargo.toml
└── README.md
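
The tree lists `indexer.rs` as the inverted index. For readers new to the idea, here is a minimal sketch of that data structure: a map from each term to the set of documents containing it. The `InvertedIndex` type, the tokenizer, and the AND-only query semantics are illustrative assumptions and are not taken from the project's source.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy inverted index: term -> ids of the documents that contain it.
#[derive(Default)]
struct InvertedIndex {
    postings: BTreeMap<String, BTreeSet<u32>>,
}

impl InvertedIndex {
    /// Tokenize `text` and record `doc_id` under every token.
    fn add_document(&mut self, doc_id: u32, text: &str) {
        for token in tokenize(text) {
            self.postings.entry(token).or_default().insert(doc_id);
        }
    }

    /// Ids of documents containing every query term (boolean AND), ascending.
    fn search(&self, query: &str) -> Vec<u32> {
        let mut result: Option<BTreeSet<u32>> = None;
        for token in tokenize(query) {
            let docs = self.postings.get(&token).cloned().unwrap_or_default();
            result = Some(match result {
                None => docs,
                Some(acc) => acc.intersection(&docs).copied().collect(),
            });
        }
        // An empty query matches nothing.
        result.unwrap_or_default().into_iter().collect()
    }
}

/// Lowercase and split on anything that is not alphanumeric.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}
```

After indexing "Rust web crawler" as document 1, "Rust search engine" as 2, and "Web search" as 3, a query for `rust` returns documents 1 and 2, while `web search` returns only document 3.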
# Run tests
cargo test
# Build release
cargo build --release
# Frontend dev
cd frontend && npm run dev
Experimental project - contributions welcome but expect breaking changes.