A local-first job-search dashboard. Polls five ATSes daily, discovers new sponsoring companies weekly, surfaces everything in one Svelte UI. No accounts, no API keys, no telemetry.
I got tired of running the same Boolean searches across five career portals every morning, by hand. Google Alerts kept missing things. LinkedIn's idea of a relevant role isn't mine. So I built the dashboard I wanted: every company I care about polled once a day, discovery surfacing new ones weekly, matching logic in a single JSON file I can read and edit. It runs on my laptop, I check it daily, and I tighten the matching whenever I find a new pattern worth catching. Built it for myself first — if it's useful to anyone else, even better.
- pnpm mcp exposes the SQLite DB as Claude-callable tools. Ask "what should I follow up on this week?" in any Claude session.
- SQLite (data/trawl.db) with WAL + foreign keys + a numbered-migrations system, so schema changes ship as discrete reviewable files instead of accreted ALTERs.
- pnpm blocks postinstall scripts by default.
- Single-port serving: @fastify/middie mounting Vite middleware in dev, @fastify/static in prod.

https://github.com/user-attachments/assets/12087329-4383-45c2-b431-bee7c834ec16
# 1. Install dependencies (pnpm required)
pnpm install
# 2. Build better-sqlite3's native binding. pnpm blocks postinstall scripts
# by default; this approves the one legitimate native compile we need.
cd node_modules/.pnpm/better-sqlite3@*/node_modules/better-sqlite3 && npm run install && cd -
# 3. Set up the local SearXNG container (one-time)
cp docker/searxng/settings.yml.example docker/searxng/settings.yml
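# The empty '' after -i is macOS/BSD sed syntax; on Linux (GNU sed) drop it and use plain `sed -i`.
sed -i '' "s/REPLACE_WITH_RANDOM_HEX/$(openssl rand -hex 32)/" docker/searxng/settings.yml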
docker compose up -d searxng
# 4. Start the app on a single port
pnpm dev
# → http://localhost:3030
The server seeds five sample companies, polls them on startup, kicks off discovery, and surfaces matched jobs in the dashboard. Add or remove companies from the Companies tab. Tune the title/stack filters in config/stack_keywords.json.
Every tracked company is polled daily at 09:00 local. For each job posting, the matcher runs two checks: the title must contain one of your titleKeywords, and the body must mention at least one stack token from stackPatterns. Matches land in the dashboard tagged with their reasons (e.g. senior frontend, React, TypeScript).
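The two checks described above can be sketched roughly as follows. This is a minimal illustration, not the project's matching.ts: the `MatchConfig` shape is inferred from the description of config/stack_keywords.json (titleKeywords plus stackPatterns with pattern/flags/label), and the helper names `stackLabels` and `sampleConfig` are hypothetical.

```typescript
// Hypothetical config shape, inferred from the description of
// config/stack_keywords.json (not copied from the real file).
interface StackPattern {
  pattern: string; // regular-expression source
  flags: string;   // regular-expression flags, e.g. "i"
  label: string;   // human-readable reason shown in the dashboard
}

interface MatchConfig {
  titleKeywords: string[];
  stackPatterns: StackPattern[];
}

// Title keywords that occur in the title (case-insensitive substring check).
function matchedTitleKeywords(title: string, cfg: MatchConfig): string[] {
  const lower = title.toLowerCase();
  return cfg.titleKeywords.filter((kw) => lower.includes(kw.toLowerCase()));
}

// Labels of every stack pattern that occurs in the job description body.
function stackLabels(body: string, cfg: MatchConfig): string[] {
  return cfg.stackPatterns
    .filter((p) => new RegExp(p.pattern, p.flags).test(body))
    .map((p) => p.label);
}

// A job matches only when BOTH checks pass. Returns the reasons, or null.
function matchReasons(title: string, body: string, cfg: MatchConfig): string[] | null {
  const keywords = matchedTitleKeywords(title, cfg);
  if (keywords.length === 0) return null;
  const labels = stackLabels(body, cfg);
  if (labels.length === 0) return null;
  return [...keywords, ...labels];
}

// Illustrative config only; real keywords live in config/stack_keywords.json.
const sampleConfig: MatchConfig = {
  titleKeywords: ["senior frontend"],
  stackPatterns: [
    { pattern: "\\breact\\b", flags: "i", label: "React" },
    { pattern: "\\btypescript\\b", flags: "i", label: "TypeScript" },
  ],
};
```

With this sample config, a "Senior Frontend Engineer" posting whose body mentions React and TypeScript produces reasons for all three; a posting that passes only one of the two checks produces null.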
pollAll() uses per-host request pools (5 concurrent per ATS, 3 for SmartRecruiters per their documented 5 req/sec/IP limit). The dashboard polls /api/poll/status every second while a poll is running and shows live progress (Polling 12 / 30...). A consecutive_failures counter on each company increments on 404s and resets on success; after three in a row the company is auto-archived.
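A per-host request pool of the kind described above might look like the sketch below. It is an assumption-laden illustration rather than the RequestPool in ats.ts: the class internals, the `poolFor` helper, and the `POOL_LIMITS` table are hypothetical; only the limits (5 by default, 3 for SmartRecruiters) come from the text.

```typescript
// Minimal concurrency limiter: at most `limit` tasks run at once, the rest wait in FIFO order.
class RequestPool {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) next(); // the slot passes directly to the next waiter
      else this.active--; // nobody waiting: free the slot
    }
  }
}

// Hypothetical per-ATS limits mirroring the numbers in the text.
const DEFAULT_POOL_LIMIT = 5;
const POOL_LIMITS: Record<string, number> = { smartrecruiters: 3 };
const poolsByAts = new Map<string, RequestPool>();

// One shared pool per ATS, created lazily.
function poolFor(ats: string): RequestPool {
  let pool = poolsByAts.get(ats);
  if (!pool) {
    pool = new RequestPool(POOL_LIMITS[ats] ?? DEFAULT_POOL_LIMIT);
    poolsByAts.set(ats, pool);
  }
  return pool;
}
```

A caller would wrap each fetch, for example `poolFor("lever").run(() => fetch(url))`, so bursts to one host are capped while different hosts still proceed in parallel.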
Weekly, Mondays at 09:30. Four sources run in sequence:
1. Parses AndrewStetsenko/tech-jobs-with-relocation/README.md for companies hiring internationally.
2. Parses lukasz-madon/awesome-remote-job/README.md for remote-DNA companies.
3. Runs each query in config/discovery_queries.json against your local SearXNG instance, extracts result URLs, detects ATS slugs.
4. Searches GitHub for aggregator repos and puts candidates in the Repo sources review queue. Approved repos get their READMEs re-parsed weekly, with any new companies surfaced into the standard discovery queue.

A two-stage review keeps the noise low: trust a repo once, and from then on its contents flow automatically into the companies queue. Both queues have one-click Approve / Dismiss.
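Detecting an ATS slug from a result URL could be sketched as below. This is not the project's detectAts(): the function name `detectAtsSketch` and the host table are assumptions based on the public job-board hostnames these vendors commonly use, and the real adapter may recognize more URL forms.

```typescript
type Ats = "greenhouse" | "lever" | "ashby" | "smartrecruiters" | "workable";

interface AtsHit {
  ats: Ats;
  slug: string;
}

// Hostname -> ATS. Assumed from the vendors' public board URLs; not exhaustive.
const ATS_HOSTS: Record<string, Ats> = {
  "boards.greenhouse.io": "greenhouse",
  "jobs.lever.co": "lever",
  "jobs.ashbyhq.com": "ashby",
  "jobs.smartrecruiters.com": "smartrecruiters",
  "apply.workable.com": "workable",
};

// Returns the ATS and company slug for a recognized board URL, otherwise null.
function detectAtsSketch(rawUrl: string): AtsHit | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return null; // not a parseable absolute URL
  }
  const ats = ATS_HOSTS[url.hostname.toLowerCase()];
  if (!ats) return null;
  // The first non-empty path segment is treated as the company slug.
  const slug = url.pathname.split("/").filter(Boolean)[0];
  return slug ? { ats, slug } : null;
}
```

Taking the first path segment as the slug keeps the sketch short; URLs with a locale prefix or an embed path would need extra handling.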
pnpm mcp starts an MCP server (stdio transport) that exposes the DB as tools for job review, discovery review, repo-source review, application status, outreach status, and discovery triggers. Wire it into Claude Desktop or the CLI and you can query your job tracker conversationally.
{
  "mcpServers": {
    "trawl": {
      "command": "pnpm",
      "args": ["mcp"],
      "cwd": "/absolute/path/to/trawl"
    }
  }
}
If your MCP client does not support cwd, run the command through a shell from the repo directory or use your client's equivalent working-directory setting.
| Layer | Tech | Why |
|---|---|---|
| HTTP server | Fastify 5 (TypeScript) | Async-first, JSON-schema validation, structured pino logger built in |
| UI | Svelte 5 (runes API) + Vite 6 | Smallest runtime, compile-time reactivity, fits the simple CRUD shape |
| Single-port serving | @fastify/middie mounting Vite middleware in dev; @fastify/static in prod | No dual-port DX friction, no SPA framework overkill |
| DB | SQLite via better-sqlite3, WAL + foreign keys, numbered migrations | File-based, zero infra, synchronous writes are fine here |
| Scheduler | node-cron inside the Fastify process | One process, no Redis, simple |
| Concurrency | Per-host request pools (5 concurrent per ATS, 3 for SmartRecruiters) | Cap bursts to any single ATS host, keep wall-clock fast |
| Search backend (discovery) | Self-hosted SearXNG via Docker | Free, no signup, no CAPTCHA, JSON output |
| Repo-source discovery | GitHub Search API (unauthenticated) | Free, 10 req/min unauthenticated, no token needed for the weekly cadence |
trawl/
├── src/
│ ├── index.ts # Thin entry — calls server.ts
│ ├── server.ts # Fastify bootstrap, route registration, cron
│ ├── config.ts # Env vars + module-level constants (PORT, ROOT, IS_DEV, …)
│ ├── db.ts # SQLite open + migrations + seed companies + insertCompany helper
│ ├── migrate.ts # Numbered-migrations runner
│ ├── migrations/ # 0001-initial-schema.sql — the single schema file (add 0002+ as needed)
│ ├── matching.ts # titleMatches / bodyMatches / matchReasons from stack_keywords.json
│ ├── polling.ts # pollAll(), per-company fetch, auto-archive on consecutive 404s
│ ├── discovery.ts # Four discovery sources + repo-parse post-step
│ ├── ats.ts # ATS adapters, RequestPool primitive, per-host pools, detectAts()
│ ├── country.ts # FLAG_TO_COUNTRY, COUNTRY_HINTS, inferCountry(), extractCountry()
│ ├── domain.ts # Shared TypeScript types (Ats, Job, Company, …)
│ ├── state.ts # persistentState() proxy — survives restarts via app_state K/V table
│ ├── routes/ # Fastify route modules grouped by domain
│ │ ├── jobs.ts
│ │ ├── companies.ts
│ │ ├── discoveries.ts
│ │ ├── repo-sources.ts
│ │ └── system.ts
│ ├── util/ # Pure helpers (country normalization, fetch retry, slug, text)
│ └── mcp.ts # Independent MCP server — opens the same SQLite file
├── client/ # Svelte SPA
│ ├── index.html
│ ├── main.ts
│ ├── app.css
│ └── lib/
│ ├── App.svelte
│ ├── Dashboard.svelte # Top-level page — composes the smaller components below
│ ├── DigestStrip.svelte # "N new jobs this week" header strip
│ ├── FunnelStrip.svelte # Application-status pipe chips
│ ├── JobFilters.svelte # Search, country chips, filters bar
│ ├── JobTable.svelte # The matched-jobs table
│ ├── JobRow.svelte # A single row (extracted for clarity)
│ ├── NotesEditor.svelte # Expanded notes/recruiter editor under a row
│ ├── Paginator.svelte
│ ├── Companies.svelte # Tracked companies + add-company form
│ ├── Discoveries.svelte # Review queue from auto-discovery (companies)
│ ├── RepoSources.svelte # Review queue for GitHub aggregator repos
│ ├── SystemStatus.svelte # Cron schedule + last-run timestamps
│ ├── util/ # similar.ts (Jaccard JD similarity), listControls.ts
│ └── api.ts # Fetch wrappers + shared error helpers
├── config/ # Editable JSON config (matching, discovery, seed companies)
├── docker/searxng/ # SearXNG config (real settings.yml is gitignored)
├── docker-compose.yml
├── data/ # SQLite runtime data (gitignored)
├── package.json
├── tsconfig.json / tsconfig.app.json
├── eslint.config.js
├── .prettierrc.json
├── vite.config.ts
├── svelte.config.js
├── .env.example
├── AGENTS.md # Repo-level guidelines for human + AI contributors
├── CLAUDE.md # Specifics for Claude Code working in this repo
└── README.md
| File | Purpose | Edit when |
|---|---|---|
| config/companies.json | Initial company seeds. Inserted on first run via INSERT OR IGNORE. | You want to bake new companies into the default seed (rare; use the UI). |
| config/stack_keywords.json | titleKeywords array + stackPatterns array (pattern/flags/label). Controls what titles and JD bodies are considered a match. | Tune matching: add/remove keywords without touching source code. |
| config/discovery_queries.json | Array of { query: "..." } Boolean queries sent to SearXNG. | You want to add/remove search patterns (e.g., new ATS hosts, different roles). |
| config/repo_discovery_queries.json | Array of plain strings: keyword queries for GitHub Search API. | You want to broaden / narrow the kinds of aggregator repos surfaced. |
| docker/searxng/settings.yml | GITIGNORED. Real SearXNG config with generated secret_key. | Auto-created during setup. Regenerate by re-running the openssl + sed pair. |
| Var | Default | Purpose |
|---|---|---|
| PORT | 3030 | Fastify port. |
| NODE_ENV | development | When production, serves built Svelte from dist/client/ via @fastify/static. In dev mode, mounts Vite in-process for HMR. |
| SEARXNG_URL | http://localhost:8888 | Where SearXNG is reachable. |
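As an illustration of how these three variables and their defaults could be read, here is a minimal sketch. It is not the project's config.ts: the `AppConfig` shape and the `loadConfig` name are hypothetical, and the port validation and trailing-slash trimming are additions for the example.

```typescript
interface AppConfig {
  port: number;
  isDev: boolean;
  searxngUrl: string;
}

// Reads PORT, NODE_ENV and SEARXNG_URL from an env-like object, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env.PORT ?? "3030");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    // Anything other than "production" is treated as dev mode (Vite middleware + HMR).
    isDev: (env.NODE_ENV ?? "development") !== "production",
    // Trim trailing slashes so later path joins do not produce "//search".
    searxngUrl: (env.SEARXNG_URL ?? "http://localhost:8888").replace(/\/+$/, ""),
  };
}
```

In a Node process this would be called as `loadConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.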
No API keys, no tokens, no quotas. Everything is local-only or hits free public APIs unauthenticated.
One generated secret in this project: the SearXNG secret_key. SearXNG uses it locally to sign session cookies. Generated via openssl rand -hex 32 during setup. It lives only in docker/searxng/settings.yml, which is gitignored.
| File | Has secret? | Tracked? |
|---|---|---|
| docker/searxng/settings.yml.example | placeholder only | yes |
| docker/searxng/settings.yml | real generated value | NO (gitignored) |
| data/trawl.db | local DB only, no creds | NO (gitignored) |
Verify the gitignore before any git add:
git check-ignore -v docker/searxng/settings.yml data/trawl.db
Every external service (Greenhouse, Lever, Ashby, SmartRecruiters, Workable, GitHub raw, SearXNG) is hit unauthenticated. No keys to leak.
| Command | What it does |
|---|---|
| pnpm dev | Fastify + Vite middleware on :3030. HMR for Svelte, hot-reload for backend via tsx watch. |
| pnpm start | Production mode: NODE_ENV=production, serves built assets via @fastify/static. Run pnpm build first. |
| pnpm mcp | Start the MCP server (stdio transport). Wire into Claude Desktop / CLI via the JSON snippet above. |
| pnpm build | Vite builds Svelte to dist/client/. |
| pnpm check | svelte-check against tsconfig.app.json (client-side type check). |
| pnpm lint | ESLint (flat config, v9) on TS + Svelte. |
| pnpm format | Prettier --write . |
| pnpm format:check | Prettier --check . (CI-style). |
| docker compose up -d searxng | Start the SearXNG container (port 8888). |
| docker compose stop searxng | Stop it. |
pollAll() is fire-and-forget:
1. POST /api/poll returns 202 immediately. The function runs in the background and updates the pollState persisted via state.ts.
2. pollAll() runs Promise.all across all companies, gated by per-host request pools (5 concurrent per ATS, 3 for SmartRecruiters).
3. Each job is checked with titleMatches() && bodyMatches(), then upserted into jobs with ON CONFLICT(id) DO UPDATE, so re-polling re-evaluates with current filter logic.
4. A consecutive_failures counter on each company increments on 404 fetches and resets on the first success. After three in a row the company is auto-archived (so dead slugs stop wasting cycles).
5. The dashboard polls GET /api/poll/status every second while a poll is running and shows Polling X/Y... live.

Daily cron also calls pollAll() at 09:00 local. State (startedAt, finishedAt, last-error, last-auto-archived list) is persisted to the app_state table so the dashboard renders correct values after a restart.
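The auto-archive rule from the lifecycle above can be expressed as a small pure state transition. The sketch below is illustrative only: `CompanyHealth`, `applyPollResult`, and `ARCHIVE_AFTER` are hypothetical names, and leaving the counter untouched on non-404 errors (429, 5xx) is an assumption, since the text only specifies what happens on 404s and on success.

```typescript
interface CompanyHealth {
  slug: string;
  consecutiveFailures: number;
  archived: boolean;
}

const ARCHIVE_AFTER = 3; // three 404s in a row, per the text

// Returns the company's next health state given the HTTP status of one poll.
function applyPollResult(c: CompanyHealth, httpStatus: number): CompanyHealth {
  if (httpStatus >= 200 && httpStatus < 300) {
    return { ...c, consecutiveFailures: 0 }; // any success resets the streak
  }
  if (httpStatus !== 404) {
    return c; // assumption: transient errors neither count nor reset
  }
  const failures = c.consecutiveFailures + 1;
  return {
    ...c,
    consecutiveFailures: failures,
    archived: c.archived || failures >= ARCHIVE_AFTER,
  };
}
```

Keeping the rule pure makes it easy to test separately from the fetch and the database write that would persist the result.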
runDiscovery() runs four sources in sequence, plus a post-step for parsing approved repos:
1. Fetches AndrewStetsenko/tech-jobs-with-relocation/README.md. Parses the "Companies hiring internationally" table. Extracts ATS slugs from careers URLs via detectAts().
2. Fetches lukasz-madon/awesome-remote-job/README.md. Parses the "Companies with 'remote DNA'" section.
3. Health-checks ${SEARXNG_URL}/healthz. Runs each query in config/discovery_queries.json against SearXNG (?format=json), extracts result URLs, runs detectAts().
4. Calls the GitHub Search API (/search/repositories?q=...&sort=stars) for each query in config/repo_discovery_queries.json. Filters: stars ≥ 50, pushed within the last 24 months. Inserts candidates into repo_sources with status pending. Reviewable via the Repo sources tab.
5. Post-step: for each repo_sources row with status='approved' whose parsed_at is null or older than 7 days, fetches its README, runs a generic Markdown-link extractor [name](url), runs detectAts() on each link, and inserts new companies into discoveries with source=github:OWNER/REPO.

Two-stage review: repo → companies. A repo is approved once (you trust the list) and from then on its contents flow automatically into the company discoveries queue.
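Two pieces of the pipeline above lend themselves to small pure helpers: the generic [name](url) extractor and the repo candidate filter (stars ≥ 50, pushed within 24 months). The sketch below shows one possible shape for each; the names `extractMarkdownLinks`, `RepoCandidate`, and `isRepoCandidate` are hypothetical, and skipping image links and non-http targets is an assumption about what a "generic" extractor should ignore.

```typescript
interface MdLink {
  name: string;
  url: string;
}

// Extracts [name](http...) links from Markdown, skipping images (![alt](src)).
function extractMarkdownLinks(markdown: string): MdLink[] {
  const re = /(!?)\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  const links: MdLink[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(markdown)) !== null) {
    if (m[1] === "!") continue; // image, not a link
    links.push({ name: m[2].trim(), url: m[3] });
  }
  return links;
}

interface RepoCandidate {
  fullName: string;
  stars: number;
  pushedAt: string; // ISO 8601 timestamp
}

// True when the repo has at least 50 stars and was pushed within the last 24 months.
function isRepoCandidate(repo: RepoCandidate, now: Date): boolean {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCMonth(cutoff.getUTCMonth() - 24);
  return repo.stars >= 50 && new Date(repo.pushedAt).getTime() >= cutoff.getTime();
}
```

Each extracted URL would then go through ATS detection; links that do not point at a supported board are simply dropped.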
Weekly cron also calls runDiscovery() on Mondays at 09:30 local.
| Tool | What it does |
|---|---|
| list_jobs | List active jobs by default. Filter by match, country, pipeline status, company, or archive. |
| list_discoveries | List auto-discovered companies. Filter by source or review status. |
| list_repo_sources | List GitHub aggregator repos. Defaults to non-dismissed repos. |
| list_companies | List tracked companies with archive and polling-health fields. |
| get_company | Look up one company by slug. |
| approve_discovery / dismiss_discovery | Review one pending company discovery. |
| approve_all_discoveries | Approve all pending discoveries; returns newly inserted company count. |
| dismiss_all_discoveries | Dismiss all pending discoveries. |
| approve_repo_source | Approve and parse one pending repo source. |
| unapprove_repo_source | Move one approved repo source back to pending. |
| dismiss_repo_source | Dismiss one pending repo source. |
| set_application_status | Set a manual application status; submitted jobs get the same follow-up behavior as the UI. |
| mark_applied | Legacy alias for set_application_status. |
| set_outreach_status | Set outreach status for a job. |
| run_discovery | Run the full discovery pipeline unless one is already running. |
| discover_repo_sources | Run only GitHub repo-source discovery. |
The MCP server imports the same DB bootstrap path as Fastify, so migrations, seed data, and startup hygiene run consistently. Read tools query SQLite directly. Write tools call shared helpers used by the Fastify routes for discovery approval, application status, follow-up dates, outreach, and repo-source review, so MCP and the UI do not drift. The current server opens the same SQLite file as Fastify; WAL mode makes concurrent reads safe, and writes are serialized by SQLite's write lock. No authentication — only expose via local stdio.
Schema lives in src/migrations/ as numbered files — .sql for pure DDL, .ts for backfills that need JS logic. src/migrate.ts lists files, applies any whose id isn't already in the schema_migrations table, and records each successful apply — every migration runs inside its own transaction.
Adding a column / index / data backfill means adding the next numbered file. Don't edit a frozen migration.
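The runner's core decision, which files still need applying and in what order, can be sketched without a database. Everything here is illustrative: the `MigrationDb` interface is a hypothetical stand-in for the SQLite calls in src/migrate.ts, and treating the migration id as the file name minus its extension is an assumption.

```typescript
// Hypothetical abstraction over the real database calls.
interface MigrationDb {
  appliedIds(): Set<string>; // ids already recorded in schema_migrations
  applyInTransaction(id: string, file: string): void; // runs the file and records the id atomically
}

// Assumption: the id is the file name without its .sql / .ts extension.
function migrationId(file: string): string {
  return file.replace(/\.(sql|ts)$/, "");
}

// Numbered migration files not yet applied, in ascending order.
function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((f) => /^\d{4}-.+\.(sql|ts)$/.test(f)) // e.g. 0001-initial-schema.sql
    .sort() // zero-padded prefixes sort correctly as strings
    .filter((f) => !applied.has(migrationId(f)));
}

// Applies every pending migration, one transaction each; returns what ran.
function runMigrations(files: string[], db: MigrationDb): string[] {
  const todo = pendingMigrations(files, db.appliedIds());
  for (const file of todo) {
    db.applyInTransaction(migrationId(file), file);
  }
  return todo;
}
```

Because each file is applied and recorded in its own transaction, a failure partway through leaves earlier migrations recorded and the failing one cleanly unapplied, so the next start resumes from there.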
Core tables:
companies (slug PK, name, ats, country, archived, name_resolved,
consecutive_failures, added_at)
jobs (id PK, company_slug FK, title, location, country, url,
matched, match_reasons, notes, dedup_key, recruiter,
archived, seen_at)
discoveries (id, name, slug, ats, country, source, sample_url, status,
found_at, UNIQUE(slug, ats))
repo_sources (id, full_name UNIQUE, description, stars, pushed_at,
html_url, status, parsed_at, found_at)
job_descriptions (job_id PK FK, body, captured_at)
app_state (key PK, value, updated_at)
schema_migrations (id PK, applied_at)
Constraint highlights: ats is checked against the five supported adapters. country is checked to be either empty or a two-letter uppercase code. status columns on discoveries and repo_sources are CHECK-constrained to their allowed enum values. Foreign keys are enabled at connection time (PRAGMA foreign_keys = ON).
All direct deps come from known maintainers / official orgs:
- Fastify (fastify, @fastify/static, @fastify/middie): NearForm + Fastify core team.
- Svelte (svelte, @sveltejs/vite-plugin-svelte, svelte-check): Svelte core.
- better-sqlite3: WiseLibs.
- node-cron: Lucas Merencia.
- tsx: Hiroki Osame.
- @tsconfig/svelte: Microsoft.
- globals: ESLint Foundation.
- prettier-plugin-svelte: Prettier + Svelte orgs.
- @modelcontextprotocol/sdk: Anthropic.

All pinned to exact versions in package.json (no ^, no ~) so a malicious patch can't auto-upgrade you. Run pnpm audit periodically. pnpm blocks postinstall scripts by default; only better-sqlite3's native compile is manually approved (legitimate).
| Service | Limit | Approach |
|---|---|---|
| Greenhouse / Lever / Ashby / Workable | No documented public limit | 5-concurrent per host via request pool. Polling cadence is daily. No throttle needed. |
| SmartRecruiters | 5 req/sec/IP (documented) | 3-concurrent (pool reduced from 5) so brief bursts stay comfortably under. |
| GitHub Search API (/search/repositories) | 10 req/min unauthenticated (strict) | 2-second throttle between queries. 7 queries × 2s = 14s total per discovery run, well under the limit. If you go beyond 10 queries, raise the throttle to at least 6 seconds so that no minute sees more than 10 requests. |
| GitHub raw content (raw.githubusercontent.com/...) | 5000 req/hr | ~2–10 fetches per discovery cycle. Negligible. |
| SearXNG (local) | None — it's your container | limiter: false in config. |
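The throttle arithmetic in the table can be checked with a few lines of code. This is a sketch under a conservative sliding-window model of the limit (the provider's actual accounting may differ), and the names `minSpacingMs`, `staysUnderLimit`, and `runThrottled` are hypothetical.

```typescript
// Smallest spacing that can never exceed `perMinute` requests in any 60 s window.
function minSpacingMs(perMinute: number): number {
  return Math.ceil(60000 / perMinute);
}

// Can `count` requests, spaced `spacingMs` apart, stay within `perMinute` per sliding minute?
function staysUnderLimit(count: number, spacingMs: number, perMinute: number): boolean {
  // Fewer requests than the limit can never exceed it, however fast they are sent.
  if (count <= perMinute) return true;
  // Otherwise request i and request i + perMinute must be at least 60 s apart.
  return spacingMs * perMinute >= 60000;
}

// Runs tasks one after another with a fixed pause between them.
async function runThrottled<T>(tasks: Array<() => Promise<T>>, spacingMs: number): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const task of tasks) {
    if (!first) await new Promise<void>((r) => setTimeout(r, spacingMs));
    first = false;
    results.push(await task());
  }
  return results;
}
```

For the numbers in the table, `staysUnderLimit(7, 2000, 10)` holds because 7 requests fit inside the 10-per-minute allowance outright, while a longer query list at the same 2-second spacing would not; `minSpacingMs(10)` gives the 6000 ms spacing that is safe for any number of queries.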
If a 429 hits any ATS, the pollAll() loop catches per-company errors and continues with the rest. The failed company gets retried on the next poll cycle; if it 404s three polls in a row it auto-archives.
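The "catch per-company errors and continue" behavior can be sketched as below. It is an illustration rather than the real pollAll(): `pollEach`, `PollOutcome`, and the injected `pollOne` callback are hypothetical, and the actual loop also goes through the request pools and the database.

```typescript
interface PollOutcome {
  slug: string;
  ok: boolean;
  error?: string;
}

// Polls every company concurrently; one failure never aborts the others.
async function pollEach(
  slugs: string[],
  pollOne: (slug: string) => Promise<void>
): Promise<PollOutcome[]> {
  return Promise.all(
    slugs.map(async (slug): Promise<PollOutcome> => {
      try {
        await pollOne(slug);
        return { slug, ok: true };
      } catch (err) {
        // Record the failure and move on; the company is retried next cycle.
        return { slug, ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    })
  );
}
```

Catching inside each mapped promise is what keeps Promise.all from rejecting on the first error, so a single 429 only affects the company that received it.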
| Job | When (local time) | What it does |
|---|---|---|
| Poll | Daily at 09:00 | Fetches new postings from every tracked company |
| Discovery | Mondays at 09:30 | Runs all four discovery sources + re-parses approved GitHub repos |
Both also run once on server startup so a fresh launch gets immediate data. The Dashboard shows last-run timestamps and next-fire countdowns at the top, refreshed every 30 seconds, via GET /api/system/status.
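The next-fire countdowns could be derived from the schedule with plain date arithmetic, as in the sketch below. The function names are hypothetical and this is not necessarily how /api/system/status computes them; in standard five-field cron notation the two schedules would read `0 9 * * *` and `30 9 * * 1`, which is an assumption about the project's node-cron setup.

```typescript
// Next occurrence of hour:minute strictly after `now`, in local time.
function nextDaily(now: Date, hour: number, minute: number): Date {
  const next = new Date(now.getFullYear(), now.getMonth(), now.getDate(), hour, minute, 0, 0);
  if (next.getTime() <= now.getTime()) next.setDate(next.getDate() + 1);
  return next;
}

// Next occurrence of weekday (0 = Sunday ... 6 = Saturday) at hour:minute strictly after `now`.
function nextWeekly(now: Date, weekday: number, hour: number, minute: number): Date {
  const next = new Date(now.getFullYear(), now.getMonth(), now.getDate(), hour, minute, 0, 0);
  const daysAhead = (weekday - now.getDay() + 7) % 7;
  next.setDate(next.getDate() + daysAhead);
  if (next.getTime() <= now.getTime()) next.setDate(next.getDate() + 7);
  return next;
}

// Milliseconds until the next poll (daily 09:00) and next discovery (Mondays 09:30).
function countdowns(now: Date): { pollMs: number; discoveryMs: number } {
  return {
    pollMs: nextDaily(now, 9, 0).getTime() - now.getTime(),
    discoveryMs: nextWeekly(now, 1, 9, 30).getTime() - now.getTime(),
  };
}
```

Building the candidate date from local calendar fields and adjusting with setDate keeps the result on the intended wall-clock time across daylight-saving changes.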
| Symptom | Likely cause | Fix |
|---|---|---|
| Discovery (boolean-search): SearXNG not reachable | Container not running | docker compose up -d searxng |
| EADDRINUSE: address already in use 127.0.0.1:3030 | Previous instance still running | lsof -ti:3030 \| xargs kill -9 |
| Could not locate the bindings file (better-sqlite3) | Native build skipped by pnpm | cd node_modules/.pnpm/better-sqlite3@*/node_modules/better-sqlite3 && npm run install |
MIT. See LICENSE.