Open-source phishing detection engine for real-time URL analysis. It detects malicious links, explains every verdict, and generates a security report for each scan.
⚡ Quick Start · ⚙️ Detection Engine · 🏛 Architecture · 📚 Docs · 🤝 Contributing
Paste a URL → get a trust score, verdict, and detailed report in real time.
Live demo: https://safesurf.xorwave.com
git clone https://github.com/abhizaik/phishing-detection.git
cd phishing-detection
make build && make up
Open Web UI: localhost:3000
Detailed setup guide: docs/setup.md
| Feature | SafeSurf | VirusTotal | Google Safe Browsing | URLScan.io | CheckPhish |
|---|---|---|---|---|---|
| Live crawl, instant results | ✅ | Partial | ❌ | Partial | Partial |
| Explains every verdict | ✅ | Partial | ❌ | Partial | Partial |
| Beginner-friendly interface | ✅ | Partial | Partial | Partial | Partial |
| Credential form detection | ✅ | ❌ | ❌ | Partial | ✅ |
| Follows redirect chains | ✅ | ✅ | ❌ | ✅ | ✅ |
| Detailed technical insights | ✅ | ❌ | ❌ | ✅ | Partial |
| Live page preview | ✅ | ❌ | ❌ | ✅ | ✅ |
| Detection using AI/ML | ❌ | ✅ | ✅ | Partial | ✅ |
| Known phishing database coverage | Partial | ✅ | ✅ | Partial | Partial |
| Scan multiple URLs at once | ❌ | ✅ | ✅ | ✅ | ❌ |
| Browser protection | ✅ | ✅ | ✅ | ✅ | ❌ |
| Open source | ✅ | ❌ | ❌ | ❌ | ❌ |
Fast scanners (like Google Safe Browsing) return a verdict from a database lookup, with no explanation and no live scan. Deep crawlers (like URLScan.io) take too long to return a result. SafeSurf bridges the gap with live analysis and per-signal explanations in real time, and it is open source.
Analyze a URL via HTTP:
curl "http://localhost:8080/api/v1/analyze?url=https://example.com"
Sample Response:
{
  "url": "https://example.com",
  "trust_score": 100,
  "verdict": "Safe",
  "reasons": {
    "good_reasons": [...]
  }
}
Full response schema → docs/api.md#example
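The same request can be made from Go. The following is a minimal sketch, not part of SafeSurf itself: `AnalyzeResponse`, `ParseAnalyzeResponse`, and `Analyze` are illustrative names, the struct covers only the fields visible in the sample response, `trust_score` is decoded as a `float64` as an assumption, and `reasons` is kept as raw JSON because the sample elides its contents.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// AnalyzeResponse mirrors only the fields shown in the sample response.
// The full schema lives in docs/api.md; everything here is an assumption.
type AnalyzeResponse struct {
	URL        string          `json:"url"`
	TrustScore float64         `json:"trust_score"`
	Verdict    string          `json:"verdict"`
	Reasons    json.RawMessage `json:"reasons"` // left undecoded on purpose
}

// ParseAnalyzeResponse decodes a JSON body into an AnalyzeResponse.
func ParseAnalyzeResponse(body []byte) (*AnalyzeResponse, error) {
	var out AnalyzeResponse
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

// Analyze calls GET {baseURL}/api/v1/analyze?url=<target> and decodes the reply.
func Analyze(client *http.Client, baseURL, target string) (*AnalyzeResponse, error) {
	endpoint := baseURL + "/api/v1/analyze?url=" + url.QueryEscape(target)
	resp, err := client.Get(endpoint)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("analyze: unexpected status %s", resp.Status)
	}
	var out AnalyzeResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

func main() {
	// Assumes the backend from the quick start is listening on localhost:8080.
	client := &http.Client{Timeout: 15 * time.Second}
	res, err := Analyze(client, "http://localhost:8080", "https://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: score %.0f, verdict %s\n", res.URL, res.TrustScore, res.Verdict)
}
```

The target URL is query-escaped before being appended, which matters once the analyzed URL carries its own query string.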
18 concurrent goroutines run across 7 signal categories, producing 33 individual signals. Every check emits a reason string — good, bad, or neutral — so the final score is always fully explainable. No black-box verdicts.
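The fan-out described here, together with the `sync.WaitGroup` and per-task panic recovery mentioned in the architecture notes, might look roughly like the sketch below. `Signal`, `Check`, and `RunChecks` are hypothetical names chosen for illustration, and reporting a panicking check as a neutral signal is an assumption rather than SafeSurf's documented behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// Signal is a hypothetical result: one reason string plus its polarity.
type Signal struct {
	Check  string
	Kind   string // "good", "bad", or "neutral"
	Reason string
}

// Check is a hypothetical analyzer task.
type Check struct {
	Name string
	Run  func(url string) []Signal
}

// RunChecks runs every check in its own goroutine and waits for all of them.
// A panic inside one check is recovered and recorded as a neutral signal,
// so the other checks still contribute to the result.
func RunChecks(url string, checks []Check) []Signal {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex // guards out
		out []Signal
	)
	for _, c := range checks {
		wg.Add(1)
		go func(c Check) {
			defer wg.Done()
			defer func() {
				if r := recover(); r != nil {
					mu.Lock()
					out = append(out, Signal{
						Check:  c.Name,
						Kind:   "neutral",
						Reason: fmt.Sprintf("check failed: %v", r),
					})
					mu.Unlock()
				}
			}()
			sigs := c.Run(url)
			mu.Lock()
			out = append(out, sigs...)
			mu.Unlock()
		}(c)
	}
	wg.Wait()
	return out
}

func main() {
	checks := []Check{
		{Name: "https", Run: func(u string) []Signal {
			return []Signal{{Check: "https", Kind: "good", Reason: "URL uses HTTPS"}}
		}},
		{Name: "broken", Run: func(u string) []Signal { panic("simulated failure") }},
	}
	for _, s := range RunChecks("https://example.com", checks) {
		fmt.Printf("[%s] %s: %s\n", s.Kind, s.Check, s.Reason)
	}
}
```

Signals arrive in whatever order the goroutines finish, so a caller that needs stable output would sort them before rendering.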
Score formula: finalScore = clamp(50 + (trustScore − riskScore) × 0.5) → Risky < 30 · Suspicious 30–64 · Safe ≥ 65
50 is the neutral baseline — a URL with no signals scores exactly 50 (Suspicious), the right default for an unknown URL. Trust signals pull the score up, risk signals pull it down, each weighted at 0.5× so neither dominates alone. Both scores are individually clamped to 0–100 before the formula runs, preventing a single catastrophic signal from drowning all other context.
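The formula and thresholds can be expressed in a few lines of Go. This is a sketch under stated assumptions: `FinalScore` and `Verdict` are illustrative names, the outer clamp is taken to be 0–100, and fractional scores between 64 and 65 are treated as Suspicious, which the README does not specify.

```go
package main

import "fmt"

// clamp limits v to the range [lo, hi].
func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// FinalScore applies finalScore = clamp(50 + (trust - risk) * 0.5).
// Both inputs are clamped to 0-100 first, as the text describes.
func FinalScore(trust, risk float64) float64 {
	trust = clamp(trust, 0, 100)
	risk = clamp(risk, 0, 100)
	return clamp(50+(trust-risk)*0.5, 0, 100)
}

// Verdict maps a final score to Risky (< 30), Suspicious (30-64), or Safe (>= 65).
func Verdict(score float64) string {
	switch {
	case score < 30:
		return "Risky"
	case score < 65: // assumption: anything below 65 counts as Suspicious
		return "Suspicious"
	default:
		return "Safe"
	}
}

func main() {
	cases := [][2]float64{{0, 0}, {100, 0}, {40, 90}, {250, 0}}
	for _, c := range cases {
		s := FinalScore(c[0], c[1])
		fmt.Printf("trust=%v risk=%v -> %v (%s)\n", c[0], c[1], s, Verdict(s))
	}
}
```

With no signals at all, the score is 50 and the verdict is Suspicious. A trust score of 40 against a risk score of 90 gives 50 + (40 − 90) × 0.5 = 25, which is Risky.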
URL Signals (8 checks)
HTTP / Network (4 checks, single HTTP request)
DNS (3 checks)
TLS / SSL (2 checks, single TLS handshake)
Domain Intelligence (6 checks)
Content Analysis (8 checks)
<iframe> (credential theft / clickjacking vector)
Threat Intelligence (2 checks)
Not a safety guarantee. Use alongside other defenses.
Four containerized services on a shared Docker bridge network. The Go backend is the only service that makes outbound calls to external APIs — the frontend, Chrome, and cache are strictly internal.
| Service | Role |
|---|---|
| safesurf-web | SvelteKit UI on :3000 (prod) · :5173 (dev) |
| safesurf-backend | Go REST API & analyzer engine on :8080 |
| safesurf-chrome | Headless Chrome, WebSocket on :9222 |
| safesurf-valkey | Valkey (Redis-compatible) on :6379, LRU cache, volume-persisted |
sync.WaitGroup; panics are recovered per-task without failing the request
server/
  cmd/safesurf/          entry point
  internal/analyzer/     goroutine runner, task definitions, score aggregation
  internal/service/
    checks/              18 individual analyzer implementations
    screenshot/          headless Chrome integration
    cache/               Valkey client
    threatfeeds/         PhishTank client
    typosquat/           brand similarity engine
web/website/             SvelteKit UI
web/chrome-extension/    browser extension
docker/                  dev & prod Compose configs
docs/                    API, setup, architecture, security
If you use this project in academic or research work, please cite it — see CITATION.cff.
Copyright (C) 2023–2026 Abhishek K P
SafeSurf is dual-licensed:
If you found this project helpful, consider giving it a star.