Gaming Community Sentiment Dashboard
CommunityPulse aggregates sentiment and trends across multiple platforms — YouTube, community forums, news sites, and search trends — into a single dashboard. Players and content creators see what the community is talking about without manually browsing every source.
This project showcases cross-platform data ingestion, NLP-driven topic discovery, real-time sentiment tracking, and a responsive Svelte 5 frontend backed by a FastAPI async API.
| Layer | Technology |
|---|---|
| Frontend | SvelteKit 2 + Svelte 5 (runes: $state, $derived, $effect) |
| Backend | FastAPI (Python 3.11+), async SQLAlchemy 2, Pydantic v2 |
| Database | PostgreSQL 17 |
| Cache | Valkey 8 (Redis-compatible) |
| NLP | BERTopic + Sentence Transformers + cardiffnlp RoBERTa sentiment |
| Toxicity | Detoxify (Unitary) |
| AI Summaries | Google Gemini 1.5 Flash (optional) |
| Scheduling | APScheduler (async) |
| Container | Docker Compose (6 services) |
The stack ships as a six-container Docker Compose deployment.
Key roadmap additions: event-driven ingestion, WebSocket push for real-time updates, and user accounts with customizable alert thresholds.
The NLP pipeline processes ingested posts through three stages: topic clustering (BERTopic), sentiment scoring (RoBERTa), and toxicity detection (Detoxify).
Resilience: A circuit breaker wraps each model call — after N failures the circuit opens, and a dead letter queue captures posts for retry on the next scheduler pass.
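A minimal sketch of that failure-isolation pattern, assuming a consecutive-failure threshold and a time-based reset; class and function names here are hypothetical, not the project's actual API:

```python
import time

class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive errors; while open,
    calls are rejected until `reset_after` seconds have passed."""
    def __init__(self, max_failures=3, reset_after=60.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open")
            self.opened_at = None  # half-open: allow one trial call
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result

dead_letters = []  # stand-in for the persistent dead letter queue

def analyze_with_dlq(breaker, model_fn, post):
    """Capture posts that fail analysis for the next scheduler pass
    instead of dropping them."""
    try:
        return breaker.call(model_fn, post)
    except Exception:
        dead_letters.append(post)
        return None
```

Once the circuit is open, failing posts still land in the dead letter queue, but the model itself is no longer invoked until the reset window elapses.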
Eight data source adapters feed the pipeline:
| Source | Method | Rate Strategy |
|---|---|---|
| YouTube | Data API v3 | Daily quota tracking (9k/10k budget) |
| Official News | Publisher RSS | Locale-aware, 6-hour cycle |
| TierSite | Web scraping | Polite crawling with backoff |
| GuideSite | Web scraping | Polite crawling with backoff |
| News Source A | RSS feed | 30-item window per cycle |
| News Source B | RSS feed | 30-item window per cycle |
| Public JSON API | JSON over HTTP (no auth) | 50-post window per cycle |
| Google Trends | pytrends | 60s inter-request delay, 12-hour cycle |
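The YouTube quota strategy above can be sketched as a daily soft budget: stop spending at 9,000 of the API's 10,000 daily units so a margin always remains. This is an illustrative sketch, not the project's tracker:

```python
from datetime import date

class QuotaTracker:
    """Track daily API quota units; refuse work once the soft budget
    (e.g. 9,000 of YouTube's 10,000 daily units) is spent."""
    def __init__(self, daily_budget=9000):
        self.daily_budget = daily_budget
        self.used = 0
        self.day = date.today()

    def _roll_day(self):
        today = date.today()
        if today != self.day:          # quota resets on a new day
            self.day = today
            self.used = 0

    def try_spend(self, units):
        """Reserve `units` if they fit in today's budget; True on success."""
        self._roll_day()
        if self.used + units > self.daily_budget:
            return False
        self.used += units
        return True
```

In the YouTube Data API v3, different methods have very different unit costs (a `search.list` call costs 100 units, while a `commentThreads.list` call costs 1), so tracking units rather than request counts is what keeps the budget meaningful.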
Each adapter implements a common DataSourceAdapter interface. The IngestionService handles deduplication (external ID upsert) and forwards clean posts to the NLP queue.
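A sketch of that adapter interface and the external-ID dedup, with an in-memory dict standing in for the database upsert; the class shapes are assumptions for illustration:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass(frozen=True)
class RawPost:
    external_id: str   # platform-native ID used for dedup
    source: str
    text: str

class DataSourceAdapter(ABC):
    """Common interface each data source adapter implements."""
    @abstractmethod
    def fetch(self) -> list[RawPost]: ...

class IngestionService:
    """Dedupe on external ID (upsert semantics) before queueing for NLP."""
    def __init__(self):
        self.store: dict[str, RawPost] = {}  # stand-in for the posts table
        self.nlp_queue: list[RawPost] = []

    def ingest(self, adapter: DataSourceAdapter) -> int:
        new = 0
        for post in adapter.fetch():
            if post.external_id not in self.store:
                self.nlp_queue.append(post)  # only new posts hit NLP
                new += 1
            self.store[post.external_id] = post  # upsert: insert or refresh
        return new

class FakeAdapter(DataSourceAdapter):
    def fetch(self):
        return [RawPost("yt:1", "youtube", "great patch"),
                RawPost("yt:1", "youtube", "great patch"),  # duplicate
                RawPost("yt:2", "youtube", "nerf incoming")]
```

Because dedup keys on the platform's own ID, re-polling the same feed on the next cycle refreshes stored posts without re-queueing them for NLP.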
gaming-community-analytics-tracker/
├── backend/
│ ├── app/
│ │ ├── api/routes/ # REST endpoints (dashboard, ingestion, feedback)
│ │ ├── dashboard/ # Aggregation, explanation generation, patch tracking
│ │ ├── ingestion/
│ │ │ ├── adapters/ # 8 data source adapters
│ │ │ ├── service.py # Dedup + upsert orchestration
│ │ │ └── scheduler.py # APScheduler job configuration
│ │ ├── models/ # SQLAlchemy async models
│ │ ├── nlp/
│ │ │ ├── topics.py # BERTopic seed-guided clustering
│ │ │ ├── sentiment.py # RoBERTa sentiment scoring
│ │ │ ├── toxicity.py # Detoxify toxicity detection
│ │ │ ├── circuit_breaker.py # Failure isolation
│ │ │ └── dead_letter.py # Retry queue for failed analyses
│ │ └── services/ # Digest generation, topic naming
│ ├── scripts/ # Seed data, migrations
│ └── tests/ # Pytest (async fixtures, mock NLP)
├── frontend/
│ ├── src/
│ │ ├── lib/
│ │ │ ├── components/ # TopicCard, SentimentBar, PatchPulse, etc.
│ │ │ ├── stores/ # Svelte 5 rune stores ($state, $derived)
│ │ │ └── i18n/ # Internationalization (English MVP)
│ │ └── routes/ # SvelteKit pages (dashboard, digest, patch-pulse)
│ └── e2e/ # Playwright E2E tests
├── database/
│ ├── ddl/ # Schema definitions
│ └── dml/ # Seed data, migrations
├── docker-compose.yml # 6-service orchestration
└── docs/ # Architecture diagrams
| Endpoint | Method | Description |
|---|---|---|
| `/api/dashboard/trending` | GET | Trending topics with sentiment |
| `/api/dashboard/topics` | GET | All topics list |
| `/api/dashboard/topics/{slug}` | GET | Single topic details |
| `/api/dashboard/sources` | GET | Source distribution |
| `/api/dashboard/patch-pulse` | GET | Current patch sentiment |
| `/api/dashboard/aggregate` | POST | Trigger aggregation |
| `/api/dashboard/digest/summary` | POST | AI digest summary |
| Endpoint | Method | Description |
|---|---|---|
| `/api/feedback/vote` | POST | Submit vote (thumbs up/down) |
| `/api/feedback/report` | POST | Report inaccurate topic |
| `/api/feedback/general` | POST | Submit general feedback |
| Endpoint | Method | Description |
|---|---|---|
| `/api/ingestion/trigger` | POST | Trigger ingestion by platform |
| `/api/ingestion/status` | GET | All source statuses |
| `/api/ingestion/quota` | GET | YouTube API quota usage |
| `/api/ingestion/nlp-stats` | GET | NLP processing statistics |
| `/api/ingestion/nlp-sentiment` | POST | Trigger sentiment analysis |
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check with DB/cache status |
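A health check that reports dependency status typically degrades its overall status when any dependency is down. The real endpoint is a FastAPI route; the sketch below only shows one plausible response shape, which is an assumption, not the project's actual payload:

```python
def health_payload(db_ok: bool, cache_ok: bool) -> dict:
    """Build a /api/health-style response reporting DB and cache status.
    Overall status degrades if any dependency is down."""
    status = "ok" if db_ok and cache_ok else "degraded"
    return {
        "status": status,
        "dependencies": {
            "postgres": "up" if db_ok else "down",
            "valkey": "up" if cache_ok else "down",
        },
    }
```

Returning "degraded" rather than failing outright lets load balancers keep routing traffic while operators see exactly which dependency is unhealthy.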
| Decision | Optimized For | Trade-off |
|---|---|---|
| Isolated NLP worker container | API memory stability (~200MB vs ~2GB with models loaded) | Extra container orchestration complexity |
| Circuit breaker + DLQ | Graceful degradation when models fail | Eventual consistency — posts analyzed on retry, not immediately |
| BERTopic with seed topics | Consistent topic categories across runs | Less dynamic than fully unsupervised clustering |
| Valkey cache for aggregations | Sub-50ms dashboard loads on cached data | Stale reads between aggregation cycles (up to 6 hours) |
| Session-based anonymous feedback | Privacy-first — no user accounts required for MVP | Limited per-user analytics |
| APScheduler in-process | Zero additional infrastructure for scheduling | Single point of failure — moves to message queue in roadmap |
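The Valkey caching trade-off above is the classic cache-aside pattern: serve cached aggregations until a TTL expires, accepting stale reads in between. A minimal in-memory sketch (the real implementation uses Valkey; the class here is illustrative):

```python
import time

class CacheAside:
    """Cache-aside with TTL: serve cached aggregations fast, recompute
    only after the entry expires (the stale-reads trade-off)."""
    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock                 # injectable for testing
        self._data: dict[str, tuple[float, object]] = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._data.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]                  # fast path: cached aggregation
        value = compute()                  # slow path: recompute and cache
        self._data[key] = (now, value)
        return value
```

With a 6-hour TTL matching the aggregation cycle, dashboard reads stay sub-50ms on the fast path while the slow path runs at most once per cycle per key.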
CommunityPulse started as a way to answer a simple question: what is the gaming community actually talking about right now? The answer required pulling data from fragmented sources, applying NLP at scale, and surfacing results through an intuitive dashboard.
The technical approach prioritizes reliability and observability. Every model call is wrapped in a circuit breaker. Failed analyses land in a dead letter queue with automatic retry. The NLP worker runs in isolation so a model crash never takes down the API. Ingestion adapters share a common interface, making it straightforward to add new data sources.
Looking ahead, the roadmap moves from scheduled polling to event-driven ingestion, adds WebSocket push for real-time updates, and introduces user accounts with customizable alert thresholds. The architecture is designed to scale horizontally — the NLP worker is the natural first candidate for pod autoscaling under load.
MIT License. See the LICENSE file for details.