Music Recap

Music analytics platform — 22-module analysis pipeline, ML clustering, NLP, 102 Svelte visualizations

Music Recap — Spotify Analytics & Visualization Platform

Analyze 341K+ Spotify play events through a 22-module Python pipeline with K-means clustering, graph algorithms, and NLP enrichment. An interactive Svelte 5 frontend presents the results with D3.js and GSAP visualizations.

Features

Pipeline & Data Processing

  • Ingest Spotify streaming history (341K+ events) into SQLite with WAL mode
  • 22 specialized analysis modules processing listening patterns, audio features, artist networks
  • Multi-source enrichment: Last.fm (listeners, playcount, bio), Wikipedia (QID), MusicBrainz (ISRC/MBID), Deezer IDs, LrcLib (lyrics)
  • Parallel rate-limited API requests with retry logic
  • Topological dependency resolution for pipeline ordering
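The dependency-ordering step above can be sketched with the standard library's `graphlib` (available from Python 3.9). This is a minimal illustration, not the project's actual orchestrator: the dependency map and the `resolve_order` helper are hypothetical, since the README does not document which modules depend on which.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: module name -> set of modules it depends on.
# The real pipeline's dependencies are not listed in this README.
DEPENDENCIES = {
    "a01_overview": set(),
    "a02_sessions": {"a01_overview"},
    "a09_audio_clusters": {"a01_overview"},
    "a10_feature_arcs": {"a09_audio_clusters"},
    "a08_mood_timeline": {"a02_sessions", "a09_audio_clusters"},
}


def resolve_order(dependencies):
    """Return module names so that each one appears after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = resolve_order(DEPENDENCIES)
```

If the map contains a cycle, `static_order()` raises `graphlib.CycleError`, which makes a misconfigured pipeline fail before any module runs.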

ML & Statistical Analysis

  • K-means clustering on six audio features (acousticness, danceability, energy, instrumentalness, liveness, speechiness), with k chosen by silhouette score
  • Custom paired t-test; p-values come from the regularized incomplete beta function, evaluated with Lentz's continued-fraction method
  • Pearson correlation for feature analysis
  • Emotion dictionary matching & sentiment scoring
  • Language detection for lyrics
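As an illustration of the custom paired t-test listed above, the sketch below computes the two-sided p-value as I_x(df/2, 1/2) with x = df / (df + t²), where I is the regularized incomplete beta function evaluated by a modified Lentz continued fraction. It follows the textbook formulation and is not taken from the project's code; the function names (`_beta_cf`, `betainc`, `paired_t_test`) and the tolerances are assumptions.

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300  # guards against division by zero
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def paired_t_test(xs, ys):
    """Return (t, two-sided p) for paired samples xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    diffs = [x - y for x, y in zip(xs, ys)]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        raise ValueError("differences have zero variance")
    t = mean / math.sqrt(var / n)
    df = n - 1
    return t, betainc(df / 2.0, 0.5, df / (df + t * t))
```

A call such as `paired_t_test(weekday_energy, weekend_energy)` (hypothetical inputs) returns the t statistic and its two-sided p-value without depending on SciPy.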

Frontend Visualization

  • 102 interactive Svelte 5 components, organized into six acts
  • D3.js charts: heatmaps, scatter plots, bar charts, force-directed graphs, histograms
  • GSAP animations & timeline sequencing
  • Network graph with community detection & bridge-artist identification
  • Responsive layout, hover interactions, export to PNG

Architecture

  • Python 3.x backend (22 modules, 341K+ event processing)
  • SQLite3 database (10+ tables, WAL mode for concurrent reads)
  • Svelte 5 + SvelteKit frontend (102 components, static site output)
  • No external services needed once enrichment has run (all enrichment results are cached locally)
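For readers unfamiliar with WAL mode, the sketch below shows one way to open a SQLite database with write-ahead logging from Python's built-in `sqlite3` module. The `open_plays_db` and `insert_plays` helpers and the column layout of `plays` are assumptions made for illustration; the project's real schema is not shown in this README.

```python
import sqlite3


def open_plays_db(path):
    """Open (or create) a database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect ("wal" on success).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    # Hypothetical minimal schema for play events.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS plays ("
        "played_at TEXT NOT NULL, track_name TEXT, "
        "artist_name TEXT, ms_played INTEGER)"
    )
    return conn


def insert_plays(conn, rows):
    """Insert (played_at, track_name, artist_name, ms_played) tuples in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO plays VALUES (?, ?, ?, ?)", rows)
```

WAL mode is persistent for the database file and lets readers proceed while a writer is active. It needs an on-disk database; an in-memory database reports the journal mode `memory` instead.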

Architecture Flow

Spotify Streaming History (JSON)
    ↓
[ingest.py] → SQLite plays table (341K+ records)
    ↓
[enrich_*] → Parallel enrichment (Last.fm, MusicBrainz, Wikipedia, Deezer, LrcLib)
    ↓
[pipeline/analysis/a01–a22] → 22 analysis modules
    ├─ a01: Overview stats & heatmaps
    ├─ a02-a06: Sessions, skips, circadian patterns, platforms, replays
    ├─ a07-a08: Albums, mood timeline
    ├─ a09: Audio clusters (K-means)
    ├─ a10-a15: Audio feature arcs, genres, artist lifecycle, networks, age, popularity
    ├─ a16-a17: Lyrics themes & lines
    ├─ a19-a22: Personality, seasonal patterns, absences
    ↓
[JSON outputs → build/]
    ↓
[Svelte Frontend] → 102 components, 6 interactive "acts"
    ↓
HTML/CSS/JS (static site)
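The enrichment stage in the flow above issues parallel, rate-limited API requests with retries. A minimal standard-library sketch of that pattern follows. It is not the project's implementation: `RateLimiter`, `with_retry`, `enrich_all`, the `fetch` callable, and all default values are hypothetical, and real code would catch specific network errors rather than `Exception`.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Space calls at least `min_interval` seconds apart across all threads."""

    def __init__(self, min_interval):
        self._min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        with self._lock:
            now = time.monotonic()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the next slot before releasing the lock.
            self._next_allowed = max(now, self._next_allowed) + self._min_interval
        if delay > 0.0:
            time.sleep(delay)


def with_retry(fn, attempts=3, base_delay=0.5):
    """Wrap fn so failures are retried with exponential backoff."""
    def call(*args):
        for attempt in range(attempts):
            try:
                return fn(*args)
            except Exception:  # assumption: any error is treated as retryable
                if attempt == attempts - 1:
                    raise
                time.sleep(base_delay * 2 ** attempt)
    return call


def enrich_all(names, fetch, max_workers=4, min_interval=0.2,
               attempts=3, base_delay=0.5):
    """Call fetch(name) for every name in parallel and return {name: result}."""
    limiter = RateLimiter(min_interval)

    def task(name):
        limiter.wait()  # retries also pass through the limiter
        return fetch(name)

    guarded = with_retry(task, attempts=attempts, base_delay=base_delay)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(names, pool.map(guarded, names)))
```

Here `fetch` stands in for a per-source lookup (for example, one artist against one API). Because the limiter is shared across worker threads, the request rate stays bounded regardless of `max_workers`.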

Analysis Modules

| Module | Category | Description |
|---|---|---|
| a01_overview | Stats | Total plays, unique tracks/artists, listening hours, year range |
| a02_sessions | Sessions | Listening session detection, duration patterns |
| a03_skips | Behavior | Skip analysis, early exit rates |
| a04_circadian | Patterns | Hour-of-day, day-of-week patterns |
| a05_platforms | Context | Device/platform breakdown (web, mobile, desktop) |
| a06_replays | Behavior | Replay frequency, favorites |
| a07_albums | Collections | Album stats, top albums |
| a08_mood_timeline | Trends | Mood sentiment over time |
| a09_audio_clusters | ML | K-means clusters on audio features |
| a10_feature_arcs | Trends | Audio feature evolution |
| a11_genres | Collections | Genre distribution, trends |
| a12_artist_lifecycle | Trends | Artist discovery, activity, decline |
| a13_network | Graph | Co-listening network, community detection |
| a14_musical_age | Stats | Artist debut, listener tenure |
| a15_popularity | Stats | Popularity scores, Spotify metrics |
| a16_lyrics_themes | NLP | TF-IDF themes, emotion dictionaries |
| a17_lyrics_lines | NLP | Lyric line extraction & sentiment |
| a19_personality | NLP | Listening personality profile |
| a20_seasonal | Patterns | Seasonal trends |
| a22_absences | Behavior | Gaps in listening, offline periods |
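To make the kind of work a module such as a02_sessions does more concrete, here is a rough sketch of gap-based session detection. The 30-minute threshold and the `detect_sessions` helper are assumptions for illustration only; the README does not state how the project defines a session.

```python
from datetime import datetime, timedelta


def detect_sessions(timestamps, max_gap=timedelta(minutes=30)):
    """Group play timestamps into sessions, splitting at gaps longer than max_gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            sessions[-1].append(ts)  # continues the current session
        else:
            sessions.append([ts])  # gap too long (or first play): new session
    return sessions


# Hypothetical play times: three plays in the morning, one in the evening.
plays = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 4),
    datetime(2024, 1, 1, 9, 20),
    datetime(2024, 1, 1, 20, 0),
]
sessions = detect_sessions(plays)
```

Session durations then follow from the first and last timestamp in each group, and the same gap logic with a much larger threshold could flag absences.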

Getting Started

Prerequisites: Python 3.9+, Node.js 18+

# 1. Install Python dependencies
pip install -r requirements.txt

# 2. Ingest Spotify history (requires JSON files in streaming_history/)
python -m pipeline.ingest --db music_recap.db

# 3. Enrich metadata (Last.fm, MusicBrainz, etc.)
python -m pipeline.enrich_artists --db music_recap.db

# 4. Run analysis pipeline
python -m pipeline.analysis --db music_recap.db --build build

# 5. Start frontend dev server
cd frontend && npm install && npm run dev
# Open http://localhost:5173

Commands

make ingest         # Run ingestion
make enrich-artists # Enrich artist metadata
make analyze        # Run full analysis pipeline
make test           # Run pytest
make clean          # Remove database

Tech Stack

| Layer | Technology | Role |
|---|---|---|
| Backend | Python 3.x | Pipeline orchestration |
| Database | SQLite3 (WAL) | 341K+ play events, metadata cache |
| ML/Stats | NumPy, SciPy | K-means, t-tests, correlations |
| Enrichment | Requests, parallel | Last.fm, MusicBrainz, Wikipedia, Deezer APIs |
| Frontend | Svelte 5, SvelteKit | Reactive component framework |
| Visualization | D3.js, GSAP | Charts, animations, interactions |
| Styling | Tailwind CSS | Responsive design |
| Testing | Pytest | Unit tests for pipelines |

Created for santifer.io. Portfolio: cv-santiago
