
Urlcleanershortener

URL shortener built on a custom LSM storage engine — WAL, MemTable, SSTable, Bloom filters. Zero SQL, zero ORM. .NET 8 + Svelte.

URL Shortener — LSM + SSTable Storage Engine

Storage engine inspired by LevelDB and Cassandra. No SQL.

Storage architecture

WRITE PATH

ShortenAsync(url)

  1. WAL.Append(record) ← durability, fsync to disk

  2. MemTable.TryAdd(record) ← SortedDictionary<string,UrlRecord>
  3. MemTable.IsFull?

    YES → FlushMemTableAsync()
    ├── SSTableWriter.Write(sortedSnapshot)
    ├── New MemTable
    ├── WAL.Truncate()
    └── SSTableCount ≥ threshold? → CompactAsync()
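
The write path above can be sketched roughly as follows. This is a minimal in-memory illustration in TypeScript, not the project's C# code: `LsmWriteSketch`, its fields, and the array-backed WAL and SSTables are hypothetical stand-ins, and the sketch only checks the MemTable (not older SSTables) for an existing short code.

```typescript
// Hypothetical in-memory stand-ins: names follow the diagram, not the project's C# types.
interface UrlRecord { shortCode: string; url: string; }

const byShortCode = (a: UrlRecord, b: UrlRecord): number =>
  a.shortCode < b.shortCode ? -1 : a.shortCode > b.shortCode ? 1 : 0;

class LsmWriteSketch {
  wal: string[] = [];                        // stand-in for the append-only WAL file
  memTable = new Map<string, UrlRecord>();   // the real MemTable is a sorted dictionary
  ssTables: UrlRecord[][] = [];              // oldest first; every table is sorted
  compactions = 0;
  memTableThreshold: number;
  compactionTrigger: number;

  constructor(memTableThreshold = 500, compactionTrigger = 4) {
    this.memTableThreshold = memTableThreshold;
    this.compactionTrigger = compactionTrigger;
  }

  shorten(record: UrlRecord): boolean {
    // 1. Log first so the write can be replayed after a crash (a real WAL would fsync here).
    this.wal.push(JSON.stringify(record));
    // 2. TryAdd semantics: an existing short code is never overwritten.
    if (this.memTable.has(record.shortCode)) return false;
    this.memTable.set(record.shortCode, record);
    // 3. Flush once the MemTable is full.
    if (this.memTable.size >= this.memTableThreshold) this.flush();
    return true;
  }

  flush(): void {
    if (this.memTable.size === 0) return;
    const sortedSnapshot = Array.from(this.memTable.values()).sort(byShortCode);
    this.ssTables.push(sortedSnapshot);      // SSTableWriter.Write(sortedSnapshot)
    this.memTable = new Map();               // new MemTable
    this.wal = [];                           // WAL.Truncate()
    if (this.ssTables.length >= this.compactionTrigger) this.compact();
  }

  compact(): void {
    // Simplified merge: walk tables oldest first and keep the first copy of each key.
    const merged = new Map<string, UrlRecord>();
    for (const table of this.ssTables) {
      for (const record of table) {
        if (!merged.has(record.shortCode)) merged.set(record.shortCode, record);
      }
    }
    this.ssTables = [Array.from(merged.values()).sort(byShortCode)];
    this.compactions++;
  }
}
```

The WAL is truncated only after the snapshot has been written, so a crash between the two steps would at worst replay records that are already in an SSTable.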

READ PATH

Get(shortCode)

  1. GlobalBloom.MightContain()?
    │ false → return null (guaranteed not to exist)
    │ true → continue
  2. MemTable.Get(shortCode) → O(log n)
    │ found → return

  3. For each SSTable (newest first):
    a. sst.Bloom.MightContain()? → skip this SSTable on a miss
    b. BinarySearch(sparseIndex) → closest entry
    c. Sequential scan → O(log n + SparseInterval)
    found → return
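
As an illustration of the read path, the sketch below walks the same steps in TypeScript rather than the project's C#. `SsTableSketch`, `lookup`, and the use of an exact `Set` in place of a Bloom filter are assumptions made for brevity. The sparse index stores array positions where the real file would store byte offsets.

```typescript
// Hypothetical in-memory model of the lookup; not the project's actual reader.
interface UrlRecord { shortCode: string; url: string; }
interface SparseEntry { shortCode: string; position: number; }

const SPARSE_INTERVAL = 32; // one index entry per 32 records, as in the format description

class SsTableSketch {
  records: UrlRecord[];
  sparseIndex: SparseEntry[] = [];
  bloom: Set<string>; // stand-in: an exact set behaves like a Bloom filter with no false positives

  constructor(sortedRecords: UrlRecord[]) {
    this.records = sortedRecords;
    this.bloom = new Set(sortedRecords.map(r => r.shortCode));
    for (let i = 0; i < sortedRecords.length; i += SPARSE_INTERVAL) {
      this.sparseIndex.push({ shortCode: sortedRecords[i].shortCode, position: i });
    }
  }

  get(shortCode: string): UrlRecord | null {
    if (!this.bloom.has(shortCode)) return null; // a. filter miss: skip this table
    // b. Binary search for the last index entry whose key is <= shortCode.
    let lo = 0;
    let hi = this.sparseIndex.length - 1;
    let start = -1;
    while (lo <= hi) {
      const mid = (lo + hi) >> 1;
      if (this.sparseIndex[mid].shortCode <= shortCode) {
        start = this.sparseIndex[mid].position;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    if (start < 0) return null;
    // c. Sequential scan of at most SPARSE_INTERVAL records.
    const end = Math.min(start + SPARSE_INTERVAL, this.records.length);
    for (let i = start; i < end; i++) {
      if (this.records[i].shortCode === shortCode) return this.records[i];
      if (this.records[i].shortCode > shortCode) break; // sorted: the key cannot appear later
    }
    return null;
  }
}

function lookup(
  shortCode: string,
  globalBloom: Set<string>,
  memTable: Map<string, UrlRecord>,
  ssTablesNewestFirst: SsTableSketch[],
): UrlRecord | null {
  if (!globalBloom.has(shortCode)) return null;     // 1. definitely absent
  const inMemory = memTable.get(shortCode);         // 2. MemTable first
  if (inMemory !== undefined) return inMemory;
  for (const table of ssTablesNewestFirst) {        // 3. then each SSTable
    const hit = table.get(shortCode);
    if (hit !== null) return hit;
  }
  return null;
}
```

With a real Bloom filter, a `true` answer can be a false positive, which is why the scan can still end without a match.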

SSTable on-disk format

.sst FILE:

HEADER (28B)
8B magic = "SSTABLE\0"
4B version
8B record_count
8B created_at_utc

DATA SECTION
[record 0] [record 1] ... [record N]
Sorted by ShortCode (lexicographically)
Each record: binary format, variable length
(2 + URL length + JSON params)

SPARSE INDEX
Every 32nd record → (shortCode, byteOffset)
Binary search in the index → scan at most 32 records

BLOOM FILTER
Bit-packed bools + k + addedCount
~117 KB for 100K elements at a 1% false-positive rate

FOOTER (24B)
8B index_section_offset
8B bloom_section_offset
8B magic (validation)
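
The Bloom filter figures above can be checked with the standard sizing formulas, m = -n·ln(p) / (ln 2)² bits and k = (m/n)·ln 2 hash functions. The TypeScript sketch below is an illustration, not the project's `BloomFilter.cs`: `bloomParameters`, `BloomSketch`, and the choice of seeded FNV-1a for the two base hashes are assumptions. It shows double hashing, where probe i lands at (h1 + i·h2) mod m.

```typescript
// Standard Bloom filter sizing: bits and hash count for n items at false-positive rate p.
function bloomParameters(expectedItems: number, falsePositiveRate: number) {
  const bits = Math.ceil(
    (-expectedItems * Math.log(falsePositiveRate)) / (Math.LN2 * Math.LN2));
  const hashes = Math.max(1, Math.round((bits / expectedItems) * Math.LN2));
  return { bits, hashes, bytes: Math.ceil(bits / 8) };
}

// 32-bit FNV-1a over UTF-16 code units; the seed is an assumption used to derive a second hash.
function fnv1a(text: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class BloomSketch {
  bitArray: Uint8Array;   // bit-packed flags, 8 per byte
  bitCount: number;
  hashCount: number;      // k
  addedCount = 0;

  constructor(bitCount: number, hashCount: number) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bitArray = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // Double hashing: k probe positions derived from two base hashes.
  positions(key: string): number[] {
    const h1 = fnv1a(key, 0);
    const h2 = (fnv1a(key, 0x9e3779b9) | 1) >>> 0; // forced odd so the step is never zero
    const out: number[] = [];
    for (let i = 0; i < this.hashCount; i++) out.push((h1 + i * h2) % this.bitCount);
    return out;
  }

  add(key: string): void {
    for (const pos of this.positions(key)) this.bitArray[pos >> 3] |= 1 << (pos & 7);
    this.addedCount++;
  }

  mightContain(key: string): boolean {
    return this.positions(key).every(
      pos => (this.bitArray[pos >> 3] & (1 << (pos & 7))) !== 0);
  }
}

// n = 100_000, p = 0.01 gives 958,506 bits (about 117 KiB) and k = 7.
const sized = bloomParameters(100_000, 0.01);
```

A `false` from `mightContain` is definitive, which is what lets the read path return early. A `true` only means the key may be present.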

Compaction

When there are ≥ 4 SSTable files:

SSTable_1 (1000 writes) ─┐
SSTable_2 (800 writes)  ─┤
SSTable_3 (600 writes)  ─┤ → K-way merge sort → SSTable_merged (up to 2800 writes)
SSTable_4 (400 writes)  ─┘

Duplicates: first write wins (records are never edited, so the oldest copy is the match).
Old files: deleted after a successful merge.
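
A K-way merge with the first-write-wins rule might look like the TypeScript sketch below. It is an illustration rather than the project's compaction code. `mergeSsTables` is a hypothetical name, the input tables are assumed to be sorted with unique keys and passed oldest first, and a linear scan over the K heads stands in for a heap because K is small here.

```typescript
interface UrlRecord { shortCode: string; url: string; }

// Merges sorted tables into one sorted table; on duplicate keys the oldest table wins.
function mergeSsTables(tablesOldestFirst: UrlRecord[][]): UrlRecord[] {
  const cursors: number[] = tablesOldestFirst.map(() => 0);
  const merged: UrlRecord[] = [];

  for (;;) {
    // Pick the table whose current head has the smallest key.
    // The strict "<" keeps the earliest (oldest) table on ties.
    let winner = -1;
    for (let t = 0; t < tablesOldestFirst.length; t++) {
      const head: UrlRecord | undefined = tablesOldestFirst[t][cursors[t]];
      if (head === undefined) continue; // this table is exhausted
      if (winner < 0 ||
          head.shortCode < tablesOldestFirst[winner][cursors[winner]].shortCode) {
        winner = t;
      }
    }
    if (winner < 0) return merged; // every table is exhausted

    const record = tablesOldestFirst[winner][cursors[winner]];
    merged.push(record);

    // Advance every cursor sitting on this key, which drops newer duplicates.
    for (let t = 0; t < tablesOldestFirst.length; t++) {
      const head: UrlRecord | undefined = tablesOldestFirst[t][cursors[t]];
      if (head !== undefined && head.shortCode === record.shortCode) cursors[t]++;
    }
  }
}
```

Because duplicates are dropped, the merged table holds at most the sum of the input record counts.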

Why SSTable instead of SQL?

| Feature | SQL (EF Core) | LSM + SSTable |
| --- | --- | --- |
| Write | INSERT + ACID overhead | WAL append → O(1) |
| Read | B-tree index | Bloom filter + sparse index |
| Deletes/edits | Supported | Not needed (immutable URLs) |
| Schema | Migration needed | None |
| Dependencies | SQL Server | Zero |
| Compaction | N/A | Automatic |
| Recovery | Transaction log | WAL replay |

Setup

cd UrlShortener.API
dotnet run
# Automatically creates the ./data/ directory with the WAL and SSTable files
# Swagger: http://localhost:5000/swagger
cd UrlShortener.Frontend
npm install
npm run dev

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/urls/shorten | Shortens a URL |
| GET | /api/urls/expand/{code} | Expands a short code |
| GET | /api/urls/storage/stats | MemTable, SSTable, WAL, and Bloom statistics |
| POST | /api/urls/storage/flush | Manual flush of the MemTable to an SSTable |
| POST | /api/urls/storage/compact | Manual compaction of SSTables |
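
For calling these endpoints from the frontend, a small TypeScript helper along these lines could build the requests. The paths come from the table above. The JSON body shape (`{ url }`) and the helper names `shortenRequest` and `expandRequest` are assumptions, so check the Swagger page for the real contract.

```typescript
interface ApiRequest {
  url: string;
  init: { method: string; headers?: Record<string, string>; body?: string };
}

// Removes trailing slashes so path joining never produces "//".
function trimBase(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// POST /api/urls/shorten. The { url } body field is an assumption, not a documented contract.
function shortenRequest(baseUrl: string, longUrl: string): ApiRequest {
  return {
    url: trimBase(baseUrl) + "/api/urls/shorten",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: longUrl }),
    },
  };
}

// GET /api/urls/expand/{code}, with the code percent-encoded for use in a path segment.
function expandRequest(baseUrl: string, code: string): ApiRequest {
  return {
    url: trimBase(baseUrl) + "/api/urls/expand/" + encodeURIComponent(code),
    init: { method: "GET" },
  };
}
```

Each result can be passed to `fetch(request.url, request.init)`, for example with `http://localhost:5000` as the base URL when running locally.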

Configuration (appsettings.json)

"Storage": {
  "DataDirectory": "data",
  "MemTableThreshold": 500,
  "CompactionTrigger": 4
}

| Parameter | Default | Description |
| --- | --- | --- |
| MemTableThreshold | 500 | Number of writes held in the MemTable before a flush |
| CompactionTrigger | 4 | Number of SSTables that triggers compaction |

Project structure

UrlShortener.API/
├── Infrastructure/
│   ├── Bloom/
│   │   └── BloomFilter.cs          ← double hashing, serialization
│   ├── WAL/
│   │   └── WriteAheadLog.cs        ← append, fsync, truncate, recovery
│   ├── MemTable/
│   │   └── MemTable.cs             ← SortedDictionary, thread-safe
│   ├── SSTable/
│   │   └── SSTableEngine.cs        ← Writer + Reader, sparse index, Bloom
│   ├── LsmStorageEngine.cs         ← orchestration write/read/flush/compact
│   └── RecordSerializer.cs         ← binary format for UrlRecord
├── Services/
│   ├── UrlCleanerService.cs
│   └── UrlShortenerService.cs
├── Controllers/
│   └── UrlsController.cs
└── Models/
    └── Models.cs

data/                               ← created automatically
├── wal.log                         ← WAL file
└── sstables/
    ├── sstable_20240101120000.sst
    └── sstable_20240101130000.sst
