Anki Sentences

🌍 A toolkit for generating sentence-based Anki decks for studying foreign languages

Anki Language Sentence Study Decks

A toolkit for building sentence-based Anki decks from Tatoeba data.

Given a list of target words, or the top N words from a frequency list, the pipeline finds example sentences for each word, enriches each card with translations, difficulty metadata, and Google Text-to-Speech audio, and exports a ready-to-import .apkg file.

This repository is built around one config-driven pipeline.

What It Builds

Each generated deck can include:

  • Source-language example sentences from Tatoeba
  • Sentence translations from Tatoeba
  • Word-by-word and n-gram translation hints from Argos Translate or Google Translate
  • Difficulty scores based on local frequency data
  • Google Text-to-Speech audio files and Anki [sound:...] tags
  • Word-level audio timestamp metadata for the card UI
  • A bundled Svelte Anki card template
  • A final .apkg package that can be imported into Anki
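For readers unfamiliar with Anki's audio syntax, the [sound:...] tag mentioned above can be sketched as follows. The helper name soundTag is hypothetical and is not taken from this repository; the only fixed part is the tag format itself, which references a media file by its bare file name.

```typescript
// Hypothetical helper, not the project's actual code.
// Anki stores media in a flat folder, so the tag takes a bare file name.
function soundTag(fileName: string): string {
  // Reject path separators and brackets, which would break the tag.
  if (/[\[\]\/\\]/.test(fileName)) {
    throw new Error(`Invalid media file name: ${fileName}`);
  }
  return `[sound:${fileName}]`;
}
```

A field containing soundTag("sentence-0001.m4a") would play that file when the card is shown, provided the file is packaged in the deck's media.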

Requirements

Use the Nix shell to install required packages:

nix develop

Quick Start

Edit apps/deck-cli/deck.config.jsonc, then run:

cd apps/deck-cli
bun run build

By default, generated files are written under output/, at the paths configured in apps/deck-cli/deck.config.jsonc.

Configuration

The pipeline is controlled by apps/deck-cli/deck.config.jsonc.

The checked-in config references apps/deck-cli/deck.config.schema.json, so editors with JSON Schema support should provide validation and completion.

To run with another config file:

DECK_CONFIG_PATH=/absolute/or/relative/path.jsonc bun run src/index.ts

The default CLI entrypoint is apps/deck-cli/src/index.ts.
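The environment override can be pictured with a minimal sketch like the one below. It is an assumption about how such a lookup might work, not a copy of src/index.ts: the function name resolveConfigPath, the Env type, and the fallback handling are all illustrative.

```typescript
// Illustrative only: the real lookup in src/index.ts is not shown in this README.
type Env = Record<string, string | undefined>;

// Assumed default, relative to apps/deck-cli.
const DEFAULT_CONFIG_PATH = "deck.config.jsonc";

function resolveConfigPath(env: Env, fallback: string = DEFAULT_CONFIG_PATH): string {
  const raw = env["DECK_CONFIG_PATH"];
  const override = raw === undefined ? "" : raw.trim();
  // An unset or blank variable falls back to the checked-in config.
  return override.length > 0 ? override : fallback;
}
```

In a Bun or Node process this would be called as resolveConfigPath(process.env).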

Pipeline Passes

Available pass names:

  • retrieve - fetch matching Tatoeba sentence rows into the CSV
  • enrich-translations - add word and n-gram translation metadata
  • enrich-translation-alternatives - fill missing translation alternatives where available
  • enrich-difficulty - score and sort cards by difficulty
  • enrich-audio - generate Google TTS audio and timestamp metadata
  • build-apkg - build the Anki package

You can remove passes from apps/deck-cli/deck.config.jsonc when iterating on a specific stage. For example, after retrieving sentences once, you can rerun only enrichment or packaging passes against the existing CSV.
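As a rough illustration of running a subset of passes, the sketch below validates a list of pass names and returns them in the order listed above. Two things here are assumptions: that the pipeline always runs passes in that listed order, and the helper name selectPasses. The actual config key that holds the pass list is not shown in this README, so the sketch takes a plain array.

```typescript
// Pass names as listed in this README; the fixed ordering is an assumption.
const PASS_ORDER = [
  "retrieve",
  "enrich-translations",
  "enrich-translation-alternatives",
  "enrich-difficulty",
  "enrich-audio",
  "build-apkg",
] as const;

type PassName = (typeof PASS_ORDER)[number];

// Hypothetical helper: reject unknown names, then keep the canonical order.
function selectPasses(requested: readonly string[]): PassName[] {
  const known: readonly string[] = PASS_ORDER;
  const unknown = requested.filter((name) => known.indexOf(name) === -1);
  if (unknown.length > 0) {
    throw new Error(`Unknown pass name(s): ${unknown.join(", ")}`);
  }
  return PASS_ORDER.filter((name) => requested.indexOf(name) !== -1);
}
```

For example, selectPasses(["build-apkg", "enrich-audio"]) keeps only the audio and packaging stages, which matches the rerun-against-an-existing-CSV workflow described above.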

Translation Providers

Argos Translate

Argos Translate runs locally through the FastAPI service in apps/argos-translate-service/.

Start it from the deck CLI directory:

cd apps/deck-cli
bun run argos:start

Then set:

"translation": {
  "provider": "argos",
  "sourceLanguage": "de",
  "targetLanguage": "en",
  "argos": {
    "translateUrl": "http://127.0.0.1:8000/translate",
    "cachePath": "../../output/argos-translate-cache.json",
    "alternatives": 2
  }
}

The service host and port can be overridden with:

ARGOS_HOST=127.0.0.1
ARGOS_PORT=8000
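If the host or port is overridden, the translateUrl in the config presumably has to point at the same address. The sketch below derives that URL from the two variables. The function name argosTranslateUrl is hypothetical, and treating 127.0.0.1 and 8000 as the defaults is an assumption based on the values shown above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: builds the URL that "translateUrl" would need to match.
function argosTranslateUrl(env: Env): string {
  // Assumed defaults, taken from the example values in this README.
  const host = env["ARGOS_HOST"] || "127.0.0.1";
  const port = env["ARGOS_PORT"] || "8000";
  return `http://${host}:${port}/translate`;
}
```

With ARGOS_PORT=9000 set, this yields http://127.0.0.1:9000/translate, which is the value to put in translation.argos.translateUrl.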

Google Translate

Google Translate authenticates with an API key, an access token, or local Application Default Credentials.

Set the provider to google:

"translation": {
  "provider": "google",
  "sourceLanguage": "de",
  "targetLanguage": "en",
  "argos": {
    "translateUrl": "http://127.0.0.1:8000/translate",
    "cachePath": "../../output/argos-translate-cache.json",
    "alternatives": 2
  },
  "google": {
    "translateUrl": "https://translation.googleapis.com/language/translate/v2",
    "cachePath": "../../output/google-translate-cache.json",
    "accessToken": null,
    "apiKey": null,
    "quotaProject": null
  }
}
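To show how the three credential fields might map onto a Cloud Translation v2 request, here is a minimal sketch. The precedence (API key first, then access token) and the names buildTranslateRequest and HttpRequest are assumptions, not the CLI's actual behavior. The sketch does not cover Application Default Credentials: it assumes a token has already been obtained, for example with gcloud auth application-default print-access-token.

```typescript
// Mirrors the "google" block of the config shown above.
interface GoogleTranslateConfig {
  translateUrl: string;
  accessToken: string | null;
  apiKey: string | null;
  quotaProject: string | null;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: the credential precedence is an assumption.
function buildTranslateRequest(
  config: GoogleTranslateConfig,
  text: string,
  source: string,
  target: string,
): HttpRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  let url = config.translateUrl;
  if (config.apiKey) {
    // API keys are passed as the "key" query parameter.
    url += `?key=${encodeURIComponent(config.apiKey)}`;
  } else if (config.accessToken) {
    headers["Authorization"] = `Bearer ${config.accessToken}`;
    // User credentials usually need an explicit quota project.
    if (config.quotaProject) {
      headers["x-goog-user-project"] = config.quotaProject;
    }
  } else {
    throw new Error("No apiKey or accessToken set; obtain an ADC token first.");
  }
  const body = JSON.stringify({ q: text, source, target, format: "text" });
  return { url, method: "POST", headers, body };
}
```

The returned object can be passed to fetch(request.url, request), and the translated text is read from the JSON response.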

Google Text-to-Speech

The audio pass uses Google Cloud Text-to-Speech and requires OAuth2 credentials. API keys are not supported for this endpoint.

For local development, use Application Default Credentials:

gcloud auth application-default login

Make sure the relevant APIs are enabled in your Google Cloud project:

  • Cloud Text-to-Speech API
  • Cloud Translation API, if using Google Translate
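For orientation, a Cloud Text-to-Speech call authenticated with an OAuth2 token looks roughly like the sketch below. It is not the CLI's implementation: the function name buildSynthesizeRequest, the MP3 encoding, and the voice selection by language code alone are assumptions, and the sketch ignores the word-level timestamp metadata the audio pass also produces.

```typescript
interface SynthesizeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a request for the v1 text:synthesize endpoint.
function buildSynthesizeRequest(
  text: string,
  languageCode: string,
  accessToken: string,
  quotaProject?: string,
): SynthesizeRequest {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // OAuth2 bearer token, e.g. from Application Default Credentials.
    Authorization: `Bearer ${accessToken}`,
  };
  if (quotaProject) {
    headers["x-goog-user-project"] = quotaProject;
  }
  const body = JSON.stringify({
    input: { text },
    voice: { languageCode },
    audioConfig: { audioEncoding: "MP3" }, // assumed encoding
  });
  return {
    url: "https://texttospeech.googleapis.com/v1/text:synthesize",
    method: "POST",
    headers,
    body,
  };
}
```

The response carries the audio as base64 in its audioContent field, which would then be decoded and written to disk before transcoding.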

ffmpeg must be available on PATH; the CLI transcodes generated speech to AAC before packaging it into Anki.
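The transcoding step can be sketched as an ffmpeg argument list. The flags are standard ffmpeg options, but the 64k bitrate, the helper name aacTranscodeArgs, and the file names in the usage note are assumptions. The CLI's real settings are not documented here.

```typescript
// Hypothetical helper: argument list for an AAC transcode with ffmpeg.
function aacTranscodeArgs(
  inputPath: string,
  outputPath: string,
  bitrate: string = "64k", // assumed bitrate
): string[] {
  return [
    "-y",            // overwrite the output file if it exists
    "-i", inputPath, // input audio file
    "-vn",           // drop any video stream
    "-c:a", "aac",   // encode audio with ffmpeg's AAC encoder
    "-b:a", bitrate, // target audio bitrate
    outputPath,
  ];
}
```

Under Bun this could be run with Bun.spawn(["ffmpeg", ...aacTranscodeArgs("in.mp3", "out.m4a")]).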

Card Template

The card UI lives in apps/card-template/ and is bundled into a single HTML artifact:

cd apps/deck-cli
bun run template:build

The generated artifact is written to apps/card-template/dist/index.html. The APKG build uses this template automatically.

Useful Commands

Data Sources

Project Status + Notes

This project is feature-complete for its original goal: generating personal Anki sentence decks from Tatoeba with translation hints, difficulty ordering, audio, and a bundled card UI.

Future work is expected to be limited to maintenance, source updates, and small quality fixes.

Don't hesitate to send a PR, and feel free to reach out if you have any questions :)

License

MIT

See LICENSE
