Blog post: Build Your Own GitHub Copilot
This repo contains:
cd src/dataset_gen && npm install
node gen.js <path to svelte repo>
Then follow the notebook for running SFT on the training data.
If you follow the notebook, you should have a generated.test.jsonl (or generated-post-finetune.test.jsonl) file containing the prefix, suffix, expected completion, and actual completion.
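To eyeball individual results, it can help to splice a completion back between its prefix and suffix. The snippet below is a minimal sketch, not code from this repo. The key names (prefix, suffix, expected, actual) are assumptions made for illustration, so check them against the keys in your own file.

```python
import json

# Assumed key names; adjust them to match the keys in your JSONL file.
PREFIX_KEY = "prefix"
SUFFIX_KEY = "suffix"


def load_records(path):
    """Read one JSON object per non-empty line of a JSONL file."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def reassemble(record, completion_key):
    """Return prefix + completion + suffix for the given completion key."""
    return record[PREFIX_KEY] + record[completion_key] + record[SUFFIX_KEY]
```

Calling reassemble(record, "expected") and reassemble(record, "actual") on the same record gives two versions of the source that can be compared side by side.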
You can then run python src/metrics.py <path to generated jsonl file> to get some basic accuracy and BLEU metrics.
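For a rough idea of what an accuracy number over this file means, here is a minimal sketch of exact-match accuracy. It is not the code in src/metrics.py, which may define accuracy differently, and the key names expected and actual are assumptions. BLEU is left out because it needs a proper n-gram implementation.

```python
def exact_match_accuracy(records, expected_key="expected", actual_key="actual"):
    """Fraction of records whose actual completion equals the expected one.

    Leading and trailing whitespace is ignored. The key names are assumed,
    not taken from the repo.
    """
    if not records:
        return 0.0
    hits = sum(
        1
        for r in records
        if r[expected_key].strip() == r[actual_key].strip()
    )
    return hits / len(records)


# Tiny in-memory example with made-up records.
sample = [
    {"expected": "count += 1", "actual": "count += 1 "},
    {"expected": "return x", "actual": "return y"},
]
print(exact_match_accuracy(sample))  # 0.5
```

Exact match is strict: a completion that is functionally equivalent but formatted differently counts as a miss, which is one reason to report a softer metric such as BLEU alongside it.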