Generates a training set from the Svelte documentation.
Work in progress
Run once to generate the main dataset. Run again to generate the validation and test datasets.
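
How each run decides which dataset to write is not specified here. As a minimal sketch, assuming one output file per split in a single directory, a helper could pick the first split that has not been generated yet. The split names, the `.jsonl` file naming, and the `next_split` function are all hypothetical and only illustrate the run-once-per-dataset workflow.

```python
from pathlib import Path

# Hypothetical split order (an assumption, not taken from the script):
# the first run writes the main (training) set, later runs write the
# validation and test sets.
SPLITS = ["train", "validation", "test"]


def next_split(out_dir):
    """Return the first split whose file is missing from out_dir, or None if all exist."""
    out = Path(out_dir)
    for name in SPLITS:
        # Assumed naming convention: one "<split>.jsonl" file per split.
        if not (out / f"{name}.jsonl").exists():
            return name
    return None
```

Under these assumptions, calling `next_split("data")` before each run would name the dataset that run should produce, and would return `None` once all three files exist.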