Webcrawler

A Python-based web crawler with a Svelte frontend that allows the user to query web crawls and browse the database. Additional analytics features are planned.

This project is a simple web crawler implemented in Python. It was inspired by @afazio1, who presented their own web crawler in a YouTube video.

In contrast to their implementation, this version was created in Python with the goal of getting familiar with the Python ecosystem in preparation for future Python-heavy projects.

Status / Roadmap

  • Before building yet another frontend (which I have done several times in the past), I might pursue learning more about data analytics so that I can implement these features in the logic first. Building the UI will be more useful once more features are available.

  • After that, I will also look more into the deployment process.

Goals & Focus

The main focus of this project is to:

  • Explore and learn the use of Python libraries such as Beautiful Soup (bs4), pandas, and others.
  • Build and expand on data analysis logic based on crawled data.
  • Lay the groundwork for future features, such as:
    • A custom web crawler implementation.
    • A UI for easier interaction.
    • Containerizing the app for simplified deployment and usage.
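
As an illustration of the kind of analysis logic meant above, here is a minimal pandas sketch. The column names (`url`, `status`) and the sample rows are assumptions made up for this example; they are not the project's actual schema or data.

```python
from urllib.parse import urlparse

import pandas as pd

# Hypothetical crawl results; the real project may store different fields.
crawled = pd.DataFrame(
    {
        "url": [
            "https://example.com/",
            "https://example.com/about",
            "https://example.com/missing",
            "https://example.org/docs",
        ],
        "status": [200, 200, 404, 200],
    }
)

# Derive the domain of each crawled page from its URL.
crawled["domain"] = crawled["url"].map(lambda u: urlparse(u).netloc)

# Number of crawled pages per domain.
pages_per_domain = crawled.groupby("domain").size()

# Share of responses per domain with an HTTP status of 400 or higher.
error_rate = (crawled["status"] >= 400).groupby(crawled["domain"]).mean()

print(pages_per_domain)
print(error_rate)
```

Grouping by a derived `domain` column keeps the raw crawl table untouched, so further metrics can be added as separate aggregations later.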

Current State

At the moment, the project uses Beautiful Soup for parsing, but a custom crawling mechanism might be added later to gain more control over the scraping and data processing pipeline.
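
The parsing step can be sketched roughly as follows. This is not the project's actual code: the function name `extract_links` and the sample HTML are hypothetical, and the example parses an inline string instead of fetching a page so that it runs without network access.

```python
from urllib.parse import urldefrag, urljoin

from bs4 import BeautifulSoup


def extract_links(html: str, base_url: str) -> list[str]:
    """Return the absolute, de-duplicated http(s) link targets of a page."""
    soup = BeautifulSoup(html, "html.parser")
    links: list[str] = []
    for anchor in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL and drop #fragments.
        absolute, _fragment = urldefrag(urljoin(base_url, anchor["href"]))
        # Skip non-web schemes such as mailto: and links already seen.
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


sample_html = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.org/docs#intro">Docs</a>
  <a href="/about#team">Team</a>
  <a href="mailto:someone@example.org">Mail</a>
</body></html>
"""

print(extract_links(sample_html, "https://example.com/index.html"))
# prints ['https://example.com/about', 'https://example.org/docs']
```

A crawler would typically feed the returned URLs back into a queue of pages to visit; that loop, along with fetching and rate limiting, is left out of this sketch.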

Setup

TBD
