SiteSeeker is a minimalist web crawler designed for constructing detailed website graphs and extracting link information. The app supports parallel web scraping to ensure efficient and scalable data extraction.
To run the application, you will need Docker installed.
You can specify custom environment variables in the .env file located at the root of the project. By default, the following ports are used for the backend and frontend:
BACKEND_PORT=3000
FRONTEND_PORT=8080
The .env file in the frontend directory must point to localhost on the port set as BACKEND_PORT in the root .env file:
VITE_API_URL=http://localhost:3000
You can modify these values to suit your needs.
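As a rough illustration of how these variables can be consumed, here is a minimal sketch of reading the ports with a fallback to the defaults listed above. The `resolvePort` helper is hypothetical and not taken from the SiteSeeker source; it only assumes the variables are available in `process.env`.

```javascript
// Hypothetical helper, not part of SiteSeeker: reads a port from an
// environment object and falls back to a default when the variable is
// missing or is not an integer in the range 1-65535.
function resolvePort(env, name, fallback) {
  const port = Number(env[name]);
  const valid = Number.isInteger(port) && port >= 1 && port <= 65535;
  return valid ? port : fallback;
}

// Defaults mirror the values listed in the root .env file.
const backendPort = resolvePort(process.env, "BACKEND_PORT", 3000);
const frontendPort = resolvePort(process.env, "FRONTEND_PORT", 8080);

// The frontend's VITE_API_URL is expected to target the backend port.
const apiUrl = `http://localhost:${backendPort}`;
console.log({ backendPort, frontendPort, apiUrl });
```

If you change BACKEND_PORT, remember to update VITE_API_URL in the frontend .env file so both stay in sync.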
To clone and run the project using Docker, follow these steps:
git clone https://github.com/MiraZzle/site-seeker.git
cd site-seeker
docker compose up --build
For subsequent runs, once the images are built, you can start the containers with:
docker compose up
This will start both the backend and frontend of the application.
To run the project in development mode, follow the steps below:
git clone https://github.com/MiraZzle/site-seeker.git
cd site-seeker
cd backend
Install dependencies:
npm install
Start the backend server:
node src/server.mjs
Then, in a second terminal opened at the project root:
cd frontend
Install dependencies:
npm install
Start the development server:
npm run dev
Once the application is running, open http://localhost followed by the relevant port to access each service: the port you specified in the .env file (e.g., http://localhost:3000 for the backend), or http://localhost:5173 for the frontend development server.
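To confirm that both services are up, a small smoke test along these lines may help. This is a sketch only: it assumes Node.js 18 or newer (for the global `fetch` and `AbortSignal.timeout`), and the `isReachable` and `checkServices` names are hypothetical. The URLs are the example addresses mentioned above; adjust them to your own ports.

```javascript
// Hypothetical smoke test, not part of SiteSeeker.
// Returns true if anything answers over HTTP at the given URL.
async function isReachable(url, timeoutMs = 2000) {
  try {
    // Any HTTP response (even a 404) means a server is listening.
    await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return true;
  } catch {
    // Connection refused, timeout, or other network error.
    return false;
  }
}

// Probes each named URL in turn and collects the results.
// `probe` is injectable so the function can be exercised without a network.
async function checkServices(urls, probe = isReachable) {
  const results = {};
  for (const [name, url] of Object.entries(urls)) {
    results[name] = await probe(url);
  }
  return results;
}

// Example addresses from this README; change them if you use other ports.
checkServices({
  backend: "http://localhost:3000",
  frontend: "http://localhost:5173",
}).then((results) => console.log(results));
```

A `false` entry usually means the service has not started yet or is listening on a different port than the one probed.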
For detailed information about the project's architecture, features, and usage examples, please visit our GitHub Wiki.