uniLLM Backend
A FastAPI-based backend service for the German Student Info Chatbot, providing RAG (Retrieval-Augmented Generation) capabilities with vector search and LLM integration.
Table of Contents
- Overview
- Prerequisites
- Installation
- Configuration
- Development
- API Endpoints
- Data Pipeline
- Architecture
- Deployment
- Current Status
- Contributing
- License
Overview
The uniLLM backend is built with:
- FastAPI - Modern, fast web framework for building APIs
- LlamaIndex - RAG framework for document indexing and retrieval
- Qdrant - Vector database for semantic search
- OpenAI GPT-4 - Large language model for response generation
Prerequisites
- Python 3.8+
- Docker and Docker Compose
- Git
Installation
- Clone the repository:
git clone https://github.com/rissalhedna/unillm.git
cd unillm
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Start the required services:
docker compose up -d
This will start:
- PostgreSQL database on port 5432
- Qdrant vector database on port 6333
Configuration
Create a .env file in the root directory with the following variables:
# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/postgres
# Qdrant Vector Database
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_URL=http://localhost:6333
# OpenAI API
OPENAI_API_KEY=your_openai_api_key_here
# Environment
DEV=dev # or "prod" for production
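As a minimal sketch of how the application might consume these variables (the actual settings module may differ; the variable names are taken from the .env example above, and the fail-fast behavior is an assumption):

```python
import os

# Names taken from the .env example above.
REQUIRED_VARS = ["DATABASE_URL", "QDRANT_URL", "OPENAI_API_KEY"]


def load_settings() -> dict:
    """Collect required settings from the environment, failing fast on gaps."""
    missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "database_url": os.environ["DATABASE_URL"],
        "qdrant_url": os.environ["QDRANT_URL"],
        "openai_api_key": os.environ["OPENAI_API_KEY"],
        # DEV defaults to "dev" when unset, matching the example above.
        "env": os.environ.get("DEV", "dev"),
    }
```

Failing at startup when a variable is missing surfaces misconfiguration immediately instead of at the first request that needs the value.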
Development
Local Development
- Start the development server:
fastapi dev main.py
The API will be available at http://localhost:8000
- Access the interactive API documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Docker Development
- Build and run with Docker:
docker compose up --build
Code Quality
Before committing changes, run pre-commit hooks:
pre-commit run --all-files
API Endpoints
Chat Endpoints
- POST /chat - Send a message to the chatbot
- GET /chats - Retrieve chat history
- GET /chats/{chat_id} - Get a specific chat conversation
Health Check
- GET /health - API health status
Documentation
- GET /docs - Swagger UI documentation
- GET /redoc - ReDoc documentation
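The request and response schemas are not documented here, so the client sketch below uses a hypothetical payload shape: the field names `message` and `chat_id` are assumptions, not confirmed by the source.

```python
import json
from typing import Optional
from urllib import request


def build_chat_request(base_url: str, message: str,
                       chat_id: Optional[str] = None) -> request.Request:
    """Build a POST /chat request; the JSON field names are assumed."""
    payload = {"message": message}
    if chat_id is not None:
        payload["chat_id"] = chat_id
    return request.Request(
        url=f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example usage (requires the dev server to be running):
# req = build_chat_request("http://localhost:8000", "How do I apply to a German university?")
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```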
Data Pipeline
The backend includes a comprehensive data processing pipeline for managing the knowledge base:
Web Scraping
Located in the /scripts folder:
- handbook_germany_crawler.py - Scrapes handbook-germany.de
- study_in_germany_crawler.py - Scrapes study-in-germany.de
Data Processing
The pipeline processes data through:
- Web Scraping: Automated crawlers collect information from German study websites
- Data Cleaning: Raw data is processed and cleaned
- Text Chunking: Documents are split into appropriate chunks for embeddings
- Vectorization: Text chunks are converted to embeddings using OpenAI
- Storage: Embeddings are stored in Qdrant vector database
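Step 3 can be illustrated with a simple fixed-size chunker. This is a sketch only: the actual pipeline presumably uses LlamaIndex's node parsers with token-based sizes, whereas this version splits on characters.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows.

    The overlap keeps content that straddles a chunk boundary available
    in both neighboring chunks, which helps retrieval quality.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk would then be embedded (step 4) and upserted into a Qdrant collection (step 5).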
Running the Pipeline
- Install browser dependencies:
playwright install chromium
- Run scrapers:
cd scripts
python handbook_germany_crawler.py
python study_in_germany_crawler.py
- Process the data using the notebook:
jupyter notebook notebooks/scraping_cleaning_pipeline.ipynb
Architecture
User Query → FastAPI → LlamaIndex → Qdrant (Vector Search) → OpenAI GPT-4 → Response
Key Components
- Query Processing: FastAPI receives and validates user queries
- RAG System: LlamaIndex orchestrates retrieval and generation
- Vector Search: Qdrant finds relevant document chunks
- Response Generation: OpenAI GPT-4 generates contextual responses
- Fallback System: Search-engine fallback for queries outside the knowledge base
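The flow above can be sketched end-to-end with stand-in components: keyword-overlap scoring in place of Qdrant's vector search, and a caller-supplied function in place of the GPT-4 call. Everything here is illustrative, not the actual implementation.

```python
from typing import Callable, List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def answer(query: str, documents: List[str], llm: Callable[[str], str]) -> str:
    """Retrieve context, build a grounded prompt, delegate generation to `llm`."""
    context = "\n".join(retrieve(query, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)
```

In the real system, `retrieve` corresponds to LlamaIndex querying Qdrant for nearest-neighbor chunks, and `llm` corresponds to the OpenAI GPT-4 call.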
Deployment
Production Deployment
- Set environment variables for production
- Use Docker for containerized deployment:
docker build -t unillm-backend .
docker run -p 8000:8000 unillm-backend
Deployment options:
- Railway: Simple deployment with database support
- Linode: Cost-effective VPS hosting
- Docker: Containerized deployment on any host
Current Status
✅ Working Features:
- FastAPI backend with async support
- RAG system with Qdrant integration
- Document indexing and retrieval
- OpenAI GPT-4 integration
- Web scraping pipeline
- Docker containerization
🔧 Known Issues:
- Chat history persistence needs improvement
- Search engine fallback system in development
🚧 Upcoming Features:
- Enhanced search engine fallback
- CV matching functionality
- Improved caching system
- Rate limiting and authentication
- Monitoring and logging improvements
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and pre-commit hooks
- Submit a pull request
License
This project is licensed under the MIT License.