Localama is a lightweight AI chat application for running Ollama models locally. It provides a clean, responsive chat interface with real-time streaming, enabling seamless interaction with large language models directly on your machine.
| Model Size | Minimum RAM |
| --- | --- |
| 7B | 8 GB |
| 13B | 16 GB |
| 33B | 32 GB |
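If you are unsure how much memory is available before pulling a larger model, you can check it from the terminal. A minimal sketch for Linux (the command and output format differ on other operating systems):

free -h   # shows total and available RAM in human-readable units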
Install Ollama on your machine from here
Test-run Ollama:
ollama serve
Note: Ollama should be running at http://localhost:11434/
Get an Ollama model from here
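To confirm Ollama is reachable at that address, you can hit the server root from another terminal (assuming the default port):

curl http://localhost:11434/   # should respond with a short "Ollama is running" message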
Pull your model:
ollama pull <model-name>
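For example, to pull a small general-purpose model (the model name here is only an illustration; use any model from the Ollama library):

ollama pull llama3   # downloads the model to your local Ollama store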
Check available models (optional):
ollama list
Test run your model (optional):
ollama run <model-name>
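Running a model opens an interactive prompt; you can also pass a one-off prompt on the command line. Using the illustrative model pulled above:

ollama run llama3 "Hello, what can you do?"   # sends a single prompt to the model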
| Method | Description |
| --- | --- |
| Docker | Run both frontend and backend together. |
| Manual | Run frontend and backend separately on your machine. |
Run both the frontend and backend together using Docker Compose. Install Docker and Docker Compose, then clone this repository, which provides the Docker setup:
git clone https://github.com/sharifmrahat/localama.git
cd localama
Run Docker Compose:
docker-compose up -d   # -d starts the containers in detached (background) mode
Check the status:
docker ps
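If a container doesn't come up as expected, you can follow the combined logs of the services started by the compose file:

docker-compose logs -f   # stream logs from all compose services; Ctrl+C to stop following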
To stop the services:
docker-compose down
Backend: http://localhost:5000/
Frontend: http://localhost:3000/
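As a quick smoke test that both containers are serving (the exact response body depends on the app, so treat these as reachability checks only):

curl http://localhost:5000/   # backend should answer on port 5000
curl http://localhost:3000/   # frontend should answer on port 3000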
Run the frontend and backend separately after cloning their repositories from GitHub.
Note: To run the frontend with Bun, you have to install it on your machine; you can also use npm instead.
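If Bun isn't installed yet, the official installer is a one-line script, or you can install it through npm:

curl -fsSL https://bun.sh/install | bash   # official Bun install script
npm install -g bun                         # alternative: install Bun via npm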
Clone backend repository:
git clone https://github.com/sharifmrahat/localama-api.git
cd localama-api
Install dependencies:
npm install
Run backend service:
npm run start:dev
Backend: http://localhost:5000/
Clone frontend repository:
git clone https://github.com/sharifmrahat/localama-fe.git
cd localama-fe
Install dependencies:
bun install
Run frontend:
bun run dev
Frontend: http://localhost:5173/
Owner: Sharif