MURPH - Intelligent Voice Assistant & Browser Automation Agent

MURPH is an advanced, voice-activated AI assistant featuring natural language processing, browser automation, and contextual memory. Designed for seamless voice interactions, MURPH combines state-of-the-art speech recognition, intelligent conversation management, and system-level control capabilities to deliver a comprehensive virtual assistant experience.

Table of Contents

  • Features
  • Architecture
  • Prerequisites
  • Installation
  • Configuration
  • Usage
  • API Reference
  • Troubleshooting
  • Contributing
  • License
  • Acknowledgments
  • Contact
  • Roadmap

Features

Core Capabilities

  • Advanced Speech Recognition: Leverages OpenAI Whisper for high-accuracy speech-to-text conversion
  • Contextual Conversation Memory: Utilizes ChromaDB vector database for persistent conversation history and context retention
  • Natural Voice Synthesis: Implements Piper TTS for natural-sounding male voice output with gTTS fallback support
  • Adaptive Personality System: Configurable humor levels (0-100%) ranging from professional to highly personable interactions
  • Full Browser Automation: Selenium-powered web browser control for autonomous navigation and interaction

Browser Control

  • Website navigation and URL management
  • Integrated search across major platforms (Google, Wikipedia, YouTube)
  • YouTube video playback control (play/pause/volume/navigation)
  • Page manipulation (scrolling, content reading, tab management)
  • Multi-tab session handling
  • Application switching and focus management

System Integration

  • Application launch and control
  • Automated text input across applications
  • File system operations (read/write/list)
  • Directory navigation
  • Inter-application communication

Intelligence Features

  • Real-time weather data retrieval
  • Time and date queries
  • Web search integration
  • Conversation history tracking and analysis
  • Context-aware response generation

Architecture

MURPH employs a modern, microservices-inspired architecture:

  • Backend: FastAPI server handling voice processing, AI inference, and system operations
  • Frontend: Svelte-based responsive UI for voice interaction and visual feedback
  • AI Engine: Ollama-powered LLM (Llama 3.1) for natural language understanding
  • Vector Database: ChromaDB for semantic search and conversation memory
  • Speech Pipeline: Whisper → Ollama → Piper/gTTS for complete voice interaction
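
The snippet below is a rough end-to-end illustration of that pipeline, not MURPH's actual main.py: it assumes the Whisper base model, the Ollama model pulled during installation, the Piper voice downloaded in the setup steps, and a local recording named input.wav.

import json, subprocess, urllib.request
import whisper  # openai-whisper

# 1. Speech-to-text: transcribe the recorded clip
stt = whisper.load_model("base")
text = stt.transcribe("input.wav")["text"]

# 2. Language model: ask the local Ollama server for a reply
payload = json.dumps({
    "model": "llama3.1:8b-instruct-q4_K_M",
    "prompt": text,
    "stream": False,
}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    answer = json.loads(resp.read())["response"]

# 3. Text-to-speech: pipe the reply through the Piper CLI (gTTS is the fallback)
subprocess.run(
    ["piper", "--model", "models/en_US-hfc_male-medium.onnx", "--output_file", "reply.wav"],
    input=answer.encode(),
    check=True,
)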

Prerequisites

Ensure the following software is installed on your system:

Requirement   Version   Purpose
Python        3.12+     Backend runtime
Node.js       18+       Frontend development
FFmpeg        Latest    Audio processing
Ollama        Latest    LLM inference
Chrome        Latest    Browser automation
Git           Latest    Version control

Installation

System Dependencies

FFmpeg Installation

Windows:

# Download from https://ffmpeg.org/download.html
# Extract and add bin folder to System PATH
# Verify installation
ffmpeg -version

Ubuntu/Debian:

sudo apt update
sudo apt install ffmpeg -y
ffmpeg -version

macOS:

brew install ffmpeg
ffmpeg -version

Ollama Setup

  1. Install Ollama:

    Visit https://ollama.ai and download the appropriate installer for your operating system.

  2. Pull Required Model:

    # Download Llama3.1:8b-instruct-q4_K_M
    ollama pull llama3.1:8b-instruct-q4_K_M
       
    # Verify installation
    ollama list
    
    # Test model (optional)
    ollama run llama3.1:8b-instruct-q4_K_M "Hello, test message"
    
  3. Start Ollama Service:

    # Ollama typically runs as a background service
    # If not running, start manually:
    ollama serve
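
Once the service is up, you can verify that the backend will be able to reach it with a short Python check against Ollama's REST API (this mirrors the output of ollama list; the endpoint belongs to Ollama itself, not MURPH):

import json, urllib.request

# Ask the local Ollama server which models it has installed
with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=10) as resp:
    models = [m["name"] for m in json.loads(resp.read())["models"]]

print(models)  # should include 'llama3.1:8b-instruct-q4_K_M'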
    

ChromaDB Installation

ChromaDB is included in the Python dependencies and will be installed automatically. However, for optimal performance, ensure the following:

  1. System Requirements:

    • Minimum 4GB RAM available
    • 1GB free disk space for database storage
  2. Installation Verification:

    # After pip install, verify ChromaDB
    python -c "import chromadb; print(chromadb.__version__)"
    
  3. Database Initialization:

    ChromaDB will automatically initialize on first run. The database files will be stored in:

    ./memory_db/
    
  4. Configuration (Optional):

    For production deployments, consider ChromaDB's client-server mode:

    # Install ChromaDB server
    pip install chromadb[server]
    
    # Run ChromaDB server
    chroma run --path ./memory_db
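
Whichever mode you use, the memory pattern MURPH relies on can be exercised directly from Python. The sketch below is illustrative rather than the project's actual code; it stores one conversation turn and retrieves it by semantic similarity, using the default collection name from the Configuration section:

import chromadb

# Persistent on-disk client, same path the assistant uses
client = chromadb.PersistentClient(path="./memory_db")
collection = client.get_or_create_collection("conversation_history")

# Store one conversation turn with a timestamp
collection.add(
    ids=["turn-0001"],
    documents=["User: what's the weather in Paris? | MURPH: It's sunny and mild."],
    metadatas=[{"timestamp": "2025-11-17T10:30:00Z"}],
)

# Semantic recall: find past turns related to a new query
hits = collection.query(query_texts=["weather"], n_results=3)
print(hits["documents"])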
    

Backend Installation

  1. Clone Repository:

    git clone https://github.com/Prajwal-Pujari/Murph-.git
    cd Murph-
    
  2. Create Virtual Environment:

    # Create environment
    python -m venv venv
       
    # Activate environment
    # Windows:
    venv\Scripts\activate
    # Linux/macOS:
    source venv/bin/activate
    
  3. Install Python Dependencies:

    pip install --upgrade pip
    pip install -r requirements.txt
    

    requirements.txt contents:

    fastapi==0.104.1
    uvicorn[standard]==0.24.0
    aiohttp==3.9.1
    torch==2.1.0
    openai-whisper==20231117
    pyttsx3==2.90
    chromadb==0.4.18
    gtts==2.4.0
    piper-tts==1.2.0
    selenium==4.15.2
    webdriver-manager==4.0.1
    pyautogui==0.9.54
    pygetwindow==0.0.9
    python-multipart==0.0.6
    python-dotenv==1.0.0
    

    Note: The pinned versions above are known to work together. If you prefer the latest releases, remove the version specifications, but be prepared to resolve any resulting dependency conflicts.

  4. Download Piper TTS Model:

    # Create models directory
    mkdir -p models
    cd models
       
    # Download male voice model
    # Visit: https://github.com/rhasspy/piper/releases/
    # Download both files:
    # - en_US-hfc_male-medium.onnx
    # - en_US-hfc_male-medium.onnx.json
    
    # Or use wget (Linux/macOS):
    wget https://github.com/rhasspy/piper/releases/download/v1.2.0/en_US-hfc_male-medium.onnx
    wget https://github.com/rhasspy/piper/releases/download/v1.2.0/en_US-hfc_male-medium.onnx.json
    
    cd ..
    
  5. Initialize ChromaDB:

    # ChromaDB will auto-initialize, but you can pre-create the directory
    mkdir -p memory_db
       
    # Test ChromaDB setup
    python -c "import chromadb; client = chromadb.PersistentClient(path='./memory_db'); print('ChromaDB initialized successfully')"
    
  6. Start Backend Server:

    uvicorn main:app --reload --host 0.0.0.0 --port 8000
    

Frontend Installation

  1. Navigate to Frontend Directory:

    cd frontend
    
  2. Install Node Dependencies:

    npm install
    
  3. Start Development Server:

    npm run dev
    
  4. Build for Production (Optional):

    npm run build
    npm run preview
    

Access Application

Open your web browser and navigate to:

http://localhost:5173

The backend API will be available at:

http://localhost:8000

API documentation can be accessed at:

http://localhost:8000/docs

Configuration

Backend Configuration

Edit main.py to customize settings:

# Ollama Configuration
OLLAMA_API_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "llama3.1:8b-instruct-q4_K_M"
OLLAMA_TIMEOUT = 120  # seconds

# ChromaDB Configuration
CHROMA_DB_PATH = "./memory_db"
COLLECTION_NAME = "conversation_history"

# CORS Settings
origins = [
    "http://localhost:5173",
    "http://localhost:3000",
]

# Voice Configuration
PIPER_MODEL_PATH = "models/en_US-hfc_male-medium.onnx"
USE_PIPER_TTS = True  # Set to False to use gTTS

# Personality Settings
DEFAULT_HUMOR_LEVEL = 85  # 0-100
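
For orientation, the origins list above feeds into FastAPI's standard CORS middleware; a minimal sketch of that wiring (assuming the usual CORSMiddleware setup rather than quoting main.py verbatim):

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the Svelte dev server (and any other configured origins) to call the API
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)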

Frontend Configuration

Update API endpoint in frontend/src/routes/+page.svelte:

const API_BASE_URL = 'http://localhost:8000';

Environment Variables

Create a .env file in the project root:

OLLAMA_API_URL=http://localhost:11434/api/generate
OLLAMA_MODEL=llama3.1:8b-instruct-q4_K_M
OLLAMA_TIMEOUT=120

CHROMA_DB_PATH=./memory_db
PIPER_MODEL_PATH=./models/en_US-hfc_male-medium.onnx

CORS_ORIGINS=http://localhost:5173,http://localhost:3000

DEFAULT_HUMOR_LEVEL=85
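
Since python-dotenv is already listed in requirements.txt, these values can be loaded at startup roughly as follows (a sketch; the variable names match the .env file above and the defaults are the ones used elsewhere in this README):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

OLLAMA_API_URL = os.getenv("OLLAMA_API_URL", "http://localhost:11434/api/generate")
OLLAMA_TIMEOUT = int(os.getenv("OLLAMA_TIMEOUT", "120"))
CHROMA_DB_PATH = os.getenv("CHROMA_DB_PATH", "./memory_db")
CORS_ORIGINS = os.getenv("CORS_ORIGINS", "http://localhost:5173").split(",")
DEFAULT_HUMOR_LEVEL = int(os.getenv("DEFAULT_HUMOR_LEVEL", "85"))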

Usage

Voice Interaction

Primary Input Method: Press and hold SPACEBAR to record audio, release to process.

Command Examples

General Queries

"Hey MURPH, what's the current time?"
"What's the weather like in San Francisco?"
"Tell me about yourself"

Browser Automation

"Open Google and search for artificial intelligence"
"Navigate to Wikipedia and look up quantum computing"
"Play 'Stairway to Heaven' on YouTube"
"Pause the video"
"Increase volume to 80%"
"Go to the next video"
"Close this tab"
"Scroll down the page"

System Operations

"Open Visual Studio Code"
"Switch to Chrome"
"List files in the current directory"
"Read the contents of readme.txt"
"Write 'Hello World' to test.txt"

Personality Adjustment

"Set humor level to 100"  # Maximum personality
"Set humor level to 50"   # Balanced mode
"Set humor level to 0"    # Professional mode

Keyboard Shortcuts

Shortcut           Action
SPACE (hold)       Record voice input
Ctrl + Shift + H   View conversation history
Esc                Cancel recording

API Reference

Endpoints

POST /voice-chat

Process voice input and return AI response.

Request: multipart/form-data

  • audio: Audio file (WAV, MP3, etc.)

Response: application/json

{
  "text": "Transcribed text",
  "response": "AI response text",
  "audio_url": "/audio/response.mp3",
  "timestamp": "2025-11-17T10:30:00Z"
}
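
A quick way to exercise the endpoint outside the web UI is a small Python client (sketch below; the requests package is assumed to be installed separately, as it is not in requirements.txt):

import requests  # pip install requests

with open("question.wav", "rb") as f:
    r = requests.post(
        "http://localhost:8000/voice-chat",
        files={"audio": ("question.wav", f, "audio/wav")},
    )
r.raise_for_status()
data = r.json()
print("You said:  ", data["text"])
print("MURPH says:", data["response"])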

GET /history

Retrieve conversation history.

Response: Array of conversation entries

POST /set-humor

Adjust personality humor level.

Request: application/json

{
  "level": 85
}
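
With the same assumed requests setup, the humor level and the stored history can be exercised programmatically:

import requests

# Dial the personality up to 85 and confirm the call succeeded
requests.post("http://localhost:8000/set-humor", json={"level": 85}).raise_for_status()

# Fetch the conversation history (returned as a JSON array)
history = requests.get("http://localhost:8000/history").json()
print(f"{len(history)} stored conversation entries")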

Troubleshooting

Voice Synthesis Issues

Symptom: Responses use the gTTS fallback (female voice) instead of the Piper male voice

Solution:

  1. Verify Piper model files exist in models/ directory
  2. Check backend logs for Piper initialization messages
  3. Ensure both .onnx and .onnx.json files are present
  4. Restart backend server after adding models

Ollama Connection Errors

Symptom: "Ollama is taking too long to respond" or timeout errors

Solution:

  1. Verify Ollama service is running:
    ollama list
    
  2. Pre-load the model to reduce first-request latency:
    ollama run llama3.1:8b-instruct-q4_K_M
    
  3. Check system resources (minimum 8GB RAM recommended)
  4. Increase timeout in main.py:
    OLLAMA_TIMEOUT = 180
    

ChromaDB Issues

Symptom: Database initialization errors or persistence failures

Solution:

  1. Ensure write permissions for memory_db/ directory:
    chmod -R 755 memory_db
    
  2. Delete and reinitialize database:
    rm -rf memory_db
    python -c "import chromadb; chromadb.PersistentClient(path='./memory_db')"
    
  3. Check available disk space (minimum 1GB required)
  4. Verify ChromaDB version compatibility:
    pip install --upgrade chromadb
    

Browser Automation Failures

Symptom: Selenium commands not executing

Solution:

  1. Update ChromeDriver:
    pip install --upgrade webdriver-manager
    
  2. Ensure Chrome browser is up to date
  3. Check Chrome is in system PATH
  4. Grant necessary permissions for browser automation
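
To confirm Selenium and ChromeDriver work independently of MURPH, the following smoke test (using webdriver-manager, which is already in requirements.txt) should open Chrome and print the page title:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# webdriver-manager downloads a ChromeDriver matching the installed Chrome
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get("https://www.google.com")
print(driver.title)
driver.quit()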

Audio Recording Issues

Symptom: "Failed to load audio" or microphone access denied

Solution:

  1. Grant microphone permissions in browser settings
  2. Verify FFmpeg installation:
    ffmpeg -version
    
  3. Check browser console for detailed error messages
  4. Test microphone with other applications
  5. Use HTTPS or localhost (required for microphone access)

Contributing

We welcome contributions from the community! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch:
    git checkout -b feature/YourFeatureName
    
  3. Commit your changes:
    git commit -m "Add: Brief description of changes"
    
  4. Push to your fork:
    git push origin feature/YourFeatureName
    
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guide for Python code
  • Use ESLint for JavaScript/Svelte code
  • Write unit tests for new features
  • Update documentation for API changes
  • Ensure all tests pass before submitting PR

License

This project is licensed under the MIT License. See the LICENSE file for complete details.

Acknowledgments

MURPH is built upon several outstanding open-source projects:

  • OpenAI Whisper - State-of-the-art speech recognition
  • Piper TTS - High-quality neural text-to-speech
  • Ollama - Efficient local LLM inference
  • ChromaDB - AI-native vector database
  • FastAPI - Modern Python web framework
  • Svelte - Reactive frontend framework
  • Selenium - Browser automation framework

Contact

Developer: @Gravity_Exists

Project Repository: https://github.com/Prajwal-Pujari/Murph-

Issues & Support: GitHub Issues

Roadmap

Upcoming Features

  • Multi-language support (Spanish, French, German, Japanese)
  • Custom wake word detection
  • Plugin architecture for third-party extensions
  • Native mobile applications (iOS/Android)
  • Voice cloning and custom voice profiles
  • Calendar integration (Google Calendar, Outlook)
  • Music streaming service integration (Spotify, Apple Music)
  • Smart home device control (Home Assistant, HomeKit)
  • Email management and notifications
  • Document analysis and summarization
  • Code generation and debugging assistance

Long-term Vision

  • Distributed deployment architecture
  • Multi-user support with individual profiles
  • Enterprise security features
  • Cloud synchronization
  • Advanced sentiment analysis
  • Proactive assistance based on user patterns

Version: 1.0.0
Last Updated: November 2025
Status: Active Development
