Control your entire computer with natural language, voice, and hand gestures.
An open-source, privacy-first AI agent that plans, executes, and verifies complex multi-step tasks.
Visit the Official Website (helioxos.dev)
Quick Start • JARVIS Mode • Features • Architecture • Security • Contributing
Unlike simple command runners, Heliox OS is a true agentic system inspired by robust autonomous architectures like OpenClaw, running a continuous ReAct loop with a modular multi-agent orchestrator.
Heliox OS now includes a futuristic, Iron Man-style interface.
Heliox OS uses a modular multi-agent architecture where specialized agents collaborate to solve complex tasks:
| Agent | Domain | Key Skills |
|---|---|---|
| System Agent | OS Operations | Files, processes, services, power, input control, screen vision, triggers |
| Code Agent | Development | Code generation, execution, debugging, dev tooling (git, pip, npm) |
| Web Agent | Web & APIs | Browser automation, scraping, HTTP requests, downloads |
| Monitor Agent | Monitoring | CPU, RAM, disk, network monitoring with threshold alerts |
| Communication Agent | Messaging | Email, Slack, Discord, webhooks, desktop notifications |
How it works: The Planner generates an action plan → the Orchestrator analyzes each action type → routes to the correct specialist → agents execute in sequence → results merge for verification.
Dynamic Spawning: Agents can be created on-demand at runtime via the agent_spawn API endpoint.
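
The routing step described above can be sketched roughly as follows. This is only an illustration, not the project's orchestrator: the `ROUTES` table, the prefix matching, and the `route_action` and `route_plan` helpers are hypothetical. Only the tool and agent names come from this README.

```python
from collections import defaultdict

# Hypothetical prefix-to-agent table. The tool and agent names appear in this
# README, but the real orchestrator may classify actions differently.
ROUTES = {
    "file_": "System Agent",
    "process_": "System Agent",
    "code_": "Code Agent",
    "browser_": "Web Agent",
    "cpu_": "Monitor Agent",
    "notify": "Communication Agent",
}


def route_action(action_type: str, default: str = "System Agent") -> str:
    """Return the specialist agent for an action type (longest prefix wins)."""
    matches = [prefix for prefix in ROUTES if action_type.startswith(prefix)]
    return ROUTES[max(matches, key=len)] if matches else default


def route_plan(plan: list[str]) -> dict[str, list[str]]:
    """Group a plan's actions by agent, keeping plan order within each agent."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for action in plan:
        grouped[route_action(action)].append(action)
    return dict(grouped)


if __name__ == "__main__":
    print(route_plan(["browser_navigate", "code_execute", "file_write", "notify"]))
```

Falling back to a default agent for unmatched action types is a design choice made for this sketch. A stricter router could reject unknown actions instead.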
| Task | Type | Status |
|---|---|---|
| Web scrape Wikipedia + word frequency analysis | Web + Code | ✅ |
| Background CPU trigger with voice alert | System Monitor | ✅ |
| Screenshot OCR + text reversal + file tree | Vision + Code | ✅ |
| Multi-page web comparison (Python vs JS) | Web + Analysis | ✅ |
| Create project scaffold + run unit tests | File + Code | ✅ |
| REST API fetch + JSON parse + formatted table | API + Code | ✅ |
| CSV data pipeline + financial analysis | Data + Code | ✅ |
| And more... | | ✅ |
| Platform | Status |
|---|---|
| Windows 10/11 | ✅ Full support |
| Ubuntu / Debian | ✅ Full support |
| macOS | ✅ Full support |
| Fedora / Arch | ✅ Via dnf/pacman |
file_read · file_write · file_delete · file_move · file_copy · file_list · file_search · file_permissions
process_list · process_kill · process_info
shell_command · shell_script (multi-line bash/powershell/python)
code_execute: Run Python, PowerShell, Bash, or JavaScript with auto-fix on failure
browser_navigate · browser_extract · browser_extract_table · browser_extract_links
screenshot · screen_ocr · screen_analyze
package_install · package_remove · package_update · package_search
Auto-detects: winget, choco, brew, apt, dnf, pacman
system_info · cpu_usage · memory_usage · disk_usage · network_info · battery_info
window_list · window_focus · window_close · window_minimize · window_maximize
volume_get · volume_set · volume_mute
brightness_get · brightness_set · screenshot
power_shutdown · power_restart · power_sleep · power_lock · power_logout
wifi_list · wifi_connect · wifi_disconnect
clipboard_read · clipboard_write
schedule_create · schedule_list · schedule_delete · trigger_create
env_get · env_set · env_list
download_file
service_start · service_stop · service_restart · service_enable · service_disable · service_status
gnome_setting_read · gnome_setting_write · dbus_call
registry_read · registry_write
open_url · open_application · notify
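
The package-manager auto-detection mentioned in the tool list above could work along these lines. This is a sketch under assumptions: the probing order simply mirrors the order listed in this README, and `detect_package_manager` is a hypothetical helper, not a function from the Heliox OS codebase.

```python
import shutil
from typing import Callable, Optional

# Candidates in the order this README lists them. The project's actual
# priority order is not documented here, so this ordering is an assumption.
CANDIDATES = ["winget", "choco", "brew", "apt", "dnf", "pacman"]


def detect_package_manager(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in CANDIDATES:
        if which(name):
            return name
    return None


if __name__ == "__main__":
    print(detect_package_manager())
```

Passing `which` as a parameter keeps the helper easy to test without touching the real PATH.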
graph TD
User(["User Input: Voice, Text, Gestures"]) --> Gateway
subgraph "Frontend Gateway - Tauri + Svelte"
Gateway["WebSocket Hub"]
GUI["Desktop Window"]
HUD["Ambient System HUD"]
VC["Voice Controller"]
GC["Hand Gesture Controller"]
RPV["ReAct Pipeline Visualizer"]
TVS["Thought Visualization"]
Gateway --- GUI
GUI --- HUD
GUI --- VC
GUI --- GC
GUI --- RPV
RPV --- TVS
end
Gateway --> Fusion
subgraph "Multimodal Fusion"
Fusion["Intent Fusion Engine"]
Fusion --> |"voice + gesture"| FusedIntent["Fused Intent"]
end
FusedIntent --> Daemon
subgraph "Agent Runtime - Python"
Daemon["Agent Server / ReAct Loop"] --> Memory[("Long-term Memory")]
Memory --> |"Vector + Semantic"| ChromaDB[("ChromaDB")]
Daemon --> Decomposer["Task Decomposer"]
Decomposer --> Planner["LLM Planner"]
Planner --> PromptImprover["Prompt Improver"]
PromptImprover --> |"reuse strategies"| Planner
Planner --> Router{"Model Router"}
Router --> Ext_LLM("Gemini / OpenAI / Claude")
Router --> Int_LLM("Ollama")
Planner --> Sandbox["Simulation Sandbox"]
Sandbox --> |"risk report"| Security["Security Gate"]
Security --> Orchestrator["Agent Orchestrator"]
end
subgraph "Multi-Agent System"
Orchestrator --> SA["System Agent"]
Orchestrator --> CA["Code Agent"]
Orchestrator --> WA["Web Agent"]
Orchestrator --> MA["Monitor Agent"]
Orchestrator --> COMM["Communication Agent"]
end
subgraph "Plugin Ecosystem"
PluginReg["Plugin Registry"]
PluginReg -->|"tools"| Orchestrator
P1["developer-tools"]
P2["media-control"]
P3["home-assistant"]
P1 --- PluginReg
P2 --- PluginReg
P3 --- PluginReg
end
SA --> Executor["System Executor"]
CA --> Executor
WA --> Executor
COMM --> Executor
MA --> BG["Background Tasks"]
Executor --> Verifier["Verifier"]
BG --> Verifier
Verifier --> Reflector["Reflector"]
Reflector --> PromptImprover
Reflector --> Memory
Reflector --> SkillReg["Skill Registry"]
subgraph "Subconscious Layer"
SubAgent["Subconscious Agent"]
SubAgent -->|"persona rules"| Planner
Reflector --> SubAgent
SubAgent --> PersonaFile["~/.heliox/persona.md"]
end
subgraph "Screen Awareness"
ScreenVision["Screen Vision Agent"]
ScreenVision -->|"context"| Planner
ScreenVision --> AppDetect["Active App Detector"]
ScreenVision --> DiffEngine["Screenshot Diff"]
end
subgraph "Reasoning Telemetry"
Daemon -.-> |"events"| ReasoningEmitter["Reasoning Emitter"]
ReasoningEmitter -.-> |"WebSocket"| TVS
end
Heliox OS implements 13 research-level features that push beyond typical AI agents:
| # | Feature | Status | Module |
|---|---|---|---|
| 1 | Persistent Long-Term Memory (Vector + Semantic) | ✅ | memory/store.py + ChromaDB |
| 2 | Self-Reflection Loop | ✅ | agents/reflector.py |
| 3 | Tool Discovery / Skill Registry | ✅ | agents/reflector.py (skill_registry table) |
| 4 | Task Decomposition Engine | ✅ | agents/decomposer.py |
| 5 | Autonomous Background Agents | ✅ | agents/background.py + monitor_agent.py |
| 6 | Multi-Agent Collaboration | ✅ | agents/orchestrator.py (5 specialists) |
| 7 | Real-Time Reasoning Visualization | ✅ | reasoning/events.py + ReActPipeline.svelte |
| 8 | Simulation Sandbox | ✅ | agents/sandbox.py |
| 9 | Self-Improving Prompt System | ✅ | agents/prompt_improver.py |
| 10 | Plugin Ecosystem | ✅ | plugins/__init__.py |
| 11 | Flagship Plugins (Developer, Media, IoT) | ✅ | plugins/developer/, plugins/media/, plugins/homeassistant/ |
| 12 | Subconscious Agent (Persona Learning) | ✅ | agents/subconscious.py |
| 13 | Screen Vision (Continuous Screen Awareness) | ✅ | agents/screen_vision.py |
Complex goals are automatically broken into dependency-aware subtask trees:
User: "Build a Flask API for todo list"
├─ 1. [system] Create project folder
├─ 2. [system] Install Flask (depends: 1)
├─ 3. [code] Generate API code (depends: 1)
├─ 4. [code] Create requirements.txt (depends: 2)
└─ 5. [code] Run tests (depends: 3, 4)
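
One way to turn a dependency-aware subtask tree like the one above into an execution order is a topological sort. The sketch below uses Python's standard `graphlib` module on the dependencies from the Flask example. It illustrates the general technique and is not necessarily how agents/decomposer.py schedules work. The `execution_batches` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# Subtask id -> ids it depends on, copied from the Flask example above.
DEPENDENCIES = {1: set(), 2: {1}, 3: {1}, 4: {2}, 5: {3, 4}}


def execution_batches(deps: dict[int, set[int]]) -> list[list[int]]:
    """Return batches of subtasks; every batch only needs earlier batches."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so dependents unlock
    return batches


if __name__ == "__main__":
    print(execution_batches(DEPENDENCIES))  # [[1], [2, 3], [4], [5]]
```

Subtasks within one batch have no dependencies on each other, so in principle they could run in parallel.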
Before executing dangerous commands, the sandbox produces an impact report:
⚠️ Simulation Report:
Risk: HIGH
Impact: 154 files affected (wildcard)
Warnings:
- ⚠️ Plan contains destructive actions
- Plan requires elevated privileges
- ♻️ 2 action(s) are NOT reversible
Recommendation: ⚠️ HIGH RISK: Confirm impact
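
To make the shape of such a report concrete, here is a minimal sketch of how the warnings could be derived from a plan. Everything in it is an assumption made for illustration: the `Action` fields, the `simulate` function, and the rule that two or more warnings mean HIGH risk are hypothetical and are not taken from agents/sandbox.py.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical description of one planned action."""
    name: str
    destructive: bool = False
    needs_root: bool = False
    reversible: bool = True


def simulate(plan: list[Action]) -> dict:
    """Build a dry-run report without executing anything."""
    warnings = []
    if any(a.destructive for a in plan):
        warnings.append("Plan contains destructive actions")
    if any(a.needs_root for a in plan):
        warnings.append("Plan requires elevated privileges")
    irreversible = sum(not a.reversible for a in plan)
    if irreversible:
        warnings.append(f"{irreversible} action(s) are NOT reversible")
    # Illustrative threshold only: two or more warnings count as HIGH.
    risk = "HIGH" if len(warnings) >= 2 else "MEDIUM" if warnings else "LOW"
    return {"risk": risk, "warnings": warnings}


if __name__ == "__main__":
    print(simulate([Action("file_delete", destructive=True, reversible=False)]))
```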
Successful reasoning chains are stored and reused.
Heliox OS ships with 3 flagship plugins and supports community-built extensions:
| Plugin | Type | Capabilities |
|---|---|---|
| developer-tools | Code | Jira tickets, git clone, branch, commit, push, GitHub PRs |
| media-control | System | Spotify (play/pause/skip), system volume, YouTube, media keys |
| home-assistant | IoT | Smart lights, switches, thermostats, scenes, device discovery |
Drop custom plugins into ~/.heliox/plugins/; they're auto-discovered at startup.
{
"name": "my-plugin",
"version": "1.0.0",
"tools": [{"name": "my_tool", "inputs": ["arg1"], "action_type": "api_call"}]
}
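
Auto-discovery of manifests like the one above might look roughly like this. It is a sketch under stated assumptions: the manifest filename `plugin.json`, the one-folder-per-plugin layout, and the `load_manifest` and `discover` helpers are hypothetical. Only the manifest keys (`name`, `version`, `tools`) come from the example.

```python
import json
from pathlib import Path

# Keys present in the README's example manifest.
REQUIRED_KEYS = {"name", "version", "tools"}


def load_manifest(text: str) -> dict:
    """Parse one manifest and check the keys used in the README example."""
    manifest = json.loads(text)
    missing = REQUIRED_KEYS - manifest.keys()
    if missing:
        raise ValueError(f"manifest missing keys: {sorted(missing)}")
    for tool in manifest["tools"]:
        if "name" not in tool:
            raise ValueError("every tool entry needs a name")
    return manifest


def discover(plugin_dir: Path) -> list[dict]:
    """Load <plugin_dir>/<plugin>/plugin.json for each plugin folder.

    The filename 'plugin.json' is an assumption, not a documented convention.
    """
    return [
        load_manifest(path.read_text(encoding="utf-8"))
        for path in sorted(plugin_dir.glob("*/plugin.json"))
    ]


if __name__ == "__main__":
    plugins = discover(Path("~/.heliox/plugins").expanduser())
    print([p["name"] for p in plugins])
```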
A background agent that runs every 30 minutes to review the day's actions and learn user preferences:
- Writes learned rules to ~/.heliox/persona.md, which is injected into the planner context
- Exposes the persona_add_preference API
- Entry types: preference, habit, constraint, style

Continuous computer-vision loop that gives the agent awareness of what the user sees.
The easiest way to get started is to download the pre-compiled installer for your operating system.
- Windows: Heliox OS_x64-setup.exe
- macOS (Apple Silicon): Heliox OS_aarch64.dmg
- macOS (Intel): Heliox OS_x86_64.dmg
- Linux: .AppImage or .deb

Note: The Python backend requires Python 3.11+ installed on your system. You must start the local daemon manually for now.
If you want to contribute or modify Heliox OS, build it from the source code:
1. Install the Python daemon:
git clone https://github.com/VyomKulshrestha/Heliox-OS.git
cd Heliox-OS/daemon
pip install -e ".[full,dev]"
2. Choose your LLM:
ollama pull llama3.1:8b
ollama serve
3. Run the daemon:
cd daemon
python -m pilot.server
4. Run the frontend:
cd tauri-app/ui
npm install
npm run dev
"Show me my system info"
"Take a screenshot and read the text on screen"
"Go to Wikipedia's page on AI and summarize the first 3 paragraphs"
"Create a Python project with tests and run them"
"Kill the process using the most CPU"
"Monitor my CPU and alert me when it goes above 80%"
"Download a file and show me a tree of the folder"
"List all .py files on my Desktop"
"Set my volume to 50%"
"Create a CSV with sales data and analyze it"
"What's my IP address?"
"Install Firefox"
> [!WARNING]
> PLEASE READ BEFORE USE: SYSTEM COMPROMISE RISK
>
> Heliox OS is an autonomous agent with the ability to execute code, delete files, and run terminal commands directly on your host operating system. While we have provided sandbox measures, the AI has real system access. Do NOT run Heliox OS with root/Administrator privileges unless absolutely necessary. We are not responsible for accidental data loss caused by LLM hallucinations.
| Tier | Level | Auto-Execute | Examples |
|---|---|---|---|
| 0 - Read Only | 🟢 | Yes | file_read, system_info, clipboard_read |
| 1 - User Write | 🟡 | Yes | file_write, clipboard_write, env_set |
| 2 - System Modify | 🟠 | Needs Confirm | package_install, service_restart, wifi_connect |
| 3 - Destructive | 🔴 | Needs Confirm | file_delete, process_kill, power_shutdown |
| 4 - Root Critical | ⛔ | Needs Confirm | root operations, disk operations |
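
The tier table can be read as a small decision function. The sketch below is illustrative only. The tier assignments are copied from the table's examples, but the `decide` helper is hypothetical, and so are three rules in it: unknown tools are treated as tier 4, tier 4 is blocked outright when `root_enabled` is false, and `confirm_tier2 = false` lets tier 2 run automatically.

```python
# Tier assignments taken from the examples in the table above.
TOOL_TIERS = {
    "file_read": 0, "system_info": 0, "clipboard_read": 0,
    "file_write": 1, "clipboard_write": 1, "env_set": 1,
    "package_install": 2, "service_restart": 2, "wifi_connect": 2,
    "file_delete": 3, "process_kill": 3, "power_shutdown": 3,
}


def decide(tool: str, confirm_tier2: bool = True, root_enabled: bool = False) -> str:
    """Return 'auto', 'confirm', or 'blocked' for a tool call."""
    tier = TOOL_TIERS.get(tool, 4)  # assumption: unknown tools get the strictest tier
    if tier <= 1:
        return "auto"
    if tier == 2:
        # Assumption: the confirm_tier2 config flag toggles this prompt.
        return "confirm" if confirm_tier2 else "auto"
    if tier == 3:
        return "confirm"
    # Assumption: tier 4 is refused entirely unless root is enabled.
    return "confirm" if root_enabled else "blocked"


if __name__ == "__main__":
    for name in ("file_read", "package_install", "file_delete"):
        print(name, "->", decide(name))
```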
Config file: ~/.config/pilot/config.toml
[model]
provider = "ollama" # "ollama" | "cloud"
ollama_model = "llama3.1:8b"
cloud_provider = "gemini" # "gemini" | "openai" | "claude"
[security]
root_enabled = false
confirm_tier2 = true
unrestricted_shell = false
snapshot_on_destructive = true
[server]
host = "127.0.0.1"
port = 8785
We love contributions! Whether it's adding a new gesture, fixing a bug, or building a new plugin, check out our guides to get started.
MIT