Heliox OS — AI System Control Agent

Control your entire computer with natural language, voice, and hand gestures.
An open-source, privacy-first AI agent that plans, executes, and verifies complex multi-step tasks.

🌐 Visit the Official Website (helioxos.dev) 🌐

Quick Start • JARVIS Mode • Features • Architecture • Security • Contributing


Why Heliox OS?

Unlike simple command runners, Heliox OS is a true agentic system inspired by robust autonomous architectures like OpenClaw, running a continuous ReAct loop with a modular multi-agent orchestrator:

  1. Gateway Hub & Memory — LLM evaluates persistent memory context before reasoning.
  2. Planner — Converts natural language into a structured multi-step action plan.
  3. Agent Orchestrator — Routes each action to the correct specialist agent (System, Code, Web, Monitor, Communication).
  4. Specialist Agents — Five domain experts execute actions via native OS APIs (never flimsy GUI automation).
  5. Verifier — Post-execution verification confirms the action succeeded and feeds results back into the loop.
  6. Reflector — Self-improvement engine learns from successes and failures.
  7. Security — Five-tier permission system with confirmation gates and rollback support.
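The plan, execute, verify cycle listed above can be pictured as a small retry loop. The following is only a sketch under assumptions: `react_loop` and its callback parameters (`plan`, `execute`, `verify`) are hypothetical names chosen for illustration and do not come from the Heliox OS codebase.

```python
from typing import Any, Callable

def react_loop(
    goal: str,
    plan: Callable[[str, list], list],
    execute: Callable[[Any], Any],
    verify: Callable[[Any], bool],
    max_rounds: int = 3,
) -> dict:
    """Plan, execute, and verify; re-plan with failure feedback until all
    results verify or the round budget is exhausted."""
    feedback: list = []
    results: list = []
    for round_no in range(1, max_rounds + 1):
        actions = plan(goal, feedback)            # planner sees earlier failures
        results = [execute(a) for a in actions]   # run every planned action
        failures = [r for r in results if not verify(r)]
        if not failures:
            return {"ok": True, "rounds": round_no, "results": results}
        feedback = failures                       # feed failures into the next plan
    return {"ok": False, "rounds": max_rounds, "results": results}
```

Passing the failed results back to the planner is one simple way to model the "feeds results back into the loop" step; a real Reflector would presumably also persist what it learned.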

🤖 JARVIS Mode (New!)

Heliox OS now includes a futuristic, Iron Man-style interface:

  • 🎤 Voice Control: Push-to-talk or use the always-on "Hey Heliox" wake word.
  • 🗣️ Text-to-Speech: Heliox OS speaks its responses aloud to you.
  • 🤚 30+ Hand Gestures: Control your PC via webcam with static poses (Palm, Thumbs Up/Down, Peace, Fist, OK, Pinch, Vulcan, Devil Horns, Snap, etc.) and motion gestures (Swipe, Circular Volume, Palm Push/Pull, Two-Finger Workspace Switch).
  • 🖊️ Air Drawing: Point your index finger to draw glowing trails in the air.
  • 🌀 Arc Reactor UI: Animated spinning reactor logo, neural background, and particle explosion effects on task completion.
  • 📊 Ambient HUD: Holographic system monitor overlay for CPU, RAM, and Disk metrics.
  • 🔬 ReAct Pipeline Visualizer: Real-time animated graph showing the AI's reasoning process. Each stage (Memory → Planning → Routing → Execution → Verification → Reflection) lights up as it runs.

🧠 Multi-Agent Orchestrator

Heliox OS uses a modular multi-agent architecture where specialized agents collaborate to solve complex tasks:

| Agent | Domain | Key Skills |
| --- | --- | --- |
| 🖥️ System Agent | OS Operations | Files, processes, services, power, input control, screen vision, triggers |
| 💻 Code Agent | Development | Code generation, execution, debugging, dev tooling (git, pip, npm) |
| 🌐 Web Agent | Web & APIs | Browser automation, scraping, HTTP requests, downloads |
| 📊 Monitor Agent | Monitoring | CPU, RAM, disk, network monitoring with threshold alerts |
| 📡 Communication Agent | Messaging | Email, Slack, Discord, webhooks, desktop notifications |

How it works: The Planner generates an action plan → the Orchestrator analyzes each action type → routes to the correct specialist → agents execute in sequence → results merge for verification.

Dynamic Spawning: Agents can be created on-demand at runtime via the agent_spawn API endpoint.
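As a rough illustration of the routing step, the sketch below maps an action type to a specialist by its prefix. The `ROUTES` table, the `route` and `group_plan` helpers, and the specific prefix-to-agent assignments are assumptions inferred from the agent descriptions above, not the project's actual orchestrator logic.

```python
# Hypothetical prefix -> agent mapping; the real orchestrator may route differently.
ROUTES = {
    "file": "system", "process": "system", "power": "system", "window": "system",
    "code": "code",
    "browser": "web", "download": "web",
    "cpu": "monitor", "memory": "monitor", "disk": "monitor",
    "notify": "communication",
}

def route(action_type: str, default: str = "system") -> str:
    """Return the agent name for an action type such as 'file_read'."""
    prefix = action_type.split("_", 1)[0]
    return ROUTES.get(prefix, default)

def group_plan(plan: list[str]) -> list[tuple[str, str]]:
    """Pair each action with its agent, preserving the plan's order."""
    return [(route(action), action) for action in plan]

if __name__ == "__main__":
    for agent, action in group_plan(
        ["browser_navigate", "code_execute", "file_write", "notify"]
    ):
        print(f"{agent:>13} <- {action}")
```

Falling back to the System Agent for unknown prefixes is a design choice made for this sketch only.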

🧪 Tested With 10 Complex Tasks — 80%+ Pass Rate

| Task | Type | Status |
| --- | --- | --- |
| Web scrape Wikipedia + word frequency analysis | Web + Code | ✅ |
| Background CPU trigger with voice alert | System Monitor | ✅ |
| Screenshot OCR + text reversal + file tree | Vision + Code | ✅ |
| Multi-page web comparison (Python vs JS) | Web + Analysis | ✅ |
| Create project scaffold + run unit tests | File + Code | ✅ |
| REST API fetch + JSON parse + formatted table | API + Code | ✅ |
| CSV data pipeline + financial analysis | Data + Code | ✅ |
| And more... | | ✅ |

🖥️ Cross-Platform Support

| Platform | Status |
| --- | --- |
| Windows 10/11 | ✅ Full support |
| Ubuntu / Debian | ✅ Full support |
| macOS | ✅ Full support |
| Fedora / Arch | ✅ Via dnf/pacman |

⚡ 50+ Action Types

File Operations

file_read · file_write · file_delete · file_move · file_copy · file_list · file_search · file_permissions

Process Management

process_list · process_kill · process_info

Shell Execution

shell_command · shell_script (multi-line bash/powershell/python)

Code Execution

code_execute: Run Python, PowerShell, Bash, or JavaScript with auto-fix on failure

Browser & Web

browser_navigate · browser_extract · browser_extract_table · browser_extract_links

Screen & Vision

screenshot · screen_ocr · screen_analyze

Package Management

package_install · package_remove · package_update · package_search

Auto-detects: winget, choco, brew, apt, dnf, pacman

System Information

system_info · cpu_usage · memory_usage · disk_usage · network_info · battery_info

Window Management

window_list · window_focus · window_close · window_minimize · window_maximize

Audio / Volume

volume_get · volume_set · volume_mute

Display / Screen

brightness_get · brightness_set · screenshot

Power Management

power_shutdown · power_restart · power_sleep · power_lock · power_logout

Network / WiFi

wifi_list · wifi_connect · wifi_disconnect

Clipboard

clipboard_read · clipboard_write

Scheduled Tasks & Triggers

schedule_create · schedule_list · schedule_delete · trigger_create

Environment Variables

env_get · env_set · env_list

Downloads

download_file

Service Management (Linux)

service_start · service_stop · service_restart · service_enable · service_disable · service_status

GNOME / Desktop (Linux)

gnome_setting_read · gnome_setting_write · dbus_call

Windows Registry

registry_read · registry_write

Open / Launch / Notify

open_url · open_application · notify

Architecture

graph TD
    User(["User Input: Voice, Text, Gestures"]) --> Gateway

    subgraph "Frontend Gateway - Tauri + Svelte"
        Gateway["WebSocket Hub"]
        GUI["Desktop Window"]
        HUD["Ambient System HUD"]
        VC["Voice Controller"]
        GC["Hand Gesture Controller"]
        RPV["ReAct Pipeline Visualizer"]
        TVS["Thought Visualization 🧠"]
        Gateway --- GUI
        GUI --- HUD
        GUI --- VC
        GUI --- GC
        GUI --- RPV
        RPV --- TVS
    end

    Gateway --> Fusion

    subgraph "Multimodal Fusion"
        Fusion["Intent Fusion Engine"]
        Fusion --> |"voice + gesture"| FusedIntent["Fused Intent"]
    end

    FusedIntent --> Daemon

    subgraph "Agent Runtime - Python"
        Daemon["Agent Server / ReAct Loop"] --> Memory[("Long-term Memory")]
        Memory --> |"Vector + Semantic"| ChromaDB[("ChromaDB")]
        Daemon --> Decomposer["Task Decomposer"]
        Decomposer --> Planner["LLM Planner"]
        Planner --> PromptImprover["Prompt Improver"]
        PromptImprover --> |"reuse strategies"| Planner
        Planner --> Router{"Model Router"}
        Router --> Ext_LLM("Gemini / OpenAI / Claude")
        Router --> Int_LLM("Ollama")
        Planner --> Sandbox["Simulation Sandbox"]
        Sandbox --> |"risk report"| Security["Security Gate"]
        Security --> Orchestrator["Agent Orchestrator"]
    end

    subgraph "Multi-Agent System"
        Orchestrator --> SA["System Agent"]
        Orchestrator --> CA["Code Agent"]
        Orchestrator --> WA["Web Agent"]
        Orchestrator --> MA["Monitor Agent"]
        Orchestrator --> COMM["Communication Agent"]
    end

    subgraph "Plugin Ecosystem"
        PluginReg["Plugin Registry"]
        PluginReg -->|"tools"| Orchestrator
        P1["developer-tools"]
        P2["media-control"]
        P3["home-assistant"]
        P1 --- PluginReg
        P2 --- PluginReg
        P3 --- PluginReg
    end

    SA --> Executor["System Executor"]
    CA --> Executor
    WA --> Executor
    COMM --> Executor
    MA --> BG["Background Tasks"]

    Executor --> Verifier["Verifier"]
    BG --> Verifier
    Verifier --> Reflector["Reflector"]
    Reflector --> PromptImprover
    Reflector --> Memory
    Reflector --> SkillReg["Skill Registry"]

    subgraph "Subconscious Layer"
        SubAgent["Subconscious Agent"]
        SubAgent -->|"persona rules"| Planner
        Reflector --> SubAgent
        SubAgent --> PersonaFile["~/.heliox/persona.md"]
    end

    subgraph "Screen Awareness"
        ScreenVision["Screen Vision Agent"]
        ScreenVision -->|"context"| Planner
        ScreenVision --> AppDetect["Active App Detector"]
        ScreenVision --> DiffEngine["Screenshot Diff"]
    end


    subgraph "Reasoning Telemetry"
        Daemon -.-> |"events"| ReasoningEmitter["Reasoning Emitter"]
        ReasoningEmitter -.-> |"WebSocket"| TVS
    end

🧠 Research-Level AI Architecture

Heliox OS implements 13 research-level features that push beyond typical AI agents:

| # | Feature | Status | Module |
| --- | --- | --- | --- |
| 1 | Persistent Long-Term Memory (Vector + Semantic) | ✅ | `memory/store.py` + ChromaDB |
| 2 | Self-Reflection Loop | ✅ | `agents/reflector.py` |
| 3 | Tool Discovery / Skill Registry | ✅ | `agents/reflector.py` (skill_registry table) |
| 4 | Task Decomposition Engine | ✅ | `agents/decomposer.py` |
| 5 | Autonomous Background Agents | ✅ | `agents/background.py` + `monitor_agent.py` |
| 6 | Multi-Agent Collaboration | ✅ | `agents/orchestrator.py` (5 specialists) |
| 7 | Real-Time Reasoning Visualization | ✅ | `reasoning/events.py` + `ReActPipeline.svelte` |
| 8 | Simulation Sandbox | ✅ | `agents/sandbox.py` |
| 9 | Self-Improving Prompt System | ✅ | `agents/prompt_improver.py` |
| 10 | Plugin Ecosystem | ✅ | `plugins/__init__.py` |
| 11 | Flagship Plugins (Developer, Media, IoT) | ✅ | `plugins/developer/`, `plugins/media/`, `plugins/homeassistant/` |
| 12 | Subconscious Agent (Persona Learning) | ✅ | `agents/subconscious.py` |
| 13 | Screen Vision (Continuous Screen Awareness) | ✅ | `agents/screen_vision.py` |

🔧 Task Decomposition Engine

Complex goals are automatically broken into dependency-aware subtask trees:

User: "Build a Flask API for todo list"
 → 1. [system] Create project folder
 → 2. [system] Install Flask            (depends: 1)
 → 3. [code]   Generate API code         (depends: 1)
 → 4. [code]   Create requirements.txt   (depends: 2)
 → 5. [code]   Run tests                 (depends: 3, 4)
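One way to turn such a dependency list into an execution order is a topological sort. The sketch below uses Python's standard `graphlib` module on the dependencies from the Flask example; the `execution_waves` helper is a hypothetical name, and the actual decomposer may schedule subtasks differently.

```python
from graphlib import TopologicalSorter

# Subtask id -> ids it depends on, copied from the Flask example above.
DEPENDENCIES = {1: set(), 2: {1}, 3: {1}, 4: {2}, 5: {3, 4}}

def execution_waves(deps: dict[int, set[int]]) -> list[list[int]]:
    """Group subtasks into waves; every task in a wave has all of its
    dependencies satisfied by earlier waves, so a wave could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()                      # raises CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

if __name__ == "__main__":
    print(execution_waves(DEPENDENCIES))  # [[1], [2, 3], [4], [5]]
```

Here subtasks 2 and 3 land in the same wave because both depend only on subtask 1.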

🛡️ Simulation Sandbox

Before executing dangerous commands, the sandbox produces an impact report:

⚠️ Simulation Report:
  Risk: HIGH
  Impact: 154 files affected (wildcard)
  Warnings:
    - ⚠️ Plan contains destructive actions
    - 🔐 Plan requires elevated privileges
    - ♻️ 2 action(s) are NOT reversible
  Recommendation: ⚠️ HIGH RISK — Confirm impact
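A report like this could be assembled by scanning the plan before anything runs. The following is a minimal sketch, assuming a plan is a list of dicts with an `action_type` key and an optional `root` flag; the `DESTRUCTIVE` and `IRREVERSIBLE` sets and the `simulate` function are illustrative assumptions, not the sandbox's real rules.

```python
from dataclasses import dataclass, field

# Assumed classification sets for illustration only.
DESTRUCTIVE = {"file_delete", "process_kill", "power_shutdown"}
IRREVERSIBLE = {"file_delete", "power_shutdown"}

@dataclass
class Report:
    risk: str = "LOW"
    warnings: list[str] = field(default_factory=list)

def simulate(plan: list[dict]) -> Report:
    """Inspect a plan without executing it and summarize the risk."""
    report = Report()
    types = [action["action_type"] for action in plan]
    if any(t in DESTRUCTIVE for t in types):
        report.risk = "HIGH"
        report.warnings.append("Plan contains destructive actions")
    if any(action.get("root") for action in plan):
        report.risk = "HIGH"
        report.warnings.append("Plan requires elevated privileges")
    irreversible = sum(t in IRREVERSIBLE for t in types)
    if irreversible:
        report.warnings.append(f"{irreversible} action(s) are NOT reversible")
    return report
```

Counting affected files for wildcard paths, as in the sample report, would additionally require expanding the paths against the file system, which this sketch leaves out.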

🧬 Self-Improving Prompt System

Successful reasoning chains are stored and reused:

  • Keyword-indexed prompt templates with success/failure rates
  • Automatic strategy matching for similar future tasks
  • Rolling improvement — the agent gets better over time
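To make the idea of keyword-indexed templates concrete, here is a small sketch that picks the stored strategy with the largest keyword overlap and breaks ties by success rate. The `Template` class, its fields, and `best_match` are hypothetical names; the real prompt improver's storage and scoring are not documented here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Template:
    keywords: frozenset[str]
    strategy: str
    successes: int = 0
    failures: int = 0

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

def best_match(task: str, templates: list[Template]) -> Optional[Template]:
    """Return the template sharing the most keywords with the task,
    preferring the higher success rate on ties; None if nothing overlaps."""
    words = set(task.lower().split())
    scored = [(len(t.keywords & words), t.success_rate, t) for t in templates]
    scored = [entry for entry in scored if entry[0] > 0]
    if not scored:
        return None
    return max(scored, key=lambda entry: (entry[0], entry[1]))[2]
```

After each run, incrementing `successes` or `failures` on the chosen template would give the rolling improvement described above.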

🔌 Plugin Ecosystem

Heliox OS ships with 3 flagship plugins and supports community-built extensions:

| Plugin | Type | Capabilities |
| --- | --- | --- |
| developer-tools | Code | Jira tickets, git clone, branch, commit, push, GitHub PRs |
| media-control | System | Spotify (play/pause/skip), system volume, YouTube, media keys |
| home-assistant | IoT | Smart lights, switches, thermostats, scenes, device discovery |

Drop custom plugins into ~/.heliox/plugins/ — they're auto-discovered at startup.

{
  "name": "my-plugin",
  "version": "1.0.0",
  "tools": [{"name": "my_tool", "inputs": ["arg1"], "action_type": "api_call"}]
}
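Auto-discovery of manifests like the one above might look roughly like the sketch below. It assumes each plugin lives in its own subdirectory with a manifest file named `plugin.json`; that file name, the `REQUIRED` key set, and the `discover_plugins` function are assumptions for illustration, since the actual layout is not specified here.

```python
import json
from pathlib import Path

# Keys every manifest must carry, taken from the example manifest above.
REQUIRED = {"name", "version", "tools"}

def discover_plugins(root: Path) -> dict[str, dict]:
    """Load every valid manifest found at <root>/<plugin>/plugin.json.
    The manifest file name is an assumption made for this sketch."""
    plugins: dict[str, dict] = {}
    for manifest_path in sorted(root.glob("*/plugin.json")):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed manifests
        if not isinstance(manifest, dict) or not REQUIRED.issubset(manifest):
            continue  # skip manifests missing required keys
        plugins[manifest["name"]] = manifest
    return plugins
```

Skipping broken manifests instead of raising keeps one faulty community plugin from blocking startup; calling `discover_plugins(Path.home() / ".heliox" / "plugins")` would scan the directory mentioned above.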

🧠 Subconscious Agent (Persona Learning)

A background agent that runs every 30 minutes to review the day's actions and learn user preferences:

  • Clusters behavioral patterns ("always writes Python", "prefers dark mode")
  • Extracts actionable rules with confidence scores
  • Writes a ~/.heliox/persona.md that is injected into planner context
  • Supports manual preference setting via persona_add_preference API
  • Categories: preference, habit, constraint, style

👁️ Screen Vision Agent

Continuous computer-vision loop that gives the agent awareness of what the user sees:

  • Takes a screenshot every 2 seconds and hashes it for change detection
  • Detects the active application and window title cross-platform
  • Maintains a rolling context buffer of recent screen states
  • When user says "summarize this" or "close that", the planner already knows the target
  • Optional LLM-powered screen description for advanced awareness
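The hash-based change detection and rolling buffer described above can be sketched as follows. `ScreenContext` and its `observe` method are hypothetical names, and the frame is passed in as raw bytes so the example stays independent of any particular screenshot library.

```python
import hashlib
from collections import deque

class ScreenContext:
    """Keep a rolling buffer of recent screen states, recording a new
    entry only when the frame content actually changed."""

    def __init__(self, maxlen: int = 5) -> None:
        self._last_hash = None
        self.history = deque(maxlen=maxlen)  # oldest entries drop off automatically

    def observe(self, frame: bytes, app: str, title: str) -> bool:
        """Return True if the frame differs from the previous one."""
        digest = hashlib.sha256(frame).hexdigest()
        if digest == self._last_hash:
            return False
        self._last_hash = digest
        self.history.append({"app": app, "title": title, "hash": digest})
        return True
```

An exact hash such as SHA-256 only detects byte-identical frames, so a blinking cursor counts as a change; a perceptual hash would be more tolerant, at the cost of an extra dependency.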

🚀 Installation

Option 1: Pre-built Installer

The easiest way to get started is to download the pre-compiled installer for your operating system.

  1. Go to the GitHub Releases page.
  2. Download the installer for your OS:
    • Windows: Heliox OS_x64-setup.exe
    • macOS (Apple Silicon): Heliox OS_aarch64.dmg
    • macOS (Intel): Heliox OS_x86_64.dmg
    • Linux: .AppImage or .deb
  3. Install the app.
  4. Open Heliox OS and enter your API Key (e.g., Gemini, OpenAI, Claude) in the Settings tab.

Note: The Python backend requires Python 3.11+ installed on your system. You must start the local daemon manually for now.

Option 2: Build from Source (For Developers)

If you want to contribute or modify Heliox OS, build it from the source code:

1. Install the Python daemon:

git clone https://github.com/VyomKulshrestha/Heliox-OS.git
cd Heliox-OS/daemon
pip install -e ".[full,dev]"

2. Choose your LLM:

  • Local (Ollama): run ollama pull llama3.1:8b, then ollama serve
  • Cloud (Gemini/OpenAI/Claude): Add your API key in the app GUI.

3. Run the daemon (from the repository root; skip the cd if you are still in daemon/ from step 1):

cd daemon
python -m pilot.server

4. Run the frontend (in a separate terminal, from the repository root):

cd tauri-app/ui
npm install
npm run dev

Example Commands

"Show me my system info"
"Take a screenshot and read the text on screen"
"Go to Wikipedia's page on AI and summarize the first 3 paragraphs"
"Create a Python project with tests and run them"
"Kill the process using the most CPU"
"Monitor my CPU and alert me when it goes above 80%"
"Download a file and show me a tree of the folder"
"List all .py files on my Desktop"
"Set my volume to 50%"
"Create a CSV with sales data and analyze it"
"What's my IP address?"
"Install Firefox"

🛡️ Security

> [!WARNING]
> PLEASE READ BEFORE USE: SYSTEM COMPROMISE RISK
> Heliox OS is an autonomous agent with the ability to execute code, delete files, and run terminal commands directly on your host operating system. While we have provided sandbox measures, the AI has real system access. Do NOT run Heliox OS with root/Administrator privileges unless absolutely necessary. We are not responsible for accidental data loss caused by LLM hallucinations.

  • All AI outputs pass through structured schema validation before execution
  • Five-tier permission system (read-only through root-level)
  • Confirmation required for system-modifying and destructive actions
  • Snapshot-based rollback via Btrfs or Timeshift (Linux)
  • Append-only audit log for all executed actions
  • Command whitelist with optional unrestricted mode
  • Encrypted API key storage via platform keyring (GNOME Keyring / Windows Credential Manager)
  • API keys are NEVER logged, included in plans, or sent to local LLMs

Permission Tiers

| Tier | Level | Auto-Execute | Examples |
| --- | --- | --- | --- |
| 0 - Read Only | 🟢 | Yes | file_read, system_info, clipboard_read |
| 1 - User Write | 🟡 | Yes | file_write, clipboard_write, env_set |
| 2 - System Modify | 🟠 | Needs Confirm | package_install, service_restart, wifi_connect |
| 3 - Destructive | 🔴 | Needs Confirm | file_delete, process_kill, power_shutdown |
| 4 - Root Critical | ⛔ | Needs Confirm | root operations, disk operations |
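The tier table translates naturally into an ordered enum plus a confirmation check. This is a sketch under assumptions: the `Tier` enum, the `ACTION_TIERS` mapping (filled only with the examples from the table), and `needs_confirmation` are hypothetical, and treating unknown actions as the most restrictive tier is a choice made for this example.

```python
from enum import IntEnum

class Tier(IntEnum):
    READ_ONLY = 0
    USER_WRITE = 1
    SYSTEM_MODIFY = 2
    DESTRUCTIVE = 3
    ROOT_CRITICAL = 4

# Only the example actions from the table above; not the full action list.
ACTION_TIERS = {
    "file_read": Tier.READ_ONLY, "system_info": Tier.READ_ONLY,
    "clipboard_read": Tier.READ_ONLY,
    "file_write": Tier.USER_WRITE, "clipboard_write": Tier.USER_WRITE,
    "env_set": Tier.USER_WRITE,
    "package_install": Tier.SYSTEM_MODIFY, "service_restart": Tier.SYSTEM_MODIFY,
    "wifi_connect": Tier.SYSTEM_MODIFY,
    "file_delete": Tier.DESTRUCTIVE, "process_kill": Tier.DESTRUCTIVE,
    "power_shutdown": Tier.DESTRUCTIVE,
}

def needs_confirmation(action_type: str,
                       confirm_from: Tier = Tier.SYSTEM_MODIFY) -> bool:
    """Tiers 0 and 1 auto-execute; tier 2 and above ask the user first.
    Unknown actions fall back to the most restrictive tier (an assumption)."""
    tier = ACTION_TIERS.get(action_type, Tier.ROOT_CRITICAL)
    return tier >= confirm_from
```

Defaulting unknown actions to the highest tier fails safe: a new action type cannot auto-execute until someone classifies it.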

Configuration

Config file: ~/.config/pilot/config.toml

[model]
provider = "ollama"           # "ollama" | "cloud"
ollama_model = "llama3.1:8b"
cloud_provider = "gemini"     # "gemini" | "openai" | "claude"

[security]
root_enabled = false
confirm_tier2 = true
unrestricted_shell = false
snapshot_on_destructive = true

[server]
host = "127.0.0.1"
port = 8785

🤝 Contributing

We love contributions! Whether it's adding a new gesture, fixing a bug, or building a new plugin, check out our guides to get started.

  1. Read our Contributing Guide to set up your dev environment.
  2. Check the Good First Issues tab on GitHub to find beginner-friendly tasks.
  3. Review our Code of Conduct.
  4. Join the community discussions in GitHub Discussions.

License

MIT
