A high-performance hardware compatibility calculator designed specifically for running local, open-source Large Language Models (LLMs).
With the explosion of local, open-source AI, knowing exactly whether a massive model like Llama 3 70B can physically fit into your GPU's VRAM is critical. The RunMyAIModel calculator takes the guesswork out of AI hardware constraints through rigorous mathematical profiling.
This calculator does not use arbitrary "T-shirt sizes." It actively computes hardware constraints using the following physical hardware logic:
The total VRAM required is never just the model weights. We calculate the additional KV-cache memory needed to run the model at a specific context length:
Total VRAM Needed = Model Weight (GB) + (Context Length in K * kv_per_1k_gb) + 1.0GB (Inference Overhead)
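The formula above can be sketched as a small helper (the function and parameter names are my own, and the 0.125 GB-per-1K-context figure in the example is an illustrative assumption, not a value from the calculator's database):

```typescript
// Total VRAM needed = model weights + KV cache for the context window + fixed overhead.
const INFERENCE_OVERHEAD_GB = 1.0;

// kvPer1kGb is the KV-cache cost in GB per 1K tokens of context (model-dependent).
function totalVramGb(modelWeightGb: number, contextLengthK: number, kvPer1kGb: number): number {
  return modelWeightGb + contextLengthK * kvPer1kGb + INFERENCE_OVERHEAD_GB;
}

// Example: a 39.6 GB quantized model at 8K context, assuming 0.125 GB per 1K tokens:
// 39.6 + 8 * 0.125 + 1.0 = 41.6 GB
```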
In LLM inference, memory bandwidth (not compute TFLOPS) is the primary bottleneck for generation speed:
Predicted Tok/s = Effective Bandwidth (GB/s) / (Model Weight (GB) + 1.0GB)
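A minimal sketch of the same ratio (names are assumed):

```typescript
// Generation speed is bandwidth-bound: every generated token requires streaming
// the full weight set (plus ~1 GB of overhead) through memory once.
function predictedTokS(effectiveBandwidthGbs: number, modelWeightGb: number): number {
  return effectiveBandwidthGbs / (modelWeightGb + 1.0);
}

// Example: an RTX 4090 (~1008 GB/s) running a 41 GB model: 1008 / 42 = 24 tok/s.
```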
If a model exceeds your VRAM, it spills (offloads) into system RAM, throttling throughput over the slow PCIe bus. We penalize speed predictions accordingly:
Offload Ratio = (Spillover Weight in RAM) / (Total Model Weight)
Base Speed Penalty = 15% + (Offload Ratio * 70%)
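The two penalty formulas above combine into a short sketch (names are assumed):

```typescript
// Fraction of the model weights that spill over into system RAM.
function offloadRatio(spilloverGb: number, totalWeightGb: number): number {
  return spilloverGb / totalWeightGb;
}

// Base speed penalty: a flat 15% hit for any offload, scaling up to 85%
// when the entire model lives in system RAM (15% + 100% * 70%).
function baseSpeedPenalty(spilloverGb: number, totalWeightGb: number): number {
  return 0.15 + offloadRatio(spilloverGb, totalWeightGb) * 0.70;
}

// Example: 20 GB of a 40 GB model offloaded -> ratio 0.5 -> penalty 0.15 + 0.35 = 0.50.
```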
Macs (M1-M4) do not have separate VRAM and System RAM. The UI automatically zeroes out System RAM when "Apple" is selected, treating the entire memory pool as Unified fast VRAM, bypassing the standard PCIe offload penalty entirely.
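A sketch of how that unified-memory branch might look (the vendor check and field names are assumptions, not the project's actual code):

```typescript
interface MemoryConfig {
  vramGb: number;      // memory the GPU can read at full bandwidth
  systemRamGb: number; // memory reachable only over PCIe (slow tier)
}

// On Apple Silicon the whole unified pool behaves like fast VRAM, so there is
// no separate slow tier and the PCIe offload penalty never applies.
function normalizeMemory(vendor: string, gpuMemoryGb: number, systemRamGb: number): MemoryConfig {
  if (vendor === "Apple") {
    return { vramGb: gpuMemoryGb, systemRamGb: 0 };
  }
  return { vramGb: gpuMemoryGb, systemRamGb };
}
```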
Bandwidth does not scale perfectly linearly when bridging multiple GPUs together:
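The calculator's exact multi-GPU formula is not shown here, but a common way to model the sub-linear scaling is a per-additional-GPU efficiency factor; the 0.9 value below is purely illustrative, not the project's actual constant:

```typescript
// Effective bandwidth across N identical GPUs: the first card contributes its
// full bandwidth; each additional card is discounted by an efficiency factor
// to account for interconnect and synchronization overhead.
function effectiveBandwidthGbs(perGpuBandwidthGbs: number, gpuCount: number, efficiency = 0.9): number {
  if (gpuCount <= 1) return perGpuBandwidthGbs;
  return perGpuBandwidthGbs * (1 + (gpuCount - 1) * efficiency);
}

// Example: two 1000 GB/s cards -> 1000 * (1 + 0.9) = 1900 GB/s, not 2000.
```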
Built with a stunning, ultra-minimalist, high-contrast dark mode aesthetic replicating premium tech tooling (like Vercel and Apple).
If you wish to fork and run this project locally:
```bash
git clone https://github.com/yourusername/RunMyAIModel.git
cd RunMyAIModel
npm install
npm run dev
```
The app will be available at http://localhost:5173.

Non-Commercial & Contribution License
This project is actively maintained by the creator but welcomes community involvement! You are highly encouraged to read the code, fork the repository, and submit Pull Requests to improve the mathematical logic or hardware databases.
However, copyright restrictions apply to commercial usage: you are strictly prohibited from deploying, distributing, or utilizing this code (including its exact mathematical engines, database schema, and UI architecture) for any commercial purpose or for-profit project without explicit written permission from the creator.