A privacy-first platform to manage and run powerful Large Language Models (LLMs) locally, with an optional cloud relay for seamless remote access.
Key Features • Download & Install • Documentation • Development
CloudToLocalLLM bridges the gap between secure local AI execution and the convenience of cloud-based management. Designed for privacy-conscious users and businesses, it allows you to run models like Llama 3 and Mistral entirely on your own hardware while offering an optional, secure pathway for remote interaction.
Note: The project is currently in Heavy Development/Early Access.
- 🔒 Privacy-First: Run models locally using Ollama. Your data stays on your device by default.
- 💻 Cross-Platform: Native support for Windows and Linux, with a responsive Web interface.
- ⚡ Hybrid Architecture: Seamlessly switch between local execution and the optional cloud relay when needed.
- 🔌 Extensible: Integrated with LangChain for advanced AI workflows.
- ☁️ Cloud Infrastructure: Deployed on AWS EKS for scalable management.
- 🏠 Self-Hosted: Easily deploy your own instance on any Linux VPS using Docker Compose.
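For the self-hosted option, a minimal sketch of bringing up an instance on a Linux VPS with Docker Compose might look like the following. The compose file location and service layout are assumptions here; the Self-Hosting Guide below is the authoritative reference.

```bash
# Clone the repository onto the VPS.
git clone https://github.com/CloudToLocalLLM-online/CloudToLocalLLM.git
cd CloudToLocalLLM

# Build and start the stack in the background
# (assumes a docker-compose.yml at the repo root; see the Self-Hosting Guide).
docker compose up -d --build

# Tail the logs to confirm the services came up.
docker compose logs -f
```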
To use CloudToLocalLLM locally:
- Install Ollama, the engine that runs the AI models (a quick check that it is running is sketched after these steps).
- Pull a model: `ollama pull llama3.2`
- Go to the Latest Releases page.
- Download the installer or executable (`.exe` for Windows, `.AppImage` for Linux).
- Launch the application.
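Before launching the application, it can help to confirm that Ollama is running and the model was pulled successfully. A quick check, assuming Ollama is on its default port 11434:

```bash
# List the models Ollama has available locally.
ollama list

# Or query Ollama's local HTTP API directly (default port 11434).
curl http://localhost:11434/api/tags
```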
Latest web deployment: cloudtolocalllm.online
- User Guide: Configuration and usage.
- Self-Hosting Guide: Run your own relay server.
- System Architecture: Technical deep dive.
- AWS Operations: EKS deployment details.
- Frontend: Flutter (Linux, Windows, Web)
- Backend: Node.js (Express.js)
- Authentication: Auth0
- Deployment: AWS EKS (Cloud) or Docker Compose (Self-Hosted)
- Clone: `git clone https://github.com/CloudToLocalLLM-online/CloudToLocalLLM.git`
- Deps: `flutter pub get && (cd services/api-backend && npm install)`
- Run: `flutter run -d linux` (Desktop) or `flutter run -d chrome` (Web)
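To exercise the relay features during development, the Express backend can be started alongside the Flutter client. A rough sketch follows; the `npm start` script is an assumption, so check `services/api-backend/package.json` for the actual entry point.

```bash
# Terminal 1: start the Node.js/Express backend
# (assumes a standard "start" script; verify in services/api-backend/package.json).
cd services/api-backend
npm start

# Terminal 2: run the Flutter client.
flutter run -d linux
```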
This project is licensed under the MIT License.