Your team pasted the same Confluence page into ChatGPT three times this week because nobody trusts the shared account with access to client docs. Legal asked where prompts go. IT asked why Slack history is in a US SaaS vector database. You need ChatGPT energy — with answers grounded in Google Drive, GitHub, and the wiki you already have.
Onyx (formerly Danswer) is built for that problem. ~30k GitHub stars, MIT-licensed Community Edition, and an open-source AI platform that connects to 50+ data sources — Slack, Google Drive, Confluence, GitHub, Salesforce, and more — then answers questions with RAG, agents, web search, and deep research on LLMs you choose.
What it actually does
Onyx is an enterprise AI chat layer, not a bare Ollama front end. Upload files, connect apps, and ask questions — answers cite sources from your indexed knowledge.
Agentic RAG. Hybrid vector + keyword search, background indexing workers, and AI agents that retrieve before they answer. Built for "find the clause in last year's contract" not "write me a poem."
50+ connectors. Sync documents from Google Drive, Slack, Confluence, Notion, GitHub, Jira, and dozens more. Permission mirroring means users only see RAG results from docs they could access in the source app — important when you're not dumping the whole company wiki into one bucket.
Custom agents. Persona, knowledge scope, and actions per agent — support triage, legal review tone, engineering onboarding bot. MCP support for external tool integration.
Deep research. Multi-step research flows that produce long-form reports — Onyx has topped public deep-research benchmarks. Web search via Serper, Brave, SearXNG, or built-in crawlers.
More than chat. Code execution in a sandbox, artifact generation (documents, graphics), voice mode, image generation — optional features you enable when you need them, not day-one requirements.
Any LLM. OpenAI, Anthropic, Gemini, or self-hosted Ollama, vLLM, LiteLLM. Hybrid setups are normal: cloud model for hard reasoning, local model for sensitive threads.
Onyx vs Open WebUI vs Khoj
We've covered lighter AI stacks:
- Open WebUI — excellent multi-user chat front end for Ollama; upload files per thread
- Khoj — personal second brain with scheduled automations and Obsidian hooks
- Jan — desktop ChatGPT feel with local models
Onyx is the team enterprise search lane — connector sync, permission-aware RAG, SSO/RBAC (Enterprise Edition), audit logs, and deep research at org scale. Overkill for solo homelab chat; right-sized when ten people need answers from company docs without pasting into chatgpt.com.
Why self-host?
Company knowledge stays on your infrastructure. Indexed Slack messages, Drive files, and Confluence pages on a Canadian VPS — not an opaque multi-tenant AI vendor's cluster.
PIPEDA and client contracts. "Our AI search runs in Montreal on hardware we control" beats "we use a US SaaS and hope their DPA holds."
Air-gapped option. Onyx advertises fully air-gapped deployment — local LLMs, local index, no outbound calls when policy requires it.
MIT Community Edition. Core chat, RAG, agents, and actions are free under MIT. Enterprise Edition adds SCIM, advanced analytics, whitelabeling — check their pricing if you need those.
What running it takes
Guided install:
curl -fsSL https://onyx.app/install_onyx.sh | bash
The script creates an onyx_data directory, lets you pick deployment mode, and starts Docker Compose.
Onyx Lite — under 1 GB RAM, chat UI and agents without full document indexing. Good for testing or teams that only need LLM chat without connector sync.
Standard Onyx — full stack: vector + keyword index, background workers for connector sync, Redis cache, MinIO blob store, model inference containers for indexing. Plan serious RAM (8 GB+ to start, more as document count grows) and fast disk for the index.
Docs at docs.onyx.app cover Docker, Kubernetes, and cloud guides. HTTPS via reverse proxy; configure SSO if exposing beyond VPN. Back up the onyx_data volume and connector credentials.
Who it's for (and who should skip it)
Good fit: teams replacing "paste into ChatGPT" with grounded internal search, agencies with client docs across Drive and GitHub, Canadian businesses needing audit-friendly AI adoption, orgs with 50+ connector use cases.
Maybe skip it: solo devs who just want Ollama in a browser — Open WebUI is simpler ops. If you won't connect any data sources, Lite mode is a heavy chat box. If 512 MB RAM is your whole VPS, this isn't the app.
Hosting it in Canada
We deploy Standard Onyx on Canadian Docker hosting — sized for Redis, MinIO, and indexing workers, TLS, VPN or SSO in front, and backup scope for the vector index and blob store.
Tell us connector count and document volume — we'll recommend Lite vs Standard honestly and size RAM before the first Confluence sync fills the disk.