Open WebUI (AI Chat)
Open WebUI is a ChatGPT-style interface for interacting with LLMs — with conversation history, document upload for RAG, custom system prompts, and multi-model support.
Aithroyz connects Open WebUI to the LLM Gateway (when deployed) so Claude, GPT-4, and Gemini are available from the model selector without any additional configuration.
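No manual setup is needed when the Gateway is deployed, but for reference, Open WebUI reads its OpenAI-compatible backend from environment variables. A sketch of what a manual hookup might look like, where the URL and key below are placeholders rather than values from this deployment:

```shell
# Point Open WebUI at an OpenAI-compatible endpoint such as an LLM Gateway.
# OPENAI_API_BASE_URL / OPENAI_API_KEY are standard Open WebUI settings;
# the values shown are placeholders.
OPENAI_API_BASE_URL=https://gateway.example.internal/v1
OPENAI_API_KEY=sk-placeholder
```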
Access
URL: https://chat.<env-name>.ops.aithroyz.com
First user: The first account to register on a fresh deployment automatically becomes the admin. Register before sharing the URL.
⚠
Open WebUI does not ship with pre-created accounts. Register your admin account immediately after deployment to prevent unauthorized access.
Key features
Multi-model support
Switch between any model in the LLM Gateway or Ollama from the model selector at the top of each conversation.
Document upload (RAG)
Attach PDFs, DOCX, TXT, and other files directly in chat using the paperclip icon — Open WebUI chunks and embeds them on the fly for context retrieval.
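The "chunks and embeds" step can be illustrated with a minimal sketch. The chunk size and overlap below are arbitrary illustration values; Open WebUI's actual splitter and embedding model are configurable and not reproduced here:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows, as RAG pipelines
    typically do before embedding each piece. Sizes are illustrative."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "A" * 1200
print(len(chunk_text(doc)))  # step of 450 -> chunks at offsets 0, 450, 900
```

Overlap between adjacent chunks keeps sentences that straddle a boundary retrievable from at least one chunk.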
Knowledge collections
Workspace → Knowledge → create a named collection of documents. Attach a collection to a conversation for persistent retrieval across sessions.
Custom model personas
Create named model configurations with a specific base model, system prompt, and temperature — share them with other users in the same instance.
Web search
Enable web search in Admin → Settings → Web Search. Open WebUI fetches live search results and includes them as context in the LLM call.
Image generation
Connect an image generation backend (DALL-E via LLM Gateway, or a local Stable Diffusion endpoint) in Admin → Settings → Images.
Creating a custom persona
Personas let you create pre-configured model identities that any user can select from the model picker:
1. Open Workspace → Models
Click the Workspace icon in the left sidebar, then select Models.
2. New Model
Click + New Model and give it a name (e.g. "Security Analyst").
3. Select base model
Choose any model available in your instance — the LLM Gateway models appear if connected.
4. Write a system prompt
Enter the system prompt that defines the persona's role, tone, and constraints.
5. Save and share
Click Save. The persona appears in the model selector for all users on this instance.
Document RAG
Two ways to use documents as context in a conversation:
Per-message upload
Click the paperclip icon next to the message input to attach a file. It is embedded and used as context for that conversation only.
Knowledge collections
Workspace → Knowledge → New Collection. Add documents. When starting a chat, type # to attach a collection — its contents are retrieved semantically across all sessions.
✓
For large document libraries (100+ files), use Flowise + Qdrant for RAG instead of Open WebUI's built-in knowledge — it scales better and gives you more control over chunking strategy.
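"Retrieved semantically" means the question and each stored chunk are compared as vectors rather than by exact keyword match. A toy illustration using bag-of-words vectors and cosine similarity (Open WebUI actually uses a learned embedding model, which this sketch does not reproduce):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use a neural model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "rotate the API key every 90 days",
    "the office kitchen closes at 6pm",
]
query = "how often should I rotate the key"
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
print(best)  # the key-rotation chunk scores higher
```

At retrieval time only the top-scoring chunks are pasted into the LLM's context, which is why a collection can be far larger than the model's context window.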
Tips
Pipelines
Admin → Pipelines lets you add Python middleware between the user message and the LLM — useful for content filtering, prompt injection detection, or logging.
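As a sketch of what such middleware does: the class below is a standalone illustration of rewriting the request body on the way in, not the exact Pipelines class signature, which you should take from the Pipelines project itself. It masks API-key-like strings in user messages before they reach the LLM:

```python
import re

class RedactSecretsFilter:
    """Standalone sketch of a pre-LLM content filter. The real Open WebUI
    Pipelines framework defines its own class shape; only the idea is shown."""
    SECRET = re.compile(r"sk-[A-Za-z0-9]{8,}")

    def inlet(self, body: dict) -> dict:
        # Rewrite user messages, masking anything that looks like an API key.
        for msg in body.get("messages", []):
            if msg.get("role") == "user":
                msg["content"] = self.SECRET.sub("[REDACTED]", msg["content"])
        return body

body = {"messages": [{"role": "user", "content": "my key is sk-abc123def456"}]}
print(RedactSecretsFilter().inlet(body)["messages"][0]["content"])
# my key is [REDACTED]
```

The same hook point works for logging or prompt-injection heuristics, since the filter sees the full message list before the model does.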
Functions
Workspace → Functions lets you write custom Python tools that the LLM can call (function calling / tool use) — similar to GPT custom actions.
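The tool-use idea behind Functions can be sketched as a typed Python function plus a JSON-schema-style description the model can see. This is a standalone illustration, not the exact Functions class layout; the hypothetical get_server_status tool and its data are invented for the example:

```python
import json

def get_server_status(service: str) -> str:
    """Hypothetical tool: a real one would query live infrastructure."""
    known = {"gateway": "healthy", "qdrant": "healthy"}
    return json.dumps({"service": service, "status": known.get(service, "unknown")})

# A JSON-schema-style description like the one a tool-calling LLM receives.
tool_schema = {
    "name": "get_server_status",
    "description": "Return the status of a named service",
    "parameters": {
        "type": "object",
        "properties": {"service": {"type": "string"}},
        "required": ["service"],
    },
}

print(get_server_status("gateway"))
```

The model decides when to call the tool based on the schema; the host executes the Python and feeds the JSON result back into the conversation.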
User management
As admin, invite users via Admin → Users. Each user gets their own conversation history while sharing the model connections.