Merged
179 commits
1d30d8a
build(deps): bump python-multipart from 0.0.18 to 0.0.20 in /backend
dependabot[bot] Apr 1, 2025
fc5c8db
build(deps): bump einops from 0.8.0 to 0.8.1 in /backend
dependabot[bot] Apr 1, 2025
bb0e11b
build(deps): bump pypandoc from 1.13 to 1.15 in /backend
dependabot[bot] Apr 1, 2025
e0ec2cd
refac: $user
tjbck Apr 1, 2025
abba7c1
[FEAT]-Adjust Translations for temporary chat
weisser-dev Apr 1, 2025
361584b
fix: update Polish translations for clarity and accuracy
lukasz-pekala Apr 1, 2025
295c7eb
[improvement] default permission for new groups is false for enforce …
weisser-dev Apr 1, 2025
b60beb6
fix: improve Polish translations for clarity and accuracy
lukasz-pekala Apr 1, 2025
79cca06
Merge pull request #12260 from weisser-dev/feat-default-value-for-enf…
tjbck Apr 1, 2025
f3c2614
Merge pull request #12253 from open-webui/dependabot/pip/backend/dev/…
tjbck Apr 1, 2025
4ce05ad
Merge pull request #12255 from open-webui/dependabot/pip/backend/dev/…
tjbck Apr 1, 2025
cea5c1a
Merge pull request #12254 from open-webui/dependabot/pip/backend/dev/…
tjbck Apr 1, 2025
9f470a4
fix: update Polish translations to sound more natural
lukasz-pekala Apr 1, 2025
c8210d4
upated like in PR discussed
weisser-dev Apr 1, 2025
70bb056
Merge branch 'open-webui:dev' into dev
weisser-dev Apr 1, 2025
29d5745
fix: update Polish translation for "Capture" to improve accuracy
lukasz-pekala Apr 1, 2025
c79fa42
Merge pull request #12262 from lukasz-pekala/fb/fixes-to-polish-trans…
tjbck Apr 1, 2025
15f6e3a
Merge pull request #12259 from weisser-dev/dev
tjbck Apr 1, 2025
85c8a9b
i18n: zh-cn
panda44312 Apr 1, 2025
8799ff9
fix
panda44312 Apr 1, 2025
fa72c27
i18n: update zh-TW
TiancongLx Apr 1, 2025
289d49d
Merge pull request #12269 from panda44312/patch-9
tjbck Apr 1, 2025
fffa793
Merge pull request #12270 from TiancongLx/dev
tjbck Apr 1, 2025
825becc
Arabic Translation
Saidoua Apr 1, 2025
1ac6879
Add Mistral OCR integration and configuration support
paddy313 Mar 22, 2025
b652b8e
Update translation.json
Xelaph Apr 1, 2025
88b9324
Update translation.json
Xelaph Apr 1, 2025
82e5a64
Update translation.json
Xelaph Apr 1, 2025
883ad55
Update translation.json
Xelaph Apr 1, 2025
8f8c344
Pin onnxruntime to 1.20.1 to address SIGILL on certain arm64 hosts
lowlyocean Apr 1, 2025
0447d90
Update translation.json
OriginalSimon Apr 1, 2025
93d7702
refactor: move MistralLoader to a separate module and just use the re…
paddy313 Apr 1, 2025
c5a8d2f
refactor: update MistralLoader documentation and adjust parameters fo…
paddy313 Apr 1, 2025
2b7dd6e
refactor: standardize filter valve retrieval logic
landerrosette Apr 1, 2025
39b0c06
Merge pull request #12299 from lowlyocean/rpi_onnxruntime
tjbck Apr 1, 2025
1b37bdc
Merge pull request #12278 from Saidoua/main
tjbck Apr 1, 2025
2a16399
Merge pull request #12294 from Xelaph/dev
tjbck Apr 1, 2025
5ab7830
Merge pull request #12304 from OriginalSimon/main
tjbck Apr 1, 2025
adaa614
fix
panda44312 Apr 2, 2025
d0db475
Merge pull request #12310 from landerrosette/fix_filter_priority
tjbck Apr 2, 2025
0ac00b9
refactor: update import path for MistralLoader
paddy313 Apr 2, 2025
de8f94b
[i18n] Russian localization update
SadmL Apr 2, 2025
d65471c
fix
silentoplayz Apr 2, 2025
ee68c9e
Update Chats.svelte
silentoplayz Apr 2, 2025
517a57b
Merge pull request #12331 from panda44312/patch-10
tjbck Apr 2, 2025
0554bbb
Merge pull request #12307 from paddy313/feature/mistral_ocr
tjbck Apr 2, 2025
6ee3004
Merge pull request #12353 from SadmL/dev
tjbck Apr 2, 2025
bb7f4d4
Merge pull request #12364 from silentoplayz/archive-chats-option
tjbck Apr 3, 2025
548c7f1
Added OAUTH_USE_PICTURE_CLAIM env var
CityOfBunbury Apr 3, 2025
dd5bafe
Update env.py
silentoplayz Apr 3, 2025
9036945
Merge pull request #12355 from silentoplayz/logging-fix
tjbck Apr 3, 2025
0644abe
fix: admin folder deletion issue
tjbck Apr 3, 2025
94bf494
enh: unload hybrid model if set to False
tjbck Apr 3, 2025
7eea95a
feat: direct tools user permissions
tjbck Apr 3, 2025
9435345
refac: settings ui styling
tjbck Apr 3, 2025
506950b
Merge pull request #12376 from MushroomLamp-COB/main
tjbck Apr 3, 2025
b15bf0d
refac
tjbck Apr 3, 2025
7a1e10f
refac: rm OAUTH_USE_PICTURE_CLAIM
tjbck Apr 3, 2025
959995c
refac: use selected model for merge response
tjbck Apr 3, 2025
2277566
fix: tool server api key not being sent
tjbck Apr 3, 2025
5c5160c
refac: remove `None` params
tjbck Apr 3, 2025
c0711ba
refac
tjbck Apr 3, 2025
436e3ff
refac
tjbck Apr 3, 2025
561b2c0
refac: styling
tjbck Apr 3, 2025
faa68fc
enh: image tool response
tjbck Apr 3, 2025
9113218
refac
tjbck Apr 3, 2025
100f5a5
refac
tjbck Apr 3, 2025
bcf0a87
refac: styling
tjbck Apr 3, 2025
ba77a72
refac: styling
tjbck Apr 3, 2025
dfdde8b
refac
tjbck Apr 3, 2025
a1f3300
fix: tls cert requirement
tjbck Apr 3, 2025
be20e6d
refac: message edit
tjbck Apr 3, 2025
2b0d9c2
fix: don't show export button if nothing to export
silentoplayz Apr 4, 2025
1366822
Update Feedbacks.svelte
silentoplayz Apr 4, 2025
6d5cb6b
Add query param to remove content from GET /api/v1/files
gaby Apr 4, 2025
41b9ff9
fix: funcs, models, prompts, & tools conditional export button
silentoplayz Apr 4, 2025
1c57e3e
Fix API_KEY_ALLOWED_ENDPOINTS
gaby Apr 4, 2025
a25d876
i18n: add Tibetan language
thirdpoler Apr 4, 2025
ec3435d
make content parameter optional in OpenAI chat completion API endpoint
Apr 4, 2025
138e985
Rename field to include_content
gaby Apr 4, 2025
3b2b6e1
Added missing parameter for query_doc_with_hybrid_search.
mahenning Apr 4, 2025
64b68b3
Merge pull request #12447 from floriankick/fix-openai-api-empty-messa…
tjbck Apr 4, 2025
a32bb85
Merge pull request #12450 from mahenning/fix-missing-parameter-rag
tjbck Apr 4, 2025
77fa110
enh: openapi tool server custom path
tjbck Apr 4, 2025
feaf434
refac: input prompt
tjbck Apr 4, 2025
b612af2
refac
tjbck Apr 4, 2025
7e46cc4
Merge pull request #12439 from narutopden/i18n-add-Tibetan-translation
tjbck Apr 4, 2025
793aa30
Merge pull request #12433 from gaby/fix-allowed-endpoints
tjbck Apr 4, 2025
cb94a87
refac: tools styling
tjbck Apr 5, 2025
5e06b6d
refac: styling
tjbck Apr 5, 2025
4ad10f0
chore: format
tjbck Apr 5, 2025
193a927
fix: temp chat
tjbck Apr 5, 2025
a2f2203
Merge pull request #12426 from silentoplayz/fix-feedback
tjbck Apr 5, 2025
8cf8121
Update utils.py
Phlogi Apr 5, 2025
0f310b3
Merge pull request #12476 from Phlogi/dev-hybrid-search
tjbck Apr 5, 2025
e0da600
tweak default rag template to be more coherent and improve consistenc…
Ithanil Apr 5, 2025
1b8557c
i18n: improve fr translation
theredcat Apr 5, 2025
0c0505e
refac
tjbck Apr 5, 2025
ee44383
refac
tjbck Apr 5, 2025
579aca6
Merge pull request #12477 from Ithanil/improved_rag_template
tjbck Apr 5, 2025
61778c8
Merge pull request #12478 from theredcat/main
tjbck Apr 5, 2025
9747a0e
refac: tool servers
tjbck Apr 5, 2025
c9e9ce9
refac
tjbck Apr 5, 2025
66db2e1
refac: tools removed UNNECESSARY CODE
tjbck Apr 5, 2025
93bb77e
refac
tjbck Apr 5, 2025
e570a98
refac: substandard codebase overhauled
tjbck Apr 5, 2025
56dc7c5
refac
tjbck Apr 5, 2025
ae484e8
refac
tjbck Apr 5, 2025
807b208
refac
tjbck Apr 5, 2025
fe4f760
refac
tjbck Apr 5, 2025
09344bb
Fallback from desc to summary to placeholder
dan-sullivan Apr 5, 2025
84788b7
Update translation.json
Kylapaallikko Apr 5, 2025
b4277c7
Make auth error messages generic
gaby Apr 5, 2025
3245504
Fix formatting issues
gaby Apr 5, 2025
d23f757
refac
tjbck Apr 5, 2025
32f309b
chore: format
tjbck Apr 5, 2025
cd0a1b4
fix: fix for text file handling with docling
FabioPolito24 Apr 5, 2025
b3110ca
Merge branch 'dev' into dev
tjbck Apr 5, 2025
69c68df
Merge pull request #12480 from Kylapaallikko/dev
tjbck Apr 5, 2025
48d690c
Merge pull request #12481 from gaby/generic-errors
tjbck Apr 5, 2025
ef787e4
Merge pull request #12486 from FabioPolito24/text-file-handling-docling
tjbck Apr 5, 2025
d5063f4
doc: changelog
tjbck Apr 5, 2025
13dfca4
chore: bump
tjbck Apr 5, 2025
da94835
Merge pull request #12373 from open-webui/dev
tjbck Apr 5, 2025
5296ee0
refac
tjbck Apr 5, 2025
41aca84
Merge pull request #12491 from open-webui/dev
tjbck Apr 5, 2025
33451cf
i18n: update zh-TW
TiancongLx Apr 5, 2025
7162566
Merge pull request #12496 from dan-sullivan/fix-12479/add-desc-to-ope…
tjbck Apr 5, 2025
9ea6cea
refac
tjbck Apr 5, 2025
26e1bfc
Merge remote-tracking branch 'upstream/dev' into dev
TiancongLx Apr 6, 2025
2729d8a
fix web results all getting the same source id when bypassing embeddi…
Ithanil Apr 6, 2025
4476060
fix web results all getting the same source id when using embedding a…
Ithanil Apr 6, 2025
2994c58
Update translation.json
newnol Apr 6, 2025
246c839
i18n: zh-cn
panda44312 Apr 6, 2025
60c6e00
i18n: fix
panda44312 Apr 6, 2025
eda3eba
Merge branch 'open-webui:main' into fix-12237
gaby Apr 6, 2025
ff1d454
Fix formatting
gaby Apr 6, 2025
d266490
Fix dependabot configuration
gaby Apr 6, 2025
a506a1a
only keep URLs as sources for which the content could actually be ret…
Ithanil Apr 6, 2025
9445383
Update index.ts
gaby Apr 6, 2025
6323b9f
Merge pull request #12515 from gaby/fix-dependabot
tjbck Apr 6, 2025
635c08a
Merge pull request #12517 from Ithanil/only_keep_retrieved_urls
tjbck Apr 6, 2025
b8cf9b7
Merge pull request #12511 from panda44312/patch-11
tjbck Apr 6, 2025
51bf9ea
Merge pull request #12493 from TiancongLx/dev
tjbck Apr 6, 2025
89e7913
Bump Python base Docker image to 3.12
gaby Apr 6, 2025
1bde36e
Merge pull request #12510 from newnol/patch-1
tjbck Apr 6, 2025
1e98ae7
Merge pull request #12431 from gaby/fix-12237
tjbck Apr 6, 2025
c1ff697
refac
tjbck Apr 6, 2025
624c525
Optimize GitHub actions
gaby Apr 6, 2025
6274f4d
Merge pull request #12520 from gaby/bump-docker
tjbck Apr 6, 2025
6751d68
Merge pull request #12506 from Ithanil/fix_web_result_source_ids
tjbck Apr 6, 2025
64a0b28
refac
tjbck Apr 6, 2025
bf11871
Merge pull request #12521 from gaby/optimize-workflow
tjbck Apr 6, 2025
dc2f8ec
refac
tjbck Apr 6, 2025
66cd40f
chore: format
tjbck Apr 6, 2025
9825d03
Merge pull request #12507 from Ithanil/fix_web_result_collection_sour…
tjbck Apr 6, 2025
155dbd5
refac
tjbck Apr 6, 2025
f243e52
refac
tjbck Apr 6, 2025
2722613
refac
tjbck Apr 6, 2025
5d2e725
refac
tjbck Apr 6, 2025
20211a4
refac
tjbck Apr 6, 2025
03759e7
refac
tjbck Apr 7, 2025
3413dd8
refac
tjbck Apr 7, 2025
bc17511
refac
tjbck Apr 7, 2025
65ed76a
refac: embedding prefix
tjbck Apr 7, 2025
cbe2056
fix: audio file upload response issue
tjbck Apr 7, 2025
a8bc0d6
chore: format
tjbck Apr 7, 2025
914eb49
chore: include `accelerate` dependency
tjbck Apr 7, 2025
f1a4afe
i18n: update zh-TW
TiancongLx Apr 7, 2025
fabe587
Merge pull request #12525 from TiancongLx/dev
tjbck Apr 7, 2025
2be08f2
revert
tjbck Apr 7, 2025
06820e7
refac: citations modal
tjbck Apr 7, 2025
bd28173
Track backend/requirements.txt updates
gaby Apr 7, 2025
3885ea4
Merge pull request #12526 from gaby/dependabot-update
tjbck Apr 7, 2025
40d019f
refac
tjbck Apr 7, 2025
93962f0
doc: changelog
tjbck Apr 7, 2025
2c083f5
chore: format
tjbck Apr 7, 2025
8e8bc31
doc: changelog
tjbck Apr 7, 2025
63533c9
Merge pull request #12524 from open-webui/dev
tjbck Apr 7, 2025
14 changes: 14 additions & 0 deletions .github/dependabot.yml
Original file line number Diff line number Diff line change
@@ -1,12 +1,26 @@
version: 2
updates:
  - package-ecosystem: uv
    directory: '/'
    schedule:
      interval: monthly
    target-branch: 'dev'

  - package-ecosystem: pip
    directory: '/backend'
    schedule:
      interval: monthly
    target-branch: 'dev'

  - package-ecosystem: npm
    directory: '/'
    schedule:
      interval: monthly
    target-branch: 'dev'

  - package-ecosystem: 'github-actions'
    directory: '/'
    schedule:
      # Check for updates to GitHub Actions every week
      interval: monthly
    target-branch: 'dev'
14 changes: 12 additions & 2 deletions .github/workflows/format-backend.yaml
@@ -5,10 +5,18 @@ on:
     branches:
       - main
       - dev
+    paths:
+      - 'backend/**'
+      - 'pyproject.toml'
+      - 'uv.lock'
   pull_request:
     branches:
       - main
       - dev
+    paths:
+      - 'backend/**'
+      - 'pyproject.toml'
+      - 'uv.lock'

 jobs:
   build:
@@ -17,15 +25,17 @@ jobs:

     strategy:
       matrix:
-        python-version: [3.11]
+        python-version:
+          - 3.11.x
+          - 3.12.x

     steps:
       - uses: actions/checkout@v4

       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: ${{ matrix.python-version }}
+          python-version: '${{ matrix.python-version }}'

       - name: Install dependencies
         run: |
10 changes: 9 additions & 1 deletion .github/workflows/format-build-frontend.yaml
@@ -5,10 +5,18 @@ on:
     branches:
       - main
       - dev
+    paths-ignore:
+      - 'backend/**'
+      - 'pyproject.toml'
+      - 'uv.lock'
   pull_request:
     branches:
       - main
       - dev
+    paths-ignore:
+      - 'backend/**'
+      - 'pyproject.toml'
+      - 'uv.lock'

 jobs:
   build:
@@ -21,7 +29,7 @@ jobs:
       - name: Setup Node.js
         uses: actions/setup-node@v4
         with:
-          node-version: '22' # Or specify any other version you want to use
+          node-version: '22'

       - name: Install Dependencies
         run: npm install
40 changes: 40 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,46 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.6.2] - 2025-04-06

### Added

- 🌍 **Improved Global Language Support**: Expanded and refined translations across multiple languages to enhance clarity and consistency for international users.

### Fixed

- 🛠️ **Accurate Tool Descriptions from OpenAPI Servers**: External tools now use full endpoint descriptions instead of summaries when generating tool specifications—helping AI models understand tool purpose more precisely and choose the right tool more accurately in tool workflows.
- 🔧 **Precise Web Results Source Attribution**: Fixed a key issue where all web search results showed the same source ID—now each result gets its correct and distinct source, ensuring accurate citations and traceability.
- 🔍 **Clean Web Search Retrieval**: Web search now retains only results from URLs where real content was successfully fetched—improving accuracy and removing empty or broken links from citations.
- 🎵 **Audio File Upload Response Restored**: Resolved an issue where uploading audio files did not return valid responses, restoring smooth file handling for transcription and audio-based workflows.
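
The tool-description fix in this list can be pictured as a simple fallback chain: prefer the endpoint's full description, fall back to its summary, and use a placeholder only when both are missing. The following is a minimal sketch, not the project's code. `tool_description` and the placeholder string are hypothetical, and the only assumption about the input is that an OpenAPI operation object may carry `description` and `summary` fields.

```python
def tool_description(operation: dict, placeholder: str = "No description available.") -> str:
    """Pick the text used to describe a tool built from an OpenAPI operation.

    Order of preference: description, then summary, then the placeholder.
    Empty or whitespace-only values are treated as missing.
    """
    for key in ("description", "summary"):
        value = operation.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return placeholder


# Hypothetical operation objects, for illustration only.
full = {"summary": "Get weather", "description": "Return the current weather for a city."}
short = {"summary": "Get weather"}
bare = {}

print(tool_description(full))
print(tool_description(short))
print(tool_description(bare))
```

Treating blank strings as missing avoids handing a model an empty description when a server declares the field but leaves it unfilled.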

### Changed

- 🧰 **General Backend Refactoring**: Multiple behind-the-scenes improvements streamline backend performance, reduce complexity, and ensure a more stable, maintainable system overall—making everything smoother without changing your workflow.

## [0.6.1] - 2025-04-05

### Added

- 🛠️ **Global Tool Servers Configuration**: Admins can now centrally configure global external tool servers from Admin Settings > Tools, allowing seamless sharing of tool integrations across all users without manual setup per user.
- 🔐 **Direct Tool Usage Permission for Users**: Introduced a new user-level permission toggle that grants non-admin users access to direct external tools, empowering broader team collaboration while maintaining control.
- 🧠 **Mistral OCR Content Extraction Support**: Added native support for Mistral OCR as a high-accuracy document loader, drastically improving text extraction from scanned documents in RAG workflows.
- 🖼️ **Tools Indicator UI Redesign**: Enhanced message input now smartly displays both built-in and external tools via a unified dropdown, making it simpler and more intuitive to activate tools during conversations.
- 📄 **RAG Prompt Improved and More Coherent**: Default RAG system prompt has been revised to be more clear and citation-focused—admins can leave the template field empty to use this new gold-standard prompt.
- 🧰 **Performance & Developer Improvements**: Major internal restructuring of several tool-related components, simplifying styling and merging external/internal handling logic, resulting in better maintainability and performance.
- 🌍 **Improved Translations**: Updated translations for Tibetan, Polish, Chinese (Simplified & Traditional), Arabic, Russian, Ukrainian, Dutch, Finnish, and French to improve clarity and consistency across the interface.

### Fixed

- 🔑 **External Tool Server API Key Bug Resolved**: Fixed a critical issue where authentication headers were not being sent when calling tools from external OpenAPI tool servers, ensuring full security and smooth tool operations.
- 🚫 **Conditional Export Button Visibility**: UI now gracefully hides export buttons when there's nothing to export in models, prompts, tools, or functions, improving visual clarity and reducing confusion.
- 🧪 **Hybrid Search Failure Recovery**: Resolved edge case in parallel hybrid search where empty or unindexed collections caused backend crashes—these are now cleanly skipped to ensure system stability.
- 📂 **Admin Folder Deletion Fix**: Addressed an issue where folders created in the admin workspace couldn't be deleted, restoring full organizational flexibility for admins.
- 🔐 **Improved Generic Error Feedback on Login**: Authentication errors now show simplified, non-revealing messages for privacy and improved UX, especially with federated logins.
- 📝 **Tool Message with Images Improved**: Enhanced how tool-generated messages with image outputs are shown in chat, making them more readable and consistent with the overall UI design.
- ⚙️ **Auto-Exclusion for Broken RAG Collections**: Auto-skips document collections that fail to fetch data or return "None", preventing silent errors and streamlining retrieval workflows.
- 📝 **Docling Text File Handling Fix**: Fixed file parsing inconsistency that broke docling-based RAG functionality for certain plain text files, ensuring wider file compatibility.

## [0.6.0] - 2025-03-31

### Added
38 changes: 31 additions & 7 deletions backend/open_webui/config.py
@@ -331,12 +331,14 @@ def __getattr__(self, key):
 # OAuth config
 ####################################

+
 ENABLE_OAUTH_SIGNUP = PersistentConfig(
     "ENABLE_OAUTH_SIGNUP",
     "oauth.enable_signup",
     os.environ.get("ENABLE_OAUTH_SIGNUP", "False").lower() == "true",
 )

+
 OAUTH_MERGE_ACCOUNTS_BY_EMAIL = PersistentConfig(
     "OAUTH_MERGE_ACCOUNTS_BY_EMAIL",
     "oauth.merge_accounts_by_email",
@@ -466,6 +468,7 @@ def __getattr__(self, key):
     os.environ.get("OAUTH_USERNAME_CLAIM", "name"),
 )

+
 OAUTH_PICTURE_CLAIM = PersistentConfig(
     "OAUTH_PICTURE_CLAIM",
     "oauth.oidc.avatar_claim",
@@ -878,6 +881,17 @@ def oidc_oauth_register(client):
     pass
 OPENAI_API_BASE_URL = "https://api.openai.com/v1"

+####################################
+# TOOL_SERVERS
+####################################
+
+
+TOOL_SERVER_CONNECTIONS = PersistentConfig(
+    "TOOL_SERVER_CONNECTIONS",
+    "tool_server.connections",
+    [],
+)
+
 ####################################
 # WEBUI
 ####################################
@@ -1034,6 +1048,11 @@ def oidc_oauth_register(client):
     == "true"
 )

+USER_PERMISSIONS_FEATURES_DIRECT_TOOL_SERVERS = (
+    os.environ.get("USER_PERMISSIONS_FEATURES_DIRECT_TOOL_SERVERS", "False").lower()
+    == "true"
+)
+
 USER_PERMISSIONS_FEATURES_WEB_SEARCH = (
     os.environ.get("USER_PERMISSIONS_FEATURES_WEB_SEARCH", "True").lower() == "true"
 )
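
The permission flags in this hunk all use the same idiom: read an environment variable with a string default, lowercase it, and compare it to `"true"`. A minimal sketch of that idiom as a helper follows. `env_flag` and `DEMO_FLAG` are hypothetical names used only for illustration and are not defined in `config.py`.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag the way the hunk above does.

    Only the string "true" (in any letter case) enables the flag;
    every other value, including "1" or "yes", leaves it disabled.
    """
    raw = os.environ.get(name, str(default))
    return raw.lower() == "true"


# Hypothetical variable name, for illustration only.
os.environ["DEMO_FLAG"] = "True"
print(env_flag("DEMO_FLAG"))
```

One consequence worth keeping in mind when setting `USER_PERMISSIONS_FEATURES_DIRECT_TOOL_SERVERS`: values such as `1` or `yes` do not enable the permission under this comparison.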
@@ -1071,6 +1090,7 @@ def oidc_oauth_register(client):
         "temporary_enforced": USER_PERMISSIONS_CHAT_TEMPORARY_ENFORCED,
     },
     "features": {
+        "direct_tool_servers": USER_PERMISSIONS_FEATURES_DIRECT_TOOL_SERVERS,
         "web_search": USER_PERMISSIONS_FEATURES_WEB_SEARCH,
         "image_generation": USER_PERMISSIONS_FEATURES_IMAGE_GENERATION,
         "code_interpreter": USER_PERMISSIONS_FEATURES_CODE_INTERPRETER,
@@ -1727,6 +1747,11 @@ class BannerModel(BaseModel):
     os.getenv("DOCUMENT_INTELLIGENCE_KEY", ""),
 )

+MISTRAL_OCR_API_KEY = PersistentConfig(
+    "MISTRAL_OCR_API_KEY",
+    "rag.mistral_ocr_api_key",
+    os.getenv("MISTRAL_OCR_API_KEY", ""),
+)

 BYPASS_EMBEDDING_AND_RETRIEVAL = PersistentConfig(
     "BYPASS_EMBEDDING_AND_RETRIEVAL",
@@ -1875,26 +1900,25 @@ class BannerModel(BaseModel):
 )

 DEFAULT_RAG_TEMPLATE = """### Task:
-Respond to the user query using the provided context, incorporating inline citations in the format [source_id] **only when the <source_id> tag is explicitly provided** in the context.
+Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id="1">).

 ### Guidelines:
 - If you don't know the answer, clearly state that.
 - If uncertain, ask the user for clarification.
 - Respond in the same language as the user's query.
 - If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
 - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
-- **Only include inline citations using [source_id] (e.g., [1], [2]) when a `<source_id>` tag is explicitly provided in the context.**
-- Do not cite if the <source_id> tag is not provided in the context.
+- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
+- Do not cite if the <source> tag does not contain an id attribute.
 - Do not use XML tags in your response.
 - Ensure citations are concise and directly related to the information provided.

 ### Example of Citation:
-If the user asks about a specific topic and the information is found in "whitepaper.pdf" with a provided <source_id>, the response should include the citation like so:
-* "According to the study, the proposed method increases efficiency by 20% [whitepaper.pdf]."
-If no <source_id> is present, the response should omit the citation.
+If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
+* "According to the study, the proposed method increases efficiency by 20% [1]."

 ### Output:
-Provide a clear and direct response to the user's query, including inline citations in the format [source_id] only when the <source_id> tag is present in the context.
+Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

 <context>
 {{CONTEXT}}
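
To illustrate the shape of context the revised template asks the model to cite from, here is a minimal sketch that wraps retrieved chunks in `<source id="N">` tags and substitutes them for `{{CONTEXT}}`. How Open WebUI actually assembles the context is not shown in this diff, so `build_context`, `render_template`, and the `(source name, text)` input shape are all assumptions made for the example.

```python
def build_context(chunks: list[tuple[str, str]]) -> str:
    """Wrap each (source_name, text) chunk in a <source id="N"> tag.

    Ids are assigned per distinct source in first-seen order, so two chunks
    from the same document share an id and different documents never do.
    """
    ids: dict[str, int] = {}
    parts = []
    for name, text in chunks:
        source_id = ids.setdefault(name, len(ids) + 1)
        parts.append(f'<source id="{source_id}">{text}</source>')
    return "\n".join(parts)


def render_template(template: str, context: str) -> str:
    """Fill the {{CONTEXT}} placeholder used by the template above."""
    return template.replace("{{CONTEXT}}", context)


# Hypothetical retrieved chunks, for illustration only.
chunks = [
    ("whitepaper.pdf", "The method increases efficiency by 20%."),
    ("notes.md", "Benchmarks were run on a single node."),
    ("whitepaper.pdf", "Results hold across three datasets."),
]
print(render_template("<context>\n{{CONTEXT}}\n</context>", build_context(chunks)))
```

With numeric ids in the context, a citation such as `[1]` in the answer can be mapped back to exactly one source.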
18 changes: 18 additions & 0 deletions backend/open_webui/main.py
@@ -105,6 +105,8 @@
     OPENAI_API_CONFIGS,
     # Direct Connections
     ENABLE_DIRECT_CONNECTIONS,
+    # Tool Server Configs
+    TOOL_SERVER_CONNECTIONS,
     # Code Execution
     ENABLE_CODE_EXECUTION,
     CODE_EXECUTION_ENGINE,
@@ -191,6 +193,7 @@
     DOCLING_SERVER_URL,
     DOCUMENT_INTELLIGENCE_ENDPOINT,
     DOCUMENT_INTELLIGENCE_KEY,
+    MISTRAL_OCR_API_KEY,
     RAG_TOP_K,
     RAG_TOP_K_RERANKER,
     RAG_TEXT_SPLITTER,
@@ -355,6 +358,7 @@

 from open_webui.utils.auth import (
     get_license_data,
+    get_http_authorization_cred,
     decode_token,
     get_admin_user,
     get_verified_user,
@@ -477,6 +481,15 @@ async def lifespan(app: FastAPI):

 app.state.OPENAI_MODELS = {}

+########################################
+#
+# TOOL SERVERS
+#
+########################################
+
+app.state.config.TOOL_SERVER_CONNECTIONS = TOOL_SERVER_CONNECTIONS
+app.state.TOOL_SERVERS = []
+
 ########################################
 #
 # DIRECT CONNECTIONS
@@ -582,6 +595,7 @@ async def lifespan(app: FastAPI):
 app.state.config.DOCLING_SERVER_URL = DOCLING_SERVER_URL
 app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT = DOCUMENT_INTELLIGENCE_ENDPOINT
 app.state.config.DOCUMENT_INTELLIGENCE_KEY = DOCUMENT_INTELLIGENCE_KEY
+app.state.config.MISTRAL_OCR_API_KEY = MISTRAL_OCR_API_KEY

 app.state.config.TEXT_SPLITTER = RAG_TEXT_SPLITTER
 app.state.config.TIKTOKEN_ENCODING_NAME = TIKTOKEN_ENCODING_NAME
@@ -862,6 +876,10 @@ async def commit_session_after_request(request: Request, call_next):
 @app.middleware("http")
 async def check_url(request: Request, call_next):
     start_time = int(time.time())
+    request.state.token = get_http_authorization_cred(
+        request.headers.get("Authorization")
+    )
+
     request.state.enable_api_key = app.state.config.ENABLE_API_KEY
     response = await call_next(request)
     process_time = int(time.time()) - start_time
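
The middleware change stores a parsed credential on `request.state.token` for every request. The body of `get_http_authorization_cred` is not part of this diff, so the following is only a sketch of what such a parser typically does with an `Authorization` header value. `Credential` and `parse_authorization` are hypothetical names, and the return shape is an assumption.

```python
from typing import NamedTuple, Optional


class Credential(NamedTuple):
    """Hypothetical container for the two parts of an Authorization header."""
    scheme: str
    token: str


def parse_authorization(header: Optional[str]) -> Optional[Credential]:
    """Split 'Scheme token' into its parts; return None if either is missing."""
    if not header:
        return None
    scheme, _, token = header.strip().partition(" ")
    token = token.strip()
    if not scheme or not token:
        return None
    return Credential(scheme, token)


# Hypothetical header value, for illustration only.
print(parse_authorization("Bearer abc123"))
print(parse_authorization(None))
```

Returning `None` for a missing or malformed header lets later handlers check for a credential without repeating the parsing.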
38 changes: 27 additions & 11 deletions backend/open_webui/retrieval/loaders/main.py
@@ -20,6 +20,9 @@
     YoutubeLoader,
 )
 from langchain_core.documents import Document
+
+from open_webui.retrieval.loaders.mistral import MistralLoader
+
 from open_webui.env import SRC_LOG_LEVELS, GLOBAL_LOG_LEVEL

 logging.basicConfig(stream=sys.stdout, level=GLOBAL_LOG_LEVEL)
@@ -181,13 +184,16 @@ def load(
             for doc in docs
         ]

+    def _is_text_file(self, file_ext: str, file_content_type: str) -> bool:
+        return file_ext in known_source_ext or (
+            file_content_type and file_content_type.find("text/") >= 0
+        )
+
     def _get_loader(self, filename: str, file_content_type: str, file_path: str):
         file_ext = filename.split(".")[-1].lower()

         if self.engine == "tika" and self.kwargs.get("TIKA_SERVER_URL"):
-            if file_ext in known_source_ext or (
-                file_content_type and file_content_type.find("text/") >= 0
-            ):
+            if self._is_text_file(file_ext, file_content_type):
                 loader = TextLoader(file_path, autodetect_encoding=True)
             else:
                 loader = TikaLoader(
@@ -196,11 +202,14 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
                     mime_type=file_content_type,
                 )
         elif self.engine == "docling" and self.kwargs.get("DOCLING_SERVER_URL"):
-            loader = DoclingLoader(
-                url=self.kwargs.get("DOCLING_SERVER_URL"),
-                file_path=file_path,
-                mime_type=file_content_type,
-            )
+            if self._is_text_file(file_ext, file_content_type):
+                loader = TextLoader(file_path, autodetect_encoding=True)
+            else:
+                loader = DoclingLoader(
+                    url=self.kwargs.get("DOCLING_SERVER_URL"),
+                    file_path=file_path,
+                    mime_type=file_content_type,
+                )
         elif (
             self.engine == "document_intelligence"
             and self.kwargs.get("DOCUMENT_INTELLIGENCE_ENDPOINT") != ""
@@ -222,6 +231,15 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
                 api_endpoint=self.kwargs.get("DOCUMENT_INTELLIGENCE_ENDPOINT"),
                 api_key=self.kwargs.get("DOCUMENT_INTELLIGENCE_KEY"),
             )
+        elif (
+            self.engine == "mistral_ocr"
+            and self.kwargs.get("MISTRAL_OCR_API_KEY") != ""
+            and file_ext
+            in ["pdf"]  # Mistral OCR currently only supports PDF and images
+        ):
+            loader = MistralLoader(
+                api_key=self.kwargs.get("MISTRAL_OCR_API_KEY"), file_path=file_path
+            )
         else:
             if file_ext == "pdf":
                 loader = PyPDFLoader(
@@ -257,9 +275,7 @@ def _get_loader(self, filename: str, file_content_type: str, file_path: str):
                 loader = UnstructuredPowerPointLoader(file_path)
             elif file_ext == "msg":
                 loader = OutlookMessageLoader(file_path)
-            elif file_ext in known_source_ext or (
-                file_content_type and file_content_type.find("text/") >= 0
-            ):
+            elif self._is_text_file(file_ext, file_content_type):
                 loader = TextLoader(file_path, autodetect_encoding=True)
             else:
                 loader = TextLoader(file_path, autodetect_encoding=True)
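
The refactor in this file factors the "is this plain text?" test into one helper and applies it to the docling branch as well, so text files skip the external server there too. The following is a minimal standalone sketch of that decision for the tika and docling branches only. `KNOWN_SOURCE_EXT` is a hypothetical subset of the real `known_source_ext` list (not shown in this diff), `pick_loader` is a hypothetical name, loader classes are represented by their names as strings, and the server-URL checks are left out.

```python
from typing import Optional

# Hypothetical subset; the real known_source_ext list is not shown in this diff.
KNOWN_SOURCE_EXT = {"py", "md", "json", "txt"}


def is_text_file(file_ext: str, content_type: Optional[str]) -> bool:
    """Same test as _is_text_file: known extension or a text/* content type."""
    return file_ext in KNOWN_SOURCE_EXT or bool(
        content_type and "text/" in content_type
    )


def pick_loader(engine: str, filename: str, content_type: Optional[str]) -> str:
    """Return the name of the loader the tika/docling branches would choose."""
    if engine not in ("tika", "docling"):
        raise ValueError("this sketch only covers the tika and docling branches")
    file_ext = filename.split(".")[-1].lower()
    if is_text_file(file_ext, content_type):
        # Plain text is read locally instead of being sent to the server.
        return "TextLoader"
    return "TikaLoader" if engine == "tika" else "DoclingLoader"


# Hypothetical file names, for illustration only.
print(pick_loader("docling", "notes.md", None))
print(pick_loader("docling", "scan.pdf", "application/pdf"))
```

Before this change the docling branch sent every file to `DoclingLoader`, which is the behavior the changelog entry on docling text file handling describes as broken for some plain text files.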