Introduction
You can find the features discussed in this section under [Tools Kit] -> [Knowledge Base] in the sidebar.
Adding a Knowledge Base
- Click [Add New Knowledge Base] to open a pop-up dialog where you can configure a new knowledge base
- Enter the [Knowledge Base Name] and [Knowledge Base Description]. Note: the [Knowledge Base Description] influences which knowledge base the large model decides to call, so describe your knowledge base as accurately as possible
- For first-time use, you can add new providers under [Model Service Interface]. After adding a provider, return to this page and select an [Embedding Model] to continue. Embedding model names typically contain keywords such as "embedding", "ebd", or "bge".
- Click [Advanced Settings], where you can configure [Chunk Size], [Overlap Size], [Number of Returned Paragraphs], and [Search Weight]
  - Chunk Size: Determines the size of each knowledge base segment, i.e. how many tokens a single result contains when the large model queries the knowledge base
  - Overlap Size: Determines the maximum overlap between two adjacent segments, which prevents a segment boundary from cutting a paragraph in half and damaging its semantics
- Number of Returned Paragraphs: Determines how many results are returned during queries
  - Search Weight: Determines the balance between the two query methods: when set to 0, only [Keyword Search] is used; when set to 1, only [Semantic Search] is used; intermediate values blend the two
- Upload files. Most file formats are supported, except images; image-based (scanned) PDFs cannot be parsed either
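As a rough illustration of how [Chunk Size] and [Overlap Size] interact, the sketch below splits a document into overlapping segments. The function name, default values, and the one-word-equals-one-token simplification are assumptions for illustration, not the tool's actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into segments of at most chunk_size tokens, where each
    segment repeats the last `overlap` tokens of the previous one
    (assumes chunk_size > overlap)."""
    tokens = text.split()           # simplification: one whitespace word = one token
    step = chunk_size - overlap     # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break                   # the last window already reached the end
    return chunks

# 500 toy tokens -> 3 segments; adjacent segments share 40 tokens, so a
# paragraph cut at a boundary still appears whole in one of the two segments.
chunks = chunk_text(" ".join(f"w{i}" for i in range(500)))
```

Larger chunks give the model more context per result but cost more tokens per query; the overlap trades a little duplication for semantic continuity at segment boundaries.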
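[Search Weight] can be understood as a linear blend of the scores from the two retrieval methods. A minimal sketch, assuming hypothetical per-document scores (the tool's real scoring is not shown here):

```python
def hybrid_score(keyword_score: float, semantic_score: float, weight: float) -> float:
    """Blend keyword and semantic retrieval scores.

    weight = 0 -> pure [Keyword Search]; weight = 1 -> pure [Semantic Search];
    intermediate values mix the two proportionally.
    """
    return (1.0 - weight) * keyword_score + weight * semantic_score

# A document that matches keywords well but is semantically weak:
print(hybrid_score(0.8, 0.2, weight=0.0))  # 0.8: only the keyword score counts
print(hybrid_score(0.8, 0.2, weight=1.0))  # 0.2: only the semantic score counts
```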
Knowledge Base Configuration
- This sub-page configures rules shared by all knowledge bases
- Knowledge Base Search Timing: You can choose before thinking, after thinking, or both
  - Before Thinking: Before the large model sees the user input, all enabled knowledge bases are queried, the results are reranked (if enabled), and then all knowledge base results are passed to the large model together with the original question
  - After Thinking: The large model decides which knowledge base to call based on each knowledge base's description, and may skip querying any knowledge base for irrelevant questions
- Both: Includes both previous options
- Enable Reranking Model: Determines whether the reranking model is used. Currently only the rerank interfaces of jina and vllm have been tested, but other providers' rerank interfaces are likely compatible, as they tend to follow a similar format
- Number of Returned Results: Determines how many results the reranking model returns. For example, if 10 knowledge bases are queried automatically before thinking and return 50 results in total, which the reranking model then filters down to 5, that final count of 5 is set by [Number of Returned Results] in [Knowledge Base Configuration]
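The before-thinking retrieval-plus-rerank flow described above can be sketched as follows; `query_kb`, `rerank`, and the toy stand-ins are hypothetical placeholders, not the tool's real API:

```python
from typing import Callable

def retrieve_and_rerank(
    question: str,
    knowledge_bases: list[Callable[[str], list[str]]],
    rerank: Callable[[str, list[str]], list[str]],
    num_returned: int,
) -> list[str]:
    """Query every enabled knowledge base, pool the results, then let the
    reranking model keep only the top `num_returned` passages."""
    pooled = []
    for query_kb in knowledge_bases:
        pooled.extend(query_kb(question))          # e.g. 10 KBs x 5 results = 50
    return rerank(question, pooled)[:num_returned] # e.g. filtered down to 5

# Toy stand-ins: each KB returns 5 passages; "rerank" just sorts lexicographically.
kbs = [lambda q, i=i: [f"kb{i}-passage{j}" for j in range(5)] for i in range(10)]
top = retrieve_and_rerank("question", kbs, lambda q, docs: sorted(docs), num_returned=5)
```

The key point the sketch captures is that per-knowledge-base settings control how many candidates enter the pool, while [Number of Returned Results] caps what survives reranking.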