Studio: Knowledge Base – Blip | Blip Help

Introduction

The Knowledge Base is a feature that centralizes documents, links, and articles so that the virtual agent can quickly access reliable information during interactions with users.

It allows organizing content into catalogs, importing files, linking URLs, and keeping information always up to date.

How to access the Knowledge Base

This is the home screen that allows access to the catalog management area of the Knowledge Base. To access it, follow the path:

In the contract panel, locate the Knowledge Base card.
Click on Import base.

Catalog

A catalog is a grouping of information within the Knowledge Base. It centralizes files and URLs about the same topic, allowing the virtual agent to use this data to respond to users with precision.

How to create a catalog

When accessing the Knowledge Base, click on Create catalog.

This screen is displayed only on the first access.

In the window displayed, type the name of the catalog. Click Save to complete or Cancel to exit.

Add content to the catalog

After creating a catalog, the Configure catalog screen will be displayed. In it, you can add files and URLs that the AI agent will use to respond to users.

Steps

Click on Add content.
Choose between:
- Import file – send a document from your computer.
- Add URL – link a public web page.

Import file

Allows sending documents directly from your computer to the catalog.

Steps

Select the Import file option. Accepted formats: XLSX, CSV, JSON, PDF, TXT, DOCX, PPTX, MD.
Drag and drop the document or click to select.
Click Save to complete or Cancel to go back.

Important points

Added files serve as a basis for the agent's responses.
Different file types can be combined in the same catalog; however, every upload is added to the agent's history, even if it contains duplicate documents.
Files cannot be edited: to modify content, delete the old file or upload a new version.
The agent does not interpret content in images, so ensure that all relevant information is in the text.
For the agent to send media (images, audio, videos, PDFs, or links) as a response, the Knowledge Base must follow a specific template, available below in File template examples. Use the AI Agent example knowledge base file as a reference to structure the content.
All links to media must be public, ensuring that the agent can access them and deliver them correctly to the user.

Content type	File type	Recommendations	Restrictions
FAQ	XLSX and CSV	• Keep each question and its answer in the same cell or row to increase the agent's efficiency.	A column named “text” (in lowercase letters) is mandatory for the file to be interpreted; otherwise, an error is shown.
FAQ	DOCX and PDF	• Keep questions and answers close together, on the same line or on different but sequential lines. • Avoid images in the file, or ensure that all relevant information is in textual elements.	DOCX: up to 1 million characters; PDF: up to 250 thousand characters (120 to 150 pages on average)
FAQ	TXT and MD	• Keep questions and answers on the same line. • If that is not possible, keep them as close as possible, that is, on different but sequential lines.	TXT: up to 500 KB; up to 500 thousand characters
Other contents (manuals, reports, or guides that are not FAQs)	PDF and DOCX	• Remove tables of contents, if any exist. • Avoid images in the file, or ensure that all relevant information is in textual elements. • If the file is very long, split it into more than one file to facilitate upload and data maintenance.	DOCX: up to 1 million characters; PDF: up to 250 thousand characters (120 to 150 pages on average)
Other contents (manuals, reports, or guides that are not FAQs)	TXT and MD	• Remove tables of contents, if any exist.	TXT: up to 500 KB; up to 500 thousand characters
Other contents (manuals, reports, or guides that are not FAQs)	PPTX	• Verify that the content relevant to the agent is contained in the textual information of the slides.	Up to 30 MB
Other contents (manuals, reports, or guides that are not FAQs)	JSON	• Recommended for inserting isolated paragraphs and independent information with up to 700 characters. • A “text” field must exist in files of this format.
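
The two format rules in the table that are easiest to get wrong (the lowercase “text” column in CSV files and the “text” field in JSON files) can be checked before uploading. The following is a minimal Python sketch, not an official Blip tool. The function names are hypothetical, and the JSON layout (an array of objects, each with a “text” field) is an assumption, since this article does not specify the exact structure the platform expects.

```python
import csv
import io
import json

# Recommended upper bound per JSON entry, taken from the table above.
MAX_JSON_TEXT_CHARS = 700


def has_text_column(csv_content: str) -> bool:
    """Return True if the CSV header has a column named exactly 'text'."""
    reader = csv.reader(io.StringIO(csv_content))
    header = next(reader, [])
    # The comparison is case-sensitive: "Text" or "TEXT" does not qualify.
    return "text" in header


def build_json_entries(paragraphs: list[str]) -> str:
    """Serialize independent paragraphs as objects with a 'text' field.

    Assumption: the file is a JSON array of {"text": ...} objects.
    """
    entries = []
    for paragraph in paragraphs:
        cleaned = paragraph.strip()
        if len(cleaned) > MAX_JSON_TEXT_CHARS:
            raise ValueError(
                f"Paragraph has {len(cleaned)} characters; "
                f"the recommended limit is {MAX_JSON_TEXT_CHARS}."
            )
        entries.append({"text": cleaned})
    return json.dumps(entries, ensure_ascii=False, indent=2)
```

Running has_text_column on the file contents before importing helps avoid the interpretation error mentioned in the table, and build_json_entries rejects paragraphs longer than the recommended 700 characters.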

File template examples

Format	Template
XLSX	Base de conhecimento exemplo - Agente de IA
CSV	modelo_csv_FAQ_Notificações_Ativas
TXT	modeo_txt_FAQ_Notificações_Ativas
PDF	modelo_pdf_Notificações_Ativas.pdf
DOC/DOCX	modelo_docs_Notificações_Ativas
MD	modelo_md_Notificações_Ativas
PPT/PPTX	modelo_pptx_Boas práticas_Templates_WhatsApp.pptx

After adding a URL or file for the first time, you will be directed to the Configure Catalog screen, where you can manage all files and URLs linked to that catalog. On this screen, it is also possible to include new items by clicking the Add Content button.

Note: The file import process follows the same procedure described in the Import File topic.

Add URL

Allows linking a public web page to the catalog, ensuring the content is accessible to the agent without authentication.

Steps:

Click on Add content and select Add URL;
Enter the full address (including https://);
Click Save or Cancel.
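
Before pasting an address, it can help to confirm that it is a full address. The sketch below uses Python's standard urllib.parse module; the function name is hypothetical. It accepts only https, following the example in the step above; whether the platform also accepts http is not stated in this article. It checks the format only and does not verify that the page is public.

```python
from urllib.parse import urlparse


def is_full_https_address(address: str) -> bool:
    """Check that an address includes the https:// scheme and a host name."""
    parsed = urlparse(address.strip())
    # Assumption: only https is treated as valid in this sketch.
    return parsed.scheme == "https" and bool(parsed.netloc)
```

For example, "https://help.example.com/faq" passes, while "help.example.com/faq" fails because the scheme is missing.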

Configure the catalog

This screen lets you view and configure all items (files and URLs) of a catalog. For each item, it displays the name and type, the date and author of the last update, and the synchronization status (Active, Inactive, or Synchronizing).

Steps:

On the Knowledge Base screen, locate the desired catalog.
Click on the options menu (three-dot icon) and select Open.

When opening the catalog, the complete listing of files and URLs will be displayed, allowing you to check details and use the actions menu.

Actions menu (three-dot icon) available for each item:

Open: Displays the content of the selected item;
Download: Downloads the file in its original format to your device;
Update: Replaces the existing content with a new version;
Restore: Reverts the content to the most recent previous version saved in Blip;
Activate/Deactivate: Defines whether the content will be available for use by the AI agent;
Delete: Permanently removes the content from the catalog.

Manage linked URLs

Displays all pages associated with a public URL registered in the catalog. Only URLs accessible without authentication or blocks can be synchronized correctly.

Steps:

On a URL-type content item, click Open.
Enable or disable the synchronization of each page.
Delete pages when necessary.
Use the search to locate specific pages.

Configuring the Knowledge Base in the Tools tab

The Tools tab allows your agent to interact with external resources and consult specific information to enrich its responses. Currently, the knowledge base is connected to the agent as a tool, allowing the agent to access multiple contexts in an organized way.

How to add a Knowledge Base

When clicking Add tool, a list of options will be displayed. In the Search submenu, select the Knowledge Base option.

Unlike previous versions, your agent can now have one or several tools of this type. This is useful for separating different subjects (e.g., one base for "Technical FAQ" and another for "HR Policies"), allowing for more precise referencing in the agent's instructions.

Main Settings

For each knowledge base tool, it will be necessary to define the following fields:

Name: A unique name to help identify the tool.
Description: This field is fundamental. This is where you will explain to the agent when it should consult this base and what it will find there.
- Example: "Use this tool to answer questions about prices, plans, and payment methods."

Selecting Contents (Catalogs)

For the tool to have information to consult, it needs Catalogs. Catalogs centralize your files and URLs.

Click on Add catalog.
In the window that opens, you can select existing catalogs or click Create catalog to be redirected to the content management area.

Practical Tip: You can select an entire catalog or just specific contents within it by checking the checkboxes. This gives you total control over what each tool can "read."

Optimization and Query Definitions

To ensure the agent responds quickly and economically, the platform uses RAG (Retrieval Augmented Generation) technology. This means that instead of reading all documents at once, the agent searches only for the most relevant snippets for the user's context.

In the Query Definitions section, you can adjust:

Chunks returned: Defines how many "pieces" of information the agent receives per query.
Recommendation: The default value is 3. Adding more snippets can help the agent respond better, but it consumes more tokens and may exceed the model's context window.
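
To illustrate what "Chunks returned" controls, here is a toy Python sketch of the retrieval step. It is not how the platform is implemented: real RAG systems normally rank chunks by semantic similarity, while this sketch uses simple word overlap as a stand-in. The function names and the chunks_returned parameter are hypothetical; the default of 3 mirrors the default mentioned above.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, chunks: list[str], chunks_returned: int = 3) -> list[str]:
    """Return up to `chunks_returned` chunks that share words with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(chunk)), chunk) for chunk in chunks]
    # Highest score first; the sort is stable, so ties keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Chunks with no words in common with the query are never returned.
    return [chunk for score, chunk in ranked[:chunks_returned] if score > 0]
```

A higher chunks_returned value passes more text to the model on every query, which is why it increases token consumption.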

Power up your Knowledge Base with Preprocessing

Preprocessing optimizes your knowledge base documents before they are made available to Studio agents. This process improves information quality, search accuracy, and the efficiency of generated responses. We offer three types of preprocessing that can be enabled according to your needs.

1. Optimization (Noise Cleaning)

What is it?

Optimization is an automatic cleaning process that removes "noise" and unnecessary formatting from your documents' text. The goal is to standardize the content, ensuring that the AI focuses only on relevant information.

How does it work?

This processor applies a set of rules to refine the text. Based on its default configuration, it performs the following actions on each segment of the document:

Unicode Correction: Repairs characters that were corrupted or poorly encoded.
HTML Removal: Eliminates all HTML tags (e.g., <div>, <p>, <span>) that may be present in documents extracted from the web.
Space Normalization: Removes multiple spaces, tabs, and excessive line breaks, replacing them with a single space or line break, respectively.

Practical Example:

Original Text:

<p>The meeting will be on Tuesday.

Check the main topic.</p>

Optimized Text:

The meeting will be on Tuesday. Check the main topic.
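
The three cleaning rules can be approximated with Python's standard library. This is only a sketch under stated assumptions, not the platform's actual processor: the function name is hypothetical, Unicode NFC normalization stands in for "Unicode Correction" (repairing genuinely corrupted encodings requires a dedicated library), and tags are removed with a simple regular expression.

```python
import re
import unicodedata


def optimize(segment: str) -> str:
    """Apply a simplified version of the three cleaning rules to one segment."""
    # Stand-in for Unicode correction: normalize to a consistent composed form.
    text = unicodedata.normalize("NFC", segment)
    # HTML removal: drop anything that looks like a tag, e.g. <p> or </div>.
    text = re.sub(r"<[^>]+>", "", text)
    # Space normalization: runs of spaces and tabs become a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Runs of line breaks (and whitespace around them) become one line break.
    text = re.sub(r"\s*\n\s*", "\n", text)
    return text.strip()
```

In this sketch, consecutive line breaks are collapsed into a single line break rather than into a space.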

2. Indexing (Summaries and Tags with AI)

What is it?

Indexing uses an AI Agent to enrich each segment (chunk) of your document with a concise summary and relevant keywords. This creates a "semantic index" that drastically improves the system's ability to find the exact information the user is looking for.

How does it work?

An AI agent specialized in knowledge synthesis analyzes each piece of text and generates:

Summary: A short summary (2-3 sentences) explaining the specific topic of that segment.
Keywords: A list of 5 to 8 essential terms, entities, or technical jargon found in the text.

The process is performed in the same language as the original document to maintain consistency.

Practical Example:

Chunk Text:

"Article 14 of the service contract stipulates that the contracted party must notify the contracting party 30 days in advance of any scheduled interruption. Failure to comply with this clause will result in financial penalties, as detailed in Appendix B."
Indexing Result:
- Summary: "This segment details the Article 14 clause regarding the mandatory 30-day prior notice for service interruptions. Failure to provide notice results in financial fines."
- Keywords: "service contract, Article 14, prior notice, scheduled interruption, clause, financial penalties, Appendix B"
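
One way to picture the result of Indexing is as a record that stores the summary and keywords next to the original chunk. The Python sketch below is an assumption for illustration only: the class and function names are hypothetical, the summary and keywords are filled in by hand rather than generated by an AI agent, and the article does not describe how the platform stores or searches this data.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedChunk:
    """A chunk enriched with a summary and keywords (hypothetical structure)."""

    text: str
    summary: str
    keywords: list[str] = field(default_factory=list)

    def searchable_text(self) -> str:
        """Combine the chunk with its summary and keywords for searching."""
        return "\n".join([self.text, self.summary, ", ".join(self.keywords)])


def matches(chunk: IndexedChunk, term: str) -> bool:
    """Case-insensitive check of a term against the enriched text."""
    return term.lower() in chunk.searchable_text().lower()


example = IndexedChunk(
    text="The contracted party must notify the contracting party 30 days "
    "in advance of any scheduled interruption.",
    summary="Mandatory 30-day prior notice for service interruptions.",
    keywords=["service contract", "Article 14", "prior notice"],
)
```

Here the term "prior notice" does not appear in the chunk text itself, but matches(example, "prior notice") succeeds because the summary and keywords contain it. This is the effect the semantic index aims for.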

When to use?

In documents that function as a collection of independent information, where each segment has value on its own. Ideal for:

Question and Answer Bases (FAQ): Where each question/answer pair is a "fact" that needs to be found in isolation.
Tabular or split content: Documents where information is already segmented into blocks, such as a spreadsheet with product descriptions or a list of internal policies.
Articles or blog posts: Where each paragraph or section addresses a specific sub-topic that can be summarized to facilitate searching.

Main benefit

Creates a rich "index" that allows the search engine to find specific segments with high precision, even if the user's search uses synonyms or related terms.

Cost

Consumes AI tokens according to the size of the document.

3. Contextualization (Global Semantic Reading with AI)

What is it?

Contextualization is the most advanced preprocessing. It uses an AI Agent to perform a "global semantic reading" of the entire document. For each text segment, the AI describes where it fits into the overall context of the document, acting as a "GPS" for the search engine.

How does it work?

Unlike Indexing, which focuses on the content of the segment, Contextualization focuses on its location and purpose. The AI agent reads the entire document to understand its structure (chapters, sections, flow of ideas) and then generates a short sentence (15-25 words) for each segment, describing its contextual role.

Practical Example:

Document: "Company IT Security Manual"
Chunk Text:

"All employees must use passwords with at least 12 characters, including uppercase letters, lowercase letters, numbers, and symbols."
Contextualization Result:
- Generated Context: "'Password Policy' section of the manual, specifying complexity requirements for employee access credentials."
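
A common way to use a generated context is to attach it to the chunk before the chunk is indexed. The Python sketch below assumes that approach purely for illustration; the function name and the "[Context: ...]" prefix format are hypothetical, and the article does not state how the platform combines the two.

```python
def contextualize(chunk_text: str, generated_context: str) -> str:
    """Prefix a chunk with its context sentence before it is indexed."""
    return f"[Context: {generated_context}]\n{chunk_text}"


chunk = "All employees must use passwords with at least 12 characters."
context = (
    "'Password Policy' section of the manual, specifying complexity "
    "requirements for employee access credentials."
)
enriched = contextualize(chunk, context)
```

The phrase "password policy" never appears in the chunk itself, but it does appear in the enriched text, so a search for that phrase can now reach this segment.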

When to use?

In long, continuous, and structured documents where the position of information within the whole is crucial to its meaning. Ideal for:

Technical manuals and user guides: Where it is important to know if a segment is in the "Installation," "Troubleshooting," or "Advanced Settings" section.
Contracts and legal documents: Where the context of a clause (e.g., "Termination Clause," "Penalty Appendix") is fundamental.
Scientific articles and research reports: Where the structure (Introduction, Methodology, Results, Conclusion) gives meaning to each part of the text.

Main benefit

Acts as a "GPS" for the search, informing the system not only what is in the segment, but where it fits into the document's information flow.

Cost

Consumes AI tokens according to the size of the document.

Comparative Table

Option	Ideal for...	Main Benefit	AI Cost
Optimization	All file types, especially "dirty" ones.	Ensures the quality and consistency of the base text.	No
Indexing	Segmented content (FAQs, tables, short articles).	Makes each segment "findable" by its specific content.	Yes
Contextualization	Long and structured documents (manuals, contracts).	Locates information within the general structure of the document.	Yes

Golden Rule

If your documents are a collection of facts, where each can be read independently, choose Indexing.
If your documents tell a story or follow a logical structure, where context is king, choose Contextualization.
Optimization is like tidying up the house before decorating: use it whenever possible.
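
The Golden Rule can be summarized as a small decision helper. This Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes that more than one preprocessing option can be selected for the same file.

```python
def recommend_preprocessing(
    independent_facts: bool, structured_document: bool
) -> list[str]:
    """Suggest preprocessing options following the Golden Rule above."""
    # Optimization has no AI cost and is recommended whenever possible.
    options = ["Optimization"]
    if independent_facts:
        # FAQs, tables, short articles; consumes AI tokens.
        options.append("Indexing")
    if structured_document:
        # Manuals, contracts, reports; consumes AI tokens.
        options.append("Contextualization")
    return options
```

For a FAQ spreadsheet, recommend_preprocessing(True, False) suggests Optimization and Indexing; for a long technical manual, recommend_preprocessing(False, True) suggests Optimization and Contextualization.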

How to use Preprocessing in Studio

When you click "Import files" within the catalog, a side menu opens.

In this side menu, you can upload multiple files and choose which preprocessing to apply to each of them. Select the checkbox for each desired preprocessing and then click Save.

The files will be uploaded and preprocessed. Processing time may vary depending on the size of the files and the selected settings.

Best Practices

Keep content always up to date.
Use clear and standardized names to facilitate searches.
Review URLs periodically to avoid broken links.

Blip Academy

Want to learn how Blip Studio works and how to use it? Access Blip Academy and learn for free.

For more information, visit the discussion on the subject at our community or videos on our channel. 😃
