Studio: Knowledge Base

Updated: March 25, 2026, 12:57

Index
• Introduction
• How to access the Knowledge Base
• Catalog
• Configure the catalog
• Configuring the Knowledge Base in the Tools tab
• Power up your Knowledge Base with Preprocessing
• Best Practices
• Blip Academy

Introduction

The Knowledge Base is a feature that centralizes documents, links, and articles so that the virtual agent can quickly access reliable information during interactions with users. It lets you organize content into catalogs, import files, link URLs, and keep information up to date.

How to access the Knowledge Base

The Knowledge Base home screen gives access to the catalog management area. To open it:

1. In the contract panel, locate the Knowledge Base card.
2. Click Import base.

Catalog

A catalog groups information within the Knowledge Base. It centralizes files and URLs about the same topic so that the virtual agent can use this data to answer users accurately.

How to create a catalog

1. In the Knowledge Base, click Create catalog. This screen is displayed only on first access.
2. In the window that opens, type the name of the catalog.
3. Click Save to finish or Cancel to exit.

Add content to the catalog

After you create a catalog, the Configure catalog screen is displayed. There you can add the files and URLs that the AI agent will use to respond to users.

Steps

1. Click Add content.
2. Choose between:
   • Import file: send a document from your computer.
   • Add URL: link a public web page.

Import file

Allows sending documents directly from your computer to the catalog.

Steps

1. Select the Import file option. Accepted formats: XLSX, CSV, JSON, PDF, TXT, DOCX, PPTX, MD.
2. Drag and drop the document or click to select it.
3. Click Save to finish or Cancel to go back.

Important points

• Added files serve as the basis for the agent's responses.
• You can combine different file types in the same catalog. However, each upload, even one that contains duplicate documents, is added to the agent's history.
• Files cannot be edited. To modify content, you must delete the old file or replace it with a new version.
• The agent does not interpret content in images, so make sure that all relevant information is in the text.
• For the agent to send media (images, audio, videos, PDFs, or links) as a response, the Knowledge Base must follow a specific template, available in File template examples. Use the AI Agent example knowledge base file as a reference to structure the content.
• All links to media must be public so that the agent can access them and deliver them correctly to the user.

| Content type | File type | Recommendations | Restrictions |
| --- | --- | --- | --- |
| FAQ | XLSX and CSV | • Keeping each question and its answer in the same cell or row increases the agent's efficiency. | A column named "text" (in lowercase letters) is mandatory for the file to be interpreted. Otherwise, an error is shown. |
| FAQ | DOCX and PDF | • Keep questions and answers close together, on the same line or on different but sequential lines. • Avoid images in the file, or ensure that all relevant information is in textual elements. | DOCX: up to 1 million characters. PDF: up to 250 thousand characters (on average 120 to 150 pages). |
| FAQ | TXT and MD | • Keep questions and answers on the same line. • If that is not possible, keep them as close as possible, that is, on different but sequential lines. | TXT: up to 500 KB, up to 500 thousand characters. |
| Other contents (manuals, reports, or guides that are not FAQs) | PDF and DOCX | • Remove tables of contents, if any exist. • Avoid images in the file, or ensure that all relevant information is in textual elements. • If the file is very long, split it into more than one file to facilitate upload and data maintenance. | DOCX: up to 1 million characters. PDF: up to 250 thousand characters (on average 120 to 150 pages). |
| Other contents (manuals, reports, or guides that are not FAQs) | TXT and MD | • Remove tables of contents, if any exist. | TXT: up to 500 KB, up to 500 thousand characters. |
| Other contents (manuals, reports, or guides that are not FAQs) | PPTX | • Verify that the content relevant to the agent is in the text of the slides. | Up to 30 MB. |
| Other contents (manuals, reports, or guides that are not FAQs) | JSON | • Recommended for isolated paragraphs and independent pieces of information of up to 700 characters. • The field "text" must exist in this format. | |

File template examples

| Format | Template |
| --- | --- |
| XLSX | Base de conhecimento exemplo - Agente de IA |
| CSV | modelo_csv_FAQ_Notificações_Ativas |
| TXT | modeo_txt_FAQ_Notificações_Ativas |
| PDF | modelo_pdf_Notificações_Ativas.pdf |
| DOC/DOCX | modelo_docs_Notificações_Ativas |
| MD | modelo_md_Notificações_Ativas |
| PPT/PPTX | modelo_pptx_Boas práticas_Templates_WhatsApp.pptx |

After adding a URL or file for the first time, you are directed to the Configure catalog screen, where you can manage all files and URLs linked to that catalog. On this screen, you can also include new items by clicking the Add content button.

Note: The file import process follows the same procedure described in the Import file topic.

Add URL

Links a public web page to the catalog. The page must be accessible to the agent without authentication.

Steps:

1. Click Add content and select Add URL.
2. Enter the full address (including https://).
3. Click Save or Cancel.

Configure the catalog

This screen lets you view and configure all items (files and URLs) of a catalog. For each item, it displays the name and type, the date and author of the last update, and the synchronization status (Active, Inactive, or Synchronizing).

Steps:

1. On the Knowledge Base screen, locate the desired catalog.
2. Click the options menu (three-dot icon) and select Open.
When the catalog opens, the complete list of files and URLs is displayed, allowing you to check details and use the actions menu.

Actions menu (three-dot icon) available for each item:

• Open: Displays the content of the selected item.
• Download: Downloads the file in its original format to your device.
• Update: Replaces the existing content with a new version.
• Restore: Reverts the content to the most recent previous version saved in Blip.
• Activate/Deactivate: Defines whether the content will be available for use by the AI agent.
• Delete: Permanently removes the content from the catalog.

Manage linked URLs

Displays all pages associated with a public URL registered in the catalog. Only URLs accessible without authentication or blocks can be synchronized correctly.

Steps:

1. In the URL type content, click Open.
2. Enable or disable the synchronization of each page.
3. Delete pages when necessary.
4. Use the search to locate specific pages.

Configuring the Knowledge Base in the Tools tab

The Tools tab allows your agent to interact with external resources and consult specific information to enrich its responses. Currently, the knowledge base is connected to the agent as a tool, allowing the agent to access multiple contexts in an organized way.

How to add a Knowledge Base

When you click Add tool, a list of options is displayed. In the Search submenu, select the Knowledge Base option.

Unlike previous versions, your agent can now have one or several tools of this type. This is useful for separating different subjects (e.g., one base for "Technical FAQ" and another for "HR Policies"), allowing for more precise referencing in the agent's instructions.

Main Settings

For each knowledge base tool, define the following fields:

• Name: A unique name to help identify the tool.
• Description: This field is fundamental. This is where you explain to the agent when it should consult this base and what it will find there.
Example: "Use this tool to answer questions about prices, plans, and payment methods."

Selecting Contents (Catalogs)

For the tool to have information to consult, it needs catalogs. Catalogs centralize your files and URLs.

1. Click Add catalog.
2. In the window that opens, select existing catalogs or click Create catalog to be redirected to the content management area.

Practical Tip: You can select an entire catalog or just specific contents within it by checking the checkboxes. This gives you total control over what each tool can "read."

Optimization and Query Definitions

To ensure the agent responds quickly and economically, the platform uses RAG (Retrieval-Augmented Generation) technology. This means that instead of reading all documents at once, the agent searches only for the snippets most relevant to the user's context.

In the Query Definitions section, you can adjust:

• Chunks returned: Defines how many "pieces" of information the agent receives per query. Recommendation: The default value is 3. Adding more snippets can help the agent respond better, but it consumes more tokens and may exceed the model's context window.

Power up your Knowledge Base with Preprocessing

Preprocessing optimizes your knowledge base documents before they are made available to Studio agents. This process improves information quality, search accuracy, and the efficiency of generated responses. We offer three types of preprocessing that can be enabled according to your needs.

1. Optimization (Noise Cleaning)

What is it?

Optimization is an automatic cleaning process that removes "noise" and unnecessary formatting from your documents' text. The goal is to standardize the content, ensuring that the AI focuses only on relevant information.

How does it work?

This processor applies a set of rules to refine the text.
Based on its default configuration, it performs the following actions on each segment of the document:

• Unicode Correction: Repairs characters that were corrupted or poorly encoded.
• HTML Removal: Eliminates all HTML tags (e.g., <div>, <p>, <span>) that may be present in documents extracted from the web.
• Space Normalization: Removes multiple spaces, tabs, and excessive line breaks, replacing them with a single space or line break, respectively.

Practical Example:

Original Text: <p>The meeting will be on Tuesday. Check the main topic.</p>
Optimized Text: The meeting will be on Tuesday. Check the main topic.

2. Indexing (Summaries and Tags with AI)

What is it?

Indexing uses an AI Agent to enrich each segment (chunk) of your document with a concise summary and relevant keywords. This creates a "semantic index" that improves the system's ability to find the exact information the user is looking for.

How does it work?

An AI agent specialized in knowledge synthesis analyzes each piece of text and generates:

• Summary: A short summary (2-3 sentences) explaining the specific topic of that segment.
• Keywords: A list of 5 to 8 essential terms, entities, or technical jargon found in the text.

The process is performed in the same language as the original document to maintain consistency.

Practical Example:

Chunk Text: "Article 14 of the service contract stipulates that the contracted party must notify the contracting party 30 days in advance of any scheduled interruption. Failure to comply with this clause will result in financial penalties, as detailed in Appendix B."

Indexing Result:

• Summary: "This segment details the Article 14 clause regarding the mandatory 30-day prior notice for service interruptions. Failure to provide notice results in financial fines."
• Keywords: "service contract, Article 14, prior notice, scheduled interruption, clause, financial penalties, Appendix B"

When to use?
In documents that function as a collection of independent information, where each segment has value on its own.

Ideal for:

• Question and Answer Bases (FAQ): Where each question/answer pair is a "fact" that needs to be found in isolation.
• Tabular or split content: Documents where information is already segmented into blocks, such as a spreadsheet with product descriptions or a list of internal policies.
• Articles or blog posts: Where each paragraph or section addresses a specific sub-topic that can be summarized to facilitate searching.

Main benefit

Creates a rich "index" that allows the search engine to find specific segments with high precision, even if the user's search uses synonyms or related terms.

Cost

Consumes AI tokens according to the size of the document.

3. Contextualization (Global Semantic Reading with AI)

What is it?

Contextualization is the most advanced preprocessing. It uses an AI Agent to perform a "global semantic reading" of the entire document. For each text segment, the AI describes where it fits into the overall context of the document, acting as a "GPS" for the search engine.

How does it work?

Unlike Indexing, which focuses on the content of the segment, Contextualization focuses on its location and purpose. The AI agent reads the entire document to understand its structure (chapters, sections, flow of ideas) and then generates a short sentence (15-25 words) for each segment, describing its contextual role.

Practical Example:

Document: "Company IT Security Manual"

Chunk Text: "All employees must use passwords with at least 12 characters, including uppercase letters, lowercase letters, numbers, and symbols."

Contextualization Result:

• Generated Context: "'Password Policy' section of the manual, specifying complexity requirements for employee access credentials."

When to use?

In long, continuous, and structured documents where the position of information within the whole is crucial to its meaning.
Ideal for:

• Technical manuals and user guides: Where it is important to know if a segment is in the "Installation," "Troubleshooting," or "Advanced Settings" section.
• Contracts and legal documents: Where the context of a clause (e.g., "Termination Clause," "Penalty Appendix") is fundamental.
• Scientific articles and research reports: Where the structure (Introduction, Methodology, Results, Conclusion) gives meaning to each part of the text.

Main benefit

Acts as a "GPS" for the search, informing the system not only what is in the segment, but where it fits into the document's information flow.

Cost

Consumes AI tokens according to the size of the document.

Comparative Table

| Option | Ideal for... | Main Benefit | AI Cost |
| --- | --- | --- | --- |
| Optimization | All file types, especially "dirty" ones. | Ensures the quality and consistency of the base text. | No |
| Indexing | Segmented content (FAQs, tables, short articles). | Makes each segment "findable" by its specific content. | Yes |
| Contextualization | Long and structured documents (manuals, contracts). | Locates information within the general structure of the document. | Yes |

Golden Rule

• If your documents are a collection of facts, where each can be read independently, choose Indexing.
• If your documents tell a story or follow a logical structure, where context is king, choose Contextualization.
• Optimization is like tidying up the house before decorating: use it whenever possible.

How to use Preprocessing in Studio

When you click "Import files" within the catalog, a side menu opens. In this side menu, you can upload multiple files and choose which preprocessing to apply to each of them. Select the checkbox for the desired preprocessing and then click Save.

The bases will be uploaded and preprocessed. Processing time may vary depending on the size of the base and the selected settings.

Best Practices

• Keep content up to date.
• Use clear and standardized names to facilitate searches.
• Review URLs periodically to avoid broken links.
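The format restrictions listed in the file type table earlier in this article (the mandatory lowercase "text" column in CSV files, the character limits for DOCX, PDF, and TXT, and the 700-character recommendation for JSON entries) can be checked locally before importing. The following is a minimal Python sketch of such a check, not a Blip tool: the function names are hypothetical, the text is assumed to be already extracted from the file, and the JSON layout (an array of objects) is an assumption, since the article only states that a "text" field must exist.

```python
import csv
import io
import json

# Character limits quoted in this article's file type table.
CHAR_LIMITS = {"docx": 1_000_000, "pdf": 250_000, "txt": 500_000}
# Recommended maximum length of one JSON entry, per the same table.
JSON_RECOMMENDED_CHARS = 700


def has_text_column(csv_content: str) -> bool:
    """True if the CSV header has a column named exactly 'text' (lowercase)."""
    header = next(csv.reader(io.StringIO(csv_content)), [])
    return "text" in header


def within_char_limit(file_type: str, extracted_text: str) -> bool:
    """True if the text fits the limit for its type, or if no limit is listed."""
    limit = CHAR_LIMITS.get(file_type.lower())
    return limit is None or len(extracted_text) <= limit


def json_problems(json_content: str) -> list[str]:
    """List problems in a JSON file, ASSUMING it is an array of objects."""
    problems = []
    for index, entry in enumerate(json.loads(json_content)):
        if not isinstance(entry, dict) or "text" not in entry:
            problems.append(f"entry {index}: missing 'text' field")
        elif len(str(entry["text"])) > JSON_RECOMMENDED_CHARS:
            problems.append(
                f"entry {index}: longer than {JSON_RECOMMENDED_CHARS} characters"
            )
    return problems


if __name__ == "__main__":
    print(has_text_column("text,category\nHow do I reset my password?,faq\n"))
    print(within_char_limit("pdf", "a" * 250_001))
    print(json_problems('[{"text": "Opening hours"}, {"title": "no text"}]'))
```

The column check is deliberately case-sensitive, because the article requires the name in lowercase letters. File types with no character limit listed in the table (for example MD) pass `within_char_limit` unchanged.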
Blip Academy

Want to learn how Blip Studio works and how to use it? Access Blip Academy and learn for free.

For more information, visit the discussion on the subject at our community or videos on our channel. 😃