Introduction
The Unit Tests functionality allows you to validate the accuracy of your conversational flow responses, whether deterministic or based on an AI agent.
The test works by configuring interaction pairs: you define the input message, which can be text or a public URL, and the expected response, which can be text and/or file validation. During execution, the test sends each interaction and stops upon detecting the first failure.
This process ensures the system responds correctly and allows quick adjustments when problems arise, making it an essential tool for verifying expected behavior before deployment to production. The functionality also supports the maintenance and continuous evolution of your Smart Contact.
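As a rough sketch of the idea above (in Python, with illustrative names; this is not the actual Blip format), a unit test is an ordered list of interaction pairs, each combining an input with an expected text and/or file response, and execution stops at the first failure:

```python
# Hypothetical structure for a unit test: each interaction pairs an input
# (text or a public URL) with an expected response (text and/or files).
# All field names here are illustrative, not the real Blip export format.
test = {
    "name": "Opening hours flow",
    "interactions": [
        {"input": "What are the opening hours?",
         "expected_text": ["We are open Monday to Friday, 9am to 6pm."]},
        {"input": "Send me the menu as a document",
         "expected_files": {"type": "document", "count": 1}},
    ],
}

def interaction_passes(step, reply_text, reply_files):
    """Check one interaction: the text must match an expected snippet, and
    the returned files must match the expected type and exact count."""
    if "expected_text" in step and reply_text not in step["expected_text"]:
        return False
    if "expected_files" in step:
        want = step["expected_files"]
        if len(reply_files) != want["count"]:
            return False
        if any(ftype != want["type"] for ftype, _url in reply_files):
            return False
    return True

def run(test, send):
    """Send each interaction in order; stop at the first failure."""
    total = len(test["interactions"])
    for i, step in enumerate(test["interactions"], start=1):
        reply_text, reply_files = send(step["input"])
        if not interaction_passes(step, reply_text, reply_files):
            return ("metric_failure", i, total)
    return ("success", total, total)
```

Here `send` stands in for whatever delivers the message to the bot and returns its reply; the sketch only illustrates the pairing and stop-on-first-failure behavior described above.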
How to access Unit Tests
On the Studio screen:
In the upper right corner of the screen, click the icon.
In the menu that opens, select Unit tests.
The unit test management screen will open.
Managing Unit Tests
The unit test management screen is the starting point for viewing, creating, and running tests for your bot or AI agent.
Create test
Click the Create test button in the upper right corner or, if no tests have been created yet, click the Create new test button in the center of the list. You can fill in all test parameters manually or import a settings file by clicking the button:
Import settings: load a file in Blip format with ready-made tests.
Other resources
Search: Search field to find specific tests by name.
Test list: Displays all created unit tests, with the following information:
Test name: Name defined for the test.
Interactions: Total number of interactions configured for the test.
Last run: Date and time of the last time the test was executed.
Last status: Indicates the result of the last test execution, which can be:
Awaiting execution: The test was created but has not yet been executed.
Success: All test interactions passed successfully (e.g., "Success 18/18").
Metric failure: The test was executed, but some interactions failed (e.g., the message "Metric failure 25/57" indicates that 25 interactions failed out of a total of 57 configured interactions).
Error starting: The test could not be executed due to an error at the start (e.g., "Error starting 2/3").
Interrupted: The test was interrupted during execution.
Delete tests: removes one or more tests.
Run tests: executes one or more selected tests.
Configuring a Unit Test
When creating or editing a test, you will have access to three configuration tabs: Definitions, Variables, and Interactions.
Interactions
This tab is where you define the sequence of questions and answers to validate the behavior of your bot or AI agent.
Order: The order in which the interactions will be executed. You can reorder interactions by dragging the grid icons.
Description: The text input that will be sent to the bot.
Result: The status of the interaction after the test execution, which can be:
Awaiting execution: The interaction has not yet been tested.
Success: The bot's response matched what was expected.
Error starting: The interaction could not be started.
Metric failure: The bot's response did not match what was expected.
Interrupted: The test was interrupted by the user during execution.
Configuring an Interaction
By clicking on an interaction, you can expand the section to configure it in detail.
Input type: Defines the type of input you are sending.
Input Message: The user input can be simple text or a public URL pointing to a file.
Expected Response:
Text Blocks: The expected response can be one or more snippets of text.
For structured formats, such as JSON menus, it is advisable to include the JSON directly, so the system can interpret and compare it as expected.
File Type: Additionally, the response may require the presence of specific files, such as documents, images, audio, or video. The configuration must specify not only the type but also the expected quantity. For example, if the interaction should return two documents, the configuration must reflect this. The test will fail if the response does not exactly match the number and type of expected files.
Text: The text the bot will receive (e.g., "What are the opening hours?").
Textual Comparison Metric:
Similarity:
The similarity metric evaluates how close a generated response is to the expected response in terms of content and structure. It allows for variations while still considering the response valid.
Usage Recommendation:
Ideal for flexible systems, such as intelligent agents, which can generate responses with some variation.
Define the similarity threshold to establish the acceptable degree of variation. For example, on a 0 to 10 scale, a threshold of 6.5 means the response must be at least 65% similar to the expected one.
Exact Match:
This metric requires the generated response to be completely identical to the expected response, without any deviation or variation, including punctuation and special characters.
Usage Recommendation:
Ideal for deterministic systems where precision is crucial.
Guarantees the response is exactly as expected, providing consistency and accuracy.
Formatting differences count: a line break within a block is not the same as a separation into distinct blocks, which indicates messages sent separately.
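To make the two metrics concrete, here is a minimal sketch. `difflib`'s ratio is only a stand-in for whatever similarity measure the platform actually uses, and the 0 to 10 threshold scale follows the 6.5 = 65% example above:

```python
import difflib

def passes_similarity(expected, actual, threshold):
    """Similarity on a 0 to 10 scale: a threshold of 6.5 requires at
    least 65% similarity. difflib.SequenceMatcher is an illustrative
    stand-in for the platform's own metric."""
    score = difflib.SequenceMatcher(None, expected, actual).ratio() * 10
    return score >= threshold

def passes_exact_match(expected_blocks, actual_blocks):
    """Exact match: the blocks must be identical, including punctuation
    and special characters. A line break inside one block is not the
    same as two separate blocks (messages sent separately)."""
    return expected_blocks == actual_blocks
```

With these definitions, small wording variations can still pass the similarity check, while the exact-match check fails on any formatting difference, such as a single block containing a line break versus two separately sent blocks.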
Variables
In this tab, you can manage the context variables that will be used in the test flow. Add, edit, or remove the variables that your bot or AI agent may need to start the flow correctly.
Type: The variable's scope, which can be context or contact.
Name: Variable name (e.g., numbercpf).
Value: Value the variable will have (e.g., 129.452.875-06).
New variable: Adds a new variable.
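As an illustration (the representation below is hypothetical, not the platform's internal format), the entries in this tab can be thought of as scoped name/value pairs that seed the flow's starting state:

```python
# Hypothetical representation of the Variables tab: each entry has a
# scope ("context" or "contact"), a name, and a value.
variables = [
    {"type": "context", "name": "numbercpf", "value": "129.452.875-06"},
]

def build_initial_state(variables):
    """Group variables by scope so the flow can start with them set."""
    state = {"context": {}, "contact": {}}
    for var in variables:
        state[var["type"]][var["name"]] = var["value"]
    return state
```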
Definitions
In this tab, you define the timeout for the test to run.
Response timeout: Use the slider to set the time limit for each interaction of your test. If the bot's response takes longer than the stipulated time, the interaction will be considered a failure.
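The timeout rule can be sketched as follows (a simplified illustration; the function and its `None`-on-failure convention are ours, not the platform's):

```python
import time

def reply_within_timeout(send, message, timeout_seconds):
    """If the bot's reply takes longer than the limit, the interaction
    is treated as a failure (signaled here by returning None)."""
    start = time.monotonic()
    reply = send(message)
    elapsed = time.monotonic() - start
    return reply if elapsed <= timeout_seconds else None
```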
Running and Analyzing Tests
Click Save after configuring the test.
In the test list, select the test and click Run tests.
View the status in the list and click on the test to analyze the result.
Interactions with Metric failure or Error starting indicate points that need adjustments.
For more information, visit the discussion on this topic in our community or watch the videos on our channel. 😃