Training Datasets

The training panel of an AI Agent is where you manage the agent's datasets. It is currently available only for agents of the QA type.

When a user asks the QA Agent a question, the agent retrieves relevant knowledge from these datasets, then analyzes and reorganizes the information before responding.

Illustration of the agent training panel

What is a dataset?

A dataset refers to a collection of structured or unstructured data that is organized and stored together. It can include various types of information such as text, numbers, images, or audio files.

At aitable.ai, the primary data source that serves as a dataset for the QA Agent is a datasheet in the space station. When creating the QA Agent, you can bind a specific datasheet; during training, the agent reads all the data from the records in that datasheet.

In addition, attachments stored in the datasheet's attachment columns can be used as a data source for training the agent. Currently, only files in the ".pdf", ".md", or ".docx" formats can be used for training.

Similarly, the AI Agent can use the URLs stored in the "URL" column: it crawls the content of the linked web pages and uses it as a data source for training.

When should you retrain the QA Agent?

The QA Agent may need to be retrained to update or improve its knowledge. Here are some examples of when retraining may be necessary:

Adding or updating datasets: If you have added new datasets or modified existing ones, retraining lets the QA Agent use the new data to give more accurate and comprehensive answers.

Improving performance: If the QA Agent frequently makes errors or gives inaccurate answers to user questions, retraining can help it learn better answers and improve its accuracy.

How to change the dataset?

  1. Click on the "training" button located in the top right corner of the page. This will take you to the Training interface.

    Illustration of the agent training panel

  2. On the Training interface, click on the "Change" button. This will open the dataset selection modal.

    Illustration of how to change the dataset

  3. Click the "Save and Train" button to start the training process.

    Illustration of the "Save and Train" button

Tips

If your datasheet contains attachment fields or URL fields, the content of these fields is also included in training.

Please note that only files in the ".pdf", ".md", or ".docx" formats are accepted for training. Files in any other format are filtered out and not used for training.
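To make the filtering rule concrete, here is a minimal Python sketch of how a list of attachment file names could be split into accepted and skipped files. It is an illustration only, not the product's actual implementation: the three extensions come from the text above, while the function names and the case-insensitive matching are assumptions.

```python
from pathlib import PurePath

# The three extensions are taken from the tip above. Matching on the last
# suffix, case-insensitively, is an assumption made for this sketch.
ACCEPTED_EXTENSIONS = {".pdf", ".md", ".docx"}


def is_trainable(filename: str) -> bool:
    """Return True if the file name ends in one of the accepted extensions."""
    return PurePath(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def split_attachments(filenames):
    """Partition file names into (accepted, skipped) lists, preserving order."""
    accepted, skipped = [], []
    for name in filenames:
        (accepted if is_trainable(name) else skipped).append(name)
    return accepted, skipped


# Hypothetical attachment names, used only to exercise the functions.
accepted, skipped = split_attachments(
    ["faq.pdf", "notes.MD", "logo.png", "guide.docx", "data.xlsx"]
)
```

Running a check like this over a datasheet's attachments before training can show in advance which files would be ignored.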

How to check the training status?

Below the training panel, you can access all the past training logs. These logs provide information about the status of the training, resource consumption, and the data sources used.

Illustration of the training history

To check the training status, follow these steps:

  1. Open the training panel and scroll down to the "Training History" section.
  2. The training history lists one record for each training run that has been started.
  3. Click on any record to view the detailed training log.
  4. In the detailed training log, you will find the following information:
    • Training start time
    • Training status
    • Credit cost during training
    • Information about the data sources used
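If you copy these details out of the training log for your own tracking, the four fields listed above can be modeled as a small record. The Python sketch below is hypothetical: the class name, field names, status values, and sample data are assumptions for illustration and do not reflect an actual aitable.ai API or export format.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TrainingLog:
    """Hypothetical shape of one training history record."""
    started_at: datetime                  # training start time
    status: str                           # assumed values, e.g. "success" or "failed"
    credit_cost: int                      # credits consumed by this run
    data_sources: list[str] = field(default_factory=list)


def summarize(logs):
    """Return (total credits spent, most recent log or None if there are no logs)."""
    total = sum(log.credit_cost for log in logs)
    latest = max(logs, key=lambda log: log.started_at, default=None)
    return total, latest


# Made-up sample records, only to show how the helper behaves.
logs = [
    TrainingLog(datetime(2024, 1, 5, 9, 0), "success", 10, ["FAQ datasheet"]),
    TrainingLog(datetime(2024, 2, 1, 14, 30), "failed", 5, ["FAQ datasheet", "guide.pdf"]),
]
total_credits, latest = summarize(logs)
```

Keeping records in a structure like this makes it easy to total credit usage across runs or to spot the most recent failed training.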

By reviewing the training history, you can see the status, credit cost, and data sources of each training run, which helps you manage and optimize your training process.