Setup · Beginner · 16 min read

Knowledge Base Training

A complete guide to uploading documents, configuring sitemap crawling, and setting citation rules to improve your AI agent's accuracy.

Author: Bravin AI Team
Published on January 12, 2024
Updated on March 6, 2024

Introduction

Your AI agent is only as good as the knowledge you provide. This guide covers how to train your agent on your documentation, website content, and business information.

What You Can Upload:

  • PDF documents (product manuals, policies)
  • Word documents (guides, FAQs)
  • Website pages (via sitemap)
  • Text files (policies, scripts)
  • Google Docs (via integration)
  • Spreadsheets (product catalogs)

Free plan: 10 documents, 100 pages total. Pro plan: unlimited documents and pages.
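As a quick pre-upload check, the Free plan limits stated above can be written as a small helper. This is only a sketch: the function name and the idea of passing a list of per-document page counts are illustrative assumptions, not part of the Bravin product.

```python
# Limits taken from the plan description above.
FREE_PLAN_MAX_DOCUMENTS = 10
FREE_PLAN_MAX_PAGES = 100


def fits_free_plan(page_counts: list[int]) -> bool:
    """Return True if a set of documents stays within both Free plan limits.

    page_counts holds one entry per document: its number of pages.
    """
    within_document_limit = len(page_counts) <= FREE_PLAN_MAX_DOCUMENTS
    within_page_limit = sum(page_counts) <= FREE_PLAN_MAX_PAGES
    return within_document_limit and within_page_limit


if __name__ == "__main__":
    print(fits_free_plan([12, 30, 8]))  # three documents, 50 pages in total
    print(fits_free_plan([60, 50]))     # two documents, 110 pages in total
```

Both limits apply at once, so a few long manuals can exhaust the page allowance well before the document count is reached.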

Upload Documents

Step 1: Access Knowledge Base

Go to Agent → Knowledge Base.

Step 2: Upload Files

  1. Click Upload Documents
  2. Select files (or drag & drop)
  3. Add titles and descriptions
  4. Choose visibility (public/private)
  5. Click Process

Processing Time:

  • Small files (< 10 pages): ~30 seconds
  • Large files (100+ pages): 2-5 minutes
  • The system extracts text, indexes content, and creates embeddings

Scanned PDFs contain page images rather than text, so they must have OCR applied before upload. Use a tool such as Adobe Acrobat or an online OCR service.
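The processing steps listed above (extract text, index content, create embeddings) are not documented in detail here. As a rough mental model only, the toy sketch below stores each text chunk with a simple word-count vector and retrieves the closest chunk for a question. Everything in it is an assumption made for illustration: production systems typically use learned vector embeddings, and none of these function names come from Bravin.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Pair every chunk with its vector so it can be searched later."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], query: str) -> str:
    """Return the chunk whose vector is most similar to the query."""
    query_vector = embed(query)
    best_chunk, _ = max(index, key=lambda item: cosine(item[1], query_vector))
    return best_chunk


if __name__ == "__main__":
    index = build_index([
        "We accept returns within 30 days of purchase.",
        "Express shipping is available at checkout.",
    ])
    print(search(index, "Is express shipping available?"))
```

The sketch also shows why a scanned PDF without OCR is a problem: with no extractable text, the vector is empty and the document can never match a question.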

Sitemap Crawling

Crawl Your Website: Sitemap crawling is the easiest way to train your agent on your entire website.

Step 1: Find Your Sitemap

It is usually located at:

  • https://yoursite.com/sitemap.xml
  • https://yoursite.com/sitemap_index.xml
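To see which pages a sitemap exposes before adding it, a sketch like the following can list its entries. It uses only Python's standard library and an inline sample document; the sample XML and the function name `extract_locations` are illustrative, while the namespace is the one defined by the sitemaps.org protocol. A `sitemap_index.xml` file lists further sitemaps rather than pages, which the function reports through the root element name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Inline sample so the sketch runs without network access.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/docs/intro</loc></url>
  <url><loc>https://yoursite.com/faq</loc></url>
</urlset>
""".strip()


def extract_locations(xml_text: str) -> tuple[str, list[str]]:
    """Return the root element name and every <loc> value in a sitemap.

    The root name is "urlset" for a page sitemap and "sitemapindex"
    for an index file that points at other sitemaps.
    """
    root = ET.fromstring(xml_text)
    # Tags are reported as "{namespace}name"; keep only the name part.
    kind = root.tag.rsplit("}", 1)[-1]
    locations = [
        element.text.strip()
        for element in root.findall(".//sm:loc", SITEMAP_NS)
        if element.text
    ]
    return kind, locations


if __name__ == "__main__":
    kind, locations = extract_locations(SAMPLE)
    print(kind)
    for location in locations:
        print(location)
```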

Step 2: Add to Bravin

  1. Go to Knowledge Base → Web Sources
  2. Click Add Sitemap
  3. Enter sitemap URL
  4. Select pages to include/exclude
  5. Set crawl frequency (daily, weekly, monthly)

Step 3: Initial Crawl

  • Click Start Crawl
  • Watch progress in dashboard
  • Typically takes 5-30 minutes depending on site size

Automatic re-crawling keeps your agent updated with your latest content without manual uploads.

Example Sitemap Configuration
{
  "sitemap": "https://yoursite.com/sitemap.xml",
  "include": ["/docs", "/faq", "/support"],
  "exclude": ["/admin", "/checkout", "/account"],
  "maxPages": 500,
  "crawlFrequency": "weekly"
}
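
To preview what a configuration like the one above would select, the sketch below applies it to a list of URLs. The matching rules are assumptions made for illustration, since this guide does not specify them: paths are matched by prefix on whole path segments, `exclude` takes precedence over `include`, and `maxPages` caps the result. The helper names are hypothetical.

```python
import json
from urllib.parse import urlparse

CONFIG = json.loads("""
{
  "sitemap": "https://yoursite.com/sitemap.xml",
  "include": ["/docs", "/faq", "/support"],
  "exclude": ["/admin", "/checkout", "/account"],
  "maxPages": 500,
  "crawlFrequency": "weekly"
}
""")


def is_under(path: str, prefix: str) -> bool:
    """True if path equals prefix or lies beneath it (so /docs-old is not under /docs)."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def select_pages(urls: list[str], config: dict) -> list[str]:
    """Keep URLs under an include prefix, drop excluded ones, cap at maxPages."""
    selected = []
    for url in urls:
        path = urlparse(url).path
        if any(is_under(path, prefix) for prefix in config["exclude"]):
            continue  # assumed rule: exclude wins over include
        if any(is_under(path, prefix) for prefix in config["include"]):
            selected.append(url)
    return selected[: config["maxPages"]]


if __name__ == "__main__":
    candidates = [
        "https://yoursite.com/docs/setup",
        "https://yoursite.com/admin/users",
        "https://yoursite.com/blog/post",
        "https://yoursite.com/faq",
    ]
    for url in select_pages(candidates, CONFIG):
        print(url)
```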

Document Formatting Best Practices

For Best Results:

1. Use Clear Headers

  • H1 for main topics
  • H2 for subtopics
  • H3 for details

2. Structured Content

  • Use bullet points for lists
  • Bold important information
  • Keep paragraphs short (3-5 sentences)

3. FAQ Format

Question-answer format works best:

Good FAQ Structure
## What is your return policy?

We accept returns within 30 days of purchase. Items must be:
- Unused and in original packaging
- Accompanied by receipt
- In resaleable condition

To initiate a return, email support@company.com with your order number.

## How long does shipping take?

Shipping times vary by location:
- USA: 3-5 business days
- Canada: 5-7 business days
- International: 7-14 business days

Express shipping is available at checkout.
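
One reason this structure works well is that each heading marks a clean boundary between self-contained answers. The sketch below splits a document on its `##` headings into question-answer pairs. It is an illustration of the general idea, not a description of how Bravin processes files, and the function name is hypothetical.

```python
def split_faq(markdown_text: str) -> list[tuple[str, str]]:
    """Split Markdown into (question, answer) pairs, one per '## ' heading."""
    entries = []
    question = None
    answer_lines = []
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            # A new heading closes the previous entry, if there was one.
            if question is not None:
                entries.append((question, "\n".join(answer_lines).strip()))
            question = line[3:].strip()
            answer_lines = []
        elif question is not None:
            answer_lines.append(line)
    # Close the final entry once the text ends.
    if question is not None:
        entries.append((question, "\n".join(answer_lines).strip()))
    return entries


if __name__ == "__main__":
    sample = (
        "## What is your return policy?\n\n"
        "We accept returns within 30 days of purchase.\n\n"
        "## How long does shipping take?\n\n"
        "Shipping times vary by location:\n"
        "- USA: 3-5 business days\n"
    )
    for question, answer in split_faq(sample):
        print(question)
        print(answer)
        print()
```

Text that appears before the first heading is ignored by this sketch, which is another argument for putting every answer under its own question.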

Citation Rules

Enable Source Citations: When the agent quotes from your knowledge base, it can cite the source.

Configuration:

  1. Go to Agent → Knowledge Base → Settings
  2. Enable "Show Source Citations"
  3. Choose citation format:
    • Footnote style: "According to our return policy [1]"
    • Inline style: "According to our return policy (source: Returns Policy, p. 3)"
    • Link style: "Learn more in our return policy"
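
The three formats can be compared side by side with a small formatter. This is a sketch under stated assumptions: the `Source` fields, the `cite` function, and the example URL are hypothetical, and the link style is rendered as a Markdown link because this guide does not show how the link itself is displayed.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    page: int
    url: str  # hypothetical field, used only by the link style


def cite(text: str, source: Source, style: str, footnote_number: int = 1) -> str:
    """Append a citation to text in one of the three styles described above."""
    if style == "footnote":
        return f"{text} [{footnote_number}]"
    if style == "inline":
        return f"{text} (source: {source.title}, p. {source.page})"
    if style == "link":
        # Assumption: the whole phrase becomes the link text.
        return f"[{text}]({source.url})"
    raise ValueError(f"Unknown citation style: {style}")


if __name__ == "__main__":
    returns = Source(title="Returns Policy", page=3, url="https://yoursite.com/returns")
    print(cite("According to our return policy", returns, "footnote"))
    print(cite("According to our return policy", returns, "inline"))
    print(cite("Learn more in our return policy", returns, "link"))
```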

Benefits:

  • Builds trust
  • Allows customers to verify information
  • Reduces "where did you find that?" questions
  • Improves perceived accuracy

In our A/B tests, enabling citations increased customer trust by 40%.

Test Your Knowledge Base

Testing Checklist:

1. Ask Common Questions

  • Top 10 customer questions
  • Edge cases
  • Negative scenarios (what you don't offer)

2. Check Accuracy

  • Is the answer correct?
  • Is it up-to-date?
  • Does it cite the right source?

3. Coverage Analysis

Go to Knowledge Base → Analytics to review:

  • Questions without good answers
  • Most-used sources
  • Gaps in content

4. Confidence Scores

  • Answers with confidence < 70% need review
  • Add more training data for low-confidence topics
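
If test results are collected outside the dashboard, the 70% rule above can be applied with a few lines of code. The record layout (`question`, `topic`, `confidence` as a value between 0 and 1) is an assumed format for illustration, not a Bravin export schema.

```python
from collections import Counter

REVIEW_THRESHOLD = 0.70  # from the checklist: answers below 70% need review


def needs_review(results: list[dict], threshold: float = REVIEW_THRESHOLD) -> list[dict]:
    """Return the test results whose confidence falls below the threshold."""
    return [result for result in results if result["confidence"] < threshold]


def low_confidence_topics(results: list[dict], threshold: float = REVIEW_THRESHOLD) -> Counter:
    """Count low-confidence answers per topic to show where content is thin."""
    return Counter(result["topic"] for result in needs_review(results, threshold))


if __name__ == "__main__":
    test_run = [
        {"question": "What is your return policy?", "topic": "returns", "confidence": 0.92},
        {"question": "Can I return a gift?", "topic": "returns", "confidence": 0.55},
        {"question": "Do you ship to Brazil?", "topic": "shipping", "confidence": 0.61},
        {"question": "Is express shipping available?", "topic": "shipping", "confidence": 0.88},
        {"question": "Can I return opened items?", "topic": "returns", "confidence": 0.40},
    ]
    for topic, count in low_confidence_topics(test_run).most_common():
        print(topic, count)
```

Topics with the highest counts are the first candidates for additional training data.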

Test with real customer questions from your support history, not just hypothetical scenarios.

Ongoing Maintenance

Weekly Tasks:

  • Review conversations flagged as low-confidence
  • Check for new unanswered questions
  • Update documents if business policies change

Monthly Tasks:

  • Audit knowledge base for outdated content
  • Remove deprecated documents
  • Add new product information
  • Review citation accuracy

Quarterly Tasks:

  • Full content audit
  • Re-train on new data
  • A/B test different response styles
  • Survey customers about answer quality

Automation: Set up automatic alerts in Settings → Notifications:

  • When confidence drops below threshold
  • When unanswered questions spike
  • When documents need updating
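
The guide does not define the alert conditions, so the sketch below shows one plausible way to express all three. Only the 70% confidence threshold comes from the testing checklist; the spike factor of 2.0, the 90-day staleness window, and every function name are assumptions chosen for illustration.

```python
from datetime import date
from statistics import mean


def confidence_alert(average_confidence: float, threshold: float = 0.70) -> bool:
    """Alert when average answer confidence drops below the threshold."""
    return average_confidence < threshold


def unanswered_spike(previous_daily_counts: list[int], today_count: int,
                     factor: float = 2.0) -> bool:
    """Alert when today's unanswered questions exceed factor x the recent average."""
    if not previous_daily_counts:
        return False  # no baseline yet, so nothing to compare against
    return today_count > factor * mean(previous_daily_counts)


def stale_documents(last_updated: dict[str, date], today: date,
                    max_age_days: int = 90) -> list[str]:
    """Return the names of documents not updated within max_age_days."""
    return sorted(
        name
        for name, updated in last_updated.items()
        if (today - updated).days > max_age_days
    )


if __name__ == "__main__":
    print(confidence_alert(0.64))
    print(unanswered_spike([4, 5, 6], today_count=12))
    print(stale_documents(
        {"returns.pdf": date(2024, 1, 1), "faq.docx": date(2024, 3, 1)},
        today=date(2024, 4, 15),
    ))
```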

Companies that update their knowledge base weekly see 35% higher accuracy scores.
