AKFA Guide chatbot interface with suggested questions about careers, divisions, social impact, and contact information

Self-Hosted RAG Chatbot for AKFA Holding

We developed a custom RAG chatbot for AKFA Holding's trilingual corporate website that answers visitor questions using only indexed site content.

Uzbekistan

Solution:

AI Assistant

Tech Stack:

Python, PostgreSQL & MySQL, React & Next.js, Web Frontend, Backend & APIs, Laravel, Docker, Cloud & DevOps, Data

Services:

UI Design, DevOps, UX Design, AI, QA, Software Development, Information Security, Workflow Automation, Backend, AI Transformation, Frontend

KPI:

Project Timeline: complete the project in 1 month.

Content update latency: under 10 minutes.

Client & Context

AKFA Holding is one of the largest conglomerates in Central Asia. It spans construction, appliances, tourism, healthcare, education.

The holding includes 40+ companies, which makes the website rich in information but difficult to navigate quickly.

To help visitors easily find relevant companies, services, and information without wasting time, AKFA needed a reliable chatbot, especially in Uzbek.

AKFA website displayed on a laptop with global export map highlighting 31 countries.

AKFA’s global structure needs more than menu navigation. The chatbot helps users quickly reach the right part of a large corporate site.

Goals

1
Ground every answer in published website content and uploaded internal documents.
2
Let content managers update content, system prompts, and error messages in three languages without developers.
3
Find relevant answers by meaning, not just exact keyword matches, using RAG architecture.

Challenges & How We Overpowered Them

Uzbek language support is poor in commercial AI tools.

We self-hosted a multilingual embedding model with native Uzbek coverage.

CMS holds 20+ content model types with multilingual fields.

We built a custom indexer directly around the Filament CMS schema.

A hallucinating bot creates reputational risk for the brand.

The bot grounds every answer in indexed content and declines when nothing relevant is found.

Inside the AKFA Holding Chatbot

Why We Built It Custom

PostgreSQL was already in the stack — enabling pgvector gave us a vector index without a separate service.

Filament holds ~20 model types with multilingual fields. SaaS bots don't speak this schema and tend to underperform on Uzbek.

Page-aware retrieval, source citations, and refusal policy are architectural choices.

Comparison of SaaS chatbot limitations versus custom RAG benefits for Uzbek support, CMS schema, citations, and infrastructure control

Custom RAG solved what SaaS chatbots could not.

Turning a Complex CMS into Chatbot Knowledge

A custom indexer reads CMS content through an internal API. It splits content into chunks and stores 768-dimensional vector embeddings in PostgreSQL with the pgvector extension.

Content managers can also upload supplementary TXT documents — regulations, FAQs, briefs — and the bot indexes them the same way.

Custom indexing pipeline from Filament CMS and uploaded TXT documents to chunks, embeddings, and pgvector search

The indexer turns CMS content into chatbot knowledge. Pages and TXT files are cleaned, chunked, embedded, and stored in pgvector.

AI Architecture That Can Evolve Without Rebuilding Everything

We split the AI stack by how often each layer changes.

Embeddings are the foundation — switching them means full reindexing. We self-host intfloat/multilingual-e5-base for native languages coverage.

LLM generation is the experimentation zone. We route through OpenRouter and swap models with a config change.

The Bot Checks the Current Page First

Every request includes the visitor's current page URL. The bot first searches for relevant chunks on that specific page.

With three or more quality matches, it answers from the page. With fewer, it expands to the full site index — page context is a priority with graceful fallback.

Current-page-first chatbot behavior using page context before falling back to full-site retrieval

Page context makes answers more precise. The bot checks the current URL first, then expands search to the full site.

Answers Based Only on Verified Content

The LLM generates phrasing. Indexed content provides the facts.

When retrieval finds nothing relevant, the bot says so plainly. Every answer ships with a deduplicated list of source URLs, enforced at the architecture level.

Mobile chatbot response showing grounded answers with links to three source pages on the AKFA website

Every answer stays traceable. Source URLs show where the chatbot found the information.

Follow-Up Questions That Still Make Sense

Short follow-ups get rewritten into standalone queries before retrieval. "When was it founded?" becomes "When was AKFA Holding founded?"

Chat history travels with each request from the client. The RAG service stays stateless — easier to scale, better for privacy.

Fresh Answers Within 10 Minutes of Every CMS Update

Every 10 minutes, the indexer checks the CMS for pages with an updated timestamp.

Only changed content gets reindexed. Unchanged chunks reuse existing embeddings, saving compute.

Deleted or unpublished pages disappear from the index automatically on the next cycle.

Auto-sync diagram showing the chatbot reindexing updated CMS content every 10 minutes

Knowledge stays fresh automatically. Every 10 minutes, the system checks CMS updates and reindexes only changed content.

Compliance & Security

Internal API secured with private token authentication.

Frontend input sanitized against XSS attacks.

Bot declines to respond when relevant indexed content is absent.

Results

The chatbot launched on AKFA Holding's corporate website. It answers visitor questions in three languages, cites sources in every response, and refreshes its knowledge within 10 minutes of any content update in the CMS.

Let's solve your challenge