Why I Prefer MongoDB For AI Applications


I use MongoDB as the primary database for AI-powered products like DocRouter.AI and SigAgent.AI. This post explains how it’s implemented—migrations, vector search, and knowledge bases—and why I prefer it over alternatives like Postgres for document-centric, JSON-heavy AI workloads. I want to store a very large number of documents (DocRouter) or logs (SigAgent) without spending much time tuning the database for horizontal scaling; MongoDB fits that need well.

Brief trade-offs vs Postgres

Postgres with jsonb can model the same document-style records and even integrate vector search via extensions, but it shines most when you need strong relational guarantees and complex joins around a relatively stable schema. MongoDB is a better fit when almost everything is a JSON document, the schema evolves quickly, and you care more about horizontal scaling and quick turnaround than about classic SQL features.

In my case, the workloads are heavily document- and log-centric, so the ergonomics and scaling model of MongoDB outweigh the benefits of staying inside the relational/Postgres ecosystem.

  • Schema flexibility. MongoDB: schema evolves via migrations; the database enforces nothing, but disciplined application code keeps documents consistent. Postgres: rigid but enforceable with DDL; great for stable relations, heavier for fast-changing AI schemas.
  • Scaling. MongoDB: horizontal scaling and sharding are built in; easy to grow with large document/log volumes. Postgres: requires extensions or external tooling (e.g. Citus) and more tuning when pushing JSONB-heavy workloads.
  • Consistency model. MongoDB: tunable; we use majority writes and typically read from secondaries, so most reads are eventually consistent. Postgres: strong ACID semantics with primary reads by default; great when you need strict guarantees across transactions.
  • Vector search. MongoDB: native $vectorSearch in MongoDB 8.2+/Atlas. Postgres: via pgvector, which has mature support.
  • Joins & relations. MongoDB: limited (embedded documents and $lookup). Postgres: strong, with first-class SQL joins and foreign keys.
  • Dev speed. MongoDB: quick iterations on JSON schemas and migrations; fits rapid AI product experiments. Postgres: slower to evolve schemas cleanly; better when you already know the long-term relational shape.

It’s also totally reasonable to split responsibilities: e.g. Postgres for relational metadata, a dedicated vector store like Pinecone/Weaviate/Qdrant for embeddings, or Redis as an embedding cache in front of another database. For DocRouter.AI and SigAgent.AI, I’m happy to trade some theoretical optimality for a single operational datastore (MongoDB handling documents, logs, and vectors) until scale or workload complexity justifies introducing extra systems.

DocRouter and SigAgent: One Backend, Two Products

DocRouter.AI is a smart document router: you upload documents, define schemas and prompts, and it extracts structured data (e.g. from invoices, medical records, forms) using LLMs. SigAgent is a Claude agent monitor with a different UX and product focus, but it’s built on the same stack.

Roughly 90% of the backend is shared. Both use the same Python package, analytiq_data: MongoDB client, migrations, queue layer, auth, and app startup. The same MongoDB database layout, indices, and migration history apply to both. Product-specific code lives in routes and frontends; the data layer is common.


Strict Schema via Discipline: Migrations as “Schema”

MongoDB doesn’t enforce a schema. I still want predictable structure and safe upgrades, so we enforce it in code and process:

  • Consistent document shape per collection. We use a fixed set of field names and types in application code (and in TypeScript/Pydantic where applicable). In practice, each collection behaves like a table with a known “schema.”
  • Every change goes through migrations. Renaming fields, adding/removing fields, splitting or renaming collections—all of it is done in versioned migration classes with up() and down(). The current schema version is stored in a migrations collection; on startup we run run_migrations(analytiq_client) and bring the DB to the latest version.
  • Field types as schema. We treat document fields as if they were typed: e.g. schema_id, prompt_version, organization_id are always present where we expect them, and new code assumes the post-migration shape. This gives comparable safety to Postgres (known structure, no surprise shapes) while staying in a document model, as long as we stay disciplined and never bypass migrations.

So: schema is “enforced” by convention and migrations, not by the database. That keeps development fast without giving up control.
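
As an illustration of this discipline, here is a hedged sketch of what a typed model for an llm_runs-style document could look like with Pydantic (field names follow the examples above; the actual models in the codebase may differ):

from datetime import datetime
from typing import Optional
from pydantic import BaseModel, Field

class LlmRun(BaseModel):
    """Illustrative shape of a document in the llm_runs collection (not the real model)."""
    document_id: str
    organization_id: str
    prompt_id: str
    prompt_version: int
    is_edited: bool = False
    is_verified: bool = False
    created_at: datetime = Field(default_factory=datetime.utcnow)
    updated_at: Optional[datetime] = None

Validation happens at the application boundary; the database itself stays schemaless.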

How Migrations and Indices Are Implemented

At a high level:

  • Migrations are versioned Python classes with up()/down() methods that evolve collections in lockstep with the code.
  • Indices are either created inside migrations (long-term, repeatable) or via a small helper at runtime for per-module needs.
  • Profiling tools (Atlas Query Insights, profiler, Compass/Performance Advisor) surface slow queries that suggest new compound indexes.

Migrations

Migrations live in migration.py. Each migration is a class with:

  • description: short human-readable summary
  • up(db): apply the change (e.g. rename field, backfill, add index)
  • down(db): revert the change when possible

The runner loads the list MIGRATIONS, assigns a version by index, and stores the current version in db.migrations under _id: "schema_version". On startup we run pending migrations in order. Examples from the codebase:

  • Field renames: e.g. RenameUserFields (camelCase → snake_case)
  • New fields: e.g. LlmResultFieldsMigration (adds is_edited, is_verified, timestamps)
  • Structural changes: e.g. RenameCollections (schemas → schema_revisions, schema_versions → schemas), UseMongoObjectIDs

Collection layout and index strategy evolve in one place, with a clear history and rollback path.
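
To make the pattern concrete, here is a minimal sketch of a migration class and the version-tracking runner (the class name and fields echo the examples above, but the bodies are illustrative, and the real runner takes analytiq_client rather than a bare db handle):

class LlmResultFieldsMigration:
    description = "Add is_edited/is_verified flags to llm_runs"

    async def up(self, db):
        # Backfill the new fields on existing documents
        await db.llm_runs.update_many(
            {"is_edited": {"$exists": False}},
            {"$set": {"is_edited": False, "is_verified": False}},
        )
        # Long-term index created inside the migration
        await db.llm_runs.create_index(
            [("document_id", 1), ("prompt_id", 1), ("prompt_version", -1)],
            background=True,
        )

    async def down(self, db):
        await db.llm_runs.update_many(
            {}, {"$unset": {"is_edited": "", "is_verified": ""}}
        )
        await db.llm_runs.drop_index("document_id_1_prompt_id_1_prompt_version_-1")


MIGRATIONS = [LlmResultFieldsMigration()]  # version is assigned by list position

async def run_migrations(db):
    # Current schema version lives in db.migrations under _id: "schema_version"
    state = await db.migrations.find_one({"_id": "schema_version"}) or {"version": 0}
    for version, migration in enumerate(MIGRATIONS, start=1):
        if version > state["version"]:
            await migration.up(db)
            await db.migrations.update_one(
                {"_id": "schema_version"},
                {"$set": {"version": version}},
                upsert=True,
            )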

Indices

We manage indices in two ways:

  1. Inside migrations for long-term indices. These are created with create_index(..., background=True) and dropped in down() so rollback is consistent.
  2. At runtime via ensure_index() in analytiq_data/mongodb/index.py. Given a collection, index spec, and name, it creates the index if missing (and optionally drops other non-_id indexes). We use this for things like payments_usage_records (org_id, timestamp) when the payments module initializes.
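
A hedged sketch of what such a runtime helper can look like (the real ensure_index in analytiq_data/mongodb/index.py may differ in signature and behavior):

async def ensure_index(collection, keys, name, drop_others=False):
    """Create the named index from the given key spec if it is missing."""
    existing = await collection.index_information()
    if drop_others:
        # Optionally drop every non-_id index other than the one we want
        for index_name in existing:
            if index_name not in ("_id_", name):
                await collection.drop_index(index_name)
    if name not in existing:
        await collection.create_index(keys, name=name, background=True)

# Example: the payments module ensuring its (org_id, timestamp) index at init time
# await ensure_index(db.payments_usage_records,
#                    [("org_id", 1), ("timestamp", 1)],
#                    name="org_id_1_timestamp_1")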

Example index patterns:

  • Queue collections: (status, created_at) for find_one_and_update({ status: "pending" }, sort: { created_at: 1 }).
  • docs: (organization_id, upload_date desc) for paginated listing by org.
  • llm_runs: (document_id, prompt_id, prompt_version desc) for “latest run by document and prompt,” and (document_id, prompt_revid) for exact revision lookup.
  • document_index: unique (kb_id, document_id) and a non-unique document_id index for cascade deletes.

To find candidates for new indices, we use Atlas Query Insights (filter by operations returning >1,000 documents, then add a compound index matching the filter and sort) or, on Community Edition, enable the database profiler and filter system.profile by nReturned:

db.setProfilingLevel(1, { slowms: 100 })
db.system.profile.find({ nReturned: { $gt: 1000 } }).sort({ ts: -1 }).limit(20)

Level 2 profiling captures everything but has I/O cost, so we use it in short bursts or staging. Level 1 (slow only) is cheaper and still surfaces most heavy queries.

If you prefer GUIs over shell tools, both MongoDB Compass and the Atlas Performance Advisor surface similar slow-query/index recommendations in a more visual way, which is often easier when you’re just getting started tuning a workload.

With schema and indices in place, here’s how we run it.

Deployment Environments

In brief, the environments look like this:

  • Local: MongoDB running on localhost or via the Atlas Local Docker image (bundling mongod + mongot) for easy vector search testing.
  • Production: MongoDB Atlas or AWS DocumentDB, configured for majority writes and read scaling.
  • Self-hosted: MongoDB 8.2+ with mongot in a replica set, using the same createSearchIndexes / $vectorSearch APIs.

For local development, MONGODB_URI defaults to mongodb://localhost:27017. For vector search, we use the MongoDB Atlas Local Docker image (mongodb/mongodb-atlas-local:latest), which bundles mongod and mongot (the search/vector process) in one container.

For production we use MongoDB Atlas; for on-prem consulting we use AWS DocumentDB. The client is configured with w='majority', readPreference='secondaryPreferred', and retryWrites=False (DocumentDB doesn’t support retryWrites).
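
As a sketch, the client setup under those settings looks roughly like this (Motor shown; the database name is illustrative):

import os
from motor.motor_asyncio import AsyncIOMotorClient

MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://localhost:27017")

client = AsyncIOMotorClient(
    MONGODB_URI,
    w="majority",                         # acknowledge writes on a majority of nodes
    readPreference="secondaryPreferred",  # read from secondaries when available
    retryWrites=False,                    # DocumentDB does not support retryable writes
)
db = client["docrouter"]  # database name is illustrative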

For self-hosted Community Edition, vector search requires MongoDB 8.2+ with mongot running alongside mongod in a replica set. The createSearchIndexes command and $vectorSearch aggregation work the same way across Atlas, Community 8.2+, and the local Docker image—no separate code path.
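
As a sketch, creating a per-KB vector index with filter fields might look like this (the embedding path, dimension count, and index name are assumptions; the createSearchIndexes command format is the standard one):

# kb_id comes from the knowledge base being indexed
await db.command({
    "createSearchIndexes": f"kb_vectors_{kb_id}",
    "indexes": [{
        "name": "kb_vector_index",   # illustrative name
        "type": "vectorSearch",
        "definition": {
            "fields": [
                {"type": "vector", "path": "embedding",
                 "numDimensions": 1536, "similarity": "cosine"},
                # Fields declared as filters here can be used in $vectorSearch's filter option
                {"type": "filter", "path": "organization_id"},
                {"type": "filter", "path": "tag_ids"},
            ]
        },
    }],
})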

Knowledge Bases with Vector Search

We implement knowledge bases in the shared backend using MongoDB’s vector search, with three key pieces: an indexing pipeline, a search pipeline, and a reconciliation service.


Indexing: Blue-Green Atomic Swap

Each knowledge base gets its own vector collection (kb_vectors_<kb_id>). When a document is indexed, we chunk the text (via Chonkie, a small library for token/sentence/recursive text chunking), generate embeddings via LiteLLM, a lightweight multi-provider LLM/embedding client, and then atomically swap old vectors for new ones inside a MongoDB transaction:

async with await client.start_session() as session:
    async with session.start_transaction():
        # Delete old vectors for this document
        await vectors_collection.delete_many(
            {"document_id": document_id}, session=session
        )
        # Insert new vectors
        await vectors_collection.insert_many(new_vectors, session=session)
        # Update document_index entry
        await db.document_index.update_one(
            {"kb_id": kb_id, "document_id": document_id},
            {"$set": { ... }},
            upsert=True, session=session
        )
        # Update KB stats (document_count, chunk_count)
        await db.knowledge_bases.update_one(
            {"_id": ObjectId(kb_id)},
            {"$set": {"document_count": total_docs, "chunk_count": total_chunks}},
            session=session
        )

This blue-green pattern means a document is never in a half-indexed state: either all its new vectors are visible, or the old ones remain.


Embedding Cache

Embeddings are cached in a global embedding_cache collection keyed by (SHA-256 chunk hash, embedding model). When the same text chunk appears in multiple KBs (or is re-indexed after an edit), we skip the API call and reuse the cached vector. This saves both cost and latency—especially when re-indexing a large KB after a configuration change.
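
A sketch of the cache-aside lookup, assuming illustrative field names in embedding_cache and using LiteLLM’s async embedding call (the response follows the OpenAI-style format LiteLLM returns):

import hashlib
from litellm import aembedding

async def embed_chunk(db, text: str, model: str) -> list[float]:
    # Cache key: SHA-256 of the chunk text plus the embedding model name
    chunk_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    cached = await db.embedding_cache.find_one(
        {"chunk_hash": chunk_hash, "model": model}
    )
    if cached:
        return cached["embedding"]

    response = await aembedding(model=model, input=[text])
    vector = response.data[0]["embedding"]

    await db.embedding_cache.update_one(
        {"chunk_hash": chunk_hash, "model": model},
        {"$set": {"embedding": vector}},
        upsert=True,
    )
    return vector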

Search Pipeline

The query string is embedded with the same model, then we run an aggregation whose first stage is $vectorSearch (with numCandidates = max(top_k * 10, 100) for better recall). We apply filters (organization, tags, date range) via the $vectorSearch filter option, against fields declared as filter fields in the vector index definition; add the vectorSearchScore via $meta; and optionally coalesce neighboring chunks for richer context.
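
A sketch of the search aggregation (index and field names follow the indexing sketch above; query_text, top_k, org_id, and kb_id are assumed inputs):

query_vector = await embed_chunk(db, query_text, model)  # reuses the cache helper above
num_candidates = max(top_k * 10, 100)

pipeline = [
    {
        "$vectorSearch": {
            "index": "kb_vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": num_candidates,
            "limit": top_k,
            "filter": {"organization_id": org_id},  # tag and date filters are added the same way
        }
    },
    {
        "$project": {
            "text": 1,
            "document_id": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    },
]
results = await db[f"kb_vectors_{kb_id}"].aggregate(pipeline).to_list(length=top_k)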

Reconciliation: Keeping KBs in Sync

Documents and KBs can drift: a document’s tags change, a document is deleted, or vectors are orphaned after a failed indexing run. The reconciliation service (reconciliation.py) detects and fixes this:

  • Missing documents: documents with matching tags but no document_index entry → queued for indexing.
  • Stale documents: indexed documents whose tags no longer match the KB → removed.
  • Orphaned vectors: vectors without a corresponding document_index entry → deleted.

Reconciliation uses a distributed lock (atomic find_one_and_update with a 10-minute TTL) so only one worker reconciles a given KB at a time. It processes documents in batches of 100 to keep memory bounded, and supports dry-run mode for auditing without side effects.
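
A sketch of the lock acquisition, assuming an illustrative kb_locks collection (the real reconciliation.py may structure this differently):

from datetime import datetime, timedelta
from pymongo import ReturnDocument
from pymongo.errors import DuplicateKeyError

LOCK_TTL = timedelta(minutes=10)

async def acquire_kb_lock(db, kb_id: str, worker_id: str) -> bool:
    """Atomically claim the reconciliation lock for a KB; expired locks can be taken over."""
    now = datetime.utcnow()
    try:
        await db.kb_locks.find_one_and_update(
            {"_id": kb_id, "locked_until": {"$not": {"$gt": now}}},  # free or expired
            {"$set": {"worker_id": worker_id, "locked_until": now + LOCK_TTL}},
            upsert=True,
            return_document=ReturnDocument.AFTER,
        )
        return True
    except DuplicateKeyError:
        # Another worker holds a non-expired lock on this KB
        return False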


In short: we get strict schema via migrations, predictable indexing, vector search for RAG with atomic updates, and self-healing knowledge bases—with one backend shared between DocRouter and SigAgent, and the same patterns whether we run on local Mongo, Atlas, or DocumentDB.