RAG Metadata Filtering Strategies

Intermediate · 10 min read

What you'll learn

✓ Why metadata filtering matters in RAG
✓ Common metadata schemas
✓ Pre-filter versus post-filter strategies
✓ How to combine filters with vector search
✓ Pitfalls and how to avoid empty results

Prerequisites

• Familiarity with APIs
• Basic RAG concepts

What and Why

Vector similarity is great at finding semantically related text but blind to attributes like date, owner, language, or document type. Metadata filtering layers structured constraints on top of semantic search so you can say things like “find chunks similar to this query, but only from the 2025 policy docs in English that this user can access.”

This matters for three reasons: precision, scope, and security. Precision improves because irrelevant matches get pruned. Scope lets users narrow into a project or product. Security ensures a user cannot retrieve documents they should not see, which is a hard requirement in enterprise RAG.

Mental Model

Imagine a library where every book has both a topic and a colored sticker on the spine indicating its language, year, and department. Semantic search picks books by topic. Filters check the sticker. You can either pull a giant pile by topic and then sort by sticker (post-filter) or first walk to the right shelf and only search topics there (pre-filter).

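Pre-filtering is usually faster when your filter is selective, and every result it returns already satisfies the filter. Post-filtering is simpler to bolt on, but it scores vectors that are later thrown away and can return fewer than k results when a strict filter removes most of the pile.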
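To make the difference concrete, here is a minimal sketch in plain Python with a brute-force search over a toy corpus. The vectors, the lang field, and the helper names are all made up for illustration; a real vector database would use an approximate index rather than a full scan.

```python
import math

# Toy corpus: (chunk_id, embedding, metadata). All values are illustrative.
CORPUS = [
    ("a", [1.0, 0.0], {"lang": "en"}),
    ("b", [0.9, 0.1], {"lang": "de"}),
    ("c", [0.8, 0.2], {"lang": "de"}),
    ("d", [0.1, 0.9], {"lang": "en"}),
    ("e", [0.0, 1.0], {"lang": "en"}),
]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def top_k(query, rows, k):
    """Brute-force nearest neighbours, best match first."""
    return sorted(rows, key=lambda r: cosine(query, r[1]), reverse=True)[:k]

def pre_filter(query, k, lang):
    # Walk to the right shelf first, then search only there.
    candidates = [r for r in CORPUS if r[2]["lang"] == lang]
    return [r[0] for r in top_k(query, candidates, k)]

def post_filter(query, k, lang, fetch_n):
    # Pull a pile by similarity, then discard whatever fails the filter.
    pile = top_k(query, CORPUS, fetch_n)
    return [r[0] for r in pile if r[2]["lang"] == lang][:k]

query = [1.0, 0.0]
pre = pre_filter(query, k=2, lang="en")
post = post_filter(query, k=2, lang="en", fetch_n=3)
print(pre)   # two English chunks
print(post)  # only one: the pile of 3 was mostly German
```

With the same k, the post-filter run comes back short because two of the three nearest chunks fail the language check. Raising fetch_n fixes this at the cost of scoring more vectors.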

Hands-on Example

Most modern vector databases (Pinecone, Weaviate, Qdrant, pgvector with jsonb) support filters in the query. Here is a Qdrant example:

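from qdrant_client import QdrantClient
from qdrant_client.http import models as qm

# Connect to a local Qdrant instance on the default REST port.
client = QdrantClient("localhost", port=6333)

# embed() is a placeholder for your own embedding function. It must return
# a list of floats with the same dimension as the "docs" collection.
# search() is the classic API; recent qdrant-client releases offer
# query_points() as its replacement.
results = client.search(
    collection_name="docs",
    query_vector=embed("How do I rotate API keys?"),
    query_filter=qm.Filter(
        # Every condition under "must" has to hold (logical AND).
        must=[
            qm.FieldCondition(key="product", match=qm.MatchValue(value="auth")),
            qm.FieldCondition(key="year", range=qm.Range(gte=2024)),
        ]
    ),
    limit=5,  # top-k after filtering
)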

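Here product and year are metadata fields (Qdrant calls them payload) stored alongside each vector. When those fields are indexed, the database can apply the conditions during the nearest-neighbor lookup instead of afterwards, so vectors that fail the filter never make it into the results.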


Pre-filter:
query ---> [filter by product=auth] ---> [vector search] ---> top-k

Post-filter:
query ---> [vector search top-N] ---> [filter by product=auth] ---> top-k

Hybrid:
query ---> [filter] ---> [vector search] ---> [re-rank] ---> top-k

Pre-filter vs post-filter retrieval flows

A common schema is to attach: source, doc_id, chunk_id, created_at, updated_at, author, team, product, language, acl (list of allowed user or group IDs), and maybe section_path.
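One way to keep that schema consistent across ingestion jobs is to define it once in code. The sketch below is an assumption about how you might do that, not a required layout: the ChunkMetadata class, the to_epoch_seconds helper, the sample values, and the choice to store timestamps as UTC epoch seconds are all illustrative.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import List, Optional

def to_epoch_seconds(value):
    """Accept an ISO 8601 string or a datetime and return UTC epoch seconds."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # Assumption: timestamps without a timezone are treated as UTC.
        value = value.replace(tzinfo=timezone.utc)
    return int(value.timestamp())

@dataclass
class ChunkMetadata:
    source: str
    doc_id: str
    chunk_id: str
    created_at: int          # epoch seconds, so range filters compare numbers
    updated_at: int
    author: str
    team: str
    product: str
    language: str
    acl: List[str] = field(default_factory=list)  # allowed user or group IDs
    section_path: Optional[str] = None

meta = ChunkMetadata(
    source="wiki",
    doc_id="doc-42",
    chunk_id="doc-42#3",
    created_at=to_epoch_seconds("2025-01-15T00:00:00+00:00"),
    updated_at=to_epoch_seconds("2025-01-15"),
    author="jdoe",
    team="platform",
    product="auth",
    language="en",
    acl=["group:support"],
)

# A plain dict is what most vector stores accept as metadata or payload.
payload = asdict(meta)
```

Normalizing dates at the edge of the pipeline means every stored value has the same type, which is what range filters depend on.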

Trade-offs

Pre-filter is fast when the filter selects a small slice and the database supports indexed filtering. It can hurt recall if you over-constrain. For example, filtering to a single quarter may exclude an evergreen FAQ chunk that is still the best answer.

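Post-filter is straightforward to get right, but it wastes compute, since you fetch many vectors only to throw most away. It also risks returning too few results if the over-fetch factor (how many more than k candidates you retrieve before filtering) is too small for how strict the filter is.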
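One way to soften the over-fetch problem is to grow the candidate pile until enough results survive the filter. The following is a sketch under stated assumptions: search_fn is a hypothetical callable that returns the n nearest hits in rank order, and the doubling schedule and max_fetch cap are arbitrary choices.

```python
def post_filter_search(search_fn, keep, k, overfetch=4, max_fetch=1000):
    """Post-filter with an adaptive over-fetch.

    search_fn(n) -> list of the n nearest hits, best first (hypothetical).
    keep(hit)    -> True if the hit passes the metadata filter.
    """
    fetch_n = k * overfetch
    while True:
        hits = search_fn(fetch_n)
        kept = [h for h in hits if keep(h)]
        exhausted = len(hits) < fetch_n  # the index had nothing more to give
        if len(kept) >= k or exhausted or fetch_n >= max_fetch:
            return kept[:k]
        fetch_n = min(fetch_n * 2, max_fetch)

# Toy ranked list: only every tenth hit belongs to the "auth" product.
RANKED = [
    {"id": i, "product": "auth" if i % 10 == 9 else "billing"}
    for i in range(100)
]
calls = []

def toy_search(n):
    calls.append(n)  # record how many candidates each round requested
    return RANKED[:n]

result = post_filter_search(
    toy_search, keep=lambda h: h["product"] == "auth", k=3
)
print([h["id"] for h in result], calls)
```

Each retry repeats the vector search with a larger pile, which is exactly the wasted work described above. If retries are common for a given filter, that is a hint the filter is selective enough to push into the database as a pre-filter.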

Storing too many metadata fields slows writes and bloats memory. Storing too few limits future filters. A practical middle path is to add only the fields you can imagine filtering on within six months, and keep the rest as raw text.

Permissions are the trickiest area. ACL lists work but explode for documents shared with many groups. Group-based ACLs and role expansion at query time scale better, but require a sync between your identity system and the vector store.
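Role expansion at query time can be sketched as follows. The USER_GROUPS and GROUP_ROLES tables stand in for whatever your identity system exposes, and the "user:", "group:", and "role:" prefixes and the filter dict shape are assumptions for illustration rather than any particular store's syntax.

```python
# Hypothetical identity data; in practice this comes from your IdP or directory.
USER_GROUPS = {"alice": ["eng", "oncall"], "bob": ["sales"]}
GROUP_ROLES = {
    "eng": ["role:internal-docs"],
    "oncall": ["role:runbooks"],
    "sales": [],
}

def expand_principals(user_id):
    """Return every principal the user can act as: self, groups, roles."""
    principals = {f"user:{user_id}"}
    for group in USER_GROUPS.get(user_id, []):
        principals.add(f"group:{group}")
        principals.update(GROUP_ROLES.get(group, []))
    return sorted(principals)

def acl_filter(user_id):
    """Store-agnostic filter: the chunk's acl must contain any principal."""
    return {"field": "acl", "match_any": expand_principals(user_id)}

def is_visible(chunk_acl, user_id):
    """Reference check with the same semantics as acl_filter."""
    return bool(set(chunk_acl) & set(expand_principals(user_id)))

print(acl_filter("alice"))
print(is_visible(["role:runbooks"], "alice"))  # alice is on call
print(is_visible(["role:runbooks"], "bob"))    # bob is not
```

Each chunk then stores a short list of groups or roles instead of every individual user, and the expansion happens once per query. The cost is the sync mentioned above: if group membership is stale, the filter is wrong.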

Practical Tips

• Always include source and a stable doc_id in metadata. You will need them for de-duplication and citations.
• For dates, store epoch seconds or ISO strings consistently. Mixed formats break range filters silently.
• Test what happens when filters return zero hits. The pipeline should fall back gracefully rather than handing the LLM nothing.
• Combine filters with hybrid search (BM25 plus vectors) for queries that contain exact identifiers like ticket numbers.
• Log filter usage. You will discover which fields users actually rely on, and which ones can be dropped.
• Push permission filters as close to the database as possible. Never rely on application code alone to enforce ACLs.
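The zero-hit fallback from the tips above might look like the sketch below. It assumes filters are simple (field, value) equality pairs and that search_fn is a hypothetical wrapper around your retriever. Optional filters are relaxed one at a time, last first, while required filters such as ACLs are never dropped.

```python
# Toy documents standing in for retrieved chunks.
DOCS = [
    {"id": 1, "product": "auth", "year": 2023, "acl": "eng"},
    {"id": 2, "product": "billing", "year": 2025, "acl": "eng"},
    {"id": 3, "product": "auth", "year": 2025, "acl": "sales"},
]

def toy_search(filters, k):
    """Stand-in for a filtered vector search: equality filters only."""
    return [d for d in DOCS if all(d[f] == v for f, v in filters)][:k]

def search_with_fallback(search_fn, required, optional, k):
    """Retry with fewer optional filters until something comes back.

    Returns (hits, dropped) so the caller can log or disclose what was relaxed.
    """
    active = list(optional)
    dropped = []
    while True:
        hits = search_fn(list(required) + active, k)
        if hits or not active:
            return hits, dropped
        dropped.append(active.pop())  # relax the least important filter

hits, dropped = search_with_fallback(
    toy_search,
    required=[("acl", "eng")],                        # never relaxed
    optional=[("product", "auth"), ("year", 2025)],   # most important first
    k=5,
)
print(hits, dropped)
```

Returning the dropped filters lets the application tell the user that the answer comes from outside the requested year, instead of silently widening the scope. If even the required filters match nothing, the function returns an empty list and the caller should say so rather than prompt the LLM with no context.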

Wrap-up

Metadata filters turn a generic semantic search into a focused, secure retriever. Decide your schema early, prefer pre-filtering when the filter is selective and post-filtering when it is not, and watch out for over-constrained queries that return nothing. Done well, metadata filtering can noticeably improve the perceived quality of your RAG system without changing the model or the embeddings.