Embed your content with a modern embedding model, store the vectors in pgvector, and query with cosine similarity. For best results, combine the vector score with BM25 (keyword) scores in a hybrid ranking. This beats traditional keyword search on roughly 80% of query types.
```sql
-- Enable the pgvector extension
create extension if not exists vector;

-- Add an embedding column; the dimension must match your embedding model
alter table articles add column embedding vector(1536);

-- Approximate-nearest-neighbor index; for larger datasets, use HNSW instead
create index on articles using ivfflat (embedding vector_cosine_ops) with (lists = 100);

-- Query by cosine similarity (<=> is cosine distance with vector_cosine_ops)
select *, 1 - (embedding <=> $1) as similarity
from articles
order by embedding <=> $1
limit 20;
```

Use `tsvector` for full-text search, then combine the two signals: `final_score = 0.5 * vector_score + 0.5 * bm25_score`. Ask AI: "Generate a Postgres function that returns top-K hybrid-ranked results."

| Tool | Best For | Price |
|---|---|---|
| pgvector | Postgres vector store | Free |
| text-embedding-3-small | Cheap & good | $0.02/M tokens |
| bge-m3 | Self-hosted embed | Free |
| Cohere Rerank-compat | Re-ranking | $1/1K |
| tsvector (pg) | BM25 | Free |
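The hybrid formula above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: `hybrid_score` and `normalize` are hypothetical helper names, and it assumes you min-max normalize each ranker's raw scores before blending so neither signal dominates.

```python
def normalize(scores):
    """Min-max normalize raw scores to [0, 1] so the two rankers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_score(vector_score, bm25_score, alpha=0.5):
    """Blend normalized cosine-similarity and BM25 scores; alpha weights the vector side."""
    return alpha * vector_score + (1 - alpha) * bm25_score
```

Tune `alpha` per corpus: push it toward 1.0 for conversational queries, toward 0.0 when exact keyword matches (SKUs, error codes) matter most.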
Q: Vector DB (Pinecone, Qdrant) vs pgvector? pgvector wins for most workloads — 1M+ vectors run fine with HNSW, and there's no extra infrastructure to operate.
Q: Which embedding model? 3-small (1536d) for speed. 3-large (3072d) for quality. bge-m3 self-hosted for privacy.
Q: How do I handle multilingual content? Use multilingual embeddings (bge-m3, multilingual-e5) — they cross languages natively.
Q: Real-time vs batch indexing? Trigger-based re-embed on update works up to ~100 writes/min. Above that, queue.
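The "queue above ~100 writes/min" advice can be as simple as draining pending article IDs in batches. A minimal sketch, assuming your update hook pushes IDs onto a `queue.Queue`; `drain_batch` is a hypothetical helper, and deduplication matters because a hot row may be updated several times before the worker runs.

```python
import queue

def drain_batch(pending, batch_size=32):
    """Pull up to batch_size queued article IDs, deduplicating repeated updates."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break
    # dict.fromkeys preserves first-seen order while dropping duplicates
    return list(dict.fromkeys(batch))
```

A background worker then calls your embedding API once per drained batch instead of once per write.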
Q: How do I evaluate quality? Create 50 query → expected result pairs. Measure Hit@3, NDCG@10. Compare configs.
Q: Can I search images? Yes — use CLIP embeddings for images + text in the same vector space.
Semantic search is a one-afternoon upgrade that dramatically improves product experience. Add pgvector, embed your content, layer hybrid scoring, and watch bounce rates drop. No new infrastructure needed.