An embedding is a fixed-length vector of floating-point numbers that captures the semantic meaning of a piece of data.
Computers do not understand that "cat" and "kitten" are related, but their embedding vectors, produced by a model such as text-embedding-3-large or voyage-3, point in nearly the same direction in a high-dimensional space. The embedding model has learned this from reading billions of sentences (Google AI blog on word2vec, 2013; OpenAI embedding guide, 2024).
Embeddings turn the fuzzy idea of "meaning" into math you can index, cluster, and search.
Similarity is usually measured by cosine similarity, the cosine of the angle between two vectors: 1.0 = same direction (near-identical meaning), 0 = unrelated, -1 = opposite direction.
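To make the cosine scale concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions and come from a model), and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", invented for this example only.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.0, 0.95],
}


def rank_by_similarity(query, vectors):
    """Return (name, score) pairs sorted from most to least similar to query."""
    scores = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_similarity(toy_vectors["cat"], toy_vectors):
        print(f"{name}: {score:.3f}")
```

Ranking every stored vector against a query like this is brute-force nearest-neighbor search. It is fine for small collections; larger ones typically rely on an approximate index instead.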
Text and image embeddings both produce vectors, but from different encoders. CLIP (OpenAI, 2021) embeds text and images into the same space, so "a photo of a dog" and an actual dog photo land near each other. That enables cross-modal search, such as finding images with a text query.
Are embeddings reversible? Not directly. There is no built-in way to decode a vector back into its original text, but inversion attacks can sometimes infer sensitive info, so treat stored vectors as sensitive data.
What is a vector database? A database optimized for nearest-neighbor search over vectors. Examples: pgvector (a PostgreSQL extension), Pinecone, Weaviate.
Are embeddings model-specific? Yes — you cannot mix vectors from different models. Re-embed everything if you switch.
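How big is an embedding? It depends on the model's dimensions. For example, 1536 floats at 4 bytes each (float32) = ~6 KB per document. A million documents = ~6 GB, before any index overhead.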
Do embeddings cost money? Yes, but cheap — usually $0.02-0.13 per million tokens.
Can embeddings replace a database? No — they complement keyword search and structured queries.
Do images use the same embeddings as text? Only with multimodal models like CLIP.
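The sizing and cost answers above are simple arithmetic, sketched below so the numbers can be rerun for a different corpus. The sketch assumes 4-byte float32 values and an average of 500 tokens per document. Both figures are illustrative assumptions, not values from any provider, and the function names are hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def index_size_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for the vectors alone, excluding index and metadata overhead."""
    return num_vectors * dimensions * bytes_per_value


def embedding_cost_usd(total_tokens, usd_per_million_tokens):
    """One-time cost of embedding total_tokens at a given per-million-token price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    docs = 1_000_000
    dims = 1536
    avg_tokens_per_doc = 500  # hypothetical average, adjust for your corpus

    per_doc = index_size_bytes(1, dims)    # 6144 bytes, about 6 KB
    corpus = index_size_bytes(docs, dims)  # 6,144,000,000 bytes, about 6 GB
    print(f"per document: {per_doc / 1024:.1f} KiB")
    print(f"whole corpus: {corpus / 1e9:.2f} GB")

    total_tokens = docs * avg_tokens_per_doc
    for price in (0.02, 0.13):  # the price range quoted in the text
        cost = embedding_cost_usd(total_tokens, price)
        print(f"at ${price}/M tokens: ${cost:.2f}")
```

Under these assumptions, embedding a million documents costs on the order of tens of dollars. Because vectors from different models cannot be mixed, switching models means paying that cost again to re-embed everything.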
Embeddings are the quiet engine behind search, RAG, and recommendations. Master them and most AI products become simple.