Embeddings
What Is It?
Embeddings are lists of numbers — called vectors — that represent the meaning of text, images, or other data in a format AI systems can process mathematically. Think of them as coordinates in a vast conceptual space: words or sentences with similar meanings end up close together, while unrelated concepts sit far apart. This makes it possible for a system to answer "find me documents about contract law" by measuring geometric distance between your query and thousands of stored documents — no keyword matching required. Embeddings are the foundation of semantic search, retrieval-augmented generation (RAG), recommendation engines, and duplicate detection.
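The "geometric distance" idea above is usually implemented as cosine similarity between vectors. A minimal sketch, using toy 4-dimensional vectors (real embedding models output hundreds or thousands of dimensions, and the numbers here are purely illustrative):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means
    similar direction (similar meaning), near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (values invented).
query        = [0.9, 0.1, 0.0, 0.2]  # "contract law" query
contract_doc = [0.8, 0.2, 0.1, 0.3]  # a contract-law document
cooking_doc  = [0.1, 0.0, 0.9, 0.7]  # an unrelated recipe

# The relevant document scores higher, so it is retrieved first.
print(cosine_similarity(query, contract_doc) > cosine_similarity(query, cooking_doc))  # → True
```

A semantic search system does exactly this comparison, just at scale: embed every stored document once, embed the query at request time, and return the documents whose vectors score highest.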
Why It Matters
If you're building anything beyond a simple chat interface — a document search tool, a RAG pipeline, a recommendation system — you need an embedding model, and that's a separate selection decision from choosing a generative LLM. Embedding models are typically priced per token of input (not output), so high-volume applications like indexing large document libraries can accumulate significant cost. Quality matters too: a weak embedding model clusters unrelated documents together, which means your retrieval step feeds the generative LLM irrelevant context and degrades final answer quality regardless of how capable that LLM is. Developers should evaluate embedding models on retrieval accuracy benchmarks specific to their domain, not just on generative benchmarks.
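Because embedding models bill on input tokens, indexing cost is easy to estimate up front. A back-of-envelope sketch, with hypothetical figures (document count, average length, and the $0.10-per-million-token price are all assumptions, not quotes from any provider):

```python
def embedding_index_cost(num_docs, avg_tokens_per_doc, price_per_million_tokens):
    """Rough one-time cost to embed a document library.
    Embedding models are priced on input tokens only."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million_tokens

# Hypothetical library: 500k documents averaging 800 tokens each,
# at an assumed $0.10 per million input tokens.
cost = embedding_index_cost(500_000, 800, 0.10)
print(f"${cost:,.2f}")  # → $40.00
```

The same arithmetic applies to re-indexing after a model switch, which is why embedding model choice is sticky: changing models means re-embedding the entire corpus.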
How It Applies
ModelPicker currently tracks 52 active generative AI models across 8 providers, with input pricing ranging from $0.05 to $5.00 per million tokens. Our 12-test benchmark suite and external benchmark data (SWE-bench Verified, MATH Level 5, AIME 2025 via Epoch AI) focus on generative capabilities — reasoning, coding, instruction following — rather than embedding quality, because embedding and generative models serve distinct roles in an AI stack. When you see a model score on our long context benchmark (median score: 5/5 across our tracked models), that reflects how well a generative model handles long inputs, not its embedding capability. If your use case is RAG or semantic search, use ModelPicker to select your generative LLM for answer synthesis, then evaluate dedicated embedding models separately for the retrieval layer.
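Evaluating the retrieval layer separately typically means scoring an embedding model on metrics like recall@k over a labeled query set from your own domain. A minimal sketch of that metric (the document IDs and gold labels below are invented for illustration):

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of known-relevant documents that appear in the
    top-k results returned by an embedding-based retriever."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Hypothetical evaluation: for one test query, the retriever ranked
# documents [7, 2, 9, 4, 1]; the gold labels say {2, 4, 8} are relevant.
score = recall_at_k([7, 2, 9, 4, 1], {2, 4, 8}, k=5)
print(score)  # 2 of the 3 relevant documents were found in the top 5
```

Averaging this score across a few dozen representative queries gives a domain-specific retrieval benchmark that is far more predictive of RAG answer quality than any generative leaderboard.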