llm/docs/embeddings/index.md
Simon Willison 77cf56e54a
Initial CLI support and plugin hook for embeddings, refs #185
* Embeddings plugin hook + OpenAI implementation
* llm.get_embedding_model(name) function
* llm embed command, for returning embeddings or saving them to SQLite
* Tests using an EmbedDemo embedding model
* llm embed-models list and emeb-models default commands
* llm embed-db path and llm embed-db collections commands
2023-08-27 22:24:10 -07:00

1.1 KiB

(embeddings)=

Embeddings

Embedding models allow you to take a piece of text - a word, sentence, paragraph or even a whole articles, and convert that into an array of floating point numbers.

This floating point array is called an "embedding vector", and works as a numerical representation of the semantic meaning of the content in a many-multi-dimensional space.

By calculating the distance between embedding vectors, we can identify which content is semantically "nearest" to other content.

This can be used to build features like related article lookups. It can also be used to build semantic search, where a user can search for a phrase and get back results that are semantically similar to that phrase even if they do not share any exact keywords.

LLM supports multiple embedding models through {ref}plugins <plugins>. Once installed, an embedding model can be used on the command-line or via the Python API to calculate and store embeddings for content, and then to perform similarity searches against those embeddings.

---
maxdepth: 3
---
cli
writing-plugins
binary