Mirror of https://github.com/Hopiu/llm.git, synced 2026-05-03 19:34:44 +00:00
Mention brute-force approach, link to vector indexing issue
Refs #216. Closes #214
parent 94f0a1a337
commit f842fbea49
2 changed files with 5 additions and 1 deletion

@@ -285,6 +285,8 @@ llm-docs/plugins/index.md
The `llm similar` command searches a collection of embeddings for the items that are most similar to a given string or item ID.

This currently uses a slow brute-force approach which does not scale well to large collections. See [issue 216](https://github.com/simonw/llm/issues/216) for plans to add a more scalable approach via vector indexes provided by plugins.
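
As a rough illustration of what a brute-force search involves, here is a minimal sketch in plain Python. It is not LLM's actual implementation: the function names `cosine_similarity` and `brute_force_similar` are hypothetical, cosine similarity is assumed as the score, and the tiny two-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_similar(query, items, number=10):
    """Score every stored vector against the query, then keep the best `number`.

    `items` is a hypothetical dict mapping item ID -> embedding vector.
    """
    scored = [
        (item_id, cosine_similarity(query, vector))
        for item_id, vector in items.items()
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:number]


# Toy "embeddings", invented purely for illustration
items = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
results = brute_force_similar([1.0, 0.1], items, number=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

Every query compares against every stored vector, so the work grows linearly with the size of the collection. That is the scaling limit a vector index is meant to avoid.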

To search the `quotations` collection for items that are semantically similar to `'computer science'`:

```bash
```
@@ -116,7 +116,9 @@ if Collection.exists(db, "entries"):
(embeddings-python-similar)=
## Retrieving similar items

Once you have populated a collection of embeddings you can retrieve the entries that are most similar to a given string using the `similar()` method:

Once you have populated a collection of embeddings you can retrieve the entries that are most similar to a given string using the `similar()` method.

This method uses a brute-force approach, calculating distance scores against every document. This is fine for small collections, but will not scale to large collections. See [issue 216](https://github.com/simonw/llm/issues/216) for plans to add a more scalable approach via vector indexes provided by plugins.

```python
for entry in collection.similar("hound"):
```