34 commits

Author SHA1 Message Date
Dan Turkel
1d5d73481a Filter IDs with --prefix in llm similar (#1052) 2025-05-23 22:54:55 -07:00
Simon Willison
f641b89882 llm similar -p/--plain option, closes #853 2025-03-28 00:36:08 -07:00
Simon Willison
62c90dd472 llm prompt --schema X option and model.prompt(..., schema=) parameter (#777)
Refs #776

* Implemented new llm prompt --schema and model.prompt(schema=)
* Log schema to responses.schema_id and schemas table
* Include schema in llm logs Markdown output
* Test for schema=pydantic_model
* Initial --schema CLI documentation
* Python docs for schema=
* Advanced plugin docs on schemas
2025-02-26 16:58:28 -08:00
Simon Willison
20c18a716d -q multiple option for llm models and llm embed-models
Refs #748
2025-02-13 15:35:18 -08:00
Simon Willison
9a1374b447 llm embed-multi --prepend option (#746)
* llm embed-multi --prepend option

Closes #745
2025-02-12 15:19:18 -08:00
web-sst
6f7ea406bf Register full embedding model names (#654)
Provide backward compatible aliases.
This makes available the same model names that ttok uses.
2025-01-22 20:14:03 -08:00
Simon Willison
8021e12aaa Windows readline fix, plus run CI against macOS and Windows
* Run CI on Windows and macOS as well as Ubuntu, refs #407
* Use pyreadline3 on win32
* Back to fail-fast since we have a bigger matrix now
* Mark some tests as xfail on windows
2024-01-26 16:24:58 -08:00
Simon Willison
8b78ac6099 Fix for bug where embed did not use default model, closes #317 2023-10-31 21:19:59 -07:00
e. alvarez
839b4d7161 Fix issues: #274, #280 (#282)
* Fix issue with reading directories in `iterate_files()` (#280)
* Add directory checking logic in `iterate_files()` (#274)
* Added tests for #282, #274, #280

---------

Co-authored-by: Simon Willison <swillison@gmail.com>
2023-09-18 23:14:30 -07:00
Simon Willison
33dee4762e llm embed-multi --batch-size option, closes #273 2023-09-13 16:33:27 -07:00
Simon Willison
603da35e37 Fixed flaky order tests, refs #271 2023-09-12 11:37:36 -07:00
Simon Willison
22a59f795e llm collections defaults to llm collections list, close #265 2023-09-11 23:08:11 -07:00
Simon Willison
591ad6f571 Revert "Reuse embeddings for hashed content, --store now works on second run - closes #224"
This reverts commit 267e2ea999.

It's broken, see:

https://github.com/simonw/llm/issues/224#issuecomment-1715014393
2023-09-11 22:56:52 -07:00
Simon Willison
267e2ea999 Reuse embeddings for hashed content, --store now works on second run - closes #224 2023-09-11 22:44:22 -07:00
Simon Willison
52cec1304b Binary embeddings (#254)
* Binary embeddings support, refs #253
* Write binary content to content_blob, with tests - refs #253
* supports_text and supports_binary embedding validation, refs #253
2023-09-11 18:58:44 -07:00
Simon Willison
5ba34dbe36 llm embed-db is now llm collections, refs #229 2023-09-10 14:24:27 -07:00
Alexis Métaireau
df32d7685d Updated error message for invalid or missing embedding model (#257)
* Updated error message for missing embedding model

---------

Co-authored-by: Simon Willison <swillison@gmail.com>
2023-09-10 11:56:29 -07:00
Simon Willison
78a0e9bd44 llm --files --encoding option and latin-1 fallback, closes #225 2023-09-04 12:28:31 -07:00
Simon Willison
3bf781fba2 Duplicate content is only embedded once, closes #217 2023-09-03 17:39:11 -07:00
Simon Willison
0eda99e91c Default embedding model finishing touches, closes #222 2023-09-03 17:21:47 -07:00
Simon Willison
b9c19a5666 Tests for multiple --files pairs 2023-09-03 16:40:00 -07:00
Simon Willison
6f62b7d613 Tests for llm embed-multi --files, refs #215 2023-09-03 16:40:00 -07:00
Simon Willison
0da1ed7d98 --remove-default for llm embed-models default, refs #222 2023-09-03 16:40:00 -07:00
Simon Willison
c8c0f80441 --prefix for llm embed-multi, refs #215 2023-09-03 16:40:00 -07:00
Simon Willison
70a3d4bdc4 Test for llm embed-multi against SQLite, refs #215 2023-09-03 16:40:00 -07:00
Simon Willison
5e686fe8b3 Tests for CSV/TSV/JSON/NL, refs #215 2023-09-03 16:40:00 -07:00
Simon Willison
213e0b0c75 embed-db delete-collection command and .delete() method, closes #219 2023-09-03 12:55:48 -07:00
Simon Willison
a5d6b580ba Store content_hash in embeddings table, refs #217
Uses new migrations feature from https://github.com/simonw/sqlite-migrate/issues/9
2023-09-03 10:50:51 -07:00
Simon Willison
26332045dd llm embed --metadata option, closes #209 2023-09-03 07:43:23 -07:00
Simon Willison
73a9043108 Store updated timestamp on embeddings, closes #211 2023-09-02 20:40:33 -07:00
Simon Willison
6b042a264a Fixed -i and -i - modes for llm embed command 2023-09-01 20:24:58 -07:00
Simon Willison
4be89facb5 Fixed and finished llm similar command, closes #190 2023-09-01 19:01:16 -07:00
Simon Willison
3ee92152e8 Error conditions for 'llm similar', refs #190 2023-09-01 18:31:59 -07:00
Simon Willison
77cf56e54a Initial CLI support and plugin hook for embeddings, refs #185
* Embeddings plugin hook + OpenAI implementation
* llm.get_embedding_model(name) function
* llm embed command, for returning embeddings or saving them to SQLite
* Tests using an EmbedDemo embedding model
* llm embed-models list and embed-models default commands
* llm embed-db path and llm embed-db collections commands
2023-08-27 22:24:10 -07:00