Release 0.17a0

Refs #587, #590
2026-05-17 10:11:06 +00:00 · 2024-10-28 15:46:52 -07:00
commit ba1ccb3a4a
parent 758ff9ac17
4 changed files with 40 additions and 2 deletions
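
The changelog entry added in this commit notes that file paths and URLs can be attached directly, while binary data piped on standard input "may need" `--attachment-type` to state the MIME type. One way to picture that distinction is the sketch below. It is not LLM's implementation: `resolve_mime_type` is a hypothetical helper, and guessing from the filename extension via the standard library's `mimetypes` module is an assumption made purely for illustration.

```python
import mimetypes
from typing import Optional


def resolve_mime_type(path: Optional[str], explicit_type: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper (not part of LLM): choose a MIME type for an attachment.

    An explicitly supplied type always wins. Otherwise the type is guessed
    from the filename extension. Input from stdin ("-") has no filename,
    so nothing can be guessed and None is returned.
    """
    if explicit_type:
        return explicit_type
    if path is None or path == "-":
        return None
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed


if __name__ == "__main__":
    print(resolve_mime_type("image1.jpg"))       # guessed from the .jpg extension
    print(resolve_mime_type("-"))                # stdin: no name to guess from
    print(resolve_mime_type("-", "image/jpeg"))  # explicit type supplied by the caller
```

Under these assumptions, the stdin case is the only one that returns nothing without an explicit type, which mirrors why the piped example in the changelog passes `image/jpeg` by hand.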
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -1,5 +1,39 @@
 # Changelog

+(v0_17a0)=
+## 0.17a0 (2024-10-28)
+
+Alpha support for **attachments**, allowing multi-modal models to accept images, audio, video and other formats. [#578](https://github.com/simonw/llm/issues/578)
+
+Attachments {ref}`in the CLI <usage-attachments>` can be URLs:
+
+```bash
+llm "describe this image" \
+  -a https://static.simonwillison.net/static/2024/pelicans.jpg
+```
+Or file paths:
+```bash
+llm "extract text" -a image1.jpg -a image2.jpg
+```
+Or binary data, which may need to use `--attachment-type` to specify the MIME type:
+```bash
+cat image | llm "extract text" --attachment-type - image/jpeg
+```
+
+Attachments are also available {ref}`in the Python API <python-api-attachments>`:
+
+```python
+model = llm.get_model("gpt-4o-mini")
+response = model.prompt(
+    "Describe these images",
+    attachments=[
+        llm.Attachment(path="pelican.jpg"),
+        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
+    ]
+)
+```
+Plugins that provide alternative models can support attachments, see {ref}`advanced-model-plugins-attachments` for details.
+
 (v0_16)=
 ## 0.16 (2024-09-12)

--- a/docs/python-api.md
+++ b/docs/python-api.md
@@ -49,6 +49,9 @@ response = model.prompt(
    system="Answer like GlaDOS"
 )
 ```
+
+(python-api-attachments)=
+
 ### Attachments

 Models that accept multi-modal input (images, audio, video etc) can be passed attachments using the `attachments=` keyword argument. This accepts a list of `llm.Attachment()` instances.
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -45,6 +45,7 @@ Some models support options. You can pass these using `-o/--option name value` -
 ```bash
 llm 'Ten names for cheesecakes' -o temperature 1.5
 ```
+(usage-attachments)=
 ### Attachments

 Some models are multi-modal, which means they can accept input in more than just text. GPT-4o and GPT-4o mini can accept images, and models such as Google Gemini 1.5 can accept audio and video as well.
@@ -56,7 +57,7 @@ llm "describe this image" -a https://static.simonwillison.net/static/2024/pelica
 ```
 Attachments can be passed using URLs or file paths, and you can attach more than one attachment to a single prompt:
 ```bash
-llm "describe these images" -a image1.jpg -a image2.jpg
+llm "extract text" -a image1.jpg -a image2.jpg
 ```
 You can also pipe an attachment to LLM by using `-` as the filename:
 ```bash
--- a/setup.py
+++ b/setup.py
@@ -1,7 +1,7 @@
 from setuptools import setup, find_packages
 import os

-VERSION = "0.16"
+VERSION = "0.17a0"


 def get_long_description():