(advanced-model-plugins)=
# Advanced model plugins

The {ref}`model plugin tutorial <tutorial-model-plugin>` covers the basics of developing a plugin that adds support for a new model.

This document covers more advanced topics.

(advanced-model-plugins-async)=
## Async models

Plugins can optionally provide an asynchronous version of their model, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html). This is particularly useful for remote models that are accessed over an HTTP API.

The async version of a model subclasses `llm.AsyncModel` instead of `llm.Model`. It must implement an `async def execute()` async generator method instead of `def execute()`.

This example shows a subset of the OpenAI default plugin, illustrating how this method might work:

```python
from typing import AsyncGenerator

import llm


class MyAsyncModel(llm.AsyncModel):
    # This can duplicate the model_id of the sync model:
    model_id = "my-model-id"

    async def execute(
        self, prompt, stream, response, conversation=None
    ) -> AsyncGenerator[str, None]:
        # `client` and `messages` are assumed to be built elsewhere in the
        # plugin: an async API client plus the message list derived from
        # the prompt and conversation.
        if stream:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=True,
            )
            async for chunk in completion:
                yield chunk.choices[0].delta.content
        else:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=False,
            )
            yield completion.choices[0].message.content
```
This async model instance should then be passed to the `register()` method in the `register_models()` plugin hook:

```python
from llm import hookimpl


@hookimpl
def register_models(register):
    register(
        MyModel(), MyAsyncModel(), aliases=("my-model-aliases",)
    )
```
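Once registered, the async model can be used with `llm.get_async_model()` and `await model.prompt()`. The following is a rough sketch of that usage; the prompt text is purely illustrative:

```python
import asyncio

import llm


async def main():
    # Fetch the async variant of the registered model
    model = llm.get_async_model("my-model-id")
    # await model.prompt(...) returns an AsyncResponse
    response = await model.prompt("A short poem about an otter")
    print(await response.text())


asyncio.run(main())
```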
(advanced-model-plugins-attachments)=
## Attachments for multi-modal models

Models such as GPT-4o, Claude 3.5 Sonnet and Google's Gemini 1.5 are multi-modal: they accept input in the form of images and maybe even audio, video and other formats.

LLM calls these **attachments**. Models can specify the types of attachments they accept and then implement special code in the `.execute()` method to handle them.

See {ref}`the Python attachments documentation <python-api-attachments>` for details on using attachments in the Python API.

### Specifying attachment types

A `Model` subclass can list the types of attachments it accepts by defining an `attachment_types` class attribute:

```python
class NewModel(llm.Model):
    model_id = "new-model"
    attachment_types = {
        "image/png",
        "image/jpeg",
        "image/webp",
        "image/gif",
    }
```
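A user could then pass an attachment of one of those types to this model. Here is a sketch using the Python API; the filename is just a placeholder:

```python
import llm

model = llm.get_model("new-model")
response = model.prompt(
    "Describe this image",
    attachments=[
        # The content type (image/png here) is detected from the file
        # and checked against the model's attachment_types
        llm.Attachment(path="photo.png")
    ],
)
print(response.text())
```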
These content types are detected when an attachment is passed to LLM using `llm -a filename`, or can be specified by the user using the `--attachment-type filename image/png` option.

**Note:** MP3 files will have their attachment type detected as `audio/mpeg`, not `audio/mp3`.

LLM will use the `attachment_types` attribute to validate that provided attachments should be accepted before passing them to the model.
### Handling attachments

The `prompt` object passed to the `execute()` method will have an `attachments` attribute containing a list of `Attachment` objects provided by the user.

An `Attachment` instance has the following properties:

- `url (str)`: The URL of the attachment, if it was provided as a URL
- `path (str)`: The resolved file path of the attachment, if it was provided as a file
- `type (str)`: The content type of the attachment, if it was provided
- `content (bytes)`: The binary content of the attachment, if it was provided

Generally only one of `url`, `path` or `content` will be set.

You should usually access the type and the content through one of these methods:

- `attachment.resolve_type() -> str`: Returns the `type` if it is available, otherwise attempts to guess the type by looking at the first few bytes of content
- `attachment.content_bytes() -> bytes`: Returns the binary content, which it may need to read from a file or fetch from a URL
- `attachment.base64_content() -> str`: Returns that content as a base64-encoded string

An `id()` method returns a database ID for this content, which is either a SHA256 hash of the binary content or, in the case of attachments hosted at an external URL, a hash of `{"url": url}` instead. This is an implementation detail which you should not need to access directly.

Note that it's possible for a prompt with attachments to not include a text prompt at all, in which case `prompt.prompt` will be `None`.
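As a minimal illustration of those methods (a sketch only; the dictionary keys are hypothetical, not any real provider's format), a plugin that always sends attachments as base64 data might convert each one like this:

```python
def attachment_as_dict(attachment):
    # resolve_type() uses the declared type, or sniffs the first few
    # bytes of the content if no explicit type was provided
    return {
        "media_type": attachment.resolve_type(),
        # base64_content() reads from the path or fetches the URL as needed
        "data": attachment.base64_content(),
    }
```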
Here's how the OpenAI plugin handles attachments, including the case where no `prompt.prompt` was provided:

```python
if not prompt.attachments:
    messages.append({"role": "user", "content": prompt.prompt})
else:
    attachment_message = []
    if prompt.prompt:
        attachment_message.append({"type": "text", "text": prompt.prompt})
    for attachment in prompt.attachments:
        attachment_message.append(_attachment(attachment))
    messages.append({"role": "user", "content": attachment_message})


# And the code for creating the attachment message
def _attachment(attachment):
    url = attachment.url
    base64_content = ""
    if not url or attachment.resolve_type().startswith("audio/"):
        base64_content = attachment.base64_content()
        url = f"data:{attachment.resolve_type()};base64,{base64_content}"
    if attachment.resolve_type().startswith("image/"):
        return {"type": "image_url", "image_url": {"url": url}}
    else:
        format_ = "wav" if attachment.resolve_type() == "audio/wav" else "mp3"
        return {
            "type": "input_audio",
            "input_audio": {
                "data": base64_content,
                "format": format_,
            },
        }
```
As you can see, it uses `attachment.url` if that is available and otherwise falls back to using the `base64_content()` method to embed the image directly in the JSON sent to the API. For the OpenAI API, audio attachments are always included as base64-encoded strings.

### Attachments from previous conversations

Models that implement the ability to continue a conversation can reconstruct the previous message JSON using the `response.attachments` attribute.

Here's how the OpenAI plugin does that:

```python
for prev_response in conversation.responses:
    if prev_response.attachments:
        attachment_message = []
        if prev_response.prompt.prompt:
            attachment_message.append(
                {"type": "text", "text": prev_response.prompt.prompt}
            )
        for attachment in prev_response.attachments:
            attachment_message.append(_attachment(attachment))
        messages.append({"role": "user", "content": attachment_message})
    else:
        messages.append(
            {"role": "user", "content": prev_response.prompt.prompt}
        )
    messages.append({"role": "assistant", "content": prev_response.text()})
```