(advanced-model-plugins)=
Advanced model plugins
The {ref}`model plugin tutorial <tutorial-model-plugin>` covers the basics of developing a plugin that adds support for a new model.
This document covers more advanced topics.
(advanced-model-plugins-async)=
Async models
Plugins can optionally provide an asynchronous version of their model, suitable for use with Python asyncio. This is particularly useful for remote models accessible by an HTTP API.
The async version of a model subclasses llm.AsyncModel instead of llm.Model. It must implement an async def execute() async generator method instead of def execute().
This example shows a subset of the OpenAI default plugin illustrating how this method might work:
from typing import AsyncGenerator
import llm

class MyAsyncModel(llm.AsyncModel):
    # This can duplicate the model_id of the sync model:
    model_id = "my-model-id"

    async def execute(
        self, prompt, stream, response, conversation=None
    ) -> AsyncGenerator[str, None]:
        # client and messages are set up elsewhere in the plugin (omitted here)
        if stream:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=True,
            )
            async for chunk in completion:
                content = chunk.choices[0].delta.content
                # Some chunks carry no text, so skip those
                if content is not None:
                    yield content
        else:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=False,
            )
            yield completion.choices[0].message.content
This async model instance should then be passed to the register() method in the register_models() plugin hook:
from llm import hookimpl

@hookimpl
def register_models(register):
    register(
        MyModel(), MyAsyncModel(), aliases=("my-model-aliases",)
    )
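Consuming an async generator such as execute() follows the usual asyncio pattern. The sketch below is self-contained and does not import llm: EchoAsyncModel and collect_chunks are hypothetical names, and the class only mimics the shape of an llm.AsyncModel subclass by echoing the prompt back instead of calling an HTTP API.

```python
import asyncio
from typing import AsyncGenerator, List


class EchoAsyncModel:
    """Hypothetical stand-in; a real plugin would subclass llm.AsyncModel."""

    model_id = "echo-async"

    async def execute(
        self, prompt: str, stream: bool, response=None, conversation=None
    ) -> AsyncGenerator[str, None]:
        # Assumption: prompt is a plain string here, not LLM's prompt object
        words = prompt.split()
        if stream:
            for index, word in enumerate(words):
                # Stands in for awaiting the next chunk from a remote API
                await asyncio.sleep(0)
                yield word if index == 0 else " " + word
        else:
            await asyncio.sleep(0)
            yield " ".join(words)


async def collect_chunks(model: EchoAsyncModel, prompt: str, stream: bool) -> List[str]:
    """Gather every chunk yielded by the async generator into a list."""
    chunks = []
    async for chunk in model.execute(prompt, stream):
        chunks.append(chunk)
    return chunks


streamed = asyncio.run(collect_chunks(EchoAsyncModel(), "hello async world", stream=True))
single = asyncio.run(collect_chunks(EchoAsyncModel(), "hello async world", stream=False))
print(streamed)  # ['hello', ' async', ' world']
print(single)    # ['hello async world']
```

Both modes use yield, so the caller can treat streaming and non-streaming responses the same way: the non-streaming branch produces a single chunk.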
(advanced-model-plugins-attachments)=
Attachments for multi-modal models
Models such as GPT-4o, Claude 3.5 Sonnet and Google's Gemini 1.5 are multi-modal: they accept images as input, and some also accept audio, video and other formats.
LLM calls these attachments. Models can specify the types of attachments they accept and then implement special code in the .execute() method to handle them.
See {ref}`the Python attachments documentation <python-api-attachments>` for details on using attachments in the Python API.
Specifying attachment types
A Model subclass can list the types of attachments it accepts by defining an attachment_types class attribute:
class NewModel(llm.Model):
    model_id = "new-model"
    attachment_types = {
        "image/png",
        "image/jpeg",
        "image/webp",
        "image/gif",
    }
These content types are detected when an attachment is passed to LLM using llm -a filename, or can be specified by the user using the --attachment-type filename image/png option.
Note: MP3 files will have their attachment type detected as audio/mpeg, not audio/mp3.
LLM will use the attachment_types attribute to validate that provided attachments should be accepted before passing them to the model.
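The validation step amounts to a set-membership check. The following is a minimal sketch of that idea, not LLM's actual implementation: the function name check_attachment_types and the error message are assumptions made for illustration.

```python
from typing import Iterable, Set

IMAGE_TYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}


def check_attachment_types(accepted: Set[str], provided: Iterable[str]) -> None:
    """Raise ValueError if any provided content type is not in the accepted set.

    Hypothetical helper: LLM performs its own validation internally.
    """
    rejected = sorted(set(provided) - accepted)
    if rejected:
        raise ValueError(
            "This model does not support attachments of type: " + ", ".join(rejected)
        )


# A PNG passes for a model that accepts images
check_attachment_types(IMAGE_TYPES, ["image/png"])

# A model that lists "audio/mp3" would reject MP3 files, because they are
# detected as "audio/mpeg"
try:
    check_attachment_types({"audio/mp3"}, ["audio/mpeg"])
    mp3_rejected = False
except ValueError as error:
    mp3_rejected = True
    print(error)  # This model does not support attachments of type: audio/mpeg
```

This is why a model that accepts MP3 audio should list audio/mpeg in attachment_types.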
Handling attachments
The prompt object passed to the execute() method will have an attachments attribute containing a list of Attachment objects provided by the user.
An Attachment instance has the following properties:
- url (str): The URL of the attachment, if it was provided as a URL
- path (str): The resolved file path of the attachment, if it was provided as a file
- type (str): The content type of the attachment, if it was provided
- content (bytes): The binary content of the attachment, if it was provided
Generally only one of url, path or content will be set.
You should usually access the type and the content through one of these methods:
- attachment.resolve_type() -> str: Returns the type if it is available, otherwise attempts to guess the type by looking at the first few bytes of content
- attachment.content_bytes() -> bytes: Returns the binary content, which it may need to read from a file or fetch from a URL
- attachment.base64_content() -> str: Returns that content as a base64-encoded string
An id() method returns a database ID for this content, which is either a SHA256 hash of the binary content or, in the case of attachments hosted at an external URL, a hash of {"url": url} instead. This is an implementation detail which you should not need to access directly.
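To make the behavior of these properties and methods concrete, here is a rough stand-in written from the descriptions above. SketchAttachment is a hypothetical class, not LLM's Attachment: it recognizes only a few file signatures, omits fetching from a URL, and its exact hashing of {"url": url} via json.dumps is an assumption.

```python
import base64
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

# A few well-known file signatures; real type detection covers far more formats
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


@dataclass
class SketchAttachment:
    """Hypothetical stand-in that mirrors the documented attributes."""

    type: Optional[str] = None
    path: Optional[str] = None
    url: Optional[str] = None
    content: Optional[bytes] = None

    def content_bytes(self) -> bytes:
        if self.content is not None:
            return self.content
        if self.path is not None:
            with open(self.path, "rb") as f:
                return f.read()
        raise ValueError("Fetching from a URL is omitted in this sketch")

    def resolve_type(self) -> str:
        if self.type:
            return self.type
        # Guess from the first few bytes of the content
        head = self.content_bytes()[:16]
        for signature, content_type in _SIGNATURES:
            if head.startswith(signature):
                return content_type
        raise ValueError("Could not determine attachment type")

    def base64_content(self) -> str:
        return base64.b64encode(self.content_bytes()).decode("utf-8")

    def id(self) -> str:
        if self.content is None and self.path is None and self.url:
            # URL-hosted attachment: hash a JSON document rather than the bytes
            payload = json.dumps({"url": self.url}).encode("utf-8")
        else:
            payload = self.content_bytes()
        return hashlib.sha256(payload).hexdigest()


png = SketchAttachment(content=b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
print(png.resolve_type())  # image/png
print(len(png.id()))       # 64
```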
Note that it's possible for a prompt with attachments to not include a text prompt at all, in which case prompt.prompt will be None.
Here's how the OpenAI plugin handles attachments, including the case where no prompt.prompt was provided:
if not prompt.attachments:
    messages.append({"role": "user", "content": prompt.prompt})
else:
    attachment_message = []
    if prompt.prompt:
        attachment_message.append({"type": "text", "text": prompt.prompt})
    for attachment in prompt.attachments:
        attachment_message.append(_attachment(attachment))
    messages.append({"role": "user", "content": attachment_message})

# And the code for creating the attachment message
def _attachment(attachment):
    url = attachment.url
    base64_content = ""
    if not url or attachment.resolve_type().startswith("audio/"):
        base64_content = attachment.base64_content()
        url = f"data:{attachment.resolve_type()};base64,{base64_content}"
    if attachment.resolve_type().startswith("image/"):
        return {"type": "image_url", "image_url": {"url": url}}
    else:
        format_ = "wav" if attachment.resolve_type() == "audio/wav" else "mp3"
        return {
            "type": "input_audio",
            "input_audio": {
                "data": base64_content,
                "format": format_,
            },
        }
As you can see, it uses attachment.url if that is available and otherwise falls back to using the base64_content() method to embed the image directly in the JSON sent to the API. For the OpenAI API, audio attachments are always included as base64-encoded strings.
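For an image with no URL, the embedded value is a data: URL built from the content type and the base64-encoded bytes. A small standalone sketch of that construction follows. The helper name data_url is hypothetical, and the six-byte GIF header stands in for a real image file.

```python
import base64


def data_url(content_type: str, content: bytes) -> str:
    """Build a data: URL of the form data:<type>;base64,<encoded bytes>."""
    encoded = base64.b64encode(content).decode("utf-8")
    return f"data:{content_type};base64,{encoded}"


# Only the GIF header bytes, used here as placeholder image content
image_part = {
    "type": "image_url",
    "image_url": {"url": data_url("image/gif", b"GIF89a")},
}
print(image_part["image_url"]["url"])  # data:image/gif;base64,R0lGODlh
```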
Attachments from previous conversations
Models that implement the ability to continue a conversation can reconstruct the previous message JSON using the response.attachments attribute.
Here's how the OpenAI plugin does that:
for prev_response in conversation.responses:
    if prev_response.attachments:
        attachment_message = []
        if prev_response.prompt.prompt:
            attachment_message.append(
                {"type": "text", "text": prev_response.prompt.prompt}
            )
        for attachment in prev_response.attachments:
            attachment_message.append(_attachment(attachment))
        messages.append({"role": "user", "content": attachment_message})
    else:
        messages.append(
            {"role": "user", "content": prev_response.prompt.prompt}
        )
    messages.append({"role": "assistant", "content": prev_response.text()})
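The result of this loop is an alternating list of user and assistant messages, to which the new prompt is then appended. The sketch below illustrates that shape for a text-only conversation. fake_response and build_messages are hypothetical helpers, and the SimpleNamespace objects only imitate the attributes used above (prompt.prompt, attachments and text()).

```python
from types import SimpleNamespace
from typing import Dict, List


def fake_response(prompt_text: str, reply: str) -> SimpleNamespace:
    """Hypothetical stand-in for a stored response from an earlier turn."""
    return SimpleNamespace(
        prompt=SimpleNamespace(prompt=prompt_text),
        attachments=[],
        text=lambda: reply,
    )


def build_messages(previous, new_prompt: str) -> List[Dict[str, str]]:
    """Replay earlier turns in order, then add the new user prompt."""
    messages = []
    for prev_response in previous:
        messages.append({"role": "user", "content": prev_response.prompt.prompt})
        messages.append({"role": "assistant", "content": prev_response.text()})
    messages.append({"role": "user", "content": new_prompt})
    return messages


history = [
    fake_response("Name a color", "Blue"),
    fake_response("Another", "Green"),
]
messages = build_messages(history, "One more")
print([m["role"] for m in messages])
# ['user', 'assistant', 'user', 'assistant', 'user']
```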