(python-api)=
# Python API
LLM provides a Python API for executing prompts, in addition to the command-line interface.
Understanding this API is also important for writing {ref}`plugins`.
## Basic prompt execution
To run a prompt against the `gpt-4o-mini` model, run this:
```python
import llm
model = llm.get_model("gpt-4o-mini")
# Optional, you can configure the key in other ways:
model.key = "sk-..."
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())
```
The `llm.get_model()` function accepts model IDs or aliases. You can also call it with no arguments to use the currently configured default model, which is `gpt-4o-mini` if you have not changed the default.
In this example the key is set by Python code. You can also provide the key using the `OPENAI_API_KEY` environment variable, or use the `llm keys set openai` command to store it in a `keys.json` file, see {ref}`api-keys`.
The `__str__()` method of `response` also returns the text of the response, so you can do this instead:
```python
print(llm.get_model().prompt("Five surprising names for a pet pelican"))
```
You can run this command to see a list of available models and their aliases:
```bash
llm models
```
If you have set an `OPENAI_API_KEY` environment variable you can omit the `model.key = ` line.
Calling `llm.get_model()` with an invalid model ID will raise a `llm.UnknownModelError` exception.
(python-api-system-prompts)=
### System prompts
For models that accept a system prompt, pass it as `system="..."`:
```python
response = model.prompt(
    "Five surprising names for a pet pelican",
    system="Answer like GlaDOS"
)
```
(python-api-attachments)=
### Attachments
Models that accept multi-modal input (images, audio, video, etc.) can be passed attachments using the `attachments=` keyword argument. This accepts a list of `llm.Attachment()` instances.
This example shows two attachments - one from a file path and one from a URL:
```python
import llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt(
    "Describe these images",
    attachments=[
        llm.Attachment(path="pelican.jpg"),
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
    ]
)
```
Use `llm.Attachment(content=b"binary image content here")` to pass binary content directly.
### Model options
For models that support options (view those with `llm models --options`) you can pass options as keyword arguments to the `.prompt()` method:
```python
model = llm.get_model()
print(model.prompt("Names for otters", temperature=0.2))
```
### Models from plugins
Any models you have installed as plugins will also be available through this mechanism, for example to use Anthropic's Claude 3.5 Sonnet model with [llm-claude-3](https://github.com/simonw/llm-claude-3):
```bash
pip install llm-claude-3
```
Then in your Python code:
```python
import llm
model = llm.get_model("claude-3.5-sonnet")
# Use this if you have not set the key using 'llm keys set claude':
model.key = 'YOUR_API_KEY_HERE'
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())
```
Some models do not use API keys at all.
(python-api-listing-models)=
### Listing models
The `llm.get_models()` function returns a list of all available models, including those from plugins.
```python
import llm
for model in llm.get_models():
    print(model.model_id)
```
Use `llm.get_async_models()` to list async models:
```python
for model in llm.get_async_models():
    print(model.model_id)
```
### Streaming responses
For models that support it, you can stream responses as they are generated, like this:
```python
response = model.prompt("Five diabolical names for a pet goat")
for chunk in response:
    print(chunk, end="")
```
The `response.text()` method described earlier does this for you - it runs through the iterator and gathers the results into a string.
If a response has been evaluated, `response.text()` will continue to return the same string.
(python-api-async)=
## Async models
Some plugins provide async versions of their supported models, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html).
To use an async model, use the `llm.get_async_model()` function instead of `llm.get_model()`:
```python
import llm
model = llm.get_async_model("gpt-4o")
```
You can then run a prompt using `await model.prompt(...)`:
```python
response = await model.prompt(
    "Five surprising names for a pet pelican"
)
print(await response.text())
```
Or use `async for chunk in ...` to stream the response as it is generated:
```python
async for chunk in model.prompt(
    "Five surprising names for a pet pelican"
):
    print(chunk, end="", flush=True)
```
(python-api-conversations)=
## Conversations
LLM supports *conversations*, where you ask follow-up questions of a model as part of an ongoing conversation.
To start a new conversation, use the `model.conversation()` method:
```python
model = llm.get_model()
conversation = model.conversation()
```
You can then use the `conversation.prompt()` method to execute prompts against this conversation:
```python
response = conversation.prompt("Five fun facts about pelicans")
print(response.text())
```
This works exactly the same as the `model.prompt()` method, except that the conversation will be maintained across multiple prompts. So if you run this next:
```python
response2 = conversation.prompt("Now do skunks")
print(response2.text())
```
You will get back five fun facts about skunks.
The `conversation.prompt()` method supports attachments as well:
```python
response = conversation.prompt(
    "Describe these birds",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg")
    ]
)
```
Access `conversation.responses` for a list of all of the responses that have so far been returned during the conversation.
(python-api-response-on-done)=
## Running code when a response has completed
For some applications, such as tracking token usage, it may be useful to execute code as soon as a response has finished being generated.
You can do this using the `response.on_done(callback)` method, which causes your callback function to be called as soon as the response has finished (all tokens have been returned).
The signature of the callback you provide is `def callback(response)` - it can optionally be an `async def` function when working with asynchronous models.
Example usage:
```python
import llm
model = llm.get_model("gpt-4o-mini")
response = model.prompt("a poem about a hippo")
response.on_done(lambda response: print(response.usage()))
print(response.text())
```
Which outputs:
```
Usage(input=20, output=494, details={})
In a sunlit glade by a bubbling brook,
Lived a hefty hippo, with a curious look.
...
```
Or using an async model, where you need to `await response.on_done(done)` to queue up the callback:
```python
import asyncio, llm
async def run():
    model = llm.get_async_model("gpt-4o-mini")
    response = model.prompt("a short poem about a brick")

    async def done(response):
        print(await response.usage())
        print(await response.text())

    await response.on_done(done)
    print(await response.text())
asyncio.run(run())
```
## Other functions
The `llm` top level package includes some useful utility functions.
### set_alias(alias, model_id)
The `llm.set_alias()` function can be used to define a new alias:
```python
import llm
llm.set_alias("mini", "gpt-4o-mini")
```
The second argument can be a model identifier or another alias, in which case that alias will be resolved.
If the `aliases.json` file does not exist or contains invalid JSON it will be created or overwritten.
### remove_alias(alias)
Removes the alias with the given name from the `aliases.json` file.
Raises `KeyError` if the alias does not exist.
```python
import llm
llm.remove_alias("turbo")
```
### set_default_model(alias)
This sets the default model to the given model ID or alias. Any changes to defaults will be persisted in the LLM configuration folder, and will affect all programs using LLM on the system, including the `llm` CLI tool.
```python
import llm
llm.set_default_model("claude-3.5-sonnet")
```
### get_default_model()
This returns the currently configured default model, or `gpt-4o-mini` if no default has been set.
```python
import llm
model_id = llm.get_default_model()
```
To detect if no default has been set you can use this pattern:
```python
if llm.get_default_model(default=None) is None:
    print("No default has been set")
```
Here the `default=` parameter specifies the value that should be returned if there is no configured default.
### set_default_embedding_model(alias) and get_default_embedding_model()
These two methods work the same as `set_default_model()` and `get_default_model()` but for the default {ref}`embedding model <embeddings>` instead.