Test for async toolbox, docs for toolboxes in general

Closes #1090, refs #997
This commit is contained in:
Simon Willison 2025-05-26 10:23:03 -07:00
parent 00f44a848a
commit e23e13e6c7
4 changed files with 137 additions and 88 deletions

@@ -283,9 +283,9 @@ See also [the llm tag](https://simonwillison.net/tags/llm/) on my blog.
* [Response.fake()](https://llm.datasette.io/en/stable/plugins/plugin-utilities.html#response-fake)
* [Python API](https://llm.datasette.io/en/stable/python-api.html)
* [Basic prompt execution](https://llm.datasette.io/en/stable/python-api.html#basic-prompt-execution)
* [Tools](https://llm.datasette.io/en/stable/python-api.html#tools)
* [System prompts](https://llm.datasette.io/en/stable/python-api.html#system-prompts)
* [Attachments](https://llm.datasette.io/en/stable/python-api.html#attachments)
* [Tools](https://llm.datasette.io/en/stable/python-api.html#tools)
* [Schemas](https://llm.datasette.io/en/stable/python-api.html#schemas)
* [Fragments](https://llm.datasette.io/en/stable/python-api.html#fragments)
* [Model options](https://llm.datasette.io/en/stable/python-api.html#model-options)

@@ -87,42 +87,15 @@ def register_tools(register):
register(count_char, name="count_character_in_word")
```
Functions are useful for simple tools, but some tools may have more advanced needs. You can also define tools as a class (known as a "toolbox"), which provides the following advantages:
Tools can also be implemented as classes, as described in {ref}`Toolbox classes <python-api-toolbox>` in the Python API documentation.
- Toolbox tools can bundle multiple tools together
- Toolbox tools can be configured, e.g. to give filesystem tools access to a specific directory
- Toolbox instances can persist shared state in between tool invocations
You can register classes like the `Memory` example from there by passing the class (_not_ an instance of the class) to `register()`:
Toolboxes are classes that extend `llm.Toolbox`. Any methods that do not begin with an underscore will be exposed as tool functions.
This example sets up key/value memory storage that can be used by the model:
```python
import llm


class Memory(llm.Toolbox):
    _memory = None

    def _get_memory(self):
        if self._memory is None:
            self._memory = {}
        return self._memory

    def set(self, key: str, value: str):
        "Set something as a key"
        self._get_memory()[key] = value

    def get(self, key: str):
        "Get something from a key"
        return self._get_memory().get(key) or ""

    def append(self, key: str, value: str):
        "Append something to a key"
        memory = self._get_memory()
        memory[key] = (memory.get(key) or "") + "\n" + value

    def keys(self):
        "Return a list of keys"
        return list(self._get_memory().keys())
...
@llm.hookimpl
def register_tools(register):
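The hunk is truncated here, but the rule above - pass the class, not an instance - suggests `register()` treats toolbox classes specially. A toy sketch of that distinction (the `register()` body and the `ClassName_method` naming below are assumptions for illustration, not llm's actual code):

```python
import inspect

registered = []

def register(tool, name=None):
    # Hypothetical register(): a class is treated as a toolbox whose
    # public methods each become a tool named ClassName_method
    if inspect.isclass(tool):
        for attr, value in vars(tool).items():
            if callable(value) and not attr.startswith("_"):
                registered.append(f"{tool.__name__}_{attr}")
    else:
        registered.append(name or tool.__name__)

class Memory:
    def set(self, key: str, value: str): ...
    def get(self, key: str): ...
    def _get_memory(self): ...

def count_char(text: str, character: str) -> int:
    return text.count(character)

register(Memory)  # the class itself, not Memory()
register(count_char, name="count_character_in_word")
print(registered)  # ['Memory_set', 'Memory_get', 'count_character_in_word']
```

The `Tools_go` tool name used by the async toolbox test in this same commit matches this `ClassName_method` convention.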

@@ -45,63 +45,6 @@ If you have set a `OPENAI_API_KEY` environment variable you can omit the `model.
Calling `llm.get_model()` with an invalid model ID will raise a `llm.UnknownModelError` exception.
(python-api-tools)=
### Tools
{ref}`Tools <tools>` are functions that can be executed by the model as part of a chain of responses.
You can define tools in Python code - with a docstring to describe what they do - and then pass them to the `model.prompt()` method using the `tools=` keyword argument. If the model decides to request a tool call, the `response.tool_calls()` method shows what the model wants to execute:
```python
import llm


def upper(text: str) -> str:
    """Convert text to uppercase."""
    return text.upper()


model = llm.get_model("gpt-4.1-mini")
response = model.prompt("Convert panda to upper", tools=[upper])
tool_calls = response.tool_calls()
# [ToolCall(name='upper', arguments={'text': 'panda'}, tool_call_id='...')]
```
You can call `response.execute_tool_calls()` to execute those calls and get back the results:
```python
tool_results = response.execute_tool_calls()
# [ToolResult(name='upper', output='PANDA', tool_call_id='...')]
```
To pass the results of the tool calls back to the model you need to use a utility method called `model.chain()`:
```python
chain_response = model.chain(
    "Convert panda to upper",
    tools=[upper],
)
print(chain_response.text())
# The word "panda" converted to uppercase is "PANDA".
```
You can also loop through the `model.chain()` response to get a stream of tokens, like this:
```python
for chunk in model.chain(
    "Convert panda to upper",
    tools=[upper],
):
    print(chunk, end="", flush=True)
```
This will stream each response in the chain in turn as it is generated.
You can access the individual responses that make up the chain using `chain.responses()`. This can be iterated over as the chain executes like this:
```python
chain = model.chain(
    "Convert panda to upper",
    tools=[upper],
)
for response in chain.responses():
    print(response.prompt)
    for chunk in response:
        print(chunk, end="", flush=True)
```
(python-api-system-prompts)=
### System prompts
@@ -148,6 +91,123 @@ if "image/jpeg" in model.attachment_types:
...
```
(python-api-tools)=
### Tools
{ref}`Tools <tools>` are functions that can be executed by the model as part of a chain of responses.
You can define tools in Python code - with a docstring to describe what they do - and then pass them to the `model.prompt()` method using the `tools=` keyword argument. If the model decides to request a tool call, the `response.tool_calls()` method shows what the model wants to execute:
```python
import llm


def upper(text: str) -> str:
    """Convert text to uppercase."""
    return text.upper()


model = llm.get_model("gpt-4.1-mini")
response = model.prompt("Convert panda to upper", tools=[upper])
tool_calls = response.tool_calls()
# [ToolCall(name='upper', arguments={'text': 'panda'}, tool_call_id='...')]
```
You can call `response.execute_tool_calls()` to execute those calls and get back the results:
```python
tool_results = response.execute_tool_calls()
# [ToolResult(name='upper', output='PANDA', tool_call_id='...')]
```
You can use the `model.chain()` method to pass the results of tool calls back to the model automatically as subsequent prompts:
```python
chain_response = model.chain(
    "Convert panda to upper",
    tools=[upper],
)
print(chain_response.text())
# The word "panda" converted to uppercase is "PANDA".
```
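Under the hood, `model.chain()` is effectively a loop: prompt the model, execute any tool calls it requests, feed the outputs back in as the next prompt, and stop when no further tools are requested. A toy sketch of that loop (the `FakeModel` here is a stand-in for illustration, not part of llm):

```python
def upper(text: str) -> str:
    return text.upper()

class FakeModel:
    """Toy stand-in for a real model: the first turn requests a tool call,
    the second turn answers using the tool result."""
    def prompt(self, prompt, tools=None):
        if isinstance(prompt, str):
            return {"tool_calls": [("upper", {"text": "panda"})], "text": None}
        (result,) = prompt
        return {"tool_calls": [], "text": f"Uppercased: {result}"}

def chain(model, prompt, tools):
    tool_map = {fn.__name__: fn for fn in tools}
    while True:
        response = model.prompt(prompt, tools=tools)
        calls = response["tool_calls"]
        if not calls:
            return response["text"]
        # Execute each requested tool; the outputs become the next prompt
        prompt = [tool_map[name](**args) for name, args in calls]

print(chain(FakeModel(), "Convert panda to upper", [upper]))  # Uppercased: PANDA
```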
You can also loop through the `model.chain()` response to get a stream of tokens, like this:
```python
for chunk in model.chain(
    "Convert panda to upper",
    tools=[upper],
):
    print(chunk, end="", flush=True)
```
This will stream each response in the chain in turn as it is generated.
You can access the individual responses that make up the chain using `chain.responses()`. This can be iterated over as the chain executes like this:
```python
chain = model.chain(
    "Convert panda to upper",
    tools=[upper],
)
for response in chain.responses():
    print(response.prompt)
    for chunk in response:
        print(chunk, end="", flush=True)
```
(python-api-toolbox)=
#### Toolbox classes
Functions are useful for simple tools, but some tools may have more advanced needs. You can also define tools as a class (known as a "toolbox"), which provides the following advantages:
- Toolbox tools can bundle multiple tools together
- Toolbox tools can be configured, e.g. to give filesystem tools access to a specific directory
- Toolbox instances can persist shared state in between tool invocations
Toolboxes are classes that extend `llm.Toolbox`. Any methods that do not begin with an underscore will be exposed as tool functions.
This example sets up key/value memory storage that can be used by the model:
```python
import llm


class Memory(llm.Toolbox):
    _memory = None

    def _get_memory(self):
        if self._memory is None:
            self._memory = {}
        return self._memory

    def set(self, key: str, value: str):
        "Set something as a key"
        self._get_memory()[key] = value

    def get(self, key: str):
        "Get something from a key"
        return self._get_memory().get(key) or ""

    def append(self, key: str, value: str):
        "Append something to a key"
        memory = self._get_memory()
        memory[key] = (memory.get(key) or "") + "\n" + value

    def keys(self):
        "Return a list of keys"
        return list(self._get_memory().keys())
```
You can then use that from Python like this:
```python
model = llm.get_model("gpt-4.1-mini")
memory = Memory()
conversation = model.conversation(tools=[memory])
print(conversation.chain("Set name to Simon", after_call=print).text())
print(memory._memory)
# Should show {'name': 'Simon'}
print(conversation.chain("Set name to Penguin", after_call=print).text())
# Now it should be {'name': 'Penguin'}
print(conversation.chain("Print current name", after_call=print).text())
```
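Conceptually, a toolbox instance exposes each public method as a tool and shares state between invocations. A minimal self-contained sketch of that idea, using a stand-in `ToolboxSketch` base class (an illustration of the pattern, not the real `llm.Toolbox` internals):

```python
import inspect

class ToolboxSketch:
    """Stand-in for llm.Toolbox: collects bound public methods as tools."""
    def _tools(self):
        return {
            name: method
            for name, method in inspect.getmembers(self, inspect.ismethod)
            if not name.startswith("_")
        }

class Memory(ToolboxSketch):
    _memory = None

    def _get_memory(self):
        if self._memory is None:
            self._memory = {}
        return self._memory

    def set(self, key: str, value: str):
        self._get_memory()[key] = value

    def get(self, key: str):
        return self._get_memory().get(key) or ""

memory = Memory()
tools = memory._tools()
tools["set"]("name", "Simon")  # invoking a discovered tool mutates shared state
print(tools["get"]("name"))    # Simon
print(sorted(tools))           # ['get', 'set']
```

Because the bound methods close over one instance, state written by `set` is visible to `get` on the next invocation - the persistence property listed above.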
See the {ref}`register_tools() plugin hook documentation <plugin-hooks-register-tools>` for an example of this tool in action as a CLI plugin.
(python-api-schemas)=
### Schemas
@@ -396,6 +456,7 @@ chain_response = model.chain(
)
print(chain_response.text())
```
This also works for `async def` methods of `llm.Toolbox` subclasses.
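A dispatcher supporting both plain and `async def` tool methods can simply await the result when it turns out to be awaitable. A rough sketch of that dispatch step (assumed behavior for illustration; `call_tool` is not llm's actual code):

```python
import asyncio
import inspect

class Tools:
    # Mirrors the async toolbox test below: one async tool, one sync tool
    async def go(self):
        return "This was async"

    def ping(self):
        return "sync pong"

async def call_tool(instance, name):
    result = getattr(instance, name)()
    # Await only when the method returned an awaitable (i.e. it was async def)
    if inspect.isawaitable(result):
        result = await result
    return result

async def main():
    tools = Tools()
    print(await call_tool(tools, "go"))    # This was async
    print(await call_tool(tools, "ping"))  # sync pong

asyncio.run(main())
```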
### Tool use for async models

@@ -173,6 +173,21 @@ async def test_async_tools_run_tools_in_parallel():
    assert delta_ns < (100_000_000 * 0.2)


@pytest.mark.asyncio
async def test_async_toolbox():
    class Tools(llm.Toolbox):
        async def go(self):
            return "This was async"

    model = llm.get_async_model("echo")
    chain_response = model.chain(
        json.dumps({"tool_calls": [{"name": "Tools_go"}]}),
        tools=[Tools()],
    )
    output = await chain_response.text()
    assert '"output": "This was async"' in output


@pytest.mark.vcr
def test_conversation_with_tools(vcr):
    import llm