diff --git a/README.md b/README.md index 9a7e18f..abbce54 100644 --- a/README.md +++ b/README.md @@ -283,9 +283,9 @@ See also [the llm tag](https://simonwillison.net/tags/llm/) on my blog. * [Response.fake()](https://llm.datasette.io/en/stable/plugins/plugin-utilities.html#response-fake) * [Python API](https://llm.datasette.io/en/stable/python-api.html) * [Basic prompt execution](https://llm.datasette.io/en/stable/python-api.html#basic-prompt-execution) - * [Tools](https://llm.datasette.io/en/stable/python-api.html#tools) * [System prompts](https://llm.datasette.io/en/stable/python-api.html#system-prompts) * [Attachments](https://llm.datasette.io/en/stable/python-api.html#attachments) + * [Tools](https://llm.datasette.io/en/stable/python-api.html#tools) * [Schemas](https://llm.datasette.io/en/stable/python-api.html#schemas) * [Fragments](https://llm.datasette.io/en/stable/python-api.html#fragments) * [Model options](https://llm.datasette.io/en/stable/python-api.html#model-options) diff --git a/docs/plugins/plugin-hooks.md b/docs/plugins/plugin-hooks.md index dabbb9b..9e00ac9 100644 --- a/docs/plugins/plugin-hooks.md +++ b/docs/plugins/plugin-hooks.md @@ -87,42 +87,15 @@ def register_tools(register): register(count_char, name="count_character_in_word") ``` -Functions are useful for simple tools, but some tools may have more advanced needs. You can also define tools as a class (known as a "toolbox"), which provides the following advantages: +Tools can also be implemented as classes, as described in {ref}`Toolbox classes <python-api-toolbox>` in the Python API documentation. -- Toolbox tools can bundle multiple tools together -- Toolbox tools can be configured, e.g. to give filesystem tools access to a specific directory -- Toolbox instances can persist shared state in between tool invocations +You can register classes like the `Memory` example shown there by passing the class (_not_ an instance of the class) to `register()`: -Toolboxes are classes that extend `llm.Toolbox`.
Any methods that do not begin with an underscore will be exposed as tool functions. - -This example sets up key/value memory storage that can be used by the model: ```python import llm class Memory(llm.Toolbox): - _memory = None - - def _get_memory(self): - if self._memory is None: - self._memory = {} - return self._memory - - def set(self, key: str, value: str): - "Set something as a key" - self._get_memory()[key] = value - - def get(self, key: str): - "Get something from a key" - return self._get_memory().get(key) or "" - - def append(self, key: str, value: str): - "Append something as a key" - memory = self._get_memory() - memory[key] = (memory.get(key) or "") + "\n" + value - - def keys(self): - "Return a list of keys" - return list(self._get_memory().keys()) + ... @llm.hookimpl def register_tools(register): diff --git a/docs/python-api.md b/docs/python-api.md index 774958e..441d181 100644 --- a/docs/python-api.md +++ b/docs/python-api.md @@ -45,63 +45,6 @@ If you have set a `OPENAI_API_KEY` environment variable you can omit the `model. Calling `llm.get_model()` with an invalid model ID will raise a `llm.UnknownModelError` exception. -(python-api-tools)= - -### Tools - -{ref}`Tools ` are functions that can be executed by the model as part of a chain of responses. - -You can define tools in Python code - with a docstring to describe what they do - and then pass them to the `model.prompt()` method using the `tools=` keyword argument. 
If the model decides to request a tool call the `response.tool_calls()` method show what the model wants to execute: - -```python -import llm - -def upper(text: str) -> str: - """Convert text to uppercase.""" - return text.upper() - -model = llm.get_model("gpt-4.1-mini") -response = model.prompt("Convert panda to upper", tools=[upper]) -tool_calls = response.tool_calls() -# [ToolCall(name='upper', arguments={'text': 'panda'}, tool_call_id='...')] -``` -You can call `response.execute_tool_calls()` to execute those calls and get back the results: -```python -tool_results = response.execute_tool_calls() -# [ToolResult(name='upper', output='PANDA', tool_call_id='...')] -``` -To pass the results of the tool calls back to the model you need to use a utility method called `model.chain()`: -```python -chain_response = model.chain( - "Convert panda to upper", - tools=[upper], -) -print(chain_response.text()) -# The word "panda" converted to uppercase is "PANDA". -``` -You can also loop through the `model.chain()` response to get a stream of tokens, like this: -```python -for chunk in model.chain( - "Convert panda to upper", - tools=[upper], -): - print(chunk, end="", flush=True) -``` -This will stream each of the chain of responses in turn as they are generated. - -You can access the individual responses that make up the chain using `chain.responses()`. This can be iterated over as the chain executes like this: - -```python -chain = model.chain( - "Convert panda to upper", - tools=[upper], -) -for response in chain.responses(): - print(response.prompt) - for chunk in response: - print(chunk, end="", flush=True) -``` - (python-api-system-prompts)= ### System prompts @@ -148,6 +91,123 @@ if "image/jpeg" in model.attachment_types: ... ``` +(python-api-tools)= + +### Tools + +{ref}`Tools ` are functions that can be executed by the model as part of a chain of responses. 
+ +You can define tools in Python code - with a docstring to describe what they do - and then pass them to the `model.prompt()` method using the `tools=` keyword argument. If the model decides to request a tool call, the `response.tool_calls()` method shows what the model wants to execute: + +```python +import llm + +def upper(text: str) -> str: + """Convert text to uppercase.""" + return text.upper() + +model = llm.get_model("gpt-4.1-mini") +response = model.prompt("Convert panda to upper", tools=[upper]) +tool_calls = response.tool_calls() +# [ToolCall(name='upper', arguments={'text': 'panda'}, tool_call_id='...')] +``` +You can call `response.execute_tool_calls()` to execute those calls and get back the results: +```python +tool_results = response.execute_tool_calls() +# [ToolResult(name='upper', output='PANDA', tool_call_id='...')] +``` +You can use `model.chain()` to pass the results of tool calls back to the model automatically as subsequent prompts: +```python +chain_response = model.chain( + "Convert panda to upper", + tools=[upper], +) +print(chain_response.text()) +# The word "panda" converted to uppercase is "PANDA". +``` +You can also loop through the `model.chain()` response to get a stream of tokens, like this: +```python +for chunk in model.chain( + "Convert panda to upper", + tools=[upper], +): + print(chunk, end="", flush=True) +``` +This will stream each response in the chain in turn as it is generated. + +You can access the individual responses that make up the chain using `chain.responses()`. This can be iterated over as the chain executes like this: + +```python +chain = model.chain( + "Convert panda to upper", + tools=[upper], +) +for response in chain.responses(): + print(response.prompt) + for chunk in response: + print(chunk, end="", flush=True) +``` + +(python-api-toolbox)= + +#### Toolbox classes + +Functions are useful for simple tools, but some tools may have more advanced needs.
You can also define tools as a class (known as a "toolbox"), which provides the following advantages: + +- Toolbox tools can bundle multiple tools together +- Toolbox tools can be configured, e.g. to give filesystem tools access to a specific directory +- Toolbox instances can persist shared state in between tool invocations + +Toolboxes are classes that extend `llm.Toolbox`. Any methods that do not begin with an underscore will be exposed as tool functions. + +This example sets up key/value memory storage that can be used by the model: +```python +import llm + +class Memory(llm.Toolbox): + _memory = None + + def _get_memory(self): + if self._memory is None: + self._memory = {} + return self._memory + + def set(self, key: str, value: str): + "Set something as a key" + self._get_memory()[key] = value + + def get(self, key: str): + "Get something from a key" + return self._get_memory().get(key) or "" + + def append(self, key: str, value: str): + "Append something as a key" + memory = self._get_memory() + memory[key] = (memory.get(key) or "") + "\n" + value + + def keys(self): + "Return a list of keys" + return list(self._get_memory().keys()) +``` +You can then use that from Python like this: +```python +model = llm.get_model("gpt-4.1-mini") +memory = Memory() + +conversation = model.conversation(tools=[memory]) +print(conversation.chain("Set name to Simon", after_call=print).text()) + +print(memory._memory) +# Should show {'name': 'Simon'} + +print(conversation.chain("Set name to Penguin", after_call=print).text()) +# Now it should be {'name': 'Penguin'} + +print(conversation.chain("Print current name", after_call=print).text()) +``` + +See the {ref}`register_tools() plugin hook documentation ` for an example of this tool in action as a CLI plugin. + (python-api-schemas)= ### Schemas @@ -396,6 +456,7 @@ chain_response = model.chain( ) print(chain_response.text()) ``` +This also works for `async def` methods of `llm.Toolbox` subclasses. 
### Tool use for async models diff --git a/tests/test_tools.py b/tests/test_tools.py index 42c5f51..474dc0f 100644 --- a/tests/test_tools.py +++ b/tests/test_tools.py @@ -173,6 +173,21 @@ async def test_async_tools_run_tools_in_parallel(): assert delta_ns < (100_000_000 * 0.2) +@pytest.mark.asyncio +async def test_async_toolbox(): + class Tools(llm.Toolbox): + async def go(self): + return "This was async" + + model = llm.get_async_model("echo") + chain_response = model.chain( + json.dumps({"tool_calls": [{"name": "Tools_go"}]}), + tools=[Tools()], + ) + output = await chain_response.text() + assert '"output": "This was async"' in output + + @pytest.mark.vcr def test_conversation_with_tools(vcr): import llm
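The `Tools_go` tool name in the new test reflects the toolbox naming convention: public methods of an `llm.Toolbox` subclass are exposed as tools named `ClassName_method`, while underscore-prefixed methods stay private. A minimal standard-library sketch of that discovery convention (an illustration of the naming rule only, not the library's actual implementation):

```python
import inspect

class Memory:
    """Stand-in for llm.Toolbox; only the method-discovery convention is shown."""
    _memory = None

    def _get_memory(self):
        # Underscore-prefixed helpers are private and never exposed as tools
        if self._memory is None:
            self._memory = {}
        return self._memory

    def set(self, key: str, value: str):
        "Set something as a key"
        self._get_memory()[key] = value

    def get(self, key: str):
        "Get something from a key"
        return self._get_memory().get(key) or ""

def exposed_tools(toolbox):
    # Public bound methods become tools named ClassName_method,
    # matching the "Tools_go" call in the test above
    return {
        f"{type(toolbox).__name__}_{name}": method
        for name, method in inspect.getmembers(toolbox, inspect.ismethod)
        if not name.startswith("_")
    }

tools = exposed_tools(Memory())
tools["Memory_set"](key="name", value="Simon")
print(sorted(tools))  # ['Memory_get', 'Memory_set']
print(tools["Memory_get"](key="name"))  # Simon
```

The bound methods returned by `inspect.getmembers` retain their instance, which is how a toolbox persists state between tool invocations.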