From 278509d82411ebbe62296c307d6bda6bfeec9a16 Mon Sep 17 00:00:00 2001
From: Simon Willison
Date: Mon, 26 May 2025 18:01:51 -0700
Subject: [PATCH] Warn about prompt injection tools risk, closes #1097

---
 docs/tools.md | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/docs/tools.md b/docs/tools.md
index ecae6d6..0258121 100644
--- a/docs/tools.md
+++ b/docs/tools.md
@@ -4,6 +4,10 @@
 
 Many Large Language Models have been trained to execute tools as part of responding to a prompt. LLM supports tool usage with both the command-line interface and the Python API.
 
+Exposing tools to LLMs **carries risks**! Be sure to read the {ref}`warning below <tools-warning>`.
+
+(tools-how-they-work)=
+
 ## How tools work
 
 A tool is effectively a function that the model can request to be executed. Here's how that works:
@@ -14,6 +18,30 @@
 4. LLM prompts the model a second time, this time including the output of the tool execution.
 5. The model can then use that output to generate its next response.
 
+This sequence can run several times in a loop, allowing the LLM to access data, act on that data and then pass that data off to other tools for further processing.
+
+:::{admonition} Tools can be dangerous
+:class: danger
+
+(tools-warning)=
+
+## Warning: Tools can be dangerous
+
+Applications built on top of LLMs suffer from a class of attacks called [prompt injection](https://simonwillison.net/tags/prompt-injection/). These occur when a malicious third party injects content into the LLM that causes it to take tool-based actions working against the interests of the user of that application.
+
+Be very careful about which tools you enable when you might be exposed to untrusted sources of content - web pages, GitHub issues posted by other people, or emails and messages sent to you that could come from an attacker.
+
+Watch out for the **lethal trifecta** of prompt injection exfiltration attacks. If your tool-enabled LLM has all three of the following:
+
+- access to private data
+- exposure to malicious instructions
+- the ability to exfiltrate information
+
+Anyone who can feed malicious instructions into your LLM - by leaving them on a web page it visits, or sending an email to an inbox that it monitors - may be able to trick your LLM into using other tools to access your private information and then exfiltrate (leak) that data to somewhere the attacker can see it.
+:::
+
+(tools-trying-out)=
+
 ## Trying out tools
 
 LLM comes with a default tool installed, called `llm_version`. You can try that out like this:
@@ -32,6 +60,8 @@
 The installed version of the LLM is 0.26a0.
 ```
 
 Further tools can be installed using plugins, or you can use the `llm --functions` option to pass tools implemented as Python functions directly, as {ref}`described here `.
 
+(tools-implementation)=
+
 ## LLM's implementation of tools
 
 In LLM every tool is defined as a Python function. The function can take any number of arguments and can return a string or an object that can be converted to a string.
@@ -42,6 +72,8 @@ The Python API can accept functions directly. The command-line interface has two
 
 You can use tools {ref}`with the LLM command-line tool ` or {ref}`with the Python API `.
 
+(tools-tips)=
+
 ## Tips for implementing tools
 
 Consult the {ref}`register_tools() plugin hook <plugin-hooks-register-tools>` documentation for examples of how to implement tools in plugins.
@@ -49,4 +81,3 @@ Consult the {ref}`register_tools() plugin hook <plugin-hooks-register-tools>` do
 
 If your plugin needs access to API secrets, I recommend storing those using `llm keys set api-name` and then reading them using the {ref}`plugin-utilities-get-key` utility function.
 This avoids secrets being logged to the database as part of tool calls.
-
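
The five-step tool loop that the patched documentation describes can be sketched in plain Python. This is an illustrative simulation only: `fake_model`, the dict-based tool-call protocol, and the stubbed `llm_version` tool are all invented for this sketch and are not the real `llm` library API.

```python
# Illustrative simulation of the tool-execution loop described in docs/tools.md.
# A real model decides for itself when to request a tool; here fake_model is a
# hard-coded stand-in so the control flow of the loop is visible.

def llm_version() -> str:
    """Example tool: report a (stubbed) installed version."""
    return "0.26a0"

# Registry of tools the model is allowed to request (step 1 exposes these).
TOOLS = {"llm_version": llm_version}

def fake_model(prompt: str, tool_output=None) -> dict:
    """Stand-in for a real model. Without tool output it requests a tool
    (step 2); with tool output it produces a final answer (step 5)."""
    if tool_output is None:
        return {"tool_call": "llm_version"}
    return {"text": f"The installed version of the LLM is {tool_output}."}

def run(prompt: str) -> str:
    response = fake_model(prompt)                 # step 1: initial prompt
    while "tool_call" in response:                # may loop several times
        result = TOOLS[response["tool_call"]]()   # step 3: execute the tool
        response = fake_model(prompt, result)     # step 4: prompt again with output
    return response["text"]                       # step 5: final response

print(run("What version of LLM is installed?"))
```

The `while` loop mirrors the note in the patch that the sequence can run several times, with each tool result fed back into the next model call.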