Warn about prompt injection tools risk, closes #1097

Simon Willison 2025-05-26 18:01:51 -07:00
parent e1f276e576
commit 278509d824


@@ -4,6 +4,10 @@
Many Large Language Models have been trained to execute tools as part of responding to a prompt. LLM supports tool usage with both the command-line interface and the Python API.
Exposing tools to LLMs **carries risks**! Be sure to read the {ref}`warning below <tools-warning>`.
(tools-how-they-work)=
## How tools work
A tool is effectively a function that the model can request to be executed. Here's how that works:
@@ -14,6 +18,30 @@ A tool is effectively a function that the model can request to be executed. Here
4. LLM prompts the model a second time, this time including the output of the tool execution.
5. The model can then use that output to generate its next response.
This sequence can run several times in a loop, allowing the LLM to access data, act on that data and then pass that data off to other tools for further processing.
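The five steps above can be sketched as a minimal loop. This is a toy illustration only, not LLM's actual implementation: the `model_respond()` function, the `TOOLS` registry and the message format are all hypothetical stand-ins for a real model API.

```python
# Toy sketch of the tool-execution loop described above.

def llm_version() -> str:
    """Example tool: report an installed version string."""
    return "0.26a0"

TOOLS = {"llm_version": llm_version}

def model_respond(messages):
    # Pretend model: request the llm_version tool first, then use its output.
    tool_outputs = [m["content"] for m in messages if m["role"] == "tool"]
    if not tool_outputs:
        return {"tool_call": "llm_version"}  # steps 1-2: model requests a tool
    return {"text": f"The installed version is {tool_outputs[0]}."}

def run(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    while True:
        response = model_respond(messages)
        if "tool_call" in response:
            output = TOOLS[response["tool_call"]]()   # step 3: execute the tool
            messages.append({"role": "tool", "content": output})
            continue                                  # step 4: prompt again with the output
        return response["text"]                       # step 5: final response
```

A real implementation also passes tool arguments and may handle several tool calls per turn; this sketch only shows the shape of the loop.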
:::{admonition} Tools can be dangerous
:class: danger
(tools-warning)=
## Warning: Tools can be dangerous
Applications built on top of LLMs suffer from a class of attacks called [prompt injection](https://simonwillison.net/tags/prompt-injection/) attacks. These occur when a malicious third party injects content into the LLM which causes it to take tool-based actions that act against the interests of the user of that application.
Be very careful about which tools you enable when you might be exposed to untrusted sources of content: web pages, GitHub issues posted by other people, or emails and messages that could have been sent to you by an attacker.
Watch out for the **lethal trifecta** of prompt injection exfiltration attacks. If your tool-enabled LLM has the following:
- access to private data
- exposure to malicious instructions
- the ability to exfiltrate information
Anyone who can feed malicious instructions into your LLM - by leaving them on a web page it visits, or sending an email to an inbox that it monitors - could trick your LLM into using other tools to access your private information and then exfiltrate (pass out) that data to somewhere the attacker can see it.
:::
(tools-trying-out)=
## Trying out tools
LLM comes with a default tool installed, called `llm_version`. You can try that out like this:
@@ -32,6 +60,8 @@ The installed version of the LLM is 0.26a0.
```
Further tools can be installed using plugins, or you can use the `llm --functions` option to pass tools implemented as Python functions directly, as {ref}`described here <usage-tools>`.
(tools-implementation)=
## LLM's implementation of tools
In LLM every tool is defined as a Python function. The function can take any number of arguments and can return a string or an object that can be converted to a string.
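As an illustration, a tool can be as small as a single typed function. The name, signature and docstring below are invented for this sketch; the `int` it returns is an example of an object that can be converted to a string.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    # The int return value can be converted to a string before it is
    # passed back to the model.
    return a * b
```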
@@ -42,6 +72,8 @@ The Python API can accept functions directly. The command-line interface has two
You can use tools {ref}`with the LLM command-line tool <usage-tools>` or {ref}`with the Python API <python-api-tools>`.
(tools-tips)=
## Tips for implementing tools
Consult the {ref}`register_tools() plugin hook <plugin-hooks-register-tools>` documentation for examples of how to implement tools in plugins.
@@ -49,4 +81,3 @@ Consult the {ref}`register_tools() plugin hook <plugin-hooks-register-tools>` do
If your plugin needs access to API secrets I recommend storing those using `llm keys set api-name` and then reading them using the {ref}`plugin-utilities-get-key` utility function. This avoids secrets being logged to the database as part of tool calls.
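One way to follow that advice is to read the secret inside the tool at call time, so it is never passed as a tool argument. The sketch below uses a plain environment variable as a self-contained stand-in for LLM's key storage; the `fetch_report` tool and the `EXAMPLE_API_KEY` name are both hypothetical.

```python
import os

def fetch_report(account_id: str) -> str:
    """A tool that needs an API secret to do its work."""
    # Read the secret at call time rather than accepting it as a tool
    # argument, so it is never recorded as part of a logged tool call.
    # (A real plugin would read the key from LLM's key storage; the
    # environment variable here is a stand-in for this sketch.)
    api_key = os.environ.get("EXAMPLE_API_KEY")
    if api_key is None:
        return "Error: no API key configured"
    # ... call the remote API using api_key here (omitted) ...
    return f"Report for {account_id} fetched"
```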
<!-- Uncomment when this is true: The [llm-tools-datasette](https://github.com/simonw/llm-tools-datasette) plugin is a good example of this pattern in action. -->