Improvements to schemas.md refs #788

2026-05-08 05:44:44 +00:00 · 2025-02-27 11:08:39 -08:00 · 2025-02-27 11:08:39 -08:00 · 6957e4ecbb
commit 6957e4ecbb
parent 259366a575
1 changed file with 21 additions and 12 deletions
--- a/docs/schemas.md
+++ b/docs/schemas.md
@@ -31,15 +31,18 @@ LLM provides several ways to use schemas:
 3. Via templates that include schemas
 4. Through the {ref}`Python API <python-api-schemas>`

+(schemas-using-cli)=
+
 ### Basic usage with the command line

 To get structured data from a language model you can provide a JSON schema directly using the `--schema` option:

 ```bash
-curl https://www.nytimes.com/ | uvx strip-tags | llm --schema '{
+curl https://www.nytimes.com/ | uvx strip-tags | \
+  llm --schema '{
  "type": "object",
  "properties": {
-    "stories": {
+    "items": {
      "type": "array",
      "items": {
        "type": "object",
@@ -50,18 +53,18 @@ curl https://www.nytimes.com/ | uvx strip-tags | llm --schema '{
          "short_summary": {
            "type": "string"
          },
-          "key_people": {
+          "key_points": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        },
-        "required": ["headline", "summary", "key_people"]
+        "required": ["headline", "short_summary", "key_points"]
      }
    }
  },
-  "required": ["stories"]
+  "required": ["items"]
 }' | jq
 ```
 This example uses [uvx](https://docs.astral.sh/uv/guides/tools/) to run [strip-tags](https://github.com/simonw/strip-tags) against the front page of the New York Times, runs GPT-4o mini with a schema to extract story headlines and summaries, then pipes the result through [jq](https://jqlang.org/) to format it.
@@ -70,10 +73,19 @@ This will instruct the model to return an array of JSON objects with the specifi

 (schemas-dsl)=

-### Alternative schema syntax
+## Alternative schema syntax

 JSON schemas can be time-consuming to construct by hand. LLM also supports a concise alternative syntax for specifying a schema.

+The New York Times example above can be condensed to this, though note that key points is now a string rather than an array of strings:
+
+```bash
+curl https://www.nytimes.com/ | uvx strip-tags | \
+  llm --schema-multi 'headline, short_summary, key_points' | jq
+```
+
+### How that syntax works
+
 A simple schema for an object with two string properties called `name` and `bio` looks like this:

    name, bio
@@ -94,6 +106,8 @@ If your schema is getting long you can switch from comma-separated to newline-se
    age int: their age
    bio: a short bio, no more than three sentences

+### Using alternative schema syntax
+
 This format is supported by the `--schema` option. The format will be detected any time you provide a string with at least one space that doesn't start with a `{` (indicating JSON):

 ```bash
@@ -104,12 +118,7 @@ To return multiple items matching your schema, use the `--schema-multi` option.
 ```bash
 llm --schema-multi 'name,description,fave_toy' 'invent 3 dogs'
 ```
-Using this option a simpler version of the New York Times example above is the following:
-```bash
-curl https://www.nytimes.com/ | uvx strip-tags | llm --schema-multi 'headline, summary' | jq
-```
-
-The Python utility function `llm.schema_dsl(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary.
+The Python utility function `llm.schema_dsl(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary when working with schemas {ref}`in the Python API <python-api-schemas>`.

 You can experiment with the syntax using the `llm schemas dsl` command, which converts the input into a JSON schema:
 ```bash