Renamed build_json_schema to schema_dsl

This commit is contained in:
Simon Willison 2025-02-27 10:22:29 -08:00
parent 7e819c2ffa
commit 8d32b71ef1
6 changed files with 28 additions and 10 deletions

@@ -94,14 +94,16 @@ def register_models(register):
## Supporting schemas
-If your model supports {ref}`structured output <python-api-schemas>` against a defined JSON schema you can implement support by first adding `supports_schema = True` to the class:
+If your model supports {ref}`structured output <schemas>` against a defined JSON schema you can implement support by first adding `supports_schema = True` to the class:
```python
class MyModel(llm.KeyModel):
    ...
    supports_schema = True
```
-And then adding code to your `.execute()` method that checks for `prompt.schema` and, if it is present, uses that to prompt the model. `prompt.schema` will always be a Python dictionary, even if the user passed in a Pydantic model class.
+And then adding code to your `.execute()` method that checks for `prompt.schema` and, if it is present, uses that to prompt the model.
+`prompt.schema` will always be a Python dictionary representing a JSON schema, even if the user passed in a Pydantic model class.
Check the [llm-gemini](https://github.com/simonw/llm-gemini) and [llm-anthropic](https://github.com/simonw/llm-anthropic) plugins for examples of this pattern in action.
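The check described above can be sketched as follows. This is a minimal illustration only: `build_request_body` and the `response_format` payload shape are hypothetical placeholders, not any specific provider's API.

```python
def build_request_body(prompt_text, schema=None):
    # Hypothetical helper: shows where a plugin's .execute() would
    # thread prompt.schema into the provider's request payload.
    body = {"messages": [{"role": "user", "content": prompt_text}]}
    if schema is not None:
        # prompt.schema is always a plain dict by this point, even
        # when the caller originally passed a Pydantic model class
        body["response_format"] = {"type": "json_schema", "json_schema": schema}
    return body
```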

@@ -118,6 +118,15 @@ response = model.prompt("Describe a nice dog", schema={
})
```
+You can also use LLM's {ref}`alternative schema syntax <schemas-dsl>` via the `llm.schema_dsl(schema_dsl)` function. This provides a quick way to construct a JSON schema for simple cases:
+```python
+print(model.prompt(
+    "Describe a nice dog with a surprising name",
+    schema=llm.schema_dsl("name, age int, bio")
+))
+```
(python-api-model-options)=
### Model options

@@ -4,7 +4,9 @@
Large Language Models are very good at producing structured output as JSON or other formats. LLM's **schemas** feature allows you to define the exact structure of JSON data you want to receive from a model.
-This feature is supported by models from OpenAI, Anthropic, Google Gemini and others {ref}`via plugins <advanced-model-plugins-schemas>`.
+This feature is supported by models from OpenAI, Anthropic, Google Gemini and can be implemented for others {ref}`via plugins <advanced-model-plugins-schemas>`.
+(schemas-json-schemas)=
## Understanding JSON schemas
@@ -18,6 +20,8 @@ A [JSON schema](https://json-schema.org/) is a specification that describes the
Different models may support different subsets of the overall JSON schema language. You should experiment to figure out what works for the model you are using.
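For illustration, a hand-written schema of this kind might look like the following Python dictionary; the `dog_schema` name and its fields are arbitrary examples, not part of LLM itself:

```python
# A minimal JSON schema: an object with a required string "name"
# and a required integer "age"
dog_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}
```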
+(schemas-using-with-llm)=
## Using schemas with LLM
LLM provides several ways to use schemas:
@@ -64,6 +68,8 @@ This example uses [uvx](https://docs.astral.sh/uv/guides/tools/) to run [strip-t
This will instruct the model to return an array of JSON objects with the specified structure, each containing a headline, summary, and array of key people mentioned.
+(schemas-dsl)=
### Alternative schema syntax
JSON schemas can be time-consuming to construct by hand. LLM also supports a concise alternative syntax for specifying a schema.
@@ -103,4 +109,4 @@ Using this option a simpler version of the New York Times example above is the f
curl https://www.nytimes.com/ | uvx strip-tags | llm --schema-multi 'headline, summary' | jq
```
-The Python utility function `llm.utils.build_json_schema(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary.
+The Python utility function `llm.schema_dsl(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary.
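To illustrate the shape of that conversion, here is a simplified, hypothetical re-implementation of the concise syntax. `toy_schema_dsl` is not part of LLM, and the real function supports more than this sketch covers:

```python
def toy_schema_dsl(dsl):
    # Simplified sketch of the concise syntax: comma-separated fields,
    # an optional type after each name, defaulting to string.
    type_map = {"int": "integer", "float": "number", "bool": "boolean"}
    properties = {}
    for field in dsl.split(","):
        parts = field.strip().split()
        name = parts[0]
        json_type = type_map.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": json_type}
    return {"type": "object", "properties": properties, "required": list(properties)}
```

With this sketch, `toy_schema_dsl("name, age int, bio")` yields an object schema whose `age` property has type `integer` and whose other fields default to `string`.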

@@ -19,6 +19,7 @@ from .models import (
Prompt,
Response,
)
+from .utils import schema_dsl
from .embeddings import Collection
from .templates import Template
from .plugins import pm, load_plugins
@@ -49,6 +50,7 @@ __all__ = [
"Response",
"Template",
"user_dir",
+"schema_dsl",
]
DEFAULT_MODEL = "gpt-4o-mini"

@@ -245,7 +245,7 @@ def resolve_schema_input(db, schema_input):
        pass
    if " " in schema_input.strip() or "," in schema_input:
        # Treat it as schema DSL
-        return build_json_schema(schema_input)
+        return schema_dsl(schema_input)
    # Is it a file on disk?
    path = pathlib.Path(schema_input)
    if path.exists():
@@ -303,7 +303,7 @@ def schema_summary(schema: dict) -> str:
    return ""
-def build_json_schema(schema_dsl: str) -> Dict[str, Any]:
+def schema_dsl(schema_dsl: str) -> Dict[str, Any]:
    """
    Build a JSON schema from a concise schema string.

@@ -1,5 +1,5 @@
import pytest
-from llm.utils import simplify_usage_dict, extract_fenced_code_block, build_json_schema
+from llm.utils import simplify_usage_dict, extract_fenced_code_block, schema_dsl
@pytest.mark.parametrize(
@@ -209,7 +209,6 @@ def test_extract_fenced_code_block(input, last, expected):
),
],
)
-def test_build_json_schema(schema, expected):
-    """Test the build_json_schema function with various inputs."""
-    result = build_json_schema(schema)
+def test_schema_dsl(schema, expected):
+    result = schema_dsl(schema)
    assert result == expected