mirror of https://github.com/Hopiu/llm.git, synced 2026-03-17 05:00:25 +00:00
Improvements to schemas.md refs #788
parent 259366a575
commit 6957e4ecbb
1 changed file with 21 additions and 12 deletions
@@ -31,15 +31,18 @@ LLM provides several ways to use schemas:
 3. Via templates that include schemas
 4. Through the {ref}`Python API <python-api-schemas>`
 
+(schemas-using-cli)=
+
 ### Basic usage with the command line
 
 To get structured data from a language model you can provide a JSON schema directly using the `--schema` option:
 
 ```bash
-curl https://www.nytimes.com/ | uvx strip-tags | llm --schema '{
+curl https://www.nytimes.com/ | uvx strip-tags | \
+  llm --schema '{
   "type": "object",
   "properties": {
-    "stories": {
+    "items": {
       "type": "array",
       "items": {
         "type": "object",
@@ -50,18 +53,18 @@ curl https://www.nytimes.com/ | uvx strip-tags | llm --schema '{
           "short_summary": {
             "type": "string"
           },
-          "key_people": {
+          "key_points": {
             "type": "array",
             "items": {
               "type": "string"
             }
           }
         },
-        "required": ["headline", "summary", "key_people"]
+        "required": ["headline", "short_summary", "key_points"]
       }
     }
   },
-  "required": ["stories"]
+  "required": ["items"]
 }' | jq
 ```
 This example uses [uvx](https://docs.astral.sh/uv/guides/tools/) to run [strip-tags](https://github.com/simonw/strip-tags) against the front page of the New York Times, runs GPT-4o mini with a schema to extract story headlines and summaries, then pipes the result through [jq](https://jqlang.org/) to format it.
@@ -70,10 +73,19 @@ This will instruct the model to return an array of JSON objects with the specifi
 
 (schemas-dsl)=
 
-### Alternative schema syntax
+## Alternative schema syntax
 
 JSON schemas can be time-consuming to construct by hand. LLM also supports a concise alternative syntax for specifying a schema.
 
+The New York Times example above can be condensed to this, though note that key points is now a string rather than an array of strings:
+
+```bash
+curl https://www.nytimes.com/ | uvx strip-tags | \
+  llm --schema-multi 'headline, short_summary, key_points' | jq
+```
+
+### How that syntax works
+
 A simple schema for an object with two string properties called `name` and `bio` looks like this:
 
     name, bio
@@ -94,6 +106,8 @@ If your schema is getting long you can switch from comma-separated to newline-se
     age int: their age
     bio: a short bio, no more than three sentences
 
+### Using alternative schema syntax
+
 This format is supported by the `--schema` option. The format will be detected any time you provide a string with at least one space that doesn't start with a `{` (indicating JSON):
 
 ```bash
@@ -104,12 +118,7 @@ To return multiple items matching your schema, use the `--schema-multi` option.
 ```bash
 llm --schema-multi 'name,description,fave_toy' 'invent 3 dogs'
 ```
-Using this option a simpler version of the New York Times example above is the following:
-```bash
-curl https://www.nytimes.com/ | uvx strip-tags | llm --schema-multi 'headline, summary' | jq
-```
-
-The Python utility function `llm.schema_dsl(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary.
+The Python utility function `llm.schema_dsl(schema)` can be used to convert this syntax into the equivalent JSON schema dictionary when working with schemas {ref}`in the Python API <python-api-schemas>`.
 
 You can experiment with the syntax using the `llm schemas dsl` command, which converts the input into a JSON schema:
 ```bash
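
The last hunk points readers at `llm.schema_dsl(schema)` for converting the concise syntax into a JSON schema dictionary. As a rough illustration of that mapping, and not the library's actual implementation, a minimal Python sketch might look like the following. The function name `sketch_schema_dsl`, its `multi` flag, and the `TYPE_NAMES` table (including the `str`, `float` and `bool` type names) are hypothetical. Only the `name, bio` form, the `age int: their age` form, the newline-separated variant, and the `items` array wrapper are taken from the diff.

```python
import json

# Assumed mapping from short type names to JSON schema types.
# Only "int" appears in the diff; the other names are guesses.
TYPE_NAMES = {"str": "string", "int": "integer", "float": "number", "bool": "boolean"}


def sketch_schema_dsl(text, multi=False):
    """Hypothetical converter from 'name, age int: their age' to a JSON schema dict."""
    # Use newlines as the separator when present, otherwise commas.
    separator = "\n" if "\n" in text else ","
    properties = {}
    for field in text.split(separator):
        # Anything after the first colon is treated as the description.
        spec, _, description = field.strip().partition(":")
        parts = spec.split()
        if not parts:
            continue  # skip empty fields such as a trailing separator
        name = parts[0]
        type_name = parts[1] if len(parts) > 1 else "str"
        prop = {"type": TYPE_NAMES.get(type_name, "string")}
        if description.strip():
            prop["description"] = description.strip()
        properties[name] = prop
    schema = {
        "type": "object",
        "properties": properties,
        "required": list(properties),
    }
    if multi:
        # Wrap the object schema in an "items" array, as in the JSON example in the diff.
        schema = {
            "type": "object",
            "properties": {"items": {"type": "array", "items": schema}},
            "required": ["items"],
        }
    return schema


if __name__ == "__main__":
    print(json.dumps(sketch_schema_dsl("name, bio"), indent=2))
```

Splitting on newlines whenever one is present is what lets a description such as `a short bio, no more than three sentences` keep its comma in this sketch.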