(logging)=
# Logging to SQLite
llm defaults to logging all prompts and responses to a SQLite database.
You can find the location of that database using the llm logs path command:
```bash
llm logs path
```
On my Mac that outputs:
```
/Users/simon/Library/Application Support/io.datasette.llm/logs.db
```
This will differ for other operating systems.
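Because the logs are stored in a regular SQLite file, you can also query them directly with any SQLite client. For example, here is a quick sketch using the sqlite3 command-line tool (a separate tool, not part of LLM) to count logged responses, using the responses table shown in the SQL schema at the end of this document:
```bash
# Count the rows in the responses table of the LLM log database
sqlite3 "$(llm logs path)" 'select count(*) from responses'
```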
To avoid logging an individual prompt, pass --no-log or -n to the command:
```bash
llm 'Ten names for cheesecakes' -n
```
To turn off logging by default:
```bash
llm logs off
```
If you've turned off logging you can still log an individual prompt and response by adding --log:
```bash
llm 'Five ambitious names for a pet pterodactyl' --log
```
To turn logging back on by default:
```bash
llm logs on
```
To see the status of the logs database, run this:
```bash
llm logs status
```
Example output:
```
Logging is ON for all prompts
Found log database at /Users/simon/Library/Application Support/io.datasette.llm/logs.db
Number of conversations logged: 33
Number of responses logged: 48
Database file size: 19.96MB
```
(logging-view)=
## Viewing the logs
You can view the logs using the llm logs command:
```bash
llm logs
```
This will output the three most recent logged items in Markdown format, showing both the prompt and the response.
To get back just the most recent response as plain text, add -r/--response:
```bash
llm logs -r
```
Use -x/--extract to extract and return the first fenced code block from the selected log entries:
```bash
llm logs --extract
```
Or --xl/--extract-last for the last fenced code block:
```bash
llm logs --extract-last
```
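Because -x/--extract outputs just the code itself, ordinary shell redirection is enough to capture it in a file. This is plain shell behaviour rather than an LLM feature:
```bash
# Save the extracted fenced code block to a file
llm logs --extract > extracted.txt
```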
Add --json to get the log messages in JSON instead:
```bash
llm logs --json
```
Add -n 10 to see the ten most recent items:
```bash
llm logs -n 10
```
Or -n 0 to see everything that has ever been logged:
```bash
llm logs -n 0
```
You can truncate the display of the prompts and responses using the -t/--truncate option. This can help make the JSON output more readable:
```bash
llm logs -n 1 -t --json
```
Example output:
```json
[
  {
    "id": "01jm8ec74wxsdatyn5pq1fp0s5",
    "model": "anthropic/claude-3-haiku-20240307",
    "prompt": "hi",
    "system": null,
    "prompt_json": null,
    "response": "Hello! How can I assist you today?",
    "conversation_id": "01jm8ec74taftdgj2t4zra9z0j",
    "duration_ms": 560,
    "datetime_utc": "2025-02-16T22:34:30.374882+00:00",
    "input_tokens": 8,
    "output_tokens": 12,
    "token_details": null,
    "conversation_name": "hi",
    "conversation_model": "anthropic/claude-3-haiku-20240307",
    "attachments": []
  }
]
```
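Because --json emits a JSON array of objects with the keys shown above, it pipes cleanly into other tools. As a rough sketch, here is one way to list the models used by your ten most recent responses using jq (a separate tool, not part of LLM):
```bash
# Print the model used for each of the ten most recent responses
llm logs -n 10 --json | jq -r '.[].model'
```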
(logging-short)=
### -s/--short mode
Use -s/--short to see a shortened YAML log with truncated prompts and no responses:
```bash
llm logs -n 2 --short
```
Example output:
```yaml
- model: deepseek-reasoner
  datetime: '2025-02-02T06:39:53'
  conversation: 01jk2pk05xq3d0vgk0202zrsg1
  prompt: H01 There are five huts. H02 The Scotsman lives in the purple hut. H03 The Welshman owns the parrot. H04 Kombucha is...
- model: o3-mini
  datetime: '2025-02-02T19:03:05'
  conversation: 01jk40qkxetedzpf1zd8k9bgww
  system: Formatting re-enabled. Write a detailed README with extensive usage examples.
  prompt: <documents> <document index="1"> <source>./Cargo.toml</source> <document_content> [package] name = "py-limbo" version...
```
Add -u/--usage to include token usage information:
```bash
llm logs -n 1 --short --usage
```
Example output:
```yaml
- model: o3-mini
  datetime: '2025-02-16T23:00:56'
  conversation: 01jm8fxxnef92n1663c6ays8xt
  system: Produce Python code that demonstrates every possible usage of yaml.dump
    with all of the arguments it can take, especi...
  prompt: <documents> <document index="1"> <source>./setup.py</source> <document_content>
    NAME = 'PyYAML' VERSION = '7.0.0.dev0...
  usage:
    input: 74793
    output: 3550
    details:
      completion_tokens_details:
        reasoning_tokens: 2240
```
(logging-conversation)=
## Logs for a conversation
To view the logs for the most recent {ref}`conversation <usage-conversation>` you have had with a model, use -c:
```bash
llm logs -c
```
To see logs for a specific conversation based on its ID, use --cid ID or --conversation ID:
```bash
llm logs --cid 01h82n0q9crqtnzmf13gkyxawg
```
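The conversation ID also appears in the --json output shown earlier, as the conversation_id key, so a sketch like this (using jq, which is a separate tool) prints the ID of your most recent conversation ready to pass to --cid:
```bash
# Print the conversation ID of the most recent logged response
llm logs -n 1 --json | jq -r '.[0].conversation_id'
```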
(logging-search)=
## Searching the logs
You can search the logs for a search term in the prompt or the response columns.
```bash
llm logs -q 'cheesecake'
```
The most relevant matches will be shown at the bottom of the output.
(logging-filter-model)=
## Filtering by model
You can filter to logs just for a specific model (or model alias) using -m/--model:
```bash
llm logs -m chatgpt
```
(logging-datasette)=
## Browsing logs using Datasette
You can also use Datasette to browse your logs like this:
```bash
datasette "$(llm logs path)"
```
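Datasette is a separate tool. If you don't have it installed already, one way to get it is with pip:
```bash
pip install datasette
```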
(logging-schemas)=
## JSON objects created using schemas
You can use {ref}`schemas <usage-schemas>` to collect structured JSON data from text and images that you feed into LLM.
The JSON produced by those prompts is logged to the database, and special options let you extract just those JSON objects in a useful format.
The --schema X filter option can be used to filter just for responses that were created using the specified schema. You can pass the full schema JSON, a path to the schema on disk, or the schema ID.
The --data option outputs just the JSON data collected by that schema, as newline-delimited JSON.
If you want a JSON array of objects instead (wrapped in opening and closing square brackets), use --data-array.
Consider this schema file, called dogs.schema.json:
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "dogs": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "minLength": 1
          },
          "ten_word_bio": {
            "type": "string",
            "minLength": 1
          }
        },
        "required": ["name", "ten_word_bio"],
        "additionalProperties": false
      }
    }
  },
  "required": ["dogs"],
  "additionalProperties": false
}
```
You can use this several times to invent several cool dogs:
```bash
llm --schema dogs.schema.json 'invent 3 cool dogs'
llm --schema dogs.schema.json 'invent 2 cool dogs'
```
Having logged the cool dogs, you can see just the data that was returned by those prompts like this:
```bash
llm logs --schema dogs.schema.json --data
```
Output:
```
{"dogs": [{"name": "Robo", "ten_word_bio": "A cybernetic dog with laser eyes and super intelligence."}, {"name": "Flamepaw", "ten_word_bio": "Fire-resistant dog with a talent for agility and tricks."}]}
{"dogs": [{"name": "Bolt", "ten_word_bio": "Lightning-fast border collie, loves frisbee and outdoor adventures."}, {"name": "Luna", "ten_word_bio": "Mystical husky with mesmerizing blue eyes, enjoys snow and play."}, {"name": "Ziggy", "ten_word_bio": "Quirky pug who loves belly rubs and quirky outfits."}]}
```
Note that the dogs are nested in that "dogs" key. To access the list of items from that key use --data-key dogs:
```bash
llm logs --schema dogs.schema.json --data-key dogs
```
Output:
```
{"name": "Bolt", "ten_word_bio": "Lightning-fast border collie, loves frisbee and outdoor adventures."}
{"name": "Luna", "ten_word_bio": "Mystical husky with mesmerizing blue eyes, enjoys snow and play."}
{"name": "Ziggy", "ten_word_bio": "Quirky pug who loves belly rubs and quirky outfits."}
{"name": "Robo", "ten_word_bio": "A cybernetic dog with laser eyes and super intelligence."}
{"name": "Flamepaw", "ten_word_bio": "Fire-resistant dog with a talent for agility and tricks."}
```
Finally, to output a JSON array instead of newline-delimited JSON use --data-array:
```bash
llm logs --schema dogs.schema.json --data-key dogs --data-array
```
Output:
```
[{"name": "Bolt", "ten_word_bio": "Lightning-fast border collie, loves frisbee and outdoor adventures."},
{"name": "Luna", "ten_word_bio": "Mystical husky with mesmerizing blue eyes, enjoys snow and play."},
{"name": "Ziggy", "ten_word_bio": "Quirky pug who loves belly rubs and quirky outfits."},
{"name": "Robo", "ten_word_bio": "A cybernetic dog with laser eyes and super intelligence."},
{"name": "Flamepaw", "ten_word_bio": "Fire-resistant dog with a talent for agility and tricks."}]
```
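Both the newline-delimited and array forms pipe cleanly into other tools. As a rough sketch, here is one way to turn that array into CSV rows using jq (a separate tool, not part of LLM), relying on the name and ten_word_bio keys defined in the schema above:
```bash
# Convert the JSON array of dogs into CSV rows
llm logs --schema dogs.schema.json --data-key dogs --data-array \
  | jq -r '.[] | [.name, .ten_word_bio] | @csv'
```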
(logging-sql-schema)=
## SQL schema
Here's the SQL schema used by the logs.db database:
```sql
CREATE TABLE [conversations] (
    [id] TEXT PRIMARY KEY,
    [name] TEXT,
    [model] TEXT
);
CREATE TABLE [schemas] (
    [id] TEXT PRIMARY KEY,
    [content] TEXT
);
CREATE TABLE "responses" (
    [id] TEXT PRIMARY KEY,
    [model] TEXT,
    [prompt] TEXT,
    [system] TEXT,
    [prompt_json] TEXT,
    [options_json] TEXT,
    [response] TEXT,
    [response_json] TEXT,
    [conversation_id] TEXT REFERENCES [conversations]([id]),
    [duration_ms] INTEGER,
    [datetime_utc] TEXT,
    [input_tokens] INTEGER,
    [output_tokens] INTEGER,
    [token_details] TEXT,
    [schema_id] TEXT REFERENCES [schemas]([id])
);
CREATE VIRTUAL TABLE [responses_fts] USING FTS5 (
    [prompt],
    [response],
    content=[responses]
);
CREATE TABLE [attachments] (
    [id] TEXT PRIMARY KEY,
    [type] TEXT,
    [path] TEXT,
    [url] TEXT,
    [content] BLOB
);
CREATE TABLE [prompt_attachments] (
    [response_id] TEXT REFERENCES [responses]([id]),
    [attachment_id] TEXT REFERENCES [attachments]([id]),
    [order] INTEGER,
    PRIMARY KEY ([response_id], [attachment_id])
);
```
responses_fts configures SQLite full-text search against the prompt and response columns in the responses table.
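That means you can also run your own full-text queries directly against the database. Here is a sketch using the sqlite3 command-line tool (a separate tool, not part of LLM), joining the FTS index back to the responses table via its rowid:
```bash
# Full-text search the logged prompts and responses for 'cheesecake'
sqlite3 "$(llm logs path)" "
  select responses.id, responses.model
  from responses_fts
  join responses on responses.rowid = responses_fts.rowid
  where responses_fts match 'cheesecake'
"
```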