From 64f9f2ef5244dc38ed09cdd0af708621a5b35206 Mon Sep 17 00:00:00 2001 From: Simon Willison Date: Sun, 16 Feb 2025 22:29:32 -0800 Subject: [PATCH] Promote llm-mlx in changelog and plugin directory !stable-docs --- docs/changelog.md | 1 + docs/plugins/directory.md | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/changelog.md b/docs/changelog.md index 255217c..0c80626 100644 --- a/docs/changelog.md +++ b/docs/changelog.md @@ -12,6 +12,7 @@ See also [LLM 0.22, the annotated release notes](https://simonwillison.net/2025/ - New `llm embed-multi --prepend X` option for prepending a string to each value before it is embedded - useful for models such as [nomic-embed-text-v2-moe](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe) that require passages to start with a string like `"search_document: "`. [#745](https://github.com/simonw/llm/issues/745) - The `response.json()` and `response.usage()` methods are {ref}`now documented `. - Fixed a bug where conversations that were loaded from the database could not be continued using `asyncio` prompts. [#742](https://github.com/simonw/llm/issues/742) +- New plugin for macOS users: [llm-mlx](https://github.com/simonw/llm-mlx), which provides [extremely high performance access](https://simonwillison.net/2025/Feb/15/llm-mlx/) to a wide range of local models using Apple's MLX framework. - The `llm-claude-3` plugin has been renamed to [llm-anthropic](https://github.com/simonw/llm-anthropic). (v0_21)= diff --git a/docs/plugins/directory.md b/docs/plugins/directory.md index e1e10a6..d7e2df9 100644 --- a/docs/plugins/directory.md +++ b/docs/plugins/directory.md @@ -9,11 +9,12 @@ These plugins all help you run LLMs directly on your own computer: - **[llm-gguf](https://github.com/simonw/llm-gguf)** uses [llama.cpp](https://github.com/ggerganov/llama.cpp) to run models published in the GGUF format. +- **[llm-mlx](https://github.com/simonw/llm-mlx)** (Mac only) uses Apple's MLX framework to provide extremely high performance access to a large number of local models. +- **[llm-ollama](https://github.com/taketwo/llm-ollama)** adds support for local models run using [Ollama](https://ollama.ai/). +- **[llm-llamafile](https://github.com/simonw/llm-llamafile)** adds support for local models that are running locally using [llamafile](https://github.com/Mozilla-Ocho/llamafile). - **[llm-mlc](https://github.com/simonw/llm-mlc)** can run local models released by the [MLC project](https://mlc.ai/mlc-llm/), including models that can take advantage of the GPU on Apple Silicon M1/M2 devices. - **[llm-gpt4all](https://github.com/simonw/llm-gpt4all)** adds support for various models released by the [GPT4All](https://gpt4all.io/) project that are optimized to run locally on your own machine. These models include versions of Vicuna, Orca, Falcon and MPT - here's [a full list of models](https://observablehq.com/@simonw/gpt4all-models). - **[llm-mpt30b](https://github.com/simonw/llm-mpt30b)** adds support for the [MPT-30B](https://huggingface.co/mosaicml/mpt-30b) local model. -- **[llm-ollama](https://github.com/taketwo/llm-ollama)** adds support for local models run using [Ollama](https://ollama.ai/). -- **[llm-llamafile](https://github.com/simonw/llm-llamafile)** adds support for local models that are running locally using [llamafile](https://github.com/Mozilla-Ocho/llamafile). ## Remote APIs