<a name="back-to-top"></a>

[](https://lycheeverse.github.io)
[](https://github.com/marketplace/actions/lychee-broken-link-checker)
[](https://docs.rs/lychee-lib)
[](https://github.com/lycheeverse/lychee/actions/workflows/links.yml)
[](https://hub.docker.com/r/lycheeverse/lychee)

⚡ A fast, async, stream-based link checker written in Rust.\
Finds broken hyperlinks and mail addresses inside Markdown, HTML,
reStructuredText, or any other text file or website!

Available as a command-line utility, a library and a [GitHub Action](https://github.com/lycheeverse/lychee-action).

## Installation
### Arch Linux
```sh
pacman -S lychee
```
### macOS
```sh
brew install lychee
```
### Docker
```sh
docker pull lycheeverse/lychee
```
### NixOS
```sh
nix-env -iA nixos.lychee
```
### FreeBSD
```sh
pkg install lychee
```
### Scoop
```sh
scoop install lychee
```
### Termux
```sh
pkg install lychee
```
### Pre-built binaries
We provide binaries for Linux, macOS, and Windows for every release. \
You can download them from the [releases page](https://github.com/lycheeverse/lychee/releases).
### Cargo
#### Build dependencies
On APT/dpkg-based Linux distros (e.g. Debian, Ubuntu, Linux Mint and Kali Linux)
the following commands will install all required build dependencies, including
the Rust toolchain and `cargo`:
```sh
curl -sSf 'https://sh.rustup.rs' | sh
apt install gcc pkg-config libc6-dev libssl-dev
```
#### Compile and install lychee
```sh
cargo install lychee
```
#### Feature flags
Lychee supports several feature flags:
- `native-tls` enables the platform-native TLS crate [native-tls](https://crates.io/crates/native-tls).
- `vendored-openssl` compiles and statically links a copy of OpenSSL. See the corresponding feature of the [openssl](https://crates.io/crates/openssl) crate.
- `rustls-tls` enables the alternative TLS crate [rustls](https://crates.io/crates/rustls).
- `email-check` enables checking email addresses using the [check-if-email-exists](https://crates.io/crates/check-if-email-exists) crate. This feature requires the `native-tls` feature.
- `check_example_domains` allows checking example domains such as `example.com`. This feature is useful for testing.
By default, `native-tls` and `email-check` are enabled.
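
As a minimal sketch of combining these flags, the snippet below assembles a `cargo install` command for a Rustls-only build. Cargo's `--no-default-features` and `--features` options switch off the defaults and opt into `rustls-tls`; the variable name `LYCHEE_CARGO_FLAGS` is made up for this example. Since `email-check` requires `native-tls`, such a build presumably cannot check email addresses.

```sh
# Hypothetical variable holding the feature selection for a Rustls-only build.
LYCHEE_CARGO_FLAGS="--no-default-features --features rustls-tls"

# Print the resulting command; drop the leading `echo` to actually run it.
# $LYCHEE_CARGO_FLAGS is left unquoted on purpose so it splits into words.
echo cargo install lychee $LYCHEE_CARGO_FLAGS
```

Printing the command first makes it easy to review the feature selection before starting a lengthy compile.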
## Features
This comparison is made on a best-effort basis. Please create a PR to fix
outdated information.

| | lychee | [awesome_bot] | [muffet] | [broken-link-checker] | [linkinator] | [linkchecker] | [markdown-link-check] | [fink] |
|----------------------|---------|---------------|----------|-----------------------|--------------|----------------------|-----------------------|--------|
| Language | Rust | Ruby | Go | JS | TypeScript | Python | JS | PHP |
| Async/Parallel | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| JSON output | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![maybe]<sup>1</sup> | ![yes] | ![yes] |
| Static binary | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] |
| Markdown files | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![no] |
| HTML files | ![yes] | ![no] | ![no] | ![yes] | ![yes] | ![no] | ![yes] | ![no] |
| Text files | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |
| Website support | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![no] | ![yes] |
| Chunked encodings | ![yes] | ![maybe] | ![maybe] | ![maybe] | ![maybe] | ![no] | ![yes] | ![yes] |
| GZIP compression | ![yes] | ![maybe] | ![maybe] | ![yes] | ![maybe] | ![yes] | ![maybe] | ![no] |
| Basic Auth | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| Custom user agent | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| Relative URLs | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Skip relative URLs | ![yes] | ![no] | ![no] | ![maybe] | ![no] | ![no] | ![no] | ![no] |
| Include patterns | ![yes] | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![no] |
| Exclude patterns | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Handle redirects | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Ignore insecure SSL | ![yes] | ![yes] | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] |
| File globbing | ![yes] | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] |
| Limit scheme | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| [Custom headers] | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] |
| Summary | ![yes] | ![yes] | ![yes] | ![maybe] | ![yes] | ![yes] | ![no] | ![yes] |
| `HEAD` requests | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![no] | ![no] |
| Colored output | ![yes] | ![maybe] | ![yes] | ![maybe] | ![yes] | ![yes] | ![no] | ![yes] |
| [Filter status code] | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![no] | ![yes] | ![no] |
| Custom timeout | ![yes] | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![no] | ![yes] |
| E-mail links | ![yes] | ![no] | ![no] | ![no] | ![no] | ![yes] | ![no] | ![no] |
| Progress bar | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] |
| Retry and backoff | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] |
| Skip private domains | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |
| [Use as library] | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![no] | ![yes] | ![no] |
| Quiet mode | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] |
| [Config file] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![no] |
| Recursion | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![no] |
| Amazing lychee logo | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |

[awesome_bot]: https://github.com/dkhamsing/awesome_bot
[muffet]: https://github.com/raviqqe/muffet
[broken-link-checker]: https://github.com/stevenvachon/broken-link-checker
[linkinator]: https://github.com/JustinBeckwith/linkinator
[linkchecker]: https://github.com/linkchecker/linkchecker
[markdown-link-check]: https://github.com/tcort/markdown-link-check
[fink]: https://github.com/dantleech/fink
[yes]: ./assets/yes.svg
[no]: ./assets/no.svg
[maybe]: ./assets/maybe.svg
[custom headers]: https://github.com/rust-lang/crates.io/issues/788
[filter status code]: https://github.com/tcort/markdown-link-check/issues/94
[skip private domains]: https://github.com/appscodelabs/liche/blob/a5102b0bf90203b467a4f3b4597d22cd83d94f99/url_checker.go
[use as library]: https://github.com/raviqqe/liche/issues/13
[config file]: https://github.com/lycheeverse/lychee/blob/master/lychee.example.toml

<sup>1</sup> Other machine-readable formats like CSV are supported.

## Commandline usage

Recursively check all links in supported files inside the current directory:

```sh
lychee .
```

You can also specify various types of inputs:

```sh
# check links in specific local file(s):
lychee README.md
lychee test.html info.txt

# check links on a website:
lychee https://endler.dev

# check links in directory but block network requests
lychee --offline path/to/directory

# check links in a remote file:
lychee https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md

# check links in local files via shell glob:
lychee ~/projects/*/README.md

# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"

# ignore case when globbing and check result for each link:
lychee --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"

# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -
```

lychee parses other file formats as plaintext and extracts links using [linkify](https://github.com/robinst/linkify).
This generally works well if there are no format or encoding specifics,
but in case you need dedicated support for a new file format, please consider creating an issue.

### Docker Usage
Here's how to mount a local directory into the container and check some input
with lychee.

- The `--init` parameter is passed so that lychee can be stopped from the terminal.
- We also pass `-it` to start an interactive terminal, which is required to show the progress bar.
- The `--rm` flag removes the container from the host once the run has finished (self-cleanup).
- The `-w /input` option sets `/input` as the working directory inside the container.
- The `-v $(pwd):/input` option mounts the current local directory into the container at `/input` so that lychee can access it.

> By default a Debian-based Docker image is used. If you want to run an Alpine-based image, use the `latest-alpine` tag.
> For example, `lycheeverse/lychee:latest-alpine`.

#### Linux/macOS shell command
```sh
docker run --init -it --rm -w /input -v $(pwd):/input lycheeverse/lychee README.md
```
#### Windows PowerShell command
```powershell
docker run --init -it --rm -w /input -v ${PWD}:/input lycheeverse/lychee README.md
```
### GitHub Token

To avoid getting rate-limited while checking GitHub links, you can optionally
set an environment variable with your GitHub token like so `GITHUB_TOKEN=xxxx`,
or use the `--github-token` CLI option. It can also be set in the config file.
[Here is an example config file][config file].

The token can be generated on your [GitHub account settings page](https://github.com/settings/tokens).
A personal access token with no extra permissions is enough to be able to check public repo links.

For more scalable organization-wide scenarios you can consider a [GitHub App][github-app-overview].
It has a higher rate limit than personal access tokens but requires additional configuration steps on your GitHub workflow.
Please follow the [GitHub App Setup][github-app-setup] example.

[github-app-overview]: https://docs.github.com/en/apps/overview
[github-app-setup]: https://github.com/github/combine-prs/blob/main/docs/github-app-setup.md#github-app-setup

### Commandline Parameters

There is an extensive list of command line parameters to customize the behavior.
See below for a full list.

```text
A fast, async link checker

Finds broken URLs and mail addresses inside Markdown, HTML, `reStructuredText`, websites and more!

Usage: lychee [OPTIONS] <inputs>...

Arguments:
  <inputs>...
          The inputs (where to get links to check from). These can be: files (e.g. `README.md`), file globs (e.g. `"~/git/*/README.md"`), remote URLs (e.g. `https://example.com/README.md`) or standard input (`-`). NOTE: Use `--` to separate inputs from options that allow multiple arguments

Options:
  -c, --config <CONFIG_FILE>
          Configuration file to use
          [default: lychee.toml]

  -v, --verbose...
          Set verbosity level; more output per occurrence (e.g. `-v` or `-vv`)

  -q, --quiet...
          Less output per occurrence (e.g. `-q` or `-qq`)

  -n, --no-progress
          Do not show progress bar.
          This is recommended for non-interactive shells (e.g. for continuous integration)

      --cache
          Use request cache stored on disk at `.lycheecache`

      --max-cache-age <MAX_CACHE_AGE>
          Discard all cached requests older than this duration
          [default: 1d]

      --dump
          Don't perform any link checking. Instead, dump all the links extracted from inputs that would be checked

      --archive <ARCHIVE>
          Specify the use of a specific web archive. Can be used in combination with `--suggest`
          [possible values: wayback]

      --suggest
          Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`

  -m, --max-redirects <MAX_REDIRECTS>
          Maximum number of allowed redirects
          [default: 5]

      --max-retries <MAX_RETRIES>
          Maximum number of retries per request
          [default: 3]

      --max-concurrency <MAX_CONCURRENCY>
          Maximum number of concurrent network requests
          [default: 128]

  -T, --threads <THREADS>
          Number of threads to utilize. Defaults to number of cores available to the system

  -u, --user-agent <USER_AGENT>
          User agent
          [default: lychee/0.13.0]

  -i, --insecure
          Proceed for server connections considered insecure (invalid TLS)

  -s, --scheme <SCHEME>
          Only test links with the given schemes (e.g. http and https)

      --offline
          Only check local files and block network requests

      --include <INCLUDE>
          URLs to check (supports regex). Has preference over all excludes

      --exclude <EXCLUDE>
          Exclude URLs and mail addresses from checking (supports regex)

      --exclude-file <EXCLUDE_FILE>
          Deprecated; use `--exclude-path` instead

      --exclude-path <EXCLUDE_PATH>
          Exclude file path from getting checked

  -E, --exclude-all-private
          Exclude all private IPs from checking.
          Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`

      --exclude-private
          Exclude private IP address ranges from checking

      --exclude-link-local
          Exclude link-local IP address range from checking

      --exclude-loopback
          Exclude loopback IP address range and localhost from checking

      --exclude-mail
          Exclude all mail addresses from checking

      --remap <REMAP>
          Remap URI matching pattern to different URI

      --header <HEADER>
          Custom request header

  -a, --accept <ACCEPT>
          Comma-separated list of accepted status codes for valid links

  -t, --timeout <TIMEOUT>
          Website timeout in seconds from connect to response finished
          [default: 20]

  -r, --retry-wait-time <RETRY_WAIT_TIME>
          Minimum wait time in seconds between retries of failed requests
          [default: 1]

  -X, --method <METHOD>
          Request method
          [default: get]

  -b, --base <BASE>
          Base URL or website root directory to check relative URLs e.g. https://example.com or `/path/to/public`

      --basic-auth <BASIC_AUTH>
          Basic authentication support. E.g. `http://example.com username:password`

      --github-token <GITHUB_TOKEN>
          GitHub API token to use when checking github.com links, to avoid rate limiting
          [env: GITHUB_TOKEN]

      --skip-missing
          Skip missing input files (default is to error if they don't exist)

      --include-verbatim
          Find links in verbatim sections like `pre`- and `code` blocks

      --glob-ignore-case
          Ignore case when expanding filesystem path glob inputs

  -o, --output <OUTPUT>
          Output file of status report

  -f, --format <FORMAT>
          Output format of final status report (compact, detailed, json, markdown)
          [default: compact]

      --require-https
          When HTTPS is available, treat HTTP links as errors

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
```
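
To illustrate how several of these options combine, here is a sketch of a wrapper function for continuous integration. The function name `ci_link_check` and the report file name `lychee-report.json` are assumptions made for this example; the flags themselves are taken from the help text above, and the chosen values (5 retries, 30 seconds) are arbitrary.

```sh
# Hypothetical CI wrapper around lychee; every flag appears in the help text.
ci_link_check() {
    # --no-progress: recommended for non-interactive shells
    # --exclude-mail: skip mail addresses entirely
    # --format/--output: write a JSON status report to a file
    lychee \
        --no-progress \
        --max-retries 5 \
        --timeout 30 \
        --exclude-mail \
        --format json \
        --output lychee-report.json \
        "$@"
}

# Example call (requires lychee on PATH):
#   ci_link_check README.md docs/
```

Passing `"$@"` through keeps the wrapper usable with any of the input types lychee accepts, such as files, globs, or URLs.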
### Exit codes
- `0` for success (all links checked successfully or excluded/skipped as configured)
- `1` for missing inputs and any unexpected runtime failures or config errors
- `2` for link check failures (if any non-excluded link failed the check)
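
A script can branch on these codes. The following is a small sketch; the helper name `describe_lychee_exit` and its messages are invented for this example, and only the numeric codes come from the list above.

```sh
# Hypothetical helper: translate a lychee exit code into a short message.
describe_lychee_exit() {
    case "$1" in
        0) echo "ok: all links passed or were excluded" ;;
        1) echo "error: missing input, config problem, or runtime failure" ;;
        2) echo "failed: at least one link did not pass the check" ;;
        *) echo "unknown exit code: $1" ;;
    esac
}

# Typical use in a script (not executed here):
#   lychee --no-progress README.md
#   describe_lychee_exit "$?"
```

Distinguishing `1` from `2` lets a pipeline treat broken links differently from a misconfigured run.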
### Ignoring links
You can exclude links from getting checked by specifying regex patterns
with `--exclude` (e.g. `--exclude 'example\.(com|org)'`; the quotes keep the shell from interpreting the pattern).
If a file named `.lycheeignore` exists in the current working directory, its
contents are excluded as well. The file allows you to list multiple regular
expressions for exclusion (one pattern per line).

For excluding files/directories from being scanned use `lychee.toml`
and `exclude_path`.
```toml
exclude_path = ["some/path", "*/dev/*"]
```
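
As a rough sketch of the `.lycheeignore` format described above, the snippet below writes a two-pattern file into a scratch directory and previews which sample URLs the patterns would match. The preview uses `grep -E`, which is only an approximation of lychee's own regex engine and is assumed to agree for simple patterns like these; the sample URLs and file names other than `.lycheeignore` are made up.

```sh
# Scratch directory so no real project files are touched.
workdir=$(mktemp -d)

# One regular expression per line.
cat > "$workdir/.lycheeignore" <<'EOF'
example\.(com|org)
^https://localhost
EOF

# Sample URLs to test the patterns against (made up for this example).
printf '%s\n' \
    'https://example.com/docs' \
    'https://lycheeverse.github.io' \
    'https://localhost:8080/health' > "$workdir/urls.txt"

# Approximate preview: list the URLs that match any pattern in the file.
excluded=$(grep -E -f "$workdir/.lycheeignore" "$workdir/urls.txt")
echo "$excluded"
```

For an authoritative answer, run lychee itself from the directory that contains the `.lycheeignore` file.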
### Caching
If the `--cache` flag is set, lychee will cache responses in a file called
`.lycheecache` in the current directory. If the file exists and the flag is set,
then the cache will be loaded on startup. This can greatly speed up future runs.
Note that by default lychee will not store any data on disk.
## Library usage
You can use lychee as a library for your own projects!
Here is a "hello world" example:
```rust
use lychee_lib::Result;

#[tokio::main]
async fn main() -> Result<()> {
    // Check a single URL using lychee's default client settings.
    let response = lychee_lib::check("https://github.com/lycheeverse/lychee").await?;
    println!("{response}");
    Ok(())
}
```
This is equivalent to the following snippet, in which we build our own client:
```rust
Major refactor of codebase (#208)
- The binary component and library component are separated as two
packages in the same workspace.
- `lychee` is the binary component, in `lychee-bin/*`.
- `lychee-lib` is the library component, in `lychee-lib/*`.
- Users can now install only the `lychee-lib`, instead of both
components, that would require fewer dependencies and faster
compilation.
- Dependencies for each component are adjusted and updated. E.g.,
no CLI dependencies for `lychee-lib`.
- CLI tests are only moved to `lychee`, as it has nothing to do
with the library component.
- `Status::Error` is refactored to contain dedicated error enum,
`ErrorKind`.
- The motivation is to delay the formatting of errors to strings.
Note that `e.to_string()` is not necessarily cheap (though
trivial in many cases). The formatting is no delayed until the
error is needed to be displayed to users. So in some cases, if
the error is never used, it means that it won't be formatted at
all.
- Replaced `regex` based matching with one of the following:
- Simple string equality test in the case of 'false positivie'.
- URL parsing based test, in the case of extracting repository and
user name for GitHub links.
- Either cases would be much more efficient than `regex` based
matching. First, there's no need to construct a state machine for
regex. Second, URL is already verified and parsed on its creation,
and extracting its components is fairly cheap. Also, this removes
the dependency on `lazy-static` in `lychee-lib`.
- `types` module now has a sub-directory, and its components are now
separated into their own modules (in that sub-directory).
- `lychee-lib::test_utils` module is only compiled for tests.
- `wiremock` is moved to `dev-dependency` as it's only needed for
`test` modules.
- Dependencies are listed in alphabetical order.
- Imports are organized in the following fashion:
- Imports from `std`
- Imports from 3rd-party crates, and `lychee-lib`.
- Imports from `crate::*` or `super::*`.
- No glob import.
- I followed suggestion from `cargo clippy`, with `clippy::all` and
`clippy:pedantic`.
Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
use lychee_lib::{ClientBuilder, Result, Status};
2021-02-15 23:35:59 +00:00
#[tokio::main]
Major refactor of codebase (#208)
- The binary component and library component are separated as two
packages in the same workspace.
- `lychee` is the binary component, in `lychee-bin/*`.
- `lychee-lib` is the library component, in `lychee-lib/*`.
- Users can now install only the `lychee-lib`, instead of both
components, that would require fewer dependencies and faster
compilation.
- Dependencies for each component are adjusted and updated. E.g.,
no CLI dependencies for `lychee-lib`.
- CLI tests are only moved to `lychee`, as it has nothing to do
with the library component.
- `Status::Error` is refactored to contain dedicated error enum,
`ErrorKind`.
- The motivation is to delay the formatting of errors to strings.
Note that `e.to_string()` is not necessarily cheap (though
trivial in many cases). The formatting is no delayed until the
error is needed to be displayed to users. So in some cases, if
the error is never used, it means that it won't be formatted at
all.
- Replaced `regex` based matching with one of the following:
- Simple string equality test in the case of 'false positivie'.
- URL parsing based test, in the case of extracting repository and
user name for GitHub links.
- Either cases would be much more efficient than `regex` based
matching. First, there's no need to construct a state machine for
regex. Second, URL is already verified and parsed on its creation,
and extracting its components is fairly cheap. Also, this removes
the dependency on `lazy-static` in `lychee-lib`.
- `types` module now has a sub-directory, and its components are now
separated into their own modules (in that sub-directory).
- `lychee-lib::test_utils` module is only compiled for tests.
- `wiremock` is moved to `dev-dependency` as it's only needed for
`test` modules.
- Dependencies are listed in alphabetical order.
- Imports are organized in the following fashion:
- Imports from `std`
- Imports from 3rd-party crates, and `lychee-lib`.
- Imports from `crate::*` or `super::*`.
- No glob import.
- I followed suggestion from `cargo clippy`, with `clippy::all` and
`clippy:pedantic`.
Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
async fn main() -> Result<()> {
    let client = ClientBuilder::default().client()?;
    let response = client.check("https://github.com/lycheeverse/lychee").await?;
    assert!(response.status().is_success());
    Ok(())
}
```
The client builder is very customizable:
```rust, ignore
let client = lychee_lib::ClientBuilder::builder()
    .includes(includes)
    .excludes(excludes)
    .max_redirects(cfg.max_redirects)
    .user_agent(cfg.user_agent)
    .allow_insecure(cfg.insecure)
    .custom_headers(headers)
    .method(method)
    .timeout(timeout)
    .github_token(cfg.github_token)
    .scheme(cfg.scheme)
    .accepted(accepted)
    .build()
    .client()?;
```
All options that you set will be used for all link checks.
See the [builder
documentation](https://docs.rs/lychee-lib/latest/lychee_lib/struct.ClientBuilder.html)
for all options. For more information, check out the [examples](examples)
folder.
## GitHub Action Usage
A GitHub Action that uses lychee is available as a separate repository, [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action),
which includes usage instructions.
## Contributing to lychee
We'd be thankful for any contribution. \
We try to keep the issue tracker up-to-date so you can quickly find a task to work on.

Try one of these links to get started:

- [good first issues](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
- [help wanted](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22)

For more detailed instructions, head over to [`CONTRIBUTING.md`](/CONTRIBUTING.md).
## Debugging and improving async code
Lychee makes heavy use of async code to be resource-friendly while still being
performant. Async code can be difficult to troubleshoot with most tools,
however, so we provide experimental support for
[tokio-console](https://github.com/tokio-rs/console), which gives a top(1)-like
overview of async tasks!

If you want to give it a spin, download and start the console:
```sh
git clone https://github.com/tokio-rs/console
cd console
cargo run
```
Then run lychee with the `tokio_unstable` cfg flag and the `tokio-console` feature enabled:
```sh
RUSTFLAGS="--cfg tokio_unstable" cargo run --features tokio-console -- <input1> <input2> ...
```
If you find a way to make lychee faster, please do reach out.
## Troubleshooting and Workarounds
We collect a list of common workarounds for various websites in our [troubleshooting guide](./docs/TROUBLESHOOTING.md).
## Users
- https://github.com/InnerSourceCommons/InnerSourcePatterns
- https://github.com/opensearch-project/OpenSearch
- https://github.com/ramitsurana/awesome-kubernetes
- https://github.com/papers-we-love/papers-we-love
- https://github.com/pingcap/docs
- https://github.com/microsoft/WhatTheHack
- https://github.com/Azure/ResourceModules
- https://github.com/nix-community/awesome-nix
- https://github.com/balena-io/docs
- https://github.com/launchdarkly/LaunchDarkly-Docs
- https://github.com/pawroman/links
- https://github.com/analysis-tools-dev/static-analysis
- https://github.com/analysis-tools-dev/dynamic-analysis
- https://github.com/mre/idiomatic-rust
- https://github.com/lycheeverse/lychee (yes, the lychee docs are checked with lychee 🤯)

If you are using lychee for your project, **please add it here**.
## Credits
The first prototype of lychee was built in [episode 10 of Hello
Rust](https://hello-rust.show/10/). Thanks to all GitHub and Patreon sponsors
for supporting the development since the beginning. Also, thanks to all the
great contributors who have since made this project more mature.
## License
lychee is licensed under either of

- Apache License, Version 2.0, (LICENSE-APACHE or
  https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)

at your option.

<br><hr>

[🔼 Back to top](#back-to-top)