[](https://docs.rs/lychee-lib)
[](https://github.com/marketplace/actions/lychee-broken-link-checker)
⚡ A fast, async, stream-based link checker written in Rust.\
Finds broken hyperlinks and mail addresses inside Markdown, HTML, reStructuredText, or any other text file or website!

Available as a CLI utility and as a GitHub Action: [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action).

## Installation

### Arch Linux

```sh
pacman -S lychee-link-checker
```

### macOS

```sh
brew install lychee
```

### Docker

```sh
docker pull lycheeverse/lychee
```

### NixOS

```sh
nix-env -iA nixos.lychee
```

### FreeBSD

```sh
pkg install lychee
```

### Termux

```sh
pkg install lychee
```

### Pre-built binaries

We provide binaries for Linux, macOS, and Windows for every release. \
You can download them from the [releases page](https://github.com/lycheeverse/lychee/releases).

### Cargo

#### Build dependencies

On APT/dpkg-based Linux distros (e.g. Debian, Ubuntu, Linux Mint and Kali Linux)
the following commands will install all required build dependencies, including
the Rust toolchain and `cargo`:

```sh
curl -sSf 'https://sh.rustup.rs' | sh
apt install gcc pkg-config libc6-dev libssl-dev
```

#### Compile and install lychee

```sh
cargo install lychee
```

## Features

This comparison is made on a best-effort basis. Please create a PR to fix
outdated information.

| | lychee | [awesome_bot] | [muffet] | [broken-link-checker] | [linkinator] | [linkchecker] | [markdown-link-check] | [fink] |
| -------------------- | ------- | ------------- | -------- | --------------------- | ------------ | ------------- | --------------------- | ------ |
| Language | Rust | Ruby | Go | JS | TypeScript | Python | JS | PHP |
| Async/Parallel | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| JSON output | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![maybe]<sup>1</sup> | ![yes] | ![yes] |
| Static binary | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] |
| Markdown files | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![no] |
| HTML files | ![yes] | ![no] | ![no] | ![yes] | ![yes] | ![no] | ![yes] | ![no] |
| Text files | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |
| Website support | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![no] | ![yes] |
| Chunked encodings | ![yes] | ![maybe] | ![maybe] | ![maybe] | ![maybe] | ![no] | ![yes] | ![yes] |
| GZIP compression | ![yes] | ![maybe] | ![maybe] | ![yes] | ![maybe] | ![yes] | ![maybe] | ![no] |
| Basic Auth | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| Custom user agent | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| Relative URLs | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Skip relative URLs | ![yes] | ![no] | ![no] | ![maybe] | ![no] | ![no] | ![no] | ![no] |
| Include patterns | ![yes] | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![no] |
| Exclude patterns | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Handle redirects | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] |
| Ignore insecure SSL | ![yes] | ![yes] | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] |
| File globbing | ![yes] | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] |
| Limit scheme | ![yes] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] | ![no] |
| [Custom headers] | ![yes] | ![no] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] |
| Summary | ![yes] | ![yes] | ![yes] | ![maybe] | ![yes] | ![yes] | ![no] | ![yes] |
| `HEAD` requests | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![yes] | ![no] | ![no] |
| Colored output | ![yes] | ![maybe] | ![yes] | ![maybe] | ![yes] | ![yes] | ![no] | ![yes] |
| [Filter status code] | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![no] | ![yes] | ![no] |
| Custom timeout | ![yes] | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![no] | ![yes] |
| E-mail links | ![yes] | ![no] | ![no] | ![no] | ![no] | ![yes] | ![no] | ![no] |
| Progress bar | ![yes] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] |
| Retry and backoff | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![no] | ![yes] | ![no] |
| Skip private domains | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |
| [Use as library] | ![yes] | ![yes] | ![no] | ![yes] | ![yes] | ![no] | ![yes] | ![no] |
| Quiet mode | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] |
| [Config file] | ![yes] | ![no] | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![no] |
| Recursion | ![no] | ![no] | ![yes] | ![yes] | ![yes] | ![yes] | ![yes] | ![no] |
| Amazing lychee logo | ![yes] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] | ![no] |

[awesome_bot]: https://github.com/dkhamsing/awesome_bot
[muffet]: https://github.com/raviqqe/muffet
[broken-link-checker]: https://github.com/stevenvachon/broken-link-checker
[linkinator]: https://github.com/JustinBeckwith/linkinator
[linkchecker]: https://github.com/linkchecker/linkchecker
[markdown-link-check]: https://github.com/tcort/markdown-link-check
[fink]: https://github.com/dantleech/fink
[yes]: ./assets/yes.svg
[no]: ./assets/no.svg
[maybe]: ./assets/maybe.svg
[custom headers]: https://github.com/rust-lang/crates.io/issues/788
[filter status code]: https://github.com/tcort/markdown-link-check/issues/94
[skip private domains]: https://github.com/appscodelabs/liche/blob/a5102b0bf90203b467a4f3b4597d22cd83d94f99/url_checker.go
[use as library]: https://github.com/raviqqe/liche/issues/13
[config file]: https://github.com/lycheeverse/lychee/blob/master/lychee.example.toml

<sup>1</sup> Other machine-readable formats like CSV are supported.

## Commandline usage

Recursively check all links in supported files inside the current directory:

```sh
lychee .
```

You can also specify various types of inputs:

```sh
# check links in specific local file(s):
lychee README.md
lychee test.html info.txt

# check links on a website:
lychee https://endler.dev

# check links in directory but block network requests
lychee --offline path/to/directory

# check links in a remote file:
lychee https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md

# check links in local files via shell glob:
lychee ~/projects/*/README.md

# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"

# ignore case when globbing and check result for each link:
lychee --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"

# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -
```

### Docker Usage

Here's how to mount a local directory into the container and check some input
with lychee:

```sh
docker run -v `pwd`:/input lycheeverse/lychee /input/README.md
```

### GitHub Token

To avoid getting rate-limited while checking GitHub links, you can optionally
set an environment variable with your GitHub token like so `GITHUB_TOKEN=xxxx`,
or use the `--github-token` CLI option. It can also be set in the config file.
[Here is an example config file][config file].

The token can be generated in your
[GitHub account settings page](https://github.com/settings/tokens). A personal
token with no extra permissions is enough to check links to public repositories.
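
As a rough sketch of the two command-line options described above, the token can be passed either through the environment or through the CLI option. The helper names `check_with_env_token` and `check_with_cli_token` are made up for this example and are not part of lychee; `xxxx` stands for your real token.

```sh
# Hypothetical helper: pass the token through the GITHUB_TOKEN environment variable.
# The first argument is the token, the remaining arguments are lychee inputs.
check_with_env_token() {
  token=$1
  shift
  GITHUB_TOKEN=$token lychee "$@"
}

# Hypothetical helper: pass the token through the --github-token CLI option instead.
check_with_cli_token() {
  token=$1
  shift
  lychee --github-token "$token" "$@"
}

# Usage (assumes lychee is on PATH):
#   check_with_env_token xxxx README.md
#   check_with_cli_token xxxx README.md
```

Both forms should behave the same; the environment variable is usually easier to wire into CI secrets.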

### Commandline Parameters

There is an extensive list of commandline parameters to customize the behavior.
See below for a full list.

```ignore
USAGE:
    lychee [FLAGS] [OPTIONS] <inputs>...

FLAGS:
        --dump                   Don't perform any link checking. Instead, dump all the links extracted from inputs that
                                 would be checked
    -E, --exclude-all-private    Exclude all private IPs from checking.
                                 Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`
        --exclude-link-local     Exclude link-local IP address range from checking
        --exclude-loopback       Exclude loopback IP address range and localhost from checking
        --exclude-mail           Exclude all mail addresses from checking
        --exclude-private        Exclude private IP address ranges from checking
        --glob-ignore-case       Ignore case when expanding filesystem path glob inputs
        --help                   Prints help information
    -i, --insecure               Proceed for server connections considered insecure (invalid TLS)
    -n, --no-progress            Do not show progress bar.
                                 This is recommended for non-interactive shells (e.g. for continuous integration)
        --offline                Only check local files and block network requests
        --require-https          When HTTPS is available, treat HTTP links as errors
        --skip-missing           Skip missing input files (default is to error if they don't exist)
    -V, --version                Prints version information
    -v, --verbose                Verbose program output

OPTIONS:
    -a, --accept <accept>                      Comma-separated list of accepted status codes for valid links
    -b, --base <base>                          Base URL or website root directory to check relative URLs e.g.
                                               https://example.org or `/path/to/public`
        --basic-auth <basic-auth>              Basic authentication support. E.g. `username:password`
    -c, --config <config-file>                 Configuration file to use [default: ./lychee.toml]
        --exclude <exclude>...                 Exclude URLs from checking (supports regex)
        --exclude-file <exclude-file>...       File or files that contain URLs to be excluded from checking. Regular
                                               expressions supported; one pattern per line. Automatically excludes
                                               patterns from `.lycheeignore` if file exists
    -f, --format <format>                      Output format of final status report (compact, detailed, json, markdown)
                                               [default: compact]
        --github-token <github-token>          GitHub API token to use when checking github.com links, to avoid rate
                                               limiting [env: GITHUB_TOKEN=]
    -h, --headers <headers>...                 Custom request headers
        --include <include>...                 URLs to check (supports regex). Has preference over all excludes
        --max-concurrency <max-concurrency>    Maximum number of concurrent network requests [default: 128]
    -m, --max-redirects <max-redirects>        Maximum number of allowed redirects [default: 10]
    -X, --method <method>                      Request method [default: get]
    -o, --output <output>                      Output file of status report
    -s, --scheme <scheme>...                   Only test links with the given schemes (e.g. http and https)
    -T, --threads <threads>                    Number of threads to utilize. Defaults to number of cores available to
                                               the system
    -t, --timeout <timeout>                    Website timeout from connect to response finished [default: 20]
    -u, --user-agent <user-agent>              User agent [default: lychee/0.8.1]

ARGS:
    <inputs>...    The inputs (where to get links to check from). These can be: files (e.g. `README.md`), file globs
                   (e.g. `"~/git/*/README.md"`), remote URLs (e.g. `https://example.org/README.md`) or standard
                   input (`-`). NOTE: Use `--` to separate inputs from options that allow multiple arguments
```
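
To illustrate how several of the flags listed above can be combined, here is a minimal sketch of a CI-friendly invocation wrapped in a shell function. The function name `lychee_ci` and the report file name `lychee-report.json` are assumptions made for this example; only flags shown in the help text are used.

```sh
# Hypothetical CI wrapper built only from flags shown in the help text above.
lychee_ci() {
  # --no-progress:          recommended for non-interactive shells
  # --exclude-all-private:  skip private, link-local and loopback addresses
  # --format json --output: write the final status report to a JSON file
  # --:                     separate inputs from options that take multiple arguments
  lychee --no-progress --exclude-all-private \
    --format json --output lychee-report.json \
    -- "$@"
}

# Usage (assumes lychee is on PATH):
#   lychee_ci README.md docs/
```

Keeping the flags in one function makes it easier to run the same check locally and in CI.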

### Exit codes

- `0` for success (all links checked successfully or excluded/skipped as configured)
- `1` for missing inputs and any unexpected runtime failures or config errors
- `2` for link check failures (if any non-excluded link failed the check)
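
A script can branch on these exit codes, for instance to tell a broken link apart from a misconfiguration. The following is a minimal sketch; the function name `describe_lychee_exit` and the message texts are made up for this example.

```sh
# Translate a lychee exit code into a short message (codes as listed above).
describe_lychee_exit() {
  case "$1" in
    0) echo "success: all links passed or were excluded" ;;
    1) echo "error: missing inputs, runtime failure, or config error" ;;
    2) echo "failure: at least one link check failed" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage (assumes lychee is on PATH):
#   lychee --no-progress . ; describe_lychee_exit "$?"
```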

### Ignoring links

You can exclude links from getting checked by either specifying regex patterns
with `--exclude` (e.g. `--exclude example\.(com|org)`) or by using an "exclude
file" (`--exclude-file`), which allows you to list multiple regular expressions
for exclusion (one pattern per line).
If a file named `.lycheeignore` exists in the current working directory, the patterns it contains are applied as well.
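
As a minimal sketch of the exclude-file approach, the snippet below writes a pattern file and previews which URLs it would match. The patterns, the temporary file, and the `would_exclude` helper are examples only. The preview uses `grep -E` as a stand-in, which is an approximation: lychee's own regex engine may differ in details.

```sh
# Write an exclude file with one regular expression per line.
# The file location and the patterns are examples only.
exclude_file=$(mktemp)
cat > "$exclude_file" <<'EOF'
example\.(com|org)
\.pdf$
EOF

# Rough local preview of whether a URL matches any pattern in the file.
# Arguments: <pattern file> <url>. Succeeds if at least one pattern matches.
would_exclude() {
  printf '%s\n' "$2" | grep -Eq -f "$1"
}

# Usage with lychee (quote patterns so the shell does not interpret them):
#   lychee --exclude-file "$exclude_file" README.md
#   lychee --exclude 'example\.(com|org)' README.md
```

Quoting matters here: unquoted parentheses and `|` in a pattern would be interpreted by the shell before lychee sees them.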

## Library usage

You can use lychee as a library for your own projects!
Here is a "hello world" example:

```rust
use lychee_lib::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let response = lychee_lib::check("https://github.com/lycheeverse/lychee").await?;
    println!("{}", response);
    Ok(())
}
```

This is equivalent to the following snippet, in which we build our own client:

```rust
use lychee_lib::{ClientBuilder, Result, Status};

#[tokio::main]
async fn main() -> Result<()> {
    let client = ClientBuilder::default().client()?;
    let response = client.check("https://github.com/lycheeverse/lychee").await?;
    assert!(response.status().is_success());
    Ok(())
}
```
|
|
|
|
|
|
|
2021-02-18 00:32:48 +00:00
|
|
|
|
The client builder is very customizable:
```rust, ignore
let client = lychee_lib::ClientBuilder::builder()
    .includes(includes)
    .excludes(excludes)
    .max_redirects(cfg.max_redirects)
    .user_agent(cfg.user_agent)
    .allow_insecure(cfg.insecure)
    .custom_headers(headers)
    .method(method)
    .timeout(timeout)
    .github_token(cfg.github_token)
    .scheme(cfg.scheme)
    .accepted(accepted)
    .build()
    .client()?;
```
Every option you set on the builder applies to all link checks made with the resulting client.
See the [builder documentation](https://docs.rs/lychee-lib/latest/lychee_lib/struct.ClientBuilder.html) for all options.
For more information, check out the [examples](examples) folder.
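
To illustrate how options are set once and then shared by every check, here is a minimal, self-contained sketch of the builder pattern. It uses only the standard library and performs no network requests. `CheckerBuilder`, `Checker`, and `describe_check` are hypothetical names made up for this illustration and are not part of the `lychee_lib` API; the default values are assumptions as well.

```rust
/// Hypothetical builder, loosely modelled on two of the options shown above.
/// Unset options fall back to assumed defaults in `build`.
#[derive(Debug, Default)]
struct CheckerBuilder {
    max_redirects: Option<usize>,
    user_agent: Option<String>,
}

impl CheckerBuilder {
    /// Sets the maximum number of redirects to follow.
    fn max_redirects(mut self, n: usize) -> Self {
        self.max_redirects = Some(n);
        self
    }

    /// Sets the user agent string.
    fn user_agent(mut self, ua: &str) -> Self {
        self.user_agent = Some(ua.to_string());
        self
    }

    /// Freezes the configuration into a checker.
    fn build(self) -> Checker {
        Checker {
            // Assumed defaults, chosen only for this sketch.
            max_redirects: self.max_redirects.unwrap_or(10),
            user_agent: self.user_agent.unwrap_or_else(|| "example-agent".to_string()),
        }
    }
}

/// Hypothetical checker that holds the frozen configuration.
struct Checker {
    max_redirects: usize,
    user_agent: String,
}

impl Checker {
    /// Describes how a link would be checked; every call sees the same options.
    fn describe_check(&self, url: &str) -> String {
        format!(
            "GET {} as {} (max {} redirects)",
            url, self.user_agent, self.max_redirects
        )
    }
}

fn main() {
    let checker = CheckerBuilder::default()
        .max_redirects(5)
        .user_agent("example-agent/1.0")
        .build();

    // The options configured once above apply to both checks.
    for url in ["https://example.com", "https://example.org"] {
        println!("{}", checker.describe_check(url));
    }
}
```

Freezing the configuration in `build` means individual checks cannot drift apart in behavior, which is the property the builder paragraph above describes.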
## GitHub Action Usage

A GitHub Action that uses lychee is available as a separate repository: [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action),
which includes usage instructions.

## Contributing to lychee

We'd be thankful for any contribution. \
We try to keep the issue tracker up-to-date so you can quickly find a task to work on.

Try one of these links to get started:

- [good first issues](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
- [help wanted](https://github.com/lycheeverse/lychee/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22)

Lychee is written in Rust. Install [rustup](https://rustup.rs/) to get started.
Begin by making sure the following commands succeed without errors.

```sh
cargo test # runs tests
cargo clippy # lints code
cargo install cargo-publish-all
cargo-publish-all --dry-run --yes # dry run release
```

## Debugging and improving async code

Lychee makes heavy use of async code to be resource-friendly while still being performant.
Async code can be difficult to troubleshoot with most tools, however.
Therefore, we provide experimental support for [tokio-console](https://github.com/tokio-rs/console).
It provides a top(1)-like overview for async tasks!

If you want to give it a spin, download and start the console:

```sh
git clone https://github.com/tokio-rs/console
cd console
cargo run
```

Then run lychee with the `tokio_unstable` cfg flag and the `tokio-console` feature enabled:

```sh
RUSTFLAGS="--cfg tokio_unstable" cargo run --features tokio-console -- <input1> <input2> ...
```

If you find a way to make lychee faster, please do reach out.

## Troubleshooting and Workarounds

We collect a list of common workarounds for various websites in our [troubleshooting guide](./TROUBLESHOOTING.md).

## Users

- https://github.com/opensearch-project/OpenSearch
- https://github.com/ramitsurana/awesome-kubernetes
- https://github.com/papers-we-love/papers-we-love
- https://github.com/pingcap/docs
- https://github.com/microsoft/WhatTheHack
- https://github.com/Azure/ResourceModules
- https://github.com/nix-community/awesome-nix
- https://github.com/balena-io/docs
- https://github.com/pawroman/links
- https://github.com/analysis-tools-dev/static-analysis
- https://github.com/analysis-tools-dev/dynamic-analysis
- https://github.com/mre/idiomatic-rust
- https://github.com/lycheeverse/lychee (yes, the lychee docs are checked with lychee 🤯)

If you are using lychee for your project, **please add it here**.

## Credits

The first prototype of lychee was built in [episode 10 of Hello
Rust](https://hello-rust.show/10/). Thanks to all GitHub and Patreon sponsors
for supporting the development since the beginning. Also, thanks to all the
great contributors who have since made this project more mature.

## License

lychee is licensed under either of

- Apache License, Version 2.0, (LICENSE-APACHE or
  https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)

at your option.