![lychee](assets/banner.png)

![Rust](https://github.com/hello-rust/lychee/workflows/Rust/badge.svg)

...because who says I can't write yet another link checker?

## What?

This thing was created from [Hello Rust Episode
10](https://hello-rust.show/10/). It's a link checker that treats GitHub links
specially by using a `GITHUB_TOKEN` to avoid getting blocked by the rate
limiter.

TODO: Add screenshots here

## Why?

The existing link checkers were not flexible enough for my use case. lychee
runs all requests fully asynchronously and has a low memory/CPU footprint.

lychee can...
- handle links inside Markdown, HTML, and other documents
- handle chunked encodings
- handle gzip compression
- fake user agents (required for some firewalls)
- skip non-links like anchors or relative URLs
- exclude some websites with regular expressions
- handle a configurable number of redirects
- disguise as a different user agent (like curl)
- optionally ignore SSL certificate errors (`--insecure`)
- check multiple files at once (supports globbing)
- support checking links from any website URL
- limit scheme (e.g. only check HTTPS links with "https")
- accept custom headers (e.g. for cases like https://github.com/rust-lang/crates.io/issues/788)
- show final summary/statistics
- optionally use `HEAD` requests instead of `GET`
- show colored output
- filter based on status codes (https://github.com/tcort/markdown-link-check/issues/94)
  (e.g. `--accept 200,204`)
- accept a request timeout (`--timeout`) in seconds. Default is 20s. Set to 0 for no timeout.
- check e-mail links using [check-if-email-exists](https://github.com/amaurymartiny/check-if-email-exists)
- show the progress interactively with a progress bar and in-flight requests (`--progress`) by @xiaochuanyu

SOON:
- automatically retry and back off
- check relative links (`base-url` to set project root)
- usable as a library (https://github.com/raviqqe/liche/issues/13)
- exclude private domains (https://github.com/appscodelabs/liche/blob/a5102b0bf90203b467a4f3b4597d22cd83d94f99/url_checker.go)
- recursion
- extended statistics: request latency
- use colored output (https://crates.io/crates/colored)
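
To make the `--accept` and `--timeout` options from the list above a bit more concrete, here is a minimal Rust sketch of how such values could be interpreted. This is not lychee's actual code: the function names `parse_accept`, `is_accepted`, and `timeout_from_secs` are made up for illustration, and treating any 2xx status as fine when no accept list is given is an assumption.

```rust
use std::collections::HashSet;
use std::time::Duration;

/// Parse a comma-separated status list such as "200,204" into a set.
/// Returns `None` if any entry is not a valid number.
/// (Hypothetical helper, not part of lychee.)
fn parse_accept(list: &str) -> Option<HashSet<u16>> {
    list.split(',')
        .map(|code| code.trim().parse::<u16>().ok())
        .collect()
}

/// Decide whether a response status counts as a working link.
/// Assumption: without an accept list, any 2xx status is treated as fine.
fn is_accepted(status: u16, accept: Option<&HashSet<u16>>) -> bool {
    match accept {
        Some(codes) => codes.contains(&status),
        None => (200..300).contains(&status),
    }
}

/// Turn a timeout given in seconds into an optional duration.
/// A value of 0 means "no timeout", matching the description in the list.
fn timeout_from_secs(secs: u64) -> Option<Duration> {
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}
```

With this sketch, `parse_accept("200,204")` yields a set of two codes, so a `204` response counts as a working link while a `404` does not, and `timeout_from_secs(0)` returns `None`, meaning requests would never time out.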

## Users

- SOON: https://github.com/analysis-tools-dev/static-analysis

## How?

```
cargo install lychee
```

Set an environment variable with your token like so: `GITHUB_TOKEN=xxxx`.
Run it inside a repository with a `README.md` or specify a different Markdown
file with:

```
lychee <yourfile>
```

## Comparison

Collecting other link checkers here to crush them in comparison. :P

- https://github.com/dkhamsing/awesome_bot
- https://github.com/tcort/markdown-link-check
- https://github.com/raviqqe/liche
- https://github.com/raviqqe/muffet

## Thanks

...to my GitHub sponsors and Patreon sponsors for supporting these projects. If
you want to help out as well, [go here](https://github.com/sponsors/mre/).