![lychee](assets/banner.png)
![Rust](https://github.com/hello-rust/lychee/workflows/Rust/badge.svg)
...because who says I can't write yet another link checker?
## What?
This thing was created from [Hello Rust Episode
10](https://hello-rust.show/10/). It's a link checker that treats GitHub links
specially by using a `GITHUB_TOKEN` to avoid getting blocked by the rate
limiter.
![Lychee demo](./assets/lychee.gif)
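
The special treatment of GitHub links described above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not lychee's actual code: `is_github_url` and `auth_header` are hypothetical helpers, the host check is deliberately naive (no ports or userinfo), and the `token <value>` header format is the one the GitHub API accepts for personal access tokens.

```rust
use std::env;

/// Hypothetical helper (not from lychee's source): true if the URL's host is
/// github.com or one of its subdomains. Ports and userinfo are not handled.
fn is_github_url(url: &str) -> bool {
    // Drop the scheme, then keep everything up to the first '/', '?' or '#'.
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .unwrap_or(url);
    let host = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("")
        .to_ascii_lowercase();
    host == "github.com" || host.ends_with(".github.com")
}

/// Hypothetical helper: returns an `Authorization` header (name, value) only
/// when a non-empty token is available and the URL points at GitHub.
fn auth_header(url: &str, token: Option<&str>) -> Option<(String, String)> {
    match token {
        Some(t) if !t.is_empty() && is_github_url(url) => {
            Some(("Authorization".to_string(), format!("token {}", t)))
        }
        _ => None,
    }
}

fn main() {
    // Read the token from the environment; a missing variable means "no token".
    let token = env::var("GITHUB_TOKEN").ok();
    let urls = [
        "https://github.com/hello-rust/lychee",
        "https://hello-rust.show/10/",
    ];
    for &url in urls.iter() {
        match auth_header(url, token.as_deref()) {
            Some((name, _)) => println!("{}: would send an {} header", url, name),
            None => println!("{}: plain request", url),
        }
    }
}
```

The point of the design is that the token is only ever attached to GitHub hosts, so it is never leaked to unrelated sites that happen to be linked from the same document.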
## Why?
The existing link checkers were not flexible enough for my use-case. lychee
runs all requests fully asynchronously and has a low memory/CPU footprint.
## Features
This comparison is made on a best-effort basis. Please create a PR to fix outdated information.
| | lychee | awesome_bot | muffet | broken-link-checker | linkinator | linkchecker | markdown-link-check | fink |
| -------------------- | ------ | ----------- | ------ | ------------------- | ---------- | ----------- | ------------------- | ---- |
| Language | Rust | Ruby | Go | JS | TypeScript | Python | JS | PHP |
| Async/Parallel | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Static binary | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
| Markdown files | ✔️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ | ✖️ |
| HTML files | ✔️ | ✖️ | ✖️ | ✔️ | ✔️ | ✖️ | ✖️ | ✖️ |
| Txt files | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
| Website support | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ | ✔️ | ✖️ | ✔️ |
| Chunked encodings | ✔️ | **?** | **?** | **?** | **?** | ✖️ | ✔️ | ✔️ |
| GZIP compression | ✔️ | **?** | **?** | ✔️ | **?** | ✔️ | **?** | ✖️ |
| Basic Auth | ✖️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ |
| Custom user agent | ✔️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ |
| Relative URLs | ✔️ | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Skip relative URLs | ✔️ | ✖️ | ✖️ | **?** | ✖️ | ✖️ | ✖️ | ✖️ |
| Include patterns | ✔️ | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ |
| Exclude patterns | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Handle redirects | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Ignore insecure SSL | ✔️ | ✔️ | ✔️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ |
| File globbing | ✔️ | ✔️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ | ✖️ |
| Limit scheme | ✔️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ |
| [Custom headers] | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ | ✖️ | ✔️ | ✔️ |
| Summary | ✔️ | ✔️ | ✔️ | **?** | ✔️ | ✔️ | ✖️ | ✔️ |
| `HEAD` requests | ✔️ | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ | ✖️ | ✖️ |
| Colored output | ✔️ | **?** | ✔️ | **?** | ✔️ | ✔️ | ✖️ | ✔️ |
| [Filter status code] | ✔️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ | ✖️ |
| Custom timeout | ✔️ | ✔️ | ✔️ | ✖️ | ✔️ | ✔️ | ✖️ | ✔️ |
| E-mail links | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ | ✖️ | ✖️ |
| Progress bar | ✔️ | ✔️ | ✖️ | ✖️ | ✖️ | ✔️ | ✔️ | ✔️ |
| Retry and backoff | ✔️ | ✖️ | ✖️ | ✖️ | ✔️ | ✖️ | ✔️ | ✖️ |
| Skip private domains | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
| [Use as lib] | ✖️ | ✔️ | ✖️ | ✔️ | ✔️ | ✖️ | ✔️ | ✖️ |
| Quiet mode | ✔️ | ✖️ | ✖️ | ✖️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Amazing lychee logo | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ |
| Config file | ✔️ | ✖️ | ✖️ | ✖️ | ✔️ | ✔️ | ✔️ | ✖️ |
## Planned features:
- report output in HTML, SQL, CSV, XML, JSON, YAML... format
- report extended statistics: request latency
- recursion
- use colored output (https://crates.io/crates/colored)
- skip duplicate URLs
- request throttling
## Users
- https://github.com/analysis-tools-dev/static-analysis (soon)
- https://github.com/mre/idiomatic-rust (soon)
## How?
```
cargo install lychee
```
Set an environment variable with your token, like so: `GITHUB_TOKEN=xxxx`.
Run it inside a repository with a `README.md`, or specify a file with:
```
lychee <yourfile>
```
## Comparison
Collecting other link checkers here to crush them in comparison. :P
- https://github.com/dkhamsing/awesome_bot
- https://github.com/tcort/markdown-link-check
- https://github.com/raviqqe/liche
- https://github.com/raviqqe/muffet
- https://github.com/stevenvachon/broken-link-checker
- https://github.com/JustinBeckwith/linkinator
- https://github.com/linkchecker/linkchecker
- https://github.com/dantleech/fink
- https://github.com/bartdag/pylinkvalidator
- https://github.com/victoriadrake/hydra-link-checker
## Thanks
...to my GitHub sponsors and Patreon sponsors for supporting these projects. If
you want to help out as well, [go here](https://github.com/sponsors/mre/).

[custom headers]: https://github.com/rust-lang/crates.io/issues/788
[filter status code]: https://github.com/tcort/markdown-link-check/issues/94
[skip private domains]: https://github.com/appscodelabs/liche/blob/a5102b0bf90203b467a4f3b4597d22cd83d94f99/url_checker.go
[use as lib]: https://github.com/raviqqe/liche/issues/13