Commit graph

116 commits

Author SHA1 Message Date
Matthias Endler
14e748793e
Cookie Support (#1146)
This is a very conservative and limited implementation of cookie support.

The goal is to ship an MVP, which covers 80% of the use-cases.
When you run lychee with --cookie-jar cookies.json, all cookies will be stored in cookies.json, one cookie per line.
This makes cookies easy to edit by hand if needed, although this is an advanced use-case and the API for the format is not guaranteed to be stable.

Fixes: #645, #715
Partially fixes: #1108
2023-07-13 17:32:41 +02:00
Matthias Endler
97573123ef
Extend remap feature (#1133)
* wip

* Extend support for remapping

This adds support for partial remaps and
capture groups to the remap feature.

Fixes #1129
2023-07-05 15:05:19 +02:00
Techassi
67af7ef6d3
feat: add support for basic auth per URI (#1110)
* Add support for basic auth per domain
* Move URI matching to link collection phase
* Allow AsRef for BasicAuthExtractor::new to avoid clone
* Add tests

---------

Co-authored-by: Matthias Endler <matthias@endler.dev>
2023-06-26 12:06:24 +02:00
Stefan Kreutz
7dd84f6b7c
Add optional Rustls support (#1099)
* Add optional Rustls support

This commit adds a non-default feature flag to use Rustls instead of OpenSSL.

My personal motivation is to use Lychee on OpenBSD -current, where the
`openssl` crate frequently fails to link against the unreleased system
LibreSSL. Using the `vendored-openssl` feature helps with compilation, but
segfaults at runtime.

The commit adds three feature flags to the library, binary, benchmark, and all
examples:

- The `native-tls` feature flag toggles the `openssl` crate.
- The `rustls-tls` feature flag toggles the `rustls` crate.
- The `email-check` feature flag toggles the `check-if-email-exists` crate,
  which is the only existing functionality currently incompatible with Rustls.

By default, `native-tls` and `email-check` are enabled. Thus, Lychee (bin and
lib) can be used as before unless default features are disabled.

To use the Rustls feature, pass `--no-default-features --features rustls-tls` to
cargo check/build/test/..., e.g.,

    $ cargo clippy --workspace --all-targets --no-default-features \
        --features rustls-tls -- --deny warnings

Checking email addresses requires both `native-tls` and `email-check` to be
enabled. Otherwise, email addresses are excluded.

The `email-check` feature flag is technically not necessary. I preferred it
over `not(rustls-tls)` because it's clearer and it addresses the AGPL license
issue #594. As far as I understand, a Lychee binary compiled without the
`email-check` feature could be distributed with file-based copyleft for the
MPL-licensed dependencies only. But that's out of scope here.

The benchmark shows a performance regression varying between 2% and 4.4% when
using Rustls instead of OpenSSL on my machine.

PS: The `ring` crate needs to be patched on OpenBSD 7.3 and later until the new
xonly patches have been upstreamed, see the `rust-ring` port.

* Use platform native certificates with Rustls

By default, reqwest uses the webpki-roots crate with Rustls, effectively
bundling Mozilla's root certificates.

This commit uses the rustls-native-certs crate instead to use locally
installed root certificates, to minimize the difference between the
native-tls and rustls-tls features.

* Document feature flags
2023-06-16 02:21:57 +02:00
Matthias Endler
5ce77e1202
Don't cache unknown status codes (#1090)
Unknown status codes should be skipped and not cached by default. The reason is that we don't know if they are valid or not and even if they are invalid, we don't know if they will be valid in the future.
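
A minimal sketch of the rule described above. The `Status` enum, the `classify` ranges, and the `should_cache` helper are hypothetical names chosen for illustration and are not lychee's actual types.

```rust
/// Hypothetical, simplified outcome of a link check (not lychee's real type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Ok,
    Error,
    /// A status code that is neither known-good nor known-bad.
    Unknown,
}

/// Classify a raw status code. The ranges are an assumption for illustration.
fn classify(code: u16) -> Status {
    match code {
        200..=299 => Status::Ok,
        400..=599 => Status::Error,
        _ => Status::Unknown,
    }
}

/// Only results whose validity is known are written to the cache.
fn should_cache(status: Status) -> bool {
    status != Status::Unknown
}

fn main() {
    for code in [200, 404, 999] {
        let status = classify(code);
        println!("{code}: {status:?}, cache: {}", should_cache(status));
    }
}
```

Skipping the unknown case means such a link is simply checked again on the next run instead of being served from the cache.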
2023-06-02 02:46:20 +02:00
Matthias
649ab227d3 Add check duration to compact format 2023-06-01 18:31:41 +02:00
Matthias Endler
3c3051a7f0
Remove inaccurate details in compact view (#1088) 2023-06-01 16:55:30 +02:00
Matthias Endler
2b08c250be
Prettier colors and progress bar (#1069)
I've experimented a bit with the colors and these are the ones I
(currently) like best. The loader is taken from Python.
See https://stackoverflow.com/a/73724672
and 68224905f5/rich/progress_bar.py (LL70C16-L70C16)
2023-05-17 14:35:26 +02:00
Thomas Zahner
130fa21a6a
Concurrent archives (#1027) 2023-05-11 20:20:27 +02:00
Matthias Endler
fe24ba783a
Add check duration (in seconds) to report (#1064) 2023-05-06 00:47:32 +02:00
dependabot[bot]
df115098e3
Bump tabled from 0.10.0 to 0.11.1 (#1039)
* Bump tabled from 0.10.0 to 0.11.1

Bumps [tabled](https://github.com/zhiburt/tabled) from 0.10.0 to 0.11.1.
- [Release notes](https://github.com/zhiburt/tabled/releases)
- [Changelog](https://github.com/zhiburt/tabled/blob/master/CHANGELOG.md)
- [Commits](https://github.com/zhiburt/tabled/commits)

---
updated-dependencies:
- dependency-name: tabled
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* update tabled imports

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
Co-authored-by: Matthias Endler <matthias@endler.dev>
2023-04-13 15:17:12 +02:00
Matthias Endler
0e97f57040
Use standard error for error output (#990)
Fixes https://github.com/lycheeverse/lychee/issues/984

From https://doc.rust-lang.org/book/ch12-06-writing-to-stderr-instead-of-stdout.html:

> Command line programs are expected to send error messages to the standard error stream so we can still see error messages on the screen even if we redirect the standard output stream to a file. Our program is not currently well-behaved: we’re about to see that it saves the error message output to a file instead!
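
A minimal sketch of the idea in the quote, assuming a hypothetical `report` helper (not part of lychee). Results go to one writer and errors to another, so redirecting standard output to a file leaves error messages visible on the terminal.

```rust
use std::io::{self, Write};

/// Write successful results to `out` and error messages to `err`.
/// Generic writers keep the function testable; `main` passes the real
/// standard output and standard error streams.
fn report<O: Write, E: Write>(
    out: &mut O,
    err: &mut E,
    results: &[Result<&str, &str>],
) -> io::Result<()> {
    for result in results {
        match result {
            Ok(msg) => writeln!(out, "{msg}")?,
            Err(msg) => writeln!(err, "error: {msg}")?,
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let results = [
        Ok("https://example.com: 200 OK"),
        Err("https://example.invalid: network error"),
    ];
    report(&mut io::stdout(), &mut io::stderr(), &results)
}
```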
2023-04-11 23:43:33 +02:00
Matthias
8f6199b5b6 Don't panic on invalid response URIs 2023-04-11 00:26:43 +02:00
Matthias
649f307028 Avoid unwrap when deserializing statuscode 2023-04-11 00:23:23 +02:00
Thomas
994b2852cd
Wayback integration (#1003)
Adds support for suggesting archived URLs for broken links.
Uses Wayback Machine as the archive provider.
2023-03-28 00:45:06 +02:00
Benny Joe Villiger
250f7a8f0a
Status codes in maps (#1014) 2023-03-27 12:29:12 +02:00
Matthias
cd45f9db07 cleanup empty file 2023-03-18 14:47:21 +01:00
Matthias Endler
30e2a2b62b
Fix --max-redirects (#987)
Having more than the max number of redirects
caused lychee to abort the requests, but did not
lead to an error.

Related: https://github.com/lycheeverse/lychee-action/issues/164
2023-03-10 15:15:37 +01:00
Matthias
9eb3149a69 Custom config handling to spot errors when passing invalid config and ignoring errors loading missing default conf 2023-03-03 12:13:09 +01:00
Matthias
6c133493e9 Revert "Don't ignore file-not-found errors when loading config"
This reverts commit 9ade4502a27cb3776c5fb39cdad7666ab854a373.
2023-03-03 12:13:09 +01:00
Matthias
387766322d Don't ignore file-not-found errors when loading config
This is no longer necessary ever since 712bdfa8cb
2023-03-03 12:13:09 +01:00
Matthias
17937537f8 Ignored URLs don't lead to failing exit code 2023-03-03 12:13:09 +01:00
Matthias
7e0b9e2c68 Update verbosity docs
Thanks to @MichaIng for mentioning the issue and providing a fix.
2023-02-25 15:44:43 +01:00
Matthias Endler
7874195bbb
Customize verbosity (#956) 2023-02-24 23:53:09 +01:00
dependabot[bot]
d8e4940dbe
Bump toml from 0.5.11 to 0.7.0 (#933)
* Bump toml from 0.5.11 to 0.7.0

Bumps [toml](https://github.com/toml-rs/toml) from 0.5.11 to 0.7.0.
- [Release notes](https://github.com/toml-rs/toml/releases)
- [Commits](https://github.com/toml-rs/toml/compare/toml-v0.5.11...toml-v0.7.0)

---
updated-dependencies:
- dependency-name: toml
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Introduce new let...else syntax

* Update config file loading for latest toml crate version

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
Co-authored-by: Matthias Endler <matthias@endler.dev>
2023-01-30 15:12:34 +01:00
Lucius Hu
e2406089ad
chore!: improve client and remap modules (#913)
`lychee_lib::client`:

- Improved documentation.
- Added a log message in `ClientBuilder::client()` when the provided user-agent
  overrides the one defined in the provided custom header.
- Removed unnecessary error handling in `Client::check()` when setting the HTTPS
  scheme because all failure cases should occur when checking this URL the first
  time already.
- Removed unnecessary error handling in `Client::remap()` since
  `lychee_lib::remap::Remaps::remap()` doesn't return a `Result` anymore.
- Fixed potential integer overflow in `Client::check_website()` when the wait
  time between retries doubles, by using `std::time::Duration::saturating_mul`
  instead.
- Renamed `invalid()` to `validate_url()`.
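
The overflow fix mentioned in the list above can be illustrated with a small sketch. The `next_wait` helper is a hypothetical name, not the function used in `Client::check_website()`; only `Duration::saturating_mul` is taken from the commit message.

```rust
use std::time::Duration;

/// Double the wait time between retries without overflowing.
/// `Duration::saturating_mul` clamps at `Duration::MAX`, whereas the `*`
/// operator on `Duration` panics on overflow.
fn next_wait(current: Duration) -> Duration {
    current.saturating_mul(2)
}

fn main() {
    let mut wait = Duration::from_secs(1);
    for attempt in 1..=3 {
        println!("retry {attempt}: waiting {wait:?}");
        wait = next_wait(wait);
    }
    // Near the upper bound the result saturates instead of overflowing.
    assert_eq!(next_wait(Duration::MAX), Duration::MAX);
}
```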

`lychee_lib::remap`:

- Improved documentation, in particular, clarified (in the comment) that it's
  URLs not URIs being remapped.
- Changed `Remaps::remap()` so it takes `&mut Url` instead of `Uri` as its
  argument, and doesn't return a `Result` as a result.
    - Using `Url` instead of `Uri` because it aligns with the concept of
      remapping locations rather than identifiers.
    - Mutating the URL directly instead of returning a new one because it's more
      straightforward.
    - There is no error handling because we don't convert from URL to URI
      anymore. Furthermore, this always succeeded in the first place so we never
      needed error handling.
- Added implementation of `IntoIterator` for `&'a Remaps` and convenience method
  of `Remaps::iter`. (Their mutable or moving counterparts are deliberately
  avoided because we don't want library users to modify or consume the
  remapping rules after instantiation.)
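
A sketch of the read-only iteration pattern described in the last item. The `Remaps` type here is a simplified stand-in that stores plain string pairs, which is an assumption; the real type's fields are not shown in this log.

```rust
use std::slice;

/// Hypothetical stand-in for a list of remapping rules: (pattern, replacement).
struct Remaps(Vec<(String, String)>);

impl Remaps {
    /// Convenience method for read-only iteration over the rules.
    fn iter(&self) -> slice::Iter<'_, (String, String)> {
        self.0.iter()
    }
}

/// Only the shared-reference form is implemented, so callers can read the
/// rules in a `for` loop but cannot mutate or consume them that way.
impl<'a> IntoIterator for &'a Remaps {
    type Item = &'a (String, String);
    type IntoIter = slice::Iter<'a, (String, String)>;

    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

fn main() {
    let remaps = Remaps(vec![(
        "https://old.example".to_string(),
        "https://new.example".to_string(),
    )]);
    println!("{} rule(s)", remaps.iter().count());
    for (pattern, replacement) in &remaps {
        println!("{pattern} -> {replacement}");
    }
}
```

Because neither `IntoIterator for Remaps` nor `IntoIterator for &mut Remaps` exists in this sketch, `for rule in remaps` and `for rule in &mut remaps` fail to compile.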

`lychee_lib::error`:

- Renamed `ErrorKind::InvalidUriRemap` to `InvalidUrlRemap` and improved
  its error message.

Changes to other modules are minor and only serve to accompany the
aforementioned changes.
2023-01-16 19:14:09 +01:00
Matthias Endler
15d8024c7c
Change progress bar style (#718)
* Bump indicatif from 0.16.2 to 0.17.0

Bumps [indicatif](https://github.com/console-rs/indicatif) from 0.16.2 to 0.17.0.
- [Release notes](https://github.com/console-rs/indicatif/releases)
- [Commits](https://github.com/console-rs/indicatif/compare/0.16.2...0.17.0)

---
updated-dependencies:
- dependency-name: indicatif
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update progress bar setup

* Change progress bar style

* Use pink for spinner
* Show ETA instead of elapsed
* dim progress bar and adjust size to terminal width

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-22 15:29:45 +01:00
Matthias Endler
da46734c54
Extend response stats in verbose mode (#882) 2022-12-20 10:43:01 +01:00
Matthias Endler
6df1c378ec
Fix Rust 1.66 clippy lints (#879) 2022-12-19 14:28:10 +01:00
Matthias
96dec6984a
Refactor check function (#860) 2022-12-12 01:05:47 +01:00
Matthias
e476965bee
Fix verbosity serialization (#853)
Forgot the serde defaults, which led to problems on some terminals
2022-11-29 12:59:32 +01:00
Matthias
93a1481305
Less verbose cache age formatting (#849)
Previously the cache age was formatted with nanosecond resolution,
which is too fine-grained even for Rustaceans.
Now the format is limited to days, hours, minutes, and seconds.
With that, the cache age becomes more easily parseable by humans.
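
A minimal sketch of this kind of formatting, assuming a hypothetical `format_age` helper and a `1d 2h 3m 4s` style output. The exact output format lychee uses is not shown here, so treat the unit suffixes as an assumption.

```rust
use std::time::Duration;

/// Format a duration using days, hours, minutes, and seconds only.
/// Sub-second precision is dropped and zero-valued units are omitted.
fn format_age(age: Duration) -> String {
    let total = age.as_secs();
    let days = total / 86_400;
    let hours = total % 86_400 / 3_600;
    let minutes = total % 3_600 / 60;
    let seconds = total % 60;

    let mut parts = Vec::new();
    if days > 0 {
        parts.push(format!("{days}d"));
    }
    if hours > 0 {
        parts.push(format!("{hours}h"));
    }
    if minutes > 0 {
        parts.push(format!("{minutes}m"));
    }
    // Always print at least the seconds so the result is never empty.
    if seconds > 0 || parts.is_empty() {
        parts.push(format!("{seconds}s"));
    }
    parts.join(" ")
}

fn main() {
    // 1 day, 2 hours, 3 minutes, 4 seconds, plus nanoseconds that get dropped.
    let age = Duration::new(93_784, 123_456_789);
    println!("{}", format_age(age));
}
```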
2022-11-29 00:39:49 +01:00
Matthias
982d978e47
Add different verbosity levels (#824)
More granular verbosity levels have been asked
for repeatedly.
To enable that we're moving to [env_logger] and [clap-verbosity-flag]
to provide more flexible verbosity settings.

Also tackles #661, #709
Lays the groundwork for tackling #268

https://github.com/rust-cli/env_logger
https://github.com/clap-rs/clap-verbosity-flag
2022-11-28 23:25:33 +01:00
Matthias
b479a5810e
Allow overriding accepted status codes for cached URIs (#843)
Fixes #840
2022-11-28 12:23:07 +01:00
dependabot[bot]
2ce1a9ae06
Bump clap from 3.2.23 to 4.0.22 (#813)
* Bump clap from 3.2.23 to 4.0.22

Bumps [clap](https://github.com/clap-rs/clap) from 3.2.23 to 4.0.22.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.2.23...v4.0.22)

* The `headers` option got renamed to `header` to align with the rest
   of the options, which are singular.
* The short option for `header` (`-h`) was removed to avoid a conflict with
  help (`lychee -h`).
* Update and simplify readme check

Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-11-13 21:10:32 +01:00
Matthias
35ccfb87c3
Add support for dumping links to file (#810) 2022-11-08 00:33:16 +01:00
Matthias
264af23822 Improve wording 2022-11-05 17:25:44 +01:00
Andy Grunwald
a67b513238
Extend description of "--exclude" to also exclude email addresses, not only URLs (#801) 2022-10-23 12:17:20 +02:00
Matthias
cbd936960a
Move from structopt to clap (#732)
Structopt was subsumed by clap. See
https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#migrating
2022-08-12 22:53:13 +02:00
Matthias
69f387c1bd
Markdown-status (#729)
* Fix typos

* Add status code description to markdown output
2022-08-11 22:08:05 +02:00
tooomm
092b8b0bf1
reorder md output (#708) 2022-08-04 00:48:45 +02:00
dependabot[bot]
960e32c55f
Bump tabled from 0.7.0 to 0.8.0 (#701)
* Bump tabled from 0.7.0 to 0.8.0

Bumps [tabled](https://github.com/zhiburt/tabled) from 0.7.0 to 0.8.0.
- [Release notes](https://github.com/zhiburt/tabled/releases)
- [Changelog](https://github.com/zhiburt/tabled/blob/master/CHANGELOG.md)
- [Commits](https://github.com/zhiburt/tabled/compare/v0.7.0...v0.8.0)

---
updated-dependencies:
- dependency-name: tabled
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update tabled formatting and tests

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-08-03 23:22:08 +02:00
dependabot[bot]
7c1b2f7527
Bump indicatif from 0.16.2 to 0.17.0 (#711)
* Bump indicatif from 0.16.2 to 0.17.0

Bumps [indicatif](https://github.com/console-rs/indicatif) from 0.16.2 to 0.17.0.
- [Release notes](https://github.com/console-rs/indicatif/releases)
- [Commits](https://github.com/console-rs/indicatif/compare/0.16.2...0.17.0)

---
updated-dependencies:
- dependency-name: indicatif
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update progress bar setup

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-08-03 14:20:25 +02:00
Matthias
6fae93f2da
Skip caching unsupported and excluded URLs (#692)
As discussed in https://github.com/lycheeverse/lychee/issues/647#issuecomment-1170773449, it does not make much sense to cache unsupported
and excluded URLs.
Unsupported URLs might be supported in the future and caching them
would mean they won't get checked then. Excluded URLs were
excluded for a reason and should not appear in the cache.
Furthermore, they might not be excluded
in a subsequent run, leading to a false positive.
2022-07-17 18:40:45 +02:00
Walter Beller-Morales
75a3da0b7e
Add status code in Markdown output (#677) 2022-07-05 14:43:15 +02:00
Matthias
78185d3b63 Add documentation 2022-06-21 10:03:31 +02:00
Matthias
84de43c554
Refactor request types (#637) 2022-06-03 20:13:07 +02:00
Matthias
a557cba0b4
Add support for parsing list of status codes from config file (#636) 2022-06-02 18:53:04 +02:00
Matthias
9b4dfadffd
Fix parsing errors with config options (#632) 2022-05-31 19:43:46 +02:00
vpereira01
d48a3279a8
Improve configuration example (#631)
* Add missing parameters
* Remove deprecated `--exclude-file` parameter
* Improve TOML comments
* Add config smoketest
2022-05-31 19:05:27 +02:00