506 commits

Author SHA1 Message Date
Matthias
487d88cefe
Add test for mailto address with query params (#655) 2022-06-29 10:19:17 +02:00
dependabot[bot]
2e6caa512c
Bump html5gum from 0.5.1 to 0.5.2 (#659)
Bumps [html5gum](https://github.com/untitaker/html5gum) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/untitaker/html5gum/releases)
- [Commits](https://github.com/untitaker/html5gum/compare/0.5.1...0.5.2)

---
updated-dependencies:
- dependency-name: html5gum
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-06-29 10:18:54 +02:00
dependabot[bot]
231939af82
Bump html5gum from 0.4.0 to 0.5.1 (#658)
* Bump html5gum from 0.4.0 to 0.5.1

Bumps [html5gum](https://github.com/untitaker/html5gum) from 0.4.0 to 0.5.1.
- [Release notes](https://github.com/untitaker/html5gum/releases)
- [Commits](https://github.com/untitaker/html5gum/compare/0.4.0...0.5.1)

---
updated-dependencies:
- dependency-name: html5gum
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update html5gum

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-06-23 00:07:28 +02:00
Markus Unterwaditzer
f1ae22da09
Replace lazy hashset with matches! (#656)
* Replace lazy hashset with matches!

llvm will typically create much faster code than accessing a hashset at
runtime

source: trust me bro

* cargo fix

* cargo fmt

* shorten docstring
2022-06-18 19:00:07 +02:00
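The trade-off described in the commit above can be sketched roughly as follows. The element names and both function names are hypothetical, chosen only for illustration; the commit message does not say which set was replaced, and `OnceLock` stands in here for whatever lazy initialization the original code used.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

/// Runtime lookup: the set is built once on first use,
/// and every call hashes the input before comparing.
fn is_verbatim_hashset(tag: &str) -> bool {
    static TAGS: OnceLock<HashSet<&'static str>> = OnceLock::new();
    TAGS.get_or_init(|| HashSet::from(["pre", "code", "textarea"]))
        .contains(tag)
}

/// Pattern match: no allocation and no hashing, the compiler
/// lowers this to direct string comparisons.
fn is_verbatim_matches(tag: &str) -> bool {
    matches!(tag, "pre" | "code" | "textarea")
}
```

Both functions return the same answers; only the lookup strategy differs. Whether the `matches!` version is actually faster for a given set would need a benchmark.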
dependabot[bot]
2730e71656
Bump cached from 0.34.0 to 0.34.1 (#650)
Bumps [cached](https://github.com/jaemk/cached) from 0.34.0 to 0.34.1.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 12:43:04 +02:00
dependabot[bot]
f620748f25
Bump reqwest from 0.11.10 to 0.11.11 (#651)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.10 to 0.11.11.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.10...v0.11.11)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-15 23:52:30 +02:00
dependabot[bot]
e6f0f0098e
Bump http from 0.2.7 to 0.2.8 (#642)
Bumps [http](https://github.com/hyperium/http) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.7...v0.2.8)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-08 19:39:43 +02:00
dependabot[bot]
e38724a022
Bump tokio from 1.18.2 to 1.19.2 (#643)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.18.2 to 1.19.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-08 19:39:16 +02:00
Matthias
84de43c554
Refactor request types (#637) 2022-06-03 20:13:07 +02:00
dependabot[bot]
96da3d64c0
Bump check-if-email-exists from 0.8.29 to 0.8.30 (#638)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.29 to 0.8.30.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/commits)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-03 20:12:24 +02:00
dependabot[bot]
5432ec8c22
Bump openssl-sys from 0.9.73 to 0.9.74 (#635)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.73 to 0.9.74.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.73...openssl-sys-v0.9.74)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-03 08:45:00 +02:00
Matthias
9b4dfadffd
Fix parsing errors with config options (#632) 2022-05-31 19:43:46 +02:00
Matthias
b40aacd459
Prepare for release v0.10.0 (#629) 2022-05-30 23:02:18 +02:00
Matthias
f33b897d5d
Exclude example domains as per RFC 2606 from checking (#627)
Unfortunately it's not possible to automatically enable features
for `cargo test`. See https://github.com/rust-lang/cargo/issues/2911.

As a workaround to allow for using example domains for unit- and integration
tests, we introduce a new feature, `check_example_domains`, which is
disabled by default for normal users. The feature gets activated for the
integration test which checks that the example domain exclusion works as
expected.
2022-05-29 21:42:00 +02:00
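A minimal sketch of the exclusion described in the commit above, assuming the names reserved by RFC 2606 (`example.com`, `example.net`, `example.org` and the TLDs `test`, `example`, `invalid`, `localhost`). The function names are hypothetical; only the feature name `check_example_domains` comes from the commit message.

```rust
/// Second-level domains reserved by RFC 2606.
const RESERVED_DOMAINS: [&str; 3] = ["example.com", "example.net", "example.org"];
/// Top-level domains reserved by RFC 2606.
const RESERVED_TLDS: [&str; 4] = ["test", "example", "invalid", "localhost"];

/// Returns true if `host` is a reserved name or a subdomain of one.
fn is_example_domain(host: &str) -> bool {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    let matches_name = |name: &str| {
        let suffix = format!(".{name}");
        host == name || host.ends_with(suffix.as_str())
    };
    RESERVED_DOMAINS.iter().any(|d| matches_name(d))
        || RESERVED_TLDS.iter().any(|t| matches_name(t))
}

/// Hypothetical gate: skip example domains unless the
/// `check_example_domains` feature is enabled at compile time.
fn should_skip(host: &str) -> bool {
    !cfg!(feature = "check_example_domains") && is_example_domain(host)
}
```

With the feature disabled, `should_skip("www.example.org")` is true, while a host such as `notexample.com` is still checked because the comparison requires a full label match.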
Matthias
22fecfc056
Add support for URI remapping (#620)
Remaps allow mapping from a URI pattern to a different URI.

The syntax is

```
lychee --remap 'https://example.com http://127.0.0.1'
```

Some use-cases are
- Testing URIs prior to production deployment
- Testing URIs behind a proxy

Be careful when using this feature because checking every link against a
large set of regular expressions has a performance impact. Also there are no
constraints on the URI mapping, so the rules might contradict each
other.
Remap rules get applied in order of definition to every input URI.
2022-05-29 21:41:22 +02:00
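The in-order application of remap rules from the commit above might look roughly like this. To stay dependency-free, the sketch substitutes plain prefix matching for the regular expressions the commit mentions, and it assumes that each rule sees the output of the previous one. `RemapRule`, `parse_rule` and `remap` are hypothetical names.

```rust
/// One remap rule: URIs starting with `from` get that prefix replaced by `to`.
#[derive(Debug, Clone, PartialEq)]
struct RemapRule {
    from: String,
    to: String,
}

/// Parses a rule in the `'<pattern> <replacement>'` form shown above.
fn parse_rule(spec: &str) -> Option<RemapRule> {
    let mut parts = spec.split_whitespace();
    let from = parts.next()?;
    let to = parts.next()?;
    if parts.next().is_some() {
        return None; // more than two fields is not a valid rule
    }
    Some(RemapRule { from: from.to_string(), to: to.to_string() })
}

/// Applies every rule in order of definition; later rules see earlier rewrites.
fn remap(uri: &str, rules: &[RemapRule]) -> String {
    let mut current = uri.to_string();
    for rule in rules {
        let next = current
            .strip_prefix(rule.from.as_str())
            .map(|rest| format!("{}{}", rule.to, rest));
        if let Some(next) = next {
            current = next;
        }
    }
    current
}
```

Because every rule runs against every URI, the cost grows with the number of rules, which is the performance concern the commit message raises.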
Matthias
363b95fe5f
Add support for excluding paths from link checking (#623)
This change deprecates `--exclude-file` as it was ambiguous.
Instead, `--exclude-path` was introduced to support excluding paths
to files and directories that should not be checked.
Furthermore, `.lycheeignore` is now the only way
to exclude URL patterns.
2022-05-29 17:27:09 +02:00
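As a rough illustration of path exclusion as described above, the following sketch treats an excluded path as covering itself and everything beneath it. The function name is hypothetical, and how `--exclude-path` actually matches paths may differ.

```rust
use std::path::{Path, PathBuf};

/// Returns true if `path` equals an excluded path or lies inside one.
/// `Path::starts_with` compares whole components, so `docs/archive`
/// does not accidentally match `docs/archive2`.
fn is_excluded(path: &Path, excluded: &[PathBuf]) -> bool {
    excluded.iter().any(|ex| path.starts_with(ex))
}
```

A caller would test each candidate file with `is_excluded` before handing it to the link extractor.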
dependabot[bot]
627c584935
Bump once_cell from 1.11.0 to 1.12.0 (#625)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 15:27:46 +02:00
dependabot[bot]
840e5b6d4b
Bump regex from 1.5.5 to 1.5.6 (#624)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.5 to 1.5.6.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.5...1.5.6)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 18:02:13 +02:00
dependabot[bot]
1dfd2cb9e0
Bump once_cell from 1.10.0 to 1.11.0 (#622)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-22 12:18:38 +02:00
Matthias
571b49410c
Extend reqwest client settings (#617)
This sets an HTTP connect timeout (for stability)
and a TCP keepalive (for performance).

The connect timeout should help with flaky servers, which
would block the runtime and therefore other requests.

The keepalive helps when making many requests to the same
host. This is a very common pattern for checking internal documentation,
which is an important use-case of lychee.

The settings are currently not configurable by the user
and set to sane defaults. We might make this configurable in the future
if there is demand to do so.
2022-05-13 18:51:11 +02:00
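The idea behind the connect timeout in the commit above can be shown with the standard library alone. This is only an analogue: the commit configures these settings on the reqwest client, the helper name is hypothetical, and TCP keepalive is omitted because `std` does not expose it.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Gives up on the TCP handshake after `timeout` instead of waiting
/// for the operating system's much longer default.
fn connect_with_timeout(addr: SocketAddr, timeout: Duration) -> io::Result<TcpStream> {
    TcpStream::connect_timeout(&addr, timeout)
}
```

A flaky server then costs at most `timeout` per attempt rather than stalling the request for an unbounded time.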
dependabot[bot]
508ebb8726
Bump log from 0.4.16 to 0.4.17 (#609)
Bumps [log](https://github.com/rust-lang/log) from 0.4.16 to 0.4.17.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/commits/0.4.17)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-09 17:46:15 +02:00
dependabot[bot]
73c55fa8fc
Bump tokio from 1.18.0 to 1.18.2 (#612)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.18.0 to 1.18.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.18.0...tokio-1.18.2)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-09 16:24:28 +02:00
dependabot[bot]
814aa7d7f4
Bump thiserror from 1.0.30 to 1.0.31 (#606)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.30 to 1.0.31.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.30...1.0.31)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-02 15:08:27 +02:00
dependabot[bot]
fa51af2052
Bump octocrab from 0.15.4 to 0.16.0 (#601)
Bumps [octocrab](https://github.com/XAMPPRocky/octocrab) from 0.15.4 to 0.16.0.
- [Release notes](https://github.com/XAMPPRocky/octocrab/releases)
- [Changelog](https://github.com/XAMPPRocky/octocrab/blob/master/CHANGELOG.md)
- [Commits](https://github.com/XAMPPRocky/octocrab/compare/octocrab@0.15.4...octocrab@0.16.0)

---
updated-dependencies:
- dependency-name: octocrab
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-29 14:47:16 +02:00
dependabot[bot]
3345014f67
Bump tokio from 1.17.0 to 1.18.0 (#602)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.17.0...tokio-1.18.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-29 14:47:07 +02:00
dependabot[bot]
e954a87c7c
Bump http from 0.2.6 to 0.2.7 (#603)
Bumps [http](https://github.com/hyperium/http) from 0.2.6 to 0.2.7.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.6...v0.2.7)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-29 14:46:58 +02:00
Matthias
8c0a32d81d
Refactor response formatting (#599)
* Add support for raw formatter (no color)
* Introduce ResponseFormatter trait
* Pass the same params to every cli command
* Update dependencies
* Remove pretty_assertions dependency (latest version doesn't build)
2022-04-25 19:19:36 +02:00
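A formatter trait with a colorless "raw" implementation, as listed in the commit above, could be sketched like this. Only the trait name `ResponseFormatter` comes from the commit; the method signature, the two structs and the output layout are assumptions.

```rust
/// Turns a check result into one line of output.
trait ResponseFormatter {
    fn format(&self, ok: bool, uri: &str) -> String;
}

/// Plain text, safe for logs and pipes.
struct RawFormatter;

/// ANSI-colored text for interactive terminals.
struct ColorFormatter;

impl ResponseFormatter for RawFormatter {
    fn format(&self, ok: bool, uri: &str) -> String {
        let mark = if ok { "OK" } else { "ERR" };
        format!("[{mark}] {uri}")
    }
}

impl ResponseFormatter for ColorFormatter {
    fn format(&self, ok: bool, uri: &str) -> String {
        // 32 = green, 31 = red, 0 = reset
        let (color, mark) = if ok { ("\x1b[32m", "OK") } else { ("\x1b[31m", "ERR") };
        format!("{color}[{mark}]\x1b[0m {uri}")
    }
}

/// Picks a formatter once so callers do not branch on color themselves.
fn formatter_for(color: bool) -> Box<dyn ResponseFormatter> {
    if color { Box::new(ColorFormatter) } else { Box::new(RawFormatter) }
}
```

Putting the choice behind a trait object keeps the printing code identical for both modes.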
Matthias
a607b853c9
Move to downstream optimization for short strings (#600)
Skipping the parsing of very short strings was merged into linkify,
so our own workaround is no longer necessary.
https://github.com/robinst/linkify/pull/34
2022-04-25 19:18:50 +02:00
Matthias
da7bbf113d
Remove unnecessary Ok wrapper 2022-04-12 01:39:38 +02:00
Matthias
6ebc9fed4b
Reset nofollow in html5gum start tag (#584) 2022-04-06 00:49:00 +02:00
dependabot[bot]
a1726e7dbf
Bump wiremock from 0.5.11 to 0.5.12 (#582)
Bumps [wiremock](https://github.com/LukeMathWalker/wiremock-rs) from 0.5.11 to 0.5.12.
- [Release notes](https://github.com/LukeMathWalker/wiremock-rs/releases)
- [Changelog](https://github.com/LukeMathWalker/wiremock-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LukeMathWalker/wiremock-rs/compare/v0.5.11...v0.5.12)

---
updated-dependencies:
- dependency-name: wiremock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-04 17:30:52 +02:00
Matthias
debe958766
Add support for nofollow (#572) 2022-04-04 10:32:00 +02:00
Matthias
03d28820bb
Extract more status information from reqwest (#577)
Recently we cleaned up the commandline output to trim away redundant
information like the URL, which occurred twice.
Unfortunately we also removed helpful information from reqwest, which
could support the user in troubleshooting unexpected errors.

This commit reverts that.
We now extract the meaningful information from reqwest, without being
too verbose. For that we have to depend on the string output for the
reqwest error, but it's better than hiding that information from the user.
It is fragile as it depends on the reqwest internals, but in the worst case
we simply return the full error text in case our parsing won't work.
2022-04-02 14:37:03 +02:00
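The "parse the error text, fall back to the full text" approach from the commit above might be sketched as follows. The error layout assumed here (`<description> (<url>): <cause>`) and the function name are assumptions for illustration, not a guaranteed reqwest format.

```rust
/// Drops the redundant URL from an error message of the assumed form
/// `<description> (<url>): <cause>`. If the text does not have that
/// shape, the full message is returned unchanged.
fn summarize_error(full: &str) -> String {
    if let (Some(open), Some(close)) = (full.find(" ("), full.find("): ")) {
        if open < close {
            let description = &full[..open];
            let cause = &full[close + "): ".len()..];
            return format!("{description}: {cause}");
        }
    }
    full.to_string()
}
```

The fallback branch is what keeps this safe: if the upstream message format changes, the user sees more text, never less.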
dependabot[bot]
e5c63c8544
Bump html5ever from 0.25.2 to 0.26.0 (#573)
Bumps [html5ever](https://github.com/servo/html5ever) from 0.25.2 to 0.26.0.
- [Release notes](https://github.com/servo/html5ever/releases)
- [Commits](https://github.com/servo/html5ever/commits/html5ever-v0.26.0)

---
updated-dependencies:
- dependency-name: html5ever
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-28 15:50:10 +02:00
Matthias
5ad7b14bdd
Regression: Ignore invalid URLs (#571)
When refactoring the URL checking as a workaround for the upstream
reqwest panic on invalid URLs, we introduced a regression, which caused
unsupported URL schemes to show up as errors in the lychee output.

This commit changes the behavior such that invalid schemes get ignored
again by making a differentiation between truly invalid URIs which would make
reqwest panic, and ones which are valid but just not handled by reqwest.
The check was moved to `check_website` such that the invalid URIs would
not be checked three times in a loop before erroring out.
2022-03-27 23:22:46 +02:00
Matthias
36d3195c68
Cache verbosity issue (fixes #562) 2022-03-27 14:48:09 +02:00
Matthias
743d386252
Allow input URLs without scheme (fixes #567)
This requires `Input::new` to return a `Result`, because the URL
parsing could fail when prepending `http://`.

We use http instead of https, because curl does as well:
70ac27604a/lib/urlapi.c (L1104-L1124)
Missing files will be interpreted as URLs from the command line
and these can be invalid, but that's not seen as an error anymore.
2022-03-27 01:27:27 +01:00
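Prepending a default scheme, and why that step has to return a `Result`, can be sketched as below. The function name and the validation rules are hypothetical simplifications; in particular, schemes without `://` (such as `mailto:`) are not handled here.

```rust
/// Adds `http://` to inputs that carry no scheme.
/// Returns an error for inputs that cannot become a URL at all,
/// which is why the caller has to deal with a `Result`.
fn with_default_scheme(input: &str) -> Result<String, String> {
    let trimmed = input.trim();
    if trimmed.is_empty() || trimmed.contains(char::is_whitespace) {
        return Err(format!("cannot turn {input:?} into a URL"));
    }
    if trimmed.contains("://") {
        Ok(trimmed.to_string())
    } else {
        // http rather than https, mirroring the curl behavior cited above
        Ok(format!("http://{trimmed}"))
    }
}
```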
Matthias
d616177a99
Implement excluding code blocks (#523)
This is done in the extractor to avoid unnecessary
allocations.
2022-03-26 10:42:56 +01:00
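Skipping code blocks during extraction, without copying the input, could look roughly like this for fenced Markdown blocks. The function name is hypothetical and the sketch only handles backtick fences; the actual extractor may work on parser events instead of lines.

```rust
/// A Markdown code fence, written with escapes to keep this sketch readable.
const FENCE: &str = "\u{60}\u{60}\u{60}";

/// Yields only the lines that are outside fenced code blocks.
/// The iterator borrows from `input`, so no line is copied or allocated.
fn lines_outside_code_blocks(input: &str) -> impl Iterator<Item = &str> {
    let mut in_block = false;
    input.lines().filter(move |line| {
        if line.trim_start().starts_with(FENCE) {
            in_block = !in_block; // a fence line opens or closes a block
            return false;
        }
        !in_block
    })
}
```

Filtering at this stage means links inside code blocks are never materialized, rather than being extracted first and discarded later.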
dependabot[bot]
5a77209466
Bump html5ever from 0.25.1 to 0.25.2 (#566)
Bumps [html5ever](https://github.com/servo/html5ever) from 0.25.1 to 0.25.2.
- [Release notes](https://github.com/servo/html5ever/releases)
- [Commits](https://github.com/servo/html5ever/commits)

---
updated-dependencies:
- dependency-name: html5ever
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-24 13:45:07 +01:00
Matthias
77b1724881
Optimize plaintext extractor for small strings (#565)
Immediately return for very small strings which cannot be valid URIs.

The shortest valid URI without a scheme might be g.cn (Google China);
at least I am not aware of a shorter one. We set this as a lower threshold
for parsing URIs from plaintext to avoid false-positives and as a slight
performance optimization, which could add up for big files.
This threshold might be adjusted in the future.
2022-03-23 23:06:49 +01:00
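The early return described above amounts to a length check before any real parsing. In this sketch the constant and function names are hypothetical, and the whitespace-and-dot heuristic is only a placeholder for the real link finder.

```rust
/// Length of "g.cn", the shortest scheme-less URI named in the commit.
const MIN_URI_LEN: usize = 4;

/// Returns words that could be URIs, skipping inputs that are too short.
fn plaintext_candidates(input: &str) -> Vec<&str> {
    if input.len() < MIN_URI_LEN {
        return Vec::new(); // too short to hold any URI, skip all work
    }
    input
        .split_whitespace()
        .filter(|word| word.len() >= MIN_URI_LEN && word.contains('.'))
        .collect()
}
```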
dependabot[bot]
9ece4f9552
Bump log from 0.4.15 to 0.4.16 (#564)
Bumps [log](https://github.com/rust-lang/log) from 0.4.15 to 0.4.16.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/commits)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-23 14:19:22 +01:00
Matthias
e1d112dbab
Remove missing_panic_doc (#561) 2022-03-22 21:02:56 +01:00
Matthias
328c96576d
Bump version to v0.9.0 (#560) 2022-03-22 13:43:49 +01:00
dependabot[bot]
0d9e500988
Bump log from 0.4.14 to 0.4.15 (#559)
Bumps [log](https://github.com/rust-lang/log) from 0.4.14 to 0.4.15.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/compare/0.4.14...0.4.15)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-22 13:32:08 +01:00
Matthias
45de5c763e
Avoid reqwest panic on invalid URIs (#557) 2022-03-22 13:15:11 +01:00
dependabot[bot]
dd9a8f29ce
Bump reqwest from 0.11.9 to 0.11.10 (#555)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.9 to 0.11.10.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.9...v0.11.10)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-15 14:17:40 +01:00
dependabot[bot]
09dd476597
Bump async-stream from 0.3.2 to 0.3.3 (#553)
Bumps [async-stream](https://github.com/tokio-rs/async-stream) from 0.3.2 to 0.3.3.
- [Release notes](https://github.com/tokio-rs/async-stream/releases)
- [Commits](https://github.com/tokio-rs/async-stream/compare/v0.3.2...v0.3.3)

---
updated-dependencies:
- dependency-name: async-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-14 15:37:17 +01:00
dependabot[bot]
93808fd121
Bump cached from 0.33.0 to 0.34.0 (#549)
Bumps [cached](https://github.com/jaemk/cached) from 0.33.0 to 0.34.0.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-10 14:42:44 +01:00
dependabot[bot]
5371d3344e
Bump regex from 1.5.4 to 1.5.5 (#545)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.5.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.5)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-09 16:36:32 +01:00
Matthias
ceb185e579
Add more comments to path methods (#543) 2022-03-08 13:50:54 +01:00
dependabot[bot]
72a17af135
Bump once_cell from 1.9.0 to 1.10.0 (#541)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.9.0 to 1.10.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-06 13:55:40 +01:00
dependabot[bot]
11b6c4be57
Bump check-if-email-exists from 0.8.28 to 0.8.29 (#538)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.28 to 0.8.29.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.28...v0.8.29)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-03 14:14:41 +01:00
Matthias
8097bfa408
Print Github token error once at the end (#537)
Print original reqwest error for every Github link.
It contains more information about the underlying error.

Only print a message about the Github token at the
end if it's not set and there were Github errors.
2022-03-03 10:04:55 +01:00
Matthias
4c51fce22f
Fix broken pipe error on failing writes to stdout (#535)
Make sure that broken pipes (e.g. when a reader of a
pipe prematurely exits during execution) get handled gracefully.
This change also moves some error messages to stderr by using
eprintln.

More info: https://github.com/jez/as-tree/issues/15
2022-03-02 23:39:54 +01:00
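Graceful handling of a broken pipe, as described above, can be sketched by treating `ErrorKind::BrokenPipe` as "stop writing" rather than as a failure. `write_line` and `ClosedPipe` are hypothetical names; `ClosedPipe` only simulates a reader that has already exited.

```rust
use std::io::{self, Write};

/// Writes one line. Returns `Ok(false)` when the reader has gone away,
/// so the caller can stop quietly instead of panicking.
fn write_line<W: Write>(out: &mut W, line: &str) -> io::Result<bool> {
    match writeln!(out, "{line}") {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::BrokenPipe => Ok(false),
        Err(e) => Err(e), // any other I/O error is still reported
    }
}

/// Stand-in for a pipe whose reading end is closed.
struct ClosedPipe;

impl Write for ClosedPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::from(io::ErrorKind::BrokenPipe))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

With real output, `out` would be a locked `io::stdout()`, and a command such as `head` closing the pipe early would produce the `Ok(false)` case.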
dependabot[bot]
595a713b4b
Bump cached from 0.32.1 to 0.33.0 (#536)
Bumps [cached](https://github.com/jaemk/cached) from 0.32.1 to 0.33.0.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-02 13:55:00 +01:00
Matthias
0fc5fc9ffe
Print errors with a different format for easier clickability (fixes #532) 2022-03-01 16:58:04 +01:00
dependabot[bot]
d7de8ad38e
Bump wiremock from 0.5.10 to 0.5.11 (#531)
Bumps [wiremock](https://github.com/LukeMathWalker/wiremock-rs) from 0.5.10 to 0.5.11.
- [Release notes](https://github.com/LukeMathWalker/wiremock-rs/releases)
- [Changelog](https://github.com/LukeMathWalker/wiremock-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LukeMathWalker/wiremock-rs/compare/v0.5.10...v0.5.11)

---
updated-dependencies:
- dependency-name: wiremock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-28 15:38:44 +01:00
dependabot[bot]
2f10222792
Bump cached from 0.30.0 to 0.32.1 (#530)
Bumps [cached](https://github.com/jaemk/cached) from 0.30.0 to 0.32.1.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-28 15:38:34 +01:00
Matthias
05bd3817ee
Make retry wait time configurable (#525) 2022-02-24 12:24:57 +01:00
Matthias
286da6094f
Update link to documentation (#528) 2022-02-24 12:22:16 +01:00
Matthias
41b291037a
Response output overhaul (#524)
Clean up the response output.
Superfluous information was removed and the formatting was changed to make
the output more readable to humans.
2022-02-23 17:28:14 +01:00
Lucius Hu
70ebe45117
Improved IPv6 filtering support (#501)
This commit uses crate `ip_network` to determine whether an IPv6 address is
link-local or unique local.

Note that this extra dependency can be removed once rust-lang/rust#27709 is
stabilized.

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2022-02-22 10:39:44 +01:00
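The two address classes named in the commit above are defined by fixed prefixes: unique local addresses fall in `fc00::/7` and link-local unicast addresses in `fe80::/10`. The commit uses the `ip_network` crate for this; the sketch below instead checks the prefixes by hand, with hypothetical function names.

```rust
use std::net::Ipv6Addr;

/// Unique local addresses: fc00::/7, so the top 7 bits are 1111110.
fn is_unique_local(addr: &Ipv6Addr) -> bool {
    (addr.segments()[0] & 0xfe00) == 0xfc00
}

/// Link-local unicast addresses: fe80::/10, so the top 10 bits are 1111111010.
fn is_link_local(addr: &Ipv6Addr) -> bool {
    (addr.segments()[0] & 0xffc0) == 0xfe80
}
```

A filter for private or local links could reject any URI whose host parses to an address matching either predicate.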
Matthias
ba276cd51b
Error cleanup (#510)
* Add more fine-grained error types; remove generic IO error
* Update error message for missing file
* Remove missing `Error` suffix
* Rename ErrorKind::Github to ErrorKind::GithubRequest for consistency with NetworkRequest
2022-02-19 01:44:00 +01:00
dependabot[bot]
e2d303b493
Bump check-if-email-exists from 0.8.26 to 0.8.28 (#516)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.26 to 0.8.28.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.26...v0.8.28)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-18 17:09:52 +01:00
dependabot[bot]
6836deac79
Bump par-stream from 0.10.0 to 0.10.2 (#518)
Bumps [par-stream](https://github.com/jerry73204/par-stream) from 0.10.0 to 0.10.2.
- [Release notes](https://github.com/jerry73204/par-stream/releases)
- [Commits](https://github.com/jerry73204/par-stream/commits)

---
updated-dependencies:
- dependency-name: par-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-18 17:09:30 +01:00
dependabot[bot]
98535f7bd3
Bump typed-builder from 0.9.1 to 0.10.0 (#512)
Bumps [typed-builder](https://github.com/idanarye/rust-typed-builder) from 0.9.1 to 0.10.0.
- [Release notes](https://github.com/idanarye/rust-typed-builder/releases)
- [Changelog](https://github.com/idanarye/rust-typed-builder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/idanarye/rust-typed-builder/commits)

---
updated-dependencies:
- dependency-name: typed-builder
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-18 11:00:30 +01:00
dependabot[bot]
2114406235
Bump futures from 0.3.19 to 0.3.21 (#493)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.19 to 0.3.21.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.19...0.3.21)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-18 10:58:01 +01:00
Matthias
812663d832
Prevent flaky tests (#514)
Move from example.org to example.com, which seems to be more permissive for testing
2022-02-18 10:29:49 +01:00
Lucius Hu
6d56c6b55c
Replace plain String with SecretString for GitHub token (#509)
This commit changed the type of `lychee-lib::ClientBuilder::github_token` from
`String` to `secrecy::SecretString` to fortify the secret management within our
program.

Note that this won't affect TOML configuration of `lychee-bin` because
`serde::Deserialize` is still implemented for `SecretString`.
2022-02-13 13:53:46 +01:00
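One benefit of a secret wrapper type is that the token cannot leak through debug output by accident. The following is a minimal std-only stand-in for that idea, not the `secrecy` API: the `Secret` type and `expose` method are hypothetical, and the sketch does not cover wiping memory on drop.

```rust
use std::fmt;

/// Wraps a sensitive string so it never shows up in debug output.
struct Secret(String);

impl Secret {
    /// Access has to be explicit, which makes every use easy to audit.
    fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret([REDACTED])")
    }
}
```

Any struct that derives `Debug` and holds a `Secret` field then prints the redacted form automatically.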
Matthias
47df7780fe
Use captured identifiers in format strings (#507)
Makes for arguably cleaner-looking code.
The downside is that the MSRV is 1.58
https://blog.rust-lang.org/2022/01/13/Rust-1.58.0.html

Given that nobody uses lychee as a library yet
and we have precompiled binaries, it's an acceptable
tradeoff.
My little research revealed that this is a much-liked
feature: https://twitter.com/matthiasendler/status/1483895557621960715
2022-02-12 10:51:52 +01:00
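The captured-identifier syntax referred to in #507 can be sketched as follows. The `describe` function and its message are made up for illustration and are not taken from the lychee code base.

```rust
/// Formats a hypothetical link-check result using captured identifiers,
/// which requires Rust 1.58 or newer.
fn describe(uri: &str, status: u16) -> String {
    // Before Rust 1.58 this would be: format!("{} returned {}", uri, status)
    format!("{uri} returned {status}")
}
```

Only plain identifiers can be captured this way; expressions such as field accesses still need to be passed as explicit arguments.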
Lucius Hu
53c41b03d8
replace hubcaps by octocrab (#502)
This commit replaced `hubcaps` by `octocrab`, which has more downloads per month
and receives more frequent release updates.

The caveats are:

1. When instantiating the API client, `octocrab` doesn't offer you a way to
specify a custom user-agent. But I would argue that, at least presently, this
doesn't seem to cause issues.
2. `octocrab` doesn't export as many details of its error types as `hubcaps`
does. So we will have less control over the display of the error message. But I
would also argue that this is not really important. Though we should do more
tests to make sure the error looks good enough.

* hide implementation details in error message

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2022-02-11 23:43:47 +01:00
Lucius Hu
476a048350
lychee-lib::client reworked (#500)
This commit mainly added or improved documentation for `lychee-lib::client`
module.

But it also contains a few API changes:

- `ClientBuilder::client()` now consumes itself instead of taking a reference.
  This helps to avoid a few unnecessary clones.
- `ClientBuilder::build_filter()` was a private function and is inlined to avoid
  unnecessary clones.
- Added a new crate-scoped function `Uri::set_scheme()`.

* added notes on deprecated site-local network

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2022-02-10 00:04:48 +01:00
Lucius Hu
5921fd248a
Update license files (#497)
- The date in the MIT license files has been updated to 2022
- Each of the benchmark and example crates is theoretically
  a separate package in Cargo's sense. So license files are
  added for them as well.

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2022-02-08 10:59:54 +01:00
Markus Unterwaditzer
68d09f7e5b
Add html5gum as alternative link extractor (#480)
html5gum is an HTML parser that offers lower-level control over which tokens actually get created and tracked. As such, the extractor doesn't allocate any tokens it doesn't care about. On some benchmarks it provides a substantial performance boost. The old parser, html5ever, is still available by setting the `LYCHEE_USE_HTML5EVER=1` env var.
2022-02-07 22:54:47 +01:00
dependabot[bot]
1f3abce671
Bump pretty_assertions from 1.0.0 to 1.1.0 (#487)
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v1.0.0...v1.1.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-01 13:42:58 +01:00
dependabot[bot]
a8d1359df4
Bump tokio from 1.15.0 to 1.16.1 (#482)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.15.0 to 1.16.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.15.0...tokio-1.16.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-28 13:23:02 +01:00
Matthias
6635863746
Add Alpine page for benchmark; refactor code (#481) 2022-01-27 23:42:06 +01:00
dependabot[bot]
8e31b234d3
Bump check-if-email-exists from 0.8.25 to 0.8.26 (#479)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.25 to 0.8.26.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.25...v0.8.26)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-27 15:02:05 +01:00
Matthias
97b06230fc
Add missing Github exclusions; sort entries (#473) 2022-01-21 23:54:59 +01:00
dependabot[bot]
84eff209ff
Bump cached from 0.29.0 to 0.30.0 (#472)
Bumps [cached](https://github.com/jaemk/cached) from 0.29.0 to 0.30.0.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-21 21:50:20 +01:00
dependabot[bot]
9082865e24
Bump pulldown-cmark from 0.9.0 to 0.9.1 (#468)
Bumps [pulldown-cmark](https://github.com/raphlinus/pulldown-cmark) from 0.9.0 to 0.9.1.
- [Release notes](https://github.com/raphlinus/pulldown-cmark/releases)
- [Commits](https://github.com/raphlinus/pulldown-cmark/compare/v0.9.0...v0.9.1)

---
updated-dependencies:
- dependency-name: pulldown-cmark
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-18 14:44:45 +01:00
Matthias
5802ae912c
Fix bugs in extractor; reduce allocs (#464)
When URLs couldn't be extracted from a tag,
we ran a plaintext search, but never added the
newly found urls to the vec of extracted urls.

Also tried to make the code a little more idiomatic
2022-01-16 02:13:38 +01:00
Matthias
6e757fa20e
Add more information about mail errors (#463) 2022-01-14 22:22:53 +01:00
Matthias
994aadf6a1
Simplify error messages (#462)
Using pattern matching to make the hubcaps and reqwest error messages a little shorter and (subjectively) more readable.
2022-01-14 15:26:13 +01:00
Matthias
ac490f9c53
Add caching functionality (v2) (#443)
A while ago, caching was removed due to some issues (see #349).
This is a new implementation with the following improvements:

* Architecture: The new implementation is decoupled from the collector; the coupling was a major issue in the last version. Now the collector has a single responsibility: collecting links. This also avoids race conditions when running multiple collect_links instances, which probably was an issue before.
* Performance: Uses DashMap under the hood, which was noticeably faster than Mutex<HashMap> in my tests.
* Simplicity: The cache format is a CSV file with two columns: URI and status. I decided to create a new struct called CacheStatus for serialization, because trying to serialize the error kinds in Status turned out to be a bit of a nightmare and at this point I don't think it's worth the pain (and probably isn't idiomatic either).

This is an optional feature. Caching only gets used if the `--cache` flag is set.
2022-01-14 15:25:51 +01:00
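The two-column cache format described in #443 might look roughly like the sketch below. It uses only the standard library; the `CacheStatus` variants, the `to_csv`/`from_csv` helpers, and the use of a `BTreeMap` (instead of the DashMap mentioned above) are assumptions for illustration, not lychee's implementation.

```rust
use std::collections::BTreeMap;

/// Simplified status for serialization; the variants are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CacheStatus {
    Success,
    Fail,
}

impl CacheStatus {
    fn as_str(self) -> &'static str {
        match self {
            CacheStatus::Success => "success",
            CacheStatus::Fail => "fail",
        }
    }

    fn parse(s: &str) -> Option<Self> {
        match s {
            "success" => Some(CacheStatus::Success),
            "fail" => Some(CacheStatus::Fail),
            _ => None,
        }
    }
}

/// Writes one `URI,status` line per entry, in key order.
fn to_csv(cache: &BTreeMap<String, CacheStatus>) -> String {
    cache
        .iter()
        .map(|(uri, status)| format!("{uri},{}\n", status.as_str()))
        .collect()
}

/// Reads the format back; malformed lines are skipped.
fn from_csv(text: &str) -> BTreeMap<String, CacheStatus> {
    text.lines()
        .filter_map(|line| {
            // Split on the last comma, since the status never contains one.
            let (uri, status) = line.rsplit_once(',')?;
            Some((uri.to_string(), CacheStatus::parse(status)?))
        })
        .collect()
}
```

Splitting on the last comma is a shortcut that works here because the status column is a fixed keyword; a real CSV writer would quote fields instead.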
dependabot[bot]
80fb20cca5 Bump cached from 0.28.0 to 0.29.0
Bumps [cached](https://github.com/jaemk/cached) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-13 13:38:48 +01:00
dependabot[bot]
a0d34a04f5 Bump cached from 0.26.2 to 0.28.0
Bumps [cached](https://github.com/jaemk/cached) from 0.26.2 to 0.28.0.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-12 13:25:43 +01:00
Matthias
1e76e82811 Add test for nonexistent Github file 2022-01-12 09:25:12 +01:00
Matthias
48c8153e11 Refactor Github checking; add docs 2022-01-12 09:25:12 +01:00
Matthias
50d7b05736 Conditionally compile constructors for GithubUri for tests 2022-01-12 09:25:12 +01:00
Matthias
8d445a3a4b Be more permissive around private GH repos
The Github API doesn't handle checking individual files inside repos or
paths like `github.com/org/repo/issues`, so we are more
permissive and only check for repo existence. This is the
only way to get a basic check for private repos. Public repos are not affected and should work
with a normal check.
2022-01-12 09:25:12 +01:00
Matthias
e91c0c60f0 Only accept two path segments (org/repo) for Github API check 2022-01-12 09:25:12 +01:00
Matthias
7667842bb6 Strip .git suffix from Github URLs (#384) 2022-01-12 09:25:12 +01:00
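The two rules from the commits above (accept only `org/repo` paths, strip a trailing `.git`) can be sketched with plain string handling. The function name `github_org_repo` is hypothetical, and the sketch ignores query strings and fragments; it is not the code from these commits.

```rust
/// Returns `(org, repo)` when the URL path has exactly two segments.
/// Hypothetical helper for illustration only.
fn github_org_repo(url: &str) -> Option<(String, String)> {
    let path = url
        .strip_prefix("https://github.com/")
        .or_else(|| url.strip_prefix("http://github.com/"))?;

    // Ignore empty segments so a trailing slash does not count.
    let mut segments = path.split('/').filter(|s| !s.is_empty());
    let org = segments.next()?;
    let repo = segments.next()?;
    if segments.next().is_some() {
        // More than two segments, e.g. `org/repo/issues`.
        return None;
    }

    // `lychee.git` and `lychee` refer to the same repository.
    let repo = repo.strip_suffix(".git").unwrap_or(repo);
    Some((org.to_string(), repo.to_string()))
}
```

URLs that return `None` here would fall back to whatever non-API check applies; how that fallback works is outside this sketch.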
dependabot[bot]
5a5ed00ba4
Bump reqwest from 0.11.8 to 0.11.9 (#455)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.8 to 0.11.9.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.8...v0.11.9)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-11 13:43:35 +01:00
Matthias
36450621fa
Update dependencies (#454) 2022-01-10 22:35:37 +01:00
dependabot[bot]
6b7671b97c
Bump wiremock from 0.5.9 to 0.5.10 (#451)
Bumps [wiremock](https://github.com/LukeMathWalker/wiremock-rs) from 0.5.9 to 0.5.10.
- [Release notes](https://github.com/LukeMathWalker/wiremock-rs/releases)
- [Changelog](https://github.com/LukeMathWalker/wiremock-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LukeMathWalker/wiremock-rs/compare/v0.5.9...v0.5.10)

---
updated-dependencies:
- dependency-name: wiremock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 18:15:58 +01:00
Matthias
0645177b84
Bump version (#450) 2022-01-10 01:38:46 +01:00
Matthias
21f3160b71
Make retries configurable; align constants (#446)
Using the same default values for the library and the
binary now but tweaked the values a bit for slightly faster performance.
2022-01-07 01:03:10 +01:00
Matthias
388bbbe7b0
Exclude known false-positives from Github API check (#445)
Fixes https://github.com/lycheeverse/lychee/issues/431
2022-01-06 00:33:53 +01:00
dependabot[bot]
f515d096db
Bump wiremock from 0.5.8 to 0.5.9 (#442)
Bumps [wiremock](https://github.com/LukeMathWalker/wiremock-rs) from 0.5.8 to 0.5.9.
- [Release notes](https://github.com/LukeMathWalker/wiremock-rs/releases)
- [Changelog](https://github.com/LukeMathWalker/wiremock-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LukeMathWalker/wiremock-rs/compare/v0.5.8...v0.5.9)

---
updated-dependencies:
- dependency-name: wiremock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-05 15:18:08 +01:00
Matthias
dd48466d9a
Add missing test for local links in plaintext files (#444) 2022-01-05 12:51:14 +01:00
dependabot[bot]
7a4de16138
Bump http from 0.2.5 to 0.2.6 (#438)
Bumps [http](https://github.com/hyperium/http) from 0.2.5 to 0.2.6.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.5...v0.2.6)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-31 18:33:22 +01:00
dependabot[bot]
c0b7205a71
Bump pulldown-cmark from 0.8.0 to 0.9.0 (#433)
Bumps [pulldown-cmark](https://github.com/raphlinus/pulldown-cmark) from 0.8.0 to 0.9.0.
- [Release notes](https://github.com/raphlinus/pulldown-cmark/releases)
- [Commits](https://github.com/raphlinus/pulldown-cmark/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: pulldown-cmark
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-23 13:49:54 +01:00
dependabot[bot]
147fa8de87
Bump reqwest from 0.11.7 to 0.11.8 (#432)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.7 to 0.11.8.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.7...v0.11.8)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-21 16:28:22 +01:00
dependabot[bot]
b2bc0e7eac
Bump futures from 0.3.18 to 0.3.19 (#430)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.18 to 0.3.19.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.18...0.3.19)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 15:37:18 +01:00
Matthias
01393b34a2
Upgrade to Rust 2021 (#427) 2021-12-17 01:32:13 +01:00
Matthias
83182c29ca
Fix JSON serialization (#426)
We recently removed the custom serialization for InputSource.
This causes the JSON formatter to fail
with "key must be a string".
Add it back and add a comment on
why this is needed.
2021-12-16 23:55:04 +01:00
dependabot[bot]
58785311f0
Bump tokio from 1.14.0 to 1.15.0 (#425)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.14.0 to 1.15.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.14.0...tokio-1.15.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-16 19:32:51 +01:00
Matthias
166c86c30e
Use tokenizer for extraction; add benchmark (#424)
This avoids creating a DOM tree for link extraction and instead uses a `TokenSink` for on-the-fly extraction. In hyperfine benchmarks it was about 10-25% faster than the master.

Old: 4.557 s ± 0.404 s
New: 3.832 s ± 0.131 s

The performance fluctuates a little less as well.

Some missing element/attribute pairs were also added, which contain links according to the HTML spec. These occur very rarely, but it's good to parse them for completeness' sake.

Furthermore tried to clean up a lot of papercuts around our types. We now differentiate between a `RawUri` (stringy-types) and a Uri, which is a properly parsed `URI` type.
The extractor now only deals with extracting `RawUri`s while the collector creates the request objects.
2021-12-16 18:45:52 +01:00
dependabot[bot]
c97ff95575
Bump once_cell from 1.8.0 to 1.9.0 (#423)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.8.0 to 1.9.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.8.0...v1.9.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-15 13:45:19 +01:00
dependabot[bot]
eac9d5b9a0
Bump openssl-sys from 0.9.71 to 0.9.72 (#421)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.71 to 0.9.72.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.71...openssl-sys-v0.9.72)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 13:26:15 +01:00
Matthias
c41ba64a69
Max concurrency moved to check (#419)
Concurrency is defined by the size of the channel that consumes
from the request stream in `check`
2021-12-07 11:52:40 +01:00
Matthias
3d5135668b
Improve concurrency with streams (#330)
* Move from vec to streams

Previously we collected all inputs in one vector
before checking the links, which is not ideal.
Especially when reading many inputs (e.g. by using a glob pattern),
this could cause issues like running out of file handles.

By moving to streams we avoid that scenario. This is also the first
step towards improving performance for many inputs.

To stay as close as possible to the pre-stream behaviour, we want to stop processing
as soon as an Err value appears in the stream. This is easiest when the
stream is consumed in the main thread.
Previously, the stream was consumed in a tokio task and the main thread
waited for responses.
Now, a tokio task waits for responses (and displays them/registers
response stats) and the main thread sends links to the ClientPool.
To ensure that the main thread waits for all responses to have arrived
before finishing the ProgressBar and printing the stats, it waits for
the show_results_task to finish.


* Return collected links as Stream
* Initialize ProgressBar without length because we can't know the amount of links without blocking
* Handle stream results in main thread, not in task
* Add basic directory support using jwalk
* Add test for HTTP protocol file type (http://)
* Remove deadpool (once again): Replaced with `futures::StreamExt::for_each_concurrent`.
* Refactor main; fix tests
* Move commands into separate submodule
* Simplify input handling
* Simplify collector
* Remove unnecessary unwrap
* Simplify main
* cleanup check
* clean up dump command
* Handle requests in parallel 
* Fix formatting and lints

Co-authored-by: Timo Freiberg <self@timofreiberg.com>
2021-12-01 18:25:11 +01:00
dependabot[bot]
bcd1d6725a
Bump reqwest from 0.11.6 to 0.11.7 (#415)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.6 to 0.11.7.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.6...v0.11.7)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-01 16:06:38 +01:00
dependabot[bot]
06140fff3a
Bump linkify from 0.7.0 to 0.8.0 (#409)
Bumps [linkify](https://github.com/robinst/linkify) from 0.7.0 to 0.8.0.
- [Release notes](https://github.com/robinst/linkify/releases)
- [Changelog](https://github.com/robinst/linkify/blob/main/CHANGELOG.md)
- [Commits](https://github.com/robinst/linkify/compare/0.7.0...0.8.0)

---
updated-dependencies:
- dependency-name: linkify
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-26 13:33:41 +01:00
Matthias
d96c1269ff
Use thiserror for error handling (#399)
This removes some boilerplate and is arguably more
maintainable than handwriting the error handling code.
It also avoids inconsistent functionality
across the error variants.
thiserror is also the de-facto standard for library
error types as of today.
2021-11-20 01:42:50 +01:00
dependabot[bot]
fc9790b98b
Bump openssl-sys from 0.9.70 to 0.9.71 (#395)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.70 to 0.9.71.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.70...openssl-sys-v0.9.71)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-18 16:37:48 +01:00
Matthias
30a0fd3856
Bump version to 0.8.1 (#396) 2021-11-18 00:59:28 +01:00
Matthias
b97fda34d0
Add support for different output formats (compact, detailed, markdown) (#375) 2021-11-18 00:44:48 +01:00
Markus Unterwaditzer
d3ed133f10
Remove srcset attribute from list of "link" attrs (#393)
* Remove srcset attribute from list of "link" attrs

Fix #390

* Add test for srcset

* Add note about srcSet links

* add real support for srcset

Co-authored-by: Matthias <matthias-endler@gmx.net>
2021-11-16 22:58:10 +01:00
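A `srcset` value is a comma-separated list of image candidates, each a URL optionally followed by a width or density descriptor, which is why it cannot be treated as a single link. A deliberately naive sketch of pulling out the URLs is shown below. The `srcset_urls` name is hypothetical, and the sketch does not handle URLs that themselves contain commas, so it should not be read as the parsing logic added in #393.

```rust
/// Extracts the URL part of each image candidate in a `srcset` value.
/// Naive: assumes candidate URLs contain no commas.
fn srcset_urls(srcset: &str) -> Vec<&str> {
    srcset
        .split(',')
        // The first whitespace-separated token of a candidate is its URL;
        // the optional second token is a descriptor such as `2x` or `480w`.
        .filter_map(|candidate| candidate.split_whitespace().next())
        .collect()
}
```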
dependabot[bot]
09a4754c55
Bump deadpool from 0.9.1 to 0.9.2 (#392)
Bumps [deadpool](https://github.com/bikeshedder/deadpool) from 0.9.1 to 0.9.2.
- [Release notes](https://github.com/bikeshedder/deadpool/releases)
- [Changelog](https://github.com/bikeshedder/deadpool/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bikeshedder/deadpool/compare/deadpool-v0.9.1...deadpool-v0.9.2)

---
updated-dependencies:
- dependency-name: deadpool
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-16 13:29:21 +01:00
dependabot[bot]
31ec9a1fe7
Bump tokio from 1.13.0 to 1.14.0 (#394)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.13.0 to 1.14.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-16 13:28:59 +01:00
Matthias
69e5d56687
Add more known false positive schema domains (#376)
See https://github.com/lycheeverse/lychee-action/issues/53
2021-10-31 14:53:40 +01:00
dependabot[bot]
e346033a10
Bump openssl-sys from 0.9.67 to 0.9.68 (#373)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.67 to 0.9.68.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.67...openssl-sys-v0.9.68)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-28 14:35:49 +02:00
dependabot[bot]
d3a72d3816
Bump deadpool from 0.7.0 to 0.9.1 (#371)
* Bump deadpool from 0.7.0 to 0.9.1

Bumps [deadpool](https://github.com/bikeshedder/deadpool) from 0.7.0 to 0.9.1.
- [Release notes](https://github.com/bikeshedder/deadpool/releases)
- [Changelog](https://github.com/bikeshedder/deadpool/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bikeshedder/deadpool/compare/deadpool-v0.7.0...deadpool-v0.9.1)

---
updated-dependencies:
- dependency-name: deadpool
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Attempt fix for deadpool v0.8.0+ (#372)

Signed-off-by: MichaIng <micha@dietpi.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MichaIng <micha@dietpi.com>
2021-10-28 02:05:58 +02:00
Matthias
47426c6971
Fix typos, grammar 2021-10-28 02:05:35 +02:00
Matthias
ed0efcd4f8 Prepare release 2021-10-28 00:34:48 +02:00
dependabot[bot]
d79b57fb9d
Bump reqwest from 0.11.5 to 0.11.6 (#364)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.5 to 0.11.6.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.5...v0.11.6)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-26 01:19:35 +02:00
dependabot[bot]
d09348ffee
Bump cached from 0.25.1 to 0.26.2 (#366)
Bumps [cached](https://github.com/jaemk/cached) from 0.25.1 to 0.26.2.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-26 01:17:37 +02:00
dependabot[bot]
1b1ba7a095
Bump cached from 0.25.0 to 0.25.1 (#361)
Bumps [cached](https://github.com/jaemk/cached) from 0.25.0 to 0.25.1.
- [Release notes](https://github.com/jaemk/cached/releases)
- [Changelog](https://github.com/jaemk/cached/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jaemk/cached/commits)

---
updated-dependencies:
- dependency-name: cached
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-16 18:02:27 +02:00
MichaIng
0870f0bc9e
Add http://www.w3.org/2000/svg to known false positives (#359)
It has no forced HTTPS rewrite, but sets the HSTS header. Access otherwise works fine, so, as with http://www.w3.org/1999/xhtml, this mainly avoids lychee failures when --require-https is set.

Signed-off-by: MichaIng <micha@dietpi.com>
2021-10-11 00:40:27 +02:00
Jorge Luis Betancourt
174331d983
Extract base from the source URL if --base is empty (#358)
When running lychee against a remote URL all relative links are ignored
by default because `--base` is normally not set. A good default in this
case is to automatically use the base domain from the source URL.
Setting `--base` overrides the automatic source extraction from the
source URL (same behaviour as we currently have).
2021-10-10 02:42:01 +02:00
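The fallback described in #358 can be sketched as below. This is a simplified illustration using string handling only; the `effective_base` name is hypothetical, and a real implementation would more likely rely on a proper URL parser.

```rust
/// Picks the base used to resolve relative links.
/// An explicit, non-empty `base` wins; otherwise the scheme and host
/// of the source URL are used. Hypothetical helper for illustration.
fn effective_base(source_url: &str, base: Option<&str>) -> Option<String> {
    if let Some(explicit) = base.filter(|b| !b.is_empty()) {
        return Some(explicit.to_string());
    }

    // Only remote sources of the form `scheme://host/...` yield a base.
    let (scheme, rest) = source_url.split_once("://")?;
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    if host.is_empty() {
        return None;
    }
    Some(format!("{scheme}://{host}"))
}
```

For a local input such as `README.md` the function returns `None`, which corresponds to the case where relative links have no remote base to resolve against.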
dependabot[bot]
2be3b3b896
Bump reqwest from 0.11.4 to 0.11.5 (#356)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.4 to 0.11.5.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.4...v0.11.5)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-10 02:41:08 +02:00
Matthias
dd9e24b7f4 support uppercase filenames; add tests 2021-10-09 22:20:22 +02:00
Matthias
175342baf4 Merge branch 'master' of github.com:lycheeverse/lychee 2021-10-09 21:17:41 +02:00
Matthias
bdcd6f87bf Make error message for broken file links more understandable 2021-10-09 21:17:37 +02:00
Matthias
56726f41fc
Add back connection pool (#355) 2021-10-08 13:08:44 +02:00
MichaIng
961f12e58e
Remove cache from collector and remove custom reqwest client pool
* Reqwest comes with its own request pool, so there's no need to add
another layer of indirection. This also gets rid of a lot of allocs.
* Remove cache from collector
* Improve error handling and documentation
* Add back test for request caching in single file

Signed-off-by: MichaIng <micha@dietpi.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2021-10-07 18:07:18 +02:00
Matthias
a7f809612d
Refactor extractor (#354)
This avoids sending URLs back and forth between the different parsers.
Also, it should allow for future optimizations to reduce allocs.
2021-10-07 12:51:02 +02:00
dependabot[bot]
ee1f26c44a
Bump check-if-email-exists from 0.8.24 to 0.8.25 (#352)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.24 to 0.8.25.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.24...v0.8.25)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-06 14:56:20 +02:00
MichaIng
b648b5e914
Imply "localhost" when loopback IPs are excluded (#351)
as "localhost" is usually mapped via "hosts" file to a loopback IP address.

Resolves: https://github.com/lycheeverse/lychee/issues/319

Signed-off-by: MichaIng <micha@dietpi.com>
2021-10-06 11:33:23 +02:00
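The rule from #351 can be sketched with the standard library's `IpAddr::is_loopback`. The `is_loopback_host` function is a hypothetical name, and the sketch assumes the host has already been extracted from the URL (without brackets for IPv6).

```rust
use std::net::IpAddr;

/// Returns true for loopback IP addresses and for the name `localhost`,
/// which is usually mapped to a loopback address via the hosts file.
/// Hypothetical helper for illustration.
fn is_loopback_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    // Anything that is not an IP address is treated as a regular hostname.
    host.parse::<IpAddr>()
        .map(|ip| ip.is_loopback())
        .unwrap_or(false)
}
```

The check does no DNS resolution, so other names that happen to resolve to a loopback address are not caught.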
Matthias
251332efe2
Cache absolute_path to decrease allocations (#346)
* Cache `absolute_path` to decrease allocations

While profiling local file handling, I noticed that resolving paths was taking a
significant amount of time. It also caused quite a few allocations.
By caching the path and using a constant value for the current
directory, we can reduce the number of allocs by quite a lot.
For example, when testing on the sentry documentation, we do 50.4%
fewer allocations in total now. That's just a single test case of course,
but it probably helps in many other cases as well.

* Defer to_string for attr.value to reduce allocs
* Use Tendrils instead of Strings for parsing (another ~1.5% fewer allocs)
* Move option parsing code into separate module
* Handle base dir more correctly
* Temporarily disable dry run
2021-10-05 01:37:43 +02:00
dependabot[bot]
aadce95e35
Bump pretty_assertions from 0.7.2 to 1.0.0 (#347)
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 0.7.2 to 1.0.0.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v0.7.2...v1.0.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-29 00:56:15 +02:00
dependabot[bot]
6848b20546
Bump tokio from 1.11.0 to 1.12.0 (#343)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.11.0...tokio-1.12.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:28:59 +02:00
dependabot[bot]
61fa23099b
Bump http from 0.2.4 to 0.2.5 (#344)
Bumps [http](https://github.com/hyperium/http) from 0.2.4 to 0.2.5.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.4...v0.2.5)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:27:23 +02:00
dependabot[bot]
7f17ffb9b1
Bump openssl-sys from 0.9.63 to 0.9.67 (#342)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.63 to 0.9.67.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.63...openssl-sys-v0.9.67)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:27:14 +02:00
Matthias
3b41c4c375
Silently ignore absolute paths without base (fixes #320) (#338) 2021-09-20 11:13:30 +02:00
dependabot[bot]
d24511217f
Bump check-if-email-exists from 0.8.23 to 0.8.24 (#323)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.23 to 0.8.24.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.23...v0.8.24)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-14 15:49:02 +02:00
Matthias
21ea0fd033
Add support for tokio-console (#318)
This allows troubleshooting and improving async Rust code.
It is an optional feature that is still
experimental (but can be quite helpful)
2021-09-12 18:10:23 +02:00
Matthias
de55fbd178 Add TODO for fixing URL encoding for paths 2021-09-09 19:31:49 +02:00
Matthias
d7436575eb formatting 2021-09-09 14:43:40 +02:00
Matthias
2a4170eade Add test for + encoding 2021-09-09 14:42:09 +02:00
Matthias
a1acf7b0d0 Reintegrate master 2021-09-09 01:49:25 +02:00
Matthias
93948d7367 Avoid double-encoding already encoded destination paths
E.g. `web%20site` becomes `web site`.
That's because Url::from_file_path will encode the full URL in the end.
This behavior cannot be configured.
See https://github.com/lycheeverse/lychee/pull/262#issuecomment-915245411
2021-09-09 01:44:10 +02:00
Matthias
24ea2482d3 Update docs 2021-09-08 01:08:59 +02:00
Matthias
f3fe46a4d6 Merge branch 'master' of github.com:lycheeverse/lychee into local-files 2021-09-08 00:35:41 +02:00
Matthias
ffab0343fc Revert refactor for removing params and fragments
The refactored version was not equivalent. It could not handle
fragments containing a question mark.
See 67268ed598 (r703400238)
2021-09-08 00:29:30 +02:00
Matthias
1246fa564c
Don't exclude mail on exclude-all-private (#316) 2021-09-08 00:21:00 +02:00
Matthias
67268ed598 Clean up params and fragment handling 2021-09-07 13:02:39 +02:00
Matthias
4827ecf6bd Fix clippy warnings 2021-09-07 00:22:06 +02:00
Matthias
5d0b95271d Remove anchor from file links 2021-09-07 00:20:09 +02:00
Matthias
b2ce61357f Fix build errors; cleanup code 2021-09-06 23:46:31 +02:00
Paweł Romanowski
8fd34a7367
Add no check (dump links only) flag (#99) 2021-09-06 16:10:48 +02:00
Matthias
00ddb6dfc8 Filter out directories with suffixes that look like extensions
Directories can still have a suffix which looks like
a file extension like `foo.html`. This can lead to
unexpected behavior with glob patterns like
`**/*.html`. Therefore filter these out.
https://github.com/lycheeverse/lychee/pull/262#issuecomment-91322681
2021-09-06 15:23:10 +02:00
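The filtering described in 00ddb6dfc8 could look roughly like the sketch below. It assumes a hypothetical helper `is_matching_file` (not a name from the lychee source) that is applied to each glob match and uses only the standard library.

```rust
use std::path::Path;

/// Hypothetical helper: keep a glob match only if it has the wanted
/// extension AND is a regular file, so a directory named `foo.html`
/// is filtered out even though it matches `**/*.html`.
fn is_matching_file(path: &Path, extension: &str) -> bool {
    let has_extension = path.extension().and_then(|ext| ext.to_str()) == Some(extension);
    // `is_file` is false for directories and for paths that do not exist.
    has_extension && path.is_file()
}
```

The extension check alone is what a glob pattern provides; the extra `is_file` call is the part that rejects directories with extension-like suffixes.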
Matthias
f47282093a String allocation not needed 2021-09-06 15:23:10 +02:00
Matthias
f143087743 Relative path not needed 2021-09-06 15:23:10 +02:00
Matthias
b3c5d122e7 Fix clippy lints 2021-09-06 15:23:10 +02:00
Matthias
57af648ec9 fix tests after making base dir mandatory 2021-09-06 15:23:10 +02:00
Matthias
b7c129c431 Fix resolving absolute paths
The previous solution didn't resolve to absolute paths;
it only removed components like `.` and `..`.
2021-09-06 15:20:18 +02:00
Matthias
dd3205a87c wip 2021-09-06 15:19:43 +02:00
Matthias
b06afb7252 fix test 2021-09-06 15:19:24 +02:00
Matthias
04bf838f98 lint 2021-09-06 15:19:24 +02:00
Matthias
4f9dc67bbd fix test 2021-09-06 15:19:24 +02:00
Matthias
afdb721612 Fix lints 2021-09-06 15:19:24 +02:00
Matthias
1546d6ee38 Normalize path; fix tests 2021-09-06 15:19:09 +02:00
Matthias
a3fd85d923 Exclude anchor links 2021-09-06 15:19:09 +02:00
Matthias
daa5be4c3a Add/change file link tests 2021-09-06 15:19:09 +02:00
Matthias
d924c25669 Non-existing directories are fine for URI base for files 2021-09-06 15:19:09 +02:00
Matthias
d51a49db46 Move uri to types 2021-09-06 15:19:09 +02:00
Matthias
887f1b9589 Split up file checking into file discovery and validation of path exists 2021-09-06 15:19:09 +02:00
Matthias
bfa3b1b6a1 Introduce Base type, which can be a path or URL 2021-09-06 15:15:40 +02:00
Matthias
f9bf52ef10 Add support for base_dir 2021-09-06 15:15:05 +02:00
Matthias Endler
d5bb7ee7d7 Or Patterns (Rust 1.53) 2021-09-06 15:15:05 +02:00
Matthias Endler
701fbc9ada Add support for local files 2021-09-06 15:14:33 +02:00
dependabot[bot]
13d0b84389
Bump wiremock from 0.5.2 to 0.5.7 (#313)
Bumps [wiremock](https://github.com/LukeMathWalker/wiremock-rs) from 0.5.2 to 0.5.7.
- [Release notes](https://github.com/LukeMathWalker/wiremock-rs/releases)
- [Changelog](https://github.com/LukeMathWalker/wiremock-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LukeMathWalker/wiremock-rs/compare/v0.5.2...v0.5.7)

---
updated-dependencies:
- dependency-name: wiremock
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-06 15:04:08 +02:00
Lucius Hu
80b8a856ac
Add new flag --require-https (#195) 2021-09-04 03:21:54 +02:00
dependabot[bot]
4b6c1d7719
Bump check-if-email-exists from 0.8.21 to 0.8.23 (#311)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.21 to 0.8.23.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.21...v0.8.23)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-04 02:59:16 +02:00
Daniel Doubrovkine (dB.)
f866abef61
Fix publish workflow (#309) 2021-09-04 01:49:29 +02:00
Matthias
a7c1eae115 Bump version to 0.7.1 2021-09-03 19:35:36 +02:00
Matthias
59abd189cf Fix remaining clippy lints 2021-09-03 16:29:57 +02:00
dependabot[bot]
7e497723cb
Bump linkify from 0.6.0 to 0.7.0 (#249)
Bumps [linkify](https://github.com/robinst/linkify) from 0.6.0 to 0.7.0.
- [Release notes](https://github.com/robinst/linkify/releases)
- [Changelog](https://github.com/robinst/linkify/blob/main/CHANGELOG.md)
- [Commits](https://github.com/robinst/linkify/compare/0.6.0...0.7.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-19 09:19:51 +02:00
dependabot[bot]
2b21e3b122
Bump url from 2.2.1 to 2.2.2 (#245)
Bumps [url](https://github.com/servo/rust-url) from 2.2.1 to 2.2.2.
- [Release notes](https://github.com/servo/rust-url/releases)
- [Commits](https://github.com/servo/rust-url/compare/v2.2.1...v2.2.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-17 15:25:24 +02:00
dependabot[bot]
19ae5fecc0
Bump tokio from 1.5.0 to 1.6.0 (#248)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.5.0 to 1.6.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.5.0...tokio-1.6.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-17 10:38:33 +02:00
dependabot[bot]
524583f5e7
Bump openssl-sys from 0.9.62 to 0.9.63 (#244)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.62 to 0.9.63.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.62...openssl-sys-v0.9.63)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-10 10:04:46 +02:00
Matthias
fe399c0a8c
Simple URI cache (#243) 2021-05-04 13:28:39 +02:00
dependabot[bot]
bbc763e854
Bump openssl-sys from 0.9.61 to 0.9.62 (#240)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.61 to 0.9.62.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.61...openssl-sys-v0.9.62)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-04-29 10:31:45 +02:00
Matthias
164e1aea7e
Add support for multiple schemes (#237) 2021-04-26 18:24:54 +02:00
Matthias
f8426bafbf
Skip unsupported schemes (#236) 2021-04-26 17:16:58 +02:00
Matthias
2a80760f58
Fix crates.io 404 with quirk (#235) 2021-04-26 14:20:54 +02:00
dependabot-preview[bot]
d651b1ef7b Bump regex from 1.4.5 to 1.4.6
Bumps [regex](https://github.com/rust-lang/regex) from 1.4.5 to 1.4.6.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.4.5...1.4.6)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2021-04-23 06:38:34 +00:00
Matthias
1865f7a309
Use thumbnail endpoint for YouTube links (#232) 2021-04-23 01:23:15 +02:00
Matthias
1926c73b6b
Add missing docs (#231)
This enables `#![deny(missing_docs)]` and adds all missing doc strings
2021-04-23 00:27:12 +02:00
Matthias
f7f9485be0
Bump version to 0.7 (#229) 2021-04-17 13:41:00 +02:00
Lucius Hu
f64213d58c
More refactor (#225)
- Major changes in `lychee-lib::filter` module:
  - Fields in `Excludes` other than the `RegexSet` are now moved to `Filter`.
  - `Filter` contains `Option<Excludes>` and `Option<Includes>`, which are
    wrapper structs around `RegexSet` instead of `Option<RegexSet>`. As a
    result, the code now looks cleaner.
  - Factored out some filtering logic into dedicated functions.
    - It's possible to write tests for those functions in addition to tests
      for the `Filter` struct.
  - Added docs to `Filter::is_excluded` and reorganized the code.
- Replaced `derive_builder` with `typed_builder`:
  - The internal interface is very ugly, as admitted by the author, but we no
    longer have nested `Option`s like before.
  - As a result, the `Client` building is much easier to read.
  - The main benefit of `typed_builder` is that the arguments fed to the
    builder are checked at compile time instead of at run time.
- Fixed a bug in `lychee::tests::usage` and `lychee-lib::stats::test`.
  - Now it clears the environment variable, which would otherwise cause an
    issue if `GITHUB_TOKEN` is set.
- Updated dependencies.

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-16 20:25:22 +02:00
dependabot-preview[bot]
63774c9ce2 Bump pretty_assertions from 0.7.1 to 0.7.2
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v0.7.1...v0.7.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2021-04-16 06:48:39 +00:00
Lucius Hu
228e5df6a3
Major refactor of codebase (#208)
- The binary component and library component are separated as two
  packages in the same workspace.
  - `lychee` is the binary component, in `lychee-bin/*`.
  - `lychee-lib` is the library component, in `lychee-lib/*`.
  - Users can now install only `lychee-lib` instead of both
    components, which requires fewer dependencies and compiles
    faster.
  - Dependencies for each component are adjusted and updated. E.g.,
    no CLI dependencies for `lychee-lib`.
  - CLI tests are moved to `lychee` only, as they have nothing to do
    with the library component.
- `Status::Error` is refactored to contain a dedicated error enum,
  `ErrorKind`.
  - The motivation is to delay the formatting of errors to strings.
    Note that `e.to_string()` is not necessarily cheap (though
    trivial in many cases). The formatting is now delayed until the
    error needs to be displayed to users. So in some cases, if
    the error is never used, it won't be formatted at
    all.
- Replaced `regex` based matching with one of the following:
  - Simple string equality test in the case of 'false positive'.
  - URL parsing based test, in the case of extracting repository and
    user name for GitHub links.
  - Either case would be much more efficient than `regex` based
    matching. First, there's no need to construct a state machine for
    regex. Second, URL is already verified and parsed on its creation,
    and extracting its components is fairly cheap. Also, this removes
    the dependency on `lazy-static` in `lychee-lib`.
- `types` module now has a sub-directory, and its components are now
  separated into their own modules (in that sub-directory).
- `lychee-lib::test_utils` module is only compiled for tests.
- `wiremock` is moved to `dev-dependency` as it's only needed for
  `test` modules.
- Dependencies are listed in alphabetical order.
- Imports are organized in the following fashion:
  - Imports from `std`
  - Imports from 3rd-party crates, and `lychee-lib`.
  - Imports from `crate::*` or `super::*`.
- No glob import.
- I followed the suggestions from `cargo clippy`, with `clippy::all` and
  `clippy::pedantic`.

Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-15 01:24:11 +02:00
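As an illustration of the `Status::Error` / `ErrorKind` change described in 228e5df6a3 (#208), the following is a simplified sketch of deferring error formatting. The variants `Timeout` and `InvalidUrl` and the method `is_success` are assumptions made up for this example and do not reflect the actual definitions in `lychee-lib`.

```rust
use std::fmt;

/// Hypothetical, simplified error enum. It stores structured data
/// instead of a pre-formatted `String`.
#[derive(Debug)]
enum ErrorKind {
    Timeout { secs: u64 },
    InvalidUrl(String),
}

impl fmt::Display for ErrorKind {
    // Formatting only happens here, i.e. when the error is displayed.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ErrorKind::Timeout { secs } => write!(f, "timed out after {}s", secs),
            ErrorKind::InvalidUrl(url) => write!(f, "invalid URL: {}", url),
        }
    }
}

/// Hypothetical, simplified status type carrying the error enum.
#[derive(Debug)]
enum Status {
    Ok,
    Error(ErrorKind),
}

impl Status {
    /// Checking the status never formats the error.
    fn is_success(&self) -> bool {
        matches!(self, Status::Ok)
    }
}
```

A caller that only counts failures can use `is_success` and never pay for string formatting; `to_string()` on the `ErrorKind` is invoked only when the message is actually shown to the user.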