Commit graph

72 commits

Author SHA1 Message Date
Matthias
175342baf4 Merge branch 'master' of github.com:lycheeverse/lychee 2021-10-09 21:17:41 +02:00
Matthias
bdcd6f87bf Make error message for broken file links more understandable 2021-10-09 21:17:37 +02:00
Matthias
56726f41fc
Add back connection pool (#355) 2021-10-08 13:08:44 +02:00
MichaIng
961f12e58e
Remove cache from collector and remove custom reqwest client pool
* Reqwest comes with its own request pool, so there's no need in adding
another layer of indirection. This also gets rid of a lot of allocs.
* Remove cache from collector
* Improve error handling and documentation
* Add back test for request caching in single file

Signed-off-by: MichaIng <micha@dietpi.com>
Co-authored-by: Matthias <matthias-endler@gmx.net>
2021-10-07 18:07:18 +02:00
Matthias
a7f809612d
Refactor extractor (#354)
This avoids sending URLs back and forth between the different parsers.
Also, it should allow for future optimizations to reduce allocs.
2021-10-07 12:51:02 +02:00
dependabot[bot]
ee1f26c44a
Bump check-if-email-exists from 0.8.24 to 0.8.25 (#352)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.24 to 0.8.25.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.24...v0.8.25)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-06 14:56:20 +02:00
MichaIng
b648b5e914
Imply "localhost" when loopback IPs are excluded (#351)
as "localhost" is usually mapped via "hosts" file to a loopback IP address.

Resolves: https://github.com/lycheeverse/lychee/issues/319

Signed-off-by: MichaIng <micha@dietpi.com>
2021-10-06 11:33:23 +02:00
Matthias
251332efe2
Cache absolute_path to decrease allocations (#346)
* Cache `absolute_path` to decrease allocations

While profiling local file handling, I noticed that resolving paths was taking a
significant amount of time. It also caused quite a few allocations.
By caching the path and using a constant value for the current
directory, we can reduce the number of allocs by quite a lot.
For example, when testing on the sentry documentation, we do 50,4%
less allocations in total now. That's just a single test-case of course,
but it's probably also helping in many other cases as well.

* Defer to_string for attr.value to reduce allocs
* Use Tendrils instead of Strings for parsing (another ~1.5% less allocs)
* Move option parsing code into separate module
* Handle base dir more correctly
* Temporarily disable dry run
2021-10-05 01:37:43 +02:00
dependabot[bot]
aadce95e35
Bump pretty_assertions from 0.7.2 to 1.0.0 (#347)
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 0.7.2 to 1.0.0.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v0.7.2...v1.0.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-29 00:56:15 +02:00
dependabot[bot]
6848b20546
Bump tokio from 1.11.0 to 1.12.0 (#343)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.11.0...tokio-1.12.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:28:59 +02:00
dependabot[bot]
61fa23099b
Bump http from 0.2.4 to 0.2.5 (#344)
Bumps [http](https://github.com/hyperium/http) from 0.2.4 to 0.2.5.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.4...v0.2.5)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:27:23 +02:00
dependabot[bot]
7f17ffb9b1
Bump openssl-sys from 0.9.63 to 0.9.67 (#342)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.63 to 0.9.67.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.63...openssl-sys-v0.9.67)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-22 14:27:14 +02:00
Matthias
3b41c4c375
Silently ignore absolute paths without base (fixes #320) (#338) 2021-09-20 11:13:30 +02:00
dependabot[bot]
d24511217f
Bump check-if-email-exists from 0.8.23 to 0.8.24 (#323)
Bumps [check-if-email-exists](https://github.com/reacherhq/check-if-email-exists) from 0.8.23 to 0.8.24.
- [Release notes](https://github.com/reacherhq/check-if-email-exists/releases)
- [Changelog](https://github.com/reacherhq/check-if-email-exists/blob/master/CHANGELOG.md)
- [Commits](https://github.com/reacherhq/check-if-email-exists/compare/v0.8.23...v0.8.24)

---
updated-dependencies:
- dependency-name: check-if-email-exists
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-14 15:49:02 +02:00
Matthias
21ea0fd033
Add support for tokio-console (#318)
This allows troubleshooting and improving async Rust code.
It is an optional feature that is still
experimental (but can be quite helpful)
2021-09-12 18:10:23 +02:00
Matthias
de55fbd178 Add TODO for fixing URL encoding for paths 2021-09-09 19:31:49 +02:00
Matthias
d7436575eb formatting 2021-09-09 14:43:40 +02:00
Matthias
2a4170eade Add test for + encoding 2021-09-09 14:42:09 +02:00
Matthias
a1acf7b0d0 Reintegrate master 2021-09-09 01:49:25 +02:00
Matthias
93948d7367 Avoid double-encoding already encoded destination paths
E.g. `web%20site` becomes `web site`.
That's because Url::from_file_path will encode the full URL in the end.
This behavior cannot be configured.
See https://github.com/lycheeverse/lychee/pull/262#issuecomment-915245411
2021-09-09 01:44:10 +02:00
Matthias
24ea2482d3 Update docs 2021-09-08 01:08:59 +02:00
Matthias
f3fe46a4d6 Merge branch 'master' of github.com:lycheeverse/lychee into local-files 2021-09-08 00:35:41 +02:00
Matthias
ffab0343fc Revert refactor for removing params and fragments
The refactored version was not equivalent. It could not handle
fragments containing a question mark.
See 67268ed598 (r703400238)
2021-09-08 00:29:30 +02:00
Matthias
1246fa564c
Don't exlude mail on exclude-all-private (#316) 2021-09-08 00:21:00 +02:00
Matthias
67268ed598 Clean up params and fragment handling 2021-09-07 13:02:39 +02:00
Matthias
4827ecf6bd Fix clippy warnings 2021-09-07 00:22:06 +02:00
Matthias
5d0b95271d Remove anchor from file links 2021-09-07 00:20:09 +02:00
Matthias
b2ce61357f Fix build errors; cleanup code 2021-09-06 23:46:31 +02:00
Paweł Romanowski
8fd34a7367
Add no check (dump links only) flag (#99) 2021-09-06 16:10:48 +02:00
Matthias
00ddb6dfc8 Filter out directories with suffixes that look like extensions
Directories can still have a suffix which looks like
a file extension like `foo.html`. This can lead to
unexpected behavior with glob patterns like
`**/*.html`. Therefore filter these out.
https://github.com/lycheeverse/lychee/pull/262#issuecomment-91322681
2021-09-06 15:23:10 +02:00
Matthias
f47282093a String allocation not needed 2021-09-06 15:23:10 +02:00
Matthias
f143087743 Relative path not needed 2021-09-06 15:23:10 +02:00
Matthias
b3c5d122e7 Fix clippy lints 2021-09-06 15:23:10 +02:00
Matthias
57af648ec9 fix tests after making base dir mandatory 2021-09-06 15:23:10 +02:00
Matthias
b7c129c431 Fix resolving absolute paths
The previous solution didn't resolve to absolute paths
and rather removed things like `.` and `..`.
2021-09-06 15:20:18 +02:00
Matthias
dd3205a87c wip 2021-09-06 15:19:43 +02:00
Matthias
b06afb7252 fix test 2021-09-06 15:19:24 +02:00
Matthias
04bf838f98 lint 2021-09-06 15:19:24 +02:00
Matthias
4f9dc67bbd fix test 2021-09-06 15:19:24 +02:00
Matthias
afdb721612 Fix lints 2021-09-06 15:19:24 +02:00
Matthias
1546d6ee38 Normalize path; fix tests 2021-09-06 15:19:09 +02:00
Matthias
a3fd85d923 Exclude anchor links 2021-09-06 15:19:09 +02:00
Matthias
daa5be4c3a Add/change file link tests 2021-09-06 15:19:09 +02:00
Matthias
d924c25669 Non-existing directories are fine for URI base for files 2021-09-06 15:19:09 +02:00
Matthias
d51a49db46 Move uri to types 2021-09-06 15:19:09 +02:00
Matthias
887f1b9589 Split up file checking into file discovery and validation of path exists 2021-09-06 15:19:09 +02:00
Matthias
bfa3b1b6a1 Introduce Base type, which can be a path or URL 2021-09-06 15:15:40 +02:00
Matthias
f9bf52ef10 Add support for base_dir 2021-09-06 15:15:05 +02:00
Matthias Endler
d5bb7ee7d7 Or Patterns (Rust 1.53) 2021-09-06 15:15:05 +02:00
Matthias Endler
701fbc9ada Add support for local files 2021-09-06 15:14:33 +02:00