549 commits

Author SHA1 Message Date
Matthias
a795f2bf3f
Change usage examples (#429) 2021-12-18 00:28:08 +01:00
Matthias
d80b9b8e6a
Update docs (#428)
Add more lychee users, update usage instructions, fix typos
2021-12-17 02:00:28 +01:00
Matthias
01393b34a2
Upgrade to Rust 2021 (#427) 2021-12-17 01:32:13 +01:00
Matthias
83182c29ca
Fix JSON serialization (#426)
We recently removed the custom serialization for InputSource.
This causes the JSON formatter to fail
with "key must be a string".
Add it back and add a comment on
why this is needed.
2021-12-16 23:55:04 +01:00
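Note on the JSON serialization fix (#426) above: JSON object keys must be strings, so a map keyed by a structured type cannot be written out directly. The sketch below uses only the standard library to illustrate the idea of turning each key into its string form first. The `InputSource` variants and the `stringify_keys` helper are hypothetical stand-ins, not lychee's actual code, which uses a custom serializer instead.

```rust
use std::collections::BTreeMap;
use std::fmt;

// Hypothetical stand-in for lychee's `InputSource`; the real variants may differ.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum InputSource {
    RemoteUrl(String),
    FsPath(String),
    Stdin,
}

impl fmt::Display for InputSource {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InputSource::RemoteUrl(url) => write!(f, "{}", url),
            InputSource::FsPath(path) => write!(f, "{}", path),
            InputSource::Stdin => write!(f, "stdin"),
        }
    }
}

/// Re-key a map by the string form of each source, so that every key is a
/// plain string, which is what a JSON object requires.
fn stringify_keys(map: &BTreeMap<InputSource, Vec<String>>) -> BTreeMap<String, Vec<String>> {
    map.iter()
        .map(|(source, links)| (source.to_string(), links.clone()))
        .collect()
}
```

A custom serializer achieves the same effect at serialization time, without building a second map.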
Matthias
18c606d2e8
Fix docs badge 2021-12-16 20:47:35 +01:00
dependabot[bot]
58785311f0
Bump tokio from 1.14.0 to 1.15.0 (#425)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.14.0 to 1.15.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.14.0...tokio-1.15.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-16 19:32:51 +01:00
Matthias
166c86c30e
Use tokenizer for extraction; add benchmark (#424)
This avoids creating a DOM tree for link extraction and instead uses a `TokenSink` for on-the-fly extraction. In hyperfine benchmarks it was about 10-25% faster than master.

Old: 4.557 s ± 0.404 s
New: 3.832 s ± 0.131 s

The performance fluctuates a little less as well.

Some missing element/attribute pairs were also added, which contain links according to the HTML spec. These occur very rarely, but it's good to parse them for completeness' sake.

Furthermore, we tried to clean up a lot of papercuts around our types. We now differentiate between a `RawUri` (a stringly-typed value) and a `Uri`, which is a properly parsed URI type.
The extractor now only deals with extracting `RawUri`s while the collector creates the request objects.
2021-12-16 18:45:52 +01:00
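Note on the extractor/collector split (#424) above: a minimal sketch, using only the standard library, of how a stringy `RawUri` might be separated from a parsed `Uri`. The field names, the token tuples, the list of link attributes, and the simplified scheme check are all assumptions made for illustration. They do not reproduce lychee's types or its `TokenSink`-based extractor, and relative links are simply dropped here.

```rust
/// Hypothetical stringy type: whatever text the extractor found, unvalidated.
#[derive(Debug, Clone, PartialEq)]
struct RawUri {
    text: String,
    element: String,
    attribute: String,
}

/// Hypothetical parsed type: only constructed once the text looks like a URI.
#[derive(Debug, Clone, PartialEq)]
struct Uri {
    scheme: String,
    rest: String,
}

/// A few element/attribute pairs that carry links (illustrative, not exhaustive).
fn is_link_attribute(element: &str, attribute: &str) -> bool {
    matches!(
        (element, attribute),
        ("a", "href") | ("img", "src") | ("link", "href") | ("script", "src")
    )
}

/// Extractor step: walks (element, attribute, value) tokens and only
/// records raw strings. No parsing happens here.
fn extract(tokens: &[(&str, &str, &str)]) -> Vec<RawUri> {
    tokens
        .iter()
        .filter(|&&(element, attribute, _)| is_link_attribute(element, attribute))
        .map(|&(element, attribute, value)| RawUri {
            text: value.to_string(),
            element: element.to_string(),
            attribute: attribute.to_string(),
        })
        .collect()
}

/// Simplified parse: requires `scheme:rest` with a syntactically valid scheme.
fn parse(raw: &RawUri) -> Option<Uri> {
    let (scheme, rest) = raw.text.split_once(':')?;
    let mut chars = scheme.chars();
    let first = chars.next()?;
    let valid = first.is_ascii_alphabetic()
        && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'));
    if valid && !rest.is_empty() {
        Some(Uri {
            scheme: scheme.to_ascii_lowercase(),
            rest: rest.to_string(),
        })
    } else {
        None
    }
}

/// Collector step: turns raw strings into parsed URIs, dropping invalid ones.
fn collect(raw_uris: &[RawUri]) -> Vec<Uri> {
    raw_uris.iter().filter_map(parse).collect()
}
```

Keeping the extractor free of parsing means it can stay a cheap pass over tokens, while validation lives in one place.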
dependabot[bot]
c97ff95575
Bump once_cell from 1.8.0 to 1.9.0 (#423)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.8.0 to 1.9.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.8.0...v1.9.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-15 13:45:19 +01:00
dependabot[bot]
7ac3a44e3d
Bump serde_json from 1.0.72 to 1.0.73 (#422)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.72 to 1.0.73.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.72...v1.0.73)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-14 14:00:58 +01:00
dependabot[bot]
eac9d5b9a0
Bump openssl-sys from 0.9.71 to 0.9.72 (#421)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.71 to 0.9.72.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.71...openssl-sys-v0.9.72)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 13:26:15 +01:00
Matthias
c41ba64a69
Max concurrency moved to check (#419)
Concurrency is defined by the size of the channel that consumes
from the request stream in `check`.
2021-12-07 11:52:40 +01:00
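Note on the concurrency change (#419) above: the idea that a channel's capacity limits how much work is in flight can be sketched with a bounded channel from the standard library. This is an analogy only. lychee itself is built on tokio, and the `check` and `run` functions below are hypothetical stand-ins with a single consumer.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a real link check.
fn check(link: &str) -> bool {
    link.starts_with("https://")
}

/// Feed links through a bounded channel. The producer blocks as soon as
/// `max_concurrency` requests are queued, so the channel size caps how far
/// the producer can run ahead of the consumer.
fn run(links: Vec<String>, max_concurrency: usize) -> Vec<(String, bool)> {
    let (tx, rx) = sync_channel::<String>(max_concurrency);

    let producer = thread::spawn(move || {
        for link in links {
            // `send` blocks while the buffer is full (backpressure).
            if tx.send(link).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    let results: Vec<(String, bool)> = rx
        .iter()
        .map(|link| {
            let ok = check(&link);
            (link, ok)
        })
        .collect();

    producer.join().expect("producer thread panicked");
    results
}
```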
Matthias
ec03c8d0f7
Update README.md 2021-12-03 12:23:43 +01:00
Matthias
bd1e8ba8db Docker workflow fixes
See https://github.com/lycheeverse/lychee/pull/406#issuecomment-983903903
for discussion
2021-12-02 11:49:43 +01:00
Matthias
3d5135668b
Improve concurrency with streams (#330)
* Move from vec to streams

Previously we collected all inputs in one vector
before checking the links, which is not ideal.
Especially when reading many inputs (e.g. by using a glob pattern),
this could cause issues like running out of file handles.

By moving to streams we avoid that scenario. This is also the first
step towards improving performance for many inputs.

To stay as close as possible to the pre-stream behaviour, we want to stop processing
as soon as an Err value appears in the stream. This is easiest when the
stream is consumed in the main thread.
Previously, the stream was consumed in a tokio task and the main thread
waited for responses.
Now, a tokio task waits for responses (and displays them/registers
response stats) and the main thread sends links to the ClientPool.
To ensure that the main thread waits for all responses to have arrived
before finishing the ProgressBar and printing the stats, it waits for
the show_results_task to finish.


* Return collected links as Stream
* Initialize ProgressBar without length because we can't know the number of links without blocking
* Handle stream results in main thread, not in task
* Add basic directory support using jwalk
* Add test for HTTP protocol file type (http://)
* Remove deadpool (once again): Replaced with `futures::StreamExt::for_each_concurrent`.
* Refactor main; fix tests
* Move commands into separate submodule
* Simplify input handling
* Simplify collector
* Remove unnecessary unwrap
* Simplify main
* cleanup check
* clean up dump command
* Handle requests in parallel 
* Fix formatting and lints

Co-authored-by: Timo Freiberg <self@timofreiberg.com>
2021-12-01 18:25:11 +01:00
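Note on the move from vectors to streams (#330) above: the two properties described there, reading inputs lazily and stopping at the first `Err`, can be sketched with a plain iterator as a synchronous stand-in for an async stream. The `load` function, its fake link format, and the `opened` counter are hypothetical and exist only to make the laziness observable.

```rust
use std::cell::Cell;

/// Hypothetical input loader. Counts how many inputs were opened and fails
/// for names starting with "missing".
fn load(name: &str, opened: &Cell<usize>) -> Result<Vec<String>, String> {
    opened.set(opened.get() + 1);
    if name.starts_with("missing") {
        Err(format!("cannot read {}", name))
    } else {
        Ok(vec![
            format!("https://example.com/{}#1", name),
            format!("https://example.com/{}#2", name),
        ])
    }
}

/// Lazily yield links: an input is only opened once the consumer has
/// finished with all links from the previous inputs.
fn links_lazy<'a>(
    inputs: &'a [&'a str],
    opened: &'a Cell<usize>,
) -> impl Iterator<Item = Result<String, String>> + 'a {
    inputs
        .iter()
        .flat_map(move |name| -> Vec<Result<String, String>> {
            match load(name, opened) {
                Ok(links) => links.into_iter().map(Ok).collect(),
                Err(e) => vec![Err(e)],
            }
        })
}

/// Consume the links and stop at the first error, as the pre-stream
/// behaviour did. Returns the number of links seen before stopping.
fn check_until_error(inputs: &[&str], opened: &Cell<usize>) -> Result<usize, String> {
    let mut checked = 0;
    for link in links_lazy(inputs, opened) {
        let _link = link?;
        checked += 1;
    }
    Ok(checked)
}
```

With the inputs `a.md`, `missing.md`, `c.md`, the third input is never opened, which is the property that avoids exhausting file handles on large globs.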
dependabot[bot]
bcd1d6725a
Bump reqwest from 0.11.6 to 0.11.7 (#415)
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.6 to 0.11.7.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.6...v0.11.7)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-01 16:06:38 +01:00
Matthias
ddd4f82bbf
Test pushing new Docker images 2021-12-01 16:06:23 +01:00
Matthias
22ac3c213c
Add dispatch 2021-12-01 10:22:34 +01:00
faust
7353c8793b
Publish arm64 docker image (#406) 2021-12-01 09:58:32 +01:00
dependabot[bot]
c3ec652e75
Bump anyhow from 1.0.50 to 1.0.51 (#412)
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.50 to 1.0.51.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.50...1.0.51)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-30 18:54:30 +01:00
dependabot[bot]
8306d0c4f9
Bump tracing-subscriber from 0.3.2 to 0.3.3 (#413)
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.3.2 to 0.3.3.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.2...tracing-subscriber-0.3.3)

---
updated-dependencies:
- dependency-name: tracing-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-30 18:54:22 +01:00
dependabot[bot]
3725c0e9b5
Bump anyhow from 1.0.48 to 1.0.50 (#411)
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.48 to 1.0.50.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.48...1.0.50)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 15:51:40 +01:00
dependabot[bot]
06140fff3a
Bump linkify from 0.7.0 to 0.8.0 (#409)
Bumps [linkify](https://github.com/robinst/linkify) from 0.7.0 to 0.8.0.
- [Release notes](https://github.com/robinst/linkify/releases)
- [Changelog](https://github.com/robinst/linkify/blob/main/CHANGELOG.md)
- [Commits](https://github.com/robinst/linkify/compare/0.7.0...0.8.0)

---
updated-dependencies:
- dependency-name: linkify
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-26 13:33:41 +01:00
dependabot[bot]
1af7acea7c
Bump serde_json from 1.0.71 to 1.0.72 (#407)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.71 to 1.0.72.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.71...v1.0.72)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-25 17:39:37 +01:00
Erik Rigtorp
7d92609ff2
Muffet supports recursively checking links (#403) 2021-11-23 13:32:07 +01:00
dependabot[bot]
31d538bab2
Bump predicates from 2.0.3 to 2.1.0 (#404)
Bumps [predicates](https://github.com/assert-rs/predicates-rs) from 2.0.3 to 2.1.0.
- [Release notes](https://github.com/assert-rs/predicates-rs/releases)
- [Changelog](https://github.com/assert-rs/predicates-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/predicates-rs/compare/v2.0.3...v2.1.0)

---
updated-dependencies:
- dependency-name: predicates
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-23 13:31:50 +01:00
dependabot[bot]
70c23ef92f
Bump anyhow from 1.0.47 to 1.0.48 (#405)
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.47 to 1.0.48.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.47...1.0.48)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-23 13:31:31 +01:00
Matthias
591cbdbebb
Add support for .lycheeignore file #308 (#402)
This is similar to files like .gitignore and .dockerignore
and gets merged into exclude_files
2021-11-23 01:39:53 +01:00
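Note on `.lycheeignore` support (#402) above: a minimal sketch of merging the lines of an ignore file into an existing exclude list. The function name, the trimming of whitespace, the skipping of blank lines, and the de-duplication are assumptions for illustration. The commit message only states that the file's contents are merged into the excludes.

```rust
/// Merge each non-empty line of an ignore file into `excludes`.
/// Trimming and de-duplication are assumptions of this sketch.
fn merge_ignore_file(excludes: &mut Vec<String>, ignore_file_contents: &str) {
    for line in ignore_file_contents.lines() {
        let pattern = line.trim();
        if !pattern.is_empty() && !excludes.iter().any(|e| e == pattern) {
            excludes.push(pattern.to_string());
        }
    }
}
```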
dependabot[bot]
cda11359ee
Bump anyhow from 1.0.45 to 1.0.47 (#398)
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.45 to 1.0.47.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.45...1.0.47)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 15:09:16 +01:00
dependabot[bot]
5da3f49410
Bump tracing-subscriber from 0.3.1 to 0.3.2 (#401)
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.3.1 to 0.3.2.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.1...tracing-subscriber-0.3.2)

---
updated-dependencies:
- dependency-name: tracing-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 15:09:08 +01:00
Matthias
1eb4453957
Only print source in verbose mode (#400)
This way the normal link output can be fed into
another tool without data mangling.
2021-11-21 17:22:04 +01:00
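Note on verbose-only source output (#400) above: the point is that the default output is just the link, so it can be piped into other tools unchanged. A small sketch follows. The `format_link` helper and the exact verbose layout are hypothetical, not lychee's real output format.

```rust
/// Format one link for output. The source is only appended in verbose mode,
/// so the default output stays a bare link.
fn format_link(uri: &str, source: &str, verbose: bool) -> String {
    if verbose {
        format!("{} ({})", uri, source)
    } else {
        uri.to_string()
    }
}
```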
Matthias
d96c1269ff
Use thiserror for error handling (#399)
This removes some boilerplate and is arguably better for
maintainability than handwriting the error handling code.
It also avoids inconsistent functionality across the
error variants.
thiserror is also the de-facto standard for library
error types as of today.
2021-11-20 01:42:50 +01:00
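Note on the move to thiserror (#399) above: for context, this is roughly the kind of hand-written boilerplate that thiserror's `#[derive(Error)]`, `#[error("...")]`, and `#[from]` attributes replace. The `CheckError` enum and both helper functions are hypothetical examples using only the standard library, not lychee's actual error type.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for illustration; not lychee's actual error enum.
#[derive(Debug)]
enum CheckError {
    EmptyUrl,
    InvalidStatus(ParseIntError),
}

// With thiserror, these messages would live in `#[error("...")]` attributes.
impl fmt::Display for CheckError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CheckError::EmptyUrl => write!(f, "URL is empty"),
            CheckError::InvalidStatus(e) => write!(f, "invalid status code: {}", e),
        }
    }
}

// Easy to forget for a new variant when written by hand.
impl Error for CheckError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            CheckError::EmptyUrl => None,
            CheckError::InvalidStatus(e) => Some(e),
        }
    }
}

// With thiserror, `#[from]` on the variant field would generate this impl.
impl From<ParseIntError> for CheckError {
    fn from(e: ParseIntError) -> Self {
        CheckError::InvalidStatus(e)
    }
}

/// Parse an HTTP status code; `?` converts the error via `From`.
fn parse_status(text: &str) -> Result<u16, CheckError> {
    Ok(text.trim().parse::<u16>()?)
}

/// Reject empty or whitespace-only URLs.
fn non_empty_url(url: &str) -> Result<&str, CheckError> {
    if url.trim().is_empty() {
        Err(CheckError::EmptyUrl)
    } else {
        Ok(url)
    }
}
```

Each new variant needs matching arms in both `Display` and `source`, which is where hand-written variants tend to drift apart.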
dependabot[bot]
20eee6f000
Bump predicates from 1.0.8 to 2.0.3 (#381)
Bumps [predicates](https://github.com/assert-rs/predicates-rs) from 1.0.8 to 2.0.3.
- [Release notes](https://github.com/assert-rs/predicates-rs/releases)
- [Changelog](https://github.com/assert-rs/predicates-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/predicates-rs/compare/v1.0.8...v2.0.3)

---
updated-dependencies:
- dependency-name: predicates
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-18 23:22:38 +01:00
dependabot[bot]
fc9790b98b
Bump openssl-sys from 0.9.70 to 0.9.71 (#395)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.70 to 0.9.71.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.70...openssl-sys-v0.9.71)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-18 16:37:48 +01:00
dependabot[bot]
25b70a1129
Bump serde_json from 1.0.70 to 1.0.71 (#397)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.70 to 1.0.71.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.70...v1.0.71)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-18 16:37:36 +01:00
Matthias
30a0fd3856
Bump version to 0.8.1 (#396) 2021-11-18 00:59:28 +01:00
Matthias
4008c2ce38 Add missing newline 2021-11-18 00:46:20 +01:00
Matthias
b97fda34d0
Add support for different output formats (compact, detailed, markdown) (#375) 2021-11-18 00:44:48 +01:00
Markus Unterwaditzer
d3ed133f10
Remove srcset attribute from list of "link" attrs (#393)
* Remove srcset attribute from list of "link" attrs

Fix #390

* Add test for srcset

* Add note about srcSet links

* add real support for srcset

Co-authored-by: Matthias <matthias-endler@gmx.net>
2021-11-16 22:58:10 +01:00
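Note on `srcset` handling (#393) above: a `srcset` value is a comma-separated list of candidates, each a URL optionally followed by a width or density descriptor, so it cannot be treated as a single link. The sketch below is a deliberately simplified parser and not lychee's implementation. It assumes URLs contain no commas, which the full HTML parsing algorithm does not assume. `parse_srcset` is a hypothetical name.

```rust
/// Extract the URL of each image candidate from a `srcset` value.
/// Simplified: splits on commas, so URLs containing commas are not handled.
fn parse_srcset(srcset: &str) -> Vec<&str> {
    srcset
        .split(',')
        // The URL is the first whitespace-separated token of a candidate;
        // any descriptor such as `480w` or `2x` follows it.
        .filter_map(|candidate| candidate.split_whitespace().next())
        .collect()
}
```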
dependabot[bot]
893dfff453
Bump serde_json from 1.0.68 to 1.0.70 (#391)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.68 to 1.0.70.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.68...v1.0.70)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-16 13:39:38 +01:00
dependabot[bot]
09a4754c55
Bump deadpool from 0.9.1 to 0.9.2 (#392)
Bumps [deadpool](https://github.com/bikeshedder/deadpool) from 0.9.1 to 0.9.2.
- [Release notes](https://github.com/bikeshedder/deadpool/releases)
- [Changelog](https://github.com/bikeshedder/deadpool/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bikeshedder/deadpool/compare/deadpool-v0.9.1...deadpool-v0.9.2)

---
updated-dependencies:
- dependency-name: deadpool
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-16 13:29:21 +01:00
dependabot[bot]
31ec9a1fe7
Bump tokio from 1.13.0 to 1.14.0 (#394)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.13.0 to 1.14.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-16 13:28:59 +01:00
Matthias
34f379319d
Fix link checking after upgrade to 0.8.0 (#386) 2021-11-05 22:31:05 +01:00
Derek Croote
e8bab82d76
Fix clippy lint (#383) 2021-11-05 10:22:51 +01:00
Matthias
f92f4f516a Update deps 2021-11-03 23:45:42 +01:00
Matthias
69e5d56687
Add more known false positive schema domains (#376)
See https://github.com/lycheeverse/lychee-action/issues/53
2021-10-31 14:53:40 +01:00
dependabot[bot]
e346033a10
Bump openssl-sys from 0.9.67 to 0.9.68 (#373)
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.67 to 0.9.68.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.67...openssl-sys-v0.9.68)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-28 14:35:49 +02:00
Matthias
37bbb14add Merge branch 'master' of github.com:lycheeverse/lychee 2021-10-28 03:30:50 +02:00
Matthias
e7b4f15da3 Disable tokio console-subscriber as it is not published yet 2021-10-28 03:15:44 +02:00
dependabot[bot]
d3a72d3816
Bump deadpool from 0.7.0 to 0.9.1 (#371)
* Bump deadpool from 0.7.0 to 0.9.1

Bumps [deadpool](https://github.com/bikeshedder/deadpool) from 0.7.0 to 0.9.1.
- [Release notes](https://github.com/bikeshedder/deadpool/releases)
- [Changelog](https://github.com/bikeshedder/deadpool/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bikeshedder/deadpool/compare/deadpool-v0.7.0...deadpool-v0.9.1)

---
updated-dependencies:
- dependency-name: deadpool
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Attempt fix for deadpool v0.8.0+ (#372)

Signed-off-by: MichaIng <micha@dietpi.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MichaIng <micha@dietpi.com>
2021-10-28 02:05:58 +02:00
Matthias
47426c6971
Fix typos, grammar 2021-10-28 02:05:35 +02:00