lychee

mirror of https://github.com/Hopiu/lychee.git synced 2026-05-13 16:23:12 +00:00

Author	SHA1	Message	Date
dependabot[bot]	2114406235	Bump futures from 0.3.19 to 0.3.21 (#493) Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.19 to 0.3.21. - [Release notes](https://github.com/rust-lang/futures-rs/releases) - [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.19...0.3.21) --- updated-dependencies: - dependency-name: futures dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-02-18 10:58:01 +01:00
Matthias	812663d832	Prevent flaky tests (#514) Move from example.org to example.com, which seems to be more permissive for testing	2022-02-18 10:29:49 +01:00
Matthias	47df7780fe	Use captured identifiers in format strings (#507) Makes for arguably cleaner-looking code. The downside is that the MSRV is 1.58 https://blog.rust-lang.org/2022/01/13/Rust-1.58.0.html Given that nobody uses lychee as a library yet and we have precompiled binaries, it's an acceptable tradeoff. My little research revealed that this is a much-liked feature: https://twitter.com/matthiasendler/status/1483895557621960715	2022-02-12 10:51:52 +01:00
Lucius Hu	5921fd248a	Update license files (#497) - The date in the MIT license files has been updated to 2022 - Each of the benchmark and example crates is theoretically a separate package in Cargo's sense. So license files are added for them as well. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>	2022-02-08 10:59:54 +01:00
Markus Unterwaditzer	68d09f7e5b	Add html5gum as alternative link extractor (#480) html5gum is an HTML parser that offers lower-level control over which tokens actually get created and are tracked. As such, the extractor doesn't allocate any tokens it doesn't care about. On some benchmarks it provides a substantial performance boost. The old parser, html5ever, is still available by setting the `LYCHEE_USE_HTML5EVER=1` env var.	2022-02-07 22:54:47 +01:00
dependabot[bot]	a8d1359df4	Bump tokio from 1.15.0 to 1.16.1 (#482) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.15.0 to 1.16.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.15.0...tokio-1.16.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-28 13:23:02 +01:00
Matthias	ac490f9c53	Add caching functionality (v2) (#443) A while ago, caching was removed due to some issues (see #349). This is a new implementation with the following improvements: * Architecture: The new implementation is decoupled from the collector, which was a major issue in the last version. Now the collector has a single responsibility: collecting links. This also avoids race-conditions when running multiple collect_links instances, which probably was an issue before. * Performance: Uses DashMap under the hood, which was noticeably faster than Mutex<HashMap> in my tests. * Simplicity: The cache format is a CSV file with two columns: URI and status. I decided to create a new struct called CacheStatus for serialization, because trying to serialize the error kinds in Status turned out to be a bit of a nightmare and at this point I don't think it's worth the pain (and probably isn't idiomatic either). This is an optional feature. Caching only gets used if the `--cache` flag is set.	2022-01-14 15:25:51 +01:00
dependabot[bot]	5a5ed00ba4	Bump reqwest from 0.11.8 to 0.11.9 (#455) Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.8 to 0.11.9. - [Release notes](https://github.com/seanmonstar/reqwest/releases) - [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.8...v0.11.9) --- updated-dependencies: - dependency-name: reqwest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-11 13:43:35 +01:00
Matthias	36450621fa	Update dependencies (#454)	2022-01-10 22:35:37 +01:00
Matthias	0645177b84	Bump version (#450)	2022-01-10 01:38:46 +01:00
dependabot[bot]	7a4de16138	Bump http from 0.2.5 to 0.2.6 (#438) Bumps [http](https://github.com/hyperium/http) from 0.2.5 to 0.2.6. - [Release notes](https://github.com/hyperium/http/releases) - [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/http/compare/v0.2.5...v0.2.6) --- updated-dependencies: - dependency-name: http dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-12-31 18:33:22 +01:00
dependabot[bot]	147fa8de87	Bump reqwest from 0.11.7 to 0.11.8 (#432) Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.7 to 0.11.8. - [Release notes](https://github.com/seanmonstar/reqwest/releases) - [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.7...v0.11.8) --- updated-dependencies: - dependency-name: reqwest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-12-21 16:28:22 +01:00
dependabot[bot]	b2bc0e7eac	Bump futures from 0.3.18 to 0.3.19 (#430) Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.18 to 0.3.19. - [Release notes](https://github.com/rust-lang/futures-rs/releases) - [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.18...0.3.19) --- updated-dependencies: - dependency-name: futures dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-12-20 15:37:18 +01:00
Matthias	01393b34a2	Upgrade to Rust 2021 (#427)	2021-12-17 01:32:13 +01:00
dependabot[bot]	58785311f0	Bump tokio from 1.14.0 to 1.15.0 (#425) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.14.0 to 1.15.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.14.0...tokio-1.15.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-12-16 19:32:51 +01:00
Matthias	166c86c30e	Use tokenizer for extraction; add benchmark (#424) This avoids creating a DOM tree for link extraction and instead uses a `TokenSink` for on-the-fly extraction. In hyperfine benchmarks it was about 10-25% faster than master. Old: 4.557 s ± 0.404 s New: 3.832 s ± 0.131 s The performance fluctuates a little less as well. Some missing element/attribute pairs were also added, which contain links according to the HTML spec. These occur very rarely, but it's good to parse them for completeness' sake. Also cleaned up a lot of papercuts around our types. We now differentiate between a `RawUri` (stringy-types) and a Uri, which is a properly parsed `URI` type. The extractor now only deals with extracting `RawUri`s while the collector creates the request objects.	2021-12-16 18:45:52 +01:00
Matthias	c41ba64a69	Max concurrency moved to check (#419) Concurrency is defined by the channel size consuming from the request stream in `check`	2021-12-07 11:52:40 +01:00
Matthias	3d5135668b	Improve concurrency with streams (#330) * Move from vec to streams Previously we collected all inputs in one vector before checking the links, which is not ideal. Especially when reading many inputs (e.g. by using a glob pattern), this could cause issues like running out of file handles. By moving to streams we avoid that scenario. This is also the first step towards improving performance for many inputs. To stay as close as possible to the pre-stream behaviour, we want to stop processing as soon as an Err value appears in the stream. This is easiest when the stream is consumed in the main thread. Previously, the stream was consumed in a tokio task and the main thread waited for responses. Now, a tokio task waits for responses (and displays them/registers response stats) and the main thread sends links to the ClientPool. To ensure that the main thread waits for all responses to have arrived before finishing the ProgressBar and printing the stats, it waits for the show_results_task to finish. * Return collected links as Stream * Initialize ProgressBar without length because we can't know the amount of links without blocking * Handle stream results in main thread, not in task * Add basic directory support using jwalk * Add test for HTTP protocol file type (http://) * Remove deadpool (once again): Replaced with `futures::StreamExt::for_each_concurrent`. * Refactor main; fix tests * Move commands into separate submodule * Simplify input handling * Simplify collector * Remove unnecessary unwrap * Simplify main * cleanup check * clean up dump command * Handle requests in parallel * Fix formatting and lints Co-authored-by: Timo Freiberg <self@timofreiberg.com>	2021-12-01 18:25:11 +01:00
dependabot[bot]	bcd1d6725a	Bump reqwest from 0.11.6 to 0.11.7 (#415) Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.6 to 0.11.7. - [Release notes](https://github.com/seanmonstar/reqwest/releases) - [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.6...v0.11.7) --- updated-dependencies: - dependency-name: reqwest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-12-01 16:06:38 +01:00
Matthias	30a0fd3856	Bump version to 0.8.1 (#396)	2021-11-18 00:59:28 +01:00
dependabot[bot]	31ec9a1fe7	Bump tokio from 1.13.0 to 1.14.0 (#394) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.13.0 to 1.14.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/commits) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-11-16 13:28:59 +01:00
Matthias	ed0efcd4f8	Prepare release	2021-10-28 00:34:48 +02:00
dependabot[bot]	d79b57fb9d	Bump reqwest from 0.11.5 to 0.11.6 (#364) Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.5 to 0.11.6. - [Release notes](https://github.com/seanmonstar/reqwest/releases) - [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.5...v0.11.6) --- updated-dependencies: - dependency-name: reqwest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-10-26 01:19:35 +02:00
dependabot[bot]	2be3b3b896	Bump reqwest from 0.11.4 to 0.11.5 (#356) Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.4 to 0.11.5. - [Release notes](https://github.com/seanmonstar/reqwest/releases) - [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/reqwest/compare/v0.11.4...v0.11.5) --- updated-dependencies: - dependency-name: reqwest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-10-10 02:41:08 +02:00
Matthias	56726f41fc	Add back connection pool (#355)	2021-10-08 13:08:44 +02:00
MichaIng	961f12e58e	Remove cache from collector and remove custom reqwest client pool * Reqwest comes with its own request pool, so there's no need in adding another layer of indirection. This also gets rid of a lot of allocs. * Remove cache from collector * Improve error handling and documentation * Add back test for request caching in single file Signed-off-by: MichaIng <micha@dietpi.com> Co-authored-by: Matthias <matthias-endler@gmx.net>	2021-10-07 18:07:18 +02:00
dependabot[bot]	6848b20546	Bump tokio from 1.11.0 to 1.12.0 (#343) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.11.0 to 1.12.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.11.0...tokio-1.12.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-09-22 14:28:59 +02:00
dependabot[bot]	61fa23099b	Bump http from 0.2.4 to 0.2.5 (#344) Bumps [http](https://github.com/hyperium/http) from 0.2.4 to 0.2.5. - [Release notes](https://github.com/hyperium/http/releases) - [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/http/compare/v0.2.4...v0.2.5) --- updated-dependencies: - dependency-name: http dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-09-22 14:27:23 +02:00
Matthias	bfa3b1b6a1	Introduce Base type, which can be a path or URL	2021-09-06 15:15:40 +02:00
Matthias	f9bf52ef10	Add support for base_dir	2021-09-06 15:15:05 +02:00
Daniel Doubrovkine (dB.)	f866abef61	Fix publish workflow (#309)	2021-09-04 01:49:29 +02:00
dependabot[bot]	19ae5fecc0	Bump tokio from 1.5.0 to 1.6.0 (#248) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.5.0 to 1.6.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.5.0...tokio-1.6.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-05-17 10:38:33 +02:00
Matthias	fe399c0a8c	Simple URI cache (#243)	2021-05-04 13:28:39 +02:00
Matthias	9f75f28d3d	Add example folder (#241)	2021-04-30 13:33:24 +02:00
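
Commit 47df7780fe switches to captured identifiers in format strings, which is why the MSRV moved to 1.58. As a rough illustration of what that change looks like (the function names and message text below are made up for this sketch and are not taken from lychee's code), both helpers produce the same string:

```rust
// Requires Rust 1.58 or later for the captured-identifier form.

/// Pre-1.58 style: values are passed as positional arguments.
/// (Hypothetical helper, not part of lychee.)
fn describe_positional(uri: &str, status: u16) -> String {
    format!("{} returned status {}", uri, status)
}

/// 1.58+ style: identifiers are captured directly from the enclosing scope.
/// (Hypothetical helper, not part of lychee.)
fn describe_captured(uri: &str, status: u16) -> String {
    format!("{uri} returned status {status}")
}
```

Only plain identifiers can be captured this way; expressions such as field accesses or method calls still have to be passed as explicit arguments.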

34 commits
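
Commit ac490f9c53 describes the cache format as a CSV file with two columns, URI and status. The sketch below shows one way such a round trip could look using only the standard library. It is not lychee's implementation: it uses a plain `HashMap` instead of DashMap, and `SimpleStatus`, `to_csv` and `from_csv` are hypothetical names standing in for the `CacheStatus` type mentioned in the commit. It also assumes URIs contain no commas or newlines, which a real CSV writer would handle by quoting.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the serializable status described in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SimpleStatus {
    Ok,
    Failed,
}

impl SimpleStatus {
    /// Text written to the second CSV column.
    fn as_str(&self) -> &'static str {
        match self {
            SimpleStatus::Ok => "ok",
            SimpleStatus::Failed => "failed",
        }
    }

    /// Inverse of `as_str`; returns `None` for unknown values.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "ok" => Some(SimpleStatus::Ok),
            "failed" => Some(SimpleStatus::Failed),
            _ => None,
        }
    }
}

/// Serializes the cache as one `uri,status` line per entry, sorted by URI
/// so the output is deterministic.
fn to_csv(cache: &HashMap<String, SimpleStatus>) -> String {
    let mut rows: Vec<_> = cache.iter().collect();
    rows.sort_by(|a, b| a.0.cmp(b.0));
    rows.iter()
        .map(|(uri, status)| format!("{uri},{}\n", status.as_str()))
        .collect()
}

/// Parses `uri,status` lines back into a map, skipping malformed lines.
fn from_csv(data: &str) -> HashMap<String, SimpleStatus> {
    data.lines()
        .filter_map(|line| {
            // Split at the last comma so the status is always the final column.
            let (uri, status) = line.rsplit_once(',')?;
            Some((uri.to_string(), SimpleStatus::parse(status)?))
        })
        .collect()
}
```

Keeping the on-disk status as a small dedicated type, separate from the richer in-memory status, mirrors the reasoning in the commit message: the error kinds do not need to be serialized.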