Commit graph

126 commits

Author SHA1 Message Date
Trask Stalnaker
6d0e94c799
Introduce --root-dir (#1576)
* windows

* Introduce --root-path

* lint

* lint

* Simplification

* Add unit tests

* Add integration test

* Sync docs

* Add missing comment to make CI happy

* Revert one of the Windows-specific changes because it was causing a test failure

* Support both options at the same time

* Revert a comment change that is no longer applicable

* Remove unused code

* Fix and simplification

* Integration test both at the same time

* Unit tests both at the same time

* Remove now redundant comment

* Revert windows-specific change, seems not needed after recent changes

* Use Collector::default()

* extract method and unit tests

* clippy

* clippy: &Option<A> -> Option<&A>

* Remove outdated comment

* Rename --root-path to --root-dir

* Restrict --root-dir to absolute paths for now

* Move root dir check
2024-12-13 14:36:33 +01:00
Trask Stalnaker
034fd129a9
Fix retries (#1573) 2024-11-27 22:58:32 +01:00
dependabot[bot]
2d35dccd34
Bump the dependencies group with 4 updates (#1566)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias@endler.dev>
2024-11-12 23:51:39 +01:00
Matthias Endler
9dc42176fa
Rename fail_map to error_map for improved clarity in response statistics (#1560)
Fixes #1446
2024-11-08 09:02:33 +01:00
Matthias Endler
f4a35ff580
Add quirks support for youtube-nocookie.com and youtube embed URLs (#1563) 2024-11-08 06:15:26 +01:00
Matthias Endler
e794b40d4d
Support excluded paths in --dump-inputs (#1556) 2024-11-07 16:32:32 +01:00
Matthias Endler
4beec320c8
Improve robustness of cache integration test (#1557) 2024-11-07 16:24:13 +01:00
Matthias Endler
71564344de
Fix: Bring back error output for links (#1553)
With the last lychee release, we simplified the status output for links.

While this reduced the visual noise, it also accidentally caused the source of errors to not be printed anymore. This change brings back the additional error information as part of the final report output. Furthermore, it shows the error information in the progress output if verbose mode is activated.

Fixes #1487
2024-11-07 00:22:50 +01:00
autoantwort
98015907f2
Ignore casing when processing markdown fragments + check for percent encoded anchors (#1535)
We must also check the fragment before it is percent-decoded as required by the HTML standard.

Fixes https://github.com/lycheeverse/lychee/issues/1467
2024-10-28 09:21:13 +01:00
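The matching rule described in 98015907f2 (compare without regard to case, and try the fragment both as written and percent-decoded) might look roughly like the sketch below. It is written in Python for brevity, not in lychee's Rust, and `fragment_matches` is a hypothetical helper, not a function from the lychee codebase.

```python
from typing import Iterable
from urllib.parse import unquote


def fragment_matches(fragment: str, anchors: Iterable[str]) -> bool:
    """Check a link fragment against the anchors found in the target document.

    The fragment is tried as written first and then percent-decoded;
    both comparisons ignore case.
    """
    known = {anchor.lower() for anchor in anchors}
    candidates = (fragment, unquote(fragment))
    return any(candidate.lower() in known for candidate in candidates)
```

Trying the raw fragment first covers documents whose anchor IDs literally contain a percent-encoded sequence; the decoded form covers the more common case of an encoded non-ASCII heading.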
Matthias Endler
bc0b05bb28
Refactor cache handling test to make it more robust (#1548) 2024-10-27 02:19:35 +02:00
Matthias Endler
812941c2aa
Fix format option in configuration file (#1547) 2024-10-27 02:17:00 +02:00
Matthias Endler
e43086c2e9
Fix skipping of email addresses in stylesheets (#1546) 2024-10-27 01:32:11 +02:00
Matthias Endler
3094bbca33
Add support for relative links (#1489)
This commit introduces several improvements to the file checking process and URI handling:

- Extract file checking logic into separate `Checker` structs (`FileChecker`, `WebsiteChecker`, `MailChecker`)
- Improve handling of relative and absolute file paths
- Enhance URI parsing and creation from file paths
- Refactor `create_request` function for better clarity and error handling

These changes provide better support for resolving relative links, handling different base URLs, and working with file paths.

Fixes https://github.com/lycheeverse/lychee/issues/1296 and https://github.com/lycheeverse/lychee/issues/1480
2024-10-26 04:07:37 +02:00
Damien Mathieu
f0ebac29a2
Allow excluding cache based on status code (#1403)
This introduces an option `--cache-exclude-status`, which allows specifying a range of HTTP status codes which will be ignored from the cache.

Closes #1400.
2024-10-14 02:41:56 +02:00
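The idea behind `--cache-exclude-status` (f0ebac29a2) is that a response is only written to the cache when its status code falls outside the excluded ranges. A minimal Python sketch of that check follows; `should_cache` and the half-open `(start, end)` tuples are assumptions made for illustration and do not reflect lychee's API or its command-line range syntax.

```python
from typing import Iterable, Tuple

# Hypothetical representation: half-open ranges [start, end) of HTTP status codes.
StatusRange = Tuple[int, int]


def should_cache(status: int, excluded: Iterable[StatusRange]) -> bool:
    """Return True if a response with this status may be stored in the cache."""
    return not any(start <= status < end for start, end in excluded)


if __name__ == "__main__":
    excluded = [(500, 600), (429, 430)]
    for code in (200, 404, 429, 503):
        print(code, should_cache(code, excluded))
```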
Thomas Zahner
462033a294 Test ignored files 2024-09-22 19:09:35 +02:00
Thomas Zahner
0e9b6532d2 Test hidden files 2024-09-22 19:09:35 +02:00
Thomas Zahner
ee25adbed1 Update README.md test to trim whitespaced lines & update README.md 2024-09-22 19:09:35 +02:00
dependabot[bot]
9e1a99a936
Bump the dependencies group with 4 updates (#1490)
* Bump the dependencies group with 4 updates

Bumps the dependencies group with 4 updates: [reqwest](https://github.com/seanmonstar/reqwest), [serde](https://github.com/serde-rs/serde), [serde_json](https://github.com/serde-rs/json) and [typed-builder](https://github.com/idanarye/rust-typed-builder).


Updates `reqwest` from 0.12.5 to 0.12.7
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.12.5...v0.12.7)

Updates `serde` from 1.0.208 to 1.0.209
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.208...v1.0.209)

Updates `serde_json` from 1.0.125 to 1.0.127
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/1.0.125...1.0.127)

Updates `typed-builder` from 0.19.1 to 0.20.0
- [Changelog](https://github.com/idanarye/rust-typed-builder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/idanarye/rust-typed-builder/commits)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: typed-builder
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>

* skip flaky test

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias@endler.dev>
2024-08-27 11:43:39 +02:00
Matthias Endler
bf19934949
Run cargo nextest in CI and enable all tests (#1483) 2024-08-13 18:45:09 +02:00
Hugo McNally
4bb8a61545
Updated pulldown-cmark dependency and fixed maths parsing (#1473)
* Update pulldown-cmark version to 0.11.0
* Fix markdown math parsing
* Fix lints
* Disable flaky wayback test

---------

Co-authored-by: Matthias <matthias@endler.dev>
2024-08-06 15:43:34 +02:00
Matthias Endler
dedc554eda
Add response formatter; refactor stats formatter (#1398)
This adds support for formatting responses in different ways.

For now, the options are:

* `plain`: No color, basic formatting
* `color`: Color, indented formatting (default)
* `emoji`: Fancy mode with emoji icons

Fixes #546
Related to #271
2024-06-14 19:47:52 +02:00
Johannes Schindelin
8c6eee9b5f
Add a way to handle "pretty URLs", i.e. URIs without .html extension (#1422)
In many circumstances (GitHub Pages, Apache configured with MultiViews,
etc), web servers process URIs by appending the `.html` file extension
when no file is found at the path specified by the URI but a `.html`
file corresponding to that path _is_ found.

To allow Lychee to use the fast, offline method of checking such files
locally via the `file://` scheme, let's handle this scenario gracefully
by adding the `--fallback-extensions=html` option.

Note: This new option can take a list of file extensions to use; the
first one for which a corresponding file is found is then used.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-06-11 16:11:24 +02:00
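The lookup described in 8c6eee9b5f (use the path as given if it exists, otherwise try each fallback extension in order and take the first hit) can be sketched as follows. This is an illustrative Python version under the assumption that extensions are appended to the path; `resolve_with_fallback` is a hypothetical name, not lychee's implementation.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_with_fallback(path: Path, extensions: Sequence[str]) -> Optional[Path]:
    """Return the existing file for `path`, trying fallback extensions in order."""
    if path.is_file():
        return path
    for ext in extensions:
        # Append rather than replace, so "docs/v1.2" becomes "docs/v1.2.html".
        candidate = Path(f"{path}.{ext}")
        if candidate.is_file():
            return candidate
    return None
```

With `extensions=["html", "htm"]`, a link to `about` resolves to `about.html` when both `about.html` and `about.htm` exist, because the list order decides.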
John Bampton
7be088bbfc
Fix spelling; Github -> GitHub (#1416) 2024-04-25 22:44:24 +02:00
Orhun Parmaksız
13f4339710
fix(tests): update the asserts in expired SSL certificate test (#1373) 2024-02-09 20:35:47 +01:00
Matthias Endler
90ed0e70b7
Bump to version; move to workspace versioning (#1372) 2024-02-05 16:50:32 +01:00
Hugo McNally
9ff4a838ce
Fixed fragment generation for headings with inline code (#1370)
* Added code headings to fragment cli test

* Fixed fragment generation for headings with inline code
2024-02-05 01:07:56 +01:00
Norbert Kamiński
2a95944ef5
status.rs: Make json output more verbose (#1367)
* status.rs: Make json output more verbose

Currently if the status response has no status code, json output
contains only a text field which gives no real information about
the cause of the problem. The patch adds a field with more detailed
information when the status response contains some details.

Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com>

* cli.rs: Test parsing of error details in JSON format

Some network errors, such as SSL errors, have no status code but can be
identified by the error status details. This patch adds a test case to
verify that the error details are parsed properly in the JSON format.

Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com>

---------

Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com>
2024-01-30 23:58:18 +01:00
Orhun Parmaksız
b882a14161
fix(tests): update the expected output in cli tests (#1362) 2024-01-28 02:52:45 +01:00
Matthias Endler
ad3ba31184
Merge missing include_mail flag into config (#1357) 2024-01-24 13:39:43 +01:00
Matthias Endler
d481c061b9
Always output valid JSON with --format=json (#1356)
Previously, when using JSON as the output format, any supplementary warnings included in the output would invalidate the JSON structure. This pull request addresses this issue by redirecting any extra warnings to `stderr`. This change guarantees that the output remains valid JSON even when additional warnings are present.

Fixes https://github.com/lycheeverse/lychee/issues/1355
2024-01-24 13:12:55 +01:00
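The separation described in d481c061b9 (machine-readable report on `stdout`, everything else on `stderr`) is a general pattern. A small Python sketch is shown below; `emit_report` and the report fields are hypothetical and are not taken from lychee's output format.

```python
import json
import sys
from typing import Iterable, TextIO


def emit_report(report: dict, warnings: Iterable[str],
                out: TextIO = sys.stdout, err: TextIO = sys.stderr) -> None:
    """Write the JSON report to `out` and every warning to `err`."""
    for warning in warnings:
        # Warnings never touch `out`, so its content stays parseable JSON.
        print(f"warning: {warning}", file=err)
    json.dump(report, out)
    out.write("\n")
```

A consumer can then pipe `stdout` straight into a JSON parser while warnings remain visible in the terminal or in CI logs.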
Matthias
f933656161 Add integration test for accept (int and string) 2024-01-10 00:10:22 +01:00
Levi Zim
704126eab4
fix(test_cookie_jar): use google.com/ncr (#1336)
google.com might redirect to other domains, causing cookie_jar test to fail.
2024-01-06 12:31:23 +01:00
Matthias Endler
63ba63f7c9
Exclude example TLDs from RFC 2606 (#1335)
Fixes https://github.com/lycheeverse/lychee/issues/1283
2024-01-05 18:48:15 +01:00
Hugo McNally
c9b707ea74
Decode percent escapes in fragments (#1275)
* Added test to check a fragment with a utf8 character
2024-01-05 15:46:09 +01:00
Matthias Endler
d3d0cd513d
Better TOML parsing error message (#1332)
The error handling for config loading was pretty poor.
That's because we didn't use the correct syntax to show the entire context with `anyhow`.
See ["Display representations"](https://docs.rs/anyhow/latest/anyhow/struct.Error.html#display-representations).
2024-01-04 22:17:14 +01:00
Matthias Endler
ef4e19268a
Fix false-positive example domains (#1316) 2023-12-04 01:55:14 +01:00
Thomas Zahner
46f0ae908e
Address warnings of the new clippy lints (#1310) 2023-12-01 14:21:49 +01:00
Hugo McNally
f59aa61ee3
Check fragments in HTML files (#1198)
* Added html5gum based fragment extractor
* Markdown fragment extractor now extracts fragments from inline html
* Added fragment checks for html file
* Added inline html and html document to fragment checks test
* Improved some comments
* Improved documentation of markdown's fragment extractor.
2023-08-22 16:44:45 +02:00
Matthias Endler
006ee6d3be
Make suggestion test more robust (#1229) 2023-08-17 16:54:59 +02:00
Matthias Endler
1bf2944c1e
Update dependencies; fix flaky tests (#1219) 2023-08-15 16:41:58 +02:00
Hugo McNally
8e6369377c
Introduce fragment checking for links to markdown files. (#1126)
- Implemented enhancements to include fragments in file links
- Checked links to markdown files with fragments, generating unique kebab case and heading attributes.
- Made code more idiomatic and added an integration test.
- Updated documentation.
- Fixed issues with heading attributes fragments and ensured proper handling of file errors.
2023-07-31 16:04:00 +02:00
Matthias Endler
04887ee293
Make checking email addresses optional (#1171)
E-Mail checks cause too many false positives,
so we put them behind a flag.

* `--exclude-mail` is deprecated (to be removed in 1.0)
* `--include-mail` is the new flag

This PR also removes the obsolete tests for `--exclude-file`, which was superseded by `.lycheeignore`.

Fixes #1089
2023-07-19 19:58:38 +02:00
Techassi
f53619a455
feat: Add support for --dump-inputs (#1159)
* Add support for --dump-inputs
* Add integration tests
* Fix usage guide in README
2023-07-16 18:08:14 +02:00
Matthias
961575cdc7 fix typos 2023-07-13 21:48:46 +02:00
Matthias Endler
14e748793e
Cookie Support (#1146)
This is a very conservative and limited implementation of cookie support.

The goal is to ship an MVP, which covers 80% of the use-cases.
When you run lychee with --cookie-jar cookies.json, all cookies will be stored in cookies.json, one cookie per line.
This makes cookies easy to edit by hand if needed, although this is an advanced use-case and the API for the format is not guaranteed to be stable.

Fixes: #645, #715
Partially fixes: #1108
2023-07-13 17:32:41 +02:00
Matthias Endler
40ba18794d
Don't check Twitter URLs (#1147)
Twitter completely locked down and requires
a login to read tweets. (Temporarily) disable all
Twitter URLs to avoid false-positives.

For context:
https://github.com/zedeus/nitter/issues/919
https://news.ycombinator.com/item?id=36540957
https://techcrunch.com/2023/06/30/twitter-now-requires-an-account-to-view-tweets/

Fixes https://github.com/lycheeverse/lychee/issues/1108
2023-07-13 17:31:59 +02:00
Matthias Endler
97573123ef
Extend remap feature (#1133)
* wip

* Extend support for remapping

This adds support for partial remaps and
capture groups to the remap feature.

Fixes #1129
2023-07-05 15:05:19 +02:00
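To illustrate what partial remaps and capture groups (97573123ef) make possible, here is a small Python sketch using the standard `re` module. The `remap` helper and the rule format are assumptions for the example. Python writes group references as `\1`, and lychee's own replacement syntax may differ.

```python
import re
from typing import Iterable, Tuple


def remap(url: str, rules: Iterable[Tuple[str, str]]) -> str:
    """Apply the first rule whose pattern matches somewhere in the URL."""
    for pattern, replacement in rules:
        # A pattern may match only part of the URL; the rest is left untouched.
        rewritten, count = re.subn(pattern, replacement, url, count=1)
        if count:
            return rewritten
    return url
```

For example, the rule `(r"https://example\.com/docs/(.*)", r"http://localhost:8000/\1")` carries the captured path over to a local server, while a rule such as `(r"example\.com", "example.org")` rewrites only the host and keeps everything else.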
Techassi
67af7ef6d3
feat: add support for basic auth per URI (#1110)
* Add support for basic auth per domain
* Move URI matching to link collection phase
* Allow AsRef for BasicAuthExtractor::new to avoid clone
* Add tests

---------

Co-authored-by: Matthias Endler <matthias@endler.dev>
2023-06-26 12:06:24 +02:00
Matthias Endler
5ce77e1202
Don't cache unknown status codes (#1090)
Unknown status codes should be skipped and not cached by default. The reason is that we don't know if they are valid or not and even if they are invalid, we don't know if they will be valid in the future.
2023-06-02 02:46:20 +02:00
Thomas Zahner
130fa21a6a
Concurrent archives (#1027) 2023-05-11 20:20:27 +02:00