Commit graph

145 commits

Author SHA1 Message Date
Thomas Zahner
ea415c8db4 Fix test 2025-07-26 17:33:02 +02:00
Thomas Zahner
0d25e524bf Remove JSON output files even on test failures 2025-07-26 17:33:02 +02:00
Thomas Zahner
08dabb06b2 Add regression test 2025-07-26 17:33:02 +02:00
Thomas Zahner
e743ea3f5f Improve test 2025-07-18 16:53:08 +02:00
Thomas Zahner
475d7f3d3a Apply clippy suggestions 2025-07-18 16:53:08 +02:00
Thomas Zahner
678acd9760 Test regex functionality in --exclude-path flag 2025-07-18 16:53:08 +02:00
Keming
696a7cafc8
fix: do not check the fragment when the HTTP response is an error but accepted (#1763)
Signed-off-by: Keming <kemingy94@gmail.com>
2025-07-10 06:32:15 +02:00
MichaIng
92a9bca23f
feat: skip fragment checking for unsupported MIME types (#1744)
* feat: skip fragment checking for unsupported MIME types

The remote URL/website checker currently passes every URL with a fragment to the fragment checker as an HTML document, even if the response has a different or unsupported MIME type. This can produce wrong fragment results for Markdown documents, failures for other MIME types (especially binaries), and unnecessary traffic for large downloads, which are always downloaded completely once the fragment checker is invoked.

This commit checks the Content-Type header of the response:
- Only if it is `text/html`, it is passed to the fragment checker as HTML type.
- Only if it is `text/markdown`, or `text/plain` and the URL path ends in `.md`, it is passed to the fragment checker as Markdown type.
- In all other cases, the fragment checker is skipped and the HTTP status is returned.
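
The three rules above can be sketched as a small decision function. This is only an illustration and not the code from the commit: `DocType` and `fragment_doc_type` are hypothetical names (the commit itself refers to a `FileType` argument), stripping parameters such as `; charset=utf-8` is an assumption, and the sketch reads the second rule as "`text/markdown`, or (`text/plain` with a path ending in `.md`)".

```rust
/// Hypothetical document types for this sketch (not lychee's `FileType`).
#[derive(Debug, PartialEq, Eq)]
pub enum DocType {
    Html,
    Markdown,
}

/// Decide how a response should be handed to the fragment checker, based on
/// its Content-Type header value and the URL path.
/// `None` means: skip the fragment checker and report the HTTP status only.
pub fn fragment_doc_type(content_type: Option<&str>, url_path: &str) -> Option<DocType> {
    // Assumption: drop parameters such as "; charset=utf-8" and compare
    // the media type case-insensitively.
    let mime = content_type?
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    match mime.as_str() {
        "text/html" => Some(DocType::Html),
        "text/markdown" => Some(DocType::Markdown),
        // Plain text only counts as Markdown when the path ends in `.md`.
        "text/plain" if url_path.ends_with(".md") => Some(DocType::Markdown),
        // Everything else (binaries, images, missing header, ...) is skipped.
        _ => None,
    }
}
```

With this shape, a binary download with a fragment never reaches the fragment checker, which matches the adjusted test expectation described below.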

To invoke the fragment checker with a variable document type, a new `FileType` argument is added to the `check_html_fragment()` function.

The fragment checker test and fixture are adjusted to match the expected result: checking a binary file via remote URL with fragment is now expected to succeed, since its Content-Type header does not invoke the fragment checker anymore.

Signed-off-by: MichaIng <micha@dietpi.com>

* Update fixtures/fragments/file1.md

Co-authored-by: MichaIng <micha@dietpi.com>

---------

Signed-off-by: MichaIng <micha@dietpi.com>
Co-authored-by: Matthias Endler <matthias@endler.dev>
2025-07-06 10:46:06 +02:00
Keming
02f6f5cb49
feat: add 'user-content-' prefix to support github markdown fragment (#1750) 2025-07-04 22:58:47 +02:00
ocavue
81f2605118
fix: treat a fragment in an empty directory as an error (#1756)
* fix: treat a fragment in an empty directory as an error
* test: add more fragment tests
2025-07-04 10:25:57 +02:00
ocavue
6bcb37c2dc
fix: resolve index file inside a directory (#1752) 2025-07-03 16:55:57 +02:00
Thomas Zahner
845f74bab0
Fix basic auth (#1748)
* Capture bug as failing test

* Add basic auth credentials for website extraction requests via RequestChain & remove headers from Input

* Create UrlExtractor and add back headers

* Improve UrlExtractor

* Fix bug: extend headers instead of setting them

* Clean up

* Minor adjustments

* Apply suggestions from code review

Co-authored-by: Matthias Endler <matthias@endler.dev>

* Mention in doc comment how the method might panic

* Remove use of chain for more simplicity

---------

Co-authored-by: Matthias Endler <matthias@endler.dev>
2025-07-03 13:45:30 +02:00
Thomas Zahner
8f2f746bf9
Migrate to Clippy 1.88 (#1749)
* Update flake
* Fix clippy's new suggestions
* Do not ignore tests any longer since they work by now
* Add ignore reason
2025-06-27 12:34:48 +02:00
MichaIng
140f70167c test: fix assertion in fragment checks
Signed-off-by: MichaIng <micha@dietpi.com>
2025-06-25 11:18:01 +02:00
Thomas Zahner
34ec9b3d48 Replace unreliable API URL 2025-06-25 11:10:39 +02:00
MichaIng
b970256248
fix: skip fragment check if website URL doesn't contain fragment (#1733)
* fix: skip fragment check if website URL doesn't contain fragment

Signed-off-by: MichaIng <micha@dietpi.com>

* test: add tests for fragment checks with binary data

Signed-off-by: MichaIng <micha@dietpi.com>

* fix: skip fragment checking as well if fragment is empty

`is_some()` is also true if the fragment is given but empty, i.e. `#`. Although this is an edge case, skip the fragment checker for an empty fragment as well.
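
A minimal sketch of the distinction, assuming the fragment is available as an `Option<&str>` (as URL parsers commonly expose it); `needs_fragment_check` is a hypothetical helper name, not a function from the commit:

```rust
/// Returns true only when a fragment is present *and* non-empty.
/// `Some("")` corresponds to a URL that ends in a bare `#`.
pub fn needs_fragment_check(fragment: Option<&str>) -> bool {
    // `fragment.is_some()` alone would also be true for `Some("")`.
    fragment.is_some_and(|f| !f.is_empty())
}
```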

Signed-off-by: MichaIng <micha@dietpi.com>

* test: switch to lycheeverse/master remote URLs

Signed-off-by: MichaIng <micha@dietpi.com>

* fix: apply rustfmt annotation

Signed-off-by: MichaIng <micha@dietpi.com>

---------

Signed-off-by: MichaIng <micha@dietpi.com>
2025-06-20 17:47:35 +02:00
Thomas Zahner
e59456b96e Add cli test 2025-06-11 11:19:51 +02:00
Thomas Zahner
25b835f12d Update tests 2025-06-11 11:19:51 +02:00
Thomas Zahner
54bbc080a9 Remove duplicated information from output 2025-06-11 11:19:51 +02:00
Thomas Zahner
a783ecc103 Update test 2025-06-11 11:19:51 +02:00
Thomas Zahner
2dfaef74ff Update test 2025-06-11 11:19:51 +02:00
Keming
b128b86a48
feat: raise error when the default config file is invalid (#1715)
Signed-off-by: Keming <kemingy94@gmail.com>
2025-05-25 13:10:58 +02:00
Jakob
63cdb70e6d
Upgrade to 2024 edition (#1711)
* Upgrade to 2024 edition

* Revert expr_2021 -> expr

* resolve merge conflicts

* make lint happy
2025-05-24 18:23:23 +02:00
Keming
208fa80aa6
fix: only check the fragment when it's a file (#1713)
* fix: only check the fragment when it's a file
* add dir fragment test
* Clean up unused fragment_check in Client

---------

Signed-off-by: Keming <kemingy94@gmail.com>
Co-authored-by: Matthias <matthias@endler.dev>
2025-05-23 21:50:26 +02:00
Matthias Endler
35610764a1
Add support for custom headers in input processing (#1561) 2025-05-23 13:37:32 +02:00
Keming
1ed357fe73
feat: detect website fragments (#1675)
Signed-off-by: Keming <kemingy94@gmail.com>
2025-05-14 01:52:08 +02:00
Matthias Endler
d33b7554a1
test: add tests for URL extraction ending with a period (#1641) 2025-02-24 08:48:58 +01:00
Ben
d6bbf85145
renamed base to base_url (fixes #1607) (#1629)
* renamed `base` to `base_url` (fixes #1607)
* fixed readme
* added warning for deprecated `--base`
* Update lychee.example.toml
* Update fixtures/configs/smoketest.toml
2025-02-16 01:41:32 +01:00
sud
50687175d1
Sort compact/detailed/markdown error output by file path (#1622)
* Sort compact/detailed/markdown error output by file path

* - Modify sort_stats_map to sort HashMap values
- Add unit/integration tests for sort_stats_map
- Add human-sort dependency for natural sorting

* Fix warnings reported by GitHub checks

* Fix clippy warning

- Fix clippy warning
- Make entry sorting case-insensitive in sort_stat_map

* Fix clippy warning
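
A rough sketch of case-insensitive sorting of such a map, using only the standard library. It is not the `sort_stats_map` implementation: the function name, the `HashMap<String, Vec<String>>` shape, and the plain lowercase comparison are assumptions, and the natural ("human") ordering provided by the added dependency is left out.

```rust
use std::collections::HashMap;

/// Return the map's entries ordered by file path, with each entry's values
/// ordered as well. Both orderings ignore case.
pub fn sort_by_path(map: &HashMap<String, Vec<String>>) -> Vec<(&String, Vec<&String>)> {
    let mut entries: Vec<(&String, Vec<&String>)> = map
        .iter()
        .map(|(path, values)| {
            let mut values: Vec<&String> = values.iter().collect();
            // Sort the values of each entry case-insensitively.
            values.sort_by_key(|v| v.to_lowercase());
            (path, values)
        })
        .collect();
    // Sort the entries themselves by lowercased path.
    entries.sort_by_key(|(path, _)| path.to_lowercase());
    entries
}
```

Because a `HashMap` has no stable iteration order, collecting into a sorted `Vec` is what makes the report output deterministic.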
2025-02-15 00:10:59 +01:00
MichaIng
d3d7f6a56b
fix: do not fail on empty # and #top fragments (#1609)
The empty "#" and "#top" fragments are always valid, even without a matching HTML element: the browser scrolls to the top of the page. Hence lychee must not fail on those.
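
As an illustration only, a lookup that treats these two fragments as always valid could look like the following. `fragment_is_valid` and the `HashSet` of element ids are hypothetical; the case-insensitive comparison against `top` follows the HTML standard's wording rather than anything stated in this commit.

```rust
use std::collections::HashSet;

/// Check a fragment (without the leading `#`) against the ids of a document.
pub fn fragment_is_valid(fragment: &str, ids: &HashSet<String>) -> bool {
    // "#" and "#top" point at the top of the page and need no element.
    if fragment.is_empty() || fragment.eq_ignore_ascii_case("top") {
        return true;
    }
    ids.contains(fragment)
}
```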

Credits go to @thiru-appitap for initial attempt and helping to find missing parts of the implementation.

Solves: https://github.com/lycheeverse/lychee/issues/1599

Signed-off-by: MichaIng <micha@dietpi.com>
2025-02-06 15:09:59 +01:00
Matthias Endler
971ee67bc3
Fix new clippy lints (#1625) 2025-02-06 14:51:44 +01:00
Trask Stalnaker
6d0e94c799
Introduce --root-dir (#1576)
* windows

* Introduce --root-path

* lint

* lint

* Simplification

* Add unit tests

* Add integration test

* Sync docs

* Add missing comment to make CI happy

* Revert one of the Windows-specific changes because it caused a test failure

* Support both options at the same time

* Revert a comment change that is no longer applicable

* Remove unused code

* Fix and simplification

* Integration test both at the same time

* Unit tests both at the same time

* Remove now redundant comment

* Revert windows-specific change, seems not needed after recent changes

* Use Collector::default()

* extract method and unit tests

* clippy

* clippy: &Option<A> -> Option<&A>

* Remove outdated comment

* Rename --root-path to --root-dir

* Restrict --root-dir to absolute paths for now

* Move root dir check
2024-12-13 14:36:33 +01:00
Trask Stalnaker
034fd129a9
Fix retries (#1573) 2024-11-27 22:58:32 +01:00
dependabot[bot]
2d35dccd34
Bump the dependencies group with 4 updates (#1566)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias@endler.dev>
2024-11-12 23:51:39 +01:00
Matthias Endler
9dc42176fa
Rename fail_map to error_map for improved clarity in response statistics (#1560)
Fixes #1446
2024-11-08 09:02:33 +01:00
Matthias Endler
f4a35ff580
Add quirks support for youtube-nocookie.com and youtube embed URLs (#1563) 2024-11-08 06:15:26 +01:00
Matthias Endler
e794b40d4d
Support excluded paths in --dump-inputs (#1556) 2024-11-07 16:32:32 +01:00
Matthias Endler
4beec320c8
Improve robustness of cache integration test (#1557) 2024-11-07 16:24:13 +01:00
Matthias Endler
71564344de
Fix: Bring back error output for links (#1553)
With the last lychee release, we simplified the status output for links.

While this reduced the visual noise, it also accidentally caused the source of errors to not be printed anymore. This change brings back the additional error information as part of the final report output. Furthermore, it shows the error information in the progress output if verbose mode is activated.

Fixes #1487
2024-11-07 00:22:50 +01:00
autoantwort
98015907f2
Ignore casing when processing markdown fragments + check for percent encoded anchors (#1535)
We must also check the fragment before it is percent-decoded as required by the HTML standard.
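
A simplified sketch of that lookup order: try the fragment as written first, then its percent-decoded form, both compared without regard to case. The names `percent_decode` and `anchor_exists` are hypothetical, the anchor set is assumed to be stored lowercased, and the hand-written decoder stands in for whatever decoding the real code uses.

```rust
use std::collections::HashSet;

/// Decode `%XX` escapes; malformed escapes are kept verbatim.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// `anchors` is assumed to contain lowercased anchor names.
pub fn anchor_exists(fragment: &str, anchors: &HashSet<String>) -> bool {
    // 1. The fragment exactly as written (before percent-decoding).
    if anchors.contains(&fragment.to_lowercase()) {
        return true;
    }
    // 2. The percent-decoded fragment.
    anchors.contains(&percent_decode(fragment).to_lowercase())
}
```

Checking the raw form first matters for anchors that themselves contain a literal `%`, which would otherwise be altered by decoding.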

Fixes https://github.com/lycheeverse/lychee/issues/1467
2024-10-28 09:21:13 +01:00
Matthias Endler
bc0b05bb28
Refactor cache handling test to make it more robust (#1548) 2024-10-27 02:19:35 +02:00
Matthias Endler
812941c2aa
Fix format option in configuration file (#1547) 2024-10-27 02:17:00 +02:00
Matthias Endler
e43086c2e9
Fix skipping of email addresses in stylesheets (#1546) 2024-10-27 01:32:11 +02:00
Matthias Endler
3094bbca33
Add support for relative links (#1489)
This commit introduces several improvements to the file checking process and URI handling:

- Extract file checking logic into separate `Checker` structs (`FileChecker`, `WebsiteChecker`, `MailChecker`)
- Improve handling of relative and absolute file paths
- Enhance URI parsing and creation from file paths
- Refactor `create_request` function for better clarity and error handling

These changes provide better support for resolving relative links, handling different base URLs, and working with file paths.

Fixes https://github.com/lycheeverse/lychee/issues/1296 and https://github.com/lycheeverse/lychee/issues/1480
2024-10-26 04:07:37 +02:00
Damien Mathieu
f0ebac29a2
Allow excluding cache based on status code (#1403)
This introduces an option `--cache-exclude-status`, which allows specifying a range of HTTP status codes which will be excluded from the cache.
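
The idea can be sketched as a list of inclusive status ranges that is consulted before a response is written to the cache. `CacheExclude` and `should_cache` are hypothetical names for illustration; how the option's value is parsed and stored in lychee is not shown here.

```rust
use std::ops::RangeInclusive;

/// Status code ranges whose responses should not be cached.
pub struct CacheExclude {
    ranges: Vec<RangeInclusive<u16>>,
}

impl CacheExclude {
    pub fn new(ranges: Vec<RangeInclusive<u16>>) -> Self {
        Self { ranges }
    }

    /// A response is cached only if its status is in none of the ranges.
    pub fn should_cache(&self, status: u16) -> bool {
        !self.ranges.iter().any(|range| range.contains(&status))
    }
}
```

For example, excluding `500..=599` and `429..=429` would keep transient server errors and rate-limit responses out of the cache, so they are retried on the next run.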

Closes #1400.
2024-10-14 02:41:56 +02:00
Thomas Zahner
462033a294 Test ignored files 2024-09-22 19:09:35 +02:00
Thomas Zahner
0e9b6532d2 Test hidden files 2024-09-22 19:09:35 +02:00
dependabot[bot]
9e1a99a936
Bump the dependencies group with 4 updates (#1490)
* Bump the dependencies group with 4 updates

Bumps the dependencies group with 4 updates: [reqwest](https://github.com/seanmonstar/reqwest), [serde](https://github.com/serde-rs/serde), [serde_json](https://github.com/serde-rs/json) and [typed-builder](https://github.com/idanarye/rust-typed-builder).


Updates `reqwest` from 0.12.5 to 0.12.7
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/reqwest/compare/v0.12.5...v0.12.7)

Updates `serde` from 1.0.208 to 1.0.209
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.208...v1.0.209)

Updates `serde_json` from 1.0.125 to 1.0.127
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/1.0.125...1.0.127)

Updates `typed-builder` from 0.19.1 to 0.20.0
- [Changelog](https://github.com/idanarye/rust-typed-builder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/idanarye/rust-typed-builder/commits)

---
updated-dependencies:
- dependency-name: reqwest
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: dependencies
- dependency-name: typed-builder
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>

* skip flaky test

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthias <matthias@endler.dev>
2024-08-27 11:43:39 +02:00
Matthias Endler
bf19934949
Run cargo nextest in CI and enable all tests (#1483) 2024-08-13 18:45:09 +02:00
Hugo McNally
4bb8a61545
Updated pulldown-cmark dependency and fixed maths parsing (#1473)
* Update pulldown-cmark version to 0.11.0
* Fix markdown math parsing
* Fix lints
* Disable flaky wayback test

---------

Co-authored-by: Matthias <matthias@endler.dev>
2024-08-06 15:43:34 +02:00