lychee

mirror of https://github.com/Hopiu/lychee.git synced 2026-05-08 13:54:44 +00:00

Author	SHA1	Message	Date
Thomas Zahner	08dabb06b2	Add regression test	2025-07-26 17:33:02 +02:00
Thomas Zahner	e743ea3f5f	Improve test	2025-07-18 16:53:08 +02:00
Thomas Zahner	678acd9760	Test regex functionality in --exclude-path flag	2025-07-18 16:53:08 +02:00
Thomas Zahner	5036ce8388	Update flag description & clean up	2025-07-18 16:53:08 +02:00
Keming	696a7cafc8	fix: do not check the fragment when http response err but accepted (#1763 ) Signed-off-by: Keming <kemingy94@gmail.com>	2025-07-10 06:32:15 +02:00
MichaIng	92a9bca23f	feat: skip fragment checking for unsupported MIME types (#1744 ) * feat: skip fragment checking for unsupported MIME types The remote URL/website checker currently passes all URLs with fragments to the fragment checker as HTML document, even if it is a different or unsupported MIME type. This can cause false fragment checking for Markdown documents, failures for other MIME types, especially binaries, and unnecessary traffic for large downloads, which are always finished completely, if the fragment checker is invoked. This commit checks the Content-Type header of the response: - Only if it is `text/html`, it is passed to the fragment checker as HTML type. - Only if it is `text/markdown`, or `text/plain` and URL path ends on `.md`, it is passed to the fragment checker as Markdown type. - In all other cases, the fragment checker is skipped and the HTTP status is returned. To invoke the fragment checker with a variable document type, a new `FileType` argument is added to the `check_html_fragment()` function. The fragment checker test and fixture are adjusted to match the expected result: checking a binary file via remote URL with fragment is now expected to succeed, since its Content-Type header does not invoke the fragment checker anymore. Signed-off-by: MichaIng <micha@dietpi.com> * Update fixtures/fragments/file1.md Co-authored-by: MichaIng <micha@dietpi.com> --------- Signed-off-by: MichaIng <micha@dietpi.com> Co-authored-by: Matthias Endler <matthias@endler.dev>	2025-07-06 10:46:06 +02:00
Keming	02f6f5cb49	feat: add 'user-content-' prefix to support github markdown fragment (#1750 )	2025-07-04 22:58:47 +02:00
ocavue	81f2605118	fix: treat a fragment in an empty directory as an error (#1756 ) * fix: treat a fragment in an empty directory as an error * test: add more fragment tests	2025-07-04 10:25:57 +02:00
ocavue	6bcb37c2dc	fix: resolve index file inside a directory (#1752 )	2025-07-03 16:55:57 +02:00
MichaIng	b970256248	fix: skip fragment check if website URL doesn't contain fragment (#1733 ) * fix: skip fragment check if website URL doesn't contain fragment Signed-off-by: MichaIng <micha@dietpi.com> * test: add tests for fragment checks with binary data Signed-off-by: MichaIng <micha@dietpi.com> * fix: skip fragment checking as well if fragment is empty `is_some()` is true as well if the fragment is given but empty, i.e. `#`. While it is an edge case, skip the fragment checker as well in case of an empty fragment. Signed-off-by: MichaIng <micha@dietpi.com> * test: switch to lycheeverse/master remote URLs Signed-off-by: MichaIng <micha@dietpi.com> * fix: apply rustfmt annotation Signed-off-by: MichaIng <micha@dietpi.com> --------- Signed-off-by: MichaIng <micha@dietpi.com>	2025-06-20 17:47:35 +02:00
Keming	b128b86a48	feat: raise error when the default config file is invalid (#1715 ) Signed-off-by: Keming <kemingy94@gmail.com>	2025-05-25 13:10:58 +02:00
Keming	208fa80aa6	fix: only check the fragment when it's a file (#1713 ) * fix: only check the fragment when it's a file * add dir fragment test * Clean up unused fragment_check in Client --------- Signed-off-by: Keming <kemingy94@gmail.com> Co-authored-by: Matthias <matthias@endler.dev>	2025-05-23 21:50:26 +02:00
Matthias Endler	35610764a1	Add support for custom headers in input processing (#1561 )	2025-05-23 13:37:32 +02:00
Keming	1ed357fe73	feat: detect website fragments (#1675 ) Signed-off-by: Keming <kemingy94@gmail.com>	2025-05-14 01:52:08 +02:00
Matthias Endler	d33b7554a1	test: add tests for URL extraction ending with a period (#1641 )	2025-02-24 08:48:58 +01:00
Ben	d6bbf85145	renamed `base` to `base_url` (fixes #1607 ) (#1629 ) * renamed `base` to `base_url` (fixes #1607) * fixed readme * added warning for deprecated `--base` * Update lychee.example.toml * Update fixtures/configs/smoketest.toml	2025-02-16 01:41:32 +01:00
MichaIng	d3d7f6a56b	fix: do not fail on empty # and #top fragments (#1609 ) The empty "#" and "#top" fragments are always valid without related HTML element. Browser will scroll to the top of the page. Hence lychee must not fail on those. Credits go to @thiru-appitap for initial attempt and helping to find missing parts of the implementation. Solves: https://github.com/lycheeverse/lychee/issues/1599 Signed-off-by: MichaIng <micha@dietpi.com>	2025-02-06 15:09:59 +01:00
Trask Stalnaker	6d0e94c799	Introduce --root-dir (#1576 ) * windows * Introduce --root-path * lint * lint * Simplification * Add unit tests * Add integration test * Sync docs * Add missing comment to make CI happy * Revert one of the Windows-specific changes because causing a test failure * Support both options at the same time * Revert a comment change that is no longer applicable * Remove unused code * Fix and simplification * Integration test both at the same time * Unit tests both at the same time * Remove now redundant comment * Revert windows-specific change, seems not needed after recent changes * Use Collector::default() * extract method and unit tests * clippy * clippy: &Option<A> -> Option<&A> * Remove outdated comment * Rename --root-path to --root-dir * Restrict --root-dir to absolute paths for now * Move root dir check	2024-12-13 14:36:33 +01:00
Matthias Endler	71564344de	Fix: Bring back error output for links (#1553 ) With the last lychee release, we simplified the status output for links. While this reduced the visual noise, it also accidentally caused the source of errors to not be printed anymore. This change brings back the additional error information as part of the final report output. Furthermore, it shows the error information in the progress output if verbose mode is activated. Fixes #1487	2024-11-07 00:22:50 +01:00
autoantwort	98015907f2	Ignore casing when processing markdown fragments + check for percent encoded anchors (#1535 ) We must also check the fragment before it is percent-decoded as required by the HTML standard. Fixes https://github.com/lycheeverse/lychee/issues/1467	2024-10-28 09:21:13 +01:00
Matthias Endler	812941c2aa	Fix format option in configuration file (#1547 )	2024-10-27 02:17:00 +02:00
Matthias Endler	e43086c2e9	Fix skipping of email addresses in stylesheets (#1546 )	2024-10-27 01:32:11 +02:00
Matthias Endler	3094bbca33	Add support for relative links (#1489 ) This commit introduces several improvements to the file checking process and URI handling: - Extract file checking logic into separate `Checker` structs (`FileChecker`, `WebsiteChecker`, `MailChecker`) - Improve handling of relative and absolute file paths - Enhance URI parsing and creation from file paths - Refactor `create_request` function for better clarity and error handling These changes provide better support for resolving relative links, handling different base URLs, and working with file paths. Fixes https://github.com/lycheeverse/lychee/issues/1296 and https://github.com/lycheeverse/lychee/issues/1480	2024-10-26 04:07:37 +02:00
Thomas Zahner	462033a294	Test ignored files	2024-09-22 19:09:35 +02:00
Thomas Zahner	0e9b6532d2	Test hidden files	2024-09-22 19:09:35 +02:00
Johannes Schindelin	8c6eee9b5f	Add a way to handle "pretty URLs", i.e. URIs without `.html` extension (#1422 ) In many circumstances (GitHub Pages, Apache configured with MultiViews, etc), web servers process URIs by appending the `.html` file extension when no file is found at the path specified by the URI but a `.html` file corresponding to that path _is_ found. To allow Lychee to use the fast, offline method of checking such files locally via the `file://` scheme, let's handle this scenario gracefully by adding the `--fallback-extensions=html` option. Note: This new option can take a list of file extensions to use; The first one for which a corresponding file is found is then used. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-06-11 16:11:24 +02:00
John Bampton	0956ec6c38	Fix spelling and remove unneeded trailing whitespace (#1417 )	2024-04-26 08:22:44 +02:00
Hugo McNally	9ff4a838ce	Fixed fragment generation for headings with inline code (#1370 ) * Added code headings to fragment cli test * Fixed fragment generation for headings with inline code	2024-02-05 01:07:56 +01:00
Norbert Kamiński	2a95944ef5	status.rs: Make json output more verbose (#1367 ) * status.rs: Make json output more verbose Currently if the status response has no status code, json output contains only a text field which gives no real information about the cause of the problem. The patch adds field with more detailed information when the status response contains some details. Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com> * cli.rs: Test parsing of error details in JSON format Some network error such as SSL has no status code but it can be identified by error status details. This patch adds a test case to verify if the error details are parsed properly in the json format. Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com> --------- Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com>	2024-01-30 23:58:18 +01:00
Matthias	f933656161	Add integration test for `accept` (int and string)	2024-01-10 00:10:22 +01:00
Matthias Endler	63ba63f7c9	Exclude example TLDs from RFC 2606 (#1335 ) Fixes https://github.com/lycheeverse/lychee/issues/1283	2024-01-05 18:48:15 +01:00
Hugo McNally	c9b707ea74	Decode percent escapes in fragments (#1275 ) * Added test to check a fragment with a utf8 character	2024-01-05 15:46:09 +01:00
Matthias Endler	ef4e19268a	Fix false-positive example domains (#1316 )	2023-12-04 01:55:14 +01:00
Techassi	1b1fd0c707	feat: Add support for ranges in the `--accept` option / config field (#1167 ) Adds support for accept ranges discussed in #1157. This allows the user to specify custom HTTP status codes accepted during checking and thus will report as valid (not broken). The accept option only supports specifying status codes as a comma-separated list. With this PR, the option will accept a list of status code ranges formatted like this: ```toml accept = ["100..=103", "200..=299", "403"] ``` These combinations will be supported: `..<end>`, ` ..=<end>`, `<start>..<end>` and `<start>..=<end>`. The behavior is copied from the Rust Range like concepts: ``` ..<end>, includes 0 to <end> (exclusive) ..=<end>, includes 0 to <end> (inclusive) <start>..<end>, includes <start> to <end> (exclusive) <start>..=<end>, includes <start> to <end> (inclusive) ``` - Foundation and enhancements for accept ranges, including support for comma-separated strings and integration into the CLI. - Implementations and updates for AcceptSelector, including Default, Display, and serde defaults. - Address and fix various errors: clippy, cargo fmt, and tests. - Add more tests, address edge cases, and enhance error messaging, especially for TOML config parsing. - Update dependencies.	2023-09-17 21:39:01 +02:00
Matthias Endler	0711112841	Mention supported schemes (#1255 ) Fixes https://github.com/lycheeverse/lycheeverse.github.io/issues/7	2023-09-15 01:27:44 +02:00
Hugo McNally	f59aa61ee3	Check fragments in HTML files (#1198 ) * Added html5gum based fragment extractor * Markdown fragment extractor now extracts fragments from inline html * Added fragment checks for html file * Added inline html and html document to fragment checks test * Improved some comments * Improved documentation of markdown's fragment extractor.	2023-08-22 16:44:45 +02:00
Hugo McNally	8e6369377c	Introduce fragment checking for links to markdown files. (#1126 ) - Implemented enhancements to include fragments in file links - Checked links to markdown files with fragments, generating unique kebab case and heading attributes. - Made code more idiomatic and added an integration test. - Updated documentation. - Fixed issues with heading attributes fragments and ensured proper handling of file errors.	2023-07-31 16:04:00 +02:00
Matthias Endler	04887ee293	Make checking email addresses optional (#1171 ) E-Mail checks cause too many false positives, so we put them behind a flag. * `--exclude-mail` is deprecated (to be removed in 1.0) * `--include-mail` is the new flag This PR also removes the obsolete tests for `--exclude-file`, which was superseded by `.lycheeignore`. Fixes #1089	2023-07-19 19:58:38 +02:00
Techassi	f53619a455	feat: Add support for --dump-inputs (#1159 ) * Add support for --dump-inputs * Add integration tests * Fix usage guide in README	2023-07-16 18:08:14 +02:00
Matthias Endler	97573123ef	Extend remap feature (#1133 ) * wip * Extend support for remapping This adds supports for partial remaps and capture groups to the remap feature. Fixes #1129	2023-07-05 15:05:19 +02:00
Techassi	67af7ef6d3	feat: add support for basic auth per URI (#1110 ) * Add support for basic auth per domain * Move URI matching to link collection phase * Allow AsRef for BasicAuthExtractor::new to avoid clone * Add tests --------- Co-authored-by: Matthias Endler <matthias@endler.dev>	2023-06-26 12:06:24 +02:00
Thomas Zahner	130fa21a6a	Concurrent archives (#1027 )	2023-05-11 20:20:27 +02:00
Matthias Endler	55797071b0	Fix nested URL extraction in verbatim elements (#988 ) Skipping URLs in verbatim elements didn't take nested elements into consideration, which were not verbatim. For instance, the following HTML snippet would yield `https://example.com` in non-verbatim mode, even if it is nested inside a verbatim `<pre>` element: ```html <pre><a href="https://example.com">link</a></pre> ``` This commit fixes the behavior for both `html5gum` and `html5ever`. Note that nested verbatim elements of the same kind still are not handled correctly. For instance, the following HTML snippet would still yield `https://example.com`: ```html <pre> <pre></pre> <a href="https://example.com">link</a> </pre> ``` The reason is that we currently only keep track of a single verbatim element and not a stack of elements, which we would need to unwind and resolve the situation. Fixes https://github.com/lycheeverse/lychee/issues/986.	2023-03-11 15:18:25 +01:00
Matthias	c9edb7f809	Split up quirks and skip twitter check It's flaky on Github	2023-03-03 12:13:09 +01:00
Matthias	08466ad59b	Ignore config smoketest output report file	2023-03-03 12:13:09 +01:00
Matthias	86f13609e6	Put lycheecache tests into separate subfolders to avoid race	2023-03-03 12:13:09 +01:00
Matthias	388bd20673	Fix tests after `address` is no longer a verbatim element	2023-03-03 12:13:09 +01:00
Matthias Endler	7874195bbb	Customize verbosity (#956 )	2023-02-24 23:53:09 +01:00
Matthias Endler	5654b7c317	Harden URL detection and extend verbatim elements (#899 ) Previously remote URLs were incorrectly detected because the string representation of a path is different than the path itself, causing the `http` prefix match to be insufficient. This resulted in unexpected side-effects, such as the incorrect detection of verbatim mode for remote URLs. The check now got improved and unit tests were added to avoid future breakage. On top of that, missing verbatim elements were added	2023-01-04 00:38:19 +01:00
Matthias	982d978e47	Add different verbosity levels (#824 ) More granular verbosity levels have been asked for repeatedly. To enable that we're moving to [env_logger] and [clap-verbosity-flag] to provide more flexible verbosity settings. Also tackles #661, #709 Lays the groundwork for tackling #268 https://github.com/rust-cli/env_logger https://github.com/clap-rs/clap-verbosity-flag	2022-11-28 23:25:33 +01:00
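
Commit 1b1fd0c707 describes the accepted range forms for `--accept` (`..<end>`, `..=<end>`, `<start>..<end>`, `<start>..=<end>`, plus single codes such as `403`). The following is a minimal sketch of those semantics, not lychee's actual `AcceptSelector` code. The names `parse_accept_range` and `is_accepted` are hypothetical, and rejecting empty ranges is an assumption made here.

```rust
use std::ops::RangeInclusive;

/// Parse one accept entry ("403", "..300", "..=103", "200..300", "200..=299")
/// into an inclusive range of status codes.
/// Hypothetical helper for illustration only.
fn parse_accept_range(s: &str) -> Option<RangeInclusive<u16>> {
    let s = s.trim();
    match s.split_once("..") {
        // No ".." present: a single status code such as "403".
        None => {
            let code: u16 = s.parse().ok()?;
            Some(code..=code)
        }
        Some((start, rest)) => {
            // An omitted start means the range begins at 0.
            let start: u16 = if start.is_empty() { 0 } else { start.parse().ok()? };
            let end: u16 = match rest.strip_prefix('=') {
                // "..=<end>" includes the end value.
                Some(end) => end.parse().ok()?,
                // "..<end>" excludes the end value, so step back by one.
                None => rest.parse::<u16>().ok()?.checked_sub(1)?,
            };
            // Assumption: an empty range (start > end) is treated as invalid.
            if start <= end { Some(start..=end) } else { None }
        }
    }
}

/// Returns true if `status` falls into any of the accepted ranges.
fn is_accepted(ranges: &[RangeInclusive<u16>], status: u16) -> bool {
    ranges.iter().any(|r| r.contains(&status))
}
```

With the example configuration from the commit message, `["100..=103", "200..=299", "403"]`, a 204 response would count as accepted under this sketch and a 404 would not.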
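
Commit 92a9bca23f explains how the `Content-Type` header decides whether the fragment checker runs. The sketch below shows one reading of that rule: `text/html` is checked as HTML, `text/markdown` as Markdown, `text/plain` as Markdown only when the URL path ends in `.md`, and everything else is skipped. The enum `FragmentDocType` and the function `fragment_doc_type` are hypothetical names, not lychee's API. Stripping header parameters such as `; charset=utf-8` is an assumption that the commit message does not mention.

```rust
/// Document kinds a fragment checker could parse.
/// Hypothetical stand-in for the `FileType` argument mentioned in the commit.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum FragmentDocType {
    Html,
    Markdown,
}

/// Decide how (or whether) to run fragment checking for a response.
/// Returns `None` when the fragment check should be skipped.
fn fragment_doc_type(content_type: Option<&str>, url_path: &str) -> Option<FragmentDocType> {
    // Assumption: drop parameters like "; charset=utf-8" and ignore case.
    let mime = content_type?.split(';').next()?.trim().to_ascii_lowercase();
    match mime.as_str() {
        "text/html" => Some(FragmentDocType::Html),
        "text/markdown" => Some(FragmentDocType::Markdown),
        // Plain text is only treated as Markdown for ".md" paths.
        "text/plain" if url_path.ends_with(".md") => Some(FragmentDocType::Markdown),
        // Binaries and other types: skip the check, keep the HTTP status.
        _ => None,
    }
}
```

Returning `Option` keeps the "skip the checker" case explicit at the call site, which matches the commit's description that the plain HTTP status is returned in all other cases.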
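
Commit 8c6eee9b5f introduces `--fallback-extensions` for "pretty URLs": when no file exists at the given path, each listed extension is tried in order and the first match wins. A rough sketch of that lookup follows. The function `resolve_with_fallback` is a hypothetical name, and the `exists` predicate is injected here only so the example runs without touching the file system. Note that `Path::with_extension` replaces an existing extension rather than appending one, so this sketch only matches the described behavior for paths without an extension.

```rust
use std::path::{Path, PathBuf};

/// Resolve `path`, trying each fallback extension in order if it is missing.
/// Hypothetical helper; `exists` stands in for a real file-system check.
fn resolve_with_fallback(
    path: &Path,
    fallback_extensions: &[&str],
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // The path as written takes priority.
    if exists(path) {
        return Some(path.to_path_buf());
    }
    // Otherwise return the first candidate that exists, in list order.
    fallback_extensions
        .iter()
        .map(|ext| path.with_extension(ext))
        .find(|candidate| exists(candidate))
}
```

In real use the predicate could be something like `|p| p.is_file()`.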

99 commits