lychee/lychee-lib/src
Matthias Endler 55797071b0
Fix nested URL extraction in verbatim elements (#988)
Skipping URLs in verbatim elements did not account for
nested elements that are not themselves verbatim.

For instance, the following HTML snippet would yield
`https://example.com` in non-verbatim mode, even if
it is nested inside a verbatim `<pre>` element:

```html
<pre><a href="https://example.com">link</a></pre>
```

This commit fixes the behavior for both `html5gum` and
`html5ever`.

Note that nested verbatim elements of the same kind
are still not handled correctly.

For instance, the following HTML snippet would still yield
`https://example.com`:

```html
<pre>
  <pre></pre>
  <a href="https://example.com">link</a>
</pre>
```

The reason is that we currently only keep track of a single
verbatim element and not a stack of elements, which we
would need in order to unwind correctly when the inner
element closes.
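
As a rough illustration of the difference, the sketch below contrasts
single-element tracking with a stack. It is not the code from this
commit: the `Token` type, `is_verbatim`, and both `extract_*` functions
are hypothetical stand-ins for the events the `html5gum` and
`html5ever` extractors actually receive, and the verbatim list is an
illustrative subset.

```rust
/// Hypothetical, simplified token stream (assumption: the real extractors
/// receive richer events from `html5gum` / `html5ever`).
#[derive(Debug, Clone, Copy)]
pub enum Token<'a> {
    Start(&'a str),
    End(&'a str),
    /// A URL found in an attribute such as `href`.
    Link(&'a str),
}

/// Illustrative subset of verbatim elements (assumption).
fn is_verbatim(name: &str) -> bool {
    matches!(name, "pre" | "code")
}

/// Single-slot tracking: remembers only the outermost open verbatim element.
/// The first matching end tag clears the slot, even if it closes a nested
/// element of the same kind.
pub fn extract_single_slot<'a>(tokens: &[Token<'a>]) -> Vec<&'a str> {
    let mut current: Option<&str> = None;
    let mut urls = Vec::new();
    for &token in tokens {
        match token {
            Token::Start(name) if current.is_none() && is_verbatim(name) => {
                current = Some(name)
            }
            Token::End(name) if current == Some(name) => current = None,
            Token::Link(url) if current.is_none() => urls.push(url),
            _ => {}
        }
    }
    urls
}

/// Stack-based tracking: every verbatim start tag is pushed and only the
/// matching end tag pops it, so links are skipped until the stack is empty.
pub fn extract_with_stack<'a>(tokens: &[Token<'a>]) -> Vec<&'a str> {
    let mut stack: Vec<&str> = Vec::new();
    let mut urls = Vec::new();
    for &token in tokens {
        match token {
            Token::Start(name) if is_verbatim(name) => stack.push(name),
            Token::End(name) if stack.last() == Some(&name) => {
                stack.pop();
            }
            Token::Link(url) if stack.is_empty() => urls.push(url),
            _ => {}
        }
    }
    urls
}

/// Token form of the nested `<pre>` snippet above.
pub fn nested_pre_tokens() -> Vec<Token<'static>> {
    vec![
        Token::Start("pre"),
        Token::Start("pre"),
        Token::End("pre"),
        Token::Start("a"),
        Token::Link("https://example.com"),
        Token::End("a"),
        Token::End("pre"),
    ]
}
```

With the tokens from `nested_pre_tokens()`, the single-slot variant
returns the link because the inner `</pre>` clears the slot, while the
stack variant returns nothing because the outer `<pre>` is still open.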

Fixes https://github.com/lycheeverse/lychee/issues/986.
2023-03-11 15:18:25 +01:00
..
extract Fix nested URL extraction in verbatim elements (#988) 2023-03-11 15:18:25 +01:00
filter Don't check example mail addresses by default (#815) 2022-11-08 23:46:32 +01:00
helpers Introduce new let...else syntax (#936) 2023-01-30 14:25:30 +01:00
quirks Properly handle youtu.be shortlinks (#908) 2023-01-06 18:25:09 +01:00
types Better retry handling (#981) 2023-03-10 22:36:45 +01:00
client.rs Better retry handling (#981) 2023-03-10 22:36:45 +01:00
collector.rs Fix Rust 1.66 clippy lints (#879) 2022-12-19 14:28:10 +01:00
lib.rs Better retry handling (#981) 2023-03-10 22:36:45 +01:00
remap.rs Fix typos (#944) 2023-02-09 15:32:16 +01:00
retry.rs Better retry handling (#981) 2023-03-10 22:36:45 +01:00
test_utils.rs Harden URL detection and extend verbatim elements (#899) 2023-01-04 00:38:19 +01:00