lychee/lychee-bin
Matthias Endler 55797071b0
Fix nested URL extraction in verbatim elements (#988)
Skipping URLs in verbatim elements didn't take into account
nested elements that are not themselves verbatim.

For instance, the following HTML snippet previously yielded
`https://example.com` in non-verbatim mode, even though
the link is nested inside a verbatim `<pre>` element:

```html
<pre><a href="https://example.com">link</a></pre>
```

This commit fixes the behavior for both `html5gum` and
`html5ever`.

Note that nested verbatim elements of the same kind
are still not handled correctly.

For instance, the following HTML snippet still yields
`https://example.com`:

```html
<pre>
  <pre></pre>
  <a href="https://example.com">link</a>
</pre>
```

The reason is that we currently only keep track of a single
verbatim element rather than a stack of elements, which we
would need to unwind in order to resolve this situation.
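
A minimal sketch of what such a stack could look like. This is not
the code in this commit: the `Event` type, the `is_verbatim` helper
and `extract_links` are hypothetical stand-ins for the tokens that
`html5gum` and `html5ever` emit, and the verbatim tag list is only
an illustrative subset.

```rust
/// Hypothetical, simplified parser events (assumption: real tokenizers
/// emit richer tokens with full attribute lists).
#[derive(Debug, Clone, Copy)]
enum Event<'a> {
    /// Start tag, with the value of its `href` attribute if present.
    Start { name: &'a str, href: Option<&'a str> },
    /// End tag.
    End { name: &'a str },
}

/// Illustrative subset of verbatim elements (assumption).
fn is_verbatim(name: &str) -> bool {
    matches!(name, "pre" | "code")
}

/// Collect links, skipping those inside verbatim elements unless
/// `include_verbatim` is set.
fn extract_links<'a>(events: &[Event<'a>], include_verbatim: bool) -> Vec<&'a str> {
    // Every open verbatim element is pushed here, so a nested
    // `<pre></pre>` pops only its own entry and the outer `<pre>`
    // stays on the stack.
    let mut verbatim_stack: Vec<&'a str> = Vec::new();
    let mut links = Vec::new();

    for event in events {
        match *event {
            Event::Start { name, href } => {
                if is_verbatim(name) {
                    verbatim_stack.push(name);
                }
                if let Some(url) = href {
                    if include_verbatim || verbatim_stack.is_empty() {
                        links.push(url);
                    }
                }
            }
            Event::End { name } => {
                // Unwind only when the end tag matches the innermost
                // open verbatim element.
                if verbatim_stack.last() == Some(&name) {
                    verbatim_stack.pop();
                }
            }
        }
    }
    links
}
```

With this approach, the events for the nested `<pre>` snippet above
would produce no links in non-verbatim mode, because the outer
`<pre>` is still on the stack when the `<a>` element is seen.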

Fixes https://github.com/lycheeverse/lychee/issues/986.
2023-03-11 15:18:25 +01:00
src Fix --max-redirects (#987) 2023-03-10 15:15:37 +01:00
tests Fix nested URL extraction in verbatim elements (#988) 2023-03-11 15:18:25 +01:00
Cargo.toml Bump serde from 1.0.153 to 1.0.154 2023-03-09 12:58:27 +00:00
LICENSE-APACHE Major refactor of codebase (#208) 2021-04-15 01:24:11 +02:00
LICENSE-MIT Update license files (#497) 2022-02-08 10:59:54 +01:00