Previously, specifying an include on its own did not restrict checking
to the included patterns. Includes only made a difference if an exclude
was given as well. Now, the includes on their
own will work as expected.
Moved the exclude methods into the Exclude mod.
Also changed the order of exclude tests to do the fast lookup
ones before the regex ones.
Added tests to guarantee behavior in the future.
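The behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this change: the `Filter` struct, its field names, and the precedence between includes and excludes are assumptions, and substring matching stands in for regexes to keep the example free of dependencies.

```rust
use std::collections::HashSet;

/// Hypothetical filter; names are illustrative only.
struct Filter {
    /// Cheap exact-match exclusions (hash lookup).
    excluded_hosts: HashSet<String>,
    /// Stand-ins for include regexes (substring match here).
    includes: Vec<String>,
    /// Stand-ins for exclude regexes (substring match here).
    excludes: Vec<String>,
}

impl Filter {
    fn is_excluded(&self, host: &str, url: &str) -> bool {
        // Fast lookup first: no pattern scan needed if the host is listed.
        if self.excluded_hosts.contains(host) {
            return true;
        }
        // Includes work on their own: if any are given, a URL that matches
        // none of them is excluded, even without exclude patterns.
        if !self.includes.is_empty()
            && !self.includes.iter().any(|p| url.contains(p.as_str()))
        {
            return true;
        }
        // Slower pattern scan last.
        self.excludes.iter().any(|p| url.contains(p.as_str()))
    }
}
```

With only `includes` set to `example.com`, a link to another host is skipped while links on `example.com` are still checked.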
Adds a new function `lychee::check()`, which removes
a lot of boilerplate for simple cases. Adjusted the code,
tests, and documentation.
The downside is that `check` now returns a Result, so
we have to use `?` to get to the response. That's because
we have to account for the case where the given string is
not a valid URI.
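To illustrate why callers now need `?`, here is a small synchronous sketch. It is not the real `lychee::check()` signature: the `Status` and `CheckError` types, the helper `is_link_ok`, and the placeholder URI validation are all assumptions, and no network request is made.

```rust
#[derive(Debug, PartialEq)]
enum Status {
    Ok,
}

#[derive(Debug, PartialEq)]
enum CheckError {
    InvalidUri(String),
}

/// Hypothetical stand-in for a `check`-style helper.
fn check(input: &str) -> Result<Status, CheckError> {
    // Placeholder validation only: a real implementation would parse the
    // URI properly and then perform the request.
    match input.split_once("://") {
        Some((scheme, rest)) if !scheme.is_empty() && !rest.is_empty() => Ok(Status::Ok),
        _ => Err(CheckError::InvalidUri(input.to_string())),
    }
}

/// The caller propagates the invalid-URI case with `?` before it can
/// look at the response.
fn is_link_ok(input: &str) -> Result<bool, CheckError> {
    let status = check(input)?;
    Ok(status == Status::Ok)
}
```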
Even on stdout we print JSON now if `--format json` is set.
The reason is that not outputting any JSON when the format
is requested can be unintuitive. It is also great for debugging
purposes before sending the output to a file
with the `--output` argument.
If an error occurs during link checking,
it is important to know where the error occurred.
Therefore the request and response objects now contain the input
source as a field. This makes error tracking easier.
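A rough sketch of the idea, with the input source carried from request to response. The type and field names (`Input`, `Request`, `Response`, `source`, `ok`) and the helpers `respond` and `describe` are assumptions for illustration, not the actual definitions.

```rust
use std::path::PathBuf;

/// Where a link was found (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
enum Input {
    Stdin,
    File(PathBuf),
    Url(String),
}

#[derive(Debug, Clone)]
struct Request {
    uri: String,
    source: Input,
}

#[derive(Debug)]
struct Response {
    uri: String,
    source: Input,
    ok: bool,
}

/// Carry the source over so a failure can be traced back to its input.
fn respond(request: Request, ok: bool) -> Response {
    Response { uri: request.uri, source: request.source, ok }
}

/// Format a response together with the input it came from.
fn describe(response: &Response) -> String {
    let origin = match &response.source {
        Input::Stdin => "stdin".to_string(),
        Input::File(path) => path.display().to_string(),
        Input::Url(url) => url.clone(),
    };
    let label = if response.ok { "ok" } else { "failed" };
    format!("{} [{}] (from {})", response.uri, label, origin)
}
```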
* Fix HTML parsing for non-closed elements like <link>
The XML parser we use requires all tags to be closed by default,
and if they aren't (like HTML5 <link> elements), it simply gives up
on further parsing. This change makes it ignore such issues.
Also uncovers a bug with the current parser: it simply won't parse
elements like `<script defer src="..."></script>`, i.e. elements
with attributes that have no value.
The parser we use is an XML parser and will have to be replaced with
an HTML-aware parser in the future.
* Add check for empty elements
* Update extract.rs
Co-authored-by: Matthias <matthias-endler@gmx.net>
For now we only support JSON.
I honestly don't know if it makes sense to include other formats.
For example, MD and HTML are not really machine-readable.
YAML is not a great standard format for this use-case.
Open for discussions, though.
This splits up the code into a `lib` and a `bin`
to make the runtime usable from other crates.
Co-authored-by: Paweł Romanowski <pawroman@pawroman.dev>
This implements a basic builder for the Checker struct as discussed in #12.
It uses derive_builder with a custom build method to instantiate the more elaborate fields like reqwest::Client.
It also adds deadpool and tokio::mpsc as dependencies to handle a pool of clients to query websites.
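For readers unfamiliar with the pattern, here is a hand-written sketch of a builder with a custom `build` step. It is only an analogy for what derive_builder generates: the `user_agent` and `timeout` fields, the default values, and the validation rule are assumptions, and the HTTP client and client pool are left out.

```rust
use std::time::Duration;

#[derive(Debug)]
struct Checker {
    user_agent: String,
    timeout: Duration,
}

/// Collects optional settings; `build` turns them into a `Checker`.
#[derive(Debug, Default)]
struct CheckerBuilder {
    user_agent: Option<String>,
    timeout: Option<Duration>,
}

impl CheckerBuilder {
    fn user_agent(mut self, value: &str) -> Self {
        self.user_agent = Some(value.to_string());
        self
    }

    fn timeout(mut self, value: Duration) -> Self {
        self.timeout = Some(value);
        self
    }

    /// Custom build step: fill in defaults and validate before constructing.
    /// More elaborate fields would be instantiated here.
    fn build(self) -> Result<Checker, String> {
        let timeout = self.timeout.unwrap_or(Duration::from_secs(20));
        if timeout.as_nanos() == 0 {
            return Err("timeout must be non-zero".to_string());
        }
        Ok(Checker {
            user_agent: self.user_agent.unwrap_or_else(|| "checker-sketch".to_string()),
            timeout,
        })
    }
}
```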
* Make GITHUB_TOKEN optional
This also makes it possible to pass the token in via CLI args.
* Add missing test fixture file
* Normalize exit codes and GitHub checking behavior
The exit code is now defined as 1 for unexpected or config errors,
and 2 for link check failures.
GitHub checking behavior has been tweaked to generate errors if
a GitHub-specific check cannot be performed because of a missing
token.
* Remove short flag for github token
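The exit code convention above could be modelled along these lines. This is a sketch under assumptions: the enum and variant names, the `exit_code` helper, and the shape of its input (a failure count or an error message) are hypothetical, and the value 0 for success is assumed rather than stated above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExitCode {
    /// Assumed: everything went fine.
    Success = 0,
    /// Unexpected or config errors.
    UnexpectedFailure = 1,
    /// At least one link check failed.
    LinkCheckFailure = 2,
}

/// Map a run outcome to an exit code.
/// `Ok(n)` carries the number of failed links, `Err` an unexpected error.
fn exit_code(outcome: &Result<usize, String>) -> ExitCode {
    match outcome {
        Ok(0) => ExitCode::Success,
        Ok(_) => ExitCode::LinkCheckFailure,
        Err(_) => ExitCode::UnexpectedFailure,
    }
}
```

A binary would then pass `exit_code(&outcome) as i32` to `std::process::exit`.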
[Issue #18](https://github.com/hello-rust/lychee/issues/18)
* Add headers crate to type headers and create auth header
* Add cmd param basic-auth to set property to the main
* Add simple test to check that requests with auth headers are not broken
Signed-off-by: FabianBG <f4b4g3@gmail.com>