lychee/lychee-bin/src/stats.rs

200 lines
5.8 KiB
Rust
Raw Normal View History

Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
use std::{
collections::{HashMap, HashSet},
fmt::{self, Display},
};
use console::style;
use lychee_lib::{Input, Response, ResponseBody, Status};
use pad::{Alignment, PadStr};
use serde::Serialize;
// Maximum padding for each entry in the final statistics output
const MAX_PADDING: usize = 20;
pub(crate) fn color_response(response: &ResponseBody) -> String {
let out = match response.status {
Status::Ok(_) => style(response).green().bright(),
2021-04-26 15:16:58 +00:00
Status::Excluded | Status::Unsupported(_) => style(response).dim(),
Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
Status::Redirected(_) => style(response),
Status::Timeout(_) => style(response).yellow().bright(),
Status::Error(_) => style(response).red().bright(),
};
out.to_string()
}
#[derive(Default, Serialize)]
pub(crate) struct ResponseStats {
total: usize,
successful: usize,
failures: usize,
timeouts: usize,
redirects: usize,
excludes: usize,
errors: usize,
fail_map: HashMap<Input, HashSet<ResponseBody>>,
}
impl ResponseStats {
#[inline]
pub(crate) fn new() -> Self {
Self::default()
}
pub(crate) fn add(&mut self, response: Response) {
let Response(source, ResponseBody { ref status, .. }) = response;
2021-04-26 15:16:58 +00:00
if status.is_unsupported() {
// Silently skip unsupported URIs
return;
}
Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
self.total += 1;
match status {
Status::Ok(_) => self.successful += 1,
Status::Error(_) => self.failures += 1,
Status::Timeout(_) => self.timeouts += 1,
Status::Redirected(_) => self.redirects += 1,
Status::Excluded => self.excludes += 1,
2021-04-26 15:16:58 +00:00
Status::Unsupported(_) => (), // Just skip unsupported URI
Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
}
if matches!(
status,
Status::Error(_) | Status::Timeout(_) | Status::Redirected(_)
) {
let fail = self.fail_map.entry(source).or_default();
fail.insert(response.1);
};
}
#[inline]
pub(crate) const fn is_success(&self) -> bool {
self.total == self.successful + self.excludes
}
#[inline]
pub(crate) const fn is_empty(&self) -> bool {
self.total == 0
}
}
fn write_stat(f: &mut fmt::Formatter, title: &str, stat: usize, newline: bool) -> fmt::Result {
let fill = title.chars().count();
f.write_str(title)?;
f.write_str(
&stat
.to_string()
.pad(MAX_PADDING - fill, '.', Alignment::Right, false),
)?;
if newline {
f.write_str("\n")?;
}
Ok(())
}
impl Display for ResponseStats {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let separator = "-".repeat(MAX_PADDING + 1);
writeln!(f, "\u{1f4dd} Summary")?; // 📝
writeln!(f, "{}", separator)?;
write_stat(f, "\u{1f50d} Total", self.total, true)?; // 🔍
write_stat(f, "\u{2705} Successful", self.successful, true)?; // ✅
write_stat(f, "\u{23f3} Timeouts", self.timeouts, true)?; // ⏳
write_stat(f, "\u{1f500} Redirected", self.redirects, true)?; // 🔀
write_stat(f, "\u{1f47b} Excluded", self.excludes, true)?; // 👻
write_stat(f, "\u{1f6ab} Errors", self.errors + self.failures, false)?; // 🚫
for (input, responses) in &self.fail_map {
// Using leading newlines over trailing ones (e.g. `writeln!`)
// lets us avoid extra newlines without any additional logic.
write!(f, "\n\nErrors in {}", input)?;
for response in responses {
2021-09-03 00:18:58 +00:00
write!(f, "\n{}", color_response(response))?;
Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
}
}
Ok(())
}
}
#[cfg(test)]
mod test {
use std::collections::{HashMap, HashSet};
use http::StatusCode;
use lychee_lib::{ClientBuilder, Input, Response, ResponseBody, Status, Uri};
use pretty_assertions::assert_eq;
use reqwest::Url;
use wiremock::{matchers::path, Mock, MockServer, ResponseTemplate};
use super::ResponseStats;
fn website(url: &str) -> Uri {
Uri::from(Url::parse(url).expect("Expected valid Website URI"))
}
async fn get_mock_status_response<S>(status_code: S) -> Response
where
S: Into<StatusCode>,
{
let mock_server = MockServer::start().await;
let template = ResponseTemplate::new(status_code.into());
Mock::given(path("/"))
.respond_with(template)
.mount(&mock_server)
.await;
ClientBuilder::default()
.client()
Major refactor of codebase (#208) - The binary component and library component are separated as two packages in the same workspace. - `lychee` is the binary component, in `lychee-bin/*`. - `lychee-lib` is the library component, in `lychee-lib/*`. - Users can now install only the `lychee-lib`, instead of both components, that would require fewer dependencies and faster compilation. - Dependencies for each component are adjusted and updated. E.g., no CLI dependencies for `lychee-lib`. - CLI tests are only moved to `lychee`, as it has nothing to do with the library component. - `Status::Error` is refactored to contain dedicated error enum, `ErrorKind`. - The motivation is to delay the formatting of errors to strings. Note that `e.to_string()` is not necessarily cheap (though trivial in many cases). The formatting is no delayed until the error is needed to be displayed to users. So in some cases, if the error is never used, it means that it won't be formatted at all. - Replaced `regex` based matching with one of the following: - Simple string equality test in the case of 'false positivie'. - URL parsing based test, in the case of extracting repository and user name for GitHub links. - Either cases would be much more efficient than `regex` based matching. First, there's no need to construct a state machine for regex. Second, URL is already verified and parsed on its creation, and extracting its components is fairly cheap. Also, this removes the dependency on `lazy-static` in `lychee-lib`. - `types` module now has a sub-directory, and its components are now separated into their own modules (in that sub-directory). - `lychee-lib::test_utils` module is only compiled for tests. - `wiremock` is moved to `dev-dependency` as it's only needed for `test` modules. - Dependencies are listed in alphabetical order. - Imports are organized in the following fashion: - Imports from `std` - Imports from 3rd-party crates, and `lychee-lib`. - Imports from `crate::*` or `super::*`. - No glob import. - I followed suggestion from `cargo clippy`, with `clippy::all` and `clippy:pedantic`. Co-authored-by: Lucius Hu <lebensterben@users.noreply.github.com>
2021-04-14 23:24:11 +00:00
.unwrap()
.check(mock_server.uri())
.await
.unwrap()
}
#[test]
fn test_stats_is_empty() {
let mut stats = ResponseStats::new();
assert!(stats.is_empty());
stats.add(Response(
Input::Stdin,
ResponseBody {
uri: website("http://example.org/ok"),
status: Status::Ok(StatusCode::OK),
},
));
assert!(!stats.is_empty());
}
#[tokio::test]
async fn test_stats() {
let stata = [
StatusCode::OK,
StatusCode::PERMANENT_REDIRECT,
StatusCode::BAD_GATEWAY,
];
let mut stats = ResponseStats::new();
for status in &stata {
stats.add(get_mock_status_response(status).await);
}
let mut expected_map: HashMap<Input, HashSet<ResponseBody>> = HashMap::new();
for status in &stata {
if status.is_server_error() || status.is_client_error() || status.is_redirection() {
let Response(input, response_body) = get_mock_status_response(status).await;
let entry = expected_map.entry(input).or_default();
entry.insert(response_body);
}
}
assert_eq!(stats.fail_map, expected_map);
}
}