linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-03-27 03:00:36 +00:00

Author	SHA1	Message	Date
Chris Mayo	94839dcb03	Fix TestHttp.test_html_internet on Python 3.12.6 A relative path in a URL without netloc is preserved in urllib.parse.urlunsplit(). https://docs.python.org/3.12/whatsnew/changelog.html#id4	2024-09-20 19:21:25 +01:00
Chris Mayo	ecfd35b9e9	Fix TestFile.test_markdown on Python 3.12.5 mime type mapping for markdown was added. https://docs.python.org/3.12/whatsnew/changelog.html#id3	2024-08-21 19:34:37 +01:00
nodet	28f6743778	Add ignorewarningsforurls to ignore specific warnings (#794) We want to allow specifying a warning to ignore for each URL. If no regex is specified for the warning to ignore, we'll ignore all warnings. The tests still pass as they are, which means that unknown values in the configuration file are simply ignored. * [#782] Add values to configuration file * [#782] Parse new configuration values * [#782] Actually ignore a warning * [#782] Confirm side cases work as expected * [#782] Add logging when deciding to ignore warnings * [#782] Documentation for ignorewarningsforurls * [#782] Update (generated) man pages * [#782] These tests pass without network, actually * [#782] Fix copy/paste error in symbol naming * [#782] The regex matches the name of the warning, not the message * [#782] Better wording * [#782] Update (generated) man pages * [#782] We match the type, not the message	2024-02-13 19:43:29 +00:00
Chris Mayo	beaf9399f8	Elevate redirection to a warning tagged http-redirected Include the HTTP status code and reason in the message.	2023-08-28 19:22:24 +01:00
Nathan Arthur	2d1bf6ef98	Add tests for encoded anchors for file: and http: I started with a test of urlencoded anchors, assuming that the URL might have a urlencoded anchor, but the actual anchor in the HTML would NOT be urlencoded.	2022-10-03 19:33:05 +01:00
Nathan Arthur	5398fd2406	Add an anchor test for multiple inter-connected files	2022-10-03 19:33:05 +01:00
Chris Mayo	3c7fb5b571	Fix checking directory containing Unicode filenames Non-Unicode filenames are not supported. sys.platform has not returned "linux2" since Python 3.3.	2022-09-05 19:28:40 +01:00
Chris Mayo	d6936ceb91	Add warning url-content-type-unparseable	2022-09-02 19:29:11 +01:00
Chris Mayo	7c2036b68c	Drop support for Beautiful Soup < 4.8.1 The minimum version supported was already 4.8.0 because of the use of multi_valued_attributes [1]. Test support for < 4.8.1 is the only code that needs removing [2]. [1] `3ff3d724` ("Use BeautifulSoup element attrs directly", 2020-04-03) [2] `607328d5` ("Support Beautiful Soup line numbers", 2019-10-05)	2021-01-28 19:20:24 +00:00
Chris Mayo	314ec085a3	Merge pull request #462 from cjmayo/anchor Fix anchor checking	2020-09-01 19:39:29 +01:00
Chris Mayo	737c61cd67	Merge pull request #484 from cjmayo/issuetests Tests of img srcset and invalid host name	2020-08-22 16:32:03 +01:00
Chris Mayo	24c2f4ac39	Add test for invalid host name in content Tests code added in: `d5690203` ("Fix critical exception when parsing a URL with a ]", 2020-08-08)	2020-08-15 17:04:41 +01:00
Chris Mayo	8c804c35a5	Detect sitemaps that do not start with an XML declaration	2020-08-11 19:35:56 +01:00
Chris Mayo	a7eacd6200	Add a test for a page with links to anchors Query and fragment URL parts for filesystem URLs are ignored, therefore test over http.	2020-07-27 19:22:32 +01:00
Chris Mayo	6f126a54d2	Add coverage for parser.sitemap.parse_sitemapindex()	2020-05-27 20:02:03 +01:00
Chris Mayo	d611564cb0	Add a test for an empty html file accessed over http	2020-05-23 20:01:24 +01:00
Marius Gedminas	5bd1fb4e36	Fix internal error on empty HTML files When BeautifulSoup finds an empty file on disk, it sets original_encoding to None. It doesn't matter what encoding we pick for empty files, so let's just pick one. I don't know if there are any circumstances where BeautifulSoup might set the encoding to None for a non-empty file. Closes #392.	2020-05-21 19:01:33 +03:00
Chris Mayo	00c4a30386	Add user and password only loginurl tests	2020-05-13 19:32:29 +01:00
Chris Mayo	31a9f68c46	Merge pull request #367 from cjmayo/loginurl Add test for loginurl	2020-05-12 20:08:57 +01:00
Chris Mayo	4ffdbf2406	Replace MetaRobotsFinder using BeautifulSoup.find()	2020-04-29 20:07:00 +01:00
Chris Mayo	3b8af403be	Add test for loginurl A new cgi-bin directory is created to identify the scripts to be run by http.server.CGIHTTPRequestHandler.	2020-04-19 19:05:55 +01:00
Chris Mayo	56b8c9f7ab	Add tests for <meta name="robots" content="nofollow"> norobots.html was used for testing <meta name="robots" content="nofollow"> in local files until [1]. This commit reinstates local file testing and adds an http test. Checking is reported by checker.httpurl.HttpUrl.content_allows_robots(). [1] `ce733ae7` ("Don't check for robots.txt directives in local html files.", 2014-03-19)	2020-04-18 20:30:46 +01:00
Chris Mayo	74d5c68094	Add new tests for URL quoting	2019-10-05 19:38:57 +01:00
Chris Mayo	607328d5c5	Support Beautiful Soup line numbers	2019-10-05 19:38:57 +01:00
Petr Dlouhý	2c3c794e52	fix http test after parser change	2019-07-22 19:59:37 +01:00
Petr Dlouhý	d1844a526e	add charset tests	2019-07-22 19:59:37 +01:00
Chris Mayo	ec8b6e09f0	Fix XmlTagUrlParser and make Python 3 compatible URLs within a sitemap file were not being captured.	2019-10-28 19:20:05 +00:00
Marius Gedminas	87b504785c	Add a regression test for the sitemap parser	2019-10-23 17:30:10 +03:00
Marius Gedminas	58b0d5aaae	Fix TypeError: string arg required in content_allows_robots() See #323 and #317.	2019-10-22 14:13:45 +03:00
Marius Gedminas	84dbb5d603	Fix TypeError: string arg required in find_links() Fixes #317.	2019-10-21 17:47:46 +03:00
Marius Gedminas	a4967fe92c	Add a regression test for issue #317 The important bit was making the `file_test` helper not ignore internal errors.	2019-10-21 17:45:18 +03:00
Petr Dlouhý	c1ab81627e	test of correct logging of all parts in url_data	2018-01-14 17:17:07 +01:00
Philipp Hahn	1368643a50	Fix fragment identifier quoting According to <https://tools.ietf.org/html/rfc3986>: fragment = *( pchar / "/" / "?" ) pchar = unreserved / pct-encoded / sub-delims / ":" / "@" unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" pct-encoded = "%" HEXDIG HEXDIG sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=" Fixes #96	2017-11-10 08:03:03 -05:00
Petr Dlouhý	f5100138ff	fix tests that fail because of changed linkchecker output	2017-02-14 10:59:38 +01:00
Marius Gedminas	f4ec7531c1	Fix TestHttp.test_html The HTML tag has two attributes with URLs: <applet archive="file.html" src="file.css"> It would appear that the order in which these attributes are crawled does not match the order in the result file. Possibly the crawling order is non-deterministic, although I cannot reproduce that. If that's the case, the fix would be to sort the attributes in the crawler before following them, which means we want the expected results sorted as well (and since 'archive' comes before 'src', so file.html should come before file.css).	2017-02-01 18:41:47 +02:00
Bastian Kleineidam	914995b5fc	Use example.com for tests.	2016-01-19 12:17:08 +01:00
Vadim Khohlov	d4352fc828	Added plugin for parsing and checking links in Markdown files	2014-11-11 15:35:18 +02:00
Bastian Kleineidam	0fa7ed2699	Fix empty URL handling.	2014-07-03 23:34:40 +02:00
Bastian Kleineidam	b152ce7a6e	Add PDF test and fix page number.	2014-04-29 18:53:24 +02:00
Bastian Kleineidam	bca226c293	Fix assertion checking external links; fix tests	2014-03-10 18:23:44 +01:00
Bastian Kleineidam	6b334dc79b	Fix URL result caching.	2014-03-08 19:35:10 +01:00
Bastian Kleineidam	ef13a3fce1	Implement sitemap and sitemap index parsing.	2014-03-05 19:26:37 +01:00
Bastian Kleineidam	82f81241fd	Check all links and add better caching.	2014-03-03 23:29:45 +01:00
Bastian Kleineidam	7b34be590b	Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.	2014-03-01 00:12:34 +01:00
Bastian Kleineidam	b363945052	Adjust example.com/org tests. This seems to change every now and then.	2013-12-04 19:13:18 +01:00
Bastian Kleineidam	023da7c993	Remove the duplicate URL content check.	2013-12-04 19:12:40 +01:00
Bastian Kleineidam	a86e36e5d3	Fix test cases for example.com redirection.	2013-01-23 19:42:29 +01:00
Bastian Kleineidam	e6ad32c028	Catch UnicodeError for invalid host names.	2013-01-23 19:42:29 +01:00
Bastian Kleineidam	4dad2aa33c	Support dns-prefetch URLs.	2013-01-17 20:41:09 +01:00
Bastian Kleineidam	03f2e19cfd	Fix html tests.	2013-01-17 20:40:51 +01:00

95 commits
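
The RFC 3986 fragment grammar quoted in commit 1368643a50 can be sketched in Python roughly as follows. This is a minimal illustration and not linkchecker's actual code: the helper name `quote_fragment` and the constant names are hypothetical, and the input is assumed to be a fragment that has not been percent-encoded yet (an existing "%" would be encoded again).

```python
from urllib.parse import quote

# Hypothetical constants, named after the RFC 3986 productions.
# sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
SUB_DELIMS = "!$&'()*+,;="
# pchar additionally allows ":" and "@"; fragment additionally allows "/" and "?".
# The unreserved set (ALPHA / DIGIT / "-" / "." / "_" / "~") is never
# escaped by urllib.parse.quote(), so it does not need to be listed.
FRAGMENT_SAFE = SUB_DELIMS + ":@" + "/?"


def quote_fragment(fragment: str) -> str:
    """Percent-encode every character that RFC 3986 does not allow
    literally in a fragment. Assumes the input is not already encoded."""
    return quote(fragment, safe=FRAGMENT_SAFE)


if __name__ == "__main__":
    for raw in ("section 1", "sec:1/2?x=(y)*", "a#b"):
        print(raw, "->", quote_fragment(raw))
```

Characters outside the grammar, such as a space or a second "#", are escaped, while the sub-delims, ":", "@", "/" and "?" pass through unchanged.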