Commit graph

312 commits

Author SHA1 Message Date
Lukas Pirl
8c959589c3
add option to ignore specific errors for specific URLs 2022-09-25 22:52:04 +02:00
Chris Mayo
3c7fb5b571 Fix checking directory containing Unicode filenames
Non-Unicode filenames are not supported.

sys.platform has not returned "linux2" since Python 3.3.
2022-09-05 19:28:40 +01:00
Chris Mayo
1abd9ea10e Skip tests in TestFile rather than silently returning 2022-09-05 19:28:40 +01:00
Chris Mayo
d6936ceb91 Add warning url-content-type-unparseable 2022-09-02 19:29:11 +01:00
Kian-Meng Ang
a70ea9ea14 Fix typos
Found via `codespell ./linkcheck/ ./tests ./doc/man/en -L bu,noone,fo,pres,shttp`
2022-09-02 17:20:02 +08:00
Chris Mayo
b0878dd7e8 Require TestFile.test_directory_listing to succeed 2021-12-15 19:36:53 +00:00
Chris Mayo
76815bcf47 Don't guess the URL for files that end in .html
Fixes:
linkchecker ftp.html
failing looking for ftp://ftp.html
2021-12-13 19:31:13 +00:00
Chris Mayo
fe5a34c68f Remove linkcheck.checker.proxysupport
Set up the requests.Session() with the complete proxy configuration
to fix a problem with using an HTTP server as an HTTPS proxy and
potential redirection issues.

Requests handles no_proxy.
2021-12-13 19:25:23 +00:00
Chris Mayo
0356524369 Disable AnchorCheck plugin
Can't be relied on. Multiple reports of expected results not returned.

https://github.com/linkchecker/linkchecker/issues/542
https://github.com/linkchecker/linkchecker/issues/555
https://github.com/linkchecker/linkchecker/issues/568

Previously a fix was needed just to get the tests working:
0912e8a2c ("Don't strip the URL fragment from cache key if using AnchorCheck", 2020-07-27)

After:
eaa538c81 ("don't check one url multiple times", 2016-11-09)
2021-11-29 19:35:34 +00:00
Chris Mayo
ef60e9dcd6 Enable certificate verification during https test 2021-11-22 19:27:18 +00:00
Chris Mayo
bb4102da5a Replace deprecated ssl.wrap_socket() in tests 2021-11-22 19:27:18 +00:00
Chris Mayo
7c2036b68c Drop support for Beautiful Soup < 4.8.1
The minimum version supported was already 4.8.0 because of the use
of multi_valued_attributes [1].

Test support for < 4.8.1 is the only code that needs removing [2].

[1] 3ff3d724 ("Use BeautifulSoup element attrs directly", 2020-04-03)
[2] 607328d5 ("Support Beautiful Soup line numbers", 2019-10-05)
2021-01-28 19:20:24 +00:00
Chris Mayo
e922dd0224 Stop using biplist
plistlib has supported binary files since Python 3.4.
2020-10-12 19:55:46 +01:00
Chris Mayo
9891fc3f70 Python 3.9 adds suport for HTTP status code 103 EARLY_HINTS 2020-09-14 19:55:05 +01:00
Chris Mayo
b1faef93c3
Merge pull request #495 from cjmayo/mswindows
MS Windows Python 3.7 and MS Store compatibility
2020-09-01 19:46:44 +01:00
Chris Mayo
314ec085a3
Merge pull request #462 from cjmayo/anchor
Fix anchor checking
2020-09-01 19:39:29 +01:00
Chris Mayo
89613d56f2 Replace the use of Python internal test.support
Its use is discourged and it is not present in the MS Store version of
Python.
2020-08-29 16:57:57 +01:00
Chris Mayo
1390c9cd7e
Merge pull request #489 from cjmayo/urlsplit
Replace deprecated urllib.parse.split functions
2020-08-29 16:44:56 +01:00
Chris Mayo
2de25d54fd Rename blacklist to failures
Continue to support blacklist for the time being, with deprecation
warnings.
2020-08-23 17:19:26 +01:00
Chris Mayo
737c61cd67
Merge pull request #484 from cjmayo/issuetests
Tests of img srcset and invalid host name
2020-08-22 16:32:03 +01:00
Chris Mayo
f99f15c349 Add a test for UrlBase.build_url() 2020-08-22 16:28:53 +01:00
Chris Mayo
24c2f4ac39 Add test for invalid host name in content
Tests code added in:
d5690203 ("Fix critical exception when parsing a URL with a ]", 2020-08-08)
2020-08-15 17:04:41 +01:00
Chris Mayo
8c804c35a5 Detect sitemaps that do not start with an XML declaration 2020-08-11 19:35:56 +01:00
Chris Mayo
a7eacd6200 Add a test for a page with links to anchors
Query and fragment URL parts for filesystem URLs are ignored, therefore
test over http.
2020-07-27 19:22:32 +01:00
Chris Mayo
4330b8a59e Replace codecs.open() with open() 2020-06-05 16:59:46 +01:00
Chris Mayo
a6b1eb45b1 Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
Chris Mayo
5df8aa085c Convert space-separated strings in tests/ 2020-05-29 19:40:46 +01:00
Chris Mayo
5ee8d8e1ea Add trailing comma to single dict list in TestLoginUrl.visit_loginurl() 2020-05-29 19:40:46 +01:00
Chris Mayo
a534be0b50 Remove unnecessary character match in regexp in TestLogger.normalize() 2020-05-29 19:40:46 +01:00
Chris Mayo
be53c4a659 Remove unnecessary commas before closing brackets in tests/ 2020-05-29 19:40:46 +01:00
Chris Mayo
87039913b2 Fix remaining flake8 violations in tests/
tests/test_clamav.py:58:89: E501 line too long (90 > 88 characters)
tests/test_containers.py:38:9: F841 local variable 'dummy' is assigned to but never used
tests/test_dummy.py:35:9: F841 local variable 'dummy' is assigned to but never used
tests/test_ftpparse.py:94:89: E501 line too long (96 > 88 characters)
tests/test_url.py:128:89: E501 line too long (130 > 88 characters)
tests/test_strformat.py:62:9: E741 ambiguous variable name 'l'
tests/test_strformat.py:136:9: E731 do not assign a lambda expression, use a def
tests/checker/ftpserver.py:94:9: E722 do not use bare 'except'
tests/checker/httpserver.py:55:39: E231 missing whitespace after ','
tests/checker/httpserver.py:224:9: E722 do not use bare 'except'
tests/checker/telnetserver.py:84:9: E722 do not use bare 'except'
tests/checker/__init__.py:71:89: E501 line too long (119 > 88 characters)
tests/checker/__init__.py:292:13: E741 ambiguous variable name 'l'
tests/checker/test_http_misc.py:30:1: W293 blank line contains whitespace
tests/checker/test_https.py:21:1: F401 'tests.need_network' imported but unused
tests/checker/test_news.py:35:1: E302 expected 2 blank lines, found 1
2020-05-28 20:29:13 +01:00
Chris Mayo
165c51aeea Run black on tests/ 2020-05-28 20:29:13 +01:00
Chris Mayo
6f126a54d2 Add coverage for parser.sitemap.parse_sitemapindex() 2020-05-27 20:02:03 +01:00
Chris Mayo
f6e182f0e4 Mark TestFile.test_html_url_quote as need_network
Else without the internet the test fails, eventually, with:

warning No MX mail host for users.sourceforge.net found
2020-05-25 19:55:28 +01:00
Chris Mayo
d3c9618b1b TestHttps.test_https doesn't need the internet now
A result of changes introduced in:

dee4be4b ("Enable https checking using a test server", 2019-11-11)
2020-05-25 19:55:28 +01:00
Chris Mayo
32689ea230 Enable as many TestHttp html tests as possible without the internet 2020-05-25 19:55:28 +01:00
Chris Mayo
313a14ff0d Remove instances of Python 2 unicode 2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
Chris Mayo
d611564cb0 Add a test for an empty html file accessed over http 2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting 2020-05-23 14:15:52 +03:00
Marius Gedminas
5bd1fb4e36 Fix internal error on empty HTML files
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None.  It doesn't matter what encoding we pick for
empty files, so let's just pick one.

I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.

Closes #392.
2020-05-21 19:01:33 +03:00
Chris Mayo
96e1c00ff7 TestLogger diff output is all Unicode in Python 3 2020-05-20 19:58:44 +01:00
Chris Mayo
a127902607 Replace str_text in asserts 2020-05-19 19:56:42 +01:00
Chris Mayo
04465530c4 Use HttpServerTest.get_url() 2020-05-17 20:10:28 +01:00
Chris Mayo
58dbe1f282 Remove unused import pytest from tests/checker/test_http.py
pytest.mark.xfail() removed in:
743a5f31 ("Crawl HTML attributes in deterministic order", 2017-02-01)
2020-05-17 20:10:28 +01:00
Chris Mayo
79eafee826 Add a test for VirusCheck 2020-05-17 19:04:49 +01:00
Chris Mayo
a15a2833ca Remove spaces after names in class method definitions
And also nested functions.

This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7 Remove spaces after names in function definitions
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
fc11d08968 Remove spaces after names in class definitions 2020-05-16 20:19:42 +01:00
Chris Mayo
1416a08119 On Python 3 no need to convert os.linesep to a string 2020-05-16 17:02:01 +01:00