551 commits

Author SHA1 Message Date
Chris Mayo
1390c9cd7e
Merge pull request #489 from cjmayo/urlsplit
Replace deprecated urllib.parse.split functions
2020-08-29 16:44:56 +01:00
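As a rough illustration of the kind of replacement this pull request title describes (not the actual diff), the deprecated `urllib.parse.splittype()` and `splithost()` helpers can usually be replaced by `urllib.parse.urlsplit()`. The URL below is made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL, for illustration only.
url = "http://www.example.com:8080/path/page.html?q=1#top"

parts = urlsplit(url)
scheme = parts.scheme    # roughly the first item splittype() returned
netloc = parts.netloc    # roughly the first item splithost() returned
host = parts.hostname    # host name without the port
port = parts.port        # port as an int, or None if absent
```

`urlsplit()` returns a named tuple, so one call covers what previously took a chain of `split*()` calls.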
Chris Mayo
47604e7d34
Merge pull request #481 from cjmayo/failures
Rename blacklist to failures
2020-08-29 16:39:24 +01:00
Chris Mayo
7dfba766a9
Merge pull request #486 from cjmayo/url
Remove unused code from url.py
2020-08-26 19:28:50 +01:00
Chris Mayo
2de25d54fd Rename blacklist to failures
Continue to support blacklist for the time being, with deprecation
warnings.
2020-08-23 17:19:26 +01:00
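A minimal sketch of how a renamed option can stay supported with a deprecation warning, as the message above describes. The function name and the dict-based options are assumptions for illustration, not linkchecker's real configuration code.

```python
import warnings


def resolve_failures_option(options):
    """Return the 'failures' value, still accepting the old 'blacklist' key.

    Hypothetical helper: `options` is a plain dict standing in for
    whatever configuration object the real code uses.
    """
    if "blacklist" in options:
        warnings.warn(
            "'blacklist' is deprecated, use 'failures' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Prefer the new name if both are given.
        return options.get("failures", options["blacklist"])
    return options.get("failures")
```

Using `DeprecationWarning` keeps the old spelling working while giving users a visible prompt to migrate.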
Chris Mayo
737c61cd67
Merge pull request #484 from cjmayo/issuetests
Tests of img srcset and invalid host name
2020-08-22 16:32:03 +01:00
Chris Mayo
f99f15c349 Add a test for UrlBase.build_url() 2020-08-22 16:28:53 +01:00
Chris Mayo
d58b3ab285 Remove unused url.url_fix_common_typos() 2020-08-18 19:57:46 +01:00
Chris Mayo
71ea78382b Remove unused url.safe_host_pattern() 2020-08-18 19:57:46 +01:00
Chris Mayo
794efd6d44 Remove unused url.is_duplicate_content_url() 2020-08-18 19:57:46 +01:00
Chris Mayo
e372657fb8 Remove unused url.get_content() 2020-08-18 19:57:46 +01:00
Chris Mayo
e4ba9c84ce Remove unused url.match_{host,url}()
Removes deprecation warnings for urllib.parse.split{host,type}() in
url_split()
2020-08-18 19:57:46 +01:00
Chris Mayo
4ad20d7f03
Merge pull request #477 from cjmayo/sitemap
Detect sitemaps that do not start with an XML declaration
2020-08-18 19:51:32 +01:00
Chris Mayo
24c2f4ac39 Add test for invalid host name in content
Tests code added in:
d5690203 ("Fix critical exception when parsing a URL with a ]", 2020-08-08)
2020-08-15 17:04:41 +01:00
Chris Mayo
88c84364b3 Add additional tests for <img srcset>
Tests code added in:
7ba40537 ("Fix critical exception if srcset value ends with a comma", 2020-08-07)
27f22ae1 ("Fix treating data: URIs in srcset values as links", 2020-08-07)
2020-08-15 17:04:41 +01:00
Chris Mayo
8c804c35a5 Detect sitemaps that do not start with an XML declaration 2020-08-11 19:35:56 +01:00
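One way such detection could work, sketched under the assumption that a sitemap is recognised by its root element rather than by a leading `<?xml ...?>` declaration. The pattern and function name are hypothetical and are not taken from the commit.

```python
import re

# Optional BOM, whitespace, XML declaration and comments,
# followed by one of the two sitemap root elements.
_SITEMAP_START = re.compile(
    r"\A\ufeff?\s*(?:<\?xml[^>]*\?>\s*)?(?:<!--.*?-->\s*)*"
    r"<(?:urlset|sitemapindex)\b",
    re.DOTALL,
)


def looks_like_sitemap(text):
    """Return True if `text` starts like a sitemap or sitemap index."""
    return _SITEMAP_START.match(text) is not None
```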
Chris Mayo
40b2ebff8f Remove defaults from lc_cgi.checklink()
Only called from application() with arguments. Causes local environment
to be embedded in documentation when using Sphinx autodoc.
2020-08-05 19:54:56 +01:00
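The effect described above follows from Python evaluating default arguments once, at definition time, so the default object becomes part of the signature that documentation tools can render. A small sketch with hypothetical stand-in functions (the real `checklink()` signature is not shown in this log):

```python
import inspect
import os


def checklink_like(form=None, env=os.environ):
    """Hypothetical stand-in with a default bound at definition time."""
    return form, env


def checklink_like_no_defaults(form, env):
    """The same hypothetical function with the defaults removed."""
    return form, env


# The default is the live os.environ mapping of the machine that
# imported the module, and it is stored in the signature.
default = inspect.signature(checklink_like).parameters["env"].default
plain_signature = str(inspect.signature(checklink_like_no_defaults))
```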
Chris Mayo
10170b2966 Add a test for the LocationInfo plugin
Because the GeoIP database now requires registration to download, the
result of the lookup using geoip-database is not going to change.
2020-07-07 17:25:28 +01:00
Chris Mayo
d91a328224 Remove strformat.unicode_safe() and strformat.url_unicode_split()
All strings support Unicode in Python 3.
2020-07-07 17:25:28 +01:00
Chris Mayo
d66e64460c Remove unused code from strformat.py 2020-06-18 19:31:00 +01:00
Chris Mayo
18d6eeae76 Ensure PO files are opened as UTF-8 in test_gtranslator() 2020-06-09 19:47:24 +01:00
Chris Mayo
74d449f8ac Test po files as strings and check po files have been found 2020-06-05 16:59:46 +01:00
Chris Mayo
4330b8a59e Replace codecs.open() with open() 2020-06-05 16:59:46 +01:00
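A short sketch of the substitution named in this commit: in Python 3 the built-in `open()` accepts an `encoding` argument, so `codecs.open()` is rarely needed. The file name and content are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.po")  # hypothetical file name

    # Before: codecs.open(path, "w", "utf-8")
    with open(path, "w", encoding="utf-8") as f:
        f.write('msgstr "Überprüfung"\n')

    # Before: codecs.open(path, "r", "utf-8")
    with open(path, encoding="utf-8") as f:
        content = f.read()
```

One behavioural difference to keep in mind: `codecs.open()` opens the underlying file in binary mode, so it performs no newline translation, whereas text-mode `open()` does by default.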
Chris Mayo
d591fedb60 Remove unused updater code that supports linkchecker-gui
pip provides update support for linkchecker.
2020-06-05 16:05:25 +01:00
Chris Mayo
a6b1eb45b1 Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
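For reference, the conversion named here looks roughly like this. The classes are invented for illustration.

```python
class Base:
    def __init__(self, name):
        self.name = name


class Child(Base):
    def __init__(self, name, depth):
        # Python 2 compatible form: super(Child, self).__init__(name)
        super().__init__(name)
        self.depth = depth


child = Child("example", 2)
```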
Chris Mayo
5df8aa085c Convert space-separated strings in tests/ 2020-05-29 19:40:46 +01:00
Chris Mayo
c71cfcbea4 Tidy TestClamav.testInfected() acceptable_responses 2020-05-29 19:40:46 +01:00
Chris Mayo
5ee8d8e1ea Add trailing comma to single dict list in TestLoginUrl.visit_loginurl() 2020-05-29 19:40:46 +01:00
Chris Mayo
a534be0b50 Remove unnecessary character match in regexp in TestLogger.normalize() 2020-05-29 19:40:46 +01:00
Chris Mayo
be53c4a659 Remove unnecessary commas before closing brackets in tests/ 2020-05-29 19:40:46 +01:00
Chris Mayo
87039913b2 Fix remaining flake8 violations in tests/
tests/test_clamav.py:58:89: E501 line too long (90 > 88 characters)
tests/test_containers.py:38:9: F841 local variable 'dummy' is assigned to but never used
tests/test_dummy.py:35:9: F841 local variable 'dummy' is assigned to but never used
tests/test_ftpparse.py:94:89: E501 line too long (96 > 88 characters)
tests/test_url.py:128:89: E501 line too long (130 > 88 characters)
tests/test_strformat.py:62:9: E741 ambiguous variable name 'l'
tests/test_strformat.py:136:9: E731 do not assign a lambda expression, use a def
tests/checker/ftpserver.py:94:9: E722 do not use bare 'except'
tests/checker/httpserver.py:55:39: E231 missing whitespace after ','
tests/checker/httpserver.py:224:9: E722 do not use bare 'except'
tests/checker/telnetserver.py:84:9: E722 do not use bare 'except'
tests/checker/__init__.py:71:89: E501 line too long (119 > 88 characters)
tests/checker/__init__.py:292:13: E741 ambiguous variable name 'l'
tests/checker/test_http_misc.py:30:1: W293 blank line contains whitespace
tests/checker/test_https.py:21:1: F401 'tests.need_network' imported but unused
tests/checker/test_news.py:35:1: E302 expected 2 blank lines, found 1
2020-05-28 20:29:13 +01:00
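Typical fixes for three of the codes listed above (E731, E722, E741) might look like the following sketch. The functions are invented and do not reproduce the actual changes made in tests/.

```python
def double(x):
    # E731: previously `double = lambda x: x * 2`
    return x * 2


def parse_port(value):
    try:
        return int(value)
    except ValueError:  # E722: previously a bare `except:`
        return None


lines = ["a", "bcd"]
# E741: previously `for l in lines`, where `l` is easily misread as `1`
lengths = [len(line) for line in lines]
```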
Chris Mayo
165c51aeea Run black on tests/ 2020-05-28 20:29:13 +01:00
Chris Mayo
6f126a54d2 Add coverage for parser.sitemap.parse_sitemapindex() 2020-05-27 20:02:03 +01:00
Chris Mayo
f6e182f0e4 Mark TestFile.test_html_url_quote as need_network
Otherwise, without internet access, the test eventually fails with:

warning No MX mail host for users.sourceforge.net found
2020-05-25 19:55:28 +01:00
Chris Mayo
d3c9618b1b TestHttps.test_https doesn't need the internet now
A result of changes introduced in:

dee4be4b ("Enable https checking using a test server", 2019-11-11)
2020-05-25 19:55:28 +01:00
Chris Mayo
32689ea230 Enable as many TestHttp html tests as possible without the internet 2020-05-25 19:55:28 +01:00
Chris Mayo
313a14ff0d Remove instances of Python 2 unicode 2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
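The idea in this pull request title can be sketched as a status classification in which 429 (Too Many Requests) is reported as a warning rather than an error. The function and its return values are hypothetical and do not mirror linkchecker's internals.

```python
from http import HTTPStatus


def classify_status(code):
    """Map an HTTP status code to 'ok', 'warning' or 'error' (sketch)."""
    if code == HTTPStatus.TOO_MANY_REQUESTS:  # 429: rate limited
        return "warning"
    if code >= 400:
        return "error"
    return "ok"
```

A rate-limited response says nothing about whether the link itself is broken, which is a plausible reason to downgrade it.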
Chris Mayo
d611564cb0 Add a test for an empty html file accessed over http 2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting 2020-05-23 14:15:52 +03:00
Marius Gedminas
5bd1fb4e36 Fix internal error on empty HTML files
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None.  It doesn't matter what encoding we pick for
empty files, so let's just pick one.

I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.

Closes #392.
2020-05-21 19:01:33 +03:00
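The fix described above amounts to supplying a fallback when the detected encoding is `None`. A minimal sketch, assuming a helper that receives whatever the parser reported; the helper name and the choice of UTF-8 as the fallback are assumptions, since the message does not say which encoding was picked.

```python
def choose_encoding(detected, fallback="utf-8"):
    """Return the detected encoding, or a fallback when none was detected.

    `detected` stands for a value such as BeautifulSoup's
    `original_encoding`, which the commit message says is None for
    empty files.
    """
    return detected if detected is not None else fallback
```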
Chris Mayo
96e1c00ff7 TestLogger diff output is all Unicode in Python 3 2020-05-20 19:58:44 +01:00
Chris Mayo
71eaf9a982 Remove str_text from tests/ 2020-05-19 19:56:42 +01:00
Chris Mayo
a127902607 Replace str_text in asserts 2020-05-19 19:56:42 +01:00
Chris Mayo
12fd59057e Remove duplicate tests from test_strformat.py 2020-05-17 20:10:28 +01:00
Chris Mayo
339d293326 Convert tests/test_po.py to UTF-8 2020-05-17 20:10:28 +01:00
Chris Mayo
04465530c4 Use HttpServerTest.get_url() 2020-05-17 20:10:28 +01:00
Chris Mayo
58dbe1f282 Remove unused import pytest from tests/checker/test_http.py
pytest.mark.xfail() removed in:
743a5f31 ("Crawl HTML attributes in deterministic order", 2017-02-01)
2020-05-17 20:10:28 +01:00
Chris Mayo
79eafee826 Add a test for VirusCheck 2020-05-17 19:04:49 +01:00
Chris Mayo
a15a2833ca Remove spaces after names in class method definitions
And also nested functions.

This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7 Remove spaces after names in function definitions
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00