Chris Mayo
e9db151145
Merge pull request #480 from cjmayo/blacklist
...
Fix blacklist updating
2020-08-20 19:48:59 +01:00
Chris Mayo
b869b8876f
Avoid dependency on gsettings-desktop-schemas
...
Gio.Settings.new() causes LinkChecker to exit if the GNOME proxy schema
cannot be found.
2020-08-20 19:42:44 +01:00
Chris Mayo
cfe5c89eb6
Merge pull request #479 from cjmayo/versions
...
Add missing essential modules to internal error message
2020-08-20 19:36:45 +01:00
Chris Mayo
d7efa20d33
Remove unused constants from url.py
2020-08-19 19:27:28 +01:00
Chris Mayo
be24836c73
Remove unused url.url_unsplit()
2020-08-18 19:57:46 +01:00
Chris Mayo
d58b3ab285
Remove unused url.url_fix_common_typos()
2020-08-18 19:57:46 +01:00
Chris Mayo
9488e1eb41
Remove unused url.is_safe_x matches
2020-08-18 19:57:46 +01:00
Chris Mayo
71ea78382b
Remove unused url.safe_host_pattern()
2020-08-18 19:57:46 +01:00
Chris Mayo
794efd6d44
Remove unused url.is_duplicate_content_url()
2020-08-18 19:57:46 +01:00
Chris Mayo
e372657fb8
Remove unused url.get_content()
2020-08-18 19:57:46 +01:00
Chris Mayo
e4ba9c84ce
Remove unused url.match_{host,url}()
...
Removes deprecation warnings for urllib.parse.split{host,type}() in
url_split()
2020-08-18 19:57:46 +01:00
Chris Mayo
b32fe6f692
Merge pull request #478 from cjmayo/imp
...
Fix deprecation warning for use of the imp module
2020-08-18 19:56:40 +01:00
Chris Mayo
4ad20d7f03
Merge pull request #477 from cjmayo/sitemap
...
Detect sitemaps that do not start with an XML declaration
2020-08-18 19:51:32 +01:00
Chris Mayo
5d83e93829
Merge pull request #475 from cjmayo/iana
...
Update IANA scripts and ignored schemes
2020-08-18 19:40:35 +01:00
Chris Mayo
0086c28b3a
Merge pull request #474 from cjmayo/srcset
...
Fix problems with trailing commas and data: URIs in srcset values
2020-08-15 16:58:38 +01:00
Chris Mayo
0269fd88b0
Merge pull request #473 from cjmayo/valueerror
...
Fix critical exception when parsing a URL with a ]
2020-08-15 16:51:17 +01:00
Chris Mayo
88566ad20a
Merge pull request #472 from cjmayo/baseref
...
Fix CSV logger not recognising base part setting
2020-08-15 16:41:57 +01:00
Chris Mayo
525b6751a9
Merge pull request #468 from cjmayo/interrupter
...
Rename director/interrupt.py to director/interrupter.py
2020-08-15 16:31:33 +01:00
Chris Mayo
ccaa882d50
Merge pull request #471 from cjmayo/status
...
Fix status=0 setting being ignored
2020-08-14 20:02:01 +01:00
Chris Mayo
33a5444dea
Merge pull request #469 from cjmayo/checklink
...
Remove defaults from lc_cgi.checklink()
2020-08-14 19:57:03 +01:00
Chris Mayo
5aa2ddce4d
Merge pull request #461 from cjmayo/docstrings
...
Fix formatting and typos in docstrings
2020-08-14 19:45:41 +01:00
Chris Mayo
7ee151ebbf
Don't translate "Retry-After" server header field
...
It is defined in RFC 7231.
2020-08-14 19:29:19 +01:00
Chris Mayo
ad71cb4e43
Fix CssSyntaxCheck list index out of range
...
Errors do not report the column.
2020-08-14 19:25:21 +01:00
Chris Mayo
94dbac1e5e
Fix CssSyntaxCheck warning message, CSS not HTML
2020-08-14 19:25:21 +01:00
Chris Mayo
e053b3bc5f
HtmlSyntaxCheck disabled because it is broken
2020-08-14 19:25:21 +01:00
Chris Mayo
068a60ee39
SyntaxCheck plugins only work with http
...
They use a Requests session from url_data.
2020-08-14 19:25:21 +01:00
Chris Mayo
7d950cf848
Fix blacklist updating
...
A second run creates an additional entry in blacklist rather than
updating the original:
1 '"(\'http://localhost/broken.html\', \'http://localhost/nosuchlink.html\')"'
1 "('http://localhost/broken.html', 'http://localhost/nosuchlink.html')"
Broken since at least 9.3:
1 "(u'http://localhost/broken.html', u'http://localhost/nosuchlink.html')"
1 u'"(u\'http://localhost/broken.html\', u\'http://localhost/nosuchlink.html\')"'
If such an entry is found LinkChecker will now halt. Either remove
the entry or the whole file.
2020-08-13 19:32:21 +01:00
Chris Mayo
682bdbeab4
Add missing essential modules to internal error message
2020-08-12 19:38:40 +01:00
Chris Mayo
8c804c35a5
Detect sitemaps that do not start with an XML declaration
2020-08-11 19:35:56 +01:00
Chris Mayo
658c8051f0
Fix deprecation warning for use of the imp module
2020-08-10 19:32:04 +01:00
Chris Mayo
80763ed1ea
Add slack to the list of ignored schemes
...
slack:// is a way to interact with a local Slack client [1], and is not
something that LinkChecker can check.
[1] https://api.slack.com/reference/deep-linking#client
2020-08-09 17:10:26 +01:00
Chris Mayo
f19fd4f5bc
Update IANA scripts and ignored schemes (2020-07-28)
2020-08-09 17:10:26 +01:00
Chris Mayo
d5690203fc
Fix critical exception when parsing a URL with a ]
...
e.g.:
<a href="http://localhost]">square</a>
Causes urllib to raise a ValueError:
File "/usr/lib/python3.8/site-packages/linkcheck/url.py", line 315, in url_norm
line: urlparts = list(urllib.parse.urlsplit(url))
locals:
urlparts = <not found>
list = <builtin> <class 'list'>
urllib = <global> <module 'urllib' from '/usr/lib/python3.8/urllib/__init__.py'>
urllib.parse = <global> <module 'urllib.parse' from '/usr/lib/python3.8/urllib/parse.py'>
urllib.parse.urlsplit = <global> <function urlsplit at 0x7f950e699e50>
url = <local> 'http://localhost]', len = 17
File "/usr/lib/python3.8/urllib/parse.py", line 440, in urlsplit
line: raise ValueError("Invalid IPv6 URL")
locals:
ValueError = <builtin> <class 'ValueError'>
2020-08-08 16:47:31 +01:00
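The ValueError reported in d5690203fc can be reproduced with the standard library alone. The sketch below is only an illustration: safe_urlsplit() is a hypothetical helper, not LinkChecker's actual fix, and it simply shows one way to catch the exception that urllib.parse.urlsplit() raises for a host containing an unmatched ].

```python
from urllib.parse import urlsplit


def safe_urlsplit(url):
    """Return the URL parts as a list, or None if urllib rejects the URL.

    Hypothetical helper for illustration; not part of LinkChecker.
    """
    try:
        # A list (rather than the SplitResult tuple) allows item assignment.
        return list(urlsplit(url))
    except ValueError:
        # urlsplit() raises ValueError for an unmatched bracket in the host,
        # e.g. "Invalid IPv6 URL".
        return None


if __name__ == "__main__":
    print(safe_urlsplit("http://localhost]"))
    print(safe_urlsplit("http://localhost/index.html"))
```

A caller would then treat a None result as a syntax problem with the link rather than letting the exception propagate as an internal error.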
Chris Mayo
27f22ae17a
Fix treating data: URIs in srcset values as links
2020-08-07 20:04:23 +01:00
Chris Mayo
7ba4053710
Fix critical exception if srcset value ends with a comma
...
Log a debug message, as this is a minor syntax problem that won't stop
LinkChecker parsing strings up to the comma.
2020-08-07 20:04:23 +01:00
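The two srcset fixes above (7ba4053710 and 27f22ae17a) concern a trailing comma and data: URIs. The following is a simplified sketch of how such values might be handled, not the code LinkChecker uses: srcset_urls() is a hypothetical name, and the parser ignores the parenthesised descriptors that the full HTML algorithm allows.

```python
def srcset_urls(value):
    """Extract checkable URLs from a srcset attribute value.

    Simplified, hypothetical parser: a URL is a run of non-whitespace
    characters; descriptors after it run up to the next comma.
    """
    urls = []
    pos = 0
    end = len(value)
    while pos < end:
        # Skip whitespace and separating commas, including a trailing one.
        while pos < end and (value[pos].isspace() or value[pos] == ","):
            pos += 1
        if pos >= end:
            break
        start = pos
        while pos < end and not value[pos].isspace():
            pos += 1
        url = value[start:pos]
        if url.endswith(","):
            # The comma directly after the URL ends this candidate.
            url = url.rstrip(",")
        else:
            # Skip descriptors such as "2x" or "480w".
            while pos < end and value[pos] != ",":
                pos += 1
        # data: URIs embed their content and are not links to check.
        if url and not url.startswith("data:"):
            urls.append(url)
    return urls


if __name__ == "__main__":
    print(srcset_urls("a.png 1x, b.png 2x,"))
    print(srcset_urls("data:image/png;base64,AAAA 1x, c.png 2x"))
```

Reading the URL as a run of non-whitespace characters, instead of splitting the whole value on commas, is what keeps the comma inside a data: URI from being mistaken for a separator.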
Chris Mayo
f3a823fb5b
Fix CSV logger not recognising base part setting
2020-08-07 19:45:24 +01:00
Chris Mayo
4f3f1ac0d4
Fix status=0 setting being ignored
...
- Set the correct default for the setting in configuration.Configuration
- Detect when the argument is not passed by setting the default to None
(store_false sets the default to True)
2020-08-06 19:32:33 +01:00
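The second point in 4f3f1ac0d4 relies on argparse behaviour: store_false defaults to True, so an explicit default of None is needed to tell "option not given" apart from "option given". A minimal sketch follows, assuming an illustrative --no-status flag and a hypothetical resolve_status() helper; the real option handling in LinkChecker may differ.

```python
import argparse


def resolve_status(argv, config_status):
    """Return the effective status setting.

    Hypothetical helper: config_status stands in for the value read from
    the configuration file (e.g. status=0 gives False).
    """
    parser = argparse.ArgumentParser()
    # default=None overrides the implicit default of True for store_false,
    # so an absent flag can be detected after parsing.
    parser.add_argument(
        "--no-status", dest="status", action="store_false", default=None
    )
    args = parser.parse_args(argv)
    if args.status is None:
        # Flag not passed: keep whatever the configuration file said.
        return config_status
    return args.status


if __name__ == "__main__":
    print(resolve_status([], config_status=False))
    print(resolve_status(["--no-status"], config_status=True))
```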
Chris Mayo
40b2ebff8f
Remove defaults from lc_cgi.checklink()
...
Only called from application() with arguments. Causes local environment
to be embedded in documentation when using Sphinx autodoc.
2020-08-05 19:54:56 +01:00
Chris Mayo
46b9e6b169
Rename director/interrupt.py to director/interrupter.py
...
Avoid a clash with director.interrupt() when automatically documenting.
2020-08-03 19:48:07 +01:00
Chris Mayo
0912e8a2c1
Don't strip the URL fragment from cache key if using AnchorCheck
...
Otherwise, once one URL for a page has been checked, URLs with different
fragments are skipped and not passed to AnchorCheck.
eaa538c ("don't check one url multiple times", 2016-11-09)
2020-07-27 19:25:30 +01:00
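The idea in 0912e8a2c1 can be sketched with urllib.parse.urldefrag(). This is an assumption-laden illustration: cache_key() and its anchor_check_enabled parameter are hypothetical names, and LinkChecker's real cache key is built differently.

```python
from urllib.parse import urldefrag


def cache_key(url, anchor_check_enabled):
    """Return the key under which a checked URL is cached (hypothetical)."""
    if anchor_check_enabled:
        # Keep the fragment so page.html#a and page.html#b are both checked.
        return url
    # Without anchor checking the fragment does not change the result.
    return urldefrag(url).url


if __name__ == "__main__":
    print(cache_key("http://localhost/page.html#intro", False))
    print(cache_key("http://localhost/page.html#intro", True))
```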
Chris Mayo
dee21ee9a0
Fix formatting and typos in docstrings
2020-07-25 16:35:48 +01:00
Chris Mayo
500c13e2cb
Log a debug message when a cached URL is skipped
...
Skipping introduced in:
eaa538c8 ("don't check one url multiple times", 2016-11-09)
2020-07-21 19:54:18 +01:00
Chris Mayo
a977e4d712
Merge pull request #444 from cjmayo/isinstance
...
Remove or replace uses of isinstance()
2020-07-08 19:55:29 +01:00
Chris Mayo
7a0644a234
No need to process an empty string in str_format.ascii_safe()
2020-07-08 19:47:59 +01:00
Chris Mayo
b328520f08
Convert UrlBase syntax Exception to a string
...
Causes an exception when logging.
2020-07-07 17:25:28 +01:00
Chris Mayo
53bd5c4d21
Remove HttpUrl.getheader()
2020-07-07 17:25:28 +01:00
Chris Mayo
1018b8332b
Convert PDF URL to a string
2020-07-07 17:25:28 +01:00
Chris Mayo
3fcee872b6
urlparts need to support assignment
2020-07-07 17:25:28 +01:00
Chris Mayo
d91a328224
Remove strformat.unicode_safe() and strformat.url_unicode_split()
...
All strings support Unicode in Python 3.
2020-07-07 17:25:28 +01:00
Chris Mayo
4cb5b6f2fa
Merge pull request #443 from cjmayo/kde5
...
Replace KDE 3 proxy support with KDE 5 support
2020-07-07 17:12:53 +01:00
Chris Mayo
18f20d592f
Check for KDE 5 proxy first and then KDE 4
...
Don't look for kde4-config in case a KDE 5 user still has it installed.
2020-07-07 17:06:25 +01:00
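The lookup order described in 18f20d592f, together with the kioslaverc locations given in 29b292c90f, might look like the sketch below. The function names are hypothetical and this is not LinkChecker's code; only the two paths come from the commit messages.

```python
import os


def kde_config_candidates(home):
    """Return kioslaverc locations in lookup order: KDE 5 first, then KDE 4."""
    return [
        os.path.join(home, ".config", "kioslaverc"),
        os.path.join(home, ".kde4", "share", "config", "kioslaverc"),
    ]


def find_kde_config(home):
    """Return the first existing kioslaverc path, or None if there is none."""
    for path in kde_config_candidates(home):
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_kde_config(os.path.expanduser("~")))
```

Checking for the files directly, rather than calling kde4-config, avoids picking up KDE 4 settings on a machine where a KDE 5 user still has the old tool installed.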
Chris Mayo
bd55c2ef8f
Compare KDE proxy ReversedException integer value to zero
2020-07-07 17:06:25 +01:00
Chris Mayo
da22d4886b
Merge pull request #441 from cjmayo/authentication
...
Improve documentation of authentication
2020-06-23 17:35:19 +01:00
Chris Mayo
085ae188f7
Remove checks for empty loginpasswordfield and loginuserfield
...
These have default values and cannot be reset.
2020-06-23 17:28:31 +01:00
Chris Mayo
1ec3848720
Log problem with login form without exception
2020-06-23 17:28:31 +01:00
Chris Mayo
2f51a9dca0
Improve documentation of authentication
2020-06-23 17:28:31 +01:00
Chris Mayo
d66e64460c
Remove unused code from strformat.py
2020-06-18 19:31:00 +01:00
Chris Mayo
1f77506c9f
Remove isinstance() in url.url_fix_mailto_urlsplit()
...
urls are strings.
2020-06-18 19:27:06 +01:00
Chris Mayo
8f9f687ed8
Remove isinstance() from fileutil.path_safe()
...
paths are derived from urls which are strings.
2020-06-18 19:27:06 +01:00
Chris Mayo
f86e506de4
Remove isinstance() from FileUrl.read_content()
...
get_index_html() returns a string.
2020-06-18 19:27:06 +01:00
Chris Mayo
3231730366
Remove isinstance() from robotparser2.py
...
Originally for encoding Python 2 Unicode strings [1]. Will not be used
in Python 3 because the variables are strings; if they were bytes,
exceptions would be raised.
[1] c97f68f7 ("accept unicode in robots.txt can_fetch", 2004-11-09)
2020-06-18 19:27:06 +01:00
Chris Mayo
9c9a3d8b14
Remove isinstance() from url.idna_encode()
...
Was originally used for Python 2 Unicode strings.
f4b73c6d ("Python3: fix unicode in url.py", 2018-01-05)
2020-06-18 19:27:06 +01:00
Chris Mayo
3a6540bc46
Replace isinstance() in strformat.ascii_safe()
2020-06-18 19:27:06 +01:00
Chris Mayo
4009039158
Merge pull request #420 from cjmayo/dconf
...
Update GNOME proxy support for GNOME 3 and Python 3
2020-06-14 18:56:19 +01:00
Chris Mayo
b6004fb6b1
Simplify and add debug messages to KDE proxy retrieval
2020-06-08 17:00:10 +01:00
Chris Mayo
29b292c90f
Replace KDE 3 proxy support with KDE 5 support
...
KDE 3 was superseded in 2008.
KDE 4 uses: ${HOME}/.kde4/share/config/kioslaverc
KDE 5 (Kubuntu) uses: ${HOME}/.config/kioslaverc
Default ReversedException is false
2020-06-08 17:00:10 +01:00
Chris Mayo
9108afeee5
Add html.escape on URLs in logger/html.py
2020-06-05 16:59:46 +01:00
Chris Mayo
eeb5fa48ca
Update configuration/confparse.py log message to https
2020-06-05 16:59:46 +01:00
Chris Mayo
0191b021f4
Make configuration/confparse.py log message translatable
2020-06-05 16:59:46 +01:00
Chris Mayo
36246c15ac
Update various comments to https
2020-06-05 16:59:46 +01:00
Chris Mayo
3bd790c22d
Update W3C validator links to use https
2020-06-05 16:59:46 +01:00
Chris Mayo
b987d6f3ca
Fix indent in plugins/locationinfo.py
2020-06-05 16:59:46 +01:00
Chris Mayo
4330b8a59e
Replace codecs.open() with open()
2020-06-05 16:59:46 +01:00
Chris Mayo
b9c8e33878
Update GNOME proxy support for GNOME 3 and Python 3
...
GConf is replaced by dconf and the GSettings API in GNOME 3.
2020-06-05 16:29:45 +01:00
Chris Mayo
e207ac54ce
Merge pull request #437 from cjmayo/translate
...
Update man page translation and fixes for application translation process
2020-06-05 16:17:06 +01:00
Chris Mayo
1632a1ce26
Fix xgettext Non-ASCII error when translating
...
xgettext: Non-ASCII character at
../linkcheck/plugins/markdowncheck.py:2.
Please specify the source encoding through --from-code or through a comment
as specified in https://www.python.org/peps/pep-0263.html.
make: *** [Makefile:25: linkchecker.pot] Error 1
2020-06-05 16:06:01 +01:00
Chris Mayo
d591fedb60
Remove unused updater code that supports linkchecker-gui
...
pip provides update support for linkchecker.
2020-06-05 16:05:25 +01:00
Chris Mayo
a6b1eb45b1
Convert to Python 3 super()
2020-06-03 20:06:36 +01:00
Chris Mayo
cec9b78f5e
Additional review comments on black linkcheck/
2020-06-03 20:06:36 +01:00
Chris Mayo
6b3cb18546
Restore better_exchook2.py and colorama.py to pre-Black state
...
These files are based on published packages.
better_exchook2.py was derived from better_exchook.py in:
https://pypi.org/project/better_exchook/
colorama.py was derived from win32.py in:
https://pypi.org/project/colorama/
Files modified in:
a92a684a ("Run black on linkcheck/", 2020-05-30)
2020-06-03 20:06:36 +01:00
Chris Mayo
b974ec3262
Review comments on black linkcheck/
2020-06-01 16:07:21 +01:00
Chris Mayo
ac0967e251
Fix remaining flake8 violations in linkcheck/
...
linkcheck/better_exchook2.py:28:89: E501 line too long (90 > 88 characters)
linkcheck/better_exchook2.py:155:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:166:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:289:13: E741 ambiguous variable name 'l'
linkcheck/better_exchook2.py:299:9: E722 do not use bare 'except'
linkcheck/containers.py:48:13: E731 do not assign a lambda expression, use a def
linkcheck/ftpparse.py:123:89: E501 line too long (93 > 88 characters)
linkcheck/loader.py:46:47: E203 whitespace before ':'
linkcheck/logconf.py:45:29: E231 missing whitespace after ','
linkcheck/robotparser2.py:157:89: E501 line too long (95 > 88 characters)
linkcheck/robotparser2.py:182:89: E501 line too long (89 > 88 characters)
linkcheck/strformat.py:181:16: E203 whitespace before ':'
linkcheck/strformat.py:181:43: E203 whitespace before ':'
linkcheck/strformat.py:253:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:254:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:341:89: E501 line too long (111 > 88 characters)
linkcheck/url.py:102:32: E203 whitespace before ':'
linkcheck/url.py:277:5: E741 ambiguous variable name 'l'
linkcheck/url.py:402:5: E741 ambiguous variable name 'l'
linkcheck/checker/__init__.py:203:1: E402 module level import not at top of file
linkcheck/checker/fileurl.py:200:89: E501 line too long (103 > 88 characters)
linkcheck/checker/mailtourl.py:122:60: E203 whitespace before ':'
linkcheck/checker/mailtourl.py:157:89: E501 line too long (96 > 88 characters)
linkcheck/checker/mailtourl.py:190:89: E501 line too long (109 > 88 characters)
linkcheck/checker/mailtourl.py:200:89: E501 line too long (111 > 88 characters)
linkcheck/checker/mailtourl.py:249:89: E501 line too long (106 > 88 characters)
linkcheck/checker/unknownurl.py:226:23: W291 trailing whitespace
linkcheck/checker/urlbase.py:245:89: E501 line too long (101 > 88 characters)
linkcheck/configuration/confparse.py:236:89: E501 line too long (186 > 88 characters)
linkcheck/configuration/confparse.py:247:89: E501 line too long (111 > 88 characters)
linkcheck/configuration/__init__.py:164:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:184:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:190:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:195:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:198:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:435:89: E501 line too long (90 > 88 characters)
linkcheck/director/aggregator.py:45:43: E231 missing whitespace after ','
linkcheck/director/aggregator.py:178:89: E501 line too long (106 > 88 characters)
linkcheck/logger/__init__.py:29:1: E731 do not assign a lambda expression, use a def
linkcheck/logger/__init__.py:108:13: E741 ambiguous variable name 'l'
linkcheck/logger/__init__.py:275:19: F821 undefined name '_'
linkcheck/logger/__init__.py:342:16: F821 undefined name '_'
linkcheck/logger/__init__.py:380:13: F821 undefined name '_'
linkcheck/logger/__init__.py:384:13: F821 undefined name '_'
linkcheck/logger/__init__.py:387:13: F821 undefined name '_'
linkcheck/logger/__init__.py:396:13: F821 undefined name '_'
linkcheck/network/__init__.py:1:1: W391 blank line at end of file
linkcheck/plugins/locationinfo.py:89:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/locationinfo.py:91:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/markdowncheck.py:112:89: E501 line too long (111 > 88 characters)
linkcheck/plugins/markdowncheck.py:141:9: E741 ambiguous variable name 'l'
linkcheck/plugins/markdowncheck.py:165:23: E203 whitespace before ':'
linkcheck/plugins/viruscheck.py:95:42: E203 whitespace before ':'
2020-05-30 17:01:36 +01:00
Chris Mayo
8dc2f12b94
Address space-separated strings in linkcheck/
2020-05-30 17:01:36 +01:00
Chris Mayo
b9f4864d9e
Remove unnecessary commas before closing brackets in linkcheck/
2020-05-30 17:01:36 +01:00
Chris Mayo
a92a684ac4
Run black on linkcheck/
2020-05-30 17:01:36 +01:00
Chris Mayo
abdb160413
Remove unused bookmarks code that supports linkcheck-gui
...
linkchecker does not need to find a bookmark file, it is given the URL.
Most bookmarks are detected by their MIME type, Firefox is different
because it uses a SQLite database.
2020-05-28 19:44:53 +01:00
Chris Mayo
e204182acb
Remove unused httputil.has_header_value()
2020-05-28 19:44:53 +01:00
Chris Mayo
4d2449bb13
Merge pull request #425 from cjmayo/xdg_config_home
...
Fix xdg_config_home import in bookmarks/chrome.py
2020-05-28 19:18:21 +01:00
Chris Mayo
75349e4dc9
Fix xdg_config_home import in bookmarks/chrome.py
2020-05-27 20:02:07 +01:00
Chris Mayo
a49f42b617
Remove unused mem.py
2020-05-27 20:01:57 +01:00
Chris Mayo
488e72c81f
Ignore imports providing aliases in subpackages
2020-05-26 19:49:59 +01:00
Chris Mayo
97f50e8be1
Remove unused import htmlsoup from checker/httpurl.py
...
Unused since:
f7337f55 ("Fix error due to an empty html file accessed over http", 2020-05-23)
2020-05-25 19:50:57 +01:00
Chris Mayo
3473656fe1
Replace import of distutils.spawn.find_executable with shutil.which
2020-05-25 19:50:57 +01:00
Chris Mayo
6dda2f9669
Move imports to the top of files to resolve flake8 E402
2020-05-25 19:50:57 +01:00
Chris Mayo
0f3444e906
Drop run-time requests version check
...
Requests 2.4.0 was released in 2014.
2020-05-25 19:50:57 +01:00
Chris Mayo
89c7c74bcf
Remove unused set_linecache() from better_exchook2.py
2020-05-25 19:50:57 +01:00
Chris Mayo
7257e5e1a0
Remove unused imports in parser/__init__.py
2020-05-25 19:50:57 +01:00
Chris Mayo
313a14ff0d
Remove instances of Python 2 unicode
2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
...
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
Marius Gedminas
dcafa2df75
Avoid u-prefixed strings
...
linkchecker is Python 3 only, all strings are unicode.
2020-05-24 14:50:07 +03:00
Chris Mayo
03b1c4919d
Record encoding in debug log messages
2020-05-23 20:01:24 +01:00
Chris Mayo
f7337f55e8
Fix error due to an empty html file accessed over http
...
Use the already fixed [1] UrlBase.get_content() in HttpUrl.
[1] 5bd1fb4 ("Fix internal error on empty HTML files", 2020-05-21)
2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting
2020-05-23 14:15:52 +03:00
Marius Gedminas
6dffacf17f
Merge pull request #409 from linkchecker/fix-login-timeouts
...
Make sure login form fetching uses a timeout and sends User-Agent
2020-05-22 21:40:48 +03:00
Marius Gedminas
b0435b3d47
Make sure login form fetching uses a timeout
...
Also resolve an XXX comment about the User-Agent header (which is
configured in new_request_session), but add a couple of XXX comments
about using proxy and possibly disabling TLS certificate checking.
2020-05-22 11:19:51 +03:00
Marius Gedminas
4f3fe5e1c3
Make sure fetching robots.txt uses the configured timeout
...
Closes #396.
2020-05-22 10:53:33 +03:00
Marius Gedminas
c60d7c66e4
Clarify the decision to fall back to Latin-1
2020-05-21 19:35:39 +03:00
Marius Gedminas
5bd1fb4e36
Fix internal error on empty HTML files
...
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None. It doesn't matter what encoding we pick for
empty files, so let's just pick one.
I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.
Closes #392.
2020-05-21 19:01:33 +03:00
Chris Mayo
6cfc8eeb49
Replace threading.Thread.setName() with setting the name property
...
As recommended in:
https://docs.python.org/3.5/library/threading.html#threading.Thread.setName
2020-05-20 19:58:44 +01:00
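For reference, the change in 6cfc8eeb49 amounts to assigning the name attribute instead of calling setName(). A minimal sketch, with make_named_thread() as a hypothetical helper:

```python
import threading


def make_named_thread(name):
    """Create a thread (not started) and name it via the name property."""
    thread = threading.Thread(target=lambda: None)
    # Preferred over the older thread.setName(name).
    thread.name = name
    return thread


if __name__ == "__main__":
    print(make_named_thread("CheckThread-example").name)
```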
Chris Mayo
42eba19a7d
No need to encode url in Checker.check_url_data()
...
Was causing b'' in log messages e.g. CheckThread-b'http:...
2020-05-20 19:58:44 +01:00
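The b'' seen in log messages before 42eba19a7d follows from how Python 3 formats bytes. The short sketch below uses illustrative variable names and is not taken from Checker.check_url_data().

```python
url = "http://localhost/"

# Formatting an encoded URL embeds the bytes repr, including the b'' wrapper.
encoded_name = "CheckThread-%s" % url.encode("ascii")

# Formatting the str directly gives the intended name.
plain_name = "CheckThread-%s" % url

if __name__ == "__main__":
    print(encoded_name)
    print(plain_name)
```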
Chris Mayo
28f4587dfa
Remove str_text from fileutil.py, strformat.py and url.py
2020-05-19 19:56:42 +01:00
Chris Mayo
ebcc3c4961
Remove str_text from plugins/
2020-05-19 19:56:42 +01:00
Chris Mayo
1c14583535
Remove str_text from logger/
2020-05-19 19:56:42 +01:00
Chris Mayo
6bddd4ac60
Remove str_text from checker/
2020-05-19 19:56:42 +01:00
Chris Mayo
a127902607
Replace str_text in asserts
2020-05-19 19:56:42 +01:00
Chris Mayo
7490804e2c
Merge pull request #395 from cjmayo/tidyten11
...
Remove unused code from linkcheck/fileutil.py
2020-05-19 19:45:08 +01:00
Marius Gedminas
e6e969f975
Merge pull request #391 from linkchecker/dev-version
...
Bump version in git to 10.0.0.dev0
2020-05-19 18:49:34 +03:00
Chris Mayo
690605c519
Remove unused code from linkcheck/fileutil.py
2020-05-18 19:29:55 +01:00
Marius Gedminas
5317347e54
Avoid distutils.version.StrictVersion
...
distutils.version is old code that predates PEP 440. We could add a
dependency on https://packaging.pypa.io/en/latest/version/, but meh.
2020-05-17 21:12:43 +03:00
Marius Gedminas
bb53aaa621
Fix viruscheck plugin
...
The clamav interface needs bytes, not unicode.
It would be nice if we had tests for this code.
2020-05-17 17:50:11 +01:00
Chris Mayo
a15a2833ca
Remove spaces after names in class method definitions
...
And also nested functions.
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7
Remove spaces after names in function definitions
...
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
fc11d08968
Remove spaces after names in class definitions
2020-05-16 20:19:42 +01:00
Chris Mayo
1416a08119
On Python 3 no need to convert os.linesep to a string
2020-05-16 17:02:01 +01:00
Chris Mayo
0752408a44
Remove Python 2 use of sys.stdout in i18n.get_encoded_writer()
2020-05-16 17:02:00 +01:00
Chris Mayo
2c2e7e55ac
Remove CSVLogger.encode_row_s()
...
Introduced during Python 3 conversion to maintain Python 2 support:
55a7973b ("Python3: fix csvlog", 2016-12-04)
2020-05-16 17:02:00 +01:00
Chris Mayo
ed13a926d3
Remove setting Python 2 xmlparser.returns_unicode
2020-05-16 17:02:00 +01:00
Chris Mayo
025637b08d
Remove Python 2 cookielib import
2020-05-16 16:26:38 +01:00
Chris Mayo
1e277444f4
Remove Python 2 thread import
2020-05-16 16:26:34 +01:00
Chris Mayo
dcbddfe045
Remove Python 2 ConfigParser import
2020-05-15 19:37:04 +01:00
Chris Mayo
f8c9faec1b
Remove Python 2 cStringIO imports
2020-05-15 19:37:04 +01:00
Chris Mayo
bda9612273
Make html.escape Python 3 only
2020-05-14 20:15:28 +01:00
Chris Mayo
42de609f8e
Make urllib imports Python 3 only
2020-05-14 20:15:28 +01:00
Chris Mayo
3c661a83d0
Replace parse_host_port() in checker.proxysupport with url.splitport()
2020-05-14 20:15:28 +01:00
Chris Mayo
c80002437e
Update run-time version check
2020-05-13 19:50:19 +01:00
Chris Mayo
08ddf658bc
Merge pull request #366 from cjmayo/userorpwd
...
Support login forms with user and/or password
2020-05-13 19:37:44 +01:00
Chris Mayo
736c893707
Merge pull request #377 from cjmayo/tidyten3
...
Remove u string prefixes
2020-05-13 19:36:54 +01:00
Chris Mayo
3ace021264
Support login forms with user and/or password
2020-05-13 19:32:25 +01:00
Chris Mayo
44e81d27dd
Remove inheriting object
...
All Python 3 classes are new-style.
2020-05-08 10:45:31 +01:00
Chris Mayo
b0ea72e8c1
Remove # -*- coding: lines
...
Except for tests that include non-unicode characters:
tests/test_po.py
tests/test_strformat.py
tests/test_url.py
tests/checker/test_error.py
tests/checker/test_news.py
2020-05-08 10:45:31 +01:00
Marius Gedminas
22b0165b72
Make _Logger an abstract base class
...
The __metaclass__ syntax is a Python-2-ism. It was replaced with
class _Logger (object, metaclass=abc.ABCMeta):
in Python 3. And then Python 3.4 introduced abc.ABC which is an empty
class that has ABCMeta as the metaclass, making it simpler to define
abstract base classes.
2020-04-30 23:09:42 +03:00
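The abc.ABC form mentioned in 22b0165b72 can be illustrated as follows. BaseLogger, TextLogger and log_url() are hypothetical stand-ins used for the sketch, not the actual _Logger interface.

```python
import abc


class BaseLogger(abc.ABC):
    """Abstract base class; abc.ABC already has ABCMeta as its metaclass."""

    @abc.abstractmethod
    def log_url(self, url):
        """Subclasses must implement this."""


class TextLogger(BaseLogger):
    def log_url(self, url):
        return "URL %s" % url


if __name__ == "__main__":
    print(TextLogger().log_url("http://localhost/"))
    try:
        BaseLogger()
    except TypeError as exc:
        # Abstract classes cannot be instantiated directly.
        print("TypeError:", exc)
```

With the Python 2 __metaclass__ attribute, Python 3 silently ignores the metaclass, so the abstract methods would not be enforced.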
Chris Mayo
4d3e5abcfa
Remove u string prefixes
2020-04-30 20:11:59 +01:00
anarcat
ab476fa4bf
Merge pull request #364 from cjmayo/parser5
...
Stop using HTML handlers and improve login form error handling
2020-04-30 09:28:48 -04:00
Chris Mayo
12a948894b
Fix space style in linkcheck/htmlutil/linkparse.py
2020-04-29 20:07:00 +01:00
Chris Mayo
9eed070a73
Stop using HTML handlers
...
LinkFinder is the only remaining HTML handler therefore no need for
htmlsoup.process_soup() as an independent function or TagFinder as a
base class.
2020-04-29 20:07:00 +01:00
Chris Mayo
4ffdbf2406
Replace MetaRobotsFinder using BeautifulSoup.find()
2020-04-29 20:07:00 +01:00
Chris Mayo
a51f02cf66
Improve error handling and debugging for login form
2020-04-27 18:06:29 +01:00
Chris Mayo
9a33c2a659
Make requesting login form password work on Python 3
2020-04-27 18:06:29 +01:00
Chris Mayo
8fc0dcc055
Make matching login form credentials case-sensitive
...
The keys of the form.data dictionary are case-sensitive and therefore a
KeyError was possible if the configured values are not identical to
the input element name attributes.
2020-04-27 18:06:29 +01:00
Chris Mayo
7a6ef938cc
Rename htmlutil.formsearch to htmlutil.loginformsearch
...
Make it clear that this module has only one specific use.
2020-04-27 18:06:29 +01:00