Commit graph

76 commits

Author SHA1 Message Date
Chris Mayo
759208d903 Disable VirusCheck plugin
Not compatible with ClamAV >= 1.0.
2025-07-14 19:52:34 +01:00
Marius Gedminas
b5fa5054c9 Fix pylint errors (false positives)
The errors were

    ************* Module linkcheck.ansicolor
    linkcheck/ansicolor.py:190:21: E0606: Possibly using variable 'WinColor' before assignment (possibly-used-before-assignment)
    ************* Module linkcheck.plugins.locationinfo
    linkcheck/plugins/locationinfo.py:107:12: E0606: Possibly using variable 'geoip_error' before assignment (possibly-used-before-assignment)

and while it's true that the variables (WinColor, geoip_error) are
assigned conditionally, their use is protected (indirectly) by the same
conditionals.
2024-05-20 14:10:30 +03:00
Chris Mayo
ffe4a83b88 Document CssSyntaxCheck checks stylesheets 2024-02-26 19:24:07 +00:00
Chris Mayo
f64dd7b06e Fix SslCertificateCheck.__init__() docstring 2023-12-04 19:25:01 +00:00
Chris Mayo
1341ee92bc Fix errors found by Pylint
Remove code for old-style classes:
better_exchook2.py:238:62: E1101: Module 'types' has no 'InstanceType' member (no-member)

Command-line equivalent of 9a33c2a65 ("Make requesting login form password work on Python 3", 2020-04-14):
command/setup_config.py:167:36: E1101: Module 'linkcheck.director.console' has no 'encode' member (no-member)

Add missing argument:
plugins/parseword.py:141:31: E1120: No value for argument 'wrange' in function call (no-value-for-parameter)
2023-05-03 19:24:53 +01:00
Chris Mayo
1f5a1e92d3 Fix VirusCheck.check() docstring 2022-11-30 19:21:06 +00:00
Chris Mayo
8065c75c4e Convert some printf-style strings 2022-11-08 19:21:29 +00:00
Chris Mayo
b6bc366af0 Run pyupgrade --py37-plus x 2 2022-11-08 19:21:29 +00:00
Chris Mayo
fd6c960ace Make more messages translatable 2022-11-08 19:21:29 +00:00
Chris Mayo
0bb1576887 Run pyupgrade --py37-plus --keep-percent-format 2022-11-08 19:21:29 +00:00
Nathan Arthur
4cdaa59fcc Fix AnchorCheck mismatching encoded anchors
Problem identified by Christian Kirchhof.
2022-10-03 19:33:05 +01:00
Nathan Arthur
6499b7b233 Fix a major thread-safety bug in AnchorCheck
The threading issue has been there for years, but I didn't notice it
until after I thought I was done, while I was doing manual testing
(with threads re-enabled).

The problem was with storing URL-specific state (.anchors) on the
AnchorCheck object itself, because there's only one global AnchorCheck
object, so all the threads are competing to use that one simgle variable
(self.anchors).

The solution was to create a new object to hold .anchors, for each
processed URL.
2022-10-03 19:33:05 +01:00
Chris Mayo
54bcefd7d7 Revert "Disable AnchorCheck plugin"
This reverts commit 0356524369.
2022-10-03 19:33:05 +01:00
Kian-Meng Ang
a70ea9ea14 Fix typos
Found via `codespell ./linkcheck/ ./tests ./doc/man/en -L bu,noone,fo,pres,shttp`
2022-09-02 17:20:02 +08:00
Chris Mayo
94781120ac Correct mention of pdfminer in WordParser comment 2022-05-18 19:29:54 +01:00
Chris Mayo
5768b76f6c Use pkgutil to simplify loader.get_package_modules()
Replaces use of __file__.
2021-12-30 19:27:04 +00:00
Chris Mayo
0356524369 Disable AnchorCheck plugin
Can't be relied on. Multiple reports of expected results not returned.

https://github.com/linkchecker/linkchecker/issues/542
https://github.com/linkchecker/linkchecker/issues/555
https://github.com/linkchecker/linkchecker/issues/568

Previously a fix was needed just to get the tests working:
0912e8a2c ("Don't strip the URL fragment from cache key if using AnchorCheck", 2020-07-27)

After:
eaa538c81 ("don't check one url multiple times", 2016-11-09)
2021-11-29 19:35:34 +00:00
Chris Mayo
a3e9c31560 Remove execute bits from parsepdf.py and parseword.py 2021-01-14 19:48:22 +00:00
Chris Mayo
4969b6dd0a
Merge pull request #482 from cjmayo/syntaxcheck
Fix CssSyntaxCheck list index out of range
2020-08-21 16:46:37 +01:00
Chris Mayo
ad71cb4e43 Fix CssSyntaxCheck list index out of range
Errors do not report the column.
2020-08-14 19:25:21 +01:00
Chris Mayo
94dbac1e5e Fix CssSyntaxCheck warning message, CSS not HTML 2020-08-14 19:25:21 +01:00
Chris Mayo
e053b3bc5f HtmlSyntaxCheck disabled because it is broken 2020-08-14 19:25:21 +01:00
Chris Mayo
068a60ee39 SyntaxCheck plugins only work with http
They use a Requests session from url_data.
2020-08-14 19:25:21 +01:00
Chris Mayo
dee21ee9a0 Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
Chris Mayo
1018b8332b Convert PDF URL to a string 2020-07-07 17:25:28 +01:00
Chris Mayo
d91a328224 Remove strformat.unicode_safe() and strformat.url_unicode_split()
All strings support Unicode in Python 3.
2020-07-07 17:25:28 +01:00
Chris Mayo
3bd790c22d Update W3C validator links to use https 2020-06-05 16:59:46 +01:00
Chris Mayo
b987d6f3ca Fix indent in plugins/locationinfo.py 2020-06-05 16:59:46 +01:00
Chris Mayo
1632a1ce26 Fix xgettext Non-ASCII error when translating
xgettext: Non-ASCII character at
../linkcheck/plugins/markdowncheck.py:2.
          Please specify the source encoding through --from-code or through a comment
          as specified in https://www.python.org/peps/pep-0263.html.

make: *** [Makefile:25: linkchecker.pot] Error 1
2020-06-05 16:06:01 +01:00
Chris Mayo
a6b1eb45b1 Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
Chris Mayo
cec9b78f5e Additional review comments on black linkcheck/ 2020-06-03 20:06:36 +01:00
Chris Mayo
ac0967e251 Fix remaining flake8 violations in linkcheck/
linkcheck/better_exchook2.py:28:89: E501 line too long (90 > 88 characters)
linkcheck/better_exchook2.py:155:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:166:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:289:13: E741 ambiguous variable name 'l'
linkcheck/better_exchook2.py:299:9: E722 do not use bare 'except'
linkcheck/containers.py:48:13: E731 do not assign a lambda expression, use a def
linkcheck/ftpparse.py:123:89: E501 line too long (93 > 88 characters)
linkcheck/loader.py:46:47: E203 whitespace before ':'
linkcheck/logconf.py:45:29: E231 missing whitespace after ','
linkcheck/robotparser2.py:157:89: E501 line too long (95 > 88 characters)
linkcheck/robotparser2.py:182:89: E501 line too long (89 > 88 characters)
linkcheck/strformat.py:181:16: E203 whitespace before ':'
linkcheck/strformat.py:181:43: E203 whitespace before ':'
linkcheck/strformat.py:253:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:254:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:341:89: E501 line too long (111 > 88 characters)
linkcheck/url.py:102:32: E203 whitespace before ':'
linkcheck/url.py:277:5: E741 ambiguous variable name 'l'
linkcheck/url.py:402:5: E741 ambiguous variable name 'l'
linkcheck/checker/__init__.py:203:1: E402 module level import not at top of file
linkcheck/checker/fileurl.py:200:89: E501 line too long (103 > 88 characters)
linkcheck/checker/mailtourl.py:122:60: E203 whitespace before ':'
linkcheck/checker/mailtourl.py:157:89: E501 line too long (96 > 88 characters)
linkcheck/checker/mailtourl.py:190:89: E501 line too long (109 > 88 characters)
linkcheck/checker/mailtourl.py:200:89: E501 line too long (111 > 88 characters)
linkcheck/checker/mailtourl.py:249:89: E501 line too long (106 > 88 characters)
linkcheck/checker/unknownurl.py:226:23: W291 trailing whitespace
linkcheck/checker/urlbase.py:245:89: E501 line too long (101 > 88 characters)
linkcheck/configuration/confparse.py:236:89: E501 line too long (186 > 88 characters)
linkcheck/configuration/confparse.py:247:89: E501 line too long (111 > 88 characters)
linkcheck/configuration/__init__.py:164:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:184:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:190:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:195:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:198:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:435:89: E501 line too long (90 > 88 characters)
linkcheck/director/aggregator.py:45:43: E231 missing whitespace after ','
linkcheck/director/aggregator.py:178:89: E501 line too long (106 > 88 characters)
linkcheck/logger/__init__.py:29:1: E731 do not assign a lambda expression, use a def
linkcheck/logger/__init__.py:108:13: E741 ambiguous variable name 'l'
linkcheck/logger/__init__.py:275:19: F821 undefined name '_'
linkcheck/logger/__init__.py:342:16: F821 undefined name '_'
linkcheck/logger/__init__.py:380:13: F821 undefined name '_'
linkcheck/logger/__init__.py:384:13: F821 undefined name '_'
linkcheck/logger/__init__.py:387:13: F821 undefined name '_'
linkcheck/logger/__init__.py:396:13: F821 undefined name '_'
linkcheck/network/__init__.py:1:1: W391 blank line at end of file
linkcheck/plugins/locationinfo.py:89:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/locationinfo.py:91:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/markdowncheck.py:112:89: E501 line too long (111 > 88 characters)
linkcheck/plugins/markdowncheck.py:141:9: E741 ambiguous variable name 'l'
linkcheck/plugins/markdowncheck.py:165:23: E203 whitespace before ':'
linkcheck/plugins/viruscheck.py:95:42: E203 whitespace before ':'
2020-05-30 17:01:36 +01:00
Chris Mayo
a92a684ac4 Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
Chris Mayo
ebcc3c4961 Remove str_text from plugins/ 2020-05-19 19:56:42 +01:00
Marius Gedminas
bb53aaa621 Fix viruscheck plugin
The clamav interface needs bytes, not unicode.

It would be nice if we had tests for this code.
2020-05-17 17:50:11 +01:00
Chris Mayo
a15a2833ca Remove spaces after names in class method definitions
And also nested functions.

This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7 Remove spaces after names in function definitions
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
fc11d08968 Remove spaces after names in class definitions 2020-05-16 20:19:42 +01:00
Chris Mayo
42de609f8e Make urllib imports Python 3 only 2020-05-14 20:15:28 +01:00
Chris Mayo
736c893707
Merge pull request #377 from cjmayo/tidyten3
Remove u string prefixes
2020-05-13 19:36:54 +01:00
Chris Mayo
44e81d27dd Remove inheriting object
All Python 3 classes are new-style.
2020-05-08 10:45:31 +01:00
Chris Mayo
b0ea72e8c1 Remove # -*- coding: lines
Except for tests that include non-unicode characters:

tests/test_po.py
tests/test_strformat.py
tests/test_url.py
tests/checker/test_error.py
tests/checker/test_news.py
2020-05-08 10:45:31 +01:00
Chris Mayo
4d3e5abcfa Remove u string prefixes 2020-04-30 20:11:59 +01:00
Chris Mayo
9eed070a73 Stop using HTML handlers
LinkFinder is the only remaining HTML handler therefore no need for
htmlsoup.process_soup() as an independent function or TagFinder as a
base class.
2020-04-29 20:07:00 +01:00
Chris Mayo
b7ec71d8cc Always use utf-8 encoding when quoting 2019-10-05 19:38:57 +01:00
Marius Gedminas
8bdd402aed
Merge pull request #333 from linkchecker/fix-clamav-on-py3
Fix test_clamav.py on Python 3
2019-10-25 16:16:23 +03:00
Marius Gedminas
f9766a2049 Python 3: fix bytes vs strings in viruscheck plugin
Socket communication deals with bytes.

There are probably remaining issues with the viruscheck plugin on
Python 3, we just can't see them because the code is not fully covered
with tests.
2019-10-25 14:24:07 +03:00
Chris Mayo
b2e63663f8 Make PdfParser Python 3 compatible
basestring is not available in Python 3. Ensure all URLs are Unicode.

url_data.get_raw_content() is returning bytes.
2019-10-24 19:57:27 +01:00
Marius Gedminas
938467c3ae
Merge pull request #324 from cjmayo/pdfminer
Add pdfminer to tox.ini and dev-requirements.txt to enable pdf test
2019-10-23 09:47:01 +03:00
Marius Gedminas
fa32a89d6b Fix MS Word parser, hopefully
MS Word files are binary data, and get_temp_filename() will write them
to disk using open(..., 'wb'), so we want to pass bytes in there, not
Unicode.

See #323.
2019-10-22 16:39:57 +03:00