63 commits

Author SHA1 Message Date
Kian-Meng Ang
a70ea9ea14 Fix typos
Found via `codespell ./linkcheck/ ./tests ./doc/man/en -L bu,noone,fo,pres,shttp`
2022-09-02 17:20:02 +08:00
Chris Mayo
94781120ac Correct mention of pdfminer in WordParser comment 2022-05-18 19:29:54 +01:00
Chris Mayo
5768b76f6c Use pkgutil to simplify loader.get_package_modules()
Replaces use of __file__.
2021-12-30 19:27:04 +00:00
Chris Mayo
0356524369 Disable AnchorCheck plugin
Can't be relied on. Multiple reports of expected results not returned.

https://github.com/linkchecker/linkchecker/issues/542
https://github.com/linkchecker/linkchecker/issues/555
https://github.com/linkchecker/linkchecker/issues/568

Previously a fix was needed just to get the tests working:
0912e8a2c ("Don't strip the URL fragment from cache key if using AnchorCheck", 2020-07-27)

After:
eaa538c81 ("don't check one url multiple times", 2016-11-09)
2021-11-29 19:35:34 +00:00
Chris Mayo
a3e9c31560 Remove execute bits from parsepdf.py and parseword.py 2021-01-14 19:48:22 +00:00
Chris Mayo
4969b6dd0a
Merge pull request #482 from cjmayo/syntaxcheck
Fix CssSyntaxCheck list index out of range
2020-08-21 16:46:37 +01:00
Chris Mayo
ad71cb4e43 Fix CssSyntaxCheck list index out of range
Errors do not report the column.
2020-08-14 19:25:21 +01:00
Chris Mayo
94dbac1e5e Fix CssSyntaxCheck warning message, CSS not HTML 2020-08-14 19:25:21 +01:00
Chris Mayo
e053b3bc5f HtmlSyntaxCheck disabled because it is broken 2020-08-14 19:25:21 +01:00
Chris Mayo
068a60ee39 SyntaxCheck plugins only work with http
They use a Requests session from url_data.
2020-08-14 19:25:21 +01:00
Chris Mayo
dee21ee9a0 Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
Chris Mayo
1018b8332b Convert PDF URL to a string 2020-07-07 17:25:28 +01:00
Chris Mayo
d91a328224 Remove strformat.unicode_safe() and strformat.url_unicode_split()
All strings support Unicode in Python 3.
2020-07-07 17:25:28 +01:00
Chris Mayo
3bd790c22d Update W3C validator links to use https 2020-06-05 16:59:46 +01:00
Chris Mayo
b987d6f3ca Fix indent in plugins/locationinfo.py 2020-06-05 16:59:46 +01:00
Chris Mayo
1632a1ce26 Fix xgettext Non-ASCII error when translating
xgettext: Non-ASCII character at
../linkcheck/plugins/markdowncheck.py:2.
          Please specify the source encoding through --from-code or through a comment
          as specified in https://www.python.org/peps/pep-0263.html.

make: *** [Makefile:25: linkchecker.pot] Error 1
2020-06-05 16:06:01 +01:00
Chris Mayo
a6b1eb45b1 Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
Chris Mayo
cec9b78f5e Additional review comments on black linkcheck/ 2020-06-03 20:06:36 +01:00
Chris Mayo
ac0967e251 Fix remaining flake8 violations in linkcheck/
linkcheck/better_exchook2.py:28:89: E501 line too long (90 > 88 characters)
linkcheck/better_exchook2.py:155:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:166:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:289:13: E741 ambiguous variable name 'l'
linkcheck/better_exchook2.py:299:9: E722 do not use bare 'except'
linkcheck/containers.py:48:13: E731 do not assign a lambda expression, use a def
linkcheck/ftpparse.py:123:89: E501 line too long (93 > 88 characters)
linkcheck/loader.py:46:47: E203 whitespace before ':'
linkcheck/logconf.py:45:29: E231 missing whitespace after ','
linkcheck/robotparser2.py:157:89: E501 line too long (95 > 88 characters)
linkcheck/robotparser2.py:182:89: E501 line too long (89 > 88 characters)
linkcheck/strformat.py:181:16: E203 whitespace before ':'
linkcheck/strformat.py:181:43: E203 whitespace before ':'
linkcheck/strformat.py:253:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:254:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:341:89: E501 line too long (111 > 88 characters)
linkcheck/url.py:102:32: E203 whitespace before ':'
linkcheck/url.py:277:5: E741 ambiguous variable name 'l'
linkcheck/url.py:402:5: E741 ambiguous variable name 'l'
linkcheck/checker/__init__.py:203:1: E402 module level import not at top of file
linkcheck/checker/fileurl.py:200:89: E501 line too long (103 > 88 characters)
linkcheck/checker/mailtourl.py:122:60: E203 whitespace before ':'
linkcheck/checker/mailtourl.py:157:89: E501 line too long (96 > 88 characters)
linkcheck/checker/mailtourl.py:190:89: E501 line too long (109 > 88 characters)
linkcheck/checker/mailtourl.py:200:89: E501 line too long (111 > 88 characters)
linkcheck/checker/mailtourl.py:249:89: E501 line too long (106 > 88 characters)
linkcheck/checker/unknownurl.py:226:23: W291 trailing whitespace
linkcheck/checker/urlbase.py:245:89: E501 line too long (101 > 88 characters)
linkcheck/configuration/confparse.py:236:89: E501 line too long (186 > 88 characters)
linkcheck/configuration/confparse.py:247:89: E501 line too long (111 > 88 characters)
linkcheck/configuration/__init__.py:164:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:184:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:190:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:195:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:198:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:435:89: E501 line too long (90 > 88 characters)
linkcheck/director/aggregator.py:45:43: E231 missing whitespace after ','
linkcheck/director/aggregator.py:178:89: E501 line too long (106 > 88 characters)
linkcheck/logger/__init__.py:29:1: E731 do not assign a lambda expression, use a def
linkcheck/logger/__init__.py:108:13: E741 ambiguous variable name 'l'
linkcheck/logger/__init__.py:275:19: F821 undefined name '_'
linkcheck/logger/__init__.py:342:16: F821 undefined name '_'
linkcheck/logger/__init__.py:380:13: F821 undefined name '_'
linkcheck/logger/__init__.py:384:13: F821 undefined name '_'
linkcheck/logger/__init__.py:387:13: F821 undefined name '_'
linkcheck/logger/__init__.py:396:13: F821 undefined name '_'
linkcheck/network/__init__.py:1:1: W391 blank line at end of file
linkcheck/plugins/locationinfo.py:89:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/locationinfo.py:91:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/markdowncheck.py:112:89: E501 line too long (111 > 88 characters)
linkcheck/plugins/markdowncheck.py:141:9: E741 ambiguous variable name 'l'
linkcheck/plugins/markdowncheck.py:165:23: E203 whitespace before ':'
linkcheck/plugins/viruscheck.py:95:42: E203 whitespace before ':'
2020-05-30 17:01:36 +01:00
Chris Mayo
a92a684ac4 Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
Chris Mayo
ebcc3c4961 Remove str_text from plugins/ 2020-05-19 19:56:42 +01:00
Marius Gedminas
bb53aaa621 Fix viruscheck plugin
The clamav interface needs bytes, not unicode.

It would be nice if we had tests for this code.
2020-05-17 17:50:11 +01:00
Chris Mayo
a15a2833ca Remove spaces after names in class method definitions
And also nested functions.

This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7 Remove spaces after names in function definitions
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
fc11d08968 Remove spaces after names in class definitions 2020-05-16 20:19:42 +01:00
Chris Mayo
42de609f8e Make urllib imports Python 3 only 2020-05-14 20:15:28 +01:00
Chris Mayo
736c893707
Merge pull request #377 from cjmayo/tidyten3
Remove u string prefixes
2020-05-13 19:36:54 +01:00
Chris Mayo
44e81d27dd Remove inheriting object
All Python 3 classes are new-style.
2020-05-08 10:45:31 +01:00
Chris Mayo
b0ea72e8c1 Remove # -*- coding: lines
Except for tests that include non-unicode characters:

tests/test_po.py
tests/test_strformat.py
tests/test_url.py
tests/checker/test_error.py
tests/checker/test_news.py
2020-05-08 10:45:31 +01:00
Chris Mayo
4d3e5abcfa Remove u string prefixes 2020-04-30 20:11:59 +01:00
Chris Mayo
9eed070a73 Stop using HTML handlers
LinkFinder is the only remaining HTML handler therefore no need for
htmlsoup.process_soup() as an independent function or TagFinder as a
base class.
2020-04-29 20:07:00 +01:00
Chris Mayo
b7ec71d8cc Always use utf-8 encoding when quoting 2019-10-05 19:38:57 +01:00
Marius Gedminas
8bdd402aed
Merge pull request #333 from linkchecker/fix-clamav-on-py3
Fix test_clamav.py on Python 3
2019-10-25 16:16:23 +03:00
Marius Gedminas
f9766a2049 Python 3: fix bytes vs strings in viruscheck plugin
Socket communication deals with bytes.

There are probably remaining issues with the viruscheck plugin on
Python 3, we just can't see them because the code is not fully covered
with tests.
2019-10-25 14:24:07 +03:00
Chris Mayo
b2e63663f8 Make PdfParser Python 3 compatible
basestring is not available in Python 3. Ensure all URLs are Unicode.

url_data.get_raw_content() is returning bytes.
2019-10-24 19:57:27 +01:00
Marius Gedminas
938467c3ae
Merge pull request #324 from cjmayo/pdfminer
Add pdfminer to tox.ini and dev-requirements.txt to enable pdf test
2019-10-23 09:47:01 +03:00
Marius Gedminas
fa32a89d6b Fix MS Word parser, hopefully
MS Word files are binary data, and get_temp_filename() will write them
to disk using open(..., 'wb'), so we want to pass bytes in there, not
Unicode.

See #323.
2019-10-22 16:39:57 +03:00
Chris Mayo
949f84d329 PdfParser requires bytes 2019-10-21 20:12:33 +01:00
Petr Dlouhý
6e8da10942 fixes for Python 3: fix markdowncheck
The translate() method of string objects (and Python 2 Unicode objects)
only accepts a single, table argument.
2019-09-30 19:46:24 +01:00
Petr Dlouhý
79e05d1511 Python3: fix parsepdf 2019-04-09 20:09:35 +01:00
Graham Seaman
2e32780dc7 Force header names to lower to allow for CaseInsensitiveDict variability 2017-02-01 16:28:07 +00:00
Vadim Khohlov
d4352fc828 Added plugin for parsing and checking links in Markdown files 2014-11-11 15:35:18 +02:00
Bastian Kleineidam
35eb30432e Added some Python3 fixes. 2014-09-12 19:36:30 +02:00
Bastian Kleineidam
85dadc1f1a Add documentation 2014-07-16 07:37:19 +02:00
Bastian Kleineidam
37664ea8a4 Fix Word file check plugin. 2014-07-15 22:39:41 +02:00
Bastian Kleineidam
032c4091c3 Some easy python3 compatibility changes. 2014-07-15 18:40:47 +02:00
Bastian Kleineidam
b152ce7a6e Add PDF test and fix page number. 2014-04-29 18:53:24 +02:00
Bastian Kleineidam
82dd76b0d7 Add PDF link parsing. 2014-04-28 18:13:45 +02:00
Bastian Kleineidam
0ffdea2b8d Added parser plugins and the applies_to() function. 2014-04-28 18:11:19 +02:00
Bastian Kleineidam
b6b5c7a12e Simpler link parsing routine. 2014-03-27 19:49:17 +01:00
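
Commit 6e8da10942 above notes that the translate() method of string objects only accepts a single table argument. A minimal sketch of that calling convention in Python 3 follows; the helper name strip_chars and the characters removed are hypothetical, chosen for illustration, and are not taken from the markdowncheck plugin itself.

```python
# Python 3: str.translate() takes exactly one argument, a translation table.
# strip_chars and the characters it removes are illustrative assumptions,
# not the code used in linkcheck/plugins/markdowncheck.py.
def strip_chars(text: str, unwanted: str) -> str:
    """Return text with every character listed in `unwanted` deleted."""
    # The third argument to str.maketrans() lists characters to delete.
    table = str.maketrans("", "", unwanted)
    return text.translate(table)


if __name__ == "__main__":
    print(strip_chars("[link](http://example.com)", "[]()"))
```

Building the table once with str.maketrans() and reusing it avoids the two-argument translate(table, deletechars) form, which only Python 2 byte strings accepted.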