Commit graph

3053 commits

Author SHA1 Message Date
Marius Gedminas
bb53aaa621 Fix viruscheck plugin
The clamav interface needs bytes, not unicode.

It would be nice if we had tests for this code.
2020-05-17 17:50:11 +01:00
Chris Mayo
a15a2833ca Remove spaces after names in class method definitions
And also nested functions.

This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
1663e10fe7 Remove spaces after names in function definitions
This is a PEP 8 convention, E211.
2020-05-16 20:19:42 +01:00
Chris Mayo
fc11d08968 Remove spaces after names in class definitions 2020-05-16 20:19:42 +01:00
Chris Mayo
1416a08119 On Python 3 no need to convert os.linesep to a string 2020-05-16 17:02:01 +01:00
Chris Mayo
0752408a44 Remove Python 2 use of sys.stdout in i18n.get_encoded_writer() 2020-05-16 17:02:00 +01:00
Chris Mayo
2c2e7e55ac Remove CSVLogger.encode_row_s()
Introduced during Python 3 conversion to maintaint Python 2 support:

55a7973b ("Python3: fix csvlog", 2016-12-04)
2020-05-16 17:02:00 +01:00
Chris Mayo
ed13a926d3 Remove setting Python 2 xmlparser.returns_unicode 2020-05-16 17:02:00 +01:00
Chris Mayo
025637b08d Remove Python 2 cookielib import 2020-05-16 16:26:38 +01:00
Chris Mayo
1e277444f4 Remove Python 2 thread import 2020-05-16 16:26:34 +01:00
Chris Mayo
dcbddfe045 Remove Python 2 ConfigParser import 2020-05-15 19:37:04 +01:00
Chris Mayo
f8c9faec1b Remove Python 2 cStringIO imports 2020-05-15 19:37:04 +01:00
Chris Mayo
bda9612273 Make html.escape Python 3 only 2020-05-14 20:15:28 +01:00
Chris Mayo
42de609f8e Make urllib imports Python 3 only 2020-05-14 20:15:28 +01:00
Chris Mayo
3c661a83d0 Replace parse_host_port() in checker.proxysupport with url.splitport() 2020-05-14 20:15:28 +01:00
Chris Mayo
c80002437e Update run-time version check 2020-05-13 19:50:19 +01:00
Chris Mayo
08ddf658bc
Merge pull request #366 from cjmayo/userorpwd
Support login forms with user and/or password
2020-05-13 19:37:44 +01:00
Chris Mayo
736c893707
Merge pull request #377 from cjmayo/tidyten3
Remove u string prefixes
2020-05-13 19:36:54 +01:00
Chris Mayo
3ace021264 Support login forms with user and/or password 2020-05-13 19:32:25 +01:00
Chris Mayo
44e81d27dd Remove inheriting object
All Python 3 classes are new-style.
2020-05-08 10:45:31 +01:00
Chris Mayo
b0ea72e8c1 Remove # -*- coding: lines
Except for tests that include non-unicode characters:

tests/test_po.py
tests/test_strformat.py
tests/test_url.py
tests/checker/test_error.py
tests/checker/test_news.py
2020-05-08 10:45:31 +01:00
Marius Gedminas
22b0165b72 Make _Logger an abstract base class
The __metaclass__ syntax is a Python-2-ism.  It was replaced with

    class _Logger (object, metaclass=abc.ABCMeta):

in Python 3.  And then Python 3.4 introduced abc.ABC which is an empty
class that has ABCMeta as the metaclass, making it simpler to define
abstract base classes.
2020-04-30 23:09:42 +03:00
Chris Mayo
4d3e5abcfa Remove u string prefixes 2020-04-30 20:11:59 +01:00
anarcat
ab476fa4bf
Merge pull request #364 from cjmayo/parser5
Stop using HTML handlers and improve login form error handling
2020-04-30 09:28:48 -04:00
Chris Mayo
12a948894b Fix space style in linkcheck/htmlutil/linkparse.py 2020-04-29 20:07:00 +01:00
Chris Mayo
9eed070a73 Stop using HTML handlers
LinkFinder is the only remaining HTML handler therefore no need for
htmlsoup.process_soup() as an independent function or TagFinder as a
base class.
2020-04-29 20:07:00 +01:00
Chris Mayo
4ffdbf2406 Replace MetaRobotsFinder using BeautifulSoup.find() 2020-04-29 20:07:00 +01:00
Chris Mayo
a51f02cf66 Improve error handling and debugging for login form 2020-04-27 18:06:29 +01:00
Chris Mayo
9a33c2a659 Make requesting login form password work on Python 3 2020-04-27 18:06:29 +01:00
Chris Mayo
8fc0dcc055 Make matching login form credentials case-sensitive
The keys of the form.data dictionary are case-sensitive and therefore a
KeyError was possible if the configured values are not identical to
the input element name attributes.
2020-04-27 18:06:29 +01:00
Chris Mayo
7a6ef938cc Rename htmlutil.formsearch to htmlutil.loginformsearch
Make it clear that this module has only one specific use.
2020-04-27 18:06:29 +01:00
anarcat
350f8bfef9
Merge pull request #373 from linkchecker/fix-swf-parsing
SWF files are binary data
2020-04-27 09:39:52 -04:00
Marius Gedminas
680783b1ff SWF files are binary data
Should fix #372.
2020-04-27 11:25:37 +03:00
anarcat
183d483074
Merge pull request #365 from cjmayo/tidyten1
Remove use of the future package
2020-04-26 12:02:30 -04:00
Chris Mayo
d189445a8e LinkFinder does not raise StopParse 2020-04-18 20:30:46 +01:00
Chris Mayo
ee6628a831 Move HtmlParser/htmlsax.py to htmlutil/htmlsoup.py
Remove one subpackage and some import lines where htmlutil.linkparse is
also being used.
2020-04-18 20:30:45 +01:00
Chris Mayo
384e1e196d Remove Python 2 gettext builtin installation 2020-04-15 19:49:16 +01:00
Chris Mayo
a83fbb56c0 Remove from __future__ imports 2020-04-15 19:49:16 +01:00
Chris Mayo
f5e7f3a382 Remove use of the future package
It was providing Python 2 compatibility.
2020-04-15 19:49:16 +01:00
Chris Mayo
0795e3c1b4 Replace Parser class using BeautifulSoup.find_all() 2020-04-10 13:51:09 +01:00
Chris Mayo
eb3cf28baa Remove support for start_end_element() callback
The LinkFinder handler start_end_element() callback does nothing apart
from call start_element().
2020-04-10 13:51:09 +01:00
Chris Mayo
c9f17e92b9 Remove support for end_element() callback 2020-04-10 13:51:09 +01:00
Chris Mayo
48b590cf8b Replace FormFinder using BeautifulSoup.find_all()
FormFinder was the only handler that used an end_element() callback and
was therefore a blocker to moving the Parser class to use
BeautifulSoup.find_all()

FormFinder was a specialised handler used to parse a login form at
the start of a session if the user had configured authentication
credentials.
2020-04-10 13:51:05 +01:00
Chris Mayo
974915cc4f Remove encoding from Parser
Only used by the test and an attribute of the soup object.
2020-04-08 20:03:35 +01:00
Chris Mayo
02e1c389b2 Remove parser flush() and reset()
Remnants of the feed() interface.
2020-04-08 20:03:35 +01:00
Chris Mayo
3771dd9136 Use parser.feed_soup() instead of parser.feed()
Markup is not being passed in pieces to the parser, so simplify the
interface and reduce the state further.
2020-04-08 20:03:35 +01:00
Chris Mayo
40f43ae41c Create one function to make soup objects 2020-04-08 20:03:35 +01:00
Chris Mayo
9d8d251d06 Replace Parser lineno() and column() methods
Stop storing this data in Parser object state.
2020-04-08 20:03:35 +01:00
Chris Mayo
16e6fb2919 Fix incorrect character in FormFinder log message 2020-04-07 19:24:34 +01:00
Chris Mayo
00f940d979 Fix FormFinder callbacks for missing element_text
element_text added in:
51a06d8a ("Remove home-cooked htmlparser and use BeautifulSoup",
2019-07-22)
2020-04-07 19:24:34 +01:00