6216 commits

Author SHA1 Message Date
Chris Mayo
16e6fb2919 Fix incorrect character in FormFinder log message 2020-04-07 19:24:34 +01:00
Chris Mayo
00f940d979 Fix FormFinder callbacks for missing element_text
element_text added in:
51a06d8a ("Remove home-cooked htmlparser and use BeautifulSoup",
2019-07-22)
2020-04-07 19:24:34 +01:00
Chris Mayo
514210199d Add tests for search_form 2020-04-07 19:24:34 +01:00
anarcat
7d55855ffb
Merge pull request #356 from cjmayo/parser1
Remove unnecessary parser related code
2020-04-04 09:26:51 -04:00
Chris Mayo
fe024fb0c8 Remove unused Parser.debug() method 2020-04-03 19:24:08 +01:00
Chris Mayo
0c5e3bb403 Remove old HtmlParser .gitignore
htmlparse.output was a product of the built-in parser.
2020-04-03 19:24:08 +01:00
Chris Mayo
036b900ffc Remove unused linkcheck.containers classes 2020-04-03 19:24:08 +01:00
Chris Mayo
3ff3d72492 Use BeautifulSoup element attrs directly 2020-04-03 19:24:08 +01:00
Chris Mayo
a7e1e20172 Remove last line and column from Parser
Only used for debug log message and not very useful.
2020-04-03 19:24:08 +01:00
anarcat
25d517521c
Merge pull request #353 from cjmayo/setup
Tidy setup.py for C extensions and Python 2
2020-04-02 10:10:38 -04:00
anarcat
39aa438d06
Merge pull request #354 from cjmayo/unicode
Remove use of Python 2 unicode() and related u prefixes
2020-04-02 10:10:31 -04:00
Chris Mayo
28701e291a Remove use of Python 2 unicode() and related u prefixes
Several instances for MS Windows left unchanged.
2020-04-01 19:39:50 +01:00
Chris Mayo
e0bf5fc24f Remove unused imports and variables from setup.py 2020-04-01 19:21:47 +01:00
Chris Mayo
f6b273d05e Remove code for compiling C extensions from setup.py
C extensions for parser and network utilities have been replaced in
Python.
2020-04-01 19:21:47 +01:00
Chris Mayo
9f899605a9 Remove Python 2 compatibility from setup.py
sys.version_info was introduced in Python 2.0.
2020-04-01 19:21:47 +01:00
anarcat
cf4e6bb235
Merge pull request #351 from cjmayo/tagsonly
Remove support for non-Tag elements from Parser
2020-04-01 12:17:18 -04:00
Marius Gedminas
7c14bf1ad6 Declare supported Python versions in setup.py
The python_requires is the important one; it means once we publish a
new release on PyPI, pip install will know not to try to install it if
you run it on Python 2 and will fall back to an older version.
2020-04-01 17:49:51 +03:00
anarcat
b5c8a5d1ce
Merge pull request #314 from cjmayo/postbs4
Replace memoized with functools.lru_cache and deprecations
2020-04-01 10:28:18 -04:00
Chris Mayo
9fc651e82b Remove Python 2 compatibility from parser tests 2020-03-31 20:10:35 +01:00
Chris Mayo
ffa6ac457f Remove support for non-Tag elements from Parser
This change is made because the linkchecker handlers only process
Tags.

The test HtmlPrettyPrinter handler is updated to output element text
because its support for non-Tag elements has been removed. This results
in a number of the existing tests still passing.
2020-03-31 20:10:35 +01:00
Chris Mayo
d2cb1b9dd6 Raise minimum Python version to 3.5 in setup.py 2020-03-31 19:46:31 +01:00
Chris Mayo
e7c5f353cd Remove unused function linkcheck.fileutil.write_file()
Doesn't appear to have ever been used.

Causes flake8 error:
linkcheck/fileutil.py:45:9: F821 undefined name 'file'
2020-03-31 19:46:31 +01:00
Chris Mayo
c3860e2218 Remove third_party directory from MANIFEST.in
Unused since:
0a13fae3 ("remove third party packages and use them as dependency",
2018-01-06)
2020-03-31 19:46:31 +01:00
Chris Mayo
504004d4f0 Use ipaddress in network.iputil.is_valid_ip()
ipaddress was introduced in Python 3.3.
2020-03-31 19:46:31 +01:00
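As a rough illustration of the change described in 504004d4f0, an address check built on the standard-library ipaddress module might look like the sketch below. The body is an assumption for illustration only and is not copied from network.iputil; only ipaddress.ip_address() and its ValueError behaviour are taken from the standard library.

```python
import ipaddress


def is_valid_ip(ip):
    """Return True if ip parses as an IPv4 or IPv6 address.

    Illustrative sketch only; the project's real function may differ.
    """
    try:
        # ip_address() raises ValueError for anything that is not a
        # well-formed IPv4 or IPv6 address.
        ipaddress.ip_address(ip)
    except ValueError:
        return False
    return True
```

Host names such as "example.com" are rejected here because ip_address() only accepts literal addresses.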
Chris Mayo
2eb1424703 Replace deprecated plistlib.readPlistFromBytes() in bookmarks.safari
Remove Python 2 code.

plistlib.loads() was added in Python 3.4.
2020-03-31 19:46:31 +01:00
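The replacement named in 2eb1424703 can be sketched as follows. The helper name parse_plist and the sample data are hypothetical and are not taken from bookmarks.safari; plistlib.loads() and plistlib.dumps() are the standard-library calls available since Python 3.4.

```python
import plistlib


def parse_plist(data):
    """Parse plist bytes (XML or binary) into Python objects.

    Hypothetical helper; plistlib.loads() detects the format itself.
    """
    return plistlib.loads(data)


# Build a small plist in memory so the sketch is self-contained.
sample = plistlib.dumps({"Title": "BookmarksBar", "Children": []})
parsed = parse_plist(sample)
```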
Chris Mayo
0ee4414a60 Replace memoized with functools.lru_cache 2020-03-31 19:46:31 +01:00
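For readers unfamiliar with the replacement in 0ee4414a60, the sketch below shows how functools.lru_cache memoizes a function. The function normalize_host and the calls list are hypothetical and exist only to make the caching visible; they are not part of linkchecker.

```python
from functools import lru_cache

calls = []  # records each real invocation, for demonstration only


@lru_cache(maxsize=None)
def normalize_host(host):
    """Hypothetical function: lower-case a host name, cached per argument."""
    calls.append(host)
    return host.lower()


first = normalize_host("Example.COM")
second = normalize_host("Example.COM")  # served from the cache
```

With maxsize=None the cache is unbounded, which is the closest match to a simple memoizing decorator; arguments must be hashable.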
Marius Gedminas
61b30a95dd Switch to travis-ci.com
Migrating from legacy GitHub services/webhooks to the new Travis CI
GitHub app means we also have to use travis-ci.com instead of
travis-ci.org to see build status or history.
2020-03-31 18:35:37 +03:00
anarcat
67f91fee54
Merge pull request #349 from cjmayo/unused
Remove unused code
2020-03-31 11:20:31 -04:00
Chris Mayo
1255119ca8 Move HtmlPrinter and HtmlPrettyPrinter into tests 2020-03-30 19:32:30 +01:00
Chris Mayo
ce1d669329 Remove unused functions from linkcheck.httputil
http_persistent() unused since:
4b818cb4 ("Detect more cases to close the connection, and close response
objects", 2006-09-15)

http_keepalive(), get_content_encoding() unused since:
7b34be59 ("Introduce check plugins, use Python requests for http/s
connections, and some code cleanups and improvements.", 2014-03-01)
2020-03-30 19:32:30 +01:00
Chris Mayo
5b66964afa Remove unused .charset from checker classes
Unused since:
4f8c2954 ("Don't set parser.encoding", 2019-10-05)
2020-03-30 19:32:30 +01:00
Chris Mayo
f743be57e8 Remove unused functions from linkcheck.HtmlParser
resolve_entities() unused since:
2c000683 ("Remove unused linkcheck.htmlutil.linkname module",
2020-03-30)

set_doctype(), set_encoding() unused since:
51a06d8a ("Remove home-cooked htmlparser and use BeautifulSoup",
2019-07-22)
2020-03-30 19:32:18 +01:00
Chris Mayo
2c000683e1 Remove unused linkcheck.htmlutil.linkname module
Unused since:
d6d48b48 ("html parser: use name instead of peeking", 2019-07-22)
2020-03-30 19:31:11 +01:00
Marius Gedminas
78530956a1
Merge pull request #337 from linkchecker/htmlparser-beautifulsoup
Change HtmlParser to use Beautiful Soup
2020-03-30 20:45:14 +03:00
Chris Mayo
9030050599 Remove Python 3 status document 2020-03-30 17:39:23 +01:00
Marius Gedminas
af0f50efa8 Restore support for older BeautifulSoup4 versions 2020-03-30 14:49:56 +03:00
Marius Gedminas
ccc0ee0464 Clean up travis and tox.ini
I want the Python 3.5 travis job to run just tox -e py35, without the
oldbs4 job, and without an explicit TOXENV setting that is awkward to
insert in the .travis.yml (also, it reorders the jobs putting 3.5 below
3.8 which annoys me).

I think I found a way of doing that by renaming py35-oldbs4 to oldbs4.
2020-03-30 14:46:44 +03:00
Marius Gedminas
ed08e7fa7e Split the oldbs4 into a separate Travis job (take 3)
I did an oopsie whoopsie with the YAML syntax in my previous commit.
2020-03-23 16:50:27 +02:00
Marius Gedminas
894f0b0922 Split the oldbs4 into a separate Travis job (take 2)
The previous attempt did not work: the 3.5 build ran both toxenvs.
2020-03-23 16:45:46 +02:00
Marius Gedminas
ba5888f06a Split the oldbs4 into a separate Travis job 2020-03-23 16:40:22 +02:00
Marius Gedminas
0417f677c2 Ignore files created during test runs 2020-03-23 11:05:13 +02:00
Marius Gedminas
6a50fe9d86 Add Python 3.8 to the build matrix 2020-03-23 11:00:25 +02:00
Marius Gedminas
a311ebb97e Fix doctype tests
I don't think linkchecker actually cares about the document type, so I'm
not sure why we're even testing this...
2020-03-23 10:56:57 +02:00
Chris Mayo
5eaad24641 Use HTTP header encoding for decoding 2020-03-22 19:54:37 +00:00
Chris Mayo
f5ae90e824 Parser threading lock no longer required with Beautiful Soup 2020-03-22 19:54:37 +00:00
Marius Gedminas
205ceb6805
Merge pull request #344 from hroncok/beautifulsoup4-requirement
Require beautifulsoup4 instead of bs4
2020-02-06 12:52:20 +02:00
Miro Hrončok
ff5ebbae69 Require beautifulsoup4 instead of bs4
bs4 is a dummy package managed by the developer of Beautiful Soup to prevent
name squatting. The official name of PyPI’s Beautiful Soup Python package is
beautifulsoup4. The bs4 package ensures that if you type pip install bs4 by
mistake you will end up with Beautiful Soup.

However, for requirements, it's cleaner to use the proper name.
For downstream packaging in Fedora, this avoids the need of packaging
the dummy package.
2020-02-06 10:05:13 +01:00
anarcat
e37dab8a4b
Merge pull request #339 from cjmayo/notafter
Actually fix TypeError when checking https link
2019-11-21 10:33:27 -05:00
Chris Mayo
d3d6638973 Actually fix TypeError when checking https link
The test was added but not the fix in:
ecd06776 ("Fix TypeError when checking https link and test", 2019-11-11)

Which is caught by the new test when run on Python 3:
___________________ TestHttps.test_x509_to_dict __________________
[gw14] linux -- Python 3.6.9 /usr/bin/python3.6
tests/checker/test_https.py:72: in test_x509_to_dict
    self.assertEqual(httputil.x509_to_dict(cert)["notAfter"],
linkcheck/httputil.py:47: in x509_to_dict
    parsedtime = asn1_generaltime_to_seconds(notAfter)
linkcheck/httputil.py:68: in asn1_generaltime_to_seconds
    res = datetime.strptime(timestr, timeformat + 'Z')
E   TypeError: strptime() argument 1 must be str, not bytes
2019-11-19 20:06:10 +00:00
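The traceback in d3d6638973 shows strptime() receiving bytes. One way to avoid that error is to decode the value before parsing, as in the sketch below. This is an assumption about the general shape of such a fix, not the code from the commit; the function name parse_generaltime and the format string are chosen for illustration.

```python
from datetime import datetime, timezone


def parse_generaltime(timestr):
    """Convert an ASN.1 time such as b"20250101000000Z" to epoch seconds.

    Hypothetical helper: accepts bytes or str, assumes a trailing "Z" (UTC).
    """
    if isinstance(timestr, bytes):
        # strptime() only accepts str, so decode certificate bytes first.
        timestr = timestr.decode("ascii")
    parsed = datetime.strptime(timestr, "%Y%m%d%H%M%S" + "Z")
    # The literal "Z" carries no tzinfo, so attach UTC explicitly.
    return parsed.replace(tzinfo=timezone.utc).timestamp()
```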
anarcat
c92ab72676
Merge pull request #338 from cjmayo/https
Enable https checking using a test server
2019-11-14 09:38:54 -05:00