anarcat
b5c8a5d1ce
Merge pull request #314 from cjmayo/postbs4
...
Replace memoized with functools.lru_cache and deprecations
2020-04-01 10:28:18 -04:00
Chris Mayo
d2cb1b9dd6
Raise minimum Python version to 3.5 in setup.py
2020-03-31 19:46:31 +01:00
Chris Mayo
e7c5f353cd
Remove unused function linkcheck.fileutil.write_file()
...
Doesn't appear to have ever been used.
Causes flake8 error:
linkcheck/fileutil.py:45:9: F821 undefined name 'file'
2020-03-31 19:46:31 +01:00
Chris Mayo
c3860e2218
Remove third_party directory from MANIFEST.in
...
Unused since:
0a13fae3 ("remove third party packages and use them as dependency",
2018-01-06)
2020-03-31 19:46:31 +01:00
Chris Mayo
504004d4f0
Use ipaddress in network.iputil.is_valid_ip()
...
ipaddress was introduced in Python 3.3.
2020-03-31 19:46:31 +01:00
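For illustration, an ipaddress-based validity check could look roughly like the sketch below. This is not the code from the commit; the function body is an assumption about how such a check might be written with the standard library.

```python
import ipaddress


def is_valid_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address.

    Hypothetical sketch only; the real network.iputil.is_valid_ip()
    may differ in signature and behaviour.
    """
    try:
        # ip_address() raises ValueError for anything that is not a
        # well-formed IPv4 or IPv6 address.
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True
```

Host names such as "example.com" are rejected because ip_address() only accepts literal addresses.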
Chris Mayo
2eb1424703
Replace deprecated plistlib.readPlistFromBytes() in bookmarks.safari
...
Remove Python 2 code.
plistlib.loads() was added in Python 3.4.
2020-03-31 19:46:31 +01:00
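A minimal sketch of the plistlib.loads() replacement, assuming the bookmark data is already available as bytes. The helper name and the sample dictionary are made up for the example and are not taken from bookmarks.safari.

```python
import plistlib


def parse_plist_bytes(data):
    """Parse a property list held in memory as bytes.

    plistlib.loads() detects XML and binary plists automatically.
    """
    return plistlib.loads(data)


# Round-trip a small, invented structure to show the call in use.
sample = plistlib.dumps({"Title": "Bookmarks", "Children": []})
parsed = parse_plist_bytes(sample)
```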
Chris Mayo
0ee4414a60
Replace memoized with functools.lru_cache
2020-03-31 19:46:31 +01:00
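As a rough illustration of the pattern this commit moves to, functools.lru_cache can stand in for a hand-written memoizing decorator. The decorated function below is hypothetical and only serves to show that repeated calls hit the cache.

```python
import functools

calls = []  # records every real (uncached) invocation


@functools.lru_cache(maxsize=None)
def normalize_host(host):
    """Hypothetical helper; stands in for any pure, cacheable function."""
    calls.append(host)
    return host.strip().lower()


normalize_host("Example.COM")
normalize_host("Example.COM")  # served from the cache, body not re-run
```

Arguments must be hashable, and maxsize=None makes the cache unbounded, which matches the behaviour of a simple memoizer.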
Marius Gedminas
61b30a95dd
Switch to travis-ci.com
...
Migrating from legacy GitHub services/webhooks to the new Travis CI
GitHub app means we also have to use travis-ci.com instead of
travis-ci.org to see build status or history.
2020-03-31 18:35:37 +03:00
anarcat
67f91fee54
Merge pull request #349 from cjmayo/unused
...
Remove unused code
2020-03-31 11:20:31 -04:00
Chris Mayo
1255119ca8
Move HtmlPrinter and HtmlPrettyPrinter into tests
2020-03-30 19:32:30 +01:00
Chris Mayo
ce1d669329
Remove unused functions from linkcheck.httputil
...
http_persistent() unused since:
4b818cb4 ("Detect more cases to close the connection, and close response
objects", 2006-09-15)
http_keepalive(), get_content_encoding() unused since:
7b34be59 ("Introduce check plugins, use Python requests for http/s
connections, and some code cleanups and improvements.", 2014-03-01)
2020-03-30 19:32:30 +01:00
Chris Mayo
5b66964afa
Remove unused .charset from checker classes
...
Unused since:
4f8c2954 ("Don't set parser.encoding", 2019-10-05)
2020-03-30 19:32:30 +01:00
Chris Mayo
f743be57e8
Remove unused functions from linkcheck.HtmlParser
...
resolve_entities() unused since:
2c000683 ("Remove unused linkcheck.htmlutil.linkname module",
2020-03-30)
set_doctype(), set_encoding() unused since:
51a06d8a ("Remove home-cooked htmlparser and use BeautifulSoup",
2019-07-22)
2020-03-30 19:32:18 +01:00
Chris Mayo
2c000683e1
Remove unused linkcheck.htmlutil.linkname module
...
Unused since:
d6d48b48 ("html parser: use name instead of peeking", 2019-07-22)
2020-03-30 19:31:11 +01:00
Marius Gedminas
78530956a1
Merge pull request #337 from linkchecker/htmlparser-beautifulsoup
...
Change HtmlParser to use Beautiful Soup
2020-03-30 20:45:14 +03:00
Chris Mayo
9030050599
Remove Python 3 status document
2020-03-30 17:39:23 +01:00
Marius Gedminas
af0f50efa8
Restore support for older BeautifulSoup4 versions
2020-03-30 14:49:56 +03:00
Marius Gedminas
ccc0ee0464
Clean up travis and tox.ini
...
I want the Python 3.5 travis job to run just tox -e py35, without the
oldbs4 job, and without an explicit TOXENV setting that is awkward to
insert in the .travis.yml (also, it reorders the jobs putting 3.5 below
3.8 which annoys me).
I think I found a way of doing that by renaming py35-oldbs4 to oldbs4.
2020-03-30 14:46:44 +03:00
Marius Gedminas
ed08e7fa7e
Split the oldbs4 into a separate Travis job (take 3)
...
I did an oopsie whoopsie with the YAML syntax in my previous commit.
2020-03-23 16:50:27 +02:00
Marius Gedminas
894f0b0922
Split the oldbs4 into a separate Travis job (take 2)
...
The previous attempt did not work: the 3.5 build ran both toxenvs.
2020-03-23 16:45:46 +02:00
Marius Gedminas
ba5888f06a
Split the oldbs4 into a separate Travis job
2020-03-23 16:40:22 +02:00
Marius Gedminas
0417f677c2
Ignore files created during test runs
2020-03-23 11:05:13 +02:00
Marius Gedminas
6a50fe9d86
Add Python 3.8 to the build matrix
2020-03-23 11:00:25 +02:00
Marius Gedminas
a311ebb97e
Fix doctype tests
...
I don't think linkchecker actually cares about the document type, so I'm
not sure why we're even testing this...
2020-03-23 10:56:57 +02:00
Chris Mayo
5eaad24641
Use HTTP header encoding for decoding
2020-03-22 19:54:37 +00:00
Chris Mayo
f5ae90e824
Parser threading lock no longer required with Beautiful Soup
2020-03-22 19:54:37 +00:00
Marius Gedminas
205ceb6805
Merge pull request #344 from hroncok/beautifulsoup4-requirement
...
Require beautifulsoup4 instead of bs4
2020-02-06 12:52:20 +02:00
Miro Hrončok
ff5ebbae69
Require beautifulsoup4 instead of bs4
...
bs4 is a dummy package managed by the developer of Beautiful Soup to prevent
name squatting. The official name of PyPI’s Beautiful Soup Python package is
beautifulsoup4. The bs4 package ensures that if you type pip install bs4 by
mistake you will end up with Beautiful Soup.
However, for requirements, it's cleaner to use the proper name.
For downstream packaging in Fedora, this avoids the need to package
the dummy package.
2020-02-06 10:05:13 +01:00
anarcat
e37dab8a4b
Merge pull request #339 from cjmayo/notafter
...
Actually fix TypeError when checking https link
2019-11-21 10:33:27 -05:00
Chris Mayo
d3d6638973
Actually fix TypeError when checking https link
...
The test was added, but not the fix, in:
ecd06776 ("Fix TypeError when checking https link and test", 2019-11-11)
The missing fix is caught by the new test when run on Python 3:
___________________ TestHttps.test_x509_to_dict __________________
[gw14] linux -- Python 3.6.9 /usr/bin/python3.6
tests/checker/test_https.py:72: in test_x509_to_dict
self.assertEqual(httputil.x509_to_dict(cert)["notAfter"],
linkcheck/httputil.py:47: in x509_to_dict
parsedtime = asn1_generaltime_to_seconds(notAfter)
linkcheck/httputil.py:68: in asn1_generaltime_to_seconds
res = datetime.strptime(timestr, timeformat + 'Z')
E TypeError: strptime() argument 1 must be str, not bytes
2019-11-19 20:06:10 +00:00
anarcat
c92ab72676
Merge pull request #338 from cjmayo/https
...
Enable https checking using a test server
2019-11-14 09:38:54 -05:00
Chris Mayo
ecd06776ab
Fix TypeError when checking https link and test
...
File "/usr/lib/python3.7/site-packages/linkcheck/httputil.py", line 68, in asn1_generaltime_to_seconds
line: res = datetime.strptime(timestr, timeformat + 'Z')
locals:
res = <local> None
datetime = <global> <class 'datetime.datetime'>
datetime.strptime = <global> <built-in method strptime of type object at 0x7fa39064dda0>
timestr = <local> b'20191106202117Z'
timeformat = <local> '%Y%m%d%H%M%S'
TypeError: strptime() argument 1 must be str, not bytes
pyOpenSSL OpenSSL.crypto.X509.get_notAfter() returns bytes:
https://www.pyopenssl.org/en/stable/api/crypto.html#OpenSSL.crypto.X509.get_notAfter
2019-11-11 20:12:25 +00:00
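One way to cope with the bytes value shown in the traceback is to decode it before calling strptime(). The sketch below is an assumption about the shape of such a fix, not the change that was merged; the function name is hypothetical.

```python
from datetime import datetime, timezone


def generaltime_to_seconds(timestr):
    """Convert an ASN.1 GeneralizedTime such as b'20191106202117Z'
    to seconds since the epoch. Accepts bytes or str."""
    if isinstance(timestr, bytes):
        # get_notAfter() returns bytes; the timestamp is plain ASCII.
        timestr = timestr.decode("ascii")
    parsed = datetime.strptime(timestr, "%Y%m%d%H%M%S" + "Z")
    # The trailing Z means UTC, so attach the timezone explicitly.
    return parsed.replace(tzinfo=timezone.utc).timestamp()
```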
Chris Mayo
dee4be4b1d
Enable https checking using a test server
...
Verification has to be turned off because we are using a
self-signed certificate.
2019-11-11 20:12:25 +00:00
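For readers unfamiliar with the issue, the sketch below shows what turning verification off means at the standard-library level. It is an illustration using the ssl module, not the mechanism the test suite actually uses.

```python
import ssl

# Start from the secure defaults, then relax them for a test server
# that presents a self-signed certificate.
context = ssl.create_default_context()
# check_hostname must be disabled before verify_mode can be CERT_NONE.
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE
```

A context like this should only ever be used against a local test server.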
anarcat
5308ec5204
Merge pull request #336 from cjmayo/logdiff
...
Improve test failure diff
2019-10-29 16:20:26 -04:00
Chris Mayo
2f16152dc8
Improve test failure diff
...
Some url lines were missing a url prefix while others had a double url
prefix. diff was reporting more url lines as changed than had actually changed.
Improve formatting by removing newlines from control lines and adding
headings.
Before:
E AssertionError: http://localhost:46031/tests/checker/data/sitemap.xml
E ---
E
E +++
E
E @@ -1,4 +1,8 @@
E
E -url http://localhost:46031/tests/checker/data/sitemap.xml
E +http://www.example.com/
E +cache key http://www.example.com/
E +real url http://www.example.com/
E +valid
E +url url http://localhost:46031/tests/checker/data/sitemap.xml
E cache key http://localhost:46031/tests/checker/data/sitemap.xml
E real url http://localhost:46031/tests/checker/data/sitemap.xml
E valid
After:
E AssertionError: http://localhost:44021/tests/checker/data/sitemap.xml
E --- expected
E +++ result
E @@ -2,3 +2,7 @@
E cache key http://localhost:44021/tests/checker/data/sitemap.xml
E real url http://localhost:44021/tests/checker/data/sitemap.xml
E valid
E +url http://www.example.com/
E +cache key http://www.example.com/
E +real url http://www.example.com/
E +valid
2019-10-29 20:03:08 +00:00
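The "After" format above (named headings, no blank lines after control lines) is what difflib.unified_diff produces when given file labels and an empty line terminator. The sketch below assumes the test helper builds its diff this way; the function name and sample lines are invented for the example.

```python
import difflib


def format_diff(expected_lines, result_lines):
    """Return a unified diff with 'expected' and 'result' headings.

    lineterm="" stops difflib from appending a newline to the control
    lines, so joining with "\n" leaves no blank lines between them.
    """
    diff = difflib.unified_diff(
        expected_lines,
        result_lines,
        fromfile="expected",
        tofile="result",
        lineterm="",
    )
    return "\n".join(diff)


# Sample data only; the input lines carry no trailing newlines.
expected = ["url http://localhost/sitemap.xml", "valid"]
result = expected + ["url http://www.example.com/", "valid"]
report = format_diff(expected, result)
```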
Marius Gedminas
c294a4e6c1
Merge pull request #335 from cjmayo/sitemap
...
Fix XmlTagUrlParser and make Python 3 compatible
2019-10-29 15:50:49 +02:00
Chris Mayo
ec8b6e09f0
Fix XmlTagUrlParser and make Python 3 compatible
...
URLs within a sitemap file were not being captured.
2019-10-28 19:20:05 +00:00
Marius Gedminas
8bdd402aed
Merge pull request #333 from linkchecker/fix-clamav-on-py3
...
Fix test_clamav.py on Python 3
2019-10-25 16:16:23 +03:00
Marius Gedminas
5b2b3613ec
Merge pull request #330 from linkchecker/fix-sitemap
...
Fix sitemap parser
2019-10-25 16:15:55 +03:00
anarcat
6dcc9dbf9d
Merge pull request #332 from cjmayo/py3pdf
...
Make PdfParser Python 3 compatible
2019-10-25 08:38:59 -04:00
Marius Gedminas
f9766a2049
Python 3: fix bytes vs strings in viruscheck plugin
...
Socket communication deals with bytes.
There are probably remaining issues with the viruscheck plugin on
Python 3; we just can't see them because the code is not fully covered
with tests.
2019-10-25 14:24:07 +03:00
Marius Gedminas
65f861901c
Fix all Python 3 tox environments
...
Old pdfminer supports Python 2 only; new pdfminer supports Python 3
only.
2019-10-25 14:20:31 +03:00
Chris Mayo
b2e63663f8
Make PdfParser Python 3 compatible
...
basestring is not available in Python 3. Ensure all URLs are Unicode.
url_data.get_raw_content() is returning bytes.
2019-10-24 19:57:27 +01:00
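A small sketch of the kind of normalisation this implies: with basestring gone, values that may be bytes are decoded so that every URL ends up as str. The helper name and the UTF-8 default are assumptions, not taken from PdfParser.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding bytes if necessary.

    Hypothetical helper; undecodable bytes are replaced rather than
    raising, so one bad URL does not abort the whole parse.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return str(value)
```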
Marius Gedminas
011f6c147e
Merge pull request #331 from linkchecker/explain-skips
...
Explain why these tests are being skipped
2019-10-23 17:59:55 +03:00
Marius Gedminas
606ece0308
Explain why these tests are being skipped
...
pytest output before this change:
SKIPPED [3] tests/__init__.py:217: condition: True
SKIPPED [1] tests/checker/test_news.py:63: condition: True
SKIPPED [1] tests/checker/test_news.py:41: condition: True
SKIPPED [1] tests/checker/test_news.py:116: condition: True
SKIPPED [1] tests/checker/test_news.py:75: condition: True
After:
SKIPPED [3] tests/__init__.py: disabled for now until some stable news server comes up
SKIPPED [4] tests/checker/test_news.py: disabled for now until some stable news server comes up
2019-10-23 17:35:31 +03:00
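The improved output comes from attaching a reason string to the skip. The sketch below shows the idea with the standard-library unittest decorator, whose reasons pytest also reports; whether the project uses this decorator or a pytest marker is not stated in the commit, and the test class is invented.

```python
import unittest

NEWS_SKIP_REASON = "disabled for now until some stable news server comes up"


class NewsExample(unittest.TestCase):
    """Invented test case used only to demonstrate a skip reason."""

    @unittest.skip(NEWS_SKIP_REASON)
    def test_news_server(self):
        self.fail("never runs while the skip is in place")


# Run the case programmatically to show the reason is recorded.
skip_result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(NewsExample).run(skip_result)
```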
Marius Gedminas
87b504785c
Add a regression test for the sitemap parser
2019-10-23 17:30:10 +03:00
Marius Gedminas
a1af1e9717
Fix sitemap parser
...
PyExpat wants bytes on Python 2. See #323.
2019-10-23 17:23:23 +03:00
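To make the bytes requirement concrete, here is a minimal Python 3 sketch that feeds bytes to an expat parser and collects the text of every loc element in a sitemap. It is not the project's sitemap parser; the function name and the UTF-8 encoding of str input are assumptions.

```python
import xml.parsers.expat


def extract_sitemap_urls(data):
    """Return the text of each <loc> element in a sitemap document."""
    if isinstance(data, str):
        # Assumption: text input is encoded as UTF-8 before parsing.
        data = data.encode("utf-8")
    urls = []
    current = None  # list of text chunks while inside <loc>, else None

    def start_element(name, attrs):
        nonlocal current
        if name == "loc":
            current = []

    def end_element(name):
        nonlocal current
        if name == "loc" and current is not None:
            urls.append("".join(current).strip())
            current = None

    def char_data(text):
        # expat may deliver one text node in several chunks.
        if current is not None:
            current.append(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(data, True)
    return urls
```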
Marius Gedminas
f46151dbf8
Merge pull request #318 from tkfu/docs/fix-install-instructions
...
Add instructions to install current release tag from git via pip
2019-10-23 09:47:25 +03:00
Marius Gedminas
938467c3ae
Merge pull request #324 from cjmayo/pdfminer
...
Add pdfminer to tox.ini and dev-requirements.txt to enable pdf test
2019-10-23 09:47:01 +03:00
Marius Gedminas
db3e25e934
Merge pull request #326 from linkchecker/fix-word-maybe
...
Fix MS Word parser, hopefully
2019-10-22 18:08:46 +03:00