linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-05-27 07:13:59 +00:00

Author	SHA1	Message	Date
anarcat	cf4e6bb235	Merge pull request #351 from cjmayo/tagsonly Remove support for non-Tag elements from Parser	2020-04-01 12:17:18 -04:00
Marius Gedminas	7c14bf1ad6	Declare supported Python versions in setup.py. The python_requires is the important one; it means once we publish a new release on PyPI, pip install will know not to try to install it if you run it on Python 2 and will fall back to an older version.	2020-04-01 17:49:51 +03:00
anarcat	b5c8a5d1ce	Merge pull request #314 from cjmayo/postbs4 Replace memoized with functools.lru_cache and deprecations	2020-04-01 10:28:18 -04:00
Chris Mayo	9fc651e82b	Remove Python 2 compatibility from parser tests	2020-03-31 20:10:35 +01:00
Chris Mayo	ffa6ac457f	Remove support for non-Tag elements from Parser. This change is made because the linkchecker handlers only process Tags. The test HtmlPrettyPrinter handler is updated to output element text because its support for non-Tag elements has been removed. This results in a number of the existing tests still passing.	2020-03-31 20:10:35 +01:00
Chris Mayo	d2cb1b9dd6	Raise minimum Python version to 3.5 in setup.py	2020-03-31 19:46:31 +01:00
Chris Mayo	e7c5f353cd	Remove unused function linkcheck.fileutil.write_file(). Doesn't appear to have ever been used. Causes flake8 error: linkcheck/fileutil.py:45:9: F821 undefined name 'file'	2020-03-31 19:46:31 +01:00
Chris Mayo	c3860e2218	Remove third_party directory from MANIFEST.in Unused since: `0a13fae3` ("remove third party packages and use them as dependency", 2018-01-06)	2020-03-31 19:46:31 +01:00
Chris Mayo	504004d4f0	Use ipaddress in network.iputil.is_valid_ip() ipaddress was introduced in Python 3.3.	2020-03-31 19:46:31 +01:00
Chris Mayo	2eb1424703	Replace deprecated plistlib.readPlistFromBytes() in bookmarks.safari Remove Python 2 code. plistlib.loads() was added in Python 3.4.	2020-03-31 19:46:31 +01:00
Chris Mayo	0ee4414a60	Replace memoized with functools.lru_cache	2020-03-31 19:46:31 +01:00
Marius Gedminas	61b30a95dd	Switch to travis-ci.com Migrating from legacy GitHub services/webhooks to the new Travis CI GitHub app means we also have to use travis-ci.com instead of travis-ci.org to see build status or history.	2020-03-31 18:35:37 +03:00
anarcat	67f91fee54	Merge pull request #349 from cjmayo/unused Remove unused code	2020-03-31 11:20:31 -04:00
Chris Mayo	1255119ca8	Move HtmlPrinter and HtmlPrettyPrinter into tests	2020-03-30 19:32:30 +01:00
Chris Mayo	ce1d669329	Remove unused functions from linkcheck.httputil. http_persistent() unused since: `4b818cb4` ("Detect more cases to close the connection, and close response objects", 2006-09-15); http_keepalive(), get_content_encoding() unused since: `7b34be59` ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)	2020-03-30 19:32:30 +01:00
Chris Mayo	5b66964afa	Remove unused .charset from checker classes Unused since: `4f8c2954` ("Don't set parser.encoding", 2019-10-05)	2020-03-30 19:32:30 +01:00
Chris Mayo	f743be57e8	Remove unused functions from linkcheck.HtmlParser. resolve_entities() unused since: `2c000683` ("Remove unused linkcheck.htmlutil.linkname module", 2020-03-30); set_doctype(), set_encoding() unused since: `51a06d8a` ("Remove home-cooked htmlparser and use BeautifulSoup", 2019-07-22)	2020-03-30 19:32:18 +01:00
Chris Mayo	2c000683e1	Remove unused linkcheck.htmlutil.linkname module Unused since: `d6d48b48` ("html parser: use name instead of peeking", 2019-07-22)	2020-03-30 19:31:11 +01:00
Marius Gedminas	78530956a1	Merge pull request #337 from linkchecker/htmlparser-beautifulsoup Change HtmlParser to use Beautiful Soup	2020-03-30 20:45:14 +03:00
Chris Mayo	9030050599	Remove Python 3 status document	2020-03-30 17:39:23 +01:00
Marius Gedminas	af0f50efa8	Restore support for older BeautifulSoup4 versions	2020-03-30 14:49:56 +03:00
Marius Gedminas	ccc0ee0464	Clean up travis and tox.ini. I want the Python 3.5 travis job to run just tox -e py35, without the oldbs4 job, and without an explicit TOXENV setting that is awkward to insert in the .travis.yml (also, it reorders the jobs putting 3.5 below 3.8 which annoys me). I think I found a way of doing that by renaming py35-oldbs4 to oldbs4.	2020-03-30 14:46:44 +03:00
Marius Gedminas	ed08e7fa7e	Split the oldbs4 into a separate Travis job (take 3) I did an oopsie whoopsie with the YAML syntax in my previous commit.	2020-03-23 16:50:27 +02:00
Marius Gedminas	894f0b0922	Split the oldbs4 into a separate Travis job (take 2) The previous attempt did not work: the 3.5 build ran both toxenvs.	2020-03-23 16:45:46 +02:00
Marius Gedminas	ba5888f06a	Split the oldbs4 into a separate Travis job	2020-03-23 16:40:22 +02:00
Marius Gedminas	0417f677c2	Ignore files created during test runs	2020-03-23 11:05:13 +02:00
Marius Gedminas	6a50fe9d86	Add Python 3.8 to the build matrix	2020-03-23 11:00:25 +02:00
Marius Gedminas	a311ebb97e	Fix doctype tests I don't think linkchecker actually cares about the document type, so I'm not sure why we're even testing this...	2020-03-23 10:56:57 +02:00
Chris Mayo	5eaad24641	Use HTTP header encoding for decoding	2020-03-22 19:54:37 +00:00
Chris Mayo	f5ae90e824	Parser threading lock no longer required with Beautiful Soup	2020-03-22 19:54:37 +00:00
Marius Gedminas	205ceb6805	Merge pull request #344 from hroncok/beautifulsoup4-requirement Require beautifulsoup4 instead of bs4	2020-02-06 12:52:20 +02:00
Miro Hrončok	ff5ebbae69	Require beautifulsoup4 instead of bs4. bs4 is a dummy package managed by the developer of Beautiful Soup to prevent name squatting. The official name of PyPI’s Beautiful Soup Python package is beautifulsoup4. The bs4 package ensures that if you type pip install bs4 by mistake you will end up with Beautiful Soup. However, for requirements, it's cleaner to use the proper name. For downstream packaging in Fedora, this avoids the need of packaging the dummy package.	2020-02-06 10:05:13 +01:00
anarcat	e37dab8a4b	Merge pull request #339 from cjmayo/notafter Actually fix TypeError when checking https link	2019-11-21 10:33:27 -05:00
Chris Mayo	d3d6638973	Actually fix TypeError when checking https link. The test was added but not the fix in: `ecd06776` ("Fix TypeError when checking https link and test", 2019-11-11). Which is caught by the new test when run on Python 3: TestHttps.test_x509_to_dict | [gw14] linux -- Python 3.6.9 /usr/bin/python3.6 | tests/checker/test_https.py:72: in test_x509_to_dict | self.assertEqual(httputil.x509_to_dict(cert)["notAfter"], | linkcheck/httputil.py:47: in x509_to_dict | parsedtime = asn1_generaltime_to_seconds(notAfter) | linkcheck/httputil.py:68: in asn1_generaltime_to_seconds | res = datetime.strptime(timestr, timeformat + 'Z') | E TypeError: strptime() argument 1 must be str, not bytes	2019-11-19 20:06:10 +00:00
anarcat	c92ab72676	Merge pull request #338 from cjmayo/https Enable https checking using a test server	2019-11-14 09:38:54 -05:00
Chris Mayo	ecd06776ab	Fix TypeError when checking https link and test. File "/usr/lib/python3.7/site-packages/linkcheck/httputil.py", line 68, in asn1_generaltime_to_seconds | line: res = datetime.strptime(timestr, timeformat + 'Z') | locals: | res = <local> None | datetime = <global> <class 'datetime.datetime'> | datetime.strptime = <global> <built-in method strptime of type object at 0x7fa39064dda0> | timestr = <local> b'20191106202117Z' | timeformat = <local> '%Y%m%d%H%M%S' | TypeError: strptime() argument 1 must be str, not bytes | pyOpenSSL OpenSSL.crypto.X509.get_notAfter() returns bytes: https://www.pyopenssl.org/en/stable/api/crypto.html#OpenSSL.crypto.X509.get_notAfter	2019-11-11 20:12:25 +00:00
Chris Mayo	dee4be4b1d	Enable https checking using a test server Verification has to be turned off because we are using a self-signed certificate.	2019-11-11 20:12:25 +00:00
anarcat	5308ec5204	Merge pull request #336 from cjmayo/logdiff Improve test failure diff	2019-10-29 16:20:26 -04:00
Chris Mayo	2f16152dc8	Improve test failure diff. Some url lines were missing a url prefix while others had a double url prefix. diff was reporting more url lines as changed than actually had. Improve formatting by removing newlines from control lines and adding headings. Before: E AssertionError: http://localhost:46031/tests/checker/data/sitemap.xml | E --- | E | E +++ | E | E @@ -1,4 +1,8 @@ | E | E -url http://localhost:46031/tests/checker/data/sitemap.xml | E +http://www.example.com/ | E +cache key http://www.example.com/ | E +real url http://www.example.com/ | E +valid | E +url url http://localhost:46031/tests/checker/data/sitemap.xml | E cache key http://localhost:46031/tests/checker/data/sitemap.xml | E real url http://localhost:46031/tests/checker/data/sitemap.xml | E valid. After: E AssertionError: http://localhost:44021/tests/checker/data/sitemap.xml | E --- expected | E +++ result | E @@ -2,3 +2,7 @@ | E cache key http://localhost:44021/tests/checker/data/sitemap.xml | E real url http://localhost:44021/tests/checker/data/sitemap.xml | E valid | E +url http://www.example.com/ | E +cache key http://www.example.com/ | E +real url http://www.example.com/ | E +valid	2019-10-29 20:03:08 +00:00
Marius Gedminas	c294a4e6c1	Merge pull request #335 from cjmayo/sitemap Fix XmlTagUrlParser and make Python 3 compatible	2019-10-29 15:50:49 +02:00
Chris Mayo	ec8b6e09f0	Fix XmlTagUrlParser and make Python 3 compatible URLs within a sitemap file were not being captured.	2019-10-28 19:20:05 +00:00
Marius Gedminas	8bdd402aed	Merge pull request #333 from linkchecker/fix-clamav-on-py3 Fix test_clamav.py on Python 3	2019-10-25 16:16:23 +03:00
Marius Gedminas	5b2b3613ec	Merge pull request #330 from linkchecker/fix-sitemap Fix sitemap parser	2019-10-25 16:15:55 +03:00
anarcat	6dcc9dbf9d	Merge pull request #332 from cjmayo/py3pdf Make PdfParser Python 3 compatible	2019-10-25 08:38:59 -04:00
Marius Gedminas	f9766a2049	Python 3: fix bytes vs strings in viruscheck plugin Socket communication deals with bytes. There are probably remaining issues with the viruscheck plugin on Python 3, we just can't see them because the code is not fully covered with tests.	2019-10-25 14:24:07 +03:00
Marius Gedminas	65f861901c	Fix all Python 3 tox environments Old pdfminer supports Python 2 only, new pdfminer supports Python 3 only.	2019-10-25 14:20:31 +03:00
Chris Mayo	b2e63663f8	Make PdfParser Python 3 compatible basestring is not available in Python 3. Ensure all URLs are Unicode. url_data.get_raw_content() is returning bytes.	2019-10-24 19:57:27 +01:00
Marius Gedminas	011f6c147e	Merge pull request #331 from linkchecker/explain-skips Explain why these tests are being skipped	2019-10-23 17:59:55 +03:00
Marius Gedminas	606ece0308	Explain why these tests are being skipped. pytest output before this change: SKIPPED [3] tests/__init__.py:217: condition: True | SKIPPED [1] tests/checker/test_news.py:63: condition: True | SKIPPED [1] tests/checker/test_news.py:41: condition: True | SKIPPED [1] tests/checker/test_news.py:116: condition: True | SKIPPED [1] tests/checker/test_news.py:75: condition: True. After: SKIPPED [3] tests/__init__.py: disabled for now until some stable news server comes up | SKIPPED [4] tests/checker/test_news.py: disabled for now until some stable news server comes up	2019-10-23 17:35:31 +03:00
Marius Gedminas	87b504785c	Add a regression test for the sitemap parser	2019-10-23 17:30:10 +03:00
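
Note on ecd06776ab / d3d6638973: the TypeError in those commits arises because pyOpenSSL's X509.get_notAfter() returns bytes, while datetime.strptime() on Python 3 accepts only str. The following is a minimal sketch of one way to handle that, not the project's actual code. The function name is borrowed from the traceback; the body, the None-on-failure behaviour, and the UTC epoch-seconds return value are assumptions made for illustration.

```python
from datetime import datetime, timezone


def asn1_generaltime_to_seconds(timestr):
    """Convert an ASN.1 GENERALIZEDTIME value such as b'20191106202117Z'
    to seconds since the epoch (UTC). Returns None if it cannot be parsed.

    Illustrative only: accepts bytes (as returned by pyOpenSSL's
    X509.get_notAfter()) or str.
    """
    timeformat = "%Y%m%d%H%M%S"
    try:
        if isinstance(timestr, bytes):
            # strptime() requires str on Python 3, so decode first.
            timestr = timestr.decode("ascii")
        res = datetime.strptime(timestr, timeformat + "Z")
    except ValueError:
        # Covers both a malformed timestamp and undecodable bytes
        # (UnicodeDecodeError is a subclass of ValueError).
        return None
    # The trailing 'Z' means UTC; attach the zone before converting.
    return res.replace(tzinfo=timezone.utc).timestamp()
```

Decoding at the boundary keeps the rest of the function working on str only, so the same code path serves callers that already pass text.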


6201 commits