linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-05-24 22:23:43 +00:00

Author	SHA1	Message	Date
Chris Mayo	f6e182f0e4	Mark TestFile.test_html_url_quote as need_network. Else without the internet the test fails, eventually, with: warning No MX mail host for users.sourceforge.net found	2020-05-25 19:55:28 +01:00
Chris Mayo	d3c9618b1b	TestHttps.test_https doesn't need the internet now. A result of changes introduced in: `dee4be4b` ("Enable https checking using a test server", 2019-11-11)	2020-05-25 19:55:28 +01:00
Chris Mayo	32689ea230	Enable as many TestHttp html tests as possible without the internet	2020-05-25 19:55:28 +01:00
Chris Mayo	313a14ff0d	Remove instances of Python 2 unicode	2020-05-24 19:14:47 +01:00
Marius Gedminas	d0169c46d4	Merge pull request #348 from weshaggard/HandleRateLimiting: Turn status code 429 into warning instead of failure	2020-05-24 16:16:56 +03:00
Chris Mayo	d611564cb0	Add a test for an empty html file accessed over http	2020-05-23 20:01:24 +01:00
Marius Gedminas	f268a90cfb	Merge branch 'master' into HandleRateLimiting	2020-05-23 14:15:52 +03:00
Marius Gedminas	5bd1fb4e36	Fix internal error on empty HTML files. When BeautifulSoup finds an empty file on disk, it sets original_encoding to None. It doesn't matter what encoding we pick for empty files, so let's just pick one. I don't know if there are any circumstances where BeautifulSoup might set the encoding to None for a non-empty file. Closes #392.	2020-05-21 19:01:33 +03:00
Chris Mayo	96e1c00ff7	TestLogger diff output is all Unicode in Python 3	2020-05-20 19:58:44 +01:00
Chris Mayo	71eaf9a982	Remove str_text from tests/	2020-05-19 19:56:42 +01:00
Chris Mayo	a127902607	Replace str_text in asserts	2020-05-19 19:56:42 +01:00
Chris Mayo	12fd59057e	Remove duplicate tests from test_strformat.py	2020-05-17 20:10:28 +01:00
Chris Mayo	339d293326	Convert tests/test_po.py to UTF-8	2020-05-17 20:10:28 +01:00
Chris Mayo	04465530c4	Use HttpServerTest.get_url()	2020-05-17 20:10:28 +01:00
Chris Mayo	58dbe1f282	Remove unused import pytest from tests/checker/test_http.py. pytest.mark.xfail() removed in: `743a5f31` ("Crawl HTML attributes in deterministic order", 2017-02-01)	2020-05-17 20:10:28 +01:00
Chris Mayo	79eafee826	Add a test for VirusCheck	2020-05-17 19:04:49 +01:00
Chris Mayo	a15a2833ca	Remove spaces after names in class method definitions. And also nested functions. This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	1663e10fe7	Remove spaces after names in function definitions. This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	fc11d08968	Remove spaces after names in class definitions	2020-05-16 20:19:42 +01:00
Chris Mayo	1416a08119	On Python 3 no need to convert os.linesep to a string	2020-05-16 17:02:01 +01:00
Chris Mayo	10552a79c7	Remove LinkCheckTest.fail_unicode(). No need to encode Python 3 strings before output.	2020-05-16 17:02:00 +01:00
Chris Mayo	9f95d06a39	Remove Python 2 test.test_support import	2020-05-16 16:26:38 +01:00
Chris Mayo	f8c9faec1b	Remove Python 2 cStringIO imports	2020-05-15 19:37:04 +01:00
Chris Mayo	bda9612273	Make html.escape Python 3 only	2020-05-14 20:15:28 +01:00
Chris Mayo	42de609f8e	Make urllib imports Python 3 only	2020-05-14 20:15:28 +01:00
Chris Mayo	08ddf658bc	Merge pull request #366 from cjmayo/userorpwd: Support login forms with user and/or password	2020-05-13 19:37:44 +01:00
Chris Mayo	736c893707	Merge pull request #377 from cjmayo/tidyten3: Remove u string prefixes	2020-05-13 19:36:54 +01:00
Chris Mayo	00c4a30386	Add user and password only loginurl tests	2020-05-13 19:32:29 +01:00
Chris Mayo	31a9f68c46	Merge pull request #367 from cjmayo/loginurl: Add test for loginurl	2020-05-12 20:08:57 +01:00
Chris Mayo	44e81d27dd	Remove inheriting object. All Python 3 classes are new-style.	2020-05-08 10:45:31 +01:00
Chris Mayo	b0ea72e8c1	Remove # -*- coding: lines. Except for tests that include non-unicode characters: tests/test_po.py, tests/test_strformat.py, tests/test_url.py, tests/checker/test_error.py, tests/checker/test_news.py	2020-05-08 10:45:31 +01:00
Chris Mayo	4d3e5abcfa	Remove u string prefixes	2020-04-30 20:11:59 +01:00
anarcat	ab476fa4bf	Merge pull request #364 from cjmayo/parser5: Stop using HTML handlers and improve login form error handling	2020-04-30 09:28:48 -04:00
Chris Mayo	1d1d9c3bde	Add testing for variants of the robots meta directive	2020-04-29 20:14:10 +01:00
Chris Mayo	9eed070a73	Stop using HTML handlers. LinkFinder is the only remaining HTML handler, therefore no need for htmlsoup.process_soup() as an independent function or TagFinder as a base class.	2020-04-29 20:07:00 +01:00
Chris Mayo	a1433767e5	Replace HtmlPrettyPrinter with pretty_print_html()	2020-04-29 20:07:00 +01:00
Chris Mayo	0361d9e0e8	Remove encoding and default fd from HtmlPrettyPrinter. Neither are used.	2020-04-29 20:07:00 +01:00
Chris Mayo	4ffdbf2406	Replace MetaRobotsFinder using BeautifulSoup.find()	2020-04-29 20:07:00 +01:00
Chris Mayo	8fc0dcc055	Make matching login form credentials case-sensitive. The keys of the form.data dictionary are case-sensitive and therefore a KeyError was possible if the configured values are not identical to the input element name attributes.	2020-04-27 18:06:29 +01:00
Chris Mayo	7a6ef938cc	Rename htmlutil.formsearch to htmlutil.loginformsearch. Make it clear that this module has only one specific use.	2020-04-27 18:06:29 +01:00
anarcat	183d483074	Merge pull request #365 from cjmayo/tidyten1: Remove use of the future package	2020-04-26 12:02:30 -04:00
Chris Mayo	3b8af403be	Add test for loginurl. A new cgi-bin directory is created to identify the scripts to be run by http.server.CGIHTTPRequestHandler.	2020-04-19 19:05:55 +01:00
Chris Mayo	56b8c9f7ab	Add tests for <meta name="robots" content="nofollow">. norobots.html was used for testing <meta name="robots" content="nofollow"> in local files until [1]. This commit reinstates local file testing and adds an http test. Checking is reported by checker.httpurl.HttpUrl.content_allows_robots(). [1] `ce733ae7` ("Don't check for robots.txt directives in local html files.", 2014-03-19)	2020-04-18 20:30:46 +01:00
Chris Mayo	d189445a8e	LinkFinder does not raise StopParse	2020-04-18 20:30:46 +01:00
Chris Mayo	ee6628a831	Move HtmlParser/htmlsax.py to htmlutil/htmlsoup.py. Remove one subpackage and some import lines where htmlutil.linkparse is also being used.	2020-04-18 20:30:45 +01:00
Chris Mayo	a83fbb56c0	Remove from __future__ imports	2020-04-15 19:49:16 +01:00
Chris Mayo	f5e7f3a382	Remove use of the future package. It was providing Python 2 compatibility.	2020-04-15 19:49:16 +01:00
Chris Mayo	0795e3c1b4	Replace Parser class using BeautifulSoup.find_all()	2020-04-10 13:51:09 +01:00
Chris Mayo	eb3cf28baa	Remove support for start_end_element() callback. The LinkFinder handler start_end_element() callback does nothing apart from call start_element().	2020-04-10 13:51:09 +01:00
Chris Mayo	c9f17e92b9	Remove support for end_element() callback	2020-04-10 13:51:09 +01:00


519 commits