linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-05-07 22:24:45 +00:00

Author	SHA1	Message	Date
Chris Mayo	79eafee826	Add a test for VirusCheck	2020-05-17 19:04:49 +01:00
Chris Mayo	a15a2833ca	Remove spaces after names in class method definitions. And also nested functions. This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	1663e10fe7	Remove spaces after names in function definitions. This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	fc11d08968	Remove spaces after names in class definitions	2020-05-16 20:19:42 +01:00
Chris Mayo	1416a08119	On Python 3 no need to convert os.linesep to a string	2020-05-16 17:02:01 +01:00
Chris Mayo	10552a79c7	Remove LinkCheckTest.fail_unicode(). No need to encode Python 3 strings before output.	2020-05-16 17:02:00 +01:00
Chris Mayo	9f95d06a39	Remove Python 2 test.test_support import	2020-05-16 16:26:38 +01:00
Chris Mayo	f8c9faec1b	Remove Python 2 cStringIO imports	2020-05-15 19:37:04 +01:00
Chris Mayo	bda9612273	Make html.escape Python 3 only	2020-05-14 20:15:28 +01:00
Chris Mayo	42de609f8e	Make urllib imports Python 3 only	2020-05-14 20:15:28 +01:00
Chris Mayo	08ddf658bc	Merge pull request #366 from cjmayo/userorpwd: Support login forms with user and/or password	2020-05-13 19:37:44 +01:00
Chris Mayo	736c893707	Merge pull request #377 from cjmayo/tidyten3: Remove u string prefixes	2020-05-13 19:36:54 +01:00
Chris Mayo	00c4a30386	Add user and password only loginurl tests	2020-05-13 19:32:29 +01:00
Chris Mayo	31a9f68c46	Merge pull request #367 from cjmayo/loginurl: Add test for loginurl	2020-05-12 20:08:57 +01:00
Chris Mayo	44e81d27dd	Remove inheriting object. All Python 3 classes are new-style.	2020-05-08 10:45:31 +01:00
Chris Mayo	b0ea72e8c1	Remove # -*- coding: lines. Except for tests that include non-unicode characters: tests/test_po.py, tests/test_strformat.py, tests/test_url.py, tests/checker/test_error.py, tests/checker/test_news.py	2020-05-08 10:45:31 +01:00
Chris Mayo	4d3e5abcfa	Remove u string prefixes	2020-04-30 20:11:59 +01:00
anarcat	ab476fa4bf	Merge pull request #364 from cjmayo/parser5: Stop using HTML handlers and improve login form error handling	2020-04-30 09:28:48 -04:00
Chris Mayo	1d1d9c3bde	Add testing for variants of the robots meta directive	2020-04-29 20:14:10 +01:00
Chris Mayo	9eed070a73	Stop using HTML handlers. LinkFinder is the only remaining HTML handler, therefore no need for htmlsoup.process_soup() as an independent function or TagFinder as a base class.	2020-04-29 20:07:00 +01:00
Chris Mayo	a1433767e5	Replace HtmlPrettyPrinter with pretty_print_html()	2020-04-29 20:07:00 +01:00
Chris Mayo	0361d9e0e8	Remove encoding and default fd from HtmlPrettyPrinter. Neither are used.	2020-04-29 20:07:00 +01:00
Chris Mayo	4ffdbf2406	Replace MetaRobotsFinder using BeautifulSoup.find()	2020-04-29 20:07:00 +01:00
Chris Mayo	8fc0dcc055	Make matching login form credentials case-sensitive. The keys of the form.data dictionary are case-sensitive and therefore a KeyError was possible if the configured values are not identical to the input element name attributes.	2020-04-27 18:06:29 +01:00
Chris Mayo	7a6ef938cc	Rename htmlutil.formsearch to htmlutil.loginformsearch. Make it clear that this module has only one specific use.	2020-04-27 18:06:29 +01:00
anarcat	183d483074	Merge pull request #365 from cjmayo/tidyten1: Remove use of the future package	2020-04-26 12:02:30 -04:00
Chris Mayo	3b8af403be	Add test for loginurl. A new cgi-bin directory is created to identify the scripts to be run by http.server.CGIHTTPRequestHandler.	2020-04-19 19:05:55 +01:00
Chris Mayo	56b8c9f7ab	Add tests for <meta name="robots" content="nofollow">. norobots.html was used for testing <meta name="robots" content="nofollow"> in local files until [1]. This commit reinstates local file testing and adds an http test. Checking is reported by checker.httpurl.HttpUrl.content_allows_robots(). [1] `ce733ae7` ("Don't check for robots.txt directives in local html files.", 2014-03-19)	2020-04-18 20:30:46 +01:00
Chris Mayo	d189445a8e	LinkFinder does not raise StopParse	2020-04-18 20:30:46 +01:00
Chris Mayo	ee6628a831	Move HtmlParser/htmlsax.py to htmlutil/htmlsoup.py. Remove one subpackage and some import lines where htmlutil.linkparse is also being used.	2020-04-18 20:30:45 +01:00
Chris Mayo	a83fbb56c0	Remove from __future__ imports	2020-04-15 19:49:16 +01:00
Chris Mayo	f5e7f3a382	Remove use of the future package. It was providing Python 2 compatibility.	2020-04-15 19:49:16 +01:00
Chris Mayo	0795e3c1b4	Replace Parser class using BeautifulSoup.find_all()	2020-04-10 13:51:09 +01:00
Chris Mayo	eb3cf28baa	Remove support for start_end_element() callback. The LinkFinder handler start_end_element() callback does nothing apart from call start_element().	2020-04-10 13:51:09 +01:00
Chris Mayo	c9f17e92b9	Remove support for end_element() callback	2020-04-10 13:51:09 +01:00
Chris Mayo	48b590cf8b	Replace FormFinder using BeautifulSoup.find_all(). FormFinder was the only handler that used an end_element() callback and was therefore a blocker to moving the Parser class to use BeautifulSoup.find_all(). FormFinder was a specialised handler used to parse a login form at the start of a session if the user had configured authentication credentials.	2020-04-10 13:51:05 +01:00
Chris Mayo	974915cc4f	Remove encoding from Parser. Only used by the test and an attribute of the soup object.	2020-04-08 20:03:35 +01:00
Chris Mayo	02e1c389b2	Remove parser flush() and reset(). Remnants of the feed() interface.	2020-04-08 20:03:35 +01:00
Chris Mayo	3771dd9136	Use parser.feed_soup() instead of parser.feed(). Markup is not being passed in pieces to the parser, so simplify the interface and reduce the state further.	2020-04-08 20:03:35 +01:00
Chris Mayo	9d8d251d06	Replace Parser lineno() and column() methods. Stop storing this data in Parser object state.	2020-04-08 20:03:35 +01:00
Chris Mayo	514210199d	Add tests for search_form	2020-04-07 19:24:34 +01:00
Chris Mayo	036b900ffc	Remove unused linkcheck.containers classes	2020-04-03 19:24:08 +01:00
Chris Mayo	3ff3d72492	Use BeautifulSoup element attrs directly	2020-04-03 19:24:08 +01:00
Chris Mayo	28701e291a	Remove use of Python 2 unicode() and related u prefixes. Several instances for MS Windows left unchanged.	2020-04-01 19:39:50 +01:00
anarcat	cf4e6bb235	Merge pull request #351 from cjmayo/tagsonly: Remove support for non-Tag elements from Parser	2020-04-01 12:17:18 -04:00
Chris Mayo	9fc651e82b	Remove Python 2 compatibility from parser tests	2020-03-31 20:10:35 +01:00
Chris Mayo	ffa6ac457f	Remove support for non-Tag elements from Parser. This change is made because the linkchecker handlers only process Tags. The test HtmlPrettyPrinter handler is updated to output element text because its support for non-Tag elements has been removed. This results in a number of the existing tests still passing.	2020-03-31 20:10:35 +01:00
Chris Mayo	0ee4414a60	Replace memoized with functools.lru_cache	2020-03-31 19:46:31 +01:00
Chris Mayo	1255119ca8	Move HtmlPrinter and HtmlPrettyPrinter into tests	2020-03-30 19:32:30 +01:00
Chris Mayo	f743be57e8	Remove unused functions from linkcheck.HtmlParser. resolve_entities() unused since: `2c000683` ("Remove unused linkcheck.htmlutil.linkname module", 2020-03-30); set_doctype(), set_encoding() unused since: `51a06d8a` ("Remove home-cooked htmlparser and use BeautifulSoup", 2019-07-22)	2020-03-30 19:32:18 +01:00


503 commits