linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-03-17 06:20:27 +00:00

Author	SHA1	Message	Date
Chris Mayo	1663e10fe7	Remove spaces after names in function definitions. This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	736c893707	Merge pull request #377 from cjmayo/tidyten3: Remove u string prefixes	2020-05-13 19:36:54 +01:00
Chris Mayo	b0ea72e8c1	Remove # -*- coding: lines. Except for tests that include non-unicode characters: tests/test_po.py, tests/test_strformat.py, tests/test_url.py, tests/checker/test_error.py, tests/checker/test_news.py	2020-05-08 10:45:31 +01:00
Chris Mayo	4d3e5abcfa	Remove u string prefixes	2020-04-30 20:11:59 +01:00
Chris Mayo	12a948894b	Fix space style in linkcheck/htmlutil/linkparse.py	2020-04-29 20:07:00 +01:00
Chris Mayo	9eed070a73	Stop using HTML handlers. LinkFinder is the only remaining HTML handler, therefore no need for htmlsoup.process_soup() as an independent function or TagFinder as a base class.	2020-04-29 20:07:00 +01:00
Chris Mayo	4ffdbf2406	Replace MetaRobotsFinder using BeautifulSoup.find()	2020-04-29 20:07:00 +01:00
Marius Gedminas	680783b1ff	SWF files are binary data. Should fix #372.	2020-04-27 11:25:37 +03:00
Chris Mayo	ee6628a831	Move HtmlParser/htmlsax.py to htmlutil/htmlsoup.py. Remove one subpackage and some import lines where htmlutil.linkparse is also being used.	2020-04-18 20:30:45 +01:00
Chris Mayo	eb3cf28baa	Remove support for start_end_element() callback. The LinkFinder handler start_end_element() callback does nothing apart from call start_element().	2020-04-10 13:51:09 +01:00
Chris Mayo	9d8d251d06	Replace Parser lineno() and column() methods. Stop storing this data in Parser object state.	2020-04-08 20:03:35 +01:00
Chris Mayo	3ff3d72492	Use BeautifulSoup element attrs directly	2020-04-03 19:24:08 +01:00
Chris Mayo	a7e1e20172	Remove last line and column from Parser. Only used for debug log message and not very useful.	2020-04-03 19:24:08 +01:00
Chris Mayo	2c000683e1	Remove unused linkcheck.htmlutil.linkname module. Unused since: `d6d48b48` ("html parser: use name instead of peeking", 2019-07-22)	2020-03-30 19:31:11 +01:00
Chris Mayo	607328d5c5	Support Beautiful Soup line numbers	2019-10-05 19:38:57 +01:00
Petr Dlouhý	d6d48b4814	html parser: use name instead of peeking	2019-07-22 19:59:37 +01:00
Petr Dlouhý	51a06d8a1e	Remove home-cooked htmlparser and use BeautifulSoup	2019-07-22 19:59:37 +01:00
anarcat	7cfb1136e9	Merge pull request #313 from cjmayo/titlefinder: Remove unused linkparse.TitleFinder	2019-10-07 11:30:10 -04:00
Chris Mayo	127c2272c4	Remove unused linkparse.TitleFinder. Stopped being used with removal of UrlBase.set_title_from_content() in: `7b34be59` ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)	2019-10-05 19:43:33 +01:00
Chris Mayo	5732606c58	Remove urlutil.decode_for_unquote(). Not needed since all content is now being decoded on retrieval. Added by: `a6643034` ("Python3: decode parts before submitting them to urllib.quote()", 2018-01-05)	2019-10-04 19:37:09 +01:00
Petr Dlouhý	e10f25b968	fixes for Python 3: fix running problems in Python 3	2019-09-10 19:30:09 +01:00
Petr Dlouhý	2c6411d68e	Python3: fix regexp format	2019-04-17 19:50:06 +01:00
Marius Gedminas	743a5f31cb	Crawl HTML attributes in deterministic order. Fixes #17.	2017-02-01 19:19:53 +02:00
Bastian Kleineidam	35eb30432e	Added some Python3 fixes.	2014-09-12 19:36:30 +02:00
Bastian Kleineidam	176b95a30e	Do not strip quotes from resolved URLs.	2014-07-11 00:43:46 +02:00
Bastian Kleineidam	82dd76b0d7	Add PDF link parsing.	2014-04-28 18:13:45 +02:00
Bastian Kleineidam	981079c041	Support itemtype attribute parsing.	2014-04-23 22:03:20 +02:00
Bastian Kleineidam	4232b69633	Support <img> srcset attribute parsing.	2014-04-10 17:51:59 +02:00
Bastian Kleineidam	9c5693ad41	Add doc and copyright.	2014-03-30 19:23:42 +02:00
Bastian Kleineidam	b6b5c7a12e	Simpler link parsing routine.	2014-03-27 19:49:17 +01:00
Bastian Kleineidam	81da2eb48f	Code cleanup	2014-03-27 17:19:52 +01:00
Bastian Kleineidam	7b34be590b	Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.	2014-03-01 00:12:34 +01:00
Bastian Kleineidam	c806be5c15	Updated copyright	2014-01-08 22:33:04 +01:00
Bastian Kleineidam	78ed1e9e52	Do not GET on POST forms.	2013-12-10 23:42:43 +01:00
Bastian Kleineidam	9b8cb67d78	Updated copyright.	2013-01-17 20:41:47 +01:00
Bastian Kleineidam	4dad2aa33c	Support dns-prefetch URLs.	2013-01-17 20:41:09 +01:00
Bastian Kleineidam	ecef16b2c9	Support WML sites.	2012-08-22 22:43:14 +02:00
Bastian Kleineidam	b550a9dcb5	Updated copyright.	2012-06-23 14:31:11 +02:00
Bastian Kleineidam	363ccc0121	Check <object codebase=...> as normal URL.	2012-06-23 14:28:32 +02:00
Bastian Kleineidam	cdf6b91b39	Don't use <object codebase=...> attribute as parent url.	2012-06-23 13:32:08 +02:00
Bastian Kleineidam	fb979b4f3c	Add test for archive attribute support.	2011-12-30 12:36:22 +01:00
Bastian Kleineidam	d06c43d470	Split comma-separated archive attribute values.	2011-12-30 08:58:45 +01:00
Bastian Kleineidam	4a4985a960	Add HTML5 link elements and attributes.	2011-12-30 08:55:38 +01:00
Bastian Kleineidam	a1f0867c74	Updated copyright	2011-05-06 20:27:36 +02:00
Bastian Kleineidam	dacc7e7ae4	Consolidate the stop messages.	2011-04-29 19:49:24 +02:00
Bastian Kleineidam	76f7f6b6a3	Prefer anchor element content as name instead of title attribute.	2010-07-30 21:03:04 +02:00
Bastian Kleineidam	c4c098bd83	pep8-ify the source a little more	2010-03-13 08:47:12 +01:00
Bastian Kleineidam	57397e938b	Improved linkname parsing by adding a new peek() HTML parser function.	2010-03-09 11:31:12 +01:00
Bastian Kleineidam	51a0ef0ad4	Speed up HTML parsing by stopping early and adding callbacks.	2010-03-08 09:04:33 +01:00
Bastian Kleineidam	5e06b6b8d4	Updated FSF address in GPL blurb	2009-07-24 23:58:20 +02:00

52 commits