linkchecker

mirror of https://github.com/Hopiu/linkchecker.git synced 2026-04-25 08:34:43 +00:00

Author	SHA1	Message	Date
Chris Mayo	1663e10fe7	Remove spaces after names in function definitions This is a PEP 8 convention, E211.	2020-05-16 20:19:42 +01:00
Chris Mayo	42de609f8e	Make urllib imports Python 3 only	2020-05-14 20:15:28 +01:00
Chris Mayo	736c893707	Merge pull request #377 from cjmayo/tidyten3 Remove u string prefixes	2020-05-13 19:36:54 +01:00
Chris Mayo	44e81d27dd	Remove inheriting object All Python 3 classes are new-style.	2020-05-08 10:45:31 +01:00
Chris Mayo	b0ea72e8c1	Remove # -*- coding: lines Except for tests that include non-unicode characters: tests/test_po.py tests/test_strformat.py tests/test_url.py tests/checker/test_error.py tests/checker/test_news.py	2020-05-08 10:45:31 +01:00
Chris Mayo	4d3e5abcfa	Remove u string prefixes	2020-04-30 20:11:59 +01:00
anarcat	183d483074	Merge pull request #365 from cjmayo/tidyten1 Remove use of the future package	2020-04-26 12:02:30 -04:00
Chris Mayo	ee6628a831	Move HtmlParser/htmlsax.py to htmlutil/htmlsoup.py Remove one subpackage and some import lines where htmlutil.linkparse is also being used.	2020-04-18 20:30:45 +01:00
Chris Mayo	f5e7f3a382	Remove use of the future package It was providing Python 2 compatibility.	2020-04-15 19:49:16 +01:00
Chris Mayo	40f43ae41c	Create one function to make soup objects	2020-04-08 20:03:35 +01:00
Chris Mayo	3ff3d72492	Use BeautifulSoup element attrs directly	2020-04-03 19:24:08 +01:00
Chris Mayo	5b66964afa	Remove unused .charset from checker classes Unused since: `4f8c2954` ("Don't set parser.encoding", 2019-10-05)	2020-03-30 19:32:30 +01:00
Chris Mayo	646e138166	Pass encoding when unquoting Else non-UTF-8 codes are misinterpreted: >>> from urllib import parse >>> parse.unquote("%FF") '�' >>> parse.unquote("%FF", "latin1") 'ÿ'	2019-10-05 19:38:57 +01:00
Chris Mayo	153e53ba03	Reuse soup object used for detecting encoding in the HTML parser	2019-10-05 19:38:57 +01:00
Chris Mayo	607328d5c5	Support Beautiful Soup line numbers	2019-10-05 19:38:57 +01:00
Chris Mayo	5fc01455b7	Decode content when retrieved, use bs4 to detect encoding if non-Unicode UrlBase has been modified as follows: - the "data" variable now holds bytes - decoded content is stored in a new variable "text" - functionality from get_content() has been split out into get_raw_content() which returns "data" and download_content() which calls read_content() and sets the download related variables. This allows for subclasses to do their own decoding and parsers to use bytes.	2019-09-30 19:46:24 +01:00
Petr Dlouhý	c2af88ad2e	Python3: fix for test_telnet in urlbase.py	2019-09-15 19:49:26 +01:00
Petr Dlouhý	e10f25b968	fixes for Python 3: fix running problems in Python 3	2019-09-10 19:30:09 +01:00
Petr Dlouhý	e92b0a9f7b	Python3: fix unicode in urlbase	2019-04-25 19:57:45 +01:00
Petr Dlouhý	b3881ce3b5	Python3: fix urlbase, strformat and others	2019-04-25 19:57:45 +01:00
Petr Dlouhý	4acabf5cb5	fix urllib imports	2019-04-09 20:09:35 +01:00
Graham Seaman	233e7dcf68	Allow wayback-format urls without affecting atom 'feed' urls	2017-02-09 11:43:45 +00:00
Bastian Kleineidam	35eb30432e	Added some Python3 fixes.	2014-09-12 19:36:30 +02:00
Bastian Kleineidam	0fa7ed2699	Fix empty URL handling.	2014-07-03 23:34:40 +02:00
Bastian Kleineidam	82dd76b0d7	Add PDF link parsing.	2014-04-28 18:13:45 +02:00
Bastian Kleineidam	22caa9367a	Refactor recursion checks.	2014-04-10 17:50:55 +02:00
Bastian Kleineidam	ce733ae76b	Don't check for robots.txt directives in local html files.	2014-03-19 16:33:22 +01:00
Bastian Kleineidam	6437f08277	Display downloaded bytes.	2014-03-14 21:06:10 +01:00
Bastian Kleineidam	c51caf1133	Assertions should be earlier.	2014-03-14 20:26:11 +01:00
Bastian Kleineidam	cfff4c4a84	Disable URL length warning for data: URLs.	2014-03-14 20:24:28 +01:00
Bastian Kleineidam	bca226c293	Fix assertion checking external links; fix tests	2014-03-10 18:23:44 +01:00
Bastian Kleineidam	6b334dc79b	Fix URL result caching.	2014-03-08 19:35:10 +01:00
Bastian Kleineidam	fab2c2da98	Improve content type setting.	2014-03-05 20:12:19 +01:00
Bastian Kleineidam	ef13a3fce1	Implement sitemap and sitemap index parsing.	2014-03-05 19:26:37 +01:00
Bastian Kleineidam	b72cf252fb	Move parseable check down since it might get the content.	2014-03-05 19:26:05 +01:00
Bastian Kleineidam	9ef65cb774	Fix UrlData string representation.	2014-03-05 19:25:40 +01:00
Bastian Kleineidam	192cfab009	Cleanup of the UrlData.is_* functions	2014-03-05 19:23:16 +01:00
Bastian Kleineidam	978b24f2d7	Merge branch 'caching'	2014-03-04 07:21:42 +01:00
Bastian Kleineidam	f1076c8813	Increase url-too-long warning.	2014-03-03 23:31:04 +01:00
Bastian Kleineidam	82f81241fd	Check all links and add better caching.	2014-03-03 23:29:45 +01:00
Bastian Kleineidam	7b34be590b	Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.	2014-03-01 00:12:34 +01:00
Bastian Kleineidam	c806be5c15	Updated copyright	2014-01-08 22:33:04 +01:00
Bastian Kleineidam	0ca63797bf	Remove content cache.	2013-12-10 23:41:52 +01:00
Bastian Kleineidam	023da7c993	Remove the duplicate URL content check.	2013-12-04 19:12:40 +01:00
Bastian Kleineidam	64d95e45e0	Remove local HTML and CSS syntax check.	2013-02-08 21:36:02 +01:00
Bastian Kleineidam	e6ad32c028	Catch UnicodeError for invalid host names.	2013-01-23 19:42:29 +01:00
Bastian Kleineidam	7fe72745ae	Updated copyright.	2013-01-09 23:03:12 +01:00
Bastian Kleineidam	a5b6136e70	Check word document validity before closing.	2013-01-07 21:58:02 +01:00
Bastian Kleineidam	9820530313	Use better_exchook to print more internal error info.	2012-12-18 23:06:48 +01:00
Bastian Kleineidam	42a17cbb98	Prepare py3 port and display sys.argv on internal errors.	2012-11-26 18:49:07 +01:00
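
Note on commit 646e138166 ("Pass encoding when unquoting"): the message's interpreter session can be reproduced with the standard library alone. The sketch below is illustrative only. The helper name `unquote_with_encoding` and its fallback to UTF-8 are assumptions made for this example and are not taken from the linkchecker source.

```python
from urllib import parse

# "%FF" is not a valid UTF-8 byte sequence, so the default call
# (encoding="utf-8", errors="replace") yields U+FFFD, the replacement character.
default_decoded = parse.unquote("%FF")

# Naming the encoding explicitly recovers the intended character.
latin1_decoded = parse.unquote("%FF", encoding="latin1")


def unquote_with_encoding(value, encoding=None):
    """Hypothetical helper: unquote using a known page encoding.

    Falls back to UTF-8 when no encoding is given (an assumption made
    for this sketch, not linkchecker's documented behaviour).
    """
    return parse.unquote(value, encoding=encoding or "utf-8")


if __name__ == "__main__":
    print(repr(default_decoded))
    print(repr(latin1_decoded))
    print(unquote_with_encoding("caf%E9", "latin1"))
```

The design point is that percent-decoding produces bytes first and text second, so the caller has to supply the encoding of the document the URL came from; otherwise any non-UTF-8 escape is silently replaced.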

299 commits