Bastian Kleineidam | 6437f08277 | Display downloaded bytes. | 2014-03-14 21:06:10 +01:00
Bastian Kleineidam | c51caf1133 | Assertions should be earlier. | 2014-03-14 20:26:11 +01:00
Bastian Kleineidam | cfff4c4a84 | Disable URL length warning for data: URLs. | 2014-03-14 20:24:28 +01:00
Bastian Kleineidam | bca226c293 | Fix assertion checking external links; fix tests | 2014-03-10 18:23:44 +01:00
Bastian Kleineidam | 6b334dc79b | Fix URL result caching. | 2014-03-08 19:35:10 +01:00
Bastian Kleineidam | fab2c2da98 | Improve content type setting. | 2014-03-05 20:12:19 +01:00
Bastian Kleineidam | ef13a3fce1 | Implement sitemap and sitemap index parsing. | 2014-03-05 19:26:37 +01:00
Bastian Kleineidam | b72cf252fb | Move parseable check down since it might get the content. | 2014-03-05 19:26:05 +01:00
Bastian Kleineidam | 9ef65cb774 | Fix UrlData string representation. | 2014-03-05 19:25:40 +01:00
Bastian Kleineidam | 192cfab009 | Cleanup of the UrlData.is_* functions | 2014-03-05 19:23:16 +01:00
Bastian Kleineidam | 978b24f2d7 | Merge branch 'caching' | 2014-03-04 07:21:42 +01:00
Bastian Kleineidam | f1076c8813 | Increase url-too-long warning. | 2014-03-03 23:31:04 +01:00
Bastian Kleineidam | 82f81241fd | Check all links and add better caching. | 2014-03-03 23:29:45 +01:00
Bastian Kleineidam | 7b34be590b | Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements. | 2014-03-01 00:12:34 +01:00
Bastian Kleineidam | c806be5c15 | Updated copyright | 2014-01-08 22:33:04 +01:00
Bastian Kleineidam | 0ca63797bf | Remove content cache. | 2013-12-10 23:41:52 +01:00
Bastian Kleineidam | 023da7c993 | Remove the duplicate URL content check. | 2013-12-04 19:12:40 +01:00
Bastian Kleineidam | 64d95e45e0 | Remove local HTML and CSS syntax check. | 2013-02-08 21:36:02 +01:00
Bastian Kleineidam | e6ad32c028 | Catch UnicodeError for invalid host names. | 2013-01-23 19:42:29 +01:00
Bastian Kleineidam | 7fe72745ae | Updated copyright. | 2013-01-09 23:03:12 +01:00
Bastian Kleineidam | a5b6136e70 | Check word document validity before closing. | 2013-01-07 21:58:02 +01:00
Bastian Kleineidam | 9820530313 | Use better_exchook to print more internal error info. | 2012-12-18 23:06:48 +01:00
Bastian Kleineidam | 42a17cbb98 | Prepare py3 port and display sys.argv on internal errors. | 2012-11-26 18:49:07 +01:00
Bastian Kleineidam | cd4abb1f12 | Improve repr() of url data, and remove alexa test script. | 2012-11-09 19:09:38 +01:00
Bastian Kleineidam | eabaa41bd2 | Do not check duplicate URLs. | 2012-11-06 21:34:22 +01:00
Bastian Kleineidam | dca52145d3 | Misc stuff. | 2012-10-24 22:59:28 +02:00
Bastian Kleineidam | b39158e65c | Improve available anchor message. | 2012-10-24 22:21:46 +02:00
Bastian Kleineidam | dd2c963fac | Fix non-ASCII exception handling. | 2012-10-24 22:14:45 +02:00
Bastian Kleineidam | 06a25676c5 | Only read the maximum data size plus one, not the whole file. | 2012-10-10 06:35:33 +02:00
Bastian Kleineidam | 6d47b76509 | Limit HTTP and FTP connections. Gets rid of spurious BadStatusLine errors. | 2012-10-09 21:04:20 +02:00
Bastian Kleineidam | ad8525c483 | Improve BadStatusline error message. | 2012-10-05 08:32:24 +02:00
Bastian Kleineidam | ed7c60e491 | Do not warn about duplicate URLs which can point to the same content. | 2012-10-01 13:42:46 +02:00
Bastian Kleineidam | c274b50c50 | Store lowercase URL scheme in checker class. | 2012-09-21 14:35:25 +02:00
Bastian Kleineidam | 0941f6ff02 | Improve exception handling by using unicode. | 2012-09-21 14:29:20 +02:00
Bastian Kleineidam | 7c6dce6136 | Only warn non-empty site duplicates. | 2012-09-20 20:39:36 +02:00
Bastian Kleineidam | a03090c20f | Optimize intern/extern pattern parsing. | 2012-09-20 20:19:13 +02:00
Bastian Kleineidam | bff217c58b | Never log ignored warnings. | 2012-09-20 12:44:40 +02:00
Bastian Kleineidam | 600b7c0e69 | Fix duplicate content warning when self.size is not set yet. | 2012-09-20 12:44:23 +02:00
Bastian Kleineidam | 18a200d85f | Fix tests. | 2012-09-19 11:05:26 +02:00
Bastian Kleineidam | 3a352631ba | Add modified field to loggers. | 2012-09-18 12:12:00 +02:00
Bastian Kleineidam | 4e59056ee7 | Warn about duplicate URL contents. | 2012-09-17 19:49:50 +02:00
Bastian Kleineidam | cb71f483a5 | Warn about too long URLs. | 2012-09-17 16:00:23 +02:00
Bastian Kleineidam | 6e1841cf1f | Print download and cache statistics. | 2012-09-17 15:23:25 +02:00
Bastian Kleineidam | 7a6436f08f | Increase checked cache in URL queue. | 2012-09-02 22:21:49 +02:00
Bastian Kleineidam | ecef16b2c9 | Support WML sites. | 2012-08-22 22:43:14 +02:00
Bastian Kleineidam | e65b5c72ce | Correct list of schemes requiring host name. | 2012-08-12 14:21:56 +02:00
Bastian Kleineidam | afc0ecd7a6 | --ignore-url now really ignores URLs. | 2012-08-12 11:16:29 +02:00
Bastian Kleineidam | 0fd1a78378 | Always compare encoded anchor names. | 2012-06-27 20:59:53 +02:00
Bastian Kleineidam | 5c045fef44 | Fix UNC path handling on Windows. | 2012-06-24 10:30:54 +02:00
Bastian Kleineidam | 73b176d7c9 | Fix URL joining: properly detect absolute URL. | 2012-06-23 13:33:27 +02:00