Bastian Kleineidam
|
2390827735
|
Debug cookies.
|
2012-10-25 17:53:16 +02:00 |
|
Bastian Kleineidam
|
c44aa2db1f
|
Fix anchor checking of cached HTTP URLs by using the cached content type.
|
2012-10-25 06:37:10 +02:00 |
|
Bastian Kleineidam
|
dca52145d3
|
Misc stuff.
|
2012-10-24 22:59:28 +02:00 |
|
Bastian Kleineidam
|
b39158e65c
|
Improve available anchor message.
|
2012-10-24 22:21:46 +02:00 |
|
Bastian Kleineidam
|
dd2c963fac
|
Fix non-ASCII exception handling.
|
2012-10-24 22:14:45 +02:00 |
|
Bastian Kleineidam
|
64de760b97
|
Added debug statements for unparseable content types.
|
2012-10-24 22:06:42 +02:00 |
|
Bastian Kleineidam
|
2ebedbaaa6
|
Fix content reading.
|
2012-10-13 16:48:29 +02:00 |
|
Bastian Kleineidam
|
0e4e694ad1
|
Fix connection handling on redirects.
|
2012-10-13 13:36:43 +02:00 |
|
Bastian Kleineidam
|
d3b44be2c4
|
Improved documentation.
|
2012-10-13 12:03:19 +02:00 |
|
Bastian Kleineidam
|
6a204120b6
|
Handle stale file system links for local file checks.
|
2012-10-12 17:20:19 +02:00 |
|
Bastian Kleineidam
|
b758fc6f52
|
Reuse existing response.
|
2012-10-10 12:27:36 +02:00 |
|
Bastian Kleineidam
|
e1e80b7dd5
|
Remove addrinfo cache.
|
2012-10-10 10:54:58 +02:00 |
|
Bastian Kleineidam
|
f484a6776d
|
Use timeout value from configuration.
|
2012-10-10 10:53:52 +02:00 |
|
Bastian Kleineidam
|
06a25676c5
|
Only read the maximum data size plus one, not the whole file.
|
2012-10-10 06:35:33 +02:00 |
|
Bastian Kleineidam
|
6d47b76509
|
Limit HTTP and FTP connections. Gets rid of spurious BadStatusLine errors.
|
2012-10-09 21:04:20 +02:00 |
|
Bastian Kleineidam
|
ad8525c483
|
Improve BadStatusLine error message.
|
2012-10-05 08:32:24 +02:00 |
|
Bastian Kleineidam
|
d15fafb1f7
|
Code cleanup.
|
2012-10-05 08:10:44 +02:00 |
|
Bastian Kleineidam
|
ed7c60e491
|
Do not warn about duplicate URLs which can point to the same content.
|
2012-10-01 13:42:46 +02:00 |
|
Bastian Kleineidam
|
38dd63f055
|
Code cleanup.
|
2012-09-23 16:19:42 +02:00 |
|
Bastian Kleineidam
|
7f8fd01b22
|
Add Accept-Encoding and Accept-Charset headers.
|
2012-09-23 15:06:44 +02:00 |
|
Bastian Kleineidam
|
03ecff22bb
|
Fix endless loop in HTTP authentication.
|
2012-09-22 22:21:10 +02:00 |
|
Bastian Kleineidam
|
653b5f27dd
|
Updated ignored schemes.
|
2012-09-22 16:18:37 +02:00 |
|
Bastian Kleineidam
|
1c59cb4d4c
|
Use GET in case a HEAD request does not succeed, even if the robots.txt content checks denied the page. This way proper check results are achieved (but the content is still not checked, so it's ok).
|
2012-09-22 07:53:11 +02:00 |
|
Bastian Kleineidam
|
bbf25106fa
|
Fix double result setting on http checks.
|
2012-09-21 20:33:15 +02:00 |
|
Bastian Kleineidam
|
c274b50c50
|
Store lowercase URL scheme in checker class.
|
2012-09-21 14:35:25 +02:00 |
|
Bastian Kleineidam
|
0941f6ff02
|
Improve exception handling by using unicode.
|
2012-09-21 14:29:20 +02:00 |
|
Bastian Kleineidam
|
049882e4fe
|
Remove accept-encoding since some sites have wrong compression.
|
2012-09-20 22:39:15 +02:00 |
|
Bastian Kleineidam
|
7c6dce6136
|
Only warn non-empty site duplicates.
|
2012-09-20 20:39:36 +02:00 |
|
Bastian Kleineidam
|
a03090c20f
|
Optimize intern/extern pattern parsing.
|
2012-09-20 20:19:13 +02:00 |
|
Bastian Kleineidam
|
b9d234c78a
|
Fix wrong method name in SSL certificate check.
|
2012-09-20 16:28:01 +02:00 |
|
Bastian Kleineidam
|
bff217c58b
|
Never log ignored warnings.
|
2012-09-20 12:44:40 +02:00 |
|
Bastian Kleineidam
|
600b7c0e69
|
Fix duplicate content warning when self.size is not set yet.
|
2012-09-20 12:44:23 +02:00 |
|
Bastian Kleineidam
|
18a200d85f
|
Fix tests.
|
2012-09-19 11:05:26 +02:00 |
|
Bastian Kleineidam
|
b8f8bdf5fc
|
Fix last modified formatting.
|
2012-09-19 10:09:19 +02:00 |
|
Bastian Kleineidam
|
3a352631ba
|
Add modified field to loggers.
|
2012-09-18 12:12:00 +02:00 |
|
Bastian Kleineidam
|
4e59056ee7
|
Warn about duplicate URL contents.
|
2012-09-17 19:49:50 +02:00 |
|
Bastian Kleineidam
|
cb71f483a5
|
Warn about too long URLs.
|
2012-09-17 16:00:23 +02:00 |
|
Bastian Kleineidam
|
6e1841cf1f
|
Print download and cache statistics.
|
2012-09-17 15:23:25 +02:00 |
|
Bastian Kleineidam
|
273230d98b
|
Send HTTP Do-Not-Track header.
|
2012-09-14 22:41:38 +02:00 |
|
Bastian Kleineidam
|
7a6436f08f
|
Increase checked cache in URL queue.
|
2012-09-02 22:21:49 +02:00 |
|
Bastian Kleineidam
|
4c16d3e702
|
Make 401 unauthorized GET response a warning.
|
2012-08-26 11:32:17 +02:00 |
|
Bastian Kleineidam
|
b6d45eabe5
|
Code cleanup.
|
2012-08-24 09:46:38 +02:00 |
|
Bastian Kleineidam
|
ae15d51b30
|
Translate more result strings.
|
2012-08-23 23:59:33 +02:00 |
|
Bastian Kleineidam
|
ce4253263c
|
Do not special case http->ftp redirects.
|
2012-08-23 23:56:36 +02:00 |
|
Bastian Kleineidam
|
7374068941
|
Remove unused import.
|
2012-08-23 16:46:14 +02:00 |
|
Bastian Kleineidam
|
73d64e50ab
|
Fix redirection to new scheme.
|
2012-08-23 16:45:24 +02:00 |
|
Bastian Kleineidam
|
bc287d7710
|
Make unauthorized access responses with missing www-authenticate headers an error.
|
2012-08-23 15:52:11 +02:00 |
|
Bastian Kleineidam
|
e252bbf623
|
Remove Amazon quirk because the default behaviour handles this now.
|
2012-08-23 05:36:51 +02:00 |
|
Bastian Kleineidam
|
ecef16b2c9
|
Support WML sites.
|
2012-08-22 22:43:14 +02:00 |
|
Bastian Kleineidam
|
76f57dc4ad
|
Updated copyright.
|
2012-08-14 20:37:24 +02:00 |
|
Bastian Kleineidam
|
6915e2f989
|
Detect sites not supporting HEAD requests.
|
2012-08-14 18:43:39 +02:00 |
|
Bastian Kleineidam
|
f3b66b102d
|
Fall back to GET when method HEAD is not allowed.
|
2012-08-13 07:07:21 +02:00 |
|
Bastian Kleineidam
|
e65b5c72ce
|
Correct list of schemes requiring host name.
|
2012-08-12 14:21:56 +02:00 |
|
Bastian Kleineidam
|
7b567cc378
|
Make scheme and domain for internal url pattern case insensitive.
|
2012-08-12 14:19:42 +02:00 |
|
Bastian Kleineidam
|
afc0ecd7a6
|
--ignore-url now really ignores URLs.
|
2012-08-12 11:16:29 +02:00 |
|
Bastian Kleineidam
|
6be3e9ddff
|
Cleanup code and improve redirect anchor handling.
|
2012-08-12 11:14:56 +02:00 |
|
Bastian Kleineidam
|
c74690a79a
|
Do not check SSL certificates on HTTPS -> HTTP redirects.
|
2012-08-10 19:43:57 +02:00 |
|
Bastian Kleineidam
|
b0e5c7fc59
|
Ignore feed: URLs.
|
2012-06-27 21:32:03 +02:00 |
|
Bastian Kleineidam
|
0fd1a78378
|
Always compare encoded anchor names.
|
2012-06-27 20:59:53 +02:00 |
|
Bastian Kleineidam
|
5c045fef44
|
Fix UNC path handling on Windows.
|
2012-06-24 10:30:54 +02:00 |
|
Bastian Kleineidam
|
31519f6a01
|
Fix handling of UNC pathnames.
|
2012-06-23 14:30:58 +02:00 |
|
Bastian Kleineidam
|
73b176d7c9
|
Fix URL joining: properly detect absolute URL.
|
2012-06-23 13:33:27 +02:00 |
|
Bastian Kleineidam
|
8d23e2a3c6
|
Add debugging for checker class name.
|
2012-06-23 13:30:13 +02:00 |
|
Bastian Kleineidam
|
dbe57c0f9b
|
Treat Windows UNC paths as absolute paths.
|
2012-06-22 23:42:37 +02:00 |
|
Bastian Kleineidam
|
713b9ebada
|
Only assume local file links for URLs given on the command line.
|
2012-06-22 23:42:05 +02:00 |
|
Bastian Kleineidam
|
9d0cced73c
|
Fix SSL check errors.
|
2012-06-22 07:37:37 +02:00 |
|
Bastian Kleineidam
|
addbcfc54f
|
Updated translation.
|
2012-06-20 20:18:39 +02:00 |
|
Bastian Kleineidam
|
4cce99a77d
|
Test SSL certificate expiration.
|
2012-06-20 20:10:40 +02:00 |
|
Bastian Kleineidam
|
cbb13a8983
|
Add SSL certificate verification.
|
2012-06-18 23:05:44 +02:00 |
|
Bastian Kleineidam
|
f107092a8a
|
Fix handling of user/password info in URLs.
|
2012-06-10 22:07:42 +02:00 |
|
Bastian Kleineidam
|
838095cbd5
|
Updated copyright.
|
2012-06-10 14:58:38 +02:00 |
|
Bastian Kleineidam
|
00aa631267
|
Add localwebroot configuration option.
|
2012-06-10 14:47:27 +02:00 |
|
Bastian Kleineidam
|
98537eea2f
|
Code cleanup: use add_url() function in UrlBase.
|
2012-06-10 14:24:17 +02:00 |
|
Bastian Kleineidam
|
db95fce77e
|
Ignore PHP processing instructions in local files.
|
2012-06-10 14:02:01 +02:00 |
|
Bastian Kleineidam
|
2dee223555
|
Allow memory dumps to be written.
|
2012-06-10 13:18:35 +02:00 |
|
Bastian Kleineidam
|
837ab22d01
|
Syntax cleanup.
|
2012-06-10 11:46:05 +02:00 |
|
Bastian Kleineidam
|
77b8ec0fcd
|
Fix writing temporary Word files.
|
2012-06-10 11:07:35 +02:00 |
|
Bastian Kleineidam
|
54ffb102d8
|
Code cleanup: add function for GET fallback.
|
2012-06-10 09:52:12 +02:00 |
|
Bastian Kleineidam
|
5c94c47901
|
Remove old Squid proxy workaround.
|
2012-06-10 09:45:07 +02:00 |
|
Bastian Kleineidam
|
bcbacec79a
|
Code cleanup.
|
2012-05-10 21:05:33 +02:00 |
|
Bastian Kleineidam
|
61138744e6
|
Always use GET for Zope servers.
|
2012-05-08 20:47:47 +02:00 |
|
Bastian Kleineidam
|
52dcf101e0
|
Remove rest of deprecated options.
|
2012-04-22 17:55:12 +02:00 |
|
Bastian Kleineidam
|
797024c69b
|
Fix URL connection cache key.
|
2012-04-04 22:58:09 +02:00 |
|
Bastian Kleineidam
|
4feea986b4
|
Fix concatenation of multiple cookie values.
|
2012-03-31 08:51:58 +02:00 |
|
Bastian Kleineidam
|
da6d7b0eca
|
Store cookies on redirect.
|
2012-03-31 08:37:18 +02:00 |
|
Bastian Kleineidam
|
6d5e5f9efb
|
Updated copyright.
|
2012-03-30 22:24:10 +02:00 |
|
Bastian Kleineidam
|
b9b8e3f5b2
|
Honor the charset encoding of the Content-Type HTTP
header when parsing HTML.
|
2012-03-22 22:45:11 +01:00 |
|
Bastian Kleineidam
|
98b4768419
|
Use timeout when checking email addresses with SMTP.
|
2012-03-16 21:44:18 +01:00 |
|
Bastian Kleineidam
|
4c9fd8d488
|
Cache real url.
|
2012-03-14 21:12:13 +01:00 |
|
Bastian Kleineidam
|
5e13a78f66
|
Fix non-ascii HTTP header debugging.
|
2012-03-09 11:54:18 +01:00 |
|
Bastian Kleineidam
|
3fcff8a4e5
|
Fix non-ascii HTTP header handling.
|
2012-03-09 11:14:18 +01:00 |
|
Bastian Kleineidam
|
24811ac7b0
|
Recheck extern status on HTTP redirects even if domain did not change.
|
2012-03-08 10:07:31 +01:00 |
|
Bastian Kleineidam
|
71f5ee42c8
|
Updated copyright.
|
2012-01-29 17:18:28 +01:00 |
|
Bastian Kleineidam
|
042b0569ec
|
Fall back to W3C checkers.
|
2012-01-22 08:13:27 +01:00 |
|
Bastian Kleineidam
|
51cf55b7a6
|
Remove warning: prefix from warning messages.
|
2012-01-21 00:25:02 +01:00 |
|
Bastian Kleineidam
|
6e1e9148d8
|
Work around a Squid bug resulting in not detecting broken links.
|
2012-01-17 08:36:11 +01:00 |
|
Bastian Kleineidam
|
e99c55f6c4
|
Proper proxy type check.
|
2012-01-16 21:15:53 +01:00 |
|
Bastian Kleineidam
|
4c15fc6a8b
|
Properly handle non-ASCII HTTP header values.
|
2012-01-14 11:01:09 +01:00 |
|
Bastian Kleineidam
|
a0581cc2a1
|
Ignore steam:// URIs.
|
2012-01-10 19:37:19 +01:00 |
|
Bastian Kleineidam
|
f1eb51d885
|
Updated copyright.
|
2012-01-06 09:21:30 +01:00 |
|