Commit graph

227 commits

Author SHA1 Message Date
Bastian Kleineidam
d15fafb1f7 Code cleanup. 2012-10-05 08:10:44 +02:00
Bastian Kleineidam
7f8fd01b22 Add Accept-Encoding and Accept-Charset headers. 2012-09-23 15:06:44 +02:00
Bastian Kleineidam
03ecff22bb Fix endless loop in http authentication. 2012-09-22 22:21:10 +02:00
Bastian Kleineidam
1c59cb4d4c Use GET in case a HEAD method does not succeed, even if robots.txt content checks denied the page. This way proper check results are achieved (but the content is still not checked, so it's ok). 2012-09-22 07:53:11 +02:00
Bastian Kleineidam
bbf25106fa Fix double result setting on http checks. 2012-09-21 20:33:15 +02:00
Bastian Kleineidam
049882e4fe Remove accept-encoding since some sites have wrong compression. 2012-09-20 22:39:15 +02:00
Bastian Kleineidam
a03090c20f Optimize intern/extern pattern parsing. 2012-09-20 20:19:13 +02:00
Bastian Kleineidam
18a200d85f Fix tests. 2012-09-19 11:05:26 +02:00
Bastian Kleineidam
b8f8bdf5fc Fix last modified formatting. 2012-09-19 10:09:19 +02:00
Bastian Kleineidam
3a352631ba Add modified field to loggers. 2012-09-18 12:12:00 +02:00
Bastian Kleineidam
4e59056ee7 Warn about duplicate URL contents. 2012-09-17 19:49:50 +02:00
Bastian Kleineidam
6e1841cf1f Print download and cache statistics. 2012-09-17 15:23:25 +02:00
Bastian Kleineidam
273230d98b Send HTTP Do-Not-Track header. 2012-09-14 22:41:38 +02:00
Bastian Kleineidam
7a6436f08f Increase checked cache in URL queue. 2012-09-02 22:21:49 +02:00
Bastian Kleineidam
4c16d3e702 Make 401 unauthorized GET response a warning. 2012-08-26 11:32:17 +02:00
Bastian Kleineidam
ae15d51b30 Translate more result strings. 2012-08-23 23:59:33 +02:00
Bastian Kleineidam
ce4253263c Do not special case http->ftp redirects. 2012-08-23 23:56:36 +02:00
Bastian Kleineidam
7374068941 Remove unused import. 2012-08-23 16:46:14 +02:00
Bastian Kleineidam
73d64e50ab Fix redirection to new scheme. 2012-08-23 16:45:24 +02:00
Bastian Kleineidam
bc287d7710 Make unauthorized access responses with missing www-authenticate headers an error. 2012-08-23 15:52:11 +02:00
Bastian Kleineidam
e252bbf623 Remove Amazon quirk because the default behaviour handles this now. 2012-08-23 05:36:51 +02:00
Bastian Kleineidam
ecef16b2c9 Support WML sites. 2012-08-22 22:43:14 +02:00
Bastian Kleineidam
6915e2f989 Detect sites not supporting HEAD requests. 2012-08-14 18:43:39 +02:00
Bastian Kleineidam
f3b66b102d Fall back to GET when method HEAD is not allowed. 2012-08-13 07:07:21 +02:00
Bastian Kleineidam
6be3e9ddff Cleanup code and improve redirect anchor handling. 2012-08-12 11:14:56 +02:00
Bastian Kleineidam
5c045fef44 Fix UNC path handling on Windows. 2012-06-24 10:30:54 +02:00
Bastian Kleineidam
cbb13a8983 Add SSL certificate verification. 2012-06-18 23:05:44 +02:00
Bastian Kleineidam
f107092a8a Fix handling of user/password info in URLs. 2012-06-10 22:07:42 +02:00
Bastian Kleineidam
2dee223555 Allow memory dumps to be written. 2012-06-10 13:18:35 +02:00
Bastian Kleineidam
54ffb102d8 Code cleanup: add function for GET fallback. 2012-06-10 09:52:12 +02:00
Bastian Kleineidam
5c94c47901 Remove old Squid proxy workaround. 2012-06-10 09:45:07 +02:00
Bastian Kleineidam
bcbacec79a Code cleanup. 2012-05-10 21:05:33 +02:00
Bastian Kleineidam
61138744e6 Always use GET for Zope servers. 2012-05-08 20:47:47 +02:00
Bastian Kleineidam
797024c69b Fix URL connection cache key. 2012-04-04 22:58:09 +02:00
Bastian Kleineidam
4feea986b4 Fix concatenation of multiple cookie values. 2012-03-31 08:51:58 +02:00
Bastian Kleineidam
da6d7b0eca Store cookies on redirect. 2012-03-31 08:37:18 +02:00
Bastian Kleineidam
b9b8e3f5b2 Honor the charset encoding of the Content-Type HTTP header when parsing HTML. 2012-03-22 22:45:11 +01:00
Bastian Kleineidam
5e13a78f66 Fix non-ascii HTTP header debugging. 2012-03-09 11:54:18 +01:00
Bastian Kleineidam
3fcff8a4e5 Fix non-ascii HTTP header handling. 2012-03-09 11:14:18 +01:00
Bastian Kleineidam
24811ac7b0 Recheck extern status on HTTP redirects even if domain did not change. 2012-03-08 10:07:31 +01:00
Bastian Kleineidam
71f5ee42c8 Updated copyright. 2012-01-29 17:18:28 +01:00
Bastian Kleineidam
6e1e9148d8 Work around a Squid bug that caused broken links to go undetected. 2012-01-17 08:36:11 +01:00
Bastian Kleineidam
4c15fc6a8b Properly handle non-ASCII HTTP header values. 2012-01-14 11:01:09 +01:00
Bastian Kleineidam
cdf91a0321 Improve cookie info message and fix cookie test cases. 2011-08-04 18:34:56 +02:00
Bastian Kleineidam
48413de418 Display warning message for each cookie parsing error. 2011-08-03 19:27:36 +02:00
Bastian Kleineidam
c99b75899d Send multiple cookie values in one header. 2011-08-02 21:57:16 +02:00
Bastian Kleineidam
c70bd68ef1 Refactor sending of cookie data in client into separate function. 2011-08-02 20:45:26 +02:00
Bastian Kleineidam
51bcccfdfe Added new option --user-agent to set the User-Agent header. 2011-07-25 21:09:49 +02:00
Bastian Kleineidam
552c71a3ca Do not append a stray newline character when encoding authentication information to base64. 2011-07-25 20:02:01 +02:00
Bastian Kleineidam
5515645af6 Reset content type setting after loading HTTP headers. 2011-05-28 17:59:44 +02:00
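
Commit 552c71a3ca above mentions a stray newline appended when encoding authentication information to base64. The sketch below illustrates how that kind of bug can arise with Python's standard `base64` module; it is not the project's actual code. The helper name `basic_auth_header` and the sample credentials are hypothetical, chosen only for illustration.

```python
import base64


def basic_auth_header(user, password):
    """Build a Basic Authorization header value (hypothetical helper)."""
    credentials = f"{user}:{password}".encode("utf-8")
    # b64encode returns the encoded bytes with no trailing newline,
    # so the result is safe to place in an HTTP header.
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token


# encodebytes produces MIME-style output and ends it with a newline,
# which would corrupt a header value if used unstripped.
with_newline = base64.encodebytes(b"user:secret")
without_newline = base64.b64encode(b"user:secret")

print(repr(with_newline))
print(repr(without_newline))
print(basic_auth_header("user", "secret"))
```

Using `b64encode` rather than stripping the output of `encodebytes` avoids the problem at the source, since `encodebytes` also inserts line breaks inside long encoded values.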