Commit graph

5248 commits

Author SHA1 Message Date
Bastian Kleineidam
1e1370fbb4 Updated docs. 2012-09-23 16:02:11 +02:00
Bastian Kleineidam
7f8fd01b22 Add Accept-Encoding and Accept-Charset headers. 2012-09-23 15:06:44 +02:00
Bastian Kleineidam
03ecff22bb Fix endless loop in http authentication. 2012-09-22 22:21:10 +02:00
Bastian Kleineidam
653b5f27dd Updated ignored schemes. 2012-09-22 16:18:37 +02:00
Bastian Kleineidam
1c59cb4d4c Use GET in case a HEAD method does not succeed, even if robots.txt content checks denied the page. This way proper check results are achieved (but the content is still not checked, so it's ok). 2012-09-22 07:53:11 +02:00
Bastian Kleineidam
cff97b9718 Updated copyright. 2012-09-21 21:13:00 +02:00
Bastian Kleineidam
fba465e8e8 Fix robotstxt cache miss stats. 2012-09-21 21:12:28 +02:00
Bastian Kleineidam
f6b007f757 Fix useragent matching in robots.txt parser. 2012-09-21 21:12:13 +02:00
Bastian Kleineidam
1c2a66ffaf Refactor http tests into multiple files. 2012-09-21 20:34:05 +02:00
Bastian Kleineidam
bbf25106fa Fix double result setting on http checks. 2012-09-21 20:33:15 +02:00
Bastian Kleineidam
3e464e509c Do not allow empty configuration string values. 2012-09-21 16:05:34 +02:00
Bastian Kleineidam
7e130808bd Document the user-agent change. 2012-09-21 16:05:03 +02:00
Bastian Kleineidam
498567eb21 Remove alexa log files on distclean. 2012-09-21 16:04:46 +02:00
Bastian Kleineidam
32357c9683 Improve alexa test script. 2012-09-21 15:53:16 +02:00
Bastian Kleineidam
d60e4dc0a2 Improved custom user-agent example. 2012-09-21 15:51:44 +02:00
Bastian Kleineidam
ecf8753a19 Improved user-agent string similar to Google and Bing search bots. 2012-09-21 15:46:14 +02:00
Bastian Kleineidam
718c033989 Add alexa test run script. 2012-09-21 14:50:33 +02:00
Bastian Kleineidam
c274b50c50 Store lowercase URL scheme in checker class. 2012-09-21 14:35:25 +02:00
Bastian Kleineidam
0941f6ff02 Improve exception handling by using unicode. 2012-09-21 14:29:20 +02:00
Bastian Kleineidam
f46889a4af Log timestamps in debug output. 2012-09-21 13:05:36 +02:00
Bastian Kleineidam
049882e4fe Remove accept-encoding since some sites have wrong compression. 2012-09-20 22:39:15 +02:00
Bastian Kleineidam
0b9c0ee784 Updated translations and changelog. 2012-09-20 20:42:21 +02:00
Bastian Kleineidam
7c6dce6136 Only warn non-empty site duplicates. 2012-09-20 20:39:36 +02:00
Bastian Kleineidam
a03090c20f Optimize intern/extern pattern parsing. 2012-09-20 20:19:13 +02:00
Bastian Kleineidam
5554c16aa2 Do not copy stdin URLs in temporary list. 2012-09-20 16:40:04 +02:00
Bastian Kleineidam
c385c35b1a Fix ansicolor again. 2012-09-20 16:39:40 +02:00
Bastian Kleineidam
b9d234c78a Fix wrong method name in SSL certificate check. 2012-09-20 16:28:01 +02:00
Bastian Kleineidam
b073c5a3ef Improve warning output. 2012-09-20 16:23:38 +02:00
Bastian Kleineidam
bff217c58b Never log ignored warnings. 2012-09-20 12:44:40 +02:00
Bastian Kleineidam
600b7c0e69 Fix duplicate content warning when self.size is not set yet. 2012-09-20 12:44:23 +02:00
Bastian Kleineidam
9cfee5eb5b Improved color detection with curses. 2012-09-20 12:13:15 +02:00
Bastian Kleineidam
bc0a17c1c4 Display last modified date in the GUI. 2012-09-19 21:23:39 +02:00
Bastian Kleineidam
4e5a75fa92 Mention Peazip for Windows to extract the source. 2012-09-19 17:58:45 +02:00
Bastian Kleineidam
7b67feb880 More pyflakes filters. 2012-09-19 11:09:04 +02:00
Bastian Kleineidam
d37347cab0 Remove unused variable. 2012-09-19 11:08:06 +02:00
Bastian Kleineidam
18a200d85f Fix tests. 2012-09-19 11:05:26 +02:00
Bastian Kleineidam
b8f8bdf5fc Fix last modified formatting. 2012-09-19 10:09:19 +02:00
Bastian Kleineidam
7428085ae5 Filter false positives from pyflakes output. 2012-09-19 09:45:47 +02:00
Bastian Kleineidam
f5fbd7666f Remove unused import. 2012-09-19 09:39:32 +02:00
Bastian Kleineidam
75719b34f6 Updated copyright. 2012-09-19 09:17:25 +02:00
Bastian Kleineidam
450fb36373 Updated changelog. 2012-09-19 09:17:16 +02:00
Bastian Kleineidam
71fba0f8b7 Log all valid URLs in sitemap loggers. 2012-09-19 09:17:08 +02:00
Bastian Kleineidam
681cd90405 Ignore memory dump files. 2012-09-18 16:08:57 +02:00
Bastian Kleineidam
9d1c90f96c Write extra script to analyse a memory dump. 2012-09-18 16:08:31 +02:00
Bastian Kleineidam
3a352631ba Add modified field to loggers. 2012-09-18 12:12:00 +02:00
Bastian Kleineidam
1db63227f6 Memoize file operations to minimize disk I/O. 2012-09-18 09:37:21 +02:00
Bastian Kleineidam
aee515d406 Fix tests. 2012-09-18 09:17:08 +02:00
Bastian Kleineidam
932a07a9cf Added XML sitemap logger. 2012-09-18 09:16:34 +02:00
Bastian Kleineidam
a8bd9c3c89 Updated German translation. 2012-09-17 21:23:29 +02:00
Bastian Kleineidam
58cbe4b152 Updated copyright. 2012-09-17 21:03:52 +02:00