Commit graph

5272 commits

Author SHA1 Message Date
Bastian Kleineidam
5ebd754cdb Improved duplicate url check. 2012-10-01 16:11:45 +02:00
Bastian Kleineidam
f9cb512f7d Updated changelog. 2012-10-01 13:43:48 +02:00
Bastian Kleineidam
ed7c60e491 Do not warn about duplicate URLs which can point to the same content. 2012-10-01 13:42:46 +02:00
Bastian Kleineidam
2e81fcce98 Ignore generated test result file. 2012-10-01 13:33:04 +02:00
Bastian Kleineidam
148846be67 Add flag to log lock contentions. 2012-10-01 13:32:30 +02:00
Bastian Kleineidam
b56c054932 Use finer-grained robots.txt locks to improve lock contention. 2012-10-01 13:29:29 +02:00
Bastian Kleineidam
677feab81f Updated German translation. 2012-10-01 10:44:24 +02:00
Bastian Kleineidam
6f5e55fd3b Code cleanup. 2012-10-01 10:43:20 +02:00
Bastian Kleineidam
1b3b040be5 Fix check result order. 2012-10-01 10:28:42 +02:00
Bastian Kleineidam
5a12ccf8d0 Fix anchor test result ordering. 2012-09-30 22:02:29 +02:00
Bastian Kleineidam
3c44056fde Code cleanup. 2012-09-30 14:01:09 +02:00
Bastian Kleineidam
27b61c3bfa Fix gzip handling in http content decoder. 2012-09-30 14:00:49 +02:00
Bastian Kleineidam
169bdecb69 Fix clamav test. 2012-09-30 12:00:44 +02:00
Bastian Kleineidam
b76c930c4f Determine number of CPUs on OS X. 2012-09-30 08:32:54 +02:00
Bastian Kleineidam
f98c2bc414 Use py.test on Windows. 2012-09-29 20:28:46 +02:00
Bastian Kleineidam
39204ea0fe Use py.test skip function instead of nose. 2012-09-29 20:28:16 +02:00
Bastian Kleineidam
2479c53e6c Use a free port number in ftp tests for local server. 2012-09-29 19:22:12 +02:00
Bastian Kleineidam
8c9e633d96 Use py.test testrunner. 2012-09-26 16:54:54 +02:00
Bastian Kleineidam
cbc3bcb0d3 Sitemap logger fixes. 2012-09-23 23:20:21 +02:00
Bastian Kleineidam
60305d8877 Code cleanup. 2012-09-23 21:20:12 +02:00
Bastian Kleineidam
e21187b275 Put in-progress URLs back near the front of URL queue, not at end. 2012-09-23 21:00:01 +02:00
Bastian Kleineidam
1f3034b5f5 Sitemap logger fixes. 2012-09-23 20:59:38 +02:00
Bastian Kleineidam
a022c836bc Randomize site test. 2012-09-23 16:19:56 +02:00
Bastian Kleineidam
38dd63f055 Code cleanup. 2012-09-23 16:19:42 +02:00
Bastian Kleineidam
1e1370fbb4 Updated docs. 2012-09-23 16:02:11 +02:00
Bastian Kleineidam
7f8fd01b22 Add Accept-Encoding and Accept-Charset headers. 2012-09-23 15:06:44 +02:00
Bastian Kleineidam
03ecff22bb Fix endless loop in http authentication. 2012-09-22 22:21:10 +02:00
Bastian Kleineidam
653b5f27dd Updated ignored schemes. 2012-09-22 16:18:37 +02:00
Bastian Kleineidam
1c59cb4d4c Use GET if a HEAD request does not succeed, even if the robots.txt content checks denied the page. This way proper check results are achieved (the content is still not checked, so it's ok). 2012-09-22 07:53:11 +02:00
Bastian Kleineidam
cff97b9718 Updated copyright. 2012-09-21 21:13:00 +02:00
Bastian Kleineidam
fba465e8e8 Fix robots.txt cache miss stats. 2012-09-21 21:12:28 +02:00
Bastian Kleineidam
f6b007f757 Fix user-agent matching in robots.txt parser. 2012-09-21 21:12:13 +02:00
Bastian Kleineidam
1c2a66ffaf Refactor http tests into multiple files. 2012-09-21 20:34:05 +02:00
Bastian Kleineidam
bbf25106fa Fix double result setting on http checks. 2012-09-21 20:33:15 +02:00
Bastian Kleineidam
3e464e509c Do not allow empty configuration string values. 2012-09-21 16:05:34 +02:00
Bastian Kleineidam
7e130808bd Document the user-agent change. 2012-09-21 16:05:03 +02:00
Bastian Kleineidam
498567eb21 Remove alexa log files on distclean. 2012-09-21 16:04:46 +02:00
Bastian Kleineidam
32357c9683 Improve alexa test script. 2012-09-21 15:53:16 +02:00
Bastian Kleineidam
d60e4dc0a2 Improved custom user-agent example. 2012-09-21 15:51:44 +02:00
Bastian Kleineidam
ecf8753a19 Improved user-agent string similar to Google and Bing search bots. 2012-09-21 15:46:14 +02:00
Bastian Kleineidam
718c033989 Add alexa test run script. 2012-09-21 14:50:33 +02:00
Bastian Kleineidam
c274b50c50 Store lowercase URL scheme in checker class. 2012-09-21 14:35:25 +02:00
Bastian Kleineidam
0941f6ff02 Improve exception handling by using unicode. 2012-09-21 14:29:20 +02:00
Bastian Kleineidam
f46889a4af Log timestamps in debug output. 2012-09-21 13:05:36 +02:00
Bastian Kleineidam
049882e4fe Remove accept-encoding since some sites have wrong compression. 2012-09-20 22:39:15 +02:00
Bastian Kleineidam
0b9c0ee784 Updated translations and changelog. 2012-09-20 20:42:21 +02:00
Bastian Kleineidam
7c6dce6136 Only warn non-empty site duplicates. 2012-09-20 20:39:36 +02:00
Bastian Kleineidam
a03090c20f Optimize intern/extern pattern parsing. 2012-09-20 20:19:13 +02:00
Bastian Kleineidam
5554c16aa2 Do not copy stdin URLs in temporary list. 2012-09-20 16:40:04 +02:00
Bastian Kleineidam
c385c35b1a Fix ansicolor again. 2012-09-20 16:39:40 +02:00