Bastian Kleineidam
|
6d47b76509
|
Limit HTTP and FTP connections. Gets rid of spurious BadStatusLine errors.
|
2012-10-09 21:04:20 +02:00 |
|
Bastian Kleineidam
|
7d3ece502c
|
Support semaphores.
|
2012-10-09 19:46:06 +02:00 |
|
Bastian Kleineidam
|
ad8525c483
|
Improve BadStatusLine error message.
|
2012-10-05 08:32:24 +02:00 |
|
Bastian Kleineidam
|
d15fafb1f7
|
Code cleanup.
|
2012-10-05 08:10:44 +02:00 |
|
Bastian Kleineidam
|
5ebd754cdb
|
Improved duplicate url check.
|
2012-10-01 16:11:45 +02:00 |
|
Bastian Kleineidam
|
ed7c60e491
|
Do not warn about duplicate URLs which can point to the same content.
|
2012-10-01 13:42:46 +02:00 |
|
Bastian Kleineidam
|
148846be67
|
Add flag to log lock contentions.
|
2012-10-01 13:32:30 +02:00 |
|
Bastian Kleineidam
|
b56c054932
|
Use finer-grained robots.txt locks to improve lock contention.
|
2012-10-01 13:29:29 +02:00 |
|
Bastian Kleineidam
|
27b61c3bfa
|
Fix gzip handling in http content decoder.
|
2012-09-30 14:00:49 +02:00 |
|
Bastian Kleineidam
|
cbc3bcb0d3
|
Sitemap logger fixes.
|
2012-09-23 23:20:21 +02:00 |
|
Bastian Kleineidam
|
60305d8877
|
Code cleanup.
|
2012-09-23 21:20:12 +02:00 |
|
Bastian Kleineidam
|
e21187b275
|
Put in-progress URLs back near the front of URL queue, not at end.
|
2012-09-23 21:00:01 +02:00 |
|
Bastian Kleineidam
|
1f3034b5f5
|
Sitemap logger fixes.
|
2012-09-23 20:59:38 +02:00 |
|
Bastian Kleineidam
|
38dd63f055
|
Code cleanup.
|
2012-09-23 16:19:42 +02:00 |
|
Bastian Kleineidam
|
7f8fd01b22
|
Add Accept-Encoding and Accept-Charset headers.
|
2012-09-23 15:06:44 +02:00 |
|
Bastian Kleineidam
|
03ecff22bb
|
Fix endless loop in http authentication.
|
2012-09-22 22:21:10 +02:00 |
|
Bastian Kleineidam
|
653b5f27dd
|
Updated ignored schemes.
|
2012-09-22 16:18:37 +02:00 |
|
Bastian Kleineidam
|
1c59cb4d4c
|
Use GET in case a HEAD method does not succeed, even if robots.txt content checks denied the page. This way proper check results are achieved (but the content is still not checked, so it's ok).
|
2012-09-22 07:53:11 +02:00 |
|
Bastian Kleineidam
|
fba465e8e8
|
Fix robotstxt cache miss stats.
|
2012-09-21 21:12:28 +02:00 |
|
Bastian Kleineidam
|
f6b007f757
|
Fix useragent matching in robots.txt parser.
|
2012-09-21 21:12:13 +02:00 |
|
Bastian Kleineidam
|
bbf25106fa
|
Fix double result setting on http checks.
|
2012-09-21 20:33:15 +02:00 |
|
Bastian Kleineidam
|
3e464e509c
|
Do not allow empty configuration string values.
|
2012-09-21 16:05:34 +02:00 |
|
Bastian Kleineidam
|
ecf8753a19
|
Improved user-agent string similar to Google and Bing search bots.
|
2012-09-21 15:46:14 +02:00 |
|
Bastian Kleineidam
|
c274b50c50
|
Store lowercase URL scheme in checker class.
|
2012-09-21 14:35:25 +02:00 |
|
Bastian Kleineidam
|
0941f6ff02
|
Improve exception handling by using unicode.
|
2012-09-21 14:29:20 +02:00 |
|
Bastian Kleineidam
|
f46889a4af
|
Log timestamps in debug output.
|
2012-09-21 13:05:36 +02:00 |
|
Bastian Kleineidam
|
049882e4fe
|
Remove accept-encoding since some sites have wrong compression.
|
2012-09-20 22:39:15 +02:00 |
|
Bastian Kleineidam
|
7c6dce6136
|
Only warn about non-empty site duplicates.
|
2012-09-20 20:39:36 +02:00 |
|
Bastian Kleineidam
|
a03090c20f
|
Optimize intern/extern pattern parsing.
|
2012-09-20 20:19:13 +02:00 |
|
Bastian Kleineidam
|
c385c35b1a
|
Fix ansicolor again.
|
2012-09-20 16:39:40 +02:00 |
|
Bastian Kleineidam
|
b9d234c78a
|
Fix wrong method name in SSL certificate check.
|
2012-09-20 16:28:01 +02:00 |
|
Bastian Kleineidam
|
bff217c58b
|
Never log ignored warnings.
|
2012-09-20 12:44:40 +02:00 |
|
Bastian Kleineidam
|
600b7c0e69
|
Fix duplicate content warning when self.size is not set yet.
|
2012-09-20 12:44:23 +02:00 |
|
Bastian Kleineidam
|
9cfee5eb5b
|
Improved color detection with curses.
|
2012-09-20 12:13:15 +02:00 |
|
Bastian Kleineidam
|
bc0a17c1c4
|
Display last modified date in the GUI.
|
2012-09-19 21:23:39 +02:00 |
|
Bastian Kleineidam
|
d37347cab0
|
Remove unused variable.
|
2012-09-19 11:08:06 +02:00 |
|
Bastian Kleineidam
|
18a200d85f
|
Fix tests.
|
2012-09-19 11:05:26 +02:00 |
|
Bastian Kleineidam
|
b8f8bdf5fc
|
Fix last modified formatting.
|
2012-09-19 10:09:19 +02:00 |
|
Bastian Kleineidam
|
f5fbd7666f
|
Remove unused import.
|
2012-09-19 09:39:32 +02:00 |
|
Bastian Kleineidam
|
75719b34f6
|
Updated copyright.
|
2012-09-19 09:17:25 +02:00 |
|
Bastian Kleineidam
|
71fba0f8b7
|
Log all valid URLs in sitemap loggers.
|
2012-09-19 09:17:08 +02:00 |
|
Bastian Kleineidam
|
9d1c90f96c
|
Write extra script to analyse a memory dump.
|
2012-09-18 16:08:31 +02:00 |
|
Bastian Kleineidam
|
3a352631ba
|
Add modified field to loggers.
|
2012-09-18 12:12:00 +02:00 |
|
Bastian Kleineidam
|
1db63227f6
|
Memoize file operations to minimize disk I/O.
|
2012-09-18 09:37:21 +02:00 |
|
Bastian Kleineidam
|
932a07a9cf
|
Added XML sitemap logger.
|
2012-09-18 09:16:34 +02:00 |
|
Bastian Kleineidam
|
4e59056ee7
|
Warn about duplicate URL contents.
|
2012-09-17 19:49:50 +02:00 |
|
Bastian Kleineidam
|
02a09dbb28
|
Add documentation.
|
2012-09-17 16:30:32 +02:00 |
|
Bastian Kleineidam
|
99bf8aa940
|
Updated copyright.
|
2012-09-17 16:09:55 +02:00 |
|
Bastian Kleineidam
|
cb71f483a5
|
Warn about too long URLs.
|
2012-09-17 16:00:23 +02:00 |
|
Bastian Kleineidam
|
03667a4ec9
|
Print warning tags in text output.
|
2012-09-17 15:29:04 +02:00 |
|
Bastian Kleineidam
|
1f9ee987f9
|
Improved terminal color detection with curses.
|
2012-09-17 15:24:04 +02:00 |
|
Bastian Kleineidam
|
6e1841cf1f
|
Print download and cache statistics.
|
2012-09-17 15:23:25 +02:00 |
|
Bastian Kleineidam
|
0b5b6ab37b
|
Automatically set --complete for graph output.
|
2012-09-15 15:06:29 +02:00 |
|
Bastian Kleineidam
|
273230d98b
|
Send HTTP Do-Not-Track header.
|
2012-09-14 22:41:38 +02:00 |
|
Bastian Kleineidam
|
e98f15933f
|
Stop checking if all output loggers have been deactivated.
|
2012-09-14 22:36:59 +02:00 |
|
Bastian Kleineidam
|
81d2c4dbd9
|
Improved documentation.
|
2012-09-14 22:26:45 +02:00 |
|
Bastian Kleineidam
|
86f1c74006
|
Close loggers properly on I/O errors.
|
2012-09-14 22:09:18 +02:00 |
|
Bastian Kleineidam
|
6730fb51ee
|
Allow maximum check time specification.
|
2012-09-03 20:17:49 +02:00 |
|
Bastian Kleineidam
|
a1dfaf2f91
|
Add missing docstring.
|
2012-09-02 23:37:43 +02:00 |
|
Bastian Kleineidam
|
21db38546c
|
Updated copyright.
|
2012-09-02 23:36:31 +02:00 |
|
Bastian Kleineidam
|
3baaca47a0
|
Add maximum number of allowed puts on URL queue.
|
2012-09-02 22:44:29 +02:00 |
|
Bastian Kleineidam
|
d8fce1ceeb
|
Do not sort URL queue anymore.
|
2012-09-02 22:32:14 +02:00 |
|
Bastian Kleineidam
|
7a6436f08f
|
Increase checked cache in URL queue.
|
2012-09-02 22:21:49 +02:00 |
|
Bastian Kleineidam
|
4c16d3e702
|
Make 401 unauthorized GET response a warning.
|
2012-08-26 11:32:17 +02:00 |
|
Bastian Kleineidam
|
b6d45eabe5
|
Code cleanup.
|
2012-08-24 09:46:38 +02:00 |
|
Bastian Kleineidam
|
ac6591a009
|
Recognize WML files on Windows.
|
2012-08-24 09:46:26 +02:00 |
|
Bastian Kleineidam
|
7334a9863e
|
Make URL properties in GUI selectable with the mouse.
|
2012-08-24 00:10:59 +02:00 |
|
Bastian Kleineidam
|
ae15d51b30
|
Translate more result strings.
|
2012-08-23 23:59:33 +02:00 |
|
Bastian Kleineidam
|
ce4253263c
|
Do not special case http->ftp redirects.
|
2012-08-23 23:56:36 +02:00 |
|
Bastian Kleineidam
|
7374068941
|
Remove unused import.
|
2012-08-23 16:46:14 +02:00 |
|
Bastian Kleineidam
|
73d64e50ab
|
Fix redirection to new scheme.
|
2012-08-23 16:45:24 +02:00 |
|
Bastian Kleineidam
|
99ab68908c
|
Increase the default number of checker threads.
|
2012-08-23 16:11:47 +02:00 |
|
Bastian Kleineidam
|
bc287d7710
|
Make unauthorized access responses with missing www-authenticate headers an error.
|
2012-08-23 15:52:11 +02:00 |
|
Bastian Kleineidam
|
e252bbf623
|
Remove Amazon quirk because the default behaviour handles this now.
|
2012-08-23 05:36:51 +02:00 |
|
Bastian Kleineidam
|
02a9f0bacb
|
Add utility method to read string options.
|
2012-08-23 04:52:25 +02:00 |
|
Bastian Kleineidam
|
ecef16b2c9
|
Support WML sites.
|
2012-08-22 22:43:14 +02:00 |
|
Bastian Kleineidam
|
36b1bb01e0
|
Fix variable name typo.
|
2012-08-22 22:00:11 +02:00 |
|
Bastian Kleineidam
|
8d36bf4e3d
|
Show URLs in status bar.
|
2012-08-14 23:00:50 +02:00 |
|
Bastian Kleineidam
|
76f57dc4ad
|
Updated copyright.
|
2012-08-14 20:37:24 +02:00 |
|
Bastian Kleineidam
|
6915e2f989
|
Detect sites not supporting HEAD requests.
|
2012-08-14 18:43:39 +02:00 |
|
Bastian Kleineidam
|
db76f01d48
|
Stop application when aborting timed out. Only used on the command line.
|
2012-08-14 17:41:26 +02:00 |
|
Bastian Kleineidam
|
29a5c1a44a
|
Display the real URL name in GUI property field.
|
2012-08-13 18:55:25 +02:00 |
|
Bastian Kleineidam
|
f3b66b102d
|
Fallback to GET when method HEAD is not allowed.
|
2012-08-13 07:07:21 +02:00 |
|
Bastian Kleineidam
|
e65b5c72ce
|
Correct list of schemes requiring host name.
|
2012-08-12 14:21:56 +02:00 |
|
Bastian Kleineidam
|
7b567cc378
|
Make scheme and domain for internal url pattern case insensitive.
|
2012-08-12 14:19:42 +02:00 |
|
Bastian Kleineidam
|
afc0ecd7a6
|
--ignore-url now really ignores URLs.
|
2012-08-12 11:16:29 +02:00 |
|
Bastian Kleineidam
|
b86be09d9e
|
Recalculate extern settings after changing intern patterns.
|
2012-08-12 11:15:18 +02:00 |
|
Bastian Kleineidam
|
6be3e9ddff
|
Cleanup code and improve redirect anchor handling.
|
2012-08-12 11:14:56 +02:00 |
|
Bastian Kleineidam
|
10cc59c654
|
Use colorama only on Windows systems.
|
2012-08-12 10:23:44 +02:00 |
|
Bastian Kleineidam
|
cf53b33c94
|
Remove unused functions.
|
2012-08-11 19:34:27 +02:00 |
|
Bastian Kleineidam
|
aa22dc2702
|
Fix Windows console output.
|
2012-08-11 07:52:04 +02:00 |
|
Bastian Kleineidam
|
d9acc97f9f
|
Use colorama instead of wconio.
|
2012-08-10 22:24:00 +02:00 |
|
Bastian Kleineidam
|
c74690a79a
|
Do not check SSL certificates on HTTPS -> HTTP redirects.
|
2012-08-10 19:43:57 +02:00 |
|
Bastian Kleineidam
|
451a520943
|
Prevent double color stream proxying.
|
2012-08-10 19:43:33 +02:00 |
|
Bastian Kleineidam
|
580ab74f0e
|
Updated German translation.
|
2012-08-09 20:43:31 +02:00 |
|
Bastian Kleineidam
|
82b4dea4fe
|
Updated copyright.
|
2012-08-09 20:43:22 +02:00 |
|
Bastian Kleineidam
|
1c739aed81
|
Use urlparse.uses_relative instead of unofficial urlparse.non_hierarchical (which has been removed in the current CPython 2.7.x trunk).
|
2012-08-04 20:40:31 +02:00 |
|
Bastian Kleineidam
|
b0e5c7fc59
|
Ignore feed: URLs.
|
2012-06-27 21:32:03 +02:00 |
|
Bastian Kleineidam
|
0fd1a78378
|
Always compare encoded anchor names.
|
2012-06-27 20:59:53 +02:00 |
|
Bastian Kleineidam
|
e0d6aecad9
|
Add cancel button to show memory dialog.
|
2012-06-25 20:25:02 +02:00 |
|