Bastian Kleineidam
|
b39158e65c
|
Improve available anchor message.
|
2012-10-24 22:21:46 +02:00 |
|
Bastian Kleineidam
|
dd2c963fac
|
Fix non-ASCII exception handling.
|
2012-10-24 22:14:45 +02:00 |
|
Bastian Kleineidam
|
06a25676c5
|
Only read the maximum data size plus one, not the whole file.
|
2012-10-10 06:35:33 +02:00 |
|
Bastian Kleineidam
|
6d47b76509
|
Limit HTTP and FTP connections. Gets rid of spurious BadStatusLine errors.
|
2012-10-09 21:04:20 +02:00 |
|
Bastian Kleineidam
|
ad8525c483
|
Improve BadStatusLine error message.
|
2012-10-05 08:32:24 +02:00 |
|
Bastian Kleineidam
|
ed7c60e491
|
Do not warn about duplicate URLs which can point to the same content.
|
2012-10-01 13:42:46 +02:00 |
|
Bastian Kleineidam
|
c274b50c50
|
Store lowercase URL scheme in checker class.
|
2012-09-21 14:35:25 +02:00 |
|
Bastian Kleineidam
|
0941f6ff02
|
Improve exception handling by using unicode.
|
2012-09-21 14:29:20 +02:00 |
|
Bastian Kleineidam
|
7c6dce6136
|
Only warn about non-empty site duplicates.
|
2012-09-20 20:39:36 +02:00 |
|
Bastian Kleineidam
|
a03090c20f
|
Optimize intern/extern pattern parsing.
|
2012-09-20 20:19:13 +02:00 |
|
Bastian Kleineidam
|
bff217c58b
|
Never log ignored warnings.
|
2012-09-20 12:44:40 +02:00 |
|
Bastian Kleineidam
|
600b7c0e69
|
Fix duplicate content warning when self.size is not set yet.
|
2012-09-20 12:44:23 +02:00 |
|
Bastian Kleineidam
|
18a200d85f
|
Fix tests.
|
2012-09-19 11:05:26 +02:00 |
|
Bastian Kleineidam
|
3a352631ba
|
Add modified field to loggers.
|
2012-09-18 12:12:00 +02:00 |
|
Bastian Kleineidam
|
4e59056ee7
|
Warn about duplicate URL contents.
|
2012-09-17 19:49:50 +02:00 |
|
Bastian Kleineidam
|
cb71f483a5
|
Warn about too long URLs.
|
2012-09-17 16:00:23 +02:00 |
|
Bastian Kleineidam
|
6e1841cf1f
|
Print download and cache statistics.
|
2012-09-17 15:23:25 +02:00 |
|
Bastian Kleineidam
|
7a6436f08f
|
Increase checked cache in URL queue.
|
2012-09-02 22:21:49 +02:00 |
|
Bastian Kleineidam
|
ecef16b2c9
|
Support WML sites.
|
2012-08-22 22:43:14 +02:00 |
|
Bastian Kleineidam
|
e65b5c72ce
|
Correct list of schemes requiring host name.
|
2012-08-12 14:21:56 +02:00 |
|
Bastian Kleineidam
|
afc0ecd7a6
|
--ignore-url now really ignores URLs.
|
2012-08-12 11:16:29 +02:00 |
|
Bastian Kleineidam
|
0fd1a78378
|
Always compare encoded anchor names.
|
2012-06-27 20:59:53 +02:00 |
|
Bastian Kleineidam
|
5c045fef44
|
Fix UNC path handling on Windows.
|
2012-06-24 10:30:54 +02:00 |
|
Bastian Kleineidam
|
73b176d7c9
|
Fix URL joining: properly detect absolute URL.
|
2012-06-23 13:33:27 +02:00 |
|
Bastian Kleineidam
|
f107092a8a
|
Fix handling of user/password info in URLs.
|
2012-06-10 22:07:42 +02:00 |
|
Bastian Kleineidam
|
98537eea2f
|
Code cleanup: use add_url() function in UrlBase.
|
2012-06-10 14:24:17 +02:00 |
|
Bastian Kleineidam
|
db95fce77e
|
Ignore PHP processing instructions in local files.
|
2012-06-10 14:02:01 +02:00 |
|
Bastian Kleineidam
|
837ab22d01
|
Syntax cleanup.
|
2012-06-10 11:46:05 +02:00 |
|
Bastian Kleineidam
|
77b8ec0fcd
|
Fix writing temporary Word files.
|
2012-06-10 11:07:35 +02:00 |
|
Bastian Kleineidam
|
52dcf101e0
|
Remove rest of deprecated options.
|
2012-04-22 17:55:12 +02:00 |
|
Bastian Kleineidam
|
b9b8e3f5b2
|
Honor the charset encoding of the Content-Type HTTP header when parsing HTML.
|
2012-03-22 22:45:11 +01:00 |
|
Bastian Kleineidam
|
4c9fd8d488
|
Cache real URL.
|
2012-03-14 21:12:13 +01:00 |
|
Bastian Kleineidam
|
042b0569ec
|
Fall back to W3C checkers.
|
2012-01-22 08:13:27 +01:00 |
|
Bastian Kleineidam
|
51cf55b7a6
|
Remove warning: prefix from warning messages.
|
2012-01-21 00:25:02 +01:00 |
|
Bastian Kleineidam
|
f1eb51d885
|
Updated copyright.
|
2012-01-06 09:21:30 +01:00 |
|
Bastian Kleineidam
|
033280cfb9
|
Remove workarounds for old Python versions.
|
2012-01-04 20:17:53 +01:00 |
|
Bastian Kleineidam
|
3d9958dfbb
|
Parse Safari bookmark files.
|
2011-12-17 16:38:25 +01:00 |
|
Bastian Kleineidam
|
27b7b1cb49
|
Fix W3C HTML validation.
|
2011-10-09 21:16:45 +02:00 |
|
Bastian Kleineidam
|
89ec0ee6a1
|
Check multiple matches of warning regex.
|
2011-10-09 19:00:35 +02:00 |
|
Bastian Kleineidam
|
72b65d94df
|
Only check anchors in HTML pages.
|
2011-05-22 17:33:16 +02:00 |
|
Bastian Kleineidam
|
e5c2271533
|
Only check warning patterns in parseable contents.
|
2011-05-22 17:32:26 +02:00 |
|
Bastian Kleineidam
|
68ea03ee16
|
Support both Chromium and Google Chrome profile dirs to find bookmark files.
|
2011-05-21 11:47:54 +02:00 |
|
Bastian Kleineidam
|
78790d7c8d
|
Improved anchor warning message display.
|
2011-05-20 06:48:06 +02:00 |
|
Bastian Kleineidam
|
343cf9703d
|
Code cleanup: indentation, unused variables etc.
|
2011-05-15 18:36:30 +02:00 |
|
Bastian Kleineidam
|
10bbb696e8
|
Limit download file size to 5MB.
|
2011-05-05 21:10:55 +02:00 |
|
Bastian Kleineidam
|
719441cca5
|
Make module detection more robust and use it when possible.
|
2011-04-20 09:08:11 +02:00 |
|
Bastian Kleineidam
|
84f6d56a49
|
Print level in XML, CSV and SQL loggers.
|
2011-04-09 10:51:03 +02:00 |
|
Bastian Kleineidam
|
c0732e3d37
|
Do not print empty country information.
|
2011-04-06 17:22:48 +02:00 |
|
Bastian Kleineidam
|
82e5ba8ce6
|
Add warning tag attribute in XML loggers.
|
2011-03-15 13:42:21 +01:00 |
|
Bastian Kleineidam
|
7b33cfac7b
|
Use stripped URL base when constructing absolute URL.
|
2011-03-11 15:17:36 +01:00 |
|