61 commits

Author SHA1 Message Date
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
4055721fd4 Use internal gzip2 module
Use the internal gzip replacement module gzip2 for all GzipFile handling.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3685 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:55 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
fe438941a9 cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3576 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-10-02 01:06:24 +00:00
calvin
df48d4a905 bump up copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3534 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-01-01 14:57:38 +00:00
calvin
3f099a6438 use boolean objects for rule line allowance
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3508 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-10-19 20:36:31 +00:00
calvin
0c5d34e9f9 don't discard robots.txt entries with only Allow: lines
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3471 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-21 09:14:28 +00:00
calvin
e8e6a8af9a set modified time after parsing of robots.txt entries
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3348 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-05 19:44:59 +00:00
calvin
19a7495b9e only accept ASCII robots.txt
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3339 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-04 21:07:08 +00:00
calvin
a57618a4ad use relative imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3335 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-01 14:06:19 +00:00
calvin
a4e9b8eab1 fix debugging
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3236 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 16:24:53 +00:00
calvin
a741d7922c add get_crawldelay method
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3226 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 15:35:48 +00:00
calvin
d73aa0e5bd parse crawl-delay parameter line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3211 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 21:29:18 +00:00
calvin
2cfcb5c0bb avoid double timeouts by raising timeout errors in robots.txt retrieval
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3171 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 12:58:31 +00:00
calvin
dc9f04e6dc adjust debug asserts
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3159 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-13 21:03:21 +00:00
calvin
e92aee054c updated copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3010 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-01-03 19:12:47 +00:00
calvin
856ff8ef2a assert debugs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2987 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-18 08:55:42 +00:00
calvin
df34e1a8e9 remove unused imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2919 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-10-25 13:48:30 +00:00
calvin
c9f5d1a0b1 catch gzip errors, and use linkchecker debugging
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2910 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-10-15 00:06:48 +00:00
calvin
f24bb87e54 add missing return
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2784 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-20 12:37:32 +00:00
calvin
afa8750dc3 catch ValueError raised by urllib2
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2783 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-20 09:45:30 +00:00
calvin
baf51d1f5d cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2752 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-14 17:56:05 +00:00
calvin
63b76ec642 Use HTTPMessage() in all urllib handlers, really fixing the bug noted in http://www.python.org/sf/1117588. The workaround has been removed.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2603 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-05-18 17:53:39 +00:00
calvin
44075c47bf clean up raise calls
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2294 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-02-08 14:52:50 +00:00
calvin
973b6d5098 work around for 302 redirect handling error
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2283 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-02-07 12:10:17 +00:00
calvin
05c9b8b5e6 use linkchecker agent on getting /robots.txt
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2194 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-24 09:45:22 +00:00
calvin
adc1d02217 documentation
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2165 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-19 21:11:43 +00:00
calvin
d030a5b054 documentation updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2164 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-19 15:56:48 +00:00
calvin
647d7167ee documentation syntax
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2163 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-19 15:08:02 +00:00
calvin
700d564be7 documentation updates
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2148 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-18 01:00:45 +00:00
calvin
b06f144ced updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2122 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-01-11 02:22:43 +00:00
calvin
c97f68f70a accept unicode in robots.txt can_fetch
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1924 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-11-09 00:00:59 +00:00
calvin
62b2784ebc python 2.4 compat
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1805 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-09-16 20:11:38 +00:00
calvin
ce9dc6fbe9 increment robotparser version
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1655 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-08-31 21:45:34 +00:00
calvin
bc6bd34ffc fix password manager interface
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1654 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-08-31 21:42:34 +00:00
calvin
bffdfa68fd robots.txt password support
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1649 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-08-31 21:20:51 +00:00
calvin
4756641e1b source code restructuring
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1423 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-08-16 19:20:53 +00:00
calvin
1f6670e8cd import fixes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1399 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-07-26 13:47:19 +00:00
calvin
96dd6ef4b8 import fixes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1398 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-07-26 12:01:52 +00:00
calvin
b674575de1 import fixes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1388 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-07-22 13:36:43 +00:00
calvin
5ad8c827b4 syntax updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1374 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-07-20 14:49:44 +00:00
calvin
018cf945d1 new module layout
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1356 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-07-07 18:04:40 +00:00
calvin
8dcd8f408a copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1313 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-05 09:55:14 +00:00
calvin
95611de5c3 replace backticks with repr
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1121 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-20 11:28:55 +00:00
calvin
f741df2bfa catch httplib errors
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1081 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-11-15 15:30:22 +00:00
calvin
765f30fbb1 use internal debug logger
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1073 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-10-31 16:31:05 +00:00
calvin
0de34a7675 ignore decompression errors
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1062 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-10-21 10:11:46 +00:00
calvin
c4d243dfc7 re-added robotparser2.py, updated it
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1041 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-09-23 21:59:17 +00:00
calvin
14c9cbc4c4 updated tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@324 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2001-11-29 13:49:52 +00:00
calvin
8f4260458a docstrings and copyright updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@243 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2001-03-15 01:19:35 +00:00