Commit graph

154 commits

Author, SHA1 and message, git-svn-id, date (one field per line for each commit below)
calvin
097bb8a143 mv contentAllowsRobots to end of recursion check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1339 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 20:43:41 +00:00
calvin
e9341590d4 better err msg on bad status line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1318 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:02:47 +00:00
calvin
ca081c2168 also check robots allowance of HTML files
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1304 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:48:31 +00:00
calvin
50bc463bb1 check cache
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1303 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:42:15 +00:00
calvin
2ffb97a855 get new urls from top of queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1302 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:45:33 +00:00
calvin
e78a8ea539 full css parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1300 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:30:10 +00:00
calvin
52609e4399 pychecker
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1291 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 18:10:38 +00:00
calvin
8584d5bc8e only check robots.txt for http
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1285 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 16:34:58 +00:00
calvin
d8e738c60b check syntax and cache before putting url objects in the queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1277 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-04 12:17:38 +00:00
calvin
033a0873be better error msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1261 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 11:56:24 +00:00
calvin
85115c2039 cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1257 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:17:49 +00:00
calvin
ab9092d7a0 catch errors earlier in recursion check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1253 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-19 23:27:21 +00:00
calvin
83b7ef7ff9 break cycles
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1238 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 23:46:13 +00:00
calvin
7121f81aff language
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1236 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:37:06 +00:00
calvin
7216e582fe nicer host not found error msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1213 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 14:36:21 +00:00
calvin
44f5941552 use new parser interface
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1203 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 22:49:20 +00:00
calvin
fce225826b honor nofollow robots.txt param in html meta tag
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1177 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-07 20:50:07 +00:00
calvin
ed563ee2e6 cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1173 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-04 09:23:00 +00:00
calvin
fef96392d6 updated copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1150 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-03 14:59:33 +00:00
calvin
a17bf11f4b updated caching
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1132 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-02 23:30:22 +00:00
calvin
83a8c945dd only cache needed info
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1127 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-29 19:12:51 +00:00
calvin
95611de5c3 replace backticks with repr
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1121 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-20 11:28:55 +00:00
calvin
7d87f007d4 do not add automatic filters with --strict when there are already some
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1090 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-07 10:14:18 +00:00
calvin
519c000cd5 fix inifinite recursion option
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1086 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-05 00:38:11 +00:00
calvin
3bc1816eb1 initialize self.urlparts
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1084 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-12-02 15:46:21 +00:00
calvin
e0a063104e parse css files recursively
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1058 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-10-17 10:53:48 +00:00
calvin
8f9e0d7a97 check CSS background image urls
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1052 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-10-16 20:39:59 +00:00
calvin
c744aa56fc use new robotparser2
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1042 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-09-23 22:00:14 +00:00
calvin
7e5970d256 remove unused imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1020 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-08-18 22:14:47 +00:00
calvin
c03e824438 use new-style classes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1008 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-08-11 13:19:39 +00:00
calvin
cbaebd7999 boolean stuff
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1001 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-08-11 12:29:11 +00:00
calvin
8c1deec0c9 use boolean values, timeout changes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@998 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-08-11 11:49:30 +00:00
calvin
d25b0d51ce fix denyllow match for intern patterns
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@972 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-08-05 12:20:55 +00:00
calvin
307e57ba0d redirection fixed
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@961 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-07-28 12:05:24 +00:00
calvin
09d916058a also accept a list of cache keys
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@960 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-07-28 11:25:40 +00:00
calvin
cf17757fae check cache again on redirects
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@957 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-07-28 10:55:16 +00:00
calvin
a4bcfca88e fix bug in strict check domain guessing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@942 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-07-09 20:36:44 +00:00
calvin
308ceb45c5 add coding line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@933 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-07-04 14:24:44 +00:00
calvin
a3c938b8ae add domain colon for -s
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@927 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-06-25 20:03:17 +00:00
calvin
a3c0a2a9c6 cleanup, minor text parsing glitch
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@924 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-06-24 20:57:24 +00:00
calvin
9e9a8c63b7 enable automatic intern link config for -s
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@911 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-06-18 09:03:51 +00:00
calvin
ec4c6a0c81 inifinite recursion
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@907 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-06-18 07:50:08 +00:00
calvin
98a0c329d8 also get id attributes on anchor check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@891 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-05-21 22:25:03 +00:00
calvin
b5ebd56f2b display parse info
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@870 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-04-30 13:56:31 +00:00
calvin
fc35e1f97f fix the anchor fix
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@858 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-04-29 09:40:33 +00:00
calvin
703abd1ca7 remove anchor from HEAD and GET requests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@853 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-04-28 13:08:00 +00:00
calvin
6300d08cfd use urllib2
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@836 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-04-17 13:06:14 +00:00
calvin
1382aebcc7 more imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@835 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-04-17 11:36:12 +00:00
calvin
13d41d290b new option --no-anchor-caching
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@813 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-03-16 14:12:21 +00:00
calvin
9676d14c89 only cache anchor urls with -a
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@809 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2003-03-05 11:34:40 +00:00