Commit graph

526 commits

Author SHA1 Message Date
calvin
097bb8a143 mv contentAllowsRobots to end of recursion check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1339 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 20:43:41 +00:00
calvin
49e2b1f10d rework anchor fallback
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1336 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 20:27:59 +00:00
calvin
715a80afff ignore flush errors
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1335 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 17:31:05 +00:00
calvin
7556d4e72c correctly quote request url
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1331 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 09:05:46 +00:00
calvin
58fab5a44f updated from webcleaner
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1330 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 08:29:43 +00:00
calvin
04e0a9448d updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1325 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 00:30:37 +00:00
calvin
4353d97854 added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1324 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-27 00:18:04 +00:00
calvin
1f28911a23 actually fallback to GET with Zope servers
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1321 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:54:47 +00:00
calvin
ce68bb782a cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1320 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:48:28 +00:00
calvin
e9341590d4 better err msg on bad status line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1318 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:02:47 +00:00
calvin
abccff16ea fall back to GET on bad status line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1317 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:00:21 +00:00
calvin
37b69e16e4 uri regex url added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1314 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-03 08:53:38 +00:00
calvin
8dcd8f408a copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1313 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-05 09:55:14 +00:00
calvin
ca081c2168 also check robots allowance of HTML files
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1304 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:48:31 +00:00
calvin
50bc463bb1 check cache
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1303 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:42:15 +00:00
calvin
2ffb97a855 get new urls from top of queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1302 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:45:33 +00:00
calvin
e78a8ea539 full css parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1300 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:30:10 +00:00
calvin
f4802fd467 pychecker
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1299 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:41:30 +00:00
calvin
fa46757bd7 fix import
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1298 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:34:21 +00:00
calvin
68451e65dd O3
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1297 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:31:57 +00:00
calvin
93253954a8 updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1296 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:48 +00:00
calvin
672e118d9b use sorted dict
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1295 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:38 +00:00
calvin
8e4e92dddd minor improvements
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1294 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:21 +00:00
calvin
1b148b0b4e sorted dict
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1293 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:01 +00:00
calvin
e183ac84dc handle missing startquotes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1292 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:29:31 +00:00
calvin
52609e4399 pychecker
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1291 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 18:10:38 +00:00
calvin
6b1d124d35 no sys.path fiddling
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1290 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 17:59:31 +00:00
calvin
8584d5bc8e only check robots.txt for http
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1285 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 16:34:58 +00:00
calvin
67fabd5d8e add contact email and url to user-agent string
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1284 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-11 11:01:00 +00:00
calvin
d8e738c60b check syntax and cache before putting url objects in the queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1277 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-04 12:17:38 +00:00
calvin
d79aee3a2c xml prefix for attr var
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1272 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-01 15:49:32 +00:00
calvin
af5be26d2c use XmlUtils instead of xmlify for quoting
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1271 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-01 15:38:56 +00:00
calvin
b63fb15986 hmmmm
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1267 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 15:06:22 +00:00
calvin
b7e54260b0 also quote parent url
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1265 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 14:54:10 +00:00
calvin
033a0873be better error msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1261 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 11:56:24 +00:00
calvin
58057bd07f better err msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1260 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 11:48:39 +00:00
calvin
85115c2039 cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1257 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:17:49 +00:00
calvin
bd628b7de7 use new url.py
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1256 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:14:31 +00:00
calvin
5187dbc4c2 quote url in output
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1255 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:13:42 +00:00
calvin
ab9092d7a0 catch errors earlier in recursion check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1253 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-19 23:27:21 +00:00
calvin
fefba0036d catch ValueError, raise IncompleteRead on invalid chunk length
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1250 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-19 23:13:30 +00:00
calvin
a02d8ae2a4 fallback in redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1239 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 23:47:21 +00:00
calvin
83b7ef7ff9 break cycles
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1238 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 23:46:13 +00:00
calvin
4e8c8547ec fix typos
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1237 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:39:15 +00:00
calvin
7121f81aff language
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1236 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:37:06 +00:00
calvin
967cadaa26 fallback to GET
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1231 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:20:28 +00:00
calvin
76452953f8 use file instead of open
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1226 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:04:49 +00:00
calvin
669866a7ab add NoneLogger
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1223 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:50 +00:00
calvin
fa9023d9f8 fix file parsing, ignore comments and empty lines
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1222 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:31 +00:00
calvin
8a474914f3 added NoneLogger, adjust blacklist default file and handling
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1221 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:06 +00:00