Commit graph

516 commits

Author SHA1 Message Date
calvin
abccff16ea fall back to GET on bad status line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1317 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-26 23:00:21 +00:00
calvin
37b69e16e4 uri regex url added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1314 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-05-03 08:53:38 +00:00
calvin
8dcd8f408a copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1313 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-05 09:55:14 +00:00
calvin
ca081c2168 also check robots allowance of HTML files
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1304 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:48:31 +00:00
calvin
50bc463bb1 check cache
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1303 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 10:42:15 +00:00
calvin
2ffb97a855 get new urls from top of queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1302 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:45:33 +00:00
calvin
e78a8ea539 full css parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1300 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 09:30:10 +00:00
calvin
f4802fd467 pychecker
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1299 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:41:30 +00:00
calvin
fa46757bd7 fix import
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1298 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:34:21 +00:00
calvin
68451e65dd O3
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1297 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:31:57 +00:00
calvin
93253954a8 updated
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1296 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:48 +00:00
calvin
672e118d9b use sorted dict
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1295 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:38 +00:00
calvin
8e4e92dddd minor improvements
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1294 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:21 +00:00
calvin
1b148b0b4e sorted dict
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1293 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:30:01 +00:00
calvin
e183ac84dc handle missing startquotes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1292 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-04 08:29:31 +00:00
calvin
52609e4399 pychecker
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1291 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 18:10:38 +00:00
calvin
6b1d124d35 no sys.path fiddling
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1290 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 17:59:31 +00:00
calvin
8584d5bc8e only check robots.txt for http
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1285 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-04-03 16:34:58 +00:00
calvin
67fabd5d8e add contact email and url to user-agent string
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1284 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-11 11:01:00 +00:00
calvin
d8e738c60b check syntax and cache before putting url objects in the queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1277 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-04 12:17:38 +00:00
calvin
d79aee3a2c xml prefix for attr var
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1272 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-01 15:49:32 +00:00
calvin
af5be26d2c use XmlUtils instead of xmlify for quoting
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1271 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-03-01 15:38:56 +00:00
calvin
b63fb15986 hmmmm
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1267 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 15:06:22 +00:00
calvin
b7e54260b0 also quote parent url
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1265 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 14:54:10 +00:00
calvin
033a0873be better error msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1261 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 11:56:24 +00:00
calvin
58057bd07f better err msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1260 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-21 11:48:39 +00:00
calvin
85115c2039 cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1257 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:17:49 +00:00
calvin
bd628b7de7 use new url.py
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1256 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:14:31 +00:00
calvin
5187dbc4c2 quote url in output
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1255 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-20 14:13:42 +00:00
calvin
ab9092d7a0 catch errors earlier in recursion check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1253 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-19 23:27:21 +00:00
calvin
fefba0036d catch ValueError, raise IncompleteRead on invalid chunk length
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1250 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-02-19 23:13:30 +00:00
calvin
a02d8ae2a4 fallback in redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1239 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 23:47:21 +00:00
calvin
83b7ef7ff9 break cycles
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1238 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 23:46:13 +00:00
calvin
4e8c8547ec fix typos
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1237 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:39:15 +00:00
calvin
7121f81aff language
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1236 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:37:06 +00:00
calvin
967cadaa26 fallback to GET
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1231 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 21:20:28 +00:00
calvin
76452953f8 use file instead of open
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1226 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:04:49 +00:00
calvin
669866a7ab add NoneLogger
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1223 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:50 +00:00
calvin
fa9023d9f8 fix file parsing, ignore comments and empty lines
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1222 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:31 +00:00
calvin
8a474914f3 added NoneLogger, adjust blacklist default file and handling
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1221 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:02:06 +00:00
calvin
d78d96dd0e added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1220 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 19:01:24 +00:00
calvin
7216e582fe nicer host not found error msg
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1213 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 14:36:21 +00:00
calvin
2c119a027a added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1211 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-29 14:10:35 +00:00
calvin
4df200a2d2 merged from webcleaner
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1205 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 23:38:00 +00:00
calvin
f4dde29117 parse fixes merged from webcleaner
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1204 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 23:04:39 +00:00
calvin
44f5941552 use new parser interface
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1203 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 22:49:20 +00:00
calvin
66ecc466b7 resolve entities
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1202 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 22:48:50 +00:00
calvin
26072afd92 new style parser object class
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1200 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 22:33:34 +00:00
calvin
aa64775892 added setdefault function
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1196 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 09:03:21 +00:00
calvin
c62de8c0d5 gc debug functions
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1195 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2004-01-28 09:03:11 +00:00