mirror of
https://github.com/Hopiu/linkchecker.git
synced 2026-03-30 04:30:28 +00:00
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@952 e7d03fd6-7b0d-0410-9947-9c21f3af8025
- all threads should regularly poll a status variable
  this can be used to make Ctrl-C respond faster, and to print messages

- the HTML parser should be even more forgiving with badly formatted HTML

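A minimal sketch of the status-variable idea, assuming a `threading.Event` is used as the shared flag. The names `worker`, `run_workers` and `stop_requested` are hypothetical and not taken from the linkchecker sources; the `time.sleep` call only stands in for checking one URL.

```python
import threading
import time


def worker(stop_requested, results, name):
    """Hypothetical worker: polls the shared flag between small units of work."""
    checked = 0
    while not stop_requested.is_set():
        time.sleep(0.01)  # placeholder for checking a single URL
        checked += 1
    results[name] = checked


def run_workers(num_threads=3, run_for=0.1):
    """Start the workers, wait a while, then ask all of them to stop."""
    stop_requested = threading.Event()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(stop_requested, results, "t%d" % i))
        for i in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    try:
        time.sleep(run_for)  # the main thread would normally wait for the URL queue
    except KeyboardInterrupt:
        pass  # Ctrl-C falls through to the cleanup below
    finally:
        stop_requested.set()  # every worker sees this at its next poll
        for thread in threads:
            thread.join()
    return results
```

Because each worker checks the flag after every unit of work, the delay between Ctrl-C and shutdown is bounded by the longest single unit, not by the whole run.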
possible Python 2.3 improvements (i.e. these need Python >= 2.3)
- get rid of timeoutsocket.py; the standard socket module supports timeouts
- use optparse instead of getopt, for more flexible command-line help
- replace the debug() function with the logging module
  we'll see how multiple debug levels can be mapped onto it
- use the bool type (True/False)
- get rid of the patched robotparser.py
- use the new csv module
- use the Set type instead of hashmaps (did I use hashmaps for sets here?)

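A rough sketch of how several of the items above could fit together, written for a current Python interpreter and not for 2.3 itself (so it uses the built-in `set` where 2.3 would have used the `sets` module). All function names, option names and the logger name are hypothetical; only the standard-library calls are real.

```python
import csv
import io
import logging
import optparse
import socket


def parse_args(argv):
    """optparse generates the --help text from the option definitions."""
    parser = optparse.OptionParser(usage="%prog [options] url...")
    parser.add_option("-t", "--timeout", type="float", default=30.0,
                      help="socket timeout in seconds [default: %default]")
    parser.add_option("-v", "--verbose", action="count", default=0,
                      help="repeat for more debug output")
    return parser.parse_args(argv)


def configure(options):
    """Apply the timeout globally and map -v counts to logging levels."""
    socket.setdefaulttimeout(options.timeout)  # replaces timeoutsocket.py
    levels = [logging.WARNING, logging.INFO, logging.DEBUG]
    log = logging.getLogger("linkcheck.sketch")  # hypothetical logger name
    log.setLevel(levels[min(options.verbose, len(levels) - 1)])
    return log


def unique_urls(urls):
    """Drop duplicate URLs with a real set, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def write_csv(rows):
    """Write (url, valid) rows with the csv module and return the text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["url", "valid"])
    writer.writerows(rows)
    return out.getvalue()
```

Mapping the `-v` count onto logging levels is one possible answer to the "multiple debug levels" question; a custom numeric level per debug category would be another.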
include some web check and/or spider features:
- warn if the overall size of a page (including images, Flash, etc.) is too big
- save downloaded pages
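The page-size warning could be sketched as below, assuming the checker already knows the byte size of the HTML document and of each embedded resource. The function names, the `resource_sizes` mapping and the 500 KiB default limit are hypothetical, chosen only for illustration.

```python
MAX_PAGE_BYTES = 500 * 1024  # hypothetical default limit


def total_page_size(html_size, resource_sizes):
    """Sum the HTML size and the sizes of all embedded resources (bytes)."""
    return html_size + sum(resource_sizes.values())


def size_warning(url, html_size, resource_sizes, max_bytes=MAX_PAGE_BYTES):
    """Return a warning string if the page exceeds max_bytes, else None."""
    total = total_page_size(html_size, resource_sizes)
    if total > max_bytes:
        return "warning: %s is %d bytes in total (limit %d)" % (
            url, total, max_bytes)
    return None
```

Returning the message instead of printing it keeps the check independent of whichever output logger is in use.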