added crawl-delay support

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3231 e7d03fd6-7b0d-0410-9947-9c21f3af8025
calvin 2006-05-17 15:36:44 +00:00
parent 4bf2b361cb
commit 2914ee1f45
2 changed files with 5 additions and 2 deletions


@@ -52,6 +52,11 @@
Changed: linkcheck/cache/connection.py, linkcheck/checker/urlbase.py,
linkcheck/directory/__init__.py
* Honor the "Crawl-delay" directive in robots.txt files.
Type: feature
Changed: linkcheck/robotparser2.py, linkcheck/checker/httpurl.py,
linkcheck/cache/robots_txt.py, linkcheck/cache/connection.py,
3.4 "The Chumscrubbers" (released 4.2.2006)
* Ignore decoding errors when retrieving the robots.txt URL.

TODO

@@ -1,5 +1,3 @@
- [FEATURE] Add robots.txt crawl-delay support in linkchecker
- [TEST] Add test for cookie file parsing
- [BUGFIX] The URL in the log output is double normed.
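
The ChangeLog entry above says that the "Crawl-delay" directive in robots.txt is now honored. As a rough illustration of what honoring it involves, here is a minimal sketch that uses Python's standard-library `urllib.robotparser` (Python 3.6 or later) rather than linkchecker's own `linkcheck/robotparser2.py`, whose code is not shown in this commit view. The helper names `crawl_delay_for` and `seconds_to_wait` and the sample robots.txt text are hypothetical and exist only for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 3
Disallow: /private/
"""


def crawl_delay_for(robots_txt, user_agent="*"):
    """Return the Crawl-delay in seconds for user_agent, or 0 if none is set.

    Uses the standard-library parser, not linkchecker's robotparser2 module.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return 0 if delay is None else delay


def seconds_to_wait(last_request_time, delay, now):
    """Return how long to pause so that requests are at least `delay` apart."""
    return max(0, last_request_time + delay - now)


if __name__ == "__main__":
    delay = crawl_delay_for(SAMPLE_ROBOTS_TXT)
    # Last request at t=10.0, current time t=11.0: part of the delay remains.
    print(delay, seconds_to_wait(10.0, delay, 11.0))
```

A checker would typically record the time of the last request per host and sleep for the value returned by `seconds_to_wait` before contacting that host again. How linkchecker itself stores and applies the delay is not visible in this diff.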