add robots.txt RFC link

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2520 e7d03fd6-7b0d-0410-9947-9c21f3af8025
calvin 2005-04-22 13:39:25 +00:00
parent f98d2d3641
commit cca21759f0
2 changed files with 5 additions and 3 deletions


@@ -277,7 +277,7 @@ If you really want to disable this, use the <tt class="docutils literal"><span c
option.</p>
<p><strong>Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?</strong></p>
-<p>A: LinkChecker follows the robots.txt exclusion standard. To avoid
+<p>A: LinkChecker follows the <a class="reference" href="http://www.robotstxt.org/wc/norobots-rfc.html">robots.txt exclusion standard</a>. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the <a class="reference" href="http://www.robotstxt.org/wc/robots.html">Web Robot pages</a> and the <a class="reference" href="http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt">Spidering report</a> for more info.</p>
<p><strong>Q: Ctrl-C does not stop LinkChecker immediately. Why is that so?</strong></p>
@@ -297,7 +297,7 @@ and look for missing files.</p>
</div>
<hr class="docutils footer" />
<div class="footer">
-Generated on: 2005-04-01 09:00 UTC.
+Generated on: 2005-04-22 13:38 UTC.
</div>
</body>
</html>


@@ -299,10 +299,12 @@ option.
**Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?**
-A: LinkChecker follows the robots.txt exclusion standard. To avoid
+A: LinkChecker follows the `robots.txt exclusion standard`_. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the `Web Robot pages`_ and the `Spidering report`_ for more info.
+.. _`robots.txt exclusion standard`:
+ http://www.robotstxt.org/wc/norobots-rfc.html
.. _`Web Robot pages`:
http://www.robotstxt.org/wc/robots.html
.. _`Spidering report`: