Mirror of https://github.com/Hopiu/linkchecker.git (synced 2026-05-21 04:41:52 +00:00)
add robots.txt RFC link
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2520 e7d03fd6-7b0d-0410-9947-9c21f3af8025
parent f98d2d3641
commit cca21759f0
2 changed files with 5 additions and 3 deletions
@@ -277,7 +277,7 @@ If you really want to disable this, use the <tt class="docutils literal"><span c
 option.</p>
 <p><strong>Q: I see LinkChecker gets a /robots.txt file for every site it
 checks. What is that about?</strong></p>
-<p>A: LinkChecker follows the robots.txt exclusion standard. To avoid
+<p>A: LinkChecker follows the <a class="reference" href="http://www.robotstxt.org/wc/norobots-rfc.html">robots.txt exclusion standard</a>. To avoid
 misuse of LinkChecker, you cannot turn this feature off.
 See the <a class="reference" href="http://www.robotstxt.org/wc/robots.html">Web Robot pages</a> and the <a class="reference" href="http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt">Spidering report</a> for more info.</p>
 <p><strong>Q: Ctrl-C does not stop LinkChecker immediately. Why is that so?</strong></p>
@@ -297,7 +297,7 @@ and look for missing files.</p>
 </div>
 <hr class="docutils footer" />
 <div class="footer">
-Generated on: 2005-04-01 09:00 UTC.
+Generated on: 2005-04-22 13:38 UTC.
 </div>
 </body>
 </html>
@@ -299,10 +299,12 @@ option.
 **Q: I see LinkChecker gets a /robots.txt file for every site it
 checks. What is that about?**
 
-A: LinkChecker follows the robots.txt exclusion standard. To avoid
+A: LinkChecker follows the `robots.txt exclusion standard`_. To avoid
 misuse of LinkChecker, you cannot turn this feature off.
 See the `Web Robot pages`_ and the `Spidering report`_ for more info.
 
+.. _`robots.txt exclusion standard`:
+http://www.robotstxt.org/wc/norobots-rfc.html
 .. _`Web Robot pages`:
 http://www.robotstxt.org/wc/robots.html
 .. _`Spidering report`: