<html>
<head>
<title>A Standard for Robot Exclusion</title>
</head>
<body bgcolor=white>

<div align=right>
<font size="+1" color=maroon>
<i>
The Web Robots Pages
<a href="robots.html"><img src="lt.gif" border=0 WIDTH=9 HEIGHT=12></a>
</i>
</font>
</div>
<hr>
<pre>


Network Working Group                                         M. Koster
INTERNET DRAFT                                               WebCrawler
Category: Informational                                December 4, 1996
                                                   Expires June 4, 1997

                      <draft-koster-robots-00.txt>


                    A Method for Web Robots Control


Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as ``work in
   progress.''

   To learn the current status of any Internet-Draft, please check
   the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).


Koster              draft-koster-robots-00.txt                 [Page 1]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

Table of Contents

   1. Abstract . . . . . . . . . . . . . . . . . . . . . . . . .  2
   2. Introduction . . . . . . . . . . . . . . . . . . . . . . .  2
   3. The Specification  . . . . . . . . . . . . . . . . . . . .  3
   3.1 Access method  . . . . . . . . . . . . . . . . . . . . . .  3
   3.2 File Format Description  . . . . . . . . . . . . . . . . .  4
   3.2.1 The User-agent line  . . . . . . . . . . . . . . . . . .  5
   3.2.2 The Allow and Disallow lines . . . . . . . . . . . . . .  5
   3.3 Formal Syntax  . . . . . . . . . . . . . . . . . . . . . .  6
   3.4 Expiration . . . . . . . . . . . . . . . . . . . . . . . .  8
   4. Examples . . . . . . . . . . . . . . . . . . . . . . . . .  8
   5. Notes for Implementors . . . . . . . . . . . . . . . . . .  9
   5.1 Backwards Compatibility  . . . . . . . . . . . . . . . . .  9
   5.2 Interoperability . . . . . . . . . . . . . . . . . . . . . 10
   6. Security Considerations  . . . . . . . . . . . . . . . . . 10
   7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . 10
   8. References . . . . . . . . . . . . . . . . . . . . . . . . 11
   9. Author's Address . . . . . . . . . . . . . . . . . . . . . 11

1. Abstract

   This memo defines a method for administrators of sites on the
   World-Wide Web to give instructions to visiting Web robots, most
   importantly what areas of the site are to be avoided.

   This document provides a more rigid specification of the Standard
   for Robots Exclusion [1], which has been in widespread use by the
   Web community since 1994.

2. Introduction

   Web Robots (also called "Wanderers" or "Spiders") are Web client
   programs that automatically traverse the Web's hypertext structure
   by retrieving a document, and recursively retrieving all documents
   that are referenced.

   Note that "recursively" here doesn't limit the definition to any
   specific traversal algorithm; even if a robot applies some
   heuristic to the selection and order of documents to visit and
   spaces out requests over a long period of time, it still qualifies
   as a robot.

   Robots are often used for maintenance and indexing purposes, by
   people other than the administrators of the site being visited. In
   some cases such visits may have undesirable effects which the


Koster              draft-koster-robots-00.txt                 [Page 2]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   administrators would like to prevent, such as indexing of an
   unannounced site, traversal of parts of the site which require
   vast resources of the server, recursive traversal of an infinite
   URL space, etc.

   The technique specified in this memo allows Web site
   administrators to indicate to visiting robots which parts of the
   site should be avoided. It is solely up to the visiting robot to
   consult this information and act accordingly. Blocking parts of
   the Web site regardless of a robot's compliance with this method
   is outside the scope of this memo.

3. The Specification

   This memo specifies a format for encoding instructions to visiting
   robots, and specifies an access method to retrieve these
   instructions. Robots must retrieve these instructions before
   visiting other URLs on the site, and use the instructions to
   determine whether other URLs on the site can be accessed.

3.1 Access method

   The instructions must be accessible via HTTP [2] from the site
   that the instructions are to be applied to, as a resource of
   Internet Media Type [3] "text/plain" under a standard relative
   path on the server: "/robots.txt".

   For convenience we will refer to this resource as the "/robots.txt
   file", though the resource need in fact not originate from a
   file-system.

   Some examples of URLs [4] for sites, and the URLs of the
   corresponding "/robots.txt" resources:

      http://www.foo.com/welcome.html   http://www.foo.com/robots.txt
      http://www.bar.com:8001/          http://www.bar.com:8001/robots.txt

   If the server response indicates Success (HTTP 2xx Status Code),
   the robot must read the content, parse it, and follow any
   instructions applicable to that robot.

   If the server response indicates the resource does not exist (HTTP
   Status Code 404), the robot can assume no instructions are
   available, and that access to the site is not restricted by
   /robots.txt.


Koster              draft-koster-robots-00.txt                 [Page 3]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   Specific behaviours for other server responses are not required by
   this specification, though the following behaviours are
   recommended:

   - On a server response indicating access restrictions (HTTP Status
     Code 401 or 403) a robot should regard access to the site as
     completely restricted.

   - If the request attempt resulted in a temporary failure, a robot
     should defer visits to the site until such time as the resource
     can be retrieved.

   - On a server response indicating Redirection (HTTP Status Code
     3xx) a robot should follow the redirects until a resource can be
     found.
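
   As a non-normative illustration, the retrieval rules above can be
   sketched in Python using only the standard library; the function
   name fetch_robots_txt and its return convention are assumptions of
   this sketch, not part of the specification:

      from urllib.parse import urlsplit, urlunsplit
      from urllib.request import urlopen
      from urllib.error import HTTPError, URLError

      def fetch_robots_txt(site_url):
          """Return (body, fully_restricted) for a site's /robots.txt."""
          scheme, netloc = urlsplit(site_url)[:2]
          robots_url = urlunsplit((scheme, netloc, "/robots.txt", "", ""))
          try:
              # urlopen follows 3xx redirects, as recommended above
              with urlopen(robots_url) as resp:   # 2xx: read and parse
                  return resp.read().decode("ascii", "replace"), False
          except HTTPError as e:
              if e.code == 404:                   # no instructions at all
                  return "", False
              if e.code in (401, 403):            # access restricted
                  return "", True
              raise                               # other codes: defer the visit
          except URLError:
              raise                               # temporary failure: defer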


3.2 File Format Description

   The instructions are encoded as a formatted plain-text object,
   described here. A complete BNF-like description of the syntax of
   this format is given in section 3.3.

   The format logically consists of a non-empty set of records,
   separated by blank lines. The records consist of a set of lines of
   the form:

      <Field> ":" <value>

   In this memo we refer to lines with a Field "foo" as "foo lines".

   The record starts with one or more User-agent lines, specifying
   which robots the record applies to, followed by "Disallow" and
   "Allow" instructions to those robots. For example:

      User-agent: webcrawler
      User-agent: infoseek
      Allow: /tmp/ok.html
      Disallow: /tmp
      Disallow: /user/foo

   These lines are discussed separately below.

   Lines with Fields not explicitly specified by this specification
   may occur in the /robots.txt, allowing for future extension of the
   format. Consult the BNF for restrictions on the syntax of such
   extensions. Note specifically that, for backwards compatibility
   with robots implementing earlier versions of this specification,
   breaking of lines is not allowed.


Koster              draft-koster-robots-00.txt                 [Page 4]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   Comments are allowed anywhere in the file, and consist of optional
   whitespace, followed by a comment character '#', followed by the
   comment, terminated by the end-of-line.
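
   As a non-normative illustration, a record parser for this format
   might look as follows in Python (the name parse_robots_txt and its
   list-of-tuples result are assumptions of this sketch):

      def parse_robots_txt(text):
          """Split /robots.txt text into (agents, rules) records."""
          records, agents, rules = [], [], []
          for line in text.splitlines():       # accepts CR, LF and CRLF
              line = line.split("#", 1)[0]     # strip comments
              if not line.strip():             # blank line: record ends
                  if agents:
                      records.append((agents, rules))
                      agents, rules = [], []
                  continue
              field, _, value = line.partition(":")
              field, value = field.strip().lower(), value.strip()
              if field == "user-agent":
                  if rules:                    # tolerate a missing blank line
                      records.append((agents, rules))
                      agents, rules = [], []
                  agents.append(value)
              elif field in ("allow", "disallow"):
                  rules.append((field, value))
              # other fields are extensions, ignored in this sketch
          if agents:
              records.append((agents, rules))
          return records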

3.2.1 The User-agent line

   Name tokens are used to allow robots to identify themselves via a
   simple product token. Name tokens should be short and to the
   point. The name token a robot chooses for itself should be sent as
   part of the HTTP User-agent header, and must be well documented.

   These name tokens are used in User-agent lines in /robots.txt to
   identify to which specific robots the record applies. The robot
   must obey the first record in /robots.txt that contains a
   User-agent line whose value contains the name token of the robot
   as a substring. The name comparisons are case-insensitive. If no
   such record exists, it should obey the first record with a
   User-agent line with a "*" value, if present. If no record
   satisfies either condition, or no records are present at all,
   access is unlimited.

   For example, a fictional company FigTree Search Services, which
   names its robot "Fig Tree" and sends HTTP requests like:

      GET / HTTP/1.0
      User-agent: FigTree/0.1 Robot libwww-perl/5.04

   might scan the "/robots.txt" file for records with:

      User-agent: figtree
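
   For illustration only, the record-selection rule above
   (case-insensitive substring match, with "*" as fallback) can be
   sketched as follows; select_record is an assumed name, and records
   is the output of the parse_robots_txt sketch above:

      def select_record(records, name_token):
          """Pick the first record whose User-agent matches the token."""
          token = name_token.lower()
          star = None
          for agents, rules in records:
              lowered = [a.lower() for a in agents]
              if any(token in a for a in lowered):
                  return rules              # first matching record wins
              if star is None and "*" in lowered:
                  star = rules              # remember the first default record
          return star                       # None means access is unlimited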

3.2.2 The Allow and Disallow lines

   These lines indicate whether accessing a URL that matches the
   corresponding path is allowed or disallowed. Note that these
   instructions apply to any HTTP method on a URL.

   To evaluate if access to a URL is allowed, a robot must attempt to
   match the paths in Allow and Disallow lines against the URL, in
   the order they occur in the record. The first match found is used.
   If no match is found, the default assumption is that the URL is
   allowed.

   The /robots.txt URL is always allowed, and must not appear in the
   Allow/Disallow rules.

   The matching process compares every octet in the path portion of
   the URL and the path from the record. If a %xx encoded octet is


Koster              draft-koster-robots-00.txt                 [Page 5]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   encountered it is decoded prior to comparison, unless it is the
   "/" character, which has special meaning in a path. The match
   evaluates positively if and only if the end of the path from the
   record is reached before a difference in octets is encountered.

   This table illustrates some examples:

      Record Path         URL path            Matches
      /tmp                /tmp                yes
      /tmp                /tmp.html           yes
      /tmp                /tmp/a.html         yes
      /tmp/               /tmp                no
      /tmp/               /tmp/               yes
      /tmp/               /tmp/a.html         yes

      /a%3cd.html         /a%3cd.html         yes
      /a%3Cd.html         /a%3cd.html         yes
      /a%3cd.html         /a%3Cd.html         yes
      /a%3Cd.html         /a%3Cd.html         yes

      /a%2fb.html         /a%2fb.html         yes
      /a%2fb.html         /a/b.html           no
      /a/b.html           /a%2fb.html         no
      /a/b.html           /a/b.html           yes

      /%7ejoe/index.html  /~joe/index.html    yes
      /~joe/index.html    /%7Ejoe/index.html  yes
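
   A minimal sketch of this octet-wise matching rule follows; the
   function names are assumptions of this sketch:

      def _decode(path):
          """Decode %xx escapes except %2F, which stays encoded."""
          out, i = [], 0
          while i < len(path):
              if path[i] == "%" and i + 2 < len(path):
                  try:
                      octet = int(path[i+1:i+3], 16)
                  except ValueError:
                      octet = None          # malformed escape: keep literal
                  if octet is not None:
                      # an encoded "/" keeps its special meaning, so it is
                      # only case-normalised, never decoded
                      out.append("%2f" if octet == 0x2F else chr(octet))
                      i += 3
                      continue
              out.append(path[i])
              i += 1
          return "".join(out)

      def path_matches(record_path, url_path):
          """True if the record path is exhausted before any difference."""
          return _decode(url_path).startswith(_decode(record_path))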

3.3 Formal Syntax

   This is a BNF-like description, using the conventions of RFC 822
   [5], except that "|" is used to designate alternatives. Briefly,
   literals are quoted with "", parentheses "(" and ")" are used to
   group elements, optional elements are enclosed in [brackets], and
   elements may be preceded with <n>* to designate n or more
   repetitions of the following element; n defaults to 0.

   robotstxt    = *blankcomment
                | *blankcomment record *( 1*commentblank 1*record )
                  *blankcomment
   blankcomment = 1*(blank | commentline)
   commentblank = *commentline blank *(blankcomment)
   blank        = *space CRLF
   CRLF         = CR LF
   record       = *commentline agentline *(commentline | agentline)
                  1*ruleline *(commentline | ruleline)


Koster              draft-koster-robots-00.txt                 [Page 6]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   agentline    = "User-agent:" *space agent [comment] CRLF
   ruleline     = (disallowline | allowline | extension)
   disallowline = "Disallow" ":" *space path [comment] CRLF
   allowline    = "Allow" ":" *space rpath [comment] CRLF
   extension    = token ":" *space value [comment] CRLF
   value        = <any CHAR except CR or LF or "#">

   commentline  = comment CRLF
   comment      = *space "#" anychar
   space        = 1*(SP | HT)
   rpath        = "/" path
   agent        = token
   anychar      = <any CHAR except CR or LF>
   CHAR         = <any US-ASCII character (octets 0 - 127)>
   CTL          = <any US-ASCII control character
                  (octets 0 - 31) and DEL (127)>
   CR           = <US-ASCII CR, carriage return (13)>
   LF           = <US-ASCII LF, linefeed (10)>
   SP           = <US-ASCII SP, space (32)>
   HT           = <US-ASCII HT, horizontal-tab (9)>

   The syntax for "token" is taken from RFC 1945 [2], reproduced here
   for convenience:

   token        = 1*<any CHAR except CTLs or tspecials>

   tspecials    = "(" | ")" | "<" | ">" | "@"
                | "," | ";" | ":" | "\" | <">
                | "/" | "[" | "]" | "?" | "="
                | "{" | "}" | SP | HT

   The syntax for "path" is defined in RFC 1808 [6], reproduced here
   for convenience:

   path         = fsegment *( "/" segment )
   fsegment     = 1*pchar
   segment      = *pchar

   pchar        = uchar | ":" | "@" | "&" | "="
   uchar        = unreserved | escape
   unreserved   = alpha | digit | safe | extra

   escape       = "%" hex hex
   hex          = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                  "a" | "b" | "c" | "d" | "e" | "f"

   alpha        = lowalpha | hialpha


Koster              draft-koster-robots-00.txt                 [Page 7]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   lowalpha     = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
                  "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
                  "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
   hialpha      = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                  "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                  "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"

   digit        = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                  "8" | "9"

   safe         = "$" | "-" | "_" | "." | "+"
   extra        = "!" | "*" | "'" | "(" | ")" | ","

3.4 Expiration

   Robots should cache /robots.txt files, but if they do they must
   periodically verify the cached copy is fresh before using its
   contents.

   Standard HTTP cache-control mechanisms can be used by both origin
   server and robots to influence the caching of the /robots.txt
   file. Specifically robots should take note of the Expires header
   set by the origin server.

   If no cache-control directives are present robots should default
   to an expiry of 7 days.
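
   As a non-normative illustration, a freshness check for a cached
   /robots.txt might look as follows; cache_is_fresh and the stored
   fetched_at timestamp are assumptions of this sketch:

      import time
      from email.utils import parsedate_to_datetime

      DEFAULT_TTL = 7 * 24 * 3600            # 7 days, per section 3.4

      def cache_is_fresh(fetched_at, expires_header=None):
          """True if the cached copy may still be used."""
          if expires_header:
              try:
                  expires = parsedate_to_datetime(expires_header)
                  return time.time() < expires.timestamp()
              except (TypeError, ValueError):
                  pass                       # unparseable Expires: fall back
          return time.time() < fetched_at + DEFAULT_TTL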


4. Examples

   This section contains an example of how a /robots.txt may be used.

   A fictional site may have the following URLs:

      http://www.fict.org/
      http://www.fict.org/index.html
      http://www.fict.org/robots.txt
      http://www.fict.org/server.html
      http://www.fict.org/services/fast.html
      http://www.fict.org/services/slow.html
      http://www.fict.org/orgo.gif
      http://www.fict.org/org/about.html
      http://www.fict.org/org/plans.html
      http://www.fict.org/%7Ejim/jim.html
      http://www.fict.org/%7Emak/mak.html

   The site may have, in its /robots.txt, specific rules for robots
   that send an HTTP User-agent of "UnhipBot/0.1", "WebCrawler/3.0",


Koster              draft-koster-robots-00.txt                 [Page 8]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

   or "Excite/1.0", and a set of default rules:

      # /robots.txt for http://www.fict.org/
      # comments to webmaster@fict.org

      User-agent: unhipbot
      Disallow: /

      User-agent: webcrawler
      User-agent: excite
      Disallow:

      User-agent: *
      Disallow: /org/plans.html
      Allow: /org/
      Allow: /serv
      Allow: /~mak
      Disallow: /

   The following matrix shows which robots are allowed to access URLs:

                                             unhipbot webcrawler other
                                                      & excite
      http://www.fict.org/                      No       Yes      No
      http://www.fict.org/index.html            No       Yes      No
      http://www.fict.org/robots.txt            Yes      Yes      Yes
      http://www.fict.org/server.html           No       Yes      Yes
      http://www.fict.org/services/fast.html    No       Yes      Yes
      http://www.fict.org/services/slow.html    No       Yes      Yes
      http://www.fict.org/orgo.gif              No       Yes      No
      http://www.fict.org/org/about.html        No       Yes      Yes
      http://www.fict.org/org/plans.html        No       Yes      No
      http://www.fict.org/%7Ejim/jim.html       No       Yes      No
      http://www.fict.org/%7Emak/mak.html       No       Yes      Yes
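
   As a non-normative illustration, the earlier sketches combine to
   reproduce this matrix; is_allowed is an assumed name, and
   robots_txt_text is assumed to hold the example file above:

      def is_allowed(records, name_token, url_path):
          """Evaluate access for a robot against parsed /robots.txt rules."""
          if url_path == "/robots.txt":
              return True                   # always allowed
          rules = select_record(records, name_token)
          if rules is None:
              return True                   # no applicable record
          for field, path in rules:
              if not path:
                  continue                  # an empty Disallow matches nothing
              if path_matches(path, url_path):
                  return field == "allow"   # first match wins
          return True                       # no match: allowed by default

      records = parse_robots_txt(robots_txt_text)
      assert not is_allowed(records, "unhipbot", "/index.html")
      assert is_allowed(records, "webcrawler", "/index.html")
      assert is_allowed(records, "other", "/%7Emak/mak.html")
      assert not is_allowed(records, "other", "/orgo.gif")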


5. Notes for Implementors

5.1 Backwards Compatibility

   Previous versions of this specification did not provide the Allow
   line. The introduction of the Allow line causes robots to behave
   slightly differently under either specification:

   If a /robots.txt contains an Allow which overrides a later
   occurring Disallow, a robot ignoring Allow lines will not retrieve
   those parts. This is considered acceptable because there is no
   requirement for a robot to access URLs it is allowed to retrieve,
   and it is safe, in that no URLs a Web site administrator wants to
   Disallow are allowed. It is expected this may in fact encourage
   robots to upgrade to compliance with the specification in this
   memo.


Koster              draft-koster-robots-00.txt                 [Page 9]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

5.2 Interoperability

   Implementors should pay particular attention to robustness in
   parsing of the /robots.txt file. Web site administrators who are
   not aware of the /robots.txt mechanism often notice repeated
   failing requests for it in their log files, and react by putting
   up pages asking "What are you looking for?".

   As the majority of /robots.txt files are created with platform-
   specific text editors, robots should be liberal in accepting files
   with different end-of-line conventions, specifically CR and LF in
   addition to CRLF.


6. Security Considerations

   There are a few risks in the method described here, which may
   affect either origin server or robot.

   Web site administrators must realise this method is voluntary, and
   is not sufficient to guarantee some robots will not visit
   restricted parts of the URL space. Failure to use proper
   authentication or other restrictions may result in exposure of
   restricted information. It is even possible that the occurrence of
   paths in the /robots.txt file may expose the existence of
   resources not otherwise linked to on the site, which may aid
   people in guessing URLs.

   Robots need to be aware that the amount of resources spent on
   dealing with the /robots.txt is a function of the file contents,
   which is not under the control of the robot. For example, the
   contents may be larger in size than the robot can deal with. To
   prevent denial-of-service attacks, robots are therefore encouraged
   to place limits on the resources spent on processing of
   /robots.txt.
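
   One possible form of such a limit, sketched here for illustration
   (the constants are arbitrary assumptions, not taken from this
   specification):

      MAX_BYTES = 64 * 1024
      MAX_LINES = 1000

      def read_robots_body(resp):
          """Read a response body with hard caps on size and line count."""
          body = resp.read(MAX_BYTES + 1)
          if len(body) > MAX_BYTES:
              body = body[:MAX_BYTES]       # truncate oversized files
          text = body.decode("ascii", "replace")
          return "\n".join(text.splitlines()[:MAX_LINES])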

   The /robots.txt directives are retrieved and applied in separate,
   possibly unauthenticated, HTTP transactions, and it is possible
   that one server can impersonate another or otherwise intercept a
   /robots.txt, and provide a robot with false information. This
   specification does not preclude authentication and encryption from
   being employed to increase security.

7. Acknowledgements

   The author would like to thank the subscribers to the robots
   mailing list for their contributions to this specification.


Koster              draft-koster-robots-00.txt                [Page 10]

INTERNET DRAFT      A Method for Robots Control        December 4, 1996

8. References

   [1] Koster, M., "A Standard for Robot Exclusion",
       http://info.webcrawler.com/mak/projects/robots/norobots.html,
       June 1994.

   [2] Berners-Lee, T., Fielding, R., and Frystyk, H., "Hypertext
       Transfer Protocol -- HTTP/1.0", RFC 1945, MIT/LCS, May 1996.

   [3] Postel, J., "Media Type Registration Procedure", RFC 1590,
       USC/ISI, March 1994.

   [4] Berners-Lee, T., Masinter, L., and McCahill, M., "Uniform
       Resource Locators (URL)", RFC 1738, CERN, Xerox PARC,
       University of Minnesota, December 1994.

   [5] Crocker, D., "Standard for the Format of ARPA Internet Text
       Messages", STD 11, RFC 822, UDEL, August 1982.

   [6] Fielding, R., "Relative Uniform Resource Locators", RFC 1808,
       UC Irvine, June 1995.

9. Author's Address

   Martijn Koster
   WebCrawler
   America Online
   690 Fifth Street
   San Francisco
   CA 94107

   Phone: 415-3565431
   EMail: m.koster@webcrawler.com

   Expires June 4, 1997


Koster              draft-koster-robots-00.txt                [Page 11]
</pre>
<hr>
<div align=right>
<address>
<small>
<A href="http://info.webcrawler.com/mak/projects/robots/robots.html">The
Web Robots Pages</A>
</small>
</address>
</div>
</body>
</html>