<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
< html xmlns = "http://www.w3.org/1999/xhtml" xml:lang = "en" lang = "en" >
< head >
< meta http-equiv = "Content-Type" content = "text/html; charset=utf-8" / >
< meta name = "generator" content = "Docutils 0.3.3: http://docutils.sourceforge.net/" / >
< title > Documentation< / title >
< meta content = "2" name = "navigation.order" / >
< meta content = "Documentation" name = "navigation.name" / >
< link rel = "stylesheet" href = "lc.css" type = "text/css" / >
< link rel = "shortcut icon" href = "favicon.png" / >
< link rel = "stylesheet" href = "navigation.css" type = "text/css" / >
< script type = "text/javascript" >
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
< / script >
< / head >
< body >
<!-- bfknav -->
< div class = "navigation" >
< div class = "navrow" style = "padding: 0em 0em 0em 1em;" >
< a href = "./index.html" > LinkChecker< / a >
< a href = "./install.html" > Installation< / a >
< span > Documentation< / span >
< a href = "./faq.html" > FAQ< / a >
< a href = "./other.html" > Other< / a >
< / div >
< / div >
<!-- /bfknav -->
< h1 class = "title" > Documentation< / h1 >
< div class = "document" id = "documentation" >
< div class = "section" id = "basic-usage" >
< h1 > < a name = "basic-usage" > Basic usage< / a > < / h1 >
< p > To check a URL like < tt class = "literal" > < span class = "pre" > http://www.myhomepage.org/< / span > < / tt > it is enough to
execute < tt class = "literal" > < span class = "pre" > linkchecker< / span > < span class = "pre" > http://www.myhomepage.org/< / span > < / tt > . This will check the
complete domain of www.myhomepage.org recursively. All links pointing
outside of the domain are also checked for validity.< / p >
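< p > The distinction between links inside and outside the start domain can be sketched as follows. This is only an illustration: the function names < tt class = "literal" > < span class = "pre" > is_internal< / span > < / tt > and < tt class = "literal" > < span class = "pre" > plan< / span > < / tt > are hypothetical, and comparing host names is an assumption about how "the domain" is decided, not a description of LinkChecker's internals.< / p >

```python
from urllib.parse import urlsplit


def is_internal(url, start_url):
    """Return True if url is on the same host as start_url (assumed rule)."""
    return urlsplit(url).hostname == urlsplit(start_url).hostname


def plan(url, start_url):
    """Every link is checked; only links inside the domain are recursed into."""
    return {"check": True, "recurse": is_internal(url, start_url)}
```

< p > With < tt class = "literal" > < span class = "pre" > http://www.myhomepage.org/< / span > < / tt > as the start URL, a link to another host would still be checked for validity, but the sketch would not recurse into it.< / p >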
< p > For more options, read the man page < tt class = "literal" > < span class = "pre" > linkchecker(1)< / span > < / tt > or execute
< tt class = "literal" > < span class = "pre" > linkchecker< / span > < span class = "pre" > -h< / span > < / tt > .< / p >
< / div >
< div class = "section" id = "performed-checks" >
< h1 > < a name = "performed-checks" > Performed checks< / a > < / h1 >
< p > All URLs have to pass a preliminary syntax test. Minor quoting
mistakes produce a warning; all other syntax problems
are errors.
Once the syntax check has passed, the URL is queued for connection
checking. The connection checks for each link type are described below.< / p >
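< p > A minimal sketch of such a two-level syntax test is shown here. The name < tt class = "literal" > < span class = "pre" > check_syntax< / span > < / tt > , the set of safe characters and the exact rules are assumptions made for illustration; the real test may accept or reject different URLs.< / p >

```python
from urllib.parse import quote, urlsplit

# Characters left untouched when re-quoting; an assumption for this sketch.
SAFE = "%/:=&?~#+!$,;'@()*[]"


def check_syntax(url):
    """Return (level, message); level is 'ok', 'warning' or 'error'."""
    try:
        parts = urlsplit(url)
    except ValueError as exc:
        # urlsplit rejects some malformed URLs outright, e.g. bad IPv6 hosts.
        return ("error", str(exc))
    if not parts.scheme:
        return ("error", "URL has no scheme")
    quoted = quote(url, safe=SAFE)
    if quoted != url:
        # A minor quoting mistake, such as an unescaped space.
        return ("warning", "URL should be quoted as " + quoted)
    return ("ok", "")
```

< p > In this sketch only URLs with the level < tt class = "literal" > < span class = "pre" > ok< / span > < / tt > or < tt class = "literal" > < span class = "pre" > warning< / span > < / tt > would go on to the connection check.< / p >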
< ul >
< li > < p class = "first" > HTTP links (< tt class = "literal" > < span class = "pre" > http:< / span > < / tt > , < tt class = "literal" > < span class = "pre" > https:< / span > < / tt > )< / p >
< / li >
< li > < p class = "first" > Local files (< tt class = "literal" > < span class = "pre" > file:< / span > < / tt > )< / p >
< p > A regular, readable file that can be opened is valid. A readable
directory is also valid. Anything else, for example a device file or an
unreadable or nonexistent file, is an error.< / p >
< p > The contents of a valid file are then examined to decide whether to recurse into it.< / p >
< / li >
< li > < p class = "first" > Mail links (< tt class = "literal" > < span class = "pre" > mailto:< / span > < / tt > )< / p >
< p > A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list will fail.
For each mail address we check the following things:< / p >
< ol class = "arabic simple" >
< li > Look up the MX DNS records. If no MX record is found,
print an error.< / li >
< li > Check whether one of the mail hosts accepts an SMTP connection.
Hosts with higher priority are checked first.
If no host accepts SMTP, print a warning.< / li >
< li > Try to verify the address with the VRFY command. If an answer
is received, print the verified address as an informational message.< / li >
< / ol >
< / li >
< li > < p class = "first" > FTP links (< tt class = "literal" > < span class = "pre" > ftp:< / span > < / tt > )< / p >
< p > For FTP links, the following steps are performed:< / p >
< ol class = "arabic simple" >
< li > connect to the specified host< / li >
< li > try to log in with the given user and password. The default
user is < tt class = "literal" > < span class = "pre" > anonymous< / span > < / tt > , the default password is < tt class = "literal" > < span class = "pre" > anonymous@ < / span > < / tt > .< / li >
< li > try to change to the given directory< / li >
< li > list the file with the NLST command< / li >
< / ol >
< / li >
< li > < p class = "first" > Gopher links (< tt class = "literal" > < span class = "pre" > gopher:< / span > < / tt > )< / p >
< p > Try to send the given selector (or query) to the gopher server.< / p >
< / li >
< li > < p class = "first" > Telnet links (< tt class = "literal" > < span class = "pre" > telnet:< / span > < / tt > )< / p >
< p > We try to connect and, if a user and password are given, log in to the
given telnet server.< / p >
< / li >
< li > < p class = "first" > NNTP links (< tt class = "literal" > < span class = "pre" > news:< / span > < / tt > , < tt class = "literal" > < span class = "pre" > snews:< / span > < / tt > , < tt class = "literal" > < span class = "pre" > nntp:< / span > < / tt > )< / p >
< / li >
< li > < p class = "first" > Ignored links (< tt class = "literal" > < span class = "pre" > javascript:< / span > < / tt > , etc.)< / p >
< p > An ignored link only produces a warning. No further checking
is done.< / p >
< p > Here is the complete list of recognized but ignored link types. The most
prominent of them are JavaScript links.< / p >
< ul class = "simple" >
< li > < tt class = "literal" > < span class = "pre" > acap:< / span > < / tt > (application configuration access protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > afs:< / span > < / tt > (Andrew File System global file names)< / li >
< li > < tt class = "literal" > < span class = "pre" > chrome:< / span > < / tt > (Mozilla specific)< / li >
< li > < tt class = "literal" > < span class = "pre" > cid:< / span > < / tt > (content identifier)< / li >
< li > < tt class = "literal" > < span class = "pre" > clsid:< / span > < / tt > (Microsoft specific)< / li >
< li > < tt class = "literal" > < span class = "pre" > data:< / span > < / tt > (data)< / li >
< li > < tt class = "literal" > < span class = "pre" > dav:< / span > < / tt > (dav)< / li >
< li > < tt class = "literal" > < span class = "pre" > fax:< / span > < / tt > (fax)< / li >
< li > < tt class = "literal" > < span class = "pre" > find:< / span > < / tt > (Mozilla specific)< / li >
< li > < tt class = "literal" > < span class = "pre" > imap:< / span > < / tt > (internet message access protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > isbn:< / span > < / tt > (ISBN (int. book numbers))< / li >
< li > < tt class = "literal" > < span class = "pre" > javascript:< / span > < / tt > (JavaScript)< / li >
< li > < tt class = "literal" > < span class = "pre" > ldap:< / span > < / tt > (Lightweight Directory Access Protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > mailserver:< / span > < / tt > (Access to data available from mail servers)< / li >
< li > < tt class = "literal" > < span class = "pre" > mid:< / span > < / tt > (message identifier)< / li >
< li > < tt class = "literal" > < span class = "pre" > mms:< / span > < / tt > (multimedia stream)< / li >
< li > < tt class = "literal" > < span class = "pre" > modem:< / span > < / tt > (modem)< / li >
< li > < tt class = "literal" > < span class = "pre" > nfs:< / span > < / tt > (network file system protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > opaquelocktoken:< / span > < / tt > (opaquelocktoken)< / li >
< li > < tt class = "literal" > < span class = "pre" > pop:< / span > < / tt > (Post Office Protocol v3)< / li >
< li > < tt class = "literal" > < span class = "pre" > prospero:< / span > < / tt > (Prospero Directory Service)< / li >
< li > < tt class = "literal" > < span class = "pre" > rsync:< / span > < / tt > (rsync protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > rtsp:< / span > < / tt > (real time streaming protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > service:< / span > < / tt > (service location)< / li >
< li > < tt class = "literal" > < span class = "pre" > shttp:< / span > < / tt > (secure HTTP)< / li >
< li > < tt class = "literal" > < span class = "pre" > sip:< / span > < / tt > (session initiation protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > tel:< / span > < / tt > (telephone)< / li >
< li > < tt class = "literal" > < span class = "pre" > tip:< / span > < / tt > (Transaction Internet Protocol)< / li >
< li > < tt class = "literal" > < span class = "pre" > tn3270:< / span > < / tt > (Interactive 3270 emulation sessions)< / li >
< li > < tt class = "literal" > < span class = "pre" > vemmi:< / span > < / tt > (versatile multimedia interface)< / li >
< li > < tt class = "literal" > < span class = "pre" > wais:< / span > < / tt > (Wide Area Information Servers)< / li >
< li > < tt class = "literal" > < span class = "pre" > z39.50r:< / span > < / tt > (Z39.50 Retrieval)< / li >
< li > < tt class = "literal" > < span class = "pre" > z39.50s:< / span > < / tt > (Z39.50 Session)< / li >
< / ul >
< / li >
< / ul >
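< p > Two of the connection checks above, for local files and for FTP links, can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the actual implementation: the function names < tt class = "literal" > < span class = "pre" > check_file< / span > < / tt > and < tt class = "literal" > < span class = "pre" > check_ftp< / span > < / tt > , their return strings and the < tt class = "literal" > < span class = "pre" > ftp_factory< / span > < / tt > parameter are all hypothetical.< / p >

```python
import os
import stat
from ftplib import FTP, all_errors


def check_file(path):
    """Regular readable files and readable directories are valid."""
    if not os.path.exists(path):
        return "error: file does not exist"
    mode = os.stat(path).st_mode
    if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode)):
        # Device files, sockets and similar are rejected.
        return "error: not a regular file or directory"
    if not os.access(path, os.R_OK):
        return "error: not readable"
    return "valid"


def check_ftp(host, dirname, filename, user="anonymous",
              password="anonymous@", ftp_factory=FTP):
    """Run the four FTP steps; ftp_factory is a hypothetical test hook."""
    ftp = ftp_factory()
    try:
        ftp.connect(host)              # 1. connect to the specified host
        ftp.login(user, password)      # 2. log in (anonymous by default)
        if dirname:
            ftp.cwd(dirname)           # 3. change to the given directory
        names = ftp.nlst(filename)     # 4. list the file with NLST
    except all_errors as exc:
        return "error: %s" % exc
    finally:
        ftp.close()
    return "valid" if names else "error: file not found"
```

< p > The < tt class = "literal" > < span class = "pre" > ftp_factory< / span > < / tt > argument exists only so the steps can be exercised without a network connection, by passing in a stand-in class.< / p >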
< / div >
< div class = "section" id = "recursion" >
< h1 > < a name = "recursion" > Recursion< / a > < / h1 >
< p > Recursion occurs on HTML files, Opera bookmark files and directories.
Note that directory recursion reads all files in the
directory, not just a subset like < tt class = "literal" > < span class = "pre" > index.htm*< / span > < / tt > .< / p >
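< p > The directory case can be illustrated with a short sketch. The function name < tt class = "literal" > < span class = "pre" > directory_links< / span > < / tt > is hypothetical, and returning < tt class = "literal" > < span class = "pre" > file:< / span > < / tt > URLs is an assumption about how the entries would be queued.< / p >

```python
from pathlib import Path


def directory_links(dirname):
    """Return a file: URL for every entry in dirname, not only index.htm*."""
    base = Path(dirname).resolve()  # as_uri() needs an absolute path
    return sorted(entry.as_uri() for entry in base.iterdir())
```

< p > Each returned URL would then be checked like any other < tt class = "literal" > < span class = "pre" > file:< / span > < / tt > link.< / p >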
< / div >
< / div >
< hr class = "footer" / >
< div class = "footer" >
Generated on: 2004-08-27 20:46 UTC.
< / div >
< / body >
< / html >