git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2403 e7d03fd6-7b0d-0410-9947-9c21f3af8025
calvin 2005-03-11 16:27:38 +00:00
parent 43958a6efc
commit c1fb07f6b7
21 changed files with 1905 additions and 0 deletions

303
doc/en/documentation.html Normal file

@ -0,0 +1,303 @@
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.7: http://docutils.sourceforge.net/" />
<title>Documentation</title>
<meta content="3" name="navigation.order" />
<meta content="Documentation" name="navigation.name" />
<meta content="4" name="navigation.order" />
<meta content="FAQ" name="navigation.name" />
<link rel="stylesheet" href="lc.css" type="text/css" />
<link rel="shortcut icon" href="favicon.png" />
<link rel="stylesheet" href="navigation.css" type="text/css" />
<script type="text/javascript">
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
</script>
</head>
<body>
<!-- bfknav -->
<div class="navigation">
<div class="navrow" style="padding: 0em 0em 0em 1em;">
<a href="./index.html">LinkChecker</a>
<a href="./install.html">Installation</a>
<a href="./upgrading.html">Upgrading</a>
<span>Documentation</span>
<a href="./other.html">Other</a>
</div>
</div>
<!-- /bfknav -->
<div class="document" id="documentation">
<h1 class="title">Documentation</h1>
<div class="contents topic" id="contents">
<p class="topic-title first"><a name="contents">Contents</a></p>
<ul class="simple">
<li><a class="reference" href="#basic-usage" id="id2" name="id2">Basic usage</a></li>
<li><a class="reference" href="#performed-checks" id="id3" name="id3">Performed checks</a></li>
<li><a class="reference" href="#recursion" id="id4" name="id4">Recursion</a></li>
<li><a class="reference" href="#frequently-asked-questions" id="id5" name="id5">Frequently asked questions</a></li>
</ul>
</div>
<div class="section" id="basic-usage">
<h1><a class="toc-backref" href="#id2" name="basic-usage">Basic usage</a></h1>
<p>To check a URL like <tt class="docutils literal"><span class="pre">http://www.myhomepage.org/</span></tt> it is enough to
execute <tt class="docutils literal"><span class="pre">linkchecker</span> <span class="pre">http://www.myhomepage.org/</span></tt>. This will check the
complete domain of www.myhomepage.org recursively. All links pointing
outside of the domain are also checked for validity.</p>
<p>For more options, read the man page <tt class="docutils literal"><span class="pre">linkchecker(1)</span></tt> or execute
<tt class="docutils literal"><span class="pre">linkchecker</span> <span class="pre">-h</span></tt>.</p>
</div>
<div class="section" id="performed-checks">
<h1><a class="toc-backref" href="#id3" name="performed-checks">Performed checks</a></h1>
<p>All URLs have to pass a preliminary syntax test. Minor quoting
mistakes will issue a warning, all other invalid syntax issues
are errors.
After the syntax check passes, the URL is queued for connection
checking. All connection check types are described below.</p>
<ul>
<li><p class="first">HTTP links (<tt class="docutils literal"><span class="pre">http:</span></tt>, <tt class="docutils literal"><span class="pre">https:</span></tt>)</p>
<p>After connecting to the given HTTP server the given path
or query is requested. All redirections are followed, and
if user/password is given it will be used as authorization
when necessary.
Permanently moved pages issue a warning.
All final HTTP status codes other than 2xx are errors.</p>
</li>
<li><p class="first">Local files (<tt class="docutils literal"><span class="pre">file:</span></tt>)</p>
<p>A regular, readable file that can be opened is valid. A readable
directory is also valid. All other files, for example device files,
unreadable or non-existing files are errors.</p>
<p>File contents are checked for recursion.</p>
</li>
<li><p class="first">Mail links (<tt class="docutils literal"><span class="pre">mailto:</span></tt>)</p>
<p>A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list will fail.
For each mail address we check the following things:</p>
<ol class="arabic simple">
<li>Look up the MX DNS records. If no MX record is found,
print an error.</li>
<li>Check if one of the mail hosts accepts an SMTP connection.
Check hosts with higher priority first.
If no host accepts SMTP, we print a warning.</li>
<li>Try to verify the address with the VRFY command. If we get
an answer, print the verified address as an info.</li>
</ol>
</li>
<li><p class="first">FTP links (<tt class="docutils literal"><span class="pre">ftp:</span></tt>)</p>
<p>For FTP links we do:</p>
<ol class="arabic simple">
<li>connect to the specified host</li>
<li>try to login with the given user and password. The default
user is <tt class="docutils literal"><span class="pre">anonymous</span></tt>, the default password is <tt class="docutils literal"><span class="pre">anonymous&#64;</span></tt>.</li>
<li>try to change to the given directory</li>
<li>list the file with the NLST command</li>
</ol>
</li>
<li><p class="first">Gopher links (<tt class="docutils literal"><span class="pre">gopher:</span></tt>)</p>
<p>We try to send the given selector (or query) to the gopher server.</p>
</li>
<li><p class="first">Telnet links (<tt class="docutils literal"><span class="pre">telnet:</span></tt>)</p>
<p>We try to connect and if user/password are given, login to the
given telnet server.</p>
</li>
<li><p class="first">NNTP links (<tt class="docutils literal"><span class="pre">news:</span></tt>, <tt class="docutils literal"><span class="pre">snews:</span></tt>, <tt class="docutils literal"><span class="pre">nntp:</span></tt>)</p>
<p>We try to connect to the given NNTP server. If a news group or
article is specified, try to request it from the server.</p>
</li>
<li><p class="first">Ignored links (<tt class="docutils literal"><span class="pre">javascript:</span></tt>, etc.)</p>
<p>An ignored link will only print a warning. No further checking
will be made.</p>
<p>Here is a complete list of recognized, but ignored links. The most
prominent of them should be JavaScript links.</p>
<ul class="simple">
<li><tt class="docutils literal"><span class="pre">acap:</span></tt> (application configuration access protocol)</li>
<li><tt class="docutils literal"><span class="pre">afs:</span></tt> (Andrew File System global file names)</li>
<li><tt class="docutils literal"><span class="pre">chrome:</span></tt> (Mozilla specific)</li>
<li><tt class="docutils literal"><span class="pre">cid:</span></tt> (content identifier)</li>
<li><tt class="docutils literal"><span class="pre">clsid:</span></tt> (Microsoft specific)</li>
<li><tt class="docutils literal"><span class="pre">data:</span></tt> (data)</li>
<li><tt class="docutils literal"><span class="pre">dav:</span></tt> (dav)</li>
<li><tt class="docutils literal"><span class="pre">fax:</span></tt> (fax)</li>
<li><tt class="docutils literal"><span class="pre">find:</span></tt> (Mozilla specific)</li>
<li><tt class="docutils literal"><span class="pre">imap:</span></tt> (internet message access protocol)</li>
<li><tt class="docutils literal"><span class="pre">isbn:</span></tt> (ISBN (int. book numbers))</li>
<li><tt class="docutils literal"><span class="pre">javascript:</span></tt> (JavaScript)</li>
<li><tt class="docutils literal"><span class="pre">ldap:</span></tt> (Lightweight Directory Access Protocol)</li>
<li><tt class="docutils literal"><span class="pre">mailserver:</span></tt> (Access to data available from mail servers)</li>
<li><tt class="docutils literal"><span class="pre">mid:</span></tt> (message identifier)</li>
<li><tt class="docutils literal"><span class="pre">mms:</span></tt> (multimedia stream)</li>
<li><tt class="docutils literal"><span class="pre">modem:</span></tt> (modem)</li>
<li><tt class="docutils literal"><span class="pre">nfs:</span></tt> (network file system protocol)</li>
<li><tt class="docutils literal"><span class="pre">opaquelocktoken:</span></tt> (opaquelocktoken)</li>
<li><tt class="docutils literal"><span class="pre">pop:</span></tt> (Post Office Protocol v3)</li>
<li><tt class="docutils literal"><span class="pre">prospero:</span></tt> (Prospero Directory Service)</li>
<li><tt class="docutils literal"><span class="pre">rsync:</span></tt> (rsync protocol)</li>
<li><tt class="docutils literal"><span class="pre">rtsp:</span></tt> (real time streaming protocol)</li>
<li><tt class="docutils literal"><span class="pre">service:</span></tt> (service location)</li>
<li><tt class="docutils literal"><span class="pre">shttp:</span></tt> (secure HTTP)</li>
<li><tt class="docutils literal"><span class="pre">sip:</span></tt> (session initiation protocol)</li>
<li><tt class="docutils literal"><span class="pre">tel:</span></tt> (telephone)</li>
<li><tt class="docutils literal"><span class="pre">tip:</span></tt> (Transaction Internet Protocol)</li>
<li><tt class="docutils literal"><span class="pre">tn3270:</span></tt> (Interactive 3270 emulation sessions)</li>
<li><tt class="docutils literal"><span class="pre">vemmi:</span></tt> (versatile multimedia interface)</li>
<li><tt class="docutils literal"><span class="pre">wais:</span></tt> (Wide Area Information Servers)</li>
<li><tt class="docutils literal"><span class="pre">z39.50r:</span></tt> (Z39.50 Retrieval)</li>
<li><tt class="docutils literal"><span class="pre">z39.50s:</span></tt> (Z39.50 Session)</li>
</ul>
</li>
</ul>
</div>
<div class="section" id="recursion">
<h1><a class="toc-backref" href="#id4" name="recursion">Recursion</a></h1>
<p>Recursion occurs on HTML files, Opera bookmark files and directories.
Note that the directory recursion reads all files in that
directory, not just a subset like <tt class="docutils literal"><span class="pre">index.htm*</span></tt>.</p>
</div>
<div class="section" id="frequently-asked-questions">
<h1><a class="toc-backref" href="#id5" name="frequently-asked-questions">Frequently asked questions</a></h1>
<p><strong>Q: LinkChecker produced an error, but my web page is ok with
Netscape/IE/Opera/...
Is this a bug in LinkChecker?</strong></p>
<p>A: Please check your web pages first. Are they really ok? Use
a <a class="reference" href="http://fte.sourceforge.net/">syntax highlighting editor</a>. Use <a class="reference" href="http://tidy.sourceforge.net/">HTML Tidy</a>.
Check if you are using a proxy which produces the error.</p>
<p><strong>Q: I still get an error, but the page is definitely ok.</strong></p>
<p>A: Some servers deny access of automated tools (also called robots)
like LinkChecker. This is not a bug in LinkChecker but rather a
policy by the webmaster running the website you are checking.
It might even be possible for a website to send robots different
web pages than normal browsers.</p>
<p><strong>Q: How can I tell LinkChecker which proxy to use?</strong></p>
<p>A: LinkChecker works transparently with proxies. In a Unix or Windows
environment, set the http_proxy, https_proxy, ftp_proxy or gopher_proxy
environment variables to a URL that identifies the proxy server before
starting LinkChecker. For example</p>
<pre class="literal-block">
$ http_proxy=&quot;http://www.someproxy.com:3128&quot;
$ export http_proxy
</pre>
<p>In a Macintosh environment, LinkChecker will retrieve proxy information
from Internet Config.</p>
<p><strong>Q: The link &quot;mailto:john&#64;company.com?subject=Hello John&quot; is reported
as an error.</strong></p>
<p>A: You have to quote special characters (e.g. spaces) in the subject field.
The correct link is &quot;mailto:...?subject=Hello%20John&quot;.
Unfortunately, browsers like IE and Netscape do not enforce this.</p>
<p><strong>Q: Does LinkChecker support JavaScript?</strong></p>
<p>A: No, and it never will. If your page does not work without JS then your
web design is broken.
Use PHP or Zope or ASP for dynamic content, and use JavaScript just as
an add-on for your web pages.</p>
<p><strong>Q: I don't get this --extern/--intern stuff.</strong></p>
<p>A: When it comes to checking there are three types of URLs. Note
that local files are also represented as URLs (i.e. <a class="reference" href="file://">file://</a>), so
local files can be external URLs.</p>
<ol class="arabic simple">
<li>strict external URLs:
We only check the syntax. Internal URLs are never strict.</li>
<li>external URLs:
Like 1), but we additionally check if they are valid by connect()ing
to them.</li>
<li>internal URLs:
Like 2), but we additionally check if they are HTML pages and if so,
we descend recursively into this link and check all the links in the
HTML content.
The --recursion-level option limits the depth of such recursive
descents.</li>
</ol>
<p>LinkChecker provides four options that determine which of these three
categories a URL falls into: --intern, --extern, --extern-strict-all and
--denyallow.
By default all URLs are internal. With --extern you specify which URLs
are external. With --intern you specify which URLs are internal.
Now imagine you have both --extern and --intern. What happens
when a URL matches both patterns? Or when it matches none? In this
situation the --denyallow option specifies the order in which we match
the URL. By default the order is internal/external; with --denyallow it is
external/internal. Either way, the first match counts, and if none matches,
the URL gets the category that was checked last.
Finally, with --extern-strict-all all external URLs are strict.</p>
<p>Oh, and just to boggle your mind: you can have more than one external
regular expression in a config file and for each of those expressions
you can specify if those matched external URLs should be strict or not.</p>
<p>An example: we don't want to check mailto URLs. Then the option is
-i'!^mailto:'. The '!' negates an expression. With --extern-strict-all,
we don't even connect to any mail hosts.</p>
<p>Another example: we check our site www.mycompany.com, don't want to recurse
into external links pointing outside our site, and want to ignore links
to hollowood.com and hullabulla.com completely.
This can only be done with a configuration entry like</p>
<pre class="literal-block">
[filtering]
extern1=hollowood.com 1
extern2=hullabulla.com 1
# the 1 means strict external ie don't even connect
</pre>
<p>and the command
<tt class="docutils literal"><span class="pre">linkchecker</span> <span class="pre">--intern=www.mycompany.com</span> <span class="pre">www.mycompany.com</span></tt></p>
<p><strong>Q: Is LinkChecker's cookie feature insecure?</strong></p>
<p>A: Cookies cannot store more information than is in the HTTP request itself,
so you are not giving away any more system information.
After storing, however, the cookies are sent out to the server on request.
Not to every server, but only to the one the cookie originated from!
This could be used to &quot;track&quot; subsequent requests to this server,
and this is what annoys some people (including me).
Cookies are only stored in memory. After LinkChecker finishes, they
are lost. So the tracking is restricted to the checking time.
The cookie feature is disabled by default.</p>
<p><strong>Q: I want to have my own logging class. How can I use it in LinkChecker?</strong></p>
<p>A: Currently, only a Python API lets you define new logging classes.
Define your own logging class as a subclass of StandardLogger or any other
logging class in the log module.
Then call the logger_add function of linkcheck.configuration.Configuration
to register your new logger.
After this, append a new logger instance to the fileoutput list.</p>
<pre class="literal-block">
import linkcheck, MyLogger
log_format = 'mylog'
log_args = {'fileoutput': log_format, 'filename': 'foo.txt'}
cfg = linkcheck.configuration.Configuration()
cfg.logger_add(log_format, MyLogger.MyLogger)
cfg['fileoutput'].append(cfg.logger_new(log_format, log_args))
</pre>
<p><strong>Q: LinkChecker does not ignore anchor references on caching.</strong></p>
<p><strong>Q: Some links with anchors are getting checked twice.</strong></p>
<p>A: This is not a bug.
It is commonly assumed that if a URL <tt class="docutils literal"><span class="pre">ABC#anchor1</span></tt> works then
<tt class="docutils literal"><span class="pre">ABC#anchor2</span></tt> works too. That is not specified anywhere and I have seen
server-side scripts that fail on some anchors and not on others.
This is the reason for always checking URLs with different anchors.
If you really want to disable this, use the <tt class="docutils literal"><span class="pre">--no-anchor-caching</span></tt>
option.</p>
<p><strong>Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?</strong></p>
<p>A: LinkChecker follows the robots.txt exclusion standard. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the <a class="reference" href="http://www.robotstxt.org/wc/robots.html">Web Robot pages</a> and the <a class="reference" href="http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt">Spidering report</a> for more info.</p>
<p><strong>Q: Ctrl-C does not stop LinkChecker immediately. Why is that so?</strong></p>
<p>A: The Python interpreter has to wait for all threads to finish, and
this means waiting for all open sockets to close. The default timeout
for sockets is 30 seconds, hence the delay.
You can change the default socket timeout with the --timeout option.</p>
<p><strong>Q: How do I print unreachable/dead documents of my website with
LinkChecker?</strong></p>
<p>A: No can do. This would require file system access to your web
repository and access to your web server configuration.</p>
<p>You can instead store the linkchecker results in a database
and look for missing files.</p>
<p><strong>Q: How do I check HTML/XML syntax with LinkChecker?</strong></p>
<p>A: No can do. Use the <a class="reference" href="http://tidy.sourceforge.net/">HTML Tidy</a> program.</p>
</div>
</div>
<hr class="docutils footer" />
<div class="footer">
Generated on: 2005-01-11 11:18 UTC.
</div>
</body>
</html>

5
doc/en/documentation.nav Normal file

@ -0,0 +1,5 @@
# generated by htmlnav.py, do not edit
name = u'Documentation'
level = 0
visible = True
order = 3

336
doc/en/documentation.txt Normal file

@ -0,0 +1,336 @@
.. meta::
:navigation.order: 3
:navigation.name: Documentation
Documentation
=============
.. contents::
Basic usage
-----------
To check a URL like ``http://www.myhomepage.org/`` it is enough to
execute ``linkchecker http://www.myhomepage.org/``. This will check the
complete domain of www.myhomepage.org recursively. All links pointing
outside of the domain are also checked for validity.
For more options, read the man page ``linkchecker(1)`` or execute
``linkchecker -h``.
Performed checks
----------------
All URLs have to pass a preliminary syntax test. Minor quoting
mistakes will issue a warning, all other invalid syntax issues
are errors.
After the syntax check passes, the URL is queued for connection
checking. All connection check types are described below.
- HTTP links (``http:``, ``https:``)
After connecting to the given HTTP server the given path
or query is requested. All redirections are followed, and
if user/password is given it will be used as authorization
when necessary.
Permanently moved pages issue a warning.
All final HTTP status codes other than 2xx are errors.
- Local files (``file:``)
A regular, readable file that can be opened is valid. A readable
directory is also valid. All other files, for example device files,
unreadable or non-existing files are errors.
File contents are checked for recursion.
- Mail links (``mailto:``)
A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list will fail.
For each mail address we check the following things:
1) Look up the MX DNS records. If no MX record is found,
print an error.
2) Check if one of the mail hosts accepts an SMTP connection.
Check hosts with higher priority first.
If no host accepts SMTP, we print a warning.
3) Try to verify the address with the VRFY command. If we get
an answer, print the verified address as an info.
- FTP links (``ftp:``)
For FTP links we do:
1) connect to the specified host
2) try to login with the given user and password. The default
user is ``anonymous``, the default password is ``anonymous@``.
3) try to change to the given directory
4) list the file with the NLST command
- Gopher links (``gopher:``)
We try to send the given selector (or query) to the gopher server.
- Telnet links (``telnet:``)
We try to connect and if user/password are given, login to the
given telnet server.
- NNTP links (``news:``, ``snews:``, ``nntp:``)
We try to connect to the given NNTP server. If a news group or
article is specified, try to request it from the server.
- Ignored links (``javascript:``, etc.)
An ignored link will only print a warning. No further checking
will be made.
Here is a complete list of recognized, but ignored links. The most
prominent of them should be JavaScript links.
- ``acap:`` (application configuration access protocol)
- ``afs:`` (Andrew File System global file names)
- ``chrome:`` (Mozilla specific)
- ``cid:`` (content identifier)
- ``clsid:`` (Microsoft specific)
- ``data:`` (data)
- ``dav:`` (dav)
- ``fax:`` (fax)
- ``find:`` (Mozilla specific)
- ``imap:`` (internet message access protocol)
- ``isbn:`` (ISBN (int. book numbers))
- ``javascript:`` (JavaScript)
- ``ldap:`` (Lightweight Directory Access Protocol)
- ``mailserver:`` (Access to data available from mail servers)
- ``mid:`` (message identifier)
- ``mms:`` (multimedia stream)
- ``modem:`` (modem)
- ``nfs:`` (network file system protocol)
- ``opaquelocktoken:`` (opaquelocktoken)
- ``pop:`` (Post Office Protocol v3)
- ``prospero:`` (Prospero Directory Service)
- ``rsync:`` (rsync protocol)
- ``rtsp:`` (real time streaming protocol)
- ``service:`` (service location)
- ``shttp:`` (secure HTTP)
- ``sip:`` (session initiation protocol)
- ``tel:`` (telephone)
- ``tip:`` (Transaction Internet Protocol)
- ``tn3270:`` (Interactive 3270 emulation sessions)
- ``vemmi:`` (versatile multimedia interface)
- ``wais:`` (Wide Area Information Servers)
- ``z39.50r:`` (Z39.50 Retrieval)
- ``z39.50s:`` (Z39.50 Session)
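As a rough illustration of the HTTP rule from the list above (permanently moved pages warn, any final status other than 2xx is an error), here is a minimal Python sketch. The function name ``classify_http_result`` and its return strings are hypothetical and not taken from LinkChecker's code.

```python
def classify_http_result(final_status, saw_permanent_redirect=False):
    """Return 'error', 'warning' or 'valid' for a checked HTTP URL.

    Hypothetical helper: it only illustrates the documented rule.
    """
    # Any final status code outside the 2xx range is an error.
    if not 200 <= final_status <= 299:
        return "error"
    # A permanent redirect seen on the way only produces a warning.
    if saw_permanent_redirect:
        return "warning"
    return "valid"


print(classify_http_result(200))
print(classify_http_result(200, saw_permanent_redirect=True))
print(classify_http_result(404))
```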
Recursion
---------
Recursion occurs on HTML files, Opera bookmark files and directories.
Note that the directory recursion reads all files in that
directory, not just a subset like ``index.htm*``.
.. meta::
:navigation.order: 4
:navigation.name: FAQ
Frequently asked questions
--------------------------
**Q: LinkChecker produced an error, but my web page is ok with
Netscape/IE/Opera/...
Is this a bug in LinkChecker?**
A: Please check your web pages first. Are they really ok? Use
a `syntax highlighting editor`_. Use `HTML Tidy`_.
Check if you are using a proxy which produces the error.
.. _`syntax highlighting editor`:
http://fte.sourceforge.net/
.. _`HTML Tidy`:
http://tidy.sourceforge.net/
**Q: I still get an error, but the page is definitely ok.**
A: Some servers deny access of automated tools (also called robots)
like LinkChecker. This is not a bug in LinkChecker but rather a
policy by the webmaster running the website you are checking.
It might even be possible for a website to send robots different
web pages than normal browsers.
**Q: How can I tell LinkChecker which proxy to use?**
A: LinkChecker works transparently with proxies. In a Unix or Windows
environment, set the http_proxy, https_proxy, ftp_proxy or gopher_proxy
environment variables to a URL that identifies the proxy server before
starting LinkChecker. For example
::
$ http_proxy="http://www.someproxy.com:3128"
$ export http_proxy
In a Macintosh environment, LinkChecker will retrieve proxy information
from Internet Config.
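Python's standard library reads the same environment variables, which is a convenient way to confirm what a Python program will see. The sketch below uses modern Python 3 and the placeholder proxy URL from the example above; whether LinkChecker itself goes through this exact function is an assumption.

```python
import os
import urllib.request

# Same variable the shell example exports; the proxy URL is a placeholder.
os.environ["http_proxy"] = "http://www.someproxy.com:3128"

# getproxies_environment() scans the environment for <scheme>_proxy variables
# and returns a dict that maps each scheme to its proxy URL.
proxies = urllib.request.getproxies_environment()
print(proxies.get("http"))
```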
**Q: The link "mailto:john@company.com?subject=Hello John" is reported
as an error.**
A: You have to quote special characters (e.g. spaces) in the subject field.
The correct link is "mailto:...?subject=Hello%20John".
Unfortunately, browsers like IE and Netscape do not enforce this.
**Q: Does LinkChecker support JavaScript?**
A: No, and it never will. If your page does not work without JS then your
web design is broken.
Use PHP or Zope or ASP for dynamic content, and use JavaScript just as
an add-on for your web pages.
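If the mailto links are generated by a script, the quoting can be done with the standard library. This is a small sketch in modern Python 3; the address and subject are the placeholders from the question.

```python
from urllib.parse import quote

subject = "Hello John"
# quote() percent-encodes the space in the subject as %20.
link = "mailto:john@company.com?subject=" + quote(subject)
print(link)
```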
**Q: I don't get this --extern/--intern stuff.**
A: When it comes to checking there are three types of URLs. Note
that local files are also represented as URLs (i.e. file://), so
local files can be external URLs.
1) strict external URLs:
We only check the syntax. Internal URLs are never strict.
2) external URLs:
Like 1), but we additionally check if they are valid by connect()ing
to them.
3) internal URLs:
Like 2), but we additionally check if they are HTML pages and if so,
we descend recursively into this link and check all the links in the
HTML content.
The --recursion-level option limits the depth of such recursive
descents.
LinkChecker provides four options that determine which of these three
categories a URL falls into: --intern, --extern, --extern-strict-all and
--denyallow.
By default all URLs are internal. With --extern you specify which URLs
are external. With --intern you specify which URLs are internal.
Now imagine you have both --extern and --intern. What happens
when a URL matches both patterns? Or when it matches none? In this
situation the --denyallow option specifies the order in which we match
the URL. By default the order is internal/external; with --denyallow it is
external/internal. Either way, the first match counts, and if none matches,
the URL gets the category that was checked last.
Finally, with --extern-strict-all all external URLs are strict.
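The matching order can be hard to picture, so here is a minimal Python sketch of the rule as described in this answer. It is not LinkChecker's implementation: the function ``classify`` and its parameters are hypothetical, strictness is left out, and treating "no patterns at all" as internal is my reading of "by default all URLs are internal".

```python
import re


def classify(url, intern_patterns, extern_patterns, denyallow=False):
    """Return 'internal' or 'external' for url (hypothetical helper)."""
    # Assumption: without any patterns, every URL is internal.
    if not intern_patterns and not extern_patterns:
        return "internal"
    # Default order is internal/external; --denyallow reverses it.
    checks = [("internal", intern_patterns), ("external", extern_patterns)]
    if denyallow:
        checks.reverse()
    # The first category with a matching pattern wins.
    for category, patterns in checks:
        if any(re.search(pattern, url) for pattern in patterns):
            return category
    # No match: the URL gets the category that was checked last.
    return checks[-1][0]


intern = [r"www\.mycompany\.com"]
extern = [r"mycompany"]
print(classify("http://www.mycompany.com/", intern, extern))
print(classify("http://www.mycompany.com/", intern, extern, denyallow=True))
print(classify("http://elsewhere.org/", intern, extern))
```

With both patterns matching, the default order reports the first URL as internal, while the reversed order reports it as external.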
Oh, and just to boggle your mind: you can have more than one external
regular expression in a config file and for each of those expressions
you can specify if those matched external URLs should be strict or not.
An example: we don't want to check mailto URLs. Then the option is
-i'!^mailto:'. The '!' negates an expression. With --extern-strict-all,
we don't even connect to any mail hosts.
Another example: we check our site www.mycompany.com, don't want to recurse
into external links pointing outside our site, and want to ignore links
to hollowood.com and hullabulla.com completely.
This can only be done with a configuration entry like
::
[filtering]
extern1=hollowood.com 1
extern2=hullabulla.com 1
# the 1 means strict external ie don't even connect
and the command
``linkchecker --intern=www.mycompany.com www.mycompany.com``
**Q: Is LinkChecker's cookie feature insecure?**
A: Cookies cannot store more information than is in the HTTP request itself,
so you are not giving away any more system information.
After storing, however, the cookies are sent out to the server on request.
Not to every server, but only to the one the cookie originated from!
This could be used to "track" subsequent requests to this server,
and this is what annoys some people (including me).
Cookies are only stored in memory. After LinkChecker finishes, they
are lost. So the tracking is restricted to the checking time.
The cookie feature is disabled by default.
**Q: I want to have my own logging class. How can I use it in LinkChecker?**
A: Currently, only a Python API lets you define new logging classes.
Define your own logging class as a subclass of StandardLogger or any other
logging class in the log module.
Then call the logger_add function of linkcheck.configuration.Configuration
to register your new logger.
After this, append a new logger instance to the fileoutput list.
::
import linkcheck, MyLogger
log_format = 'mylog'
log_args = {'fileoutput': log_format, 'filename': 'foo.txt'}
cfg = linkcheck.configuration.Configuration()
cfg.logger_add(log_format, MyLogger.MyLogger)
cfg['fileoutput'].append(cfg.logger_new(log_format, log_args))
**Q: LinkChecker does not ignore anchor references on caching.**
**Q: Some links with anchors are getting checked twice.**
A: This is not a bug.
It is commonly assumed that if a URL ``ABC#anchor1`` works then
``ABC#anchor2`` works too. That is not specified anywhere and I have seen
server-side scripts that fail on some anchors and not on others.
This is the reason for always checking URLs with different anchors.
If you really want to disable this, use the ``--no-anchor-caching``
option.
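To make the difference concrete, the sketch below shows one way a result cache could build its key with and without the anchor. It is only an illustration in modern Python 3: ``cache_key`` and its ``anchor_caching`` parameter are hypothetical names, not LinkChecker internals.

```python
from urllib.parse import urldefrag


def cache_key(url, anchor_caching=True):
    """Return the key under which a check result would be cached."""
    if anchor_caching:
        # Default behaviour: ABC#anchor1 and ABC#anchor2 are distinct URLs.
        return url
    # With anchors ignored, both URLs collapse to the same key.
    return urldefrag(url).url


a = "http://example.com/ABC#anchor1"
b = "http://example.com/ABC#anchor2"
print(cache_key(a) == cache_key(b))
print(cache_key(a, anchor_caching=False) == cache_key(b, anchor_caching=False))
```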
**Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?**
A: LinkChecker follows the robots.txt exclusion standard. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the `Web Robot pages`_ and the `Spidering report`_ for more info.
.. _`Web Robot pages`:
http://www.robotstxt.org/wc/robots.html
.. _`Spidering report`:
http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt
**Q: Ctrl-C does not stop LinkChecker immediately. Why is that so?**
A: The Python interpreter has to wait for all threads to finish, and
this means waiting for all open sockets to close. The default timeout
for sockets is 30 seconds, hence the delay.
You can change the default socket timeout with the --timeout option.
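For readers scripting their own checks in Python, the process-wide default socket timeout is set as shown below. That the --timeout option ends up in a call like this is an assumption; the value 30 simply mirrors the default quoted above.

```python
import socket

# Sockets created after this call give up after 30 seconds of inactivity.
# A smaller value shortens the wait after Ctrl-C, at the risk of
# reporting slow servers as unreachable.
socket.setdefaulttimeout(30)
print(socket.getdefaulttimeout())
```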
**Q: How do I print unreachable/dead documents of my website with
LinkChecker?**
A: No can do. This would require file system access to your web
repository and access to your web server configuration.
You can instead store the linkchecker results in a database
and look for missing files.
**Q: How do I check HTML/XML syntax with LinkChecker?**
A: No can do. Use the `HTML Tidy`_ program.
.. _`HTML Tidy`:
http://tidy.sourceforge.net/

150
doc/en/index.html Normal file

@ -0,0 +1,150 @@
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.7: http://docutils.sourceforge.net/" />
<title>LinkChecker - check HTML documents for broken links</title>
<meta content="0" name="navigation.order" />
<meta content="LinkChecker" name="navigation.name" />
<link rel="stylesheet" href="lc.css" type="text/css" />
<link rel="shortcut icon" href="favicon.png" />
<link rel="stylesheet" href="navigation.css" type="text/css" />
<script type="text/javascript">
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
</script>
</head>
<body>
<!-- bfknav -->
<div class="navigation">
<div class="navrow" style="padding: 0em 0em 0em 1em;">
<span>LinkChecker</span>
<a href="./install.html">Installation</a>
<a href="./upgrading.html">Upgrading</a>
<a href="./documentation.html">Documentation</a>
<a href="./other.html">Other</a>
</div>
</div>
<!-- /bfknav -->
<div class="document" id="linkchecker-check-html-documents-for-broken-links">
<h1 class="title">LinkChecker - check HTML documents for broken links</h1>
<div class="contents topic" id="contents">
<p class="topic-title first"><a name="contents">Contents</a></p>
<ul class="simple">
<li><a class="reference" href="#features" id="id1" name="id1">Features</a></li>
<li><a class="reference" href="#download" id="id2" name="id2">Download</a></li>
<li><a class="reference" href="#screenshots" id="id3" name="id3">Screenshots</a></li>
<li><a class="reference" href="#running" id="id4" name="id4">Running</a><ul>
<li><a class="reference" href="#running-under-unix-or-mac-os-x-platforms" id="id5" name="id5">Running under Unix or Mac OS X platforms</a></li>
<li><a class="reference" href="#running-under-windows-platforms" id="id6" name="id6">Running under Windows platforms</a></li>
<li><a class="reference" href="#running-under-mac-os-9-x-platforms" id="id7" name="id7">Running under Mac OS 9.x platforms</a></li>
<li><a class="reference" href="#internationalization" id="id8" name="id8">Internationalization</a></li>
</ul>
</li>
<li><a class="reference" href="#bug-reporting" id="id9" name="id9">Bug reporting</a></li>
<li><a class="reference" href="#cvs-access" id="id10" name="id10">CVS access</a></li>
</ul>
</div>
<div class="section" id="features">
<h1><a class="toc-backref" href="#id1" name="features">Features</a></h1>
<ul class="simple">
<li>recursive checking</li>
<li>multithreading</li>
<li>output in colored or normal text, HTML, SQL, CSV or a sitemap
graph in GML or XML.</li>
<li>HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet and local
file links support</li>
<li>restriction of link checking with regular expression filters for URLs</li>
<li>proxy support</li>
<li>username/password authorization for HTTP and FTP</li>
<li>robots.txt exclusion protocol support</li>
<li>i18n support</li>
<li>a command line interface</li>
<li>a (Fast)CGI web interface (requires HTTP server)</li>
</ul>
</div>
<div class="section" id="download">
<h1><a class="toc-backref" href="#id2" name="download">Download</a></h1>
<p>Download the latest packages from the <a class="reference" href="http://sourceforge.net/project/showfiles.php?group_id=1913">LinkChecker download section</a>.
<a class="reference" href="linkchecker-md5sums.txt">Md5sum checksums</a> for the files above are also available.</p>
<p>Requirements and installation instructions are located at the
<a class="reference" href="install.html">install documentation</a>. To see what has changed between releases
look at the <a class="reference" href="http://cvs.sourceforge.net/viewcvs.py/linkchecker/linkchecker/ChangeLog?view=markup">ChangeLog</a>.</p>
</div>
<div class="section" id="screenshots">
<h1><a class="toc-backref" href="#id3" name="screenshots">Screenshots</a></h1>
<blockquote>
<table border="1" class="docutils">
<colgroup>
<col width="50%" />
<col width="50%" />
</colgroup>
<tbody valign="top">
<tr><td><div align="middle" class="image align-middle image-reference"><a class="first last reference" href="shot1.png"><img align="middle" alt="shot1_thumb.jpg" src="shot1_thumb.jpg" /></a></div>
</td>
<td><div align="middle" class="image align-middle image-reference"><a class="first last reference" href="shot2.png"><img align="middle" alt="shot2_thumb.jpg" src="shot2_thumb.jpg" /></a></div>
</td>
</tr>
<tr><td>Commandline interface</td>
<td>Web interface</td>
</tr>
</tbody>
</table>
</blockquote>
</div>
<div class="section" id="running">
<h1><a class="toc-backref" href="#id4" name="running">Running</a></h1>
<div class="section" id="running-under-unix-or-mac-os-x-platforms">
<h2><a class="toc-backref" href="#id5" name="running-under-unix-or-mac-os-x-platforms">Running under Unix or Mac OS X platforms</a></h2>
<p>The local configuration file is $HOME/.linkcheckerrc.
Type &quot;linkchecker&quot; followed by the URLs you want to check.
Type &quot;linkchecker -h&quot; for help.</p>
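<p>As a minimal sketch of the two commands above, a first session might
look like this. The URL <tt class="docutils literal"><span class="pre">http://www.example.org/</span></tt> is only a
placeholder and not part of LinkChecker.</p>
<pre class="literal-block">
# show the built-in help with all available options
linkchecker -h
# check one site; replace the placeholder URL with your own
linkchecker http://www.example.org/
</pre>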
</div>
<div class="section" id="running-under-windows-platforms">
<h2><a class="toc-backref" href="#id6" name="running-under-windows-platforms">Running under Windows platforms</a></h2>
<p>Start &quot;Check URL&quot; in your LinkChecker program group.
URL input is interactive.
Alternatively, run &quot;python.exe linkchecker&quot; in the Python
Scripts directory.</p>
</div>
<div class="section" id="running-under-mac-os-9-x-platforms">
<h2><a class="toc-backref" href="#id7" name="running-under-mac-os-9-x-platforms">Running under Mac OS 9.x platforms</a></h2>
<p>Read the MacOS Python documentation to find out about passing
commandline options to Python scripts.</p>
</div>
<div class="section" id="internationalization">
<h2><a class="toc-backref" href="#id8" name="internationalization">Internationalization</a></h2>
<p>For German output, execute &quot;export LC_MESSAGES=de&quot; in bash or
&quot;setenv LC_MESSAGES de&quot; in tcsh.
Under Windows, execute &quot;set LC_MESSAGES=de&quot;.
Other supported languages are 'nl' (Nederlands) and 'fr' (français).</p>
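<p>A short sketch for bash, assuming German output is wanted and using a
placeholder URL: the variable can be exported for the whole shell session
or, as a standard shell feature, set for a single command only.</p>
<pre class="literal-block">
# German messages for every following command in this shell
export LC_MESSAGES=de
linkchecker http://www.example.org/
# German messages for this one invocation only
LC_MESSAGES=de linkchecker http://www.example.org/
</pre>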
<p>You can help translate LinkChecker by copying the included
<tt class="docutils literal"><span class="pre">linkchecker.pot</span></tt> file to <tt class="docutils literal"><span class="pre">language.po</span></tt>, translating it and
sending it to me.</p>
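<p>For illustration only, starting a new translation could look like the
following in a Unix shell. The <tt class="docutils literal"><span class="pre">po/</span></tt> directory is where the install
documentation locates the template; the target file name and the language
code <tt class="docutils literal"><span class="pre">es</span></tt> are assumptions made for this example.</p>
<pre class="literal-block">
# copy the message template to a catalog named after the language code
cp po/linkchecker.pot po/es.po
# then translate the msgstr entries in po/es.po with a text editor
</pre>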
</div>
</div>
<div class="section" id="bug-reporting">
<h1><a class="toc-backref" href="#id9" name="bug-reporting">Bug reporting</a></h1>
<p>Use the <a class="reference" href="http://sourceforge.net/tracker/?func=add&amp;group_id=1913&amp;atid=101913">SourceForge Bug interface</a> to submit bugs, patches
and requests.</p>
</div>
<div class="section" id="cvs-access">
<h1><a class="toc-backref" href="#id10" name="cvs-access">CVS access</a></h1>
<p>The <a class="reference" href="http://sourceforge.net/cvs/?group_id=1913">SourceForge CVS page</a> has all the information on how to
obtain the development version of LinkChecker. Development of
LinkChecker requires some more software to be available, which
is documented on the <a class="reference" href="install.html">installation page</a>.</p>
<div align="right" class="image align-right image-reference"><a class="reference" href="http://sourceforge.net/"><img align="right" alt="SourceForge Logo" height="31" src="http://sourceforge.net/sflogo.php?group_id=1913&amp;type=1" width="88" /></a></div>
</div>
</div>
<hr class="docutils footer" />
<div class="footer">
Generated on: 2005-01-11 11:18 UTC.
</div>
</body>
</html>

5
doc/en/index.nav Normal file

@ -0,0 +1,5 @@
# generated by htmlnav.py, do not edit
name = u'LinkChecker'
level = 0
visible = True
order = 0

128
doc/en/index.txt Normal file

@ -0,0 +1,128 @@
.. meta::
:navigation.order: 0
:navigation.name: LinkChecker
===================================================
LinkChecker - check HTML documents for broken links
===================================================
.. contents::
Features
========
- recursive checking
- multithreading
- output in colored or normal text, HTML, SQL, CSV or a sitemap
graph in GML or XML.
- HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet and local
file links support
- restriction of link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP and FTP
- robots.txt exclusion protocol support
- i18n support
- a command line interface
- a (Fast)CGI web interface (requires HTTP server)
Download
========
Download the latest packages from the `LinkChecker download section`_.
`Md5sum checksums`_ for the files above are also available.
.. _LinkChecker download section:
http://sourceforge.net/project/showfiles.php?group_id=1913
.. _Md5sum checksums:
linkchecker-md5sums.txt
Requirements and installation instructions are located at the
`install documentation`_. To see what has changed between releases
look at the ChangeLog_.
.. _install documentation:
install.html
.. _ChangeLog:
http://cvs.sourceforge.net/viewcvs.py/linkchecker/linkchecker/ChangeLog?view=markup
Screenshots
===========
+----------------------------+----------------------------+
| .. image:: shot1_thumb.jpg | .. image:: shot2_thumb.jpg |
| :align: middle | :align: middle |
| :target: shot1.png | :target: shot2.png |
+----------------------------+----------------------------+
| Commandline interface | Web interface |
+----------------------------+----------------------------+
Running
=======
Running under Unix or Mac OS X platforms
----------------------------------------
The local configuration file is $HOME/.linkcheckerrc.
Type "linkchecker" followed by the URLs you want to check.
Type "linkchecker -h" for help.
Running under Windows platforms
-------------------------------
Start "Check URL" in your LinkChecker program group.
URL input is interactive.
Alternatively, run "python.exe linkchecker" in the Python
Scripts directory.
Running under Mac OS 9.x platforms
----------------------------------
Read the MacOS Python documentation to find out about passing
commandline options to Python scripts.
Internationalization
--------------------
For German output, execute "export LC_MESSAGES=de" in bash or
"setenv LC_MESSAGES de" in tcsh.
Under Windows, execute "set LC_MESSAGES=de".
Other supported languages are 'nl' (Nederlands) and 'fr' (français).
You can help translate LinkChecker by copying the included
``linkchecker.pot`` file to ``language.po``, translating it and
sending it to me.
Bug reporting
=============
Use the `SourceForge Bug interface`_ to submit bugs, patches
and requests.
.. _SourceForge Bug interface:
http://sourceforge.net/tracker/?func=add&group_id=1913&atid=101913
CVS access
==========
The `SourceForge CVS page`_ has all the information on how to
obtain the development version of LinkChecker. Development of
LinkChecker requires some more software to be available, which
is documented on the `installation page`_.
.. _SourceForge CVS page:
http://sourceforge.net/cvs/?group_id=1913
.. _installation page:
install.html
.. image:: http://sourceforge.net/sflogo.php?group_id=1913&type=1
:align: right
:target: http://sourceforge.net/
:alt: SourceForge Logo
:width: 88
:height: 31

207
doc/en/install.html Normal file

@ -0,0 +1,207 @@
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.7: http://docutils.sourceforge.net/" />
<title>Installation</title>
<meta content="1" name="navigation.order" />
<meta content="Installation" name="navigation.name" />
<link rel="stylesheet" href="lc.css" type="text/css" />
<link rel="shortcut icon" href="favicon.png" />
<link rel="stylesheet" href="navigation.css" type="text/css" />
<script type="text/javascript">
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
</script>
</head>
<body>
<!-- bfknav -->
<div class="navigation">
<div class="navrow" style="padding: 0em 0em 0em 1em;">
<a href="./index.html">LinkChecker</a>
<span>Installation</span>
<a href="./upgrading.html">Upgrading</a>
<a href="./documentation.html">Documentation</a>
<a href="./other.html">Other</a>
</div>
</div>
<!-- /bfknav -->
<div class="document" id="installation">
<h1 class="title">Installation</h1>
<p>If you are upgrading from older versions of LinkChecker you should
also read the <a class="reference" href="upgrading.html">upgrading documentation</a>.</p>
<div class="section" id="requirements-for-unix-linux-or-mac-os-x">
<h1><a name="requirements-for-unix-linux-or-mac-os-x">Requirements for Unix/Linux or Mac OS X</a></h1>
<ol class="arabic">
<li><p class="first">You need a standard GNU development environment with</p>
<ul>
<li><p class="first">C compiler (for example the GNU C Compiler gcc)</p>
<p>Depending on your distribution, several development packages
might be needed to provide a fully functional C development
environment.</p>
</li>
</ul>
<p>Note for developers: if you want to regenerate the po/linkchecker.pot
template from the source files, you will need xgettext with Python
support. This is available in gettext &gt;= 0.12.</p>
</li>
<li><p class="first">Python &gt;= 2.4 from <a class="reference" href="http://www.python.org/">http://www.python.org/</a> with zlib support</p>
<p>Make sure that the distutils module is also installed.
On most distributions, the distutils module is shipped in
a separate &quot;python-dev&quot; package.</p>
</li>
<li><p class="first"><em>Optional, for bash-completion:</em>
optcomplete Python module from <a class="reference" href="http://furius.ca/optcomplete/">http://furius.ca/optcomplete/</a></p>
</li>
<li><p class="first"><em>Optional (speedup for i386 compatible PCs)</em>
Psyco from <a class="reference" href="http://psyco.sourceforge.net/">http://psyco.sourceforge.net/</a>
[<a class="reference" href="http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4-src.tar.gz">http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4-src.tar.gz</a>]</p>
</li>
</ol>
</div>
<div class="section" id="requirements-for-windows">
<h1><a name="requirements-for-windows">Requirements for Windows</a></h1>
<ol class="arabic simple">
<li>Install Python &gt;= 2.4 from <a class="reference" href="http://www.python.org/">http://www.python.org/</a>
[<a class="reference" href="http://www.python.org/ftp/python/2.4/python-2.4.msi">http://www.python.org/ftp/python/2.4/python-2.4.msi</a>]</li>
<li><em>Only needed if you compile from source:</em>
install the MinGW suite from <a class="reference" href="http://mingw.sourceforge.net/">http://mingw.sourceforge.net/</a>.
Be sure to install in the given order:<ol class="loweralpha">
<li>MinGW
[<a class="reference" href="http://osdn.dl.sourceforge.net/sourceforge/mingw/MinGW-3.1.0-1.exe">http://osdn.dl.sourceforge.net/sourceforge/mingw/MinGW-3.1.0-1.exe</a>]</li>
<li>MSYS
[<a class="reference" href="http://osdn.dl.sourceforge.net/sourceforge/mingw/MSYS-1.0.10.exe">http://osdn.dl.sourceforge.net/sourceforge/mingw/MSYS-1.0.10.exe</a>]</li>
</ol>
</li>
<li><em>Optional (speedup for i386 compatible PCs)</em>
Psyco from <a class="reference" href="http://psyco.sourceforge.net/">http://psyco.sourceforge.net/</a>
[<a class="reference" href="http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4.win32-py2.4.exe">http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4.win32-py2.4.exe</a>]</li>
</ol>
</div>
<div class="section" id="setup-for-unix-linux-or-mac-os-x">
<h1><a name="setup-for-unix-linux-or-mac-os-x">Setup for Unix/Linux or Mac OS X</a></h1>
<ol class="arabic">
<li><p class="first">Install check</p>
<p>Be sure to have installed all required Unix/Linux software listed above.</p>
</li>
<li><p class="first">Compile Python modules</p>
<p>Run <tt class="docutils literal"><span class="pre">python</span> <span class="pre">setup.py</span> <span class="pre">build</span></tt> to compile the Python files.
For help about the setup.py script options, run
<tt class="docutils literal"><span class="pre">python</span> <span class="pre">setup.py</span> <span class="pre">--help</span></tt>.
The CC environment variable is checked before compilation, so you can
change the default C compiler with <tt class="docutils literal"><span class="pre">export</span> <span class="pre">CC=myccompiler</span></tt>.</p>
</li>
<li><blockquote class="first">
<ol class="loweralpha">
<li><p class="first">Installation as root</p>
<p>Run <tt class="docutils literal"><span class="pre">su</span> <span class="pre">-c</span> <span class="pre">'python</span> <span class="pre">setup.py</span> <span class="pre">install'</span></tt> to install LinkChecker.</p>
</li>
<li><p class="first">Installation as a normal user</p>
<p>Run <tt class="docutils literal"><span class="pre">python</span> <span class="pre">setup.py</span> <span class="pre">install</span> <span class="pre">--home</span> <span class="pre">$HOME</span></tt>. Note that you have
to adjust your PATH and PYTHONPATH environment variables, e.g. by
adding the commands <tt class="docutils literal"><span class="pre">export</span> <span class="pre">PYTHONPATH=$HOME/lib/python</span></tt> and
<tt class="docutils literal"><span class="pre">export</span> <span class="pre">PATH=$PATH:$HOME/bin</span></tt> to your shell configuration
file.</p>
<p>For more information look at the <a class="reference" href="http://docs.python.org/inst/search-path.html#SECTION000410000000000000000">Modifying Python's search path</a>
documentation.</p>
</li>
</ol>
<p>If you downloaded Psyco please read the <a class="reference" href="http://psyco.sourceforge.net/psycoguide/node2.html">psyco installation docs</a>.</p>
</blockquote>
</li>
</ol>
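<p>Taken together, the steps above for an installation as a normal user
can be sketched as the following bash session. It assumes that the commands
are run from the unpacked LinkChecker source directory; the two export lines
would normally go into your shell configuration file.</p>
<pre class="literal-block">
# step 2: compile the Python modules
python setup.py build
# step 3b: install below $HOME instead of system-wide
python setup.py install --home $HOME
# make the installed modules and the linkchecker script visible
export PYTHONPATH=$HOME/lib/python
export PATH=$PATH:$HOME/bin
# quick check that the script is found
linkchecker -h
</pre>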
</div>
<div class="section" id="setup-for-windows-the-binary-exe-installer">
<h1><a name="setup-for-windows-the-binary-exe-installer">Setup for Windows - the binary .exe installer:</a></h1>
<ol class="arabic">
<li><p class="first">Install check</p>
<p>Be sure to have installed all required Windows software listed above.</p>
</li>
<li><p class="first">Execute the <tt class="docutils literal"><span class="pre">linkchecker-x.xx.win32-py2.4.exe</span></tt> file and follow
the instructions.</p>
</li>
</ol>
</div>
<div class="section" id="setup-for-windows-compiling-from-source">
<h1><a name="setup-for-windows-compiling-from-source">Setup for Windows - compiling from source:</a></h1>
<ol class="arabic">
<li><p class="first">Install check</p>
<p>Be sure to have installed all required Windows software listed above.</p>
</li>
<li><p class="first">Preparing Python for the MinGW compiler</p>
<p>Locate the file python24.dll in your Windows folder.
Once you have found it, launch MSYS. Change into the Windows folder,
for example <tt class="docutils literal"><span class="pre">cd</span> <span class="pre">c:\winnt\system32</span></tt>. Then execute
<tt class="docutils literal"><span class="pre">pexports</span> <span class="pre">python24.dll</span> <span class="pre">&gt;</span> <span class="pre">python24.def</span></tt>.
Then use the dlltool with
<tt class="docutils literal"><span class="pre">dlltool</span> <span class="pre">--dllname</span> <span class="pre">python24.dll</span> <span class="pre">--def</span> <span class="pre">python24.def</span> <span class="pre">--output-lib</span>
<span class="pre">libpython24.a</span></tt>.
The resulting library has to be placed in the same directory as
python24.lib. (Should be the libs directory under your Python installation
directory, for example <tt class="docutils literal"><span class="pre">c:\Python24\Libs\</span></tt>.)</p>
</li>
<li><p class="first">Generate and execute the LinkChecker installer</p>
<p>Close the MSYS application (by typing <tt class="docutils literal"><span class="pre">exit</span></tt>) and open a DOS command
prompt.
Change to the <tt class="docutils literal"><span class="pre">linkchecker-X.X.X</span></tt> directory and run
<tt class="docutils literal"><span class="pre">python</span> <span class="pre">setup.py</span> <span class="pre">build</span> <span class="pre">-c</span> <span class="pre">mingw32</span> <span class="pre">bdist_wininst</span></tt>.</p>
<p>This generates a binary installer
<tt class="docutils literal"><span class="pre">dist\linkchecker-X.X.X.win32-py2.4.exe</span></tt> which you just have to
execute.</p>
<p>If you downloaded Psyco please read the <a class="reference" href="http://psyco.sourceforge.net/psycoguide/node2.html">psyco installation docs</a>.</p>
</li>
</ol>
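<p>The commands from steps 2 and 3 can be summarized as follows. This is
only a sketch: it assumes the example folders named above and that pexports
and dlltool are on the MSYS search path. The <tt class="docutils literal"><span class="pre">/c/...</span></tt> spelling is the MSYS
form of a <tt class="docutils literal"><span class="pre">c:\</span></tt> path. Adjust all paths to your system.</p>
<pre class="literal-block">
# inside MSYS: create an import library for the MinGW compiler
cd /c/winnt/system32
pexports python24.dll &gt; python24.def
dlltool --dllname python24.dll --def python24.def --output-lib libpython24.a
# place the library next to python24.lib
mv libpython24.a /c/Python24/libs/
exit
</pre>
<p>Then, in a DOS command prompt inside the <tt class="docutils literal"><span class="pre">linkchecker-X.X.X</span></tt> directory:</p>
<pre class="literal-block">
python setup.py build -c mingw32 bdist_wininst
</pre>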
</div>
<div class="section" id="after-installation">
<h1><a name="after-installation">After installation</a></h1>
<p>LinkChecker is now installed. Have fun!
See the <a class="reference" href="index.html">main page</a> for how to configure and start LinkChecker.</p>
</div>
<div class="section" id="installation-for-other-platforms">
<h1><a name="installation-for-other-platforms">Installation for other platforms</a></h1>
<p>If you happen to install LinkChecker on other platforms (for example
Mac OS 9.x) then drop me a note.</p>
</div>
<div class="section" id="fast-cgi-web-interface">
<h1><a name="fast-cgi-web-interface">(Fast)CGI web interface</a></h1>
<p>The included CGI scripts can run LinkChecker with a nice graphical web
interface.
You can use and adjust the example HTML files in the lconline directory
to run the script.</p>
<ol class="arabic simple">
<li>Choose a CGI script. The simplest is lc.cgi, which needs a web server
with CGI support.
The script lc.fcgi (I tested this a while ago) needs a web server
with FastCGI support.</li>
<li>Copy the script of your choice into the CGI directory.
Note that only the local host (i.e. 127.0.0.1) can access this
script. If you want to enable access from other hosts you have
to adjust the ALLOWED_HOSTS and ALLOWED_SERVERS variables in
the lc.cgi (or lc.fcgi) script.</li>
<li>Adjust the &quot;action=...&quot; parameter in lconline/lc_cgi.html
to point to your CGI script.</li>
<li>Load the lconline/index.html file, enter a URL and click on the
check button.</li>
<li>If something goes wrong, check the following:<ol class="loweralpha">
<li>look in the error log of your web server</li>
<li>be sure that you have enabled CGI support in your web server;
verify this by running other CGI scripts which you know are
working</li>
<li>try to run the lc.cgi script by hand</li>
<li>try the testit() function in the lc.cgi script</li>
</ol>
</li>
</ol>
</div>
</div>
<hr class="docutils footer" />
<div class="footer">
Generated on: 2005-03-04 20:27 UTC.
</div>
</body>
</html>

5
doc/en/install.nav Normal file

@ -0,0 +1,5 @@
# generated by htmlnav.py, do not edit
name = u'Installation'
level = 0
visible = True
order = 1

194
doc/en/install.txt Normal file

@ -0,0 +1,194 @@
.. meta::
:navigation.order: 1
:navigation.name: Installation
Installation
============
If you are upgrading from older versions of LinkChecker you should
also read the `upgrading documentation`_.
.. _upgrading documentation:
upgrading.html
Requirements for Unix/Linux or Mac OS X
---------------------------------------
1. You need a standard GNU development environment with
- C compiler (for example the GNU C Compiler gcc)
Depending on your distribution, several development packages
might be needed to provide a fully functional C development
environment.
Note for developers: if you want to regenerate the po/linkchecker.pot
template from the source files, you will need xgettext with Python
support. This is available in gettext >= 0.12.
2. Python >= 2.4 from http://www.python.org/ with zlib support
Make sure that the distutils module is also installed.
On most distributions, the distutils module is shipped in
a separate "python-dev" package.
3. *Optional, for bash-completion:*
optcomplete Python module from http://furius.ca/optcomplete/
4. *Optional (speedup for i386 compatible PCs)*
Psyco from http://psyco.sourceforge.net/
[http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4-src.tar.gz]
Requirements for Windows
------------------------
1. Install Python >= 2.4 from http://www.python.org/
[http://www.python.org/ftp/python/2.4/python-2.4.msi]
2. *Only needed if you compile from source:*
install the MinGW suite from http://mingw.sourceforge.net/.
Be sure to install in the given order:
a) MinGW
[http://osdn.dl.sourceforge.net/sourceforge/mingw/MinGW-3.1.0-1.exe]
b) MSYS
[http://osdn.dl.sourceforge.net/sourceforge/mingw/MSYS-1.0.10.exe]
3. *Optional (speedup for i386 compatible PCs)*
Psyco from http://psyco.sourceforge.net/
[http://osdn.dl.sourceforge.net/sourceforge/psyco/psyco-1.4.win32-py2.4.exe]
Setup for Unix/Linux or Mac OS X
--------------------------------
1. Install check
Be sure to have installed all required Unix/Linux software listed above.
2. Compile Python modules
Run ``python setup.py build`` to compile the Python files.
For help about the setup.py script options, run
``python setup.py --help``.
The CC environment variable is checked before compilation, so you can
change the default C compiler with ``export CC=myccompiler``.
3.
a) Installation as root
Run ``su -c 'python setup.py install'`` to install LinkChecker.
b) Installation as a normal user
Run ``python setup.py install --home $HOME``. Note that you have
to adjust your PATH and PYTHONPATH environment variables, e.g. by
adding the commands ``export PYTHONPATH=$HOME/lib/python`` and
``export PATH=$PATH:$HOME/bin`` to your shell configuration
file.
For more information look at the `Modifying Python's search path`_
documentation.
.. _Modifying Python's search path:
http://docs.python.org/inst/search-path.html#SECTION000410000000000000000
If you downloaded Psyco please read the `psyco installation docs`_.
.. _psyco installation docs:
http://psyco.sourceforge.net/psycoguide/node2.html
Setup for Windows - the binary .exe installer:
----------------------------------------------
1. Install check
Be sure to have installed all required Windows software listed above.
2. Execute the ``linkchecker-x.xx.win32-py2.4.exe`` file and follow
the instructions.
Setup for Windows - compiling from source:
------------------------------------------
1. Install check
Be sure to have installed all required Windows software listed above.
2. Preparing Python for the MinGW compiler
Locate the file python24.dll in your Windows folder.
Once you have found it, launch MSYS. Change into the Windows folder,
for example ``cd c:\winnt\system32``. Then execute
``pexports python24.dll > python24.def``.
Then use the dlltool with
``dlltool --dllname python24.dll --def python24.def --output-lib
libpython24.a``.
The resulting library has to be placed in the same directory as
python24.lib. (Should be the libs directory under your Python installation
directory, for example ``c:\Python24\Libs\``.)
3. Generate and execute the LinkChecker installer
Close the MSYS application (by typing ``exit``) and open a DOS command
prompt.
Change to the ``linkchecker-X.X.X`` directory and run
``python setup.py build -c mingw32 bdist_wininst``.
This generates a binary installer
``dist\linkchecker-X.X.X.win32-py2.4.exe`` which you just have to
execute.
If you downloaded Psyco please read the `psyco installation docs`_.
.. _psyco installation docs:
http://psyco.sourceforge.net/psycoguide/node2.html
After installation
------------------
LinkChecker is now installed. Have fun!
See the `main page`_ for how to configure and start LinkChecker.
.. _main page: index.html
Installation for other platforms
--------------------------------
If you happen to install LinkChecker on other platforms (for example
Mac OS 9.x) then drop me a note.
(Fast)CGI web interface
-----------------------
The included CGI scripts can run LinkChecker with a nice graphical web
interface.
You can use and adjust the example HTML files in the lconline directory
to run the script.
1. Choose a CGI script. The simplest is lc.cgi, which needs a web server
with CGI support.
The script lc.fcgi (I tested this a while ago) needs a web server
with FastCGI support.
2. Copy the script of your choice into the CGI directory.
Note that only the local host (i.e. 127.0.0.1) can access this
script. If you want to enable access from other hosts you have
to adjust the ALLOWED_HOSTS and ALLOWED_SERVERS variables in
the lc.cgi (or lc.fcgi) script.
3. Adjust the "action=..." parameter in lconline/lc_cgi.html
to point to your CGI script.
4. Load the lconline/index.html file, enter a URL and click on the
check button.
5. If something goes wrong, check the following:
a) look in the error log of your web server
b) be sure that you have enabled CGI support in your web server;
verify this by running other CGI scripts which you know are
working
c) try to run the lc.cgi script by hand
d) try the testit() function in the lc.cgi script

263
doc/en/lc.css Normal file

@ -0,0 +1,263 @@
/*
:Author: David Goodger
:Contact: goodger@users.sourceforge.net
:date: $Date$
:version: $Revision$
:copyright: This stylesheet has been placed in the public domain.
Default cascading style sheet for the HTML output of Docutils.
*/
body {
font-family: Verdana, Helvetica, Arial, sans-serif;
background: #fff7ee;/*#f7ebd3;*//*fdf9f4*/
margin: 0;
padding: 0;
}
img {
border: 0;
}
.first {
margin-top: 0 }
.last {
margin-bottom: 0 }
a {
color: #222222;
}
a:hover {
color: black;
}
a.toc-backref {
text-decoration: none ;
color: black;
}
div.document {
margin-left: 2em;
width:680px;
overflow: visible;
}
blockquote.epigraph {
margin: 2em 5em ; }
dd {
margin-bottom: 0.5em }
div.abstract {
margin: 2em 5em;
}
div.abstract p.topic-title {
font-weight: bold ;
text-align: center }
div.attention, div.caution, div.danger, div.error, div.hint,
div.important, div.note, div.tip, div.warning, div.admonition {
margin: 2em ;
border: medium outset ;
padding: 1em }
div.attention p.admonition-title, div.caution p.admonition-title,
div.danger p.admonition-title, div.error p.admonition-title,
div.warning p.admonition-title {
color: red ;
font-weight: bold ;
font-family: sans-serif }
div.hint p.admonition-title, div.important p.admonition-title,
div.note p.admonition-title, div.tip p.admonition-title,
div.admonition p.admonition-title {
font-weight: bold ;
font-family: sans-serif }
div.dedication {
margin: 2em 5em ;
text-align: center ;
font-style: italic }
div.dedication p.topic-title {
font-weight: bold ;
font-style: normal }
div.figure {
margin-left: 2em }
div.footer, div.header {
font-size: smaller;
margin-left: 2em;
margin-bottom: 1em;
}
hr.footer {
width: 680px;
margin-left: 2em;
}
div.sidebar {
margin-left: 1em ;
border: medium outset ;
padding: 0em 1em ;
background-color: #ffffee ;
width: 40% ;
float: right ;
clear: right }
div.sidebar p.rubric {
font-family: sans-serif ;
font-size: medium }
div.system-messages {
margin: 5em }
div.system-messages h1 {
color: red }
div.system-message {
border: medium outset ;
padding: 1em }
div.system-message p.system-message-title {
color: red ;
font-weight: bold }
div.topic {
margin: 2em }
h1.title {
text-align: center }
h2.subtitle {
text-align: center }
hr {
width: 75% }
ol.simple, ul.simple {
margin-bottom: 1em }
ol.arabic {
list-style: decimal }
ol.loweralpha {
list-style: lower-alpha }
ol.upperalpha {
list-style: upper-alpha }
ol.lowerroman {
list-style: lower-roman }
ol.upperroman {
list-style: upper-roman }
p.attribution {
text-align: right ;
margin-left: 50% }
p.caption {
font-style: italic }
p.credits {
font-style: italic ;
font-size: smaller }
p.label {
white-space: nowrap }
p.rubric {
font-weight: bold ;
font-size: larger ;
color: maroon ;
text-align: center }
p.sidebar-title {
font-family: sans-serif ;
font-weight: bold ;
font-size: larger }
p.sidebar-subtitle {
font-family: sans-serif ;
font-weight: bold }
p.topic-title {
font-weight: bold }
pre.address {
margin-bottom: 0 ;
margin-top: 0 ;
font-family: serif ;
font-size: 100% }
pre.line-block {
font-family: serif ;
font-size: 100% }
pre.literal-block, pre.doctest-block {
margin-left: 2em ;
margin-right: 2em ;
background-color: #eeeeee }
span.classifier {
font-family: sans-serif ;
font-style: oblique }
span.classifier-delimiter {
font-family: sans-serif ;
font-weight: bold }
span.interpreted {
font-family: sans-serif }
span.option {
white-space: nowrap }
span.option-argument {
font-style: italic }
span.pre {
white-space: pre }
span.problematic {
color: red }
table {
margin-top: 0.5em ;
margin-bottom: 0.5em }
table.citation {
border-left: solid thin gray ;
padding-left: 0.5ex }
table.docinfo {
margin: 2em 4em }
table.footnote {
border-left: solid thin black ;
padding-left: 0.5ex }
td, th {
padding-left: 0.5em ;
padding-right: 0.5em ;
vertical-align: top }
th.docinfo-name, th.field-name {
font-weight: bold ;
text-align: left ;
white-space: nowrap }
h1 tt, h2 tt, h3 tt, h4 tt, h5 tt, h6 tt {
font-size: 100% }
tt {
background-color: #eeeeee }
ul.auto-toc {
list-style-type: none }

40
doc/en/navigation.css Normal file

@ -0,0 +1,40 @@
.navigation {
background: transparent;
}
.navrow {
border-collapse: collapse;
border-bottom-width: 1px;
border-bottom-style: dotted;
border-bottom-color: #f86821;
white-space: nowrap;
}
.navrow a {
color: #222222;
background: transparent;
border-left-width: 10px;
border-left-style: solid;
border-left-color: #f86821;
font-weight: normal;
margin-right: 1em;
padding: 0em 0.5em;
text-decoration: none;
}
.navrow a:hover {
color: black;
background: #f8c218;
border-color: #f85b0d;
}
.navrow span {
color: #222222;
background: #f8c218;
border-left-width: 10px;
border-left-style: solid;
border-left-color: #f85b0d;
font-weight: normal;
margin-right: 1em;
padding: 0em 0.5em;
}

56
doc/en/other.html Normal file

@ -0,0 +1,56 @@
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.7: http://docutils.sourceforge.net/" />
<title>Other link checkers</title>
<meta content="5" name="navigation.order" />
<meta content="Other" name="navigation.name" />
<link rel="stylesheet" href="lc.css" type="text/css" />
<link rel="shortcut icon" href="favicon.png" />
<link rel="stylesheet" href="navigation.css" type="text/css" />
<script type="text/javascript">
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
</script>
</head>
<body>
<!-- bfknav -->
<div class="navigation">
<div class="navrow" style="padding: 0em 0em 0em 1em;">
<a href="./index.html">LinkChecker</a>
<a href="./install.html">Installation</a>
<a href="./upgrading.html">Upgrading</a>
<a href="./documentation.html">Documentation</a>
<span>Other</span>
</div>
</div>
<!-- /bfknav -->
<div class="document" id="other-link-checkers">
<h1 class="title">Other link checkers</h1>
<p>If LinkChecker does not fit your requirements, you can check out the
competition. All of these programs also have an <a class="reference" href="http://www.opensource.org/licenses/">Open Source license</a>,
like LinkChecker.</p>
<ul class="simple">
<li><a class="reference" href="http://degraaff.org/checkbot/">checkbot</a> written in Perl</li>
<li><a class="reference" href="http://www.jmarshall.com/tools/cl/">Checklinks</a> written in Perl</li>
<li><a class="reference" href="http://dlc.sourceforge.net/">Dead link check</a> written in Perl</li>
<li><a class="reference" href="http://labs.libre-entreprise.org/projects/gurlchecker/">gURLChecker</a> written in C</li>
<li><a class="reference" href="http://web.purplefrog.com/~thoth/jchecklinks/">jchecklinks</a> written in Java</li>
<li><a class="reference" href="http://ymettier.free.fr/link-checker/link-checker.html">link-checker</a> written in C</li>
<li><a class="reference" href="http://www.linklint.org/">linklint</a> written in Perl</li>
<li><a class="reference" href="http://www.mired.org/webcheck/">webcheck</a> written in Python</li>
<li><a class="reference" href="http://cgi.linuxfocus.org/~guido/index.html#webgrep">webgrep</a> written in Perl</li>
</ul>
</div>
<hr class="docutils footer" />
<div class="footer">
Generated on: 2005-01-11 11:18 UTC.
</div>
</body>
</html>

5
doc/en/other.nav Normal file

@ -0,0 +1,5 @@
# generated by htmlnav.py, do not edit
name = u'Other'
level = 0
visible = True
order = 5

58
doc/en/other.txt Normal file

@ -0,0 +1,58 @@
.. meta::
   :navigation.order: 5
   :navigation.name: Other

Other link checkers
===================

If LinkChecker does not fit your requirements, you can check out the
competition. All of these programs also have an `Open Source license`_,
like LinkChecker.

.. _`Open Source license`:
   http://www.opensource.org/licenses/

- `checkbot`_ written in Perl

.. _checkbot:
   http://degraaff.org/checkbot/

- `Checklinks`_ written in Perl

.. _Checklinks:
   http://www.jmarshall.com/tools/cl/

- `Dead link check`_ written in Perl

.. _Dead link check:
   http://dlc.sourceforge.net/

- `gURLChecker`_ written in C

.. _gURLChecker:
   http://labs.libre-entreprise.org/projects/gurlchecker/

- `jchecklinks`_ written in Java

.. _jchecklinks:
   http://web.purplefrog.com/~thoth/jchecklinks/

- `link-checker`_ written in C

.. _link-checker:
   http://ymettier.free.fr/link-checker/link-checker.html

- `linklint`_ written in Perl

.. _linklint:
   http://www.linklint.org/

- `webcheck`_ written in Python

.. _webcheck:
   http://www.mired.org/webcheck/

- `webgrep`_ written in Perl

.. _webgrep:
   http://cgi.linuxfocus.org/~guido/index.html#webgrep

BIN
doc/en/shot1.png Normal file
Binary file not shown. Size: 12 KiB

BIN
doc/en/shot1_thumb.jpg Normal file
Binary file not shown. Size: 3.8 KiB

BIN
doc/en/shot2.png Normal file
Binary file not shown. Size: 41 KiB

BIN
doc/en/shot2_thumb.jpg Normal file
Binary file not shown. Size: 2.5 KiB

83
doc/en/upgrading.html Normal file

@ -0,0 +1,83 @@
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.7: http://docutils.sourceforge.net/" />
<title>Upgrading</title>
<meta content="2" name="navigation.order" />
<meta content="Upgrading" name="navigation.name" />
<link rel="stylesheet" href="lc.css" type="text/css" />
<link rel="shortcut icon" href="favicon.png" />
<link rel="stylesheet" href="navigation.css" type="text/css" />
<script type="text/javascript">
window.onload = function() {
if (top.location != location) {
top.location.href = document.location.href;
}
}
</script>
</head>
<body>
<!-- bfknav -->
<div class="navigation">
<div class="navrow" style="padding: 0em 0em 0em 1em;">
<a href="./index.html">LinkChecker</a>
<a href="./install.html">Installation</a>
<span>Upgrading</span>
<a href="./documentation.html">Documentation</a>
<a href="./other.html">Other</a>
</div>
</div>
<!-- /bfknav -->
<div class="document" id="upgrading">
<h1 class="title">Upgrading</h1>
<div class="section" id="migrating-from-2-2-to-2-3">
<h1><a name="migrating-from-2-2-to-2-3">Migrating from 2.2 to 2.3</a></h1>
<p>The per-user config file is now <tt class="docutils literal"><span class="pre">~/.linkchecker/linkcheckerrc</span></tt>
(previous location was <tt class="docutils literal"><span class="pre">~/.linkcheckerrc</span></tt>).</p>
<p>The default blacklist output file is now <tt class="docutils literal"><span class="pre">~/.linkchecker/blacklist</span></tt>
(previous location was <tt class="docutils literal"><span class="pre">~/.blacklist</span></tt>).</p>
<p>Python &gt;= 2.4 is now required.</p>
</div>
<div class="section" id="migrating-from-1-x-to-2-0">
<h1><a name="migrating-from-1-x-to-2-0">Migrating from 1.x to 2.0</a></h1>
<p>The --output and --file-output parameters can now specify the
encoding. Check that your scripts support the new option
syntax.</p>
<p>Some added checks might trigger new warnings, so automated scripts
or alarms can have more output than with 1.x releases.</p>
<p>All output (file and console) is now encoded according to a given
character set encoding, which defaults to ISO-8859-15. If you
relied on the output being in a specific encoding, you might want to
use the output encoding option.</p>
</div>
<div class="section" id="migrating-from-1-12-x-to-1-13-0">
<h1><a name="migrating-from-1-12-x-to-1-13-0">Migrating from 1.12.x to 1.13.0</a></h1>
<p>Since lots of filenames have changed, you should check that any
manually installed versions prior to 1.13.0 are removed. Otherwise
you will have startup problems.</p>
<p>The default output logger <tt class="docutils literal"><span class="pre">text</span></tt> now has colored output if the
output terminal supports it. The old <tt class="docutils literal"><span class="pre">colored</span></tt> output logger has
been removed.</p>
<p>The <tt class="docutils literal"><span class="pre">-F</span></tt> option no longer suppresses normal output. The old behaviour
can be restored by giving the option <tt class="docutils literal"><span class="pre">-onone</span></tt>.</p>
<p>The --status option is now the default and has been deprecated. The
old behaviour can be restored by giving the option <tt class="docutils literal"><span class="pre">--no-status</span></tt>.</p>
<p>The default recursion depth is now infinite. The old behaviour
can be restored by giving the option <tt class="docutils literal"><span class="pre">--recursion-level=1</span></tt>.</p>
<p>The option <tt class="docutils literal"><span class="pre">--strict</span></tt> has been renamed to <tt class="docutils literal"><span class="pre">--extern-strict-all</span></tt>.</p>
<p>The commandline program <tt class="docutils literal"><span class="pre">linkchecker</span></tt> now returns a non-zero exit value
when errors were encountered. Previous versions always returned a zero
exit value.
To make scripts ignore the exit value and thereby restore the old behaviour,
append <tt class="docutils literal"><span class="pre">||</span> <span class="pre">true</span></tt> to the end of the command.</p>
</div>
</div>
<hr class="docutils footer" />
<div class="footer">
Generated on: 2005-01-31 13:00 UTC.
</div>
</body>
</html>

5
doc/en/upgrading.nav Normal file

@ -0,0 +1,5 @@
# generated by htmlnav.py, do not edit
name = u'Upgrading'
level = 0
visible = True
order = 2

62
doc/en/upgrading.txt Normal file

@ -0,0 +1,62 @@
.. meta::
   :navigation.order: 2
   :navigation.name: Upgrading

Upgrading
=========

Migrating from 2.2 to 2.3
-------------------------

The per-user config file is now ``~/.linkchecker/linkcheckerrc``
(previous location was ``~/.linkcheckerrc``).

The default blacklist output file is now ``~/.linkchecker/blacklist``
(previous location was ``~/.blacklist``).

Python >= 2.4 is now required.

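The two file moves above can be done by hand. As a minimal sketch only,
assuming a current Python 3 interpreter (not the Python 2.4 of this
release) and assuming that an already existing new-style file should
never be overwritten, a small helper could look like the following. The
name ``migrate_linkchecker_files`` is hypothetical and not part of
LinkChecker; whether LinkChecker migrates these files itself is not
stated here.

```python
import os
import shutil

# Old and new locations relative to the home directory, as listed in the
# upgrade notes above.
MOVES = [
    (".linkcheckerrc", os.path.join(".linkchecker", "linkcheckerrc")),
    (".blacklist", os.path.join(".linkchecker", "blacklist")),
]


def migrate_linkchecker_files(home):
    """Move old per-user files to their 2.3 locations.

    Hypothetical helper for illustration; returns the new paths of the
    files that were actually moved.
    """
    moved = []
    for old_rel, new_rel in MOVES:
        old = os.path.join(home, old_rel)
        new = os.path.join(home, new_rel)
        # Skip missing files and never overwrite a new-style file.
        if not os.path.isfile(old) or os.path.exists(new):
            continue
        os.makedirs(os.path.dirname(new), exist_ok=True)
        shutil.move(old, new)
        moved.append(new)
    return moved
```

Calling ``migrate_linkchecker_files(os.path.expanduser("~"))`` would apply
the moves to the current user's home directory; running it a second time
moves nothing.
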
Migrating from 1.x to 2.0
-------------------------

The --output and --file-output parameters can now specify the
encoding. Check that your scripts support the new option
syntax.

Some added checks might trigger new warnings, so automated scripts
or alarms can have more output than with 1.x releases.

All output (file and console) is now encoded according to a given
character set encoding, which defaults to ISO-8859-15. If you
relied on the output being in a specific encoding, you might want to
use the output encoding option.

Migrating from 1.12.x to 1.13.0
-------------------------------

Since lots of filenames have changed, you should check that any
manually installed versions prior to 1.13.0 are removed. Otherwise
you will have startup problems.

The default output logger ``text`` now has colored output if the
output terminal supports it. The old ``colored`` output logger has
been removed.

The ``-F`` option no longer suppresses normal output. The old behaviour
can be restored by giving the option ``-onone``.

The --status option is now the default and has been deprecated. The
old behaviour can be restored by giving the option ``--no-status``.

The default recursion depth is now infinite. The old behaviour
can be restored by giving the option ``--recursion-level=1``.

The option ``--strict`` has been renamed to ``--extern-strict-all``.

The commandline program ``linkchecker`` now returns a non-zero exit value
when errors were encountered. Previous versions always returned a zero
exit value.
To make scripts ignore the exit value and thereby restore the old behaviour,
append ``|| true`` to the end of the command.
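
A wrapper script written in Python can inspect the same exit value
directly instead of discarding it in the shell. The sketch below is an
illustration only: ``links_ok`` and ``run_ignoring_exit_status`` are
hypothetical helpers, not part of LinkChecker, and the code assumes a
current Python 3 interpreter.

```python
import subprocess


def links_ok(command):
    """Return True if the command exited with status zero."""
    # subprocess.call waits for the command and returns its exit status.
    return subprocess.call(command) == 0


def run_ignoring_exit_status(command):
    """Run the command but always report success, like `command || true`."""
    subprocess.call(command)
    return 0
```

For example, ``links_ok(["linkchecker", "http://www.example.com/"])``
(with a placeholder URL, and assuming ``linkchecker`` is on the ``PATH``)
would be false whenever the check run ended with a non-zero exit value.
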