.TH LINKCHECKER 1 "10 March 2001"
.SH NAME
linkchecker \- check your HTML documents for broken links
.SH SYNOPSIS
.B linkchecker
[
.I options
]
[
.I file-or-url
]
.SH DESCRIPTION
.LP
LinkChecker features
recursive checking,
multithreading,
output in colored or normal text, HTML, SQL, CSV or a sitemap
graph in GML or XML,
support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:,
Gopher, Telnet and local file links,
restriction of link checking with regular expression filters for URLs,
proxy support,
username/password authorization for HTTP and FTP,
robots.txt exclusion protocol support,
i18n support,
a command line interface and
a (Fast)CGI web interface (requires an HTTP server).
.SH EXAMPLES
The most common use checks the given domain recursively, plus any
single URL pointing outside of the domain:
\fBlinkchecker http://treasure.calvinsplayground.de/\fP

Beware that this checks the whole site, which can have several hundred
thousand URLs. Use the \fB-r\fP option to restrict the recursion depth.

Don't connect to mailto: hosts, only check their URL syntax. All other
links are checked as usual:
\fBlinkchecker --intern='!^mailto:' --extern-strict-all www.mysite.org\fP

Checking a local HTML file on Unix:
\fBlinkchecker ../bla.html\fP

Checking a local HTML file on Windows:
\fBlinkchecker c:\\temp\\test.html\fP

You can skip the \fBhttp://\fP URL part if the domain starts with \fBwww.\fP:
\fBlinkchecker www.myhomepage.de\fP

You can skip the \fBftp://\fP URL part if the domain starts with \fBftp.\fP:
\fBlinkchecker -r0 ftp.linux.org\fP
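.LP
As a further sketch that combines options described in the OPTIONS section
(the host www.example.org and the file name report.html are placeholders,
not taken from a real setup), limit the recursion depth to 1 and write an
HTML report to a file. Because no \fB-o\fP option is given, console output
is suppressed:
\fBlinkchecker -r1 -F html/report.html www.example.org\fP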
.SH OPTIONS
For single-letter options, the space before the argument is optional.
So \fB-o colored\fP is the same as \fB-ocolored\fP.
.TP
\fB-a\fP, \fB--anchors\fP
Check HTTP anchor references. This option applies to both internal
and external URLs. By default, anchors are not checked.
This option implies \fB-w\fP because anchor errors are always warnings.
.TP
\fB-C\fP, \fB--cookies\fP
Accept and send HTTP cookies according to RFC 2109. Only cookies
which are sent back to the originating server are accepted.
Sent and accepted cookies are provided as additional logging
information.
.TP
\fB-d\fP, \fB--denyallow\fP
Swap checking order to external/internal. Default checking order is
internal/external.
.TP
\fB-D\fP, \fB--debug\fP
Print debugging information. Provide this option multiple times
for even more debugging information. Enabling debug will also
disable threading.
.TP
\fB-e\fP \fIregex\fP, \fB--extern=\fP\fIregex\fP
Treat URLs that match the given regular expression as external.
Only internal HTML links are checked recursively.
.TP
\fB-f\fP \fIfile\fP, \fB--config=\fP\fIfile\fP
Use \fIfile\fP as configuration file. By default, LinkChecker first searches
/etc/linkcheckerrc and then ~/.linkcheckerrc.
.TP
\fB-F\fP \fItype\fP[\fB/\fP\fIfilename\fP], \fB--file-output=\fP\fItype\fP[\fB/\fP\fIfilename\fP]
Output to a file \fBlinkchecker-out.\fP\fItype\fP,
\fB$HOME/.linkchecker_blacklist\fP for
\fBblacklist\fP output, or \fIfilename\fP if specified.
For the \fBnone\fP output type the \fIfilename\fP part is ignored;
for all other types an existing file is overwritten.
You can specify this option more than once. Valid file output types
are \fBtext\fP, \fBcolored\fP, \fBhtml\fP, \fBsql\fP,
\fBcsv\fP, \fBgml\fP, \fBxml\fP, \fBnone\fP or \fBblacklist\fP.
Default is no file output. If console output is not specified with
\fB-o\fP, this option suppresses all console output by implying
\fB-o none\fP.
.TP
\fB-I\fP, \fB--interactive\fP
Ask for a URL if none is given on the command line.
.TP
\fB-i\fP \fIregex\fP, \fB--intern=\fP\fIregex\fP
Treat URLs that match the given regular expression as internal.
LinkChecker descends recursively only to internal URLs, not to external ones.
.TP
\fB-h\fP, \fB--help\fP
Help me! Print usage information for this program.
.TP
\fB-N\fP \fIserver\fP, \fB--nntp-server=\fP\fIserver\fP
Specify an NNTP server for 'news:...' links. Default is the
environment variable NNTP_SERVER. If no host is given,
only the syntax of the link is checked.
.TP
\fB--no-anchor-caching\fP
Treat url#anchora and url#anchorb as equal on caching. This
is the default browser behaviour, but it's not specified in
the URI specification. Use with care.
.TP
\fB-o\fP \fItype\fP, \fB--output=\fP\fItype\fP
Specify output type as \fBtext\fP, \fBcolored\fP, \fBhtml\fP, \fBsql\fP,
\fBcsv\fP, \fBgml\fP, \fBxml\fP, \fBnone\fP or \fBblacklist\fP.
Default type is \fBtext\fP.
.TP
\fB-p\fP \fIpwd\fP, \fB--password=\fP\fIpwd\fP
Try the password \fIpwd\fP for HTTP and FTP authorization.
For FTP the default password is \fBanonymous@\fP. See also \fB-u\fP.
.TP
\fB-P\fP \fIsecs\fP, \fB--pause=\fP\fIsecs\fP
Pause \fIsecs\fP seconds between URL checks. This option
implies \fB-t0\fP.
Default is no pause between requests.
.TP
\fB-q\fP, \fB--quiet\fP
Quiet operation. This is only useful with \fB-F\fP.
.TP
\fB-r\fP \fIdepth\fP, \fB--recursion-level=\fP\fIdepth\fP
Check recursively all links up to the given \fIdepth\fP.
A negative depth enables infinite recursion.
Default depth is infinite.
.TP
\fB-s\fP, \fB--extern-strict-all\fP
Check only the syntax of external links, do not try to connect to them.
For local file URLs, only local files are internal. For
HTTP and FTP URLs, all URLs at the same domain name are internal.
.TP
\fB--no-status\fP
Do not print check status every 5 seconds to stderr. Does not work with the
\fB--debug\fP option.
.TP
\fB-t\fP \fInum\fP, \fB--threads=\fP\fInum\fP
Generate no more than \fInum\fP threads. Default number of threads is 10.
To disable threading specify a non-positive number.
.TP
\fB--timeout=\fP\fIsecs\fP
Set the timeout for connection attempts in seconds. The default timeout
is 30 seconds.
.TP
\fB-u\fP \fIname\fP, \fB--user=\fP\fIname\fP
Try username \fIname\fP for HTTP and FTP authorization.
For FTP the default username is \fBanonymous\fP. See also \fB-p\fP.
.TP
\fB-V\fP, \fB--version\fP
Print version and exit.
.TP
\fB-v\fP, \fB--verbose\fP
Log all checked URLs (implies \fB-w\fP). Default is to log only invalid
URLs.
.TP
\fB-w\fP, \fB--warnings\fP
Log warnings.
.TP
\fB-W\fP \fIregex\fP, \fB--warning-regex=\fP\fIregex\fP
Define a regular expression which prints a warning if it matches any
content of the checked link.
This applies only to valid pages, because only their content can be
retrieved.
Use this to check for pages that contain some form of error, for example
'This page has moved' or 'Oracle Application Server error'.
This option implies \fB-w\fP.
.TP
\fB--warning-size-bytes=\fP\fIbytes\fP
Print a warning if content size is available and exceeds the given
number of \fIbytes\fP.
This option implies \fB-w\fP.
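.LP
As a sketch of how the two warning options above might be combined
(the host www.example.org and the 100000 byte limit are arbitrary
placeholders), warn about pages that contain the text 'This page has moved'
or whose content exceeds the given size:
\fBlinkchecker -W 'This page has moved' --warning-size-bytes=100000 www.example.org\fP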
.SH NOTES
A \fB!\fP before any regex negates it. So \fB'!^mailto:'\fP matches
everything but a mailto link.

LinkChecker's command line parser treats \fBftp.\fP links like \fBftp://ftp.\fP
and \fBwww.\fP links like \fBhttp://www.\fP.
You can also give local files as arguments.

If you have your system configured to automatically establish a
connection to the internet (e.g. with diald), it will connect when
checking links not pointing to your local host.
Use the \fB-s\fP and \fB-i\fP options to prevent this.

JavaScript links are currently ignored.

If your platform does not support threading, LinkChecker uses
\fB-t0\fP.

You can supply multiple user/password pairs in a configuration file.

To use proxies, set $http_proxy and $https_proxy on Unix or Windows.
On a Mac use the Internet Config.

When checking 'news:' links the given NNTP host doesn't need to be the
same as the host of the user browsing your pages!
.SH AUTHOR
Bastian Kleineidam <calvin@debian.org>