Merge pull request #371 from cjmayo/manhtml

Switch to mandoc for generating html man pages
anarcat 2020-04-24 18:59:10 -04:00 committed by GitHub
commit 87079312db
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
7 changed files with 1342 additions and 1456 deletions

@ -13,13 +13,17 @@ po4a:
man: $(MANHTMLFILES)
$(HTMLDIR)/man1/linkchecker.1.html: en/linkchecker.1 linkchecker.1.html.diff
man2html -r $< | tail -n +2 | sed 's/Time:.*//g' | sed 's@/:@/@g' > $@
patch --no-backup-if-mismatch --quiet $@ linkchecker.1.html.diff
$(HTMLDIR)/man1/linkchecker.1.html: en/linkchecker.1
mandoc -Thtml $< > $@
@sed -i -e \
's:<b>linkcheckerrc</b>(5):<a href="../man5/linkcheckerrc.5.html" class="Xr">linkcheckerrc(5)</a>:g' \
$(HTMLDIR)/man1/linkchecker.1.html
$(HTMLDIR)/man5/linkcheckerrc.5.html: en/linkcheckerrc.5 linkcheckerrc.5.html.diff
man2html -r $< | tail -n +2 | sed 's/Time:.*//g' | sed 's@/:@/@g' > $@
patch --no-backup-if-mismatch --quiet $@ linkcheckerrc.5.html.diff
$(HTMLDIR)/man5/linkcheckerrc.5.html: en/linkcheckerrc.5
mandoc -Thtml $< > $@
@sed -i -e \
's:<b>linkchecker</b>(1):<a href="../man1/linkchecker.1.html" class="Xr">linkchecker(1)</a>:g' \
$(HTMLDIR)/man5/linkcheckerrc.5.html
# check all makefiles for formatting warnings
check:
@ -32,4 +36,8 @@ check:
done; \
done
.PHONY: po4a man check
clean:
rm $(MANHTMLFILES)
.PHONY: po4a man check clean
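
Review note: the new rules run mandoc and then post-process the HTML with sed so that the bold cross-reference becomes a relative link. As a minimal sketch of what that substitution does, here is the same rewrite in Python. The function name and the sample fragment are hypothetical; only the pattern and the replacement text are taken from the Makefile rule above.

```python
import re

# Pattern from the sed expression in the linkchecker.1.html rule.
# The parentheses are escaped because Python treats them as a group,
# whereas sed's basic regular expressions take them literally.
XREF = re.compile(r"<b>linkcheckerrc</b>\(5\)")
LINK = '<a href="../man5/linkcheckerrc.5.html" class="Xr">linkcheckerrc(5)</a>'


def link_cross_references(html):
    """Replace every bold linkcheckerrc(5) mention with a relative link."""
    return XREF.sub(LINK, html)


# Hypothetical fragment, used only to exercise the function.
sample = "See <b>linkcheckerrc</b>(5) for more info."
print(link_cross_references(sample))
```

The linkcheckerrc.5.html rule applies the mirror-image substitution for linkchecker(1), so the same approach would work there with the names swapped.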

@ -1,12 +1,12 @@
.TH LINKCHECKER 1 2010-07-01 "LinkChecker" "LinkChecker commandline usage"
.TH LINKCHECKER 1 2020-04-24 "LinkChecker" "LinkChecker User Manual"
.SH NAME
linkchecker - command line client to check HTML documents and websites for broken links
.
linkchecker \- command line client to check HTML documents and websites for broken links
.SH SYNOPSIS
\fBlinkchecker\fP [\fIoptions\fP] [\fIfile-or-url\fP]...
.
.B linkchecker
.RI [ options ]
.RI [ file-or-url ]...
.SH DESCRIPTION
.LP
.TP 2
LinkChecker features
.IP \(bu
recursive and multithreaded checking,
@ -33,30 +33,30 @@ Antivirus check
.IP \(bu
a command line and web interface
.SH EXAMPLES
.TP 2
The most common use checks the given domain recursively:
\fBlinkchecker http://www.example.com/\fP
.B linkchecker http://www.example.com/
.br
Beware that this checks the whole site which can have thousands of URLs.
Use the \fB\-r\fP option to restrict the recursion depth.
.br
.TP
Don't check URLs with \fB/secret\fP in its name. All other links are checked as usual:
\fBlinkchecker \-\-ignore\-url=/secret mysite.example.com\fP
.br
.B linkchecker \-\-ignore\-url=/secret mysite.example.com
.TP
Checking a local HTML file on Unix:
\fBlinkchecker ../bla.html\fP
.br
.B linkchecker ../bla.html
.TP
Checking a local HTML file on Windows:
\fBlinkchecker c:\\temp\\test.html\fP
.br
.B linkchecker c:\\temp\\test.html
.TP
You can skip the \fBhttp://\fP url part if the domain starts with \fBwww.\fP:
\fBlinkchecker www.example.com\fP
.br
.B linkchecker www.example.com
.TP
You can skip the \fBftp://\fP url part if the domain starts with \fBftp.\fP:
\fBlinkchecker \-r0 ftp.example.com\fP
.br
.B linkchecker \-r0 ftp.example.com
.TP
Generate a sitemap graph and convert it with the graphviz dot utility:
\fBlinkchecker \-odot \-v www.example.com | dot \-Tps > sitemap.ps\fP
.
.B linkchecker \-odot \-v www.example.com | dot \-Tps > sitemap.ps
.SH OPTIONS
.SS General options
.TP
@ -85,7 +85,7 @@ Print available check plugins and exit.
\fB\-D\fP\fISTRING\fP, \fB\-\-debug=\fP\fISTRING\fP
Print debugging output for the given logger.
Available loggers are \fBcmdline\fP, \fBchecking\fP,
\fBcache\fP, \fBdns\fP, \fBplugins\fP and \fBall\fP.
\fBcache\fP, \fBdns\fP, \fBplugin\fP and \fBall\fP.
Specifying \fBall\fP is an alias for specifying all available loggers.
The option can be given multiple times to debug with more
than one logger.
@ -99,7 +99,8 @@ Output to a file \fBlinkchecker\-out.\fP\fITYPE\fP,
The \fIENCODING\fP specifies the output encoding, the default is
that of your locale.
Valid encodings are listed at
\fBhttp://docs.python.org/library/\:codecs.html#standard-encodings\fP.
.UR https://docs.python.org/library/codecs.html#standard-encodings
.UE .
.br
The \fIFILENAME\fP and \fIENCODING\fP parts of the \fBnone\fP output type
will be ignored, else if the file already exists, it will be overwritten.
@ -126,7 +127,8 @@ below.
.br
The \fIENCODING\fP specifies the output encoding, the default is
that of your locale. Valid encodings are listed at
\fBhttp://docs.python.org/library/\:codecs.html#standard-encodings\fP.
.UR https://docs.python.org/library/codecs.html#standard-encodings
.UE .
.TP
\fB\-q\fP, \fB\-\-quiet\fP
Quiet operation, an alias for \fB\-o none\fP.
@ -203,7 +205,9 @@ version of LinkChecker.
.SH "CONFIGURATION FILES"
Configuration files can specify all options above. They can also
specify some options that cannot be set on the command line.
See \fBlinkcheckerrc\fP(5) for more info.
See
.BR linkcheckerrc (5)
for more info.
.SH OUTPUT TYPES
Note that by default only errors and warnings are logged.
@ -236,7 +240,8 @@ Log check result as machine-readable XML.
.TP
\fBsitemap\fP
Log check result as an XML sitemap whose protocol is documented at
\fBhttp://www.sitemaps.org/protocol.html\fP.
.UR https://www.sitemaps.org/protocol.html
.UE .
.TP
\fBsql\fP
Log check result as SQL script with INSERT commands. An example
@ -252,7 +257,10 @@ Logs nothing. Suitable for debugging or checking the exit code.
.
.SH REGULAR EXPRESSIONS
LinkChecker accepts Python regular expressions.
See \fBhttp://docs.python.org/\:howto/regex.html\fP for an introduction.
See
.UR https://docs.python.org/howto/regex.html
.UE
for an introduction.
An addition is that a leading exclamation mark negates the regular
expression.
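
Review note: the negation rule in this hunk can be illustrated with a short Python sketch. The helper name is hypothetical, and the use of `re.search` (match anywhere in the URL) is an assumption made for the illustration; the man page only states that a leading exclamation mark negates the expression.

```python
import re


def url_matches(pattern, url):
    """Hypothetical helper: apply a pattern, honouring a leading '!' as negation."""
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]  # drop the '!' before compiling the expression
    found = re.search(pattern, url) is not None  # assumption: search, not full match
    return found != negate


print(url_matches("/secret", "http://mysite.example.com/secret/page"))
print(url_matches("!/secret", "http://mysite.example.com/secret/page"))
```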
@ -276,15 +284,15 @@ Multiple entries are separated by a blank line.
The example below will send two cookies to all URLs starting with
\fBhttp://example.com/hello/\fP and one to all URLs starting
with \fBhttps://example.org/\fP:
Host: example.com
Path: /hello
Set-cookie: ID="smee"
Set-cookie: spam="egg"
Host: example.org
Set-cookie: baggage="elitist"; comment="hologram"
.EX
Host: example.com
Path: /hello
Set-cookie: ID="smee"
Set-cookie: spam="egg"
.PP
Host: example.org
Set-cookie: baggage="elitist"; comment="hologram"
.EE
.SH PROXY SUPPORT
To use a proxy on Unix or Windows set the $http_proxy, $https_proxy or $ftp_proxy
environment variables to the proxy URL. The URL should be of the form
@ -292,29 +300,27 @@ environment variables to the proxy URL. The URL should be of the form
LinkChecker also detects manual proxy settings of Internet Explorer under
Windows systems, and gconf or KDE on Linux systems.
On a Mac use the Internet Config to select a proxy.
.
.PP
You can also set a comma-separated domain list in the $no_proxy environment
variables to ignore any proxy settings for these domains.
.
.TP
Setting a HTTP proxy on Unix for example looks like this:
export http_proxy="http://proxy.example.com:8080"
.B
export http_proxy="http://proxy.example.com:8080"
.TP
Proxy authentication is also supported:
export http_proxy="http://user1:mypass@proxy.example.org:8081"
.B
export http_proxy="http://user1:mypass@proxy.example.org:8081"
.TP
Setting a proxy on the Windows command prompt:
set http_proxy=http://proxy.example.com:8080
.B
set http_proxy=http://proxy.example.com:8080
.SH PERFORMED CHECKS
All URLs have to pass a preliminary syntax test. Minor quoting
mistakes will issue a warning, all other invalid syntax issues
are errors.
After the syntax check passes, the URL is queued for connection
checking. All connection check types are described below.
.
.TP
HTTP links (\fBhttp:\fP, \fBhttps:\fP)
After connecting to the given HTTP server the given path
@ -322,75 +328,78 @@ or query is requested. All redirections are followed, and
if user/password is given it will be used as authorization
when necessary.
All final HTTP status codes other than 2xx are errors.
.
.IP
HTML page contents are checked for recursion.
.TP
Local files (\fBfile:\fP)
A regular, readable file that can be opened is valid. A readable
directory is also valid. All other files, for example device files,
unreadable or non-existing files are errors.
.
.IP
HTML or other parseable file contents are checked for recursion.
.TP
Mail links (\fBmailto:\fP)
A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list will fail.
For each mail address we check the following things:
.
1) Check the adress syntax, both of the part before and after
the @ sign.
2) Look up the MX DNS records. If we found no MX record,
print an error.
3) Check if one of the mail hosts accept an SMTP connection.
Check hosts with higher priority first.
If no host accepts SMTP, we print a warning.
4) Try to verify the address with the VRFY command. If we got
an answer, print the verified address as an info.
.br
1) Check the address syntax, both of the part before and after the @ sign.
.br
2) Look up the MX DNS records. If we found no MX record, print an error.
.br
3) Check if one of the mail hosts accept an SMTP connection.
Check hosts with higher priority first.
If no host accepts SMTP, we print a warning.
.br
4) Try to verify the address with the VRFY command. If we got an answer,
print the verified address as an info.
.TP
FTP links (\fBftp:\fP)
For FTP links we do:
1) connect to the specified host
2) try to login with the given user and password. The default
user is ``anonymous``, the default password is ``anonymous@``.
3) try to change to the given directory
4) list the file with the NLST command
For FTP links we do:
.br
1) connect to the specified host
.br
2) try to login with the given user and password. The default
user is \fBanonymous\fP, the default password is \fBanonymous@\fP.
.br
3) try to change to the given directory
.br
4) list the file with the NLST command
.TP
Telnet links (``telnet:``)
We try to connect and if user/password are given, login to the
given telnet server.
Telnet links (\fBtelnet:\fP)
We try to connect and if user/password are given, login to the
given telnet server.
.TP
NNTP links (``news:``, ``snews:``, ``nntp``)
We try to connect to the given NNTP server. If a news group or
article is specified, try to request it from the server.
NNTP links (\fBnews:\fP, \fBsnews:\fP, \fBnntp\fP)
We try to connect to the given NNTP server. If a news group or
article is specified, try to request it from the server.
.TP
Unsupported links (``javascript:``, etc.)
An unsupported link will only print a warning. No further checking
will be made.
The complete list of recognized, but unsupported links can be found
in the \fBlinkcheck/checker/unknownurl.py\fP source file.
The most prominent of them should be JavaScript links.
Unsupported links (\fBjavascript:\fP, etc.)
An unsupported link will only print a warning. No further checking
will be made.
.IP
The complete list of recognized, but unsupported links can be found
in the
.UR https://github.com/linkchecker/linkchecker/blob/master/linkcheck/checker/unknownurl.py
linkcheck/checker/unknownurl.py
.UE
source file.
The most prominent of them should be JavaScript links.
.SH PLUGINS
There are two plugin types: connection and content plugins.
.
Connection plugins are run after a successful connection to the
URL host.
.
Content plugins are run if the URL type has content
(mailto: URLs have no content for example) and if the check is not
forbidden (ie. by HTTP robots.txt).
.
.PP
See \fBlinkchecker \-\-list\-plugins\fP for a list of plugins and
their documentation. All plugins are enabled via the \fBlinkcheckerrc\fP(5)
their documentation. All plugins are enabled via the
.BR linkcheckerrc (5)
configuration file.
.SH RECURSION
@ -455,11 +464,11 @@ same as the host of the user browsing your pages.
.
.SH RETURN VALUE
The return value is 2 when
.IP \(bu
.IP \(bu 2
a program error occurred.
.PP
The return value is 1 when
.IP \(bu
.IP \(bu 2
invalid links were found or
.IP \(bu
link warnings were found and warnings are enabled
@ -478,12 +487,16 @@ might slow down the program or even the whole system.
.br
\fBlinkchecker\-out.\fP\fITYPE\fP - default logger file output name
.br
\fBhttp://docs.python.org/library/codecs.html#standard-encodings\fP - valid output encodings
.UR https://docs.python.org/library/codecs.html#standard-encodings
.UE
\- valid output encodings
.br
\fBhttp://docs.python.org/howto/regex.html\fP - regular expression documentation
.UR https://docs.python.org/howto/regex.html
.UE
\- regular expression documentation
.SH "SEE ALSO"
\fBlinkcheckerrc\fP(5)
.BR linkcheckerrc (5)
.
.SH AUTHOR
Bastian Kleineidam <bastian.kleineidam@web.de>
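
Review note: the cookie file example that this diff wraps in .EX/.EE has a simple layout, with entries separated by a blank line and one header per line. The following is a rough Python sketch of reading that layout. It is not LinkChecker's own parser: the function name and the dictionary keys are hypothetical, and a missing Path is left as None rather than guessing a default.

```python
import re

SAMPLE = """\
Host: example.com
Path: /hello
Set-cookie: ID="smee"
Set-cookie: spam="egg"

Host: example.org
Set-cookie: baggage="elitist"; comment="hologram"
"""


def parse_cookie_entries(text):
    """Split the text on blank lines and collect the headers of each entry."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        entry = {"host": None, "path": None, "cookies": []}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            name = name.strip().lower()
            value = value.strip()
            if name == "host":
                entry["host"] = value
            elif name == "path":
                entry["path"] = value
            elif name == "set-cookie":
                entry["cookies"].append(value)
        entries.append(entry)
    return entries


for entry in parse_cookie_entries(SAMPLE):
    print(entry["host"], entry["path"], len(entry["cookies"]))
```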

@ -1,4 +1,4 @@
.TH linkcheckerrc 5 2007-11-30 "LinkChecker"
.TH LINKCHECKERRC 5 2020-04-24 "LinkChecker" "LinkChecker User Manual"
.SH NAME
linkcheckerrc - configuration file for LinkChecker
.
@ -13,7 +13,8 @@ The default file location is \fB~/.linkchecker/linkcheckerrc\fP on Unix,
.TP
\fBcookiefile=\fP\fIfilename\fP
Read a file with initial cookie data. The cookie data
format is explained in linkchecker(1).
format is explained in
.BR linkchecker (1).
.br
Command line option: \fB\-\-cookiefile\fP
.TP
@ -188,7 +189,8 @@ below.
.br
The \fIENCODING\fP specifies the output encoding, the default is
that of your locale. Valid encodings are listed at
\fBhttp://docs.python.org/library/codecs.html#standard-encodings\fP.
.UR https://docs.python.org/library/codecs.html#standard-encodings
.UE .
.br
Command line option: \fB\-\-output\fP
.TP
@ -228,7 +230,8 @@ Command line option: none
.TP
\fBencoding=\fP\fISTRING\fP
Valid encodings are listed in
\fBhttp://docs.python.org/library/codecs.html#standard-encodings\fP.
.UR https://docs.python.org/library/codecs.html#standard-encodings
.UE .
.br
Default encoding is \fBiso\-8859\-15\fP.
.TP
@ -404,42 +407,47 @@ priority for the first URL is 1.0, for all child URLs 0.5.
How frequently pages are changing.
.
.SH "LOGGER PARTS"
\fBall\fP (for all parts)
\fBid\fP (a unique ID for each logentry)
\fBrealurl\fP (the full url link)
\fBresult\fP (valid or invalid, with messages)
\fBextern\fP (1 or 0, only in some logger types reported)
\fBbase\fP (base href=...)
\fBname\fP (<a href=...>name</a> and <img alt="name">)
\fBparenturl\fP (if any)
\fBinfo\fP (some additional info, e.g. FTP welcome messages)
\fBwarning\fP (warnings)
\fBdltime\fP (download time)
\fBchecktime\fP (check time)
\fBurl\fP (the original url name, can be relative)
\fBintro\fP (the blurb at the beginning, "starting at ...")
\fBoutro\fP (the blurb at the end, "found x errors ...")
.TS
nokeep, tab(@);
ll.
\fBall\fP@(for all parts)
\fBid\fP@(a unique ID for each logentry)
\fBrealurl\fP@(the full url link)
\fBresult\fP@(valid or invalid, with messages)
\fBextern\fP@(1 or 0, only in some logger types reported)
\fBbase\fP@(base href=...)
\fBname\fP@(<a href=...>name</a> and <img alt="name">)
\fBparenturl\fP@(if any)
\fBinfo\fP@(some additional info, e.g. FTP welcome messages)
\fBwarning\fP@(warnings)
\fBdltime\fP@(download time)
\fBchecktime\fP@(check time)
\fBurl\fP@(the original url name, can be relative)
\fBintro\fP@(the blurb at the beginning, "starting at ...")
\fBoutro\fP@(the blurb at the end, "found x errors ...")
.TE
.SH MULTILINE
Some option values can span multiple lines. Each line has to be indented
for that to work. Lines starting with a hash (\fB#\fP) will be ignored,
though they must still be indented.
ignore=
lconline
bookmark
# a comment
^mailto:
.
.EX
ignore=
lconline
bookmark
# a comment
^mailto:
.EE
.SH EXAMPLE
[output]
log=html
[checking]
threads=5
[filtering]
ignorewarnings=http-moved-permanent
.EX
[output]
log=html
.PP
[checking]
threads=5
.PP
[filtering]
ignorewarnings=http-moved-permanent
.EE
.SH PLUGINS
All plugins have a separate section. If the section
appears in the configuration file the plugin is enabled.
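
Review note: the MULTILINE, EXAMPLE and PLUGINS text in this hunk describes an INI-style file with indented continuation lines. Here is a minimal sketch of reading such a file with Python's standard configparser module. It assumes the file is compatible with configparser, places the `ignore` option under `[filtering]` for the sake of the example, and adds an empty `[HtmlSyntaxCheck]` section to show the "section present means plugin enabled" rule. None of this is presented as LinkChecker's actual loading code.

```python
import configparser

SAMPLE = """\
[output]
log=html

[checking]
threads=5

[filtering]
ignorewarnings=http-moved-permanent
ignore=
  lconline
  bookmark
  # a comment
  ^mailto:

[HtmlSyntaxCheck]
"""

parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)

# Indented lines continue the value; the indented comment line is skipped.
ignore = [line for line in parser.get("filtering", "ignore").splitlines() if line]
threads = parser.getint("checking", "threads")
# Assumption for illustration: a plugin counts as enabled when its section exists.
html_check_enabled = parser.has_section("HtmlSyntaxCheck")

print(ignore, threads, html_check_enabled)
```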
@ -475,7 +483,9 @@ Configures the expiration warning time in days.
.SS \fB[HtmlSyntaxCheck]\fP
Check the syntax of HTML pages with the online W3C HTML validator.
See http://validator.w3.org/docs/api.html.
See
.UR https://validator.w3.org/docs/api.html
.UE .
.SS \fB[HttpHeaderInfo]\fP
Print HTTP headers in URL info.
@ -486,7 +496,9 @@ to display all HTTP headers that start with "X-".
.SS \fB[CssSyntaxCheck]\fP
Check the syntax of HTML pages with the online W3C CSS validator.
See http://jigsaw.w3.org/css-validator/manual.html#expert.
See
.UR https://jigsaw.w3.org/css-validator/manual.html#expert
.UE .
.SS \fB[VirusCheck]\fP
Checks the page content for virus infections with clamav.
@ -551,7 +563,7 @@ The IP is obfuscated.
The URL contains leading or trailing whitespace.
.SH "SEE ALSO"
linkchecker(1)
.BR linkchecker (1)
.
.SH AUTHOR
Bastian Kleineidam <bastian.kleineidam@web.de>

@ -1,80 +0,0 @@
--- linkchecker.1.html.orig 2011-06-14 21:14:55.016011206 +0200
+++ linkchecker.1.html 2011-06-14 21:17:07.108913849 +0200
@@ -38,7 +38,7 @@
The most common use checks the given domain recursively, plus any
URL pointing outside of the domain:
-<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;<A HREF="http://www.example.net/">http://www.example.net/</A></B>
+<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;http://www.example.net/</B>
<BR>
Beware that this checks the whole site which can have thousands of URLs.
@@ -59,15 +59,15 @@
<BR>
You can skip the <B>http://</B> url part if the domain starts with <B>www.</B>:
-<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;<A HREF="http://www.example.com">www.example.com</A></B>
+<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;www.example.com</B>
<BR>
You can skip the <B>ftp://</B> url part if the domain starts with <B>ftp.</B>:
-<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;-r0&nbsp;<A HREF="ftp://ftp.example.org">ftp.example.org</A></B>
+<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;-r0&nbsp;ftp.example.org</B>
<BR>
Generate a sitemap graph and convert it with the graphviz dot utility:
-<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;-odot&nbsp;-v&nbsp;<A HREF="http://www.example.com">www.example.com</A>&nbsp;|&nbsp;dot&nbsp;-Tps&nbsp;&gt;&nbsp;sitemap.ps</B>
+<BR>&nbsp;&nbsp;<B>linkchecker&nbsp;-odot&nbsp;-v&nbsp;www.example.com&nbsp;|&nbsp;dot&nbsp;-Tps&nbsp;&gt;&nbsp;sitemap.ps</B>
<A NAME="lbAF">&nbsp;</A>
<H2>OPTIONS</H2>
@@ -302,8 +302,8 @@
Multiple entries are separated by a blank line.
The example below will send two cookies to all URLs starting with
-<B><A HREF="http://example.com/hello/">http://example.com/hello/</A></B> and one to all URLs starting
-with <B><A HREF="https://example.org/">https://example.org/</A></B>:
+<B>http://example.com/hello/</B> and one to all URLs starting
+with <B>https://example.org/</B>:
<P>
<BR>&nbsp;Host:&nbsp;example.com
<BR>&nbsp;Path:&nbsp;/hello
@@ -326,15 +326,15 @@
variables to ignore any proxy settings for these domains.
Setting a HTTP proxy on Unix for example looks like this:
<P>
-<BR>&nbsp;&nbsp;export&nbsp;http_proxy=&quot;<A HREF="http://proxy.example.com:8080">http://proxy.example.com:8080</A>&quot;
+<BR>&nbsp;&nbsp;export&nbsp;http_proxy=&quot;http://proxy.example.com:8080&quot;
<P>
Proxy authentication is also supported:
<P>
-<BR>&nbsp;&nbsp;export&nbsp;http_proxy=&quot;<A HREF="http://user1:mypass@proxy.example.org:8081">http://user1:mypass@proxy.example.org:8081</A>&quot;
+<BR>&nbsp;&nbsp;export&nbsp;http_proxy=&quot;http://user1:mypass@proxy.example.org:8081&quot;
<P>
Setting a proxy on the Windows command prompt:
<P>
-<BR>&nbsp;&nbsp;set&nbsp;http_proxy=<A HREF="http://proxy.example.com:8080">http://proxy.example.com:8080</A>
+<BR>&nbsp;&nbsp;set&nbsp;http_proxy=http://proxy.example.com:8080
<P>
<A NAME="lbAO">&nbsp;</A>
<H2>PERFORMED CHECKS</H2>
@@ -470,8 +470,8 @@
<H2>NOTES</H2>
URLs on the commandline starting with <B>ftp.</B> are treated like
-<B><A HREF="ftp://ftp.">ftp://ftp.</A></B>, URLs starting with <B>www.</B> are treated like
-<B><A HREF="http://www.">http://www.</A></B>.
+<B>ftp://ftp.</B>, URLs starting with <B>www.</B> are treated like
+<B>http://www.</B>.
You can also give local files as arguments.
<P>
If you have your system configured to automatically establish a
@@ -584,7 +584,7 @@
</DL>
<HR>
This document was created by
-<A HREF="/cgi-bin/man/man2html">man2html</A>,
+man2html,
using the manual pages.<BR>
</BODY>
</HTML>

@ -1,11 +0,0 @@
--- linkcheckerrc.5.html.orig 2011-06-15 06:38:09.830998286 +0200
+++ linkcheckerrc.5.html 2011-06-15 06:38:18.327310373 +0200
@@ -487,7 +487,7 @@
</DL>
<HR>
This document was created by
-<A HREF="/cgi-bin/man/man2html">man2html</A>,
+man2html,
using the manual pages.<BR>
</BODY>

File diff suppressed because it is too large

File diff suppressed because it is too large