Use direct HTML documentation for the GUI client; moved the homepage content to a separate package.
doc/index.html (new file)
@@ -0,0 +1,227 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

<title>Check websites for broken links</title>
<link rel="stylesheet" href="_static/sphinxdoc.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<link rel="top" title="LinkChecker" href="" />
<style type="text/css">
img { border: 0; }
</style>

</head>
<body>
<div style="background-color: white; text-align: left; padding: 10px 10px 15px 15px">

<table border="0"><tr>
<td><a href=""><img
src="_static/logo64x64.png" border="0" alt="LinkChecker"/></a></td>
<td><h1>LinkChecker</h1></td>
</tr></table>
</div>

<div class="document">
<div class="documentwrapper">
<div class="body">

<div class="section" id="check-websites-for-broken-links">
<h1>Check websites for broken links</h1>
<p>LinkChecker is a free, <a class="reference external" href="http://www.gnu.org/licenses/gpl-2.0.html">GPL</a> licensed URL validator.</p>
<div class="section" id="basic-usage">
<h2>Basic usage</h2>
<p>To check a URL like <tt class="docutils literal"><span class="pre">http://www.myhomepage.org/</span></tt> it is enough to
execute <tt class="docutils literal"><span class="pre">linkchecker</span> <span class="pre">http://www.myhomepage.org/</span></tt>. This will check the
complete domain of www.myhomepage.org recursively. All links pointing
outside of the domain are also checked for validity.</p>
</div>
<div class="section" id="performed-checks">
<h2>Performed checks</h2>
<p>All URLs have to pass a preliminary syntax test. Minor quoting
mistakes issue a warning; all other invalid syntax issues
are errors.
After the syntax check passes, the URL is queued for connection
checking. All connection check types are described below.</p>
<ul>
<li><p class="first">HTTP links (<tt class="docutils literal"><span class="pre">http:</span></tt>, <tt class="docutils literal"><span class="pre">https:</span></tt>)</p>
<p>After connecting to the given HTTP server, the given path
or query is requested. All redirections are followed, and
if a user/password is given, it is used for authorization
when necessary.
Permanently moved pages issue a warning.
Any final HTTP status code other than 2xx is an error.</p>
</li>
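The HTTP rules above can be sketched in a few lines of Python. This is a simplified illustration, not LinkChecker's actual implementation; check_http and classify_status are hypothetical names, and the permanent-redirect flag is passed in by hand because urllib hides intermediate responses:

```python
import urllib.request
import urllib.error

def classify_status(status, permanently_moved=False):
    # Mirror the rules above: a final 2xx status is valid, but a permanent
    # redirect seen along the way downgrades the result to a warning;
    # every other final status is an error.
    if 200 <= status < 300:
        return "warning" if permanently_moved else "valid"
    return "error"

def check_http(url):
    # urllib follows redirects automatically, so only the final
    # status code is visible here.
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        return classify_status(err.code)
    except urllib.error.URLError:
        return "error"
```

Detecting the intermediate 301 in real code would need a custom redirect handler.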
<li><p class="first">Local files (<tt class="docutils literal"><span class="pre">file:</span></tt>)</p>
<p>A regular, readable file that can be opened is valid. A readable
directory is also valid. All other files, for example device files,
unreadable or non-existing files, are errors.</p>
<p>File contents are checked for recursion.</p>
</li>
<li><p class="first">Mail links (<tt class="docutils literal"><span class="pre">mailto:</span></tt>)</p>
<p>A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list fails.
For each mail address we check the following things:</p>
<ol class="arabic simple">
<li>Check the address syntax, both the part before and the part after
the @ sign.</li>
<li>Look up the MX DNS records. If no MX record is found,
print an error.</li>
<li>Check if one of the mail hosts accepts an SMTP connection.
Hosts with higher priority are checked first.
If no host accepts SMTP, print a warning.</li>
<li>Try to verify the address with the VRFY command. If an answer
is received, print the verified address as an info.</li>
</ol>
</li>
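The mail-check steps can be outlined in Python. This is a hedged sketch, not LinkChecker's code: step 1 uses a deliberately minimal pattern (full RFC 2822 address syntax is far more complex), and steps 3-4 use the standard library's smtplib. Step 2 (looking up MX records, highest priority first) is omitted because it has no stdlib equivalent; a DNS library such as dnspython would be needed.

```python
import re
import smtplib

def check_address_syntax(addr):
    # Step 1: check the part before and the part after the @ sign.
    # A minimal approximation of the real syntax rules.
    if addr.count("@") != 1:
        return False
    local, domain = addr.split("@")
    return bool(local) and re.match(r"^[A-Za-z0-9.-]+\.[A-Za-z]{2,}$", domain) is not None

def check_smtp_host(host):
    # Steps 3-4: try an SMTP connection and the (often disabled)
    # VRFY command; any answer means the host accepted SMTP.
    try:
        with smtplib.SMTP(host, timeout=10) as smtp:
            code, message = smtp.verify("postmaster")
            return True
    except (OSError, smtplib.SMTPException):
        return False
```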
<li><p class="first">FTP links (<tt class="docutils literal"><span class="pre">ftp:</span></tt>)</p>
<p>For FTP links we:</p>
<ol class="arabic simple">
<li>connect to the specified host,</li>
<li>try to log in with the given user and password (the default
user is <tt class="docutils literal"><span class="pre">anonymous</span></tt>, the default password is <tt class="docutils literal"><span class="pre">anonymous@</span></tt>),</li>
<li>try to change to the given directory,</li>
<li>list the file with the NLST command.</li>
</ol>
</li>
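With the standard library's ftplib, the four steps map almost one-to-one onto API calls. A sketch under the assumption that the link's path splits into a directory part and a file name; check_ftp and split_ftp_path are illustrative names, not LinkChecker's:

```python
import ftplib

def split_ftp_path(path):
    # Split "pub/docs/readme.txt" into ("pub/docs", "readme.txt").
    directory, _, name = path.rpartition("/")
    return directory, name

def check_ftp(host, path, user="anonymous", password="anonymous@"):
    directory, name = split_ftp_path(path)
    try:
        with ftplib.FTP(host, timeout=10) as ftp:  # step 1: connect
            ftp.login(user, password)              # step 2: log in
            if directory:
                ftp.cwd(directory)                 # step 3: change directory
            return name in ftp.nlst()              # step 4: list via NLST
    except ftplib.all_errors:
        return False
```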
<li><p class="first">Telnet links (<tt class="docutils literal"><span class="pre">telnet:</span></tt>)</p>
<p>We try to connect and, if a user/password is given, log in to the
given telnet server.</p>
</li>
<li><p class="first">NNTP links (<tt class="docutils literal"><span class="pre">news:</span></tt>, <tt class="docutils literal"><span class="pre">snews:</span></tt>, <tt class="docutils literal"><span class="pre">nntp</span></tt>)</p>
<p>We try to connect to the given NNTP server. If a news group or
article is specified, we try to request it from the server.</p>
</li>
<li><p class="first">Ignored links (<tt class="docutils literal"><span class="pre">javascript:</span></tt>, etc.)</p>
<p>An ignored link only prints a warning. No further checking
is done.</p>
<p>Here is the complete list of recognized but ignored links. The most
prominent of them are JavaScript links.</p>
<ul class="simple">
<li><tt class="docutils literal"><span class="pre">acap:</span></tt> (application configuration access protocol)</li>
<li><tt class="docutils literal"><span class="pre">afs:</span></tt> (Andrew File System global file names)</li>
<li><tt class="docutils literal"><span class="pre">chrome:</span></tt> (Mozilla specific)</li>
<li><tt class="docutils literal"><span class="pre">cid:</span></tt> (content identifier)</li>
<li><tt class="docutils literal"><span class="pre">clsid:</span></tt> (Microsoft specific)</li>
<li><tt class="docutils literal"><span class="pre">data:</span></tt> (data)</li>
<li><tt class="docutils literal"><span class="pre">dav:</span></tt> (dav)</li>
<li><tt class="docutils literal"><span class="pre">fax:</span></tt> (fax)</li>
<li><tt class="docutils literal"><span class="pre">find:</span></tt> (Mozilla specific)</li>
<li><tt class="docutils literal"><span class="pre">gopher:</span></tt> (Gopher)</li>
<li><tt class="docutils literal"><span class="pre">imap:</span></tt> (internet message access protocol)</li>
<li><tt class="docutils literal"><span class="pre">isbn:</span></tt> (ISBN (int. book numbers))</li>
<li><tt class="docutils literal"><span class="pre">javascript:</span></tt> (JavaScript)</li>
<li><tt class="docutils literal"><span class="pre">ldap:</span></tt> (Lightweight Directory Access Protocol)</li>
<li><tt class="docutils literal"><span class="pre">mailserver:</span></tt> (Access to data available from mail servers)</li>
<li><tt class="docutils literal"><span class="pre">mid:</span></tt> (message identifier)</li>
<li><tt class="docutils literal"><span class="pre">mms:</span></tt> (multimedia stream)</li>
<li><tt class="docutils literal"><span class="pre">modem:</span></tt> (modem)</li>
<li><tt class="docutils literal"><span class="pre">nfs:</span></tt> (network file system protocol)</li>
<li><tt class="docutils literal"><span class="pre">opaquelocktoken:</span></tt> (opaquelocktoken)</li>
<li><tt class="docutils literal"><span class="pre">pop:</span></tt> (Post Office Protocol v3)</li>
<li><tt class="docutils literal"><span class="pre">prospero:</span></tt> (Prospero Directory Service)</li>
<li><tt class="docutils literal"><span class="pre">rsync:</span></tt> (rsync protocol)</li>
<li><tt class="docutils literal"><span class="pre">rtsp:</span></tt> (real time streaming protocol)</li>
<li><tt class="docutils literal"><span class="pre">service:</span></tt> (service location)</li>
<li><tt class="docutils literal"><span class="pre">shttp:</span></tt> (secure HTTP)</li>
<li><tt class="docutils literal"><span class="pre">sip:</span></tt> (session initiation protocol)</li>
<li><tt class="docutils literal"><span class="pre">tel:</span></tt> (telephone)</li>
<li><tt class="docutils literal"><span class="pre">tip:</span></tt> (Transaction Internet Protocol)</li>
<li><tt class="docutils literal"><span class="pre">tn3270:</span></tt> (Interactive 3270 emulation sessions)</li>
<li><tt class="docutils literal"><span class="pre">vemmi:</span></tt> (versatile multimedia interface)</li>
<li><tt class="docutils literal"><span class="pre">wais:</span></tt> (Wide Area Information Servers)</li>
<li><tt class="docutils literal"><span class="pre">z39.50r:</span></tt> (Z39.50 Retrieval)</li>
<li><tt class="docutils literal"><span class="pre">z39.50s:</span></tt> (Z39.50 Session)</li>
</ul>
</li>
</ul>
</div>
<div class="section" id="recursion">
<h2>Recursion</h2>
<p>Before descending recursively into a URL, it has to fulfill several
conditions. They are checked in this order:</p>
<ol class="arabic simple">
<li>The URL must be valid.</li>
<li>The URL must be parseable. This currently includes HTML files,
Opera bookmark files, and directories. If a file type cannot
be determined (for example, it does not have a common HTML file
extension and the content does not look like HTML), it is assumed
to be non-parseable.</li>
<li>The URL content must be retrievable. This is usually the case,
except for example for mailto: or unknown URL types.</li>
<li>The maximum recursion level must not be exceeded. It is configured
with the <tt class="docutils literal"><span class="pre">--recursion-level</span></tt> option and is unlimited by default.</li>
<li>It must not match the ignored URL list. This is controlled with
the <tt class="docutils literal"><span class="pre">--ignore-url</span></tt> option.</li>
<li>The Robots Exclusion Protocol must allow links in the URL to be
followed recursively. This is checked by searching for a
“nofollow” directive in the HTML header data.</li>
</ol>
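The ordered conditions translate naturally into a short-circuiting predicate. A sketch, with should_recurse as a hypothetical name and boolean inputs standing in for the real checks:

```python
def should_recurse(url, valid, parseable, retrievable,
                   level, max_level, ignored_patterns, robots_allowed):
    # Conditions 1-3: the URL must be valid, parseable, and retrievable.
    if not (valid and parseable and retrievable):
        return False
    # Condition 4: max_level is None when --recursion-level is unset (unlimited).
    if max_level is not None and level > max_level:
        return False
    # Condition 5: --ignore-url patterns.
    if any(pattern in url for pattern in ignored_patterns):
        return False
    # Condition 6: the Robots Exclusion Protocol ("nofollow").
    return robots_allowed
```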
<p>Note that directory recursion reads all files in that
directory, not just a subset like <tt class="docutils literal"><span class="pre">index.htm*</span></tt>.</p>
</div>
<div class="section" id="frequently-asked-questions">
<h2>Frequently asked questions</h2>
<p><strong>Q: LinkChecker produced an error, but my web page is ok with
Mozilla/IE/Opera/...
Is this a bug in LinkChecker?</strong></p>
<p>A: Please check your web pages first. Are they really ok?
Use the <tt class="docutils literal"><span class="pre">--check-html</span></tt> option, or check whether you are using a proxy
that produces the error.</p>
<p><strong>Q: I still get an error, but the page is definitely ok.</strong></p>
<p>A: Some servers deny access to automated tools (also called robots)
like LinkChecker. This is not a bug in LinkChecker but rather a
policy of the webmaster running the website you are checking. Look
at the <tt class="docutils literal"><span class="pre">/robots.txt</span></tt> file, which follows the <a class="reference external" href="http://www.robotstxt.org/wc/norobots-rfc.html">robots.txt exclusion standard</a>.</p>
<p><strong>Q: How can I tell LinkChecker which proxy to use?</strong></p>
<p>A: LinkChecker works transparently with proxies. In a Unix or Windows
environment, set the http_proxy, https_proxy, or ftp_proxy environment
variables to a URL that identifies the proxy server before starting
LinkChecker. For example:</p>
<div class="highlight-python"><pre>$ http_proxy="http://www.someproxy.com:3128"
$ export http_proxy</pre>
</div>
<p><strong>Q: The link “mailto:john@company.com?subject=Hello John” is reported
as an error.</strong></p>
<p>A: You have to quote special characters (e.g. spaces) in the subject field.
The correct link should be “mailto:...?subject=Hello%20John”.
Unfortunately, browsers like IE and Netscape do not enforce this.</p>
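The quoting can be done with urllib.parse.quote from Python's standard library; the helper name below is illustrative, not part of LinkChecker:

```python
from urllib.parse import quote

def quote_mailto_subject(address, subject):
    # Percent-encode the subject so spaces and other special
    # characters are legal inside the mailto: URL.
    return "mailto:%s?subject=%s" % (address, quote(subject))
```

For example, quote_mailto_subject("john@company.com", "Hello John") produces a correctly quoted link.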
<p><strong>Q: Does LinkChecker have JavaScript support?</strong></p>
<p>A: No, and it never will. If your page does not work without JavaScript,
it is better checked with a browser testing tool like <a class="reference external" href="http://seleniumhq.org/">Selenium</a>.</p>
<p><strong>Q: Is LinkChecker’s cookie feature insecure?</strong></p>
<p>A: If a cookie file is specified, the information will be sent
to the specified hosts.
The following restrictions apply to LinkChecker cookies:</p>
<ul class="simple">
<li>Cookies are only sent to the originating server.</li>
<li>Cookies are only stored in memory. After LinkChecker finishes, they
are lost.</li>
<li>The cookie feature is disabled by default.</li>
</ul>
<p><strong>Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?</strong></p>
<p>A: LinkChecker follows the <a class="reference external" href="http://www.robotstxt.org/wc/norobots-rfc.html">robots.txt exclusion standard</a>. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the <a class="reference external" href="http://www.robotstxt.org/wc/robots.html">Web Robot pages</a> and the <a class="reference external" href="http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt">Spidering report</a> for more info.</p>
<p><strong>Q: How do I print unreachable/dead documents of my website with
LinkChecker?</strong></p>
<p>A: This is not possible. It would require file system access to your web
repository and access to your web server configuration.</p>
<p><strong>Q: How do I check HTML/XML/CSS syntax with LinkChecker?</strong></p>
<p>A: Use the <tt class="docutils literal"><span class="pre">--check-html</span></tt> and <tt class="docutils literal"><span class="pre">--check-css</span></tt> options.</p>
</div>
</div>


</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
© Copyright 2009, Bastian Kleineidam.
</div>
</body>
</html>
@@ -1,291 +0,0 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright (C) 2007-2009 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
"""
A script to losslessly compress media files for use in production
deployments of web software. Used together with HTML compression,
it decreases the size of transmitted data considerably.

Currently supported media files:
Type        Extension  Compressor(s)
========================================================
JavaScript  .js        YUI compressor (a Java program)
CSS         .css       YUI compressor (a Java program)
PNG         .png       pngcrush (a C program)
JPEG        .jpg       jpegtran (a C program)
GIF         .gif       giftrans (a C program)

It compresses all supported media files to new files. The original
files are not changed unless explicitly requested.

Compressed files are named <filebase>-min.<ext> where <filebase> is
everything up to the last dot and <ext> is everything after the last dot.
If requested, the original file will be overwritten with the compressed one.

A directory will be searched recursively and all media files within
will be compressed.

Files are only compressed when the compressed file is missing or the
original file is newer than the compressed file.
"""
import sys
import os
import getopt
import stat
import shutil
from distutils.spawn import spawn, find_executable
from distutils.errors import DistutilsExecError
import distutils.log
distutils.log.set_verbosity(1)


# file extensions of compressable files
COMPRESS_EXTENSIONS = (".js", ".css", ".png", ".jpg", ".gif")


def log (*args):
    "Print the given arguments to sys.stderr."
    for arg in args:
        print >> sys.stderr, arg,
    print >> sys.stderr


def usage (msg=None):
    """
    Print usage information to sys.stderr and call sys.exit().
    The exit code is zero if msg is None, else one.
    """
    if msg is None:
        err = 0
    else:
        print >> sys.stderr, msg
        err = 1
    thisfile = os.path.basename(__file__)
    log("Usage:", thisfile, "[options]", "<file-or-directory>...")
    log("Options:")
    log(" --js-compressor - Specify the JavaScript compressor " \
        "(default: yuicompressor.jar)")
    log(" --exclude - Specify (part of) filenames to ignore")
    log(" --overwrite - Comma-separated list of file extensions to overwrite")
    log(" --help - Display help")
    sys.exit(err)


class DirectoryWalker:
    "Iterate over all files below a directory, descending into subdirectories."

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                return fullname


def is_compressable (settings, filename):
    "Check if the given filename is compressable."
    # is it excluded?
    if [x for x in settings["exclude"] if x in filename]:
        return False
    # is it compressable?
    return os.path.splitext(filename)[1] in COMPRESS_EXTENSIONS


def get_files (settings, args):
    """
    Given a list of files and/or directories, return all compressable
    files as an iterator.
    """
    for arg in args:
        if os.path.isdir(arg):
            for file in DirectoryWalker(arg):
                if is_compressable(settings, file):
                    yield file
        elif os.path.isfile(arg):
            if is_compressable(settings, arg):
                yield arg
        else:
            log("Warning: not a file or directory", repr(arg))


settings = {
    # default compressor executables
    "compressor": {
        ".js": "yuicompressor.jar", # Note: .jar files are run with "java -jar"
        ".css": "yuicompressor.jar",
        ".png": "pngcrush",
        ".jpg": "jpegtran",
        ".gif": "giftrans",
    },
    # set of filenames (or parts of them) to exclude
    "exclude": set(),
    # set of file extensions to overwrite
    "overwrite": set(),
}

def parse_options (args):
    """
    Parse command line arguments.
    @return: (settings, args)
    @rtype: tuple (dict, list)
    """
    long_opts = ["help", "js-compressor=", "exclude=", "overwrite="]
    try:
        opts, args = getopt.getopt(args, "", long_opts)
    except getopt.error:
        usage(msg=sys.exc_info()[1])
    for opt, arg in opts:
        if opt == "--help":
            usage()
        elif opt == "--js-compressor":
            for ext in (".js", ".css"):
                settings["compressor"][ext] = arg
        elif opt == "--exclude":
            settings["exclude"].add(arg)
        elif opt == "--overwrite":
            exts = [x.strip().lower() for x in arg.split(",") if x]
            settings["overwrite"].update(exts)
        else:
            usage(msg="Unknown option %r" % opt)
    return settings, args


def get_mtime (filename):
    "Return modification time of file."
    return os.stat(filename)[stat.ST_MTIME]


def get_fsize (filename):
    "Return file size in bytes."
    return os.stat(filename)[stat.ST_SIZE]


def needs_compression (infile, outfile):
    "Check if infile needs to be compressed to the given outfile."
    if not os.path.exists(outfile):
        return True
    return get_mtime(infile) > get_mtime(outfile)


def compress_file (infile):
    "Compress the given file if needed."
    base, ext = os.path.splitext(infile)
    if base.endswith("-min"):
        #log("Ignoring", repr(infile))
        return
    outfile = "%s-min%s" % (base, ext)
    if needs_compression(infile, outfile):
        cmd = compress_cmd(ext, infile, outfile)
        if not cmd:
            log("Skipping", repr(infile), "no compressor available")
            return
        try:
            log("Compressing", repr(infile), "...")
            run_cmd(cmd)
        except DistutilsExecError, msg:
            log("Error running %s: %s" % (cmd, msg))
        else:
            insize = get_fsize(infile)
            outsize = get_fsize(outfile)
            if outsize > insize:
                log("Warning: compressed file is bigger than original "
                    "(%dB > %dB); copying instead." % (outsize, insize))
                shutil.copyfile(infile, outfile)
            else:
                percentage = float(outsize * 100) / insize
                log(".. compressed to %.2f%% (%dB -> %dB)" % \
                    (percentage, insize, outsize))
            if ext[1:].lower() in settings["overwrite"]:
                shutil.move(outfile, infile)
    else:
        log("Skipping", repr(infile))


def compress_cmd (ext, infile, outfile):
    "Get the list of command arguments for compression."
    cmd = []
    compressor = settings["compressor"][ext]
    if compressor.endswith(".jar"):
        if not find_executable("java"):
            return None
        cmd.insert(0, "java")
        cmd.insert(1, "-jar")
    elif not find_executable(compressor):
        return None
    cmd.append(compressor)
    cmd.extend(compressor_args(compressor, infile, outfile))
    return cmd


def compressor_args (compressor, infile, outfile):
    """
    Return a list of command line arguments that compress infile to
    outfile with the given compressor.
    """
    basename = os.path.basename(compressor).lower()
    if basename.startswith("yuicompressor"):
        args = compressor_args_yui(infile, outfile)
    elif basename.startswith("pngcrush"):
        args = compressor_args_pngcrush(infile, outfile)
    elif basename.startswith("jpegtran"):
        args = compressor_args_jpegtran(infile, outfile)
    elif basename.startswith("giftrans"):
        args = compressor_args_giftrans(infile, outfile)
    else:
        raise getopt.error("Unknown compressor %r" % compressor)
    return args


def compressor_args_yui (infile, outfile):
    return ["--charset", "utf8", "-o", outfile, infile]

def compressor_args_pngcrush (infile, outfile):
    return [infile, outfile]

def compressor_args_jpegtran (infile, outfile):
    return ["-optimize", "-perfect", "-copy", "none",
            "-outfile", outfile, infile]

def compressor_args_giftrans (infile, outfile):
    return ["-C", "-o", outfile, infile]


def run_cmd (cmd):
    "Execute the given command."
    return spawn(cmd)


def main (args):
    settings, args = parse_options(args)
    for file in get_files(settings, args):
        compress_file(file)


if __name__ == '__main__':
    main(sys.argv[1:])
@@ -1,193 +0,0 @@
# -*- coding: utf-8 -*-
#
# LinkChecker documentation build configuration file, created by
# sphinx-quickstart on Tue Jan 20 23:59:41 2009.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# The contents of this file are pickled, so don't put values in the namespace
# that aren't pickleable (module imports are okay, they're removed automatically).
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

#import sys, os

# If your extensions are in another directory, add it here. If the directory
# is relative to the documentation root, use os.path.abspath to make it
# absolute, like shown here.
#sys.path.append(os.path.abspath('.'))

# General configuration
# ---------------------

# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = []

# Add any paths that contain templates here, relative to this directory.
templates_path = ['templates']

# The suffix of source filenames.
source_suffix = '.txt'

# The encoding of source files.
#source_encoding = 'utf-8'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = u'LinkChecker'
copyright = u'2009, Bastian Kleineidam'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '5.0.2'
# The full version, including alpha/beta/rc tags.
release = version

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'

# List of documents that shouldn't be included in the build.
#unused_docs = None

# List of directories, relative to source directory, that shouldn't be searched
# for source files.
exclude_trees = []

# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None

# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True

# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
add_module_names = False

# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'friendly'


# Options for HTML output
# -----------------------

# The style sheet to use for HTML and HTML Help pages. A file of that name
# must exist either in Sphinx' static/ path, or in one of the custom paths
# given in html_static_path.
html_style = 'sphinxdoc.css'

# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
html_title = project

# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None

# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
html_logo = None

# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
html_favicon = "favicon.ico"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['static']

# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'

# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
html_use_smartypants = True

# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}

# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}

# If false, no module index is generated.
html_use_modindex = False

# If false, no index is generated.
html_use_index = False

# If true, the index is split into individual pages for each letter.
#html_split_index = False

# If true, the reST sources are included in the HTML build as _sources/<name>.
html_copy_source = False

# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''

# If nonempty, this is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = ''

# Output file base name for HTML help builder.
htmlhelp_basename = 'LinkCheckerdoc'


# Options for LaTeX output
# ------------------------

# The paper size ('letter' or 'a4').
latex_paper_size = 'a4'

# The font size ('10pt', '11pt' or '12pt').
#latex_font_size = '10pt'

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, document class [howto/manual]).
latex_documents = [
    ('index', 'LinkChecker.tex', ur'LinkChecker Documentation',
     ur'Bastian Kleineidam', 'manual'),
]

# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None

# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False

# Additional stuff for the LaTeX preamble.
#latex_preamble = ''

# Documents to append as an appendix to all manuals.
#latex_appendices = []

# If false, no module index is generated.
#latex_use_modindex = True

#def setup(app):
#    app.add_config_value('foo', 'default', True)
@@ -1,270 +0,0 @@
Documentation
=============

Basic usage
-----------

To check a URL like ``http://www.myhomepage.org/`` it is enough to
execute ``linkchecker http://www.myhomepage.org/``. This will check the
complete domain of www.myhomepage.org recursively. All links pointing
outside of the domain are also checked for validity.
Performed checks
----------------

All URLs have to pass a preliminary syntax test. Minor quoting
mistakes issue a warning; all other invalid syntax issues are errors.
After the syntax check passes, the URL is queued for connection
checking. All connection check types are described below.
- HTTP links (``http:``, ``https:``)

  After connecting to the given HTTP server, the given path
  or query is requested. All redirections are followed, and
  if a user/password is given, it will be used as authorization
  when necessary.
  Permanently moved pages issue a warning.
  All final HTTP status codes other than 2xx are errors.
- Local files (``file:``)

  A regular, readable file that can be opened is valid. A readable
  directory is also valid. All other files, for example device files,
  unreadable or non-existing files, are errors.

  File contents are checked for recursion.
- Mail links (``mailto:``)

  A mailto: link eventually resolves to a list of email addresses.
  If one address fails, the whole list will fail.
  For each mail address we check the following things:

  1) Check the address syntax, both of the part before and after
     the @ sign.
  2) Look up the MX DNS records. If we find no MX record,
     print an error.
  3) Check if one of the mail hosts accepts an SMTP connection.
     Check hosts with higher priority first.
     If no host accepts SMTP, we print a warning.
  4) Try to verify the address with the VRFY command. If we get
     an answer, print the verified address as an info.
- FTP links (``ftp:``)

  For FTP links we:

  1) connect to the specified host
  2) try to log in with the given user and password. The default
     user is ``anonymous``, the default password is ``anonymous@``.
  3) try to change to the given directory
  4) list the file with the NLST command
- Telnet links (``telnet:``)

  We try to connect and, if user/password are given, to log in to the
  given telnet server.

- NNTP links (``news:``, ``snews:``, ``nntp:``)

  We try to connect to the given NNTP server. If a news group or
  article is specified, we try to request it from the server.
- Ignored links (``javascript:``, etc.)

  An ignored link will only print a warning. No further checking
  is done.

  Here is a complete list of recognized, but ignored links. The most
  prominent of them are JavaScript links.

  - ``acap:`` (application configuration access protocol)
  - ``afs:`` (Andrew File System global file names)
  - ``chrome:`` (Mozilla specific)
  - ``cid:`` (content identifier)
  - ``clsid:`` (Microsoft specific)
  - ``data:`` (data)
  - ``dav:`` (dav)
  - ``fax:`` (fax)
  - ``find:`` (Mozilla specific)
  - ``gopher:`` (Gopher)
  - ``imap:`` (internet message access protocol)
  - ``isbn:`` (ISBN (int. book numbers))
  - ``javascript:`` (JavaScript)
  - ``ldap:`` (Lightweight Directory Access Protocol)
  - ``mailserver:`` (Access to data available from mail servers)
  - ``mid:`` (message identifier)
  - ``mms:`` (multimedia stream)
  - ``modem:`` (modem)
  - ``nfs:`` (network file system protocol)
  - ``opaquelocktoken:`` (opaquelocktoken)
  - ``pop:`` (Post Office Protocol v3)
  - ``prospero:`` (Prospero Directory Service)
  - ``rsync:`` (rsync protocol)
  - ``rtsp:`` (real time streaming protocol)
  - ``service:`` (service location)
  - ``shttp:`` (secure HTTP)
  - ``sip:`` (session initiation protocol)
  - ``tel:`` (telephone)
  - ``tip:`` (Transaction Internet Protocol)
  - ``tn3270:`` (Interactive 3270 emulation sessions)
  - ``vemmi:`` (versatile multimedia interface)
  - ``wais:`` (Wide Area Information Servers)
  - ``z39.50r:`` (Z39.50 Retrieval)
  - ``z39.50s:`` (Z39.50 Session)
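Step 1 of the mail link checks above (the address syntax test) can be sketched in Python. The helper below is a simplified, hypothetical stand-in for LinkChecker's actual validator, not its real code:

```python
import re

def check_mail_syntax(address):
    # Step 1 of the mailto: check: validate the part before and the
    # part after the @ sign (simplified pattern, not full RFC 2822).
    local, sep, domain = address.partition("@")
    if not (sep and local and domain):
        return False
    if not re.fullmatch(r"[A-Za-z0-9._%+-]+", local):
        return False
    # The domain must contain at least two dot-separated labels.
    return re.fullmatch(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+", domain) is not None
```

Steps 2 to 4 (the MX lookup, SMTP connect and VRFY) need network access and a DNS resolver, so they are omitted from this sketch.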
Recursion
---------

Before descending recursively into a URL, it has to fulfill several
conditions. They are checked in this order:

1. A URL must be valid.

2. A URL must be parseable. This currently includes HTML files,
   Opera bookmarks files, and directories. If a file type cannot
   be determined (for example it does not have a common HTML file
   extension, and the content does not look like HTML), it is assumed
   to be non-parseable.

3. The URL content must be retrievable. This is usually the case
   except for example mailto: or unknown URL types.

4. The maximum recursion level must not be exceeded. It is configured
   with the ``--recursion-level`` option and is unlimited by default.

5. It must not match the ignored URL list. This is controlled with
   the ``--ignore-url`` option.

6. The Robots Exclusion Protocol must allow links in the URL to be
   followed recursively. This is checked by searching for a
   "nofollow" directive in the HTML header data.

Note that the directory recursion reads all files in that
directory, not just a subset like ``index.htm*``.
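These six conditions can be sketched as a single gate function, checked in the documented order. The function and argument names below are illustrative, not LinkChecker's real API:

```python
def should_recurse(valid, parseable, retrievable,
                   level, max_level, ignored, robots_allowed):
    # The first failing condition stops recursion into the URL.
    if not valid:                 # 1. URL must be valid
        return False
    if not parseable:             # 2. HTML file, bookmarks file or directory
        return False
    if not retrievable:           # 3. content can actually be fetched
        return False
    if 0 <= max_level < level:    # 4. --recursion-level (negative = unlimited)
        return False
    if ignored:                   # 5. matched by --ignore-url
        return False
    return robots_allowed         # 6. robots.txt / "nofollow" allows it
```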
Frequently asked questions
--------------------------

**Q: LinkChecker produced an error, but my web page is ok with
Mozilla/IE/Opera/... Is this a bug in LinkChecker?**

A: Please check your web pages first. Are they really ok?
Use the ``--check-html`` option, or check if you are using a proxy
that produces the error.
**Q: I still get an error, but the page is definitely ok.**

A: Some servers deny access to automated tools (also called robots)
like LinkChecker. This is not a bug in LinkChecker but rather a
policy of the webmaster running the website you are checking. Look
at the ``/robots.txt`` file, which follows the `robots.txt exclusion standard`_.

.. _`robots.txt exclusion standard`:
   http://www.robotstxt.org/wc/norobots-rfc.html
**Q: How can I tell LinkChecker which proxy to use?**

A: LinkChecker works transparently with proxies. In a Unix or Windows
environment, set the http_proxy, https_proxy or ftp_proxy environment
variables to a URL that identifies the proxy server before starting
LinkChecker. For example::

  $ http_proxy="http://www.someproxy.com:3128"
  $ export http_proxy
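The same proxy setting can also be made from Python before LinkChecker is started from that process; the variable name and URL below are taken from the shell example above:

```python
import os

# Equivalent to the shell export above: LinkChecker picks up the
# proxy from the standard environment variables at startup.
os.environ["http_proxy"] = "http://www.someproxy.com:3128"
```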
**Q: The link "mailto:john@company.com?subject=Hello John" is reported
as an error.**

A: You have to quote special characters (e.g. spaces) in the subject field.
The correct link should be "mailto:...?subject=Hello%20John".
Unfortunately browsers like IE and Netscape do not enforce this.
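With Python's standard library, ``urllib.parse.quote`` performs exactly this quoting:

```python
from urllib.parse import quote

subject = "Hello John"
# quote() percent-encodes the space, yielding the correct form:
# mailto:john@company.com?subject=Hello%20John
link = "mailto:john@company.com?subject=" + quote(subject)
```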
**Q: Does LinkChecker have JavaScript support?**

A: No, and it never will. If your page does not work without JS, it is
better checked with a browser testing tool like Selenium_.

.. _Selenium:
   http://seleniumhq.org/
**Q: Is LinkChecker's cookie feature insecure?**

A: Cookies cannot store more information than is in the HTTP request itself,
so you are not giving away any extra system information.
Once stored, however, the cookies are sent back to the server on request.
Not to every server, but only to the one the cookie originated from!
This could be used to "track" subsequent requests to this server,
and this is what annoys some people (including me).
Cookies are only stored in memory. After LinkChecker finishes, they
are lost, so the tracking is restricted to the checking time.
The cookie feature is disabled by default.
**Q: I want to have my own logging class. How can I use it in LinkChecker?**

A: Currently, only a Python API lets you define new logging classes.
Define your own logging class as a subclass of StandardLogger or any
other logging class in the log module.
Then call the ``logger_add`` method of the configuration to register
your new logger, and append a new logger instance to the file output::

  import linkcheck, MyLogger
  log_format = 'mylog'
  log_args = {'fileoutput': log_format, 'filename': 'foo.txt'}
  cfg = linkcheck.configuration.Configuration()
  cfg.logger_add(log_format, MyLogger.MyLogger)
  cfg['fileoutput'].append(cfg.logger_new(log_format, log_args))
**Q: LinkChecker does not ignore anchor references on caching.**

**Q: Some links with anchors are getting checked twice.**

A: This is not a bug.
It is not necessarily true that if a URL ``ABC#anchor1`` works, then
``ABC#anchor2`` works too. That is not specified anywhere, and there are
server-side scripts that fail on some anchors and not on others.
This is the reason for always checking URLs with different anchors.
If you really want to disable this, use the ``--no-anchor-caching``
option.
**Q: I see LinkChecker gets a /robots.txt file for every site it
checks. What is that about?**

A: LinkChecker follows the `robots.txt exclusion standard`_. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See the `Web Robot pages`_ and the `Spidering report`_ for more info.

.. _`Web Robot pages`:
   http://www.robotstxt.org/wc/robots.html
.. _`Spidering report`:
   http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt
**Q: How do I print unreachable/dead documents of my website with
LinkChecker?**

A: This is not possible. It would require file system access to your web
repository and access to your web server configuration.
**Q: How do I check HTML/XML/CSS syntax with LinkChecker?**

A: Use the ``--check-html`` and ``--check-css`` options.
@@ -1,49 +0,0 @@
.. meta::
   :keywords: link, URL, validation, checking

===============================
Check websites for broken links
===============================

LinkChecker is a free, GPL_ licensed URL validator.

.. _GPL:
   http://www.gnu.org/licenses/gpl-2.0.html

If you like LinkChecker, consider a donation_ to improve it even
more!

.. _donation:
   http://sourceforge.net/project/project_donations.php?group_id=1913

Features
========

- recursive and multithreaded checking
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap
  graph in different formats
- support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and
  local file links
- restriction of link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP, FTP and Telnet
- honors the robots.txt exclusion protocol
- cookie support
- HTML and CSS syntax check
- antivirus check
- a command line interface
- a GUI client interface
- a (Fast)CGI web interface (requires an HTTP server)


Screenshots
===========

+------------------------------------+------------------------------------+------------------------------------+
| .. image:: shot1_thumb.jpg         | .. image:: shot2_thumb.jpg         | .. image:: shot3_thumb.jpg         |
|    :align: center                  |    :align: center                  |    :align: center                  |
|    :target: _static/shot1.png      |    :target: _static/shot2.png      |    :target: _static/shot3.png      |
+------------------------------------+------------------------------------+------------------------------------+
| Commandline interface              | GUI client                         | Web interface                      |
+------------------------------------+------------------------------------+------------------------------------+
@@ -1,54 +0,0 @@
Other link checkers
===================

If LinkChecker does not fit your requirements, you can check out the
competition. All of these programs also have an `Open Source license`_
like LinkChecker.

.. _`Open Source license`:
   http://www.opensource.org/licenses/

- `Checklinks`_ written in Perl

.. _Checklinks:
   http://www.jmarshall.com/tools/cl/

- `Dead link check`_ written in Perl

.. _Dead link check:
   http://dlc.sourceforge.net/

- `gURLChecker`_ written in C

.. _gURLChecker:
   http://labs.libre-entreprise.org/projects/gurlchecker/

- `KLinkStatus`_ written in C++

.. _KLinkStatus:
   http://klinkstatus.kdewebdev.org/

- `link-checker`_ written in C

.. _link-checker:
   http://ymettier.free.fr/link-checker/link-checker.html

- `linklint`_ written in Perl

.. _linklint:
   http://www.linklint.org/

- `W3C Link Checker`_ HTML interface only

.. _W3C Link Checker:
   http://validator.w3.org/checklink/

- `webcheck`_ written in Python

.. _webcheck:
   http://ch.tudelft.nl/~arthur/webcheck/

- `webgrep`_ written in Perl

.. _webgrep:
   http://cgi.linuxfocus.org/~guido/index.html#webgrep
@@ -1,3 +0,0 @@
favicon.ico: favicon32x32.png favicon16x16.png
	png2ico favicon.ico favicon32x32.png favicon16x16.png
@@ -1,63 +0,0 @@
{% extends "!layout.html" %}

{% block extrahead %}
<style type="text/css">
img { border: 0; }
</style>
{% endblock %}

{% block rootrellink %}
<li><a href="{{ pathto('index') }}">Home </a> | </li>
<li><a href="{{ pathto('documentation') }}">Documentation </a>| </li>
<li><a href="{{ pathto('other') }}">Other link checkers </a> </li>
{% endblock %}

{% block relbar1 %}
<div style="background-color: white; text-align: left; padding: 10px 10px 15px 15px">
{% if builder == 'html' %}
<div style="float:right;"><a
href="http://sourceforge.net/projects/linkchecker"><img
src="http://sflogo.sourceforge.net/sflogo.php?group_id=1913&type=13"
width="120" height="30" border="0"
alt="Get LinkChecker at SourceForge.net." /></a>
{# Piwik tag #}
<script type="text/javascript">
var pkBaseURL = (("https:" == document.location.protocol) ? "https:" : "http:") + "//apps.sourceforge.net/piwik/linkchecker/";
document.write(unescape("%3Cscript src='" + pkBaseURL + "piwik.js' type='text/javascript'%3E%3C/script%3E"));
</script><script type="text/javascript">
piwik_action_name = '';
piwik_idsite = 1;
piwik_url = pkBaseURL + "piwik.php";
piwik_log(piwik_action_name, piwik_idsite, piwik_url);
</script>
<object><noscript><p><img src="http://apps.sourceforge.net/piwik/linkchecker/piwik.php?idsite=1" alt=""/></p></noscript></object>
{# End Piwik tag #}
</div>
{% endif %}
<table border="0"><tr>
<td><a href="{{ pathto('index') }}"><img
src="{{ pathto("_static/logo64x64.png", 1) }}" border="0" alt="LinkChecker"/></a></td>
<td><h1>LinkChecker</h1></td>
</tr></table>
</div>
{{ super() }}
{% endblock %}
{% block relbar2 %}{% endblock %}

{# put the sidebar before the body #}
{% block sidebarsearch %}{{ super() }}{% endblock %}
{% block sidebar1 %}{{ sidebar() }}{% endblock %}
{% block sidebar2 %}{% endblock %}
{% block sidebarlogo %}{% if builder == 'html' %}
{% if pagename == 'index' %}
<h3>Download</h3>
<a href="http://prdownloads.sourceforge.net/linkchecker/LinkChecker-{{version}}.exe?download">LinkChecker {{version}} for Windows</a><br/>
<a href="http://prdownloads.sourceforge.net/linkchecker/LinkChecker-{{version}}.tar.gz?download">LinkChecker {{version}} source</a><br/>
<a href="http://linkchecker.git.sourceforge.net/git/gitweb.cgi?p=linkchecker;a=blob;f=ChangeLog.txt;hb=HEAD">Changelog</a><br/>
<h3>Support</h3>
<a href="http://sourceforge.net/tracker/?func=add&group_id=1913&atid=101913">Bug tracker</a><br/>
<a href="http://sourceforge.net/scm/?type=git&group_id=1913">Development repository</a><br/>
{% endif %}
{% endif %}
{% endblock %}
{% block sidebartoc %}{% endblock %}