diff --git a/doc/source/_static/logo64x64.png b/doc/source/_static/logo64x64.png new file mode 100755 index 00000000..a4f04e32 Binary files /dev/null and b/doc/source/_static/logo64x64.png differ diff --git a/doc/source/_static/shot2.png b/doc/source/_static/shot2.png index 28559794..6b0c0a9e 100644 Binary files a/doc/source/_static/shot2.png and b/doc/source/_static/shot2.png differ diff --git a/doc/source/_static/shot3.png b/doc/source/_static/shot3.png new file mode 100644 index 00000000..28559794 Binary files /dev/null and b/doc/source/_static/shot3.png differ diff --git a/doc/source/_templates/layout.html b/doc/source/_templates/layout.html index 5cecbf10..c498d79a 100644 --- a/doc/source/_templates/layout.html +++ b/doc/source/_templates/layout.html @@ -18,8 +18,8 @@ img { border: 0; }
{{ super() }} diff --git a/doc/source/documentation.txt b/doc/source/documentation.txt index e642c816..86bea7db 100644 --- a/doc/source/documentation.txt +++ b/doc/source/documentation.txt @@ -63,11 +63,6 @@ checking. All connection check types are described below. 3) try to change to the given directory 4) list the file with the NLST command -- Gopher links (``gopher:``) (DEPRECATED) - - We try to send the given selector (or query) to the gopher server. - Gopher support will be removed in a future version of LinkChecker. - - Telnet links (``telnet:``) We try to connect and if user/password are given, login to the @@ -95,6 +90,7 @@ checking. All connection check types are described below. - ``dav:`` (dav) - ``fax:`` (fax) - ``find:`` (Mozilla specific) + - ``gopher:`` (Gopher) - ``imap:`` (internet message access protocol) - ``isbn:`` (ISBN (int. book numbers)) - ``javascript:`` (JavaScript) @@ -156,27 +152,22 @@ Frequently asked questions -------------------------- **Q: LinkChecker produced an error, but my web page is ok with -Netscape/IE/Opera/... +Mozilla/IE/Opera/... Is this a bug in LinkChecker?** -A: Please check your web pages first. Are they really ok? Use -a `syntax highlighting editor`_. Use `HTML Tidy`_. -Check if you are using a proxy which produces the error. - -.. _`syntax highlighting editor`: - http://fte.sourceforge.net/ -.. _`HTML Tidy`: - http://tidy.sourceforge.net/ - +A: Please check your web pages first. Are they really ok? +Use the ``--check-html`` option, or check if you are using a proxy +which produces the error. **Q: I still get an error, but the page is definitely ok.** A: Some servers deny access of automated tools (also called robots) like LinkChecker. This is not a bug in LinkChecker but rather a -policy by the webmaster running the website you are checking. -It might even be possible for a website to send robots different -web pages than normal browsers. +policy by the webmaster running the website you are checking. 
Look at +the ``/robots.txt`` file which follows the `robots.txt exclusion standard`_. +.. _`robots.txt exclusion standard`: + http://www.robotstxt.org/wc/norobots-rfc.html **Q: How can I tell LinkChecker which proxy to use?** @@ -201,10 +192,11 @@ Unfortunately browsers like IE and Netscape do not enforce this. **Q: Has LinkChecker JavaScript support?** -A: No, it never will. If your page is not working without JS then your -web design is broken. -Use PHP or Zope or ASP for dynamic content, and use JavaScript just as -an addon for your web pages. +A: No, it never will. If your page is not working without JS, it is +better checked with a browser testing tool like Selenium_. + +.. _Selenium: + http://seleniumhq.org/ **Q: Is LinkCheckers cookie feature insecure?** @@ -244,8 +236,8 @@ After this append a new Logging instance to the fileoutput. **Q: Some links with anchors are getting checked twice.** A: This is not a bug. -It is common practice to believe that if a URL ``ABC#anchor1`` works then -``ABC#anchor2`` works too. That is not specified anywhere and I have seen +It is not necessarily true that if a URL ``ABC#anchor1`` works then +``ABC#anchor2`` works too. That is not specified anywhere and there are server-side scripts that fail on some anchors and not on others. This is the reason for always checking URLs with different anchors. If you really want to disable this, use the ``--no-anchor-caching`` @@ -267,28 +259,14 @@ See the `Web Robot pages`_ and the `Spidering report`_ for more info. http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt -**Q: Ctrl-C does not stop LinkChecker immediately. Why is that so?** - -A: The Python interpreter has to wait for all threads to finish, and -this means waiting for all open connections to close. The default timeout -for connections is 30 seconds, hence the delay. -You can change the default connection timeout with the --timeout option. 
- - **Q: How do I print unreachable/dead documents of my website with LinkChecker?** A: No can do. This would require file system access to your web repository and access to your web server configuration. -You can instead store the linkchecker results in a database -and look for missing files. +**Q: How do I check HTML/XML/CSS syntax with LinkChecker?** -**Q: How do I check HTML/XML syntax with LinkChecker?** - -A: No can do. Use the `HTML Tidy`_ program. - -.. _`HTML Tidy`: - http://tidy.sourceforge.net/ +A: Use the ``--check-html`` and ``--check-css`` options. diff --git a/doc/source/index.txt b/doc/source/index.txt index 722e633b..cc450ae8 100644 --- a/doc/source/index.txt +++ b/doc/source/index.txt @@ -1,11 +1,23 @@ +.. meta:: + :keywords: link, URL, validation, checking + The latest version of this document is available at http://linkchecker.sourceforge.net/. +=============================== +Check websites for broken links +=============================== + +LinkChecker is a free, GPL_ licensed URL validator. + +.. _GPL: + http://www.gnu.org/licenses/gpl-2.0.html + If you like LinkChecker, consider a donation_ to improve it even more! .. _donation: - http://sourceforge.net/donate/index.php?group_id=1913 + http://sourceforge.net/project/project_donations.php?group_id=1913 Features ======== @@ -23,20 +35,15 @@ Features - HTML and CSS syntax check - Antivirus check - a command line interface +- a GUI client interface - a (Fast)CGI web interface (requires HTTP server) Download ======== -LinkChecker is `OpenSource`_ software and licensed under the `GPL`_. -Downloads are available for Windows and Unix systems from the -`LinkChecker download section`_. +Get it from the `LinkChecker download section`_. -.. _OpenSource: - http://www.opensource.org/ -.. _GPL: - http://www.gnu.org/licenses/licenses.html#TOCGPL .. _LinkChecker download section: http://sourceforge.net/project/showfiles.php?group_id=1913 @@ -47,77 +54,35 @@ look at the ChangeLog_. .. 
_install documentation: install.html .. _ChangeLog: - http://linkchecker.svn.sourceforge.net/viewvc/linkchecker/trunk/linkchecker/ChangeLog?view=markup + http://linkchecker.svn.sourceforge.net/viewvc/linkchecker/trunk/linkchecker/ChangeLog.txt?view=markup Screenshots =========== - +------------------------------------+------------------------------------+ - | .. image:: shot1_thumb.jpg | .. image:: shot2_thumb.jpg | - | :align: center | :align: center | - | :target: _static/shot1.png | :target: _static/shot2.png | - +------------------------------------+------------------------------------+ - | Commandline interface | Web interface | - +------------------------------------+------------------------------------+ + +------------------------------------+------------------------------------+------------------------------------+ + | .. image:: shot1_thumb.jpg | .. image:: shot2_thumb.jpg | .. image:: shot3_thumb.jpg | + | :align: center | :align: center | :align: center | + | :target: _static/shot1.png | :target: _static/shot2.png | :target: _static/shot3.png | + +------------------------------------+------------------------------------+------------------------------------+ + | Commandline interface | GUI client | Web interface | + +------------------------------------+------------------------------------+------------------------------------+ -Running +Support ======= -Running under Unix or Mac OS X platforms ----------------------------------------- - -The local configuration file is $HOME/.linkcheckerrc -Type "linkchecker" followed by your URLs you want to check. -Type "linkchecker -h" for help. - -Running under Windows platforms -------------------------------- - -Start "Check URL" in your LinkChecker program group. -URL input is interactive. - -Another way is executing on the command line. -If there is a file ``linkchecker.bat`` in your Python Scripts directory -you can run eg. ``c:\Python25\Scripts\linkchecker.bat``. 
- -If there is only a ``linkchecker`` file in your Python Scripts -directory, you have to run eg. -``c:\Python25\python.exe c:\Python25\Scripts\linkchecker``. - - -Internationalization --------------------- -For german output execute "export LC_MESSAGES=de" in bash or -"setenv LC_MESSAGES de" in tcsh. -Under Windows, execute "set LC_MESSAGES=de". -Other supported languages are 'nl' (Nederlands) and 'fr' (francais). - -You can help to translate LinkChecker by copying the included -``po/linkchecker.pot`` file to ``po/language.po``, translate it and -send it to me. - - -Bug reporting -============= - -The `SourceForge Bug interface`_ allows submitting of bugs, patches +The `SF tracker`_ allows submitting of bugs, patches and requests. -.. _SourceForge Bug interface: +.. _SF tracker: http://sourceforge.net/tracker/?func=add&group_id=1913&atid=101913 -Subversion access -================= +Repository +========== -The `SourceForge Subversion page`_ has all the information on how to -obtain the development version of LinkChecker. Development of -LinkChecker requires some more software to be available, which -is documented on the `installation page`_. +The `SF Subversion page`_ hosts the development source code of LinkChecker. -.. _SourceForge Subversion page: +.. _SF Subversion page: http://sourceforge.net/svn/?group_id=1913 -.. _installation page: - install.html diff --git a/doc/source/install.txt b/doc/source/install.txt index f8ddd15d..16de9b1b 100644 --- a/doc/source/install.txt +++ b/doc/source/install.txt @@ -40,8 +40,7 @@ Requirements for Unix/Linux or Mac OS X GeoIP from http://www.maxmind.com/app/python 6. *Optional, used for Virus checking:* - ClamAv from - http://www.clamav.net/ + ClamAv from http://www.clamav.net/ Requirements for Windows @@ -49,6 +48,7 @@ Requirements for Windows None, the installer contains all files. 
+ Setup for Unix/Linux or Mac OS X -------------------------------- @@ -94,7 +94,7 @@ Setup for Windows - the binary .exe installer: Setup for Windows - compiling from source: ------------------------------------------ -1. Install Python >= 2.6 from http://www.python.org/ +1. Install Python >= 2.5 from http://www.python.org/ [http://www.python.org/ftp/python/2.6/python-2.6.1.msi] 2. *Optional, for console color support:* @@ -148,9 +148,6 @@ After installation ------------------ LinkChecker is now installed. Have fun! -See the `main page`_ on how to configure and start LinkChecker. - -.. _main page: index.html (Fast)CGI web interface diff --git a/doc/source/shot2_thumb.jpg b/doc/source/shot2_thumb.jpg index 32e3715a..7c2e213e 100644 Binary files a/doc/source/shot2_thumb.jpg and b/doc/source/shot2_thumb.jpg differ diff --git a/doc/source/shot3_thumb.jpg b/doc/source/shot3_thumb.jpg new file mode 100644 index 00000000..c79a941a Binary files /dev/null and b/doc/source/shot3_thumb.jpg differ diff --git a/doc/source/upgrading.txt b/doc/source/upgrading.txt index 100d8c26..c84841c6 100644 --- a/doc/source/upgrading.txt +++ b/doc/source/upgrading.txt @@ -6,11 +6,10 @@ Migrating from 4.x to 5.0 Python >= 2.5 is now required. -The CGI script access control variable ALLOWED_HOSTS has been renamed -to ALLOWED_CLIENTS. -Accordingly, the function linkcheck.lc_cgi.checkaccess() changed its -keyword parameters to from "hosts" and "servers" to "allowed_clients" and -"allowed_servers". +The CGI script access control has been removed. Please use the access +control of your webserver to restrict access to the CGI script. +An example configuration file for the Apache webserver has been included +in the distribution. Migrating from 4.4 to 4.5 -------------------------