LinkChecker
=============

$long_description

Installing, Requirements, Running
---------------------------------
Read the file INSTALL.

License
--------
$appname is licensed under the GNU General Public License.
Credits go to Guido van Rossum for making Python. His hovercraft is
full of eels!
As this program is directly derived from my Java link checker, additional
credits go to Robert Forsman (the author of JCheckLinks) and his robots.txt
parse algorithm.
I want to thank everybody who gave me feedback, bug reports and suggestions.

Versioning
----------
Version numbers have the same meaning as Linux Kernel version numbers.
The first number is the major package version.
The second number is the minor package version. An odd second number
stands for development versions, an even number for stable versions.
The third number is a package release sequence number.
So for example 1.1.5 is the fifth release of the 1.1 development package.

Included packages
-----------------
httplib from http://www.lyra.org/greg/python/
httpslib from http://home.att.net/~nvsoft1/ssl_wrapper.html
DNS see DNS/README
fcgi.py and sz_fcgi.py from http://saarland.sz-sb.de/~ajung/sz_fcgi/
fintl.py from http://sourceforge.net/snippet/detail.php?type=snippet&id=100059

Note that the following packages are modified by me:
httplib.py (renamed to http11lib.py and a bug fixed)
fcgi.py (implemented streamed output)
sz_fcgi.py (simplified the code)
DNS/Lib.py:566 fixed rdlength name error
DNS/Lib.py:105 tuple parameter for Python 1.6 compatibility
DNS/Base.py: fixed /etc/resolv.conf parser to cope with empty lines

Internationalization
--------------------
For German output, execute "export LC_MESSAGES=de" in bash or
"setenv LC_MESSAGES de" in tcsh.
Under Windows, execute "set LC_MESSAGES=de".
For French output, use 'fr' instead of 'de'.

Code design
-----------
Read this only if you want to hack on the code.

(1) Look at the linkchecker script. It just reads all the command-line
    options and stores them in a Config object.
(2) Which leads us directly to the Config class. This class stores all
    options and works a little magic: it tries to find out if your platform
    supports threads. If so, they are enabled. If not, they are disabled.
    Several functions are replaced with their threaded equivalents if
    threading is enabled.
    Config files are handled here as well. A Config object reads config
    file options on initialization, so they are processed before any
    command-line options.
(3) The linkchecker script finally calls linkcheck.checkUrls(), which
    calls linkcheck.Config.checkUrl(), which calls
    linkcheck.UrlData.check().
    A UrlData object represents a single URL with all attached data like
    validity, check time and so on. These values are filled in by the
    UrlData.check() function.
    Derived from the base class UrlData are the different URL types:
    HttpUrlData for http:// links, MailtoUrlData for mailto: links and
    so on.
    UrlData defines the functions which are common to *all* URLs, and
    the subclasses define the functions needed for their URL type.
(4) Let's look at the output. Every output is defined in a Logger class.
    Each logger has the functions init(), newUrl() and endOfOutput().
    We call init() once to initialize the Logger. UrlData.check() calls
    newUrl() (through UrlData.logMe()), and after all checking is done we
    call endOfOutput(). Easy.
    New loggers are created with the Config.newLogger(name, fileoutput)
    function.

Nifty features you did not expect
---------------------------------
o Included brain enhancer. Just read Python code to gain intelligence.
o Wash-O-matic. LinkChecker has a secret option which washes all your
  dirty clothes in a matter of seconds.
o Y10K-Compatibility(tm) guaranteed. Trust me on that one.
o There is no spoon. Wake up already!
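
Code design sketch
------------------
The class layout described under "Code design" can be sketched roughly as
follows. This is a minimal illustration in current Python, not the actual
LinkChecker source. The names UrlData, HttpUrlData, MailtoUrlData, check(),
logMe(), init(), newUrl(), endOfOutput() and checkUrls() come from the
description in that section. ListLogger, checkConnection(), the "valid"
attribute, the logger argument passed to check() and all method bodies are
assumptions made up for this example. The Config.checkUrl() step is left out
for brevity.

```python
# Illustrative sketch only. Names marked "hypothetical" do not come from
# the README and are assumptions made for this example.


class Logger:
    """Logger interface from step (4): init(), newUrl(), endOfOutput()."""

    def init(self):
        pass

    def newUrl(self, urlData):
        pass

    def endOfOutput(self):
        pass


class ListLogger(Logger):
    """Hypothetical logger that collects its output lines in a list."""

    def __init__(self):
        self.lines = []

    def init(self):
        self.lines.append("start")

    def newUrl(self, urlData):
        status = "valid" if urlData.valid else "invalid"
        self.lines.append("%s %s" % (urlData.url, status))

    def endOfOutput(self):
        self.lines.append("end")


class UrlData:
    """Base class holding what is common to all URL types."""

    def __init__(self, url):
        self.url = url
        self.valid = False  # hypothetical attribute, filled in by check()

    def check(self, logger):
        # Fill in the result, then report it through logMe().
        self.valid = self.checkConnection()
        self.logMe(logger)

    def checkConnection(self):
        # Hypothetical hook: subclasses supply the scheme-specific test.
        return False

    def logMe(self, logger):
        logger.newUrl(self)


class HttpUrlData(UrlData):
    def checkConnection(self):
        # Placeholder test; a real check would issue an HTTP request.
        return self.url.startswith("http://")


class MailtoUrlData(UrlData):
    def checkConnection(self):
        # Placeholder test; a real check would validate the address.
        return "@" in self.url


def checkUrls(urls, logger):
    """Call init() once, check every URL, then call endOfOutput()."""
    logger.init()
    for urlData in urls:
        urlData.check(logger)
    logger.endOfOutput()
```

In this sketch a new URL type only needs to subclass UrlData and override
checkConnection(), and a new output format only needs to subclass Logger.
That mirrors the split between common and type-specific functions described
in steps (3) and (4).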