1.9.4
  * parse CSS stylesheet files and check included urls, for example
    background images
    Changed: linkcheck/{File,Http,Ftp,}UrlData.py, linkcheck/linkparser.py

1.9.3 (released 16.10.2003)
  * re-added an updated robot parser which uses urllib2 and can decode
    compressed transfer encodings.
    Added: linkcheck/robotparser2.py
  * more restrictive url validity checking when running in CGI mode
    Changed: linkcheck/lc_cgi.py
  * accept more Windows path specifications, like
    file://C:\Dokume~1\test.html
    Changed: linkcheck/FileUrlData.py

1.9.2
  * parser fixes:
    - do not #include , fixes build on some FreeBSD, Windows and
      Solaris/SunOS platforms
    - ignore first leading invalid backslash in a=\"b\" attributes
    Changed: linkcheck/parser/htmllex.{l,c}
  * add full script path to linkchecker on windows systems
    Changed: linkchecker.bat
  * fix generation of Linkchecker_Readme.txt under windows systems
    Changed: setup.py

1.9.1
  * add documentation how to change the default C compiler
    Changed: INSTALL
  * fixed blacklist logging
    Changed: linkcheck/log/BlacklistLogger.py
  * removed unused imports
    Changed: linkcheck/*.py
  * parser fixes:
    - fixed parsing of end tags with trailing garbage
    - fixed parsing of script single comment lines
    Changed: linkcheck/parser/htmllex.l

1.9.0
  * Require Python 2.3
    - removed timeoutsocket.py and robotparser.py, using upstream
    - use True/False for boolean values
    - use csv module
    - use new-style classes
    Closes: SF bug 784977
    Changed: a lot
  * update po makefiles and tools
    Changed: po/*
  * start CGI output immediately
    Changed: lc.cgi, lc.fcgi, lc.sz_fcgi, linkcheck/lc_cgi.py
    Closes: SF bug 784331

1.8.22
  * allow colons in HTML attribute names, used for namespaces
    Changed: linkcheck/parser/htmllex.l
  * fix match of intern patterns with --denyallow enabled
    Changed: linkcheck/UrlData.py
  * s/intern/internal/ and s/extern/external/ in the documentation
    Changed: linkchecker, linkchecker.1, FAQ
  * rename column "column" to "col" in SQL output, since "column" is a
    reserved keyword.
    Thanks Garvin Hicking for the hint.
    Changed: linkcheck/log/SQLLogger.py, create.sql
  * handle HTTP redirects to a non-http url
    Changed: linkcheck/{Http,}UrlData.py
    Closes: SF bug 784372

1.8.21
  * detect recursive redirections; the maximum of five redirections is
    still there though
  * after every HTTP 301 or 302 redirection, check the URL cache again
    Closes: SF bug 776851
  * put all HTTP 301 redirection answers also in the url cache as
    aliases of the original url. This could mess up some redirection
    warnings (i.e. warn about redirection when there is none), but it
    is more network efficient.

1.8.20
  * fix setting of domain in set_intern_url
    Changed: linkcheck/UrlData.py
  * - parse JS strings and comments
    - accept "".
    Changed files: linkcheck/UrlData.py, linkchecker

1.8.17
  * fix parsing of missing end tag in ""
    Changed files: linkcheck/parser/htmllex.l
  * fix entity resolving in parsed html links
    Closes: SF bug #749543
    Changed files: linkcheck/StringUtil.py

1.8.16
  * also look at id attributes on anchor check (Closes SF Bug #741131)
    Changed files: linkcheck/{linkparser,UrlData}.py
  * minor parser cleanups
    Changed files: linkcheck/parser/*

1.8.15
  * Fix compile errors with C variable declarations in HTML parser.
    Thanks to Fazal Majid
    Changed files: linkcheck/parser/htmlparse.[yc]

1.8.14
  * fix old bug in redirects not using the full url. This resulted in
    errors like (-2, "Name or service not known")
    Changed files: linkcheck/HttpUrlData.py
    Closes: SF Bug #729007
  * only remove anchors on IIS servers (other servers are doing quite
    well with anchors... can you spell A-p-a-c-h-e ?)
    Changed files: linkcheck/{HttpUrlData, UrlData}.py
  * Parser changes:
    - correctly propagate and display parsing errors
    - really cope with missing ">" end tags
    Changed files: linkcheck/parser/html{lex.l, parse.y},
      linkcheck/linkparse.py, linkcheck/UrlData.py
  * quote urls before a request
    Changed files: linkcheck/HttpUrlData.py

1.8.13
  * fix typo in manpage
    Changed files: linkchecker.1
  * remove anchor from HEAD and GET requests
    Changed files: linkcheck/{HttpUrlData, UrlData}.py

1.8.12
  * convert urlparts to list also on redirect
    Changed files: linkcheck/HttpUrlData.py

1.8.11
  * catch httplib.error exceptions
    Changed files: linkcheck/HttpUrlData.py
  * override interactive password question in robotparser.py
    Changed files: linkcheck/robotparser.py
  * switch to urllib2.py as default url connect.
    Changed files: linkcheck/UrlData.py
  * recompile html parser with flex 2.5.31
    Changed files: linkcheck/parser/{htmllex.c,Makefile}

1.8.10
  * new option --no-anchor-caching
    Changed files: linkchecker, linkcheck/{Config.py, UrlData.py}, FAQ
  * quote empty attribute arguments
    Changed files: linkcheck/parser/htmllex.[lc]

1.8.9
  * recompile with bison 1.875a
    Changed files: linkcheck/parser/htmlparse.[ch]
  * remove stpcpy declaration, fixes compile error on RedHat 7.x
    Changed files: linkcheck/parser/htmlsax.h
  * clarify keyboard interrupt warning to wait for active connections
    to finish
    Changed files: linkcheck/__init__.py
  * resolve &#XXX; number entity references
    Changed files: linkcheck/{StringUtil.py,linkname.py}

1.8.8
  * All amazon servers block HEAD requests with timeouts. Use GET as a
    workaround, but issue a warning.
    Changed files: linkcheck/HttpUrlData.py
  * restrict CGI access to localhost per default
    Changed files: lc.cgi, lc.fcgi, lc.sz_fcgi, linkcheck/lc_cgi.py

1.8.7
  * #define YY_NO_UNISTD_H on Windows systems, fixes build error with
    Visual Studio compiler
    Changed files: setup.py
  * use python2.2 headers for parser compile, not 2.1.
    Changed files: linkcheck/parser/Makefile

1.8.6
  * include a fixed robotparser.py (from Python 2.2 CVS maint branch)

1.8.5
  * fix config.warn to warn
    Changed files: linkcheck/__init__.py
  * parser changes:
    o recognise "" HTML comments (seen at Eonline)
    o recognise "" HTML comments (seen at www.nba.com)
    o rebuild with flex 2.5.27
    Changed files: linkcheck/parser/htmllex.[lc]
  * added another url exclusion example to the FAQ,
    numerate questions and answers
    Changed files: FAQ
  * fix linkchecker exceptions
    Changed files: linkcheck/{Ftp,Mailto,Nntp,Telnet,}UrlData.py,
      linkcheck/__init__.py

1.8.4
  * Improve error message for failing htmlsax module import
    Changed files: linkcheck/parser/htmllib.py
  * Regenerate parser with new bison 1.875
    Changed files: linkcheck/parser/htmlparser.c
  * Some CVS files were not the same as their local counterpart.
    Something went wrong. Anyway, I re-committed them.
    Changed files: a lot of .py files

1.8.3
  * add missing imports for StringUtil in log classes, defer i18n of
    log field names (used for CGI scripts)
    Changed files: linkcheck/log/*.py
  * fixed wrong debug level comparison from > to >=
    Changed files: linkcheck/Config.py
  * JavaScript checks in the CGI scripts
    Changed files: lconline/lc_cgi.html.*
    Added files: lconline/check.js
  * Updated documentation with a link restriction example
    Changed files: linkchecker, linkchecker.1, FAQ
  * Updated po/pygettext.py to version 1.5, cleaned up some gettext
    usages.
  * updated i18n
    Added files: linkcheck/i18n.py
    Changed files: all .py files using i18n
  * Recognise "= 2.2.1, remove httplib.
    Changed files: setup.py, INSTALL, linkchecker
  * Add again python-dns, the Debian package maintainer is unresponsive
    Added files: linkcheck/DNS/*.py
    Changed files: INSTALL, setup.py
  * You must now use named constants for ANSI color codes
    Changed files: linkcheckerrc, linkcheck/log/ColoredLogger.py
  * Release RedHat 8.0 rpm packages.
    Changed files: setup.py, MANIFEST.in
  * remove --robots-txt from manpage, fix HTZP->HTTP typo
    Changed files: linkchecker.1

1.7.1
  * Fix memory leak in HTML parser flushing error path
    Changed files: htmlparse.y
  * add custom line and column tracking in parser
    Changed files: htmllex.l, htmlparse.y, htmlsax.h, htmllib.py
  * Use column tracking in urldata classes
    Changed files: UrlData.py, FileUrlData.py, FtpUrlData.py,
      HostCheckingUrlData.py
  * Use column tracking in logger classes
    Changed files: StandardLogger.py, CVSLogger.py, ColoredLogger.py,
      HtmlLogger.py, SqlLogger.py

1.7.0
  * Added new HTML parser written in C as a Python extension module.
    It is faster and it is more fault tolerant.
    Of course, this means I cannot provide .exe installers any more
    since the distutils don't provide cross-compilation.

1.6.7
  * Removed check for tags codebase attribute, but honor it when
    checking applet links
  * Handle tags archive attribute as a comma separated list
    Closes: SF bug #636802
  * Fix a nasty bug in tag searching, which ignored tags with more than
    one link attribute in it.
  * Fix concatenation with relative base urls by first joining the
    parent url.
  * New commandline option --profile to write profile data.
  * Add httplib.py from Python CVS 2.1 maintenance branch, which has
    the skip_host keyword argument I am using now.

1.6.6
  * Use the new HTTPConnection/HTTPResponse interface of httplib
    Closes: SF bug #634679
    Changed files: linkcheck/HTTPUrlData.py, linkcheck/HTTPSUrlData.py
  * Updated the ftp online test
    Changed files: test/output/test_ftp

1.6.5
  * Catch the maximum recursion limit error while parsing links and
    print an error message instead of bailing out.
    Changed files: linkcheck/UrlData.py
  * Fixed Ctrl-C only interrupting one single thread, not the whole
    program.
    Changed files: linkcheck/UrlData.py, linkcheck/__init__.py
  * HTML syntax cleanup and relative cgi form url for the cgi scripts
    Changed files: lconline/*.html

1.6.4
  * Support for ftp proxies
    Changed files: linkcheck/FtpUrlData.py, linkcheck/HttpUrlData.py
    Added files: linkcheck/ProxyUrlData.py
  * Updated German translation

1.6.3
  * Generate md5sum checksums for distributed files
    Changed files: Makefile
  * use "startswith" string method instead of a regex
    Changed files: linkchecker, linkcheck/UrlData.py
  * Add a note about supported languages, updated the documentation.
    Changed files: README, linkchecker, FAQ
  * Remove --robots-txt option from documentation, it is per default
    enabled and you cannot disable it from the command line.
    Changed files: linkchecker, po/*.po
  * fix --extern argument creation
    Changed files: linkchecker, linkcheck/UrlData.py
  * Print help if PyDNS module is not installed
    Changed files: linkcheck/UrlData.py
  * Print information if a proxy was used.
    Changed files: linkcheck/HttpUrlData.py
  * Updated German documentation
    Changed files: po/de.po
  * Oops, an FTP proxy is not used. Will make it in the next release.
    Changed files: linkcheck/FtpUrlData.py
  * Default socket timeout is now 30 seconds (10 was too short)

1.6.2
  * Warn about unknown Content-Encodings. Don't parse HTML in this case.
  * Support deflate content encoding (snatched from Debian's reportbug)
  * Add appropriate Accept-Encoding header to HTTP request.
  * Updated German translations

1.6.1
  * FileUrlData.py: remove searching for links in text files, this is
    error prone. Just handle *.html and Opera Bookmarks.
  * Make separate ChangeLog from debian/changelog. For previous changes,
    see debian/changelog.
  * Default socket timeout is now 10 seconds
  * updated linkcheck/timeoutsocket.py to newest version
  * updated README and INSTALL
  * s/User-agent/User-Agent/, use same case as other browsers