linkchecker/ChangeLog
2005-01-20 09:32:24 +00:00

2.2 "" (released xx.xx.2005)
* Write CSV column headers as the first row.
Thanks to Hartmut Goebel.
Type: feature
Changed: linkcheck/logger/csv.py
* Support bzip-compressed man pages in RPM install script.
Also from Hartmut Goebel.
Type: feature
Changed: install-rpm.sh
* HTML parser updates:
- supply and use Py_CLEAR macro
- only call set_encoding function if tag name is 'meta'
Type: feature
Changed: linkcheck/HtmlParser/*
* Changed documentation format for epydoc.
Type: documentation
Changed: *.py
* Fix FTP error message display crash.
Type: bugfix
Changed: linkcheck/checker/ftpurl.py
* Ask before overwriting old profile data with --profile.
Type: feature
Changed: linkchecker
* When searching for link names, limit the amount of data to look at
to 500 characters. Do not look at the complete content anymore.
This speeds up parsing of big HTML files significantly.
Type: optimization
Changed: linkcheck/linkparse.py
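
A minimal sketch of the 500-character limit on link name searching
described in the last 2.2 entry. The names MAX_NAMELEN and link_name and
the regular expression are assumptions made for illustration; this is not
the code in linkcheck/linkparse.py.

```python
import re

# Limit taken from the changelog entry; the constant name is hypothetical.
MAX_NAMELEN = 500

# Hypothetical pattern: everything up to the closing </a> tag.
_end_of_anchor = re.compile(r"(.*?)</a\s*>", re.IGNORECASE | re.DOTALL)


def link_name(content_after_tag):
    """Return the link text found within the first MAX_NAMELEN characters.

    Only a bounded window is searched, so the cost no longer grows with
    the size of the document. An empty string means no name was found.
    """
    window = content_after_tag[:MAX_NAMELEN]
    match = _end_of_anchor.match(window)
    return match.group(1).strip() if match else ""
```

A closing tag that lies beyond the window is simply not found, which is
the trade-off the entry accepts in exchange for faster parsing.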
2.1 "Shogun Assassin" (released 11.1.2005)
* Added XHTML support to the HTML parser.
Type: feature
Changed: linkcheck/HtmlParser/*
* Support plural forms in gettext translations.
Type: feature
Changed: po/*.po*
* Remove intern optcomplete installation, and make it optional to
install, since it is only needed on Unix installations using
bash-completion.
Type: feature
Changed: linkchecker, config/linkchecker-completion
Removed: linkcheck/optcomplete.py
* Minor enhancements in url parsing.
Type: feature
Changed: linkcheck/url.py
* Sort according to preference when checking MX hosts so that
preferred MX hosts get checked first.
Type: bugfix
Changed: linkcheck/checker/mailtourl.py
* If mail VRFY command fails, print a warning message.
Type: feature
Changed: linkcheck/checker/mailtourl.py
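
The MX ordering from the 2.1 entry above can be sketched as follows. The
function name and the (preference, host) tuple layout are assumptions;
the only fact relied on is that a lower MX preference value means a more
preferred host.

```python
def sort_mx_hosts(mx_records):
    """Return host names ordered so that preferred MX hosts come first.

    mx_records is a list of (preference, host) tuples. A lower preference
    value means a more preferred mail exchanger.
    """
    ordered = sorted(mx_records, key=lambda record: record[0])
    return [host for _preference, host in ordered]
```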
2.0 "I Kina spiser de hunde" (released 7.12.2004)
* Regenerate the HTML parser with new Bison version 1.875d.
Also use the now supported Bison memory macros YYMALLOC and
YYFREE.
Type: feature
Changed: linkcheck/HtmlParser/htmlparse.y
* Updated installation and usage documentation.
Type: documentation
Changed: doc/install.txt, doc/index.txt
* Added comment() method to loggers for printing comments.
Type: feature
Changed: linkcheck/logger/*.py
* Updated and translated manpages. French translation from
Yann Verley. German translation from me ;)
Type: documentation
Added: doc/de/linkchecker.de.1, doc/fr/linkchecker.fr.1
Changed: doc/en/linkchecker.1
* Fix mailto: URL norming by splitting the query type correctly.
Type: bugfix
Changed: linkcheck/url.py
* Encode all output strings for display.
Type: bugfix
Changed: linkchecker
* Accept -o option logger type as case independent string.
Type: feature
Changed: linkchecker
* Internal Unicode handling fixed.
Type: bugfix
Changed: linkcheck/url.py, linkcheck/checker/*.py
* Use correct FTP directory list parsing.
Type: bugfix
Changed: linkcheck/checker/ftpurl.py
2.0rc2 "El día de la bestia" (released 20.11.2004)
* encode version string for --version output
Type: bugfix
Closes: SF bug #1067915
Changed: linkchecker
* Added shell config note with --home install option.
Type: documentation
Closes: SF bug #1067919
Changed: doc/install.txt
* Recheck robots.txt allowance and intern/extern filters for
redirected URLs.
Type: bugfix
Closes: SF bug #1067914
Changed: linkcheck/checker/httpurl.py
* Updated the warning and info messages to be always complete
sentences.
Type: feature
Changed: linkcheck/checker/*.py, po/*, linkcheck/ftests/*.py,
linkcheck/ftests/data/*.result
* Added missing script_dir to the windows installer script.
Use python.exe instead of pythonw.exe and --interactive option to
call linkcheck script.
Add Documentation link to the programs group.
Type: bugfix
Changed: install-linkchecker.py
2.0rc1 "The Incredibles" (released 16.11.2004)
* Only instantiate SSL connections if SSL is supported
Type: bugfix
Changed: linkcheck/checker/httpurl.py
* Close all opened log files.
Type: bugfix
Changed: linkcheck/logger/*.py
* All loggers now have an output encoding. Valid encodings are listed
in http://docs.python.org/lib/node127.html. The default encoding is
"iso-8859-15".
Type: feature
Changed: linkcheck/logger/*.py
* The --output and --file-output parameters can specify the encoding
now. The documentation has been updated with this change.
Type: feature
Changed: linkchecker, linkchecker.1
* The encoding can also be specified in the linkcheckerrc config file.
Type: feature
Changed: config/linkcheckerrc
* All leading directories of a given output log file are created
automatically now. Errors creating these directories or opening
the log file for writing abort the checking and print a usage message.
Type: feature
Changed: linkchecker, linkcheck/logger/__init__.py
* Coerce url names to unicode
Type: feature
Changed: linkcheck/checker/__init__.py
* Accept unicode filenames for resolver config
Type: feature
Changed: linkcheck/dns/resolver.py
* LinkChecker now accepts Unicode domain names and converts them
according to RFC 3490 (http://www.faqs.org/rfcs/rfc3490.html).
Type: feature
Changed: linkcheck/dns/resolver.py, linkcheck/url.py
* Exceptions in the log systems are no longer caught.
Type: feature
Changed: linkcheck/ansicolor.py
* Remember a <base href=""> tag in the link parser. Saves one HTML
parse.
Type: feature
Changed: linkcheck/checker/urlbase.py, linkcheck/linkparse.py
* Optimize link name parsing of img alt tags.
Type: feature
Changed: linkcheck/linkname.py
* Remove all references to the old 'colored' output logger.
Type: documentation
Closes: SF bug #1062011
Changed: linkchecker.1
* Synchronized the linkchecker documentation and the man page.
Type: documentation
Closes: SF bug #1062034
Changed: linkchecker, linkchecker.1
* Make --quiet an alias for -o none.
Type: bugfix
Closes: SF bug #1063144
Changed: linkchecker, linkcheck/configuration.py,
linkcheck/checker/consumer.py
* Re-norm a changed file:// base url, avoiding a spurious warning.
Type: bugfix
Changed: linkcheck/checker/fileurl.py
* Wrong case of file links on Windows platforms now issues a
warning.
Type: feature
Closes: SF bug #1062007
Changed: linkcheck/checker/fileurl.py
* Updated the French translation. Thanks to Yann Verley.
Type: feature
Changed: po/fr.po
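
To illustrate the RFC 3490 conversion of Unicode domain names mentioned
in the 2.0rc1 entries above, here is a small sketch using Python's
built-in "idna" codec. The helper name is hypothetical, and the code in
linkcheck/url.py may work differently.

```python
def to_ascii_hostname(hostname):
    """Convert a Unicode host name to its ASCII (IDNA) form.

    Uses the standard library "idna" codec, which implements RFC 3490.
    Plain ASCII names pass through unchanged.
    """
    return hostname.encode("idna").decode("ascii")
```

For example, to_ascii_hostname("b\u00fccher.example") yields
"xn--bcher-kva.example".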
1.13.5 "Die Musterknaben" (released 22.9.2004)
* Use xgettext with Python support for .pot file creation, adjusted
developer documentation.
Type: feature
Changed: doc/install.txt, po/Makefile, MANIFEST.in
Removed: po/pygettext.py, po/msgfmt.py
* Use plural gettext form for log messages.
Type: feature
Changed: linkcheck/logger/{text,html}.py
* Check if FTP file really exists instead of only the parent dir.
Type: bugfix
Changed: linkcheck/checker/ftpurl.py
* Document the different logger output types.
Type: documentation
Changed: linkchecker, linkchecker.1
* Recursion into FTP directories and parseable files has been
implemented.
Type: feature
Changed: linkcheck/checker/ftpurl.py
1.13.4 "Shaun of the dead" (released 17.9.2004)
* Catch HTTP cookie errors and add a warning.
Type: bugfix
Changed: linkcheck/checker/httpurl.py
* fix up response page object in robots.txt parser for the upcoming
Python 2.4 release
Type: bugfix
Changed: linkcheck/robotparser2.py
* remove cached urls from progress queue, fixing endless wait for
checking to finish
Type: bugfix
Changed: linkcheck/checker/consumer.py
* updated and synchronized documentation of the man page (linkchecker.1)
and the linkchecker --help output.
Type: documentation
Changed: linkchecker, linkchecker.1
1.13.3 "Fight Club" (released 10.9.2004)
* Prevent collapsing of relative parent dir paths. This fixes false
positives on URLs of the form "../../foo".
Closes: SF bug #1025459
Changed: linkcheck/url.py, linkcheck/tests/test_url.py
1.13.2 "Zatôichi" (released 8.9.2004)
* Fix permissions of data files on install to be world readable.
Type: bugfix
Closes: SF bug #1022132
Changed: setup.py
* Fixed the SQL logger when encountering empty URLs.
Type: bugfix
Closes: SF bug #1022156
Changed: linkcheck/logger/sql.py
* Added notes about access rules for CGI scripts
Type: documentation
Changed: doc/install.txt
* Updated French translation. Thanks, Yann Verley!
Type: feature
Changed: po/fr.po
* initialize i18n at program start
Type: bugfix
Changed: linkchecker, linkcheck/lc_cgi.py
* Make initialization function for i18n, and allow LOCPATH to override
the locale directory.
Type: feature
Changed: linkcheck/__init__.py
* Removed debug print statement when issuing linkchecker --help.
Type: bugfix
Changed: linkchecker
* Reset to default ANSI color scheme, we don't know what background
color the terminal has.
Type: bugfix
Closes: SF bug #1022158
Changed: linkcheck/configuration.py
* Reinit the logger object when config files change values.
Type: bugfix
Changed: linkcheck/configuration.py
* Only import ifconfig routines on POSIX systems.
Type: bugfix
Closes: SF bug #1024607
Changed: linkcheck/dns/resolver.py
1.13.1 "Old men in new cars" (released 3.9.2004)
* Fixed RPM generation by adding the generated config file to the
installed files list.
Type: bugfix
Changed: setup.py
* Mention in the documentation that old versions should be removed
when upgrading.
Type: documentation
Changed: doc/upgrading.txt, doc/install.txt
* Fix typo in redirection cache handling.
Type: bugfix
Changed: linkcheck/checker/cache.py
* The -F file output must honor verbose/quiet configuration.
Type: bugfix
Changed: linkcheck/checker/consumer.py
* Generate all translation files under windows systems.
Type: bugfix
Changed: po/Makefile
* Added windows binary installer script and configuration.
Type: feature
Changed: setup.py, setup.cfg, doc/install.txt
Added: install-linkchecker.py
* Do not raise an error when user and/or password of ftp URLs is not
specified.
Type: bugfix
Changed: linkcheck/checker/ftpurl.py
* honor anchor part of cache url key, handle the recursion check
with an extra cache key
Type: bugfix
Changed: linkcheck/checker/{urlbase,cache,fileurl}.py
* Support URL lists in text files with one URL per line. Empty lines
or comment lines starting with '#' are ignored.
Type: feature
Changed: linkcheck/checker/fileurl.py
* Added new option --extern-strict to specify strict extern url
patterns.
Type: feature
Changed: linkchecker
* Strip quotes from parsed CSS urls.
Type: bugfix
Changed: linkcheck/checker/urlbase.py
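
The URL list format from the 1.13.1 entry above (one URL per line, empty
lines and '#' comment lines ignored) can be read roughly like this. The
function name is an assumption, not the parser in
linkcheck/checker/fileurl.py.

```python
def read_url_list(lines):
    """Return the URLs from an iterable of text lines.

    Empty lines and comment lines starting with '#' are skipped, as
    described in the changelog entry.
    """
    urls = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```

An open text file can be passed directly, since iterating over it yields
its lines.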
1.13.0 "The Butterfly Effect" (released 1.9.2004)
* lots of internal code restructuring
Type: code cleanup
Changed: a lot
* If checking revealed errors (or warnings with --warnings),
the command line client exits with a non-zero exit status.
Type: feature
Closes: SF bug 1013191
Changed: linkchecker, linkcheck/checker/consumer.py
* Specify the HTML doctype and charset in HTML output.
Type: feature
Closes: SF bug 1014283
Changed: linkcheck/logger/html.py
* Fix endless loop on broken urls with non-empty anchor.
Type: bugfix
Changed: linkcheck/checker/httpurl.py
* For news: or nntp: urls, entries in ~/.netrc are now ignored.
Give username/password info in the configuration file or on the
command line instead.
Type: bugfix
Changed: linkcheck/checker/nntpurl.py
* The HTML output now shows HTML and CSS validation links for
the parent URL of invalid links.
Type: feature
Changed: linkcheck/logger/html.py
* The status is now printed by default; it can be suppressed with
the new --no-status option.
Type: feature
Changed: linkchecker
* The default recursion level is now infinite.
Type: feature
Changed: linkchecker
* The 'outside of domain filter' is no longer a warning but an informational
message. A warning is inappropriate since the user is in full control
over what links are extern or intern.
Type: feature
Closes: SF bug 1013206
Changed: linkcheck/urlbase.py
* Renamed the --strict option to --extern-strict-all.
Type: feature
Changed: linkchecker
* a new cache and queueing algorithm makes sure that no URL is
checked twice.
Type: feature
Changed: linkcheck/checker/cache.py
* the given user/password authentication is now also used to
get robots.txt files.
Type: feature
Changed: linkcheck/robotparser2.py, linkcheck/checker/cache.py
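
The idea behind the 1.13.0 cache and queueing entry above (no URL is
checked twice) can be sketched with a queue that remembers every URL it
has ever accepted. The class and method names are hypothetical, and the
real algorithm in linkcheck/checker/cache.py is likely more involved.

```python
from collections import deque


class UrlQueue:
    """FIFO queue that accepts each URL at most once."""

    def __init__(self):
        self._seen = set()       # every URL ever queued
        self._pending = deque()  # URLs still waiting to be checked

    def put(self, url):
        """Queue url unless it was queued before; return True if added."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._pending.append(url)
        return True

    def get(self):
        """Return the next URL to check."""
        return self._pending.popleft()

    def __len__(self):
        return len(self._pending)
```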
1.12.3 "The Princess Bride" (released 27.5.2004)
* fall back to GET on bad status line of a HEAD request
Type: bugfix
Changed: linkcheck/HttpUrlData.py
* really fall back to GET with Zope servers; fixes infinite loop
Type: bugfix
Changed: linkcheck/HttpUrlData.py
* better error msg on BadStatusLine error
Type: feature
Changed: linkcheck/UrlData.py
* updated optcomplete to newest upstream
Type: feature
Changed: linkcheck/optcomplete.py
* also quote query parts of urls
Type: bugfix
Changed: linkcheck/{HttpUrlData, url}.py
* - preserve the order in which HTML attributes have been parsed
- cope with trailing space in HTML comments
Type: feature
Changed: linkcheck/parser/{__init__.py,htmllex.l}
Added: linkcheck/containers.py
* rework anchor fallback
Type: bugfix
Changed: linkcheck/HttpUrlData.py
* move contentAllowsRobot check to end of recursion check to avoid
unnecessary GET request
Type: bugfix
Changed: linkcheck/UrlData.py
1.12.2 (release 4.4.2004)
* use XmlUtils instead of xmlify for XML quoting
Type: code cleanup
Added: linkcheck/XmlUtils.py
Changed: linkcheck/StringUtil.py, linkcheck/log/XMLLogger.py
* don't require a value anymore with the --version option
Type: bugfix
Changed: linkchecker
* before putting url data objects in the queue, check if they have
correct syntax and are not already cached
Type: optimization
Changed: linkcheck/{UrlData,Config}.py
* every once in a while, remove all already cached urls from the
incoming queue. This action is reported when --status is given.
Type: optimization
Changed: linkcheck/Config.py
* both changes above result in significant performance improvements
when checking large websites, since a majority of the links tend
to be navigation links to already-cached pages.
Type: note
* updated examples and put them before options in the man page for
easier reading
Type: documentation
Changed: linkchecker, linkchecker.1
* added contact url and email to the HTTP User-Agent string, which
gets us more accepted by some bot-blocking software; also see
http://www.livejournal.com/bots/
Type: feature
Changed: linkcheck/Config.py
* only check robots.txt for http connections
Type: bugfix
Changed: linkcheck/{Http,}UrlData.py
Closes: SF bug 928895
* updated regression tests
Type: feature
Changed: test/test_*.py, Makefile
Added: test/run.sh
* preserve the order in which HTML attributes have been parsed
Type: feature
Changed: linkcheck/parser/{__init__.py,htmllex.l}
* handle and correct missing start quotes in HTML attributes
Type: feature
Changed: linkcheck/parser/htmllex.l
* full parsing of .css files
Type: feature
Changed: linkcheck/{Http,}UrlData.py, linkcheck/linkparse.py
* removed Gilman news draft
Type: feature
Removed: draft-gilman-news-url-00.txt
1.12.1 (release 21.2.2004)
* raise IncompleteRead instead of ValueError on malformed chunked
HTTP data
Changed: linkcheck/httplib2.py
* catch errors earlier in recursion check
Changed: linkcheck/UrlData.py
* quote url and parent url in log output
Changed: linkcheck/log/*.py
Added: linkcheck/url.py
1.12.0 (release 31.1.2004)
* added LRU.setdefault function
Changed: linkcheck/LRU.py
Closes: SF bug 885916
* Added Mac OS X as supported platform (version 10.3 is known to work)
Changed: README, INSTALL
* HTML parser objects are now subclassable and collectable by the cyclic
garbage collector
Changed: linkcheck/parser/htmlparse.y
* made some minor parser fixes for attribute scanning and JavaScript
Changed: linkcheck/parser/htmllex.l
* include the optcomplete module for bash autocompletion
Added: linkcheck/optcomplete.py, linkcheck-completion
Changed: MANIFEST.in, setup.py
* print out nicer error message for unknown host names
Changed: linkcheck/UrlData.py
* added new logger type "none" printing out nothing which is handy for
cron scripts.
Changed: linkchecker, linkcheck/Config.py, linkcheck/log/__init__.py
Added: linkcheck/log/NoneLogger.py
* the -F file output option disables console output now
Changed: linkchecker
* added an example cron script
Added: linkcheck-cron.sh
Changed: MANIFEST.in, setup.py
* only warn about servers lacking anchor support when the url
actually has an anchor
Changed: linkcheck/HttpUrlData.py
* always fall back to HTTP GET request when HEAD gave an error to
cope with servers not supporting HEAD requests
Changed: linkcheck/HttpUrlData.py, FAQ
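
A rough sketch of the HEAD-to-GET fallback from the last 1.12.0 entry
above. The request callable and the rule that any status of 400 or above
counts as an error are assumptions made for this example; they are not
taken from linkcheck/HttpUrlData.py.

```python
def check_with_fallback(request, url):
    """Return the HTTP status for url, retrying with GET if HEAD fails.

    request(method, url) is a caller-supplied function (hypothetical)
    that returns an integer status code or raises OSError on
    connection problems.
    """
    try:
        status = request("HEAD", url)
    except OSError:
        status = None
    # Assumption: treat a failed connection or any 4xx/5xx as a HEAD error.
    if status is None or status >= 400:
        status = request("GET", url)
    return status
```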
1.10.3 (release 10.1.2004)
* use the optparser module for command line parsing
Changed: linkchecker, po/*.po
* use Set() instead of hashmap
Changed: linkcheck/Config.py
* fix mime-type checking to allow parsing of .css stylesheets
Changed: linkcheck/HttpUrlData.py
* honor HTML meta tags for robots, ie.
<meta name="ROBOTS" content="NOFOLLOW">
Changed: linkcheck/UrlData.py, linkcheck/linkparse.py
* much less aggressive thread acquiring, this fixes the 100% CPU
usage from the previous version
Changed: linkcheck/Threader.py
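
The robots meta tag handling from the 1.10.3 entry above could look
roughly like this. The sketch uses the standard library html.parser
module rather than LinkChecker's own HTML parser, and the class and
function names are made up for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record whether a page forbids following its links."""

    def __init__(self):
        super().__init__()
        self.follow = True

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = (attributes.get("content") or "").lower()
        directives = [part.strip() for part in content.split(",")]
        if "nofollow" in directives:
            self.follow = False


def may_follow_links(html):
    """Return False if the page carries a robots NOFOLLOW meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.follow
```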
1.10.2 (release 3.1.2004)
* fixed CGI safe_url pattern, it was too strict
Changed: linkcheck/lc_cgi.py
* replace backticks with repr() or %r
Changed: all .py files containing backticks, and po/*.po
* make windows DNS nameserver parsing more robust
Changed: linkcheck/DNS/Base.py
Closes: SF bugs 863227,864383
* only cache used data, not the whole url object
Changed: linkcheck/{Http,}UrlData.py
* limit cached data
Changed: linkcheck/{UrlData,Config}.py
Added: linkcheck/LRU.py
Closes: SF bug 864516
* use dummy_threading module and get rid of the _NoThreads
functions
Changed: linkchecker, linkcheck/{Config,Threader}.py,
test/test_*.py
* set default connection timeout to 60 seconds
Changed: linkcheck/__init__.py
* new option --status prints regular messages about the number of
checked urls and urls still to check
Changed: linkchecker, linkcheck/{__init__,Config}.py
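
The cache limit from the 1.10.2 entries above was added in
linkcheck/LRU.py. A least-recently-used cache of that kind can be
sketched as below; this is an illustration built on
collections.OrderedDict with assumed names, not the module's actual
code.

```python
from collections import OrderedDict


class LRUCache:
    """Mapping that keeps at most max_size recently used entries."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Evict the least recently used entries once the limit is exceeded.
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)  # reading counts as a use
        return value

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)
```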
1.10.1 (release 19.12.2003)
* added Mandrake .spec file from Chris Green <cmg@dok.org>
Added: linkchecker.spec
Changed: MANIFEST.in
* print last-modified date for http and https links in infos
Changed: linkcheck/HttpUrlData.py
* add detailed installation instructions for Windows
Changed: INSTALL
Closes: SF bug 857748
* updated the DNS nameserver config parse routines
Changed: linkcheck/DNS/Base.py
Added: linkcheck/DNS/winreg.py
Removed: linkcheck/DNS/win32dns.py
* fix https support test
Changed: linkcheck/HttpUrlData.py
1.10.0 (released 7.12.2003)
* catch httplib errors in robotparser
Changed: linkcheck/robotparser2.py
Closes: SF bug 836864
* - infinite recursion option with negative value works now
- initialize self.urlparts to avoid crash when reading cached http
urls
- with --strict option do not add any automatic filters if the user
gave his own on the command line
Changed: linkcheck/UrlData.py
1.9.5 (released 31.10.2003)
* Add Zope to servers with broken HEAD support, adjusted the FAQ
Changed: linkcheck/HttpUrlData.py, FAQ
Closes: SF bug 833419
* Disable psyco usage, it is causing infinite loops (this is a known
issue with psyco); and it is disabling ctrl-c interrupts (this
is also a known issue in psyco)
Changed: linkchecker
* use internal debug logger
Changed: linkcheck/robotparser2.py
* do not hardcode Accept-Encoding header in HTTP request
Added: linkcheck/httplib2.py
Changed: linkcheck/robotparser2.py
1.9.4 (released 22.10.2003)
* parse CSS stylesheet files and check included urls, for example
background images
Changed: linkcheck/{File,Http,Ftp,}UrlData.py, linkcheck/linkparser.py
* try to use psyco for the commandline linkchecker script
Changed: linkchecker
* when decompression of compressed HTML pages fails, assume the page
is not compressed
Changed: linkcheck/{robotparser2,HttpUrlData}.py
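
The decompression fallback from the last 1.9.4 entry above amounts to
the following pattern. The function name is hypothetical and only gzip
is handled here; the original code may cover other encodings.

```python
import gzip
import zlib


def decode_body(data):
    """Return the gzip-decompressed body, or data itself on failure.

    If decompression fails, the page is assumed not to be compressed
    and the raw bytes are returned unchanged.
    """
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return data
```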
1.9.3 (released 16.10.2003)
* re-added an updated robot parser which uses urllib2 and can decode
compressed transfer encodings.
Added: linkcheck/robotparser2.py
* more restrictive url validity checking when running in CGI mode
Changed: linkcheck/lc_cgi.py
* accept more Windows path specifications, like
file://C:\Dokume~1\test.html
Changed: linkcheck/FileUrlData.py
1.9.2
* parser fixes:
- do not #include <stdint.h>, fixes build on some FreeBSD, Windows
and Solaris/SunOS platforms
- ignore first leading invalid backslash in a=\"b\" attributes
Changed: linkcheck/parser/htmllex.{l,c}
* add full script path to linkchecker on windows systems
Changed: linkchecker.bat
* fix generation of Linkchecker_Readme.txt under windows systems
Changed: setup.py
1.9.1
* add documentation how to change the default C compiler
Changed: INSTALL
* fixed blacklist logging
Changed: linkcheck/log/BlacklistLogger.py
* removed unused imports
Changed: linkcheck/*.py
* parser fixes:
- fixed parsing of end tags with trailing garbage
- fixed parsing of script single comment lines
Changed: linkcheck/parser/htmllex.l
1.9.0
* Require Python 2.3
- removed timeoutsocket.py and robotparser.py, using upstream
- use True/False for boolean values
- use csv module
- use new-style classes
Closes: SF bug 784977
Changed: a lot
* update po makefiles and tools
Changed: po/*
* start CGI output immediately
Changed: lc.cgi, lc.fcgi, lc.sz_fcgi, linkcheck/lc_cgi.py
Closes: SF bug 784331
1.8.22
* allow colons in HTML attribute names, used for namespaces
Changed: linkcheck/parser/htmllex.l
* fix match of intern patterns with --denyallow enabled
Changed: linkcheck/UrlData.py
* s/intern/internal/ and s/extern/external/ in the documentation
Changed: linkchecker, linkchecker.1, FAQ
* rename column "column" to "col" in SQL output, since "column" is
a reserved keyword. Thanks Garvin Hicking for the hint.
Changed: linkcheck/log/SQLLogger.py, create.sql
* handle HTTP redirects to a non-http url
Changed: linkcheck/{Http,}UrlData.py
Closes: SF bug 784372
1.8.21
* detect recursive redirections; the maximum of five redirections is
still there though
* after every HTTP 301 or 302 redirection, check the URL cache again
Closes: SF bug 776851
* put all HTTP 301 redirection answers also in the url cache as
aliases of the original url. this could mess up some redirection
warnings (ie warn about redirection when there is none), but it is
more network efficient.
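
Detecting recursive redirections while keeping the limit of five, as
described in the 1.8.21 entries above, can be sketched like this. The
redirect_target callable and the use of ValueError are assumptions made
for the example.

```python
MAX_REDIRECTS = 5  # the limit named in the changelog entry


def follow_redirects(url, redirect_target):
    """Follow redirections from url and return the final URL.

    redirect_target(url) is a caller-supplied function (hypothetical)
    returning the URL that url redirects to, or None if it does not
    redirect. Raises ValueError on a redirection loop or when more than
    MAX_REDIRECTS redirections are needed.
    """
    seen = {url}
    redirects = 0
    while True:
        target = redirect_target(url)
        if target is None:
            return url
        if target in seen:
            raise ValueError("recursive redirection to %r" % target)
        redirects += 1
        if redirects > MAX_REDIRECTS:
            raise ValueError("more than %d redirections" % MAX_REDIRECTS)
        seen.add(target)
        url = target
```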
1.8.20
* fix setting of domain in set_intern_url
Changed: linkcheck/UrlData.py
* - parse JS strings and comments
- accept "<!- " as comment begin
Changed: linkcheck/parser/htmllex.l
Closes: SF bug 768661
* quote url before submitting the request, the previous map() call
was useless. Thanks Toby Dickenson for the patch.
Changed: linkcheck/HttpUrlData.py
Closes: SF bug 776416
1.8.19
* add scheme colon in set_intern_url
Changed: linkcheck/UrlData.py
* fix threading option -t
Changed: linkchecker, linkcheck/Config.py
* do not try to get content of urls that have no content (eg mail)
Closes: SF bug 765016
Changed: linkcheck/{Mailto,Nntp,Telnet,}UrlData.py
* added robots.txt FAQ, updated links
Removed: norobots-rfc.html
Changed: FAQ, WONTDO, TODO
* add iso-8859-1 coding line to all .py files
Changed: *.py
* Correctly quote the HTML output
Changed: linkcheck/log/HtmlLogger.py
1.8.18
* fix option error messages for invalid integer arguments
Changed files: linkchecker
* enable infinite recursion with a negative -r value
Changed files: linkcheck/{UrlData,Config}.py, linkchecker,
linkchecker.1
* if -s is given, add some link patterns to urls given on the
command line automatically:
for local files, add -i "^file:". For http and ftp urls, add
the domain name -i "<domain>".
Changed files: linkcheck/UrlData.py, linkchecker
1.8.17
* fix parsing of missing end tag in "</a <a b=c>"
Changed files: linkcheck/parser/htmllex.l
* fix entity resolving in parsed html links
Closes: SF bug #749543
Changed files: linkcheck/StringUtil.py
1.8.16
* also look at id attributes on anchor check
(Closes SF Bug #741131)
Changed files: linkcheck/{linkparser,UrlData}.py
* minor parser cleanups
Changed files: linkcheck/parser/*
1.8.15
* Fix compile errors with C variable declarations in HTML parser.
Thanks to Fazal Majid <fazal@majid.fm>
Changed files: linkcheck/parser/htmlparse.[yc]
1.8.14
* fix old bug in redirects not using the full url. This resulted in
errors like (-2, "Name or service not known")
Changed files: linkcheck/HttpUrlData.py
Closes: SF Bug #729007
* only remove anchors on IIS servers (other servers are doing quite
well with anchors... can you spell A-p-a-c-h-e ?)
Changed files: linkcheck/{HttpUrlData, UrlData}.py
* Parser changes:
- correctly propagate and display parsing errors
- really cope with missing ">" end tags
Changed files: linkcheck/parser/html{lex.l, parse.y},
linkcheck/linkparse.py, linkcheck/UrlData.py
* quote urls before a request
Changed files: linkcheck/HttpUrlData.py
1.8.13
* fix typo in manpage
Changed files: linkchecker.1
* remove anchor from HEAD and GET requests
Changed files: linkcheck/{HttpUrlData, UrlData}.py
1.8.12
* convert urlparts to list also on redirect
Changed files: linkcheck/HttpUrlData.py
1.8.11
* catch httplib.error exceptions
Changed files: linkcheck/HttpUrlData.py
* override interactive password question in robotparser.py
Changed files: linkcheck/robotparser.py
* switch to urllib2.py as default url connect.
Changed files: linkcheck/UrlData.py
* recompile html parser with flex 2.5.31
Changed files: linkcheck/parser/{htmllex.c,Makefile}
1.8.10
* new option --no-anchor-caching
Changed files: linkchecker, linkcheck/{Config.py, UrlData.py}, FAQ
* quote empty attribute arguments
Changed files: linkcheck/parser/htmllex.[lc]
1.8.9
* recompile with bison 1.875a
Changed files: linkcheck/parser/htmlparse.[ch]
* remove stpcpy declaration, fixes compile error on RedHat 7.x
Changed files: linkcheck/parser/htmlsax.h
* clarify keyboard interrupt warning to wait for active connections
to finish
Changed files: linkcheck/__init__.py
* resolve &#XXX; number entity references
Changed files: linkcheck/{StringUtil.py,linkname.py}
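
Resolving &#XXX; number entity references, as in the last 1.8.9 entry
above, can be sketched with a regular expression. The names are
illustrative and only decimal references are handled; this is not the
code in linkcheck/StringUtil.py.

```python
import re

_num_entity = re.compile(r"&#(\d+);")


def resolve_numeric_entities(text):
    """Replace decimal references such as &#38; with their characters."""

    def replace(match):
        try:
            return chr(int(match.group(1)))
        except (ValueError, OverflowError):
            # Not a valid code point: leave the reference untouched.
            return match.group(0)

    return _num_entity.sub(replace, text)
```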
1.8.8
* All amazon servers block HEAD requests with timeouts. Use GET as
a workaround, but issue a warning.
Changed files: linkcheck/HttpUrlData.py
* restrict CGI access to localhost per default
Changed files: lc.cgi, lc.fcgi, lc.sz_fcgi, linkcheck/lc_cgi.py
1.8.7
* #define YY_NO_UNISTD_H on Windows systems, fixes build error with
Visual Studio compiler
Changed files: setup.py
* use python2.2 headers for parser compile, not 2.1.
Changed files: linkcheck/parser/Makefile
1.8.6
* include a fixed robotparser.py (from Python 2.2 CVS maint branch)
1.8.5
* fix config.warn to warn
Changed files: linkcheck/__init__.py
* parser changes:
o recognise "<! -- -->" HTML comments (seen at Eonline)
o recognise "<! !>" HTML comments (seen at www.nba.com)
o rebuild with flex 2.5.27
Changed files: linkcheck/parser/htmllex.[lc]
* added another url exclusion example to the FAQ
numerate questions and answers
Changed files: FAQ
* fix linkchecker exceptions
Changed files: linkcheck/{Ftp,Mailto,Nntp,Telnet,}UrlData.py,
linkcheck/__init__.py
1.8.4
* Improve error message for failing htmlsax module import
Changed files: linkcheck/parser/htmllib.py
* Regenerate parser with new bison 1.875
Changed files: linkcheck/parser/htmlparser.c
* Some CVS files were not the same as their local counterpart.
Something went wrong. Anyway, I re-committed them.
Changed files: a lot .py files
1.8.3
* add missing imports for StringUtil in log classes, defer i18n of log
field names (used for CGI scripts)
Changed files: linkcheck/log/*.py
* fixed wrong debug level comparison from > to >=
Changed files: linkcheck/Config.py
* JavaScript checks in the CGI scripts
Changed files: lconline/lc_cgi.html.*
Added files: lconline/check.js
* Updated documentation with a link restriction example
Changed files: linkchecker, linkchecker.1, FAQ
* Updated po/pygettext.py to version 1.5, cleaned up some gettext
usages.
* updated i18n
Added files: linkcheck/i18n.py
Changed files: all .py files using i18n
* Recognise "<! --" HTML comments
Changed files: linkcheck/parser/htmllex.l
* -a anchor option implies -w because anchor errors are always warnings
Changed files: linkchecker
* added AnsiColors.py and debug.py to split out some functions
Changed files: a lot .py files using these things
* use yy_size_t for parser alloc definitions, fixes build errors on 64bit
architectures
Changed files: linkcheck/parser/htmllex.l
1.8.2
* - ignore invalid html attribute characters
- ignore trailing garbage on html end tags
- fixed debugging code with flex
- use flex memory management interface
- use only double quotes for attribute quoting
- check quoting of all attributes
Changed files: linkcheck/parser/htmllex.l
* build parser with flex 2.5.25
Changed files: linkcheck/parser/{Makefile, htmllex.c}
* put shared code of cgi scripts in lc_cgi.py
Changed files: lc.cgi, lc.fcgi, lc.sz_fcgi, linkcheck/lc_cgi.py
* put some linebreaks and target="top" into HTML output
Changed files: linkcheck/logging/HtmlLogger.py
* add translated cgi files
Changed files: setup.py, MANIFEST.in, debian/rules
Added files: lconline/*.{de,en}
Removed files: lconline/{leer.html,lc_cgi.html}
1.8.1
* Add missing () to function call in proxy handling code
Changed files: FtpUrlData.py
* Use urlparse.url(un)split instead of urlparse.url(un)parse
Changed files: FtpUrlData.py, UrlData.py, HttpUrlData.py,
FileUrlData.py
* Print size information if it is available
Changed files: FtpUrlData.py, UrlData.py, HttpUrlData.py
* Add --warning-size-bytes option to print warning if content size
exceeds the given byte limit
Changed files: FtpUrlData.py, HttpUrlData.py, linkchecker, Config.py,
linkchecker.1
* Updated translations
Changed files: po/linkchecker.pot, po/*.po
* Parse supported file types for ftp links
Changed files: FtpUrlData.py, FileUrlData.py, UrlData.py
1.8.0
* Require Python >= 2.2.1, remove httplib.
Changed files: setup.py, INSTALL, linkchecker
* Add again python-dns, the Debian package maintainer is unresponsive
Added files: linkcheck/DNS/*.py
Changed files: INSTALL, setup.py
* You must now use named constants for ANSI color codes
Changed files: linkcheckerrc, linkcheck/log/ColoredLogger.py
* Release RedHat 8.0 rpm packages.
Changed files: setup.py, MANIFEST.in
* remove --robots-txt from manpage, fix HTZP->HTTP typo
Changed files: linkchecker.1
1.7.1
* Fix memory leak in HTML parser flushing error path
Changed files: htmlparse.y
* add custom line and column tracking in parser
Changed files: htmllex.l, htmlparse.y, htmlsax.h, htmllib.py
* Use column tracking in urldata classes
Changed files: UrlData.py, FileUrlData.py, FtpUrlData.py,
HostCheckingUrlData.py
* Use column tracking in logger classes
Changed files: StandardLogger.py, CVSLogger.py, ColoredLogger.py,
HtmlLogger.py, SqlLogger.py
1.7.0
* Added new HTML parser written in C as a Python extension module.
It is faster and it is more fault tolerant.
Of course, this means I cannot provide .exe installers any more
since the distutils don't provide cross-compilation.
1.6.7
* Removed check for <applet> tags codebase attribute, but honor it
when checking applet links
* Handle <applet> tags archive attribute as a comma separated list
Closes: SF bug #636802
* Fix a nasty bug in tag searching, which ignored tags with more
than one link attribute in it.
* Fix concatenation with relative base urls by first joining the
parent url.
* New commandline option --profile to write profile data.
* Add httplib.py from Python CVS 2.1 maintenance branch, which has the
skip_host keyword argument I am using now.
1.6.6
* Use the new HTTPConnection/HTTPResponse interface of httplib
Closes: SF bug #634679
Changed files: linkcheck/HTTPUrlData.py, linkcheck/HTTPSUrlData.py
* Updated the ftp online test
Changed files: test/output/test_ftp
1.6.5
* Catch the maximum recursion limit error while parsing links and
print an error message instead of bailing out.
Changed files: linkcheck/UrlData.py
* Fixed Ctrl-C only interrupting one single thread, not the whole
program.
Changed files: linkcheck/UrlData.py, linkcheck/__init__.py
* HTML syntax cleanup and relative cgi form url for the cgi scripts
Changed files: lconline/*.html
1.6.4
* Support for ftp proxies
Changed files: linkcheck/FtpUrlData.py, linkcheck/HttpUrlData.py
Added files: linkcheck/ProxyUrlData.py
* Updated German translation
1.6.3:
* Generate md5sum checksums for distributed files
Changed files: Makefile
* use "startswith" string method instead of a regex
Changed files: linkchecker, linkcheck/UrlData.py
* Add a note about supported languages, updated the documentation.
Changed files: README, linkchecker, FAQ
* Remove --robots-txt option from documentation, it is per default
enabled and you cannot disable it from the command line.
Changed files: linkchecker, po/*.po
* fix --extern argument creation
Changed files: linkchecker, linkcheck/UrlData.py
* Print help if PyDNS module is not installed
Changed files: linkcheck/UrlData.py
* Print information if a proxy was used.
Changed files: linkcheck/HttpUrlData.py
* Updated German documentation
Changed files: po/de.po
* Oops, an FTP proxy is not used. Will make it in the next release.
Changed files: linkcheck/FtpUrlData.py
* Default socket timeout is now 30 seconds (10 was too short)
1.6.2:
* Warn about unknown Content-Encodings. Don't parse HTML in this case.
* Support deflate content encoding (snatched from Debian's reportbug)
* Add appropriate Accept-Encoding header to HTTP request.
* Updated German translations
1.6.1:
* FileUrlData.py: remove searching for links in text files, this is
error prone. Just handle *.html and Opera Bookmarks.
* Make separate ChangeLog from debian/changelog. For previous
changes, see debian/changelog.
* Default socket timeout is now 10 seconds
* updated linkcheck/timeoutsocket.py to newest version
* updated README and INSTALL
* s/User-agent/User-Agent/, use same case as other browsers