Commit graph

129 commits

Author SHA1 Message Date
calvin
e9805dbd8a Updated copyright year to 2009
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3887 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-08 14:18:03 +00:00
calvin
209d5abc18 fix timeouts by testing earlier for persistent connections with HEAD
HEAD requests never have a body; nevertheless the http lib tries to
read() from them. This times out on some servers of course. Fix is
not to let those connections be persistent.

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3871 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-29 08:14:28 +00:00
calvin
c20e706761 Made some format changes on translated strings.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3870 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-28 20:22:48 +00:00
calvin
c3b6fc5aa4 Readd
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3867 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-20 21:30:10 +00:00
calvin
97cf700e04 Fixed wrong cookie debugging format line.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3849 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-07-13 12:51:56 +00:00
calvin
b30fb3b09c Remove duplicate code in http checker.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3820 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 19:52:09 +00:00
calvin
caf8ba6297 Really allow parsing of XHTML files; I forgot some places to adjust the MIME checking.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3818 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 13:03:48 +00:00
calvin
a6deeeb8a5 Support parsing of HTML pages served with content type application/xhtml+xml
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3817 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 09:39:49 +00:00
calvin
a880939c40 Initialize variables in reset(), not in subsequent methods
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3796 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-08 09:27:13 +00:00
calvin
5f4d61e018 Use keyword arguments in translation strings.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3780 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:44:40 +00:00
calvin
bacb59597e Use relative imports from Python 2.5
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3750 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-09 06:16:03 +00:00
calvin
92c74ece4d Send HTTP Referer header to both http and https URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3741 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-29 13:33:35 +00:00
calvin
5d8bdaaa1f Use generators instead of lists where possible
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3739 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-28 00:26:02 +00:00
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
973da91f44 Source code cleanup: use or remove unused variables
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3724 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-25 07:49:52 +00:00
calvin
62efec3b35 Added CSS syntax check.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3719 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-24 09:44:18 +00:00
calvin
5a2f89fa3d Add redirect warning for commandline URLs
If URLs given on the commandline are redirected, the automatic
intern patterns might not match anymore. A warning makes this
more prominent.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3712 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:36 +00:00
calvin
4055721fd4 Use internal gzip2 module
Use the internal gzip replacement module gzip2 for all GzipFile handling.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3685 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:55 +00:00
calvin
4ce0ddd166 Changes for future Python 3.x compatibility
Replace backticks with repr(), replace .has_key() with "in".


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3680 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:22:57 +00:00
calvin
91a0aad5d8 Fix buggy persistent HTTP connections
Workaround for buggy servers that break protocol synchronization of
persistent HTTP connections.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3677 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:43 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
c971ebdabf Added Shockwave Flash (SWF) parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3656 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 02:12:48 +00:00
calvin
30d2b4f520 HTTP content data is only considered valid for parsing if the request was not redirected and is a GET request.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3633 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 10:50:13 +00:00
calvin
41bc0b2b32 use 'self.data is None' to test if data is already read or not
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3631 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:09:20 +00:00
calvin
5591bbe052 fix self.downloadtime to self.dltime
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3630 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:01:36 +00:00
calvin
8d2dc781e1 Ensure unused or expired connections are closed.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3617 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 16:42:41 +00:00
calvin
9cf3314eab Use constants for warning tags, avoiding typos in string constants. And move the constants into a separate module const.py
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3611 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-29 07:50:22 +00:00
calvin
fcde8bd4d6 try to detect unknown URL schemes instead of manually setting the assume_local flag
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3609 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-28 18:46:50 +00:00
calvin
2edfaea03e Read complete body data on persistent connections, else subsequent requests could fail.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3568 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 19:33:10 +00:00
calvin
2b94c0c161 Assume missing HEAD requests for Zope server on text/plain content type
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3567 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 18:26:55 +00:00
calvin
df48d4a905 bump up copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3534 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-01-01 14:57:38 +00:00
calvin
c217b6d441 don't set result on self.get_content() redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3515 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-11-17 20:42:00 +00:00
calvin
bef2494211 remove unused imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3482 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-24 10:13:59 +00:00
calvin
1883b79303 follow redirections when getting HTTP contents
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3473 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-21 09:27:38 +00:00
calvin
576d404ce2 close non-idle connections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3453 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 21:27:19 +00:00
calvin
86514bb882 activate asset
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3448 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 20:21:06 +00:00
calvin
6348205dcc add persistent connections back to the connection cache, close all others
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3444 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:59:30 +00:00
calvin
d6676ab0a0 more response closing, and cleanups
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3443 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:51:02 +00:00
calvin
6fe2db6755 use unicode_safe alias helper
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3442 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:46:55 +00:00
calvin
4b818cb4b3 Detect more cases to close the connection, and close response objects
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3437 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 16:35:53 +00:00
calvin
da15b15923 Split off the host wait time function, and use it with a separate lock
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3434 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 12:18:24 +00:00
calvin
7e1e01bd36 do not catch UnicodeError, handle that intern
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3269 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-19 17:13:16 +00:00
calvin
3adaf48b3d add callback for crawldelay
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3227 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 15:35:58 +00:00
calvin
75e88c062a added --cookiefile option to set initial cookie values
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3210 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 20:56:34 +00:00
calvin
91ff370ed7 on redirection to different URL scheme take caching into account; adjust tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3173 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:47:50 +00:00
calvin
2a336f8dad put redirects in url queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3172 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:01:41 +00:00
calvin
dc9f04e6dc adjust debug asserts
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3159 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-13 21:03:21 +00:00
calvin
f002c5f965 Replace the old threading algorithm with a new one based on Queue.Queue and consumer threads
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3146 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-13 13:44:52 +00:00
calvin
276437c7d8 syntax cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3067 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-02-09 07:57:22 +00:00
calvin
e92aee054c updated copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3010 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-01-03 19:12:47 +00:00