Commit graph

118 commits

Author SHA1 Message Date
calvin
92c74ece4d Send HTTP Referer header to both http and https URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3741 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-29 13:33:35 +00:00
calvin
5d8bdaaa1f Use generators instead of lists where possible
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3739 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-28 00:26:02 +00:00
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
973da91f44 Source code cleanup: use or remove unused variables
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3724 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-25 07:49:52 +00:00
calvin
62efec3b35 Added CSS syntax check.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3719 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-24 09:44:18 +00:00
calvin
5a2f89fa3d Add redirect warning for commandline URLs
If URLs given on the commandline are redirected, the automatic
intern patterns might not match anymore. A warning makes this
more prominent.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3712 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:36 +00:00
calvin
4055721fd4 Use internal gzip2 module
Use the internal gzip replacement module gzip2 for all GzipFile handling.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3685 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:55 +00:00
calvin
4ce0ddd166 Changes for future Python 3.x compatibility
Replace backticks with repr(), replace .has_key() with "in".


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3680 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:22:57 +00:00
calvin
91a0aad5d8 Fix buggy persistent HTTP connections
Workaround for buggy servers that break protocol synchronization of
persistent HTTP connections.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3677 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:43 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
c971ebdabf Added Shockwave Flash (SWF) parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3656 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 02:12:48 +00:00
calvin
30d2b4f520 HTTP content data is only considered valid for parsing if the request was not redirected and is a GET request.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3633 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 10:50:13 +00:00
calvin
41bc0b2b32 use 'self.data is None' to test if data is already read or not
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3631 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:09:20 +00:00
calvin
5591bbe052 fix self.downloadtime to self.dltime
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3630 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:01:36 +00:00
calvin
8d2dc781e1 Ensure unused or expired connections are closed.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3617 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 16:42:41 +00:00
calvin
9cf3314eab Use constants for warning tags, avoiding typos in string constants. And move the constants into a separate module const.py
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3611 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-29 07:50:22 +00:00
calvin
fcde8bd4d6 try to detect unknown URL schemes instead of manually setting the assume_local flag
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3609 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-28 18:46:50 +00:00
calvin
2edfaea03e Read complete body data on persistent connections, else subsequent requests could fail.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3568 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 19:33:10 +00:00
calvin
2b94c0c161 Assume missing HEAD requests for Zope server on text/plain content type
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3567 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 18:26:55 +00:00
calvin
df48d4a905 bump up copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3534 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-01-01 14:57:38 +00:00
calvin
c217b6d441 don't set result on self.get_content() redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3515 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-11-17 20:42:00 +00:00
calvin
bef2494211 remove unused imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3482 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-24 10:13:59 +00:00
calvin
1883b79303 follow redirections when getting HTTP contents
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3473 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-21 09:27:38 +00:00
calvin
576d404ce2 close non-idle connections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3453 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 21:27:19 +00:00
calvin
86514bb882 activate asset
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3448 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 20:21:06 +00:00
calvin
6348205dcc add persistent connections back to the connection cache, close all others
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3444 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:59:30 +00:00
calvin
d6676ab0a0 more response closing, and cleanups
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3443 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:51:02 +00:00
calvin
6fe2db6755 use unicode_safe alias helper
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3442 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:46:55 +00:00
calvin
4b818cb4b3 Detect more cases to close the connection, and close response objects
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3437 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 16:35:53 +00:00
calvin
da15b15923 Split off the host wait time function, and use it with a separate lock
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3434 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 12:18:24 +00:00
calvin
7e1e01bd36 do not catch UnicodeError, handle that intern
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3269 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-19 17:13:16 +00:00
calvin
3adaf48b3d add callback for crawldelay
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3227 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 15:35:58 +00:00
calvin
75e88c062a added --cookiefile option to set initial cookie values
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3210 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 20:56:34 +00:00
calvin
91ff370ed7 on redirection to different URL scheme take caching into account; adjust tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3173 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:47:50 +00:00
calvin
2a336f8dad put redirects in url queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3172 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:01:41 +00:00
calvin
dc9f04e6dc adjust debug asserts
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3159 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-13 21:03:21 +00:00
calvin
f002c5f965 Replace the old threading algorithm with a new one based on Queue.Queue and consumer threads
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3146 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-13 13:44:52 +00:00
calvin
276437c7d8 syntax cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3067 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-02-09 07:57:22 +00:00
calvin
e92aee054c updated copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3010 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-01-03 19:12:47 +00:00
calvin
388475cbe2 new-style exceptions
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2999 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-20 20:17:33 +00:00
calvin
c84a33c7ce syntax fix
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2988 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-18 09:13:23 +00:00
calvin
856ff8ef2a assert debugs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2987 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-18 08:55:42 +00:00
calvin
19c0a3c2ed use new cookie parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2983 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-18 08:19:11 +00:00
calvin
9425873830 cache aliases (from redirects)
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2964 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-08 20:20:21 +00:00
calvin
c86c0870d6 debugging
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2946 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-12-06 22:32:31 +00:00
calvin
a2e422ce0d reindent
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2900 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-10-13 22:26:12 +00:00
calvin
f01b84b894 rework the redirection routine a little, putting warnings specifically for redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2786 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-20 15:51:12 +00:00
calvin
317ef181f0 handle all redirections to different URL schemes,
not just HTTP -> not HTTP, and fix a variable typo


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2780 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-20 08:53:57 +00:00
calvin
d347840dee use official HTTP status names
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2754 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-15 11:31:32 +00:00
calvin
c140b510b1 fix warning tags
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@2746 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2005-07-13 19:33:26 +00:00