Commit graph

1638 commits

Author SHA1 Message Date
calvin
df857aab8d Intern patterns now accept URLs with and without "www." prefixes
by default. This allows checking sites that use both variants.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3714 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:58 +00:00
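A minimal sketch of the idea behind the intern pattern change above, not the project's actual code. The helper name intern_pattern and the exact regular expression are assumptions made for illustration; the sketch only shows how one pattern can accept a host with and without the "www." prefix.

```python
import re
from urllib.parse import urlsplit


def intern_pattern(url):
    """Build a regex string matching the URL's site with or without 'www.'.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Normalize the host so both variants produce the same pattern.
    if host.startswith("www."):
        host = host[4:]
    # The lookahead prevents matches such as example.org.evil.test.
    return r"^%s://(www\.)?%s(?=[/:?#]|$)" % (
        re.escape(parts.scheme),
        re.escape(host),
    )


pattern = re.compile(intern_pattern("http://www.example.org/index.html"))
```

With this pattern, both http://example.org/a and http://www.example.org/a count as internal, while other hosts do not.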
calvin
5a2f89fa3d Add redirect warning for commandline URLs
If URLs given on the commandline are redirected, the automatic
intern patterns might not match anymore. A warning makes this
more prominent.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3712 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:36 +00:00
calvin
8fa01f32c2 Use LC_ALL instead of LC_MESSAGES
Windows platforms do not have LC_MESSAGES. Use LC_ALL instead.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3709 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 07:40:40 +00:00
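The platform difference named in the commit above can be sketched as follows. This is an assumption-laden variation: the commit describes using LC_ALL outright, while this hypothetical helper messages_category falls back to LC_ALL only when the locale module lacks LC_MESSAGES, as it does on Windows.

```python
import locale


def messages_category():
    """Return a locale category usable on every platform.

    Hypothetical helper: LC_MESSAGES is not defined by the locale
    module on Windows, so fall back to LC_ALL there.
    """
    return getattr(locale, "LC_MESSAGES", locale.LC_ALL)


# Query (not change) the current setting for the chosen category.
current = locale.setlocale(messages_category())
```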
calvin
c58dd965af Set HTML charset according to logger output encoding.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3708 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 01:28:56 +00:00
calvin
18c6e6e38a Set default_encoding on i18n init
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3707 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 01:28:47 +00:00
calvin
d9f8bd3187 Properly set the locale in CGI scripts
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3704 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 01:28:19 +00:00
calvin
cfc651550a Use set() instead of a list for the set of supported languages
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3703 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 01:28:10 +00:00
calvin
fa48fe354d Use LC_MESSAGES locale, not default system locale in i18n
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3699 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 01:27:38 +00:00
calvin
963feb2288 Double Ctrl-C stops checking immediately without cleanup.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3696 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-20 23:56:25 +00:00
calvin
9c56f03ae3 Shutdown immediately when Ctrl-C is given twice
Try sys.exit() to shutdown immediately after Ctrl-C keyboard
interrupt was given twice.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3695 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-19 07:47:16 +00:00
calvin
8ae6d94b45 Improved error messages for exceptions
Prepend the exception name before the error message of exceptions.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3694 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-19 07:47:00 +00:00
calvin
67aed38df2 Bump copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3693 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-19 07:46:52 +00:00
calvin
4968f1b3cd Prevent empty exception values.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3690 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-18 07:42:24 +00:00
calvin
ba148a9d71 Proper MX DNS request fallback
Properly fall back to DNS A requests when no MX host could be found
for a mailto: URL.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3689 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:34:46 +00:00
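The fallback described in the commit above could look roughly like this. The function mail_hosts and the two lookup callables are hypothetical stand-ins for real DNS queries, passed in as parameters so the sketch stays self-contained; the project's resolver code is not shown in this log.

```python
def mail_hosts(domain, lookup_mx, lookup_a):
    """Return the hosts to try for the domain of a mailto: URL.

    lookup_mx and lookup_a are hypothetical callables that take a
    domain name and return a (possibly empty) list of hosts.
    """
    hosts = lookup_mx(domain)
    if not hosts:
        # No MX record found: fall back to the A record of the domain.
        hosts = lookup_a(domain)
    return hosts


# Fake lookups standing in for real DNS queries.
def fake_mx(domain):
    return {"example.org": ["mail.example.org"]}.get(domain, [])


def fake_a(domain):
    return {"example.com": ["192.0.2.1"]}.get(domain, [])
```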
calvin
9b7cf763ff Fix test for new www.example.org URL
Fix test data using www.example.org instead of imadoofus.org URLs.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3688 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:34:29 +00:00
calvin
88242b4612 Compare to singletons with "is"
Make sure comparisons with singletons like None/True/False use
"is", not "==/!=".


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3687 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:34:16 +00:00
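A short illustration of why the commit above prefers "is" for singletons. The class AlwaysEqual is made up for this example: an object can override equality, so "== None" may give a misleading answer, whereas "is None" always tests identity.

```python
class AlwaysEqual:
    """Contrived class whose instances compare equal to everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Equality is delegated to __eq__, so this is True even though obj is not None.
equal_by_value = (obj == None)  # noqa: E711

# Identity cannot be overridden, so this reliably reports False.
same_object = (obj is None)
```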
calvin
4055721fd4 Use internal gzip2 module
Use the internal gzip replacement module gzip2 for all GzipFile handling.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3685 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:55 +00:00
calvin
17cd16185f Remove timestamp from gzipped files
Remove the timestamp from gzipped files since it might be a security
and/or privacy risk to include it.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3684 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:42 +00:00
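For comparison, a sketch of the same effect using today's standard library rather than the project's internal gzip2 module, whose interface is not shown in this log. Current Python versions let gzip.GzipFile take an mtime argument; passing 0 writes a zero timestamp into the gzip header. The function name gzip_without_timestamp is an assumption.

```python
import gzip
import io


def gzip_without_timestamp(data):
    """Compress bytes with the header timestamp field set to zero."""
    buf = io.BytesIO()
    # mtime=0 keeps the current time out of the gzip header.
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


compressed = gzip_without_timestamp(b"hello world")
```

A side effect of the fixed timestamp is that compressing the same input twice yields identical bytes.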
calvin
bf277085e9 Regenerate HTML scanner with new flex version
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3683 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:24:32 +00:00
calvin
1f5a2d47ea Syntax cleanups
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3682 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:24:21 +00:00
calvin
e178405748 Use example.{com,org} for example URLs
Use the guaranteed not available example.com and example.org DNS names
in example URLs.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3681 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:23:34 +00:00
calvin
4ce0ddd166 Changes for future Python 3.x compatibility
Replace backticks with repr(), replace .has_key() with "in".


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3680 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:22:57 +00:00
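The two replacements named in the commit above, shown on a made-up dictionary. The old Python 2 spellings appear only in comments, since they are syntax errors or missing methods in Python 3.

```python
options = {"verbose": True}

# Python 2 only: options.has_key("verbose")
has_verbose = "verbose" in options

# Python 2 only: `options` (backticks)
text = repr(options)
```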
calvin
370bd058ea Add htmlsax.so target for local build
Add target to build htmlsax.so locally. Also add include path
for local python SVN repository for testing.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3678 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:21:52 +00:00
calvin
91a0aad5d8 Fix buggy persistent HTTP connections
Workaround for buggy servers that break protocol synchronization of
persistent HTTP connections.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3677 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:43 +00:00
calvin
67e55d3832 Revert "Update httplib2.py from upstream SVN"
This reverts commit 00937008e0c2e6d86cf8d9e9c2d54ff5d7443dcc.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3676 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:33 +00:00
calvin
f0faf1b155 Update httplib2.py from upstream SVN
Added some bugfixes from the Python upstream httplib.py.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3675 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:26 +00:00
calvin
1730097265 Prevent Unicode errors for non-ASCII emails
Prevent Unicode errors when email address contains non-ASCII characters.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3673 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-02 23:38:02 +00:00
calvin
860def8d34 Remove superfluous path slash
Really fix the test_misc unit test by removing a superfluous path slash.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3672 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:46 +00:00
calvin
13df77c0b5 Added .gitignore files
Ignore files for git version tracking system.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3671 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:29 +00:00
calvin
294261d80a Replace hardcoded test paths for test_misc
Replace the hardcoded test paths with variables. Fixes failures
in the test_misc unit test.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3670 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:11 +00:00
calvin
e1b1b7d916 Regenerate HTML lexer with flex 2.5.34
The HTML lexer .c file has been regenerated with a new upstream
release of flex 2.5.34.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3669 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:00 +00:00
calvin
f01a77bab1 Don't parse '-->' as end-of-comment in script mode. This fixes parsing errors on some sites.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3668 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-23 09:43:39 +00:00
calvin
8c4d8145a7 simplify the CDATA matching rules to be more straightforward
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3667 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-23 09:11:50 +00:00
calvin
7cf9723b10 don't parse <script for=''> as URL
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3659 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-11 16:45:30 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
c99b9b1e8f added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3657 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 08:25:40 +00:00
calvin
c971ebdabf Added Shockwave Flash (SWF) parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3656 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 02:12:48 +00:00
calvin
ad7c9bbc76 Don't print cached errors or warnings unless verbose output is requested.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3640 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-14 10:58:10 +00:00
calvin
1919c30bdf Do not throw internal errors when writing from a thread to a non-opened file
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3638 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 12:25:21 +00:00
calvin
6c07be042d Add optional leading dot for cookie domain value
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3637 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 11:52:10 +00:00
calvin
fddf890bd4 Allow spaces in cookie values
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3636 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 11:44:18 +00:00
calvin
30d2b4f520 HTTP content data is only considered valid for parsing if the request was not redirected and is a GET request.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3633 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 10:50:13 +00:00
calvin
41bc0b2b32 use 'self.data is None' to test if data is already read or not
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3631 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:09:20 +00:00
calvin
5591bbe052 fix self.downloadtime to self.dltime
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3630 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:01:36 +00:00
calvin
7a4c7e9f44 remove unused imports reported by pyflakes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3629 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 13:54:44 +00:00
calvin
8e6c6455ab add missing import
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3626 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-01 15:56:15 +00:00
calvin
09ce26d5fe removed debug flag, test the LOG_CHECK logger for debug settings
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3623 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-01 15:50:59 +00:00
calvin
ebb428044c Simplify option parsing: check option existence before access instead of catching an exception.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3622 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-01 15:50:33 +00:00
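A sketch of the check-before-access style the commit above describes, assuming the options come from an INI-style configuration read with the standard configparser module. The log does not say which parser the project uses; the helper get_option and the section and option names are hypothetical.

```python
import configparser


def get_option(config, section, option, default=None):
    """Return an option value, or default when it is not set.

    has_option() returns False for a missing section as well, so no
    exception handling is needed.
    """
    if config.has_option(section, option):
        return config.get(section, option)
    return default


config = configparser.ConfigParser()
config.read_string("[output]\nverbose = 1\n")
```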
calvin
8d2dc781e1 Ensure unused or expired connections are closed.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3617 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 16:42:41 +00:00
calvin
042f70115f updated copyright
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3616 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 14:57:03 +00:00