Commit graph

771 commits

Author SHA1 Message Date
Bastian Kleineidam
362c7a1d9d Preselect filename on save dialog when editing file:// URLs. 2011-02-09 08:46:09 +01:00
Bastian Kleineidam
4a0c63aa56 Fix joining of URLs when parent URL has CGI parameter. 2011-02-08 21:25:55 +01:00
Bastian Kleineidam
71b15b70f4 Updated copyright 2011-01-06 09:59:57 +01:00
Bastian Kleineidam
5f70b7210f Add tempfile utility function. 2011-01-06 09:52:11 +01:00
Bastian Kleineidam
d011d1524c Parse PHP files recursively. 2010-12-28 17:11:29 +01:00
Bastian Kleineidam
fd3fe8dcaa Fix missing content types for cached URLs. 2010-12-23 07:37:36 +01:00
Bastian Kleineidam
2a4b60de4d Remove unused imports. 2010-12-22 13:06:24 +01:00
Bastian Kleineidam
84e4e3b28a Fix regression from last commit in this file. 2010-12-22 13:06:10 +01:00
Bastian Kleineidam
0d8a583e39 Fix internal pattern for file URLs (regression from commit 90e0f4e) 2010-12-21 21:10:31 +01:00
Bastian Kleineidam
6090e1a66c Print anchor in __str__() 2010-12-21 20:55:49 +01:00
Bastian Kleineidam
1ebd4d1fc4 Simplify code. 2010-12-21 20:55:35 +01:00
Bastian Kleineidam
90e0f4e5cc Detect filenames with spaces as internal links. 2010-12-21 07:05:12 +01:00
Bastian Kleineidam
9ea35241c0 Set correct scheme on file links. 2010-12-21 01:23:50 +01:00
Bastian Kleineidam
128f8eb6e4 Move firefox routines to firefox module. 2010-12-21 00:02:12 +01:00
Bastian Kleineidam
7c08290c44 Fix broken anchor checking. 2010-12-20 19:55:26 +01:00
Bastian Kleineidam
0b8f8d52b2 Check for empty URL before determining content type. 2010-12-18 08:26:59 +01:00
Bastian Kleineidam
224061e284 Fix to_wire by looking if URL parts have been initialized. 2010-12-15 13:24:12 +01:00
Bastian Kleineidam
7c55351511 Add get_content_type methods to subclasses. 2010-12-15 07:54:44 +01:00
Bastian Kleineidam
2b2121b9ed Added content type and domain to URL logging info. 2010-12-14 20:30:53 +01:00
Bastian Kleineidam
01184784ef Remove warning about Unicode domains which are more widely supported now. 2010-12-11 07:58:15 +01:00
Bastian Kleineidam
9e88377584 Remove stray raise statement from previous commit. 2010-11-26 21:35:49 +01:00
Bastian Kleineidam
c5676f0297 Catch socket errors when closing SMTP connections. 2010-11-26 19:51:26 +01:00
Bastian Kleineidam
5c9c15071a Limit FTP download file size. 2010-11-25 20:44:41 +01:00
Bastian Kleineidam
0cf22e5242 Limit FTP download file size. 2010-11-25 20:44:14 +01:00
Bastian Kleineidam
6fac69cddb Fall back to GET when connection is reset. 2010-11-21 19:50:51 +01:00
Bastian Kleineidam
03034ddc1c Updated copyright 2010-11-21 11:25:07 +01:00
Bastian Kleineidam
04f9c1b854 Use urlparse.parse_qs() instead of cgi.parse_qs() 2010-11-21 10:43:47 +01:00
Bastian Kleineidam
147bf31e1e Check for allowed HTTP GET method before parsing anchors in HTML file contents. 2010-11-17 19:13:26 +01:00
Bastian Kleineidam
17ce930611 Ignore irc:// URLs. 2010-11-10 19:56:31 +01:00
Bastian Kleineidam
2fde5bea8c Updated copyright 2010-11-06 18:02:56 +01:00
Bastian Kleineidam
4f5c957e43 Fix check of external domain after HTTP redirect. 2010-11-06 18:00:49 +01:00
Bastian Kleineidam
57ffa6bf97 Allow both redirection www.example.com -> example.com and vice versa. 2010-11-06 17:55:49 +01:00
Bastian Kleineidam
280b7892ef Remove unused NNTP warning. 2010-11-06 17:39:22 +01:00
Bastian Kleineidam
1188e0be2e Retry NNTP connections on temporary errors. 2010-11-06 17:26:40 +01:00
Bastian Kleineidam
23b20306e9 Remove duplicate HTTP response codes. 2010-11-01 09:27:53 +01:00
Bastian Kleineidam
c5f93a561d Fix debug message formatting. 2010-11-01 05:59:04 +01:00
Bastian Kleineidam
f14340a0a8 Do not check content of already cached URLs. 2010-10-27 19:52:48 +02:00
Bastian Kleineidam
1f81124dfa Fix typo. 2010-10-27 19:23:14 +02:00
Bastian Kleineidam
23403f09bb Do not print warning for HTTP to HTTPS or HTTPS to HTTP redirects. 2010-10-27 14:44:05 +02:00
Bastian Kleineidam
b2cf40151f Improved redirection warning text. 2010-10-27 09:15:46 +02:00
Bastian Kleineidam
d9e981e497 Don't log a warning if commandline URL has been redirected. 2010-10-26 16:24:27 +02:00
Bastian Kleineidam
4375d35328 Add warning about unsupported HTTP authentication, and revert the realm changes. 2010-10-25 22:41:31 +02:00
Bastian Kleineidam
332fa4f8f9 Prepare multi-realm auth configuration. 2010-10-25 22:07:16 +02:00
Bastian Kleineidam
2a7292845c Improved info message about sent cookies; do not report the retrieved cookie information. 2010-10-13 22:32:50 +02:00
Bastian Kleineidam
a8aa3bdb00 Another fix to ensure get_content() is only called when allowed. 2010-10-13 22:14:43 +02:00
Bastian Kleineidam
61e611e4bf Prevent unallowed content read when checking for robots.txt allowance in HTML files. 2010-10-12 00:40:34 +02:00
Bastian Kleineidam
1d0db02192 Refactor getting user and password for an URL. 2010-10-11 20:11:15 +02:00
Bastian Kleineidam
e494d6bbb6 Move MIME type detection into fileutil.py module, and use mimetools for detection. 2010-10-03 08:47:48 +02:00
Bastian Kleineidam
e0f4097eb0 Ensure HttpUrl.set_title_from_content() is only called when the content is allowed to be retrieved. 2010-09-29 19:26:03 +02:00
Bastian Kleineidam
840538d12a Remove unneeded check for HTML content. 2010-09-29 19:25:14 +02:00
Bastian Kleineidam
279a1eae70 Only add geoip info for non-empty hostnames. 2010-09-29 15:59:57 +02:00
Bastian Kleineidam
cc848cdb33 Fix import for moved geoip module. 2010-09-29 15:17:27 +02:00
Bastian Kleineidam
ffcd274087 Updated copyright 2010-09-05 21:02:51 +02:00
Bastian Kleineidam
8a1ac26c85 Warn about obfuscated IP numbers. 2010-09-05 20:11:02 +02:00
Bastian Kleineidam
5284017d67 Only fall back to HTTP GET when robots.txt allows it. 2010-09-04 18:09:59 +02:00
Bastian Kleineidam
8a074aeea9 Work around Python 2.6+ urljoin bug. 2010-08-31 09:16:24 +02:00
Bastian Kleineidam
c3b8ff00b3 Check content and recursion in one try/except to avoid multiple errors when getting page content. 2010-08-31 06:52:08 +02:00
Bastian Kleineidam
60f7af4598 Allow redirections to external URLs with same domain. 2010-08-13 01:22:18 +02:00
Bastian Kleineidam
1faedafb33 Fix data size for HTTP requests. 2010-08-04 00:06:25 +02:00
Bastian Kleineidam
c086f49cea Catch KeyError when quoting URLs of index.html. 2010-07-30 20:12:52 +02:00
Bastian Kleineidam
4678802a81 Do not truncate UNC filepaths 2010-07-30 20:07:11 +02:00
Bastian Kleineidam
761b292e37 Added skype: to list of recognized but ignored URL schemes. 2010-07-29 20:26:04 +02:00
Bastian Kleineidam
0f92b76290 Remove the unnormed URL warning. 2010-07-29 20:20:59 +02:00
Bastian Kleineidam
7ad4f7c220 Compare size from meta info and content data. 2010-07-29 19:53:41 +02:00
Bastian Kleineidam
8413b427e9 Rename some warnings, and add size unequality warning. 2010-07-29 19:53:15 +02:00
Bastian Kleineidam
7536472797 Send correct host header when using http proxy. 2010-07-29 06:50:35 +02:00
Bastian Kleineidam
41e2e1a448 Add new warning to warning list. 2010-07-28 13:47:58 +02:00
Bastian Kleineidam
d9bfd25a68 Add warning if content size is zero 2010-07-28 08:19:55 +02:00
Bluebird75
28f4514b67 Use object with __slots__ for wire-format of UrlBase objects.
Saves memory since UrlBase wire-format objects are used for
logging and thus often created.

Signed-off-by: Bastian Kleineidam <calvin@debian.org>
2010-03-27 00:07:19 +01:00
Bastian Kleineidam
3370ea1562 Reflect changes in httplib2.py: use buffered read in httplib response object and use bad status line exception attribute. 2010-03-26 20:50:38 +01:00
Bastian Kleineidam
c4c098bd83 pep8-ify the source a little more 2010-03-13 08:47:12 +01:00
Bastian Kleineidam
37b4e97012 Revert "Only parse anchors if both --anchors option is given and the current link has an anchor."
This reverts commit b238527d54.
2010-03-10 00:04:02 +01:00
Bastian Kleineidam
b238527d54 Only parse anchors if both --anchors option is given and the current link has an anchor. 2010-03-09 11:45:50 +01:00
Bastian Kleineidam
57397e938b Improved linkname parsing by adding a new peek() HTML parser function. 2010-03-09 11:31:12 +01:00
Bastian Kleineidam
074b5ded32 Support UTF-8 encoded filenames in FTP servers. 2010-03-09 08:15:29 +01:00
Bastian Kleineidam
c88791b815 Fix support for non-standard FTP ports. 2010-03-09 07:49:05 +01:00
Bastian Kleineidam
51a0ef0ad4 Speed up HTML parsing by stopping early and adding callbacks. 2010-03-08 09:04:33 +01:00
Bastian Kleineidam
b8b0398dd2 Ensure redirected URL is Unicode encoded. 2010-03-07 22:11:55 +01:00
Bastian Kleineidam
c8e6995ecd Support HTTPS proxies. 2010-03-07 21:06:10 +01:00
Bastian Kleineidam
1e15e55689 Fix errors in Word file parsing. 2010-03-07 19:43:08 +01:00
Bastian Kleineidam
6a2fcf8ae9 Parse links in Word files. 2010-03-07 19:20:51 +01:00
Bastian Kleineidam
34a2f4a15d Disable and deprecate the --no-proxy-for option. 2010-03-07 17:45:48 +01:00
Bastian Kleineidam
796cf0a7cd Updated copyright year 2010-03-07 11:59:18 +01:00
Bastian Kleineidam
af6cb287d7 Only warn about missing emails in mailto: URLs. 2010-03-07 10:43:29 +01:00
Bastian Kleineidam
3d5c114f14 Warn on permanent redirections even when URL is outside of domain filter. 2010-03-07 09:36:21 +01:00
Bastian Kleineidam
2d73b907f1 Retry HTTP when server sent empty status line; should fix most of the BadStatusLine errors that are sporadically encountered. 2010-03-06 10:23:34 +01:00
Bastian Kleineidam
77daf80e82 Add url encoding parameter 2009-11-28 11:56:35 +01:00
Bastian Kleineidam
5e06b6b8d4 Updated FSF address in GPL blurb 2009-07-24 23:58:20 +02:00
Bastian Kleineidam
e6f43b6822 Fixed the no_proxy handling and added changelog entry 2009-07-24 07:19:49 +02:00
Bastian Kleineidam
7f67027abf ignore the fragment part (i.e. the anchor) of URIs when getting and caching content
2009-06-26 07:22:36 +02:00
Bastian Kleineidam
c7b7af877f Read Mozilla bookmark titles correctly from places.sqlite. 2009-05-20 07:50:46 +02:00
Bastian Kleineidam
59ffbd43f0 Use AttrDict for transport object in loggers. 2009-03-07 09:43:55 +01:00
Bastian Kleineidam
7a59763508 Remove unused SetList container 2009-03-07 00:42:27 +01:00
Bastian Kleineidam
2351506752 Use plain list for info strings. 2009-03-07 00:19:19 +01:00
Bastian Kleineidam
897b68ae9b Fix copying of httpurl info 2009-03-07 00:17:17 +01:00
Bastian Kleineidam
88dbcb30cd Remove unused url_data.info tags - the tags were always None 2009-03-06 21:20:09 +01:00
Bastian Kleineidam
0b5f525f76 Print NNTP server welcome string as info 2009-03-06 20:57:35 +01:00
Bastian Kleineidam
4ee0fb0181 Add NNTP debugging. 2009-03-06 20:53:12 +01:00
Bastian Kleineidam
0bc2fbb47a Only try 3 times connecting to a busy NNTP server, not 5 times. 2009-03-06 20:52:53 +01:00
Bastian Kleineidam
29adfe92fd Minor syntax fix 2009-03-06 20:14:50 +01:00
Bastian Kleineidam
6024f2e43e Add missing reset of self.reused_connection flag 2009-03-06 20:10:03 +01:00
Bastian Kleineidam
ba160350dd Introduced transport object API for logging. 2009-03-06 19:30:58 +01:00
Bastian Kleineidam
58925b21d3 Improved persistent connection handling by retrying closed connections. 2009-03-06 08:15:34 +01:00
Bastian Kleineidam
29599e4c74 Make sure persistent connection will not close after reading contents. 2009-03-05 19:15:44 +01:00
Bastian Kleineidam
bf9ed8c659 Make sure file descriptors are closed after decoding HTTP content. 2009-03-05 19:15:03 +01:00
Bastian Kleineidam
b8944e493a Use new exception log keyword when logging errors 2009-03-02 13:18:36 +01:00
Bastian Kleineidam
a9335fb3e8 Make file list an iterator, and add missing slash if needed to manually given file URLs. 2009-03-02 08:02:27 +01:00
Bastian Kleineidam
7862147ca3 Fix showing content size. 2009-03-01 23:04:48 +01:00
Bastian Kleineidam
8caa601a7e Python 3.0 compatibility: use exc.args[] instead of exc[] 2009-02-24 12:41:45 +01:00
Bastian Kleineidam
2c9b8d6858 Use slash as path separator in file names 2009-02-24 12:41:28 +01:00
Bastian Kleineidam
323958951c Add name to unnamed file URLs. 2009-02-20 14:03:34 +01:00
calvin
2e918a7b7a Added email syntax check.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3960 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-02-18 15:35:23 +00:00
calvin
7214943f38 Remove wrong function return type documentation
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3959 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-02-18 15:34:46 +00:00
calvin
7e5a2ea23b Remove unused file
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3930 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-24 17:35:06 +00:00
calvin
e03df9e709 Removed gopher URL checking.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3929 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-24 17:34:18 +00:00
calvin
c6cb09c4aa Add missing import
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3900 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-10 19:41:42 +00:00
calvin
1c50cf288a Ignore DNS MX lookup failures in py2exe.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3899 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-10 18:50:14 +00:00
calvin
cc25deac12 Only accept MX dns response types when asking for MX servers.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3895 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-10 17:53:10 +00:00
calvin
979132c9b5 Catch all DNS exceptions when resolving MX hosts.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3894 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-10 15:13:55 +00:00
calvin
a26ca4c23a Replace C ftpparse module with Python implementation
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3892 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-10 14:11:17 +00:00
calvin
e9805dbd8a Updated copyright year to 2009
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3887 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-08 14:18:03 +00:00
calvin
8d5d4827c3 Change ftpparse import to avoid py2exe load error.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3883 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-08 12:28:39 +00:00
calvin
209d5abc18 fix timeouts by testing earlier for persistent connections with HEAD
HEAD requests never have a body; nevertheless the http lib tries to
read() from them. This times out on some servers of course. Fix is
not to let those connections be persistent.

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3871 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-29 08:14:28 +00:00
calvin
c20e706761 Made some format changes on translated strings.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3870 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-28 20:22:48 +00:00
calvin
1abc2c504d Filter invalid mozilla bookmark urls from places.sqlite
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3869 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-28 10:54:16 +00:00
calvin
c3b6fc5aa4 Readd
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3867 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-20 21:30:10 +00:00
calvin
42c3e71329 Improved and tested opera bookmark parser
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3863 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-20 07:52:02 +00:00
calvin
9ab895751f Support parsing of Firefox 3 bookmark files
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3862 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-11-20 07:51:22 +00:00
calvin
97cf700e04 Fixed wrong cookie debugging format line.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3849 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-07-13 12:51:56 +00:00
calvin
523ee87f0c Add missing return in is_absolute_path()
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3846 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-07-09 19:17:33 +00:00
calvin
f68872f559 Improved detection of absolute Windows paths.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3844 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-07-09 19:00:02 +00:00
calvin
84355f7b94 Catch original httplib errors too since it is used indirectly by urllib functions.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3833 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-30 23:17:38 +00:00
calvin
b30fb3b09c Remove duplicate code in http checker.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3820 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 19:52:09 +00:00
calvin
caf8ba6297 Really allow parsing of XHTML files; I forgot some places to adjust the MIME checking.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3818 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 13:03:48 +00:00
calvin
a6deeeb8a5 Support parsing of HTML pages served with content type application/xhtml+xml
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3817 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 09:39:49 +00:00
calvin
ff41aa8d9f Lower the MIME content-type info from HTTP headers before using it
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3816 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-16 09:38:09 +00:00
calvin
d26386d03f Catch errors when getting content for title.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3814 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-12 15:38:26 +00:00
calvin
a880939c40 Initialize variables in reset(), not in subsequent methods
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3796 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-08 09:27:13 +00:00
calvin
290528b84f Added title attribute to URL data.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3790 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-07 13:07:56 +00:00
calvin
99269d12cc Add base method for Url.get_title()
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3788 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-06-07 13:07:38 +00:00
calvin
5f4d61e018 Use keyword arguments in translation strings.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3780 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:44:40 +00:00
calvin
97772c9700 Improved email check messages.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3779 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:44:07 +00:00
calvin
2e4d0894fc Stop checking a list of emails at the first invalid one.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3778 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:43:20 +00:00
calvin
e6e51dbc6b Overwrite old results when checking a list of emails.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3777 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:42:52 +00:00
calvin
66ff422f6b Allow overwriting of an old check result.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3776 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-27 19:42:38 +00:00
calvin
7297519b04 Remove or replace unused variables.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3772 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-22 12:10:08 +00:00
calvin
9352dbf5e4 Move test files to separate module
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3763 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-20 17:01:16 +00:00
calvin
dbb498a395 Add virus checking
New option --scan-virus to check the content of URLs for
viruses with ClamAV.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3753 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-20 08:57:37 +00:00
calvin
bacb59597e Use relative imports from Python 2.5
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3750 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-09 06:16:03 +00:00
calvin
b96e8120d6 Add W3C Validator checks
Add new options --check-html-w3 and --check-css-w3 to allow checking
of HTML and CSS pages with the online W3C validators.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3748 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-08 10:36:54 +00:00
calvin
df9f31dcb1 Only check HTML/CSS syntax of intern URLs
The HTML and CSS syntax check now only applies to URLs
which match those given on the command line.
This makes checking of personal pages easier.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3743 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-29 17:48:47 +00:00
calvin
ac4d09f83d Fix errors in CSS and HTML syntax check
Properly encode the warning messages as Unicode, and prevent
overwriting of the "log" module with a local variable.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3742 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-29 17:48:22 +00:00
calvin
92c74ece4d Send HTTP Referer header to both http and https URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3741 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-29 13:33:35 +00:00
calvin
5d8bdaaa1f Use generators instead of lists where possible
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3739 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-28 00:26:02 +00:00
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
72db31e546 Only check syntax of valid URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3726 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-25 07:53:11 +00:00
calvin
973da91f44 Source code cleanup: use or remove unused variables
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3724 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-25 07:49:52 +00:00
calvin
e266a65b64 Fix css check
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3723 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-24 10:46:19 +00:00
calvin
62efec3b35 Added CSS syntax check.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3719 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-24 09:44:18 +00:00
calvin
cce6affa17 Add --check-html option to check the HTML syntax.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3718 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-23 23:04:31 +00:00
calvin
df857aab8d Intern patterns now accept URLs with and without "www." prefixes
by default. This allows checking sites that use both variants.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3714 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:58 +00:00
calvin
5a2f89fa3d Add redirect warning for commandline URLs
If URLs given on the commandline are redirected, the automatic
intern patterns might not match anymore. A warning makes this
more prominent.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3712 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-21 09:18:36 +00:00
calvin
8ae6d94b45 Improved error messages for exceptions
Prepend the exception name before the error message of exceptions.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3694 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-19 07:47:00 +00:00
calvin
4968f1b3cd Prevent empty exception values.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3690 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-18 07:42:24 +00:00
calvin
ba148a9d71 Proper MX DNS request fallback
Properly fall back to DNS A requests when no MX host could be found
for a mailto: URL.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3689 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:34:46 +00:00
calvin
9b7cf763ff Fix test for new www.example.org URL
Fix test data using www.example.org instead of imadoofus.org URLs.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3688 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:34:29 +00:00
calvin
4055721fd4 Use internal gzip2 module
Use the internal gzip replacement module gzip2 for all GzipFile handling.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3685 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-14 22:33:55 +00:00
calvin
1f5a2d47ea Syntax cleanups
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3682 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:24:21 +00:00
calvin
e178405748 Use example.{com,org} for example URLs
Use the guaranteed not available example.com and example.org DNS names
in example URLs.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3681 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:23:34 +00:00
calvin
4ce0ddd166 Changes for future Python 3.x compatibility
Replace backticks with repr(), replace .has_key() with "in".


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3680 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:22:57 +00:00
calvin
91a0aad5d8 Fix buggy persistent HTTP connections
Workaround for buggy servers that break protocol synchronization of
persistent HTTP connections.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3677 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-14 13:28:43 +00:00
calvin
1730097265 Prevent Unicode errors for non-ASCII emails
Prevent Unicode errors when email address contains non-ASCII characters.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3673 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-02 23:38:02 +00:00
calvin
860def8d34 Remove superfluous path slash
Really fix the test_misc unit test by removing a superfluous path slash.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3672 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:46 +00:00
calvin
294261d80a Replace hardcoded test paths for test_misc
Replace the hardcoded test paths with variables. Fixes failures
in the test_misc unit test.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3670 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:11 +00:00
calvin
7cf9723b10 don't parse <script for=''> as URL
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3659 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-11 16:45:30 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
c99b9b1e8f added
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3657 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 08:25:40 +00:00
calvin
c971ebdabf Added Shockwave Flash (SWF) parsing
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3656 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-28 02:12:48 +00:00
calvin
30d2b4f520 HTTP content data is only considered valid for parsing if the request was not redirected and is a GET request.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3633 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-13 10:50:13 +00:00
calvin
41bc0b2b32 use 'self.data is None' to test if data is already read or not
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3631 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:09:20 +00:00
calvin
5591bbe052 fix self.downloadtime to self.dltime
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3630 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-03 14:01:36 +00:00
calvin
8e6c6455ab add missing import
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3626 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-01 15:56:15 +00:00
calvin
09ce26d5fe removed debug flag, test the LOG_CHECK logger for debug settings
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3623 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-12-01 15:50:59 +00:00
calvin
8d2dc781e1 Ensure unused or expired connections are closed.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3617 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 16:42:41 +00:00
calvin
f8a54faae9 make sure internpat does not remove a trailing slash, which results in checking of URLs that are not a prefix of the given URL.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3613 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-30 10:37:14 +00:00
calvin
9cf3314eab Use constants for warning tags, avoiding typos in string constants. And move the constants into a separate module const.py
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3611 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-29 07:50:22 +00:00
calvin
e007ea5dae fix warning typo
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3610 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-29 06:37:40 +00:00
calvin
fcde8bd4d6 try to detect unknown URL schemes instead of manually setting the assume_local flag
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3609 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-28 18:46:50 +00:00
calvin
a50784042f make sure URL to test for IDNA encoding errors has non-ascii characters
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3608 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-28 18:40:33 +00:00
calvin
6a0960aa66 only store parser contents in LinkFinder handler, not in all handlers
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3602 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-14 19:06:39 +00:00
calvin
a1d911127b remove comments from CSS files before parsing for links
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3601 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-14 18:46:14 +00:00
calvin
cb588a3c5d replace tabs with spaces
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3598 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-14 17:55:55 +00:00
calvin
ce8b963dd9 more code cleanups and documentation
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3596 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-14 17:46:18 +00:00
calvin
370749cafb cleanup the code and add some documentation
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3595 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-14 17:34:50 +00:00
calvin
e9c973fe06 Honor urllib.proxy_bypass() when ignoring proxy settings
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3583 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-10-23 22:56:44 +00:00
calvin
2edfaea03e Read complete body data on persistent connections, else subsequent requests could fail.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3568 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 19:33:10 +00:00
calvin
2b94c0c161 Assume missing HEAD requests for Zope server on text/plain content type
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3567 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-08-08 18:26:55 +00:00
calvin
5aed37dada use german server for faster testing (at least for me)
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3553 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-06-13 00:35:32 +00:00
calvin
df48d4a905 bump up copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3534 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-01-01 14:57:38 +00:00
calvin
c217b6d441 don't set result on self.get_content() redirections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3515 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-11-17 20:42:00 +00:00
calvin
698f7183bc fix vrfy error message
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3507 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-10-19 19:57:29 +00:00
calvin
bef2494211 remove unused imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3482 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-24 10:13:59 +00:00
calvin
1883b79303 follow redirections when getting HTTP contents
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3473 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-21 09:27:38 +00:00
calvin
5ad59225a0 use dictionaries for translations with multiple arguments
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3460 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-16 09:12:59 +00:00
calvin
576d404ce2 close non-idle connections
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3453 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 21:27:19 +00:00
calvin
04f89d0668 use get_url_from helper alias
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3451 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 21:18:49 +00:00
calvin
72d198efcb don't send keep-alive header, it breaks some tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3450 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 21:18:29 +00:00
calvin
86514bb882 activate asset
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3448 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 20:21:06 +00:00
calvin
ba7eaeae09 ignore geo location info lines in test output
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3445 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 20:03:54 +00:00
calvin
6348205dcc add persistent connections back to the connection cache, close all others
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3444 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:59:30 +00:00
calvin
d6676ab0a0 more response closing, and cleanups
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3443 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:51:02 +00:00
calvin
6fe2db6755 use unicode_safe alias helper
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3442 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:46:55 +00:00
calvin
27a8869783 use helper alias for unicode_safe
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3441 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 19:34:03 +00:00
calvin
15dfaf35cb cleanup
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3438 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 16:36:44 +00:00
calvin
4b818cb4b3 Detect more cases to close the connection, and close response objects
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3437 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 16:35:53 +00:00
calvin
da15b15923 Split off the host wait time function, and use it with a separate lock
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3434 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-09-15 12:18:24 +00:00
calvin
0a5c03536d remove unneeded logger arguments
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3425 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-08-21 20:50:10 +00:00
calvin
f78d9bb337 s/fields/parts/ for logger arguments, and suppress the last modified info
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3424 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-08-21 20:50:00 +00:00
calvin
adc4e8c0e8 quote base reference URL, with tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3402 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-07-18 08:07:46 +00:00
calvin
c6f01faab5 improved debugging
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3401 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-07-18 07:43:36 +00:00
calvin
7a31cb7ede add tests for file output
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3351 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-05 19:47:09 +00:00
calvin
7781fe88ce use relative imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3350 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-05 19:46:47 +00:00
calvin
7667f3402f send short keep-alive header value for test server
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3349 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-05 19:46:23 +00:00
calvin
d95d8c3d96 correctly handle internal errors
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3338 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-06-03 01:14:05 +00:00
calvin
8763a42063 skip a test on nt platforms
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3327 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-29 16:08:37 +00:00
calvin
850684b1e0 datadir as url path
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3325 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-29 14:00:45 +00:00
calvin
2888c34859 split file tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3324 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-29 13:54:40 +00:00
calvin
e211d3fd6c fix internal error call
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3314 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-25 11:33:23 +00:00
calvin
98597c267d quote result line
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3272 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-19 17:21:20 +00:00
calvin
3142663135 added tests for UnicodeError 'label too long'
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3270 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-19 17:13:28 +00:00
calvin
7e1e01bd36 do not catch UnicodeError, handle that internally
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3269 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-19 17:13:16 +00:00
calvin
2c13d7cac1 norm test urls
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3267 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:43:55 +00:00
calvin
608f8ba1c3 prepare filenames as URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3266 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:38:57 +00:00
calvin
37615dba02 use datadir, curdir placeholders
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3265 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:35:52 +00:00
calvin
14a29fb015 prepare filenames as URLs
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3264 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:35:36 +00:00
calvin
0ba1520d13 fix filename for test
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3263 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:14:30 +00:00
calvin
1dbc97abe7 script moving
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3262 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 21:10:16 +00:00
calvin
23879d78d4 adjust test result for new cache optimization
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3259 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 19:14:54 +00:00
calvin
cd8886c77f adjust test results for optimized cache
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3258 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-18 18:55:03 +00:00
calvin
2ec5c054fe merge ignoredurl and errorurl into unknownurl, updated tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3237 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 19:08:40 +00:00
calvin
4ec74f6f5c added robots.txt tests for the internal HTTP server
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3232 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 15:47:48 +00:00
calvin
3adaf48b3d add callback for crawldelay
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3227 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-17 15:35:58 +00:00
calvin
811f5492c4 fix --pause to delay requests to the same host
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3222 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 22:56:13 +00:00
calvin
ad28599e57 Note if URL is missing (instead of saying it is empty)
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3220 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 22:15:34 +00:00
calvin
75e88c062a added --cookiefile option to set initial cookie values
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3210 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 20:56:34 +00:00
calvin
b6ad3084aa added more anchor tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3201 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-16 16:22:56 +00:00
calvin
523e6e8e43 use variables in result lines
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3174 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:49:58 +00:00
calvin
91ff370ed7 on redirection to different URL scheme take caching into account; adjust tests
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3173 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:47:50 +00:00
calvin
2a336f8dad put redirects in url queue
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3172 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 17:01:41 +00:00
calvin
9a431fde40 fix imports
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3170 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-05-14 10:14:07 +00:00