114 commits

Author SHA1 Message Date
Petr Dlouhý
f272206110 Python3: fix decoding strings 2019-09-10 19:52:23 +01:00
Petr Dlouhý
0d7a2cac72 Python3: fix decode_parts function 2019-09-06 19:45:20 +01:00
Petr Dlouhý
a6643034fb Python3: decode parts before submitting them to urllib.quote() 2019-05-10 20:06:01 +01:00
Petr Dlouhý
f4b73c6d42 Python3: fix unicode in url.py 2019-04-19 19:57:25 +01:00
Petr Dlouhý
a1b300c892 Python3: fix imports 2018-01-19 09:52:43 +01:00
Philipp Hahn
1368643a50 Fix fragment identifier quoting
According to <https://tools.ietf.org/html/rfc3986>:
 fragment    = *( pchar / "/" / "?" )
 pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
 unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
 pct-encoded = "%" HEXDIG HEXDIG
 sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

Fixes #96
2017-11-10 08:03:03 -05:00
Graham Seaman
233e7dcf68 Allow wayback-format urls without affecting atom 'feed' urls 2017-02-09 11:43:45 +00:00
Bastian Kleineidam
35eb30432e Added some Python3 fixes. 2014-09-12 19:36:30 +02:00
Bastian Kleineidam
37d4ed6f83 Add hyphen and dot to the allowed scheme characters. 2014-09-05 20:59:54 +02:00
Bastian Kleineidam
121602df87 Use SSL cert on Windows systems. 2014-03-11 20:58:16 +01:00
Bastian Kleineidam
7b34be590b Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements. 2014-03-01 00:12:34 +01:00
Bastian Kleineidam
7929a48d78 Fix url split with invalid port names. 2012-10-13 12:03:09 +02:00
Bastian Kleineidam
c4e15c7b88 Improved duplication url check. 2012-10-10 21:04:48 +02:00
Bastian Kleineidam
6d47b76509 Limit HTTP and FTP connections. Gets rid of spurious BadStatusLine errors. 2012-10-09 21:04:20 +02:00
Bastian Kleineidam
5ebd754cdb Improved duplicate url check. 2012-10-01 16:11:45 +02:00
Bastian Kleineidam
ed7c60e491 Do not warn about duplicate URLs which can point to the same content. 2012-10-01 13:42:46 +02:00
Bastian Kleineidam
049882e4fe Remove accept-encoding since some sites have wrong compression. 2012-09-20 22:39:15 +02:00
Bastian Kleineidam
1c739aed81 Use urlparse.uses_relative instead of unofficial urlparse.non_hierarchical (which has been removed in the current CPython 2.7.x trunk). 2012-08-04 20:40:31 +02:00
Bastian Kleineidam
5c045fef44 Fix UNC path handling on Windows. 2012-06-24 10:30:54 +02:00
Bastian Kleineidam
3f063a5e9f Remove unused import. 2012-06-23 14:29:16 +02:00
Bastian Kleineidam
6d9a8859d3 Require and use Python 2.7.2. 2012-06-22 23:58:20 +02:00
Bastian Kleineidam
f107092a8a Fix handling of user/password info in URLs. 2012-06-10 22:07:42 +02:00
Bastian Kleineidam
f1eb51d885 Updated copyright 2012-01-06 09:21:30 +01:00
Bastian Kleineidam
6409651f55 Remove unused function. 2012-01-04 20:04:14 +01:00
Bastian Kleineidam
02b54d804c Allow additional headers for url.get_content(). 2011-04-10 10:57:28 +02:00
Bastian Kleineidam
415c87e6cf Work around a urlsplit() regression in Python >2.6 2011-03-11 15:18:21 +01:00
Bastian Kleineidam
eaa2b79bc3 Updated documentation. 2011-02-17 19:59:02 +01:00
Bastian Kleineidam
e638a2fe6d Updated copyright and translations. Added some missing docstrings. 2011-02-17 07:38:02 +01:00
Bastian Kleineidam
5f06f1b194 Fix wrong call to __init__ of URL proxy handler. 2010-11-26 12:23:41 +01:00
Bastian Kleineidam
e429dbcc13 Do not parse URL CGI part recursively. 2010-10-27 20:55:21 +02:00
Bastian Kleineidam
4483635552 Add debuglevel, log errors and remove default handlers that are added by urllib2 for get_opener(). 2010-10-14 07:51:29 +02:00
Bastian Kleineidam
388ea0e7ff Add ability to pass POST data to url content function. 2010-10-11 19:54:06 +02:00
Bastian Kleineidam
a68329329f Fix get_content() function. 2010-10-03 12:11:25 +02:00
Bastian Kleineidam
9e54bbfa57 Move URL retrieving functions into url.py module. 2010-10-03 08:46:49 +02:00
Bastian Kleineidam
4e1b6d667e Set copyright. 2010-03-26 20:51:59 +01:00
Bastian Kleineidam
c4c098bd83 pep8-ify the source a little more 2010-03-13 08:47:12 +01:00
Bastian Kleineidam
0b7badc238 Do not quote slashes in query values. 2010-03-11 20:19:31 +01:00
Bastian Kleineidam
bee8023540 Fixed URL encoding 2010-02-22 01:06:19 +01:00
Bastian Kleineidam
77daf80e82 Add url encoding parameter 2009-11-28 11:56:35 +01:00
Bastian Kleineidam
5cd7b84596 Allow digits at end of domain names in safe domain check. 2009-07-26 23:16:42 +02:00
Bastian Kleineidam
5e06b6b8d4 Updated FSF address in GPL blurb 2009-07-24 23:58:20 +02:00
Bastian Kleineidam
fd610ba350 Encode spaces with %20 instead of + 2009-07-22 22:52:40 +02:00
calvin
366c711b43 Improved domain name checking
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3956 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-02-18 15:33:52 +00:00
calvin
e9805dbd8a Updated copyright year to 2009
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3887 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-08 14:18:03 +00:00
calvin
5d8bdaaa1f Use generators instead of lists where possible
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3739 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-28 00:26:02 +00:00
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
4ce0ddd166 Changes for future Python 3.x compatibility
Replace backticks with repr(), replace .has_key() with "in".

git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3680 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:22:57 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
40b3be412b revert the catch UnicodeError change
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3607 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-28 18:39:12 +00:00
calvin
c93fc79702 ignore errors in idna encoding of hostnames
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3591 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-13 12:36:42 +00:00
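
The RFC 3986 grammar quoted in commit 1368643a50 can be illustrated with a short sketch. This is not the code from that commit: the helper name `quote_fragment` and the constants are hypothetical, and the sketch assumes the fragment is passed in unescaped (an existing "%" would be encoded again as "%25").

```python
from urllib.parse import quote

# sub-delims from RFC 3986, as listed in the commit message.
SUB_DELIMS = "!$&'()*+,;="

# Characters a fragment may carry unescaped: sub-delims, the extra pchar
# characters ":" and "@", plus "/" and "?". quote() already leaves letters,
# digits, "-", "." and "_" alone; "~" is listed explicitly because Python
# versions before 3.7 escaped it.
FRAGMENT_SAFE = SUB_DELIMS + ":@/?~"


def quote_fragment(fragment: str) -> str:
    """Percent-encode every character not allowed in an RFC 3986 fragment.

    Hypothetical helper for illustration only. Assumes unescaped input.
    """
    return quote(fragment, safe=FRAGMENT_SAFE)


if __name__ == "__main__":
    print(quote_fragment("section 1"))   # section%201
    print(quote_fragment("a/b?c=d&e"))   # a/b?c=d&e
```

Passing the allowed set through the `safe` parameter keeps the rule in one place, so "/" and "?" stay literal in a fragment while a space or "#" is still escaped.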