anarcat
2c7573b3b8
Merge pull request #300 from cjmayo/python3_43
...
{python3_43} Python3: fix for test_telnet in urlbase.py
2019-09-16 10:08:18 -04:00
anarcat
bec68f237b
Merge pull request #299 from cjmayo/python3_42
...
{python3_42} fixes for Python 3: fix telneturl
2019-09-16 10:07:55 -04:00
anarcat
27d672c78b
Merge pull request #297 from cjmayo/python3_40
...
{python3_40} Python3: fixes form checker/__init__.py
2019-09-16 10:06:05 -04:00
Petr Dlouhý
c2af88ad2e
Python3: fix for test_telnet in urlbase.py
2019-09-15 19:49:26 +01:00
Petr Dlouhý
a2e67af7b4
fixes for Python 3: fix telneturl
2019-09-15 19:49:18 +01:00
Petr Dlouhý
bb542b00e9
Python3: fixes form checker/__init__.py
2019-09-15 19:49:00 +01:00
Chris Mayo
06fdd78f91
Python3: fix TypeError in HttpUrl.read_content()
...
From test_http_redirect:
File "linkchecker/linkcheck/checker/httpurl.py", line 323, in read_content
line: buf.write(data)
locals:
buf = <local> <_io.StringIO object at 0x7f8fe2f45e10>
buf.write = <local> <built-in method write of _io.StringIO object at 0x7f8fe2f45e10>
data = <local> b'<a href="newurl.html">Recursive Redirect</a>\n'
TypeError: string argument expected, got 'bytes'
2019-09-15 19:42:29 +01:00
Chris Mayo
4c9ec511b5
Python3: fix opening file URLs
...
urllib.request.urlopen() expects a string or Request object.
2019-09-12 19:58:27 +01:00
anarcat
2239458966
Merge pull request #285 from cjmayo/python3_34
...
{python3_34} fixes for Python 3: fix test_misc
2019-09-11 09:48:14 -04:00
anarcat
492058a360
Merge pull request #281 from cjmayo/python3_30
...
{python3_30} Python3: fix decoding strings
2019-09-11 09:47:10 -04:00
Petr Dlouhý
f272206110
Python3: fix decoding strings
2019-09-10 19:52:23 +01:00
Petr Dlouhý
e10f25b968
fixes for Python 3: fix running problems in Python 3
2019-09-10 19:30:09 +01:00
Petr Dlouhý
129a68da38
fixes for Python 3: fix test_misc
2019-09-09 19:51:30 +01:00
Petr Dlouhý
ffb0a68ff7
Python3: fix fileurl
2019-09-05 19:41:53 +01:00
anarcat
59fe9ed876
Merge pull request #228 from cjmayo/python3_18
...
{python3_18} Python3: fix unicode in urlbase
2019-04-25 16:17:00 -04:00
anarcat
70f0bbf225
Merge pull request #250 from cjmayo/ftpserver
...
Get FtpServerTest working by updating to current pyftpdlib API
2019-04-25 16:16:33 -04:00
Petr Dlouhý
e92b0a9f7b
Python3: fix unicode in urlbase
2019-04-25 19:57:45 +01:00
Petr Dlouhý
b3881ce3b5
Python3: fix urlbase, strformat and others
2019-04-25 19:57:45 +01:00
anarcat
bb0a1e1992
Merge pull request #242 from cjmayo/wummel
...
Update references to GitHub project from wummel to linkchecker
2019-04-24 10:58:15 -04:00
anarcat
ee8667e1ca
Merge pull request #229 from cjmayo/python3_19
...
{python3_19} Python3: fix unicode in fileurl
2019-04-24 10:57:45 -04:00
Chris Mayo
f60810b050
Fix Python 3 "TypeError: decoding str is not supported" in FtpUrl.cwd
2019-04-22 19:34:46 +01:00
Petr Dlouhý
b40f4722c7
Python3: fix unicode in fileurl
2019-04-19 20:42:38 +01:00
EsuS
004632a99b
Update references to GitHub project from wummel to linkchecker
...
Remove all mention of donations.
2019-04-18 19:59:52 +01:00
Petr Dlouhý
bc99dc51de
Python3: fix HtmlParser
2019-04-18 19:35:16 +01:00
Petr Dlouhý
4acabf5cb5
fix urllib imports
2019-04-09 20:09:35 +01:00
gerdneuman
de6a82b378
Added whatsapp:// to ignored protocols
...
Fixes https://github.com/wummel/linkchecker/issues/595
2018-08-09 13:49:15 +02:00
Petr Dlouhý
256202a20b
fixes for Python 3: fix proxysuport
2018-01-19 09:52:43 +01:00
Antoine Beaupré
9b12b5d66f
workaround new limitation in requests
...
newer requests do not expose the internal SSL socket object so we
cannot verify certificates. there was work to allow custom
verification routines which we could use, but this never finished:
https://github.com/shazow/urllib3/pull/257
so right now, just treat missing socket information as if the cert was
missing.
Closes : #76
2017-10-02 20:19:25 -04:00
Graham Seaman
233e7dcf68
Allow wayback-format urls without affecting atom 'feed' urls
2017-02-09 11:43:45 +00:00
Antoine Beaupré
9d899d1dfa
add --no-robots commandline flag
...
While this flag can be abused, it seems to me like a legitimate use
case that you want to check a fairly small document for mistakes,
which includes references to a website which has a robots.txt that
denies all robots. It turns out that most websites do *not* add a
permission for LinkCheck to use their site, and some sites, like the
Debian BTS for example, are very hostile with bots in general.
Between me using linkcheck and me using my web browser to check those
links one by one, there is not a big difference. In fact, using
linkcheck may be *better* for the website because it will use HEAD
requests instead of a GET, and will not fetch all page elements
(javascript, images, etc) which can often be fairly big.
Besides, hostile users will patch the software themselves: it took me
only a few minutes to disable the check, and a few more to make that
into a proper patch.
By forcing robots.txt without any other option, we are hurting our
good users and not keeping hostile users from doing harm.
The patch is still incomplete, but works. It lacks: documentation and
unit tests.
Closes : #508
2016-05-19 14:43:59 -04:00
Bastian Kleineidam
92c4ca9a5e
Debug request headers
2014-09-20 12:16:24 +02:00
Bastian Kleineidam
029c20ed98
More python3 fixes
2014-09-12 21:59:07 +02:00
Bastian Kleineidam
35eb30432e
Added some Python3 fixes.
2014-09-12 19:36:30 +02:00
Bastian Kleineidam
06c6b80ed3
Fix proxy support.
2014-09-05 22:48:10 +02:00
Bastian Kleineidam
ee4545399d
Support itms-services: URLs. #532
2014-09-05 21:06:10 +02:00
Bastian Kleineidam
c684918ba6
Ignore urllib3 warnings about invalid SSL certs since we check them ourselves.
2014-09-05 20:00:00 +02:00
Bastian Kleineidam
2354f16dbb
Catch urllib3 errors.
2014-09-05 19:59:28 +02:00
Bastian Kleineidam
a665d35feb
Use proxies and checker session in robots.txt.
2014-07-14 20:28:28 +02:00
Bastian Kleineidam
266e9e189f
Further code cleanup.
2014-07-14 20:14:00 +02:00
Bastian Kleineidam
7838521b6e
Code cleanup.
2014-07-14 19:49:01 +02:00
Bastian Kleineidam
eafa1ed2da
Updated unknown URL schemes.
2014-07-13 21:51:53 +02:00
Bastian Kleineidam
0fa7ed2699
Fix empty URL handling.
2014-07-03 23:34:40 +02:00
Bastian Kleineidam
1590ab6240
cleanup
2014-07-01 21:12:47 +02:00
Bastian Kleineidam
9a124513e3
Merge branch 'master' of github.com:wummel/linkchecker
2014-07-01 21:11:33 +02:00
wummel
9bb3852edf
Merge pull request #515 from Mark-Hetherington/extern-redirect
...
When following redirections update url.extern
2014-07-01 21:11:13 +02:00
Bastian Kleineidam
12cc12db53
Add get_redirects() function.
2014-07-01 21:11:06 +02:00
Bastian Kleineidam
cde261c009
Parse Refresh: and Content-Location: header values for URLs.
2014-07-01 20:16:43 +02:00
Bastian Kleineidam
c3ec91ac6d
Fix intern URL search pattern.
2014-06-13 23:52:21 +02:00
Bastian Kleineidam
ad8eb424f3
Merge Mark-Hetherington-xml-parse-warn with slight modifications.
2014-06-13 20:50:37 +02:00
Mark Hetherington
34d83db29c
When following redirections update url.extern
2014-05-19 14:59:58 +10:00