anarcat
492da5aee0
Merge pull request #227 from cjmayo/python3_17
...
{python3_17} Python3: fix unicode in url.py
2019-04-24 10:57:09 -04:00
Petr Dlouhý
f4b73c6d42
Python3: fix unicode in url.py
2019-04-19 19:57:25 +01:00
Petr Dlouhý
bc99dc51de
Python3: fix HtmlParser
2019-04-18 19:35:16 +01:00
Petr Dlouhý
2c6411d68e
Python3: fix regexp format
2019-04-17 19:50:06 +01:00
Petr Dlouhý
8f4acc3168
Python3: use str and basestring from builtins
2019-04-16 20:08:29 +01:00
anarcat
e93d18d6e9
Merge pull request #232 from cjmayo/gzip2
...
Remove leftovers from introduction of requests
2019-04-15 10:31:06 -04:00
Petr Dlouhý
2985e9ae65
Use Python 3 compatible octal masks
2019-04-13 20:37:39 +01:00
Chris Mayo
ff4a2e496e
Remove unused copy of gzip2
...
Not used since requests was introduced in 7b34be590b.
2019-04-13 20:35:37 +01:00
anarcat
75626d456a
Merge pull request #217 from cjmayo/python3_07
...
{python3_07} Python3: use BytesIO instead of StringIO
2019-04-11 11:48:45 -04:00
anarcat
8223acd44e
Merge pull request #226 from cjmayo/python3_16
...
{python3_16} Python3: fix parsepdf
2019-04-11 11:47:57 -04:00
anarcat
2bdd155d56
Merge pull request #231 from cjmayo/python3_21
...
{python3_21} fix urllib imports
2019-04-11 11:47:50 -04:00
anarcat
ce76b7c82d
Merge pull request #222 from cjmayo/python3_12
...
{python3_12} Python3: fix bytes mark in parser/__init__.py
2019-04-11 11:46:41 -04:00
Petr Dlouhý
106d58c2da
Python3: use BytesIO instead of StringIO
2019-04-09 20:09:35 +01:00
Petr Dlouhý
79e05d1511
Python3: fix parsepdf
2019-04-09 20:09:35 +01:00
Petr Dlouhý
4acabf5cb5
fix urllib imports
2019-04-09 20:09:35 +01:00
Petr Dlouhý
aec8243348
Python3: fix bytes mark in parser/__init__.py
2019-04-09 20:09:35 +01:00
Petr Dlouhý
033f9fbdb3
Python3: mark bytes explicitly
2019-04-09 20:09:35 +01:00
Yaroslav Halchenko
7ed7919692
RF: place parser.flush() under mutex as well
...
Just a safety measure, not yet proven to be required but overall
makes sense
2018-11-06 10:58:10 -05:00
Yaroslav Halchenko
ee27e178ec
BF: place a mutex around apparently thread-unsafe parser.feed invocation
...
This fixes anchor analysis and probably other issues,
such as the fluctuating number of URLs found.
2018-11-01 11:10:01 -04:00
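A minimal sketch of the locking pattern described in the two commits above. It uses the standard library's HTMLParser as a stand-in for the project's own parser; the names LinkCollector, _parser_lock and parse_links are hypothetical and do not come from the linkchecker code base.

```python
import threading
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Stand-in parser that records href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# One module-level lock shared by every worker thread (an assumption:
# the real parser is treated as not thread-safe, as the commit suggests).
_parser_lock = threading.Lock()


def parse_links(html):
    """Feed and finish the parser while holding the lock."""
    parser = LinkCollector()
    with _parser_lock:
        parser.feed(html)
        parser.close()  # stdlib counterpart of flushing buffered input
    return parser.links
```

Holding the lock across both the feed and the final flush keeps the whole parse of one document atomic with respect to other threads, at the cost of serializing parsing.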
Yaroslav Halchenko
b78c2d200e
DOC: minor typo fix
2018-11-01 11:08:09 -04:00
gerdneuman
de6a82b378
Added whatsapp:// to ignored protocols
...
Fixes https://github.com/wummel/linkchecker/issues/595
2018-08-09 13:49:15 +02:00
regexaurus
50a9ff65b8
Updated support (issues) URL
2018-08-03 00:53:47 -04:00
Marius Gedminas
6f55f446ae
Load cookies from the --cookiefile correctly
...
requests.cookies.merge_cookies() requires a dict or a CookieJar as the second argument.
We've been passing lists of Cookie objects instead.
Fixes #62, harder this time.
2018-03-16 13:23:26 +02:00
Marius Gedminas
6becc08284
Fix internal error when using cookies
...
There was some kind of confusion between a module and a function argument,
introduced in commit 90257a1b5e.
Fixes #62.
2018-03-15 23:30:41 +02:00
Petr Dlouhý
e615480850
Python3: fix reading Safari bookmarks
2018-01-19 09:52:43 +01:00
Petr Dlouhý
256202a20b
fixes for Python 3: fix proxysupport
2018-01-19 09:52:43 +01:00
Petr Dlouhý
f128c9c168
Python3: fix gzip2 format
2018-01-19 09:52:43 +01:00
Petr Dlouhý
a1b300c892
Python3: fix imports
2018-01-19 09:52:43 +01:00
Petr Dlouhý
0a13fae3b4
remove third party packages and use them as dependency
2018-01-09 23:25:27 +01:00
Reinhold Füreder
e864bbdabf
Use os.makedirs(...) instead of os.mkdir(...)
2018-01-03 11:33:53 +01:00
Philipp Hahn
1368643a50
Fix fragment identifier quoting
...
According to <https://tools.ietf.org/html/rfc3986>:
fragment = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
Fixes #96
2017-11-10 08:03:03 -05:00
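The grammar quoted in the commit above can be turned into a safe-character set for percent-encoding. This is a sketch under assumptions, not the project's actual fix: quote_fragment and FRAGMENT_SAFE are hypothetical names, and the input is assumed to be an unencoded fragment (an existing "%" would be encoded again).

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in a fragment, besides ALPHA / DIGIT.
UNRESERVED_MARKS = "-._~"
SUB_DELIMS = "!$&'()*+,;="
# pchar adds ":" and "@"; the fragment rule adds "/" and "?".
FRAGMENT_SAFE = UNRESERVED_MARKS + SUB_DELIMS + ":@" + "/?"


def quote_fragment(fragment):
    """Percent-encode everything the fragment rule does not allow."""
    return quote(fragment, safe=FRAGMENT_SAFE)
```

With this set, a fragment such as `sec:1/(x)?y=z` passes through unchanged, while a space or a `#` is percent-encoded.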
Antoine Beaupré
71be9b941b
fix incorrect call to the logging module (Closes: #847208)
2017-11-03 09:47:01 -04:00
Félix Sipma
c8d9038ae8
improve get_plugin_folders() docstring
2017-10-18 15:58:18 +02:00
Félix Sipma
deca8c667e
introduce linkcheck.configuration.get_user_data()
2017-10-18 15:55:55 +02:00
Félix Sipma
a03e2e4ada
use xdg dirs for config & data
...
~/.linkchecker is used instead of the xdg equivalents if the directory
exists (backward compatibility).
2017-10-17 18:48:07 +02:00
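The fallback rule described in the commit above could look roughly like the following. This is only an illustration: pick_config_dir is a hypothetical helper, not the function added to linkcheck.configuration, and the default of ~/.config when XDG_CONFIG_HOME is unset follows the XDG Base Directory convention.

```python
import os


def pick_config_dir(home, environ=None, app="linkchecker"):
    """Return the legacy ~/.<app> directory if it exists, else the XDG one."""
    environ = os.environ if environ is None else environ
    legacy = os.path.join(home, "." + app)
    if os.path.isdir(legacy):
        # Backward compatibility: keep using an existing legacy directory.
        return legacy
    base = environ.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    return os.path.join(base, app)
```

Passing the home directory and environment in as arguments keeps the helper easy to test without touching the real user profile.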
Antoine Beaupré
9b12b5d66f
workaround new limitation in requests
...
newer requests do not expose the internal SSL socket object so we
cannot verify certificates. there was work to allow custom
verification routines which we could use, but this never finished:
https://github.com/shazow/urllib3/pull/257
so right now, just treat missing socket information as if the cert was
missing.
Closes: #76
2017-10-02 20:19:25 -04:00
Marius Gedminas
4a092c218c
Whitespace bigotry
2017-03-14 17:18:27 +02:00
anarcat
5471b63ceb
Merge pull request #39 from PetrDlouhy/fix/cache
...
Fix cache: Don't check one url multiple times
2017-03-14 09:26:07 -04:00
Marius Gedminas
fb1debaa68
Fix incompatible pointer type warnings
...
The warnings looked like this:
htmlparse.c: In function ‘yyparse’:
htmlparse.c:1810:18: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’
htmlparse.c:1927:12: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’
The argument is not used, so it doesn't really matter what pointer type
it is.
2017-02-24 15:04:09 +02:00
Petr Dlouhý
eaa538c814
don't check one url multiple times
2017-02-14 10:23:25 +01:00
Marius Gedminas
03dfe3d3a1
Fix "operation on ... may be undefined" [-Wsequence-point] warnings
...
Fixes a bunch of warnings like
htmlparse.y:509:25: warning: operation on ‘self->userData->buf’ may be undefined [-Wsequence-point]
htmlparse.y:518:29: warning: operation on ‘self->userData->tmp_buf’ may be undefined [-Wsequence-point]
which were a result of (macro-expanded) code like this (simplified):
if ((tmp = (tmp = PyMem_Realloc(...))) == NULL) return NULL;
The PyMem_Resize(p, ...) macro assigns the new value to p before
returning it, so there's no need to assign it again.
See http://bugs.python.org/issue1668036 for evidence (from 2007) that
this is indeed a documented side-effect of the macro API.
2017-02-13 15:20:33 +02:00
Graham Seaman
233e7dcf68
Allow wayback-format urls without affecting atom 'feed' urls
2017-02-09 11:43:45 +00:00
Marius Gedminas
743a5f31cb
Crawl HTML attributes in deterministic order
...
Fixes #17.
2017-02-01 19:19:53 +02:00
Graham Seaman
2e32780dc7
Force header names to lowercase to allow for CaseInsensitiveDict variability
2017-02-01 16:28:07 +00:00
Marius Gedminas
3c99b6aa30
Fix TypeError: hasattr(): attribute name must be string
...
The one test failure in Travis happens in
TestConsole.test_internal_error, but only if you have the argcomplete
package installed.
This was a real bug in error reporting code.
2017-02-01 16:02:35 +02:00
Antoine Beaupré
d51b7f34b6
Merge branch '9.3.x'
2017-01-31 19:21:22 -05:00
Antoine Beaupré
da8cecd83c
Merge remote-tracking branch 'anarcat/norobots'
2017-01-31 11:34:09 -05:00
Antoine Beaupré
bf45fb1884
fix HTTPS URL checks
...
in Debian Jessie, linkchecker fails because of an API problem.
it completely breaks HTTPS checks.
this patch fixes the problem
from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772947
2017-01-31 11:25:45 -05:00
Bastian Kleineidam
1e291afdfa
Fix python requests version check
2017-01-31 11:25:38 -05:00
Antoine Beaupré
46d96d0aa0
fix HTTPS URL checks
...
in Debian Jessie, linkchecker fails because of an API problem.
it completely breaks HTTPS checks.
this patch fixes the problem
from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772947
2016-09-30 11:20:38 -04:00