Marius Gedminas
c6de64978c
Merge pull request #325 from linkchecker/type-error-in-robot-parser
...
Fix TypeError: string arg required in content_allows_robots()
2019-10-22 18:07:31 +03:00
Marius Gedminas
fa32a89d6b
Fix MS Word parser, hopefully
...
MS Word files are binary data, and get_temp_filename() will write them
to disk using open(..., 'wb'), so we want to pass bytes in there, not
Unicode.
See #323.
2019-10-22 16:39:57 +03:00
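The bytes-versus-Unicode point in fa32a89d6b can be illustrated with a minimal sketch. `write_temp_file` below is a hypothetical stand-in, not linkchecker's `get_temp_filename()`; it only shows that a file opened with 'wb' accepts bytes and rejects str.

```python
import os
import tempfile


def write_temp_file(content):
    """Write content to a temporary file opened in binary mode.

    Hypothetical helper for illustration only.
    """
    fd, filename = tempfile.mkstemp(suffix=".doc")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
    except Exception:
        os.remove(filename)  # do not leave a stray file behind on failure
        raise
    return filename


# Bytes are accepted by a binary-mode file.
path = write_temp_file(b"\xd0\xcf\x11\xe0 binary Word data")
with open(path, "rb") as f:
    data = f.read()
os.remove(path)

# Passing str raises TypeError, the kind of failure the commit avoids.
error = None
try:
    write_temp_file("text, not bytes")
except TypeError as exc:
    error = exc
```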
Marius Gedminas
58b0d5aaae
Fix TypeError: string arg required in content_allows_robots()
...
See #323 and #317.
2019-10-22 14:13:45 +03:00
Chris Mayo
949f84d329
PdfParser requires bytes
2019-10-21 20:12:33 +01:00
Chris Mayo
7da64b16f0
Don't add linkcheck_dns directory to sys.path
...
This code was added in:
efbbb656 ("Remove python-dns conflict by moving the dns module into a custom subdirectory.", 2012-12-07)
Installation of linkcheck_dns stopped with:
0a13fae3 ("remove third party packages and use them as dependency", 2018-01-06)
2019-10-21 19:52:58 +01:00
Marius Gedminas
e274d74be2
Wait for threads to exit after stopping them
...
This fixes a race condition where the main thread would check if any
internal errors happened and get back a 0 while a worker thread was
still busy printing the internal error message before incrementing the
counter.
Fixes #320.
My experiments show that this adds no perceptible delay to the script
runtime (on Linux). More specifically, there already is an annoying
perceptible delay of about 1 second, but it's not caused by this change.
2019-10-21 18:23:58 +03:00
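The race described in e274d74be2 can be sketched as follows. The names (`worker`, `internal_errors`) and the sleep standing in for printing are assumptions made for illustration; this is not linkchecker's thread code.

```python
import threading
import time

internal_errors = 0
lock = threading.Lock()


def worker():
    """Simulate a worker that prints an error slowly, then counts it."""
    global internal_errors
    time.sleep(0.05)  # stands in for printing the internal error message
    with lock:
        internal_errors += 1


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

# Joining every worker before reading the counter closes the race:
# without join(), the main thread could still read 0 at this point.
for t in threads:
    t.join()

with lock:
    final_count = internal_errors
```

Because join() returns only after a thread has finished, the counter read afterwards reflects every increment.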
Marius Gedminas
84dbb5d603
Fix TypeError: string arg required in find_links()
...
Fixes #317.
2019-10-21 17:47:46 +03:00
Chris Mayo
c7a32d67fe
Remove unused code from network subpackage
2019-10-19 10:27:34 +01:00
anarcat
f73ba54a2a
Merge pull request #308 from cjmayo/decode
...
Decode content when retrieved
2019-10-10 09:46:32 -04:00
anarcat
7cfb1136e9
Merge pull request #313 from cjmayo/titlefinder
...
Remove unused linkparse.TitleFinder
2019-10-07 11:30:10 -04:00
Chris Mayo
127c2272c4
Remove unused linkparse.TitleFinder
...
Stopped being used with removal of UrlBase.set_title_from_content() in:
7b34be59 ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)
2019-10-05 19:43:33 +01:00
Chris Mayo
b7ec71d8cc
Always use utf-8 encoding when quoting
2019-10-05 19:38:57 +01:00
Chris Mayo
a9f147c347
Update fileutil.pathencode() because paths are now strings
2019-10-05 19:38:57 +01:00
Chris Mayo
5bb4524a63
Update strformat.ascii_safe() because paths are now strings
2019-10-05 19:38:57 +01:00
Chris Mayo
646e138166
Pass encoding when unquoting
...
Otherwise non-UTF-8 escapes are misinterpreted:
>>> from urllib import parse
>>> parse.unquote("%FF")
'�'
>>> parse.unquote("%FF", "latin1")
'ÿ'
2019-10-05 19:38:57 +01:00
Chris Mayo
153e53ba03
Reuse soup object used for detecting encoding in the HTML parser
2019-10-05 19:38:57 +01:00
Chris Mayo
978042a54e
Hide Beautiful Soup soupsieve warning
...
Shown every time linkchecker is run:
/usr/lib/python3.7/site-packages/bs4/element.py:16: UserWarning: The
soupsieve package is not installed. CSS selectors cannot be used.
'The soupsieve package is not installed. CSS selectors cannot be used.'
2019-10-05 19:38:57 +01:00
Chris Mayo
30df69c158
Improve pretty printed comments
2019-10-05 19:38:57 +01:00
Chris Mayo
607328d5c5
Support Beautiful Soup line numbers
2019-10-05 19:38:57 +01:00
Chris Mayo
4f8c2954cf
Don't set parser.encoding
...
Read-only property with new Beautiful Soup parser.
2019-10-05 19:38:57 +01:00
Chris Mayo
5732606c58
Remove urlutil.decode_for_unquote()
...
Not needed since all content is now being decoded on retrieval.
Added by:
a6643034 ("Python3: decode parts before submitting them to urllib.quote()", 2018-01-05)
2019-10-04 19:37:09 +01:00
Chris Mayo
2776eb5f52
Revert "Python3: fix opening file URLs"
...
This reverts commit 4c9ec511b5.
2019-10-04 19:37:09 +01:00
Chris Mayo
c6a06d99ac
Remove unnecessary unicode() from StatusLogger.writeln()
2019-09-30 20:06:48 +01:00
Petr Dlouhý
6e8da10942
fixes for Python 3: fix markdowncheck
...
The translate() method of string objects (and Python 2 Unicode objects)
only accepts a single table argument.
2019-09-30 19:46:24 +01:00
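As a small illustration of the single-argument translate() mentioned in 6e8da10942: in Python 3, characters to delete are encoded in the table itself via str.maketrans(). The characters removed here are an arbitrary choice for the example, not taken from markdowncheck.

```python
# The third argument to str.maketrans() lists characters to delete,
# which replaces the separate "deletechars" argument of Python 2's
# str.translate(table, deletechars).
table = str.maketrans("", "", "*_`")

cleaned = "**bold** and _emphasis_".translate(table)
```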
Chris Mayo
e01ea0d9f0
Safari bookmark parser requires bytes
2019-09-30 19:46:24 +01:00
Chris Mayo
ad33d359c1
Adapt Opera bookmark parser to work with decoded data
2019-09-30 19:46:24 +01:00
Chris Mayo
9460064084
Use requests to decode the content of login form
2019-09-30 19:46:24 +01:00
Chris Mayo
5fc01455b7
Decode content when retrieved, use bs4 to detect encoding if non-Unicode
...
UrlBase has been modified as follows:
- the "data" variable now holds bytes
- decoded content is stored in a new variable "text"
- functionality from get_content() has been split out into
get_raw_content() which returns "data" and download_content() which
calls read_content() and sets the download related variables.
This allows for subclasses to do their own decoding and parsers to
use bytes.
2019-09-30 19:46:24 +01:00
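The split described in 5fc01455b7 might look roughly like the sketch below. `UrlSketch` is a hypothetical, heavily simplified model: only the names "data", "text", get_content(), get_raw_content(), download_content() and read_content() come from the commit message; the constructor, the `size` attribute and the fixed encoding are assumptions.

```python
class UrlSketch:
    """Hypothetical, simplified model of the split; not the real UrlBase."""

    def __init__(self, payload, encoding="utf-8"):
        self._payload = payload   # stands in for the network response
        self.encoding = encoding  # assumed known here
        self.data = None          # raw bytes
        self.text = None          # decoded content
        self.size = -1            # example of a download-related variable

    def read_content(self):
        """Return the raw bytes (a real subclass would read from a socket)."""
        return self._payload

    def download_content(self):
        """Call read_content() and set the download-related variables."""
        content = self.read_content()
        self.size = len(content)
        return content

    def get_raw_content(self):
        """Return the undecoded bytes, downloading them once."""
        if self.data is None:
            self.data = self.download_content()
        return self.data

    def get_content(self):
        """Return the decoded text, decoding once."""
        if self.text is None:
            self.text = self.get_raw_content().decode(self.encoding)
        return self.text


url = UrlSketch("héllo".encode("utf-8"))
raw = url.get_raw_content()
text = url.get_content()
```

Parsers that need bytes can call get_raw_content(), while text-based parsers use get_content().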
Chris Mayo
0c90c718bf
Revert "Python3: fix bytes mark in parser/__init__.py"
...
This reverts commit aec8243348.
2019-09-30 19:46:24 +01:00
Chris Mayo
53cd9475b5
Replace deprecated cgi.escape
...
The html module is provided for Python 2 by the future package:
https://python-future.org/compatible_idioms.html#html-escaping-and-entities
2019-09-17 20:25:05 +01:00
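For reference, a short sketch of the standard-library replacement that 53cd9475b5 refers to. The input strings are made up for the example.

```python
import html

# html.escape() escapes quotes by default (quote=True), whereas the
# deprecated cgi.escape() left quotes alone unless asked to escape them.
escaped = html.escape('<a href="x">Tom & Jerry</a>')

# Passing quote=False leaves quotes untouched.
unquoted = html.escape('"quoted"', quote=False)
```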
anarcat
1590408a65
Merge pull request #306 from cjmayo/python3_49
...
{python3_49} enable and fix remaining bookmark tests
2019-09-16 15:18:26 -04:00
Petr Dlouhý
eaa7131523
enable and fix remaining bookmark tests
...
biplist module preferred for reading Safari bookmarks in
bookmarks/safari.py so install it for tox testing.
2019-09-16 20:08:01 +01:00
anarcat
4ccf0fb2d0
Merge pull request #305 from cjmayo/python3_48
...
{python3_48} Python3: fix displaying help
2019-09-16 10:10:36 -04:00
anarcat
2c7573b3b8
Merge pull request #300 from cjmayo/python3_43
...
{python3_43} Python3: fix for test_telnet in urlbase.py
2019-09-16 10:08:18 -04:00
anarcat
bec68f237b
Merge pull request #299 from cjmayo/python3_42
...
{python3_42} fixes for Python 3: fix telneturl
2019-09-16 10:07:55 -04:00
anarcat
27d672c78b
Merge pull request #297 from cjmayo/python3_40
...
{python3_40} Python3: fixes for checker/__init__.py
2019-09-16 10:06:05 -04:00
anarcat
5a0a02ae74
Merge pull request #294 from cjmayo/python3_39_alt
...
{python3_39_alt} Python3: fix TypeError in HttpUrl.read_content()
2019-09-16 10:04:23 -04:00
Petr Dlouhý
14e19efe07
Python3: fix displaying help
2019-09-15 19:50:05 +01:00
Petr Dlouhý
c2af88ad2e
Python3: fix for test_telnet in urlbase.py
2019-09-15 19:49:26 +01:00
Petr Dlouhý
a2e67af7b4
fixes for Python 3: fix telneturl
2019-09-15 19:49:18 +01:00
Petr Dlouhý
bb542b00e9
Python3: fixes for checker/__init__.py
2019-09-15 19:49:00 +01:00
Chris Mayo
06fdd78f91
Python3: fix TypeError in HttpUrl.read_content()
...
From test_http_redirect:
File "linkchecker/linkcheck/checker/httpurl.py", line 323, in read_content
line: buf.write(data)
locals:
buf = <local> <_io.StringIO object at 0x7f8fe2f45e10>
buf.write = <local> <built-in method write of _io.StringIO object at 0x7f8fe2f45e10>
data = <local> b'<a href="newurl.html">Recursive Redirect</a>\n'
TypeError: string argument expected, got 'bytes'
2019-09-15 19:42:29 +01:00
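The TypeError in the traceback of 06fdd78f91 can be reproduced in isolation. Using io.BytesIO is one way to avoid it; the log does not show the commit's actual change, so treat this as an illustration of the error rather than of the fix.

```python
import io

data = b'<a href="newurl.html">Recursive Redirect</a>\n'

# A text buffer rejects bytes, as in the traceback.
message = None
try:
    io.StringIO().write(data)
except TypeError as exc:
    message = str(exc)

# A bytes buffer accepts the same chunk.
buf = io.BytesIO()
buf.write(data)
content = buf.getvalue()
```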
anarcat
736d2a786d
Merge pull request #293 from cjmayo/python3_37_alt
...
{python3_37_alt} Python3: fix TypeError when parsing cookie data
2019-09-14 11:51:26 -04:00
anarcat
fe39db4fbf
Merge pull request #287 from cjmayo/python3_36
...
{python3_36} fixes for Python 3 + Travis test: fix cgi
2019-09-14 11:50:53 -04:00
Chris Mayo
a7b7e31917
Python3: fix TypeError when parsing cookie data
...
> fp = BytesIO(strheader)
E TypeError: a bytes-like object is required, not 'str'
linkcheck/cookies.py:61: TypeError
The email package provides the message_from_string() convenience
function which avoids the need to create a file-like object.
Indeed http.client.HTTPMessage is implemented using email.message.Message.
2019-09-13 20:10:25 +01:00
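A minimal sketch of the email.message_from_string() approach named in a7b7e31917. The header text is invented for the example and is not taken from linkcheck/cookies.py.

```python
import email

strheader = (
    "Set-Cookie: session=abc123; Path=/\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

# message_from_string() parses headers directly from a str, so no
# file-like wrapper (BytesIO or StringIO) is needed.
msg = email.message_from_string(strheader)
cookies = msg.get_all("Set-Cookie")
content_type = msg["Content-Type"]
```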
Petr Dlouhý
36465112d0
fixes for Python 3 + Travis test: fix cgi
2019-09-13 19:46:13 +01:00
anarcat
aaa8cb675e
Merge pull request #291 from cjmayo/python3_33_alt
...
{python3_33_alt} Python3: fix opening file URLs
2019-09-13 10:31:20 -04:00
anarcat
80b62a3e21
Merge pull request #292 from cjmayo/lc_cgi_error
...
Fix errors caused by logging LCFormError exceptions
2019-09-13 09:12:05 -04:00
anarcat
b0b392f7cc
Merge pull request #282 from cjmayo/python3_31
...
{python3_31} Python3: fix strformat strline()
2019-09-13 09:11:33 -04:00
Chris Mayo
6dc25547d5
Fix errors caused by logging LCFormError exceptions
2019-09-12 20:13:08 +01:00
Chris Mayo
4c9ec511b5
Python3: fix opening file URLs
...
urllib.request.urlopen() expects a string or Request object.
2019-09-12 19:58:27 +01:00
anarcat
eb2e3271a2
Merge pull request #279 from cjmayo/python3_28
...
{python3_28} Python3: fix robotparser
2019-09-12 08:40:18 -04:00
anarcat
8c072fa757
Merge pull request #289 from cjmayo/python3_38
...
{python3_38} Python3: fix linkname.py
2019-09-12 08:39:29 -04:00
Petr Dlouhý
538c4cfeb9
Python3: fix linkname.py
2019-09-11 20:32:33 +01:00
Petr Dlouhý
8a294be95f
Python3: fix robotparser
2019-09-11 20:04:26 +01:00
anarcat
44944754d5
Merge pull request #286 from cjmayo/python3_35
...
{python3_35} Python3: fix unichr() in htmlparser
2019-09-11 09:48:35 -04:00
anarcat
2239458966
Merge pull request #285 from cjmayo/python3_34
...
{python3_34} fixes for Python 3: fix test_misc
2019-09-11 09:48:14 -04:00
anarcat
dbbb64cd90
Merge pull request #283 from cjmayo/python3_32
...
{python3_32} fixes for Python 3 + Travis test: fix threads
2019-09-11 09:47:44 -04:00
anarcat
492058a360
Merge pull request #281 from cjmayo/python3_30
...
{python3_30} Python3: fix decoding strings
2019-09-11 09:47:10 -04:00
anarcat
8eadc5f8a1
Merge pull request #280 from cjmayo/python3_29
...
{python3_29} fixes for Python 3: fix running problems in Python 3
2019-09-11 09:46:48 -04:00
Petr Dlouhý
f272206110
Python3: fix decoding strings
2019-09-10 19:52:23 +01:00
Petr Dlouhý
55a7973b93
Python3: fix csvlog
2019-09-10 19:42:26 +01:00
Petr Dlouhý
e10f25b968
fixes for Python 3: fix running problems in Python 3
2019-09-10 19:30:09 +01:00
Petr Dlouhý
d20ac0e108
Python3: fix strformat strline()
2019-09-09 19:51:30 +01:00
Petr Dlouhý
8b9f29ae52
Python3: fix unichr() in htmlparser
2019-09-09 19:51:30 +01:00
Petr Dlouhý
129a68da38
fixes for Python 3: fix test_misc
2019-09-09 19:51:30 +01:00
Petr Dlouhý
57f7ba0979
fixes for Python 3 + Travis test: fix threads
2019-09-09 19:51:30 +01:00
Marius Gedminas
60f9f80b9f
Fix test_console.py on Python 3
...
This is an alternative fix I suggested in the comments on PR #273.
2019-09-09 18:52:29 +03:00
anarcat
4e6c806bff
Merge pull request #274 from cjmayo/python3_24
...
{python3_24} Python3: fix logger
2019-09-09 11:50:04 -04:00
Marius Gedminas
bb573e5eb1
Merge pull request #272 from cjmayo/python3_22
...
{python3_22} Python3: fix decode_parts function
2019-09-09 18:37:49 +03:00
anarcat
5c9376cfe2
Merge pull request #276 from cjmayo/python3_26
...
{python3_26} Python3: fix fileutil
2019-09-09 09:40:18 -04:00
Petr Dlouhý
0d7a2cac72
Python3: fix decode_parts function
2019-09-06 19:45:20 +01:00
Petr Dlouhý
9156576778
Python3: fix logger
2019-09-06 19:41:37 +01:00
Petr Dlouhý
ffb0a68ff7
Python3: fix fileurl
2019-09-05 19:41:53 +01:00
anarcat
59ab0644fd
Merge pull request #230 from cjmayo/python3_20
...
{python3_20} Python3: decode parts before submitting them to urllib.quote()
2019-09-04 09:48:19 -04:00
Petr Dlouhý
b5111453d8
change test_parse encoding to UTF-8
2019-07-22 19:59:37 +01:00
Petr Dlouhý
d6d48b4814
html parser: use name instead of peeking
2019-07-22 19:59:37 +01:00
Petr Dlouhý
51a06d8a1e
Remove home-cooked htmlparser and use BeautifulSoup
2019-07-22 19:59:37 +01:00
Nick Muerdter
fb3f65cdcc
Fix CSV output containing increasing number of null byte characters.
...
The CSV buffer is being truncated on each new row, but since the
stream's pointer isn't also being reset, each new row starts at the same
position as the previous row, but with null bytes up until that point.
This leads to increasing growth in the length of each CSV row, since
each line will be padded with null bytes equivalent to the previous
row's length.
2019-05-31 18:52:57 -06:00
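The null-padding behaviour described in fb3f65cdcc can be reproduced with a shared io.StringIO buffer. `format_row` and its `reset_position` switch are hypothetical names for this sketch, not the csvlog code.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)


def format_row(row, reset_position):
    """Return the CSV text for one row, reusing a shared buffer."""
    writer.writerow(row)
    text = buf.getvalue()
    buf.truncate(0)  # drops the contents but leaves the stream position
    if reset_position:
        buf.seek(0)  # without this, the next write starts at the old offset
    return text


# Truncating without seeking: every row is padded with more null characters.
buggy = [format_row(["a", "b"], reset_position=False) for _ in range(3)]

# Start again from a clean buffer, this time resetting the position.
buf.seek(0)
buf.truncate(0)
fixed = [format_row(["a", "b"], reset_position=True) for _ in range(3)]
```

In the buggy run each row grows by the length of all earlier rows, because the gap between offset 0 and the stale position is filled with nulls.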
Petr Dlouhý
a6643034fb
Python3: decode parts before submitting them to urllib.quote()
2019-05-10 20:06:01 +01:00
Chris Mayo
1c2e6c465e
squash! Python3: fix strformat ascii_safe() and unicode_safe()
2019-05-10 08:58:52 -04:00
Petr Dlouhý
ac14585a78
Python3: fix strformat for test_file
2019-05-10 08:58:52 -04:00
Petr Dlouhý
acaf8e671e
Python3: fix strformat unicode_safe()
2019-05-10 08:58:52 -04:00
Petr Dlouhý
e11ba8e427
squash! Python3: fix strformat ascii_safe() and unicode_safe()
...
From:
fixes for Python 3: fix running problems in Python 3
2019-05-10 08:58:52 -04:00
Petr Dlouhý
a1c6c4935e
Python3: fix strformat ascii_safe() and unicode_safe()
2019-05-10 08:58:52 -04:00
anarcat
9c9706a07a
Merge pull request #256 from cjmayo/parse_qs
...
Replace deprecated cgi.parse_qs
2019-04-27 13:27:19 -04:00
Chris Mayo
a355476b82
Replace deprecated regexp flags not at start
...
DeprecationWarning: Flags not at the start of the expression
2019-04-26 19:25:59 +01:00
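To illustrate the warning addressed by a355476b82: a global inline flag such as (?i) has to appear at the start of the expression. The pattern below is invented for the example and does not come from linkchecker.

```python
import re

# A pattern like r"foo(?i)\s+bar" puts the global flag mid-expression,
# which triggers the DeprecationWarning quoted in the commit.
# Moving the flag to the front keeps the same meaning.
pattern = re.compile(r"(?i)foo\s+bar")

# Passing the flag as an argument is equivalent.
same = re.compile(r"foo\s+bar", re.IGNORECASE)

matched = bool(pattern.match("FOO  Bar")) and bool(same.match("FOO  Bar"))
```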
Chris Mayo
5ae40c1ae2
Replace deprecated cgi.parse_qs
2019-04-26 19:23:45 +01:00
anarcat
59fe9ed876
Merge pull request #228 from cjmayo/python3_18
...
{python3_18} Python3: fix unicode in urlbase
2019-04-25 16:17:00 -04:00
anarcat
70f0bbf225
Merge pull request #250 from cjmayo/ftpserver
...
Get FtpServerTest working by updating to current pyftpdlib API
2019-04-25 16:16:33 -04:00
Petr Dlouhý
e92b0a9f7b
Python3: fix unicode in urlbase
2019-04-25 19:57:45 +01:00
Petr Dlouhý
b3881ce3b5
Python3: fix urlbase, strformat and others
2019-04-25 19:57:45 +01:00
anarcat
056ba1d717
Merge pull request #248 from cjmayo/donateurl
...
Remove configuration.DonateUrl
2019-04-24 10:59:50 -04:00
anarcat
b656346352
Merge pull request #246 from cjmayo/locale_format
...
Replace deprecated locale.format()
2019-04-24 10:59:17 -04:00
anarcat
a42bc14fc2
Merge pull request #243 from cjmayo/warning
...
Replace deprecated log.warn
2019-04-24 10:58:31 -04:00
anarcat
bb0a1e1992
Merge pull request #242 from cjmayo/wummel
...
Update references to GitHub project from wummel to linkchecker
2019-04-24 10:58:15 -04:00
anarcat
ee8667e1ca
Merge pull request #229 from cjmayo/python3_19
...
{python3_19} Python3: fix unicode in fileurl
2019-04-24 10:57:45 -04:00
anarcat
492da5aee0
Merge pull request #227 from cjmayo/python3_17
...
{python3_17} Python3: fix unicode in url.py
2019-04-24 10:57:09 -04:00
Chris Mayo
f60810b050
Fix Python 3 "TypeError: decoding str is not supported" in FtpUrl.cwd
2019-04-22 19:34:46 +01:00
Chris Mayo
20e11f1b1f
Remove configuration.DonateUrl
2019-04-21 19:44:18 +01:00
Chris Mayo
ce1dd55d7a
Replace deprecated locale.format()
...
locale.format_string() was introduced in Python 2.5.
2019-04-21 19:28:54 +01:00
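A short sketch of locale.format_string(), the replacement named in ce1dd55d7a. The format string is made up, and the "C" locale is selected only to keep the output predictable; note that setlocale() changes process-wide state.

```python
import locale

# Use the portable "C" locale so the result does not depend on the
# machine's settings.
locale.setlocale(locale.LC_ALL, "C")

# format_string() handles a whole format string with several specifiers.
formatted = locale.format_string("%d URLs in %.1f seconds", (1234, 2.5))
```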
Petr Dlouhý
b40f4722c7
Python3: fix unicode in fileurl
2019-04-19 20:42:38 +01:00
Petr Dlouhý
f4b73c6d42
Python3: fix unicode in url.py
2019-04-19 19:57:25 +01:00
Chris Mayo
46179f681c
Replace deprecated log.warn
...
warning() has been the documented method since logging was introduced in
Python 2.3.
2019-04-18 20:10:03 +01:00
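For 46179f681c, a minimal example of the documented warning() method. The logger name, handler setup and message are assumptions for the sketch.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("sketch")  # hypothetical logger name
log.addHandler(logging.StreamHandler(stream))
log.propagate = False  # keep the example's output out of the root logger

# warning() is the documented spelling; warn() is a deprecated alias.
log.warning("URL %s is too long", "http://example.com/")
output = stream.getvalue()
```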
EsuS
004632a99b
Update references to GitHub project from wummel to linkchecker
...
Remove all mention of donations.
2019-04-18 19:59:52 +01:00
Petr Dlouhý
bc99dc51de
Python3: fix HtmlParser
2019-04-18 19:35:16 +01:00
Petr Dlouhý
2c6411d68e
Python3: fix regexp format
2019-04-17 19:50:06 +01:00
Petr Dlouhý
8f4acc3168
Python3: use str and basestring from builtins
2019-04-16 20:08:29 +01:00
anarcat
e93d18d6e9
Merge pull request #232 from cjmayo/gzip2
...
Remove leftovers from introduction of requests
2019-04-15 10:31:06 -04:00
Petr Dlouhý
2985e9ae65
Use Python 3 compatible octal masks
2019-04-13 20:37:39 +01:00
Chris Mayo
ff4a2e496e
Remove unused copy of gzip2
...
Not used since requests was introduced in 7b34be590b.
2019-04-13 20:35:37 +01:00
anarcat
75626d456a
Merge pull request #217 from cjmayo/python3_07
...
{python3_07} Python3: use BytesIO instead of StringIO
2019-04-11 11:48:45 -04:00
anarcat
8223acd44e
Merge pull request #226 from cjmayo/python3_16
...
{python3_16} Python3: fix parsepdf
2019-04-11 11:47:57 -04:00
anarcat
2bdd155d56
Merge pull request #231 from cjmayo/python3_21
...
{python3_21} fix urllib imports
2019-04-11 11:47:50 -04:00
anarcat
ce76b7c82d
Merge pull request #222 from cjmayo/python3_12
...
{python3_12} Python3: fix bytes mark in parser/__init__.py
2019-04-11 11:46:41 -04:00
Petr Dlouhý
106d58c2da
Python3: use BytesIO instead of StringIO
2019-04-09 20:09:35 +01:00
Petr Dlouhý
79e05d1511
Python3: fix parsepdf
2019-04-09 20:09:35 +01:00
Petr Dlouhý
4acabf5cb5
fix urllib imports
2019-04-09 20:09:35 +01:00
Petr Dlouhý
aec8243348
Python3: fix bytes mark in parser/__init__.py
2019-04-09 20:09:35 +01:00
Petr Dlouhý
033f9fbdb3
Python3: mark bytes explicitly
2019-04-09 20:09:35 +01:00
Yaroslav Halchenko
7ed7919692
RF: place parser.flush() under mutex as well
...
Just a safety measure; it is not yet proven to be required, but it
makes sense overall.
2018-11-06 10:58:10 -05:00
Yaroslav Halchenko
ee27e178ec
BF: place a mutex around apparently thread-unsafe parser.feed invocation
...
This fixes the anchor analysis and probably other issues, such as a
fluctuating number of found URLs.
2018-11-01 11:10:01 -04:00
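The locking pattern from ee27e178ec and 7ed7919692 can be sketched with the standard library's HTMLParser standing in for linkchecker's parser at the time. `LinkCollector`, `parser_lock` and `parse` are hypothetical names, and HTMLParser has no flush() method, so only feed() is guarded here.

```python
import threading
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


parser = LinkCollector()
parser_lock = threading.Lock()


def parse(chunk):
    # feed() mutates shared parser state, so calls are serialized.
    with parser_lock:
        parser.feed(chunk)


threads = [
    threading.Thread(target=parse, args=('<a href="/page%d">x</a>' % i,))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

found = sorted(parser.links)
```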
Yaroslav Halchenko
b78c2d200e
DOC: minor typo fix
2018-11-01 11:08:09 -04:00
gerdneuman
de6a82b378
Added whatsapp:// to ignored protocols
...
Fixes https://github.com/wummel/linkchecker/issues/595
2018-08-09 13:49:15 +02:00
regexaurus
50a9ff65b8
Updated support (issues) URL
2018-08-03 00:53:47 -04:00
Marius Gedminas
6f55f446ae
Load cookies from the --cookiefile correctly
...
requests.cookies.merge_cookies() requires a dict or a CookieJar as the second argument.
We've been passing lists of Cookie objects instead.
Fixes #62, harder this time.
2018-03-16 13:23:26 +02:00
Marius Gedminas
6becc08284
Fix internal error when using cookies
...
There was some kind of confusion between a module and a function argument,
introduced in commit 90257a1b5e.
Fixes #62.
2018-03-15 23:30:41 +02:00
Petr Dlouhý
e615480850
Python3: fix reading Safari bookmarks
2018-01-19 09:52:43 +01:00
Petr Dlouhý
256202a20b
fixes for Python 3: fix proxysupport
2018-01-19 09:52:43 +01:00
Petr Dlouhý
f128c9c168
Python3: fix gzip2 format
2018-01-19 09:52:43 +01:00
Petr Dlouhý
a1b300c892
Python3: fix imports
2018-01-19 09:52:43 +01:00
Petr Dlouhý
0a13fae3b4
remove third party packages and use them as dependency
2018-01-09 23:25:27 +01:00
Petr Dlouhý
2daf685633
Python3: fix few htmllib problems
2018-01-05 22:48:46 +01:00
Petr Dlouhý
fb39a4116f
Python3: fix fileutil
2018-01-05 20:31:21 +01:00
Reinhold Füreder
e864bbdabf
Use os.makedirs(...) instead of os.mkdir(...)
2018-01-03 11:33:53 +01:00
Philipp Hahn
1368643a50
Fix fragment identifier quoting
...
According to <https://tools.ietf.org/html/rfc3986>:
fragment = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
Fixes #96
2017-11-10 08:03:03 -05:00
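The grammar quoted in 1368643a50 translates into a set of characters that may stay unescaped in a fragment. The sketch below uses urllib.parse.quote(); `FRAGMENT_SAFE` and `quote_fragment` are hypothetical names, not linkchecker's API, and the function assumes its input is not already percent-encoded (a literal "%" would be encoded again).

```python
from urllib.parse import quote

# Characters allowed unescaped in a fragment by the grammar above:
# the sub-delims, ":" and "@" from pchar, plus "/" and "?".
# Unreserved characters (letters, digits, "-", ".", "_", "~") are
# never quoted by quote(), so they need not be listed.
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def quote_fragment(fragment):
    """Percent-encode a fragment, leaving RFC 3986 fragment characters alone."""
    return quote(fragment, safe=FRAGMENT_SAFE)


kept = quote_fragment("sec 1/a?b=c")
hashed = quote_fragment("x#y")
```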
Antoine Beaupré
71be9b941b
fix incorrect call to the logging module (Closes: #847208)
2017-11-03 09:47:01 -04:00
Félix Sipma
c8d9038ae8
improve get_plugin_folders() docstring
2017-10-18 15:58:18 +02:00
Félix Sipma
deca8c667e
introduce linkcheck.configuration.get_user_data()
2017-10-18 15:55:55 +02:00
Félix Sipma
a03e2e4ada
use xdg dirs for config & data
...
~/.linkchecker is used instead of the xdg equivalents if the directory
exists (backward compatibility).
2017-10-17 18:48:07 +02:00
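The fallback rule from a03e2e4ada could be sketched like this. `get_config_dir`, its parameters and the "linkchecker" subdirectory name are assumptions for illustration; only the ~/.linkchecker compatibility rule comes from the commit message.

```python
import os
import tempfile


def get_config_dir(home, environ):
    """Pick the configuration directory (hypothetical helper)."""
    legacy = os.path.join(home, ".linkchecker")
    if os.path.isdir(legacy):
        return legacy  # backward compatibility: an existing directory wins
    # XDG default: $XDG_CONFIG_HOME, falling back to ~/.config
    base = environ.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    return os.path.join(base, "linkchecker")


with tempfile.TemporaryDirectory() as home:
    # No legacy directory yet: the XDG location is chosen.
    fresh_rel = os.path.relpath(get_config_dir(home, {}), home)
    # Once ~/.linkchecker exists, it takes precedence.
    os.mkdir(os.path.join(home, ".linkchecker"))
    legacy_rel = os.path.relpath(get_config_dir(home, {}), home)
```

A data directory would follow the same pattern with XDG_DATA_HOME and ~/.local/share.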
Antoine Beaupré
9b12b5d66f
workaround new limitation in requests
...
newer requests do not expose the internal SSL socket object so we
cannot verify certificates. there was work to allow custom
verification routines which we could use, but this never finished:
https://github.com/shazow/urllib3/pull/257
so right now, just treat missing socket information as if the cert was
missing.
Closes: #76
2017-10-02 20:19:25 -04:00
Marius Gedminas
4a092c218c
Whitespace bigotry
2017-03-14 17:18:27 +02:00
anarcat
5471b63ceb
Merge pull request #39 from PetrDlouhy/fix/cache
...
Fix cache: Don't check one url multiple times
2017-03-14 09:26:07 -04:00
Marius Gedminas
fb1debaa68
Fix incompatible pointer type warnings
...
The warnings looked like this:
htmlparse.c: In function ‘yyparse’:
htmlparse.c:1810:18: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’
htmlparse.c:1927:12: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’
The argument is not used, so it doesn't really matter what pointer type
it is.
2017-02-24 15:04:09 +02:00
Petr Dlouhý
eaa538c814
don't check one url multiple times
2017-02-14 10:23:25 +01:00
Marius Gedminas
03dfe3d3a1
Fix "operation on ... may be undefined" [-Wsequence-point] warnings
...
Fixes a bunch of warnings like
htmlparse.y:509:25: warning: operation on ‘self->userData->buf’ may be undefined [-Wsequence-point]
htmlparse.y:518:29: warning: operation on ‘self->userData->tmp_buf’ may be undefined [-Wsequence-point]
which were a result of (macro-expanded) code like this (simplified):
if ((tmp = (tmp = PyMem_Realloc(...))) == NULL) return NULL;
The PyMem_Resize(p, ...) macro assigns the new value to p before
returning it, so there's no need to assign it again.
See http://bugs.python.org/issue1668036 for evidence (from 2007) that
this is indeed a documented side-effect of the macro API.
2017-02-13 15:20:33 +02:00
Graham Seaman
233e7dcf68
Allow wayback-format urls without affecting atom 'feed' urls
2017-02-09 11:43:45 +00:00
Marius Gedminas
743a5f31cb
Crawl HTML attributes in deterministic order
...
Fixes #17.
2017-02-01 19:19:53 +02:00
Graham Seaman
2e32780dc7
Force header names to lower to allow for CaseInsensitiveDict variability
2017-02-01 16:28:07 +00:00
Marius Gedminas
3c99b6aa30
Fix TypeError: hasattr(): attribute name must be string
...
The one test failure in Travis happens in
TestConsole.test_internal_error, but only if you have the argcomplete
package installed.
This was a real bug in error reporting code.
2017-02-01 16:02:35 +02:00
Antoine Beaupré
d51b7f34b6
Merge branch '9.3.x'
2017-01-31 19:21:22 -05:00
Antoine Beaupré
da8cecd83c
Merge remote-tracking branch 'anarcat/norobots'
2017-01-31 11:34:09 -05:00
Antoine Beaupré
bf45fb1884
fix HTTPS URL checks
...
in Debian Jessie, linkchecker fails because of an API problem.
it completely breaks HTTPS checks.
this patch fixes the problem
from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772947
2017-01-31 11:25:45 -05:00
Bastian Kleineidam
1e291afdfa
Fix python requests version check
2017-01-31 11:25:38 -05:00
Antoine Beaupré
46d96d0aa0
fix HTTPS URL checks
...
in Debian Jessie, linkchecker fails because of an API problem.
it completely breaks HTTPS checks.
this patch fixes the problem
from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772947
2016-09-30 11:20:38 -04:00
Bastian Kleineidam
c2ce810c3f
Fix python requests version check
2016-06-28 21:55:10 +02:00
Antoine Beaupré
9d899d1dfa
add --no-robots commandline flag
...
While this flag can be abused, it seems to me like a legitimate use
case that you want to check a fairly small document for mistakes,
which includes references to a website which has a robots.txt that
denies all robots. It turns out that most websites do *not* add a
permission for LinkCheck to use their site, and some sites, like the
Debian BTS for example, are very hostile to bots in general.
Between me using linkcheck and me using my web browser to check those
links one by one, there is not a big difference. In fact, using
linkcheck may be *better* for the website because it will use HEAD
requests instead of a GET, and will not fetch all page elements
(javascript, images, etc) which can often be fairly big.
Besides, hostile users will patch the software themselves: it took me
only a few minutes to disable the check, and a few more to make that
into a proper patch.
By forcing robots.txt without any other option, we are hurting our
good users and not keeping hostile users from doing harm.
The patch is still incomplete, but works. It lacks: documentation and
unit tests.
Closes: #508
2016-05-19 14:43:59 -04:00
Bastian Kleineidam
0ef00eea56
Move GUI files to separate project
2016-01-23 13:28:15 +01:00
Bastian Kleineidam
549533d701
Improved debugging
2016-01-19 21:55:50 +01:00
wummel
a40c39be59
Merge pull request #560 from xvadim/feature
...
Added plugin for parsing and checking links in Markdown files
2016-01-19 07:30:34 +01:00
wummel
e2556abbb6
Merge pull request #561 from nbigaouette/issue555
...
Detect if "url_data" contains proxy attributes before using them.
2016-01-17 21:59:35 +01:00
Bastian Kleineidam
3d711666e1
Fix parser for changes in bison 3.0.x
2015-11-26 12:33:44 +01:00
Nicolas Bigaouette
4e56eceb35
Detect if "url_data" contains proxy attributes before using them.
...
Fix proposed by @colwilson in issue #555.
2014-11-12 09:58:30 -05:00
Vadim Khohlov
d4352fc828
Added plugin for parsing and checking links in Markdown files
2014-11-11 15:35:18 +02:00
Bastian Kleineidam
27937e6f83
Fix requests module version check.
2014-09-22 22:45:04 +02:00
Bastian Kleineidam
228bce1ba2
Add to instead of replace the HTTP client headers.
2014-09-20 12:17:42 +02:00
Bastian Kleineidam
92c4ca9a5e
Debug request headers
2014-09-20 12:16:24 +02:00
Bastian Kleineidam
029c20ed98
More python3 fixes
2014-09-12 21:59:07 +02:00
Bastian Kleineidam
35eb30432e
Added some Python3 fixes.
2014-09-12 19:36:30 +02:00
Bastian Kleineidam
697e7b82e1
Search for system certs
2014-09-11 21:19:49 +02:00
Bastian Kleineidam
21c7200360
Reactivate paging of help pages.
2014-09-11 19:42:42 +02:00
Bastian Kleineidam
06c6b80ed3
Fix proxy support.
2014-09-05 22:48:10 +02:00
wummel
6580d37dc9
Merge pull request #545 from ArloL/patch-1
...
Use correct attribute
2014-09-05 21:13:40 +02:00
Bastian Kleineidam
ee4545399d
Support itms-services: URLs. #532
2014-09-05 21:06:10 +02:00
Bastian Kleineidam
37d4ed6f83
Add hyphen and dot to the allowed scheme characters.
2014-09-05 20:59:54 +02:00
Bastian Kleineidam
c8df9355f0
Try to use the SSL certs from the certifi package.
2014-09-05 20:00:30 +02:00
Bastian Kleineidam
c684918ba6
Ignore urllib3 warnings about invalid SSL certs since we check them ourselves.
2014-09-05 20:00:00 +02:00
Bastian Kleineidam
2354f16dbb
Catch urllib3 errors.
2014-09-05 19:59:28 +02:00
Arlo Louis O'Keeffe
52337f82cb
Use correct attribute
2014-09-03 09:36:22 +02:00
Bastian Kleineidam
85dadc1f1a
Add documentation
2014-07-16 07:37:19 +02:00
Bastian Kleineidam
37664ea8a4
Fix Word file check plugin.
2014-07-15 22:39:41 +02:00
Bastian Kleineidam
b646293fd6
Remove unused import.
2014-07-15 22:38:57 +02:00
Bastian Kleineidam
29193bbcc9
Fix login URL cookies and don't sanitize after config reading.
2014-07-15 22:23:38 +02:00
Bastian Kleineidam
032c4091c3
Some easy python3 compatibility changes.
2014-07-15 18:40:47 +02:00
Bastian Kleineidam
90257a1b5e
Replace twill with custom code.
2014-07-15 18:37:05 +02:00
Bastian Kleineidam
a665d35feb
Use proxies and checker session in robots.txt.
2014-07-14 20:28:28 +02:00
Bastian Kleineidam
266e9e189f
Further code cleanup.
2014-07-14 20:14:00 +02:00
Bastian Kleineidam
6c38b4165a
Use given HTTP auth data for robots.txt fetching.
2014-07-14 19:50:11 +02:00
Bastian Kleineidam
7838521b6e
Code cleanup.
2014-07-14 19:49:01 +02:00
Bastian Kleineidam
100ce11d40
Sanitize CGI configuration.
2014-07-13 21:56:01 +02:00
Bastian Kleineidam
eafa1ed2da
Updated unknown URL schemes.
2014-07-13 21:51:53 +02:00
Bastian Kleineidam
176b95a30e
Do not strip quotes from resolved URLs.
2014-07-11 00:43:46 +02:00
Bastian Kleineidam
27702ddbac
Catch log output start errors.
2014-07-09 21:54:47 +02:00
Bastian Kleineidam
6ff89e9e8c
Fix GUI startup
2014-07-06 20:20:03 +02:00
Bastian Kleineidam
0fa7ed2699
Fix empty URL handling.
2014-07-03 23:34:40 +02:00
Bastian Kleineidam
1590ab6240
cleanup
2014-07-01 21:12:47 +02:00
Bastian Kleineidam
9a124513e3
Merge branch 'master' of github.com:wummel/linkchecker
2014-07-01 21:11:33 +02:00
wummel
9bb3852edf
Merge pull request #515 from Mark-Hetherington/extern-redirect
...
When following redirections update url.extern
2014-07-01 21:11:13 +02:00
Bastian Kleineidam
12cc12db53
Add get_redirects() function.
2014-07-01 21:11:06 +02:00
Bastian Kleineidam
cde261c009
Parse Refresh: and Content-Location: header values for URLs.
2014-07-01 20:16:43 +02:00
Bastian Kleineidam
c3ec91ac6d
Fix intern URL search pattern.
2014-06-13 23:52:21 +02:00
Bastian Kleineidam
ad8eb424f3
Merge Mark-Hetherington-xml-parse-warn with slight modifications.
2014-06-13 20:50:37 +02:00
Mark Hetherington
34d83db29c
When following redirections update url.extern
2014-05-19 14:59:58 +10:00
Bastian Kleineidam
eaa8a963ec
Refactor logging configuration.
2014-05-10 21:23:06 +02:00
Bastian Kleineidam
4b28e6e860
Move mime stuff into own submodule.
2014-05-10 21:22:10 +02:00
Bastian Kleineidam
9b794b936c
Print interrupt note in text output.
2014-04-30 20:17:33 +02:00
Bastian Kleineidam
43c2e6641b
Logging refactor, interrupt and abort flags added.
2014-04-30 19:59:43 +02:00
Bastian Kleineidam
b152ce7a6e
Add PDF test and fix page number.
2014-04-29 18:53:24 +02:00
Bastian Kleineidam
0d9881cf03
Fix add_url() with local files.
2014-04-29 18:43:21 +02:00
Bastian Kleineidam
82dd76b0d7
Add PDF link parsing.
2014-04-28 18:13:45 +02:00
Bastian Kleineidam
0ffdea2b8d
Added parser plugins and the applies_to() function.
2014-04-28 18:11:19 +02:00
Bastian Kleineidam
0f8ee234c3
Fix documentation.
2014-04-28 18:10:20 +02:00
Bastian Kleineidam
6bae3e0f49
Use the same request arguments for redirects.
2014-04-23 22:03:44 +02:00
Bastian Kleineidam
981079c041
Support itemtype attribute parsing.
2014-04-23 22:03:20 +02:00
Bastian Kleineidam
4232b69633
Support <img> srcset attribute parsing.
2014-04-10 17:51:59 +02:00
Bastian Kleineidam
6caf654031
Parse Link: headers.
2014-04-10 17:50:55 +02:00
Bastian Kleineidam
22caa9367a
Refactor recursion checks.
2014-04-10 17:50:55 +02:00
Bastian Kleineidam
08fbd891ef
Do not check external robots.txt sitemaps.
2014-04-09 19:44:29 +02:00
Bastian Kleineidam
c57f607fc3
Use urldata.add_url()
2014-04-07 18:54:33 +02:00
Bastian Kleineidam
9c5693ad41
Add doc and copyright.
2014-03-30 19:23:42 +02:00
Bastian Kleineidam
4759cee377
Updated mailto: documentation.
2014-03-30 08:30:14 +02:00
Bastian Kleineidam
b6b5c7a12e
Simpler link parsing routine.
2014-03-27 19:49:17 +01:00
Bastian Kleineidam
f180592cc4
Increase thread poll interval to reduce CPU usage.
2014-03-27 17:43:14 +01:00
Bastian Kleineidam
81da2eb48f
Code cleanup
2014-03-27 17:19:52 +01:00
Bastian Kleineidam
da0ef8e8ea
Fix for moved functions.
2014-03-27 17:19:24 +01:00
Bastian Kleineidam
fa26876f67
Don't use encoding detection since it's very slow.
2014-03-27 12:27:11 +01:00
Bastian Kleineidam
8cf84be2e2
Fix pyopenssl certificate date parsing.
2014-03-26 20:25:44 +01:00
Bastian Kleineidam
49df359317
Some fixes when pyopenssl is used instead of python ssl module.
2014-03-26 19:59:17 +01:00
Bastian Kleineidam
dec0f6c8dc
Fix error with SNI checks
2014-03-26 12:38:16 +01:00
Bastian Kleineidam
a8623bc0bc
Display SSL info on redirects.
2014-03-26 07:16:03 +01:00
Bastian Kleineidam
be59802569
Set http connection charset.
2014-03-20 21:20:34 +01:00
Bastian Kleineidam
098dede12c
Fix warningregex setting in GUI.
2014-03-20 20:46:58 +01:00
Bastian Kleineidam
9cd67dfcb2
More SSL message work.
2014-03-20 20:24:57 +01:00
Bastian Kleineidam
4c76345338
Add certificate valid date info and always set verify flag.
2014-03-19 17:16:42 +01:00
Bastian Kleineidam
9a7ad3a84f
Print SSL cipher info for https URLs.
2014-03-19 17:02:34 +01:00
Bastian Kleineidam
931ca4f402
Add missing log keyword arg.
2014-03-19 17:02:00 +01:00
Bastian Kleineidam
71a7898ee6
Don't check non-connected URLs.
2014-03-19 16:33:38 +01:00
Bastian Kleineidam
ce733ae76b
Don't check for robots.txt directives in local html files.
2014-03-19 16:33:22 +01:00
Bastian Kleineidam
e528d5f7db
Fix ssl connection handling and change plugin type to connection plugin.
2014-03-19 14:28:33 +01:00
Bastian Kleineidam
9be667b52a
Do not warn about missing addresses on mailto links that have subjects.
2014-03-18 23:27:59 +01:00
Bastian Kleineidam
2eb6b1b44c
Call connect() on unconnected ssl responses.
2014-03-18 23:27:21 +01:00
Bastian Kleineidam
fc73c6ca6e
Log number of checked unique URLs.
2014-03-14 23:46:17 +01:00
Bastian Kleineidam
91c6e1d29f
Don't log bytes in status.
2014-03-14 22:25:19 +01:00
Bastian Kleineidam
34bdf5c75a
Updated copyright and docs.
2014-03-14 22:09:05 +01:00
Bastian Kleineidam
19b8baf08c
Move cached queue items to top once in a while.
2014-03-14 22:08:51 +01:00
Bastian Kleineidam
6437f08277
Display downloaded bytes.
2014-03-14 21:06:10 +01:00
Bastian Kleineidam
c51caf1133
Assertions should be earlier.
2014-03-14 20:26:11 +01:00
Bastian Kleineidam
cc401923ac
Improve wording of status message.
2014-03-14 20:25:37 +01:00
Bastian Kleineidam
cfff4c4a84
Disable URL length warning for data: URLs.
2014-03-14 20:24:28 +01:00
Bastian Kleineidam
ac78c6d5b8
Internal errors do not stop the checking thread any more.
2014-03-14 20:23:04 +01:00
Bastian Kleineidam
b18854649d
Count unique URLs for url queue limit.
2014-03-14 20:21:46 +01:00
Bastian Kleineidam
257644e660
Add cache length function to get number of cached elements.
2014-03-14 20:19:34 +01:00
Bastian Kleineidam
306979abca
Add HttpHeaderInfo plugin
2014-03-12 19:28:37 +01:00
Bastian Kleineidam
279db5c5b8
Fix documentation.
2014-03-12 19:22:18 +01:00
Bastian Kleineidam
ccd0d4ead7
Updated the list of unknown or ignored URI schemes.
2014-03-12 19:20:49 +01:00
Bastian Kleineidam
121602df87
Use SSL cert on Windows systems.
2014-03-11 20:58:16 +01:00
Bastian Kleineidam
0ad5969b54
Simplify config dir functions.
2014-03-11 20:23:49 +01:00
Bastian Kleineidam
41d07729bb
Install certificate store with installers.
2014-03-10 22:34:37 +01:00
Bastian Kleineidam
ee0717131d
Add marker for http debugging
2014-03-10 20:09:05 +01:00
Bastian Kleineidam
9c9cf0c3e2
Check for Python requests >= 2.2.0
2014-03-10 19:31:31 +01:00
Bastian Kleineidam
57edf0923e
Updated copyright year
2014-03-10 19:27:22 +01:00
Bastian Kleineidam
bca226c293
Fix assertion checking external links; fix tests
2014-03-10 18:23:44 +01:00
Bastian Kleineidam
40b663cf9e
Ignore URLs earlier.
2014-03-10 18:05:11 +01:00
Bastian Kleineidam
6b334dc79b
Fix URL result caching.
2014-03-08 19:35:10 +01:00
Bastian Kleineidam
0113f06406
Enable arbitrary output encodings in CSV output. See #467
2014-03-06 22:40:52 +01:00
Bastian Kleineidam
102837b875
Set maximum redirects
2014-03-06 21:58:35 +01:00
Bastian Kleineidam
fab2c2da98
Improve content type setting.
2014-03-05 20:12:19 +01:00
Bastian Kleineidam
ef13a3fce1
Implement sitemap and sitemap index parsing.
2014-03-05 19:26:37 +01:00
Bastian Kleineidam
b72cf252fb
Move parseable check down since it might get the content.
2014-03-05 19:26:05 +01:00
Bastian Kleineidam
9ef65cb774
Fix UrlData string representation.
2014-03-05 19:25:40 +01:00
Bastian Kleineidam
00bd549c0c
Remove duplicate content type map.
2014-03-05 19:24:58 +01:00
Bastian Kleineidam
380f14453b
Fix mimetype guessing from content.
2014-03-05 19:23:58 +01:00
Bastian Kleineidam
192cfab009
Cleanup of the UrlData.is_* functions
2014-03-05 19:23:16 +01:00
Bastian Kleineidam
b17211f162
Set for release.
2014-03-04 21:36:24 +01:00
Bastian Kleineidam
978b24f2d7
Merge branch 'caching'
2014-03-04 07:21:42 +01:00
Bastian Kleineidam
f1076c8813
Increase url-too-long warning.
2014-03-03 23:31:04 +01:00
Bastian Kleineidam
82f81241fd
Check all links and add better caching.
2014-03-03 23:29:45 +01:00
Bastian Kleineidam
510af337c1
Improved --version output.
2014-03-01 21:00:16 +01:00
Bastian Kleineidam
74d804ac82
Print release date on --version and internal errors.
2014-03-01 20:59:00 +01:00
Bastian Kleineidam
39df1812c7
Default to 10 threads instead of 100.
2014-03-01 20:49:06 +01:00
Bastian Kleineidam
6f205a2574
Support checking Sitemap: URLs in robots.txt files.
2014-03-01 20:25:19 +01:00
Bastian Kleineidam
0f0d79c7e0
Remove crawl-delay stuff
2014-03-01 20:01:42 +01:00
Bastian Kleineidam
00f8011709
Catch overflowerror in robots.txt crawl-delay
2014-03-01 19:58:22 +01:00
Bastian Kleineidam
0e4d6f6e1a
Parse sitemap urls in robots.txt files.
2014-03-01 19:57:57 +01:00
Bastian Kleineidam
78a99717fe
Check regular expressions from users for errors.
2014-03-01 19:15:48 +01:00
Bastian Kleineidam
c20005a031
Add missing docstring.
2014-03-01 19:14:43 +01:00
Bastian Kleineidam
39c39b1d9f
Disable twill page refresh.
2014-03-01 18:19:29 +01:00
Bastian Kleineidam
0211529d79
Use twill form field number if all else fails.
2014-03-01 18:12:06 +01:00
Bastian Kleineidam
7d84e1e729
Do not check permissions on non-posix systems for now.
2014-03-01 18:01:08 +01:00
Bastian Kleineidam
eb7e52c0e2
-o none sets exit code now
2014-03-01 15:31:39 +01:00
Bastian Kleineidam
f7f5001256
Add missing column name to SQL insert statement.
2014-03-01 12:03:33 +01:00
Bastian Kleineidam
f9bf831804
Remove some empty lines
2014-03-01 12:02:00 +01:00
Bastian Kleineidam
900e04ceda
Dynamic language switch in the GUI.
2014-03-01 12:01:47 +01:00
Bastian Kleineidam
9d0255e156
Fix bookmark imports
2014-03-01 10:16:29 +01:00
Bastian Kleineidam
7b34be590b
Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.
2014-03-01 00:12:34 +01:00
Bastian Kleineidam
c806be5c15
Updated copyright
2014-01-08 22:33:04 +01:00
Bastian Kleineidam
c076e312a2
Send an Accept header.
2014-01-08 19:56:00 +01:00
Bastian Kleineidam
f3b435c2a6
Add missing docstrings.
2013-12-24 07:15:31 +01:00
Bastian Kleineidam
e0a2558b2b
Updated copyright.
2013-12-24 07:13:16 +01:00
Bastian Kleineidam
845a6a1146
Fix loader in frozen executables.
2013-12-18 20:53:17 +01:00
wummel
9646f0b652
Merge pull request #418 from chuckbjones/reset-url-on-fallback
...
Reset to original url when falling back to GET
2013-12-17 22:37:17 -08:00
Bastian Kleineidam
fbbced4d8f
Fix tests
2013-12-13 07:39:59 +01:00
Bastian Kleineidam
5151e68a3e
Fix logger config
2013-12-13 07:37:21 +01:00
Bastian Kleineidam
103e00b4d1
Allow disabling of ssl certificate checks.
2013-12-12 22:17:57 +01:00
Bastian Kleineidam
39fb02f9a9
Remember last save result as filetype.
2013-12-12 20:44:09 +01:00
Bastian Kleineidam
5736987b60
Refactor output loggers.
2013-12-11 18:41:55 +01:00
Bastian Kleineidam
78ed1e9e52
Do not GET on POST forms.
2013-12-10 23:42:43 +01:00
Bastian Kleineidam
0ca63797bf
Remove content cache.
2013-12-10 23:41:52 +01:00
Bastian Kleineidam
a7c1cdd6f6
Check for help files.
2013-12-10 20:56:26 +01:00
Bastian Kleineidam
2c5ede2eb7
Fallback to GET for Apache Coyote servers.
2013-12-08 08:22:56 +01:00
Bastian Kleineidam
b567f766ba
Fix strtime test.
2013-12-06 07:13:44 +01:00
Bastian Kleineidam
6d68e00068
Merge branch 'master' of github.com:wummel/linkchecker
2013-12-04 19:21:45 +01:00
Bastian Kleineidam
023da7c993
Remove the duplicate URL content check.
2013-12-04 19:12:40 +01:00
Bastian Kleineidam
36badddfac
Update cookie code from Python module.
2013-12-04 19:05:08 +01:00
wummel
ab54809d95
Merge pull request #426 from alperkokmen/fix-lastmod-format
...
Fix ISO formatting for modified datetime.
2013-12-03 12:22:27 -08:00
Bastian Kleineidam
c676a4c829
Avoid DoS in SSL certificate host matching.
2013-11-30 22:07:23 +01:00
Alper Kokmen
4b3e78cac0
Fix ISO formatting for modified datetime.
...
This change will make sure that format_modified returns datetime value
in ISO 8601 format. See W3C documentation at
http://www.w3.org/TR/NOTE-datetime .
Since ```modified``` is parsed and then converted to UTC after it's
extracted from HTTP response, it's safe to assume that format_modified
will always format UTC datetime values.
Instead of ```isoformat``` method which omits timezone information for
UTC values, ```strftime``` with a specific format (that ends with Z)
will be used.
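A minimal sketch of the difference described above, assuming `modified` is a naive `datetime` that already holds a UTC value. The helper name `format_modified_sketch` and the exact format string are illustrative assumptions, not the project's actual code:

```python
from datetime import datetime


def format_modified_sketch(modified):
    """Hypothetical stand-in for format_modified().

    Assumes `modified` was already converted to UTC, so a literal
    trailing "Z" is a truthful timezone designator.
    """
    return modified.strftime("%Y-%m-%dT%H:%M:%SZ")


# Naive datetime holding a UTC value (assumed input shape).
naive_utc = datetime(2013, 9, 2, 22, 38, 54)

# isoformat() on a naive datetime emits no timezone designator.
print(naive_utc.isoformat())

# strftime() with an explicit "Z" suffix marks the value as UTC.
print(format_modified_sketch(naive_utc))
```

The explicit format string is only safe because the value is known to be UTC; for any other zone the hard-coded "Z" would be wrong.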
2013-09-02 15:38:54 -07:00
Charles Jones
4294633c04
Close connection prior to falling back to GET, since we change the url back to the original at that time.
2013-08-09 13:08:51 -05:00
Charles Jones
8bc138f18b
Reset to original url when falling back to GET
2013-07-30 13:38:59 -05:00
Bastian Kleineidam
c966fe6b24
Remove the http-wrong-redirect warning
2013-04-11 18:33:19 +02:00
Bastian Kleineidam
134db22830
Updated homepage URL.
2013-04-09 20:11:04 +02:00
Bastian Kleineidam
21678c661d
Updated gzip and httplib copies.
2013-03-11 20:21:58 +01:00
Bastian Kleineidam
6b05f1d290
Paginate help output again.
2013-02-28 21:21:00 +01:00
Bastian Kleineidam
123578a4cd
Make per-host connection limits configurable.
2013-02-27 19:37:28 +01:00
Bastian Kleineidam
b7c82d1e75
Fix strformat.strsize() test.
2013-02-27 19:36:03 +01:00
Bastian Kleineidam
b38317d57b
Replace optparse with argparse.
2013-02-27 19:35:44 +01:00
Bastian Kleineidam
64d95e45e0
Remove local HTML and CSS syntax check.
2013-02-08 21:36:02 +01:00
Bastian Kleineidam
b104482174
Add missing docstring.
2013-01-25 21:15:12 +01:00
Bastian Kleineidam
35bc79dd90
Updated copyright.
2013-01-25 21:14:27 +01:00
Bastian Kleineidam
707b7b7db1
Close HTTP connections without body content. GitHub issue #376
2013-01-23 19:42:29 +01:00
Bastian Kleineidam
e6ad32c028
Catch UnicodeError for invalid host names.
2013-01-23 19:42:29 +01:00
Bastian Kleineidam
c0a0efbd1d
Do not handle non-existing SIGUSR1 signal.
2013-01-22 21:23:46 +01:00
Bastian Kleineidam
47451d7def
Fix GUI drag and drop.
2013-01-22 19:06:10 +01:00
Bastian Kleineidam
faa743e876
Increase per-host connection limits.
2013-01-22 18:18:48 +01:00
Bastian Kleineidam
fa402c0d70
Allow drag-and-drop of all local files.
2013-01-22 18:17:07 +01:00
Bastian Kleineidam
7134c0bb05
Print thread stack traces on SIGUSR1
2013-01-22 18:16:53 +01:00
Bastian Kleineidam
9b8cb67d78
Updated copyright.
2013-01-17 20:41:47 +01:00
Bastian Kleineidam
4dad2aa33c
Support dns-prefetch URLs.
2013-01-17 20:41:09 +01:00
Bastian Kleineidam
7fe72745ae
Updated copyright.
2013-01-09 23:03:12 +01:00
Bastian Kleineidam
fe7e9a5c6c
Improve Word document opening: open read-only and invisible, avoiding unnecessary dialogs.
2013-01-07 22:18:39 +01:00
Bastian Kleineidam
a5b6136e70
Check word document validity before closing.
2013-01-07 21:58:02 +01:00
Bastian Kleineidam
0e50834f9a
Rename external module to exclude it from some style checks.
2013-01-06 18:17:29 +01:00
Bastian Kleineidam
65a0031c10
Updated copyright.
2013-01-06 18:12:44 +01:00
Bastian Kleineidam
16b84be490
Updated all links.
2013-01-06 18:10:13 +01:00
Bastian Kleineidam
0283362ce6
Updated copyright.
2012-12-23 21:32:16 +01:00
Bastian Kleineidam
a7b83e6200
Fix GUI startup for Windows.
2012-12-19 21:12:02 +01:00
Bastian Kleineidam
9820530313
Use better_exchook to print more internal error info.
2012-12-18 23:06:48 +01:00
Bastian Kleineidam
f568a04a7c
Fix ignore option storing in GUI.
2012-12-13 17:06:06 +01:00
Bastian Kleineidam
27df4e20da
Add error handling for screen console function.
2012-12-07 22:31:48 +01:00
Bastian Kleineidam
efbbb656a1
Remove python-dns conflict by moving the dns module into a custom subdirectory.
2012-12-07 22:19:32 +01:00