Commit graph

3214 commits

Author SHA1 Message Date
Chris Mayo
0b3bdedd6d
Merge pull request #583 from cjmayo/newest
Replace "Get the newest version at"
2021-12-13 19:21:32 +00:00
Chris Mayo
945ad903a3
Merge pull request #579 from cjmayo/redirect
Update HttpUrl.encoding after following redirects
2021-12-13 19:20:28 +00:00
Koen Van den Wijngaert
900586dc01
Better handling for link rel dns-prefetch and add preconnect support (#536)
preconnect is only DNS checked.

This is allowed even in the Resource Hints Editor's Draft
https://w3c.github.io/resource-hints/#preconnect
2021-12-09 19:38:30 +00:00
Chris Mayo
d08f6a0730 Replace "Get the newest version at" 2021-12-06 19:36:22 +00:00
Chris Mayo
a04214465a Update HttpUrl.encoding after following redirects 2021-12-06 19:34:31 +00:00
Chris Mayo
0325ecd73f Remove httpurl.HEADER_ENCODING
Unused since:
d91a32822 ("Remove strformat.unicode_safe() and strformat.url_unicode_split()", 2020-07-07)
2021-12-06 19:34:31 +00:00
Chris Mayo
c89c617a58 Ignore an encoding of ISO-8859-1 returned by Requests
ISO-8859-1 is a fallback for Requests and causes us to mangle UTF-8
content.

Requests' utils.py:

def get_encoding_from_headers(headers):
    """Returns encodings from given HTTP Header Dict.

    :param headers: dictionary to extract encoding from.
    :rtype: str
    """

    content_type = headers.get('content-type')

    if not content_type:
        return None

    content_type, params = _parse_content_type_header(content_type)

    if 'charset' in params:
        return params['charset'].strip("'\"")

    if 'text' in content_type:
        return 'ISO-8859-1'

    if 'application/json' in content_type:
        # Assume UTF-8 based on RFC 4627: https://www.ietf.org/rfc/rfc4627.txt since the charset was unset
        return 'utf-8'
2021-11-29 19:52:37 +00:00
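Note on c89c617a58: the effect of ignoring the fallback can be sketched as follows. This is a minimal illustration under stated assumptions, not LinkChecker's actual code. The helper name `effective_encoding` and the UTF-8 default are hypothetical. It treats an ISO-8859-1 value reported from the headers as "no information", because the quoted function returns that value for text content types that declare no charset.

```python
def effective_encoding(reported, default="utf-8"):
    """Pick the encoding to decode a response body with.

    `reported` is the encoding derived from the HTTP headers (may be None).
    ISO-8859-1 is treated as "unknown" because it is the fallback value
    returned for text/* responses without an explicit charset.
    `default` is an assumption made for this sketch.
    """
    if reported is None or reported.upper() == "ISO-8859-1":
        return default
    return reported


# A declared charset other than the fallback is kept as-is.
chosen = effective_encoding("windows-1252")
```

One trade-off of this sketch: a server that explicitly declares ISO-8859-1 is handled the same way as one that declares nothing.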
Chris Mayo
a4b14047d6 Make quiet/-q set application logging to warning 2021-11-29 19:48:50 +00:00
Chris Mayo
0356524369 Disable AnchorCheck plugin
Can't be relied on. Multiple reports of expected results not returned.

https://github.com/linkchecker/linkchecker/issues/542
https://github.com/linkchecker/linkchecker/issues/555
https://github.com/linkchecker/linkchecker/issues/568

Previously a fix was needed just to get the tests working:
0912e8a2c ("Don't strip the URL fragment from cache key if using AnchorCheck", 2020-07-27)

After:
eaa538c81 ("don't check one url multiple times", 2016-11-09)
2021-11-29 19:35:34 +00:00
Chris Mayo
2a77e12618 Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
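Note on 2a77e12618: Thread.getName() and Condition.notifyAll() are deprecated as of Python 3.10 in favour of the `name` attribute and `notify_all()`. The snippet below is a small standalone sketch of the replacement spellings. The worker function and the thread name "checker-1" are made up for illustration and are not taken from LinkChecker.

```python
import threading

results = []
cond = threading.Condition()


def worker():
    # Runs in the new thread: record the thread's name and wake any waiters.
    with cond:
        results.append(threading.current_thread().name)  # was: getName()
        cond.notify_all()                                 # was: notifyAll()


t = threading.Thread(target=worker, name="checker-1")
with cond:
    t.start()
    # wait_for() releases the lock while waiting, so the worker can proceed.
    cond.wait_for(lambda: results, timeout=5)
t.join()
```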
Chris Mayo
43507cf80a Make partial and example URLs in docstrings italic
Prevent Sphinx from turning them into broken links.
2021-08-12 19:28:50 +01:00
Chris Mayo
5de3920f6c Fix broken external links in documentation 2021-08-12 19:28:50 +01:00
Paul Haerle
f395c74aac
Make ResultCache max_size configurable (#544)
* Make ResultCache max_size configurable

fixes #463

* Add tests and docs.

* fix documentation...

...adapt the source, not the auto-generated man pages themselves as
requested in #544.

* fix typo.
2021-06-21 19:45:19 +01:00
Chris Mayo
c31d233f06 Disable status logging in WSGI application
Not a problem earlier because the default for the CLI is to record
status, but this was not fully implemented until:
4f3f1ac0 ("Fix status=0 setting being ignored", 2020-08-06)
2021-01-28 19:20:24 +00:00
Chris Mayo
09b4da393e Initialise Configuration.status_logger
Fixes failure of the LinkChecker WSGI application which does
not call Configuration.set_status_logger().
2021-01-28 19:20:24 +00:00
Chris Mayo
136e8a3625 Update to version 10.0.1.dev0 2021-01-28 19:20:24 +00:00
Chris Mayo
a3e9c31560 Remove execute bits from parsepdf.py and parseword.py 2021-01-14 19:48:22 +00:00
Chris Mayo
e922dd0224 Stop using biplist
plistlib has supported binary files since Python 3.4.
2020-10-12 19:55:46 +01:00
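Note on e922dd0224: a short sketch of reading and writing a binary property list with the standard library alone. The sample dictionary is invented for illustration. `plistlib.loads()` detects the binary format automatically when no `fmt` is given.

```python
import plistlib

# Hypothetical sample data, not taken from LinkChecker.
data = {"URL": "https://example.com/", "Title": "Example"}

# Serialise to the binary plist format.
blob = plistlib.dumps(data, fmt=plistlib.FMT_BINARY)

# Parse it back; the format (XML or binary) is auto-detected.
restored = plistlib.loads(blob)
```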
Chris Mayo
0920508413
Merge pull request #498 from cjmayo/linkchecker
Tidy linkchecker
2020-09-24 19:31:07 +01:00
Chris Mayo
ca59966cf0 Add a note linking to biplist Python 3.9 compatibility bug 2020-09-23 19:38:17 +01:00
Chris Mayo
26c15c5e67 Fix deprecation warning for resolver.query()
/home/travis/build/linkchecker/linkchecker/linkcheck/checker/mailtourl.py:321: DeprecationWarning: please use dns.resolver.resolve() instead
    answers = resolver.query(domain, 'MX')
2020-09-14 19:55:05 +01:00
Chris Mayo
70d749a967 Drop Python 3.5, add 3.9 2020-09-14 19:55:05 +01:00
Chris Mayo
f268b95cf8 biplist is not compatible with Python 3.9
File ".tox/py39/lib/python3.9/site-packages/biplist/__init__.py", line 143, in readPlist
    line: raise InvalidPlistException(e)
    locals:
      InvalidPlistException = <global> <class 'biplist.InvalidPlistException'>
      e = <not found>

InvalidPlistException: module 'plistlib' has no attribute 'Data'
2020-09-14 19:55:05 +01:00
Chris Mayo
b1faef93c3
Merge pull request #495 from cjmayo/mswindows
MS Windows Python 3.7 and MS Store compatibility
2020-09-01 19:46:44 +01:00
Chris Mayo
314ec085a3
Merge pull request #462 from cjmayo/anchor
Fix anchor checking
2020-09-01 19:39:29 +01:00
Chris Mayo
a6d6fa0cd4 Tidy linkchecker intro 2020-08-30 18:40:39 +01:00
Chris Mayo
2fbd49dd0b Replace os.path.splitunc() with os.path.splitdrive()
os.path.splitunc() removed in Python 3.7.

https://docs.python.org/3/whatsnew/3.7.html#api-and-feature-removals
2020-08-29 16:57:57 +01:00
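Note on 2fbd49dd0b: on Windows, os.path.splitdrive() also handles UNC paths, which is why it can stand in for the removed splitunc(). The sketch below uses `ntpath` (the Windows implementation of `os.path`) so that it behaves the same on any platform. The server and share names are placeholders.

```python
import ntpath

# For a UNC path, the "drive" part is the \\server\share prefix.
drive, rest = ntpath.splitdrive(r"\\server\share\dir\file.txt")

# For a drive-letter path, it is the letter and colon.
letter, tail = ntpath.splitdrive(r"C:\dir\file.txt")
```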
Chris Mayo
37e4981089
Merge pull request #492 from cjmayo/pass
Assorted tidying including unneeded pass statements
2020-08-29 16:55:39 +01:00
Chris Mayo
7ef599fc20
Merge pull request #491 from cjmayo/sphinx2
Documentation Updates
2020-08-29 16:50:27 +01:00
Chris Mayo
1390c9cd7e
Merge pull request #489 from cjmayo/urlsplit
Replace deprecated urllib.parse.split functions
2020-08-29 16:44:56 +01:00
Chris Mayo
47604e7d34
Merge pull request #481 from cjmayo/failures
Rename blacklist to failures
2020-08-29 16:39:24 +01:00
Chris Mayo
7dfba766a9
Merge pull request #486 from cjmayo/url
Remove unused code from url.py
2020-08-26 19:28:50 +01:00
Chris Mayo
b1d19e5eab Update copyright and version 2020-08-23 17:24:09 +01:00
Chris Mayo
2de25d54fd Rename blacklist to failures
Continue to support blacklist for the time being, with deprecation
warnings.
2020-08-23 17:19:26 +01:00
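Note on 2de25d54fd: one common way to keep an old option name working while warning about it is sketched below. This is a generic illustration, not LinkChecker's implementation. The function `resolve_failures_option` and the dictionary-based options are hypothetical.

```python
import warnings


def resolve_failures_option(options):
    """Return the value of 'failures', accepting the deprecated 'blacklist'.

    `options` is a plain dict standing in for parsed configuration
    (an assumption made for this sketch).
    """
    if "blacklist" in options:
        warnings.warn(
            "'blacklist' is deprecated, use 'failures' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Prefer the new name if both are present.
        return options.get("failures", options["blacklist"])
    return options.get("failures")
```

Callers that still pass the old name get the same value back plus a DeprecationWarning, so the old spelling can be dropped in a later release.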
Chris Mayo
dfa1ff05dc Backport tabs to spaces from better_exchook.py 2020-08-22 17:17:02 +01:00
Chris Mayo
2864962c13 Backport bare except changes from better_exchook.py 2020-08-22 17:17:02 +01:00
Chris Mayo
1f58419322 Remove unneeded pass statements 2020-08-22 17:17:02 +01:00
Chris Mayo
8779c39735 Replace deprecated urllib.parse.split functions 2020-08-22 16:28:53 +01:00
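Note on 8779c39735: the urllib.parse.split*() helpers (for example splittype() and splithost()) are deprecated since Python 3.8, and urlsplit() exposes the same pieces as attributes. The sketch below shows the attribute access on a made-up URL. Which helpers the commit actually replaced is not stated in the log.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
parts = urlsplit("https://user@example.com:8080/path/page.html?q=1#top")

scheme = parts.scheme      # instead of splittype()
host = parts.hostname      # instead of splithost() / splitport()
port = parts.port          # parsed as an int, or None if absent
user = parts.username      # instead of splituser()
query = parts.query        # instead of splitquery()
```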
Chris Mayo
5a2eda9058
Merge pull request #488 from cjmayo/gschema
Avoid dependency on gsettings-desktop-schemas
2020-08-21 16:56:25 +01:00
Chris Mayo
1b497389b5
Merge pull request #483 from cjmayo/retryafter
Don't translate "Retry-After" server header field
2020-08-21 16:51:17 +01:00
Chris Mayo
4969b6dd0a
Merge pull request #482 from cjmayo/syntaxcheck
Fix CssSyntaxCheck list index out of range
2020-08-21 16:46:37 +01:00
Chris Mayo
e9db151145
Merge pull request #480 from cjmayo/blacklist
Fix blacklist updating
2020-08-20 19:48:59 +01:00
Chris Mayo
b869b8876f Avoid dependency on gsettings-desktop-schemas
Gio.Settings.new() causes LinkChecker to exit if the GNOME proxy schema
cannot be found.
2020-08-20 19:42:44 +01:00
Chris Mayo
cfe5c89eb6
Merge pull request #479 from cjmayo/versions
Add missing essential modules to internal error message
2020-08-20 19:36:45 +01:00
Chris Mayo
d7efa20d33 Remove unused constants from url.py 2020-08-19 19:27:28 +01:00
Chris Mayo
be24836c73 Remove unused url.url_unsplit() 2020-08-18 19:57:46 +01:00
Chris Mayo
d58b3ab285 Remove unused url.url_fix_common_typos() 2020-08-18 19:57:46 +01:00
Chris Mayo
9488e1eb41 Remove unused url.is_safe_x matches 2020-08-18 19:57:46 +01:00
Chris Mayo
71ea78382b Remove unused url.safe_host_pattern() 2020-08-18 19:57:46 +01:00
Chris Mayo
794efd6d44 Remove unused url.is_duplicate_content_url() 2020-08-18 19:57:46 +01:00