linkchecker/linkcheck
Chris Mayo c89c617a58 Ignore an encoding of ISO-8859-1 returned by Requests
Requests falls back to ISO-8859-1 when a text content type has no charset
parameter, and honouring that fallback causes us to mangle UTF-8 content.

Requests' utils.py:

def get_encoding_from_headers(headers):
    """Returns encodings from given HTTP Header Dict.

    :param headers: dictionary to extract encoding from.
    :rtype: str
    """

    content_type = headers.get('content-type')

    if not content_type:
        return None

    content_type, params = _parse_content_type_header(content_type)

    if 'charset' in params:
        return params['charset'].strip("'\"")

    if 'text' in content_type:
        return 'ISO-8859-1'

    if 'application/json' in content_type:
        # Assume UTF-8 based on RFC 4627: https://www.ietf.org/rfc/rfc4627.txt since the charset was unset
        return 'utf-8'
2021-11-29 19:52:37 +00:00
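To illustrate the problem the commit message describes, here is a minimal sketch of ignoring the fallback value. It is not the code from this commit: `effective_encoding` and `decode_body` are hypothetical helper names, and decoding as UTF-8 when no encoding is known is an assumption made for the example. The comparison is an exact match against the literal `'ISO-8859-1'` that `get_encoding_from_headers` returns as its fallback, so a server that explicitly declares a charset with that same spelling cannot be told apart from the fallback.

```python
# Hypothetical helpers, for illustration only; not taken from linkcheck/.

REQUESTS_FALLBACK = "ISO-8859-1"  # literal returned by the quoted function for text/* types


def effective_encoding(reported):
    """Return None when the reported encoding equals Requests' fallback literal."""
    if reported == REQUESTS_FALLBACK:
        return None
    return reported


def decode_body(data, reported, default="utf-8"):
    """Decode bytes, using `default` (an assumption here) when nothing usable was reported."""
    encoding = effective_encoding(reported) or default
    return data.decode(encoding, errors="replace")


body = "café".encode("utf-8")

# Trusting the fallback turns the two-byte UTF-8 sequence for "é" into two characters.
mangled = body.decode(REQUESTS_FALLBACK)

# Ignoring the fallback lets the body be decoded as UTF-8 instead.
clean = decode_body(body, REQUESTS_FALLBACK)
```

In this sketch `mangled` is `'cafÃ©'` while `clean` is `'café'`. A real checker would more likely detect the encoding from the content itself than assume UTF-8.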
bookmarks Stop using biplist 2020-10-12 19:55:46 +01:00
cache Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
checker Ignore an encoding of ISO-8859-1 returned by Requests 2021-11-29 19:52:37 +00:00
configuration Make quiet/-q set application logging to warning 2021-11-29 19:48:50 +00:00
director Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
htmlutil Fix treating data: URIs in srcset values as links 2020-08-07 20:04:23 +01:00
logger Tidy linkchecker intro 2020-08-30 18:40:39 +01:00
network Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
parser Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
plugins Disable AnchorCheck plugin 2021-11-29 19:35:34 +00:00
__init__.py Drop Python 3.5, add 3.9 2020-09-14 19:55:05 +01:00
ansicolor.py Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
better_exchook2.py Backport tabs to spaces from better_exchook.py 2020-08-22 17:17:02 +01:00
cmdline.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
colorama.py Restore better_exchook2.py and colorama.py to pre-Black state 2020-06-03 20:06:36 +01:00
containers.py Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
cookies.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
decorators.py Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
dummy.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
fileutil.py Remove isinstance() from fileutil.path_safe() 2020-06-18 19:27:06 +01:00
ftpparse.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
httputil.py Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
i18n.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
lc_cgi.py Disable status logging in WSGI application 2021-01-28 19:20:24 +00:00
loader.py Merge pull request #478 from cjmayo/imp 2020-08-18 19:56:40 +01:00
lock.py Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
log.py Run black on linkcheck/ 2020-05-30 17:01:36 +01:00
logconf.py Fix remaining flake8 violations in linkcheck/ 2020-05-30 17:01:36 +01:00
memoryutil.py Fix formatting and typos in docstrings 2020-07-25 16:35:48 +01:00
mimeutil.py Detect sitemaps that do not start with an XML declaration 2020-08-11 19:35:56 +01:00
robotparser2.py Fix broken external links in documentation 2021-08-12 19:28:50 +01:00
socketutil.py Remove spaces after names in function definitions 2020-05-16 20:19:42 +01:00
strformat.py Merge pull request #444 from cjmayo/isinstance 2020-07-08 19:55:29 +01:00
threader.py Convert to Python 3 super() 2020-06-03 20:06:36 +01:00
trace.py Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
url.py Merge pull request #489 from cjmayo/urlsplit 2020-08-29 16:44:56 +01:00