Commit graph

6745 commits

Author SHA1 Message Date
Chris Mayo
9ed2d8703b Fix installation from source without git installed 2021-12-15 19:40:27 +00:00
Chris Mayo
92f189579e
Merge pull request #593 from cjmayo/man-updates
Updated man pages and translation catalogs
2021-12-15 19:31:21 +00:00
Chris Mayo
5f3b007934
Merge pull request #591 from cjmayo/robot
Assume robots.txt is UTF-8
2021-12-15 19:31:00 +00:00
Chris Mayo
4b6f3c38b3
Merge pull request #590 from cjmayo/guess
Don't guess the URL for files that end in .html
2021-12-15 19:30:40 +00:00
LinkChecker
e6021559e8 Update application translation catalogs 2021-12-14 19:27:40 +00:00
LinkChecker
7497f1aa63 Update man pages 2021-12-14 19:27:40 +00:00
LinkChecker
289ef5dc5b Update doc locale 2021-12-14 19:27:20 +00:00
Chris Mayo
d70ec6f75b Assume robots.txt is UTF-8
Match the Python standard library and Google's interpretation:
https://developers.google.com/search/docs/advanced/robots/robots_txt#file-format

Avoid Unhandled LookupError.
2021-12-13 19:31:55 +00:00
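A minimal sketch of the idea behind this commit, not LinkChecker's actual code: decode the robots.txt body as UTF-8 regardless of any charset the server reports, then hand the lines to the standard library parser. The helper name `parse_robots` and the choice of `errors="replace"` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(raw: bytes) -> RobotFileParser:
    """Parse robots.txt bytes, always decoding them as UTF-8 (hypothetical helper)."""
    parser = RobotFileParser()
    # A fixed codec name cannot raise LookupError; errors="replace" also
    # avoids UnicodeDecodeError on stray non-UTF-8 bytes (an assumption here).
    text = raw.decode("utf-8", errors="replace")
    parser.parse(text.splitlines())
    return parser


rules = parse_robots(b"User-agent: *\nDisallow: /private/\n")
print(rules.can_fetch("LinkChecker", "http://example.com/private/page"))
print(rules.can_fetch("LinkChecker", "http://example.com/public"))
```

Because the codec name is hard-coded, an unknown or misspelled charset in the HTTP response can no longer reach `bytes.decode()`.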
Chris Mayo
76815bcf47 Don't guess the URL for files that end in .html
Fixes:
linkchecker ftp.html
failing looking for ftp://ftp.html
2021-12-13 19:31:13 +00:00
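The fix can be pictured with a small sketch. The function name `guess_url` and the prefix rules below are assumptions chosen to reproduce the `ftp.html` case from the commit message; they are not taken from the LinkChecker source.

```python
def guess_url(arg: str) -> str:
    """Turn a bare command-line argument into a URL (hypothetical sketch)."""
    lowered = arg.lower()
    # Names ending in .html are treated as local files and left untouched,
    # so "ftp.html" is no longer rewritten to "ftp://ftp.html".
    if lowered.endswith(".html"):
        return arg
    if lowered.startswith("www."):
        return "http://" + arg
    if lowered.startswith("ftp."):
        return "ftp://" + arg
    return arg


print(guess_url("ftp.html"))
print(guess_url("ftp.example.org"))
```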
Chris Mayo
7f175c13d4
Merge pull request #589 from cjmayo/proxy
Remove linkcheck.checker.proxysupport
2021-12-13 19:29:53 +00:00
Chris Mayo
315e729ec3 Fix typo "variables" in linkchecker(1) Proxy Support 2021-12-13 19:25:23 +00:00
Chris Mayo
9504a6dddf Document the curl_ca_bundle environment variable 2021-12-13 19:25:23 +00:00
Chris Mayo
a2e379a595 Remove built-in GNOME and KDE proxy support
Only http_proxy was ever supported.

Requests uses urllib.request.getproxies().

Fedora 35 and Ubuntu 20.04 do set proxy environment variables when
settings are added through the GUI.

GNOME location of proxy settings is subject to change:
https://wiki.gnome.org/Projects/NetworkManager/Proxies
https://gitlab.gnome.org/GNOME/gsettings-desktop-schemas/-/issues/27
2021-12-13 19:25:23 +00:00
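As a rough illustration of why the desktop-specific code became unnecessary, the standard library call named above already picks up proxy settings from the environment. The proxy host below is a placeholder, and setting the variables inside the script is only done to keep the example self-contained.

```python
import os
from urllib.request import getproxies

# Placeholder values; normally the shell or desktop session sets these.
os.environ["http_proxy"] = "http://proxy.example:3128"
os.environ["https_proxy"] = "http://proxy.example:3128"

# Returns a dict mapping URL scheme to proxy URL.
proxies = getproxies()
print(proxies.get("http"))
print(proxies.get("https"))
```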
Chris Mayo
fe5a34c68f Remove linkcheck.checker.proxysupport
Set up the requests.Session() with the complete proxy configuration
to fix a problem with using an HTTP server as an HTTPS proxy and
potential redirection issues.

Requests handles no_proxy.
2021-12-13 19:25:23 +00:00
Chris Mayo
35ecb7e639 Add https_proxy to internal error message 2021-12-13 19:25:23 +00:00
Chris Mayo
a60648e348 Remove support for ftp_proxy
Was limited to HTTP proxy servers and prevents simplifying and fixing
HTTP proxy support.
2021-12-13 19:25:23 +00:00
Chris Mayo
f2e5a435e3 Remove unused ProxySupport.proxyauth
Not used since:
7b34be590 ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)
2021-12-13 19:25:23 +00:00
Chris Mayo
2e9f08748d
Merge pull request #587 from cjmayo/util-actions
Workflows to automate updating man pages and create distribution files
2021-12-13 19:24:48 +00:00
Chris Mayo
96f207d25f
Merge pull request #585 from cjmayo/scm_git_archive
Make installing from the git archive of a tagged commit possible
2021-12-13 19:23:34 +00:00
Chris Mayo
01dfc13886
Merge pull request #584 from cjmayo/releasedate
Set release date from HEAD
2021-12-13 19:22:19 +00:00
Chris Mayo
0b3bdedd6d
Merge pull request #583 from cjmayo/newest
Replace "Get the newest version at"
2021-12-13 19:21:32 +00:00
Chris Mayo
945ad903a3
Merge pull request #579 from cjmayo/redirect
Update HttpUrl.encoding after following redirects
2021-12-13 19:20:28 +00:00
Koen Van den Wijngaert
900586dc01
Better handling for link rel dns-prefetch and add preconnect support (#536)
preconnect is only DNS checked.

This is allowed even in the Resource Hints Editor's Draft
https://w3c.github.io/resource-hints/#preconnect
2021-12-09 19:38:30 +00:00
Chris Mayo
ef33a61b41 Workflow to automatically add Python distribution files to the release 2021-12-09 19:31:28 +00:00
Chris Mayo
26ed46ad40 Workflow to semi-automate man page and translation updates
Expected to be used from a fork, it will create a branch man-updates,
based on the upstream repository with additional commits with the updates.
Then a pull request would be created to send the updates upstream.
2021-12-09 19:31:28 +00:00
Chris Mayo
9dd39ef264 Remove *.mo from ignore-bad-ideas
.mo files have not been distributed since:
e297b1a47 ("Stop including binary translation catalogs in the source", 2021-11-22)
2021-12-07 19:44:20 +00:00
Chris Mayo
7eb4cc7a66 Fix check warning that MANIFEST.in is not compatible with Windows 2021-12-07 19:44:20 +00:00
Chris Mayo
699008ddc2 Set release date from HEAD
Match the version reported using setuptools_scm.
Use the same git command as setuptools_scm uses for node_date.
2021-12-07 19:44:20 +00:00
Chris Mayo
2cc1319bee GitHub release must be created before Python distribution files
setuptools_scm will use the release tag.
2021-12-06 19:40:14 +00:00
Chris Mayo
d2d2a563cc Make installing from the git archive of a tagged commit possible
aka make GitHub release assets usable.

Requires setuptools-scm-git-archive.
2021-12-06 19:40:02 +00:00
Chris Mayo
d08f6a0730 Replace "Get the newest version at" 2021-12-06 19:36:22 +00:00
Chris Mayo
a04214465a Update HttpUrl.encoding after following redirects 2021-12-06 19:34:31 +00:00
Chris Mayo
0325ecd73f Remove httpurl.HEADER_ENCODING
Unused since:
d91a32822 ("Remove strformat.unicode_safe() and strformat.url_unicode_split()", 2020-07-07)
2021-12-06 19:34:31 +00:00
Chris Mayo
b6d97be46c
Merge pull request #581 from cjmayo/version
Fix LinkChecker version shown on github.io
2021-12-06 19:31:22 +00:00
Chris Mayo
d1ad23016c
Merge pull request #582 from cjmayo/readme
Reference manuals and mention GitHub Packages in README
2021-12-06 19:30:11 +00:00
Chris Mayo
702bf36f53
Merge pull request #571 from cjmayo/cchardet
Add guidance on character set detecting to install.txt
2021-12-06 19:28:55 +00:00
Chris Mayo
7f78acf856 Fetch tag history in publish-pages 2021-12-06 19:27:49 +00:00
Chris Mayo
3eb3a70aab Limit token permissions and pin 3rd-party action in publish-pages 2021-12-06 19:27:49 +00:00
Chris Mayo
a4d2d49989
Merge pull request #580 from cjmayo/yamllint
Add a yamllint check for workflows
2021-12-06 19:24:59 +00:00
Chris Mayo
3b19680e97 Add guidance on character set detecting including cchardet 2021-12-06 19:24:26 +00:00
Chris Mayo
7e882d2530 Reference manuals and mention GitHub Packages in README 2021-12-01 19:43:07 +00:00
Chris Mayo
454ce0c3a5 Add a yamllint check for workflows 2021-11-30 19:45:17 +00:00
Chris Mayo
2bc0b716bc
Merge pull request #574 from cjmayo/nl_NL
Add Dutch translation
2021-11-30 19:29:39 +00:00
Chris Mayo
dc05b33460 Update application translation catalogs 2021-11-30 19:24:44 +00:00
Chris Mayo
0171bc950f Update nl_NL.po header fields 2021-11-30 19:24:44 +00:00
Gideon van Melle
8b11c2f56a Add Dutch translation
I made a Dutch translation
2021-11-30 19:24:44 +00:00
Chris Mayo
606472e910
Merge pull request #572 from cjmayo/latin1
Ignore an encoding of ISO-8859-1 returned by Requests
2021-11-29 19:55:16 +00:00
Chris Mayo
c89c617a58 Ignore an encoding of ISO-8859-1 returned by Requests
ISO-8859-1 is a fallback for Requests and causes us to mangle UTF-8
content.

Requests' utils.py:

def get_encoding_from_headers(headers):
    """Returns encodings from given HTTP Header Dict.

    :param headers: dictionary to extract encoding from.
    :rtype: str
    """

    content_type = headers.get('content-type')

    if not content_type:
        return None

    content_type, params = _parse_content_type_header(content_type)

    if 'charset' in params:
        return params['charset'].strip("'\"")

    if 'text' in content_type:
        return 'ISO-8859-1'

    if 'application/json' in content_type:
        # Assume UTF-8 based on RFC 4627: https://www.ietf.org/rfc/rfc4627.txt since the charset was unset
        return 'utf-8'
2021-11-29 19:52:37 +00:00
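A minimal sketch of the approach described in this commit, under the assumption that the caller falls back to its own detection when no trustworthy encoding is available. The helper name `effective_encoding` is hypothetical and not part of LinkChecker or Requests.

```python
from typing import Optional


def effective_encoding(reported: Optional[str]) -> Optional[str]:
    """Return the encoding to trust, or None to fall back to detection."""
    if reported is None:
        return None
    # Requests reports ISO-8859-1 for text/* responses without a charset,
    # so this value may be a default rather than a server declaration.
    if reported.strip().lower() == "iso-8859-1":
        return None
    return reported


raw = "café".encode("utf-8")
print(raw.decode("iso-8859-1"))  # mangled: cafÃ©
print(raw.decode(effective_encoding("ISO-8859-1") or "utf-8"))  # café
```

One trade-off of this sketch: a server that explicitly declares ISO-8859-1 is ignored as well, because the reported value alone does not reveal whether it came from the header or from the fallback.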
Chris Mayo
a78c78a803
Merge pull request #569 from cjmayo/output
Improve output documentation and make quiet/-q set log level to warning
2021-11-29 19:51:38 +00:00
Chris Mayo
6269134017 Correct documentation of text output encoding
50ea41b12 ("use preferred locale for default encoding", 2005-10-13)
714147cb2 ("Improved language and encoding detection by using local.getdefaultlocale() instead of locale.getlocale(category=LC_ALL)", 2009-03-04)
2021-11-29 19:48:50 +00:00