Commit graph

3243 commits

Author SHA1 Message Date
Malte Gerth
cc48a09308 Add Telegram and WhatsApp link schemes 2022-02-06 23:41:33 +01:00
Malte Gerth
067dd8edbb Update IANA schemes 2022-02-06 23:40:36 +01:00
Chris Mayo
141a811ba6 Enable creating a binary with PyOxidizer
With PyOxidizer 0.18.0 AppName in setup.py has to be changed to the
all lower case "linkchecker".

Application translations do not work.

better_exchook2.fallback_findfile() may still need converting, first
needs a test.
2021-12-30 19:27:04 +00:00
Chris Mayo
5768b76f6c Use pkgutil to simplify loader.get_package_modules()
Replaces use of __file__.
2021-12-30 19:27:04 +00:00
Chris Mayo
a55bbc5237 Write RELEASE_DATE to egg-info 2021-12-30 19:27:04 +00:00
Chris Mayo
50b2063a4b Install translation catalogs in the package data
Custom clean command no longer needed because share directory is not
created in build.
2021-12-30 19:27:04 +00:00
Chris Mayo
1d10fffde4 Use package metadata 2021-12-30 19:27:04 +00:00
Chris Mayo
819dacb9bb Install linkcheckerrc in the package data
data/__init__.py needed for Python < 3.10
(namespace packages supported from importlib_resources v3.2)
2021-12-30 19:27:04 +00:00
Chris Mayo
5c0d66dd74 Raise minimum Python requirement to 3.7 2021-12-30 19:27:04 +00:00
Chris Mayo
a9ab4d847b Remove get_share_file()
cacert.pem not used since:
e3ab9024 ("Remove platform-specific installer stuff and ensure a build .whl wheel file can be built.", 2016-01-17)
2021-12-30 19:27:04 +00:00
Chris Mayo
2fa0016ae9 Remove Portable
Building portable removed in:
e3ab9024 ("Remove platform-specific installer stuff and ensure a build .whl wheel file can be built.", 2016-01-17)
2021-12-30 19:27:04 +00:00
Chris Mayo
3359c7364f Remove is_frozen()
Not used since:
e3ab9024 ("Remove platform-specific installer stuff and ensure a build .whl wheel file can be built.", 2016-01-17)
2021-12-30 19:27:04 +00:00
Chris Mayo
271cb59e62 Remove unused code from i18n 2021-12-30 19:27:04 +00:00
Chris Mayo
158c401dae Update copyright to 2022 2021-12-30 19:27:04 +00:00
Chris Mayo
8bc3b39b41 One more proxy documentation update
a2e379a5 ("Remove built-in GNOME and KDE proxy support", 2021-12-13)
2021-12-21 19:23:00 +00:00
Chris Mayo
5fef9a3b60 Generate linkchecker command using an entry point
drop_privileges() is only used by the linkchecker command.
Move installing SIGUSR1 handler to the linkchecker command only - fixes
intermittent test failures.
2021-12-20 19:34:58 +00:00
Chris Mayo
efb92fbee8 Create setup_config from linkchecker 2021-12-20 19:34:58 +00:00
Chris Mayo
e501c4ffac Create ArgParser from linkchecker 2021-12-20 19:34:58 +00:00
Chris Mayo
9bc1f4d04e Use relative import for configuration in failures.py 2021-12-20 19:34:58 +00:00
Chris Mayo
4444a87eb9 Update Requests bug link 2021-12-15 19:34:24 +00:00
Chris Mayo
5f3b007934
Merge pull request #591 from cjmayo/robot
Assume robots.txt is UTF-8
2021-12-15 19:31:00 +00:00
Chris Mayo
d70ec6f75b Assume robots.txt is UTF-8
Match the Python standard library and Google's interpretation:
https://developers.google.com/search/docs/advanced/robots/robots_txt#file-format

Avoid Unhandled LookupError.
2021-12-13 19:31:55 +00:00
Chris Mayo
76815bcf47 Don't guess the URL for files that end in .html
Fixes:
linkchecker ftp.html
failing looking for ftp://ftp.html
2021-12-13 19:31:13 +00:00
Chris Mayo
9504a6dddf Document the curl_ca_bundle environment variable 2021-12-13 19:25:23 +00:00
Chris Mayo
a2e379a595 Remove built-in GNOME and KDE proxy support
Only http_proxy was ever supported.

Requests uses urllib.request.getproxies().

Fedora 35 and Ubuntu 20.04 do set proxy environment variables when
settings are added through the GUI.

GNOME location of proxy settings is subject to change:
https://wiki.gnome.org/Projects/NetworkManager/Proxies
https://gitlab.gnome.org/GNOME/gsettings-desktop-schemas/-/issues/27
2021-12-13 19:25:23 +00:00
Chris Mayo
fe5a34c68f Remove linkcheck.checker.proxysupport
Set up the requests.Session() with the complete proxy configuration
to fix a problem with using an HTTP server as an HTTPS proxy and
potential redirection issues.

Requests handles no_proxy.
2021-12-13 19:25:23 +00:00
Chris Mayo
35ecb7e639 Add https_proxy to internal error message 2021-12-13 19:25:23 +00:00
Chris Mayo
a60648e348 Remove support for ftp_proxy
Was limited to HTTP proxy servers and prevents simplifying and fixing
HTTP proxy support.
2021-12-13 19:25:23 +00:00
Chris Mayo
f2e5a435e3 Remove unused ProxySupport.proxyauth
Not used since:
7b34be590 ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)
2021-12-13 19:25:23 +00:00
Chris Mayo
0b3bdedd6d
Merge pull request #583 from cjmayo/newest
Replace "Get the newest version at"
2021-12-13 19:21:32 +00:00
Chris Mayo
945ad903a3
Merge pull request #579 from cjmayo/redirect
Update HttpUrl.encoding after following redirects
2021-12-13 19:20:28 +00:00
Koen Van den Wijngaert
900586dc01
Better handling for link rel dns-prefetch and add preconnect support (#536)
preconnect is only DNS checked.

This is allowed even in the Resource Hints Editor's Draft
https://w3c.github.io/resource-hints/#preconnect
2021-12-09 19:38:30 +00:00
Chris Mayo
d08f6a0730 Replace "Get the newest version at" 2021-12-06 19:36:22 +00:00
Chris Mayo
a04214465a Update HttpUrl.encoding after following redirects 2021-12-06 19:34:31 +00:00
Chris Mayo
0325ecd73f Remove httpurl.HEADER_ENCODING
Unused since:
d91a32822 ("Remove strformat.unicode_safe() and strformat.url_unicode_split()", 2020-07-07)
2021-12-06 19:34:31 +00:00
Chris Mayo
c89c617a58 Ignore an encoding of ISO-8859-1 returned by Requests
ISO-8859-1 is a fallback for Requests and causes us to mangle UTF-8
content.

Requests' utils.py:

def get_encoding_from_headers(headers):
    """Returns encodings from given HTTP Header Dict.

    :param headers: dictionary to extract encoding from.
    :rtype: str
    """

    content_type = headers.get('content-type')

    if not content_type:
        return None

    content_type, params = _parse_content_type_header(content_type)

    if 'charset' in params:
        return params['charset'].strip("'\"")

    if 'text' in content_type:
        return 'ISO-8859-1'

    if 'application/json' in content_type:
        # Assume UTF-8 based on RFC 4627: https://www.ietf.org/rfc/rfc4627.txt since the charset was unset
        return 'utf-8'
2021-11-29 19:52:37 +00:00
Chris Mayo
a4b14047d6 Make quiet/-q set application logging to warning 2021-11-29 19:48:50 +00:00
Chris Mayo
0356524369 Disable AnchorCheck plugin
Can't be relied on. Multiple reports of expected results not returned.

https://github.com/linkchecker/linkchecker/issues/542
https://github.com/linkchecker/linkchecker/issues/555
https://github.com/linkchecker/linkchecker/issues/568

Previously a fix was needed just to get the tests working:
0912e8a2c ("Don't strip the URL fragment from cache key if using AnchorCheck", 2020-07-27)

After:
eaa538c81 ("don't check one url multiple times", 2016-11-09)
2021-11-29 19:35:34 +00:00
Chris Mayo
2a77e12618 Replace deprecated Thread.getName() and Condition.notifyAll() 2021-11-16 19:45:38 +00:00
Chris Mayo
43507cf80a Make partial and example URLs in docstrings italic
Prevent Sphinx from turning them into broken links.
2021-08-12 19:28:50 +01:00
Chris Mayo
5de3920f6c Fix broken external links in documentation 2021-08-12 19:28:50 +01:00
Paul Haerle
f395c74aac
Make ResultCache max_size configurable (#544)
* Make ResultCache max_size configurable

fixes #463

* Add tests and docs.

* fix documentation...

...adapt the source, not the auto-generated man pages themselves as
requested in #544.

* fix typo.
2021-06-21 19:45:19 +01:00
Chris Mayo
c31d233f06 Disable status logging in WSGI application
Not a problem earlier because the default for the CLI is to record
status, but this was not fully implemented until:
4f3f1ac0 ("Fix status=0 setting being ignored", 2020-08-06)
2021-01-28 19:20:24 +00:00
Chris Mayo
09b4da393e Initialise Configuration.status_logger
Fixes failure of the LinkChecker WSGI application which does
not call Configuration.set_status_logger().
2021-01-28 19:20:24 +00:00
Chris Mayo
136e8a3625 Update to version 10.0.1.dev0 2021-01-28 19:20:24 +00:00
Chris Mayo
a3e9c31560 Remove execute bits from parsepdf.py and parseword.py 2021-01-14 19:48:22 +00:00
Chris Mayo
e922dd0224 Stop using biplist
plistlib has supported binary files since Python 3.4.
2020-10-12 19:55:46 +01:00
Chris Mayo
0920508413
Merge pull request #498 from cjmayo/linkchecker
Tidy linkchecker
2020-09-24 19:31:07 +01:00
Chris Mayo
ca59966cf0 Add a note linking to biplist Python 3.9 compatibility bug 2020-09-23 19:38:17 +01:00
Chris Mayo
26c15c5e67 Fix deprecation warning for resolver.query()
/home/travis/build/linkchecker/linkchecker/linkcheck/checker/mailtourl.py:321: DeprecationWarning: please use dns.resolver.resolve() instead
    answers = resolver.query(domain, 'MX')
2020-09-14 19:55:05 +01:00