Chris Mayo
353909bacc
Merge pull request #688 from cjmayo/de
...
Some German application translations
2022-11-07 19:21:13 +00:00
Chris Mayo
9bdfd52ec9
Some German application translations
2022-11-01 19:25:12 +00:00
Chris Mayo
d8b6d77706
Merge pull request #687 from cjmayo/man-updates
...
Update application translation catalogs
2022-11-01 19:21:09 +00:00
LinkChecker
b2ba5830c6
Update application translation catalogs
2022-10-31 19:45:56 +00:00
LinkChecker
eeef12cbb7
Update man pages
2022-10-31 19:45:56 +00:00
LinkChecker
4246d919df
Update doc translation catalogs
2022-10-31 19:45:38 +00:00
Chris Mayo
614e84b554
Merge pull request #686 from cjmayo/xgettext
...
Specify source encoding to xgettext
2022-10-31 19:44:30 +00:00
Chris Mayo
189cd35fdf
Specify source encoding to xgettext
...
Default is ASCII.
xgettext: Non-ASCII string at ../linkcheck/htmlutil/srcsetparse.py:39.
Please specify the source encoding through --from-code or through a comment
as specified in http://www.python.org/peps/pep-0263.html .
2022-10-31 19:39:15 +00:00
Chris Mayo
b796cec346
Merge pull request #683 from cjmayo/anchorcheckfileurl
...
Move AnchorCheck local file handling into a new class
2022-10-31 19:23:27 +00:00
Chris Mayo
740fce4df5
Merge pull request #682 from cjmayo/node16
...
Update Actions to Node16 versions
2022-10-31 19:22:50 +00:00
Chris Mayo
b2fd7b30c5
Merge pull request #684 from cjmayo/python311
...
Add Python 3.11
2022-10-26 19:31:16 +01:00
Chris Mayo
169e327d50
Add Python 3.11
2022-10-25 19:21:39 +01:00
Chris Mayo
16bee50068
Move AnchorCheck local file handling into a new class
...
When checking local files with AnchorCheck, anchors in URLs
like "example/#anchor" are not supported.
Without AnchorCheck enabled, the Real URL reported for such URLs
was changed to include the anchor when local file checking was added to
AnchorCheck, but it is the directory that is checked.
The same URL was also then used as the Parent URL for the check of each
of the contents of that directory.
For FileUrl this is a revert of:
c221afda ("Enable AnchorCheck to be used with local files", 2022-10-03)
2022-10-24 19:30:56 +01:00
Chris Mayo
776f2980bc
Update 3rd-party Actions to Node16 versions
2022-10-24 19:26:52 +01:00
Chris Mayo
a3bedadfb6
Update GitHub Actions to Node16 versions
2022-10-24 19:26:52 +01:00
Chris Mayo
b66ca30e84
Merge pull request #680 from cjmayo/misc
...
Collection of independent small improvements
2022-10-24 19:26:13 +01:00
Chris Mayo
e32c76aa5c
Make text logger outro "checked" translatable
2022-10-18 19:24:08 +01:00
Chris Mayo
9631c314dd
Use \d in regexp in TestDecorators.test_timeit2()
2022-10-18 19:24:08 +01:00
Chris Mayo
deac09d2c1
Clarify note in TestConfig
2022-10-18 19:24:08 +01:00
Chris Mayo
ef2d571761
Support building wheel from sdist
...
Build hook is also called for the wheel since:
38dea6b7 ("Fix install with pip git+https", 2022-09-13)
2022-10-18 19:24:08 +01:00
Chris Mayo
a0eb6d5187
Align documentation of debug in man pages
...
Linked to:
b3967f75 ("Correct documentation of --debug in linkchecker(1)", 2022-09-30)
2022-10-18 19:24:08 +01:00
Chris Mayo
0f36153f69
Merge pull request #679 from cjmayo/pytest
...
Fix tests failing when run with pytest
2022-10-18 19:22:09 +01:00
Chris Mayo
78536c578a
Fix tests failing when run with pytest
...
TypeError: 'NoneType' object is not callable
As per:
2cbff492 ("Fix http tests failing with pytest due to missing _()", 2022-10-03)
2022-10-17 19:26:53 +01:00
Chris Mayo
b6eea83f63
Merge pull request #676 from cjmayo/robotmap
...
Document sitemaps in linkchecker(1)
2022-10-17 19:25:57 +01:00
Chris Mayo
96c3336013
Merge pull request #677 from cjmayo/maxrate
...
Enable average HTTP request rate to be above 4 per second
2022-10-17 19:24:49 +01:00
Chris Mayo
afccdb9608
Merge pull request #675 from cjmayo/mx
...
Replace deprecated dns.resolver.query()
2022-10-17 19:23:33 +01:00
Chris Mayo
93f1d3f4ac
Document sitemaps in linkchecker(1)
2022-10-17 19:21:03 +01:00
Chris Mayo
689557d9af
Add logging of MIME types and improve docstrings
2022-10-17 19:21:03 +01:00
Chris Mayo
eab2fa410e
Log robots.txt as the sitemap parent URL
...
This is the location the sitemap URL was found in. The line being
reported is the line in robots.txt.
2022-10-17 19:21:03 +01:00
Chris Mayo
7367e6e865
Skip incomplete Sitemap in robots.txt and warn
...
Sitemap values should be fully qualified URLs; LinkChecker may not
resolve relative paths correctly.
2022-10-17 19:21:03 +01:00
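Editorial sketch of the check described in the commit above, not LinkChecker's actual code: a Sitemap value is accepted only if it has both a scheme and a host, and anything else is set aside so the caller can warn about it. The function names and the returned pair of lists are assumptions for illustration, and comment handling in robots.txt is deliberately left out.

```python
from urllib.parse import urlsplit


def is_fully_qualified(url):
    """True if the URL has both a scheme and a network location."""
    parts = urlsplit(url.strip())
    return bool(parts.scheme) and bool(parts.netloc)


def sitemap_urls(robots_txt):
    """Return (accepted, skipped) Sitemap values found in robots.txt text."""
    accepted, skipped = [], []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so "https://..." stays intact.
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "sitemap":
            continue
        value = value.strip()
        if is_fully_qualified(value):
            accepted.append(value)
        else:
            # Relative or otherwise incomplete value: skip it, caller warns.
            skipped.append(value)
    return accepted, skipped
```

A caller could log one warning per entry in the skipped list and queue only the accepted URLs for checking.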
Chris Mayo
8bc849dfde
Make --cookiefile description in linkchecker(1) a bit clearer
2022-10-17 19:21:03 +01:00
Chris Mayo
0c5db040c8
Support maxrequestspersecond less than one
2022-10-05 19:28:01 +01:00
Chris Mayo
e88cf49c8f
Enable average HTTP request rate to be above 4 per second
2022-10-05 19:28:01 +01:00
Chris Mayo
f2be98b8ad
Replace deprecated dns.resolver.query()
...
Missed in:
26c15c5e ("Fix deprecation warning for resolver.query()", 2020-09-14)
2022-10-05 19:27:13 +01:00
Chris Mayo
bbb8096df5
Add @need_network to test_no_error() in test_ignoreerrors.py
...
Needs network access for DNS:
warning No MX mail host for example.com found.
2022-10-05 19:27:13 +01:00
Chris Mayo
354ea933ca
Merge pull request #673 from cjmayo/sitemap
...
Fix sitemap output with multiple threads
2022-10-05 19:20:40 +01:00
Chris Mayo
d9265bb71c
Merge pull request #669 from cjmayo/anchorcheck
...
Re-enable AnchorCheck plugin
2022-10-03 19:36:08 +01:00
Nathan Arthur
2d1bf6ef98
Add tests for encoded anchors for file: and http:
...
I started with a test of urlencoded anchors, assuming that the URL might
have a urlencoded anchor, but the actual anchor in the HTML would NOT be
urlencoded.
2022-10-03 19:33:05 +01:00
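Editorial sketch of the situation the tests above cover, assuming the fragment in the URL is percent-encoded while the anchor name in the HTML is not. It uses only the standard library; the function name and the set of anchor names are hypothetical and do not reflect the plugin's real interface.

```python
from urllib.parse import unquote, urldefrag


def anchor_matches(url, html_anchors):
    """True if the URL's fragment, once percent-decoded, names an anchor."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return True  # no anchor in the URL, nothing to check
    # Decode the fragment before comparing, since the HTML side is
    # assumed to hold the raw (unencoded) anchor name.
    return unquote(fragment) in html_anchors
```

The same comparison applies to both http: and file: URLs, which is the parity the tests are meant to confirm.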
Nathan Arthur
33036803b0
Fix a difference in anchor quoting between http and file
...
"I added a test for file:// processing, and it was showing different
results for when the URL anchor was and wasn't quoted. I tracked it down
to code in fileurl.py that was calling url_norm, and I'm pretty sure the
code is unnecessary at this point. But I made a minimally-invasive
change, to be as safe as possible."
UrlBase.build_url() in line 174 also calls url_norm()
2022-10-03 19:33:05 +01:00
Nathan Arthur
4cdaa59fcc
Fix AnchorCheck mismatching encoded anchors
...
Problem identified by Christian Kirchhof.
2022-10-03 19:33:05 +01:00
Nathan Arthur
6499b7b233
Fix a major thread-safety bug in AnchorCheck
...
The threading issue has been there for years, but I didn't notice it
until after I thought I was done, while I was doing manual testing
(with threads re-enabled).
The problem was with storing URL-specific state (.anchors) on the
AnchorCheck object itself, because there's only one global AnchorCheck
object, so all the threads are competing to use that one single variable
(self.anchors).
The solution was to create a new object to hold .anchors, for each
processed URL.
2022-10-03 19:33:05 +01:00
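Editorial sketch of the pattern described in the commit above, under the assumption that one plugin object is shared by every worker thread. The class and function names are hypothetical and are not the plugin's real API; the point is only that per-URL state is held in a fresh object per call rather than on the shared instance.

```python
from concurrent.futures import ThreadPoolExecutor


class UrlAnchorState:
    """Hypothetical per-URL container for the anchors found in one page."""

    def __init__(self, anchors):
        self.anchors = list(anchors)

    def has_anchor(self, name):
        return name in self.anchors


class AnchorCheckSketch:
    """Hypothetical stand-in for a plugin object shared by all threads."""

    def check(self, page_anchors, wanted):
        # The anchor list lives in a local object, not on self, so
        # concurrent calls cannot overwrite each other's state.
        state = UrlAnchorState(page_anchors)
        return state.has_anchor(wanted)


def run_checks(jobs, workers=4):
    """Run (page_anchors, wanted) jobs through one shared plugin instance."""
    plugin = AnchorCheckSketch()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: plugin.check(*job), jobs))
```

Had check() instead assigned to self.anchors, two threads handling different URLs could each read the other's list between the assignment and the lookup, which is the race the commit message describes.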
Nathan Arthur
5398fd2406
Add an anchor test for multiple inter-connected files
2022-10-03 19:33:05 +01:00
Nathan Arthur
c221afdab5
Enable AnchorCheck to be used with local files
...
[I] discovered that fileurl.py was stripping the anchors from url_data,
which breaks AnchorCheck. So I stopped it from doing that, and
tried to fix up all the places that were assuming the url would map to a
filesystem file. The tests all pass, but I'm not 100% sure I caught all
the cases, or fixed them correctly.
2022-10-03 19:33:05 +01:00
Nathan Arthur
a29750c57f
Fix anchor comments in UrlBase
...
Parent url query not stripped since:
4a0c63aa ("Fix joining of URLs when parent URL has CGI parameter.", 2011-02-08)
2022-10-03 19:33:05 +01:00
Chris Mayo
2cbff49221
Fix http tests failing with pytest due to missing _()
...
TypeError: 'NoneType' object is not callable
Ensure LinkCheckTest.setUp() is called to initialise translations.
2022-10-03 19:33:05 +01:00
Chris Mayo
8b2fb86895
Remove AnchorCheck disabled note in linkcheckerrc(5)
...
A partial revert of:
fe6dea12 ("Update documentation for disabled plugins", 2021-11-29)
2022-10-03 19:33:05 +01:00
Chris Mayo
54bcefd7d7
Revert "Disable AnchorCheck plugin"
...
This reverts commit 0356524369 .
2022-10-03 19:33:05 +01:00
Chris Mayo
033dcf89f9
Merge pull request #671 from cjmayo/example
...
Fix formatting of ignoreerrors example in linkcheckerrc(5)
2022-10-03 19:22:36 +01:00
Chris Mayo
d6d5e918dc
Merge pull request #672 from cjmayo/encoding
...
Separate URL encoding and content encoding
2022-10-03 19:22:03 +01:00
Chris Mayo
e6763f8516
Fix sitemap output with multiple threads
...
SitemapXmlLogger assumes the first result logged is for the root of the
website being mapped. Ensure results are logged before content is
checked.
2022-09-30 19:22:17 +01:00