6381 commits

Author SHA1 Message Date
Chris Mayo
02deff426b Check the main linkchecker executable from tox 2020-05-26 19:49:57 +01:00
Chris Mayo
a8301f43ca Update setup.cfg flake8 section
- Add the _n gettext prefix as a builtin, resolves "F821 undefined
  name"
- Include the main executable when running flake8 from the top directory
  without specifying files
- Ignore E402,F401 module import violations for specific files
2020-05-26 19:48:28 +01:00
Chris Mayo
50df8035d0
Merge pull request #413 from cjmayo/deprecated_options
Remove ineffective command-line options
2020-05-26 19:44:28 +01:00
Chris Mayo
3d2407ed70
Merge pull request #416 from cjmayo/imports
Resolve flake8 import related violations
2020-05-26 19:43:56 +01:00
Chris Mayo
b56cb634e6
Merge pull request #417 from cjmayo/test_network
Adjust tests for need_network
2020-05-26 19:31:30 +01:00
Marius Gedminas
5a2e5ba7e6
Merge pull request #419 from linkchecker/fix-misleading-readme
Fix misleading information in the README
2020-05-26 18:20:53 +03:00
Marius Gedminas
c567adcd7c
Fix misleading information in the README
(Also change some HTTP URLs to HTTPS).

I think this is the minimum fix we should do ASAP to avoid [user confusion](https://github.com/linkchecker/linkchecker/issues/88#issuecomment-633853033), while we hash out all the questions raised in the discussion in #362.
2020-05-26 11:31:28 +03:00
anarcat
bdc9c1ce88
Merge pull request #418 from linkchecker/remove-unused-gui-screenshot
Remove unused screenshot
2020-05-25 16:56:16 -04:00
Marius Gedminas
43d1403b1b Remove unused screenshot
git grep tells me shot2.png is not referenced anywhere in the tree.  It
depicts the old, removed Qt GUI.
2020-05-25 22:25:48 +03:00
Marius Gedminas
99d1e59ed2
Merge pull request #415 from linkchecker/manifest
Include all files in the sdist via MANIFEST.in
2020-05-25 22:15:21 +03:00
Marius Gedminas
10c56ceca5 Remove unneeded globs from MANIFEST.in
qhp and qhcp are Qt help files, removed with the GUI

rej is a rejected patch, definitely not part of the source.

There are no bat/cer/pvk/pfk files under tests.

Now python3 setup.py -q sdist no longer produces warnings.
2020-05-25 22:03:44 +03:00
Chris Mayo
f6e182f0e4 Mark TestFile.test_html_url_quote as need_network
Otherwise, without internet access, the test eventually fails with:

warning No MX mail host for users.sourceforge.net found
2020-05-25 19:55:28 +01:00
Chris Mayo
d3c9618b1b TestHttps.test_https doesn't need the internet now
A result of changes introduced in:

dee4be4b ("Enable https checking using a test server", 2019-11-11)
2020-05-25 19:55:28 +01:00
Chris Mayo
32689ea230 Enable as many TestHttp html tests as possible without the internet 2020-05-25 19:55:28 +01:00
Chris Mayo
97f50e8be1 Remove unused import htmlsoup from checker/httpurl.py
Unused since:

f7337f55 ("Fix error due to an empty html file accessed over http", 2020-05-23)
2020-05-25 19:50:57 +01:00
Chris Mayo
3473656fe1 Replace import of distutils.spawn.find_executable with shutil.which 2020-05-25 19:50:57 +01:00
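The replacement named in this commit can be illustrated with a short sketch. `shutil.which` is the standard-library call the commit switches to; the helper name `has_executable` is hypothetical and not taken from the linkchecker source.

```python
import shutil
import sys


def has_executable(name):
    """Return True if `name` resolves to an executable on PATH.

    Hypothetical helper for illustration only. shutil.which returns the
    resolved path as a string, or None when nothing is found.
    """
    return shutil.which(name) is not None


if __name__ == "__main__":
    # The running interpreter is passed as a path, so it normally resolves.
    print(has_executable(sys.executable))
```

Unlike `distutils.spawn.find_executable`, `shutil.which` lives in a module that is not tied to the distutils package, which is presumably part of the motivation for the change.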
Chris Mayo
6dda2f9669 Move imports to the top of files to resolve flake8 E402 2020-05-25 19:50:57 +01:00
Chris Mayo
0f3444e906 Drop run-time requests version check
Requests 2.4.0 was released in 2014.
2020-05-25 19:50:57 +01:00
Chris Mayo
89c7c74bcf Remove unused set_linecache() from better_exchook2.py 2020-05-25 19:50:57 +01:00
Chris Mayo
7257e5e1a0 Remove unused imports in parser/__init__.py 2020-05-25 19:50:57 +01:00
Marius Gedminas
4cab3214c9 Oops, editor accident 2020-05-25 21:50:50 +03:00
Chris Mayo
3c62adb1ba Remove ineffective command-line options
These options were replaced by plugins and made ineffective [1]. This change
was included in the 9.0 release.

[1] 7b34be59 ("Introduce check plugins, use Python requests for http/s
    connections, and some code cleanups and improvements.", 2014-03-01)
2020-05-25 19:27:11 +01:00
Chris Mayo
083aa164e8 Add codename and release date for 9.4 to changelog.txt 2020-05-25 19:27:11 +01:00
Marius Gedminas
93612adcf7 Add check-manifest to CI
Closes #414.
2020-05-25 19:33:17 +03:00
Marius Gedminas
9e06c0ea80 Update MANIFEST.in to list ALL THE FILES!
We've discussed this in #414 and everyone's fine with including all
source files in the sdists (where "source file" == file versioned in git).
2020-05-25 19:33:10 +03:00
Marius Gedminas
00c1b410d0 Remove MANIFEST build logic from the Makefile
I almost didn't notice it existed.
2020-05-25 19:33:10 +03:00
Marius Gedminas
6a2d247be1 Remove custom MANIFEST logic from setup.py
I don't like the extra MANIFEST file lying around.  It clashes with the
old distutils feature of having a MANIFEST file.  I intend to replace
this check with check-manifest.
2020-05-25 19:27:02 +03:00
Chris Mayo
6c8e88dae6
Merge pull request #412 from cjmayo/unicode2
Remove instances of Python 2 unicode
2020-05-24 19:20:07 +01:00
Chris Mayo
313a14ff0d Remove instances of Python 2 unicode 2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
Marius Gedminas
dcafa2df75
Avoid u-prefixed strings
linkchecker is Python 3 only, all strings are unicode.
2020-05-24 14:50:07 +03:00
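A quick way to see why the prefix is redundant on Python 3: both literals below have the same type and compare equal. The variable names are illustrative only.

```python
# On Python 3 the u prefix is accepted for backwards compatibility (PEP 414)
# but changes nothing: both literals are plain str.
prefixed = u"linkchecker"
plain = "linkchecker"

same_type = type(prefixed) is type(plain) is str
same_value = prefixed == plain
```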
Chris Mayo
9c982533e0
Merge pull request #411 from cjmayo/empty-http
Fix internal error on empty HTML files accessed over HTTP
2020-05-23 20:27:12 +01:00
Chris Mayo
03b1c4919d Record encoding in debug log messages 2020-05-23 20:01:24 +01:00
Chris Mayo
f7337f55e8 Fix error due to an empty html file accessed over http
Use the already fixed [1] UrlBase.get_content() in HttpUrl.

[1] 5bd1fb4 ("Fix internal error on empty HTML files", 2020-05-21)
2020-05-23 20:01:24 +01:00
Chris Mayo
d611564cb0 Add a test for an empty html file accessed over http 2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting 2020-05-23 14:15:52 +03:00
anarcat
b1e8137da2
Merge pull request #410 from cjmayo/install
Installation of test data, standardise Markdown extension, remove linkchecker.desktop
2020-05-22 20:07:15 -04:00
Chris Mayo
df79e9b196 Add missing test data and Markdown documentation to the distribution 2020-05-22 19:43:57 +01:00
Chris Mayo
c60887cc63 Rename .mdwn files to .md
- RFC 7763 file extensions are .md and .markdown
- Consistent with other documentation files
2020-05-22 19:43:57 +01:00
Chris Mayo
87f0c31928 Remove linkchecker.desktop
- Not useful for a command-line application
- Refers to an icon with a generic name, that is not installed
2020-05-22 19:43:57 +01:00
Marius Gedminas
6dffacf17f
Merge pull request #409 from linkchecker/fix-login-timeouts
Make sure login form fetching uses a timeout and sends User-Agent
2020-05-22 21:40:48 +03:00
anarcat
2256a6e889
Merge pull request #408 from linkchecker/fix-timeouts
Make sure fetching robots.txt uses the configured timeout
2020-05-22 14:29:12 -04:00
Marius Gedminas
b0435b3d47 Make sure login form fetching uses a timeout
Also resolve an XXX comment about the User-Agent header (which is
configured in new_request_session), but add a couple of XXX comments
about using proxy and possibly disabling TLS certificate checking.
2020-05-22 11:19:51 +03:00
Marius Gedminas
4f3fe5e1c3 Make sure fetching robots.txt uses the configured timeout
Closes #396.
2020-05-22 10:53:33 +03:00
Marius Gedminas
639ba0dba2
Merge pull request #406 from linkchecker/fix-empty-file-problem
Fix internal error on empty HTML files
2020-05-21 19:57:46 +03:00
Marius Gedminas
c60d7c66e4 Clarify the decision to fall back to Latin-1 2020-05-21 19:35:39 +03:00
Marius Gedminas
5bd1fb4e36 Fix internal error on empty HTML files
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None.  It doesn't matter what encoding we pick for
empty files, so let's just pick one.

I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.

Closes #392.
2020-05-21 19:01:33 +03:00
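The fallback described in this commit message can be sketched without BeautifulSoup. The names `choose_encoding`, `decode_content`, and `FALLBACK_ENCODING` are hypothetical, and `detected` stands in for the `original_encoding` value the message mentions. ISO-8859-1 is assumed as the fallback because the commit "Clarify the decision to fall back to Latin-1" in this log names it. This is not the project's actual code.

```python
FALLBACK_ENCODING = "iso-8859-1"  # assumption: Latin-1, which can decode any byte


def choose_encoding(detected):
    """Return the detected encoding, or a fixed fallback when it is None."""
    return detected if detected is not None else FALLBACK_ENCODING


def decode_content(data, detected):
    """Decode raw bytes using the detected encoding or the fallback."""
    return data.decode(choose_encoding(detected))
```

For an empty file the choice of fallback has no visible effect, since `b""` decodes to an empty string under any encoding.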
Marius Gedminas
fd3ab13470
Merge pull request #397 from linkchecker/doc-389
do not require ssh to clone from source
2020-05-21 18:28:20 +03:00
anarcat
a226b4e406
Merge pull request #405 from cjmayo/tidyten13
Remove encoding of TestLogger diff and url in Checker.check_url_data()
2020-05-21 08:56:08 -04:00
Chris Mayo
6cfc8eeb49 Replace threading.Thread.setName() with setting the name property
As recommended in:

https://docs.python.org/3.5/library/threading.html#threading.Thread.setName
2020-05-20 19:58:44 +01:00
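A minimal sketch of the change this commit describes, assuming a hypothetical `make_worker` helper that is not part of the linkchecker source:

```python
import threading


def make_worker(name):
    """Create a thread and label it through the name property."""
    worker = threading.Thread(target=lambda: None)
    worker.name = name  # preferred over the older worker.setName(name)
    return worker
```

The property can be read back the same way (`worker.name`), so no getter or setter call is needed.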