Commit graph

6523 commits

Author SHA1 Message Date
Marius Gedminas
43d1403b1b Remove unused screenshot
git grep tells me shot2.png is not referenced anywhere in the tree.  It
depicts the old, removed Qt GUI.
2020-05-25 22:25:48 +03:00
Marius Gedminas
99d1e59ed2
Merge pull request #415 from linkchecker/manifest
Include all files in the sdist via MANIFEST.in
2020-05-25 22:15:21 +03:00
Marius Gedminas
10c56ceca5 Remove unneeded globs from MANIFEST.in
qhp and qhcp are Qt help files, removed with the GUI

rej is a rejected patch, definitely not part of the source.

There are no bat/cer/pvk/pfk files under tests.

Now python3 setup.py -q sdist no longer produces warnings.
2020-05-25 22:03:44 +03:00
Chris Mayo
f6e182f0e4 Mark TestFile.test_html_url_quote as need_network
Else without the internet the test fails, eventually, with:

warning No MX mail host for users.sourceforge.net found
2020-05-25 19:55:28 +01:00
Chris Mayo
d3c9618b1b TestHttps.test_https doesn't need the internet now
A result of changes introduced in:

dee4be4b ("Enable https checking using a test server", 2019-11-11)
2020-05-25 19:55:28 +01:00
Chris Mayo
32689ea230 Enable as many TestHttp html tests as possible without the internet 2020-05-25 19:55:28 +01:00
Chris Mayo
97f50e8be1 Remove unused import htmlsoup from checker/httpurl.py
Unused since:

f7337f55 ("Fix error due to an empty html file accessed over http", 2020-05-23)
2020-05-25 19:50:57 +01:00
Chris Mayo
3473656fe1 Replace import of distutils.spawn.find_executable with shutil.which 2020-05-25 19:50:57 +01:00
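For readers unfamiliar with the replacement named in the commit above, here is a minimal sketch of looking up a program with shutil.which from the standard library. The helper name find_program and the program name used in the example are illustrative assumptions, not taken from the linkchecker source.

```python
import shutil


def find_program(name):
    """Return the full path of `name` if it is found on PATH, else None.

    Hypothetical helper for illustration only; shutil.which() does the work
    that distutils.spawn.find_executable() used to do.
    """
    return shutil.which(name)


# A name that should not exist on any system yields None rather than raising.
missing = find_program("definitely-not-a-real-program-xyz-12345")
```

Like find_executable, shutil.which returns None when nothing is found, so callers that test the result for truthiness should not need to change.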
Chris Mayo
6dda2f9669 Move imports to the top of files to resolve flake8 E402 2020-05-25 19:50:57 +01:00
Chris Mayo
0f3444e906 Drop run-time requests version check
Requests 2.4.0 was released in 2014.
2020-05-25 19:50:57 +01:00
Chris Mayo
89c7c74bcf Remove unused set_linecache() from better_exchook2.py 2020-05-25 19:50:57 +01:00
Chris Mayo
7257e5e1a0 Remove unused imports in parser/__init__.py 2020-05-25 19:50:57 +01:00
Marius Gedminas
4cab3214c9 Oops, editor accident 2020-05-25 21:50:50 +03:00
Chris Mayo
3c62adb1ba Remove ineffective command-line options
These options were replaced by plugins and made ineffective [1]. This change
was included in the 9.0 release.

[1] 7b34be59 ("Introduce check plugins, use Python requests for http/s
    connections, and some code cleanups and improvements.", 2014-03-01)
2020-05-25 19:27:11 +01:00
Chris Mayo
083aa164e8 Add codename and release date for 9.4 to changelog.txt 2020-05-25 19:27:11 +01:00
Marius Gedminas
93612adcf7 Add check-manifest to CI
Closes #414.
2020-05-25 19:33:17 +03:00
Marius Gedminas
9e06c0ea80 Update MANIFEST.in to list ALL THE FILES!
We've discussed this in #414 and everyone's fine with including all
source files in the sdists (where "source file" == file versioned in git).
2020-05-25 19:33:10 +03:00
Marius Gedminas
00c1b410d0 Remove MANIFEST build logic from the Makefile
I almost didn't notice it existed.
2020-05-25 19:33:10 +03:00
Marius Gedminas
6a2d247be1 Remove custom MANIFEST logic from setup.py
I don't like the extra MANIFEST file lying around.  It clashes with the
old distutils feature of having a MANIFEST file.  I intend to replace
this check with check-manifest.
2020-05-25 19:27:02 +03:00
Chris Mayo
6c8e88dae6
Merge pull request #412 from cjmayo/unicode2
Remove instances of Python 2 unicode
2020-05-24 19:20:07 +01:00
Chris Mayo
313a14ff0d Remove instances of Python 2 unicode 2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
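The pull request title above describes treating HTTP 429 (Too Many Requests) as a warning rather than a failure. A rough sketch of that idea follows; the function name classify_status and the three result labels are assumptions made for illustration and do not reflect how linkchecker actually structures this check.

```python
def classify_status(code):
    """Map an HTTP status code to 'ok', 'warning' or 'error'.

    Hypothetical classification for illustration: 429 means the server is
    rate limiting us, which says nothing about whether the link is broken,
    so it is reported as a warning instead of an error.
    """
    if code == 429:
        return "warning"
    if code >= 400:
        return "error"
    return "ok"


results = [classify_status(c) for c in (200, 404, 429)]
```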
Marius Gedminas
dcafa2df75
Avoid u-prefixed strings
linkchecker is Python 3 only, all strings are unicode.
2020-05-24 14:50:07 +03:00
Chris Mayo
9c982533e0
Merge pull request #411 from cjmayo/empty-http
Fix internal error on empty HTML files accessed over HTTP
2020-05-23 20:27:12 +01:00
Chris Mayo
03b1c4919d Record encoding in debug log messages 2020-05-23 20:01:24 +01:00
Chris Mayo
f7337f55e8 Fix error due to an empty html file accessed over http
Use the already fixed [1] UrlBase.get_content() in HttpUrl.

[1] 5bd1fb4 ("Fix internal error on empty HTML files", 2020-05-21)
2020-05-23 20:01:24 +01:00
Chris Mayo
d611564cb0 Add a test for an empty html file accessed over http 2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting 2020-05-23 14:15:52 +03:00
anarcat
b1e8137da2
Merge pull request #410 from cjmayo/install
Installation of test data, standardise Markdown extension, remove linkchecker.desktop
2020-05-22 20:07:15 -04:00
Chris Mayo
df79e9b196 Add missing test data and Markdown documentation to the distribution 2020-05-22 19:43:57 +01:00
Chris Mayo
c60887cc63 Rename .mdwn files to .md
- RFC 7763 file extensions are .md and .markdown
- Consistent with other documentation files
2020-05-22 19:43:57 +01:00
Chris Mayo
87f0c31928 Remove linkchecker.desktop
- Not useful for a command-line application
- Refers to an icon with a generic name, that is not installed
2020-05-22 19:43:57 +01:00
Marius Gedminas
6dffacf17f
Merge pull request #409 from linkchecker/fix-login-timeouts
Make sure login form fetching uses a timeout and sends User-Agent
2020-05-22 21:40:48 +03:00
anarcat
2256a6e889
Merge pull request #408 from linkchecker/fix-timeouts
Make sure fetching robots.txt uses the configured timeout
2020-05-22 14:29:12 -04:00
Marius Gedminas
b0435b3d47 Make sure login form fetching uses a timeout
Also resolve an XXX comment about the User-Agent header (which is
configured in new_request_session), but add a couple of XXX comments
about using proxy and possibly disabling TLS certificate checking.
2020-05-22 11:19:51 +03:00
Marius Gedminas
4f3fe5e1c3 Make sure fetching robots.txt uses the configured timeout
Closes #396.
2020-05-22 10:53:33 +03:00
Marius Gedminas
639ba0dba2
Merge pull request #406 from linkchecker/fix-empty-file-problem
Fix internal error on empty HTML files
2020-05-21 19:57:46 +03:00
Marius Gedminas
c60d7c66e4 Clarify the decision to fall back to Latin-1 2020-05-21 19:35:39 +03:00
Marius Gedminas
5bd1fb4e36 Fix internal error on empty HTML files
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None.  It doesn't matter what encoding we pick for
empty files, so let's just pick one.

I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.

Closes #392.
2020-05-21 19:01:33 +03:00
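The fix described above can be sketched without BeautifulSoup: when the detected encoding is None, pick a fixed default before decoding. The helper names pick_encoding and decode_content are hypothetical, and the default of ISO-8859-1 is an assumption based on the neighbouring commit that mentions falling back to Latin-1.

```python
def pick_encoding(detected, default="ISO-8859-1"):
    """Return the detected encoding, or a default when detection gave None.

    Hypothetical helper: `detected` stands in for the value a parser reports
    (None for an empty file, per the commit message above).
    """
    return detected if detected is not None else default


def decode_content(data, detected):
    """Decode raw bytes using the detected encoding or the fallback."""
    return data.decode(pick_encoding(detected))


empty_text = decode_content(b"", None)            # empty input, no encoding detected
utf8_text = decode_content("é".encode("utf-8"), "utf-8")
```

For empty input the choice of fallback cannot affect the result, which is the point the commit message makes.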
Marius Gedminas
fd3ab13470
Merge pull request #397 from linkchecker/doc-389
do not require ssh to clone from source
2020-05-21 18:28:20 +03:00
anarcat
a226b4e406
Merge pull request #405 from cjmayo/tidyten13
Remove encoding of TestLogger diff and url in Checker.check_url_data()
2020-05-21 08:56:08 -04:00
Chris Mayo
6cfc8eeb49 Replace threading.Thread.setName() with setting the name property
As recommended in:

https://docs.python.org/3.5/library/threading.html#threading.Thread.setName
2020-05-20 19:58:44 +01:00
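A small sketch of the change the commit above describes: assigning to the name property of a threading.Thread instead of calling setName(). The function make_worker and the thread name are illustrative assumptions.

```python
import threading


def make_worker(name):
    """Create a thread and name it via the `name` property.

    Hypothetical helper for illustration; the older spelling would have been
    t.setName(name).
    """
    t = threading.Thread(target=lambda: None)
    t.name = name
    return t


worker = make_worker("CheckThread-example")
```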
Chris Mayo
42eba19a7d No need to encode url in Checker.check_url_data()
Was causing b'' in log messages e.g. CheckThread-b'http:...
2020-05-20 19:58:44 +01:00
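The b'' symptom mentioned above is what Python 3 produces when a bytes object is formatted into a str. The sketch below reproduces the effect in isolation; the variable names and the example URL are assumptions, and the actual formatting code in Checker.check_url_data() may differ.

```python
url = "http://example.com/"

# Formatting the encoded (bytes) URL embeds its repr, including the b'' wrapper.
encoded_name = "CheckThread-%s" % url.encode("ascii", "replace")

# Formatting the str directly gives the intended name.
plain_name = "CheckThread-%s" % url
```

Here encoded_name is "CheckThread-b'http://example.com/'" while plain_name is "CheckThread-http://example.com/".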
Chris Mayo
96e1c00ff7 TestLogger diff output is all Unicode in Python 3 2020-05-20 19:58:44 +01:00
Chris Mayo
768952e111
Merge pull request #403 from cjmayo/tidyten12
Remove "from builtins import str as str_text"
2020-05-20 19:38:14 +01:00
Marius Gedminas
1ab45c2e60
Merge pull request #402 from linkchecker/flake8
Add a 'tox -e flake8' and a Travis CI job
2020-05-19 23:09:03 +03:00
Chris Mayo
71eaf9a982 Remove str_text from tests/ 2020-05-19 19:56:42 +01:00
Chris Mayo
28f4587dfa Remove str_text from fileutil.py, strformat.py and url.py 2020-05-19 19:56:42 +01:00
Chris Mayo
ebcc3c4961 Remove str_text from plugins/ 2020-05-19 19:56:42 +01:00
Chris Mayo
1c14583535 Remove str_text from logger/ 2020-05-19 19:56:42 +01:00