Commit graph

6365 commits

Author SHA1 Message Date
Marius Gedminas
5a2e5ba7e6
Merge pull request #419 from linkchecker/fix-misleading-readme
Fix misleading information in the README
2020-05-26 18:20:53 +03:00
Marius Gedminas
c567adcd7c
Fix misleading information in the README
(Also change some HTTP URLs to HTTPS).

I think this is the minimum fix we should do ASAP to avoid [user confusion](https://github.com/linkchecker/linkchecker/issues/88#issuecomment-633853033), while we hash out all the questions raised in the discussion in #362.
2020-05-26 11:31:28 +03:00
anarcat
bdc9c1ce88
Merge pull request #418 from linkchecker/remove-unused-gui-screenshot
Remove unused screenshot
2020-05-25 16:56:16 -04:00
Marius Gedminas
43d1403b1b Remove unused screenshot
git grep tells me shot2.png is not referenced anywhere in the tree.  It
depicts the old, removed Qt GUI.
2020-05-25 22:25:48 +03:00
Marius Gedminas
99d1e59ed2
Merge pull request #415 from linkchecker/manifest
Include all files in the sdist via MANIFEST.in
2020-05-25 22:15:21 +03:00
Marius Gedminas
10c56ceca5 Remove unneeded globs from MANIFEST.in
qhp and qhcp are Qt help files, removed with the GUI

rej is a rejected patch, definitely not part of the source.

There are no bat/cer/pvk/pfk files under tests.

Now python3 setup.py -q sdist no longer produces warnings.
2020-05-25 22:03:44 +03:00
Marius Gedminas
4cab3214c9 Oops, editor accident 2020-05-25 21:50:50 +03:00
Marius Gedminas
93612adcf7 Add check-manifest to CI
Closes #414.
2020-05-25 19:33:17 +03:00
Marius Gedminas
9e06c0ea80 Update MANIFEST.in to list ALL THE FILES!
We've discussed this in #414 and everyone's fine with including all
source files in the sdists (where "source file" == file versioned in git).
2020-05-25 19:33:10 +03:00
Marius Gedminas
00c1b410d0 Remove MANIFEST build logic from the Makefile
I almost didn't notice it existed.
2020-05-25 19:33:10 +03:00
Marius Gedminas
6a2d247be1 Remove custom MANIFEST logic from setup.py
I don't like the extra MANIFEST file lying around.  It clashes with the
old distutils feature of having a MANIFEST file.  I intend to replace
this check with check-manifest.
2020-05-25 19:27:02 +03:00
Chris Mayo
6c8e88dae6
Merge pull request #412 from cjmayo/unicode2
Remove instances of Python 2 unicode
2020-05-24 19:20:07 +01:00
Chris Mayo
313a14ff0d Remove instances of Python 2 unicode 2020-05-24 19:14:47 +01:00
Marius Gedminas
d0169c46d4
Merge pull request #348 from weshaggard/HandleRateLimiting
Turn status code 429 into warning instead of failure
2020-05-24 16:16:56 +03:00
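The idea behind this pull request can be sketched roughly as follows: HTTP 429 (Too Many Requests) is reported as a warning, while other 4xx/5xx codes stay failures. The helper name `classify_status` and the labels it returns are hypothetical and are not taken from linkchecker's code.

```python
def classify_status(status_code):
    """Map an HTTP status code to a result label (hypothetical helper)."""
    if status_code == 429:
        # Rate limiting says nothing about whether the link itself is
        # valid, so report a warning instead of a failure.
        return "warning"
    if status_code >= 400:
        return "error"
    return "ok"
```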
Marius Gedminas
dcafa2df75
Avoid u-prefixed strings
linkchecker is Python 3 only, all strings are unicode.
2020-05-24 14:50:07 +03:00
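A small illustration of why the prefix is redundant on Python 3: both literals below are plain `str` objects with the same value. The variable names are made up for this example.

```python
# In Python 3 the u prefix is still accepted but changes nothing:
# both literals produce ordinary str objects.
prefixed = u"caf\u00e9"
plain = "caf\u00e9"

same_type = type(prefixed) is type(plain) is str
same_value = prefixed == plain
```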
Chris Mayo
9c982533e0
Merge pull request #411 from cjmayo/empty-http
Fix internal error on empty HTML files accessed over HTTP
2020-05-23 20:27:12 +01:00
Chris Mayo
03b1c4919d Record encoding in debug log messages 2020-05-23 20:01:24 +01:00
Chris Mayo
f7337f55e8 Fix error due to an empty html file accessed over http
Use the already fixed [1] UrlBase.get_content() in HttpUrl.

[1] 5bd1fb4 ("Fix internal error on empty HTML files", 2020-05-21)
2020-05-23 20:01:24 +01:00
Chris Mayo
d611564cb0 Add a test for an empty html file accessed over http 2020-05-23 20:01:24 +01:00
Marius Gedminas
f268a90cfb
Merge branch 'master' into HandleRateLimiting 2020-05-23 14:15:52 +03:00
anarcat
b1e8137da2
Merge pull request #410 from cjmayo/install
Installation of test data, standardise Markdown extension, remove linkchecker.desktop
2020-05-22 20:07:15 -04:00
Chris Mayo
df79e9b196 Add missing test data and Markdown documentation to the distribution 2020-05-22 19:43:57 +01:00
Chris Mayo
c60887cc63 Rename .mdwn files to .md
- RFC 7763 file extensions are .md and .markdown
- Consistent with other documentation files
2020-05-22 19:43:57 +01:00
Chris Mayo
87f0c31928 Remove linkchecker.desktop
- Not useful for a command-line application
- Refers to an icon with a generic name, that is not installed
2020-05-22 19:43:57 +01:00
Marius Gedminas
6dffacf17f
Merge pull request #409 from linkchecker/fix-login-timeouts
Make sure login form fetching uses a timeout and sends User-Agent
2020-05-22 21:40:48 +03:00
anarcat
2256a6e889
Merge pull request #408 from linkchecker/fix-timeouts
Make sure fetching robots.txt uses the configured timeout
2020-05-22 14:29:12 -04:00
Marius Gedminas
b0435b3d47 Make sure login form fetching uses a timeout
Also resolve an XXX comment about the User-Agent header (which is
configured in new_request_session), but add a couple of XXX comments
about using proxy and possibly disabling TLS certificate checking.
2020-05-22 11:19:51 +03:00
Marius Gedminas
4f3fe5e1c3 Make sure fetching robots.txt uses the configured timeout
Closes #396.
2020-05-22 10:53:33 +03:00
Marius Gedminas
639ba0dba2
Merge pull request #406 from linkchecker/fix-empty-file-problem
Fix internal error on empty HTML files
2020-05-21 19:57:46 +03:00
Marius Gedminas
c60d7c66e4 Clarify the decision to fall back to Latin-1 2020-05-21 19:35:39 +03:00
Marius Gedminas
5bd1fb4e36 Fix internal error on empty HTML files
When BeautifulSoup finds an empty file on disk, it sets
original_encoding to None.  It doesn't matter what encoding we pick for
empty files, so let's just pick one.

I don't know if there are any circumstances where BeautifulSoup might
set the encoding to None for a non-empty file.

Closes #392.
2020-05-21 19:01:33 +03:00
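The fallback described in this commit can be sketched without BeautifulSoup itself. `choose_encoding` and `FALLBACK_ENCODING` are hypothetical names, and the Latin-1 choice is assumed from the follow-up commit c60d7c66e4; this is not the project's actual code.

```python
FALLBACK_ENCODING = "ISO-8859-1"  # Latin-1; assumed fallback for this sketch


def choose_encoding(detected_encoding):
    """Return the detected encoding, or a fixed fallback when it is None."""
    if detected_encoding is None:
        return FALLBACK_ENCODING
    return detected_encoding


# Empty content decodes to an empty string whatever encoding is picked,
# which is why the choice does not matter for empty files.
text = b"".decode(choose_encoding(None))
```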
Marius Gedminas
fd3ab13470
Merge pull request #397 from linkchecker/doc-389
do not require ssh to clone from source
2020-05-21 18:28:20 +03:00
anarcat
a226b4e406
Merge pull request #405 from cjmayo/tidyten13
Remove encoding of TestLogger diff and url in Checker.check_url_data()
2020-05-21 08:56:08 -04:00
Chris Mayo
6cfc8eeb49 Replace threading.Thread.setName() with setting the name property
As recommended in:

https://docs.python.org/3.5/library/threading.html#threading.Thread.setName
2020-05-20 19:58:44 +01:00
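A minimal sketch of the replacement this commit makes, using only the standard library. The worker function and the thread name are illustrative, not taken from linkchecker.

```python
import threading


def worker():
    """Placeholder task for the example thread."""
    pass


thread = threading.Thread(target=worker)
# Assign the name property directly instead of calling thread.setName(...).
thread.name = "CheckThread-example"
thread.start()
thread.join()
```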
Chris Mayo
42eba19a7d No need to encode url in Checker.check_url_data()
This was causing b'' to appear in log messages, e.g. CheckThread-b'http:...
2020-05-20 19:58:44 +01:00
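The symptom is easy to reproduce in Python 3: formatting a bytes object with `%s` embeds its repr, including the `b''` wrapper. The URL and the format string below are illustrative only.

```python
url = "http://example.com/"  # illustrative URL

# Encoding first puts the bytes repr into the formatted name.
encoded_name = "CheckThread-%s" % url.encode("utf-8")

# Passing the str through unchanged gives the intended name.
plain_name = "CheckThread-%s" % url
```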
Chris Mayo
96e1c00ff7 TestLogger diff output is all Unicode in Python 3 2020-05-20 19:58:44 +01:00
Chris Mayo
768952e111
Merge pull request #403 from cjmayo/tidyten12
Remove "from builtins import str as str_text"
2020-05-20 19:38:14 +01:00
Marius Gedminas
1ab45c2e60
Merge pull request #402 from linkchecker/flake8
Add a 'tox -e flake8' and a Travis CI job
2020-05-19 23:09:03 +03:00
Chris Mayo
71eaf9a982 Remove str_text from tests/ 2020-05-19 19:56:42 +01:00
Chris Mayo
28f4587dfa Remove str_text from fileutil.py, strformat.py and url.py 2020-05-19 19:56:42 +01:00
Chris Mayo
ebcc3c4961 Remove str_text from plugins/ 2020-05-19 19:56:42 +01:00
Chris Mayo
1c14583535 Remove str_text from logger/ 2020-05-19 19:56:42 +01:00
Chris Mayo
6bddd4ac60 Remove str_text from checker/ 2020-05-19 19:56:42 +01:00
Chris Mayo
a127902607 Replace str_text in asserts 2020-05-19 19:56:42 +01:00
Chris Mayo
7490804e2c
Merge pull request #395 from cjmayo/tidyten11
Remove unused code from linkcheck/fileutil.py
2020-05-19 19:45:08 +01:00
Marius Gedminas
7a43abe6d6 Add a flake8 job to the Travis matrix
https://docs.travis-ci.com/user/build-matrix/#rows-that-are-allowed-to-fail
suggests that this might not work.
2020-05-19 19:27:40 +03:00
Marius Gedminas
72e7c600f3 Add a 'tox -e flake8' 2020-05-19 19:24:22 +03:00
anarcat
8183b7feb8
Update doc/install.txt
Co-authored-by: Marius Gedminas <marius@gedmin.as>
2020-05-19 12:05:21 -04:00
Marius Gedminas
391bd5882a
Merge pull request #394 from gbabin/fix-translations-encoding
Fix translations encoding (issue #165)
2020-05-19 18:53:06 +03:00
Marius Gedminas
e6e969f975
Merge pull request #391 from linkchecker/dev-version
Bump version in git to 10.0.0.dev0
2020-05-19 18:49:34 +03:00