Socket communication deals with bytes.
There are probably remaining issues with the viruscheck plugin on
Python 3, we just can't see them because the code is not fully covered
with tests.
pytest output before this change:
SKIPPED [3] tests/__init__.py:217: condition: True
SKIPPED [1] tests/checker/test_news.py:63: condition: True
SKIPPED [1] tests/checker/test_news.py:41: condition: True
SKIPPED [1] tests/checker/test_news.py:116: condition: True
SKIPPED [1] tests/checker/test_news.py:75: condition: True
After:
SKIPPED [3] tests/__init__.py: disabled for now until some stable news server comes up
SKIPPED [4] tests/checker/test_news.py: disabled for now until some stable news server comes up
MS Word files are binary data, and get_temp_filename() will write them
to disk using open(..., 'wb'), so we want to pass bytes in there, not
Unicode.
See #323.
This code was added in:
efbbb656 ("Remove python-dns conflict by moving the dns module into a custom subdirectory.", 2012-12-07)
Installation of linkcheck_dns stopped with:
0a13fae3 ("remove third party packages and use them as dependency", 2018-01-06)
This fixes a race condition where the main thread would check if any
internal errors happened and get back a 0 while a worker thread was
still busy printing the internal error message before incrementing the
counter.
Fixes#320.
My experiments show that this adds no perceptible delay to the script
runtime (on Linux). More specifically, there already is an annoying
perceptible delay of about 1 second, but it's not caused by this change.
This is so that I can run `tox -- -n 8` to run the tests in parallel, or
`tox -- tests/checker/test_misc.py::TestMisc::test_html5` to run just a
single test, without having to repeat all the other options.
I haven't moved --cov=linkcheck because I don't want coverage results
when I'm limiting the test run to a single test (they just make the
interesting bit -- the test result itself -- scroll up).
I've also added -ra to the default option list because then several
tests fail, I'd like to see a list of their names in one place, not
spead out between the huge tracebacks.
Stopped being used with removal of UrlBase.set_title_from_content() in:
7b34be59 ("Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements.", 2014-03-01)
Not needed since all content is now being decoded on retrieval.
Added by:
a6643034 ("Python3: decode parts before submitting them to urllib.quote()", 2018-01-05)
UrlBase has been modified as follows:
- the "data" variable now holds bytes
- decoded content is stored in a new variable "text"
- functionality from get_content() has been split out into
get_raw_content() which returns "data" and download_content() which
calls read_content() and sets the download related variables.
This allows for subclasses to do their own decoding and parsers to
use bytes.