check links in web documents or full websites
Chris Mayo deed6ce231 Ensure chardet is installed when testing using tox
Beautiful Soup uses chardet, if installed, to detect character
encodings. This can lead to different test results based on whether
chardet is installed or not.

Requests < 2.26.0 requires chardet, but since 2.26.0 Requests requires
charset_normalizer.

Explicitly installing chardet maintains consistent test results.
2021-07-27 19:48:27 +01:00

LinkChecker
============

|Build Status|_ |License|_

.. |Build Status| image:: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml/badge.svg?branch=master
.. _Build Status: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml
.. |License| image:: https://img.shields.io/badge/license-GPL2-d49a6a.svg
.. _License: https://opensource.org/licenses/GPL-2.0

Check for broken links in web sites.

Features
---------

- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links
- restrict link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP, FTP and Telnet
- honors robots.txt exclusion protocol
- cookie support
- HTML5 support
- a command line and web interface
- various check plugins available, e.g. HTML syntax and antivirus checks
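
To illustrate the regular expression filtering mentioned in the list above, here is a minimal Python sketch of how URLs can be excluded by pattern. It is not LinkChecker's own implementation: the names ``IGNORE_PATTERNS`` and ``should_check`` and the example patterns are assumptions made for this illustration. For the options LinkChecker actually provides, see ``linkchecker --help``.

```python
import re

# Hypothetical ignore patterns, chosen only for this illustration.
IGNORE_PATTERNS = [
    re.compile(r"^mailto:"),    # skip e-mail links
    re.compile(r"/private/"),   # skip anything under a /private/ path
]


def should_check(url, ignore_patterns=IGNORE_PATTERNS):
    """Return True unless one of the patterns matches somewhere in the URL."""
    return not any(pattern.search(url) for pattern in ignore_patterns)


urls = [
    "https://www.example.com/index.html",
    "https://www.example.com/private/report.html",
    "mailto:webmaster@example.com",
]

# Keep only the URLs that no ignore pattern matches.
to_check = [url for url in urls if should_check(url)]
print(to_check)
```

The sketch uses ``search`` rather than ``match`` so that a pattern may hit anywhere in the URL; anchor a pattern with ``^`` when it should apply only at the start.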

Installation
-------------

See `doc/install.txt`_ in the source code archive for general information. In addition to the information given there, please note the following:

.. _doc/install.txt: doc/install.txt

Python 3.6 or later is needed.

The version in the pip repository may be outdated. Instead, you can use pip to install the latest code from git: ``pip3 install git+https://github.com/linkchecker/linkchecker.git``.

Usage
------
Execute ``linkchecker https://www.example.com``.
For other options see ``linkchecker --help``.
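
The command above can also be run from a script. The following is a minimal Python sketch, assuming the ``linkchecker`` executable is installed and on ``PATH``. The helper names ``build_command`` and ``run_linkchecker`` are hypothetical and not part of LinkChecker; the only options used are ``--verbose`` and the URL argument, both of which appear in this README.

```python
import subprocess


def build_command(url, verbose=False):
    """Build the argument list for a linkchecker run (hypothetical helper)."""
    command = ["linkchecker"]
    if verbose:
        command.append("--verbose")
    command.append(url)
    return command


def run_linkchecker(url, verbose=False):
    """Run linkchecker and return its exit status.

    Assumes the linkchecker executable is on PATH. How to interpret a
    non-zero status is left to the caller; consult the LinkChecker
    documentation for the exact meaning.
    """
    completed = subprocess.run(build_command(url, verbose=verbose))
    return completed.returncode


# Show the command that would be executed, without running it.
print(build_command("https://www.example.com", verbose=True))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a URL contains characters such as ``&`` or ``?``.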

Docker usage
-------------

*The Docker images are out of date; pip installation is currently the only recommended method.*

If you do not want to install any additional libraries or dependencies, you can use the Docker image.

Example for external web site check::

  docker run --rm -it -u $(id -u):$(id -g) linkchecker/linkchecker --verbose https://www.example.com

Local HTML file check::

  docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt linkchecker/linkchecker --verbose index.html