LinkChecker
============

|Build Status|_ |License|_

.. |Build Status| image:: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml/badge.svg?branch=master
.. _Build Status: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml
.. |License| image:: https://img.shields.io/badge/license-GPL2-d49a6a.svg
.. _License: https://opensource.org/licenses/GPL-2.0

Check for broken links in websites.

Features
---------

- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links
- restrict link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP, FTP and Telnet
- honors robots.txt exclusion protocol
- cookie support
- HTML5 support
- a command line and web interface
- various check plugins available, e.g. HTML syntax and antivirus checks

Installation
-------------

Python 3.6 or later is required. To install LinkChecker with pip:

``pip3 install linkchecker``

The version in the pip repository may be old. To find out how to get the latest
code, along with platform-specific information and other advice, see
`doc/install.txt`_ in the source code archive.

.. _doc/install.txt: doc/install.txt


Usage
------
Execute ``linkchecker https://www.example.com``.
For other options see ``linkchecker --help``.
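
When the check has to be started from a script rather than typed by hand, the
same command line can be assembled in Python. The following is a minimal
sketch, not part of LinkChecker itself: ``build_command`` and ``run_check`` are
hypothetical helper names, and the only options used are the ``--verbose``
flag and the target URL that appear in this README.

```python
import shutil
import subprocess


def build_command(url, verbose=False, executable="linkchecker"):
    """Return the argument list for a single linkchecker run."""
    command = [executable]
    if verbose:
        command.append("--verbose")
    command.append(url)
    return command


def run_check(url, verbose=False):
    """Run linkchecker on url and return its exit status.

    Returns None when no linkchecker executable is found on PATH.
    """
    executable = shutil.which("linkchecker")
    if executable is None:
        return None
    # check=False: the caller decides how to react to a non-zero status.
    completed = subprocess.run(build_command(url, verbose, executable), check=False)
    return completed.returncode
```

Calling ``run_check("https://www.example.com", verbose=True)`` should behave
like the command above; how the returned status maps to check results is
described in ``linkchecker --help`` and the manual page, not assumed here.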

Docker usage
-------------

If you do not want to install any additional libraries or dependencies, you can use the Docker image.

Check an external website::

  docker run --rm -it -u $(id -u):$(id -g) ghcr.io/linkchecker/linkchecker:latest --verbose https://www.example.com

Check a local HTML file, with the current directory mounted at ``/mnt`` in the container::

  docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt ghcr.io/linkchecker/linkchecker:latest --verbose index.html
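
The two commands differ only in the ``-v`` bind mount and the target. To make
the parts explicit, here is a small sketch that builds the same argument list
in Python. ``docker_command`` is a hypothetical helper introduced for
illustration, and it relies on ``os.getuid()`` and ``os.getgid()``, so it
assumes a Unix-like host.

```python
import os

IMAGE = "ghcr.io/linkchecker/linkchecker:latest"


def docker_command(target, mount_dir=None, uid=None, gid=None):
    """Build the docker argument list used in the examples above."""
    # Equivalent of -u $(id -u):$(id -g): run as the calling user.
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    # -it requests an interactive terminal, as in the examples above.
    command = ["docker", "run", "--rm", "-it", "-u", f"{uid}:{gid}"]
    if mount_dir is not None:
        # Equivalent of -v "$PWD":/mnt when mount_dir is the current directory.
        command += ["-v", f"{os.path.abspath(mount_dir)}:/mnt"]
    command += [IMAGE, "--verbose", target]
    return command
```

``docker_command("index.html", mount_dir=".")`` corresponds to the local file
check, and ``docker_command("https://www.example.com")`` to the external one.
The resulting list can be passed to ``subprocess.run`` from a terminal session.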