LinkChecker
============

|Build Status|_ |License|_

.. |Build Status| image:: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml/badge.svg?branch=master
.. _Build Status: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml
.. |License| image:: https://img.shields.io/badge/license-GPL2-d49a6a.svg
.. _License: https://opensource.org/licenses/GPL-2.0

Check for broken links in web sites.

Features
---------

- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links
- restrict link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP, FTP and Telnet
- honors robots.txt exclusion protocol
- cookie support
- HTML5 support
- a command line and web interface
- various check plugins available
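
As a rough sketch of how several of these features combine on the command
line, the shell function below wraps one invocation. The function name
``check_site``, the default depth, the thread count and the ignore pattern are
illustrative assumptions, not part of LinkChecker; the options themselves
should be verified against ``linkchecker --help`` for your installed version::

  # Hypothetical helper: check_site URL [DEPTH]
  # The name and all values below are illustrative, not LinkChecker defaults.
  check_site() {
    # --recursion-level: limit how deep the crawl follows links
    # --threads:         number of checker threads
    # --ignore-url:      skip URLs matching this regular expression
    # --output:          console output type (CSV here)
    linkchecker \
      --recursion-level="${2:-2}" \
      --threads=5 \
      --ignore-url='^mailto:' \
      --output=csv \
      "$1"
  }

Calling ``check_site https://www.example.com 1 > report.csv`` would then write
the CSV output of a depth-1 crawl to a file.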

Installation
-------------

Python 3.7 or later is required. Install LinkChecker with pip:

``pip3 install linkchecker``

The version in the pip repository may be old. To find out how to get the latest
code, along with platform-specific information and other advice, see
`doc/install.txt`_ in the source code archive.

.. _doc/install.txt: https://linkchecker.github.io/linkchecker/install.html


Usage
------
Run ``linkchecker https://www.example.com`` to check a site.
For other options see ``linkchecker --help``; for more information see the
manual pages `linkchecker(1)`_ and `linkcheckerrc(5)`_.

.. _linkchecker(1): https://linkchecker.github.io/linkchecker/man/linkchecker.html

.. _linkcheckerrc(5): https://linkchecker.github.io/linkchecker/man/linkcheckerrc.html
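
When the command runs from a script or CI job, its exit status can drive the
result. The sketch below assumes that a non-zero exit status signals invalid
links or an error; the exact values are documented in `linkchecker(1)`_ and
should be checked there. The function name ``report_links`` and its messages
are hypothetical::

  # Hypothetical wrapper: report_links URL
  # Assumes linkchecker exits non-zero when it finds problems.
  report_links() {
    if linkchecker "$1"; then
      echo "no broken links found"
    else
      # In the else branch, $? still holds linkchecker's exit status
      echo "linkchecker exited with status $?"
      return 1
    fi
  }

A CI step could then call ``report_links https://www.example.com`` and fail the
job when the function returns non-zero.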

Docker usage
-------------

If you do not want to install any additional libraries or dependencies, you can
use the Docker image published on GitHub Packages.

Example for external web site check::

  docker run --rm -it -u $(id -u):$(id -g) ghcr.io/linkchecker/linkchecker:latest --verbose https://www.example.com

Local HTML file check::

  docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt ghcr.io/linkchecker/linkchecker:latest --verbose index.html

In addition to the rolling latest image, uniquely tagged images can also be found
on the `packages`_ page.

.. _packages: https://github.com/linkchecker/linkchecker/pkgs/container/linkchecker