
LinkChecker
============

|Build Status|_ |License|_

.. |Build Status| image:: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml/badge.svg?branch=master
.. _Build Status: https://github.com/linkchecker/linkchecker/actions/workflows/build.yml
.. |License| image:: https://img.shields.io/badge/license-GPL2-d49a6a.svg
.. _License: https://opensource.org/licenses/GPL-2.0

Check for broken links in web sites.

Features
---------

- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- support for HTTP/1.1, HTTPS, FTP, mailto: and local file links
- restrict link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP and FTP
- honors the robots.txt exclusion protocol
- cookie support
- HTML5 support
- a command line and web interface
- various check plugins available

Installation
-------------

LinkChecker requires Python 3.9 or later. To install it with pip:

``pip3 install linkchecker``

LinkChecker can also be installed with pipx: ``pipx install linkchecker``

The version in the pip repository may be out of date. For how to get the latest
code, platform-specific information and other advice, see `doc/install.txt`_
in the source code archive.

.. _doc/install.txt: https://linkchecker.github.io/linkchecker/install.html


Usage
------

Run ``linkchecker https://www.example.com`` to check a site.
For other options see ``linkchecker --help``; for more information see the
manual pages `linkchecker(1)`_ and `linkcheckerrc(5)`_.

.. _linkchecker(1): https://linkchecker.github.io/linkchecker/man/linkchecker.html

.. _linkcheckerrc(5): https://linkchecker.github.io/linkchecker/man/linkcheckerrc.html
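
Several of the features listed above map to command line options that can be
combined in one call. The following is a minimal sketch rather than a
recommended setup: ``check_site`` is a hypothetical wrapper function, and the
recursion depth, ignore pattern and output type are illustrative values::

  # check_site is a hypothetical helper; adjust the values to your site.
  check_site() {
    # --recursion-level limits how deep links are followed,
    # --ignore-url only syntax-checks URLs matching the regular expression,
    # --output selects the output type (here CSV on standard output).
    linkchecker \
      --recursion-level=2 \
      --ignore-url='^mailto:' \
      --output=csv \
      "$1"
  }

With the function defined in the current shell, a call such as
``check_site https://www.example.com > links.csv`` would write the CSV report
to ``links.csv``.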

Docker usage
-------------

If you do not want to install any additional libraries or dependencies, you can
use the Docker image published on GitHub Packages.

Check an external web site::

  docker run --rm -it -u $(id -u):$(id -g) ghcr.io/linkchecker/linkchecker:latest --verbose https://www.example.com

Check a local HTML file, mounting the current directory at ``/mnt`` in the container::

  docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt ghcr.io/linkchecker/linkchecker:latest --verbose index.html

In addition to the rolling latest image, uniquely tagged images can also be found
on the `packages`_ page.

.. _packages: https://github.com/linkchecker/linkchecker/pkgs/container/linkchecker