check links in web documents or full websites
Find a file
Chris Mayo ac0967e251 Fix remaining flake8 violations in linkcheck/
linkcheck/better_exchook2.py:28:89: E501 line too long (90 > 88 characters)
linkcheck/better_exchook2.py:155:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:166:9: E722 do not use bare 'except'
linkcheck/better_exchook2.py:289:13: E741 ambiguous variable name 'l'
linkcheck/better_exchook2.py:299:9: E722 do not use bare 'except'
linkcheck/containers.py:48:13: E731 do not assign a lambda expression, use a def
linkcheck/ftpparse.py:123:89: E501 line too long (93 > 88 characters)
linkcheck/loader.py:46:47: E203 whitespace before ':'
linkcheck/logconf.py:45:29: E231 missing whitespace after ','
linkcheck/robotparser2.py:157:89: E501 line too long (95 > 88 characters)
linkcheck/robotparser2.py:182:89: E501 line too long (89 > 88 characters)
linkcheck/strformat.py:181:16: E203 whitespace before ':'
linkcheck/strformat.py:181:43: E203 whitespace before ':'
linkcheck/strformat.py:253:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:254:9: E731 do not assign a lambda expression, use a def
linkcheck/strformat.py:341:89: E501 line too long (111 > 88 characters)
linkcheck/url.py:102:32: E203 whitespace before ':'
linkcheck/url.py:277:5: E741 ambiguous variable name 'l'
linkcheck/url.py:402:5: E741 ambiguous variable name 'l'
linkcheck/checker/__init__.py:203:1: E402 module level import not at top of file
linkcheck/checker/fileurl.py:200:89: E501 line too long (103 > 88 characters)
linkcheck/checker/mailtourl.py:122:60: E203 whitespace before ':'
linkcheck/checker/mailtourl.py:157:89: E501 line too long (96 > 88 characters)
linkcheck/checker/mailtourl.py:190:89: E501 line too long (109 > 88 characters)
linkcheck/checker/mailtourl.py:200:89: E501 line too long (111 > 88 characters)
linkcheck/checker/mailtourl.py:249:89: E501 line too long (106 > 88 characters)
linkcheck/checker/unknownurl.py:226:23: W291 trailing whitespace
linkcheck/checker/urlbase.py:245:89: E501 line too long (101 > 88 characters)
linkcheck/configuration/confparse.py:236:89: E501 line too long (186 > 88 characters)
linkcheck/configuration/confparse.py:247:89: E501 line too long (111 > 88 characters)
linkcheck/configuration/__init__.py:164:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:184:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:190:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:195:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:198:9: E266 too many leading '#' for block comment
linkcheck/configuration/__init__.py:435:89: E501 line too long (90 > 88 characters)
linkcheck/director/aggregator.py:45:43: E231 missing whitespace after ','
linkcheck/director/aggregator.py:178:89: E501 line too long (106 > 88 characters)
linkcheck/logger/__init__.py:29:1: E731 do not assign a lambda expression, use a def
linkcheck/logger/__init__.py:108:13: E741 ambiguous variable name 'l'
linkcheck/logger/__init__.py:275:19: F821 undefined name '_'
linkcheck/logger/__init__.py:342:16: F821 undefined name '_'
linkcheck/logger/__init__.py:380:13: F821 undefined name '_'
linkcheck/logger/__init__.py:384:13: F821 undefined name '_'
linkcheck/logger/__init__.py:387:13: F821 undefined name '_'
linkcheck/logger/__init__.py:396:13: F821 undefined name '_'
linkcheck/network/__init__.py:1:1: W391 blank line at end of file
linkcheck/plugins/locationinfo.py:89:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/locationinfo.py:91:9: E731 do not assign a lambda expression, use a def
linkcheck/plugins/markdowncheck.py:112:89: E501 line too long (111 > 88 characters)
linkcheck/plugins/markdowncheck.py:141:9: E741 ambiguous variable name 'l'
linkcheck/plugins/markdowncheck.py:165:23: E203 whitespace before ':'
linkcheck/plugins/viruscheck.py:95:42: E203 whitespace before ':'
2020-05-30 17:01:36 +01:00
.github add github issue template 2018-03-26 09:35:43 -04:00
cgi-bin Update references to GitHub project from wummel to linkchecker 2019-04-18 19:59:52 +01:00
config Move GUI files to separate project 2016-01-23 13:28:15 +01:00
doc Fix remaining flake8 violations in doc/ and scripts/ 2020-05-26 20:20:57 +01:00
linkcheck Fix remaining flake8 violations in linkcheck/ 2020-05-30 17:01:36 +01:00
po Make po/Makefile version-agnostic 2020-05-18 18:24:36 +02:00
scripts Fix remaining flake8 violations in doc/ and scripts/ 2020-05-26 20:20:57 +01:00
tests Convert space-separated strings in tests/ 2020-05-29 19:40:46 +01:00
windows Remove home-cooked htmlparser and use BeautifulSoup 2019-07-22 19:59:37 +01:00
.gitattributes Add .gitattributes 2013-12-04 20:04:34 +01:00
.gitignore Add check-manifest to CI 2020-05-25 19:33:17 +03:00
.project Add Eclipse Pydev project files. 2011-05-18 21:12:18 +02:00
.pydevproject Updated pydev settings. 2011-12-17 19:13:43 +01:00
.travis.yml Add check-manifest to CI 2020-05-25 19:33:17 +03:00
CODE_OF_CONDUCT.md split code of conduct and contributing guidelines in two 2018-03-26 09:35:01 -04:00
CONTRIBUTING.md Rename .mdwn files to .md 2020-05-22 19:43:57 +01:00
COPYING Moved some files into the doc/ subdirectory. 2010-03-06 21:52:25 +01:00
dev-requirements.txt Enable https checking using a test server 2019-11-11 20:12:25 +00:00
Dockerfile use python3 in readme and dockerfile 2020-05-09 08:03:23 -04:00
install-rpm.sh Fix RPM installer generation. 2012-04-11 18:41:34 +02:00
linkchecker Convert triple-quoted and space-separated strings in linkchecker 2020-05-29 19:24:17 +01:00
linkchecker.freecode Update references to GitHub project from wummel to linkchecker 2019-04-18 19:59:52 +01:00
Makefile Remove MANIFEST build logic from the Makefile 2020-05-25 19:33:10 +03:00
MANIFEST.in Remove unneeded globs from MANIFEST.in 2020-05-25 22:03:44 +03:00
pytest.ini Move some pytest options into pytest.ini 2019-10-21 17:42:29 +03:00
README.rst Fix misleading information in the README 2020-05-26 11:31:28 +03:00
requirements.txt Remove use of the future package 2020-04-15 19:49:16 +01:00
robots.txt Add non-ascii values to test robots.txt 2008-07-13 13:01:59 +00:00
setup.cfg Fix remaining flake8 violations in linkcheck/ 2020-05-30 17:01:36 +01:00
setup.py Reinstate separate lines for install_requires in setup.py 2020-05-29 19:24:17 +01:00
tox.ini Enable tox flake8 checking for all Python files 2020-05-26 20:20:57 +01:00

LinkChecker
============

|Build Status|_ |License|_

.. |Build Status| image:: https://travis-ci.com/linkchecker/linkchecker.svg?branch=master
.. _Build Status: https://travis-ci.com/linkchecker/linkchecker
.. |License| image:: https://img.shields.io/badge/license-GPL2-d49a6a.svg
.. _License: https://opensource.org/licenses/GPL-2.0

Check for broken links in web sites.

Features
---------

- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
- restrict link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP, FTP and Telnet
- honors robots.txt exclusion protocol
- Cookie support
- HTML5 support
- a command line and web interface
- various check plugins available, eg. HTML syntax and antivirus checks.

Installation
-------------

See `doc/install.txt`_ in the source code archive for general information. Except the given information there, please take note of the following:

.. _doc/install.txt: doc/install.txt

Python 3 or later is needed.

The version in the pip repository is old. Instead, you can use pip to install the latest release from git: ``pip install git+https://github.com/linkchecker/linkchecker.git``. See `#359 <https://github.com/linkchecker/linkchecker/issues/359>`_.

Windows builds are seriously lagging behind the Linux releases, see `#53 <https://github.com/linkchecker/linkchecker/issues/53>`_ for details. For now, the only two options are to install from source or use `Docker for Windows <https://www.docker.com/docker-windows>`_.

Usage
------
Execute ``linkchecker http://www.example.com``.
For other options see ``linkchecker --help``.

Docker usage
-------------

If you do not want to install any additional libraries/dependencies you can use the Docker image.

Example for external web site check:
```
docker run --rm -it -u $(id -u):$(id -g) linkchecker/linkchecker --verbose https://google.com
```

Local HTML file check:
```
docker run --rm -it -u $(id -u):$(id -g) -v "$PWD":/mnt linkchecker/linkchecker --verbose index.html
```