308 commits

Author SHA1 Message Date
Bastian Kleineidam
43c2e6641b Logging refactor, interrupt and abort flags added. 2014-04-30 19:59:43 +02:00
Bastian Kleineidam
82dd76b0d7 Add PDF link parsing. 2014-04-28 18:13:45 +02:00
Bastian Kleineidam
fc73c6ca6e Log number of checked unique URLs. 2014-03-14 23:46:17 +01:00
Bastian Kleineidam
6437f08277 Display downloaded bytes. 2014-03-14 21:06:10 +01:00
Bastian Kleineidam
6b334dc79b Fix URL result caching. 2014-03-08 19:35:10 +01:00
Bastian Kleineidam
0113f06406 Enable arbitrary output encodings in CSV output. See #467 2014-03-06 22:40:52 +01:00
Bastian Kleineidam
82f81241fd Check all links and add better caching. 2014-03-03 23:29:45 +01:00
Bastian Kleineidam
eb7e52c0e2 -o none sets exit code now 2014-03-01 15:31:39 +01:00
Bastian Kleineidam
f7f5001256 Add missing column name to SQL insert statement. 2014-03-01 12:03:33 +01:00
Bastian Kleineidam
7b34be590b Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements. 2014-03-01 00:12:34 +01:00
Bastian Kleineidam
c806be5c15 Updated copyright 2014-01-08 22:33:04 +01:00
Bastian Kleineidam
e0a2558b2b Updated copyright. 2013-12-24 07:13:16 +01:00
Bastian Kleineidam
5736987b60 Refactor output loggers. 2013-12-11 18:41:55 +01:00
Alper Kokmen
4b3e78cac0 Fix ISO formatting for modified datetime.
This change will make sure that format_modified returns datetime value
in ISO 8601 format. See W3C documentation at
http://www.w3.org/TR/NOTE-datetime.

Since ```modified``` is parsed and then converted to UTC after it's
extracted from HTTP response, it's safe to assume that format_modified
will always format UTC datetime values.

Instead of ```isoformat``` method which omits timezone information for
UTC values, ```strftime``` with a specific format (that ends with Z)
will be used.
2013-09-02 15:38:54 -07:00
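The formatting change described in this commit message can be illustrated with a minimal sketch. It assumes the parsed `modified` value is a naive `datetime` already converted to UTC, as the message states. The helper name `format_modified_utc` and the exact format string are hypothetical and are not taken from the project's source.

```python
from datetime import datetime


def format_modified_utc(modified):
    """Format a datetime, assumed to hold a UTC value, as ISO 8601.

    Hypothetical helper: the literal trailing "Z" marks the value as UTC,
    which isoformat() does not add for a naive datetime.
    """
    return modified.strftime("%Y-%m-%dT%H:%M:%SZ")


# Example value, assumed to be already converted to UTC.
modified = datetime(2013, 9, 2, 22, 38, 54)

print(modified.isoformat())           # 2013-09-02T22:38:54
print(format_modified_utc(modified))  # 2013-09-02T22:38:54Z
```

The explicit format string trades flexibility for predictability: the output always carries the UTC designator, but it is only correct as long as the input really is a UTC value.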
Bastian Kleineidam
b0c2a90b94 Updated copyright. 2012-11-07 18:08:44 +01:00
Bastian Kleineidam
eabaa41bd2 Do not check duplicate URLs. 2012-11-06 21:34:22 +01:00
Bastian Kleineidam
0c20ef5de4 Strip console characters only from line text. 2012-10-10 12:27:08 +02:00
Bastian Kleineidam
e1e80b7dd5 Remove addrinfo cache. 2012-10-10 10:54:58 +02:00
Bastian Kleineidam
20be0f2519 Strip control chars from logger output. 2012-10-10 10:54:30 +02:00
Bastian Kleineidam
03a5d476b3 Use URL name if title is empty. 2012-10-09 21:04:54 +02:00
Bastian Kleineidam
cbc3bcb0d3 Sitemap logger fixes. 2012-09-23 23:20:21 +02:00
Bastian Kleineidam
1f3034b5f5 Sitemap logger fixes. 2012-09-23 20:59:38 +02:00
Bastian Kleineidam
18a200d85f Fix tests. 2012-09-19 11:05:26 +02:00
Bastian Kleineidam
b8f8bdf5fc Fix last modified formatting. 2012-09-19 10:09:19 +02:00
Bastian Kleineidam
75719b34f6 Updated copyright. 2012-09-19 09:17:25 +02:00
Bastian Kleineidam
71fba0f8b7 Log all valid URLs in sitemap loggers. 2012-09-19 09:17:08 +02:00
Bastian Kleineidam
3a352631ba Add modified field to loggers. 2012-09-18 12:12:00 +02:00
Bastian Kleineidam
932a07a9cf Added XML sitemap logger. 2012-09-18 09:16:34 +02:00
Bastian Kleineidam
99bf8aa940 Updated copyright. 2012-09-17 16:09:55 +02:00
Bastian Kleineidam
03667a4ec9 Print warning tags in text output. 2012-09-17 15:29:04 +02:00
Bastian Kleineidam
6e1841cf1f Print download and cache statistics. 2012-09-17 15:23:25 +02:00
Bastian Kleineidam
e98f15933f Stop checking if all output loggers have been deactivated. 2012-09-14 22:36:59 +02:00
Bastian Kleineidam
86f1c74006 Close loggers properly on I/O errors. 2012-09-14 22:09:18 +02:00
Bastian Kleineidam
451a520943 Prevent double color stream proxying. 2012-08-10 19:43:33 +02:00
Bastian Kleineidam
580ab74f0e Updated German translation. 2012-08-09 20:43:31 +02:00
Bastian Kleineidam
979d7f13d3 Updated copyright. 2012-06-20 21:40:21 +02:00
Bastian Kleineidam
1e13a4f8fc Add donation url to info blurb. 2012-06-20 00:37:39 +02:00
Bastian Kleineidam
eb30191bb0 Add copyright and missing docs. 2012-06-20 00:30:52 +02:00
Bastian Kleineidam
a6eaae2c38 Implement abstract method for logger. 2012-06-20 00:15:45 +02:00
Bastian Kleineidam
2dfe9d4b4f Use abstract base class for loggers. 2012-06-19 23:27:26 +02:00
Bastian Kleineidam
c7ced2445b Ensure correct encoding when writing non-ascii CSV output. 2012-04-22 17:52:47 +02:00
Bastian Kleineidam
3d831c1adb Updated copyright. 2012-04-11 22:23:43 +02:00
Bastian Kleineidam
ae0bd406d4 Do not encode CSV outro output comment. 2012-04-11 20:43:46 +02:00
Bastian Kleineidam
e9420d77db Updated copyright. 2012-03-31 09:24:08 +02:00
Bastian Kleineidam
b48812f612 Encode comments in CSV logger. 2012-03-31 09:17:49 +02:00
Bastian Kleineidam
1c26c14b64 Set copyright year and add missing docstrings. 2011-12-25 08:45:27 +01:00
Bastian Kleineidam
21532a70ec Return with non-zero return value when internal program errors occurred. 2011-12-14 22:54:26 +01:00
Bastian Kleineidam
91dce84c59 Fix sqlify for multiline contents. 2011-10-18 14:40:33 +02:00
Bastian Kleineidam
6e69f8f3b1 Fix sql logging output. 2011-10-09 19:02:23 +02:00
Bastian Kleineidam
581da4a9c6 Fix misnamed variable. 2011-05-14 20:21:39 +02:00
Bastian Kleineidam
50fc4ab566 Colorize result in text logger. 2011-05-14 09:36:21 +02:00
Bastian Kleineidam
7365170564 Updated copyright. 2011-04-12 09:13:39 +02:00
Bastian Kleineidam
84f6d56a49 Print level in loggers xml, csv and sql. 2011-04-09 10:51:03 +02:00
Bastian Kleineidam
b9c9dda9b3 Correctly encode CSV output. 2011-04-06 12:54:58 +02:00
Bastian Kleineidam
c3b3027c6d Only output configured parts in CSV logger. 2011-04-06 11:12:05 +02:00
Bastian Kleineidam
cdb00e9ef8 Do not write empty tag attributes. 2011-03-21 16:07:45 +01:00
Bastian Kleineidam
3e6de5213c Updated copyright 2011-03-21 15:23:40 +01:00
Bastian Kleineidam
847d740e37 Move get_stdout_writer() to i18n module and allow making sys.stdout a function argument. 2011-03-21 13:11:32 +01:00
Bastian Kleineidam
82e5ba8ce6 Add warning tag attribute in XML loggers. 2011-03-15 13:42:21 +01:00
Bastian Kleineidam
f4f921384e Updated copyright 2011-03-13 07:52:18 +01:00
Bastian Kleineidam
8da37a32ee Refactor sys.stdout wrapping into a function. 2011-03-11 20:05:27 +01:00
Bastian Kleineidam
c3bc16cde7 Added blank line before each URL output. 2011-03-10 10:51:44 +01:00
Bastian Kleineidam
2c53507097 Improved logging documentation. 2011-03-09 12:08:03 +01:00
Bastian Kleineidam
2dfe62afa2 Updated copyright. 2011-02-14 21:07:07 +01:00
Bastian Kleineidam
c5884b8d87 Add function documentation. 2011-02-14 21:06:34 +01:00
Bastian Kleineidam
48e4bd8bfd Updated copyright 2011-02-06 09:50:48 +01:00
Bastian Kleineidam
eaf3ca0d89 Set codec error policy on StreamWriter for stdout. 2011-01-09 13:57:14 -06:00
Bastian Kleineidam
3c48c04b1c Updated copyright and remove unused imports. 2011-01-06 09:38:32 +01:00
Bastian Kleineidam
066c57ffe3 Update copyright and German translation. 2011-01-04 20:44:07 +01:00
Bastian Kleineidam
e6ccd71ae1 Improved statistics output. 2010-12-22 13:05:32 +01:00
Bastian Kleineidam
06ec8e6389 Reset GUI statistics before each check run. 2010-12-21 00:35:07 +01:00
Bastian Kleineidam
a9b6c10cd5 Fix unicode errors when writing to sys.stdout. 2010-12-20 23:43:37 +01:00
Bastian Kleineidam
b05ca0e345 Clear properties and statistics before check. 2010-12-17 20:25:06 +01:00
Bastian Kleineidam
b485594dfb Print statistics information in HTML output. 2010-12-15 14:36:19 +01:00
Bastian Kleineidam
a94269fd5b Remove unused ID part of loggers. 2010-12-15 13:24:31 +01:00
Bastian Kleineidam
f2b8c742fc Gather URL length statistics. 2010-12-15 07:55:00 +01:00
Bastian Kleineidam
870bd2147a Output statistics in text logger. 2010-12-14 20:52:42 +01:00
Bastian Kleineidam
2b2121b9ed Added content type and domain to URL logging info. 2010-12-14 20:30:53 +01:00
Bastian Kleineidam
e57456ccdb Use correct charset encoding in XML output. 2010-11-26 21:24:36 +01:00
Bastian Kleineidam
431953a6d9 Fix typos. 2010-11-26 21:23:13 +01:00
Bastian Kleineidam
b2bdbed3c4 Move encode() method to base class. 2010-11-22 07:43:33 +01:00
Bastian Kleineidam
d97b7a4e4e Force UTF-8 for CSV logger. 2010-11-21 20:48:50 +01:00
Bastian Kleineidam
f0b911b608 Use codecs module for proper output encoding. 2010-11-21 20:19:27 +01:00
Bastian Kleineidam
03034ddc1c Updated copyright 2010-11-21 11:25:07 +01:00
Bastian Kleineidam
350c952a1f Use new textwrap feature to not break on hyphens. 2010-11-21 10:48:40 +01:00
Bastian Kleineidam
017a1087ba Remove unneeded __future__ import 2010-11-21 10:45:30 +01:00
Bastian Kleineidam
3ecfb4a67b Updated support URL. 2010-11-06 17:00:09 +01:00
Bastian Kleineidam
dd18c11cd1 Do not write node ID to label 2010-11-06 16:08:05 +01:00
Bastian Kleineidam
9d88a7117d Fix GML comment format. 2010-11-06 15:47:00 +01:00
Bastian Kleineidam
e7dff74cf9 Quote edge labels and strip leading and trailing whitespace from node and edge labels. 2010-11-06 15:36:21 +01:00
Bastian Kleineidam
885ce223a4 Modified logger output strings. 2010-11-05 12:53:57 +01:00
Bastian Kleineidam
06dcf13629 Updated copyright. 2010-11-05 12:27:29 +01:00
Bastian Kleineidam
ca395e7d82 Avoid error when intro or outro logging fields are configured. 2010-11-03 20:35:45 +01:00
Bastian Kleineidam
166969f3a4 Remove duplicate logger code. 2010-11-01 09:58:03 +01:00
Bastian Kleineidam
ffcd274087 Updated copyright 2010-09-05 21:02:51 +02:00
Bastian Kleineidam
851e1121e9 Use semicolon as default CSV separator. 2010-07-31 22:30:11 +02:00
Bastian Kleineidam
4e1b6d667e Set copyright. 2010-03-26 20:51:59 +01:00
Bastian Kleineidam
c4c098bd83 pep8-ify the source a little more 2010-03-13 08:47:12 +01:00
Bastian Kleineidam
86ca7d0dba Do not break long words when text wrapping. 2010-03-11 21:50:23 +01:00
Bastian Kleineidam
8533ade21f Add ID for each logged URL. 2009-07-26 22:31:51 +02:00