112 commits

Author SHA1 Message Date
Petr Dlouhý
8b9f29ae52 Python3: fix unichr() in htmlparser 2019-09-09 19:51:30 +01:00
Petr Dlouhý
bc99dc51de Python3: fix HtmlParser 2019-04-18 19:35:16 +01:00
Marius Gedminas
fb1debaa68 Fix incompatible pointer type warnings
The warnings looked like this:

    htmlparse.c: In function ‘yyparse’:
    htmlparse.c:1810:18: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
    htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’
    htmlparse.c:1927:12: warning: passing argument 1 of ‘yyerror’ from incompatible pointer type [-Wincompatible-pointer-types]
    htmlparse.y:40:13: note: expected ‘PyObject ** {aka struct _object **}’ but argument is of type ‘PyObject * {aka struct _object *}’

The argument is not used, so it doesn't really matter what pointer type
it is.
2017-02-24 15:04:09 +02:00
Marius Gedminas
03dfe3d3a1 Fix "operation on ... may be undefined" [-Wsequence-point] warnings
Fixes a bunch of warnings like

  htmlparse.y:509:25: warning: operation on ‘self->userData->buf’ may be undefined [-Wsequence-point]
  htmlparse.y:518:29: warning: operation on ‘self->userData->tmp_buf’ may be undefined [-Wsequence-point]

which were a result of (macro-expanded) code like this (simplified):

  if ((tmp = (tmp = PyMem_Realloc(...))) == NULL) return NULL;

The PyMem_Resize(p, ...) macro assigns the new value to p before
returning it, so there's no need to assign it again.

See http://bugs.python.org/issue1668036 for evidence (from 2007) that
this is indeed a documented side-effect of the macro API.
2017-02-13 15:20:33 +02:00
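
The double assignment described in commit 03dfe3d3a1 can be illustrated with a small standalone sketch. It is not code from htmlparse.y: RESIZE is a hypothetical stand-in for PyMem_Resize, built on realloc() so the example compiles without Python headers, and grow() is an invented helper. The point it shows is that a macro which already stores the new pointer in p should be tested directly rather than assigned to p a second time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyMem_Resize (an assumption for illustration).
 * Like the real macro, it stores the new pointer in p and also evaluates
 * to that pointer. */
#define RESIZE(p, type, n) ((p) = (type *)realloc((p), (n) * sizeof(type)))

/* Grow *buf to hold n chars. Returns 0 on success, -1 on failure.
 * On failure the old block is freed and *buf is left NULL. */
static int grow(char **buf, size_t n)
{
    char *old;

    assert(buf != NULL && n > 0);
    old = *buf;

    /* Writing `*buf = RESIZE(*buf, char, n)` would assign *buf twice in
     * one expression, which is the pattern gcc flags with
     * -Wsequence-point. Testing the macro's value is enough. */
    if (RESIZE(*buf, char, n) == NULL) {
        free(old); /* realloc leaves the old block intact on failure */
        return -1;
    }
    return 0;
}
```

A caller would pass the address of its buffer pointer, for example `grow(&buf, 16)`, and check the return value before using `buf`. Saving the old pointer before the resize is a design choice of this sketch so that the block is not leaked when the allocation fails.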
Bastian Kleineidam
3d711666e1 Fix parser for changes in bison 3.0.x 2015-11-26 12:33:44 +01:00
Bastian Kleineidam
029c20ed98 More python3 fixes 2014-09-12 21:59:07 +02:00
Bastian Kleineidam
35eb30432e Added some Python3 fixes. 2014-09-12 19:36:30 +02:00
Bastian Kleineidam
7b34be590b Introduce check plugins, use Python requests for http/s connections, and some code cleanups and improvements. 2014-03-01 00:12:34 +01:00
Bastian Kleineidam
6d5e5f9efb Updated copyright. 2012-03-30 22:24:10 +02:00
Bastian Kleineidam
9ee9abcf0f Parse invalid comments <! bla > 2012-03-23 07:41:03 +01:00
Bastian Kleineidam
b9b8e3f5b2 Honor the charset encoding of the Content-Type HTTP
header when parsing HTML.
2012-03-22 22:45:11 +01:00
Bastian Kleineidam
71f5ee42c8 Updated copyright. 2012-01-29 17:18:28 +01:00
Bastian Kleineidam
dff425710d More Freshmeat/Freecode replacements. 2011-12-25 09:06:18 +01:00
Bastian Kleineidam
5c496df9ed Regenerate HTML parser with new Bison 2.5 version. 2011-10-31 06:41:45 +01:00
Bastian Kleineidam
fb237041d1 Updated copyright 2011-10-20 08:14:16 +02:00
Bastian Kleineidam
d2ae6bf71c Properly detect HTML character encoding. 2011-08-14 12:49:31 +02:00
Bastian Kleineidam
689ab9f073 Add debugging for charset encoding parameter setting. 2011-08-14 12:45:08 +02:00
Bastian Kleineidam
c9707ee735 Handle stray < before end tags. 2011-05-28 13:39:04 +02:00
Bastian Kleineidam
7d04c3ee81 Handle stray < characters in HTML. 2011-05-20 06:50:08 +02:00
Bastian Kleineidam
74c132c90b Updated copyright. 2011-04-26 14:57:57 +02:00
Bastian Kleineidam
54a14d0f91 Use Python 2.7 for local build. 2011-04-22 08:39:45 +02:00
Bastian Kleineidam
c0957a20df Make strlen variables type size_t. 2011-04-19 16:07:10 +02:00
Bastian Kleineidam
4c98c463dc Correctly declare all variables at beginning of block. 2011-04-16 15:25:51 +02:00
Bastian Kleineidam
f4f921384e Updated copyright 2011-03-13 07:52:18 +01:00
Bastian Kleineidam
427b878834 Updated translation and copyright 2010-12-18 21:00:29 +01:00
Bastian Kleineidam
e48acc08af Remove old comments and set line and column number on flush. 2010-12-11 07:57:50 +01:00
Bastian Kleineidam
03034ddc1c Updated copyright 2010-11-21 11:25:07 +01:00
Bastian Kleineidam
6dcb0e10de Require Python 2.6 2010-11-21 10:42:44 +01:00
Bastian Kleineidam
5b5a62f6d5 Updated copyright 2010-03-10 00:05:05 +01:00
Bastian Kleineidam
57397e938b Improved linkname parsing by adding a new peek() HTML parser function. 2010-03-09 11:31:12 +01:00
Bastian Kleineidam
5e06b6b8d4 Updated FSF address in GPL blurb 2009-07-24 23:58:20 +02:00
Bastian Kleineidam
a0ba9a7446 Improved Python 2.6 compatibility in HTML parser 2009-02-28 13:47:25 +01:00
calvin
527b617f88 Regenerate with newer flex and bison versions.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3949 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-02-01 11:21:13 +00:00
calvin
e9805dbd8a Updated copyright year to 2009
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3887 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2009-01-08 14:18:03 +00:00
calvin
2f25962789 Match newlines in catch-all rules
Avoid printing spurious newlines when HTML parsing. The "." does
not match newlines, correct that in the catch-all lexer rules.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3760 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-05-20 16:56:58 +00:00
calvin
3eac1be9ab Require and use Python 2.5
Use Python 2.5 features and get rid of old compat code. Also some
code cleanups have been made.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3737 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-27 11:39:21 +00:00
calvin
9f77f97434 Add distclean target; use Python2.5 includes
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3717 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-04-23 23:03:55 +00:00
calvin
bf277085e9 Regenerate HTML scanner with new flex version
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3683 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-25 21:24:32 +00:00
calvin
370bd058ea Add htmlsax.so target for local build
Add target to build htmlsax.so locally. Also add include path
for local python SVN repository for testing.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3678 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-03-19 10:21:52 +00:00
calvin
13df77c0b5 Added .gitignore files
Ignore files for git version tracking system.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3671 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:29 +00:00
calvin
e1b1b7d916 Regenerate HTML lexer with flex 2.5.34
The HTML lexer .c file has been regenerated with a new upstream
release of flex 2.5.34.


git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3669 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-02-08 10:38:00 +00:00
calvin
f01a77bab1 Don't parse '-->' as end-of-comment in script mode. This fixes parsing errors on some sites.
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3668 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-23 09:43:39 +00:00
calvin
8c4d8145a7 simplify the CDATA matching rules to be more straightforward
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3667 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-23 09:11:50 +00:00
calvin
6499cb1a63 updated copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3658 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2008-01-02 14:31:19 +00:00
calvin
17906ca1e0 use Python 2.4 for local builds
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3592 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-11-13 12:38:01 +00:00
calvin
4c0620c498 use default python
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3559 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-06-16 22:08:41 +00:00
calvin
9de237b4c2 Check that charset is not None before lowering it in set_encoding().
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3547 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-03-21 19:32:19 +00:00
calvin
df48d4a905 bump up copyright year
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3534 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2007-01-01 14:57:38 +00:00
calvin
2e5a5d20df prepare for Py_ssize_t conversion
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3531 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-12-08 22:07:31 +00:00
calvin
b274787c5b prepare for Py_ssize_t conversion
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@3530 e7d03fd6-7b0d-0410-9947-9c21f3af8025
2006-12-08 22:07:23 +00:00