mirror of
https://github.com/Hopiu/linkchecker.git
synced 2026-03-17 06:20:27 +00:00
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@5 e7d03fd6-7b0d-0410-9947-9c21f3af8025
142 lines
5.7 KiB
HTML
<html>
|
|
<body bgcolor="#ffffff">
|
|
<title> PyLR -- Fast LR parsing in Python </title>
|
|
<!-- Changed by: Scott, 15-Dec-1997 -->
|
|
<center>
|
|
<h2>PyLR -- Fast LR parsing in Python</h2>
|
|
<hr>
|
|
</center>
|
|
|
|
<ul>
|
|
<li> <a href="#whatis"> What is PyLR? </a>
|
|
<li> <a href="#status"> What is the current state of PyLR? </a>
|
|
<li> <a href="#where"> Where do I get PyLR? </a>
|
|
<li> <a href="#directions"> What will be added to PyLR? </a>
|
|
<li> <a href="#parsing"> Where do I find out about parsing theory? </a>
|
|
<li> <a href="#contrib"> How can I contribute to PyLR? </a>
|
|
</ul>
|
|
<hr>
|
|
<p><p>
|
|
<a name="whatis"><h2>What is PyLR?</h2></a>
|
|
|
|
PyLR is a package of tools for creating efficient parsers in Python;
such a tool is commonly known as a compiler compiler. PyLR is currently under
development. A full release is almost complete, but a few features
that would make it much nicer are still missing.
|
|
|
|
<p>
|
|
PyLR (pronounced 'pillar') was motivated by the frequency with which parsers are hand
coded in Python, the performance demands that these parsers are subject to (you just can't beat
native machine code for speed...), and academic curiosity (I wanted to really know how LR
parsing works).
|
|
<p><p>
|
|
|
|
|
|
<a name="status"> <h2>What is the current state of PyLR? </h2></a>
|
|
PyLR currently has class interfaces to a Grammar and a Lexer, an extension module
defining a built-in parsing engine type, and a parser generator script. All of these components
are based on sound parsing theory, but they have not yet been tested by anyone but their author.
The code as it stands can definitely be of use to anyone hand-writing a parser in Python, but some
of the nicer things in the complete package <em> just haven't been done yet </em>. <p>
PyLR is therefore under development, as it will always be. PyLR will be given a release number
once it supplies the following tools:
|
|
<ul>
|
|
|
|
|
|
<LI> write an 'engine' module that implements the LR parsing
algorithm in C with callbacks to Python functions. (done) </LI>
|
|
|
|
|
|
<LI> write a Lexer class using re (done)</LI>
|
|
|
|
|
|
<LI> write a Grammar class that will take as input a context-free
grammar and produce the parsing tables necessary to complement
the engine. This is to be done with LR(1) grammars (done and then
deleted -- extremely inefficient) and LALR(1) grammars (done,
except for epsilon (empty) productions; <EM>much</EM> more efficient). </LI>
|
|
|
|
|
|
<LI> add a user interface -- manually write a Lexer and Grammar
using the existing classes to parse lexer and grammar specifications
modelled after lex/flex and yacc/bison. (done for Grammars)
|
|
</LI>
|
|
|
|
<LI> write documentation. (usable, but not done)
|
|
</LI>
|
|
|
|
<LI> (post release) add grammars for various languages to the
distribution.
|
|
</LI>
|
|
</ul>
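<p>
The 'engine' item in the list above can be sketched in pure Python. This is only an
illustration of a table-driven LR loop with callbacks to Python functions, not PyLR's C engine
or its API: the toy grammar (E -> E + n | n), the hand-built tables, and the names ACTION, GOTO,
PRODUCTIONS and parse are all assumptions made for this example.

```python
# Hand-built SLR(1) tables for the toy grammar (assumed for illustration):
#   production 1:  E -> E + n
#   production 2:  E -> n
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}

# production number -> (left-hand side, right-hand side length, callback)
PRODUCTIONS = {
    1: ("E", 3, lambda e, _plus, n: e + n),
    2: ("E", 1, lambda n: n),
}


def parse(tokens):
    """Run the LR loop over (kind, value) pairs and return the final value."""
    states = [0]   # stack of parser states
    values = []    # stack of semantic values, parallel to the symbols shifted
    stream = iter(list(tokens) + [("$", None)])  # "$" marks end of input
    kind, value = next(stream)
    while True:
        entry = ACTION.get((states[-1], kind))
        if entry is None:
            raise SyntaxError("unexpected token %r in state %d" % (kind, states[-1]))
        op, arg = entry
        if op == "shift":
            states.append(arg)
            values.append(value)
            kind, value = next(stream)
        elif op == "reduce":
            lhs, size, callback = PRODUCTIONS[arg]
            args = values[-size:]
            del values[-size:]
            del states[-size:]
            # The callback computes the value of the reduced nonterminal.
            values.append(callback(*args))
            states.append(GOTO[(states[-1], lhs)])
        else:  # accept
            return values[-1]
```

With these tables, <code>parse([("n", 1), ("+", None), ("n", 2), ("+", None), ("n", 3)])</code>
returns 6. In a real parser generator the tables would be produced from the grammar rather than
written by hand.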
|
|
In addition, I have the following plan for the project:
|
|
<UL>
|
|
<LI> make 'epsilon' (empty) productions work (many of them work now, but not all) </LI>
|
|
|
|
<LI> optimize the Lexer. Try to join it into one regular expression and derive
|
|
function calls from match object data. (done, still the slowest part of parsing)</LI>
|
|
|
|
<LI> add error specification routines. </LI>
|
|
|
|
<LI> change the parser generation algorithm to use only kernel LALR(1) items
in the computation of shift actions and gotos in the goto table. This
should significantly speed up parser generation, which was
a bit slow, though certainly acceptable for medium-sized grammars (&lt; ~100 productions).
(done in this version)
|
|
</LI>
|
|
|
|
|
|
<LI> write a Parser for SQL, as used in <A HREF="http://www.pythonpros.com/arw/kwParsing/">gadfly</A>
|
|
</LI>
|
|
|
|
<LI> add operator precedence as an option to the parser specification (further down the road...)</LI>
|
|
|
|
</UL>
|
|
These things will probably be done over the next month or two (as I only have free time to give
|
|
to this project...Ahemmm...).
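<p>
The Lexer optimization mentioned in the plan (joining the token patterns into one regular
expression and deriving the function call from match object data) can be sketched as follows.
This is a minimal example using only the standard <code>re</code> module, not PyLR's actual
Lexer class; the token table and the names TOKEN_SPECS, MASTER, CALLBACKS and tokenize are
hypothetical.

```python
import re

# Hypothetical token table: (name, pattern, function applied to the matched text).
# A function of None means the match is skipped (whitespace here).
TOKEN_SPECS = [
    ("NUMBER", r"\d+", int),
    ("NAME", r"[A-Za-z_]\w*", str),
    ("PLUS", r"\+", str),
    ("SKIP", r"\s+", None),
]

# Join every pattern into one alternation of named groups.
MASTER = re.compile(
    "|".join("(?P<%s>%s)" % (name, pattern) for name, pattern, _ in TOKEN_SPECS)
)
CALLBACKS = {name: func for name, _, func in TOKEN_SPECS}


def tokenize(text):
    """Return a list of (name, value) pairs for the given text."""
    pos = 0
    tokens = []
    while pos != len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError("illegal character %r at position %d" % (text[pos], pos))
        # match.lastgroup names the alternative that matched, which
        # selects the function to call without trying each pattern in turn.
        callback = CALLBACKS[match.lastgroup]
        if callback is not None:
            tokens.append((match.lastgroup, callback(match.group())))
        pos = match.end()
    return tokens
```

For example, <code>tokenize("x + 42")</code> returns
<code>[("NAME", "x"), ("PLUS", "+"), ("NUMBER", 42)]</code>. The order of entries in the table
matters, because the first alternative that matches at a position wins.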
|
|
<p><p>
|
|
<a name="where"><h2>Where do I get PyLR? </h2></a>
|
|
You can get PyLR in one of two places, <a href="ftp://chronis.icgroup.com/pub/">here</a>
|
|
or <a href="PyLR.tgz"> here</a>. Both versions will be in sync with each other.
|
|
<p><p>
|
|
|
|
<a name="directions"><h2>What will be added to PyLR? </h2></a>
|
|
In addition to the <a href="#status">list of things to finish</a> before a full release
is published, PyLR could be used as the basis for an efficient datapath analyzer (optimizer),
for a front end to translation from one language to another, for type checking code, etc.<p>
As soon as the first release is completed, tools to aid in all these things could well be added
to the package. Also, anyone wanting to contribute parser specifications for
languages of general use is most welcome.
|
|
<p><p>
|
|
|
|
<a name="parsing"> <h2>Where do I find out more about parsing? </h2></a>
|
|
Parsing was for a long time a big challenge for computer scientists. The need for
computer parsing originally came about with the writing of the first compilers. Since then, the
theory behind parsing has been studied in depth and has pretty much stabilized, as parsing today's
computer languages no longer really presents a big problem in terms of speed or size.
One standard means of parsing that has been used for years because of its efficiency
is LR parsing (more particularly, LALR parsing). A lot of good information is in
<a href="http://www.amazon.com/exec/obidos/ISBN=1565920007">
Lex and Yacc</a> and
<a href="http://www.amazon.com/exec/obidos/ISBN=0201100886">
The Dragon Book</a>, though
it seems like the only place to find good info on LALR parsing is in
|
|
|
|
<pre>
DeRemer, F.; and Pennello, T. Efficient computation of LALR(1) look-ahead sets, ACM Trans.
Program. Lang. Syst. 4 (1982), 615-649.
</pre>
|
|
|
|
Finally, to find out how to use PyLR, see the <A HREF="manual.html">PyLR manual</A>.
|
|
|
|
<a name="contrib"> <h2>How do I contribute to PyLR? </h2></a>
|
|
<a href="mailto:scott@chronis.icgroup.com">mail me. </a>