Rainforest

Sankuru

Implémenter, personaliser, étendre et réparer Joomla/Virtuemart

Views: 2714

Nous vous aidons avec ...

Virtuemart
Joomfish
Autres extensions
SocialTwist Tell-a-Friend

Traduction automatique

English Arabic Chinese (Simplified) German Japanese Russian Spanish



Re-utilisons des sources libres

Les logiciels dont vous avez besoin, éxistent souvent déjà en source libre, et couvrent vos besoins à 80%. Nous ajouterons pour vous les 20% qui manquent.
Php implementation of an LALR1 parser generator PDF Imprimer E-mail
Note des utilisateurs: / 0
MauvaisTrès bien 
Écrit par erik   
Jeudi, 24 Décembre 2009 14:30
There are no translations available.

 

Here is my initial version (0.15) of a Php implementation of an LALR1 parser generator: lrparsers.zip.

Parser tables, along with a lexer table, are the core constituents of any compiler's front end.

This LALR parser generator is heavily based on my custom Php version of the map/reduce programming technique (functional programming). I will elaborate in future posts, on why I used a map/reduce basis to implement the parser generator.

In fact, I have implemented three LR parser generators:

  • LR0. No lookaheads. The other parsers call it to calculate the collection of grammar states.
  • SLR1. States based on LR0. Lookaheads based on grammar-wide follow sets. To test the LR0 parser against SLR1 examples.
  • LALR1. States based on LR0. Lookaheads based on individual state-wide follow sets (merged LR1 lookaheads).

 

I am still looking for a way to validate the LALR1 parser generator more thoroughly. I would use the output of existing parser generators more readily, if they did not generate very cryptical parser tables.

It is a commandline application in Php. You can generate the parser tables for a particular grammar as following:

$ ./lalr1cc examples/grammar07.txt -v

[RULES]
0 S : E ;
1 E : E + ( E ) ;
2 E : int ;

[START]
S

[NON-TERMINALS]
S E

[TERMINALS]
+ ( ) int

[REDUCTIONS] --> state number, rule number reduced, non-terminal reduced | lookahead tokens
1 0 S | EOF
2 2 E | ) + EOF
6 1 E | EOF + )

[TRANSITIONS]  --> transition number, state from, transition term, state to | lookahead tokens
0 0 E 1 | EOF +
1 0 int 2 | EOF +
2 1 + 3 | EOF +
3 3 ( 4 | EOF + )
4 4 E 5 | EOF + )
5 4 int 2 | ) +
6 5 ) 6 | EOF + )
7 5 + 3 | ) +

[STATES]
0 0-0|1-0/2-0 INIT --> state number, core rule positions | expanded rule positions
0-0 S : .E | EOF --> rule position, the dot shows the position in the rule | lookahead tokens
1-0 E : .E + ( E ) | EOF +
2-0 E : .int | EOF +
1 0-1/1-1
0-1 S : E. | EOF
1-1 E : E.+ ( E ) | EOF +
2 2-1
2-1 E : int. | ) + EOF
3 1-2
1-2 E : E +.( E ) | EOF + )
4 1-3|1-0/2-0
1-3 E : E + (.E ) | EOF + )
1-0 E : .E + ( E ) | ) +
2-0 E : .int | ) +
5 1-4/1-1
1-4 E : E + ( E.) | EOF + )
1-1 E : E.+ ( E ) | ) +
6 1-5
1-5 E : E + ( E ). | EOF + )

 

The -v option means "verbose". With this option, the parser generator will also output the individual state cursors in the [STATES] section of the parser tables. The individual state cursors are not needed for the LR parsing algorithm.

The example grammar above (grammar07.txt) comes from lecture on LALR1, apparently at Berkeley (http://users.cs.dal.ca/~jost/4131/lecture09_4up.pdf). The parser generates indeed the same states and lookahead tokens as in the solution in the lecture. This parser generator, however, does not operate by merging LR1 states, but by enriching directly the LR0 states.

Of course, I need much more further validation than the few examples included, to make sure the parser generator generates the tables correctly.

I would love to get some help in validating the parser generator. Therefore, if you feel like running it against other examples, feel free!

 


blog comments powered by Disqus
 
 
Joomla 1.5 Templates by Joomlashack