Erlang/OTP Forums


Erlang ~ Parenthesis Handling in Leex

Bill M.
Posted: Wed Mar 18, 2009 2:07 pm
User Joined: 06 Jun 2008 Posts: 24 Location: New York
Hello All:

I'm sorry if you see this twice, but I'm cross-posting from the mailing list to elicit a response.
I've tried the latest and greatest leex package, version 0.1, and have had a generally good experience. Hopefully it will become part of the standard Erlang distribution.

One minor concern is that I find it tricky to escape parentheses in my grammar. I'm wondering whether I'm doing it wrong or whether there is a bug in leex.

Consider the following grammar:


Code:

Definitions.

Rules.
%% The following line is rejected by leex, why?
()[/*+-] : io:format("In suspected rule~n"), {token, {TokenChars, TokenLine}}.
%% The following line is accepted by leex
%% []()[/*+-] : io:format("In suspected rule~n"), {token, {TokenChars, TokenLine}}.


In practice, I'm using the commented-out workaround, which adds the square brackets [ and ] to the symbols I need to scan. I can live with it, but I get the feeling I shouldn't have to make the scanner tokenize symbols that the parser never uses.

When I try to build it I get the error:

Quote:

make -f Makemytest
mkdir -p ./ebin
# Compile the lexer generator because it is not part of OTP
erlc -I ./include -I ./eunit/include -pa ./ebin -pa ./eunit/ebin -o ./ebin -W0 -Ddebug +debug_info ./leex.erl
# Generate the lexer
erl -I -pa ./ebin -pa ./eunit/ebin -noshell -eval 'leex:file("./mytest",[{outdir,"."}]), halt().'
Parsing file ./mytest.xrl,
./mytest.xrl:6: bad regexp `unterminated `(''


On a side note, I've also tried to define standalone rules for the open and close parentheses, i.e. (, \(, \050, which also brought no joy.

Thanks:

Bill M.
rvirding
Posted: Thu Mar 19, 2009 10:27 pm
User Joined: 30 Aug 2006 Posts: 452 Location: Stockholm, Sweden
OK, it is really quite simple, as is everything if you know the answer. :)

- [ ... ] specifies a character class, which matches any one of the characters between the brackets. All characters between the brackets lose their special meanings, EXCEPT: '-', which is used for character ranges (a-e means abcde), and ']', which ends the character class. To include ']' in the class it must be the first character after the opening '[', and to have '-' lose its range meaning it should be the last character. Otherwise the order of the characters is not significant. This is how you, correctly, do it in your commented rule, which is why it works.

- normally ( and ) are meta-characters, used for grouping and for the empty group. To use any meta-character as itself you must escape it with \. So \( and \) match parentheses, \[ and \] match square brackets, \. matches a dot, \\ matches a \, etc.

- to include a space in a rule regexp you must use \s as a blank character terminates the regexp.

So rules for matching parentheses could be:

\( : {token,{'(',TokenLine}}.
\) : {token,{')',TokenLine}}.
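
Putting these points together, a complete .xrl file might look like the sketch below. The operator rule, the whitespace rule and the use of list_to_atom/1 are illustrative assumptions, not taken from the grammar in the question, and this has not been checked against leex 0.1 specifically.

```erlang
Definitions.

Rules.
%% Parentheses are meta-characters outside a character class,
%% so they are escaped with a backslash.
\( : {token, {'(', TokenLine}}.
\) : {token, {')', TokenLine}}.
%% Operators in a character class. '-' comes last so that it is
%% taken literally instead of as a range.
[/*+-] : {token, {list_to_atom(TokenChars), TokenLine}}.
%% Assumed whitespace rule: \s stands for a space, because a literal
%% blank would terminate the regexp.
[\s\t\n]+ : skip_token.

Erlang code.
```

Assuming the file is saved as mytest.xrl, as in the build log above, leex:file("./mytest", [{outdir, "."}]) should generate mytest.erl. Once that module is compiled, mytest:string("( + )") should return a tuple of the form {ok, Tokens, EndLine}.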

Leex will be included in Erlang in a not-too-distant future release. It is being worked on, but the Erlang group has many things to work on, as do I. But the decision has been made.

Which version of leex are you using? The most recent is on github, where incremental changes are included. Here on trapexit I only put up full releases.


All times are GMT