Erlang ~ Parenthesis Handling in Leex
Bill M.
Posted: Wed Mar 18, 2009 2:07 pm
User
Joined: 06 Jun 2008
Posts: 24
Location: New York
Hello All:
I'm sorry if you see this twice, but I'm cross posting from the mailing lists to elicit a response.
I've tried the latest and greatest leex package, version 0.1, and have had a generally good experience. Hopefully it will become part of the Erlang standard distribution.
One minor concern I have is that I find it tricky to escape parentheses in my grammar. I'm wondering if I'm doing it wrong or if there is a bug in leex?
Consider the following grammar:
Code:
Definitions.
Rules.
%% The following line is rejected by leex, why?
()[/*+-] : io:format("In suspected rule~n"), {token, {TokenChars, TokenLine}}.
%% The following line is accepted by leex
%% []()[/*+-] : io:format("In suspected rule~n"), {token, {TokenChars, TokenLine}}.
In practice, I'm using the commented-out workaround, which matches the square brackets [ and ] in addition to the symbols I need to scan. I can live with it, but I get the feeling I shouldn't have to make the scanner tokenize symbols that the parser doesn't use.
When I try to build it I get the error:
Quote:
make -f Makemytest
mkdir -p ./ebin
# Compile the lexer generator because it is not part of OTP
erlc -I ./include -I ./eunit/include -pa ./ebin -pa ./eunit/ebin -o ./ebin -W0 -Ddebug +debug_info ./leex.erl
# Generate the lexer
erl -I -pa ./ebin -pa ./eunit/ebin -noshell -eval 'leex:file("./mytest",[{outdir,"."}]), halt().'
Parsing file ./mytest.xrl,
./mytest.xrl:6: bad regexp `unterminated `(''
On a side note, I've also tried to define standalone rules for the open and close parentheses, i.e. (, \(, \050, which also brought no joy.
Thanks:
Bill M.
rvirding
Posted: Thu Mar 19, 2009 10:27 pm
User
Joined: 30 Aug 2006
Posts: 452
Location: Stockholm, Sweden
OK, it is really quite simple, as is everything if you know the answer.
- [ ... ] specifies a character class, which matches any one of the characters between the brackets. All characters inside the brackets lose their special meanings, EXCEPT '-', which is used for character ranges (a-e means abcde), and ']', which ends the character class. To include ']' in the class, it must be the first character after the opening '['; to make '-' lose its range meaning, it should be the last character. Otherwise the order of the characters is not significant. This is how you, correctly, do it in your commented rule, which is why it works.
- Normally ( and ) are meta-characters used for grouping and the empty group. To use any meta-character as itself you must escape it with \. So \( and \) match parentheses, \[ and \] match brackets, \. matches a dot, \\ matches a \, etc.
- To include a space in a rule regexp you must use \s, as a blank character terminates the regexp.
So rules for matching parentheses could be:
\( : {token,{'(',TokenLine}}.
\) : {token,{')',TokenLine}}.
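Putting the points above together, a complete scanner file might look like the sketch below. It is not taken from the original grammar: only the file name mytest.xrl comes from the question, while the token shapes, the whitespace rule and the use of list_to_atom/1 are assumptions chosen for illustration.

```erlang
Definitions.

Rules.

%% Outside a character class, ( and ) are meta-characters, so escape them.
\(        : {token, {'(', TokenLine}}.
\)        : {token, {')', TokenLine}}.

%% Inside a class only ']' and '-' are special; '-' is placed last so it
%% is taken literally. Token shape is an assumption for illustration.
[/*+-]    : {token, {list_to_atom(TokenChars), TokenLine}}.

%% Alternatively, a single class such as [()/*+-] would cover the
%% parentheses as well, since they are not special inside [ ].

%% Skip whitespace; \s stands for a space because a literal blank
%% would terminate the regexp.
[\s\t\n]+ : skip_token.

Erlang code.
```

If this is saved as mytest.xrl, leex:file("mytest") should generate mytest.erl, and after compiling that module a call such as mytest:string("(+)") should return the scanned tokens. This has not been checked against leex 0.1 specifically.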
Leex will be included in Erlang in a not-too-distant future release. It is being worked on, but the Erlang group have many things to work on, as do I. But the decision has been made.
Which version of leex are you using? The most recent is on github, where incremental changes are included. Here on trapexit I only put up full releases.
All times are GMT