JavaCC regex that ignores everything until a specific string

I have this production in JavaCC:
void parseDSL() throws SemanticException #void :
{}
{
<ALL> "/*#mat" dslStatements() "*/" <ALL> <EOF>
}
My objective is to ignore everything until "/*#mat" matches and, once the DSL has been parsed, to ignore everything up to EOF.
I'm really struggling to find a regular expression that works here.
One example of a file that should pass is:
public class blabla {
int i=1;/*#mat
in float B[100];
in float C[100];
in int A[9];
in int Z[9];
out float D[];
D=A*(B+C-Z)+A*Z;
*/boolean a;
}
Thank You.

This is what lexical states are for. See the documentation and FAQ for more information. Roughly what you want is
<DEFAULT> SKIP: { < ~[] > } // Skip everything up to "/*#mat"
<DEFAULT> TOKEN: { < STARTMAT: "/*#mat" > : GO }
<GO> TOKEN: { <IN : "in" >
| <OUT : "out" >
| .... other rules go here ....
| <ENDMAT : "*/" > : DEFAULT // On a "*/" go back to skipping.
}
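To see the control flow of the two lexical states without running JavaCC, here is a minimal Python sketch of the same idea. It is only an illustration, not how JavaCC is implemented: the function name lex, the OTHER token kind and the \w+ word rule are assumptions standing in for the rules elided above.

```python
import re

WORD = re.compile(r"\w+")  # assumed stand-in for the elided GO-state rules


def lex(text):
    """Tokenize only the region between "/*#mat" and "*/"; skip the rest."""
    state = "DEFAULT"
    pos = 0
    tokens = []
    while pos < len(text):
        if state == "DEFAULT":
            if text.startswith("/*#mat", pos):
                # STARTMAT switches to the GO state
                tokens.append(("STARTMAT", "/*#mat"))
                pos += len("/*#mat")
                state = "GO"
            else:
                pos += 1  # SKIP: any other single character
        elif text.startswith("*/", pos):
            # ENDMAT goes back to skipping
            tokens.append(("ENDMAT", "*/"))
            pos += 2
            state = "DEFAULT"
        elif text[pos].isspace():
            pos += 1
        else:
            match = WORD.match(text, pos)
            lexeme = match.group() if match else text[pos]
            kind = {"in": "IN", "out": "OUT"}.get(lexeme, "OTHER")
            tokens.append((kind, lexeme))
            pos += len(lexeme)
    return tokens
```

Checking for the start marker first stands in for JavaCC's longest-match rule, under which the six-character "/*#mat" beats the one-character ~[] skip.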

Can't rename (refactor) names in WebStorm

I can't rename variables from _someName to someName (that is, delete the _).
Even when I rename a variable from _someName to anotherName, it comes out as _anotherName. I can't get rid of the "_". Does this require some setting?
I use TypeScript and the latest version of WebStorm on Mac OS X. Variables without a leading _ rename fine.
export class Vector2 {
protected _top: number;
protected _left: number;
public constructor(top: number, left: number) {
this._top = top;
this._left = left;
}
public equals(otherVector: Vector2): boolean {
if (this.getTop() === otherVector.getTop() && this.getLeft() === otherVector.getLeft())
return true;
else
return false;
}
public getTop(): number {
return this._top;
}
public setTop(value: number) {
this._top = value;
}
public getLeft(): number {
return this._left;
}
public setLeft(value: number) {
this._left = value;
}
}
Try removing _ from "Field prefix:" in Settings | Editor | Code Style | TypeScript | Code generation; this should help.

sweet.js: transforming occurrences of a repeated token

I want to define a sweet macro that transforms
{ a, b } # o
into
{ o.a, o.b }
My current attempt is
macro (#) {
case infix { { $prop:ident (,) ... } | _ $o } => {
return #{ { $prop: $o.$prop (,) ... } }
}
}
However, this gives me
SyntaxError: [patterns] Ellipses level does not match in the template
I suspect I don't really understand how ... works, and may need to somehow loop over the values of $prop and build syntax objects for each and somehow concatenate them, but I'm at a loss as to how to do that.
The problem is the syntax expander thinks you're trying to expand $o.$prop instead of $prop: $o.$prop. Here's the solution:
macro (#) {
rule infix { { $prop:ident (,) ... } | $o:ident } => {
{ $($prop: $o.$prop) (,) ... }
}
}
Notice that I placed the unit of code in a $() block of its own to disambiguate the ellipsis expansion.
Example: var x = { a, b } # o; becomes var x = { a: o.a, b: o.b };.
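If it helps to see what the repetition does, the same rewrite can be imitated on plain text. This Python sketch is not sweet.js and does no real macro expansion; expand and PATTERN are hypothetical names, and the regex only handles a flat list of identifiers. The point is that the whole unit "prop: obj.prop" is produced once per property, which is what the $() group expresses.

```python
import re

IDENT = r"[A-Za-z_$][\w$]*"
# "{ a, b } # o": a brace list of identifiers, then '#', then an identifier
PATTERN = re.compile(
    r"\{\s*(" + IDENT + r"(?:\s*,\s*" + IDENT + r")*)\s*\}\s*#\s*(" + IDENT + r")"
)


def expand(source):
    """Rewrite every '{ a, b } # o' in source as '{ a: o.a, b: o.b }'."""
    def repl(match):
        props = [p.strip() for p in match.group(1).split(",")]
        obj = match.group(2)
        # one "prop: obj.prop" unit per property, joined by the separator
        return "{ " + ", ".join(f"{p}: {obj}.{p}" for p in props) + " }"
    return PATTERN.sub(repl, source)
```

For instance, expand("var x = { a, b } # o;") returns "var x = { a: o.a, b: o.b };".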

A parser program for the following grammar

Write a parser (both Yacc and Lex files) that uses the following productions and actions:
S -> cSS {print “x”}
S -> a {print “y”}
S -> b {print “z”}
Indicate the string that it will print when the input is cacba.
My problem: when I give it input, it prints "Valid Input" and also reports a syntax error.
My scanner code is this:
%{
#include "prac.h"
%}
%%
[c] {return C; }
[a] {return A; }
[b] {return B; }
[ \t] ;
\n { return 0; }
. { return yytext[0]; }
%%
int yywrap(void) {
return 1;
}
And my yacc code is this:
%{
#include <stdio.h>
%}
%token A B C
%%
statement: S {printf("Valid Input"); }
;
S: C S S {printf("Print x\n");}
| A {printf("Print y\n");}
| B {printf("Print z\n");}
;
%%
int main()
{
return yyparse();
}
yyerror(char *s)
{
printf("\n%s\n",s);
printf("Invalid Input");
fprintf(stderr,"At line %d %s ",s,yylineno);
}
How can I fix this?
(Comments converted to an answer)
@ChrisDodd wrote:
Best guess: you're running on Windows, so you're getting a \r (carriage return) character before the newline, which is causing your error. Try adding \r to the [ \t] pattern to ignore it.
@Cyclone wrote:
Change your fprintf() statement to fprintf(stderr, "At line %d %s", yylineno, s); not that it will solve your problem.
The OP wrote:
You mean I should add \r to \t, so the new regex will be [\r\t]. Am I right?
@rici wrote:
@chris suggests [ \r\t]. If you have Windows somewhere in the loop, I agree.
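As for which string gets printed: each action fires when its production is reduced, that is, after the whole right-hand side has been recognised. Here is a small Python sketch of that order for the three productions. It is not yacc, and run_actions is a hypothetical helper name.

```python
def run_actions(text):
    """Return the letters printed for S -> cSS {x} | a {y} | b {z}."""
    out = []
    pos = 0

    def S():
        nonlocal pos
        if pos >= len(text):
            raise ValueError("unexpected end of input")
        ch = text[pos]
        pos += 1
        if ch == "c":
            S()
            S()
            out.append("x")  # action runs after both sub-S are done
        elif ch == "a":
            out.append("y")
        elif ch == "b":
            out.append("z")
        else:
            raise ValueError(f"unexpected character {ch!r}")

    S()
    if pos != len(text):
        raise ValueError("unexpected trailing input")
    return "".join(out)
```

For the input cacba this returns yzyxx. A stray character after a complete sentence, such as the \r suggested above, is rejected only once the sentence itself has been recognised. That would be consistent with seeing both messages, though whether it is what happens on your machine is a guess.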

Shift/Reduce conflict in bison

I was trying to do some simple parsing of general HTML code.
Here's my entire bison file (example4.y).
%{
#include <iostream>
#include <cstring>
using namespace std;
extern "C" int yylex();
extern "C" int yyparse();
extern "C" FILE *yyin;
void yyerror(const char *str)
{
cout<<"Error: "<<str<<"\n";
}
int yywrap()
{
return 0;
}
main()
{
yyparse();
}
%}
%token NUMBER LANGLE CLOSERANGLE RANGLE SLASH ANYTHING
%union
{
int intVal;
float floatVal;
char *strVal;
}
%%
tag: |
opening_tag anything closing_tag
{
if(strcmp($<strVal>1,$<strVal>3)==0){
cout<<"\n[i] Tag Matches: "<<$<strVal>1;
cout <<"\n[!] The text: "<<$<strVal>2;
} else {
cout<<"\n[!] Tag Mismatch: "<<$<strVal>1<<" and "<<$<strVal>3;
}
$<strVal>$ = $<strVal>2;
}
|
opening_tag tag closing_tag
{
if(strcmp($<strVal>1,$<strVal>3)==0){
cout<<"\n[i] Tag Matches: "<<$<strVal>1;
cout <<"\n[!] The text: "<<$<strVal>2;
} else {
cout<<"\n[!] Tag Mismatch: "<<$<strVal>1<<" and "<<$<strVal>3;
}
}
;
opening_tag:
LANGLE ANYTHING RANGLE
{
$<strVal>$ = $<strVal>2;
}
;
anything:
ANYTHING
{
$<strVal>$ = $<strVal>1;
}
;
closing_tag:
LANGLE SLASH ANYTHING RANGLE
{
$<strVal>$= $<strVal>3;
}
%%
The error I get is: example4.y: conflicts: 1 shift/reduce
I think it has something to do with opening_tag tag closing_tag, but I can't work out what is happening here.
Any help?
It's because of the two rules that start with opening_tag. The parser has to decide between the rules by looking at most one token ahead, but it cannot. <FOO> may lead to either rule, and this requires two more tokens of lookahead.
You can do this:
tag : /* nothing */
| opening_tag contents closing_tag
;
contents: tag
| anything
;
UPDATE This new grammar has a different shift/reduce conflict. (UPDATE2: or perhaps it's the same one). Because a tag can be empty, the parser cannot decide what to do at this input:
<Foo> <...
^
|
input is here
If the next symbol is a slash, then we have a closing tag, and the empty tag rule should be matched. If the next symbol is not a slash, then we have an opening tag, and the non-empty tag rule should be matched. But the parser cannot know which; it is only allowed to look at <.
The solution would be to create a new token, LANGLE_SLASH, for the </ combination.
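As a rough illustration of the LANGLE_SLASH idea, here is a Python sketch of a scanner that emits </ as one token. The question does not show the flex file, so the regular expressions are assumptions; LANGLE, RANGLE and ANYTHING are taken from the %token line, and tokenize is a hypothetical name.

```python
import re

# "</" is listed before "<" so the two-character token is tried first
TOKEN_RE = re.compile(
    r"(?P<LANGLE_SLASH></)"
    r"|(?P<LANGLE><)"
    r"|(?P<RANGLE>>)"
    r"|(?P<ANYTHING>[^<>\s]+)"
)


def tokenize(html):
    """Return the token names for html, ignoring whitespace."""
    return [m.lastgroup for m in TOKEN_RE.finditer(html)]
```

With this, tokenize("<b>hi</b>") gives LANGLE ANYTHING RANGLE ANYTHING LANGLE_SLASH ANYTHING RANGLE, so a single token of lookahead already tells the parser whether a tag is opening or closing.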
The problem is that tag can be empty, so that <x>< might be the beginning of opening_tag tag closing_tag or of opening_tag opening_tag. Consequently, bison cannot tell whether to reduce an empty tag before shifting the <.
You should be able to fix it by removing the empty production for tag and adding an explicit production for opening_tag closing_tag.

Generate Parse Tree in c

The following code is supposed to generate a parse tree of the input expression, but the problem is that the output only shows E, T, F, S (the functions used in the code). I want it to be something like:
a+b*c => E*c => E+b*c => a+b*c
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
char next;
void E(void);void T(void);
void S(void);void F(void);
void error(int);void scan(void);
void enter(char);
void leave(char);
void spaces(int);
int level = 0;
//The main should always be very simple
//First scan the string
//second check for end of string reached , if yes success and if not error.
//P ---> E '#'
int main(void){
printf("Input:");
scan(); E();
if (next != '#') error(1);
else printf("***** Successful parse *****\n");
}
//E ---> T {('+'|'-') T}
void E(void){
enter('E');
T();
while (next == '+' || next == '-') {
scan();
T();
}
leave('E');
}
//T ---> S {('*'|'/') S}
void T(void)
{
enter('T'); S();
while (next == '*' || next == '/') {
scan(); S();
}
leave('T');
}
//S ---> F '^' S | F
void S(void)
{
enter('S'); F();
if (next == '^') {
scan(); S();
}
leave('S');
}
//F ---> char | '(' E ')'
void F(void)
{
enter('F');
if (isalpha(next))
{
scan();
}
else if (next == '(') {
scan(); E();
if (next == ')')
scan();
else
error(2);
}
else {
error(3);
}
leave('F');
}
//Scan the entire input
void scan(void){
while (isspace(next = getchar()));
}
void error(int n)
{
printf("\n*** ERROR: %i\n", n);
exit(1);
}
void enter(char name)
{
spaces(level++);
printf("+-%c\n", name);
}
void leave(char name)
{
spaces(--level);
printf("+-%c\n", name);
}
//TO display the parse tree
void spaces(int local_level)
{
while (local_level-- > 0)
printf("| ");
}
Looks like a recursive descent parser. First, work out your grammar by hand. What you are expecting is not what your grammar says. You've got, from your comments,
E ---> T {('+'|'-') T} expression
T ---> S {('*'|'/') S} term
S ---> F '^' S | F subexpression?
F ---> char | '(' E ')' factor
The definitions of E and T put * at a higher precedence than + so there is no way that you will get E*c. If you want that, you'll have to switch the grammar to
E ---> T {('*'|'/') T} expression
T ---> S {('+'|'-') S} term
If you just want the output to include the rest of the expression:
Read the whole line in.
Change your scanner or lexer to get the next character from that line, and mark this position as the scanned point.
Change your enter routine to print the mnemonic as well as the line from the scanned point onwards.
You don't get to choose the parse tree. I guess you don't understand the output, but (again) I guess you'd like
a+b*c => F+b*c => S+b*c => T+b*c => T+F*c => T+S*c => T+S*F => T+S*S => T+T => E
So here are a few questions to help.
What kind of parsing is going on there: bottom-up or top-down?
What state is the parser in when it prints enter/leave E, or T?
If you answered bottom-up to the first question, what does E, T, S, F, F denote (enter E, enter T, enter S, enter F, leave F)? When you have leave F, that means the parser has successfully recognised a non-terminal.
Try the input string 1+b*c. What do you get? Why do you get an error after E, T, S, F?
The output you seem to require can be easily produced if you understand what is produced at the moment. Hope this helps.
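To make the chain of sentential forms above concrete, here is a Python sketch of the same recursive descent parser that records one form each time it leaves a non-terminal. It is a rewrite for illustration, not a patch to the C program: trace_reductions and the symbol stack are my own assumptions, and end of input plays the role of '#'.

```python
def trace_reductions(text):
    """Return each sentential form seen as the parser leaves a non-terminal."""
    src = "".join(text.split())  # drop whitespace, as scan() does
    pos = 0
    stack = []                   # symbols recognised so far
    forms = [src]

    def peek():
        return src[pos] if pos < len(src) else "#"

    def shift():
        nonlocal pos
        stack.append(src[pos])
        pos += 1

    def leave(name, start):
        # replace everything recognised since entry by the non-terminal
        del stack[start:]
        stack.append(name)
        forms.append("".join(stack) + src[pos:])

    def E():
        start = len(stack)
        T()
        while peek() in "+-":
            shift()
            T()
        leave("E", start)

    def T():
        start = len(stack)
        S()
        while peek() in "*/":
            shift()
            S()
        leave("T", start)

    def S():
        start = len(stack)
        F()
        if peek() == "^":
            shift()
            S()
        leave("S", start)

    def F():
        start = len(stack)
        if peek().isalpha():
            shift()
        elif peek() == "(":
            shift()
            E()
            if peek() != ")":
                raise ValueError("expected ')'")
            shift()
        else:
            raise ValueError("expected a letter or '('")
        leave("F", start)

    E()
    if pos != len(src):
        raise ValueError("unexpected trailing input")
    return forms
```

Joining the result for "a+b*c" with " => " reproduces the chain shown above, from a+b*c down to E. The input 1+b*c raises an error in F, because 1 is not a letter.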