For some reason bison is rejecting a specific rule, notequal_expression. Bear in mind that I'm just starting to learn the whole concept, so my line of thought is not very mature. The error is: "string.y contains 1 useless nonterminal and 1 useless rule." The input file:
/* Parser for StringC */
%{
/* ------------------------------------------------------------------
Initial code (copied verbatim to the output file)
------------------------------------------------------------------ */
// Includes
#include <malloc.h> // _alloca is used by the parser
#include <string.h> // strcpy
#include "lex.h" // the lexer
// Some yacc (bison) defines
#define YYDEBUG 1 // Generate debug code; needed for YYERROR_VERBOSE
#define YYERROR_VERBOSE // Give a more specific parse error message
// Error-reporting function must be defined by the caller
void Error (char *format, ...);
// Forward references
void yyerror (char *msg);
%}
/* ------------------------------------------------------------------
Yacc declarations
------------------------------------------------------------------ */
/* The structure for passing value between lexer and parser */
%union {
char *str;
}
%token ERROR_TOKEN IF ELSE PRINT INPUT ASSIGN EQUAL NOTEQUAL
%token CONCAT END_STMT OPEN_PAR CLOSE_PAR
%token BEGIN_CS END_CS
%token <str> ID STRING BOOLEAN
/*%type <type> type simple_type cast*/
%expect 1 /* shift/reduce conflict: dangling ELSE */
/* declaration */
%%
/* ------------------------------------------------------------------
Yacc grammar rules
------------------------------------------------------------------ */
program
: statement_list
;
statement_list
: statement_list statement
| /* empty */
;
statement
: END_STMT {puts ("Empty statement");}
| expression END_STMT {puts ("Expression statement");}
| PRINT expression END_STMT {puts ("Print statement");}
| INPUT identifier END_STMT {puts ("Input statement");}
| if_statement {puts ("If statement");}
| compound_statement {puts ("Compound statement");}
| error END_STMT {puts ("Error statement");}
| notequal_expression {puts ("Not equal statement");}
;
/* NOTE: This rule causes an unresolvable shift/reduce conflict;
That's why %expect 1 was added (see above) */
if_statement
: IF OPEN_PAR expression CLOSE_PAR statement optional_else_statement
;
optional_else_statement
: ELSE statement
| /* empty */
;
compound_statement
: BEGIN_CS statement_list END_CS
;
expression
: equal_expression
| OPEN_PAR expression CLOSE_PAR
;
equal_expression
: expression EQUAL assign_expression
| assign_expression
;
notequal_expression
: expression NOTEQUAL assign_expression
| NOTEQUAL assign_expression
;
assign_expression
: identifier ASSIGN assign_expression
| concat_expression
;
concat_expression
: concat_expression CONCAT simple_expression
| simple_expression
;
simple_expression
: identifier
| string
;
identifier
: ID {}
;
string
: STRING {}
;
bool
: BOOLEAN {}
;
%%
/* ------------------------------------------------------------------
Additional code (again copied verbatim to the output file)
------------------------------------------------------------------ */
The lexer:
/* Lexical analyzer for StringC */
%{
/* ------------------------------------------------------------------
Initial code (copied verbatim to the output file)
------------------------------------------------------------------ */
// Includes
#include <string.h> // strcpy, strncpy
#include <io.h> // isatty
#ifdef MSVC
#define isatty _isatty // for some reason isatty is called _isatty in VC..
#endif
#define _LEX_CPP_ // make sure our variables get created
#include "lex.h"
#include "lexsymb.h"
extern "C" int yywrap (); // the yywrap function is declared by the caller
// Forward references
void Identifier ();
void StringConstant ();
void BoolConstant ();
void EatComment ();
//// End of initial code
%}
/* ------------------------------------------------------------------
Some macros (standard regular expressions)
------------------------------------------------------------------ */
LETTER [a-zA-Z_]
DIGIT [0-9]
IDENT {LETTER}({LETTER}|{DIGIT})*
STR \"[^\"]*\"
BOOL \(false|true)\
WSPACE [ \t]+
/* ------------------------------------------------------------------
The lexer rules
------------------------------------------------------------------ */
%%
"if" {return IF;}
"else" {return ELSE;}
"print" {return PRINT;}
"input" {return INPUT;}
"=" {return ASSIGN;}
"==" {return EQUAL;}
"!=" {return NOTEQUAL;} /* Not equal to */
"+" {return CONCAT;}
";" {return END_STMT;}
"(" {return OPEN_PAR;}
")" {return CLOSE_PAR;}
"{" {return BEGIN_CS;}
"}" {return END_CS;}
{BOOL} {BoolConstant (); return BOOLEAN;}
{STR} {StringConstant (); return STRING;}
{IDENT} {Identifier (); return ID;}
"//" {EatComment();} /* comment: skip */
\n {lineno++;} /* newline: count lines */
{WSPACE} {} /* whitespace: (do nothing) */
. {return ERROR_TOKEN;} /* other char: error, illegal token */
%%
/* ------------------------------------------------------------------
Additional code (again copied verbatim to the output file)
------------------------------------------------------------------ */
// The comment-skipping function: skip to end-of-line
void EatComment() {
char c;
while ((c = yyinput()) != '\n' && c != 0);
lineno++;
}
// Pass the id name
void Identifier () {
yylval.str = new char[strlen(yytext)+1];
strcpy (yylval.str, yytext);
}
// Pass the string constant
void StringConstant() {
int l = strlen(yytext)-2;
yylval.str = new char[l+1];
strncpy (yylval.str, &yytext[1], l); yylval.str[l] = 0;
}
void BoolConstant() {
int l = strlen(yytext)-2;
yylval.str = new char[l+1];
strncpy(yylval.str, &yytext[1], l); yylval.str[l] = 0;
}
Are you sure that it's notequal_expression that is causing the issue? The nonterminal and rule that are not used, as I read it, are
bool
: BOOLEAN {}
;
Perhaps instead of
simple_expression
: identifier
| string
;
you intended to code
simple_expression
: identifier
| string
| bool
;
There are two problems with the grammar. The first is the shift/reduce conflict you've already seen (and addressed with %expect 1). I prefer to address it in the grammar itself and use %expect 0. You can do that by removing ELSE from the %token list and adding the line
%right THEN ELSE
This declares right associativity. Your language doesn't actually have a THEN keyword, but that's fine. You can then completely remove the rule for optional_else_statement and reword the rule for if_statement as follows:
if_statement
: IF OPEN_PAR expression CLOSE_PAR statement %prec THEN
| IF OPEN_PAR expression CLOSE_PAR statement ELSE statement
;
There are those who prefer to resolve it this way, and others who advocate the %expect 1 approach. I prefer this way, but now that you have both methods, you can certainly choose for yourself.
For the other problem, the useless rule is definitely this one:
bool
: BOOLEAN {}
;
because the non-terminal bool is not used anywhere else in the grammar. This accounts for both "1 useless nonterminal and 1 useless rule" as reported by bison. To be able to identify this kind of thing for yourself, you can use
bison --report=solved -v string.y
This will create a string.output file containing a large but readable report, including any resolved shift/reduce conflicts (such as your IF-ELSE construction) and the complete set of states created by bison. It's very useful when troubleshooting grammar problems.
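To make the "not used anywhere else in the grammar" reasoning concrete, here is a minimal C sketch of a reachability check over a hand-reduced excerpt of the grammar above. The rule table, the struct, and the function names are all hypothetical illustrations. This is not how bison implements its analysis, and bison additionally flags nonterminals that can never derive a terminal string, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RHS 4

struct rule {
    const char *lhs;            /* nonterminal being defined */
    const char *rhs[MAX_RHS];   /* right-hand side symbols, NULL-padded */
};

/* Hypothetical, reduced excerpt of the StringC grammar (assumption:
   only enough rules are kept to show the idea). */
static const struct rule rules[] = {
    {"program",           {"statement_list"}},
    {"statement_list",    {"statement_list", "statement"}},
    {"statement",         {"expression", "END_STMT"}},
    {"expression",        {"simple_expression"}},
    {"simple_expression", {"identifier"}},
    {"simple_expression", {"string"}},
    {"identifier",        {"ID"}},
    {"string",            {"STRING"}},
    {"bool",              {"BOOLEAN"}},
};

#define NRULES (sizeof rules / sizeof rules[0])

/* A symbol is reachable if it is the start symbol or appears on the
   right-hand side of a rule already known to be live. */
static int symbol_reachable(const char *sym, const char *start, const int *live)
{
    if (strcmp(sym, start) == 0)
        return 1;
    for (size_t i = 0; i < NRULES; i++) {
        if (!live[i])
            continue;
        for (size_t j = 0; j < MAX_RHS && rules[i].rhs[j] != NULL; j++)
            if (strcmp(rules[i].rhs[j], sym) == 0)
                return 1;
    }
    return 0;
}

/* Iterates to a fixed point, then counts rules whose left-hand side
   can never be reached from the start symbol. */
int count_unreachable_rules(const char *start)
{
    int live[NRULES] = {0};
    int changed = 1;
    int dead = 0;

    assert(start != NULL);
    while (changed) {
        changed = 0;
        for (size_t i = 0; i < NRULES; i++) {
            if (!live[i] && symbol_reachable(rules[i].lhs, start, live)) {
                live[i] = 1;
                changed = 1;
            }
        }
    }
    for (size_t i = 0; i < NRULES; i++)
        if (!live[i])
            dead++;
    return dead;
}
```

Calling count_unreachable_rules("program") on this table reports one unreachable rule, the one defining bool, which matches the "1 useless nonterminal and 1 useless rule" wording of the message.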
I am studying compilers and learning Lex and Yacc. I wrote some Lex/Yacc code following my teacher's example.
Here is exp.l:
/*%option outfile="scanner.cpp"*/
%{
/*#include "exp.tab.h"*/
#include "y.tab.h"
extern int yylval;
%}
%%
0|[1-9][0-9]* { yylval = atoi(yytext); return INTEGER; }
[+*()\n] { return yytext[0]; }
. { /* do nothing */ }
%%
and this is exp.y:
/*
%output "parser.cpp"
%skeleton "lalr1.cc"
*/
%{
#include <stdio.h>
%}
%token INTEGER
%left '+'
%left '*'
%%
input : /* empty string */
| input line
;
line : '\n'
| exp '\n' { printf ("\t%d\n", $1); }
| error '\n'
;
exp : INTEGER { $$ = $1; }
| exp '+' exp { $$ = $1 + $3; }
| exp '*' exp { $$ = $1 * $3; }
| '(' exp ')' { $$ = $2; } ;
%%
main () {
yyparse ();
}
yyerror (char *s) {
printf ("%s\n", s);
}
and I use these Linux commands to build it:
flex exp.l
bison -d exp.y
gcc exp.tab.c lex.yy.c -o exp -lfl
and it shows this:
exp.tab.c: In function ‘yyparse’:
exp.tab.c:1217:16: warning: implicit declaration of function ‘yylex’ [-Wimplicit-function-declaration]
1217 | yychar = yylex ();
|
exp.tab.c:1374:7: warning: implicit declaration of function ‘yyerror’; did you mean ‘yyerrok’? [-Wimplicit-function-declaration]
1374 | yyerror (YY_("syntax error"));
|
| yyerrok
exp.y: At top level:
exp.y:28:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
28 | main () {
|
exp.y:33:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
33 | yyerror (char *s) {
|
exp.l:4:10: fatal error: y.tab.h: No such file or directory
4 | #include "y.tab.h"
|
compilation terminated.
The whole program is mainly a calculator to calculate addition and multiplication.
I don't know what happened and hope someone can help me.
I am trying to create a parser-generator using flex/bison. This is my partial parser.y code:
func_definition : type_specifier ID LPAREN parameter_list RPAREN compound_statement
{
$$=new Symbol_info();
$$->code+="PROC:"+ $2->symbol+"\n";
if($2->symbol!="main")
{
$$->code+="PUSH AX\n";
$$->code+="PUSH BX\n";
$$->code+="PUSH CX\n";
$$->code+="PUSH DX\n";
}
$$->code += $6->code ;
if($2->symbol!="main") {
$$->code+="POP DX\n";
$$->code+="POP CX\n";
$$->code+="POP BX\n";
$$->code+="POP AX\n";
}
fprintf(parseLog, "GRAMMER RULE: func_definition -> type_specifier ID LPAREN parameter_list RPAREN compound_statement \n");
}
;
And this is my partial lex.l code.
{id} {
Symbol_info *s= new Symbol_info(yytext, "ID");
yylval = (YYSTYPE)s;
return ID;
}
And this is my partial symbol_table.h code
class SymbolInfo{
string type;
string symbol;
public:
string code;
SymbolInfo *next;
SymbolInfo(){
symbol="";
type="";
code="";
}
SymbolInfo(string symbol, string type){
this->symbol=symbol;
this->type=type;
code="";
}
SymbolInfo(char *symbol, char *type){
this->symbol=string(symbol);
this->type= string(type);
code="";
}
SymbolInfo(const SymbolInfo *sym){
symbol=sym->symbol;
type=sym->type;
code=sym->code;
}
So, when I create a program, I get a SIGSEGV segmentation fault. (Address boundary error). It appears that I get that error when I try to access the yylval returned to me by the lex function.
I tried to run this code on an Ubuntu 64-bit instance (Ubuntu 17.10). I don't know why but the same code runs fine on a 32 bit system (Ubuntu 14.10).
Maybe it's because of the large Integer sizes. Here is the code if you're interested.
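One plausible explanation, offered as an assumption because the snippets show neither a %union nor a YYSTYPE definition: if YYSTYPE was left at bison's default of int, the cast yylval = (YYSTYPE)s stores a pointer in an int. On a typical 32-bit Linux build both are 4 bytes, so the pointer survives; on a typical 64-bit build the pointer is 8 bytes and the upper half is lost. The sketch below uses hypothetical helper names to show the difference between an int-sized slot and a pointer-sized one.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when an int is wide enough to hold a data pointer. */
int pointer_fits_in_int(void)
{
    return sizeof(void *) <= sizeof(int);
}

/* Mimics storing a pointer in an int-sized semantic value and reading
   it back. On platforms where int is narrower than a pointer, the
   result may differ from the original pointer. */
void *round_trip_via_int(void *p)
{
    int narrowed = (int)(uintptr_t)p;   /* what an int-sized slot keeps */
    return (void *)(uintptr_t)(unsigned int)narrowed;
}

/* Same round trip through uintptr_t, which is defined to be wide
   enough to hold any object pointer. */
void *round_trip_via_uintptr(void *p)
{
    uintptr_t wide = (uintptr_t)p;
    assert((void *)wide == p);
    return (void *)wide;
}
```

If this is the cause, giving the semantic value a pointer type removes the need for the cast altogether, for example with a %union that has a Symbol_info * member, or by defining YYSTYPE as Symbol_info * in a header that both the lexer and the parser include.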
Write a parser (both Yacc and Lex files) that uses the following productions and actions:
S -> cSS {print “x”}
S -> a {print “y”}
S -> b {print “z”}
Indicate the string that it will print when the input is cacba.
I am getting this error: when I give input to it, it says valid input and also says syntax error.
My scanner code is this:
%{
#include "prac.h"
%}
%%
[c] {return C; }
[a] {return A; }
[b] {return B; }
[ \t] ;
\n { return 0; }
. { return yytext[0]; }
%%
int yywrap(void) {
return 1;
}
And my yacc code is this:
%{
#include <stdio.h>
%}
%token A B C
%%
statement: S {printf("Valid Input"); }
;
S: C S S {printf("Print x\n");}
| A {printf("Print y\n");}
| B {printf("Print z\n");}
;
%%
int main()
{
return yyparse();
}
yyerror(char *s)
{
printf("\n%s\n",s);
printf("Invalid Input");
fprintf(stderr,"At line %d %s ",s,yylineno);
}
How can I fix this?
(Comments converted to an answer)
@ChrisDodd wrote:
Best guess: you're running on Windows, so you're getting a \r (carriage return) character before the newline, which is causing your error. Try adding \r to the [ \t] pattern to ignore it.
@Cyclone wrote:
Change your fprintf() statement to fprintf(stderr, "At line %d %s", yylineno, s); the arguments are in the wrong order for the format string. That will not solve your problem, though.
The OP wrote:
You mean I should add \r to \t, so the new regex will be [\r\t]? Am I right?
@rici wrote:
@chris suggests [ \r\t]. If you have Windows somewhere in the loop, I agree.
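For the original exercise, the expected output for cacba can be checked by hand. The sketch below is a small recursive-descent tracer with hypothetical names, not the LALR parser that yacc generates. Because every action in the grammar sits at the end of its rule, the letters are recorded in the same order in which the generated parser would perform its reductions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct trace {
    char  *out;   /* buffer receiving the printed letters */
    size_t len;   /* letters written so far */
    size_t cap;   /* total buffer size, including the terminator */
};

/* Appends one letter; returns 0 if the buffer is full. */
static int emit(struct trace *t, char ch)
{
    if (t->len + 1 >= t->cap)
        return 0;
    t->out[t->len++] = ch;
    t->out[t->len] = '\0';
    return 1;
}

/* S -> c S S {x} | a {y} | b {z}; the action runs after the whole
   right-hand side has been recognised. */
static int parse_S(const char *in, size_t *pos, struct trace *t)
{
    switch (in[*pos]) {
    case 'c':
        (*pos)++;
        return parse_S(in, pos, t) && parse_S(in, pos, t) && emit(t, 'x');
    case 'a':
        (*pos)++;
        return emit(t, 'y');
    case 'b':
        (*pos)++;
        return emit(t, 'z');
    default:
        return 0;   /* unexpected character or end of input */
    }
}

/* Returns 1 and fills `out` if the whole input is exactly one S. */
int trace_actions(const char *in, char *out, size_t cap)
{
    struct trace t = { out, 0, cap };
    size_t pos = 0;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    return parse_S(in, &pos, &t) && pos == strlen(in);
}
```

With a 32-byte buffer, trace_actions("cacba", buf, sizeof buf) succeeds and leaves yzyxx in buf. An input with a stray trailing character, such as "cacba\r", is rejected after a complete S has already been recognised, which is consistent with seeing both a success message and a syntax error.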
I am new to flex and Bison. The following bison file does not compile to produce .cpp and .h files:
If I remove the code that supports the if statement, it works. That is, I remove the |ifStmt{$$=$1;} alternative from the statement: rule, remove the rule ifStmt: TOKEN_IF expression TOKEN_DO statement TOKEN_ELSE TOKEN_DO statement{$$=makeIf($2, $3, $7);}; and delete TOKEN_IF and TOKEN_ELSE from the token declaration.
%error-verbose /* instruct bison to generate verbose error messages*/
%{
#include "astgen.h"
#define YYDEBUG 1
/* Since the parser must return the AST, it must get a parameter where
* the AST can be stored. The type of the parameter will be void*. */
struct AstElement* astDest;
extern int yylex();
%}
%union {
int val;
char op;
char* name;
struct AstElement* ast; /* this is the new member to store AST elements */
}
%token TOKEN_BEGIN TOKEN_END TOKEN_WHILE TOKEN_DO TOKEN_IF TOKEN_ELSE
%token<name> TOKEN_ID
%token<val> TOKEN_NUMBER
%token<op> TOKEN_OPERATOR
%type<ast> program block statements statement assignment expression whileStmt call
%start program
%{
/* Forward declarations */
void yyerror(const char* const message);
%}
%%
program: statement';' { astDest = $1; };
block: TOKEN_BEGIN statements TOKEN_END{ $$ = $2; };
statements: {$$=0;}
| statements statement ';' {$$=makeStatement($1, $2);}
| statements block';' {$$=makeStatement($1, $2);};
statement:
assignment {$$=$1;}
| whileStmt {$$=$1;}
| ifStmt{$$=$1;}
| block {$$=$1;}
| call {$$=$1;}
assignment: TOKEN_ID '=' expression {$$=makeAssignment($1, $3);}
expression: TOKEN_ID {$$=makeExpByName($1);}
| TOKEN_NUMBER {$$=makeExpByNum($1);}
| expression TOKEN_OPERATOR expression {$$=makeExp($1, $3, $2);}
whileStmt: TOKEN_WHILE expression TOKEN_DO statement{$$=makeWhile($2, $4);};
ifStmt: TOKEN_IF expression TOKEN_DO statement TOKEN_ELSE TOKEN_DO statement{$$=makeIf($2, $4, $7);};
call: TOKEN_ID '(' expression ')' {$$=makeCall($1, $3);};
%%
#include "astexec.h"
#include <stdlib.h>
void yyerror(const char* const message)
{
fprintf(stderr, "Parse error:%s\n", message);
exit(1);
}
And the lex file for the above is:
%option noyywrap
%{
#include "parser.tab.h"
#include <stdlib.h>
%}
%option noyywrap
%%
"while" return TOKEN_WHILE;
"{" return TOKEN_BEGIN;
"}" return TOKEN_END;
"do" return TOKEN_DO;
"if" return TOKEN_IF;
"else" return TOKEN_ELSE;
"==" {yylval.op = *yytext; return TOKEN_OPERATOR;}
"!=" {yylval.op = *yytext; return TOKEN_OPERATOR;}
[a-zA-Z_][a-zA-Z0-9_]* {yylval.name = _strdup(yytext); return TOKEN_ID;}
[-]?[0-9]+ {yylval.val = atoi(yytext); return TOKEN_NUMBER;}
[()=;] {return *yytext;}
"<=" {yylval.op = *yytext; return TOKEN_OPERATOR;}
">=" {yylval.op = *yytext; return TOKEN_OPERATOR;}
[*/+-<>] {yylval.op = *yytext; return TOKEN_OPERATOR;}
[ \t\n] {/* suppress the output of the whitespaces from the input file to stdout */}
#.* {/* one-line comment */}
%%
What am I doing wrong here?
You're missing the %type declaration for ifStmt, as the error message from bison tells you:
t.y:46.17-18: $1 of `statement' has no declared type
t.y:58.78-79: $$ of `ifStmt' has no declared type
t.y:58.92-93: $3 of `ifStmt' has no declared type
Adding ifStmt to the declaration %type<ast> on line 23 will fix the first 2 errors; the third can be fixed by using $4 instead of $3, because $3 is TOKEN_DO, which carries no value, while the statement you want is the fourth symbol.
I am writing a front end for my C compiler and am currently adding a type system to it. Previously I assumed everything was an int, so the following rule worked fine.
declaration: datatype varList ';' { gTrace<<"declaration ";}
varList: IDENTIFIER { builder.addSymbol($1); }
| varList',' IDENTIFIER { builder.addSymbol($3); }
;
But now I also add type to the symbol, and hence modified my rule like below:
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
varList: IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
I get a shift/reduce conflict when I do that, since { currentType = $1; } is treated as an empty rule. How do I resolve this? Is there a way to specify that it is just an action?
Attached below is a snippet from my y.output
32 $#6: /* empty */
33 declaration: datatype $#6 varList ';'
34 varList: IDENTIFIER
35 | varList ',' IDENTIFIER
I don't get any errors or warnings:
%token INT
%token FLOAT
%token CHAR
%token IDENTIFIER
%%
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
varList : IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList ',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
datatype: INT
| FLOAT
| CHAR
;
%%
Command
% bison p.yacc
%
I think you will need to provide more information: the full yacc file and the parameters you are passing to yacc/bison.
Edit
I tried your file (as per the comment), and I still get no errors or warnings:
> yacc --version
bison (GNU Bison) 2.3
Written by Robert Corbett and Richard Stallman.
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I fixed the problem as below:
declaration: datatype varList ';' { gTrace<<"declaration "; currentType = -1; }
varList: IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
datatype: INTEGER { gTrace<<"int "; $$ = currentType = Type::IntegerTy; }
| FLOAT { gTrace<<"float "; $$ = currentType = Type::FloatTy; }
| VOID { gTrace<<"void "; $$ = currentType = Type::VoidTy; }
;
@sarnold, hope this helps!
I think you can only define one action block per rule, so
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
should be done as
declaration: datatype varList ';' { currentType = $1; gTrace<<"declaration "; currentType = -1; }
Anyway, you are setting currentType to the semantic value of datatype and then to -1 right after.