SIGSEGV while creating a parser-tree

I am trying to create a parser-generator using flex/bison. This is my partial parser.y code:
func_definition : type_specifier ID LPAREN parameter_list RPAREN compound_statement
    {
        $$ = new Symbol_info();
        $$->code += "PROC:" + $2->symbol + "\n";
        if ($2->symbol != "main")
        {
            $$->code += "PUSH AX\n";
            $$->code += "PUSH BX\n";
            $$->code += "PUSH CX\n";
            $$->code += "PUSH DX\n";
        }
        $$->code += $6->code;
        if ($2->symbol != "main") {
            $$->code += "POP DX\n";
            $$->code += "POP CX\n";
            $$->code += "POP BX\n";
            $$->code += "POP AX\n";
        }
        fprintf(parseLog, "GRAMMER RULE: func_definition -> type_specifier ID LPAREN parameter_list RPAREN compound_statement \n");
    }
    ;
And this is my partial lex.l code.
{id} {
    Symbol_info *s = new Symbol_info(yytext, "ID");
    yylval = (YYSTYPE)s;
    return ID;
}
And this is my partial symbol_table.h code
class SymbolInfo {
    string type;
    string symbol;
public:
    string code;
    SymbolInfo *next;
    SymbolInfo() {
        symbol = "";
        type = "";
        code = "";
    }
    SymbolInfo(string symbol, string type) {
        this->symbol = symbol;
        this->type = type;
        code = "";
    }
    SymbolInfo(char *symbol, char *type) {
        this->symbol = string(symbol);
        this->type = string(type);
        code = "";
    }
    SymbolInfo(const SymbolInfo *sym) {
        symbol = sym->symbol;
        type = sym->type;
        code = sym->code;
    }
When I run the resulting program, I get a SIGSEGV (segmentation fault, reported as an address boundary error). The crash appears to happen when I access the yylval handed back by the lexer.

The crash occurs on a 64-bit Ubuntu instance (Ubuntu 17.10). I don't know why, but the same code runs fine on a 32-bit system (Ubuntu 14.10).
Maybe it's because of the larger integer sizes.

Related

LexYacc program gives error including implicit declaration of 'yylex'

I am studying compilers and learning Lex and Yacc. I wrote the following Lex/Yacc code, following my teacher's example.
Here is exp.l:
/*%option outfile="scanner.cpp"*/
%{
/*#include "exp.tab.h"*/
#include "y.tab.h"
extern int yylval;
%}
%%
0|[1-9][0-9]* { yylval = atoi(yytext); return INTEGER; }
[+*()\n] { return yytext[0]; }
. { /* do nothing */ }
%%
and this is exp.y:
/*
%output "parser.cpp"
%skeleton "lalr1.cc"
*/
%{
#include <stdio.h>
%}
%token INTEGER
%left '+'
%left '*'
%%
input : /* empty string */
| input line
;
line : '\n'
| exp '\n' { printf ("\t%d\n", $1); }
| error '\n'
;
exp : INTEGER { $$ = $1; }
| exp '+' exp { $$ = $1 + $3; }
| exp '*' exp { $$ = $1 * $3; }
| '(' exp ')' { $$ = $2; } ;
%%
main () {
yyparse ();
}
yyerror (char *s) {
printf ("%s\n", s);
}
and I build it with these Linux commands:
flex exp.l
bison -d exp.y
gcc exp.tab.c lex.yy.c -o exp -lfl
and it shows this:
exp.tab.c: In function ‘yyparse’:
exp.tab.c:1217:16: warning: implicit declaration of function ‘yylex’ [-Wimplicit-function-declaration]
1217 | yychar = yylex ();
|
exp.tab.c:1374:7: warning: implicit declaration of function ‘yyerror’; did you mean ‘yyerrok’? [-Wimplicit-function-declaration]
1374 | yyerror (YY_("syntax error"));
|
| yyerrok
exp.y: At top level:
exp.y:28:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
28 | main () {
|
exp.y:33:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
33 | yyerror (char *s) {
|
exp.l:4:10: fatal error: y.tab.h: No such file or directory
4 | #include "y.tab.h"
|
compilation terminated.
The whole program is mainly a calculator to calculate addition and multiplication.
I don't know what happened and hope someone can help me.

bison parse error in runtime confusion

I'm working in Ubuntu Linux under VMware on a bison-dlex project, and I have an error in my bison file that I can't get past. My "line" rule has a "logExp '\n'" alternative, but for some reason the parser never reaches it, even though it does recognize the expression as logExp.
line:
expr '\n' { printf("\nExpression = %d\n", $1); }
| logExp '\n' { printf("\nNEVER GETS HERE!!\n"); } //ERROR
;
logExp:
expr AND expr { $$ = 0 ; printf("\n$1=%d, $3=%d\n",$1,$3); } //PRINTS GOOD
| AND { }
;
input:
5&&6
output:
$1=5, $3=6
Error: parse error
If it recognizes logExp, why doesn't it go on to recognize the line rule that contains it?
Any help?

A parser program for the following grammar

Write a parser (both Yacc and Lex files) that uses the following productions and actions:
S -> cSS {print “x”}
S -> a {print “y”}
S -> b {print “z”}
Indicate the string that it will print when the input is cacba.
I am getting this error: when I give input to it, it says valid input and also says syntax error.
My Scanner Code is this
%{
#include "prac.h"
%}
%%
[c] {return C; }
[a] {return A; }
[b] {return B; }
[ \t] ;
\n { return 0; }
. { return yytext[0]; }
%%
int yywrap(void) {
return 1;
}
And my yacc code is this:
%{
#include <stdio.h>
%}
%token A B C
%%
statement: S {printf("Valid Input"); }
;
S: C S S {printf("Print x\n");}
| A {printf("Print y\n");}
| B {printf("Print z\n");}
;
%%
int main()
{
return yyparse();
}
yyerror(char *s)
{
printf("\n%s\n",s);
printf("Invalid Input");
fprintf(stderr,"At line %d %s ",s,yylineno);
}
How can I fix this?

(Comments converted to an answer)
@ChrisDodd wrote:
Best guess: you're running on Windows, so you're getting a \r (carriage return) character before the newline, which is causing your error. Try adding \r to the [ \t] pattern to ignore it.
@Cyclone wrote:
Change your fprintf() statement to fprintf(stderr, "At line %d %s", yylineno, s); though that won't solve your problem.
The OP wrote:
You mean I should add \r to \t, so the new regex will be [\r\t]. Am I right?
@rici wrote:
@chris suggests [ \r\t]. If you have Windows somewhere in the loop, I agree.

Useless rule in Bison

For some reason bison is rejecting a specific rule, notequal_expression. Bear in mind that I'm just starting to learn the whole concept, so my line of thought is not very mature. The error is: "string.y contains 1 useless nonterminal and 1 useless rule." The input file:
/* Parser for StringC */
%{
/* ------------------------------------------------------------------
Initial code (copied verbatim to the output file)
------------------------------------------------------------------ */
// Includes
#include <malloc.h> // _alloca is used by the parser
#include <string.h> // strcpy
#include "lex.h" // the lexer
// Some yacc (bison) defines
#define YYDEBUG 1 // Generate debug code; needed for YYERROR_VERBOSE
#define YYERROR_VERBOSE // Give a more specific parse error message
// Error-reporting function must be defined by the caller
void Error (char *format, ...);
// Forward references
void yyerror (char *msg);
%}
/* ------------------------------------------------------------------
Yacc declarations
------------------------------------------------------------------ */
/* The structure for passing value between lexer and parser */
%union {
char *str;
}
%token ERROR_TOKEN IF ELSE PRINT INPUT ASSIGN EQUAL NOTEQUAL
%token CONCAT END_STMT OPEN_PAR CLOSE_PAR
%token BEGIN_CS END_CS
%token <str> ID STRING BOOLEAN
/*%type <type> type simple_type cast*/
%expect 1 /* shift/reduce conflict: dangling ELSE */
/* declaration */
%%
/* ------------------------------------------------------------------
Yacc grammar rules
------------------------------------------------------------------ */
program
: statement_list
;
statement_list
: statement_list statement
| /* empty */
;
statement
: END_STMT {puts ("Empty statement");}
| expression END_STMT {puts ("Expression statement");}
| PRINT expression END_STMT {puts ("Print statement");}
| INPUT identifier END_STMT {puts ("Input statement");}
| if_statement {puts ("If statement");}
| compound_statement {puts ("Compound statement");}
| error END_STMT {puts ("Error statement");}
| notequal_expression {puts ("Not equal statement");}
;
/* NOTE: This rule causes an unresolvable shift/reduce conflict;
That's why %expect 1 was added (see above) */
if_statement
: IF OPEN_PAR expression CLOSE_PAR statement optional_else_statement
;
optional_else_statement
: ELSE statement
| /* empty */
;
compound_statement
: BEGIN_CS statement_list END_CS
;
expression
: equal_expression
| OPEN_PAR expression CLOSE_PAR
;
equal_expression
: expression EQUAL assign_expression
| assign_expression
;
notequal_expression
: expression NOTEQUAL assign_expression
| NOTEQUAL assign_expression
;
assign_expression
: identifier ASSIGN assign_expression
| concat_expression
;
concat_expression
: concat_expression CONCAT simple_expression
| simple_expression
;
simple_expression
: identifier
| string
;
identifier
: ID {}
;
string
: STRING {}
;
bool
: BOOLEAN {}
;
%%
/* ------------------------------------------------------------------
Additional code (again copied verbatim to the output file)
------------------------------------------------------------------ */
The lexer:
/* Lexical analyzer for StringC */
%{
/* ------------------------------------------------------------------
Initial code (copied verbatim to the output file)
------------------------------------------------------------------ */
// Includes
#include <string.h> // strcpy, strncpy
#include <io.h> // isatty
#ifdef MSVC
#define isatty _isatty // for some reason isatty is called _isatty in VC..
#endif
#define _LEX_CPP_ // make sure our variables get created
#include "lex.h"
#include "lexsymb.h"
extern "C" int yywrap (); // the yywrap function is declared by the caller
// Forward references
void Identifier ();
void StringConstant ();
void BoolConstant ();
void EatComment ();
//// End of inititial code
%}
/* ------------------------------------------------------------------
Some macros (standard regular expressions)
------------------------------------------------------------------ */
LETTER [a-zA-Z_]
DIGIT [0-9]
IDENT {LETTER}({LETTER}|{DIGIT})*
STR \"[^\"]*\"
BOOL \(false|true)\
WSPACE [ \t]+
/* ------------------------------------------------------------------
The lexer rules
------------------------------------------------------------------ */
%%
"if" {return IF;}
"else" {return ELSE;}
"print" {return PRINT;}
"input" {return INPUT;}
"=" {return ASSIGN;}
"==" {return EQUAL;}
"!=" {return NOTEQUAL;} /* Not equal to */
"+" {return CONCAT;}
";" {return END_STMT;}
"(" {return OPEN_PAR;}
")" {return CLOSE_PAR;}
"{" {return BEGIN_CS;}
"}" {return END_CS;}
{BOOL} {BoolConstant (); return BOOLEAN;}
{STR} {StringConstant (); return STRING;}
{IDENT} {Identifier (); return ID;}
"//" {EatComment();} /* comment: skip */
\n {lineno++;} /* newline: count lines */
{WSPACE} {} /* whitespace: (do nothing) */
. {return ERROR_TOKEN;} /* other char: error, illegal token */
%%
/* ------------------------------------------------------------------
Additional code (again copied verbatim to the output file)
------------------------------------------------------------------ */
// The comment-skipping function: skip to end-of-line
void EatComment() {
char c;
while ((c = yyinput()) != '\n' && c != 0);
lineno++;
}
// Pass the id name
void Identifier () {
yylval.str = new char[strlen(yytext)+1];
strcpy (yylval.str, yytext);
}
// Pass the string constant
void StringConstant() {
int l = strlen(yytext)-2;
yylval.str = new char[l+1];
strncpy (yylval.str, &yytext[1], l); yylval.str[l] = 0;
}
void BoolConstant() {
int l = strlen(yytext)-2;
yylval.str = new char[l+1];
strncpy(yylval.str, &yytext[1], l); yylval.str[l] = 0;
}

Are you sure that it's notequal_expression that is causing the issue? The nonterminal and rule that are not used, as I read it, are
bool
: BOOLEAN {}
;
Perhaps instead of
simple_expression
: identifier
| string
;
you intended to code
simple_expression
: identifier
| string
| bool
;

There are two problems with the grammar. The first is the shift/reduce conflict you've already seen (and addressed with %expect 1). I prefer to address it in the grammar and use %expect 0 instead. You can do that by removing ELSE from the %token list and adding the line
%right THEN ELSE
to declare right associativity. Your language doesn't actually have a THEN keyword, but that's fine. You can then completely remove the rule for optional_else_statement and reword the rule for if_statement as follows:
if_statement
: IF OPEN_PAR expression CLOSE_PAR statement %prec THEN
| IF OPEN_PAR expression CLOSE_PAR statement ELSE statement
;
There are those who prefer to resolve it this way, and others who advocate the %expect 1 approach. I prefer this way, but now that you have both methods, you can certainly choose for yourself.
For the other problem, the useless rule is definitely this one:
bool
: BOOLEAN {}
;
because the non-terminal bool is not used anywhere else in the grammar. This accounts for both "1 useless nonterminal and 1 useless rule" as reported by bison. To be able to identify this kind of thing for yourself, you can use
bison --report=solved -v string.y
This will create a string.output file containing a large but readable report, including any resolved shift/reduce conflicts (such as your IF-ELSE construction) and the complete set of states created by bison. It's very useful when troubleshooting grammar problems.

Shift/Reduce conflict when I introduce an action in yacc

I am writing a front end for my C compiler and am currently adding a type system. Previously I assumed everything was an int, and hence the following rule worked fine.
declaration: datatype varList ';' { gTrace<<"declaration ";}
varList: IDENTIFIER { builder.addSymbol($1); }
| varList',' IDENTIFIER { builder.addSymbol($3); }
;
But now I also add type to the symbol, and hence modified my rule like below:
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
varList: IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
I get a shift/reduce conflict when I do that, since { currentType = $1; } is treated as an empty rule. How do I get around this? Is there a way to specify that it is just an action?
Attached below is a snippet from my y.output
32 $@6: /* empty */
33 declaration: datatype $@6 varList ';'
34 varList: IDENTIFIER
35 | varList ',' IDENTIFIER

I don't get any error or warnings:
%token INT
%token FLOAT
%token CHAR
%token IDENTIFIER
%%
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
varList : IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList ',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
datatype: INT
| FLOAT
| CHAR
;
%%
Command
% bison p.yacc
%
I think you will need to provide more information: the full yacc file and the parameters you are passing to yacc/bison.
Edit
I tried your file (as per the comment) I still get no errors or warnings:
> yacc --version
bison (GNU Bison) 2.3
Written by Robert Corbett and Richard Stallman.
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I fixed the problem as below:
declaration: datatype varList ';' { gTrace<<"declaration "; currentType = -1; }
varList: IDENTIFIER { builder.addSymbol($1, getType(currentType)); }
| varList',' IDENTIFIER { builder.addSymbol($3, getType(currentType)); }
;
datatype: INTEGER { gTrace<<"int "; $$ = currentType = Type::IntegerTy; }
| FLOAT { gTrace<<"float "; $$ = currentType = Type::FloatTy; }
| VOID { gTrace<<"void "; $$ = currentType = Type::VoidTy; }
;
@sarnold, hope this helps!

I think you can only define one action block per rule, so
declaration: datatype { currentType = $1; } varList ';' { gTrace<<"declaration "; currentType = -1; }
should be done as
declaration: datatype varList ';' { currentType = $1; gTrace<<"declaration "; currentType = -1; }
Anyway, you are setting currentType to the lexical value of datatype and then to -1 right after.
