Using Lex Yacc compiled .c file in C++ [duplicate] - c++

Possible Duplicate:
Good tools for creating a C/C++ parser/analyzer
String input to flex lexer
My idea is to create a parser that evaluates Boolean expressions. My next step is to use it in my C++ program, but I don't know how.
Currently the calculator runs from the command line. The code is not polished, and I don't know how to call it from my program. I want a function lex_yacc(var) that runs the calculator on the input var. For example, if the main program reads (T+F) into var, it is passed to lex_yacc(var), and the function returns 1.
I define lexya.l as follows:
%{
#include <stdlib.h>
void yyerror(char *);
#include "lexya_a1.tab.h"
%}
%%
"T" { yylval = 1; return boolean; }
"F" { yylval = 0; return boolean; }
"!F" { yylval = 1; return boolean; }
"!T" { yylval = 0; return boolean; }
[+*\n] return *yytext;
"(" return *yytext;
")" return *yytext;
[\t] ;/* .... */
. yyerror("....");
%%
int yywrap(void) {
return 1;
}
And lexya_a1.y:
%{
#include <stdio.h>  /* printf is used in the actions and in yyerror */
#include <stdlib.h>
int yylex(void);
void yyerror(char *);
%}
%token boolean
%left '+' '-'
%left '*'
%left '(' ')'
%%
program:
program expr '\n' { printf("%d\n", $2); }
|
;
expr:
boolean { $$ = $1; }
| expr '*' expr { $$ = $1 * $3; }
| expr '+' expr { $$ = $1 || $3; } /* logical OR, so T+T yields 1 rather than 2 */
| '(' expr ')' { $$ = $2; }
;
%%
void yyerror(char *s) {
printf("%s\n", s);
}
int main(void) {
yyparse();
return 0;
}
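One way to get the lex_yacc(var) interface described above is to stop reading from stdin and stop printing inside the grammar action. With flex, the usual route is to hand the string to the scanner with yy_scan_string(), call yyparse(), and have the action store the result in a variable that the wrapper returns. Since the generated files are not shown here, the block below is not the lex/yacc output but a hand-written recursive-descent stand-in that only illustrates the behaviour such a wrapper would have. The BoolParser struct, the general '!' prefix, and the -1 error convention are assumptions made for this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the generated parser: evaluates T/F expressions
// with '+' (OR), '*' (AND), '!' (NOT) and parentheses.
struct BoolParser {
    const std::string& src;
    std::size_t pos;
    bool ok;

    explicit BoolParser(const std::string& s) : src(s), pos(0), ok(true) {}

    // Skip whitespace and return the next character, or '\0' at end of input.
    char peek() {
        while (pos < src.size() &&
               std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    // primary: 'T' | 'F' | '!' primary | '(' expr ')'
    int primary() {
        char c = peek();
        if (c == 'T') { ++pos; return 1; }
        if (c == 'F') { ++pos; return 0; }
        if (c == '!') { ++pos; return !primary(); }
        if (c == '(') {
            ++pos;
            int v = expr();
            if (peek() == ')') ++pos; else ok = false;
            return v;
        }
        ok = false;  // unexpected character or end of input
        return 0;
    }

    // term: primary ('*' primary)*   ('*' binds tighter than '+')
    int term() {
        int v = primary();
        while (ok && peek() == '*') { ++pos; int r = primary(); v = v && r; }
        return v;
    }

    // expr: term ('+' term)*
    int expr() {
        int v = term();
        while (ok && peek() == '+') { ++pos; int r = term(); v = v || r; }
        return v;
    }
};

// Returns 1 or 0 for a valid expression, -1 on a syntax error (assumed convention).
int lex_yacc(const std::string& var) {
    BoolParser p(var);
    int v = p.expr();
    if (!p.ok || p.peek() != '\0') return -1;
    return v;
}
```

With this interface, the main program can call lex_yacc("(T+F)") directly. If the real generated C files are linked into a C++ program instead, the declarations of yyparse and the scanner functions generally need extern "C" linkage unless those files are themselves compiled as C++.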

Related

flex/bison gives a syntax error when a second input is entered after the first result is printed

After I run the compiled program and type the first entry, it works fine. But when I type another entry (even a valid one), I get a syntax error.
I should mention that I am new to the world of flex/bison. To be honest, I do not know what could be wrong. Could someone please help?
Here is my lex code:
%{
#include <stdio.h>
#include "calc.tab.h"
void yyerror(char *);
%}
%option noyywrap
DIGIT -?[0-9]
NUM -?{DIGIT}+
%%
{NUM} { yylval = atoi(yytext); return NUMBER; }
[-()+*/;] { return *yytext; }
"evaluar" { return EVALUAR; }
[[:blank:]] ;
\r {}
. yyerror("caracter invalido");
%%
And here is my bison code:
%{
#include <stdio.h>
int yylex(void);
void yyerror(char *s);
%}
%token NUMBER EVALUAR
%start INICIO
%left '+' '-'
%left '*' '/'
%%
INICIO
: EVALUAR '(' Expr ')' ';'
{
printf("\nResultado=%d\n", $3);
}
;
Expr
: Expr '+' Expr
{
$$ = $1 + $3;
}
| Expr '-' Expr
{
$$ = $1 - $3;
}
| Expr '*' Expr
{
$$ = $1 * $3;
}
| Expr '/' Expr
{
$$ = $1 / $3;
}
| NUMBER
{
$$ = $1;
}
;
%%
int main(){
return(yyparse());
}
void yyerror(char *s){
printf("\n%s\n", s);
}
int yywrap(){
return 1;
}
Here is an example of the output:
C:\Users\Uchih\Desktop\bison>a
evaluar(2+3);
Resultado=5
evaluar(3+2);
syntax error
Your parser is written to accept a single INICIO clause, after which it expects EOF (and exits once it sees it). Because a second INICIO arrives instead, you get a syntax error.
To fix this, make your grammar accept one or more of them. Add a rule like this:
input: INICIO | input INICIO ;
and change the start to
%start input

Concat value in yacc recursion

I'm building a yacc program to recognize while and do-while loops.
Here is my program:
%{
#include<stdio.h>
#include<string.h>
#include <stdlib.h>
#include"my_lex.tab.h"
extern FILE *yyin;
int yylex();
void yyerror(const char *str);
%}
%token ID NUMBER WHILE DO CONDITION OR AND VAR
%left AND OR "-" "+" "*" "/"
%start S
%union {
char *s;
string str;
}
%type <s> S_W S_DW block assignment statement declaration statement_list
%%
S : S1
|S1 S
S1 : S_W
|S_DW
|statement
{ printf("- Stand-alone statement \n");}
S_DW : DO block WHILE '(' condition ')' ';' { printf("- Do-while loop \n"); printf("%s \n", $2); }
S_W : WHILE '(' condition ')' block { printf("- While loop \n"); printf("%s \n", $5); }
block : '{' statement_list '}' { $$ = $2; }
statement_list : /* blank */
{ $$ = ""; }
| statement statement_list
{
I'm stuck right here ...
}
statement : declaration { $$ = $1; }
| assignment { $$ = $1; }
........
When I run my program, I want it to print what yacc parsed as a tree, something like this:
\\ input file
do {
a = a + 4;
b = c + d - 4;
} while (i <= 0);
var test = 4;
while (i > 0 && a && b) {
var x = a - b;
}
\\ yacc output:
- do-while loop
-- assignment
-- assignment
- stand-alone statement
- while loop
-- declaration
I'm stuck on defining the recursive statement_list rule.
statement_list : /* blank */
{ $$ = ""; }
| statement statement_list
{
I'm stuck right here ...
}
I don't know how to "concat" all the statements into the $$ of "statement_list". :(
You have to design a class hierarchy. Using only char*/string is not enough to retain the program's semantics.
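A minimal sketch of what such a hierarchy might look like, assuming the semantic values are C++ objects. The names Node, Leaf, Loop, NodeList and prepend are hypothetical and not taken from the question. The idea is that the action for "statement statement_list" puts the new statement node in front of the list built so far instead of concatenating strings, and the tree is printed afterwards.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class for every construct the parser recognises.
struct Node {
    virtual ~Node() = default;
    // Append this node's description to out, prefixed by 'depth' dashes.
    virtual void render(int depth, std::string& out) const = 0;
};

using NodeList = std::vector<std::unique_ptr<Node>>;

// A simple statement such as "assignment" or "declaration".
struct Leaf : Node {
    std::string kind;
    explicit Leaf(std::string k) : kind(std::move(k)) {}
    void render(int depth, std::string& out) const override {
        out += std::string(depth, '-') + " " + kind + "\n";
    }
};

// A loop ("while loop" or "do-while loop") that owns the statements of its block.
struct Loop : Node {
    std::string kind;
    NodeList body;
    Loop(std::string k, NodeList b) : kind(std::move(k)), body(std::move(b)) {}
    void render(int depth, std::string& out) const override {
        out += std::string(depth, '-') + " " + kind + "\n";
        for (const auto& stmt : body) stmt->render(depth + 1, out);
    }
};

// Mirrors "statement_list : statement statement_list": the statement goes in
// front of the tail that has already been built.
NodeList prepend(std::unique_ptr<Node> head, NodeList tail) {
    tail.insert(tail.begin(), std::move(head));
    return tail;
}
```

In a classic bison parser with a %union, the semantic values would more likely be raw pointers to such nodes, with ownership handled by hand; the smart pointers here only keep the sketch short.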

How to print a string in a bison/lexer compiler?

I created a small compiler and need help to fix it.
Code of my compiler:
t.l:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
%}
%x DOUBLE_QUOTES
%%
<INITIAL>[sS][hH][oO][wW] {return show;}
<INITIAL>[a-zA-Z] {yylval.id=yytext[0];return identifier;}
<INITIAL>[0-9]+ {yylval.num=atoi(yytext);return number;}
<INITIAL>[\-\+\=\;\*\/] {return yytext[0];}
<INITIAL>["] {
printf("(STRING_OPEN) ");
BEGIN(DOUBLE_QUOTES);
}
<DOUBLE_QUOTES>["] {
printf("(STRING_CLOSE) ");
BEGIN(INITIAL);
printf("(STRING:%S) ",yytext[1]);
}
%%
int yywrap (void) {return 1;}
t.y:
%{
void yyerror(char *s);
#include <ctype.h>  /* islower, isupper */
#include <stdio.h>
#include <stdlib.h>
int symbols[52];
int symbolVal(char symbol);
void updateSymbolVal(char symbol,int val);
%}
%union {int num;char id;}
%start line
%token show
%token <num> number
%token <id> identifier
%type <num> line exp term
%type <id> assignment
%%
line : assignment ';' {;}
| show exp ';' {printf("showing : %d\n",$2);}
| line assignment ';' {;}
| line show exp ';' {printf("showing : %d\n",$3);}
;
assignment: identifier '=' exp {updateSymbolVal($1,$3);}
;
exp : term {$$ = $1;}
| exp '+' term {$$ = $1 + $3;}
| exp '-' term {$$ = $1 - $3;}
| exp '*' term {$$ = $1 * $3;}
| exp '/' term {$$ = $1 / $3;}
;
term : number {$$ = $1;}
| identifier {$$ = symbolVal($1);}
%%
int computerSymbolIndex(char token)
{
int idx=-1;
if(islower(token))
{
idx=token-'a'+26;
}
else if(isupper(token))
{
idx = token - 'A';
}
return idx;
}
int symbolVal(char symbol)
{
int bucket = computerSymbolIndex(symbol);
return symbols[bucket];
}
void updateSymbolVal(char symbol,int val)
{
int bucket = computerSymbolIndex(symbol);
symbols[bucket] = val;
}
int main (void) {
printf("Created By BoxWeb Inc\n");
int i;
for(i=0;i<52;i++)
{
symbols[i]=0;
}
return yyparse();
}
void yyerror (char *s) {printf("-%s !\n", s);} /* one %s for the one argument */
Commands to test the compiler:
show 5+5;
show 5*2;
show 5+5-2*2/1;
I need to extend it so that it can also print strings:
show "hello" . " " . "mr";//hello mr
show 5+5 . " ?";//10 ?
and more....
In the lexer I use :
<INITIAL>["] {
printf("(STRING_OPEN) ");
BEGIN(DOUBLE_QUOTES);
}
<DOUBLE_QUOTES>["] {
printf("(STRING_CLOSE) ");
BEGIN(INITIAL);
printf("(STRING:%S) ",yytext[1]);
}
but I don't know how to use this in the parser.
Please help me complete this compiler.
Let's simplify it for a moment to just one possible operation.
We have the following grammar:
assignment: '$' identifier '=' exp ';' {updateSymbolVal($2,$4); }
;
exp: number {$$ = createExp($1);}
| string {$$ = createExp($1);}
| exp '+' exp {$$ = addExp($1,$3);}
;
Since an expression can be many different things, we can't just store it in an integer; we need a more complex structure, something like this:
enum expType {NUMBER, STRING};
struct Exp{
expType type;
double number;
std::string str;
};
Then we make the functions to create your expressions:
Exp* createExp(int v){
Exp *e = new Exp();
e->type = NUMBER;
e->number = v;
return e;
}
Exp* createExp(std::string s){
Exp *e = new Exp();
e->type = STRING;
e->str = s;
return e;
}
To do any calculation or assignment, you then always have to check the type:
Exp* addExp(Exp *a, Exp *b){
Exp *c = new Exp(); // allocate the result; the pointer must not be left uninitialized
if(a->type == NUMBER && b->type == NUMBER){
c->type = NUMBER; // assignment (=), not comparison (==)
c->number = a->number + b->number;
}
else{
std::cout << "some nice error message\n";
}
return c;
}
The same goes for the assignment function:
void updateSymbolVal(const std::string &identifier, Exp *e){
if(e->type == NUMBER){
myNumbers[identifier] = e->number;
}
if(e->type == STRING){
myStrings[identifier] = e->str;
}
}
Of course you could also make a map/vector/array of the struct Exp if you need to do some more manipulations with it. Or just hand it over to the next level.
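For the '.' operator from the question (show 5+5 . " ?"), the same type check can decide how operands are turned into text. The following is only a sketch under assumptions: the Value struct and the helpers makeNumber, makeText, toText, add and concat are hypothetical names for illustration, kept separate from the Exp code above, and values are passed by value to avoid manual memory management.

```cpp
#include <string>

// Hypothetical tagged value, similar in spirit to the Exp struct above.
struct Value {
    enum Kind { Number, Text } kind;
    int number;
    std::string text;
};

Value makeNumber(int n) { return Value{Value::Number, n, ""}; }
Value makeText(const std::string& s) { return Value{Value::Text, 0, s}; }

// Text form used when showing a value: numbers go through std::to_string.
std::string toText(const Value& v) {
    return v.kind == Value::Number ? std::to_string(v.number) : v.text;
}

// '+' is only defined for two numbers in this sketch; ok reports a type error.
Value add(const Value& a, const Value& b, bool& ok) {
    ok = (a.kind == Value::Number && b.kind == Value::Number);
    return ok ? makeNumber(a.number + b.number) : makeNumber(0);
}

// '.' concatenates the text forms of both operands.
Value concat(const Value& a, const Value& b) {
    return makeText(toText(a) + toText(b));
}
```

A grammar action for exp '.' exp could then call concat($1, $3), and the show rule would print toText of the result.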
Edit regarding the question about multi-language support
As written in the comment, I refer to the question "Flex(lexer) support for unicode". Simplified to your needs, you can do it like this:
ASC [a-zA-Z_0-9]
U [\x80-\xbf]
U2 [\xc2-\xdf]
U3 [\xe0-\xef]
U4 [\xf0-\xf4]
UANY {ASC}|{U2}{U}|{U3}{U}{U}|{U4}{U}{U}{U}
{UANY}+ {yylval.id = yytext[0]; return string;}
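The byte ranges in these definitions follow the UTF-8 encoding: the lead byte determines how many continuation bytes (0x80 to 0xBF) follow. The sketch below restates that logic in plain C++ so it can be checked outside the lexer. The function names are hypothetical, and unlike the ASC class above it treats every byte below 0x80 as a one-byte character. Note also that an action storing only yytext[0] keeps just the first byte of the match, so a multi-byte character would need the whole of yytext.

```cpp
#include <cstddef>
#include <string>

// Length of the UTF-8 sequence that starts with 'lead', or 0 if 'lead'
// cannot start a sequence. Ranges match the U2/U3/U4 definitions above.
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;                    // single-byte (ASCII)
    if (lead >= 0xC2 && lead <= 0xDF) return 2;   // U2
    if (lead >= 0xE0 && lead <= 0xEF) return 3;   // U3
    if (lead >= 0xF0 && lead <= 0xF4) return 4;   // U4
    return 0;
}

// Continuation bytes correspond to the U class: [\x80-\xbf].
bool isContinuation(unsigned char b) { return b >= 0x80 && b <= 0xBF; }

// Count the characters in s; returns -1 if a sequence is malformed.
long countUtf8Chars(const std::string& s) {
    long count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        std::size_t len = utf8SequenceLength(static_cast<unsigned char>(s[i]));
        if (len == 0 || i + len > s.size()) return -1;
        for (std::size_t k = 1; k < len; ++k)
            if (!isContinuation(static_cast<unsigned char>(s[i + k]))) return -1;
        i += len;
        ++count;
    }
    return count;
}
```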

Flex/Bison not evaluating properly

For some reason or another, bison doesn't want to do any evaluation. Compilation of all files goes smoothly and the program runs. When I enter the expression 4+5 and press return, it creates tokens for 4 + 5 respectively. I can even put in some printf into the places where bison recognizes the attributes of each token including the plus (43).
However, the program never evaluates the production expr '+' term { $$ = $1 + $3; }. As far as I can tell it is simply never reduced, and even if it were, the production assign '\n' { printf("%d\n", $1); } never prints the value. When I press ^D to quit, void yyerror(const char *) is called.
Any help on this matter is much appreciated. Thanks!
//FLEX
%{
//#include <stdio.h>
#include "y.tab.h"
%}
%option noyywrap
letter [A-Za-z]
digit [0-9]
space [ \t]
var {letter}
int {digit}+
ws {space}+
%%
{var} { yylval = (int)yytext[0]; return VAR; }
{int} { yylval = atoi(yytext); return CONST; }
{ws} { }
. { return (int)yytext[0]; }
%%
/* nothing */
.
//BISON
%{
//INCLUDE
//#include <ctype.h>
//DEFINE
#define YYDEBUG 1
//PROTOTYPE
void yyerror(const char *);
void print_welcome();
int get_val(int);
void set_val(int, int);
%}
%token CONST
%token VAR
%%
session
: { print_welcome(); }
eval
;
eval
: eval line
|
;
line
: assign '\n' { printf("%d\n", $1); }
;
assign
: VAR '=' expr { set_val($1, $3); $$ = $3; }
| expr { $$ = $1; }
;
expr
: expr '+' term { $$ = $1 + $3; }
| expr '-' term { $$ = $1 - $3; }
| term { $$ = $1; }
;
term
: term '*' factor { $$ = $1 * $3; }
| term '/' factor { $$ = $1 / $3; }
| term '%' factor { $$ = $1 % $3; }
| factor { $$ = $1; }
;
factor
: '(' expr ')' { $$ = $2; }
| CONST { $$ = $1; }
| VAR { $$ = get_val($1); }
;
%%
void yyerror(const char * s)
{
fprintf(stderr, "%s\n", s);
}
void print_welcome()
{
printf("Welcome to the Simple Expression Evaluator.\n");
printf("Enter one expression per line, end with ^D\n\n");
}
static int val_tab[26];
int get_val(int var)
{
return val_tab[var - 'A'];
}
void set_val(int var, int val)
{
val_tab[var - 'A'] = val;
}
.
//MAIN
//PROTOTYPE
int yyparse();
int main()
{
extern int yydebug;
yydebug = 0;
yyparse();
return 0;
}
Your lex file does not have any rule which matches \n, because in lex/flex, . matches any character except line-end. The default rule for lex (or flex) echoes and otherwise ignores the matched character, so that's what happens to the \n. Since the parser won't be able to accept a line unless it sees a \n token, it will eventually be forced to present you with a syntax error.
So you need to change the rule
. { return (int)yytext[0]; }
to
.|\n { return (int)yytext[0]; }
(I wouldn't have bothered with the cast to int but it's certainly not doing any harm, so I left it in.)
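The effect can be illustrated with a toy scanner. This is not flex itself, only a small simulation written for this explanation: the ScanResult struct, the scan function and its catchAllMatchesNewline flag are hypothetical. Blanks are skipped, every other character becomes a token, and a newline is either echoed by the default rule or returned as a token, depending on how the catch-all rule is written.

```cpp
#include <string>
#include <vector>

struct ScanResult {
    std::vector<char> tokens;  // single-character tokens handed to the parser
    std::string echoed;        // characters consumed by the default (echo) rule
};

// Toy scanner: catchAllMatchesNewline == false models the rule ".",
// true models the rule ".|\n".
ScanResult scan(const std::string& input, bool catchAllMatchesNewline) {
    ScanResult r;
    for (char c : input) {
        if (c == ' ' || c == '\t') continue;         // whitespace rule: skip
        if (c == '\n' && !catchAllMatchesNewline) {  // no rule matches '\n'
            r.echoed += c;                           // default rule: echo and ignore
            continue;
        }
        r.tokens.push_back(c);
    }
    return r;
}
```

For the input "4+5\n", the first variant yields only the three tokens 4, + and 5, so a grammar rule that ends in '\n' can never complete; the second variant also delivers the newline token.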

Bison %prec doesn't work

I'm implementing a simple calculator with flex and bison.
I'd like the following input to give -4 and not 4:
-2^2
To get -4, I declared the precedence of the ^ operator to be higher than that of the unary minus operator, but it doesn't work.
This is the bison code:
%{
#include <iostream>
#include <math.h>
using namespace std;
void yyerror(const char *s);
int yylex();
%}
%union {
int int_val;
char* string_val;
double double_val;
}
%token INTEGER
%left '+' '-'
%left '*' '/' '%'
%left UMINUS UPLUS
%right '^'
%type <int_val> expr_int INTEGER
%%
program: line '\n'
| '\n' { return 0; }
;
line: expr_int { cout<<$1<<endl; return 0; }
;
expr_int: expr_int '+' expr_int { $$ = $1 + $3; }
| expr_int '-' expr_int { $$ = $1 - $3; }
| expr_int '*' expr_int { $$ = $1 * $3; }
| expr_int '^' expr_int { $$ = pow($1,$3); }
| '-' INTEGER %prec UMINUS { $$ = -$2; }
| '+' INTEGER %prec UPLUS { $$ = $2; }
| INTEGER
;
%%
void yyerror(const char *s) {
printf("error");
}
int main(void) {
while(yyparse()==0);
return 0;
}
And this is the flex code:
%{
#include <iostream>
#include "calc.tab.h"
using namespace std;
void yyerror(const char *s);
%}
INTEGER [1-9][0-9]*|0
UNARY [+|\-]
BINARY [+|\-|*|^|]
WS [ \t]+
%%
{INTEGER} { yylval.int_val=atoi(yytext); return INTEGER; }
{UNARY}|{BINARY}|\n { return *yytext; }
{WS} {}
. {}
%%
//////////////////////////////////////////////////
int yywrap(void) { return 1; } // Callback at end of file
Why doesn't bison handle 2^2 first and then apply the unary minus, as I defined?
It keeps printing 4 instead...
Thanks a lot for any help.
Your syntax for unary minus:
'-' INTEGER %prec UMINUS
does not allow its argument to be an expression. So it unambiguously grabs the following INTEGER and the %prec rule is never needed.
<personal_opinion>
The problem with %prec is that yacc/bison does not complain if the rule is not needed. So you never really know if it does anything or not. IMHO it's really better to just write an unambiguous grammar.
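As an illustration of an unambiguous grammar for this case, here is a hand-written recursive-descent sketch in which each precedence level is its own rule, so '^' binds tighter than unary minus by construction. It is not bison output, and the Calc struct and evaluate function are hypothetical names. In a bison grammar the same layering would be expressed by splitting expr_int into several nonterminals. Error handling is deliberately minimal.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Grammar implemented below:
//   expr    : term (('+' | '-') term)*
//   term    : unary (('*' | '/') unary)*
//   unary   : '-' unary | '+' unary | power
//   power   : primary ('^' unary)?        (right associative)
//   primary : INTEGER | '(' expr ')'
struct Calc {
    std::string src;
    std::size_t pos;

    explicit Calc(const std::string& s) : src(s), pos(0) {}

    // Skip blanks and return the next character, or '\0' at end of input.
    char peek() {
        while (pos < src.size() && (src[pos] == ' ' || src[pos] == '\t')) ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    long primary() {
        if (peek() == '(') {
            ++pos;
            long v = expr();
            if (peek() == ')') ++pos;
            return v;
        }
        long v = 0;  // a missing number evaluates to 0 in this sketch
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
            v = v * 10 + (src[pos++] - '0');
        return v;
    }

    long power() {
        long base = primary();
        if (peek() != '^') return base;
        ++pos;
        long exponent = unary();
        long result = 1;
        for (long i = 0; i < exponent; ++i) result *= base;  // negative exponents yield 1
        return result;
    }

    long unary() {
        if (peek() == '-') { ++pos; return -unary(); }
        if (peek() == '+') { ++pos; return unary(); }
        return power();
    }

    long term() {
        long v = unary();
        for (;;) {
            char op = peek();
            if (op != '*' && op != '/') return v;
            ++pos;
            long r = unary();
            if (op == '*') v *= r;
            else if (r != 0) v /= r;  // division by zero is skipped in this sketch
        }
    }

    long expr() {
        long v = term();
        for (;;) {
            char op = peek();
            if (op != '+' && op != '-') return v;
            ++pos;
            long r = term();
            v = (op == '+') ? v + r : v - r;
        }
    }
};

long evaluate(const std::string& text) {
    Calc c(text);
    return c.expr();
}
```

Because unary only reaches power after consuming the sign, -2^2 is parsed as -(2^2), while (-2)^2 still applies the sign first.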