'Pre-mature EOF' and 'Bad Character' errors - c++

I have written Flex stuff in a previous class but none of my previously working code is solving the issue I am having.
I have searched around StackOverflow for a solution but none of them have solved it.
I have:
Checked I have no errant spaces in the %{ ... %} area
Tried using #include <iostream>
Tried %option noyywrap
Here is my code (I removed all the tokens and such because there is a lot of them):
%{
...
int numLines = 0;
void printTokenInfo(char* tokenType, char* lexeme);
void handleComments(char* text);
%}
WSPACE [ \t\r]+
NEWLINE \n
DIGIT [0-9]
LETTER [a-zA-Z]
IDENT ({LETTER}|_)({LETTER}|{DIGIT}|_)*
INTCONST {DIGIT}+
CHARCONST "'"{LETTER}+"'"
%%
...
%%
// User-written code goes here
void printTokenInfo(char* tokenType, char* lexeme)
{
printf("A");
printf("TOKEN: %s LEXEME: %s\n", tokenType, lexeme);
}
void handleComments(char* text)
{
printf("%s\n", text);
}
int yywrap() { return 1; }
int main()
{
do {
yylex();
} while (!feof(yyin));
return 0;
}
Here is how I am compiling and running it:
flex FILENAME.l
g++ lex.yy.c -o lexer
lexer < INPUT.txt
And the instructor provided us with input files but none of them have worked. They all fail with 'premature EOF' or 'bad character'
Any ideas?

Well, I think I finally found the answer: run it with the complete path rather than just the name of your compiled lexer. I discovered the actual path by running it under gdb (which, admittedly, should have been my first instinct).
gdb lexer
(gdb) run < INPUT.txt
Originally, I was trying to run it with:
lexer < INPUT.txt
But I discovered by running it with gdb that this worked:
/nethome/users/mjc7w6/Classes/lexer < INPUT.txt
Edit: Someone chimed in on my Facebook with a further improvement. If the above solution fixes it for you, you might need to add the following to your ~/.bashrc:
export PATH=/nethome/users/mjc7w6/Classes:$PATH
Adjust the directory to wherever your compiled lexer actually lives.
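A likely reason the full path helps: a shell resolves a command name that contains no slash by walking the directories listed in PATH in order, so a bare lexer can pick up a different executable with the same name, while a name containing a slash is used exactly as given. The sketch below only illustrates that lookup order. candidate_paths is a hypothetical helper, not part of the shell or of Flex.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: lists, in search order, the full paths a shell would
// try for `command`. A name containing '/' is not searched in PATH at all.
std::vector<std::string> candidate_paths(const std::string& command,
                                         const std::string& path_env) {
    if (command.find('/') != std::string::npos) {
        return {command};
    }
    std::vector<std::string> candidates;
    std::istringstream dirs(path_env);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        // A real shell treats an empty entry as the current directory;
        // this sketch simply skips it.
        if (dir.empty()) continue;
        candidates.push_back(dir + "/" + command);
    }
    return candidates;
}
```

With PATH set to /usr/bin:/nethome/users/mjc7w6/Classes, the first candidate for lexer is /usr/bin/lexer, which is why prepending your own directory to PATH (or typing ./lexer) changes which program runs.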

Related

Flex/Bison Markdown to HTML Program

This is for a homework assignment. The only code I've edited myself are the definitions, rules, and tokens. What I have so far compiles successfully but gives me a segmentation fault when I try to run it on the markdown file (.md), and the HTML output is just a blank file because of that.
%{
#define YYSTYPE char *
#include <string.h>
#include "miniMD2html.tab.h"
extern YYSTYPE yylval;
%}
%option yylineno
/* Flex definitions */
whitespace [ \t]+
newline [\n]+|{whitespace}[\n]+
textword [a-zA-Z:/.\-,\']+
integer [0-9]+
header #|##|###|####|#####
%%
{header} { return T_HEADER; }
{integer} { return T_INTEGER; }
{textword} { return T_TEXTWORD; }
{whitespace} { return T_BLANK; }
{newline} { return T_NEWLINE; }
%%
The generate functions are given in another file. Most of them just accept char*, the generate_header function takes an int and char*, and the generate_image function takes two char* and two int. The grammar may look weird but this is what was given in the assignment.
%{
#include "global.h"
#include "stdlib.h"
#include "stdio.h"
#define YYSTYPE char *
extern int yylex();
int yywrap();
int yyerror(const char*);
int yyparse();
extern FILE *yyin;
Html_Doc *html_doc;
%}
/* Define tokens here */
%token T_BLANK T_NEWLINE
%token T_HEADER T_INTEGER T_TEXTWORD
%% /* Grammar rules and actions follow */
s: mddoc;
mddoc: /*empty*/ | mddoc paragraph;
paragraph: T_NEWLINE {add_linebreak(html_doc);}
| pcontent T_NEWLINE {add_element(html_doc, $1); free($1);} ;
pcontent: header
| rftext {generate_paragraph($1);}
header: T_HEADER T_BLANK rftext {generate_header(strlen($1), $3);}
rftext: rftext T_BLANK rftextword {strappend($1, $3);}
| rftext rftextword {strappend($1, $2);}
| rftextword
rftextword: textnum | image | format
image: "![" text "](" text '=' T_INTEGER '#' T_INTEGER ')' {generate_image($2, $4, atoi($6), atoi($8));}
format: "**" text "**" {generate_bold($2);}
| '_' text '_' {generate_italic($2);}
| "**" format "**" {generate_bold($2);}
| '_' format '_' {generate_italic($2);}
text: text T_BLANK textnum {strappend($1, $3);}
| text textnum {strappend($1, $2);}
| textnum
textnum: T_TEXTWORD | T_INTEGER
%%
int main(int argc, char *argv[]) {
// yydebug = 1;
FILE *fconfig = fopen(argv[1], "r");
// make sure it is valid
if (!fconfig) {
printf("Error reading file!\n");
return -1;
}
html_doc = new_html_doc();
// set lex to read from file
yyin = fconfig;
int ret = yyparse();
output_result(html_doc);
del_html_doc(html_doc);
return ret;
}
int yywrap(){
return 1;
}
int yyerror(const char* s){
extern int yylineno;
extern char *yytext;
printf("error while parsing line %d: %s at '%s', ASCII code: %d\n", yylineno, s, yytext, (int)(*yytext));
return 1;
}
None of your flex rules ever set the value of yylval, so it will be NULL throughout. And so will all the references to semantic values ($n) in your grammar. Since most functions which take a char* assume that it is a valid string, it's pretty likely that one of them will soon try to examine the string value, and the fact that the pointer is NULL will certainly lead to a segfault.
In addition, there are both single character and quoted string tokens in your grammar, none of which can be produced by your scanner. So it's quite likely that the parser will stop with a syntax error as soon as one of the non-word characters is encountered in the input.
In the bison file, every rule should be terminated by ;
s: mddoc;
mddoc: /*empty*/ | mddoc paragraph;
paragraph: ...
Notice the
;
after mddoc paragraph.
That part is correct, but the rules that follow it are not terminated the same way.
Also, as @Rockcat said, in the flex file you should add
yylval = strdup(yytext);
before returning your token to the bison file.
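To see why the copy matters: yytext points into the scanner's internal buffer, which is overwritten when the next token is matched, so each semantic value needs storage of its own. The sketch below imitates this in plain C++ with no Flex involved. copy_lexeme is a hypothetical stand-in for strdup, and an ordinary char array plays the role of yytext.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for strdup(): returns a malloc'd copy of `lexeme`
// that the caller must free(), or nullptr if allocation fails.
char* copy_lexeme(const char* lexeme) {
    const std::size_t size = std::strlen(lexeme) + 1;  // include the '\0'
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy != nullptr) {
        std::memcpy(copy, lexeme, size);
    }
    return copy;
}
```

If the buffer is overwritten after the call, the copy still holds the original text. That is what lets the grammar actions use $1 safely later on, and the free($1) already present in the paragraph rule releases it.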

Bash autocomplete an option without running the application

I have found this code as a bash autocomplete. But, it looks strange to me. What if I do not like to run the code at all. If I would like to type ./a.out then space (without entering) and then by pressing tab, I would like to see only two options apple and cherry and if I type a and press tab, then it autocomplete the option apple and similarly for c. Let's say only one of the two options are acceptable:
./a.out apple
./a.out cherry
where apple and cherry are options and not the name of the files in the directory. In the first case, I would like the program types that your option is apple and in the second case your option is cherry. In any other case, the program should print an error that the option is not valid.
All examples that I find on the internet such as what follows look like that you should run the program first, then it reacts. The while loop inside the main function collides with the normal functionality of the program. Have I misunderstood the readline library? Is the above-described application possible to implement by editing the following code?
// sudo apt-get install libreadline-dev
// g++ -std=c++11 main.cpp -lreadline
#include <iostream>
#include "readline/readline.h"
#include "readline/history.h"
using namespace std;
int main(int argc, char** argv)
{
char *line; // non-const: readline() returns malloc'd memory that is passed to free() below
while ((line = readline("? ")) != nullptr) {
cout << "[" << line << "]" << endl;
if (*line) add_history(line);
free(line);
}
// if(argc!=2)
// {
// cout<<"<exe> one_parameter"<<endl;
// return 1;
// }
// string option=argv[1];
// if(option=="apple" || option=="cherry")
// cout<<"Your option is "<<option<<endl;
// else
// {
// cout<<"Error: invalid option "<<option<<endl;
// return 1;
// }
return 0;
}
// partial answer - why you may want to invoke the app while doing the autocompletion
One way of implementing the autocomplete for an application is to have the application binary configure it (by having a flag that prints the instructions for autocomplete configuration or by just parsing the --help output of the application).
Schematically:
complete -F $(./a.out --generate-autocomplete-config) ./a.out
This is why you might see the binary actually invoked as a part of autocomplete implementation.
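As a rough illustration of that approach, the application could build a bash completion command and print it when asked. The flag name --generate-autocomplete-config comes from the schematic line above and is not a standard option, and completion_command is a hypothetical helper. It emits complete -W, the bash builtin form that takes a fixed word list.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a bash command that registers a fixed word
// list as the completions for `program`.
std::string completion_command(const std::string& program,
                               const std::vector<std::string>& options) {
    std::string words;
    for (const std::string& option : options) {
        if (!words.empty()) words += ' ';
        words += option;
    }
    return "complete -W \"" + words + "\" " + program;
}
```

If main printed this string when given the flag, something like eval "$(./a.out --generate-autocomplete-config)" in the shell would register the completions. The binary is run once to produce the configuration, not on every Tab press.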
Completion has nothing to do with your executable; bash handles it. Put the following in a file and source it in bash (source autocomplete_file or . autocomplete_file).
_a_complete_()
{
local word=${COMP_WORDS[COMP_CWORD]}
local files='apple cherry'
COMPREPLY=( $( compgen -W "${files}" -- "${word}" ) )
}
complete -F _a_complete_ ./a.out
Nice documentation can be found here.
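For readers more at home in C++ than in bash, the filtering that compgen -W performs in the function above amounts to a prefix match over the word list. The sketch below shows only that idea. matching_words is a hypothetical name, and bash does this for you, so none of it needs to live in a.out.

```cpp
#include <string>
#include <vector>

// Returns the candidates that start with `prefix`, in their original order.
// An empty prefix matches every candidate.
std::vector<std::string> matching_words(const std::vector<std::string>& words,
                                        const std::string& prefix) {
    std::vector<std::string> matches;
    for (const std::string& word : words) {
        if (word.compare(0, prefix.size(), prefix) == 0) {
            matches.push_back(word);
        }
    }
    return matches;
}
```

With the words apple and cherry, the prefix a yields only apple, and an empty prefix (Tab right after the space) yields both. The program itself still only has to validate argv[1].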

Flex, Bison, C++ all in Xcode

I'm working through the example from "Problems with reentrant Flex and Bison". It compiles and runs just fine on my machine. What I want to do, though, is make use of the C++ STL. Any time I try to include a C++ header, it says it can't be found. There are only a handful of questions about this on Google. Does anyone have a working example of this sort of setup, or a solution I might implement?
Any help would be greatly appreciated.
Thanks!
EDIT So for one reason or another, I have to add the include path of any headers in the build settings. Must be due to the custom makefile of this person's example. It's above my pay-grade. Anyway, I can now use STL libraries inside of main.
WHAT I REALLY WANT TO DO IS USE FLEX/BISON WITH CPP, AND IF I TRY TO INCLUDE STL HEADERS ANYWHERE BUT MAIN, I GET ERROR "HEADER NOT FOUND".
I can include C-headers just fine, though.
Here's an answer from the author of another answer in the linked topic.
I have adapted my example from there to work with C++.
The key points are:
I am using recent Flex / Bison: brew install flex and brew install bison. Not sure if the same will work with default OSX/Xcode's flex/bison.
Generated flex/bison files should have C++ extensions (lexer.[hpp|mm], parser.[hpp|mm]) for Xcode to pick up the C++ code.
There is an Xcode Build Phase that runs make.
All the relevant files follow below but I recommend you to check out the example project.
main.mm's code is
#include "parser.hpp"
#include "lexer.hpp"
extern YY_BUFFER_STATE yy_scan_string(const char * str);
extern void yy_delete_buffer(YY_BUFFER_STATE buffer);
ParserConsumer *parserConsumer = [ParserConsumer new];
char input[] = "RAINBOW UNICORN 1234 UNICORN";
YY_BUFFER_STATE state = yy_scan_string(input);
yyparse(parserConsumer);
yy_delete_buffer(state);
Lexer.lm:
%{
#include "ParserConsumer.h"
#include "parser.hpp"
#include <iostream>
#include <cstdio>
int yylex(void);
void yyerror(id <ParserConsumer> consumer, const char *msg);
%}
%option header-file = "./Parser/Generated Code/lexer.hpp"
%option outfile = "./Parser/Generated Code/lexer.mm"
%option noyywrap
NUMBER [0-9]+
STRING [A-Z]+
SPACE \x20
%%
{NUMBER} {
yylval.numericValue = (int)strtoul(yytext, NULL, 10);
std::cout << "Lexer says: Hello from C++\n";
printf("[Lexer, number] %s\n", yytext);
return Token_Number;
}
{STRING} {
yylval.stringValue = strdup(yytext);
printf("[Lexer, string] %s\n", yytext);
return Token_String;
}
{SPACE} {
// Do nothing
}
<<EOF>> {
printf("<<EOF>>\n");
return 0;
}
%%
void yyerror (id <ParserConsumer> consumer, const char *msg) {
printf("%s\n", msg);
abort();
}
Parser.ym:
%{
#include <iostream>
#include <cstdio>
#include "ParserConsumer.h"
#include "parser.hpp"
#include "lexer.hpp"
int yylex();
void yyerror(id <ParserConsumer> consumer, const char *msg);
%}
%output "Parser/Generated Code/parser.mm"
%defines "Parser/Generated Code/parser.hpp"
//%define api.pure full
%define parse.error verbose
%parse-param { id <ParserConsumer> consumer }
%union {
char *stringValue;
int numericValue;
}
%token <stringValue> Token_String
%token <numericValue> Token_Number
%%
/* http://www.tldp.org/HOWTO/Lex-YACC-HOWTO-6.html 6.2 Recursion: 'right is wrong' */
tokens: /* empty */
| tokens token
token:
Token_String {
std::cout << "Parser says: Hello from C++\n";
printf("[Parser, string] %s\n", $1);
[consumer parserDidParseString:$1];
free($1);
}
| Token_Number {
printf("[Parser, number]\n");
[consumer parserDidParseNumber:$1];
}
%%
Makefile:
generate-parser: clean flex bison
clean:
rm -rf './Parser/Generated Code'
mkdir -p './Parser/Generated Code'
flex:
# brew install flex
/usr/local/bin/flex ./Parser/Lexer.lm
bison:
# brew install bison
/usr/local/bin/bison -d ./Parser/Parser.ym

how to execute code after yylex(); command

I have a simple flex source code which skips the comments in /* */ and should get the count of comments found:
%{
int in_comment = 0;
int count = 0;
%}
%%
\/\* { in_comment = 1; count++; }
\*\/ { in_comment = 0; }
. { if (!in_comment) ECHO; }
%%
int main(void)
{
yylex();
printf("Comments found %d\n", count); // never executed
return 0;
}
First half works fine - it really skips the comments, but they are not counted... what can I do to execute printf line?
I just tried it myself: I copied your source code to x.l and ran make x.
ld then complained about the missing yywrap() function. After adding
%option noyywrap
the compile succeeded and a test showed:
ronald@cheetah:~/tmp$ ./x < cribbage.c
... lots of output ...
Comments found 15
UPDATE:
If the input does not come from a file (you run just ./x), you have to end your manual input with Ctrl+D. That signals end of file; only then does yylex() return and the printf line run.
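For comparison, here is a plain C++ sketch of what the three flex rules from the question do, which makes it easier to see that the count is only available once all input has been consumed. strip_comments and StripResult are hypothetical names. Like the scanner, it counts every /* it sees (even inside an open comment) and always passes newlines through, because the . pattern does not match \n and flex's default rule echoes it.

```cpp
#include <cstddef>
#include <string>

struct StripResult {
    std::string text;  // input with comment contents removed
    int comments = 0;  // number of "/*" openers seen
};

StripResult strip_comments(const std::string& input) {
    StripResult result;
    bool in_comment = false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const bool has_next = i + 1 < input.size();
        if (has_next && input[i] == '/' && input[i + 1] == '*') {
            in_comment = true;   // rule: \/\*
            ++result.comments;
            ++i;                 // consume the '*' as well
        } else if (has_next && input[i] == '*' && input[i + 1] == '/') {
            in_comment = false;  // rule: \*\/
            ++i;                 // consume the '/' as well
        } else if (!in_comment || input[i] == '\n') {
            result.text += input[i];  // rule: . plus the default newline echo
        }
    }
    return result;
}
```

The function returns only after the loop has reached the end of the string, just as yylex() returns only at end of input.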

c++ : convert symbols to code line numbers programmatically

I'm developing under Linux/gcc
I currently use the following to get a stack trace on custom thrown exceptions. Demangled functions names and line numbers are as expected, but I would like to avoid the use of addr2line to have a full control on the formatting of the output strings.
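#include <errno.h>    /* program_invocation_name (GNU extension) */
#include <execinfo.h> /* backtrace, backtrace_symbols */
#include <stdio.h>
#include <stdlib.h>
#define MAX_STACK_FRAMES 64 /* assumed value; the original definition is not shown */
static void *stack_traces[MAX_STACK_FRAMES];
static int addr2line(char const * const program_name, void const * const addr);
static void posix_print_stack_trace()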
{
int i, trace_size = 0;
char **messages = (char **)NULL;
trace_size = backtrace(stack_traces, MAX_STACK_FRAMES);
messages = backtrace_symbols(stack_traces, trace_size);
for (i = 0; i < trace_size; ++i)
{
if (addr2line(program_invocation_name, stack_traces[i]) != 0)
{
printf(" error determining line # for: %s\n", messages[i]);
}
}
if (messages) { free(messages); }
}
static int addr2line(char const * const program_name, void const * const addr)
{
char addr2line_cmd[512] = {0};
snprintf(addr2line_cmd, sizeof addr2line_cmd, "addr2line -C -f -p -i -e %.256s %p", program_name, addr);
return system(addr2line_cmd);
}
Note: -f displays the names of the functions in the stack trace and -C displays them demangled.
Q: Could anyone point me to a programmatic solution?
(And, if possible, give me some advice on getting it working with MinGW/gcc as well.)
NB: Or could gdb perhaps be used in some way to get more customized output?
Thanks for the help.
EDIT : It looks like for the windows part, it is doable that way : https://stackoverflow.com/a/6207030/1715716
EDIT : The above points to a solution that works only with Microsoft Visual C++, so it is ultimately of no use to me.
You could probably use or adapt (at least on Linux, and on systems using ELF and DWARF) libbacktrace by Ian Taylor, which currently lives inside the GCC source tree. See here; in principle it should be usable independently of GCC (provided you obey its BSD-like license).
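Mapping an address to a file and line still needs a DWARF reader such as libbacktrace, but the demangling half (what -C does for addr2line) can be done in-process with the C++ ABI support library that ships with GCC. A minimal sketch follows, assuming GCC or Clang on a platform using the Itanium C++ ABI. demangle is a hypothetical wrapper name; abi::__cxa_demangle is the real entry point.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the demangled form of `symbol`, or `symbol` unchanged if
// abi::__cxa_demangle reports a failure.
std::string demangle(const char* symbol) {
    int status = 0;
    char* readable = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return symbol;
    }
    std::string result(readable);
    std::free(readable);  // the returned buffer is malloc'd
    return result;
}
```

The strings returned by backtrace_symbols would first have to be parsed to pull out the mangled name, and they typically include function names for the main executable only when it is linked with -rdynamic. After that, the output format is entirely under your control.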