error: expected primary expression before `>` token - c++

I have this code that produces an error, and I can't find the cause.
if (Lista.at(i).getStartHour() <= temp->getStartHour() &&
Lista.at(i).getEndHour() => temp->getEndHour() &&
Lista.at(i).getStartMinute() < temp->getStartMinute() &&
Lista.at(i).getEndMinute() > temp->getEndMinute())
I get this error:
error: expected primary-expression before '>' token at that line.
I can't see what I'm doing wrong.
Lista is a vector of objects, same object as temp. All functions return int. I'm trying to check if those times overlap.

=> is not a token; it's two tokens, = and >.
The greater-than-or-equal-to operator is >=.
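For illustration, here is the corrected comparison as a minimal, self-contained sketch. The `Appointment` struct and the `matches` function are hypothetical stand-ins for the asker's class; only the getter names come from the question. Apart from the operator fix, the condition is kept exactly as posted, so whether it is the right overlap test is a separate matter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's class; only the getter names
// are taken from the question.
struct Appointment {
    int startHour, startMinute, endHour, endMinute;
    int getStartHour() const { return startHour; }
    int getStartMinute() const { return startMinute; }
    int getEndHour() const { return endHour; }
    int getEndMinute() const { return endMinute; }
};

// Same condition as in the question, with `=>` replaced by `>=`.
bool matches(const std::vector<Appointment>& Lista, std::size_t i,
             const Appointment* temp) {
    return Lista.at(i).getStartHour() <= temp->getStartHour() &&
           Lista.at(i).getEndHour() >= temp->getEndHour() &&
           Lista.at(i).getStartMinute() < temp->getStartMinute() &&
           Lista.at(i).getEndMinute() > temp->getEndMinute();
}
```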

Related

Spirit Qi: Error when replacing sequence with expectation operator

I have a grammar that works perfectly fine and contains the following lines.
element = container | list | pair;
container = name >> '(' >> -(arg % ',') >> ')' >> '{' >> +element > '}';
// trying to put an expectation operator here --------^
list = name >> '(' > (value % ',') > ')' > ';';
pair = name >> ':' > value > ';';
To have meaningful error messages, I want to make sure that container does not backtrack as soon as it hits '{'. But for some reason, if I replace the sequence operator with an expectation operator right after the '{', I get a huge compiler error. Any ideas what the problem might be?
element is a boost::variant; container, list and pair are my own structs with BOOST_FUSION_ADAPT_STRUCT applied. Please have a look here for the full source code: https://github.com/fklemme/liberty_tool/blob/master/src/liberty_grammar.hpp#L24
Yes. Because the precedences of operator>> and operator> aren't equal, the resulting synthesized attribute type is different.
In fact, it is no longer automatically compatible with the intended exposed attribute type.
In this case the problem can be quickly neutralized with some disambiguating parentheses around the sub-expression:
container = name >> '(' >> -(arg % ',') >> ')' >> ('{' > +element > '}');
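The grouping effect can be seen without Boost at all. The sketch below uses a toy `Expr` type (hypothetical, not part of Spirit) whose overloaded `>>` and `>` only record how the compiler parenthesizes the expression. Because shift operators bind tighter than relational operators in C++, a trailing `>` wraps everything to its left unless parentheses say otherwise. In Spirit the same grouping is what determines how the attribute is nested.

```cpp
#include <string>

// Toy expression type (hypothetical, not part of Boost.Spirit) that records
// how the compiler groups overloaded >> and > operators.
struct Expr {
    std::string text;
};

// Stands in for Spirit's sequence operator.
Expr operator>>(const Expr& a, const Expr& b) {
    return {"(" + a.text + " >> " + b.text + ")"};
}

// Stands in for Spirit's expectation operator.
Expr operator>(const Expr& a, const Expr& b) {
    return {"(" + a.text + " > " + b.text + ")"};
}

// a >> b >> c > d: shifts bind tighter than relational operators,
// so the whole sequence becomes the left operand of >.
std::string ungrouped() {
    Expr a{"a"}, b{"b"}, c{"c"}, d{"d"};
    return (a >> b >> c > d).text;
}

// a >> (b > c > d): explicit parentheses, as in the parenthesized rule.
std::string grouped() {
    Expr a{"a"}, b{"b"}, c{"c"}, d{"d"};
    return (a >> (b > c > d)).text;
}
```

Here `ungrouped()` yields `(((a >> b) >> c) > d)`, while `grouped()` yields `(a >> ((b > c) > d))`.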

Best way to extract an array of strings from this single string with regular expressions

I have a data table converted from JSON in which one column is of class character and has rows that look like this:
"[{u'className': u'Sticker', u'__type': u'Pointer', u'objectId': u'mYz1ietNEt'}, {u'className': u'Sticker', u'__type': u'Pointer', u'objectId': u'FVn0hE5Zar'}, {u'className': u'Sticker', u'__type': u'Pointer', u'objectId': u'ZxUTYYCunL'}]"
What I really want is a vector of the objectId's, so ideally from the above I would get:
['mYz1ietNEt', 'FVn0hE5Zar', 'ZxUTYYCunL']
What's the best way to get there? Or how do I get there at all? For a single string, test, here's what I've tried:
test1 = strsplit(test, split = "}, ")
test1 = test1[[1]]
That's fine, but I can't seem to find a way to get rid of the left bracket { let alone the other portions of the string that are undesirable.
> test2 = strsplit(test1, "{")
Error in strsplit(test1, "{") :
invalid regular expression '{', reason 'Missing '}''
> test2 = strsplit(test1, "\{")
Error: '\{' is an unrecognized escape in character string starting ""\{"
> test2 = strsplit(test1, u"{")
Error: unexpected string constant in "test2 = strsplit(test1, u"{""
> test2 = strsplit(test1, r"{")
Error: unexpected string constant in "test2 = strsplit(test1, r"{""
Ideally I could find some regex expression that could extract all of the objectId fields in one fell swoop into a vector. Is there something like this?
We can extract the identified strings with a regular expression pattern:
library(stringr)
str_extract_all(test1, "(?<=objectId':\\su')(.*?)(?=')")[[1]]
[1] "mYz1ietNEt" "FVn0hE5Zar" "ZxUTYYCunL"

Syntax error when compiling a COOL project

I got the project from
https://github.com/cchuahuico/COOL-Compiler
When I compile this code:
class Main inherits SuperClass {
main(): Object {{
out_string("Enter an integer greater-than or equal-to 0: ");
let input: Int in
if input < 0 then
input
-- out_string("ERROR: Number must be greater-than or equal-to 0\n")
else {
-- out_string("The factorial of ").out_int(input);
-- out_string(" is ").out_int(factorial(input));
factorial(input);
}
fi;
}};
factorial(num: Int): Int {
if num = 0 then 1 else num * factorial(num - 1) fi
};
};
class SuperClass {
out_string(str:String){};
};
When I compile this with MinGW, I get these errors:
<stdin>:2:error:syntax error near or aat character or token '{'
<stdin>:5:error:syntax error near or aat character or token 'let'
<stdin>:14:error:syntax error near or aat character or token 'fi'
<stdin>:15:error:syntax error near or aat character or token '}'
<stdin>:23:error:syntax error near or aat character or token '('
<stdin>:24:error:syntax error near or aat character or token '}'
<stdin>:24:error:syntax error near or aat character or token ' '
copmilation halted due to lexical or syntax errors
The mingw compiler is complaining because the language is not standard C++:
The inherits keyword (although it could be #defined as anything).
fi is not a keyword.
The if statement needs parens () around the expression.
let is not a keyword.
then is not a keyword.
// There are a lot more
Is this another language?

Bison outputting string after the wrong line

The input
1 -- Narrowing Variable Initialization
2
3 function main a: integer returns integer;
4 b: integer is a * 2.;
5 begin
6 if a <= 0 then
7 b + 3;
8 else
9 b * 4;
10 endif;
11 end;
is yielding the output
1 -- Narrowing Variable Initialization
2
3 function main a: integer returns integer;
4 b: integer is a * 2.;
5 begin
Narrowing Variable Initialization
6 if a <= 0 then
7 b + 3;
8 else
9 b * 4;
10 endif;
11 end;
instead of placing the error message under line 4, which is where the error actually occurs. I've looked at it for hours and can't figure it out.
%union
{
char* ident;
Types types;
}
%token <ident> IDENTIFIER
%token <types> INTEGER_LITERAL
%token <types> REAL_LITERAL
%token BEGIN_
%token FUNCTION
%token IS
%token <types> INTEGER
%token <types> REAL
%token RETURNS
%type <types> expression
%type <types> factor
%type <types> literal
%type <types> term
%type <types> statement
%type <types> type
%type <types> variable
%%
program:
/* empty */ |
functions ;
functions:
function_header_recovery body ; |
function_header_recovery body functions ;
function_header_recovery:
function_header ';' |
error ';' ;
function_header:
FUNCTION {locals = new Locals();} IDENTIFIER optional_parameters RETURNS type {globals->insert($3,locals->tList);} ;
optional_parameters:
/* empty */ |
parameters;
parameters:
IDENTIFIER ':' type {locals->insert($1, $3); locals->tList.push_back($3); } |
IDENTIFIER ':' type {locals->insert($1, $3); locals->tList.push_back($3); } "," parameters;
type:
INTEGER | REAL ;
body:
optional_variables BEGIN_ statement END ';' ;
optional_variables:
/* empty */ |
variables ;
variables:
variable IS statement {checkTypes($1, $3, 2);} |
variable IS statement {checkTypes($1, $3, 2);} variables ;
variable:
IDENTIFIER ':' type {locals->insert($1, $3);} {$$ = $3;} ;
statement:
expression ';' |
...
Types checkTypes(Types left, Types right, int flag)
{
if (left == right)
{
return left;
}
if (flag == 1)
{
Listing::appendError("Conditional Expression Type Mismatch", Listing::SEMANTIC);
}
else if (flag == 2)
{
if (left < right)
{
Listing::appendError("Narrowing Variable Initialization", Listing::SEMANTIC);
}
}
return REAL_TYPE;
}
printing being handled by:
void Listing::nextLine()
{
printf("\n");
if (error == "")
{
lineNo++;
printf("%4d%s",lineNo," ");
}
else
{
printf("%s", error.c_str());
error = "";
nextLine();
}
}
void Listing::appendError(const char* errText, int errEnum)
{
error = error + errText;
if (errEnum == 997)
{
lexErrCount++;
}
else if (errEnum == 998)
{
synErrCount++;
}
else if (errEnum == 999)
{
semErrCount++;
}
}
void Listing::display()
{
printf( "\b\b\b\b\b\b " );
if (lexErrCount + synErrCount + semErrCount > 0)
{
printf("\n\n%s%d","Lexical Errors ",lexErrCount);
printf("\n%s%d","Syntax Errors ",synErrCount);
printf("\n%s%d\n","Semantic Errors ",semErrCount);
}
else
{
printf("\nCompiled Successfully\n");
}
}
That's just the way bison works. It produces a one-token lookahead parser, so your production actions don't get triggered until it has read the token following the production. Consequently, begin must be read before the action associated with variables happens. (bison never tries to combine actions, even if they are textually identical. So it really cannot know which variables production applies and which action to execute until it sees the following token.)
There are various ways to associate a line number and/or column position with each token, and to use that information when an error message is produced. Interspersing errors and/or warnings with the input text generally requires buffering the input. For syntax errors, you only need to buffer until the next token, but that is not a general solution: in some cases you may want to associate an error with an operator, say, but the error won't be detected until the operator's trailing argument has been parsed.
A simple technique to correctly intersperse errors/warnings with source is to write all the errors/warnings to a temporary file, putting the file offset at the front of each error. This file can then be sorted, and the input can then be reread, inserting the error messages at appropriate points. The nice thing about this strategy is that it avoids having to maintain line numbers for each error, which noticeably slows down lexical analysis. Of course, it won't work so easily if you allow constructs like C's #include.
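A minimal sketch of that idea, under some simplifying assumptions: the diagnostics are held in memory rather than in a temporary file, the source is passed as a string rather than reread from disk, and the names `Diagnostic` and `interleave` are made up for this example. Each diagnostic carries only a byte offset; line boundaries are recovered while walking the source a second time.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One diagnostic, keyed by the byte offset in the source where it applies.
using Diagnostic = std::pair<std::size_t, std::string>;

// Returns the listing: each source line, followed by the diagnostics whose
// offset falls inside that line.
std::string interleave(const std::string& source, std::vector<Diagnostic> diags) {
    // Sort by offset so the source can be walked once, front to back.
    std::stable_sort(diags.begin(), diags.end(),
                     [](const Diagnostic& a, const Diagnostic& b) {
                         return a.first < b.first;
                     });
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    std::size_t lineEnd = 0;  // offset one past the current line's newline
    std::size_t next = 0;     // index of the next diagnostic to emit
    while (std::getline(in, line)) {
        lineEnd += line.size() + 1;
        out << line << '\n';
        while (next < diags.size() && diags[next].first < lineEnd) {
            out << diags[next].second << '\n';
            ++next;
        }
    }
    // Diagnostics past the end of the input go at the bottom.
    for (; next < diags.size(); ++next) out << diags[next].second << '\n';
    return out.str();
}
```

With this shape, a semantic action only has to record the current input offset when it reports a problem. It does not matter that the action runs after the parser has already read the following token.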
Because generating good error messages is hard, and even tracking locations can slow parsing down quite a bit, I've sometimes used the strategy of parsing input twice if an error is detected. The first parse only detects errors and fails early if it can't do anything more reasonable; if an error is detected, the input is reparsed with a more elaborate parser which carefully tracks file locations and possibly even uses heuristics like indentation depth to try to produce better error messages.
As rici notes, bison produces an LALR(1) parser, so it uses one token of lookahead to know what action to take. However, it doesn't ALWAYS use a token of lookahead -- in some cases (where there's only one possibility regardless of lookahead), it uses default reductions which can reduce a rule (and run the associated action) WITHOUT lookahead.
In your case, you can take advantage of that to get the action to run without lookahead if you really need to. The particular rule in question (which triggers the requirement for lookahead) is:
variables:
variable IS statement {checkTypes($1, $3, 2);} |
variable IS statement {checkTypes($1, $3, 2);} variables ;
In this case, after seeing a variable IS statement, it needs to see the next token to decide whether there are more variable declarations, in order to know which action (the first or the second) to run. But as the two actions are really the same, you could combine them into a single action:
variables: vardecl | vardecl variables ;
vardecl: variable IS statement {checkTypes($1, $3, 2);}
which would end up using a default reduction as it doesn't need the lookahead to decide between two reductions/actions.
Note that the above depends on being able to find the end of a statement without lookahead, which should be the case as long as all statements end unambiguously with a ;

Explanation and solution for JavaCC's warning "Regular expression choice : FOO can never be matched as : BAR"?

I am teaching myself to use JavaCC in a hobby project, and have a simple grammar to write a parser for. Part of the parser includes the following:
TOKEN : { < DIGIT : (["0"-"9"]) > }
TOKEN : { < INTEGER : (<DIGIT>)+ > }
TOKEN : { < INTEGER_PAIR : (<INTEGER>){2} > }
TOKEN : { < FLOAT : (<NEGATE>)? <INTEGER> | (<NEGATE>)? <INTEGER> "." <INTEGER> | (<NEGATE>)? <INTEGER> "." | (<NEGATE>)? "." <INTEGER> > }
TOKEN : { < FLOAT_PAIR : (<FLOAT>){2} > }
TOKEN : { < NUMBER_PAIR : <FLOAT_PAIR> | <INTEGER_PAIR> > }
TOKEN : { < NEGATE : "-" > }
When compiling with JavaCC I get the output:
Warning: Regular Expression choice : FLOAT_PAIR can never be matched as : NUMBER_PAIR
Warning: Regular Expression choice : INTEGER_PAIR can never be matched as : NUMBER_PAIR
I'm sure this is a simple concept but I don't understand the warning, being a novice in both parser generation and regular expressions.
What does this warning mean (in as-novice-as-you-can-get terms)?
I don't know JavaCC, but I am a compiler engineer.
The FLOAT_PAIR rule is ambiguous. Consider the following text:
0.0
This could be FLOAT 0 followed by FLOAT .0; or it could be FLOAT 0. followed by FLOAT 0; both resulting in FLOAT_PAIR. Or it could be a single FLOAT 0.0.
More importantly, though, you are using lexical analysis with composition in a way that is never likely to work. Consider this number:
12345
This could be parsed as INTEGER 12, INTEGER 345 resulting in an INTEGER_PAIR. Or it could be parsed as INTEGER 123, INTEGER 45, another INTEGER_PAIR. Or it could be INTEGER 12345, another token. The problem exists because you are not requiring white space between the lexical elements of the INTEGER_PAIR (or FLOAT_PAIR).
You should almost never try to handle pairs like this in the lexer. Instead, you should handle plain numbers (INTEGER and FLOAT) as tokens, and handle things like negation and pairing in the parser, where whitespace has been dealt with and stripped.
(For example, how are you going to process "----42"? This is a valid expression in most programming languages, which will correctly calculate multiple negations, but would not be handled by your lexer.)
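To make the division of labour concrete, here is a hand-written C++ sketch rather than JavaCC (whose syntax I'd rather not guess at). All names in it (`Kind`, `Token`, `lex`, `parseNumber`, `parsePair`) are invented for the example. The lexer only knows plain numbers and `-`, skips whitespace, and always takes the longest run of digits; negation and pairing live in the parser.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Token kinds for a deliberately tiny lexer: plain numbers and '-'.
enum class Kind { Number, Minus };

struct Token {
    Kind kind;
    std::string text;  // for Number: digits with at most one '.'
};

// Lexer: skips whitespace and takes the longest run of digits/'.' as one
// Number, so "12345" is always a single token.
std::vector<Token> lex(const std::string& in) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (c == '-') { out.push_back({Kind::Minus, "-"}); ++i; continue; }
        if (std::isdigit(c) || c == '.') {
            std::size_t start = i;
            bool seenDot = false;
            while (i < in.size()) {
                unsigned char d = static_cast<unsigned char>(in[i]);
                if (d == '.' && !seenDot) seenDot = true;
                else if (!std::isdigit(d)) break;
                ++i;
            }
            out.push_back({Kind::Number, in.substr(start, i - start)});
            continue;
        }
        throw std::runtime_error("unexpected character");
    }
    return out;
}

// Parser: negation is handled here, one Minus token at a time.
double parseNumber(const std::vector<Token>& toks, std::size_t& pos) {
    bool negative = false;
    while (pos < toks.size() && toks[pos].kind == Kind::Minus) {
        negative = !negative;
        ++pos;
    }
    if (pos >= toks.size() || toks[pos].kind != Kind::Number)
        throw std::runtime_error("expected a number");
    double value = std::stod(toks[pos++].text);
    return negative ? -value : value;
}

// A pair is simply two numbers in a row; the whitespace is already gone.
std::pair<double, double> parsePair(const std::string& in) {
    std::vector<Token> toks = lex(in);
    std::size_t pos = 0;
    double first = parseNumber(toks, pos);
    double second = parseNumber(toks, pos);
    if (pos != toks.size()) throw std::runtime_error("trailing input");
    return {first, second};
}
```

With this split, `parsePair("----42 0.5")` gives the pair 42 and 0.5, and `lex("12345")` produces exactly one token, so the ambiguity described above never arises.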
Also, be aware that single-digit integers in your lexer will not be matched as INTEGER, they will come out as DIGIT. I don't know the correct syntax for JavaCC to fix that for you, though. What you want is to define DIGIT not as a token, but simply something you can use in the definitions of other tokens; alternatively, embed the definition of DIGIT ([0-9]) directly wherever you are using DIGIT in your rules.
I haven't used JavaCC, but it is possible that NUMBER_PAIR is ambiguous.
I think the problem comes down to the fact that the same exact thing can be matched as both FLOAT_PAIR and INTEGER_PAIR since FLOAT can match an INTEGER.
But this is just a guess having never seen the JavaCC syntax :)
It probably means that for every FLOAT_PAIR you'll just get a FLOAT_PAIR token, never a NUMBER_PAIR token. The FLOAT_PAIR rule already matches all the input and JavaCC will not try to find further matching rules. That would be my interpretation, but I don't know JavaCC, so take it with a grain of salt.
Maybe you can specify somehow that NUMBER_PAIR is the main production and that you don't want to get any other tokens as results.
Thanks to Barry Kelly's answer, the solution I've come up with is:
SKIP : { < #TO_SKIP : " " | "\t" > }
TOKEN : { < #DIGIT : (["0"-"9"]) > }
TOKEN : { < #DIGITS : (<DIGIT>)+ > }
TOKEN : { < INTEGER : <DIGITS> > }
TOKEN : { < INTEGER_PAIR : (<INTEGER>) (<TO_SKIP>)+ (<INTEGER>) > }
TOKEN : { < FLOAT : (<NEGATE>)?<DIGITS>"."<DIGITS> | (<NEGATE>)?"."<DIGITS> > }
TOKEN : { < FLOAT_PAIR : (<FLOAT>) (<TO_SKIP>)+ (<FLOAT>) > }
TOKEN : { < #NUMBER : <FLOAT> | <INTEGER> > }
TOKEN : { < NUMBER_PAIR : (<NUMBER>) (<TO_SKIP>)+ (<NUMBER>) >}
TOKEN : { < NEGATE : "-" > }
I had completely forgotten to include the space that separates the two tokens. I've also used the '#' symbol, which stops those tokens from being matched on their own, so they are only used in the definitions of other tokens. The above is compiled by JavaCC without warning or error.
However, as noted by Barry, there are reasons against doing this.