So I discovered that writing an if statement with parentheses in Perl 6 results in it throwing this error at me:
===SORRY!===
Word 'if' interpreted as 'if()' function call; please use whitespace instead of parens
at C:/test.p6:8
------> if<HERE>(True) {
Unexpected block in infix position (two terms in a row)
at C:/test.p6:8
------> if(True)<HERE> {
This makes me assume that there is some sort of if() function? However, creating and running a script with if(); in it produces the following compiler error:
===SORRY!===
Undeclared routine:
if used at line 15
So like what's the deal?
I read here https://en.wikibooks.org/wiki/Perl_6_Programming/Control_Structures#if.2Funless that parens are optional, but that seems not to be the case for me.
My if statements do work without parens; I'm just wondering why it would stop me from using them, or why it would think that if is a subroutine because of them.
EDIT: Well aren't I a bit daft... looks like I wasn't reading well enough at the link I linked which I assume is why you are confused. The link I linked points out the following which was basically what I was asking:
if($x > 5) { # Calls subroutine "if"
}
if ($x > 5) { # An if conditional
}
I've accepted the below answer as it does provide some insight.
Are you sure you created a sub with the name 'if'? If so, (no pun intended), you get the keyword if you use a space after the literal 'if', otherwise you get your pre-declared function if you use a paren after the literal 'if' - i.e. if your use of the term looks like a function call - and you have declared such a function - it will call it;
user@localhost:~$ perl6
> sub if(Str $s) { say "if sub says: arg = $s" };
sub if (Str $s) { #`(Sub|95001528) ... }
> if "Hello World";
===SORRY!=== Error while compiling <unknown file>
Missing block
at <unknown file>:1
------> if "Hello World"⏏;
expecting any of:
block or pointy block
> if("Hello World");
if sub says: arg = Hello World
>
> if 12 < 16 { say "Excellent!" }
Excellent!
>
You can see above, I've declared a function called 'if'.
if "Hello World"; errors as the space means I'm using the keyword and therefore we have a syntax error in trying to use the if keyword.
if("Hello World") successfully calls the pre-declared function.
if 12 < 16 { say "Excellent!" } works correctly as the space means 'if' is interpreted as the keyword and this time there is no syntax error.
So again, are you sure you have (or better - can you paste here) your pre-declared 'if' function?
The reference for keywords and whitespace (which coincidentally uses the keyword 'if' as an example!) is here: S02 - Keywords and whitespace
I've simplified a more complicated pattern I'm trying to match down to the following program:
my token paren { '(' <tok> ')' }
my token tok { <paren>? foo }
say "(foo)foo" ~~ /<tok>/;
This seems straightforward enough to me, but I get this error:
No such method 'tok' for invocant of type 'Match'. Did you mean 'to'?
in regex paren at a.p6 line 1
in regex tok at a.p6 line 2
in block <unit> at a.p6 line 4
What's the reason for this error?
The pattern matches without error if I change the first <tok> to <&tok>, but then I don't get a capture of that named pattern, which, in my original, more-complicated case, I need.
The problem is that tok isn't in the current lexical namespace yet, so <tok> gets compiled as a method call instead.
If you force it to be a lexical call with &, it works.
my token paren { '(' <&tok> ')' }
my token tok { <paren>? foo }
say "(foo)foo" ~~ /<tok>/;
If <…> starts with anything other than a letter it doesn't capture.
So to capture it under the name tok, we add tok= to the <…>
my token paren { '(' <tok=&tok> ')' }
my token tok { <paren>? foo }
say "(foo)foo" ~~ /<tok>/;
Brad's answer is correct, but I wanted to offer another possible solution: you can recurse within a single regex using the special <~~> token. Using that, plus a regular named capture, would create the captures you want in the simplified example in your question; here's how that would look:
my token paren { $<tok> = ['(' <~~> ')']? foo }
I'm not sure whether your more complicated case can be as easily re-written to recurse on itself rather than with two mutually recursive tokens. But, when this pattern works, it can simplify the code quite a bit.
I would like to define an expression in C++ with macros and I am having quite a bit of trouble.
The expression is :
MATCH string WITH other_string
where string and other_string do not require " "
For example: MATCH r1 WITH string1 is the result I desire.
The purpose of this macro would be to check if r1 string matches with r2.
(I already have the code for the matching)
UPDATE
I would like to call MATCH hello WITH hi
in my main function
int main(){
MATCH hello WITH hi
}
and call my function from this macro to compare them. Both hello and hi are unquoted arguments and must be treated as variable names.
It is always dubious to use macros to make your code look like a different language. It is probably better to consider using a separate parser for your "meta-language" that generates the C++ code for you.
In this case, since C++ syntax requires some way to indicate the end of a statement (close braces or semi-colon) you are in kind of a jam.
Consider your example:
int main () { MATCH hello WITH hi }
Since hi is the last token before the end of main, there is no chance to fix-up the syntax to match C++ requirements.
You can't do what you want, so you have to do something different
If you really intend to embed this syntax into your C++ code, you need sentinel tokens to allow you to fix-up the syntax. My proposed syntax is:
int main () {
BEGIN_MATCHING
MATCH hello WITH hi
MATCH hello WITH hi
END_MATCHING
}
If this syntax is acceptable, then you can use the following macros.
#define BEGIN_MATCHING ((void)0
#define MATCH ); my_function(
#define WITH ,
#define END_MATCHING );
This will cause the code in the proposed syntax example to expand to:
int main () {
((void)0
); my_function( hello , hi
); my_function( hello , hi
);
}
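To see the sentinel macros working end to end, here is a minimal sketch. It assumes a stand-in my_function that merely records whether its two string arguments are equal; the asker's real matching code, the results vector, and the run_matches wrapper are all hypothetical names introduced for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the asker's matching code: it only records
// whether the two strings are equal, so the calls can be inspected later.
std::vector<bool> results;
void my_function(const std::string& a, const std::string& b) {
    results.push_back(a == b);
}

// The sentinel macros from the answer above.
#define BEGIN_MATCHING ((void)0
#define MATCH ); my_function(
#define WITH ,
#define END_MATCHING );

// Both operands are ordinary variables, as the question requires.
void run_matches() {
    std::string hello = "hello";
    std::string hi = "hi";
    BEGIN_MATCHING
        MATCH hello WITH hi
        MATCH hello WITH hello
    END_MATCHING
}
```

Calling run_matches() once appends two entries to results, one per MATCH line. The macros only work between the two sentinels; a MATCH outside them expands to a stray ");" and fails to compile.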
Simply stringify your arguments with #, something like:
#define MATCH_WITH(str1, str2) MATCH #str1 WITH #str2
That way:
MATCH_WITH(testing, testing)
becomes:
MATCH "testing" WITH "testing"
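As a rough sketch of the stringify approach in a form that compiles on its own, the macro below forwards the stringified arguments to a comparison helper. The helper match_strings is a hypothetical name standing in for the asker's matching code. Note that # produces string literals from the argument spelling, so the arguments are compared as text rather than looked up as variables.

```cpp
#include <cstring>

// Hypothetical comparison helper standing in for the asker's matching code.
bool match_strings(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// '#' turns each macro argument into a string literal, so
// MATCH_WITH(testing, testing) becomes match_strings("testing", "testing").
#define MATCH_WITH(str1, str2) match_strings(#str1, #str2)
```

With this, MATCH_WITH(testing, testing) evaluates to true and MATCH_WITH(hello, hi) to false, regardless of whether variables with those names exist.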
I can't, for the life of me, figure out what's wrong with my regex's.
What I'd like to tokenize are two (2) types of strings, both of which must be contained on a single line. One string can be anything (other than a new line), and the other, any alpha-numeric (ASCII) character and literal '_', '/' '-', and '.'.
The snippet of flex code is:
nl \n|\r\n|\r|\f|\n\r
...
%%
...
\"[^\"]+{nl} { frx_parser_error("Label is missing trailing double quote."); }
\"[a-zA-Z0-9_\.\/\-]+\" {
if (yyleng > 1024) frx_parser_error("File name too long.");
yytext[yyleng - 1] = '\0';
frx_parser_lval.str = strdup(yytext+1);
fprintf(stderr,"TOSP_FILENAME: %s\n", frx_parser_lval.str);
return (TOSP_FILENAME);
}
\"[^{nl}]+\" {
yytext[yyleng - 1] = '\0';
frx_parser_lval.str = strdup(yytext+1);
fprintf(stderr,"TOSP_IDENTIFIER:\n%s\n", frx_parser_lval.str);
return (TOSP_IDENTIFIER);
}
And when I run the parser, the fprintf's spit this out:
TOSP_FILENAME: ModStar-Picture-Analysis.txt
TOSP_FILENAME: ModStar-Rubric.log.txt
TOSP_IDENTIFIER:
picture-A"
Progress (26,255) camera 'C' root("picture-C-
Syntax (line 34): syntax error
For whatever reason, the quote after picture-A is being ... missed. Why? I checked the ASCII values for the eight locations where the double-quote character appears and they're all 0x22.
If I add some characters to the end of the "picture-A" it can work sometimes; adding ".par", ".pbr" doesn't work as expected, but ".pnr" does.
I've even added a specific non-regexy token:
\"picture-A\" { frx_parser_lval.str = strdup("picture-A"); return TOSP_FILENAME; }
to the lex file and it gets skipped.
I'm using flex 2.5.39, no flex libraries, one option (%option prefix=frx_parser_) in the lex file and the flex command line is:
flex -t script-lexer.l > script-lexer.c
What gives?
EDIT I need to test this on the actual system, but unit tests show this tokenizer to be much more robust (based on rici's answer):
nl \n|\r\n|\r|\f|\n\r
...
%%
...
["][^"]+{nl} { printf("Missing trailing quote.\n%s\n",yytext); }
["][[:alnum:]_./-]+["] { printf("File name:\n%s\n",yytext); }
["][^"]+["] { printf("String:\n%s\n",yytext); }
EDIT The rule ["].+["] swallows consecutive multiple strings as one big string. It was changed to ["][^"]+["]
The problem is your pattern:
\"[^{nl}]+\"
You're attempting to expand a definition inside a character class, but that is not possible; inside a character class, { is always just a {, not a flex operator. See the flex manual:
Note that inside of a character class, all regular expression operators lose their special meaning except escape (‘\’) and the character class operators, ‘-’, ‘]]’, and, at the beginning of the class, ‘^’.
A definition is not a macro. Rather, a definition defines a new regular expression operator.
As a consequence of the above, you can write [^\"] as simply [^"] and \"[a-zA-Z0-9_\.\/\-]+\" as \"[a-zA-Z0-9_./-]+\" (The - needs to be either at the end or at the beginning.) Personally, I'd write the second pattern as:
["][[:alnum:]_./-]+["]
But everyone has their own style.
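The effect of braces inside a character class can be reproduced outside flex. The sketch below uses C++ std::regex (ECMAScript syntax), which is not flex and has no definitions at all, but treats { n l } inside brackets as four literal characters in the same way. The helper match_at_start and the two pattern names are made up for this illustration.

```cpp
#include <regex>
#include <string>

// Returns the text matched at the very start of `input`, or "" if the
// pattern does not match there (roughly how a lexer rule is tried).
std::string match_at_start(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(input, m, re, std::regex_constants::match_continuous)) {
        return m.str();
    }
    return "";
}

// Inside brackets, "{nl}" is just the four characters '{', 'n', 'l', '}'.
const std::string braces_in_class = R"re("[^{nl}]+")re";
// Excluding the quote itself keeps the match inside one string.
const std::string quote_excluded = R"re("[^"]+")re";
```

Against input that starts with "picture-A" followed by a newline and more quoted text, braces_in_class runs across the closing quote and the newline, because neither is one of the four excluded characters. Against "picture-A.pnr" it fails at the letter n, which would explain why that suffix let the file name rule win in the question.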
In vim (eg 7.3), how can I use/modify the cindent or smartindent options, or otherwise augment my .vimrc, in order to automatically indent curly braces inside open parentheses to align to the first "word" (defined later) directly preceding the opening (?
The fN option seems promising, but appears to be overridden by the (N option when inside open parentheses. From :help cinoptions-values:
fN Place the first opening brace of a function or other block in
column N. This applies only for an opening brace that is not
inside other braces and is at the start of the line. What comes
after the brace is put relative to this brace. (default 0).
    cino=          cino=f.5s       cino=f1s
      func()         func()          func()
      {                {                 {
          int foo;         int foo;          int foo;
Current behavior:
func (// no closing )
      // (N behavior, here N=0
      { // (N behavior overrides fN ?
        int foo; // >N behavior, here N=2
while I wish for:
func (// no closing )
      // (N behavior as before
{ // desired behavior
  int foo; // >N behavior still works
What I am asking for is different from fN because fN aligns to the prevailing indent, and I want to align to any C++ nested-name-specifier that directly precedes the opening (, like
code; f::g<T> (      instead of      code; f::g<T> (
      {                              {
If there is no nested-name-specifier, I'd like it to match the ( itself. Perhaps matching a nested-name-specifier is too complicated, or maybe there is another part of the grammar that is more appropriate for this scenario. Anyway, for my typical use case, I think I'd be satisfied if the { aligns with the first nonwhitespace character of the maximal sequence of characters to the left of the innermost unclosed (, inclusive, that does not contain any semicolons or left curly braces }.
By the way, I arrived at this when trying to autoindent various std::for_each(b,e,[]{}); constructs in vim7.3. Thanks for your help!
Not sure that any of the {auto,smart,c}indent features could be finagled to do what you want. I made up a mapping which might give some inspiration:
inoremap ({ <esc>T<space>y0A<space>(<cr><esc>pVr<space>A{
Downsides are that you may need to do something smarter than 'T' to get back to the beginning of the last identifier (you could use '?' with a regex), that it trashes your default register, and that if your identifier before the paren is at the start of the line you have to do '({' yourself. The notion is to jump back to just before the identifier, copy to the beginning of the line, paste that to the next line, and replace every character with a space.
Good luck!
Roughly speaking in C++ there are:
operators (+, -, *, [], new, ...)
identifiers (names of classes, variables, functions,...)
const literals (10, 2.5, "100", ...)
some keywords (int, class, typename, mutable, ...)
brackets ({, }, <, >)
preprocessor (#, ## ...).
But what is the semicolon?
The semicolon is a punctuator, see 2.13 §1
The lexical representation of C++ programs includes a number of preprocessing tokens which are used in
the syntax of the preprocessor or are converted into tokens for operators and punctuators
It is part of the syntax and, as such, an element of several statements. In EBNF:
<do-statement>
::= 'do' <statement> 'while' '(' <expression> ')' ';'
<goto-statement>
::= 'goto' <label> ';'
<for-statement>
::= 'for' '(' <for-initialization> ';' <for-control> ';' <for-iteration> ')' <statement>
<expression-statement>
::= <expression> ';'
<return-statement>
::= 'return' <expression> ';'
This list is not complete. Please see my comment.
The semicolon is a terminal, a token that terminates something. What exactly it terminates depends on the context.
Semicolon denotes sequential composition. It is also used to delineate declarations.
Semicolon is a statement terminator.
The semicolon isn't given a specific name in the C++ standard. It's simply a character that's used in certain grammar productions (and it just happens to be at the end of them quite often, so it 'terminates' those grammatical constructs). For example, a semicolon character is at the end of the following parts of the C++ grammar (not necessarily a complete list):
an expression-statement
a do/while iteration-statement
the various jump-statements
the simple-declaration
Note that in an expression-statement, the expression is optional. That's why a 'run' of semicolons, ;;;;, is valid in many (but not all) places where a single one is.
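A small sketch of that point: every extra semicolon inside a function body below is an empty expression-statement, so the function compiles and behaves as if they were absent. The function name count_to_three is invented for the example. The header of a for loop is one of the places where the count is fixed; for (;;;) would be a syntax error.

```cpp
// Each extra ';' inside a function body is an empty expression-statement.
int count_to_three() {
    int n = 0;;;;              // one declaration, then three empty statements
    for (;;) {                 // all three parts of the for header may be empty
        if (++n == 3) break;;  // the second ';' is an empty statement
    }
    return n;;
}
```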
';'s are often used to delimit one bit of C++ source code, indicating it's intentionally separate from the following code. To see how it's useful, let's imagine we didn't use it:
For example:
#include <iostream>
int f() { std::cout << "f()\n"; return 0; }
int g() { std::cout << "g()\n"; return 0; }
int main(int argc, char**)
{
std::cout << "message"
"\0\1\0\1\1"[argc] ? f() : g(); // final ';' needed to make this compile
// but imagine it's not there in this new
// semicolon-less C++ variant....
}
This (horrible) bit of code, called with no arguments such that argc is 1, prints:
ef()\n
Why not "messagef()\n"? That's what might be expected given first std::cout << "message", then "\0\1\0\1\1"[1] being '\1' - true in a boolean sense - suggests a call to f() printing f()\n?
Because... (drumroll please)... in C++ adjacent string literals are concatenated, so the program's parsed like this:
std::cout << "message\0\1\0\1\1"[argc] ? f() : g();
What this does is:
find the [argc/1] (second) character in "message\0\1\0\1\1", which is the first 'e'
send that 'e' to std::cout (printing it)
the ternary operator '?' triggers casting of std::cout to bool which produces true (because the printing presumably worked), so f() is called...!
Given this string literal concatenation is incredibly useful for specifying long strings
(and even shorter multi-line strings in a readable format), we certainly wouldn't want to assume that such strings shouldn't be concatenated. Consequently, if the semicolon's gone then the compiler must assume the concatenation is intended, even though visually the layout of the code above implies otherwise.
That's a convoluted example of how C++ code with and without ';'s changes meaning. I'm sure if I or other readers think on it for a few minutes we could come up with other - and simpler - examples.
Anyway, the ';' is necessary to inform the compiler that statement termination/separation is intended.
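The parse described above can be checked without a terminal by writing to a string stream instead of std::cout. This is only a sketch of the same statement shape: run, and the stream parameter added to f and g, are changes made for the illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

int f(std::ostream& out) { out << "f()\n"; return 0; }
int g(std::ostream& out) { out << "g()\n"; return 0; }

// Same statement shape as the example above, writing to a string stream.
// The two adjacent literals are concatenated before [argc] is applied.
std::string run(int argc) {
    std::ostringstream out;
    out << "message"
           "\0\1\0\1\1"[argc] ? f(out) : g(out);
    return out.str();
}
```

Here run(1) yields the character 'e' followed by f's output, matching the behaviour of the original program when argc is 1.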
The semicolon lets the compiler know that it's reached the end of a command AFAIK.
The semicolon (;) is a command in C++. It tells the compiler that you're at the end of a command.
If I recall correctly, Kernighan and Ritchie called it punctuation.
Technically, it's just a token (or terminal, in compiler-speak), which
can occur in specific places in the grammar, with a specific semantics
in the language. The distinction between operators and other punctuation
is somewhat artificial, but useful in the context of C or C++, since
some tokens (,, = and :) can be either operators or punctuation,
depending on context, e.g.:
f( a, b ); // comma is punctuation
f( (a, b) ); // comma is operator
a = b; // = is assignment operator
int a = b; // = is punctuation
x = c ? a : b; // colon is operator
label: // colon is punctuation
In the case of the first two, the distinction is important, since a user
defined overload will only affect the operator, not punctuation.
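A minimal sketch of that difference for the comma, using a made-up Item type, a counter, and two overloads of a hypothetical arity function. The overloaded operator is only invoked where the comma is an operator; the comma separating function arguments never reaches it.

```cpp
struct Item { int id; };

// Counts how often the user-defined comma operator below is invoked.
int comma_calls = 0;
Item operator,(Item, Item rhs) { ++comma_calls; return rhs; }

// Two overloads make it visible how many arguments a call really passes.
int arity(Item) { return 1; }
int arity(Item, Item) { return 2; }
```

With Item a{1}, b{2}; the call arity(a, b) passes two arguments and leaves comma_calls at 0, while arity((a, b)) passes one argument and increments comma_calls through the overload.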
It represents the end of a C++ statement.
For example,
int i=0;
i++;
In the above code there are two statements. The first is for declaring the variable and the second one is for incrementing the value of variable by one.