Regex pattern to match switch statements in C++

I am trying to write a regex pattern to be used in a bash script which checks for the syntax of switch statements (C++).
The syntax for switch statements which I want to follow is the following one.
switch(expression)
{
case constant-expression:
statement(s);
break; // must be present
case constant-expression:
statement(s);
break; // must be present
....
....
default : // must be present
statement(s);
break; // must be present
}
Please note that even though the break and default statements are not a must, I wish to check for their presence.
I have written this regex pattern to match switch blocks.
switch(.*?)\n(\s)*?{(\n(.*?))*?(\n(\s)*case(.*?):?(\n(.*?))*?break;)+(\n(.*?))*?\n(\s)*(default:)?(\n(\s)*)*(break|return(.*?))?;(\n(\s)*(.*?))*}
It successfully matches switch blocks but the problem is that it matches the switch blocks even if the break and default statements are missing. I tried using + operator with the break and default words but they don't seem to work.
EDIT UPDATE:
Is it possible to match switch blocks such as the following one using a parser?
switch (PC_INT[address.port][address.pin])
{
#if defined (__AVR_ATmega2560__) || defined(__AVR_AT90CAN128__)
case EINT_0:
// Mask the interrupt so it doesn't fire anymore, i.e put a zero in the mask register.
EIMSK &= ~(1 << INT0);
break;
case EINT_1:
EIMSK &= ~(1 << INT1);
break;
....
default:
return GPIO_INT_OUT_OF_RANGE;
#elif defined(__AVR_ATmega64M1__) || defined(__AVR_ATmega64C1__)
case EINT_0:
// Mask the interrupt so it doesn't fire anymore, i.e put a zero in the mask register.
EIMSK &= ~(1 << INT0);
break;
case EINT_1:
EIMSK &= ~(1 << INT1);
break;
....
default:
return GPIO_INT_OUT_OF_RANGE;
#else
#error "GPIO interrupts not implemented for this configuration."
#endif
}

Non-greedy patterns (like .*?) are not magic.
You apparently expect the .*? in (\<case:.*?\<break;\s*)+ (a simplified form of your regex) to not match case:. Why wouldn't it? In other words, the text:
case 1:
do_something();
case 2:
do_something_else();
break;
certainly matches case.*?break;; the .*? matches 1: do_something(); case 2: do_something_else();.
.*? isn't a fence, either. case.*?break(more) might not match the first break following the case, if (more) doesn't match the text following the first break but does match the text following the second one.
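The overshoot is easy to reproduce. Here is a minimal sketch using std::regex, assuming the default ECMAScript grammar, where . does not match newlines, so [\s\S] stands in for "any character". The helper name first_match is made up for this illustration and is not part of your script.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string when nothing matches.
std::string first_match(const std::string& text, const std::string& pattern) {
    std::smatch m;
    const std::regex re(pattern);
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// The snippet from the answer: the first case has no break of its own.
const std::string kSnippet =
    "case 1:\n"
    "  do_something();\n"
    "case 2:\n"
    "  do_something_else();\n"
    "  break;\n";

// Lazy quantifier: it stops at the first "break;" it can reach, but nothing
// prevents it from walking over "case 2:" on the way there.
const std::string kLazyCasePattern = R"(case[\s\S]*?break;)";
```

Calling first_match(kSnippet, kLazyCasePattern) returns a single match that starts at "case 1:" and runs through "case 2:" up to the only "break;", so the missing break in the first case goes unnoticed.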
As for the default: apparently being optional, that's precisely what your regex says:
(default:)?
I don't think the regex is salvageable. You can't parse C or C++ with regexes.
You really need to use a better parsing infrastructure. You could build a simple parser using flex and bison which would work for source code which doesn't play games with the preprocessor, but you might be better off using a real C++ parsing library, like libclang.

Related

Space after 'if', 'while', 'catch' etc.. with clang-format

I cannot figure out which option adds the space after if, while, catch, etc.
Currently my .clang-format file produces this:
while(true)
{
if(flushedCount == count)
{
break;
}
}
The clang-format configuration option controlling space after if, while, catch and other control statements is called SpaceBeforeParens.
SpaceBeforeParens: ControlStatements
From clang-format 8 documentation:
SpaceBeforeParens (SpaceBeforeParensOptions)
Defines in which cases to put a space before opening parentheses.
Possible values:
[...]
SBPO_ControlStatements (in configuration: ControlStatements) Put a space before opening parentheses only after control statement keywords (for/if/while...).
[...]

write interpreter for file format (C++ Arduino)

So I have a .txt file (Excellon) which I want to interpret.
Example file:
M48
FMAT,2
ICI,OFF
METRIC,TZ,000.000
T1C1.016
%
G90
M71
T1
X36551Y-569519
X17780Y-589280
When I scan the file I separate the statement (e.g. METRIC) and save it in a string. After this I want to execute code based on the value of this string.
What would be the best practice for executing commands on statement detection?
if(String == "METRIC")
{
execute code;
}
else if (String == "M48")
{
execute code;
}
etc.
Or something like this:
switch(String)
{
case: "M48"
execute code;
break;
case: "METRIC"
execute code;
break;
etc.
}
Or are both of these methods wrong and should I use a different method?
I found this: Switch or if statements in writing an interpreter in java. They are talking about using a map. Should I also try this? If so, could you provide a simple example, because I don't really understand this method.
The proper answer will depend on many factors, but reading between the lines of your post, I am 98% sure that what you want is a simple tokenizer to enum:
enum class Token {
AAA,
BBB,
CCC
};
// Trivially implementable as an if () {} else if () {} sequence,
// or as a trie search if you want to get fancy.
Token token_from_string(const std::string& str);
// and in the code.
Token tok = token_from_string(String);
switch(tok) {
case Token::AAA:
break;
case Token::BBB:
break;
case Token::CCC:
break;
}
And then, a good practice is to tokenize the string as soon as it comes out of the stream, and then operate on the token itself.
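To make the tokenizer concrete, here is a minimal sketch of token_from_string written as the if/else sequence mentioned above. The enumerators M48, Metric and Unknown, the describe helper, and the description strings are assumptions chosen for this example; a real Excellon reader would need many more tokens.

```cpp
#include <string>

// Illustrative token set: only two statements from the example file,
// plus a catch-all for everything this sketch does not recognise.
enum class Token { M48, Metric, Unknown };

// Map the scanned statement to a token once, right after reading it.
Token token_from_string(const std::string& str) {
    if (str == "M48") return Token::M48;
    if (str == "METRIC") return Token::Metric;
    return Token::Unknown;
}

// Hypothetical consumer: the rest of the program switches on the enum
// instead of comparing strings again.
const char* describe(Token tok) {
    switch (tok) {
        case Token::M48:     return "start of header";
        case Token::Metric:  return "metric units";
        case Token::Unknown: break;
    }
    return "unrecognised statement";
}
```

With this in place, describe(token_from_string("METRIC")) yields "metric units", and any statement the sketch does not know falls through to "unrecognised statement" instead of being silently ignored.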
Q: What would be the best practice to execute commands on statement detection.
You want to change control flow, when a certain string is found.
A switch is saying "pick one of these commands based on this variable's value". You could also use if/else.
Q: If so could you provide a simple example because I don't really understand this method.
The Excellon file format isn't far away from CNC g-code.
This is an example for a switch from an EXCELLON to GCODE converter.
The trick would be to modify the output method generateFile, to not generate the G-code file using fprint's, but call your commands instead (probably move, lift, wait, etc.).
You could also start with a g-code parser and modify it to handle the excellon format.

Nested switch gives error and indicates virus (trojan)

Trying to make a program using switch case (nested switch) my system alerted me that my program has a virus (trojan). How is it even possible? I am new to programming (complete novice) so I would be grateful for any help.
The task - to make automated telephonic reply system based upon requirements (just something I wanted to try).
#include<iostream>
using namespace std;
void customer_service()
{
cout<<"Kindly wait for our employees to contact you";
}
void feedback()
{
cout<<"Kindly record your feedback after the beep";
}
void offer()
{
cout<<"You are entitled to accept our one-time offer. You will be directed to one of our employees shortly\n";
}
void satisfied()
{
cout<<"Thanks a lot for calling. Have a great day ahead";
}
int main()
{
int input,yes_no;
cout<<"\nPress 1 if you would want to directly contact our employee\n";
cout<<"\nPress 2 if you want to give a feedback\n";
cout<<"\nPress 3 if you would want to know about our offers\n";
cout<<"\nPress 4 if you are satisfied with our service\n";
cout<<"\nKindly press the required key\n";
cin>>input;
switch (input)
{
case 1:
customer_service();
break;
case 2:
feedback();
break;
case 3:
offer();
cout<<"Would you like to accept our one time offer? You will get a 50% decrease in tariff";
cin>>yes_no;
switch (yes_no)
{
case 1:
cout<<"Congratulations! You have won our one time offer";
break;
default:
cout<<"Guess you didn't like our offer";
break;
}
break;
case 4:
satisfied();
break;
default:
cout<<"Kindly press either one of '1, 2, 3 or 4' keys. Thankyou.";
}
cin.get();
return 0.00;
}
This is the indication of Trojan and the program not executing
It's a false positive.
You may be able to help the situation by initialising your variables. As it is, you do not check that reading into yes_no succeeded, so your program has undefined behaviour. That could make your AV think that you are trying to write a memory exploit.
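One way to do that check is sketched below. The helper read_int_or is a made-up name for this example: it starts from a known fallback value, tests whether the extraction succeeded, and on failure resets the stream and drops the rest of the offending line.

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: read one int from `in`; if the extraction fails,
// clear the error state, discard the rest of the line, and return `fallback`.
int read_int_or(std::istream& in, int fallback) {
    int value = fallback;  // never left uninitialised
    if (!(in >> value)) {
        in.clear();  // reset failbit so later reads can work again
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        return fallback;
    }
    return value;
}
```

In the program above this would be used as int input = read_int_or(cin, 0); and int yes_no = read_int_or(cin, 0);, so a non-numeric key press ends up in the default branch of each switch.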
Otherwise, get better AV!
Some antivirus programs simply have false positives. Just whitelist the program in this case or get another antivirus.
Or your toolchain itself is infected and you are compiling bad stuff into your programs (then it's time to clean up your OS).
There is nothing wrong with your code.
Programs like 360 Total Security are anti-malware products that are designed to run on your mother's machine. They are not appropriate on a programmer's machine. They deal poorly with an executable file that appears from nowhere. Uninstall it and consider something less aggressive, Windows Defender for example.
Nested switch cases will work for sure.
First, try replacing the second switch with an if/else. If the Trojan warning still shows up, then the problem is with your compiler.

What's a better way to parse for loads of cases than loads of individual if statements?

I am working on a parser to handle hundreds of possible commands, some with their own subcommands. I've got a tokenizer pulling out the commands into an object, but from there I just have a very very long list of if statements checking for each individual case.
Is there a better or more efficient way to check for each individual case rather than 100+ specified if statements?
For example, a command could be : A,CONFIG,SET,GARBLE,5. This would launch into setting the config for garble to 5. But that varies from A,CONFIG,SET,JAM,5 or another command like P,DO,ACTION which is itself another command entirely.
Right now my program covers all of these cases with individual if statements, but I feel like it's really inefficient. If you're the last command, you're taking the longest no matter what. Is there a better, more practical way to do this?
If you want code examples, it's really as simple as it sounds. After getting the object full of tokenized commands, I have an absolutely huge check where it individually looks for stuff like if (command == "P") launch into the command handler for P commands.
Depending on how complex your commands are, this might be a job for a 'real' parser, either a hand-rolled recursive descent parser or one built using tools like lex and yacc.
Alternatively, if that is overkill for your use case you could use a hash table of function pointers or command objects, look up the command in the table and call the function or a method on the object to process it. That would be more maintainable than a bunch of if statements in my opinion.
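A minimal sketch of the hash table approach, assuming C++11 or later. The names Args, Handler, command_table and dispatch, and the strings the handlers return, are invented for this example; real handlers would do the actual work for each command family.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using Args = std::vector<std::string>;
using Handler = std::function<std::string(const Args&)>;

// Hypothetical table: the first token of a command selects the handler,
// the remaining tokens are passed along as arguments.
const std::unordered_map<std::string, Handler>& command_table() {
    static const std::unordered_map<std::string, Handler> table = {
        {"A", [](const Args& args) {
             return "A handler, " + std::to_string(args.size()) + " argument(s)";
         }},
        {"P", [](const Args& args) {
             return "P handler, " + std::to_string(args.size()) + " argument(s)";
         }},
    };
    return table;
}

// One average-constant-time lookup replaces the chain of if statements.
std::string dispatch(const std::string& command, const Args& args) {
    const auto& table = command_table();
    const auto it = table.find(command);
    if (it == table.end()) {
        return "Invalid token";
    }
    return it->second(args);
}
```

For A,CONFIG,SET,GARBLE,5 the call would be dispatch("A", {"CONFIG", "SET", "GARBLE", "5"}). Subcommands can be handled the same way, with the A handler looking up args[0] in its own table.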
Arrange your commands alphabetically, then use nested case statements dealing with one letter at a time.
switch( command[0] ) {
case 'A': ...( code to handle commands starting with A ) ... break;
case 'B':
switch( command[1] ) {
case 'A': ... ( code to handle commands BA... ) ... break;
}
...
}
The switch statement optimizes the jump to the correct case.
You can use configuration-like functionalities.
I think it's better. You would basically have to create a configuration file with a dictionary containing all of the commands and their actions, and the program would just grab the command related to the keyword.
Pseudocode:
{
"COMMAND_1": function_1,
"COMMAND_2": function_2
}
This is JSON-like but you can do it however you like obviously.
The functions would basically be pointers to functions executing the related code.
Your code would basically become something like:
token = get_token();
if (auto command = token_in_dictionary(token, dictionary)){
command(get_params());
}else{
printf("Invalid token");
}

PDCurses KEY_ENTER does not work

Let's start with what my code looks like, then I will explain my problem:
int main() {
char ch; //Stores key presses
initscr();
raw();
nonl();
keypad(stdscr, TRUE);
noecho();
//Some code
ch = getch();
switch (ch) {
case KEY_UP:{
//Code that works
break;
}
case KEY_ENTER:{
//Some code- that doesn't work problem being the above
break;
}
//Other case statements
}
Now the problem:
The problem I run into, if you haven't already worked it out, is that whenever I press the enter/return key on my keyboard absolutely nothing happens.
I have tried changing KEY_ENTER to '\n' (didn't work) and even changed the type of ch, which went through multiple iterations including int and wchar_t.
All to no avail, and before you say search for answers and send me packing my bags to go on a perilous adventure through every corner of the interwebs, I have already tried that. If I hadn't, I wouldn't have ventured here in search of aid.
So now my search has brought me here and I ask of you, the lovely people of the interwebs, to help me in my search for the answer I have been looking for.
And to whoever may be valiant enough to answer it, I give you my utmost gratitude and thanks.
Try case '\r':. (For good measure, you could do case '\r': case '\n': case KEY_ENTER:, as is basically done in testcurs.c, to capture all possibilities.) The call to nonl() is why you're getting '\r' instead of '\n'.
As for KEY_ENTER, my only excuse is that it's marked "not reliable" in the PDCurses comments. I could pretend that it's meant to represent the keypad's "Enter" key, rather than the key usually marked "Return" in the main part of the keyboard... except that PDCurses also has PADENTER, specifically for that purpose. In truth, like a lot of things in PDCurses, the reason KEY_ENTER is there, and defined the way it is, is a bit of a historical mess.
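The "capture all possibilities" check can be sketched as a small helper. To keep the example independent of any curses header, the value of KEY_ENTER is passed in as a parameter; is_enter_key is a made-up name, not a PDCurses function.

```cpp
// Hypothetical helper: true if `ch` is any of the codes the Enter/Return
// key may produce. `key_enter` is the value your curses header defines
// for KEY_ENTER.
bool is_enter_key(int ch, int key_enter) {
    switch (ch) {
        case '\r':  // what arrives after nonl()
        case '\n':  // what arrives when newline translation is on
            return true;
        default:
            return ch == key_enter;
    }
}
```

In the program above this would read int ch = getch(); followed by if (is_enter_key(ch, KEY_ENTER)) { ... }. Note that ch is an int here: getch() returns int, and the KEY_* codes do not fit in a char.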