(clang AST matchers) How to match consecutive statements?

I'm planning a bunch of refactorings on a large code base that I'd like to automate using Clang tooling. For this, I'm trying to write a Clang AST Matcher expression.
Specifically, I'm trying to match pairs of statements that I'd like to replace with something else, like
a(); => a_and_b(x);
b(x);
So I'm trying to match a callExpr() for a followed by a callExpr() for b (but it could be any statement, really). I have constructed matchers for the first and the second statements independently, let's call them aMatcher() and bMatcher(), but haven't found how to combine them so that they match only if they're back-to-back, something like bMatcher(follows(aMatcher())). None of the existing matchers seems to be pertinent (I looked for "next", "prev", "position", ...).
How do I go about this the correct way, please?

The implementation of UseAnyOfAllOfCheck contains a private nextStmt matcher:
/// Matches a Stmt whose parent is a CompoundStmt, and which is directly
/// followed by a Stmt matching the inner matcher.
AST_MATCHER_P(Stmt, nextStmt, ast_matchers::internal::Matcher<Stmt>,
              InnerMatcher) {
  DynTypedNodeList Parents = Finder->getASTContext().getParents(Node);
  if (Parents.size() != 1)
    return false;
  auto *C = Parents[0].get<CompoundStmt>();
  if (!C)
    return false;
  const auto *I = llvm::find(C->body(), &Node);
  assert(I != C->body_end() && "C is parent of Node");
  if (++I == C->body_end())
    return false; // Node is last statement.
  return InnerMatcher.matches(**I, Finder, Builder);
}
I don't know how robust that is, but I'll try to use it and report back.

Related

Safely unwrap consecutively

I have an if statement which needs to check for the existence of a value in a nested Option. The statement currently looks like
if my_vec.get(0).is_some() && my_vec.get(0).unwrap().is_some() {
// My True Case
} else {
// My Else Case
}
I feel like this is an amateurish way of checking if this potential nested value exists. I want to maintain safety when fetching the Option from the array as it may or may not exist, and also when unwrapping the Option itself. I tried using and_then and similar operators but haven't had any luck.

I would check the length first and access it like a regular array instead of using .get(x) unless there is some benefit in doing so (like passing it to something which expects an option).
if my_vec.len() > x && my_vec[x].is_some() {
// etc
}
Another option is to just match the value with an if let x = y or full match statement.
if let Some(Some(_)) = my_vec.get(x) {
// etc
}
The matches! macro can also be used in this situation similarly to the if let when you don't need to take a reference to the data.
if matches!(my_vec.get(x), Some(Some(_))) {
// etc
}
Or the and_then version, but personally it is probably my least favorite since it is longer and garbles the intent.
if my_vec.get(x).and_then(|y| y.as_ref()).is_some() {
// etc
}
You can pick whichever one is your favorite. They all compile down to the same thing (probably, I haven't checked).

Can I use virtual tokens (tokens with identical return value) in ANTLR4 similar to c++?

In C++ I can use virtual functions to process data from similar classes that have the same parent/ancestor. Does ANTLR4 support this, and how would I have to set up the grammar?
I have tried to set up a grammar, using arguments that have the same return value and use that value in a token that contains the different "subclassed" tokens.
Here is some code I have tried to work with:
amf_group
: statements=amf_statements (GROUPSEP WS? LINE_COMMENT? EOL? | EOF)
;
amf_statements returns [amf::AmfStatements stmts]
: ( WS? ( stmt=amf_statement { stmts.emplace_back(std::move($stmt.stmtptr)); } WS? EOL) )*
;
amf_statement returns [amf::AmfStatementPtr stmtptr]
: (
stmt = jsonparent_statement
| stmt = jsonvalue_statement
)
{
$stmtptr = std::move($stmt.stmtptr);
}
;
jsonparent_statement returns [amf::AmfStatementPtr stmtptr] locals [int lineno=0]
:
(T_JSONPAR { $lineno = $T_JSONPAR.line;} ) WS (arg=integer_const)
{
$stmtptr = std::make_shared<amf::JSONParentStatement>($lineno, nullptr);
}
;
jsonvalue_statement returns [amf::AmfStatementPtr stmtptr] locals [int lineno=0]
: ( T_JSONVALUE { $lineno = $T_JSONVALUE.line; } ) WS (arg=integer_const) (WS fmt=integer_const)?
{
$stmtptr = std::make_shared<amf::JSONValueStatement>($lineno, std::move($arg.argptr), std::move($fmt.argptr));
}
;
I receive the following error:
error(75): amf1.g4:23:10: label stmt=jsonvalue_statement type mismatch with previous definition: stmt=jsonparent_statement
This error is of course quite logical, because the tokens are indeed of a different type, but their return value types are identical. For two (virtual) tokens I can write all the code separately, but in my case I have some 40+ different tokens that either represent arguments or statements, and writing all the combinations would be cumbersome. The above code did work in ANTLR3, by the way.
Is there another way to get around these errors using ANTLR4? Does anybody have any suggestions?

What's specified as a rule return value is not really a return value in a functional sense. Instead, the context representing the rule gets a new member field that holds the "return" value. Given that, it makes no sense to treat parser rules like C++ functions; the two are simply not comparable.
Instead of handling all the fields in your grammar, I recommend a different approach: with ANTLR4 you get a parse tree (if enabled), which represents the matched rules using parse rule contexts (a superset of the AST that earlier ANTLR versions generated). Each context contains all the values that have been parsed out. You just need a listener in a second step after the parse run (often called the semantic phase) to walk over this tree, pick those values up and create your own data structures from them. This separation also lets you use your parser for quick syntax checks, since you don't do all the heavy work in the parse run.

(Qt) Validate string against multiple regular expressions simultaneously

I'm checking a string which contains vehicle registration information against regular expressions for validity. I have several regular expressions, one for each criterion I need. How can I validate the string against all my regular expressions without having to combine them into one expression or do something like this to determine if it's valid?
if( s_expGP.exactMatch(lineEdit->text()) ||
s_expGPNew.exactMatch(lineEdit->text()) ||
s_expPersonal.exactMatch(lineEdit->text()) ||
s_expGov.exactMatch(lineEdit->text()) )
{
//do stuff
}

The only option would be to create a single regular expression by combining s_expGP, s_expGPNew, s_expPersonal and the rest if that is possible, otherwise I don't think there could be any other way.

If you have a large number of regexps to test, or if you may need to verify the string more than once, you can create a function like this:
bool isValid(const QVector<QRegExp>& regExps, const QString& input)
{
    // Valid as soon as one pattern matches, mirroring the || chain in the question.
    for (const QRegExp& exp : regExps)
    {
        if (exp.exactMatch(input))
            return true;
    }
    return false;
}
Or use a static QVector, just as you already have static regexps.

How replace If-else block condition

In my code I have an if-else block condition like this:
public String method (Info info) {
if (info.isSomeBooleanCondition) {
return "someString";
}
else if (info.isSomeOtherCondition) {
return "someOtherString";
}
else if (info.anotherCondition) {
return "anotherStringAgain";
}
else if (lastCondition) {
return "string ...";
}
else return "lastButNotLeastString";
}
Each conditional branch returns a String.
Since if-else statements are difficult to read, test and maintain, how can I replace them?
I was thinking of using the Chain of Responsibility pattern. Is it right in this case?
Is there any other elegant way that I can do that?

I am left to assume that your code does not live in the Info class, since Info is passed in and referenced for all but the last condition. My first instinct would be to turn String OtherClass.method(Info) into String Info.method() and have it return the appropriate string.
Next, I would take a look at the conditions. Are they really conditions, or can they be mapped to a table? Whenever I see code performing a lookup such as this, I tend to fall back on trying to fit it into a dictionary or map so I can look the value up.
If you are left with conditions that must be checked, then I would begin thinking about lambdas, delegates or a custom interface. A series of if..then across the same type could easily be represented that way. Next, you would collect the checks and execute them in order. IMO, this would make the if..then bunch much clearer. It is more code, but that is secondary at this point.
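// Needs java.util.Optional
interface InfoCheck
{
    // Returns the string for this check, or empty if the check does not apply.
    Optional<String> tryCheck(Info info);
}
public OtherClass()
{
    // Set up the checks in priority order
    checkerCollection.add(new InfoCheck() {
        public Optional<String> tryCheck(Info info) {
            // check code
        }
    });
}
public String method(Info info) {
    for (InfoCheck ic : checkerCollection)
    {
        Optional<String> result = ic.tryCheck(info);
        if (result.isPresent())
        {
            return result.get();
        }
    }
    return "lastButNotLeastString"; // no check applied
}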
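The table idea from this answer can also be read as an ordered list of condition/result pairs that is scanned until the first condition holds. Below is a minimal sketch under stated assumptions, not a definitive implementation: the Info fields mirror the question, the names RuleTableSketch, RULES and describe are hypothetical, and the question's free-standing lastCondition is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class RuleTableSketch {
    // Hypothetical stand-in for the question's Info class.
    static class Info {
        final boolean isSomeBooleanCondition;
        final boolean isSomeOtherCondition;
        final boolean anotherCondition;

        Info(boolean isSomeBooleanCondition, boolean isSomeOtherCondition, boolean anotherCondition) {
            this.isSomeBooleanCondition = isSomeBooleanCondition;
            this.isSomeOtherCondition = isSomeOtherCondition;
            this.anotherCondition = anotherCondition;
        }
    }

    // LinkedHashMap iterates in insertion order, so the rules are tried
    // in the same order as the branches of the original if-else chain.
    static final Map<Predicate<Info>, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put(i -> i.isSomeBooleanCondition, "someString");
        RULES.put(i -> i.isSomeOtherCondition, "someOtherString");
        RULES.put(i -> i.anotherCondition, "anotherStringAgain");
    }

    // Returns the value of the first rule whose condition holds, else the fallback.
    static String describe(Info info) {
        for (Map.Entry<Predicate<Info>, String> rule : RULES.entrySet()) {
            if (rule.getKey().test(info)) {
                return rule.getValue();
            }
        }
        return "lastButNotLeastString";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Info(false, true, false))); // prints "someOtherString"
    }
}
```

Adding or reordering a branch then means editing the table rather than the control flow, and each rule can be tested on its own.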

The problem does not fit an ideal chain-of-responsibility scenario, because these are either/or conditions that look 'chained' but actually are not. The reason: in the chain of responsibility pattern every link in the chain is processed irrespective of what happened in the previous links, i.e. no links are skipped (you can configure which links to process, but the execution of a link still does not depend on the outcome of a previous one). In this if-else-if* scenario, however, once an if condition matches, the remaining conditions are not evaluated.
I have thought of an alternative design which achieves the above without if-else. It is lengthier, but at the same time more flexible.
Let's say we have a functional interface IfElseReplacer which takes an Info as input and gives a String as output.
@FunctionalInterface
public interface IfElseReplacer {
    String executeCondition(Info info);
}
Then the above conditions can be rephrased as lambda expressions that fall back to the default string when the condition does not hold:
(Info info) -> info.someCondition ? "someString" : "lastButNotLeastString"
(Info info) -> info.anotherCondition ? "someOtherString" : "lastButNotLeastString"
and so on...
Then we need a processConditions method to process these lambdas. It could be a default method in IfElseReplacer:
default String processConditions(List<IfElseReplacer> ifElseReplacerList, Info info) {
    String strToReturn = "lastButNotLeastString";
    for (IfElseReplacer ifElseRep : ifElseReplacerList) {
        strToReturn = ifElseRep.executeCondition(info);
        if (!"lastButNotLeastString".equals(strToReturn)) {
            break; // a condition matched (the result is no longer the default), so stop here
        }
    }
    return strToReturn;
}
What remains (I am skipping the code for this; let me know if you need it and I will write it as well) is the following, at each place where the if-else conditions need to be checked:
1. Create the lambda expressions as explained above and add them, as IfElseReplacer instances, to a list of type IfElseReplacer.
2. Pass this list to processConditions() along with an instance of Info.
3. processConditions() returns the String value, which is the same as the result of the if-else-if* block in the problem statement.
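The wiring that this answer skips could look roughly like the sketch below. It rests on a few assumptions: Info is a hypothetical stand-in with the question's three boolean fields, processConditions is written as a static helper rather than a default method so it can be called without an IfElseReplacer instance, and the names IfElseReplacerSketch, DEFAULT_RESULT and CONDITIONS are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class IfElseReplacerSketch {
    static final String DEFAULT_RESULT = "lastButNotLeastString";

    // Hypothetical stand-in for the question's Info class.
    static class Info {
        final boolean isSomeBooleanCondition;
        final boolean isSomeOtherCondition;
        final boolean anotherCondition;

        Info(boolean isSomeBooleanCondition, boolean isSomeOtherCondition, boolean anotherCondition) {
            this.isSomeBooleanCondition = isSomeBooleanCondition;
            this.isSomeOtherCondition = isSomeOtherCondition;
            this.anotherCondition = anotherCondition;
        }
    }

    @FunctionalInterface
    interface IfElseReplacer {
        String executeCondition(Info info);
    }

    // Each lambda returns its own string when its condition holds,
    // and the shared default otherwise. Order matters: first match wins.
    static final List<IfElseReplacer> CONDITIONS = Arrays.asList(
        info -> info.isSomeBooleanCondition ? "someString" : DEFAULT_RESULT,
        info -> info.isSomeOtherCondition ? "someOtherString" : DEFAULT_RESULT,
        info -> info.anotherCondition ? "anotherStringAgain" : DEFAULT_RESULT
    );

    // Runs the conditions in order and stops at the first non-default result.
    static String processConditions(List<IfElseReplacer> replacers, Info info) {
        for (IfElseReplacer replacer : replacers) {
            String result = replacer.executeCondition(info);
            if (!DEFAULT_RESULT.equals(result)) {
                return result;
            }
        }
        return DEFAULT_RESULT;
    }

    public static void main(String[] args) {
        Info info = new Info(false, false, true);
        System.out.println(processConditions(CONDITIONS, info)); // prints "anotherStringAgain"
    }
}
```

One caveat of using the default string as a sentinel: a condition can never legitimately return "lastButNotLeastString" itself, so returning an Optional<String> from each condition may be a cleaner signal for "no match".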

I'd simply factor out the returns:
return
info.isSomeBooleanCondition ? "someString" :
info.isSomeOtherCondition ? "someOtherString" :
info.anotherCondition ? "anotherStringAgain" :
lastCondition ? "string ..." :
"lastButNotLeastString"
;

From the limited information about the problem and the code given, it looks like this is a case of type-switching. The default solution would be to use inheritance for that:
abstract class Info {
    public abstract String method();
}
class BooleanCondition extends Info {
    public String method() {
        return "something";
    }
}
class SomeOther extends Info {
    public String method() {
        return "somethingElse";
    }
}
Patterns which are interesting in this case are Decorator, Strategy and Template Method. Chain of Responsibility has a different focus. Each element in the chain implements logic to process some commands. When chained, an object forwards the command if it cannot process it. This gives a loosely coupled structure for processing commands where no central dispatch is needed.
If computing the string on the conditions is an operation, and from the name of the class I am guessing that it is probably an expression tree, you should look at the Visitor pattern.

D std.regex - return as text

Is there a possibility to return text which was used to create regular expression?
Something like this:
auto r = regex(r"[0-9]", "g"); // create regular expression
writeln(r.dumpAsText()); // this would write: [0-9]
There is nothing in http://dlang.org/phobos/std_regex.html on this. (or at least I did not notice)

No, because it compiles the regex, and I don't believe it even stores the string after compilation.
The best thing to do is just to store the string yourself on creation.
Source for struct Regex
As you can see, it doesn't store the pattern string, only the bytecode.

Typically using a subtype would work, but unfortunately it doesn't here due to failed template constraints. E.g. a plausible solution (that doesn't work right now) would be to wrap the regex as a subtype:
auto myregex(string arg1, string arg2)
{
struct RegexWrap
{
Regex!char reg;
alias reg this;
string dumpAsText;
}
return RegexWrap(regex(arg1, arg2), arg1);
}
void main()
{
auto r = myregex(r"[0-9]", "g"); // create regular expression
writeln(r.dumpAsText); // this would write: [0-9]
writeln(match("12345", r)); // won't work
}
The match function in std.regex won't work with this wrapper struct even when using a subtype, because it fails this template constraint:
public auto match(R, RegEx)(R input, RegEx re)
    if (is(RegEx == Regex!(BasicElementOf!R)))
Even if you changed the header to this, it still won't work:
public auto match(R)(R input, Regex!(BasicElementOf!R) re)
The only way it would work is if the type was explicit so the subtype could be passed:
public auto match(R)(R input, Regex!char re)
I find this to be an awkward part of D that could be improved.
