How to find the two LLVM basic blocks that correspond to the if-then body and the if-else body of an if statement - llvm

When I examine opposite branches in the CFG of LLVM IR, how can I determine that two basic blocks correspond to the paired if-then body and if-else body of the same if statement?
For example:
int foo(int a, int b, int c){
if(a > b){
c = 1;
if(a / 2 > b * 3){
c = 3;
}
}else{
c = 2;
}
c ++;
return c;
}
How can I determine that the block if.then and the block if.else are the paired if-then body and if-else body of the same if statement in the source code?
I have an idea, but I don't know whether it's feasible.
My idea is to compute the control dependences of if.then and if.else using post-dominators, then check that both dependence edges start from the same node (here, entry) and that entry has only these two outgoing edges in the CFG.
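The idea can be tried on a toy CFG before touching LLVM. The sketch below is a self-contained C++ model, not LLVM API code: `Cfg`, `postDominators`, `pairedArms`, and `fooCfg` are hypothetical names, and the block names other than entry, if.then, and if.else are assumed labels for the remaining blocks of foo. It assumes a single exit block. A block with exactly two distinct successors is reported when neither successor post-dominates it, which means both successors are control dependent on it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy CFG: blocks are indices, succ[b] lists the successors of block b.
struct Cfg {
    std::vector<std::string> name;
    std::vector<std::vector<int>> succ;
};

// pdom[b][d] is true when d post-dominates b, i.e. every path from b
// to the exit block passes through d. Simple iterative dataflow.
std::vector<std::vector<bool>> postDominators(const Cfg& g, int exitBlock) {
    const std::size_t n = g.succ.size();
    assert(exitBlock >= 0 && static_cast<std::size_t>(exitBlock) < n);
    std::vector<std::vector<bool>> pdom(n, std::vector<bool>(n, true));
    pdom[exitBlock].assign(n, false);
    pdom[exitBlock][exitBlock] = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 0; b < n; ++b) {
            if (static_cast<int>(b) == exitBlock) continue;
            // Intersect the post-dominator sets of all successors.
            std::vector<bool> next(n, true);
            for (int s : g.succ[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && pdom[s][d];
            next[b] = true;  // every block post-dominates itself
            if (next != pdom[b]) {
                pdom[b] = next;
                changed = true;
            }
        }
    }
    return pdom;
}

// Returns (then, else) successor pairs of two-way branches where
// neither arm post-dominates the branching block.
std::vector<std::pair<int, int>> pairedArms(const Cfg& g, int exitBlock) {
    const auto pdom = postDominators(g, exitBlock);
    std::vector<std::pair<int, int>> pairs;
    for (std::size_t b = 0; b < g.succ.size(); ++b) {
        if (g.succ[b].size() != 2) continue;
        const int t = g.succ[b][0];
        const int e = g.succ[b][1];
        if (t != e && !pdom[b][t] && !pdom[b][e]) pairs.emplace_back(t, e);
    }
    return pairs;
}

// Assumed CFG of foo: 0 entry, 1 if.then, 2 if.then2, 3 if.end,
// 4 if.else, 5 if.end3 (the exit block).
Cfg fooCfg() {
    return Cfg{{"entry", "if.then", "if.then2", "if.end", "if.else", "if.end3"},
               {{1, 4}, {2, 3}, {3}, {5}, {5}, {}}};
}
```

On this graph only the branch in entry qualifies. The inner branch in if.then is rejected because one of its successors post-dominates it, which is the shape of an if without an else. Short-circuit conditions and early returns can produce additional pairs, so this is best treated as a heuristic rather than a proof of source-level pairing.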

Related

Why does C++ (and most other languages) have both 'for' and 'while' loops? What can you do with one type that you can't with the other?

Why do most programming languages have two or more types of loops with almost no difference? In C++, is one of the two better for specific tasks? What about switches and ifs; is there any difference there?
If not, then an answer about why they were implemented would be appreciated.
There is nothing that you can do with one type that you can't do with the other, because mechanical transformations exist between them:
while (X) { body; }
can be written as
for (;X;) { body; }
and
for (A;B;C) { body; }
can be written as
{ A; while (B) { body; C; } }
The only difference is readability. The for loop puts the update-expression C at the top of the loop, making it easier to recognize certain patterns like looping through a numeric range:
for( i = 0; i < limit; ++i )
or following a linked list
for( ; ptr != nullptr; ptr = ptr->next )
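A minimal sketch of the transformation, using hypothetical functions `sumFor` and `sumWhile` that add up the integers below a limit. One caveat the rewrite has to respect: a `continue` inside a for body still runs the update expression C, whereas in the hand-written while form it would skip it, so bodies containing `continue` need extra care.

```cpp
// for-loop form: init, condition, and update all sit in the header.
int sumFor(int limit) {
    int total = 0;
    for (int i = 0; i < limit; ++i) {
        total += i;
    }
    return total;
}

// Equivalent while-loop form: { A; while (B) { body; C; } }
int sumWhile(int limit) {
    int total = 0;
    {
        int i = 0;            // A
        while (i < limit) {   // B
            total += i;       // body
            ++i;              // C
        }
    }
    return total;
}
```

The extra pair of braces in `sumWhile` keeps `i` scoped to the loop, as it is in the for version.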

Deconstruct a C++ code block from a novice perspective

I am very new to C++ and I am trying to deconstruct someone's code, and I am not quite sure what to Google for, hence I am just going to ask here. This is a second attempt at a question I asked earlier which was poorly posed. Should this one not measure up, please let me know and I shall try to rectify.
Here is a structurally identical MWE of the piece of code I am trying to understand.
#include <iostream>
using namespace std;
int square(int x){
// Function that squares without using *
int result = 0;
for (int counter = 0; counter < x; ++counter){
result += x;
}
return result;
}
int main()
{
int const D = 4;
int myArray[D] = {}; // all elements 0 in C++
char colour[D] = {'c','o','e','g'}; // Initialize String Array
int AEST = 5; // Initialise AEST
for (int d =0; d<D; d++){
if (colour[d]!='c' && colour[d]!='o'){
double aux= square (d);
if (aux!=0){
myArray[d]=aux;
}else{
return AEST;
}
}
}
// Lets see what we achieved.
for (int d =0; d<D; d++){
cout << myArray[d];
}
return 0;
}
Now then, let's crack on with some questions.
Precisely what I do not fully understand is this block:
}else{
return AEST;
}
Please note, AEST is not an error code; it is a numerical value that the code calculates. I have only initialized it here for the purpose of this MWE, but in actuality it is calculated earlier on in the original code block.
My question is as follows:
The if condition is only true if the colour is neither c nor o, in which case we square d. In the MWE we square d twice. Hence, is the code saying that we break out of the loop (with return AEST) if we stumble upon a colour that is not c or o? But if we do break out of the loop under these conditions, why must we return AEST? It is already initialised (AEST=5) earlier on, and nothing we do inside this loop will affect it (remember this block is structurally identical to what I am trying to understand, but obviously not fully identical). This is why I do not understand the else bit.
Again, if there is not enough information, please let me know.
The return AEST statement in question exits the main() function, not just the loop. That means the program exits with status 5.
This is done to provide some sort of error-code detection. For example, if you have various things that can go wrong, you try to return specific codes for them so you can look up and identify where the problem occurred.
It is common to return 0 if everything is fine.
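To see when that return actually fires, here is a small sketch with the same control flow moved into a hypothetical function `run` that takes the colours and the AEST value as parameters (both the name and the signature are assumptions made for illustration). The return is only reached when a colour other than c or o sits at an index whose square is 0, which means index 0.

```cpp
#include <cstddef>
#include <string>

// Squares without using *, as in the MWE.
int square(int x) {
    int result = 0;
    for (int counter = 0; counter < x; ++counter) {
        result += x;
    }
    return result;
}

// Hypothetical stand-in for main(): same branching, inputs passed in.
int run(const std::string& colour, int aest) {
    for (std::size_t d = 0; d < colour.size(); ++d) {
        if (colour[d] != 'c' && colour[d] != 'o') {
            if (square(static_cast<int>(d)) == 0) {
                return aest;  // leaves run() entirely, not just the loop
            }
        }
    }
    return 0;  // reached only if the loop ran to completion
}
```

With the MWE's colours, `run("coeg", 5)` finishes the loop and returns 0, because the non-c/o colours sit at indices 2 and 3. Changing the first colour, as in `run("eoeg", 5)`, returns 5 on the very first iteration.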

branching depending on which of 3 numbers is smallest

The solution is obvious, but this is a question about a nice solution.
(EDIT: by nice I mean e.g. 1) without code redundancy 2) without compromising performance 3) without forcing the programmer to create unnecessary functions or temporary variables)
Consider a situation where I would like to execute 3 different blocks of code depending on which of the numbers a, b, c is smallest.
The code would look like this:
if( a < b ){
if( a < c ){
// code block for "a is minimum" case
}else{
// code block for "c is minimum" case
}
}else{
if( b < c ){
// code block for "b is minimum" case
}else{
// code block for "c is minimum" case
}
}
What I don't like is that I have to copy the // code block for "c is minimum" case twice.
There are several solutions to that. E.g. I can put the block of code for "c is minimum" into an inline function or macro. But I don't like it (seems less clear to read).
An old-school solution would be to use goto, like:
if( a < b ){
if( a < c ){
// code for "a is minimum" case
goto BRANCHE_END;
}
}else{
if( b < c ){
// code for "b is minimum" case
goto BRANCHE_END;
}
}
// code for "c is minimum" case
BRANCHE_END:
but people don't like to see goto (for good reason). On the other hand, in this particular case it is actually quite readable.
If the block of code were an independent function, it could be written like
void myBranching( double a, double b, double c ){
if( a < b ){
if( a < c ){
// code for "a is minimum" case
return;
}
}else{
if( b < c ){
// code for "b is minimum" case
return;
}
}
// code for "c is minimum" case
return;
}
(which is actually almost the same as the goto version). But in many cases a similar block of code has to be part of a more complex algorithm, and it is inconvenient to put it inside a function. Encapsulating it in a function would require passing many variables that are used both inside and outside (e.g. see the use case below).
Is there any control structure in C/C++ which would solve this in an elegant way?
NOTE: Consider performance-critical code (e.g. a ray-tracer, a GLSL shader, a physical simulation). Anything that would add unnecessary computational overhead is out of the question.
Additional questions / comments
This is one of a few examples where I feel like structured programming ties my hands, and that it is just a subset of what is possible with a jump instruction. Do you know other examples where an algorithm would be simpler and clearer using goto rather than standard control structures?
Can you imagine more complex branching where it would be necessary to copy some blocks of code even more times?
EDIT : Use case
I think some confusion resulted from the fact that I did not specify the context in which I want to use this. This is part of an algorithm which raytraces a regular triclinic 3D grid (something like Bresenham's line algorithm in 3D, but the starting point is a float, not aligned to the center of any box).
But please do not focus on the algorithm itself; it may also be wrong, as I'm currently debugging it.
double pa,pb,pc,invPa,invPb,invPc,mda,mdb,mdc,tmax,t;
int ia,ib,ic;
// for shortness I don't show initialization of these variables
while( t<tmax ){
double tma = mda * invPa;
double tmb = mdb * invPb;
double tmc = mdc * invPc;
if( tma < tmb ){
if( tma < tmc ){ // a min
t += tma;
mda = 1;
mdb -= pb*tma;
mdc -= pc*tma;
ia++;
}else{ // c min
t += tmc;
mda -= pa*tmc;
mdb -= pb*tmc;
mdc = 1;
ic++;
}
}else{
if( tmb < tmc ){ // b min
t += tmb;
mda -= pa*tmb;
mdb = 1;
mdc -= pc*tmb;
ib++;
}else{ // c min
t += tmc;
mda -= pa*tmc;
mdb -= pb*tmc;
mdc = 1;
ic++;
}
}
// do something with ia,ib,ic,mda,mdb,mdc
}
You can solve this pretty easily with std::min and using the version that takes a std::initializer_list. You call min on the three variables to get the minimum and then you have three if statements to check the return against each of the variables.
void foo(int a, int b, int c)
{
int min = std::min({a, b, c});
if (min == a)
{
// a case
}
else if (min == b)
{
// b case
}
else
{
// c case
}
}
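A self-contained sketch of the same idea, with the `<algorithm>` header that the initializer-list overload of std::min needs. The function name `whichMin` and the returned labels are hypothetical, chosen so the branch taken can be observed. On ties the earlier variable wins, because the checks run in the order a, b, c.

```cpp
#include <algorithm>

// Returns 'a', 'b', or 'c' depending on which argument is smallest.
char whichMin(int a, int b, int c) {
    const int smallest = std::min({a, b, c});
    if (smallest == a) {
        return 'a';  // a case
    } else if (smallest == b) {
        return 'b';  // b case
    } else {
        return 'c';  // c case
    }
}
```

Note that this performs the comparisons inside std::min and then up to two equality tests, so in performance-critical code it may be worth measuring against the plain nested ifs.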
You don't have to nest conditionals by the way:
if(a <= b && a <= c) { /* a is the lowest */ }
else if(b <= c) { /* b is the lowest */ }
else { /* c is the lowest */ }
Note that the semantics of C++'s operator && is such that
if(a <= b && a <= c)
works roughly like
if(a <= b) if(a <= c)
(The difference is that the succeeding else clause, if any, covers both ifs simultaneously.) In the expression cond1 && cond2, if cond1 proves to be false then cond2 is never evaluated, at least, it does not have observable side-effects.
Sure we can make something more monstrous, with non-local exits, for example:
do {
if(a <= b) {
if(a <= c) {
// a is the lowest
break;
}
}
else if(b <= c) {
// b is the lowest
break;
}
// c is the lowest
}
while(0);
But in fact this construct, despite being much taller, is logically equivalent to those three lines above (with Dieter's proposed edit).
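The equivalence claim can be checked by brute force over a small range. In this sketch the names `flat`, `nested`, and `agreeOnRange` are hypothetical, and each branch just returns a label instead of running a code block.

```cpp
// Three-line version: flat if / else if / else.
char flat(int a, int b, int c) {
    if (a <= b && a <= c) return 'a';
    else if (b <= c) return 'b';
    else return 'c';
}

// do { ... } while(0) version with early exits via break.
char nested(int a, int b, int c) {
    char result = '?';
    do {
        if (a <= b) {
            if (a <= c) {
                result = 'a';
                break;
            }
        } else if (b <= c) {
            result = 'b';
            break;
        }
        result = 'c';  // reached only when neither break fired
    } while (0);
    return result;
}

// True when both versions pick the same branch for every triple
// with each value in [lo, hi].
bool agreeOnRange(int lo, int hi) {
    for (int a = lo; a <= hi; ++a)
        for (int b = lo; b <= hi; ++b)
            for (int c = lo; c <= hi; ++c)
                if (flat(a, b, c) != nested(a, b, c)) return false;
    return true;
}
```

Since the branch taken depends only on the relative order of the three values, a range of a few integers already covers every ordering, ties included.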

check if one string is interleaved by two other strings

I am debugging the following problem. The detailed problem statement and the code are posted below. My question is whether the last else if (else if (A[i-1]==C[i+j-1] && B[j-1]==C[i+j-1])) is necessary? I think it is not necessary since it is always covered either by else if(A[i-1]==C[i+j-1] && B[j-1]!=C[i+j-1]), or by else if (A[i-1]!=C[i+j-1] && B[j-1]==C[i+j-1]), i.e. the previous two else-if conditions. Thanks.
Given s1, s2, s3, find whether s3 is formed by the interleaving of s1 and s2.
For example,
Given:
s1 = "aabcc",
s2 = "dbbca",
When s3 = "aadbbcbcac", return true.
When s3 = "aadbbbaccc", return false.
// The main function that returns true if C is
// an interleaving of A and B, otherwise false.
bool isInterleaved(char* A, char* B, char* C)
{
// Find lengths of the two strings
int M = strlen(A), N = strlen(B);
// Let us create a 2D table to store solutions of
// subproblems. IL[i][j] will be true if C[0..i+j-1]
// is an interleaving of A[0..i-1] and B[0..j-1].
bool IL[M+1][N+1];
memset(IL, 0, sizeof(IL)); // Initialize all values as false.
// C can be an interleaving of A and B only if the sum
// of the lengths of A & B is equal to the length of C.
if ((M+N) != strlen(C))
return false;
// Process all characters of A and B
for (int i=0; i<=M; ++i)
{
for (int j=0; j<=N; ++j)
{
// two empty strings have an empty string
// as interleaving
if (i==0 && j==0)
IL[i][j] = true;
// A is empty
else if (i==0 && B[j-1]==C[j-1])
IL[i][j] = IL[i][j-1];
// B is empty
else if (j==0 && A[i-1]==C[i-1])
IL[i][j] = IL[i-1][j];
// Current character of C matches with current character of A,
// but doesn't match with current character of B
else if(A[i-1]==C[i+j-1] && B[j-1]!=C[i+j-1])
IL[i][j] = IL[i-1][j];
// Current character of C matches with current character of B,
// but doesn't match with current character of A
else if (A[i-1]!=C[i+j-1] && B[j-1]==C[i+j-1])
IL[i][j] = IL[i][j-1];
// Current character of C matches with that of both A and B
else if (A[i-1]==C[i+j-1] && B[j-1]==C[i+j-1])
IL[i][j]=(IL[i-1][j] || IL[i][j-1]) ;
}
}
return IL[M][N];
}
thanks in advance,
Lin
You do need the final else if to catch the cases when the next character in C matches the next character in both A and B. For example, run your program with A="aaaa", B="aaaa", and C="aaaaaaaa" and see if you enter that last else if block.
Additionally, you also need a final else block to handle cases when none of the previous conditions match. In this case, you need to set IL[i][j] to false. Otherwise, the function will incorrectly return true.
Edit: Even though the code uses memset to initialize all elements of IL to 0, it may not work because ISO C++ does not support variable length arrays (VLAs). In fact, this is what happened when I tried the code at cpp.sh. It uses g++-4.9.2 with flags that cause it to report sizeof(IL) as 1 even though g++ is supposed to support VLAs. Maybe this is a compiler bug, or maybe it does not support multidimensional VLAs. In any case, it might be safer not to use them at all.
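One way to act on both points, sketched under assumptions rather than offered as the definitive fix: replace the VLA with a std::vector table whose cells all start as false (which also covers the missing final else), and fold the per-character cases into two conditions. The std::string signature and the names `fromA` and `fromB` are choices made for this sketch, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if c is an interleaving of a and b, otherwise false.
bool isInterleaved(const std::string& a, const std::string& b,
                   const std::string& c) {
    const std::size_t m = a.size();
    const std::size_t n = b.size();
    if (m + n != c.size()) return false;

    // il[i][j] is true if c[0..i+j-1] is an interleaving of
    // a[0..i-1] and b[0..j-1]. Every cell starts as false.
    std::vector<std::vector<bool>> il(m + 1, std::vector<bool>(n + 1, false));
    il[0][0] = true;  // two empty strings interleave to an empty string

    for (std::size_t i = 0; i <= m; ++i) {
        for (std::size_t j = 0; j <= n; ++j) {
            if (i == 0 && j == 0) continue;
            // Last character of the prefix of c taken from a...
            const bool fromA = i > 0 && a[i - 1] == c[i + j - 1] && il[i - 1][j];
            // ...or taken from b. Both can hold when the characters match.
            const bool fromB = j > 0 && b[j - 1] == c[i + j - 1] && il[i][j - 1];
            il[i][j] = fromA || fromB;
        }
    }
    return il[m][n];
}
```

When the current character of c matches both a and b, `fromA || fromB` plays the role of the last else if in the original; when it matches neither, the cell stays false.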

C++ recursive string mergesort

This is a homework question, so while I'd like usable code, what I'm really seeking is insight on how to tackle this problem. I have two sorted arrays in ascending order that I need to combine in a recursive function. It seems like I need to implement the merge step of a merge sort algorithm. The requirements are that the recursive function can only take the two sorted strings as parameters and it cannot use global or static variables.
I think the pseudocode is:
if the size of the two strings == 0 then return the result string.
compare substr(0,1) of each strings to see which is lesser, and append that to result string
recursively call the function with the new parameters being a substr of the string that appended
My questions are: how do I save a result string if I can't use static variables? I've seen code where a string is assigned the return value of the recursive call. Would that work in this case?
The second question is how to advance through the strings. I need to call substr(1,size-1) after the first iteration and then keep advancing without using static variables.
Here's my attempt to solve the problem WITH static variables (which are not allowed):
static string result="";
static int vv=0;
static int ww=0;
if(v.size()==0 && w.size()==0)
return result;
if(w.size()==0 || v.substr(0,1) <= w.substr(0,1)){
result+=v.substr(0,1);
vv++;
return spliceSortedStrings( v.substr(vv,v.size()-vv) , w);
}
else if(v.size()==0 || w.substr(0,1) > v.substr(0,1)){
result+=w.substr(0,1);
ww++;
return spliceSortedStrings( v , w.substr(ww,w.size()-ww));
}
I'll be appreciative for any guidance.
How about this:
std::string merge(std::string a, std::string b) {
if (a.size() == 0) return b;
if (b.size() == 0) return a;
if (a.back() < b.back()) {
std::string m = merge(a, b.substr(0, b.size()-1));
return m + b.back();
}
else {
std::string m = merge(a.substr(0, a.size()-1), b);
return m + a.back();
}
}
Correctness and termination should be obvious, and I think it should fit the constraints you are given. But I wonder what teacher would pose such a task in C++, for the above code is about as inefficient as it possibly can be.
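For comparison, here is a sketch that follows the pseudocode from the question more literally: it takes the smaller front character and recurses on the rest. It reuses the question's function name spliceSortedStrings, but the signature is an assumption. The result is built from the return value of the recursive call, so no static or global state is needed.

```cpp
#include <string>

// Merges two strings whose characters are already in ascending order.
std::string spliceSortedStrings(const std::string& v, const std::string& w) {
    if (v.empty()) return w;  // nothing left to compare against
    if (w.empty()) return v;
    if (v.front() <= w.front()) {
        // Take the front of v, then merge what remains.
        return v.front() + spliceSortedStrings(v.substr(1), w);
    }
    // Otherwise take the front of w.
    return w.front() + spliceSortedStrings(v, w.substr(1));
}
```

Each call passes a shorter string to the next one, which is what replaces the static counters in the original attempt. Like the version above, it copies substrings at every step, so it is not efficient.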