LLVM: implementing if-then branching - llvm

How do I implement if-then branching in LLVM? The Kaleidoscope tutorial only shows how to create if-then-else branching but it doesn't show how to implement if statements without the else block.
Example C code that I'd like to compile:
if (0) {
// do something...
} // no else block
thanks!
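No answer is recorded here, but the usual shape is the if-then-else one with the else block dropped: the false edge of the conditional branch goes straight to the merge block. The sketch below is plain C++, not LLVM API code. The labels then_block and merge_block are hypothetical stand-ins for basic blocks, and each goto stands in for a br instruction. With IRBuilder this would presumably mean passing the merge block as the false target of CreateCondBr.

```cpp
#include <cassert>

// Mirrors the control-flow graph of `if (cond) { ++counter; }` with no else.
// Labels play the role of basic blocks; gotos play the role of br instructions.
int run_if_then(bool cond, int counter) {
    // entry block: conditional branch, true -> then_block, false -> merge_block
    if (cond) goto then_block;
    goto merge_block;

then_block:
    ++counter;            // body of the if
    goto merge_block;     // unconditional branch that closes the then block

merge_block:
    return counter;       // both paths continue here
}
```

Note that no phi node is needed at the merge point unless the if produces a value.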

Related

Ternary operator in bison, avoid both side computation

I have this rule in my grammar for the ternary operator:
Int:
Boolean '?' Int ':' Int {if($1==1) $$=$3; else $$=$5;}
| ...
For numbers and expressions this works fine, but suppose I have this code where a is an integer:
a=5
1==1 ? a++ : a++
cout<<a; // a==6 is the correct output, but I got a==7
Both sides of the ':' are computed, but I need only one side.
How can I do it in bison?
The only way I see to accomplish what you want while keeping your one-pass interpreter approach would be to have a global flag that controls whether evaluation takes place. While that flag is set to false, the parsing rules would parse normally but not execute anything, which you'd accomplish by enclosing each action in an if. The rule for the ternary operator could then invoke mid-rule actions or special parsing rules that set this flag according to the condition.
The proper way to solve this is by not executing the program directly in the parser. Instead let the parser build an AST (or some other intermediate representation if you prefer), which you then walk to execute the program in an additional stage.
In that stage you can then easily decide which branch to evaluate after evaluating the condition. The logic for that would look something like this:
class TernaryOperator : public IntExpression {
// ...
public:
int eval() {
if(condition.eval()) {
return then_branch.eval();
} else {
return else_branch.eval();
}
}
};
Of course the above is only an example and might be better written using the visitor pattern instead.
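To make the AST idea concrete, here is a minimal self-contained sketch. All the names (Expr, Literal, PostIncrement, Ternary, run_example) are made up for illustration, and the single int variable stands in for whatever symbol table the real interpreter has. It builds `1 ? a++ : a++` and shows that only the selected branch runs.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal AST: every node evaluates against one int variable `a`.
struct Expr {
    virtual ~Expr() = default;
    virtual int eval(int& a) const = 0;
};

struct Literal : Expr {
    int value;
    explicit Literal(int v) : value(v) {}
    int eval(int&) const override { return value; }
};

// Post-increment of the variable: yields the old value, then bumps it.
struct PostIncrement : Expr {
    int eval(int& a) const override { return a++; }
};

struct Ternary : Expr {
    std::unique_ptr<Expr> cond, then_branch, else_branch;
    Ternary(std::unique_ptr<Expr> c, std::unique_ptr<Expr> t, std::unique_ptr<Expr> e)
        : cond(std::move(c)), then_branch(std::move(t)), else_branch(std::move(e)) {}
    // Only the selected branch is evaluated, so its side effect happens once.
    int eval(int& a) const override {
        return cond->eval(a) ? then_branch->eval(a) : else_branch->eval(a);
    }
};

// Builds and runs `1 ? a++ : a++` starting from `a`; returns the final value of a.
int run_example(int a) {
    Ternary t(std::make_unique<Literal>(1),
              std::make_unique<PostIncrement>(),
              std::make_unique<PostIncrement>());
    t.eval(a);
    return a;
}
```

In a bison grammar the actions would then only construct such nodes (for example `$$ = new Ternary(...)`), and evaluation happens after parsing.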

Boolean Not Equal To in Lua programming

How do I write an if statement in Lua that negates a boolean variable?
//In Java,
boolean a;
a = false;
if(!a){
//do something
}
-- In Lua I am trying to replicate the same code
local a
a = false
if(~a) then
-- do something
end
But I am getting an error. How do I write this in Lua?
Lua uses mostly keywords. Use not a instead of ~a.

Using nested if statements to structure code

I'm trying to structure my code in a readable way. I've read that one way of doing it is as follows:
if(Init1() == TRUE)
{
if(Init2() == TRUE)
{
if(Init3() == TRUE)
{
...
Free3();
}
Free2();
}
Free1();
}
I like this way of doing things, because it keeps each FreeX inside its matching InitX loop, but if the nesting goes beyond three levels it quickly becomes unreadable and goes way beyond 80 columns. Many functions can be broken up into multiple functions so that this doesn't happen, but it seems dumb to break up a function just to avoid too many levels of nesting. In particular, consider a function that does the initialization for a whole class, where that initialization requires ten or more function calls. That's ten or more levels of nesting.
I'm sure I'm overthinking this, but is there something fundamental I'm missing in the above? Can deep nesting be done in a readable way? Or else restructured somehow whilst keeping each FreeX inside its own InitX loop?
By the way, I realise that the above code can be compacted to if(Init1() && Init2()..., but the code is just an example. There would be other code between each InitX call that would prevent such a compaction.
Since you included the C++ tag, you should be using RAII - Resource Acquisition Is Initialization. There's a bunch of good online resources explaining this concept, and it will make a lot of things pertaining to resource management a lot easier.
I'm sure I'm overthinking this, but is there something fundamental I'm missing in the above? [...] Or else restructured somehow whilst keeping each FreeX inside its own InitX loop?
Yes. This is a textbook case of code that will benefit enormously from RAII.
Instead of the construct:
if(init3() == TRUE)
{
free3();
}
consider this:
raii_resource3 r3 = init3(); // throws exception if init3 fails
// free3 called internally
// by raii_resource3::~raii_resource3
Your full code becomes:
raii_resource1 r1 = init1();
raii_resource2 r2 = init2();
raii_resource3 r3 = init3();
You will have no nested ifs, your code will be clear and straightforward (and focusing on the positive case).
You will just have to write RAII wrappers for resources 1, 2 and 3.
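A rough sketch of what such a wrapper could look like follows. The Resource class, the init_ok flag and the g_log string are all hypothetical; a real wrapper would call the actual InitX/FreeX pair instead of appending to a log. The log is only there to make the acquire and release order visible.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Records the order of acquire/release calls so the behavior can be inspected.
static std::string g_log;

// Hypothetical RAII wrapper: acquires in the constructor, releases in the destructor.
class Resource {
public:
    Resource(char id, bool init_ok) : id_(id) {
        if (!init_ok) throw std::runtime_error("init failed");
        g_log += 'I'; g_log += id_;      // stands in for InitX()
    }
    ~Resource() { g_log += 'F'; g_log += id_; }  // stands in for FreeX()
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
private:
    char id_;
};

// Returns true if all three resources were acquired; cleanup is automatic either way.
bool use_resources(bool third_ok) {
    try {
        Resource r1('1', true);
        Resource r2('2', true);
        Resource r3('3', third_ok);
        return true;                     // destructors run in reverse order here
    } catch (const std::runtime_error&) {
        return false;                    // already-built resources were released during unwinding
    }
}
```

If the third init fails, only resources 1 and 2 are released, which is exactly what the nested ifs were doing by hand.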
As others have pointed out, the obvious answer is RAII. But if
the issue of overly deep nesting comes up without RAII, you
should really ask yourself if you aren't making your functions
too complicated. A function should rarely have more than about
ten lines (including such checks). If you look at real cases,
you'll find that it almost always makes more sense to break the
function up. Even with RAII, you should generally only have one
instance of an RAII class per function. (There are exceptions,
of course; arguably, something like std::lock_guard shouldn't
count.)
What I would recommend would be to use a switch statement.
I'm a big fan of switch statements when it comes down to code like this
Hope this helps.
switch (i) {
case 1:
// action 1
break;
case 2:
// action 2
break;
case 3:
// action 3
break;
default:
// action 4
break;
}
You may have heard that gotos are evil, but if you are sticking to C, there is nothing "dirty" about using gotos to handle errors like this:
foo()
{
if (!Init1())
goto Error1;
if (!Init2())
goto Error2;
...
...
Free2();
Error2:
Free1();
Error1:
return;
}

Building up AST from a function with more than one returns for llvm

For example I have this function:
def max(a,b) {
if(a < b) return b;
if(a > b) return a;
}
I am curious how to parse this into an AST.
If I understand this correctly, the node of its body should return a ReturnInst*.
But in my AST this body is contained by two nodes (as expressions), one for the first if and another one for the other one.
Is there some trick, or is the design wrong to begin with?
Edit: I've just thought up a possible solution:
CreateAlloca at the begin of the body.
CreateStore and jump to the end label at every return.
At the end label return the var.
Is it a good idea? And how to jump/goto with llvm?
You can try the online demo: http://llvm.org/demo/ Type in the C or C++ for what you want to do and it will show you the LLVM output.
Your outlined solution with alloca+store+jump definitely works, if I'm understanding it correctly. And yes, the LLVM optimizers will handle it just fine. Alternatively, there isn't any restriction on the number of ret instructions in a given function.
Not sure what you're asking about generating a goto; if you've managed to write an if statement, it's the same sort of br instruction you need at the end of one. In general, running "clang -S -emit-llvm" over some simple code is a good technique to see how to generate simple constructs. Also, "llc -march=cpp" is a good way to see how to build instructions in C++.
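For what it's worth, the alloca + store + jump scheme from the question has roughly this shape when written out in plain C++. This is only an illustration of the lowering, not LLVM API code: retval stands in for the alloca'd slot, each goto stands in for a br to the end block, and the name max_lowered is made up. The original max returns nothing when a == b, so the fall-through store below is an assumption added to keep the sketch well defined.

```cpp
#include <cassert>

// Lowered form of a function with several `return` statements:
// one result slot, one exit block, every return becomes store + jump.
int max_lowered(int a, int b) {
    int retval = 0;                       // stands in for the alloca at function entry

    if (a < b) { retval = b; goto end_label; }   // `return b;`
    if (a > b) { retval = a; goto end_label; }   // `return a;`
    retval = a;                           // assumption: equal inputs return a

end_label:
    return retval;                        // the single ret: load the slot and return it
}
```

After mem2reg the slot typically disappears, so there is little cost to generating code this way.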

do {...} while(false)

I was looking at some code by an individual and noticed he seems to have a pattern in his functions:
<return-type> function(<params>)
{
<initialization>
do
{
<main code for function>
}
while(false);
<tidy-up & return>
}
It's not bad, more peculiar (the actual code is fairly neat and unsurprising). It's not something I've seen before and I wondered if anyone can think of any logic behind it - background in a different language perhaps?
You can break out of do{...}while(false).
A lot of people point out that it's often used with break as an awkward way of writing "goto". That's probably true if it's written directly in the function.
In a macro, OTOH, do { something; } while (false) is a convenient way to FORCE a semicolon after the macro invocation; absolutely no other token is allowed to follow.
And another possibility is that there either once was a loop there or iteration is anticipated to be added in the future (e.g. in test-driven development, iteration wasn't needed to pass the tests, but logically it would make sense to loop there if the function needed to be somewhat more general than currently required)
The break as goto is probably the answer, but I will put forward one other idea.
Maybe he wanted to have locally defined variables and used this construct to get a new scope.
Remember while recent C++ allows for {...} anywhere, this was not always the case.
I've seen it used as a useful pattern when there are many potential exit points for the function, but the same cleanup code is always required regardless of how the function exits.
It can make a tiresome if/else-if tree a lot easier to read, by just having to break whenever an exit point is reached, with the rest of the logic inline afterwards.
This pattern is also useful in languages that don't have a goto statement. Perhaps that's where the original programmer learnt the pattern.
I've seen code like that so you can use break as a goto of sorts.
I think it's more convenient to write break instead of goto end. You don't even have to think up a name for the label which makes the intention clearer: You don't want to jump to a label with a specific name. You want to get out of here.
Also chances are you would need the braces anyway. So this is the do{...}while(false); version:
do {
// code
if (condition) break; // or continue
// more code
} while(false);
And this is the way you would have to express it if you wanted to use goto:
{
// code
if (condition) goto end;
// more code
}
end:
I think the meaning of the first version is much easier to grasp. Also it's easier to write, easier to extend, easier to translate to a language that doesn't support goto, etc.
The most frequently mentioned concern about the use of break is that it's a badly disguised goto. But actually break has more resemblance to return: Both instructions jump out of a block of code which is pretty much structured in comparison to goto. Nevertheless both instructions allow multiple exit points in a block of code which can be confusing sometimes. After all I would try to go for the most clear solution, whatever that is in the specific situation.
This is just a perversion of while to get the semantics of goto tidy-up without using the word goto.
It's bad form because when you use other loops inside the outer while the breaks become ambiguous to the reader. "Is this supposed to goto exit? or is this intended only to break out of the inner loop?"
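A small sketch of that ambiguity, with a made-up function name: the first break sits inside a nested for loop, so it only leaves the for, and the code after it in the do block still runs.

```cpp
#include <cassert>

// Returns how many of the two marked statements inside the do block ran.
int inner_break_demo() {
    int reached = 0;
    do {
        for (int i = 0; i < 3; ++i) {
            if (i == 1) break;   // leaves only the for loop
        }
        ++reached;               // still executed: the break above did not exit the do block
        if (reached == 1) break; // this one does exit the do { } while (false)
        ++reached;               // skipped
    } while (false);
    return reached;
}
```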
This trick is used by programmers that are too shy to use an explicit goto in their code. The author of the above code wanted to have the ability to jump directly to the "cleanup and return" point from the middle of the code. But they didn't want to use a label and explicit goto. Instead, they can use a break inside the body of the above "fake" cycle to achieve the same effect.
Several explanations. The first one is general, the second one is specific to C preprocessor macros with parameters:
Flow control
I've seen this used in plain C code. Basically, it's a safer version of goto, as you can break out of it and all memory gets cleaned up properly.
Why would something goto-like be good? Well, if you have code where pretty much every line can return an error, but you need to react to all of them the same way (e.g. by handing the error to your caller after cleaning up), it's usually more readable to avoid an if( error ) { /* cleanup and error string generation and return here */ } as it avoids duplication of clean-up code.
However, in C++ you have exceptions + RAII for exactly this purpose, so I would consider it bad coding style.
Semicolon checking
If you forget the semicolon after a function-like macro invocation, arguments might contract in an undesired way and compile into valid syntax. Imagine the macro
#define PRINT_IF_DEBUGMODE_ON(msg) if( gDebugModeOn ) printf("%s", msg);
That is accidentally called as
if( foo )
PRINT_IF_DEBUGMODE_ON("Hullo\n")
else
doSomethingElse();
The "else" will be considered to be associated with the gDebugModeOn check, so doSomethingElse() runs when foo is true and debug mode is off, and never when foo is false: the exact reverse of what was intended.
Providing a scope for temporary variables.
Since the do/while has curly braces, temporary variables have a clearly defined scope they can't escape.
Avoiding "possibly unwanted semicolon" warnings
Some macros are only activated in debug builds. You define them like:
#if DEBUG
#define DBG_PRINT_NUM(n) printf("%d\n",n);
#else
#define DBG_PRINT_NUM(n)
#endif
Now if you use this in a release build inside a conditional, it compiles to
if( foo )
;
Many compilers see this as the same as
if( foo );
Which is often written accidentally. So you get a warning. The do{}while(false) hides this from the compiler, and is accepted by it as an indication that you really want to do nothing here.
Avoiding capturing of lines by conditionals
Macro from previous example:
if( foo )
DBG_PRINT_NUM(42)
doSomething();
Now, in a debug build, since we also habitually included the semicolon, this compiles just fine. However, in the release build this suddenly turns into:
if( foo )
doSomething();
Or more clearly formatted:
if( foo )
    doSomething();
Which is not at all what was intended. Adding a do{ ... }while(false) around the macro turns the missing semicolon into a compile error.
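Here is a small sketch of the wrapped form, using a hypothetical two-statement macro SWAP_INTS and a made-up helper order_pair. Because the macro body is a single do { ... } while (false) without a trailing semicolon, the invocation plus the caller's own semicolon forms exactly one statement, so it can sit in an unbraced if/else.

```cpp
#include <cassert>

// Hypothetical macro with several statements, wrapped so it behaves like one statement.
#define SWAP_INTS(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (false)

// Orders the pair so that lo <= hi and counts swaps. The unbraced if/else only
// parses because SWAP_INTS(...) followed by ';' is exactly one statement.
void order_pair(int& lo, int& hi, int& swaps) {
    if (lo > hi)
        SWAP_INTS(lo, hi);
    else
        return;
    ++swaps;
}
```

Leaving off the semicolon after SWAP_INTS(lo, hi) is a compile error here, and a plain { ... } body would instead break on the else.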
What's that mean for the OP?
In general, you want to use exceptions in C++ for error handling, and templates instead of macros. However, in the very rare case where you still need macros (e.g. when generating class names using token pasting) or are restricted to plain C, this is a useful pattern.
It looks like a C programmer. In C++, automatic variables have destructors which you use to clean up, so there should not be anything that needs tidying up before the return. In C, you didn't have this RAII idiom, so if you have common clean up code, you either goto it, or use a once-through loop as above.
Its main disadvantage compared with the C++ idiom is that it will not tidy up if an exception is thrown in the body. C didn't have exceptions, so this wasn't a problem, but it does make it a bad habit in C++.
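A minimal sketch of that difference, with made-up names throughout (risky, with_do_while, with_raii, Cleanup, cleanup_ran): the tidy-up placed after the once-through loop is skipped when the body throws, while a destructor still runs during stack unwinding.

```cpp
#include <cassert>
#include <stdexcept>

static int g_cleanups = 0;   // counts how many times cleanup actually ran

// Hypothetical operation that may fail by throwing.
void risky(bool fail) { if (fail) throw std::runtime_error("boom"); }

// C-style: cleanup sits after the once-through loop and is skipped if risky() throws.
void with_do_while(bool fail) {
    do {
        risky(fail);
    } while (false);
    ++g_cleanups;            // tidy-up
}

// RAII: the destructor runs during stack unwinding as well.
struct Cleanup { ~Cleanup() { ++g_cleanups; } };

void with_raii(bool fail) {
    Cleanup guard;
    risky(fail);
}

// Runs f(fail), swallows the exception, and reports whether cleanup ran exactly once.
bool cleanup_ran(void (*f)(bool), bool fail) {
    g_cleanups = 0;
    try { f(fail); } catch (const std::runtime_error&) {}
    return g_cleanups == 1;
}
```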
It is a very common practice. In C. I try to think of it as if you want to lie to yourself in a way "I'm not using a goto". Thinking about it, there would be nothing wrong with a goto used similarly. In fact it would also reduce indentation level.
That said, though, I have noticed that very often these do..while loops tend to grow. And then they get ifs and elses inside, rendering the code actually not very readable, let alone testable.
Those do..while loops are normally intended to do a clean-up. By all means possible I would prefer to use RAII and return early from a short function. On the other hand, C doesn't provide you as many conveniences as C++ does, making a do..while one of the best approaches to do a cleanup.
Maybe it’s used so that break can be used inside to abort the execution of further code at any point:
do {
if (!condition1) break;
some_code;
if (!condition2) break;
some_further_code;
// …
} while(false);
I think this is done to use break or continue statements. Some kind of "goto" code logic.
It's simple: Apparently you can jump out of the fake loop at any time using the break statement. Furthermore, the do block is a separate scope (which could also be achieved with { ... } only).
In such a situation, it might be a better idea to use RAII (objects automatically destructing correctly when the function ends). Another similar construct is the use of goto - yes, I know it's evil, but it can be used to have common cleanup code like so:
<return-type> function(<params>)
{
<initialization>
<main code for function using "goto error;" if something goes wrong>
<tidy-up in success case & return>
error:
<common tidy-up actions for error case & return error code or throw exception>
}
(As an aside: The do-while-false construct is used in Lua to make up for the missing continue statement.)
How old was the author?
I ask because I once came across some real-time Fortran code that did that, back in the late 80's. It turns out that is a really good way to simulate threads on an OS that doesn't have them. You just put the entire program (your scheduler) in a loop, and call your "thread" routines one by one. The thread routines themselves are loops that iterate until one of a number of conditions happens (often one being that a certain amount of time has passed). It is "cooperative multitasking", in that it is up to the individual threads to give up the CPU every now and then so the others don't get starved. You can nest the looping subprogram calls to simulate thread priority bands.
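A very reduced sketch of that scheduler idea, in C++ rather than Fortran. Task, run_slice and schedule are made-up names, and a "slice" here is just one unit of counted work rather than a time budget, so treat it as an illustration of the control flow only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical "thread": does one slice of work per call, then yields to the scheduler.
struct Task {
    char name;
    int remaining;           // slices of work left

    // Runs one slice and records it; returns true while more work remains.
    bool run_slice(std::string& trace) {
        trace += name;
        return --remaining > 0;
    }
};

// Round-robin scheduler: keeps looping over the tasks until all of them have finished.
std::string schedule(std::vector<Task> tasks) {
    std::string trace;
    bool any_alive = true;
    while (any_alive) {
        any_alive = false;
        for (Task& t : tasks) {
            if (t.remaining > 0 && t.run_slice(trace)) any_alive = true;
        }
    }
    return trace;
}
```

A task that never returns from its slice would starve the others, which is the cooperative part.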
Many answerers gave the reason for do{(...)break;}while(false). I would like to complement the picture by yet another real-life example.
In the following code I had to set the enumerator operation based on the address pointed to by the data pointer. Because a switch-case can be used only on integral types, I first did it inefficiently this way:
if (data == &array[o1])
operation = O1;
else if (data == &array[o2])
operation = O2;
else if (data == &array[on])
operation = ON;
Log("operation:",operation);
But since Log() and the rest of the code repeats for any chosen value of operation, I was wondering how to skip the rest of the comparisons once the address has been found. And this is where do{(...)break;}while(false) comes in handy.
do {
if (data == &array[o1]) {
operation = O1;
break;
}
if (data == &array[o2]) {
operation = O2;
break;
}
if (data == &array[on]) {
operation = ON;
break;
}
} while (false);
Log("operation:",operation);
One may wonder why he couldn't do the same with break in an if statement, like:
if (data == &array[o1])
{
operation = O1;
break;
}
else if (...)
break interacts solely with the closest enclosing loop or switch, whether it be a for, while or do .. while type, so unfortunately that won't work.
In addition to the already mentioned 'goto examples', the do ... while (0) idiom is sometimes used in a macro definition to provide for brackets in the definition and still have the compiler work with adding a semicolon to the end of a macro call.
http://groups.google.com/group/comp.soft-sys.ace/browse_thread/thread/52f670f1292f30a4?tvc=2&q=while+(0)
I agree with most posters about the usage as a thinly disguised goto. Macros have also been mentioned as a potential motivation for writing code in this style.
I have also seen this construct used in mixed C/C++ environments as a poor man's exception. The "do {} while(false)" with a "break" can be used to skip to the end of the code block should something that would normally warrant an exception be encountered in the loop.
I have also seen this construct used in shops where the "single return per function" ideology is enforced. Again, this is in lieu of an explicit "goto" - but the motivation is to avoid multiple return points, not to "skip over" code and continue actual execution within that function.
I work with the Adobe InDesign SDK, and the InDesign SDK examples have almost every function written like this. That's because the functions are usually really long, and you need to call QueryInterface(...) to get anything from the application object model. So usually every QueryInterface is followed by: if it did not go well, break.
Many have already stated the similarity between this construct and a goto, and expressed a preference for the goto. Perhaps this person's background included an environment where goto's were strictly forbidden by coding guidelines?
The other reason I can think of is that it decorates the braces, whereas I believe in a newer C++ standard naked braces are not okay (ISO C doesn't like them). Otherwise to quiet a static analyzer like lint.
Not sure why you'd want them, maybe variable scope, or advantage with a debugger.
See Trivial Do While loop, and Braces are Good from C2.
To clarify my terminology (which I believe follows standard usage):
Naked braces:
init();
...
{
c = NULL;
mkwidget(&c);
finishwidget(&c);
}
shutdown();
Empty braces (NOP):
{}
e.g.
while (1)
{} /* Do nothing, endless loop */
Block:
if (finished)
{
closewindows(&windows);
freememory(&cache);
}
which would become
if (finished)
closewindows(&windows);
freememory(&cache);
if the braces are removed, thus altering the flow of execution, not just the scope of local variables. Thus not 'freestanding' or 'naked'.
Naked braces or a block may be used to signify any section of code that might be a potential for an (inline) function that you wish to mark, but not refactor at that time.
It's a contrived way to emulate a GOTO as these two are practically identical:
// NOTE: This is discouraged!
do {
if (someCondition) break;
// some code be here
} while (false);
// more code be here
and:
// NOTE: This is discouraged, too!
if (someCondition) goto marker;
// some code be here
marker:
// more code be here
On the other hand, both of these should really be done with ifs:
if (!someCondition) {
// some code be here
}
// more code be here
Although the nesting can get a bit ugly if you just turn a long string of forward-GOTOs into nested ifs. The real answer is proper refactoring, though, not imitating archaic language constructs.
If you were desperately trying to transliterate an algorithm with GOTOs in it, you could probably do it with this idiom. It's certainly non-standard and a good indicator that you're not adhering closely to the expected idioms of the language, though.
I'm not aware of any C-like language where do/while is an idiomatic solution for anything, actually.
You could probably refactor the whole mess into something more sensible to make it more idiomatic and much more readable.
Some coders prefer to only have a single exit/return from their functions. The use of a dummy do { .... } while(false); allows you to "break out" of the dummy loop once you've finished and still have a single return.
I'm a Java coder, so my example would be something like
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class p45
{
static List<String> cakeNames = Arrays.asList("schwarzwald torte", "princess", "icecream");
static Set<Integer> forbidden = Stream.of(0, 2).collect(Collectors.toSet());
public static void main(String[] argv)
{
for (int i = 0; i < 4; i++)
{
System.out.println(String.format("cake(%d)=\"%s\"", i, describeCake(i)));
}
}
static String describeCake(int typeOfCake)
{
String result = "unknown";
do {
// ensure type of cake is valid
if (typeOfCake < 0 || typeOfCake >= cakeNames.size()) break;
if (forbidden.contains(typeOfCake)) {
result = "not for you!!";
break;
}
result = cakeNames.get(typeOfCake);
} while (false);
return result;
}
}
In such cases I use
switch(true) {
case condition1:
...
break;
case condition2:
...
break;
}
This is amusing. There are probably breaks inside the loop as others have said. I would have done it this way :
while(true)
{
<main code for function>
break; // at the end.
}