Unnecessary curly braces in C++ - c++

When doing a code review for a colleague today I saw a peculiar thing. He had surrounded his new code with curly braces like this:
Constructor::Constructor()
{
// Existing code
{
// New code: do some new fancy stuff here
}
// Existing code
}
What is the outcome, if any, from this? What could be the reason for doing this? Where does this habit come from?
The environment is embedded devices. There is a lot of legacy C code wrapped in C++ clothing. There are a lot of C turned C++ developers.
There are no critical sections in this part of the code. I have only seen it in this part of the code. There are no major memory allocations done, just some flags that are set, and some bit twiddling.
The code that is surrounded by curly braces is something like:
{
bool isInit;
(void)isStillInInitMode(&isInit);
if (isInit) {
return isInit;
}
}
(Don't mind the code, just stick to the curly braces... ;) )
After the curly braces there is some more bit twiddling, state checking, and basic signaling.
I talked to the guy and his motivation was to limit the scope of variables, avoid naming clashes, and some other reasons that I couldn't really pick up.
From my point of view this seems rather strange and I don't think that the curly braces should be in our code. I saw some good examples in all the answers on why one could surround code with curly braces, but shouldn't you separate the code into methods instead?

It's sometimes nice since it gives you a new scope, where you can more "cleanly" declare new (automatic) variables.
In C++ this is maybe not so important since you can introduce new variables anywhere, but perhaps the habit is from C, where you could not do this until C99. :)
Since C++ has destructors, it can also be handy to have resources (files, mutexes, or whatever) automatically released as the scope exits, which can make things cleaner. This means you can hold on to some shared resource for a shorter duration than you would if you grabbed it at the start of the method.

One possible purpose is to control variable scope. And since variables with automatic storage are destroyed when they go out of scope, this can also enable a destructor to be called earlier than it otherwise would.

The extra braces are used to define the scope of the variable declared inside the braces. It is done so that the destructor will be called when the variable goes out of scope. In the destructor, you may release a mutex (or any other resource) so that others can acquire it.
In my production code, I've written something like this:
void f()
{
// Some code - MULTIPLE threads can execute this code at the same time
{
scoped_lock lock(mutex); // Critical section starts here
// Critical section code
// EXACTLY ONE thread can execute this code at a time
} // The mutex is automatically released here
// Other code - MULTIPLE threads can execute this code at the same time
}
As you can see, in this way, you can use scoped_lock in a function and at the same time, can define its scope by using extra braces. This makes sure that even though the code outside the extra braces can be executed by multiple threads simultaneously, the code inside the braces will be executed by exactly one thread at a time.
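For readers who want something they can compile, here is a minimal sketch of the same idea using the standard std::mutex and std::lock_guard instead of scoped_lock. The names g_mutex, g_counter, worker and run_workers are made up for illustration and are not from the code above.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex g_mutex;   // hypothetical mutex guarding g_counter
long g_counter = 0;   // hypothetical shared state

void worker(int iterations)
{
    for (int n = 0; n < iterations; ++n) {
        // Work placed here runs without the lock, so threads can overlap.
        {
            std::lock_guard<std::mutex> lock(g_mutex);  // critical section starts
            ++g_counter;                                // one thread at a time
        }  // lock's destructor releases the mutex here
        // More unlocked work could follow here.
    }
}

// Starts `threads` workers, waits for them, and returns the final count.
long run_workers(int threads, int iterations)
{
    g_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(worker, iterations);
    for (auto& th : pool)
        th.join();
    return g_counter;
}
```

Without the inner braces the lock would be held until the end of each loop iteration, which is harmless in this toy but matters when the unlocked work is expensive.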

As others have pointed out, a new block introduces a new scope, enabling one to write a bit of code with its own variables that don't trash the namespace of the surrounding code, and doesn't use resources any longer than necessary.
However, there's another fine reason for doing this.
It is simply to isolate a block of code that achieves a particular (sub)purpose. It is rare that a single statement achieves a computational effect I want; usually it takes several. Placing those in a block (with a comment) allows me to tell the reader (often myself at a later date):
This chunk has a coherent conceptual purpose
Here's all the code needed
And here's a comment about the chunk.
e.g.
{ // update the moving average
i = (i + 1) % ARRAYSIZE;
sum = sum - A[i];
A[i] = new_value;
sum = sum + new_value;
average = sum / ARRAYSIZE ;
}
You might argue I should write a function to do all that. If I only do it once, writing a function just adds additional syntax and parameters; there seems little point. Just think of this as a parameterless, anonymous function.
If you are lucky, your editor will have a fold/unfold function that will even let you hide the block.
I do this all the time. It is great pleasure to know the bounds of the code I need to inspect, and even better to know that if that chunk isn't the one I want, I don't have to look at any of the lines.

One reason could be that the lifetime of any variables declared inside the new curly braces block is restricted to this block. Another reason that comes to mind is to be able to use code folding in your favourite editor.

This is the same as an if (or while, etc.) block, just without if. In other words, you introduce a scope without introducing a control structure.
This "explicit scoping" is typically useful in the following cases:
To avoid name clashes.
To scope using.
To control when the destructors are called.
Example 1:
{
auto my_variable = ... ;
// ...
}
// ...
{
auto my_variable = ... ;
// ...
}
If my_variable happens to be a particularly good name for two different variables that are used in isolation from each other, then explicit scoping allows you to avoid inventing a new name just to avoid the name clash.
This also allows you to avoid using my_variable out of its intended scope by accident.
Example 2:
namespace N1 { class A { }; }
namespace N2 { class A { }; }
void foo() {
{
using namespace N1;
A a; // N1::A.
// ...
}
{
using namespace N2;
A a; // N2::A.
// ...
}
}
Practical situations when this is useful are rare and may indicate the code is ripe for refactoring, but the mechanism is there should you ever genuinely need it.
Example 3:
{
MyRaiiClass guard1 = ...;
// ...
{
MyRaiiClass guard2 = ...;
// ...
} // ~MyRaiiClass for guard2 called.
// ...
} // ~MyRaiiClass for guard1 called.
This can be important for RAII in cases when the need for freeing resources does not naturally "fall" onto boundaries of functions or control structures.
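To make the destruction order in Example 3 observable, here is a small sketch. TraceGuard is a hypothetical stand-in for MyRaiiClass that records when it is created and destroyed; g_log and run_nested_scopes are likewise invented for the illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> g_log;  // hypothetical event log

// Hypothetical stand-in for MyRaiiClass: logs acquisition and release.
class TraceGuard {
public:
    explicit TraceGuard(std::string name) : name_(std::move(name))
    {
        g_log.push_back("acquire " + name_);
    }
    ~TraceGuard() { g_log.push_back("release " + name_); }
    TraceGuard(const TraceGuard&) = delete;
    TraceGuard& operator=(const TraceGuard&) = delete;
private:
    std::string name_;
};

std::vector<std::string> run_nested_scopes()
{
    g_log.clear();
    {
        TraceGuard guard1("guard1");
        {
            TraceGuard guard2("guard2");
            g_log.push_back("inner work");
        }  // guard2 is destroyed here, before the outer work
        g_log.push_back("outer work");
    }  // guard1 is destroyed here
    return g_log;
}
```

The log shows guard2 being released between "inner work" and "outer work", which is exactly the early release the inner braces buy you.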

Everyone else already covered correctly the scoping, RAII etc. possibilities, but since you mention an embedded environment, there is one further potential reason:
Maybe the developer doesn't trust this compiler's register allocation or wants to explicitly control the stack frame size by limiting the number of automatic variables in scope at once.
Here isInit will likely be on the stack:
{
bool isInit;
(void)isStillInInitMode(&isInit);
if (isInit) {
return isInit;
}
}
If you take out the curly braces, space for isInit may be reserved in the stack frame even after it could potentially be reused: if there are lots of automatic variables with similarly localized scope, and your stack size is limited, that could be a problem.
Similarly, if your variable is allocated to a register, going out of scope should provide a strong hint that register is now available for re-use. You'd have to look at the assembler generated with and without the braces to figure out if this makes a real difference (and profile it - or watch for stack overflow - to see if this difference really matters).

This is really useful when using scoped locks in conjunction with critical sections in multithreaded programming. Your scoped lock initialised in the curly braces (usually the first command) will go out of scope at the end of the block and so other threads will be able to run again.

I think others have covered scoping already, so I'll mention that the unnecessary braces might also serve a purpose in the development process. For example, suppose you are working on an optimization to an existing function. Toggling the optimization or tracing a bug to a particular sequence of statements is simple for the programmer -- see the comment prior to the braces:
// if (false) or if (0)
{
//experimental optimization
}
This practice is useful in certain contexts like debugging, embedded devices, or personal code.

I agree with ruakh. If you want a good explanation of the various levels of scope in C, check out this post:
Various Levels of Scope in C Application
In general, the use of "Block scope" is helpful if you want to just use a temporary variable that you don't have to keep track of for the lifetime of the function call. Additionally, some people use it so you can use the same variable name in multiple locations for convenience, though that's not generally a good idea. E.g.:
int unusedInt = 1;
int main(void) {
int k;
for(k = 0; k<10; k++) {
int returnValue = myFunction(k);
printf("returnValue (int) is: %d (k=%d)",returnValue,k);
}
for(k = 0; k<100; k++) {
char returnValue = myCharacterFunction(k);
printf("returnValue (char) is: %c (k=%d)",returnValue,k);
}
return 0;
}
In this particular example, I have defined returnValue twice, but since it is just at block scope, instead of function scope (i.e., function scope would be, for example, declaring returnValue just after int main(void)), I don't get any compiler errors, as each block is oblivious to the temporary instance of returnValue declared.
I can't say that this is a good idea in general (i.e., you probably shouldn't reuse variable names repeatedly from block-to-block), but in general, it saves time and lets you avoid having to manage the value of returnValue across the entire function.
Finally, please note the scope of the variables used in my code sample:
int: unusedInt: File and global scope (if this were a static int, it would only be file scope)
int: k: Function scope
int: returnValue: Block scope
char: returnValue: Block scope

So, why to use "unnecessary" curly braces?
For "Scoping" purposes (as mentioned above)
Making code more readable in a way (pretty much like using #pragma, or defining "sections" that can be visualized)
Because you can. Simple as that.
P.S. It's not BAD code; it's 100% valid. So, it's rather a matter of (uncommon) taste.

After viewing the code in the edit, I can say that the unnecessary brackets are probably (in the original coder's view) there to make it 100% clear what will happen during the if/then. Even though it is only one line now, it might be more lines later, and the brackets guarantee you won't make an error.
{
bool isInit;
(void)isStillInInitMode(&isInit);
if (isInit) {
return isInit;
}
return -1;
}
If the above was the original, removing the "extras" would result in:
{
bool isInit;
(void)isStillInInitMode(&isInit);
if (isInit)
return isInit;
return -1;
}
then, a later modification might look like this:
{
bool isInit;
(void)isStillInInitMode(&isInit);
if (isInit)
CallSomethingNewHere();
return isInit;
return -1;
}
and that would, of course, cause an issue, since now isInit would always be returned, regardless of the if/then.
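The difference is easy to check. Below is a small sketch with two hypothetical functions, withoutBraces and withBraces, that return int codes instead of isInit so the two paths can be told apart. The brace-less version is indented the way the compiler actually reads it.

```cpp
int g_calls = 0;                                   // hypothetical call counter
void CallSomethingNewHere() { ++g_calls; }

// The later modification, without braces around the if body.
int withoutBraces(bool isInit)
{
    if (isInit)
        CallSomethingNewHere();   // only this statement belongs to the if
    return 1;                     // always executed, whatever isInit is
    // A trailing "return -1;" placed here could never run.
}

// The same modification made inside braces.
int withBraces(bool isInit)
{
    if (isInit) {
        CallSomethingNewHere();
        return 1;
    }
    return -1;                    // still reachable when isInit is false
}
```

With isInit false, withoutBraces returns 1 while withBraces returns -1.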

Objects are automagically destroyed when they go out of scope...

Another example of usage is UI-related classes, especially Qt.
For example, you have some complicated UI and a lot of widgets, each of them got its own spacing, layout, etc. Instead of naming them space1, space2, spaceBetween, layout1, ... you can save yourself from non-descriptive names for variables that exist only in two-three lines of code.
Well, some might say that you should split it into methods, but creating 40 non-reusable methods doesn't look OK, so I decided to just add braces with a comment before each of them, so each one looks like a logical block.
Example:
// Start video button
{
<Here goes the code >
}
// Stop video button
{
<...>
}
// Status label
{
<...>
}
I can't say that's the best practice, but it's a good one for legacy code.
I ran into these problems when a lot of people added their own components to the UI and some methods became really massive, but it's not practical to create 40 one-time-use methods inside a class that is already messed up.

Related

c++, how do I create thread-restricted/protected variables and functions?

I have three threads in an application I'm building, all of which remain open for the lifetime of the application. Several variables and functions should only be accessed from specific threads. In my debug compile, I'd like a check to be run and an error to be thrown if one of these functions or variables is accessed from an illegal thread, but I don't want this as overhead in my final compilation. I really just want this so I the programmer don't make stupid mistakes, not to protect my executing program from making mistakes.
Originally, I had a 'thread protected' class template that would wrap around return types for functions, and run a check on construction before implicitly converting to the intended return type, but this didn't seem to work for void return types without disabling important warnings, and it didn't resolve my issue for protected variables.
Is there a method of doing this, or is it outside the scope of the language? 'If you need this solution, you're doing it wrong' comments not appreciated, I managed to near halve my program's execution time with this methodology, but it's just too likely I'm going to make a mistake that results in a silent race condition and ultimately undefined behavior.
What you described is exactly what assert macro is for.
assert(condition)
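In a debug build condition is checked. If it is false, the program prints a diagnostic and aborts at that line (the standard assert calls std::abort; it does not throw an exception). In a release build, where NDEBUG is defined, the assert and whatever is inside the parentheses aren't compiled.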
Without being harsh, it would have been more helpful if you had explained the variables you are trying to protect. What type are they? Where do they come from? What's their lifetime? Are they global? Why do you need to protect a returned type if it's void? How did you end up in a situation where one thread might accidentally access something. I kinda have to guess but I'll throw out some ideas here:
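#include <cassert>
#include <string>
#include <thread>
// Assumed to be defined elsewhere: the one thread allowed to use the protected items.
extern std::thread g_thread1;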
void protectedFunction()
{
assert(std::this_thread::get_id() == g_thread1.get_id());
}
// protect a global singleton (full program lifetime)
std::string& protectedGlobalString()
{
static std::string inst;
assert(std::this_thread::get_id() == g_thread1.get_id());
return inst;
}
// protect a class member
int SomeClass::protectedInt()
{
assert(std::this_thread::get_id() == g_thread1.get_id());
return this->m_theVar;
}
// thread protected wrapper
template <typename T>
class ThreadProtected
{
std::thread::id m_expected;
T m_val;
public:
ThreadProtected(T init, std::thread::id expected)
: m_expected(expected), m_val(init) // same order as the member declarations
{ }
T Get()
{
assert(std::this_thread::get_id() == m_expected);
return m_val;
}
};
// specialization for void
template <>
class ThreadProtected<void>
{
public:
ThreadProtected(std::thread::id expected)
{
assert(std::this_thread::get_id() == expected);
}
};
assert is oldschool. We were actually told to stop using it at work because it was causing resource leaks (the exception was being caught high up in the stack). It has the potential to cause debugging headaches because the debug behavior is different from the release behavior. A lot of the time if the asserted condition is false, there isn't really a good choice of what to do; you usually don't want to continue running the function but you also don't know what value to return. assert is still very useful when developing code. I personally use assert all the time.
static_assert will not help here because the condition you are checking for (e.g. "Which thread is running this code?") is a runtime condition.
Another note:
Don't put things that you want to be compiled inside an assert. It seems obvious now, but it's easy to do something dumb like
int* p;
assert(p = new(nothrow) int); // check that `new` returns a value -- BAD!!
It's good to check the allocation of new, but the allocation won't happen in a release build, and you won't even notice until you start release testing!
int* p;
p = new(nothrow) int;
assert(p); // check that `new` returns a value -- BETTER...
Lastly, if you write the protected accessor functions in a class body or in a .h, you can goad the compiler into inlining them.
Update to address the question:
The real question though is where do I PUT an assert macro? Is a
requirement that I write setters and getters for all my thread
protected variables then declare them as inline and hope they get
optimised out in the final release?
You said there are variables that should be checked (in the debug build only) when accessed to make sure the correct thread is accessing them. So, theoretically, you would want an assert macro before every such access. This is easy if there are only a few places (if this is the case, you can ignore everything I'm about to say). However, if there are so many places that it starts to violate the DRY principle, I suggest writing getters/setters and putting the assert inside (this is what I've casually given examples of above). But while the assert won't add overhead in release mode (since it's conditionally compiled), using extra functions (probably) adds function call overhead. However, if you write them in the .h, there's a good chance they'll be inlined.
Your requirement for me was to come up with a way to do this without release overhead. Now that I've mentioned inlining I'm obligated to say that the compiler knows best. There usually are compiler-specific ways to force inlining (since the compiler is allowed to ignore the inline keyword). You should be profiling the code before trying to inline things. See the answers to the question "Is it good practice to make getters and setters inline?". You can easily see if the compiler is inlining the function by looking at the assembly. Don't worry, you don't have to be good at assembly. Just find the calling function and look for a call to the getter/setter. If the function was inlined, you won't see a call and you'll probably see a mov instead.
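As a rough sketch of the getter-with-assert idea for a protected variable: the class below remembers the thread that constructed it and asserts on every access. OwnedByThread and on_owner_thread are names invented for this example. It assumes the owning thread is the one that constructs the object, which may not match your setup.

```cpp
#include <cassert>
#include <thread>

// Hypothetical wrapper: the constructing thread becomes the owner.
template <typename T>
class OwnedByThread {
public:
    explicit OwnedByThread(T init)
        : owner_(std::this_thread::get_id()), value_(init) {}

    // True when called from the thread that constructed the object.
    bool on_owner_thread() const
    {
        return std::this_thread::get_id() == owner_;
    }

    // Checked accessor: the assert disappears when NDEBUG is defined,
    // leaving a trivial function the compiler is free to inline.
    T& get()
    {
        assert(on_owner_thread() && "accessed from the wrong thread");
        return value_;
    }

private:
    std::thread::id owner_;
    T value_;
};
```

Because the whole class lives in a header, get() is implicitly inline. The stored thread id still occupies space in a release build, so if even that matters it would have to be compiled out with #ifndef NDEBUG as well.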

Use of global variables in C++ application

I’ve used global variables without having any noticeable problems but would like to know if there are potential problems or drawbacks with my use of globals.
In the first scenario, I include const globals into a globals.h file, I then include the header into various implementation files where I need access to any one of the globals:
globals.h
const int MAX_URL_LEN = 100;
const int MAX_EMAIL_LEN = 50;
…
In the second scenario, I declare and initialize the globals in an implementation file when the application executes. These globals are never modified again. When I need access to these globals from a different implementation file, I use the extern keyword:
main.cpp
char application_path[128];
char data_path[128];
// assign data to globals
strcpy(application_path, get_dll_path().c_str());
…
do_something.cpp
extern char application_path[]; // global is now accessible in do_something.cpp
Regarding the first scenario above, I’ve considered removing all of the different “include globals.h” and using extern where access to those globals is needed but have not done so since just including the globals.h is so convenient.
I am concerned that I will have different versions of the variables for each implementation file that includes globals.h.
Should I use extern instead of including the globals.h everywhere access is needed?
Please advise, and thank you.
Global mutable variables
provide invisible lines of influence across all of the code, and
you cannot rely on their values, or whether they've been initialized.
That is, global mutable variables do for data flow what the global goto once did for execution flow, creating a spaghetti mess, wasting everyone's time.
Constant global variables are more OK, but even for those you run into
the initialization order fiasco.
I remember how angry I got when I realized that all my troubles in wrapping a well known GUI framework, was due to it needlessly using global variables and provoking the initialization order fiasco. First the anger was directed at the author, then at myself for being so stupid, not realizing what was going on (or rather, was not going on). Anyway.
A sensible solution to all this is Meyers' singletons, like
inline
auto pi_decimal_digits()
-> const string&
{
static const string the_value = compute_pi_digits();
return the_value;
}
For the case of a global that's dynamically initialized from some place that knows the value, “one programmer's constant is another programmer's variable”, there is no good solution, but one practical solution is to accept the possibility of a run time error and at least detect it:
namespace detail {
inline
auto mutable_pi_digits()
-> string&
{
static string the_value;
return the_value;
}
} // namespace detail
inline
void set_pi_digits( const string& value )
{
string& digits = detail::mutable_pi_digits();
assert( digits.length() == 0 );
digits = value;
}
inline
auto pi_digits()
-> const string&
{ return detail::mutable_pi_digits(); }
Your implementation is fine for now. Globals become a problem when
Your program grows and so does your number of globals.
New people join the team that don't know what you were thinking.
Number 1 becomes particularly troublesome when your program becomes multi-threaded. Then you have a number of threads using the same data and you may require protection, which is difficult with just a list of globals.
By grouping data in separate files according to some criteria such as purpose or subject matter your code becomes more maintainable as it grows and you leave breadcrumbs for new programmers on the project to figure out how the software works.
One issue with globals is that when you go to include 3rd party libraries in your code, sometimes they've used globals with the same names as yours. There are definitely times when a global makes sense, but if possible you should also take care to do something like put it into a namespace.

Refactoring large case statement, externs, static locals

I'm trying to do some refactoring and wish to figure out the best path forward.
I have
myonce{
static int i //for operation 1
switch(commandid) {
case 1: operation 1
i = 1;
...
where myonce is a function that is called in a loop. This is not my code, I'm trying to make it better. Operation 1 (or each case) is a series of commands, and I want to put them in their own translation units (one function per file).
Since myonce runs in a loop, the original author has many static variables that he uses to keep state, some of these state sets are used across multiple operations. Note that these are not static file scope, they are static block scope.
To keep things simple, as a proof of concept, I want to know if the following is possible.
Consider 1 operation with 1 set of static vars.
main.cpp
myonce {
static int i //for op 1
switch(commandid) {
case 1: operation1();
operation1.cpp
extern int i;
void operation1() {
i = 1;
}
In the case of multiple operations using the same sets of state, I would make a header to declare them all extern.
Currently compilation of this file is counted in minutes, and my first goal is to break it up into smaller compilation units so that the author can work more freely. This refactoring will take a long time, but I mention this to explain my motivation for this approach.
I understand that a static file scope variable is not accessible to other translation units (extern in other files), so I wish to distinguish that this is not the case I'm handling. What I don't know at the moment, is where I should declare operation1() to main, should it be
static int i
extern void operation1();
So that int is declared as visible to the function?
I would appreciate any pointers in this regard. Thanks.
Put the state variables into a struct. Pass this struct to each function.
Example.
// foo.h
struct TheState
{
int x;
char *y;
// ...
};
void func1(TheState &);
void func2(TheState &);
// main.cc
#include "foo.h"
void main_loop()
{
TheState the_state; // initialize this however you want
for (;;)
{
if ( blah) func1(the_state);
else func2(the_state);
}
}
// func1.cc
#include "foo.h"
void func1(TheState &the_state)
{
++the_state.x;
}
No, you can't do that. static objects aren't visible in other source files, ever.
How large is your switch anyway? And what is the reason for modifying it?
Perhaps the original programmer had good reasons for the local, static variables? You say it is called in a loop, and some of the static variables are used to keep state from one iteration to the next, shared among branches of the switch. It is certainly a weird way to structure the code. I can think of doing something like this to run some sort of finite automaton, but in that case I'd write the automaton as a string of snippets of code for each state, and transfer among them by straight gotos. I'd make certain somewhere very near there is a description of the automaton in a more readable form.
But I might be totally off-base. Can you share a bit more about what this code does?
First, switches often are avoidable by creating better data structures with their functions (e.g. classes with a virtual member function command whose implementations do the right thing).
On a less ambitious level you could just pass pointers to the statics which are needed in that particular case to the function so that it can read and modify the state of those variables.
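A minimal sketch of that less ambitious option, using the names from the question: myonce keeps its block-scope static and hands it to operation1, which could then live in its own file. A reference is used here; a pointer works the same way. The int return value and the increment are only there to make the effect visible and are not from the question.

```cpp
// operation1.cpp (hypothetical): needs no extern, the state comes in as a parameter.
void operation1(int& i)
{
    i += 1;  // stand-in for the real body of case 1
}

// main.cpp (hypothetical): the static stays local to myonce.
int myonce(int commandid)
{
    static int i = 0;  // block-scope static, still invisible to other files
    switch (commandid) {
    case 1:
        operation1(i);  // lend the state to the extracted function
        break;
    default:
        break;
    }
    return i;
}
```

The state survives between calls exactly as before, and only the functions that are handed i can touch it.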
Depending on what the functions do, one could also pass state information as value parameters (copies), let the function do their work depending on that state, receive the results and THEN change global state in the main switch according to the result. The state change then is clearly visible (i.e. no side effects in the functions) and the noisy distracting details are banned to another file.
If each case tends to use many of the static variables then you could put them all in a struct; that change should be doable with a text editor (replace variable name x with mystruct.x etc.). Then each function just gets a pointer to that struct. EDIT: As I said in a comment: Perhaps the commands naturally form groups which are concerned with only parts of the state (e.g. there are commands which only read, others which only write data etc.). Then the global state could be split in corresponding groups of data. Each function only gets to see the data group which concerns it, which limits potential side effects.
But generally speaking, the function as it is now seems badly designed or grown over time; working on a large set of static variables means having "side effects" in the code all over -- it's not easy to see what any given portion of code does and how it interacts with others. The information flow is not explicit. Analyzing clusters of data which belong together, organizing them in classes and separating them in files would be one task here, even without any virtual member functions.
As to your last question: The "case functions" you create (operation1(); etc.) need only be known in the file which call them. If they are in one or several separate files you should create a header containing the prototypes.

How to avoid C++ anonymous objects

I have a ScopedLock class which can help to release lock automatically when running out of scope.
However, the problem is: Sometimes team members write invalid lock-code such as
{
ScopedLock(mutex); // anonymous
xxx;
}
The above code is wrong because the ScopedLock object is constructed and destructed immediately, so it fails to lock the expected area (xxx). I want the compiler to give an error when trying to compile such code. Can this be done?
I have searched g++ warning options, but fail to find the right one.
I have seen an interesting trick in one codebase, but it only works if your scoped_lock type is not a template (std::scoped_lock is).
#define scoped_lock(x) static_assert(false, "you forgot the variable name")
If you use the class correctly, you have
scoped_lock lock(mutex);
and since the scoped_lock identifier isn't followed by an open paren, the macro won't trigger and the code will remain as it is. If you write
scoped_lock(mutex);
the macro will trigger and the code will be substituted with
static_assert(false, "you forgot the variable name");
This will generate an informative message.
If you use a qualified name
threads::scoped_lock(mutex);
then the result will still not compile, but the message won't be as nice.
Of course, if your lock is a template, the bad code is
scoped_lock<mutex_type>(mutex);
which won't trigger the macro.
No, unfortunately there is no way to do this, as I explored in a blog post last year.
In it, I concluded:
I guess the moral of the story is to remember this story when using scoped_locks.
You can try to force all programmers in your team to use a macro, or a range-for trick, but then if you could guarantee that in every case then you'd be able to guarantee catching this bug in every case also.
You are looking for a way to programmatically catch this specific mistake when it's made, and there is none.
You can use a class and deleted function with the same name. Unfortunately this requires adding "class" keyword before the type.
class Guard
{
public:
explicit Guard(void)
{
}
};
static void Guard(void) = delete;
int main()
{
// Guard(); // Won't compile
// Guard g; // Won't compile
class Guard g;
}
To avoid this, introduce a macro which does this for you, always using the same name for the locker:
#define LOCK(mutex) ScopedLock _lock(mutex)
Then use it like this:
{
LOCK(mutex);
xxx;
}
As an alternative, Java's synchronize block can be simulated using a macro construct: In a for-loop running always exactly once, I instantiate such a locker in the initialization statement of the for-loop, so it gets destroyed when leaving the for-loop.
However, it has some pitfalls, unexpected behavior of a break statement being one example. This "hack" is introduced here.
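One way such a for-loop macro could look is sketched below. This is my own guess at the construct, not necessarily the linked one, and SYNCHRONIZED, sync_lock_, g_mutex and g_value are made-up names. It uses std::unique_lock so the loop condition can ask whether the lock is still held.

```cpp
#include <mutex>

// Hypothetical macro: the loop body runs exactly once with the mutex held.
// The iteration expression unlocks, so the condition fails on the second check.
#define SYNCHRONIZED(m)                                   \
    for (std::unique_lock<std::mutex> sync_lock_(m);      \
         sync_lock_.owns_lock();                          \
         sync_lock_.unlock())

std::mutex g_mutex;  // hypothetical mutex
int g_value = 0;     // hypothetical state guarded by g_mutex

int increment_synchronized()
{
    int result = 0;
    SYNCHRONIZED(g_mutex) {
        result = ++g_value;  // executed while g_mutex is locked
    }
    return result;
}

// Helper for checking that the mutex was released again.
bool mutex_is_free()
{
    if (!g_mutex.try_lock())
        return false;
    g_mutex.unlock();
    return true;
}
```

The break pitfall is visible here: a break inside the block leaves the hidden for-loop (the lock is still released by the destructor), not any loop that encloses the SYNCHRONIZED block.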
Of course, none of the above methods fully prevents accidental code like your example. But if you're used to locking mutexes using one of the two macros, it is less likely to happen. As the name of the locker class will then never appear in the code except in the macro definition, you can even introduce a commit hook in a version control system to avoid committing invalid code.
AFAIK there's no such a flag in gcc. A static analyzer may better suit your needs.
In C++17, a type can be marked [[nodiscard]], in which case a warning is encouraged for an expression that discards a value of that type (including by the case described here that resembles a declaration of a variable). In C++20, it can be applied to individual constructors as well if only some of them cause this sort of problem.
replace it with macro
#define CON2(x,y) x##y
#define CON(x,y) CON2(x,y)
#define LOCK(x) ScopedLock CON(unique_,__COUNTER__)(x)
usage
{
LOCK(mutex);
//do stuff
}
This macro will generate unique names for locks, allowing locking of other mutexes in inner scopes.

do {...} while(false)

I was looking at some code by an individual and noticed he seems to have a pattern in his functions:
<return-type> function(<params>)
{
<initialization>
do
{
<main code for function>
}
while(false);
<tidy-up & return>
}
It's not bad, more peculiar (the actual code is fairly neat and unsurprising). It's not something I've seen before and I wondered if anyone can think of any logic behind it - background in a different language perhaps?
You can break out of do{...}while(false).
A lot of people point out that it's often used with break as an awkward way of writing "goto". That's probably true if it's written directly in the function.
In a macro, OTOH, do { something; } while (false) is a convenient way to FORCE a semicolon after the macro invocation, absolutely no other token is allowed to follow.
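A small sketch of the macro case: SWAP_INTS and order_pair are hypothetical names. Because the macro body is a do { ... } while (false) without a trailing semicolon, the semicolon the caller writes completes the statement, and the macro behaves like a single statement even in a brace-less if/else.

```cpp
// Hypothetical multi-statement macro wrapped in do { ... } while (false).
#define SWAP_INTS(a, b)       \
    do {                      \
        int swap_tmp_ = (a);  \
        (a) = (b);            \
        (b) = swap_tmp_;      \
    } while (false)

// Puts the smaller value in lo; returns true if nothing had to change.
bool order_pair(int& lo, int& hi)
{
    bool already_ordered = false;
    if (lo > hi)
        SWAP_INTS(lo, hi);   // this ';' ends the do-while statement
    else
        already_ordered = true;
    return already_ordered;
}
```

Had the macro been written with plain braces, the semicolon after SWAP_INTS(lo, hi) would be an extra empty statement and the else would no longer attach to the if, so the code would not compile.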
And another possibility is that there either once was a loop there or iteration is anticipated to be added in the future (e.g. in test-driven development, iteration wasn't needed to pass the tests, but logically it would make sense to loop there if the function needed to be somewhat more general than currently required)
The break as goto is probably the answer, but I will put forward one other idea.
Maybe he wanted some locally defined variables and used this construct to get a new scope.
Remember while recent C++ allows for {...} anywhere, this was not always the case.
I've seen it used as a useful pattern when there are many potential exit points for the function, but the same cleanup code is always required regardless of how the function exits.
It can make a tiresome if/else-if tree a lot easier to read, by just having to break whenever an exit point is reached, with the rest of the logic inline afterwards.
This pattern is also useful in languages that don't have a goto statement. Perhaps that's where the original programmer learnt the pattern.
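A minimal sketch of the "many exit points, one cleanup" use described above. The function `firstByte` is hypothetical and only meant to show the shape of the pattern:

```cpp
#include <cstdio>

// Hypothetical helper: reads the first byte of a file.
// Returns the byte value (0..255), or -1 on any failure.
int firstByte(const char* path) {
    int result = -1;
    std::FILE* f = nullptr;

    do {
        if (path == nullptr) break;       // exit point 1
        f = std::fopen(path, "rb");
        if (f == nullptr) break;          // exit point 2
        int c = std::fgetc(f);
        if (c == EOF) break;              // exit point 3
        result = c;
    } while (false);

    // Shared cleanup, reached from every exit point above.
    if (f != nullptr) std::fclose(f);
    return result;
}
```

Each failure simply breaks, and the single cleanup section after the loop runs no matter which check failed.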
I've seen code like that so you can use break as a goto of sorts.
I think it's more convenient to write break instead of goto end. You don't even have to think up a name for the label which makes the intention clearer: You don't want to jump to a label with a specific name. You want to get out of here.
Also chances are you would need the braces anyway. So this is the do{...}while(false); version:
do {
// code
if (condition) break; // or continue
// more code
} while(false);
And this is the way you would have to express it if you wanted to use goto:
{
// code
if (condition) goto end;
// more code
}
end:
I think the meaning of the first version is much easier to grasp. Also it's easier to write, easier to extend, easier to translate to a language that doesn't support goto, etc.
The most frequently mentioned concern about the use of break is that it's a badly disguised goto. But actually break has more resemblance to return: Both instructions jump out of a block of code which is pretty much structured in comparison to goto. Nevertheless both instructions allow multiple exit points in a block of code which can be confusing sometimes. After all I would try to go for the most clear solution, whatever that is in the specific situation.
This is just a perversion of while to get the semantics of goto tidy-up without using the word goto.
It's bad form because when you use other loops inside the outer while the breaks become ambiguous to the reader. "Is this supposed to goto exit? or is this intended only to break out of the inner loop?"
This trick is used by programmers that are too shy to use an explicit goto in their code. The author of the above code wanted to have the ability to jump directly to the "cleanup and return" point from the middle of the code. But they didn't want to use a label and explicit goto. Instead, they can use a break inside the body of the above "fake" cycle to achieve the same effect.
Several explanations. The first one is general, the second one is specific to C preprocessor macros with parameters:
Flow control
I've seen this used in plain C code. Basically, it's a safer version of goto, as you can break out of it and all memory gets cleaned up properly.
Why would something goto-like be good? Well, if you have code where pretty much every line can return an error, but you need to react to all of them the same way (e.g. by handing the error to your caller after cleaning up), it's usually more readable to avoid an if( error ) { /* cleanup and error string generation and return here */ } as it avoids duplication of clean-up code.
However, in C++ you have exceptions + RAII for exactly this purpose, so I would consider it bad coding style.
Semicolon checking
If you forget the semicolon after a function-like macro invocation, arguments might contract in an undesired way and compile into valid syntax. Imagine the macro
#define PRINT_IF_DEBUGMODE_ON(msg) if( gDebugModeOn ) printf("%s", msg);
That is accidentally called as
if( foo )
PRINT_IF_DEBUGMODE_ON("Hullo\n")
else
doSomethingElse();
The "else" will be considered to be associated with the gDebugModeOn, so when foo is false, the exact reverse of what was intended will happen.
Providing a scope for temporary variables.
Since the do/while has curly braces, temporary variables have a clearly defined scope they can't escape.
Avoiding "possibly unwanted semicolon" warnings
Some macros are only activated in debug builds. You define them like:
#if DEBUG
#define DBG_PRINT_NUM(n) printf("%d\n",n);
#else
#define DBG_PRINT_NUM(n)
#endif
Now if you use this in a release build inside a conditional, it compiles to
if( foo )
;
Many compilers see this as the same as
if( foo );
Which is often written accidentally. So you get a warning. The do{}while(false) hides this from the compiler, and is accepted by it as an indication that you really want to do nothing here.
Avoiding capturing of lines by conditionals
Macro from previous example:
if( foo )
DBG_PRINT_NUM(42)
doSomething();
Now, in a debug build, since we also habitually included the semicolon, this compiles just fine. However, in the release build this suddenly turns into:
if( foo )
doSomething();
Or more clearly formatted
if( foo )
    doSomething();
Which is not at all what was intended. Adding a do{ ... }while(false) around the macro turns the missing semicolon into a compile error.
What's that mean for the OP?
In general, you want to use exceptions in C++ for error handling, and templates instead of macros. However, in the very rare case where you still need macros (e.g. when generating class names using token pasting) or are restricted to plain C, this is a useful pattern.
It looks like a C programmer. In C++, automatic variables have destructors which you use to clean up, so there should not be anything needed tidying up before the return. In C, you didn't have this RAII idiom, so if you have common clean up code, you either goto it, or use a once-through loop as above.
Its main disadvantage compared with the C++ idiom is that it will not tidy up if an exception is thrown in the body. C didn't have exceptions, so this wasn't a problem, but it does make it a bad habit in C++.
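The difference can be sketched as follows. Everything here (`g_open`, `Handle`, `mayThrow`, `loopStyle`, `raiiStyle`, `leakedAfter`) is a hypothetical name invented for the illustration:

```cpp
#include <stdexcept>

// Hypothetical resource: a counter of currently "open" handles.
static int g_open = 0;

// RAII wrapper: acquires in the constructor, releases in the destructor.
struct Handle {
    Handle() { ++g_open; }
    ~Handle() { --g_open; }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

void mayThrow(bool fail) {
    if (fail) throw std::runtime_error("failed");
}

// Once-through loop: the tidy-up line is skipped if the body throws.
void loopStyle(bool fail) {
    ++g_open;                 // acquire
    do {
        mayThrow(fail);
    } while (false);
    --g_open;                 // tidy-up, not reached when mayThrow throws
}

// RAII: the destructor runs during stack unwinding.
void raiiStyle(bool fail) {
    Handle h;
    mayThrow(fail);
}

// Runs fn with a forced failure, swallows the exception, and reports how
// many handles are still open afterwards.
int leakedAfter(void (*fn)(bool)) {
    g_open = 0;
    try {
        fn(true);
    } catch (const std::runtime_error&) {
    }
    return g_open;
}
```

Under these assumptions `leakedAfter(loopStyle)` leaves one handle open, while `leakedAfter(raiiStyle)` leaves none.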
It is a very common practice. In C. I try to think of it as if you want to lie to yourself in a way "I'm not using a goto". Thinking about it, there would be nothing wrong with a goto used similarly. In fact it would also reduce indentation level.
That said, I have noticed that these do..while loops very often tend to grow. And then they get ifs and elses inside, rendering the code actually not very readable, let alone testable.
Those do..while loops are normally intended to do a clean-up. By all means possible I would prefer to use RAII and return early from a short function. On the other hand, C doesn't provide you with as many conveniences as C++ does, making a do..while one of the best approaches to do a clean-up.
Maybe it’s used so that break can be used inside to abort the execution of further code at any point:
do {
if (!condition1) break;
some_code;
if (!condition2) break;
some_further_code;
// …
} while(false);
I think this is done to use break or continue statements. Some kind of "goto" code logic.
It's simple: Apparently you can jump out of the fake loop at any time using the break statement. Furthermore, the do block is a separate scope (which could also be achieved with { ... } only).
In such a situation, it might be a better idea to use RAII (objects automatically destructing correctly when the function ends). Another similar construct is the use of goto - yes, I know it's evil, but it can be used to have common cleanup code like so:
<return-type> function(<params>)
{
<initialization>
<main code for function using "goto error;" if something goes wrong>
<tidy-up in success case & return>
error:
<common tidy-up actions for error case & return error code or throw exception>
}
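A concrete, hedged sketch of the goto layout above. The function `duplicate` is hypothetical; note that all locals are declared before the first goto, because C++ does not allow a jump to bypass an initialization:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical function: duplicates `src` into `*out` via a scratch buffer.
// Returns 0 on success, -1 on error.
int duplicate(const char* src, char** out) {
    char* scratch = nullptr;
    char* copy = nullptr;
    std::size_t len = 0;

    if (src == nullptr || out == nullptr) goto error;
    len = std::strlen(src);

    scratch = static_cast<char*>(std::malloc(len + 1));
    if (scratch == nullptr) goto error;
    std::memcpy(scratch, src, len + 1);

    copy = static_cast<char*>(std::malloc(len + 1));
    if (copy == nullptr) goto error;
    std::memcpy(copy, scratch, len + 1);

    // Tidy-up in the success case.
    std::free(scratch);
    *out = copy;
    return 0;

error:
    // Common tidy-up for every failure; free(nullptr) is a no-op.
    std::free(scratch);
    std::free(copy);
    return -1;
}
```

On success the caller owns `*out` and is expected to release it with `std::free`.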
(As an aside: The do-while-false construct is used in Lua to make up for the missing continue statement.)
How old was the author?
I ask because I once came across some real-time Fortran code that did that, back in the late '80s. It turns out that is a really good way to simulate threads on an OS that doesn't have them. You just put the entire program (your scheduler) in a loop, and call your "thread" routines one by one. The thread routines themselves are loops that iterate until one of a number of conditions occurs (often one being that a certain amount of time has passed). It is "cooperative multitasking", in that it is up to the individual threads to give up the CPU every now and then so the others don't get starved. You can nest the looping subprogram calls to simulate thread priority bands.
Many answerers gave the reason for do{(...)break;}while(false). I would like to complement the picture by yet another real-life example.
In the following code I had to set the enumerator operation based on the address pointed to by the data pointer. Because a switch-case can be used only on integral types, I first did it inefficiently this way:
if (data == &array[o1])
operation = O1;
else if (data == &array[o2])
operation = O2;
else if (data == &array[on])
operation = ON;
Log("operation:",operation);
But since Log() and the rest of the code are the same for any chosen value of operation, I was wondering how to skip the remaining comparisons once the address has been found. And this is where do{(...)break;}while(false) comes in handy.
do {
if (data == &array[o1]) {
operation = O1;
break;
}
if (data == &array[o2]) {
operation = O2;
break;
}
if (data == &array[on]) {
operation = ON;
break;
}
} while (false);
Log("operation:",operation);
One may wonder why he couldn't do the same with break in an if statement, like:
if (data == &array[o1])
{
operation = O1;
break;
}
else if (...)
break interacts solely with the closest enclosing loop or switch, whether it be a for, while or do .. while type, so unfortunately that won't work.
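A small sketch of that rule, using hypothetical function names. The first function shows that a break inside a do/while(false) nested in a for loop leaves only the inner construct; the second shows one possible alternative, a helper with early returns:

```cpp
// Counts how many iterations of the outer for loop run to completion even
// though each one executes a break inside the inner do/while(false).
int iterationsCompleted() {
    int completed = 0;
    for (int i = 0; i < 3; ++i) {
        do {
            if (i >= 0) break;   // leaves only the do/while, not the for
            completed = -100;    // never reached
        } while (false);
        ++completed;             // still runs on every iteration
    }
    return completed;
}

// An if statement alone is not a break target; a small function with
// early returns is one way to get a similar effect.
int classify(int value) {
    if (value < 0) return -1;
    if (value == 0) return 0;
    return 1;
}
```

This is also why the idiom can confuse readers once real loops are nested inside it: each break has to be matched to its enclosing loop by eye.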
In addition to the already mentioned 'goto examples', the do ... while (0) idiom is sometimes used in a macro definition so that the definition can contain braces while the macro call still works with a semicolon added at its end.
http://groups.google.com/group/comp.soft-sys.ace/browse_thread/thread/52f670f1292f30a4?tvc=2&q=while+(0)
I agree with most posters about the usage as a thinly disguised goto. Macros have also been mentioned as a potential motivation for writing code in the style.
I have also seen this construct used in mixed C/C++ environments as a poor man's exception. The "do {} while(false)" with a "break" can be used to skip to the end of the code block should something that would normally warrant an exception be encountered in the loop.
I have also seen this construct used in shops where the "single return per function" ideology is enforced. Again, this is in lieu of an explicit "goto" - but the motivation is to avoid multiple return points, not to "skip over" code and continue actual execution within that function.
I work with the Adobe InDesign SDK, and the InDesign SDK examples have almost every function written like this. This is because the functions are usually really long: you need to call QueryInterface(...) to get anything from the application object model, so usually every QueryInterface is followed by a check that breaks if the call did not go well.
Many have already stated the similarity between this construct and a goto, and expressed a preference for the goto. Perhaps this person's background included an environment where goto's were strictly forbidden by coding guidelines?
The other reason I can think of is that it decorates the braces, whereas I believe in a newer C++ standard naked braces are not okay (ISO C doesn't like them). Otherwise to quiet a static analyzer like lint.
Not sure why you'd want them, maybe variable scope, or advantage with a debugger.
See Trivial Do While loop, and Braces are Good from C2.
To clarify my terminology (which I believe follows standard usage):
Naked braces:
init();
...
{
c = NULL;
mkwidget(&c);
finishwidget(&c);
}
shutdown();
Empty braces (NOP):
{}
e.g.
while (1)
{} /* Do nothing, endless loop */
Block:
if (finished)
{
closewindows(&windows);
freememory(&cache);
}
which would become
if (finished)
closewindows(&windows);
freememory(&cache);
if the braces are removed, thus altering the flow of execution, not just the scope of local variables. Thus not 'freestanding' or 'naked'.
Naked braces or a block may be used to signify any section of code that might be a potential for an (inline) function that you wish to mark, but not refactor at that time.
It's a contrived way to emulate a GOTO as these two are practically identical:
// NOTE: This is discouraged!
do {
if (someCondition) break;
// some code be here
} while (false);
// more code be here
and:
// NOTE: This is discouraged, too!
if (someCondition) goto marker;
// some code be here
marker:
// more code be here
On the other hand, both of these should really be done with ifs:
if (!someCondition) {
// some code be here
}
// more code be here
Although the nesting can get a bit ugly if you just turn a long string of forward-GOTOs into nested ifs. The real answer is proper refactoring, though, not imitating archaic language constructs.
If you were desperately trying to transliterate an algorithm with GOTOs in it, you could probably do it with this idiom. It's certainly non-standard and a good indicator that you're not adhering closely to the expected idioms of the language, though.
I'm not aware of any C-like language where do/while is an idiomatic solution for anything, actually.
You could probably refactor the whole mess into something more sensible to make it more idiomatic and much more readable.
Some coders prefer to only have a single exit/return from their functions. The use of a dummy do { .... } while(false); allows you to "break out" of the dummy loop once you've finished and still have a single return.
I'm a Java coder, so my example would be something like
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class p45
{
static List<String> cakeNames = Arrays.asList("schwarzwald torte", "princess", "icecream");
static Set<Integer> forbidden = Stream.of(0, 2).collect(Collectors.toSet());
public static void main(String[] argv)
{
for (int i = 0; i < 4; i++)
{
System.out.println(String.format("cake(%d)=\"%s\"", i, describeCake(i)));
}
}
static String describeCake(int typeOfCake)
{
String result = "unknown";
do {
// ensure type of cake is valid
if (typeOfCake < 0 || typeOfCake >= cakeNames.size()) break;
if (forbidden.contains(typeOfCake)) {
result = "not for you!!";
break;
}
result = cakeNames.get(typeOfCake);
} while (false);
return result;
}
}
In such cases I use
switch(true) {
case condition1:
...
break;
case condition2:
...
break;
}
This is amusing. There are probably breaks inside the loop as others have said. I would have done it this way :
while(true)
{
<main code for function>
break; // at the end.
}