In short: this is a question about smart pointers in C. The reason: I do embedded programming and need to ensure that, when a complex algorithm is used, proper deallocation happens with little effort on the developer's side.
My favorite feature of C++ is that an object allocated on the stack is properly cleaned up when it goes out of scope. Go's defer provides the same functionality and is a bit closer in spirit to C.
Go's defer would be the desired way of doing things in C. Is there a practical way to add such functionality?
The goal is to simplify tracking when and where an object goes out of scope. Here is a quick example:
struct MyDataType *data = malloc(sizeof(struct MyDataType));
defer(data, deallocator);
if (condition) {
// deallocator(data) is called automatically
return;
}
// do something
if (irrelevant) {
struct DT *localScope = malloc(...);
defer(localScope, deallocator);
// deallocator(localScope) is called when we exit this scope
}
struct OtherType *data2 = malloc(...);
defer(data2, deallocator);
if (someOtherCondition) {
// deallocator(data) and deallocator(data2) are called in the order added
return;
}
In other languages I could create an anonymous function inside the code block, assign it to a variable, and execute it manually before every return. This would be at least a partial solution. In Go, deferred functions can be chained. Manual chaining with anonymous functions in C is error-prone and impractical.
Thank you
In C++, I've seen "stack based classes" that follow the RAII pattern. You could make a general purpose Defer class (or struct) that can take any arbitrary function or lambda.
For example:
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>
using std::cout;
using std::endl;
using std::function;
using std::string;
struct Defer {
function<void()> action;
Defer(function<void()> doLater) : action{doLater} {}
~Defer() {
action();
}
};
void Subroutine(int i) {
Defer defer1([]() { cout << "Phase 1 done." << endl; });
if (i == 1) return;
char const* p = new char[100];
Defer defer2([p]() { delete[] p; cout << "Phase 2 done, and p deallocated." << endl; });
if (i == 2) return;
string s = "something";
Defer defer3([&s]() { s = ""; cout << "Phase 3 done, and s set to empty string." << endl; });
}
int main() {
cout << "Call Subroutine(1)." << endl;
Subroutine(1);
cout << "Call Subroutine(2)." << endl;
Subroutine(2);
cout << "Call Subroutine(3)." << endl;
Subroutine(3);
return EXIT_SUCCESS;
}
There are many different answers, but a few interesting details have not been mentioned.
Of course, C++ destructors are very powerful and should be used often, and sometimes smart pointers can help. But the mechanism that most resembles defer is ON_BLOCK_EXIT/ON_BLOCK_EXIT_OBJ (see http://www.drdobbs.com/cpp/generic-change-the-way-you-write-excepti/184403758 ). Do not forget to read about ByRef.
One big difference between C++ and Go is when the deferred code is called. In C++ it runs when your program leaves the scope in which it was created, but in Go it runs when your program leaves the function. That means this code won't work at all, because the deferred Unlock calls only run when the function returns, so the second iteration tries to lock a mutex that is still held:
for i:=0; i < 10; i++ {
mutex.Lock()
defer mutex.Unlock()
/* do something under the mutex */
}
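For contrast, here is a minimal C++ sketch of the same loop shape. ScopeGuard and run_loop are hypothetical names, not taken from any library, and the log vector merely stands in for a real mutex. Because the guard is destroyed at the end of each iteration, the "unlock" step runs before the next "lock":

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard: runs the stored action when it is destroyed.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> action) : action_(std::move(action)) {}
    ~ScopeGuard() { action_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> action_;
};

// Returns the sequence of events produced by `iterations` passes of the loop.
std::vector<std::string> run_loop(int iterations) {
    std::vector<std::string> log;
    for (int i = 0; i < iterations; ++i) {
        log.push_back("lock");
        ScopeGuard unlock([&log] { log.push_back("unlock"); });
        log.push_back("work");
    }  // the guard is destroyed here, at the end of every iteration
    return log;
}
```

Calling run_loop(2) should record lock, work, unlock twice in a row, which is the behavior the Go loop above does not give you.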
Of course, C does not pretend to be object-oriented, so it has no destructors at all. This helps readability a lot, because you know that at line X your program does only what is written on that line. In C++, by contrast, each closing curly bracket can trigger calls to dozens of destructors.
In C you can use the much-hated goto statement. Don't use it for anything else, but it is practical to have a cleanup label at the end of the function and to goto cleanup from many places. It gets a bit more complicated when you need to release more than one resource, because then you need more than one cleanup label. Your function then ends with:
cleanup_file:
fclose(f);
cleanup_mutex:
pthread_mutex_unlock(mutex);
return ret;
}
C does not have destructors (unless you count the GCC-specific variable attribute cleanup, which is weird and rarely used; note also that the GCC function attribute destructor is not what other languages, notably C++, call a destructor). C++ has them. And C and C++ are very different languages.
In C++11, you might define your own class holding a std::vector of std::function objects, initialized from a std::initializer_list of lambda expressions (and perhaps dynamically extended with push_back). Its destructor could then mimic Go's deferred statements. But this is not idiomatic.
Go has defer statements, and they are idiomatic in Go.
I recommend sticking to the idioms of your programming languages.
(In other words: don't think in Go while coding in C++)
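For reference, the non-idiomatic C++11 class described above might be sketched roughly as follows. DeferList and deferred_order are hypothetical names chosen for illustration; the only assumption beyond the text is that the actions run in reverse order of registration, as Go's deferred calls do.

```cpp
#include <functional>
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical class: runs its stored actions in reverse order on destruction.
class DeferList {
public:
    DeferList() = default;
    DeferList(std::initializer_list<std::function<void()>> actions)
        : actions_(actions) {}
    DeferList(const DeferList&) = delete;
    DeferList& operator=(const DeferList&) = delete;

    // Register one more action; it will run before the ones added earlier.
    void push_back(std::function<void()> action) {
        actions_.push_back(std::move(action));
    }

    ~DeferList() {
        for (auto it = actions_.rbegin(); it != actions_.rend(); ++it) {
            (*it)();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};

// Records the order in which three deferred actions run.
std::vector<int> deferred_order() {
    std::vector<int> order;
    {
        DeferList defers{[&order] { order.push_back(1); }};
        defers.push_back([&order] { order.push_back(2); });
        defers.push_back([&order] { order.push_back(3); });
    }  // the destructor runs here: action 3, then 2, then 1
    return order;
}
```

Declaring one such object at the top of a function gives function-level, last-in-first-out cleanup, at the cost of a heap-allocating std::vector and type-erased std::function objects.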
You could also embed some interpreter (e.g. Lua or Guile) in your application. You might also learn more about garbage collection techniques and concepts and use them in your software (in other words, design your application with its specific GC).
Reason: embedded programming and need to ensure that if complex algorithm is used, then proper deallocation occurs with little effort on the developer side.
You might use arena-based allocation techniques, and de-allocate the arena when suitable... When you think about that, it is similar to copying GC techniques.
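As a rough illustration of the arena idea, here is a minimal bump-allocator sketch over a caller-supplied buffer. The Arena class and its member names are hypothetical, the alignment argument is assumed to be a power of two, and objects placed in the arena are assumed not to need destructors.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bump allocator: hands out slices of one fixed buffer and
// releases all of them at once with reset().
class Arena {
public:
    Arena(unsigned char* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity), used_(0) {}

    // Returns nullptr when the request does not fit.
    // `align` is assumed to be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_);
        const std::uintptr_t current = base + used_;
        const std::uintptr_t aligned =
            (current + (align - 1)) & ~static_cast<std::uintptr_t>(align - 1);
        const std::size_t offset = static_cast<std::size_t>(aligned - base);
        if (offset > capacity_ || size > capacity_ - offset) {
            return nullptr;
        }
        used_ = offset + size;
        return buffer_ + offset;
    }

    // Frees everything allocated so far in one step.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    unsigned char* buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

A complex algorithm can then allocate freely from the arena and call reset() once at the end, instead of tracking every individual deallocation.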
Maybe you dream of some homoiconic language with a powerful macro system suitable for meta-programming. Then look into Common Lisp.
I implemented a very simple Go-like defer several days ago.
The one behavioral difference from Go is that my defer will not be executed when you throw an exception and never catch it. Another difference is that it cannot accept a function with multiple arguments as in Go, but we can work around that with a lambda that captures local variables.
The implementation is here.
class _Defer {
std::function<void()> __callback;
public:
_Defer(_Defer &&);
~_Defer();
template <typename T>
_Defer(T &&);
};
_Defer::_Defer(_Defer &&__that)
: __callback{std::forward<std::function<void()>>(__that.__callback)} {
}
template <typename T>
_Defer::_Defer(T &&__callback)
: __callback{
static_cast<std::function<void()>>(std::forward<T>(__callback))
} {
static_assert(std::is_convertible<T, std::function<void()>>::value,
"Cannot be convert to std::function<void()>.");
}
_Defer::~_Defer() {
this->__callback();
}
Then I defined some macros to make my defer look like a keyword in C++ (just for fun):
#define __defer_concatenate(__lhs, __rhs) \
__lhs##__rhs
#define __defer_declarator(__id) \
if (0); /* You may forgot a `;' or deferred outside of a scope. */ \
_Defer __defer_concatenate(__defer, __id) =
#define defer \
__defer_declarator(__LINE__)
The if (0); is there to prevent deferring a function outside of a scope. Then we can use defer as in Go:
#include <iostream>
void foo() {
std::cout << "foo" << std::endl;
}
int main() {
defer []() {
std::cout << "bar" << std::endl;
};
defer foo;
}
This will print
foo
bar
to screen.
GO defer would be the desired way of doing things in C. Is there a practical way to add such functionality?
The goal of doing so is simplification of tracking when and where object goes out of scope.
C does not have any built-in mechanism for automatically invoking any kind of behavior at the end of an object's lifetime. The object itself ceases to exist, and any memory it occupied is available for re-use, but there is no associated hook for executing code.
For some kinds of objects, that is entirely satisfactory by itself -- those whose values do not refer to other objects with allocated storage duration that need to be cleaned up as well. In particular, if struct MyDataType in your example is such a type, then you get automatic cleanup for free by declaring instances as automatic variables instead of allocating them dynamically:
void foo(void) {
// not a pointer:
struct MyDataType data /* = initializer */;
// ...
/* The memory (directly) reserved for 'data' is released */
}
For objects that require attention at the end of their lifetime, it is generally a matter of code style and convention to ensure that you know when to clean up. It helps, for example, to declare all of your variables at the top of the innermost block containing them, though C itself does not require this. It can also help to structure your code so that for each object that requires custom cleanup, all code paths that may execute during its lifetime converge at the end of that lifetime.
Myself, as a matter of personal best practices, I always try to write any cleanup code needed for a given object as soon as I write its declaration.
In other languages I could create an anonymous function inside the code block, assign it to the variable and execute manually in front of every return. This would be at least a partial solution. In GO language defer functions can be chained. Manual chaining with anonymous functions in C is error prone and impractical
C has neither anonymous functions nor nested ones. It often does make sense, however, to write (named) cleanup functions for data types that require cleanup. These are analogous to C++ destructors, but you must call them manually.
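A paired init/destroy function for a type that owns memory might look like the sketch below. It is written in a C-like style but compiled as C++ here, which is why the malloc result is cast; Message, message_init, message_destroy, and message_length are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical type that owns a heap-allocated copy of a string.
struct Message {
    char* text;
};

// Acquires the resource; returns false if the allocation fails.
bool message_init(Message* m, const char* text) {
    const std::size_t n = std::strlen(text) + 1;
    m->text = static_cast<char*>(std::malloc(n));
    if (m->text == nullptr) {
        return false;
    }
    std::memcpy(m->text, text, n);
    return true;
}

// The manual "destructor": must be called exactly once per successful init.
void message_destroy(Message* m) {
    std::free(m->text);
    m->text = nullptr;
}

// All paths that run after a successful init converge on message_destroy.
int message_length(const char* text) {
    Message m;
    if (!message_init(&m, text)) {
        return -1;
    }
    const int length = static_cast<int>(std::strlen(m.text));
    message_destroy(&m);
    return length;
}
```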
The bottom line is that many C++ paradigms such as smart pointers, and coding practices that depend on them, simply do not work in C. You need different approaches, and they exist, but converting a large body of existing C++ code to idiomatic C is a distinctly non-trivial undertaking.
For those using C, I’ve built a preprocessor in C (open source, Apache license) that inserts the deferred code at the end of each block:
https://sentido-labs.com/en/library/#cedro
GitHub: https://github.com/Sentido-Labs/cedro/
It includes a C utility that wraps the compiler (works out-of-the-box with GCC and clang, configurable) so you can use it as drop-in replacement for cc, called cedrocc, and if you decide to get rid of it, running cedro on a C source file will produce plain C. (see the examples in the manual)
The alternatives I know about are listed in the “Related work” part of the documentation:
Apart from the already mentioned «A defer mechanism for C», there are macros that use a for loop as for (allocation and initialization; condition; release) { actions } [a] or other techniques [b].
[a] “P99 Scope-bound resource management with for-statements” from the same author (2010), “Would it be possible to create a scoped_lock implementation in C?” (2016), ”C compatible scoped locks“ (2021), “Modern C and What We Can Learn From It - Luca Sas [ ACCU 2021 ] 00:17:18”, 2021
[b] “Would it be possible to create a scoped_lock implementation in C?” (2016), “libdefer: Go-style defer for C” (2016), “A Defer statement for C” (2020), “Go-like defer for C that works with most optimization flag combinations under GCC/Clang” (2021)
Compilers like GCC and clang have non-standard features to do this like the __cleanup__ variable attribute.
This implementation avoids dynamic allocation and most limitations of the other implementations shown here. It relies on a deduction guide, so it needs C++17 or later:
#include<type_traits>
#include<utility>
template<typename F>
struct deferred
{
std::decay_t<F> f;
template<typename G>
deferred(G&& g) : f{std::forward<G>(g)} {}
~deferred() { f(); }
};
template<typename G>
deferred(G&&) -> deferred<G>;
#define CAT_(x, y) x##y
#define CAT(x, y) CAT_(x, y)
#define ANONYMOUS_VAR(x) CAT(x, __LINE__)
#define DEFER deferred ANONYMOUS_VAR(defer_variable) = [&]
And use it like
#include<iostream>
int main()
{
DEFER {
std::cout << "world!\n";
};
std::cout << "Hello ";
}
Now, whether to allow exceptions in DEFER is a design choice bordering on philosophy, and I'll leave it to Andrei to fill in the details.
Note that all such deferring facilities in C++ are necessarily bound to the scope in which they are declared, as opposed to Go's defer, which is bound to the function in which it is declared.
I am working in C++ with two large pieces of code, one done in "C style" and one in "C++ style".
The C-type code has functions that return const char* and the C++ code has in numerous places things like
const char* somecstylefunction();
...
std::string imacppstring = somecstylefunction();
where it is constructing the string from a const char* returned by the C style code.
This worked until the C style code changed and started returning NULL pointers sometimes. This of course causes seg faults.
There is a lot of code around, so I would like the most parsimonious way to fix this problem. The expected behavior is that imacppstring would be the empty string in this case. Is there a nice, slick solution to this?
Update
The const char* returned by these functions are always pointers to static strings. They were used mostly to pass informative messages (destined for logging most likely) about any unexpected behavior in the function. It was decided that having these return NULL on "nothing to report" was nice, because then you could use the return value as a conditional, i.e.
if (somecstylefunction()) do_something;
whereas before the functions returned the static string "";
Whether this was a good idea, I'm not going to touch this code and it's not up to me anyway.
What I wanted to avoid was tracking down every string initialization to add a wrapper function.
Probably the best thing to do is to restore the C library functions to their behavior before the breaking change, but maybe you don't have control over that library.
The second thing to consider is to change all the instances where you're depending on the C lib functions returning an empty string to use a wrapper function that'll 'fix up' the NULL pointers:
const char* nullToEmpty( char const* s)
{
return (s ? s : "");
}
So now
std::string imacppstring = somecstylefunction();
might look like:
std::string imacppstring( nullToEmpty( somecstylefunction()));
If that's unacceptable (it might be a lot of busy work, but it should be a one-time mechanical change), you could implement a 'parallel' library that has the same names as the C lib you're currently using, with those functions simply calling the original C lib functions and fixing the NULL pointers as appropriate. You'd need to play some tricky games with headers, the linker, and/or C++ namespaces to get this to work, and this has a huge potential for causing confusion down the road, so I'd think hard before going down that road.
But something like the following might get you started:
// .h file for a C++ wrapper for the C Lib
namespace clib_fixer {
const char* somecstylefunction();
}
// .cpp file for a C++ wrapper for the C Lib
namespace clib_fixer {
const char* somecstylefunction() {
const char* p = ::somecstylefunction();
return (p ? p : "");
}
}
Now you just have to add that header to the .cpp files that are currently calling the C lib functions (and probably remove the header for the C lib) and add a
using namespace clib_fixer;
to the .cpp file using those functions.
That might not be too bad. Maybe.
Well, without changing every place where a C++ std::string is initialized directly from a C function call (to add the null-pointer check), the only solution would be to prohibit your C functions from returning null pointers.
With GCC, you can use the compiler extension "Conditionals with Omitted Operands" to create a wrapper macro for your C function:
#define somecstylefunction() (somecstylefunction() ? : "")
but in general case I would advise against that.
I suppose you could just add a wrapper function which tests for NULL, and returns an empty std::string. But more importantly, why are your C functions now returning NULL? What does a NULL pointer indicate? If it indicates a serious error, you might want your wrapper function to throw an exception.
Or to be safe, you could just check for NULL, handle the NULL case, and only then construct an std::string.
const char* s = somecstylefunction();
if (!s) explode();
std::string str(s);
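If a NULL return really does signal a serious error, the throwing wrapper suggested above could be sketched like this. toStringOrThrow is a hypothetical name, and the choice of std::invalid_argument is only an assumption for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical wrapper: converts a C string to std::string, treating NULL as an error.
std::string toStringOrThrow(const char* s) {
    if (s == nullptr) {
        throw std::invalid_argument("C function returned a null string");
    }
    return std::string(s);
}
```

A call site would then read std::string str = toStringOrThrow(somecstylefunction()); and the NULL case surfaces as an exception instead of a crash.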
For a portable solution:
(a) Define your own string type. The biggest part is a search and replace over the entire project; that can be simple if it's always std::string, or a big one-time pain. (I'd make the sole requirement that it's Liskov-substitutable for a std::string but also constructs an empty string from a null char *.)
The easiest implementation is inheriting publicly from std::string. Even though that's frowned upon (for understandable reasons), it would be OK in this case, and it would also help with 3rd-party libraries expecting a std::string, as well as debug tools. Alternatively, aggregate and forward - yuck.
(b) #define std::string to be your own string type. Risky, not recommended. I wouldn't do it unless I knew the codebases involved very well and it saved me tons of work (and I'd add some disclaimers to protect the remains of my reputation ;))
(c) I've worked around a few such cases by re-#define'ing the offensive type to some utility class only for the purpose of the include (so the #define is much more limited in scope). However, I have no idea how to do that for a char *.
(d) Write an import wrapper. If the C library headers have a rather regular layout, and/or you know someone who has some experience parsing C++ code, you might be able to generate a "wrapper header".
(e) ask the library owner to make the "Null string" value configurable at least at compile time. (An acceptable request since switching to 0 can break compatibility as well in other scenarios) You might even offer to submit the change yourself if that's less work for you!
You could wrap all your calls to C-style functions in something like this...
std::string makeCppString(const char* cStr)
{
return cStr ? std::string(cStr) : std::string("");
}
Then wherever you have:
std::string imacppstring = somecstylefunction();
replace it with:
std::string imacppstring = makeCppString( somecstylefunction() );
Of course, this assumes that constructing an empty string is acceptable behavior when your function returns NULL.
I don't generally advocate subclassing standard containers, but in this case it might work.
class mystring : public std::string
{
public:
// ... appropriate constructors are an exercise left to the reader
mystring & operator=(const char * right)
{
if (right == NULL)
{
clear();
}
else
{
std::string::operator=(right); // I think this works, didn't check it...
}
return *this;
}
};
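One way the constructors left as an exercise might be filled in is sketched below, under the assumption that a null pointer should always become the empty string. NullSafeString is a hypothetical name, and only the constructors needed for the initialization pattern in the question are shown.

```cpp
#include <string>

// Hypothetical std::string subclass that maps a null const char* to "".
class NullSafeString : public std::string {
public:
    NullSafeString() = default;

    // Implicit on purpose, so `NullSafeString s = somecstylefunction();` compiles.
    NullSafeString(const char* s) : std::string(s ? s : "") {}

    NullSafeString(const std::string& s) : std::string(s) {}

    NullSafeString& operator=(const char* s) {
        std::string::operator=(s ? s : "");
        return *this;
    }
};
```

Because std::string has no virtual destructor, such objects should not be deleted through a std::string pointer; used as plain local or member variables they behave like ordinary strings.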
Something like this should fix your problem.
const char *cString;
std::string imacppstring;
cString = somecstylefunction();
if (cString == NULL) {
imacppstring = "";
} else {
imacppstring = cString;
}
If you want, you could stick the error checking logic in its own function. You'd have to put this code block in fewer places, then.
I've stumbled across this great post about validating parameters in C#, and now I wonder how to implement something similar in C++. The main thing I like about this approach is that it does not cost anything until the first validation fails, as the Begin() function returns null, and the other functions check for this.
Obviously, I can achieve something similar in C++ using Validate* v = 0; IsNotNull(v, ...).IsInRange(v, ...) and have each of them pass on the v pointer, plus return a proxy object for which I duplicate all functions.
Now I wonder whether there is a similar way to achieve this without temporary objects, until the first validation fails. Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
Other than the fact that C++ does not have extension methods (which makes it harder to add new validations), it shouldn't be too hard.
class Validation
{
vector<string> *errors;
void AddError(const string &error)
{
if (errors == NULL) errors = new vector<string>();
errors->push_back(error);
}
public:
Validation() : errors(NULL) {}
~Validation() { delete errors; }
const Validation &operator=(const Validation &rhs)
{
if (errors == NULL && rhs.errors == NULL) return *this;
if (rhs.errors == NULL)
{
delete errors;
errors = NULL;
return *this;
}
vector<string> *temp = new vector<string>(*rhs.errors);
std::swap(temp, errors);
delete temp; // free the previous error list (may be NULL)
return *this;
}
void Check()
{
if (errors)
throw exception();
}
template <typename T>
Validation &IsNotNull(T *value)
{
if (value == NULL) AddError("Cannot be null!");
return *this;
}
template <typename T, typename S>
Validation &IsLessThan(T valueToCheck, S maxValue)
{
if (!(valueToCheck < maxValue)) AddError("Value is too big!");
return *this;
}
// etc..
};
class Validate
{
public:
static Validation Begin() { return Validation(); }
};
Use..
Validate::Begin().IsNotNull(somePointer).IsLessThan(4, 30).Check();
Can't say much to the rest of the question, but I did want to point out this:
Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
No. You still have to allocate the vector's own member variables (such as storage for the length), and I believe it's up to the implementation whether it pre-allocates any room for vector elements upon construction. Either way, you are allocating SOMETHING, and while it may not be much, allocation is never "free", regardless of whether it takes place on the stack or the heap.
That being said, I would imagine that the time taken to do such things will be so minimal that it will only really matter if you are doing it many many times over in quick succession.
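Whether a given standard library touches the heap for an empty vector can be checked with a counting allocator, as in the sketch below. CountingAllocator and the two helper functions are hypothetical names; the result is implementation-specific, and some debug configurations are known to allocate bookkeeping data even for empty containers.

```cpp
#include <cstddef>
#include <new>
#include <vector>

static std::size_t g_allocations = 0;  // number of calls to CountingAllocator::allocate

// Hypothetical minimal allocator that counts heap requests.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Heap requests made by default-constructing a vector.
std::size_t allocations_for_empty_vector() {
    g_allocations = 0;
    std::vector<int, CountingAllocator<int>> v;
    return g_allocations;
}

// Heap requests made once a single element is stored.
std::size_t allocations_after_push() {
    g_allocations = 0;
    std::vector<int, CountingAllocator<int>> v;
    v.push_back(1);
    return g_allocations;
}
```

On a typical release build the first function is expected to report no heap requests and the second at least one, but that is something to measure on your own toolchain rather than assume.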
I recommend taking a look at Boost.Exception, which provides basically the same functionality (adding arbitrary detailed exception information to a single exception object).
Of course you'll need to write some utility methods so you can get the interface you want. But beware: Dereferencing a null-pointer in C++ results in undefined behavior, and null-references must not even exist. So you cannot return a null-pointer in a way as your linked example uses null-references in C# extension methods.
For the zero-cost thing: A simple stack-allocation is quite cheap, and a boost::exception object does not do any heap-allocation itself, but only if you attach any error_info<> objects to it. So it is not exactly zero cost, but nearly as cheap as it can get (one vtable-ptr for the exception-object, plus sizeof(intrusive_ptr<>)).
Therefore this should be the last part where one tries to optimize further...
Re the linked article: Apparently, the overhead of creating objects in C# is so great that function calls are free in comparison.
I'd personally propose a syntax like
Validate().ISNOTNULL(src).ISNOTNULL(dst);
Validate() constructs a temporary object which is basically just a std::list of problems. Empty lists are quite cheap (no nodes, size=0). ~Validate will throw if the list is not empty. If profiling shows even this is too expensive, then you just change the std::list to a hand-rolled list. Remember, a pointer is an object too. You're not saving an object just by sticking to the unfortunate syntax of a raw pointer. Conversely, the overhead of wrapping a raw pointer with a nice syntax is purely a compile-time price.
PS. ISNOTNULL(x) would be a #define for IsNotNull(x,#x) - similar to how assert() prints out the failed condition, without having to repeat it.
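A minimal sketch of that macro idea follows. It deliberately swaps the throwing destructor for an explicit Check() call, since throwing from a destructor needs extra care in C++11 and later; the member names, the message format, and first_problem are assumptions made for illustration.

```cpp
#include <list>
#include <stdexcept>
#include <string>

// Hypothetical validator: collects problems in a list and throws on Check().
class Validate {
public:
    Validate& IsNotNull(const void* p, const char* expr) {
        if (p == nullptr) {
            problems_.push_back(std::string(expr) + " must not be null");
        }
        return *this;
    }

    // Throws with the first recorded problem, if there is one.
    void Check() const {
        if (!problems_.empty()) {
            throw std::invalid_argument(problems_.front());
        }
    }

private:
    std::list<std::string> problems_;
};

// #x turns the argument expression into a string literal, as assert() does.
#define ISNOTNULL(x) IsNotNull((x), #x)

// Returns the first validation message, or an empty string if both pointers are valid.
std::string first_problem(const char* src, const char* dst) {
    try {
        Validate().ISNOTNULL(src).ISNOTNULL(dst).Check();
    } catch (const std::invalid_argument& e) {
        return e.what();
    }
    return "";
}
```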
Currently I am working on a project where goto statements are heavily used. The main purpose of goto statements is to have one cleanup section in a routine rather than multiple return statements.
Like below:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = NULL;
p = new int;
if (p == NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p = NULL;
}
return bRetVal;
}
This makes it much easier as we can track our clean up code at one section in code, that is, after the Exit label.
However, I have read in many places that it's bad practice to have goto statements.
Currently I am reading the Code Complete book, and it says that we need to use variables close to their declarations. If we use goto then we need to declare/initialize all variables before first use of goto otherwise the compiler will give errors that initialization of xx variable is skipped by the goto statement.
Which way is right?
From Scott's comment:
It looks like using goto to jump from one section to another is bad as it makes the code hard to read and understand.
But if we use goto just to go forward and to one label then it should be fine(?).
I am not sure what you mean by clean up code, but in C++ there is a concept called "resource acquisition is initialization" (RAII), and it should be the responsibility of your destructors to clean up stuff.
(Note that in C# and Java, this is usually solved by try/finally)
For more info check out this page:
http://www.research.att.com/~bs/bs_faq2.html#finally
EDIT: Let me clear this up a little bit.
Consider the following code:
void MyMethod()
{
MyClass *myInstance = new MyClass("myParameter");
/* Your code here */
delete myInstance;
}
The problem: What happens if you have multiple exits from the function? You have to keep track of each exit and delete your objects at all possible exits! Otherwise, you will have memory leaks and zombie resources, right?
The solution: Use automatic (stack-allocated) objects instead, as they get cleaned up automatically when control leaves the scope.
void MyMethod()
{
MyClass myInstance("myParameter");
/* Your code here */
/* You don't need delete - myInstance will be destructed and deleted
* automatically on function exit */
}
Oh yes, and use std::unique_ptr or something similar because the example above as it is is obviously imperfect.
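When the object really has to live on the heap, the std::unique_ptr variant could be sketched as follows. This MyClass is a hypothetical stand-in with an instance counter added purely so the cleanup can be observed, and the leaveEarly parameter is likewise an assumption.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for MyClass that counts live instances.
struct MyClass {
    explicit MyClass(std::string parameter) : parameter(std::move(parameter)) { ++live; }
    ~MyClass() { --live; }
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;

    std::string parameter;
    static int live;
};
int MyClass::live = 0;

int MyMethod(bool leaveEarly) {
    std::unique_ptr<MyClass> myInstance(new MyClass("myParameter"));
    if (leaveEarly) {
        return 1;  // myInstance is deleted here
    }
    /* Your code here */
    return 2;      // ... and here, with no explicit delete on either path
}
```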
I've never had to use a goto in C++. Ever. EVER. If there is a situation it should be used, it's incredibly rare. If you are actually considering making goto a standard part of your logic, something has flown off the tracks.
There are basically two points people are making in regards to gotos and your code:
Goto is bad. It's very rare to encounter a place where you need gotos, but I wouldn't suggest striking it completely. Though C++ has smart enough control flow to make goto rarely appropriate.
Your mechanism for cleanup is wrong: This point is far more important. In C, using memory management on your own is not only OK, but often the best way to do things. In C++, your goal should be to avoid manual memory management as much as possible. Let the compiler do it for you. Rather than using new, just declare variables. The only time you'll really need memory management is when you don't know the size of your data in advance. Even then, you should try to just use some of the STL collections instead.
In the event that you legitimately need memory management (you have not really provided any evidence of this), then you should encapsulate your memory management within a class, using constructors to allocate memory and destructors to deallocate it.
Your response that your way of doing things is much easier is not really true in the long run. Firstly, once you get a strong feel for C++ making such constructors will be 2nd nature. Personally, I find using constructors easier than using cleanup code, since I have no need to pay careful attention to make sure I am deallocating properly. Instead, I can just let the object leave scope and the language handles it for me. Also, maintaining them is MUCH easier than maintaining a cleanup section and much less prone to problems.
In short, goto may be a good choice in some situations but not in this one. Here it's just short term laziness.
Your code is extremely non-idiomatic and you should never write it. You're basically emulating C in C++ there. But others have remarked on that, and pointed to RAII as the alternative.
However, your code won't work as you expect, because this:
p = new int;
if(p==NULL) { … }
won't ever evaluate to true (unless you've overloaded operator new in a weird way). If operator new is unable to allocate enough memory, it throws an exception; it never returns 0, at least not with this set of parameters. There is a special overload that takes std::nothrow (an object of type std::nothrow_t) and does return 0 instead of throwing. That version is rarely used in normal code, though some low-level code or embedded applications can benefit from it in contexts where dealing with exceptions is too expensive.
Something similar is true for your delete block, as Harald has said: if (p) is unnecessary in front of delete p.
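Both points can be seen in a small sketch. allocate_without_exceptions is a hypothetical function name; new (std::nothrow) and the null-pointer behavior of delete are standard C++.

```cpp
#include <new>

// Returns false if the allocation failed, true if the stored value could be read back.
bool allocate_without_exceptions() {
    int* p = new (std::nothrow) int(42);  // yields a null pointer on failure instead of throwing
    if (p == nullptr) {
        return false;
    }
    const bool ok = (*p == 42);
    delete p;
    p = nullptr;
    delete p;  // deleting a null pointer is a no-op, so no if (p) guard is needed
    return ok;
}
```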
Additionally, I'm not sure whether your example was chosen intentionally, because this code can be rewritten as follows:
bool foo() // prefer native types to BOOL, if possible
{
bool ret = false;
int i;
// Lots of code.
return ret;
}
Probably not a good idea.
In general, and on the surface, there isn't anything wrong with your approach, provided that you only have one label, and that the gotos always go forward. For example, this code:
int foo()
{
int *pWhatEver = ...;
if (something(pWhatEver))
{
delete pWhatEver;
return 1;
}
else
{
delete pWhatEver;
return 5;
}
}
And this code:
int foo()
{
int ret;
int *pWhatEver = ...;
if (something(pWhatEver))
{
ret = 1;
goto exit;
}
else
{
ret = 5;
goto exit;
}
exit:
delete pWhatEver;
return ret;
}
really aren't all that different from each other. If you can accept one, you should be able to accept the other.
However, in many cases the RAII (resource acquisition is initialization) pattern can make the code much cleaner and more maintainable. For example, this code:
int foo()
{
Auto<int> pWhatEver = ...;
if (something(pWhatEver))
{
return 1;
}
else
{
return 5;
}
}
is shorter, easier to read, and easier to maintain than both of the previous examples.
So, I would recommend using the RAII approach if you can.
Your example is not exception safe.
If you are using goto to clean up the code then, if an exception happens before the cleanup code, it is completely missed. If you claim that you do not use exceptions then you are mistaken because the new will throw bad_alloc when it does not have enough memory.
Also at this point (when bad_alloc is thrown), your stack will be unwound, missing all the cleanup code in every function on the way up the call stack thus not cleaning up your code.
You need to do some research into smart pointers. In the situation above you could just use a std::auto_ptr<> (or, since C++11, std::unique_ptr<>, which replaces the now-removed auto_ptr).
Also note that in C++ code there is usually no need to check whether a pointer is NULL, partly because you rarely hold raw pointers, and partly because new will not return NULL (it throws).
Also, in C++, unlike C, it is common to see early returns in the code. This is because RAII will do the cleanup automatically, while in C code you need to make sure that you add special cleanup code at the end of the function (a bit like your code).
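The exception-safety difference can be made visible with a small sketch. All names here are hypothetical, and a simple counter stands in for a real resource so that nothing is actually leaked.

```cpp
#include <stdexcept>

static int g_open = 0;  // number of currently acquired (hypothetical) resources

void acquire() { ++g_open; }
void release() { --g_open; }

void do_work(bool fail) {
    if (fail) {
        throw std::runtime_error("work failed");
    }
}

// Manual cleanup, as with a cleanup label: skipped entirely if do_work throws.
void manual_cleanup(bool fail) {
    acquire();
    do_work(fail);
    release();
}

// RAII wrapper: the destructor runs during stack unwinding as well.
struct Handle {
    Handle() { acquire(); }
    ~Handle() { release(); }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

void raii_cleanup(bool fail) {
    Handle h;
    do_work(fail);
}

// Runs fn with a forced failure and reports how many resources are still open.
int open_after_failure(void (*fn)(bool)) {
    g_open = 0;
    try {
        fn(true);
    } catch (const std::runtime_error&) {
    }
    return g_open;
}
```

With the manual version one resource stays open after the exception; with the Handle version none does.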
I think other answers (and their comments) have covered all the important points, but here's one thing that hasn't been done properly yet:
What your code should look like instead:
bool foo() //lowercase bool is a built-in C++ type. Use it if you're writing C++.
{
try {
std::unique_ptr<int> p(new int);
// lots of code, and just return true or false directly when you're done
return true; // placeholder so that every path returns a value
}
catch (const std::bad_alloc&) { // new throws an exception on OOM, it doesn't return NULL
cout<<" OOM \n";
return false;
}
}
Well, it's shorter, and as far as I can see, more correct (handles the OOM case properly), and most importantly, I didn't need to write any cleanup code or do anything special to "make sure my return value is initialized".
One problem with your code that I only really noticed when I wrote this is: "what the hell is bRetVal's value at this point?" I don't know, because it was declared waaaaay above, and it was last assigned to when? At some point above this. I have to read through the entire function to make sure I understand what's going to be returned.
And how do I convince myself that the memory gets freed?
How do I know that we never forget to jump to the cleanup label? I have to work backwards from the cleanup label, finding every goto that points to it, and more importantly, find the ones that aren't there. I need to trace through all paths of the function just to be sure that the function gets cleaned up properly. That reads like spaghetti code to me.
Very fragile code, because every time a resource has to be cleaned up you have to remember to duplicate your cleanup code. Why not write it once, in the type that needs to be cleaned up? And then rely on it being executed automatically, every time we need it?
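To make "write it once, in the type" concrete, here is a minimal sketch. The Buffer class, the g_liveBuffers counter and the process function are hypothetical names invented for this illustration, not anything from the original code. The counter exists only so you can check that every exit path released the resource.

```cpp
#include <cstddef>

// Hypothetical counter: lets the example show that every exit path cleans up.
int g_liveBuffers = 0;

// Hypothetical resource type: the cleanup is written once, in the destructor.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]) { ++g_liveBuffers; }
    ~Buffer() { delete[] data_; --g_liveBuffers; }
    Buffer(const Buffer&) = delete;             // one owner only
    Buffer& operator=(const Buffer&) = delete;
    char* data() { return data_; }
private:
    char* data_;
};

// Three exit points and no cleanup label: the destructor runs on each of them.
bool process(int input) {
    Buffer buf(64);
    if (input < 0) return false;   // early exit, buf is released
    buf.data()[0] = static_cast<char>(input);
    if (input == 0) return false;  // another early exit, buf is released
    return true;                   // normal exit, buf is released
}
```

After any mix of calls such as process(-1), process(0) and process(5), g_liveBuffers is back at 0, without a single line of cleanup code in process itself.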
In the eight years I've been programming I've used goto a lot, but most of that was in the first year, when I was using a version of GW-BASIC and a book from 1980 that didn't make it clear that goto should only be used in certain cases. The only time I've used goto in C++ is in code like the following, and I'm not sure whether there was a better way.
for (int i=0; i<10; i++) {
for (int j=0; j<10; j++)
{
if (somecondition==true)
{
goto finish;
}
//Some code
}
//Some code
}
finish:
The only situation I know of where goto is still used heavily is mainframe assembly language, and the programmers I know make sure to document where code is jumping and why.
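For the nested-loop case above, one commonly used alternative is to move the loops into their own function and return from it. The sketch below assumes the loops are searching for something. findProduct and its i * j == target condition are made up for the example and stand in for somecondition.

```cpp
#include <utility>

// Hypothetical search over the same 10x10 iteration space as the loops above.
// Returns the first (i, j) with i * j == target, or (-1, -1) if there is none.
std::pair<int, int> findProduct(int target) {
    for (int i = 0; i < 10; i++) {
        for (int j = 0; j < 10; j++) {
            if (i * j == target) {
                return std::make_pair(i, j);  // leaves both loops at once
            }
        }
    }
    return std::make_pair(-1, -1);
}
```

Whether this reads better than the goto depends on how much state the loops share with the surrounding code. If they touch many local variables, pulling them out into a function may cost more than it saves.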
As used in the Linux kernel, goto's used for cleanup work well when a single function must perform 2 or more steps that may need to be undone. Steps need not be memory allocation. It might be a configuration change to a piece of code or in a register of an I/O chipset. Goto's should only be needed in a small number of cases, but often when used correctly, they may be the best solution. They are not evil. They are a tool.
Instead of...
do_step1;
if (failed)
{
undo_step1;
return failure;
}
do_step2;
if (failed)
{
undo_step2;
undo_step1;
return failure;
}
do_step3;
if (failed)
{
undo_step3;
undo_step2;
undo_step1;
return failure;
}
return success;
you can do the same with goto statements like this:
do_step1;
if (failed) goto unwind_step1;
do_step2;
if (failed) goto unwind_step2;
do_step3;
if (failed) goto unwind_step3;
return success;
unwind_step3:
undo_step3;
unwind_step2:
undo_step2;
unwind_step1:
undo_step1;
return failure;
It should be clear that given these two examples, one is preferable to the other. As to the RAII crowd... There is nothing wrong with that approach as long as they can guarantee that the unwinding will always occur in exactly reverse order: 3, 2, 1. And lastly, some people do not use exceptions in their code and instruct the compilers to disable them. Thus not all code must be exception safe.
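On the ordering point: C++ destroys local objects in the reverse order of their construction, so the 3, 2, 1 order comes from the language rather than from programmer discipline. Here is a rough sketch of the three-step example in that style. Step, g_log and the keep flag are hypothetical names. The flag is my assumption about how to model "success means nothing is undone", since a destructor otherwise always runs.

```cpp
#include <string>

// Hypothetical log so the order of actions can be inspected afterwards.
std::string g_log;

// Hypothetical guard: "does" a step on construction, "undoes" it on
// destruction unless the step has been marked to keep.
struct Step {
    char id;
    bool keep;
    explicit Step(char c) : id(c), keep(false) {
        g_log += "do"; g_log += id; g_log += ' ';
    }
    ~Step() {
        if (!keep) { g_log += "undo"; g_log += id; g_log += ' '; }
    }
};

bool runSteps(bool failAfterStep3) {
    g_log.clear();
    Step s1('1');
    Step s2('2');
    Step s3('3');
    if (failAfterStep3) return false;    // s3, s2, s1 are undone in that order
    s1.keep = s2.keep = s3.keep = true;  // success: keep all three steps
    return true;
}
```

After runSteps(true) the log reads "do1 do2 do3 undo3 undo2 undo1 ", and after runSteps(false) it reads "do1 do2 do3 ".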
You should read this thread summary from the Linux kernel mailing lists (paying special attention to the responses from Linus Torvalds) before you form a policy for goto:
http://kerneltrap.org/node/553/2131
In general, you should design your programs to limit the need for gotos. Use OO techniques for "cleanup" of your return values. There are ways to do this that don't require the use of gotos or complicating the code. There are cases where gotos are very useful (for example, deeply nested scopes), but if possible should be avoided.
The downside of GOTO is pretty well discussed. I would just add that 1) sometimes you have to use them and should know how to minimize the problems, and 2) some accepted programming techniques are GOTO-in-disguise, so be careful.
1) When you have to use GOTO, such as in ASM or in .bat files, think like a compiler. If you want to code
if (some_test){
... the body ...
}
do what a compiler does. Generate a label whose purpose is to skip over the body, not to do whatever follows. i.e.
if (not some_test) GOTO label_at_end_of_body
... the body ...
label_at_end_of_body:
Not
if (not some_test) GOTO the_label_named_for_whatever_gets_done_next
... the body ...
the_label_named_for_whatever_gets_done_next:
In other words, the purpose of the label is not to do something, but to skip over something.
2) What I call GOTO-in-disguise is anything that could be turned into GOTO+LABELS code by just defining a couple macros. An example is the technique of implementing finite-state-automata by having a state variable, and a while-switch statement.
while (not_done){
switch(state){
case S1:
... do stuff 1 ...
state = S2;
break;
case S2:
... do stuff 2 ...
state = S1;
break;
.........
}
}
can turn into:
while (not_done){
switch(state){
LABEL(S1):
... do stuff 1 ...
GOTO(S2);
LABEL(S2):
... do stuff 2 ...
GOTO(S1);
.........
}
}
just by defining a couple macros. Just about any FSA can be turned into structured goto-less code. I prefer to stay away from GOTO-in-disguise code because it can get into the same spaghetti-code issues as undisguised gotos.
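For readers who haven't seen the state-variable-plus-while-switch form in real code, here is a small compilable instance of it. The word-counting task, the State enum and countWords are assumptions chosen only to give the states something to do.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical two-state machine: are we between words or inside one?
enum State { OUTSIDE_WORD, INSIDE_WORD };

int countWords(const std::string& text) {
    State state = OUTSIDE_WORD;
    int words = 0;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const bool space =
            std::isspace(static_cast<unsigned char>(text[pos])) != 0;
        switch (state) {
        case OUTSIDE_WORD:
            if (!space) { ++words; state = INSIDE_WORD; }  // a word starts here
            break;
        case INSIDE_WORD:
            if (space) { state = OUTSIDE_WORD; }           // the word ended
            break;
        }
        ++pos;
    }
    return words;
}
```

Every "state = ...; break;" pair here is exactly the spot where the macro version would put a GOTO, which is the point being made above.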
Added: Just to reassure: I think one mark of a good programmer is recognizing when the common rules don't apply.
Using goto to go to a cleanup section is going to cause a lot of problems.
First, cleanup sections are prone to problems. They have low cohesion (no real role that can be described in terms of what the program is trying to do ), high coupling (correctness depends very heavily on other sections of code), and are not at all exception-safe. See if you can use destructors for cleanup. For example, if int *p is changed to auto_ptr<int> p, what p points to will be automatically released.
Second, as you point out, it's going to force you to declare variables long before use, which will make it harder to understand the code.
Third, while you're proposing a fairly disciplined use of goto, there's going to be the temptation to use them in a looser manner, and then the code will become difficult to understand.
There are very few situations where a goto is appropriate. Most of the time, when you are tempted to use them, it's a signal that you're doing things wrong.
The entire purpose of the every-function-has-a-single-exit-point idiom in C was to put all the cleanup stuff in a single place. If you use C++ destructors to handle cleanup, that's no longer necessary -- cleanup will be done regardless of how many exit points a function has. So in properly-designed C++ code, there's no longer any need for this kind of thing.
Since this is a classic topic, I will reply with Dijkstra's Go-to statement considered harmful (originally published in ACM).
Goto provides better don't repeat yourself (DRY) when "tail-end-logic" is common to some-but-not-all-cases. Especially within a "switch" statement I often use goto's when some of the switch-branches have tail-end-commonality.
switch(){
case a: ... goto L_abTail;
case b: ... goto L_abTail;
L_abTail: <common stuff>
break; // end of case b
case c:
.....
}//switch
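Filled in so that it compiles, the pattern might look like the sketch below. The describe function and the strings it builds are placeholders of mine, and only the goto structure follows the outline above.

```cpp
#include <string>

// Hypothetical function: cases 'a' and 'b' share a tail, case 'c' does not.
std::string describe(char kind) {
    std::string out;
    switch (kind) {
    case 'a':
        out += "A-specific;";
        goto L_abTail;
    case 'b':
        out += "B-specific;";
        goto L_abTail;
    L_abTail:
        out += "shared;";   // tail-end logic common to 'a' and 'b' only
        break;
    case 'c':
        out += "C-specific;";
        break;
    default:
        break;
    }
    return out;
}
```

Here describe('a') yields "A-specific;shared;" while describe('c') yields just "C-specific;".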
You have probably noticed that introducing additional curly braces is enough to satisfy the compiler when you need such tail-end merging in the middle of a routine. In other words, you don't need to declare everything way up at the top; that's inferior readability indeed.
...
goto L_skipMiddle;
{
int declInMiddleVar = 0;
....
}
L_skipMiddle: ;
With the later versions of Visual Studio detecting the use of uninitialized variables, I find myself always initializing most variables even though I think they may be assigned in all branches - it's easy to code a "tracing" statement which refs a variable that was never assigned because your mind doesn't think of the tracing statement as "real code", but of course Visual Studio will still detect an error.
Besides don't repeat yourself, assigning label-names to such tail-end-logic even seems to help my mind keep things straight by choosing nice label names. Without a meaningful label your comments might end up saying the same thing.
Of course, if you are actually allocating resources and auto_ptr doesn't fit, you really must use a try-catch, but tail-end-merge-don't-repeat-yourself happens quite often when exception safety is not an issue.
In summary, while goto can be used to code spaghetti-like structures, in the case of a tail-end-sequence which is common to some-but-not-all-cases then the goto IMPROVES the readability of the code and even maintainability if you would otherwise be copy/pasting stuff so that much later on someone might update one-and-not-the-other. So it's another case where being fanatic about a dogma can be counterproductive.
The only two reasons I use goto in my C++ code are:
Breaking out of loops nested two or more levels deep
Complicated flows like this one (a comment in my program):
/* Analysis algorithm:
1. if classData [exporter] [classDef with name 'className'] exists, return it,
else
2. if project/target_codename/temp/classmeta/className.xml exists, parse it and go back to 1 as it will succeed.
3. if that file doesn't exist, generate it via haxe -xml, and go back to 1 as it will succeed.
*/
For code readability here, after this comment, I defined the step1 label and used it in steps 2 and 3. Actually, in 60+ source files, this situation and one 4-level nested for loop are the only places I used goto. Only two places.
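For comparison, the same "try, repair, go back to step 1" flow can also be written as a loop with continue. This is only a sketch under heavy assumptions: g_classData and g_xmlFiles are in-memory stand-ins for the real class table and the XML files on disk, and "parsing" and "generating" are reduced to one-liners.

```cpp
#include <map>
#include <set>
#include <string>

std::map<std::string, std::string> g_classData;  // stand-in for classData
std::set<std::string> g_xmlFiles;                // stand-in for files on disk

std::string lookupClass(const std::string& className) {
    for (;;) {
        // Step 1: if the class is already known, return it.
        auto it = g_classData.find(className);
        if (it != g_classData.end()) return it->second;

        // Step 2: if the metadata file exists, "parse" it and retry step 1.
        if (g_xmlFiles.count(className) != 0) {
            g_classData[className] = "parsed:" + className;
            continue;
        }

        // Step 3: otherwise "generate" the file, then retry from step 1.
        g_xmlFiles.insert(className);
    }
}
```

The loop terminates because each pass either returns or moves the state one step closer to step 1 succeeding. Whether it is clearer than a labelled step1 is a matter of taste.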
A lot of people freak out and say gotos are evil; they are not. That said, you will never need one; there is just about always a better way.
When I find myself "needing" a goto to do this type of thing, I almost always find that my code is too complex and can be easily broken up into a few method calls that are easier to read and deal with. Your calling code can do something like:
// Setup
if(
methodA() &&
methodB() &&
methodC()
)
// Cleanup
Not that this is perfect, but it's much easier to follow since all your methods will be named to clearly indicate what the problem might be.
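A compilable version of that shape, assuming three made-up steps (loadInput, parseInput, writeOutput) and a trace string that is there only to show what ran:

```cpp
#include <string>

std::string g_trace;  // hypothetical trace of which steps ran

// Hypothetical steps; each reports success or failure through its return value.
bool loadInput()            { g_trace += "load;";  return true; }
bool parseInput(bool valid) { g_trace += "parse;"; return valid; }
bool writeOutput()          { g_trace += "write;"; return true; }

bool runPipeline(bool inputIsValid) {
    g_trace = "setup;";
    // && short-circuits: the first failing step stops the chain.
    const bool ok = loadInput() && parseInput(inputIsValid) && writeOutput();
    g_trace += "cleanup;";  // reached on success and on failure alike
    return ok;
}
```

With valid input the trace is "setup;load;parse;write;cleanup;", and with invalid input it is "setup;load;parse;cleanup;". Note that this gives no protection if one of the steps throws, so the cleanup line would still be better off in a destructor.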
Reading through the comments, however, should indicate that your team has more pressing issues than goto handling.
The code you're giving us is (almost) C code written inside a C++ file.
The kind of memory cleaning you're using would be OK in a C program not using C++ code/libraries.
In C++, your code is simply unsafe and unreliable. In C++ the kind of management you're asking for is done differently. Use constructors/destructors. Use smart pointers. Use the stack. In a word, use RAII.
Your code could (i.e., in C++, SHOULD) be written as:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<int> p(new int); // auto_ptr's constructor is explicit, so "= new int" would not compile
// Lot of code...
return bRetVal ;
}
(Note that new-ing an int is somewhat silly in real code, but you can replace int by any kind of object, and then, it makes more sense). Let's imagine we have an object of type T (T could be an int, some C++ class, etc.). Then the code becomes:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<T> p(new T); // auto_ptr's constructor is explicit, so "= new T" would not compile
// Lot of code...
return bRetVal ;
}
Or even better, using the stack:
BOOL foo()
{
BOOL bRetVal = FALSE;
T p ;
// Lot of code...
return bRetVal;
}
Anyway, any of the above examples is far easier to read, and far safer, than your example.
RAII has many facets (i.e. using smart pointers, the stack, using vectors instead of variable length arrays, etc.), but all in all is about writing as little code as possible, letting the compiler clean up the stuff at the right moment.
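As one small illustration of the "vectors instead of arrays" facet, here is a sketch where a std::vector replaces a new[]/delete[] pair. The function sumFirstN is a made-up example, not something from the question.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical example: needs a scratch array whose size is known only at run time.
int sumFirstN(int n) {
    if (n <= 0) return 0;  // early return, nothing to free
    std::vector<int> values(static_cast<std::size_t>(n));  // instead of new int[n]
    std::iota(values.begin(), values.end(), 1);            // fills 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0);
}   // the vector frees its storage here, on every path
```

There is no delete[] anywhere, and an exception thrown in the middle would not leak the array either.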
All of the above is valid. You might also look at whether you can reduce the complexity of your code, and with it the need for gotos, by reducing the amount of code in the section marked "lot of code" in your example. Additionally, delete 0 is a valid C++ statement (deleting a null pointer is a no-op).
Using goto labels in C++ is a bad way to program; you can reduce the need by doing OO programming (destructors!) and by keeping procedures as small as possible.
Your example looks a bit weird: there is no need to delete a NULL pointer. And nowadays an exception is thrown when memory can't be allocated.
Your procedure could just be written like:
bool foo()
{
bool bRetVal = false;
int p = 0;
// Calls to various methods that do algorithms on the p integer
// and give a return value back to this procedure.
return bRetVal;
}
You should place a try catch block in the main program handling out of memory problems that informs the user about the lack of memory, which is very rare... (Doesn't the OS itself inform about this too?)
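A minimal sketch of such a top-level handler follows. runGuarded is a hypothetical helper name, and the message text and the exit code 1 are arbitrary choices.

```cpp
#include <iostream>
#include <new>

// Hypothetical top-level wrapper: main() would run the real program through it.
template <typename Work>
int runGuarded(Work work) {
    try {
        work();
        return 0;
    } catch (const std::bad_alloc&) {
        // Keep this path simple: very little memory may be available here.
        std::cerr << "Out of memory\n";
        return 1;
    }
}
```

In a real program, main would be little more than "return runGuarded(realMain);", where realMain is whatever function does the actual work.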
Also note that there is not always the need to use a pointer, they are only useful for dynamic things. (Creating one thing inside a method not depending on input from anywhere isn't really dynamic)
I am not going to say that goto is always bad, but your use of it most certainly is. That kind of "cleanup section" was pretty common in the early 1990s, but using it for new code is pure evil.
The easiest way to avoid what you are doing here is to put all of this cleanup into some kind of simple structure and create an instance of it. For example instead of:
void MyClass::myFunction()
{
A* a = new A;
B* b = new B;
C* c = new C;
StartSomeBackgroundTask();
MaybeBeginAnUndoBlockToo();
if ( ... )
{
goto Exit;
}
if ( ... ) { .. }
else
{
... // what happens if this throws an exception??? too bad...
goto Exit;
}
Exit:
delete a;
delete b;
delete c;
StopMyBackgroundTask();
EndMyUndoBlock();
}
you should rather do this cleanup in some way like:
struct MyFunctionResourceGuard
{
MyFunctionResourceGuard( MyClass& owner )
: m_owner( owner )
, _a( new A )
, _b( new B )
, _c( new C )
{
m_owner.StartSomeBackgroundTask();
m_owner.MaybeBeginAnUndoBlockToo();
}
~MyFunctionResourceGuard()
{
m_owner.StopMyBackgroundTask();
m_owner.EndMyUndoBlock();
}
MyClass& m_owner; // declared first so it is initialized before the other members
std::auto_ptr<A> _a;
std::auto_ptr<B> _b;
std::auto_ptr<C> _c;
};
void MyClass::myFunction()
{
MyFunctionResourceGuard guard( *this );
if ( ... )
{
return;
}
if ( ... ) { .. }
else
{
...
}
}
A few years ago I came up with a pseudo-idiom that avoids goto and is vaguely similar to doing exception handling in C. It has probably already been invented by someone else, so I guess I "discovered it independently" :)
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
do
{
p = new (std::nothrow) int; // plain new throws std::bad_alloc and never returns NULL; needs <new>
if(p==NULL)
{
cout<<" OOM \n";
break;
}
// Lot of code...
bRetVal = TRUE;
} while (false);
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
I think using goto for exit code is bad, since there are a lot of other solutions with low overhead, such as having an exit function and returning the exit function's value when needed. Typically in member functions this shouldn't be needed, though; otherwise it could be an indication that there's a bit too much code bloat happening.
Typically, the only exception I make to the "no goto" rule when programming is when breaking out of nested loops to a specific level, which I've only run into the need to do when working on mathematical programming.
For example:
for(int i_index = start_index; i_index >= 0; --i_index)
{
for(int j_index = start_index; j_index >=0; --j_index)
for(int k_index = start_index; k_index >= 0; --k_index)
if(my_condition)
goto BREAK_NESTED_LOOP_j_index;
BREAK_NESTED_LOOP_j_index:;
}
That code has a bunch of problems, most of which were pointed out already, for example:
The function is too long; refactoring out some code into separate functions might help.
Using pointers when normal instances will probably work just fine.
Not taking advantage of STL types such as auto_ptr
Incorrectly checking for errors, and not catching exceptions. (I would argue that checking for OOM is pointless on the vast majority of platforms, since if you run out of memory you have bigger problems than your software can fix, unless you are writing the OS itself)
I have never needed a goto, and I've always found that using goto is a symptom of a bigger set of problems. Your case appears to be no exception.
Using "GOTO" will change the "logic" of a program and how you interpret it, or how you would imagine it works.
Avoiding GOTO commands has always worked for me, so I guess that when you think you might need one, all you really need is a redesign.
However, if we look at this at the assembly level, using "jump" is like using GOTO, and that's used all the time. BUT in assembly you can clear out what you know you have on the stack and in registers before you move on.
So, when using GOTO, I'd make sure the software still "appears" the way co-coders would interpret it; GOTO will have a "bad" effect on your software, imho.
So this is more an explanation of why not to use GOTO than a solution for a replacement, because that depends VERY much on how everything else is built.
I may have missed something: you jump to the label Exit if p is null, then test whether it's non-null (which it isn't) to decide whether you need to delete it (which isn't necessary, because it was never allocated in the first place).
The if/goto won't, and doesn't need to delete p. Replacing the goto with a return false would have the same effect (and then you could remove the Exit label).
The only places I know where goto's are useful are buried deep in nasty parsers (or lexical analyzers), and in faking out state machines (buried in a mass of CPP macros). In those two cases they've been used to make very twisted logic simpler, but that is very rare.
Functions (A calls A'), Try/Catches and setjmp/longjmps are all nicer ways of avoiding a difficult syntax problem.
Paul.
Ignoring the fact that new will never return NULL, take your code:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
p = new int;
if(p==NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
and write it like this:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = new int;
if (p!=NULL)
{
// Lot of code...
delete p;
}
else
{
cout<<" OOM \n";
}
return bRetVal;
}
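If you do want the NULL check to mean something, one option is the nothrow form of new combined with a smart pointer, so that neither the goto nor the explicit delete is needed. This is only a sketch: fooNothrow is a hypothetical name, it returns bool instead of BOOL so that it compiles without the Windows headers, and the assignment of 42 merely stands in for "lot of code".

```cpp
#include <iostream>
#include <memory>
#include <new>

bool fooNothrow() {
    // new (std::nothrow) returns a null pointer on failure instead of throwing.
    std::unique_ptr<int> p(new (std::nothrow) int(0));
    if (!p) {
        std::cout << " OOM \n";
        return false;
    }
    *p = 42;            // stands in for "lot of code"
    return *p == 42;    // p is released automatically on every return
}
```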