question with longjmp - c++

I want to use longjmp to simulate a goto instruction. I have an array DS containing elements of struct types (int, float, bool, char). I want to jump to the place labeled "labelx", where x is DS[TOP].int_val. How can I handle this?
Sample code:
...
jmp_buf *bfj;
...
stringstream s;s<<"label"<<DS[TOP].int_val;
bfj = (jmp_buf *) s.str();
longjmp(*bfj,1);
But, as I expected, it has a problem. What should I do?
Error:
output.cpp: In function ‘int main()’:
output.cpp:101: error: invalid cast from type ‘std::basic_string<char, std::char_traits<char>, std::allocator<char> >’ to type ‘__jmp_buf_tag (*)[1]’

You probably don't want to use longjmp at all but I hate it when people answer a question with "Why would you want to do that?" As has been pointed out your longjmp() usage is wrong. Here is a simple example of how to use it correctly:
#include <setjmp.h>
#include <iostream>
using namespace std;
jmp_buf jumpBuffer; // Declared globally but could also be in a class.
void a(int count) {
// . . .
cout << "In a(" << count << ") before jump" << endl;
// Calling longjmp() here is OK because it is above setjmp() on the call
// stack.
longjmp(jumpBuffer, count); // setjmp() will return count (or 1 if count is 0)
// . . .
}
void b() {
volatile int count = 0; // volatile: modified between setjmp() and longjmp()
cout << "Setting jump point" << endl;
if (setjmp(jumpBuffer) == 9) return;
cout << "After jump point" << endl;
a(count++); // This will loop 10 times.
}
int main(int argc, char *argv[]) {
b();
// Note: You cannot call longjmp() here because it is below the setjmp() call
// on the call stack.
return 0;
}
The problems with your usage of longjmp() are as follows:
You don't call setjmp()
You haven't allocated the jmp_buf either on the stack or dynamically. jmp_buf *bfj is just a pointer.
You cannot cast a char * to jmp_buf * and expect it to work. C++ is not a dynamic language; it is statically compiled.
But really, it is very unlikely that you should be using longjmp() at all.

The normal way to use longjmp() is in combination with setjmp(), as described here. You seem to want to make a jump table, as normally done with switch-case or with virtual functions.
Anyway, labels in code (compile-time) are not reachable with strings (run-time), so that is already your first problem. You would really need to find out the address of where you want to jump to, and my best guess would be to put setjmp() calls where your labels are.

You've totally failed C++. Firstly, goto's are bad, and not for the uninitiated- there's a reason that for, while, break, continue etc exist. Secondly, you're trying to convert a string into an identifier, which is impossible at runtime unless you code it yourself. Thirdly, you're.. trying to cast a const char* to a jmp_buf*? What?
In addition to that, C++ does have goto. But if you want to jump given an int, then you're going to have to switch it, e.g.
switch (DS[TOP].int_val) {
case 1:
goto label1;
break;
case 2:
goto label2;
break;
default:
throw std::runtime_error("Unrecognized label!");
}

Sounds like you want a function pointer:
((void(*)(void))*((int *)DS[TOP].int_val))();
This treats DS[TOP].int_val like an address and jumps to it. If you wanted to jump to where DS[TOP].int_val is located, you would:
((void(*)(void))*((int *)&DS[TOP].int_val))();
Either way, ugly, ugly code. But it should do what you want.
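A less fragile way to get a similar effect is a table of ordinary function pointers indexed by the integer, with a bounds check. This is only a sketch under assumptions: the handler functions and the name dispatch are hypothetical stand-ins for whatever code sits at each label, and the index argument plays the role of DS[TOP].int_val.

```cpp
#include <cstddef>
#include <stdexcept>

// Every "label" becomes a function with this signature.
typedef int (*handler)();

// Hypothetical handlers; each returns a value so the effect is observable.
static int handle_label0() { return 100; }
static int handle_label1() { return 200; }
static int handle_label2() { return 300; }

// The table replaces the run-time string "label" + number.
static const handler kHandlers[] = { handle_label0, handle_label1, handle_label2 };

// Calls the handler selected by index; throws if no such handler exists.
int dispatch(int index) {
    const std::size_t count = sizeof(kHandlers) / sizeof(kHandlers[0]);
    if (index < 0 || static_cast<std::size_t>(index) >= count) {
        throw std::out_of_range("Unrecognized label!");
    }
    return kHandlers[index]();
}
```

Unlike the casts above, nothing here converts an integer into an address, so an out-of-range value is reported instead of jumping to an arbitrary location.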

When setjmp() is called, the system effectively takes a snapshot of the call and parameter stack. This snapshot will remain valid until user code exits the block in which setjmp() was called; if longjmp() is called with that snapshot, execution will resume as though the setjmp() were returning for the first time, except that instead of returning zero, it will return the second parameter passed to longjmp(). It is very important to note that calling longjmp() with an invalid snapshot may have very bad effects. In some systems, such an invalid call may "seem" to work, but corrupt the system in such a way that it later crashes.
Although setjmp()/longjmp() are sometimes appropriate in pure C programs, having a C program call setjmp() to create a snapshot, and then call some C++ code which in turn calls longjmp() to return to that snapshot, is a recipe for disaster. Nearly all situations where one would want to do that may be better handled using exceptions.
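As a rough illustration of the exception-based alternative, the sketch below uses try/catch where setjmp() would go and throw where longjmp() would go. The Tracer class and the function names are hypothetical; the point is only that destructors of objects on the way out are run, which longjmp() does not guarantee.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records its own destruction so the unwinding order can be inspected.
class Tracer {
public:
    Tracer(std::vector<std::string>& log, const std::string& name)
        : log_(log), name_(name) {}
    ~Tracer() { log_.push_back("destroyed " + name_); }
private:
    std::vector<std::string>& log_;
    std::string name_;
};

void inner(std::vector<std::string>& log) {
    Tracer t(log, "inner");
    throw std::runtime_error("bail out");    // plays the role of longjmp()
}

std::vector<std::string> run() {
    std::vector<std::string> log;
    try {                                     // plays the role of setjmp()
        Tracer t(log, "outer");
        inner(log);
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}
```

Calling run() yields the entries "destroyed inner", "destroyed outer", "caught bail out", in that order: both objects are destroyed before the handler runs.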


Bad practice to call static function from external file via function pointer?

Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file_1.cpp:
// file_1.cpp
#include "file_1.hpp"
static void some_func(void)
{
do_stuff();
}
func_ptr file1_get_function(void)
{
return some_func;
}
file_2.cpp:
#include "file_1.hpp"
void file2_func(void)
{
func_ptr function_pointer_to_file1 = file1_get_function();
function_pointer_to_file1();
}
While I believe the above example is technically possible - to call a function with internal linkage only via a function pointer, is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto inline, for instance) that would make this situation problematic?
There's no problem, this is fine. In fact, IMHO, it is good practice, since it lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
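The calculator idea could be sketched roughly as follows. The names binary_op, lookup_operator and the op_* functions are made up for illustration; the operations have internal linkage and only the lookup function is visible to other files.

```cpp
#include <string>

typedef double (*binary_op)(double, double);

// Internal linkage: these names are invisible outside this translation unit.
static double op_add(double a, double b) { return a + b; }
static double op_sub(double a, double b) { return a - b; }
static double op_mul(double a, double b) { return a * b; }

// The only externally visible symbol. Returns a null pointer for unknown names.
binary_op lookup_operator(const std::string& name) {
    if (name == "add") return op_add;
    if (name == "sub") return op_sub;
    if (name == "mul") return op_mul;
    return nullptr;
}
```

A caller in another file would write something like lookup_operator("add")(2.0, 3.0), after checking the returned pointer for null.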
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.
In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this, though. One is fingerprint hiding, which fails because you have to expose it in the function pointer declaration, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}
fnptr get_voodoo_function(const char* file, int line) {
// you tagged the question as C++, so C++ io it is.
std::cout << "requested voodoo function from " << file << ":" << line << "\n";
return voodoo_function;
}
...
// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}
void do_voodoo_function() {
// provide external access to voodoo
voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not to just remove the static from in front of voodoo_function, other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide it?
static void ban_account_if_user_is_ugly() {
...;
}
void do_that_thing() {
ban_account_if_user_is_ugly();
}
vs
void do_that_thing() { // ban account if user is ugly
...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, let's consider this:
// v1
fnptr get_fn_ptr() {
// brute force cast because otherwise it's 'hassle'
return (fnptr)(static_fn);
}
// v2
int facade_fn(int i) {
auto ui = static_cast<unsigned int>(i);
auto result = static_fn(ui);
return static_cast<int>(result);
}
Ok unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and one becomes a compile error.

main () returns an integer? [duplicate]

I am currently reading about what functions are in C++. It says that they are "artifacts that enable you to divide the content of your application into functional units that can be invoked in a sequence of your choosing. A function, when invoked, typically returns a value to the calling function."
It then goes on to say that main() is recognized by the compiler as the starting point of your c++ application and has to return an int (integer).
I don't know what is meant by 'has to return an integer'. From my (extremely limited experience) int main () is the start of your application. But what is meant by 'has to return an int'?. This is also intertwined with me not understanding 'typically returns a value to the calling function'
Just like in mathematics, in C++ functions return values. All functions in C++ must specify exactly what type of value they return, and every function must return only one type of thing. In some cases, that "one type of thing" might be nothing, which is denoted in C++ with the keyword void.
Every function must declare what it returns. This is done via a function declaration. Here are several examples:
int foo();
void bar();
string baz();
int main();
4 function declarations. foo returns an int, bar returns nothing, baz returns a string (which is declared in the C++ Standard Library), and main returns an int.
Not only must every function declare what it returns, it must also return that type of thing. If your function returns void, then you can write:
void bar()
{
return;
}
...or just do nothing:
void bar()
{
}
If your function returns anything other than void, then you have to have a return statement that returns that type of thing:
int foo()
{
return 42;
}
If you declare a function to return one type of thing, but then try to return another type of thing, then there must be a way to implicitly convert from what you're trying to return to what the function is declared to return. If there is no possible implicit conversion, your program won't compile. Consider:
int foo()
{
return "foobar rulez!";
}
Here, foo is declared to return an int, but I'm trying to return a string (not a string from the Standard Library, but an old C-style const char* string; "foobar rulez!" here is called a string literal).
It is possible to write code to provide the implicit conversion I mentioned earlier, but unless you know exactly why you want to do that it's better to not get mixed up in all that right now.
What do you do with the values that are returned from functions? Again, just like with mathematics, you can use those values somewhere else in your program.
#include <cstdlib>
#include <iostream>
int foo()
{
return 42;
}
int main()
{
int answer = foo();
std::cout << "The answer to Life, the Universe and Everything is...\n"
<< answer << "!\n";
return 0;
}
Obviously you can't do anything with the value that is returned from a function that returns void, because a function that returns void doesn't really return anything at all. But these kinds of functions are useful for doing stuff kind of on the side.
#include <cstdlib>
#include <iostream>
int theAnswer = 0;
void DeepThought()
{
theAnswer = 42;
}
int foo()
{
return theAnswer;
}
int main()
{
DeepThought();
int answer = foo();
std::cout << "The answer to Life, the Universe and Everything is...\n"
<< answer << "!\n";
return 0;
}
OK, back to all this business with main.
main is a function in C++. There are a few things about main that make it special compared to other functions in C++, and two of those things are:
Every program must have exactly one function called main() (in global scope).
That function must return an int
There is one more thing about main that's a little special and possibly confusing. You don't actually have to write a return statement in main*, even though it is declared to return an int. Consider:
int main()
{
}
Note that there's no return statement here. That is legal and valid in C++ for main, but main is the only function where this is allowed. In any other function that doesn't return void, reaching the end without a return statement is undefined behaviour, so those functions need an explicit return statement.
So what about the return value from main()? When you run a program on a Windows or Linux computer, the program returns a value to the operating system. What that value means depends on the program, but in general a value of 0 means that the program worked without any problems. A value other than 0 often means that the program didn't work, and the exact value is actually a code for what went wrong.
Scripts and other programs can use these return values to decide what to do next. For example, if you wrote a program to rename an MP3 file based on the Artist and Track Number, then your program might return 0 if it worked, 1 if it couldn't figure out the Artist, and 2 if it couldn't figure out the Track Number. You can call this program in a script that renames and then moves files. If you want your script to quit if there was an error renaming the file, then it can check these return values to see if it worked or not.
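The MP3 example could be sketched like this. Everything here is hypothetical: the enum names, the function rename_status, and the rule that an empty artist or a non-positive track number counts as "couldn't figure it out" are assumptions made only to show how distinct codes are chosen.

```cpp
#include <string>

// Exit codes from the example above.
enum ExitCode {
    kOk = 0,        // renaming worked
    kNoArtist = 1,  // could not figure out the Artist
    kNoTrack = 2    // could not figure out the Track Number
};

// Hypothetical stand-in for the renaming logic: decides which code to report.
int rename_status(const std::string& artist, int track_number) {
    if (artist.empty()) return kNoArtist;
    if (track_number <= 0) return kNoTrack;
    // ... the actual renaming would happen here ...
    return kOk;
}
```

A real program would finish with something like return rename_status(artist, track); in main, and the calling script would then inspect that value.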
* No explicit return statement in main: in cases where main does not have an explicit return, it is defined to return the value 0.
Although it may appear so when you are programming in C or C++, main is not actually the "first thing" that happens. Typically, somewhere in the guts of the C or C++ runtime library is a call to main, which starts your program. When your program is finished and returns from main, it will return a value (in C++, if you don't specify something, the compiler will automatically add return 0), and this return value is used to signal "the success" of the program.
In Unix/Linux etc, this is available as $?, so you can echo $? after running a program to see what the "result" was - 0 means "went well", other values are used for "failure". In Windows, there is an ERRORLEVEL variable in batch scripts, etc, that can be used to see the result of the last command.
Edit: If your code calls another program, e.g. through CreateProcess in Windows, or fork()/exec() in a Unix style OS (or the C runtime functions spawn and siblings in almost any OS), the return value from main is reported when the new process finishes, and is made available to the owning process. End Edit.
Since, even in C++, main is a "C" style function, if you change the return type, it still has the same name, so the linker/compiler can't "detect" that it's got the wrong return type, and some weird stuff will happen if you declare void main(), std::string main() or float main() or something other than int main() - it will still compile, but what happens in the code calling main will be undefined behaviour - this means "almost anything can happen".
This is how you report back to the operating system the exit status of the program, whether it ran successfully or not. For example in Linux you can use the following command:
echo $?
to obtain the exit status of the program that ran previously.
Yes, main should always return an int. This can be used to show whether the program ran successfully: usually 0 represents success, and a non-zero value represents some kind of failure.
For example, in Linux, you can call your program in a bash script, and in this script, run different commands on the return status of your program.
It means, in the signature of main() the return type is int:
int main();
int main(int argc, char const *argv[]);
Now, what value you should return from main() is the question. Well, the return value is actually the exit status, which indicates to the runtime whether main() executed successfully or unsuccessfully.
In Linux, I usually return EXIT_SUCCESS or EXIT_FAILURE depending on the cases. These are macros defined by <cstdlib>.
int main()
{
//code
if ( some failure condition )
return EXIT_FAILURE;
//code
return EXIT_SUCCESS;
}
As per the doc:
#define EXIT_SUCCESS /*implementation defined*/
#define EXIT_FAILURE /*implementation defined*/
The EXIT_SUCCESS and EXIT_FAILURE macros expand into an integral expression and indicate program execution status.
Constant      Explanation
EXIT_SUCCESS  successful execution of a program
EXIT_FAILURE  unsuccessful execution of a program

Segfault when calling a method c++

I am fairly new to c++ and I am a bit stumped by this problem. I am trying to assign a variable from a call to a method in another class but it always segfaults. My code compiles with no warnings and I have checked that all variables are correct in gdb but the function call itself seems to cause a segfault. The code I am using is roughly like the following:
class History{
public:
bool test_history();
};
bool History::test_history(){
std::cout<<"test"; //this line never gets executed
//more code goes in here
return true;
}
class Game{
private:
bool some_function();
public:
History game_actions_history;
};
bool Game::some_function(){
return game_actions_history.test_history();
}
Any tips or advice is greatly appreciated!
EDIT: I edited the code so there is no more local_variable and the value returns directly. But it still segfaults. As for posting the actual code, it's fairly large, what parts should I post?
From what I can see there's nothing wrong with the code you've displayed. However, segfaults often are a good indication that you've got corrupted memory. It's happening some place else besides what you've shown and only happens to impact the code here. I'd look any place you're dealing with arrays, pointers, or any manual memory interactions.
I have used valgrind successfully with a lot of segfaults.
Also, have you tried running gdb with the core dump caused by the segfault? From man gdb:
gdb program core
To create a coredump you might have to set:
ulimit -c unlimited
Shot in the dark. (Game*)this is NULL ?
The code is fine but the example is too incomplete to say what's wrong. Some things I'd suggest:
Add printouts to each class's destructor and constructor:
Game::Game() { cerr << this << " Game::Game" << endl; }
Game::Game(Game const&) { cerr << this << " Game::Game(Game const&)" << endl; }
Game::~Game() { cerr << this << " Game::~Game" << endl; }
bool Game::some_function() { cerr << this << " Game::some_function()" << endl; ... }
This will reveal:
Null object pointers.
Bad/deleted class pointers.
Second, for debugging, I'd strongly recommend sending printouts to cerr instead of cout. cout is usually buffered (for efficiency) before being output, cerr is not (at least, this used to be the case). If your program quits without executing its error handlers, atexit functions, etc., you are more likely to see the output if it is unbuffered and printed immediately.
Thirdly, if your class declarations live in a header, the class definitions live in one cpp file, and the code that uses the class lives in yet another, you may get this kind of crash if either of the cpp files was not recompiled after you changed the header.
Some other possibilities are:
stack overflow: you've allocated a lot of memory on the stack because of deep recursion, or you are allocating objects containing large arrays of data as local variables (i.e. not created on the heap with new or malloc),
corrupted class vtable (usually only possible due to dependency errors in your build tools),
corrupted object vtable pointer: possible through misuse of pointers: using pointers to deleted memory, or incorrectly writing to an in-use address. Not likely in your example because there are no virtual functions.
maintaining a pointer or reference to an object allocated on the stack that has been deleted: the printout code above will uncover this case.
I am wondering because you have defined some_function() in private of the Game class. So the code structure which you have mentioned above will also throw error for that.

Possible memory leak?

Okay, so I have two classes, call them A and B--in that order in the code. Class B instantiates class A as an array, and class B also has an error message char* variable, which class A must set in the event of an error. I created a third class with a pure virtual function to set the errorMessage variable in B, then made B a child of that third class. Class A creates a pointer to the third class, call it C--when B initializes the array of A objects, it loops through them and invokes a function in A to set A's pointer to C-- it passes "this" to that function, and then A sets the pointer to C to "this," and since C is B's parent, A can set C->errorMessage (I had to do all this because A and B couldn't simultaneously be aware of each other at compile time).
Anyway, it mostly works. However, when I pass command line parameters to main(int,char**), it works unless I pass seven, eight, or more than twelve parameters to it... I narrowed it down (by commenting out lines) to the line of code in A that sets the pointer to C to the value passed to it by B. This made no sense to me... I suspected a memory leak or something, but that seems wrong and I have no idea how to fix it... Also, I don't get why specifically seven, eight, and more than twelve arguments don't work, while 1-6 and 9-12 work fine.
Here is my code (stripped down)--
//class C
class errorContainer{
public:
virtual ~errorContainer(){ }
virtual void reportError(int,char*)=0;
};
//Class A
class switchObject{
void reportError(int,char*);
errorContainer* errorReference;
public:
void bindErrorContainer(errorContainer*);
};
//Class A member function definitions
void switchObject::reportError(int errorCode,char* errorMessage){
errorReference->reportError(errorCode,errorMessage);
}
void switchObject::bindErrorContainer(errorContainer* newReference){
errorReference=newReference; //commenting out this line fixes the problem
}
//Class B
class switchSystem: public errorContainer{
int errorCode;
char* errorMessage;
public:
switchSystem(int); //MUST specify number of switches in this system.
void reportError(int,char*);
int errCode();
char* errMessage();
switchObject* switchList;
};
//Class B member function definitions
switchSystem::switchSystem(int swLimit){
int i;
switchList=new (nothrow) switchObject[swLimit];
for(i=0;i<swLimit;i++){
switchList[i].bindErrorContainer(this);
}
errorCode=0;
errorMessage="No errors.";
}
void switchSystem::reportError(int reportErrorCode,char* reportErrorMessage){
int len=0,i;
errorCode=reportErrorCode;
if(errorMessage){
delete[] errorMessage;
}
while(reportErrorMessage[len]!='\0'){
len++;
}
errorMessage=new char[len];
for(i=0;i<=len;i++){
errorMessage[i]=reportErrorMessage[i];
}
}
int switchSystem::errCode(){
return errorCode;
}
char* switchSystem::errMessage(){
return errorMessage;
}
Anyone know what I've done wrong here?
It's bugging the crap out of me... I can't seem to fix it.
---EDIT---
okay, I have it set up the way I do so that I can use it like this in main()
int main(int argc,char** argv){
switchSystem sw(2);
sw.switchList[0].argumentCount=2;
sw.switchList[1].argumentCount=0;
sw.switchList[0].identifier="a";
sw.switchList[1].identifier="switch";
sw.init(argc,argv);
if(sw.errCode()>0){
cout<< "Error "<< sw.errCode()<< ": "<< sw.errMessage()<< endl;
}
}
this program is supposed to read the command line arguments and handle user defined "switches"--like how most command line programs handle switches, but instead of testing for all of them at the beginning of main I wanted to try to write a class and some functions to do it for me--create a switchSystem object with the number of switches, set their identifiers, whether or not they take arguments, and then pass the command line arguments to "init()" to sort it out. Then test like, if(sw.isSet("switch")){ ... } etc.
It seems scary that you:
Mix dynamic memory with static string constants ("No errors.") in the same pointer.
Use an explicit while-loop to compute the string's length; have you not heard of strlen()?
Use such low-level C-like string processing, for no good reason ... What's wrong with std::string?
Don't properly account for the terminating '\0' in the string, when allocating space for it and copying it. The length is also not stored, leaving the resulting char array rather difficult to interpret.
reportError() should be declared virtual in switchSystem, as it is in errorContainer.
char* should instead be std::string to avoid all of that needless work.
Is there some reason that you can't use an std::vector<switchObject> instead of new[]?
You shouldn't delete[] errorMessage when it points to a static literal string. This leads to undefined behavior. (Translation: Bad Thing(TM).)
Why are you iteratively counting and copying the contents of a char*? This is begging for trouble. You're not doing anything to protect yourself from harm.
Why must switchObject pass a string to switchSystem? Wouldn't it be better to simply return an error code or throw some class derived from std::exception? Or perhaps it should send a string to a global logging facility?
I think perhaps you should rethink your design instead of trying to fix this.
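To make the std::string suggestion concrete, here is a minimal sketch of the error-holding part on its own. The class name ErrorState and its member names are hypothetical and not taken from the question's code; the point is that no new[], delete[] or manual length counting is needed.

```cpp
#include <string>

// Holds the most recent error; std::string owns and frees its own memory.
class ErrorState {
public:
    ErrorState() : code_(0), message_("No errors.") {}

    // Replaces the stored error. The old message is released automatically.
    void report(int code, const std::string& message) {
        code_ = code;
        message_ = message;
    }

    int code() const { return code_; }
    const std::string& message() const { return message_; }

private:
    int code_;
    std::string message_;
};
```

Because the initial "No errors." text is copied into the std::string, there is never a pointer to a string literal that could later be passed to delete[].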
All of the answers above contain really good advice. Unless this is homework and you can't use the STL, I'd recommend that you follow them; you will have far fewer problems.
For example in this snippet
errorMessage=new char[len];
for(i=0;i<=len;i++){
errorMessage[i]=reportErrorMessage[i];
}
You have allocated len bytes, but you are writing len+1 bytes: you have not allocated memory for the null terminator '\0'. Overwriting memory like this leads to nasty bugs that are difficult to track down.
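If the char* interface has to stay, a corrected copy could look like the sketch below. The helper name duplicate_message is made up for illustration; the essential change is allocating len + 1 bytes so the terminator fits.

```cpp
#include <cstddef>
#include <cstring>

// Returns a heap-allocated copy of src, including the terminating '\0'.
// The caller owns the result and must release it with delete[].
char* duplicate_message(const char* src) {
    const std::size_t len = std::strlen(src);  // length without the terminator
    char* copy = new char[len + 1];            // +1 leaves room for '\0'
    std::memcpy(copy, src, len + 1);           // copies the terminator as well
    return copy;
}
```

If the initial "No errors." message is also created through this helper, delete[] is never applied to a string literal.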
I think the memory leak is your minor concern here. You'd better throw this code away and start over with a new approach. This is a mess.

To GOTO or not to GOTO? [closed]

Currently I am working on a project where goto statements are heavily used. The main purpose of the goto statements is to have one cleanup section in a routine rather than multiple return statements.
Like below:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = NULL;
p = new int;
if (p == NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p = NULL;
}
return bRetVal;
}
This makes it much easier as we can track our clean up code at one section in code, that is, after the Exit label.
However, I have read many places it's bad practice to have goto statements.
Currently I am reading the Code Complete book, and it says that we need to use variables close to their declarations. If we use goto then we need to declare/initialize all variables before first use of goto otherwise the compiler will give errors that initialization of xx variable is skipped by the goto statement.
Which way is right?
From Scott's comment:
It looks like using goto to jump from one section to another is bad as it makes the code hard to read and understand.
But if we use goto just to go forward and to one label then it should be fine(?).
I am not sure what you mean by clean up code, but in C++ there is a concept called "resource acquisition is initialization", and it should be the responsibility of your destructors to clean up stuff.
(Note that in C# and Java, this is usually solved by try/finally)
For more info check out this page:
http://www.research.att.com/~bs/bs_faq2.html#finally
EDIT: Let me clear this up a little bit.
Consider the following code:
void MyMethod()
{
MyClass *myInstance = new MyClass("myParameter");
/* Your code here */
delete myInstance;
}
The problem: What happens if you have multiple exits from the function? You have to keep track of each exit and delete your objects at all possible exits! Otherwise, you will have memory leaks and zombie resources, right?
The solution: Use automatic (stack-allocated) objects instead, as they get cleaned up automatically when control leaves the scope.
void MyMethod()
{
MyClass myInstance("myParameter");
/* Your code here */
/* You don't need delete - myInstance will be destructed and deleted
* automatically on function exit */
}
Oh yes, and use std::unique_ptr or something similar because the example above as it is is obviously imperfect.
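For the case where the object really has to live on the heap, a sketch with std::unique_ptr might look like this. Resource and process are hypothetical names; the live counter exists only so the cleanup can be observed.

```cpp
#include <memory>

// Hypothetical resource that counts how many instances currently exist.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Several exits, no cleanup label: the unique_ptr deletes the Resource
// on every path out of the function.
bool process(int input) {
    std::unique_ptr<Resource> res(new Resource);
    if (input < 0) {
        return false;   // early exit, res is released here
    }
    // ... lots of code ...
    return true;        // normal exit, res is released here as well
}
```

After any call to process, Resource::live is back at its previous value, whichever return statement was taken.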
I've never had to use a goto in C++. Ever. EVER. If there is a situation it should be used, it's incredibly rare. If you are actually considering making goto a standard part of your logic, something has flown off the tracks.
There are basically two points people are making in regards to gotos and your code:
Goto is bad. It's very rare to encounter a place where you need gotos, but I wouldn't suggest striking it completely. Though C++ has smart enough control flow to make goto rarely appropriate.
Your mechanism for cleanup is wrong: This point is far more important. In C, using memory management on your own is not only OK, but often the best way to do things. In C++, your goal should be to avoid manual memory management as much as possible. Let the compiler do it for you. Rather than using new, just declare variables. The only time you'll really need memory management is when you don't know the size of your data in advance. Even then, you should try to just use some of the STL collections instead.
In the event that you legitimately need memory management (you have not really provided any evidence of this), then you should encapsulate your memory management within a class, using constructors to allocate memory and destructors to deallocate memory.
Your response that your way of doing things is much easier is not really true in the long run. Firstly, once you get a strong feel for C++ making such constructors will be 2nd nature. Personally, I find using constructors easier than using cleanup code, since I have no need to pay careful attention to make sure I am deallocating properly. Instead, I can just let the object leave scope and the language handles it for me. Also, maintaining them is MUCH easier than maintaining a cleanup section and much less prone to problems.
In short, goto may be a good choice in some situations but not in this one. Here it's just short term laziness.
Your code is extremely non-idiomatic and you should never write it. You're basically emulating C in C++ there. But others have remarked on that, and pointed to RAII as the alternative.
However, your code won't work as you expect, because this:
p = new int;
if(p==NULL) { … }
won't ever evaluate to true (except if you've overloaded operator new in a weird way). If operator new is unable to allocate enough memory, it throws an exception; it never, ever returns 0, at least not with this set of parameters. There's a special placement-new overload that takes std::nothrow (an object of type std::nothrow_t) and that indeed returns 0 instead of throwing an exception. But this version is rarely used in normal code. Some low-level code or embedded device applications could benefit from it in contexts where dealing with exceptions is too expensive.
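For reference, the non-throwing form looks roughly like this. The function name try_allocate is hypothetical; with this form the null check is meaningful, whereas with plain new it is dead code.

```cpp
#include <cstddef>
#include <new>

// Attempts to allocate count ints without throwing on failure.
// Returns true if the allocation succeeded.
bool try_allocate(std::size_t count) {
    int* p = new (std::nothrow) int[count];  // yields a null pointer on failure
    if (p == nullptr) {
        return false;                        // reachable only with the nothrow form
    }
    delete[] p;
    return true;
}
```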
Something similar is true for your delete block, as Harald has said: if (p) is unnecessary in front of delete p.
Additionally, I'm not sure if your example was chosen intentionally, because this code can be rewritten as follows:
bool foo() // prefer native types to BOOL, if possible
{
bool ret = false;
int i;
// Lots of code.
return ret;
}
Probably not a good idea.
In general, and on the surface, there isn't anything wrong with your approach, provided that you only have one label, and that the gotos always go forward. For example, this code:
int foo()
{
int *pWhatEver = ...;
if (something(pWhatEver))
{
delete pWhatEver;
return 1;
}
else
{
delete pWhatEver;
return 5;
}
}
And this code:
int foo()
{
int ret;
int *pWhatEver = ...;
if (something(pWhatEver))
{
ret = 1;
goto exit;
}
else
{
ret = 5;
goto exit;
}
exit:
delete pWhatEver;
return ret;
}
really aren't all that different from each other. If you can accept one, you should be able to accept the other.
However, in many cases the RAII (resource acquisition is initialization) pattern can make the code much cleaner and more maintainable. For example, this code:
int foo()
{
Auto<int> pWhatEver = ...;
if (something(pWhatEver))
{
return 1;
}
else
{
return 5;
}
}
is shorter, easier to read, and easier to maintain than both of the previous examples.
So, I would recommend using the RAII approach if you can.
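Auto<int> above is a placeholder for some owning smart pointer. As a rough sketch, the same function could be written with std::unique_ptr; the predicate something() and the initial parameter are hypothetical stand-ins so that the example compiles on its own.

```cpp
#include <memory>

// Hypothetical predicate standing in for something() in the text.
bool something(const int* value) {
    return *value > 0;
}

int foo(int initial) {
    // The unique_ptr owns the int and deletes it on every exit path.
    std::unique_ptr<int> pWhatEver(new int(initial));
    if (something(pWhatEver.get())) {
        return 1;  // the int is freed here by the destructor
    }
    return 5;      // and here as well
}
```

Neither return needs a matching delete, which is the point of the pattern.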
Your example is not exception safe.
If you are using goto to clean up the code then, if an exception happens before the cleanup code, it is completely missed. If you claim that you do not use exceptions then you are mistaken because the new will throw bad_alloc when it does not have enough memory.
Also at this point (when bad_alloc is thrown), your stack will be unwound, missing all the cleanup code in every function on the way up the call stack thus not cleaning up your code.
You need to look to do some research into smart pointers. In the situation above you could just use a std::auto_ptr<>.
Also note that in C++ code there is usually no need to check whether a pointer is NULL: partly because you rarely hold raw pointers, and partly because new will not return NULL (it throws).
Also, in C++ (unlike C) it is common to see early returns in the code. This is because RAII will do the cleanup automatically, while in C code you need to make sure that you add special cleanup code at the end of the function (a bit like your code).
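The claim that destructors still run when an exception skips the rest of a function can be checked with a small sketch. Resource, work and count_releases are invented names; the counter exists only to observe the destructor.

```cpp
#include <stdexcept>

// Hypothetical resource that counts how often it has been released.
struct Resource {
    explicit Resource(int& released) : released_(released) {}
    ~Resource() { ++released_; }  // runs on normal exit and during unwinding
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
private:
    int& released_;
};

void work(int& released, bool fail) {
    Resource r(released);
    if (fail) {
        throw std::runtime_error("simulated failure");
    }
    // A goto-style cleanup label placed here would be skipped by the throw.
}

int count_releases() {
    int released = 0;
    work(released, false);       // normal return: destructor runs
    try {
        work(released, true);    // exception: destructor runs while unwinding
    } catch (const std::runtime_error&) {
    }
    return released;
}
```

count_releases() returns 2: one release for the normal return and one for the path that throws.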
I think other answers (and their comments) have covered all the important points, but here's one thing that hasn't been done properly yet:
What your code should look like instead:
bool foo() //lowercase bool is a built-in C++ type. Use it if you're writing C++.
{
try {
std::unique_ptr<int> p(new int);
// lots of code, and just return true or false directly when you're done
}
catch (const std::bad_alloc&) { // new throws an exception on OOM, it doesn't return NULL
std::cout << " OOM \n";
return false;
}
}
Well, it's shorter, and as far as I can see, more correct (handles the OOM case properly), and most importantly, I didn't need to write any cleanup code or do anything special to "make sure my return value is initialized".
One problem with your code I only really noticed when I wrote this, is "what the hell is bRetVal's value at this point?". I don't know because, it was declared waaaaay above, and it was last assigned to when? At some point above this. I have to read through the entire function to make sure I understand what's going to be returned.
And how do I convince myself that the memory gets freed?
How do I know that we never forget to jump to the cleanup label? I have to work backwards from the cleanup label, finding every goto that points to it, and more importantly, find the ones that aren't there. I need to trace through all paths of the function just to be sure that the function gets cleaned up properly. That reads like spaghetti code to me.
Very fragile code, because every time a resource has to be cleaned up you have to remember to duplicate your cleanup code. Why not write it once, in the type that needs to be cleaned up? And then rely on it being executed automatically, every time we need it?
In the eight years I've been programming I've used goto a lot, most of that was in the first year when I was using a version of GW-BASIC and a book from 1980 that didn't make it clear goto should only be used in certain cases. The only time I've used goto in C++ is when I had code like the following, and I'm not sure if there was a better way.
for (int i=0; i<10; i++) {
for (int j=0; j<10; j++)
{
if (somecondition==true)
{
goto finish;
}
//Some code
}
//Some code
}
finish:
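One alternative that is often suggested for this case is to move the nested loops into their own function and use return as the multi-level break. This is only a sketch: find_pair and its search condition are made up to stand in for somecondition.

```cpp
#include <utility>

// Returns the first (i, j) with i * j == target, or (-1, -1) if none exists.
std::pair<int, int> find_pair(int target) {
    for (int i = 0; i < 10; ++i) {
        for (int j = 0; j < 10; ++j) {
            if (i * j == target) {
                return std::make_pair(i, j);  // leaves both loops at once
            }
        }
    }
    return std::make_pair(-1, -1);
}
```

Whether this reads better than the goto depends on how much state the loops share with the surrounding code.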
The only situation I know of where goto is still used heavily is mainframe assembly language, and the programmers I know make sure to document where code is jumping and why.
As used in the Linux kernel, goto's used for cleanup work well when a single function must perform 2 or more steps that may need to be undone. Steps need not be memory allocation. It might be a configuration change to a piece of code or in a register of an I/O chipset. Goto's should only be needed in a small number of cases, but often when used correctly, they may be the best solution. They are not evil. They are a tool.
Instead of...
do_step1;
if (failed)
{
undo_step1;
return failure;
}
do_step2;
if (failed)
{
undo_step2;
undo_step1;
return failure;
}
do_step3;
if (failed)
{
undo_step3;
undo_step2;
undo_step1;
return failure;
}
return success;
you can do the same with goto statements like this:
do_step1;
if (failed) goto unwind_step1;
do_step2;
if (failed) goto unwind_step2;
do_step3;
if (failed) goto unwind_step3;
return success;
unwind_step3:
undo_step3;
unwind_step2:
undo_step2;
unwind_step1:
undo_step1;
return failure;
It should be clear that given these two examples, one is preferable to the other. As to the RAII crowd... There is nothing wrong with that approach as long as they can guarantee that the unwinding will always occur in exactly reverse order: 3, 2, 1. And lastly, some people do not use exceptions in their code and instruct the compilers to disable them. Thus not all code must be exception safe.
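On the ordering question: C++ destroys local objects in the reverse order of their construction, so RAII gives the 3, 2, 1 order by construction. A minimal sketch follows; Step and run_steps are hypothetical names, and the log exists only to make the order visible.

```cpp
#include <string>
#include <vector>

// Hypothetical step that records when it is done and undone.
struct Step {
    Step(std::vector<std::string>& log, const std::string& name)
        : log_(log), name_(name) {
        log_.push_back("do " + name_);
    }
    ~Step() { log_.push_back("undo " + name_); }
    Step(const Step&) = delete;
    Step& operator=(const Step&) = delete;
private:
    std::vector<std::string>& log_;
    std::string name_;
};

std::vector<std::string> run_steps() {
    std::vector<std::string> log;
    {
        Step s1(log, "step1");
        Step s2(log, "step2");
        Step s3(log, "step3");
    }  // destructors run here: s3, then s2, then s1
    return log;
}
```

Note that this sketch undoes the steps on every exit, including success. To match the goto version, which only unwinds on failure, each guard would need some way to be told that the operation succeeded.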
You should read this thread summary from the Linux kernel mailing lists (paying special attention to the responses from Linus Torvalds) before you form a policy for goto:
http://kerneltrap.org/node/553/2131
In general, you should design your programs to limit the need for gotos. Use OO techniques for "cleanup" of your return values. There are ways to do this that don't require the use of gotos or complicating the code. There are cases where gotos are very useful (for example, deeply nested scopes), but if possible should be avoided.
The downside of GOTO is pretty well discussed. I would just add that 1) sometimes you have to use them and should know how to minimize the problems, and 2) some accepted programming techniques are GOTO-in-disguise, so be careful.
1) When you have to use GOTO, such as in ASM or in .bat files, think like a compiler. If you want to code
if (some_test){
... the body ...
}
do what a compiler does. Generate a label whose purpose is to skip over the body, not to do whatever follows. i.e.
if (not some_test) GOTO label_at_end_of_body
... the body ...
label_at_end_of_body:
Not
if (not some_test) GOTO the_label_named_for_whatever_gets_done_next
... the body ...
the_label_named_for_whatever_gets_done_next:
In other words, the purpose of the label is not to do something, but to skip over something.
2) What I call GOTO-in-disguise is anything that could be turned into GOTO+LABELS code by just defining a couple macros. An example is the technique of implementing finite-state-automata by having a state variable, and a while-switch statement.
while (not_done){
switch(state){
case S1:
... do stuff 1 ...
state = S2;
break;
case S2:
... do stuff 2 ...
state = S1;
break;
.........
}
}
can turn into:
while (not_done){
switch(state){
LABEL(S1):
... do stuff 1 ...
GOTO(S2);
LABEL(S2):
... do stuff 2 ...
GOTO(S1);
.........
}
}
just by defining a couple macros. Just about any FSA can be turned into structured goto-less code. I prefer to stay away from GOTO-in-disguise code because it can get into the same spaghetti-code issues as undisguised gotos.
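For reference, here is a compilable version of the while-switch form described above. It is a sketch with invented details: the Done state, the transition limit and the visit counter are assumptions added so the machine terminates and has something to return.

```cpp
enum State { S1, S2, Done };

// Alternates S1 -> S2 -> S1 ... and stops after `limit` transitions.
// Returns how many times the S2 body ran.
int run_machine(int limit) {
    State state = S1;
    int visits_to_s2 = 0;
    int transitions = 0;
    while (state != Done) {
        switch (state) {
        case S1:
            // ... do stuff 1 ...
            state = S2;  // effectively "goto S2"
            break;
        case S2:
            // ... do stuff 2 ...
            ++visits_to_s2;
            state = S1;  // effectively "goto S1"
            break;
        case Done:
            break;
        }
        if (++transitions >= limit) {
            state = Done;
        }
    }
    return visits_to_s2;
}
```

Each assignment to state plays the role of a jump, which is why the control flow can become as tangled as with explicit gotos.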
Added: Just to reassure: I think one mark of a good programmer is recognizing when the common rules don't apply.
Using goto to go to a cleanup section is going to cause a lot of problems.
First, cleanup sections are prone to problems. They have low cohesion (no real role that can be described in terms of what the program is trying to do ), high coupling (correctness depends very heavily on other sections of code), and are not at all exception-safe. See if you can use destructors for cleanup. For example, if int *p is changed to auto_ptr<int> p, what p points to will be automatically released.
Second, as you point out, it's going to force you to declare variables long before use, which will make it harder to understand the code.
Third, while you're proposing a fairly disciplined use of goto, there's going to be the temptation to use them in a looser manner, and then the code will become difficult to understand.
There are very few situations where a goto is appropriate. Most of the time, when you are tempted to use them, it's a signal that you're doing things wrong.
The entire purpose of the every-function-has-a-single-exit-point idiom in C was to put all the cleanup stuff in a single place. If you use C++ destructors to handle cleanup, that's no longer necessary -- cleanup will be done regardless of how many exit points a function has. So in properly-designed C++ code, there's no longer any need for this kind of thing.
Since this is a classic topic, I will reply with Dijkstra's Go-to statement considered harmful (originally published in ACM).
Goto provides better don't repeat yourself (DRY) when "tail-end-logic" is common to some-but-not-all-cases. Especially within a "switch" statement I often use goto's when some of the switch-branches have tail-end-commonality.
switch(){
case a: ... goto L_abTail;
case b: ... goto L_abTail;
L_abTail: <common stuff>
break; //end of case b
case c:
.....
}//switch
You have probably noticed that introducing additional curly-braces is enough to satisfy the compiler when you need such tail-end-merging in-the-middle of a routine. In other words, you don't need to declare everything way up at the top; that's inferior readability indeed.
...
goto L_skipMiddle;
{
int declInMiddleVar = 0;
....
}
L_skipMiddle: ;
With the later versions of Visual Studio detecting the use of uninitialized variables, I find myself always initializing most variables even though I think they may be assigned in all branches - it's easy to code a "tracing" statement which refs a variable that was never assigned because your mind doesn't think of the tracing statement as "real code", but of course Visual Studio will still detect an error.
Besides don't repeat yourself, assigning label-names to such tail-end-logic even seems to help my mind keep things straight by choosing nice label names. Without a meaningful label your comments might end up saying the same thing.
Of course, if you are actually allocating resources and auto_ptr doesn't fit, you really must use a try-catch, but tail-end-merge-don't-repeat-yourself happens quite often when exception-safety is not an issue.
In summary, while goto can be used to code spaghetti-like structures, in the case of a tail-end-sequence which is common to some-but-not-all-cases then the goto IMPROVES the readability of the code and even maintainability if you would otherwise be copy/pasting stuff so that much later on someone might update one-and-not-the-other. So it's another case where being fanatic about a dogma can be counterproductive.
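A compilable sketch of the tail-end merging described in this answer is shown below. The function describe and its strings are invented for illustration; only the shape of the switch follows the fragment above.

```cpp
#include <string>

std::string describe(char kind) {
    std::string out;
    switch (kind) {
    case 'a':
        out = "alpha";
        goto ab_tail;             // join the tail that case b also uses
    case 'b':
        out = "beta";
    ab_tail:
        out += " (shared tail)";  // written once, reached from a and b
        break;
    case 'c':
        out = "gamma";
        break;
    default:
        out = "unknown";
        break;
    }
    return out;
}
```

The jump does not cross any variable initialization, so the compiler accepts it without extra braces.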
The only two reasons I use goto in my C++ code are:
Breaking out of loops nested two or more levels deep
Complicated flows like this one (a comment in my program):
/* Analysis algorithm:
1. if classData [exporter] [classDef with name 'className'] exists, return it,
else
2. if project/target_codename/temp/classmeta/className.xml exists, parse it and go back to 1 as it will succeed.
3. if that file doesn't exist, generate it via haxe -xml, and go back to 1 as it will succeed.
*/
For code readability here, after this comment, I defined the step1 label and used it in steps 2 and 3. Actually, in 60+ source files, this situation and one four-level nested for loop are the only places where I used goto. Only two places.
A lot of people freak out and say gotos are evil; they are not. That said, you will never need one; there is just about always a better way.
When I find myself "needing" a goto to do this type of thing, I almost always find that my code is too complex and can be easily broken up into a few method calls that are easier to read and deal with. Your calling code can do something like:
// Setup
if(
methodA() &&
methodB() &&
methodC()
)
// Cleanup
Not that this is perfect, but it's much easier to follow since all your methods will be named to clearly indicate what the problem might be.
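Here is a small sketch of how the && chain behaves. The three methods and the log are hypothetical; methodB is hard-coded to fail so that the short-circuit is visible.

```cpp
#include <string>
#include <vector>

// Hypothetical steps; each records that it ran and reports success or failure.
bool methodA(std::vector<std::string>& log) { log.push_back("A"); return true; }
bool methodB(std::vector<std::string>& log) { log.push_back("B"); return false; }
bool methodC(std::vector<std::string>& log) { log.push_back("C"); return true; }

bool run_all(std::vector<std::string>& log) {
    // Setup would go here.
    // && stops at the first step that returns false.
    bool ok = methodA(log) && methodB(log) && methodC(log);
    // Cleanup would go here; it is reached on success and on failure.
    return ok;
}
```

Because methodB fails, methodC is never called, and control still falls through to the cleanup position.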
Reading through the comments, however, should indicate that your team has more pressing issues than goto handling.
The code you're giving us is (almost) C code written inside a C++ file.
The kind of memory cleaning you're using would be OK in a C program not using C++ code/libraries.
In C++, your code is simply unsafe and unreliable. In C++ the kind of management you're asking for is done differently. Use constructors/destructors. Use smart pointers. Use the stack. In a word, use RAII.
Your code could (i.e., in C++, SHOULD) be written as:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<int> p(new int); // the constructor is explicit, so "= new int" does not compile
// Lot of code...
return bRetVal ;
}
(Note that new-ing an int is somewhat silly in real code, but you can replace int by any kind of object, and then, it makes more sense). Let's imagine we have an object of type T (T could be an int, some C++ class, etc.). Then the code becomes:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<T> p(new T); // the constructor is explicit, so "= new T" does not compile
// Lot of code...
return bRetVal ;
}
Or even better, using the stack:
BOOL foo()
{
BOOL bRetVal = FALSE;
T p ;
// Lot of code...
return bRetVal;
}
Anyway, any of the above examples is orders of magnitude easier to read, and safer, than your example.
RAII has many facets (i.e. using smart pointers, the stack, using vectors instead of variable length arrays, etc.), but all in all is about writing as little code as possible, letting the compiler clean up the stuff at the right moment.
All of the above is valid. You might also want to look at whether you can reduce the complexity of your code, and alleviate the need for gotos, by reducing the amount of code in the section marked as "lot of code" in your example. Additionally, delete 0 is a valid C++ statement.
Using GOTO labels in C++ is a bad way to program; you can reduce the need by doing OO programming (destructors!) and by keeping procedures as small as possible.
Your example looks a bit weird; there is no need to delete a NULL pointer. And nowadays an exception is thrown when memory can't be allocated.
Your procedure could just be written like:
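The point that deleting a null pointer is valid can be shown in a few lines. The function name delete_null_is_safe is made up; the behaviour of delete on a null pointer is guaranteed by the language.

```cpp
// Deleting a null pointer is defined to do nothing,
// so no "if (p)" guard is needed before delete.
bool delete_null_is_safe() {
    int* p = nullptr;
    delete p;        // no-op

    p = new int(7);
    delete p;        // frees the int
    p = nullptr;
    delete p;        // no-op again
    return p == nullptr;
}
```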
bool foo()
{
bool bRetVal = false;
int p = 0;
// Calls to various methods that do algorithms on the p integer
// and give a return value back to this procedure.
return bRetVal;
}
You should place a try catch block in the main program handling out of memory problems that informs the user about the lack of memory, which is very rare... (Doesn't the OS itself inform about this too?)
Also note that there is not always the need to use a pointer, they are only useful for dynamic things. (Creating one thing inside a method not depending on input from anywhere isn't really dynamic)
I am not going to say that goto is always bad, but your use of it most certainly is. That kind of "cleanup section" was pretty common in the early 1990s, but using it for new code is pure evil.
The easiest way to avoid what you are doing here is to put all of this cleanup into some kind of simple structure and create an instance of it. For example instead of:
void MyClass::myFunction()
{
A* a = new A;
B* b = new B;
C* c = new C;
StartSomeBackgroundTask();
MaybeBeginAnUndoBlockToo();
if ( ... )
{
goto Exit;
}
if ( ... ) { .. }
else
{
... // what happens if this throws an exception??? too bad...
goto Exit;
}
Exit:
delete a;
delete b;
delete c;
StopMyBackgroundTask();
EndMyUndoBlock();
}
you should rather do this cleanup in some way like:
struct MyFunctionResourceGuard
{
MyFunctionResourceGuard( MyClass& owner )
: m_owner( owner )
, _a( new A )
, _b( new B )
, _c( new C )
{
m_owner.StartSomeBackgroundTask();
m_owner.MaybeBeginAnUndoBlockToo();
}
~MyFunctionResourceGuard()
{
m_owner.StopMyBackgroundTask();
m_owner.EndMyUndoBlock();
}
MyClass& m_owner; // declared first so it is initialized before the pointers
std::auto_ptr<A> _a;
std::auto_ptr<B> _b;
std::auto_ptr<C> _c;
};
void MyClass::myFunction()
{
MyFunctionResourceGuard guard( *this );
if ( ... )
{
return;
}
if ( ... ) { .. }
else
{
...
}
}
A few years ago I came up with a pseudo-idiom that avoids goto, and is vaguely similar to doing exception handling in C. It has probably already been invented by someone else, so I guess I "discovered it independently" :)
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
do
{
p = new int;
if(p==NULL)
{
cout<<" OOM \n";
break;
}
// Lot of code...
bRetVal = TRUE;
} while (false);
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
I think using goto for exit code is bad, since there are plenty of other solutions with low overhead, such as having an exit function and returning the exit function's value when needed. Typically in member functions, though, this shouldn't be needed; otherwise it could be an indication that there's a bit too much code bloat happening.
Typically, the only exception I make to the "no goto" rule when programming is when breaking out of nested loops to a specific level, which I've only run into the need to do when working on mathematical programming.
For example:
for(int i_index = start_index; i_index >= 0; --i_index)
{
for(int j_index = start_index; j_index >=0; --j_index)
for(int k_index = start_index; k_index >= 0; --k_index)
if(my_condition)
goto BREAK_NESTED_LOOP_j_index;
BREAK_NESTED_LOOP_j_index:;
}
That code has a bunch of problems, most of which were pointed out already, for example:
The function is too long; refactoring out some code into separate functions might help.
Using pointers when normal instances will probably work just fine.
Not taking advantage of STL types such as auto_ptr
Incorrectly checking for errors, and not catching exceptions. (I would argue that checking for OOM is pointless on the vast majority of platforms, since if you run out of memory you have bigger problems than your software can fix, unless you are writing the OS itself)
I have never needed a goto, and I've always found that using goto is a symptom of a bigger set of problems. Your case appears to be no exception.
Using "GOTO" changes the logic of a program and how you interpret it, or how you would imagine it works.
Avoiding GOTO has always worked for me, so I guess that when you think you need it, what you may really need is a redesign.
However, if we look at this at the assembly level, using "jump" is like using GOTO, and that is used all the time. But in assembly you can clear out what you know you have on the stack and in registers before you jump.
So, when using GOTO, I'd make sure the software still "appears" the way your co-coders would interpret it; GOTO will have a "bad" effect on your software, IMHO.
So this is more an explanation of why not to use GOTO than a solution for replacing it, because that depends VERY much on how everything else is built.
I may have missed something: you jump to the label Exit if P is null, then test to see if it's not null (which it's not) to see if you need to delete it (which isn't necessary because it was never allocated in the first place).
The if/goto won't, and doesn't need to delete p. Replacing the goto with a return false would have the same effect (and then you could remove the Exit label).
The only places I know where goto's are useful are buried deep in nasty parsers (or lexical analyzers), and in faking out state machines (buried in a mass of CPP macros). In those two cases they've been used to make very twisted logic simpler, but that is very rare.
Functions (A calls A'), Try/Catches and setjmp/longjmps are all nicer ways of avoiding a difficult syntax problem.
Paul.
Ignoring the fact that new will never return NULL, take your code:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
p = new int;
if(p==NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
and write it like this:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = new int;
if (p!=NULL)
{
// Lot of code...
delete p;
}
else
{
cout<<" OOM \n";
}
return bRetVal;
}