C++ Running actual code from file without compiling it [closed] - c++

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 4 years ago.
Improve this question
I am trying to make a program which can read code from a file (similar to an interpretor but not as complex) store it into a linear list and then execute it when needed.
This is what i want this "interpretor" to know:
variable declaration
array declaration
for, while, if control structures
input and output files (output files can be opened in append mode)
non recursive functions
Because i execute it command by command i can stop/start at a specific number of commands executed which is helpful when trying to do more at the same time (similar to multithreading) and for this reason, i will name the class _MINI_THREAD. Here are the declarations of the struct COMMAND and class _MINI_THREAD:
struct COMMAND
{
unsigned int ID;
char command_text[151];
COMMAND* next;
COMMAND* prev;
};
class _MINI_THREAD
{
public:
void allocate_memory()
{
while (start_of_list == NULL)
start_of_list = new (std::nothrow) COMMAND;
while (end_of_list == NULL)
end_of_list = new (std::nothrow) COMMAND;
start_of_list -> prev = NULL;
start_of_list -> next = end_of_list;
end_of_list -> prev = start_of_list;
end_of_list -> next = NULL;
}
void free_memory()
{
for(COMMAND* i=start_of_list -> next;i!=end_of_list;i=i->next)
delete i -> prev;
delete end_of_list -> prev;
delete end_of_list;
}
bool execute_command(unsigned int number_of_commands)
{
for(unsigned int i=0;i<number_of_commands;i++)
{
/*match id of current command pointed by the cursor with a function from the map*/
if (cursor==end_of_list) return false;
else cursor=cursor->next;
}
return true;
}
bool if_finished()
{
if (cursor==end_of_list)return true;
else return false;
}
unsigned int get_ticks()
{
return ticks_per_loop;
}
void set_ticks(unsigned int ticks)
{
ticks_per_loop = ticks;
}
private:
unsigned int ticks_per_loop;
COMMAND* cursor=NULL;
COMMAND* start_of_list=NULL;
COMMAND* end_of_list=NULL;
};
I also try to keep the syntax of the "invented code" from the source files as close as possible to the c/c++ syntax but sometimes i placed a new parameter because it makes verification a lot easier. Please notice that even the while has a name so that i can manage nested loops faster.
Here is an example i came up with:
Source_file.txt
int a;
input_file fin ("numbers.in");
output_file fout ("numbers.out");
while loop_one ( fin.read(a,int,skipws) )
{
fout.print(a,int);
fout.print(32,char); /*prints a space after each number*/
}
close_input_file fin;
close_output_file fout;
/*This code is supposed to take all numbers from the input file and */
/* move them into the output file */
In the real program the object thread1 of class _MINI_THREAD contains a dinamically allocated list (i will display it as an array for simple understanding)
_MINI_THREAD thread1;
/*read from Source_file.txt each command into thread1 command array*/
thread1.commandarr={
define_integer("a"),
open_input_file("numbers.in",fin),
open_output_file("numbers.out",fout),
define_label_while("loop_one",fin.read()), /*if the condition is false the `cursor` will jump to labe_while_end*/
type_to_file(fout,a,int),
type_to_file(fout,32,char),
label_while_return("loop_one"), /*returns the cursor to the first line after the while declaration*/
label_while_end("loop_one"), /*marks the line after the while return point*/
close_input_file("numbers.in",fin),
close_output_file("numbers.out",fout),
};
/*the cursor is already pointing at the first command (define_integer("a"))*/
/*this will execute commands until the cursor reaches the end_of_list*/
while(thread1.execute_commands(1))NULL;
thread1.free_memory();
Now my problem is actually implementing the IF_CONTROL_STRUCTURE. Because you may want to type if (a==b) or if (foo()) etc... and i don't know how can i test all this stuff.
I somehow managed to make the cursor move accordingly to any structure (while,do ... while,for etc) with the idea of labels but still i cannot check the condition each structure has.

You really want to write some interpreter (probably using some bytecode). Read more about semantics.
Writing well a good interpreter is not a trivial task. Consider using some existing one, e.g. Lua, Guile, Neko, Python, Ocaml, .... and take some time to study their free software implementation.
Otherwise, spend several months reading stuff, notably:
SICP is an absolute must to read (and freely downloadable).
the Dragon Book
Programming Language Pragmatics
Lisp In Small Pieces
about the SECD machine
the GC handbook
Notice that an entire book (at least) is needed to explain how an interpreter works. See also relevant SIGPLAN conferences.
Many (multi-thread friendly) interpreters have some GIL. A genuinely multi-threaded interpreter (without any GIL) is very difficult to design (what exactly would be its REPL ???), and a multi-threaded garbage collector is also very difficult to implement and debug (consider using an existing one, perhaps MPS or Boehm's GC).
So "your simple work" could require several years of full-time work (and could get you a PhD).
a simpler approach
After having read SICP and becoming familiar with some Lisp-like language (probably some Scheme, e.g. thru Guile), you could decide on some simpler approach (basically a tiny Lisp interpreter which you could code in a few hundred lines of C++; not as serious as full-fledged interpreters mentioned before).
You first need to define on paper, at least in English, the syntax and the semantics of your scripting language. Take a strong inspiration from Lisp and its S-expressions. You probably want your scripting language to be homoiconic (so your AST would be values of your languages), and it would have (like Lisp) only expressions (and no statements). So the conditional is ternary like C++ ? :
You would represent the AST of your scripting language as some C++ data structure (probably some class with a few virtual methods). Parsing some script file into an AST (or a sequence of AST, maybe feed to some REPL) is so classical that I won't even explain; you might use some parser generator -improperly called compiler-compilers (like bison or lemon).
You would then at least implement some eval function. It takes two arguments Exp and Env: the first one, Exp, is the AST of the expression to be evaluated, and the second one, Env is some binding environment (defining the binding of local variables of your scripting language, it could be as simple as a stack of mapping from variables to values). And that eval function returns some value. It could be a member function of your AST class (then Exp is this, the receiver ....). Of course ASTs and values of your scripting language are some tagged union (which you might, if so wished, represent as a class hierarchy).
Implementing recursively such an eval in C++ is quite simple. Here is some pseudo code:
eval Exp Env :
if (Exp is some constant) {
return that constant }
if (Exp is a variable Var) {
return the bounded value of that Var in Env }
if (Exp is some primitive binary operator Op /* like + */
with operands Exp1 Exp2) {
compute V1 = eval Exp1 Env
and V2 = Exp2 Env
return the application of Op /* eg addition */ on V1 and V2
}
if (Exp is a conditional If Exp1 Exp2 Exp3) {
compute V1 = eval Exp1 Env
if (V1 is true) {
compute V2 = eval Exp2 Env
return V2
} else { /*V1 is false*/
compute V3 = eval Exp3 Env
return V3
}
}
.... etc....
There are many other cases to consider (e.g. some While, some Let or LetRec which probably would augment Env, primitive operations of different arities, Apply of an arbitrary functional value to some sequence of arguments, etc ...) that are left as an exercise to the reader. Of course some expressions have side effects when evaluated.
Both SICP and Lisp In Small Pieces explain the idea quite well. Read about meta-circular evaluators. Don't code without having read SICP ...
The code chunk in your question is a design error (even the MINI_THREAD thing is a mistake). Take a few weeks to read more, throw your code to the thrash bin, and start again. Be sure to use some version control system (I strongly recommend git).
Of course you want to be able to interpret recursive functions. There are not harder to interpret than non-recursive ones.
PS. I am very interested by your work. Please send me some email, and/or publish your tentative source code.

Related

How to make a string into a reference?

I have looked into this, but it's not what I wanted: Convert string to variable name or variable type
I have code that reads an ini file, stores data in a QHash table, and checks the values of the hash key, (see below) if a value is "1" it's added to World.
Code Examples:
World theWorld;
AgentMove AgentMovement(&theWorld);
if(rules.value("AgentMovement") == "1")
theWorld.addRule(&AgentMovement);
INI file:
AgentMovement=1
What I want to do is, dynamically read from the INI file and set a reference to a hard coded variable.
for(int j = 0; j < ck.size(); j++)
if(rules.value(ck[j]) == "1")
theWorld.addRule("&" + ck[j]);
^
= &AgentMovement
How would you make a string into a reference as noted above?
This is a common theme in programming: A value which can only be one of a set (could be an enum, one of a finite set of ints, or a set of possible string values, or even a number of buttons in a GUI) is used as a criteria to perform some kind of action. The simplistic approach is to use a switch (for atomic types) or an if/else chain for complex types. That is what you are currently doing, and there is nothing wrong with it as such:
if(rules.value(ck[j]) == "1") theWorld.addRule(&AgentMovement);
else if(rules.value(ck[j]) == "2") theWorld.addRule(&AgentEat);
else if(rules.value(ck[j]) == "3") theWorld.addRule(&AgentSleep);
// etc.
else error("internal error: weird rules value %s\n", rules.value(ck[j]));
The main advantages of this pattern are in my experience that it is crystal clear: anybody, including you in a year, understands immediately what's going on and can see immediately which criteria leads to which action. It is also trivial to debug which can be a surprising advantage: You can break at a specific action, and only at that action.
The main disadvantage is maintainability. If the same criteria (enum or whatever) is used to switch between different things in various places, all these places have to be maintained, for example when a new enum value is added. An action may come with a sound, an icon, a state change, a log message, and so on. If these do not happen at the same time (in the same switch), you'll end up switching multiple times over the action enum (or if/then/else over the string values). In that case it's better to bundle all information connected to an action in a data structure and put the structures in a map/hash table with the actions as keys. All the switches collapse to single calls. The compile-time initialization of such a map could look like this:
struct ActionDataT { Rule rule; Icon icon; Sound sound; };
map<string, AcionDataT> actionMap
= {
{"1", {AgentMovement, moveIcon, moveSound} }
{"2", {AgentEat, eatIcon, eatSound } } ,
//
};
The usage would be like
for(int j = 0; j < ck.size(); j++)
theWorld.addRule(actionMap[rules.value(ck[j])].rule);
And elsewhere, for example:
if(actionFinished(action)) removeIcon(actionMap[action].icon);
This is fairly elegant. It demonstrates two principles of software design: 1. "All problems in computer science can be solved by another level of indirection" (David Wheeler), and 2. There is often a choice between more data or more code. The simplistic approach is code-oriented, the map approach is data oriented.
The data-centrist approach is indispensable if switches occur in more than one situation, because coding them out each time would be a maintenance nightmare.
Note that with the data-centrist approach none of the places where an action is used has to be touched when a new action is added. This is essential. The mechanism resembles (in principle and implementation, actually) the call of a virtual member function. The calling code doesn't know and isn't really interested in what is actually done. Responsibility is transferred to the object. The calling code may perform actions later in the life cycle of a program which didn't exist when it was written. By contrast, compare it to a program with many explicit switches where every single use must be examined when an action is added.
The indirection involved in the data-centrist approach is its disadvantage though, and the only problem which cannot be solved by another level of indirection, as Wheeler remarked. The code becomes more abstract and hence less obvious and harder to debug.
You have to provide the mapping from the names to the object by yourself. I would wrap it into a class, something like this:
template <typename T>
struct ObjectMap {
void addObject(std::string name,T* obj){
m[name] = obj;
}
T& getRef(std::string name) const {
auto x = m.find(name);
if (x != m.end() ) { return *(x->second);}
else { return dummy; }
}
private:
std::map<std::string,T*> m;
T dummy;
}
The problem with this approach is that you have to decide what to do if an object is requested that is actually not in the map. A reference always has to reference something (in contrast to a pointer that can be 0). I decided to return the reference to a dummy object. However, you might want to consider to use pointers instead of references. Another option might be to throw an error in case the object is not in the map.

How to avoid GOTO in C++

I have read that GOTO is bad, but how do I avoid it? I don't know how to program without GOTO. In BASIC I used GOTO for everything. What should I use instead in C and C++?
I used GOTO in BASIC like this:
MainLoop:
INPUT string$
IF string$ = "game" THEN
GOTO game
ENDIF
Consider the following piece of C++ code:
void broken()
{
int i = rand() % 10;
if (i == 0) // 1 in 10 chance.
goto iHaveABadFeelingAboutThis;
std::string cake = "a lie";
// ...
// lots of code that prepares the cake
// ...
iHaveABadFeelingAboutThis:
// 1 time out of ten, the cake really is a lie.
eat(cake);
// maybe this is where "iHaveABadFeelingAboutThis" was supposed to be?
std::cout << "Thank you for calling" << std::endl;
}
Ultimately, "goto" is not much different than C++'s other flow-control keywords: "break", "continue", "throw", etc; functionally it introduces some scope-related issues as demonstrated above.
Relying on goto will teach you bad habits that produce difficult to read, difficult to debug and difficult to maintain code, and it will generally tend to lead to bugs. Why? Because goto is free-form in the worst possible way, and it lets you bypass structural controls built into the language, such as scope rules, etc.
Few of the alternatives are particularly intuitive, and some of them are arguably as ambiguous as "goto", but at least you are operating within the structure of the language - referring back to the above sample, it's much harder to do what we did in the above example with anything but goto (of course, you can still shoot yourself in the foot with for/while/throw when working with pointers).
Your options for avoiding it and using the language's natural flow control constructs to keep code humanly readable and maintainable:
Break your code up into subroutines.
Don't be afraid of small, discrete, well-named functions, as long as you are not perpetually hauling a massive list of arguments around (if you are, then you probably want to look at encapsulating with a class).
Many novices use "goto" because they write ridiculously long functions and then find that they want to get from line 2 of a 3000 line function to line 2998. In the above code, the bug created by goto is much harder to create if you split the function into two payloads, the logic and the functional.
void haveCake() {
std::string cake = "a lie";
// ...
// lots of code that prepares the cake
// ...
eat(cake);
}
void foo() {
int i = rand() % 10;
if (i != 0) // 9 times out of 10
haveCake();
std::cout << "Thanks for calling" << std::endl;
}
Some folks refer to this as "hoisting" (I hoisted everything that needed to be scoped with 'cake' into the haveCake function).
One-shot for loops.
These are not always obvious to programmers starting out, it says it's a for/while/do loop but it's actually only intended to run once.
for ( ; ; ) { // 1-shot for loop.
int i = rand() % 10;
if (i == 0) // 1 time in 10
break;
std::string cake = "a lie";
// << all the cakey goodness.
// And here's the weakness of this approach.
// If you don't "break" you may create an infinite loop.
break;
}
std::cout << "Thanks for calling" << std::endl;
Exceptions.
These can be very powerful, but they can also require a lot of boiler plate. Plus you can throw exceptions to be caught further back up the call stack or not at all (and exit the program).
struct OutOfLuck {};
try {
int i = rand() % 10;
if (i == 0)
throw OutOfLuck();
std::string cake = "a lie";
// << did you know: cake contains no fat, sugar, salt, calories or chemicals?
if (cake.size() < MIN_CAKE)
throw CakeError("WTF is this? I asked for cake, not muffin");
}
catch (OutOfLuck&) {} // we don't catch CakeError, that's Someone Else's Problem(TM).
std::cout << "Thanks for calling" << std::endl;
Formally, you should try and derive your exceptions from std::exception, but I'm sometimes kind of partial to throwing const char* strings, enums and occasionally struct Rock.
try {
if (creamyGoodness.index() < 11)
throw "Well, heck, we ran out of cream.";
} catch (const char* wkoft /*what kind of fail today*/) {
std::cout << "CAKE FAIL: " << wkoft << std::endl;
throw std::runtime_error(wkoft);
}
The biggest problem here is that exceptions are intended for handling errors as in the second of the two examples immediately above.
There are several reasons to use goto, the main would be: conditional execution, loops and "exit" routine.
Conditional execution is managed by if/else generally, and it should be enough
Loops are managed by for, while and do while; and are furthermore reinforced by continue and break
The most difficult would be the "exit" routine, in C++ however it is replaced by deterministic execution of destructors. So to make you call a routine on exiting a function, you simply create an object that will perform the action you need in its destructor: immediate advantages are that you cannot forget to execute the action when adding one return and that it'll work even in the presence of exceptions.
Usually loops like for, while and do while and functions have more or less disposed the need of using GOTO. Learn about using those and after a few examples you won't think about goto anymore.
:)
Edsger Dijkstra published a famous letter titled Go To Statement Considered Harmful. You should read about it, he advocated for structured programming. That wikipedia article describes what you need to know about structured programming. You can write structured programs with goto, but that is not a popular view these days, for that perspective read Donald Knuth's Structured Programming with goto Statements.
goto is now displaced by other programming constructs like for, while, do-while etc, which are easier to read. But goto still has it's uses. I use it in a situation where different code blocks in a function (for e.g., which involve different conditional checks) have a single exit point. Apart from this one use for every other thing you should use appropriate programming constructs.
goto is not inherently bad, it has it's uses, just like any other language feature. You can completely avoid using goto, by using exceptions, try/catch, and loops as well as appropriate if/else constructs.
However, if you realize that you get extremly out of your way, just to avoid it, it might be an indiaction that it would be better to use it.
Personally I use goto to implement functions with single entry and exit points, which makes the code much more readable. This is the only thing where I still find goto usefull and actually improves the structure and readabillity of the code.
As an example:
int foo()
{
int fd1 = -1;
int fd2 = -1;
int fd3 = -1;
fd1 = open();
if(fd1 == -1)
goto Quit:
fd2 = open();
if(fd2 == -1)
goto Quit:
fd3 = open();
if(fd3 == -1)
goto Quit:
... do your stuff here ...
Quit:
if(fd1 != -1)
closefile();
if(fd2 != -1)
closefile();
if(fd3 != -1)
closefile();
}
In C++ you find, that the need for such structures might be drastically reduced, if you properly implement classes which encapsulate access to resources. For example using smartpointer are such an example.
In the above sample, you would implement/use a file class in C++, so that, when it gets destructed, the file handle is also closed. Using classes, also has the advantage that it will work when exceptions are thrown, because then the compiler ensures that all objects are properly destructed. So in C++ you should definitely use classes with destructors, to achieve this.
When you want to code in C, you should consider that extra blocks also add additional complexity to the code, which in turn makes the code harder to understand and to control. I would prefer a well placed goto anytime over a series of artifical if/else clauses just to avoid it. And if you later have to revisit the code, you can still understand it, without following all the extra blocks.
Maybe instead of
if(something happens)
goto err;
err:
print_log()
one can use :
do {
if (something happens)
{
seterrbool = true;
break; // You can avoid using using go to I believe
}
} while (false) //loop will work only one anyways
if (seterrbool)
printlog();
It may not seem friendly because in the example above there is only one goto but will be more readable if there are many "goto" .
This implementation of the function above avoids using goto's. Note, this does NOT contain a loop. The compiler will optimize this. I prefer this implementation.
Using 'break' and 'continue', goto statements can (almost?) always be avoided.
int foo()
{
int fd1 = -1;
int fd2 = -1;
int fd3 = -1;
do
{
fd1 = open();
if(fd1 == -1)
break;
fd2 = open();
if(fd2 == -1)
break:
fd3 = open();
if(fd3 == -1)
break;
... do your stuff here ...
}
while (false);
if(fd1 != -1)
closefile();
if(fd2 != -1)
closefile();
if(fd3 != -1)
closefile();
}
BASIC originally is an interpreted language. It doesn't have structures so it relies on GOTOs to jump the specific line, like how you jump in assembly. In this way the program flow is hard to follow, making debugging more complicated.
Pascal, C and all modern high-level programming languages including Visual Basic (which was based on BASIC) are strongly structured with "commands" grouped into blocks. For example VB has Do... Loop, While... End While, For...Next. Even some old derivatives of BASIC support structures like Microsoft QuickBASIC:
DECLARE SUB PrintSomeStars (StarCount!)
REM QuickBASIC example
INPUT "What is your name: ", UserName$
PRINT "Hello "; UserName$
DO
INPUT "How many stars do you want: ", NumStars
CALL PrintSomeStars(NumStars)
DO
INPUT "Do you want more stars? ", Answer$
LOOP UNTIL Answer$ <> ""
Answer$ = LEFT$(Answer$, 1)
LOOP WHILE UCASE$(Answer$) = "Y"
PRINT "Goodbye "; UserName$
END
SUB PrintSomeStars (StarCount)
REM This procedure uses a local variable called Stars$
Stars$ = STRING$(StarCount, "*")
PRINT Stars$
END SUB
Another example in Visual Basic .NET
Public Module StarsProgram
Private Function Ask(prompt As String) As String
Console.Write(prompt)
Return Console.ReadLine()
End Function
Public Sub Main()
Dim userName = Ask("What is your name: ")
Console.WriteLine("Hello {0}", userName)
Dim answer As String
Do
Dim numStars = CInt(Ask("How many stars do you want: "))
Dim stars As New String("*"c, numStars)
Console.WriteLine(stars)
Do
answer = Ask("Do you want more stars? ")
Loop Until answer <> ""
Loop While answer.StartsWith("Y", StringComparison.OrdinalIgnoreCase)
Console.WriteLine("Goodbye {0}", userName)
End Sub
End Module
Similar things will be used in C++, like if, then, for, do, while... which together define the program flow. You don't need to use goto to jump to the next statement. In specific cases you can still use goto if it makes the control flow clearer, but in general there's no need for it

confusion about void and what it means.

I am very new to programming and am confused about what void does, I know that when you put void in front of a function it means that "it returns nothing" but if the function returns nothing then what is the point of writing the function?? Anyway, I got this question on my homework and am trying to answer it but need some help with the general concept along with it. any help would be great, and please try to avoid technical lingo, I'm a serious newb here.
What does this function accomplish?
void add2numbers(double a, double b)
{
double sum;
sum = a + b;
}
void ReturnsNothing()
{
cout << "Hello!";
}
As you can see, this function returns nothing, but that doesn't mean the function does nothing.
A function is nothing more than a refactoring of the code to put commonly-used routines together. If I'm printing "Hello" often, I put the code that prints "Hello" in a function. If I'm calculating the sum of two numbers, I'll put the code to do that and return the result in a function. It's all about what you want.
There are loads of reasons to have void functions, some of these are having 'non pure' side effects:
int i=9;
void f() {
++i;
}
In this case i could be global or a class data member.
The other is observable effects
void f() {
std::cout <<"hello world" << std::endl;
}
A void function may act on a reference or pointer value.
void f(int& i) {
++i;
}
It could also throw, although don't do this for flow control.
void f() {
while(is_not_broke()) {
//...
}
throw std::exception(); //it broke
}
The purpose of a void function is to achieve a side effect (e.g., modify a reference parameter or a global variable, perform system calls such as I/O, etc.), not return a value.
The use of the term function in the context of C/C++ is rather confusing, because it disagrees wiht the mathematical concept of a function as "something returning a value". What C/C++ calls functions returning void corresponds to the concept of a procedure in other languages.
The major difference between a function and a procedure is that a function call is an expression, while a procedure call is a statement While functions are invoked for their return value, procedures are invoked for their side effects (such as producing output, changing state, and so on).
A function with void return value can be useful for its side effects. For example consider the standard library function exit:
void exit(int status)
This function doesn't return any value to you, but it's still useful for its side-effect of terminating the process.
You are on the right lines - the function doesn't accomplish anything, because it calculates something but that something then gets thrown away.
Functions returning void can be useful because they can have "side effects". This means something happens that isn't an input or output of the function. For example it could write to a file, or send an email.
Function is a bit of a missnomer in this case; perhaps calling it a method is better. You can call a method on an object to change its state, i.e. the values of it's fields (or properties). So you might have an object with properites for x and y coordinates and a method called Move which takes parameters xDelta and yDelta.
Calling Move with 2, 3 will cause 2 to be added to your X property and 3 to be added to your Y property. So the state of the object has changed and it wouldn't have made musch sense for Move to have returned a value.
That function achieves nothing - but if you had written
void add2numbers(double a, double b, double &sum)
{
sum = a + b;
}
It would give you the sum, whether it's easier to return a value or use a parameter depends on the function
Typically you would use a parameter if there are multiple results but suppose you had a maths routine where an answer might not be possible.
bool sqrt(double value, double &answer)
{
if value < 0.0 ) {
return false;
} else {
answer = real_sqrt_function(value);
return true;
}
}
I currently use a visualization library called VTK. I normally write void functions to update some part of the graphics that are displayed to the screen. I also use void functions to handle GUI interaction within Qt. For example, if you click a button, some text gets updated on the GUI.
You're completely right: calculating a function that returns nothing is meaningless – if you're talking about mathematical functions. But like with many mathematical concepts, "functions" are in many programming languages only related to mathematical functions, but behave more or less subtly different.
I believe it's good to explain it with a language that does not get it wrong: one such language is Haskell. That's a purely functional language which means a Haskell function is also a mathematical function. Indeed you can write Haskell functions much more mathematical-styled, e.g.
my_tan(x) = sin(x)/cos(x) -- or (preferred): tan' x = sin x / cos x
than in C++
double my_tan(double x) { return sin(x)/cos(x); }
However, in computer programs you don't just want to calculate functions, do you? You also want to get stuff done, like displaying something on your screen, sending data over the network, reading values from sensors etc.. In Haskell, things like these are well separated from pure functions, they all act in the so-called IO monad. For instance, the function putStrLn, which prints a line of characters, has type String -> IO(). Meaning, it takes a String as its argument and returns an IO action which prints out that string when invoked from the main function, and nothing else (the () parens are roughly what's void in C++).
This way of doing IO has many benefits, but most programming languages are more sloppy: they allow all functions to do IO, and also to change the internal state of your program. So in C++, you could simply have a function void putStrLn(std::string), which also "returns" an IO action that prints the string and nothing else, but does not explicitly tell you so. The benefit is that you don't need to tie multiple knots in your brain when thinking about what the IO monad actually is (it's rather roundabout). Also, many algorithms can be implemented to run faster if you have the ability to actually tell the machine "do this sequence of processes, now!" rather than just asking for the result of some computation in the IO monad.

Making the leap from Java to C++

As the topic says I'm very new to c++, but I have some experience with java.
To start learning c++ I had the (not very original) idea of making a simple command line calculator.
What I'm trying to do is store the numbers and operators in a binary tree.
#include <iostream>
using namespace std;
class Node
{
bool leaf;
double num;
char oper;
Node* pLNode;
Node* pRNode;
public:
Node(double n)
{
num = n;
leaf = true;
pLNode = 0;
pRNode = 0;
}
Node(char o, Node lNode, Node rNode)
{
oper = o;
pLNode = &lNode;
pRNode = &rNode;
leaf = false;
}
bool isLeaf()
{
return leaf;
}
double getNumber()
{
return num;
}
char getOperator()
{
return oper;
}
Node* getLeftNodePointer()
{
return pLNode;
}
Node* getRightNodePointer()
{
return pRNode;
}
//debug function
void dump()
{
cout << endl << "**** Node Dump ****" << endl;
cout << "oper: " << oper << endl;
cout << "num: " << num << endl;
cout << "leaf: " << leaf << endl;
cout << "*******************" << endl << endl;
}
};
class CalcTree
{
Node* pRootNode;
Node* pCurrentNode;
public:
Node* getRootNodePointer()
{
return pRootNode;
}
Node* getCurrentNodePointer()
{
return pCurrentNode;
}
void setRootNode(Node node)
{
pRootNode = &node;
}
void setCurrentNode(Node node)
{
pCurrentNode = &node;
}
double calculateTree()
{
return calculateTree(pRootNode);
}
private:
double calculateTree(Node* nodePointer)
{
if(nodePointer->isLeaf())
{
return nodePointer->getNumber();
}
else
{
Node* leftNodePointer = nodePointer->getLeftNodePointer();
Node* rightNodePointer = nodePointer->getRightNodePointer();
char oper = nodePointer->getOperator();
if(oper == '+')
{
return calculateTree(leftNodePointer) + calculateTree(rightNodePointer);
}
else if(oper == '-')
{
return calculateTree(leftNodePointer) - calculateTree(rightNodePointer);
}
else if(oper == '*')
{
return calculateTree(leftNodePointer) * calculateTree(rightNodePointer);
}
else if(oper == '/')
{
return calculateTree(leftNodePointer) / calculateTree(rightNodePointer);
}
}
}
};
int main(int argc, char* argv[])
{
CalcTree tree;
tree.setRootNode(Node('+', Node(1), Node(534)));
cout << tree.calculateTree() << endl;
return 0;
}
I've got a couple of questions about this code:
This compiles but does not do what's intended. It seems that after tree.setRootNode(Node('+', Node(1), Node(534))); in main, the rightnode is initialized properly but the leftnode isn't. Compiling and running this prints out 534 for me (gcc, freebsd). What is wrong here?
It seems in c++ people prefer to define members of a class outside the class, like
class A
{
public:
void member();
};
A :: member(){std::cout << "Hello world" << std::endl;}
why is that?
I'd very much like some pointers on c++ conventions (naming, indenting etc.)
I'm used to coding java with eclipse. Atm I'm using emacs for learning c++. Can someone advise me on a good (free) c++ ide, or should I stfu and stick with emacs like a real man? :)
First, I can only recommend to have a look at the book "Accelerated C++". It will jump-start you into STL, and C++ style, and can save you a year of bad experiences. (There is quite an amount of extraordinary well written literature on C++ out there, if you decide to go deeper).
C++ does not use automatic memory management on the heap. You are storing pointers to temporaries. Your program is incorrect as it tries to access destructed objects. Like it or not, you'll have to learn on object lifetime first. You can simplify a good part of this in simple cases by using value semantics (not storing pointers, but storing a copy of the object).
Eclipse/CDT is reported to be quite OK on linux. On Windows you'll do easier with Microsoft Visual C++ Express Edition. When you've gotten into the basics, switching later will be no problem for you.
Defining members outside of the class itself is preferred as we usually split header and implementation files. In the header you try not to expose any unnecessary information, and also to reduce code-size, it's a simple matter of compile time. But for many modern programming techniques (like using template meta-programming) this cannot be used, so quite some C++ code moves in the direction of inline definitions.
First, i recommend you a good book too. There are very good SO threads about C++ books, like The definitive C++ book guide and List. In the following, you find i tell you how to solve some problems, but i don't delve into details, because i think that's what a book can do much better.
I've glanced over the code, here is what i've figured:
Node(char o, Node lNode, Node rNode)
{
oper = o;
pLNode = &lNode;
pRNode = &rNode;
leaf = false;
}
That constructor has 3 parameters, all of which are local to the function. When the function returns, the parameters do not exist anymore, and their memory occupied by them is cleaned up automatically. But you store their addresses into the pointers. That will fail then. What you want is passing pointers. By the way always use constructor initializer lists:
Node(char o, Node *lNode, Node *rNode)
:oper(o), pLNode(lNode), pRNode(rNode), leaf(false)
{ }
Creating the tree now does look different:
CalcTree tree;
tree.setRootNode(new Node('+', new Node(1), new Node(534)));
cout << tree.calculateTree() << endl;
New creates a dynamic object and returns a pointer to it. The pointer must not be lost - otherwise you have a memory-leak because you can't delete the object anymore. Now, make sure you delete child nodes ordinarily by creating a destructor for Node (put it as you would put any other member function):
~Node() {
delete pLNode;
delete pRNode;
}
Here you see it's important to always null-ify pointers: delete on a null pointer will do nothing. To start cleaning-up, you create a CalcTree destructor too, which starts the delete chain:
~CalcTree() {
delete pRootNode;
}
Don't forget to create a default constructor for CalcTree that initializes that pointer to 0! Now, you have one remaining problem: If you copy your object, the original object and the copy share the same pointer, and when the second object (the copy) goes out of scope, it will call delete on the pointer a second time, thus deleting the same object twice. That's unfortunate. It can be solved by forbidding copying your classes - or by using smart pointers that have shared ownership semantics (look into shared_ptr). A third variant is to write your own copy constructor and copy assignment operator. Well, here is how you would disable copy constructor and copy assignment operators. Once you try to copy, you will get a compile error. Put this into Node:
private:
Node(Node const&);
Node& operator=(Node const&);
The same for CalcTree, and you are protected from that subtle bug now.
Now, onto your other questions:
It seems in c++ people prefer to define members of a class outside the class
That is because the more code you add to the header, the more you have to include into your header from other files (because your code depends on something that's defined in them). Note that all other files that include that header then will transitively include the headers that one includes too. So you will in the end have less indirectly included headers when you put your code into separately compiled files instead. Another problem that is solved are cyclic references. Sometimes, you have to write a method that needs to have access to something defined later in the header. Since C++ is a single-pass language, the compiler can't resolve references to symbols that are declared after the point of use - generally.
I'd very much like some pointers on c++ conventions
That's very subjective, but i like this convention:
Class data-members are written like mDataMember
Function are written like getDataMember
Local variables are written like localVariable
Indenting with spaces, 4 spaces each (uh oh, this one is controversial. There are many possible ways to indent, please do not ask for the best one!)
I'm used to coding java with eclipse. Atm I'm using emacs for learning c++.
I'm using emacs for my C++ development, and eclipse for Java. Some use eclipse for Java too (there is a package called CDT for eclipse C++ development). It seems to be quite common gist (i agree) that Visual C++ on windows is the best IDE you get for windows. There are also some answers on SO regarding this: Best IDE for C++ (best search with google in stackoverflow, it will give you much more results than the builtin search).
You're passing by value, not by reference when you say
Node(char o, Node lNode, Node rNode)
It should be
Node(char o, Node &lNode, Node &rNode)
or better yet (for consistency with the rest of your code),
Node(char o, Node *lNode, Node *rNode)
Compilation speed: The header (.h file) contains extra information not embedded in the .o files, so it must be recompiled by every other C++ file that includes it. Space: If you include method bodies, they are duplicated in every file that includes the .h file. This is in contrast to Java, where all of the relevant information is embedded in the .class file. C++ cannot do this because it is a much richer language. In particular, macros make it so that C++ will never be able to embed all the information in a .o file. Also, Turing-complete templates make it challenging to get rid of the .h / .cpp distinction.
There are many conventions. The standard C++ library has one set, the standard C library has another, BSD and GNU each have their own, and Microsoft uses yet another. I personally like to be fairly close to Java when it comes to naming identifiers and indenting.
Eclipse, NetBeans, KDevelop, vim, and emacs are all good options. If you're on Windows, Visual Studio is really nice.
There are very different conventions for C++. Google some. There is no "official" convention. ;)
To your IDE question: I am using the CDT (C[/C++] Development Tools, Eclipse). There has been a first release for another Eclipse based C++ IDE: http://www.eclipse.org/linuxtools/ I am going to test it out these days. Sounds very good.
Also try out KDevelop if you are using KDE.
2. It seems in c++ people prefer to define members of a class outside the
class[..]
It's a matter of style. Mostly. We group the big bulky methods in a separate cpp file, and keep the small ones along with the header file. Declaring the method within the class declaration makes the function inline (which is a hint for the compiler to do, you guessed it -- inlining). This is something you may or may not want depending on the problem you want to solve.
3. I'd very much like some pointers on c++ conventions (naming, indenting etc.)
The standard library is a good reference. Start looking at the headers.
4. I'm used to coding java with eclipse.
Eclipse can be configured to use a C++ compiler. Go(ogle) for it, please. Why punish yourself twice now that you've taken to C++ ;-)

To GOTO or not to GOTO? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Currently I am working on a project where goto statements are heavely used. The main purpose of goto statements is to have one cleanup section in a routine rather than multiple return statements.
Like below:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = NULL;
p = new int;
if (p == NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p = NULL;
}
return bRetVal;
}
This makes it much easier as we can track our clean up code at one section in code, that is, after the Exit label.
However, I have read many places it's bad practice to have goto statements.
Currently I am reading the Code Complete book, and it says that we need to use variables close to their declarations. If we use goto then we need to declare/initialize all variables before first use of goto otherwise the compiler will give errors that initialization of xx variable is skipped by the goto statement.
Which way is right?
From Scott's comment:
It looks like using goto to jump from one section to another is bad as it makes the code hard to read and understand.
But if we use goto just to go forward and to one label then it should be fine(?).
I am not sure what do you mean by clean up code but in C++ there is a concept called "resource acquisition is initialization" and it should be the responsibility of your destructors to clean up stuff.
(Note that in C# and Java, this is usually solved by try/finally)
For more info check out this page:
http://www.research.att.com/~bs/bs_faq2.html#finally
EDIT: Let me clear this up a little bit.
Consider the following code:
void MyMethod()
{
MyClass *myInstance = new MyClass("myParameter");
/* Your code here */
delete myInstance;
}
The problem: What happens if you have multiple exits from the function? You have to keep track of each exit and delete your objects at all possible exits! Otherwise, you will have memory leaks and zombie resources, right?
The solution: Use object references instead, as they get cleaned up automatically when the control leaves the scope.
void MyMethod()
{
MyClass myInstance("myParameter");
/* Your code here */
/* You don't need delete - myInstance will be destructed and deleted
* automatically on function exit */
}
Oh yes, and use std::unique_ptr or something similar because the example above as it is is obviously imperfect.
I've never had to use a goto in C++. Ever. EVER. If there is a situation it should be used, it's incredibly rare. If you are actually considering making goto a standard part of your logic, something has flown off the tracks.
There are basically two points people are making in regards to gotos and your code:
Goto is bad. It's very rare to encounter a place where you need gotos, but I wouldn't suggest striking it completely. Though C++ has smart enough control flow to make goto rarely appropriate.
Your mechanism for cleanup is wrong: This point is far more important. In C, using memory management on your own is not only OK, but often the best way to do things. In C++, your goal should be to avoid memory management as much as possible. You should avoid memory management as much as possible. Let the compiler do it for you. Rather than using new, just declare variables. The only time you'll really need memory management is when you don't know the size of your data in advance. Even then, you should try to just use some of the STL collections instead.
In the event that you legitimately need memory management (you have not really provided any evidence of this), then you should encapsulate your memory management within a class via constructors to allocate memory and deconstructors to deallocate memory.
Your response that your way of doing things is much easier is not really true in the long run. Firstly, once you get a strong feel for C++ making such constructors will be 2nd nature. Personally, I find using constructors easier than using cleanup code, since I have no need to pay careful attention to make sure I am deallocating properly. Instead, I can just let the object leave scope and the language handles it for me. Also, maintaining them is MUCH easier than maintaining a cleanup section and much less prone to problems.
In short, goto may be a good choice in some situations but not in this one. Here it's just short term laziness.
Your code is extremely non-idiomatic and you should never write it. You're basically emulating C in C++ there. But others have remarked on that, and pointed to RAII as the alternative.
However, your code won't work as you expect, because this:
p = new int;
if(p==NULL) { … }
won't ever evaluate to true (except if you've overloaded operator new in a weird way). If operator new is unable to allocate enough memory, it throws an exception, it never, ever returns 0, at least not with this set of parameters; there's a special placement-new overload that takes an instance of type std::nothrow and that indeed returns 0 instead of throwing an exception. But this version is rarely used in normal code. Some low-level codes or embedded device applications could benefit from it in contexts where dealing with exceptions is too expensive.
Something similar is true for your delete block, as Harald as said: if (p) is unnecessary in front of delete p.
Additionally, I'm not sure if your example was chose intentionally because this code can be rewritten as follows:
bool foo() // prefer native types to BOOL, if possible
{
bool ret = false;
int i;
// Lots of code.
return ret;
}
Probably not a good idea.
In general, and on the surface, there isn't any thing wrong with your approach, provided that you only have one label, and that the gotos always go forward. For example, this code:
int foo()
{
int *pWhatEver = ...;
if (something(pWhatEver))
{
delete pWhatEver;
return 1;
}
else
{
delete pWhatEver;
return 5;
}
}
And this code:
int foo()
{
int ret;
int *pWhatEver = ...;
if (something(pWhatEver))
{
ret = 1;
goto exit;
}
else
{
ret = 5;
goto exit;
}
exit:
delete pWhatEver;
return ret;
}
really aren't all that different from each other. If you can accept one, you should be able to accept the other.
However, in many cases the RAII (resource acquisition is initialization) pattern can make the code much cleaner and more maintainable. For example, this code:
int foo()
{
Auto<int> pWhatEver = ...;
if (something(pWhatEver))
{
return 1;
}
else
{
return 5;
}
}
is shorter, easier to read, and easier to maintain than both of the previous examples.
So, I would recommend using the RAII approach if you can.
Your example is not exception safe.
If you are using goto to clean up the code then, if an exception happens before the cleanup code, it is completely missed. If you claim that you do not use exceptions then you are mistaken because the new will throw bad_alloc when it does not have enough memory.
Also at this point (when bad_alloc is thrown), your stack will be unwound, missing all the cleanup code in every function on the way up the call stack thus not cleaning up your code.
You need to look to do some research into smart pointers. In the situation above you could just use a std::auto_ptr<>.
Also note in C++ code there is no need to check if a pointer is NULL (usually because you never have RAW pointers), but because new will not return NULL (it throws).
Also in C++ unlike (C) it is common to see early returns in the code. This is because RAII will do the cleanup automatically, while in C code you need to make sure that you add special cleanup code at the end of the function (a bit like your code).
I think other answers (and their comments) have covered all the important points, but here's one thing that hasn't been done properly yet:
What your code should look like instead:
bool foo() //lowercase bool is a built-in C++ type. Use it if you're writing C++.
{
try {
std::unique_ptr<int> p(new int);
// lots of code, and just return true or false directly when you're done
}
catch (std::bad_alloc){ // new throws an exception on OOM, it doesn't return NULL
cout<<" OOM \n";
return false;
}
}
Well, it's shorter, and as far as I can see, more correct (handles the OOM case properly), and most importantly, I didn't need to write any cleanup code or do anything special to "make sure my return value is initialized".
One problem with your code I only really noticed when I wrote this, is "what the hell is bRetVal's value at this point?". I don't know because, it was declared waaaaay above, and it was last assigned to when? At some point above this. I have to read through the entire function to make sure I understand what's going to be returned.
And how do I convince myself that the memory gets freed?
How do I know that we never forget to jump to the cleanup label? I have to work backwards from the cleanup label, finding every goto that points to it, and more importantly, find the ones that aren't there. I need to trace through all paths of the function just to be sure that the function gets cleaned up properly. That reads like spaghetti code to me.
Very fragile code, because every time a resource has to be cleaned up you have to remember to duplicate your cleanup code. Why not write it once, in the type that needs to be cleaned up? And then rely on it being executed automatically, every time we need it?
In the eight years I've been programming I've used goto a lot, most of that was in the first year when I was using a version of GW-BASIC and a book from 1980 that didn't make it clear goto should only be used in certain cases. The only time I've used goto in C++ is when I had code like the following, and I'm not sure if there was a better way.
for (int i=0; i<10; i++) {
for (int j=0; j<10; j++)
{
if (somecondition==true)
{
goto finish;
}
//Some code
}
//Some code
}
finish:
The only situation I know of where goto is still used heavily is mainframe assembly language, and the programmers I know make sure to document where code is jumping and why.
As used in the Linux kernel, goto's used for cleanup work well when a single function must perform 2 or more steps that may need to be undone. Steps need not be memory allocation. It might be a configuration change to a piece of code or in a register of an I/O chipset. Goto's should only be needed in a small number of cases, but often when used correctly, they may be the best solution. They are not evil. They are a tool.
Instead of...
do_step1;
if (failed)
{
undo_step1;
return failure;
}
do_step2;
if (failed)
{
undo_step2;
undo_step1;
return failure;
}
do_step3;
if (failed)
{
undo_step3;
undo_step2;
undo_step1;
return failure;
}
return success;
you can do the same with goto statements like this:
do_step1;
if (failed) goto unwind_step1;
do_step2;
if (failed) goto unwind_step2;
do_step3;
if (failed) goto unwind_step3;
return success;
unwind_step3:
undo_step3;
unwind_step2:
undo_step2;
unwind_step1:
undo_step1;
return failure;
It should be clear that given these two examples, one is preferable to the other. As to the RAII crowd... There is nothing wrong with that approach as long as they can guarantee that the unwinding will always occur in exactly reverse order: 3, 2, 1. And lastly, some people do not use exceptions in their code and instruct the compilers to disable them. Thus not all code must be exception safe.
You should read this thread summary from the Linux kernel mailing lists (paying special attention to the responses from Linus Torvalds) before you form a policy for goto:
http://kerneltrap.org/node/553/2131
In general, you should design your programs to limit the need for gotos. Use OO techniques for "cleanup" of your return values. There are ways to do this that don't require the use of gotos or complicating the code. There are cases where gotos are very useful (for example, deeply nested scopes), but if possible should be avoided.
The downside of GOTO is pretty well discussed. I would just add that 1) sometimes you have to use them and should know how to minimize the problems, and 2) some accepted programming techniques are GOTO-in-disguise, so be careful.
1) When you have to use GOTO, such as in ASM or in .bat files, think like a compiler. If you want to code
if (some_test){
... the body ...
}
do what a compiler does. Generate a label whose purpose is to skip over the body, not to do whatever follows. i.e.
if (not some_test) GOTO label_at_end_of_body
... the body ...
label_at_end_of_body:
Not
if (not some_test) GOTO the_label_named_for_whatever_gets_done_next
... the body ...
the_label_named_for_whatever_gets_done_next:
In otherwords, the purpose of the label is not to do something, but to skip over something.
2) What I call GOTO-in-disguise is anything that could be turned into GOTO+LABELS code by just defining a couple macros. An example is the technique of implementing finite-state-automata by having a state variable, and a while-switch statement.
while (not_done){
switch(state){
case S1:
... do stuff 1 ...
state = S2;
break;
case S2:
... do stuff 2 ...
state = S1;
break;
.........
}
}
can turn into:
while (not_done){
switch(state){
LABEL(S1):
... do stuff 1 ...
GOTO(S2);
LABEL(S2):
... do stuff 2 ...
GOTO(S1);
.........
}
}
just by defining a couple macros. Just about any FSA can be turned into structured goto-less code. I prefer to stay away from GOTO-in-disguise code because it can get into the same spaghetti-code issues as undisguised gotos.
Added: Just to reassure: I think one mark of a good programmer is recognizing when the common rules don't apply.
Using goto to go to a cleanup section is going to cause a lot of problems.
First, cleanup sections are prone to problems. They have low cohesion (no real role that can be described in terms of what the program is trying to do ), high coupling (correctness depends very heavily on other sections of code), and are not at all exception-safe. See if you can use destructors for cleanup. For example, if int *p is changed to auto_ptr<int> p, what p points to will be automatically released.
Second, as you point out, it's going to force you to declare variables long before use, which will make it harder to understand the code.
Third, while you're proposing a fairly disciplined use of goto, there's going to be the temptation to use them in a looser manner, and then the code will become difficult to understand.
There are very few situations where a goto is appropriate. Most of the time, when you are tempted to use them, it's a signal that you're doing things wrong.
The entire purpose of the every-function-has-a-single-exit-point idiom in C was to put all the cleanup stuff in a single place. If you use C++ destructors to handle cleanup, that's no longer necessary -- cleanup will be done regardless of how many exit points a function has. So in properly-designed C++ code, there's no longer any need for this kind of thing.
Since this is a classic topic, I will reply with Dijkstra's Go-to statement considered harmful (originally published in ACM).
Goto provides better don't repeat yourself (DRY) when "tail-end-logic" is common to some-but-not-all-cases. Especially within a "switch" statement I often use goto's when some of the switch-branches have tail-end-commonality.
switch(){
case a: ... goto L_abTail;
case b: ... goto L_abTail;
L_abTail: <commmon stuff>
break://end of case b
case c:
.....
}//switch
You have probably noticed than introducing additional curly-braces is enough to satisfy the compiler when you need such tail-end-merging in-the-middle of a routine. In other words, you don't need to declare everything way up at the top; that's inferior readability indeed.
...
goto L_skipMiddle;
{
int declInMiddleVar = 0;
....
}
L_skipMiddle: ;
With the later versions of Visual Studio detecting the use of uninitialized variables, I find myself always initializing most variables even though I think they may be assigned in all branches - it's easy to code a "tracing" statement which refs a variable that was never assigned because your mind doesn't think of the tracing statement as "real code", but of course Visual Studio will still detect an error.
Besides don't repeat yourself, assigning label-names to such tail-end-logic even seems to help my mind keep things straight by choosing nice label names. Without a meaningful label your comments might end up saying the same thing.
Of course, if you are actually allocating resources then if auto-ptr doesn't fit, you really must use a try-catch, but tail-end-merge-don't-repeat-yourself happens quite often when exception-safety is not an issue.
In summary, while goto can be used to code spaghetti-like structures, in the case of a tail-end-sequence which is common to some-but-not-all-cases then the goto IMPROVES the readability of the code and even maintainability if you would otherwise be copy/pasting stuff so that much later on someone might update one-and-not-the-other. So it's another case where being fanatic about a dogma can be counterproductive.
The only two reasons I use goto in my C++ code are:
Breaking a level 2+ nested loops
Complicated flows like this one (a comment in my program):
/* Analysis algorithm:
1. if classData [exporter] [classDef with name 'className'] exists, return it,
else
2. if project/target_codename/temp/classmeta/className.xml exist, parse it and go back to 1 as it will succeed.
3. if that file don't exists, generate it via haxe -xml, and go back to 1 as it will succeed.
*/
For code readability here, after this comment, I defined the step1 label and used it in step 2 and 3. Actually, in 60+ source files, only this situation and one 4-levels nested for are the places I used goto. Only two places.
A lot of people freak out with gotos are evil; they are not. That said, you will never need one; there is just about always a better way.
When I find myself "needing" a goto to do this type of thing, I almost always find that my code is too complex and can be easily broken up into a few method calls that are easier to read and deal with. Your calling code can do something like:
// Setup
if(
methodA() &&
methodB() &&
methodC()
)
// Cleanup
Not that this is perfect, but it's much easier to follow since all your methods will be named to clearly indicate what the problem might be.
Reading through the comments, however, should indicate that your team has more pressing issues than goto handling.
The code you're giving us is (almost) C code written inside a C++ file.
The kind of memory cleaning you're using would be OK in a C program not using C++ code/libraries.
In C++, your code is simply unsafe and unreliable. In C++ the kind of management you're asking for is done differently. Use constructors/destructors. Use smart pointers. Use the stack. In a word, use RAII.
Your code could (i.e., in C++, SHOULD) be written as:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<int> p = new int;
// Lot of code...
return bRetVal ;
}
(Note that new-ing an int is somewhat silly in real code, but you can replace int by any kind of object, and then, it makes more sense). Let's imagine we have an object of type T (T could be an int, some C++ class, etc.). Then the code becomes:
BOOL foo()
{
BOOL bRetVal = FALSE;
std::auto_ptr<T> p = new T;
// Lot of code...
return bRetVal ;
}
Or even better, using the stack:
BOOL foo()
{
BOOL bRetVal = FALSE;
T p ;
// Lot of code...
return bRetVal;
}
Anyway, any of the above examples are magnitudes more easy to read and secure than your example.
RAII has many facets (i.e. using smart pointers, the stack, using vectors instead of variable length arrays, etc.), but all in all is about writing as little code as possible, letting the compiler clean up the stuff at the right moment.
All of the above is valid, you might also want to look at whether you might be able to reduce the complexity of your code and alleviate the need for goto's by reducing the amout of code that is in the section marked as "lot of code" in your example. Additionaly delete 0 is a valid C++ statement
Using GOTO labels in C++ is a bad way to program, you can reduce the need by doing OO programming (deconstructors!) and trying to keep procedures as small as possible.
Your example looks a bit weird, there is no need to delete a NULL pointer. And nowadays an exception is thrown when a pointer can't get allocated.
Your procedure could just be wrote like:
bool foo()
{
bool bRetVal = false;
int p = 0;
// Calls to various methods that do algorithms on the p integer
// and give a return value back to this procedure.
return bRetVal;
}
You should place a try catch block in the main program handling out of memory problems that informs the user about the lack of memory, which is very rare... (Doesn't the OS itself inform about this too?)
Also note that there is not always the need to use a pointer, they are only useful for dynamic things. (Creating one thing inside a method not depending on input from anywhere isn't really dynamic)
I am not going to say that goto is always bad, but your use of it most certainly is. That kind of "cleanup sections" was pretty common in early 1990's, but using it for new code is pure evil.
The easiest way to avoid what you are doing here is to put all of this cleanup into some kind of simple structure and create an instance of it. For example instead of:
void MyClass::myFunction()
{
A* a = new A;
B* b = new B;
C* c = new C;
StartSomeBackgroundTask();
MaybeBeginAnUndoBlockToo();
if ( ... )
{
goto Exit;
}
if ( ... ) { .. }
else
{
... // what happens if this throws an exception??? too bad...
goto Exit;
}
Exit:
delete a;
delete b;
delete c;
StopMyBackgroundTask();
EndMyUndoBlock();
}
you should rather do this cleanup in some way like:
struct MyFunctionResourceGuard
{
MyFunctionResourceGuard( MyClass& owner )
: m_owner( owner )
, _a( new A )
, _b( new B )
, _c( new C )
{
m_owner.StartSomeBackgroundTask();
m_owner.MaybeBeginAnUndoBlockToo();
}
~MyFunctionResourceGuard()
{
m_owner.StopMyBackgroundTask();
m_owner.EndMyUndoBlock();
}
std::auto_ptr<A> _a;
std::auto_ptr<B> _b;
std::auto_ptr<C> _c;
};
void MyClass::myFunction()
{
MyFunctionResourceGuard guard( *this );
if ( ... )
{
return;
}
if ( ... ) { .. }
else
{
...
}
}
A few years ago I came up with a pseudo-idiom that avoids goto, and is vaguely similar to doing exception handling in C. It has been probably already invented by someone else so I guess I "discovered it independently" :)
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
do
{
p = new int;
if(p==NULL)
{
cout<<" OOM \n";
break;
}
// Lot of code...
bRetVal = TRUE;
} while (false);
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
I think using the goto for exit code is bad since there's a lot of other solutions with low overhead such as having an exit function and returning the exit function value when needed. Typically in member functions though, this shouldn't be needed, otherwise this could be indication that there's a bit too much code bloat happening.
Typically, the only exception I make of the "no goto" rule when programming is when breaking out of nested loops to a specific level, which I've only ran into the need to do when working on mathematical programming.
For example:
for(int i_index = start_index; i_index >= 0; --i_index)
{
for(int j_index = start_index; j_index >=0; --j_index)
for(int k_index = start_index; k_index >= 0; --k_index)
if(my_condition)
goto BREAK_NESTED_LOOP_j_index;
BREAK_NESTED_LOOP_j_index:;
}
That code has a bunch of problems, most of which were pointed out already, for example:
The function is too long; refactoring out some code into separate functions might help.
Using pointers when normal instances will probably work just fine.
Not taking advantage of STL types such as auto_ptr
Incorrectly checking for errors, and not catching exceptions. (I would argue that checking for OOM is pointless on the vast majority of platforms, since if you run out of memory you have bigger problems than your software can fix, unless you are writing the OS itself)
I have never needed a goto, and I've always found that using goto is a symptom of a bigger set of problems. Your case appears to be no exception.
Using "GOTO" will change the "logics" of a program and how you enterpret or how you would imagine it would work.
Avoiding GOTO-commands have always worked for me so guess when you think you might need it, all you maybe need is a re-design.
However, if we look at this on an Assmebly-level, jusing "jump" is like using GOTO and that's used all the time, BUT, in Assembly you can clear out, what you know you have on the stack and other registers before you pass on.
So, when using GOTO, i'd make sure the software would "appear" as the co-coders would enterpret, GOTO will have an "bad" effect on your software imho.
So this is more an explenation to why not to use GOTO and not a solution for a replacement, because that is VERY much up to how everything else is built.
I may have missed something: you jump to the label Exit if P is null, then test to see if it's not null (which it's not) to see if you need to delete it (which isn't necessary because it was never allocated in the first place).
The if/goto won't, and doesn't need to delete p. Replacing the goto with a return false would have the same effect (and then you could remove the Exit label).
The only places I know where goto's are useful are buried deep in nasty parsers (or lexical analyzers), and in faking out state machines (buried in a mass of CPP macros). In those two cases they've been used to make very twisted logic simpler, but that is very rare.
Functions (A calls A'), Try/Catches and setjmp/longjmps are all nicer ways of avoiding a difficult syntax problem.
Paul.
Ignoring the fact that new will never return NULL, take your code:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p=NULL;
p = new int;
if(p==NULL)
{
cout<<" OOM \n";
goto Exit;
}
// Lot of code...
Exit:
if(p)
{
delete p;
p= NULL;
}
return bRetVal;
}
and write it like this:
BOOL foo()
{
BOOL bRetVal = FALSE;
int *p = new int;
if (p!=NULL)
{
// Lot of code...
delete p;
}
else
{
cout<<" OOM \n";
}
return bRetVal;
}