I'm encountering a C++ problem (or lack thereof) because this code works, but I don't know why.
For context, my professor has given us a header file and a main function; the program generates a Fibonacci sequence.
I am using an iterator to iterate the sequence using ++a and a++.
Here is my implementation of a++.
FibonacciIterator FibonacciIterator::operator++(int) // i++: increment i, return old i.
{
    FibonacciIterator old = *this; // create old object to return
    // increment actual fibonacci sequence
    long nextTemp = fCurrent + fPrevious;
    fPrevious = fCurrent;
    fCurrent = nextTemp;
    fCurrentN++; // increment count
    return old;
}
Now, I create a value 'old' by dereferencing the 'this' pointer.
I do some logic to the current iterator, and return the old iterator.
This all works, and using the following do-while loop:
FibonacciIterator lIterator2 = lIterator.begin();
do
{
    cout << *lIterator2++ << endl;
} while (lIterator2 != lIterator2.end());
everything works. This do-while loop is written by the professor, we are not meant to change it.
My question is, why does this code work?
To my understanding, when we create a local variable in a method, that variable lives within the method's stack frame.
When we exit the stack frame, should we return a local variable that was created in that stack frame, we might get our value. We also might not.
My understanding is this is because the memory location in which this variable was created is now "up for grabs" by any program on the computer that might need it.
So IF we get the value we desired, it's because nothing has overwritten it yet. IF we don't, it's because something has overwritten it.
So why is it that this code works, 100% of the time? Why do I not SOMETIMES see old become garbage, and crash my program with an unhandled exception?
My only guess is that because 'FibonacciIterator' is a user-made class, it is AUTOMATICALLY allocated memory on the heap, thus we don't run into this problem.
My only guess would be that this problem only exists using types such as int, long, double, etc.
However, I am 99% sure that my guess is wrong, and I want to understand what is going on here. I want to understand this because for one, I enjoy C++, but I don't enjoy not knowing why something works.
Thank you!
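You can return a local object by value: it is copied (or moved, or the copy is elided) into storage owned by the caller before the local is destroyed. This has nothing to do with the heap, and it applies equally to int, long, and class types. What you shouldn't return is a pointer or reference to a local object. As you correctly point out, that pointer would point to junk.
In your case you return a copy, so it is OK (with the caveat that the copy must be "safe", i.e. the class's copy constructor does the right thing).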
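To make the difference concrete, here is a minimal sketch of a postfix increment that returns by value. FibIter is a hypothetical stand-in for your FibonacciIterator: the member names follow your question, but the starting values and the helper function value_before_last_step are assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for FibonacciIterator; starting values are assumed.
struct FibIter {
    long fPrevious = 0;
    long fCurrent = 1;
    long fCurrentN = 1;

    long operator*() const { return fCurrent; }

    // Postfix increment: returns the old state BY VALUE.
    FibIter operator++(int) {
        FibIter old = *this;              // copy of the current state
        long next = fCurrent + fPrevious;
        fPrevious = fCurrent;
        fCurrent = next;
        ++fCurrentN;
        return old;                       // caller gets its own object, not a pointer into this frame
    }
};

// Illustrative helper: advances `steps` times and returns the value
// seen through the temporary returned by the last it++.
long value_before_last_step(int steps) {
    assert(steps > 0);
    FibIter it;
    long seen = 0;
    for (int i = 0; i < steps; ++i) {
        seen = *it++;   // parsed as *(it++); the temporary lives until the end of the statement
    }
    return seen;
}
```

The returned object is a distinct value in the caller, so nothing refers to the callee's stack frame after it is gone. A function returning FibIter* or FibIter& to old would be the broken variant.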
I am receiving a segmentation fault (SIGSEGV) when I try to reinterpret_cast a struct that contains a vector. The following code does not make sense on its own, but shows a minimal working (failing) example.
// compiler: g++ -std=c++17
struct Table
{
std::vector<int> ids;
};
std::vector<std::byte> storage;
// put that table into the storage
Table table = {.ids = {3, 5}};
auto convert = [](Table x){ return reinterpret_cast<std::byte*>(&x); };
std::byte* bytes = convert(table);
storage.insert(storage.end(), bytes, bytes + sizeof(Table));
// ...
// get that table back from the storage
Table& tableau = *reinterpret_cast<Table*>(&storage.front());
assert(tableau.ids[0] == 3);
assert(tableau.ids[1] == 5);
The code works fine if I inline the convert function, so my guess is that some underlying memory is deleted. The convert function makes a local copy of the table and after leaving the function, the destructor for the local copy's ids vector is called. Recasting just returns the vector, but the ids are already deleted.
So here are my questions:
Why does the segmentation fault happen? (Is my guess correct?)
How could I resolve this issue?
Thanks in advance :D
I see at least three reasons for undefined behavior in the shown code, each of which fatally undermines what the shown code is attempting to do. One or some combination of the following reasons is responsible for your observed crash.
struct Table
{
std::vector<int> ids;
};
Reason number 1 is that this is not a trivially copyable object, so any attempt to copy it byte by byte, as the shown code attempts to do, results in undefined behavior.
storage.insert(storage.end(), bytes, bytes + sizeof(Table));
Reason number 2 is that sizeof() is a compile time constant. You might be unaware that the sizeof of this Table object is always the same, whether its vector is empty or contains the first billion digits of π. The vector's elements live in separately allocated memory; the Table object itself holds only the vector's bookkeeping. The attempt here to copy the whole object into the byte buffer, this way, therefore fails for this fundamental reason.
auto convert = [](Table x){ return reinterpret_cast<std::byte*>(&x); };
Reason number 3 is that this lambda, for all practical purposes, is the same as any other function with respect to its parameters: its x parameter goes out of scope and gets destroyed as soon as this function returns.
When a function receives a parameter, that parameter is just like a local object in the function, and is a copy of whatever the caller passed to it, and like all other local objects in the function it gets destroyed when the function returns. This function ends up returning a pointer to a destroyed object, and subsequent usage of this pointer also becomes undefined behavior.
In summary, what the shown code is attempting to do is, unfortunately, going against multiple core fundamentals of C++, and manifests in a crash for one or some combination of these reasons; C++ simply does not work this way.
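The first two reasons can be checked directly. The following is a small sketch, not a fix for the original design: Table is taken from the question, while size_is_independent_of_contents and serialize_ids are hypothetical helper names. serialize_ids shows one byte-safe alternative, copying the ints the vector owns instead of the vector object itself.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Table {
    std::vector<int> ids;
};

// Reason 1: Table owns heap memory through its vector, so it is not trivially copyable.
static_assert(!std::is_trivially_copyable<Table>::value,
              "Table must not be copied byte by byte");

// Reason 2: sizeof(Table) does not depend on how many ids are stored.
bool size_is_independent_of_contents() {
    Table empty;
    Table big;
    big.ids.assign(1000, 42);
    assert(big.ids.size() == 1000);
    return sizeof(empty) == sizeof(big) && sizeof(big) == sizeof(Table);
}

// Hypothetical helper: copies the elements (plain ints), not the vector object.
std::vector<std::byte> serialize_ids(const Table& t) {
    const auto* first = reinterpret_cast<const std::byte*>(t.ids.data());
    return std::vector<std::byte>(first, first + t.ids.size() * sizeof(int));
}
```

Reading the ids back would need the element count stored alongside the bytes, which this sketch leaves out.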
The code works fine if I inline the convert function
If, by trial and error, you come up with some combination of compiler options, or cosmetic tweaks, that avoids a crash, for some miraculous reason, it doesn't fix any of the underlying problems and, at some point later down the road you'll get a crash anyway, or the code will fail to work correctly. Guaranteed.
How could I resolve this issue?
The only way for you to resolve this issue is, well, not do any of this. You also indicated that what you're trying to do is just "store multiple vectors of different types in the same container". This happens to be what std::variant can easily handle, safely, so you'll want to look into that.
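As a rough sketch of the std::variant approach, assuming the vectors you want to store are of int and std::string (the question does not say which element types are needed, so Column, make_storage, first_int_id and total_elements are all made-up names):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical alias: one slot holds either a vector of ints or a vector of strings.
using Column = std::variant<std::vector<int>, std::vector<std::string>>;

std::vector<Column> make_storage() {
    std::vector<Column> storage;
    storage.push_back(std::vector<int>{3, 5});
    storage.push_back(std::vector<std::string>{"a", "b", "c"});
    return storage;
}

// Type-checked access: std::get throws std::bad_variant_access on a type mismatch.
int first_int_id(const std::vector<Column>& storage) {
    assert(std::holds_alternative<std::vector<int>>(storage.front()));
    return std::get<std::vector<int>>(storage.front()).front();
}

// std::visit dispatches on whichever alternative each slot currently holds.
std::size_t total_elements(const std::vector<Column>& storage) {
    std::size_t total = 0;
    for (const Column& column : storage) {
        total += std::visit([](const auto& v) { return v.size(); }, column);
    }
    return total;
}
```

Each slot copies, moves and destroys its vector through the normal constructors and destructor, so no reinterpret_cast is involved.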
In the small sample below:
#include <iostream>
using namespace std;
int z(){
    return 5 + 10; // returns 15
}
int main(){
    z(); // what happens to this return?
    cout << "Did not fail";
    return 0;
}
What happens to the 15? I tried running it in debugger but I can't find it anywhere. I assume that because it didn't get assigned to anything it just vanished but I feel like that's wrong.
I asked my TA about this today and he told me it's stored on the call stack but when I viewed it in debugger I see that it is not.
The C++ standard imposes the "as-if" rule. That rule means that a C++ compiler can do anything to a program as long as all side effects (inputs and outputs that are visible to the rest of the system, like writing to a file or showing stuff on the screen) are respected. Going back to my cheeky philosophical comment, this means that in C++, when a tree falls in the forest and no one is there to hear it, it doesn't have to make a sound (but it can).
In the case of your program, at a high level, since your function has no side effects, the compiler may or may not create a call to it, or could even remove it from the compiled binary. If it does include and call it, the return value will go to whatever return slot your platform's application binary interface specifies. On almost every x86_64 system, that will be the rax register for an integer return value. The return value is there but will never be read and will be overwritten at some point.
If it were a non-trivial object instead of an int, its destructor would be invoked immediately.
In general: when a function returns a non-void value and the value does not get stored anywhere, the value is destroyed.
Specifically: built-in types, like ints and doubles, or pointers, have no destructor, so nothing really happens. The returned value simply gets ignored.
If a function returns a class instance, the class instance gets destroyed, which results in an invocation of the class's defined destructor, or a default destructor.
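A small sketch makes the class case observable. Tracker is a hypothetical type invented for this example; it counts destructor calls so the destruction of a discarded return value can be seen. It assumes C++17 (inline static member, guaranteed copy elision).

```cpp
#include <cassert>

// Hypothetical helper type that counts how many times its destructor runs.
struct Tracker {
    static inline int destroyed = 0;   // C++17 inline variable
    ~Tracker() { ++destroyed; }
};

int z() { return 5 + 10; }
Tracker make_tracker() { return Tracker{}; }

int destructions_after_discarding() {
    Tracker::destroyed = 0;
    z();                               // the int result is simply ignored
    assert(Tracker::destroyed == 0);   // nothing to destroy for an int
    make_tracker();                    // the temporary is destroyed at the end of this statement
    return Tracker::destroyed;
}
```

The discarded int leaves no trace, while the discarded Tracker runs its destructor exactly once.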
I tried a few Google searches before making this post, but to be honest I don't know what to search for. I have a C++ project and have been happily going about using the GNU compilers (g++). Today I tried to compile with clang++ and got a segfault.
Fine, ok, I can deal with this. After perusing my code and printing some stuff I was able to fix the problem. However the solution deeply troubles and confuses me.
Here's the situation: I'm using a tree-like data structure that stores a class called Ligament, but I'm storing it in a std::vector. I do this by storing a vector of "children" which are really just integer offsets between parent and child within the vector. In this way I can access children by using the this pointer, i.e.
child = this[offset];
However, none of that's important. Here's the issue: I have a Ligament::addChild(uint32_t) function that takes an integer and pushes it to the back of a vector that is a member of Ligament:
void Ligament::addChild(uint32_t offset){
    children.push_back(offset);
}
Very simple stuff. In general I pass to addChild an argument that gets returned from a recursive function called fill:
// starting at root
uint32_t fill(vector<Ligament>& lVec, TiXmlElement * el){
    // store current size here, as size changes during recursion
    uint32_t curIdx = lVec.size();
    lVec.push_back(createLigament());
    // Add all of this Ligament's children
    TiXmlElement * i = el->FirstChildElement("drawable");
    for (; i; i=i->NextSiblingElement("drawable")){
        uint32_t tmp = fill(lVec, i) - curIdx;
        lVec[curIdx].addChild(tmp);
        // Does not work in clang++, but does in g++
        //lVec[curIdx].addChild(fill(lVec,i)-curIdx);
    }
    // return the ligament's index
    return curIdx;
}
The fill function gets called on an XML element and goes through its children, depth first.
Sorry if all that was unclear, but the core of the problem seems to be what's in that for loop. For some reason I have to store the return value of the fill call in a variable before I send it to the addChild function.
If I don't store it in a temporary variable, it seems as though the addChild function does not change the size of children, but I can't imagine why.
To check all this I printed out the size of the children vector before and after these calls, and it never went above 1. Only when I called addChild with a value that wasn't directly returned from a function did it seem to work.
I also printed out the values of offset inside the addChild function as well as inside the for loop before it was called. In all cases the values were the same, both in clang++ and in g++.
Since the issue is resolved I was able to move forward, but this is something I'd expect to work. Is there something I'm doing wrong?
Feel free to yell at me if I could do more to make this question clearer.
ALSO: I realize now that passing lVec around by reference through these recursions may be bad, as a push_back call may cause the address to change. Is this a legitimate concern?
EDIT:
So as people have pointed out, my final concern turned out to be related to the issue. The fill call has the potential to resize the vector, while the lVec[curIdx] access refers to an element in the vector that is then modified. The order in which these things occur can have drastic consequences.
As a follow-up, is using the tmp variable acceptable? There's still the issue of a reallocation occurring... I think I will use SHR's suggestion of a map, then convert it to a vector when all is said and done.
// Does not work in clang++, but does in g++:
lVec[curIdx].addChild(fill(lVec,i)-curIdx);
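The bug you are seeing is due to dependence on order of evaluation. Since fill(lVec, i) may cause lVec to reallocate its elements, the program will have undefined behavior if lVec[curIdx] is evaluated before fill(lVec,i), because the reference it yields is left dangling. Before C++17, the order of evaluation of function arguments, and of the postfix expression that determines which function to call, is unspecified, so compilers may legitimately differ. Since C++17 the postfix expression is sequenced before the arguments, so the one-line form evaluates lVec[curIdx] first. Storing the result of fill in tmp and only then indexing lVec avoids the problem either way.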
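Here is a self-contained sketch of the safe ordering. Node and Element are hypothetical, simplified stand-ins for Ligament and the TinyXML element (the real types are not reproduced here), and build_example is made up to exercise the function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Ligament.
struct Node {
    std::vector<std::uint32_t> children;
    void addChild(std::uint32_t offset) { children.push_back(offset); }
};

// Hypothetical stand-in for the XML element tree.
struct Element {
    std::vector<Element> kids;
};

std::uint32_t fill(std::vector<Node>& nodes, const Element& el) {
    const std::uint32_t curIdx = static_cast<std::uint32_t>(nodes.size());
    nodes.push_back(Node{});
    for (const Element& kid : el.kids) {
        // Finish the recursive call (and any reallocation it triggers) first...
        const std::uint32_t offset = fill(nodes, kid) - curIdx;
        // ...and only then index into the vector, so the reference is fresh.
        assert(curIdx < nodes.size());
        nodes[curIdx].addChild(offset);
    }
    return curIdx;
}

std::vector<Node> build_example() {
    Element root;
    root.kids.resize(3);           // three children under the root
    root.kids[0].kids.resize(2);   // the first child has two children of its own
    std::vector<Node> nodes;
    fill(nodes, root);
    return nodes;
}
```

Keeping an index (curIdx) across the recursive call is fine, because indices survive reallocation. Keeping a reference or pointer to nodes[curIdx] across the call is what breaks.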
I think it is undefined behavior.
You push into the vector and change it in the same statement.
One compiler may do the fill first and the other may get lVec[curIdx] first.
If that is the case, it will work with both compilers when you use map<uint32_t,uint32_t> instead of the vector, since a map doesn't require its memory to be contiguous and so doesn't move existing elements on insertion.
I have a C++ class where I have a dynamically-allocated array of pointers to structs. I have a member function to "add an item" to this array by assigning an index of the array to the pointer to a dynamically allocated instance of the struct.
I have sort_arr initialized with sort_arr = new node *[this->max_items];.
In my assignment function I have sort_arr[this->num_items] = item; where the pointer is being passed as an argument with node *item.
In this function, I am able to access a member variable using (*sort_arr[i]).key_a (where i is the index), but once another item is added, this reference is no longer valid and causes a seg fault.
Is the pointer being deallocated, and if so, is it possible to prevent this?
EDIT: Sorry for the ambiguity here. I am trying to understand the problem generally and not specifically (in a pedagogical sort of way). I was hoping it was a problem with my conceptual approach. Given that it probably isn't, here are some more details:
sort_arr is declared as node **sort_arr; in the class declaration and then initialized by the constructor as sort_arr = new node *[this->max_items];. The insert method of the class executes: sort_arr[this->num_items] = item;, where item is passed with node *item.
It seems that after an item 'n2' is inserted after 'n1', 'n1' is no longer accessible via the reference (*sort_arr[num_items]).key_a. key_a is a member variable of the node struct.
EDIT 2: node *item is dynamically allocated outside of the class (in the main function).
The code you posted looks basically correct (if not the best way to do this sort of thing), but I can't tell what key_a is, or what context you are calling it in. Because of that, it's hard to tell exactly what the problem is. Posting the entire body of your function might be useful.
The only way something you allocated via new will be deallocated is if you (or some code you call) explicitly calls delete. That's pretty much the whole point of dynamic memory allocation, to allow your objects to live after the stack frame gets popped off.
My best guess with the current information is that you're trying to access a local value that got allocated on the stack after returning from the function. For example, this would cause a problem:
some_type* some_function(int i)
{
    // ...
    some_type p = (*sort_arr[i]).key_a; // p is a copy of key_a, allocated on the stack
    // ...
    some_type* result = &p;
    return result;
}
In this scenario, p would be okay to return directly (if you changed the return type to some_type instead of some_type*), but you can't return a pointer to a local value. The local value is no longer valid after the function exits. This often causes a segfault.
Make sure that this->num_items as well as i are less than this->max_items and not negative, as an out-of-range index could be the cause of the seg fault.
Don't use raw dynamic arrays unless the exercise requires them. Use a simple and safe std::vector. It handles nearly everything that could go wrong. Try with that and see if there's still a seg fault.
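A minimal sketch of what that could look like, assuming key_a is an int (its real type is not given in the question) and using a hypothetical class name SortedStore. It holds std::unique_ptr elements so the container owns the nodes, which is an extra design choice beyond plain std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical node type; key_a is assumed to be an int.
struct node {
    int key_a;
};

class SortedStore {
public:
    // The store takes ownership, so outside code cannot delete a node it still references.
    void insert(std::unique_ptr<node> item) {
        assert(item != nullptr);
        items_.push_back(std::move(item));
    }

    // at() throws std::out_of_range instead of reading past the end.
    int key_at(std::size_t i) const { return items_.at(i)->key_a; }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::unique_ptr<node>> items_;
};

int demo_keys_sum() {
    SortedStore store;
    store.insert(std::make_unique<node>(node{1}));
    store.insert(std::make_unique<node>(node{2}));
    return store.key_at(0) + store.key_at(1);   // the first node is still valid after the second insert
}

bool out_of_range_is_reported() {
    SortedStore store;
    try {
        store.key_at(0);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With this layout there is no max_items to overrun, and a bad index turns into an exception rather than a seg fault.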
As noted, the code does appear to be correct. The problem was unrelated to the referenced code. I had been debugging some memory leaks and was deleting the referenced items, which was in turn causing the problem (quite obviously now that I see that).
I appreciate everyone's help and I'm sorry if I drove anyone crazy trying to find what was wrong.