C++ 17 copy elision of heap allocated objects? - c++

I'm using C++ 17 and I have a rather large (two dimensional) array of numbers I'm trying to initialize at the namespace scope. This array is intended to be a pre-computed lookup table that's going to be used in many parts of my code. Its definition looks something like this:
using table_type = std::array<std::array<uint64_t, SOME_BIG_NUMBER>, SOME_OTHER_BIG_NUMBER>;
table_type MyTable = ...
Where the total size of the table is > 200,000 or so. Now, all of the values in the table could be known at compile time, so initially I went ahead and did something like this:
Attempt 1
// Header.h
constexpr table_type MyTable = []()
{
table_type table{};
// code to initialize table...
return table;
}();
My initial fears were realized when MSVC refused to compile this code. The table is simply too large to be computed at compile time. I could have fiddled with the settings and increased the maximum allowed steps but I didn't really feel like getting into that.
Attempt 2
// Header.h
const extern table_type MyTable;
// Implementation.cpp
const table_type MyTable = []()
{
table_type table{};
// code to initialize table...
return table;
}();
I had a feeling this also wouldn't work when I was coding it up and I was right. MSVC warns that I'm using over 2MB of stack memory in the lambda function, and though it compiles, the executable immediately crashes upon startup due to the stack being blown up.
Attempt 3
// Header.h
// MyTable is no longer const
extern table_type MyTable;
// Implementation.cpp
table_type MyTable{};
int init_table()
{
// code to initialize table...
// we initialize it by directly writing to it, e.g
// table[0][0] = 5;
// table[0][1] = 4;
// etc.
return 0;
}
const auto _Unused = init_table();
This works. MSVC, GCC, and Clang will compile this code without complaint and the resulting executable runs as desired. Nevertheless, I found this method unsatisfying for several reasons. The first obvious issue is that I had to make my table non-const, which means that it can potentially be mutated from anywhere else in the code. I could just be careful not to do that, but if someone else ever comes along and uses my code (it's a static library) they might not be so careful. I'd like to avoid giving myself and others unneeded opportunities to shoot ourselves in the foot.
To top it all off, the initialization code is ugly. Rather than being able to use an immediately evaluated lambda I instead have to declare a free function and then call it in another location. Furthermore, if I want this to happen automatically when the program launches I need to have the init_table function return something and then store that result in a variable somewhere. C++ doesn't seem to allow for something like init_table to be called at namespace scope unless the result is being stored somewhere. So now I have an initialization function that returns a useless value and I am required to save that value in a namespace scoped variable. Ugly.
Attempt 4
// Header.h
const extern table_type MyTable;
// Implementation.cpp
const table_type MyTable = []() -> table_type
{
auto table_p = std::make_unique<table_type>();
// code to initialize table_p...
return *table_p;
}();
The idea is to avoid blowing up the stack by allocating a temporary table on the heap. That table is initialized and then copied into MyTable. MSVC, GCC, and Clang accept this code with no warnings and the resulting executables run fine.
My question
The problem is I didn't expect this to work and I can't quite wrap my head around why it works. The initial table is allocated on the heap without issue, but I don't completely understand what's happening when it's returned from the lambda. I added the -> table_type explicit return type to the lambda to make sure that it didn't deduce the return type as table_type& as that would result in a dangling reference. But since it's returning the table by copy wouldn't a temporary r-value need to be made on the call stack when the lambda returns? And that should result in the same crash as attempt #2.
I'm aware of RVO in C++ and that it was enhanced further with C++ 17. But in this case I'm attempting to return an object that's allocated on the heap and managed by a unique_ptr. I don't see how RVO could apply here because once the return statement is hit the destructor of the unique_ptr will free the memory used to store the table so there's no way that MyTable could then be initialized by simply copying the contents of the memory pointed to by table_p.
I understand that if I directly returned a stack allocated object by value the compiler can optimize away the call to the destructor of that object and effectively memcpy its contents into the new value it's being stored in. But in this case the object being returned by value is on the heap. It appears to me that what the compiler is doing is copying the contents of the memory pointed to by table_p into the memory used to store MyTable, and it is doing this before the destructor of table_p is run. Since destructor calls (as far as I'm aware) are considered part of the function body in which they are called, this would mean that MyTable is being fully initialized before the function that produces the value that initializes it actually exits. This sounds very strange to me.
I've been puzzling about this all day and I just can't figure out what's going on here. I also can't seem to find much online that's related to this specific scenario. My fear is that I'm relying on undefined/implementation defined behavior that the three major compilers just happen to work nicely with. Could another conforming compiler come along and produce an executable that blows up the stack here?

MyTable is declared at namespace scope, so it's not stored on the stack. It has static storage duration, so it lives in the program's static data area.
The object that table_p points to is stored on the heap.
So what happens when the lambda returns? First, the object that is stored on the heap is copied directly to the object that is stored in the static area (without having to go through a large temporary stack object). Then, table_p is destroyed (and the heap object with it).
The C++17 "guaranteed copy elision" feature ensures that when the lambda returns, a temporary object is not created unless it needs to be. The call expression is a prvalue, which means that it does not designate an object, but is a "recipe" that describes how to initialize an object. The compiler determines which object is the target of the prvalue (i.e., the memory location on which the "recipe" will be run in order to create and initialize an object). In this case, the target is MyTable itself, not a temporary. The return statement directly initializes that target object, not a temporary.
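The sequence described above can be sketched with a much smaller table. The dimensions kRows and kCols, the fill formula, and the check_table helper are assumptions made up for illustration; only the shape of the lambda mirrors attempt 4.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical, deliberately small dimensions; the real table is far larger.
constexpr std::size_t kRows = 64;
constexpr std::size_t kCols = 64;
using table_type = std::array<std::array<std::uint64_t, kCols>, kRows>;

// Namespace-scope object with static storage duration: it never lives on the stack.
const table_type MyTable = []() -> table_type
{
    // The working copy lives on the heap and is value-initialized to zeros.
    auto table_p = std::make_unique<table_type>();
    for (std::size_t r = 0; r < kRows; ++r)
        for (std::size_t c = 0; c < kCols; ++c)
            (*table_p)[r][c] = r * kCols + c; // placeholder values

    // Copy-initializes the lambda's result object from the heap object.
    // Under the C++17 rules that result object is MyTable itself.
    // table_p is destroyed only after this copy has completed.
    return *table_p;
}();

// Hypothetical sanity check on one placeholder value.
inline void check_table()
{
    assert(MyTable[1][2] == 1 * kCols + 2);
}
```

The explicit -> table_type return type matters here: it makes the return statement a copy from the lvalue *table_p into the result object rather than anything reference-like.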

Related

Casting structs with non-aggregate members

I am receiving a segmentation fault (SIGSEGV) when I try to reinterpret_cast a struct that contains a vector. The following code does not make sense on its own, but shows a minimal working (failing) example.
// compiler: g++ -std=c++17
struct Table
{
std::vector<int> ids;
};
std::vector<std::byte> storage;
// put that table into the storage
Table table = {.ids = {3, 5}};
auto convert = [](Table x){ return reinterpret_cast<std::byte*>(&x); };
std::byte* bytes = convert(table);
storage.insert(storage.end(), bytes, bytes + sizeof(Table));
// ...
// get that table back from the storage
Table& tableau = *reinterpret_cast<Table*>(&storage.front());
assert(tableau.ids[0] == 3);
assert(tableau.ids[1] == 5);
The code works fine if I inline the convert function, so my guess is that some underlying memory is deleted. The convert function makes a local copy of the table and after leaving the function, the destructor for the local copy's ids vector is called. Recasting just returns the vector, but the ids are already deleted.
So here are my questions:
Why does the segmentation fault happen? (Is my guess correct?)
How could I resolve this issue?
Thanks in advance :D
I see at least three reasons for undefined behavior in the shown code, each of which fatally undermines what the shown code is attempting to do. One or some combination of the following reasons is responsible for your observed crash.
struct Table
{
std::vector<int> ids;
};
Reason number 1 is that this is not a trivially copyable object, so any attempt to copy it byte by byte, as the shown code attempts to do, results in undefined behavior.
storage.insert(storage.end(), bytes, bytes + sizeof(Table));
Reason number 2 is that sizeof() is a compile time constant. You might be unaware that the sizeof of this Table object is always the same, whether or not its vector is empty or contains the first billion digits of π. The attempt here to copy the whole object into the byte buffer, this way, therefore fails for this fundamental reason.
auto convert = [](Table x){ return reinterpret_cast<std::byte*>(&x); };
Reason number 3 is that this lambda, for all practical purposes, is the same as any other function with respect to its parameters: its x parameter goes out of scope and gets destroyed as soon as this function returns.
When a function receives a parameter, that parameter is just like a local object in the function, and is a copy of whatever the caller passed to it, and like all other local objects in the function it gets destroyed when the function returns. This function ends up returning a pointer to a destroyed object, and subsequent usage of this pointer also becomes undefined behavior.
In summary, what the shown code is attempting to do is, unfortunately, going against multiple core fundamentals of C++, and manifests in a crash for one or some combination of these reasons; C++ simply does not work this way.
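Reasons 1 and 2 can be checked directly with the standard type traits. This is only a small sketch; the check_sizes helper is a made-up name for illustration, and Table is the struct from the question.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct Table
{
    std::vector<int> ids;
};

// A std::vector member owns heap memory, so Table is not trivially copyable
// and must not be copied byte by byte.
static_assert(!std::is_trivially_copyable_v<Table>);

// Hypothetical helper: sizeof measures only the Table object itself (the
// vector's bookkeeping), never the elements the vector keeps on the heap.
inline void check_sizes()
{
    Table empty{};
    Table full{std::vector<int>(1000, 42)};
    assert(full.ids.size() == 1000);
    assert(sizeof(empty) == sizeof(full));
}
```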
The code works fine if I inline the convert function
If, by trial and error, you come up with some combination of compiler options, or cosmetic tweaks, that avoids a crash, for some miraculous reason, it doesn't fix any of the underlying problems and, at some point later down the road you'll get a crash anyway, or the code will fail to work correctly. Guaranteed.
How could I resolve this issue?
The only way for you to resolve this issue is, well, not do any of this. You also indicated that what you're trying to do is just "store multiple vectors of different types in the same container". This happens to be what std::variant can easily handle, safely, so you'll want to look into that.
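A minimal sketch of the std::variant approach, assuming the goal really is to keep vectors of a few known element types in one container. The alias AnyVector, the chosen element types, and the helper names are assumptions for illustration, not something taken from the question.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical set of element types; adjust to the types actually needed.
using AnyVector = std::variant<std::vector<int>,
                               std::vector<double>,
                               std::vector<std::string>>;

inline std::vector<AnyVector> make_storage()
{
    std::vector<AnyVector> storage;
    storage.emplace_back(std::vector<int>{3, 5});
    storage.emplace_back(std::vector<double>{1.5});
    return storage;
}

inline int first_id(const std::vector<AnyVector>& storage)
{
    assert(!storage.empty());
    // std::get throws std::bad_variant_access if the alternative does not match.
    return std::get<std::vector<int>>(storage.front()).at(0);
}
```

Each variant owns its vector, so copies and destruction go through the vector's own copy constructor and destructor instead of raw bytes.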

Is this the right way to return a struct in a parameter?

I made the following method in a C++/CLI project:
void GetSessionData(CDROM_TOC_SESSION_DATA& data)
{
auto state = CDROM_TOC_SESSION_DATA{};
// ...
data = state;
}
Then I use it like this in another method:
CDROM_TOC_SESSION_DATA data;
GetSessionData(data);
// do something with data
It does work, returned data is not garbage, however there's something I don't understand.
Question:
C++ is supposed to clean up state when it has exited its scope, so data is a copy of state, correct?
And in what exactly it is different from the following you see on many examples:
CDROM_TOC_SESSION_DATA data;
GetSessionData(&data); // signature should be GetSessionData(CDROM_TOC_SESSION_DATA *data)
Which one makes more sense to use or is the right way ?
Reference:
CDROM_TOC_SESSION_DATA
Using a reference vs a pointer for an out parameter is really more of a matter of style. Both function equally well, but some people feel that the explicit & when calling a function makes it more clear that the function may modify the parameter it was passed.
i.e.
doAThing(someObject);
// It's not clear that doAThing accepts a reference and
// therefore may modify someObject
vs
doAThing(&someObject);
// It's clear that doAThing accepts a pointer and it's
// therefore possible for it to modify someObject
Note that 99% of the time the correct way to return a class/struct type is to just return it. i.e.:
MyType getObject()
{
MyType object{};
// ...
return object;
}
Called as
auto obj = getObject();
In the specific case of CDROM_TOC_SESSION_DATA it likely makes sense to use an out parameter, since the class contains a flexible array member. That means that the parameter is almost certainly a reference/pointer to the beginning of some memory buffer that's larger than sizeof(CDROM_TOC_SESSION_DATA), and so must be handled in a somewhat peculiar way.
C++ is supposed to clean up state when it has exited its scope, so data is a copy of state, correct?
In the first example, the statement
data = state
presumably copies the value of state into local variable data, which is a reference to the same object that is identified by data in the caller's scope (because those are the chosen names -- they don't have to match). I say "presumably" because in principle, an overloaded assignment operator could do something else entirely. In any library you would actually want to use, you can assume that the assignment operator does something sensible, but it may be important to know the details, so you should check.
The lifetimes of local variables data and state end when the method exits. They will be cleaned up at that point, and no attempt may be made to access them thereafter. None of that affects the caller's data object.
And in what exactly it is different from the following you see on many examples:
CDROM_TOC_SESSION_DATA data;
GetSessionData(&data);
Not much. Here the caller passes a pointer instead of a reference. GetSessionData must be declared appropriately for that, and its implementation must explicitly dereference the pointer to access the caller's data object, but the general idea is the same for most intents and purposes. Pointer and reference are similar mechanisms for indirect access.
Which one makes more sense to use or is the right way ?
It depends. Passing a reference is generally a bit more idiomatic in C++, and it has the advantage that the method does not have to worry about receiving a null or invalid pointer. On the other hand, passing a pointer is necessary if the function has C linkage, or if you need to accommodate the possibility of receiving a null pointer.
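To make the comparison concrete, here is a small sketch of both styles. SessionData is a hypothetical stand-in for CDROM_TOC_SESSION_DATA (whose real layout is not shown here), and the two function names and the values written are invented for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for CDROM_TOC_SESSION_DATA.
struct SessionData
{
    int firstSession = 0;
    int lastSession = 0;
};

// Reference version: the callee can assume it was given a valid object.
void GetSessionDataByRef(SessionData& data)
{
    SessionData state{};
    state.firstSession = 1;
    state.lastSession = 2;
    data = state; // copies state into the caller's object before state is destroyed
}

// Pointer version: the callee has to decide what a null pointer means.
bool GetSessionDataByPtr(SessionData* data)
{
    if (data == nullptr)
        return false;
    SessionData state{};
    state.firstSession = 1;
    state.lastSession = 2;
    *data = state; // explicit dereference to reach the caller's object
    return true;
}
```

Called as GetSessionDataByRef(data) or GetSessionDataByPtr(&data), both leave the caller's object holding a copy of state.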

Questions concerning value classes and vectors

More C++ learning questions. I've been using vectors primarily with raw pointers with a degree of success, however, I've been trying to play with using value objects instead. The first issue I'm running into is compile error in general. I get errors when compiling the code below:
class FileReference {
public:
FileReference(const char* path) : path(std::string(path)) {};
const std::string path;
};
int main(...) {
std::vector<FileReference> files;
// error C2582: 'operator =' function is unavailable in 'FileReference'
files.push_back(FileReference("d:\\blah\\blah\\blah"));
}
Q1: I'm assuming it's because of specifying a const path, and/or not defining an assignment operator. Why wouldn't a default operator work? Does const even win me anything here?
Q2: Secondly, in a vector of these value objects, are my objects memory-safe? (meaning, will they get automatically deleted for me). I read here that vectors by default get allocated to the heap -- so does that mean I need to "delete" anything.
Q3: Thirdly, to prevent copying of the entire vector, I have to create a parameter that passes the vector as a reference like:
// static
void FileReference::Query(const FileReference& reference, std::vector<FileReference>& files) {
// push stuff into the passed in vector
}
What's the standard for returning large objects that I don't want to die when the function dies. Would I benefit from using a shared_ptr here or something like that?
If any member variables are const, then a default assignment operator can't be created; the compiler doesn't know what you would want to happen. You would have to write your own operator overload, and figure out what behaviour you want. (For this reason, const member variables are often less useful than one might first think.)
So long as you're not taking ownership of raw memory or other resources, then there's nothing to clean up. A std::vector always correctly deletes its contained elements when its lifetime ends, so long as they in turn always correctly clean up their own resources. And in your case, your only member variable is a std::string, which also looks after itself. So you're completely safe.
You could use a shared pointer, but unless you do profiling and identify a bottleneck here, I wouldn't worry about it. In particular, you should read about copy elision, which the compiler can do in many circumstances.
Elements in a vector must be assignable. From section 23.2.4, Class template vector, of the C++ standard:
...the stored object shall meet the requirements of Assignable.
Having a const member makes the class unassignable.
As the elements are being stored by value, they will be destructed when the vector is destroyed or when they are removed from the vector. If the elements were raw pointers, then they would have to be explicitly deleted.
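The effect of the const member can be checked with a type trait, and one common way around it is to keep the member private and non-const behind a read-only accessor. The sketch below assumes C++17; ConstPathRef, the path() accessor, and make_files are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// As in the question: a const data member deletes the implicit assignment operators.
struct ConstPathRef
{
    explicit ConstPathRef(const char* p) : path(p) {}
    const std::string path;
};
static_assert(!std::is_copy_assignable_v<ConstPathRef>);

// Hypothetical alternative: private non-const member, read-only accessor.
class FileReference
{
public:
    explicit FileReference(std::string p) : path_(std::move(p)) {}
    const std::string& path() const { return path_; }
private:
    std::string path_;
};
static_assert(std::is_copy_assignable_v<FileReference>);

inline std::vector<FileReference> make_files()
{
    std::vector<FileReference> files;
    files.push_back(FileReference("d:\\blah\\blah\\blah"));
    files.push_back(FileReference("d:\\other"));
    files.erase(files.begin()); // erase shifts later elements down by assignment
    return files;                // elements are destroyed with the vector; no delete needed
}
```

Outside code still cannot modify the path, but the class remains assignable, so it works with every vector operation.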

What purpose does this code change serve?

I am trying to understand the implications / side effects / advantages of a recent code change someone made. The change is as follows:
Original
static List<type1> Data;
Modified
static List<type1> & getData (void)
{
static List<type1> * iList = new List<type1>;
return * iList;
}
#define Data getData()
What purpose could the change serve?
The benefit to the revision that I can see is an issue of 'initialization time'.
The old code triggered an initialization before main() is called.
The new code does not trigger initialization until getData() is called for the first time; if the function is never called, you never pay to initialize a variable you didn't use. The (minor) downside is that there is an initialization check in the generated code each time the function is used, and there is a function call every time you need to access the list of data.
If you have a variable with static storage duration, it is created when the application is initialized. When the application terminates the object is destroyed. Across different translation units, it is not possible to control the order in which such objects are created.
The change will make the object be created when it is first used, and (as it is allocated dynamically) it will never be destroyed.
This can be a good thing if other objects need this object when they are destroyed.
Update
The original code accessed the object using the variable Data. The code that uses it does not have to be modified in any way. When the code uses Data it will, in fact, be using the macro Data, which will be expanded into getData(). This function will return a reference to the actual (dynamically allocated) object. In practice, the new code will work as a drop-in replacement for the old code, with the only noticeable difference being what I described in the original answer above.
Delaying construction until the first use of Data avoids the "static initialization order fiasco".
Making some guesses about your List,... the default-constructed Data is probably an empty list of type1 items, so it's probably not at great risk of causing the fiasco in question. But perhaps someone felt it better to be safe than sorry.
There are several possible reasons why that change was made:
to prevent the static initialization order fiasco
to delay the initialization of the static variable (for whatever reason)
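The pattern the answers describe is often called "construct on first use". A minimal sketch, using std::vector<int> as a hypothetical stand-in for List<type1>:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for List<type1>.
using DataList = std::vector<int>;

DataList& getData()
{
    // Initialized the first time control passes through this declaration,
    // not before main(). The object is intentionally never deleted, so it
    // remains usable from the destructors of other static objects.
    static DataList* list = new DataList;
    return *list;
}
```

Every call returns a reference to the same object, which is why the macro in the question can stand in for the old variable.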

Are there conditions under which using "__attribute__((warn_unused_result))" will not work?

I have been trying to find a reason why this does not work in my code - I think this should work. Here is an excerpt from a header file:
#define WARN_UNUSED __attribute__((warn_unused_result))
class Trans {
Vector GetTranslation() const WARN_UNUSED {
return t;
}
};
So my question is: why don't I get a warning when I compile code with something like:
Gt.GetTranslation();
?
Thanks for the help.
This attribute is intended mainly (but not exclusively) for functions that return pointers to dynamically allocated data.
It gives a compile-time guarantee that the calling code will store the pointer in a variable (or perhaps pass it as an argument to a function, but I'm not certain of that) and thereby delegates the responsibility of freeing/releasing/deleting the object it points to.
This is meant to prevent memory leaks and/or other lifetime problems.
For instance, if you call malloc( ... ) without storing the pointer, you are not able to free it afterwards (malloc should have this attribute).
If you use it on a function that returns an object, then the mechanism is meaningless, because the returned object is stored in a temporary, may be copied to a non-temporary variable (the copy might be optimized out), and will always be destructed.
BTW, it's not particularly useful for returned references (unless your code is aware of it and requires some kind of release mechanism), since the referenced object doesn't get destructed when going out of scope.
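A minimal sketch of the malloc-style case, assuming GCC or Clang (the attribute syntax is compiler-specific). make_buffer and release_buffer are hypothetical names; WARN_UNUSED is the macro from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

#define WARN_UNUSED __attribute__((warn_unused_result))

// Hypothetical allocator wrapper: dropping the returned pointer would leak
// the buffer, which is the situation the attribute is meant to flag.
WARN_UNUSED int* make_buffer(std::size_t count)
{
    return static_cast<int*>(std::malloc(count * sizeof(int)));
}

// Hypothetical counterpart that releases a buffer from make_buffer.
void release_buffer(int* buffer)
{
    std::free(buffer);
}
```

With this declaration, a bare make_buffer(4); statement is expected to draw a warning on those compilers, while int* p = make_buffer(4); is not.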