I have a pretty nasty bug that has been bothering me for a while. Here's the situation: I'm creating an in-memory filesystem, and I have pre-allocated data blocks for each file in which to do reads and writes. To implement directories I have a simple std::vector object containing all the files in the directory. This vector object sits at the top of each directory's file. So, in order to read and write a directory, I read the first 16 bytes into a character buffer and type-cast it as a vector (16 bytes because sizeof(std::vector<T>) is 16 on my system). To be clear, the first 16 bytes are not elements of the vector, but the vector object itself. However, the vector is somehow being clobbered after I exit a key function.
The following code does not throw an exception and can correctly save a vector to a character buffer to be later retrieved.
#include <new>      // for placement new
#include <vector>

char dblock[16];

typedef std::vector<int> Entries;

void foo() {
    char buf[sizeof(Entries)];
    Entries* test = new (buf) Entries();
    test->push_back(0);
    for (int i = 0; i < sizeof(std::vector<int>); ++i) {
        dblock[i] = buf[i];
    }
}

void bar() {
    char buf[sizeof(Entries)];
    for (int i = 0; i < sizeof(std::vector<int>); ++i) {
        buf[i] = dblock[i];
    }
    Entries* test = (Entries*)buf;
    test->back();
}

int main()
{
    foo();
    bar();
    return 0;
}
However, once the function foo is changed to include an argument like so, any time I try to use iterators an exception is thrown.
void foo(int this_arg_breaks_everything) {
    char buf[sizeof(Entries)];
    Entries* test = new (buf) Entries();
    test->push_back(0);
    for (int i = 0; i < sizeof(std::vector<int>); ++i) {
        dblock[i] = buf[i];
    }
}
Looking at the disassembly, I found the problem assembler where the function tears down its stack frame:
add esp,128h <----- After stack is reduced, vector goes to an unusable state.
cmp ebp,esp
call __RTC_CheckEsp (0D912BCh)
mov esp,ebp
pop ebp
ret
I found this code by using the debugger to test whether ((Entries*)dblock)->back() throws an exception or not. Specifically, the exception is "Access violation reading location 0xCCCCCCCC". The location in question is std::vector's std::_Container_base12::_Myproxy->_Mycont->_Myproxy == 0xCCCCCCCC; you can find it at line 165 of xutility.
And yes, the bug only happens when placement new is used. Using a normal new expression and then writing test to dblock doesn't throw an exception.
Thus, I came to the conclusion that the ultimate cause is the compiler somehow doing a bad stack unwind that clobbers some memory it shouldn't.
Edit: Changed wording for clarity's sake.
Edit2: Explained magic numbers.
In Visual Studio 2013 this generates errors, and after taking a look at the internal data of the vector it was pretty easy to figure out why. The vector allocates an internal object that does a lot of its heavy lifting; this internal object in turn stores a pointer back to the vector, and thus knows the location where the vector is supposed to be. When the vector's memory is moved from one location to another, the address it occupies changes, and that internal object now points back to memory that later gets wiped by debug code.
Looking at your code, this looks like exactly the same thing. std::_Container_base12 is a base class used by the vector which has a member called _Myproxy. _Myproxy is that internal object doing the heavy lifting, and it has a pointer back to the vector that contains it. When you move the vector, this pointer is not updated, so when you go to use the vector data that was moved it tries to go through _Myproxy, which still refers back to the original vector location that was wiped. Because that data area was wiped, the pointer it finds there is instead 0xCCCCCCCC, which is what the debug runtime writes into wiped memory. It tries to access that memory location and everything explodes.
What you are doing (serializing an opaque C++ object with the equivalent of memcpy into a local buffer) is not going to work sustainably, because vector objects are deep objects with lots of pointers into heap memory. Nevertheless, to answer your question of why you are getting a crash:
The problem is alignment. When you try doing
char buf[sizeof(Entries)];
Entries* test = new (buf)Entries();
this assumes that buf has the proper alignment for the object Entries. I am not going to claim to know the internal structure of vector, but I bet it looks something like
class vector
{
    T* start;
    T* end;
    // other stuff
};
i.e. it is a bunch of pointers to the heap. Pointers require natural alignment, which is 8 bytes on a 64-bit machine. Alignment by N bytes means the address is evenly divisible by N. You are allocating buf on the stack, which is not guaranteed to have any particular alignment, but it probably accidentally has 8-byte alignment because it is the only thing in your local stack frame. However, if you declare an argument to foo, and the argument is, say, an int, which is 4 bytes, then buf may no longer be 8-byte aligned, since you've just added 4 bytes. Then, when you try to access pointers through unaligned storage, you get your crash.
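If you really must construct an object inside a raw buffer, the alignment part at least can be handled explicitly. A minimal sketch using C++11 alignas (this removes the alignment hazard described above, but it still does not make byte-copying a vector valid):

#include <new>
#include <vector>

typedef std::vector<int> Entries;

// alignas guarantees the buffer meets Entries' alignment requirement,
// no matter what other locals or arguments share the stack frame.
void foo(int this_arg_no_longer_matters) {
    alignas(Entries) char buf[sizeof(Entries)];
    Entries* test = new (buf) Entries();   // placement new into aligned storage
    test->push_back(0);
    test->~Entries();                      // placement-new'd objects must be destroyed explicitly
}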
As an experiment, try changing foo to
void foo(int unused1, int unused2) {
This might accidentally realign buf, and it might not crash. However, stop what you are doing and don't do it this way.
See http://en.wikipedia.org/wiki/Data_structure_alignment for more info
and this for guidance on serializing: http://www.parashift.com/c++-faq/serialization.html . You might consider a Boost vector class that is serializable.
Why not just write the following?
std::vector<int> dblock;

void foo() {
    dblock.push_back(0);
}

void bar() {
    dblock.back();
}
Simplest answer for this: UNDEFINED BEHAVIOR.
std::vector isn't trivially copyable, so you can't memcpy it from one place to another.
Another problem with your code is that dblock could have a different alignment than std::vector. This could cause a crash on some processors.
A third problem is that the compiler could sometimes hand you back garbage when you copy buf to dblock.
That's because you break the strict aliasing rule.
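One way to make the "not trivially copyable" constraint visible at compile time is a trait check. A minimal sketch (the assertion deliberately fires for std::vector, which is exactly the point):

#include <type_traits>
#include <vector>

// Only trivially copyable types may legally be copied byte-by-byte (memcpy-style).
// This assertion fails to compile for std::vector<int>, which is the root of the problem.
static_assert(std::is_trivially_copyable<std::vector<int>>::value,
              "std::vector<int> cannot be copied with memcpy");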
I am relatively new to C++...
I am learning and coding, but I am finding the idea of pointers to be somewhat fuzzy. As I understand it, * points to a value and & points to an address... great, but why? Which is by-value and which is by-reference, and again why?
And while I feel like I am learning and understanding the idea of stack vs heap, runtime vs design time etc, I don't feel like I'm fully understanding what is going on. I don't like using coding techniques that I don't fully understand.
Could anyone please elaborate on exactly what and why the pointers in this fairly "simple" function below are used, esp the pointer to the function itself.. [got it]
Just asking how to clean up (delete[]) the str... or if it just goes out of scope.. Thanks.
char *char_out(AnsiString ansi_in)
{
    // allocate memory for char array
    char *str = new char[ansi_in.Length() + 1];

    // copy contents of string into char array
    strcpy(str, ansi_in.c_str());

    return str;
}
TL;DR:
AnsiString appears to be an object which is passed by value to that function.
char* str is on the stack.
A new array is created on the heap with (ansi_in.Length() + 1) elements. A pointer to the array is stored in str. +1 is used because strings in C/C++ typically use a null terminator, which is a special character used to identify the end of the string when scanning through it.
ansi_in.c_str() is called, copying a pointer to its string buffer into an unnamed local variable on the stack.
str and the temporary pointer are pushed onto the stack and strcpy is called. This has the effect of copying the string (including the null terminator) pointed at by the temporary into the array pointed at by str.
str is returned to the caller
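Since the original question was how to clean up the returned buffer: the caller owns it and must delete[] it. A minimal usage sketch (AnsiString is assumed to be the VCL-style string class from the question):

void use_char_out(AnsiString ansi_in)
{
    char* str = char_out(ansi_in);   // heap array allocated inside char_out
    // ... use str ...
    delete[] str;                    // caller must free it with delete[] (not delete)
}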
Long answer:
You appear to be struggling to understand stack vs heap, and pointers vs non-pointers. I'll break them down for you and then answer your question.
The stack is a concept where a fixed region of memory is allocated for each thread before it starts and before any user code runs.
Ignoring lower level details such as calling conventions and compiler optimizations, you can reason that the following happens when you call a function:
Arguments are pushed onto the stack. This reserves part of the stack for use of the arguments.
The function performs some job, using and copying the arguments as needed.
The function pops the arguments off the stack and returns. This frees the space reserved for the arguments.
This isn't limited to function calls. When you declare objects and primitives in a function's body, space for them is reserved via pushing. When they're out of scope, they're automatically cleaned up by calling destructors and popping.
When your program runs out of stack space and starts using the space outside of it, you'll typically encounter an error. Regardless of what the actual error is, it's known as a stack overflow because you're going past it and therefore "overflowing".
The heap is a different concept where the remaining unused memory of the system is available for you to manually allocate and deallocate from. This is primarily used when you have a large data set that's too big for the stack, or when you need data to persist across arbitrary functions.
C++ is a difficult beast to master, but if you can wrap your head around the core concepts it becomes easier to understand.
Suppose we wanted to model a human:
struct Human
{
    const char* Name;
    int Age;
};

int main(int argc, char** argv)
{
    Human human;
    human.Name = "Edward";
    human.Age = 30;
    return 0;
}
This allocates at least sizeof(Human) bytes on the stack for storing the 'human' object. Right before main() returns, the space for 'human' is freed.
Now, suppose we wanted an array of 10 humans:
int main(int argc, char** argv)
{
    Human humans[10];
    humans[0].Name = "Edward";
    humans[0].Age = 30;
    // ...
    return 0;
}
This allocates at least (sizeof(Human) * 10) bytes on the stack for storing the 'humans' array. This too is automatically cleaned up.
Note uses of ".". When using anything that's not a pointer, you access their contents using a period. This is direct memory access if you're not using a reference.
Here's the single object version using the heap:
int main(int argc, char** argv)
{
    Human* human = new Human();
    human->Name = "Edward";
    human->Age = 30;
    delete human;
    return 0;
}
This allocates sizeof(Human*) bytes on the stack for the pointer 'human', and at least sizeof(Human) bytes on the heap for storing the object it points to. 'human' is not automatically cleaned up; you must call delete to free it. Note the use of "a->b". When using pointers, you access their contents using the "->" operator. This is indirect memory access, because you're accessing memory through a variable holding an address.
It's sort of like mail. When someone wants to mail you something they write an address on an envelope and submit it through the mail system. A mailman takes the mail and moves it to your mailbox. For comparison the pointer is the address written on the envelope, the memory management unit(mmu) is the mail system, the electrical signals being passed down the wire are the mailman, and the memory location the address refers to is the mailbox.
Here's the array version using the heap:
int main(int argc, char** argv)
{
    Human* humans = new Human[10];
    humans[0].Name = "Edward";
    humans[0].Age = 30;
    // ...
    delete[] humans;
    return 0;
}
This allocates sizeof(Human*) bytes on the stack for pointer 'humans', and (sizeof(Human) * 10) bytes on the heap for storing the array it points to. 'humans' is also not automatically cleaned up; you must call delete[] to free it.
Note uses of "a[i].b" rather than "a[i]->b". The "[]" operator(indexer) is really just syntactic sugar for "*(a + i)", which really just means treat it as a normal variable in a sequence so I can type less.
In both of the above heap examples, if you didn't write delete/delete[], the memory that the pointers point to would leak (never be freed). This is bad because if left unchecked it could eat through all your available memory, eventually crashing when there isn't enough or when the OS decides other apps are more important than yours.
Using the stack is usually the wiser choice, as you get automatic lifetime management via scope (aka RAII) and better data locality. The only "drawback" to this approach is that because of scoped lifetime you can't directly access your stack variables once the scope has exited. In other words, you can only use stack variables within the scope they're declared in. Despite this, C++ allows you to copy pointers and references to stack variables and indirectly use them outside the scope they're declared in. Do note, however, that this is almost always a very bad idea; don't do it unless you really know what you're doing, I can't stress this enough.
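To make the danger concrete, here is a minimal sketch (my own example, not from the question) of handing out a pointer to a stack variable that has already been destroyed:

int* broken()
{
    int local = 42;        // lives on the stack frame of broken()
    return &local;         // pointer escapes the scope -- dangling as soon as we return
}

int main()
{
    int* p = broken();
    // Reading *p here is undefined behavior: the stack slot that held 'local' is gone.
    return 0;
}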
Passing an argument by-ref means pushing a copy of a pointer or reference to the data on the stack. As far as the computer is concerned pointers and references are the same thing. This is a very lightweight concept, but you typically need to check for null in functions receiving pointers.
Pointer variant of an integer adding function:
#include <stdexcept>  // for std::invalid_argument

int add(const int* firstIntPtr, const int* secondIntPtr)
{
    if (firstIntPtr == nullptr) {
        throw std::invalid_argument("firstIntPtr cannot be null.");
    }
    if (secondIntPtr == nullptr) {
        throw std::invalid_argument("secondIntPtr cannot be null.");
    }
    return *firstIntPtr + *secondIntPtr;
}
Note the null checks. If the function didn't verify its arguments, they could very well be null or point to memory the app doesn't have access to. Attempting to read such values by dereferencing (*firstIntPtr or *secondIntPtr) is undefined behavior, and if you're lucky it results in a segmentation fault (aka access violation on Windows), crashing the program. When this happens and your program doesn't crash, there are deeper issues with your code that are out of the scope of this answer.
Reference variant of an integer adding function:
int add(const int& firstInt, const int& secondInt)
{
    return firstInt + secondInt;
}
Note the lack of null checks. By design C++ limits how you can acquire references, so you're not supposed to be able to pass a null reference, and therefore no null checks are required. That said, it's still possible to get a null reference by converting a pointer to a reference, but if you're doing that and not checking for null before converting, you have a bug in your code.
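For illustration, a sketch of the (buggy) pointer-to-reference conversion being described; this is undefined behavior and is shown only to make the point:

int add(const int& firstInt, const int& secondInt);  // the reference version declared above

void caller()
{
    int* p = nullptr;
    // Dereferencing a null pointer to bind a reference is undefined behavior.
    // The references "look" valid to add(), which has no way to check them.
    int sum = add(*p, *p);
    (void)sum;
}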
Passing an argument by-val means pushing a copy of it on the stack. You almost always want to pass small data structures by value. You don't have to check for null when passing values because you're passing the actual data itself and not a pointer to it.
i.e.
int add(int firstInt, int secondInt)
{
    return firstInt + secondInt;
}
No null checks are required because values, not pointers are used. Values can't be null.
Assuming you're interested in learning about all this, I highly suggest you use std::string for all your string needs and std::unique_ptr for managing pointers.
i.e.
std::string char_out(AnsiString ansi_in)
{
    return std::string(ansi_in.c_str());
}

std::unique_ptr<char[]> char_out(AnsiString ansi_in)
{
    std::unique_ptr<char[]> str(new char[ansi_in.Length() + 1]);
    strcpy(str.get(), ansi_in.c_str());
    return str; // std::move(str) if you're using an older C++11 compiler.
}
I am creating an Arduino device using C++. I need a stack object with variable size and variable data types. Essentially this stack needs to be able to be resized and used with bytes, chars, ints, doubles, floats, shorts, and longs.
I have a basic class setup, but with the amount of dynamic memory allocation that is required, I wanted to make sure that my use of data frees enough space for the program to continue without memory problems. This does not use std methods, but instead built in versions of those for the Arduino.
For clarification, my question is: Are there any potential memory problems in my code?
NOTE: This is not on the Arduino Stack Exchange because it requires an in-depth knowledge of C/C++ memory allocation that could be useful to all C and C++ programmers.
Here's the code:
Stack.h
#pragma once

class Stack {
  public:
    void init();
    void deinit();
    void push(byte* data, size_t data_size);
    byte* pop(size_t data_size);
    size_t length();
  private:
    byte* data_array;
};
Stack.cpp
#include "Arduino.h"
#include "Stack.h"
void Stack::init() {
// Initialize the Stack as having no size or items
data_array = (byte*)malloc(0);
}
void Stack::deinit() {
// free the data so it can be re-used
free(data_array);
}
// Push an item of variable size onto the Stack (byte, short, double, int, float, long, or char)
void Stack::push(byte* data, size_t data_size) {
data_array = (byte*)realloc(data_array, sizeof(data_array) + data_size);
for(size_t i = 0; i < sizeof(data); i++)
data_array[sizeof(data_array) - sizeof(data) + i] = data[i];
}
// Pop an item of variable size off the Stack (byte, short, double, int, float, long, or char)
byte* Stack::pop(size_t data_size) {
byte* data;
if(sizeof(data_array) - data_size >= 0) {
data = (byte*)(&data_array + sizeof(data_array) - data_size);
data_array = (byte*)realloc(data_array, sizeof(data_array) - data_size);
} else {
data = NULL;
}
// Make sure to free(data) when done with the data from pop()!
return data;
}
// Return the sizeof the Stack
size_t Stack::length() {
return sizeof(data_array);
}
There are some minor code bugs, apparently, which -- although important -- are easily resolved. The following answer only applies to the overall design of this class:
There is nothing wrong with just the code that is shown.
But only the code that's shown. No opinion is rendered on any code that's not shown.
And, it's fairly likely that there are going to be massive problems, and memory leaks, in the rest of the code which will attempt to use this class.
It's going to be very, very easy to use this class in a way that leaks or corrupts memory. It's going to be much harder to use this class correctly, and much easier to screw up. The fact that these functions themselves appear to do their job correctly is not going to help if all you have to do is sneeze in the wrong direction and end up with these functions not being used in the proper order or sequence.
Just to name the first two readily apparent problems:
1) Failure to call deinit(), when any instance of this class goes out of scope and gets destroyed, will leak memory. Every time you use this class, you have to be cognizant of when the instance of this class goes out of scope and gets destroyed. It's easy to keep track of every time you create an instance of this class, and it's easy to remember to call init() every time. But keeping track of every possible way an instance of this class could go out of scope and get destroyed, so that you must call deinit() and free up the internal memory, is much harder. It's very easy to not even realize when that happens.
2) If an instance of this class gets copy-constructed, or the default assignment operator gets invoked, this is guaranteed to result in memory corruption, with an extra side-helping of a memory leak.
Note that you don't have to go out of your way to write code that copy-constructs, or assigns one instance of the object to another one. The compiler will be more than happy to do it for you, if you do not pay attention.
Generally, the best way to avoid these kinds of problems is to make it impossible to happen, by using the language correctly. Namely:
1) Following the RAII design pattern. Get rid of init() and deinit(). Instead, do this work in the object's constructor and destructor.
2) Either deleting the copy constructor and the assignment operator, or implementing them correctly. So, if instances of this class should never be copy-constructed or assigned-to, it's much better to have the compiler yell at you, if you accidentally write some code that does that, instead of spending a week tracking down where that happens. Or, if the class can be copy-constructed or assigned, doing it properly.
Of course, if there would only be a small number of instances of this class, it should be possible to safely use it, with tight controls, and lots of care, without doing this kind of a redesign. But, even if it were the case, it's always better to do the job right, instead of shrugging this off now, but then later deciding to expand the use of this class in more places, and then forgetting about the fact that this class is so error-prone.
P.S.: a few of the minor bugs that I mentioned in the beginning:
data_array = (byte*)realloc(data_array, sizeof(data_array) + data_size);
This can't be right. data_array is a byte *, so sizeof(data_array) will always be a compile-time constant, which would be sizeof(byte *). That's obviously not what you want here. You need to explicitly keep track of the allocated array's size.
The same general bug appears in several other places here, but it's easily fixed. The overall class design is the bigger problem.
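To illustrate points 1 and 2, together with tracking the size explicitly instead of relying on sizeof, here is a rough sketch of what an RAII redesign could look like. The names and details are mine, not the original class, and a C++11-capable Arduino toolchain is assumed:

#include <Arduino.h>
#include <string.h>   // memcpy

class Stack {
  public:
    Stack() : data_array(NULL), data_length(0) {}          // acquire in constructor
    ~Stack() { free(data_array); }                         // release in destructor

    // Non-copyable: copying would double-free data_array on destruction.
    Stack(const Stack&) = delete;
    Stack& operator=(const Stack&) = delete;

    void push(const byte* data, size_t data_size) {
        byte* grown = (byte*)realloc(data_array, data_length + data_size);
        if (grown == NULL) return;                         // allocation failed; keep old buffer
        data_array = grown;
        memcpy(data_array + data_length, data, data_size); // size tracked explicitly, not via sizeof()
        data_length += data_size;
    }

    bool pop(byte* out, size_t data_size) {
        if (data_length < data_size) return false;
        data_length -= data_size;
        memcpy(out, data_array + data_length, data_size);  // copy out; no caller-side free needed
        return true;
    }

    size_t length() const { return data_length; }

  private:
    byte* data_array;
    size_t data_length;                                    // actual number of bytes in use
};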
static const int MAX_SIZE = 256; //I assume this is static Data

bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
{
    if(sizeParam > MAX_SIZE)
    {
        return false;
    }
    for(int i=0; i<sizeParam; i++)
    {
        arrayParam[i] = 9;
    }
    return true;
}

int main()
{
    int* myArray = new int[30]; //I assume this is allocated on heap memory
    bool res = initialiseArray(myArray, 30); //Where does this lie?
    delete myArray;
}
We're currently learning the different categories of memory. I know that there's:
- Code Memory
- Static Data
- Run-Time Stack
- Free Store (Heap)
I have commented where I'm unsure, just wondering if anyone could help me out. My definition of the run-time stack says it is used for functions, but my definition of code memory says it contains all the instructions for the methods/functions, so I'm just a bit confused.
Can anyone lend a hand?
static const int MAX_SIZE = 256; //I assume this is static Data
Yes indeed. In fact, because it's const, this value might not be kept in your final executable at all, because the compiler can just substitute "256" anywhere it sees MAX_SIZE.
bool initialiseArray(int* arrayParam, int sizeParam) //where does this lie?
The code for the initialiseArray() function will be in the code (text) section of your executable. You can get a pointer to its memory address, and call the function via that address, but other than that there's not much else you can do with it.
The arrayParam and sizeParam arguments will be passed to the function by value, on the stack. Likewise, the bool return value will be placed into the calling function's stack area.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Correct.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Effectively, the myArray pointer and the literal 30 are copied into the stack area of initialiseArray(), which then operates on them, and then the resulting bool is copied into the stack area of the calling function.
The actual details of argument passing are a lot more grisly and depend on calling conventions (of which there are several, particularly on Windows), but unless you're doing something really specialised then they're not really important :-)
The stack is used for automatic variables - that is, variables declared within functions, or as function parameters. These variables are destroyed automatically when the program leaves the block of code they were declared in.
You're correct that MAX_SIZE has a static lifetime - it is destroyed automatically at the end of the program. You're also correct that the array allocated with new[] is on the heap (having a dynamic lifetime) - it won't be destroyed automatically, so it needs to be deleted. By the way, you need delete [] myArray; to match the use of new [].
The pointer to it (myArray) is an automatic variable, on the stack, as are res and the function arguments.
There is just one type of memory... it is all just memory :D
What is different is where it is and how you access it.
If you go deep into the exe loader in Windows (or in any OS, actually), what it really does is store the information about your sections (parts of your exe), and at run time it lays them out properly in memory and applies access rights. So generally the code section, where your "program" is, lives in the same memory (your RAM) as your data section. The difference is the access rights: the code section usually only has read + execute, the data section just read + write (and no execute).
The stack is still memory; it is special in the sense that it is again controlled by the OS. The stack size is how big your stack is in bytes, but its purpose is to hold intermediate values between function calls (as per stdcall) and local variables (exactly how depends on the compiler). Because it is memory you CAN use it, but it is unwise to, say, allocate a 10000-byte string on the stack. In assembly you have direct access via the stack pointer ESP (and the frame pointer EBP, if I remember correctly :P), or in C/C++ you can use alloca.
The new and delete operators are built-ins of the C++ language, but as far as I know they use the same system allocators as you do; in fact you can override them and use malloc/free and it should work, which means that again this is the same memory.
The difference between using new/delete and an OS-specific function is that you let the language handle the allocation, but in the end you will get a pointer just like you would from any other function.
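As a concrete illustration of the "you can override them and use malloc/free" remark, here is a minimal sketch of replacing the global operator new and operator delete (illustrative only; a complete replacement would also cover the nothrow and array forms):

#include <cstdlib>
#include <new>

// Global override: new and delete end up calling the same C allocator you could call yourself.
void* operator new(std::size_t size)
{
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();   // operator new must not return null
    return p;
}

void operator delete(void* p) noexcept
{
    std::free(p);
}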
On top of this there are special ways that change how the memory is handled; in Windows this is virtual memory, for example. VirtualAlloc and VirtualFree allow you to specify what you will do with the memory you want to use, thus letting the OS optimize better: you can tell it "I want 2 GB of memory BUT it doesn't have to be in RAM", so it may page it out to disk, but you STILL access it with ordinary memory pointers.
And about your questions:
static const int MAX_SIZE = 256; //I assume this is static Data
It usually depends on the compiler, but mostly they will treat this as const (static is something else), which means that it will be in the const section of the exe, which in turn means that this memory block will be read-only.
int* myArray = new int[30]; //I assume this is allocated on heap memory
Yes, this will be on the heap, but how it is allocated depends on the implementation and on whether you override the new operator; if you do, you can for example force it to be in virtual memory so in fact it could be on disk or in RAM, but that would be a silly thing to do, so yes, it will be on the heap.
bool res = initialiseArray(myArray, 30); //Where does this lie?
Multiple things happen here. Because the compiler knows that the first parameter of initialiseArray must be a pointer to an int, it will pass a pointer to myArray, so both a pointer and the value 30 go on the stack, and then the function is called.
In the function, which is in memory (the code section), the code runs and gets the parameters (int* arrayParam, int sizeParam) from the stack. It knows that you want to write through arrayParam and that it is a pointer, so it will write to the location arrayParam points to. Where exactly is given by arrayParam[i]: i offsets the memory pointer to the correct element. Again C++ does some magic by adjusting the pointer for you; since the adjustment should really be in bytes, it moves the pointer by 4 for each step because (usually) an int is 4 bytes.
To get a better overview of where goes what and how it works, use a debugger or a disassembler ( like OllyDbg ) and see it for yourself, if you want know more about how the stack is used look up the stdcall calling convention.
I'm converting a program from Fortran to C++.
My code seems to run fine until I add this array declaration:
float TC[100][100][100];
And then when I run it I get a segmentation fault error. This array should only take up 8 MB of memory and my machine has 3 GB. Is there a problem with this declaration? My C++ is pretty rusty.
That array is about 4 megabytes large. If this definition is inside a function (as a local variable), then the compiler tries to store it on the stack, which on most systems cannot grow that large.
The Fortran compiler probably allocated it statically (Fortran routines are not allowed to be called recursively unless explicitly marked as recursive, so static allocation for local variables works there for non-recursive functions), and therefore the error doesn't occur there.
A simple fix would be to explicitly declare the variable static, assuming the Fortran function was not declared recursive. However this may bite you later, if you ever try to call that function recursively from a revised version. So a better solution would probably be to allocate it dynamically. However that costs extra time and therefore depending on the nature of the code, may hurt your performance too much (Fortran code quite often is numerical code where performance matters).
If you choose to make the array static, you can build in a protection against accidental recursive calls:
#include <cassert>

void yourfunction()
{
    static bool active;
    static float TC[100][100][100];

    assert(!active);
    active = true;
    // your code
    active = false;
}
I'm guessing TC is being allocated as an auto local variable. This means it's being stored on the stack. You don't get 4mb of stack memory, so it's causing a stack overflow.
To solve it, use dynamic allocation with a structured container or new.
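For example, a heap-backed version using a flat std::vector with an index helper (just a sketch of the "structured container" option, not the only way to do it):

#include <vector>

// One contiguous heap allocation; at() maps 3-D coordinates onto it.
struct Grid3D {
    std::vector<float> data;
    Grid3D() : data(100 * 100 * 100, 0.0f) {}
    float& at(int i, int j, int k) { return data[(i * 100 + j) * 100 + k]; }
};

int main()
{
    Grid3D TC;                 // the object is on the stack, but its 4 MB payload is on the heap
    TC.at(1, 2, 3) = 42.0f;
    return 0;
}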
This looks like a stack-based declaration. Try allocating from the heap (i.e. using the new operator).
If you are declaring it inside of a function, as a local variable, it may be that your stack is not big enough to fit the array. You may try to allocate it on the heap, with new or malloc(), or, if your design allows, make it a global variable.
In C++ the stack has a limited amount of space. MSVC defaults this size to 1MB. If the stack uses more than 1MB it will segfault or stackoverflow or something. You will have to move that structure to dynamic memory. To move it to dynamic memory, you want something like this:
typedef float*** bigarray;

bigarray TC() {
    static float*** ptr = NULL;
    if (ptr == NULL) {
        ptr = new float**[100];
        for(int j=0; j<100; j++) {
            ptr[j] = new float*[100];
            for(int i=0; i<100; i++)
                ptr[j][i] = new float[100];
        }
    }
    return ptr;
}
That will allocate the structure in dynamic memory the first time it is accessed, as a jagged array. You can get more performance out of a rectangular array, but you have to change types:
typedef std::vector<std::array<std::array<float, 100>, 100>> bigarray;
bigarray TC(100);
According to http://cs.nyu.edu/exact/core/doc/stackOverflow.txt, gcc/Linux defaults the stack size to 8MB, which may not leave enough room for your structure plus everything else in int main(). If you really want to, MSVC has flags to increase the stack size up to 32MB, and Linux has the ulimit command to increase the stack size up to 32MB.
Here is my code. I cast the buffer to different types of objects; is this what causes the failure? I really want to know why FromBase::find2(int key) works, but not FromBase::find(int key).
class Base
{
public:
    virtual int find(int key)=0;
    int keys[4];
};

class FromBase:public Base
{
public:
    FromBase();
    int find(int key);
    int find2(int key);
};

FromBase::FromBase()
{
    for(int i=0;i<4;i++)
        keys[i]=-1;
}

int FromBase::find(int key)
{
    int i;
    for(i=0;i<4;i++){
        if(keys[i]==key)
            return i;
    }
    return i;
}

int FromBase::find2(int key)
{
    int i;
    for(i=0;i<4;i++){
        if(keys[i]==key)
            return i;
    }
    return i;
}

int main()
{
    FromBase frombase;
    FILE* fptr=fopen("object.dat","w");
    fwrite((void*)&frombase,48,1,fptr);
    fclose(fptr);

    char object[48];
    fptr=fopen("object.dat","r");
    fread((void*)object,48,1,fptr);

    // looks like this works
    ((FromBase*)object)->find2(7);

    //These two do not work, I got segmentation fault!
    ((FromBase*)object)->find(7);
    ((Base*)object)->find(7);
}
The reason I want to do this is that I need to read the object back from a file, thus I need to cast the buffer to a particular type before I can call the method.
There is a high chance that you are overwriting the virtual function table with your code leading to a bad address when you call the method. You cannot just save objects into a file and expect to restore them by just restoring the memory content at the time they were saved.
There are some nice libraries like boost::serialization to save and restore objects. I would urge you to read about this or to turn your objects into plain old data types (structs) containing no references or addresses.
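As a rough idea of the "plain old data" route this answer suggests, the object could save and restore just its data members rather than its raw memory image. A minimal sketch (the member name follows the question's class; the class and its save/load functions are hypothetical, mine rather than the question's):

#include <cstdio>

class FromBaseData {   // hypothetical POD-style variant, no virtual functions
public:
    int keys[4];

    bool save(const char* path) const {
        FILE* f = fopen(path, "wb");
        if (!f) return false;
        size_t ok = fwrite(keys, sizeof(keys), 1, f);  // write only the data, never the vptr
        fclose(f);
        return ok == 1;
    }

    bool load(const char* path) {
        FILE* f = fopen(path, "rb");
        if (!f) return false;
        size_t ok = fread(keys, sizeof(keys), 1, f);   // restore into an already-constructed object
        fclose(f);
        return ok == 1;
    }
};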
There are several reasons why this code is not guaranteed to work. I think the biggest concern is this code here:
char object[48];
The number 48 here is a magic number and there's absolutely no guarantee that the size of the object you're writing out is 48 bytes. If you want to have a buffer large enough to hold an object, use sizeof to see how many bytes you need:
char object[sizeof(FromBase)];
Moreover, this is not guaranteed to work due to alignment issues. Every object in C++ has an alignment, some number that its address must be a multiple of. When you declare a variable, C++ ensures that it has the proper alignment for its type, though there's no guarantee that it ends up having the alignment of any other type. This means that when you declare a char array, there's no guarantee that it's aligned the same way as a real FromBase object would be, and using the buffer as an object of that type results in undefined behavior.
As others have pointed out, though, you also have a problem because this line:
fopen("object.dat","r");
Doesn't update the local variable you're using to keep track of the file pointer, so what you're reading back is almost certainly going to be garbage (if you read back anything at all). The segfault is probably coming from the bytes for the virtual dispatch table not being read back in correctly.
// will these two methods work? I got segmentation fault!
(FromBase*)object->find(7);
(Base*)object->find(7);
No they will not work. The segmentation fault might be a hint ;)
object is an array on the stack, which is fine, but you need to call the class constructor. If this were valid C++, ANY memory could be cast to any class.
I'd start off by creating the class on the stack and calling some Load() method on it, e.g.
FromBase object;
object.Load("object.dat");
And let the Load()-method read the data from file and set values on the internal data.
Apart from all the other problems that people have pointed out.
I am absolutely shocked that nobody has mentioned that:
(FromBase*)object->find2(7);
Is just NOT guaranteed to work.
You are depending on a raft of implementation details. object is an array of char, not a FromBase, thus the compiler has not had the chance to initialize any of its implementation-dependent details.
Even if we assume that the implementation uses a vtable (and thus a vtable pointer in the class): does the implementation use a relative pointer or an absolute pointer? Are you assuming you can save in one run and then reload in the next? Are you assuming the vtable is actually located at the same location between different runs (what happens when you load this part of the application from a dynamic library)?
This is just horrible. You SHOULD NOT DO THIS EVER.
If you want to serialize and de-serialize the object from storage. Then the class has to know how to do the serialization itself. Thus all the correct constructors/destructors get called at the correct time.
First problem I can see when you use fopen second time:
fopen("object.dat","r"); //problem - your code
which should be this:
fptr = fopen("object.dat","r"); //fix (atleast one fix)
That means, in your code you're trying to read data using fptr which is already closed!
One problem is that the array of characters does not have a method called find.
The cast does not convert the array to FromBase or Base; it only tells the compiler to ignore the error.