C++ dynamic allocation

I'm very confused with regard to the following instructions:
#include <iostream>
#define MAX_IT 100
using namespace std;

class Integer {
private:
    int a;
public:
    Integer(int valoare) { a = valoare; }
    int getA() { return a; }
    void setA(int valoare) { a = valoare; }
};

int main() {
    Integer* a = new Integer(0);
    //cout << a[0].getA();
    for (int i = 1; i <= MAX_IT; i++)
    {
        a[i] = *(new Integer(i));
    }
    for (int i = 0; i <= MAX_IT; i++)
        cout << a[i].getA() << endl;
    return 13;
}
It works for small values of MAX_IT, but when I try to set MAX_IT to 1000 it doesn't work anymore.
Initially, I thought the "new" operator was supposed to do the job, but after reading some documentation I understood it is not supposed to work like this at all (out-of-bounds array).
So my question is: why does it work for small values of MAX_IT and not for bigger ones?
EDIT:
I am experimenting with this code for a larger program, where I am not allowed to use the STL. You have not understood my concern: if I have

Integer* var = new Integer[10];
for (int k = 1; k < 10; k++)
    *(var + k) = k; // this is perfectly fine

but if I try

var[10] = *(new Integer(10)); // this should not work and should cause a memory problem

My concern is that it does work if I do it only 100 times or so. The question is: why does it work every time for a small number of iterations?

Because by allocating space for one Integer and then using it as an array of multiple Integers, your code invokes undefined behavior, meaning it can do anything, including crashing, working seemingly fine, or pulling demons out of your nose.
And anyway, it's leaking memory. If you don't need dynamic memory allocation, then don't use it.
a[i]=*(new Integer(i));
And kaboom: you lost the pointer to that Integer, with no chance to delete it later. Leaks.
If you don't need raw arrays, don't use them. Prefer std::vector. Or switch to C if C++ is too hard.
std::vector<Integer> vec;
vec.push_back(Integer(1337));

The reason that things tend to work nicely when you overflow your buffer by just a little bit is... memory fragmentation! Who would have guessed?
To avoid memory fragmentation, allocators won't return a block of just sizeof(Integer). They'll give you a somewhat larger block, to ensure that if the block is later freed before the adjacent blocks, it's at least big enough to be useful.
Exactly how big this is can vary by architecture, OS, compiler version, or even how much memory is physically present in the machine. You should consider it to be completely unpredictable. Also, some libraries designed to help catch this sort of bug force any small object to be placed at the end of the block instead of the beginning, so the extra bytes could be negative array indices instead of positive.
Therefore, don't ever rely on having spare area given to you for free after (or before) an object.
Guru note: Occasionally someone comes up with a valid use for the extra memory, and asks for a way to discover how large it is. One good example is that the capacity (not size!) of a std::vector could be adjusted to match the actual allocated space instead of the requested space, and therefore reduce (on average) the number of reallocations needed. Such requests usually come paired with other guru allocator APIs, such as the ability to expand an allocation in-place if there happen to be free blocks adjacent.
Note that in your particular case you do still have undefined behavior, because you're calling operator= on a non-POD object which hasn't first been constructed. If you gave class Integer a trivial default constructor, that would change.
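If you want to see the block rounding described above for yourself, here is a minimal sketch, assuming glibc (malloc_usable_size lives in <malloc.h> and is not portable):

#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // glibc-specific

int main() {
    void* p = std::malloc(sizeof(int));           // request a few bytes...
    std::printf("requested %zu, usable %zu bytes\n",
                sizeof(int), malloc_usable_size(p)); // ...see the real block size
    std::free(p);
    return 0;
}

On a typical 64-bit glibc this reports around 24 usable bytes for a 4-byte request; as said above, treat the exact number as unpredictable and never touch the spare space.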

You actually need:

Integer* a = new Integer[MAX_IT]; // note: requires Integer to have a default constructor
//cout << a[0].getA();
for (int i = 1; i < MAX_IT; i++)  // note: < rather than <=
{
    a[i] = i;                     // uses the implicit Integer(int) constructor
}
delete[] a;                       // and release the whole array when done

Better would be to use std::vector, though.


std::allocator<T>: Is constructing on unallocated memory in C++ allowed?

So I am relatively new to C++, and I recently encountered the std::allocator class. I understand that this is a powerful tool used in the creation of vectors, lists, deques, etc., and I am trying to learn more about it.
One thing that confuses me is the following:
For example, if we define some allocator<int> denoted alloc, and we use it to allocate n locations in memory via auto const b = alloc.allocate(n), where b is the pointer to the first int element in the allocated memory, then one is also obliged to construct the allocated memory in order to actually access it, right?
If we introduce some iterating pointer auto e = b, then the construction can be performed via alloc.construct(e++, int_obj), where int_obj is some user-initialized object of type int. This is all nice and tame as long as the total number of calls to construct is less than n. However, I am not quite sure what happens when the number of calls to construct exceeds n. I initially expected some warning or error message to rear its ugly head; however, nothing happened. As a simple example, here is the snippet of code I tried to run:
int n{ 0 };                       // Size of the array is initialized to 0.
cin >> n;                         // User reads in the size.
allocator<int> alloc;             // 'alloc' is an object that can allocate ints.
auto const b = alloc.allocate(n); // Pointer to the beginning of the array.
auto e = b;                       // Moving iterator that will point to the end of the array.
for (int i = 0; i != 10; ++i)
    alloc.construct(e++, i);      // Construct 10 elements, regardless of the size n,
                                  // which can in principle be less than 10.
for (auto i = b; i != e; ++i)
    cout << *i << "\t";
Initially I ran this code for n=1, and all worked nicely; it printed the digits from 0 to 9, even though I had allocated space for only one. That is a red flag already, right? Then I changed n to 2, and the program broke after printing digit number four, which is more like what I expected.
What I conclude from this behavior is that trying to construct memory not yet allocated via std::allocator is undefined and unpredictable, and as such should be avoided (which is obvious to begin with). But this seems like an extremely dangerous pitfall, and I wanted to know if there is some built-in safeguard in C++ that will always warn the user when trying to construct unallocated memory.
But this seems like an extremely dangerous pitfall,
It certainly is, and it's your responsibility to avoid this pitfall. That's why it's always recommended to use existing containers first, because they've been carefully written, reviewed and tested to avoid such bugs.
If you do need to handle raw memory directly, it's generally better to write your own container which you can test to a similar standard, rather than interleaving manual memory management with the rest of your program logic.
and I wanted to know if there is some built-in safeguard in C++ that will always warn the user when trying to construct unallocated memory
No, because that necessarily incurs a runtime cost for something that was incorrect when it was written. I don't want my program to run slower because someone else is bad at their job.
However, there are plenty of tools to help you test and diagnose these bugs, for example:
the clang and gcc compilers have address sanitizers, which compile these checks into your program (an optional compiler facility rather than part of the language)
valgrind is an external program whose default tool (memcheck) runs your real program looking for these bugs. It's slow, but it's meant for testing and debugging, not for live programs
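For instance, both tools will flag the following deliberately buggy sketch (heap_overflow.cpp is a hypothetical file name; the -fsanitize=address flag works with both clang and gcc):

// heap_overflow.cpp
// With sanitizer: g++ -fsanitize=address -g heap_overflow.cpp && ./a.out
// With valgrind:  g++ -g heap_overflow.cpp && valgrind ./a.out
int main() {
    int* a = new int[10];
    a[10] = 42;     // one past the end: reported as a heap-buffer-overflow
    delete[] a;
    return 0;
}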

C/C++ Allocation

Given a number X, and reading X numbers into a one-dimensional array, which of the following ways is the best (fastest in execution time)?
Please note that X is a number between 1 and 1000000.
scanf("%d", &x);
int array[x];
//continue reading X numbers into array
Or
scanf("%d", &x);
int array[1000000];
//continue reading X ...
Or
scanf("%d", &x);
int * array = malloc(x*sizeof(int));
//same as above
free(array);
Or the C++ dynamic allocation method?
Note: How could I test what I asked above?
Since scanf appears here (and the comments assume there are another million calls to scanf), any question about memory allocation combined with "which is fastest?" can be universally answered with "yes" (read as: irrelevant).
While automatic storage ("stack allocation") is generally faster than the free store, it is entirely insignificant compared to the time you will spend in scanf. That said, it is usually (not necessarily, but usually) dynamic deallocation that is slow, not allocation.
A couple of points to note in general on that code:
1. Reading an integer from some external source (file, network, argv, whatever) and doing an allocation based on that number without doing a sanity check first is massively bad karma. This is bound to cause a problem one day; it is how many existing real-world exploits came into being. Do not blindly trust that any number you got from somewhere is automatically valid. Even if no malice is involved, accident may still provide an invalid number which will cause catastrophic failure. (A minimal check is sketched after this list.)
2. Allocating a non-constant-sized array on the stack will work under recent versions of C, and will "work" as an extension even under C++ if you use GCC, but it is normally not allowed in C++ (meaning it will fail to compile).
3. Allocating a million integers means roughly 4MB of memory, which is pretty harsh towards your maximum stack size (often only 1MB). Expect a stack overflow.
4. Allocating an unknown number of integers (but expecting the number to be up to a million) is similar to (3).
The worst thing about (3) and (4) is that they may actually succeed. Which possibly means your program will unexpectedly crash later (encountering a stack overflow) in an entirely unrelated, innocent piece of code. And you will wonder why that happens, since the code that crashes looks perfectly valid (and it is, indeed!).
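As referenced in point 1, a minimal sketch of such a sanity check (the MAX_ELEMENTS bound is an assumption; use whatever limit your program can genuinely handle):

#include <cstdio>
#include <cstdlib>

enum { MAX_ELEMENTS = 1000000 };

int main() {
    int x;
    if (scanf("%d", &x) != 1 || x < 1 || x > MAX_ELEMENTS) {
        fprintf(stderr, "invalid element count\n");
        return 1;
    }
    int* array = (int*)malloc(x * sizeof(int)); // heap, not stack: no overflow risk
    if (!array) {
        fprintf(stderr, "out of memory\n");
        return 1;
    }
    // ... read x numbers into array ...
    free(array);
    return 0;
}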
You'll get a compilation error for this code (in standard C++; C99 allows variable-length arrays):

scanf("%d", &x);
int array[x];

x would have to be known at compile time in this case.
When using int array[1000000] you allocate memory on the stack, not on the heap, which is a fundamental difference compared to malloc or the new operator. It is faster because it actually takes only one CPU instruction to adjust the stack pointer.
If comparing malloc and new, malloc will be faster because new eventually calls malloc internally. But the performance gain will be tiny; it isn't worth optimizing your C++ program this way. Just use new when you need dynamic memory in C++.
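As for the note "how could I test what I asked above?": a rough sketch with std::chrono follows. Treat it as a toy; single allocations are noisy to time, and in the real program the scanf calls will dominate anyway.

#include <chrono>
#include <cstdio>
#include <cstdlib>

int main() {
    using clock = std::chrono::steady_clock;
    const int n = 1000000;   // a million ints, as in the question
    const int reps = 1000;

    auto t0 = clock::now();
    for (int r = 0; r < reps; ++r) {
        int* p = (int*)malloc(n * sizeof(int));
        if (!p) return 1;
        p[0] = r;            // touch the block so it isn't optimized away
        free(p);
    }
    auto t1 = clock::now();

    long long ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    std::printf("average malloc+free of %d ints: %lld ns\n", n, ns / reps);
    return 0;
}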

What's the advantage of malloc?

What is the advantage of allocating memory for some data, instead of using an ordinary array? Like:
int *lis;
lis = (int*) malloc(sizeof(int) * n);

/* Initialize LIS values for all indexes */
for (i = 0; i < n; i++)
    lis[i] = 1;
we could have used an ordinary array.
Well, I don't understand exactly how malloc works and what it actually does, so explaining that would be beneficial for me.
And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values; what problems might I be facing? And is there a way to print the values stored directly from the allocated memory, which here is lis?
Your question seems rather to compare dynamically allocated C-style arrays with variable-length arrays, which means this might be what you are looking for: Why aren't variable-length arrays part of the C++ standard?
However, the c++ tag yields the ultimate answer: use a std::vector object instead.
As long as it is possible, avoid dynamic allocation and the responsibility for ugly memory management; try to take advantage of objects with automatic storage duration instead. Another interesting read might be: Understanding the meaning of the term and the concept - RAII (Resource Acquisition Is Initialization)
"And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values, what problems might i be facing?"
- If you still consider n to be the amount of integers that it is possible to store in this array, you will most likely experience undefined behavior.
More fundamentally, I think (apart from the stack-vs-heap and variable-vs-constant issues, and apart from the fact that you shouldn't be using malloc() in C++ to begin with), a local array ceases to exist when the function exits. If you return a pointer to it, that pointer is going to be useless as soon as the caller receives it, whereas memory dynamically allocated with malloc() or new will still be valid. You couldn't implement a function like strdup() using a local array, for instance, or sensibly implement a linked list or tree.
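For instance, a strdup-like function has to hand the copy back to its caller, so the copy must outlive the call; a sketch (my_strdup is a hypothetical name, since strdup itself is POSIX rather than standard C++):

#include <cstdlib>
#include <cstring>

// The copy is malloc'd, so it survives the return; the caller must free() it.
char* my_strdup(const char* s) {
    char* copy = (char*)std::malloc(std::strlen(s) + 1); // +1 for the '\0'
    if (copy != NULL)
        std::strcpy(copy, s);
    return copy; // a pointer to a local array here would dangle instead
}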
The answer is simple. Local[1] arrays are allocated on your stack, which is a small pre-allocated memory area for your program. Beyond a couple of thousand items, you can't really do much on a stack. For larger amounts of data, you need to allocate memory outside your stack.
This is what malloc does.
malloc allocates a piece of memory as big as you ask for. It returns a pointer to the start of that memory, which can be treated much like an array. If you write beyond the size of that memory, the result is undefined behavior. This means everything could work fine, or your computer may explode. Most likely, though, you'd get a segmentation fault.
Reading values from that memory (for example for printing) is the same as reading from an array, for example printf("%d", lis[5]);.
Before C99 (I know the question is tagged C++, but you're probably learning C-compiled-as-C++), there was another reason too: there was no way to have an array of variable length on the stack. (Even now, variable-length arrays on the stack are of limited use, since the stack is small.) That's why, for a variable amount of memory, you needed the malloc function to allocate as much memory as you need, with the size determined at runtime.
Another important difference between local arrays, or any local variable for that matter, is the lifetime of the object. Local variables are inaccessible as soon as their scope ends. malloc'd objects live until they are freed. This is essential in practically all data structures that are not arrays, such as linked lists, binary search trees (and variants), (most) heaps, etc.
An example of malloc'd objects are FILEs. Once you call fopen, the structure that holds the data related to the opened file is dynamically allocated using malloc and returned as a pointer (FILE *).
[1] Non-local arrays (global or static) are allocated before execution, so they can't really have a length determined at runtime.
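To make the lifetime point above concrete, here is a minimal sketch of a linked-list push in the C style of this answer (Node and push_front are illustrative names):

#include <cstdlib>

// Each node is malloc'd so it outlives push_front(); a local Node would
// vanish when the function returned, leaving the list dangling.
struct Node {
    int value;
    Node* next;
};

Node* push_front(Node* head, int value) {
    Node* n = (Node*)std::malloc(sizeof(Node));
    if (n == NULL)
        return head;    // allocation failed; leave the list unchanged
    n->value = value;
    n->next = head;
    return n;           // the new head, valid until free'd
}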
I assume you are asking what the purpose of C's malloc() is:
Say you want to take input from the user and allocate an array of that size:

int n;
scanf("%d", &n);
int arr[n];

This will fail (in standard C++, at least) because n is not available at compile time. Here comes malloc(); you may write:

int n;
scanf("%d", &n);
int* arr = (int*)malloc(sizeof(int) * n); /* the cast is needed in C++, not in C */

malloc() actually allocates memory dynamically in the heap area.
Some older programming environments did not provide malloc or any equivalent functionality at all. If you needed dynamic memory allocation you had to code it yourself on top of gigantic static arrays. This had several drawbacks:
The static array size put a hard upper limit on how much data the program could process at any one time, without being recompiled. If you've ever tried to do something complicated in TeX and got a "capacity exceeded, sorry" message, this is why.
The operating system (such as it was) had to reserve space for the static array all at once, whether or not it would all be used. This phenomenon led to "overcommit", in which the OS pretends to have allocated all the memory you could possibly want, but then kills your process if you actually try to use more than is available. Why would anyone want that? And yet it was hyped as a feature in mid-90s commercial Unix, because it meant that giant FORTRAN simulations that potentially needed far more memory than your dinky little Sun workstation had, could be tested on small instance sizes with no trouble. (Presumably you would run the big instance on a Cray somewhere that actually had enough memory to cope.)
Dynamic memory allocators are hard to implement well. Have a look at the jemalloc paper to get a taste of just how hairy it can be. (If you want automatic garbage collection it gets even more complicated.) This is exactly the sort of thing you want a guru to code once for everyone's benefit.
So nowadays even quite barebones embedded environments give you some sort of dynamic allocator.
However, it is good mental discipline to try to do without. Over-use of dynamic memory leads to inefficiency, of the kind that is often very hard to eliminate after the fact, since it's baked into the architecture. If it seems like the task at hand doesn't need dynamic allocation, perhaps it doesn't.
However however, not using dynamic memory allocation when you really should have can cause its own problems, such as imposing hard upper limits on how long strings can be, or baking nonreentrancy into your API (compare gethostbyname to getaddrinfo).
So you have to think about it carefully.
we could have used an ordinary array
In C++ (as of this writing, at least), arrays have a static size, so creating one from a run-time value:

int lis[n];

is not allowed. Some compilers allow this as a non-standard extension, and runtime-sized arrays were once proposed for standardisation but not adopted; so, if we want a dynamically sized array, we have to allocate it dynamically.
In C, that would mean messing around with malloc; but you're asking about C++, so you want
std::vector<int> lis(n, 1);
to allocate an array of size n containing int values initialised to 1.
(If you like, you could allocate the array with new int[n], and remember to free it with delete [] lis when you're finished, and take extra care not to leak if an exception is thrown; but life's too short for that nonsense.)
Well, I don't understand exactly how malloc works and what it actually does, so explaining that would be beneficial for me.
malloc in C and new in C++ allocate persistent memory from the "free store". Unlike memory for local variables, which is released automatically when the variable goes out of scope, this persists until you explicitly release it (free in C, delete in C++). This is necessary if you need the array to outlive the current function call. It's also a good idea if the array is very large: local variables are (typically) stored on a stack, with a limited size. If that overflows, the program will crash or otherwise go wrong. (And, in current standard C++, it's necessary if the size isn't a compile-time constant).
And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values; what problems might I be facing?
You haven't allocated enough space for n integers; so code that assumes you have will try to access memory beyond the end of the allocated space. This will cause undefined behaviour; a crash if you're lucky, and data corruption if you're unlucky.
And is there a way to print the values stored directly from the allocated memory, which here is lis?
You mean something like this?
for (int i = 0; i < n; ++i) std::cout << lis[i] << '\n';

Memory issues with a very large array in C++

Hi I have the following:
struct myStructure
{
    vector<int> myVector;
};

myStructure myArray[10000000];
As you can see, I have a very large array of vectors. The problem is that I don't have a priori knowledge of the number of elements I need to have in the array, but I know that 10 million elements is the maximum I could have. I have tried two things:
a) make myArray a global array; however, the problem is that I have a function that will access myArray many, many times, which results in memory leaks and the program crashing for large calculations;
b) declare myArray dynamically from within the function that needs to access it; the memory is kept in check, but the program runs about 8 times slower.
Any ideas on how to address this issue? Thanks.
access myArray many, many times, which results in memory leaks and the program crashing for large calculations
You should fix those bugs in any case.
the memory is kept in check but the program runs about 8 times slower
Since you're already using dynamic allocation with an array of vectors it's not immediately obvious why dynamically allocating one more thing would result in such a slowdown. So you should look into this as well.
Then I would go with a vector<vector<int>> that isn't global but has the appropriate lifetime for its uses:
#include <vector>
#include <functional>
#include <algorithm>
using std::vector;

void foo(vector<vector<int>>& v); // the function that accesses the data; defined elsewhere

int main() {
    vector<vector<int>> v;
    for (int i = 0; i < 100; ++i) {
        std::for_each(begin(v), end(v), std::mem_fn(&vector<int>::clear));
        foo(v);
        for (int j = 0; j < 100; ++j) {
            std::for_each(begin(v), end(v), std::mem_fn(&vector<int>::clear));
            foo(v);
            for (int k = 0; k < 100; ++k) {
                std::for_each(begin(v), end(v), std::mem_fn(&vector<int>::clear));
                foo(v);
                for (int l = 0; l < 100; ++l) {
                    std::for_each(begin(v), end(v), std::mem_fn(&vector<int>::clear));
                    foo(v);
                }
            }
        }
    }
}
The best solution I can find is to call the function malloc, which reserves space in heap memory. In the array case you would code something like:

int* myArray = (int*)malloc(sizeof(int) * Len);

After that, don't forget to release the heap memory using free(myArray). It's a powerful tool for making arrays super large.
Declare this structure in an object with a lifetime guaranteed to surpass the objects that access it, and use a reference to access it. Ideally, you should have a class in your hierarchy that calls all the functions dealing with this struct, so these functions may well be members of the class that owns your large array of vectors.
Did you try turning your array of vectors into a vector of vectors? Not knowing how many of an item you will need is what vectors are for, after all.
I believe it would be:
vector<vector<int>> myVecs;
Use a different data structure. I'd suggest trying something like one of the sparse matrix classes from Boost. They are optimised for storing numeric data in which each row or column contains a significant number of zeroes. Mind you, if the problem you're trying to solve isn't suitable for a sparse data structure, it would be a good idea to set out the nature of the problem in greater detail. Take another look at https://stackoverflow.com/questions/how-to-ask even though I guess you already read that.
But before you do that I think you probably have another problem too:
access myArray many, many times, which results in memory leaks and
the program crashing for large calculations
It looks to me, from what you write there, that your code may have some pre-existing bugs, unless your crashes are simply caused by trying to allocate a 10000000-element array as an automatic variable.

Weird Memory Leak with large dynamic array

I have the following code
#include <iostream>
using namespace std;

void foo(int* array); // not shown in the question; defined elsewhere

int main()
{
    int* myDynamicArray;
    myDynamicArray = new int[20000000];

    int numIte;
    cout << "number of iterations" << endl;
    cin >> numIte;

    for (int i = 0; i < numIte; ++i)
        foo(myDynamicArray);

    delete[] myDynamicArray;
    return 0;
}
The thing that I don't understand is that when the number of iterations input is large, the memory used by the system increases as we loop through more iterations. Is that normal?
Since foo is not shown, and because it possibly doesn't make sense to call it without an array index passed in, I'll make a guess. In other words, I'm guessing that the real foo accepts some kind of array index or length as a parameter and that it accesses the elements of myDynamicArray based on that index.
If that is true (and it is not a simple case of foo leaking memory), then what you might be measuring is the amount of memory actually committed. The allocation is for 80MB, but the commit of the memory may not happen until you access the array. So the more of the array is accessed by foo, the more of the memory gets committed.
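If you want to observe this yourself, here's a rough sketch (Linux-flavored; commit behavior is OS-dependent): it allocates the same-sized array and touches it chunk by chunk, pausing so you can watch the resident set size grow in top or /proc/<pid>/status.

#include <cstddef>
#include <cstdio>
#include <cstring>

int main() {
    const std::size_t N = 20000000;       // same ~80MB array as the question
    int* a = new int[N];
    const std::size_t step = N / 10;
    for (std::size_t done = 0; done < N; done += step) {
        std::memset(a + done, 0, step * sizeof(int)); // first write commits the pages
        std::printf("touched %zu of %zu elements; check RSS now\n", done + step, N);
        std::getchar();                   // pause so you can look
    }
    delete[] a;
    return 0;
}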
Without having a full definition for foo, this question is impossible to answer. However, here are some thoughts...
It is probably a good idea to wrap myDynamicArray inside some form of smart pointer, possibly std::auto_ptr (today, std::unique_ptr) or, in case foo might keep a reference to the pointer, std::tr1::shared_ptr (today, std::shared_ptr).
Unless the call to the foo constructor/function is causing additional memory to be allocated, there is no reason why increasing the number of loop iterations should affect the program's runtime memory usage in any way.
Finally, how are you monitoring the runtime memory usage of the program? Watching the numbers within Windows Task Manager (or equivalent) isn't a particularly robust approach. You could try manually tracking all memory allocations yourself (by overriding new/malloc) to get a true idea of when, where, and how much memory is being allocated on the heap.
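As a crude sketch of that last idea, here is a global operator new replacement that counts requested bytes (illustrative only: the alignment and sized-delete overloads are ignored, and since plain operator delete doesn't know the block size, the counter only ever grows):

#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <new>

// Running total of bytes requested through global operator new.
static std::atomic<long long> g_total_allocated{0};

void* operator new(std::size_t size) {
    void* p = std::malloc(size);          // must not use new here, of course
    if (p == NULL) throw std::bad_alloc();
    g_total_allocated += (long long)size;
    return p;
}

void operator delete(void* p) noexcept {
    std::free(p);
}

int main() {
    int* a = new int[1000];               // routed through our operator new
    std::printf("requested so far: %lld bytes\n", g_total_allocated.load());
    delete[] a;
    return 0;
}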