How to demonstrate memory error using arrays in C++ - c++

I'm trying to think of a method demonstrating a kind of memory error using Arrays and C++, that is hard to detect. The purpose is to motivate the usage of STL vector<> in combination with iterators.
Edit: The accepted answer is the answer i used to explain the advantages / disadvantages. I also used: this

Improperly pairing new/delete and new[]/delete[].
For example, using:
int *array = new int[5];
delete array;
instead of:
int *array = new int[5];
delete [] array;
And while the c++ standard doesn't allow for it, some compilers support stack allocating an array:
int stack_allocated_buffer[size_at_runtime];
This could be the unintended side effect of scoping rules (e.g constant shadowed by a member variable)... and it works until someone passes 'size_at_runtime' too large and blows out the stack. Then lame errors ensue.

A memory leak? IMO, vector in combination with iterators doesn't particularly protect you from errors, such as going out of bounds or generally using an invalidated iterator (unless you have VC++ with iterator debugging); rather it is convenient because it implements a dynamically resizable array for you and takes care of memory management (NB! helps make your code more exception-safe).
void foo(const char* zzz)
{
int* arr = new int[size];
std::string s = zzz;
//...
delete[] arr;
}
Above can leak if an exception occurs (e.g when creating the string). Not with a vector.
Vector also makes it easier to reason about code because of its value semantics.
int* arr = new int[size];
int* second_ref = arr;
//...
delete [] arr;
arr = 0; //play it safe :)
//...
second_ref[x] = y;
//...
delete [] second_ref;
But perhaps a vector doesn't automatically satisfy 100% of dynamic array use cases. (For example, there's also boost::shared_array and the to-be std::unique_ptr<T[]>)

I think the utility of std::vector really shows when you need dynamic arrays.
Make one example using std::vector. Then one example using an array to realloc. I think it speaks for itself.

One obvious:
for (i = 0; i < NUMBER_OF_ELEMENTS; ++i)
destination_array[i] = whatever(i);
versus
for (i = 0; i < NUMBER_OF_ELEMENTS; ++i)
destination_vector.push_back(whatever(i));
pointing out that you know the second works, but whether the first works depends on how destination_array was defined.

void Fn()
{
int *p = new int[256];
if ( p != NULL )
{
if ( !InitIntArray( p, 256 ) )
{
// Log error
return;
}
delete[] p;
}
}
You wouldn't BELIEVE how often I see that. A classic example of where any form of RAII is useful ...

I would think the basic simplicity of using vectors instead of dynamic arrays is already convincing.
You don't have to remember to delete your memory...which is not so simple since attempts to delete it might be bypassed by exceptions and whatnot.
If you want to do dynamic arrays yourself, the safest way to do it in C++ is to wrap them in a class and use RAII. But vectors do that for you. That's kind of the point, actually.
Resizing is done for you.
If you need to support arbitrary types, you don't have to do any extra work.
A lot of algorithms are provided which are designed to handle containers, both included and by other users.
You can still use functions that need arrays by passing the vector's underlying array if necessary; The memory is guaranteed to be contiguous by the standard, except with vector<bool> (google says as of 2003, see 23.2.4./1 of the spec).
Using an array yourself is probably bad practice in general, since you will be re-inventing the wheel...and your implementation will almost definitely be much worse than the existing one...and harder for other people to use, since they know about vector but not your weird thing.
With a dynamic array, you need to keep track of the size yourself, grow it when you want to insert new elements, delete it when it is no longer needed...this is extra work.
Oh, and a warning: vector<bool> is a dirty rotten hack and a classic example of premature optimization.

Why don't you motivate it based on the algorithms that the STL provides?

In raw array, operator[] (if I may call so) is susceptible to index-out-of-bound problem. With vector it is not (There is at least a run time exception).
Sorry, I did not read the question carefully enough. index-out-of-bound is a problem, but not a memory error.

Related

Reallocation of a struct array

I would like to ask you how to reallocate a struct array in C++?
In C there is realloc which is quite good, but it is not recommended to use it in C++. Maybe some of you would tell me that I should not use a struct array?
Well, in this task we cannot use any STL containers, so struct is the only option, I suppose. It is for the matter of practice with allocation, reallocation of memory and other things...
In the example bellow I wrote a code how I would do it in C by using malloc and realloc. Can you give me an advice how to do it in C++.
Thanks.
class CCompany
{
public:
CCompany();
bool NewAccount(const char * accountID, int initialBalance);
struct ACCOUNT
{
char *accID;
int initialBalance;
...
};
ACCOUNT* accounts ;
...
...
private:
int ReallocationStep = 100;
int accountCounter = 1;
int allocatedAccounts = 100;
...
}
CCompany::CCompany()
{
accounts = (ACCOUNT*)malloc(allocatedItems*sizeof(*accounts));
}
bool CCompany::NewAccount(const char * accountID, int initialBalance)
{
// Firstly I check if there is already an account in the array of struct. If so, return false.
...
// Account is not there, lets check if there is enough memory allocated.
if (accountCounter == allocatedAccounts)
{
allocatedAccounts += ReallocationStep;
accounts = (ACCOUNT *) realloc(accounts, allocatedAccounts * sizeof(*accounts));
}
// Everything is okay, we can add it to the struct array
ACCOUNT account = makeStruct(accID, initialBalance);
accounts[CounterAccounts] = account;
return true;
}
If You have no possibility to use STL containers, maybe You should consider using some sort of list instead of array. Basing on Your code, this could be better solution than reallocating memory over and over the time.
Personally I don't think that realloc is not recommended in C++, yet for many uses of malloc, realloc, free there are other concepts in C++ (like new, placement new, delete, ...), shifting the semantics more on "objects" rather than on "plain memory".
So it is still valid to use the realloc-approach as you did; And - if dynamic data structures like linked lists are not a choice - actually the realloc-metaphor is the best I can think of, because it avoids unnecessarily copying, deleting, recreating items again and again while still providing a continuous block of memory holding all the objects.
According to other questions+answers(1, 2), you should avoid using malloc and realloc in C++ where possible.
The latter of those two references gives a good suggestion: If you're not allowed to use std::vector due to it being an STL container, perhaps std::fstream might be worth looking into as an alternative. This would suggest that working with files without relying upon excess working memory could be the intended goal of the assessment task. I can't see the assignment criteria, so I can't say whether or not this would be compliant.
Even with an assignment criteria on your side, some lecturers like to change requirements with little or no notice; in fact, sometimes just seeing a solution to the assignment that isn't what they had in mind will (unfairly) prompt such a modification. Any assessment that prompts you to reinvent std::vector seems silly to me, but if you have two options, and only one of them involves staying in your degree, I think your only solution will be to use realloc; there's no need for malloc here.
To reduce the overhead of calling realloc so often (as pointed out by another answer), you could remove two of your three counters, call realloc when the remaining counter is about to become a power of two, and reallocate by a factor of two like I did in push_back:
void *push_back(void **base, void const *value, size_t *nelem, size_t size) {
typedef unsigned char array[size];
array *b = *base;
if (SIZE_MAX / sizeof *b <= *nelem) {
return NULL;
}
if (*nelem & -~*nelem ? 0 : 1) {
b = realloc(b, (*nelem * 2 + 1) * sizeof *b);
if (!b) {
return NULL;
}
*base = b;
}
b += (*nelem)++;
return value
? memmove(b, value, sizeof *b)
: b;
}
The correct C++ way would be to use a std::vector which can deal nicely with reallocations. As your assignment do not allow you to use standard containers, you can:
either build a custom container using new and delete for reallocation and based on an array or a linked list
or directly use an array and stick to new and delete for reallocations - still acceptable C++
or revert to the good old malloc and realloc from the C standard library which is included in the C++ standard library. But you must be aware that this will not initialize the structs.
Because malloc/realloc would not call constructors, the last way must be seen as a low level optimization and the no initialization should be explicetely documented.

How to properly work with dynamically-allocated multi-dimensional arrays in C++ [duplicate]

This question already has answers here:
How do I declare a 2d array in C++ using new?
(29 answers)
Closed 7 years ago.
How do I define a dynamic multi-dimensional array in C++? For example, two-dimensional array? I tried using a pointer to pointer, but somehow it is failing.
The first thing one should realize that there is no multi-dimensional array support in C++, either as a language feature or standard library. So anything we can do within that is some emulation of it. How can we emulate, say, 2-dimensional array of integers? Here are different options, from the least suitable to the most suitable.
Improper attempt #1. Use pointer to pointer
If an array is emulated with pointer to the type, surely two-dimensional array should be emulated with a pointer to pointer to the type? Something like this?
int** dd_array = new int[x][y];
That's a compiler error right away. There is no new [][] operator, so compiler gladly refuses. Alright, how about that?
int** dd_array = new int*[x];
dd_array[0][0] = 42;
That compiles. When being executed, it crashes with unpleasant messages. Something went wrong, but what? Of course! We did allocate the memory for the first pointer - it now points to a memory block which holds x pointers to int. But we never initialized those pointers! Let's try it again.
int** dd_array = new int*[x];
for (std::size_t i = 0; i < x; ++i)
dd_array[i] = new int[y];
dd_array[0][0] = 42;
That doesn't give any compilation errors, and program doesn't crash when being executed. Mission accomplished? Not so fast. Remember, every time we did call a new, we must call a delete. So, here you go:
for (std::size_t i = 0; i < x; ++i)
delete dd_array[i];
delete dd_array;
Now, that's just terrible. Syntax is ugly, and manual management of all those pointers... Nah. Let's drop it all over and do something better.
Less improper attempt #2. Use std::vector of std::vector
Ok. We know that in C++ we should not really use manual memory management, and there is a handy std::vector lying around here. So, may be we can do this?
std::vector<std::vector<int> > dd_array;
That's not enough, obviously - we never specified the size of those arrays. So, we need something like that:
std::vector<std::vector<int> > dd_array(x);
for(auto&& inner : dd_array)
inner.resize(y);
dd_array[0][0] = 42;
So, is it good now? Not so much. Firstly, we still have this loop, and it is a sore to the eye. What is even more important, we are seriously hurting performance of our application. Since each individual inner vector is independently allocated, a loop like this:
int sum = 0;
for (auto&& inner : dd_array)
for (auto&& data : inner)
sum += data;
will cause iteration over many independently allocated inner vectors. And since CPU will only cache continuous memory, those small independent vectors cann't be cached altogether. It hurts performance when you can't cache!
So, how do we do it right?
Proper attempt #3 - single-dimensional!
We simply don't! When situation calls for 2-dimensional vector, we just programmatically use single-dimensional vector and access it's elements with offsets! This is how we do it:
vector<int> dd_array(x * y);
dd_array[k * x + j] = 42; // equilavent of 2d dd_array[k][j]
This gives us wonderful syntax, performance and all the glory. To make our life slightly better, we can even build an adaptor on top of a single-dimensional vector - but that's left for the homework.

Avoiding copy when writing array to vector

But if you want to prevent the 5000 copy (or more) of some potentially ugly heavy objects. Just for performance and ignore readable code you can do this
#include <iostream>
#include <vector>
int main(int argc, char* argv[])
{
std::vector<int> v;
int* a = (int*) malloc(10*sizeof(int));
//int a[10];
for( int i = 0; i<10;++i )
{
*(a + i) = i;
}
delete v._Myfirst; // ? not sure
v._Myfirst = a;
for( int i = 0; i<10;++i )
{
std::cout << v[i] << std::endl;
}
system("PAUSE");
return 0;
}
simply replace the _Myfirst underlying "array". but be very careful, this will be deleted by the vector.
Does my delete/free of my first fail in some cases?
Does this depend on the allocator?
Is there a swap for that? There is the brand new swap for vectors, but how about from an array?
I had to interface with a code I cannot modify (political reasons), which provides me with arrays of objects and have to make them vector of objects.
The pointer seems the right solution, but it's a lot of memory jumps and I have to perform the loop anyway. I was just wondering if there was some way to avoid the copy and tell the vector, "now this is your underlying array".
I agree that the above is highly ugly and this is why I actually reserve and push in my real code. I just wanted to know if there was a way to avoid the loop.
Please don't answer with a loop and a push back, as obviously that's what my "real" code is already doing. If there is no function to perform this swap can we implement a safe one?
Question 1: does my delete/free of my first fail in some cases ?
Yes. You do not know how the memory for _Myfirst is allocated (apart from it uses the allocator). The standard does not specify that that the default allocator should use malloc to allocate the memory so you do not know if delete will work.
Also you are mixing allocation schemes.
You are allocating with malloc(). But expecting the std::vector<> to allocate with new (because you are calling delete).
Also even if the allocator did use new it would be uisng the array version of new and thus you would need to use the array version of delete to reclaim the memory.
Question 2: does this depends on allocator ?
Yes.
Question 3: is there a swap for that ? there is the brand new swap for vectors, but from an array ?
No.
To prevent expensive copies. Use emplace_back().
std::vector<int> v;
v.reserve(10); // reserve space for 10 objects (min).
for( int i = 0; i<10;++i )
{
v.emplace_back(i); // construct in place in the array (no copy).
}
std::vector<SomeUglyHeavyObject*> vect(5000, NULL);
for (int i=0; i<5000; i++)
{
vect[i] = &arr[i];
}
There, avoids copying and doesn't muck with std::vector internals.
If you don't like the way the vector behaves, don't use a vector. Your code depends on implementation details that you MUST not rely upon.
Yes it's ugly as you said. Make it pretty by using the correct container.
Which container you use depends on what operations you need on the container. It may be something as simple as a C-style array will suit your needs. It may be a vector of pointers, or shared pointers if you want the vector to manage the lifetime of the HeavyObjects.
In response to the need to increase the size of collection (see comments below):
std::vector guarantees that the underlying storage is contiguous, so if you increase the size, it will allocate a new larger buffer and copy the old one into the new (using the copy constructor for the objects if one exists.) Then swap the new buffer into place -- freeing the old one using its own allocator.
This will trash your memory unless the buffer you brute forced into the vector was allocated in a manner consistent with the vector's allocator.
So if your original array is big enough, all you need is a "maxUsed" size_t to tell you how big the collection is and where to put new entries. If it's not big enough, then your technique will either fail horribly, or incur the copy costs you are trying to avoid.

Dedicated function for memory allocation causes memory leak?

Hy all,
I believe that the following piece of code is generating memory leak?
/* External function to dynamically allocate a vector */
template <class T>
T *dvector(int n){
T *v;
v = (T *)malloc(n*sizeof(T));
return v;
}
/* Function that calls DVECTOR and, after computation, frees it */
void DiscontinuousGalerkin_Domain::computeFaceInviscidFluxes(){
int e,f,n,p;
double *Left_Conserved;
Left_Conserved = dvector<double>(NumberOfProperties);
//do stuff with Left_Conserved
//
free(Left_Conserved);
return;
}
I thought that, by passing the pointer to DVECTOR, it would allocate it and return the correct address, so that free(Left_Conserved) would successfully deallocate. However, it does not seem to be the case.
NOTE: I have also tested with new/delete replacing malloc/free without success either.
I have a similar piece of code for allocating a 2-D array. I decided to manage vectors/arrays like that because I am using them a lot, and I also would like to understand a bit deeper memory management with C++.
So, I would pretty much like to keep an external function to allocate vectors and arrays for me. What's the catch here to avoid the memory leak?
EDIT
I have been using the DVECTOR function to allocate user-defined types as well, so that is potentially a problem, I guess, since I don't have constructors being called.
Even though in the piece of code before I free the Left_Conserved vector, I also would like to otherwise allocate a vector and left it "open" to be assessed through its pointer by other functions. If using BOOST, it will automatically clean the allocation upon the end of the function, so, I won't get a "public" array with BOOST, right? I suppose that's easily fixed with NEW, but what would be the better way for a matrix?
It has just occurred me that I pass the pointer as an argument to other functions. Now, BOOST seems not to be enjoying it that much and compilation exits with errors.
So, I stand still with the need for a pointer to a vector or a matrix, that accepts user-defined types, that will be passed as an argument to other functions. The vector (or matrix) would most likely be allocated in an external function, and freed in another suitable function. (I just wouldn't like to be copying the loop and new stuff for allocating the matrix everywhere in the code!)
Here is what I'd like to do:
template <class T>
T **dmatrix(int m, int n){
T **A;
A = (T **)malloc(m*sizeof(T *));
A[0] = (T *)malloc(m*n*sizeof(T));
for(int i=1;i<m;i++){
A[i] = A[i-1]+n;
}
return A;
}
void Element::setElement(int Ptot, int Qtot){
double **MassMatrix;
MassMatrix = dmatrix<myT>(Ptot,Qtot);
FillInTheMatrix(MassMatrix);
return;
}
There is no memory leak there, but you should use new/delete[] instead of malloc/free. Especially since your function is templated.
If you ever want to to use a type which has a non-trivial constructor, your malloc based function is broken since it doesn't call any constructors.
I'd replace "dvector" with simply doing this:
void DiscontinuousGalerkin_Domain::computeFaceInviscidFluxes(){
double *Left_Conserved = new double[NumberOfProperties];
//do stuff with Left_Conserved
//
delete[] Left_Conserved;
}
It is functionally equivalent (except it can call constructors for other types). It is simpler and requires less code. Plus every c++ programmer will instantly know what is going on since it doesn't involve an extra function.
Better yet, use smart pointers to completely avoid memory leaks:
void DiscontinuousGalerkin_Domain::computeFaceInviscidFluxes(){
boost::scoped_array<double> Left_Conserved(new double[NumberOfProperties]);
//do stuff with Left_Conserved
//
}
As many smart programmers like to say "the best code is the code you don't have to write"
EDIT: Why do you believe that the code you posted leaks memory?
EDIT: I saw your comment to another post saying
At code execution command top shows
allocated memory growing
indefinitely!
This may be completely normal (or may not be) depending on your allocation pattern. Usually the way heaps work is that they often grow, but don't often shrink (this is to favor subsequent allocations). Completely symmetric allocations and frees should allow the application to stabilize at a certain amount of usage.
For example:
while(1) {
free(malloc(100));
}
shouldn't result in continuous growth because the heap is highly likely to give the same block for each malloc.
So my question to you is. Does it grow "indefinitely" or does it simply not shrink?
EDIT:
You have asked what to do about a 2D array. Personally, I would use a class to wrap the details. I'd either use a library (I believe boost has a n-dimmentional array class), or rolling your own shouldn't be too hard. Something like this may be sufficient:
http://www.codef00.com/code/matrix.h
Usage goes like this:
Matrix<int> m(2, 3);
m[1][2] = 10;
It is technically more efficient to use something like operator() for indexing a matrix wrapper class, but in this case I chose to simulate native array syntax. If efficiency is really important, it can be made as efficient as native arrays.
EDIT: another question. What platform are you developing on? If it is *nix, then I would recommend valgrind to help pinpoint your memory leak. Since the code you've provided is clearly not the problem.
I don't know of any, but I am sure that windows also has memory profiling tools.
EDIT: for a matrix if you insist on using plain old arrays, why not just allocate it as a single contiguous block and do simple math on indexing like this:
T *const p = new T[width * height];
then to access an element, just do this:
p[y * width + x] = whatever;
this way you do a delete[] on the pointer whether it is a 1D or 2D array.
There is no visible memory leak, however there is a high risk for a memory leak with code like this. Try to always wrap up resources in an object (RAII).
std::vector does exactly what you want :
void DiscontinuousGalerkin_Domain::computeFaceInviscidFluxes(){
int e,f,n,p;
std::vector<double> Left_Conserved(NumOfProperties);//create vector with "NumOfProperties" initial entries
//do stuff with Left_Conserved
//exactly same usage !
for (int i = 0; i < NumOfProperties; i++){//loop should be "for (int i = 0; i < Left_Conserved.size();i++)" .size() == NumOfProperties ! (if you didn't add or remove any elements since construction
Left_Conserved[i] = e*f + n*p*i;//made up operation
}
Left_Conserved.push_back(1.0);//vector automatically grows..no need to manually realloc
assert(Left_Conserved.size() == NumOfProperties + 1);//yay - vector knows it's size
//you don't have to care about the memory, the Left_Conserved OBJECT will clean it up (in the destructor which is automatically called when scope is left)
return;
}
EDIT: added a few example operations. You really should read about STL-containers, they are worth it !
EDIT 2 : for 2d you can use :
std::vector<std::vector<double> >
like someone suggested in the comments. but usage with 2d is a little more tricky. You should first look into the 1d-case to know what's happening (enlarging vectors etc.)
No, as long as you aren't doing anything drastic between the call to your dvector template and the free, you aren't leaking any memory. What tells you there is a memory leak?
May I ask, why you chose to create your own arrays instead of using STL containers like vector or list? That'd certainly save you a lot of trobule.
I don't see memory leak in this code.
If you write programs on c++ - use new/delete instead malloc/free.
For avoid possible memory leaks use smart pointers or stl containers.
What happens if you pass a negative value for n to dvector?
Perhaps you should consider changing your function signature to take an unsigned type as the argument:
template< typename T >
T * dvector( std::size_t n );
Also, as a matter of style, I suggest always providing your own memory release function any time you are providing a memory allocation function. As it is now, callers rely on knowledge that dvector is implemented using malloc (and that free is the appropriate release call). Something like this:
template< typename T >
void dvector_free( T * p ) { free( p ); }
As others have suggested, doing this as an RAII class would be more robust. And finally, as others have also suggested, there are plenty of existing, time-tested libraries to do this so you may not need to roll your own at all.
So, some important concepts discussed here helped me to solve the memory leaking out in my code. There were two main bugs:
The allocation with malloc of my user-defined types was buggy. However, when I changed it to new, leaking got even worse, and that's because one my user-defined types had a constructor calling an external function with no parameters and no correct memory management. Since I called that function after the constructor, there was no bug in the processing itself, but only on memory allocation. So new and a correct constructor solved one of the main memory leaks.
The other leaking was related to a buggy memory-deallocation command, which I was able to isolate with Valgrind (and a bit a patience to get its output correctly). So, here's the bug (and, please, don't call me a moron!):
if (something){
//do stuff
return; //and here it is!!! =P
}
free();
return;
And that's where an RAII, as I understood, would avoid misprogramming just like that. I haven't actually changed it to a std::vector or a boost::scoped_array coding yet because it is still not clear to me if a can pass them as parameter to other functions. So, I still must be careful with delete[].
Anyway, memory leaking is gone (by now...) =D

How to handle passing runtime-sized arrays between classes in C++

Right now I have a simple class that handles the parsing of XML files into ints that are useful to me. Looks something like this:
int* DataParser::getInts(){
*objectNumbers = new int[getSize()];
for (int i=0;i<getSize();i++){
objectNumbers[i]=activeNode->GetNextChild()->GetContent();
}
return objectNumbers;
}
In the main part of the program, I receive this by doing:
int* numbers= data->getInts();
///Do things to numbers[]
delete numbers;
Everything works fine until the delete command, which crashes everything. What is the proper way of doing this?
Part of the problem is that you are not pairing new[] with delete[]. This probably isn't the root of your bug here but you should get in the habbit of doing this.
The bug is almost certainly related to the code that you left commented out. Can you add some more context there so we can see what you're doing with the numbers value?
In general, I find it's much easier to use a vector for this type of problem. It takes the memory management out of the equation and has the added benefit of storing the size with the dynamic memory.
void DataParser::getInts(std::vector<int>& objectNumbers){
for (int i=0;i<getSize();i++){
objectNumbers.push_back(activeNode->GetNextChild()->GetContent());
}
}
...
std::vector<int> numbers;
data.getInts(numbers);
You need to
delete [] numbers;
The rules is whenever you
ptr = new Type[...];
make sure you
delete [] ptr;
instead of the regular
delete ptr;
which will result in undefined behavior (thanks Neil Butterworth) and is intended for deletion of a single instance where ptr points, not an array.
The following line:
*objectNumbers = new int[getSize()];
What does it do? If you are returning objectNumbers, this is a pointer to int, and you should really be doing:
objectNumbers = new int[getSize()];
Anyway, C++ gives you collections (vector, list etc) -- I'd have used one of those instead of a plain array. As noted elsewhere it is important to match your new with a delete and a new [] with a delete [].
Passing arrays around isn't good design -- you are making the implementation public. Try to pass iterators to the begining and end of the array/collection/sequence of ints instead following the STL design.
simply use a std::vector instead;
std::vector<int> DataParser::getInts(){
std::vector<int> objectNumbers(getSize());
for (int i=0;i<getSize();i++){
objectNumbers[i]=activeNode->GetNextChild()->GetContent();
}
return objectNumbers;
}
You will quickly run into trouble and maintenance issues. Consider using std::vector, this is the proper way to do it.