Dynamic memory allocation in C++ and performance improvements - c++

Good evening,
I'm learning C++ for scientific code development. Unlike the learning strategy I used for Python and MATLAB, I'm trying to learn C++ the old, deep way (using a book, ref [1]). I'm struggling to understand pointers and dynamic memory allocation.
I came across this exercise and some doubts appeared. The exercise is a simple 2x2 matrix multiplication. The code:
#include <iostream>
//Perform C[2][2] = A[2][2]*B[2][2] using dynamic memory allocation
int main(){
    //declaring the dimensions of the matrices
    int row, col;
    row = 2;
    col = 2;
    //Allocating memory for matrices A, B and C [heap ?]
    double **A, **B, **C;
    A = new double* [row];
    B = new double* [row];
    C = new double* [row];
    for (int i=0; i<row; ++i){
        A[i] = new double [col]();   // () value-initializes the row to 0.0
        B[i] = new double [col]();
        C[i] = new double [col]();
    }
    //Performing calculation
    for (int i=0; i<2; ++i){
        for (int j=0; j<2; ++j){
            A[i][j] += 10;
            B[i][j] += 20;
            C[i][j] += A[i][j]*B[i][j];
            std::cout << C[i][j] << " ";
        }
        std::cout << std::endl;
    }
    //Deleting memory allocation [Is there a memory leak ? (No *=NULL)]
    for (int i=0; i<2; ++i){
        delete[] A[i];
        delete[] B[i];
        delete[] C[i];
    }
    delete[] A;
    delete[] B;
    delete[] C;
    return 0;
}
My doubts are:
1 - I read about the memory divisions (global/stack/heap). Where are they located in the hardware? HD, RAM or CPU cache?
2 - When I allocate an array:
int arrayS[2]; //Is this on the stack ?
int* arrayH;   //Is this on the heap ?
arrayH = new int [2];
3 - In my solution of the matrix problem, is there a memory leak or garbage creation? (Note that I didn't set the pointers to NULL to get rid of the addresses.)
4 - Do you suggest any way to improve my code's performance and efficiency?
Thank you!

1) The global area, stack and heap are all located within your application's CPU address space, which means they're in RAM, subject to virtual memory paging (in which case they may go and live on the HD for a while) and CPU caching (in which case they may go and live on the CPU for a while). But both of those things are transparent as far as the generated code is concerned.
2) Yes, arrayS[2] will be on the stack (unless it's global, of course), and anything returned by new (or malloc, should you include some C code) is on the heap.
3) I can't see any leaks, but if you're going to use row and col rather than repeating the magic constant 2 all over the place, then do so uniformly and mark them as const so that you can't accidentally modify them.
4) In terms of cache efficiency, it may be better to do one allocation as new double[row * col] and then either spread your pointers out through that block or index as [i*col + j], assuming i indexes rows and all rows are col items long (see the sketch after this answer). new is otherwise permitted to spread your rows out wherever it likes across memory, which is more likely to lead to cache misses.
As to style? Well, you didn't ask, but I think cdhowie's comment is valid. Also, Deduplicator's point should be read more deeply: the STL has a bunch of pieces to help make sure you don't leak memory. Read up on unique_ptr and shared_ptr.
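To make the single-allocation idea in point 4 concrete, here is a minimal sketch; it mirrors the element-wise computation from the question and keeps the 2x2 sizes as constants, so take it as an illustration rather than a drop-in replacement:

#include <iostream>

int main() {
    const int row = 2, col = 2;

    // One contiguous block per matrix; element (i, j) lives at index i * col + j.
    double* A = new double[row * col]();   // () value-initializes every element to 0.0
    double* B = new double[row * col]();
    double* C = new double[row * col]();

    for (int i = 0; i < row; ++i) {
        for (int j = 0; j < col; ++j) {
            A[i * col + j] += 10;
            B[i * col + j] += 20;
            C[i * col + j] += A[i * col + j] * B[i * col + j];
            std::cout << C[i * col + j] << " ";
        }
        std::cout << std::endl;
    }

    // One delete[] per new[]; no per-row loop needed.
    delete[] A;
    delete[] B;
    delete[] C;
    return 0;
}

With this layout each matrix is one cache-friendly block, and cleanup shrinks to a single delete[] per matrix.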

I read about the memory divisions (global/stack/heap). Where are they located in the hardware? HD, RAM or CPU cache?
They can all be stored wherever your particular C++ program decides that they should be. Some local variables are likely to exist only in registers, if they have a short life span and never have a pointer taken to them.
Odds are that everything else will wind up somewhere in system RAM (which can be paged out to disk, if your process is unlucky). The difference is how they are used by the program, not where they happen to be. Where they are is an implementation detail that you don't really need to worry about.
when I allocate an array
Yes, your analysis of stack vs. heap there is correct. Your first one (int arrayS[2];) has the caveat that if you declare this variable as part of a class or struct, it could exist in either place, depending on how the class/struct gets created. (If the class/struct gets created on the heap, the array would be on the heap; if it gets created on the stack, then on the stack.)
And of course if it's a global then it has global storage.
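A small sketch of that caveat (Holder is just an illustrative name):

struct Holder {
    int arrayS[2];   // lives wherever the Holder object lives
};

int main() {
    Holder onStack;               // onStack.arrayS is on the stack
    Holder* onHeap = new Holder;  // onHeap->arrayS is on the heap
    delete onHeap;
    return 0;
}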
Do you suggest any way to improve my code's performance and efficiency?
Encapsulate the concept of a matrix inside of a class, and use std::vector instead of allocating your own arrays. The amount of indirection will be the same after compiler optimization, but you no longer have to concern yourself with managing the memory used by the arrays.
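A minimal sketch of what that encapsulation could look like, assuming a row-major layout backed by a single std::vector (the class name and interface here are illustrative):

#include <cstddef>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t i, std::size_t j)       { return data_[i * cols_ + j]; }
    double  operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;   // the vector releases its memory itself; no delete[] needed
};

Usage would be along the lines of Matrix C(2, 2); C(0, 1) = 42.0; and there is nothing to free by hand.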

In C++ an object can live either on the stack or on the heap. In your example, int arrayS[2] is located on the stack of the function in which it was declared. If you want to create an object on the heap (memory that is available to every function in the program), you must use the new operator. The pointer that holds the address of the allocated object (in your case arrayH) is itself pushed onto the stack of the function in which it is declared.
Roughly speaking, the memory of a process (a program being executed) is divided into three parts:
The code area (where the program's machine code is located);
The stack area (it starts at the location the stack pointer points to in memory; the stack pointer is a register that always holds the address of the current stack frame, and a program can have many frames live at once, depending on how deeply it recurses);
The data area (the global memory accessible from the entire program; in C++ it is populated with the new operator or with global variables).
There is also shared memory, which allows the programmer to allocate memory that is available to more than one process.
There is no memory leak in your code. But C++ does provide smart pointers, which are loosely similar to the garbage collector in C# and Java.
Also, C++ is an OOP language, so you should definitely consider writing your code using classes, inheritance, polymorphism, etc.


How does dynamic memory allocation allocate memory at run time?

int a[10];
The above code will create an array of ten ints, so the program will be able to store only 10 integers.
Now consider the following commands
int *a,*b,*c,*d;
a= (int *)malloc(sizeof(int));
b= (int *)malloc(sizeof(int));
c= (int *)malloc(sizeof(int));
d= (int *)malloc(sizeof(int));
The above part of the code will create four int pointers and allocate each of them a block of memory of int size.
I learnt that dynamic memory allocation allocates memory at run time.
I want to know this: whether I use an array or malloc (dynamic memory allocation), the user still gets only a fixed number of int-sized slots to store values in. If we set aside the fact that the second version goes through a pointer, what is the use of dynamic memory allocation? In both cases the user gets only the number of ints written into the source, and to get more he would need to change the source code. So why do we use malloc or dynamic memory allocation?
Consider
int a,*b;
cin >> a;
b= (int *)malloc(a*sizeof(int));
The user types a number a and gets a ints. The value of a is not known to the programmer or to the compiler here.
As pointed out in the comments, this is still bad style in C++; use std::vector if possible. Even new is better than malloc. But I hope the (bad) example helps to clarify the basic idea behind dynamic memory allocation.
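For comparison, a minimal C++ sketch of the same idea with std::vector (the reading loop and the final print are just illustrative):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::size_t a;
    std::cin >> a;

    std::vector<int> b(a);                // a ints, sized at run time
    for (std::size_t i = 0; i < b.size(); ++i)
        std::cin >> b[i];

    std::cout << b.size() << " values stored\n";
    return 0;                             // b frees its memory automatically here
}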
You're right that it's all just memory. But there is a difference in usage.
In the general case, you don't necessarily know ahead of time the amount of memory you will need, nor the time when such memory can be safely released. malloc and its friends are written so that they can keep track of memory used this way.
But in many special cases, you happen to know ahead of time how much memory you will need and when you will stop needing it. For example, you know you need a single integer to act as a loop counter when running a simple loop and you'll be done with it once the loop has finished executing. While malloc and its friends can still work for you here, local variables are simpler, less error prone and will likely be more efficient.
int a[10];
The above line of code will allocate an array of 10 ints with automatic storage duration, if it is within a local scope.
int *a,*b,*c,*d;
The above, however, will allocate 4 pointers to int, also with automatic storage duration, likewise if it is within a local scope.
a= (int *)malloc(sizeof(int));
b= (int *)malloc(sizeof(int));
c= (int *)malloc(sizeof(int));
d= (int *)malloc(sizeof(int));
And finally, the above will dynamically allocate one int per pointer. So every pointer above will be pointing to a single int variable.
Do note that dynamically allocated memory can be freed and resized at runtime, unlike static memory allocation. Memory with automatic storage duration is freed when it goes out of scope, but cannot be resized.
If you program in C, casting the result of malloc is unnecessary.
I suggest you read this: Do I cast the result of malloc?
Also, what you're doing in your code with the 4 pointers is unnecessary; in fact you can just allocate an array of 4 ints with a single malloc:
int *a;
a = malloc(4 * sizeof(int));

What if I delete an array once in C++, but allocate it multiple times?

Suppose I have the following snippet.
#include <iostream>
#include <cstring>   // for memset
using namespace std;

int main()
{
    int num;
    int* cost;
    while (cin >> num)
    {
        int sum = 0;
        if (num == 0)
            break;
        // Dynamically allocate the array and set to all zeros
        cost = new int [num];
        memset(cost, 0, num);
        for (int i = 0; i < num; i++)
        {
            cin >> cost[i];
            sum += cost[i];
        }
        cout << sum/num;
    }
    delete[] cost;
    return 0;
}
Although I can move the delete statement inside the while loop in my code, for understanding purposes I want to know what happens with the code as it's written. Does C++ allocate a different memory space each time I use operator new?
Does operator delete only delete the last allocated cost array?
Does C++ allocate different memory spaces each time I use operator new?
Yes.
Does operator delete only delete the last allocated cost array?
Yes.
You've lost the only pointers to the others, so they are irrevocably leaked. To avoid this problem, don't juggle pointers, but use RAII to manage dynamic resources automatically. std::vector would be perfect here (if you actually needed an array at all; your example could just keep reading and re-using a single int).
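A minimal sketch of the same loop with std::vector doing the memory management (behaviour otherwise kept as in the question, including the integer division):

#include <iostream>
#include <vector>

int main()
{
    int num;
    while (std::cin >> num && num != 0)
    {
        std::vector<int> cost(num, 0);   // allocated and zeroed each iteration, freed automatically
        int sum = 0;
        for (int i = 0; i < num; i++)
        {
            std::cin >> cost[i];
            sum += cost[i];
        }
        std::cout << sum / num;
    }
    return 0;
}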
I strongly advise you not to use "C idioms" in a C++ program. Let the std library work for you: that's why it's there. If you want "an array (vector) of n integers," then that's what std::vector is all about, and it "comes with batteries included." You don't have to monkey-around with things such as "setting a maximum size" or "setting it to zero." You simply work with "this thing," whose inner workings you do not [have to ...] care about, knowing that it has already been thoroughly designed and tested.
Furthermore, when you do this, you're working within C++'s existing framework for memory management. In particular, you're not doing anything "out-of-band" within your own application, "that the standard library doesn't know about, and which might (!!) mess it up."
C++ gives you a very comprehensive library of fast, efficient, robust, well-tested functionality. Leverage it.
There is no cost array in your code. In your code cost is a pointer, not an array.
The actual arrays in your code are created by repeated new int [num] calls. Each call to new creates a new, independent, nameless array object that lives somewhere in dynamic memory. The new array, once created by new[], is accessible through the cost pointer. Since the array is nameless, that cost pointer is the only link you have to that nameless array created by new[]. You have no other means to access it.
And every time you do that cost = new int [num] in your cycle, you are creating a completely new, different array, breaking the link from cost to the previous array and making cost point to the new one.
Since cost was your only link to the old array, that old array becomes inaccessible. Access to that old array is lost forever. It becomes a memory leak.
As you correctly stated it yourself, your delete[] expression only deallocates the last array - the one cost ends up pointing to in the end. Of course, this is only true if your code ever executes the cost = new int [num] line. Note that your cycle might terminate without doing a single allocation, in which case you will apply delete[] to an uninitialized (garbage) pointer.
Yes. So you get a memory leak for each iteration of the loop except the last one.
When you use new, you allocate a new chunk of memory. Assigning the result of the new to a pointer just changes what this pointer points at. It doesn't automatically release the memory this pointer was referencing before (if there was any).
First off this line is wrong:
memset(cost, 0, num);
It assumes an int is only one char long. More typically it's four. You should use something like this if you want to use memset to initialise the array:
memset(cost, 0, num*sizeof(*cost));
Or better yet dump the memset and use this when you allocate the memory:
cost = new int[num]();
As others have pointed out the delete is incorrectly placed and will leak all memory allocated by its corresponding new except for the last. Move it into the loop.
Every time you allocate new memory for the array, the memory that has been previously allocated is leaked. As a rule of thumb you need to free memory as many times as you have allocated.

Segmentation Fault with big input

So I wrote this program in C++ to solve COJ (Caribbean Online Judge) problem 1456: http://coj.uci.cu/24h/problem.xhtml?abb=1456. It works just fine with the sample input and with some other files I wrote to test it, but I kept getting 'Wrong Answer' as a verdict, so I decided to try a larger input file and got Segmentation Fault: 11. The file was 1,000,001 numbers long, not counting the first integer, which is the number of inputs to be tested. I know that error is related to memory, but I'm really lacking more information. Hope anyone can help; it is driving me nuts. I program mainly in Java, so I really have no idea how to solve this. :(
#include <stdio.h>
int main(){
    long singleton;
    long N;
    scanf("%ld", &N);
    long arr [N];
    bool sing [N];
    for(int i = 0; i < N; i++){
        scanf("%ld", &arr[i]);
    }
    for(int j = 0; j < N; j++){
        if(sing[j] == false){
            for(int i = j+1; i < N; i++){
                if(arr[j] == arr[i]){
                    sing[j] = true;
                    sing[i] = true;
                    break;
                }
            }
        }
        if(sing[j] == false){
            singleton = arr[j];
            break;
        }
    }
    printf("%ld\n", singleton);
}
If you are writing in C, you should change the first few lines like this:
#include <stdio.h>
#include <stdlib.h>
int main(void){
    long singleton;
    long N;
    printf("enter the number of values:\n");
    scanf("%ld", &N);

    long *arr;
    arr = malloc(N * sizeof *arr);
    if(arr == NULL) {
        // malloc failed: handle error gracefully
        // and exit
    }
This will at least allocate the right amount of memory for your array.
Update: note that you can access these elements with the usual
arr[ii] = 0;
Just as if you had declared the array as
long arr[N];
(which doesn't work for you).
To make it proper C++, you have to convince the standard committee to add Variable length arrays to the language.
To make it valid C, you have to include <stdbool.h>.
Probably your VLA nukes your stack, consuming a whopping 4*1000001 bytes. (The bool array only adds a quarter to that.) Unless you use the proper compiler options, that is probably too much.
Anyway, you should use dynamic memory for that.
Also, using sing without initialisation is ill-advised.
BTW: The easiest C answer for your programming challenge is: Read the numbers into an array (allocated with malloc), sort (qsort works), output the first non-duplicate.
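The same plan carries over to C++ with std::vector and std::sort; a rough sketch, assuming the task really is to print the one value that has no duplicate:

#include <algorithm>
#include <cstdio>
#include <vector>

int main()
{
    long n;
    if (std::scanf("%ld", &n) != 1)
        return 1;

    std::vector<long> arr(n);
    for (long i = 0; i < n; ++i)
        std::scanf("%ld", &arr[i]);

    std::sort(arr.begin(), arr.end());

    // After sorting, duplicates are adjacent; the answer is the first element
    // that differs from both of its neighbours.
    for (long i = 0; i < n; ++i) {
        bool same_prev = (i > 0     && arr[i] == arr[i - 1]);
        bool same_next = (i + 1 < n && arr[i] == arr[i + 1]);
        if (!same_prev && !same_next) {
            std::printf("%ld\n", arr[i]);
            break;
        }
    }
    return 0;
}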
When you write long arr[N]; there is no way that your program can gracefully handle the situation where there is not enough memory to store this array. At best, you might get a segfault.
However, with long *arr = malloc( N * sizeof *arr );, if there is not enough memory then you will find arr == NULL, and then your program can take some other action instead, for example exiting gracefully, or trying again with a smaller number.
Another difference between these two versions is where the memory is allocated from.
In C (and in C++) there are two memory pools where variables can be allocated: automatic memory, and the free store. In programming jargon these are sometimes called "the stack" and "the heap" respectively. long arr[N] uses the automatic area, and malloc uses the free store.
Your compiler and/or operating system combination decides how much memory is available to your program in each pool. Typically, the free store will have access to a "large" amount of memory, the maximum possible that a process can have on your operating system. However, the automatic storage area may be limited in size, and has the additional drawback that if allocation fails, your process gets killed or goes haywire.
Some systems use one large area and have the automatic area grow from the bottom, and free store allocations grow from the top, until they meet. On those systems you probably wouldn't run out of memory for your long arr[N], although the same drawback remains about not being able to handle when it runs out.
So you should prefer using the free store for anything that might be "large".

C++ StackOverflowException initializing struct over 63992

"Process is terminated due to StackOverflowException" is the error I receive when I run the code below. If I change 63993 to 63992 or smaller there are no errors. I would like to initialize the structure to 100,000 or larger.
#include <Windows.h>
#include <vector>
using namespace std;

struct Point
{
    double x;
    double y;
};

int main()
{
    Point dxF4struct[63993]; // 63992 or smaller runs fine; 63993 overflows the stack
    Point dxF4point;
    vector<Point> dxF4storage;

    for (int i = 0; i < 1000; i++) {
        dxF4point.x = i; // arbitrary values
        dxF4point.y = i;
        dxF4storage.push_back(dxF4point);
    }
    for (int i = 0; i < dxF4storage.size(); i++) {
        dxF4struct[i].x = dxF4storage.at(i).x;
        dxF4struct[i].y = dxF4storage.at(i).y;
    }
    Sleep(2000);
    return 0;
}
You are simply running out of stack space - it's not infinite, so you have to take care not to run out.
Three obvious choices:
Use std::vector<Point> (see the sketch after this answer).
Use a global variable.
Use dynamic allocation - e.g. Point *dxF4struct = new Point[64000]. Don't forget to call delete [] dxF4struct; at the end.
I listed the above in the order that I think is preferable.
[Technically, before someone else points that out, yes, you can increase the stack, but that's really just moving the problem up a level somewhere else, and if you keep going at it and putting large structures on the stack, you will run out of stack eventually no matter how large you make the stack]
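A minimal sketch of the first option, sized to the 100,000 elements the question asks about:

#include <cstddef>
#include <vector>

struct Point
{
    double x;
    double y;
};

int main()
{
    // The vector's elements live on the heap, so 100,000 Points are no problem.
    std::vector<Point> dxF4struct(100000);
    for (std::size_t i = 0; i < dxF4struct.size(); ++i) {
        dxF4struct[i].x = static_cast<double>(i);   // arbitrary values
        dxF4struct[i].y = static_cast<double>(i);
    }
    return 0;
}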
Increase the stack size. On Linux, you can use ulimit to query and set the stack size. On Windows, the stack size is part of the executable and can be set during compilation.
If you do not want to change the stack size, allocate the array on the heap using the new operator.
Well, you're getting a stack overflow, so the allocated stack is too small for this much data. You could probably tell your compiler to allocate more space for your executable, though just allocating it on the heap (std::vector, you're already using it) is what I would recommend.
Point dxF4struct[63993]; // 63992 or smaller runs fine; 63993 overflows the stack
On that line, you're allocating all your Point structs on the stack. I'm not sure of the exact stack size, but the default is around 1 MB. Since your struct is 16 bytes and you're allocating 63,993 of them, you have 16 bytes * 63,993, which is essentially the whole 1 MB; together with everything else on the stack, that causes a stack overflow (funny, posting about a stack overflow on Stack Overflow...).
So you can either tell your environment to allocate more stack space, or allocate the object on the heap.
If you allocate your Point array on the heap, you should be able to allocate 100,000 easily (assuming this isn't running on some embedded proc with less than 1Mb of memory)
Point *dxF4struct = new Point[63993];
As a commenter wrote, it's important to know that if you "new" memory on the heap, it's your responsibility to "delete" the memory. Since this uses array new[], you need to use the corresponding array delete[] operator. Modern C++ has a smart pointer which will help with managing the lifetime of the array.
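One way that smart-pointer suggestion could look, assuming C++11 or later (a sketch, not the only option):

#include <memory>

struct Point
{
    double x;
    double y;
};

int main()
{
    // The array lives on the heap; the unique_ptr calls delete[] automatically
    // when it goes out of scope, so there is nothing to forget.
    std::unique_ptr<Point[]> dxF4struct(new Point[100000]());
    dxF4struct[0].x = 1.0;
    dxF4struct[0].y = 2.0;
    return 0;
}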

Is this code generating memory leaks or is it clean? [closed]

I would like to know the best way to determine whether the following code is generating memory leaks. I understand that you have to free memory when you are done using it, but at the same time this array (or rather pointer to int) goes out of scope when the function returns, so does it matter whether I release it or not?
But in this particular case I cannot release the memory without corrupting the heap, at least according to the Microsoft debugger in Visual Studio 2010.
And to learn how best to approach this, could you please describe your ways of checking for memory leaks?
Thanks in advance
#include <iostream>
#include <algorithm>
#include <cmath>
#include <cstdlib>   // for malloc and system
#include <cstring>   // for memset
using namespace std;

int sieve(int n)
{
    int *a = (int *) malloc(sizeof(int) * n);
    int max = floor(sqrt((double)n));
    int p = 2;
    memset(a, 0, sizeof(int) * n);
    while(p <= max)
    {
        for(int i = 2 * p; i <= n; i += p)
            a[i] = 1;
        while(a[++p]) /* Empty */ ;
    }
    while(a[n]) n--;
    /* free(a); */ // free our array as we are done with it, but it generates a heap error
    return n;
}

int main(void)
{
    cout << sieve(100) << endl;
    system("pause");
    return 0;
}
There are memory leaks in your program, because you are not freeing the memory you allocated.
Use free here, since you used malloc; if you plan on using new instead, then release with the delete operator.
Try to free the memory before the return statement.
You can also use the Valgrind tool to find memory leaks in your program; it will point you to exactly where they happen.
This line might be the cause of your error:
for(int i = 2 * p; i <= n; i+= p)
Here you loop while i is smaller than or equal to n. But as with all arrays, valid indices go from 0 to (size - 1). You should allocate one extra entry for the array:
int *a = (int *) malloc(sizeof(int) * (n + 1));
Yes, your program as it stands will leak memory.
In general, if you are allocating memory dynamically (using malloc in C, or new in C++), and you aren't using smart pointers, you need to free the memory using free or delete respectively.
In your particular test program, it exits immediately after calling the sieve() function, which means the allocated memory will automatically be reclaimed by the operating system when the process exits.
Also note that you seem to be writing C code in C++:
malloc is the C way to allocate memory. If you must use raw pointers, you should use new in C++ instead.
Even better, use standard library containers like vector which will automatically manage the memory for you.
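For example, a rough sketch of the same sieve letting std::vector own the memory; the algorithm is kept, only the storage changes, and the function name is just illustrative:

#include <iostream>
#include <vector>

// Largest prime <= n, using std::vector<char> as the flag array; the n + 1
// entries make index n valid, and the vector frees its memory by itself.
int largest_prime_up_to(int n)
{
    std::vector<char> composite(n + 1, 0);
    for (int p = 2; p * p <= n; ++p) {
        if (!composite[p]) {
            for (int i = 2 * p; i <= n; i += p)
                composite[i] = 1;
        }
    }
    while (n > 1 && composite[n]) --n;
    return n;
}

int main()
{
    std::cout << largest_prime_up_to(100) << std::endl;   // prints 97
    return 0;
}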
More important things:
You allocated n ints with int *a = (int *) malloc(sizeof(int) * n);. If you want to access this memory, the valid indices go from 0 to n-1. In the line while(a[n]) n--; you are out of bounds of your allocated memory section. Possibly a crash!
And what happens if all the a[n] values are different from 0? You decrease n until you reach negative values, and then you access a[n] with n < 0. Possibly another crash.
Remember, this is C/C++. It executes exactly what you write, so be careful!
General programming techniques:
Verify all array limits.
Verify malloc return values.
Always free or delete your pointers
Use correct types: n could be unsigned, so use an unsigned type for it.
You are allocating n ints, indexed from 0 to n-1. However, you are looping from 4 to n inclusive, meaning that you will assign index n and corrupt memory. That's why the free is causing an error (it checks for corruption when you call it).
You have to fix your memory corruption, then put back the free call.