Here is my code:
int** tmp = new int*[l];        // array of l row pointers
for (int i = 0; i < l; i++)
    tmp[i] = new int[h];        // each row holds h ints
for (int i = 0; i < l; i++)
    delete[] tmp[i];            // delete each row first...
delete[] tmp;                   // ...then the array of row pointers
I would like to know if I'm deallocating memory correctly. The problem I have is that when I watch my program's process in the task manager, the memory won't drop.
Is that normal?
The code above is OK, although in general it's the sort of terrible thing you hope you'll never encounter in a codebase you have to work on.
std::vector or boost::multi_array would both be better choices here, and they clean up after themselves without all that unnecessary, error-prone code. Basically, if you have to wonder what the code is doing and whether it's correct, something is wrong with it already.
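For illustration, a minimal sketch of the std::vector alternative (the l and h dimensions come from the question; the example sizes are made up):

#include <vector>

// Same l-by-h grid as the raw-pointer version, but the vectors own all
// the memory, so there is no delete to get wrong.
int main() {
    int l = 4, h = 3;   // example sizes
    std::vector<std::vector<int>> tmp(l, std::vector<int>(h));
    tmp[2][1] = 42;
}   // everything is released automatically here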
CPU load is not directly connected to memory allocations; that is a whole other problem you have with your code. Some loop endlessly polling the OS for something might be the reason; I have no information about what your code does besides allocating and deallocating memory, so it's hard to tell what could be improved.
After your comment... don't rely on the task manager to tell you a program's real memory usage; use a specialized leak detector for that. As #H2CO3 pointed out, the OS might not immediately report deleted memory as free.
In their barebones implementations, new and delete are just sugar over malloc and free (from the C library), so we will reason about those instead.
Operating systems usually provide primitives to (de)allocate memory; however, those primitives:
are not as fine-grained as malloc and free: they work in 4 KB blocks, for example
are relatively expensive: notably, they often zero out the memory
As a result, most implementations of malloc and free are not simple one-line wrappers around OS primitives, but instead keep a pool of allocated pages and handle most requests internally. Some implementations even have a per-thread cache to avoid contention (such as tcmalloc) or multiple pools with per-thread affinity (such as jemalloc).
This results:
in faster malloc/free calls
at the expense of the process's memory footprint being slightly higher than strictly needed
Note: and I have not even touched on fragmentation yet...
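A small sketch of what that pooling means in practice (address reuse is not guaranteed; this only illustrates why the task manager may not show a drop after a free):

#include <cstdio>
#include <cstdlib>

int main() {
    void* a = std::malloc(1024);
    std::printf("first:  %p\n", a);
    std::free(a);                   // handed back to the allocator's pool,
                                    // usually not to the OS
    void* b = std::malloc(1024);
    std::printf("second: %p\n", b); // often the same address: the pooled
                                    // block was simply reused
    std::free(b);
    return 0;
}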
What happens when we do push_back with size() == capacity()?
I've heard a lot of opinions on this question. The most popular is: when the vector's size reaches its capacity, it allocates a new region of memory, copies the vector into the newly allocated memory, and inserts the new value at the vector's end.
But why do we have to do it? We have a virtual memory mechanism; we could just call realloc(vec.data(), (sizeof(vec::value_type) * vec.size()) * 2). The allocator will give us a new memory page, and virtual addresses make the memory "contiguous", so we don't have to copy values from the vector.
Do I understand the virtual memory mechanism wrong?
You understand the virtual memory mechanism correctly: you can create any number of contiguous, page-aligned arrays in the process's virtual address space, and they will be backed by non-contiguous physical memory.
But that is irrelevant to std::vector, because std::allocator does not provide any API to take advantage of it; I think some see this as an oversight.
Just be mindful that C++ is not restricted to architectures supporting virtual memory, although if this were exploited, I think it would be an implementation detail of the standard library anyway.
No, you cannot use C's realloc, because C++ has objects with real lifetimes. Not everything is just a blob of bytes that can be freely copied at whim; some special blobs might not like being moved, and they will not appreciate it if you force them to.
Yes, if you are dealing with PODs, this would work with a custom vector of your own, though not with std::vector built on std::allocator.
There is a paper in the works which addresses your concerns and goes beyond realloc, arguing that "its day has passed": P0901, which a few days ago received rather positive feedback from the committee.
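To make the "special blobs" point concrete, here is a sketch (formally undefined behaviour, shown only to illustrate) of a self-referential type that bitwise relocation silently breaks; this is exactly why std::vector must move- or copy-construct elements instead of calling realloc:

#include <cstdio>
#include <cstdlib>
#include <new>

// Invariant: self always points at the object itself. Moving the raw
// bytes to a new address, which realloc may do, breaks it silently.
struct Node {
    Node* self;
    Node() : self(this) {}
};

int main() {
    void* p = std::malloc(sizeof(Node));
    Node* n = new (p) Node;                       // n->self == n holds
    (void)n;
    void* q = std::realloc(p, 2 * sizeof(Node));  // may move the bytes
    Node* m = static_cast<Node*>(q);
    std::printf("invariant %s\n", m->self == m ? "held" : "broken");
    std::free(q);
    return 0;
}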
I'm wondering how much a realloc() really costs: I'm calling it quite often to extend an available memory area by one element (a specific structure). Thanks to the MMU, is such a realloc() just an extension of the reserved memory area, or is a complete copy of all the data imaginable under some conditions?
As far as I know, a std::vector very often has to copy the memory area when its size increases and the reserved amount of memory is too small...
realloc copies all the data. Assuming anything else is just asking for performance trouble. The situations when realloc can avoid copying are few and you should absolutely not count on them. I've seen more than one implementation of realloc that doesn't even bother implementing the code to avoid copying because it's not worth the effort.
The MMU has nothing to do with it, because the cost of remapping the pages backing an allocation doesn't pay off until you hit more than two pages. This is based on research I read 15 years ago, and since then memory copying has become faster while memory management has become more expensive because of MP systems. That research was also for zero-copy schemes inside the kernel only, without the syscall overhead, which is significant and would slow things down here. It would also require that your allocation be perfectly aligned and sized, further reducing the usefulness of implementing realloc this way.
At best realloc can avoid copying data if the memory chunk it would expand into is not allocated. If realloc is the only thing your application does you might get lucky, but as soon as there's just a little fragmentation or other things allocate, you're out of luck. Always assume that realloc is malloc(new_size); memcpy(new, old, old_size); free(old);.
A good practice when resizing arrays with realloc is to keep track of how many elements you have in the array and to keep a separate capacity. Grow the capacity and call realloc only when the number of elements hits the capacity. Grow the capacity by 1.5x on every realloc (most people do 2x, and it's often recommended in the literature, but research shows that 2x causes very bad memory fragmentation, while 1.5x is almost as efficient and is much nicer to memory). Something like this:
if (a->sz == a->cap) {                    /* full: grow before appending */
    size_t ncap = a->cap ? a->cap + a->cap / 2 : INITIAL_CAP;  /* 1.5x */
    void *n = realloc(a->a, ncap * sizeof(*a->a));
    if (n == NULL)
        deal_with_the_error();            /* old block is still valid here */
    a->a = n;
    a->cap = ncap;
}
a->a[a->sz++] = new_element;              /* append into the spare capacity */
This works even for the initial allocation, if the struct containing the array is zero-initialized.
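For completeness, a hypothetical definition of the struct the snippet assumes (the field names match the fragment; INITIAL_CAP is an arbitrary choice):

#include <stddef.h>

#define INITIAL_CAP 16   /* any small starting size works */

struct intarray {
    int    *a;    /* element storage */
    size_t  sz;   /* number of elements in use */
    size_t  cap;  /* allocated capacity */
};

/* struct intarray arr = {0};  -- zero-initialized, so the very first
   push takes the a->cap == 0 branch above and allocates INITIAL_CAP */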
Copying the data is not the expensive part (though some may disagree). Hitting the underlying malloc and free is expensive, and could account for almost all of your execution time, depending on what else you are doing.
If so, fixing it should give you a big speedup.
This is how I tell what fraction of time is spent on things like that.
The simplest solution is to do it less often. When you allocate an array, allocate it extra large, and then keep track yourself of how much of it you are actually using.
The behavior really depends on the implementation, but all implementations try to minimize the cost of relocating memory, because relocation is very expensive for performance: it has a direct impact on the cache. I have no numbers, but it is a very expensive operation.
For example, when resizing, if the runtime faces the two options of relocating the memory or extending the currently reserved block, it chooses the latter.
But it is not as simple as that: it also has to consider memory fragmentation.
So there are several trade-offs to satisfy.
In the case of the vector you mentioned, a different scheme is used. If the vector has m bytes in reserve and needs an extra n bytes, the runtime might allocate 2 * (n + m) to minimize the chance of a future relocation; if you exceed the new size, the next growth might use a factor of 4 instead of 2, and so on. (The numbers I mentioned are not real.)
I'm not very familiar with the implementations; I hope others can give you more specific information.
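If you want to see your own standard library's growth policy rather than guess (the factors above were explicitly made up), a few lines of C++ will print it:

#include <cstdio>
#include <vector>

// Prints the capacity every time push_back triggers a reallocation.
// libstdc++ typically doubles; MSVC typically grows by 1.5x.
int main() {
    std::vector<int> v;
    std::size_t cap = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != cap) {
            cap = v.capacity();
            std::printf("size %zu -> capacity %zu\n", v.size(), cap);
        }
    }
    return 0;
}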
This might be a bit of a stupid question: should I call delete on a huge map/set at the end of the program?
Assume the map/set is needed throughout the entire program (the delete is the last line before return) and its size is really huge (> 4 GB). The delete call takes a long time and, from my perspective, does not add any value (the memory cannot be released any sooner). Am I wrong? If so, why?
There is no guarantee in the C and C++ standards about what happens after your program exits, including no guarantee that anything is cleaned up. Some smaller real-time OSes, for example, will not perform automatic cleanup. So, at least in theory, your program should definitely delete everything you new, to fulfil its obligations as a complete and portable program that can run forever.
It is also possible that someone takes your code and puts a loop around it, so that your tree is now created a million times, and then goes to find you, bringing along the "trusty convincer", a.k.a. baseball bat, when they find out WHY it's now running out of memory after 500 iterations.
Of course, like all things, this can be argued many different ways, and it really depends on what you are trying to achieve, why you are writing the program, etc. My compiler project leaks memory like a sieve, because I use exactly the method of memory management that you describe (partly because tracking the lifetime of each dynamically allocated object is quite difficult, and partly because "I can't be bothered"; I'm sure if someone actually wants a good Pascal compiler, they won't go for my code anyway).
Actually, my compiler project builds many different data structures, some of which are trees, arrays, etc., but basically none of them perform any cleanup afterwards. It's not as simple to fix as the case of building one large tree where each node needs deleting. Conceptually, however, it all boils down to "do cleanup" or "don't do cleanup", and that in turn comes down to: who is going to use or modify the code, and what do you know about the environment it will run in?
As long as your program stays more or less the same, with the memory being used for the entire execution, it doesn't make a real difference.
However, if someone ever tries to take your program and make it into a component of some other program, not releasing the memory could result in a huge memory leak.
So to be on the safe side, and also to be more organized, always free what you allocate. In your case it's very easy so there's no downside - just a potential upside.
If you are using the C++ STL, there is no need for any explicit deletion, because the map/set will automatically manage heap memory for you. In fact, you are not allowed to delete the map/set itself, since it is not a pointer.
For the large objects stored inside the map/set, you can use smart pointers when constructing them, and then you will not need to delete them manually. A memory leak may not be a problem for a toy program, but it is unacceptable for real-life programs, since they may run for a long time or even forever.
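A brief sketch of that smart-pointer suggestion (Big and the keys are made-up placeholders): the map owns its values through unique_ptr, so destroying the map releases everything with no explicit delete anywhere:

#include <map>
#include <memory>
#include <string>

struct Big { int data[1024]; };   // stand-in for a large payload

int main() {
    std::map<std::string, std::unique_ptr<Big>> m;
    m["a"] = std::make_unique<Big>();
    m["b"] = std::make_unique<Big>();
    return 0;
}   // the map's destructor runs here and frees every Big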
I am developing a C++ application with Qt that involves tremendous number crunching. A large amount of dynamic memory is required for the entire operation; however, the requirement varies depending on a parameter set by the user.
In Resource Monitor I can see that the commit memory (memory allocated by the OS for the exe) keeps increasing over time as my program creates arrays in dynamic memory. So if I let Windows know beforehand that my exe will use X MB of memory, will this result in improved performance? If yes, how do I do this?
If you have many memory allocations and a CPU-intensive process running together, you might consider restructuring your program to use memory pools.
The idea behind a memory pool is that you allocate a pool of resources that you will probably need when processing begins (maps, vectors, or any objects you happen to new very often). Each time you need a new object, you take the first available one from the pool, reset it, and use it; when you are done with it, you put it back into the pool so that it can be used again later.
This pattern can turn out to be faster than continuously using new and delete, but only if your program uses dynamic allocation intensively while doing, for example, a minimax search over a huge tree, or something similarly intensive.
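A minimal sketch of that pattern; all names are illustrative rather than any real library's API, and real code would also re-initialise a recycled object before handing it out:

#include <memory>
#include <vector>

template <typename T>
class Pool {
    std::vector<std::unique_ptr<T>> free_;   // objects waiting for reuse
public:
    std::unique_ptr<T> acquire() {
        if (free_.empty())
            return std::make_unique<T>();    // pool empty: fall back to new
        std::unique_ptr<T> obj = std::move(free_.back());
        free_.pop_back();
        return obj;                          // recycled, not freshly new'd
    }
    void release(std::unique_ptr<T> obj) {
        free_.push_back(std::move(obj));     // keep it for later reuse
    }
};

int main() {
    Pool<std::vector<int>> pool;
    auto v = pool.acquire();      // first call allocates
    v->push_back(1);
    pool.release(std::move(v));   // hand it back instead of deleting
    auto w = pool.acquire();      // reuses the same object (still holds 1)
    return 0;
}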
So if I let Windows know beforehand that my exe will use X MB of memory, will this result in improved performance? If yes then how do I do this?
I don't think so. The memory your app operates on is virtual, and you don't really have much control over how Windows actually allocates and maps physical memory onto it.
But you can try allocating the required amount of memory up front and then using it as a pool for custom allocators. This may come with some performance cost, however.
You can do a large alloc and delete:
char *ptr = new char[50*1024*1024L];
delete[] ptr;
I doubt if there is going to be any performance difference.
How do memory leak detection tools like purify and valgrind work?
How can I design and implement my own such tool?
Such tools usually instrument the executable with their own code. For instance, they replace every call to malloc() and free() with their own functions, which allows them to track every allocation.
In Visual Studio this can be done automatically using just the C runtime library, with functions from the _CrtDumpMemoryLeaks() family.
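A sketch of that CRT facility in use (MSVC, Debug builds only; defining _CRTDBG_MAP_ALLOC before the includes makes the report show file and line):

#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

int main() {
    int* leak = (int*)malloc(10 * sizeof(int));   // deliberately not freed
    (void)leak;
    _CrtDumpMemoryLeaks();   // prints the leaked block to the debug output
    return 0;
}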
For basic leak detection you just need to hook into the low-level memory allocation routines, e.g. by patching malloc/free. You then track all allocations and subsequently report any that have not been freed at an appropriate point, e.g. just prior to exit.
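A toy sketch of that idea (nothing like what Purify or Valgrind actually do internally): wrappers record every live block, and whatever is still registered at exit is reported as a leak:

#include <cstdio>
#include <cstdlib>
#include <map>

static std::map<void*, std::size_t>& live() {
    static std::map<void*, std::size_t> m;   // ptr -> size of live blocks
    return m;
}

void* traced_malloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) live()[p] = n;                    // record the allocation
    return p;
}

void traced_free(void* p) {
    live().erase(p);                         // forget it on deallocation
    std::free(p);
}

void report_leaks() {
    for (auto const& [p, n] : live())
        std::printf("leak: %zu bytes at %p\n", n, p);
}

int main() {
    void* a = traced_malloc(16);
    traced_malloc(32);                       // never freed: reported below
    traced_free(a);
    report_leaks();
    return 0;
}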
For actual work, Valgrind works quite well: it detects invalid reads/writes as well as memory leaks.
For a hobby project, you can create your own memory management module which keeps track of each pointer allocation and its usage. If you don't see some memory location being used for a long time, it might be a leak.
You can look at some BSD implementations of memory management/profiling tools for code examples, for instance http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools
I am developing this tool: Deleaker.
Sure, the obvious idea is to hook all functions that do allocations and deallocations. That's not only malloc and free, but rather HeapAlloc / HeapFree (if we are talking about the Windows platform), because modern VC++ versions (after VC 6.0) just redirect malloc / free to the WinAPI's HeapAlloc / HeapFree.
For each allocation, a stack trace is saved along with a record object; on each deallocation, the record is removed. At first glance it's simple: just store a list of allocated objects and remove an entry in the deallocation hook.
But there are tricky parts:
Speed
You need to maintain a list of allocated objects, and if you add/remove an entry in every hooked call, the process executes far too slowly. That seems to be a common problem with such tools.
Stack Trace
Using the dbghelp.dll functions to get a stack trace takes a lot of time. You have to collect stack entries faster: for example, by reading the process memory manually.
False positives
Some leaks are produced by system DLLs. If you show all of them, you get tons of leaks, but the user can't "solve" them: he or she doesn't have access to the source code and can't prevent that code from executing. It's impossible to stop these leaks. Some of them are single allocations at a system DLL's entry point, so they are not real leaks (a good question: what is a leak at all?). How do you recognize the leaks that must be shown? Good filtering is the answer.
Hope this helps.