I am writing a small toy language/compiler for (fun and) scientific applications. Core design principles are simplicity and efficiency (some kind of "modern" Fortran, if you will). The language would have built-in arrays, which would look something like this:
let x: Real[5] = {1.0, 2.0, 3.0, 4.0, 5.0}
let n = get_runtime_value()
let y: Integer[100,n] = ...
In the above statement, the user does not explicitly state whether the array should be allocated on the stack or on the heap. If at all possible, I'd rather not expose that choice to users (my reasoning is that most engineers don't know the difference and should not have to care; they have other problems to worry about).
Technically, I could write something like:
if (some input parameter cannot be known at compile time)
    allocate on the heap
else  # candidate for the stack
    if (the array is not returned by the function && the allocated size is smaller than some threshold)
        allocate on the stack
    else
        allocate on the heap
However, this design scares me for a few reasons:
Added complexity, longer compilation times?
In C++, the compiler can perform RVO and return a value on the stack directly. I guess I could make the algorithm more complex to detect such cases, but this will make the whole thing more complex/buggy/slow to compile.
A slight change in array size could cause the switch from stack to heap. That could be confusing for the user. Defining this threshold would also require some care.
I need to check that some reference to that array is not being returned either (as well as references of references, etc.). I imagine that could be expensive to track down.
Note that I do not want to expose pointers or references in my language. Arrays will always be passed by reference under the hood.
Is there a neat way in the literature to solve this problem? Has it been done before in an existing language? All the languages I know require the user to specify where they want their data: Fortran has ::allocatable, C++ has std::vector and std::array, etc. I could also do something like LLVM's SmallVector and always allocate a few elements on the stack before moving to the heap. Does my approach make any sense at all? I am using this project to learn more about compilers and language design. Is there something I should be watchful for?
The choice is really up to you, but if I were you, I would let the user decide whether to allocate on the heap or the stack. If you don't, it will most likely get very confusing, both for you and for the user. If you still want to implement the feature, I have some tips.
Instead of checking what cannot be known at compile time, check what can be known at compile time; it will simplify things.
Make everything default to either the heap or the stack; this will make it easier to handle situations where you need to switch from stack to heap (or vice versa). It's easier because, for example, if an array starts on the stack, a call to append is the natural point at which to move it to the heap (see the sketch below).
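As an illustration of the SmallVector idea the question already mentions, here is a minimal sketch of that stack-to-heap switch (names invented for the example, not LLVM's actual implementation; it assumes a trivially copyable element type T):

#include <cstddef>
#include <cstdlib>
#include <cstring>

// Illustrative small-buffer array: up to N elements live inline ("on the
// stack"), and the first append past that capacity spills to the heap.
template <typename T, std::size_t N>
class SmallArray {
    T inline_buf_[N];          // inline storage, lives wherever the SmallArray lives
    T* data_ = inline_buf_;
    std::size_t size_ = 0;
    std::size_t capacity_ = N;

public:
    void push_back(const T& value) {
        if (size_ == capacity_) grow();   // the append triggers the stack-to-heap switch
        data_[size_++] = value;
    }
    T& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
    ~SmallArray() { if (data_ != inline_buf_) std::free(data_); }

private:
    void grow() {   // error handling omitted for brevity
        std::size_t new_cap = capacity_ * 2;
        T* new_data = static_cast<T*>(std::malloc(new_cap * sizeof(T)));
        std::memcpy(new_data, data_, size_ * sizeof(T));  // fine for trivially copyable T
        if (data_ != inline_buf_) std::free(data_);
        data_ = new_data;
        capacity_ = new_cap;
    }
};

The nice property for your language is that small arrays never touch the allocator, while the size threshold stays an internal detail the user never sees.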
In conclusion, I recommend you let the user be explicit about declaring an array on the stack or heap.
What are the internal differences between choosing std::vector and a dynamically allocated array? I mean not only performance differences, as in this question with a matching title.
I am trying to design a library, and I want to offer a wrapper over a StackArray: just a C-style array with some member methods, which contains T array[N] as a member. There are no indirections, and the new operator is removed, to force the implementor to have an array type that is always stored on the stack.
Now I want to offer the dynamic variant. So, with a little effort, I can just declare something like:
template <typename T>
class DynArray
{
    T* array;
    size_t size;
    size_t capacity;
};
But... this seems pretty similar to a base approach to a C++ vector.
Also, an array stored on the heap can be resized by copying the elements to a new memory location (is this true?). That's pretty much what a vector does when a push_back() operation exceeds its allocated capacity, for example, right?
Should I offer both APIs if some notable differences exist? Or am I overcomplicating the design of the library, and should I just have my StackArray, with the vector as the safe abstraction over a dynamically allocated array?
First, there is a (usually controversial) mindset about using the modern tools the standard provides versus the legacy ones.
You should usually be studying and asking about modern C++ features, not comparing them with the old ones. But for learning purposes, I have to admit it's quite interesting to dive deep into these topics sometimes.
With that in mind, std::vector is a collection that does much more than just look after the bytes stored in it. There is a really important constraint, namely that the data must lie in contiguous memory, and std::vector ensures this in its internal implementation. It also has a well-known, well-tested implementation of the RAII pattern, with correct usage of the new[] and delete[] operators. You can reserve storage and emplace_back() elements in a convenient and performant way, which makes this collection really unique... there are a lot of reasons why std::vector is really different from a dynamically allocated array.
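For instance, a quick illustration of the reserve/emplace_back combination mentioned above (values invented for the example):

#include <string>
#include <vector>

int main() {
    std::vector<std::string> names;
    names.reserve(3);              // one allocation up front, no reallocations below
    names.emplace_back(5, 'a');    // constructs "aaaaa" in place, no temporary string
    names.emplace_back("hello");
    names.emplace_back("world");
    // &names[0] .. &names[2] are guaranteed to be contiguous
}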
It's not only about avoiding manual memory management, which is almost always undesirable in modern C++ (embedded systems, or operating systems themselves, are a good place to debate that last sentence). It's about having a tool, std::vector<T>, that makes your life as a developer easier, especially in an error-prone language like C++.
Note: I say error-prone because C++ is a really hard language to master, which needs a lot of study and training. You can build almost anything in the world with it, and it has an incredible number of features that aren't beginner friendly. Also, the backwards-compatibility constraint makes the language much bigger, with literally thousands of things that you must care about. So, with great power always comes great responsibility.
I'm new to C++ and I'm wondering why I should even bother using new and delete? It can cause problems (memory leaks) and I don't get why I shouldn't just initialize a variable without the new operator. Can someone explain it to me? It's hard to google that specific question.
For historical and efficiency reasons, C++ (and C) memory management is explicit and manual.
Sometimes, you might allocate on the call stack (e.g. by using VLAs or alloca(3)). However, that is not always possible, because
stack size is limited (depending on the platform, to a few kilobytes or a few megabytes).
memory need does not always follow a FIFO or LIFO discipline. It does happen that you need to allocate memory which would be freed (or become useless) much later during execution, in particular because it might be the result of some function (and the caller, or its caller, would release that memory).
You definitely should read about garbage collection and dynamic memory allocation. In some languages (Java, OCaml, Haskell, Lisp, ...) or systems, a GC is provided and is in charge of releasing the memory of useless (more precisely, unreachable) data. Read also about weak references. Notice that most GCs need to scan the call stack for local pointers.
Notice that it is possible, but difficult, to have quite efficient garbage collectors (though usually not in C++). For some programs, OCaml - with a generational copying GC - is faster than the equivalent C++ code with explicit memory management.
Managing memory explicitly has the advantage (important in C++) that you don't pay for something you don't need. It has the inconvenience of putting more burden on the programmer.
In C or C++ you might sometimes consider using Boehm's conservative garbage collector. With C++ you might sometimes need to use your own allocator instead of the default std::allocator. Read also about smart pointers, reference counting, std::shared_ptr, std::unique_ptr, std::weak_ptr, the RAII idiom, and the rule of three (in modern C++, becoming the rule of five). The current wisdom is to avoid explicit new and delete (e.g. by using standard containers and smart pointers).
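A minimal sketch tying those pieces together (standard containers plus the three smart pointer types; the Widget type is invented for the example, and C++14 is assumed for std::make_unique):

#include <memory>
#include <vector>

struct Widget { int id; };

int main() {
    // RAII: no explicit new or delete anywhere below.
    auto owned  = std::make_unique<Widget>(Widget{1});  // sole owner, freed at scope exit
    auto shared = std::make_shared<Widget>(Widget{2});  // reference-counted ownership
    std::weak_ptr<Widget> observer = shared;            // non-owning; can break cycles

    std::vector<Widget> widgets(100);                   // container manages its own buffer
}   // everything is released here, in reverse order of construction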
Be aware that the most difficult situations in managing memory are arbitrary, possibly circular, graphs of references.
On Linux and some other systems, valgrind is a useful tool to hunt memory leaks.
The alternative, allocating on the stack, will cause you trouble, as stack sizes are often limited to megabyte magnitudes and you'll get lots of value copies. You'll also have problems sharing stack-allocated data between function calls.
There are alternatives: using std::shared_ptr (C++11 onwards) will do the delete for you once the shared pointer is no longer being used. A technique referred to by the hideous acronym RAII is exploited by the shared pointer implementation. I mention it explicitly since most resource cleanup idioms are RAII-based. You can also make use of the comprehensive data structures available in the C++ Standard Template Library which eliminate the need to get your hands too dirty with explicit memory management.
But formally, every new must be balanced with a delete. Similarly for new[] and delete[].
Indeed, in many cases new and delete are not needed; you can just use standard containers instead and leave the allocation/deallocation management to them.
One of the reasons for which you may need to use allocation explicitly is for objects where the identity is important (i.e. they are not just values that can be copied around).
For example, if you have a GUI "window" object, then making copies probably doesn't make sense, and thus you're more or less ruling out all standard containers (they're designed for objects that can be copied and assigned). In this case, if the object needs to survive the function that creates it, probably the simplest solution is to just allocate it explicitly on the heap, possibly using a smart pointer to avoid leaks or use-after-delete.
In other cases it may be important to avoid copies not because they're illegal, but just because they're not very efficient (big objects), and explicitly handling the instance lifetime may be a better (faster) solution.
Another case where explicit allocation/deallocation may be the best option are complex data structures that cannot be represented by the standard library (for example a tree in which each node is also part of a doubly-linked list).
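As a rough sketch of the kind of structure meant here (field names are invented for illustration), consider a node that participates in a tree and a doubly-linked list at the same time:

// A node that lives in a tree AND in a doubly-linked list at once.
// No single standard container can own such a node, so allocation is explicit.
struct Node {
    // tree structure
    Node* parent = nullptr;
    Node* first_child = nullptr;
    Node* next_sibling = nullptr;
    // doubly-linked list threading all nodes (e.g. in creation order)
    Node* prev = nullptr;
    Node* next = nullptr;
    int value = 0;
};
// A wrapper class would new nodes on insert and delete them on erase,
// unlinking each node from both structures so neither is left dangling.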
Modern C++ styles often frown on explicit calls to new and delete outside of specialized resource management code.
This is not because the stack/automatic storage is sufficient, but rather because RAII smart resource owners (be they containers, shared pointers, or something else) make almost all direct memory wrangling unnecessary. And as the problem of memory management is often error prone, this makes your code more robust, easier to read, and sometimes faster (as the fancy resource owners can use techniques you might not bother with everywhere).
This is exemplified by the rule of zero: write no destructor, copy/move assign, copy/move constructor. Store state in smart storage, and have it handle it for you.
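A minimal example of what the rule of zero looks like in practice (the class and its members are invented for illustration):

#include <memory>
#include <string>
#include <vector>

// Rule of zero: all state lives in smart owners, so no destructor,
// copy/move constructor, or copy/move assignment is written by hand.
class Document {
    std::string title_;
    std::vector<std::string> lines_;
    std::shared_ptr<const std::string> stylesheet_;  // shared, immutable resource
public:
    explicit Document(std::string title) : title_(std::move(title)) {}
    void add_line(std::string line) { lines_.push_back(std::move(line)); }
};
// Document is correctly copyable, movable, and destructible for free.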
None of the above applies when you yourself are writing smart memory owning classes. This is a rare thing to need to do, however. It also requires C++14 (for make_unique) to get rid of the penultimate excuse to call new.
Now, the free store is still used, just not directly, under the above style. The free store (aka heap) is needed because automatic storage (aka the stack) only supports really simple object lifetime rules (scope based, compile time deterministic size and count, FILO order). As runtime sized and counted data is common, and object lifetime is often not that simple, the free store is used by most programs. Sometimes copying an object around on the stack is enough to make the simple lifetime less of a problem, but at other times identity is important.
The final reason is stack overflow. On some C++ implementations the stack/automatic storage is seriously constrained in size. What is more, there is rarely, if ever, a reliable failure mode when you put too much stuff in it. By storing large data on the free store, we can reduce the chance that the stack will overflow.
First, if you don't need dynamic allocation, don't use it.
The most frequent reason for needing dynamic allocation is that the object will have a lifetime which is determined by the program logic rather than lexical scope. The new and delete operators are designed to support explicitly managed lifetimes.
Another common reason is that the size or structure of the "object" is determined at runtime. For simple cases (arrays, etc.) there are standard classes (std::vector) which will handle this for you, but for more complicated structures (e.g. graphs and trees), you'll have to do this yourself. (The usual technique here is to create a class representing the graph or tree, and have it manage the memory.)
And there is the case where the object must be polymorphic, and the actual type won't be known until runtime. (There are some tricky ways of handling this without dynamic allocation in the simplest cases, but in general, you'll need dynamic allocation.) In this case, std::unique_ptr might be appropriate to handle the delete, or if the object must be shared, std::shared_ptr (although usually, objects which must be shared fall into the first category, above, and so smart pointers aren't appropriate).
There are probably other reasons as well, but these are the three that I've encountered the most often.
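To illustrate the third case, here is a small sketch (the types are invented for the example, and C++14 is assumed for std::make_unique) where the concrete type is only known at runtime and std::unique_ptr handles the delete:

#include <memory>

struct Shape { virtual double area() const = 0; virtual ~Shape() = default; };
struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const override { return 3.141592653589793 * r * r; }
};
struct Square : Shape {
    double s;
    explicit Square(double s) : s(s) {}
    double area() const override { return s * s; }
};

// The concrete type is chosen at runtime, so the object must be dynamically
// allocated; std::unique_ptr takes care of the delete.
std::unique_ptr<Shape> make_shape(bool round, double size) {
    if (round) return std::make_unique<Circle>(size);
    return std::make_unique<Square>(size);
}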
Only in simple programs can you know beforehand how much memory you will use; in general, you cannot foresee it.
However, with modern C++11 you generally rely on standard library containers like vector and map for memory allocation, and the use of smart pointers helps you avoid memory leaks, so you don't really need to use new and delete explicitly by hand.
When you use new, your object is stored on the heap, and it remains there until you manually delete it. Without new, your object goes on the stack and is destroyed automatically when it goes out of scope.
The stack is set to a fixed size, so if there is no block left in which to place a new object, a stack overflow occurs. This often happens when a lot of nested functions are being called, or when there is an infinite recursive call. If the current size of the heap is too small to accommodate new memory, more memory can be added to the heap by the operating system.
Another reason may be that you are explicitly calling an external library or API with a C-style interface. Setting up a callback in such cases often means context data must be supplied to and returned in the callback, and such an interface usually provides only a 'simple' void* or int*. Allocating an object or struct with new is appropriate for such actions (you can delete it later in the callback, should you need to).
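A sketch of that pattern, with a hypothetical register_callback function standing in for whatever the real library provides:

// Hypothetical C-style API: the library only gives us a void* to carry context.
extern "C" void register_callback(void (*fn)(void*), void* context);

struct JobContext { int job_id; bool verbose; };

void on_event(void* raw) {
    JobContext* ctx = static_cast<JobContext*>(raw);
    // ... use ctx->job_id, ctx->verbose ...
    delete ctx;  // last use: the callback owns and releases the context
}

void start_job() {
    // new is appropriate here: the context must outlive this function,
    // and the C interface cannot hold a smart pointer for us.
    register_callback(on_event, new JobContext{42, true});
}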
What parts of standard C++ will call malloc/free rather than new/delete?
This MSDN article lists several cases where malloc/free will be called rather than new/delete:
http://msdn.microsoft.com/en-us/library/6ewkz86d.aspx
I'd like to know if this list is (in increasing order of goodness and decreasing order of likelihood):
True for other common implementations
Exhaustive
Guaranteed by some part of the C++ standard
The context is that I'd like to replace global new/delete and am wondering what allocations I'd miss if I did.
I'd like to know if this list is (in increasing order of goodness and decreasing order of likelihood):
1. True for other common implementations
2. Exhaustive
3. Guaranteed by some part of the C++ standard
I'd say you cannot really tell from that list (I suppose you mean the one given in the Remarks section) what C++ implementations other than Microsoft's will use.
A C++ implementation is free to use any of the OS-provided system calls arbitrarily. So the answer to all 3 of your questions is: no.
As for the use of malloc() vs new in implementations of the C++-specific part of the compiler ABI: I think you can assume that C++-specific implementations will use new or placement new for any allocator implementations. Whether the listed methods use new (most unlikely) or malloc() internally to allocate memory doesn't matter to a user of the C++ standard library implementations.
NOTE:
If you're asking because you plan to override new, or to use placement new, to provide your own memory allocation mechanism for all memory allocation in a program: that's not the way to go!
You'll have to provide your own versions of malloc(), free(), et al. then. E.g. when using GCC in conjunction with newlib, there are appropriate stubs you can use for this.
A new is basically a wrapped malloc. The compiler is allowed to use standard library functions at will; for example, if you try to implement your own memcpy you'll get some weird recursion. If the compiler sees you copying more than a certain amount (say, a dumb bit-for-bit copy constructor), it will use memcpy.
So yes, new is sort of a lie. new means "allocate some memory, construct something there, and let me refer to it as one thing". If you allocate an array of floats, say, they are uninitialised, and malloc will probably be used directly.
Notice I say probably; I'm not sure whether they're set to zero these days :P
Anyway, all compiler optimisations (except copy elision and other return-value-optimisation stuff - BUT THAT IS THE ONLY EXCEPTION) are invisible to you; that is the point. The program cannot tell it was optimised; you'd have to be timing it and so on. For example:
(x*10)/2
This will not be optimised to x*5 if the compiler has no idea about the range of x, because x*10 could overflow where x*5 might not. So if it optimised it, it could change the result.
if (x > 0 && x < 10) {
    (x*10)/2
}
will become x*5, because the compiler, being really smart (much smarter than this), sees that there's no way x*10 can overflow here, so x*5 is safe.
If you have defined your own global new/delete, the compiler cannot optimise around them, because it cannot know that doing so would have no effects. If you define your own everything, the "simplification" to malloc/free will go away.
NOTE:
I've deliberately ignored the malloc and type-safety stuff; it's not relevant.
The compiler assumes that malloc, free, memcpy and so forth are all super-optimised, and will use them ONLY WHERE SAFE, as described above. There's a GCC thread on the mailing list somewhere where I learned of the memcpy thing.
calloc and malloc are much, much lower level than new and delete. Firstly, malloc and calloc are not type-safe, because you cast the result to whatever type you want, and access to the data in that memory is uncontrolled (you can end up writing over someone else's memory). If you are doing some real low-level programming, you will have to use malloc and calloc. If you are a regular programmer, just use new and delete; they are much easier. Why do you need the precise implementation? (I have to say it is implementation-dependent, because there are many different ones.)
I'm implementing a compacting garbage collector for my own personal use in C++0x, and I've got a question. Obviously the mechanics of the collector depend upon moving objects, and I've been wondering how to implement this in terms of the smart pointer types that point to them. I've been thinking about either a pointer-to-pointer in the pointer type itself, or having the collector maintain a list of the pointers that point to each object so that they can be modified; the latter removes the need for a double dereference when accessing the pointer, but adds some extra overhead during collection and additional memory overhead. What's the best way to go here?
Edit: My primary concern is for speedy allocation and access. I'm not concerned with particularly efficient collections or other maintenance, because that's not really what the GC is intended for.
There's nothing straightforward about grafting extra GC onto C++, let alone a compacting algorithm. It isn't clear exactly what you're trying to do and how it will interact with the rest of the C++ code.
I have actually written a GC in C++ which works with existing C++ code, and it had a compactor at one stage (though I dropped it because it was too slow). But there are many nasty semantic problems. I mentioned to Bjarne only a few weeks ago that C++ lacks the operator required to do it properly, and the situation is that it is unlikely to ever exist because it has limited utility.
What you actually need is a "re-address-me" operator. What happens is that you do not actually move objects around; you just use mmap to change the object's address. This is much faster and, in effect, it is using the VM features to provide handles.
Without this facility you have to have a way to perform an overlapping move of an object, which you cannot do efficiently in C++: you'd have to move to a temporary first. In C it is much easier: you can use memmove. At some stage all the pointers to or into the moved objects have to be adjusted.
Using handles does not solve this problem; it just reduces it from arbitrary-sized objects to constant-sized ones. These are easier to manage in an array, but the same problem exists: you have to manage the storage. If you remove lots of handles from the array randomly... you still have a problem with fragmentation.
So don't bother with handles, they don't work.
This is what I did in Felix: you call new(shape, collector) T(args). Here the shape is a descriptor of the type, including a list of offsets which contain (GC) pointers, and the address of a routine to finalise the object (by default, it calls the destructor).
It also contains a flag saying if the object can be moved with memmove. If the object is big or immobile, it is allocated by malloc. If the object is small and mobile, it is allocated in an arena, provided there is space in the arena.
The arena is compacted by moving all the objects in it, and using the shape information to globally adjust all the pointers to or into these objects. Compaction can be done incrementally.
The downside for a C++ programmer is the need to construct a correct shape object to pass. This doesn't bother me because I'm implementing a language which can generate the shape information automatically.
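For a rough idea of what such a shape descriptor carries, here is an illustrative sketch in C++ (this is not Felix's actual definition, just the information the answer describes):

#include <cstddef>

// Illustrative shape descriptor: enough metadata for a precise,
// compacting collector to scan, finalise, and relocate an object.
struct GcShape {
    std::size_t object_size;             // bytes occupied by one object
    std::size_t num_pointers;            // how many GC pointers the object contains
    const std::size_t* pointer_offsets;  // byte offsets of those pointers
    void (*finalise)(void* obj);         // by default, calls the destructor
    bool memmovable;                     // safe to relocate with memmove?
};

// Usage sketch: new(shape, collector) T(args) would consult shape.memmovable
// to pick the arena or malloc, and the collector would later walk
// pointer_offsets to scan (and, when compacting, adjust) each pointer.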
Now: the key point is that to do compaction, you must use a precise collector. Compaction cannot work with a conservative collector. This is very important. It is fine to allow some leakage if you see a value that looks like a pointer but happens to be an integer: some object won't be collected, but this is usually no big deal. But for compaction you have to adjust the pointers, and you'd better not change that integer: so you have to know for sure when something is a pointer. Your collector has to be precise: the shape must be known.
In OCaml this is relatively simple: everything is either a pointer or an integer, and the low bit is used at run time to tell which. Objects pointed at have a code giving the type, and there are only a few kinds: either a scalar (don't scan it) or an aggregate (scan it; it only contains integers or pointers).
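For illustration, here is roughly what OCaml-style low-bit tagging looks like, sketched in C++ and assuming word-aligned pointers (so the low bit of a real pointer is always 0; the names are invented for the example):

#include <cstdint>

// Pointers are word-aligned, so their low bit is 0; integers are stored
// shifted left by one with the low bit set to 1.
using Value = std::uintptr_t;

inline bool     is_int(Value v)       { return (v & 1) != 0; }
inline Value    from_int(std::intptr_t i) { return (static_cast<Value>(i) << 1) | 1; }
inline std::intptr_t to_int(Value v)  { return static_cast<std::intptr_t>(v) >> 1; }

inline bool     is_ptr(Value v)       { return (v & 1) == 0; }
inline void*    to_ptr(Value v)       { return reinterpret_cast<void*>(v); }

// A precise collector can therefore look at one bit of a field and know
// whether it must trace (and possibly adjust) it, or leave it alone.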
This is a pretty straightforward question, so here's a straightforward answer:
Mark-and-sweep (occasionally combined with mark-and-compact to avoid heap fragmentation) is the fastest when it comes to allocation and access (it avoids double dereferences). It's also very easy to implement. Since you're not worried about the performance impact of collection (mark-and-sweep tends to freeze the process for nondeterministic pauses), this should be the way to go.
Implementation details found at:
http://www.brpreiss.com/books/opus5/html/page424.html#secgarbagemarksweep
http://www.brpreiss.com/books/opus5/html/page428.html
A nursery generation will give you the best possible allocation performance because it is just a pointer bump.
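To make "just a pointer bump" concrete, here is a minimal illustrative bump allocator (names invented; a real nursery would trigger a minor collection instead of returning nullptr):

#include <cstddef>
#include <cstdint>

// Allocation in a nursery is one comparison and one add.
class Nursery {
    std::uint8_t* next_;
    std::uint8_t* limit_;
public:
    Nursery(std::uint8_t* base, std::size_t bytes)
        : next_(base), limit_(base + bytes) {}

    void* allocate(std::size_t bytes) {
        bytes = (bytes + 7) & ~std::size_t(7);       // keep 8-byte alignment
        if (next_ + bytes > limit_) return nullptr;  // nursery full: run a minor GC
        void* result = next_;
        next_ += bytes;                              // the "pointer bump"
        return result;
    }
};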
You could implement pointer updates without double indirection by using techniques like a shadow stack, but this will be slow and very error-prone if you're writing the C++ code by hand.
I would like to know what the common memory management issues associated with C and C++ are, and how we can debug these errors.
Here are a few I know:
1) uninitialized variable use
2) deleting a pointer twice
3) writing out of array bounds
4) failing to deallocate memory
5) race conditions
1) malloc can pass back a NULL pointer: check for that before you cast the pointer to whatever type you want.
2) for strings, you need to allocate an extra byte for the terminating character.
3) double pointers.
4) delete and malloc don't go together, and neither do free and new.
5) see what the function actually returns (its return code) on failure, and free the memory if it fails.
6) check the size when allocating memory: malloc(len + 1).
7) check how you pass a double pointer **ptr to a function.
8) check data sizes for function calls whose behaviour is otherwise undefined.
9) failure to allocate memory.
Use RAII (Resource Acquisition Is Initialization). You should almost never be using new and delete directly in your code.
Preemptively preventing these errors in the first place:
1) Turn warnings up to error level to catch the uninitialized-variable errors. Compilers will frequently issue such warnings, and by having them treated as errors you'll be forced to fix the problem.
2) Use smart pointers. You can find good versions of such things in Boost.
3) Use vectors or other STL containers. Don't use arrays unless you're using one of the Boost variety.
4) Again, use a container object or smart pointer to handle this issue for you.
5) Use immutable data structures everywhere you can and place locks around modification points for shared mutable objects.
Dealing with legacy applications
1) Same as above.
2) Use integration tests to see how the different components of your application play together. These should find many cases of such errors. Seriously consider having a formal peer review done by another group writing a different segment of the application, one that comes into contact with your naked pointers.
3) You can overload the new operator so that it allocates one extra byte before and after an object. These bytes should then be filled with some easily identifiable value such as 0xDEADBEEF. All you then have to do is check the bytes before and after to see if and when your memory is being corrupted by such errors (see the sketch after this list).
4) Track your memory usage by running various components of your application many times. If your memory grows, check for missing deallocations.
5) Good luck. Sorry, but this is one of those things that can work 99.9% of the time and then, boom! The customer complains.
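Here is the sketch promised in point 3: an illustrative guard-word allocator pair (not an actual operator new replacement; a real version would also have to record each block's size somewhere rather than take it as a parameter):

#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

static const std::uint32_t GUARD = 0xDEADBEEF;

// Pad each allocation with a sentinel word on each side.
void* debug_alloc(std::size_t n) {
    std::uint8_t* raw = static_cast<std::uint8_t*>(std::malloc(n + 2 * sizeof(GUARD)));
    if (!raw) return nullptr;
    std::memcpy(raw, &GUARD, sizeof(GUARD));                      // front sentinel
    std::memcpy(raw + sizeof(GUARD) + n, &GUARD, sizeof(GUARD));  // back sentinel
    return raw + sizeof(GUARD);
}

// Check both sentinels on free; an overrun or underrun trips the assert.
void debug_free(void* p, std::size_t n) {   // caller supplies the size in this sketch
    std::uint8_t* raw = static_cast<std::uint8_t*>(p) - sizeof(GUARD);
    std::uint32_t front, back;
    std::memcpy(&front, raw, sizeof(front));
    std::memcpy(&back, raw + sizeof(GUARD) + n, sizeof(back));
    assert(front == GUARD && back == GUARD && "heap corruption detected");
    std::free(raw);
}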
In addition to all that has already been said, use valgrind or BoundsChecker to detect all of these errors in your program (except race conditions).
The best technique I know of is to avoid doing pointer operations and dynamic allocation directly. In C++, use reference parameters in preference to pointers. Use STL containers rather than rolling your own lists and containers. Use std::string instead of char*. Failing all that, take Rob K's advice and use RAII wherever you need to do allocations.
For C, there are some similar things you can try to do, but you are pretty much doomed. Get a copy of Lint and pray for mercy.
Use a good compiler and set the warning level to max.
Wrap new/malloc and delete/free and bookkeep all allocations/deallocations
Replace raw arrays with an array class that does bounds checking (or use std::vector) (harder to do in C)
See 2.
This is hard; there exist some special debuggers, such as Jinx, that specialize in this, but I don't know how good they are.
Make sure you understand when to place objects on the heap and when on the stack. As a general rule, only put objects on the heap if you must; this will save you lots of trouble. Learn the STL and use the containers provided by the standard library.
Take a look at my earlier answer to "Any reason to overload global new and delete?" You'll find a number of things here that will help with early detection and diagnosis, as well as a list of helpful tools. Most of the tools and techniques can be applied to either C or C++.
It's worth noting that valgrind's memcheck will spot 4 of your items, and helgrind may help spot the last (data races).
One common pattern I use is the following.
I keep the following three private variables in all allocator classes:
size_t news_;
size_t deletes_;
size_t in_use_;
In the allocator constructor, all of these three are initialized to 0.
Then on,
whenever allocator does a new, it increments news_, and
whenever allocator does a delete, it increments deletes_
Based on that, I put a lot of asserts in the allocator code, such as:
assert( news_ - deletes_ == in_use_ );
This works very well for me.
Addition: I place the assert as a precondition and postcondition on all non-trivial methods of the allocator. If the assert blows, then I know I am doing something wrong. If the assert does not blow, with all the testing I can do, then I get reasonably sufficient confidence in the memory-management correctness of my program.
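Here is a self-contained sketch of that pattern (malloc stands in for whatever the real allocator does). The bookkeeping is deliberately redundant: the assert fires on any code path that updates one counter but forgets another.

#include <cassert>
#include <cstddef>
#include <cstdlib>

class CountingAllocator {
    std::size_t news_ = 0, deletes_ = 0, in_use_ = 0;

    void check() const { assert(news_ - deletes_ == in_use_); }

public:
    void* allocate(std::size_t bytes) {
        check();                        // precondition
        void* p = std::malloc(bytes);
        if (p) { ++news_; ++in_use_; }
        check();                        // postcondition
        return p;
    }

    void deallocate(void* p) {
        check();
        if (p) { std::free(p); ++deletes_; --in_use_; }
        check();
    }
};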
1)uninitialized variable use
Automatically detected by the compiler (turn warnings up to full and treat warnings as errors).
2)delete a pointer two times
Don't use raw pointers. Every pointer should live inside either a smart pointer or some form of RAII object that manages the lifetime of the pointer.
3)writing array out of bounds
Don't do it. That's a logic bug.
You can mitigate it by using a container and a method that throws on out-of-bounds access (vector::at()).
4)failing to deallocate memory
Don't use raw pointers. See (2) above.
5)race conditions
Don't allow them. Allocate resources in priority order to avoid conflicting locks, then lock objects whenever there is a potential for multiple write accesses (or for read access when it matters).
One you forgot:
6) dereferencing a pointer after it has been freed.
So far everyone seems to be answering "how to prevent", not "how to debug".
Assuming you're working with code which already has some of these issues, here are some ideas on debugging.
uninitialized variable use
The compiler can detect a lot of this. Initializing RAM to a known value helps in debugging those that escape. In our embedded system, we do a memory test before we leave the bootloader, which leaves all the RAM set to 0x5555. This turns out to be quite useful for debugging: when an integer == 21845, we know it was never initialized.
delete a pointer two times
Visual Studio should detect this at runtime. If you suspect this is happening on other systems, you can debug by replacing the delete call with custom code, something like:
// stamp each freed block so a second free trips the assert (system_delete is a placeholder for the real deallocation)
void checked_delete(void* p) { assert(*(void**)p != p); system_delete(p); *(void**)p = p; }
writing array out of bounds
Visual Studio should detect this at runtime. On other systems, add your own sentinels:
int headZONE = 0xDEAD;
int array[whatever];
int tailZONE = 0xDEAD;
// add this line to check for overruns
// - place it using binary search to zero in on the trouble spot
assert(headZONE == tailZONE && tailZONE == 0xDEAD);
failing to deallocate memory
Watch the heap growth. Record the free heap size before and after points which create and destroy objects; look for unexpected changes. Possibly write your own wrapper around the memory functions to track blocks.
race conditions
aaargh. Make sure you have a logging system with accurate timestamping.