I'd like to know how the functions in the standard C++ library actually work (not just their MSDN descriptions). I mean: how does a function allocate, manage, and deallocate memory, and return you the result? Where do you look, and what do you need to know, to understand that?
You can look at the library headers. A lot of functionality is actually implemented there because the library is highly templatized (and templates generally need to be implemented in headers). The location of the headers depends on the compiler, but you should be able to find them quite easily (e.g. search for a file named algorithm).
You may also ask the compiler to preprocess your code to see all the related headers (this will produce extremely long output). With GCC you can do this with g++ -E yoursource.cc.
If what you are looking for isn't implemented in headers, you need the library sources, which are generally not installed by default and may not be available at all for commercial compilers such as MSVC. Look for glibc (the C library) and libstdc++ (the C++ library), which are the ones used by GCC and some other compilers.
In any case, notice that the standard library implementations tend to be rather cryptic due to a lot of underscores being used in variable names and such (to avoid name collisions with user's macros), and often they are also infested with #ifdefs and other preprocessor cruft.
You need to know the techniques used to write C++ libraries. Getting Bjarne Stroustrup's book is a good start. Also, SGI has very detailed documentation on the STL at a suitably high level of abstraction.
If you are going to be investigating the Windows-based stuff, you might want to study the systems part of the Windows library.
To complement the Windows side: understanding the POSIX specification is also important.
First a few basic data-structure principles, then a note and some links about allocators...
The STL containers use a number of different data structures. The map, set, multimap and multiset are normally implemented as binary trees with red-black balancing rules, for example, and deque is typically a set of fixed-size blocks indexed through a small array of block pointers, giving array-like indexing with cheap growth at both ends.
None of the data structures are actually defined by the standard - but the specified performance characteristics limit the choices significantly.
Normally, your contained data is contained directly in the data structure nodes, which are held (by default) in heap allocated memory. You can override the source of memory for nodes by providing an allocator template parameter when you specify the container - more on that later. If you need the container nodes to reference (not contain) your items, specify a pointer or smart-pointer type as the contained type.
For example, in an std::set, the nodes will be binary tree nodes with space in them for an int, the two child pointers, and the metadata that the library needs (e.g. the red/black flag). The binary tree node will not move around your application's address space, so you can store pointers to your data item elsewhere if you want, but that isn't true for all containers - e.g. an insert in a vector moves all items above the insert point up by one, and may have to reallocate the whole vector, moving all items.
The container class instance itself is normally very small - a few pointers is typical. For example, std::set and friends usually have a root pointer, a pointer to the lowest-key node and a pointer to the highest-key node, and probably a bit more metadata.
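To make that concrete, here is a rough sketch of what such a node and container header might look like - illustrative names only, not any particular library's actual layout:

#include <cstddef>

struct SetNode {
    SetNode* parent;
    SetNode* left;
    SetNode* right;
    bool     red;    // the red/black balancing flag (the metadata)
    int      value;  // the contained item lives directly in the node
};

struct SetHeader {          // the std::set object itself stays small
    SetNode*    root;
    SetNode*    leftmost;   // lowest key, for O(1) begin()
    SetNode*    rightmost;  // highest key
    std::size_t count;      // cached size()
};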
One issue the STL faces is creating and destroying instances in multi-item nodes without creating/destroying the node. This happens in std::vector and std::deque, for instance. I don't know, strictly, how the STL does it - but the obvious approach requires placement new and explicit destructor calls.
Placement new allows you to create an object in an already-allocated piece of memory. It basically calls the constructor for you. It can take parameters, so it can call a copy constructor or other constructor, not just the default constructor.
http://www.devx.com/tips/Tip/12582
To destruct, you literally call the destructor explicitly, via a (correctly typed) pointer.
((mytype*) (void*) x)->~mytype ();
This works even if you haven't declared an explicit destructor, and even for built-in types like "int" that don't need destructing (a so-called pseudo-destructor call).
Likewise, to assign from one constructed instance to another, you make an explicit call to operator=.
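Putting those three pieces together, a minimal sketch (Widget is a made-up type here; raw must be pre-allocated storage of suitable size and alignment):

#include <new>   // placement new

struct Widget {
    int n;
    Widget(int n) : n(n) {}
};

void demo(void* raw)
{
    Widget* w = new (raw) Widget(42);  // placement new: construct in place
    *w = Widget(7);                    // assign between constructed instances
    w->~Widget();                      // explicit destructor call
}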
Basically, the containers are able to create, copy and destroy data within an existing node fairly easily, and where needed, metadata tracks which items are currently constructed in the node - e.g. size() indicates which items are currently constructed in an std::vector - there may be additional non-constructed items, depending on the current capacity().
EDIT - It's possible that the STL can optimise by using (directly, or in effect) std::swap rather than operator= to move data around. This would be good where the data items are (for example) other STL containers, and thus own lots of referenced data - swapping could avoid lots of copying. I don't know if the standard requires this, or allows but doesn't mandate it. There is a well-known mechanism for doing this kind of thing, though, using a "traits" template. The default "traits" can provide an assignment-using method whereas specific overrides may support special-case types by using a swapping method. The abstraction would be a move where you don't care what is left in the source (original data, data from target, whatever) as long as it's valid and destructible.
In binary tree nodes, of course, there should be no need for this as there is only one item per node and it's always constructed.
The remaining problem is how to reserve correctly-aligned and correctly-sized space within a node struct to hold an unknown type (specified as a template parameter) without getting unwanted constructor/destructor calls when you create/destroy the node. This will get easier in C++0x, since a union will be able to hold non-POD types, giving a convenient uninitialised-space type. Until then, there's a range of tricks that more-or-less work with different degrees of portability, and no doubt a good STL implementation is a good example to learn from.
Personally, my containers use a space-for-type template class. It uses compiler-specific allocation checks to determine the alignment at compile-time and some template trickery to choose from an array-of-chars, array-of-shorts, array-of-longs etc of the correct size. The non-portable alignment-checking tricks are selected using "#if defined" etc, and the template will fail (at compile time) when someone throws a 128-bit alignment requirement at it because I haven't allowed for that yet.
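For comparison, here is roughly what that space-for-type trick reduces to once alignas is available (C++11); before that, the char/short/long array selection described above was the fallback. A sketch only:

// Uninitialised, correctly-aligned space for a T inside a node.
// No T constructor or destructor runs until you placement-new into it.
template<class T>
struct RawSlot {
    alignas(T) unsigned char bytes[sizeof(T)];

    T* ptr() { return reinterpret_cast<T*>(bytes); }
};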
How to actually allocate the nodes? Well, most (all?) STL containers take an "Allocator" template parameter, which defaults to std::allocator. That standard implementation gets memory from and releases it to the heap. Implement the right interface and it can be replaced with a custom allocator.
Doing that is something I don't like to do, and certainly not without Stroustrup's "The C++ Programming Language" on my desk. There are a lot of requirements to meet in your allocator class, and at least in the past (things may have improved), compiler error messages were not helpful.
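For orientation, this is roughly the minimum a C++11 allocator has to provide once std::allocator_traits fills in the defaults - a sketch with illustrative names, not a drop-in implementation:

#include <cstddef>
#include <cstdlib>
#include <new>

template<class T>
struct MyAllocator {
    using value_type = T;

    MyAllocator() = default;
    template<class U> MyAllocator(const MyAllocator<U>&) {}

    T* allocate(std::size_t n) {
        if (void* p = std::malloc(n * sizeof(T)))
            return static_cast<T*>(p);
        throw std::bad_alloc();
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// Stateless, so any two instances are interchangeable.
template<class T, class U>
bool operator==(const MyAllocator<T>&, const MyAllocator<U>&) { return true; }
template<class T, class U>
bool operator!=(const MyAllocator<T>&, const MyAllocator<U>&) { return false; }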
Google says you could look here, though...
http://www2.roguewave.com/support/docs/leif/sourcepro/html/toolsug/12-6.html
http://en.wikipedia.org/wiki/Allocator_%28C%2B%2B%29
Operating system functions to allocate/free memory are not really relevant to the C++ standard library.
The standard library containers will (by default) use new and delete for memory, and that uses a compiler-specific runtime which almost certainly manages its own heap data structure. This approach generally suits typical application use, whereas the platform-specific operating system heap is usually better suited to allocating large blocks.
The application heap will allocate/free memory from the operating system heap, but "how?" and "when?" are platform-specific and compiler-specific details.
For the Win32 memory management APIs, look here...
http://msdn.microsoft.com/en-us/library/ms810603.aspx
I'm sure you can find win64 equivalents if needed.
I don't have this book, but according to its description, http://www.amazon.com/C-Standard-Template-Library/dp/0134376331 includes
- Practical techniques for using and implementing the components
Isn't this what you want?
I was told recently in a job interview that their project aims at building the smallest possible binary for their application (it runs embedded), so I would not be able to use things such as templates or smart pointers, as these would increase the binary size; they generally seemed to imply that using things from std would be a no-go (though not in all cases).
After the interview, I tried to research online which coding practices and which features from the standard library cause large binary sizes, and I could find basically nothing about this. Is there a way to quantify the size impact of using certain features (without, for example, having to write 100 smart pointers in a code base versus self-managed pointers)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I've both worked for and interviewed at companies that enforced these kinds of restrictions; I've even voluntarily enforced them myself. It really comes down to this: when you're running an engineering organization with more than one person, with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing "bad things" without knowing they're doing "bad things".
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, the de-templated function may actually end up bigger in code size, simply because the indirect calls take several instructions each and remove optimization potential.
Another aspect where they can at least do no harm is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
#include <cstddef>

class VectorBase
{
protected:
    // Type-erased storage: an array of void* element pointers.
    void** start;
    void** end;
    void** capacity;
    void push_back(void*);
    void* at(std::size_t i);
    void clear(void (*cleanup_function)(void*));
};

template<class T>
class Vector: public VectorBase
{
public:
    void push_back(T* value)
    { this->VectorBase::push_back(value); }
    T* at(std::size_t i)
    { return static_cast<T*>(this->VectorBase::at(i)); }
    ~Vector()
    { clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type-safety and to provide necessary indirections without emitting any code that wouldn't have been here anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example, even after a unique_ptr has been moved away, the compiler still has to emit destructor code for it. The programmer knows that the value is null; the compiler often doesn't.
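A sketch of that dead-code problem (Foo and sink are made-up names):

#include <memory>

struct Foo {};
void sink(std::unique_ptr<Foo>) {}

void caller() {
    auto p = std::make_unique<Foo>();
    sink(std::move(p));
    // p's destructor still runs here and must test the pointer for null,
    // even though a human reader can see that p is empty.
}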
Another issue comes up with calling conventions. Objects with destructors are usually passed in memory (on the stack) rather than in registers, even if you declare them pass-by-value. The same goes for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz), simply because the pointers have to be put on and taken off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means that if a function accepts a complex object like a smart pointer by value, the call to the destructor for that parameter is replicated at every call site, instead of appearing once inside the function. Especially with unique_ptr this is wasteful, because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting, weak pointers; what indirection is used for destruction? Do you really need two raw pointers per shared pointer, or can the reference counter be accessed through the shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track code size as a metric.
For example, I once added a small feature - I don't remember which - and its error handling used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't like, for example, Rust's, where objects can be moved with a simple memcpy and their original location is then simply "forgotten". In C++ the destructor of a moved-from object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
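A sketch of why: the move operations must leave the source in a destructible state, and the destructor must handle that state - all code that a memcpy-style move wouldn't need (Buffer is a made-up example):

struct Buffer {
    char* data = nullptr;

    Buffer(Buffer&& other) noexcept : data(other.data) {
        other.data = nullptr;            // source must stay destructible
    }
    Buffer& operator=(Buffer&& other) noexcept {
        delete[] data;
        data = other.data;
        other.data = nullptr;
        return *this;
    }
    ~Buffer() { delete[] data; }         // still runs for moved-from objects
};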
Qt for example accounts for such simple cases in its meta type system.
I'm wrestling with some pain being caused by std::allocator_traits::construct. In order for a container to be a "conforming" user of the allocator concept, it needs to use construct rather than placement new to construct objects. This is very sticky for me. Currently I have a class (class A) that is designed to be allocator aware, and at some point it needs to create another instance of some other class (class B) in allocated memory. The problem is that class B implements the construction of the new object. If I could use placement new, this wouldn't be an issue: A would handle allocation, pass B the memory address, and B would construct into that. But since the construction needs to be performed via construct, I need to inject the allocator type into B, templating it, which creates a huge mess.
It's bad enough that I am seriously considering just using placement new, and static asserting that my instance of the allocator does not have a construct method (note that the static construct function calls the instance method if it exists, otherwise it calls placement new). I have never felt the tiniest urge to write a construct method for an allocator. The cost of making this part of the allocator concept seems very high to me; construction has gotten entangled with allocation, where allocators were supposed to help separate them. What justifies the existence of construct/destruct? Insight into the design decision, examples of real (not toy) use cases, or thoughts on the gravity of electing to simply use placement new appreciated.
There is a similar question: std::allocator construct/destroy vs. placement new/p->~T(). It was asked quite a long time ago, and I don't find the answer accepted there sufficient. Logging is a bit trite as a use case, and even then: why is the allocator logging the actual construction of objects? It can log allocations and deallocations in allocate and deallocate; it doesn't answer the question in the sense of: why was construction made a province of the allocator in the first place? I'm hoping to find a better answer; it's been quite a few years and much about allocators has changed since then (e.g. allocators being allowed to be stateful since C++11).
A few points:
There really isn't a std container concept. The container requirements tables in the standard are there to document the containers specified by the standard.
If you have a container that wants to interact with std::allocator_traits<Alloc>, all you have to do is assume that Alloc conforms to the minimum C++11 allocator requirements and interact with it via std::allocator_traits<Alloc>.
You are not required to call std::allocator_traits<Alloc>::construct.
You are forbidden from calling Alloc::construct because it may not exist.
The standard-specified containers are required to call std::allocator_traits<Alloc>::construct only for container::value_type, and are forbidden from using std::allocator_traits<Alloc>::construct on any other types the container may need to construct (e.g. internal nodes).
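Concretely, the traits layer is what makes Alloc::construct optional - std::allocator_traits falls back to placement new (and to an explicit destructor call for destroy) when the allocator doesn't provide them. A sketch of the interaction:

#include <memory>

template<class Alloc>
void build_one(Alloc& a) {
    using Traits = std::allocator_traits<Alloc>;
    using T = typename Traits::value_type;

    T* p = Traits::allocate(a, 1);
    Traits::construct(a, p);    // a.construct(p) if it exists, otherwise
                                // ::new (static_cast<void*>(p)) T()
    Traits::destroy(a, p);      // a.destroy(p) if it exists, else p->~T()
    Traits::deallocate(a, p, 1);
}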
Why was construct included in the "allocator concept" way back in C++98?
Probably because the committee at the time felt that this would ease dealing with x86 near and far pointers -- a problem that no longer exists today.
That being said, std::scoped_allocator_adaptor is a modern real-world example of an allocator that customizes both construct and destroy. For the detailed specification of those customizations I point you towards the latest C++1z working draft, N4567. The spec is not simple, and that is why I'm not attempting to reproduce it here.
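A small sketch of what that customized construct buys you (std::allocator stands in here for a stateful allocator of your own):

#include <scoped_allocator>
#include <vector>

template<class T> using MyAlloc = std::allocator<T>;  // stand-in

using Inner = std::vector<int, MyAlloc<int>>;
using Outer = std::vector<Inner,
                          std::scoped_allocator_adaptor<MyAlloc<Inner>>>;

void demo() {
    Outer v;
    v.emplace_back(3, 42);  // the adaptor's construct() forwards the outer
                            // allocator to the newly built Inner vector
}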
What are the pros/cons of using the built-in std::list instead of your own linked-list implementation based on pointers, like in C?
Are there special cases where one is preferred over the other?
There are plenty of good reasons to use std::list instead of your own linked list implementation:
- std::list is guaranteed (via your standard library's implementation of the standard) to do what it says on the tin: no bugs, plus the exception-safety and thread-safety guarantees given by the standard.
- std::list does not require you to spend time developing and testing it.
- std::list is well known, so anybody else ever working with the code (or yourself later in life) can understand what's going on without needing first to get to grips with a custom linked-list implementation.
I cannot really think of any good reason to use your own custom linked list.
std::list is usually implemented as a doubly-linked list. If you only need a singly-linked list, you should consider std::forward_list.
Finally, if you're concerned with performance, you shouldn't use linked lists at all. Elements in a linked list are necessarily allocated individually (and often inserted at random places), so that processing a linked list generally results in many cache misses, each giving a performance hit.
Typically, you want to use std::list, as answered by #Walter.
However, a list implemented by "intrusively" integrating the next (and prev, if any) pointer directly into the contained objects, can avoid several disadvantages of std::list and the other STL containers, which may or may not be relevant to you (quoted from Boost.Intrusive documentation):
- An object can only belong to one container: if you want to share an object between two containers, you either have to store multiple copies of those objects or you need to use containers of pointers: std::list<Object*>.
- The use of dynamic allocation to create copies of passed values can be a performance and size bottleneck in some applications. […]
- Only copies of objects are stored in non-intrusive containers. Hence copy or move constructors and copy or move assignment operators are required. Non-copyable and non-movable objects can't be stored in non-intrusive containers.
- It's not possible to store a derived object in an STL container while retaining its original type.
The second point is probably not applicable for most typical usages of lists, where you would dynamically allocate the elements anyway.
If the last point is relevant to you, you may be interested in Boost.PointerContainer ‒ although a std::list<std::unique_ptr<Obj>> usually also does the job well enough.
Instead of completely implementing a list yourself, have a look at the aforementioned Boost.Intrusive library.
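For a feel of what "intrusive" means, here is a minimal sketch (Boost.Intrusive does this properly; the names here are made up for illustration):

struct ListHook {
    ListHook* prev = nullptr;
    ListHook* next = nullptr;
};

struct Widget {
    int      value = 0;
    ListHook hook;   // the link pointers live inside the object itself,
                     // so inserting a Widget never allocates a node
};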
The answer provided by Walter covers the main reasons to prefer the STL implementation. The main reason to consider a classic C-style implementation is increased performance. The cost of this increased performance is primarily the potential for errors. This can be addressed with testing and the inclusion of some appropriate asserts (checks for null pointers, and so on).
Contrary to the statements in Walter's answer, there are cases where a high-performance list is a good data structure choice.
If you need the performance of a custom list but want to avoid the work of constructing and testing your own check out the boost intrusive lists (singly and doubly linked) at:
http://www.boost.org/doc/libs/1_39_0/doc/html/intrusive.html
These will get you the same performance as a custom construction with (almost) the convenience of the stl versions.
I know the reference-counting technique, but had never heard of the mark-sweep technique until today, when reading the book "Concepts of Programming Languages".
According to the book:
The original mark-sweep process of garbage collection operates as follows: the runtime system allocates storage cells as requested and disconnects pointers from cells as necessary, without regard for storage reclamation (allowing garbage to accumulate), until it has allocated all available cells. At this point, a mark-sweep process is begun to gather all the garbage left floating around in the heap. To facilitate the process, every heap cell has an extra indicator bit or field that is used by the collection algorithm.
From my limited understanding, smart pointers in C++ libraries use the reference-counting technique. I wonder: is there any C++ library whose smart pointers use this kind of implementation instead? And since the book is purely theoretical, I cannot visualize how the implementation would be done. An example to demonstrate this idea would be greatly valuable. Please correct me if I'm wrong.
There is one big difficulty in using garbage collection in C++: identifying what is a pointer and what is not.
If you can tweak a compiler to provide this information for each and every object type, then you're done; but if you cannot, then you need to use a conservative approach: scanning the memory for any pattern that may look like a pointer. There is also the difficulty of "bit stuffing" here, where people stuff bits into pointers (the higher bits are mostly unused on 64-bit platforms) or XOR two different pointers together to "save space".
Now, in C++0x the Standard Committee introduced a standard API to help implement garbage collection. In n3225 you can find it at 20.9.11 Pointer safety [util.dynamic.safety]. This supposes, of course, that implementations provide those functions and that programs call them where needed:
void declare_reachable(void* p); // throws std::bad_alloc
template <typename T> T* undeclare_reachable(T* p) noexcept;
void declare_no_pointers(char* p, size_t n) noexcept;
void undeclare_no_pointers(char* p, size_t n) noexcept;
pointer_safety get_pointer_safety() noexcept;
When implemented, it will allow you to plug any garbage collection scheme (defining those functions) into your application. It will of course require some work to actually provide those operations wherever they are needed. One solution could be to simply override new and delete, but it does not account for pointer arithmetic...
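For illustration, the intended usage looks roughly like this - a sketch only; in practice most implementations made these functions near no-ops, and the whole facility was later removed again in C++23:

#include <cstdint>
#include <memory>

void hide_pointer() {
    int* p = new int(42);
    std::declare_reachable(p);   // keep *p alive even while p is disguised
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p) ^ 0xDEADBEEF;
    // ... a collector scanning memory would not recognise 'bits' ...
    p = std::undeclare_reachable(reinterpret_cast<int*>(bits ^ 0xDEADBEEF));
    delete p;
}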
Finally, there are many strategies for Garbage Collection: Reference Counting (with Cycle Detection algorithms) and Mark And Sweep are the main different systems, but they come in various flavors (Generational or not, Copying/Compacting or not, ...).
Although they may have upgraded it by now, Mozilla Firefox used to use a hybrid approach in which reference-counted smart pointers were used when possible, with a mark-and-sweep garbage collector running in parallel to clean up reference cycles. It's possible other projects have adopted this approach, though I'm not fully sure.
The main reason that I could see C++ programmers avoiding this type of garbage collection is that it means that object destructors would run asynchronously. This means that if any objects were created that held on to important resources, such as network connections or physical hardware, the cleanup wouldn't be guaranteed to occur in a timely fashion. Moreover, the destructors would have to be very careful to use appropriate synchronization if they were to access shared resources, while in a single-threaded, straight reference-counting solution this wouldn't be necessary.
The other complexity of this approach is that C++ allows for raw arithmetic operations on pointers, which greatly complicates the implementation of any garbage collector. It's possible to conservatively solve this problem (look at the Boehm GC, for example), though it's a significant barrier to building a system of this sort.
I need to create a custom allocator for std:: containers (particularly, and initially, for std::vector), but it might eventually come to be used with others.
The reason I need to create a custom allocator is that I need to track allocated (heap & stack) resources by individual components of the application (this is an inherent feature of the application). I will need the custom allocator to monitor the heap portion of the resources, so it is essential that I'm able to pass to the std::vector constructor something like
trackerId idToTrackUsage;
myAlloca<int> allocator(idToTrackUsage);
vector<int, myAlloca<int>> Foo(allocator);
However, after reading a bit I found this little bomb about the STL / C++ standard (see references) saying that all allocator instances of a given type should be equivalent (that is, == should return true for any two instances) and, most damning, any allocator must be able to deallocate memory allocated by any other instance (that is, without having a way to know what that other instance might be). In short, allocators cannot have state.
So I'm trying to find the best way around this. Any clever ideas? I really really REALLY don't want to have to keep a custom version of std::vector around.
EDIT: I read about scoped allocators for C++0x at http://www2.research.att.com/~bs/C++0xFAQ.html#scoped-allocator but I couldn't really get far into understanding how this applies to my problem. If anyone thinks C++0x alleviates this problem, please comment.
References:
Allocator C++ article in Wikipedia
Some random further reading courtesy of Google
Aside from the obvious answer ("if you violate any requirement, that's undefined behavior, good night and thanks for playing"), I imagine the worst that would likely happen, is that the vector implementation can rely on the requirement that "all instances of the allocator class are interchangeable" in the obvious way:
vector(const Allocator &youralloc = Allocator()) {
const Allocator hawhaw;
// use hawhaw and ignore youralloc.
// They're interchangeable, remember?
}
Looking at the source, GCC's vector implementation (which I think is based eventually on SGI's original STL implementation) does sort-of store a copy of the allocator object passed into that constructor, so there's some hope that this won't happen.
I'd say try it and see, and document what you've done very carefully, so that anyone attempting to use your code on an implementation that you haven't checked, knows what's going on. Implementers are encouraged in the standard to relax the restrictions on allocators, so it would be a dirty trick to make it look as though they're relaxed when really they aren't. Which doesn't mean it won't happen.
If you're really lucky, there's some documentation for your container implementation that talks about allocators.
You could, of course, leave a pointer to whatever state you need in any allocated block. This does of course mean that any per-block state must be stored in that block, and the allocator instances would act more like handles than actual objects in and of themselves.
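A sketch of that "state in the block" idea (names are illustrative; over-alignment is ignored for brevity):

#include <cstddef>
#include <cstdlib>
#include <new>

struct Tracker;  // your per-component usage tracker

template<class T>
T* tracked_allocate(Tracker* t, std::size_t n) {
    void* raw = std::malloc(sizeof(Tracker*) + n * sizeof(T));
    if (!raw) throw std::bad_alloc();
    *static_cast<Tracker**>(raw) = t;   // header records the owning tracker
    return reinterpret_cast<T*>(static_cast<Tracker**>(raw) + 1);
}

template<class T>
void tracked_deallocate(T* p, std::size_t /*n*/) {
    Tracker** header = reinterpret_cast<Tracker**>(p) - 1;
    // *header tells you which tracker to charge the deallocation to
    std::free(header);
}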
Making the allocator state static would do the trick, if you're able to work with that. It does mean that all allocators of that type will have to share their state, but from your requirements, that sounds like it could be acceptable.
To respond to your edit: yes, in C++0x or C++11, allocators can have state.