Good Idea / Bad Idea Should I Reimplement Most Of C++? [closed] - c++

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Recently, I've got a dangerous idea into my head after reading this blog post. That idea can be expressed like this:
I don't need most of what the C++ standard library offers. So, why don't I implement a less general, but easier to use version?
As an example, using the STL spits out reams of incomprehensible and mangled compiler errors. But, I don't care about allocators, iterators and the like. So why don't I take a couple of hours and implement an easy to use linked list class, for example?
What I'd like to know from the StackOverflow community is this: what are the dangers, possible disadvantages and possible advantages to "rolling my own" for most of the existing functionality in C++?
Edit: I feel that people have misunderstood me about this idea. The idea was to understand whether I could implement a very small set of STL functionality that is greatly simplified - more as a project to teach me about data structures and the like. I don't propose re-inventing the entire wheel from the ground up, just the part that I need and want to learn about. I suppose what I wanted to figure out is whether the complexity of using the STL warrants the creation of smaller, simpler version of itself.
Re-using boost or similiar.
Most of what I code is for University and we're not allowed to use external libraries. So it's either the C++ standard library, or my own classes.
Objectivity of this question.
This question is not subjective. Nor should it be community Wiki, since it's not a poll. I want concrete arguments that highlight one advantage or one disadvantage that could possibly occur with my approach. Contrary to popular belief, this is not opinion, but based on experience or good logical arguments.
Format.
Please post only one disadvantage or one advantage per answer. This will allow people to evaluate individual ideas instead of all your ideas at once.
And please...
No religious wars. I'm not a fan boy of any language. I use whatever's applicable. For graphics and data compression (what I'm working on at the moment) that seems to be C++. Please constrain your answers to the question or they will be downvoted.

So, why don't I implement a less
general, but easier to use version?
Because you can't. Because whatever else you might say about C++, it is not a simple language, and if you're not already very good at it, your linked list implementation will be buggy.
Honestly, your choice is simple:
Learn C++, or don't use it. Yes, C++ is commonly used for graphics, but Java has OpenGL libraries too. So does C#, Python and virtually every other language. Or C. You don't have to use C++.
But if you do use it, learn it and use it properly.
If you want immutable strings, create your string as const.
And regardless of its underlying implementation, the STL is remarkably simple to use.
C++ compiler errors can be read, but it takes a bit of practice. But more importantly, they are not exclusive to STL code. You'll encounter them no matter what you do, and which libraries you use. So get used to them. And if you're getting used to them anyway, you might as well use STL too.
Apart from that, a few other disadvantages:
No one else will understand your code. If you ask a question on SO about std::vector, or bidirectional iterators, everyone who's reasonably familiar with c++ can answer. If you ask abut My::CustomLinkedList, no one can help you. Which is unfortunate, because rolling your own also means that there will be more bugs to ask for help about.
You're trying to cure the symptom, rather than the cause. The problem is that you don't understand C++. STL is just a symptom of that. Avoiding STL won't magically make your C++ code work better.
The compiler errors. Yes, they're nasty to read, but they're there. A lot of work in the STL has gone into ensuring that wrong use will trigger compiler errors in most cases. In C++ it's very easy to make code that compiles, but doesn't work. Or seems to work. Or works on my computer, but fails mysteriously elsewhere. Your own linked list would almost certainly move more errors to runtime, where they'd go undetected for a while, and be much harder to track down.
And once again, it will be buggy. Trust me. I've seen damn good C++ programmers write a linked list in C++ only to uncover bug after bug, in obscure border cases. And C++ is all border cases. Will your linked list handle exception safety correctly? Will it guarantee that everything is in a consistent state if creating a new node (and thereby calling the object type's constructor) throws an exception? That it won't leak memory, that all the appropriate destructors will be called? Will it be as type-safe? Will it be as performant? There are a lot of headaches to deal with when writing container classes in C++.
You're missing out on one of the most powerful and flexible libraries in existence, in any language. The STL can do a lot that would be a pain even with Java's giant bloated class library. C++ is hard enough already, no need to throw away the few advantages it offers.
I don't care about allocators,
iterators and the like
Allocators can be safely ignored. You pretty much don't even need to know that they exist. Iterators are brilliant though, and figuring them out would save you a lot of headaches. There are only three concepts you need to understand to use STL effectively:
Containers: You already know about these. vectors, linked lists, maps, sets, queues and so on.
Iterators: Abstractions that let you navigate a container (or subsets of a container, or any other sequence of value, in memory, on disk in the form of streams, or computed on the fly).
Algorithms: Common algorithms that work on any pair of iterators. You have sort, for_each, find, copy and many others.
Yes, the STL is small compared to Java's library, but it packs a surprising amount of power when you combine the above 3 concepts. There's a bit of a learning curve, because it is an unusual library. But if you're going to spend more than a day or two with C++, it's worth learning properly.
And no, I'm not following your answer format, because I thought actually giving you a detailed answer would be more helpful. ;)
Edit:
It'd be tempting to say that an advantage of rolling your own is that you'd learn more of the language, and maybe even why the STL is one of its saving graces.. But I'm not really convinced it's true. It might work, but it can backfire too.
As I said above, it's easy to write C++ code that seems to work. And when it stops working, it's easy to rearrange a few things, like the declaration order of variables, or insert a bit of padding in a class, to make it seemingly work again. What would you learn from that? Would that teach you how to write better C++? Perhaps. But most likely, it'd just teach you that "C++ sucks". Would it teach you how to use the STL? Definitely not.
A more useful approach might be utilizing the awesome power of StackOverflow in learning STL the right way. :)

Disadvantage: no one but you will use it.
Advantage: In the process of implementing it you will learn why the Standard Library is a good thing.

Advantages: eating your own dogfood. You get exactly what you do.
Disadvantages: eating your own dogfood. Numerous people, smarter than 99 % of us, have spent years creating STL.

I suggested you learn why:
using the STL spits out reams of
incomprehensible and mangled compiler
errors
first

Disadvantage: you may spend more time debugging your class library than solving whatever university task you have in front of you.
Advantage: you're likely to learn a lot!

There is something you can do about the cryptic compiler STL error messages. STLFilt will help simplify them. From the STLFilt Website:
STLFilt simplifies and/or reformats
long-winded C++ error and warning
messages, with a focus on STL-related
diagnostics (and for MSVC 6, it fully
eliminates C4786 warnings and their
detritus). The result renders many of
even the most cryptic diagnostics
comprehensible.
Have a look here and, if you are using VisualC, also here.

I think you should do it.
I'm sure I'll get flambayed for this, but you know, every C++ programmer around here has drunk a little too much STL coolaid.
The STL is a great library, but I know from first hand experience that if you roll your own, you can:
1) Make it faster than the STL for your particular use cases.
2) You'll write a library with just the interfaces you need.
3) You'll be able to extend all the standard stuff. (I can't tell you how much I've wished std::string had a split() method)...
Everyone is right when they say that it will be a lot of work. Thats true.
But, you will learn a lot. Even if after you write it, you go back to the STL and never use it again, you'll still have learned a lot.

A bit of my experience : Not that long ago I have implemented my own vector-like class because I needed good control on it.
As I needed genericity I made a templated array.
I also wanted to iterate through it not using operator[] but incrementing a pointer like a would do with C, so I don't compute the address of T[i] at each iteration... I added two methods one to return pointer to the allocated memory and another that returns a pointer to the end.
To iterate through an array of integer I had to write something like this :
for(int * p = array.pData(); p != array.pEnd(); ++p){
cout<<*p<<endl;
}
Then when I start to use vectors of vectors I figure out that when it was possible a could allocate a big bloc of memory instead of calling new many times. At this time I add an allocator to the template class.
Only then I notice that I had wrote a perfectly useless clone of std::vector<>.
At least now I know why I use STL...

Disadvantage : IMHO, reimplimenting tested and proven libraries is a rabit hole which is almost garanteed to be more trouble than it's worth.

Another Disadvantage:
If you want to get a C++ job when you're finished with University, most people who would want to recruit you will expect that you are familiar with the Standard C++ library. Not necessarily intimately familiar to the implementation level but certainly familiar with its usage and idioms. If you reimplement the wheel in form of your own library, you'll miss out on that chance. This is nonwithstanding the fact that you will hopefully learn a lot about library design if you roll your own, which might earn you a couple of extra brownie points depending on where you interview.

Disadvantage:
You're introducing a dependency on your own new library. Even if that's sufficient, and your implementation works fine, you still have a dependency. And that can bite you hard with code maintenance. Everyone else (including yourself, in a year's time, or even a month's) will not be familiar with your unique string behavior, special iterators, and so on. Much effort will be needed just to adapt to the new environment before you could ever start refactoring/extending anything.
If you use something like STL, everyone will know it already, it's well understood and documented, and nobody will have to re-learn your custom throwaway environment.

You may be interested in EASTL, a rewrite of the STL Electronic Arts documented a while back. Their design decisions were mostly driven by the specific desires/needs in multiplatform videogame programming. The abstract in the linked article sums it up nicely.

Advantage
If you look into MFC, you'll find that your suggestion already is used in productive code - and has been so for a long time. None of MFC's collection classes uses the STL.

Why don't you take a look at existing C++ libraries. Back when C++ wasn't quite as mature, people often wrote their own libraries. Have a look at Symbian (pretty horrible though), Qt and WxWidgets (if memory serves me) have basic collections and stuff, and there are probably many others.
My opinion is that the complexity of STL derives from the complexity of the C++ language, and there's little you can do to improve on STL (aside from using a more sensible naming convention). I recommend simply switching to some other language if you can, or just deal with it.

Disadvantage : You're university course is probably laid out like this for a reason. The fact that you are irritated enough by it (sarcasm not intended), may indicate you are not getting the paridigm, and will benefit a lot when you have a paradigm shift.

As an example, using the STL spits out
reams of incomprehensible and mangled
compiler errors
The reason for this is essentially C++ templates. If you use templates (as STL does) you will get reams of incomprehensible error messages. So if you implement your own template based collection classes you will not be in any better spot.
You could make non template based containers and store everything as void pointers or some base class e.g. But you would lose compile time type checks and C++ sucks as a dynamic language. It is not as safe to do this as it would be in e.g. Objective-C, Python or Java. One of the reasons being that C++ does not have a root class for all classes to all introspection on all objects and some basic error handling at runtime. Instead your app would likely crash and burn if you were wrong about the type and you would not be given any clues to what went wrong.

Disadvantage: reimplementing all of that well (that is, at a high level of quality) will certainly take a number of great developers a few years.

what are the dangers, possible disadvantages and possible advantages to "rolling my own" for most of the existing functionality in C++?
Can you afford and possibly justify the amount of effort/time/money spent behind reinventing the wheel?
Re-using boost or similiar.
Rather strange that you cannot use Boost. IIRC, chunks of contribution come in from people related to/working in universities (think Jakko Jarvi). The upsides of using Boost are far too many to list here.
On not 'reinventing the wheel'
Disadvantage: While you learn a lot, you also set yourself back, when you come to think of what your real project objectives are.
Advantage: Maintenance is easier for the folks who are going to inherit this.

STL is very complex because it needs to be for a general purpose library.
Reasons why STL is the way it is:
Based on interators so standard algorithms only need a single implementation for different types of containers.
Designed to behave properly in the face of Exceptions.
Designed to be 'thread' safe in multi threaded applications.
In a lot of applications however you really have enough with the following:
string class
hash table for O(1) lookups
vector/array with sort / and binary search for sorted collections
If you know that:
Your classes do not throw exceptions on construction or assignment.
Your code is single threaded.
You will not use the more complex STL algorithms.
Then you can probably write your own faster code that uses less memory and produces simpler compile/runtime errors.
Some examples for faster/easier without the STL:
Copy-on-Write string with reference counted string buffer. (Do not do this in a multi-threaded environment since you would need to lock on the reference count access.)
Use a good hash table instead of the std::set and std::map.
'Java' style iterators that can be passed around as a single object
Iterator type that does not need to know the type of the container (For better compile time decoupling of code)
A string class with more utility functions
Configurable bounds checking in your vector containers. (So not [] or .at but the same method with a compile or runtime flag for going from 'safe' to 'fast' mode)
Containers designed to work with pointers to objects that will delete their content.

It looks like you updated the question so now there are really two questions:
What should I do if I think the std:: library is too complex for my needs?
Design your own classes that internally use relevant std:: library features to do the "heavy lifting" for you. That way you have less to get wrong, and you still get to invent your own coding interface.
What should I do if I want to learn how data structures work?
Design your own set of data structure classes from the ground up. Then try to figure out why the standard ones are better.

Related

Is it Bad Practice to use C++ only for the STL containers?

First a little background ...
In what follows, I use C,C++ and Java for coding (general) algorithms, not gui's and fancy program's with interfaces, but simple command line algorithms and libraries.
I started out learning about programming in Java. I got pretty good with Java and I learned to use the Java containers a lot as they tend to reduce complexity of book keeping while guaranteeing great performance. I intermittently used C++, but I was definitely not as good with it as with Java and it felt cumbersome. I did not know C++ enough to work in it without having to look up every single function and so I quickly reverted back to sticking to Java as much as possible.
I then made a sudden transition into cracking and hacking in assembly language, because I felt I was concentrated too much attention on a much too high level language and I needed more experience with how a CPU interacts with memory and whats really going on with the 1's and 0's. I have to admit this was one of the most educational and fun experiences I've had with computers to date.
For obviously reasons, I could not use assembly language to code on a daily basis, it was mostly reserved for fun diversions. After learning more about the computer through this experience I then realized that C++ is so much closer to the "level of 1's and 0's" than Java was, but I still felt it to be incredibly obtuse, like a swiss army knife with far too many gizmos to do any one task with elegance. I decided to give plain vanilla C a try, and I quickly fell in love. It was a happy medium between simplicity and enough "micromanagent" to not abstract what is really going on. However, I did miss one thing about Java: the containers. In particular, a simple container (like the stl vector) that expands dynamically in size is incredibly useful, but quite a pain to have to implement in C every time. Hence my code currently looks like almost entirely C with containers from C++ thrown in, the only feature I use from C++.
I'd like to know if its consider okay in practice to use just one feature of C++, and ignore the rest in favor of C type code?
The short answer is, "This is not really the most effective way to use C++."
When used correctly, the strong type system, the ability to pass by reference, and idioms like RAII make C++ programs more likely to be correct, readable, and maintainable.
No one can stop you from using the language the way you want to. But you may be limiting yourself by not learning and leveraging actual C++ features.
If you write code that other people will have to read and maintain, they will probably appreciate the use of "real C++" instead of "C with classes" (in the words of a previous commenter).
Seems fine to me. That's the only part of C++ that I really use as well.
Right now, I'm writing a number cruncher. There's no polymorphism, no control delegation, no interaction. <iostream> was a bottleneck so I rewrote I/O in C.
The functions are mostly inside one class which represents a work thread. So that's not so much OO as having thread-local variables.
As well as vector, I use <algorithms> pretty heavily. But the heavy-duty data structures are written in plain C. Mainly circular singly-linked lists, which can't even easily have distinct begin() and end(), meaning not only containers but sequences (and for-loops) are off-limits. And then templates help the preprocessor to generate the main inner loop.
The most natural way of solving your problem is probably right. You don't want solutions in search of a problem. Learning to use C++ is well and good, but object orientation is suited to some problems and not others.
On the other hand, using bsearch from stdlib.h in a C++ program would be wrong.
You should use C++ in whatever way makes the most sense for you.

Learning C++ and overcautiousness

How should I learn C++? I hear that the language gives enough rope to shoot myself in the head, so should I treat every C++ line I write as a potential minefiled?
How should I learn C++?
Refer to:
Books to refer for learning OOP through C++
https://stackoverflow.com/questions/631793/good-book-to-learn-c-from
The Definitive C++ Book Guide and List
https://stackoverflow.com/questions/1122921/suggested-c-books
https://stackoverflow.com/questions/1686906/what-is-a-very-practical-c-book
https://stackoverflow.com/questions/681551/a-c-book-that-covers-non-syntax-related-problems
I hear that the language gives enough
rope to shoout myself in the head, so
should I treat every C++ line I write
as a potential minefiled?
A C++ statement can do either what you want it to do, or something else. This depends on your understanding of what that C++ statement means. But this is not specific to this language.
By learning the language, and using techniques for building correct software (like Object Oriented design, especially Design by Contract, and testing techniques), you will be able to guarantee that your program behaves as you intended it to.
I love your metaphor! What Stroustrup actually said was:
http://en.wikiquote.org/wiki/Bjarne_Stroustrup
C makes it easy to shoot yourself in
the foot; C++ makes it harder, but
when you do it blows your whole leg
off.
This was many years ago. I started learning C++ in ca. 1991 and it really was a minefield. There were no common libraries, no debuggers and the AT&T approach used a C code generator. There are now many good IDEs which support C++.
Personally I moved to Java because I find it a cleaner language but C++ is fine as long as you don't try to be tricky. Avoid native C constructs where there are existing class libraries (Stroustrup initially did not provide a String class as he though it a useful "rite of passage" to have to write one!) Now you can use a proven one.
I'm assuming you have no choice in the language. How you go about it depends on where you are coming from. C++ is not the easiest of the object-oriented languages to start on and Stroustrup's book is not necessarily the best intro.
UPDATE the OP is worried about blowing themselves up when learning the language. Generally it's a good idea to start with a subset of what one will do later. I assumed the OP is worried about:
Things which you have to know and use whatever level you program at (such as destructors)
Things which add additional complexity to the learning process and can be shelved until later (such as multiple inheritance)
what follows are some places where I blew myself up... They are not subjective, they happened!
There are some up-front gotchas that don't exist in Java or C#.
destructors. You have to manage your own memory. Failing to write destructors will blow your fingers and toes off.
equality. You will have to write an equals method (in simple Java you may get away without it)
copy constructor. Ditto. a = b will invoke this. Bites you in the bottom.
And I'd suggest avoiding multiple inheritance unless you really need it. Then avoid it anyway.
And avoid operator overloading. It looks cute to write:
vector1 = vector2 + vector3;
but
vector1 = vector2.plus(vector3);
is just as clear, only a few more characters, and you can search for it.
Well, it's not a minefield.
Really, the most problems are related to anything related to pointers, so you'll have to understand them (which it's not easy at first) and be careful when using them.
I think it's more a question of experience, having all the basics clear and trying to get a clear design since the beggining.
More than a minefield, I think it's like going to the most dangeours neighbourhood in your town. Yes, it's dangerous, but only for the ones without the attitude. :-D
I would say that that C++ is a challenging environment, if not a minefield. The fundamental issue is that problem symptoms and problem causes are not always easy to tie up. As Khelben has said one major reason for that is that we have pointers to deal with and hence we can do quite a lot of damage when pointers are not pointing where we think they are.
So you need to pay special attention when dealing with arrays and pointers, out-by-one errors can result in memory corruption and these then result in interesting problem manifestations.
Every formal language is a minefield. There're less mines in managed environments. For instance, in C# if you overblow an array you won't cause someone else's remote function to do strange things. You won't have code run differently in tests and prod because someone forgot to initialize a variable in constructor.
However, these are the easy ones. You learn to avoid them, and then you stay with the real mines, which are there in every language.
More specifically, these are some of the most important points when moving to C++:
always initialize variables. even theoretical possibility of having your program logic depend on what was in the memory beforehand is a nightmare.
dependencies: avoid data members of other compound types (classes) without pimpl idiom. This will make your users exposed to the inner workings of the types you use, and increase compilation time dramatically. Dependencies are your enemy.
in C++, you can optimize for performance in ridiculously huge number of ways. don't. Unless you are in the innermost loop of a heavy math software, and even then don't.
avoid DLLs on windows. They don't work with singletons, causing problems to popular libraries.
use boost, shared pointers whenever you can. avoid reinventing the wheel and regular pointers.
use std::string, smart containers instead of arrays. These are dangerous. It will be faster than managed containers anyway.
use RAII. This one is priceless.
prefer data members to inheritance, or you will expose the base type definition to your type's users.
learn to avoid nested includes with forward declarations.
How should I learn C++?
depends. where are you coming from? anyway, I'd suggest:
use an up-to-date compiler such as gcc-4.4 or 4.5
C++0x is worth it for the type inference alone (local variables don't need explicit type designations)
write small, standalone, short-lived utilities (try porting such tools written in other languages)
STL has complex parts, but the basic things are easy, don't shy away from it. FMPOV it embodies the spirit of C++
use state-of-the-art C++ libraries: stuff like Boost.Foreach, Boost.Tuple, Boost.Regex or Boost.Optional turn C++ into serious competition in the scripting department
when you're comfortable:
learn to generalize your code with templates
learn to use RAII
then:
add C libraries to the mix. this might be the first time you'll need to tinker with pointers and casts!
add OOP if you feel like it
should I treat every C++ line I write as a potential minefiled?
be cautious, but don't worry too much. it's true that you can't know what a + b means without knowing the whole program containing such an expression because of operator overloads and argument-dependent lookup, and I've seen many people whine about it. a killing counter-argument is that you cannot really know what a->plus(b) does in Java or a scripting language in the face of inheritance: all methods are virtual, yoyo effect in extremis! (this does kill me in large codebases with rampant inheritance written in languages w/o ADL or operator overloading!)
anecdotes from my experience learning basics of C and C++:
C: unless you do something really, really, really stupid, the program will compile just fine, and SIGSEGV or SIGBUS as soon as you run it
C++: unless you do something really, really, really "clever", the program will either fail to compile, or compile and do what you mean (a mantra Perl "inherited" from Interlisp as I've been told).
a ranty post scriptum:
C++ can be used as a much higher-level language than C: whereas you can't do almost anything beyond simple arithmetic without pointers in C, it's possible to write complete programs in C++ without a pointer in sight, save for char **argv.
there's a whole class of programs that can be implemented in C++ using it as a "scripting" language with unparalleled runtime speed and simple runtime environment (the "dll hell" is nothing compared to the volatility of real scripting languages).
however, the "scripting language" cloak is a leaky abstraction: it's built from native C++ mechanisms such as ADL, operator overloading and templates, and that has its price. get ready for abysmal compile times and unintelligible error messages. OTOH, at least the error messages can be greatly improved with tools like STLfilt, and I think it's well worth it overall.
one thing where C++ really shines in contrast to environments such as Java (perhaps C# too? don't know that one that well) is destructors (vs finalizers and GC). it's one of the pillars of the "scriptiness" of the language. whereas GC adds a whole level of semantic complexity (things don't cease to exist as soon as they're inaccessible from the program) and syntactic verbiage and duplication (finally), destructors are the workhorse of natural semantics and obviate code duplication that's unavoidable with finally.
BTW, "enough rope to shoot myself in the head" almost killed me. I think I'll borrow it. ;)
C++ has some pitfalls certainly, but writting safe code is certainly possible.
Some things to think about. There are far from the only things for writting safe C++ code but they seem like a good start.
Use std::string and std::vector to store strings and collections rather than C style strings and native arrays. They are much easier to get right.
When you allocate an object using new always think about who owns the pointer to this memory and is responsible for deleting it. If you can't think of a single owner for the data that manages it's lifetime, then either rethink your design or think about using a "smart pointer" to manage the lifetime.
Prefer indexing into arrays rather than using pointer arithmetic where possible. Whenever you index into an array every time ask your self "How do I know that this index can only index a valid index in the array".
If a class has a pointer to some data then write methods to act on that data. Don't write methods that return that pointer or at some point you'll end up using the pointer after the data has been deleted. (Not always possible but something to aim for)
If you write simple code that uses strings and vectors and as much as possible encapsule pointers as members of classes that both manage the lifetime of the data and provide the methods that act on that data then that's a good strarting point.
As others have said, read effective c++ and other books.
In C++, foot shoots YOU.
The question is do you need it for anything? If you want to make game code, 3d tools, or something similar you pretty much have to have it. If not, you don't. The errors people are afraid of are seldom big killers but there are plenty of other things that will come up if you make a large enough project.
You may find this spoof interview with Bjarne Stroustrup to be enlightening:
http://www-users.cs.york.ac.uk/susan/joke/cpp.htm
The syntax of C++ is easy, just like Java or C# with pointers. So learning C++ is fast.
The hard thing is that when it comes to a project, C++ is harder to use and more error prone compared to Java or C#. It is just too flexible and the programmer is responsible for too many things.
In a 100 lines of code, you don't need to worry about memory and null pointers at all as you can find them quickly. But when it comes to 10000 lines of code, memory management could be hard. The exception mechanism in C++ is also weak. Thirdly, you need to worry the null pointer problem in C++ in a big project.
I look at the dilemma from a different perspective. The more discipline you have in development the faster you can develop quality robust code. Assembly requires more discipline than C. C requires more discipline than C++.
Don't worry about hanging yourself, blowing your foot or leg off. Just work on improving your quality develop process. For example, a code review will help regardless of the language. Unit testing and test frameworks will also save some bloodshed. Everything boils down to project deadlines and money.

Programmer productivity with STL vs. custom utility classes

I work in an environment where, for historical reasons, we are using a hodgepodge of custom utility classes, most of which were written before STL arrived on the scene. There are many reasons why I think it is a good idea to at least write new code using STL, but the main one is that I believe it increases programmer productivity. Unfortunately, that is a point that I don't know how to prove.
Are there any studies out there that attempt to quantify, or even just hint at, a productivity gain from using STL?
Update: I guess I should have been more specific. I am not advocating rewriting existing code, and I am not worried about the new hires that will get a running start. The former would be foolish, and the latter by itself is not enough to convince people.
There are no studies that will show STL is more productive just because it is STL. Productivity gains from using it are due to it being a standard programmers are familiar with it, and because the code is already written and tested.
If your company already has utility classes that employees are familiar with, and this utility code is used throughout an existing codebase, then switching to STL could actually be detrimental to productivity.
That said for new code using STL would be a good move. I would not necessarily argue this from a productivity standpoint, but one of maintainability. If you have code that predates STL it sounds like code at your company has quite a long lifetime and is likely to be maintained by a number of new programmers over the years.
You may also want to broach the use of STL as a way for everyone to keep their C++ skillset sharp. While I wouldn't reject a C++ candidate who didn't know STL, I would definitely view it as a black mark.
The reason the STL is so "good" is simply because it's been around a long time now and the implementations have seen a lot of users and eyes. They're well debugged and the algorithms have been pretty well optimized by vendors.
STL will be more productive for new devs in your shop because it's likely they are already familiar with it. Your custom libs will be foreign and probably lacking in features that devs are accustomed to using. That's not a huge issue after the initial ramp-up period for new devs though.
There's no real pressing reason to shift to STL just because it is. If you have perfectly useful utility classes in your application I'd recommend sticking with them unless they're not workable for new code. Mixing STL with your custom libraries in new code is going to cause compatibility problems at some point and refactoring the code to use all STL is going to introduce bugs. Either way you'll be losing productivity.
Edit: By "new" code, I mean new code in existing applications using the older class libraries.
If you're developing new standalone apps that do not draw on any of the old applicaiton code I'd recommend STL because it's there, most every C++ dev knows how to use it and it's pretty stable (and you get support from your toolset vendor if it's not).
The only argument for increasing productivity with STL I can think of would be a possibility of integrating other libraries easily - Boost, Arabica, SOCI, etc. They are all designed to work with STL, not your in-house container classes.
I think the critical factor is the turnover rate of the development team.
If the team is stable, and they've been working on this code for ten years and probably will be ten years from now, they will be more productive using the utility classes they're familiar with, and nudging them to use the STL will result in reduced productivity and bugs.
If the team turns over a lot, then the new developers will likely be familiar with the STL, and obviously not this proprietary utility class library, and will do better with the STL.
I have been in the same situation. It depends on the utility classes in question. Many custom utility classes I have seen have been buggy, or poorly designed, and are causing bugs and inefficiencies throughout the source. In this case, a straight replacement with STL equivalents will improve the codebase, increase productivity, decrease bugs and reduce maintainance costs for the future.
You can sometimes help to bridge the gap between the two worlds by retrofitting STL-style iterators to your old classes, though be careful to get them right!
It may be that the reverse is true where the home-grown utilities are in any way domain-specific to the work you do.
One place where STL is useful is not in the actual libraries etc that are provide by STL but rather in the coding style that STL uses.
In the following there are 2 main tests required:
struct DoSomething {
bool operator ()(Container::value_type const & v) const
{
// ...
}
};
void bar (Container const & c)
{
std::for_each (c.begin (), c.end (), DoSomething ());
}
The first tests verify that 'DoSomething::operator()' does what it's supposed to do. The second test only needs to verify that 'DoSomething::operator()' is called.
This separation can help improve productivity when compared with a hand written for loop. For the above, the number of tests is the sum of the "thing to be done" and the tests that the call is made for a non empty container. In the case of the for loop the number of tests is the product of the tests.
The reasons I see are:
Quality of the code
The writers of the STL (either its interface, or the implementation for your compilers) are without doubts magnitudes better than the best developer in your company.
Their job is to make a usable STL, which means that every function/method/object is heavily tested anyway.
Maintainability of the code
With the turnover, some code can become slowly and stealthily unmaintained. Which means that if there is a bug in this code, or if its performance or interface are found lacking (see the "Quality" above), then you'll find no one to expertly improve it.
The code change will have a non-zero probability of a code regression elsewhere, and if your in-house library is not unit-tested, that this regression will go undetected for quite some time.
And I'm not even mentioning the "I WON'T ever touch that slimy code" syndrome when someone trying to correct the code just finds he doesn't understand why it was done this way (because of macros, strange pre-conditions, etc..
Conclusion
Combine the two, and you'll find that for generic code (i.e. arrays, strings, etc.), you'll go better by slowly migrating from in-house libraries to STL library, writting "converter functions" when needed.
I think this kind of migration is part of the maintenance of your code, and that you should slowly (i.e. with new code, or when refactoring), whenever possible, use STL instead of in-house generic libraries.
Since you already have the utility classes, there's no need for using the STL. It will quickly become a maintainability problem as you find the need to begin integrating the STL into your utility libraries. IMO, it will be a productivity loss.
That said, the STL may offer a lot of features and utilities that your utility libraries don't provide. (Particularly useful is the <algorithms> header, which you can probably begin using immediately without many problems with legacy code.) If you're writing a lot of new code, it's far better to use the STL (and Boost, if you can) than to write your own utility classes and algorithms. As a result, it will be a major productivity gain.
I don't know of any studies on the topic, though.
Consider the situation when you hire a new C++ programmer. Is s/he more likely to have experience with STL classes or your own ones? You could be looking at huge savings in time before they become productive if you start to switch to the STL.
Currently, any newly hired employee will have to learn how to use the old classes, and that costs time and money. Using STL like everyone else would save your company money and keep it attractive among the pool of candidates.
I don't think there is a way to justify replacing these custom classes wholesale, unfortunately. If possible, replace them with STL components a class at at time.
This is a maintenance & readability issue as much as anything. (1) Using STL helps maintainers understand the code at a glance, versus needeing to learn their way around custom implementations. (2) The STL is WELL documented and well understood - what about your custom classes? (3) The STL is well vetted. (4) The STL comes with asymptotic runtime upper & lower bounds.
Good luck.

Boost considered harmful? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Lots of the answers to C++ questions here contain the response:
"You should be using boost::(insert
your favourite smart pointer here) or
even better boost::(insert your
favourite mega complex boost type
here)"
I'm not at all convinced that this is doing any favours to the questioners who, by and large, are obvious C++ novices. My reasons are as follows:
Using smart pointers without
understanding what is going on under
the hood is going to lead to a
generation of C++ programmers who
lack some of the basic skills of a
programmer. Pretty much this seems to
have happened in the Java field
already.
Deciding which type of smart pointer
to use depends very much on the
problem domain being addressed. This
is almost always missing from the
questions posted here, so simply
saying "use a shared pointer" is
likely to be at the least unhelpful
and possibly totally wrong.
Boost is not yet part of the C++
standard and may not be available on
the specific platform the questioner
is using. Installing it is a bit
painful (I just did it using Jam) and
is way overkill if all you want are a
few smart pointers.
If you are writing FOSS code, you
don't want the code to be heavily
dependent on external libraries that,
once again, your users may not have.
I've been put off using FOSS code on
a number of occasions simply because
of the Byzantine complexity of the
dependencies between libraries.
To conclude, I'm not saying don't recommend Boost, but that we should be more careful when doing so.
Few points:
Using anything without understanding is considered harmful. But it is only the ignorant technology user (and his manager) who gets burned in the end.
You don't have to install boost to get the smart pointers - they are header only. And installation itself is rather straightforward, in the simplest approach just typing one or two commands.
Many of the Boost libraries and solutions are present in TR1 or will be present in C++0x
You will always depend on external libraries... Try to choose the one that have a bright future in terms of maintenance and support.
Unless you want to roll-out your custom solution - which would have some advantages and disadvantages.
C++ is not a novice-friendly language. With apologies to Scott Meyers, a beginner isn't learning just one language with C++, but four:
The C parts
Object Oriented parts: classes, inheritance, polymorphism, etc.
The STL: containers, iterators, algorithms
Templates and metaprogramming
I would argue that if the beginner is already climbing this mountain, they should be pointed towards the more "modern" aspects of C++ from the start. To do otherwise means that the beginner will learn C-ish C++ with regular pointers, resource leaks, etc. Find themselves in a world of pain, and then discover Boost and other libraries as a way to stem the hurt.
It's a complicated picture no matter what, so why not point them in a direction that has a positive pay-off for the invested mental efort?
As for dependencies, a great deal of Boost is header-only. And Boost's liberal license should permit its inclusion in just about any project.
Do you know how the compiler works ? Do you know how the OS works ? Do you know how the processor works ? Do you know how electronics works ? Do you know how electricity works ?
At some point you are using a black box, the question is, "is my ignorance problematic for what I am currently doing?".
If you have the taste for knowledge that's a great thing - and I clearly consider that a plus when interviewing engineers - but don't forget the finality of your work : build systems that solve problems.
I disagree. No-one would suggest that you should dive in to smart pointers without a thorough understanding of what's going on behind the scenes, but used sensibly they can remove a whole host of common errors. Moreover, Boost is high-quality production code from which a C++ novice can learn a great deal, in terms of design as much as implementation. It's not all hugely complicated, either, and you can pick and choose the bits you need.
It's impossible to understand everything thoroughly all the time. So take the word of many professional C++ developers for it that many parts of boost are indeed very useful things to use in your day-to-day development.
The inclusion of quite a lot of boost in C++0X is testament that even the team that manages the evolution of the language thinks that boost is a Good Thing (tm)
C++ is a weird, tough language. It's relatively easy to learn compared to how incredibly hard it is to master. There's some really arcane stuff you can do with it. Boost::mpl builds on some of those arcane things. I love boost, but I cringe every time I see someone in my organisation use boost::mpl. The reason: even quite seasoned C++ developers have trouble wrapping their head around how it works, and the code that uses it often reflects that (it ends up looking like someone banged code out until it worked). This is not a good thing, so I partially agree that some parts of boost should not be used without caution (boost::spirit is another example).
The C++ standard is also a weird thing. Most common compilers don't implement all of the existing standard (e.g. template exports). It's only a guideline of what to expect.
If your developer doesn't have the savvy to decide which smart pointer to use in a particular situation, perhaps they shouldn't be messing around in that part of the code without senior guidance.
There are always external libraries, starting with the run-time. A lot of boost is header-only so it does not introduce new external dependencies.
Quite frankly, for beginners I think boost isn't that well-suited. I think a beginner is better off understanding how the basics work before moving up the food chain using higher level tool/libs like boost or even STL. At the beginner stage it is not about productivity, it is about understanding. I think knowing how pointers work, being able for instance to manually create a linked list or sort one are part of the fundamentals that each programmer should learn.
I think boost is a great library. I love it. My favourite library is boost::bind and boost::function, which make function pointers much more flexible and easy-to-use. It fits in very well with different frameworks and keeps the code tidy.
I also use different Boost classes. For example, I use boost::graph to create graph classes and I use boost::filesystem for working with files inside directories.
However, boost is very complex. You need to be an experienced programmer to know its worth. Moreover, you need to have atleast some experience in C++ to understand how Boost works and implications of using Boost here or there.
Therefore, I would highly recommend looking at Boost for experienced programmers, especially if they are trying to re-invent the wheel (again). It can really be what it says on the tin: a boost towards your goal.
However, if you feel that the person asking a question is a beginner and tries to understand (for example) memory allocation, telling him to try boost smart pointers is a very bad idea. It's not helpful at all. The advantages of smart pointer classes, etc. can be comprehended only when the person experienced how standard memory allocation techniques work.
To finish off, Boost is not like learning to drive a car with automatic gearbox. It's like learning to drive on a F1 racing car.
I fully agree with you. It is the reason that i first explain them how it should be done (i.e when recommending boost::variant, i explain they should in general use a discriminated union. And i try not to say it's just a "magic boost thing" but show how they in principle implemented it. When i recommend boost::shared_ptr, i explain they would need to use a pointer - but it's better to use a smart pointer that has shared ownership semantics.). I try not to say just "use boost::xxx" when i see the questioner is a beginner. It is a language that's not just as simple to use as some scripting language. One has to understand the stuff one uses, because the language does not protect the programmer from doing bad things.
Of course it's not possible for novices to understand everything from the start on. But they should understand what their boost library solves and how it does it basically.
You can't compare this with learning processors or assembly language first. Similar it's not important to know how the bit-pattern of a null-pointer looks like. Knowledge of those are irrelevant in learning programming with C++. But pointers, array or any other basic things in C++ is not. One doesn't get around learning them before using [boost|std]::shared_ptr or [boost|std]::array successfully. These are things that has to be understood first in order to use the boost utilities successfully in my opinion. It's not about details like how to manually implement the pimpl-idiom using raw pointers - that's not the point I'm making. But the point is that one should first know basic things about pointers or the other parts a boost library helps with (for pointers, what they are and what they are good for, for example). Just look at the shared_ptr manual and try to get it without knowing about pointers. It's impossible.
And it's important to always point them to the appropriate boost manual. Boost manuals are high quality.
The consensus among almost all the answers is that boost is very valuable for experienced developers and for complex, real world, C++ software. I completely agree.
I also think that boost can be very valuable for beginners. Isn't it easier to use lexical_cast than to use ostringstream? Or to use BOOST_FOREACH instead of iterator syntax? The big problem is lack of good documentation of boost, especially for beginners. What is needed is a book that will tell you how to start with boost, which libraries are simple libraries that simplify tasks, and which libraries are more complex. Using these libraries together with good documentation will IMO make learning C++ easier.
We should encourage the use of standard canned libraries (and Boost is almost as standard as they get) whenever possible.
Some people seem to think that beginners should be taught the C side of C++ first, and then introduced to the higher-level stuff later. However, people tend to work as they're trained, so we're going to see a lot of production code written with badly managed raw pointers (well-managed raw pointers are awfully difficult sometimes), arrays (and the inevitable confusion between delete and delete []), and stuff like that. I've worked with code like that. I don't want to do it again any more than I have to.
Start beginners off with the way you want them writing code. This means teaching them about the STL containers and algorithms and some of the Boost libraries at first, so the first thing they think about when needing a group of things is a vector<>. Then teach them the lower-level constructs, so they'll know about them (or where to look them up) when they encounter them, or on the very rare occasions when they need to micro-optimize.
There's basically two types of programmers: the coders, who should be taught languages the way they should be writing them, and the enthusiast, who will learn the low-level stuff, including principles of operating systems, C, assembly code, and so on. Both are well served by learning the language they're going to use up front, while only the enthusiasts will be well served by learning from some arbitrary level of fundamentals.
I think you are mixing a lot of different concerns, not all of them related to Boost specifically:
First, should programmers (or C++ novices specifically) be encouraged to use libraries, idioms, paradigms, languages or language features they don't understand?
No, of course not. Every programmer should understand the tools they use, especially in a language like C++. However, I don't see a lot of questions here on SO where people are encouraged to not understand the code they're using. When people say they want to do X in C++, I think it's find to say "Boost has an implementation of X which works, which is more than a homebrewed solution would do, so use that".
Of course if the question is "how does X work", the question can't be answered with "use Boost's implementation". But I really don't see the problem in recommending Boost for the former kind of questions.
I also don't see how it's even possible to use Boost without understanding what's going on under the hood. C++, with or without Boost, is not Java. Using Boost in no way protects you from the complexities of the language. You still have to worry about copy constructors, pointer arithmetics, templates and everything else that can blow up in your face.
This is nothing like what happened in Java. They designed a language that removed all the subtleties. Boost doesn't do that. Quite the contrary, it has pioneered new idioms and techniques in generic programming. Using Boost is not always simple.
About the availability of Boost, I think that's a non-issue. It is available on the platforms used in the vast majority of questions, and if they're not able to use Boost, the suggestion is still not harmful, just useless.
Further, most Boost libraries are header-only and don't require you to install the whole thing. If you only want smart pointers, simply include those headers and nothing else.
About FOSS, you have a point in some cases But I'd say this is a problem for less universal libraries that users do not have. But Boost is extremely common, and if people don't have it, they should get it, as it is applicable to pretty much any problem domain. And of course, the license is compatible with any FOSS project you care to mention.
I'd rather work on a OSS project that used Boost to do the heavy lifting than one which reinvented its own (buggy and proprietary) wheels, with steep learning curves that could have been avoided.
So yeah, in some cases, recommending Boost is unhelpful. But I don't see how it can be harmful.
In any case, I don't see how it can be even half as harmful as teaching novices to roll their own. In C++, that's a recipe for disaster. It's the sole reason why C++ still has a reputation for being error-prone and produce buggy software. Because for far too long, people wrote everything from scratch themselves, distrusting the standard library, distrusting 3rd party code, distrusting everything that wasn't legal in C.
I'm not at all convinced that this is doing any favours to the questioners who, by and large, are obvious C++ novices. ...:
Using smart pointers without understanding what is going on under the hood is going to lead to a generation of C++ programmers who lack some of the basic skills of a programmer.
Do we tell novice programmers that they must learn assembly language before they get to read up on modern programming languages? They clearly don't know what's going on under the hood otherwise.
Should "Hello World" include an implementation of the I/O subsystem?
Personally I learned how to construct objects before I learned how to write classes. I think I learned how to use STL vectors before I learned C-style arrays. I think it's the right approach: "here's how to refer to several nearly identical variables using a std::vector, later I'll show you what's swept under the rug via C-style arrays and new[] and delete[]."
I disagree. Of course you will always know more about the internal workings of everything when coding it from scratch than when using 3rd party libraries. But time and money are limited, and using good 3rd party libraries such as boost is a very good way to save your resources.
I can see your point, but understanding something does not mean that you have to rewrite everything from scratch.
They are not "standard" but they are as standard as a library can get.
It is true that deploying them can be painful (but not all of the sublibraries require compilation); on the other hand they do not have further dependencies on their own, so I wouldn't be too worried about that part neither.
I agree with you, high level libraries hide things from you. It might be a good idea in the short run, but in the long run, the novice will have severe gaps in their understanding of the language.
It's easy for us non-novices to say "just use this library" because we've been down that long hard road of learning things the hard way, and naturally we want to save someone else the trouble of doing the same.
Novices SHOULD have to struggle with rolling their own low-level solutions to problems. And then, when they've got a better understanding of how their own solution worked, they can use the third-party solution, confident that they have some idea of what's going on under the hood. They'll use that library better!
I think this is a broader subject than just being about Boost. I completely regret picking up VB as my first language. If I had just started with ugly, hard to learn c, I'd be years ahead of where I am now.
I would agree with the point about smart pointers. I am a C++ beginner, and when asking a simple question about pointer syntax, one answer suggested smart pointers were the way to go. I know I'm not ready for boost (I'm not really ready for the STL either), so in most cases I steer myself away from that type of suggestion.
Scoped and dynamic resource ownership are general basic neeeds and boost's implementation of'em is very good an highly recommended. I use them a lot and they work fine.
Boost is a great library. I really hope that it grows in breadth and acceptance. Use it, extend it, and promote it.
One of the great things about the .NET community is that it has a great base class library. One of the fundemental problems with C++, I believe, is the minimalistic C++ standard library. Anywhere you go to develop code, FOSS or corporate, there is some selection of libraries that are used since there isn't a broad standard library. So you end up being a INSERT_YOUR_COMPANY_HERE C++ programmer and not necessarily too transferrable. Yes, you design/architecture skills transfer, but there is the learning curve with picking up familiarity with whatever set of libraries the next place is using. Where as a .NET developer will basically be using the same class library and can hit the ground running. Also, the libraries that are built (and reused) have a broader base to build on.
Just as an aside, you can use http://codepad.org for a code paste bin and it supports boost!
I have worked for companies who have viewed boost as library to avoid due in part to its past reputation as a poorly managed project. I know things have changed with the project, but commercial projects who want to use boost must be aware of the source of the code contained in the library, or at least be assured that they're not going to be liable for IP or patent infringements.
Sadly, the library has this reputation and it will take a while for it to break before it sees wide use in the commercial sector. I also feel this is a reason not to recommend it blindly.

Does using boost pointers change your OO design methodology?

After switching from C++ to C++ w/boost, do you think your OOD skills improved?
Do you notice patterns in "Normal" C++ code that you wouldn't consider that you've switched, or do you find that it enables a more abstract design?
I guess I'm really wondering if you just use it as a tool, or if you change your entire approach to OO design to make more efficient use of objects when using boost pointers.
Edit:summary
This question was kind of strange--I was asking because I've run into so much C++ code that was not at all OO. I'm fairly sure (with that and my work on it before moving to a managed language) that it's harder to think in OO in C++ than a managed language.
From looking at these posts, I'm guessing that you learn the value of OO before finding a need for a better way to manage memory, so by the time you start looking for something like Boost, you're already using OO methodologies pretty heavily.
I was kind of expecting a bunch of answers saying that it helped them think in OO, but now that I think about it, if you aren't using OO, boost pointers are not very helpful, and you wouldn't see the need for them (so you wouldn't have replied).
In a project in C++ I was doing about six years ago, we implemented our own boost-like automatic pointer scheme. It worked pretty well, except for the various bugs in it. (Sure wish we had used boost...)
Nonetheless, it really didn't change how we developed code. Object oriented design, with or without managed pointers, is very similar. There's times when you need to return objects, or times when pointers to objects are more important. The nice thing about smart pointers has only a small amount to do with how you design your application. Instead of passing a potentially dangerous memory leak around, you can pass that same data and be fairly certain that it's not going to leak.
In that respect, there are some things you can tend to do more with smart pointers: simplify your code. Instead of returning integers or basic structures every where, you can more freely pass complicated data structures or classes without worry. You can build more complex apps, faster, without having to worry so much. It lets you have the raw power of C and C++ when you need it (why would you be using C or C++ if you didn't need it?) and have the ease of memory management that's such an amazing productivity boost. If automatically managed memory wasn't useful, it wouldn't be in almost every other language on the planet!
STL/Boost is a tool for the job. They help me implement my ideas not the other way around. Clarification: Boost did not boost up my OOD skills.
It has deeply changed my way of coding, and i'm spreading the word. Through the use of Boost.Graph and Boost.PropertyMap in particular, i realized that i could write "true" algorithms in a simple class, not (yet) knowing how to access information, not even knowing (or caring) what sub-actions might be done while executing the algorithm.
My team is now designing complex computing functionality using a graphical tool.
One might argue that templates are really the base of this change, but Boost clearly paved the way. To me, discovering new Boost libraries is very often a great opportunity to learn important stuff that can be applied to our everyday work !
Once I discovered boost::bind (and boost::function) I found instead of thinking in terms of inheritance and abstract base classes ("interfaces" in java/c#-speak) I started seeing everything as a functor.
For example, pre-boost I'd have built a menu system where the menus were containers of IActionable* items and anything which wanted to be hooked into the menu system would have to inherit IActionable and provide an action method. Post-boost and I'm implementing menus containing boost::function<void()> objects and just throwing anything I want into them using boost::bind.
Another thing: just looking at the way in which boost successfully employs templates really made me raise my expectations of what was possible with them and make the effort to make better use of them in my own code, so I'm writing a lot more "generic" and less "OOP" code.
The smart pointers are certainly useful and get a lot of coverage, but apart from cleaning up some explicit deletes they're hardly a paradigm shift.
For me, it didn't change the way I do design, but Boost does give me additional tools so that certain things are easier. For example, with "smart" pointers, I no longer have to think about making sure certain object creations have to destroyed at the proper time (mostly in the exceptional case). But like any tool, I have to understand when to use them and when NOT to.