Which is faster, string.empty() or string.size() == 0? - c++

Recently, during a discussion I was asked by a fellow programmer to do some code changes. I had something like:
if( mystring.size() == 0)
// do something
else
// do something else
The discussion was regarding the use of mystring.empty() to validate if the string is empty. Now, I agree that it can be argued that string.empty() is more verbose and readable code, but are there any performance benefits to it?
I did some digging and found these 2 answers pertaining to my question:
Implementation from basic_string.h
SO Answer that points to ISO Standard - here
Both the answers buttress my claim that the string.empty() is just more readable and doesn't offer any performance benefits, compared to string.size() == 0.
I still want to be sure, if there are any implementations of string that keep an internal boolean flag to validate if a string is empty or not?
Or there are other ways that some implementations use, that would nullify my claim??

The standard defines empty() like this:
bool empty() const noexcept;
Returns: size() == 0.
You'd be hard-pressed to find something that doesn't do that, and any performance difference would be negligible due to both being constant time operations. I would expect both to compile to the exact same assembly on any reasonable implementation.
That said, empty() is clear and explicit. You should prefer it over size() == 0 (or !size()) for readability.

Now, this is a pretty trivial matter, but I'll try to cover it exhaustively so whatever arguments are put by colleagues aren't likely to take you by surprise....
As usual, if profiling proved you really really had to care, measure: there could be a difference (see below). But in a general code review situation for not-proved-problematically-slow code, the outstanding issues are:
in some other containers (e.g. C++03 lists but not C++11), size() was less efficient than empty(), leading to some coding tips to prefer empty() over size() in general so that if someone needed to switch the container later (or generalise the processing into a template where the container type may vary) no change needs to be made to retain efficiency.
does either reflect a more natural way to conceive of the test? - not just what you happened to think of first, or size() because you're not as used to using empty(), but when you're 100% focused on the surrounding code logic, does size() or empty() fit in better? For example, perhaps because it's one of several tests of size() and you like having consistency, or because you're implementing a famous algorithm or formula that's traditionally expressed in terms of size: being consistent might reduce the mental noise/effort in verifying the implementation against the formula.
Most of the time the factors above are insignificant, and raising the issue in a code review is really a waste of time.
Possible performance difference
While the Standard requires functional equivalence, some implementations might implement them differently, though I've struggled and so far failed to document a particularly plausible reason for doing so.
C++11 has more constraints than C++03 over behaviours of other functions that impact implementation choices: data() must be NUL terminated (used to be just c_str()), [size()] is now a valid index and must return a reference to a NUL character. For various subtle reasons, these restrictions make it even more likely that empty() will be no faster than size().
Anyway - measure if you have to care.

Related

Is the three-way comparison operator always efficient?

Herb Sutter, in his proposal for the "spaceship" operator (section 2.2.2, bottom of page 12), says:
Basing everything on <=> and its return type: This model has major advantages, some unique to this proposal compared to previous proposals for C++ and the capabilities of other languages:
[...]
(6) Efficiency, including finally achieving zero-overhead abstraction for comparisons: The vast majority of comparisons are always single-pass. The only exception is generated <= and >= in the case of types that support both partial ordering and equality. For <, single-pass is essential to achieve the zero-overhead principle to avoid repeating equality comparisons, such as for struct Employee { string name; /*more members*/ }; used in struct Outer { Employeee; /*more members*/ }; – today’s comparisons violates zero-overhead abstraction because operator< on Outer performs redundant equality comparisons, because it performs if (e != that.e) return e < that.e; which traverses the equal prefix of
e.name twice (and if the name is equal, traverses the equal prefixes of other members of Employee twice as well), and this cannot be optimized away in general. As Kamiński notes, zero-overhead abstraction is a pillar of C++, and achieving it for comparisons for the first time is a significant advantage of this design based on <=>.
But then he gives this example (section 1.4.5, page 6):
class PersonInFamilyTree { // ...
public:
std::partial_ordering operator<=>(const PersonInFamilyTree& that) const {
if (this->is_the_same_person_as ( that)) return partial_ordering::equivalent;
if (this->is_transitive_child_of( that)) return partial_ordering::less;
if (that. is_transitive_child_of(*this)) return partial_ordering::greater;
return partial_ordering::unordered;
}
// ... other functions, but no other comparisons ...
};
Would define operator>(a,b) as a<=>b > 0 not lead to large overhead? (though in a different form than he discusses). That code would first test for equality, then for less, and finally for greater, rather than only and directly testing for greater.
Am I missing something here?
Would define operator>(a,b) as a<=>b > 0 not lead to large overhead?
It would lead to some overhead. The magnitude of the overhead is relative, though - in situations when costs of running comparisons are negligible in relation to the rest of the program, reducing code duplication by implementing one operator instead of five may be an acceptable trade-off.
However, the proposal does not suggest removing other comparison operators in favor of <=>: if you want to overload other comparison operators, you are free to do it:
Be general: Don’t restrict what is inherent. Don’t arbitrarily restrict a complete set of uses. Avoid special cases and partial features. – For example, this paper supports all seven comparison operators and operations, including adding three-way comparison via <=>. It also supports all five major comparison categories, including partial orders.
For some definition of large. There is overhead because in a partial ordering, a == b iff a <= b and b <= a. The complexity would be the same as a topological sort, O(V+E). Of course, the modern C++ approach is to write safe, clean and readable code and then optimizing. You can choose to implement the spaceship operator first, then specialize once you determine performance bottlenecks.
Generally speaking, overloading <=> makes sense when you're dealing with a type where doing all comparisons at once is either only trivially more expensive or has the same cost as comparing them differently.
With strings, <=> seems more expensive than a straight == test, since you have to subtract each pair of two characters. However, since you already had to load each pair of characters into memory, adding a subtraction on top of that is a trivial expense. Indeed, comparing two numbers for equality is sometimes implemented by compilers as a subtraction and test against zero. And even for compilers that don't do that, the subtract and compare against zero is probably not significantly less efficient.
So for basic types like that, you're more or less fine.
When you're dealing with something like tree ordering, you really need to know up-front which operation you care about. If all you asked for was ==, you really don't want to have to search the rest of the tree just to know that they're unequal.
But personally... I would never implement something like tree ordering with comparison operators to begin with. Why? Because I think that comparisons like that ought to be logically fast operations. Whereas a tree search is such a slow operation that you really don't want to do it by accident or at any time other than when it is absolutely necessary.
Just look at this case. What does it really mean to say that a person in a family tree is "less than" another? It means that one is a child of the other. Wouldn't it be more readable in code to just ask that question directly with is_transitive_child_of?
The more complex your comparison logic is, the less likely it is that what you're doing is really a "comparison". That there's probably some textual description that this "comparison" operation could be called that would be more readable.
Oh sure, such a class could have a function that returns a partial_order representing the relationship between the two objects. But I wouldn't call that function operator<=>.
But in any case, is <=> a zero-overhead abstraction of comparison? No; you can construct cases where it costs significantly more to compute the ordering than it does to detect the specific relation you asked for. But personally if that's the case, there's a good chance that you shouldn't be comparing such types through operators at all.

Is there a constexpr ordering of types in C++?

Does C++ provide an ordering of the set of all types as a constant expression? It doesn't matter which particular order, any one will do. This could be in form of a constexpr comparison function:
template <typename T1, typename T2>
constexpr bool TypeLesser ();
My use for this is for a compile time self-balancing binary search tree of types, as a replacement of (cons/nil) type lists, to speed up the compilation. For example, checking whether a type is contained in such a tree may be faster than checking if it is contained in a type list.
I will also accept compiler-specific intrinsics if standard C++ does not provide such a feature.
Note that if the only way to get an ordering is to define it manually by adding boilerplate all over the code base (which includes a lot of templates and anonymous structs), I will rather stay with type lists.
The standard’s only ordering is via type_info (provided by typeid expression), which you can use more easily via type_index – the latter provides ordinary comparison functionality so that it can be used in collections.
I guess its ancestry is the class Andrei Alexandrescu had in “Modern C++ Design”.
It's not compile time.
To reduce compilation time you can define traits classes for the types in question, assigning each type some ordinal value. A 128 bit UUID would do nicely as a type id, to avoid the practical issue of guaranteeing unique id's. This of course assumes that you or the client code controls the set of possible types.
The idea of having to "register" relevant types has been used before, in early Boost machinery for determining function result types.
I must anyway recommend seriously measuring compilation performance. The balancing operations that are fast at run time, involving only adjustment of a few pointers, may be slow at compile time, involving creating a huge descriptor of a whole new type. So even though checking for type set membership may be faster, building the type set may be seriously much slower, e.g. O(n2).
Disclaimer: I haven't tried that.
But anyway, I remember again that Andrei Alexandrescu discussed something of the sort in the already mentioned “Modern C++ Design”, or if you don't have access to that book, look in the Loki library (which is a library of things from that book).
You have two main problems: 1) You have no specific comparison criteria (Hence the question, isn't?), and 2) You don't have any standard way to sort at compile-time.
For the first use std::type_info as others suggested (Its currently used on maps via the std::type_index wrapper) or define your own metafunction to specify the ordering criteria for different types. For the second, you could try to write your own template-metaprogramming based quicksort algorithm. Thats what I did for my personal metaprogramming library and works perfectly.
About the assumption "A self-balancing search tree should perform better than classic typelists" I really encourage you to do some profillings (Try templight) before saying that. Compile-time performance has nothing to do with classic runtime performance, depends heavily in the exact implementation of the template instantation system the compiler has.
For example, based on my own experience I'm pretty sure that my simple "O(n)" linear search could perform better than your self balanced tree. Why? Memoization. Compile-time performance is not only instantation depth. In fact memoization has a crucial role on this.
To give you a real example: Consider the implementation of quicksort (Pseudo meta code):
list sort( List l )
{
Int pivot = l[l.length/2];
Tuple(List,List) lists = reorder( l , pivot , l.length/2 );
return concat( sort( lists.left ) , sort( lists.right ) );
}
I hope that example is self-explanatory. Note the functional way it works, there are no side effects. I will be glad if some day metaprogramming in C++ has that syntax...
Thats the recursive case of quicksort. Since we are using typelists (Variadic typelists in my case), the first metainstruction, which computes the value of the pivot has O(n) complexity. Specifically requires a template instantation depth of N/2. The seconds step (Reordering) could be done in O(n), and concatting is O(1) (Remember that are C++11 variadic typelists).
Now consider an example of execution:
[1,2,3,4,5]
The first step calls the recursive case, so the trace is:
Int pivot = l[l.length/2]; traverses the list until 3. That means the instantations needed to perform the traversings [1], [1,2], [1,2,3] are memoized.
During the reordering, more subtraversings (And combinations of subtraversing generated by element "swapping") are generated.
Recursive "calls" and concat.
Since such linear traversings performed to go to the middle of the list are memoized, they are instantiated only once along the whole sort execution. When I first encountered this using templight I was completely dumbfounded. The fact, looking at the instantations graph, is that only the first large traverses are instantiated, the little ones are just part of the large and since the large where memoized, the little are not instanced again.
So wow, the compiler is able of memoizing at least the half of that so slow linear traversings, right? But, what is the cost of such enormous memoization efforts?
What I'm trying to say with this answer is: When doing template meta-programming, forget everything about runtime performance, optimizations, costs, etc, and don't do assumptions. Measure. You are entering in a completely different league. I'm not completely sure what implementation (Your selft balancing trees vs simple linear traversing) is faster, because that depends on the compiler. My example was only to show how actually a compiler could break down completely your assumptions.
Side note: The first time I did that profilings I showed them to an algorithms teacher of my university, and he's still trying to figure out whats happening. In fact, he asked a question here about how to measure the complexity and performance of this monster: Best practices for measuring the run-time complexity of a piece of code

String search algorithm used by string::find() c++

how its faster than cstring functions? is the similar source available for C?
There's no standard implementation of the C++ Standard Library, but you should be able to take a look at the implementation shipped with your compiler and see how it works yourself.
In general, most STL functions are not faster than their C counterparts, though. They're usually safer, more generalized and designed to accommodate a much broader range of circumstances than the narrow-purpose C equivalents.
A standard optimization with any string class is to store the string length along with the string. Which will make any string operation that requires the string length to be known to be O(1) instead of O(n), strlen() being the obvious one.
Or copying a string, there's no gain in the actual copy but figuring out how much memory to allocate before the copy is O(1). The overall algorithm is still O(n). The basic operation is still the same, shoveling bytes takes just as long in any language.
String classes are useful because they are safer (harder to shoot your foot) and easier to use (require less explicit code). They became popular and widely used because they weren't slower.
The string class almost certainly stores far more data about the string than you'd find in a C string. Length is a good example. In tradeoff for the extra memory use, you will gain some spare CPU cycles.
Edit:
However, it's unlikely that one is substantially slower than the other, since they'll perform fundamentally the same actions. MSDN suggests that string::find() doesn't use a functor-based system, so they won't have that optimization.
There are many possiblities how you can implement a find string technique. The easiest way is to check every position of the destination string if there is the searchstring. You can code that very quickly, but its the slowest possiblity. (O(m*n), m = length search string, n = length destination string)
Take a look at the wikipedia page, http://en.wikipedia.org/wiki/String_searching_algorithm, there are different options presented.
The fastest way is to create a finite state machine, and then you can insert the string without going backwards. Thats then just O(n).
Which algorithm the STL actually uses, I don't know. But you could search for the sourcecode and compare it with the algorithms.

'for' loop vs Qt's 'foreach' in C++

Which is better (or faster), a C++ for loop or the foreach operator provided by Qt? For example, the following condition
QList<QString> listofstrings;
Which is better?
foreach(QString str, listofstrings)
{
//code
}
or
int count = listofstrings.count();
QString str = QString();
for(int i=0;i<count;i++)
{
str = listofstrings.at(i);
//Code
}
It really doesn't matter in most cases.
The large number of questions on StackOverflow regarding whether this method or that method is faster, belie the fact that, in the vast majority of cases, code spends most of its time sitting around waiting for users to do something.
If you are really concerned, profile it for yourself and act on what you find.
But I think you'll most likely find that only in the most intense data-processing-heavy work does this question matter. The difference may well be only a couple of seconds and even then, only when processing huge numbers of elements.
Get your code working first. Then get it working fast (and only if you find an actual performance issue).
Time spent optimising before you've finished the functionality and can properly profile, is mostly wasted time.
First off, I'd just like to say I agree with Pax, and that the speed probably doesn't enter into it. foreach wins hands down based on readability, and that's enough in 98% of cases.
But of course the Qt guys have looked into it and actually done some profiling:
http://blog.qt.io/blog/2009/01/23/iterating-efficiently/
The main lesson to take away from that is: use const references in read only loops as it avoids the creation of temporary instances. It also make the purpose of the loop more explicit, regardless of the looping method you use.
It really doesn't matter. Odds are if your program is slow, this isn't the problem. However, it should be noted that you aren't make a completely equal comparison. Qt's foreach is more similar to this (this example will use QList<QString>):
for(QList<QString>::iterator it = Con.begin(); it != Con.end(); ++it) {
QString &str = *it;
// your code here
}
The macro is able to do this by using some compiler extensions (like GCC's __typeof__) to get the type of the container passed. Also imagine that boost's BOOST_FOREACH is very similar in concept.
The reason why your example isn't fair is that your non-Qt version is adding extra work.
You are indexing instead of really iterating. If you are using a type with non-contiguous allocation (I suspect this might be the case with QList<>), then indexing will be more expensive since the code has to calculate "where" the n-th item is.
That being said. It still doesn't matter. The timing difference between those two pieces of code will be negligible if existent at all. Don't waste your time worrying about it. Write whichever you find more clear and understandable.
EDIT: As a bonus, currently I strongly favor the C++11 version of container iteration, it is clean, concise and simple:
for(QString &s : Con) {
// you code here
}
Since Qt 5.7 the foreach macro is deprecated, Qt encourages you to use the C++11 for instead.
http://doc.qt.io/qt-5/qtglobal.html#foreach
(more details about the difference here : https://www.kdab.com/goodbye-q_foreach/)
I don't want to answer the question which is faster, but I do want to say which is better.
The biggest problem with Qt's foreach is the fact that it takes a copy of your container before iterating over it. You could say 'this doesn't matter because Qt classes are refcounted' but because a copy is used you don't actually change your original container at all.
In summary, Qt's foreach can only be used for read-only loops and thus should be avoided. Qt will happily let you write a foreach loop which you think will update/modify your container but in the end all changes are thrown away.
First, I completely agree with the answer that "it doesn't matter". Pick the cleanest solution, and optimize if it becomes a problem.
But another way to look at it is that often, the fastest solution is the one that describes your intent most accurately. In this case, QT's foreach says that you'd like to apply some action for each element in the container.
A plain for loop say that you'd like a counter i. You want to repeatedly add one to this value i, and as long as it is less than the number of elements in the container, you would like to perform some action.
In other words, the plain for loop overspecifies the problem. It adds a lot of requirements that aren't actually part of what you're trying to do. You don't care about the loop counter. But as soon as you write a for loop, it has to be there.
On the other hand, the QT people have made no additional promises that may affect performance. They simply guarantee to iterate through the container and apply an action to each.
In other words, often the cleanest and most elegant solution is also the fastest.
The foreach from Qt has a clearer syntax for the for loop IMHO, so it's better in that sense. Performance wise I doubt there's anything in it.
You could consider using the BOOST_FOREACH instead, as it is a well thought out fancy for loop, and it's portable (and presumably will make it's way into C++ some day and is future proof too).
A benchmark, and its results, on this can be found at http://richelbilderbeek.nl/CppExerciseAddOneAnswer.htm
IMHO (and many others here) it (that is speed) does not matter.
But feel free to draw your own conclusions.
For small collections, it should matter and foreach tends to be clearer.
However, for larger collections, for will begin to beat foreach at some point. (assuming that the 'at()' operator is efficient.
If this is really important (and I'm assuming it is since you are asking) then the best thing to do is measure it. A profiler should do the trick, or you could build a test version with some instrumentation.
You might look at the STL's for_each function. I don't know whether it will be faster than the two options you present, but it is more standardized than the Qt foreach and avoids some of the problems that you may run into with a regular for loop (namely out of bounds indexing and difficulties with translating the loop to a different data structure).
I would expect foreach to work nominally faster in some cases, and the about same in others, except in cases where the items are an actual array in which case the performace difference is negligible.
If it is implemented on top of an enumerator, it may be more efficient than a straight indexing, depending on implementation. It's unlikely to be less efficient. For example, if someone exposed a balanced tree as both indexable and enumerable, then foreach will be decently faster. This is because each index will have to independently find the referenced item, while an enumerator has the context of the current node to more efficiently navigate to the next ont.
If you have an actual array, then it depends on the implementation of the language and class whether foreach will be faster for the same as for.
If indexing is a literal memory offset(such as C++), then for should be marginally faster since you're avoiding a function call. If indexing is an indirection just like a call, then it should be the same.
All that being said... I find it hard to find a case for generalization here. This is the last sort of optimization you should be looking for, even if there is a performance problem in your application. If you have a performance problem that can be solved by changing how you iterate, you don't really have a performance problem. You have a BUG, because someone wrote either a really crappy iterator, or a really crappy indexer.

Should one prefer STL algorithms over hand-rolled loops?

I seem to be seeing more 'for' loops over iterators in questions & answers here than I do for_each(), transform(), and the like. Scott Meyers suggests that stl algorithms are preferred, or at least he did in 2001. Of course, using them often means moving the loop body into a function or function object. Some may feel this is an unacceptable complication, while others may feel it better breaks down the problem.
So... should STL algorithms be preferred over hand-rolled loops?
It depends on:
Whether high-performance is required
The readability of the loop
Whether the algorithm is complex
If the loop isn't the bottleneck, and the algorithm is simple (like for_each), then for the current C++ standard, I'd prefer a hand-rolled loop for readability. (Locality of logic is key.)
However, now that C++0x/C++11 is supported by some major compilers, I'd say use STL algorithms because they now allow lambda expressions — and thus the locality of the logic.
I’m going to go against the grain here and advocate that using STL algorithms with functors makes code much easier to understand and maintain, but you have to do it right. You have to pay more attention to readability and clearity. Particularly, you have to get the naming right. But when you do, you can end up with cleaner, clearer code, and paradigm shift into more powerful coding techniques.
Let’s take an example. Here we have a group of children, and we want to set their “Foo Count” to some value. The standard for-loop, iterator approach is:
for (vector<Child>::iterator iter = children.begin();
iter != children.end();
++iter)
{
iter->setFooCount(n);
}
Which, yeah, it’s pretty clear, and definitely not bad code. You can figure it out with just a little bit of looking at it. But look at what we can do with an appropriate functor:
for_each(children.begin(), children.end(), SetFooCount(n));
Wow, that says exactly what we need. You don’t have to figure it out; you immediately know that it’s setting the “Foo Count” of every child. (It would be even clearer if we didn’t need the .begin() / .end() nonsense, but you can’t have everything, and they didn’t consult me when making the STL.)
Granted, you do need to define this magical functor, SetFooCount, but its definition is pretty boilerplate:
class SetFooCount
{
public:
SetFooCount(int n) : fooCount(n) {}
void operator () (Child& child)
{
child.setFooCount(fooCount);
}
private:
int fooCount;
};
In total it’s more code, and you have to look at another place to find out exactly what SetFooCount is doing. But because we named it well, 99% of the time we don’t have to look at the code for SetFooCount. We assume it does what it says, and we only have to look at the for_each line.
What I really like is that using the algorithms leads to a paradigm shift. Instead of thinking of a list as a collection of objects, and doing things to every element of the list, you think of the list as a first class entity, and you operate directly on the list itself. The for-loop iterates through the list, calling a member function on each element to set the Foo Count. Instead, I am doing one command, which sets the Foo Count of every element in the list. It’s subtle, but when you look at the forest instead of the trees, you gain more power.
So with a little thought and careful naming, we can use the STL algorithms to make cleaner, clearer code, and start thinking on a less granular level.
The std::foreach is the kind of code that made me curse the STL, years ago.
I cannot say if it's better, but I like more to have the code of my loop under the loop preamble. For me, it is a strong requirement. And the std::foreach construct won't allow me that (strangely enough, the foreach versions of Java or C# are cool, as far as I am concerned... So I guess it confirms that for me the locality of the loop body is very very important).
So I'll use the foreach only if there is only already a readable/understandable algorithm usable with it. If not, no, I won't. But this is a matter of taste, I guess, as I should perhaps try harder to understand and learn to parse all this thing...
Note that the people at boost apparently felt somewhat the same way, for they wrote BOOST_FOREACH:
#include <string>
#include <iostream>
#include <boost/foreach.hpp>
int main()
{
std::string hello( "Hello, world!" );
BOOST_FOREACH( char ch, hello )
{
std::cout << ch;
}
return 0;
}
See : http://www.boost.org/doc/libs/1_35_0/doc/html/foreach.html
That's really the one thing that Scott Meyers got wrong.
If there is an actual algorithm that matches what you need to do, then of course use the algorithm.
But if all you need to do is loop through a collection and do something to each item, just do the normal loop instead of trying to separate code out into a different functor, that just ends up dicing code up into bits without any real gain.
There are some other options like boost::bind or boost::lambda, but those are really complex template metaprogramming things, they do not work very well with debugging and stepping through the code so they should generally be avoided.
As others have mentioned, this will all change when lambda expressions become a first class citizen.
The for loop is imperative, the algorithms are declarative. When you write std::max_element, it’s obvious what you need, when you use a loop to achieve the same, it’s not necessarily so.
Algorithms also can have a slight performance edge. For example, when traversing an std::deque, a specialized algorithm can avoid checking redundantly whether a given increment moves the pointer over a chunk boundary.
However, complicated functor expressions quickly render algorithm invocations unreadable. If an explicit loop is more readable, use it. If an algorithm call can be expressed without ten-storey bind expressions, by all means prefer it. Readability is more important than performance here, because this kind of optimization is what Knuth so famously attributes to Hoare; you’ll be able to use another construct without trouble once you realize it’s a bottleneck.
It depends, if the algorithm doesn't take a functor, then always use the std algorithm version. It's both simpler for you to write and clearer.
For algorithms that take functors, generally no, until C++0x lambdas can be used. If the functor is small and the algorithm is complex (most aren't) then it may be better to still use the std algorithm.
I'm a big fan of the STL algorithms in principal but in practice it's just way too cumbersome. By the time you define your functor/predicate classes a two line for loop can turn into 40+ lines of code that is suddenly 10x harder to figure out.
Thankfully, things are going to get a ton easier in C++0x with lambda functions, auto and new for syntax. Checkout this C++0x Overview on Wikipedia.
I wouldn't use a hard and fast rule for it. There are many factors to consider, like often you perform that certain operation in your code, is just a loop or an "actual" algorithm, does the algorithm depend on a lot of context that you would have to transmit to your function?
For example I wouldn't put something like
for (int i = 0; i < some_vector.size(); i++)
if (some_vector[i] == NULL) some_other_vector[i]++;
into an algorithm because it would result in a lot more code percentage wise and I would have to deal with getting some_other_vector known to the algorithm somehow.
There are a lot of other examples where using STL algorithms makes a lot of sense, but you need to decide on a case by case basis.
I think the STL algorithm interface is sub-optimal and should be avoided because using the STL toolkit directly (for algorithms) might give a very small gain in performance, but will definitely cost readability, maintainability, and even a bit of writeability when you're learning how to use the tools.
How much more efficient is a standard for loop over a vector:
int weighted_sum = 0;
for (int i = 0; i < a_vector.size(); ++i) {
weighted_sum += (i + 1) * a_vector[i]; // Just writing something a little nontrivial.
}
than using a for_each construction, or trying to fit this into a call to accumulate?
You could argue that the iteration process is less efficient, but a for _ each also introduces a function call at each step (which might be mitigated by trying to inline the function, but remember that "inline" is only a suggestion to the compiler - it may ignore it).
In any case, the difference is small. In my experience, over 90% of the code you write is not performance critical, but is coder-time critical. By keeping your STL loop all literally inline, it is very readable. There is less indirection to trip over, for yourself or future maintainers. If it's in your style guide, then you're saving some learning time for your coders (admit it, learning to properly use the STL the first time involves a few gotcha moments). This last bit is what I mean by a cost in writeability.
Of course there are some special cases -- for example, you might actually want that for_each function separated to re-use in several other places. Or, it might be one of those few highly performance-critical sections. But these are special cases -- exceptions rather than the rule.
IMO, a lot of standard library algorithms like std::for_each should be avoided - mainly for the lack-of-lambda issues mentioned by others, but also because there's such a thing as inappropriate hiding of details.
Of course hiding details away in functions and classes is all part of abstraction, and in general a library abstraction is better than reinventing the wheel. But a key skill with abstraction is knowing when to do it - and when not to do it. Excessive abstraction can damage readability, maintainability etc. Good judgement comes with experience, not from inflexible rules - though you must learn the rules before you learn to break them, of course.
OTOH, it's worth considering the fact that a lot of programmers have been using C++ (and before that, C, Pascal etc) for a long time. Old habits die hard, and there is this thing called cognitive dissonance which often leads to excuses and rationalisations. Don't jump to conclusions, though - it's at least as likely that the standards guys are guilty of post-decisional dissonance.
I think a big factor is the developer's comfort level.
It's probably true that using transform or for_each is the right thing to do, but it's not any more efficient, and handwritten loops aren't inherently dangerous. If it would take half an hour for a developer to write a simple loop, versus half a day to get the syntax for transform or for_each right, and move the provided code into a function or function object. And then other developers would need to know what was going on.
A new developer would probably be best served by learning to use transform and for_each rather than handmade loops, since he would be able to use them consistently without error. For the rest of us for whom writing loops is second nature, it's probably best to stick with what we know, and get more familiar with the algorithms in our spare time.
Put it this way -- if I told my boss I had spent the day converting handmade loops into for_each and transform calls, I doubt he'd be very pleased.