Efficient way to get a reversed copy of std::string - c++

When working on algorithmic tasks, I frequently need a reversed copy of a std::string, and the source string must not be modified. As far as I know, there are two ways to do it:
Use std::reverse :
// std::string sourceString has been initialized before.
std::string reversedString = sourceString;
std::reverse(reversedString.begin(), reversedString.end());
Use reverse iterators. This one I found on the Internet:
// std::string sourceString has been initialized before.
std::string reversedString{sourceString.rbegin(), sourceString.rend()};
My question is which approach I should prefer according to efficiency and best practices.
C-style solutions are not my concern; I am only interested in STL-style approaches.

My question is which approach I should prefer according to efficiency
The one which should be preferred according to efficiency is the one that has been measured to be more efficient. Both have the same asymptotic complexity.
But I wouldn't bother measuring the difference unless it happens to be a bottleneck. I prefer option 2, but that's subjective.
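If it ever does become a bottleneck, a rough measurement could look like the sketch below. This is only an illustration: the names reversed_by_copy, reversed_by_iterators and time_ns are made up here, and a naive timing loop like this is no substitute for a real benchmarking tool, because the optimizer is free to reshape the loop.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <string>

// Approach 1: copy the source, then reverse the copy in place.
std::string reversed_by_copy(const std::string& source) {
    std::string result = source;
    std::reverse(result.begin(), result.end());
    return result;
}

// Approach 2: construct the result directly from reverse iterators.
std::string reversed_by_iterators(const std::string& source) {
    return std::string(source.rbegin(), source.rend());
}

// Hypothetical helper: calls fn(source) `rounds` times and returns the
// elapsed wall-clock time in nanoseconds.
template <typename Fn>
long long time_ns(Fn fn, const std::string& source, int rounds) {
    volatile std::size_t sink = 0;  // consume each result so the call is not obviously dead
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < rounds; ++i) {
        sink = sink + fn(source).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns(reversed_by_copy, text, rounds) and time_ns(reversed_by_iterators, text, rounds) on representative input gives two numbers to compare; whichever is lower on your data and toolchain is the one to prefer.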

I could say that constructing a data structure with the right data from the start is generally faster, but general statements about performance are generally wrong. You should measure and benchmark if you're concerned about performance.
If you're not concerned enough about performance to write benchmark code, then you should take the style that looks the best for you.
Also, you forgot C++20 style:
auto reversed = sourceString | std::views::reverse;
std::string reversedString{begin(reversed), end(reversed)};
Which in the end is not that different from the iterator range style, since the string still needs an iterator pair.

As others have said, you should first decide what matters more to your code base, style or speed. If style, just use std::reverse, which runs in O(n). If speed is a bottleneck and you run this reverse-string method all the time, I would consider a doubly-linked list. A list can be walked backwards without any reversal step, and a hand-written one can be "reversed" in O(1) by swapping the roles of head and tail (std::list::reverse itself is still O(n)).

Related

unordered_set vs vector -- prefer idiomatic or performant?

I'm working with data that is unique from other data of the same type. Very abstractly, a set fits the definition of the data I'm working with. I feel inclined to use std::unordered_set instead of std::vector for that reason.
Beyond that, both classes can fit my requirements. My question is about performance -- which might perform better? I cannot write out the code one way and benchmark it, then rewrite it the other way. That will take me hundreds of hours. If they'll perform similarly, do you think it would be worth-while to stick with the idiomatic unordered_set?
Here is a simpler use case. A company is selling computers. Each is unique from another in at least one way, guaranteed.
struct computer_t
{
std::string serial;
std::uint32_t gb_of_ram;
};
std::unordered_set<computer_t> all_computers_in_existence;
std::unordered_set<computer_t> computers_for_sale; // subset of above
// alternatively
std::vector<computer_t> all_computers_in_existence;
std::vector<computer_t> computers_for_sale; // subset of above
The company wants to stop selling computers that aren't popular and replace them with other computers that might be.
std::unordered_set<computer_t> computers_not_for_sale;
std::set_difference(all_computers_in_existence.begin(), all_computers_in_existence.end(),
computers_for_sale.begin(), computers_for_sale.end(),
std::inserter(computers_not_for_sale, computers_not_for_sale.end()));
calculate_and_remove_least_sold(computers_for_sale);
calculate_and_add_most_likely_to_sell(computers_for_sale, computers_not_for_sale);
Based on the above sample code, what should I choose? Or is there another, new STL feature (in C++17) I should investigate? This really is as generic as it gets for my use-case without making this post incredibly long with details.
Idiomatic should be your first choice. If you implement it using unordered_set and the performance is not good enough, there are faster non-STL hash tables which are easy to switch to. 99% of the time it won't come to that.
Your example code using std::set_difference will not work, because that requires the inputs be sorted, which unordered_set is not. That's OK though, subtracting is done easily using unordered_set::erase(key).
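A minimal sketch of that erase-based subtraction, assuming the computers are identified by their serial strings (so no custom hash is needed). The function name not_for_sale is made up for this example.

```cpp
#include <string>
#include <unordered_set>

// Returns the elements of `all` that are not present in `for_sale`.
// `all` is taken by value on purpose: we erase from the copy and return it.
std::unordered_set<std::string> not_for_sale(
        std::unordered_set<std::string> all,
        const std::unordered_set<std::string>& for_sale) {
    for (const std::string& serial : for_sale) {
        all.erase(serial);  // erase(key) simply does nothing if the key is absent
    }
    return all;
}
```

Each erase is O(1) on average, so the whole subtraction is linear in the size of the set being subtracted, with no sorting required.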

C++ Counting Map

Recently I was dealing with what I am sure is a very common problem, which essentially boils down into the following:
Given a long text, calculate the frequency of each word occurring in the text.
I was able to solve this problem using std::unordered_map. This, however, turned quite ugly, as for every word in the text, if that's already been encountered I had to do a find, erase, and then a re-insert into the map with the value incremented.
I realise there are other ways of doing this, such as using a hashing function on top of a vanilla array/vector and incrementing the value there, but I was wondering if there was a more elegant way of solving this problem, like an STL component or function that would have a similar interface to Python's Counter collection.
I know that, C++ being C++, I can't really expect such high-level concepts to always be implemented for me, but I was just wondering if you guys knew about anything (or at least your Googling skills are superior to mine) which could make my code a little nicer.
I'm not quite sure why an std::unordered_map (or just std::map) would involve much complexity. I'd write the code something like this:
std::unordered_map<std::string, int> words;
std::string word;
while (input >> word) // input is any std::istream, e.g. std::cin
    ++words[word];
There's no need for any kind of find/erase/reinsert.
In case it's not clear how/why this works: operator[] will create an entry for a value if none exists yet in the map. The associated value will be a value-initialized object of the specified type, which will be zero in the case of an int (or similar). We then increment that every time we encounter the word.
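For reference, here is a self-contained sketch of the same idea. It assumes that words are whitespace-separated tokens read from a stream; the function name count_words is made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Counts whitespace-separated words read from `input`.
std::unordered_map<std::string, int> count_words(std::istream& input) {
    std::unordered_map<std::string, int> counts;
    std::string word;
    while (input >> word) {
        ++counts[word];  // operator[] inserts 0 for a new word, then we increment it
    }
    return counts;
}

// Convenience overload for text that is already in memory.
std::unordered_map<std::string, int> count_words(const std::string& text) {
    std::istringstream input(text);
    return count_words(input);
}
```

Punctuation and letter case are not normalized here, so "The" and "the," would be counted as different words.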
An alternative solution:
std::multiset<std::string> m;
for (auto w: words) m.insert(w);
m.count("some word");
The advantage is that you don't have to rely on the 'trick' with operator[], making the code more readable.
EDIT: As Kerrek pointed out in the comments, this solution is slower. multiset stores all the elements you insert, even if they are deemed equal (they might still differ in some aspect that operator== does not check). This causes a significant overhead compared to unordered_map<std::string, int>, which only has to store each word once.
(As a side note, processing the complete works of William Shakespeare using the map solution takes about 0.33s on my machine, as opposed to 0.78s for the multiset solution.)

String search algorithm used by string::find() c++

How is it faster than the cstring functions? Is similar source available for C?
There's no standard implementation of the C++ Standard Library, but you should be able to take a look at the implementation shipped with your compiler and see how it works yourself.
In general, most STL functions are not faster than their C counterparts, though. They're usually safer, more generalized and designed to accommodate a much broader range of circumstances than the narrow-purpose C equivalents.
A standard optimization with any string class is to store the string length along with the string. Which will make any string operation that requires the string length to be known to be O(1) instead of O(n), strlen() being the obvious one.
Or copying a string, there's no gain in the actual copy but figuring out how much memory to allocate before the copy is O(1). The overall algorithm is still O(n). The basic operation is still the same, shoveling bytes takes just as long in any language.
String classes are useful because they are safer (harder to shoot your foot) and easier to use (require less explicit code). They became popular and widely used because they weren't slower.
The string class almost certainly stores far more data about the string than you'd find in a C string. Length is a good example. In tradeoff for the extra memory use, you will gain some spare CPU cycles.
Edit:
However, it's unlikely that one is substantially slower than the other, since they'll perform fundamentally the same actions. MSDN suggests that string::find() doesn't use a functor-based system, so they won't have that optimization.
There are many possible ways to implement a string search. The easiest is to check, at every position of the destination string, whether the search string starts there. You can code that very quickly, but it's the slowest possibility: O(m*n), where m is the length of the search string and n is the length of the destination string.
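A minimal sketch of that naive approach, for illustration only. The name naive_find is made up, and this is not how any particular standard library implements string::find.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of `pattern` in `text`,
// or std::string::npos if there is none. Worst case O(m*n).
std::size_t naive_find(const std::string& text, const std::string& pattern) {
    if (pattern.size() > text.size()) return std::string::npos;
    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start) {
        std::size_t matched = 0;
        // Compare the pattern against the text starting at `start`.
        while (matched < pattern.size() && text[start + matched] == pattern[matched]) {
            ++matched;
        }
        if (matched == pattern.size()) return start;
    }
    return std::string::npos;
}
```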
Take a look at the wikipedia page, http://en.wikipedia.org/wiki/String_searching_algorithm, there are different options presented.
A much faster way is to build a finite state machine from the search string; you can then feed the destination string through it without ever going backwards. That is then just O(n).
Which algorithm the STL actually uses, I don't know. But you could search for the source code and compare it with the algorithms.
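One well-known algorithm in this family is Knuth-Morris-Pratt, which precomputes a failure table from the search string instead of an explicit state machine. The sketch below is an illustration under that assumption (the name kmp_find is made up); it runs in O(n + m) including the table construction and says nothing about what a given standard library actually does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Knuth-Morris-Pratt search: index of the first occurrence of `pattern`
// in `text`, or std::string::npos. The text index never moves backwards.
std::size_t kmp_find(const std::string& text, const std::string& pattern) {
    const std::size_t m = pattern.size();
    if (m == 0) return 0;

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    std::vector<std::size_t> failure(m, 0);
    for (std::size_t i = 1, k = 0; i < m; ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        failure[i] = k;
    }

    // Scan the text once; k is the number of pattern characters matched so far.
    for (std::size_t i = 0, k = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = failure[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == m) return i + 1 - m;
    }
    return std::string::npos;
}
```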

'for' loop vs Qt's 'foreach' in C++

Which is better (or faster), a C++ for loop or the foreach operator provided by Qt? For example, the following condition
QList<QString> listofstrings;
Which is better?
foreach(QString str, listofstrings)
{
//code
}
or
int count = listofstrings.count();
QString str = QString();
for(int i=0;i<count;i++)
{
str = listofstrings.at(i);
//Code
}
It really doesn't matter in most cases.
The large number of questions on StackOverflow regarding whether this method or that method is faster, belie the fact that, in the vast majority of cases, code spends most of its time sitting around waiting for users to do something.
If you are really concerned, profile it for yourself and act on what you find.
But I think you'll most likely find that only in the most intense data-processing-heavy work does this question matter. The difference may well be only a couple of seconds and even then, only when processing huge numbers of elements.
Get your code working first. Then get it working fast (and only if you find an actual performance issue).
Time spent optimising before you've finished the functionality and can properly profile, is mostly wasted time.
First off, I'd just like to say I agree with Pax, and that the speed probably doesn't enter into it. foreach wins hands down based on readability, and that's enough in 98% of cases.
But of course the Qt guys have looked into it and actually done some profiling:
http://blog.qt.io/blog/2009/01/23/iterating-efficiently/
The main lesson to take away from that is: use const references in read-only loops, as it avoids the creation of temporary instances. It also makes the purpose of the loop more explicit, regardless of the looping method you use.
It really doesn't matter. Odds are if your program is slow, this isn't the problem. However, it should be noted that you aren't making a completely equal comparison. Qt's foreach is more similar to this (this example will use QList<QString>):
for(QList<QString>::iterator it = Con.begin(); it != Con.end(); ++it) {
QString &str = *it;
// your code here
}
The macro is able to do this by using some compiler extensions (like GCC's __typeof__) to get the type of the container passed. Also imagine that boost's BOOST_FOREACH is very similar in concept.
The reason why your example isn't fair is that your non-Qt version is adding extra work.
You are indexing instead of really iterating. If you are using a type with non-contiguous allocation (I suspect this might be the case with QList<>), then indexing will be more expensive since the code has to calculate "where" the n-th item is.
That being said. It still doesn't matter. The timing difference between those two pieces of code will be negligible if existent at all. Don't waste your time worrying about it. Write whichever you find more clear and understandable.
EDIT: As a bonus, currently I strongly favor the C++11 version of container iteration, it is clean, concise and simple:
for(QString &s : Con) {
// your code here
}
Since Qt 5.7 the foreach macro is deprecated, Qt encourages you to use the C++11 for instead.
http://doc.qt.io/qt-5/qtglobal.html#foreach
(more details about the difference here : https://www.kdab.com/goodbye-q_foreach/)
I don't want to answer the question which is faster, but I do want to say which is better.
The biggest problem with Qt's foreach is the fact that it takes a copy of your container before iterating over it. You could say 'this doesn't matter because Qt classes are refcounted' but because a copy is used you don't actually change your original container at all.
In summary, Qt's foreach can only be used for read-only loops and thus should be avoided. Qt will happily let you write a foreach loop which you think will update/modify your container but in the end all changes are thrown away.
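The same kind of pitfall can be shown without Qt. The sketch below uses std::vector<std::string> and the C++11 range-based for as a stand-in for QList<QString> and foreach, which is an assumption made so the example compiles with the standard library alone; the function names are made up.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Iterating by value: each `item` is a copy, so the container is unchanged.
void append_suffix_to_copies(std::vector<std::string>& items) {
    for (std::string item : items) {
        item += "!";  // modifies the copy only
    }
}

// Iterating by reference: the elements themselves are modified.
void append_suffix_in_place(std::vector<std::string>& items) {
    for (std::string& item : items) {
        item += "!";
    }
}

// Read-only loop: a const reference avoids copying each element.
std::size_t total_length(const std::vector<std::string>& items) {
    std::size_t total = 0;
    for (const std::string& item : items) {
        total += item.size();
    }
    return total;
}
```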
First, I completely agree with the answer that "it doesn't matter". Pick the cleanest solution, and optimize if it becomes a problem.
But another way to look at it is that often, the fastest solution is the one that describes your intent most accurately. In this case, Qt's foreach says that you'd like to apply some action for each element in the container.
A plain for loop says that you'd like a counter i. You want to repeatedly add one to this value i, and as long as it is less than the number of elements in the container, you would like to perform some action.
In other words, the plain for loop overspecifies the problem. It adds a lot of requirements that aren't actually part of what you're trying to do. You don't care about the loop counter. But as soon as you write a for loop, it has to be there.
On the other hand, the Qt people have made no additional promises that may affect performance. They simply guarantee to iterate through the container and apply an action to each.
In other words, often the cleanest and most elegant solution is also the fastest.
The foreach from Qt has a clearer syntax for the for loop IMHO, so it's better in that sense. Performance wise I doubt there's anything in it.
You could consider using BOOST_FOREACH instead, as it is a well-thought-out fancy for loop, and it's portable (and presumably will make its way into C++ some day and is future-proof too).
A benchmark, and its results, on this can be found at http://richelbilderbeek.nl/CppExerciseAddOneAnswer.htm
IMHO (and many others here) it (that is speed) does not matter.
But feel free to draw your own conclusions.
For small collections, it shouldn't matter, and foreach tends to be clearer.
However, for larger collections, for will begin to beat foreach at some point (assuming that the 'at()' operator is efficient).
If this is really important (and I'm assuming it is since you are asking) then the best thing to do is measure it. A profiler should do the trick, or you could build a test version with some instrumentation.
You might look at the STL's for_each function. I don't know whether it will be faster than the two options you present, but it is more standardized than the Qt foreach and avoids some of the problems that you may run into with a regular for loop (namely out of bounds indexing and difficulties with translating the loop to a different data structure).
I would expect foreach to work nominally faster in some cases, and about the same in others, except in cases where the items are an actual array, in which case the performance difference is negligible.
If it is implemented on top of an enumerator, it may be more efficient than straight indexing, depending on implementation. It's unlikely to be less efficient. For example, if someone exposed a balanced tree as both indexable and enumerable, then foreach will be decently faster. This is because each index will have to independently find the referenced item, while an enumerator has the context of the current node to more efficiently navigate to the next one.
If you have an actual array, then it depends on the implementation of the language and class whether foreach will be faster than or the same as for.
If indexing is a literal memory offset (such as in C++), then for should be marginally faster since you're avoiding a function call. If indexing is an indirection just like a call, then it should be the same.
All that being said... I find it hard to find a case for generalization here. This is the last sort of optimization you should be looking for, even if there is a performance problem in your application. If you have a performance problem that can be solved by changing how you iterate, you don't really have a performance problem. You have a BUG, because someone wrote either a really crappy iterator, or a really crappy indexer.

How can I increase the performance in a map lookup with key type std::string?

I'm using a std::map (VC++ implementation) and it's a little slow for lookups via the map's find method.
The key type is std::string.
Can I increase the performance of this std::map lookup via a custom key compare override for the map? For example, maybe std::string < compare doesn't take into consideration a simple string::size() compare before comparing its data?
Any other ideas to speed up the compare?
In my situation the map will always contain < 15 elements, but it is being queried non stop and performance is critical. Maybe there is a better data structure that I can use that would be faster?
Update: The map contains file paths.
Update2: The map's elements are changing often.
First, turn off all the profiling and DEBUG switches. These can slow down STL immensely.
If that's not it, part of the problem may be that your strings are identical for the first 80-90% of the string. This isn't bad for map, necessarily, but it is for string comparisons. If this is the case, your search can take much longer.
For example, in this code find() will likely result in a couple of string compares, but each will return after comparing the first character until "david", and then the first three characters will be checked. So at most, 5 characters will be checked per call.
map<string,int> names;
names["larry"] = 1;
names["david"] = 2;
names["juanita"] = 3;
map<string,int>::iterator iter = names.find("daniel");
On the other hand, in the following code, find() will likely check 135+ characters:
map<string,int> names;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/wilma"] = 1;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/fred"] = 2;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/barney"] = 3;
map<string,int>::iterator iter = names.find("/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/betty");
That's because the string comparisons have to search deeper to find a match since the beginning of each string is the same.
Using size() in your comparison for equality won't help you much here since your data set is so small. A std::map is kept sorted so its elements can be searched with a binary search. Each call to find should result in less than 5 string comparisons for a miss, and an average of 2 comparisons for a hit. But it does depend on your data. If most of your path strings are of different lengths, then a size check like Motti describes could help a lot.
Something to consider when thinking of alternative algorithms is how many "hits" you get. Are most of your find() calls returning end() or a hit? If most of your find()s return end() (misses) then you are searching the entire map every time (2logn string compares).
Hash_map is a good idea; it should cut your search time in about half for hits; more for misses.
A custom algorithm may be called for because of the nature of path strings, especially if your data set has common ancestry like in the above code.
Another thing to consider is how you get your search strings. If you are reusing them, it may help to encode them into something that is easier to compare. If you use them once and discard them, then this encoding step is probably too expensive.
I used something like a Huffman coding tree once (a long time ago) to optimize string searches. A binary string search tree like that may be more efficient in some cases, but it's pretty expensive for small sets like yours.
Finally, look into alternative std::map implementations. I've heard bad things about some of VC's stl code performance. The DEBUG library in particular is bad about checking you on every call. StlPort used to be a good alternative, but I haven't tried it in a few years. I've always loved Boost too.
As Even said the operator used in a set is < not ==.
If you don't care about the order of the strings in your set you can pass the set a custom comparator that performs better than the regular less-than.
For example, if a lot of your strings have similar prefixes (but they vary in length), you can sort by string length (since string.length is a constant-time operation).
If you do so beware a common mistake:
struct comp {
bool operator()(const std::string& lhs, const std::string& rhs)
{
if (lhs.length() < rhs.length())
return true;
return lhs < rhs;
}
};
This operator does not maintain a strict weak ordering, as it can treat two strings as each less than the other.
string a = "z";
string b = "aa";
Follow the logic and you'll see that comp(a, b) == true and comp(b, a) == true.
The correct implementation is:
struct comp {
bool operator()(const std::string& lhs, const std::string& rhs) const
{
if (lhs.length() != rhs.length())
return lhs.length() < rhs.length();
return lhs < rhs;
}
};
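A sketch of how such a comparator could be plugged into a std::map. The names LengthFirstLess, PathMap and lookup_or are made up for this example; the comparator logic is the corrected version above, with operator() marked const so the map can call it from const member functions.

```cpp
#include <map>
#include <string>

// Orders strings by length first, and lexicographically only among
// strings of equal length. This is a valid strict weak ordering.
struct LengthFirstLess {
    bool operator()(const std::string& lhs, const std::string& rhs) const {
        if (lhs.length() != rhs.length())
            return lhs.length() < rhs.length();
        return lhs < rhs;
    }
};

using PathMap = std::map<std::string, int, LengthFirstLess>;

// Hypothetical helper: returns the mapped value, or `fallback` if the key is absent.
int lookup_or(const PathMap& paths, const std::string& key, int fallback) {
    const auto it = paths.find(key);
    return it == paths.end() ? fallback : it->second;
}
```

Note that iteration order over such a map is by length rather than alphabetical, which only matters if the rest of the code relies on sorted traversal.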
The first thing is to try using a hash_map if that's possible - you are right that the standard string compare doesn't first check for size (since it compares lexicographically), but writing your own map code is something you'd be better off avoiding. From your question it sounds like you do not need to iterate over ranges; in that case map doesn't have anything hash_map doesn't.
It also depends on what sort of keys you have in your map. Are they typically very long? Also what does "a little slow" mean? If you have not profiled the code it's quite possible that it's a different part taking time.
Update: Hmm, the bottleneck in your program is a map::find, but the map always has less than 15 elements. This makes me suspect that the profile was somehow misleading, because a find on a map this small should not be slow, at all. In fact, a map::find should be so fast, just the overhead of profiling could be more than the find call itself. I have to ask again, are you sure this is really the bottleneck in your program? You say the strings are paths, but you're not doing any sort of OS calls, file system access, disk access in this loop? Any of those should be orders of magnitude slower than a map::find on a small map. Really any way of getting a string should be slower than the map::find.
You can try to use a sorted vector (here's one sample); this may turn out to be faster (you'll have to profile it to make sure, of course).
Reasons to think it'll be faster:
Less memory allocations and deallocations (the vector will expand to the maximal size used and then reuse freed memory).
Binary find with random access should be faster than tree traversal (especially due to data locality).
Reasons to think it'll be slower:
Deletions and additions will mean moving strings around in memory. Since string's swap is efficient and the size of the data set is small, this may not be an issue.
std::map's comparator isn't std::equal_to, it's std::less. I'm not sure what the best way is to short-circuit a < compare so that it would be faster than the built-in one.
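A minimal sketch of the sorted-vector idea, assuming string keys and int values. The class name FlatMap and its member names are made up for this example, and whether it beats std::map on your data is something only a profile can tell.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// A tiny map kept as a vector of (key, value) pairs sorted by key.
class FlatMap {
public:
    using Entry = std::pair<std::string, int>;

    // Inserts a new entry or overwrites the value of an existing key.
    void insert_or_assign(const std::string& key, int value) {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        if (it != entries_.end() && it->first == key) {
            it->second = value;
        } else {
            entries_.insert(it, Entry(key, value));  // shifts later entries up
        }
    }

    // Binary search; returns a pointer to the value, or nullptr if absent.
    const int* find(const std::string& key) const {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        return (it != entries_.end() && it->first == key) ? &it->second : nullptr;
    }

    // Removes the key if present; returns whether anything was removed.
    bool erase(const std::string& key) {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        if (it == entries_.end() || it->first != key) return false;
        entries_.erase(it);  // shifts later entries down
        return true;
    }

private:
    static bool key_less(const Entry& entry, const std::string& key) {
        return entry.first < key;
    }

    std::vector<Entry> entries_;
};
```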
If there are always < 15 elems, perhaps you could use a key besides std::string?
Motti has a good solution. However, I'm pretty sure that for your < 15 elements a map isn't the right way because its overhead will always be greater than that of a simple lookup table with an appropriate hashing scheme. In your case, it might even be enough to hash by length alone, and if that still produces collisions, use a linear search through all entries of the same length.
To establish if I'm right, a benchmark is of course required but I'm quite sure of its outcome.
You might consider pre-computing a hash for a string, and saving that in your map. Doing so gives the advantage of hash compares instead of string compares during the search through the std::map tree.
class HashedString
{
unsigned m_hash;
std::string m_string;
public:
HashedString(const std::string& str)
: m_hash(HashString(str))
, m_string(str)
{};
// ... copy constructor and etc...
unsigned GetHash() const {return m_hash;}
const std::string& GetString() const {return m_string;}
};
This has the benefits of computing a hash of the string once, on construction. After this, you could implement a comparison function:
struct comp
{
bool operator()(const HashedString& lhs, const HashedString& rhs) const
{
if(lhs.GetHash() < rhs.GetHash()) return true;
if(lhs.GetHash() > rhs.GetHash()) return false;
return lhs.GetString() < rhs.GetString();
}
};
Since hashes are now computed on HashedString construction, they are stored that way in the std::map, and so the compare can happen very quickly (an integer compare) in an astronomically high percentage of the time, falling back on standard string compares when the hashes are equal.
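Put together, a compilable variant might look like the sketch below. It assumes std::hash<std::string> as the hash function (the HashString function above is not defined in this answer), and the names HashedLess and HashedMap are made up for this example.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// A string that carries its hash, computed once at construction.
class HashedString {
public:
    HashedString(const std::string& str)
        : hash_(std::hash<std::string>()(str)), string_(str) {}

    std::size_t hash() const { return hash_; }
    const std::string& str() const { return string_; }

private:
    std::size_t hash_;
    std::string string_;
};

// Orders by hash first; falls back to a string compare only on equal hashes.
struct HashedLess {
    bool operator()(const HashedString& lhs, const HashedString& rhs) const {
        if (lhs.hash() != rhs.hash()) return lhs.hash() < rhs.hash();
        return lhs.str() < rhs.str();
    }
};

using HashedMap = std::map<HashedString, int, HashedLess>;
```

Keep in mind that every lookup still has to hash the search key once, which touches every character of it, so the gain depends on how often the same HashedString object is reused for several lookups.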
Maybe you could reverse the strings prior to using them as keys in the map? That could help if the first few letters of each string are identical.
Here are some things you can consider:
0) Are you sure this is where the performance bottleneck is? Like the results from Quantify, Cachegrind, gprof or something like that? Because lookups on such a small map should be fairly fast...
1) You can override the functor used to compare the keys in std::map<>, there is a second template parameter to do that. I doubt you can do much better than operator<, however.
2) Are the contents of the map changing a lot? If not, and given the very small size of your map, maybe using a sorted vector and binary search could yield better results (for example because you can exploit memory locality better).
3) Are the elements known at compile time? You could use a perfect hash function to improve lookup times if that is the case. Search for gperf on the web.
4) Do you have a lot of lookups that fail to find anything? If so, maybe comparing with the first and last elements in the collection may eliminate many mismatches quicker than a full search every time.
These have been suggested already, but in more detail:
5) Since you have so few strings, maybe you could use a different key. For example, are your keys all the same size? Can you use a class containing a fixed-length array of characters? Can you convert your strings to numbers or some data structure with only numbers?
Depending on the use cases, there are some other techniques you can use. For example, we had an application that needed to keep up with over a million different file paths. The problem was that there were thousands of objects that needed to keep small maps of these file paths.
Since adding new file paths to the data set was an infrequent operation, when a path was added to the system, a master map was searched. If the path was not found, then it was added and a new sequenced integer (starting at 1) was returned. If the path already existed, then the previously assigned integer was returned. Then each map maintained by each object was converted from a string-based map to an integer map. Not only did this greatly improve performance, it reduced memory usage by not having so many duplicate copies of the strings.
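The master-map scheme described above could be sketched roughly as follows. The class name PathRegistry and the use of std::unordered_map are assumptions made for this illustration; the original application's code is not shown here.

```cpp
#include <string>
#include <unordered_map>

// Assigns each distinct path a small integer id, starting at 1.
class PathRegistry {
public:
    // Returns the id already assigned to `path`, or assigns the next one.
    int intern(const std::string& path) {
        const auto result = ids_.emplace(path, next_id_);
        if (result.second) {
            ++next_id_;  // the path was new, so the id has now been used
        }
        return result.first->second;
    }

private:
    std::unordered_map<std::string, int> ids_;
    int next_id_ = 1;
};
```

The per-object maps can then be keyed by int (for example std::map<int, Value> or a small vector), so the hot lookups compare integers instead of long path strings.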
Sure, this is a very specific optimization. But when it comes to performance improvements, you often find yourself having to make tailored solutions to specific problems.
And I hate strings :) Not only are they slow to compare, but they can really trash your CPU caches in high-performance software.
Try std::tr1::unordered_map (found in the header <tr1/unordered_map>). This is a hash map, and, while it doesn't maintain a sorted order of elements, will likely be far faster than a regular map.
If your compiler doesn't support TR1, get a newer version. MSVC and gcc both support TR1, and I believe the newest versions of most other compilers also have support. Unfortunately, a lot of the library reference sites haven't been updated, so TR1 remains a largely-unknown piece of technology.
I hope C++0x isn't the same way.
EDIT: Note that the default hashing method for tr1::unordered_map is tr1::hash, which needs to be specialized to work on a UDT, probably.
Where you have long common substrings, a trie might be a better data structure than a map or a hash_map. I said "might", though - a hash_map already only traverses the key once per lookup, so should be fairly fast. I won't discuss it further since others already have.
You could also consider a splay tree if some keys are more frequently looked up than others, but of course this makes the worst-case lookup worse than a balanced tree, and lookups are mutating operations, which may matter to you if you're using e.g. a reader-writer lock.
If you care about the performance of lookups more than modifications, you might do better with an AVL tree than a red-black, which I think is what STL implementations generally use for map. An AVL tree is typically better balanced and so will on average require fewer comparisons per lookup, but the difference is marginal.
Finding an implementation of these that you're happy with might be an issue. A search on the Boost main page suggests they have a splay and AVL tree but not a trie.
You mentioned in a comment that you never have a lookup that fails to find anything. So you could in theory skip the final comparison, which in a tree of 15 < 2^4 elements could give you something like a 20-25% speedup without doing anything else. In fact, maybe more than that, since equal strings are the slowest to compare. Whether it's worth writing your own container just for this optimisation is another question.
You might also consider locality of reference - I don't know whether you could avoid the occasional page miss by allocating the keys and the nodes out of a small heap. If you only need about 15 entries at a time, then assuming a file name limit below 256 bytes you could ensure that everything accessed during a lookup fits into a single 4k page (apart from the key being looked up, of course). It may be that comparing the strings is insignificant compared with a couple of page loads. However, if this is your bottleneck there must be an enormous number of lookups going on, so I'd guess that everything is reasonably close to the CPU. Worth checking, maybe.
Another thought: if you are using pessimistic locking on a structure where there's a lot of contention (you said in a comment the program is massively multi-threaded) then regardless of what the profiler tells you (what code the CPU cycles are spent in), it might be costing you more than you think by effectively limiting you to 1 core. Try a reader-writer lock?
hash_map is not standard, try using unordered_map available in tr1 (which is available in boost if your tool chain doesn't already have it).
For small numbers of strings you might be better using vector, as map is typically implemented as a tree.
Why don't you use a hashtable instead? boost::unordered_map could do. Or you can roll out your own solution, and store the crc of a string instead of the string itself. Or better yet, put #defines for the strings, and use those for lookup, e.g.,
#define STRING_1 "STRING_1"