How to compare strings in C++ and mergesort - c++

I am going to write code that compares objects in an array and sorts them by name.
First of all, how do I compare strings in C++? In Java it is easy: oneString.compareTo(another);
If you have a merge sort implementation in C++, please share it. Thank you!

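Comparing strings in C++ is very similar to Java: the member function is called compare instead of compareTo, so use oneString.compare(another). Like compareTo, it returns a negative value, zero, or a positive value. The relational operators (==, <, and so on) also work directly on std::string.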
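As a minimal sketch of both styles of comparison: the helper names compare_sign and comes_before are made up for illustration and are not part of the standard library. compare() only guarantees the sign of its result, so the first helper normalizes it to -1, 0 or 1.

```cpp
#include <string>

// Hypothetical helper: maps the result of std::string::compare to -1, 0 or 1,
// because compare() only guarantees the sign of its return value.
int compare_sign(const std::string& a, const std::string& b)
{
    const int r = a.compare(b);
    return (r > 0) - (r < 0);
}

// Hypothetical helper: the relational operators give the same ordering directly.
bool comes_before(const std::string& a, const std::string& b)
{
    return a < b;
}
```

Both compare character by character, so the ordering is case-sensitive.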

You can rely on operator< for std::string, which std::sort uses by default when no comparison function is passed; the example below uses a C++11 lambda expression only for printing. std::sort may or may not be implemented as a merge sort, but it has the same O(N log N) complexity.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
int main()
{
    std::vector<std::string> v = { "foo", "bar" };
    std::sort(v.begin(), v.end());
    std::for_each(v.begin(), v.end(), [](std::string const& elem) {
        std::cout << elem << "\n";
    });
    return 0;
}
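Since the question also asks for a merge sort, here is one possible sketch rather than a definitive implementation. The Person type, the by_name comparison and the merge_sort name are assumptions made for illustration. The merge step is delegated to std::inplace_merge, which keeps elements with equal names in their original order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type; the question only says the objects have a name.
struct Person {
    std::string name;
    int age;
};

// Top-down merge sort over [first, last), ordering elements with `less`.
// std::inplace_merge performs the merge step and keeps equal elements in order.
template <class RandomIt, class Compare>
void merge_sort(RandomIt first, RandomIt last, Compare less)
{
    if (last - first < 2) return;                  // 0 or 1 element: already sorted
    RandomIt middle = first + (last - first) / 2;
    merge_sort(first, middle, less);               // sort the left half
    merge_sort(middle, last, less);                // sort the right half
    std::inplace_merge(first, middle, last, less); // merge the two sorted halves
}

// Comparison by name, the counterpart of a.name.compareTo(b.name) < 0 in Java.
bool by_name(const Person& a, const Person& b)
{
    return a.name < b.name;
}
```

A call would look like merge_sort(people.begin(), people.end(), by_name). If a hand-written sort is not required, std::stable_sort with the same comparison gives the same result and is typically implemented as a merge sort.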

Related

Iterating and printing std::map using std::for_each

Recently, I learnt about the STL types and templates, and we were given a challenge as part of practicing the STL and getting used to it:
Iterate over a std::map<std::string, size_t>
Print its contents
Restrictions:
Can only use: std::vector, std::map, std::string, std::algorithm, std::functional
Cannot define complex types nor templates
Cannot use the . (member access), -> (member access via pointer), * (dereference) operators
Cannot use for, while, do-while nor if-else, switch and other conditionals
Can use std::for_each and other functions of function templates to iterate over collection of elements
No lambdas
No std::cout, std::cerr, std::ostream etc.
No auto types
Can use other STL templates so long as they are included in the headers described at (1)
Allowed to use these functions:
void print(const std::string& str)
{
std::cout << str << std::endl;
}
std::string split(const std::pair<std::string, size_t> &r)
{
std::string name;
std::tie(name, std::ignore) = r;
return name;
}
Originally, I had wanted to use std::for_each(std::begin(mymap), std::end(mymap), print) to iterate over the map and then use the print function to print out the contents. Then I realised that I am actually working with std::pair<std::string, size_t>, which made me consider using std::bind and std::tie to break the std::pair up. But since I THINK I need to do it inside the std::for_each expression, how can I break up the std::pair while also calling print on the elements?
I have also considered using structured bindings, but I am not allowed to use auto.
So, the question is, how do I make use of the STL to iterate over the map, extract the keys, and print them using the helper functions provided? Obviously, without the restrictions the challenge would have been very easy, but I am at a loss as to what kind of functions in the STL are appropriate in light of this.
I used your function, which takes a const std::pair&, inside the callable passed as the third argument of for_each.
I use printf() to print the values.
#include <algorithm>
#include <cstdio>   // printf
#include <iostream>
#include <map>
#include <string>
#include <tuple>    // std::tie, std::ignore
#include <vector>
using namespace std;
std::string Split(const std::pair<std::string, size_t> &r)
{
    std::string name;
    std::tie(name, std::ignore) = r;
    return name;
}
int main()
{
    string name1{ "John" };
    string name2{ "Jack" };
    std::map<std::string, size_t> sample = { {name1, 31}, {name2, 35} };
    static vector<std::string> names;  // static, so the capture-less lambda can use it
    std::for_each(sample.begin(), sample.end(), [](std::pair<std::string, size_t> pickup)
    {
        static int i = 0;  // index of the name added by this call
        names.push_back(Split(pickup));
        printf("%s\n", names[i].c_str());
        i++;
    });
}
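For comparison, here is a sketch that stays closer to the stated restrictions: no lambdas, loops, auto, or member-access operators in the iterating code. It assumes that std::back_inserter (declared in <iterator>) counts as allowed, which the challenge text does not confirm. The function name print_keys is made up; print and split are the helpers from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Helper functions as given in the question.
void print(const std::string& str)
{
    std::cout << str << std::endl;
}

std::string split(const std::pair<std::string, std::size_t>& r)
{
    std::string name;
    std::tie(name, std::ignore) = r;
    return name;
}

// Hypothetical function: std::transform applies split to every map entry and
// collects the keys, then std::for_each passes each key to print.
// The keys are also returned so the result can be inspected.
std::vector<std::string> print_keys(const std::map<std::string, std::size_t>& mymap)
{
    std::vector<std::string> keys;
    std::transform(std::begin(mymap), std::end(mymap),
                   std::back_inserter(keys), split);
    std::for_each(std::begin(keys), std::end(keys), print);
    return keys;
}
```

The map's element type is std::pair<const std::string, size_t>, which converts implicitly to the std::pair<std::string, size_t> that split expects, so split can be passed to std::transform as it is.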

Finding out whether two ranges (one of them is sorted) have a common element

I wrote the following code that does it:
std::vector<int> vec;
std::vector<int> sortedRange;
// ...
bool hasCommonElement =
std::any_of(begin(vec), end(vec),
std::bind(std::binary_search, begin(sortedRange), end(sortedRange), _1));
The compiler is complaining that it cannot find out which overload of binary search I mean. Do you have any other elegant solution? Or a good reason why it does not compile?
Edit
I do know I can use a lambda. But here the bind seems more elegant (if I had generic lambdas, it would be great! But I don't).
I do know that I can qualify the iterator type: binary_search<std::vector<int>::iterator>. But it is even less elegant.
I know I can also do it by sorting "vec" and using set_intersection. But this is more complicated too.
You can do it with a lambda instead of bind:
bool hasCommonElement = any_of(begin(vec), end(vec), [&](int x) {return binary_search(begin(sortedRange), end(sortedRange), x);});
You have two issues: binary_search is a function template whose template parameters cannot be deduced when it is passed to bind, and you need to qualify the placeholder:
#include <vector>
#include <algorithm>
#include <functional>
int main() {
    std::vector<int> vec;
    std::vector<int> sortedRange;
    bool hasCommonElement =
        std::any_of(
            begin(vec), end(vec),
            std::bind(
                std::binary_search<std::vector<int>::iterator, int>,
                begin(sortedRange),
                end(sortedRange),
                std::placeholders::_1));
    return 0;
}
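If spelling out the template arguments feels clumsy and generic lambdas are not available, one alternative is a small hand-written function object with a templated operator(), which behaves much like a generic lambda in C++11. This is only a sketch; ContainedIn, contained_in and has_common_element are made-up names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical function object: tests membership in the sorted range
// [first, last). The templated operator() plays the role of a generic lambda.
template <class It>
struct ContainedIn {
    It first;
    It last;

    template <class T>
    bool operator()(const T& value) const
    {
        return std::binary_search(first, last, value);
    }
};

// Hypothetical factory so the iterator type is deduced at the call site.
template <class It>
ContainedIn<It> contained_in(It first, It last)
{
    return ContainedIn<It>{first, last};
}

// True if any element of `values` also occurs in `sorted`,
// which must already be in ascending order.
bool has_common_element(const std::vector<int>& values,
                        const std::vector<int>& sorted)
{
    return std::any_of(values.begin(), values.end(),
                       contained_in(sorted.begin(), sorted.end()));
}
```

Because the call to std::binary_search happens inside operator(), overload resolution and template deduction work as in any ordinary call, and no placeholder is needed.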

how to bind elements from one container to call member func on another container

I have two containers: one is a vector and the other is an unordered_set.
Now, I want to check whether any element from the vector exists in the unordered_set - something like find_first_of does - and return true/false accordingly.
Now, since I wanted to exploit find of unordered_set, I was thinking of using any_of(vector_container.begin(), vector_container.end(), predicate) instead of find_first_of.
Is there a way that I can use boost::bind to bind elements from the vector to find from the unordered_set so that I don't have to write the predicate class?
Doing this with find() is very awkward, but there is an alternative: you can use unordered_set's count() function:
boost::algorithm::any_of(
vector_container.begin(), vector_container.end(),
boost::bind(&boost::unordered_set<int>::count, boost::ref(set_container), _1));
Here's one slightly lazy predicate, using count(n) == 0 as "non-existence" (and "== 1" as "existence"):
boost::bind(
std::equal_to<std::size_t>(),
boost::bind(std::mem_fun(&std::unordered_set<int>::count), &s, _1),
0)
This takes advantage of the composability of boost::bind. If you're a little more verbose you can substitute find(n) == end() or find(n) != end() instead.
Here's a little demo, removing all the elements that are in a set from a vector:
#include <boost/bind.hpp>
#include <unordered_set>
#include <algorithm>
#include <functional>
#include <iostream>
#include <vector>
int main()
{
std::unordered_set<int> s { 1, 2, 3, 4 };
std::vector<int> v { 2, 5, 9, 1 };
v.erase(
std::remove_if(
v.begin(), v.end(),
boost::bind(
std::equal_to<std::size_t>(),
boost::bind(std::mem_fun(&std::unordered_set<int>::count), &s, _1),
1)),
v.end());
for (int n : v) { std::cout << n << "\n"; }
}
std::any_of(v.begin(), v.end(), boost::bind(&std::set<int>::find, &s, boost::lambda::_1) != s.end());
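Where C++11 lambdas are an option, the same check can be sketched without any bind machinery. This is an alternative to the Boost-based answers above, not a rewrite of them, and the name any_in_set is made up for illustration.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical helper: true if any element of `values` is present in `lookup`.
// Each find() is a hash lookup, constant time on average.
bool any_in_set(const std::vector<int>& values,
                const std::unordered_set<int>& lookup)
{
    return std::any_of(values.begin(), values.end(),
                       [&lookup](int x) { return lookup.find(x) != lookup.end(); });
}
```

std::any_of stops at the first match, so the whole vector is only scanned when no element is found in the set.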

how to find duplicates in std::vector<string> and return a list of them?

So if I have a vector of words like:
Vec1 = "words", "words", "are", "fun", "fun"
resulting list: "fun", "words"
I am trying to determine which words are duplicated, and return an alphabetized vector containing one copy of each. My problem is that I don't even know where to start; the only thing close to it I found was std::unique_copy, which doesn't exactly do what I need. Specifically, I am inputting a std::vector<std::string> but outputting a std::list<std::string>. If needed, I can use a functor.
Could someone at least push me in the right direction please? I already tried reading the STL documentation, but I am just "brain" blocked right now.
In 3 lines (not counting the vector and list creation nor the superfluous line-breaks in name of readability):
vector<string> vec{"words", "words", "are", "fun", "fun"};
list<string> output;
sort(vec.begin(), vec.end());
set<string> uvec(vec.begin(), vec.end());
set_difference(vec.begin(), vec.end(),
uvec.begin(), uvec.end(),
back_inserter(output));
EDIT
Explanation of the solution:
Sorting the vector is needed in order to use set_difference() later.
The uvec set will automatically keep elements sorted, and eliminate duplicates.
The output list will be populated by the elements of vec - uvec.
Make an empty std::unordered_set<std::string>
Iterate over your vector, checking whether each item is a member of the set
If it's already in the set, this is a duplicate, so add to your result list
Otherwise, add to the set.
Since you want each duplicate only listed once in the results, you can use a hashset (not list) for the results as well.
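The steps above could be sketched as follows. The function name find_duplicates is made up, and the final sort is an addition needed only because the question asks for alphabetical output, which hash sets do not provide.

```cpp
#include <list>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical function that follows the listed steps literally.
std::list<std::string> find_duplicates(const std::vector<std::string>& words)
{
    std::unordered_set<std::string> seen;        // every word met so far
    std::unordered_set<std::string> duplicates;  // each duplicate kept once

    for (const std::string& word : words) {
        if (seen.count(word) != 0)
            duplicates.insert(word);   // already seen: record it as a duplicate
        else
            seen.insert(word);         // first occurrence: remember it
    }

    // Hash sets are unordered, so sort once at the end for alphabetical output.
    std::list<std::string> result(duplicates.begin(), duplicates.end());
    result.sort();
    return result;
}
```

A word that appears three or more times is still reported once, because the result is collected in a set before being copied to the list.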
IMO, Ben Voigt started with a good basic idea, but I would caution against taking his wording too literally.
In particular, I dislike the idea of searching for the string in the set, then adding it to your set if it's not present, and adding it to the output if it was present. This basically means every time we encounter a new word, we search our set of existing words twice, once to check whether a word is present, and again to insert it because it wasn't. Most of that searching will be essentially identical -- unless some other thread mutates the structure in the interim (which could give a race condition).
Instead, I'd start by trying to add it to the set of words you've seen. That returns a pair<iterator, bool>, with the bool set to true if and only if the value was inserted -- i.e., was not previously present. That lets us consolidate the search for an existing string and the insertion of the new string together into a single insert:
while (input >> word)
if (!(existing.insert(word)).second)
output.insert(word);
This also cleans up the flow enough that it's pretty easy to turn the test into a functor that we can then use with std::remove_copy_if to produce our results quite directly:
#include <set>
#include <iterator>
#include <algorithm>
#include <string>
#include <vector>
#include <iostream>
class show_copies {
std::set<std::string> existing;
public:
bool operator()(std::string const &in) {
return existing.insert(in).second;
}
};
int main() {
std::vector<std::string> words{ "words", "words", "are", "fun", "fun" };
std::set<std::string> result;
std::remove_copy_if(words.begin(), words.end(),
std::inserter(result, result.end()), show_copies());
for (auto const &s : result)
std::cout << s << "\n";
}
Depending on whether I cared more about code simplicity or execution speed, I might use an std::vector instead of the set for result, and use std::sort followed by std::unique_copy to produce the final result. In such a case I'd probably also replace the std::set inside of show_copies with an std::unordered_set instead:
#include <unordered_set>
#include <iterator>
#include <algorithm>
#include <string>
#include <vector>
#include <iostream>
class show_copies {
std::unordered_set<std::string> existing;
public:
bool operator()(std::string const &in) {
return existing.insert(in).second;
}
};
int main() {
std::vector<std::string> words{ "words", "words", "are", "fun", "fun" };
std::vector<std::string> intermediate;
std::remove_copy_if(words.begin(), words.end(),
std::back_inserter(intermediate), show_copies());
std::sort(intermediate.begin(), intermediate.end());
std::unique_copy(intermediate.begin(), intermediate.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
}
This is marginally more complex (one whole line longer!) but likely to be substantially faster when/if the number of words gets very large. Also note that I'm using std::unique_copy primarily to produce visible output. If you just want the result in a collection, you can use the standard unique/erase idiom to get unique items in intermediate.
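For reference, the unique/erase idiom mentioned here might look like the sketch below. The helper name sort_and_deduplicate is made up; it would be applied to a vector such as intermediate.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts `items` and drops repeated values in place.
void sort_and_deduplicate(std::vector<std::string>& items)
{
    std::sort(items.begin(), items.end());
    // std::unique moves the distinct elements to the front and returns the new
    // logical end; erase then trims the leftover tail.
    items.erase(std::unique(items.begin(), items.end()), items.end());
}
```

std::unique only removes adjacent repeats, which is why the sort has to come first.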
In place (no additional storage). No string copying (except to result list). One sort + one pass:
#include <string>
#include <vector>
#include <list>
#include <iostream>
#include <algorithm>
using namespace std;
int main() {
    vector<string> vec{"words", "words", "are", "fun", "fun"};
    list<string> dup;
    sort(vec.begin(), vec.end());
    const string empty{""};
    const string* prev_p = &empty;
    for(const string& s: vec) {
        if (*prev_p==s) dup.push_back(s);
        prev_p = &s;
    }
    for(auto& w: dup) cout << w << ' ';
    cout << '\n';
}
You can get a pretty clean implementation using a std::map to count the occurrences, and then relying on std::list::sort to sort the resulting list of words. For example:
std::list<std::string> duplicateWordList(const std::vector<std::string>& words) {
std::map<std::string, int> temp;
std::list<std::string> ret;
for (std::vector<std::string>::const_iterator iter = words.begin(); iter != words.end(); ++iter) {
temp[*iter] += 1;
// only add the word to our return list on the second copy
// (first copy doesn't count, third and later copies have already been handled)
if (temp[*iter] == 2) {
ret.push_back(*iter);
}
}
ret.sort();
return ret;
}
Using a std::map there seems a little wasteful, but it gets the job done.
Here's a better algorithm than the ones other people have proposed:
#include <algorithm>
#include <string>
#include <utility>  // std::swap
#include <vector>
template<class It> It unique2(It const begin, It const end)
{
    It i = begin;
    if (i != end)
    {
        It j = i;
        for (++j; j != end; ++j)
        {
            if (*i != *j)
            { using std::swap; swap(*++i, *j); }
        }
        ++i;
    }
    return i;
}
int main()
{
    std::vector<std::string> v;
    v.push_back("words");
    v.push_back("words");
    v.push_back("are");
    v.push_back("fun");
    v.push_back("words");
    v.push_back("fun");
    v.push_back("fun");
    std::sort(v.begin(), v.end());
    // Drop one copy of every distinct word; only the extra copies remain.
    v.erase(v.begin(), unique2(v.begin(), v.end()));
    std::sort(v.begin(), v.end());
    // Keep a single copy of each remaining (duplicated) word.
    v.erase(unique2(v.begin(), v.end()), v.end());
}
It's better because it only requires swap with no auxiliary vector for storage, which means it will behave optimally for earlier versions of C++, and it doesn't require elements to be copyable.
If you're more clever, I think you can avoid sorting the vector twice as well.
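One way to get by with a single sort is to walk the sorted vector run by run and report each run of two or more equal words once. This is a sketch of a different technique from the swap-based unique2 above: it copies the duplicated words into the output list, so it does require copyable elements. The function name duplicates_once is made up.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Hypothetical function: returns each word that occurs more than once,
// exactly once and in alphabetical order. The vector is taken by value
// because the function sorts its own copy.
std::list<std::string> duplicates_once(std::vector<std::string> words)
{
    std::sort(words.begin(), words.end());

    std::list<std::string> result;
    std::vector<std::string>::iterator it = words.begin();
    while (it != words.end()) {
        // After sorting, equal words are adjacent; find the end of this run.
        std::vector<std::string>::iterator run_end =
            std::upper_bound(it, words.end(), *it);
        if (run_end - it > 1)
            result.push_back(*it);   // run of two or more: a duplicate
        it = run_end;                // skip the whole run
    }
    return result;
}
```

Because whole runs are skipped, a word that appears three or more times is still listed only once.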

How to filter or "grep" a C++ vector?

I've got a vector<MyType> and would like another vector<MyType> containing only those MyTypes which fulfill some simple criteria, e.g. that some data member equals something. What's the best way to solve this?
Use copy_if:
#include <algorithm> // for copy_if
#include <iterator> // for back_inserter
#include <vector>
std::vector<MyType> v2;
std::copy_if(v1.begin(), v1.end(), std::back_inserter(v2),
             [](MyType const & x) { return simple_criterion(x); } );
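A self-contained sketch of the same idea, assuming a made-up MyType with a category member and a made-up filter_by_category function (the question does not say what MyType actually contains):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical element type; stands in for the MyType of the question.
struct MyType {
    std::string name;
    int category;
};

// Hypothetical filter: copies the elements whose `category` equals `wanted`.
std::vector<MyType> filter_by_category(const std::vector<MyType>& input, int wanted)
{
    std::vector<MyType> output;
    std::copy_if(input.begin(), input.end(), std::back_inserter(output),
                 [wanted](const MyType& x) { return x.category == wanted; });
    return output;
}
```

The input vector is left unchanged and the matching elements keep their original relative order.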
Using a little bit of Boost, you can:
std::vector<int> v = {1,2,-9,3};
for (auto i : v | filtered(_arg1 >=0))
std::cout << i << "\n";
This sample uses Phoenix for implicit lambdas defined by an expression template (_arg1 >= 0), but you can use any callable (C++03 or higher) with Boost adaptors (filtered, transformed, reversed, etc.).
See here for more showcase material and a full example:
Is it possible to use boost::filter_iterator for output?