The following code groups values for any container with a generic grouping lambda:
template<class Iterator, class GroupingFunc,
class T = remove_reference_t<decltype(*declval<Iterator>())>,
class GroupingType = decltype(declval<GroupingFunc>()(declval<T&>()))>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
map<GroupingType, list<T>> groups;
for_each(begin, end,
[&groups, groupingFunc](const auto& val){
groups[groupingFunc(val)].push_back(val);
} );
return groups;
}
With the following usage:
int main() {
list<string> strs = {"hello", "world", "Hello", "World"};
auto groupOfStrings =
groupValues(strs.begin(), strs.end(),
[](auto& val) {
return (char)toupper(val.at(0));
});
print(groupOfStrings); // assume a print method
list<int> numbers = {1, 5, 10, 24, 13};
auto groupOfNumbers =
groupValues(numbers.begin(), numbers.end(),
[](int val) {
int decile = int(val / 10) * 10;
return to_string(decile) + '-' + to_string(decile + 9);
});
print(groupOfNumbers); // assume a print method
}
I am a bit uneasy about the (over?)use of declval and decltype in the groupValues function.
Do you see a better way of writing it?
(The question is mainly about style and clarity, unless of course you see any other issue.)
Code: http://coliru.stacked-crooked.com/a/f65d4939b402a750
I would probably move the last two template parameters inside the function, and use std::result_of to give a slightly more tidy function:
template <typename T>
using deref_iter_t = std::remove_reference_t<decltype(*std::declval<T>())>;
template<class Iterator, class GroupingFunc>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
using T = deref_iter_t<Iterator>;
using GroupingType = std::result_of_t<GroupingFunc(T&)>;
std::map<GroupingType, std::list<T>> groups;
std::for_each(begin, end, [&groups, groupingFunc](const auto& val){
groups[groupingFunc(val)].push_back(val);
});
return groups;
}
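Because std::result_of is deprecated in C++17 and removed in C++20, the same idea can also be sketched with std::invoke_result_t. This is only an illustration under assumptions, not the answerer's code: it assumes a C++17 compiler, takes the element type from std::iterator_traits instead of decltype, and demoGroupByInitial is a hypothetical helper added just to show a call.

```cpp
#include <cctype>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <string>
#include <type_traits>

template <class Iterator, class GroupingFunc>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
    // Element type as reported by the iterator itself (never a reference).
    using T = typename std::iterator_traits<Iterator>::value_type;
    // Key type: whatever the grouping function returns for a const T&.
    using GroupingType =
        std::decay_t<std::invoke_result_t<GroupingFunc&, const T&>>;

    std::map<GroupingType, std::list<T>> groups;
    for (; begin != end; ++begin) {
        const T& val = *begin;
        groups[std::invoke(groupingFunc, val)].push_back(val);
    }
    return groups;
}

// Hypothetical helper (not in the original post): groups strings by their
// upper-cased first letter.
std::map<char, std::list<std::string>> demoGroupByInitial() {
    std::list<std::string> strs = {"hello", "world", "Hello", "World"};
    return groupValues(strs.begin(), strs.end(), [](const std::string& s) {
        return static_cast<char>(
            std::toupper(static_cast<unsigned char>(s.at(0))));
    });
}
```

The std::decay_t around the result type is a design choice of this sketch: it keeps the map key a plain value type even if the grouping function returns a reference.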
I made a function that merges two sorted queues.
Queue<int> merge(Queue<int> a, Queue<int> b){
Queue<int> result;
while (!a.isEmpty() && !b.isEmpty()) {
int a1 = a.peek();
int b1 = b.peek();
if (a1 < b1) {
if (! result.isEmpty()) {
if (result.back() > a1) {
error("queue a is not sorted");
}
}
result.add(a1); // add the element to the result and make sure to remove it from the
a.dequeue(); // input queue so we don't get stuck in an infinite loop
} else {
if (! result.isEmpty()) {
if (result.back() > b1) {
error("queue b is not sorted");
}
}
result.add(b1);
b.dequeue();
}
}
while (!a.isEmpty()) {
if (! result.isEmpty()) {
if (result.back() > a.peek()) {
error("queue a is not sorted");
}
}
result.add(a.front());
a.dequeue();
}
while (!b.isEmpty()) {
if (! result.isEmpty()) {
if (result.back() > b.peek()) {
error("queue b is not sorted");
}
}
result.add(b.front());
b.dequeue();
}
return result;
}
Now, I am trying to merge multiple queues together, recursively. Here is my thought process so far:
Divide the input collection of k sequences into two halves, left and right.
Make a recursive call to recMultiMerge on the "left" half of the sequences to generate one combined, sorted sequence. Then, do the same for the "right" half of the sequences, generating a second combined, sorted sequence.
Using the binary merge function I made above, join the two combined sequences into the final result sequence, which is then returned.
I'm having trouble with the actual recursive call, because I can't figure out how to store the result and recurse again. Here is my attempt so far:
Queue<int> recMultiMerge(Vector<Queue<int>>& all)
{
Queue<int> result = {};
Vector<Queue<int>> left = all.subList(0, all.size() / 2);
Vector<Queue<int>> right = all.subList(all.size() / 2, all.size() / 2);
if (all.isEmpty()) {
return {};
}
else if (left.size() == 1) {
return left[0];
}
else if (right.size() == 1) {
return right[0];
}
else {
Queue<int> leftCombined = recMultiMerge(left);
Queue<int> rightCombined = recMultiMerge(right);
result = merge(leftCombined, rightCombined);
}
return result;
}
The problem is, I can't get it to return more than just the first queue. Here is the problem illustrated in a test case:
on
Vector<Queue<int>> all = {{3, 6, 9, 9, 100}, {1, 5, 9, 9, 12}, {5}, {}, {-5, -5}, {3402}}
it yields
{3, 6, 9, 9, 100}
instead of
{-5, -5, 1, 3, 5, 5, 6, 9, 9, 9, 9, 12, 100, 3402}
Any advice?
An explanation of why your code gives the results you see.
The first call to recMultiMerge has 6 queues. left will be the first three ({3, 6, 9, 9, 100}, {1, 5, 9, 9, 12}, {5}), and right will be the last three ({}, {-5, -5}, {3402}).
Then you make a recursive call with left. In that call, all.size() will be 3. left will have one queue ({3, 6, 9, 9, 100}), and right will also only have one queue ({1, 5, 9, 9, 12}). (I'm assuming the 2nd parameter to Vector.subList is a count.) This will stop at the second if because left.size() == 1. The result will be that first queue.
Now we're back at the first recursive call (having lost the 2nd and 3rd queues), and we again make a recursive call with right (which has 3 queues in it). This will proceed like the last call did, returning the first queue (which in this case is empty) and losing the other two.
Then you merge those two queues ({3, 6, 9, 9, 100} and {}), resulting in your answer: {3, 6, 9, 9, 100}.
This reveals two problems: Not properly dividing a Vector with an odd number of queues in it, and terminating the recursion too early (when the left half of the split only has one queue in it, even though the right half may not be empty).
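Both problems can be avoided by splitting on an index range and stopping only when the range itself holds zero or one sequence. The following is a minimal sketch, not the asker's library code: it assumes std::vector<int> in place of Queue<int>, uses std::merge in place of the hand-written merge (so it does not check that the inputs are sorted), and the names Seq and mergeTwo are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

using Seq = std::vector<int>;  // stand-in for Queue<int> (assumption)

// Merges two sorted sequences; unlike the original merge(), this does not
// verify that the inputs are actually sorted.
Seq mergeTwo(const Seq& a, const Seq& b) {
    Seq out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));
    return out;
}

// Merges the sequences all[first], ..., all[last - 1].
Seq recMultiMerge(const std::vector<Seq>& all,
                  std::size_t first, std::size_t last) {
    if (first == last) return {};              // no sequences at all
    if (last - first == 1) return all[first];  // exactly one: nothing to merge
    // The two halves [first, mid) and [mid, last) cover every sequence,
    // including the extra one when the count is odd.
    std::size_t mid = first + (last - first) / 2;
    return mergeTwo(recMultiMerge(all, first, mid),
                    recMultiMerge(all, mid, last));
}
```

Called as recMultiMerge(all, 0, all.size()), the base cases look at the size of the current range rather than at the size of one half, so no sequence is dropped.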
I'd start with a binary fold.
template<class X, class Op>
X binary_fold( span<X> elems, Op op );
or std::vector<X>, but I prefer span for this.
It splits elems into two pieces, then either recurses or calls op on it.
You can then test binary_fold with debugging code that simply prints the pieces on the left/right in some way, and you can see how the recursion plays out.
Once you have that, you plug back in your merge program and it should just work.
Full code:
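#include <algorithm>  // std::min
#include <array>
#include <cstddef>    // std::size_t
#include <utility>    // std::move
#include <vector>
template<class X>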
struct span {
X* b = 0;
X* e = 0;
X* begin() const { return b; }
X* end() const { return e; }
std::size_t size() const { return end()-begin(); }
X& front() const { return *begin(); }
X& back() const { return *(end()-1); }
X* data() const { return begin(); }
bool empty() const { return size()==0; }
span( X* s, X* f ):b(s),e(f) {}
span() = default;
span( X* s, std::size_t l ):span(s, s+l) {}
span( std::vector<X>& v ):span( v.data(), v.size() ) {}
template<std::size_t N>
span( X(&arr)[N] ):span(arr, N) {}
template<std::size_t N>
span( std::array<X, N>& arr ):span(arr.data(), N) {}
span except_front( std::size_t n = 1 ) const {
n = (std::min)(n, size());
return {begin()+n, end()};
}
span only_front( std::size_t n = 1 ) const {
n = (std::min)(n, size());
return {begin(), begin()+n};
}
span except_back( std::size_t n = 1 ) const {
n = (std::min)(n, size());
return {begin(), end()-n};
}
span only_back( std::size_t n = 1 ) const {
n = (std::min)(n, size());
return {end()-n, end()};
}
};
template<class X, class Op>
X binary_fold( span<X> elems, Op op ) {
if (elems.empty()) return {};
if (elems.size() == 1) return elems.front();
auto lhs = binary_fold( elems.only_front( elems.size()/2 ), op );
auto rhs = binary_fold( elems.except_front( elems.size()/2 ), op );
return op(std::move(lhs), std::move(rhs));
}
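To see how the recursion plays out before plugging in a real merge, the fold can be run with an operation that only records how the pieces were combined. The sketch below is an assumption-laden illustration, not the answer's code: binaryFoldRange is a hypothetical pointer-pair variant of binary_fold (so it does not need the span type), and traceFold is a made-up helper that parenthesizes each combination.

```cpp
#include <string>
#include <vector>

// Hypothetical pointer-pair variant of binary_fold: same split rule
// (first half gets size/2 elements), no span type required.
template <class X, class Op>
X binaryFoldRange(const X* first, const X* last, Op op) {
    if (first == last) return X{};
    if (last - first == 1) return *first;
    const X* mid = first + (last - first) / 2;
    return op(binaryFoldRange(first, mid, op),
              binaryFoldRange(mid, last, op));
}

// Debugging aid: instead of merging, show which pieces were combined.
std::string traceFold(const std::vector<std::string>& items) {
    auto show = [](const std::string& lhs, const std::string& rhs) {
        return "(" + lhs + " " + rhs + ")";
    };
    return binaryFoldRange(items.data(), items.data() + items.size(), show);
}
```

For the five items a, b, c, d, e this yields ((a b) (c (d e))), which makes the uneven split on odd sizes visible.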
I have a vector or list, and I want to apply code only to specific elements of it. E.g.
class Container : public std::vector<Element*>
Or
class Container : public std::list<Element*>
And:
Container newContainer = inputContainer.Get(IsSomething);
if (!newContainer.empty()) {
for (Element* const el: newContainer ) {
[some stuff]
}
} else {
for (Element* const el : inputContainer) {
[some stuff]
}
}
I've written a member function Get() as follows.
template<typename Fn>
auto Container::Get(const Fn& fn) const {
Container output;
std::copy_if(cbegin(), cend(), std::inserter(output, output.end()), fn);
return output;
}
and IsSomething would be a lambda, e.g.
auto IsSomething= [](Element const* const el)->bool { return el->someBool; };
From a performance perspective, is this a good approach? Or would it be better to copy and remove?
template<typename Fn>
auto Container::Get(const Fn& fn) const {
Container output(*this);
output.erase(std::remove_if(output.begin(), output.end(), fn), end(output));
return output;
}
Or is there a better approach anyhow?
edit: different example
As my previous example can be written in a better way, let's show a different example:
while (!(container2 = container1.Get(IsSomething)).empty()&&TimesFooCalled<SomeValue)
{
Container container3(container2.Get(IsSomething));
if (!container3.empty()) {
Foo(*container3.BestElement());
} else {
Foo(*container2.BestElement());
}
}
Not answering your direct question, but note that you can implement the original algorithm without copying anything. Something like this:
bool found = false;
for (Element* const el: inputContainer) {
if (IsSomething(el)) {
found = true;
[some stuff]
}
}
if (!found) {
for (Element* const el : inputContainer) {
[some stuff]
}
}
The usual pattern that I use is something like this:
for(auto const * item : inputContainer) if(IsSomething(item)) {
// Do stuff with item
}
This is usually good enough, so other approaches seem overkill.
For performance, it is generally better not to copy or remove elements from the list you are given. In my experience it is even faster if you go through the list only once, for caching reasons. So here is what I would do to find one or the other "best" value from a list:
auto const isBetter = std::greater<Element>();
Element const *best = nullptr, *alt_best = nullptr;
for(Element const * current : inputContainer) {
if(IsSomething(current)) {
if(!best || isBetter(*current, *best)) best = current;
} else {
if(!alt_best || isBetter(*current, *alt_best)) alt_best = current;
}
}
if(best) {
// do something with best
} else if(alt_best) {
// do something with alt_best
} else {
// empty list
}
If you find yourself doing this a lot or you want to make this part of your class's interface you could consider writing an iterator that skips elements you don't like.
If you actually want to remove the item from the list, you could do something like this:
inputContainer.erase(std::remove_if(std::begin(inputContainer), std::end(inputContainer),
[](Element const *item) {
if(IsSomething(item)) {
// Do something with item
return true;
}
return false;
}
), std::end(inputContainer));
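The erase-remove idiom depends on the two-iterator form of erase: std::remove_if only shifts the kept elements to the front and returns the new logical end, and erase then drops everything from there to the real end. A minimal sketch with plain ints, where processAndRemoveEvens and the processed vector are hypothetical stand-ins for "do something with item":

```cpp
#include <algorithm>
#include <vector>

// Removes every even value from `values`; each removed value is appended to
// `processed`, standing in for whatever work is done on the item.
void processAndRemoveEvens(std::vector<int>& values,
                           std::vector<int>& processed) {
    auto newEnd = std::remove_if(values.begin(), values.end(),
                                 [&processed](int v) {
                                     if (v % 2 == 0) {
                                         processed.push_back(v);
                                         return true;   // remove this one
                                     }
                                     return false;      // keep this one
                                 });
    // Erase the whole tail [newEnd, end), not just a single element.
    values.erase(newEnd, values.end());
}
```

Passing only the iterator returned by std::remove_if to erase would remove a single element and leave the rest of the leftover tail in the container.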
I want to return a range from a function that represents a view on a STL collection, something like this:
auto createRange() {
std::unordered_set<int> is = {1, 2, 3, 4, 5, 6};
return is | view::transform([](auto&& i) {
return i;
});
}
However, view::transform does not take ownership of is, so when I run this, there is undefined behavior, because is is freed when createRange exits.
int main(int argc, char* argv[]) {
auto rng = createRange();
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
If I try std::move(is) as the input, I get a static assert indicating that I can't use rvalue references as inputs to a view. Is there any way to ensure that the view takes ownership of the collection?
Edit: Some Additional Info
I want to add some clarifying info. I have a stream of data, data, and a view over it that transforms each item into a struct, Foo, that looks something like this:
struct Foo {
std::string name;
std::unordered_set<int> values;
};
// Take the input stream and turn it into a range of Foos
auto foos = data | asFoo();
What I want to do is create a range of std::pair<std::string, int> by distributing the name throughout the values. My naive attempt looks something like this:
auto result = data | asFoo() | view::transform([](auto&& foo) {
const auto& name = foo.name;
const auto& values = foo.values;
return values | view::transform([name](auto&& value) {
return std::make_pair(name, value);
});
}) | view::join;
However, this results in undefined behavior because values is freed. The only way that I have been able to get around this is to make values a std::shared_ptr and to capture it in the lambda passed to view::transform to preserve its lifetime. That seems like an inelegant solution.
I think what I am looking for is a view that will take ownership of the source collection, but it does not look like range-v3 has that.
Alternatively, I could just create the distributed version using a good old-fashioned for loop, but that does not appear to work with view::join:
auto result = data | asFoo() | view::transform([](auto&& foo) {
const auto& name = foo.name;
const auto& values = foo.values;
std::vector<std::pair<std::string, int>> distributedValues;
for (const auto& value : values) {
distributedValues.emplace_back(name, value);
}
return distributedValues;
}) | view::join;
Even if this did work with view::join, I think that the mixed metaphor of ranges and loops is also inelegant.
Views do not own the data they present. If you need to ensure the persistence of the data, then the data itself needs to be preserved.
auto createRange() {
//We're using a pointer to ensure that the contents don't get moved around, which might invalidate the view
auto is_ptr = std::make_unique<std::unordered_set<int>>(std::initializer_list<int>{1, 2, 3, 4, 5, 6});
auto & is = *is_ptr;
auto view = is | view::transform([](auto&& i) {return i;});
return std::make_pair(view, std::move(is_ptr));
}
int main() {
auto[rng, data_ptr] = createRange();
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
An alternate method is to make sure the function is provided the data set from which the view will be created:
auto createRange(std::unordered_set<int> & is) {
return is | view::transform([](auto&& i) {return i;});
}
int main() {
std::unordered_set<int> is = {1,2,3,4,5,6};
auto rng = createRange(is);
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
Either approach should broadly represent what the solution for your project will need to do: keep the data alive for at least as long as the view that refers to it.
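Another way to keep data and transformation together is to bundle them in one object that owns the container. The sketch below uses only the standard library and is not a range-v3 view: OwningTransform, makeOwningTransform, forEach and collectSorted are hypothetical names, and the wrapper supports only a simple visit-every-element operation rather than the full range interface.

```cpp
#include <algorithm>
#include <unordered_set>
#include <utility>
#include <vector>

// Owns the container and the function, so nothing dangles when the
// wrapper is returned from a function.
template <class Container, class Fn>
class OwningTransform {
public:
    OwningTransform(Container data, Fn fn)
        : data_(std::move(data)), fn_(std::move(fn)) {}

    // Calls visit(fn(x)) for every stored element x.
    template <class Visit>
    void forEach(Visit visit) const {
        for (const auto& x : data_) visit(fn_(x));
    }

private:
    Container data_;
    Fn fn_;
};

template <class Container, class Fn>
OwningTransform<Container, Fn> makeOwningTransform(Container data, Fn fn) {
    return OwningTransform<Container, Fn>(std::move(data), std::move(fn));
}

auto createRange() {
    std::unordered_set<int> is = {1, 2, 3, 4, 5, 6};
    // The set is moved into the wrapper, so it outlives this function.
    return makeOwningTransform(std::move(is), [](int i) { return i * 10; });
}

// Usage: collect the transformed values (sorted, because the iteration
// order of an unordered_set is unspecified).
std::vector<int> collectSorted() {
    std::vector<int> out;
    createRange().forEach([&out](int v) { out.push_back(v); });
    std::sort(out.begin(), out.end());
    return out;
}
```

The trade-off is that such a wrapper no longer composes with the view pipeline unless it also exposes begin() and end().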
I decided to implement Python's slice with C++ by myself. I wrote a function which accepts variadic slice_info<int> arguments and returns the slice selection of an n-dimensional array, ndvalarray<T, D>.
I compiled with Visual C++ 2015 and the code is like the following:
template<typename T>
struct slice_info {
T fr, to, step;
slice_info(T i) {
fr = 1; to = i; step = 1;
}
slice_info(std::initializer_list<T> il) {
std::vector<T> l(il);
if (l.size() == 1)
{
fr = 1; step = 1; to = l[0];
}
else if (l.size() == 2) {
fr = l[0]; step = 1; to = l[1];
}
else {
fr = l[0]; step = l[2]; to = l[1];
}
}
slice_info(const slice_info<T> & x) : fr(x.fr), to(x.to), step(x.step) {
}
};
template<typename T, int D>
void slice(const ndvalarray<T, D> & va, slice_info<int>&& s) {
// ndvalarray<T, D> is a n-dimensional array of type T
}
template<typename T, int D, typename ... Args>
void slice(const ndvalarray<T, D> & va, slice_info<int>&& s, Args&& ... args) {
slice(va, std::forward<Args>(args)...);
}
template<typename T, int D>
void slice_work_around(const ndvalarray<T, D> & va, const std::vector<slice_info<int>> & vs) {
}
int main(){
// Here is a 3-dimensional arr
ndvalarray<int, 3> arr;
// I want to get a slice copy of arr.
// For dimension 1, I select elements from position 1 to position 2.
// For dimension 2, I select elements from position 3 to position 6 stride 2
// For dimension 3, I select only the position 7 element
slice(arr, { 1, 2 }, { 3, 6, 2 }, {7}); // #1 error
slice(arr, { 1, 2 }, slice_info<int>{ 3, 6, 2 }, slice_info<int>{7}); // #2 yes
slice_work_around(arr, { {1, 2}, {3, 6, 2}, {7} }); // #3 yes
}
I thought #1 is an error because
braced-init-list is not an expression and therefore has no type
I tried #2 and #3, and they worked. However, I am still wondering whether there is a way to make #1 work. This question is a bit similar to c11-variable-number-of-arguments-same-specific-type, except that in my case the variable number of arguments are braced-init-lists.
slice_info<T> accepts a std::initializer_list<T>, in order to describe a slice selection of a dimension, like std::slice(). ndvalarray has more than one dimension, so I have to give a pack of slice_info<T>.
I chose to implement a constructor with a std::initializer_list<T> argument because then I can use a braced-init-list and don't need to spell out T or call constructors explicitly (as #2 does). Maybe it's a bad design, but I think it's simple to use.
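One possible way to get a call shaped like #1 to compile is to avoid deducing the parameter types at all: when a parameter has a fixed, non-template type, a braced-init-list can initialize it through the initializer_list constructor. The following is a sketch under assumptions, not a general variadic solution: SliceInfo is a simplified stand-in for slice_info<int>, and collectSlices is a hypothetical function with hand-written overloads for one to three dimensions.

```cpp
#include <initializer_list>
#include <vector>

// Simplified stand-in for slice_info<int> (assumption for this sketch).
struct SliceInfo {
    int fr = 1, to = 0, step = 1;
    SliceInfo(std::initializer_list<int> il) {
        const int* p = il.begin();
        if (il.size() == 1) { to = p[0]; }
        else if (il.size() == 2) { fr = p[0]; to = p[1]; }
        else if (il.size() >= 3) { fr = p[0]; to = p[1]; step = p[2]; }
    }
};

// The parameter types are fixed, so no template argument deduction is
// attempted on the braced-init-lists; each one converts to SliceInfo.
std::vector<SliceInfo> collectSlices(const SliceInfo& a) {
    return std::vector<SliceInfo>{a};
}
std::vector<SliceInfo> collectSlices(const SliceInfo& a, const SliceInfo& b) {
    return std::vector<SliceInfo>{a, b};
}
std::vector<SliceInfo> collectSlices(const SliceInfo& a, const SliceInfo& b,
                                     const SliceInfo& c) {
    return std::vector<SliceInfo>{a, b, c};
}
```

With these overloads, collectSlices({1, 2}, {3, 6, 2}, {7}) compiles; the price is one overload per supported number of dimensions.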
I've just started implementing my first medium scale program in D 2.0 after reading Andrei's book The D Programming Language. One of the first problems I came to was using the std.algorithm library with a built-in associative array. For example:
#!/usr/bin/env rdmd
import std.stdio;
import std.algorithm;
void main()
{
alias int[string] StringHashmap;
StringHashmap map1;
map1["one"] = 1;
map1["two"] = 2;
writefln("map1: %s", map1);
StringHashmap map2;
map2["two"] = 2;
map2["three"] = 3;
writefln("map2: %s", map2);
auto inter = setIntersection(map1, map2);
}
It seemed a simple enough thing to me, expecting that iterating over the inter would produce the single "two" entry. However, I get this compiler error:
./test.d(20): Error: template
std.algorithm.setIntersection(alias
less = "a < b",Rs...) if
(allSatisfy!(isInputRange,Rs)) does
not match any function template
declaration
./test.d(20): Error: template
std.algorithm.setIntersection(alias
less = "a < b",Rs...) if
(allSatisfy!(isInputRange,Rs)) cannot
deduce template function from argument
types !()(int[string],int[string])
I can see that the built-in associative array doesn't seem to provide any version of the range to use with the std algorithms.
Am I missing something? Doing something wrong? If not, is this a glaring omission? Is there some reason why this is properly unavailable?
Use this:
auto inter = setIntersection(map1.keys, map2.keys);
Note that std::map in C++ is a sorted data structure, while an associative array in D is unordered. std.algorithm.setIntersection assumes sorted ranges, so you can't use this function until you've converted each associative array into a sorted range, e.g.:
import std.typecons;
import std.array;
import std.algorithm;
import std.stdio;
auto byItemSorted(K,V)(V[K] dict) {
auto app = appender!(Tuple!(K,V)[])();
foreach (k, v; dict)
app.put(tuple(k, v));
auto res = app.data; // if there's byItem() we don't need this appender stuff.
sort(res);
return res;
}
auto dictIntersection(K,V)(V[K] map1, V[K] map2) {
return setIntersection(byItemSorted(map1), byItemSorted(map2));
}
void main () {
auto map1 = ["red":4, "blue":6],
map2 = ["blue":2, "green":1],
map3 = ["blue":6, "purple":8];
writeln("map1 & map2 = ", array(dictIntersection(map1, map2)));
writeln("map1 & map3 = ", array(dictIntersection(map1, map3)));
}
But this method is inefficient: it takes O(N log N) to sort a range.
A more efficient method is to write your own intersection routine, which takes only O(N):
import std.stdio;
struct DictIntersection(K,V) {
V[K] m1, m2;
this(V[K] map1, V[K] map2) { m1 = map1; m2 = map2; }
int opApply(int delegate(ref K, ref V) dg) {
int res = 0;
foreach (k, v; m1) {
V* p = k in m2;
if (p && v == *p) {
res = dg(k, v);
if (res)
break;
}
}
return res;
}
}
DictIntersection!(K,V) dictIntersection(K,V)(V[K] map1, V[K] map2) {
return typeof(return)(map1, map2);
}
void main () {
auto map1 = ["red":4, "blue":6],
map2 = ["blue":2, "green":1],
map3 = ["blue":6, "purple":8];
write("map1 & map2 = ");
foreach (k, v; dictIntersection(map1, map2)) write(k, "->", v, " ");
write("\nmap1 & map3 = ");
foreach (k, v; dictIntersection(map1, map3)) write(k, "->", v, " ");
}
However, because opApply doesn't count as an input range, none of the range algorithms will work with this. (I don't know how this can be made into an input range.)
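For comparison with the C++ side mentioned earlier, the same lookup-based idea can be sketched in C++ with std::unordered_map. This is an analogue rather than D code, written under assumptions: the key and value types are fixed to std::string and int for brevity, and the linear running time holds only on average, since it depends on hash lookups.

```cpp
#include <string>
#include <unordered_map>

// Keeps the entries of m1 whose key is also in m2 with an equal value.
// One pass over m1 plus one hash lookup per entry; no sorting needed.
std::unordered_map<std::string, int> dictIntersection(
    const std::unordered_map<std::string, int>& m1,
    const std::unordered_map<std::string, int>& m2) {
    std::unordered_map<std::string, int> out;
    for (const auto& kv : m1) {
        auto it = m2.find(kv.first);
        if (it != m2.end() && it->second == kv.second) out.insert(kv);
    }
    return out;
}
```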
You can get either the keys or the values from an associative array.
To get the intersection on the values, use
auto inter = setIntersection(map1.values, map2.values);
foreach (i; inter) {
writeln(i);
}
To get the intersection on the keys, use
auto inter = setIntersection(map1.keys, map2.keys);
foreach (i; inter) {
writeln(i);
}
I don't think you can get access to a range containing the key, value pairs like with a C++ std::map.
See http://www.digitalmars.com/d/2.0/hash-map.html