Related
Following the question in Heterogenous vectors of pointers. How to call functions.
I would like to know how to identify null pointers inside the vector of boost::variant.
Example code:
#include <boost/variant.hpp>
#include <iostream>
#include <vector>
template< typename T>
class A
{
public:
A(){}
~A(){}
void write();
private:
T data;
};
template< typename T>
void A<T>::write()
{
std::cout << data << std::endl;
}
class myVisitor
: public boost::static_visitor<>
{
public:
template< typename T>
void operator() (A<T>* a) const
{
a->write();
}
};
int main()
{
A<int> one;
A<double> two;
typedef boost::variant<A<int>*, A<double>* > registry;
std::vector<registry> v;
v.push_back(&one);
v.push_back(&two);
A<int>* tst = new A<int>;
for(auto x: v)
{
boost::apply_visitor(myVisitor(), x);
try {delete tst; tst = nullptr;}
catch (...){}
}
}
Since I am deleting the pointer I would hope that the last one will give me an error or something. How can I check whether an entry in the vector is pointing to nullptr?
Note: this partly ignores the X/Y of this question, based on the tandem question (Heterogenous vectors of pointers. How to call functions)
What you seem to be after is polymorphic collections, but not with a virtual type hierarchy.
This is known as type erasure, and Boost Type Erasure is conveniently wrapped for exactly this use case with Boost PolyCollection.
The type erased variation would probably look like any_collection:
Live On Coliru
#include <boost/variant.hpp>
#include <cmath>
#include <iostream>
#include <vector>
#include <boost/poly_collection/any_collection.hpp>
#include <boost/type_erasure/member.hpp>
namespace pc = boost::poly_collection;
BOOST_TYPE_ERASURE_MEMBER(has_write, write)
using writable = has_write<void()>;
template <typename T> class A {
public:
A(T value = 0) : data(value) {}
// A() = default; // rule of zero
//~A() = default;
void write() const { std::cout << data << std::endl; }
private:
T data/* = 0*/;
};
int main()
{
pc::any_collection<writable> registry;
A<int> one(314);
A<double> two(M_PI);
registry.insert(one);
registry.insert(two);
for (auto& w : registry) {
w.write();
}
}
Prints
3.14159
314
Note that insertion order is preserved only within each type, because iteration is done type-by-type; that is why the output above is not in insertion order. This is also what makes PolyCollection much more efficient than "regular" containers that do not optimize allocation sizes or use pointers.
BONUS: Natural printing operator<<
Using classical dynamic polymorphism, this would not work without adding virtual methods, but with Boost TypeErasure ostreamable is a ready-made concept:
Live On Coliru
#include <boost/variant.hpp>
#include <cmath>
#include <iostream>
#include <vector>
#include <boost/poly_collection/any_collection.hpp>
#include <boost/type_erasure/operators.hpp>
namespace pc = boost::poly_collection;
using writable = boost::type_erasure::ostreamable<>;
template <typename T> class A {
public:
A(T value = 0) : data(value) {}
// A() = default; // rule of zero
//~A() = default;
private:
friend std::ostream& operator<<(std::ostream& os, A const& a) {
return os << a.data;
}
T data/* = 0*/;
};
int main()
{
pc::any_collection<writable> registry;
A<int> one(314);
A<double> two(M_PI);
registry.insert(one);
registry.insert(two);
for (auto& w : registry) {
std::cout << w << "\n";
}
}
Printing the same as before.
UPDATE
To the comment:
I want to create n A<someType> variables (these are big objects). All of these variables have a write function to write something to a file.
My idea is to collect all the pointers of these variables and at the end loop through the vector to call each write function. Now, it might happen that I want to allocate memory and delete a A<someType> variable. If this happens it should not execute the write function.
This sounds like one of the rare occasions where shared_ptr makes sense, because it allows you to observe the object's lifetime using weak_ptr.
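Before the larger demo, here is the idea in isolation. This is only a minimal sketch: Writable, the simplified A<T> and write_all are names I made up for illustration, and the registry is a plain std::vector of std::weak_ptr. Each entry is locked before use, so objects that have already been destroyed are simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the question's A<T>; only write() matters here.
struct Writable {
    virtual ~Writable() = default;
    virtual void write(std::ostream& os) const = 0;
};

template <typename T>
struct A : Writable {
    explicit A(T value) : data(value) {}
    void write(std::ostream& os) const override { os << data << "\n"; }
    T data;
};

// Calls write() on every object that is still alive and
// returns how many objects were written.
inline std::size_t write_all(std::vector<std::weak_ptr<Writable>> const& registry,
                             std::ostream& os) {
    std::size_t written = 0;
    for (auto const& ref : registry) {
        if (auto alive = ref.lock()) { // empty shared_ptr if the object is gone
            alive->write(os);
            ++written;
        }
    }
    return written;
}
```

The owners hold std::shared_ptr<A<T>> and the registry only observes them. After an owner calls reset(), the matching entry reports expired() and write_all no longer touches it.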
Object Graph Imagined...
Let's invent a node type that can participate in a pretty large object graph, such that you would keep an "index" of pointers to some of its nodes. For this demonstration, I'll make it a tree-structured graph, and we're going to keep References to the leaf nodes:
using Object = std::shared_ptr<struct INode>;
using Reference = std::weak_ptr<struct INode>;
Now, let's add identification to the Node base so we have an arbitrary way to identify nodes to delete (e.g. all nodes with odd ids). In addition, any node can have child nodes, so let's put that in the base node as well:
struct INode {
virtual void write(std::ostream& os) const = 0;
std::vector<Object> children;
size_t id() const { return _id; }
private:
size_t _id = s_idgen++;
};
Now we need some concrete derived node types:
template <typename> struct Node : INode {
void write(std::ostream& os) const override;
};
using Root = Node<struct root_tag>;
using Banana = Node<struct banana_tag>;
using Pear = Node<struct pear_tag>;
using Bicycle = Node<struct bicycle_tag>;
// etc
Yeah. Imagination is not my strong suit ¯\_(ツ)_/¯
Generate Random Data
// generating demo data
#include <random>
#include <functional>
#include <array>
static std::mt19937 s_prng{std::random_device{}()};
static std::uniform_int_distribution<size_t> s_num_children(0, 3);
Object generate_object_graph(Object node, unsigned max_depth = 10) {
std::array<std::function<Object()>, 3> factories = {
[] { return std::make_shared<Banana>(); },
[] { return std::make_shared<Pear>(); },
[] { return std::make_shared<Bicycle>(); },
};
for(auto n = s_num_children(s_prng); max_depth && n--;) {
auto pick = factories.at(s_prng() % factories.size());
node->children.push_back(generate_object_graph(pick(), max_depth - 1));
}
return node;
}
Nothing fancy. Just a randomly generated tree with a max_depth and random distribution of node types.
write to Pretty-Print
Let's add some logic to display any object graph with indentation:
// for demo output
#include <boost/core/demangle.hpp>
template <typename Tag> void Node<Tag>::write(std::ostream& os) const {
os << boost::core::demangle(typeid(Tag*).name()) << "(id:" << id() << ") {";
if (not children.empty()) {
for (auto& ch : children) {
ch->write(os << linebreak << "- " << indent);
os << unindent;
}
os << linebreak;
}
os << "}";
}
To keep track of the indentation level I'll define these indent/unindent
manipulators modifying some custom state inside the stream object:
static auto s_indent = std::ios::xalloc();
std::ostream& indent(std::ostream& os) { return os.iword(s_indent) += 3, os; }
std::ostream& unindent(std::ostream& os) { return os.iword(s_indent) -= 3, os; }
std::ostream& linebreak(std::ostream& os) {
return os << "\n" << std::setw(os.iword(s_indent)) << "";
}
That should do.
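If the iword/xalloc trick is unfamiliar, here it is on its own. This is a sketch with made-up names (indent_slot, deeper, shallower, newline) and a step of 2 instead of 3. The point is that xalloc hands out one slot index, while each stream keeps its own value in that slot.

```cpp
#include <cassert>
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>

// One slot index shared by all streams; each stream stores its own value there.
inline int indent_slot() {
    static const int slot = std::ios_base::xalloc();
    return slot;
}

inline std::ostream& deeper(std::ostream& os)    { os.iword(indent_slot()) += 2; return os; }
inline std::ostream& shallower(std::ostream& os) { os.iword(indent_slot()) -= 2; return os; }

// Starts a new line padded to the stream's current indentation level.
inline std::ostream& newline(std::ostream& os) {
    return os << "\n" << std::setw(static_cast<int>(os.iword(indent_slot()))) << "";
}
```

Because the state lives inside the stream object, indenting one stream leaves every other stream untouched, and no global indentation counter is needed.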
Getting Leaf Nodes
Leaf nodes are the nodes without any children.
This is a depth-first tree visitor taking any output iterator:
template <typename Out>
Out get_leaf_nodes(Object const& tree, Out out) {
if (tree) {
if (tree->children.empty()) {
*out++ = tree; // that's a leaf node!
} else {
for (auto& ch : tree->children) {
out = get_leaf_nodes(ch, out); // keep the advanced iterator
}
}
}
return out;
}
Removing some nodes:
Yet another depth-first visitor:
template <typename Pred>
size_t remove_nodes_if(Object tree, Pred predicate)
{
size_t n = 0;
if (!tree)
return n;
auto& c = tree->children;
// depth first
for (auto& child : c)
n += remove_nodes_if(child, predicate);
auto e = std::remove_if(begin(c), end(c), predicate);
n += std::distance(e, end(c));
c.erase(e, end(c));
return n;
}
DEMO TIME
Tying it all together, we can print a randomly generated graph:
int main()
{
auto root = generate_object_graph(std::make_shared<Root>());
root->write(std::cout);
This puts all its leaf node References in a container:
std::list<Reference> leafs;
get_leaf_nodes(root, back_inserter(leafs));
Which we can print using their write() methods:
std::cout << "\nLeafs: " << leafs.size();
for (Reference& ref : leafs)
if (Object alive = ref.lock())
alive->write(std::cout << " ");
Of course all the leafs are still alive. But we can change that! We will remove one in 5 nodes by id:
auto _2mod5 = [](Object const& node) { return (2 == node->id() % 5); };
std::cout << "\nRemoved " << remove_nodes_if(root, _2mod5) << " 2mod5 nodes from graph\n";
std::cout << "\n(Stale?) Leafs: " << leafs.size();
The reported number of leaf nodes would still seem the same. That's... not
what you wanted. Here's where your question comes in: how do we detect the
nodes that were deleted?
leafs.remove_if(std::mem_fn(&Reference::expired));
std::cout << "\nLive leafs: " << leafs.size();
Now the count will accurately reflect the number of leaf nodes remaining.
Live On Coliru
#include <algorithm>
#include <iterator>
#include <memory>
#include <vector>
#include <ostream>
using Object = std::shared_ptr<struct INode>;
using Reference = std::weak_ptr<struct INode>;
static size_t s_idgen = 0;
struct INode {
virtual void write(std::ostream& os) const = 0;
std::vector<Object> children;
size_t id() const { return _id; }
private:
size_t _id = s_idgen++;
};
template <typename> struct Node : INode {
void write(std::ostream& os) const override;
};
using Root = Node<struct root_tag>;
using Banana = Node<struct banana_tag>;
using Pear = Node<struct pear_tag>;
using Bicycle = Node<struct bicycle_tag>;
// etc
// for demo output
#include <boost/core/demangle.hpp>
#include <iostream>
#include <iomanip>
static auto s_indent = std::ios::xalloc();
std::ostream& indent(std::ostream& os) { return os.iword(s_indent) += 3, os; }
std::ostream& unindent(std::ostream& os) { return os.iword(s_indent) -= 3, os; }
std::ostream& linebreak(std::ostream& os) {
return os << "\n" << std::setw(os.iword(s_indent)) << "";
}
template <typename Tag> void Node<Tag>::write(std::ostream& os) const {
os << boost::core::demangle(typeid(Tag*).name()) << "(id:" << id() << ") {";
if (not children.empty()) {
for (auto& ch : children) {
ch->write(os << linebreak << "- " << indent);
os << unindent;
}
os << linebreak;
}
os << "}";
}
// generating demo data
#include <random>
#include <functional>
#include <array>
static std::mt19937 s_prng{std::random_device{}()};
static std::uniform_int_distribution<size_t> s_num_children(0, 3);
Object generate_object_graph(Object node, unsigned max_depth = 10) {
std::array<std::function<Object()>, 3> factories = {
[] { return std::make_shared<Banana>(); },
[] { return std::make_shared<Pear>(); },
[] { return std::make_shared<Bicycle>(); },
};
for(auto n = s_num_children(s_prng); max_depth && n--;) {
auto pick = factories.at(s_prng() % factories.size());
node->children.push_back(generate_object_graph(pick(), max_depth - 1));
}
return node;
}
template <typename Out>
Out get_leaf_nodes(Object const& tree, Out out) {
if (tree) {
if (tree->children.empty()) {
*out++ = tree;
} else {
for (auto& ch : tree->children) {
out = get_leaf_nodes(ch, out); // keep the advanced iterator
}
}
}
return out;
}
template <typename Pred>
size_t remove_nodes_if(Object tree, Pred predicate)
{
size_t n = 0;
if (!tree)
return n;
auto& c = tree->children;
// depth first
for (auto& child : c)
n += remove_nodes_if(child, predicate);
auto e = std::remove_if(begin(c), end(c), predicate);
n += std::distance(e, end(c));
c.erase(e, end(c));
return n;
}
#include <list>
int main()
{
auto root = generate_object_graph(std::make_shared<Root>());
root->write(std::cout);
std::list<Reference> leafs;
get_leaf_nodes(root, back_inserter(leafs));
std::cout << "\n------------"
<< "\nLeafs: " << leafs.size();
for (Reference& ref : leafs)
if (Object alive = ref.lock())
alive->write(std::cout << " ");
auto _2mod5 = [](Object const& node) { return (2 == node->id() % 5); };
std::cout << "\nRemoved " << remove_nodes_if(root, _2mod5) << " 2mod5 nodes from graph\n";
std::cout << "\n(Stale?) Leafs: " << leafs.size();
// some of them are not alive, see which are gone ("detecting the null pointers")
leafs.remove_if(std::mem_fn(&Reference::expired));
std::cout << "\nLive leafs: " << leafs.size();
}
Prints e.g.
root_tag*(id:0) {
- bicycle_tag*(id:1) {}
- bicycle_tag*(id:2) {
- pear_tag*(id:3) {}
}
- bicycle_tag*(id:4) {
- bicycle_tag*(id:5) {}
- bicycle_tag*(id:6) {}
}
}
------------
Leafs: 4 bicycle_tag*(id:1) {} pear_tag*(id:3) {} bicycle_tag*(id:5) {} bicycle_tag*(id:6) {}
Removed 1 2mod5 nodes from graph
(Stale?) Leafs: 4
Live leafs: 3
Or see the COLIRU link for a much larger sample.
Let's say I have a movable but non-copyable object and a Boost multi-index container with a random_access index. I need to move my object out of the front of the container, but I cannot find any method in the documentation that would give me an rvalue/lvalue reference. I can only see front(), which gives me a constant reference, and pop_front(), which erases the element but does not return anything. So is there a way to move an element out of a Boost multi-index container?
Adding to @sehe's answer, the following shows how to modify the code in case your movable type is not default constructible:
Edited: changed code to properly deal with destruction of *extracted.
Edited: added alternative with std::unique_ptr.
Edited: added a second alternative by sehe.
Live On Coliru
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <iostream>
#include <new>
#include <stdexcept>
#include <type_traits>
#include <utility>
struct moveonly {
int x;
moveonly(int x) noexcept : x(x) {}
moveonly(moveonly&& o) noexcept : x(o.x) { o = {-1}; }
moveonly& operator=(moveonly o) noexcept { using std::swap; swap(x, o.x); return *this; }
};
static_assert(not std::is_copy_constructible<moveonly>{}, "moveonly");
namespace bmi = boost::multi_index;
using Table = bmi::multi_index_container<moveonly,
bmi::indexed_by<
bmi::random_access<bmi::tag<struct _ra> >
> >;
template <typename Container>
void dump(std::ostream& os, Container const& c) {
for (auto& r: c) os << r.x << " ";
os << "\n";
}
moveonly pop_front(Table& table) {
std::aligned_storage<sizeof(moveonly), alignof(moveonly)>::type buffer;
moveonly* extracted = reinterpret_cast<moveonly*>(&buffer);
auto it = table.begin();
if (it == table.end())
throw std::logic_error("pop_front");
if (table.modify(it, [&](moveonly& v) { new (extracted) moveonly{std::move(v)}; })) {
table.erase(it);
}
try {
moveonly ret = std::move(*extracted);
extracted->~moveonly();
return ret;
} catch(...) {
extracted->~moveonly();
throw;
}
}
int main() {
Table table;
table.push_back({1});
table.push_back({2});
table.push_back({3});
dump(std::cout << "table before: ", table);
std::cout << "Extracted: " << pop_front(table).x << "\n";
dump(std::cout << "table after: ", table);
}
Same thing using std::unique_ptr for cleanup:
Live On Coliru
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <iostream>
#include <memory>
#include <new>
#include <stdexcept>
#include <type_traits>
#include <utility>
struct moveonly {
int x;
moveonly(int x) noexcept : x(x) {}
moveonly(moveonly&& o) noexcept : x(o.x) { o = {-1}; }
moveonly& operator=(moveonly o) noexcept { using std::swap; swap(x, o.x); return *this; }
};
static_assert(not std::is_copy_constructible<moveonly>{}, "moveonly");
namespace bmi = boost::multi_index;
using Table = bmi::multi_index_container<moveonly,
bmi::indexed_by<
bmi::random_access<bmi::tag<struct _ra> >
> >;
template <typename Container>
void dump(std::ostream& os, Container const& c) {
for (auto& r: c) os << r.x << " ";
os << "\n";
}
moveonly pop_front(Table& table) {
std::aligned_storage<sizeof(moveonly), alignof(moveonly)>::type buffer;
moveonly* extracted = reinterpret_cast<moveonly*>(&buffer);
auto it = table.begin();
if (it == table.end())
throw std::logic_error("pop_front");
if (table.modify(it, [&](moveonly& v) { new (extracted) moveonly{std::move(v)}; })) {
table.erase(it);
}
std::unique_ptr<moveonly,void(*)(moveonly*)> ptr = {
extracted,
[](moveonly* p){ p->~moveonly(); }
};
return std::move(*extracted);
}
int main() {
Table table;
table.push_back({1});
table.push_back({2});
table.push_back({3});
dump(std::cout << "table before: ", table);
std::cout << "Extracted: " << pop_front(table).x << "\n";
dump(std::cout << "table after: ", table);
}
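The cleanup idiom from the listing above, shown without Boost. This is a sketch only: Token, DestroyOnly and relay are hypothetical names, and the guard is created in the same statement as the placement-new so the buffered object is never left unowned.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical move-only type without a default constructor.
struct Token {
    int x;
    explicit Token(int x) noexcept : x(x) {}
    Token(Token&& o) noexcept : x(o.x) { o.x = -1; }
    Token(Token const&) = delete;
};

// Deleter that runs the destructor only; the storage is not heap-allocated.
struct DestroyOnly {
    template <typename T> void operator()(T* p) const noexcept { p->~T(); }
};

// Moves `source` through raw stack storage and returns it by value.
inline Token relay(Token& source) {
    alignas(Token) unsigned char buffer[sizeof(Token)];
    std::unique_ptr<Token, DestroyOnly> guard(new (buffer) Token(std::move(source)));
    return std::move(*guard); // guard destroys the buffered object on scope exit
}
```

The unique_ptr never frees memory here. It only guarantees that the destructor of the object living in the buffer runs on every exit path, which is what the try/catch version did by hand.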
Sehe provides yet another alternative based on boost::optional which is the most elegant of all:
Live On Coliru
#include <boost/multi_index_container.hpp>
#include <boost/optional.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <type_traits>
#include <utility>
struct moveonly {
int x;
moveonly(int x) noexcept : x(x) {}
moveonly(moveonly&& o) noexcept : x(o.x) { o = {-1}; }
moveonly& operator=(moveonly o) noexcept { using std::swap; swap(x, o.x); return *this; }
};
static_assert(not std::is_copy_constructible<moveonly>{}, "moveonly");
namespace bmi = boost::multi_index;
using Table = bmi::multi_index_container<moveonly,
bmi::indexed_by<
bmi::random_access<bmi::tag<struct _ra> >
> >;
template <typename Container>
void dump(std::ostream& os, Container const& c) {
for (auto& r: c) os << r.x << " ";
os << "\n";
}
moveonly pop_front(Table& table) {
boost::optional<moveonly> extracted;
auto it = table.begin();
if (it == table.end())
throw std::logic_error("pop_front");
if (table.modify(it, [&](moveonly& v) { extracted = std::move(v); })) {
table.erase(it);
}
return std::move(*extracted);
}
int main() {
Table table;
table.push_back({1});
table.push_back({2});
table.push_back({3});
dump(std::cout << "table before: ", table);
std::cout << "Extracted: " << pop_front(table).x << "\n";
dump(std::cout << "table after: ", table);
}
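For readers without Boost at hand, the same "empty slot, filled inside a callback" idea can be sketched with std::optional (C++17). Everything here is an assumption made for illustration: Token is a hypothetical move-only type, a std::deque stands in for the table, and modify_front only imitates the shape of Table::modify.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical move-only type without a default constructor.
struct Token {
    int x;
    explicit Token(int x) noexcept : x(x) {}
    Token(Token&& o) noexcept : x(o.x) { o.x = -1; }
    Token(Token const&) = delete;
};

// Hypothetical stand-in for Table::modify: hands the front element to a callback.
template <typename F>
void modify_front(std::deque<Token>& queue, F&& f) { f(queue.front()); }

inline Token pop_front(std::deque<Token>& queue) {
    if (queue.empty())
        throw std::logic_error("pop_front");
    std::optional<Token> extracted; // empty for now, so no default constructor is needed
    modify_front(queue, [&](Token& v) { extracted.emplace(std::move(v)); });
    queue.pop_front();
    return std::move(*extracted);
}
```

The optional owns the storage and destroys the contained object by itself, so neither the aligned buffer nor a manual destructor call is required.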
Non-const element operations are not supported because they could leave elements in a state which would break invariants placed on them by the various indexes.
The closest thing you can do is using modify:
moveonly pop_front(Table& table) {
moveonly extracted;
auto it = table.begin();
if (it == table.end())
throw std::logic_error("pop_front");
if (table.modify(it, [&](moveonly& v) { extracted = std::move(v); })) {
table.erase(it);
}
return extracted;
}
Note that modify does incur the cost of checking all indexes, and may fail. Fortunately, if it does fail, the effect is that the element is erased:
Effects: Calls mod(e) where e is the element pointed to by position and rearranges *position into all the indices of the multi_index_container. Rearrangement on sequenced indices does not change the position of the element with respect to the index; rearrangement on other indices may or might not succeed. If the rearrangement fails, the element is erased.
Postconditions: Validity of position is preserved if the operation succeeds.
And here's a live demo:
Live On Coliru
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <iostream>
#include <stdexcept>
#include <type_traits>
#include <utility>
struct moveonly {
int x;
moveonly(int x = -1) noexcept : x(x) {}
moveonly(moveonly&& o) noexcept : x(o.x) { o = {}; }
moveonly& operator=(moveonly o) noexcept { using std::swap; swap(x, o.x); return *this; }
};
static_assert(not std::is_copy_constructible<moveonly>{}, "moveonly");
namespace bmi = boost::multi_index;
using Table = bmi::multi_index_container<moveonly,
bmi::indexed_by<
bmi::random_access<bmi::tag<struct _ra> >
> >;
template <typename Container>
void dump(std::ostream& os, Container const& c) {
for (auto& r: c) os << r.x << " ";
os << "\n";
}
moveonly pop_front(Table& table) {
moveonly extracted;
auto it = table.begin();
if (it == table.end())
throw std::logic_error("pop_front");
if (table.modify(it, [&](moveonly& v) { extracted = std::move(v); })) {
table.erase(it);
}
return extracted;
}
int main() {
Table table;
table.push_back({1});
table.push_back({2});
table.push_back({3});
dump(std::cout << "table before: ", table);
std::cout << "Extracted: " << pop_front(table).x << "\n";
dump(std::cout << "table after: ", table);
}
Which prints:
table before: 1 2 3
Extracted: 1
table after: 2 3
Is there a way to use a std::ostream_iterator (or similar) such that the delimiter isn't placed for the last element?
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <string>
using namespace std;
int main(int argc, char *argv[]) {
std::vector<int> ints = {10,20,30,40,50,60,70,80,90};
std::copy(ints.begin(),ints.end(),std::ostream_iterator<int>(std::cout, ","));
}
Will print
10,20,30,40,50,60,70,80,90,
I'm trying to avoid the trailing delimiter. I want to print
10,20,30,40,50,60,70,80,90
Sure, you could use a loop:
for(auto it = ints.begin(); it != ints.end(); it++){
std::cout << *it;
if((it + 1) != ints.end()){
std::cout << ",";
}
}
But with C++11 range-based loops it is cumbersome to track the position:
int count = ints.size();
for(const auto& i : ints){
std::cout << i;
if(--count != 0){
std::cout << ",";
}
}
I'm open to using Boost. I looked into boost::algorithm::join() but needed to make a copy of the ints to strings so it was a two-liner.
std::vector<std::string> strs;
boost::copy(ints | boost::adaptors::transformed([](const int&i){return boost::lexical_cast<std::string>(i);}),std::back_inserter(strs));
std::cout << boost::algorithm::join(strs,",");
Ideally I'd just like to use a std::algorithm and not have the delimiter on the last item in the range.
Thanks!
@Cubbi pointed out in a comment that this is exactly what infix_iterator does
// infix_iterator.h
//
// Lifted from Jerry Coffin's prefix_ostream_iterator
#if !defined(INFIX_ITERATOR_H_)
#define INFIX_ITERATOR_H_
#include <ostream>
#include <iterator>
template <class T,
class charT=char,
class traits=std::char_traits<charT> >
class infix_ostream_iterator :
public std::iterator<std::output_iterator_tag,void,void,void,void>
{
std::basic_ostream<charT,traits> *os;
charT const* delimiter;
bool first_elem;
public:
typedef charT char_type;
typedef traits traits_type;
typedef std::basic_ostream<charT,traits> ostream_type;
infix_ostream_iterator(ostream_type& s)
: os(&s),delimiter(0), first_elem(true)
{}
infix_ostream_iterator(ostream_type& s, charT const *d)
: os(&s),delimiter(d), first_elem(true)
{}
infix_ostream_iterator<T,charT,traits>& operator=(T const &item)
{
// Here's the only real change from ostream_iterator:
// Normally, the '*os << item;' would come before the 'if'.
if (!first_elem && delimiter != 0)
*os << delimiter;
*os << item;
first_elem = false;
return *this;
}
infix_ostream_iterator<T,charT,traits> &operator*() {
return *this;
}
infix_ostream_iterator<T,charT,traits> &operator++() {
return *this;
}
infix_ostream_iterator<T,charT,traits> &operator++(int) {
return *this;
}
};
#endif
#include <vector>
#include <algorithm>
#include <string>
#include <iostream>
using namespace std;
int main(int argc, char *argv[]) {
std::vector<int> ints = {10,20,30,40,50,60,70,80,90};
std::copy(ints.begin(),ints.end(),infix_ostream_iterator<int>(std::cout,","));
}
Prints:
10,20,30,40,50,60,70,80,90
std::copy could be implemented as:
template<class InputIterator, class OutputIterator>
OutputIterator copy (InputIterator first, InputIterator last, OutputIterator result)
{
while (first!=last) {
*result = *first;
++result; ++first;
}
return result;
}
The assignment to the ostream_iterator (output iterator) could be implemented as:
ostream_iterator<T,charT,traits>& operator= (const T& value) {
*out_stream << value;
if (delim!=0) *out_stream << delim;
return *this;
}
So the delimiter will be appended on every assignment to the output iterator. To avoid the delimiter being appended to the last vector element, the last element should be assigned to an output iterator without delimiter, for example:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main() {
std::vector<int> ints = {10,20,30,40,50,60,70,80,90};
std::copy(ints.begin(), ints.end()-1, std::ostream_iterator<int>(std::cout, ","));
std::copy(ints.end()-1, ints.end(), std::ostream_iterator<int>(std::cout));
std::cout << std::endl;
return 0;
}
Results in:
10,20,30,40,50,60,70,80,90
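One caveat with the two-copy approach: ints.end()-1 is undefined for an empty vector. A guarded variant could look like the sketch below; copy_joined is a hypothetical helper name, not something from the standard library, and it needs bidirectional iterators because it steps back from the end.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

// Writes the elements separated by `delim`, with no trailing delimiter.
template <typename BidirIt>
void copy_joined(BidirIt first, BidirIt last, std::ostream& os, const char* delim) {
    if (first == last)
        return; // nothing to print, and std::prev(last) would be invalid
    using T = typename std::iterator_traits<BidirIt>::value_type;
    BidirIt back = std::prev(last);
    std::copy(first, back, std::ostream_iterator<T>(os, delim));
    os << *back; // last element, written without a delimiter
}
```

Calling copy_joined(ints.begin(), ints.end(), std::cout, ",") gives the same output as above for a non-empty vector and prints nothing for an empty one.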
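This would be easier, though I don't know whether it is what you want. It writes a backspace character (char 8) after the final delimiter, so it only has a visual effect on a terminal; the trailing comma is still part of the output: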
#include<iostream>
#include<algorithm>
#include<vector>
#include<iterator>
int main()
{
std::vector<int> ints={10,20,30,40,50,60,70,80,90};
std::copy(ints.begin(),ints.end(),std::ostream_iterator<int> (std::cout,","));
std::cout<<(char)8;
}
Use the erase method of std::string:
#include <algorithm>
#include <cstring>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
string join (const vector< vector<int> >& data, const char* separator){
    stringstream rowStream;
    vector<string> rowVector;
    for (size_t i = 0; i < data.size(); i++ ){
        copy(data[i].begin(), data[i].end(), ostream_iterator<int>(rowStream, " "));
        string row = rowStream.str();
        if (!row.empty()) row.erase(row.length()-1); // drop the trailing space
        rowVector.push_back(row);
        rowStream.str("");
        rowStream.clear();
    }
    copy(rowVector.begin(), rowVector.end(), ostream_iterator<string>(rowStream, separator));
    string joined = rowStream.str();
    // drop the trailing separator, whatever its length
    if (!joined.empty()) joined.erase(joined.length()-strlen(separator));
    return joined;
}
Are there any C++ transformations which are similar to itertools.groupby()?
Of course I could easily write my own, but I'd prefer to leverage the idiomatic behavior or compose one from the features provided by the STL or boost.
#include <cstdlib>
#include <map>
#include <algorithm>
#include <string>
#include <vector>
struct foo
{
int x;
std::string y;
float z;
};
bool lt_by_x(const foo &a, const foo &b)
{
return a.x < b.x;
}
void list_by_x(const std::vector<foo> &foos, std::map<int, std::vector<foo> > &foos_by_x)
{
/* ideas..? */
}
int main(int argc, const char *argv[])
{
std::vector<foo> foos;
std::map<int, std::vector<foo> > foos_by_x;
std::vector<foo> sorted_foos;
std::sort(foos.begin(), foos.end(), lt_by_x);
list_by_x(sorted_foos, foos_by_x);
return EXIT_SUCCESS;
}
This doesn't really answer your question, but for the fun of it, I implemented a group_by iterator. Maybe someone will find it useful:
#include <assert.h>
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>
using std::cout;
using std::cerr;
using std::multiset;
using std::ostringstream;
using std::pair;
using std::vector;
struct Foo
{
int x;
std::string y;
float z;
};
struct FooX {
typedef int value_type;
value_type operator()(const Foo &f) const { return f.x; }
};
template <typename Iterator,typename KeyFunc>
struct GroupBy {
typedef typename KeyFunc::value_type KeyValue;
struct Range {
Range(Iterator begin,Iterator end)
: iter_pair(begin,end)
{
}
Iterator begin() const { return iter_pair.first; }
Iterator end() const { return iter_pair.second; }
private:
pair<Iterator,Iterator> iter_pair;
};
struct Group {
KeyValue value;
Range range;
Group(KeyValue value,Range range)
: value(value), range(range)
{
}
};
struct GroupIterator {
typedef Group value_type;
GroupIterator(Iterator iter,Iterator end,KeyFunc key_func)
: range_begin(iter), range_end(iter), end(end), key_func(key_func)
{
advance_range_end();
}
bool operator==(const GroupIterator &that) const
{
return range_begin==that.range_begin;
}
bool operator!=(const GroupIterator &that) const
{
return !(*this==that);
}
GroupIterator operator++()
{
range_begin = range_end;
advance_range_end();
return *this;
}
value_type operator*() const
{
return value_type(key_func(*range_begin),Range(range_begin,range_end));
}
private:
void advance_range_end()
{
if (range_end!=end) {
typename KeyFunc::value_type value = key_func(*range_end++);
while (range_end!=end && key_func(*range_end)==value) {
++range_end;
}
}
}
Iterator range_begin;
Iterator range_end;
Iterator end;
KeyFunc key_func;
};
GroupBy(Iterator begin_iter,Iterator end_iter,KeyFunc key_func)
: begin_iter(begin_iter),
end_iter(end_iter),
key_func(key_func)
{
}
GroupIterator begin() { return GroupIterator(begin_iter,end_iter,key_func); }
GroupIterator end() { return GroupIterator(end_iter,end_iter,key_func); }
private:
Iterator begin_iter;
Iterator end_iter;
KeyFunc key_func;
};
template <typename Iterator,typename KeyFunc>
inline GroupBy<Iterator,KeyFunc>
group_by(
Iterator begin,
Iterator end,
const KeyFunc &key_func = KeyFunc()
)
{
return GroupBy<Iterator,KeyFunc>(begin,end,key_func);
}
static void test()
{
vector<Foo> foos;
foos.push_back({5,"bill",2.1});
foos.push_back({5,"rick",3.7});
foos.push_back({3,"tom",2.5});
foos.push_back({7,"joe",3.4});
foos.push_back({5,"bob",7.2});
ostringstream out;
for (auto group : group_by(foos.begin(),foos.end(),FooX())) {
out << group.value << ":";
for (auto elem : group.range) {
out << " " << elem.y;
}
out << "\n";
}
assert(out.str()==
"5: bill rick\n"
"3: tom\n"
"7: joe\n"
"5: bob\n"
);
}
int main(int argc,char **argv)
{
test();
return 0;
}
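If a full iterator type is more than you need, the same "group consecutive equal keys" behaviour can be sketched with std::find_if. consecutive_groups is a hypothetical name; it returns iterator pairs into the original range rather than copies, and like the iterator above it does not sort, so equal keys that are not adjacent end up in separate groups.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

struct Foo {
    int x;
    std::string y;
    float z;
};

// Splits [first, last) into runs of consecutive elements with equal keys.
template <typename It, typename KeyFunc>
std::vector<std::pair<It, It>> consecutive_groups(It first, It last, KeyFunc key) {
    std::vector<std::pair<It, It>> groups;
    while (first != last) {
        auto const k = key(*first);
        // the run ends at the first element whose key differs
        It run_end = std::find_if(std::next(first), last,
                                  [&](auto const& e) { return !(key(e) == k); });
        groups.emplace_back(first, run_end);
        first = run_end;
    }
    return groups;
}
```

Each returned pair can be walked with an ordinary iterator loop, and key(*pair.first) recovers the group's key.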
Eric Niebler's ranges library (range-v3) provides a group_by view.
According to the docs it is a header-only library and can be included easily.
It is the basis for the ranges work proposed for standard C++, but can already be used with a recent C++11 compiler.
minimal working example:
#include <map>
#include <vector>
#include <range/v3/all.hpp>
using namespace std;
using namespace ranges;
int main(int argc, char **argv) {
vector<int> l { 0,1,2,3,6,5,4,7,8,9 };
ranges::v3::sort(l);
auto x = l | view::group_by([](int x, int y) { return x / 5 == y / 5; });
map<int, vector<int>> res;
auto i = x.begin();
auto e = x.end();
for (;i != e; ++i) {
auto first = *((*i).begin());
res[first / 5] = to_vector(*i);
}
// res = { 0 : [0,1,2,3,4], 1: [5,6,7,8,9] }
}
(I compiled this with clang 3.9.0 and --std=c++11)
I recently discovered cppitertools.
It fulfills this need exactly as described.
https://github.com/ryanhaining/cppitertools#groupby
What is the point of bloating standard C++ library with an algorithm that is one line of code?
for (const auto & foo : foos) foos_by_x[foo.x].push_back(foo);
Also, take a look at std::multimap, it might be just what you need.
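A rough sketch of the std::multimap route, in case it helps. index_by_x and names_for are hypothetical helper names; the only library calls involved are emplace and equal_range.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct foo {
    int x;
    std::string y;
    float z;
};

// Builds an index from x to every foo carrying that x.
inline std::multimap<int, foo> index_by_x(std::vector<foo> const& foos) {
    std::multimap<int, foo> by_x;
    for (auto const& f : foos)
        by_x.emplace(f.x, f);
    return by_x;
}

// Collects the y fields of one group via equal_range.
inline std::vector<std::string> names_for(std::multimap<int, foo> const& by_x, int x) {
    std::vector<std::string> names;
    auto range = by_x.equal_range(x);
    for (auto it = range.first; it != range.second; ++it)
        names.push_back(it->second.y);
    return names;
}
```

Here a group is just the iterator range returned by equal_range, so no per-key vector is allocated. Elements with equal keys stay in insertion order.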
UPDATE:
The one-liner I have provided is not well-optimized for the case when your vector is already sorted. A number of map lookups can be avoided if we remember the iterator of the previously inserted object and its key, and do a lookup only when the key changes. For example:
#include <map>
#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
struct foo {
int x;
std::string y;
float z;
};
class optimized_inserter {
public:
typedef std::map<int, std::vector<foo> > map_type;
optimized_inserter(map_type & map) : map(&map), it(map.end()) {}
void operator()(const foo & obj) {
typedef map_type::value_type value_type;
if (it == map->end() || last_x != obj.x) {
// key changed: look up (or create) the group once; insert() leaves an existing group intact
last_x = obj.x;
it = map->insert(value_type(obj.x, std::vector<foo>())).first;
}
it->second.push_back(obj);
}
private:
map_type *map;
map_type::iterator it;
int last_x;
};
int main()
{
std::vector<foo> foos;
std::map<int, std::vector<foo>> foos_by_x;
foos.push_back({ 1, "one", 1.0 });
foos.push_back({ 3, "third", 2.5 });
foos.push_back({ 1, "one.. but third", 1.5 });
foos.push_back({ 2, "second", 1.8 });
foos.push_back({ 1, "one.. but second", 1.5 });
std::sort(foos.begin(), foos.end(), [](const foo & lhs, const foo & rhs) {
return lhs.x < rhs.x;
});
std::for_each(foos.begin(), foos.end(), optimized_inserter(foos_by_x));
for (const auto & p : foos_by_x) {
std::cout << "--- " << p.first << "---\n";
for (auto & f : p.second) {
std::cout << '\t' << f.x << " '" << f.y << "' / " << f.z << '\n';
}
}
}
How about this?
template <typename StructType, typename FieldSelectorUnaryFn>
auto GroupBy(const std::vector<StructType>& instances, const FieldSelectorUnaryFn& fieldChooser)
{
StructType _;
using FieldType = decltype(fieldChooser(_));
std::map<FieldType, std::vector<StructType>> instancesByField;
for (auto& instance : instances)
{
instancesByField[fieldChooser(instance)].push_back(instance);
}
return instancesByField;
}
and use it like this:
auto itemsByX = GroupBy(items, [](const auto& item){ return item.x; });
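One limitation of the version above is that StructType _; requires a default constructor. A variant that deduces the key type with std::declval instead might look like this sketch; GroupByField and Item are hypothetical names, and std::decay_t also covers selectors that return a reference.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical record type that cannot be default-constructed.
struct Item {
    int x;
    std::string y;
    Item(int x, std::string y) : x(x), y(std::move(y)) {}
};

template <typename StructType, typename FieldSelector>
auto GroupByField(const std::vector<StructType>& instances, FieldSelector fieldChooser)
{
    // Deduce the key type without constructing a StructType.
    using FieldType =
        std::decay_t<decltype(fieldChooser(std::declval<const StructType&>()))>;
    std::map<FieldType, std::vector<StructType>> instancesByField;
    for (const auto& instance : instances)
        instancesByField[fieldChooser(instance)].push_back(instance);
    return instancesByField;
}
```

It is called the same way, for example GroupByField(items, [](const Item& i) { return i.x; }).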
I wrote a C++ library to address this problem in an elegant way. Given your struct
struct foo
{
int x;
std::string y;
float z;
};
To group by y you simply do:
std::vector<foo> dataframe;
...
auto groups = group_by(dataframe, &foo::y);
You can also group by more than one variable:
auto groups = group_by(dataframe, &foo::y, &foo::x);
And then iterate through the groups normally:
for(auto& [key, group]: groups)
{
// do something
}
It also has other operations such as: subset, concat, and others.
I would simply use boolinq.h, which includes all of LINQ. No documentation, but very simple to use.
I came across one requirement where the record is stored as
Name : Employee_Id : Address
where Name and Employee_Id are both supposed to be keys; that is, a search function is to be provided on both Name and Employee_Id.
I can think of using a map to store this structure
std::map< std::pair<std::string,std::string> , std::string >
// < < Name , Employee-Id> , Address >
but I'm not exactly sure what the search function would look like.
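For reference, here is a rough sketch of what search functions over that pair-keyed map could look like. The names Key, Records, find_by_name and find_by_id are made up for illustration. The sketch shows the limitation of the layout: the ordering helps only for the first component of the key, so a search by Employee_Id has to scan every record.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// <Name, Employee_Id> -> Address, as proposed in the question.
using Key = std::pair<std::string, std::string>;
using Records = std::map<Key, std::string>;

// Search by name: keys are sorted by name first, so the first key that is
// not less than (name, "") is the first record with that name, if any.
inline const std::string* find_by_name(const Records& records, const std::string& name)
{
    auto it = records.lower_bound(Key(name, std::string()));
    if (it != records.end() && it->first.first == name)
        return &it->second;
    return nullptr;
}

// Search by employee id: the id is the second component of the key, so the
// ordering does not help and every record has to be visited.
inline const std::string* find_by_id(const Records& records, const std::string& id)
{
    for (const auto& record : records)
        if (record.first.second == id)
            return &record.second;
    return nullptr;
}
```

Both functions return a pointer to the stored address, or nullptr when nothing matches. The linear scan in find_by_id is the reason the answers below use one index per key instead.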
Boost.MultiIndex
This is a Boost example.
That example uses an ordered index, but you can also use a hashed index:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <string>
#include <iostream>
struct employee
{
int id_;
std::string name_;
std::string address_;
employee(int id,std::string name,std::string address):id_(id),name_(name),address_(address) {}
};
struct id{};
struct name{};
struct address{};
struct id_hash{};
struct name_hash{};
typedef boost::multi_index_container<
employee,
boost::multi_index::indexed_by<
boost::multi_index::ordered_unique<boost::multi_index::tag<id>, BOOST_MULTI_INDEX_MEMBER(employee,int,id_)>,
boost::multi_index::ordered_unique<boost::multi_index::tag<name>,BOOST_MULTI_INDEX_MEMBER(employee,std::string,name_)>,
boost::multi_index::ordered_unique<boost::multi_index::tag<address>, BOOST_MULTI_INDEX_MEMBER(employee,std::string,address_)>,
boost::multi_index::hashed_unique<boost::multi_index::tag<id_hash>, BOOST_MULTI_INDEX_MEMBER(employee,int,id_)>,
boost::multi_index::hashed_unique<boost::multi_index::tag<name_hash>, BOOST_MULTI_INDEX_MEMBER(employee,std::string,name_)>
>
> employee_set;
typedef boost::multi_index::index<employee_set,id>::type employee_set_ordered_by_id_index_t;
typedef boost::multi_index::index<employee_set,name>::type employee_set_ordered_by_name_index_t;
typedef boost::multi_index::index<employee_set,name_hash>::type employee_set_hashed_by_name_index_t;
typedef boost::multi_index::index<employee_set,id>::type::const_iterator employee_set_ordered_by_id_iterator_t;
typedef boost::multi_index::index<employee_set,name>::type::const_iterator employee_set_ordered_by_name_iterator_t;
typedef boost::multi_index::index<employee_set,id_hash>::type::const_iterator employee_set_hashed_by_id_iterator_t;
typedef boost::multi_index::index<employee_set,name_hash>::type::const_iterator employee_set_hashed_by_name_iterator_t;
int main()
{
employee_set employee_set_;
employee_set_.insert(employee(1, "Employer1", "Address1"));
employee_set_.insert(employee(2, "Employer2", "Address2"));
employee_set_.insert(employee(3, "Employer3", "Address3"));
employee_set_.insert(employee(4, "Employer4", "Address4"));
// search by id using an ordered index
{
const employee_set_ordered_by_id_index_t& index_id = boost::multi_index::get<id>(employee_set_);
employee_set_ordered_by_id_iterator_t id_itr = index_id.find(2);
if (id_itr != index_id.end() ) {
const employee& tmp = *id_itr;
std::cout << tmp.id_ << ", " << tmp.name_ << ", " << tmp .address_ << std::endl;
} else {
std::cout << "No records have been found\n";
}
}
// search by non existing id using an ordered index
{
const employee_set_ordered_by_id_index_t& index_id = boost::multi_index::get<id>(employee_set_);
employee_set_ordered_by_id_iterator_t id_itr = index_id.find(2234);
if (id_itr != index_id.end() ) {
const employee& tmp = *id_itr;
std::cout << tmp.id_ << ", " << tmp.name_ << ", " << tmp .address_ << std::endl;
} else {
std::cout << "No records have been found\n";
}
}
// search by name using an ordered index
{
const employee_set_ordered_by_name_index_t& index_name = boost::multi_index::get<name>(employee_set_);
employee_set_ordered_by_name_iterator_t name_itr = index_name.find("Employer3");
if (name_itr != index_name.end() ) {
const employee& tmp = *name_itr;
std::cout << tmp.id_ << ", " << tmp.name_ << ", " << tmp .address_ << std::endl;
} else {
std::cout << "No records have been found\n";
}
}
// search by name using a hashed index
{
employee_set_hashed_by_name_index_t& index_name = boost::multi_index::get<name_hash>(employee_set_);
employee_set_hashed_by_name_iterator_t name_itr = index_name.find("Employer4");
if (name_itr != index_name.end() ) {
const employee& tmp = *name_itr;
std::cout << tmp.id_ << ", " << tmp.name_ << ", " << tmp .address_ << std::endl;
} else {
std::cout << "No records have been found\n";
}
}
// search by name using a hashed index, but the name does not exist in the container
{
employee_set_hashed_by_name_index_t& index_name = boost::multi_index::get<name_hash>(employee_set_);
employee_set_hashed_by_name_iterator_t name_itr = index_name.find("Employer46545");
if (name_itr != index_name.end() ) {
const employee& tmp = *name_itr;
std::cout << tmp.id_ << ", " << tmp.name_ << ", " << tmp .address_ << std::endl;
} else {
std::cout << "No records have been found\n";
}
}
return 0;
}
If you want to use std::map, you can have two separate containers, each one having a different key (name, emp id), and the value should be a pointer to the structure, so that you will not have multiple copies of the same data.
Example with two keys:
#include <memory>
#include <map>
#include <iostream>
#include <string>
template <class KEY1,class KEY2, class OTHER >
class MultiKeyMap {
public:
struct Entry
{
KEY1 key1;
KEY2 key2;
OTHER otherVal;
Entry( const KEY1 &_key1,
const KEY2 &_key2,
const OTHER &_otherVal):
key1(_key1),key2(_key2),otherVal(_otherVal) {};
Entry() {};
};
private:
struct ExtendedEntry;
typedef std::shared_ptr<ExtendedEntry> ExtendedEntrySptr;
struct ExtendedEntry {
Entry entry;
typename std::map<KEY1,ExtendedEntrySptr>::iterator it1;
typename std::map<KEY2,ExtendedEntrySptr>::iterator it2;
ExtendedEntry() {};
ExtendedEntry(const Entry &e):entry(e) {};
};
std::map<KEY1,ExtendedEntrySptr> byKey1;
std::map<KEY2,ExtendedEntrySptr> byKey2;
public:
void del(ExtendedEntrySptr p)
{
if (p)
{
byKey1.erase(p->it1);
byKey2.erase(p->it2);
}
}
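void insert(const Entry &entry) {
// refuse duplicates so that both maps always describe the same set of entries
if (byKey1.count(entry.key1) || byKey2.count(entry.key2))
return;
auto p=std::make_shared<ExtendedEntry>(entry);
p->it1=byKey1.insert(std::make_pair(entry.key1,p)).first;
p->it2=byKey2.insert(std::make_pair(entry.key2,p)).first;
}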
std::pair<Entry,bool> getByKey1(const KEY1 &key1)
{
// find() does not insert an empty entry for a missing key, unlike operator[]
auto it=byKey1.find(key1);
if (it!=byKey1.end())
return std::make_pair(it->second->entry,true);
return std::make_pair(Entry(),false);
}
std::pair<Entry,bool> getByKey2(const KEY2 &key2)
{
auto it=byKey2.find(key2);
if (it!=byKey2.end())
return std::make_pair(it->second->entry,true);
return std::make_pair(Entry(),false);
}
void deleteByKey1(const KEY1 &key1)
{
auto it=byKey1.find(key1);
if (it!=byKey1.end())
del(it->second);
}
void deleteByKey2(const KEY2 &key2)
{
auto it=byKey2.find(key2);
if (it!=byKey2.end())
del(it->second);
}
};
int main(int argc, const char *argv[])
{
typedef MultiKeyMap<int,std::string,int> M;
M map1;
map1.insert(M::Entry(1,"aaa",7));
map1.insert(M::Entry(2,"bbb",8));
map1.insert(M::Entry(3,"ccc",9));
map1.insert(M::Entry(7,"eee",9));
map1.insert(M::Entry(4,"ddd",9));
map1.deleteByKey1(7);
auto a=map1.getByKey1(2);
auto b=map1.getByKey2("ddd");
auto c=map1.getByKey1(7);
std::cout << "by key1=2 (should be bbb ): "<< (a.second ? a.first.key2:"Null") << std::endl;
std::cout << "by key2=ddd (should be ddd ): "<< (b.second ? b.first.key2:"Null") << std::endl;
std::cout << "by key1=7 (does not exist): "<< (c.second ? c.first.key2:"Null") << std::endl;
return 0;
}
Output:
by key1=2 (should be bbb ): bbb
by key2=ddd (should be ddd ): ddd
by key1=7 (does not exist): Null
If EmployeeID is the unique identifier, why use other keys? I would use EmployeeID as the internal key everywhere, and have other mappings from external/human readable IDs (such as Name) to it.
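A minimal sketch of that layout, assuming names are also unique (the question does not say so). The Employee and EmployeeDirectory types are made up for illustration. Records are stored once, keyed by ID, and the name index only maps a name to an ID.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Employee
{
    std::string name;
    std::string address;
};

class EmployeeDirectory
{
public:
    // Returns false and changes nothing if the id or the name is already taken.
    bool add(int id, const std::string& name, const std::string& address)
    {
        if (byId.count(id) != 0 || idByName.count(name) != 0)
            return false;
        byId[id] = Employee{name, address};
        idByName[name] = id;
        return true;
    }

    // Primary lookup: returns nullptr if the id is unknown.
    const Employee* findById(int id) const
    {
        auto it = byId.find(id);
        return it == byId.end() ? nullptr : &it->second;
    }

    // Secondary lookup: resolve the name to an id, then use the primary index.
    const Employee* findByName(const std::string& name) const
    {
        auto it = idByName.find(name);
        return it == idByName.end() ? nullptr : findById(it->second);
    }

private:
    std::map<int, Employee> byId;        // single copy of each record
    std::map<std::string, int> idByName; // external name -> internal id
};
```

If names can repeat, idByName would have to become a std::multimap (or map a name to a list of IDs), and findByName would return several matches.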
C++14 std::set::find non-key searches solution
This method saves you from storing the keys twice, once in the indexed object and a second time as the key of a map, as is done at: https://stackoverflow.com/a/44526820/895245
A minimal example of the central technique, which may be easier to understand first, is at: How to make a C++ map container where the key is part of the value?
#include <cassert>
#include <set>
#include <type_traits>
#include <vector>
struct Point {
int x;
int y;
int z;
};
class PointIndexXY {
public:
void insert(Point *point) {
sx.insert(point);
sy.insert(point);
}
void erase(Point *point) {
sx.erase(point);
sy.erase(point);
}
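Point* findX(int x) {
// return nullptr instead of dereferencing end() when nothing matches
auto it = this->sx.find(x);
return it == this->sx.end() ? nullptr : *it;
}
Point* findY(int y) {
auto it = this->sy.find(y);
return it == this->sy.end() ? nullptr : *it;
}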
private:
struct PointCmpX {
typedef std::true_type is_transparent;
bool operator()(const Point* lhs, int rhs) const { return lhs->x < rhs; }
bool operator()(int lhs, const Point* rhs) const { return lhs < rhs->x; }
bool operator()(const Point* lhs, const Point* rhs) const { return lhs->x < rhs->x; }
};
struct PointCmpY {
typedef std::true_type is_transparent;
bool operator()(const Point* lhs, int rhs) const { return lhs->y < rhs; }
bool operator()(int lhs, const Point* rhs) const { return lhs < rhs->y; }
bool operator()(const Point* lhs, const Point* rhs) const { return lhs->y < rhs->y; }
};
std::set<Point*, PointCmpX> sx;
std::set<Point*, PointCmpY> sy;
};
int main() {
std::vector<Point> points{
{1, -1, 1},
{2, -2, 4},
{0, 0, 0},
{3, -3, 9},
};
PointIndexXY idx;
for (auto& point : points) {
idx.insert(&point);
}
Point *p;
p = idx.findX(0);
assert(p->y == 0 && p->z == 0);
p = idx.findX(1);
assert(p->y == -1 && p->z == 1);
p = idx.findY(-2);
assert(p->x == 2 && p->z == 4);
}