I want to create an adjacency matrix for a graph. Since I read it is not safe to use arrays of the form matrix[x][y] because they don't check for range, I decided to use the vector template class of the stl. All I need to store in the matrix are boolean values. So my question is, if using std::vector<std::vector<bool>* >* produces too much overhead or if there is a more simple way for a matrix and how I can properly initialize it.
EDIT: Thanks a lot for the quick answers. I just realized, that of course I don't need any pointers. The size of the matrix will be initialized right in the beginning and won't change until the end of the program. It is for a school project, so it would be good if I write "nice" code, although technically performance isn't too important. Using the STL is fine. Using something like boost, is probably not appreciated.
Note that also you can use boost.ublas for matrix creation and manipulation and also boost.graph to represent and manipulate graphs in a number of ways, as well as using algorithms on them, etc.
Edit: Anyway, doing a range-check version of a vector for your purposes is not a hard thing:
template <typename T>
class BoundsMatrix
{
std::vector<T> inner_;
unsigned int dimx_, dimy_;
public:
BoundsMatrix (unsigned int dimx, unsigned int dimy)
: dimx_ (dimx), dimy_ (dimy)
{
inner_.resize (dimx_*dimy_);
}
T& operator()(unsigned int x, unsigned int y)
{
if (x >= dimx_ || y>= dimy_)
throw std::out_of_range("matrix indices out of range"); // ouch
return inner_[dimx_*y + x];
}
};
Note that you would also need to add the const version of the operators, and/or iterators, and the strange use of exceptions, but you get the idea.
Best way:
Make your own matrix class, that way you control every last aspect of it, including range checking.
eg. If you like the "[x][y]" notation, do this:
class my_matrix {
std::vector<std::vector<bool> >m;
public:
my_matrix(unsigned int x, unsigned int y) {
m.resize(x, std::vector<bool>(y,false));
}
class matrix_row {
std::vector<bool>& row;
public:
matrix_row(std::vector<bool>& r) : row(r) {
}
bool& operator[](unsigned int y) {
return row.at(y);
}
};
matrix_row& operator[](unsigned int x) {
return matrix_row(m.at(x));
}
};
// Example usage
my_matrix mm(100,100);
mm[10][10] = true;
nb. If you program like this then C++ is just as safe as all those other "safe" languages.
The standard vector does NOT do range checking by default.
i.e. The operator[] does not do a range check.
The method at() is similar to [] but does do a range check.
It will throw an exception on out of range.
std::vector::at()
std::vector::operator[]()
Other notes:
Why a vector<Pointers> ?
You can quite easily have a vector<Object>. Now there is no need to worry about memory management (i.e. leaks).
std::vector<std::vector<bool> > m;
Note: vector<bool> is overloaded and not very efficient (i.e. this structure was optimized for size not speed) (It is something that is now recognized as probably a mistake by the standards committee).
If you know the size of the matrix at compile time you could use std::bitset?
std::vector<std::bitset<5> > m;
or if it is runtime defined use boost::dynamic_bitset
std::vector<boost::dynamic_bitset> m;
All of the above will allow you to do:
m[6][3] = true;
If you want 'C' array performance, but with added safety and STL-like semantics (iterators, begin() & end() etc), use boost::array.
Basically it's a templated wrapper for 'C'-arrays with some NDEBUG-disable-able range checking asserts (and also some std::range_error exception-throwing accessors).
I use stuff like
boost::array<boost::array<float,4>,4> m;
instead of
float m[4][4];
all the time and it works great (with appropriate typedefs to keep the verbosity down, anyway).
UPDATE: Following some discussion in the comments here of the relative performance of boost::array vs boost::multi_array, I'd point out that this code, compiled with g++ -O3 -DNDEBUG on Debian/Lenny amd64 on a Q9450 with 1333MHz DDR3 RAM takes 3.3s for boost::multi_array vs 0.6s for boost::array.
#include <iostream>
#include <time.h>
#include "boost/array.hpp"
#include "boost/multi_array.hpp"
using namespace boost;
enum {N=1024};
typedef multi_array<char,3> M;
typedef array<array<array<char,N>,N>,N> C;
// Forward declare to avoid being optimised away
static void clear(M& m);
static void clear(C& c);
int main(int,char**)
{
const clock_t t0=clock();
{
M m(extents[N][N][N]);
clear(m);
}
const clock_t t1=clock();
{
std::auto_ptr<C> c(new C);
clear(*c);
}
const clock_t t2=clock();
std::cout
<< "multi_array: " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s\n"
<< "array : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s\n";
return 0;
}
void clear(M& m)
{
for (M::index i=0;i<N;i++)
for (M::index j=0;j<N;j++)
for (M::index k=0;k<N;k++)
m[i][j][k]=1;
}
void clear(C& c)
{
for (int i=0;i<N;i++)
for (int j=0;j<N;j++)
for (int k=0;k<N;k++)
c[i][j][k]=1;
}
What I would do is create my own class for dealing with matrices (probably as an array[x*y] because I'm more used to C (and I'd have my own bounds checking), but you could use vectors or any other sub-structure in that class).
Get your stuff functional first then worry about how fast it runs. If you design the class properly, you can pull out your array[x*y] implementation and replace it with vectors or bitmasks or whatever you want without changing the rest of the code.
I'm not totally sure, but I thing that's what classes were meant for, the ability to abstract the implementation well out of sight and provide only the interface :-)
In addition to all the answers that have been posted so far, you might do well to check out the C++ FAQ Lite. Questions 13.10 - 13.12 and 16.16 - 16.19 cover several topics related to rolling your own matrix class. You'll see a couple of different ways to store the data and suggestions on how to best write the subscript operators.
Also, if your graph is sufficiently sparse, you may not need a matrix at all. You could use std::multimap to map each vertex to those it connects.
my favourite way to store a graph is vector<set<int>>; n elements in vector (nodes 0..n-1), >=0 elements in each set (edges). Just do not forget adding a reverse copy of every bi-directional edge.
Consider also how big is your graph/matrix, does performance matter a lot? Is the graph static, or can it grow over time, e.g. by adding new edges?
Probably, not relevant as this is an old question, but you can use the Armadillo library, which provides many linear algebra oriented data types and functions.
Below is an example for your specific problem:
// In C++11
Mat<bool> matrix = {
{ true, true},
{ false, false},
};
// In C++98
Mat<bool> matrix;
matrix << true << true << endr
<< false << false << endr;
Mind you std::vector doesn't do range checking either.
Related
I am following this example to make an adjacency list. However it seems like the vector size cannot be dynamic.
Visual studio throws an error
expression did not evaluate to a constant
on this line
vector<int> adj[V];
The strange thing is that the same exact code works correctly on codeblocks IDE.
I've tried replacing the above line with vector<int> adj; but then I cannot send the vector as a parameter to addEdge(adj, 0, 1); as it throws another error about pointers which I also don't know how to correct.
What could I do to dynamically create my vector?
C++ - How to create a dynamic vector
You don't need to do that for this example. But if you did need it, you could use std::make_unique.
The linked example program is ill-formed. I recommend to not try to learn from that. The issue that you encountered is that they use a non-const size for an array. But the size of an array must be compile time constant in C++. Simple fix is to declare the variable type as const:
const int V = 5;
I've tried replacing the above line with vector<int> adj;
You can't just replace an array of vectors with a single vector and expect the program to work without making other changes.
I need the size to be dynamic as it will only be known at compile time.
Assuming you meant to say that the size will only be known at runtime, the solution is to use a vector of vectors.
As written by eerorika, the example code isn't a good one, and you should avoid using raw arrays like that. An array in C/C++ is of static size, each vector in this array is dynamic, but the entire array is not!
There are two approaches for such a question. Either use adjacency lists (which is more common):
#include <vector>
#include <stdint.h>
class Vertix
{
public:
Vertix(uint64_t id_) : id(id_) {}
uint64_t get_id() const { return id; }
void add_adj_vertix(uint64_t id) { adj_vertices.push_back(id); }
const std::vector<uint64_t>& get_adj_vertices() const { return adj_vertices; }
private:
uint64_t id;
std::vector<uint64_t> adj_vertices;
};
class Graph
{
public:
void add_vertix(uint64_t id)
{
vertices[id] = Vertix(id);
}
void add_edge(uint64_t v_id, uint64_t u_id)
{
edges.emplace_back(u_id, v_id);
vertices[u_id].add_adj_vertix(v_id);
}
private:
std::vector<Vertix> vertices;
std::vector<std::pair<uint64_t, uint64_t>> edges;
};
or use double vector to represent the edges matrix:
std::vector<std::vector<uint64_t>> edges;
But it isn't a real matrix, and you cannot check if (u, v) is in the graph in O(1), which misses the point of having adjacency matrix. Assuming you know the size of Graph on compile time, you should write something like:
#include <array>
#include <stdint.h>
template <size_t V>
using AdjacencyMatrix = std::array<std::array<bool, V>, V>;
template <size_t V>
void add_edge(AdjacencyMatrix<V>& adj_matrix, uint64_t u, uint64_t v)
{
if (u < V && v < V)
{
adj_matrix[u][v] = true;
}
else
{
// error handling
}
}
Then you can use AdjacencyMatrix<5> instead of what they were using on that example, in O(1) time, and although it has static size, it does work as intended.
There’s no need to use C-style arrays in modern C++. Their equivalent is std::array, taking the size as a template parameter. Obviously that size can’t be a runtime variable: template parameters can be types or constant expressions. The compiler error reflects this: std::array is a zero cost wrapper over an internal, raw “C” array.
If the array is always small, you may wish to use a fixed-maximum-size array, such as provided by boost. You get all performance benefits of fixed size arrays and can still store down to zero items in it.
There are other solutions:
If all vectors have the same size, make a wrapper that takes two indices, and uses N*i1+i2 as the index to an underlying std::vector.
If the vectors have different sizes, use a vector of vectors: std::vector>. If there are lots of vectors and you often add and remove them, you may look into using a std::list of vectors.
I would like to use a class with the same functionality as std::vector, but
Replace std::vector<T>::size_type by some signed integer (like int64_t or simply int), instead of usual size_t. It is very annoying to see warnings produced by a compiler in comparisons between signed and unsigned numbers when I use standard vector interface. I can't just disable such warnings, because they really help to catch programming errors.
put assert(0 <= i && i < size()); inside operator[](int i) to check out of range errors. As I understand it will be a better option over the call to .at() because I can disable assertions in release builds, so performance will be the same as in the standard implementation of the vector. It is almost impossible for me to use std::vector without manual checking of range before each operation because operator[] is the source of almost all weird errors related to memory access.
The possible options that come to my mind are to
Inherit from std::vector. It is not a good idea, as said in the following question: Extending the C++ Standard Library by inheritance?.
Use composition (put std::vector inside my class) and repeat all the interface of std::vector. This option forces me to think about the current C++ standard, because the interface of some methods, iterators is slightly different in C++ 98,11,14,17. I would like to be sure, that when c++ 20 became available, I can simply use it without reimplementation of all the interface of my vector.
An answer more to the underlying problem read from the comment:
For example, I don't know how to write in a ranged-based for way:
for (int i = a.size() - 2; i >= 0; i--) { a[i] = 2 * a[i+1]; }
You may change it to a generic one like this:
std::vector<int> vec1{ 1,2,3,4,5,6};
std::vector<int> vec2 = vec1;
int main()
{
// generic code
for ( auto it = vec1.rbegin()+1; it != vec1.rend(); it++ )
{
*it= 2* *(it-1);
}
// your code
for (int i = vec2.size() - 2; i >= 0; i--)
{
vec2[i] = 2 * vec2[i+1];
}
for ( auto& el: vec1) { std::cout << el << std::endl; }
for ( auto& el: vec2) { std::cout << el << std::endl; }
}
Not using range based for as it is not able to access relative to the position.
Regarding point 1: we hardly ever get those warnings here, because we use vectors' size_type where appropriate and/or cast to it if needed (with a 'checked' cast like boost::numeric_cast for safety). Is that not an option for you? Otherwise, write a function to do it for you, i.e. the non-const version would be something like
template<class T>
T& ati(std::vector<T>& v, std::int64_t i)
{
return v.at(checked_cast<decltype(v)::size_type>(i));
}
And yes, inheriting is still a problem. And even if it weren't you'd break the definition of vector (and the Liskov substitution principle I guess), because the size_type is defined as
an unsigned integral type that can represent any non-negative value of difference_type
So it's down to composition, or a bunch of free functions for accessing with a signed size_type and a range check. Personally I'd go for the latter: less work, as easy to use, and you can still pass your vector to functions teaking vectors without problems.
(This is more a comment than a real answer, but has some code, so...)
For the second part (range checking at runtime), a third option would be to use some macro trick:
#ifdef DEBUG
#define VECTOR_AT(v,i) v.at(i)
#else
#define VECTOR_AT(v,i) v[i]
#endif
This can be used this way:
std::vector<sometype> vect(somesize);
VECTOR_AT(vect,i) = somevalue;
Of course, this requires editing your code in a quite non-standard way, which may not be an option. But it does the job.
For efficiency purposes I need to write a code that takes a vector of integers as defined in Eigen 3, VectorXi, and maps that vector to a character. Like a dictionary in Python. How can I do this? The Eigen documentation does things the other way around (see below) - it maps a character to a vector. I can't get it to work in reverse. Has anyone ever tried this before?
std::map<char,VectorXi, std::less<char>,
Eigen::aligned_allocator<std::pair<char, VectorXi> > > poop;
VectorXi check(modes);
check << 0,2,0,0;
poop['a']=check;
cout << poop['a'];
Your code is trying to approach the solution from the completely opposite side. There, you construct a map in which the keys are of type char and the values are Eigen vectors.
Note further that according to the Eigen homepage custom allocators are only required for the fixed-size versions of Eigen types.
In order to use a std::map with a Eigen vector as key, you need an appropriate comparison function. For this, you can specialize std::less which is allowed for custom types. Here is a possible implementation:
namespace std
{
template<>
std::less<Eigen::VectorXi>(Eigen::VectorXi const& a, Eigen::VectorXi const& b)
{
assert(a.size()==b.size());
for(size_t i=0;i<a.size();++i)
{
if(a[i]<b[i]) return true;
if(a[i]>b[i]) return false;
}
return false;
}
}
Then you should be able to write
std::map<VectorXi, char> poop;
VectorXi check;
check << 0,2,0,0;
poop[check]='a';
cout << poop[check]; //prints 'a'
The code above is untested, however.
I'm currently trying to do a complicated variable correction to a bunch of variables (based on normalizing in various phase spaces) for some data that I'm reading in. Since each correction follows the same process, I was wondering if there would be anyway to do this iteratively rather than handle each variable by itself (since I need to this for about 18-20 variables). Can C++ handle this? I was told by someone to try this in python but I feel like it could be done in C++ in some way... I'm just hitting a wall!
To give you an idea, given something like:
class VariableClass{
public :
//each object of this class represents an event for this particlular data set
//containing the following variables
double x;
double y;
double z;
}
I want to do something along the lines of:
for (int i=0; i < num_variables; i++)
{
for (int j=0; j < num_events; j++)
{
//iterate through events
}
//correct variable here, then move on to next one
}
Thanks in advance for any advice!!!
I'm assuming your member variables will not all have the same type. Otherwise you can just throw them into a container. If you have C++11, one way you could solve this problem is a tuple. With some template metaprogramming you can simulate a loop over all elements of the tuple. The function std::tie will build a tuple with references to all of your members that you can "iterate" like this:
struct DoCorrection
{
template<typename T>
void operator()(T& t) const { /* code goes here */ }
};
for_each(std::tie(x, y, z), DoCorrection());
// see linked SO answer for the detailed code to make this special for_each work.
Then, you can specialize operator() for each member variable type. That will let you do the appropriate math automatically without manually keeping track of the types.
taken from glm (detail vec3.incl)
template <typename T>
GLM_FUNC_QUALIFIER typename tvec3<T>::value_type &
tvec3<T>::operator[]
(
size_type i
)
{
assert(i < this->length());
return (&x)[i];
}
this would translate to your example:
class VariableClass{
public :
//each object of this class represents an event for this particlular data
double x;
double y;
double z;
double & operator[](int i) {
assert(i < 3);
return (&x)[i];
}
}
VariableClass foo();
foo.x = 2.0;
std::cout << foo[0] << std::endl; // => 2.0
Althought i would recomment glm, if it is just about vector math.
Yes, just put all your variables into a container, like std::vector, for example.
http://en.cppreference.com/w/cpp/container/vector
I recommend spending some time reading about all the std classes. There are many containers and many uses.
In general you cannot iterate over members without relying on implementation defined things like padding or reordering of sections with different access qualifiers (literally no compiler does the later - it is allowed though).
However, you can use a the generalization of a record type: a std::tuple. Iterating a tuple isn't straight-forward but you will find plenty of code that does it. The worst here is the loss of named variables, which you can mimic with members.
If you use Boost, you can use Boost.Fusion's helper-macro BOOST_FUSION_ADAPT_STRUCT to turn a struct into a Fusion sequence and then you can use it with Fusion algorithms.
I have a sequence, e.g
std::vector< Foo > someVariable;
and I want a loop which iterates through everything in it.
I could do this:
for (int i=0;i<someVariable.size();i++) {
blah(someVariable[i].x,someVariable[i].y);
woop(someVariable[i].z);
}
or I could do this:
for (std::vector< Foo >::iterator i=someVariable.begin(); i!=someVariable.end(); i++) {
blah(i->x,i->y);
woop(i->z);
}
Both these seem to involve quite a bit of repetition / excessive typing. In an ideal language I'd like to be able to do something like this:
for (i in someVariable) {
blah(i->x,i->y);
woop(i->z);
}
It seems like iterating through everything in a sequence would be an incredibly common operation. Is there a way to do it in which the code isn't twice as long as it should have to be?
You could use for_each from the standard library. You could pass a functor or a function to it. The solution I like is BOOST_FOREACH, which is just like foreach in other languages. C+0x is gonna have one btw.
For example:
#include <iostream>
#include <vector>
#include <algorithm>
#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
void print(int v)
{
std::cout << v << std::endl;
}
int main()
{
std::vector<int> array;
for(int i = 0; i < 100; ++i)
{
array.push_back(i);
}
std::for_each(array.begin(), array.end(), print); // using STL
foreach(int v, array) // using Boost
{
std::cout << v << std::endl;
}
}
Not counting BOOST_FOREACH which AraK already suggested, you have the following two options in C++ today:
void function(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
std::for_each(someVariable.begin(), someVariable.end(), function);
struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());
Both require you to specify the "body" of the loop elsewhere, either as a function or as a functor (a class which overloads operator()). That might be a good thing (if you need to do the same thing in multiple loops, you only have to define the function once), but it can be a bit tedious too. The function version may be a bit less efficient, because the compiler is generally unable to inline the function call. (A function pointer is passed as the third argument, and the compiler has to do some more detailed analysis to determine which function it points to)
The functor version is basically zero overhead. Because an object of type functor is passed to for_each, the compiler knows exactly which function to call: functor::operator(), and so it can be trivially inlined and will be just as efficient as your original loop.
C++0x will introduce lambda expressions which make a third form possible.
std::for_each(someVariable.begin(), someVariable.end(), [](Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
});
Finally, it will also introduce a range-based for loop:
for(Foo& arg : my_someVariable)
{
blah(arg.x, arg.y);
woop(arg.z);
}
So if you've got access to a compiler which supports subsets of C++0x, you might be able to use one or both of the last forms. Otherwise, the idiomatic solution (without using Boost) is to use for_eachlike in one of the two first examples.
By the way, MSVS 2008 has a "for each" C++ keyword. Look at How to: Iterate Over STL Collection with for each.
int main() {
int retval = 0;
vector<int> col(3);
col[0] = 10;
col[1] = 20;
col[2] = 30;
for each( const int& c in col )
retval += c;
cout << "retval: " << retval << endl;
}
Prefer algorithm calls to hand-written loops
There are three reasons:
1) Efficiency: Algorithms are often more efficient than the loops programmers produce
2) Correctness: Writing loops is more subject to errors than is calling algorithms.
3) Maintainability: Algorithm calls often yield code that is clearer and more
straightforward than the corresponding explicit loops.
Prefer almost every other algorithm to for_each()
There are two reasons:
for_each is extremely general, telling you nothing about what's really being done, just that you're doing something to all the items in a sequence.
A more specialized algorithm will often be simpler and more direct
Consider, an example from an earlier reply:
void print(int v)
{
std::cout << v << std::endl;
}
// ...
std::for_each(array.begin(), array.end(), print); // using STL
Using std::copy instead, that whole thing turns into:
std::copy(array.begin(), array.end(), std::ostream_iterator(std::cout, "\n"));
"struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());"
I think approaches like these are often needlessly baroque for a simple problem.
do i=1,N
call blah( X(i),Y(i) )
call woop( Z(i) )
end do
is perfectly clear, even if it's 40 years old (and not C++, obviously).
If the container is always a vector (STL name), I see nothing wrong with an index and nothing wrong with calling that index an integer.
In practice, often one needs to iterate over multiple containers of the same size simultaneously and peel off a datum from each, and do something with the lot of them. In that situation, especially, why not use the index?
As far as SSS's points #2 and #3 above, I'd say it could be so for complex cases, but often iterating 1...N is often as simple and clear as anything else.
If you had to explain the algorithm on the whiteboard, could you do it faster with, or without, using 'i'? I think if your meatspace explanation is clearer with the index, use it in codespace.
Save the heavy C++ firepower for the hard targets.