Array Size as Constructor Parameter - c++

I am creating a C++ class which wraps an floating point 2d array and provides some additional functionality. I want to pass the array's shape as parameters to the constructor, see the code (the class Block part) below. The line ends with comment "// here" would cause error during compilation because _nx and _ny are not known at that time.
There are two solutions (I think) around this: one is using pointer (see solution 1 in the code below) and dynamically allocate the array; the other is using template (see solution 2 in the code below), but I have several reasons not to use them:
I don't want to use pointer as long as there is a pointer-less
option; in other words, I don't want to use new and delete. The
reason for this is a personal preference for purer c++.
I don't want
to use template because there can be many different block shapes - I
don't want the compiler to create many classes for each of them,
this is an overkill and increases the executable size.
In addition, I don't want to use stl vector because the array size is fixed after creation; also I am doing numerical computation, so a 'raw' array suits me much better.
I have searched in SO and there are five or six questions asking similar problems, there is no conclusion which one is better though, and none of them are from a numerical standing point so vector or new/detele are good answers for them - but not for me. Another reason I post this question is I want to know if I am too restrictive in using c++ features. As I will be using c++ extensively, it's very important to be aware of c++'s limitation and stop asking/searching too much for some feature that doesn't exist.
#include <iostream>
#include <memory>
using namespace std;
class Block
{
public:
Block(int nx, int ny):_nx(nx),_ny(ny){}
void Report(void)
{
cout << "Block With Size ["<<_nx<<","<<_ny<<"]\n";
}
private:
const int _nx, _ny;
double _data[_nx][_ny]; // here
};
/// Solution 1, using auto_ptr
class BlockAuto
{
public:
BlockAuto(int nx, int ny):_nx(nx),_ny(ny),_data(new double[_nx*_ny]){}
void Report(void)
{
cout << "BlockAuto With Size ["<<_nx<<","<<_ny<<"]\n";
}
private:
const int _nx;
const int _ny;
const auto_ptr<double> _data;
};
/// Solution 2, using template
template<unsigned int nx, unsigned int ny>
class BlockTpl
{
public:
BlockTpl():_nx(nx),_ny(ny){}
void Report(void)
{
cout << "BlockTpl With Size ["<<_nx<<","<<_ny<<"]\n";
}
private:
const int _nx;
const int _ny;
double _data[nx][ny]; // uncomfortable here, can't use _nx, _ny
};
int main(int argc, const char *argv[])
{
Block b(3,3);
b.Report();
BlockAuto ba(3,3);
ba.Report();
BlockTpl<3,4> bt;
bt.Report();
return 0;
}

Just use a std::vector. I had the same decision problem a week before and had asked here.
If you use reserve(), which shall not make your vector reallocate itself many times(if any), then vectors are not going to influence the performance of your project. In other words, vector is unlikely to be your bottleneck.
Notice, that in C++ vectors are used widely, thus in release mode, the optimizations made to them are really efficient.
Or wait for std::dynarray to be introduced! (Unfortunately not in C++14, but in array TS or C++17). Source, credits to manlio.
Never forget: Premature optimization is the source of Evil. - Knuth.
Don't believe me? You shouldn't! Experiment yourself and find out!
Here is my experiment to get me convinced when I had exactly the same question as you.
Experiment code:
#include <iostream>
#include <vector>
#include <ctime>
#include <ratio>
#include <chrono>
using namespace std;
int main() {
const int N = 100000;
cout << "Creating, filling and accessing an array of " << N << " elements.\n";
using namespace std::chrono;
high_resolution_clock::time_point t1 = high_resolution_clock::now();
int array[N];
for(int i = 0; i < N; ++i)
array[i] = i;
for(int i = 0; i < N; ++i)
array[i] += 5;
high_resolution_clock::time_point t2 = high_resolution_clock::now();
duration<double> time_span = duration_cast<duration<double>>(t2 - t1);
std::cout << "It took me " << time_span.count() << " seconds.";
std::cout << std::endl;
cout << "Creating, filling and accessing an vector of " << N << " elements.\n";
t1 = high_resolution_clock::now();
vector<int> v;
v.reserve(N);
for(int i = 0; i < N; ++i)
v.emplace_back(i);
for(int i = 0; i < N; ++i)
v[i] += 5;
t2 = high_resolution_clock::now();
time_span = duration_cast<duration<double>>(t2 - t1);
std::cout << "It took me " << time_span.count() << " seconds.";
std::cout << std::endl;
return 0;
}
Results (notice the -o2 compiler flag):
samaras#samaras-A15:~$ g++ -std=gnu++0x -o2 px.cpp
samaras#samaras-A15:~$ ./a.out
Creating, filling and accessing an array of 100000 elements.
It took me 0.002978 seconds.
Creating, filling and accessing an vector of 100000 elements.
It took me 0.002264 seconds.
So, just a std::vector. :) I am pretty sure you know how to change your code for that and you do not need me to tell you (is so, let me know of course :) ).
You can try with other time methods, found in my pseudo-site.

I think you're being overly cautious in rejecting std::vector just because of the resizability issue. Surely your program can accommodate sizeof(Block) being a few of pointer sizes larger than the raw pointer solution. A vector for maintaining the matrix should be no different than your pointer solution as far as performance goes, if you use a single vector instead of a vector of vectors.
Using vector will also make it a lot more unlikely that you'll screw up. For instance, your auto_ptr solution has undefined behavior because the auto_ptr is going to delete, instead of delete[], the array in the destructor. Also, you most likely won't get the behavior you expect unless you define a copy constructor and assignment operator.
Now, if you must eschew vector, I'd suggest using unique_ptr instead of auto_ptr.
class Block
{
public:
Block(int nx, int ny):_nx(nx),_ny(ny), _data(new double[_nx*_ny])
{}
void Report(void)
{
cout << "Block With Size ["<<_nx<<","<<_ny<<"]\n";
}
private:
const int _nx, _ny;
std::unique_ptr<double[]> _data; // here
};
This will correctly call delete[] and it won't transfer ownership of the array as readily as auto_ptr.

std::vector is your friend, no need to rebuild the wheel :
class Block
{
public:
BlockAuto(int p_rows, int p_cols):m_rows(nx),m_cols(ny)
{
m_vector.resize(m_rows*m_cols);
}
double at(uint p_x, uint p_y)
{
//Some check that p_x and p_y aren't over limit is advised
return m_vector[p_x + p_y*m_rows];
}
void Report(void)
{
cout << "Block With Size ["<<_nx<<","<<_ny<<"]\n";
}
private:
const int m_rows;
const int m_cols;
std::vector<double> m_vector;
//or double* m_data
};
You may also use a simple double* as in your first solution. Do not forget to delete it when destroying the block though.

Memory is cheap these days, and your block matrices are very very small.
So, when you don't want to use templates, and don't want to use dynamic allocation, well just use a fixed size array sufficiently large for the largest possible block.
It's that simple.
The code you show with std::auto_ptr has two main problems:
std::auto_ptr is deprecated in C++11.
std::auto_ptr always performed a delete p, which yields Undefined Behavior when the allocation was of an array, like new T[n].
By the way, regarding the envisioned code bloat with templates, you may be pleasantly surprised if you measure.
Also in passing, this smells quite a bit of premature optimization. With C++ it's a good idea to always keep performance in mind, and not do needlessly slow or memory consuming things. But also, a good idea to not get bogged down in needlessly working around some perceived performance problem that really doesn't matter, or wouldn't have mattered if it were just ignored.
And so, your main default choice should be to use a std::vector for the storage.
Then, if you suspect that it's too slow, measure. The release version. Oh I've said that only twice, so here's third: measure. ;-)

Related

Memory size of the new allocated int in c++, Is there a different and better way to see it?

In this program I am trying to find out how much memory is allocated for my pointer. I can see it in this way that it should be 1 gibibyte which is = 1 073 741 824 bytes. My problem is that the only way I can get this thru is by taking the size of int which is 4 and multiplying by that const number. Is there a different way?
#include "stdafx.h"
#include <iostream>
#include <new>
int main(){
const int gib = 268435256; //Created a constant int so I could allocate 1
//Gib memory
int *ptr = new int [gib];
std::cout << sizeof (int)*gib << std::endl;
std::cout << *ptr << std::endl;
std::cout << ptr << std::endl;
try {
}catch (std::bad_alloc e) {
std::cerr << e.what() << std::endl;
}
system("PAUSE");
delete[] ptr;
return 0;
}
No, there is no way. The compiler internally adds information about how much memory was allocated and how many elements were created by new[], because otherwise it couldn't perform delete[] correctly. However, there is no portable way in C++ to get that information and use it directly.
So you have to store the size separately while you still know it.
Actually, you don't, because std::vector does it for you:
#include <iostream>
#include <vector>
#include <new>
int main() {
const int gib = 268435256;
try {
std::vector<int> v(gib);
std::cout << (v.capacity() * sizeof(int)) << '\n';
} catch (std::bad_alloc const& e) {
std::cerr << e.what() << '\n';
}
}
You should practically never use new[]. Use std::vector.
Note that I've used capacity and not size, because size tells you how many items the vector represents, and that number can be smaller than the number of elements supported by the vector's currently allocated memory.
There is also no way to avoid the sizeof, because the size of an int can vary among implementations. But that's not a problem, either, because a std::vector cannot lose its type information, so you always know how big one element is.
You wouldn't need the multiplication if it was a std::vector<char>, a std::vector<unsigned char> or a std::vector<signed char>, because those three character types' sizeof is guaranteed to be 1.
There is no way to retrieve the amount of allocated memory from the pointer. Lets forget for a moment that standard containers (and smart pointers) exist, then you could use a struct that encapsulates the pointer and the size. The most simple dynamic array I can imagine is this:
template <typename T>
struct my_dynamic_array {
size_t capacity;
T* data;
my_dynamic_array(size_t capacity) : capacity(capacity),data(new T[capacity]) {}
~my_dynamic_array() { delete[] data; }
const T& operator[](int i) const { return data[i];}
T& operator[](int i) { return data[i];}
};
Note that his is just a basic example for the sake of demonstration, eg you shouldnt copy instances of this struct or bad things will happen. However, it can be used like this:
my_dynamic_array<int> x(5);
x[3] = 1;
std::cout << x[3];
ie no pointers and no manual memory allocation in the code using the array, which is a good thing. Actually, this alone is already a big deal, because now you can make use of RAII and cannot forget to delete the memory.
Next you may want to resize your array, which requires just a bit more boilerplate (again: take it with a grain of salt!):
template <typename T>
struct my_dynamically_sized_array : my_dynamic_array<T> {
size_t size;
my_dynamically_sized_array(size_t size, size_t capacity) :
my_dynamic_array<T>(capacity),size(size) {}
void push(const T& t) {
my_dynamic_array<T>::data[size] = t;
++size;
}
};
It can be used like this:
my_dynamically_sized_array<int> y(0,3);
y.push(3);
std::cout << y[0];
Of course the memory would need to be reallocated when the size grows bigger than the capacity and many more things would be required to make this wrapper really functional (eg being able to copy would be nice).
The bottom line is: Dont do any of this! To write a good full-blown container class much more than I can outline here is required and most of that is boiler-plate that doesnt really add value to your code base, because std::vector already is a thin wrapper around dynamically allocated memory that offers you all you need while not imposing overhead for stuff you dont use.

C++ Vector: push_back Objects vs push_back Pointers performance

I'm testing performance difference between pushing back Objects vs pushing back object Pointers to Vector in C++.
I've read in Stackoverflow and other articles that you should avoid pushing back pointers unless you must do so...
However, I realized that there is a HUGE performance gain for pushing back Pointers,,,
This is a simple test I ran:
tstart = chrono::system_clock::now();
vector<MyObject> VectorOfObjects;
for (int i=0; i<10000; i++) {
MyObject x("test");
VectorOfObjects.push_back(x);
}
tend = chrono::system_clock::now();
tt = tend-tstart;
cout << "Pushback Object: " << tt.count()*1000 << " Milliseconds\n" << endl;
tstart = chrono::system_clock::now();
vector<MyObject *> VectorOfPointers;
for (int i=0; i<10000; i++) {
VectorOfPointers.push_back(new MyObject("test"));
}
tend = chrono::system_clock::now();
tt = tend-tstart;
cout << "Pushback Pointers: " << tt.count()*1000 << " Milliseconds\n" << endl;
The result is actually pretty surprising:
Pushback Objects: 989 Milliseconds
Pushback Pointers: 280 Milliseconds
As you can see, pushing back pointers is 3~4 times faster than pushing back objects! which is a huge performance difference, especially when dealing with large volume of data.
So my question is: WHY NOT USE Vector of Pointers??
Answers to almost every post on Stackoverflow regarding similar question says Avoid Vector of Pointers..
I know memory leakage might be a problem,, but we can always use Smart Pointers,, and even manually deleting the pointers at destruction is not that difficult..
I'm also curious about the cause of this performance difference..
Thanks
UPDATE:
Actually I tested on ideone .... and here, pushback objects is Faster!!!
In Visual Studio,, pushing back Objects was Wayyy slower..
Why is this...??
To be fair, when measuring your code you should account for deallocation of all those pointers. A sample code would read as :
#include <chrono>
#include <string>
#include <iostream>
#include <functional>
#include <vector>
using namespace std;
// 1. A way to easily measure elapsed time -------------------
template<typename TimeT = std::chrono::milliseconds>
struct measure
{
template<typename F>
static typename TimeT::rep execution(F const &func)
{
auto start = std::chrono::system_clock::now();
func();
auto duration = std::chrono::duration_cast< TimeT>(
std::chrono::system_clock::now() - start);
return duration.count();
}
};
// -----------------------------------------------------------
// 2. MyObject -----------------------------------------------
struct MyObject {
string mem;
MyObject(const char *text) : mem(text) {};
};
// -----------------------------------------------------------
int main()
{
vector<MyObject> VectorOfObjects;
vector<MyObject *> VectorOfPointers;
cout << "Pushback Object: " << measure<>::execution([&]()
{
for (int i = 0; i < 100000; i++) {
MyObject x("test");
VectorOfObjects.push_back(x);
}
}) << endl;
cout << "Pushback Pointers: " << measure<>::execution([&]()
{
for (int i = 0; i < 100000; i++)
VectorOfPointers.push_back(new MyObject("test"));
for (auto &item : VectorOfPointers)
delete item;
}) << endl;
return 0;
}
and when compiled with
g++ -std=c++11 -O3 -march=native -Wall -pedantic
the results are (I'm using +1 order of magnitude in the for loops) :
Pushback Object: 20
Pushback Pointers: 32
If you used
VectorOfObjects.emplace_back("test");
The duration of the VectorOfObjects modification would drop to 18
If you preallocated both vectors
vector<MyObject> VectorOfObjects;
VectorOfObjects.reserve(100000);
vector<MyObject *> VectorOfPointers;
VectorOfPointers.reserve(100000);
the result would be 17-34 (for the vector of objects again)
If you use a vector of unique pointers then the results are similar
vector<unique_ptr<MyObject>> VectorOfPointers;
note that I'm limiting the scope of the vectors to explicitly account for the destruction of the smart pointers
Other choices would include boost's pointer containers in which case the related data structure would be a pointer vector
I prefer to use shared pointers opposed to regular pointers and I always use them when I can.
I use shared pointers with vectors when dealing with the vector changing a lot.
You should avoid regular pointers when dealing with vectors, as they need to be manually destructed and will just cause memory leaks.
So to answer your question...
Look into the shared_ptr library and use those instead here is a link http://www.cplusplus.com/reference/memory/shared_ptr/
hope this answers your question
The way your sample code is written, there is definitly a memory leak problem. Agreed that you can fix that problem by doing a deletes.
It can be done but it is just cumbersome. If you take care of things like memory leaks, it is fine.
The root issue for performance here is that there are copies of objects being made. You create an object. When you add it to the vector. It creates a new object and copies yours using the copy constructor.
C++11 improves the situation a bit by introducing emplace_back(). So if you are using C++11, you may be able to get the same performance by using emplace.

sending back a vector from a function

How to translate properly the following Java code to C++?
Vector v;
v = getLargeVector();
...
Vector getLargeVector() {
Vector v2 = new Vector();
// fill v2
return v2;
}
So here v is a reference. The function creates a new Vector object and returns a reference to it. Nice and clean.
However, let's see the following C++ mirror-translation:
vector<int> v;
v = getLargeVector();
...
vector<int> getLargeVector() {
vector<int> v2;
// fill v2
return v2;
}
Now v is a vector object, and if I understand correctly, v = getLargeVector() will copy all the elements from the vector returned by the function to v, which can be expensive. Furthermore, v2 is created on the stack and returning it will result in another copy (but as I know modern compilers can optimize it out).
Currently this is what I do:
vector<int> v;
getLargeVector(v);
...
void getLargeVector(vector<int>& vec) {
// fill vec
}
But I don't find it an elegant solution.
So my question is: what is the best practice to do it (by avoiding unnecessary copy operations)? If possible, I'd like to avoid normal pointers. I've never used smart pointers so far, I don't know if they could help here.
Most C++ compilers implement return value optimization which means you can efficiently return a class from a function without the overhead of copying all the objects.
I would also recommend that you write:
vector<int> v(getLargeVector());
So that you copy construct the object instead of default construct and then operator assign to it.
void getLargeVector(vector<int>& vec) {
// fill the vector
}
Is a better approach for now. With c++0x , the problem with the first approach would go by making use of move operations instead copy operations.
RVO can be relied upon to make this code simple to write, but relying RVO can also bite you. RVO is a compiler-dependent feature, but more importantly an RVO-capable compiler can disable RVO depending on the code itself. For example, if you were to write:
MyBigObject Gimme(bool condition)
{
if( condition )
return MyBigObject( oneSetOfValues );
else
return MyBigObject( anotherSetOfValues );
}
...then even an RVO-capable compiler won't be able to optimize here. There are many other conditions under which the compiler won't be able to optimize, and so by my reckoning any code that by design relies on RVO for performance or functionality smells.
If you buy in to the idea that one function should have one job (I only sorta do), then your dilema as to how to return a populated vector becomes much simpler when you realize that your code is broken at the design level. Your function really does two jobs: it instantiates the vector, then it fills it in. Even with all this pedantary aside, however, a more generic & reliable solution exists than to rely on RVO. Simply write a function that populates an arbitrary vector. For example:
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
template<typename Iter> Iter PopulateVector(Iter it, size_t howMany)
{
for( size_t n = 0; n < howMany; ++n )
{
*(it++) = n;
}
return it;
}
int main()
{
vector<int> ints;
PopulateVector(back_inserter(ints), 42);
cout << "The vector has " << ints.size() << " elements" << endl << "and they are..." << endl;
copy(ints.begin(), ints.end(), ostream_iterator<int>(cout, " "));
cout << endl << endl;
static const size_t numOtherInts = 42;
int otherInts[numOtherInts] = {0};
PopulateVector(&otherInts[0], numOtherInts);
cout << "The other vector has " << numOtherInts << " elements" << endl << "and they are..." << endl;
copy(&otherInts[0], &otherInts[numOtherInts], ostream_iterator<int>(cout, " "));
return 0;
}
Why would you like to avoid normal pointers? Is it because you don't want to worry about memory management, or is it because you are not familiar with pointer syntax?
If you don't want to worry about memory management, then a smart pointer is the best approach. If you are uncomfortable with pointer syntax, then use references.
You have the best solution. Pass by reference is the way to handle that situation.
Sounds like you could do this with a class... but this could be unnecessary.
#include <vector>
using std::vector;
class MySpecialArray
{
vector<int> v;
public:
MySpecialArray()
{
//fill v
}
vector<int> const * getLargeVector()
{
return &v;
}
};

A proper way to create a matrix in c++

I want to create an adjacency matrix for a graph. Since I read it is not safe to use arrays of the form matrix[x][y] because they don't check for range, I decided to use the vector template class of the stl. All I need to store in the matrix are boolean values. So my question is, if using std::vector<std::vector<bool>* >* produces too much overhead or if there is a more simple way for a matrix and how I can properly initialize it.
EDIT: Thanks a lot for the quick answers. I just realized, that of course I don't need any pointers. The size of the matrix will be initialized right in the beginning and won't change until the end of the program. It is for a school project, so it would be good if I write "nice" code, although technically performance isn't too important. Using the STL is fine. Using something like boost, is probably not appreciated.
Note that also you can use boost.ublas for matrix creation and manipulation and also boost.graph to represent and manipulate graphs in a number of ways, as well as using algorithms on them, etc.
Edit: Anyway, doing a range-check version of a vector for your purposes is not a hard thing:
template <typename T>
class BoundsMatrix
{
std::vector<T> inner_;
unsigned int dimx_, dimy_;
public:
BoundsMatrix (unsigned int dimx, unsigned int dimy)
: dimx_ (dimx), dimy_ (dimy)
{
inner_.resize (dimx_*dimy_);
}
T& operator()(unsigned int x, unsigned int y)
{
if (x >= dimx_ || y>= dimy_)
throw std::out_of_range("matrix indices out of range"); // ouch
return inner_[dimx_*y + x];
}
};
Note that you would also need to add the const version of the operators, and/or iterators, and the strange use of exceptions, but you get the idea.
Best way:
Make your own matrix class, that way you control every last aspect of it, including range checking.
eg. If you like the "[x][y]" notation, do this:
class my_matrix {
std::vector<std::vector<bool> >m;
public:
my_matrix(unsigned int x, unsigned int y) {
m.resize(x, std::vector<bool>(y,false));
}
class matrix_row {
std::vector<bool>& row;
public:
matrix_row(std::vector<bool>& r) : row(r) {
}
bool& operator[](unsigned int y) {
return row.at(y);
}
};
matrix_row& operator[](unsigned int x) {
return matrix_row(m.at(x));
}
};
// Example usage
my_matrix mm(100,100);
mm[10][10] = true;
nb. If you program like this then C++ is just as safe as all those other "safe" languages.
The standard vector does NOT do range checking by default.
i.e. The operator[] does not do a range check.
The method at() is similar to [] but does do a range check.
It will throw an exception on out of range.
std::vector::at()
std::vector::operator[]()
Other notes:
Why a vector<Pointers> ?
You can quite easily have a vector<Object>. Now there is no need to worry about memory management (i.e. leaks).
std::vector<std::vector<bool> > m;
Note: vector<bool> is overloaded and not very efficient (i.e. this structure was optimized for size not speed) (It is something that is now recognized as probably a mistake by the standards committee).
If you know the size of the matrix at compile time you could use std::bitset?
std::vector<std::bitset<5> > m;
or if it is runtime defined use boost::dynamic_bitset
std::vector<boost::dynamic_bitset> m;
All of the above will allow you to do:
m[6][3] = true;
If you want 'C' array performance, but with added safety and STL-like semantics (iterators, begin() & end() etc), use boost::array.
Basically it's a templated wrapper for 'C'-arrays with some NDEBUG-disable-able range checking asserts (and also some std::range_error exception-throwing accessors).
I use stuff like
boost::array<boost::array<float,4>,4> m;
instead of
float m[4][4];
all the time and it works great (with appropriate typedefs to keep the verbosity down, anyway).
UPDATE: Following some discussion in the comments here of the relative performance of boost::array vs boost::multi_array, I'd point out that this code, compiled with g++ -O3 -DNDEBUG on Debian/Lenny amd64 on a Q9450 with 1333MHz DDR3 RAM takes 3.3s for boost::multi_array vs 0.6s for boost::array.
#include <iostream>
#include <time.h>
#include "boost/array.hpp"
#include "boost/multi_array.hpp"
using namespace boost;
enum {N=1024};
typedef multi_array<char,3> M;
typedef array<array<array<char,N>,N>,N> C;
// Forward declare to avoid being optimised away
static void clear(M& m);
static void clear(C& c);
int main(int,char**)
{
const clock_t t0=clock();
{
M m(extents[N][N][N]);
clear(m);
}
const clock_t t1=clock();
{
std::auto_ptr<C> c(new C);
clear(*c);
}
const clock_t t2=clock();
std::cout
<< "multi_array: " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s\n"
<< "array : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s\n";
return 0;
}
void clear(M& m)
{
for (M::index i=0;i<N;i++)
for (M::index j=0;j<N;j++)
for (M::index k=0;k<N;k++)
m[i][j][k]=1;
}
void clear(C& c)
{
for (int i=0;i<N;i++)
for (int j=0;j<N;j++)
for (int k=0;k<N;k++)
c[i][j][k]=1;
}
What I would do is create my own class for dealing with matrices (probably as an array[x*y] because I'm more used to C (and I'd have my own bounds checking), but you could use vectors or any other sub-structure in that class).
Get your stuff functional first then worry about how fast it runs. If you design the class properly, you can pull out your array[x*y] implementation and replace it with vectors or bitmasks or whatever you want without changing the rest of the code.
I'm not totally sure, but I thing that's what classes were meant for, the ability to abstract the implementation well out of sight and provide only the interface :-)
In addition to all the answers that have been posted so far, you might do well to check out the C++ FAQ Lite. Questions 13.10 - 13.12 and 16.16 - 16.19 cover several topics related to rolling your own matrix class. You'll see a couple of different ways to store the data and suggestions on how to best write the subscript operators.
Also, if your graph is sufficiently sparse, you may not need a matrix at all. You could use std::multimap to map each vertex to those it connects.
my favourite way to store a graph is vector<set<int>>; n elements in vector (nodes 0..n-1), >=0 elements in each set (edges). Just do not forget adding a reverse copy of every bi-directional edge.
Consider also how big is your graph/matrix, does performance matter a lot? Is the graph static, or can it grow over time, e.g. by adding new edges?
Probably, not relevant as this is an old question, but you can use the Armadillo library, which provides many linear algebra oriented data types and functions.
Below is an example for your specific problem:
// In C++11
Mat<bool> matrix = {
{ true, true},
{ false, false},
};
// In C++98
Mat<bool> matrix;
matrix << true << true << endr
<< false << false << endr;
Mind you std::vector doesn't do range checking either.

STL vectors with uninitialized storage?

I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
SOME CODE:
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size()
memberVector.resize(mvSize + count); // causes 0-initialization
for (int i = 0; i < count; ++i) {
memberVector[mvSize + i].d1 = data1[i];
memberVector[mvSize + i].d2 = data2[i];
}
}
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you were to access the uninitialized section of the array as if it were initialized.
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
int d1;
int d2;
YourData(int v1, int v2) : d1(v1), d2(v2) {}
};
std::vector<YourData> memberVector;
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size();
// Does not initialize the extra elements
memberVector.reserve(mvSize + count);
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a temporary.
memberVector.push_back(YourData(data1[i], data2[i]));
}
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
In the current version of C++, the inner loop is a bit inefficient as a temporary value is constructed on the stack, copy-constructed to the vectors memory, and finally the temporary is destroyed. However the next version of C++ has a feature called R-Value references (T&&) which will help.
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
// ...
template <typename F>
my_vector::push_back_using_factory(F factory) {
// ... check size of array, and resize if needed.
// Copy construct using placement new,
new(arrayData+end) T(factory())
end += sizeof(T);
}
char* arrayData;
size_t end; // Of initialized data in arrayData
};
// One of many possible implementations
struct MyFactory {
MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
YourData operator()() const {
return YourData(*d1,*d2);
}
int* d1;
int* d2;
};
void GetsCalledALot(int* data1, int* data2, int count) {
// ... Still will need the same call to a reserve() type function.
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a factory
memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
}
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.
In C++11 (and boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an stl container, but is still memory managed and C++-ish which will be good enough for many applications.
auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:
memberVector.emplace_back(data1[i], data2[i]);
To clarify on reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on stack, and then copying it to the vector. On the other hand, it's possible that if you use
vect.push_back(MyStruct(fieldValue1, fieldValue2))
the compiler will construct the new instance directly in the memory thatbelongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.
You can use boost::noinit_adaptor to default initialize new elements (which is no initialization for built-in types):
std::vector<T, boost::noinit_adaptor<std::allocator<T>> memberVector;
As long as you don't pass an initializer into resize, it default initializes the new elements.
So here's the problem, resize is calling insert, which is doing a copy construction from a default constructed element for each of the newly added elements. To get this to 0 cost you need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea because it will break std::vector's internal reallocation algorithms.
Summary: You're not going to be able to do this with std::vector.
You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:
template <typename T>
struct no_init
{
T value;
no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }
no_init(T& v) { value = v; }
T& operator=(T& v) { value = v; return value; }
no_init(no_init<T>& n) { value = n.value; }
no_init(no_init<T>&& n) { value = std::move(n.value); }
T& operator=(no_init<T>& n) { value = n.value; return this; }
T& operator=(no_init<T>&& n) { value = std::move(n.value); return this; }
T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};
To use:
std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);
Err...
try the method:
std::vector<T>::reserve(x)
It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be reallocation until to go over x.
The second point is that vector won't initialize the values to zero. Are you testing your code in debug ?
After verification on g++, the following code:
#include <iostream>
#include <vector>
struct MyStruct
{
int m_iValue00 ;
int m_iValue01 ;
} ;
int main()
{
MyStruct aaa, bbb, ccc ;
std::vector<MyStruct> aMyStruct ;
aMyStruct.push_back(aaa) ;
aMyStruct.push_back(bbb) ;
aMyStruct.push_back(ccc) ;
aMyStruct.resize(6) ; // [EDIT] double the size
for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
{
std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
}
return 0 ;
}
gives the following results:
[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856
The initialization you saw was probably an artifact.
[EDIT] After the comment on resize, I modified the code to add the resize line. The resize effectively calls the default constructor of the object inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (I managed the first time to have the whole vector zerooed with the following code:
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
So...
:-/
[EDIT 2] Like already offered by Arkadiy, the solution is to use an inline constructor taking the desired parameters. Something like
struct MyStruct
{
MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
int d1, d2 ;
} ;
This will probably get inlined in your code.
But you should anyway study your code with a profiler to be sure this piece of code is the bottleneck of your application.
I tested a few of the approaches suggested here.
I allocated a huge set of data (200GB) in one container/pointer:
Compiler/OS:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Settings: (c++-17, -O3 optimizations)
g++ --std=c++17 -O3
I timed the total program runtime with linux-time
1.) std::vector:
#include <vector>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
}
real 0m36.246s
user 0m4.549s
sys 0m31.604s
That is 36 seconds.
2.) std::vector with boost::noinit_adaptor
#include <vector>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
}
real 0m0.002s
user 0m0.001s
sys 0m0.000s
So this solves the problem. Just allocating without initializing costs basically nothing (at least for large arrays).
3.) std::unique_ptr<T[]>:
#include <memory>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
auto data = std::unique_ptr<size_t[]>(new size_t[size]);
}
real 0m0.002s
user 0m0.002s
sys 0m0.000s
So basically the same performance as 2.), but does not require boost.
I also tested simple new/delete and malloc/free with the same performance as 2.) and 3.).
So the default-construction can have a huge performance penalty if you deal with large data sets.
In practice you want to actually initialize the allocated data afterwards.
However, some of the performance penalty still remains, especially if the later initialization is performed in parallel.
E.g., I initialize a huge vector with a set of (pseudo)random numbers:
(now I use fopenmp for parallelization on a 24 core AMD Threadripper 3960X)
g++ --std=c++17-fopenmp -O3
1.) std::vector:
#include <vector>
#include <random>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m41.958s
user 4m37.495s
sys 0m31.348s
That is 42s, only 6s more than the default initialization.
The problem is, that the initialization of std::vector is sequential.
2.) std::vector with boost::noinit_adaptor:
#include <vector>
#include <random>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m10.508s
user 1m37.665s
sys 3m14.951s
So even with the random-initialization, the code is 4 times faster because we can skip the sequential initialization of std::vector.
So if you deal with huge data sets and plan to initialize them afterwards in parallel, you should avoid using the default std::vector.
From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
From your code, it looks like you have a vector of structs each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then
copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));
Now you don't pay for copying a struct each time.
If you really insist on having the elements uninitialized and sacrifice some methods like front(), back(), push_back(), use boost vector from numeric . It allows you even not to preserve existing elements when calling resize()...
I'm not sure about all those answers that says it is impossible or tell us about undefined behavior.
Sometime, you need to use an std::vector. But sometime, you know the final size of it. And you also know that your elements will be constructed later.
Example : When you serialize the vector contents into a binary file, then read it back later.
Unreal Engine has its TArray::setNumUninitialized, why not std::vector ?
To answer the initial question
"Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?"
yes and no.
No, because STL doesn't expose a way to do so.
Yes because we're coding in C++, and C++ allows to do a lot of thing. If you're ready to be a bad guy (and if you really know what you are doing). You can hijack the vector.
Here a sample code that works only for the Windows's STL implementation, for another platform, look how std::vector is implemented to use its internal members :
// This macro is to be defined before including VectorHijacker.h. Then you will be able to reuse the VectorHijacker.h with different objects.
#define HIJACKED_TYPE SomeStruct
// VectorHijacker.h
#ifndef VECTOR_HIJACKER_STRUCT
#define VECTOR_HIJACKER_STRUCT
struct VectorHijacker
{
std::size_t _newSize;
};
#endif
template<>
template<>
inline decltype(auto) std::vector<HIJACKED_TYPE, std::allocator<HIJACKED_TYPE>>::emplace_back<const VectorHijacker &>(const VectorHijacker &hijacker)
{
// We're modifying directly the size of the vector without passing by the extra initialization. This is the part that relies on how the STL was implemented.
_Mypair._Myval2._Mylast = _Mypair._Myval2._Myfirst + hijacker._newSize;
}
inline void setNumUninitialized_hijack(std::vector<HIJACKED_TYPE> &hijackedVector, const VectorHijacker &hijacker)
{
hijackedVector.reserve(hijacker._newSize);
hijackedVector.emplace_back<const VectorHijacker &>(hijacker);
}
But beware, this is hijacking we're speaking about. This is really dirty code, and this is only to be used if you really know what you are doing. Besides, it is not portable and relies heavily on how the STL implementation was done.
I won't advise you to use it because everyone here (me included) is a good person. But I wanted to let you know that it is possible contrary to all previous answers that stated it wasn't.
Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.
Do the structs themselves need to be in contiguous memory, or can you get away with having a vector of struct*?
Vectors make a copy of whatever you add to them, so using vectors of pointers rather than objects is one way to improve performance.
I don't think STL is your answer. You're going to need to roll your own sort of solution using realloc(). You'll have to store a pointer and either the size, or number of elements, and use that to find where to start adding elements after a realloc().
int *memberArray;
int arrayCount;
void GetsCalledALot(int* data1, int* data2, int count) {
memberArray = realloc(memberArray, sizeof(int) * (arrayCount + count);
for (int i = 0; i < count; ++i) {
memberArray[arrayCount + i].d1 = data1[i];
memberArray[arrayCount + i].d2 = data2[i];
}
arrayCount += count;
}
I would do something like:
void GetsCalledALot(int* data1, int* data2, int count)
{
const size_t mvSize = memberVector.size();
memberVector.reserve(mvSize + count);
for (int i = 0; i < count; ++i) {
memberVector.push_back(MyType(data1[i], data2[i]));
}
}
You need to define a ctor for the type that is stored in the memberVector, but that's a small cost as it will give you the best of both worlds; no unnecessary initialization is done and no reallocation will occur during the loop.